On-line machine learning and biological spike-timing-dependent plasticity (STDP) rules both generate Markov chains for the synaptic weights. We give a perturbation expansion (in powers of the learning rate) for the dynamics that, unlike the usual approximation by a Fokker-Planck equation (FPE), is rigorous. Our approach extends the related system-size expansion by giving an expansion for the probability density as well as for its moments. Applied to two observed STDP learning rules, our approach agrees more closely with Monte Carlo simulations than either the FPE or a simple linearized theory. The approach is also applicable to stochastic neural dynamics.
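The setting can be illustrated with a minimal sketch (this is a generic linear online rule chosen for illustration, not the paper's STDP rules): an online update with learning rate eta drives the weight through a Markov chain whose stationary fluctuations shrink with eta, which is what a small-learning-rate expansion exploits. For the linear rule below the stationary variance is exactly eta*sigma^2/(2 - eta), i.e. O(eta), so a Monte Carlo estimate can be checked against it.

```python
import numpy as np

rng = np.random.default_rng(0)

MU, SIGMA = 1.0, 0.5  # mean and std of the input samples (assumed values)

def simulate(eta, n_steps=20_000, n_chains=2_000):
    """Run n_chains independent copies of the Markov chain
    w <- w + eta * (x - w), with x ~ N(MU, SIGMA^2) drawn i.i.d.
    Returns the final weights, approximately stationary samples
    since n_steps >> 1/eta (the mixing time)."""
    w = np.zeros(n_chains)
    for _ in range(n_steps):
        x = rng.normal(MU, SIGMA, size=n_chains)
        w += eta * (x - w)
    return w

for eta in (0.1, 0.01):
    w = simulate(eta)
    # Exact stationary variance for this linear rule; O(eta) scaling.
    var_theory = eta * SIGMA**2 / (2 - eta)
    print(f"eta={eta}: mean={w.mean():.3f} (theory {MU}), "
          f"var={w.var():.5f} (theory {var_theory:.5f})")
```

As eta decreases, the empirical variance tracks the O(eta) prediction, mirroring how the fluctuation terms in a small-learning-rate expansion are organized by powers of the learning rate.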