title | abstract | layout | series | publisher | issn | id | month | tex_title | firstpage | lastpage | page | order | cycles | bibtex_author | author | date | address | container-title | volume | genre | issued | extras | |||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning |
This paper studies the exponential stability of random matrix products driven by a general (possibly unbounded) state space Markov chain. It is a cornerstone in the analysis of stochastic algorithms in machine learning (e.g. for parameter tracking in online-learning or reinforcement learning). The existing results impose strong conditions such as uniform boundedness of the matrix-valued functions and uniform ergodicity of the Markov chains. Our main contribution is an exponential stability result for the p-th moment of random matrix product, provided that (i) the underlying Markov chain satisfies a super-Lyapunov drift condition, (ii) the growth of the matrix-valued functions is controlled by an appropriately defined function (related to the drift condition). Using this result, we give finite-time p-th moment bounds for constant and decreasing stepsize linear stochastic approximation schemes with Markovian noise on general state space. We illustrate these findings for linear value-function estimation in reinforcement learning. We provide finite-time p-th moment bound for various members of temporal difference (TD) family of algorithms. |
inproceedings |
Proceedings of Machine Learning Research |
PMLR |
2640-3498 |
durmus21a |
0 |
On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning |
1711 |
1752 |
1711-1752 |
1711 |
false |
Durmus, Alain and Moulines, Eric and Naumov, Alexey and Samsonov, Sergey and Wai, Hoi-To |
|
2021-07-21 |
Proceedings of Thirty Fourth Conference on Learning Theory |
134 |
inproceedings |
|