
Weak convergence of adaptive Markov chain Monte Carlo

Published online by Cambridge University Press:  30 April 2025

Austin Brown*
Affiliation: University of Toronto
Jeffrey S. Rosenthal*
Affiliation: University of Toronto
*Postal address: Department of Statistical Sciences, University of Toronto, Toronto, Canada.

Abstract

We develop general conditions for the weak convergence of adaptive Markov chain Monte Carlo processes and show that they imply a weak law of large numbers for bounded Lipschitz continuous functions. This yields an estimation theory for adaptive Markov chain Monte Carlo in settings where previously developed theory in total variation may fail or be difficult to establish. We extend weak convergence to general Wasserstein distances and establish a weak law of large numbers for possibly unbounded Lipschitz functions. The results are applied to autoregressive processes in various settings, unadjusted Langevin processes, and adaptive Metropolis–Hastings algorithms.
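To illustrate the kind of adaptive process the abstract refers to, the following is a minimal sketch (not taken from the paper) of a one-dimensional adaptive Metropolis–Hastings sampler whose proposal scale adapts with diminishing step sizes, together with a Monte Carlo average of a bounded Lipschitz function as in a weak law of large numbers. The target, tuning constants, and function names are illustrative assumptions only.

```python
import numpy as np

def adaptive_metropolis_hastings(log_target, x0, n_iter=10_000, seed=0):
    """Sketch of a random-walk Metropolis sampler whose proposal scale
    adapts via diminishing (Robbins-Monro style) step sizes, aiming for
    an acceptance rate of roughly 0.44 for a one-dimensional target.
    Illustrative only; not the construction used in the paper."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    log_scale = 0.0           # adapted log proposal standard deviation
    target_accept = 0.44      # common one-dimensional acceptance target
    samples = np.empty(n_iter)
    for n in range(n_iter):
        proposal = x + np.exp(log_scale) * rng.normal()
        log_alpha = log_target(proposal) - log_target(x)
        accept = np.log(rng.uniform()) < log_alpha
        if accept:
            x = proposal
        # Diminishing adaptation: step sizes gamma_n -> 0, so the
        # adaptation of the proposal scale vanishes asymptotically.
        gamma = 1.0 / (n + 1) ** 0.6
        log_scale += gamma * (float(accept) - target_accept)
        samples[n] = x
    return samples

# Example: Monte Carlo average of the bounded Lipschitz function
# f(x) = tanh(x) under a standard normal target (log density up to a constant).
if __name__ == "__main__":
    draws = adaptive_metropolis_hastings(lambda x: -0.5 * x * x, x0=1.0)
    print(np.tanh(draws).mean())
```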

Type
Original Article
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Applied Probability Trust

