A Note on Improving Variational Estimation for Multidimensional Item Response Theory

Published online by Cambridge University Press: 01 January 2025

Chenchen Ma, University of Michigan
Jing Ouyang, University of Michigan
Chun Wang*, University of Washington
Gongjun Xu*, University of Michigan

* Correspondence should be made to Chun Wang, College of Education, University of Washington, 312 E Miller Hall, 2012 Skagit Lane, Seattle, WA 98105, USA. Email: [email protected]
* Correspondence should be made to Gongjun Xu, Department of Statistics, University of Michigan, 456 West Hall, 1085 South University, Ann Arbor, MI 48109, USA. Email: [email protected]

Abstract

Survey instruments and assessments are frequently used in many domains of social science. When the constructs that these assessments try to measure become multifaceted, multidimensional item response theory (MIRT) provides a unified framework and a convenient statistical tool for item analysis, calibration, and scoring. However, the computational challenge of estimating MIRT models has limited their wide use, because many of the extant methods can hardly provide results in a realistic time frame when the number of dimensions, sample size, and test length are large. Variational estimation methods, such as the Gaussian variational expectation–maximization (GVEM) algorithm, have recently been proposed to address this estimation challenge by providing a fast and accurate solution. However, results have shown that variational estimation methods may produce some bias in discrimination parameters during confirmatory model estimation. This note proposes an importance-weighted version of GVEM (IW-GVEM) to correct for such bias under MIRT models. We also use the adaptive moment estimation method to update the learning rate for gradient descent automatically. Our simulations show that, compared with GVEM, IW-GVEM can effectively correct bias with a modest increase in computation time. The proposed method may also shed light on improving variational estimation for other psychometric models.
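
The importance-weighting idea named in the abstract can be illustrated on a toy problem. The sketch below is not the authors' IW-GVEM and does not fit a MIRT model. It assumes a one-dimensional Gaussian model (z ~ N(0, 1), x | z ~ N(z, 1)) whose marginal log-likelihood is known in closed form, and it deliberately uses the prior as a poor variational distribution q. It compares a Monte Carlo estimate of the single-sample variational bound with an importance-weighted bound that averages 50 weights before taking the log. All function and variable names are hypothetical.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def iw_bound(x, n_samples, n_reps, rng):
    """Monte Carlo estimate of E[log((1/K) * sum_k w_k)] with K = n_samples.

    Toy assumptions (not from the article): prior z ~ N(0, 1),
    likelihood x | z ~ N(z, 1), and proposal q(z) = N(0, 1).
    With K = 1 this is the usual evidence lower bound.
    """
    total = 0.0
    for _ in range(n_reps):
        log_w = []
        for _ in range(n_samples):
            z = rng.gauss(0.0, 1.0)  # draw z from q
            log_joint = log_normal_pdf(x, z, 1.0) + log_normal_pdf(z, 0.0, 1.0)
            log_q = log_normal_pdf(z, 0.0, 1.0)
            log_w.append(log_joint - log_q)  # log importance weight
        # Numerically stable log of the mean weight (log-sum-exp trick).
        m = max(log_w)
        total += m + math.log(sum(math.exp(lw - m) for lw in log_w) / n_samples)
    return total / n_reps


rng = random.Random(0)
x_obs = 2.0
# In this toy model the marginal is x ~ N(0, 2), so the target is exact.
true_loglik = log_normal_pdf(x_obs, 0.0, 2.0)
elbo = iw_bound(x_obs, n_samples=1, n_reps=2000, rng=rng)
iw50 = iw_bound(x_obs, n_samples=50, n_reps=2000, rng=rng)

print(f"true log-likelihood:       {true_loglik:.3f}")
print(f"single-sample bound (K=1): {elbo:.3f}")
print(f"importance-weighted K=50:  {iw50:.3f}")
```

In this toy setting the K = 50 bound sits much closer to the exact log-likelihood than the K = 1 bound. That tightening is the general property an importance-weighted objective relies on, though how it translates into reduced bias for discrimination parameters is specific to the article's method and is not shown here.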

Type
Theory & Methods
Copyright
Copyright © 2023 The Author(s), under exclusive licence to The Psychometric Society.

