Project Activities
The project team improved Stan, an open-source Bayesian inference package, increasing its computational efficiency and accessibility so that it is more useful and more widely available for solving problems in education research. The team also improved Stan's interfaces and user-facing tools, such as documentation and case studies.
Statistical/Methodological Product: Stan is an open-source Bayesian inference engine, accessible from Python, R, Julia, Stata, and other statistics and mathematics languages. Stan is designed for a range of users, from statisticians and computer scientists to applied researchers in science, engineering, government, and business. Stan is widely used in education research, especially for fitting latent-variable models to the hierarchical or multilevel data structures that arise in program evaluation, policy analysis, and item-response modeling.
Development/Refinement Process: The project team developed algorithms, tested them on a range of theoretical and applied examples using a variety of datasets, and developed user-friendly tools and documentation.
Key outcomes
- Code that allows Stan to analyze data for a wider variety of models (https://mc-stan.org/).
- Stan interfaces for R (CmdStanR at https://mc-stan.org/cmdstanr/) and Python (CmdStanPy at https://mc-stan.org/cmdstanpy/) that make Stan accessible to a larger group of users and let software developers more easily create wrappers that expose Stan programs to R and Python users who are not Stan users themselves.
- Case studies on Bayesian latent class models and handling of label switching (https://mc-stan.org/users/documentation/case-studies/Latent_class_case_study.html) and on multilevel regression and poststratification (https://bookdown.org/jl5522/MRP-case-studies/).
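As a rough illustration of the poststratification step named in the case studies above, the sketch below reweights per-cell estimates by population counts. It is a minimal example under stated assumptions, not code from the project: the `Cell` class, the `poststratify` function, the cell labels, and all numbers are hypothetical, and in a real analysis the per-cell estimates would come from a multilevel model fitted in Stan rather than being typed in by hand.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One poststratification cell (hypothetical structure for illustration)."""
    label: str        # name of the demographic or geographic cell
    population: int   # number of people in this cell in the target population
    estimate: float   # model-based estimate of the outcome for this cell


def poststratify(cells):
    """Return the population-weighted average of the per-cell estimates."""
    total = sum(c.population for c in cells)
    if total <= 0:
        raise ValueError("total population must be positive")
    return sum(c.population * c.estimate for c in cells) / total


# Made-up numbers: per-cell estimates would normally be posterior summaries
# (or individual posterior draws) from a fitted multilevel regression.
cells = [
    Cell("urban", 6000, 0.70),
    Cell("suburban", 3000, 0.60),
    Cell("rural", 1000, 0.40),
]

population_estimate = poststratify(cells)
print(f"{population_estimate:.2f}")  # prints 0.64
```

Applying the same function to each posterior draw of the cell estimates, rather than to a single point estimate per cell, would carry the model's uncertainty through to the population-level estimate.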
People and institutions involved
IES program contact(s)
Project contributors
Products and publications
Project website:
Publications:
Broderick, T., Gelman, A., Meager, R., Smith, A. L., & Zheng, T. (2023). Toward a taxonomy of trust for probabilistic machine learning. Science Advances 9, eabn3999.
Gao, Y., Kennedy, L., Simpson, D., & Gelman, A. (2021). Improving multilevel regression and poststratification with structured priors. Bayesian Analysis 16, 719-744.
Gelman, A. (2022). Criticism as asynchronous collaboration: An example from social science research. Stat 11, e464.
Gelman, A., Hullman, J., Wlezien, C., & Morris, G. E. (2020). Information, incentives, and goals in election forecasts. Judgment and Decision Making 15, 863-880.
Gelman, A., & Kennedy, L. (2021). Know your population and know your model: Using model-based regression and post-stratification to generalize findings beyond the observed sample. Psychological Methods 26, 547-558.
Gelman, A., & Vákár, M. (2021). Slamming the sham: A Bayesian model for adaptive adjustment with noisy control data. Statistics in Medicine 40, 3403-3424.
Gin, B., Sim, N., Skrondal, A., & Rabe-Hesketh, S. (2020). A dyadic IRT model. Psychometrika 85, 815-836.
Heidemanns, M., Gelman, A., & Morris, G. E. (2020). An updated dynamic Bayesian forecasting model for the US presidential election. Harvard Data Science Review 2(4). https://doi.org/10.1162/99608f92.fc62f1e1.
Kennedy, L., Simpson, D., & Gelman, A. (2019). The experiment is just as important as the likelihood in understanding the prior: A cautionary note on robust cognitive modeling. Computational Brain and Behavior 2, 210-217.
McShane, B.B., Gal, D., Gelman, A., Robert, C., & Tackett, J.L. (2019). Abandon statistical significance. American Statistician 73(S1), 235-245.
Merkle, E. C., Fitzsimmons, E., Uanhoro, J., & Goodrich, B. (2022). Efficient Bayesian structural equation modeling in Stan. Journal of Statistical Software 100(6), 1-22.
Merkle, E., Furr, D., & Rabe-Hesketh, S. (2019). Bayesian Comparison of Latent Variable Models: Conditional Versus Marginal Likelihoods. Psychometrika 84, 802-829.
Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., & Bürkner, P. (2021). Rank-normalization, folding, and localization: An improved R-hat for assessing convergence of MCMC. Bayesian Analysis 16, 667-718.
Vehtari, A., Gelman, A., Sivula, T., Jylanki, P., Tran, D., Sahai, S., Blomstedt, P., Cunningham, J., Schiminovich, D., & Robert, C. (2020). Expectation propagation as a way of life: A framework for Bayesian inference on partitioned data. Journal of Machine Learning Research 21, 1-53. ED634110
Yao, Y., Vehtari, A., & Gelman, A. (2022). Stacking for non-mixing Bayesian computations: The curse and blessing of multimodal posteriors. Journal of Machine Learning Research 23, 79.
Additional project information
Additional Online Resources and Information:
- Stan interfaces for R (CmdStanR at https://mc-stan.org/cmdstanr/) and Python (CmdStanPy at https://mc-stan.org/cmdstanpy/)
- Case studies on Bayesian latent class models and handling of label switching (https://mc-stan.org/users/documentation/case-studies/Latent_class_case_study.html) and on multilevel regression and poststratification (https://bookdown.org/jl5522/MRP-case-studies/)
Related projects
Questions about this project?
To ask additional questions about this project or provide feedback, please contact the program officer.