Information on IES-Funded Research
Grant Closed

Fair Prediction of College-Student Success Using Multivariate Adaptive Regression Splines

NCER
Program: Statistical and Research Methodology in Education
Program topic(s): Early Career
Award amount: $299,461
Principal investigator: Hadis Anahideh
Awardee:
University of Illinois, Chicago
Year: 2022
Award period: 2 years 6 months (03/01/2022 - 08/31/2024)
Project type:
Methodological Innovation
Award number: R305D220055

Purpose

The purpose of this project is to develop a new fair Multivariate Adaptive Regression Splines (MARS) statistical model to predict college-student success. MARS is a parsimonious non-parametric regression model that can identify useful input variables through a built-in feature-selection step when many potential variables are considered. MARS also yields an easily interpretable model, making it well suited for use in higher education settings.
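The interpretability described above comes from the form of a MARS model: a weighted sum of hinge (piecewise-linear) basis functions. A minimal sketch of such basis functions, with an illustrative knot and coefficients (not the project's fitted values):

```python
import numpy as np

def hinge(x, t):
    """Right hinge: max(0, x - t), zero until x passes the knot t."""
    return np.maximum(0.0, x - t)

def mirrored_hinge(x, t):
    """Left hinge: max(0, t - x), zero once x passes the knot t."""
    return np.maximum(0.0, t - x)

# A fitted MARS model is a weighted sum of such terms, e.g.
# f(x) = b0 + b1 * max(0, x - t) + b2 * max(0, t - x)
# (knot t and coefficients below are made up for illustration)
x = np.array([1.0, 2.0, 3.0, 4.0])
t = 2.5
f = 0.5 + 1.2 * hinge(x, t) + 0.8 * mirrored_hinge(x, t)
```

Because each term is zero on one side of its knot and linear on the other, the contribution of each selected variable can be read directly off the fitted equation.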

Project Activities

The project sought to develop a model for predicting college-student success from student attributes (e.g., high school GPA and demographics) that is fair, transparent, and intelligible. The predictive model is based on multivariate adaptive regression splines (MARS) and incorporates fairness measures into the learning process: a fairness constraint reduces the relationship between membership in socially relevant groups (e.g., racial/ethnic groups) and prediction outcomes. Using data from the Education Longitudinal Study (ELS) of 2002 and the Integrated Postsecondary Education Data System (IPEDS), the team set out to detect bias in education data, showing how standard models can produce unfair results and motivating the development of Fair MARS; to evaluate Fair MARS against other models on both fairness and accuracy; and to simulate the trade-off between fairness and accuracy.
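The fairness-accuracy trade-off can be pictured with a toy simulation: for each penalty weight `lam`, choose the group coefficient `b` that minimizes a combined objective `mse + lam * gap`. The data, model, and names below are illustrative assumptions, not the project's actual procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=n)
g = (rng.random(n) < 0.5).astype(int)        # binary group indicator
y = x + 0.5 * g + rng.normal(scale=0.1, size=n)  # outcome correlated with group

def evaluate(b, lam):
    """Score a candidate model pred = x + b * g under penalty weight lam."""
    pred = x + b * g
    mse = float(np.mean((pred - y) ** 2))
    gap = float(abs(pred[g == 1].mean() - pred[g == 0].mean()))
    return mse + lam * gap, mse, gap

results = {}
for lam in (0.0, 0.5, 5.0):
    # grid-search the group coefficient b for the penalized objective
    obj, mse, gap = min(evaluate(b, lam) for b in np.linspace(0.0, 0.6, 61))
    results[lam] = (mse, gap)
# as lam grows, the selected model trades some accuracy (higher mse)
# for a smaller between-group prediction gap
```

Sweeping `lam` traces out a frontier of (accuracy, fairness) pairs, which is the kind of trade-off the project set out to simulate.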

Structured Abstract

Setting

The research was conducted using real-world data from college admissions processes and student-success interventions in higher education settings.

Sample

The study utilized anonymized student data from multiple higher education institutions, including variables related to student demographics, academic performance, and institutional support.

Research design and methods

The project employed advanced machine learning techniques, with a focus on fairness-aware algorithms. Specifically, the team developed and implemented Fair MARS (fair Multivariate Adaptive Regression Splines) to evaluate and mitigate algorithmic bias. Quantitative methods included assessing predictive performance, fairness metrics (e.g., demographic parity, equalized odds), and comparative evaluations against traditional predictive models.
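The two group-fairness metrics named above can be sketched directly. A minimal illustration, assuming binary predictions `y_hat`, true labels `y`, and a binary group indicator `g` (variable names are illustrative, not from the project's code):

```python
import numpy as np

def demographic_parity_diff(y_hat, g):
    """Gap in positive-prediction rates between the two groups."""
    return abs(y_hat[g == 1].mean() - y_hat[g == 0].mean())

def equalized_odds_diff(y_hat, y, g):
    """Max gap across groups in true-positive and false-positive rates."""
    def rate_gap(label):
        return abs(y_hat[(g == 1) & (y == label)].mean()
                   - y_hat[(g == 0) & (y == label)].mean())
    return max(rate_gap(1), rate_gap(0))

y_hat = np.array([1, 0, 1, 1, 0, 0])
y     = np.array([1, 0, 1, 0, 1, 0])
g     = np.array([1, 1, 1, 0, 0, 0])
dp = demographic_parity_diff(y_hat, g)   # |2/3 - 1/3| = 1/3
```

Demographic parity only compares prediction rates, while equalized odds conditions on the true label, so a model can satisfy one and badly violate the other.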

Key measures

Model performance was measured using accuracy, fairness metrics, and interpretability of outcomes. Additional measures examined the impact of imputation techniques on model performance and fairness, and the role of race in predictive analytics.
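Why imputation matters for fairness can be seen in a toy example: when a feature's distribution differs across groups, filling missing values with one global mean distorts one group more than another. The data and names below are illustrative assumptions, not from the project:

```python
import numpy as np

x = np.array([1.0, 2.0, np.nan, 9.0, 10.0, np.nan])  # feature with gaps
g = np.array([0,   0,   0,      1,   1,    1])        # group indicator

# Overall-mean imputation fills every gap with one global value...
overall = np.where(np.isnan(x), np.nanmean(x), x)

# ...while group-conditional imputation fills each gap with its group's mean.
grouped = x.copy()
for grp in (0, 1):
    mask = (g == grp) & np.isnan(x)
    grouped[mask] = np.nanmean(x[g == grp])
```

Here the global mean (5.5) sits far from both group means (1.5 and 9.5), so overall-mean imputation pulls imputed members of each group toward the other group's range.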

Data analytic strategy

Researchers applied fairness-aware machine learning methods to detect, measure, and mitigate algorithmic bias across racialized groups. Robust testing and validation were conducted on models for college admissions and student-success predictions, with a focus on transparency and equity.

Key outcomes

The project resulted in three major outcomes that contribute to fairness and transparency in predictive
analytics for higher education:

  1. Development of Fair MARS: The team introduced Fair MARS, an interpretable and fairness-aware predictive model that reduces algorithmic bias while maintaining predictive performance. Fair MARS demonstrated its effectiveness in ensuring equitable outcomes for underrepresented racial groups in college admissions and student-success interventions. This outcome has been supported by peer-reviewed publications, including articles in AERA Open and the Proceedings of the AAAI Conference on Artificial Intelligence.
  2. Quantitative Insights on Race and Model Fairness: Researchers quantitatively demonstrated the critical role of race in predictive modeling, providing evidence on how including or excluding race affects fairness and model accuracy. The study also offered actionable recommendations for higher education institutions on the ethical considerations of using race in predictive analytics. These findings are detailed in the submitted manuscript, "Equity Beyond Numbers: Exploring the Interplay of Race and Machine Learning Models" (under review at Big Data & Society).
  3. Practical Tools and Dissemination: The team developed and disseminated a toolkit for fairness-aware machine learning using Google Colab, enabling practitioners to assess and mitigate bias in predictive models. Dissemination included academic presentations (e.g., SREE, INFORMS) and media coverage in platforms such as The Chronicle of Higher Education, Inside Higher Ed, and ACM Tech News.

People and institutions involved

IES program contact(s)

Charles Laurin

Education Research Analyst
NCER

Project contributors

Denisa Gandara

Co-principal investigator

Products and publications

The researchers developed and posted to GitHub open-source, user-friendly software in Python with a comprehensive user guide for researchers and education practitioners. The team disseminated their work at conferences and through peer-reviewed journal manuscripts.

Project website:

FairMARS GitHub Repository

Publications:

ERIC Citations: Find available citations in ERIC for this award here.


Anahideh, H., Nezami, N., & Asudeh, A. (2021). Finding Representative Group Fairness Metrics Using Correlation Estimations. Expert Systems with Applications, 262, 125652.

Di Carlo, F., Nezami, N., Anahideh, H., & Asudeh, A. (2023). FairPilot: An Explorative System for Hyperparameter Tuning through the Lens of Fairness. arXiv preprint arXiv:2304.04679.

Gándara, D., Anahideh, H., Ison, M. P., & Picchiarini, L. (2024). Inside the Black Box: Detecting and Mitigating Algorithmic Bias Across Racialized Groups in College Student-Success Prediction. AERA Open, 10.

Haghighat, P., Gándara, D., Kang, L., & Anahideh, H. (2024, March). Fair Multivariate Adaptive Regression Splines for Ensuring Equity and Transparency. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 38, No. 20, pp. 22076-22086).

Nezami, N., Haghighat, P., Gándara, D., & Anahideh, H. (2024). Assessing Disparities in Predictive Modeling Outcomes for College Student Success: The Impact of Imputation Techniques on Model Performance and Fairness. Education Sciences, 14(2), 136.

Additional project information

Additional Online Resources and Information: 

  • https://colab.research.google.com/drive/1Jatv9lTx8OpnzEXU7oal6uspz4uQvC-w#scrollTo=_Rv6Gqr1UdVV
  • https://fairpilot.streamlit.app/


Questions about this project?

For additional questions about this project, or to provide feedback, please contact the program officer.

 

Tags

Postsecondary Education

