Information on IES-Funded Research
Grant Open

Methods and Software to Classify College Courses at Scale

NCER
Program: Statistical and Research Methodology in Education
Program topic(s): Core
Award amount: $899,226
Principal investigator: Kevin Stange
Awardee: University of Michigan
Year: 2024
Award period: 2 years 11 months (09/01/2024 - 08/31/2027)
Project type: Methodological Innovation
Award number: R305D240029

Purpose

Research-ready (de-identified, standardized, documented) data on the courses students take and what they learn in college have historically been difficult to obtain at scale. Research on student course-taking has been hampered by inconsistent course titles and numbers across institutions and by the sheer amount of manual effort required to classify thousands of unique courses within a single institution. The 'big data revolution,' however, brings approaches to automated pattern recognition that can be used to solve this problem.

Project Activities

This project will advance education research and practice by 1) applying machine learning approaches to text-based descriptions of courses to systematically classify course content and 2) developing software that education researchers and practitioners can use to apply the project's classification algorithms to course data at very large scale. The classification approach and the corresponding open-source college course mapping tool will open up new possibilities in applied education research on college course-taking and student success.

Structured Abstract

Research design and methods

The project team will apply several hierarchical classification approaches to text-as-data, including both supervised machine learning and generative AI. To train the algorithm, the project will use human-classified course data from several nationally representative NCES longitudinal studies, as included in the Postsecondary Education Transcript Studies dataset.
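As a rough illustration of what hierarchical classification on course text can look like, the sketch below trains a top-down classifier: one model picks a 2-digit group, then a per-group model picks a finer code within it. This is not the project's algorithm. The naive Bayes model, the function names, the toy course descriptions, and the 4-digit codes are all assumptions made for illustration, and the codes are not taken from the actual College Course Map.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes over bag-of-words tokens, add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict_proba(self, text):
        tokens = tokenize(text)
        total = sum(self.label_counts.values())
        log_scores = {}
        for label, n in self.label_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            log_scores[label] = score
        # Convert log scores to normalized probabilities.
        top = max(log_scores.values())
        exp = {k: math.exp(v - top) for k, v in log_scores.items()}
        z = sum(exp.values())
        return {k: v / z for k, v in exp.items()}


def train_hierarchy(texts, codes):
    """Train one model for 2-digit groups and one child model per group."""
    top = NaiveBayes().fit(texts, [c[:2] for c in codes])
    children = {}
    for parent in {c[:2] for c in codes}:
        idx = [i for i, c in enumerate(codes) if c[:2] == parent]
        children[parent] = NaiveBayes().fit(
            [texts[i] for i in idx], [codes[i] for i in idx]
        )
    return top, children


def classify(text, top, children):
    """Return (2-digit code, finer code, joint probability of the path)."""
    p_top = top.predict_proba(text)
    parent = max(p_top, key=p_top.get)
    p_child = children[parent].predict_proba(text)
    code = max(p_child, key=p_child.get)
    return parent, code, p_top[parent] * p_child[code]


# Toy human-labeled examples (made up; codes are illustrative only).
texts = [
    "calculus derivatives integrals", "linear algebra matrices",
    "probability statistics regression", "statistics inference sampling",
    "organic chemistry reactions", "chemistry laboratory molecules",
    "physics mechanics motion", "physics electricity magnetism",
]
codes = ["2701", "2701", "2705", "2705", "4005", "4005", "4008", "4008"]

top, children = train_hierarchy(texts, codes)
print(classify("statistics regression methods", top, children))
```

Classifying top-down keeps each decision small (a few sibling codes at a time) and yields a path probability that can serve as a crude confidence score; a real system would need far more training data and a stronger text model.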

User Testing: Users will be recruited early in the project period to pilot the software, provide feedback on its usability, and validate their newly classified data; the validation results will be used to further refine the algorithm. A wider set of user-testers will be convened toward the end of the project.

Use in Applied Education Research: The software will be useful in any education research that uses postsecondary course-level data. Such applications are numerous, including studies of disparities in course-taking, transfer students, bottleneck and gateway courses, and the long-term consequences of college curriculum.

People and institutions involved

IES program contact(s)

Charles Laurin

Education Research Analyst
NCER

Products and publications

The team will publish an open-source software package that assigns consistent College Course Map (CCM) codes to individual course records. End users will provide a dataset (in CSV form) containing course features, and the software will return CCM codes for the same set of course records at the 2-digit, 4-digit, and 6-digit levels (where appropriate), along with estimated confidence levels. The tool will take the form of a freely available package in R and Python (with wrappers facilitating use by other statistical products). Anyone can download it and run it on their own institutional data, and individual institutions can integrate it into their workflows as they see fit. The tool will be well documented, include example data, and be a reproducible artifact that lasts beyond the end of the grant. It will be disseminated through various platforms and promoted at professional conferences, through professional associations, and through social media.
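The planned input and output can be pictured with a small sketch: read a CSV of course records, attach codes at three levels plus a confidence value, and write the result back out as CSV. The package does not exist yet, so everything here is hypothetical, including the column names (`course_title`, `ccm_2`, `ccm_4`, `ccm_6`, `confidence`), the keyword-lookup stand-in for a trained model, the made-up codes, and the reading of "where appropriate" as leaving the 6-digit code blank below a confidence threshold.

```python
import csv
import io

# Hypothetical stand-in for a trained model: keyword -> (6-digit code, confidence).
# The codes are invented for illustration and are not real CCM codes.
TOY_MODEL = {
    "calculus": ("270101", 0.92),
    "chemistry": ("400501", 0.55),
}


def toy_classify(title):
    """Return (code, confidence), or (None, 0.0) when nothing matches."""
    lowered = title.lower()
    for keyword, (code, conf) in TOY_MODEL.items():
        if keyword in lowered:
            return code, conf
    return None, 0.0


def annotate(csv_text, classify=toy_classify, min_conf_6=0.8):
    """Add code columns at three levels to every course record."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        code, conf = classify(row["course_title"])
        row["ccm_2"] = code[:2] if code else ""
        row["ccm_4"] = code[:4] if code else ""
        # Assumption: report the finest level only when confidence is high.
        row["ccm_6"] = code if code and conf >= min_conf_6 else ""
        row["confidence"] = f"{conf:.2f}"
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize annotated rows back to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


sample = (
    "course_id,course_title\n"
    "MATH101,Calculus I\n"
    "CHEM110,General Chemistry\n"
    "ART200,Drawing\n"
)
rows = annotate(sample)
print(to_csv(rows))
```

Because the coarser levels are prefixes of the finer code in this sketch, a low-confidence record can still carry a usable 2-digit or 4-digit classification while the 6-digit field stays empty.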


Supplemental information

Co-Principal Investigators: Flaster, Allyson; Jurgens, David

Questions about this project?

For additional questions about this project or to provide feedback, please contact the program officer.

Tags

Education Technology, Postsecondary Education
