IES Grant

Title: iCODE: Investigating and Scaffolding Students' Code Comprehension Processes to Improve Learning, Engagement, and Retention
Center: NCER
Year: 2022
Principal Investigator: Rus, Vasile
Awardee: University of Memphis
Program: Postsecondary and Adult Education
Award Period: 3 years (09/01/2022 - 08/31/2025)
Award Amount: $1,999,598
Type: Development and Innovation
Award Number: R305A220385

Co-Principal Investigators: Bernacki, Matthew L.; Cook, Amy S.; Dupuis, Danielle; Kendeou, Panayiota; Tawfik, Andrew A.

Purpose: The research team will develop and pilot test a novel education technology called iCODE (improve source CODE comprehension). Code comprehension is a critical skill for both learners and professionals. iCODE will integrate reading strategies training, animated pedagogical agents, inclusive and culturally responsive instructional design, and an open prosocial learner model to improve code comprehension and learning as well as student engagement, self-efficacy, computer science (CS) identity, and retention in CS programs. By adapting to individual learner characteristics (prior knowledge, self-efficacy, engagement, socio-cultural factors) and code characteristics (language, cohesion, and readability), iCODE aims to benefit both CS and non-CS majors, including groups underrepresented in CS and higher education such as women, students of color, and first-generation college students.

Project Activities: The project combines design-based research with randomized controlled trials to pilot the iCODE system and assess its overall promise to improve student outcomes. Through an iterative process informed by students and faculty from universities and community colleges, the researchers will build on the DeepTutor platform, developed under a previously funded IES grant, and refine and develop it for use in courses teaching Java or Python. During the pilot study, the researchers will compare outcomes for students in courses using iCODE in their lab sessions to those using business-as-usual instruction or another automated tutor that does not focus on reading comprehension.

Products: The expected products of the project include a full version of iCODE, education materials such as examples for code comprehension activities and assessment instruments, descriptions of the overall intervention and its components, and published results from the pilot study and cost study.

Structured Abstract

Setting: The research will take place in universities and community colleges located in urban areas of Tennessee and Minnesota.

Sample: For the development, usability, and feasibility work, the research team will collect data from approximately 400 students and at least 8 instructors across the different institutions, primarily at the Tennessee and Minnesota universities. During the pilot study, they will recruit 300 college-level students from the 2 universities and from Tennessee and Minnesota community colleges. Students at the Tennessee and Minnesota universities will come from intro-to-programming and introductory psychology courses, and students from the Tennessee and Minnesota community colleges will come from a wide range of backgrounds, including both CS majors and non-CS majors.

Intervention: iCODE will be a web-based intervention that adaptively monitors, tracks, models, and scaffolds students' source code comprehension processes as they engage in a variety of code-comprehension tasks. iCODE draws on and integrates theories of reading comprehension and source code comprehension, motivation theory, and frameworks of self-regulated learning with open prosocial learner models (OPLMs) and animated pedagogical agents (APAs). It also adopts culturally responsive teaching models to address the fundamental difficulty that students in intro-to-programming classes have in constructing accurate mental models during source code comprehension. iCODE will focus on two programming languages, Java and Python, and will be a web-based supplemental tool, like those commonly used in the lab component of computer science courses. The web-based environment will also include motivational elements, such as mastery framing and self-assessment through an aspirational peer in the OPLM, to increase students' self-regulated, mastery-oriented engagement with the iCODE system and the assigned instructional tasks. Furthermore, to help students navigate the social factors of learning computing and the internal factors that affect their computing identity and self-efficacy, iCODE will implement practices to support diversity, equity, inclusion, and fairness. The researchers will design both the assignments and the APAs to follow the culturally responsive instruction model. They will also provide professional development workshops on inclusive learning practices to all instructors involved in the project and their support staff, and will assess the fidelity of that training.

Research Design and Methods: The research will occur in three major phases: prototyping, formative evaluation and refinement, and summative evaluation. The iterative development work occurs over the first two phases. During these phases, the researchers will engage with stakeholders, such as university instructors and students, through interviews. They will use cases and walkthroughs to gain insights from their knowledge and experience to develop and refine iCODE. This work will also attend to issues of usability, feasibility, and cost to help guide revisions. Once the full iCODE is complete, the researchers will conduct a pilot study (the summative evaluation phase). They will collect data from students across multiple sites who are in introduction to programming courses. They will compare the outcomes of CS majors and non-CS majors and consider different underrepresented subgroups in CS (e.g., women, racial minorities).

Control Condition: During the pilot study, the researchers will compare iCODE to two control conditions: standard instruction (business-as-usual control) and another intelligent tutoring system, namely DeepTutor, which does not include the core reading strategies of iCODE but can still provide a one-to-one tutoring context for implementation.

Key Measures: During the development phase, the research team will use data from interviews, think-alouds, eye-tracking, and log files. During the pilot study, they will assess learning using the Foundational CS1 test or a similar one, course completion and course grades, and source code comprehension. They will assess motivation and beliefs using the Computer Science Cultural Attitude and Identity Survey, the Computer Programming Self-Efficacy Scale, and the Achievement-Goal Orientation Questionnaire-Revised. They will measure engagement using log files and measures such as time spent on tasks, verbosity of self-explanation, and session time. To estimate retention, they will administer the Intentions for CS survey at the end of the semester, asking students if they are likely to continue with their CS degree (for majors) or study further and use programming in their work (for non-CS majors), and they will track course registration for up to two semesters.
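
As a rough illustration of how the engagement measures named above (time on task, verbosity of self-explanation, session time) might be derived from log files, the following is a minimal sketch. The log format, the event names, and the function `engagement_summary` are all hypothetical; the abstract does not describe iCODE's actual logging schema, and verbosity is approximated here as a simple word count.

```python
from datetime import datetime

# Hypothetical log for one student: (ISO timestamp, event name, payload).
# The event names and layout are assumptions, not iCODE's real schema.
EXAMPLE_LOG = [
    ("2023-01-10T10:00:00", "session_start", ""),
    ("2023-01-10T10:01:00", "task_start", "task1"),
    ("2023-01-10T10:06:00", "self_explanation", "the loop adds each item to total"),
    ("2023-01-10T10:09:00", "task_end", "task1"),
    ("2023-01-10T10:15:00", "session_end", ""),
]


def engagement_summary(records):
    """Summarize one student's log as session time, time on task, and verbosity."""
    open_session = None       # start time of the session currently open
    open_tasks = {}           # task id -> start time of that task
    session_seconds = 0.0
    task_seconds = 0.0
    explanation_lengths = []  # word count of each self-explanation

    for stamp, event, payload in records:
        when = datetime.fromisoformat(stamp)
        if event == "session_start":
            open_session = when
        elif event == "session_end" and open_session is not None:
            session_seconds += (when - open_session).total_seconds()
            open_session = None
        elif event == "task_start":
            open_tasks[payload] = when
        elif event == "task_end" and payload in open_tasks:
            task_seconds += (when - open_tasks.pop(payload)).total_seconds()
        elif event == "self_explanation":
            explanation_lengths.append(len(payload.split()))

    mean_words = (
        sum(explanation_lengths) / len(explanation_lengths)
        if explanation_lengths
        else 0.0
    )
    return {
        "session_seconds": session_seconds,
        "task_seconds": task_seconds,
        "mean_explanation_words": mean_words,
    }


if __name__ == "__main__":
    print(engagement_summary(EXAMPLE_LOG))
```

Unmatched start or end events are simply ignored in this sketch; a real pipeline would need an explicit rule for sessions that time out or tasks that are abandoned.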

Data Analytic Strategy: The researchers will compare the output of the automated method with expert observations and judgments to analyze the accuracy of various iCODE components. For the summative pilot study, the primary focus of the inferential analyses will be on testing the effectiveness of iCODE relative to the two control conditions. Statistical analyses will comprise descriptive and visual analyses as well as inferential models. The researchers will fit a multiple regression model to each dependent variable separately and will include pretest variables (covariates) in the model as well as two interaction terms testing whether pretest status moderates the effect of iCODE. The models will also include variables for CS major, gender, race/ethnicity, and first-generation status.
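
One way to write down a regression of the kind described above is sketched here. The notation is an assumption made for illustration: the abstract does not give the model equation, and the choice of iCODE as the reference condition, with one indicator for each control condition, is only one possible coding.

```latex
% Illustrative specification only; symbols and dummy coding are assumptions.
% Y_i       : outcome (one dependent variable) for student i
% BAU_i     : 1 if student i is in the business-as-usual control, else 0
% DT_i      : 1 if student i is in the DeepTutor control, else 0
% Pre_i     : pretest score for student i
% X_i       : CS major, gender, race/ethnicity, first-generation status
\begin{equation}
Y_i = \beta_0 + \beta_1 \mathrm{BAU}_i + \beta_2 \mathrm{DT}_i
      + \beta_3 \mathrm{Pre}_i
      + \beta_4 (\mathrm{BAU}_i \times \mathrm{Pre}_i)
      + \beta_5 (\mathrm{DT}_i \times \mathrm{Pre}_i)
      + \boldsymbol{\gamma}^{\top} \mathbf{X}_i + \varepsilon_i
\end{equation}
```

Under this coding, $\beta_1$ and $\beta_2$ capture the difference between each control condition and iCODE when the pretest score is zero, and the two interaction coefficients $\beta_4$ and $\beta_5$ indicate whether that difference changes with pretest status.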

Cost Analysis: The researchers will conduct a cost analysis following the ingredients method, deriving the average unit cost of delivering the intervention at the student level as the estimated total cost of each condition divided by the number of students served.
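
The unit-cost arithmetic described above can be sketched as follows. The `Ingredient` structure, the function names, and every figure in the example are hypothetical placeholders for illustration; they are not project data, and a full ingredients-method analysis involves more steps than this division.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ingredient:
    """One resource needed to deliver a condition (hypothetical structure)."""
    name: str
    quantity: float    # units used, e.g. hours or months
    unit_price: float  # price per unit, in dollars


def total_cost(ingredients):
    """Sum quantity times unit price over every ingredient of one condition."""
    return sum(item.quantity * item.unit_price for item in ingredients)


def unit_cost_per_student(ingredients, students_served):
    """Average cost per student: total condition cost / students served."""
    if students_served <= 0:
        raise ValueError("students_served must be positive")
    return total_cost(ingredients) / students_served


# Placeholder figures for illustration only; not taken from the project.
EXAMPLE_INGREDIENTS = [
    Ingredient("instructor training hours", quantity=10, unit_price=50.0),
    Ingredient("server hosting months", quantity=4, unit_price=100.0),
    Ingredient("lab assistant hours", quantity=30, unit_price=20.0),
]

if __name__ == "__main__":
    cost = unit_cost_per_student(EXAMPLE_INGREDIENTS, students_served=100)
    print(f"Unit cost per student: ${cost:.2f}")
```

Running the same calculation once per condition gives comparable per-student costs for iCODE and each control condition.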

Related IES Projects: DeepTutor: An Intelligent Tutoring System Based on Deep Language and Discourse Processing and Advanced Tutoring Strategies (R305A100875)


Select Publications:

Book chapters

Rus, V., Olney, A.M., & Graesser, A.C. (2023). Deeper learning through interactions with students in natural language. In du Boulay, B., Mitrovic, A., & Yacef, K. (eds), Handbook of Artificial Intelligence in Education. Elgar Handbooks in Education. Edward Elgar Publishing. ISBN 1800375409.

Proceedings


Chapagain, J., Risha, Z., Banjade, R., Oli, P., Tamang, L., Brusilovsky, P., & Rus, V. (2023). SelfCode: An Annotated Corpus and a Model for Automated Assessment of Self-Explanation During Source Code Comprehension. The International FLAIRS Conference Proceedings, 36(1).

Oli, P. et al. (2023). Improving Code Comprehension Through Scaffolded Self-explanations. In: Wang, N., Rebolledo-Mendez, G., Dimitrova, V., Matsuda, N., Santos, O.C. (eds) Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky. AIED 2023. Communications in Computer and Information Science, vol 1831. Springer, Cham.

Oli, P., Banjade, R., Narayanan, A.L., & Brusilovsky, P. (2023). When is reading more effective than tutoring? An analysis through the lens of students' self-efficacy among novices in computer science. Proceedings of the 7th Educational Data Mining in Computer Science Education (CSEDM) Workshop, in conjunction with the 13th International Conference on Learning Analytics & Knowledge (LAK23), March 13–17, 2023, Arlington, TX.