|Title:||Automating the Measurement and Assessment of Classroom Discourse|
|Principal Investigator:||Nystrand, Martin||Awardee:||Board of Regents of the University of Wisconsin System|
|Program:||Education Technology|
|Award Period:||3 years (8/1/13-1/31/16)||Award Amount:||$1,599,828|
Co-Principal Investigators: Arthur Graesser, Andrew Olney (University of Memphis); Sidney D’Mello (University of Notre Dame); Sean Kelly (University of Pittsburgh)
Purpose: The purpose of this project is to automate the Classroom Language Assessment System (CLASS 4.24), a computer program developed at the University of Wisconsin Center for Education Research. CLASS 4.24 is used to code classroom interactions between a teacher and his or her students. It (a) has successfully identified key instructional variables promoting reading achievement, (b) has been used to successfully support professional development, and (c) is currently used for teacher education at Michigan State and Ohio State Universities. By automating CLASS 4.24, the team anticipates that the coding of classroom discourse will be radically simplified, thereby reducing its cost and increasing its potential use. The automated version, CLASS 5.0, is intended to facilitate scaling up the coding of classroom discourse, allowing classroom discourse to become a quantifiable variable for teacher education, professional development, and classroom research.
Project Activities: The research team will adopt a systems-engineering approach, iteratively designing, testing, revising, and retesting CLASS to produce CLASS 5.0, which will automate the coding and results of CLASS 4.24. Automating the system requires developing three major technologies: (a) automatic speech recognition (ASR), (b) a question and speech act classifier (QSAC), and (c) macro-level discourse modeling. The system will be cross-validated against human coding of audio recordings. The ASR and QSAC software will be combined with a macro-level discourse model in order to identify higher-order latent categories of discourse structure in the sequence of speech acts. Usability and feasibility will be tested in small focus groups and classroom environments. Additional activities include the design and use of a sociological survey of the classroom context.
Products: The product will be a software system that automatically transcribes and codes classroom discourse. This system can provide feedback to teachers that can inform their instruction, particularly in reading and literature classes. Peer-reviewed publications will also be produced.
Setting: This project will take place in middle school classrooms drawn from a variety of settings (urban, suburban, and rural) in Wisconsin and from a variety of classroom types (i.e., remedial, regular, college-prep, and untracked, depending on school site).
Sample: Participants for this study include students in and teachers of grades 7–8.
Intervention: The project goal is to develop an automated system that will measure classroom discourse. CLASS 5.0 will measure dimensions of classroom discourse that influence the cognitively engaging dimensions of students’ literacy development as well as engagement in classroom instruction. The system will record classroom discussions and analyze their content for discourse characteristics in order to provide feedback and guidance to teachers, so that they can in turn improve their dialogic instruction. As part of the system, the QSAC tool will identify authentic questions, classify student statements and their level of evaluation, and categorize the cognitive level of the statements.
Research Design and Methods: The research team will iteratively design, test, revise, and retest the automated system to produce CLASS 5.0, which will automate the coding and results of the current system, CLASS 4.24, which relies on human scoring. The work will proceed in parallel across two research teams, one at Wisconsin and the other at Memphis and Notre Dame. In Year 1, the Memphis/Notre Dame team will work with previously collected transcripts to create a version of CLASS 5.0 that works on transcripts without audio, while the Wisconsin team will record, transcribe, and code authentic sessions using high-quality microphones and phase microphone arrays. In Year 2, the Wisconsin team will collect more data and test the transcript-only system developed in Year 1. The team will test the usability and feasibility of the intervention with small focus groups and in classroom environments. The team will also design and carry out a sociological survey of classroom context. During this time, the Memphis/Notre Dame team will use the data Wisconsin collected in Year 1 to incorporate acoustic information into the existing models. By the end of Year 2, the Memphis/Notre Dame team will have produced the first fully functional version of CLASS 5.0. In Year 3, the Wisconsin team will conduct a pilot study of this version of CLASS 5.0 in classrooms. The goal of the pilot study is to investigate the relationship between students’ exposure to dialogic instruction and patterns of achievement growth. The Memphis/Notre Dame team will continue to refine CLASS 5.0, focusing on both performance and usability issues.
Control Condition: Due to the nature of this study design, there is no control condition.
Key Measures: Key measures of the system include comparisons of discourse coding between the automated version of CLASS (CLASS 5.0) and the existing version (CLASS 4.24), which requires human coding. Measures of student learning include the Terra Nova reading test (grades 7 and 8) and language arts and writing tests (grade 8), as well as a researcher-developed essay writing assessment, which is coded for level of abstraction, coherence, and elaboration of argument.
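The comparison between automated and human coding can be illustrated with a chance-corrected agreement statistic. The abstract does not say which statistic the team uses, so the following is only a minimal sketch that assumes Cohen's kappa; the function name `cohens_kappa`, the label names, and the sample codes are all hypothetical and not taken from the project.

```python
from collections import Counter

def cohens_kappa(human, automated):
    """Chance-corrected agreement between two equal-length sequences of codes."""
    if len(human) != len(automated) or not human:
        raise ValueError("need two non-empty code sequences of equal length")
    n = len(human)
    # Proportion of items on which the two coders assign the same code.
    observed = sum(h == a for h, a in zip(human, automated)) / n
    # Agreement expected by chance, from each coder's marginal code frequencies.
    human_counts = Counter(human)
    auto_counts = Counter(automated)
    expected = sum(human_counts[c] * auto_counts[c] for c in human_counts) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical code throughout.
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical codes for six teacher questions (illustrative data only).
human_codes = ["authentic", "test", "test", "authentic", "test", "test"]
auto_codes = ["authentic", "test", "authentic", "authentic", "test", "test"]
kappa = cohens_kappa(human_codes, auto_codes)
print(f"kappa = {kappa:.3f}")
```

Kappa is used here instead of raw percent agreement because discourse codes are often imbalanced, and two coders can agree frequently by chance alone when one code dominates.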
Data Analytic Strategy: Researchers will use clustering, Hidden Markov Models, state transition networks, and other categorical time series analysis techniques to identify higher-order latent categories of discourse structure in the sequence of speech acts. The researchers will also use multilevel models of student achievement growth in literacy skills. Student achievement data will be collected at two time points, allowing for change score models with multiple specifications to adjust for the relationship between initial status and achievement growth.
Nystrand, M. (2017). Twenty acres: Events that transform us.
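To make the idea of latent discourse categories concrete, the sketch below decodes the most likely hidden-state sequence for a short sequence of speech acts using the Viterbi algorithm on a small Hidden Markov Model. It is an illustration only: the state names, speech-act labels, and all probabilities are invented for the example, and in practice the project would estimate such parameters from coded classroom data.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a sequence of observations."""
    # Log probabilities avoid numerical underflow on long sequences.
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best previous state from which to enter state s.
            prev, score = max(
                ((p, best[-1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            scores[s] = score + math.log(emit_p[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Hypothetical latent discourse states and speech-act labels (assumptions).
states = ["recitation", "discussion"]
start_p = {"recitation": 0.7, "discussion": 0.3}
trans_p = {
    "recitation": {"recitation": 0.9, "discussion": 0.1},
    "discussion": {"recitation": 0.2, "discussion": 0.8},
}
emit_p = {
    "recitation": {"teacher_question": 0.6, "student_answer": 0.35, "student_question": 0.05},
    "discussion": {"teacher_question": 0.2, "student_answer": 0.3, "student_question": 0.5},
}

speech_acts = ["teacher_question", "student_answer"]
path = viterbi(speech_acts, states, start_p, trans_p, emit_p)
print(path)
```

All probabilities in the example are strictly positive, which the log-space computation requires; a model with zero-probability transitions or emissions would need smoothing or explicit handling of negative infinity.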
Blanchard, N., Brady, M., Olney, A. M., Glaus, M., Sun, X., Nystrand, M., and D'Mello, S. (2015). A study of automatic speech recognition in noisy classroom environments for automated dialog analysis. Artificial Intelligence in Education (pp. 23–33).
Graesser, A.C., Dowell, N., and Clewley, D. (2017). Assessing Collaborative Problem Solving Through Conversational Agents. Innovative Assessment of Collaboration (pp. 65–80).
Graesser, A.C., Keshtkar, F., and Li, H. (2014). The Role of Natural Language and Discourse Processing in Advanced Tutoring Systems. In T. Holtgraves (Ed.), The Oxford Handbook of Language and Social Psychology (pp. 491–509). New York: Oxford Handbooks Online.
Olney, A., D'Mello, S.K., Risko, E.F., and Graesser, A.C. (in press). Attention and Engagement in Educational Contexts: The Role of Task Demands in Structuring Goals That Guide Attention. In J. Fawcett, E.F. Risko, and A. Kingstone (Eds.), Handbook of Attention. Cambridge, MA: MIT Press.
Book chapter, edition specified
Olney, A. M., Kelly, S., Samei, B., Donnelly, P., and D'Mello, S. K. (2017). Assessing Teacher Questions in Classrooms. Design Recommendations for Intelligent Tutoring Systems: Assessment (Volume 5) (5th ed., pp. 261–274). Army Research Laboratory.
Journal article, monograph, or newsletter
Kelly, S., and Northrop, L. (2014). Opportunities for Class Discussion in English and Language Arts. A Tri-State Area School Study Council Research Brief. The Forum, 20: 2.
Kelly, S., Zhang, Y., Northrop, L., VanDerHeide, J., Dunn, M., and Caughlan, S. (2018). English and Language Arts Teachers' Perspectives on Schooling: Initial Exposure to a Teacher Education Curriculum. Teacher Education Quarterly, 45(1): 57–85.
Blanchard, N., D'Mello, S., Olney, A. M., and Nystrand, M. (2015). Automatic Classification of Question & Answer Discourse Segments from Teacher's Speech in Classrooms. In International Conference on Educational Data Mining (EDM) (pp. 282–288). Madrid, ES: International Educational Data Mining Society.
Blanchard, N., Donnelly, P.J., Olney, A.M., Borhan, S., Ward, B., Sun, X., Kelly, S., Nystrand, M., and D'Mello, S.K. (2016). Identifying Teacher Questions Using Automatic Speech Recognition in Classrooms. In 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Los Angeles, CA: ResearchGate.
Blanchard, N., Donnelly, P.J., Olney, A.M., Samei, B., Ward, B., Sun, X., Kelly, S., Nystrand, M., and D'Mello, S.K. (2016). Semi-Automatic Detection of Teacher Questions from Human-Transcripts of Audio in Live Classrooms. In 9th International Conference on Educational Data Mining (pp. 672–674). Raleigh, NC: International Educational Data Mining Society (IEDMS).
D'Mello, S. K., Olney, A. M., Blanchard, N., Samei, B., Sun, X., Ward, B., and Kelly, S. (2015). Multimodal Capture of Teacher-Student Interactions for Automated Dialogic Analysis in Live Classrooms. In Proceedings of the 2015 ACM on International Conference on Multimodal Interaction (pp. 557–566).
Donnelly, P. J., Blanchard, N., Olney, A. M., Kelly, S., Nystrand, M., and D'Mello, S. K. (in press). Words Matter: Automatic Detection of Questions in Classroom Discourse Using Linguistics, Paralinguistics, and Context. In Proceedings of the 7th International Learning Analytics and Knowledge Conference.
Donnelly, P. J., Blanchard, N., Samei, B., Olney, A. M., Sun, X., Ward, B., Kelly, S., Nystrand, M., and D'Mello, S. K. (2016). Automatic Teacher Modeling from Live Classroom Audio. In 24th ACM International Conference on User Modeling, Adaptation, and Personalization (UMAP 2016) (pp. 45–53). Halifax, NS, Canada: Association for Computing Machinery (ACM).
Donnelly, P., Blanchard, N., Samei, B., Olney, A. M., Sun, X., Ward, B., and D'Mello, S. K. (2016). Multi-sensor Modeling of Teacher Instructional Segments in Live Classrooms. In Proceedings of the 18th ACM International Conference on Multimodal Interaction (pp. 177–184).
Olney, A. M., Samei, B., Donnelly, P. J., and D'Mello, S. K. (2017). Assessing the Dialogic Properties of Classroom Discourse: Proportion Models for Imbalanced Classes. In Proceedings of the 10th International Conference on Educational Data Mining.
Samei, B., Olney, A., Kelly, S., Nystrand, M., D'Mello, S., Blanchard, N., and Graesser, A. (2014). Domain Independent Assessment of Dialogic Properties of Classroom Discourse. In Proceedings of the 7th International Conference on Educational Data Mining (EDM 2014) (pp. 233–236).
Samei, B., Olney, A., Kelly, S., Nystrand, M., D'Mello, S., Blanchard, N., and Graesser, A. (2015). Modeling Classroom Discourse: Do Models That Predict Dialogic Instruction Properties Generalize Across Populations? In Proceedings of the 8th International Conference on Educational Data Mining (pp. 444–447).