
IES Grant

Title: Measuring Vocabulary with Testlets: A New Tool for Assessment
Center: NCER
Year: 2009
Principal Investigator: Scott, Judith
Awardee: University of California, Santa Cruz
Program: Literacy
Award Period: 4 years
Award Amount: $2,036,502
Type: Measurement
Award Number: R305A090550

Co-Principal Investigators: Susan Flinspach and Jack Vevea

Purpose: Vocabulary knowledge is recognized by reading and education experts as an influential component of reading comprehension. Traditional assessments of vocabulary yield scores indicating choice of correct versus incorrect answers, but do not provide more nuanced information about the student's level of understanding. The VINE (Vocabulary Innovations in Education) assessments were developed with support from an IES grant in 2006 to assess growth of fourth graders' vocabulary knowledge. The current project extends this work to develop valid and reliable vocabulary assessments for both fourth and fifth grade students in English language arts, science, math, and social studies.

Project Activities: Investigators will work with teachers at various times in this 4-year study to maximize the validity and utility of the assessments. First, a pool of 40,000–80,000 vocabulary words found in fourth- and fifth-grade textbooks and trade books will be identified and then narrowed into a word bank that tags each word with information relevant to test developers and teachers. In Year 2, researchers will select 1,000 words from the word bank for item development, pilot testing, and refinement of the assessment. A 'testlet' will be written for each word, with each testlet consisting of a sequence of questions that demonstrate progressive knowledge of that word. The two fourth-grade and the two fifth-grade assessments with common anchor items will each be administered to 1,000 children to provide further data for assessment refinement, item bias analysis, and validity studies. In the third year of the study, the final four assessments (each consisting of 30 testlets that can be administered in approximately 30 minutes) will be field tested. An item pool will also be developed and tested for use by teachers in constructing ad hoc classroom tests of vocabulary knowledge. In the fourth year of the project, the final forms of the tests will be administered in the fall and in the spring, both to demonstrate their utility for assessing growth in vocabulary knowledge and to allow an assessment of the ad hoc tests created from the previously developed item pool.

Products: This project will yield a set of standardized assessments with demonstrated utility, validity, and reliability for studying growth in vocabulary proficiency from fourth through fifth grade. A set of items will also be developed for use by teachers to design their own interim tests of student progress in vocabulary development. In addition, the word bank will become available as a tool for teachers that can facilitate informed decisions about which words to teach and what, besides the definition, to teach about the words.

Structured Abstract

Setting: Elementary schools in linguistically and ethnically diverse school districts in California.

Population: Approximately 9,000 fourth- and fifth-grade students will be tested and their scores used for calibration and assessment of the measurement instruments.

Research Design and Methods: In the first year, the investigators will recruit teachers to help with test development based on the curriculum in California. Twenty pilot studies including 4,000 students will be conducted in the fall of Year 2 to gather assessment data for use in identifying good assessment items. The revised items will be used to construct four draft tests to be administered to the same students in the spring. Investigators will administer four forms of the tests to another 4,000 students in Year 3 and to a small operational sample in Year 4. In Years 3 and 4, investigators will scale the assessments, conduct studies of differential item functioning, use confirmatory factor analysis to investigate concurrent validity, produce manuals for test administration and score interpretation, and develop a computerized scoring system. In the last year of the project, both the standardized and ad hoc assessments will be finalized, along with documentation on scoring and administration.

Key Measures: Performance on the new tests will be compared with student scores on the California Standards Test (CST). Confirmatory factor analysis will be conducted to study validity. Convergent validity will be demonstrated to the extent that scores on these new vocabulary tests are positively related to performance on three CST scales: Word Analysis, Fluency, and Systematic Vocabulary Development; Reading Comprehension; and Written and Oral English Language Conventions. Evidence for divergent validity will be shown to the extent that VINE scores show lower correlations with mathematics and science scores than with the reading scales.
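
The convergent and divergent comparison described above can be illustrated with plain correlations. This is a minimal sketch only: the project itself specifies confirmatory factor analysis, for which simple Pearson correlations are a much cruder stand-in. The function name pearson_r and all score values below are invented for illustration and are not project data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented toy scores for six students (NOT real VINE or CST data).
vine = [10, 14, 18, 22, 26, 30]
cst_reading = [300, 320, 335, 350, 372, 390]
cst_math = [340, 310, 365, 330, 350, 345]

r_reading = pearson_r(vine, cst_reading)
r_math = pearson_r(vine, cst_math)

# Convergent evidence: a positive correlation with the reading scale.
# Divergent evidence: a lower correlation with mathematics than with reading.
supports_pattern = r_reading > 0 and r_math < r_reading
print(f"reading r = {r_reading:.2f}, math r = {r_math:.2f}, pattern holds: {supports_pattern}")
```

In this toy data the vocabulary scores track the reading scale closely and the mathematics scale only weakly, which is the pattern the validity argument looks for.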

Data Analytic Strategy: Because students can obtain a score ranging from 0 through 8 on each vocabulary testlet, the assessments will be scaled using graded response item response theory (IRT). Anchor items will be included on test forms for fourth and fifth grade to allow for linking tests across grade levels. Item bias will be evaluated to identify items on which students of different genders, or at different levels of proficiency in English, perform differently despite having equal vocabulary knowledge. Both summed scores (in which the number of correct responses is added up) and IRT-based scores will be produced to facilitate interpretation of assessment results by a wide variety of audiences.
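
The graded response approach named above can be sketched for a single testlet scored 0 through 8. This is a minimal illustration of Samejima's graded response model, not the project's scoring system: the function name grm_category_probs, the discrimination value, and the eight threshold values are hypothetical and chosen only to make the example run.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """P(score = k | theta) for k = 0..len(thresholds) under the graded response model.

    theta: student proficiency; a: testlet discrimination;
    thresholds: increasing difficulty parameters, one per score step.
    """
    # Cumulative curves P(score >= k); P(score >= 0) is 1 and P(score >= max+1) is 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

def expected_score(theta, a, thresholds):
    """Model-implied expected testlet score at proficiency theta."""
    probs = grm_category_probs(theta, a, thresholds)
    return sum(k * p for k, p in enumerate(probs))

# Hypothetical parameters for one testlet with scores 0..8 (eight thresholds).
a = 1.2
thresholds = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]

probs = grm_category_probs(0.0, a, thresholds)
low = expected_score(-1.0, a, thresholds)
high = expected_score(1.0, a, thresholds)
print(len(probs), round(sum(probs), 6), low < high)
```

The nine category probabilities sum to 1 at any proficiency level, and the expected testlet score rises with proficiency. That ordering is what lets a 0 through 8 testlet score be placed on a common IRT scale alongside the simpler summed score.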

Related IES Projects: Vocabulary Development Through Writing: A Key to Academic Success (R305G060140)


Book chapter

Nagy, W.E., and Scott, J.A. (2013). Vocabulary Processes. In D.E. Alvermann, N.J. Unrau, and R.B. Ruddell (Eds.), Theoretical Models and Processes of Reading (pp. 458–476). Newark, DE: International Reading Association.

Book chapter, edition specified

Scott, J.A., Miller, T.F., and Flinspach, S.L. (2012). Developing Word Consciousness: Lessons From Highly Diverse Fourth-Grade Classrooms. In J. Baumann, and E. Kame'enui (Eds.), Vocabulary Instruction: From Research to Practice (2nd ed., pp. 169–188). New York: Guilford Press.

Journal article, monograph, or newsletter

Hiebert, E., Scott, J., Castaneda, R., and Spichtig, A. (2019). An Analysis of the Features of Words That Influence Vocabulary Difficulty. Education Sciences, 9(1): 8. doi:10.3390/educsci9010008

Scott, J.A. (2015). Essential, Enjoyable, and Effective: The What, Why and How of Powerful Vocabulary Instruction That Works in Highly Diverse Settings. Literacy Learning: The Middle Years, 23(1): 14–22.


Flinspach, S.L., Scott, J.A., and Vevea, J.L. (2010). Rare Words in Students' Writing as a Measure of Vocabulary. Oak Creek, WI: National Reading Conference.

Miller, T.F., Gage-Serio, O., and Scott, J.A. (2010). Word Consciousness in Practice: Illustrations From a Fourth-Grade Teacher's Classroom. Oak Creek, WI: National Reading Conference.

Scott, J., Flinspach, S., Miller, T., Gage-Serio, O. and Vevea, J. (2009). An Analysis of Reclassified English Learners, English Learners and Native English Fourth Graders on Assessments of Receptive and Productive Vocabulary. In Proceedings of the 58th Annual Yearbook of the National Reading Conference (pp. 312–329). Oak Creek, WI: National Reading Conference.