
Effect of Linguistic Modification of Math Assessment Items on English Language Learner Students

Regional need and study purpose

It may be challenging for students learning English to show what they know and can do in mathematics if the test items that assess this knowledge also test their English language skills. The complexity of the language, or language load, in a mathematics test item may interfere with the ability of English language learner students to demonstrate their understanding of mathematics concepts on achievement tests (Rivera and Collum 2004; Rivera and Stansfield 2001). Mathematics test items can be reworded to minimize their language load without altering the content assessed (Abedi, Courtney, and Leon 2003).

For English language learner students, minimizing language load can provide access to the tested content in a way that helps them better demonstrate their content knowledge and skills. Because the effectiveness of practices for making high-stakes assessments accessible remains unclear, this study investigates how research-based changes to the language of test items can increase English language learner students' access to test content, yielding a more valid and reliable measure of what these students know and can do and more meaningful comparisons with the scores of other students. Because the sample consists of Spanish-speaking students in grades 7 and 8, the results may generalize to similar populations and shed new light on prior research.

This study has one primary research question:

  1. Is the difference between the mean scores of the unmodified and linguistically modified item sets for English language learner students comparable to that for non–English language learner students?

Several secondary questions also guide it:

  2. Is the difference between mean scores on the linguistically modified and unmodified item sets greater for non–English language learner students at low reading levels than for those at high reading levels? That is, does reading ability mediate the effect of linguistic modification for non–English language learner students?
  3. When comparing English language learner and non–English language learner students of similar mathematics achievement levels, does the probability of their answering individual items correctly differ by item set (linguistically modified compared with unmodified)?
  4. Are the underlying dimensions measured by the unmodified and linguistically modified items the same for both student groups? Do the relationships between latent factors, such as mathematics achievement and verbal ability, or between latent factors and test items, differ for English language learner and non–English language learner students?
  5. For non–English language learner students, do scores on the linguistically modified and unmodified item sets correlate similarly with standardized tests of mathematics achievement?

Intervention description

This study investigates how linguistic modification affects students' ability to access and respond to mathematics content on standardized achievement tests. Linguistic modification purposefully alters the language of test items, directions, and/or response options—by reducing sentence length and complexity, using common or familiar words, and inserting concrete language (Abedi 2008; Abedi, Lord, and Plummer 1997; Sato 2008; Sireci, Li, and Scarpati 2002)—to clarify and simplify the text without simplifying or significantly altering the construct (concepts, knowledge, skills) tested. Such modification should reduce the language load of the text in a mathematics item.

Six research-based principles guide linguistic modification, and the strategies described above follow from them.

In the initial stage of the study, experts with specialties in mathematics, applied linguistics, measurement, curriculum and instruction, English language development, and the English language learner population developed two sets of items to measure the effects of linguistic modification on student access to test content. The work group first collected released test items from the National Assessment of Educational Progress and the California Standardized Testing and Reporting (STAR) program web sites. Measurement and content specialists at the national and state levels extensively reviewed the items and found them to be psychometrically sound, aligned to state content standards, and developmentally appropriate for students in grades 7 and 8. (The pooled items represent the broad content strands of all states' standards in mathematics.)

The work group determined that the items most amenable to research-based linguistic modification strategies came from two mathematics content strands: measurement, and numbers and operations. From these two strands, they selected the items most appropriate for linguistic modification and applied the linguistic modification strategies, producing two mathematics item sets—one with linguistically modified items and one with unmodified items.

The treatment intervention in this study was the item set containing 25 linguistically modified mathematics items. The counterfactual was the item set with the original unmodified items. Initial study activities focused on ensuring that the two item sets were sufficiently valid for large-scale data collection efforts. This included conducting cognitive interviews (think-aloud protocols) with English language learner and non–English language learner students in grades 7 and 8 and pilot testing linguistically modified items with a sample of 100 such students.

Study design

Using a 2-by-3 fully crossed design, the study tests middle school students in grades 7 and 8 in the spring and summer of 2008 with linguistically modified items and unmodified items. The factors are item sets (unmodified and linguistically modified) and student population; grade level (7 or 8) serves as a blocking factor. The student population for each item set consists of 600 English language learner students, 600 non–English language learner students at low reading levels, and 600 non–English language learner students at high reading levels, equally divided between grade 7 and 8 students.

Students are randomly assigned to a treatment or control condition within each school, grade, and English proficiency group. Box 1 summarizes the study design.
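As a concrete illustration of this assignment scheme, the Python sketch below splits each school-by-grade-by-proficiency-group stratum evenly between the two conditions. It is only a sketch of the general technique, not the study's actual procedure, and the student record fields (school, grade, group, id) are hypothetical.

    import random

    def assign_conditions(students, seed=2008):
        """Randomly assign students to the treatment (linguistically
        modified item set) or control (unmodified item set) condition
        within each school-by-grade-by-proficiency-group stratum."""
        rng = random.Random(seed)  # fixed seed for a reproducible example
        strata = {}
        # Group student ids by stratum; 'group' is ELL, non-ELL low
        # reading, or non-ELL high reading (hypothetical field names).
        for s in students:
            key = (s["school"], s["grade"], s["group"])
            strata.setdefault(key, []).append(s["id"])
        # Shuffle each stratum and split it evenly between conditions.
        assignments = {}
        for ids in strata.values():
            rng.shuffle(ids)
            half = len(ids) // 2
            for sid in ids[:half]:
                assignments[sid] = "treatment"
            for sid in ids[half:]:
                assignments[sid] = "control"
        return assignments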

The sample requirements were determined through analyses of the power of the statistical tests used, which include analysis of variance and confirmatory factor analysis. With an intended total sample of 3,600 students (600 per cell), the design provides a minimum detectable effect size of 0.20 standard deviation (power = 0.80 and alpha = 0.05) for the main research question addressed by the analysis of variance, which asks whether the score difference between the unmodified and linguistically modified item sets differs for the English language learner and non–English language learner student subgroups. Because researchers anticipated challenges in matching achievement history data to each student participant, the study oversampled; the actual sample is around 4,600 students, ensuring sufficient numbers in each cell after matching.
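The reported minimum detectable effect size can be approximated with standard power software. The sketch below uses the statsmodels package to solve for the effect size detectable in a simple two-group comparison with 600 students per cell; the study's figure of 0.20 applies to the interaction contrast in the full design, which requires a larger effect to detect than a plain two-group difference.

    from statsmodels.stats.power import TTestIndPower

    # Solve for the minimum detectable effect size (Cohen's d) given
    # two groups of 600 students, alpha = .05 (two-sided), power = .80.
    mdes = TTestIndPower().solve_power(
        effect_size=None, nobs1=600, ratio=1.0, alpha=0.05, power=0.80
    )
    print(f"MDES for a two-group contrast: {mdes:.2f} standard deviations")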

Box 1. Study features

Study design • Experimental design.
Implementation • Spring/summer 2008–fall 2008.
Treatment • Item set (treatment group is given the set with the linguistically modified items; control group is given the set with the unmodified items).
Unit of assignment • Random assignment at the student level.
  • Each student randomly assigned to a condition within each school and grade.
  • Each student randomly assigned to the control (administered an item set with the unmodified items) or treatment condition (administered an item set with the linguistically modified items).
  • Achievement history data collected after testing for each student and matched through student identifiers. Data collected included state standardized test scores in English language arts and mathematics and, for English language learner students only, English-language proficiency scores (California English Language Development Test).
Student recruitment • Participating schools submitted rosters of all students in grades 7 and 8 who met eligibility criteria for the state assessment; in most cases this included all students in intact math classrooms.
Sample characteristics • 4,600 students enrolled in grades 7 or 8 in public schools in California.¹
  • The target English language learner student sample consisted of students whose first language is Spanish and who have early-intermediate to high levels of English language proficiency.²
  • The non–English language learner sample consisted of general education students who are English proficient. These students were divided into two groups based on state achievement test scores in English language arts: those with high reading ability and those with low reading ability.
Informed consent • Letters informing parents and guardians about the study were sent home with eligible students. A passive parent consent process was approved for this study.
¹ To control for cross-state differences in the math content standards of state assessments, students were sampled from only one state.
² By studying only native speakers of Spanish, the study controls for sources of variability related to native language. Spanish was selected as the language for the study because 75 percent of English language learner students in the West Region (California, Nevada, Utah, and Arizona) identify Spanish as their primary or secondary language. The population is limited to students with sufficient English proficiency to benefit from linguistic modification of test items, as students who cannot yet read English are less likely to benefit from this accommodation.

Key outcomes and measures

This study examines the effect of linguistic modification on student access to test content. The outcome is student performance in mathematics measured through the linguistically modified and unmodified assessment items.

Data collection approach

Data collection involved six sources of data: expert review, cognitive interviews, item tryouts, operational administration of the item sets, student language background survey, and student achievement history data.

Expert review. Experts in mathematics, the English language learner student population, English language development, measurement, curriculum and instruction, and applied linguistics together reviewed released items from state tests and the National Assessment of Educational Progress and noted their appropriateness for study purposes. The experts then linguistically modified test items using research-based strategies.

Cognitive interviews. Using a think-aloud protocol, researchers elicited feedback from nine students in grades 7 and 8 about the strategies they use to solve mathematics problems.

Item tryouts. The two item sets were administered to 100 middle school English language learner and non–English language learner students.

Operational administration of item sets. Middle school students were randomly assigned to the treatment (item set with the linguistically modified items) or control group (item set with the unmodified items). Each item set included 25 multiple-choice items.

Student language background survey. A five-item language background survey at the end of the item set booklet provided additional information about the Spanish speakers in the English language learner analytic sample.

Student achievement history data. For all participating (tested) students, districts submitted archived achievement history data—state test scores in English language arts and mathematics and, for English language learner students, English language proficiency scores and proficiency level—from school records or the district database. Researchers from Regional Educational Laboratory West then matched archived data to student responses on the study's mathematics item set (linguistically modified or unmodified).

Analysis plan

Several analyses are planned: item-level descriptive analyses, analysis of variance, differential item functioning analysis, factor analyses of the item sets, and test correlations.

Item-level descriptive analyses. Item-level statistics (including frequency distributions for item choices, p-values, standard deviations, point-biserial correlations, and omission rates) are generated from the pilot and operational item sets.
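A minimal sketch of how these statistics might be computed, assuming a pandas DataFrame of scored responses; frequency distributions for item choices would require the unscored response options, which this sketch omits.

    import pandas as pd

    def item_statistics(responses: pd.DataFrame) -> pd.DataFrame:
        """Classical item statistics from a students-by-items matrix of
        scored responses (1 = correct, 0 = incorrect, NaN = omitted)."""
        scored = responses.fillna(0)          # treat omissions as incorrect
        total = scored.sum(axis=1)            # each student's total score
        rows = []
        for item in responses.columns:
            rest = total - scored[item]       # rest score, excluding the item
            rows.append({
                "item": item,
                "p_value": scored[item].mean(),             # difficulty
                "sd": scored[item].std(),
                "point_biserial": scored[item].corr(rest),  # discrimination
                "omission_rate": responses[item].isna().mean(),
            })
        return pd.DataFrame(rows)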

Analysis of variance. Scores on the item sets, disaggregated by student group (English language learner or non–English language learner) and item set (linguistically modified or unmodified), provide information on how each group performed on each item set. A three-factor analysis of variance tests mean differences in scores for the two student groups to examine whether English language learner students can better demonstrate their mathematics ability on the linguistically modified item set. The three factors are item set (linguistically modified or unmodified), student population (English language learner or non–English language learner), and grade (7 or 8). This analysis should help answer the primary research question.

If linguistic modification provides English language learner students greater access to the mathematics content, then the score difference between the linguistically modified and unmodified item sets should be greater for the English language learner population than for the non–English language learner population. The interaction between student population and item set is of particular interest in this analysis because it addresses this hypothesis.
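A sketch of how the three-factor model might be specified with statsmodels, using toy data in place of the study file; the column names are placeholders. Grade enters additively as the blocking factor, and the item set by student population interaction carries the hypothesis of interest.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Toy data standing in for the study file: one row per student with
    # a number-correct score and the three design factors.
    rng = np.random.default_rng(0)
    n = 3600
    df = pd.DataFrame({
        "item_set": rng.choice(["modified", "unmodified"], n),
        "population": rng.choice(["ELL", "non-ELL"], n),
        "grade": rng.choice([7, 8], n),
        "score": rng.integers(0, 26, n),      # 25-item set, scores 0-25
    })

    # The C(item_set):C(population) interaction tests whether the gain
    # from modification differs for the two student populations.
    model = smf.ols("score ~ C(item_set) * C(population) + C(grade)",
                    data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))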

Analysis of variance is also used to examine whether there are score differences between non–English language learner students at low reading levels (below proficient in English language arts) and non–English language learner students at high reading levels (proficient and above in English language arts). The expectation is that if linguistic modification reduces the language burden, the score difference between item sets will be greater for the low-level readers than for the high-level readers. This analysis should help answer research question 2. A performance difference between item sets that does not vary by reading group for non–English language learner students may indicate that the modification has increased student access to the mathematics content as well as to the language.

Differential item functioning analysis. An analysis of differential item functioning addresses whether the chance of a student answering an unmodified item correctly is greater in the non–English language learner population than in the English language learner population, even after controlling for total item set score. In general, differential item functioning may indicate that an item is multidimensional: some construct other than the target achievement construct, one associated with group membership, is contributing to performance on the item. Items showing differential item functioning are examined closely, along with information about the item obtained from the factor analyses, to explain the differential item functioning. Such analyses should help answer research question 3.
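One common way to run such a screen is logistic regression, regressing each item response on the total score and group membership, as in the sketch below; the column names (total_score, group) are placeholders, and the study's exact DIF method is not specified here.

    import pandas as pd
    import statsmodels.formula.api as smf

    def dif_screen(df: pd.DataFrame, item: str):
        """Logistic-regression DIF screen for one dichotomously scored
        item: a significant group coefficient, after conditioning on the
        total item-set score, flags uniform DIF on that item."""
        model = smf.logit(f"{item} ~ total_score + C(group)",
                          data=df).fit(disp=0)
        return model.params, model.pvalues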

Factor structure of the item sets. For each operational item set, exploratory factor analyses estimate the number of constructs assessed by the item sets and their underlying measurement structure (correlations). The results of these analyses are the foundation for further nested confirmatory factor analyses. By testing for differences in measurement structure across student groups and item sets, the confirmatory factor analyses should reveal how linguistic modification, and thus the degree of access to test content, affects the dimensionality of the item sets.
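The exploratory step might look like the following sketch, which uses the third-party factor_analyzer package; the two-factor call and oblique rotation are illustrative assumptions, and tetrachoric correlations would be more defensible for dichotomous items than the package defaults used here.

    import pandas as pd
    from factor_analyzer import FactorAnalyzer

    def run_efa(item_responses: pd.DataFrame, n_factors: int = 2):
        """Fit an exploratory factor analysis to a students-by-items
        matrix of scored responses; return loadings and the variance
        explained by each factor."""
        efa = FactorAnalyzer(n_factors=n_factors, rotation="oblimin")
        efa.fit(item_responses)
        return efa.loadings_, efa.get_factor_variance()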

For each item set the researchers examine the correlations of item parcels with latent factors, as well as the correlations between latent factors (defined through the exploratory factor and item content analyses), for English language learner and non–English language learner students. Researchers anticipate that the item loadings for non–English language learner students on both item sets will be higher than those for English language learner students, that the correlations between latent factors will be higher for non–English language learner students, and that the gap between the student groups will narrow on the linguistically modified item set. These analyses are intended to help answer research question 4.

To explore the factors accounting for these differences, another, construct-irrelevant latent factor is incorporated in the model. This latent factor, which may be labeled student verbal ability, may also affect students' performance on a math test, especially for English language learner students (Abedi, Leon, and Mirocha 2003). If this hypothesis holds, the correlations of item parcels with the verbal ability factor should be higher for English language learner students than for non–English language learner students, regardless of item set. These differences should, however, be less pronounced on the linguistically modified item set.
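A confirmatory model of this kind could be expressed in lavaan-style syntax, as in the sketch below using the semopy package; the parcel names are placeholders, and fitting the model separately by student group and item set would allow the loading and factor-correlation comparisons described above.

    import pandas as pd
    import semopy

    # Item parcels p1-p3 load on a mathematics factor, p4-p5 on the
    # construct-irrelevant verbal ability factor; the factors covary.
    MODEL_DESC = """
    math =~ p1 + p2 + p3
    verbal =~ p4 + p5
    math ~~ verbal
    """

    def fit_cfa(parcel_data: pd.DataFrame):
        """Fit the two-factor CFA to one group's parcel scores; run once
        per student group and item set to compare estimates."""
        model = semopy.Model(MODEL_DESC)
        model.fit(parcel_data)
        return model.inspect()   # loadings and factor covariance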

Test correlations. Analyses of data from school records or district databases will provide information on the relationship between performance on statewide tests of mathematics and performance on the two item sets (linguistically modified and unmodified). The researchers hypothesize that linguistic modification does not alter the mathematics construct assessed, which would be shown by a strong correlation between performance on the study item sets and performance on a standardized test of mathematics achievement.
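Computationally this reduces to a pair of correlations, as in the sketch below; the column names are placeholders for the matched study and state test scores.

    import pandas as pd
    from scipy.stats import pearsonr

    def item_set_correlations(df: pd.DataFrame) -> dict:
        """Correlate study item-set scores with state math test scores,
        separately for each condition; similar correlations for the two
        sets would suggest the mathematics construct is intact."""
        results = {}
        for condition, subset in df.groupby("item_set"):
            r, p = pearsonr(subset["item_set_score"],
                            subset["state_math_score"])
            results[condition] = (r, p)
        return results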

Principal investigators: Stanley Rabinowitz, PhD, and Edynn Sato, PhD.

Contact information

Dr. Neal Finkelstein
Regional Educational Laboratory West
730 Harrison Street
San Francisco, CA 94107-1242
Voice: (415) 615-3171
Fax: (415) 565-3012
Email: nfinkel@wested.org

Region: West

References

Abedi, J. (2008). Linguistic modification. Part I—Language factors in the assessment of English language learners: The theory and principles underlying the linguistic modification approach. Washington, DC: LEP Partnership.

Abedi, J., Courtney, M., and Leon, S. (2003). Research-supported accommodation for English language learners in NAEP. Los Angeles: University of California, Center for the Study of Evaluation, National Center for Research on Evaluation, Standards, and Student Testing.

Abedi, J., Leon, S., and Mirocha, J. (2003). Impact of students' language background on content-based assessment: Analyses of extant data (CSE Technical Report No. 603). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing.

Abedi, J., Lord, C., and Plummer, J. (1997). Final report of language background as a variable in NAEP mathematics performance (CSE Technical Report No. 429). Los Angeles: University of California, Center for the Study of Evaluation, National Center for Research on Evaluation, Standards, and Student Testing.

Rivera, C., and Collum, E. (2004). An analysis of state assessment policies addressing the accommodation of English language learners. Issue paper commissioned for the National Assessment Governing Board Conference on Increasing the Participation of SD and LEP Students in NAEP. Arlington, VA: George Washington University.

Rivera, C., and Stansfield, C.W. (2001, April 10–14). The effects of linguistic simplification of science test items on performance of limited English proficient and monolingual English-speaking students. Paper presented at the Annual Meeting of the American Educational Research Association, Seattle, WA.

Sato, E. (2008). Linguistic modification. Part II—A guide to linguistic modification: increasing English language learner access to academic content. Washington, DC: LEP Partnership.

Sireci, S.G., Li, S., and Scarpati, S. (2002). The effects of test accommodations on test performance: A review of the literature (CEA Research Report 485). Amherst, MA: University of Massachusetts, School of Education.
