Approved National Board for Education Sciences Resolutions (Since Inception)
Congress, in authorizing and funding evaluations of federal education programs, should require [program] grantees, as a condition of grant award, to participate in the evaluation if asked, including random assignment to intervention and control groups as appropriate. (April 2005)
Congress and the U.S. Department of Education should ensure that individual student data can be used by researchers (with appropriate safeguards for confidentiality) in order to provide evaluations and analyses to improve our schools. (September 2006: see background material below)
Policy Recommendation: Congress and the Department of Education should ensure that individual student data can be used by researchers (with appropriate safeguards for confidentiality) in order to provide evaluations and analyses to improve our schools.
Resolution Adopted by the National Board for Education Sciences, September 2006
Problem to be addressed:
The use of longitudinal student data, especially the achievement data generated under NCLB, could provide a wealth of information to districts about what is and is not working in their attempts to improve student outcomes. Because of increasing concerns about the interpretation of the Family Educational Rights and Privacy Act (FERPA)1, however, many states are afraid to provide data to researchers. This includes states that previously released data for research purposes (e.g., Texas and New York) and a variety of others that now have the capacity but are afraid to proceed.
FERPA is designed to protect individual student privacy and, as a general principle, requires that individuals sign consent forms before any data about them is released by educational institutions in an identifiable way. (Data that cannot be individually identified can of course be used). There are, however, special provisions that permit identifiable data to be used by school systems, states, and the federal Department to conduct analyses that will help to improve the schools.
Perhaps because of the recent increased availability of data on student performance and on school operations, there has been new attention to ensuring the confidentiality of student records. It is essential, however, that this attention to confidentiality issues not preclude valid research and evaluation activities that are conducted within the requirements of FERPA. Because of the Department of Education's role in developing regulations for the implementation of FERPA, the Department has a crucial part in achieving these goals.
The most reliable analyses draw on longitudinal information about individual student experiences, but only if they employ modern panel data methods—methods that uniformly require linking individuals over time. (Note, however, that linking individuals is not the same as identifying them.) Attempts to make data available for public use (de-identified data under FERPA) generally destroy their research value. For example, approaches such as suppressing data with "small cell sizes" or adding a random element to some of the data destroy the possibility of conducting the most reliable research studies.
NCLB requires states to develop databases about student achievement, and IES, under Congressional direction, has provided funds and encouragement for states to develop longitudinal student databases. These activities make enormous sense from a public policy perspective as attempts to learn about the causes of student achievement. But they are rendered useless if the data cannot be used effectively to learn about the impacts of educational policies and other factors on student performance.
The Department of Education, following IES leadership, should work to promote data access for qualified researchers. First, the Secretary should clearly articulate the need for states to provide researcher access to data so that new knowledge can be generated. Second, the Secretary should provide guidelines that ensure protection for confidential records while still allowing individual longitudinal data to be used by independent researchers. For example, providing for third party encryption of identifying information (such as SSNs, names, or addresses) would maintain privacy while permitting bona fide researchers access to longitudinal data. Third, the Department should make clear that the primary goal is protection of the data, not restriction of the analyses or of the researchers who conduct them. It is important that independent researchers be permitted to conduct scientific research and evaluations as long as they can demonstrate that they protect the confidentiality of the data and that their research satisfies the FERPA requirements.
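As a purely illustrative sketch of the kind of third-party encryption described above (the field names, key handling, and hashing approach below are assumptions for illustration, not part of the Board's recommendation), a trusted third party could replace direct identifiers with keyed one-way hashes, so that researchers can link a student's records across years without ever seeing the identifiers themselves:

```python
# Illustrative sketch only: a trusted third party replaces direct identifiers
# (e.g., SSNs or names) with a keyed one-way hash before releasing data to
# researchers. Records remain linkable across years, but the identifiers are
# never disclosed. All field names and values here are hypothetical.
import hmac
import hashlib

SECRET_KEY = b"held-only-by-the-trusted-third-party"  # never shared with researchers

def pseudonymize(student_id: str) -> str:
    """Return a stable, non-reversible research ID for a student identifier."""
    return hmac.new(SECRET_KEY, student_id.encode("utf-8"), hashlib.sha256).hexdigest()

def release(records):
    """Strip the direct identifier and attach the pseudonymous research ID."""
    return [
        {"research_id": pseudonymize(r["ssn"]),
         **{k: v for k, v in r.items() if k != "ssn"}}
        for r in records
    ]

# Two years of hypothetical records keyed by SSN (identifiable data).
records_2005 = [{"ssn": "123-45-6789", "grade": 4, "math_score": 512}]
records_2006 = [{"ssn": "123-45-6789", "grade": 5, "math_score": 538}]

released_2005 = release(records_2005)
released_2006 = release(records_2006)

# Researchers can link the same student across years without knowing who it is.
assert released_2005[0]["research_id"] == released_2006[0]["research_id"]
print(released_2005[0]["research_id"][:16], "... links both years")
```

Because the key is held only by the third party, researchers receive stable linkage IDs but cannot reverse them to recover the underlying SSNs or names.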
1 The Family Educational Rights and Privacy Act (FERPA) (20 U.S.C. § 1232g; 34 CFR Part 99) is a Federal law that protects the privacy of student education records. See http://www.ed.gov/policy/gen/guid/fpco/ferpa/index.html.
Congress should designate the Institute of Education Sciences, in statute, as the lead agency for all congressionally authorized evaluations of U.S. Department of Education programs, responsible for all operations, contracts, and reports associated with such evaluations. (September 2006: background material)
Policy Recommendation: That Congress designate IES, in statute, as the lead agency for all Congressionally-authorized evaluations of Education Department programs, responsible for all operations, contracts, and reports associated with such evaluations.
Resolution Adopted by the National Board for Education Sciences, September 2006
Rationale:
- Rationale for IES as the lead agency for evaluations of program effectiveness (sometimes called "impact" evaluations): Although the Department and Congress generally recognize IES as the lead agency for such evaluations, this practice has not been codified in law.
As a result, although IES—uniquely among Departmental agencies—has the institutional expertise to ensure the scientific rigor and independence of such evaluations, its role as the lead agency for these evaluations is susceptible to administrative change.
The Senate Appropriations Committee report accompanying its FY 2007 Labor-HHS Education Appropriations bill (S. 3708, as approved by the Committee on July 20, 2006) summarizes reasons why IES should have the lead role in such evaluations, as follows:
The Committee strongly supports the Department's efforts to carry out congressionally authorized evaluations of Federal education programs using rigorous methodologies, particularly random assignment, that are capable of producing scientifically valid knowledge regarding which program activities are effective. To ensure that authorized evaluations are conducted in a rigorous manner that is independent of the program office and includes scientific peer review, the Committee believes that the Institute of Education Sciences should be the lead agency for the design and implementation of these evaluations. The Committee believes further that it is essential for program offices to work collaboratively with the Institute to include a priority or requirement in program solicitations for grantee participation in such evaluations, including random assignment, to the extent the Institute deems appropriate and where not specifically prohibited by law. (Sen. Rept. 109-287, July 20, 2006, p. 287)
Such Committee report language provides direction to the Department, and is generally followed, but does not have the same authority or permanence as statute. Thus it, too, is potentially susceptible to change in a way that statutory language is not.
- Rationale for IES as the lead agency for other types of evaluations, many of which are currently administered by other Department offices (e.g., evaluations of program implementation, cost, and feasibility):
- It is important that all evaluations—not just those assessing effectiveness—be conducted with the rigor, independence, and scientific peer review that IES uniquely offers. All IES evaluations are governed by statutory requirements for peer review and managed by a standards and review office that operates independently of the programs being evaluated. The same is not true for evaluations currently managed by other Departmental offices.
- It would allow for greater coordination among the various types of evaluations, and avoid costly duplication of effort. For instance, it often makes sense for a single research team to conduct an evaluation of a program's effectiveness and of its implementation as part of the same project, rather than for two research teams to conduct these evaluations separately. As an actual example, IES's evaluation of the effectiveness of Reading First, and the Policy and Program Studies Service's (PPSS) evaluation of the implementation of Reading First, are both collecting data on classroom practices of program grantees and non-grantees. The two evaluations are using different measures, samples, and timelines, causing confusion and wasting resources.
Precedent for this recommendation: IES is explicitly designated in statute as the lead agency for the national evaluation of Title I of the Elementary and Secondary Education Act.
Specifically, Section 173 of the Education Sciences Reform Act of 2002 (Public Law 107-279) states that "The Evaluation and Regional Assistance Commissioner [within IES] . . . shall administer all operations and contracts associated with evaluations authorized by part E of title I of the Elementary and Secondary Education Act of 1965 . . . and administered by the Department . . . ."
The Title I evaluation is one of the largest and most important Departmental evaluations. Our recommendation is that Congress enact a legislative provision extending this same requirement to all Congressionally-authorized evaluations of Department programs.
This recommendation would apply to: All Departmental program evaluations conducted with explicit Congressional authorization.
This would include evaluations mandated by Congress as well as those authorized but not mandated (e.g., through national activities, evaluation set-aside, or similar statutory provisions). This recommendation would not apply to evaluations conducted without explicit Congressional authorization (e.g., evaluations that the Department might conduct with its discretionary funds for Program Administration).
Congress should allow the U.S. Department of Education to pool funds generated by the 0.5 percent evaluation set-aside from smaller programs. (September 2006: see background material below)
Policy Recommendation: That Congress Allow the Education Department To Pool Funds Generated By the 0.5% Evaluation Set-Aside From Smaller Programs
Resolution Adopted by the National Board for Education Sciences, September 2006
Background: Section 9601 of the Elementary and Secondary Education Act (ESEA) authorizes a 0.5% evaluation set-aside for many K–12 programs.
Specifically, Section 9601 of ESEA gives the Department the authority, in many of its K–12 programs, to set aside up to 0.5% of the program's funds to conduct evaluations of that program. This provision applies to the K–12 programs authorized by ESEA other than those in Titles I and III, and other than those which have their own separate evaluation authority. As an example, the Department uses this authority in the $2.9 billion Improving Teacher Quality State Grants program, where it yields approximately $14 million annually for evaluation activities in that program.
Problem to be addressed: The 0.5% evaluation set-aside, when applied to a small Department program, does not generate sufficient funds for a meaningful evaluation of that program.
In many small K–12 programs, the Department does not currently use its Section 9601 set-aside authority, because the set-aside does not generate sufficient funds for a meaningful evaluation of the program. For example, in a $5 million program, the set-aside would yield just $25,000 annually, and it is hard to envision conducting a worthwhile evaluation with that amount.
Section 9601, as currently drafted, does not appear to give the Department the flexibility to pool the set-aside amounts from several small programs in order to generate sufficient funds to conduct a high-quality evaluation in one or a few of these programs.
As a result, the Department is not using its evaluation set-aside authority in programs whose funding totals approximately $2.38 billion, where it could potentially yield a total of $11.9 million in funds for evaluation. Many program activities—including some highly innovative smaller programs—therefore go unevaluated, denying the policy community a valuable opportunity to learn which activities are truly effective in improving K–12 education.
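As a quick worked check on the figures cited in this section (an illustration only, using the program totals already stated above), the set-aside is simply 0.5 percent of the relevant appropriation; the first product below corresponds to the approximately $14 million figure for the Improving Teacher Quality State Grants program:

\[
0.005 \times \$2{,}900{,}000{,}000 = \$14{,}500{,}000, \qquad
0.005 \times \$5{,}000{,}000 = \$25{,}000, \qquad
0.005 \times \$2{,}380{,}000{,}000 = \$11{,}900{,}000.
\]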
Recommendation: That Congress make a modest legislative revision to Section 9601, to explicitly allow the Department to pool set-aside amounts from several smaller programs to fund meaningful evaluations in one or a few such programs.
In accompanying report language, Congress might request that the Department use these pooled funds in a balanced way to ensure that, over time, the evaluations generate knowledge about "what works" across a wide range of these smaller programs, rather than just a few. This would be in keeping with the concept that programs from which evaluation funds are set aside should generally benefit from their use.
Precedent for this policy: The Department of Health and Human Services (HHS) has had similar evaluation set-aside authority since 1970, which allows it to pool funds in this way.
Specifically, Section 241 of the Public Health Service Act authorizes the Secretary of HHS to set aside 1% of funding from many of its public health service programs to fund program evaluation activities. HHS and its predecessor organization (HEW) have used this authority to fund evaluation activities since 1970, and generally set aside the full authorized amount each year. Because of a difference in the wording of Section 241 compared to Section 9601, HHS is allowed to pool its set-aside funds, and does so. (The relevant wording of these two statutory provisions is attached.)
Attachment: The Relevant Statutory Provisions
The Education Department's evaluation set-aside authority (in Section 9601 of the Elementary and Secondary Education Act):
. . . the Secretary [of Education] may reserve not more than 0.5 percent of the amount appropriated to carry out each categorical program and demonstration project authorized under this Act . . . to conduct . . . (A) comprehensive evaluations of the program or project; and (B) studies of the effectiveness of the program or project and its administrative impact on schools and local educational agencies . . . .
[Note that, in the above text, the words "program" and "project" are singular rather than plural.]
HHS's evaluation set-aside authority (in Section 241 of the Public Health Service Act):
"Such portion as the Secretary [of HHS] shall determine, but not less than 0.2 percent nor more than 1 percent, of any amounts appropriated for programs authorized under this chapter shall be made available for the evaluation . . . of the implementation and effectiveness of such programs."
[Note the use of the plural "programs."]
The U.S. Department of Education should use its "waiver" authority to build scientifically valid knowledge about what works in K–12 education. (September 2006: see background material below)
Policy Recommendation: That the Education Department Use Its "Waiver" Authority To Build Scientifically-Valid Knowledge About What Works In K–12 Education
Resolution Adopted by the National Board for Education Sciences, September 2006
Precedent from Welfare Policy in the 1980s and 1990s Shows How It Can Work
Problem to be addressed: The dearth of scientifically-valid knowledge about "what works" in K–12 education that schools/districts can use to meet Adequate Yearly Progress requirements.
The No Child Left Behind Act (NCLB) provides strong incentives for schools and districts to improve educational outcomes of their students, in order to meet "Adequate Yearly Progress" requirements. NCLB also requires them to implement educational practices backed by "scientifically-based research." Yet schools and districts are missing a critical piece needed to implement these requirements: scientifically-valid knowledge about what works that they can use to improve practice. Specifically, educational practices that have been proven effective in randomized controlled trials—research's gold standard for establishing what works, and the preferred study design set out in NCLB—are rare or, in many areas of education, nonexistent, leaving schools and districts with few research-proven tools they can use to meet NCLB requirements.
Recommendation 1—That the Department use its "waiver" authority to build valid knowledge about which educational interventions (e.g., classroom curricula, teacher professional development programs) are truly effective in enabling schools to make Adequate Yearly Progress.
Specifically, NCLB gives the Education Department the authority to waive certain provisions of federal law and regulation to enable states and districts to carry out innovative demonstration projects of new educational practices. We recommend that the Department, using this authority, put in place an official policy of granting waivers for demonstration projects that include two elements:
- They implement an educational intervention which, based on existing evidence, shows major promise in enabling schools to make Adequate Yearly Progress (e.g., could potentially have a sizeable impact on reading or math achievement, based on evidence from small-scale randomized controlled trials or matched comparison-group studies).
As illustrative examples, such interventions might include highly-promising teacher recruitment and professional development programs, classroom curricula, supplemental services such as tutoring, or schoolwide reform programs. Demonstration projects would generally be designed to implement such an intervention on a large scale, in typical public school settings.
- They provide for an independent, randomized controlled trial to evaluate the effectiveness of the intervention in this typical school setting (i.e., an "effectiveness" trial).
The Department's waiver would allow the schools and districts carrying out these projects to calculate their Adequate Yearly Progress either with or without the students who participate in the randomized controlled trial—whichever calculation yields the higher score.
Such a waiver policy would be attractive to many troubled schools and districts, giving them the flexibility to conduct demonstration projects designed to make Adequate Yearly Progress, in a way that also creates scientifically-valid knowledge that many schools and districts can then use to make Adequate Yearly Progress. Importantly, this policy would require no additional expenditure of federal funds, and could be implemented by the Department within its existing statutory authority.
Recommendation 2—That the Department also use its waiver authority to rigorously evaluate the effect on student achievement of variations it allows in NCLB's accountability rules themselves (e.g., in the sanctions for schools not making Adequate Yearly Progress, or in the formula for calculating Adequate Yearly Progress).
Examples of waivers in NCLB accountability rules that the Department has granted include: (i) allowing several school districts that are not making Adequate Yearly Progress to provide free tutoring to their students before they offer public school choice (rather than offer school choice first, as required by NCLB); and (ii) allowing several states to use "growth models" to determine whether schools make Adequate Yearly Progress (rather than a model based on the absolute level of student achievement, as set out in NCLB).
We recommend that, in the future, whenever the Department grants such a waiver in NCLB's accountability rules to test a new approach that could potentially have important effects on student achievement, it require the entity receiving the waiver to conduct a randomized controlled trial to evaluate the waiver's effect on such achievement. For example, if the Department provides additional waivers to allow troubled school districts to provide free tutoring to their students before offering public school choice, it would require schools in each district to be randomly assigned to a group that provides tutoring first versus a group that offers school choice first, so as to rigorously evaluate what effect this variation in NCLB rules has on student achievement.
Our proposed requirement for a randomized evaluation would not apply to waivers in NCLB accountability rules that serve purposes other than testing new approaches to accountability, are minor rule variations unlikely to have important effects on student achievement, cannot feasibly be evaluated in a randomized controlled trial, and/or are a response to local or regional emergencies. Examples include the waivers that the Department granted to certain schools and districts that enrolled large numbers of students displaced by Hurricane Katrina, allowing them to calculate Adequate Yearly Progress in a way that would not penalize them for accepting these students.
The precedent from welfare policy: Such waiver policies can greatly accelerate the development of valid knowledge about what works, and have a major impact on program effectiveness.
Specifically, from the Reagan through the Clinton Administrations, the U.S. Department of Health and Human Services (HHS) had in place a "demonstration waiver" policy. Under this policy, HHS waived certain provisions of federal law to allow state grantees to test new welfare reform approaches, but only if the grantees agreed to evaluate their reforms in randomized controlled trials. This policy directly resulted in more than 20 large-scale randomized controlled trials of welfare reform programs from the mid-1980s through the mid-1990s.
These trials—along with those that HHS funded directly—built valuable, scientifically-valid knowledge about what works in moving people from welfare to work. Of particular value, they showed conclusively that welfare reform programs that emphasized short-term job-search assistance and training, and encouraged participants to find work quickly, had larger effects on employment, earnings, and welfare dependence than programs that emphasized basic education. The work-focused programs were also much less costly to operate. This knowledge was key to the political consensus behind the 1988 welfare reform act and helped shape the major 1996 welfare reform act, including its strong work requirements. These legislative changes led to dramatic changes in state and federal programs, resulting in major reductions in welfare rolls and gains in employment among low-income Americans.
Conclusion: A similar waiver-demonstration policy in education could help supply a critical missing piece that states, districts, and schools need to achieve the goals of NCLB—namely, scientifically-valid knowledge about which educational interventions, and accountability rules, are truly effective in improving student educational achievement.
Congress should create, in statute, effective incentives for federal education program grantees to adopt practices or strategies meeting the highest standard of evidence of sizeable, sustained effects on important educational outcomes. (May 2007: see background material below)
Policy Recommendation: That Congress create, in statute, effective incentives for federal education program grantees to adopt practices or strategies meeting the highest standard of evidence of sizeable, sustained effects on important educational outcomes.
Resolution Adopted by the National Board for Education Sciences, May 24, 2007
The Problem: Federal education programs, set up to address important problems, often fall short by funding specific practices or strategies ("interventions") that are not effective.
When federally-funded educational interventions have been evaluated in scientifically-rigorous studies, the studies typically find that many are ineffective or only marginally effective, and that a few are even harmful. Interventions found in rigorous studies to produce meaningful, sustained effects on important outcomes—such as academic achievement, grade retention, dropout rates, post-secondary enrollment, and employment and earnings—tend to be the exception. This general pattern occurs in many diverse areas of education—such as dropout prevention, literacy programs, after-school programs, educational technology, school choice, and substance-abuse prevention—as well as in other fields in which rigorous studies have been carried out (e.g., medicine, psychology, welfare and employment, crime and justice).
The Opportunity: Research has identified a few interventions meeting the "top tier" of evidence—i.e., well-designed randomized controlled trials showing sizeable, sustained effects on important outcomes.
Perhaps only about 10 such top-tier interventions now exist in the field of education, in areas such as dropout prevention, early reading, schoolwide reform, school-based substance-abuse prevention, and vocational and adult education.
Possible Incentive: Establishment of a modest-sized competitive grant program to replicate and scale up top-tier interventions, and leverage other funds to support such replication.
- This recommendation is patterned on the evidence-based nurse visitation initiative in the President's FY 08 budget request. That initiative provides $10 million for a new competitive grant program at the Department of Health and Human Services to fund nurse visitation activities that meet the top tier of evidence.
- The competitive grant program would award funds to organizations that:
- Implement a top-tier intervention in any area of education—where "top-tier" might be defined in statute as including interventions shown, in well-designed randomized controlled trials conducted in typical school or community settings, to produce sizeable, sustained improvements in important educational or life outcomes. (Such a showing could be based on a What Works Clearinghouse review or other evidence.)
- Adhere closely to the specific elements of the intervention.
- Obtain sizeable matching funds for their project from other federal or non-federal sources that can appropriately fund the intervention, such as federal Title I, Special Education, Safe and Drug-Free Schools, Career and Technical Education, or Adult Education grants, or state or local funding sources. (Congress may need to make clear that funds from these larger federal programs can be used for this purpose.)
The program would thus be designed to provide seed funding for the replication and scale-up of top-tier interventions—funding which would leverage money from the larger sources described above.
- The program would include rigorous evaluations of the funded projects, where appropriate, to ensure that the interventions remain effective when replicated on a large scale.
Another Possible Incentive: Give priority consideration, in competitive grant programs, to applicants that propose to implement a top-tier intervention (defined as above).
Specifically, in areas of education where top-tier interventions exist, Department programs that make competitive grant awards would give priority consideration (such as 10 additional points out of a possible 100) to grant applicants that propose to implement such an intervention, and ensure close replication of its specific elements.
Conclusion: Rigorous research has identified a few interventions that are very effective in preventing reading failure, substance abuse, dropping out of school, workforce failure, and other outcomes that damage millions of American lives each year. We recommend that Congress provide effective incentives to replicate such interventions, and put them into widespread use.
Congress should revise the statutory definition of "scientifically based research" so that it includes studies likely to produce valid conclusions about a program's effectiveness, and excludes studies that often produce erroneous conclusions. (October 2007: see background material below)
Policy Recommendation: That Congress revise the statutory definition of "scientifically based research" so that it includes studies likely to produce valid conclusions about a program's effectiveness, and excludes studies that often produce erroneous conclusions.
Resolution Adopted by the National Board for Education Sciences, October 31, 2007
The Problem: The current definition includes some study designs that can produce erroneous findings about program effectiveness, leading to practices that are ineffective or possibly harmful.
Many of the Department of Education programs authorized in the No Child Left Behind Act (NCLB) require program grantees to implement educational practices that are based on "scientifically based research" or "scientifically based reading research." Similarly, the Education Sciences Reform Act (ESRA) requires the Institute of Education Sciences' research activities to follow "scientifically based research standards." Currently the law defines these terms quite broadly. To elaborate—
"Scientifically based research" now includes studies that compare program participants to a "control group" of non-participants, without restrictions on how the controls are selected.
The current definitions of "scientifically based reading research" and "scientifically based research standards" are even broader, with no requirement for a control group.
"Scientifically based research" thus currently encompasses studies with very different levels of rigor, including the following:
- Well-designed and implemented randomized controlled trials — which, when feasible, are widely recognized as the strongest design for evaluating a program's effectiveness.
The unique advantage of such studies is that they enable one to assess whether the program itself, as opposed to other factors, causes the observed outcomes. This is because the process of randomly assigning a sufficiently large number of individuals to either a program group or a control group ensures, to a high degree of confidence, that there are no systematic differences between the two groups in any characteristics (observed and unobserved) except one — the program group participates in the program, and the control group does not. Thus the resulting difference in outcomes between the two groups can confidently be attributed to the program and not to other factors.1 (Such studies are sometimes called "experimental" studies.)
- Well-matched comparison-group studies, which evidence suggests can be a second-best alternative when a randomized controlled trial is not feasible.
Such studies compare program participants to a group of non-participants selected through means other than random assignment, but who are very closely matched with participants in key characteristics, such as prior educational achievement, demographics, and motivation (e.g., through matching methods such as "propensity scores," or selection of sample members just above and just below the threshold for program eligibility). Careful investigations have found that, among studies that use non-randomized control groups, these well-matched studies are the most likely to produce valid conclusions about a program's overall effectiveness, although they may still mis-estimate the size of the effect.2
- Comparison-group studies without close matching, which often produce erroneous findings about which practices are effective (but can still be useful in generating hypotheses that merit testing in more rigorous studies).
These are among the most common designs in educational research. There is strong evidence from education and other fields that such designs, although useful in hypothesis-generation, often produce erroneous findings, and therefore should not be relied upon to inform policy decisions.3 This is true even when statistical techniques, such as regression adjustment, are used to correct for observed differences between the program participants and non-participants. Attachment 1 provides a concrete example of how such designs can yield the wrong answer about a program's effectiveness. (Comparison-group studies are sometimes called "quasi-experimental" studies.)
Specific recommendation: That Congress revise the statutory definition of "scientifically based research" and "scientifically based reading research" to clarify that such research "makes claims about an activity's impact on educational outcomes only in well-designed and implemented random assignment experiments, when feasible, and other methods (such as well-matched comparison group studies) that allow for the strongest possible causal inferences when random assignment is not feasible."
Attachment 2 shows this revision, and the language it replaces, in the relevant sections of NCLB and ESRA. This revision is actually an adaptation of language that already appears in ESRA under the definition of "scientifically valid education evaluation" (also shown in attachment 2).
Precedent for the revised definition: It is broadly consistent with the standards of evidence used by authoritative organizations across a wide range of policy areas, such as:
- National Academy of Sciences, Institute of Medicine
- American Psychological Association
- Society for Prevention Research
- Department of Education
- Academic Competitiveness Council (13 federal agencies funding math/science education)
- Department of Justice, Office of Justice Programs
- Food and Drug Administration
- Office of Management and Budget
These various standards all recognize well-designed and implemented randomized controlled trials, where feasible, as the strongest design for evaluating a program or practice's effectiveness, and many recognize well-matched comparison-group studies as a second-best alternative when a randomized controlled trial is not feasible.
Conclusion: The definition of "scientifically based research" should be revised, as discussed above, so that it helps focus federal funds on activities that are truly effective.
Attachment 1: Example of How a Comparison-Group Study Without Close Matching Can Produce Erroneous Conclusions
The following example shows how a comparison-group study without careful matching can fail to replicate a central finding of a well-designed randomized controlled trial, producing an invalid result.
Randomized controlled trial results: From 1993–2004, the Departments of Education and Labor, and several private foundations, sponsored a large, well-designed randomized controlled trial of Career Academies. Career Academies are an educational program for middle and high school students that provides academic and technical courses in small learning communities, with a career theme and partnership with local employers. One of the trial's main findings, at the 8-year follow-up, was that the program had no effect on participants' high school graduation rate, compared to the control group (as shown by the two left-hand bars in the chart below).
Comparison-group study results: When the study team then used a comparison-group design comparing program participants to a non-randomized control group of similar students in similar schools — rather than a randomized control group — the study produced an erroneous finding that Career Academies had a large effect on the high school graduation rate, increasing it by over 30 percent (see the two right-hand bars in the chart below).
[Chart: High school graduation rates for Career Academies. The two left-hand bars compare the program group with the randomized control group (no difference); the two right-hand bars compare the program group with the non-randomized comparison group (an apparent increase of over 30 percent).]
A likely reason the comparison-group design produced this erroneous finding: The program group had volunteered for the Career Academy — which is an indication that they were motivated to achieve — whereas the non-randomized control group members had not volunteered, and so presumably were less motivated on average. This difference in motivational level likely caused the program group to have a higher graduation rate (an effect sometimes called "self-selection bias"). In the randomized controlled trial, by contrast, both the program group and the control group had volunteered for Career Academies prior to random assignment, and so were well-matched in level of motivation, as well as other characteristics.
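To make the self-selection mechanism concrete, the following simulation is a purely illustrative sketch (all probabilities are invented and are not drawn from the Career Academies study): the simulated program has zero true effect, yet comparing volunteers to non-volunteers manufactures a large apparent effect, while random assignment within the volunteer pool recovers the correct answer of roughly zero.

```python
# Illustrative simulation of self-selection bias. All numbers are invented;
# the simulated "program" has no true effect on graduation, but motivated
# students both volunteer more often and graduate more often.
import random

random.seed(0)

def simulate_student():
    motivated = random.random() < 0.5                             # half are highly motivated
    volunteered = random.random() < (0.8 if motivated else 0.2)   # motivated students volunteer more
    graduated = random.random() < (0.85 if motivated else 0.55)   # graduation depends only on motivation
    return motivated, volunteered, graduated

students = [simulate_student() for _ in range(100_000)]
volunteers = [s for s in students if s[1]]
non_volunteers = [s for s in students if not s[1]]

def grad_rate(group):
    return sum(1 for s in group if s[2]) / len(group)

# Randomized controlled trial: randomize within the volunteer pool, so the
# program and control groups are equally motivated on average.
random.shuffle(volunteers)
half = len(volunteers) // 2
rct_program, rct_control = volunteers[:half], volunteers[half:]
print(f"RCT estimate of the effect:              {grad_rate(rct_program) - grad_rate(rct_control):+.3f}")

# Poorly matched comparison group: volunteers versus non-volunteers.
print(f"Comparison-group estimate of the effect: {grad_rate(volunteers) - grad_rate(non_volunteers):+.3f}")
# Typical result: the RCT estimate is near zero (the truth), while the
# comparison-group estimate shows a large spurious "effect" (about +0.18)
# driven entirely by differences in motivation, i.e., self-selection bias.
```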
Attachment 2: Proposed Legislative Language
- Suggested revisions to the definition of "scientifically based research" and "scientifically based reading research" in the No Child Left Behind Act of 2001 (P.L. 107–110)
"(37) Scientifically based research.—The term 'scientifically based research'—
- "means research that involves the application of rigorous, systematic, and objective procedures to obtain reliable and valid knowledge relevant to education activities and programs; and
- "includes research that—
- "employs systematic, empirical methods that draw on observation or experiment;
- "involves rigorous data analyses that are adequate to test the stated hypotheses and justify the general conclusions drawn;
- "relies on measurements or observational methods that provide reliable and valid data across evaluators and observers, across multiple measurements and observations, and across studies by the same or different investigators;
- "makes claims about an activity's impact on educational outcomes only in well-designed and implemented random assignment experiments, when feasible, and other methods (such as well-matched comparison group studies) that allow for the strongest possible causal inferences when random assignment is not feasible;
- "ensures that experimental studies are presented in sufficient detail and clarity to allow for replication or, at a minimum, offer the opportunity to build systematically on their findings; and
- "has been accepted by a peer-reviewed journal or approved by a panel of independent experts through a comparably rigorous, objective, and scientific review.
* * *
"(6) Scientifically based reading research.—The term 'scientifically based reading research' means research that—
- "applies rigorous, systematic, and objective procedures to obtain valid knowledge relevant to reading development, reading instruction, and reading difficulties; and
- "includes research that—
- "employs systematic, empirical methods that draw on observation or experiment;
- "involves rigorous data analyses that are adequate to test the stated hypotheses and justify the general conclusions drawn;
- "relies on measurements or observational methods that provide valid data across evaluators and observers and across multiple measurements and observations;
- "makes claims about an activity's impact on educational outcomes only in well-designed and implemented random assignment experiments, when feasible, and other methods (such as well-matched comparison group studies) that allow for the strongest possible causal inferences when random assignment is not feasible; and
- "has been accepted by a peer-reviewed journal or approved by a panel of independent experts through a comparably rigorous, objective, and scientific review.
- Suggested revision to the definition of "scientifically based research standards" in the Education Sciences Reform Act of 2002 (P.L. 107–279)
(18) Scientifically based research standards.—
- The term "scientifically based research standards" means research standards that—
- apply rigorous, systematic, and objective methodology to obtain reliable and valid knowledge relevant to education activities and programs; and
- present findings and make claims that are appropriate to and supported by the methods that have been employed.
- The term includes, appropriate to the research being conducted—
- employing systematic, empirical methods that draw on observation or experiment;
- involving data analyses that are adequate to support the general findings;
- relying on measurements or observational methods that provide reliable data;
- making claims about an activity's impact on educational outcomes only in well-designed and implemented random assignment experiments, when feasible, and other methods (such as well-matched comparison group studies) that allow for the strongest possible causal inferences when random assignment is not feasible;
- ensuring that studies and methods are presented in sufficient detail and clarity to allow for replication or, at a minimum, to offer the opportunity to build systematically on the findings of the research;
- obtaining acceptance by a peer-reviewed journal or approval by a panel of independent experts through a comparably rigorous, objective, and scientific review; and
- using research designs and methods appropriate to the research question posed.
- The proposed revisions above are adaptations of language that already appears in the Education Sciences Reform Act, under the definition of "scientifically valid education evaluation":
(19) Scientifically valid education evaluation.—The term "scientifically valid education evaluation" means an evaluation that—
- adheres to the highest possible standards of quality with respect to research design and statistical analysis;
- provides an adequate description of the programs evaluated and, to the extent possible, examines the relationship between program implementation and program impacts;
- provides an analysis of the results achieved by the program with respect to its projected effects;
- employs experimental designs using random assignment, when feasible, and other research methodologies that allow for the strongest possible causal inferences when random assignment is not feasible; and
- may study program implementation through a combination of scientifically valid and reliable methods.
References
1 By contrast, nonrandomized studies by their nature can never be entirely confident that they are comparing program participants to non-participants who are equivalent in observed and unobserved characteristics (e.g., motivation). Thus, these studies cannot rule out the possibility that such characteristics, rather than the program itself, are causing an observed difference in outcomes between the two groups.
2 The following are citations to the relevant literature in education, welfare/employment, and other areas of social policy. Howard S. Bloom, Charles Michalopoulos, and Carolyn J. Hill, "Using Experiments to Assess Nonexperimental Comparison-Groups Methods for Measuring Program Effects," in Learning More From Social Experiments: Evolving Analytic Approaches, Russell Sage Foundation, 2005, pp. 173–235. James J. Heckman et al., "Characterizing Selection Bias Using Experimental Data," Econometrica, vol. 66, no. 5, September 1998, pp. 1017–1098. Daniel Friedlander and Philip K. Robins, "Evaluating Program Evaluations: New Evidence on Commonly Used Nonexperimental Methods," American Economic Review, vol. 85, no. 4, September 1995, pp. 923–937. Thomas Fraker and Rebecca Maynard, "The Adequacy of Comparison Group Designs for Evaluations of Employment-Related Programs," Journal of Human Resources, vol. 22, no. 2, spring 1987, pp. 194–227. Robert J. LaLonde, "Evaluating the Econometric Evaluations of Training Programs With Experimental Data," American Economic Review, vol. 76, no. 4, September 1986, pp. 604–620. Roberto Agodini and Mark Dynarski, "Are Experiments the Only Option? A Look at Dropout Prevention Programs," Review of Economics and Statistics, vol. 86, no. 1, 2004, pp. 180–194. Elizabeth Ty Wilde and Rob Hollister, "How Close Is Close Enough? Testing Nonexperimental Estimates of Impact against Experimental Estimates of Impact with Education Test Scores as Outcomes," Institute for Research on Poverty Discussion paper, no. 1242–02, 2002, at http://www.ssc.wisc.edu/irp/, and forthcoming in Journal of Policy Analysis and Management.
This literature is systematically reviewed in Steve Glazerman, Dan M. Levy, and David Myers, "Nonexperimental Replications of Social Experiments: A Systematic Review," Mathematica Policy Research discussion paper, no. 8813–300, September 2002. The portion of this review addressing labor market interventions is published in "Nonexperimental versus Experimental Estimates of Earnings Impact," The Annals of the American Academy of Political and Social Science, vol. 589, September 2003, pp. 63–93.
3 Ibid. (The literature cited in reference 2 addresses the general question of whether, and under what circumstances, comparison-group studies can replicate the results of well-designed randomized controlled trials.)
"The Urgent Need to Improve Health Care Quality," Consensus statement of the Institute of Medicine National Roundtable on Health Care Quality, Journal of the American Medical Association, vol. 280, no. 11, September 16, 1998, p. 1003.
American Psychological Association, "Criteria for Evaluating Treatment Guidelines," American Psychologist, vol. 57, no. 12, December 2002, pp. 1052–1059.
Society for Prevention Research, Standards of Evidence: Criteria for Efficacy, Effectiveness and Dissemination, April 12, 2004, at http://www.preventionresearch.org/sofetext.php.
U.S. Department of Education, "Scientifically-Based Evaluation Methods: Notice of Final Priority," Federal Register, vol. 70, no. 15, January 25, 2005, pp. 3586–3589. U.S. Education Department, Institute of Education Sciences, What Works Clearinghouse Study Review Standards, February 2006, http://ies.ed.gov/ncee/wwc/DocumentSum.aspx?sid=19.
U.S. Department of Education, Report of the Academic Competitiveness Council, May 2007.
U.S. Department of Justice, Office of Juvenile Justice and Delinquency Prevention, Model Programs Guide, at http://www.dsgonline.com/mpg2.5/ratings.htm; U.S. Department of Justice, Office of Justice Programs, What Works Repository, December 2004.
The Food and Drug Administration's standard for assessing the effectiveness of pharmaceutical drugs and medical devices, at 21 C.F.R. § 314.126.
Office of Management and Budget, What Constitutes Strong Evidence of Program Effectiveness, op. cit., no. 4.
James Kemple and Judith Scott-Clayton, "Career Academies: Impacts on Labor Market Outcomes and Educational Attainment," MDRC, February 2004, at http://www.mdrc.org/publications/366/full.pdf. Although the study found that Career Academies had no effect on high school graduation rates, it did find that the program produced sizeable increases in participants' job earnings, compared to the control group.
James Kemple and Kathleen Floyd, "Why Do Impact Evaluations? Notes from Career Academy Research and Practice," presentation at a conference of the Coalition for Evidence-Based Policy and the Council of Chief State School Officers, December 10, 2003, http://www.excelgov.org/usermedia/images/uploads/PDFs/MDRC-Conf-12-09-2003.ppt.
The Board will review and advise the IES Director on grant awards where the proposed grantee is selected out of rank order of applicant scores that result from peer review for scientific merit. (January 2008)
Notice inviting comments on priorities to be proposed to the National Board for Education Sciences of the Institute of Education Sciences
4000-01-U
DEPARTMENT OF EDUCATION
Docket ID ED-2010-IES-0008
AGENCY: Institute of Education Sciences, Department of Education.
ACTION: Notice inviting comments on priorities to be proposed to the National Board for Education Sciences of the Institute of Education Sciences.
SUMMARY: The Director of the Institute of Education Sciences (Institute) has developed priorities to guide the work of the Institute. The National Board for Education Sciences (Board) must approve the priorities, but before proposing the priorities to the Board, the Director must seek public comment on the priorities. The public comments will be provided to the Board prior to its action on the priorities.
DATES: We must receive your comments on or before September 7, 2010.
ADDRESSES: Submit your comments through the Federal eRulemaking Portal or via postal mail, commercial delivery, or hand delivery. We will not accept comments by fax or by e-mail. Please submit your comments only one time, in order to ensure that we do not receive duplicate copies. In addition, please include the Docket ID at the top of your comments.
- Federal eRulemaking Portal: Go to http://www.regulations.gov to submit your comments electronically. Information on using Regulations.gov, including instructions for accessing agency documents, submitting comments, and viewing the docket, is available on the site under "How To Use This Site."
- Postal Mail, Commercial Delivery, or Hand Delivery. If you mail or deliver your comments about these proposed priorities, address your comments to Elizabeth Payer, U.S. Department of Education, 555 New Jersey Avenue, NW., room 602c, Washington, DC 20208.
Privacy Note: The Department's policy for comments received from members of the public (including those comments submitted by mail, commercial delivery, or hand delivery) is to make these submissions available for public viewing in their entirety on the Federal eRulemaking Portal at http://www.regulations.gov. Therefore, commenters should be careful to include in their comments only information that they wish to make publicly available on the Internet.
FOR FURTHER INFORMATION CONTACT: Elizabeth Payer.
Telephone: (202) 219-1310.
If you use a telecommunications device for the deaf (TDD), call the Federal Relay Service (FRS), toll free, at 1-800-877-8339.
SUPPLEMENTARY INFORMATION:
Invitation to Comment: We invite you to submit comments regarding these proposed priorities. During and after the comment period, you may inspect all public comments about these proposed priorities by accessing Regulations.gov. You may also inspect the comments, in person, in room 602q, 555 New Jersey Avenue, NW., Washington, DC, between the hours of 8:30 a.m. and 4:00 p.m., Eastern time, Monday through Friday of each week except Federal holidays.
Assistance to Individuals with Disabilities in Reviewing the Record: On request we will provide an appropriate accommodation or auxiliary aid to an individual with a disability who needs assistance to review the comments or other documents in the public rulemaking record for these proposed priorities. If you want to schedule an appointment for this type of accommodation or auxiliary aid, please contact the person listed under FOR FURTHER INFORMATION CONTACT.
Background: The Education Sciences Reform Act of 2002 (20 U.S.C. 9516) requires that the Director of the Institute propose to the Board priorities for the Institute. The Director is to identify topics that require long term research and topics that are focused on understanding and solving education problems and issues, including those associated with the goals and requirements of the Elementary and Secondary Education Act of 1965, as amended; the Individuals with Disabilities Education Act, as amended; and the Higher Education Act of 1965, as amended; such as closing the achievement gap; ensuring that all children have the ability to obtain a high-quality education and reach, at a minimum, proficiency on State standards and assessments; and ensuring access to, and opportunities for, postsecondary education.
Before submitting proposed priorities to the Board, the Director must make the priorities available to the public for comment for not less than 60 days. Each comment submitted must be provided to the Board.
The Director anticipates submitting to the Board proposed priorities for the Institute at a meeting to be held in September, 2010.
The Board must approve or disapprove the priorities for the Institute proposed by the Director, including any necessary revision of the priorities. Approved priorities are to be transmitted to appropriate congressional committees by the Board.
The Director will publish in the Federal Register the Institute's plan for addressing the priorities and make it available for comment for not less than 60 days.
PROPOSED PRIORITIES
The overall mission of the Institute is to expand fundamental knowledge and understanding of education and to provide education leaders and practitioners, parents and students, researchers, and the general public with unbiased, reliable, and useful information about the condition and progress of education in the United States; about education policies, programs, and practices; and about the effectiveness of Federal and other education programs.
The work of the Institute is grounded in the principle that effective education research must be informed by the interests and needs of education practitioners and policymakers. To this end, the Institute will encourage close partnerships between researchers and practitioners in the conceptualization, planning, and conduct of research and evaluation. The Institute will facilitate the use of education statistics, research, and evaluation in educational planning both by including members of the practitioner community in the design and conduct of the work and by producing reports that are accessible, timely, and meaningful to the day-to-day work of education practitioners and policymakers. Further, the Institute will seek to increase the capacity of education policymakers and practitioners to use the knowledge generated from high quality data analysis, research, and evaluation.
To accomplish this mission, the Institute will compile statistics, support research, conduct evaluations, and facilitate the use of scientific evidence addressing a broad range of education outcomes for all students, including those with disabilities. These education outcomes may include, but are not limited to: school readiness and developmental outcomes for infants, toddlers, and young children; learning, higher order thinking, and achievement in reading and writing, mathematics, and the sciences; behaviors, skills, and dispositions that support learning in school and later success in the workforce; educational attainment in postsecondary, vocational, and adult education; and the training, recruitment, and retention of educators.
Within these areas, the Institute will sponsor work to: examine the state of education in the United States; develop and evaluate innovative approaches to improving education outcomes; understand the characteristics of high-quality teaching and how better to train current and prospective teachers; understand the processes of schooling through which educational policies, programs, and practices affect students; and understand classroom, school, and other social contextual factors that moderate the effects of education practices and contribute to their successful implementation and sustainability. In doing so, the Institute will seek to identify education policies, programs, and practices that improve education outcomes; and to determine how, why, for whom, and under what conditions these policies, programs, and practices are effective. In particular, the Institute will promote research to improve education outcomes for those students who have traditionally been poorly served by the education system because of their socioeconomic status, race/ethnicity, disability, limited English proficiency, and residential or school mobility, with a goal of generating knowledge to assist educators and policymakers in assessing and improving the equity of the education system.
The Institute will maintain rigorous scientific standards for the technical quality of its statistics, research, and evaluation activities, ensuring that the methods applied are appropriate to the questions asked and the results are valid and reliable. The work of the Institute will include a variety of research and statistical methods. The Institute will support the development of improved research methods; improved measures of a broad range of education processes, systems, and outcomes; and improved analytical approaches for designing and conducting education research. Where needed, the Institute will develop and publish rigorous technical standards for these methods. The Institute will ensure the quality and objectivity of its work by submitting all products to rigorous scientific review. In addition to supporting new research, the Institute will facilitate the synthesis of existing and ongoing research to construct coherent bodies of scientific knowledge about education. The Institute will build the capacity of the education research community by supporting post-doctoral and interdisciplinary doctoral training in the education sciences, equipping education researchers with the skills to conduct rigorous research and effectively engage the practitioner community in that research, and by conducting training in research design and methods and in the use of longitudinal data.
Accessible Format: Individuals with disabilities can obtain this document in an accessible format (e.g., braille, large print, audiotape, or computer diskette) on request to the contact person listed under FOR FURTHER INFORMATION CONTACT.
Electronic Access to This Document
You can view this document, as well as all other documents of this Department published in the Federal Register, in text or Adobe Portable Document Format (PDF) on the Internet at the following site: http://www.ed.gov/news/fedregister. To use PDF you must have Adobe Acrobat Reader, which is available free at this site.
You may also view this document in text:
Note: The official version of this document is the document published in the Federal Register. Free Internet access to the official edition of the Federal Register and the Code of Federal Regulations is available on GPO Access at: http://www.gpoaccess.gov/nara/index.html.
(Catalog of Federal Domestic Assistance number does not apply.)
PROGRAM AUTHORITY: 20 U.S.C. 9501 et seq.
Dated:
John Q. Easton,
Director, Institute of Education Sciences.
The Board commends the Secretary and the U.S. Department of Education for moving forward in developing new regulations and guidance about how to maintain confidentiality of educational data under the Family Educational Rights and Privacy Act (FERPA) while also providing for research uses of student and school data. The Department should finalize these regulations quickly, incorporating the major clarifications that have been submitted in comments. (May 2008)
Congress should expand on the program of supporting statewide longitudinal data systems by requiring that states accepting funding under this program agree to make data in these systems available to qualified researchers (subject to FERPA) for the purpose of research that is intended to help improve student achievement. (May 2008)
The Board recommends that Congress continue funding for the Regional Educational Laboratories at current levels as part of any Congressional spending agreement for FY 2011, and authorize the Institute of Education Sciences to extend the existing Laboratory contracts for one additional year beyond their scheduled completion date. (March 2011)
The Board recommends that Congress include the following reforms in the authorizing language of Education Department grant programs, wherever feasible and cost-effective, to advance the use of evidence of effectiveness in decision-making:
- Funding incentives for grant applicants to use program models or strategies ("interventions") supported by evidence of effectiveness, as judged by IES standards such as those used in the Department's Investing in Innovation program;
- Funding to evaluate previously untested but highly-promising interventions, through studies overseen by IES that allow for strong causal conclusions, including randomized controlled trials where appropriate; and
- Funding incentives for state and local educational agencies to engage in systematic evaluation and improvement of local initiatives, consistent with evidence standards established by IES. (March 2011)
The Board's recommendations for ESRA reauthorization (August 2013):
National Board for Education Sciences' Recommended Language for Reauthorization of the Education Sciences Reform Act of 2002 (June 2012):