Institute of Education Sciences Board Room
80 F Street NW
Board Members Present:
Mr. Jonathan Baron, Vice Chairman
Dr. Carol D'Amico
Dr. David Geary
Mr. Philip Handy
Dr. Eric Hanushek, Chairman
Dr. Sally Shaywitz, M.D.
Ex Officio Members Present:
Dr. Arden Bement, National Science Foundation
Dr. John Q. Easton, Director, Institute of Education Sciences (IES)
Dr. Stuart Kerachsky, National Center for Education Statistics
Dr. Robert Kominski, U.S. Census Bureau, delegate representing Dr. Robert Groves
Dr. Peggy McCardle, National Institute of Child Health and Human Development, delegate representing Dr. Susan Shurin
Dr. Lynn Okagaki, Institute of Education Sciences
Ms. Dixie Sommers, U.S. Department of Labor, delegate representing Mr. Keith Hall
Ms. Norma Garza, Executive Director, NBES
Ms. Mary Grace Lucier, Designated Federal Official
Gil N. Garcia
Mathematica Policy Research, Inc. (MPR):
Members of the Public:
Linda Anthony, AASCU
Janille Chambers, HELP Committee
Sarah Hutcheon, Society for Research in Child Development
John Kohlmoos, Knowledge Alliance
Jady Johnson, Reading Recovery Council
Vicki Myers, U.S. Department of Education
Wangui Njuguna, Education Daily
Kristina Peterson, Fellow, House Education and Labor
Marcia Sprague, ED-OGC
Sarah Spreitzer, Lewis-Burke/USC
Debra Viadero, Education Week
Maria Worthen, HELP Committee
Call to Order, Approval of Agenda, Chair Remarks, and Remarks of the Executive Director, National Board for Education Sciences
Eric Hanushek, Chair
Norma Garza, Executive Director
Dr. Hanushek called the meeting of the National Board for Education Sciences (NBES, the Board) to order. Prior to introducing participants, he explained that although only six Board members were present, nine potential new members were being vetted by the U.S. Senate. The next meeting of the NBES, previously scheduled for January 2010, will be postponed to accommodate the congressional vetting process so that all 15 Board members can attend.
Dr. Hanushek then introduced Dr. Peggy McCardle, the newly appointed designate representing the director of the National Institute of Child Health and Human Development (NICHD), and welcomed comments by John Easton, Director of the Institute of Education Sciences (IES).
IES Director Report
John Easton, IES Director
Dr. Easton began his presentation by reviewing three primary topics for discussion: (1) evaluation of stimulus funds, (2) announcements, and (3) a five-bullet talk about potential IES objectives.
Dr. Easton commented on the unprecedented number of applications to the IES grant program, a situation that has created some pressure for IES to respond. In late October, the Office of Scientific Review successfully ran three 2-day panel review sessions, during which a total of 16 panels that included 288 panelists reviewed 600 proposals. Electronic scoring, used for the first time, substantially improved the efficiency of the review process. IES will run two rounds of applications per year; the second round of review will occur in early winter with the same funding stream, although the funding implications for this number of grant applications are currently unclear.
In addition, 10 regional educational laboratories (RELs) are now running under 5-year contracts that will expire in January, February, and March 2011. IES is preparing for a new REL competition, collecting information to determine the direction of the next generation of RELs and examining the nature of collaborations with colleagues at the U.S. Department of Education, who sponsor similar comprehensive centers. IES solicited feedback from some 12,000 school superintendents about how the RELs can best serve their needs; IES is currently categorizing these responses. A number of people will be in Washington on December 16 to provide input about the focus of lab activities. This information-gathering phase should be completed by the winter, when a statement of work and an RFP are prepared.
In response to a question about whether the RELs will perform similar functions and respond to the same RFPs, Dr. Easton said that the current underlying assumption is that they would be the same. IES is exploring how the comprehensive centers and content centers will interact with the labs; alternatives to the current system may be considered.
Dr. Easton informed the members that Dr. Marsha Silverberg is now the IES Acting Associate Commissioner for Knowledge Utilization, which encompasses the labs, the What Works Clearinghouse (WWC), and the Educational Resources Information Center (ERIC).
Dr. Easton described the following points as five overarching determinants of IES's direction during the course of his term. He explained that his short-term objective is to receive feedback so these priorities ultimately can be translated into research topics.
IES can also draw on the work of the National Assessment Governing Board (NAGB), which also has a wide range of voices at the table. NAGB has been responsible for making the National Assessment of Educational Progress (NAEP) more responsive and sensitive to the needs of practitioners and policymakers.
The American Recovery and Reinvestment Act of 2009 (ARRA) evaluation represents a challenge for IES to maintain rigor while also ensuring the relevance, applicability, and timeliness of research findings.
Dick Murnane, an economist at Harvard University, has addressed key issues related to education research. In "Educational Policy Research: Progress, Puzzles, and Challenges," he argues that although schools may improve by adopting proven programs and curricula, to thrive, schools must become "learning organizations," with leaders learning how to collect, evaluate, and apply data to make decisions about which programs to buy and implement.
Dr. Easton referred to the work of Anthony Bryk, who has written about R&D as an engineering cycle, and to the potentially fruitful conversations he has begun with Jim Shelton of the Office of Innovation and Improvement (OII). He asked the Board to consider how federal resources can be applied to create an infrastructure that supports a more thoughtful and deliberative approach to fostering innovation.
Dr. Easton then invited discussion and feedback from the Board members.
Dr. Hanushek asked how IES has worked and can continue working with the U.S. Department of Education's (ED) Offices of Policy and Innovation to promote these objectives. Dr. Easton replied that the agency has been engaged in various partnerships, some of which represent change; others are progressing in this direction. IES is cautiously moving forward while seeking to maintain objectivity and independence and to develop closer relationships with policymakers in making education research more relevant.
Dr. Easton clarified that not all IES work will focus on specific ED initiatives because of the agency's unique mandate to consider long-range as well as short-term objectives.
Mr. Jonathan Baron, after expressing support for Dr. Easton's approach, asked how strategies for more effective data use and implementation of evidence-based strategies can be rigorously evaluated, to determine if they are improving student achievement, graduation rates, and other outcomes.
Dr. Easton responded that learning organizations depend on a number of key components, including leadership, the ability of teachers to develop professional learning communities, and access to and use of data. Multiple models can be developed from the resulting information, with clearly defined variables, which can then be tested for efficacy.
Mr. Baron suggested two possible models for evaluating whole systems and organizations: (1) the "Communities That Care" randomized evaluation of a community-wide crime and substance-abuse prevention strategy, in which communities choose from a "menu" of evidence-based (EB) crime, delinquency, and substance-abuse prevention programs, and (2) Tom Cook's randomized evaluations of Comer's "School Development Program" — a comprehensive school reform program.
Dr. Easton agreed that a menu approach is appropriate as long as options are presented and study objectives are clearly defined regarding what is to be influenced.
Mr. Philip Handy commented that relevance and dissemination (priorities 1 and 4) are closely related and might be linked. He further suggested that "marketing" might be a more effective departmental title than "communications" in that it connotes outcomes and end products that eventually are accessed by consumers. The objectives are to bring end users to the table to learn about study findings and outcomes and to consider approaches to implementing effective strategies. Ideally, the focus should be on comprehensive rather than silo databases. His question about whether funds are being allocated to a kindergarten–12 database was answered in the negative. Systemic databases should be combined using IES funds to compile state databases.
In response to a follow-up to Dr. Carol D'Amico's question about the relationship between IES and other offices within the agency and how the agency facilitates good practice, Dr. Easton responded that most interaction has been about those offices' priorities and their implications for research and evaluation (R&E). Some interaction has occurred related to the use of the RELs and the comprehensive and content centers. He suggested that IES might encourage further exploration in this direction.
In response to Dr. David Geary's question about the capacity of the education field to use data systems and data to enhance school effectiveness, Dr. Easton said that he could not comment because teacher education has not been an agency priority, but that the Board should consider how to introduce the subject.
Mr. Handy asked about the implications of the five priority items for allocating funds across the different delivery systems (e.g., prekindergarten, kindergarten–12, universities vs. community colleges). Dr. Easton responded that some funding is restricted and that there is some uncertainty about the availability of resources going forward.
Evaluation of Stimulus Funds
Addressing the ARRA evaluations, Dr. Easton informed the Board that ED will allocate $100 billion, 90 percent of which consists of stabilization funds to replace lost state revenue; about $10 billion will be dedicated to competitive grants. Given the significance of this budget and the opportunity to assess both short- and long-term results, it will be important to plan aggressively for evaluations of grant-funded programs. After several months of internal discussions, including outside consultation, plans are being developed to conduct a single, comprehensive, integrated evaluation of stimulus funds.
Rather than evaluating individual programs or funding streams, ED will take a three-pronged approach to (1) assess how funds are being allocated; (2) identify common strategies implemented across funding streams, which will produce in-depth multimethod evaluations of common strategies across these streams; and (3) embed impact evaluations within these strategies where it is feasible for states and localities to stagger the treatments.
This comprehensive approach offers multiple advantages, including eliminating "dueling evaluations" of similar projects and improving the efficiency of data collection by employing one single data collection plan across multiple programs. This type of evaluation will provide timely feedback and longer term outcomes. This will enable the embedding of impact evaluations, some of which will be conducted by IES. Notices are being written in ways that encourage local evaluations, coordinated through technical assistance (TA).
Update on IES Center Activity
IES Commissioners and Staff:
National Center for Education Evaluation (NCEE)
Audrey Pendleton, Marsha Silverberg
Dr. Pendleton presented an overview of individual ARRA program-related evaluations, informing the panel of four principles that guide all ARRA programs in education: (1) state development of rigorous standards and high-quality assessments; (2) establishment of prekindergarten-through-career data systems funded directly by IES; (3) improvement of teacher effectiveness and distribution of effective teachers across all low-performing schools; and (4) intensive support for comprehensive reform, focused on low-performing schools.
Six Individual ARRA Programs
The Race to the Top (RTT), Title I School Improvement Grants (SIG), Investing in Innovation (i3), Teacher Incentive Fund (TIF), State Longitudinal Data Systems (SLDS), and State Fiscal Stabilization Fund (SFSF) were summarized with regard to four key issues:
Race to the Top rewards states that have developed solid infrastructure in data systems to support the process of whole-school reform. The program facilitates reform at the district and school levels. Not all states are expected to receive these substantial grant awards. The focus of the grant is on STEM (science, technology, engineering, and mathematics). An evaluation of Race to the Top schoolwide reform will be conducted in conjunction with School Improvement Grants under Title I.
In the past, the Title I School Improvement Grants program has received approximately $500 million a year; $3 billion has been added through ARRA, also for schoolwide reform. Funding will be allocated to states and to districts with the largest numbers of low-performing schools in states without demonstrated capacity under Race to the Top. Similar strategies are being implemented within states and districts but, at least theoretically, with a difference in capacity at the state level. Random assignment evaluations will be conducted of schoolwide efforts, and these evaluations will involve dialogue in the field with states and districts about gradually phasing in schoolwide programs.
Investing in Innovation builds on IES's work in the WWC to support large-scale implementation of strategies and interventions for which there is rigorous evidence. Much discussion has occurred about the nature of rigorous evidence. The program is structured so that the largest amount of money, at the largest scale, is aimed at programs and interventions with the most rigorous evidence. Two further levels of funding are available: (1) an intermediate level, referred to as "validation grants," and (2) a level of grants for which there is not yet a great deal of evidence but for which the project seeks to explore a promising idea. IES will fund a contract to provide technical assistance (TA) to the grantees, who must hire an independent evaluator to conduct an evaluation; IES will also provide TA to improve the work of the independent evaluators. Although experimental evaluations are not required, substantial competitive points will be awarded for conducting these types of evaluations.
For the Teacher Incentive Fund, NCEE has been working closely with the program office to structure the grant application and evaluation. Procurement is underway for an evaluation contract that examines plans for incentivizing teacher performance based on value-added measures and classroom observations. It is expected that the evaluation will move forward based on a random assignment model.
Dr. Pendleton did not comment on the State Longitudinal Data Systems, given the familiarity of Board members with the program.
A total of $48 billion has been allocated for the State Fiscal Stabilization Fund, which states may use for a number of objectives, including hiring teachers who might have been laid off. In addition, smaller scale formula grant programs under Title I are receiving a substantial amount of funding.
Dr. Silverberg (Acting Associate Commissioner of Knowledge Utilization and National Center for Education Evaluation and Regional Assistance) reviewed the comprehensive perspective on the various evaluation components of ARRA programs, with particular focus on money and strategies, as outlined in the "Comprehensive Evaluation of ARRA Implementation, Outcomes, Impacts." The comprehensive strategy addresses four components:
Dr. Silverberg noted that a variety of large-scale data collection efforts will also be employed. These will include a multiple-respondents survey of states, a nationally representative sample of districts and their schools, and an oversampling of low-performing districts and schools because they are ultimately the target of funding and other efforts.
To understand the functioning of some ARRA programs, particularly those for which funding is allocated to states—most probably RTT and TIF—samples will be expanded to include all grantees. For example, all RTT grantees, rather than just a sample, will be surveyed at the district level.
Dr. Silverberg clarified that to avoid burdening grantees, small samples will be surveyed frequently enough to keep ED informed about the successes and challenges that schools and districts are facing. The survey will generate a combination of different types of analyses, some of which will describe funding and types of implementation and some of which will provide correlational analyses of how funding relates to outcomes. A national overview of ARRA and a review of specific programs will shed additional light on their relative effectiveness.
Additional features of the study include more frequent feedback about programs to help department managers determine the need for additional TA or other guidance. NCEE will present a thorough compilation of funding data to the states and districts. The multiple advantages of this approach include a closer connection with and cooperation from state-level grant recipients. The study will also facilitate earlier and more frequent reporting.
Mr. Baron commented that large-scale IES evaluations make it evident that common assumptions about program effectiveness often are found to be untrue when properly evaluated. He suggested that a possible approach to large-scale data collection would be to identify a few strategies and promising approaches that appear to be effective and that merit replication and evaluation in rigorous impact studies.
Dr. Pendleton responded that the goal of the i3 program is to award grants to programs supported by some evidence. For RTT and SIG grants, only those programs with the highest levels of replication and evidence will be evaluated. This effort marks the first time IES will allocate its research funds for evaluation, which also may have implications for the study. The Office of Management and Budget has added money to the IES budget for this purpose.
Dr. Sally Shaywitz suggested amending or adding to relevant outcome variables so as to follow students from kindergarten through grade 12 and to focus on later outcomes. In her role as head of the study "Mature Adult Outcomes" (following children from age 5 through their 30s), she has found that the results of comparative evaluations are often surprising.
Dr. Pendleton responded that student achievement will be the primary outcome measure. Most of the ARRA assessments will not be administered by IES. The administration's focus is on prekindergarten-through-career data sets. Some programs also provide for mediating variables, particularly teacher- and principal-related measures. Dr. Silverberg added that high schools also will be a focus, particularly with regard to students from low-performing schools, with an emphasis on other outcome measurements such as graduation rates and transitions to college.
In response to questions from Dr. Geary, Dr. Pendleton said that the polls make an attempt to determine grantee self-assessments with regard to successes and challenges. Also, the work of supporting state-level databases would enable states to develop these scorecards. She also clarified that RTT focuses on STEM and includes elementary schools, although a formal notice has not yet been released. The goals of TIF are to (1) determine whether teacher incentives improve student achievement and (2) ensure better performance and retention for teachers and principals.
Dr. McCardle expressed appreciation for ED's investment in state longitudinal databases and allocation of significant funding for the evaluation programs. She asked about the impact of this budget on other research grants. Dr. Easton responded that funding is set for FY 2010 but that some uncertainty remains for FY 2011.
Dr. Silverberg reflected that "teasing out" evaluation strategies may be challenging, as most of the relevant information will come from large-scale surveys, possibly from reports made by individual grantees to monitors in their program offices. Each program also has its own strategies or priorities. The challenge will be finding a balance between obtaining enough useful data without the study becoming unwieldy and decreasing the response rate. The overall approach is not yet entirely mapped out; one approach, however, is to go into more depth in collecting qualitative data.
Dr. Hanushek said that regarding outcomes and impacts, reference had not been made to accessing SLDS. Discussions with Title I staff resulted in an agreement to forgo national evaluations and instead focus on states that already have reliable data sets and systems, which offer many advantages. Dr. Silverberg responded that the impact evaluations will take advantage of the SLDS, at either the district or the state level. Dr. Hanushek emphasized that if evaluation objectives are to provide a "national picture," results can be overly broad, with end users asking for more details.
Dr. Easton commented that the agency uses only four or five SLDS, including data from prekindergarten through 12th grade. Survey results will be reported by strategy across programs, with data on how funds are being applied in different strategies.
Dr. Pendleton said she anticipates that within states or districts, large-scale reform efforts will be mounted using Title I, RTT, and i3 funds for very similar, if not the same, programs. An integrated evaluation strategy is required because of the difficulty of separating out individual funding streams.
Mr. Handy asked whether stabilization funding also includes reform. Dr. Easton responded that situations will occur in which stabilization funds will influence how competitive funds are allocated. States will use stabilization funds differently, which will help clarify the focus of competitive grants.
Mr. Baron asked whether, with regard to the i3 program, additional points would be granted for random assignment versus quasi-experimental designs. Dr. Pendleton responded that this has not yet been determined but that more rigorous designs are preferred. In response to Mr. Baron's question about whether IES will be able to vet outside evaluators, she said that this level of specificity would not apply but that the agency does have TA experience on a smaller scale and has achieved some success in improving programs such as Striving Readers.
In response to Dr. Hanushek's concern that the four evaluation categories outlined in the "Comprehensive Evaluation" may be somewhat vague, Dr. Pendleton responded that for RTT and SIG, in particular, funding allocations have not been determined, although it is clear there will be schoolwide efforts. Commonalities and delineated interventions should be determined as early as possible; random assignment may be feasible if phased implementation is possible in states or districts.
Dr. Hanushek pointed out the potential role of RELs in conducting serious evaluations so that, working with the local districts, they could learn from these improvement grants. Dr. Easton agreed and said that Dr. Silverberg is working on a framework for the new REL competition.
Although the evaluation data set may be diluted—in that every school district in the United States would receive some grant funding—Dr. Easton reflected that due to grant reporting requirements, it should be possible to determine how funds are used and how allocations affect other budget items within the ARRA program matrix. He assured Board members that evaluations would provide answers to questions about the relevance of the data being collected.
Regarding the use of SLDS, Dr. Hanushek said NBES had strongly recommended that one criterion for grant funding be that states have plans in place for the use and dissemination of data to researchers and evaluators. He asked whether the recommendation should be made again.
Dr. Kerachsky confirmed that this recommendation was included in the grant announcement as a goal but not as a requirement. One complication is confidentiality; attempts at reviewing the Family Educational Rights and Privacy Act (FERPA) rules have been made, to enable both construction of the database across jurisdictions and use of the data. In addition to FERPA, there also are state regulations. To Dr. Hanushek's point that the SLDS are required for funding and that states should be queried with regard to plans and strategies for implementing them, Dr. Kerachsky replied that this is not being done, in part because states are awaiting revisions and clarifications to the FERPA rules, which have not yet been announced.
Mr. Baron referred to a study of an H&R Block program, funded by the National Center for Education Research (NCER), the National Science Foundation (NSF), and the Gates Foundation. In this program, a large effectiveness trial in Ohio and North Carolina examined low-income H&R Block clients with college-age children who were offered a vastly simplified college enrollment process. Results of the study indicated a 26 percent increase in students' college enrollment the following fall, compared to the control group. Mr. Baron asked whether i3 or other funding streams could be dedicated to this or similar programs. Because school districts are the i3 grantees, the program would need to be changed to qualify for funding. Dr. Silverberg responded that i3 grants are being targeted to these types of innovative interventions.
Dr. Geary asked about the possible availability of menus of proven interventions that can be scaled up (for districts and states) and menus that help to define evidence-based interventions. Dr. Hanushek replied that currently, end users rely on past dissemination functions of IES.
Dr. Easton stressed that when IES conducts impact studies, rather than simply approving or rejecting the studies, more data about implementation, process, and context information should be collected and analyzed to ensure a better understanding of impacts.
Dr. Hanushek said that Dr. Easton is serving as Acting Commissioner for NCEE because Dr. Phoebe Cottingham's term as Commissioner expired in September and that Dr. Easton is conducting a search for a replacement Commissioner. A discussion of a proposed resolution would follow.
National Center for Education Statistics (NCES)
Dr. Kerachsky (Acting Commissioner, National Center for Education Statistics) informed the Board of several program areas—to be highlighted during his presentation—that are most active in planning new initiatives and reflecting some change.
Postsecondary Education Division (PSD)
One-quarter of NCES's budget is devoted to postsecondary education. The main program is IPEDS (Integrated Postsecondary Education Data System), which is mandated by law and consists of a set of surveys and data collection efforts. IPEDS stretches the purview of a statistical agency because it is an enforcement system; university Title IV recipients are required by law to participate. Other PSD studies include the National Postsecondary Student Aid Study (NPSAS), the Baccalaureate and Beyond Longitudinal Study (B&B), and the Beginning Postsecondary Students Longitudinal Study (BPS).
Dr. Kerachsky explained that the Office of the Under Secretary of Education (OUS), under Martha Kanter, coordinates the Department's operational work on postsecondary education, including the work of the Federal Student Aid program; the Office of Vocational and Adult Education; the Office of Postsecondary Education; and a few others. The Under Secretary has included IES in her management meetings as a data collection group fairly independent of operational units; this allows for input and the opportunity to receive feedback.
More important, a new activity for discussion is the measurement of postsecondary education that is not currently captured because of the focus on Title IV-funded educational programs. A variety of other programs provide industry-level certifications that are not currently measured systematically, and, while community colleges are covered under Title IV, some of their particular training and certification programs may not be. Considered in light of the President's goal that all Americans should have at least 1 year of postsecondary education, it is apparent that our databases should reflect the full breadth of student and workforce accomplishments. Other countries have more aggressive apprenticeship structures of industry-recognized certifications and do a better job of providing and recording postsecondary accomplishments. Dr. Hanushek pointed out that the United States also ranks 15th in terms of participation in postsecondary education.
Dr. Kerachsky said NCES's goal is to further promote and capture postsecondary accomplishments, although specific measures are still to be determined. A meeting was held with the Council of Economic Advisers, the Office of Management and Budget, the U.S. Census Bureau, the Bureau of Labor Statistics and other agencies to determine collection methods. Working groups have been held in the meantime, with NCES assigned a lead role to help the agencies capture these phenomena. Personal interviews will likely be a primary data collection methodology.
Dr. Kominski said both enrollment and attainment objectives are being implied. The American Graduation Initiative, which President Obama informally unveiled in July, includes a number of goals or standards, such as all American adults should have 1 or more years of college or equivalent education beyond high school. The question remains of how "equivalent" and "1 year" are to be defined. Other goals include producing 5 million more graduates by 2020 and achieving the highest college completion rates in the world. The distinction between "certification" and "certificate" has been discussed. It is clear from the involvement of the Council of Economic Advisers and the Joint Economic Committee that attention should be paid to labor market value and outcomes.
Determining these variables will be challenging. A great deal of data is available, but it is critical to determine appropriate capture mechanisms. Dr. Kerachsky emphasized the importance of accounting for human capital in the workforce and obtaining the missing core information that would shed light on these data.
Statewide Longitudinal Data Systems
Significant discussion has been devoted to the SLDS initiative and allocation of the $245 million designated under ARRA for establishing SLDS. Responsibility for this mission was assigned to NCES, under Nancy Smith (a data systems professional and former deputy director of the Data Quality Campaign) and what is now a newly constituted team. ED is also coordinating its efforts with the U.S. Departments of Health and Human Services (HHS) and Labor (DOL).
Several states already have good SLDS, many of which are university based; states are also building separate systems. University data are primarily derived from the states, largely for research purposes, in some cases as a function of parallel efforts.
Mr. Handy commented that the political will required to consolidate databases is immense; this requires CEOs to work in concert with funding legislatures. Federal money will be helpful in this regard, as challenges are rife in trying to consolidate political aspirations of those managing these systems.
NCES is also responsible for developing common data standards. The federal government, which is not allowed to have unit records data, does not operate these data systems and cannot build a single database. The role of the government is to encourage the building and good use of state data systems, with the proviso that they be interoperable across postsecondary and workforce data systems, kindergarten through 12th grade, and states.
The NCES-sponsored Management Information System Forum includes representatives from all states who meet twice yearly to hold serious discussions, demonstrating significant state ownership in this process. Working with the other postsecondary and workforce groups presents other challenges. Although states can provide workforce data, single data systems are not available for postsecondary educational data. Although much work remains, Dr. Kerachsky said that the structure and leadership are in place to move a unified effort forward.
Assessment and Standards
At approximately $130 million, NAEP constitutes the largest part of the budget. NAEP recently issued reports on two major activities.
The first is the Nation's Report Card, an assessment conducted every 2 years of samples in national, state, and large-city schools of fourth-, eighth-, and twelfth-grade reading and mathematics. Results for mathematics have been released; the reading results are being deferred to check trend lines with the new assessment; and twelfth grade results will also be released later.
For the first time in the history of the series, initiated in 1990, no gains were evident in fourth-grade mathematics scores from the previous assessment to the current one (2007 to 2009, in this case), and mathematics achievement results were flat for fourth graders across all subgroups. Modest but statistically significant gains were reported for eighth graders, as has been characteristic of eighth-grade results from the inception of the assessment. Five jurisdictions showed gains for both fourth and eighth grades: Washington, DC; Nevada; and the New England Common Assessment Program states (Vermont, New Hampshire, and Rhode Island).
The second important activity was completion and release of Mapping State Proficiency Standards Onto NAEP Scales: 2005-2007. Continuing work that began with the 2003 NAEP, the study compares state proficiency standards; that is, it compares one state's AYP-measured educational progress to another's. The study does not evaluate state assessments or standards, which are established to meet state needs. However, using NAEP as a common measure of educational achievement, the study determines the NAEP-equivalent score associated with the percent proficient reported by each state for AYP.
Results of this exercise demonstrate that (1) dramatic variation is evident in where states set proficiency standards and (2) most states' proficiency standards fall in the NAEP Basic range. Few are in the NAEP Proficient range, and many are below Basic. State fourth-grade mathematics standards stand out as being particularly low, with the majority below Basic.
Dr. Kerachsky concluded by commenting that there is no evidence to support the view that states have been "dumbing down" their standards to meet NCLB requirements. Of the 58 grade/subject combinations for which changes in standards were observed between 2005 and 2007, 21 percent showed change to more rigorous standards, 45 percent showed change to less rigorous standards, and the remaining 34 percent showed no change in rigor.
Results for the NAEP reading study will be available in April 2010, or possibly earlier.
National Center for Education Research and National Center for Special Education Research (NCER, NCSER)
Dr. Okagaki said her presentation would respond to a request from the NBES Chair and Vice Chair about how NCER research programs engage practitioner communities in the research process.
Thus far in 2009, nearly 400 applications have been submitted in response to NCER/NCSER research competitions, representing about a 35 percent increase. During the past year, efforts have been underway to broaden special education research in policy, system, and finance issues, with the goal of applying research in teacher effectiveness and alternative certification to special education.
Results from an analysis conducted by Li Feng and Tim Sass of Florida's K–20 data warehouse came out earlier in 2009. These results focus on the impact of teacher preparation or teacher professional development (PD) in special education on special education students. The findings show that for students with disabilities, teacher PD does not appear to improve outcomes; however, improved outcomes are demonstrated for students with disabilities who are in general education classes taught by certified special education teachers. This study has not yet been published.
Evaluation of State and Local Education Programs and Policies
Five grants were awarded under this program in FY 2008. Unlike typical NCER research programs, the impetus for this evaluation must originate with state education agencies (SEAs) or local education agencies (LEAs); IES then funds the evaluation. Partnerships are being formed among university researchers, research firms, and districts and states. Grants awarded under this program include evaluations of Ninth Grade Academies in Broward County Public Schools, Core Knowledge Charter Schools in Colorado, the Tennessee Voluntary Pre-K Program, Indiana's Diagnostic Assessment Intervention, and the New Jersey Preschool Expansion Program.
Chronically Low-Performing Schools Research Initiative
This is one of two programs initiated this year in which districts with chronically low-performing schools partner with researchers—working with members from schools and districts—to develop programs geared to improve school outcomes.
Reading for Understanding Initiative
This program is designed to foster development and "rapid" innovation. It comprises large multidisciplinary teams that include cognitive psychologists, school district representatives, evaluators, and others. Interventions will target reading comprehension, including direct team involvement in the schools. NICHD also has funded work to teach students word-level skills. Experimental programs are being designed that address school capacity issues and that, from inception, can be implemented in schools. Interventions will be applied across grade levels.
These cooperative agreements have been developed with the objective of engaging practitioners and cognitive psychologists in working together to provide program input.
Dissemination of Research
NCER does not have dissemination authority. In October, all publications that NCER has issued to date were posted on the agency's website at http://ncer.ed.gov, which includes 90 pages of citations.
Dr. Okagaki then turned to a discussion of NCEE's practice guides, an effort targeted at practitioners. NCER has produced two guides, which are disseminated through the WWC. Dr. Okagaki said that NCER has not worked on any additional guides since the WWC started producing practice guides but that NCER staff may consider producing them in the future for the WWC to disseminate.
Dr. Okagaki concluded her presentation by referring to the many applications that have been received for the Education Research Grant Program 84.305A. This represents nearly an 80 percent increase. She said that 2009 is the first year that the number of awards will be determined by the amount of funding rather than the number of high-quality proposals. An NCER-conducted workshop on evaluating state and local programs drew 100 attendees; 25 proposals have been submitted, a number of which are strong.
"National Board for Education Sciences Recommendation Regarding the Department's Investing in Innovation Fund"
Mr. Baron said that he and Dr. Hanushek had drafted the NBES recommendation for the Board's consideration.
We strongly support the Department's proposed plan to make rigorous evidence of effectiveness a central principle guiding the allocation of the Fund's award money and evaluation requirements. We also support its preference for randomized experiments over other study designs in the evaluation requirements for the Fund's Scale-Up grants. We urge the Department to incorporate this same preference into the plan's definition of "strong evidence" and evaluation requirements for Validation Grants.
As currently drafted, these two provisions give equal weight to well-conducted randomized experiments and nonrandomized quasi-experiments. The evaluations sponsored by IES since its establishment in 2002 clearly illustrate the dangers of this approach, as a number of promising findings from quasi-experimental research (as well as small-scale experiments in tightly-controlled settings) have been overturned in IES's major randomized experiments. A similar pattern occurs in other fields, such as medicine, welfare and employment, and violence prevention. Although we believe well-conducted quasi-experiments play a valuable role in identifying promising interventions that merit evaluation in more definitive randomized experiments, and in guiding policy when such experiments are not feasible, we urge the Fund to recognize a concept that our Board, the National Academies, and many other respected scientific bodies have articulated: that well-conducted randomized experiments, where feasible, are the strongest design for evaluating a program's effectiveness.
States, districts, and schools need valid estimates of both a program's benefits and costs to make informed decisions about whether the program merits taxpayer investment, and in what amount. Yet evaluations of program effectiveness are often designed to measure only benefit. We therefore recommend that the Fund require evaluations to measure costs for the program group (relative to the control or comparison group), and not just benefits.
In many cases, it may be possible to carry out well-conducted randomized experiments, or prospective matched quasi-experiments, at modest cost and minimal burden, by —
The recommendation will be submitted during the comment period for the $650 million Investing in Innovation Fund program. The Fund's three grant categories are (1) scale-up grants, backed by strong evidence; (2) validation grants, backed by moderate evidence; and (3) developmental grants, which have a lower evidence standard. Evaluations are built into the first two grant categories. Parts A, B, and C below lay out the recommendations in response to the Fund's parameters.
Randomized Design (Part A)
As currently drafted, scale-up grants give equal weight to well-conducted randomized and nonrandomized experiments. The recommendation, consistent with that of the National Academy of Sciences, is that ED should prioritize randomized designs, where feasible, and, when not feasible, use other methods (e.g., well-matched comparison group studies) that allow for the strongest possible causal inferences. In medicine, for example, 50 percent to 80 percent of promising findings from phase II nonrandomized clinical trials are overturned in phase III trials. These outcomes have also been demonstrated in research conducted by IES.
Cost Efficiency (Part B)
Evaluations should be conducted from a cost-efficiency standpoint, using existing data on program outcomes. The Fund should require evaluations to produce valid estimates of program costs, not only program benefits. Dr. Hanushek emphasized that cost efficiency should be considered in terms of producing the most, and better informed, decisions possible from experimental evidence. A statement should be included in the Comments that cost and effectiveness should be considered throughout.
Engaging School Officials and Other Stakeholders (Part C)
The Fund should encourage evaluations to reduce study costs and burdens by (1) incorporating design features (e.g., delayed-treatment control groups) that facilitate cooperation by school officials and other stakeholders and (2) measuring outcomes, where feasible, using low-cost administrative data already collected for other purposes (e.g., district-administered test scores).
Mr. Baron explained that views differ with respect to Part A. The tiered funding structure provides money for studies of programs with strong evidence; however, it also supports those that are backed up with less evidence but that strive to demonstrate their effectiveness.
Mr. Handy reflected that given the many criteria on which NBES members might have an impact, the fact that recommendations are being provided in this area seems somewhat random. Dr. Hanushek responded that the Investing in Innovation Fund program has explicitly required that evidence be involved in determining grants and has been more explicit than usual about the types of evidence. NBES's global status as arbiter of good evidence and the fact that the Board has made resolutions on the issue in the past should be considered. He reviewed Part B, regarding cost considerations, as another example of the relevance of NBES's role in making recommendations for the Fund.
Mr. Handy asked whether Dr. Easton would submit and support the recommendations. Dr. Easton expressed reservations about stipulating the same strict evidentiary requirements for validation grants as for scale-up grants. Mr. Baron clarified that the resolution's proposed evidence requirements for validation grants apply only to the post-award project evaluation, not to the pre-award application; that is, applicants for validation grants need not show that they already have experimental evidence showing effectiveness. Dr. McCardle suggested that the recommendation should be framed as an offer to IES and the Executive Director, and that the judgment of the Executive Director should take precedence. Mr. Baron clarified that the Fund is sponsored by the Office of Innovation and Improvement (OII), not by IES, so this is a recommendation to OII.
Mr. Baron then moved that the draft recommendation be accepted, and the motion was seconded by Dr. Geary. Five NBES members voted in the affirmative. In response to Dr. Hanushek's call for opposing votes, Mr. Handy commented that the matter did not appear to make the best use of NBES's limited political power but voted in favor of the recommendation. Dr. Hanushek then asked for a vote on the recommendation to send the resolution forward.
Dr. Hanushek declared that the motion was approved and that the resolution would be entered into the record. He then said that during the second half of the meeting, ex officio members would have 15 minutes each to speak about their respective agency activities in order to give these members more of a voice in NBES proceedings.
Ex Officio Member Agency Overview
National Science Foundation
After relaying preliminary background information about NSF, Dr. Bement explained that critical to the NSF's goal of promoting science to "secure the nation's health, prosperity and welfare, and promote progress in science to secure the national defense, and other purposes" is fostering a scientifically literate population and providing the same education at all levels—from preK to postdoctoral—to increase the technologically based workforce.
Funding Priorities and Goals of NSF Education Funding
NSF funding is divided among four strategic goals: (1) learning, (2) research infrastructure, (3) discovery, and (4) stewardship to boost leadership in support of all fields of science and engineering and the associated education. Education programs within the NSF are operated through the Directorate of Education and Human Resources (EHR). A third of NSF's education funding comes from other directorates and offices; it targets graduate and undergraduate programs and deals with curriculum and content development. Eighteen percent of the EHR budget is dedicated to grades K–12.
Four primary program goals are to (1) prepare the next generation of STEM professionals; (2) broaden participation of women and minorities in STEM fields; (3) increase technical, scientific, and quantitative literacy; and (4) promote STEM education research and evaluation. The EHR also emphasizes the study of cognition through social, behavioral, and economic sciences. The challenge for the EHR is to integrate all of its programs, almost all of which are mandated by law.
The EHR's education portfolio is devoted primarily to R&D, with only a limited focus on large-scale implementation. Some funding is allocated to career grants, fellowships, and other types of grants awarded to individuals. Most research is early stage, focused on project-level efficacy studies and conducted through partnerships with state authorities, ED, and other federal departments.
In the future, EHR will be more proficient in evaluating return on investment, which will be facilitated by a special program, Science of Science and Innovation Policy. Assessment metrics, submitted to OMB as part of NSF's FY 2011 budget, were well received.
In addition, informal science education targets public literacy and youth to create links between formal and informal education and expand the learning experience. Features of these programs include IMAX films, television programs, museum exhibits, and nonclassroom learning, in general.
NSF Partnerships and Collaborations
Significant cross-collaboration occurs among directorates and offices within NSF, especially with regard to integrating precollege education into college education. These programs are intended to lower barriers, reduce dropout rates, and improve graduation rates. Strategic partnerships and coordinated efforts have been initiated with ED and most federal agencies that have education programs. Joint memoranda of understanding (MOUs) are in place with the National Aeronautics and Space Administration and the U.S. Departments of Defense, Energy, and Health and Human Services; the NSF meets with representatives from these agencies, coordinated through the National Science and Technology Council and special workshops.
In collaboration with ED, mathematics and science partnerships have also been formed under the terms of the America COMPETES Act of 2007 to improve preK–12 student achievement in these fields.
Dr. Bement noted NSF's achievement in raising student proficiency levels in excess of 90 percent, particularly in Texas, and closing the educational achievement gap among Black, White, and Hispanic students; lagging indicators are still evident for Hispanic student populations, and NSF will address these.
Additional programs of interest to IES include efforts funded by ITEST (Innovative Technology Experiences for Students and Teachers) that emphasize preparing teachers to encourage students to enter the IT workforce.
Another social and behavioral program is Human and Social Dynamics, which fosters breakthroughs in understanding human knowledge and machine knowledge to determine the potential symbiotic relationship between humans and computers. Cyber-enhanced learning is also a growing component of NSF programs.
In response to Dr. Hanushek's question about whether NSF has collaborated with IES in program evaluation, Dr. Bement responded that NSF has contributed to the WWC website. He added that all NSF programs receive third-party evaluations and assessments; some use random assignment, while others are comparative studies. They are listed on both NSF and IES websites. Dr. Bement added that NSF is open to evaluative collaborations and to working with IES to identify grand challenge questions of mutual interest. Dr. Easton added that discussions are under way about collaborating on development activities for mathematics education professionals.
U.S. Census Bureau
Dr. Kominski reviewed the mission of the U.S. Census Bureau to conduct a constitutionally mandated decennial census. With this expertise and knowledge base, the Census Bureau has built a sizeable infrastructure, comprising 12 operational centers with permanent staff and about 6,000 permanent field employees who conduct interviews and collect data—at times, on a monthly basis. The Census Bureau facilitates other data collection programs, both for itself and the federal establishment; the majority of these programs are not owned by the Census Bureau. The agency also serves as an analytic provider to the public. With respect to the education community, its main interface has been through NCES. Three fundamental sets of operations are the following:
Major data collections—conducted through the Governments Division—include interviews and data collections, such as the Integrated Postsecondary Education Data System (IPEDS). Under the oversight and funding of NCES, the Census Bureau conducts interviews and data collections through contact with schools and school districts, drawing on its database of institutional and administrative organizations throughout the country.
Population-based surveys, conducted under reimbursable contractual agreements with organizations, are used by all federal departments and agencies. They include the Schools and Staffing Survey, the recent college graduate survey, and others. Bids are competitive.
Recurring collections—under the Bureau's proprietorship—are basic large-scale national survey operations, including the following:
In fall 2010, the Census Bureau will produce the first set of 5-year estimates for 1.1 million geopolitical units in the United States. Data will be randomly refreshed yearly to provide a continuous population data production system. From these data operations, the Census Bureau has analytic responsibility for a wide range of topics.
Since 1940, the Census Bureau has collected data on detailed educational attainment and school enrollment; yearly reporting has been conducted since 1964. Since 1960, the Census Bureau has published a record of all core data collections in school enrollment and a variety of supplemental data collections, which include major fields of study, amount of homework, college plans, computer usage, home schooling, vocational education, student mobility, degrees, and many other data products. Both the Census Bureau and NCES produce these types of data products.
The Census Bureau's current most challenging operational tasks are to determine (1) the proper word selections to elicit responses and useful data from regular household respondents and (2) the fields of bachelor's degree recipients, an effort mandated by Congress and a response to NSF's interest in determining science and technology training for students with bachelor's degrees and beyond.
Dr. Kominski then elaborated on the implications for ED of the power, evolution, and utility of the ACS, which can provide significant, relatively small-scale geographic data related to many different needs on a yearly basis. One example has been determining funding allocations for English language learner programs. The Department determined the efficacy of the ACS over state data collection methods. As these types of data collection become available through the Census Bureau, federal agencies will have the option of accessing them.
Eunice Kennedy Shriver National Institute of Child Health and Human Development
Dr. McCardle introduced herself as the new designee of Dr. Susan Shurin, a pediatric oncologist and hematologist and the Acting Director of the NICHD. Dr. McCardle is the chief of the Child Development and Behavior Branch. In discussing the new name of the Institute, Dr. McCardle explained that the agency owes its existence to the efforts of Eunice Kennedy Shriver, in partnership with pediatricians, to increase funding for research on children with disabilities and on child health and development.
The National Institutes of Health (NIH) comprise 27 different institutes and centers that offer grants for biomedical and behavioral research. Of the more than $28 billion allocated for this research, the NICHD, a medium-sized institute, receives slightly more than $1 billion. The agency is one of the top four NIH Institutes that fund behavioral and social sciences research and the only one that focuses on child health and developmental and learning difficulties from preconception into middle age.
The NICHD comprises four major centers; each has at least one branch that funds behavioral and social science relevant to IES. The research supported by the two agencies is complementary and, in some ways, overlapping. The NICHD funds more basic and then translational research; IES moves the research toward application.
The Demographic and Behavioral Sciences branch, within the Center for Population Research, conducts research on population diversity and change. The Intellectual and Developmental Disabilities branch sponsors studies of developmental disabilities. The focus of branches within the National Center for Medical Rehabilitation Research is on behavioral, biomedical, and biomechanical rehabilitation.
Dr. McCardle's branch, Child Development and Behavior, within the Center for Research on Mothers and Children, funds research on behavioral and social sciences; interventions in learning and learning disabilities; and links among behavioral, neurobiological, and genetics research, bringing to bear neuroimaging, genetic, epigenetic, and purely behavioral research. The branch funds all research designs and methods.
Dr. McCardle reviewed the focus of the eight programs within the branch detailed in the handout—cognitive, language, social-emotional development, child maltreatment and violence, mathematics and science, reading, cognition and learning, and learning disabilities.
Dr. McCardle concluded with a summary of four upcoming NICHD and NIH offerings:
U.S. Department of Labor, Bureau of Labor Statistics
Founded in 1884, the Bureau of Labor Statistics (BLS) is an agency within the U.S. Department of Labor (DOL). Its primary mission is to produce economic data, including data on employment and unemployment, the Consumer Price Index, and data on productivity, wages, compensation, worker safety, health statistics, occupations, and jobs outlook. Ms. Sommers said her office deals with the latter. The BLS works closely with other federal statistical agencies, including the U.S. Census Bureau, the Bureau of Economic Analysis, NSF, NCES, and NICHD.
The BLS's focus on education includes educational characteristics of individuals and their labor market entry, labor market outcomes, educational attainment, and occupational preparation required for entering and performing in particular occupations. The BLS prepares data about types of jobs and jobs in relation to salaries, location, outlook, and preparation needed for particular jobs. This information is used extensively for planning DOL workforce development programs, career technical education, community college planning, and other objectives.
The BLS also produces information used in career counseling and publishes the Occupational Outlook Handbook, the most widely used publication for career information in the United States, which is incorporated into other products in the public and private sectors. The BLS also studies education as part of the economy.
Ms. Sommers reviewed the following primary BLS activities:
Business Employment Dynamics, a BLS publication, tracks business and employment dynamics over time from an unemployment insurance data system. The BLS collects data from more than 9 million establishments every quarter through its state partners to follow job loss, gain, and creation. Data are derived from the workforce part of the SLDS about employers in terms of industry, geographic location, and types and sizes of businesses.
National Longitudinal Surveys (NLS), initiated in 1966 and completed in 1990, are widely used in social research. They provide an overview of specific cohorts, tracking statistically representative samples of the U.S. population over long time periods. More recent cohorts include an NLS of youth, which was initiated in 1979. Currently, the BLS is assessing the children of the women who participated in that panel. Data are still being collected for a later cohort, initiated in 1997.
The Employment and Training Administration (ETA), BLS's sister agency, evaluates the workforce development programs it funds. The ETA funds research projects and has been active in expanding the capacity of states to support data systems related to this work. The state agencies that ETA works with through the unemployment insurance system are the sources of establishment data and Wage Record Files, initiated in the 1990s, which contain the names, Social Security numbers, earnings, and other employment data of every individual who works for a business covered by unemployment insurance. This data set is extremely valuable in identifying whether individuals have paid employment and what happens to them over time.
ETA has developed archives and data systems around these data sets, which are used on a state-by-state basis for supporting education research. Ms. Sommers said she is responsible for this effort in Ohio, although generally these programs are coordinated within university settings. About a dozen states have developed these systems, while others commission them.
The President's FY 2010 budget includes $15 million for ETA to increase the number of states that have this capacity and to enhance the capacity of others that have already established these systems. One of the limitations to this approach is that states must enter data-sharing agreements to track individuals who move from state to state. In addition, no funding is currently dedicated to maintaining and managing these systems once they have been established. Ms. Sommers confirmed that greater coordination is occurring between the DOL and the education community.
Practice Guides Overview
Scott Cody, Mathematica Policy Research, Inc.
Dr. Hanushek reconvened the NBES meeting by introducing Jill Constantine and Scott Cody of Mathematica Policy Research, Inc., explaining that their presentations would follow up on the Board's discussion during the prior NBES meeting of routine WWC research dissemination efforts.
Dr. Cody, a deputy director of the WWC, discussed the development and content of the WWC practice guides, beginning with recommendations on response to intervention for reading.
Purpose of Practice Guides
The practice guides address current challenges in education and are modeled on medical community "how to" guides for surgeons and physicians. Evidence is rated, and expert panels of researchers and practitioners are assembled to address education topics and related recommendations that target teachers, principals, and district administrators to improve student outcomes. To date, the WWC has completed 12 guides, several of which focus on the classroom level, including response to intervention (RtI) guides in mathematics and reading and a guide on reducing behavior problems. Other guides target more building- or district-level recommendations, including guiding schools to help students navigate the path to college; structuring out-of-school time programs; using data to support instructional decision making; preventing students from dropping out; providing literacy instruction for English language learners; encouraging girls in math and science; turning around low-performing schools; improving adolescent literacy; and organizing instruction to improve student learning.
The guides serve a niche for the WWC and have been very well received by educators seeking practical input about the strength and implications of education research.
Structure and Relevance of Practice Guides
The structure of the practice guides has been applicable across audiences. The guides address the level of evidence of research studies and make concrete recommendations for application in the classroom. Internet downloads, tracked via the WWC website, are the best measure of the guides' relevance; to date, the guides have been downloaded 350,000 times (3,000–5,000 times per month for the most popular), with classroom-focused guides receiving the most hits. Relative to other WWC products, these download numbers are high, although Dr. Cody emphasized that the WWC has only barely "scratched the surface" in terms of increasing practice guide visibility.
Launching Practice Guides
Dr. Cody reviewed the following steps for disseminating practice guides:
Practice Guide Literature Searches
Literature informing guide content is grouped into three general categories: (1) studies that are not relevant to the age or grade range of students in the study; (2) effectiveness studies that are eligible for review against WWC standards; and (3) correlational studies that bear on the guides but that are not necessarily based on strong causal evidence.
Practice Guide Levels of Evidence
Strong level is determined by high confidence that the recommended practice positively affects student outcomes. Multiple studies with causal validity must meet WWC standards for external validity. Consistent positive effects must be evident for relevant outcomes. Panel confidence is another necessary criterion for this level.
Moderate level is determined by panel confidence that the evidence shows effectiveness, although questions remain about causal or external validity regarding general application. A moderate level can also result from mixed effects, or from evidence that is not directly related to the scope of the study or that is bundled into a larger intervention.
Low level is determined by limited or no evidence, but the panel still prioritizes the recommendation. This category also may include recommendations based on theory or common sense that have not been tested in education research.
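The three levels described above can be read as a simple decision rule. The following is a minimal illustrative sketch in Python, not the WWC's actual rating procedure. The class EvidenceSummary, its field names, and the function rate_evidence are hypothetical, and treating "multiple studies" as two or more is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class EvidenceSummary:
    """Hypothetical summary of the evidence behind one recommendation."""
    qualifying_studies: int            # causal studies meeting WWC standards
    consistent_positive_effects: bool  # positive effects on relevant outcomes
    validity_questions: bool           # open questions about causal or external validity
    panel_confident: bool              # panel judges that the evidence shows effectiveness


def rate_evidence(summary: EvidenceSummary) -> str:
    """Return 'strong', 'moderate', or 'low' following the descriptions above."""
    # Strong: multiple qualifying studies, consistent positive effects,
    # no open validity questions, and panel confidence.
    if (summary.qualifying_studies >= 2  # assumed reading of "multiple"
            and summary.consistent_positive_effects
            and not summary.validity_questions
            and summary.panel_confident):
        return "strong"
    # Moderate: the panel is confident the evidence shows effectiveness,
    # but effects are mixed or validity questions remain.
    if summary.qualifying_studies >= 1 and summary.panel_confident:
        return "moderate"
    # Low: limited or no evidence; the panel may still prioritize
    # the recommendation.
    return "low"
```

Under these assumptions, a recommendation backed by three consistent qualifying studies rates as strong, one backed by mixed results from two studies rates as moderate, and one resting on theory alone rates as low.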
Dr. Cody explained that the WWC continually explores guide topics. Within a review of branded products, it may be difficult to assess practices. For example, a review of research on branded mathematics products will not yield studies on teaching practices, although such studies may be available in areas such as dropout prevention intervention reports and dropout prevention practice.
Response to Intervention (RtI)
RtI screens for and assists students who are struggling with reading by providing a three-tiered response system that introduces new interventions based on student needs. Its three tiers are (1) general education, (2) small-group instruction, and (3) one-on-one intensive instruction—potentially special education referral. The guide acknowledges that different states may have different tier structures. RtI targets school psychologists, counselors, and administrators in primary grades.
Dr. Cody reviewed the five key RtI recommendations and associated levels of evidence detailed in the publication Assisting Students Struggling with Reading: Response to Intervention (RtI) and Multi-Tier Intervention in the Primary Grades.
Although no evidence has emerged to conflict with individual practice guides, some guides may be updated with specific recommendations to include new and sometimes nuanced research that may alter recommendations.
Dr. Shaywitz reflected on the challenges of reconciling recommendations that lack evidence with implementation, stressing that findings must be targeted for use among teachers and other education professionals. Dr. Cody agreed that focusing solely on research outcomes does not provide clear direction for application; furthermore, making recommendations based on low levels of evidence is challenging. Educators also need advice about application of studies with mixed levels of evidence, although recommendations may vary across researcher and practitioner groups. Another challenge is that target audiences are not primarily researchers and do not necessarily think in depth about these issues. New labels are being considered for levels of evidence that can better capture the intent of the guides.
More evidence supports the success of practice guides in mathematics, with better research and more specific recommendations about interventions. The two guides were then divided to leverage the body of evidence in mathematics.
Dr. McCardle commented that the information about the number of studies supporting an intervention, accompanied by assignments of high, middle, and low levels of evidence for those studies, will resonate with teachers. Dr. Cody confirmed that the strongest initial reaction to the practice guides pertained to the statements about levels of evidence.
In support of Dr. Shaywitz's comment above, Mr. Baron suggested that expert opinion can be unreliable and that it is not necessarily a good predictor of successful interventions. Dr. McCardle responded that the guides are based on evidence, albeit moderate and low levels that are unlikely to do harm.
Mr. Baron suggested that some of the recommendations in the practice guide are based on small efficacy trials and that the panels may be placing too much confidence in them as a guide to practice. Dr. Cody explained that ratings of strong evidence are not based on efficacy trials but rather on large-scale IES-style evaluations.
Dr. Hanushek commented that packaging the relative strengths and weaknesses of the recommendations, as presented in the guides, expands possibilities for randomized field trials. A natural extension of the guides is to use them to define a research area that could be evaluated more rigorously as a package through NCEE-conducted studies.
Dr. Cody responded that expert opinion is predicated on the various types of research evidence rather than being at the bottom of a hierarchy. All WWC panels are seeded with individuals who are reluctant to make low-level-of-evidence recommendations. Techniques cannot be isolated in large-scale IES evaluations, but practical knowledge can be bundled with recommendations that are of value to practitioners. While noting concern about making recommendations not backed by strong evidence, the WWC puts emphasis on helping teachers learn to distinguish strong- from low-evidence recommendations.
Forthcoming Practice Guides
In addition to the reading comprehension and fractions practice guides, Dr. Cody mentioned that practice guides are in the planning stages for the following topics: writing, addressing behavior problems, using word problems in mathematics, and promoting early literacy in childhood.
Dr. Shaywitz raised the question of how concerns of the Board and the WWC with regard to low levels of evidence will be transmitted to congressional decision makers engaged in making recommendations about and reauthorizing RtI, which represents a significant shift in education policy.
Presentation of Recently Released IES Teacher Quality Evaluations
Impacts of Comprehensive Teacher Induction
Steve Glazerman, Mathematica Policy Research, Inc.
Dr. Glazerman presented the results of Year 2 of a nationwide large-scale randomized trial of 17 school districts across the United States. The trial estimated impacts of comprehensive teacher induction on teacher retention and other teacher and student outcomes.
Comprehensive induction addresses the challenges faced by beginning teachers that can result in high attrition rates in both hard-to-staff districts and within the profession. Teacher induction programs seek to formalize widespread teacher-student mentoring relationships with the end result of also improving teacher quality.
The two induction programs chosen for the study were those of the Educational Testing Service of Princeton and the New Teacher Center at UC-Santa Cruz; participating school districts were divided between them. Twelve beginning teachers were assigned to each mentor. Mentors were carefully selected and trained. The study provided opportunities for mentors to observe and be observed while teaching, and for structured activities including weekly meetings, logs, and monthly PD meetings for beginning teachers. WestEd monitored the implementation to ensure optimal assessment and interpretation. A new feature of the program was that IES experimented with 1-year vs. 2-year induction programs. In 10 districts, the program ended after 1 year; in the other 7, the treatment group continued in a 2-year program. Groups were analyzed separately.
The counterfactuals included teacher mentoring implemented in large, urban high-poverty districts that (1) did not have comprehensive programs or full-time mentors and (2) were not spending more than $1,000 per teacher. Detailed induction activities and questionnaires were designed to expand what is currently known about comprehensive induction from more general assessments, such as the Schools and Staffing Survey. Data sources for the Year 2 report included mentors, teachers, induction activities, teacher mobility surveys, and two rounds of school records data.
A total of 418 elementary schools in 17 large, urban, high-poverty school districts with a need for induction were randomly assigned to treatment or control groups. Outcomes were measured for 1,009 new teachers in self-contained classrooms. Comparisons were made between teachers in the same district and grade to assess student achievement, with the effects being aggregated across districts using hierarchical modeling.
Survey Response Rates
Dr. Glazerman next reviewed, in his PowerPoint presentation, the detailed statistics yielded by the study on survey response rates and induction services received in 1- and 2-year districts, including mentor meeting time, level of assistance received, impacts on student test scores, and teacher retention and composition.
Summary of Findings After 2 Years
Control group teachers received induction support; the treatment group teachers received significantly more. After 2 years, no impacts were found on student achievement, on teacher retention rates, or on the types of teachers being retained. The same findings were evidenced for both 1- and 2-year school districts.
Dr. Hanushek noted the lack of evidence for the effectiveness of teacher mentoring and PD programs.
In response to a question from Dr. McCardle about whether teacher content preparation matched what teachers were actually teaching, Dr. Glazerman responded that teachers' subject-matter knowledge is predicated on their preparation for elementary school teaching. Teacher retention was found to be higher across the board than is reported in the literature on beginning teachers. Support systems appear to be in place but are currently unclear to researchers—such as whether veteran teachers stay late to help newer teachers. Findings should be reconciled, drawing on the available rich knowledge about struggling school districts.
Dr. Easton asked whether mentoring presents a burden for teachers and whether the study helped eliminate ineffective teachers. Dr. Hanushek agreed that the question is whether PD or teacher mentoring will ultimately improve teacher quality. He continued that experiments have been based on assumptions that all PD providers are equally good rather than including the impact of PD curricula on student achievement.
Dr. Glazerman said this issue is also of concern to researchers. The question of whether results would differ based on a different sample of mentors has not been answered; explanations for mentor effectiveness are not clear. A challenge to conducting this type of nonexperimental analysis is to determine what is reportable by IES standards.
Mr. Baron asked about the scope of the intervention and whether the treatment group had expert mentors as compared to those for the control group, who might be—for example—the teacher in the next classroom.
Dr. Glazerman confirmed that treatment group mentors were expert. The scope of application of the comprehensive induction model is not clear, as whole states were ruled out, having already adopted similar models. Large, urban high-poverty districts have pockets, such as districts within New York City, where the program has been introduced. In addition, many districts are laying off teachers and are not in teacher-retention mode.
An Evaluation of Teachers Trained Through Different Routes to Certification
Jill Constantine, Mathematica Policy Research, Inc.
Dr. Constantine said that Alternative Certification (AC) is effective in supplying an increasing number of teachers to the teacher labor force; recent estimates indicate this number to be 30 percent of new hires. AC programs have significantly increased since 2000, with the introduction of NCLB. Debate is ongoing about the preparedness of AC teachers compared with that of teachers from traditional certification (TC) programs. The effectiveness of different training strategies has not been rigorously studied. AC programs draw on midcareer professionals with mathematics and science skills and require less course work than TC programs, allowing teachers to begin teaching before completing them. The programs reduce barriers to entry, although they may also produce teachers who are inadequately trained.
Studies of AC programs are increasing, particularly in Teach for America (TFA) schools, in the states of New York and Florida. Most are not highly selective; admission requirements are similar to TC programs. For this study, GPA requirements were no higher than 3.0, and there was no other obvious selection process. As a result, the evidence was expanded by focusing on a more populous and common program.
Research questions targeted student achievement of AC-trained teachers and aspects of teacher preparation associated with teacher effectiveness (e.g., amount, timing, and content of coursework). Students were randomly assigned to novice AC or TC teachers in the same grade and school to create several mini-experiments, and outcomes were then compared. The uniqueness of the study was in isolating the effect of the teachers on student achievement, and also distinguishing teacher effects from those of the programs. Teachers were relatively new to the field—no more than 5 years of experience. Schools were not required to have vacancies, but participation in the study was limited to elementary schools.
The study sample consisted of seven states, with the largest samples coming from California, New Jersey, and Texas. Many more states have established AC programs since the study was conducted in 2004. Grade distribution was kindergarten through 2nd grade, which had the largest concentration of AC teachers. Data collected for the study, in both the fall and spring, were derived from tests measuring student achievement, teacher practices and characteristics, and program characteristics; they yielded significant findings.
Course Work Hours
The study indicated the desired contrast in course work hours for initial certification of both low AC (just under 200 hours) and high AC (just over 400 hours) teachers, based on total hours of course work compared to TC pairs in their own schools, who had 650 and 600 hours of course work for low- and high-TC teachers, respectively. Requirements for AC hours are state driven.
Timing of Required Course Work Hours
Teachers with lower AC hours, on average, took most of their course work before they started teaching. Those with higher numbers of hours took courses both before and during teaching, with an average of 150 hours while teaching in AC programs and 131 hours prior to their second year of teaching.
In terms of characteristics, no significant differences were found between AC and TC teachers. Teacher backgrounds tended not to be in mathematics and sciences, as was hoped. AC teachers were somewhat older, but the largest difference in characteristics was racial composition, with AC teachers far less likely to be White.
No statistically significant differences were found in reading and mathematics with regard to experimental results. Reading and mathematics scores were distributed very normally. However, regression-adjusted statistics, ranging from minus to plus deviation, can account for significant student effects even in 1 year. Nonexperimental analysis may explain variance in these differences.
Students in California with AC teachers scored statistically significantly lower in mathematics than students of TC counterparts. In addition, students of teachers taking AC course work also scored lower in mathematics than students of TC counterparts. Overall findings are compounded given that California is a high course work state. No other subgroups showed statistically significant differences.
The quality of pedagogy was measured in three areas: implementation, lesson content, and classroom culture, or management, on the basis of two math and two literacy lessons. Differences were not observed for low levels of course work, but high levels of course work accounted for a large standard deviation. For literacy, differences in culture were statistically significant. Principal ratings aligned with these results. No correlation of the measurement tool to student achievement was evident other than a negative result for reading achievement.
Differences in AC teacher characteristics, practices, and training explained about 5 percent of mathematics scores and 1 percent of reading scores. Students of AC teachers taking course work scored lower than TC comparisons in reading. Students of AC teachers with master's degrees scored lower than TC comparisons in reading.
Dr. Constantine summarized the three study conclusions: (1) students of AC teachers performed the same, on average, as students of TC teachers; (2) variation in the amount and content of required course work in teacher preparation was not linked to teacher effectiveness in terms of student achievement; and (3) completing required course work while teaching was associated with lower student achievement.
The line is becoming increasingly blurred around course work completion. In some AC programs, course work centers around daily activities in the classroom, although many programs invest significant time and money in AC training. Volume and structure of AC training programs are still being explored to determine the most effective approaches.
Dr. Hanushek reflected that evidence increasingly indicates that structured programs such as PD, mentoring, and induction are not effective, and asked the Board to consider whether other programs should be systematically evaluated and whether they would yield different results.
Dr. Constantine responded that the study's focus was on whether AC programs essentially do no harm. Studies show that resources and structures of traditional programs are not necessary to meet current teacher capacity. She noted that it may be of value to explore developing mathematics and science content knowledge as standard preparatory routes, at least in elementary schools.
Dr. Glazerman said that teacher induction is in only the third year of analysis and that a longer term view may be needed. Another route may be to study and replicate control groups. A third direction would be to question and consider alternatives to the logic model. Other studies will explore highly selective routes, such as teacher incentive programs, or measure teacher quality. Another ongoing study is researching teacher incentives to identify veteran teachers. Policy goals must be geared to assist low-performing schools. Many research areas are not explored, including how retirement policy affects incentives for teachers in their late careers.
Dr. Hanushek commented about the preponderance of evidence derived from elementary schools and asked whether Dr. Constantine's current projects include middle and high school studies. Dr. Constantine responded that an ongoing IES-sponsored study focuses on TFAs and New York state-sponsored middle schools and high schools, which are increasingly subjects of education research studies.
Summary Views and Next Steps
Dr. Hanushek added that by the next NBES meeting in the middle of March 2010, the NBES will be filled with appointees confirmed by the U.S. Senate.
1 National Board for Education Sciences recommendation on the definition of "scientifically-based research," approved by the Board on October 31, 2007.
2 Preventing Mental, Emotional, and Behavioral Disorders Among Young People: Progress and Possibilities (National Academies Press, 2009, recommendation 12–4, p. 371).
The National Board for Education Sciences is a Federal advisory committee chartered by Congress, operating under the Federal Advisory Committee Act (FACA; 5 U.S.C., App. 2). The Board provides advice to the Director on the policies of the Institute of Education Sciences. The findings and recommendations of the Board do not represent the views of the Agency, and this document does not represent information approved or disseminated by the Department of Education.