National Evaluation of the Comprehensive Technical Assistance Centers

NCEE 2011-4031
August 2011

Ratings of Center Assistance

To assess the technical assistance provided by the Center program, the evaluation rated the quality, relevance, and usefulness of a sample of Center projects. All sampled projects were identified by the Centers as "major" or "moderate" in their level of effort, relative to other projects in the same Center. The projects were rated for technical quality by panels of experts with strong knowledge of the content or substantive focus of the specific projects they reviewed. Projects' relevance and usefulness were rated by a sample of participants—state staff, intermediate agency staff, local educators working on behalf of the state, and RCC staff—who were the intended beneficiaries of the project and who had received at least some of the technical assistance the project provided. Quality was judged on three dimensions; relevance was assessed with eight survey items and usefulness with 11 survey items (exhibit ES.4). Each overall measure (quality, relevance, or usefulness) was calculated as the mean of the ratings assigned to its items, with each item rated on a 5-point scale.12
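To make the computation concrete, the sketch below shows how an overall measure is formed as the mean of item-level ratings on the 5-point scale. The item values are hypothetical and are not data from the evaluation.

```python
# Minimal sketch, assuming eight hypothetical relevance items rated 1-5.
# The overall relevance measure is simply the mean of the item ratings.

relevance_items = [4, 4, 5, 3, 4, 4, 5, 4]  # eight survey items (hypothetical values)

overall_relevance = sum(relevance_items) / len(relevance_items)
print(f"Overall relevance rating: {overall_relevance:.2f}")  # 4.12 on the 1-5 scale
```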

Center technical assistance was rated higher on each measure in each successive year, with program-wide average ratings in 2008–09 falling between "moderate" and "high" for quality and at about "high" for relevance and usefulness (exhibit ES.4). On a scale of 1 to 5, with 3 representing "moderate" and 4 representing "high," the program-wide average ratings for the technical quality of the sampled projects, scored by panels of content experts, were 3.34 in 2006–07, 3.51 in 2007–08, and 3.57 in 2008–09. Program-wide average ratings for relevance, scored by participants, were 3.94 in 2006–07, 4.08 in 2007–08, and 4.15 in 2008–09. Average ratings for usefulness, also scored by participants, were 3.69 in 2006–07, 3.95 in 2007–08, and 3.96 in 2008–09.13

Because the RCC and CC roles and activity emphases differed, the evaluation also examined variation across Center types. Based on the sampled projects, the CCs had higher mean quality ratings than the RCCs in all three years, although the RCCs' average quality ratings rose in each successive year (exhibit ES.5). The RCCs had higher mean relevance ratings than the CCs in 2006–07 and 2007–08, although the CCs' average relevance ratings rose each year. There were no consistent differences between Center types in mean ratings of usefulness.

The evaluation also examined the relationships among the three measures: quality, relevance, and usefulness. The rationale was that content experts and participants might be better able to judge different aspects of a Center project; accordingly, content experts rated the projects' technical quality, and participants rated their relevance and usefulness. The associations among the three dimensions were examined by calculating correlation coefficients.14 A correlation coefficient indicates the strength and direction of the relationship between two factors and can range from positive 1.00 (a perfect positive relationship), through zero (no relationship), to negative 1.00 (a perfect negative relationship). If the correlation is statistically significant (p < .05), we can be confident at the 95 percent level that the observed relationship is not due to chance.
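As a purely illustrative sketch of this analysis, the code below computes Spearman's rank-order correlation (the statistic named in footnote 14) between hypothetical project-level relevance and usefulness ratings; the values shown are not the evaluation's data.

```python
# Illustrative sketch using hypothetical project-level ratings; the actual
# evaluation data are not reproduced here.
from scipy.stats import spearmanr

relevance  = [4.1, 3.8, 4.5, 3.9, 4.3, 4.0, 3.6, 4.4]   # one rating per sampled project
usefulness = [3.9, 3.5, 4.4, 4.1, 3.8, 3.9, 3.4, 4.2]   # same projects, same order

rho, p_value = spearmanr(relevance, usefulness)
print(f"Spearman rho = {rho:+.2f}, p = {p_value:.3f}")
# A significant positive rho (p < .05) would indicate that projects rated
# more relevant also tended to be rated more useful.
```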

In every year, ratings of quality were unrelated to ratings of relevance and usefulness, although relevance and usefulness ratings were highly correlated with each other within each of the three data collection years. The correlation coefficient for relevance and usefulness was +0.84 for 2006–07, +0.79 for 2007–08, and +0.83 for 2008–09; all three coefficients were statistically significant at p < .05. That is, the extent to which participants rated a project as relevant was associated with how useful they deemed it to their agency. By contrast, the correlations ranged from -0.12 to +0.04 between quality and relevance and from -0.09 to +0.07 between quality and usefulness. Because these coefficients are not statistically significant, we cannot be sure that they differ from zero (no relationship). In other words, the extent to which a project faithfully reflected the knowledge base on a topic and provided appropriate caveats about the quality of its evidence was unrelated to the extent to which participants deemed that project relevant or useful to their agency.

Given the variation in ratings across Centers, additional analyses explored whether ratings showed consistent patterns in relation to particular project features; consistent relationships could suggest possible program improvements. Quality ratings in 2008–09 were higher for RCC projects that included CC contributions of materials or in-person help than for projects the RCCs completed without CC contributions (3.72 vs. 3.39), although this was not the case in earlier years. Quality ratings in 2008–09 were also higher for projects that had been reviewed for quality assurance by CCs (3.83 vs. 3.46) or by outside experts (3.73 vs. 3.42) than for projects that had not been reviewed in these ways (a project-level feature studied only in that year of the evaluation). In other analyses of project-level variation, projects that differed in the activities they encompassed or the topics they addressed did not show differences in ratings of quality, relevance, or usefulness that were consistent across the three years.

On the other hand, more consistent differences were found in the relevance and usefulness ratings awarded by different types of participants. Higher ratings were awarded by participants who had been involved in determining a project's goals or design than by participants who had not, and by participants who had spent more time in project activities (six or more days) than by those who had spent five days or fewer (both differences statistically significant at p < .01 for relevance and for usefulness). In 2007–08 and 2008–09, each type of Center also targeted its assistance more successfully to participants in one type of agency than to those in others: RCC projects were rated higher by participants from SEAs than by participants from intermediate or local education agencies or schools, and CC projects were rated higher by RCC staff than by SEA staff (differences statistically significant at p < .05 for both relevance and usefulness).


12 Efforts were made to develop parallel wording and rubrics that would result in similar gradations between rating levels (e.g., very high vs. high vs. moderate) across the three measures. However, given the different content of each set of items within the three measures and the different contexts for the ratings (experts who underwent training for the rating process and reviewed identical packages of materials vs. survey respondents who typically participated in different subsets of project activities), the ratings across the three measures are not directly comparable.
13 This averaging procedure across Centers and across projects was designed so that each Center contributed equally to the overall mean for the program (or for its type of Center, where RCC means were compared with CC means), and each project sampled from a Center contributed equally to the Center mean.
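A minimal sketch of this two-step averaging, with hypothetical Centers, project counts, and ratings, might look like the following:

```python
# Hypothetical Centers and project ratings, illustrating footnote 13:
# projects are weighted equally within a Center, and Centers are weighted
# equally in the program-wide (or Center-type) mean.
center_project_ratings = {
    "Center A": [3.2, 3.6, 3.4],
    "Center B": [4.0, 3.8],            # Centers may have different numbers of sampled projects
    "Center C": [3.5, 3.7, 3.9, 3.3],
}

# Step 1: average within each Center (projects weighted equally).
center_means = {c: sum(r) / len(r) for c, r in center_project_ratings.items()}

# Step 2: average the Center means (Centers weighted equally).
program_mean = sum(center_means.values()) / len(center_means)
print(f"Program-wide mean: {program_mean:.2f}")   # 3.63 for these hypothetical values
```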
14 For this analysis, the evaluation team used Spearman's rank order correlation, a non-parametric statistic appropriate for describing the association between two variables whose values are not normally distributed and are measured on an ordinal scale (such as ratings).