The Relationship Between QRIS Ratings and Observed Program Quality in California

Hawkinson, Laura E.

Background

Research shows that children who attend high-quality early childhood education (ECE) programs have cognitive and behavioral advantages at school entry and beyond. In response to this evidence, states and counties have developed Quality Rating and Improvement Systems (QRIS) to rate ECE programs, with the goal of documenting and improving the quality of care available to children. The methods used to calculate ratings vary widely across QRISs, and previous research has found that the rating a program receives is highly sensitive to the calculation method. However, consensus is lacking both on the best way to calculate ratings so that they successfully differentiate program quality and on the specific aspects of quality that QRISs should include.

Method

This paper aims to address this gap by comparing the relationship between QRIS ratings and observed quality measures under alternative rating calculation approaches. The sample includes 134 to 140 centers that participated in the independent evaluation of California’s Race to the Top–Early Learning Challenge QRIS. First, we examine the relationship between California’s QRIS ratings and scores on two other valid measures of program quality collected for the study: the Classroom Assessment Scoring System (CLASS) and the Program Quality Assessment (PQA). Next, we use data collected for the QRIS to simulate ratings using alternative calculation methods. We examine the relationship between each alternative rating and scores on the CLASS and PQA, and compare the results across rating approaches. All ratings are based on center scores on a scale of 1 to 5 in each of seven domains. The rating methods examined in the study include:

California’s current approach, determined by the sum of points earned on each domain;
Block ratings that require programs to meet minimum standards (or “blocks”) in each domain; and
Average score ratings, determined by averaging the points earned on each domain and rounding to the nearest whole number.
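
The three approaches above can be illustrated with a minimal sketch. It is not the study's implementation, and several details are assumptions made only for illustration: the point totals that separate rating levels (`cut_points`) are hypothetical, the block rating is simplified to the lowest domain score, "nearest whole number" is taken to mean rounding halves up, and the example center's scores are invented.

```python
import math

def sum_rating(domain_scores, cut_points=(8, 20, 26, 32)):
    """Points-based rating: sum the domain points, then map the total to a level.

    cut_points holds HYPOTHETICAL minimum totals for levels 2 through 5;
    the abstract does not report the thresholds actually used.
    """
    total = sum(domain_scores)
    level = 1
    for minimum in cut_points:
        if total >= minimum:
            level += 1
    return level

def block_rating(domain_scores):
    """Block rating: a center reaches a level only if every domain meets it.

    ASSUMPTION: each level's minimum standard equals that level's domain
    score, so the rating reduces to the lowest domain score.
    """
    return min(domain_scores)

def average_rating(domain_scores):
    """Average score rating: mean of the domain points, rounded to the
    nearest whole number (ASSUMPTION: halves round up)."""
    mean = sum(domain_scores) / len(domain_scores)
    return math.floor(mean + 0.5)

# Invented scores for one center on the seven domains (each 1 to 5).
center = [5, 4, 4, 3, 5, 2, 4]

print(sum_rating(center), block_rating(center), average_rating(center))
```

Under these assumptions, a single weak domain pulls the block rating down sharply while leaving the summed and averaged ratings largely unchanged, which is the mechanism by which block ratings can fall well below the other two.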

We also examine the relationship between each domain and scores on the CLASS and PQA.

Results

The distribution of rating levels varies by rating approach. Average score ratings are quite similar to California’s current approach: only 9 percent of centers have different ratings, and all of those are higher under average score ratings. The largest changes occur with block ratings: 93 percent of centers have lower ratings with blocks than under California’s current approach.
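
Shares like those reported above come from comparing each center's rating under two approaches. The sketch below shows one way to tabulate that comparison; the function name `compare_ratings` and the five example ratings are hypothetical and are not drawn from the study's data.

```python
def compare_ratings(baseline, alternative):
    """Return the share of centers rated higher, lower, or the same
    under the alternative approach relative to the baseline."""
    if len(baseline) != len(alternative):
        raise ValueError("rating lists must cover the same centers")
    n = len(baseline)
    higher = sum(a > b for a, b in zip(alternative, baseline))
    lower = sum(a < b for a, b in zip(alternative, baseline))
    return {
        "higher": higher / n,
        "lower": lower / n,
        "same": (n - higher - lower) / n,
    }

# Invented ratings for five centers, for illustration only.
current_approach = [4, 3, 5, 2, 4]
block_approach = [2, 3, 3, 1, 2]

shares = compare_ratings(current_approach, block_approach)
print(shares)
```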

Average score ratings are somewhat more effective than California’s current approach at differentiating centers by CLASS and PQA classroom observation scores. Ratings using blocks are less effective than California’s current approach or average score ratings at differentiating centers by CLASS scores, but more effective at differentiating centers by PQA scores.

Among the seven domain scores, only the two based on classroom environment and interactions were associated with CLASS and PQA scores. Other domains based on more structural aspects of quality were not predictive of CLASS and PQA scores. We discuss possible ways to adjust the QRIS ratings to better predict observed quality, limitations of the study, and policy implications.

Association for Public Policy Analysis & Management

Panel Paper: The Relationship Between QRIS Ratings and Observed Program Quality in California