Research on The Accelerated Reader

A number of brief and anecdotal reports on the use of AR have appeared; see, for example, Clements (1995), Lind (1995), DuBose (1996), and Smith (1996). A largely qualitative study was conducted by McKnight (1992), focusing on a population of at-risk readers and using multiple measures (observations, surveys, and questionnaires). The use of AR over a 12-week period was found to be associated with improved attitudes to reading in 17 fifth-grade students. However, many interventions other than AR were occurring in parallel, no data on AR implementation integrity were reported, there was no control or comparison group, and the possibility of a Hawthorne effect is evident.

In a large-scale study, Paul (1992) analyzed AR program and reading test data from a sample of 4498 students aged 6 to 16 in 64 schools. The schools decided whether they would respond to an invitation to submit their data for analysis, so their representativeness is questionable; further, reading test data were presumably from more than one test, and no information on integrity of AR implementation is presented. Analysis did indicate a strong positive relationship between the number of points accumulated through the AR program and gains in reading test scores, but given the study's methodology, the direction of any causal link is equivocal. However, students with the lowest ability showed the greatest gains.

In a second study, Paul (1993) extended the approach, including data from 10,124 students in first through ninth grade from 136 schools, and using 12 different standardized tests. Similar results were found for reading, and a positive relationship with increases in math scores was also evident. Younger and poorer readers appeared to improve more than older and more able readers.

Peak and Dewalt (1993) conducted a longitudinal evaluation of the impact of AR with a relatively able group of 25 students at one school who used the program from third to eighth grades. The program was only available centrally in the school, and students could access it only out of class time. No process data were reported regarding implementation integrity. Reading test scores were compared with those of a same-size group from another school when students were in third, sixth, and eighth grades. Overall, students in the AR group gained approximately twice as much as those in the comparison group, and they appeared to spend twice as much time reading, though it is unclear exactly how this was measured.

Turner (1993) developed and evaluated a program to improve reading comprehension in 46 underachieving students in sixth through eighth grades. The program included AR, but in the study's analysis, the effect of this aspect was not partialled from other components. Pre- and post-testing indicated that reading comprehension had improved in 82 percent of participants. Degree of increased reading activity was related to test gain. However, only eighth graders showed improvement in measured attitude to reading.

Mathis (1996) used a time-series design to evaluate change in the progress of 30 sixth-grade students after the introduction of The Accelerated Reader, as measured by the Stanford Achievement Test. No statistically significant differences were found.

Paul, VanderZee, Rue, and Swanson (1997) related school ownership of the AR software to scores on statewide standardized tests in five curricular areas for elementary, middle, and high schools in Texas, comparing approximately 2500 AR-owning schools with 3500 schools that did not own the software and matching experimental and control schools for socioeconomic status. The AR schools performed at higher levels in all grades except sixth and tenth, the differences reaching statistical significance. This trend was most apparent in urban and low socioeconomic environments, and appeared to be independent of hardware ownership. A similar relationship with attendance rates was noted.

Penuel (1997) analyzed longitudinal data on statewide norm-referenced tests of reading and language for 19 elementary schools in a metropolitan area that reportedly had used AR for an average of almost two years. Actual gains were compared to expected gains. Students in third and fourth grades made higher than expected gains in both language and reading, although these differences only reached statistical significance in language. Unfortunately, the unit of analysis appears to be the school rather than the class or the student, and no information is presented on AR implementation integrity or range. Additionally, the specific procedures used to determine statistical significance were not recorded. Penuel concluded that the longer the AR program had been in the school, the greater were the gains in language, but only one significant correlation coefficient is offered to support this.

In a study using data collected by Paul (Topping & Paul, 1999), I noted that research indicates that the amount of reading practice occurring in schools is very low (see Monitoring Reading Practice). The study explored the relationship between practice at reading, student reading performance, and organizational features of the school system. Data generated through The Accelerated Reader were gathered as a measure of reading practice for more than 659,000 U.S. students in kindergarten through grade 12 in one school year. Students and states performing high and low on reading tests were compared. The data suggested that student reading ability was strongly positively related to amount of in-school reading practice, and there was some evidence in the literature we reviewed that one causal direction was from practice to achievement. School time allocated to reading from self-selected materials rose from kindergarten to fifth or sixth grade, and then declined. Amount of reading practice was negatively correlated with school size. More reading practice occurred in private than in public schools. There was some evidence that in states where test results indicated higher average reading performance, students also had higher levels of reading practice.

Schools that had used the AR program for longer periods showed higher rates of reading practice. This might reflect the impact of program installation and embedding time, or this relation could be confounded by other variables. It should be noted that almost all the participating schools used AR as a supplement to the regular reading curriculum and had not added a great deal more silent reading time to the daily class routine; only a tiny proportion had sought to implement the recommended allocation of 60 minutes per day for silent reading. Thus, increases in reading in these schools are largely attributable to increased reading by students in their own out-of-class time.

A quasi-experimental action research evaluation in the U.K. with which I was involved (Vollands, Topping, & Evans, 1999) reported on AR in terms of formative effects on reading achievement and motivation in two schools in severely socioeconomically disadvantaged areas. The results suggest that the program yielded gains in reading achievement among at-risk readers, superior to those stemming from regular classroom teaching and an alternative intensive method. Additionally, the program yielded significant gains among girls in measured attitudes to reading.

The issue of implementation integrity was specifically addressed. We noted that implementation was initially poor in both experimental locations, despite the teachers' having received one day of training (it did, however, improve over time). In particular, less time was devoted to class silent reading in experimental than in comparison classes. The study therefore suggests that AR was effective by improving the quality of students' engagement with literature, rather than merely by increasing the quantity of reading practice (time on task at reading). It also suggests that AR was effective without extrinsic rewards or tangible reinforcement, which elicited virtually no interest from the participant students (cf. Cameron & Pierce, 1994).

The Tennessee Value-Added Assessment System (TVAAS) encompasses all students in grades three to eight in the state. These children are tested annually in reading, language arts, math, science, and social studies with the Tennessee Comprehensive Assessment Program (TCAP). The value-added system then incorporates the test-result data into the largest longitudinally merged database of achievement in the U.S., and applies a statistical process of mixed-model, multivariate, longitudinal analysis in which each child is her or his own control and “value added” each year can be estimated (Sanders, Saxton, & Horn, 1997). This assessment system yields unbiased estimates of the effects on students' academic gains of school systems, schools, and teachers.

The school value-added scores from the TVAAS for the 1995-96 school year for all five TCAP-assessed subjects were related to school ownership of AR software by Paul, Swanson, Zhang, and Hehenberger (1997). In every subject for grades five through eight, AR schools showed higher average scores than non-AR schools, with less variance. It might be that effective schools are particularly likely to acquire AR software and other modern learning aids. However, the AR sample included almost half of the schools in the state, so it is difficult to claim that they were exceptional. Additionally, the performance of schools that had purchased AR recently and had not had time to implement it was also analyzed. These schools generally fell between the AR and non-AR schools, rather than at the same level as the former. Of course, mere ownership of software does not necessarily lead to its use (let alone its effective use), and purchaser decisions cannot be equated to educational effectiveness.

In a major study conducted in Tennessee, AR data on 62,739 students from grades three to eight were merged with the TVAAS teacher-effects database, and relationships between these independently obtained measures were explored (Sanders & Topping, 1999). In analysis at student and teacher levels, there was a consistently positive and statistically significant relationship between increased number of books read and value added in grades three to six. There was also a consistently positive and statistically significant relationship between percent of AR test questions answered correctly and value added across all grades, but the effect became positive only at 80 percent correct in grades three and four and 85 percent in grades five to seven (which validates the 85 percent criterion suggested as optimal in AR). This was true at all levels of student ability and amount of material read. However, in analysis at the student level, more than half the children were found to be operating below the 85 percent threshold, suggesting implementation integrity was very variable. In grades three to seven, the lowest ability group had the lowest percentage of correct answers (72 to 74 percent), and the highest ability group the highest (89 to 92 percent). This implies that some teachers were not generating or not responding to AR at-risk reports.
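To make the threshold concrete, the following is a minimal sketch of how one might identify students operating below the 85 percent criterion. It is not the AR software's at-risk report and not the procedure used in the study; the record layout, the function name `below_threshold`, and the sample values are all assumptions invented for illustration.

```python
# Illustrative only: the 85 percent criterion comes from the text above;
# the student records and the function name are hypothetical.
THRESHOLD = 85.0

def below_threshold(records, threshold=THRESHOLD):
    """Return IDs of students whose percent correct is under the threshold."""
    return [student_id for student_id, pct in records if pct < threshold]

# Invented (student_id, percent of quiz questions answered correctly) pairs.
records = [("s1", 73.0), ("s2", 91.5), ("s3", 84.9), ("s4", 85.0)]

flagged = below_threshold(records)
share_below = len(flagged) / len(records)
print(flagged, share_below)
```

In this invented sample, a student at exactly 85 percent is treated as meeting the criterion; whether the boundary should be inclusive is a design choice the source does not specify.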

Data for amount read (reading volume) and percentage of AR questions answered correctly (percent correct) were divided into top, middle, and bottom thirds, and the relationship of these categories to the dependent variable of teacher effectiveness was explored. The differences between effective and ineffective teachers in relation to their management of reading volume and percent correct were most striking in the lower grades (see Figure 1), especially for students of lower ability.

Figure 1
Teacher Effectiveness (Least Squares Means), by Reading Volume and Percent Correct
[Figure not reproduced. Panels: Grade 3 and Grade 4. Color key: red, low; yellow, medium; green, high.]


Analysis of interactions indicated that a greater amount of reading practice could yield higher reading achievement, but only when the reading practice was also characterized by a high percentage of correct answers (i.e., the reading was successful). Sustaining both increased reading volume and a high percentage of correct answers among students is a key emphasis of the implementation training for AR, and that emphasis is validated by this study.
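The grouping into thirds and the interaction described above can be sketched roughly as follows. This is a simplified illustration, not the study's method: the original analysis used least squares means from a mixed model, whereas this sketch computes raw cell means, and the function names and data values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def tercile_labels(values):
    """Label each value 'low', 'medium', or 'high' by rank-based thirds.

    Ties are broken by original position, a simplification for illustration.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        labels[i] = ("low", "medium", "high")[min(3 * rank // n, 2)]
    return labels

def cell_means(volume, pct_correct, outcome):
    """Mean outcome for each (volume third, percent-correct third) cell."""
    cells = defaultdict(list)
    for v, p, y in zip(tercile_labels(volume), tercile_labels(pct_correct), outcome):
        cells[(v, p)].append(y)
    return {cell: mean(ys) for cell, ys in cells.items()}

# Invented data: reading volume, percent correct, and an outcome score.
volume = [10, 20, 30, 40, 50, 60]
pct_correct = [70, 90, 75, 95, 72, 93]
outcome = [-1.0, 0.5, -0.5, 1.0, -0.2, 2.0]

means = cell_means(volume, pct_correct, outcome)
print(means[("high", "high")], means[("high", "low")])
```

Comparing the high-volume cells across percent-correct thirds is one simple way to look for the interaction: if volume helps only when reading is successful, the high-volume, low-percent-correct cell should not show the advantage seen in the high-volume, high-percent-correct cell.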

Summary. Of 12 studies of AR that cite substantial outcome data, only one failed to find evidence of the program's impact, mostly on norm-referenced test scores. However, these studies are of very mixed quality: many failed to control confounding variables or lack data on implementation integrity, and consequently they are not definitive about causal direction. That said, a more recent study suggests that with good quality implementation, the AR program can contribute to teacher effectiveness in terms of value added in reading and other core curricular areas.

Nevertheless, much of the existing research is based on either superficial quantitative data or qualitative data that were not rigorously or systematically collected. There is a need to combine rigorous ethnographic research with a wider range of outcome indicators in order to explore the many varieties of child experience in relation to different styles of AR implementation, together with the associated benefits and any negative impact, whether intended or serendipitous.






Reading Online, www.readingonline.org
Posted November 1999
© 1999-2000 International Reading Association, Inc. ISSN 1096-1232