A Critical Multiplist Evaluation of

Developmental Reading Instruction

at Suffolk Community College

Anthony R. Napoli, Ph.D.

Suffolk Community College

Running head: Developmental Reading Instruction

Abstract

Three quasi-experimental studies are presented which lend support to the positive effects of developmental reading courses on the reading comprehension levels of junior college students. Study 1 used a regression-discontinuity design to test the effects of the developmental reading course on overall grade point average. Overall grade point average was regressed on standardized reading test scores for students whose scores either placed them in a developmental reading course or fell above the cutoff point, requiring no placement. The regression findings for the classification variable showed a significant effect for group, suggesting a positive direct effect of the developmental reading course on overall grade point average. Possible "mortality" bias was ruled out in Study 2 using a nonequivalent control group design. Study 3 used a single group pretest-posttest design to assess the effectiveness of the developmental reading course in improving reading comprehension skills. Significant pre- to posttest gains were found. These results form a critical multiplism indicating a positive overall effect for the developmental reading program.

In recent years there has been considerable attention to the decline in Scholastic Aptitude Test scores. According to the National Assessment of Educational Progress (Hashway, 1988), this decline has been interpreted as indicating that the freshman college population has been entering the nation's colleges less well prepared than in the past. These overall declines in students' basic skills and academic preparedness have brought developmental, or remedial, education to the fore in community colleges which, in general, have open admission policies (Ahrendt, 1987). National surveys conducted by Cohen and Brawer (1982) revealed that, among community colleges, one out of three mathematics courses was being taught at a pre-algebra level, focusing instead on arithmetic. These investigators also report that three out of eight English classes were actually, and admittedly, remedial or developmental. Friedlander and Grede (1981) estimated that more than 50 percent of all students entering community colleges read below the eighth-grade level and that 20 to 35 percent read at or below the fourth-grade level. In a comprehensive study of New Jersey's eighteen community colleges and eight state colleges, the New Jersey Basic Skills Council (1985) reports that more than 30 percent of the state's full-time public college students require remedial, or developmental, reading instruction. Within the inner city campuses of the Dallas Community College system the proportion of under-prepared students is considerably higher, with more than eighty percent of the freshman class requiring remediation in reading, writing, and mathematics. The large proportion of community college students who matriculate with inadequate basic skills dictates that developmental studies will be at the heart of the curriculum.
Roueche and Snow (1977) and Gruenberg (1983) developed working models for developmental programs in community colleges in which a variety of courses are designed to assist or elevate entering students' skills in reading, writing, and mathematics. In the early 1980s, the Board of Trustees of the Massachusetts State College System, in recognition of this problem, initiated The Developmental Education Program, whose purpose is to perform the following four functions:

1. Identify students in need of basic skills assistance with valid assessment instruments.

2. Recommend resources and remediation procedures which would allow students to remedy their basic skill problems.

3. Monitor, and document, the progress of students through the system.

4. Evaluate each instructional resource in terms of students' progress.

To address the basic skills problem, public and private colleges nationwide have adopted these or similar models and established academic departments and programs of study designed to remediate students' basic skills to levels commensurate with the demands of college curricula. At Suffolk Community College (SCC) approximately one-third of the full-time students (or 2,500 out of 8,000 students) need, and are required to enroll in, a remedial reading course each year, based on assessed reading comprehension levels (SCC, 1993). Assessing the effectiveness of the remedial programs is an ongoing process and is required by a host of external regulating agencies (i.e., the New York State Department of Education, the State University of New York, and the Middle States Association of Colleges and Secondary Schools). The present manuscript is a detailed report on a set of studies designed to assess the overall effectiveness of the developmental reading program at Suffolk Community College. The first set of studies was devised to evaluate the effectiveness of the college's remedial reading programs. The second series of studies examines the validity (predictive and concurrent) of the current placement test used to identify students with reading comprehension deficiencies.

To meet the needs of academically "under-prepared" entrants, colleges and universities throughout the country have implemented basic skills assessment and remediation programs. The goals of these programs are first to identify skills deficiencies and then, where indicated, to attempt to elevate reading, writing or mathematics skills proficiency to levels commensurate with the demands of college-level coursework through developmental or remedial courses. Currently, there is conflicting evidence on the effectiveness of these programs.

In a review of the reading skills assessment programs at three community colleges, McElroy (1985) reported that students completing a remediation program had a higher graduation rate than students who either were diagnosed as not needing remediation or did not complete remediation. However, McElroy (1985) also found some conflicting results which complicate the relationship between assessment, remedial programs, and student success. At one community college, 50 percent of the students who did not complete the remedial reading course received a GPA of 2.0 or higher, while at another community college there were no differences in GPA or in the ratio of credits earned to credits attempted between those completing and those not completing the remedial reading course.

Similar conflicting results have been found by Hodges (1981) at another community college. Students placed in the remedial reading classes did not necessarily achieve higher grades than others. These remedial classes seemed to benefit the poorest students the most (61 percent of those enrolled in the remedial classes received a 2.5 or higher GPA whereas only 40 percent of those students who did not enroll in the remedial course did as well). For the students who were below average, but not among the poorest readers, the reading course did not help in their chance of attaining a C average, but did help 10 percent of these students to achieve a B average. It was concluded that improved reading ability does not alone increase student success in other courses.

The developmental education program at Suffolk Community College (SCC) focuses on remediating reading, writing, and math skills to college levels. As part of the outcomes assessment orientation and mission of the college, the developmental studies program is continually subjected to ongoing evaluation efforts to measure its effectiveness. This paper describes a recent series of studies which focus on the developmental reading program at SCC.

The process of selecting courses at SCC follows the administration of a series of placement tests to assess basic skills levels. For the past five years the college has used the College Board's Computerized Placement Tests (CPTs; College Board, 1990) to identify students in need of reading remediation. Data on the SCC population reveal that the CPT Reading Comprehension Test (CPT-Read) has served as a reliable and valid assessment device for identifying students in need of some level of reading remediation. Specifically, the test developers report test-retest reliability equal to .90 for a sample of eighteen hundred examinees (College Board, 1990). Napoli (1991) and Napoli and Coffey (1992) obtained validity coefficients ranging from .50 to .63 when validating the CPT-Read against other standardized measures of reading comprehension and performance in college-level coursework.

The developmental reading program at SCC consists of a sequence of two single-semester reading courses. The first course (Introduction to College Reading, RE09) was designed for students with low-level reading abilities, that is, those with CPT-Read scores below 66, which has been equated to reading abilities below the eighth-grade level (Napoli & Coffey, 1992). These students attend two semesters of developmental reading instruction. Students with assessed reading levels below college level but above the eighth-grade level (CPT-Read scores between 66 and 72) are required to attend only the second developmental course (Reading in the Content Areas, RE10).

The RE09 course description appearing in the SCC catalog states that the objectives and goals are "to provide individual and small-group instruction in basic reading and study skills in order to develop a higher level of competence so as to assure success in subject classes and allow entry into RE10." The course description for RE10 includes the following statements: "designed for the student who needs to enhance basic reading skills necessary for successful completion of other content-area courses by developing students' ability to: read and study textbook materials effectively and discover main ideas in paragraphs; discover meaning through the use of absolute and conditional language; note details and make inferences; recognize structural devices in sentences and paragraphs; draw conclusions; outline and summarize; take notes from written and oral material; use proper form and style for research paper writing; develop vocabulary; prepare for and take exams; and develop study skills."

Remediation is achieved by a combination of factors. Class size in developmental education courses is limited to eighteen students, thus providing individualized attention and follow-up. In addition, developmental education students spend at least one hour per course per week in the academic skills center, where they receive one-to-one instruction and work with self-paced books, tapes and computer software programs.

The degree to which the developmental reading courses are achieving the desired level of remediation and fulfilling their objectives is the central focus of an ongoing evaluation. Since only quasi-experimental methods (Campbell & Stanley, 1966; Cook & Campbell, 1979) are appropriate to such an evaluation, a "critical multiplist" (Shadish, Cook, and Houts, 1986) approach involving a combination of designs was employed to rule out rival hypotheses. This evaluation focuses on three recently concluded studies which were designed to assess the effectiveness of the developmental reading program. The first uses a "regression-discontinuity" design (Trochim, 1984) to examine the impact developmental reading courses have on a commonly accepted measure of college success, namely overall grade point average (GPA). The second study employs a nonequivalent control group design (Cook & Campbell, 1979) and compares developmental reading students to a sample of students with similar reading levels who had not participated in the program. Study 3 employs a single group pretest-posttest design (Cook & Campbell, 1979) using a standardized reading test.

Study 1


For the first study, two student cohort groups were identified from the college-wide master student data file. All students were selected from Fall 1988 through Fall 1991 entrants. One group consisted of those students who scored below the "cutting point" (i.e., a CPT-Read score of 80) and were enrolled in a developmental reading (Dev. Read) course (n = 6433) at any time during the semesters of Fall 1988 through Spring 1992. The second group consisted of students whose CPT-Read score was above the cutting point and who "tested out" of the developmental reading program (n = 6109). We refer to these students as the non-developmental reading (Non-Dev. Read) group.

Trochim (1984) provides an extensive review of how regression-discontinuity can serve as a design for program evaluations when randomized assignment is not feasible or possible, and where placement into a treatment occurs only for those subjects who fall on one side of a predetermined cutting point on a continuous interval scale. Seaver and Quarton (1976) employed the regression-discontinuity model to examine the effects of dean's list awards, a non-randomized assignment, on grade point averages (GPA) in subsequent terms for a sample of college students. These investigators regressed term 2 GPA on term 1 GPA for dean's list and non-dean's list students. At the dean's list cutting point, the regression line for dean's list students was significantly higher than that for the non-dean's list students who fell below the cutting point. This rise in the regression line at the dean's list cut point indicates the subsequent improvement or gain in academic performance associated with the award.

Earlier work with the CPT-Read test has shown it to be reliably related to end-of-term course grades (Ward, Kline, & Flauger, 1986). More recently, Napoli (1991) observed, for a sample of 1,450 community college students, significant correlations between the CPT-Read and Introductory Psychology final course grades (r = .52) and overall grade point averages (r = .41). If reading abilities could be significantly improved among program participants, then their overall GPA would also be expected to shift or increase, causing a regression-discontinuity between program participants and non-participants. Conversely, if the program has no effect on reading skills, then the regression line for the program participants should not be displaced away from that of the non-program students.

Results for Study 1

The relationship between CPT-Read and GPA, and the potential regression-discontinuity between the student groups, were tested with the SAS general linear model procedure (SAS, 1988). In the first regression model GPA was simultaneously regressed on CPT-Read and the two-level class variable (Group) consisting of assignment to the Dev. Read and Non-Dev. Read groups. Results for the analysis are presented in Table 1. As seen in the table, CPT-Read scores serve as significant predictors of overall GPA, F(1, 12539) = 502, p < .0001. Following Trochim (1984), an examination of nonlinear higher-order CPT-Read effects failed to produce any meaningful increments in R². The regression findings for the classification variable detected a significant effect for group, F(1, 12541) = 49.3, p < .0001. An examination of the CPT-Read × Group interaction likewise failed to produce any meaningful improvement in R².
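The logic of this model can be sketched in code. The following Python example is a minimal illustration on simulated data, not the SAS analysis reported here: it regresses a synthetic GPA on a cutoff-centered placement score and a group indicator, so that the group coefficient estimates the jump in the regression line at the cutting point. The score range, slope, noise level, effect size, and the helper names `solve_linear_system` and `fit_ols` are all assumptions made for illustration.

```python
# Hypothetical illustration only: synthetic data, not the SCC records.
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations touch both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients via the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


CUTOFF = 80         # cutting point quoted in the text for Study 1
TRUE_SLOPE = 0.02   # assumed for the simulation
TRUE_JUMP = 0.25    # assumed program effect at the cutoff

rng = random.Random(0)
design, gpa = [], []
for _ in range(2000):
    cpt = rng.uniform(40, 120)                 # assumed score range
    treated = 1.0 if cpt < CUTOFF else 0.0     # placement depends on the score alone
    # Centering the score at the cutoff makes the group coefficient
    # the vertical displacement of the regression line at the cutoff.
    design.append([1.0, cpt - CUTOFF, treated])
    gpa.append(2.4 + TRUE_SLOPE * (cpt - CUTOFF) + TRUE_JUMP * treated
               + rng.gauss(0, 0.5))

intercept, slope, jump = fit_ols(design, gpa)
print(f"slope = {slope:.3f}, estimated discontinuity = {jump:.3f}")
```

Under these assumptions, a group coefficient near zero would correspond to no displacement of the regression line, while a reliably positive coefficient corresponds to the kind of discontinuity the design is meant to detect.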

The significant main effect for the grouping variable indicates the presence of a significant regression-discontinuity (Trochim, 1984). To determine the nature of the regression-discontinuity, separate regression equations were created for each group. In both cases CPT-Read was observed to be significantly related to GPA (see Table 2). Further examination of the regression constants shows that the Dev. Read group has an intercept which is indeed higher than that of the Non-Dev. Read group. This difference, which is most noticeable at the criterion cut-point (see Figure 1), represents the regression-discontinuity between the two groups.

Within the regression-discontinuity design, Pedhazur (1982) points out that testing the difference between intercepts (the regression discontinuity) is the same as testing the difference between adjusted means obtained in an ANCOVA. A statistical comparison of differences between adjusted means (intercepts) was conducted employing post hoc comparisons of CPT-adjusted GPA means (Pedhazur, 1982). Adjusted mean comparisons were assessed employing SAS-generated Duncan t-tests. Results for the comparison (see Table 3) indicate that the Dev. Read students achieved a mean adjusted GPA significantly above that of the Non-Dev. Read students.
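The equivalence between intercept differences and adjusted-mean differences can be illustrated with a small sketch. The Python function below, whose name `adjusted_means` and toy data are hypothetical, computes ANCOVA-style adjusted means by shifting each group's outcome mean along the pooled within-group slope to the grand mean of the covariate. It is a simplified illustration under the assumption of a common slope, not a reproduction of the SAS procedure used in the study.

```python
from statistics import mean


def adjusted_means(groups):
    """ANCOVA-style adjusted means.

    `groups` maps a label to a list of (covariate, outcome) pairs.
    Each group's outcome mean is shifted along the pooled within-group
    slope to the grand mean of the covariate.
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    grand_x = mean(all_x)

    # Pooled within-group slope: summed within-group cross-products
    # divided by summed within-group covariate sums of squares.
    sxy = sxx = 0.0
    stats = {}
    for label, pairs in groups.items():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        stats[label] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx

    return {label: my - slope * (mx - grand_x)
            for label, (mx, my) in stats.items()}


# Hypothetical toy data: (CPT-Read, GPA) pairs, not taken from the study.
# Both groups follow lines with slope 0.02 whose intercepts differ by 0.30.
toy = {
    "Dev. Read": [(60, 2.3), (70, 2.5), (75, 2.6)],
    "Non-Dev. Read": [(85, 2.5), (95, 2.7), (100, 2.8)],
}
adj = adjusted_means(toy)
print(adj)
```

In this toy example the two groups lie on parallel lines, so the gap between the adjusted means matches the gap between the two intercepts, which is the point Pedhazur's equivalence makes.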

Summary of Study 1

An examination of the relationship between initial reading levels and subsequent GPA for a group of developmental reading students and a group of non-developmental students, employing a regression-discontinuity model, shows that the GPA of developmental students is significantly greater than what their pretest scores would predict. This finding suggests that involvement in the developmental reading program may be directly related to subsequent academic success as evidenced by the GPA gain. Numerous alternative explanations, or threats to the validity of this inference, can be raised, however. Chief among them is "mortality" or attrition (Campbell & Stanley, 1966). In this regard, it is possible that there is significant attrition among the remedial reading students, such that only the brightest enroll in or complete the program. If this were the case, then we would expect these students to be better than students with comparable reading levels who are not exposed to the developmental reading program. To examine this rival hypothesis, Study 2 was conducted.

Study 2

To rule out the "mortality" or attrition bias, a third student group was identified from the college's master student data file. This group (n = 2210) consisted of those students whose CPT-Read score was below 80 but who were not placed into or enrolled in a developmental reading course. These students, based on the CPT-Read score, tested into the developmental program but for various reasons were not placed into or enrolled in a reading course. We refer to this group as the Placed/Non-Attender group. The purpose of creating the Placed/Non-Attender group was to provide a nonequivalent control group for the developmental reading group (Campbell & Stanley, 1966). These students, based on their CPT-Read assessed reading levels, are equivalent to the Dev. Read group with one important exception: they had not been exposed to the reading program (the treatment). It is therefore expected that, if the developmental program is unrelated to future academic success, the Dev. Read group would show higher CPT-Read scores than the Placed/Non-Attender group, a difference that could be attributed primarily to mortality bias. Conversely, if the program is achieving its goals, the Dev. Read group should be similar at pretest to the Placed/Non-Attenders but should achieve significantly higher GPAs than their matched counterparts at posttest.

Results for Study 2

To assess initial reading level comparability between the two groups, CPT-Read pretest means were compared. Results for the comparison indicate that the Dev. Read students (mean = 67.1) and Placed/Non-Attender students (mean = 66.9) had comparable initial reading levels (t(8641) < 1.0, p = .76). This is a critical finding since it indicates that prior to exposure to the remedial coursework, program participants (i.e., Dev. Read) and qualified non-participants (i.e., the Placed/Non-Attenders) had nearly identical reading levels. If the Dev. Read students had shown an initially higher reading pretest mean, then any subsequent between-group GPA differences might be best attributed to an initial advantage among program participants or a selection bias, rather than programmatic factors. With nearly identical performances on the pretest measure, however, it appears quite justified to use the Placed/Non-Attender group as a matched control group.

A comparison of the GPA means for the two groups indicates that the Dev. Read students earned a significantly higher overall GPA than the Placed/Non-Attenders (t(8641) = 20.7, p < .0001). The mean GPA for the Dev. Read students equals 2.40, whereas the mean GPA for the Placed/Non-Attenders equals 1.93.
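For readers who want to see the mechanics of such a two-group comparison, the sketch below computes an independent-samples t statistic with a pooled variance estimate. The reported degrees of freedom (8641) equal the two group sizes minus two, which is consistent with a pooled-variance test, but the exact procedure used in the study is not stated, so this is an assumption. The function name `pooled_t` and the sample GPAs are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Independent-samples t statistic with pooled variance; returns (t, df)."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a)
                  + (nb - 1) * variance(sample_b)) / df
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(sample_a) - mean(sample_b)) / se, df


# Degrees of freedom implied by the group sizes reported in the text.
print(6433 + 2210 - 2)  # 8641

# Hypothetical GPAs for illustration only; not data from the study.
attenders = [2.1, 2.6, 2.4, 2.9, 2.3]
non_attenders = [1.7, 2.0, 1.9, 2.2, 1.8]
t_stat, df = pooled_t(attenders, non_attenders)
print(f"t({df}) = {t_stat:.2f}")
```

The same function applies to the pretest comparison: comparable CPT-Read means yield a t statistic near zero, whereas a GPA gap that is large relative to its standard error yields a large t.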

Summary of Study 2

The focus of the analyses presented above was to test for "mortality" bias as a factor contributing to the post-program academic achievements of the Dev. Read students. After identifying a matched control group, and confirming the success of matching, a comparison of GPA means for the Dev. Read students and the control group (i.e., Placed/Non-Attenders) revealed that the program participants earn significantly higher GPAs than control students. Since program participants achieve higher subsequent GPAs than initially similar controls, we can rule out mortality bias as a plausible rival explanation.

Study 3

Results from Studies 1 and 2 suggest that exposure to the developmental reading program produces significant gains in academic performance among students who would otherwise perform at lower achievement levels. These long-term performance gains may be attributed to programmatic factors which enhance reading comprehension levels, but this conclusion requires a more immediate assessment of the Developmental Reading Program to be substantiated. From an evaluation perspective, determining the degree to which a remedial reading program accomplishes its immediate objective, improving comprehension skills, can be achieved by: 1) identifying relevant parameters which would be sensitive to skills development; 2) selecting a sample of students to serve as reliable representatives of the developmental population; and 3) obtaining both pre- and post-instruction measurements on the relevant parameters. To this end, it is the practice of the college to conduct periodic posttesting, employing the CPT-Read on randomly selected samples of developmental students, to assess the degree of skills growth over the course of the term.

The rationale for selecting the CPT test to serve as a relevant parameter is supported by two lines of reasoning. First, as stated above, the test has sufficiently established reliability and validity to be accepted as a true measure of the construct (reading) it was designed to measure. Second, utilizing the same assessment tool on subsequent occasions allows for a direct assessment of change, since the repeated administrations are made with the same measuring device.


The sample consists of 555 RE09 students and 910 RE10 students attending randomly targeted classes between the Fall 1990 and Spring 1992 semesters. For both groups the pretest to posttest interval was approximately 16 weeks.

Results for Study 3

Data aggregated from the pretest and end-of-term posttest administrations of the CPT-Read test (i.e., pretest to posttest comparisons for the two developmental reading courses) were examined. The results of the statistical comparisons (t-tests for correlated groups) between pretest and posttest CPT-Read means, broken down by developmental course (RE09 and RE10), appear in Table 4. These results indicate that significant pre- to posttest performance gains were observed in CPT-assessed reading comprehension levels for students in both RE09 and RE10 classes.
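A t-test for correlated groups reduces to a one-sample test on each student's gain score. The sketch below shows the computation in Python; the function name `paired_t` and the five pairs of scores are hypothetical and serve only to illustrate the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """t statistic for correlated (paired) scores; returns (t, df)."""
    gains = [b - a for a, b in zip(pre, post)]
    se = stdev(gains) / sqrt(len(gains))   # standard error of the mean gain
    return mean(gains) / se, len(gains) - 1


# Hypothetical pretest/posttest scores for five students.
pre = [48, 52, 55, 61, 44]
post = [60, 63, 70, 66, 59]
t_stat, df = paired_t(pre, post)
print(f"t({df}) = {t_stat:.2f}")
```

The same ratio can be checked against the summary values in Table 4: dividing each mean gain by its standard error gives roughly 13.5 / .63 ≈ 21.4 for RE09 and 3.8 / .53 ≈ 7.2 for RE10, in line with the reported t values of 21 and 7.24 once rounding of the tabled figures is taken into account.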

Summary of Study 3

Employing a psychometrically reliable and valid reading comprehension test, significant reading skills performance enhancements were observed over the progression of the remedial reading courses. A statistically significant improvement from pre- to posttest, however, cannot be automatically interpreted as a meaningful or substantial improvement. It is, rather, a prerequisite for such conclusions. Only by an examination of the level and magnitude of gain following the detection of statistically significant movement in group means can such conclusions be drawn.

In a series of studies which focused on the criterion-related validity of the CPT-Read test (Napoli, 1991; Napoli & Coffey, 1992), three points on the CPT-Read score distribution were identified to serve as reliable markers to: 1) identify students with reading comprehension levels commensurate with the demands of college-level coursework (CPT reading comprehension test scores above 72); 2) identify students with moderate comprehension-level difficulties who would benefit from RE10-level remediation (CPT scores between 66 and 72); and 3) identify students with more pronounced comprehension difficulties (i.e., below eighth-grade reading level) who would be best placed into the RE09 format (CPT scores of 65 and below).
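The three score bands just described amount to a simple decision rule, sketched below in Python. The function name `recommend_placement` and the return labels are hypothetical, and the sketch assumes whole-number CPT-Read scores; it is not the college's actual placement software.

```python
def recommend_placement(cpt_read_score):
    """Map a CPT-Read score to a placement using the cut points quoted above."""
    if cpt_read_score <= 65:
        return "RE09"           # below eighth-grade reading level
    if cpt_read_score <= 72:
        return "RE10"           # moderate comprehension difficulties
    return "college-level"      # no developmental reading required


for score in (50, 66, 72, 73):
    print(score, recommend_placement(score))
```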

An examination of the mean posttest values appearing in Table 4 shows that on average the performance of the students in both groups was elevated to a proficiency level sufficient either to 1) move into the next level of developmental courses, as in the case of the RE09 students; or 2) "test out" of the developmental reading program, as the average RE10 posttest score indicates.

The findings from the three studies provide compelling evidence that exposure to the reading program produces meaningful enhancement in reading comprehension levels. It is unlikely that the observed reading performance improvements can be attributed to "mortality" bias, since the results of Study 2 failed to detect that phenomenon within the developmental population. Together with the evidence provided by Studies 1 and 2, there appears to be convergent empirical, or "critical multiplist," support to conclude that the developmental reading courses are indeed achieving their stated goals. Individually, none of these studies provides convincing evidence to link the remedial intervention to subsequent academic outcomes. A critical multiplist approach, however, provides sufficient convergent evidence to assess the program's impact.

As a final consideration, future evaluation efforts must continue to monitor the success of the developmental reading program in similar replication studies employing long-term outcomes. Ultimately, only through comprehensive longitudinal investigations can the full impact of the program be assessed.

References

Campbell, D.T., and Stanley, J.C. (1966). Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally.

College Entrance Examination Board and Educational Testing Service. (1990). Coordinator's Guide for Computerized Placement Tests, Version 3.0. Princeton, NJ.

Cook, T.D., and Campbell, D.T. (1979). Quasi-Experimentation: Design & Analysis Issues for Field Settings. New York: Houghton Mifflin.

Hodges, D.L. (1981). Frustration: Or why LCC's system of testing and placement does not work better. Focus on Productivity: 8-11.

McElroy, P. (1985). Entrance testing at community colleges. In: Current Issue for the Community College: Essays by Fellows in the Mid-Career Fellowship Program at Princeton University.

Napoli, A. (1991). Validating CPT Reading Comprehension Test standards employing relevant college-level performance criteria. Background Readings for College Placement Tests, Educational Testing Service/The College Board. Princeton, NJ.

Napoli, A., and Coffey, C. (1992). An examination of the criterion-related validity of the CPT Reading Comprehension Test. (in review)

Pedhazur, E. (1982). Multiple Regression in Behavioral Research (2nd ed.). New York: Holt, Rinehart, and Winston.

Seaver, W.B., and Quarton, J. (1976). Regression discontinuity analysis of dean's list effects. Journal of Educational Psychology 88: 459-465.

SAS Institute (1988). SAS/STAT User's Guide (release 6.03 ed.) Cary, NC: SAS Institute.

Shadish, W.R. Jr., Cook, T.D., and Houts, A.L. (1986). Quasi-experimentation in a critical multiplist mode. New Directions for Program Evaluation 31: 29-45.

Trochim, W.M.K. (1984). Research Design for Program Evaluation. Newbury Park, CA: Sage.

Ward, Kline, and Flauger (1986). Summary of pilot testing results on computer adaptive testing. Background Readings for College Placement Tests, Educational Testing Service/The College Board. Princeton, NJ.

Authors' Note

Anthony Napoli is Director of Institutional Research, and Assistant Professor of Psychology and Statistics at Suffolk Community College. He is also a doctoral student and adjunct lecturer in the Psychology Department of SUNY Stony Brook. Specialization: Evaluation research, tests and measurement, health psychology.

Dr. Paul Wortman is Professor of Psychology and Director of Undergraduate Studies, Department of Psychology, State University of New York at Stony Brook. Specialization: Evaluation of innovative programs using new research methodologies.

Christina Norman is a doctoral student in the Department of Psychology, State University of New York at Stony Brook. Specialization: Evaluation research, health psychology.


This study was funded in part by the College Board.

Table 1

Analysis of Variance for the Regression of GPA on CPT Read and Group.

Source of Variance

* p < .0001

Table 2

Regression of GPA on CPT-Read for each Group.

Group            df
Non-Dev. Read    1, 6107
Dev. Read        1, 6431

A - Intercept

* p < .0001

Table 3

Comparison of CPT-Adjusted GPA Means for Developmental Reading Students and Non-Developmental Students.

Group (Ntot = 12542)        Adjusted¹ Mean GPA² (SEM)
Non-Dev. Read (n = 6109)    2.39 (.016)
Dev. Read (n = 6433)        2.56 (.013)

¹ Means which are statistically adjusted for initial CPT-Read levels.

² All adjusted means in column differ significantly at the p < .0001 level.

Table 4

Comparison of CPT-Read Pretest and Posttest Means for RE09 and RE10 College-Wide Samples.

Course (N)        Pretest Mean (SD)   Posttest Mean (SD)   Gain Mean (Std Err)   T       Prob. <
RE09 (N = 555)    50.5 (9.4)          64 (14.8)            13.5 (.63)            21      .0001
RE10 (N = 910)    71.7 (8.9)          75.5 (15.4)          3.8 (.53)             7.24    .0001