Educating Minds and Hearts to Change the World

Rhetoric and Composition Essay Assessment Results
Office of Institutional Assessment
October 5, 2008
Mark Meritt, Ph.D., Associate Professor, Rhetoric and Composition
Devon Holmes, Ph.D., Associate Professor, Rhetoric and Composition
Leslie Dennen, Ed.D., Assistant Professor, Rhetoric and Composition
University of San Francisco

Background and Method: Report on Rhetoric and Composition's Assessment of Freshman Writing, 2007-2008

The purposes of the present study are identical to those of the study conducted last academic year (2006-2007): 1) to determine the degree to which students in Rhetoric and Composition's first-year writing (RC 110-120) and first-year written and oral communication (RC 130-131) sequences are meeting or progressing toward learning outcomes for area A2 (written communication) of USF's Core Curriculum, and 2) to assess the effectiveness of the present method of determining that progression. The above-mentioned learning outcomes include the following: critical analysis of academic discourse, integration of sources, academic research, style, and revision. (For a fuller discussion of these specific learning outcomes and of the goals of the Program in Rhetoric and Composition, please refer to last year's report.)

The method adopted for this assessment is identical to last year's and is as follows: During the 2007-2008 academic year, the Program in Rhetoric and Composition collected sample first essays from fall semester sections of RC 110 and 130 (the first-semester courses in the two sequences) and final essays from spring semester sections of RC 120 and 131 (the second-semester courses in the two sequences). During the fall 2007 semester, two students from each section of RC 110 and 130 were randomly selected. Copies of these students' first essays (with grades and comments but without names) were to be collected by the Program in Rhetoric and Composition. At the end of the spring 2008 semester, these same students' final essays for RC 120 and 131 (with grades and comments but without names) were also to be collected.
Due to the usual attrition that occurs as students pass from one course to another (e.g., students failing the first course, dropping either course, electing not to take the second-semester course in the spring, withdrawing or transferring from the university) and due to the program's failure to obtain second-semester essays from many of the students, we collected a much smaller sample than anticipated. In the end, we collected 4 pairs of 110-120 essays and 10 pairs of 130-131 essays. This sample size is clearly inadequate and significantly qualifies any generalizations attempted below regarding the success of the writing sequence in helping students achieve the core learning outcomes. It also suggests that a new mode of collecting student essays must be found.

During the fall 2007 semester, we began the process of reading and assessing the collected essays. Each essay was read twice, each time by a different reader, and readers recused themselves from reviewing work they recognized as that of their own students. Readers assessed the degree to which each essay displayed evidence of the student's achieving four of the five stated learning outcomes, rating each essay's performance in relation to each outcome on a scale of 0 (no evidence of mastery) to 3 (strong evidence of mastery). For each essay, the two readers' scores were combined to yield a final score on a scale of zero (0) to six (6). (See Appendix for rubric and grading charts.)

Through this process, we hoped to determine the degree to which students' work and experience during our first-year writing and writing/speaking sequences enabled them to meet our core learning outcomes. More specifically, we hoped to see an overall improvement in scores (and therefore evidence of students having met learning outcomes) as students moved from the beginning of the first-semester course to the end of the second-semester course.
We also hoped to determine whether this assessment of essays out of the context of students other work (and without the context of written assignment guidelines or constraints, as well as of other factors such as course
reading) would provide us with a sufficient basis upon which to assess both students' progress and the effectiveness of our curriculum (both in its conception and in its execution).

We determined at the outset that the assessment method described above would not allow us to assess our fifth learning outcome, revision (E). Because we did not require instructors to provide both drafts and final versions of the essays, we could not determine the degree to which students revised their work for final presentation. This absence of any assessment of revision may lead to a refinement of the assessment process (e.g., asking instructors to provide both draft and final versions of sample papers).

Results:

RC 110-120: Overall, 120 scores were higher than 110 scores for every outcome area. Scores in areas B and C (Integrating Sources and Academic Research) were dramatically higher, while gains in areas A and D (Analysis and Style) were more modest (particularly in area D).

For outcome area A (Critical analysis of academic discourse), the average 110 essay score was 1.5, and the average 120 essay score was 2.5. Students on average therefore improved by a margin of 1 (67%).

For outcome area B (Integrating multiple academic sources), the average 110 score was 0.25, and the average 120 essay score was 3. Students on average therefore improved by a margin of 2.75 (1,100%).

For outcome area C (Academic research), the average 110 essay score was 0, and the average 120 essay score was 3. Students on average therefore improved by a margin of 3. (A percentage cannot be calculated from a baseline of 0.)

For outcome area D (Style), the average 110 essay score was 2.75, and the average 120 essay score was 3. Students on average therefore improved by a margin of 0.25 (9%).

RC 130-131: 131 scores were higher than 130 scores for areas B, C, and D (Integrating Sources, Academic Research, and Style). Scores in area C were dramatically higher.
130 scores were significantly higher than 131 scores for area A (Analysis).

For outcome area A (Critical analysis of academic discourse), the average 130 essay score was 4, and the average 131 essay score was 3.1. Student performance on average therefore decreased by a margin of 0.9 (-22.5%).

For outcome area B (Integrating multiple academic sources), the average 130 score was 2.1, and the average 131 essay score was 3.4. Students on average therefore improved by a margin of 1.3 (62%).

For outcome area C (Academic research), the average 130 essay score was 1.5, and the average 131 essay score was 3.9. Students on average therefore improved by a margin of 2.4 (160%).

For outcome area D (Style), the average 130 essay score was 3.7, and the average 131 essay score was 4.4. Students on average therefore improved by a margin of 0.7 (18%).
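The improvement figures reported above follow a simple computation: the margin is the second-course average minus the first-course average, and the percentage expresses that margin relative to the first-course average (which is why no percentage can be given when the baseline is 0). A minimal sketch of this arithmetic (the function name is illustrative, not part of the assessment instrument):

```python
def percent_improvement(first_avg, second_avg):
    """Margin of improvement as a percentage of the first-course average.

    Returns None when the first-course average is 0, since the
    percentage is undefined in that case (as with outcome area C
    in the 110-120 sequence).
    """
    if first_avg == 0:
        return None
    return (second_avg - first_avg) / first_avg * 100

print(percent_improvement(0.25, 3))            # outcome B, 110-120: 1100.0
print(round(percent_improvement(4, 3.1), 1))   # outcome A, 130-131: -22.5
print(percent_improvement(0, 3))               # outcome C, 110-120: None
```

Note that because the baseline averages differ so widely across outcome areas, the percentages are not directly comparable to one another; the raw margins are the more conservative figures.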
Scores of 110-120 Essays

Learning Outcome                             110 Average Score   120 Average Score   Average Degree of Improvement
A. Critical analysis of academic discourse   1.5                 2.5                 1 (67%)
B. Integrating multiple academic sources     0.25                3                   2.75 (1,100%)
C. Academic Research                         0                   3                   3 (n/a)
D. Style                                     2.75                3                   0.25 (9%)

Scores of 130-131 Essays

Learning Outcome                             130 Average Score   131 Average Score   Average Degree of Improvement
A. Critical analysis of academic discourse   4                   3.1                 -0.9 (-22.5%)
B. Integrating multiple academic sources     2.1                 3.4                 1.3 (62%)
C. Academic Research                         1.5                 3.9                 2.4 (160%)
D. Style                                     3.7                 4.4                 0.7 (18%)

Discussion:

As in the 2006-2007 assessment, the overall rise in scores may suggest that students in both sequences are making measurable progress toward the Core Area A2 outcomes. In particular, scores on 120 and 131 essays present clear evidence that students are progressing toward writing longer papers requiring library and internet research and the incorporation of numerous sources. As was the case last year, while such increases in scores over the course of the academic year are encouraging, positive conclusions regarding success in the delivery of our curriculum must be qualified for several reasons.

First, the dramatic increases seen in areas B and C may merely reflect the absence of required research and integration of sources in early 110 and 130 assignments, compared to late or final 120 and 131 assignments that require library and internet research, as well as the integration of multiple sources. Indeed, three of the four 110 papers included clearly did not call for the student to draw upon either class or outside reading, while all of the 130 paper assignments appeared to ask students to analyze a single text closely (whether one assigned as reading or chosen by the student), rather than integrate multiple sources.
In sum, the increase in scores for areas B and C shows clearly that instructors near the end of the sequence are asking students to perform more challenging research and source-based writing tasks; however, it cannot be concluded that students are performing those tasks more capably, since there are no early attempts at such tasks to which their later efforts might be compared.

As was also the case last year, 130-131 essay scores in area A (Analysis) actually decreased over the one-year sequence. Again, however, to interpret these results as evincing a fall-off in analytical writing skills would be a serious mistake. Rather, it is clear from the essays that instructors in the 130-131 sequence frequently began the year by asking students to write essays analyzing, assessing, and responding to a single non-fiction prose text
(leading to papers receiving high scores in area A), and closed the sequence with a research-based writing project that required not close analysis of texts but the integration of multiple sources to support the student's own argument (a task which often leads a student to mine many sources for small items of information [e.g., quotations, statistics] rather than to analyze sources closely).

In contrast to last year's assessment, this year's study revealed a noticeable discrepancy between the performances of 110-120 and 130-131 students. Performance levels for the two groups last year were roughly comparable, as seen in the chart below:

Average 120 vs. 131 Scores, 2006-2007

Outcome Area               Average 120 Score   Average 131 Score
A (Analysis)               3.29                3.83
B (Integrating Sources)    3.58                3.56
C (Academic Research)      3.32                3.06
D (Style)                  3.9                 4

As can be seen, there is no great discrepancy (except perhaps for outcome A) between the scores of the two groups; in fact, the 120 group (who have lower placement scores and might be expected to perform at a slightly lower level) exceeded the performance of the 131 group in areas B and C. Such, however, was not the case in the 2007-2008 study, as seen in the chart below:

Average 120 vs. 131 Scores, 2007-2008

Outcome Area               Average 120 Score   Average 131 Score
A (Analysis)               2.5                 3.1
B (Integrating Sources)    3                   3.4
C (Academic Research)      3                   3.9
D (Style)                  3                   4.4

Outcome Area               Average 120 Score, 2006-2007   Average 120 Score, 2007-2008
A (Analysis)               3.29                           2.5
B (Integrating Sources)    3.58                           3
C (Academic Research)      3.32                           3
D (Style)                  3.9                            3

Outcome Area               Average 131 Score, 2006-2007   Average 131 Score, 2007-2008
A (Analysis)               3.83                           3.1
B (Integrating Sources)    3.56                           3.4
C (Academic Research)      3.06                           3.9
D (Style)                  4                              4.4

As can be seen in the charts above, 120 scores for 2007-2008 were significantly lower than both the 2006-2007 120 scores and the 2007-2008 131 scores.
Yet while it may be tempting to interpret these differences as suggesting either a decrease in student performance in the 110-120 sequence from one academic year to another or an overall superiority in the 130-131 curriculum or student performance in that curriculum during the 2007-2008 year, the insufficient sample size and participation rate (only four students participated from the 110-120 sequence) make such an inference questionable.
In fact, this low participation rate prohibits any meaningful overall assessment of learning in the Rhetoric and Composition course sequences. Clearly, Rhetoric and Composition must devise a more effective means of collecting a larger sample of student work if the program wishes to assess student achievement of learning outcomes with any degree of validity.

The assessment project faces several additional obstacles in its current form. As noted last year and in the discussion above, collecting first and last essays from the sequence may provide a misleading portrait of student learning in the courses. More specifically, courses are often designed so that different learning outcomes are met in different writing tasks. For example, close analysis of texts may be the focus in one assignment, while researching and developing an argument on a compelling civic or academic issue may be the focus in another. Without requiring that courses begin and end with assignments similarly comprehensive in addressing the core outcomes (a policy that would hamper teachers' capacity to design assignment sequences that build in complexity or address different writing situations), looking merely at first and last assignments may lead to the impression that courses are neglecting some outcomes (those explicitly addressed only in assignments that were not collected) in favor of others (those explicitly addressed in the assignments that were collected).

Another potential problem with the assessment is that readers were aware of the course from which essays were drawn. That is, we as readers knew whether we were reading 110, 120, 130, or 131 essays. Such an awareness may bias the reading in several ways. For example, a reader might over-rate a 130 paper (vs. a 110 essay), since he or she knows that placement in 130 requires a higher writing placement score than that needed for 110.
Similarly, since readers were aware when they were reading papers from the second courses in the sequences (120 and 131), it is possible that expectations for or assumptions about progress from one semester to another may have led to inflated 120 and 131 scores. Though such inflation or bias is unlikely, a blind reading of the essays (one in which the course number was not indicated) would produce a more valid result.

Finally, while this assessment's direct observation of student writing (as opposed to more indirect methods of assessment) lends it greater weight, such a method may also overlook important developments in students' writing abilities that do not manifest themselves in their written products from the course sequence. According to Witte and Faigley, a potential weakness in such assessments is the assumption "that the effects of writing instruction on students should be evident in the students' written products after only a very short time" (36). Students may in fact be developing significantly as writers without yet being able to demonstrate that development fully after their first year. An assessment project that includes examination of student writing after and beyond the course, and one that includes a means of understanding developments in students' habits of and thoughts about writing (perhaps through interviews or questionnaires), may provide data that could usefully supplement that derived from direct observation of student writing.

Works Cited

Witte, Stephen, and Lester Faigley. Evaluating College Writing Programs. Carbondale: Southern Illinois UP, 1983.
Essay Scoring Results:

Analysis/Reading Comprehension

Essay number   110 combined score   120 combined score   Improvement
01             0                    2                    2
02             0                    2                    2
03             6                    3                    -3
04             0                    2                    2

Integrating Sources

Essay number   110 combined score   120 combined score   Improvement
01             0                    3                    3
02             0                    3                    3
03             1                    4                    3
04             0                    2                    2

Academic Research

Essay number   110 combined score   120 combined score   Improvement
01             0                    4                    4
02             0                    2                    2
03             0                    4                    4
04             0                    2                    2

Style/Sentence Structure

Essay number   110 combined score   120 combined score   Improvement
01             4                    4                    0
02             2                    2                    0
03             3                    4                    1
04             2                    2                    0
Analysis/Reading Comprehension

Essay number   130 combined score   131 combined score   Improvement
01             3                    2                    -1
02             3                    4                    1
03             4                    2                    -2
04             5                    4                    -1
05             3                    2                    -1
06             4                    4                    0
07             5                    3                    -2
08             4                    3                    -1
09             5                    4                    -1
10             4                    3                    -1

Integrating Sources

Essay number   130 combined score   131 combined score   Improvement
01             3                    3                    0
02             4                    3                    -1
03             2                    2                    0
04             2                    4                    2
05             1                    2                    1
06             2                    4                    2
07             2                    6                    4
08             2                    3                    1
09             1                    4                    3
10             1                    3                    2

Academic Research

Essay number   130 combined score   131 combined score   Improvement
01             3                    4                    1
02             4                    4                    0
03             2                    4                    2
04             1                    5                    4
05             0                    3                    3
06             1                    4                    3
07             1                    6                    5
08             1                    2                    1
09             1                    4                    3
10             0                    3                    3

Style/Sentence Structure

Essay number   130 combined score   131 combined score   Improvement
01             4                    5                    1
02             5                    6                    1
03             4                    5                    1
04             5                    5                    0
05             1                    2                    1
06             3                    4                    1
07             4                    5                    1
08             2                    4                    2
09             5                    4                    -1
10             4                    4                    0
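The summary averages reported in the Results section can be recomputed from the per-essay combined scores in the charts above by taking the mean of each column and subtracting. A brief sketch of that calculation, using the 110-120 Academic Research column as an example (the variable names are illustrative only):

```python
# Per-essay combined scores (two readers, each 0-3, summed to a 0-6 scale)
# for the Academic Research outcome in the 110-120 sequence, as charted above.
research_110 = [0, 0, 0, 0]
research_120 = [4, 2, 4, 2]

# Average score per course and the average margin of improvement.
avg_110 = sum(research_110) / len(research_110)
avg_120 = sum(research_120) / len(research_120)
margin = avg_120 - avg_110

print(avg_110, avg_120, margin)  # 0.0 3.0 3.0
```

This reproduces the figures reported for outcome area C in the 110-120 sequence (110 average 0, 120 average 3, margin of improvement 3).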