Remarks to presidents of the Association of American Universities (AAU), prepared for delivery at the AAU meeting on Oct. 23, 2012
I would like to begin by thanking Hunter and the AAU for the opportunity to speak with all of you about this topic. It is important to all of us, though it affects our campuses in different ways. Indeed, each time that I meet with the AAU provosts, I am impressed by the diversity of missions and circumstances that exist even among this rather elite group of institutions. For public institutions, state regulatory oversight adds a layer of complexity to the assessment issue not confronted by those of us in the private sector. I will try to bear all of these differences in mind as I speak. When I draw upon Princeton-specific examples, I do so not to suggest that our circumstances or policies are paradigmatic — they are not — but simply because those are the examples that I know best.
The assessment movement, by the way, seems to consider Princeton anything but paradigmatic. I was visited last month by a prominent member of that community who told me that he and his colleagues regard Princeton as — and I quote — "intransigent." I suspect that many of your institutions may be on the "intransigence" list along with us. Perhaps that is why I have been invited here this morning — to present the "intransigent" perspective on assessment. If so, I will try not to disappoint! But I want first to observe that this characterization of our position is unfortunate and misleading.
Princeton — like all of the institutions in this room — believes emphatically in the importance of learning. And, like all of you, we endorse the need to assess learning. For example, our senior thesis, which is required for nearly all of our students, is a demanding version of what the assessment movement calls a "capstone experience," one that compels students to integrate what they have learned over their careers and apply it to an original research project. We also have what I like to call a "grading fairness policy," but what our students insist on calling a "grade deflation policy." The policy is designed to ensure, among other things, that grades convey real and meaningful information about performance. (If more of you would like to join us in this endeavor, we would welcome your entry into the field, since our students never tire of pointing out that very few other institutions seem to care about assessment in this particular way.)
We invoke other mechanisms, including various kinds of specialized testing instruments, when our faculty members deem such devices suitable on the basis of their scholarly and pedagogical judgment. For example, Professor Bonnie Bassler of our molecular biology department has recently led an overhaul of our science requirements for non-science majors. She is now partnering with educational researchers from Stanford to determine whether the new curriculum is achieving its objectives.
That said, our commitment — and I suspect yours — is fundamentally to a culture of learning and engagement, not to a culture of assessment. This is why the assessment movement considers Princeton to be "intransigent." To elaborate a bit: we think that there is clear evidence that the variable most critical to learning is genuine faculty and student engagement (for example, faculty-student contact, faculty-guided research projects, class participation, hours spent studying, and so on). Assessment is useful if and only if it promotes genuine engagement and learning. Done wrongly, assessment can also get in the way of learning. We believe that the kind of assessment now fashionable with foundations and accreditors is likely to do exactly that, principally because of two characteristics:
- Its insistence on "evidence ... that can be assessed against an externally informed or benchmarked level of achievement, or compared with those of other institutions and programs."(1)
- Its focus on creating a "culture of assessment," rather than on creating a "culture of engagement" or a "culture of learning" on campuses.
These two tendencies generate seven distinct problems with the calls for "learning outcomes assessment" that we hear from foundations and accreditors:
1. Vagueness. Proponents of assessment often endorse surveys and standardized tests such as NSSE and the "Collegiate Learning Assessment." When critics express skepticism about surveys and standardized tests, they say that these are not the only mechanisms they would allow — but they are very vague about the alternatives. We are concerned that this vagueness is an effort to obscure the problems with insisting on "externally informed ... benchmarks" in domains where such benchmarks may be of little value.
2. Standardization. An insistence on external benchmarking presupposes that an important purpose for assessment is to make comparisons across institutions. This emphasis on comparison causes the assessment movement to undervalue important kinds of evidence about learning (for example, the evidence about learning produced by our senior thesis requirement, which cannot be externally benchmarked because the projects are highly individualized and because few if any other institutions require a thesis). We believe that this tendency toward standardization is especially inappropriate given the legitimate diversity among institutional missions. For example, Juilliard, Caltech, Princeton, and, indeed, the many institutions in this room have significantly different missions — and the diversity becomes orders of magnitude greater when we consider higher education as a whole. Of course, there is also diversity across departments and courses within universities — our classics department and our physics department have different views about what students should learn from their curricula. And finally there is diversity across students. One of the benefits of small student/faculty ratios, when we can achieve them, is that teachers may adapt instruction to the needs, questions, and passions that their students bring to the classroom.
3. Teaching to the test. In our view, external benchmarking inevitably leads to an environment where investment is driven by the benchmarks rather than by faculty judgment about learning. We see this with "No Child Left Behind"; we see it with the impact of the U.S. News and World Report rankings. The assessment movement tells us that this need not be so — they say, for example, that there are better forms of externally benchmarked assessment which somehow avoid the problems of "No Child Left Behind." They do not say exactly what these better forms are — and it would be naïve to accept the claim. If you make rewards or prestige depend on particular quantitative or semi-quantitative measures, people will manage toward the maximization of those measures in order to secure the rewards or prestige they promise. This is part of what I mean to caution about when referring to a "culture of assessment."
4. Pervasiveness. We believe, as I have already said, that assessment data is useful in controlled experiments supervised by engaged faculty with good judgment, but it does not follow that every institution should collect data of this kind, much less that every course should do so. The assessment movement tends to ignore this point; it insists not only on the importance of externally benchmarked evidence but also on the pervasive collection of such evidence. For example, the New Leadership Alliance's presidential commitment to excellence requires institutions to promise to make "continuous improvement of student learning assessment" a priority on their campuses. (2) In fact, the presidential commitment to excellence includes five different promises related to assessment and none that mention engagement, independent research, student-faculty contact, or the rigor of reading and writing assignments (this presidential commitment is, I remind you, supposed to be a commitment to excellence, not to assessment!). Likewise, the alliance's pamphlet entitled "Assuring Quality" tells colleges and universities that "student learning outcomes assessment" should be "pervasive — part of the institutional culture, ongoing, consistent, systematic, and sustainable across programs, departments, and the entire institution." (3)
This point is important, and it may be useful to offer a couple of examples to illustrate it:
- Williams College has conducted a study, using learning outcomes data, that shows that the achievement of its liberal arts objectives depends critically on engagement. Here are the engagement variables that Williams found most significant: discussing career plans, intellectual topics, or course selection with a faculty member; interacting with a faculty member at a social event; working with a faculty member on a research project; papers or projects requiring the integration of material from multiple sources; conducting research from primary sources; making a formal in-class presentation; and participating actively in class discussions. Now, if I take this assessment evidence seriously, what is the most important thing that I can do to improve learning on Princeton's campus? Should I replicate the study to see whether Princeton students are different from Williams students? Or should I maximize the kinds of engagements recommended by the Williams study? The answer is obvious, if one favors a culture of engagement and learning rather than a culture of assessment.
- Richard Arum and Josipa Roksa are the authors of Academically Adrift: Limited Learning on College Campuses, which is something of a bible or manifesto for the assessment movement. They run regressions on CLA results and, on page 118, they offer a set of conclusions about what factors positively and negatively affect student learning. (4) Here are their main findings (and hold on to your seats, folks, because these are the kind of earthshaking thunderbolts made possible by the rigorous analysis of externally benchmarked learning outcomes data!). Arum and Roksa found that student learning correlates positively with high faculty expectations; demanding reading and writing assignments; hours spent studying alone; and concentration in the traditional liberal arts disciplines. They found that there is a negative correlation with hours spent in fraternities and sororities (they also found a negative correlation with hours spent studying in groups, though they seem less confident about this conclusion). Now suppose I accept these findings. What should I do? Does it follow that we need more assessment? Or that we need more intensive reading and writing assignments, and higher faculty expectations? Arum himself followed up his book with a letter to university trustees urging all of them to "demand that your institution assess student learning with a clear and valid instrument that shows the value that your institution adds to the core skills that every graduate needs." But Arum's research presents no evidence whatsoever that administration of the CLA promotes learning — on the contrary, his evidence suggests that trustees should urge their institutions to focus on rigorous reading and writing requirements.
5. Overvaluing of artificial outcomes/undervaluing of real outcomes. We believe that to the extent accountability is a purpose of assessment, the best way to achieve it is by focusing on the real outcomes that parents and students want. These include better jobs, more fulfilling post-graduation lives, and, we think, high levels of engagement while they are at college. Measuring these things may require data collection projects beyond what most of us now undertake, but they are real outcomes, and they can be measured. By contrast, parents and students do not come to college wanting high scores on the Collegiate Learning Assessment; indeed, many come with an explicit aversion to the standardized testing regimes that they have endured throughout their schooling.
6. Undervaluing of input evidence and non-benchmarked output evidence. The assessment movement tends to disparage the value of input data — for example, data about hours spent studying, or about student-faculty ratios. Their argument is that what matters are the outputs, not the inputs — it matters whether students learn, not whether they were taught in a small group. It is of course true that input evidence can be misleading — a course with a low student-faculty ratio, for example, can be a lousy course. But measurements of faculty and student engagement — including those identified by the Williams study — do matter (and, indeed, externally benchmarked learning outcomes data regularly confirm that they matter). They are often the best evidence we have of whether an institution is doing what it should to promote learning.
Some adherents of the assessment movement may in fact have reasons to distract attention from the importance of engagement. Achieving real engagement can be expensive and difficult: it requires genuinely high standards. Assessment is easier. Institutions like the University of Phoenix — which has signed the New Leadership Alliance's commitment to excellence — can enthusiastically commit themselves to a "culture of assessment." Trumpeting the importance of pervasive, externally benchmarked assessment is a way to construct phony measures of quality that distract attention from the critical importance of meaningful student-faculty engagement — which is a shame, because there is powerful evidence both that students value engagement and that it helps them learn.
7. Circularity and self-validation. If assessment proponents were to take their own medicine, they should want to test empirically the benefits of assessment itself. They should want to prove, in other words, that students from institutions that adopt their practices have better real world outcomes — better jobs, better performance in graduate school, higher levels of civic engagement, more fulfilling lives, and so on — than students from institutions that do not. To my knowledge, there is no evidence to support this claim.
Let me conclude by returning to where I began. Despite my disagreements with Arum and Roksa, the problems identified in their book are real. I don't think that we need the CLA or other fancy tests to tell us what the problems are. There are too many students who drift through college without investing enough work and energy to benefit from the resources that all of us offer. We have, at the same time, some faculty members who flee from teaching rather than relishing it. To address these problems, we need to promote engagement with learning. If judiciously used, assessment — whether or not externally benchmarked — can help institutions to achieve that goal. How it helps, and in what form it helps, will vary across our campuses. It may be that on some campuses, the best way to use assessment is through the kinds of externally benchmarked measures that the assessment movement prizes. I do not, however, believe that will be true on every campus; I doubt that it will be true on most of our campuses. I also realize, of course, that for some institutions, it may simply be impractical to resist the pressures being brought by the assessment movement or by state legislators who sympathize with it. In any event, if externally benchmarked assessment works for your campus, for whatever reason, that's fine. My point is that engagement promotes learning more reliably than does assessment. To the extent we can, all of us should aim for a culture of engagement, not a culture of assessment. That goal, unfortunately, is one that the assessment movement all too often ignores or even disparages.
1 New Leadership Alliance, "Committing to Quality: Guidelines for Assessment and Accountability in Higher Education," page 7 (2012).
2 New Leadership Alliance, "Commitment Text," available online at http://www.newleadershipalliance.org/what_we_do/presidents_alliance/pre… text/
3 New Leadership Alliance, "Assuring Quality: An Institutional Self‐Assessment Tool for Excellent Practice in Student Learning Outcomes Assessment," page 8 (preview version 2012).
4 Richard Arum and Josipa Roksa, Academically Adrift: Limited Learning on College Campuses 118 (Chicago: University of Chicago Press 2011).