Evaluation and the Academy: Are We Doing the Right Things?
INTRODUCTION
It is a traditional and generally accepted role of teachers to evaluate their students. We usually accomplish this task by assigning grades and writing letters of recommendation. Informally, of course, we are constantly evaluating students in conversations, office hours, and the like. As representatives of a discipline and members of a larger academic community, we also evaluate peers as well as younger colleagues: it is a well-established professional obligation that commonly takes the form of letters of recommendation. Evaluation is generally considered to be a core function of our collegial life.
That all is not well in these domains is no secret: inside and outside colleges and universities there has been much discussion about grade inflation and the debasement of letters of recommendation (we prefer the term “letters of evaluation”). There is no unanimity about either the causes or consequences of changed standards of evaluation. Even the very existence of a problem is doubted by some observers. Nevertheless, there appears to be enough unease, lack of consensus, and “noise” to justify a closer examination.
To that end, an informal group of academics from different fields and backgrounds met over the past year at the American Academy of Arts and Sciences. We asked the same questions for both grades and letters of recommendation: what is the current situation, what are its consequences, and what remedies, if any, are needed and possible? This Occasional Paper represents the results of our discussions.
On all these issues we reached a general consensus, although individual differences about some interpretations remain. Our hope is to start a discussion among our colleagues in all different types of institutions across the country. Such discussions could clarify the situation in each college and university and lead to salutary changes. The quality of evaluation admits of no national solution. Each institution has to determine and be responsible for its own standards, and the best beginning is awareness of the issues.
Current conditions have to be seen in the context of recent history. Since World War II, colleges and universities—along with nearly all American institutions—have experienced major changes. A few examples will suffice. The number of faculty members and the number and percentage of students seeking higher education have dramatically increased since that time. The 1950 census indicates that there were 190,000 academics; a decade later there were 281,000, and by 1970 the number had swelled to 532,000.1 In 1998, according to the latest figures from the U.S. Department of Education, there were 1,074,000 faculty members employed by institutions of higher learning. At the turn of the twentieth century only about 1 percent of high-school students attended college; that figure is closer to 70 percent today. Racial and gender diversity has also increased markedly over the past several decades. In 1975, there were 11 million students: 47 percent were women, 15 percent were minorities (Black, Hispanic, Asian, American Indian/Alaskan Native). By 1997, there were 12,298,000 students, the percentage of women had grown to 56 percent, and minorities represented 25 percent of the student population.
At the same time, the country’s tertiary institutions have faced, and some are still facing, serious economic pressures and increased competition, and many are far less isolated from the outside world. All sectors of society clamor for access to knowledge and skills available in our laboratories and in other forms of faculty expertise.
These changes—largely external in origin—have had a variety of consequences for higher education. In what follows we begin by examining the implications of a specific and in our opinion undesirable practice that is part of these changes: grade inflation. At first glance, this practice may appear to be of little consequence, but we shall argue that its presence calls into question central values of academic life.
WHAT ARE THE FUNCTIONS OF GRADES?
Professors expect, and have received, a considerable measure of respect in our society. The privileges that flow from this status are related to the functions they perform and the values they bring to these performances. Consensus about these values has become diluted in recent years. For example, there is controversy in some institutions over the relative weight to be given to teaching and research, and over the role of political and ideological commitments in teaching and scholarship. The appropriateness of faculty unions is a matter of concern for other institutions. Nevertheless, whatever the balance of energies, commitments, and working arrangements, academics are only entitled to the respect they would like to command if they affirm some common standards. Among these, the least controversial—perhaps the most elementary—is the imperative for accuracy in evaluating their students’ academic work. Yet, there is overwhelming evidence that standards regarding student grading have changed substantially over time.
Grades are intended to be an objective—though not perfect—index of the degree of academic mastery of a subject. As such, grades serve multiple purposes. They inform students about how well or how poorly they understand the content of their courses. They inform students of their strengths, weaknesses, and areas of talent. This may be helpful to students in making decisions about a career. They also provide information to external audiences: for example, to colleagues not only in one’s own institution but to those in other institutions, to graduate schools, and to employers. We believe that this view of grades represents the consensus within the academy.
We recognize, of course, that a significant number of students who had low grades in school were spectacularly successful in later life. That fact, however, does not weaken the rationale for grades. No one would claim that grades are a completely accurate index of the comprehension of subject matter, let alone a predictor of achievement in the world at large. Yet, they remain an efficient way to communicate valid information, but only if a meaningful range of grades exists.
Some professors hold the view that low grades discourage students and frustrate their progress. Some contend it is defensible to give a student a higher grade than he or she deserves in order to motivate those who are anxious or poorly prepared by their earlier secondary school experiences. Advocates of this opinion contend that students ought to be encouraged to learn and that grades can distort that process by motivating students to compete only for grades. A few institutions have acted on this premise by using only written comments; for example, Hampshire College, Goddard College, and Evergreen State College (all small liberal arts colleges) and until recently U.C. Santa Cruz.2 A more radical view holds that it is inappropriate for a professor to perform the assessment function because it violates the relationship that should exist between a faculty member and students engaged in the collaborative process of inquiry. Some critics of grades argue that grading is a distorting, harsh, and punitive practice.
We doubt that these positions are espoused by large numbers in the academic community. Grades certainly are not harsh for those who do well, and empirical evidence for the hypothesis that lowering the anxiety over grades leads to better learning is weak. As for the inappropriateness of professors performing the assessment function, one must ask: who will perform this task? Relegating evaluation to professional or graduate schools and employers simply “passes the buck” and is unlikely to lead to more accurate and fair evaluations. Although the rejection of grading does not represent the academic mainstream, the criticisms are influential in some circles, and so we will return to them later in this paper.
DOES GRADE INFLATION EXIST? THE EVIDENCE
Grade inflation can be defined as an upward shift in the grade point average (GPA) of students over an extended period of time without a corresponding increase in student achievement.3 Unlike price inflation, where dollar values can (at least in theory) rise indefinitely, grade inflation has a fixed upper boundary: grades cannot rise above an A or a 100. The consequence is grade “compression” at the upper end.
We will begin by reviewing grading trends as described in the literature, but will confine our sample to undergraduates. The situation in professional and graduate schools requires separate analysis. Relatively undifferentiated course grading has been a traditional practice in many graduate schools for a very long time. One justification for this may be the wide reliance on general examinations and theses.
Most investigators agree that grade inflation began in the 1960s4 and continued through, at least, the mid-1990s. Several studies have examined the phenomenon over time, as illustrated in the following table:
Grade Inflation from 1960 to 1997
Author(s) and years studied | Sample size | Findings
Arvo E. Juola, 1960–1978 (a) | 180 colleges (with graduate programs) | From 1960 to 1974 the average GPA increased by nearly half a grade point (0.432). From 1974 to 1978, a leveling of grade inflation was detected.
Arthur Levine and Jeanette S. Cureton, 1969, 1976, 1993 (b) | Survey of 4,900 undergraduates at all institutional types | Grades of A- or higher grew from 7 to 26 percent. Grades of C or below fell from 25 to 9 percent.
George Kuh and Shouping Hu, 1984–1987; 1995–1997 (c) | 52,256 student surveys from the College Student Experiences Questionnaire (CSEQ) at all institutional types | College grades increased over time in every institutional type, on average from 3.07 to 3.34.
(a) Arvo E. Juola, “Grade inflation in higher education-1979. Is it over?” ED189129 (March 1980).
(b) Arthur Levine and Jeanette S. Cureton, When Hope and Fear Collide: A Portrait of Today’s College Student (San Francisco: Jossey-Bass, 1998).
(c) George Kuh and Shouping Hu, “Unraveling the Complexity of the Increase in College Grades from the Mid-1980s to the Mid-1990s,” Educational Evaluation and Policy Analysis (Fall 1999): 297–320.
Arvo Juola from Michigan State University was one of the earliest researchers to raise concerns about grade inflation.5 His surveys of colleges and universities found that grade inflation continued unabated between 1960 and 1977.6 From 1960 to 1974 the average GPA increased nearly half a letter grade (0.432), with the greatest annual increases occurring between 1968 and 1972.7 Arthur Levine and Jeanette Cureton compared data from undergraduate surveys of 4,900 college students from all types of institutions in 1969, 1976, and 1993. Their research found that the proportion of A’s increased nearly fourfold during that time (from 7 percent in 1969 to 26 percent in 1993) and the proportion of C’s declined by nearly two-thirds (from 25 percent in 1969 to 9 percent in 1993).8 Different estimates suggest that across all institutional types GPA’s rose approximately 15–20 percent from the mid-1960s through the mid-1990s.9 A recent study by George Kuh and Shouping Hu comparing the GPA’s of 52,000 students (approximately half from the mid-1980s and half from the mid-1990s) found that student grades had risen from 3.07 in the mid-1980s to 3.34 in the mid-1990s.10 By the mid-1990s, the average grade (formerly a C) resided in the B- to B range.11 More recent research across all types of schools shows that only between 10 percent and 20 percent of students receive grades lower than a B-.12
Grade inflation moderated by the second half of the 1990s; its rate of growth has declined from the highs of the 1960s and 1970s. This result is to be expected because—as noted earlier—unlike price inflation, grade inflation is constrained by an immovable ceiling. An A is the upper limit, and, therefore, the recent decline in the growth rate is not an unambiguous indication of changed standards. Indeed, the seemingly mild degree of inflation in the table is, over time, very much magnified by compression at the top, which inexorably lessens the possibility of meaningful gradations.
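The ceiling argument can be made concrete with a small numerical sketch. The Python snippet below is only an illustration under stated assumptions: the five-student class, the 4-point scale, and the uniform half-point bump per round are all hypothetical and are not drawn from any of the studies cited above. It applies the same degree of leniency in every round and reports how the class average and the spread between the best and worst grades respond.

```python
# Hypothetical illustration: none of these numbers come from the cited studies.
CEILING = 4.0  # an A on an assumed 4-point scale


def inflate(grades, shift):
    """Raise every grade by `shift` points, but never above the ceiling."""
    return [min(g + shift, CEILING) for g in grades]


def mean(values):
    """Arithmetic mean of a list of grades."""
    return sum(values) / len(values)


def spread(values):
    """Distance between the highest and lowest grade."""
    return max(values) - min(values)


# A made-up class of five students with clearly separated grades.
grades = [2.0, 2.5, 3.0, 3.5, 4.0]

for step in range(4):
    at_ceiling = sum(g == CEILING for g in grades)
    print(f"round {step}: mean={mean(grades):.2f}, "
          f"spread={spread(grades):.1f}, at ceiling={at_ceiling}")
    # Apply the same half-point leniency in every round.
    grades = inflate(grades, 0.5)
```

In this toy class the average rises by 0.4, then 0.3, then 0.2 points, even though the leniency applied is identical each round, while the gap between the strongest and weakest student shrinks from 2.0 to 0.5. A slowing rate of increase is therefore compatible with unchanged grading behavior, and the loss of gradation continues throughout.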
Patterns of grading show inflation to be more prevalent in selected disciplines. Grades tend to be higher in the humanities than in the natural sciences, where objective standards of measurement are enforced more easily.13 This was probably always true, but the differences by discipline appear to have increased over time. It is not surprising that the “softer” subjects exhibit the severest grade inflation.
Although higher grades appear in all types of institutions, grade inflation appears to have been especially noticeable in the Ivy League. In 1966, 22 percent of all grades given to Harvard undergraduates were in the A range. By 1996 that percentage had risen to 46 percent and in that same year 82 percent of Harvard seniors graduated with academic honors.14 In 1973, 30.7 percent of all grades at Princeton were in the A range and by 1997 that percentage had risen to 42.5 percent. In 1997, only 11.6 percent of all grades fell below the B range.15 Similarly, at Dartmouth, in 1994, 44 percent of all grades given were in the A range.
When considered alongside indexes of student achievement, these increases in grades do not appear to be warranted. During the time period in which grades increased dramatically, the average combined score on the Scholastic Aptitude Test (SAT) actually declined by 5 percent (1969–1993).16 Since the SAT’s recentering in 1995 (when the mean was reset to a midpoint of 500 in a range of 200 to 800), scores have increased only slightly: the average combined score in 1995 was 1,010, and in 2000 it was 1,019.
By one estimate, one third of all college and university students were forced to take remedial education courses, and the need for remediation has increased over time. One study found that between 1987 and 1997, 73 percent of all institutions reported an increase in the proportion of students requiring remedial education.17 Further, from 1990 to 1995, 39 percent of institutions indicated that their enrollments in remedial courses had increased.18 Currently, higher education devotes $2 billion a year to remedial offerings,19 and faculty have noticed a shift in student ability and preparation. In 1991, a survey conducted by the Higher Education Research Institute found that only 25 percent of faculty felt their students were “well-prepared academically.”20
Discussions that led to standards-based reform also show that system administrators, regents, and state boards of education felt a growing unease about the competence of their students. Eighteen states have implemented competency tests that all high-school graduates must pass. Similar testing programs are being considered in several states for institutions of higher learning. The University of Texas System, Utah’s State Board of Regents, and the sixty-four-campus SUNY system are all considering implementing competency tests.21
Measures of average achievement are far from perfect, but the available evidence does support the proposition that grading has become more lenient since the 1960s. Higher average grades unaccompanied by proportionate increases in average levels of achievement define grade inflation.
We have already mentioned that increases in average grades appear to have been especially noticeable in the Ivy League. Because admission into these institutions has become increasingly competitive since the 1960s, it might be possible to argue that higher average grades merely reflect a more academically talented student body. There is some evidence for higher quality, but the magnitude of grade increases in Ivy League institutions seems to indicate inflationary pressures as well.22
EXPLANATIONS OFFERED FOR GRADE INFLATION
The dynamics of grade inflation are complex, and a variety of explanations have been offered.
The Sixties and the Vietnam War
Students played a prominent part in the turmoil of the 1960s and early 1970s. Their activities were dominated by resistance to the Vietnam War draft, and institutions of higher learning were challenged by the resulting social unrest. It has been suggested that faculty members were reluctant to give poor grades to male students during those years because forcing them to drop out of school would have made them subject to wartime military service.23 In the words of one professor at the University of Florida:
The upward shift started in the jungles of Vietnam, when those of us now at the full-professor level were safely in graduate school. We were deferred by virtue of being in school, which wasn’t fair and we knew it. So when grading time came, and we knew that giving a C meant that our student (who deserved a D) would go into the jungle, we did one better and gave him a B.24
Eventually, the courtesies extended to draft-age males became the norm.
Specific incidents of campus unrest created particularly large inflationary leaps in grades. In 1969–1970 many institutions cancelled final examinations following the U.S. Army invasion of Cambodia. At Harvard—to cite just one case—students were allowed to designate ex post facto whether they preferred a letter grade or pass-fail. The effects of this decision on GPA’s are obvious.
The 1960s and the first half of the 1970s also witnessed rising student enrollments and therefore a great expansion of the faculty. Some three hundred thousand new professors were hired between 1960 and 1970, doubling the size of the professoriate.25 The new faculty members generally were young, anti-war individuals who identified with the values of students, and this shifted the faculty’s ideological base. The ideals of these new “student centered” faculty members, who were concerned with student development and protection, collided with those of “institutionally centered” faculty members, who were more concerned with preserving the assessment function of higher education.26
Response to Student Diversity
During the past three decades, increasing numbers of students from varied socioeconomic groups have attended institutions of higher learning. The preparation of these students has sometimes been inadequate. Some have argued that in the interest of retaining these students, colleges and universities have been forced to become more lenient.27 It has been suggested that lower grades (C’s and D’s) were effectively eliminated and grades became compressed into the upper (B and A) range. However, as we have already shown, grade inflation began in the 1960s when poor and minority students represented a tiny proportion of the national student body.28 Even as late as the early 1970s, for example, black students represented only 8 percent of the total student population. Furthermore, fully 60 percent of these students attended historically black colleges and universities at this time.29 Thus, the claim that minority students started grade inflation appears specious. Most importantly, William Bowen and Derek Bok have demonstrated that, on average, black students in their sample did somewhat less well in college than white students who entered with the same SAT scores.30 That finding does not support the idea of faculty favoritism toward minorities.
New Curricular and Grading Policies
Certain curricular requirements, for example, foreign language, mathematics, and science, were abandoned by many schools in the 1960s, giving students the opportunity to avoid difficult courses that were less suited to their abilities. Many colleges and universities adopted freer distribution requirements, which gave students increased control over their curriculum and allowed them to avoid more demanding courses and the risk of a poor grade.
Other policy changes with similar consequences allowed students to withdraw from courses well into the semester (sometimes up to the final week), removed “first attempt” grades (letting students take a class again and substitute the higher grade), and presented pass-fail as an option.31 Many institutions adopted “pluses” and “minuses” for the first time, which, some have argued, allowed grades to drift upwards.32
Student Evaluations
Another policy frequently linked to grade inflation is the widespread and growing use of student evaluations. Student evaluations have played a role (sometimes an important one, depending on the type of institution) in promotion, tenure decisions, and merit-pay increases.33 Research has shown that grades were significantly correlated with student ratings of faculty performance—that is, courses with higher grades received higher evaluations.34 For example, a study conducted at the University of Washington found that faculty members who were “easy graders” received better evaluations.35 Thus, according to this source, good evaluations could be partially “bought” by assigning good grades.36 On the other hand, low grades carried the risk of small enrollments, which might endanger a promising professional career in its early stages.37
Students as Consumers
Another force associated with grade inflation, particularly in the 1980s, is the rise in consumerism: universities operating like businesses for student clients. Demographic projections during the 1980s suggested that the pool of potential college applicants would decline. Although students of nontraditional age ultimately closed the anticipated gap for some institutions, colleges and universities began competing more fiercely for students and their retention. Students wanted good grades,38 and from their perspective a C was well below average.39 Moreover, institutions that resisted grade inflation found that their graduates had a more difficult time being accepted into graduate programs.40 This fact made their graduates unhappy and their programs less attractive.
Former President Rudenstine applied a version of this reasoning to Harvard: “The faculty over the last thirty years have begun to realize that the transcript matters…. Often your degree as an undergraduate is not your last degree, so [students] are worried about their transcripts.”41 Rudenstine went on to imply that an increased demand for graduate education “led professors to give better grades so that Harvard students would not be disadvantaged.”
Faculty practices have reinforced student expectations. Students counted on “good grades.” For example, a 1999 study at one university found that large proportions of undergraduates in five different courses assumed grade inflation was the norm—even students who reported doing “average” work expected B’s or A’s.42
Watering Down Content
Another source of grade inflation is the watering down of course content at some institutions. As course content becomes less demanding, it is reasonable to see grade averages rise. Grade inflation cannot be accounted for solely by identifying faculty members who are especially lenient. Other faculty members may become party to the process by simply demanding less of students than they did in the past. The grades they assign may be valid, but students are required to master less content to earn them.43
The Role of Adjuncts
Even if some of the historical factors producing grade inflation have recently become less powerful—if only because of compression— there are other pressures that may sustain inflationary tendencies. One is the changing internal structure of the faculty in our colleges and universities. Currently, only about half of all faculty members are designated “tenured” or “tenure track.” The other half are described as “adjuncts”: an academic proletariat with few rights and benefits, frequently holding part-time jobs at more than one institution at the same time. Their position is vulnerable from below in the form of student pressure and from above in the form of the displeasure of administrators. They have little reason to be loyal to the institutions for which they work for they are often overworked and underpaid. This situation is likely to lead to more tolerant grading, a tendency that is exacerbated by high workloads that make it impractical to engage in careful student evaluation.
This pressure extends beyond adjuncts. A study conducted by Michael Kolevzon of Virginia Commonwealth University compared ten “high grade inflation” and ten “low grade inflation” departments at a four-year university with approximately 8,500 undergraduates. Three-quarters of the faculty from high grade inflation departments indicated that rising class sizes and more nonclassroom commitments (e.g., committee work, publishing, advising) detracted from time that could be devoted to evaluating students.44
RECAPITULATION, MECHANISM, AND CONSEQUENCES
That grade inflation has existed from the late 1960s to the present is beyond dispute. The rate of inflation, however, has varied during these thirty-five years among institutions and departments. But again the phenomenon of inflation is undeniable unless one asserts that there has been an extraordinary improvement in the quality of students during this period, and for that there is very little evidence. Indeed, on a national level, most evidence goes in the opposite direction.
There is much less agreement about the causes of grade inflation. We have supplied the reasons found in the literature, and there is little doubt that the beginning of grade inflation was closely related to the Vietnam War and its consequences. Other cited causes are controversial and may or may not have played an important role. For the record, it should be noted that the most controversial claim is rejected by almost all who have studied the subject: we refer to it as a “response to student diversity.” When grade inflation originated in the 1960s there was virtually no “student diversity” in the sense in which that term is used today.
It is most important to stress that, once started, grade inflation has a self-sustaining character: it becomes systemic, and it is difficult for faculty to opt out of the system. When significant numbers of professors adjust their grades upwards so as to shelter students from the draft—as certainly happened during the Vietnam era—others are forced to follow suit. Otherwise, some students will be disadvantaged, and pressures from students, colleagues, and administrators will soon create conformity to emerging norms. (The analogy is not perfect, but when the economy experiences price inflation, the individual seller will adjust prices upwards, and in higher education there is no equivalent of government or the Federal Reserve that can arrest that process.)
We are describing an inflationary system in which the individual instructor has very little choice. Grade inflation is not the consequence of individual faculty failure, lowered standards, or lack of moral courage. It is the result of a system that is self-sustaining and that produces less than optimal results for all concerned. The issue is not to assign blame; rather, it is to understand the dynamics of grade inflation and its consequences.
Are there any adverse consequences? Quite a few can be deduced from what we have said. The present situation creates internal confusion giving students and colleagues less accurate information; it leads to individual injustices because of compression at the top that prevents discrimination between a real and an inflated A; it may also engender confusion for graduate schools and employers. Not to address these issues represents a failure of responsibility on the part of university and college faculties acting collectively: we have the obligation to make educational improvements when needed and when possible. Simply to accept the status quo is not acceptable professional conduct. We need, if possible, to suggest ways for institutions to initiate reforms that will allow as clear gradations as possible to replace the present confusion.
EXTERNAL VERSUS INTERNAL CONSIDERATIONS
Do inflated grades really hamper the selection process as carried out by those who normally rely on undergraduate transcripts? It is very difficult to answer that question with a desirable degree of certainty. We have found no large body of writings in which, for example, employers or graduate schools complain about lack of information because of inflated grades. Informal conversations with some employers and graduate schools lead us to believe that the traditional users of grades have learned to work around present practices: they expect to find high and relatively undifferentiated grades, and therefore rely more heavily on other criteria.
Graduate schools use standardized tests (e.g., the GRE), recommendations, the ranking of particular schools, and interviews. Grade inflation invites admissions committees to place more emphasis on standardized test scores, which in our view is not necessarily a wise shift in emphasis. Corporations conduct their own evaluations: interviewing candidates, checking references, and in some cases testing the analytic skills of candidates. Grades remain an important criterion but their influence may be waning. For example, one survey of human resource officers (HROs) from Fortune 500 companies in 1978, 1985, and 1995 found that the percentage of HROs who agreed that transcripts of college grades ought to be included with an applicant’s resume fell from 37.5 percent to 20 percent.45 Judith Eaton, president of the Council for Higher Education Accreditation, asserts that employers have become dissatisfied with grading information, arguing that now “government and business want to know more specifically what kind of competencies students have.”46
It is certain that a diminution in the use of grades increases the relative weight of informal evaluations, and thus being in the proper network may become more valuable than personal achievement. As a matter of fairness, society should have an interest in counteracting this trend.
Suppose, just for the sake of argument, that the net negative impact of working around grades is small, and in addition that grades are less important to those who—in some manner—choose our graduates. Should we then adopt the radical response either to give no grades at all, or—and it amounts to the same thing—award A’s to all students? In other words, are there wholly internal justifications for formal evaluations of students that offer meaningful gradations? The answers have been given at the beginning of this essay. Grades, if they discriminate sufficiently, help and inform students in many different ways, and students are entitled to these evaluations.
For evaluations to accomplish their intended purpose we must question a currently popular assumption in psychology and education that virtually all students can excel academically across the board, and in life as well. On this assumption, differences in performance are primarily attributed to levels of “self-confidence” or “self-esteem,” which are assumed to be the most important determinants of success; motivation and talent are relevant, though secondary. The enemy of high self-confidence is criticism, and that is how rigorous evaluation is perceived.
These sentiments may be powerful elements in grade inflation: praise, it is believed, motivates accomplishment. There may even be a grain of truth in this proposition, but it is far from the whole truth. Talent and motivation remain powerful explanatory factors in achieving success. In fact, most studies do not support the connection between academic success and self-esteem. In a recent comprehensive review article, Joseph Kahne quotes Mary Ann Scheirer and Robert E. Kraut as follows:
The overwhelmingly negative evidence reviewed here for a causal connection between self-concept and academic achievement should create caution among both educators and theorists who have heretofore assumed that enhancing a person’s feelings about himself would lead to academic achievement.47
THE NEED FOR AND THE POSSIBILITIES OF CHANGE
Is there a way to change the status quo? There is neither an easy nor a single answer to that question. Since the term “inflation” originated in economics, we can refer to another concept from the same discipline in order to put the question in focus. Gresham’s Law says that if two kinds of money (for example, gold coins and paper money) have the same denomination but different intrinsic value, the bad money (paper) will drive the good money (gold) out of circulation because the good money will be hoarded. The only solution is currency reform in which only a single standard prevails. In education, bad grading practices drive out good grading practices, creating their own version of Gresham’s Law. Can we devise the equivalent of currency reform in higher education? The obstacles are obvious. Currencies are controlled by a single authority, and generally a state can enforce uniform standards. None of this exists in the American system of higher education, nor would we favor anything of the sort. Each institution has to make its own assessment and find its own solutions. The best we can hope for is a series of small steps and individual institutional initiatives whose cumulative effects could amount to the beginnings of reform. Recognizing the problem is a meaningful place to start.
What are the characteristics of a good grading system?
- It should be rigorous, accurate, and permit meaningful distinctions among students in applying a uniform standard of performance.
- It should be fair to students and candid to those who are entitled to information about students.
- It should be supportive of learning and helpful to students in achieving their educational goals.
Short of a fundamental systemic overhaul or return to an earlier day, neither of which is a realistic possibility, we review various suggestions that are contained in the literature.
Institutional Dialogue
The academic profession is the only one that provides virtually no formal training or guidance to new entrants concerning one of their primary responsibilities: teaching and evaluation.48 Expectations, responsibilities, and standards are rarely discussed or committed to paper.49 It would be helpful if this type of dialogue occurred in departments or in faculties as a committee of the whole.50 Greater comparability of standards and fairness could result.
It would also be a good idea to make students a part of the institutional dialogue. Their ideas about how the system might be made more supportive of their educational ambitions would be especially appropriate and valuable.
More Information
Faculty members ought to know how their grading standards compare to those of their colleagues. Some universities (Harvard and Duke are examples) provide such data. In Harvard’s Faculty of Arts and Sciences each professor annually receives an index number for each course taught that compares individual grading practices with departmental averages. This practice has not eliminated grade inflation, but it may have slowed its progress and made the system more equitable.
Additional Information
Some schools have adopted the practice of providing additional information about course grades on student transcripts. These schools include Columbia,51 Dartmouth, Indiana,52 and Eastern Kentucky.53 Typically information about the number of students in the class and the average grade is added to the letter grade on the transcript.54 Grade inflation is not addressed directly, but the information does help those who wish to put the transcript in perspective.
Alternative Grading Systems
Various alternative or modified grading systems intended to mitigate aspects of grade inflation are in use. For example, a reduction in the range of grades from A through E to a simpler honors, pass, and fail might help reestablish “pass” as the average. Providing comments along with letter grades is another method of contextualization. Still another strategy is to administer general examinations to seniors, perhaps using outside examiners, which is the practice at Swarthmore.55 However, both written comments and general examinations are labor-intensive and do not seem practical for mass higher education.
A Standard Grade Distribution
In large classes it seems appropriate for departments and/or instructors to establish a standard distribution (a curve) so that distinctions are both fair and maintained over time. The distribution need not be totally inflexible—exceptions can occur—but this would be a useful yardstick.
We are conscious of the fact that all suggestions for change are partial and not wholly persuasive. This is not a surprise because no single or easy solution exists. The main plea is to be clear about professional standards and obligations and to bring practices into line with these standards. The selection of a standard will necessarily be an individual matter—individual for each college or university, department, and faculty. The present system is flawed. The ethics of professional conduct demand that we—as faculty members—seek the best solutions for our institutions.
FACTORS LEADING TO INFLATED LETTERS OF RECOMMENDATION
Thus far, we have dealt in some detail with the most common form of evaluation, namely, grades. The other major type of evaluation is letters of reference. Faculty members write letters on behalf of colleagues who are seeking promotion, tenure, and other positions, or who are competing for grants and fellowships.56 They also provide references for students, which is an integral part of the graduate admissions and employment process.57 This form of evaluation will receive less extensive treatment in this paper: the overlap with grade inflation is very large and problems related to letters are unfortunately much less well researched. What evidence is available (empirical, anecdotal, and experiential) leads us to conclude that letters of recommendation suffer from many of the same weaknesses and problems as grades, or worse. A commentary on letters written for promotion and tenure decisions summarized well the prevailing view: “Puffery is rampant. Evasion abounds. Deliberate obfuscation is the rule of the day.”58 Letters for students are similarly flawed. A member of Cornell’s admissions committee observed ruefully: “I would search applications in vain for even subordinate clauses like ‘While Susan did not participate often in discussions….’”59 As experienced academics, all of us sense the accuracy of these observations.
LETTERS OF REFERENCE: EVALUATION OR ACCLAMATION?
We believe that since the late 1960s, academics have been less willing to express negative opinions—either about their students or their colleagues. Many reasons for this phenomenon are identical to the forces that have created grade inflation, such as a legacy of the 1960s, an absence of clear standards, pressures to accommodate student “customers,” and the like. As with grade inflation, the problem is systemic: once inflationary rhetoric becomes normative, it is difficult for individual faculty members to do otherwise. As one faculty member remarked: “It becomes like a nuclear arms race. If Michigan is using lots of adjectives, U.C.L.A. better too.”60 It is even possible to think of compression at the top, just as was the case with grade inflation: if letters are largely positive, how can one indicate true distinction?
Some differences between letters and grades do exist. Letters are much more personal. They use descriptive words about specific individuals and therefore it is easier to make an author of a letter accountable for his or her text. Some faculty members are concerned that their anonymity cannot be assured when they write a letter of recommendation. Many faculty members have had their recommendations inadvertently leaked to candidates and have reported being harassed because of their statements.61 Other faculty members are uncomfortable criticizing colleagues. They may wish to help a colleague who failed to achieve tenure land on his or her feet.62 They also have a desire to see their students succeed—for the sake of their students and for their own sake, since having students accepted into graduate school reflects well on their department.63
The most important difference between letters of recommendation and grades is the fear of legal action, which appears to have had a powerful influence on letters. In 1974 Congress passed the Family Educational Rights and Privacy Act, which gave students legal access to their files, including letters of recommendation written on their behalf. The extent to which letters could remain confidential became, and has remained, uncertain, even if students “waive their rights.”64 In addition, states with “sunshine laws,” such as California, provided little anonymity for letter writers.65 At present, fear of litigation has a chilling effect on the candor of those writing letters of evaluation, even though such litigation is rare.66
To explore this important matter in more detail, and especially because it is the factor that most clearly separates grades and letters, we sought the most authoritative advice possible. Martin Michaelson, of Hogan and Hartson in Washington, D.C., and a leading expert on legal issues that affect the academy, was kind enough to offer his thoughts on the legal risks associated with letters of evaluation, and we include his important communication in its entirety.
In recent years, a distinct perception has taken hold among employers in the United States (by no means limited to colleges and universities) that candid disclosure of negative or even equivocal information about personnel, to prospective employers and others, entails considerable risk and hence is inadvisable. The factual and legal basis of that perception is not entirely clear. Plainly, however, underlying factors include increased litigiousness in our society, a heightened responsiveness of the law to workers’ rights, and the belief of many thoughtful persons that even slight and subtle criticisms can sometimes harm reputations and severely derail careers.
Coincident with the perception that candid disclosures can be legally dangerous, a new level of concern had developed that an employer’s failure to reveal pertinent personnel information, too, can have far-reaching implications. For example, the institution that declines to disclose a departing employee’s serious, proven misconduct to a prospective employer (especially when the prospective employer seeks such information) should worry that the omission could have a range of unwholesome ramifications.
Pertinent law, which varies considerably among the 50 states, addresses those concerns in several ways. In some contexts, the law recognizes a qualified privilege that attaches to personnel evaluations communicated in good faith in response to a prospective employer’s request. More basically, truth is a legal defense to alleged defamation. And—apart from the duty, sometimes recognized by courts, not to misrepresent in a recommendation by being false or misleading in context—judicial precedents generally limit employers’ “duty to warn” to situations in which the law assigns the person who has the adverse information special obligations regarding it. (In some jurisdictions, for example, psychotherapists are required to take steps to prevent bodily harm to a person a patient threatens in the course of psychotherapy.) But the legal current does not flow reliably in a direction favorable to candid referees. For example, the Supreme Court in University of Pennsylvania v. EEOC (1990) declined to recognize a special privilege of confidentiality for faculty references subpoenaed in an EEOC proceeding.
Extensive law reform, to promote reliable disclosure of adverse (as well as favorable) information about personnel, may be desirable. But such reform would be beyond the capacity of the academy acting alone; a far wider consensus, entailing coordinated action by legislatures and courts throughout the nation, would be required. Colleges and universities can, however, promote candor in evaluations in valuable ways, such as these:
- Higher education institutions should be prepared to indemnify faculty, and other personnel, who in the course of performing their duties supply good faith candid appraisals that other institutions and non-institutional employers seek and need.
- Faculty members should have access, without charge, to the institution’s attorneys for particularized advice on how to handle requests for references, appraisals, and the like, in the full range of circumstances germane to the faculty member’s discharge of institutional duties.
- Institutions should carefully address and specify their policies on confidentiality of and, where indicated, access to letters of reference and similar materials, taking into account the relevant legal considerations.
- No less than in other delicate areas of legal regulations (such as sexual harassment, research-related conflicts of interest, or compliance with copyright law in the reproduction of course materials), faculty should regularly be supplied background information and general guidance on legal implications of appraisals of personnel, by experts engaged by the institution.
No single prescription is likely to address adequately all personnel reference and disclosure situations. But periodic written and oral guidance on this topic to faculty from university attorneys and others is bound to reduce risks in practice and foster a salutary candor in evaluations of faculty, students, and staff.
Michaelson’s observations underscore the complexity of the problem both inside and outside the academy. Legal precedent neither entirely dispels concerns about potential litigation, nor does it substantiate undue concerns of those who fear writing candid letters. The law does provide the greatest protection to the frank evaluator, because “truth is a legal defense to alleged defamation.” Michaelson rightly points out that colleges and universities can do much to encourage a climate of candor by supporting faculty and staff who write candid appraisals, allocating resources to this end, providing background information, allowing consultation with the college or university counsel, and indemnifying those who supply good faith candid appraisals.
CONSEQUENCES
The consequences of inflated letters of recommendation are much the same as for grade inflation: poorly differentiated and therefore less useful information.
- Inflated recommendations do not help external audiences distinguish between candidates: If too many candidates are described with superlatives, one might well wonder whether recommendations are of any use at all.67 Furthermore, inflation cheats those excellent candidates who deserve great praise68 and gives less distinguished applicants an unfair and unearned advantage.69 It may also cause the employer or educational institution to have unrealistic expectations of the candidate.70
- Inflated letters create self-sustaining and systemic pressures that make this form of evaluation almost meaningless.71
- The evaluation process is driven into increasingly informal channels: In some fields, grade inflation has created an increasing reliance on letters of recommendation.72 However, if recommendations fail to provide useful information, people who need information about potential candidates will be forced to gather information in more informal ways (e.g., telephone calls to friends). This may result in a process where the real information is shared primarily in private channels and therefore is not open to outside scrutiny—a strengthening of the “old boy and girl” network.
A FEW RECOMMENDATIONS
Can anything be done? A few partial remedies have been suggested. For example:
- Avoid writing “general” letters of recommendation: Whenever possible, evaluators ought to write recommendations regarding specific positions rather than writing a blanket “all purpose” letter. Research suggests that greater specificity results in less vague and lofty rhetoric.73 Specificity also adds to the perceived credibility of a recommendation in the minds of employers,74 and no doubt fellowship committees as well.
- Discuss what you will and will not write with the candidate: Before agreeing to write a letter, discuss with the candidate your assessment of him or her. He or she will then be in a better position to decide whether to have you write on his or her behalf.75
- Be clear about your expectations regarding confidentiality: Confidentiality tends to produce more honest appraisals, and research suggests that confidential recommendations are less likely to be inflated.76 Insisting on student waivers is desirable. Those in charge of admissions and job searches look more favorably on confidential letters.77 Confidentiality can be breached in case of lawsuits, but those are rare events.
Faculty members who write letters of evaluation have a two-fold responsibility. First, the candidate deserves to have his or her unique qualities and qualifications accurately and carefully described. Second, evaluators also have a responsibility to the persons who are receiving the letter and using that information to make decisions. Those persons deserve a balanced account of all candidates. A rephrased Golden Rule is the best guide: Write to others the kind of letter of recommendation you would like to receive from them. To follow the rule is responsible professional conduct. Not to follow the rule perpetuates harmful practices in the academy.
CONCLUSION
The reluctance to engage in frank evaluation of students and colleagues has, as we have shown, many different sources. Individually, these are less important than the dynamics created by this reluctance. Once they start, grade inflation and inflated letters are subject to self-sustaining pressures stemming from the desire not to disadvantage some students or colleagues without cause. This self-sustaining character eventually weakens the very meaning of evaluation: compression at the top before long will create a system of grades in which A’s predominate and in which letters consist primarily of praise. Meaningful distinctions will have disappeared.
A system that fears candor is demoralizing. Much is lost in the current situation, primarily useful information for students, colleagues, graduate schools, and employers. Even if those who need accurate information have learned to “work around the system,” the cost of what prevails today remains high. Instead of moving through formal and open channels, information is guided toward informal and more secretive byways.
We know of no quick or easy solutions; habits of thirty years’ duration are not easily changed. But change has to begin by recognizing the many aspects of the problem, and that is why we urge discussion and education about professional conduct and responsibilities. Reform will have to occur institution by institution, and we hope that what we have presented in this paper will offer a good way to begin.
ENDNOTES
3. Goldman, “The Betrayal of the Gatekeepers: Grade Inflation,” 1985.
4. Juola, “Grade inflation in higher education: What can or should we do?” 1976.
6. Juola, “Grade inflation in higher education-1979. Is it over?” 1980.
8. Levine and Cureton, When Hope and Fear Collide: A Portrait of Today’s College Student, 1998.
12. Farley, “A is for average: The grading crisis in today’s colleges,” 1995.
13. Wilson, “The Phenomenon of Grade Inflation in Higher Education,” 1999.
14. Lambert, “Desperately Seeking Summa,” 1993.
17. Levine, “How the Academic Profession is Changing,” 1997.
19. Schmidt, “Colleges are starting to become involved in high-school testing policies,” 2000.
20. Dey, Astin, and Korn, “The American Freshman: Twenty-Five Year Trends, 1966–1990,” 1991.
21. Schmidt, “Faculty outcry greets proposal of competency tests at U. of Texas,” 2000.
22. This is verified by data provided by C. Anthony Broh, director of research for COFHE.
23. Lamont in Goldman, “The Betrayal of the Gatekeepers: Grade Inflation,” 1985.
24. Twitchell, “Stop Me Before I Give Your Kid Another ‘A,’” 1997.
25. Goldman, “The Betrayal of the Gatekeepers: Grade Inflation,” 1985.
26. Wilson, “The Phenomenon of Grade Inflation in Higher Education,” 1999.
27. Mansfield, “Grade Inflation: It’s time to face the facts,” 2001.
28. Cross, “On scapegoating Blacks for grade inflation,” 1993.
29. Lucas, American Higher Education: A History, 1994.
30. Bowen and Bok, The Shape of the River, 1998.
32. Potter, “Grade Inflation: Unmasking the Scourge of the Seventies,” 1979.
34. Aleamoni and Kennedy in Goldman, “The Betrayal of the Gatekeepers: Grade Inflation,” 1985.
35. Wilson, “New Research Casts Doubt on Value of Student Evaluations of Professors,” 1998.
37. Beaver, “Declining college standards: It’s not the courses, it’s the grades,” 1997.
38. Basinger, “Fighting grade inflation: A misguided effort?” 1997.
39. Walhout, “Grading across a career,” 1997.
40. Perrin, “How Students at Dartmouth Came to Deserve Better Grades,” 1998.
41. Harvard Crimson, 7 March 2001.
42. Landrum, “Student Expectations of Grade Inflation,” 1999.
43. Crumbley in Basinger, “Fighting grade inflation: A misguided effort?” 1997.
44. Kolevzon, “Grade inflation in higher education: A comparative study,” 1981.
45. Spinks and Wells, “Trends in the Employment Process: Resumes and Job Application Letters,” 1999.
46. McMurtie, “Colleges are Urged to Devise Better Ways to Measure Learning,” 2001.
47. Kahne, “The Politics of Self-Esteem,” 1996.
51. Archibald, “Just because the grades are up, are Princeton students smarter?” 1998.
52. McConahay and Cote, “The Expanded Grade Context Record at Indiana University,” 1998.
53. Wilson, “The Phenomenon of Grade Inflation in Higher Education,” 1999.
55. Whitla, personal communication, 12 July 2000.
58. Schneider, “Why you can’t trust letters of recommendation,” 2000.
59. Altshuler, “Dear admissions committee,” 2000.
60. Schneider, “Why you can’t trust letters of recommendation,” 2000.
62. Callahan, “When friendship calls, should truth answer?” 1978.
63. Schneider, “Why you can’t trust letters of recommendation,” 2000.
64. Fox, personal communication, 1 August 2000.
65. Schneider, “Why you can’t trust letters of recommendation,” 2000.
66. Ibid.; Ryan and Martinson, “Perceived effects of exaggeration in recommendation letters,” 2000.
67. Ryan and Martinson, “Perceived effects of exaggeration in recommendation letters,” 2000.
71. Ryan and Martinson, “Perceived effects of exaggeration in recommendation letters,” 2000.
72. Kasambira, “Recommendation inflation,” 1984.
74. Knouse, “The letter of recommendation: Specificity and favorability of information,” 1983.
75. Fox, personal communication, 1 August 2000.
REFERENCES
Aamodt, Michael G., Bryan, Devon A., and Whitcomb, Alan J. 1993. “Predicting performance with letters of recommendation.” Public Personnel Management 22: 81–89.
Altshuler, Glenn C. 2000. “Dear admissions committee.” New York Times (9 January): 4A.
Archibald, Randal C. 1998. “Just because the grades are up, are Princeton students smarter?” New York Times (18 February).
Basinger, David. 1997. “Fighting grade inflation: A misguided effort?” College Teaching 45 (3) (Summer): 81–91.
Baumeister, R. F. 1996. “Should schools try to boost self esteem?” American Educator 22 (Summer): 14–19.
Beaver, William. 1997. “Declining college standards: It’s not the courses, it’s the grades.” The College Board Review 181 (July): 2–7.
Bok, Sissela. 1999. Lying, 2d ed. New York: Vintage Books.
Bowen, W. G. and Bok, D. 1998. The Shape of the River. Princeton, N.J.: Princeton University Press.
Bromley, D. G., Crow, H. L., and Gibson, M. S. 1973. “Grade inflation: Trends, causes, and implications.” Phi Delta Kappan 59 (10): 694–697.
Callahan, Daniel. 1978. “When friendship calls, should truth answer?” Chronicle of Higher Education (7 August): 32.
Ceci, Stephen and Peters, Douglas. 1984. “Letters of Reference: A Naturalistic Study of the Effects of Confidentiality.” American Psychologist 39 (1) (January): 29–31.
Cole, W. 1993. “By Rewarding Mediocrity We Discourage Excellence.” Chronicle of Higher Education (6 January): B1–B2.
Cross, Theodore L. 1993. “On scapegoating Blacks for grade inflation.” Journal of Blacks in Higher Education (1) (Fall): 47–56.
Dey, Eric L., Astin, Alexander W., and Korn, William S. 1991. “The American Freshman: Twenty-Five Year Trends, 1966–1990,” Higher Education Research Institute, Graduate School of Education, University of California, Los Angeles (September): 37–38.
Dreyfuss, Simeon. 1993. “My fight against grade inflation: A response to William Cole.” College Teaching 41 (4) (Fall): 149–152.
Edwards, C. H. 2000. “Grade inflation: The effects on educational quality and personal well being.” Education 120 (3) (Spring): 538–546.
Farley, Barbara. 1995. “A is for average: The grading crisis in today’s colleges.” Princeton, N.J.: Princeton University Mid-Career Fellowship Program. ED384384.
Fox, John B., Jr. 2000. Personal communication (1 August).
Geisinger, K. F. 1979. “A Note on Grading Policies and Grade Inflation.” Improving College and University Teaching 27 (3): 113–115.
Goldman, L. 1985. “The Betrayal of the Gatekeepers: Grade Inflation.” Journal of General Education 37 (2): 97–121.
Grieves, R. 1982. “A Policy Proposal Regarding Grade Inflation.” Educational Research Quarterly (Summer).
Juola, Arvo. 1976. “Grade inflation in higher education: What can or should we do?” ED129917 (April).
Juola, Arvo. 1980. “Grade inflation in higher education-1979. Is it over?” ED189129 (March).
Kahne, Joseph. 1996. “The Politics of Self-Esteem.” American Educational Research Journal 33 (2) (Spring): 3–22.
Kasambira, K. Paul. 1984. “Recommendation inflation.” Teacher Educator 20 (2) (Fall): 26–29.
Knouse, Stephen B. 1983. “The letter of recommendation: Specificity and favorability of information.” Personnel Psychology 36: 331–341.
Kolevzon, Michael S. 1981. “Grade Inflation in Higher Education: A Comparative Study.” Research in Higher Education 15 (3): 195–212.
Kuh, George and Hu, Shouping. 1999. “Unraveling the Complexity of the Increase in College Grades from the Mid-1980s to the Mid-1990s.” Educational Evaluation and Policy Analysis (Fall): 297–320.
Lambert, Craig. 1993. “Desperately Seeking Summa.” Harvard Magazine (May/June): 37.
Landrum, R. Eric. 1999. “Student Expectations of Grade Inflation.” Journal of Research and Development in Education 32 (2) (Winter): 124–128.
Levine, Arthur. 1997. “How the Academic Profession is Changing.” Daedalus 126 (4): 1–20.
Levine, Arthur and Cureton, Jeanette S. 1998. When Hope and Fear Collide: A Portrait of Today’s College Student. San Francisco: Jossey-Bass.
Lucas, Christopher J. 1994. American Higher Education: A History. New York: St. Martin’s Press.
Mansfield, Harvey C. 2001. “Grade inflation: It’s time to face the facts.” Chronicle of Higher Education (6 April).
McConahay, Mark and Cote, Roland. 1998. “The Expanded Grade Context Record at Indiana University.” Cause/Effect 21 (4): 47–48, 60.
McMurtie, Beth. 2001. “Colleges are Urged to Devise Better Ways to Measure Learning.” Chronicle of Higher Education (24 January).
Metzger, Walter P. 1987. “The Academic Profession in the United States” in The Academic Profession, ed. Burton R. Clark. Berkeley: University of California Press.
Mitchell, Joyce Slayton. 1996. “The college letter: College advisor as anthropologist in the field.” Journal of College Admissions 150 (Winter): 24–28.
Nagel, Brian. 1998. “A Proposal for Dealing with Grade Inflation: The Relative Performance Index.” Journal of Education for Business 74 (1) (September/October).
Perrin, Noel. 1998. “How Students at Dartmouth Came to Deserve Better Grades.” Chronicle of Higher Education (9 October).
Potter, William P. 1979. “Grade Inflation: Unmasking the Scourge of the Seventies.” College and University 55 (1) (Fall): 19–26.
Reibstein, L. 1994. “Give me an A, or give me death.” Newsweek 62 (13 June).
Rosovsky, Henry with Ameer, Inge-Lise. 1998. “A neglected topic: Professional conduct of college and university teachers” in Universities and their Leadership, ed. William G. Bowen and Harold Shapiro. Princeton: Princeton University Press.
Ryan, Michael and Martinson, David L. 2000. “Perceived effects of exaggeration in recommendation letters.” Journalism and Mass Communication Educator. Columbia. (Spring).
Schackner, B. 1997. “Inflation in grades: a 1990s fact of life.” Pittsburgh Post Gazette (24 August): 1.
Schmidt, Peter. 2000. “Colleges are starting to become involved in high-school testing policies.” Chronicle of Higher Education (21 January).
Schmidt, Peter. 2000. “Faculty outcry greets proposal of competency tests at U. of Texas.” Chronicle of Higher Education (6 October).
Schneider, Alison. 2000. “Why you can’t trust letters of recommendation.” Chronicle of Higher Education (30 June): A14–16.
Shils, Edward Albert. 1983. The Academic Ethic. Chicago: University of Chicago Press.
Spinks, Nelda and Wells, Barron. 1999. “Trends in the Employment Process: Resumes and Job Application Letters.” Career Development International.
Stone, J. E. 1996. “Inflated Grades, Inflated Enrollment, and Inflated Budgets: An Analysis and Call for Review at the State Level.” Educational Policy Analysis Archives 3 (11) (26 June).
Twitchell, James B. 1997. “Stop Me Before I Give Your Kid Another ‘A.’” The Washington Post (4 June).
Walhout, Donald. 1997. “Grading across a career.” College Teaching 45 (3) (Summer): 83–87.
Weller, L. David. 1986. “Attitude Toward Grade Inflation: A Random Survey of American Colleges of Arts and Sciences and Colleges of Education.” College & University 61 (2) (Winter): 118–127.
Williams, Wendy M. and Ceci, Stephen J. 1997. “How’m I Doing? Problems with Student Ratings of Instructors and Courses.” Change 29 (5) (September/October).
Wilson, Bradford P. 1999. “The Phenomenon of Grade Inflation in Higher Education.” National Forum (Fall): 38–41.
Wilson, Robin. 1998. “New research casts doubt on value of student evaluations of professors.” Chronicle of Higher Education (16 January).
Zander, Rosamund Stone and Zander, Benjamin. 2000. “The power of A.” Boston Globe Magazine (27 August).