Appropriate Undergraduate Grade Distributions

August 9, 2012

I had heard many references to grade inflation.  I wondered how that shaped up, in detail:  what was the likelihood of getting an A, a B, or some other grade in a college course?  This post describes some steps I took to learn more about that.

I realized that grading had long been a topic of intense interest, and that a tremendous amount had been learned and written on that subject.  My ambition here was not to achieve new knowledge, nor even to develop mastery of the existing literature.  I was looking simply for a rough sense of how grading tended to shape up at this time.

A Sample from MU

For starters, I did searches in Google and Google Scholar.  These turned up a number of readily accessible sources.  One, the University of Missouri (MU) Registrar, provided a comprehensive statement of the grade distribution in each course taught within a given semester.  I copied and massaged the data for Spring 2012.  While MU has its various schools and colleges, I decided to group these courses and their departments in ways that seemed logical to me.  (This grouping was somewhat informed by my two years at that university.)  One such grouping, Languages, was superfluous.  I had speculated that grades in language courses might be different from those in Humanities (e.g., History, English) courses.  I was wrong.  Grades in the Languages and Humanities groups were very close to each other, at least in that particular semester.  Several other groupings (i.e., for Agriculture, Miscellaneous, and Veterinary) seemed unlikely to be of much interest in most other places, so I ignored those.

This left me with ten groups.  Their average grades fell into four ranges:  B– to B (2.67 to 2.99) for Science and Social Science; B to B+ (3.00 to 3.32) for Humanities, Business, Engineering, Human & Cultural Studies, and Communication; B+ to A– (3.33 to 3.66) for Arts and Health & Medical; and A– to A (3.67 to 4.00) for Education.  This comparison seemed to indicate a few things.  One was that, for better or worse, Education at MU had mostly discarded the concept of letter grades.  There also appeared to be something of a consensus, with five of the ten groups (or seven of 12, if I included Agriculture and Veterinary) opting for an average grade in the B to B+ range – more precisely, somewhere around the median of 3.20.

In about half of the ten groups, I noticed that there were nearly as many grades of F as of D, if not more.  This was the case even in the Science group.  This grading pattern seemed incompatible with a bell-shaped curve.  I suspected that some grades of F arose from administrative situations (e.g., student fails to properly complete paperwork to drop a course).  I thought it might be useful to combine the grades of D and F into a single Subpar category.  The percentage of Subpar grades was within the range of 4% to 8% for five of ten groups (median = 6%).  The exceptions were Social Science and Science (10-11%) and Education, Health & Medical, and Communication (1-3%).

There seemed to be similar consensus at the A level.  Discarding Science (32%) and Education (89%) as outliers, most groups gave A grades to about 44% of students, in a fairly tight range from 39% (Social Science and Humanities) to 47% (Human & Cultural Studies); the exceptions were the Arts and Health & Medical groups, both at 62%.  The pattern was not quite the same at the B level, in a range from 33% (Social Science) to 39% (Communication) (consensus median = 36%); here, the only exceptions were Arts (26%), Health & Medical (28%), and Education (9%).  Finally, in the C range, most groups ranged between 12% and 17% (median = 15%).  The exceptions were Science (23%), Arts (8%), Health & Medical (7%), and Education (1%).

In summary, this review of the data for one recent semester at MU suggested that, in a generic university department at a place like MU, one might find students’ grades distributed as follows:  44% A, 36% B, 15% C, and 5% D or F.  More condensed, that meant 80% A or B and 20% C, D, or F.

Comparisons and Elaborations

The grades at MU seemed to be on the high side.  This may have been due to the apparent inclusion of graduate grades.  A webpage from Virginia Commonwealth University indicated that, in 2010, undergraduate grades there were distributed as follows:  37% A, 32% B, 19% C, 6% D, and 6% F.  Similarly, at the University of Nevada – Las Vegas (UNLV) in 2009, after adjusting for withdrawals, the overall undergraduate distribution was 40% A, 33% B, 17% C, 4% D, and 6% F.  Then again, at the University of Delaware in Fall 2010, the undergraduate distribution was 44% A, 35% B, 15% C, 4% D, and 2% F.
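As a rough check on how far apart these schools were, the distributions quoted so far can be put side by side and the lowest and highest share of each letter grade read off.  The following is only a minimal sketch in Python:  the dictionary layout and the function name grade_ranges are illustrative choices, the MU row is the generic estimate from the previous section (which may include graduate grades), and MU's D and F are left out because only their combined 5% was estimated there.

```python
# Percent of grades at each letter, as quoted in this post.
# The MU row is the generic estimate from the previous section; its D and F
# are omitted because only a combined "D or F" figure (5%) was estimated.
DISTRIBUTIONS = {
    "MU (Spring 2012, generic estimate)": {"A": 44, "B": 36, "C": 15},
    "VCU (2010)": {"A": 37, "B": 32, "C": 19, "D": 6, "F": 6},
    "UNLV (2009)": {"A": 40, "B": 33, "C": 17, "D": 4, "F": 6},
    "Delaware (Fall 2010)": {"A": 44, "B": 35, "C": 15, "D": 4, "F": 2},
}


def grade_ranges(distributions):
    """Return {letter: (lowest %, highest %)} across schools reporting that letter."""
    ranges = {}
    for letter in "ABCDF":
        values = [d[letter] for d in distributions.values() if letter in d]
        ranges[letter] = (min(values), max(values))
    return ranges


if __name__ == "__main__":
    for letter, (low, high) in grade_ranges(DISTRIBUTIONS).items():
        print(f"{letter}: {low}-{high}%")
```

On these figures the spread is fairly narrow at every letter:  no more than seven percentage points separate the lowest school from the highest.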

Grades seemed to rise through the undergraduate years.  For example, a report from the University of Wisconsin for spring 2012 pointed toward a somewhat steady rise from a freshman average of B (3.003) to a senior average of B+ (3.367).  The data from MU for spring 2012 showed a similar pattern.  While B grades held nearly constant at all course levels (from 1000 through the 4000s), ranging irregularly from 32% to 37% of all grades, A grades rose from 36% (1000-level) to 55% (4000-level) of all grades, and C grades dropped from 20% to 10%.  At the graduate level, there seemed to be something of a starting-over, with B accounting for 47% of grades at the 5000-6000 level (combining two groups because of the small numbers of courses) but only 17% of grades at the 8000-9000 level.  These patterns of rising grades could derive from multiple sources.  Presumably attrition of the least qualified students would account for some of the change.  More senior students may have adapted more completely to their professors’ preferred styles of thought, speech, and output.

I wondered if course size would also make a difference, where introductory courses would tend to be larger.  Course sections at MU (graduate and undergraduate combined) ranged in size from 10 to 499 students.  B grades held steady in a range of 31-37% for all course sizes except those of 1-15 students (25%).  A grades declined somewhat from courses having 16-30 students (50% of grades were A) to courses having 61-150 students (43% were A); but in the smallest courses (1-15 students) almost two-thirds (65%) of grades were A, while in the largest courses (151-500 students) only 33% were A.  A similar pattern in reverse held at the C level:  7% for the smallest courses, 21% for the largest, and about 13-15% for the rest.  F grades were irregular but few, in a range from 2% for the smallest courses to 4% for the largest.  Differences among frequencies of D grades were most pronounced:  1% in the smallest courses, 6% in the largest, and 3% for the rest.  These findings raised thoughts of difficult adaptation for introductory students and potential depersonalization on the part of large-section instructors.  It seemed ironic that some of the largest classes were in psychology, where many students might be much more comfortable in smaller class settings.  It was not clear how the ability to succeed in a large, impersonal setting would provide a useful gauge of students’ adaptability to their various careers.

I did not look much at grade inflation.  A brief glance at data from Georgia State University in 1920 suggested that, in that year, 68% of grades in the field of Education were A.  My impression was that there had been grade inflation in recent years, but this was not important for the question at hand — assuming a focus on what was, as distinct from what perhaps should be.  An argument for the former was that students pay the price, in terms of diminished competitiveness, when their professor uses them to advance his/her belief that grades should come down.  It did seem that students could be as competitive, when seeking an A rather than a B+, as they might previously have been when seeking an A rather than a C.

The remarks in this section of this post suggest that, overall, grades in an undergraduate course might be about 37-44% A, 32-36% B, 15-19% C, 4-6% D, and 2-6% F.  Larger and lower-level courses could break out of those ranges, with about twice as high a rate of A grades in the smallest as in the largest, and about 50% more A grades in 4000-level than in 1000-level courses.  B grades tended to hold fairly constant; the slack for these higher or lower rates of A grades seemed to be taken up largely at the C level.  Absent some normative reason to treat students more gently in small or upper-level courses, or to treat them more harshly in larger or lower-level courses, it appeared that — to adjust what was previously said about MU — a good target for a given class might be 40% A, 34% B, 17% C, and 9% D or F.  Rounding slightly, that would be 75% A or B and 25% C, D, or F.  This seemed consistent with, for example, recommendations at the University of Iowa business school that no more than 80% of grades should be A or B and at least 5% should be D or F.  This proposed distribution would have a mild deflationary effect, insofar as it would produce an overall average of about 3.00 (B), which appeared to be typical of lower-level courses at some schools, as distinct from the 3.33 (B+) average of higher-level courses (above).
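The "about 3.00" figure for the proposed target can be checked with a weighted average.  The sketch below assumes the usual four-point scale (A = 4, B = 3, C = 2).  Because the target lumps D and F together, it also assumes a value of 0.5 for that combined category, halfway between D = 1 and F = 0; that 0.5 is an assumption made here for illustration, not a figure from any of the schools, and the names TARGET, POINTS, and mean_grade are likewise illustrative.

```python
# Proposed target distribution (percent of grades), from the paragraph above.
TARGET = {"A": 40, "B": 34, "C": 17, "D/F": 9}

# Grade points on a four-point scale.  The 0.5 for the combined D/F
# category is an assumption (midway between D = 1.0 and F = 0.0).
POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D/F": 0.5}


def mean_grade(distribution, points):
    """Weighted mean grade for a {grade: percent} distribution."""
    total = sum(distribution.values())
    return sum(pct * points[grade] for grade, pct in distribution.items()) / total


if __name__ == "__main__":
    print(f"mean with D/F = 0.5: {mean_grade(TARGET, POINTS):.3f}")
    # Bracket the assumption: treat the whole D/F category as all F, then all D.
    for df_points in (0.0, 1.0):
        bounds = dict(POINTS, **{"D/F": df_points})
        print(f"mean with D/F = {df_points}: {mean_grade(TARGET, bounds):.3f}")
```

Whatever value is chosen for the combined D or F category, the mean stays close to 3.0:  it runs from roughly 2.96 if all of those grades were F to roughly 3.05 if all were D.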

The sites mentioned here did not provide much insight into plus and minus grades.  It appeared that some schools would record such grades on transcripts, and some would not.  This would not prevent an instructor from awarding such grades and citing them in letters of recommendation.  Given a likely mean in the range of B or B+ (above), the values just listed could justify a distribution of around 5% A+, 15% A, 20% A–, 16% B+, 10% B, 8% B–, 8% C+, 6% C, 3% C–, and 9% D or F – yielding, again, an overall mean course grade of about 3.0, with a median around 3.4 or 3.5.  This distribution would assume an attempt at a curvilinear distribution, as distinct from a pigeonholing approach in which an instructor might give most students simple letter grades (especially A, B, or C), awarding plus or minus grades in a minority of cases, as apparent adjustments for slightly superior or inferior performance (e.g., Shlafer, n.d., p. 24).
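A quick way to test a plus/minus breakdown like this one is to confirm that it adds back up to the letter-level target (40% A, 34% B, 17% C, 9% D or F) and to recompute the mean.  The sketch below uses the conventional grade points implied by the ranges quoted earlier (A– = 3.67, B+ = 3.33, and so on).  Two values are assumptions made here for illustration:  A+ is capped at 4.0, as it is at many schools, and the combined D or F category is counted as 0.5.  All names in the code are illustrative.

```python
# Proposed plus/minus distribution (percent of grades), from the paragraph above.
PLUS_MINUS = {
    "A+": 5, "A": 15, "A-": 20,
    "B+": 16, "B": 10, "B-": 8,
    "C+": 8, "C": 6, "C-": 3,
    "D/F": 9,
}

# Grade points.  Assumptions: A+ capped at 4.0; combined D/F counted as 0.5.
GRADE_POINTS = {
    "A+": 4.0, "A": 4.0, "A-": 3.67,
    "B+": 3.33, "B": 3.0, "B-": 2.67,
    "C+": 2.33, "C": 2.0, "C-": 1.67,
    "D/F": 0.5,
}


def letter_totals(distribution):
    """Collapse plus/minus grades into whole-letter totals."""
    totals = {}
    for grade, pct in distribution.items():
        letter = grade if grade == "D/F" else grade[0]
        totals[letter] = totals.get(letter, 0) + pct
    return totals


def mean_grade(distribution, points):
    """Weighted mean grade for a {grade: percent} distribution."""
    total = sum(distribution.values())
    return sum(pct * points[grade] for grade, pct in distribution.items()) / total


if __name__ == "__main__":
    print(letter_totals(PLUS_MINUS))
    print(f"{mean_grade(PLUS_MINUS, GRADE_POINTS):.2f}")
```

Under those assumptions the breakdown collapses exactly to the 40-34-17-9 target, and the mean works out to about 2.98.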

Departments

Graphs by department suggested that, at UNLV, three grading patterns predominated.  One was a bell-shaped pattern in which there were about as many A as C grades, but somewhat more B grades than either A or C.  This was the nature of Business and Science grades.  In Business, for example, the division was 23% A, 36% B, 23% C, with the rest going to D, F, and W.  In the second pattern, there were about as many B grades as A, but only one-half to one-quarter as many C grades.  Among major departments, this was the pattern for Allied Health Sciences, Liberal Arts, Nursing, and Urban Affairs.  For example, Nursing grades were 38% A, 40% B, but only 10% C.  In the third pattern, A was dominant, with perhaps half as many B as A grades, and possibly half as many C as B grades.  This was the pattern for Community Health Sciences, Education, Engineering, and Fine Arts.  In Education, for example, the distribution was 68% A, 18% B, and 5% C.

As I was grouping departments at MU (above), I was aware that some departments within those larger groups might vary substantially from the overall group impressions presented above.  Perhaps the most unfamiliar, among the groups I identified, was what I called the Human & Cultural Studies group.  One particular area of interest to me, within that group, was the field of social work.  I was interested in how that predominantly female discipline might compare, gradewise, against nursing, from the similarly broad Health & Medical group.  I was surprised.  My impression was that nursing was considerably more difficult to get into, and that its grading pattern might thus resemble the patterns found in the sciences.  Instead, the two were pretty close.  In nursing, 63% of grades were A, 33% B, 3% C, and 0.4% D or F.  In social work, 67% of grades were A, 27% were B, 4% were C, and 2% were D or F.  That nursing pattern at MU diverged markedly from that found at UNLV (above).  It also diverged from the pattern at the University of Delaware, where the distribution was 45% A, 40% B, 13% C, and 2% D or F.

It appeared, then, that departmental custom could dictate much of what would be appropriate for the distribution of grades within an undergraduate course.  This would be true, not only in this nursing example, but even in a field like engineering, where the 42-27-12% (A-B-C) pattern at UNLV varied from the 36-39-19% pattern at Delaware.  Of course, such numbers could fluctuate from one semester to another, and might be more variable in smaller departments offering fewer courses.  Nonetheless, it seemed that one’s sense of an appropriate grade distribution might be informed by customs and guidelines within the setting of a particular school.  The effort here has been to reach a sense of what might be typical or atypical within such customs and guidelines.  Absent clear guidance to the contrary, it appeared that the distribution suggested at the end of the preceding section in this post would constitute a reasonably fair and typical allocation of grades in an undergraduate course.
