What Is “Statistics”?

July 1, 2013

The question here is not, “What are statistics?”  Statistics are numbers.  The question is also not, “What is the field of statistics?”  That is not hard to answer:  according to Wikipedia, statistics is the study of the collection, organization, analysis, interpretation and presentation of data.  The question is, “What is the requirement that people take one or more semesters of a course called ‘statistics’?”

Parts of that question can be addressed easily.  “Statistics” – presenting that introductory course in quotation marks, in this post – is required for some high school students, many college students, and most graduate students.  As a required course, it has to appear on students’ transcripts, with a passing grade, before they can receive a degree or other credential.

So let us rephrase the question.  Why are students required to take “statistics”?  A part of this question has to do with why students are required to take anything, where the requirement may be imposed by people who have no understanding of what the student can do or intends to do – who, indeed, may not even understand much of the world into which students are graduating.  There is also the question of whether “statistics” should be among the things that students of a particular type (e.g., undergraduates) are required to take.  But answers to these questions may depend upon what “statistics” is.  There is probably a kind of “statistics” course that can be very helpful to, say, an English major, and there is probably another kind that is not.

The Rationales of Kuzma & Bohnenblust

So we come to the title question:  what is “statistics”?  What is it that students are being required to study, and why are they being required to study it?  What important content are we forcing people to study in “statistics”?

One set of answers to that question comes from the textbook by Kuzma and Bohnenblust (K & B, 2005).  This is the text that I had to use when I taught a course in “statistics” for pre-nursing students in fall 2012.  (As a student myself, I have taken and retaken introductory and intermediate “statistics” courses multiple times during my encounters with graduate education, starting in 1980.)

As K & B (p. 6) acknowledge, “Those of you using this text are probably taking your first statistics class, and for many of you, it may be your only statistics class.”  So the text we had to use, in the course I was teaching, was designed for people who, the authors admitted, might be taking “statistics” as a one-semester required course.  Actually, “might” is not the right word:  the overwhelming majority of “statistics” students are not there by choice.

For these unwilling students, according to K & B, one alleged reason to take “statistics” was that “Statistics are used to analyze the data” on whether “a new method, drug, device or intervention” is “worthy of being incorporated into your professional lives” (p. 6).  In other words, these pre-nursing students were going to use this one-semester “statistics” course to evaluate the effectiveness of the interventions they would be using with patients.

That would tend to be a ridiculous notion.  The Food and Drug Administration works with pharmaceutical companies to test drugs and submit evidence that they are safe and effective.  These processes can take years and cost many millions of dollars.  No undergraduate is going to march forth and conduct competent clinical trials on the effectiveness of a drug or anything else.  Even people with master’s degrees are unlikely to do much investigative reading into the efficacy of interventions (assuming they had access to expensive academic journal articles on such matters, which most working professionals do not); and even if their diligent investigations did uncover interventions that their employers had failed to recognize, they would still face questions such as whether their malpractice insurance covered them for experimentation, and how their boss would feel about them deciding to take off on some pharmaceutical tangent.

K & B do seem to recognize these truths.  In the following paragraph, they remind us that investigations into the efficacy of a drug “require researchers to gather data, statistically analyze the data, and then interpret the results within the context of their profession” (p. 6).  So what they were just saying about using statistics to test the effectiveness of interventions was, they admit, relevant to professional researchers, not to the first-semester undergraduate “statistics” students who would actually be using their book.

Therefore, K & B try again with another rationale:  “Some proficiency in statistics is also helpful for individuals who are preparing or may be called upon to evaluate research proposals” (p. 7).  Would that be relevant to pre-nursing undergraduates?  Nay.  Well, would you believe, “People with an understanding of statistics are better able to decide whether their professional colleagues use statistics to objectively view the issue or merely to support their personal biases” (p. 7)?  It’s not clear, but apparently K & B mean to admit that “statistics” can be used to support falsehood as well as truth.  Would this not imply that a little knowledge can be a dangerous thing – that a bit of “statistics” training could actually augment a screwy person’s bullshit reserves?

K & B, not knowing when to quit, waste another arrow on the suggestion that “A knowledge of statistics can help anyone discriminate between fact and fiction in everyday life” (p. 7).  They don’t seem to mean that “statistics” training of the kind provided in their text would help anyone decide whether someone is lying; their point is, rather, that a course in “statistics” will help you interpret “health claims in newspapers and magazines . . . and even in ads for pharmacies.”  The proposition seems to be that someone who has spent a semester working through K & B’s textbook will be equipped to evaluate a Walgreens ad.  If I had told my students that this was the purpose of that semester-long course, they really might have thrown things at me.  Fortunately, my students were not remotely interested in reading these introductory words in the K & B text, for which I could hardly blame them.

Consulting the Experts

The quiver empties without a single kill when K & B propose, finally, that “a course in statistics should help you know when, and for what purpose, a statistician should be consulted” (p. 7).  That one is my favorite.  Without their book, you would not be capable of realizing that you did not know how to use “statistics” in a particular situation, but fortune has smiled upon you:  you have used their book, and are now able to realize that you don’t know what you are doing.  But you really did not need to take a “statistics” course for that.  All you needed was to hang around a social science PhD program for a while.  There, you would see tired-looking doctoral students who have taken several advanced statistics courses and are now writing dissertations that contain statistical analysis; you would observe them asking statistics professors to tell them what they need to do, in order to get past that part of the ordeal.

So, yes, “statistics” is at least putatively useful for the small fraction of an undergraduate class that will go on to PhD programs and, despite PhD program attrition rates of up to and sometimes beyond 50%, will persevere to the point of the dissertation, where they will use their doctoral statistical training to help them engage in limited statistics-related conversation with a statistician – and, as I know from a year’s personal experience, may then run off to ask the tutors in the statistics lab to explain what the statistics professor said.  I should emphasize:  these PhD students did not seem to know or care much about other procedures; they just wanted to know how to do the one procedure that their doctoral advisor had recommended for their own study.  For some, I was able to help; for those attempting more advanced procedures, I was generally not – though sometimes, amusingly enough, I was able to help someone work through a procedure by just Googling the techniques they were attempting to use, and pointing them to some random website’s recommended steps.

Even at the doctoral level, I am saying, it is a mistake to treat statistical knowledge as though it were something that you pour into the cranial container and then it stays there.  Taking a bunch of courses does not necessarily prove much.  The more important question is whether the courses align with your occupation (i.e., with whatever it is that you are doing with your time), so that your knowledge stays fresh and is added to rather than withering.  I have seen PhD students – even one or two at advanced stages of doctoral programs in statistics – who have become confused and have been unable to assist, when asked for help with ordinary homework problems by students in introductory or intermediate “statistics” courses.

It is not surprising, then, what happens when PhD students complete their dissertations in their various fields.  They go off to be professors; they are assumed to have some concept of statistics; on this basis, they will publish articles reporting statistical findings in academic journals; and they will thus proceed to join their professorial colleagues in situations like that described by Lang and Altman (2013, p. 1, citations omitted):

The first major study of the quality of statistical reporting in the biomedical literature was published in 1966.  Since then, dozens of similar studies have been published, every one of which has found that large proportions of articles contain errors in the application, analysis, interpretation, or reporting of statistics or in the design of research.  Further, large proportions of these errors are serious enough to call the authors’ conclusions into question.  The problem is made worse by the fact that most of these studies are of the world’s leading peer-reviewed general medical and specialty journals.  . . . Paradoxically, many errors are in basic, not advanced, statistical methods.

In other words, all that anxiety while writing the dissertation was probably not crazy; it probably reflected PhD students’ self-appraisal that they really were not too sure of what statistics was about, or how to use it.  Even at this level, it was not always clear that much value had been added by those semesters of hard work.  Like introductory “statistics,” the higher-level courses seemed primarily to have been designed to prepare a student for even more advanced courses that s/he might take within the next semester or two.

Despite all that training, a dearth of statistical expertise crops up at the professorial level in a variety of fields.  For instance, Gregoris and Shorvon (2013, p. 1) examined 300 research papers on the subject of clinical epilepsy, over the previous several decades, and concluded that 71% had no enduring value – and that in nearly a quarter of those papers, the lack of value was due to significant methodological flaws.  Prinz, Schlange, and Asadullah (2011) report a rule of thumb, from the world of drug manufacturing, by which “at least 50% of published studies, even those in top-tier academic journals, can’t be repeated with the same conclusions by an industrial lab” – that, in other words, nobody else seemed to get the same outcomes when they stepped back through the research process described by the authors.  Such flaws can affect even the largest and most prestigious studies, as in the Women’s Health Initiative study of more than 160,000 women.  There has been criticism of what may be an overstated claim by Stanford Medical School Professor John Ioannidis (2005) – that “Most research findings are false for most research designs in most fields” – but Siegfried (2010, p. 26) does seem to be expressing the general view when he says,

Statistical tests are supposed to guide scientists in judging whether an experimental result reflects some real effect or is merely a random fluke, but the standard methods mix mutually inconsistent philosophies and offer no meaningful basis for making such decisions.  Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted.  As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing. . . . Experts in the math of probability and statistics are well aware of these problems and have for decades expressed concern about them in major journals. . . . [A]ny single scientific study alone is quite likely to be incorrect, thanks largely to the fact that the standard statistical system for drawing conclusions is, in essence, illogical.

Lang (2003, p. 67) had previously pointed out “common statistical errors . . . that can be identified even by those who know little about statistics.”  Given Lang’s observation ten years later (quoted above), it seems that “statistics” professors might have been well advised, during those years, to focus less on the mathematics of statistical calculations and more on common-sense tips that students were most likely to understand, remember, and use.

Other Rationales

K & B (above) touch upon most of the common excuses for making undergraduates study “statistics.”  But there are a few others worth mentioning.  Consider these quotes, taken on June 28, 2013.  First, from the website of Boston University:

Knowledge in statistics provides you with the necessary tools and conceptual foundations in quantitative reasoning to extract information intelligently from this sea of data.

and now from a webpage at San Jose State University:

Knowledge in statistics provides you with the necessary tools and conceptual foundations in quantitative reasoning to extract information intelligently from this sea of data.

Since this is not a post on plagiarism, we should thank the authors of those redundant webpages for reiterating that, as numerous others have pointed out, a knowledge of “statistics” (or of Google) can help to identify cases in which data are misrepresented – although, as already established, it may take quite a bit of statistical knowledge to be sure that one’s conclusions are correct, and to defend those conclusions if challenged.

But when the San Jose people prattle on about how “statistics” is valuable for “Anyone who has problems to solve . . . [such as] finding ways to make a business more profitable,” it may be worthwhile to think about what goes into making a business profitable.  Statistics?  Yes, and a winning smile – but especially knowledge and experience in what the business is actually about.  Let us consider, indeed, what one might find in a business-oriented “statistics” textbook.  Hoerl and Snee (2012, pp. 18-20) provide an illustration.  In their remarks on the value of studying “statistics,” they offer the examples of a restaurant manager who wants to analyze customer service data to understand why it is taking so long to serve some customers, and of the factory quality control engineer who wants to calculate and reduce the number of defective outputs.  These seem to be matters that anyone who knows high school algebra can address, without a college or graduate course in “statistics.”  There can also be a lot to know about non-statistical data collection and application in a business context, and the Hoerl & Snee text concentrates on those sorts of matters, leaving only two chapters at the end for a small selection from the mathematical material that fills a typical “statistics” course (e.g., hypothesis tests, sampling distributions).  Hoerl and Snee, it seems, have a rather different concept of what “statistics” is than do K & B.  And so do the people at that San Jose State website:

Investigative questioning, designing ways to collect data to answer those questions, collecting data, and making sense of what that data says to produce reliable answers – this is the subject matter of statistics.

Like most “statistics” professors and textbooks, K & B focus almost exclusively on just the last part of the quote – indeed, a subpart within that last part – that involves just a few ways of “making sense of what that data says.”  Similarly, the Coursera online course in introductory “statistics” focuses largely on mathematical topics.

Hoerl and Snee and the people at San Jose State are trying to include, within “statistics,” material from the broader field of research methods.  Research methods include more than the interpretation of statistical data.  For example, you can conduct a phone survey and get an average approval rating for the U.S. president as of a certain day.  But that is just scratching the surface of how people really feel.  Research methodologists will examine what question was asked, who asked it, how they asked it, what time of day they asked it, who they asked, and so forth.  It often develops that you get one answer when you arrange your questionnaire a certain way, and an entirely different answer when you go beyond the questionnaire to let the interviewee explain his/her views in more open-ended fashion.  The field of research methods includes not only numbers but a welter of qualitative research methods – focus groups, textual analysis, ethnography, and so forth.

This broader brush – incorporating statistics, as Hoerl and Snee do, within a larger context – often permits instructors to devise projects where students can get firsthand experience with useful skills.  It seems wise to offer that kind of training – to make sure that the mathematical content is provided within a meaningful context – given Kotz’s (2010, p. 16) observation:

[O]ur introductory statistics class may well be the ONLY quantitative analysis course these students EVER take given current degree requirements both here and at most of the colleges to which these students may transfer. . . . [A]s an introductory statistics teacher, if I do not take the time and effort to make sure that the material is presented well, that the learning is facilitated well, and that the assessment is genuine and meaningful, there is a decent chance that I have truly shortchanged and compromised my students (and thus my community) in terms of the critical thinking, decision making, and general reasoning needs of their academic, professional, or even personal lives. And what of the subsequent impact to those students’ clients, patients, customers, families, etc.?

Kotz (p. 17) elaborates on the difference between good statistics education and purely mathematical education by observing that, in many cases,

a far more substantial writing component needs to be in place for a student’s introductory statistics course as compared to that student’s previous math courses. Again, to be blunt, if I do not take care of these things when I have the students in front of me, there is a chance that they will not be exposed to this material, this way of analysis, this way of writing, etc. in their undergraduate experience.

If this is indeed the only quantitative analysis course students take, it can be seriously mistaken to devote it to mathematical statistics.  Statistics is like mustard:  a little bit, great; too much, yecch.  It was not clear, in the undergraduate introductory “statistics” course I taught, why we should focus on relatively difficult statistical techniques when most students were not even confident of how to read graphs or calculate percentages.  Indeed, someone told me that some high schools did not even offer courses in algebra.  To the extent that this one precious course does focus on math, it might better be oriented toward basic numeracy, also known as statistical or quantitative literacy or reasoning (e.g., Rumsey, 2002; Wilder, 2012; Hassad, 2009; Moore, 2005; Makar & Rubin, 2009).  With needs like these, it is not surprising that even those students who have experienced a relatively well-taught mathematical “statistics” course might doubt that their time was well spent (e.g., Hagen et al., 2012; AllNurses.com Forum, 2011).

As noted above, “statistics” courses emphasize math because they tend to be designed for people who plan to continue to more advanced courses in mathematical statistics.  The San Jose webpage lists a variety of careers in which advanced statistical training can be profitable (e.g., financial analyst, actuary, epidemiologist, demographer), but few if any in which taking one “statistics” course will make much of a difference.  A McGraw-Hill webpage for another highly technical “statistics” textbook inadvertently makes the point:

There are at least three reasons for studying statistics:  (1) data are everywhere, (2) statistical techniques are used to make many decisions that affect our lives, and (3) no matter what your career, you will make professional decisions that involve data.  An understanding of statistical methods will help you make these decisions more effectively.

Data are indeed everywhere.  But the interpretation of the data that people encounter on a day-to-day basis is not a primary topic in “statistics” courses.  It is also true that statistical techniques are used to make many decisions that affect our lives.  But almost nobody works in, or even knows how to use, the kinds of techniques upon which corporations and research institutions base such decisions.  My knowledge or ignorance of “statistics” had no effect on whether Honda and Intel used advanced statistical analysis to design my car or computer.  As for the notion that there is some link between my professional decisions and the contents of a “statistics” course, allow me to mention that I can count on one hand the number of professors with whom I have taken courses or had other significant interactions, in ten years of graduate study, who went beyond the most elementary statistical concepts in their teaching and in discussions with students (see Golinski & Cribbie, 2009; Mills, Abdulla & Cribbie, 2010).  What is taught in those courses is just not very relevant to most purposes in most careers.

In short, students would probably find it very useful to receive training in assorted areas cited by various authors – in how to collect data, for example, and how to interpret others’ presentations of statistical information.  Yet these sorts of topics receive little to no attention in the mathematical orientation of typical “statistics” courses and textbooks.  The teaching does not correspond to the recognized needs; it is primarily geared to and useful as a first step toward more advanced statistics courses that most students will not take.

Teaching Obsolete and Controversial Math

It is bad enough that those who teach “statistics” courses and write “statistics” textbooks tend to emphasize mathematical statistics rather than other kinds of numerical and non-numerical training that students might actually be able to understand, use, and appreciate.  It is even worse that the particular mathematical techniques taught in such courses and textbooks have been outdated for many years.  Consider this quote from Wilcox (2002):

To put it simply, all of the hypothesis testing methods taught in a typical introductory statistics course, and routinely used by applied researchers, are obsolete; there are no exceptions. Hundreds of journal articles and several books point this out, and no published paper has given a counter argument as to why we should continue to be satisfied with standard statistical techniques. These standard methods include Student’s T for means, Student’s T for making inferences about Pearson’s correlation, and the ANOVA F, among others.

(See also, e.g., Rouder et al., 2012; Metfessel & Greene, 2012; Skidmore & Thompson, 2010, p. 790.)  As “statistics” students may recall, ANOVA and the t test are standard topics within “statistics” courses and textbooks.  So is null hypothesis significance testing (NHST), which Matthews (1998, p. 2) characterized as a “fatally flawed” technique that, as of his writing, “experts have been warning about . . . for more than 30 years.”  According to Wilcox and also to Erceg-Hurn and Mirosevich (2008), the t and ANOVA procedures fail because the mathematical assumptions on which they depend (e.g., that the data are normally distributed) are often not applicable in real-life situations.  Matthews and Thompson (2002) explain that NHST, with its p values and its typical 0.05 alpha level, relies on an arbitrary concept of significance that is better replaced with confidence intervals and appropriately calculated effect sizes (see Erceg-Hurn & Mirosevich, p. 599).
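
To make that last suggestion concrete, here is a minimal sketch in Python (standard library only) of what reporting an effect size and an interval, rather than a bare p value, might look like.  Everything in it is an assumption made for illustration:  the two small samples are invented, the function names cohens_d and bootstrap_ci are hypothetical, and the percentile bootstrap is just one of several interval methods, not necessarily the one that any of the cited authors would recommend.

```python
import random
import statistics

# Invented example data (assumption): outcome scores for two small groups.
treatment = [12.1, 14.3, 13.8, 15.2, 12.9, 14.7, 13.5, 15.9]
control = [11.2, 12.5, 11.9, 13.1, 12.0, 12.8, 11.5, 13.4]


def cohens_d(a, b):
    """Standardized mean difference, using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_var ** 0.5


def bootstrap_ci(a, b, n_resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap interval for mean(a) - mean(b).

    Each resample draws, with replacement, a new sample of the same size
    from each group and records the difference between the two means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    diffs = sorted(
        statistics.fmean(rng.choices(a, k=len(a)))
        - statistics.fmean(rng.choices(b, k=len(b)))
        for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    lower = diffs[int(tail * n_resamples)]
    upper = diffs[int((1 - tail) * n_resamples) - 1]
    return lower, upper


observed_diff = statistics.fmean(treatment) - statistics.fmean(control)
effect_size = cohens_d(treatment, control)
ci_low, ci_high = bootstrap_ci(treatment, control)

print(f"difference in means: {observed_diff:.2f}")
print(f"Cohen's d:           {effect_size:.2f}")
print(f"95% bootstrap CI:    ({ci_low:.2f}, {ci_high:.2f})")
```

The interval is built by resampling the observed data, so it does not lean on the assumption that the data are normally distributed.  With samples this small, it is still a rough guide rather than a precise answer.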

Why do “statistics” professors keep teaching, and writing textbooks about, these outmoded techniques?  Wilcox, Erceg-Hurn and Mirosevich, and Rouder et al. identify several explanations, including unfamiliarity with or difficulty of alternatives; actual or perceived unavailability of modern techniques in statistical programs like SPSS and SAS; insufficient faculty comprehension of the assumptions and limitations of traditional methods; inertia in publishers’ continued republication of familiar methods; failure by “statistics” teachers to keep up with current developments in the field; and mutual incomprehension on the part of theoretical and applied statisticians.

That seems like a pretty intimidating list of barriers.  In a statistical counterpart to the story about the man who was looking for his lost car keys under the lamppost just because the light was so much better there, the “statistics” situation seems to be that we teach outmoded techniques, and neglect thorough instruction in the essential assumptions underlying those techniques, because that is so much easier than doing it right.  This is like teaching Chinese language and culture in 2013 using instructional materials copyrighted in 1908 because one can download those materials for free – never mind that so much has changed in China meanwhile.  One reason for the quotation marks, then, is that we seem to be teaching “statistics” like those people would be teaching “Chinese,” in some sense of the term not necessarily linked with contemporary reality.

There is another thing to consider in connection with the mathematical orientation found in a typical “statistics” course.  Math enjoys a reputation of solidity and durability.  And yet math actually depends upon compromises and assumptions at every stage.  For instance, algebra includes “irrational” numbers (e.g., pi, π) and “undefined” results (e.g., division by zero) and nonexistent or “imaginary” numbers (e.g., the square root of –1).  Beyond that, we have books like Kline’s well-received Mathematics: The Loss of Certainty, about which Mark Poyser made two interesting remarks.  First, “When extremely intelligent [mathematicians disagree about fundamental principles] . . . you know something has gone wrong.”  Second, “At the end of the book you will have an appreciation for . . . the fragility of [math’s] claim to represent Truth.  Anyone who has majored in mathematics at college and mastered it – though with a nagging feeling that they were only manipulating symbols on paper – will enjoy Kline’s work.”

“Statistics” depends upon some of those mathematical tangles, and introduces others as well.  And let it be clear:  the tangles are not merely a matter of expert disputations on advanced topics.  To the contrary, disagreements among statisticians extend even to basic matters covered in introductory “statistics” courses.  For example:

  • Blume-Kohout (2010, p. 4) says, “Statisticians disagree about how to interpret even a simple statement about a coin flip.”
  • It has been demonstrated that the same data, analyzed statistically according to guidance from different experts, can result in vastly different outcomes (e.g., Mingers, 2003).
  • Streiner and Norman (2011, p. 17) point out that statisticians are in strong disagreement regarding the need for the Bonferroni-type corrections that supposedly require students to use ANOVA rather than t tests in certain situations.
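
The arithmetic behind the Bonferroni argument can be sketched in a few lines of Python.  This is only an illustration under a simplifying assumption, namely that the tests are independent, which pairwise comparisons among the same groups are not; the function names are hypothetical.  It shows why running many tests at an alpha of 0.05 raises the chance of at least one false positive, and how dividing alpha by the number of tests pulls that chance back down.  It does not settle whether the correction is needed, which is the point in dispute.

```python
from math import comb


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent tests,
    when every null hypothesis is actually true."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha after a Bonferroni adjustment."""
    return alpha / n_tests


alpha = 0.05
n_groups = 4                      # assumption: four groups being compared
n_tests = comb(n_groups, 2)       # every pair of groups gets its own t test

uncorrected = familywise_error_rate(alpha, n_tests)
adjusted_alpha = bonferroni_alpha(alpha, n_tests)
corrected = familywise_error_rate(adjusted_alpha, n_tests)

print(f"pairwise tests among {n_groups} groups: {n_tests}")
print(f"familywise error rate at alpha = {alpha}: {uncorrected:.3f}")
print(f"Bonferroni per-test alpha: {adjusted_alpha:.4f}")
print(f"familywise error rate after adjustment: {corrected:.3f}")
```

With four groups there are six pairwise comparisons, and under the independence assumption the chance of at least one spurious “significant” result is about 26 percent rather than 5 percent.  The adjustment brings it back under 5 percent, at the cost of making each individual test harder to pass.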

That last example, regarding Bonferroni adjustments, introduces another complication in the simplistic assumption that math is pure and straightforward.  The complication is this:  statisticians are human.  They may prefer to advance the impression that theirs is a disinterested kind of knowledge, existing not to express opinions but rather to cut through to the hard facts presented by appropriately collected numerical data.  But exposure to statisticians and their writings offers the occasional reminder that scientists do not possess superhuman objectivity (e.g., Carrier, 2012; Kuhn, 1973).  Sometimes one can get a glimpse of passions even within the dry pages of statistical publications.  Consider, for example, the wording of Glass and Hopkins (1996, p. 268), when they say “it is a statistical sin to not play fair” regarding Bonferroni adjustments (above).  Words like “sin” and “play fair” hardly seem mathematical.  Apparently the authors could not restrain themselves:  in “statistics,” it seems, if you don’t do it their way – if, instead, you take the approach advocated by Perneger (1998) – then, according to Glass and Hopkins, you are a sinner and a cheat.

My own interactions with “statistics” professors have reinforced this awareness of the statistician’s capability for preferring nonscientific and/or nonobjective views.  Consider, for example, the conversation in which my supervisor told me that global warming isn’t really happening, that temperature readings are higher just because the places where they take the temperatures have become more urbanized, which tends to make local temperatures warmer.  He did admit that the Arctic was melting, so that first argument didn’t seem relevant; but the warming, he felt, was due to solar flares, not greenhouse gases.  He wasn’t a climate scientist.  Not that you need a PhD in climate science to have an opinion on it.  Someday it could turn out that he was at least partly right.  But in that conversation, he seemed to be offering excuses in support of a predetermined outcome.  I was not entirely hostile to his creationist beliefs, either, but my own years in that viewpoint had exposed me to enough of this biased way of thinking where you just grab onto any screwball argument that seems like it might support your views, and discount everything that doesn’t.  Being a statistician did not seem to have instilled a strongly neutral ethic in him, nor did he seem to have a commitment to the evidence even if it cast doubt upon his preferred beliefs.  The point, in any event, is that intensively mathematical training may leave someone – even a statistics professor – poorly prepared to evaluate claims effectively, using data collected in a defensible manner from persuasive sources.

In addition to all that, there are problems with the standard teaching approach, which does not seem to have changed much in the 33 years that have elapsed since my first introductory “statistics” class.  You basically have a mathematician with an armload of formulas, and you have to memorize what those formulas say and how you are supposed to interpret and apply them.  Indeed, the situation is worse in “statistics” than it was in high school algebra.  In algebra, it can all fit together, to an extent that is just not possible for the mishmash found in most introductory “statistics” courses.  The problem includes but goes beyond the lack of a unifying theory desired by Hoerl and Snee (2010, p. 12).  From the student’s perspective, there are unexplained concepts (e.g., degrees of freedom), often represented by seemingly random symbols (e.g., ν – and no, that’s not the letter V).  Nobody explains why the denominator is n in some calculations, but n – 1 in others.  It just has to be memorized, which almost guarantees it will be forgotten.  The specifics of what must be memorized will vary somewhat from one “statistics” course to another, but the overall story remains the same.
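
For what it is worth, the n versus n – 1 puzzle is the kind of thing a short demonstration can make less mysterious than a memorized rule.  The following Python sketch is offered only as an illustration.  The data values and the die-rolling simulation are invented for the purpose, though pvariance and variance are real functions in Python’s standard statistics module.  Dividing by n describes the spread of a complete population; dividing by n – 1 is used when the numbers are a sample and the goal is to estimate the variance of the population they came from.

```python
import random
import statistics

# Invented data (assumption): eight values whose mean is exactly 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(data)    # sum of squared deviations / n
sample_var = statistics.variance(data)  # sum of squared deviations / (n - 1)
print(f"divide by n:     {pop_var}")
print(f"divide by n - 1: {sample_var:.4f}")

# Simulation: roll a fair six-sided die five times, estimate the variance
# both ways, and repeat. The true variance of one roll is 35/12.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
true_var = 35 / 12
n_trials = 10_000
sum_n = 0.0
sum_n_minus_1 = 0.0
for _ in range(n_trials):
    rolls = [rng.randint(1, 6) for _ in range(5)]
    sum_n += statistics.pvariance(rolls)
    sum_n_minus_1 += statistics.variance(rolls)

avg_n = sum_n / n_trials
avg_n_minus_1 = sum_n_minus_1 / n_trials
print(f"true variance of a die roll:     {true_var:.3f}")
print(f"average estimate, divide by n:   {avg_n:.3f}")
print(f"average estimate, divide by n-1: {avg_n_minus_1:.3f}")
```

For the eight values, the two formulas give 4 and about 4.57.  In the simulation, the divide-by-n estimates come out too low on average, while the divide-by-(n – 1) estimates land close to the true 35/12.  A sample tends to cluster around its own mean more tightly than around the population mean, and the smaller denominator compensates for that.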

Prior exposure to subjects like algebra also tends to teach that you can work very hard to learn math concepts that you will never use again.  It would certainly be handy, throughout life, if we all remembered how to solve various problems that we studied in school, and were confident that we were solving them correctly.  But we don’t.  We just don’t use those skills often enough.  A course in “statistics” can remind the student of such experiences – can prime us, that is, for another pointless exercise in math education.  You would have to be a little crazy not to want to get away from this sort of thing at the first opportunity.  That is one travesty of today’s “statistics” education – that for so many students, it reinforces the impression of math as something bad.

Those with prior exposure to algebra will also remember that mathematical courses are different from most other kinds of courses.  Math requires precision.  It constantly emphasizes how stupid you are.  In political science or German, you learn various facts, you grasp assorted principles, and you stitch them together to make a halfway credible presentation.  You tend to be encouraged to keep adding to what you have learned.  But in math, if you don’t get it right, it’s wrong.  The situation can be compounded by well-meaning instructors who give partial credit:  it makes you feel better about not getting exactly the right answer, but inadvertently sends a signal that you aren’t really expected to get it right – even though this is, most likely, the only “statistics” course you will ever take.  We are essentially confirming that we don’t think you will leave this course with valuable skills and the ability to solve problems in real life.  It’s more like a dry run, a rehearsal just in case you ever do get around to the point of developing actual competence in statistical calculations.

Which is, itself, a nonsensical pursuit.  Learning how to do statistical calculations by hand (as many “statistics” courses require), in order to work with data in real life, is like learning how to ride a donkey as preparation to drive a car.  After we have run you through the mill in manually calculating t tests and regressions and ANOVAs, we lift the curtain and – what’s this? – behold!  A computer!  OMG.  You mean they have those now?  And not just a computer, but statistical software.  These include SAS, which you will never see again, once you leave the university, but which you may have to screw with for a semester, and R, in case you were looking for a new computer programming challenge, and SPSS, which you are free to tinker with for only a hundred bucks a year and which has virtually no usefulness that would persuade most college graduates to keep a copy; but also Excel, which (even without statistical add-ins) is still a mile ahead of punching numbers into a calculator, and LibreOffice and OpenOffice, in case you don’t already have Excel and want something like it for free.  Whatever the software, the question is the same:  if the computer could do all that for us, why did we spend so much time doing it manually?  Why weren’t we spending that time to become familiar with a relatively user-friendly tool like Excel or some other spreadsheet, which we could apply to all sorts of tasks throughout life and might thus remember and not be afraid of?

How to Make Your Subject Completely Hateful

So far, we seem to be pretty well on track to construct “statistics” as a bizarre hodgepodge that students will not understand, will not be able to use, and will resent as a pointless imposition.  But we may need to add a few more ingredients to ensure that this course requirement is thoroughly despised.

One refinement, to that end, would be to ensure that the content of such a course would be insulated from student feedback.  And so we do that.  It is as if every nation must compel its young to endure interminable political indoctrination, and in our culture the reigning ideology happens to have more to do with “statistics” than with, say, Mao or Islam.  Just as in other kinds of indoctrination, many eventually come to hate “statistics,” and cheer the day when it ceases to control their lives.  As Garfield and Ben-Zvi (2007, pp. 379-380) caution, “[T]here is little [improvement] in attitudes from beginning to end of a first course in ‘statistics’ (and sometimes negative changes).”  I would emphasize that parenthetical.  My experiences with classmates, and with others whom I have assisted, make clear that even bright, hardworking, mature graduate students can sincerely resent the obligation to take a course that they may correctly view as largely irrelevant to them – and their resentment can deepen as they find that the course is inordinately time-consuming and ultimately a matter of sheer memorization to boot.

It can be difficult for math-oriented faculty to understand the reactions of people who do not share their fascination with numbers.  Professors who have long since learned by heart every aspect of the material they teach in their “statistics” courses may forget what it is like for a newcomer – often, a newcomer motivated solely by extrinsic, career-related objectives – to try to find his/her way through the statistical thickets.  The professor who does not lack intrinsic math interest could perhaps gain a sense of the student’s perspective by engaging in an assignment to rearrange cups on a table, and to keep doing so for hours on end, week after week.

Moreover, poor teaching can require even a motivated student to spend inordinate amounts of time on a single homework assignment, just figuring out the unstated assumptions and otherwise translating what the professor or the textbook is trying to say.  Here is an example from my own experience.  I was a student in an intermediate “statistics” course – intermediate in the sense that it reviewed beginner-level “statistics” topics in somewhat more depth.  It was a distance-learning class (i.e., online).  I knew the professor.  She seemed reasonable and well-intentioned.  I, personally, found her to be friendly and likeable.  I am sure she meant to assign straightforward homework questions that students could answer on the basis of the videotaped lecture and the assigned reading.  She created a positive environment for asking questions, and most of the time, she was diligent in answering students’ questions posted to the online forum.

That’s not to say she was working very hard to ensure a quality learning experience.  She really was not into this class.  She was busy with her own research interests.  She had taught the course in a previous semester and understandably did not wish to rewrite all of her homework assignments from scratch.  Instead, she altered the questions a bit, here and there, so that nobody could just copy and hand in the previous semester’s answers.  The problem was, she did not then do a careful re-reading of the questions to make sure they still made sense.  On a number of occasions, they didn’t.  So in many cases her diligent efforts to answer questions in the online forum were actually devoted to clearing up unnecessary confusion that she, herself, had generated by doing a slipshod job of writing up her homework assignments.  Of course, consistent with typical university patterns, she was not canvassing students to see what could be improved, beyond the standard (and, for tenured faculty, often ineffectual) end-of-semester evaluation.

So, as I say, I think this professor probably would have been surprised, but really should not have been surprised, to learn that what she might consider a fairly easy homework question could actually take hours to work through.  For instance, one question called for a comparison of a sample mean against a population mean.  Simple enough.  But when I tried to do that comparison using the information supplied in her assigned homework question, other questions arose whose answers were not clear from the online lecture and the assigned chapter.  Were we supposed to do a directional confidence interval?  Was there even such a thing?  Why were we trying to estimate the population mean when it was already given?  Were we allowed to use the population standard deviation in a formula that called for the sample standard deviation?
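
To make the last of those questions concrete, here is a rough sketch in Python of the two calculations that were getting tangled together.  All of the numbers are hypothetical, invented for this illustration; they are not from the actual homework.  The usual convention is that a known population standard deviation goes into a z statistic, while a standard deviation estimated from the sample goes into a t statistic.

```python
import math
import statistics

# Hypothetical numbers, invented for illustration; not from the homework.
sample = [52, 55, 49, 58, 61, 50, 54, 57, 53]
population_mean = 50.0
population_sd = 4.0  # assumed to be known in advance

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # uses the n - 1 denominator

# z statistic: the standard error is built from the known population SD.
z = (sample_mean - population_mean) / (population_sd / math.sqrt(n))

# t statistic: the standard error is built from the sample SD instead.
t = (sample_mean - population_mean) / (sample_sd / math.sqrt(n))

# A directional (one-sided) p-value for the z statistic, from the standard
# normal distribution, and the two-sided version for comparison.
p_one_sided = 1 - statistics.NormalDist().cdf(z)
p_two_sided = 2 * p_one_sided

print(f"z = {z:.2f}, t = {t:.2f}")
print(f"one-sided p = {p_one_sided:.5f}, two-sided p = {p_two_sided:.5f}")
```

The sketch stops short of a p-value for t, because that requires the t distribution with n – 1 degrees of freedom, which Python's standard library does not supply; a statistics package or a printed table would be needed for that step.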

It took hours, as I say, to identify these and other underlying assumptions of the assigned homework problem, and to dig through the textbook and do online searches and otherwise flail around until I could bootstrap myself into an understanding of what I didn’t know and where I could go to find it.  The same thing happened in other homework assignments.  Among other things, I actually wrote up a blog post, for my own reference and for anyone else who might find it interesting, to explain aspects of the SAS statistical software program that I found necessary or helpful to understand or add to what the professor told us.

Not everyone could and would spare hours for things like that.  My classmates were adult students.  They were mature; they just tended to have jobs and other things going on in their lives, and couldn’t always spare large blocks of time to find their way around to the professor’s viewpoint.  This debility gradually accumulated during that semester, in terms of what students knew and also in their faith that they could keep on top of it all.  Not surprisingly, mean and median exam scores for the class dropped way off in the second and final exams.  I also noticed that most participants had largely abandoned the online discussion board by mid-semester.  And yet, despite such indicators, the professor seemed to have no awareness and/or concern that students might be struggling; or perhaps she had simply assumed that a lot of students would just not do very well in the game of “statistics.”

There seems to be a considerable amount of denial about such things among statisticians.  For example, Chernick (2011, p. 4) notes that, “For many physicians and nurses, there is a fear of statistics.  Perhaps this comes from hearing horror stories about statistics classes.”  But of course those physicians and nurses have already taken “statistics.”  It is not a matter of what they have heard.  It is a matter of what they have encountered firsthand.  In the example just given, a rangy accumulation of statistical lore and ad hoc reasoning had somehow become transformed, in my professor’s mind, into a sleek set of crisp mathematical solutions that a good student would grasp quickly.  There may have been dozens of good students whose counterexamples should have given pause, over the years – but who is going to go against the grain of “statistical” orthodoxy?  It seems to be much more common, among such professors, to assume that many students suffer from “math anxiety” or just don’t want to work hard.

“Statistics” professors’ determined misperception of pedagogical reality produces a torrent of unnecessary misery.  Students just want the course to go away.  As one put it, “If I had just one day to live, I would want to spend it in this statistics course, because it seems like it will never end.”  In the statistics lab and in my own experience as a student, I have encountered resentment, tears, and hostility toward “statistics” courses and professors.  On more than one occasion, I had to assure the student that this was not a reflection upon his/her intelligence or worth as a student – that, for practical purposes, “statistics” courses were designed to make people look and feel like losers.

It is true that math anxiety can ease over the course of a semester, but I suspect much of that is due to changing classroom dynamics.  As in the example of the course I took (above), by mid-semester some students have figured out the professor’s routine, and have developed a survival plan, and others have reconciled themselves to the frustrating or perhaps calming conclusion that they will just have to live with a grade of B or C.  Meanwhile, with passing weeks, instructors have decreasing incentives to crack the whip, and increasing incentives to come out as a nice person after all.  Math anxiety eases in such cases, not because students have become more comfortable with math per se, but just because they have arrived at a concept of how they can get through the course in some halfway acceptable fashion.

Unfortunately, there is not a lot of research on undergraduate “statistics” education.  In one recent exception, Hagen et al. (2012, 2013) did a mixed-methods study at the university where they, themselves, were instructors, administering their questionnaire to classes taught by others in their department, and conducting their focus group sessions on campus.  It is doubtful that this context would elicit the kind of frankness that might emerge in students’ remarks among themselves.  There is also reason to doubt that the study would draw highly independent evaluations from the authors, given that they were inquiring specifically into the efficacy of an Applied Statistics course that they, themselves, had developed for nursing students.

Hagen et al. do nonetheless contribute some insights.  Their course incorporated some reforms (e.g., an orientation toward nursing practice contexts), but it continued to focus on traditional “statistics” topics.  Not surprisingly, Hagen et al. found that their modified nursing “statistics” course yielded no rise in participants’ faint pre-course agreement that “statistics will be useful and relevant in my chosen career.”  Focus group participants seemed to feel that an improved ability to understand research articles was the primary benefit of the course.  But among the few specific examples of such a benefit (e.g., an understanding of what a p value means), there did not seem to be a justification for the continuing focus on topics in mathematical statistics.  The significantly improved “ability to understand graphs” (2013, p. 3) cited by multiple participants would probably not have been fostered in the more mathematical “actual university stats class” (p. 6) against which these participants contrasted the authors’ modified class.

For whatever it may be worth, my own informal inquiries with practicing nurses have invariably yielded negative appraisals:  those nurses have cited virtually no benefits from what was taught in their “statistics” courses.  One volunteered that a basic ability to work with percentages could make a life-or-death difference in nursing practice.  Another responded to my mention of Microsoft Excel (which most “statistics” courses do not teach) by stating that skill with that software would at least be useful for those seeking to move into nursing management.

Research suggests that students can do perversely impressive things with the subject matter of a “statistics” course, once it becomes “one of those courses that you mentally ‘throw out’ after you are done with it” (Hagen et al., 2013, p. 3, quoting a student).  For instance, Garfield and Ben-Zvi (p. 377) summarize research in which, just weeks after semester’s end, students who earned A grades in their “statistics” courses barely understood the fundamental concepts of mean and standard deviation, and had only “fragmentary recall” of something as basic as the Central Limit Theorem.  Similarly, Gardner and Hudson (1999) found that university students who had taken “statistics” courses, and who claimed that they were familiar with specific procedures, were able to choose the appropriate statistical procedure for a specific situation only about one-quarter of the time.  In phrasing that may appeal to professors not inclined toward self-critique, Zieffler et al. (2008) cite research indicating that

Even after formal instruction, many students do not reason consistently or accurately.  Furthermore, the studies suggest an inconsistency between what students are able to demonstrate on homework, quizzes and exams and their ability to reason about statistics. . . . [E]ven students who take more statistics courses may not necessarily understand the purposes and work of statistics at a higher level.

In other words, students learn the game.  They figure out what the professor expects them to memorize, and they memorize it.  But they have not necessarily learned much.  Needless to say, such outcomes limit the extent to which introductory “statistics” classes reliably lay a foundation on which later courses can build.  Those later courses seem obliged either to reprise the earlier material or to shift to different topics.

None of these remarks are intended to disparage the many hardworking “statistics” instructors and writers who endeavor to make the best of a bad situation, variously incorporating suggestions on good teaching (e.g., Epstein et al., 2011).  For example, some have tried to inject humor into the subject matter (e.g., Lesser & Pearl, 2008; Salkind, 2010; Field, 2009; Vickers, 2009; Gonick & Smith, 1993).  Others have looked for ways to make it engaging or fun (if not funny) to learn standard content (e.g., Larson & Farber, 2011; Wheelan, 2013; Dabney & Phillips, 2012; Gelman & Nolan, 2002; Ograjensek, 2010).  There have been many noteworthy efforts to convey statistical understanding in straightforward yet sympathetic terms (e.g., Khan Academy; StatSci.org; University of Toronto; UCLA; Stanford; Petty; Templin; Wells; Baharun & Porter).

In “statistics,” as in some other important things in life, if it’s not fun, then there’s a good chance you’re not doing it right.  And there are some really creative, organized, humorous “statistics” professors out there – although, as someone said, there are really no funny statistics professors; there are just funny people who have made a career mistake.  Yet to the extent that such teachers and writers are merely putting lipstick on a pig, one could recommend that what is needed is not to make traditional content more palatable, but rather to rethink that content – most likely, to cut the narrow mathematical focus in favor of attention to broader numeracy and research methods (above).

Generally, if you are looking for a way to lose credibility with students, you really can’t go wrong with becoming a “statistics” instructor.  Even with good teaching, your very presence in that role makes you part of a system that forces students to take a course with an intimidating mathematical emphasis, so as to constantly remind them of their incompetence.  To some extent, you may be emphasizing the manual calculations when you could be emphasizing the computer, as if your purpose were to delay their acquisition of real-world skills for as long as possible.  In the name of teaching them how to work with data in daily life, you are teaching techniques that they will not remember because they will never be doing them again – and that are obsolete in any event.  You are burning up a semester in this ivory-tower mathematical exercise, denying them an introduction to a tremendous swath of useful stuff that they could have been learning about research methods.  Altogether, quite a package.

The Scam

A travesty of this nature does not occur by accident.  Somebody, somewhere, is making money off this thing.  If they weren’t, it wouldn’t continue.  The question is, who, and how?

It’s not that the university is somehow wringing extra credits out of students by requiring “statistics.”  Unlike the situation that may exist in, say, a master’s of social work program, the “statistics” requirement is not arbitrarily increasing the total number of courses that students must take.  So the university does not generate more revenues for itself by imposing a “statistics” requirement.  But the university does save money on the cost side.  Leaving students free to choose a course that interests them can be expensive, insofar as it typically requires the university to pay a professor who can teach that course.  Even at the national median salary of $62,000 for postsecondary teachers generally (never mind the typically higher salaries at research institutions), a professor tends to be a lot more expensive than a teaching assistant (TA, i.e., graduate student).  Sources differ, but TA compensation (including fellowships) apparently tends to average somewhere around $18,000.  And TAs (or, at any rate, bottom-level faculty) tend to be the ones teaching those introductory “statistics” courses.  Here’s how Soler (2010) characterizes the situation at the large community college where he works:

[W]e offer about 30 sections of introductory statistics per quarter. Including our very popular summer quarter, the numbers are staggering: over 100 sections per academic year serving an enrollment of well over 4000 students [on a campus of about 25,000 students total]. The corresponding teaching assignment is divided among twenty or so faculty (both full and part time) who have indicated an interest to teach such a course. Among these twenty faculty there are only six who have either an undergraduate or graduate degree in statistics. The typical statistical background of the remaining faculty consists of one or two courses, mostly on probability, with little or no significant exposure to statistical methodologies or statistical thinking.

As suggested by these numbers (and confirmed by the chair of the statistics department in which I was a teaching assistant), “statistics” TAs can bring in a lot of revenue at low cost.  Revenue minus cost is profit.  TAs predominate in these courses because, among other things, (a) the higher-paid professors don’t want to teach them, partly because of the flood of anxiety (statistics professors are not usually interested in being therapists) and partly because it’s a lot of work to grade homework for classes of 30+ students (and homework tends to be helpful for purposes of teaching math), and (b) it doesn’t take a lot of statistical knowledge to thoroughly intimidate and baffle these students.  It’s not as though you’re teaching European History, where anybody can Google what you’ve said and find out if you know what you’re talking about.  In “statistics,” some of it literally is Greek – α and β and σ – and most of it may as well be.  Master the lingo and you are golden, even if nobody learns much of lasting value.

So the university has a financial interest in promoting a required introductory “statistics” course over, say, an abstruse seminar that would tie down an expensive tenured professor for the benefit of perhaps only six or eight students.  And the department teaching “statistics” certainly has an interest in it.  They can’t afford to pay competitive salaries to a lot of statistics professors if nobody is taking their courses – and if it weren’t for requirements, almost nobody would.  If you want the luxury of being in demand, so as to pick and choose the kinds of professorial work you would prefer to do, then you definitely have incentives to maintain a narrative in which undergraduate and graduate students from all over the university are required to take “statistics” because doing so supposedly teaches them important things that they will understand, remember, and use correctly.

I should mention that, along with the financial incentives to use TAs, there is the problem that it can be hard to find adequately trained statisticians to teach college courses at any level (Soler, p. 19).  That problem has been around for at least 70 years (see Hotelling, 1940).  One might expect that a good-paying field with decent job prospects would have tended to attract an adequate supply of workers.  And yet, for some reason, the nation’s half-assed introductory “statistics” courses do not seem to have been inspiring waves of young people to tread this path.  It is a mystery.

And so this post has elucidated, to some extent, the nature of that introductory course in “statistics.”  It does bring together a sanctified cluster of current and antiquated statistical procedures, and can thus be referred to (without quotation marks) as being, in some loose sense, a course in statistics.  But it is also a study in the abuse of the power to impose counterproductive and even psychologically harrowing curricular requirements.  It is the sort of pedagogical tangent that one might expect to originate in a research-oriented institution that persistently undervalues teaching and learning.  As such, it seems to offer an object lesson in the phenomenon of Big Education, wherein the primary objective is to serve the interests of faculty and administrators, and the purported mission of service to the public becomes an excuse in support of that self-serving end.


One Response to “What Is “Statistics”?”

  1. Ray Woodcock Says:

    I received the following remarks from someone. The writer seems to have felt that our views diverged, but I would be inclined to say that these remarks tend to underscore some of the concerns presented in the post:

    * * * * *

    I read most of this article. It was interesting, and I had a number of thoughts about it. Overall, I think I had a more positive experience in my statistics course than what you are describing. For one thing, the professor taught the class rather than a TA, and this teacher was pretty good. The only real complaint I had against him was that he often let the class out early. Our class was on Tuesday and Thursday mornings from 9:30 til 10:50, but often he let us out closer to 10:30. I think he would have stayed longer if students had questions. But still, he could have gone a little more slowly when covering the material. My class had 30 students in it until the last. (I think we had 28 students at the end.)

    I guess you are right that this class is of no practical use to almost everyone. However, I know of one student who just graduated last December in actuarial science and was hired right away by an insurance company. There are a bunch of tests he must pass to continue in this, and I think he has already passed some of them. Do you think that statistics is less useful than other areas of math?

    I think it is possible that the students in my class were generally smarter than most of the students you have seen in other statistics classes. Not all of them did real well, but towards the last the teacher said that no one was flunking, and for every test there were several As. I was surprised they did that well since they did not have the large amount of time to study that I had.

    The teacher was creative about bringing the grades up. After the 3rd test everyone did worse, which was what the teacher had expected. The teacher announced that each of us would have a chance to get back up to half of the points we had lost on the test. He gave out another set of problems and even said we could each work with one other person on this set of problems.

    This teacher was fine with grading homework. (He chose at each class 2 of the problems from each homework assignment and had us put down the answers. If we were correct, we received points for this.) The teacher graded the assignments quickly, and after each test he told each student exactly what grade they had (the percentage and the letter).

    My downfall was with the calculator. I’ve always avoided using one if possible, and I had to get used to just entering numbers into it. Another good thing was that they had a math center very close by, and there were several people there who could assist us. I went there right away after each class in which we learned a new procedure on the calculator. One woman was especially helpful to me. She used to teach math. She even wrote down the steps for me to use with the calculator. If a student wanted help with the material, the people at the math center would help them. I did not see a lot of other students there. But maybe some of them came later in the day.

    What you say may be true about a lot of classes, I guess. I wanted to comment about a few of your conclusions. I still don’t know what grade I got. ( I had a long ways to go to catch up on points after the first test.) I know he said I had an 87.5, before I took the final. I think the class average was about 78.5, which was a high C. I should have had a fairly high B then. But the final was not easy. We don’t get to find out our scores on the final.

    Thanks for this article.
