Thursday, May 7, 2015

The Purpose of the Test

Background: Spring is one of those special times of year when the air handling system in the Donorbucks Building switches from heating to cooling, or vice versa, sometimes more than once per day. Because the air supply and return vents are not in the same rooms, we must keep our doors open while in our offices to achieve airflow sufficient for sustaining life. While the fans shut down for the changeover, the minutes-long absence of their jet-plane-like roar underscores just how acoustically live the cinder-block-walled, concrete-floored seventh floor can be, and we can’t help but hear almost every word in nearby offices. Fortunately, we’re all pretty quiet, except when a student happens upon the floor. I tender for your amusement what I overheard the other day.

Unidentified voice: Professor Panquehue?

Panquehue: Oh, hello, Stu. What brings you here today?

Stu: Well, I heard from other students that there was an adjustment on the last exam?

Panquehue: Yes, we discovered a problem with one of the questions that resulted in some students gaining another point.

Stu: Well, I feel it's not fair that I didn't get any more points.

Panquehue: Not fair? Please explain.

Stu: My score should have gone up.

Panquehue: Okay, essentially reaffirming the consequent. Please make your case again in different words.

Stu: I should have gotten another point like my friend did, so it's not fair.

Panquehue: I am sensing a pattern here. Maybe I can help. You think another point was due to you, and you are inquiring as to why that didn't happen, such as maybe there was a mistake in the scoring.

Stu: Yeah. I want you to show me why my friend got another point and I didn't. Her name is...

Panquehue: No, I'm not discussing anyone else's exam with you. Your exam is in the department office with the rest, but I have the master right here, so let's see how far we get with that. The question in question is... number eighteen? Yes, eighteen. Do you recognize this one?

Stu: The main cause of loss of structural integrity [unintelligible mumbling]... yeah, I remember. This is the one I almost got tripped up on.

Panquehue: OK, good. Answer B was correct, which is what we'd keyed into the scantron reader. It turned out answer A was correct as well. We don't make mistakes like this too often, but when we do, we try to do right by the class. So the forty or so percent of the class who answered A got another point when we reran the scantrons to accept both A and B.

Stu: But I answered C.

Panquehue: Well, then, this is pretty straightforward. You don't get the point, and we're done here.

Stu: But it's not fair!

Panquehue: I disagree but sense you wish to pursue the matter further.

Stu: We're supposed to pick the BEST answer. When there are two answers that are equivalent, that means neither of them can be the best, so you can eliminate them and work from the other three.

Panquehue: A and B are not equivalent, just approximately equally good. And that's an intriguing test-taking strategy. Did you come up with that on your own?

Stu: No, I think I got that at Kaplan, prepping for the SAT.

Panquehue: Well, that's well and good, but this isn't Kaplan, and this exam isn't the SAT. Are we done?

Stu: But I shouldn't be penalized for using a proven strategy...

Panquehue: Let's stop there. You were not penalized. You chose the wrong answer, and consequently you didn't earn the point. You start from zero and work your way up, not the other way around.

Stu: But I chose the best answer!

Panquehue: Answer C was clearly wrong.

Stu: But for the purposes of the test, C is the best answer.

Panquehue: Again, if you knew the material, you'd know that "loss of hysteresis" was not a good answer, and you'd have answered A or B, which were clearly better.

Stu: But after you eliminate A and B, C is better than D and E!

Panquehue: No, that's not how it works. The question wasn't about which of the wrong answers was less wrong.

Stu: But for the purposes of this test, because there were two equivalent right answers...

Panquehue: Yes, we covered that. In the Olympics, if two runners tie for first, the gold medal is not awarded to the one who comes in third, and you don't get a point for the bronze here. And I'm now at a loss how to explain it better, but my colleague is really good with this stuff, and we don't even need to get out of our seats.

Voice on speakerphone: Yes?

Panquehue: Do you have a minute? We need some help with the readjustment on the latest exam.

Voice on speakerphone: I'll be right there.

[Sound of footsteps]

Feta: Hi. What seems to be the difficulty?

Panquehue: What is the procedure when it turns out there is more than one correct answer on a multiple choice question?

Feta: That's simple. We rerun the scantrons to accept all answers that were correct for that question. Students who chose one of the remaining wrong answers, or who left it blank, don't earn the point.

Stu: Yeah, but that penalizes those who use test-taking strategy to eliminate...

Feta: It penalizes no one. It actually gives credit where credit is due.

Stu: But for the purposes of the test, you should accept the best of the wrong answers, because there can be only one right answer.

Feta: Not so. As happens more often in life, on an exam there are sometimes two or more good answers to a question.

Stu: But for the purposes of the test...

Feta: This is about question eighteen, isn't it? You answered D?

Stu: No, C.

Feta: Oh, you were that one? Most of the people who got that question wrong answered D. Look. About eighty percent of the class split about evenly between A and B, and of the upper quintile, about ninety-five percent answered either A or B, so this was clearly a doable question.

Stu: But for the purposes of this test, you should throw out that question and give everybody a point for it, because it was flawed.

Feta: The purpose of the test is for you to demonstrate what you know by providing the correct response, the result of which is you earn a score roughly commensurate with your competency. What you just suggested is what we do if it turns out there is NO correct answer. But that’s even more rare than two right answers, and at any rate, it’s not what happened here.

Stu: But it’s not fair...

Feta: It would not be fair to those who studied, learned the material, and selected the correct answer if those who didn’t got the same score, even on a single question. Doctor Panquehue, was there anything else?

Panquehue: No. Thanks for your trouble.

Feta: De nada.

[Sound of footsteps.]

Stu: I want to file an appeal.

Panquehue: That is your right.

Stu: How do I do that?

Panquehue: The handbook says that appeals go first to the course director -- that's me -- then to the Associate Dean of Curriculum and Assessment.

Stu: Who's that?

Panquehue: I'll save you the trouble. Doctor Feta?

Voice on speakerphone: This is Stilton. Feta stepped out... oh, she's back.

Feta: [on speakerphone] Yes?

Panquehue: A student wishes to appeal a grade on an exam.

Feta: I'll be right there.

[Sound of footsteps.]

Stu: Oh, it's you again.

Feta: Oui, c'est moi. You wish to file an appeal?

Stu: Yeah.

Feta: Okay. I've already heard part of the case. I only need one more piece of information. Doctor Panquehue, you're the content expert. I'm sorry to have to ask you this, but, well, it's policy. Is there any way that answer C is correct?

Panquehue: No. And to answer your next question, no evidence has been presented that it is even partly correct, much less on par with A or B.

Feta: Then the appeal is denied.

Stu: How do I appeal your denial?

Feta: Of course. Your next stop is the provost. But so that you don't waste your time, you should build a better case. Understand that you are asking to be given a point for selecting a wrong answer. The provost’s response is likely to be the same as ours was: if you knew the material, you would have picked one of the right answers, just like the vast majority of the class did. So formulate an argument based on evidence from the most current and reliable research on that topic. The course director will be providing the counterargument.

[Sound of chair sliding on the floor.]

Stu: Well, I’ll take my chances with the provost right now.

Feta: Of course. Good afternoon, Stu.

[Sound of footsteps.]

Stu: [fading into the distance] It’s not fair.

Feta: So how long was he with you before I got involved?

Panquehue: Not too long. I had several opportunities to end it, but this was kind of a new twist. Sorry if it took too much of your time.

Feta: No worries. Like fire drills -- often enough to keep in practice, seldom enough to not be a nuisance.

Panquehue: What do you think will happen with the provost?

Feta: Well, it probably won’t set a precedent that has any long-term repercussions, and I’ve fought on principle enough that I don’t mind being reversed, and it’s just one point, so he’ll side with us. He reserves his reversals for things that fuck us in more drastic and lasting ways.

Panquehue: We could make BINGO cards for these things. “It’s not fair.” “How do I appeal.”

Feta: I know, right? “But for the purposes of the test.” How many times did you hear that?

Panquehue: I lost count. I wanted to say, “You keep using that phrase. I do not think it means what you think it means.”

Feta: But for the purposes of the test, I assumed the horse was a sphere.

Panquehue: But for the purposes of the test, I assumed a can opener.

Feta: But for the purposes of the test, you need to give me a private jet and transfer all funds to my account in the Caymans.

Panquehue: But for the purposes of the test, rainbows and unicorns will fly straight out of my ass.

[Whereupon the air handler fans rev up and the conversation is no longer audible.]


  1. Wow. I hear a lot about high school students being taught to take tests rather than being taught material, but this really nails home how damaging that is.

    1. It really does. I wonder how much this plays a role in the failures of my attempts to use MC tests. Since I've never been formally indoctrinated into the ways of commercial standardized testing, I'm sure the techniques honed there would fail miserably.

    2. I've attended several faculty development workshops on the topic of writing MCQs. To some extent, it feels like counterterrorism: many students have indeed learned the "science" of picking up points when they are weak on the material, so question writers have to be on guard not to commit the subtle (and not so subtle) errors that enable such practice. I want to double down and use the students' tricks against them, i.e. to increase the chances that they will pick the wrong answer when they are simply using the trick.

    3. The other side of that coin, I've come to realize, is that when they get through two years of large classes with multiple choice exams, they struggle with a written test.

      It's one of those be-careful-what-you-wish-for situations. Everyone says they hate multiple choice because 'it's just memorization' or 'it doesn't reflect what I really know'. But when they get into upper level classes, the tune changes. Instead of 'I'm so glad that an actual living breathing professor will actually engage with my written expression of my knowledge', it's 'why didn't this word salad get full marks?'

      And Hell hath no fury like an A student looking at a B.

  2. The Donorbucks Building (love it!) seems to have the same HVAC system as my building.

    For the purposes of the test, a finch is a primate.

    1. My building has very similar construction, but a slightly better HVAC system (at least in my office; we have actual vents, of both types, as far as I can tell. Assorted bees, wasps, etc. -- which are presumably hominids for the purposes of the test -- occasionally fly out, but hey, there's air). If it were a Donorbucks building (we have several, as well as several Lofty Entrepreneurial/Educational Concept buildings, which were presumably intended to be Donorbucks buildings, but the recession hit), I assume the carpet wouldn't be going on 40 years old, and the wasp nests in the ceiling would be removed.

    2. The Donorbucks Building was quite possibly underfunded from its inception, but I've also heard rumors to the effect that Donor passed away during its construction and the remaining funds were "un-earmarked". If there ever had been carpet in certain parts of it, it was long ago removed; the floors are coated with epoxy paint of some hideous shade popular in the 1970s.

    3. We have a series of buildings funded by a donor/corporation. The first one ran over budget, so the second one is slightly less awesome looking. The second and third ran over also. Building #4 is clearly inferior.

    4. My shiny new building was apparently designed with one aim in mind: to look good on the web.

      Hell to teach, research, or - probably - learn in, it appears glorious incorporated into brochures, homepages, PowerPoint slides, and any other flytrap-like material that may ensnare a few more punters.

      We lack donors, but appear to have an extremely vital Committee Researching Audience Preferences For Emerging Smart Technologies.

  3. But for the purposes of this test, you should throw out that question and give everybody a point for it, because it was flawed.

    Interestingly, this is an option programmed into our edition of Blackholeboard. It's called something like "assign default grade," and, when I first saw it, I thought it would let me give everybody full credit on a question, then go back and adjust scores for those who didn't deserve full credit (a handy option for short-answer pre-class reading quizzes that are really note-taking exercises), but no, it just means that everybody gets full credit, with no changes possible, because the instructor decided to throw out the question.

    I'm ashamed to say I once taught a Kaplan review course (I think the confidentiality agreement has expired). It sounds like the student actually paid attention in that one.

    1. I do a fair amount of work with what I call the BEAST. An MCQ can be rescored according to several options, two of which are "accept all answers" and "give full credit to all students." The difference is that the former gives no points to students who left the answer blank. It is also possible to manually checkmark whichever of the answers turns out to be correct.

      No shame in making a buck teaching students how to take tests; they still need to know the material to score well. Stu's problem was clearly in failing to recognize that the rules of the SAT apply in very few other places, for which he was amply schooled.

    2. I imagine that faculty inboxes are frequently invaded by 'professional development' courses and seminars offered by teaching-focused units at the institution, one of which is often 'How to design multiple-choice questions'. I am often annoyed by these emails and delete them, but one workshop was being run by a colleague in my department, so I went to it, and damn did I learn a lot. After the workshop I went back and looked over my midterms and exams, and realized I was obviously telegraphing the correct answer in more than half of my questions.
      I imagine that these teaching workshops exist in part to counteract the 'question patterns' highlighted by courses like Kaplan, because thorough question design isn't on most people's radar when they are making test questions from scratch, or using question banks from a textbook, which are only as good as the instructor who helped write the supplementary material for the textbook.

    3. I summarily delete those spams as well. The workshops I've been to were all arranged by and/or presented by colleagues, or at conferences centered on education in my field.

      I have the "honor" of editing questions whose multiple answers look like this:

      A. Rhubarb
      B. Bamboo
      C. Bitumen
      D. Fescue
      E. Drop forged chromium molybdenum work hardened 316 stainless alloy

      I am of course using hyperbole, but I suspect that even a first-grader can play the game of "one of these things is not like the others" and pick up the point.

  4. Must we explain to students how to file an appeal of our decisions? We are not cops and we're not dealing with Miranda rights. Screw them. Make the student figure out the appeals process. For one thing, I don't care enough about it to know how the process works, so I'm not that helpful. For another, it would give the student some exposure to bureaucracies, which is a valuable experience.

    1. We are now required to include in our syllabus, or reference with a link to the official document, a set of basic administrative procedures, including the severe weather policy and grade challenge procedures.

    2. I am alarmed at the additional material I've been required to add to my syllabus over the past couple of years. I feel as if I need to get an attorney before I copy the things.

  5. About halfway down the post I was thinking to myself, at this point I'd say to the student "You keep using that word. I do not think it means what you think it means." Very cool that it made it into the banter of the playlet. Nothing sends my blood pressure to the ceiling quicker than when a student whines out the word "unfair" with nothing to back it up. While I don't use that particular phrase, I do now make the effort to interrupt the student mid-whine, immediately upon hearing that word, and ask "Unfair? You're going to have to explain to me what is 'unfair' about the situation. 'Unfair' has a specific meaning, and I'm certainly willing to change your mark if there is something 'unfair' about it. Now, please tell me, what is 'unfair' about your mark?" With those phrases now uttered, with emphatic 'air quotes' every time I say "unfair", I've got about a 90% rate of the student staring back uncomprehendingly, and then with very little fuss slinking away because they know they've got fuck-all to back up their request for a mark change due to something being "unfair". The other 10% stumble their way through an explanation that, by the end, even they themselves realize is a load of hooey. Unfair, phffttttt.

    1. Amen to that, Good Professor. The "air quotes" do indeed convey that in order to prevail, "the student" must "bring it" and "leave it on the field." But they seldom do, because "they got nuthin."

      It may cheer you to know that both Panquehue and Feta have been known to use phrasing such as:

      "Technician in Training Stuart, your claim that this is unfair is without foundation. What you really mean is that you are not being afforded some advantage that you believe is your birthright. Pragmatically, it would be better for you to align your expectations with a more objective definition of 'fairness', lest you spend the remainder of your days in a want that can never be fulfilled. That is a life I would not wish on my worst enemy, and the enemies I've made could make you go 'poof' and your own family forget they ever knew you. And if you think I'm being harsh now, you are not taking into account the mellowing that age brings to a person."

      Compared to them, I'm just a big old damn softie.

    2. "Unfair" usually means only that they don't like the outcome and they want it fixed. I point out that to do so would be, in fact, unfair to everyone who did not get that consideration. Which is another way of saying, go away.

    3. Yeah, they almost never think about what it would be like to be on the other end of it, i.e. to be one of the "everyone who did not get that consideration".

      When our students don't get the point they are arguing for, they sometimes escalate to the "throw the question" gambit. They don't think about what that means for the students who answered the question as asked and got it right for the right reasons. Feta pointed out that problem to Stu while strongly implying that any further debate about "fairness" would not bear him fruit -- and at any rate, the conversation was over.

      This wasn't Panquehue's or Feta's first rodeo. Some of their retorts came pre-made, honed to concision during previous iterations. I, too, have found that it helps to have common counterarguments in the can.

  6. Ben brings up good points about how students should figure out the appeals process for themselves. I suspect my colleagues played it out as they did for a couple of reasons.

    1) Panquehue was likely intrigued by Stu's argument, which was based entirely on a "rule" -- only one answer can be correct -- that is written nowhere and isn't even stated orally by any instructor. So Panquehue called in Feta to see this fresh hell for herself, knowing that she would lay down some extra smack while she was there.

    2) Stu would have met Feta during orientation, which is a Very Big Deal in his program. Yet he didn't recognize her or remember her role in the program, a fact that quickly became apparent to my colleagues. At that point they played with him as might two cats with a rubber mouse. If they wanted to shut it down, they could have; please see my comment of a few minutes ago further up this page. I almost pissed myself listening to them playing it so straight.

    3) My colleagues did not quote the letter of the appeal policy. Had Stu bothered to read the handbook, he'd have known that formal challenges to exam questions must be conveyed through a class representative -- first to the course director, then to the dean, etc., which they said, technically.

    We often entertain informal challenges when it seems likely to save time in the long run. (Sometimes it's faster to just rerun the scantrons for the whole class based on a quick but well-argued oral challenge than to wait for the formal process.) My colleagues got Stu's challenge and appeal handled in far less time than formal hearings -- and all the emails -- would have taken. If Stu ever reads the handbook, he might see that he didn't follow the correct route and thus in theory had not yet exhausted his option to work through the class rep. But the rep is smart enough to ask him if he was already shot down, and then she'd just shoot him down as well.

    Sidebar: when the kiddies have been dismissed and appeal, it is astounding how well they come up to speed on the policies in the handbook. Too little, too late.

    4) Question 18 was never rescored to give credit for anything but answer choices A and B, but I concede that the same outcome would occur if Stu never made it to the provost's office (it is, after all, in a different building). Panquehue would know that the provost entertains whiners at his whim, but almost never sides with "point lawyers". I strongly suspect Feta recognized Panquehue's question ("What do you think will happen with the provost?") as rhetorical and took the opportunity to sardonically allude to some serious shit that went down in the near past that would merit a post (or several) of its own if I could figure out how to do that without outing myself like the lead actor in a snuff film.

    1. "Point lawyers" just entered my vocabulary. I've seen them plenty, but didn't have the right phrase.

  7. If I had a dime for every time a student came to me whining about things like that....