Saturday, April 21, 2012

No wonder they're confused

The moral of the story: Pineapples are evil. 
The perils and pratfalls of standardized testing have appeared on the radar of the Interwebz for the past few days. Why (this time), you ask? Here's the decidedly tropical flava:
Eighth-graders were stumped. The author was bewildered. Finally, in a highly unusual move after a barrage of criticism on Friday, the New York state Education Department invalidated a puzzler of a section that appeared on high-stakes state English tests earlier this week.

The passage—about a race between a hare and a talking pineapple—was adapted from a novel by children's book author Daniel Pinkwater. The pineapple, despite having proposed the contest, doesn't move. In the end, the animals eat it. Students were asked why the animals devour the pineapple, which animal spoke the wisest words and how the animals feel about the fruit.

The passage has appeared in tests in other states and is notorious enough to have a Facebook page that has received more than 11,000 "likes" since 2010. Pearson PLC, which created New York's test as part of a five-year, $32 million contract, referred questions to the state Education Department. 
This FULL ARTICLE* is by far the best piece I've seen on the story because it includes an interview with Daniel Pinkwater, the story's original author. Prodded by his sudden burst of semi-celebrity, the esteemed Mr. Pinkwater has this to say:
There was never all this attention before. Occasionally there would be some mention, every couple of years, that that quote has been appearing on those stupid tests—and you can quote me, stupid tests. There's big to-do about it now since it ran in New York this past week. I've gotten a ton of emails from kids. One kid phoned me up. They had many comments ranging from, "What are you, crazy?" to "That was the funniest thing I ever saw on a test" to "These tests are stupid, aren't they, Mr. Pinkwater?"
Indeed, kid. Indeed they are. 

* Nota bene: For those clicking on the full shizzle, it would behoove you to download the proffered PDF linked to the left of the article text, if only to get your mitts on some SECURE MATERIALS straight from the bajillion dollar brainwashing scam English test.

15 comments:

  1. It seems proffies have a universal distaste for such tests. So why do we continue to use them as a primary means of determining who is allowed in our hallowed halls?

    Something of note from my grad program: those who received 4-year fellowship packages did so based on their GRE scores. Usually, those same 4-year package-earners fail to secure the PhD. We watch them drop like flies.

    An anecdote, perhaps, but studies and articles seem to be universal in their assertion that these random tests are NOT a valid measure of performance, ability, or perseverance.

    Again: why bother using the GRE, SAT, or ACT?

    1. David Brooks just wrote an article in the NYT this week arguing that universities need to start IMPLEMENTING more standardized testing.

    2. David Brooks is an idiot.

      http://delong.typepad.com/sdj/2011/11/new-york-times-total-fail-yes-another-david-brooks-edition.html

    3. I in no way disagree. That picture they have of him makes me want to shove his face in.

  2. Proffies don't have a universal distaste for such tests. Such condemnation is mostly limited to humanities folks, in my experience, with many soapboxes and little data involved.

    So here's some data: a meta-analysis of 1,753 studies covering over 82,000 graduate students on the use of the GRE to predict student success (hint: it works fairly well for a variety of important outcomes): http://socrates.berkeley.edu/~maccoun/PP279_GRE.pdf

    The SAT and ACT are not quite as good at predicting college success, although they do still provide some information that you don't get from HS GPA alone. Whether that's worth the resulting racial differences in the students selected is a matter for each institution to decide.

    This article, however, seems to be referencing some sort of HS-level test. I'm thinking it's one of those "No Child Left Behind" sort of tests, where everyone must pass to get to the next grade. I don't think there's as much data on those generally, and I don't think there's much predictive validity evidence available specifically, but it's too far outside my research area for me to say that with any confidence.

    1. It sounds as if there's good reason humanities types don't put much credence in these tests. Maybe they should be restricted to the STEM fields. As far as I could tell the humanities GRE did nothing but test my vocabulary. In which I did brilliantly, as linguistics types almost always do. But it didn't tell them anything about my ability to, you know, function in my field, or do anything but know what the words meant.

    2. I was told that my grad (English) department considered the GRE (required by the institution) meaningless, since every remotely-plausible candidate who spoke English as a first language scored in the 99th percentile anyway.

      The part that scared me was that I scored in the 85th percentile in math. This does not bode well for the state of bridges, etc.

    3. This was on the 8th grade NYS ELA exam given state-wide at the start of last week.

      A few years ago, NYS kiddos read a passage, then were bid to speculate as to author intent-- INTENT, mind; not "purpose." Those were FOURTH graders.

      When my juniors are flummoxed, irritated, and bored on the Regents, I suggest they plink themselves on the temple and repeat the mantra "Sometimes the State is an ass."

      I did well on the math portion as well, which, as you say, was troubling.

  3. I've been asked to serve as a consultant for these HS high-stakes tests before, usually on questions that require writing. When I see the questions, they don't usually seem as absurd as the ones presented in this article; however, when I see the answers the "customers" have deemed acceptable as demonstrating reading or writing comprehension, I quickly understand why Johnny has limited literacy skills when he comes to college. As long as he picks an idea that falls into their range of acceptable ones from the passage and has any direct quote which supports it, he's deemed competent.

    And then there are certain quotes the test designers refer to as "platinum," meaning that if the student gets them in the response and has an idea that's even in the ballpark of being about the story, it's an automatic pass. There is little room for creativity or interpretation of the selection, and the evidence must be a direct quote no matter what, or the response fails. Apparently paraphrase and summary are not essential skills for college writing or reading comprehension.

    1. Great info to have from the front lines, EnglishDoc. You've aptly illustrated why English tests in particular make very little sense when "standardized." Imagine a "range" of "acceptable" answers on a math test, or a science test...it just wouldn't happen. That they are forced to finagle the format just to have an English test that can be graded via ScanTron speaks volumes about the corners that are cut in order to implement a test that might (MIGHT) be more useful if it could be designed and scored by actual people. (And boy, even then there would be problems, I'm sure...)

    2. Actually, you can have a range of acceptable answers on a science test: organic chemistry synthesis problems often have more than one possible correct answer. That's just one example from my own experience.

      We can put "the correct answer is how you answer" problems on our exams, too!

  4. OK, some things.

    1) My kindergartener could answer most of these questions, which require only bare factual recall.

    2) Question 10 is hilariously flatfooted. Q: "What would have happened if the animals had decided to cheer for the hare?" A is supposed to be: "they would have been happy to have cheered for a winner." Yes, if the bare facts of the story matter and irony has been completely eliminated. But the animals themselves knew that if they had cheered for the hare the pineapple would have won, so they cheered for the pineapple, who lost (1-800-irony).

    3) Pinkwater's answer to #7 is awesome.

    4) This begins to explain some of what goes wrong in my classroom.

  5. My husband teaches 4th grade. There was a reading passage about ancient Egypt and mother fucking Ptolemy. PTOLEMY??!!!


    FOURTH GRADE!!!


    Yes. There are 4th graders who can handle it. Most of you probably were. Many of you may have a 4th grader who can read "weird" names of unrelatably old scientists and societies. My 4th grade stepson, for example, went through a big astronomy phase in the 2nd grade, and though it's cooled slightly, he still points out a gibbous moon now and then, and could run down a short list of historically important physicists.

    But does that make "Ptolemy" a reasonable string of letters for a standardized test for "regular" kids in the fourth fucking grade?

  6. The passage strikes me as poorly chosen and poorly adapted (Pinkwater is great, but not for this purpose), and the questions are worse. Testing comprehension of fiction by multiple choice strikes me as a dicey proposition in any case (though it could certainly be done better than this).

    I do, however, think it's possible to measure some basics of reading comprehension (at least of nonfiction texts) by multiple choice. I wrote reading comprehension units for one of the major national professional-school tests (one where the applicants were aspiring to make life-or-death decisions), and, though I wouldn't want it to be the only or main admission criterion, I was satisfied that the sorts of things the questions tested -- distinguishing main from subsidiary points; distinguishing points supported by evidence from unsupported assertions; deducing meaning from context; figuring out the purpose for which one author was quoting another -- were the sorts of skills one needs to read fairly complex, sophisticated texts, and that the ability to complete such reading was germane to the professional school and professional experience the applicants wanted to pursue.

    But producing test questions (and accompanying rationales) of that quality is fairly expensive, and I doubt that the work underlying most NCLB and similar tests is similarly rigorous.

    Also, Pinkwater has a good point about compensation: authors whose passages are adapted and those who write *good* questions should be fairly compensated. The gig I had actually paid pretty well; I wonder whether that's still the case (or the case for most of the ever-proliferating tests).

