The Trouble With the Curve
Something strange happened on the November SAT. On the morning of score release, one of our tutors emailed us, confounded by his 750 on the math section. He had missed a single item on the math test, which typically would have yielded a score of 770-790. In November, his single wrong answer dropped his score an unprecedented 50 points! Moreover, other students who missed two questions attained scores of 720, and those who missed two questions and omitted two more came away with a 680! There is something funky with that curve!
Our tutors who had taken the test commented immediately afterward on how easy the math felt. It appears there were simply too few difficult items on November’s SAT. By creating a test lacking an adequate number of challenging items, the College Board forced too many students to the top of the scoring distribution. Test-writers aim for a normal distribution of scores, with enough challenging items to differentiate between students who score a 740, a 760, and a 780. With too few challenging items, the November test pushed too many students toward perfect and near-perfect raw scores. To spread those raw scores out over the scaled-score distribution, the College Board had to push students who missed one item down to a 750, and those who missed two items down to a 720.
To put this in context, we examined the relationship between raw and scaled scores for students who attained a raw score of 50 on the math section, examining the College Board data all the way back to 2005. There is a meaningful range of score conversions, from a high of 740 in 2011 to a low of 700 in 2013. It appears a student from the November test with a raw score of 50 would have attained a scaled score of 680-690, a small drop below the 2013 score conversion.
But the big drop, the most remarkable difference, is at the very top: a raw score of 53 converting to a scaled score of 750. That got everyone’s attention.
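To make the arithmetic concrete, here is a minimal sketch of how the outcomes described above map onto raw scores. It rests on assumptions the article does not state: that the math section had 54 questions, that every wrong answer cost a quarter point (as if all misses fell on multiple-choice items), and that raw scores round half up. The names `raw_score`, `scaled_score`, and `NOVEMBER_2014_MATH` are hypothetical, and the table holds only the three November figures quoted in this article, not a full conversion chart.

```python
import math

# Assumptions, not taken from the article: 54 math questions in total and a
# quarter-point deduction for each wrong answer.
TOTAL_QUESTIONS = 54
WRONG_PENALTY = 0.25

# Partial November 2014 math conversion. The scaled scores are the ones quoted
# in the article; the raw-score keys 52 and 50 are derived under the
# assumptions above (only 53 -> 750 is paired explicitly in the text).
NOVEMBER_2014_MATH = {53: 750, 52: 720, 50: 680}


def raw_score(wrong, omitted=0):
    """Return the rounded raw score, treating every wrong answer as penalized."""
    correct = TOTAL_QUESTIONS - wrong - omitted
    unrounded = correct - WRONG_PENALTY * wrong
    return math.floor(unrounded + 0.5)  # round halves up (assumed rule)


def scaled_score(wrong, omitted=0, table=NOVEMBER_2014_MATH):
    """Look up the scaled score; returns None where the article gives no figure."""
    return table.get(raw_score(wrong, omitted))


print(scaled_score(wrong=1))             # one miss
print(scaled_score(wrong=2))             # two misses
print(scaled_score(wrong=2, omitted=2))  # two misses, two omissions
```

Under these assumptions, two misses plus two omissions works out to a raw score of 50, which lines up with the 680-690 estimate for a raw 50 on the November test.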
We’ve received multiple inquiries from our counselor and educational consultant colleagues: Is this a trend? Is November harder? Are smarter kids taking the November test, and should we counsel our students away from it? In a nutshell: No. November is a fine test. By all accounts, this scoring conversion for the 2014 November Math test was an anomaly. The test writers wrote an easy math test, and the score distribution reflects that. Is this a sign of things to come? Not likely. The December and January tests will most likely bring things back in line with typical score conversions, and it’s unlikely we’ll see this same skewed distribution next November.
We pulled student data from the past 5 years for each testing period (the charts above show the November tests from 2012, 2013, and 2014) and found some points worth mentioning. First, evaluating the past three November tests, there is no observable trend in student outcomes for that month. The 2012 test saw fewer high-scoring students; 2013 had an even spread; and 2014, by comparison, saw a gap between low-700 scores and 800. Some November tests produced higher scores and some lower, but the month as a whole did not suggest higher or lower outcomes for students. When we pulled the data for the other months, we saw similar variance across the other test dates. These data support the College Board’s assertion that there is no predictably easy or hard test date.
Second, when we look at November 2014, we see an unusually high number of perfect 800s and low-700s, but not much in the middle. With an easier math section, the College Board likely saw many more students score a perfect 800. In order to keep the upper end of the distribution curve accurate, it pushed the otherwise mid-700 students further down the scale.
Looking at our data from the past 5 years and beyond, we feel they strongly indicate that the College Board does a fairly good job of equating the tests and keeping any one test date from standing apart from the rest. Easier and more difficult tests will rear their heads, as we saw in November, but not in a predictable way. The best advice we can give our families is to plan for several test sittings. Should one test prove more difficult, it is advantageous to have a second test to mitigate the unpredictability of any one SAT.