Outdated Practice Tests and the New Realities of SAT Scoring Curves
As a standardized test, the SAT has a reputation for predictability. However, the scoring curves on recent SAT tests have been unexpected, and that’s troubling for many students. The College Board has struggled to create tests of consistent difficulty and has erred on the side of making tests that are too easy, leading to punishing scoring curves. The scoring scales on recent tests have shocked students, leading tens of thousands of them to sign online petitions calling on the College Board to rescore the tests in question.
The primary issue is one of calibrated expectations. The study materials and practice tests officially released by the College Board do not reflect the scoring curves students will see on their actual tests. This misalignment of expectations is a major problem. If the College Board intends to continue administering easier tests with tougher curves, it needs to update its practice materials so that students have an accurate sense of their likely testing outcomes.
Why is this happening?
The primary driver behind the steep scoring curves of recent administrations appears to be a failure to accurately pretest and norm questions. The College Board distanced itself from the Educational Testing Service and took its pretesting in-house. Previously, the College Board normed questions using an experimental section administered with every SAT, which yielded a broad national sample of students. After parting ways with ETS, the College Board began to norm its questions using a sample of students who had opted out of the essay section. At that point, the essay was still required by many of the most selective schools in the nation, so the non-essay cohort was made up of relatively weaker test takers than the cohort who took the essay. Additionally, the fact that the experimental section didn’t count – and the students knew it – may have influenced the performance of certain groups of students. In the end, items that looked hard to this sample didn’t prove as difficult for students at the upper end of the scoring spectrum. That became a problem when the items appeared on actual SAT exams, because too many students answered them correctly. When too many students ace the “hard items,” the curve becomes much steeper, and a single error costs far more at the upper end of the scale.
Although the SAT is supposed to be a standardized test, recent tests have varied in difficulty. Students don’t know whether missing three items will yield a 760 or a 690: that’s a lot of uncertainty. The bigger challenge lies in students’ ability to calibrate their own readiness for these tests. If the officially released practice tests are telling them they’re heading for a 760, but the actual tests are reducing that score to a 690, that’s a real problem. Accurately calibrated practice materials are essential for students to predict their performance and understand how much work remains before they reach their goals.
This isn’t the first time a test provider has made changes to its tests that are not reflected in current practice materials. A similar lag took place about 8 years ago with the ACT, when it began to transform its science section and its scoring curves: students who were consistently hitting 34s and 35s on practice tests were surprised to receive 29s and 30s on official tests. The practice tests had grown stale and less useful at predicting performance. Thankfully, the ACT eventually updated its bank of released tests, and students could once again get a reliable read on how ready they were for the current exams.
The SAT is now in a similar place with its publicly available practice materials. As the scoring curves keep shifting and the penalties for missed items lead to more precipitous drops, the practice tests are becoming more outdated and less useful.
When we look at the scoring curves for recent Math tests, it’s clear the bar has been raised: students must be more accurate to earn the same scaled score. To earn that coveted 700 in Math, a student needed a raw score of 47 (out of 58) in January 2017, and a 48 in April 2017, October 2017, March 2018, and April 2018. In contrast, a 700 required a raw score of 54 in June 2018, and a 52 in October, November, and December 2018. When you remember that the Math test has only 58 questions, that bar starts to look higher and higher.
| Test Date | Raw Score Needed to Attain a 700 on the Math Section (out of 58) |
|---|---|
| January 2017 | 47 |
| April 2017 | 48 |
| October 2017 | 48 |
| March 2018 | 48 |
| April 2018 | 48 |
| June 2018 | 54 |
| October 2018 | 52 |
| November 2018 | 52 |
| December 2018 | 52 |
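To make the shift concrete, here is a short sketch, using only the thresholds quoted above, that computes the margin for error: how many questions a student could miss and still score a 700.

```python
# Raw scores needed for a 700 on Math (out of 58), per the test dates above.
threshold_700 = {
    "Jan 2017": 47, "Apr 2017": 48, "Oct 2017": 48, "Mar 2018": 48,
    "Apr 2018": 48, "Jun 2018": 54, "Oct 2018": 52, "Nov 2018": 52,
    "Dec 2018": 52,
}

TOTAL_QUESTIONS = 58

# Margin for error: questions a student could miss and still reach 700.
margin = {date: TOTAL_QUESTIONS - raw for date, raw in threshold_700.items()}

print(margin["Jan 2017"])  # 11 misses still allowed a 700 in January 2017
print(margin["Jun 2018"])  # only 4 misses allowed in June 2018
```

In other words, a student’s cushion shrank from eleven missed questions to as few as four in the span of eighteen months.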
SAT Writing scores show a similar trend. The scaled Writing score accounts for half of the EBRW score, maxing out at 400. For the sake of illustration and comparison, I’ll simply double that score to model a scaled score comparable to the Math examples above. To achieve the equivalent of a 760 on Writing, a student needed a raw score of 41–42 throughout 2016–2017, but that threshold jumped to 43 in June 2018 and has stayed there for the October, November, and December 2018 tests. Again, the bar is simply higher: students must be nearly perfect to earn the same scores they once reached on practice materials while missing more questions.
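The doubling step used for this comparison can be sketched as follows. The input of 380 is a hypothetical Writing contribution chosen to illustrate the 760 target, not a value from any specific test.

```python
def writing_on_800_scale(writing_contribution):
    """Double the Writing half of the EBRW score (max 400) to get a
    number comparable to the 800-point Math scale."""
    return writing_contribution * 2

# A Writing contribution of 380 models the 760 equivalent in the example.
print(writing_on_800_scale(380))  # 760
```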
How can students prepare for the SAT, knowing the practice material isn’t calibrated?
Students prepping for the SAT, and the tutors helping them prepare, must be aware of this inconsistency in scoring. Whenever possible, students should order the Question-and-Answer Service, available for the October, March, and May administrations, because it provides a current, accurately scaled test that will reflect future administrations far better than any available prep materials. We are all awaiting more accurate practice tests so that we can help our students adjust to the SAT’s new normal.