Can You Trust Your PSAT Score?
The wait is over and 2015 PSAT scores have finally arrived. Alas, these much-anticipated numbers have produced more questions than answers for students and educators. When the College Board first released the scores last week, many immediately compared scores and percentiles from the 2015 revised PSAT against last year’s PSAT numbers. We became aware of a potential issue when private school counselors in New York, Seattle, and Atlanta contacted us to report significant spikes in the number of students scoring in the top 1% nationally. Were more National Merit scholarships coming their way? Did students attending academically rigorous private schools have an edge on this revised PSAT?
When we expanded the conversation to include more than a dozen public school counselors at schools in cities across the nation, we heard the same story repeated again and again: many public schools were similarly experiencing a 1% bonanza. One school, ranked in the top 40-50 public schools in its region, reported that an unprecedented 42% of its juniors scored in the top 1% of test-takers. Others saw over 20% of their students in the top 1%. These results do not pass the sniff test. Have the rules of percentiles changed?
We’ve spent the past few days poring over both the datasets sent to us by various schools and any available documents from the College Board explaining the data (including the newest, PSAT/NMSQT: Understanding Scores 2015, which was released earlier this week). Here’s what we know and what we don’t.
A quick primer on scoring, percentiles, and index scores.
Before we dive into data, it’s important we all use the same vocabulary. This year’s PSAT results include several different scores and percentiles, including:
Raw score: The number of questions a student correctly answered per section. There’s no guessing penalty, so there’s no need to subtract for missed questions.
Test score: Converted scores that range from 8-38 in Reading, Writing, and Math. The Reading and Writing scores are added together and multiplied by 10 to calculate a student’s scaled EBRW score, and the Math score is multiplied by 20 to determine the student’s scaled Math score.
Scaled (Section) score: The converted score. The PSAT is scaled to a verbal section (i.e., Evidence-Based Reading and Writing) and a math section, each ranging from 160 to 760 points. The total combined scaled score ranges from 320 to 1520 points.
NMSC Selection Index: For juniors who qualify to compete in the National Merit competition, this is the score that will be used to determine eligibility. With a range from 48-228, it looks deceptively similar to the old 60-240 scale. The NMSC Index is calculated by adding a student’s three test scores (reading, math, and writing) and multiplying by 2. You can also derive it from the scaled section scores. To do this, double your verbal score, add it to your math score, and divide by 10. For example, a student who scored a 540 Verbal and a 610 Math would have an NMSC Selection Index of (540*2 + 610)/10, or 169.
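The two routes to the Selection Index described above can be checked against each other in a few lines of Python. This is only a sketch: the function names are ours, not the College Board's, and the test scores used (27 Reading, 27 Writing, 30.5 Math) are simply one assumed combination that produces the 540 Verbal and 610 Math from the example.

```python
def section_scores(reading, writing, math):
    """Convert test scores (8-38 each) to scaled section scores (160-760 each)."""
    ebrw = (reading + writing) * 10   # Evidence-Based Reading and Writing
    math_section = math * 20
    return ebrw, math_section


def selection_index_from_tests(reading, writing, math):
    """Selection Index from the three test scores: sum them, then double."""
    return (reading + writing + math) * 2


def selection_index_from_sections(ebrw, math_section):
    """Selection Index from scaled section scores: (2 * verbal + math) / 10."""
    return (2 * ebrw + math_section) / 10


# Assumed test scores that yield the 540 Verbal / 610 Math example.
ebrw, math_section = section_scores(27, 27, 30.5)
print(ebrw, math_section)
print(selection_index_from_tests(27, 27, 30.5))
print(selection_index_from_sections(ebrw, math_section))
```

Both routes give 169 for this student, and plugging in the minimum (8) and maximum (38) test scores reproduces the 48-228 range of the index.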
National Percentile A (Nationally Representative Sample Percentile): The first of the three main percentiles released (and the one prominently displayed at the top of every student’s score report), it is also the most dubious. The College Board defines this (on page 6 of its score explanation document) as a percentile “derived via a research study sample of U.S. students in the student’s grade (10th or 11th), weighted to represent all U.S. students in that grade, regardless of whether they typically take the PSAT/NMSQT.” Translation: this sample includes scores from students who don’t typically take the test, i.e., those who are least likely to be college bound and who are most likely to score in the bottom quartile of test-takers.
National Percentile B (PSAT/NMSQT User Percentile – National): This is essentially the same as the Nationally Representative Sample Percentile, except that it is weighted to represent “students who typically take the PSAT/NMSQT.” Translation: this sample drops the scores of the students unlikely to take the test. Unsurprisingly, a student’s User percentile was, on average, about two points lower than their Nationally Representative percentile. For some students, the variance was as high as 5 percentage points.
NMSC Selection Index Percentile: When scores were released last week, educators were provided with a third set of percentiles, the “Selection Index” percentile. This was still calculated using a research sample, but the sample was limited to 11th grade students. This percentile is based on a student’s NMSC Selection Index (48-228) rather than the student’s scaled score (320-1520). For example, if a student earned a selection index score of 205+ out of 228, they scored in the 99th percentile using the selection index percentiles. If you don’t know what your selection index percentile is, you can find it on page 11 of College Board’s Guide to Understanding PSAT scores.
What’s going on with the percentiles?
From the data we’ve had access to, three facts start to emerge:
- There appears to be a general inflation in student percentile rankings as compared to last year. This is true regardless of which percentile you choose as the reference (Nationally Representative Sample, National User, or Selection Index).
- An individual student’s percentile ranking can vary by more than 20 points, depending on the percentile used. One student in our sample was in the 54th percentile according to the Nationally Representative Sample percentile and in the 32nd percentile according to the Selection Index percentile.
- There is sometimes variance between students’ reported percentile rankings, even if those students earn exactly the same score. This one, frankly, we cannot explain.
Comparison to Last Year: General Inflationary Trend
Until this year, a student’s percentile on the PSAT was based upon the scores earned by all of the juniors who took the PSAT in the previous year. The percentiles reported at each score varied minimally from year to year, as seen when we compare the percentile charts from 2012-2014:
This year, percentiles are instead based upon a “nationally representative sample” of students. We would expect that, if this sample were truly representative and the concordance tables produced by the College Board were reliable and equipercentile (as claimed), we would see a similar curve when we converted the 2015 scores to the 60-240 scale using those tables and then layered the concordant scores onto the graph. Instead, a very different picture emerges:
Notice how far above the other lines our new, concorded 2015 scores lie. The inflation in percentiles is easy to see for students who earned concorded 2015 scores between about 100 and 180. The graph masks the differences for the very top scorers, but those differences still exist. Consider the top-scorers in our sample data:
95 of the juniors in our sample (29.4%!) earned 2015 PSAT scores of 205 or higher, placing them in the “99th percentile” according to the selection index percentiles provided. When we use the students’ math, reading, and writing scores to concord their 2015 PSAT scores to 2014 PSAT scores, more than 60% of these students fall out of the 99th percentile, some to as low as the 95th percentile. After converting the scores, our 95 “National Merit hopefuls” fell to 37, much more typical for this subset of schools.
Thus far, the biggest hurdle to comparing 2015 percentiles with concorded percentiles has been that the College Board has not published percentile charts for the individual test scores (Reading, Writing, and Math). We used the percentiles reported by College Board in our dataset to build these charts. Where a reported value is “–” we did not have any juniors in our dataset who earned that score, so we could not report a value for the 2015 percentile. Notice that, at almost every score in almost every section, a student’s 2015 User Percentile is higher than his or her concorded percentile, sometimes by as much as 15 percentile points.
|2015 Reading Test Score||2015 Reading Percentile (National User, 11th grade)||Concorded 2014 Reading Score||Concorded 2014 Reading Percentile (PSAT, 11th grade)|
|2015 Writing Test Score||2015 Writing Percentile (National User, 11th grade)||Concorded 2014 Writing Score||Concorded 2014 Writing Percentile (PSAT, 11th grade)|
|2015 Math Test Score (8-38)||2015 Math Percentile (National User, 11th grade)||Concorded 2014 Math Score (20-80)||Concorded 2014 Math Percentile (PSAT, 11th grade)|
From these charts, you can “deflate” the scores and compare your 2015 PSAT National Merit Selection Index (or your school’s average index) to a concorded 2014 PSAT score. For example, a school counselor asked us if the average selection index of her juniors this year (187) was better or worse than the school’s past average (180). You would think that 187/228 is better than 180/240, right? Well, let’s use the tables to compare:
To convert this year’s 187 to a 2014 PSAT score, you need to know the average score on each of the three tests. Let’s say reading was an average 31, math 31.5, and writing 31 (notice that (31+31.5+31)*2 gives us 187). This may be off a point or two due to rounding, but it should be close to the total average selection index.
Using the tables above, we see that the 31 reading score concords to a 59 on last year’s PSAT. A 31.5 math concords to a 64, and a 31 writing concords to a 57. Adding those together, we get a total concorded PSAT score of 59+64+57, or 180. We don’t even have to use the 2014 percentile tables to know that 180 = 180 — the school performed exactly the same this year as it did last year, once we deflate the percentiles.
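The same "deflation" can be sketched in Python. The lookup table below contains only the three concordance values quoted in the example; it is a stand-in for the full concordance tables, and the names are our own.

```python
# Only the concordance values quoted in the worked example are included.
# A real lookup would need the College Board's full concordance tables.
CONCORDANCE_2015_TO_2014 = {
    "reading": {31: 59},
    "math": {31.5: 64},
    "writing": {31: 57},
}


def concorded_2014_total(test_scores):
    """Sum the concorded 2014 section scores (20-80 each) for 2015 test scores.

    Raises KeyError for any score missing from the partial table above.
    """
    return sum(
        CONCORDANCE_2015_TO_2014[section][score]
        for section, score in test_scores.items()
    )


school_average = {"reading": 31, "math": 31.5, "writing": 31}

# 2015 Selection Index: sum of the three test scores, doubled.
selection_index_2015 = sum(school_average.values()) * 2
concorded_2014 = concorded_2014_total(school_average)

print(selection_index_2015)  # 2015 index on the 48-228 scale
print(concorded_2014)        # concorded score on the old 60-240 scale
```

For this school the 2015 index of 187 concords to 180, which matches the school's past average on the old scale.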
Why the increase in percentiles?
There are several possible explanations, and we don’t claim to have the answers. A few possible explanations might be the following:
- The “nationally representative sample” used to develop the percentile tables wasn’t particularly representative;
- The students who took the test this year varied in a statistically significant way from students who have taken the test in years past;
- The preliminary concordance tables are specious and will be significantly reworked at some point (revised tables are slated to arrive in May, following the administration of the March and May SATs);
- The College Board’s new percentile charts fail to properly account for the change in distribution that results from eliminating the guessing penalty and shifting the number of answer choices from 5 to 4;
- Something else entirely is going on.
It is interesting to note the change in the percentile curves from 2014 to 2015:
In both, we’ve drawn a vertical line to indicate where the 50th percentile would be if the test were distributed across a normal curve. In 2014, more than 50% of students performed below the expected median. In 2015, that has shifted. This year, because of the elimination of the guessing penalty and the reduction in the number of answer choices from five to four, there is a very long “tail” on the left end of the curve, similar to that found on the ACT. Given that students can now randomly guess their way to a higher raw score, it’s much harder to earn a very low score, and all students are seeing the “benefit” of these changes.
You may find yourself thinking, “Wait, percentile is percentile. If you’re in the 89th percentile, that should mean you perform as well as or better than 89% of the people who take the test. Who cares if the curve shifts to the right? That should just push the scores needed to be at a given percentile higher.”
This is true, but the College Board has very little incentive to quickly “fix” any inflated percentiles. If students feel like they are doing better on this test than on the ACT, they may elect to take the New SAT in the spring. Of course, if and when the percentiles are eventually deflated, that could come back to bite them.
Variation within an Individual Student’s Scores
General inflation isn’t the end of the story when it comes to interesting finds from the new PSAT results.
For all the reasons described above, we would expect an individual student’s percentile score from this year to differ from the student’s score last year, even if we were to hold constant other factors like growth over time or preparation for the exam. The truly surprising thing about this year’s PSAT scores is not that students scored differently than they did last year, but rather that a single student may score differently depending on which score you look at this year.
We believe the Nationally Representative Sample percentiles to be high, so let’s limit our comparison to the User percentiles and the Selection Index percentiles. Since the User percentiles are grade-specific and the Selection Index is specifically normed to 11th graders, let’s also limit our discussion to the 11th graders in our sample.
On average, students saw a small (1.55 point) difference between their User and Selection Index percentiles. For about 3% of students, though, the difference was at least 10 percentage points. Consider these two students from our sample:
|Student||Selection Index Percentile||National User Percentile||Difference|
|Student A||85th||75th||10 points|
|Student B||53rd||68th||15 points|
What accounts for this, and why does it matter?
Because of how the scores are calculated, math matters more for the nationally representative sample and national user percentiles. Math makes up ½ of the score for these percentiles, compared to ⅓ of the score in the selection index percentile.
This means that students with a large gap between their math and verbal scores can expect their percentile rankings to vary significantly depending on which percentile they’re looking at. Students who are particularly strong in math relative to verbal will probably see a user percentile that is higher than a selection index percentile, while students who struggle in math will likely find that their selection index percentile is the higher score. Indeed, when we look at the scores earned by the two example students described above, we find that the spread between their verbal and math scores was very large: the student with a selection index in the 85th percentile scored a 660 EBRW and a 440 Math (33 R, 33 W, 22 M). Our student with a selection index in the 53rd percentile scored a 430 EBRW and a 630 Math (21 R, 22 W, 31.5 M).
|Student||Total Scaled Score||Selection Index Percentile||National User Percentile||Difference|
|Student A (weak math)||1100||85th||75th||10 points|
|Student B (strong math)||1060||53rd||68th||15 points|
|Difference||40 points (statistically negligible, according to CB)||32 percentage points!||7 percentage points|
Notice that, while these two students earned total scaled scores that were within 40 points of one another (which, according to the College Board, is within the margin of error and could theoretically be earned by two students who are identically prepared), their selection index percentiles are separated by more than 30 percentile points. However, when the same two students are compared against one another using the national user percentiles, they are only 7 percentage points apart. Why? Math matters more. The student with the lower score did significantly better on math than verbal, so when her percentile is calculated using the scaled section scores, a 25 percentile point difference between the students disappears.
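To see why math weighs more heavily in one measure than the other, it helps to compute both scores for the two example students from their test scores. This is a rough sketch with our own function names. The Selection Index values it produces are derived from the test scores above using the published formula, not taken from the students' score reports.

```python
def total_scaled(reading, writing, math):
    """Total scaled score (320-1520): each math point counts 20, each verbal point 10."""
    return (reading + writing) * 10 + math * 20


def selection_index(reading, writing, math):
    """Selection Index (48-228): every test point counts 2, whatever the subject."""
    return (reading + writing + math) * 2


def math_weight(per_point_reading, per_point_writing, per_point_math):
    """Share of the score driven by the math test, given per-point weights."""
    total = per_point_reading + per_point_writing + per_point_math
    return per_point_math / total


# (Reading, Writing, Math) test scores for the two example students.
students = {"Student A": (33, 33, 22), "Student B": (21, 22, 31.5)}

for name, (r, w, m) in students.items():
    print(name, total_scaled(r, w, m), selection_index(r, w, m))

print(math_weight(10, 10, 20))  # math share of the total scaled score
print(math_weight(2, 2, 2))     # math share of the Selection Index
```

The two students end up 40 points apart on the 320-1520 total scale but 27 points apart on the much narrower 48-228 Selection Index scale, which is consistent with the large gap between their Selection Index percentiles.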
Why might this matter? We don’t yet know how colleges and universities are going to use the Redesigned SAT scores in their admissions processes. If they use concordance tables (and percentile rankings) based on the total, scaled section scores, students who are relatively stronger in math than verbal stand to earn a significant boost in their standing relative to other students. By contrast, students who are weak in math stand to lose. On the other hand, if colleges use the percentile rankings derived from looking at a student’s individual test scores (Reading, Writing, and Math), students who thought that they were well-positioned for admission because their overall score looks strong may find themselves falling behind if the spread between their math and verbal scores is too large.
Inconsistencies between Student Scores
One anomaly to note: sometimes, students with identical scores were given different percentiles. For example, seven sophomores in our dataset earned a total scaled score of 760. Six of these students were told that this score placed them in the 13th percentile of national users, and one was told that the score placed him in the 9th percentile of national users. We cannot explain this. So far, we’ve looked to see whether any of the following could account for it: Test Form, Individual Section Scores, Gender, Race/Ethnicity, Public/Private School Status. No dice. Our data came directly from the .csv files sent by College Board, so there shouldn’t be any data entry problems, either. We also considered the possibility it was due to differences in the section scores, but we’ve found students who’ve earned identical section scores yet still earn different percentiles for those sections. If you have an explanation or other ideas you’d like us to check into, we’d love to hear your thoughts.
The good news is that the variation is usually small (generally only 1 percentage point) and affected only 6% of the students in our sample.
The significance of PSAT scores: why this matters.
PSAT scores are meaningful for students and schools in several ways. First, the PSAT scores are used as an initial screening mechanism for the National Merit Scholarship. The cut-off scores vary by state, based on the performance of juniors at the very top of the scale.
When it comes to National Merit, it’s important to understand how spots are allocated. Contrary to popular belief, it’s not a simple top 1% in a state, or top 0.5% in a state. The National Merit Scholarship Corporation allocates Semifinalist spots proportionally based on the number of high school graduates in a given state, irrespective of how many students in that state take the PSAT or how they score relative to students nationally. That’s where the variability comes into play. If California has 13% of the high school graduates, it will be allocated 13% of the 16,000 Semifinalist spots. The top-scoring 2,080 eligible juniors in California would be named National Merit Semifinalists, regardless of whether they scored in the top 3% or the top 0.03% of test-takers.
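The allocation arithmetic is simple enough to sketch in Python. The 13% share for California is the hypothetical figure used above, and the function name and rounding rule are our own assumptions rather than a description of the exact procedure used.

```python
TOTAL_SEMIFINALISTS = 16_000  # approximate national total cited above


def semifinalist_allocation(share_of_graduates, total=TOTAL_SEMIFINALISTS):
    """Spots for a state, proportional to its share of high school graduates.

    Rounding to the nearest whole spot is an assumption made for illustration.
    """
    return round(share_of_graduates * total)


# Hypothetical 13% share of graduates, as in the California example.
print(semifinalist_allocation(0.13))
```

Note that nothing in this calculation depends on how a state's students actually scored, which is why the qualifying cut-off varies from state to state.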
The National Merit thresholds for the class of 2016 ranged from a high of 225 out of 240 for the District of Columbia to a low of 202 out of 240 for North and South Dakota, West Virginia, and Wyoming. We will not know this year’s National Merit cutoffs until September. Some articles on the web suggest that you should just subtract 12 points from last year’s cut-offs to estimate this year’s numbers (because last year the test was out of 240, and this year the selection index is out of 228). This is not a statistically valid approach and will almost certainly result in “estimated cut-off scores” that are too low.
Frankly, given the way the College Board has stamped “PRELIMINARY” on everything released, we’d suggest you sit tight and wait until they release the cut-offs in September. If you’re too anxious, a better (albeit still speculative) approach would be to use the tables above to convert your 2015 PSAT score to a 2014 score and see how that compares to the typical cut-offs for your state. Our guess? When College Board finally does release the cut-offs, many students in the “99th percentile” are going to find themselves with scores too low to qualify.
PSAT scores also factor significantly into the placement and academic tracking of students at many high schools. Some schools use PSAT scores to help determine whether a student will be eligible for an AP or honors level class. One very concerned school counselor shared that, according to the College Board, more than 30% of her kids were seemingly in the top 10% nationally, which would be anomalous for this school. Using this metric, how would the school be able to determine which students would be best placed in AP and honors classes? We advised that it might make more sense to ignore the national percentiles completely and simply put those students into the context of their relative performance for that high school. Additionally it may make more sense to rely much more heavily upon teacher recommendations in this year of PSAT transition and recalibration.
SAT Prediction and ACT Comparison
Finally, PSAT scores are typically a decent predictor of performance on the SAT and can be used to help students determine whether they should prepare for the SAT or the ACT. For many students, the PSAT is not nearly as important as the SAT or ACT score that will help them gain admission to college.
According to the College Board, the PSAT is “vertically scaled” such that a student would earn the same score on the SAT that they earn on the PSAT. They claim that a student “who took the PSAT/NMSQT and received a Math section score of 500 would expect to also get a 500 on the SAT or PSAT 8/9 if he or she had taken either of those tests on that same day; a score of 500 represents the same level of academic achievement on all three assessments.” This logic gets thin when it comes to the highest-scoring students (will someone who earns a perfect 1520 on the PSAT really not earn any additional points on the SAT?), but, for the most part, you can expect to earn the same score on the New SAT as you did on the PSAT.
How do a student’s anticipated SAT scores compare to his or her ACT scores? This is harder to say, given the changes to the test. We typically rely upon the national percentiles to determine whether a student should focus on the SAT or the ACT, and the published current-SAT-to-ACT concordance tables do just this. Given that it appears we cannot take the PSAT percentiles at face value, unless or until we receive additional guidance from the College Board, our advice is to concord 2015 PSAT scores to 2014 SAT scores and use the established tables to compare your ACT and concorded SAT scores.
Tell us about your PSAT experience!
We’d love to hear what you’re seeing with these new scores. Please share your thoughts and questions here!
Footnote 1. Throughout this article, when we talk about the general inflationary trend, we are using a dataset that contains only those schools for which we received complete results. This dataset includes both public and private schools, and we have no reason to believe that it is non-representative of the larger population. However, our sample represents 0.05% of the total results, so it is possible that our sample differs in a statistically meaningful way from the general population of test takers.