Should Selective College Admissions be Test-Optional?
A hundred years ago, the SAT expanded access to the Ivy League. Can history repeat itself?
It’s been a busy few months at Harvard. But I’m back with a two-part series on whether selective colleges should be test-optional. Part 1 (today) is about the historical context of college admissions testing. I’ll also walk you through some recent research on whether SAT/ACT scores predict later life success. Part 2, coming later this week, is about the strategic logic of test-optional admissions. I hope you enjoy it!
Two weeks ago, Dartmouth College announced that starting in Fall 2025 it would require U.S. applicants to submit SAT or ACT scores. This comes after several years of “test optional” admissions, which began at Dartmouth and many other universities during the pandemic. MIT reinstated college testing in 2022, and now Yale is considering bringing back the SAT/ACT as well. As David Leonhardt discusses in a recent New York Times Magazine article, these changes come after several years of anti-testing backlash.
In her announcement to the community, Dartmouth President Sian Beilock argued that college test scores are helpful in “identifying students from less-resourced backgrounds who would succeed at Dartmouth”. Harry Feder, the head of a nonprofit called FairTest that has been critical of college testing, gave this colorful quote to the Washington Post – “I think they really don’t care. I will say this quite bluntly: They are not there to be an institution of broad opportunity for the American student body. They are there to pluck and craft and create a perceived elite.”
If you’ve been living among the wolves for the past few years you might ask “what’s the big deal? Don’t most colleges require test scores already?”
(You might also be wondering about new additions to the cultural vernacular like “social distancing”, “essential workers”, and “Ivermectin”… trust me, some memories are better cast into the dustbin of history.)
Until very recently, most colleges required the SAT or the ACT, but the pandemic upended the college testing landscape. The consulting company Compass Education Group tracked testing policies at more than 400 selective colleges and universities and found that the number requiring the SAT or ACT shrank from 268 pre-pandemic to only 17 as of May 2023. About 75% of the colleges that switched to test-optional in 2020 declared the change temporary, pending further study. Among schools in the top 25 of the hated-but-still-used US News & World Report rankings, only MIT and Georgetown require applicants to take the SAT or the ACT.
The test-optional political movement has been around for a long time. Bowdoin, a highly selective college in Maine, has been test-optional for more than 50 years. About a hundred other colleges (mostly small, moderately selective, and private) have been test-optional since 2010 or earlier. To save you time, I unearthed a complete list from pages 8-11 of the supplementary appendix to a really nice academic paper on test-optional policies (you need me on that wall!).
Pressure for colleges to remain test-optional even after the pandemic is best understood as the continuation of a broader education policy trend away from standardized testing and assessment, especially in K-12 schooling. The zenith of testing was the bipartisan passage of the No Child Left Behind Act in 2002, which established federal requirements for regular testing and monitoring of student performance.1 Opposition to NCLB built steadily throughout the 2000s, and each successive federal reauthorization led to more state flexibility on testing requirements.
In response to a civil rights lawsuit in 2019, the University of California system banned consideration of SAT and ACT in admissions altogether (test-blind, not test-optional). More on the UC system later.
The interesting history of college testing
The SAT was first given in 1926 as a supplement to the mostly essay-based exams given by the Ivy League and other elite colleges. In 1933, Harvard President James Bryant Conant used the SAT to award scholarships to academically talented boys who did not attend the elite east coast boarding schools that mostly comprised Harvard’s applicant pool. Conant also expanded educational opportunity in other ways, abolishing athletic scholarships and promoting co-education at Harvard College and at the graduate schools of law and medicine. Other colleges soon followed Harvard’s lead, and by the late 1940s millions of applicants each year were taking the SAT. The ACT emerged as a national competitor to the SAT in 1959.
Although Conant’s original purpose was to make Harvard more accessible, there was also at the time an ugly eugenicist undercurrent to the rationale for standardized testing. The inventor of the test, Carl Brigham, thought of the SAT as a measure of fixed intellectual capacity. Unfortunately, he also wrote in 1923 that intelligence tests would prove the superiority of the “Nordic race”. However, to his credit, Brigham eventually recanted. By 1930, he had disavowed the practice of studying racial differences in test scores, arguing that the SAT was a “composite including schooling, family background, familiarity with English and everything else, relevant and irrelevant.”
The SAT is best viewed as an imperfect measurement of a young person’s preparation for college-level work. There are differences between people in academic talent, but the SAT measures much more than that. SAT scores are responsive to coaching, and people’s scores generally improve with age and with additional retakes (I know mine did!).
Let’s look at a question from the original 1926 SAT:
I happen to know the answer because I took Latin in high school and my daughter is taking it now (although she is in high school, so I get maybe 10 words per day out of her, usually in monosyllabic English). But I wasn’t born knowing the definition of the accusative case! I learned it only because I spent (wasted?) several years of my life learning a dead language.
Carl Brigham eventually got it right – SAT scores don’t measure innate talent, but rather a composite of talent, family resources, school quality, and many other things.
Given this reality, why does Dartmouth think that the SAT will help them identify talented students? A hundred years ago, James Bryant Conant used the SAT to accomplish the highly progressive goal of expanding access to elite colleges. It can still serve that purpose today, as long as we recognize that a high score is a much bigger accomplishment for some kids than for others.
College admissions test scores predict future success, even among disadvantaged students
Why bring the tests back at all? Can’t selective colleges predict who will succeed there using other information on an application?
Friedman, Sacerdote, and Tine (2024) assembled admissions records and first-year college grades from multiple (anonymous) Ivy-Plus colleges between 2017 and 2022 and asked whether SAT/ACT scores predict college grades, even after accounting for high school grades and lots of other data.
Controlling for gender, family income, race and ethnicity, legacy and recruited-athlete status, first-generation status, urbanicity, citizenship, high school average income, and high school GPA, students with 99th percentile SAT or ACT scores achieve a college GPA that is 0.43 points higher than students who score in the 75th percentile (about a 1200 SAT or a 25 ACT). The key figure is below:
In other words, comparing two applicants with the same demographics and the same high school grades, the one with a higher SAT/ACT score will on average have much better college grades (roughly an A-, or 3.7, rather than a B+, or 3.3).
SAT/ACT is an equally powerful predictor of college success when comparing two applicants with the same GPA and from the same high school (i.e., with high school fixed effects), which is a very strong test. Controlling for high school GPA and high school fixed effects is effectively controlling for class rank. It also accounts for unobserved differences across high schools in the rigor of the coursework, the number of AP classes offered, and other factors.
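To make the fixed-effects idea concrete, here is a minimal sketch using entirely synthetic data (these are made-up numbers, not the Friedman, Sacerdote, and Tine data): we simulate high schools that differ in unobserved grading rigor, so high school GPA is inflated at easy schools, and then run a within-school regression by demeaning every variable by school. The variable names and coefficients are all invented for illustration.

```python
import numpy as np

# Synthetic illustration of a high-school fixed-effects regression.
# All data and coefficients here are invented, not from the paper.
rng = np.random.default_rng(0)

n_schools, n_per = 50, 40
school = np.repeat(np.arange(n_schools), n_per)
rigor = rng.normal(0, 0.5, n_schools)  # unobserved school-level grading rigor

talent = rng.normal(0, 1, n_schools * n_per)
sat = talent + rng.normal(0, 0.5, talent.size)  # noisy measure of talent
# GPA is inflated at low-rigor schools, which confounds cross-school comparisons:
hs_gpa = talent - rigor[school] + rng.normal(0, 0.5, talent.size)
college_gpa = 0.5 * talent + rng.normal(0, 0.5, talent.size)

def demean(x):
    """Subtract each school's mean -- this is what 'fixed effects' buys you."""
    means = np.bincount(school, weights=x) / np.bincount(school)
    return x - means[school]

# Within-school regression of college GPA on SAT and high school GPA:
X = np.column_stack([demean(sat), demean(hs_gpa)])
y = demean(college_gpa)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # SAT retains predictive power even holding the school fixed
```

Demeaning by school wipes out anything constant within a school (like grading rigor), so the surviving SAT coefficient reflects within-school comparisons, which is the sense in which the fixed-effects specification is "effectively controlling for class rank."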
Notice the dot on the far right. The authors looked at the average college performance of the admitted students in post-pandemic classes who did not submit an SAT score. Their GPA was about 3.38, well below the sample average of 3.49. 40% of students with an SAT score of 1200 and 35% of those who did not submit a score ended up with at least one first year grade of C+ or lower, compared to only about 15% of those with the highest possible test scores.
The bottom line is that SAT/ACT scores tell you a lot about who is prepared for college-level work. Importantly, this holds for disadvantaged students as well. The authors split the prediction into applicants coming from more vs. less-advantaged high schools and found that SAT is roughly equally predictive within each group (see Finding 3 from their short brief).
What about longer-run outcomes? In my paper with Chetty and Friedman, we produced a similar set of plots for predicted future earnings and for elite graduate school attendance.2 The left-hand panel shows that higher SAT scores strongly predict having higher earnings, even after controlling for high school GPA and income, gender, race, and legacy and athlete status. The right-hand panel shows that the reverse is not true – after controlling for SAT and demographics, high school GPA is unrelated to future income.
High school GPA does predict elite graduate school attendance, although the slope is substantially flatter than it is for SAT. Going from an SAT score of 1300 to 1600 effectively doubles (from 10% to 20%) the chance that an Ivy-Plus college applicant will eventually attend a top graduate school. Going from a high school GPA of 3.3 to 4.0 increases the likelihood of elite graduate school attendance by only about 2 percentage points (from 13% to 15%).
Reasons why I might be wrong about the predictive power of SAT/ACT
I’ve seen two main criticisms of the argument that college test scores are highly predictive of future success.3 Jesse Rothstein argues that the findings above only hold at highly selective colleges where almost every applicant has a high GPA, and thus the poor predictive power of high school GPA is an artifact of range restriction. This is a fair point. SAT and ACT are probably most valuable for highly selective colleges, which are deciding between talented students from all around the world. Rothstein’s 2004 paper finds that in the University of California system, the SAT does predict college grades after controlling for high school GPA, but the relationship is weaker than in our setting. This makes sense, both because of range restriction issues and because it is much easier to make GPA comparisons across applicants from public high schools within a single state.4
The second criticism, which I attribute to my HKS colleague Sharad Goel, is that we are rigging the game by comparing SAT/ACT to a simple metric like high school GPA. College applications contain much richer information than GPA, such as the quality and depth of coursework, extracurriculars, essays, etc. If we ran a massive machine learning model on the data we would eventually get the marginal predictive power of the SAT down to zero.
This might be true, but it misses the point. Predictions can be highly accurate but for the wrong reasons. For example, suppose your fancy ML model tells you that applicants who take college courses in high school and get As have especially good outcomes. How do you interpret that result? It’s probably true in a predictive sense. But you also have a missing data problem. Some students, especially those from under-resourced high schools, never get the opportunity to take college courses. If they did, maybe they would get A grades too, but the prediction doesn’t give them credit for their talents. The key advantage of college tests like SAT and ACT is that everyone can be required to take them, and they are scored on a common scale.
There is a procedural fairness to college admissions testing. That’s worth defending in principle, while still acknowledging and solving the many problems that arise in practice.
For example, a troubling finding by Goodman, Gurantz, and Smith (2020) is that high-income students are much more likely to retake the SAT and that retakes tend to improve scores by 40 to 50 points. One obvious solution is to end the recent practice of “superscoring”, which gives test-takers the highest score in each section across multiple assessments. Colleges could instead require applicants to submit all scores, following Georgetown’s lead. Or they could allow people to take the test only once.
Using SAT/ACT instead of other measures of academic preparation doesn’t necessarily conflict with normative goals like having a diverse college class, which I strongly support. Remember, the figures above showed that SAT/ACT predict later life success within demographic groups. I think elite colleges should be more income diverse. I want to impose that constraint on admissions offices, and then have them admit the most academically prepared students, subject to those and other constraints.
In the next installment I’ll explain why “test optional” is worse than test-blind, and how Dartmouth’s analysis of their own policy led them to reinstate the SAT.
Based on a huge body of evidence, including some of my own work, I’d argue that NCLB was on balance a good thing for kids in low-income urban school districts, although it also led to a variety of strategic responses to high-stakes testing ranging from humorous (increasing the caloric content of school lunches on testing day) to harmful (cheating scandals in Atlanta and other places).
The earnings prediction is complicated – see my explanation here if you want to understand how we did it and why you should believe it. “Elite” graduate school is defined in the paper, but you can think of it roughly as top 10 graduate programs across a variety of fields of study.
Let me also address a widely discussed critique by Jake Vigdor on X. He argues that the figures shown above are misleading because they should focus on the r-squared of the prediction, not the slope of the relationship. I disagree. Suppose a college admissions office is deciding between applicants and they have a choice of whether to observe that person’s SAT score. The slope of the line in the graph above answers the question “What is our best guess of how much higher this applicant’s grades will be if their SAT score is X instead of Y?” The R-squared answers the question “how confident are you that your prediction is correct?” I agree that more precise predictions are better, but if you must decide between one applicant and another, you want to know the slope. I also think he misinterprets the implications of some of the results – you can see my response here if you are interested.
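The slope-versus-R-squared distinction is easy to see in a toy simulation. The sketch below uses purely synthetic data (invented numbers, not anything from the papers discussed): even when the score explains only a small share of outcome variance, the slope is large and precisely estimated, and the slope is what answers the admissions office's question.

```python
import numpy as np

# Toy synthetic example: a predictor can have a large, reliably estimated
# slope even when its R-squared is small. Numbers are invented for illustration.
rng = np.random.default_rng(1)

n = 100_000
score = rng.normal(0, 1, n)                    # standardized "test score"
gpa = 0.3 * score + rng.normal(0, 1, n)        # true slope 0.3, lots of noise

slope = np.cov(score, gpa)[0, 1] / np.var(score, ddof=1)
r_squared = np.corrcoef(score, gpa)[0, 1] ** 2

# The slope answers "how much higher do we expect the outcome to be if the
# score is one SD higher?"; R-squared answers "how much outcome variance
# does the score explain?" Here the slope is ~0.3 but R-squared is under 0.1.
print(slope, r_squared)
```

Two applicants who differ by one standard deviation in score differ by about 0.3 in expected outcome, a decision-relevant gap, even though most of the outcome variance comes from things the score does not capture.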
Interestingly, Rothstein and two other esteemed education economists, Michal Kurlaender and Sarah Reber, have argued that the UC system should replace the SAT with the state test administered to all public high school students (called the SBAC). I agree! I don’t think there’s anything special about the SAT or the ACT; I just like the idea of having a common metric of college preparation that is given to all applicants.
That the SAT is "an imperfect measurement" is true. It is also true that everything else in a college application is an imperfect measurement of the qualities being looked for, and the SAT is relatively reliable by comparison, particularly considering it is a three-hour test. It does not primarily measure "a young person’s preparation for college-level work." Academic preparation is better measured by AP test scores, the old SAT II achievement tests, and, for high schools that have rigorous required courses and honest grades (are there any left?), high school GPA. The SAT is what for much of its history it was called, a "scholastic aptitude test." It is useful in identifying (1) students with the potential to do well in college who have relatively unimpressive high school grades because of lack of diligence, unchallenging schools, or other reasons, and (2) students who have done well in high school due to exceptional diligence or easy grading but will probably be unable to duplicate that success at the college level. It measures, imperfectly, academic potential, not preparation. Sending students to a year of prep school after graduation might make them better prepared for college, but it would not cure low SAT scores.
It is inevitable that selective colleges will return to the SAT or something equivalent. These colleges are prepared to make significant academic quality tradeoffs to achieve their diversity objectives, but within each applicant category they want the smartest kids they can find, and the SAT is the most useful tool that is readily available to identify those kids. Colleges that use SAT scores in admissions will have a significant advantage in terms of student body academic quality over those that do not.
The recent SAT studies cited simply confirm the findings of the truly enormous body of previous SAT studies and technical literature which the College Board and independent researchers have amassed over the last century.
“In response to a civil rights lawsuit in 2019, the University of California system banned consideration of SAT and ACT in admissions altogether (test-blind, not test-optional). More on the UC system later.”
The reporting on this has been bad so I don’t blame you for this frame. This really should be seen as a collusive use of the Court system to evade democratic accountability. The UC Regents had been looking to ban testing, and had in fact commissioned an internal academic study that they expected would show it was racist and not useful to predict student outcomes. The faculty study did not in fact find that, and recommended keeping the SAT. https://senate.universityofcalifornia.edu/_files/underreview/sttf-report.pdf#page66
They then turned to this case as a reason to do what they already wanted to do. It was essentially not an adversarial court proceeding at all.