New York City Mayor Bill de Blasio has been pushing a plan to change the way students are selected for New York’s elite academic high schools. His plan is probably dead (it requires approval at the state level), but it got a boost this week with the news that only seven black students were accepted into Stuyvesant High School, one of those elite campuses. Overall, the incoming freshman class at the eight elite schools was only 3 percent black in a school system that’s a quarter black.
Over at Vox, José Vilson, a NYC public school teacher, explains what happened:
None of this is by accident….New York State passed the Calandra-Hecht Act in 1971 which stated that “admissions to [these specialized high schools] shall be solely and exclusively by taking a competitive, objective and scholastic achievement examination.”…Essentially, these schools enshrined into law the right to ignore school performance, grades, interviews, standardized state exams, or any other qualification in favor of a test that rarely aligns with the standards they learn in school, tacitly keeping these schools out of reach for under-resourced students and schools. The specialized high schools continue to exemplify why New York City has the most segregated school system in the country.
The Specialized High School Admission Test, much like the IQ tests of yore and the SAT or ACT of the present, has been gamed since its inception. Everything from expensive test prep centers concentrated in specific neighborhoods to private tutors who spend hours with students across the city helps exacerbate admissions, and with it racial disparity.
I think progressives are ill-served by the persistent notion that the black-white gap is mostly due to things like test prep and biased tests. Over the past several decades, the organizations that create these tests have gone to considerable lengths to address racial bias, and they’ve been largely successful. The tests aren’t perfect, and they have flaws quite aside from any questions of race, but they aren’t terrible either. They also show a consistent but complicated pattern. Here’s a chart showing racial gaps for a lifetime of student testing:
There are several things to understand about these results:
- The black-white gap shows up as early as kindergarten and primary school—long before test prep classes come into play—and continues all the way through graduate level tests like the LSAT.
- Since every test has a different scale (120-180 for the LSAT, 400-1600 for the SAT, etc.), you can’t compare them using raw scores. Instead, you have to express each gap in standard deviations: the difference between the two group averages divided by the test’s standard deviation. The chart above does this, with the black-white gap shown for each age group, and you can see that the black-white gap increases over time. Very roughly, the gap is 0.5 SD in kindergarten, 0.7 in fourth grade, 0.8 in eighth grade, 0.9 in high school, and 1.1 at the graduate level.
- With the possible exception of the initial kindergarten gap, these gaps continue to show up even after you control for income, class, parental education, test prep, etc.
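For anyone who wants to see the arithmetic behind the standard deviation conversion, here is a minimal sketch in Python. The function name and every number in it are hypothetical, invented purely for illustration; they are not real test data and not the figures behind the chart. It also assumes the gap is divided by the standard deviation of all test takers, which is one common convention but not the only one.

```python
def standardized_gap(mean_a: float, mean_b: float, sd: float) -> float:
    """Difference between two group means, in units of the test's standard deviation."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (mean_a - mean_b) / sd


# Hypothetical numbers, made up for illustration only (NOT real test results).
# The two scales mirror the 400-1600 and 120-180 ranges mentioned above.
tests = {
    "test on a 400-1600 scale": {"mean_a": 1100.0, "mean_b": 900.0, "sd": 200.0},
    "test on a 120-180 scale": {"mean_a": 153.0, "mean_b": 143.0, "sd": 10.0},
}

# Convert each raw gap into standard deviation units so the tests are comparable.
gaps = {name: standardized_gap(**values) for name, values in tests.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:.1f} SD")
```

In this made-up example the raw gaps look wildly different (200 points versus 10 points), but both work out to the same gap once each is divided by its own test's standard deviation. That is the sense in which the conversion makes different tests comparable.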
These gaps are real effects of education, not just an artifact of test-taking, and the fact that the gaps increase over time is good evidence that much of the fault lies with our schools and the communities they serve. We miss this if we insist that standardized tests are useless. After all, if there’s no “real” gap at all, then our schools must be doing fine.
I’m no expert in how to close this gap, though I can say that there have been many dozens of serious efforts—some aimed specifically at schools, others aimed at parents and communities—and virtually all of them have failed. In any case, we shouldn’t pretend there’s nothing here except a bunch of racist test constructors. The black-white performance gap in America is real; it’s a national disgrace; and we can’t give up trying to fix it. If we could figure out how, no matter how much it cost, I’d take it over the mythical hope of reparations any day.
UPDATE: I have replaced the original chart with one that shows test results as standard deviations from the mean. This is a way of comparing different tests with different raw scale scores.