There’s a Bigger Difference Between 6 and 10 Than You Think

What the hell?

This is exactly what it looks like. A large research university decided to switch its teaching evaluation surveys from a 10-point scale to a 6-point scale. In most fields, this made little difference. But in fields that are traditionally male-dominated, the enormous gender gap in evaluations disappeared. Why?

This chart comes from a new paper by Lauren Rivera and András Tilcsik, and they don’t really seem to know either. Here’s what they say:

Drawing from a complementary survey experiment, we show that this effect is not due to gender differences in instructor quality. Rather, it is driven by differences in the cultural meanings and stereotypes raters attach to specific numeric scales. Whereas the top score on a 10-point scale elicited images of exceptional or perfect performance—and, as a result, activated gender stereotypes of brilliance manifest in raters’ hesitation to assign women top scores—the top score on the 6-point scale did not carry such strong performance expectations. Under the 6-point system, evaluators recognized a wider variety of performances—and, critically, performers—as meriting top marks. Consequently, our results show that the structure of rating systems can shape the evaluation of women’s and men’s relative performance and alter the magnitude of gender inequalities in organizations.

In other words, students viewed a 9 or 10 on a scale of 1-10 as implying true brilliance, and they were reluctant to rate female instructors as brilliant. A 6 on a scale of 1-6 doesn’t carry the same connotations: students interpret it as really good, but not necessarily brilliant. Because of that, they were perfectly happy to give the top female instructors the top score.

Do you believe this? Do I believe it? Beats me. The sample size in the study is large, so that’s not a problem. The switch to a 6-point scale was unrelated to gender concerns, so that’s not an issue. The modeling appears to be reasonable. And the change in results is large. The effect sure seems real, but it’s still anyone’s guess why it exists and why it’s so large. Given my respect for cognitive biases like framing effects, the authors’ explanation seems OK to me, but it’s still a bit of a guess. I’d sure like to hear a few other people weigh in.


