One of the big ideas in the study of educational differences is “stereotype threat.” This is the idea that you will do worse on a test of your skills if someone tells you that people in your group don’t perform well. If you tell people that they suck at math, they will get anxious and suck at math.
It’s a popular idea because it allows you to evade the possibility that ability, skill, or effort differences might be the source of test score gaps. It’s also widely cited: the original papers presenting this hypothesis and elaborating the theory have been cited thousands of times.
Thus, I was interested to see that a team of researchers sought to replicate the stereotype threat hypothesis for gender (i.e., telling men/women that men/women are bad at math will lower math test scores for men/women). Stoevenbelt et al. (2024) ran a replication with a large N (1297) and with experiments run by eight labs in multiple countries. This is important because, like many topics in social psychology, stereotype threat studies were done with small sample sizes and a lack of uniform lab conditions. Thus, stereotype threat hasn’t really been replicated, at least by modern standards. In this analysis, they also tested the hypothesis that you could reduce test score gaps by teaching participants about stereotype threat before the test.
So what did this team find? From page 21:
Based on this pattern of results, we consider the effects found by Johns et al. (2005) as not replicated. We did not find a significant interaction effect between gender and the contrast comparing the stereotype-threat condition to the other conditions, in contrast to the findings from Johns et al. (2005). Moreover, we did not find an alleviating effect of the teaching-intervention condition for women.
Ouch.
In the replication paper, the authors admit that maybe stereotype threat was much stronger in the cultural environment in 2005, when the original papers were written. Maybe, but I think that sort of argument would be stronger had the original papers been written with modern experimental standards. Also, if you think about the cultural environment post-2020 and post-#MeToo, when the replication was done, people might be touchier about gender issues. As far as I understand it, stereotype threat implies that confrontations with stereotypes lead to diminished memory and executive control during task completion. That certainly applies to the post-woke, social media environment we have today. If you say “sexism” before a test in 2025, I’d expect a lot of folks to ruin their concentration and lose their shit.
Am I ready to throw in the towel on stereotype threat? Not necessarily, but any explanation of academic performance should start with basics like intelligence, hours studied, and family background and then move on to other, more conceptually delicate explanations like stereotype threat. If you can do that and then document the hypothesized effect with nice, large-N replicated studies, I’m open to it.
Bottom line: Higher standards in soc psych claim another victim.