1. Introduction
In the tropical forests of New Guinea, the Etoro believe that for a boy to achieve manhood he must ingest the semen of his elders. This is accomplished through ritualized rites of passage that require young male initiates to fellate a senior member (Herdt 1984/1993; Kelley 1980). In contrast, the nearby Kaluli maintain that male initiation is only properly done by ritually delivering the semen through the initiate's anus, not his mouth. The Etoro revile these Kaluli practices, finding them disgusting. To become a man in these societies, and eventually take a wife, every boy undergoes these initiations. Such boy-inseminating practices, which are enmeshed in rich systems of meaning and imbued with local cultural values, were not uncommon among the traditional societies of Melanesia and Aboriginal Australia (Herdt 1984/1993), as well as in Ancient Greece and Tokugawa Japan.
Such in-depth studies of seemingly “exotic” societies, historically the province of anthropology, are crucial for understanding human behavioral and psychological variation. However, this target article is not about these peoples. It is about a truly unusual group: people from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies. In particular, it is about the Western, and more specifically American, undergraduates who form the bulk of the database in the experimental branches of psychology, cognitive science, and economics, as well as allied fields (hereafter collectively labeled the “behavioral sciences”). Given that scientific knowledge about human psychology is largely based on findings from this subpopulation, we ask just how representative are these typical subjects in light of the available comparative database. How justified are researchers in assuming a species-level generality for their findings? Here, we review the evidence regarding how WEIRD people compare with other populations.
We pursued this question by constructing an empirical review of studies involving large-scale comparative experimentation on important psychological or behavioral variables. Although such larger-scale studies are highly informative, they are rather rare, especially when compared to the frequency of species-generalizing claims. When such comparative projects were absent, we relied on large assemblies of studies comparing two or three populations, and, when available, on meta-analyses.
Of course, researchers do not implicitly assume psychological or motivational universality with everything they study. The present review does not address those phenomena assessed by individual difference measures for which the guiding assumption is variability among populations. Phenomena such as personal values, emotional expressiveness, and personality traits are expected a priori to vary across individuals, and by extension, societies. Indeed, the goal of much research on these topics is to identify the ways that people and societies differ from one another. For example, a number of large projects have sought to map out the world on dimensions such as values (Hofstede 2001; Inglehart et al. 1998; Schwartz & Bilsky 1990), personality traits (e.g., McCrae et al. 2005; Schmitt et al. 2007), and levels of happiness (e.g., Diener et al. 1995). Similarly, we avoid the vast psychopathology literature, which finds much evidence for both variability and universality in psychological pathologies (Kleinman 1988; Tseng 2001), because this work focuses on individual-level (and unusual) variations in psychological functioning. Instead, we restrict our exploration to those domains which have largely been assumed, at least until recently, to be de facto psychological universals.
Finally, we also do not address societal-level behavioral universals, or claims thereof, related to phenomena such as dancing, fire making, cooking, kinship systems, body adornment, play, trade, and grammar, for two reasons. First, at this surface level alone, such phenomena do not make specific claims about universal underlying psychological or motivational processes. Second, systematic, quantitative, comparative data based on individual-level measures are typically lacking for these domains.
Our examination of the representativeness of WEIRD subjects is necessarily restricted to the rather limited database currently available. We have organized our presentation into a series of telescoping contrasts showing, at each level of contrast, how WEIRD people measure up relative to the available reference populations. Our first contrast compares people from modern industrialized societies with those from small-scale societies. Our second telescoping stage contrasts people from Western societies with those from non-Western industrialized societies. Next, we contrast Americans with people from other Western societies. Finally, we contrast university-educated Americans with non–university-educated Americans, or university students with non-student adults, depending on the available data. At each level we discuss behavioral and psychological phenomena for which there are available comparative data, and we assess how WEIRD people compare with other samples.
We emphasize that our presentation of telescoping contrasts is only a rhetorical approach guided by the nature of the available data. It should not be taken as capturing any unidimensional continuum, or suggesting any single theoretical explanation for the variation. Throughout this article we take no position regarding the substantive origins of the observed differences between populations. While many of the differences are probably cultural in nature in that they were socially transmitted (Boyd & Richerson 1985; Nisbett et al. 2001), other differences are likely environmental and represent some form of non-cultural phenotypic plasticity, which may be developmental or facultative, as well as either adaptive or maladaptive (Gangestad et al. 2006; Tooby & Cosmides 1992). Other population differences could arise from genetic variation, as observed for lactose processing (Beja-Pereira et al. 2003). Regardless of the reasons underlying these population differences, our concern is whether researchers can reasonably generalize from WEIRD samples to humanity at large.
Many radical versions of interpretivism and cultural relativity deny any shared commonalities in human psychologies across populations (e.g., Gergen 1973; see critique and discussion in Slingerland 2008, Ch. 2). To the contrary, we expect humans from all societies to share, and probably share substantially, basic aspects of cognition, motivation, and behavior. As researchers who see great value in applying evolutionary thinking to psychology and behavior, we have little doubt that if a full accounting were taken across all domains among peoples past and present, the number of similarities would indeed be large, as much ethnographic work suggests (e.g., Brown 1991) – ultimately, of course, this is an empirical question. Thus, our thesis is not that humans share few basic psychological properties or processes; rather, we question our current ability to distinguish these reliably developing aspects of human psychology from more developmentally, culturally, or environmentally contingent aspects of our psychology given the disproportionate reliance on WEIRD subjects. Our aim here, then, is to inspire efforts to place knowledge of such universal features of psychology on a firmer footing by empirically addressing, rather than a priori dismissing or ignoring, questions of population variability.
2. Background
Before commencing with our telescoping contrasts, we first discuss two observations regarding the existing literature: (1) the database in the behavioral sciences is drawn from an extremely narrow slice of human diversity; and (2) behavioral scientists routinely assume, at least implicitly, that their findings from this narrow slice generalize to the species.
2.1. The behavioral sciences database is narrow
Who are the people studied in behavioral science research? A recent analysis of the top journals in six subdisciplines of psychology from 2003 to 2007 revealed that 68% of subjects came from the United States, and a full 96% of subjects were from Western industrialized countries, specifically those in North America and Europe, as well as Australia and Israel (Arnett 2008). The make-up of these samples appears to largely reflect the country of residence of the authors, as 73% of first authors were at American universities, and 99% were at universities in Western countries. This means that 96% of psychological samples come from countries with only 12% of the world's population.
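To make the scale of this imbalance concrete, the two percentages above can be turned into a per-capita over-representation ratio. The following is only a back-of-the-envelope sketch, not a calculation from Arnett's analysis: the function name `representation_ratio` is hypothetical, and it assumes the sample share and the population share refer to the same set of countries.

```python
def representation_ratio(sample_share: float, population_share: float) -> float:
    """Per-capita over-representation of a group relative to everyone else.

    sample_share:     fraction of research samples drawn from the group
    population_share: fraction of the world's population living in the group
    """
    # Samples per unit of population inside the group.
    inside = sample_share / population_share
    # Samples per unit of population outside the group.
    outside = (1.0 - sample_share) / (1.0 - population_share)
    return inside / outside


# Figures quoted in the text: 96% of samples, 12% of world population.
ratio = representation_ratio(sample_share=0.96, population_share=0.12)
print(round(ratio))  # roughly 176
```

Under these assumptions, a person living in one of the sampled Western countries is represented in the database at well over a hundred times the rate of a person living elsewhere.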
Even within the West, however, the typical sampling method for experimental studies is far from representative. In the Journal of Personality and Social Psychology, the premier journal in social psychology – the subdiscipline of psychology that should (arguably) be the most attentive to questions about the subjects' backgrounds – 67% of the American samples (and 80% of the samples from other countries) were composed solely of undergraduates in psychology courses (Arnett 2008). In other words, a randomly selected American undergraduate is more than 4,000 times more likely to be a research participant than is a randomly selected person from outside of the West. Furthermore, this tendency to rely on undergraduate samples has not decreased over time (Peterson 2001; Wintre et al. 2001). Such studies are therefore sampling from a rather limited subpopulation within each country (see Rozin 2001).
It is possible that the dominance of American authors in psychology publications just reflects that American universities have the resources to attract the best international researchers, and that similar tendencies exist in other fields. However, psychology is a distinct outlier here: 70% of all psychology citations come from the United States – a larger percentage than any of the other 19 sciences that were compared in one extensive international survey (see May 1997). In chemistry, by contrast, the percentage of citations that come from the United States is only 37%. It seems problematic that the discipline in which there are the strongest theoretical reasons to anticipate population-level variation is precisely the discipline in which the American bias for research is most extreme.
Beyond psychology and cognitive science, the subject pools of experimental economics and decision science are not much more diverse – still largely dominated by Westerners, and specifically Western undergraduates. However, to give credit where it is due, the nascent field of experimental economics has begun taking steps to address the problem of narrow samples.
In sum, the available database does not reflect the full breadth of human diversity. Rather, we have largely been studying the nature of WEIRD people, a certainly narrow and potentially peculiar subpopulation.
2.2. Researchers often assume their findings are universal
Sampling from a thin slice of humanity would be less problematic if researchers confined their interpretations to the populations from which they sampled. However, despite their narrow samples, behavioral scientists often are interested in drawing inferences about the human mind and human behavior. This inferential step is rarely challenged or defended – with important exceptions (e.g., Medin & Atran 2004; Rozin 2001; Triandis 1994; Witkin & Berry 1975) – despite the lack of any general effort to assess how well results from WEIRD samples generalize to the species. This lack of epistemic vigilance underscores the prevalent, though implicit, assumption that the findings one derives from a particular sample will generalize broadly; one adult human sample is pretty much the same as the next.
Leading scientific journals and university textbooks routinely publish research findings claiming to generalize to “humans” or “people” based on research done entirely with WEIRD undergraduates. In top journals such as Nature and Science, researchers frequently extend their findings from undergraduates to the species – often declaring this generalization in their titles. These contributions typically lack even a cautionary footnote about these inferential extensions.
In psychology, much of this generalization is implicit. A typical article does not claim to be discussing “humans” but will rather simply describe a decision bias, psychological process, set of correlations, and so on, without addressing issues of generalizability, although findings are often linked to “people.” Commonly, there is no demographic information about the participants, aside from their age and gender. In recent years there is a trend to qualify some findings with disclaimers such as “at least within Western culture,” though there remains a robust tendency to generalize to the species. Arnett (2008) notes that psychologists would surely bristle if journals were renamed to more accurately reflect the nature of their samples (e.g., Journal of Personality and Social Psychology of American Undergraduate Psychology Students). They would bristle, presumably, because they believe that their findings generalize much beyond this sample. Of course, there are important exceptions to this general tendency, as some researchers have assembled a broad database to provide evidence for universality (Buss 1989; Daly & Wilson 1988; Ekman 1999b; Elfenbein & Ambady 2002; Kenrick & Keefe 1992a; Tracy & Matsumoto 2008).
When is it safe to generalize from a narrow sample to the species? First, if one had good empirical reasons to believe that little variability existed across diverse populations in a particular domain, it would be reasonable to tentatively infer universal processes from a single subpopulation. Second, one could make an argument that as long as one's samples were drawn from near the center of the human distribution, then it would not be overly problematic to generalize across the distribution more broadly – at least the inferred pattern would be in the vicinity of the central tendency of our species. In the following, with these assumptions in mind, we review the evidence for the representativeness of findings from WEIRD people.
3. Contrast 1: Industrialized societies versus small-scale societies
Our theoretical perspective, which is informed by evolutionary thinking, leads us to suspect that many aspects of people's psychological repertoire are universal. However, the current empirical foundations for our suspicions are rather weak because the database of comparative studies that include small-scale societies is scant, despite the obvious importance of such societies in understanding both the evolutionary history of our species and the potential impact of diverse environments on our psychology. Here we first discuss the evidence for differences between populations drawn from industrialized and small-scale societies in some seemingly basic psychological domains, and follow this with research indicating universal patterns across this divide.
3.1. Visual perception
Many readers may suspect that tasks involving “low-level” or “basic” cognitive processes such as vision will not vary much across the human spectrum (Fodor 1983). However, in the 1960s an interdisciplinary team of anthropologists and psychologists systematically gathered data on the susceptibility of both children and adults from a wide range of human societies to five “standard illusions” (Segall et al. 1966). Here we highlight the comparative findings on the famed Müller-Lyer illusion, because of this illusion's importance in textbooks, and its prominent role as Fodor's indisputable example of “cognitive impenetrability” in debates about the modularity of cognition (McCauley & Henrich 2006). Note, however, that population-level variability in illusion susceptibility is not limited to the Müller-Lyer illusion; it was also found for the Sander-Parallelogram and both Horizontal-Vertical illusions.
Segall et al. (1966) manipulated the length of the two lines in the Müller-Lyer illusion (Fig. 1) and estimated the magnitude of the illusion by determining the approximate point at which the two lines were perceived as being of the same length. Figure 2 shows the results from 16 societies, including 14 small-scale societies. The vertical axis gives the “point of subjective equality” (PSE), which measures the extent to which segment “a” must be longer than segment “b” before the two segments are judged equal in length. PSE measures the strength of the illusion.
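The PSE as defined here is simply a percentage difference between the two segment lengths at the point where a subject judges them equal. A minimal sketch of that arithmetic follows; the function name `pse_percent` and the example lengths are hypothetical illustrations, not values or procedures taken from Segall et al.

```python
def pse_percent(length_a: float, length_b: float) -> float:
    """Percentage by which segment a exceeds segment b at subjective equality.

    A value of 0 means the subject judges the segments equal when they are
    physically equal, i.e. the illusion has no measurable effect.
    """
    if length_b <= 0:
        raise ValueError("length_b must be positive")
    return 100.0 * (length_a - length_b) / length_b


# Hypothetical subject who judges the segments equal only once
# segment a is 120 units long against a 100-unit segment b.
print(pse_percent(120, 100))  # 20.0

# Hypothetical subject who is unaffected by the illusion.
print(pse_percent(100, 100))  # 0.0
```

On this scale, a larger PSE means a stronger illusion, which is how the values on the vertical axis of Figure 2 should be read.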

Figure 1. The Müller-Lyer illusion. The lines labeled “a” and “b” are the same length. Many subjects perceive line “b” as longer than line “a”.

Figure 2. Müller-Lyer results for Segall et al.'s (1966) cross-cultural project. PSE (point of subjective equality) is the percentage that segment a must be longer than b before subjects perceived the segments as equal in length. Children were sampled in the 5-to-11 age range.
The results show substantial differences among populations, with American undergraduates anchoring the extreme end of the distribution, followed by the South African-European sample from Johannesburg. On average, the undergraduates required that line “a” be about a fifth longer than line “b” before the two segments were perceived as equal. At the other end, the San foragers of the Kalahari were unaffected by the so-called illusion (it is not an illusion for them). While the San's PSE value cannot be distinguished from zero, the American undergraduates' PSE value is significantly different from all the other societies studied.
As discussed by Segall et al., these findings suggest that visual exposure during ontogeny to factors such as the “carpentered corners” of modern environments may favor certain optical calibrations and visual habits that create and perpetuate this illusion. That is, the visual system ontogenetically adapts to the presence of recurrent features in the local visual environment. Because elements such as carpentered corners are products of particular cultural evolutionary trajectories, and were not part of most environments for most of human history, the Müller-Lyer illusion is a kind of culturally evolved by-product (Henrich 2008).
These findings highlight three important considerations. First, this work suggests that even a process as apparently basic as visual perception can show substantial variation across populations. If visual perception can vary, what kind of psychological processes can we be sure will not vary? It is not merely that the strength of the illusory effect varies across populations – the effect cannot be detected in two populations. Second, both American undergraduates and children are at the extreme end of the distribution, showing significant differences from all other populations studied, whereas many of the other populations cannot be distinguished from one another. Since children already show large population-level differences, it is not obvious that developmental work can substitute for research across diverse human populations. Children likely have different developmental trajectories in different societies. Finally, this provides an example of how population-level variation can be useful for illuminating the nature of a psychological process, which would not be as evident in the absence of comparative work.
3.2. Fairness and cooperation in economic decision-making
By the mid-1990s, researchers were arguing that a set of robust experimental findings from behavioral economics were evidence for a set of evolved universal motivations (Fehr & Gächter 1998; Hoffman et al. 1998). Foremost among these experiments, the Ultimatum Game provides a pair of anonymous subjects with a sum of real money for a one-shot interaction. One of the pair – the proposer – can offer a portion of this sum to the second subject, the responder. Responders must decide whether to accept or reject the offer. If a responder accepts, she gets the amount of the offer and the proposer takes the remainder; if she rejects, both players get zero. If subjects are motivated purely by self-interest, responders should always accept any positive offer; knowing this, a self-interested proposer should offer the smallest non-zero amount. Among subjects from industrialized populations – mostly undergraduates from the United States, Europe, and Asia – proposers typically offer an amount between 40% and 50% of the total, with a modal offer of 50% (Camerer 2003). Offers below about 30% are often rejected.
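The payoff rule of the Ultimatum Game, and the purely self-interested benchmark against which observed behavior is compared, can be sketched in a few lines. The function names and the stake of 100 units below are hypothetical choices made for illustration; only the rules themselves come from the description above.

```python
from typing import Tuple


def ultimatum_payoffs(stake: float, offer: float, accepted: bool) -> Tuple[float, float]:
    """Return (proposer_payoff, responder_payoff) for a one-shot game."""
    if not 0 <= offer <= stake:
        raise ValueError("offer must lie between 0 and the stake")
    if accepted:
        # Responder keeps the offer; proposer keeps the remainder.
        return (stake - offer, offer)
    # A rejection leaves both players with nothing.
    return (0, 0)


def self_interested_responder(offer: float) -> bool:
    """Benchmark responder: accepts any strictly positive offer."""
    return offer > 0


# A typical offer among industrialized samples (40% of the stake), accepted.
print(ultimatum_payoffs(stake=100, offer=40, accepted=True))   # (60, 40)

# A low offer that the responder chooses to reject.
print(ultimatum_payoffs(stake=100, offer=10, accepted=False))  # (0, 0)

# The benchmark responder would have accepted that same low offer.
print(self_interested_responder(10))  # True
```

The gap between what `self_interested_responder` would do and what real responders do with low offers is the rejection behavior the text describes.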
With this seemingly robust empirical finding in their sights, Nowak et al. (2000) constructed an evolutionary analysis of the Ultimatum Game. When they modeled the Ultimatum Game exactly as played, they did not get results matching the undergraduate findings. However, if they added reputational information, such that players could know what their partners did with others on previous rounds of play, the analysis predicted offers and rejections in the range of typical undergraduate responses. They concluded that the Ultimatum Game reveals humans' species-specific evolved capacity for fair and punishing behavior in situations with substantial reputational influence. But, since the Ultimatum Game is typically played one-shot without reputational information, Nowak et al. argued that people make fair offers and reject unfair offers because their motivations evolved in a world where such interactions were not fitness relevant – thus, we are not evolved to fully incorporate the possibility of non-reputational action in our decision-making, at least in such artificial experimental contexts.
Recent comparative work has dramatically altered this initial picture. Two unified projects (which we call Phase 1 and Phase 2) have deployed the Ultimatum Game and other related experimental tools across thousands of subjects randomly sampled from 23 small-scale human societies, including foragers, horticulturalists, pastoralists, and subsistence farmers, drawn from Africa, Amazonia, Oceania, Siberia, and New Guinea (Henrich et al. 2005; 2006; 2010). Three different experimental measures show that people in industrialized societies consistently occupy the extreme end of the human distribution. Notably, people in some of the smallest-scale societies, where real life is principally face-to-face, behaved in a manner reminiscent of Nowak et al.'s analysis before they added the reputational information. That is, these populations made low offers and did not reject.
To concisely present these diverse empirical findings, we show results only from the Ultimatum and Dictator Games in Phase 2. The Dictator Game is the same as the Ultimatum Game except that the second player cannot reject the offer. If subjects are motivated purely by self-interest, they would offer zero in the Dictator Game. Thus, Dictator Game offers yield a measure of “fairness” (equal divisions) between two anonymous people. By contrast, Ultimatum Game offers yield a measure of fairness combined with an assessment of the likelihood of rejection (punishment). Rejections of offers in the Ultimatum Game provide a measure of people's willingness to punish unfairness.
Using aggregate measures, Figure 3 shows that the behavior of the U.S. adult (non-student) sample occupies the extreme end of the distribution in each case. For Dictator Game offers, Figure 3A shows that the U.S. sample has the highest mean offer, followed by the Sanquianga from Colombia, who are renowned for their prosociality (Kraul 2008). The U.S. offers are nearly double those of the Hadza, foragers from Tanzania, and the Tsimane, forager-horticulturalists from the Bolivian Amazon. Figure 3B shows that for Ultimatum Game offers, the United States has the second highest mean offer, behind the Sursurunga from Papua New Guinea. On the punishment side in the Ultimatum Game, Figure 3C shows the income-maximizing offers (IMO) for each population, which is a measure of the population's willingness to punish inequitable offers. IMO is the offer that an income-maximizing proposer would make if he knew the probability of rejection for each of the possible offer amounts. The U.S. sample is tied with the Sursurunga. These two groups have an IMO five times higher than 70% of the other societies. While none of these measures indicates that people from industrialized societies are entirely unique vis-à-vis other populations, they do show that people from industrialized societies consistently occupy the extreme end of the human distribution.
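The IMO can be read as the solution to a small expected-value problem: for each possible offer, multiply what the proposer would keep by the probability that the offer is accepted, then pick the offer with the highest expected income. The sketch below illustrates that logic only. The function name, the tie-breaking rule (lowest offer wins), and both rejection-probability tables are hypothetical and are not data from the studies cited here.

```python
from typing import Dict


def income_maximizing_offer(stake: float, rejection_prob: Dict[float, float]) -> float:
    """Offer that maximizes the proposer's expected income.

    rejection_prob maps each possible offer amount to the probability
    that a responder in the population rejects it.
    """
    def expected_income(offer: float) -> float:
        # Proposer keeps (stake - offer) only if the offer is accepted.
        return (stake - offer) * (1.0 - rejection_prob[offer])

    # Iterating in ascending order means ties go to the lowest offer
    # (an assumption of this sketch).
    return max(sorted(rejection_prob), key=expected_income)


# Hypothetical population that readily punishes low offers.
punishing = {0: 1.0, 10: 0.8, 20: 0.6, 30: 0.3, 40: 0.05, 50: 0.0}

# Hypothetical population that almost never rejects positive offers.
lenient = {0: 1.0, 10: 0.05, 20: 0.0, 30: 0.0, 40: 0.0, 50: 0.0}

print(income_maximizing_offer(100, punishing))  # 40
print(income_maximizing_offer(100, lenient))    # 10
```

A population that punishes low offers more readily pushes the IMO upward, which is why the IMO serves as a summary of willingness to punish.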

Figure 3. Behavioral measures of fairness and punishment from the Dictator and Ultimatum Games for 15 societies (Phase 2). Figures 3A and 3B show mean offers for each society in the Dictator and Ultimatum Games, respectively. Figure 3C gives the income-maximizing offer (IMO) for each society.
Analyses of these data show that a population's degree of market integration and its participation in a world religion both independently predict higher offers, and account for much of the variation between populations. Community size positively predicts greater punishment (Henrich et al. 2010). The authors suggest that norms and institutions for exchange in ephemeral interactions culturally coevolved with markets and expanding larger-scale sedentary populations. In some cases, at least in their most efficient forms, neither markets nor large populations were feasible before such norms and institutions emerged. That is, it may be that what behavioral economists have been measuring among undergraduates in such games is a specific set of social norms, culturally evolved for dealing with money and strangers, that have emerged since the origins of agriculture and the rise of complex societies.
In addition to differences in populations' willingness to reject offers that are too low, the evidence also indicates a willingness to reject offers that are too high in about half the societies studied. This tendency to reject so-called hyper-fair offers rises as offers increase from 60% to 100% of the stake (Henrich et al. 2006). This phenomenon, which is not observed in typical undergraduate subjects (who essentially never reject offers greater than half), has now emerged among populations in Russia (Bahry & Wilson 2006) and China (Hennig-Schmidt et al. 2008), as well as (to a lesser degree) among non-student adults in Sweden (Wallace et al. 2007), Germany (Guth et al. 2003), and the Netherlands (Bellemare et al. 2008). Attempts to explain away this phenomenon as a consequence of confusion or misunderstanding have not found support despite substantial efforts.
Suppose that Nowak and his coauthors were Tsimane, and that the numerous empirical findings they had on hand were all from Tsimane villages. If this were the case, presumably these researchers would have simulated the Ultimatum Game and found that there was no need to add reputation to their model. This unadorned evolutionary solution would have worked fine until they realized that the Tsimane are not representative of humanity. According to the above data, the Tsimane are about as representative of the species as are Americans, but at the opposite end of the spectrum. If the database of the behavioral sciences consisted entirely of Tsimane subjects, researchers would likely be quite concerned about generalizability.
3.3. Folkbiological reasoning
Recent work in small-scale societies suggests that some of the central conclusions regarding the development and operation of human folkbiological categorization, reasoning, and induction are limited to urban subpopulations of non-experts in industrialized societies. Although much more work needs to be done, it appears that typical subjects (children of WEIRD parents) develop their folkbiological reasoning in a culturally and experientially impoverished environment, by contrast to those of small-scale societies (and of our evolutionary past), distorting both the species-typical pattern of cognitive development and the patterns of reasoning in WEIRD adults.
Cognitive scientists using (as subjects) children drawn from U.S. urban centers – often those surrounding universities – have constructed an influential, though actively debated, developmental theory in which folkbiological reasoning emerges from folkpsychological reasoning. Before age 7, urban children reason about biological phenomena by analogy to, and by extension from, humans. Between ages 7 and 10, urban children undergo a conceptual shift to the adult pattern of viewing humans as one animal among many. These conclusions are underpinned by three robust findings from urban children: (1) Inferential projections of properties from humans are stronger than projections from other living kinds; (2) inferences from humans to mammals emerge as stronger than inferences from mammals to humans; and (3) children's inferences violate their own similarity judgments by, for example, providing stronger inference from humans to bugs than from bugs to bees (Carey 1985; 1995).
However, when the folkbiological reasoning of children in rural Native American communities in Wisconsin and Yukatek Maya communities in Mexico was investigated (Atran et al. 2001; Ross et al. 2003; Waxman & Medin 2007), none of these three empirical patterns emerged. Among the American urban children, the human category appears to be incorporated into folkbiological induction relatively late compared to these other populations. The results indicate that some background knowledge of the relevant species is crucial for the application and induction across a hierarchical taxonomy (Atran et al. 2001). In rural environments, both exposure to and interest in the natural world are commonplace, unavoidable, and an inevitable part of the enculturation process. This suggests that the anthropocentric patterns seen in U.S. urban children result from insufficient cultural input and a lack of exposure to the natural world. The only real animal that most urban children know much about is Homo sapiens, so it is not surprising that this species dominates their inferential patterns. Since such urban environments are highly “unnatural” from the perspective of human evolutionary history, any conclusions drawn from subjects reared in such informationally impoverished environments must remain rather tentative. Indeed, studying the cognitive development of folkbiology in urban children would seem the equivalent of studying “normal” physical growth in malnourished children.
This deficiency of input likely underpins the fact that the basic-level folkbiological categories for WEIRD adults are life-form categories (e.g., bird, fish, and mammal), and these are also the first categories learned by WEIRD children – for example, if one says “What's that?” (pointing at a maple tree), their common answer is “tree.” However, in all small-scale societies studied, the generic species (e.g., maple, crow, trout, and fox) is the basic-level category and the first learned by children (Atran 1993; Berlin 1992).
Impoverished interactions with the natural world may also distort assessments of the typicality of natural kinds in categorization. The standard conclusion from American undergraduate samples has been that goodness of example, or typicality, is driven by similarity relations. A robin is a typical bird because this species shares many of the perceptual features that are commonly found in the category BIRD. In the absence of close familiarity with natural kinds, this is the default strategy of American undergraduates, and psychology has assumed it is the universal pattern. However, in samples that interact with the natural world regularly, such as Itza Maya villagers, typicality is based not on similarity but on knowledge of cultural ideals, reflecting the symbolic or material significance of the species in that culture. For the Itza, the wild turkey is a typical bird because of its rich cultural significance, even though it is in no way most similar to other birds. The same pattern holds for similarity effects in inductive reasoning – WEIRD people make strong inferences from computations of similarity, whereas populations with greater familiarity with the natural world, despite their capacity for similarity-based inductions, prefer to make strong inferences from folkbiological knowledge that takes into account ecological context and relationships among species (Atran et al. 2005). In general, research suggests that what people think about can affect how they think (Bang et al. 2007). To the extent that there is population-level variability in the content of folkbiological beliefs, such variability affects cognitive processing in this domain as well.
So far we have emphasized differences in folkbiological cognition uncovered by comparative research. This same work has also uncovered reliably developing aspects of human folkbiological cognition that do not vary, such as categorizing plants and animals in a hierarchical taxonomy, or that the generic species level has the strongest inductive potential, despite the fact that this level is not always the basic level across populations, as discussed above. Our goal in emphasizing the differences here is to show (1) how peculiar industrialized (urban, in this case) samples are, given the unprecedented environment they grow up in; and (2) how difficult it is to conclude a priori what aspects will be reliably developing and robust across diverse slices of humanity if research is largely conducted with WEIRD samples.
3.4. Spatial cognition
Human societies vary in their linguistic tools for, and cultural practices associated with, representing and communicating (1) directions in physical space, (2) the color spectrum, and (3) integer amounts. There is some evidence that each of these differences in cultural content may influence some aspects of nonlinguistic cognitive processes (D'Andrade 1995; Gordon 2004; Kay 2005; Levinson 2003; Roberson et al. 2000). Here we focus on spatial cognition, for which the evidence is most provocative. As above, it appears that industrialized societies are at the extreme end of the continuum in spatial cognition. Human populations show differences in how they think about spatial orientation and deal with directions, and these differences may be influenced by linguistically based spatial reference systems.
Speakers of English and other Indo-European languages favor the use of an egocentric (relative) system to represent the location of objects – that is, relative to the self (e.g., “the man is on the right side of the flagpole”). In contrast, many if not most languages favor an allocentric frame, which comes in two flavors. Some allocentric languages such as Guugu Yimithirr (an Australian language) and Tzeltal (a Mayan language) favor a geocentric system in which absolute reference is based on cardinal directions (“the man is west of the house”). The other allocentric frame is an object-centered (intrinsic) approach that locates objects in space, relative to some coordinate system anchored to the object (“the man is behind the house”). When languages possess systems for encoding all of these spatial reference frames, they often privilege one at the expense of the others. However, the fact that some languages lack one or more of the reference systems suggests that the accretion of all three systems into most contemporary languages may be a product of long-term cumulative cultural evolution.
In data on spatial reference systems from 20 languages drawn from diverse societies – including foragers, horticulturalists, agriculturalists, and industrialized populations – only three languages relied on egocentric frames as their single preferred system of reference. All three were from industrialized populations: Japanese, English, and Dutch (Majid et al. 2004).
The presence of, or emphasis on, different reference systems may influence nonlinguistic spatial reasoning (Levinson 2003). In one study, Dutch and Tzeltal speakers were seated at a table and shown an arrow pointing either to the right (north) or the left (south). They were then rotated 180 degrees to a second table where they saw two arrows: one pointing to the left (north) and the other one pointing to the right (south). Participants were asked which arrow on the second table was like the one they saw before. Consistent with the spatial-marking system of their languages, Dutch speakers chose the relative solution, whereas the Tzeltal speakers chose the absolute solution. Several other comparative experiments testing spatial memory and reasoning are consistent with this pattern, although lively debates about interpretation persist (Levinson et al. 2002; Li & Gleitman 2002).
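The logic of the rotation task can be made concrete with a minimal sketch. It assumes, purely for illustration, that the participant faces west at the first table (which is what makes "right" coincide with north in the description above); the function names and the "relative"/"absolute" strategy labels are hypothetical and are not taken from the original studies.

```python
# Compass headings in clockwise order, so +1 is a quarter turn to the right.
COMPASS = ["north", "east", "south", "west"]


def side_to_compass(facing: str, side: str) -> str:
    """Compass direction that lies to the viewer's 'left' or 'right'."""
    step = {"right": 1, "left": -1}[side]
    return COMPASS[(COMPASS.index(facing) + step) % 4]


def compass_to_side(facing: str, direction: str) -> str:
    """Side ('left' or 'right') on which a compass direction falls."""
    for side in ("left", "right"):
        if side_to_compass(facing, side) == direction:
            return side
    raise ValueError(f"{direction} is not to the left or right when facing {facing}")


def predicted_choice(strategy: str, first_facing: str, stimulus_side: str) -> str:
    """Side toward which the chosen arrow points at the second table.

    'relative' keeps the arrow's side with respect to the body;
    'absolute' keeps the arrow's compass direction.
    """
    # The participant is rotated 180 degrees between tables.
    second_facing = COMPASS[(COMPASS.index(first_facing) + 2) % 4]
    if strategy == "relative":
        return stimulus_side
    if strategy == "absolute":
        stimulus_direction = side_to_compass(first_facing, stimulus_side)
        return compass_to_side(second_facing, stimulus_direction)
    raise ValueError(f"unknown strategy: {strategy}")


# Assumed setup: facing west, first arrow points right (north).
print(predicted_choice("relative", "west", "right"))
print(predicted_choice("absolute", "west", "right"))
```

Under these assumptions, the relative strategy selects the arrow that still points to the participant's right (now south), whereas the absolute strategy selects the arrow that still points north (now on the left).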
Extending the above exploration, Haun and colleagues (Haun et al. 2006a; 2006b) examined performance on a spatial reasoning task similar to the one described above, using children and adults from different societies and great apes. In the first step, Dutch-speaking adults and 8-year-olds (speakers of an egocentric language) showed the typical egocentric bias, whereas Hai//om-speaking adults and 8-year-olds (a Namibian foraging population who speak an allocentric language) showed a typical allocentric bias. In the second step, 4-year-old German-speaking children, gorillas, orangutans, chimpanzees, and bonobos were tested on a simplified version of the same task. All showed a marked preference for allocentric reasoning. These results suggest that children share with other great apes an innate preference for allocentric spatial reasoning, but that this bias can be overridden by input from language and cultural routines.
If one were to work on spatial cognition exclusively with WEIRD subjects (say, using subjects from the United States and Europe), one might conclude that children start off with an allocentric bias but naturally shift to an egocentric bias with maturation. The problem with this conclusion is that it would not apply to many human populations, and it may be the consequence of studying subjects from peculiar cultural environments. The next telescoping contrast highlights some additional evidence suggesting that WEIRD people may even be unusual in their egocentric bias vis-à-vis most other industrialized populations.
3.5. Other potential differences
We have discussed several lines of data suggesting not only population-level variation, but that industrialized populations are consistently unusual compared to small-scale societies. There are also numerous studies that have found differences between much smaller numbers of samples (usually two samples). In these studies it is impossible to discern who is unusual, the small-scale society or the WEIRD population. For example, one study found that both samples from two different industrialized populations were risk-averse decision makers when facing monetary gambles involving gains (Henrich & McElreath 2002), whereas both samples from small-scale societies were risk-prone. Risk-aversion for monetary gains may be a recent, local phenomenon. Similarly, extensive inter-temporal choice experiments using a panel method of data collection indicate that the Tsimane, an Amazonian population of forager-horticulturalists, discount the future 10 times more steeply than do WEIRD people (Godoy et al. 2004). In Uganda, a study of individual decision-making among small-scale farmers showed qualitatively different deviations from expected utility maximization than is typically found among undergraduates. For example, rather than the inverse S-shape for probabilities in Prospect Theory, a regular S-shape was found.Footnote 3
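To make the contrast between an inverse S-shaped and a regular S-shaped probability weighting curve concrete, the sketch below uses Prelec's one-parameter weighting function, w(p) = exp(-(-ln p)^alpha). This functional form and the parameter values are assumptions chosen for illustration only; nothing here implies that the Ugandan study or the undergraduate studies estimated this particular function.

```python
import math


def prelec_weight(p: float, alpha: float) -> float:
    """Prelec one-parameter probability weighting, w(p) = exp(-(-ln p)**alpha).

    alpha < 1 gives an inverse S-shape (small probabilities overweighted,
    large ones underweighted); alpha > 1 gives a regular S-shape.
    """
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))


# Illustrative (assumed) parameter values, not estimates from any study.
for alpha in (0.5, 2.0):
    low = prelec_weight(0.1, alpha)
    high = prelec_weight(0.9, alpha)
    print(f"alpha={alpha}: w(0.1)={low:.3f}, w(0.9)={high:.3f}")
```

With alpha below 1 the weight exceeds the stated probability at p = 0.1 and falls short of it at p = 0.9; with alpha above 1 the pattern reverses.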
3.6. Similarities between industrialized and small-scale societies
Some larger-scale comparative projects show universal patterns in human psychology. Here we list some noteworthy examples:
1. Some perceptual illusions: We discussed the Müller-Lyer illusion above. However, there are illusions, such as the Perspective Drawing Illusion, for which the industrialized populations are not extreme outliers, and for which perception varies little in the populations studied (Segall et al. 1966).
2. Perceiving color: While the number of basic color terms systematically varies across human languages (Regier et al. 2005), the ability to perceive different colors does emerge in small-scale societies (Rivers 1901a),Footnote 4 although terms and categories do influence color perception at the margins (Kay & Regier 2006).
3. Emotional expression: In studying facial displays of emotions, Ekman and colleagues have shown much evidence for universality in recognition of the “basic” facial expressions of emotions, although this work has included only a small – yet convincing – sampling of small-scale societies (Ekman 1999a; 1999b). There is also evidence for the universality of pride displays (Tracy & Matsumoto 2008; Tracy & Robins 2008). This main effect for emotional recognition across populations (58% of variance) is qualified by a smaller effect for cultural specificity of emotional expressions (9% of variance: Elfenbein & Ambady 2002).
4. False belief tasks: Comparative work in China, the United States, Canada, Peru, India, Samoa, and Thailand suggests that the ability to explicitly pass the false belief task emerges in all populations studied (Callaghan et al. 2005; Liu et al. 2008), although the age at which subjects can pass the explicit version of the false belief task varies from 4 to at least 9 (Boesch 2007; Callaghan et al. 2005; Liu et al. 2008), with industrialized populations at the extreme low end.
5. Analog numeracy: There is growing consensus in the literature on numerical thinking that quantity estimation relies on a primitive “analog” number sense that is sensitive to quantity but limited in accuracy. This cognitive ability appears to be independent of counting practices and was shown to operate in similar ways among two Amazonian societies with very limited counting systems (Gordon 2004; Pica et al. 2004), as well as in infants and primates (e.g., Dehaene 1997).
6. Social relationships: Research on the cognitive processes underlying social relationships reveals similar patterns across distinct populations. Fiske (1993) studied people's tendency to confuse one person with another (e.g., intending to phone your son Bob but accidentally calling your son Fred). Chinese, Korean, Bengali, and Vai (Liberia and Sierra Leone) immigrants tended to confuse people in the same category of social relationship. Interestingly, the social categories in which the most confusion occurred varied across populations.
7. Psychological essentialism: Research from a variety of societies, including Vezo children in Madagascar (Astuti et al. 2004), children from impoverished neighborhoods in Brazil (Sousa et al. 2002), Menominee in Wisconsin (Waxman et al. 2007), and middle-class children and adults in the United States (Gelman 2003), shows evidence of perceiving living organisms as having an underlying and non-trivial nature that makes them what they are. Psychological essentialism also extends to the understanding of social groups, which may be found in Americans (Gelman 2003), rural Ukrainians (Kanovsky 2007), Vezo in Madagascar (Astuti 2001), Mapuche farmers in Chile (Henrich & Henrich, unpublished manuscript), Iraqi Chaldeans and Hmong immigrants in Detroit (Henrich & Henrich 2007), and Mongolian herdsmen (Gil-White 2001). Notably, this evidence is not well suited to examining differences in the degree of psychological essentialism across populations, though it suggests that inter-population variation may be substantial.
There are also numerous studies involving dyadic comparisons between a single small-scale society and a Western population (or a pattern of Western results) in which cross-population similarities have been found. Examples are numerous but include the development of an understanding of death (Barrett & Behne 2005), shame (Fessler 2004),Footnote 5 and cheater detection (Sugiyama et al. 2002). Finding evidence for similarities across two such disparate populations is an important step towards providing evidence for universality (Norenzayan & Heine 2005); however, the case would be considerably stronger if it were found across a larger number of diverse populations.Footnote 6
3.7. Summary for Contrast 1
Although there are several domains in which the data from small-scale societies appear similar to that from industrialized societies, comparative projects involving visual illusions, social motivations (fairness), folkbiological cognition, and spatial cognition all show industrialized populations as outliers. Given all this, it seems problematic to generalize from industrialized populations to humans more broadly, in the absence of supportive empirical evidence.
4. Contrast 2: WesternFootnote 7 versus non-Western societies
For our second contrast, we review evidence comparing Western with non-Western populations. Here we examine four of the most studied domains: social decision making (fairness, cooperation, and punishment), independent versus interdependent self-concepts (and associated motivations), analytic versus holistic reasoning, and moral reasoning. We also briefly return to spatial cognition.
4.1. Anti-social punishment and cooperation
In the previous contrast, we reviewed social decision-making experiments showing that industrialized populations occupy the extreme end of the behavioral distribution vis-à-vis a broad swath of smaller-scale societies. Here we show that even among industrialized populations, Westerners are again clumped at the extreme end of the behavioral distribution. Notably, the behaviors measured in the experiments discussed below are strongly correlated with the strength of formal institutions, norms of civic cooperation, and Gross Domestic Product (GDP) per capita.
In 2002, Fehr and Gächter published their classic paper, “Altruistic Punishment in Humans,” in Nature, based on Public Goods Games with and without punishment, conducted with undergraduates at the University of Zurich. The paper demonstrated that adding the possibility of punishment to a cooperative dilemma dramatically altered the outcome, from a gradual slide towards little cooperation (and rampant free-riding), to a steady increase towards stable cooperation. Enough subjects were willing to punish non-cooperators at a cost to themselves to shift the balance from free-riding to cooperation. In stable groups this cooperation-punishment combination dramatically increases long-run gains (Gächter et al. 2008).
To examine the generalizability of these results, which many took to be a feature of our species, Herrmann, Thoni, and Gächter conducted systematic comparable experiments among undergraduates from a diverse swath of industrialized populations (Herrmann et al. 2008). In these Public Goods Games, subjects played with the same four partners for 10 rounds and could contribute during each round to a group project. All contributions to the group project were multiplied by 1.6 and distributed equally among all partners. Players could also pay to punish other players by taking money away from them.
In addition to finding population-level differences in the subjects' initial willingness to cooperate, Gächter's team unearthed in about half of these samples a phenomenon that is not observed beyond a trivial degree among typical undergraduate subjects (see our Fig. 4): Many subjects engaged in anti-social punishment; that is, they paid to reduce the earnings of “overly” cooperative individuals (those who contributed more than the punisher did). The effect of this behavior on levels of cooperation was dramatic, completely compensating for the cooperation-inducing effects of punishment in the Zurich experiment. Possibilities for altruistic punishment do not generate high levels of cooperation in these populations. Meanwhile, participants from a number of Western countries, such as the United States, the United Kingdom, and Australia, behaved like the original Zurich students. Thus, it appears that the Zurich sample works well for generalizing to the patterns of other Western samples (as well as the Chinese sample), but such findings cannot be readily extended beyond this.
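The payoff structure of such a Public Goods Game with punishment can be sketched as follows. Only the 1.6 multiplier and the equal split come from the description above; the endowment of 20 units, the cost of 1 unit per punishment point, the 3-unit reduction per point, and the group of four players are assumptions made for illustration, and all function names are hypothetical.

```python
MULTIPLIER = 1.6      # from the text: contributions are multiplied by 1.6
ENDOWMENT = 20        # assumed per-round endowment
PUNISH_COST = 1       # assumed cost to the punisher per punishment point
PUNISH_IMPACT = 3     # assumed loss to the target per punishment point


def round_payoffs(contributions, punishments=None):
    """Payoffs for one round.

    contributions[i] is player i's contribution to the group project;
    punishments[i][j] is the number of points player i assigns to player j.
    """
    n = len(contributions)
    share = MULTIPLIER * sum(contributions) / n   # equal split of the project
    payoffs = [ENDOWMENT - c + share for c in contributions]
    if punishments is not None:
        for i in range(n):
            for j in range(n):
                points = punishments[i][j]
                payoffs[i] -= PUNISH_COST * points
                payoffs[j] -= PUNISH_IMPACT * points
    return payoffs


def is_antisocial(punisher_contribution, target_contribution):
    """Punishment aimed at someone who contributed more than the punisher."""
    return target_contribution > punisher_contribution


# One free rider in an otherwise fully cooperative (assumed) group of four.
contributions = [0, 20, 20, 20]
print(round_payoffs(contributions))

# Player 0 (the free rider) spends 2 points punishing cooperative player 1.
punishments = [[0, 2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(round_payoffs(contributions, punishments))
print(is_antisocial(contributions[0], contributions[1]))
```

Without punishment the free rider earns more than the cooperators, which is the incentive that drives the slide toward low cooperation; the second call shows how anti-social punishment further lowers the earnings of a high contributor.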

Figure 4. Mean punishment expenditures from each sample for a given deviation from the punisher's contribution to the public good. The deviations of the punished subject's contribution from the punisher's contribution are grouped into five intervals, where [-20,-11] indicates that the punished subjects contributed between 11 and 20 less than the punishing subject; [0] indicates that the punished subject contributed exactly the same amount as the punishing subject; and [1,10] ([11,20]) indicates that the punished subject contributed between 1 and 10 (11 and 20) more than the punishing subject. Adapted from Herrmann et al. (2008).
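The grouping used in Figure 4 can be sketched as a simple binning step. The caption names four of the five intervals explicitly; the remaining one, [-10,-1], is inferred here. Integer contributions between 0 and 20, the record format, and the function names are assumptions for illustration, not the procedure used by Herrmann et al.

```python
from collections import defaultdict


def deviation_interval(punisher_contribution: int, punished_contribution: int) -> str:
    """Interval label for the punished subject's deviation from the punisher."""
    deviation = punished_contribution - punisher_contribution
    if not -20 <= deviation <= 20:
        raise ValueError("deviation outside the range covered by the figure")
    if deviation <= -11:
        return "[-20,-11]"
    if deviation <= -1:
        return "[-10,-1]"   # inferred fifth interval
    if deviation == 0:
        return "[0]"
    if deviation <= 10:
        return "[1,10]"
    return "[11,20]"


def mean_punishment_by_interval(records):
    """Mean punishment per interval.

    records is an iterable of (punisher_contribution, punished_contribution,
    punishment_points) tuples; this format is hypothetical.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for punisher, punished, points in records:
        label = deviation_interval(punisher, punished)
        totals[label] += points
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Made-up records, for illustration only.
example = [(20, 5, 4), (20, 8, 2), (10, 10, 0), (5, 12, 1)]
print(mean_punishment_by_interval(example))
```

Punishment falling in the two positive intervals corresponds to the anti-social punishment described in the text, since the punished subject contributed more than the punisher.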
4.2. Independent and interdependent self-concepts
Much psychological research has explored the nature of people's self-concepts. Self-concepts are important, as they organize the information that people have about themselves, direct attention to information that is perceived to be relevant, shape motivations, influence how people appraise situations that influence their emotional experiences, and guide their choices of relationship partners. Markus and Kitayama (1991) posited that self-concepts can take on a continuum of forms stretching between two poles, termed independent and interdependent self-views, which relate to the individualism-collectivism construct (Triandis 1989; 1994). Do people conceive of themselves primarily as self-contained individuals, understanding themselves as autonomous agents who consist largely of component parts, such as attitudes, personality traits, and abilities? Or do they conceive of themselves as interpersonal beings intertwined with one another in social webs, with incumbent role-based obligations towards others within those networks? The extent to which people perceive themselves in ways similar to these independent or interdependent poles has significant consequences for a variety of emotions, cognitions, and motivations.
Much research has underscored how Westerners have more independent views of self than non-Westerners. For example, research using the Twenty Statements Test (Kuhn & McPartland 1954) reveals that people from Western populations (e.g., Australians, Americans, Canadians, Swedes) are far more likely to understand their selves in terms of internal psychological characteristics, such as their personality traits and attitudes, and are less likely to understand them in terms of roles and relationships, than are people from non-Western populations, such as Native Americans, Cook Islanders, Maasai and Samburu (both African pastoralists), Malaysians, and East Asians (for a review, see Heine 2008). Studies using other measures (Hofstede 1980; Morling & Lamoreaux 2008; Oyserman et al. 2002; Triandis et al. 1990) provide convergent evidence that Westerners tend to have more independent, and less interdependent, self-concepts than those of other populations. These data converge with much ethnographic observation, in particular Geertz's (1975, p. 48) claim that the Western self is “a rather peculiar idea within the context of the world's cultures.”
There are numerous psychological patterns associated with self-concepts. For example, people with independent self-concepts are more likely to demonstrate (1) positively biased views of themselves; (2) a heightened valuation of personal choice; and (3) an increased motivation to “stand out” rather than to “fit in.” Each of these represents a significant research enterprise, and we discuss them in turn.
4.2.1. Positive self-views
The most widely endorsed assumption regarding the self is that people are motivated to view themselves positively. Roger Brown (1986) famously declared this motivation to maintain high self-esteem an “urge so deeply human, we can hardly imagine its absence” (p. 534). The strength of this motivation has been perhaps most clearly documented by assessing the ways that people go about exaggerating their self-views by engaging in self-serving biases, in which people view themselves more positively than objective benchmarks would justify. For example, in one study, 94% of American professors rated themselves as better than the average American professor (Cross 1977). However, meta-analyses reveal that these self-serving biases tend to be more pronounced in Western populations than in non-Western ones (Heine & Hamamura 2007; Mezulis et al. 2004) – for example, Mexicans (Tropp & Wright 2003), Native Americans (Fryberg & Markus 2003), Chileans (Heine & Raineri 2009), and Fijians (Rennie & Dunne 1994) score much lower on various measures of positive self-views than do Westerners (although there are some exceptions to this general pattern; see Harrington & Liu 2002). Indeed, in some cultural contexts, most notably East Asian ones, evidence for self-serving biases tends to be null, or in some cases, shows significant reversals, with East Asians demonstrating self-effacing biases (Heine & Hamamura 2007). At best, the sharp self-enhancing biases of Westerners are less pronounced in much of the rest of the world, although self-enhancement has long been discussed as if it were a fundamental aspect of human psychology (e.g., Rogers 1951; Tesser 1988).
4.2.2. Personal choice
Psychology has long been fascinated with how people assert agency by making choices (Bandura 1982; Kahneman & Tversky 2000; Schwartz 2004), and has explored the efforts that people go through to ensure that their actions feel freely chosen and that their choices are sensible. However, there is considerable variation across populations in the extent to which people value choice and in the range of behaviors over which they feel that they are making choices. For example, one study found that European-American children preferred working on a task, worked on it longer, and performed better on it, if they had made some superficial choices regarding the task than if others made the same choices for them. In contrast, Asian-American children were equally motivated by the task if a trusted other made the same choices for them (Iyengar & Lepper 1999). Another two sets of studies found that Indians were slower at making choices, were less likely to make choices consistent with their personal preferences, and were less likely to view their actions as expressions of choice, than were Americans (Savani et al. 2008; in press). Likewise, the extent to which people feel that they have much choice in their lives varies across populations. Surveys conducted at bank branches in Argentina, Brazil, Mexico, the Philippines, Singapore, Taiwan, and the United States found that Americans were more likely to perceive having more choice at their jobs than were subjects from the other countries (Iyengar & DeVoe 2003). Another survey administered in more than 40 countries found, in general, that feelings of free choice in one's life were considerably higher in Western nations (e.g., Finland, the United States, and Northern Ireland) than in various non-Western nations (e.g., Turkey, Japan, and Belarus: Inglehart et al. 1998). This research reveals that perceptions of choice are experienced less often, and are a lesser concern, among those from non-Western populations.
4.2.3. Motivations to conform
Many studies have explored whether motivations to conform are similar across populations by employing a standard experimental procedure (Asch 1951; 1952). In these studies, which were initially conducted with Americans, participants first hear a number of confederates making a perceptual judgment that is obviously incorrect, and then participants are given the opportunity to state their own judgment. A majority of American participants were found to go along with the majority's incorrect judgment at least once. This research sparked much interest, apparently because Westerners typically feel that they are acting on their own independent resolve and are not conforming. A meta-analysis of studies performed in 17 societies (Bond & Smith 1996), including subjects from Oceania, the Middle East, South America, Africa, East Asia, Europe, and the United States, found that motivations for conformity are weaker in Western societies than elsewhere. Other research converges with this conclusion. For example, Kim and Markus (1999) found that Koreans preferred objects that were more common, whereas Americans showed a greater preference for objects that were more unusual.
4.3. Analytic versus holistic reasoning
Variation in favored modes of reasoning has been compared across several populations. Most of the research has contrasted Western (American, Canadian, Western European) with East Asian (Chinese, Japanese, Korean) populations with regard to their relative reliance on what is known as “holistic” versus “analytic” reasoning (Nisbett 2003; Peng & Nisbett 1999). However, growing evidence from other non-Western populations points to a divide between Western nations and most everyone else, including groups as diverse as Arabs, Malaysians, and Russians (see Norenzayan et al. [2007] for a review), as well as subsistence farmers in Africa and South America and sedentary foragers (Norenzayan et al., n.d.; Witkin & Berry 1975), rather than an East-West divide.
Holistic thought involves an orientation to the context or field as a whole, including attention to relationships between a focal object and the field, and a preference for explaining and predicting events on the basis of such relationships. Analytic thought involves a detachment of objects from contexts, a tendency to focus on objects' attributes, and a preference for using categorical rules to explain and predict behavior. This distinction between habits of thought rests on a theoretical partition between two reasoning systems. One system is associative, and its computations reflect similarity and contiguity (i.e., whether two stimuli share perceptual resemblances and co-occur in time); the other system relies on abstract, symbolic representational systems, and its computations reflect a rule-based structure (e.g., Neisser 1963; Sloman 1996).
Although both cognitive systems are available in all normal adults, different environments, experiences, and cultural routines may encourage reliance on one system at the expense of the other, giving rise to population-level differences in the use of these different cognitive strategies to solve identical problems. There is growing evidence that a key factor influencing the prominence of analytic versus holistic cognition is the different self-construals prevalent across populations. First, independent self-construal primes facilitate analytic processing, whereas interdependent primes facilitate holistic processing (Oyserman & Lee 2008). Second, geographic regions with greater prevalence of interdependent self-construals show more holistic processing, as can be seen in comparisons of Northern and Southern Italians, Hokkaido and mainland Japanese, and Western and Eastern Europeans (Varnum et al. 2008).
Furthermore, the analytic approach is culturally more valued in Western contexts, whereas the holistic approach is more valued in East Asian contexts, leading to normative judgments about cognitive strategies that differ across the respective populations (Buchtel & Norenzayan 2008). Below we highlight some findings from this research showing that, compared to diverse populations of non-Westerners, Westerners (1) attend more to objects than fields; (2) explain behavior in more decontextualized terms; and (3) rely more on rules over similarity relations to classify objects (for further discussion of the cross-cultural evidence, see Nisbett 2003; Norenzayan et al. 2007).
1. Using evidence derived from the Rod & Frame Test and Embedded Figures Test, Witkin and Berry (1975) summarize a wide range of evidence from migratory and sedentary foraging populations (Arctic, Australia, and Africa), sedentary agriculturalists, and industrialized Westerners. Only Westerners and migratory foragers consistently emerged at the field-independent end of the spectrum. Recent work among East Asians (Ji et al. 2004) in industrialized societies using the Rod & Frame Test, the Framed Line Test (Kitayama et al. 2003), and the Embedded Figures Test again shows Westerners at the field-independent end of the spectrum, compared to field-dependent East Asians, Malays, and Russians (Kuhnen et al. 2001). Similarly, Norenzayan et al. (2007) found that Canadians showed less field-dependent processing than did Chinese, who in turn were less field-dependent than were Arabs (also see Zebian & Denny 2001).
2. East Asians' recall for objects is worse than Americans' if the background has been switched (Masuda & Nisbett 2001), indicating that East Asians are attending more to the field. This difference in attention has also been found in saccadic eye-movements as measured with eye-trackers. Americans gaze at focal objects longer than East Asians do, whereas East Asians gaze at the background more than Americans do (Chua et al. 2005). Furthermore, when performing identical cognitive tasks, East Asians and Westerners show differential brain activation, corresponding to the predicted cultural differences in cognitive processing (Gutchess et al. 2006; Hedden et al. 2008).
3. Several classic studies, initially conducted with Western participants, found that “people” tend to make strong attributions about a person's disposition, even when there are compelling situational constraints (Jones & Harris 1967; Ross et al. 1977). This tendency to ignore situational information in favor of dispositional information is so commonly observed – among typical subjects – that it was dubbed the “fundamental attribution error” (Ross et al. 1977). However, consistent with much ethnography in non-Western cultures (e.g., Geertz 1975), comparative experimental work demonstrates that, while Americans attend to dispositions at the expense of situations (Gilbert & Malone 1995), East Asians are more likely than Americans to infer that behaviors are strongly controlled by the situation (Miyamoto & Kitayama 2002; Morris & Peng 1994; Norenzayan et al. 2002a; Van Boven et al. 1999), particularly when situational information is made salient (Choi & Nisbett 1998).Footnote 8 Grossmann and Varnum (2010) provide parallel findings with Russians. Likewise, in an investigation of people's lay beliefs about personality across eight populations, Church et al. (2006) found that people from Western populations (i.e., American and Euro-Australian) strongly endorsed the notion that traits remain stable over time and predict behavior over many situations, whereas people from non-Western populations (i.e., Asian-Australian, Chinese-Malaysian, Filipino, Japanese, Mexican, and Malay) more strongly endorsed contextual beliefs about personality, such as ideas suggesting that traits do not describe a person as well as roles or duties do, and that trait-related behavior changes from situation to situation. These patterns are consistent with earlier work on attributions comparing Euro-Americans with Hindu Indians (see Miller 1984; Shweder & Bourne 1982). Hence, although dispositional inferences can be found outside the West, the fundamental attribution error seems less fundamental elsewhere (Choi et al. 1999).
4. Westerners are also more likely to rely on rules over similarity relations in reasoning and categorization. Chinese subjects were found to be more likely to group together objects which shared a functional (e.g., pencil-notebook) or contextual (e.g., sky-sunshine) relationship, whereas Americans were more likely to group objects together if they belonged to a category defined by a simple rule (e.g., notebook-magazine; Ji et al. 2004). Similarly, work with Russian students (Grossmann 2010) and Russian small-scale farmers (Luria 1976) showed strong tendencies for participants to group objects according to their practical functions. This appears widespread, as Norenzayan et al. (n.d.) examined classification among the Mapuche and Sangu subsistence farmers in Chile and Tanzania, respectively, and found that their classification resembled the Chinese pattern, although it was exaggerated towards holistic reasoning.
5. In a similar vein, research with East Asians found they were more likely to group objects if the objects shared a strong family resemblance, whereas Americans were more likely to group the same objects if they could be assigned to that group on the basis of a deterministic rule (Norenzayan et al. 2002b). When those results are compared with Uskul et al.'s (2008) findings from herding, fishing, and tea-farming communities on the Black Sea coast in Turkey – the two studies used the same stimuli – it is evident that European-Americans are again at the extreme (see our Figure 5).

Figure 5. Relative dominance of rule-based versus family resemblance–based judgments of categories for the same cognitive task. European-American, Asian-American, and East Asian university students were tested by Norenzayan et al. (2002b); the herders, fishermen, and farmers of Turkey's Black Sea coast were tested by Uskul et al. (2008). Positive scores indicate a relative bias towards rule-based judgments, whereas negative scores indicate a relative bias towards family resemblance–based judgments. It can be seen that European-American students show the most pronounced bias toward rule-based judgments, and they are outliers in terms of absolute deviation from zero. Adapted from Norenzayan et al. (2002b) and Uskul et al. (2008).
In summary, although analytic and holistic cognitive systems are available to all normal adults, a large body of evidence shows that the habitual use of what are considered “basic” cognitive processes, including those involved in attention, perception, categorization, deductive reasoning, and social inference, varies systematically across populations in predictable ways, highlighting the difference between the West and the rest. Several biases and patterns are not merely differences in strength or tendency, but show reversals of Western patterns. We emphasize, however, that Westerners are not unique in their cognitive styles (Uskul et al. 2008; Witkin & Berry 1975), but they do occupy the extreme end of the distribution.
4.4. Moral reasoning
A central concern in the developmental literature has been the way people acquire the cognitive foundations of moral reasoning. The most influential approach to the development of moral reasoning has been Kohlberg's (1971; 1976; 1981), in which people's abilities to reason morally are seen to hinge on cognitive abilities that develop over maturation. Kohlberg proposed that people progressed through the same three levels: (1) Children start out at a pre-conventional level, viewing right and wrong as based on internal standards regarding the physical or hedonistic consequences of actions; (2) then they progress to a conventional level, where morality is based on external standards, such as that which maintains the social order of their group; and finally (3) some progress further to a post-conventional level, where they no longer rely on external standards for evaluating right and wrong, but instead do so on the basis of abstract ethical principles regarding justice and individual rights – the moral code inherent in most Western constitutions.
While all of Kohlberg's levels are commonly found in WEIRD populations, much subsequent research has revealed scant evidence for post-conventional moral reasoning in other populations. One meta-analysis carried out with data from 27 countries found consistent evidence for post-conventional moral reasoning in all the Western urbanized samples, yet found no evidence for this type of reasoning in small-scale societies (Snarey 1985). Furthermore, it is not just that formal education is necessary to achieve Kohlberg's post-conventional level. Some highly educated non-Western populations do not show this post-conventional reasoning. At Kuwait University, for example, faculty members scored lower on Kohlberg's schemes than the typical norms for Western adults, and the elder faculty there scored no higher than the younger ones, contrary to Western patterns (Al-Shehab 2002; Miller et al. 1990).
Research in moral psychology indicates that typical Western subjects rely principally on justice- and harm/care-based principles in judging morality. However, recent work indicates that non-Western adults and Western religious conservatives rely on a wider range of moral principles than these two dimensions of morality (e.g., Baek 2002; Haidt & Graham 2007; Haidt et al. 1993; Miller & Bersoff 1992). Shweder et al. (1997) proposed that in addition to a dominant justice-based morality, which they termed an “ethic of autonomy,” there are two other ethics that are commonly found outside the West: an ethic of community, in which morality derives from the fulfillment of interpersonal obligations that are tied to an individual's role within the social order, and an ethic of divinity, in which people are perceived to be bearers of something holy or god-like, and have moral obligations to not act in ways that are degrading to or incommensurate with that holiness. The ethic of divinity requires that people treat their bodies as temples, not as playgrounds, and so personal choices that seem to harm nobody else (e.g., about food, sex, and hygiene) are sometimes moralized (for a further elaboration of moral foundations, see Haidt & Graham 2007). In sum, the high-socioeconomic status (SES), secular Western populations that have been the primary target of study thus far appear unusual in a global context, based on their peculiarly narrow reliance, relative to the rest of humanity, on a single foundation for moral reasoning (based on justice, individual rights, and the avoidance of harm to others; cf. Haidt & Graham 2007).
4.5. Other potential differences
There are many other psychological phenomena in which Western samples differ from non-Western ones; however, at present there are insufficient data in these domains derived from diverse populations to assess where Westerners reside in the human spectrum. For example, compared with Westerners, some non-Westerners (1) have less dynamic social networks, in which people work to avoid negative interactions among their existing networks rather than seeking new relations (Adams 2005); (2) prefer lower to higher arousal-positive affective states (Tsai 2007); (3) are less egocentric when they try to take the perspective of others (Cohen et al. 2007; Wu & Keysar 2007); (4) have weaker motivations for consistency (Kanagawa et al. 2001; Suh 2002); (5) are less prone to “social-loafing” (i.e., reducing efforts on group tasks when individual contributions are not being monitored) (Earley 1993); (6) associate fewer benefits with a person's physical attractiveness (Anderson et al. 2008); and (7) have more pronounced motivations to avoid negative outcomes relative to their motivations to approach positive outcomes (Elliot et al. 2001; Lee et al. 2000).
With reference to the spatial reasoning patterns discussed earlier, emerging evidence suggests that a geocentric bias (i.e., a landscape- or earth-fixed spatial coordinate system) may be much more widespread than previously thought – indeed, it may be the common pattern outside of the West, even among non-Western speakers of languages which make regular use of egocentric linguistic markers. Comparative research contrasting children and adults in Geneva with samples in Indonesia, Nepal, and rural and urban India has found the typical geocentric reasoning pattern in all of these populations, except for the Geneva samples (Dasen et al. 2006). Although many of these population-level differences are pronounced, more research is needed before we can assess whether the geocentric pattern is common across a broader swath of humanity.
4.6. Similarities between Western and non-Western societies
We expect that as more large-scale comparative studies of Western and non-Western populations are conducted, they will reveal substantial similarities in psychological processes. However, despite the relative ease of conducting such studies (as compared to working in small-scale societies), there have been few comparative programs that have put universality claims to the test. Here we highlight three examples of larger-scale comparative projects that show broad and important similarities across populations.
1. Mate preferences: First, Buss (1989) compared people from 37 (largely industrialized) populations around the world and found some striking similarities in their mate preferences. In all 37 of the populations, males ranked the physical attractiveness of their mates to be more important than did females; and in 34 of the 37 populations, females ranked the ambition and industriousness of their mates as more important than did males (but for other interpretations, see Eagly & Wood 1999).Footnote 9 Likewise, Kenrick and Keefe (1992a; 1992b) provide evidence of robust sex differences in age preferences for mates across populations. Finally, comparative research examining men's preferred waist-to-hip ratios in potential mates finds that men in both industrialized and developing large-scale populations prefer a waist-to-hip ratio of around 0.7 (Singh 2006; Singh & Luis 1994; Streeter & McBurney 2003; Swami et al. 2007).Footnote 10
2. Personality structure: Recent efforts have taken personality instruments to university students in 51 different countries (McCrae et al. 2005). In most of these populations, the same five-factor structure emerges that has previously been found with American samples,Footnote 11 indicating the universal structure of the Five Factor Model of personality (also see Allik & McCrae 2004; Yik et al. 2002).Footnote 12
3. Punishment of free-riding: While in Herrmann et al.'s (2008) study (Fig. 4) both initial cooperation and antisocial punishment varied dramatically, the willingness of players to punish low contributors (free-riders) did not differ among populations once age, sex, and other socio-demographic controls were included.
4.7. Summary of Contrast 2
Although robust patterns have emerged among people from industrialized societies, Westerners emerge as unusual – frequent global outliers – on several key dimensions. The experiments reviewed are numerous, arise from different disciplines, use diverse methods, and are often part of systematically comparable data sets created by unified projects. Many of these differences are not merely differences in the magnitude of effects but often show qualitative differences, involving effect reversals or novel phenomena such as allocentric spatial reasoning and antisocial punishment.
5. Contrast 3: Contemporary Americans versus the rest of the West
Above we compared WEIRD populations to non-Western populations. However, given the dominance of American research within psychology (see May 1997) and the behavioral sciences, it is important to assess the similarity of American data with that from Westerners more generally. Is it reasonable to generalize from Americans to the rest of the West? Americans are, of course, people too, so they will share many psychological characteristics with other Homo sapiens. At present, we could find no systematic research program to compare Americans with other Westerners, so the evidence presented is assembled from many sources.
5.1. Individualism and related psychological phenomena
Americans stand out relative to other Westerners on phenomena that are associated with independent self-concepts and individualism. A number of analyses, using a diverse range of methods, reveal that Americans are, on average, the most individualistic people in the world (e.g., Hofstede 1980; Lipset 1996; Morling & Lamoreaux 2008; Oyserman et al. 2002). The observation that the United States is especially individualistic is not new and dates at least as far back as de Tocqueville (1835). The unusually individualistic nature of Americans may be caused by, or reflect, an ideology that particularly stresses the importance of freedom and self-sufficiency, as well as various practices in education and childrearing that may help to inculcate this sense of autonomy. American parents, for example, were the only ones in a survey of 100 societies who created a separate room for their baby to sleep in (Burton & Whiting 1961; also see Lewis 1995), reflecting that from the time they are born, Americans are raised in an environment that emphasizes their independence (on the unusual nature of American childrearing, see Lancy 2008; Rogoff 2003).Footnote 13
The extreme individualism of Americans is evident on many demographic and political measures. In American Exceptionalism, sociologist Seymour Martin Lipset (1996) documents a long list of the ways that Americans are unique in the Western world. At the time of Lipset's surveys, compared with other Western industrialized societies, Americans were found to be the most patriotic, litigious, philanthropic, and populist (they have the most positions for elections and the most frequent elections, although they have among the lowest voter turnout rates). They were also among the most optimistic, and the least class-conscious. They were the most churchgoing in Protestantism, and the most fundamentalist in Christendom, and were more likely than others from Western industrialized countries to see the world in absolute moral terms. In contrast to other large Western industrialized societies, the United States had the highest crime rate, the longest working hours, the highest divorce rate, the highest rate of volunteerism, the highest percentage of citizens with a post-secondary education, the highest productivity rate, the highest GDP, the highest poverty rate, and the highest income-inequality rate; and Americans were the least supportive of various governmental interventions. The United States is the only industrialized society that never had a viable socialist movement; it was the last country to get a national pension plan, unemployment insurance, and accident insurance; and, at the time of writing, it remains the only industrialized nation that does not have a general allowance for families or a national health insurance plan. In sum, there is some reason to suspect that Americans might be different from other Westerners, as de Tocqueville noted.
Given the centrality of self-concept to so many psychological processes, it follows that the unusual emphasis in America on individualism and independence would be reflected in a wide spectrum of self-related phenomena. For example, self-concepts are implicated when people make choices (e.g., Vohs et al. 2008). While Westerners in general tend to value choices more than non-Westerners do (e.g., Iyengar & DeVoe 2003), Americans value choices more still, and prefer more opportunities, than do Westerners from elsewhere (Savani et al. 2008). For example, in a survey of people from six Western countries, only Americans preferred a choice from 50 different ice cream flavors compared with 10 flavors. Likewise, Americans (and Britons) prefer to have more choices on menus in upscale restaurants than do people from other European countries (Rozin et al. 2006). The array of choices available, and people's motivation to make such choices, is even more extreme in the United States compared to the rest of the West.
Likewise, because cultural differences in analytic and holistic reasoning styles appear to be influenced by whether one views the social world as a collection of discrete individuals or as a set of interconnected relationships (Nisbett 2003), it follows that exceptionally individualistic Americans should be exceptionally analytic as well. One recent study suggests that this might indeed be the case: Americans showed significantly more focused attention in the Framed Line Task than did people from European countries (Britain and Germany) as well as from Japan (Kitayama et al. 2009). Although more research is needed, Americans may see the world in more analytic terms than the rest of the West.
Terror management theory maintains that because humans possess the conscious awareness that they will someday die, they cope with the associated existential anxiety by making efforts to align themselves with their cultural worldviews (Greenberg et al. 1997). The theory is explicit that the existential problem of death is a human universal, and indeed posits that an awareness of death preceded the evolution of cultural meaning systems in humans (Becker 1973). In support of this argument of universality, the tendency to defend one's cultural worldview following thoughts about death has been found in every one of the more than a dozen diverse populations studied thus far. However, there is also significant cross-population diversity in the magnitude of these effects. A recent meta-analysis of all terror management studies reveals that the effect sizes for cultural worldview defense in the face of thoughts of death are significantly more pronounced among American samples (r=0.37) than among other Western (r=0.30) or non-Western samples (r=0.26: Burke et al. 2010). Curiously, Americans respond more defensively to death thoughts than do those from other countries.
In the previous section, we discussed Herrmann et al.'s (2008) work showing substantial qualitative differences in punishment between Western and non-Western societies. While Western countries all clump at one end of Figure 4, the Americans anchor the extreme end of the West's distribution. Perhaps it is this extreme tendency for Americans to punish free-riders, while not punishing cooperators, that contributes to Americans having the world's highest worker productivity. American society is also anomalous, even relative to other Western societies, in its low relational focus in work settings, which is reflected in practices such as the encouragement of an impersonal work style, direct (rather than indirect) communication, the clear separation of the work domain from the non-work, and discouragement of friendships at work (Sanchez-Burks 2005).
5.2. Similarities between Americans and other Westerners
We are unable to locate any research program (other than the ones reviewed in the first two telescoping contrasts) that has demonstrated that American psychological and behavioral patterns are similar to the patterns of other Westerners. We reason that there should be many similarities between the United States and the rest of the West, and we assume that many researchers share our impression. Perhaps this is why we are not able to find studies that have been conducted to explicitly establish these similarities – many researchers likely would not see such studies as worth the effort. In the absence of comparative evidence for a given phenomenon, it might not be unreasonable to assume that the Americans would look similar to the rest of the West. However, the above findings provide a hint that, at least along some key dimensions, Americans are extreme.
5.3. Summary of Contrast 3
There are few research programs that have explicitly sought to contrast Americans with other Westerners on psychological or behavioral measures. However, those phenomena for which sufficient data are available to make cross-population comparisons reveal that American participants are exceptional even within the unusual population of Westerners – outliers among outliers.
6. Contrast 4: Typical contemporary American subjects versus other Americans
The previous contrasts have revealed that WEIRD populations frequently occupy the tail-ends of distributions of psychological and behavioral phenomena. However, it is important to recognize, as a number of researchers have (e.g., Arnett 2008; Medin & Atran 2004; Sears 1986), that the majority of behavioral research on non-clinical populations within North America is conducted with undergraduates (Peterson 2001; Wintre et al. 2001). Further, within psychology, the subjects are usually psychology majors, or at least taking introductory psychology courses. In the case of child participants, they are often the progeny of high-SES people. Thus, there are numerous social, economic, and demographic dimensions that tentatively suggest that these subjects might be unusual. But, are they?
6.1. Comparisons among contemporary adult Americans
Highly educated Americans differ from other Americans in many important respects. In the following subsections, we first highlight findings from social psychology and then from behavioral economics.
6.1.1. Findings from social psychology
For a number of the phenomena reviewed above in which Americans were identified as global outliers, highly educated Americans occupy an even more extreme position than less-educated Americans. Here we itemize eight examples.
1. Although college-educated Americans have been found to rationalize their choices in dozens of post-choice dissonance studies, Snibbe and Markus (2005) found that non-college-educated American adults do not (cf. Sheth 1970).
2. Although Americans are the most individualistic people in the world, American undergraduates score higher on some measures of individualism than do their non-college-educated counterparts, particularly for those aspects associated with self-actualization, uniqueness, and locus of control (Kusserow 1999; Snibbe & Markus 2005).
3. Conformity motivations were found to be weaker among college-educated Americans than among non-college-educated Americans (Stephens et al. 2007), who acted in ways more similar to those observed in East Asian samples (cf. Kim & Markus 1999).
4. Non-college-educated adults are embedded in more tightly structured social networks than are college students (Lamont 2000), which raises the question of whether research on relationship formation, dissolution, and interdependence conducted among students will generalize to the population at large (cf. Adams 2005; Falk et al. 2009).
5. A large study that sampled participants from the general population in southeastern Michigan found that working-class people were more interdependent and more holistic than middle-class people (Na et al., in press).
6. The moral reasoning of college-educated Americans occurs almost exclusively within the ethic of autonomy, whereas non-college-educated Americans use the ethics of community and divinity (Haidt et al. 1993; Jensen 1997). Parallel differences exist in moral reasoning between American liberals and conservatives (Haidt & Graham 2007).
7. American college students respond more favorably toward other groups in society, are more supportive of racial diversity, and are more motivated to mask or explain away negative intergroup attitudes, than are American (non-student) adults (Henry 2009). This difference is all the more problematic because the percentage of psychological studies of prejudice that exclusively rely on student samples has increased over the last two decades (from 82.7% to 91.6%), and this percentage is accentuated in the higher-impact social psychology journals (Henry 2009).
8. A meta-analysis reveals that college students (the vast majority of whom were American) respond with more cultural worldview defense to death thoughts (r=0.36) than do non-college students (r=0.25: Burke et al. 2010).
More broadly, a second-order meta-analysis (N>650,000, Number of studies>7,000) of studies that included either college student samples or non-student adult samples revealed that the two groups differed either directionally or in magnitude for approximately half of the phenomena studied (e.g., attitudes, gender perceptions, social desirability: Peterson 2001). However, no clear pattern regarding the factors that accounted for the differences emerged. Other research has found that American undergraduates have higher degrees of self-monitoring (Reifman et al. 1989), are more susceptible to attitude change (Krosnick & Alwin 1989), and are more susceptible to social influence (Pasupathi 1999) compared to non-student adults.
6.1.2. Findings from behavioral economics
Consistent and non-trivial differences between undergraduates and fully-fledged adults are emerging in behavioral economics as well. When compared with diverse and sometimes representative adult samples, undergraduate subjects consistently set the lower bound for prosociality in experimental measures of trust, fairness, cooperation, and punishment of unfairness or free-riding. For example, in both the Ultimatum and Dictator Games, non-student Americans (both rural and urban participants) make significantly higher offers than do undergraduate subjects (Henrich & Henrich 2007). The difference is most pronounced in Dictator Games in which samples of non-student American adults from Missouri (urban and rural Missouri did not differ) offered a mean 47% of the total stake while undergraduate freshmen gave 32%, well within the typical range for undergraduates in this game (Camerer 2003; Ensminger & Cook, under review; Henrich & Henrich, under review). These seemingly high offers among non-students in the Dictator Game are similar to those found in other non-student samples in the United States (Carpenter et al. 2005; Henrich & Henrich 2007). It is the student results that are anomalous. Similarly, more recent research comparing students with both representative and selectively diverse samples of adults using the Trust Game, Ultimatum Game, and Public Goods Game shows that undergraduates ride the lower bound on prosociality measures (Bellemare & Kröger 2007; Bellemare et al. 2008; Carpenter et al. 2008; Fehr & List 2004). In fact, “being an undergraduate” (or being young and educated) is one of the few demographic variables that seems to matter in explaining within-country variability.
Behavioral economics research also indicates that developmental or acculturative changes to some motivations and preferences are still occurring within the age range of undergraduates (Henrich 2008). For example, Ultimatum Game offers continue to change over the university years, with freshmen making lower offers than seniors (Carter & Irons 1991). Other work shows that offers do not hit their adult plateau in behavioral games until around age 24 (Carpenter et al. 2005), after which time offers do not change with age until people reach old age. In the Trust Game, measures of trust and trustworthiness increase with age, until they reach a plateau close to age 30 (Sutter & Kocher 2007a).
Such research may explain why treatment effects also depend on the subject pool used, with students being the most sensitive. For example, Dictator Game treatments involving double-blind setups, such that the experimenter cannot know how much a subject contributes, have dramatically smaller effects on offers among non-student adults, and sometimes no effect at all in adult populations outside the United States (Lesorogol & Ensminger, under review). Similarly, unconscious religious primes increased Dictator Game offers in a Canadian student sample of religious and nonreligious participants alike, but when non-student adults were sampled, no significant effect emerged for the nonreligious adults (Shariff & Norenzayan 2007).
For several of these economics measures, such as public good contributions (Egas & Riedl 2008), undergraduate behavior is qualitatively similar to fully-fledged adult behaviors, just less prosocial. However, in at least one area (so far), it appears that a particularly interesting phenomenon is qualitatively absent in undergraduates by comparison with fully-fledged adults from the same populations: As discussed earlier for small-scale societies, researchers using the Ultimatum Game have found systematic, non-trivial tendencies in many populations to reject offers greater than 50% of the stake, a phenomenon neither previously observed in students nor intuited by researchers. Recent work using representative adult samples has revealed this tendency for “hyper-fair rejections” among non-student adults in Western populations, though it is substantially weaker than in many of the non-Western populations discussed above (Bellemare et al. 2008; Guth et al. 2003; Wallace et al. 2007).
6.2. Comparisons among subpopulations of American children
Although studying young children is one important strategy for discerning universals, it does not completely avoid these challenges, as developmental studies are frequently biased toward middle- and upper-class American children. Recent evidence indicates that something as seemingly basic as the differences in spatial reasoning between males and females (Hyde 1981; Mann et al. 1990; Voyer et al. 1995) does not generalize well to poor American children. On two different spatial tasks, repeated four times over two years with 547 second- and third-graders, low-SES children did not show the sex differences observed in middle- and high-SES children from Chicago (Levine et al. 2005). Such findings, when combined with other research indicating no sex differences on spatial tasks among migratory foragers (Berry 1966), suggest that a proper theory of the origins of sex differences in spatial abilities needs to explain why both poor Chicago children and foragers do not show any sex differences.
Research on IQ using analytical tools from behavioral genetics has long shown that IQ is highly heritable, and not strongly influenced by shared family environment (Bouchard 2004). However, research using 7-year-old twins drawn from a wide range of socioeconomic statuses shows that contributions of genetic variation and shared environment vary dramatically from low- to high-SES children (Turkheimer et al. 2003). For high-SES children, where environmental variability is negligible, genetic differences account for 70–80% of the variation, with shared environment contributing less than 10%. For low-SES children, where there is far more variability in environmental contributions to intelligence, genetic differences account for 0–10% of the variance, with shared environment contributing about 60%. This raises the specter that much of what we think we have learned from behavioral genetics may be misleading, as the data are disproportionately influenced by WEIRD people and their children (Nisbett 2009).
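To make the logic of such variance decompositions concrete, the following is a minimal sketch, assuming the textbook Falconer approximation rather than the modeling actually used in the cited twin study, which may well differ. The function name and the twin correlations are hypothetical: the correlations were chosen only so that the resulting shares fall inside the ranges quoted above, and they are not data from any study.

```python
# Illustrative sketch only: Falconer-style decomposition of trait variance
# from twin correlations. All numbers are HYPOTHETICAL and were picked to
# land in the ranges quoted in the text; they are not published data.

def falconer_decomposition(r_mz: float, r_dz: float) -> dict:
    """Approximate variance shares from identical (MZ) and fraternal (DZ)
    twin correlations, under the standard Falconer assumptions."""
    h2 = 2.0 * (r_mz - r_dz)  # share attributed to additive genetic differences
    c2 = 2.0 * r_dz - r_mz    # share attributed to shared (family) environment
    e2 = 1.0 - r_mz           # non-shared environment plus measurement error
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical (r_mz, r_dz) pairs for two socioeconomic strata.
HYPOTHETICAL_TWIN_CORRELATIONS = {
    "high_ses": (0.85, 0.47),
    "low_ses": (0.68, 0.64),
}

if __name__ == "__main__":
    for group, (r_mz, r_dz) in HYPOTHETICAL_TWIN_CORRELATIONS.items():
        shares = falconer_decomposition(r_mz, r_dz)
        print(group, {name: round(value, 2) for name, value in shares.items()})
```

Under these assumed correlations, nearly identical MZ and DZ resemblance in the low-SES stratum pushes almost all of the familial similarity into the shared-environment term, whereas a large MZ–DZ gap in the high-SES stratum pushes it into the genetic term. The point of the sketch is only that the same estimator can yield very different heritability figures depending on which subpopulation supplies the correlations.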
A similar problem of generalizing from narrow samples exists for genetics research more broadly. Genetic findings obtained with one sample frequently do not replicate in a second sample, to the point that Nature Genetics now requires all empirical papers to include data from two independent samples. There are at least two ways in which geographically limited samples may give rise to spurious genotype-phenotype associations. First, the proportions of various polymorphisms vary across different regions of the world due to different migratory patterns and histories of selection (e.g., Cavalli-Sforza et al. 1994). A genetic association identified in a sample obtained from one region may not replicate in a sample from another region because it involves interactions with other genetic variants that are not equally distributed across regions. Second, the same gene may be expressed differently across populations. For example, Kim et al. (in press) found that a particular serotonin receptor polymorphism (5-HTR1A) was associated with increased attention to focal objects among Americans, but that the same allele was associated with decreased attention to focal objects among Koreans. Researchers would draw different conclusions regarding the function of this polymorphism depending upon the location of their sample. A more complete investigation of heritability and genetic associations demands a comparison of measures across diverse environments and populations.
6.3. Contemporary Americans compared with previous generations
Contemporary Americans may also be psychologically unusual compared to their forebears 50 or 100 years ago. Some documented changes among Americans over the past few decades include increasing individualism, as indicated by increasingly solitary lifestyles dominated by individual-centered activities and a decrease in group participation (Putnam 2000), increasingly positive self-esteem (Twenge & Campbell 2001), and a lower need for social approval (Twenge & Im 2007). These findings suggest that the unusual nature of Americans in these domains, as we reviewed earlier, may be a relatively recent phenomenon. For example, Rozin (2003) found that attitudes towards tradition are more similar between Indian college students and American grandparents than they are between Indian and American college students. Although more research is needed to reach firm conclusions, these initial findings raise doubts as to whether research on contemporary American students (and WEIRD people more generally) is even extendable to American students of previous decades.
The evidence of temporal change is probably best for IQ. Research by Flynn (1987; 2007) shows that IQ scores increased over the last half century by an average of 18 points across all industrialized nations for which there were adequate data. Moreover, this rise was driven primarily by increasing scores on the analytic subtests. This is a striking finding considering recent work showing how unusual Westerners are in their analytic reasoning styles. Given such findings, it seems plausible that Americans of only 50 or 100 years ago were reasoning in ways much more similar to the rest of the non-Western world than Americans of today.
6.4. Similarities between typical experimental subjects and other Americans
We expect that typical American subjects are very similar to other Americans in myriad ways. The problem with this expectation, however, is that it is not immediately apparent in which domains they should be similar. We think that there are enough differences between these two groups to raise concerns about speaking incautiously on the thoughts and behaviors of Americans, in general. There have been rather few studies that have explicitly contrasted whether undergraduates or college-educated Americans differ in various psychological measures from those who are not currently students, or who were never college-educated. There are numerous meta-analyses that include data from both college student and non-student samples that speak partially to this issue. Although the meta-analyses do not specify the national origin of the participants, we assume that most of the subjects were American. Some of these analyses indicate considerable similarity between student and non-student samples. For example, the aforementioned second-order meta-analysis (Peterson 2001) revealed similarities between students and non-student samples for about half of the phenomena. Similarly, the relation between attribution styles and depression (Sweeney et al. 1986), and the relations among intentions, attitudes, and norms (Farley et al. 1981) do not show any appreciable differences between student and non-student samples. In these instances, there do not appear to be any problems in generalizing from student to non-student samples, which may suggest that college education, and SES more generally, is not related to these phenomena.
6.5. Summary of Contrast 4
Numerous findings from multiple disciplines indicate that, in addition to many similarities, there are differences between typical subjects and the rest of the American population in unexpected domains. In some of these domains (e.g., individualism, moral reasoning, worldview defense in response to death thoughts, and perceptions of choice), the data from American undergraduates represent even more dramatic departures from the patterns identified in non-Western samples. Further, contemporary American college students appear further removed along some of these dimensions than did their predecessors a few decades earlier. Typical subjects may be outliers within an outlier population.
7. General discussion
As the four contrasts summarized above reveal, WEIRD subjects are unusual in the context of the world in some key ways. In this section, we first discuss the main conclusions and implications of our empirical review. We then address two common challenges to our claim that WEIRD subjects are frequent outliers. Finally, we offer some recommendations for how the behavioral sciences may address these challenges.
7.1. Summary of our conclusions and implications
7.1.1. Pronounced population variation is commonplace in the behavioral sciences
There are now enough sources of experimental evidence, using widely differing methods from diverse disciplines, to indicate that there is substantial psychological and behavioral variation among human populations. As we have seen, some of this variability involves differences in the magnitude of effects, motivations, or biases. There is also considerable variability in both whether certain effects or biases exist in some populations (as with antisocial punishment and the Müller-Lyer illusion) and in which direction they go (as with preferences for analytic versus holistic reasoning). The causal origins of such population-level variation may be manifold, including behavioral plasticity in response to different environments, epigenetic effects, divergent trajectories of cultural evolution, and even the differential distribution of genes across groups in response to divergent evolutionary histories. With all these causal possibilities on the table, we think the existence of this population-level variation alone should suffice to energize course corrections in our research directions.
We have also identified many domains in which there are striking similarities across populations. These similarities could indicate reliably developing adaptations (e.g., theory of mind), by-products of innate adaptations (such as some aspects of religious cognition), or independent inventions or diffusions of learned responses that have universal utility (such as counting systems, dance, cooking practices, or techniques for making fire). We have no doubt that there are many more pan-human similarities than we have mentioned (e.g., movement perception, taste for sugar, chunking, habituation, and depth computation); however, thus far there are few databases with individual-level measures sufficient to evaluate the similarities or differences across populations.
Many of the processes identified above that vary dramatically across populations would seem to be “basic” psychological processes. The reviewed findings identified variation in aspects of visual perception, memory, attention, fairness motivations, categorization, induction, spatial cognition, self-enhancement, moral reasoning, defensive responses to thoughts about death, and heritability estimates of IQ. These domains are not unique to the social world – they span social as well as nonsocial aspects of the environment, and do not appear to be any less “fundamental” than those domains for which much similarity has been identified. At this point, we know of no strong grounds to make a priori claims to the “fundamentalness” or the likely universality of a given psychological process.
The application of evolutionary theory does not provide grounds for such a priori claims of “fundamental” or “basic” processes, at least in general. Evolutionary theory is a powerful tool for generating and eliminating hypotheses. However, despite its power (or perhaps because of it), it is often overly fecund, as it generates multiple competing hypotheses, with predictions sometimes dependent on unknown or at least debatable aspects of ancestral environments. Hence, adjudicating among alternative evolutionary hypotheses often requires comparative work. Moreover, theoretical work is increasingly recognizing that natural selection has favored ontogenetic adaptations that allow humans, and other species, to adapt non-genetically to local environments (Henrich 2008).
Although we do not yet know of a principled way to predict whether a given psychological process or behavioral pattern will be similar across populations in the absence of comparative empirical research, it would surely be of much value to the field if there were a set of criteria that could be used to anticipate universality (Norenzayan 2006; Norenzayan & Heine 2005). Here we discuss some possible criteria that might be considered.
First, perhaps there are some domains in which researchers could expect phenomena to be more universal than they are in other domains. We believe that the degree of universality does likely vary across domains, although this has yet to be demonstrated. Many researchers (including us) have the intuition that there are cognitive domains related to attention, memory, and perception in which inter-population variability is likely to be low. Our review of the data, however, does not bolster this intuition. Second, it might be reasonable to assume that some phenomena are more fundamental to the extent that they are measured at a physiological or genetic level, such as genotype-phenotype relations or neural activity. However, recall that the same genes can be expressed differently across populations (e.g., Kim et al., in press), and the same cognitive task may be associated with different neural activations across populations (e.g., Hedden et al. 2008). Third, there may be criteria by which one could confidently make generalizations from one well-studied universal phenomenon to another similar phenomenon; for example, because pride displays are highly similar across populations (e.g., Tracy & Matsumoto 2008), it might follow that the conceptually related shame display should also be similar across populations (Fessler 1999).
Fourth, it would seem that demonstrating a process or effect in other species, such as rats or pigeons, would indicate human universality (and more). Although this may generally be true, several researchers have argued that culture-gene coevolution has dramatically shaped human evolution in a manner uncharacteristic of other species (Richerson & Boyd 2005). Part of this process may involve the off-loading of previously genetically encoded preferences and abilities into culture (e.g., tastes for spices). Fifth, phenomena which are evident among infants might be reasonably assumed to be more universal than phenomena identified in older children or adults. We suspect this is the case, but it is possible that early biases can be reversed by later ontogeny. Showing parallel findings or effects in both adults and infants from the same population is powerful, and it raises the likelihood of universality; but quite different environments might still shape adult psychologies away from infant patterns (consider the spatial cognition finding with apes, children, and adults). Finally, perhaps particular brain regions are less responsive to experience, such that if a given phenomenon was localized to those regions one could anticipate more universality.
Whatever the relevant principles, it is an important goal to develop theories that predict which elements of our psychological processes are reliably developing across normal human environments and which are locally variable (focusing on the how and why of that variability: Barrett 2006). We note that behavioral scientists have typically been overly confident regarding the universality of what they study, and as this review reveals, our intuitions for what is universal do not have a particularly good track record. We also think this article explains why those intuitions are so poor: Most scientists are WEIRD, or were trained in WEIRD subcultures. Hence, any set of criteria by which universality can be successfully predicted must be grounded in substantial empirical data. We look forward to seeing data that can help to identify criteria to anticipate universality in future research.
7.1.2. WEIRD subjects may often be the worst population from which to make generalizations
The empirical foundation of the behavioral sciences comes principally from experiments with American undergraduates. The patterns we have identified in the available (albeit limited) data indicate that this sub-subpopulation is highly unusual along many important psychological and behavioral dimensions. It is not merely that researchers frequently make generalizations from a narrow subpopulation. The concern is that this particular subpopulation is highly unrepresentative of the species. The fact that WEIRD people are the outliers in so many key domains of the behavioral sciences may render them one of the worst subpopulations one could study for generalizing about Homo sapiens.
To many anthropologically savvy researchers it is not surprising that Americans, and people from modern industrialized societies more generally, appear unusual vis-à-vis the rest of the species. For the vast majority of their evolutionary history, humans have lived in small-scale societies without formal schools, governments, hospitals, police, complex divisions of labor, markets, militaries, formal laws, or mechanized transportation. Every household provisioned much or all of its own food; made its own clothes, tools, and shelters; and – aside from sexual divisions of labor – most everyone had to master the same skills and domains of knowledge. Children typically did not grow up in small, monogamous nuclear families with few kin around, nor were they away from their families at school for much of the day.
Rather, through the course of this history, and in some contemporary societies still, children have typically grown up in mixed-age playgroups, where they received little active instruction or exposure to books or TV (Fiske 1998; Lancy 1996; 2008); they learned largely by observation and imitation; received more directives, more physical punishment, and less praise; and were less likely to be engaged in conversation by adults (and there's no “why” phase). By age 10, children in some foraging societies obtain sufficient calories to feed themselves, and routinely kill and butcher animals. Adolescent females in particular take on most of the work-related responsibilities of adult women. People in small-scale societies tend to have less reliable nutrition, greater exposure to hunger, pain, chronic diseases, and lethal dangers, and more frequently experience the death of family members. WEIRD people, from this perspective, grow up in, and adapt to, a rather atypical environment vis-à-vis that of most of human history. It should not be surprising that their psychological world is unusual as well.
7.1.3. Research topics have been limited by the heavy reliance on WEIRD populations
Relying on WEIRD populations may cause researchers to miss important dimensions of variation, and devote undue attention to behavioral tendencies that are unusual in a global context. There are good arguments for choosing topics that are of primary interest to the readers of the literature (i.e., largely WEIRD people); however, if the goal of the research program is to shed light on the human condition, then this narrow, unrepresentative sample may lead to an uneven and incomplete understanding. We suspect that some topics such as self-enhancement, cognitive dissonance, fairness, and analytic reasoning might not have been sufficiently interesting to justify in-depth investigation for most humans at most times throughout history. Conversely, the behavioral sciences have shown a rather limited interest in such topics as kinship, food, ethnicity (not race), religion, sacred values, polygamy, animal behavior, and rituals (for further critiques on this point, see Rozin 2001; Rozin et al. 2006). Had the behavioral sciences developed elsewhere, important theoretical foci and central lines of research might look very different (Medin & Bang 2008). Moreover, it may be unnecessarily difficult to study psychological phenomena in populations where the phenomena are unusually weak, as is the case for conformity or shame among Americans (see Fessler 2004).
7.1.4. Studying children and primates is crucial, but not a replacement for comparative work
Working with children and nonhuman primates is essential for understanding human psychology. However, it is important to note that despite its great utility and intuitive appeal, such research does not fully obviate these challenges. In the case of primate research, discovering parallel results in great apes and in one human population is an important step, but it doesn't tell us how reliably a particular aspect of psychology develops. As the spatial cognition work indicates, because language and cultural practices can – but need not – influence the cognition humans acquired from their phylogenetic history as apes, establishing the same patterns of cognition in apes and Westerners is insufficient to make any strong claims about universality. Suppose most psychologists were Hai||om speakers (instead of Indo-European speakers); they might have studied only Hai||om-speaking children and adults, as well as nonhuman apes, and concluded (incorrectly) that allocentric spatial reasoning was universal. Similarly, imagine if Tsimane economists compared Ultimatum Game results for Tsimane adults to those for chimpanzees (Gurven 2004; Henrich & Smith 2001; Jensen et al. 2007). These researchers would have found the same results for both species, and concluded that standard game theoretic models (assuming pure self-interest) and evolutionary analyses (Nowak et al. 2000) were fairly accurate predictors in Ultimatum Game behavior for both chimpanzees and humans – a very tidy finding. In both of these cases, the conclusions would be opposite to those drawn from studies with WEIRD populations.Footnote 14
Studying children is crucial for developing universal theories. However, evidence suggests that psychological differences among populations can emerge relatively early in children (as with folkbiological reasoning), and sometimes differences are even larger in children than in adults, as with the Müller-Lyer illusion. Moreover, developmental patterns may be different in different populations, as with sex differences in spatial cognition between low-income versus middle- and high-income subpopulations in the United States, or with performance in the false belief task. This suggests a need for converging lines of research. The most compelling conclusions regarding universality would derive from comparative work among diverse human populations done with both adults and children, including infants if possible. Human work can then be properly compared with work among nonhuman species (including but not limited to primates), based on a combination of field and laboratory work.
7.1.5. Understanding human diversity is crucial for constructing evolutionary theories of human behavior
Evolution has equipped humans with ontogenetic programs, including cultural learning, that help us adapt our bodies and brains to the local physical and social environment. Over the course of human history, convergent forms of cultural evolution have effectively altered (1) our physical environments with tools, technology, and knowledge; (2) our cognitive environments with counting systems, color terms, written symbols, novel grammatical structures, categories, and heuristics; and (3) our social environments with norms, institutions, laws, and punishments. Broad patterns of psychology may be – in part – a product of our genetic program's common response to culturally constructed environments that have emerged and converged over thousands of years. This means that the odd results from small-scale societies, instead of being dismissed as unusual exceptions, ought to be considered as crucial data points that help us understand the ontogenetic processes that build our psychologies in locally adaptive and context-specific ways.
Based on this and the previous point, it seems clear that comparative developmental studies involving diverse human societies combined with parallel studies of nonhuman primates (and other relevant species) provide an approach to understanding human psychology and behavior that can allow us to go well beyond merely establishing universality or variability. Such a systematic, multi-pronged approach can allow us to test a richer array of hypotheses about the processes by which both the reliable universal patterns and the diversity of psychological and behavioral variation emerge.
7.1.6. Exclusive use of WEIRD samples is justified when seeking existential proofsFootnote 15
Our argument should not be construed to suggest that the exclusive use of WEIRD samples should always be avoided. There are cases where the exclusive use of these samples would be legitimate to the extent that generalizability is not a relevant goal of the research, at least initially (Mook 1983). Research programs that are seeking existential proofs for psychological or behavioral phenomena, such as in the case of altruistic punishment discussed earlier (e.g., Fehr & Gächter 2002), could certainly start with WEIRD samples. That is, if the question is whether a certain phenomenon can be found in humans at all, reliance on any slice of humanity would be a legitimate sampling strategy. For another example, Tversky, Kahneman, and their colleagues sought to demonstrate the existence of systematic biases in decision-making that violate the basic principles of rationality (Gilovich et al. 2002). Most of their work was done with WEIRD samples. Counterexamples to standard rationality predictions could come from any sample in the world.Footnote 16 Furthermore, existential proof for a psychological phenomenon in WEIRD samples can be especially compelling when such a finding is theoretically unexpected. For example, Rozin and Nemeroff (1990) found (surprisingly, to many) that even elite U.S. university students show some magical thinking. Nevertheless, even in such cases, learning about the extent to which population variability affects such phenomena is a necessary subsequent phase of the enterprise, since any theory of human behavior ultimately has to account for such variability (if it exists).
7.2. Concerns with our argument
We have encountered two quite different sets of concerns about our argument. Those with the first set of concerns, elaborated below, worry that our findings are exaggerated because (a) we may have cherry-picked only the most extreme cases that fit our argument, and have thus exaggerated the degree to which WEIRD people are outliers, and/or (b) the observed variation across populations may be due to various methodological artifacts that arise from translating experiments across contexts. The second set of concerns is quite the opposite: Some researchers dismissively claim that we are making an obvious point which everyone already recognizes. Perhaps the most productive thing we can offer is to have these two groups of readers confront each other.
We preface our response to the first set of concerns with an admonition: Of course, many patterns and processes of human behavior and psychology will be generally shared across the species. We recognize that human thought and behavior is importantly tethered to our common biology and our common experiences. Given this, the real challenge is to design a research program that can explain the manifest patterns of similarity and variation by clarifying the underlying evolutionary and developmental processes.
We offer three general responses to the concern that our review presents a biased picture. To begin, we constructed our empirical review by targeting studies involving important psychological or behavioral concepts which were, or still are, considered to be universal, and which have been tested across diverse populations. We also listed and discussed major comparative studies that have identified important cross-population similarities. Since we have surely overlooked relevant material, we invite commentators to add to our efforts in identifying phenomena which have been widely tested across diverse subpopulations.
Second, we acknowledge that because proper comparative data are lacking for most studied phenomena, we cannot accurately evaluate the full extent of how unusual WEIRD people are. This is, however, precisely the point. We hope research teams will be inspired to span the globe and prove our claims of non-representativeness wrong. The problem is that we simply do not know how well many key phenomena generalize beyond the extant database of WEIRD people. The evidence we present aims only to challenge (provoke?) those who assume that undergraduates are sufficient to make claims about human psychology and behavior.
Third, to address the concern that the observed population-level differences originate from the methodological challenges of working across diverse contexts, we emphasize that the evidence in our article derives from diverse disciplines, theoretical approaches, and methodological techniques. They include experiments involving (1) incentivized economic decisions; (2) perceptual judgments; (3) deceptive experimental practices that prevented subjects from knowing what was being measured; and (4) children, who are less likely than adults to have motivations to shape their responses in ways that they perceive as desirable (or undesirable) to the experimenter. The findings, often published in the best journals of their respective fields, hinged on the researchers making a compelling case that their methodology was comparably meaningful across the populations being studied.
Furthermore, the same methods that have yielded population differences in one domain have demonstrated similarities in other domains (Atran 2005; Haun et al. 2006b; Henrich et al. 2006; Herrmann et al. 2008; Medin & Atran 2004; Segall et al. 1966). If one wants to highlight the demonstrated similarities, one cannot then ignore the demonstrated differences which relied on the same or similar methodologies.
Note also that few of the findings that we reviewed involve comparing means across subjective self-report measures, for which there are well-known challenges in making cross-population comparisons (Chen et al. 1995; Hamamura et al. 2008; Heine et al. 2002; Norenzayan et al. 2002b; Peng et al. 1997). Therefore, while methodological challenges may certainly be an issue in some specific cases, we think it strains credulity to suggest that such issues invalidate the thrust of our argument, and thus eliminate concerns about the non-representativeness of typical subjects.
7.3. Our recommendations
Our experience is that many researchers who work exclusively with WEIRD subjects would like to establish the broad generalizability of their findings. Even if they strongly suspect that their findings will generalize across the species, most agree that it would be better to have comparative data across diverse populations. The problem, then, is not exclusively a scientific or epistemological disagreement, but one of institutionalized incentives as well. Hence, addressing this issue will require adjusting the existing incentive structures for researchers. The central focus of these adjustments should be that in presenting our research designs to granting agencies, or our empirical findings in journals, we must explicitly address questions of generalizability and representativeness. With this in mind, we offer the following recommendations.
Journal editors and reviewers should press authors to both explicitly discuss and defend the generalizability of their findings. Claims and confidence regarding generalizability must scale with the strength of the empirical defense. If a result is novel, being explicitly uncertain about generalizability should be fine, but one should not imply universality without an empirically grounded argument.
This does not imply that all experimentalists need to shift to performing comparative work across diverse subject pools! As comparative evidence accumulates in different domains, researchers will be able to assess the growing body of comparative research and thus be able to calibrate their confidence in the generalizability of their findings. The widespread practice of subtly implying universality by using statements such as “people's reasoning is biased…” should be avoided. “Which people?” should be a primary question asked by reviewers. We think this practice alone will energize more comparative work (Rozin 2009).
The experience of evolutionarily-oriented researchers attests to the power of such incentives. More than other researchers in the social sciences, evolutionary researchers have led the way in performing systematic comparative work, drawing data from diverse societies. This is not because they are interested in variation per se (though some are), but because they are compelled, through some combination of their scientific drive and the enthusiasm of their critics, to test their hypotheses in diverse populations (e.g., Billing & Sherman 1998; Buss 1989; Daly & Wilson 1988; Fessler et al. 2005; Gangestad et al. 2006; Henrich et al. 2005; Kenrick & Keefe 1992a; 1992b; Low 2000; Medin & Atran 2004; Schaller & Murray 2008; Schmitt 2005; Sugiyama et al. 2002; Tracy & Robins 2008).
Meta-analyses are often compromised because many studies provide little background information about the subjects. Journal editors should require explicit and detailed information on subject-pool composition (see Rozin 2001). Some granting agencies already require this. Comparative efforts would also be greatly facilitated if researchers would make their data readily available to any who asked; or, better yet, data files should be made available online. Sadly, a recent investigation found that only 27% of authors in psychology journals shared their data when an explicit request was made to them to do so in accordance with APA guidelines (Wicherts et al. 2006). Tests of generalizability require broad access to published data.
Given the general state of ignorance with regard to the generalizability of so many findings, we think granting agencies, reviewers, and editors would be wise to give researchers credit for tapping and comparing diverse subject pools. Work with undergraduates and the children who live around universities is much easier than going out into the world to find subjects. As things stand, researchers suffer a competitive disadvantage when seeking a more diverse sampling of subjects. Because many of the best journals routinely require that papers include several studies to address concerns about internal validity (Carver 2004), the current incentives greatly favor targeting the easiest subject pool to access. There is an often unrecognized tradeoff between the experimental rigor of using multiple studies and the concomitant lack of generalizability that easy-to-run subject pools entail (Rozin 2009). If the incentive structure came to favor non-student subject pools, we anticipate that researchers could also be more persuasive in encouraging their universities and departments to invest in building non-student subject pools – for example, by setting up permanent psychological and behavioral testing facilities in bus terminals, Fijian villages, rail stations, airports, and anywhere diverse subjects might find themselves with extra time.
Beyond this, departments and universities should build research links to diverse subject pools. There are literally untapped billions of people around the world who would be willing to participate in research projects, as both paid subjects and research assistants. The amounts of money necessary to pay people who might normally make less than $12 per day are trivial vis-à-vis the average research grant. Development economists, anthropologists, and public health researchers already do extensive research among diverse populations, and therefore already possess the contacts and collaborations. Experimentalists merely need to work on building the networks.
Funding agencies, departments, and universities can encourage and facilitate both professors and graduate students to work on expanding sample diversity. Research partnerships with non-WEIRD institutions can be established to further the goal of expanding and diversifying the empirical base of the behavioral sciences. By supplying research leaves, adjusted expectations of student progress, special funding sources, and institutionalized relationships to populations outside the university as well as to non-WEIRD universities, these organizations can make an important contribution to building a more complete understanding of human nature.
8. Closing words
Although we are certainly not the first to worry about the representativeness of prevalent undergraduate samples in the behavioral sciences (Gergen 1973; Medin & Atran 2004; Norenzayan & Heine 2005; Rozin 2001; 2009; Sears 1986; Sue 1999), our efforts to compile an empirical case have revealed an even more alarming situation than previously recognized. The sample of contemporary Western undergraduates that so overwhelms our database is not just an extraordinarily restricted sample of humanity; it is frequently a distinct outlier vis-à-vis other global samples. It may represent the worst population on which to base our understanding of Homo sapiens. Behavioral scientists now face a choice – they can either acknowledge that their findings in many domains cannot be generalized beyond this unusual subpopulation (and leave it at that), or they can begin to take the difficult steps to building a broader, richer, and better-grounded understanding of our species.
ACKNOWLEDGMENTS
We thank several anonymous reviewers and the following colleagues for their very helpful comments on earlier versions of this manuscript: Nicholas Epley, Alan Fiske, Simon Gächter, Jonathan Haidt, Shinobu Kitayama, Shaun Nichols, Richard Nisbett, Paul Rozin, Mark Schaller, Natalie Henrich, Daniel Fessler, Michael Gurven, Clark Barrett, Ted Slingerland, Rick Shweder, Mark Collard, Paul Bloom, Scott Atran, Doug Medin, Tage Rai, Ayse Uskul, Colin Camerer, Karen Wynn, Tim Wilson, and Stephen Stich.