1. What is Grammatical Gender?

In everyday speech, the word “gender” is associated with the biological and social differences between women and men. In addition, people might know that languages can have masculine and feminine words. So at first blush, it may seem that grammatical gender is a reflection of natural gender in grammar.

1.1 Kinds and Concepts

The view that grammatical gender mirrors natural gender has been widespread since antiquity and is still evident in the terms “masculine,” “feminine,” and “neuter” (historically meaning “neither”), which are used to label individual gender distinctions, especially in Indo-European languages. Indeed, many languages show a match between natural and grammatical gender. Clear examples from across the world are Tamil in India, Dizi in Ethiopia, Diyari in Southern Australia (now extinct), and Bagvalal in the Caucasus (Corbett, 1991; Kibort & Corbett, 2008). In these and many other languages, nouns denoting male persons are masculine, and nouns denoting female persons are feminine. Other nouns are treated in varied ways: they may be added to the masculine or the feminine gender or may occupy one or several genders of their own.

However, not all languages function like this. First, many languages—slightly more than half of the languages in a representative sample (Corbett, 2013a)—do not have grammatical gender at all. Of those that do, some disregard the difference between male and female and assign all words for humans or for living beings to the same class. Yet other languages have a special “vegetable” gender for plants, a gender for foodstuffs, a gender for large or important things, a gender for liquids or abstracts, and many more. Such patterns remind us that the word gender (Greek: γένος‎‎) originally meant “kind” rather than “sex.” While the split into male and female is the most common semantic base of gender systems (Corbett, 2013b), it is by no means the only option.

Relaxing the expectation that grammatical gender is always related to biological sex also opens up the possibility that a language may have more than two or three genders. Indeed, systems can be far richer, with a maximum of around 20 different genders found in Fula, a language of the Niger-Congo family spoken in Nigeria. In descriptions of such large systems, it is common practice to label the various classes with numbers rather than names. This is not only more practical, it also reflects the fact that not all of these classes are meaningful. In fact, most classes in Fula do not have a clear semantic content (Breedveld, 1995, p. 297).

The observation that gender does not always perfectly align with meaning holds for slightly more than half of the relevant languages (53% of the 112 gender languages in Corbett, 2013c). This may mean either that one or two classes are meaningful while the others are not, or that all classes contain words for semantic as well as non-semantic reasons. The first situation can be seen in the Nakh-Daghestanian language Tsez (Comrie, 1999, example 1), the second in the Indo-European language Latvian (example 2).

(1) Tsez (Nakh-Daghestanian):

Gender I

male persons

Gender II

female persons + various other

Gender III


Gender IV


(2) Latvian (IE, Baltic, Heiko Marten p.c.):

vecā māte ‘old mother’      feminine for semantic reasons
vecā māja ‘old house’       feminine for formal reasons
vecais tēvs ‘old father’    masculine for semantic reasons
vecais koks ‘old tree’      masculine for formal reasons

The imperfect match between gender and meaning has inspired two diverging lines of thinking, both dating back to the early Greek scholars (see Kilarski, 2013, for an overview of the scientific history). The first sought to restore the match with the help of hidden layers of meaning attributed to metaphorical extension, personification, or culture-specific classification often inaccessible to the outside observer (notable advocates of this view were Grimm, 1831, and von Humboldt, 1822, but the idea also appears in Lakoff, 1987). The second acknowledges that gender is, to a large extent, a matter of grammar: a classification of nouns rather than of kinds and concepts.

1.2 Classifying Nouns

Gender is one of the systems of noun classification, alongside classifiers on one end (3) and inflectional classes on the other (4).

(3) Classifiers in Jacaltec (Kanjobalan Mayan: Craig, 1992, p. 284; adapted from Aikhenvald, 2000, p. 82).











‘(man) John saw the (animal) snake’

(4) Inflectional classes in Latin (from Haspelmath & Sims, 2010, p. 159)

Nominative singular    hortus ‘garden’    gradus ‘step’
Genitive singular      hortī              gradūs
In (3), the classifiers naj “man” and no7 “animal” indicate that John is a person, while the snake is an animal. In (4), the nouns hortus “garden” and gradus “step” have the same ending in the nominative singular, but different endings in other cases. The different forms used to express the same feature, here genitive singular, show that the two nouns belong to different inflection classes or “declensions.” While both classifiers and declensions are means to classify nouns, they differ in many respects. Among other things, classifiers are meaningful, while most inflectional class systems have at best weak links with semantics.

Gender seems to have affinities with both systems. We find historical evidence that gender may develop out of classifier systems (see section 3.1, The Birth of Gender Systems). On the other hand, genders often partially match inflectional classes when a language has both, leading linguists to think that the systems strive to cooperate or—arguably—that one determines the other (Doleschal, 2000; Faarlund, Lie, & Vannebo, 1997; Bittner, 2000; Kürschner & Nübling, 2011; see Enger, 2004 and Thornton, 2001 for critical discussion, and Aronoff, 1994 on the general relation between inflectional class and gender).

Gender also has links with derivational morphology. Many languages have morphological means of deriving words for male and female persons and animals, with morphemes that resemble the gender markers found elsewhere in the language. For example, the South-American language Mosetén has pairs of nouns as in (5), whose endings, –si’ (feminine) and –tyi’ (masculine), also appear as agreement markers on adjectives, relative clause markers, numerals, and other words (Sakel, 2004, pp. 86–88, translations adjusted).

The argument for analyzing nominal –si’ and –tyi’ in (5) as derivational morphemes rather than as gender markers is that the language does not usually express gender overtly on the noun (Sakel, 2004, p. 86).

In addition, derivational suffixes are typically associated with a fixed gender value. For example, French nouns ending in -elle are feminine: ruelle “alleyway,” dentelle “lace.” Such regularities may even override semantic motivations in favor of another gender value. A classic example is the French noun sentinelle “guard,” which often denotes a male person but is feminine nonetheless.

1.3 Agreement Classes

The property that sets gender apart from other types of noun classification is agreement, the morphological expression on words other than the noun. While languages can mark gender on the noun itself—such systems are called overt gender systems—this is not a necessary characteristic. Marking on associated words, however, is required: without agreement, we have no evidence for gender (Corbett, 1991; Hockett, 1958, p. 231; see Audring (2011) for a number of key references from the extensive literature on gender agreement). Common places where gender agreement shows up are adjectives, verbs, and pronouns, and many languages also mark gender on articles, numerals, and question words (see example 6 from Russian).

(6) Russian, gender agreement on numerals, adjectives, and verbs (Stephan Audring, p.c.)






‘one empty bottle fell over’

More rarely, gender agreement can be found on adverbs, prepositions, conjunctions, and even words for “yes” and “no”—see example (7) from a variety of Dutch spoken in Belgium.

(7) Wambeek Dutch, gender marked on “yes” (Van Craenenbroeck, 2010, p. 211)








‘Is Mary coming tomorrow? – Yes, she is.’

Agreement is what makes gender a morphosyntactic feature, together with number and person, and distinguishes it from inflectional class and from classes of derived words. Examples (8) and (9) illustrate the difference.

(8) Gender vs. inflectional class (Italian; Thornton, 2001, p. 485)


Class 1    Class 2    Class 3    Class 4    Class 5    Class 6
sg. -o,    sg. -a,    sg. -e,    sg. -a,    sg. -o,
pl. -i     pl. -e     pl. -i     pl. -i     pl. -a











‘coffee shop’













‘famous person’

Table (8) shows that the relation between gender and inflectional class in Italian is not one-to-one: every inflectional class except class 2 contains nouns of both genders, although there are strong statistical tendencies (e.g., class 1 nouns are typically masculine). For gender, agreement is decisive; although mano inflects like a masculine noun, it takes feminine agreements, while papa looks like it should be feminine but takes masculine agreements.
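The point that agreement, rather than the noun's own shape, decides gender can be illustrated with a small sketch. The Python below is illustrative only: the four-noun lexicon, the `guess_from_ending` heuristic, and all function names are assumptions made for this example and are not part of Thornton's analysis. It compares a naive guess based on the singular ending with the gender that shows up in agreement.

```python
# Toy lexicon: singular form -> gender shown by agreement (e.g., on articles).
# The choice of nouns is an assumption made for illustration.
AGREEMENT_GENDER = {
    "libro": "masculine",  # 'book'
    "casa": "feminine",    # 'house'
    "mano": "feminine",    # 'hand': inflects like libro, takes feminine agreement
    "papa": "masculine",   # 'pope': ends in -a, takes masculine agreement
}


def guess_from_ending(noun: str) -> str:
    """Naive form-based guess: -o suggests masculine, -a suggests feminine."""
    if noun.endswith("o"):
        return "masculine"
    if noun.endswith("a"):
        return "feminine"
    return "unknown"


def mismatches(lexicon: dict[str, str]) -> list[str]:
    """Return nouns whose agreement gender differs from the form-based guess."""
    return sorted(
        noun for noun, gender in lexicon.items()
        if guess_from_ending(noun) != gender
    )


if __name__ == "__main__":
    print(mismatches(AGREEMENT_GENDER))
```

In this toy lexicon the form-based guess fails for exactly the two nouns discussed above, which is why the agreement value, not the ending, is stored as the noun's gender.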

(9) Gender vs. classes of derived words (German)

-heit      die Freiheit ‘the freedom’
-ung       die Ordnung ‘the order’
-(i)tät    die Kontinuität ‘the continuity’
-nis       die Finsternis ‘the darkness’      das Gedächtnis ‘the memory’
-tum       der Reichtum ‘the wealth’          das Wachstum ‘the growth’

In Table (9), we see that gender and suffix classes are not equivalent; the suffixes -heit, -ung, and -(i)tät take the same gender agreement in German, while the suffix -nis is found in both feminine and neuter nouns, and the suffix -tum is associated with masculine or neuter gender. Again, agreement is what is decisive for gender, not the noun’s own morphology.

Summing up, gender can be viewed from three basic angles. First, it can be seen as a classification system for concepts, based on properties such as sex or animacy, or shape and size. Second, it can be taken as a system for classifying nouns, which highlights its affinities with inflectional and derivational morphology as well as with classifiers. Third, gender can be viewed as a system of agreement classes, defined via the behavior of associated words. The last view, which takes a syntactic rather than a semantic criterion as foundational, is prevalent in the current linguistic literature.

1.4 Gender and Other Grammatical Features

Gender interacts in various ways with other grammatical features, especially person, number, and case, but also tense. These interactions often manifest themselves in the form of conditions; gender marking may be restricted to a part of the paradigm. A well-known condition has been formulated as one of Greenberg’s universals: “A language never has more gender categories in nonsingular numbers than in the singular” (Greenberg, 1963, p. 112; Universal 37). While a number of counterexamples have been found (Plank & Schellinger, 1997), it appears to be generally true that many languages mark fewer genders in the plural than in the singular, or that they neutralize gender completely in non-singular environments. Similar conditions can be found between other features. Another proposed universal is that “[i]f a language has gender distinctions in the first person, it always has gender distinctions in the second or third person, or in both” (Greenberg, 1963, p. 96; Universal 44). This means that gender marking in pronominal paradigms is often present in the second and/or third person, but absent in the first. In other cases, conditions apply between gender and tense. In Russian, for example, verbs agree in gender (example 6), but only in the past tense.
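Greenberg's Universal 37 can be read as a simple condition on how many gender values a language distinguishes in each number. The following Python sketch states that condition as a check. The function name and the three profiles are hypothetical and invented for illustration; they are not data from any real language. A count of 1 stands for complete neutralization, that is, no gender distinction in that number.

```python
def satisfies_universal_37(genders_by_number: dict[str, int]) -> bool:
    """True if no nonsingular number distinguishes more genders than the singular."""
    singular = genders_by_number["singular"]
    return all(
        count <= singular
        for number, count in genders_by_number.items()
        if number != "singular"
    )


# Hypothetical profiles (assumptions for illustration, not real language data).
reduced_in_plural = {"singular": 3, "plural": 2}          # fewer genders in the plural
neutralized_in_plural = {"singular": 3, "plural": 1}      # gender neutralized in the plural
counterexample_shape = {"singular": 2, "dual": 2, "plural": 3}  # would violate the universal

if __name__ == "__main__":
    for name, profile in [
        ("reduced", reduced_in_plural),
        ("neutralized", neutralized_in_plural),
        ("counterexample", counterexample_shape),
    ]:
        print(name, satisfies_universal_37(profile))
```

The first two profiles correspond to the common patterns described above, while the third has the shape that a counterexample of the kind collected by Plank and Schellinger (1997) would need to have.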

A further complexity in many languages is the interaction between gender and case. Especially when the same morphological markers express both features at once, children may have a harder time figuring out the forms and functions of the two systems (see section 4.1 below).

1.5 A Canonical Gender System

Languages across the world vary widely and interestingly. In some instances, there may be doubts whether a language has grammatical gender or not. Therefore, it is useful to look at a few basic properties expected in a gender system, and some common divergences (for more on canonical agreement, canonical features, and canonical gender, see Corbett, 2006, Corbett, 2012, and Corbett & Fedden, 2015).

First, we expect that if a language has grammatical gender, then every noun in that language should belong to exactly one gender. This means that the system accommodates all nouns (rather than just a subset) and that, in principle, each noun has only one fixed gender value. Divergences from this ideal can be sporadic or systematic. In sporadic cases, we find individual nouns varying in the agreements they trigger. For example in Hebrew, a few nouns are reported to have either feminine or masculine gender, for instance, dereh “road, way” (Aikhenvald, 2000, p. 44). This is an example of a double-gender noun (Corbett, 1991, pp. 181–182). A different case is hybrid nouns (Corbett, 1991, pp. 183–184) such as the Dutch diminutive zusje “(little) sister,” which belongs to the neuter gender but often takes feminine pronouns.

Especially interesting are more systematic cases of variation, where the gender of nouns can be manipulated by the speaker. For example, in languages that associate certain genders with size, high value, or importance, it may be possible to upgrade or downgrade a person or object by placing it into another gender. Example (10) comes from the Namibian Bantu language Herero (also known as Otjiherero). The noun for “knife” belongs to class 11, as indicated by the prefix (o)ru- (10a), but it can be used with the class 7 prefix (o)tji- to mean “big knife” (10b). The new class prefix is added before the existing one. Note how the class change is reflected in the agreement on the possessive pronoun.

(10) Herero (Kavari & Marten, 2009; glosses simplified)

In systems such as this, gender may be difficult to distinguish from (or indeed be intertwined with) diminution and/or augmentation, as well as lexical derivation. Similar difficulties may arise when there is overlap between gender and number (see Corbett & Hayward, 1987, for a famous case, the Cushitic language Bayso, whose plural is sometimes analyzed as a gender).

A second expected property of a gender system is that it has a semantic core (Corbett, 1991, p. 63). This means that even when many or most nouns are assigned to a gender on the basis of their form (see section 2.2 below), some alignment between gender and semantics is expected. Even in languages for which the gender of nouns has been regarded as arbitrary (famously French and German, but see again 2.2), the system is semantically motivated to some degree, especially for persons and higher animals (11).

(11) Semantically motivated feminine/masculine noun pairs in French and German

German (F/M)             French (F/M)
die Frau/der Mann        la femme/l’homme       ‘the woman/the man’
die Nichte/der Neffe     la nièce/le neveu      ‘the niece/the nephew’
die Stute/der Hengst     la jument/l’étalon     ‘the mare/the stallion’
die Kuh/der Stier        la vache/le taureau    ‘the cow/the bull’

On the other hand, when gender systems are perfectly semantic, researchers sometimes separate them from grammatical gender and speak of “semantic gender,” “natural gender,” “agreement in sex” or “animacy agreement,” which may be unhelpful, as it introduces artificial splits between otherwise equivalent systems.

A third canonical property is that gender agreement should occur in the form of affixes or (more rarely) clitics, and in more than one lexical category or more than one syntactic domain. This means that we expect languages to show gender on several words in the utterance, for instance on adjectives, verbs, and pronouns. The Bantu language Chichewa, for example, is highly canonical in this respect: in addition to marking gender on the noun itself, it clocks up the following list of agreement targets (Bentley & Kulemeka, 2001; Mchombo, 2004):

In general, more agreement results in a gender system that is easier to recognize. If agreement in a particular language is restricted to a single category, like pronouns, then the existence of grammatical gender in that language might be debatable. The most famous case is English, which only shows evidence for gender on personal and possessive pronouns, leading researchers (and laypersons) to argue about whether English has a gender system or not.

Looking at just three of the many ways in which gender systems can meet or defy expectations shows the usefulness of typological knowledge about cross-linguistic variation, an indispensable tool in analysis and theory.

2. Gender in the Languages of the World

In a sample of 257 languages from different geographical areas and linguistic families, 112 are shown to have a gender system (Corbett, 2013a). Their distribution across the world is heterogeneous. Gender systems are common in Europe, in Africa, and in Australia, but they are comparatively rare in the Americas and practically absent in large parts of Asia and in the Pacific (Aikhenvald, 2000, p. 78; Corbett, 2013a). In the linguistic literature, the best-represented and most widely researched gender systems are those of the Indo-European and the Niger-Congo languages, in particular from the Bantu genus. Aside from these, individual fame is enjoyed by languages such as Arapesh (Fortune, 1942; but especially thanks to Aronoff, 1994; see also Dobrin, 2012), Bayso (Corbett & Hayward, 1987), Dyirbal (Dixon, 1972; popularly known through Lakoff, 1987; but see also Plaster & Polinsky, 2010), Miraña (Seifart, 2004), Ngan’gityemerri (Reid, 1997), Russian (Corbett, 1991), Yimas (Foley, 1991), and Zande (Aikhenvald, 2000; Claudi, 1985). These languages have gender systems that are seen as especially informative or challenging for various reasons, such as their many genders (Arapesh, Ngan’gityemerri), their complex or unusual assignment systems (Arapesh, Dyirbal, Yimas), their history (Ngan’gityemerri, Zande), or their interaction between gender and other features (Bayso, Miraña, Ngan’gityemerri, Russian).
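The proportions behind these sample figures are easy to verify. The short Python sketch below uses only the two counts reported for the sample (257 languages, 112 of them with gender); the variable names are assumptions made for this illustration.

```python
# Counts reported for the sample in Corbett (2013a).
SAMPLE_SIZE = 257
WITH_GENDER = 112

# Everything below is derived arithmetic.
without_gender = SAMPLE_SIZE - WITH_GENDER
share_with = WITH_GENDER / SAMPLE_SIZE
share_without = without_gender / SAMPLE_SIZE

if __name__ == "__main__":
    print(f"languages without gender: {without_gender}")
    print(f"with gender: {share_with:.1%}, without gender: {share_without:.1%}")
```

The result, a little under 44 percent with gender, is consistent with the earlier statement that slightly more than half of the sampled languages lack grammatical gender.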

Gender systems come in a broad variety of shapes and sizes. Generally speaking, we can distinguish three parameters of variation: the number of gender values, the types of assignment rules, and the amount and place of marking.

Let us briefly look at each in turn.

2.1 How Many Gender Values?

The smallest possible number of gender values is two, and two-gender systems are the most common worldwide (Corbett, 2013a). On the upper end, languages with more than a dozen classes have been identified, for instance, Arapesh, spoken in Papua New Guinea, with 13 genders (Aronoff, 1992, 1994; Fortune, 1942), Ngan’gityemerri, a Daly language spoken in Australia, with 15 genders (Reid, 1997), and Nigerian Fula with more than 20 genders depending on dialect and analysis (Arnott, 1970; Breedveld, 1995).

Establishing how many genders a language has is not always simple and straightforward. Since the indicators for gender are agreeing words, any inconsistencies or mismatches within or among these words can complicate the analysis. For example, there are languages in which not all agreeing words mark the same array of genders. A case in point is Dutch, where gender is marked on definite articles, attributive adjectives, and relative and demonstrative pronouns. All of these distinguish two gender values: common and neuter. Furthermore, gender is marked on personal and possessive pronouns, and here we find three values: masculine, feminine, and neuter (with syncretism between masculine and neuter in the possessives). This makes it hard to say how many genders Dutch has—two or three—and this is indeed a matter of debate in the linguistic and pedagogical literature (see Audring, 2009 for discussion). In other languages, the number of genders is difficult to state for other reasons, for example, because markers are syncretic or otherwise ambiguous (e.g., in Romanian, see Corbett, 1991, pp. 150–152; Corbett, 2014, pp. 93–94). Moreover, small clusters of nouns may behave exceptionally (see Corbett, 1991, pp. 170–175 on “inquorate genders”) or the gender system may overlap with other systems, such as location marking, diminution/augmentation, or grammatical number (see, e.g., Di Garbo, 2014).
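The Dutch problem can be made explicit by recording, for each agreement target, which gender values it distinguishes. The Python sketch below encodes only what is stated above; the dictionary layout and function names are assumptions made for illustration. It counts values rather than forms, so the syncretism between masculine and neuter in the possessives is not modeled.

```python
# Gender values distinguished by each Dutch agreement target, as described above.
GENDER_VALUES_BY_TARGET = {
    "definite article": {"common", "neuter"},
    "attributive adjective": {"common", "neuter"},
    "relative pronoun": {"common", "neuter"},
    "demonstrative pronoun": {"common", "neuter"},
    "personal pronoun": {"masculine", "feminine", "neuter"},
    "possessive pronoun": {"masculine", "feminine", "neuter"},
}


def gender_counts(values_by_target: dict[str, set[str]]) -> dict[str, int]:
    """Number of gender values distinguished by each agreement target."""
    return {target: len(values) for target, values in values_by_target.items()}


def targets_agree_on_count(values_by_target: dict[str, set[str]]) -> bool:
    """True if every agreement target distinguishes the same number of values."""
    return len(set(gender_counts(values_by_target).values())) == 1


if __name__ == "__main__":
    for target, count in gender_counts(GENDER_VALUES_BY_TARGET).items():
        print(f"{target}: {count}")
    print("consistent across targets:", targets_agree_on_count(GENDER_VALUES_BY_TARGET))
```

Because the targets do not agree on a single count, any answer to the question of how many genders Dutch has depends on which targets the analyst treats as decisive.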

2.2 Types of Assignment Rules

In some languages, gender appears to be more clearly rule-based than in others. Rules for gender assignment have two basic functions: they serve to motivate the gender of existing words, and they can be used productively to select a gender for loanwords and novel coinages. Generally speaking, there are three types of assignment rule: semantic, phonological, and morphological.

Semantic rules, already mentioned in section 1, are often based on general conceptual splits such as male/female, human/non-human, or animate/inanimate. For example, languages might work like Kolami, a Dravidian language spoken in India (Emeneau, 1955; discussed in Corbett, 1991, p. 10), which attributes masculine gender to nouns denoting male persons and feminine gender to all others. However, not all semantic rules are as straightforward. Many languages have genders that combine a rather heterogeneous set of items, some of which belong to smaller semantic classes such as plants, fruits, or body parts. An example is Isangu, a Niger-Congo (Bantu) language mentioned in Comrie (1999, p. 463). As is the custom for Bantu languages, genders are notated as singular/plural pairs with a designated number for each member of a pair.

(12) Isangu genders.




Semantic Characterization










only (but not all) humans






most plants; also some animals, concrete nouns, abstract nouns






most body parts, most fruits; also some humans, plants, concrete nouns, abstract nouns






most artifacts; also some humans, plants, concrete nouns, abstract nouns






most animals; also some plants, concrete nouns, abstract nouns

For yet other languages, linguists have proposed gender assignment rules that—rather than describing the semantics of a whole class—only cover individual clusters of nouns. These are regularities like the following, suggested for German (Köpcke & Zubin, 1983; Steinmetz & Rice, 1989):

Such rules are small in scope, and if a language employs them, the number of different rules will be large, as each regularity accounts for only a limited subset of the nouns (a critical account of such rules is given in Enger, 2009).

While semantic rules seem to be primary in the sense that genders—we believe—are born as semantic classes (see 3.1 below), languages can develop associations between gender and formal properties of nouns. Such associations can make reference to nearly any formal property, be it phonological (word-initial or word-final sounds or sound sequences, mono-syllabicity, but also patterns of word accent) or morphological (inflectional classes as well as derivational patterns, e.g., certain affixes).

Examples of form-based gender assignment are the following:

Again, we can see a difference between “large rules” of broad scope and “small rules” of narrow scope. A famous example of a language with large phonological rules is the Cushitic language Qafar, for which it is claimed that nouns ending in an accented vowel are feminine, while all others are masculine (discussed in Corbett, 1991; Parker & Hayward, 1985). These rules appear to cover nearly all of the nouns in the language. Of the three formal rules mentioned above, the first is obviously an example of a small rule, while the second and (to a lesser extent) the third account for a wider array of nouns.
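A large phonological rule of the kind claimed for Qafar is simple enough to state as a procedure. The Python sketch below is a minimal illustration under several assumptions: accent is written with an acute mark, the vowel inventory is reduced to five letters, and the test strings are made-up placeholders, not Qafar words. It does not reproduce the analysis in Parker and Hayward (1985).

```python
import unicodedata

VOWELS = set("aeiou")   # assumed vowel letters, for illustration only
ACUTE = "\u0301"        # combining acute accent, assumed here to mark accent


def assign_gender(noun: str) -> str:
    """Feminine if the noun ends in an accented vowel, masculine otherwise."""
    # Decompose precomposed letters so that 'á' becomes 'a' + combining acute.
    decomposed = unicodedata.normalize("NFD", noun)
    ends_in_accented_vowel = (
        len(decomposed) >= 2
        and decomposed[-1] == ACUTE
        and decomposed[-2].lower() in VOWELS
    )
    return "feminine" if ends_in_accented_vowel else "masculine"


if __name__ == "__main__":
    # Placeholder strings invented for this sketch (not real Qafar nouns).
    for form in ["tabá", "taba", "tában", "tab"]:
        print(form, assign_gender(form))
```

A rule of this kind needs no lexical lookup at all, which is what distinguishes a large rule from the small rules that cover only individual clusters of nouns.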

Among the languages in the world, mixed systems of semantic and formal rules are in a slight majority (Corbett, 2013c), though their prevalence can be more pronounced in certain macro-areas (see Di Garbo, 2014 for Africa). For more references on gender assignment, see Audring (2011).

2.3 Amount and Place of Marking

The third dimension of complexity lies in the formal expression of gender. Typically, the gender of a noun is not visible on the noun itself (though in some languages it is) but is expressed via agreement on other words, such as the adjective, the predicate, and various pronouns. In some languages, agreement is so ubiquitous that nearly every word in the sentence carries a gender marker. The following example is from Chichewa (Bantu, spoken in East Africa), where 7, 1, and 9 indicate noun classes (Mchombo, 2004, p. 87; glosses adapted). Note that Chichewa is one of the languages that mark gender overtly on the noun itself, as well as by agreement.

With the exception of the copula, all words in the sentence express gender: either their own inherent value or the value of the noun they agree with. Also, in Archi, a Nakh-Daghestanian language spoken in the Caucasus, “almost every part of speech can agree in gender” (Chumakina & Corbett, 2015; Corbett, 2014, p. 107; although this does not hold for every item within the parts of speech).

At the other extreme, there are languages with sparse expression of gender. The best-known example is English, where gender is visible only on the personal and possessive pronouns, with not more than seven distinct forms: he/she/it, him/her, and his/its. As mentioned in 1.5 above, languages with frequent marking have gender systems that are easier to spot in fieldwork and easier to defend analytically. Pronominal gender languages like English provide less clear evidence for a gender system. Interestingly, the same considerations appear relevant for the acquisition of gender, which will be discussed in section 4.1.

2.4 Gender in Sign Languages

It makes sense to conclude this brief typological survey with a look at sign languages. Whether there are sign languages that have gender systems is a matter of debate. Many scholars argue that sign languages systematically lack grammatical gender (Pfau, Steinbach, & Woll, 2012, p. 234), partly because they are generally young languages, while gender (agreement) takes time to develop (see section 3.1). Two exceptions have been proposed. First, many sign languages have classifying handshapes that encode various properties of a referent, for example that it is a person, an animal, or a vehicle, or that its shape is long and thin or broad and flat. What makes such handshapes candidates for gender is that they can be carried over into the verb, which then reflects properties of its subject or object in a way reminiscent of how gender agreement on the verb reflects properties of nouns. For example, in the Sign Language of the Netherlands (Nederlandse Gebarentaal), the verb meaning “to fall” has a different handshape depending on whether the falling entity is cylindrical, long and thin, or legged (Zwitserlood & Van Gijn, 2006, who analyze the phenomenon as gender agreement). However, a more common analysis is that these markers are classifiers rather than genders, since they are clearly semantic, involve a large (and potentially open-ended) variety of classes, and are often optional.

Moreover, there are suggestions that sign languages may mark gender on pronouns. For example, Smith (1990) and Fischer (1996) describe masculine and feminine handshapes in personal pronouns in Japanese and Taiwan sign language, respectively. Byun, Zwitserlood, & De Vos (2015) discuss the same phenomenon for Korean sign language. Still, the evidence is debatable, as the markers are only used for persons and are probably optional. A careful and convincing analysis of such phenomena, however, might provide evidence for pronominal gender systems as—albeit non-canonical—cases of gender in sign language.

3. Rise, Development, and Fall

The issue of “young languages” brings us from typology to diachrony, and the next question to address is how gender systems arise, as well as how they develop and—possibly—decline.

3.1 The Birth of Gender Systems

Gender systems do not arise overnight. Since the central characteristic is agreement, the growth of a gender system requires the development of (bound) gender morphology, either from scratch or by repurposing existing morphological material, such as derivational morphemes, case or number affixes, or locative markers (Aikhenvald, 2000). For this reason, gender is counted among the “mature elements of language,” involving long chains of evolutionary events (Dahl, 2004, p. 112). The same reason accounts for why gender is allegedly absent in pidgin and creole languages (McWhorter, 2001, p. 163). However, the APiCS database (Maurer, 2013) lists at least one example, the Canadian mixed language Michif (Bakker, 1997), which shows an agreement system described as “truly weird” by Corbett (2006, p. 269), since it involves not only one gender system but actually two, from both lexifier languages, French and Plains Cree. Applying a broader definition and including sporadic agreement as well as pronominal genders might yield more young languages with grammatical gender (Maurer, 2013).

If agreement is developed “from scratch,” several possible pathways have been proposed. Figure 1 summarizes them graphically. In most cases, the original sources are nouns, in particular nouns with classifying potential, such as “man,” “animal,” or “thing.” Such words can develop into classifiers that are used with other nouns to indicate their class membership (see example 3 in section 1.2). From here on, developments can proceed in two directions. Classifiers can be used for derivational purposes, as in constructions like man child

