Adventures in Glossonymy
Coby Lubliner

What is glossonymy?

The words toponym (‘place-name’) and toponymy (‘place-naming or the study thereof’) have become well established in English and can be found in any standard dictionary. According to Merriam-Webster, “toponym” is a back-formation from “toponymy.”

By contrast, the corresponding words for language name and language naming – glossonym and glossonymy, respectively – have not yet found their way into any dictionary I know of. “Glossonym” can be found scattered sparsely in linguistic literature, but the only Yahoo or Google hit for “glossonymy” that I got, after being asked if I had meant to search for glossotomy (more on that later), was to a glossary (no pun intended) of linguistic terms, compiled in Germany, in which it is given as the English translation of the German Glossonymie.

Why the difference? Perhaps one reason for the attention given to toponymy is that it is an integral part of cartography, and most literate people use maps.

It appears, on the other hand, that few people are interested in the details of language naming, unless it forms a part of a political agenda, as in the Balkans. I, however, find such details fascinating, and I would like to share some observations on the subject.

Let me say from the outset that by “language” I mean any language variety – standard, vernacular, dialect, pidgin – for which people have a name. If what people generally think of as a single language, such as Spanish, has more than one name given to it by its native speakers (español and castellano), or if a single glossonym, such as Arabic or Chinese, is sometimes used for several distinct (if related) language varieties, then so be it. In any case, I am one of those who believe that language cannot be quantified, and that any news reports on the number of languages in the world are nonsense.

Types of glossonyms

Glossonyms – as is the case with toponyms – can be either endonyms (names by which a language is known by its own speakers or writers) or exonyms (names given to a language in other languages). And just as one occasionally finds an exonymic adoption of endonyms in toponymy (e.g., saying “Torino” for “Turin” when speaking English), the same can be found in glossonymy (e.g., referring to Persian as Farsi).

The main difference between toponyms and glossonyms is that the former are, by and large, absolute, while the latter are relative. What I mean is that a place-name is usually a name that exists on its own, independently of context. Not always: inhabitants of the San Francisco Bay Area, among themselves, often refer to San Francisco as “The City”; similarly, Majorcans call Palma de Mallorca Ciutat, and Israelis, when speaking Hebrew, call the land of Israel simply ha-aretz (‘the land’), as do other Hebrew-speakers. But these are exceptions.

With languages, however, there is typically no need to name them, except in contrast (explicit or implicit) to other languages. A linguistic community that has no contact with any other is quite likely not to have a name for its language (except perhaps something like “speech”), until it is discovered by outsiders. And in many cases what a language is called depends on what it’s compared or contrasted with.

Glossonyms are, for the most part, of two types: ethnic and geographic (the terms are, I believe, self-explanatory). But it isn’t always obvious, from the form of a given glossonym, to which category it belongs: the English exonym “Russian,” for example, may refer either to the ethnic Russian people or to the country of Russia. On the other hand, the endonym russkii presents no such ambiguity, since it has only an ethnic reference; the adjective pertaining to Russia is rossiiskii.

A most interesting case occurs in the Hebrew Bible, in which the only reference to the language in which it is written occurs in Isaiah 36:11 (the passage is repeated verbatim in 2 Kings 18:26), and the language there is called, in a dialogue, yehudit in explicit contrast to aramit (Aramaic). As the various English versions illustrate, this can mean either “the language of [the kingdom of] Judah,” as in the English Standard Version (equivalently, “Judean” in the New American Bible and in the Judaica Press Complete Tanach), or “the Jews’ language,” as in the King James Version and other versions based on it (including the original Jewish Publication Society version). Many other versions modernize the dialogue by using “Hebrew,” but the New King James Version adds, in a footnote, “literally Judean.” The Septuagint ioudaïstí and the Vulgate iudaice are of no help in this regard.

My inclination, like that of most recent Bible translators, is toward the geographic interpretation of yehudit. The period in question is shortly after the fall and exile of the kingdom of Israel, and since Judah was now the only Hebrew kingdom, yehudi was perhaps beginning to replace ‘ivri as an ethnonym, but it does not seem to have done so completely until after the fall of Judah, since even at that later time the latter can still occasionally be found (as in Jeremiah 34:9) alongside the former.

There is, moreover, a general tendency for the inhabitants of a compact geographic entity (such as an island or a polity with well-defined borders), when they contrast their local speech with a high-prestige language that is current over a wide area, to name it for that entity. This is true even when several different language varieties are spoken there: Sardinians, for example, say that they speak sardo in contrast to italiano; they refer to campidanese, logudorese and gallurese mainly when they compare these dialects with one another. Similarly, people of Taiwan usually refer to their language as Taiwanese when they contrast it with Mandarin, even though several vastly different dialects are spoken there. Conversely, the people of the Balearic Islands, who speak almost identical variants of Catalan, do not call their speech balear (only philologists do that) but mallorquí on Majorca, menorquí on Minorca, and eivissenc on Ibiza. By the same token, the language of Judah (what we now call Hebrew) was, according to what we know from inscriptions, not substantially different from that of Edom, Ammon and other states in Canaan; but, as these examples show, it would seem natural for the people of Judah to think of it as Judean.

Hebrew underwent another glossonymic change when it stopped being used as a vernacular: it became known as leshon ha-qodesh (‘the tongue of holiness’) when contrasted with the languages actually spoken by Jews (such as Aramaic or Greek). This glossonym is obviously neither ethnic nor geographic, but belongs to a third category – that of glossonyms based on some characteristic of the language – which I choose to call perigraphic (the word means simply ‘descriptive,’ but I want to stick to a Greek-based terminology). Other perigraphic glossonyms include “Mandarin” (and some of its endonyms, such as guanhua ‘bureaucrat speech,’ baihua ‘clear speech,’ putonghua ‘common speech,’ and guoyu ‘national language,’ but not zhongwen ‘Chinese [written] language’); the ancient languages of India (Sanskrit ‘perfected’ and Prakrit ‘ordinary’); Tok Pisin ‘pidgin talk’; the term “broad” for some of the dialects of England; and bokmål ‘book-language’ as the name of one of the forms of standard Norwegian. As I will discuss below, deutsch (German) was, in its ancestral form, a perigraphic glossonym before becoming an ethnic one; it’s the ethnonym that grew out of the glossonym.

German, Yiddish and Dutch

“German” is the English name of the common language of historic Germany (including Austria, South Tyrol, “German” Switzerland and Luxembourg). It is also an element of compound glossonyms such as Low German, Swiss-German and Judeo-German. In each of these cases it corresponds to the endonym deutsch or a variant thereof: nedderdüütsch (or plattdüütsch), schwyzertütsch, yidish-taytsh (or ivre-taytsh).

This word goes back to a presumed old Germanic form, theudisc (preserved in the Latin lingua theodisca, theotisca or teutisca, though later writers also used the similar but not directly related teutonica); the form þeodisc (related to þeod ‘people’ or ‘nation’) is found in Old English, while the earliest attested Old High German form is diutisc, presumably derived from an earlier thiudisc. The word means ‘of the people’ and is the analogue of the Latin (lingua) vulgaris as opposed to latina. Eventually the people who spoke diutisc came to call themselves by that name; that is, the glossonym became an ethnonym, which later led to the toponym Diutiscland and finally Deutschland.

Along the German-Romance boundary, the Romance-speakers called their neighbors’ speech by such derived names as thiois or tiche in Belgium and lower Lorraine, tudesque elsewhere in French-speaking territory, tudestg among the Romansh, todesco or tedesco in Italy. The last-named became the standard Italian word for ‘German,’ but in French tudesque was replaced by allemand, named for the Alemannic tribe that was the Romance-speakers’ more southerly Germanic neighbor. Tudesque is still used in French as a literary synonym of allemand, or to designate medieval German language or culture. Allemand was taken over by Iberian Romance-speakers in the forms alemany (Catalan), alemán (Spanish) and alemão (Portuguese), though tudesco may be found in literary Spanish and has been used by Sephardic Jews to designate those of German origin (Ashkenazim).

When speakers of Judeo-German contrasted their language with Hebrew or Aramaic – for example, when translating sacred texts – they had no need to indicate its Jewishness and called it simply taytsh; eventually a question like vos iz taytsh? (literally, ‘what is [it in] German?’) came to stand for ‘what does it mean?’; and the verb (far)taytshn – originally ‘translate into German’ – now means ‘interpret.’ Conversely, when contrasting yidish-taytsh with other (non-Jewish) forms of German, there was no need for taytsh, and in this context the language was simply called yidish (meaning ‘Jewish’). As German Jews moved eastward, they continued the habit of calling it that in contrast to the languages of their hosts, and the exonyms in those languages came to be the respective translations of ‘Jewish’: żydowski in Polish, evreiskii in Russian, evreieşte in Rumanian. The complication in the last two languages is that, as in others belonging to Greek Orthodox communities, the usual word for ‘Jew’ is derived from the Greek Evréos (ancient Greek Hebraios, i.e., Hebrew), leaving room for potential confusion between the Jews’ vernacular and their sacred tongue. The Russian solution, for one, was to call Hebrew drevneevreiskii (‘ancient Hebrew/Jewish’), but this became impractical when Hebrew was revived as a living language, so that the adopted endonym ivrit became the Russian name of modern Hebrew. Something similar was done in Bulgarian, which has tended to follow Russian usage (staroevrejski and ivrit), even though the Jewish population in Bulgaria was almost entirely Sephardic and called its Judeo-Spanish vernacular (evrejsko-)španjolski, not evrejski. (In Ottoman Turkey, on the other hand, the Turkish name for it was often Yahudice, i.e., ‘Jewish,’ which is also how the aforementioned Biblical yehudit is translated into Turkish.)

In fact, in all countries in which not all Jews were Ashkenazic (including even Germany), it clearly did not make sense to use ‘Jewish’ as the name of a language that was not spoken by all Jews, and so the exonym for it became an adopted endonym, though in the English form “Yiddish” (as in the German form jiddisch) the [ɪ] sound of the first vowel is foreign to the native form, which in almost all dialects has [i] (as either [jidɪʃ] or [idɪʃ]). This practice has since been extended to eastern countries as well, so that the language is now more likely to be called jidysz in Polish, idysh in Russian, idiş in Rumanian.

Oddly enough, the modern English cognate of diutisc, namely “Dutch,” is used almost exclusively to designate a language whose endonym does not include such a cognate but is either Nederlands (for the standard) or Hollands (for the vernacular). (I say “almost” because “Pennsylvania Dutch” alternates with “Pennsylvania German” as the exonym for what its speakers call Pennsilfaanisch-Deitsch.) True, in the Middle Ages the Germanic-speaking people of the Low Countries (Hollanders, Flemings and such) considered themselves just as German (duutsch, duytsch, dietsch according to dialect) as their fellows to the east, and it is from duutsch that “Dutch” – referring, originally, to all the Germanic peoples and languages of historic Germany – is derived. Long after the Netherlands ceased to be a part of Germany in any way, the Netherlanders continued to call their language Nederduytsch (later Nederduitsch) until well into the nineteenth century, when Nederlandsch (later Nederlands) finally took over. The two major South African descendants of the Dutch Reformed Church (Nederlandse Gereformeerde Kerke) are still called (in Afrikaans) Nederduitse Gereformeerde Kerk (NGK) and Nederduitse Hervormde Kerk (NHK).

In a seventeenth-century Dutch-English dictionary, the Dutch language is referred to in English as Netherduytch, with the spelling changed to Netherdutch in later printings. But except for a Netherdutch Synagogue that was inaugurated in New York in 1847, and the fact that the NHK uses “Netherdutch Reformed Church” as its English name, the term did not prosper. And while it would have been natural to adopt “Netherlandish” (a term that is quite common in art history) in the eighteenth or nineteenth century as the counterpart of the emerging Nederlands(ch) – as Niederländisch was adopted in German and néerlandais in French – for some reason this did not happen. Nor has the Encyclopedia Britannica’s “Netherlandic” had much success. English-speakers are stuck with the anachronistic “Dutch” as the counterpart of Nederlands and Hollands. Only for the Belgian vernacular is the English exonym, Flemish, equivalent to the endonym Vlaams.

The special case of French

The term ‘France’ has, almost from the beginning, had two distinct meanings, as has the associated adjective or glossonym (français). This has both simplified and complicated things.

France, originally pronounced something like ‘frahntsuh’ (IPA [francə]), is how the northern Gallo-Romance vernacular transformed the Latin Francia, which did not mean ‘land of the Franks’ but rather something like ‘land ruled by the Franks.’ More specifically, Francia designated the territory whose rulers – of whom there were often more than one at a time – bore the title rex Francorum or Francorum rex (king of the Franks). The reason that there were often several such kings was that the Frankish rulers treated their domains as personal property that could be divided among their sons, and so, for much of the Franks’ history, their realm was divided into kingdoms with such names as Austrasia, Neustria, Aquitania and so on. But a king of, say, Neustria, would usually be styled not simply rex Neustriae but rather Francorum rex Neustriae.

There is nothing unusual, by the way, about a country being named for the title borne by its rulers, even if the title is derived from a people whose homeland is outside the country. Perhaps the best-known example is Prussia: it was precisely because the original Prussians (a Baltic people) lived outside the German Empire that the Prince-Electors of Brandenburg, after they came to rule over the Prussians’ territory, could call themselves kings of Prussia (since the only kingships within the Empire were the German one held by the Emperor and that of Bohemia), and eventually the name “Prussia” came to stand for their German territories as well.

At its greatest extent, under Charlemagne, Francia covered a lot more ground than present-day France: it included the Low Countries, a large part of historic Germany, and even a bit of Spain. For this reason, the Francia of that period is rendered in modern French not as France but as Francie. (Charlemagne was also crowned Roman Emperor, and he ruled over northern Italy as well, but it was as king of the Lombards, and Italy was not a part of Francia.)

After the death of Charlemagne’s son and successor Louis I, and following some fratricidal wars (a common occurrence in Frankish history), Francia was irrevocably divided into an eastern and a western kingdom. Francia orientalis was inhabited almost entirely by German tribes, and its kings gradually replaced the title of rex Francorum orientalium by rex Germanicus. This left the western kings free to drop the occidentalium from their titles and style themselves simply Francorum rex. From this point on Francia means essentially France, and while the kings continued, for the most part, to be officially styled Francorum rex, they were often also referred to as rex Franciae, and in French the title was invariably roi de France.

But Francia had yet another, more restricted meaning, and it was there that the vernacular France had its beginning.

The German tribes that came under Frankish rule – even the Saxons, who were forcibly subdued by Charlemagne – continued to be governed, on behalf of the king, by their hereditary dukes. In the thoroughly Romanized west, however, the old tribal structure of the Gauls, Iberians and others had long been obliterated, and to facilitate administration Charlemagne created counties, governed by counts (comites) who at first were royally appointed civil and military governors, but whose tenure, as the Frankish kingdom weakened in the later ninth century, became hereditary. Several of these counts became quite powerful and managed to acquire a number of neighboring counties, and a few of them went so far as to give themselves the title of duke: the counts of Autun became dukes of Burgundy; those of Poitou became dukes of Aquitaine; and some of the counts of Paris called themselves dux Francorum, and the territory over which they ruled directly, equivalent to much (but not all) of present-day Ile de France, also came to be called occasionally Francia in Latin and consistently France in French. As a vestige of this usage, consider the town that is home to Charles de Gaulle airport, named Roissy-en-France. Why this name? Because there is, not too far away, another town called Roissy, located in the onetime county of Meaux, an area that is now a part of Ile de France but once belonged to the counts of Champagne; hence it was not en France.

The original meaning of français also referred to this smaller France; here, too, there are toponymic vestiges. There are two geographic formations, named Vexin and Brie, that Ile de France shares with its neighbors, Normandy and Champagne, respectively. The part of Vexin that is in Normandy is called, naturally enough, le Vexin normand, and the part of Brie that is in Champagne is called la Brie champenoise. The other parts are called le Vexin français and la Brie française.

Likewise, français as a glossonym originally referred to the speech of the Paris region, in contrast to neighboring dialects such as Norman or Picard. When the language that we think of as French was contrasted with Latin or Germanic (German or Dutch), it was called roman(s) (Romance), except that in areas close to the Romance-Germanic boundary it was also called walesc or walec, derived from an old Germanic word (walhisk) that is the ancestor of the modern Dutch waals and German welsch (and of English “Welsh” as well); in modern times this was replaced, in the north, by wallon, which eventually came to designate the Romance speech of Belgium (both dialect and standard French) and the population that uses it.

When contrasted with the equally Romance dialects of southern France – which the northern French called langue d’oc – it was called langue d’oïl; oïl (the predecessor of modern French oui) and oc were the respective words for ‘yes.’

As the meaning of France and français lost its local reference (except in the aforementioned toponyms) and came to refer exclusively to the whole kingdom, français came to mean the official language of France, for which the speech of Paris was only coincidentally the model (indeed, it was once widely believed that the ‘best’ French was spoken in Touraine). French linguists had to coin the term francien for the original français, and politicians had to invent francilien as the adjective for the administrative region of Ile de France.

This peculiar coincidence, which spared France from a conflict over the name of its national language, did not occur in Italy or Spain. Italians argued for several centuries over whether the language that they now call italiano should be called that or toscano; even Machiavelli contributed an essay to the debate (he was in favor of italiano).

Spanish-speakers to this day have not resolved whether to call their language castellano or español; in the Middle Ages it was called – as was the general case with Romance languages – castellano in contrast to other Romance dialects and romance in contrast to Latin or Basque (the title of a sixteenth-century grammar calls it romance castellano español, no less). In contrast to Arabic, though, the language was also known as ladino, perhaps because, when Arabs came to Iberia early in the eighth century, the Ibero-Romans still thought of themselves as speaking Latin. Spanish Jews extended the meaning of ladino to imply a contrast to Hebrew, and Ladino remains one of the names by which Judeo-Spanish is known, though according to some authorities it applies only to the language of Bible translations and other writings of a sacred nature.

The curious case of Occitan and Catalan

Occitan is a word that, in the latter part of the twentieth century, gradually replaced Provençal as the common name (except in Provence, where it is fiercely resisted) for the large group of Romance dialects (by now mostly extinct) that have historically been spoken across southern France, with incursions into Spain (the Aran Valley) and Italy (the Piedmontese Alps). The name, a modern coinage, comes from a medieval Latin text in which one of these dialects is called lingua occitana, a translation – probably humorous – of the aforementioned langue d’oc.

From the glossonym, the toponym Occitania has been formed to designate the territory where Occitan is the historic language. Travelers entering the Aran Valley are greeted by signs welcoming them to South Occitania.

Like northern French, Occitan was called roman(s) when contrasted with Latin (and with Basque as well). Texts of local import (contracts, ordinances, records and the like) were usually referred to as being in a geographically designated language, but names such as gascon (in Gascony), biarnes (in Bearn), tolzan (in the County of Toulouse) and the like, as usual with such glossonyms, did not necessarily designate linguistically distinct dialects; Bearnese, for example, is but a variant of Gascon, and old documents in either of these often do not present some of the most salient characteristics of these dialects, such as the replacement of f by h.

But the poetry of the troubadours, which was read all over the Occitan territory and outside it (in northern France, Italy and Catalonia), was considered by native Occitan-speakers to be simply in roman. Italians called the language provenzale, a reasonable name in view of Provence’s proximity to Italy. (The troubadour Raimbaut de Vaqueiras, who was from Provence and lived in Italy, has his language called pro[v]enzalesco – to rhyme with to[d]esco – by an Italian-speaking character in one of his poems.) From Italian, “Provençal” was adopted as the name of the language in other European languages, and has been so used until recently.

Catalans, on the other hand, named it lemosi (llemosí in modern Catalan spelling), after the Limoges region, even though this is the Occitan-speaking district that is perhaps the farthest from Catalonia. But there was a tradition that the Limousins spoke the “best” Occitan; it can be found explicitly in the fourteenth-century text Leys d’amor (a kind of textbook on the writing of troubadour poetry), and a sixteenth-century Italian scholar named Giovanni Maria Barbieri compared the role of Limousin for the troubadours to that of Tuscan for Italian poets.

The troubadours’ language itself was a kind of supradialectal koiné – not a standard in the modern sense, but a blend of dialects, in which word forms from different dialects were often used in the same poem for purposes of rhyme or subtext, similar to Robert Burns’ blend of Scots and standard English (in which he would variously use, say, go, gae and gang). At least this was true from the late eleventh to the early thirteenth century, when Occitan poetry was written and read (and sung) over a large, culturally cohesive territory that was joined by dynastic ties among the ruling houses. This cohesion was lost after the Albigensian Crusade, when the most central part of the territory – the region that came to be called Languedoc – was taken over by the crown of France.

Catalonia, in particular, was now culturally isolated, and the language of Catalonian troubadours came more and more to be infused by Catalan (which, though technically not an Occitan dialect, is sufficiently close for such infusions to take place). The process, however, was gradual and seems to have taken place unconsciously, so that the poets continued to think of themselves, even when writing essentially pure Catalan, as writing in llemosí. Eventually this word (and its Spanish form lemosín) became a synonym for written Catalan (except that Valencian writers, from about 1400 on, called their variant of it valencià). When the Bourbon king Philip V of Spain finally conquered Catalonia in 1714 and issued a decree banning Catalan from official use, he referred to it as lemosín. The term continued to be used well into the nineteenth century, partly as a way of avoiding the imposition of català on Valencians and Balearic Islanders (for, while in English it is possible to differentiate between “Catalan” [pertaining to the language and culture] and “Catalonian” [pertaining to the land of Catalonia], Catalan itself allows no such distinction). Llemosí was dropped only when Romance linguistics clearly showed that what Catalan-speakers meant by it was really nothing like the dialect of Limoges.

The glossonymy of Catalan remains problematic. The first major dictionary of the language, published in the 1930s, was called Diccionari català-valencià-balear. While in the Balearic Islands català has by now been generally accepted as the name of the official language, despite several formal differences between it and the Catalonian variant, in Valencia the question of whether the region’s historic language (now spoken only by a minority of the population) is to be called català or valencià has become a political football. On the insistence of those Valencians who call themselves “Valencianists” (as opposed to “Catalanists”), the Spanish Constitution exists in both a “Catalan” and a “Valencian” version, though one might need a magnifying glass to find the differences between them. The official position of the Valencian Academy of the Language is a compromise that seems to satisfy no one, namely, that the proper name of the language in Valencia is Valencian, but that this does not mean that it is a separate language from Catalan. (Something like this, however, is the case in Moldova, where the language officially known as Moldovan is essentially standard Romanian.)

To Catalanists, “the unity of the language” (la unitat de la llengua) is a sacred cow. In Catalonia the position is shared by Catalanists of all political stripes, from right to left. In Valencia, however, Catalanism is associated with the left, and Valencianism (including the insistence on Valencian being a separate language) with the right. In any case, Catalanist ideology, except in its extreme form, does not imply a claim that Catalan-speakers who are not Catalonians belong to the Catalan nation. In the West (that is, west of the Seipel line) there is, as a rule, no automatic identification between language and nation, and while Catalonia is currently pushing for a recognition of its status as a nation in the Spanish Constitution, it is meant to be as a territorial, not an ethnic, nation.

The Slavic case

As I have remarked elsewhere, Slavic-speakers seem to be particularly sensitive to diglossia; they have little tolerance for significant differences between the official or written language and the vernacular. But since the Slavic lands lie to the east of the Seipel line, the creation of a new standard implies the emergence of a new nation, and vice versa.

Nineteenth-century Austrian sources dealing with the nationalities of the Austro-Hungarian monarchy usually define Slovaks as Czechs under Hungarian rule. But when, under the influence of the Catholic Church, a Slovak standard distinct from Czech (though the differences are far smaller than those between neighboring dialects of, say, German or Italian) was formulated, Slovaks began to view themselves as a separate nation. The Czechoslovak Republic between the World Wars maintained the fiction that there was a Czechoslovak language of which Czech and Slovak were variants – a tenable position from a purely linguistic point of view, but not from a practical one: the region whose dialect was the basis of standard Slovak is fairly distant from the Czech border, and the orthography was, probably intentionally, made quite different from that of Czech, and so no unified schooling could be instituted. There is historical precedent for such a choice: written Portuguese was designed, in the Middle Ages, to look as different as possible from Galician, as was Swedish from Danish at the time of the Reformation.

Something similar was done with Macedonian with respect to Bulgarian, this time under the influence of the Communist Party. Macedonian Slavs were regarded – and regarded themselves – as Bulgarians until the Balkan Wars of the early twentieth century, when the part of Macedonia that did not belong to Greece or Bulgaria was annexed by Serbia and called South Serbia. It retained this designation in the kingdom of Yugoslavia, where an effort was made to Serbianize the population. This attempt failed, and under Tito the Macedonians were given nationality status, a republic, and a standard language. But this standard was based, not surprisingly, on dialects as far as possible from Bulgarian proper, and the Cyrillic script chosen for it is much closer to Serbian than to Bulgarian Cyrillic. Of course, Bulgarians and Macedonians communicate orally with ease, and many Bulgarians to this day deny the existence of a Macedonian language.

Greeks seem to have even more of a problem with the notion of a Macedonian language, since the very idea of a non-Greek Macedonian identity is anathema in Greece. I remember seeing, on a modest office building in Thessaloniki, a sign advertising a translation service and listing the languages covered by it. One of the language names had been partly blacked out, but not completely: one could make it out as SERVOMAKEDHONIKA, that is, Serbomacedonian. Obviously what was meant was Macedonian, and it may even have been the translator’s native language, since his name was Koprinsky.

When it comes to Serbocroat, it should be noted first that, until the nineteenth century, the language was generally called Illyrian by most Europeans and slavonski by its speakers; even now, there are Americans of Serb or Croat descent who refer to themselves as Slavonic (in San Francisco, it was only last year that what is now the Croatian American Cultural Center changed its name from Slavonic Cultural Center). It took on the name Serbocroat (srpskohrvatski or hrvatskosrpski) when Vuk Karadžić gave it a unified orthography (both Cyrillic and Latin – the scripts are in one-to-one correspondence, and a crossword puzzle written in one can be solved in the other). But the fact that the same orthography is used for its several variants has not prevented most of their users from regarding them as distinct languages. There are, for example, four separate Serbocroat sites on Wikipedia: srpskohrvatski (Serbocroat), hrvatski (Croatian), srpski (Serbian) and bosanski (Bosnian). They all say dobrodošli na Wikipediju for ‘welcome to Wikipedia,’ and any Serbocroat-speaker can read any of the sites with ease (Croats who are unfamiliar with the Cyrillic that is the default script on the srpski site need only click on the latinica button to see it in Latin script), though some of the terms they use may vary: ‘featured article’ is izabrani članak on the first three, but odabrani članak on the Bosnian site, while ‘home page’ is glavna stranica on the first two, and glavna strana and početna strana, respectively, on the last two. But any dictionary, whether it calls itself Serbocroat, Croatian, Serbian or Bosnian, gives both izab(i)rati and odabrati for ‘select,’ and both strana and stranica for ‘page’; and whether ‘home page’ is interpreted as ‘main page’ or ‘starting page’ is a matter of semantics. In any case it seems rather wasteful to have to consult four separate sites (which don’t cross-reference one another) in search of information.

All this without entering the ongoing argument over whether the language used in Montenegro – which most Montenegrins call srpski, though it is, dialectologically, closer to Croatian than to Serbian – should be regarded as yet another language (crnogorski); or whether the “Bosnian” language should be called bosanski or bošnjački, that is, whether its name should reflect the land of Bosnia or the Bosniak [i.e. Muslim] people (since ethnic Croats and Serbs in Bosnia-Herzegovina prefer to call what they use hrvatski and srpski, respectively). Needless to say, the linguistic boundaries seen on dialect maps bear no relation to the political boundaries by which the official glossonyms are defined.

Language policy in the post-Communist Balkans seems to be a reflex of Stalinist nationality policy, which typically led to the creation of a new standard whenever it was politically desirable to define a new nationality. Thus the USSR created a Karelian language distinct from Finnish and a Moldavian language distinct from Romanian (and, in Central Asia, a Tajiki language distinct from Persian). Of course this was not done when it was more desirable to uphold the unity of a nationality, and so the Ruthenian of eastern Galicia and the Carpathians was subsumed in Ukrainian, while in Albania a single standard was maintained over the sharply divergent Gheg and Tosk variants of Albanian; the standard would vary depending on the origin of whoever was in power.

Overseas variants

There are cases when the geographic boundary dividing language variants is not a line on land but an expanse of sea, whether small (like the Skagerrak between Denmark and Norway) or vast (like the oceans that separate Europe from the Americas, Africa and Oceania).

In the case of Norway and the Afrikaners (Boers) of South Africa, deliberate decisions were made to replace the previously used standards (Danish and Dutch, respectively) by ones that were closer to the local speech. Since both societies are traditionally Protestant, the publication of a Bible translation in the new standard was, in each case, the signal of linguistic independence.

When it came to naming the standards, though, the two situations were quite different. For the Afrikaners, it was a simple matter to apply the name that they had historically used for their vernacular – Afrikaans – to the standard. This endonym has, like Yiddish, become the universal exonym as well, since it would not make sense to use its literal translation (‘African’) as a glossonym (except for the Dutch, who have no choice).

In Norway, however, the vernacular spoken by most people was in the form of dialects that were quite different from the more-or-less Norwegianized Danish used by an educated minority, called at first norsk-dansk and later (as more Norwegian features were incorporated) dansk-norsk, rendered in English as Dano-Norwegian. When a standardized form of this language replaced standard Danish as the official language of the Norwegian government, its endonym became riksmål (‘state language’), while the exonym officially fostered around the world by Norway was simply the local version of “Norwegian.” Things became complicated by the official adoption of another standard – based on Norwegian dialects as far from Danish as possible – that was originally called landsmål (‘country language’) and later nynorsk, a word that may be translated as ‘new Norwegian’ but is intended to be interpreted as New Norse (in contrast to Old Norse). Since there were now two official languages, riksmål was replaced by bokmål (‘book language’). When used without qualification, the exonym ‘Norwegian’ is still usually taken to mean bokmål, just as (unqualified) ‘Chinese’ usually means Mandarin; but if a Norwegian-American recounts that his grandparents spoke Norwegian, it is not at all apparent what their speech was, just as a descendant of Cantonese immigrants might say that they spoke Chinese.

Overseas speakers of English, French and Spanish (not of Creoles based on them) have, on the other hand, felt no need to give their languages names different from the ones in the homelands. Even those Cajuns who still speak their decidedly nonstandard French call it français. Hispano-Americans may differ in what they call the language (castellano or español), but this difference occurs in Spain as well. The French, to be sure, like to label their translations of literature from the Americas as traduit de l’américain when they mean American English and as traduit de l’argentin, du colombien, du mexicain, and so on, for Latin American Spanish, but that’s just another example of l’exception française. (The practice is not confined to the Americas: translated German literature from Austria is similarly labeled as traduit de l’autrichien.) I doubt, though, that Francophone Canadian literature would be regarded in France as being in canadien or québécois.

Among the societies in the Western Hemisphere that don’t speak Creole, only Brazil is home to a state of strong diglossia. While Brazilians generally acknowledge that what they study at school and what they write formal prose in is Portuguese (sometimes qualified as português brasileiro or português do Brasil), they are far more likely to assert that they speak brasileiro (or at the very least português brasileiro or português do Brasil) than simply português. For that matter, ever since the founding of an independent Brazil, whenever the need arose for the naming of the ‘national language,’ a debate would ensue over whether the name should be português or brasileiro, often resulting in the compromise of calling it língua nacional. As in the Balkans and in the east of Spain, the glossonymic debate of Brazil has political overtones, but here the politics is not of nationality but of class and social ideology, comparable to the demotic-katharévousa dichotomy in Greece. As I mentioned earlier, the debate in Valencia also harbors some left-right tension, but there it is mainly symbolic, since it deals with a language that most people don’t speak. In Brazil, the choice of a glossonym has deep implications for the Brazilians’ identity as a people. Many Brazilians seem to feel that, once their language has been defined as Brazilian, they will no longer be bound by the spelling compromises between Brazil and Portugal and will be free to develop an orthography that will better reflect Brazilian speech, finally completing Brazil’s achievement of independence, a process that began with a declaration by a Portuguese prince.

Back to “glossotomy”

I have found a few instances of “glossotomy” (the word that Google asked me if I had meant when I searched for ‘glossonymy’) being used, not in its medical sense (‘an incision of the tongue’ according to Taber’s Cyclopedic Medical Dictionary), but, pejoratively, in the sense of the splitting of a language, as for example by Bulgarians (sometimes peevishly) complaining about the status of Macedonian as an independent language. I have noticed similar (though more politely couched) concerns expressed by Portuguese people about a potential Brazilian language. I can understand these concerns. The perceived sharing of a language with Brazil gives the people of Portugal a link to a culture of broad international popularity, something that their own lacks, despite José Saramago’s Nobel Prize and the recent upsurge in the popularity of fado. Changing the glossonym would break the link. What’s in a name? one might ask. O que há num nome? as a Brazilian Juliet would ask, or O que é que há, pois, num nome? as her counterpart in Portugal might put it. By any other name, the language of Brazil would sound as sweet. And yet...

February 9, 2006
Revised August 7, 2006

© 2006 by Jacob Lubliner

