Tag Archives: etymology

Septiembre, octubre, noviembre, diciembre

If you are reading this blog you must be interested in languages, so you may already be familiar with the etymology of the last four months of the year in Spanish, English, and many other languages. They come from the Latin words for seven, eight, nine, and ten, and were thus named because the Roman calendar began in March, making September the seventh month and so on.
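The arithmetic behind this is easy to check. Here is a minimal sketch that recounts the months with March as month 1. The function name `roman_position` and the data layout are only illustrative, not taken from any source.

```python
# Latin numerals behind the last four month names (seven through ten).
LATIN_NUMBERS = {"septem": 7, "octo": 8, "novem": 9, "decem": 10}

# Modern Spanish month order, January first.
MODERN_ORDER = [
    "enero", "febrero", "marzo", "abril", "mayo", "junio",
    "julio", "agosto", "septiembre", "octubre", "noviembre", "diciembre",
]


def roman_position(month: str) -> int:
    """Return the month's position when the year is counted from March."""
    modern_index = MODERN_ORDER.index(month)    # 0-based, enero = 0
    march_index = MODERN_ORDER.index("marzo")   # 2
    # Shift so that marzo becomes 1; the modulo wraps enero and febrero
    # around to the end of the year.
    return (modern_index - march_index) % 12 + 1


# Pair each of the last four months with its Latin numeral.
for month, (latin, value) in zip(MODERN_ORDER[8:], LATIN_NUMBERS.items()):
    print(f"{month}: position {roman_position(month)}, Latin {latin} = {value}")
```

Counted this way, enero and febrero land in positions 11 and 12, which matches February being the last month of the Roman year.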

Despite being fascinated by languages since I was a girl, speaking at least three languages that use these words, and knowing the relevant numerical prefixes, I somehow never made the connection between the numbers and the months until recently, when I started to study Italian in preparation for an upcoming trip. I had made some progress in Italian before but have now zoomed ahead using Language Transfer, a method developed by linguist and humanitarian Mihalis Eleftheriou. In his free courses, Mr. Eleftheriou likes to draw connections between the target language and English (and sometimes other languages), and in the process points out interesting etymologies such as these.

I recommend Language Transfer’s “Complete Spanish” as a first course in Spanish (or a refresher) to anyone who reads this blog. And if Mr. Eleftheriou comes across this blog post, I encourage him to contact me. I would be delighted to send him a copy of my first book (¿Por qué?) in thanks for his help with Italian.

For the sake of completeness, the etymology of the remaining months of the year is as follows:

enero: Janus, Roman god of beginnings and gates
febrero: Latin februa ‘purification’. Since February was the last month of the Roman calendar, the Romans held a feast of purification on the ides of the month (February 15).
marzo: Mars, god of war
abril: unknown / disputed
mayo: Maia, earth goddess and wife of Vulcan
junio: Juno, goddess of marriage and wife of Jupiter
julio: Julius Caesar
agosto: Augustus Caesar

Something borrowed, something blue

For the last few years I’ve had a research project about Spanish word origins on the back burner. This summer I’ve resurrected the project, and it is simmering nicely: I have now finished the first major stage.

The focus of the project is Spanish borrowings, or loanwords: words in Spanish that originated in other languages. The project applies to Spanish the methodology from Martin Haspelmath and Uri Tadmor’s World Loanword Database (WOLD) project. Beginning in 2004, Haspelmath and Tadmor organized a team of linguists to collect data on loanwords in forty-one languages around the world. In 2009 they published their results in a book, Loanwords in the World’s Languages: A Comparative Handbook (De Gruyter), and the contributing linguists shared their data on the WOLD website.

My goals in this project are:

  1. To compare Spanish to the forty-one languages in the WOLD project, in terms of (i) its percentage of loanwords, and (ii) these words’ characteristics, such as their part of speech.
  2. To quantify the relative contributions of different source languages to Spanish vocabulary. I already did this for my first book, using a random sampling of five hundred words from a standard Spanish etymological dictionary. But that sample may have skewed toward more recherché vocabulary.
  3. To address various issues involved in etymological research, in Spanish and in general.

More about the WOLD project

In order to obtain comparable results across the WOLD languages, all participating linguists started with the same list of 1460 core meanings: ‘house,’ ‘mother,’ ‘go,’ and so on. Each linguist identified ‘their’ language’s words for these meanings, then traced the origins of those words using a standardized set of guidelines. I have now completed the first of these two steps for Spanish. It raised all sorts of interesting issues, which I will discuss in my next blog post.

One goal of the WOLD project was to compare the frequency of borrowing in different languages. In other words, of the core meanings, how many were expressed in each language by loanwords? As shown in the table below, borrowing rates ranged from 1.2% for Mandarin Chinese to 62.7% for Selice Romani. Yaron Matras’s review of the WOLD Handbook in the journal Language points out that these two languages are spoken in diametrically different environments. Speakers of Mandarin “show little or no bilingualism”; the language has “a status as a majority language, a powerful standard, and a sociopolitically dominant population.” In contrast, Selice Romani is associated with “universal multilingualism, a minority language status, the absence of a written standard, and sociopolitical marginalization.”

Romanian, the only Romance language in the project, fell into the “high borrowers” category (25.9% to 45.6%), as did English. My previous research (see above) placed Spanish in the “very high borrowers” category, with roughly one-third “native” vocabulary (from Vulgar Latin), one-third later borrowings from Latin, and one-third words from other languages. It will be interesting to see whether this holds up for a WOLD-based lexicon.

Borrowing type | % loanwords | Languages (in increasing order of % loanwords)
“Low borrowers” | 1.2 – 9.7% | Mandarin Chinese, Old High German, Manange, Ket
“Average borrowers” | 10.7 – 22.4% | Otomi, Seychelles Creole, Gawwada, Hup, Oroqen, Hawaiian, Kali’na, Iraqw, Q’eqchi’, Wichí, Zinacantán Tzotzil, Malagasy, Dutch, Kanuri, White Hmong, Mapudungun, Hausa, Lower Sorbian
“High borrowers” | 25.9 – 45.6% | Takia, Thai, Yaqui, Swahili, Vietnamese, Sakha, Archi, Imbabura Quechua, Kildin Saami, Bezhta, Indonesian, Japanese, Ceq Wong, Saramaccan, English, Romanian, Gurindji
“Very high borrowers” | 51.7 – 62.7% | Tarifiyt Berber, Selice Romani
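To place a new language such as Spanish in this table, you need a borrowing rate and a band. The sketch below shows one way to compute both. The cutoffs of 10%, 25%, and 50% are my assumption: they fit the gaps between the observed ranges above, but the table itself only gives the ranges. The function names and word counts are hypothetical.

```python
def borrowing_rate(loanwords: int, total_words: int) -> float:
    """Percentage of a language's words (for the core meanings) that are loanwords."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * loanwords / total_words


def borrowing_type(rate_percent: float) -> str:
    """Assign a borrowing band. The cutoffs (10, 25, 50) are assumed, not quoted."""
    if rate_percent < 10:
        return "low borrower"
    if rate_percent < 25:
        return "average borrower"
    if rate_percent < 50:
        return "high borrower"
    return "very high borrower"


# The two extremes reported in the table.
print(borrowing_type(1.2))    # Mandarin Chinese
print(borrowing_type(62.7))   # Selice Romani

# A made-up count for illustration only: 365 loanwords out of 1460 words.
rate = borrowing_rate(365, 1460)
print(rate, borrowing_type(rate))
```

Since a language may have several words for one meaning, or none, the denominator is best taken as the number of words actually collected rather than the fixed list of 1460 meanings.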

Another goal of the WOLD project was to learn more about borrowing in general. The research confirmed several generally accepted principles about borrowings:

  • Function words were borrowed less than content words (nouns, verbs, adjectives, and adverbs). Overall, 12% of function words were borrowed, compared to 25% of content words.
  • Nouns were more likely to be borrowed (31%) than other types of content words (14-15%).
  • Borrowing was most common for cultural vocabulary, such as religion, clothing, housing, law, social and political relations, agriculture, food, and warfare; and least common for personal vocabulary, such as sense perception, spatial relations, body parts, and kinship.

Motivation

My interest in the WOLD methodology dates from 2018, when I was starting to work on my second book, Bringing Linguistics into the Spanish Language Classroom. The book is organized around five themes, or “essential questions,” including “How is Spanish different from other languages?” and “How is Spanish similar to other languages?” I thought it would be interesting to compare Spanish to the WOLD languages so that I could say either “Spanish has borrowed more words than most other languages” or “Spanish has borrowed a typical amount of words.” (I was confident that Spanish would be a “low borrower.”)

I originally imagined that I could research this topic in a couple of weeks, but soon ran into methodological issues such as:

  • Should word pairs like hijo and hija (‘son/daughter’) be counted as two separate words, even though they are just masculine and feminine forms of the same word?
  • WOLD linguists could identify multiple words for a single meaning. How far should this be taken for Spanish? How does one draw the line between synonyms and dialectal variants?
  • When looking up word origins, the WOLD guidelines count a word as borrowed if it entered the language at any point in the language’s history. This would include, for instance, words borrowed into Classical or Vulgar Latin, such as gato ‘cat.’ (Vulgar Latin cattus is believed to be Afro-Asiatic in origin, and replaced the original Latin feles.) This guideline rubbed me the wrong way. Shouldn’t Spanish begin with Vulgar Latin?

After three months of a futile quick-and-dirty run at these issues, I decided to put the project on my back burner and eventually do a more thorough job that would hopefully yield publishable results. So…here we are.

57 words with eñe

I’ve had the wonderful Spanish ñ on my mind lately (see e.g. here), and today decided to make a list of reasonably common Spanish words that use this characteristic letter. This started as a plain list of 57 words. Then I added translations. Then I couldn’t resist going back in time: I knew that the ñ sound had several different origins, but was curious to see how this worked out statistically.

The results are below, in tabular form so you can play with the words yourself if you like. The table is sorted by Type, meaning the type of the word’s origin; within each type, words are listed in alphabetical order. The types themselves are ordered by frequency.

  • The most common origin is therefore the first one you see in the table: a Latin ne or ni. When followed by another vowel, the e or i turned into a y sound, which in turn had a transformative effect on the n. An example is España, from Latin Hispania. (The y sound had a similar effect on other consonants, not just n, and the resulting changes are referred to as palatalization.)
  • The next most common origin is a Latin double nn; this is the source of the tilde (~) itself. An example is año, from Latin annus. Pleasingly, the suffix -eño has a dual origin, with one derivational path of the ne/ni type (seen in words like isleño) and another of the nn type (seen in pequeño).
  • The third group is a Latin gn or ng sequence as in enseñar from Latin insignare. I knew that some ñ’s came from gn, but the ng words were a surprise.
  • Next are words borrowed from other languages. Here we find most of the list’s words that begin with an ñ (ñandú, ñoqui, ñu), and source languages as disparate as Quechua (chuño), Dutch (ñu), and Italian (tacaño).
  • The mn group could really be collapsed under nn, because these words passed through an nn stage before emerging with an ñ. An example is sueño, from Latin somnus.
  • Some words on the list were internally derived from other Spanish words. For example, caña ‘reed’ gave rise to both cañón and cañada.
  • Finally, one word (rebaño) is of unknown origin — too bad I missed it when writing this recent post — and one (cariño) has a known origin that doesn’t seem likely to produce an ñ.
# | Word | Translation | Origin | Type
1 | araña | spider | Lat. aranea | ne/ni
2 | baño | bathroom | Lat. balneum | ne/ni
3 | campaña | field, campaign | Lat. campania | ne/ni
4 | compañero | companion | Lat. compania | ne/ni
5 | emponzoñar | to poison | Lat. potiniare | ne/ni
6 | España/español | Spain/Spanish | Lat. Hispania | ne/ni
7 | huraño | shy | Lat. foraneus | ne/ni
8 | isleño | islander | -eño suffix from Lat. ‑ineus | ne/ni
9 | jalapeño | type of pepper | Jalapa (Mex. prov.) plus -eño | ne/ni
10 | migraña | migraine | Lat. hemicrania | ne/ni
11 | montaña | mountain | Lat. montanea | ne/ni
12 | ordeñar | to milk | Lat. ordiniare | ne/ni
13 | piña | pinecone, pineapple | Lat. pinea | ne/ni
14 | saña | rage | Lat. insania | ne/ni
15 | señor, señora, señorita | Mr., Mrs., Miss | Lat. senior | ne/ni
16 | viña | vine | Lat. vinea | ne/ni
17 | añil | indigo | Ar. an-nil | nn
18 | año | year | Lat. annus | nn
19 | caña | reed | Lat. canna | nn
20 | engañar | to fool | Lat. ingannare | nn
21 | guiño | wink | Lat. cinnus | nn
22 | muñeca | wrist, doll | Lat. bonnicca | nn
23 | niño | boy | Lat. ninnus | nn
24 | ñoño | dull | Lat. nonnu | nn
25 | paño | cloth | Lat. pannus | nn
26 | peña | rock, crag | Lat. pinna | nn
27 | pequeño | small | -eño suffix from Lat. ‑innu | nn
28 | enseñar | to teach | Lat. insignare | gn/ng
29 | estaño | tin | Lat. stagnum | gn/ng
30 | heñir | to knead | Lat. fingere | gn/ng
31 | leña | firewood | Lat. ligna | gn/ng
32 | puño | fist | Lat. pugnus | gn/ng
33 | reñir | to scold | Lat. ringi | gn/ng
34 | señal | signal | Lat. signa | gn/ng
35 | tamaño | size | Lat. tam magnus ‘so big!’ | gn/ng
36 | teñir | to dye | Lat. tingere | gn/ng
37 | uña | nail | Lat. ungula | gn/ng
38 | bruñir | to polish | Occ. brunir | borr
39 | buñuelo | fritter | Cat. bony | borr
40 | champaña | champagne | Fr. champagne | borr
41 | chuño | potato starch | Quech. ch’uñu | borr
42 | gañán | farmhand | Fr. gaaignant | borr
43 | ñandú | rhea | Guar. ñandú | borr
44 | ñoqui | gnocchi | Ital. gnocchi | borr
45 | ñu | gnu | Dutch gnoe | borr
46 | tacaño | stingy | Ital. taccagno | borr
47 | vicuña | vicuña | Quech. uikuña | borr
48 | daño | harm | Lat. damnum | mn
49 | doña | lady | Lat. domina | mn
50 | dueño | master | Lat. dominus | mn
51 | otoño | autumn | Lat. autumnus | mn
52 | sueño | dream | Lat. somnus | mn
53 | apañar | to fix | from paño (above) | der
54 | cañada | ravine | from caña | der
55 | cañón | cannon, canyon | from caña | der
56 | cariño | affection | Lat. carere |
57 | rebaño | flock | unknown |
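If you would like to check the statistics yourself, here is a minimal sketch that tallies the table by origin type. The words and types are copied from the table; the label "other" for the last two rows (cariño and rebaño, which have no type in the table) is my own placeholder.

```python
# Words from the table, grouped by origin type. Multi-form rows such as
# "España/español" are counted once, under their first form.
WORDS_BY_TYPE = {
    "ne/ni": ["araña", "baño", "campaña", "compañero", "emponzoñar", "España",
              "huraño", "isleño", "jalapeño", "migraña", "montaña", "ordeñar",
              "piña", "saña", "señor", "viña"],
    "nn": ["añil", "año", "caña", "engañar", "guiño", "muñeca", "niño",
           "ñoño", "paño", "peña", "pequeño"],
    "gn/ng": ["enseñar", "estaño", "heñir", "leña", "puño", "reñir", "señal",
              "tamaño", "teñir", "uña"],
    "borr": ["bruñir", "buñuelo", "champaña", "chuño", "gañán", "ñandú",
             "ñoqui", "ñu", "tacaño", "vicuña"],
    "mn": ["daño", "doña", "dueño", "otoño", "sueño"],
    "der": ["apañar", "cañada", "cañón"],
    "other": ["cariño", "rebaño"],  # placeholder label, not in the table
}

counts = {origin: len(words) for origin, words in WORDS_BY_TYPE.items()}
total = sum(counts.values())

# Print the types from most to least frequent, with their share of the list.
for origin, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{origin:6} {n:2d}  {100 * n / total:.0f}%")

# Words on the list that begin with ñ.
initial_enye = [w for words in WORDS_BY_TYPE.values() for w in words
                if w.startswith("ñ")]
print(initial_enye)
```

Moving the five mn words under nn, as suggested above, would only widen the gap between the top two types and the rest.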

Linguistics projects for the foreign language classroom

In a workshop I recently gave in Atlantic City, I distributed the following list of possible linguistics-based projects for the foreign language classroom. They are adaptable for a variety of languages and levels of instruction. To download a PDF version, click here.

This list is a subset of the projects included in the companion website for my book ¿Por qué? 101 Questions about Spanish. Here I divided them into the four categories of “Language history,” “The target language in the world,” “Language learning,” and “Language use.”

If you make use of this list, as an instructor or a student, please write back and let me know how the project(s) turned out.

Language history

  • Examine a few pages written in an older form of the target language. What are obvious ways that the language has changed?
  • Look up the origins of the words in either (i) a sample of text from the target language, or (ii) a specific vocabulary domain, such as clothing or animals. Where do the words come from, and what does this teach about the history of your language?
  • Research and create an infographic about a phase in the history of your language, such as the Golden Age of Spain or the Napoleonic period in France. What were the linguistic landmarks of these periods?
  • Research vocabulary borrowings into English from the target language. What do they tell you about how the two cultures have interacted?
  • Research the etymology of a dozen place names (names of cities, towns, etc.) in a country that speaks your target language. What does this exercise teach you about the language’s history? Summarize your findings on a map or other infographic.

The target language in the world

  • Use Ethnologue (an on-line database about world languages) to gather data on where the target language is spoken and what other languages are spoken in those countries. Present as an infographic or a slideshow.
  • Profile a language academy such as the Académie française or the United States branch of the Real Academia Española. Who are the members? What are their activities and/or publications? What would you ask if you could interview them?
  • Research and present information about a language controversy, such as Catalan versus Castilian in Catalonia, or the historical tussle between French and Alsatian in Alsace.
  • What information does the most recent USA census provide about speakers of your target language in our country?

Language learning

  • Try to predict which features of English are hardest to learn for speakers of other languages. Interview an ESL teacher to test your predictions.
  • Try a few lessons in the target language from Duolingo, Rosetta Stone, or other language learning software. How does the software try to teach the language? How is this different from classroom learning?

Language use

  • How might you reform the spelling of your target language to make it easier? Argue for your changes and transform a sample page using your proposed changes.
  • Pick your favorite language rule: ser vs. estar, passé composé vs. imparfait, and so on. Analyze actual text (perhaps a newspaper article) to see if the rules taught in class explain the actual usage.
  • Learn how to speak “Pig Latin” in the target language (e.g. Spanish jerigonza). A speed contest may be in order! What do you have to think about as you speak in order to accomplish this?
  • Find, watch, and compare instructional videos on some difficult aspect of pronouncing your language (like rolling your r’s). Make your own instructional video.

A new online Spanish etymological dictionary

Today’s post is about a new online resource for the Spanish language lover: the Online Etymological Dictionary of Spanish, or OEDoS. A screen clip of the welcome screen is below. The website was inspired by Douglas Harper’s very useful online etymological dictionary of English. It went live in July, and has its own Facebook page. The primary resource consulted to create the entries has been Corominas’s Diccionario crítico etimológico de la lengua castellana. (This is the six-volume standard, whose shorter version is one of the “top 10 books” on my bookshelf.)

[Screen clip: OEDoS welcome screen]

I contacted the OEDoS Team to find out more about their methodology. Via a friendly return email I learned that the dictionary began with the 2000 most frequently used words of Spanish, with others added because of etymological importance, user requests, and other reasons. My OEDoS contact, Patrick Welsh, gave a quite interesting explanation of how the OEDoS handles etymological disagreements:

As regards conflicting etymologies, we the OEDoS team recognize a dual responsibility of both accuracy and readability. We aim to capture disagreement between linguists whenever possible. In the interest of our time constraints and resources, this is not always possible. Sometimes this breeds disagreement on our side as well. For example, the etymology of hacer (http://spanishetym.com/term/hacer) sparked significant disagreement on historical accentuation and lexical borrowing; this caused the publication of the entry to be delayed for some time. We note the incisive criticism of Penny and others in the 1980s toward Meyer-Leubke, as well as very recent scholarship on Latin’s reflexes in Romance. Ultimately, we decided Meyer-Leubke’s comments were strong enough to overcome our initial wariness. Brief mention of two modern publications were included in the entry as well. Sometimes the entry you see in the dictionary is a snapshot of disagreement: not only between historical linguists at their university desks but between us as well

 

I hope that you will all visit this website and spread the word about the project.

The strange history of muñeca

A stray comment on /r/Spanish got me thinking about muñeca, the word that, bizarrely, means both ‘doll’ and ‘wrist’. The ‘doll’ meaning is primary. It’s the one listed first in dictionaries, and if you do a Google image search on muñeca, you see more dolls than wrists. It’s the first meaning that I learned, since ‘wrist’ is one of the less important body parts. When I eventually learned the second meaning, I was surprised that one word could have two such completely unrelated interpretations.

I’ve just looked up the history of muñeca in my can’t-live-without-it etymological dictionary by Joan Corominas. It turns out that the word’s original meaning was neither ‘doll’ nor ‘wrist’, but something entirely different: ‘milestone’, in the physical sense of a road marker.

Muñeca ‘milestone’ turns into both ‘wrist’ and ‘doll’.

How did this bizarre transformation take place? According to Corominas, the key was the interpretation of a milestone marker as something that sticks up out of the ground: a bump, or using fancier English, a protuberance. The word was then extended to ‘wrist’ because the wrist bone protrudes from the arm. The road to ‘doll’ began with the extension of muñeca to a bumpy bundle of rags, and from there to a rag doll, and then other dolls.

Muñeca’s original meaning of ‘milestone’ has been lost from everyday discourse. It is still included in the Real Academia’s dictionary, but only after ‘doll’, ‘wrist’, and other meanings related to ‘doll’, such as ‘cadaver’ and ‘bimbo’.

Incidentally, the earlier history of muñeca is obscure. It is not Latin, but seems to come from a pre-Roman language, possibly Celtic.

Fun with Proto-Indo-European roots

Recently I’ve been playing with John Slocum’s terrific Indo-European Lexicon website and wishing I’d discovered it earlier. In case you didn’t know, Spanish and the other Romance languages are part of the Indo-European language family. Other branches of this enormous family include Germanic, Greek, Celtic, Slavic, and Indo-Iranian. Spanish is therefore related to languages as diverse as Gaelic and Gujarati; to Sanskrit, Serbian, and Swedish; to Pashto, Persian, and Polish; and to Hindi and Hittite.

Dr. Slocum’s Lexicon lets you trace vocabulary roots up and down the Indo-European family tree. For example, let’s say you’re curious about the origin of the Spanish word pan “bread”. If you click on the Language Index you can then scroll down to Spanish. (For a shortcut, you can access the Spanish page here.) This page lists almost 500 Spanish words whose Indo-European roots are included in the Lexicon. Pan is traced back to the Proto-Indo-European root pā-. Click on that root and you’ll move up the tree, to an entire page devoted to pā-. This page provides a definition and a list of the root’s descendants in all ten branches of the Indo-European family.

It turns out that pan is related to several sets of English words. I knew about some of them, but not all.

  • food and fodder
  • company, companion
  • forage, foray, foster
  • pantry, pannier
  • pastor, pasture, repast, pastern, pester
  • antipasto (but not “pasta”, go figure)
  • pabulum

For other words, though not pan, tracing a root back down the tree can show you surprising connections within Spanish. For example, llama “flame” and blanco “white” share the same Indo-European root, as do armisticio, arrestar, asistir, costar, estado, and estar.

Why are you still reading? Run along and play!