As the internet’s leading wugologist (is anyone else a wugologist?), I’ve decided to begin collecting the diverse variety of wugs found in the world. After all, wugs come in all shapes, sizes, and colors! More »

I’m not gonna lie, I’m somewhat jealous of #thegiftofdata. This couple tracked their text messages for a whole year of dating and a whole year of marriage, and got some pretty cool word clouds out of it!

A few months ago, Linguistics Club had an arts & crafts night, and most people ended up making cute wugs!! But I forgot to blog about it! More »

I’m a linguist. At the core of my soul, I believe in descriptive linguistics. A biologist goes out in the world, and describes the anatomy of the plants and animals he collects. He doesn’t say “this shouldn’t exist, because evolution says it shouldn’t!” Likewise, linguists go out in the world and describe the languages they collect. They don’t say “this shouldn’t exist, because the grammar rules say it shouldn’t!” Linguists and biologists alike wonder at all the weird structures they find, and aren’t judgmental about them.

Is there a time and a place for “correct grammar”? Sure, but it’s cultural. It’s not an objective fact. Stephen Fry put it nicely: just like you dress up your attire for special occasions (like job interviews), so do people dress up their language. All the science says that most of the “rules” people are taught in grade school are totally bunk. When I taught Intro to Linguistics last summer, I spent a whole lecture going over why Weird Al’s Word Crimes was totally bunk. Other linguists did so too.

Which is why I feel totally weird and out of place being a prescriptivist teaching “correct” “academic” writing. Sure, I can teach them some common conventions and research skills. But I’ve been asked questions like “Is it Tom and me or Tom and I?” And the real answer is “it depends” or maybe “whichever one is more frequent in your dialect”, but the answer they want to hear is “it’s Tom and me if the phrase is an object and Tom and I if the phrase is a subject, because I and me are case-marked pronouns and English was historically a case-marking language, even though we don’t use case anymore except in these pronouns and among some people who still use whom”.

One of the first things many people learn in introductory linguistics classes is that there are about 7,000 languages in the world, plus or minus 1,000, depending on whether you’re counting only living languages or dead ones too, and how you draw the line between languages and dialects. Counting languages is difficult, even for professional linguists. But what about for laypeople? Does the average Joe even realize how many languages there are in the world? Are linguists doing a good job of educating the public? More »

Counting languages is difficult for several reasons. First, there are still uncontacted tribes who speak undocumented languages, so the number could increase. Second, language endangerment and language death are causing rapid loss of linguistic diversity, such that over 50% of the world’s languages are expected to be extinct by 2100, and there’s a good chance linguists have already missed the opportunity to document hundreds, if not thousands, of languages which have already died.

Politics and culture also sometimes make classifying languages and dialects difficult. For example, there are nearly a dozen distinct, yet related, Chinese languages. They’re as different as Spanish, French, and Italian. Yes, they share a common ancestor (just as the Romance languages descended from Latin), but they are no longer mutually intelligible. Mutual intelligibility is the criterion linguists use to determine whether two varieties are separate languages or dialects of the same language, so under this definition they are different languages. However, the Chinese census counts them as dialects of the same language, in order to portray China as a unified country.

The opposite can happen too. Hindi and Urdu are mutually intelligible. They are dialects of the same language. However, since they’re spoken in different countries, by people who largely use different writing systems and follow different religions, many people consider them to be different languages!

All in all, most linguists agree there’s somewhere in the ballpark of 7,000 languages in the world, and @AllTheLanguages is tweeting about them all!

A few months ago, I created a bot on Twitter. @AllTheLanguages tweets a new language from the Ethnologue database once every hour or so, and will do so for about a year. Give or take. Sometimes the bot goes down and I have to reboot it. And there are some other bugs too. But more on that in another post…

When I tell people that I made a Twitter bot, the first thing they ask (after “why?”) is “how?” Well, today, I’m going to answer that! Why? Because it was fun! How? Well, it’s complicated… More »

Language has a pretty interesting property known as Zipf’s Law. That is, language data (and even subsets of language data) have a Zipfian distribution: a word’s frequency is roughly inversely proportional to its frequency rank. There are a small number of highly frequent words, and a large number of highly infrequent words. Moreover, the frequent words tend to be short grammatical words (words that are grammatically required but don’t really mean anything), and the infrequent words tend to be longer lexical words (words like nouns and verbs which have some sort of referent or meaning).

What does this mean? Well, to show you, I downloaded all of the English Wikipedia (and you can too here). More »
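If you want to poke at this yourself before downloading anything huge, here is a minimal sketch of a rank-frequency count in Python. It is not the script used for the Wikipedia data; the function name `rank_frequency`, the crude regex tokenizer, and the toy `sample` sentence are all assumptions made up for illustration. A text this tiny can't really show a Zipfian curve, but the same function should work on any larger text you feed it.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Tokenization is deliberately crude (lowercase runs of letters and
    apostrophes); this is an illustrative assumption, not a real tokenizer.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # most_common() sorts by descending count; ties keep first-seen order.
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


# Hypothetical toy text; swap in a much larger corpus to see the pattern.
sample = "the wug saw the other wug and the wug saw a bird"

for rank, word, count in rank_frequency(sample):
    # Under Zipf's Law, rank * count stays roughly constant in large corpora.
    print(rank, word, count, rank * count)
```

Even here, the short grammatical word "the" lands at the top, while most of the other words show up only once.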

Teaching Intro to Linguistics is probably one of the most challenging but fun things I’ve done in a long time. Usually I can answer any question my students come up with, but once in a while they stump me.

Case in point: the other day, one of my students asked if new phonemes ever get discovered and the IPA chart gets updated. More »

Every now and then I come across an article or a documentary which claims to reveal insights into an “ancient” language or culture. This video takes modern camera equipment to the Khoisan people of Namibia, who are said to be an ancient tribe, with ancient ways of speaking and an ancient culture. This video takes out-of-context quotes to argue that Basque is an ancient and superior language and culture. More »