Tuesday, March 20, 2012

Wednesday Search Challenge (3/21/12): Are there more languages above or below the equator?

We were talking the other day about how languages came into being. You know the stories--English grew out of the Anglo-Saxon dialects with dashes of other languages (especially French and Latin) added in.  Here's one view of how languages are scattered around the world: 
Image inspired by a language map from WikiMedia Commons
But then the discussion grew a bit more heated... Did languages arise because of geographic isolation, did they come about because of drift over time, or... what?  

After a few glasses of wine, various complex and learned theories were bandied about, and in the end we boiled it down to one really interesting search problem that we couldn't resolve with a quick Google search:  

     Are there more languages ABOVE the Equator or BELOW the Equator?  

A big part of answering questions like this is making your terms very clear.  For this question, let's consider only official languages (that is, languages recognized by the country) and not worry about the relative numbers of speakers. We're just interested in whether or not countries below the Equator have more languages than those above the Equator.  

To take an example, we know Switzerland has four "official" languages (German, French, Italian and Romansch).  Romansch has been recognized as one of four "national languages" by the Swiss Federal Constitution since 1938.

So, once you figure out what the languages of all the world's countries are, and divide them into languages spoken above-equator and below-equator, you'll be set to give the answer.  Notice that some languages (such as French or English) are spoken both above and below the Equator.

And for countries that straddle the Equator, let's go with the location of the capital as deciding if the country is above or below.  Thus, Ecuador (capital: Quito, at 0° 9′ 0″ S, 78° 21′ 0″ W) is in the southern hemisphere.  
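The capital rule comes down to the sign of one latitude. Here is a minimal sketch of that arithmetic, using the Quito coordinates quoted above. The function names are made up for illustration, and treating a latitude of exactly 0 as "north" is an arbitrary assumption the post does not address.

```python
def dms_to_decimal(degrees, minutes, seconds, direction):
    """Convert degrees/minutes/seconds plus N/S/E/W into signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by the usual convention.
    return -value if direction in ("S", "W") else value


def hemisphere(latitude):
    """Classify a capital by latitude; exactly 0 counts as north (arbitrary choice)."""
    return "S" if latitude < 0 else "N"


# Quito's latitude as quoted in the post: 0 degrees, 9 minutes, 0 seconds south.
quito_lat = dms_to_decimal(0, 9, 0, "S")
print(quito_lat, hemisphere(quito_lat))
```

Nine minutes is 9/60 = 0.15 of a degree, so the capital sits just south of the line and Ecuador is counted as a below-the-Equator country.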

Obviously, you don't want to do this one country or one language at a time, so the real question is... HOW will you do this?  

As usual, please let us know your answer, HOW you figured it out, and HOW LONG it took for you to figure this out!  

More above or below?  The ideal solution will actually determine some way of measuring this directly from data available on the web.  

Search on! 


  1. Took me a little while, this one (although this is my first time attempting a SearchReSearch challenge), though I guess 10 minutes only makes me impatient. Almost gave up, but I kept looking back at your suggestion about "making your terms very clear." So, after trying some different things, I searched "languages southern hemisphere" — instead of "below equator" — and found that Wikipedia's page on the Southern Hemisphere mentioned languages. From the page:

    "The Hemisphere is also remarkably less diverse linguistically compared to the North, as the majority of the hemisphere's population can speak one of just five languages Portuguese, English, Spanish, French, or Indonesian."

    And there you have it.

  2. My take on this was to look at population distribution in the Northern & Southern hemispheres - given that it appears roughly 90% of the population lives north of the equator, it would seem reasonable to assume that most of the languages would be there also. It also seems reasonable to see language as heavily influenced by exploration and cultural projection - given the way things have gone the last 500 +/- years, the dominance of European, Middle Eastern & Asian based languages (predominantly north of the equator) would also be reasonably accepted. That said, clearly language is fluid, dynamic and continually evolving; e.g., the "isolated", south of the equator, "English" speaking continent of Australia offered the word bogan, whose meaning I had to search for the other day... point being that, even within a common language, it is quite easy to lose comprehension/meaning without other cultural talismans. Perhaps another way to have approached it would be from an epidemiological view - Language is a virus, constantly morphing to fulfill the need to communicate/survive - courtesy of Laurie Anderson & William S. Burroughs:
    Home of the Brave
    So, I didn't find the "ideal solution". I spent about an hour looking & thinking about this - in terms of sources, other than the usual search engines, I looked at WolframAlpha since I thought this was a fairly hard-data question, but didn't get anywhere there. I also looked at the CIA World Factbook but found that less than helpful for this... I did run across specific language info here that might be of interest to someone:
    Nations Online Project
    This is now in TL;DR territory - will be interested in tomorrow's explanation - feeling kinda meh about the ?...por qué? the ? was too amorphous, the "clear terms" too arbitrary = hollow info?

  3. Also a timely search challenge given one recent language-related news item:

    I wonder if sign language is recognized as an official language anywhere...

    Back to the challenge

  4. South. PNG and Indonesia have well over 1000 combined.


  5. No search required. Northern Hemisphere because Europe. Also more land area in northern hemisphere, so even with Africa as a wildcard, there are more countries in the northern hemisphere. Same for South America.

    Source: deductive reasoning.

  6. Most probably above equator


  7. Northern Hemisphere has more languages than the Southern Hemisphere. 1713 vs 930 according to the World Atlas of Language Structures (wals.info)

    I searched for "languages by longitude latitude" on Google. The third hit is a page by Dr Dryer from the Linguistics department at the University at Buffalo. Looking at his faculty page led me to the World Atlas of Language Structures, where I searched for languages by region (sections above and below the equator).

    Most likely took 15 min.

  8. North.

    I used http://www.nationsonline.org/oneworld/languages.htm which allows you to look at official languages of the world by country. I looked only for languages which were unique to one side of the equator or the other, since languages spoken in both hemispheres would cancel each other out. I realized that most of the landmass is in the northern hemisphere. After checking only the southern hemisphere, there were only about 40-some languages that were unique, whereas there were far more in the northern hemisphere. It was easy to count these quickly since in most cases it is obvious which hemisphere the country is in; the only ones that took some time were the ones that sit on the equator and therefore required me to check where the capital is.

    Took about ten minutes.

  9. I think that the key term was "official"; I looked at the Wikipedia page http://en.wikipedia.org/wiki/List_of_official_languages which generally supported CA Jim's assertion. But just look at http://en.wikipedia.org/wiki/Trans%E2%80%93New_Guinea_languages and you can see the huge number of languages there. I suspect there's a similarly large number of languages spoken in the Amazon basin.

  10. So one way to approach the search is to combine multiple data sets using tools like Google Docs or Excel. In order to do so, you'll need to find three lists: a) official languages for each country, b) country location with respect to the equator (this could be latitudinal coordinates or some other indicator), and c) a list of lat/long locations for the national capitals. Combining the three data sets shouldn't take too much effort.

    A) For lists of official languages, there are probably numerous sources available online. Wikipedia has a page dedicated to the list, and, even better, the CIA factbook series has a table-formatted list, by country, indicating all languages spoken, with official languages highlighted.
    B) Country location lists should be obtainable the same way. Querying "list of country latitudes" produces this page: http://goo.gl/9Fj7r with XLS files available for download.
    C) Same for capitals.

    Merge the three lists, have the spreadsheet break the list into N and S latitudes, and tally the number of official languages in each bin.

    To answer my own question from my previous comment...New Zealand lists New Zealand Sign language among its Official Languages.
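The merge-and-tally idea in this comment can be sketched in a few lines of Python instead of a spreadsheet. This is only an illustration under stated assumptions: the three sample rows are hand-typed stand-ins for the downloaded lists, the dictionary and function names are hypothetical, and only the capital's latitude is used to pick a hemisphere, following the rule in the post.

```python
# Illustrative sample rows only; a real run would load the full downloaded lists.
official_languages = {
    "Switzerland": ["German", "French", "Italian", "Romansh"],
    "Ecuador": ["Spanish"],
    "Brazil": ["Portuguese"],
}

# Approximate capital latitudes in decimal degrees (negative = south).
capital_latitude = {
    "Switzerland": 46.95,  # Bern
    "Ecuador": -0.15,      # Quito, 0 deg 9 min S as quoted in the post
    "Brazil": -15.79,      # Brasilia
}


def tally_by_hemisphere(languages_by_country, latitude_by_country):
    """Join the two lists on country name and count distinct languages per bin."""
    bins = {"N": set(), "S": set()}
    for country, languages in languages_by_country.items():
        latitude = latitude_by_country.get(country)
        if latitude is None:
            continue  # no coordinates for this country: skip rather than guess
        bins["N" if latitude >= 0 else "S"].update(languages)
    return {hemi: len(langs) for hemi, langs in bins.items()}


counts = tally_by_hemisphere(official_languages, capital_latitude)
print(counts)
```

Using sets means a language shared by several countries in the same bin is counted once. Whether that is the right way to count is exactly the kind of term that has to be pinned down first.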

  11. "A big part of answering questions like this is making your terms very clear."

    There are more languages above the equator. The equator is an imaginary line on the surface of the earth. Below the surface of the earth, there are very few languages. Above the surface of the earth is where most languages are.

    "A big part of answering questions like this is making your terms very clear."

  12. Step one - go to the Ethnologue and download their database.
    Step two - count the number of languages with latitude > 0, and the number of languages with latitude < 0.

    Above the equator: 4,409
    Below the equator: 2,749

    Since I already have a copy of the ethnologue (step 1), then this took me about 1 minute.
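Step two is a one-pass count over whatever table holds the coordinates. A minimal sketch follows, assuming the data has been exported to CSV with a column named `latitude`. The column name, the function name, and the four sample rows are placeholders, not the Ethnologue's actual format or data.

```python
import csv
import io


def count_by_latitude(rows, lat_field="latitude"):
    """Return (north, south) counts; rows with missing or bad latitudes are skipped."""
    north = south = 0
    for row in rows:
        try:
            latitude = float(row[lat_field])
        except (KeyError, TypeError, ValueError):
            continue  # no usable coordinate for this entry
        if latitude > 0:
            north += 1
        elif latitude < 0:
            south += 1
    return north, south


# Placeholder rows standing in for a real export; lang_d has no coordinate.
sample = io.StringIO(
    "name,latitude\n"
    "lang_a,46.9\n"
    "lang_b,-6.3\n"
    "lang_c,12.5\n"
    "lang_d,\n"
)
north, south = count_by_latitude(csv.DictReader(sample))
print(north, south)
```

Note that this counts every listed language by a single point location, which is a different measure from counting official languages by country.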

  13. I think it's more complicated than that, as there are 11 official languages in South Africa alone.

  14. Papua New Guinea has over 850 separate languages (not dialects) and about 20% of the world's total. Greatest diversity anywhere.
    Amazon rainforest accounts for another 300 plus native languages.
    Australian Aboriginal languages are down to about 20 from 400 a century ago.
    Don't be so Eurocentric.

    1. I greatly agree. Despite the fact that the North may be more populated than the South (thanks to Asia), without googling this I can assume the South is much more diversified when it comes to languages. I am Nigerian and I speak 3 languages of about 500. You can google that; Nigeria alone has about 500, maybe more actually. So my guess is the South.

    2. ...uh, Nigeria is north of the equator on the maps I've seen... geography isn't "eurocentric" and the parameters of the question weren't about total number of languages - just sayin'... if you were to say don't be so "googlecentric" you might have a stronger point, but maybe I'm just stating the obvious.

  15. How about asking dbpedia?


  16. More above the equator, almost by a factor of 3.

    1. Use the "Official languages of sovereign countries" section on Wikipedia's "List of official languages" page. http://bit.ly/GEIIev

    2. Extract a list of unique countries from that section.

    3. Assign each country a hemisphere -- "N" or "S". (I did this manually, but it would likely be possible to automate this. If you're good at geography, it doesn't take too long!)

    4. Write a script (I used Perl) that uses the list created in #2 and then parses the Wikipedia section above. For each language, read each country that speaks it. For each country, determine in which hemisphere each language is spoken, increment a counter for that hemisphere, but only count each language once per hemisphere.

    When I did this, I got something like 84 unique official languages spoken in the Northern Hemisphere ("above") and 29 unique official languages spoken in the Southern Hemisphere ("below").


    1. There was some ambiguity as to how "overseas regions," "territories," and other "sub-national" entities should be handled; I eliminated them and attributed the spoken language to the "home" country only. I don't think this had a large effect on the outcome.

    2. The Wikipedia section, once extracted, required a small amount of "cleanup" (e.g., adding missing colon (":") characters so I could easily differentiate between languages and countries).
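The counting step described in this comment can be sketched as follows. The commenter used Perl, so this Python version is a stand-in rather than their script. It assumes the cleaned-up section has one `Language: Country, Country` entry per line. The function name, the hemisphere table, and the three sample lines are illustrative.

```python
def count_unique_languages(lines, hemisphere_of):
    """Count each language at most once per hemisphere."""
    seen = {"N": set(), "S": set()}
    for line in lines:
        if ":" not in line:
            continue  # not a "Language: countries" entry
        language, countries = line.split(":", 1)
        for country in countries.split(","):
            hemi = hemisphere_of.get(country.strip())
            if hemi:  # countries without an assigned hemisphere are ignored
                seen[hemi].add(language.strip())
    return {hemi: len(langs) for hemi, langs in seen.items()}


# Hand-assigned hemispheres, as in step 3 (sample subset only).
hemisphere_of = {
    "France": "N",
    "Madagascar": "S",
    "Portugal": "N",
    "Brazil": "S",
    "Germany": "N",
}
lines = [
    "French: France, Madagascar",
    "Portuguese: Portugal, Brazil",
    "German: Germany",
]
result = count_unique_languages(lines, hemisphere_of)
print(result)
```

A language spoken on both sides, such as French here, adds one to each counter, which matches the "only count each language once per hemisphere" rule.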

  17. I have to take issue with the terms of this question. It's anything but clear, actually. Many countries officially "recognize" a multitude of languages spoken by their citizens, but those languages are not their "official language". If you're really "just interested in whether or not countries below the Equator have more languages than those above the Equator" then you've muddied up the question quite a bit.

    I live in Mexico, for example, which "officially recognizes" fifty-some languages spoken by its people, although its "official language" is Spanish. Ditto Brazil, the Philippines, Nigeria, etc. It's certainly not true that all Brazilians speak Portuguese, so counting only one language for all of Brazil is a completely nonsensical way to actually answer this question.

    Should we be counting official languages of countries, or should we be counting languages officially recognized by the countries in which they're spoken?

  18. Based on the way the question was originally worded, I'm not convinced that the solution approach was sufficient. For example, since English is an official language for at least one country both north and south of the equator, it should not influence the count. I didn't see such a "cancellation" step outlined in the approach.
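The "cancellation" idea amounts to a set difference: drop every language that appears on both sides and compare what is left. A minimal sketch, with made-up sample sets and a hypothetical function name:

```python
def split_by_exclusivity(north, south):
    """Return (only_north, only_south, shared) sets of language names."""
    shared = north & south
    return north - shared, south - shared, shared


# Illustrative sample sets, not real tallies.
north = {"English", "French", "German"}
south = {"English", "French", "Malagasy"}

only_north, only_south, shared = split_by_exclusivity(north, south)
print(sorted(only_north), sorted(only_south), sorted(shared))
```

Because a shared language adds one to both sides, cancelling it changes the two totals by the same amount. The difference between hemispheres stays the same, but the ratio between them can change a lot.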