Friday, May 23, 2014

Answer: How much does a country spend on schools?

How much a country spends on its schools isn't the only factor that determines how well the students are taught or how much they learn... but it's a measure of the investment a country chooses to place in its youth.  

Setting aside all of the debates about school policies, how can we answer these two simple questions? 

The Challenge from Wednesday had two parts:  

1.  Can you find the data on which my graph is built?  And, once you find it, can you create a chart showing the investment-per-pupil for Serbia, Singapore, US, Finland (and maybe one or two other countries of your choice)?  
2.  If you've got the time and inclination, can you discover why Singapore manages to spend so little per student, and still have a great school system?  (This is clearly extra credit.)  


The first thing I realized when setting up this question is that it's a little ambiguous.  I wrote it that way to make a point:  Many times research questions ARE ambiguous.  Part of your task as a researcher is to clarify the question itself.  

When you start looking for data about the investment-per-pupil, are we talking about ALL students in a country?  Did I mean K-12?  How about junior colleges?  Universities?  Vocational schools?  

I started by first doing the search:  

     [ expenditure per student ] 

just to see what I'd find.  The results are pretty good:  International results from the World Bank (#2 below) are mixed in with US states results.  

The data is even normalized as a "% of GDP per capita," which isn't bad at all. 

If you go to that site, you can very quickly reproduce the chart I showed in the Challenge. 

Chart 1:
World Bank data "Expenditure per student, primary, % of GDP per capita" 

Here's the query that creates the above chart.  It's pretty straightforward--just use their mapping tool and select the countries you want.  
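
If you'd rather pull the numbers programmatically than click through the mapping tool, here is a minimal sketch of one way to do it. It is not how the chart above was made. It assumes the World Bank's public v2 API, and it assumes that `SE.XPD.PRIM.PC.ZS` is the indicator code behind "Expenditure per student, primary, % of GDP per capita"; check the indicator's metadata page before relying on it. The function names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# ASSUMPTION: indicator code for
# "Expenditure per student, primary (% of GDP per capita)".
PRIMARY_PER_STUDENT = "SE.XPD.PRIM.PC.ZS"


def build_url(countries, indicator, start_year, end_year):
    """Build a World Bank API v2 URL for one indicator and several countries."""
    codes = ";".join(countries)  # ISO country codes, separated by semicolons
    query = urlencode(
        {
            "format": "json",
            "date": f"{start_year}:{end_year}",
            "per_page": 1000,
        },
        safe=":",  # keep the colon in the year range readable
    )
    return f"https://api.worldbank.org/v2/country/{codes}/indicator/{indicator}?{query}"


def parse_series(payload):
    """Turn a [metadata, records] payload into {country: {year: value}}."""
    records = payload[1] if len(payload) > 1 and payload[1] else []
    series = {}
    for rec in records:
        if rec["value"] is None:  # many country-years have no reported value
            continue
        country = rec["country"]["value"]
        series.setdefault(country, {})[int(rec["date"])] = rec["value"]
    return series


def fetch_series(countries, indicator, start_year, end_year):
    """Download and parse one indicator (needs network access)."""
    with urlopen(build_url(countries, indicator, start_year, end_year)) as resp:
        return parse_series(json.load(resp))
```

Calling `fetch_series(["SRB", "SGP", "USA", "FIN"], PRIMARY_PER_STUDENT, 2000, 2012)` would return one year-to-value dictionary per country, ready to hand to whatever charting tool you like. The year range here is just an example, not the range used in the chart.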

Now, if the truth were told, that's NOT what I was hoping you'd do!  

I wanted to have you discover that Google's Public Data Explorer ALSO has this data from the World Bank, and offers a very nice charting suite as well!  

If you visit the Google Public Data Explorer (PDE) site and do the same query, you'll find many similar, but slightly different, data sets.  

And if you go to the second link in the results, you can create a very nice graph showing the same data.  
That chart will look like this: 

Chart 2:  
Google PDE chart of World Bank data on "Public spending on education, total % of GDP"

Now, compare this chart with the one we charted using the World Bank data (Chart 1, above).  

THEY'RE NOT THE SAME.  They're not even close.  What happened?  

It took me a minute to notice that there is actually more than one data set here.  In fact, when I clicked on the second data set in the PDE, I was pulling World Bank data about "public spending on education total, as a % of GDP."  That's great data, but it's not the same as "expenditure per student, primary"!   

Lesson:  Be very, very careful about the metadata that describes the data set you're analyzing.  It's telling you what you need to know; but you HAVE to read it carefully.  
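
To see why those two measures can't be expected to match, it helps to write out what each one divides by. The sketch below follows my reading of the two indicator descriptions (total education spending over GDP, versus spending per student at one level over GDP per capita). The function names are made up, and the numbers are invented for an imaginary country. They are NOT real data for any place in the charts.

```python
def total_pct_of_gdp(total_spending, gdp):
    """Public spending on education, total, as a % of GDP."""
    return 100.0 * total_spending / gdp


def per_student_pct_of_gdp_per_capita(level_spending, students, gdp, population):
    """Spending per student at one level, as a % of GDP per capita."""
    spending_per_student = level_spending / students
    gdp_per_capita = gdp / population
    return 100.0 * spending_per_student / gdp_per_capita


# INVENTED numbers for an imaginary country (arbitrary currency units).
gdp = 1000.0
population = 100
total_spending = 50.0      # all education levels combined
primary_spending = 20.0    # primary level only
primary_students = 10

print(total_pct_of_gdp(total_spending, gdp))  # 5.0
print(per_student_pct_of_gdp_per_capita(
    primary_spending, primary_students, gdp, population))  # 20.0
```

Same imaginary country, same budget, and the two measures come out at 5% and 20%. The second one depends on how many students there are relative to the whole population, which the first one ignores entirely.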

So let's go back to the PDE data set and select the FIRST link in the PDE results page.  

Chart 3:  Expenditure per student - secondary. 

Note that Chart 3 is STILL not the same as Chart 1.  What gives now? 

Lesson:  I wasn't kidding: You still have to be careful.   Really pay attention to the metadata.   Look around the UI for options that let you change the view of the data.  Learn to read the UI to see what's possible!  

Look at the lower left.  There's an option selector that determines "Education levels" to show.  Here I've selected "secondary" (meaning, ages 12 - 18).  If you change that selector to "Primary"  (ages 5 - 11), you'll get a much different chart.  

(And yes, I know the definition of "primary" and "secondary" changes from country to country.  This is roughly what it means.)  
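
If you work with downloaded data instead of the PDE selector, the same mix-up is easy to make in code. Here is a small sketch of a guard against it. It assumes each record carries an `indicator` field with an `id`, which is the shape I'd expect from World Bank API responses, and it assumes the two indicator codes shown; both assumptions should be checked against the real metadata. The helper names are made up.

```python
# ASSUMPTION: indicator codes for the two "Education levels" choices.
LEVEL_CODES = {
    "primary": "SE.XPD.PRIM.PC.ZS",
    "secondary": "SE.XPD.SECO.PC.ZS",
}


def indicator_ids(records):
    """Collect the distinct indicator ids found in a list of records."""
    return {rec["indicator"]["id"] for rec in records}


def check_same_indicator(records_a, records_b):
    """Raise ValueError if the two record lists measure different things."""
    ids_a = indicator_ids(records_a)
    ids_b = indicator_ids(records_b)
    if ids_a != ids_b:
        raise ValueError(
            f"Different indicators: {sorted(ids_a)} vs {sorted(ids_b)}"
        )
    return True
```

Running `check_same_indicator` on two series before charting them together would flag a primary-versus-secondary comparison right away, instead of leaving you to wonder why the two charts disagree.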

Chart 4:  Expenditure per student, PRIMARY grades only 

Now this chart looks like the chart we created at the World Bank site.  

Here's the thing to know:  The World Bank graph really IS the "expenditure per student, primary grades"... it says that.  But you have to read carefully.   

So I went back to the World Bank site to see if I could change the data set to Secondary, and see if that would match our Chart 3 from Google's PDE.  

Here's that chart.  Notice how much it looks like Chart 3 (the PDE version).  

Chart 5:  World Bank data for secondary schools expenditure 

I'd say we've figured it out.  

You can get the data straight from the World Bank itself, or get it from the PDE.  (As you can see by reading the Google PDE metadata, it's actually the same data: Google just scrapes it from the World Bank, with their agreement, and re-publishes it along with the visualization tool.)  

Now we can turn our attention to the second question:  How does Singapore get by spending so little?  (Relatively speaking.) 

One thing to notice when looking at the graphs is that Singapore is ALWAYS near the bottom of the spend-per-student charts.  Yet we know that they have superb schools.  

Interestingly, Serbia (with a total population of around 7M, and a student population of 1M) spends a LOT of money per student, but only in primary grades.  That seems to be because their population demographic is so young. There are lots of school-age kids in Serbia... 

To answer this fully probably requires writing a Master's thesis.  But to get a quick answer, I really liked Rosemary's approach.  She did a simple query: 

     [ Singapore low education GDP ] 

And discovered a bunch of articles on the topic.  Reading around just through these articles is fascinating.  

Interestingly, the #1 hit is blogger Roy Ngerng's post on "How is Singapore's Education System Unequal?"    He presents a lot of charts and data to make the case that Singapore is actually underperforming, suffers from inherent inequities,  and should be doing a better job.  

But in the middle of the data, it becomes clear that the Singaporean school system is doing a good job of teaching students (although with larger class sizes, and then NOT progressing all of their students to secondary school)!  It's a complex situation, but one of the side effects of this would be a reduced spend on students (because there are fewer of them).  It's also clear that this is a topic of real concern for Singaporeans, who worry that their schools aren't doing a better job. 

It helps quite a bit to be a small island nation, with all of the students in a fairly small, fairly homogeneous region, although with a population that speaks many languages as their native tongue.  (By contrast, I don't know how much of US student expenditure is for transportation alone, but I suspect it's substantial.  The cost of moving books, meals, and students all around is going to be high.  The US also deals with a diversity of languages, although perhaps not with the intensity that Singapore has.)  

As I said, this is a large, complex topic, but Rosemary's approach is a good one:  Start with the simple and obvious query--read through the top ten articles or so; learn from that, then refine.  

Using a similar approach, Debbie G and Anne found an excellent overview article from the National Center for Educational Benchmarking which gives a one-page summary of how Singapore got to be where it is today (educationally speaking).   

Search lessons:  As we've learned, it's important to be very careful about the metadata of the data you're charting.  Be sure you've got the right sources AND understand what's in the data, and how it's defined (e.g., the definitions of "primary" and "secondary" above).  

Remember that there are many places to get data of this form.  PDE just re-surfaces the World Bank data, but the UN also has data of this kind.  (But again, be careful of what you're comparing.)  

Finally, when dealing with a large complex topic ("How does Singapore not spend so much money on students?") be aware that you might not find a single answer, and that this is a question that you'll need to study for a while.  Searching for overview articles, and skimming the top-ten hits for a well-crafted query will get you a long way towards understanding the issues, even if you don't come out with a single, short answer that's suitable for putting onto a multiple-choice quiz.  

Search on! 


  1. May I add that had you searched via Fusion Tables "search public data base" you would have gotten a slightly different results page that gives you a sampling of what is in the tables, as well as the ability to open the tables in Google Sheets or Fusion Tables directly from the snippet. Now in this instance I can see that it would be simpler to use the tools already available as you showed us. I didn't think of that. But should you want to create your own tables or charts the export function works really well.

    Your solution for this challenge makes so much sense, now.

    1. That's absolutely right. I'll do a Fusion Table challenge in the future. In this case, the PDE tools are simplest. But Fusion Tables have some great capabilities. Stay tuned!

    2. Great answer as always Dr. Russell. I had no idea about Google Public Data Explorer. Thanks for telling us about it. A new tool for us.

  2. Anne and I were thinking of using Fusion Tables but didn't get that far. Rosemary, looks like the librarians had this challenge this week! Love these challenges to practice using some of these tools!

  3. I did not think I was cheating when I got the answer immediately. Honest, I just was thinking how very clever I was for once.

    Don't believe I have ever heard of Google Public Data Explorer previously but now another trick to master.

    Thanks for this.


    1. Oh... I didn't think you were cheating either! It's great that you found a fast way to get to the data for the Challenge.

      PDE is an underused resource. It's definitely worth a few minutes of exploration.