Thursday, September 13, 2018

Answer: Can you be a Fermi estimator?

Being a good estimator... 

... requires two kinds of knowledge:  (a) facts about the world, and (b) basic math estimation skills.  Those skills let you work from the basic facts, combining the data and doing quick math estimates to get to some numbers that give you the information you seek.  

How tall was Enrico Fermi, the physicist?  How would you estimate?  Assume he doesn't have extraordinarily long legs.  

Fermi Estimation needs both kinds of knowledge--along with a bit of practice in figuring out how to go from point A to point B.  

Let's practice with a true Fermi estimation!  

1.  Can you estimate how tall Enrico Fermi was?  
In this example, I give you a piece of data: Fermi's head is 9.4 in (23.9 cm) tall.  

How would you estimate his height just from this information?  Ask yourself, What other information do I need to know?  

If you're an artist, you probably learned that when you sketch a person, their body proportions are often expressed in terms of head-heights.  Example:  An ordinary person is about 7.5 heads high--a tall, slender, elegant person is about 8 heads high. 

If you don't know that bit of data, you could look the relationship up quickly:  

     [ body proportion in heads ] 

Later edit (Sep 14, 2018):  In the original post I messed up the calculation with a typo... I wrote that 24 * 7 was 164, which is clearly wrong.  Once I had that typo, all kinds of errors followed.  Bottom line, check your math. Didn't your sixth grade teacher say that? 

So now you know Fermi's head is 23.9 cm tall.  To Fermi estimate, let's round that up to 24 cm.  You can probably work out that 24 * 7 = 168 cm.  If you then add in the half-head height (12 cm), you get 168 + 12 = 180 cm.  Make sense?  
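The head-count arithmetic above can be sketched in a few lines of Python.  This is only an illustration: the function name `height_from_head` is made up for this example, and the default of 7.5 heads is the artist's rule of thumb mentioned earlier, not a precise anthropometric formula.

```python
def height_from_head(head_cm, heads_tall=7.5):
    """Estimate body height from head height using the 7.5-heads rule of thumb."""
    return head_cm * heads_tall

# Round 23.9 cm up to 24 cm to keep the mental math easy.
head_cm = round(23.9)
estimate_cm = height_from_head(head_cm)
print(estimate_cm)  # 180.0
```

Swapping in `heads_tall=8` (the "tall, slender, elegant" proportion) gives 192 cm, which shows how sensitive the estimate is to that one assumption.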

In this case, I did the mental math in metric because that value (23.9) could be easily rounded up to 24 without causing many estimation problems, whereas the English measurement (9.4 inches) was kind of a pain--if you round up OR down, you lose a lot of accuracy.  

Plus, I know how to convert from cm to inches pretty easily.  

I know (off the top of my head, OTTOMH) that 10 cm ~ 4 inches.  

(Here, the ~ means "close to" in value.)  

SO... to convert 180 cm into inches, I would divide 180 by 10 and multiply that by 4.  My mental math was: 180/10 gives me 18--I then multiply 18 by 4 to get our Fermi estimate of 72 inches, or 6 feet.  

When I read that, it SOUNDS really complicated... but it's not.  Here are the steps laid out.  

1.  I know 10 cm ~ 4 inches.  So... I just have to divide by 10, then multiply the result by 4 to do that conversion. 
2.  180 cm /10 = 18
3.  18 * 4 = 72  inches...  

At that point, most English-unit-using people can convert that into feet/inches without any trouble: 6 feet tall. (Yes, I know that's a weird skill; so be it.  We grew up with it. Point is, it's not that hard.)  

So yes, you have to know a basic conversion fact (10 cm ~ 4 in), and know how to break up a complex mental multiplication into parts.
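To see how much the 10 cm ~ 4 in shortcut costs in accuracy, here is a small sketch that compares it with the exact conversion (1 inch = 2.54 cm).  The function names are just illustrative choices for this example.

```python
CM_PER_INCH = 2.54  # exact definition of the inch

def cm_to_inches_rough(cm):
    """Mental-math shortcut: 10 cm ~ 4 inches, so divide by 10, then multiply by 4."""
    return cm / 10 * 4

def cm_to_inches_exact(cm):
    """Exact conversion, for checking the shortcut."""
    return cm / CM_PER_INCH

rough = cm_to_inches_rough(180)
exact = cm_to_inches_exact(180)
feet, inches = divmod(round(rough), 12)  # split whole inches into feet and inches

print(rough)            # 72.0
print(feet, inches)     # 6 0
print(round(exact, 1))  # 70.9
```

The shortcut overshoots by about an inch here (roughly 1.6%), which is well inside the tolerance you'd expect from a Fermi estimate.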

(If you want to learn more about how to do this kind of mental math for estimation, there are many online videos that will walk you through the process.  One that's pretty nice is from the Khan Academy:  Estimating values.)  

2.  Can you estimate (without looking up the answer!) how many people in the United States are over 80 years old?  (For extra credit, how many people worldwide are over 80 years old?)   
This is an interesting question:  How would you go about estimating this? 

Again, you need to know a few things to start. 

1.  The population of the US.  (I know this:  It's around 325M people.)
2.  The population age distribution.  

Once upon a time, I remember seeing a chart like this.  They're sometimes called "age pyramids" or "age distribution" charts.  When I look at this chart in my mind's eye, the top couple of slices of that chart constituted around 3% of the whole.  

To check this, I also remembered that the average life expectancy is around 78 (which is lower than in some countries, higher than in some).  So 3% off the top sounds like a reasonable guess.  

So... 3% of 325 is easy.  Divide by 100 (3.25) and multiply that by 3.  That's 9.75 million people who are older than 80 in the US.  
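Here is the same percentage trick as a tiny sketch.  The helper name `percent_of` is hypothetical, and both inputs are the rough from-memory figures used above, not looked-up values.

```python
def percent_of(total, percent):
    """Mirror the mental math: divide by 100 first, then multiply by the percentage."""
    return total / 100 * percent

us_population_millions = 325   # rough figure from memory
share_over_80_percent = 3      # rough share read off the remembered age pyramid

over_80_millions = percent_of(us_population_millions, share_over_80_percent)
print(over_80_millions)  # 9.75
```

Working in millions keeps the numbers small enough to do in your head, which is the whole point of the exercise.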

When I look up the actual numbers to check myself, I see that there are 328.3M people in the US, and this is the age distribution: 

And, surprisingly, my estimate of 3% is pretty close to the actual value.  

Even more surprisingly, I also happen to know that the US is slightly above average for ages > 80 years throughout the world.  Since I know that the population of the Earth is 7.5B, 3% of 7.5B is 225M people, which is a lot of elder wisdom.  
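Since the US share is described as a bit above the world average, it can be useful to bracket the worldwide number rather than report a single value.  In this sketch the 3% figure is the one used above; the 2% figure is purely a hypothetical lower share chosen for illustration, not a looked-up statistic.

```python
def over_80_millions(population_millions, share_percent):
    """Divide by 100, then multiply by the percentage, as in the mental math."""
    return population_millions / 100 * share_percent

world_population_millions = 7500  # 7.5B, the figure quoted above

high = over_80_millions(world_population_millions, 3)  # the 3% share used for the US
low = over_80_millions(world_population_millions, 2)   # hypothetical lower share, illustration only
print(low, high)  # 150.0 225.0
```

Either way the answer lands in the low hundreds of millions, and that order of magnitude is what a Fermi estimate is after.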

3.  To do Fermi Estimates you actually need to know a few basic facts (e.g., about how many people live in the US).  This brings up a great meta-question for Fermi Estimation and sensemaking of data that you see presented in the news... What facts do you need to know to be a good Fermi Estimator? (There's no perfect answer for this; just tell us what facts you've used to do your own Fermi Estimates!)  

When I start thinking about how you do estimates in your everyday life, I think about all of the times I do the Fermi estimations on news stories that I read.  Usually, these days, these are stories about income, wealth, distribution, immigration, and science stories. 

To do any kind of Fermi estimation you need to know a few things.  I've been writing down some of what I'd consider core knowledge over the past week, just taking notes when I did my own estimates to see what core knowledge I used.  Here's my list (yours is probably different): 

- population of the US in 2017:  327M 
- population of the world in 2017:  7.5B
- area of California:  ~100M acres
- conversion from inches to cm:  1 inch = 2.54 cm 
- number of stars in our galaxy:  300B stars

And so on... 

The reality is that nobody can take the time to spend 5 minutes doing online research for everything.  But it's pretty simple to do Fermi estimates to see if what's being said actually makes sense.  

It's a really good skill to have.  I highly recommend you practice this whenever you see something that sounds a little off.  It just might be.  But with your Fermi estimator skills, you can see through the mistakes.  

Here's a short video by my friend Jevin West talking about Fermi estimation in his class at the University of Washington.  It's well worth watching--his examples are great!    (Video link.)  

Search Lessons 

A few lessons spring out at me... 

1.  You still have to know basic facts about the world.  Dates, places, quantities, names, sizes, durations...  For instance, how important are plastic straws as a source of plastic pollution?  (Can you estimate what fraction of the total plastic trash they are?)  Once you know how to do Fermi estimates, your reading and understanding of news stories and current events changes.  Beyond just knowing facts--e.g., the population of Japan is around 127M--once you know things like this, you know what kinds of information can be derived from them. Example:  While 3% is a good estimate of the fraction of people world-wide who are older than 80, you might also know that Japan is an outlier in the age pyramid--they have many more older residents than most countries, so you should Ferminate a higher fraction for folks > 80 years of age.  

2.  Realize that you CAN do estimates that are pretty good, and use them to do basic fact-checking of a story.  If it doesn't add up, you have more research to do.  This is more of an attitude about research than anything--but it's important.  Many people don't know how to do estimates, and so they don't know they can guesstimate what they're reading or hearing about. And that makes them more susceptible to incorrect data and news stories.   

3.  Of course, online search is handy to get to the basic facts and formulas.  This is really true for breaking stories where there simply isn't any good information yet.  Even if you can't determine a particular value for something you've read, there's a good chance that you can estimate it by looking up other information and combining the data together into a fuller picture.  

Keep estimating.  It's a great personal skill to have! 

Search on! 

Thursday, September 6, 2018

New: Data Set Search Mode

It's often hard to find good data sets. 

But it just got a lot easier.  

Earlier this week Google announced a new Data Search mode that lets you do a Google search just for data sets.

My colleague Natasha Noy wrote in the Google Blog that 

"There are many thousands of data repositories on the web, providing access to millions of datasets; and local and national governments around the world publish their data as well. To enable easy access to this data, we launched Dataset Search, so that scientists, data journalists, data geeks, or anyone else can find the data required for their work and their stories, or simply to satisfy their intellectual curiosity."

For people looking for online data, this is a godsend.

To use it, visit: 

and do a search.  Here's one of the first things I tried to do... (Naturally, I checked local data that I probably would recognize..)  

Notice that Dataset Search, like regular Google Search, uses Autocomplete.  This is a wonderful behavior that will let you search a data space very quickly.  (Caution:  It doesn't seem to reflect ALL of the possible completions, so use this feature carefully.)  

And, naturally, when you get a dataset, read the metadata carefully.  (We discussed this a while ago, Feb 15, 2014--"Read metadata carefully.")  

The next search I tried was for Stanford's dataset of global warming information; I've always wanted a copy of it so I could do my own analysis.  

I did the obvious search, and found not only three different providers of the datasets, but other, related datasets as well.  

On the left hand side you'll see a scrolling list of related datasets (that is, other data sets that match the query, but like regular Google results, are not ranked quite as high as the first hit).  

You can use Control-F to do a text search within the page, but notice that the scrolling list of related datasets might go on-and-on-and-on... You can't trust Control-F to search everything in that list.  (It's a "scroll on demand" list; Control-F only searches what's "visible" and not the entire contents of the list.)  

Note that you can ALSO use site: to restrict the sites that are searched for data.  

A search tool like this one is only as good as the metadata that data publishers create.  Maybe some enterprising SRS folk will publish a data set or two. (I certainly will try!)  

If you publish data and don't see it in the results, visit our instructions on our developers site which also includes a link to ask questions and provide feedback.  Learn all about how to publish your own datasets here at the dataset publishing guidelines page.  

We'll have some future Challenges that will use Dataset search, I'm sure.  

Search on (for data)! 

Wednesday, September 5, 2018

SearchResearch Challenge (9/5/18): Can you be a Fermi estimator?

Are you a good estimator? 

How tall was Enrico Fermi, the physicist?  How would you estimate?  Assume he doesn't have extraordinarily long legs.  

An important skill in a lot of skilled reading (or fact checking, or just being a skilled SearchResearcher) is being able to do quick estimates of values by just doing a bit of thinking about them.  

This estimation technique is often called Fermi Estimation, after the famous physicist who was known for his ability to make good approximate calculations with little or no actual data.  These are sometimes called "back-of-the-envelope calculations," but great Fermi Estimators don't actually do this without any data--they need to start somewhere... but they know a few key facts, and then work forward from what they know towards an estimate. Fermi estimation problems typically involve making reasonable guesses about quantities, their variances, the upper and lower bounds, and how to combine guesses together to lead to a useful insight.  

This came up for me recently when the shopper in front of me at the grocery store insisted that she wanted 1.00 pounds of ground hamburger.  When the butcher dropped a lump of ground beef onto the scale, she insisted on NOT paying for 1.05 pounds; she really wanted 1.00 pounds.  

The butcher rolled his eyes and removed a tiny bit of meat to get it exactly to 1.00 pounds.  

I immediately thought of this as a Fermi Estimation question:  About how big a lump of meat is 0.05 pounds?  Would it be the size of your hand?  Would it be the size of your little finger?  

If you assume that ground beef is about as dense as water, then once we know how many ounces are in 0.05 pounds, we can estimate the volume of meat that represents.  

Here's the Fermi Estimation I did in my head:  

In the US, 1 pound is 16 ounces.  In particular, 1 pound of meat (which weighs about the same as water) is 16 ounces.  First I converted that number into a ratio I could work with.  I realized that 0.05 of 1 pound is half of 0.1 of a pound... and that's half of 1.6 ounces, or 0.8 ounces.  Luckily, 1 fluid ounce of water weighs about 1 ounce. So..  0.8 ounces is just under one fluid ounce of water.  1 fluid ounce is 2 tablespoons of water, so 8/10ths of 2 tablespoons is 1.6 tablespoons, or close to 1.5 tablespoons of water... around the size of my little finger!  
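The chain of conversions above (pounds to ounces to fluid ounces to tablespoons) can be written out as a short sketch.  The function name is made up for this example, and the step from weight to volume rests on the same assumption as the text: ground beef is treated as if it had the density of water.

```python
OUNCES_PER_POUND = 16
TABLESPOONS_PER_FLUID_OUNCE = 2

def extra_meat_tablespoons(extra_pounds):
    """Convert a small weight of meat to tablespoons, assuming water-like density."""
    ounces = extra_pounds * OUNCES_PER_POUND
    fluid_ounces = ounces  # assumption: 1 oz by weight ~ 1 fl oz by volume
    return fluid_ounces * TABLESPOONS_PER_FLUID_OUNCE

tablespoons = extra_meat_tablespoons(0.05)
print(round(tablespoons, 1))  # 1.6
```

So the disputed 0.05 pounds comes to roughly a tablespoon and a half of meat.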

(Of course, doing this in grams is SO much easier. Imagine if the shopper wanted exactly 500 grams, and not 530 grams of ground meat.  About how big is 30 grams of meat?   1 gram of water is 1 milliliter, so we'd be looking at 30 ml of meat..) 
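For comparison, here is the metric version as a sketch.  The only fact added beyond the text is the size of a tablespoon (a US tablespoon is about 14.8 ml, rounded to 15 here); the function name is hypothetical, and meat is again assumed to be about as dense as water.

```python
ML_PER_GRAM_OF_WATER = 1   # 1 gram of water occupies 1 milliliter
ML_PER_TABLESPOON = 15     # a US tablespoon is about 14.8 ml; 15 is close enough here

def extra_meat_ml(actual_grams, wanted_grams):
    """Volume of the surplus meat in ml, assuming meat is about as dense as water."""
    return (actual_grams - wanted_grams) * ML_PER_GRAM_OF_WATER

surplus_ml = extra_meat_ml(530, 500)
print(surplus_ml)                      # 30
print(surplus_ml / ML_PER_TABLESPOON)  # 2.0
```

There is only one conversion step here instead of three, which is why the metric version is so much easier to do in your head.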

As Enrico Fermi might say:  non tanto! (not much!)  

This leads us to our Challenges for this week--the first is a true Fermi estimation!  

1.  Can you estimate how tall Enrico Fermi was?  

2.  Can you estimate (without looking up the answer!) how many people in the United States are over 80 years old?  (For extra credit, how many people worldwide are over 80 years old?)   

3.  To do Fermi Estimates you actually need to know a few basic facts (e.g., about how many people live in the US).  This brings up a great meta-question for Fermi Estimation and sensemaking of data that you see presented in the news... What facts do you need to know to be a good Fermi Estimator?  (There's no perfect answer for this; just tell us what facts you've used to do your own Fermi Estimates!)  

What I want you to think about is how you do estimates in your everyday life.  When you read an article that claims something that seems excessive, what kind of thinking should you be doing in order to sanity-check the assertion? 

Have you seen any recent examples of assertions that fall apart under a Fermi Estimation?  What do you have to know in order to Fermi-check the claim? 

Let us know in the comments!  

Search (and estimate!) on!