Wednesday, September 19, 2018

SearchResearch Challenge (9/19/18): Mysteries from Mozart's time


Salzburg was Mozart's home ... 

... in his pre-Vienna years.  It's where he was born, lived, and launched his career.  

When I was visiting earlier this year, I visited several of the places that Mozart lived and performed.  As you'd expect, as I looked around, I noticed a few things--things that struck me as funny/odd.  Of course, being a curious fellow, I had to look these things up with a bit of online research.  Can you answer these oddities as well?  

Here are three of the things I had to look up.  I was surprised at the answers to each of these.  


1.  As you know, Wolfgang Amadeus Mozart had a famous sister--also a performing prodigy.  Maria Anna Walburga Ignatia Mozart, called Marianne and given the nickname "Nannerl," was his older sister.  But in nearly every picture of her, she's got a remarkable head of hair.  See this picture below: 

Nannerl and Wolfgang Mozart playing two-handed duets at the piano


The question I had was How do you sleep with hair like that?  Or, I suppose, the other way to think about it would be How much time do you spend on hair like that?  Any ideas?  


2.  When I visited the Mozart residence at No. 8 Makartplatz (the so-called "Dance Master's House," aka Tanzmeisterhaus), I was struck by the odd appearance of one of Mozart's pianos--the key colors are reversed!  The accidentals are white, while the naturals are black.  Here's what I mean; Mozart's piano keyboard is on the left, while a current piano keyboard is on the right. 


So... when did the colors change from black naturals to white naturals... and why? 


3.  Speaking of Fermi Estimation and Mozart's piano (we were, weren't we?)... What would you estimate as the total number of pianos in Salzburg during the year when Mozart left Salzburg for Vienna?  How would you do this estimate?  

Austrian pianos over the years

As always (and I hate to sound like an 8th grade math teacher), show your work!  At least tell us how you answered each of the Challenges!  

(And note that you don't need to answer all three of the Challenges.  If you don't have enough time, just do one.  But be sure to tell us HOW you figured out the answer.) 

I'll write up what I found next week.  

Until then, Search On! 




Thursday, September 13, 2018

Answer: Can you be a Fermi estimator?


Being a good estimator... 

... requires two kinds of knowledge:  (a) facts about the world, and (b) basic math estimation skills.  Those skills let you work from the basic facts, combining the data and doing quick math estimates to get to some numbers that give you the information you seek.  



How tall was Enrico Fermi, the physicist?  How would you estimate?  Assume he doesn't have extraordinarily long legs.  

Fermi Estimation needs both kinds of knowledge--along with a bit of practice in figuring out how to go from point A to point B.  

Let's practice with a true Fermi estimation!  


1.  Can you estimate how tall Enrico Fermi was?  
In this example, I give you a piece of data: Fermi's head is 9.4 in (23.9 cm) tall.  

How would you estimate his height just from this information?  Ask yourself, What other information do I need to know?  

If you're an artist, you probably learned that when you sketch a person, their body proportions are often expressed in terms of head-heights.  Example:  An ordinary person is about 7.5 heads high--a tall, slender, elegant person is about 8 heads high. 



If you don't know that bit of data, you could look the relationship up quickly:  

     [ body proportion in heads ] 

Later edit (Sep 14, 2018):  In the original post I messed up the calculation with a typo... I wrote that 24 * 7 was 164, which is clearly wrong.  Once I had that typo, all kinds of errors followed.  Bottom line, check your math. Didn't your sixth grade teacher say that? 

So now you know Fermi's head is 23.9 cm tall.  To Fermi estimate, let's round that up to 24 cm.  You can probably work out that 24 * 7 = 168 cm.  If you then add in the half-head height (12), you can add 168 + 12 and get 180 cm.  Make sense?  

In this case, I did the mental math in metric because that value (23.9) could be easily rounded up to 24 without causing many estimation problems, whereas the English measurement (9.4 inches) was kind of a pain--if you round up OR down, you lose a lot of accuracy.  

Plus, I know how to convert from cm to inches pretty easily.  

I know (off the top of my head, OTTOMH) that 10 cm ~ 4 inches.  

(Here, the ~ means "close to" in value.)  

SO... to convert 180 cm  into inches, I would divide 180 by 10 and multiply that by 4.  My mental math was 180/10 gives me 18 --I then multiply 18  by 4 to get your Fermi estimate of 72  inches, or 6 feet.  

When I read that, it SOUNDS really complicated... but it's not.  Here are the steps laid out in a diagram.  


1.  I know 10 cm ~ 4 inches.  So... I just have to divide by 10, then multiply the result by 4 to do that conversion. 
2.  180 cm /10 = 18
3.  18 * 4 = 72  inches...  

At that point, most English-unit-using people can convert that into feet/inches without any trouble: 6 feet tall. (Yes, I know that's a weird skill; so be it.  We grew up with it. Point is, it's not that hard.)  

So yes, you have to know a basic conversion fact (10 cm ~ 4 in), and know how to break up a complex mental multiplication into parts.
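
The head-height estimate and the cm-to-inches shortcut above can be sketched in a few lines of Python. This is only an illustration of the mental steps, not a precise calculation; the function names are made up for this example, and the 7.5-heads figure is the artist's rule of thumb quoted earlier.

```python
# Fermi estimate of a person's height from head height, following the
# mental-math steps in the text. Function names are illustrative only.

HEADS_PER_BODY = 7.5   # rule of thumb: an ordinary person is ~7.5 heads tall


def estimate_height_cm(head_cm: float) -> float:
    """Round the head height to a whole number, then scale by 7.5 heads."""
    rounded_head = round(head_cm)            # 23.9 -> 24
    return rounded_head * HEADS_PER_BODY     # 24 * 7 + 12 = 180


def cm_to_inches_approx(cm: float) -> float:
    """Use the shortcut 10 cm ~ 4 in: divide by 10, then multiply by 4."""
    return cm / 10 * 4


height_cm = estimate_height_cm(23.9)
height_in = cm_to_inches_approx(height_cm)
feet, inches = divmod(round(height_in), 12)
print(f"{height_cm:.0f} cm ~ {height_in:.0f} in ~ {feet} ft {inches} in")
# prints: 180 cm ~ 72 in ~ 6 ft 0 in
```

The shortcut runs slightly high (an exact conversion at 2.54 cm per inch gives just under 71 inches), which is well within the slop you accept in a Fermi estimate.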

(If you want to learn more about how to do this kind of mental math for estimation, there are many online videos that will walk you through the process.  One that's pretty nice is from the Khan Academy:  Estimating values.)  



2.  Can you estimate (without looking up the answer!) how many people in the United States are over 80 years old?  (For extra credit, how many people worldwide are over 80 years old?)   
This is an interesting question:  How would you go about estimating this? 

Again, you need to know a few things to start. 


1.  The population of the US.  (I know this:  It's around 325M people.) 
2.  The population age distribution.  

Once upon a time, I remember seeing a chart like this.  They're sometimes called "age pyramids" or "age distribution" charts.  When I look at this chart in my mind's eye, the top couple of slices of that chart constituted around 3% of the whole.  

To check this, I also remembered that the average life expectancy is around 78 (which is lower than in some countries, higher than in some).  So 3% off the top sounds like a reasonable guess.  

So... 3% of 325 is easy.  Divide by 100 (3.25) and multiply that by 3.  That's 9.75 million people who are older than 80 in the US.  

When I look up the actual numbers to check myself, I see that there are 328.3M people in the US, and this is the age distribution: 



And, surprisingly, my estimate of 3% is pretty close to the actual value.  

Even more surprisingly, I also happen to know that the US is slightly above average for ages > 80 years throughout the world.  Since I know that the population of the Earth is 7.5B, 3% of 7.5B is 225M people, which is a lot of elder wisdom.  
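
The same "divide by 100, multiply by 3" step works for both the US and the world numbers. Here's a minimal sketch using only the figures quoted above (325M, 7.5B, and the remembered 3% share); the helper name is invented for illustration.

```python
# "Percent of a population" done the mental-math way:
# divide by 100 first, then multiply by the percentage.

def percent_of(total: float, percent: float) -> float:
    return total / 100 * percent


US_POPULATION_MILLIONS = 325        # remembered figure from the text
WORLD_POPULATION_MILLIONS = 7500    # 7.5B expressed in millions
SHARE_OVER_80_PERCENT = 3           # remembered from the age-pyramid chart

us_over_80 = percent_of(US_POPULATION_MILLIONS, SHARE_OVER_80_PERCENT)
world_over_80 = percent_of(WORLD_POPULATION_MILLIONS, SHARE_OVER_80_PERCENT)

print(f"US over 80:    ~{us_over_80:.2f} million")
print(f"World over 80: ~{world_over_80:.0f} million")
```

Keeping the inputs as named constants makes it easy to swap in a different share (say, a higher one for Japan) and see how much the answer moves.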



3.  To do Fermi Estimates you actually need to know a few basic facts (e.g., about how many people live in the US).  This brings up a great meta-question for Fermi Estimation and sensemaking of data that you see presented in the news... What facts do you need to know to be a good Fermi Estimator? (There's no perfect answer for this; just tell us what facts you've used to do your own Fermi Estimates!)  

When I start thinking about how you do estimates in your everyday life, I think about all of the times I do the Fermi estimations on news stories that I read.  Usually, these days, these are stories about income, wealth, distribution, immigration, and science stories. 

To do any kind of Fermi estimation you need to know a few things.  I've been writing down some of what I'd consider core knowledge over the past week, just taking notes when I did my own estimates to see what core knowledge I used.  Here's my list (yours is probably different): 

- population of the US in 2017:  327M 
- population of the world in 2017:  7.5B
- area of California:  ~100M acres
- conversion from inches to cm:  1 inch = 2.54 cm 
- number of stars in our galaxy:  300B stars

And so on... 

The reality is that nobody can take the time to spend 5 minutes doing online research for everything.  But it's pretty simple to do Fermi estimates to see if what's being said actually makes sense.  

It's a really good skill to have.  I highly recommend you practice this whenever you see something that sounds a little off.  It just might be.  But with your Fermi estimator skills, you can see through the mistakes.  

Here's a short video by my friend Jevin West talking about Fermi estimation in his class at the University of Washington.  It's well worth watching--his examples are great!    (Video link.)  



Search Lessons 

A few lessons spring out at me... 

1.  You still have to know basic facts about the world.  Dates, places, quantities, names, sizes, durations...  For instance, how important are plastic straws as a source of plastic pollution?  (Can you estimate what fraction of the total plastic trash they are?)  Once you know how to do Fermi estimates, your reading and understanding of news stories  / current events changes.  Beyond just knowing facts--e.g., the population of Japan is around 127M--once you know things like this, you know what kinds of information can be derived from them. Example:  While 3% is a good estimate of the number of people world-wide who are older than 80, you might also know that Japan is an outlier in the age pyramid--they have many more older residents than most countries, so you should Ferminate a higher fraction for folks > 80 years of age.  

2.  Realize that you CAN do estimates that are pretty good, and use them to do basic fact-checking of a story.  If it doesn't add up, you have more research to do.  This is more of an attitude about research than anything--but it's important.  Many people don't know how to do estimates, and so they don't know they can guesstimate what they're reading or hearing about. And that makes them more susceptible to incorrect data and news stories.   

3.  Of course, online search is handy to get to the basic facts and formulas.  This is really true for breaking stories where there simply isn't any good information yet.  Even if you can't determine a particular value for something you've read, there's a good chance that you can estimate it by looking up other information and combining the data together into a fuller picture.  


Keep estimating.  It's a great personal skill to have! 

Search on! 


Thursday, September 6, 2018

New: Data Set Search Mode


It's often hard to find good data sets. 

But it just got a lot easier.  
  

Earlier this week Google announced a new Data Search mode that lets you do a Google search just for data sets.

My colleague Natasha Noy wrote in the Google Blog that 


"There are many thousands of data repositories on the web, providing access to millions of datasets; and local and national governments around the world publish their data as well. To enable easy access to this data, we launched Dataset Search, so that scientists, data journalists, data geeks, or anyone else can find the data required for their work and their stories, or simply to satisfy their intellectual curiosity."
For people looking for online data, this is a godsend.

To use it, visit: 

     toolbox.google.com/datasetsearch 

and do a search.  Here's one of the first things I tried to do... (Naturally, I checked local data that I probably would recognize..)  



Notice that Dataset Search, like regular Google Search, uses Autocomplete.  This is a wonderful behavior that will let you search a data space very quickly.  (Caution:  It doesn't seem to reflect ALL of the possible completions, so use this feature carefully.)  

And, naturally, when you get a dataset, read the metadata carefully.  (We discussed this a while ago, Feb 15, 2014--"Read metadata carefully.")  

The next search I tried was for Stanford's dataset of global warming information.  I've always wanted a copy of it so I could do my own analysis.  

I did the obvious search, and found not only three different providers of the datasets, but other, related datasets as well.  



  
On the left hand side you'll see a scrolling list of related datasets (that is, other data sets that match the query, but like regular Google results, are not ranked quite as high as the first hit).  

You can use Control-F to do a text search within the page, but notice that the scrolling list of related datasets might go on-and-on-and-on... You can't trust Control-F to search everything in that list.  (It's a "scroll on demand" list; Control-F only searches what's "visible" and not the entire contents of the list.)  



Note that you can ALSO use site: to restrict the sites that are searched for data.  


A search tool like this one is only as good as the metadata that data publishers create.  Maybe some enterprising SRS folk will publish a data set or two. (I certainly will try!)  

If you publish data and don't see it in the results, visit our instructions on our developers site which also includes a link to ask questions and provide feedback.  Learn all about how to publish your own datasets here at the dataset publishing guidelines page.  

We'll have some future Challenges that will use Dataset search, I'm sure.  


Search on (for data)! 




Wednesday, September 5, 2018

SearchResearch Challenge (9/5/18): Can you be a Fermi estimator?

Are you a good estimator? 


How tall was Enrico Fermi, the physicist?  How would you estimate?  Assume he doesn't have extraordinarily long legs.  
An important skill in a lot of skilled reading (or fact checking, or just being a skilled SearchResearcher) is being able to do quick estimates of values by just doing a bit of thinking about them.  

This estimation technique is often called Fermi Estimation, after the famous physicist who was known for his ability to make good approximate calculations with little or no actual data.  These are sometimes called "back-of-the-envelope calculations," but great Fermi Estimators don't actually do this without any data--they need to start somewhere... but they know a few key facts, and then work forward from what they know towards an estimate. Fermi estimation problems typically involve making reasonable guesses about quantities, their variances, the upper and lower bounds, and how to combine guesses together to lead to a useful insight.  

This came up for me recently when the shopper in front of me at the grocery store insisted that she wanted 1.00 pounds of ground hamburger.  When the butcher dropped a lump of ground beef onto the scale, she insisted on NOT paying for the 1.05 pounds; she really wanted 1.00 pounds.  

The butcher rolled his eyes and removed a tiny bit of meat to get it exactly to 1.00 pounds.  

I immediately thought of this as a Fermi Estimation question:  About how big a lump of meat is 0.05 pounds?  Would it be the size of your hand?  Would it be the size of your little finger?  

If you assume that ground beef has about the same density as water, then once we know how many ounces are in 0.05 pounds, we can estimate the volume of meat that represents.  

Here's the Fermi Estimation I did in my head:  

In the US, 1 pound is 16 ounces.  In particular, 1 pound of meat (which has about the same density as water) is 16 ounces.  First I converted that number into a ratio I could work with.  I realized that 0.05 of 1 pound is half of 0.1 of a pound... and that's half of 1.6 ounces, or 0.8 ounces.  Luckily, 1 fluid ounce of water weighs about 1 ounce. So..  0.8 ounces is just under one fluid ounce of water.  1 fluid ounce is 2 tablespoons of water, so 8/10ths of 2 tablespoons is 1.6 tablespoons, or close to 1.5 tablespoons of water... around the size of my little finger!  

(Of course, doing this in grams is SO much easier. Imagine if the shopper wanted exactly 500 grams, and not 530 grams of ground meat.  About how big is 30 grams of meat?   1 gram of water is 1 milliliter, so we'd be looking at 30 ml of meat..) 
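
For readers who like to see the chain of conversions written down, here's a small sketch of both versions of the meat estimate. It assumes, as the text does, that ground beef is about as dense as water; the function names are invented for this example.

```python
# How big is the "extra" lump of meat? Both versions assume that
# ground beef has roughly the density of water.

OUNCES_PER_POUND = 16
TBSP_PER_FLUID_OUNCE = 2
ML_PER_GRAM_OF_WATER = 1


def extra_meat_tablespoons(extra_pounds: float) -> float:
    """US-units version: pounds -> ounces -> fluid ounces -> tablespoons."""
    ounces = extra_pounds * OUNCES_PER_POUND      # 0.05 lb -> 0.8 oz
    fluid_ounces = ounces                         # 1 oz of water ~ 1 fl oz
    return fluid_ounces * TBSP_PER_FLUID_OUNCE


def extra_meat_ml(extra_grams: float) -> float:
    """Metric version: 1 g of water is 1 ml, so the numbers carry over."""
    return extra_grams * ML_PER_GRAM_OF_WATER


print(f"0.05 lb extra ~ {extra_meat_tablespoons(0.05):.1f} tablespoons")
print(f"30 g extra    ~ {extra_meat_ml(530 - 500):.0f} ml")
```

The metric version is a single multiplication by 1, which is the whole point of the aside about grams.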


As Enrico Fermi might say:  non tanto! (not much!)  


This leads us to our Challenges for this week--the first is a true Fermi estimation!  

1.  Can you estimate how tall Enrico Fermi was?  

2.  Can you estimate (without looking up the answer!) how many people in the United States are over 80 years old?  (For extra credit, how many people worldwide are over 80 years old?)   

3.  To do Fermi Estimates you actually need to know a few basic facts (e.g., about how many people live in the US).  This brings up a great meta-question for Fermi Estimation and sensemaking of data that you see presented in the news... What facts do you need to know to be a good Fermi Estimator?  (There's no perfect answer for this; just tell us what facts you've used to do your own Fermi Estimates!)  

What I want you to think about is how you do estimates in your everyday life.  When you read an article that claims something that seems excessive, what kind of thinking should you be doing in order to sanity-check the assertion? 

Have you seen any recent examples of assertions that fall apart under a Fermi Estimation?  What do you have to know in order to Fermi-check the claim? 

Let us know in the comments!  

Search (and estimate!) on! 






Wednesday, August 29, 2018

Answer: How to find difficult web pages? (Part 2)


What makes a page difficult to find?  (Part 2)

I was impressed by how well (and how quickly!) SRS readers were able to figure these out.  Some of the search paths were lovely and inspired.  Nice work, Readers!  


Here's what I did.  Let me repeat the two Challenges and then tell you what I did to answer them.  


1.  This happens to me more often than I would like:  Images in my blog will sometimes go missing in action.  This happens when a website disappears, leaving my nice link to their image with a gaping hole.  Perhaps you've seen it on other web sites--the hole looks like this: 

A broken image link leaves behind a hole-in-the-page.  I want to find a replacement image.  One that looks the same as this missing image!
How can I find a replacement image for this hole in my blog?  In other words, can you find this missing image?  This hole-in-the-blog comes from the SRS post of December 14, 2011 and shows a particular remote-control glider.  (In fact, it's one that I built back in the late 1990s.)  
The Challenge for skilled SRS-ers is to (a) figure out what that image looked like, and (b) find that image somewhere else on the internet.  Can you? 

My solution:  I tried opening this image by Control-clicking (right-click on Windows) on the image-hole and then "Open Image in New Tab"--like this: 

  
I wanted to get the URL of the image.  (And yes, I could have done "Copy Link Address," but you'll see why I did it this way in a second...)   The URL for this image is: 

www.carlgoldbergproducts.com/airplanes/gpma0960_01_bg.jpg


This is what you might see if you open this link in a new tab: 


This is a classic "page not found" error.  

If you recall from a few weeks ago, I mentioned that it's handy to use the Wayback Machine browser extension.  This is a Chrome (or FF) extension that pops up when you hit a missing page (or file).  So my display really looked like this: 


If you "click here," it takes you to the Wayback Machine, and if you follow the obvious links forward, you'll get to this page: 



Now I see that the image is from an old site about remote-control gliders.  That makes sense, and it's going to be one of those images, but which one?  

I just went back to the Wayback Machine and put in that image URL above (the one in green + bold above).  Here's what I get from the Wayback Machine: 


Great!  It looks like the image was last saved on Feb 8, 2018. But if you click on that, you get another "missing" image.  Truth is, sometimes you have to work your way back along the timeline to find a real version of this image.  I jumped back to Mar 12, 2014 and found this: 
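
If you'd rather jump straight to a particular date than click along the timeline, you can build the archive address yourself. The sketch below assumes the usual Wayback Machine address pattern (web.archive.org/web/ followed by a date stamp and the original URL), and that the archive redirects to the nearest capture it actually holds; the helper name is invented for this example.

```python
from datetime import date

# Assumed pattern: https://web.archive.org/web/<YYYYMMDD>/<original URL>
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, when: date) -> str:
    """Build a Wayback Machine address for a URL near a given date."""
    timestamp = when.strftime("%Y%m%d")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


image_url = "www.carlgoldbergproducts.com/airplanes/gpma0960_01_bg.jpg"

# Work backward along the timeline, newest candidate first.
for snapshot_date in (date(2018, 2, 8), date(2014, 3, 12)):
    print(wayback_url(image_url, snapshot_date))
```

This only builds the addresses; you still have to open each one and check whether the capture holds a real image or another "missing" placeholder.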


But I wasn't quite done yet.  I was wondering if that image had been used somewhere else.  Did this particular glider move from the Carl Goldberg company to some other place?  

To test this out, I did this query, looking for another use of this image name elsewhere on the web: 

     [ inurl:gpma0960_01_bg.jpg ] 

As you know, the inurl: operator searches for any string inside of a URL.  In this case, I was searching for that particular file name.  (Why?  Because I know that people are lazy and usually don't rename images.)  

Unfortunately, that gave me zero results.  

Now what?  

Let's look at the file name in detail.  It's: 

      gpma0960_01_bg.jpg

To me, this looks like a product code ("gpma0960") with a number (01) and a code indicating that it was used in the background (bg).  

What would happen if we just did an inurl: search for the product code name?  I'd expect to find all kinds of things with that code in the URL.  Here's my next search: 

     [ inurl:gpma0960 ] 

And... we hit the mother lode!  Here's the SERP for this query.  See how the product code appears in all of the URLs.  


This inurl: trick is incredibly useful for finding products, especially those that are no longer in production!  
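
The narrowing-then-widening step (whole file name first, then just the product code) is mechanical enough to sketch in code. This assumes the file name follows the code_number_suffix pattern guessed at above, which won't hold for every image; the function name is made up for illustration.

```python
import os


def inurl_queries(filename: str) -> list[str]:
    """Return inurl: queries from most specific to least specific."""
    stem, _extension = os.path.splitext(filename)   # drop ".jpg"
    product_code = stem.split("_")[0]               # text before the first "_"
    return [f"inurl:{filename}", f"inurl:{product_code}"]


for query in inurl_queries("gpma0960_01_bg.jpg"):
    print(f"[ {query} ]")
# prints:
# [ inurl:gpma0960_01_bg.jpg ]
# [ inurl:gpma0960 ]
```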


2.  A while ago I was having dinner at a hole-in-the-wall Turkish restaurant somewhere in Europe and had a fantastic dessert.  It was rich, creamy, simple and wonderful.  I wrote down the name--kaymak--so I could find it again at a place closer to home.  My Challenge was to find a place near me (that is, in Mountain View, California) that sells kaymak.  Can you find a place in Mountain View, CA that sells this fantastic dessert?  
(Note that I do not want clotted cream, nor do I want to buy it through online purchase; I want real kaymak that I can eat today!!)  
For extra credit (and this is the difficult part)--How much does this place in Mountain View sell it for?  


My solution started by searching for: 

     [ kaymak near me ] 

But if you're not in Mountain View, CA (as I am), you could do the equivalent thing with this query: 

     [ kaymak near Mountain View, CA ] 

In this case I included the city and state because there are multiple cities that share our name.  I wanted to be sure to get the right one.  Here's what I see: 


Notice that the first result is a Yelp page that lists places that sell "clotted cream."  That's close, but not quite what I wanted.  I want kaymak!  In this case, I want to turn off the synonyms, so I quote the term to get exactly that (and only that).  Note the difference between these two SERPs.  


This looks great!  

But oddly, when I open the Olympus Caffe & Bakery web site, I can't find the word kaymak on the page.  This is a case where my Control-F skills didn't pan out.  

Now what?  As you can see, it's not on the page!  


I'm confident that kaymak is here, somewhere.  Where?  

I could start clicking on all of the buttons (e.g. "Cakes/Desserts"), but I went with a more hacker approach, a method that's sometimes handy.  

I went ahead and did a View Source.  It's an option that you can get to like this:  


This will show you the raw HTML, which can be scary, but you can then search for kaymak... Here, I've highlighted the line, which happens to include the price:  $4.50 


If you read HTML, you can see it appears under the "Turkish Breakfast" menu item, which would have taken me a long time to find by clicking on all of the options.  
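
View Source plus a text search is easy to imitate in code once you have the raw HTML saved. The sketch below searches a string of markup for a term and returns the surrounding text. The sample markup is invented for illustration (hidden menu sections are one common reason Control-F on the rendered page comes up empty); only the "Turkish Breakfast" heading, the word kaymak, and the $4.50 price come from the page described above.

```python
import re


def find_in_source(html: str, term: str, context: int = 60) -> list[str]:
    """Case-insensitive search of raw HTML; returns each hit with context."""
    snippets = []
    for match in re.finditer(re.escape(term), html, flags=re.IGNORECASE):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(html))
        snippets.append(html[start:end])
    return snippets


# Hypothetical markup: a menu section that is in the source but not displayed.
sample_html = """
<div class="menu-section" style="display:none">
  <h3>Turkish Breakfast</h3>
  <li>Kaymak with honey <span class="price">$4.50</span></li>
</div>
"""

for snippet in find_in_source(sample_html, "kaymak"):
    print(snippet.strip())
```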

Viewing the source of the page is often a useful method when the page is complex and has a lot of 


As I said, I was impressed by some of the answers in the comments this week.  Well done team!  


Search Lessons 


1.  Remember the Internet Archive / Wayback Machine when looking for lost pages or images!  It doesn't cover absolutely everything, but it is an invaluable service to the community. 

2. Using INURL: to find other pages with the same text in the URL is often a great way to track down pages that share content with what-you're-seeking.  Don't underestimate the power of inertia:  Webmasters often prefer to keep the URLs of previously existing images and pages when they move (or copy) content.  As a side-effect of this, you can often find content that would otherwise go missing.  

3.  Developer>View Source  ... it gives you access to the ground truth for many pages.   In this case, I was able to find the kaymak entry very quickly, without all of that annoying clicking around in the menus to figure out which category of thing it was hidden under.  

Search on! 

Wednesday, August 22, 2018

SearchResearch Challenge (8/22/18): How to find difficult to find web pages? (Part 2)


What makes a page difficult to find?  (Part 2)

As we saw last week, sometimes you remember the page, but have difficulty figuring out the exact words for your query.  In one case, I remembered seeing an article on a topic (the American author disputing a Wikipedia article), but the results were filled with Wikipedia results, which in this case, really didn't help.  So we used the site: operator to exclude those results.  

The other example from last week was to find an article about a black racer snake from the New Jersey government educational site.  There, we had to use the right domain name (site:NJ.gov) and search in that part of the NJ web site with site:NJ.GOV.  

This week, I have two other "difficult to find" problems that I hope you can solve.  These are both a bit more tricky than last week's and require a bit more sophisticated search knowledge, so I hope you're up to the Challenge!  


1.  This happens to me more often than I would like:  Images in my blog (THIS blog!) will sometimes go missing in action.  This happens when a website disappears, leaving my nice link to their image with a gaping hole.  Perhaps you've seen it--the hole looks like this: 

A broken image link leaves behind a hole-in-the-page.  I want to find a replacement image–one that looks the same as this missing image!
Arrgh!  This is frustrating, but an inevitable consequence of having companies go out of business.  This causes link-rot and that makes the target of the link (in this case, the image of a remote-control glider) go missing.  It shows a broken image icon instead.     
How can I find a replacement image for this hole in my blog?  In other words, can you find this missing image?  This hole-in-the-blog comes from the SRS post of December 14, 2011 and shows a particular remote-control glider.  (In fact, it's one that I built back in the late 1990s.)  
The Challenge for skilled SRS-ers is to (a) figure out what that image looked like, and (b) find that image somewhere else on the internet.  Can you? 

A related difficult to find web page relies on a different technique... but it's also a toughie.  Can you answer this dessert-related Challenge?  

2.  A while ago I was having dinner at a hole-in-the-wall Turkish restaurant somewhere in Europe and had a fantastic dessert.  It was rich, creamy, simple and wonderful.  I wrote down the name--kaymak--so I could find it again at a place closer to home.  My Challenge was to find a place near me (that is, in Mountain View, California) that sells kaymak.  Can you find a place in Mountain View, CA that sells this fantastic dessert?  
(Note that I do not want clotted cream, nor do I want to buy it through online purchase; I want real kaymak that I can eat today!!)  
For extra credit (and this is the difficult part)--How much does this place in Mountain View sell it for?  



These two Challenges need very different and fairly advanced techniques.  If you can solve both of these, you can rate yourself as a Jedi-level SearchResearcher!  

Please let us know how you solved the Challenges--and be as clear as possible in HOW you did it.  (For these Challenges, you need more than a clever query and the use of site:) 

Search on! 


Thursday, August 16, 2018

Answer: How to find difficult to find web pages? (Part 1)


There are many reasons...   


... why a particular page might be difficult to find.  Sometimes your memory is just plain wrong; sometimes your memory is so generic as to not remember anything that would let you pick the right page out of thousands of similar pages; sometimes the page is really missing (that is, a 404 error--web page missing).  
The Challenges from last week are interesting examples of Difficult-Web-Pages.  Let's talk about how to find these, and why they're tough.  

1.  A while ago I remember reading an article about a famous US author who was having some difficulty editing his own Wikipedia page.  As crazy as it sounds, Wikipedia wanted independent verification of what he was saying.  I found myself wanting to re-read that article so I could refer to it in my writing.  I needed to find it to confirm details.  This was my Challenge:    
Who was the famous US author that was involved in a dispute with Wikipedia over the accuracy of the entry describing his novel?   

I had a very clear memory of this article, but I couldn't remember where I read it.  You might think that the obvious query is something like: 

     [ Wikipedia famous author article challenged ] 

(or something similar).  The problem is that the query isn't specific enough--there are a LOT of web pages with these terms, so the results are pretty scattered--they didn't help me find what I'm looking for.  We have to find a way to change the query:  


A big part of the problem here is that many of the results are FROM Wikipedia.  (Makes sense: Wikipedia is a search term.) In fact, the first 40 results are all from Wikipedia.org.  
What would happen if we excluded the Wikipedia results?  Would that improve our accuracy?
  
As you know, site: lets you search just within that site (e.g.,  [ site:Wikipedia.org ] )  
But how can we exclude a site?  That's easy: Use the minus symbol like this:  
     [ blah blah blah -site:Wikipedia.org ] 

Notice that small MINUS sign (aka a hyphen) in front of the site: operator.  That means to search everywhere on the web, but NOT on this site.  
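
If you build queries in a script (or just want to be sure you typed a plain keyboard hyphen rather than a typographic dash, which won't work as an operator), the exclusion can be assembled like this. The helper names are invented for illustration; the only thing assumed about Google is the familiar /search?q= address form.

```python
from urllib.parse import quote_plus


def exclude_sites(query: str, *sites: str) -> str:
    """Append a -site: term (plain ASCII hyphen) for each site to exclude."""
    return " ".join([query] + [f"-site:{site}" for site in sites])


def google_search_url(query: str) -> str:
    """URL-encode the query into a search address."""
    return "https://www.google.com/search?q=" + quote_plus(query)


query = exclude_sites("Wikipedia famous author article challenged", "wikipedia.org")
print(f"[ {query} ]")
print(google_search_url(query))
```

Passing more than one site (for example, `exclude_sites("black racer snake", "wikipedia.org", "pinterest.com")`) simply adds one -site: term per site.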

When I do this search, my SERP looks like this:  




See that 4th result?  (The one at the bottom of this image.)  This is exactly what I was looking for--a famous US author (Philip Roth) who was in a dispute with Wikipedia over the accuracy of the entry describing one of his novels.  As Roth wrote in his open letter to Wikipedia,  he was told that "...I, Roth, was not a credible source: 'I understand your point that the author is the greatest authority on their own work,' writes the Wikipedia Administrator – 'but we require secondary sources.'"  This dispute went on for a while, and to their credit, Wikipedia repaired the entry, and it stands as an accurate source of information about Roth and his work.  

Other approaches work too.  Reader SpiritualLadder found the answer with the query: 

     [ dispute with author over book Wikipedia entry ] 

And D. Lazar found it with: 

     [ author dispute wikipedia ] 

I tried a few other queries like this (that is, without the -site:) and found it to be pretty hit or miss.  If you managed to guess the right words, you'd find the article.  Using the -site: operator gets you to the result pretty quickly. 

Several readers also found their way to the Wikipedia List of Controversies page (which is pretty interesting reading), and then found the article about Philip Roth on that page.  


A black racer rising up out of the grass.
Thanks & P/C Continis on Flickr.
2. See that image above?  That's a black racer snake.  I happened to see one the other day, and I remembered from previous reading that the state of New Jersey had a few articles about snakes in their state, and I remember one about black racers in particular. 
Can you help my fading memory and find an article about the black racer snake that’s published by the state of New Jersey as part of their educational outreach program? 
This was my first query.  


I'm not proud of it, but I wanted to show you that even practiced searchers also make mistakes.  

What's wrong here? 

What I'm trying to do is to just search on websites in New Jersey.  I know that the code for New Jersey is .nj  so I used that as the target of the site: operator.  

But I got zero results.  Why?  

After I calmed down after the shock of zero results, I realized that no matter what you think of New Jersey, it doesn't have its own top-level-domain name.  (A top-level-domain name is the code at the very end of a URL--e.g., .GOV .MIL  .EDU  .INFO etc.)  

An important lesson for searchers is to look at what's going on and try to debug your process.  What can you learn from this for next time?  In this case, I needed to know what the REAL web name is for the state of New Jersey.  

A quick search for [ official website New Jersey ] tells you that they're part of .GOV -- and their URLs all end in .NJ.GOV!  

Let's redo that query with the correct site specifier (and a better query).  


This looks more like what I'm seeking.  It's on New Jersey's educational web site, and it's about the black racer (Coluber constrictor) snake.  

Now, to find all of the educational content at the NJ.GOV site, I just truncated the URL of the first result.  That is, I went to: https://www.nj.gov/pinelands/infor/educational/   and found a great page full of results...
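
Truncating a URL back to its enclosing directory is a handy habit, and it can be sketched in a few lines. The page name in the example below is a stand-in (the actual first result isn't reproduced here); only the directory address comes from the text above. The function names are invented for illustration.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def parent_directory_url(url: str) -> str:
    """Chop the last path segment off a URL, keeping the trailing slash."""
    parts = urlsplit(url)
    parent = posixpath.dirname(parts.path.rstrip("/"))
    if not parent.endswith("/"):
        parent += "/"
    return urlunsplit((parts.scheme, parts.netloc, parent, "", ""))


def site_query_for(url: str) -> str:
    """Turn a directory URL into a site: restriction (host + path)."""
    parts = urlsplit(url)
    return f"site:{parts.netloc}{parts.path}"


# "some-page.html" is a hypothetical file name used only for illustration.
result_url = "https://www.nj.gov/pinelands/infor/educational/some-page.html"
directory = parent_directory_url(result_url)
print(directory)
print(f"[ black racer {site_query_for(directory)} ]")
```

The second helper shows that site: accepts a directory as well as a bare domain, so the same truncated address can be fed straight back into a search.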





Search Lessons 

In this post, I really wanted to emphasize the way that site: operates.  There are two big lessons here. 

1.  You can use -site: as a way to remove invasive results from your search.  In this case, because we were searching for something about Wikipedia (but not necessarily ON Wikipedia), we used the -site: operator as a way to get rid of the annoying results that were all on Wikipedia.  Use this trick anytime you want to remove an entire site from consideration.  Usually, this happens with super popular sites that tend to dominate the results. 

2.  SITE: can take any site specifier, including subdomains and directories.  In this example we just used the subdomain + top-level-domain  .NJ.GOV  -- but we could also do a site: with a directory as well.  Here's an example showing that the Pinelands part of the official web site has around two thousand pages covering a broad range of topics.  (And you can see that Educational content is part of their much larger mission.)   





As I mentioned, this is Part 1 of a series of "Difficult Web Page" search Challenges.  This one wasn't too difficult--Part 2 will be more challenging.  

During the next week or so I'll be doing an occasional additional post about topics that I think you'll be interested in reading.  See you here soon. 

Search on!