Wednesday, October 1, 2014

Search Challenge (10/1/14): YOU make up the challenge!

As you can tell, I'm still catching up.  I've got a queue of about 4 posts I still need to make to get back up to real-time, so I'm going to start by posting an unusual Search Challenge for today.  

The Challenge today is for YOU to create a Challenge of your own!  It's a Reverse-Challenge day.  

Let me scope out the Challenge.  

First, I'd like to tell you something about how to search in Google Books that you probably don't know.  

Then, after you've got that, you get to explore a bit and then post a Challenge of your own into the comments. 

On Friday, I'll talk about what happened, and what we found out collectively.  Ready?  

1.  How to Search Google Books Search by Subject Heading 

One of the nicest (and least known) of all the features in Google Books is "Search by Subject Heading."  That is, you can use a subject heading string (found in either the Library of Congress (LC) Subject Headings list, OR the BISAC subject headings) to limit your search to books tagged with that heading. 

For example, if you go to Google Books and do the query: 

The quoted phrase "World War, 1939-1945" is the official subject heading name of that war.  

The big advantage of doing a search like this is that you can use pretty common terms (say, "armor") and get search results back in the context of that subject heading.  If you just go to Google Books and search for [ armor ], you'll get back a lot of results, few of which have anything to do with World War II.  But by contrast, this is what you get back from this search--targeted, relevant, focused results:  

You could do the same thing and use the BISAC subject heading:

See the difference? The results are more-or-less the same, but both subject catalogs have slightly different ways of representing the world.  (More on this in a later post.) 

How do you get to the subject headings listing?  Well, telling you to search would be too obvious, so let me give you the links and save you one search: 

     Library of Congress Subject Headings search interface

     BISAC Subject Headings browse interface 

They're very different interfaces.  Here they are side-by-side: 

As you can see, the LC subject headings page has a search interface, while the BISAC (which is run by the BISG) has more of a click-to-browse style.  Both are useful; which one you pick depends on whether you need any prompting as you search for a subject heading.  

Once you've found your subject heading entry, you just copy it out (see the text highlighted in blue below?  that's what you copy), and then paste into the quotes after the subject: keyword as shown above. 

Then, the query [ armor ] makes a LOT more sense when the scope of books being searched is limited to those tagged with the LC subject heading "World War, 1939-1945"  (which is a phrase for WW2 that you'd never guess).  
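If you like to script your searches, the same query can be put together in a few lines of Python.  This is only a sketch: the helper names are mine, and the search address (with tbm=bks selecting book results) is an assumption about how the Books search page is reached, not something taken from Google's documentation.  

```python
from urllib.parse import urlencode

def subject_query(subject_heading, terms):
    """Return the query text, e.g. subject:"World War, 1939-1945" armor."""
    return 'subject:"{}" {}'.format(subject_heading, terms)

def books_search_url(subject_heading, terms):
    # Assumption: book search is reachable at this address and reads the
    # query from "q" when "tbm=bks" is set.
    params = {"tbm": "bks", "q": subject_query(subject_heading, terms)}
    return "https://www.google.com/search?" + urlencode(params)

query = subject_query("World War, 1939-1945", "armor")
url = books_search_url("World War, 1939-1945", "armor")
print(query)
print(url)
```

The quotes around the heading matter: they keep the comma and the date range together as one subject string, and urlencode takes care of escaping them for the URL.  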

With me so far?  

Now for today's Challenge:  

2.  Can you create a Search Challenge that highlights the use of the  subject: operator in Google Books? 

That is, you get to be me! 

Ideally, you'll write a Challenge that requires the use of a search in Google Books that needs the subject: + subject-heading method to be solvable.  

Once you've written your Challenge, post it to the comments below, and let's see if we can figure out what really interesting kinds of subject-headings (and their uses) we can discover. 

Sample Challenge:  "For today's Search Challenge, can you find a book about insects that feed on other insects and can cause sterility?" 
Sample Answer:  Using LC, I searched for "parasite insect" and learned that there is a subject heading called "Insects--Parasites" (the double dashes are important!).  Searching Google Books with 
     [ subject:"Insects--Parasites" sterility ] 
led me to a few books, one of which is Parasites in Social Insects by Paul Schmid-Hempel, 1998. (Check out the cover for an astonishing photo.) In this book the author notes that some wasp species in Europe are parasitized by a variety of nematode that causes "parasitic castration" leading to infertility. (And be sure to search in the book for the phrase "brain worm," and learn something amazing about certain ant parasites that cause huge changes in ant behavior.)  

You, of course, should pursue your own interests and discover what wonderful things you can find in Google Books by using this new search method.  

Be sure to tell us how you solved your Challenge, and tell us of any interesting discoveries you made along the way.  

Search (and create Challenges) on!  

Friday, September 26, 2014

Answer: Should I be worried about this fish?

In our previous episode... 

WHILE diving in the Somosomo Strait on September 8th of this year, I found this fish down at 10m, busily picking up chunks of coral and moving them from place to place.  

The question for this week was "Should I be worried about this fish?" 

You SearchResearchers did an excellent job of answering the questions.  

1.  What IS this fish, and should I worry about it being aggressive? 
2.  If so, WHEN should I worry?  Every day?  Or just sometimes?
3.  Should I have been worried on the day I took the photo? 

1.  How can we identify a random fish like this?  

As we've discussed before, you could spend a lot of time looking at photos of fish. But there are literally a lot of fish in the sea, and it could take a while.  A much better approach is to use some kind of key to identify the category of fish, and then zoom in to photos once you know a bit more.  (This is really the best, most general method of identifying any kind of plant or animal. Use a key.)  

My first query was to figure out where the "Somosomo Strait" is--that's not hard--it's a strait in the Fiji archipelago that's known for great scuba diving.  (Makes sense. Of course that's where I'd go!)  

So now I want to find a good fish ID key.  My query was: 

     [ fish identification key ] 

which led me to a site that has a really extensive index AND a great key system.  They're pretty serious about their fish.  

Here's a piece of their home page.  I immediately noticed the "Quick Identification" link and clicked on that.  

Once there, you'll see a set of options.  Each is a category of fish--click on it, and you'll go to the subcategory, etc etc, until you reach the fish family you're interested in examining.  Here's the top of their visual key: 

In this case, I'm going to click on the fish that looks most like the one in the image.  So here I click on "Ray-finned fishes."   That takes me to the next choice point in the key:

Now at this point, I MIGHT click on "Puffers and Filefishes" (the shape of the mystery fish looks pretty much like the fish on the far left of that category), or I might scroll down and click on "Dories" farther down the page.  But "Dories" are, as the note down there says, "most are deep sea."  So I'll click on "Puffers and Filefishes" and see what's there. 

At that page, I see something that looks a LOT like the mystery fish.  See the "Balistidae (Triggerfishes)"?  I suspect that's our kind of fish.  An Image search for: 

     [ Fiji triggerfish ] 

quickly brings up a bunch of triggerfish, including one that's a perfect match.  Clicking on that then tells me it's a Titan triggerfish (Balistoides viridescens).  A quick Image search on the Latin name [ Balistoides viridescens ] gives me yet another confirmation: 

So we've identified our fish.  
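If it helps to see why a key beats flipping through photos, here's a toy version in Python.  It's a sketch only: the nested dictionary holds just the few categories mentioned in this post (a real key has many more), and walk_key is a made-up helper name.  

```python
# Toy visual key: each choice opens a narrower set of choices.
# Only the categories named in this post are included.
FISH_KEY = {
    "Ray-finned fishes": {
        "Puffers and Filefishes": {
            "Balistidae (Triggerfishes)": {},
        },
        "Dories": {},
    },
}

def walk_key(key, choices):
    """Follow a sequence of choices down the key and return the path taken."""
    node = key
    path = []
    for choice in choices:
        if choice not in node:
            raise KeyError("no such choice at this step: " + choice)
        path.append(choice)
        node = node[choice]
    return path

path = walk_key(FISH_KEY, ["Ray-finned fishes",
                           "Puffers and Filefishes",
                           "Balistidae (Triggerfishes)"])
print(" > ".join(path))
```

Three choices get you to the family level, and only then do you start comparing photos.  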

Now, is it aggressive?  Let's do this search to see if we can find anything out about the behavior of the Titan triggerfish.  

     [ behavior Balistoides viridescens ] 

Note that I didn't search for "aggressive" here--that's just ASKING for confirmation.  Instead, I searched for "behavior" because maybe it's perfectly passive 99% of the time, and rarely aggressive.  If I searched for "aggressive," I'd be sure to find every single page talking about its aggressive tendencies.  That wouldn't be a fair research question.  

When I did this I was slightly surprised to learn that (despite my trying to be fair), the Titan triggerfish HAS been observed being fairly aggressive to other fish (and humans) that enter its territory.   

The Wikipedia Titan triggerfish article says that "...The titan triggerfish is usually wary of divers and snorkelers, but during the reproduction season the female guards its nest, which is placed in a flat sandy area, vigorously against any intruders. The territory around the nest is roughly cone-shaped and divers who accidentally enter it may be attacked. Divers should swim horizontally away from the nest rather than upwards which would only take them further into the territory. Although bites are not venomous, the strong teeth can inflict serious injury that may require medical attention..."  

Zounds!  As you can see from the photos, any fish with teeth like that (and that bites coral!) would have an impressive bite.  

Just to double check this, I used Google Scholar on that same query and found several articles documenting aggressive behavior of the Titans.  Interestingly, several of the articles (e.g. "Lek-like spawning, parental care and mating periodicity of the triggerfish Pseudobalistes flavimarginatus" [1]) point out that the males set up a mating ground (a "lek") where they establish, and defend, territories to which the females come and deposit their eggs. Both parents care for the eggs, although the female is "confined to the nest by the male."  Mating was semi-lunar, several days before the new and full moons on days when high tide occurred near sunset.  (Note that this paper is about a related fish, Pseudobalistes, but at the end of the paper, the author says that this is true for the Titans as well.)  
Okay. So the question NOW is "was Sept 8 near a new or full moon when a high tide was near sunset?"  

Phases of the moon are easy to figure out:  

     [ phase of the moon calendar ] 

And yes, Sept 8 WAS a full moon according to the calendar I found.  

What about the tides?  My query was: 

     [ high tide Fiji September 8 2014 ] 

(I gave the date because I wanted the historical record, not this week's tides.)  

The second and fourth columns are the high tide times.  Holy cow!  5:46PM was the high tide AND sunset was just 14 minutes later at 6:00PM FJT!  

So YES... I should be careful!  
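The "high tide near sunset" test is simple clock arithmetic, and here's a small Python sketch of it.  The two times are the ones above; the 60-minute window for what counts as "near" is my own arbitrary choice, since the paper (as summarized here) doesn't give a number.  

```python
from datetime import datetime

# Times for Sept 8, 2014, in local Fiji time (FJT), as read off the tables.
high_tide = datetime(2014, 9, 8, 17, 46)
sunset = datetime(2014, 9, 8, 18, 0)

def minutes_apart(a, b):
    """Absolute difference between two times, in minutes."""
    return abs((b - a).total_seconds()) / 60

def is_near(a, b, window_minutes=60):
    # Assumption: "near" means within an hour; pick your own window.
    return minutes_apart(a, b) <= window_minutes

gap = minutes_apart(high_tide, sunset)
print(gap, is_near(high_tide, sunset))
```

With these inputs the gap works out to 14 minutes, comfortably inside any reasonable window.  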

Search Lessons:  

1.  Know the geography.  In this case, fish look very much alike, but can be different worldwide.  Knowing that Somosomo Strait is in Fiji really helps. 

2. Use an identification key.  There are keys for fish, plants, animals, insects, fungi, flowers, etc etc.  Know that a good key is almost always the best approach for identifying something.  

3.  Do not bias the results by including "leading terms" in your query.  In this case, the Titan triggerfish really IS aggressive, but don't search for trouble to begin with.  Let the data guide you to that interpretation--don't overlimit your search results to only those with evidence that confirms your already existing biases.  

Note:  Rosemary made a great observation about using Search-By-Image for this Challenge.  It's such an interesting finding that I'll write a separate post about that.  

[1] Gladstone, William. "Lek-like spawning, parental care and mating periodicity of the triggerfish Pseudobalistes flavimarginatus (Balistidae)." Environmental Biology of Fishes 39.3 (1994): 249-257.

Wednesday, September 24, 2014

Search Challenge (9/24/14): Should I be worried about this fish?

WHILE diving in the Somosomo Strait on September 8th of this year, I found this fish down at 10m, busily picking up chunks of coral and moving them from place to place.  

It's a pretty big fish, around 70 cm / 27 inches tip to tail (and the chunks of coral it was moving were the size of my dive buddy's fist).  The teeth on this thing are also impressive, and seeing what it could do to coral makes me think that I'd prefer to not tangle with this fish. 

And that's today's Search Challenge: 

1.  What IS this fish, and should I worry about it being aggressive? 
2.  If so, WHEN should I worry?  Every day?  Or just sometimes?
3.  Should I have been worried on the day I took the photo? 

Even though it sounds crazy-hard, this isn't that hard of a problem, but it requires linking together a few different resources.  Can you figure it out?  (Ideally, we should find authoritative resources to answer this.  Can you find them?)  

As always, be sure to tell us what you did to answer the Challenge, and how you figured it out.  

Search on! 

P.S.  I'll get back to the Twain place-names tomorrow.  It's been an overly busy week, unfortunately.  

Friday, September 19, 2014

Answer (delayed)...

Sorry folks... this really hasn't been my week at all.  After coming back from my dive trip, I'm still working through the tail-end of my cold/flu thing, while simultaneously trying to get a bunch of unexpected things done at work.  Usually I can just get up earlier in the morning and get my SRS writing done, but this week I just couldn't quite pull it off.  

Luckily, the weekend is coming, and I'll catch up on this long conversation on Monday.  More maps, more analysis, more information.  

Have a great weekend!  See you then. 

-- Dan 

Wednesday, September 17, 2014

Answer (Part 1) to: Can you find the places Twain mentions in "Around the Equator"?

I have to start off by saying that this really is a complicated and difficult challenge.  But the SRSers rose to the challenge.

Answering this is slightly complicated as well, so I'm going to write this up in two (or three) parts.  

Here's installment #1, which is really a story of how to keep digging in, learning things along the way, and finally coming up with something that works.  

Entity identification in arbitrary text.    

When I sat down to do this Challenge I had an advantage--I already knew about the idea of "entity identification" (aka "named-entity recognition").  The idea is that your computer can scan a text (say, "Around the Equator") and automatically identify named entities--the names of cities, rivers, states, countries, mountain ranges, villages, etc.    

Just knowing that this kind of thing exists is a huge help.  All I figured I'd need to do is to find such a service and then use it to pull out all of the entities from the text. 

My plan at this point was just to filter them by kind, merge duplicates, clean the data a bit, and I'd be done.  

But things are never quite this easy.  

My first query was for: 

     [ geo name text entity extraction ] 

which leads to a number of online services that will run an entity extractor over the text.  

The one I tried first, Alchemy, looks like this: 

You can see that I downloaded the fulltext from Gutenberg onto my personal web server and handed that link to Alchemy.  

I thought that this would be it--that I'd be done in just a few moments.  But no.  Turns out that you can't just hand Alchemy a giant blob of text (like the entire book), but you have to do it in 50K chunks.  

That is, I would have to split up the entire book (Twain-full-text-Equator-book.txt) into a bunch of smaller files, and run those one at a time.  

Since the entire book is about 1.1MB, that means I'd have to create 22 separate files, each with at most 49,999 bytes.  

I happen to know that Unix has a command called split that will do that.  I used the split command to break it up into 22 files and I moved those all back out to my server.  
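For anyone without a Unix shell handy, here's a rough Python equivalent of that byte-count split.  It's a sketch under assumptions: the "Twain-part-" prefix matches the file names used later in this post, the two-letter suffixes (aa, ab, ...) imitate what split produces by default, and the stand-in data is just filler of about the right size, not the book.  

```python
import math
import string
from itertools import product

CHUNK_SIZE = 49_999  # stay just under the 50K limit

def chunk_count(total_bytes, chunk_size=CHUNK_SIZE):
    """How many chunks are needed to hold total_bytes."""
    return math.ceil(total_bytes / chunk_size)

def suffixes():
    """Yield aa, ab, ..., zz in order, like split's default suffixes."""
    for first, second in product(string.ascii_lowercase, repeat=2):
        yield first + second

def split_bytes(data, prefix="Twain-part-", chunk_size=CHUNK_SIZE):
    """Return {file name: chunk} for consecutive slices of data."""
    names = suffixes()
    chunks = {}
    for start in range(0, len(data), chunk_size):
        chunks[prefix + next(names)] = data[start:start + chunk_size]
    return chunks

# Stand-in for the book: filler bytes, roughly the size of the real text.
fake_book = b"x" * 1_090_000
parts = split_bytes(fake_book)
print(len(parts), list(parts)[0], list(parts)[-1])
```

Like a byte-count split, this cuts wherever the byte count lands, so a place name that straddles a chunk boundary can be chopped in half.  That's one more small source of missed entities.  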

At this point my natural inclination would be to write a program to call the Alchemy API.  The program would basically be something like: 

for each file in Twain-Docs: 
     entities =  Alchemy-Api-Extract-Entities( file ) 
     append entities to end of entitiesListFile 

Which would give me a big file with all of the entities in it.  But I didn't want this to turn into a programming problem, so I looked for a Spreadsheet solution.  
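For the curious, that loop might look like this in Python.  It's a sketch, not the Alchemy client: the extractor is passed in as a function, and the fake_extract stand-in below just returns canned (type, text) pairs so the example runs without any API key.  

```python
def collect_entities(file_names, extract_entities):
    """Run the extractor over each file and gather everything in one list."""
    all_entities = []
    for name in file_names:
        all_entities.extend(extract_entities(name))
    return all_entities

def fake_extract(file_name):
    # Hypothetical stand-in for a real entity-extraction call.
    canned = {
        "Twain-part-aa": [("City", "Paris"), ("Country", "America")],
        "Twain-part-ab": [("City", "Paris")],
    }
    return canned.get(file_name, [])

entities = collect_entities(["Twain-part-aa", "Twain-part-ab"], fake_extract)
print(entities)
```

Note that duplicates survive this step (Paris shows up twice), which is why the merging and cleaning comes later.  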

Turns out that Google Spreadsheets has a function that lets you do exactly this.  You can write this into your spreadsheet cell:  

     =ImportXML (url, xpath)  

where url is the URL of the AlchemyAPI and xpath is an expression that says what you're looking for from the result.  

Basically, the url looks like this:  (I learned all this by reading the Alchemy documentation.)

Let's decrypt this a bit.... 

The first part:

tells Alchemy that I want it to pull out all of the "RankedNamedEntities" in the text file that follows. 

The second part:  apikey=XXXXXXXXXX

tells Alchemy what my secret APIKey is.  (Note that XXXXXXXXXX is not my API key.  You have to fill out the form on Alchemy to get your own.  It's free, but it's how they track how many queries you've done.)  

The third part:  &url=

is the name of the file (neatly less than 50K bytes long) that I want it to analyze.  

Now, I make a spreadsheet with 22 of these =ImportXML(longURL, xpath) calls. 

Here's my spreadsheet (but note that I've hidden my APIKey here).  

You can see the "Alchemy base url:" which is the basic part of the call to Alchemy. 

The "Composed URL" is the thing we hand to ImportXML.  That is, it's basically the: 

     AlchemyBase + analysisAction + APIkey + baseTextFile
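Here's the same composition as a Python sketch, in case the spreadsheet version is hard to picture.  Everything concrete in it is a placeholder: the base address, the exact action name (I only know it involves "RankedNamedEntities"), the API key, and the address of the text file are all assumptions standing in for the real values.  

```python
from urllib.parse import urlencode

# Placeholders, not the real service address or a real key.
ALCHEMY_BASE = "https://alchemy.example/calls/url/"
ANALYSIS_ACTION = "URLGetRankedNamedEntities"  # assumed action name
API_KEY = "XXXXXXXXXX"

def compose_url(base, action, api_key, text_file_url):
    """AlchemyBase + analysisAction + APIkey + baseTextFile, as one URL."""
    query = urlencode({"apikey": api_key, "url": text_file_url})
    return base + action + "?" + query

def importxml_formula(url, xpath):
    """The text you would type into the spreadsheet cell."""
    return '=ImportXML("{}", "{}")'.format(url, xpath)

composed = compose_url(ALCHEMY_BASE, ANALYSIS_ACTION, API_KEY,
                       "https://example.com/Twain-part-aa")
print(composed)
print(importxml_formula(composed, "//entity"))
```

The one detail worth noticing is that the address of the text file gets escaped when it's tucked inside the longer URL; urlencode handles that here.  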

Remember that the spreadsheet function ImportXML takes two arguments--the first is the URL to call Alchemy (which has the link to the file built into it) and the second is an XPath expression. 

What's XPath?  I did the obvious search to find out.... 

     [ xpath tutorial ] 

and found a nice little intro to XPath.  Turns out that it's a kind of language for reaching into XML data and pulling out the parts that you want.  (It took me about 15 minutes to read up about XPath, and then figure out that all *I* wanted was to pull out the entities from the XML that's being imported.)  In short, all I needed was the XPath expression:  "//entity" as the second argument.  
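To see what "//entity" actually selects, here's a small Python sketch using the standard library's XML parser.  The sample XML is made up: I'm assuming each entity has a type and a text child, which is roughly what the spreadsheet columns suggest, but the real response has more fields.  

```python
import xml.etree.ElementTree as ET

# Made-up sample, shaped like an entity-extraction response.
SAMPLE_XML = """<results>
  <entities>
    <entity><type>City</type><text>Paris</text></entity>
    <entity><type>Country</type><text>America</text></entity>
  </entities>
</results>"""

root = ET.fromstring(SAMPLE_XML)

# ElementTree wants a relative path on an element, so "//entity"
# is written ".//entity": every <entity> at any depth below the root.
entities = root.findall(".//entity")
rows = [(e.findtext("type"), e.findtext("text")) for e in entities]
print(rows)
```

The double slash is the important part: it means "anywhere in the document," so you don't need to know how deeply the entities are nested.  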

Then, for each of the 22 files I split up from the original text, I created a separate sheet, where cell A1 gets the magic ImportXML function.  In this case, A1 on sheet A has the ImportXML function that looks like this: 

   = ImportXML ("
            Twain-part-aa", "//entity")  

Here's what the sheets look like after the ImportXML function runs.  This is the Alchemy analysis of Twain-part-aa (that is, the first 50K bytes of the book):  

Looks pretty good, eh? 

I did this same thing 22 times, one analysis for each of the 22 sections of the book (Twain-part-aa through Twain-part-av).  

Then I copy/pasted all of the results into a single (new) tab of the spreadsheet.  I used paste-special>values so I could then do whatever I wanted with them.  

That new page of the spreadsheet looks like this.  

Remember that Alchemy is searching for MANY different kinds of entities (as you can see: HealthCondition, Person, Organization...) 

What we want is just the geographic entities.  This means I can now use the spreadsheet Filter operation.  (Click on cell A1, then click on Data>Filter.  It will popup a menu with all of the values you can filter on.) 

Here you can see that I've already deselected "Crime"-- so all of the "Crime" entities will be filtered out of the list.  

Once I've filtered the list, I'm nearly done.  I can selectively filter for only the geographic entities I care about (City, Country, GeographicFeature, StateOrCountry...).  And my spreadsheet now looks like this: 

This list now has 567 placenames in it, many of which are duplicates.  To create a new list of only the unique names, I'll use the =Unique (range) function to create another tab in my spreadsheet with the unique names.

This gives me a sheet that looks like this: 

Now we have 283 unique entities. 
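The filter-then-unique steps are easy to mimic in Python if you'd rather not click through menus.  A minimal sketch: the type names are copied from the list above, the sample rows are invented, and unique_places is a made-up helper name.  

```python
# Entity types to keep, as listed above.
GEO_TYPES = {"City", "Country", "GeographicFeature", "StateOrCountry"}

def unique_places(rows, keep_types=GEO_TYPES):
    """Keep only geographic rows, drop exact duplicates, sort by name."""
    seen = set()
    for entity_type, text in rows:
        if entity_type in keep_types:
            seen.add(text)
    return sorted(seen)

# Invented sample rows in (type, text) form.
sample_rows = [
    ("City", "Paris"),
    ("Person", "Mark Twain"),
    ("Country", "America"),
    ("HealthCondition", "carbuncle"),
    ("City", "Paris"),
]
places = unique_places(sample_rows)
print(places)
```

Like the spreadsheet's Unique function, this only merges exact matches.  Variant spellings of the same place would still count as two entries, so some hand cleaning remains.  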

This column (which I sorted into alphabetic order) looks pretty good, although there are a few oddities in it.  ("Ballarat Fly" is an express train to the Australian town of Ballarat. And "Bunder Rao Ram Chunder Clam Chowder" isn't a place name, it's just a funny expression that Alchemy Analytics thinks is a place. "Ornithorhynchus" isn't a place, it's the Latin name for a platypus...)  

So we still have some data cleaning to do.

But this is the point at which we need to do some spot checking to see how accurate the process has been.  As is clear, it has included a few extra "place names" that aren't quite right.  Each of these is called a "false positive."  By my count, the false positive rate is around 3% (that is, out of the 283, I found 8 clear mistakes).  

And that makes me wonder, how many "false negatives" are there?  That is, how many place names does Alchemy miss?

There's no good way to do this other than by sampling.  So I chose a section out of the middle of the text (Twain-part-ak, if you're curious) and manually checked for place names. 

I found about a 5% false negative rate as well... (including cities that should have been straightforward, like "Goa").  So this approach could be off by as much as 9 or 10%.  
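The arithmetic behind those percentages is just a ratio, sketched here in Python.  The 8 and 283 are the counts from above; the counts in the second example are hypothetical, chosen only to show what a 5% miss rate in a hand-checked sample would look like.  

```python
def share(part, whole):
    """Fraction of 'whole' made up by 'part'."""
    return part / whole

# False positives: 8 clear mistakes among the 283 unique entities.
false_positive_share = share(8, 283)
print(round(false_positive_share * 100, 1))

# False negatives are estimated from a hand-checked sample.
# Hypothetical counts: 2 missed names out of 40 found by hand.
false_negative_share = share(2, 40)
print(round(false_negative_share * 100, 1))
```

Keep in mind that the two numbers are measured against different totals (the extracted list versus the names actually present in the sample), so adding them together only gives a rough sense of the overall error.  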

Still, this isn't bad for a first approximation.  But there's more work to be done. 

In tomorrow's installment, I'll talk about some of the other approaches people used in the Groups discussion.  There are always tradeoffs to make in these kinds of situations, and I'll talk about some of those tomorrow as well.  Creating a map with all this data?  That's Friday's discussion.  See you then. 

Part 2... tomorrow! 

Search on! 

Tuesday, September 16, 2014

I'm teaching a class on Google Books next week (Sept 23rd, 2014)

Want to be Google Books wizard? If you're in Mountain View on 9/23/14, you can take my (free) class at the Plex.  

Register by clicking on this link.  

It starts at 6PM, runs till 7:30, with dinner (free!) to follow.  

You should already be a Books user (but I suspect that most of you reading this are)... 

Feel free to pass around to folks you know in the Mountain View / Palo Alto / Santa Clara / Menlo Park area that might have an interest in this. 

See you then. 

- Dan 

I'm back from vacation... and trying to catch up with your work!

Hi folks.  

I'm now back at home, reading through all of the comments and ideas, all of the back-and-forth everyone's been posting since I left.  Two quick comments spring to mind... 

1.  You guys worked really hard on this!  I see lots of evidence of people putting out ideas, other people testing them, and then other people doing some work, and generally building atop each other's investigations.  This is superb, and exceeds my wildest expectations.  Thanks.  

2.  This is a really hard problem.  I started working on it yesterday, and it's taken me about 4 hours thus far.  (Including dead ends.)  But I see the end, and it will come in at around 5 hours to complete.  (Not counting the writeup.)  It won't be perfect--there will be places mentioned in the text that will be missed--but we should be able to get pretty good accuracy.  Details tomorrow. 

To slightly complicate things, I had a great time on my vacation.  Turns out the resort did have Wifi, but it was a bit spotty; trying to do any real work would have been crazy-making.  

The good news is that the South Pacific was fantastic.  

And the bad news is that the moment I got home I was hammered with a bad case of the flu, so I'm barely functioning.  My solution won't be as clean and beautiful as I would have liked, but it'll be there. 

More tomorrow.  I'll post a few comments in the group today, but the answer will be on Wednesday.  (With no new challenge this week.  You've worked hard enough.  Take a week off yourself!)  

Dan enjoying the surface interval between dives.

Wednesday, September 3, 2014

Search challenge (9/3/14): Can you find the places Twain mentions in "Around the Equator"?

As I mentioned in my last post, I'm about to head out for a few days of SCUBA diving in an exotic, tropical (and undisclosed) location.  Who knows?  I might want to use some of the things I pick up there as future Search Challenges! 

This week's Challenge is one that I've wanted to do for a while, but never quite had the time (or nerve) to post it as a Challenge.  

It's fairly tricky, and will require some new skills on the part of Search Researchers.  But I'm confident that you can do this.

Here's the Search Challenge for today: 

Background:  I remember reading Mark Twain's Following the Equator as a schoolboy and completely enjoying the story.  I was also amazed at all of the places he visited.  I know he made it to Hawai'i and Australia, but he also seemed to visit much of the world... and in 1895.  By ship.  Suppose I want to do his trip over again.  Where all would I have to go?  
Challenge 1:  Can you figure out all of the place names he mentions in the book?  The link above is to the Gutenberg Project's plain-text version of his book.  Can you figure out some way to determine ALL of the place names he mentions? 

Example: The first two paragraphs of the book are... 

"The starting point of this lecturing-trip around the world was Paris, where we had been living a year or two.
We sailed for America, and there made certain preparations.  This took but little time.  Two members of my family elected to go with me.  Also a carbuncle.  The dictionary says a carbuncle is a kind of jewel.  Humor is out of place in a dictionary." 

In these paragraphs he mentions "Paris" and "America."  Those should be the first two entries in your list of placenames.  

Now, can you figure out ALL of the OTHER places he mentions in the course of the text?  

(And yes, I know he mentions a lot of places he doesn't actually visit; that's okay, for our list let's include every place he writes about and not worry about whether or not he actually visited there.)  

Obviously, you don't want to do this by hand.  So the question really is, can you find a way to solve this problem using SearchResearch methods? 

Challenge 2:  In case anyone finishes this early... Can you then create a set of Placemarks on Google Earth to show all of the places mentioned in your list of placenames?  Ideally, you should give us a link to your KML file with all of the places Twain mentions in the book.  

This is probably the most sophisticated Challenge I've issued--which is why I'll write up my answer in about 2 weeks.  (Note that I haven't yet solved this myself; but I'm confident that I can.)  

As mentioned, I'll be out-of-town for the next 10 days, so we won't have a Challenge next week (Sept 10).  Instead, I'll write up my solution on Wednesday, Sept 17th.  

I'm also going to be off-the-grid (mostly), so I won't be able to approve your posts to the blog after Thursday.  (Well.. probably.  I will try to check in; but I'm not sure about Wifi coverage where I'm going.)  

So I set up a Google Group for everyone to discuss this Challenge.  For this problem, we can have our discussion in SRS Discusses Around The Equator.  (Click on that link to join the group.)  This way, I won't need to manually approve every comment to the blog (which is what I do now).  

As I said in the Welcome message for the group, this is a no-holds-barred Search Challenge.  If you want to work together, be my guest. You can set up Hangouts to meet and chat about possible solutions, you can swap ideas about how to solve it... Whatever works for you.  

It's a two week Challenge.  Are you up for it?  Can Team SearchResearch do it?  

Search on! 

Friday, August 29, 2014

Answer: What are these plants?

This week was obviously far too easy.  

Or, the SearchResearch readers have been developing their research skills!  

I'll assume it was the second. In either case, very nice work.  Some people knew the answers off the top of their heads, which goes to show the value of a great social network--you can quickly tap into the collective knowledge base (and superior recognition skills) that your extended personal network has.   This isn't to be undervalued!  As Howard Rheingold illustrates in his new book, Net Smart, there is a value and a quality of participation that links together bloggers, netizens, tweeters, and other online community participants.  This set of people and networks form an online collaborative enterprise that can contribute new knowledge to the world in new ways.  And best of all, it forms a personal knowledge network that you can tap into.  

But we'll talk about that in another post.  Today, let's figure out how to search for the answers to these challenges.  

This week I showed the following three images and asked the obvious question--what are these plants?  Here's what I did to answer each question:  

1.  I found this under a redwood tree in a lawn at one of the Google buildings.  I visited here every day for a week, and took this series of pictures over a couple of days.  It's shady here, but as you can see, it's just the lawn under the canopy of the redwood.  What ARE these things? What's the genus and species name?  

A few people reported success with doing Search-by-Image (and that's a great approach).  But I did a simple series of searches: 

     [ mushroom dissolving ] 

Why that query?  Because this transformation (from left to right in the images) happened over a short period of time (about 2 days).  This was easily the most striking thing about this mushroom.  Sure, mushrooms often fall apart quickly, but the way the edge of the mushroom just... "dissolved"... was remarkable.  So I chose "dissolving" as one of my key search terms.  And sure enough, the first hit was to the Mushroom Appreciation site where I learned this is the Coprinus comatus, the "Shaggy Mane" mushroom, aka "Lawyer's Wig" or "ink caps."  

I then did a search for the binomial name (that is, Coprinus comatus) and found lots of corroborating evidence (and more images that match very closely).  As a few other folks did, I discovered (their page on Shaggy Mane) and liked their level of detail in describing how to identify the particular variety.  

As Mushroom Appreciation writes:  "Like a frightened squid or exploding pen, this mushroom releases a black liquid that is laden with spores. As it matures it will deliquesce, meaning it will appear to melt away until only the stem is left."  (That word, deliquesce, was new to me, so I did a [ define deliquesce ] -- a lovely term meaning "to become liquid"!)  

There's also a section on the Mushroom Appreciation site that gives details about how to identify this mushroom (and possibly similar-appearing mushrooms).  

Apparently this mushroom is also edible, although a bit delicate to prepare.  (And you have to move quickly from "just picked" to "just cooked," as they'll deliquesce not long after you pick them.) 

 { As always, don't eat any mushrooms until you've taken a class in mycology and identification!  It's easy to get really sick or die after eating a mis-identified mushroom. } 

2.  Here's another thing I found sticking up out of the soil in my garden.  This is a particularly well-watered section of the garden--you can see the green beans growing in the background.  Just before I took this picture, the brown parts at the tip were covered in flies.  I know why, because it smelled terrible--a bit like rotting meat--perfect fly attractant.  Unfortunately, I only got one good picture.  I took several, but it was in a somewhat difficult to reach place, and this was the only one in good focus. It's about 5 inches long, and seemingly appeared overnight.  What IS this thing?  (And should I be worried about it?)  

For this, the most salient search clues would seem to be: (a) it smells really bad, and (b) it's growing in my garden.  

I'm going to include "garden" in my search term because I mostly see mushrooms in lawns, or in woodlands where the places mushrooms grow are fairly stable over time.  Since garden soil is churned up at least twice a year, the mushrooms that grow in that kind of place would seem to be very different than "ordinary" mushrooms.  

So my first query was: 

     [ stinky mushroom garden ] 

Which gave me the following SERP: 

See that row of images?  This is called "Universal blended Images" (because the algorithm "blends in" image results into the regular search results). 

This kind of thing happens only when there's pretty strong evidence that your search terms are all included with the texts describing these images.  

I was also struck by the appearance of the word "Stinkhorn" on the page several times.  What a strange thing!  

To evaluate this page, I clicked on the row of images to see what was there.  It's a surprising set of mushrooms.  Such shapes and colors!  And all, apparently, stinky.  

When cruising through the images I found a couple that looked very similar to the picture I took.  When I clicked on the first one that seemed very similar, I found myself back on a page titled "Stinkhorns: The Phallaceae and Clathraceae."  

There's that word again:  Stinkhorn.  (And two family names as well, Phallaceae and Clathraceae.)  

I read the MushroomExpert page about Stinkhorns and found an identification key at the bottom of the page.  This makes me feel good about the credibility of the content:  Good botanical guides will have "keys" like this to help you winnow out the various possibilities.  
Here's what their key looks like: 

Start at step one and answer the question.  If it's true, then you know it's Staheliomyces cinctus.  If the answer is "Not as above," then jump to question 2.  Proceed like this, answering questions and following the flowchart.  If the "spore slime occurring on the inner surface..." description fits, then jump to question #12.  
Sure enough, if you run through their key, you'll find it's a Lysurus mokusin, the "Lantern Stinkhorn."   
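
For the programmers in the audience: following an identification key is really just walking a small decision tree.  Here's a minimal Python sketch of that idea.  To be clear, the questions and species labels in it are made up for illustration (they are NOT the actual entries from the MushroomExpert key), and the names TOY_KEY and run_key are hypothetical.  

```python
# A toy identification key. Every question and label here is invented
# for illustration; none of it comes from the real MushroomExpert key.
# Each step maps to: (question, where to go on "yes", where to go on "no").
# A destination is either another step number or a final label (a string).
TOY_KEY = {
    1: ("Is there a net-like skirt under the cap?", "Species A (hypothetical)", 2),
    2: ("Is the spore slime on the inner surface of the arms?", 3, "Species B (hypothetical)"),
    3: ("Is the stem angular rather than round?", "Species C (hypothetical)", "Species D (hypothetical)"),
}


def run_key(key, answers, start=1):
    """Follow the key from `start` until a label is reached.

    `answers` maps a step number to True (yes) or False ("not as above").
    """
    step = start
    while isinstance(step, int):
        _question, if_yes, if_no = key[step]
        step = if_yes if answers[step] else if_no
    return step


# Answer "no" at step 1, then "yes" at steps 2 and 3.
result = run_key(TOY_KEY, {1: False, 2: True, 3: True})
print(result)
```

A real key works the same way, just with many more steps and much more careful wording at each one.  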

Curiously, for something that smells so bad, it is "... considered to be edible when still in the immature "egg" stage, and is thought to be a delicacy in China. When mature, its foul odor would deter most individuals from attempting consumption..."  

No kidding.  

3.  While running through the Stanford Industrial Park (where HP headquarters, Varian, Xerox PARC, and a bunch of Silicon Valley research labs are located) I found the bush below covered in red berries.  Each berry is around 1 inch in diameter, and the bushes themselves are used as hedges.  It's an attractive plant, and I can see why you'd plant long stretches of this between buildings.  Oddly, I've also seen this plant grown as a tree with a trunk, used as a decorative planting along sidewalks.  And if I recall correctly, there's some connection with Madrid.  What kind of bush/tree is this?  And what's the connection with Madrid? What's the genus/species name?  

Most SearchResearchers seemed to have ID-ed this by using "Search-by-Image," and that's a fine way to do it.  (The trick seems to have been to crop the image down.) 

But I have to admit to just searching on a relatively simple description of the most obvious feature: 

     [ strawberry tree ] 

As luck would have it, the first page of results is all about this tree.  I had no idea that it would be THAT easy to identify.  

As everyone seems to have figured out instantly, this is the Arbutus unedo, aka the "Strawberry Tree," that's commonly planted in temperate climates as a reliable hedge or ornamental.  

Interestingly, Arbutus unedo was one of the species described by Carl Linnaeus in Volume One of his landmark 1753 work Species Plantarum, giving Arbutus the name it still bears today.  This was the book that set up the whole binomial naming scheme that we still use today.  (That is, the Genus species names that we give to organisms.)  Given that so many of the names assigned to plants have changed over the past 250 years, it's remarkable that Arbutus unedo still has the same name!   

The Wikipedia entry says:  "The fruit is a red berry, 1–2 cm diameter, with a rough surface. The fruit is edible, though many people find it bland and mealy.  The name 'unedo' is explained by Pliny the Elder as being derived from unum edo 'I eat one,' which may seem an apt response to the flavor."

Fact checking:  For the good of the blog, I ate one of the ripe berries.  (After, of course, checking that I had my identification down correctly.)  And I can report that they are mealy, with a fairly bland flavor.  It was more-or-less a "meh" experience. 

But unlike a real strawberry, the "mealiness" of the unedo meant that I kept picking little bits of the fruit out of my teeth for hours afterwards.  The fruit is really a composite of many tiny bits, so it was a bit like eating a slightly fruity ball of cornmeal.  You could eat many of these and survive, but you probably wouldn't want to do so.  

To make the final connection, I double checked Wikipedia's comment about the unedo fruit appearing on the coat of arms of Madrid.  

     [ Arbutus Madrid ] 

leads to many confirming pages, including a site that specializes in heraldry, which confirms that the bear is eating the fruit of the Arbutus unedo.  

Search lessons 

1. Sometimes the obvious search is exactly right.  I find people often overcomplicate their searches.  Try the obvious search ( [strawberry tree] or [mushroom dissolving] ) and you might well be surprised to see that this is the way many people have written about the topic, meaning that your obvious search will lead to the obvious results.  

2.  Search by image is great (especially if you use the cropping trick).  As a few readers found, cropping the image down to just the important parts works remarkably well.  When cropping, choose the parts that you think other photographers will likely focus on.  

3.  The presence of an identification key marks botanical pages as being serious works.  If you've read many plant or flower identification sites, the ones with a "key to identification" tend to be pretty serious sites.  Yes, the keys can be intimidating, but a key tells me that someone has gone to a LOT of trouble to help us understand how to figure out what kind of plant this is.  You don't just toss off a key in a few minutes--they take a lot of time and effort to create.  Any site that has one (that they've created) is probably a pretty decent reference source.  

Next week... a real challenge--a two-week challenge (as I'm going on vacation for the first 2 weeks of September)!  Get your search skills out, and get ready to research! 

Search on!