Wednesday, July 11, 2018

SearchResearch Challenge (7/11/18): How do you plot out data by region? The case of regional boundaries.

It's time for a Challenging Challenge!  

As you know, every so often I like to mix up the SRS Challenge with something that's a bit more in-depth.  (And if this is overwhelming, just take the week off--I'll be back next week with an easier one.) 

The Setup:  If you read the news these days you'll see all kinds of claims about various kinds of data.  In an earlier SRS post we talked about immigration rates, and found that the data is a bit complicated, but you can figure it out.  

One of the things you'll see in the news are charts like this one: 

.. by COUNTY (not MSA or CSA).

This is the "Median household income in 2012 by county."  This chart is from Wikimedia and shows the median income by county in the US.  Of course, counties are sometimes just arbitrary boundaries.  They may or may not make sense.  (For instance, Los Angeles County has around 10M souls living inside the county, while only 600K people live in Providence County, Rhode Island.  That's a factor of about 16X difference in size.) 

There are many ways to draw regional boundaries that make some kind of sense. For instance, gerrymandering is the practice of drawing political boundaries to give a particular party more (or less) voting power.  

There are commercial regional boundaries (such as the "Designated Market Areas," aka DMAs, defined by the polling / survey company Nielsen).  These regions correspond to media markets.  
More often, though, people who are looking at data use "Metropolitan Statistical Areas" (MSAs).  An MSA is “a geographical region with a relatively high population density at its core and close economic ties throughout the area.”
For instance, the San Francisco-Oakland-Hayward Metropolitan Statistical Area (with a population of 4.5 million) and the larger San Jose-San Francisco-Oakland Combined Statistical Area (8.4 million) are both near where I live in Silicon Valley.  
A slightly different version of the MSA is the "Combined Statistical Area" (CSA), which is composed of "adjacent metropolitan (MSA) and micropolitan (μSA) regions in the United States and Puerto Rico that can demonstrate economic or social linkage."  (This is primarily defined by commuting patterns.)  
A map of the combined metropolitan and micropolitan statistical areas of the US looks like this: 

I'm telling you all of this background because it leads to today's Challenge.  
1.  Can you make a map of the median household income for each of the MSAs in the United States?  (Or equivalent statistical areas, if you're from another country.)  
That is, you'll need to: 
A. Find a source of recent data that's organized by MSAs.  2017 would be best, but you should look for the most recent data. 
B. Find a visualization application that can ingest both the median income data and the shape of the MSA.   
C. Figure out a way to create a visualization of the US MSAs that color-codes the income.  It should look a bit like the above example, except with the income level determining the color of the MSA region.   
This is a bit of a Challenge, but it doesn't require programming.  (If you want to program, be my guest, but this doesn't really need it.)  
And, if you really don't like MSAs as the boundaries of map regions... find a different one, and tell us why you like yours better.  
Once you figure out how to do this, you'll have the means to do your own analysis, looking at data in your own way.  
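
For the programmers: here is a minimal Python sketch of the join-and-color idea behind steps A through C. It's a sketch under stated assumptions, not a finished solution. The region codes, names, and income figures are made-up placeholders (not real data), the palette is arbitrary, and a real map would still need actual MSA boundary shapes and a tool that can draw them.

```python
from statistics import quantiles

# Hypothetical placeholder rows: (region code, region name, median household income in USD).
# These values are invented for illustration only; they are NOT real MSA data.
income_rows = [
    ("00001", "Example MSA A", 42000),
    ("00002", "Example MSA B", 55000),
    ("00003", "Example MSA C", 61000),
    ("00004", "Example MSA D", 78000),
    ("00005", "Example MSA E", 96000),
]

# Hypothetical stand-in for a boundary file: the codes of regions that have a shape.
shape_codes = {"00001", "00002", "00003", "00004", "00005", "00006"}

# Arbitrary light-to-dark palette; one color per income class.
PALETTE = ["#edf8e9", "#bae4b3", "#74c476", "#238b45"]


def color_classes(rows, palette):
    """Assign each region code a palette color based on its income quantile."""
    incomes = [income for _, _, income in rows]
    # quantiles(n=k) returns k-1 cut points that split the data into k classes.
    cuts = quantiles(incomes, n=len(palette))
    colors = {}
    for code, _, income in rows:
        # The class index is the number of cut points the income exceeds.
        bucket = sum(income > cut for cut in cuts)
        colors[code] = palette[bucket]
    return colors


colors = color_classes(income_rows, PALETTE)

# Regions that have a shape but no income value need a "no data" color on the map.
missing = sorted(shape_codes - set(colors))

for code, name, income in income_rows:
    print(code, name, income, colors[code])
print("No income data for:", missing)
```

Quantile classes are only one choice. Equal-width income bands or hand-picked thresholds would change which regions share a color, so it's worth trying more than one before trusting what the map seems to say.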
Search on!  

P.S. This is the kind of thing that Data Scientists do all the time.  With this Challenge, I'm hoping to instill some of the skills and values that Data Scientists bring to the job every day.  Hope you have fun with it.  I'm looking forward to your comments! 

Wednesday, July 4, 2018

Answer: How big was the range of these animals?

What's the original / natural range?   

The basic question from last week's Challenge was "What was the original or natural range of these three animals?"  (Lion, Ground Sloth, Camel)  It's a natural enough question, but the answers can be surprising.  

The Challenge was:  

If there were, once upon a time, lions and camels and ground sloths  (Oh my!) in North America, what was their historic region?  During the past 100,000 years, where could you find camels, lions, and ground sloths?  

Basically what we're looking for is a map of  the ranges of these animals.  But... it's tricky.  The obvious queries like: 
     [ historic range lion ] 
can give us conflicting answers.  What's going on here?  

With questions like this, we need to be very clear about what our terms actually mean.  It sounds obvious, but what, really, is a lion?  

For instance, it's easy to find this map from the above query on Wikipedia: 

"Historic lion range" per Wikipedia

But the number of variant answers that we see tells us that the problem is a little subtle:  

What do you mean by "lion" and what do you mean by "historic"?  

Reading the Wikipedia article about lions we find:  "In the Pleistocene, the lion ranged throughout Eurasia, Africa and North America from the Yukon to Peru."  Really?  There were lions in North America?  

We must go deeper.  

And it doesn't take us long to learn that there are extinct subspecies of lions, and that today the American lion that lived during the Pleistocene is usually treated as a subspecies of the African lion (Panthera leo), which is why it is more commonly listed as Panthera leo atrox.  However, a number of researchers consider the American lion to be different enough from the African lion to give it its own distinct species, Panthera atrox.  

But this is just the tip of the iceberg when dealing with the classification of the American lion, and it's easy to become lost in the multitude of theories and arguments about its true position in the Panthera clan.  

Of course, there's another lion that's known from its bones, the Eurasian cave lion (Panthera leo spelaea), which is itself another kind of closely related lion that went extinct about 13,000 years ago.  This similarity has been confirmed by mitochondrial DNA analysis, which shows that the American lion and Eurasian cave lion were almost identical, although the American does seem to have grown slightly larger.  (See the Wikipedia article about leo spelaea.)  

So if we look for range maps for each of these subspecies of lion (notice how I'm looking for this particular species name, not just for lion):  

     [ map range Panthera leo spelaea ] 

we'll see a number of maps, such as this one: 

Historic range (13,000 years ago) of Panthera leo spelaea  (From a paper on its genetics)

Although even this map seems a little incomplete--as we see in the Encyclopedia of Life entry about the American Lion, its range went down into northwestern South America.   (Another article about lions in South America.)  It never lived in Australia or Antarctica, but something very much like a lion seems to have lived just about everywhere else.   

So... what's its original range?  Well, there were lions just about everywhere--in North and South America, as well as all of Africa, most of Europe, and vast swaths of Asia.  

What about the ground sloth?  We don't see them roaming around anywhere these days.  So what was their original range? 

Let's go back to our previous question:  What do you mean by ground sloth?  

The Wikipedia article on ground sloths seems fairly complete.  It tells us that (summarizing here):  

Ground sloths are a diverse group of extinct sloths in the mammalian superorder Xenarthra.... a term used as a reference for all extinct sloths because of the large size of the earliest forms discovered, as opposed to existing tree sloths... 
Much ground sloth evolution took place during the late Paleogene and Neogene of South America while the continent was isolated. At their earliest appearance in the fossil record, the ground sloths were already distinct at the family level. The presence of intervening islands between the American continents in the Miocene allowed a dispersal of forms into North America. A number of mid- to small-sized forms are believed to have previously dispersed to the Antilles. They were hardy as evidenced by their diverse numbers and dispersals into remote areas given the finding of their remains in Patagonia (Cueva del Milodón) and parts of Alaska. 
Sloths, and xenarthrans as a whole, represent one of the more successful South American groups during the Great American Interchange. During the interchange, many more taxa moved from North America into South America than in the other direction. At least five genera of ground sloths have been identified in North American fossils; these are examples of successful immigration to the north.

(And, yes, I checked other resources about camels, e.g., books about camel evolution.  This story checks out.)  

Reading through this, we see that ground sloths (the Xenarthrans) inhabited at least North and South America, from Patagonia to Alaska... AND much of the Caribbean!  That's a huge range.  Some varieties were just gigantic--five tons in weight, 6 meters (about 20 feet) in length, and able to reach as high as 5.2 meters (17 feet).  These were significant megafauna of the American landscape.  

One particular kind of ground sloth has a special place in American history (which we're celebrating today, July 4th).  The Megalonyx ground sloth was first identified by Thomas Jefferson in a paper that he presented before the American Philosophical Society on March 10, 1797: "A Memoir on the Discovery of Certain Bones of a Quadruped of the Clawed Kind in the Western Parts of Virginia."  

This paper is widely seen as establishing the science of vertebrate paleontology in the US.  Interestingly, Jefferson identified the Megalonyx as a giant lion, and asked Lewis & Clark to be on the lookout for any Megalonyx as they explored the American west in their expedition of 1804 - 1806.  Jefferson, along with many other scientists of the time, had no idea that animals could go extinct, and so he naturally saw these bones in terms of existing animal forms.  Alas, he missed the last Megalonyx by roughly ten thousand years.  

But their range was clear:  It varied by species, but collectively, the ground sloths were found in the Americas.  

Finally, what about camels?  

Again I have to ask:  What do you mean by camels?  

If  we check the Wikipedia page on camels, we learn that dromedaries live in the Middle East and the Horn of Africa, while Bactrian camels are native to Central Asia, but live throughout remote areas of northwest China and Mongolia.  (Historically, the area known as Bactria is the flat region straddling modern-day Afghanistan, Tajikistan, and Uzbekistan. More generally, Bactria was the area north of the Hindu Kush, west of the Pamirs and south of the Tian Shan with the Amu Darya flowing west through the center.)  

But we also learn that an extinct species of camel in the separate genus Camelops, known as C. hesternus, lived in western North America before humans entered the continent at the end of the Pleistocene.

This takes us, once again, into a definitional moment.  

Looking up the evolution of camels, we learn that the earliest known camel, called Protylopus, lived in North America 40 to 50 million years ago.  It was about the size of a rabbit and lived in the open woodlands of what is now South Dakota.  By 35 million years ago, the Poebrotherium was the size of a goat and had many more traits similar to camels and llamas. 

But the direct ancestor of all modern camels, Procamelus, lived in North America around 3–5 million years ago.  These proto-camels (family Camelidae) spread to South America as part of the Great American Interchange, where they gave rise to guanacos and related animals.  They also spread to Asia via the Bering land bridge, and ranged as far as Ellesmere Island (in modern Canada, well above the Arctic Circle, near Greenland). 

Procamelus from the mid-Miocene in Colorado, by Robert Horsfall.
 (p/c: Wikimedia, from the book
  A history of land mammals in the western hemisphere by William Berryman Scott)  

The Wikipedia article on camels tells us that the "last camel native to North America was Camelops hesternus, which vanished along with horses, short-faced bears, mammoths and mastodons, ground sloths, sabertooth cats, and many other megafauna, coinciding with the migration of humans from Asia."

Well, that's interesting:  the American Lion, ground sloths, and camels all went extinct around the time humans appeared in the Americas.  

So if we consider as camels only "large animals with humps"  (and not the smaller camelids like llamas and guanacos), their range was historically North America, then spreading into central Asia and the Middle East.  

If you search for other possibilities (e.g., [ Europe camel ] or [ Africa camel ]) all you'll find are references to the (relatively recent) introduction of camels into those places.  

Likewise, camels seem to have spread a bit more recently into places you might not have expected.   

Around 700,000 dromedary camels are now feral in Australia, descended from those introduced as a method of transport in the 19th and early 20th centuries.  This population is growing about 8% per year, even becoming a problem as they consume resources in a limited landscape.  

In North America, after being away for 15,000 years, a small population of camels was imported in the 19th century as part of the U.S. Camel Corps experiment. Never a full part of the Army, it was a short-lived experiment to use camels in arid climes instead of horses. When the project ended a few years later, the camels were sold as draft animals for mines, escaped, or were released into the desert. Twenty-five U.S. camels were bought and imported to Canada during the Cariboo Gold Rush.  (NYTimes article about the Camel Corps.)  

Upon finding water during a surveying trip, horses would drink immediately, while the accompanying camels would show little interest (see the camels in the upper right background). The Army’s camels proved they could withstand the oppressive climate of the American Southwest and other hardships that could send horses and mules into a panic. (Horses Quenching Their Thirst, Camels Disdaining, by Ernest Etienne de Franchville Narjot, courtesy of The Stephen Decatur House Museum)

Search Lessons 

There are two big lessons here (and one little one).  

1.  Be clear about what you're searching for!  In the case of the lion, do you mean to include American Lions?  Or only the African variety?  In our case, we chose to go with clearly related (albeit extinct) lions (e.g., the Cave lion and the American lion).  This changes your search strategy to include very specific terms, like Panthera leo spelaea, or Panthera atrox.  Likewise, we had to make a cutoff for camels since the evolutionary slope is a bit slippery.  We didn't want to include camelids like llamas, so we went with "big cloven-footed animals with humps" which evolved at a particular time.  (Note:  camels and llamas can interbreed, although like mules, the offspring is sterile... so we chose a morphological cut-point, rather than a clear species boundary.)  

2.  Checking Images is sometimes helpful for finding maps that give extents.  As a general strategy, I often check Google images for information that's presented graphically, using the same queries as I use when searching the web.  About half the time, I end up learning something that I hadn't anticipated learning.  Serendipity is your friend, and Image search lets serendipity happen! 

3.  (A small point)  I found the image of the Procamelus not by doing "regular" web search, but by searching on Wikimedia.  Normally, I think of Wikimedia as being the repository for all of the images on Wikipedia, but sometimes you can find images that appear only in the Wikimedia collection and don't seem to appear anywhere in Wikipedia.  This is incredibly handy for teachers (and blog post writers) because the images come with copyright information and are often uncommon.  It's easy to search this space: just use a site: query like this:  [ site:wikimedia.org procamelus ]  You'll find all of the images you might need, along with useful copyright information.  

Ending note: 
This week's post took longer than normal for a couple of reasons--the book needed attention, I had a few things to do at work, and there was a bug in Blogger that I needed to run down.  (One of the downsides of working at Google is that I feel obligated to help find bugs in our products. C'est la vie.)  
As I was feeling bad about being slow to stick to the publication cycle, I realized that the rest of the summer is going to be busy as well.  I'm teaching a few search classes in Europe, and taking a bit of vacation in places that won't have Wifi, nor will I have my computer.  These places should be fun, and should lead to even more interesting SearchResearch Challenges.  
I'll keep writing, and will write something every week (unless I know I'm going to be off the grid, but I'll give a heads-up when I know...).  

Until then, Search on!  

Wednesday, June 20, 2018

SearchResearch Challenge (6/20/18): How big was the range of these animals?

Going beyond avocados and onto animals! 

As you recall, one of the important ways that avocado seeds were dispersed was through the gullet of large animals (like the gomphothere).  Many of those animals are long gone--extinct.  But they left an indelible effect on our landscape, leaving a trail of guacamole behind them.    

That made me start to wonder about those beasts and what the landscape must have looked like back in the day when gomphotheres wandered around through southern Mexico and Central America.  

If I recall my Pleistocene history correctly, there were a LOT of charismatic megafauna back then--including ones that we think of as being native to some other places.  I associate sloths with South America, but there used to be really big sloths in North America as well (although there are none today).  Likewise, I think of lions in Africa and camels in the Middle East + Africa.  

But it wasn't always this way.  This leads to our fun SearchResearch Challenge for the week.  

If there were, once upon a time, lions and camels and ground sloths  (Oh my!) in North America, what was their historic region?  During the past 100,000 years, where could you find camels, lions, and ground sloths?  

Just as importantly, how do you do this search?  What queries do you need to figure this out? 

Be sure to let us know in the comments.  Share your knowledge about how to answer questions like this!  

Search on! 

Friday, June 15, 2018

Answer: Seed dispersal mechanisms for giant seeds? (And search strategies)


... that fruit beloved of millennials on their toasts, that mainstay of guacamole, that luscious green ambrosial centerpiece of tortilla chips...  our lives would be poorer without it. 

But as you know, it's a food of the New World, having originated somewhere between southern Mexico and Peru.  There is some evidence that avocados were domesticated at least 3 times, resulting in the currently recognized Mexican (aoacatl), Guatemalan (quilaoacatl), and West Indian (tlacacolaocatl) landraces.  (See the Wikipedia entry on avocado.)  

As noted, that seed is pretty big.  And that led me to this week's SRS Challenges.  How would you work out the answers to these questions?  

1.  How are avocados dispersed?  If they rely on just falling from the tree, that doesn't seem to work well... so is the "natural" dispersal by animal?  If so, WHAT animal would eat an avocado... and then be big enough to carry it somewhere and leave it behind in a new place?    
I think I kind of gave away a big clue here by writing "dispersal" in the Challenge.  That's a huge suggestion of a specialized search term to use in your query:

     [ avocado dispersal ]

This leads directly to a Smithsonian Magazine article, "Why the Avocado Should Have Gone the Way of the Dodo," which points out that:

 "The plant hit its evolutionary prime during the beginning of the Cenozoic era when megafauna, including mammoths, horses, gomphotheres [a kind of giant, now extinct, elephant]  and giant ground sloths ... roamed across North America, from Oregon to the panhandle of Florida. The fruit attracted these very large animals... that would then eat it whole, travel far distances and defecate, leaving the seed to grow in a new place."

This article is based on the writings of Connie Barlow in magazines like Biodiversity (article about avocados) and Arnoldia (article about anachronistic fruits).  (Both are well-respected journals.)  She's also the author of the book The Ghosts of Evolution, which is all about this topic.

To triangulate (that is, to get multiple perspectives on the same topic and look for agreements / disagreements), it's fairly easy to find multiple articles by different authors (e.g., National Geographic, Science...), all of which agree on this interpretation of events.  

2.  Are there other plants with giant seeds (like the avocado's) and how do THEY get dispersed?  
This category of "anachronistic fruits" (that is, fruits that evolved in a very different ecosystem than the one that now holds) includes those of the canistel tree, the honey locust, and the Osage orange.  (From the Barlow article and that Science journal article.)  These all have the property that the ecosystem is missing animals (that used to be around, hence "anachronistic") that can disperse the seeds.

The fruits (and seeds within) are either large (Osage Orange, Avocado) or have a tough exterior (Honey Locust, Canistel).  In all these cases, the hypothesis is that really large animals, such as gomphotheres or giant ground sloths, would eat these fruits, and then deposit them later after they'd passed through a large animal digestive system, which would prepare them for germination and growth.

It takes a Gomphothere to eat, and process, an entire avocado.

That is, these are really archaic fruits that literally belong to another time.  And, in the case of the avocado, a strong case can be made that after the demise of the megafauna, it was humans that propagated the bizarre avocado trees, and their seeds.

And a simple search for:   [ large seeds ]  leads handily to several lists of large seeds (e.g., the coco de mer), and then combining that name with dispersal leads directly to a discussion of how that particular seed is dispersed.  For instance: 

     [ coco de mer dispersal ] 

where you can learn that these giant seeds might have been an anachronistic fruit 65M years ago, but that currently, they're dispersed only near the parent trees.  (Apparently they're unable to float, so they can't colonize distant shores the way a coconut palm can. See the paper by Edwards, et al., in New Phytologist.)  But other giant seeds DO require animal interventions.  Again, I leave it to you to explore... 

3.  In writing this post, I’ve been looking for an early illustration of an avocado.  It's kind of tricky.  What’s a strategy for finding the earliest illustration of something?  In our case, an avocado seed... but how about finding an early illustration of ANY thing?  

Here's what I did to find an early illustration of an avocado.

Note that I was a bit vague, so any reasonably old illustration would do (although I admit that I was hoping for the first publication of the avocado).   Here are my four strategies...

1.  Work from Wikipedia's history (or any history of the object in question). The Wikipedia articles often give a reasonable history with references.  In this case, it clearly talks about avocados being mentioned in the Florentine Codex, a 16th-century report about Mesoamerica by the Spanish Franciscan friar Bernardino de Sahagún.  Interesting!  So I then did a search:

     [ "Florentine Codex" avocado ]

Here's the first image I could find, from roughly 1590.  

2.  Use a term that connotes age.  Often, older images or illustrations will be labeled with specific terms--archive, etching, engraving, etc.  My first guess at using a context term for age was "antique," so I incorporated this into my query:   

     [ antique avocado print ]

This kind of query implies a picture from a particular era.  And in particular, it leads to a lot of older prints/illustrations of avocados.  

None of these really pushes the date back very far, but the query definitely gives us a bunch of early-ish prints.  

3. Books search--set to an earlier date. I just searched for [ avocado ] in Google Books, and restricted my search to the 19th century, knowing full well that avocado was a rare word back then, so any mention probably would have been in a botanical journal.  Sure enough, I was able to find a nice illustration from 1891:

This appeared in Frank Leslie's Popular Monthly from 1891.  (Monthlies from this era often contained extensive illustrations--perfect for this kind of thing.)  I expect that more searching around would find more illustrations like this, and I leave this exploration to you.  

4. Search for the earliest publication explicitly. Often, the simplest way is the best.  When searching for an early publication or illustration, you can predict that someone will write something like "... and here's the earliest publication about avocados..." When I did this query, it worked out quite well.  This query:

    [ earliest publication avocado ] 

leads to a paper on "The Early History of the Avocado" (interestingly enough, from the web site).  This paper refers to an early book, the Suma de Geografia, published in Seville, Spain, in 1519. In that era, Enciso travelled with the great navigator and cartographer, Juan de la Cosa, during one of the first explorations of the New World. He described the avocado that he saw in one of the small harbors at the foot of the Sierra Nevada de Santa Marta (trans. by Wilson Popenoe; this quote is from "The Early History of the Avocado" paper):
"Yaharo is a good port, with good lands and here are groves of many different sorts of edible fruits, among others is one which looks like an orange, and when it is ready for eating it turns yellowish; that which it contains is like butter and is of marvelous flavor, so good and pleasing to the palate that it is a marvelous thing."

This is great, but I'm looking for an illustration.  I checked Google Books for a scanned volume (hoping there would be an illustration there, but no dice--it's not scanned).  But this is a good clue--perhaps I can find an article that refers to the Suma de Geografia, and that would have an illustration.   So my next query was for:

     [ "Suma de Geografia" avocado ]

I checked Google Books (and Hathi Trust), but couldn't find a completely scanned version.  Then I looked in Images, and that led me quickly to this fascinating paper, "The Avocado (Persea Americana, Lauraceae) Crop in Mesoamerica: 10,000 Years of History" published in the Harvard Papers on Botany.  Reading through this text, I found the sentence 

"[in this diagram] Every figure emerges from the earth, and behind each of them there is a tree with fruits that include the cacao, avocado, soursop (Annona muricata L.), and chicozapote (Manilkara zapota )..." 
 with a reference to  another paper, "Observations on the Cross Motif at Palenque" by Linda Schele, published in Primera mesa redonda de Palenque (a conference on the art, iconography, history of Palenque). 

And in THAT paper from 1974, the author includes a detailed illustration from the Temple of Inscriptions, where on the east side of a sarcophagus is this (just a small piece of a much larger picture): 

The oval things are avocados.  The Temple of Inscriptions was completed around 683 AD, so I'd argue that this is probably the first illustration of an avocado--literally carved in stone.  

That's probably the earliest depiction of an avocado that I could find.  

Search Lessons 

There are a few here: 

1.  Using the right terms (e.g., dispersal) helps a lot!  In this Challenge, I gave you a big clue.  A precise, relatively rare term can help a lot.  

2. "Reading around" in the literature can give you many other terms for searching.  In my case I learned about "anachronistic fruits," which then opened up an entire literature for perusal.  

3.  There are many ways to find old illustrations--(a) working from the literature, (b) adding in a context term that suggests age, (c) searching for illustrations in books, (d) searching explicitly for "early illustration."  There are other ways that we'll cover in the future, but this is a good start on a short list of methods to find older pictures and illustrations.  

I'll be back next week with another update, and a new Challenge!  

Search on! 

Thursday, June 7, 2018

Survey Results--an analysis of advice about searching

Last week we had a survey.  

It asked the SRS community about what they thought were the most important search skills.  (Take it by clicking here, if you want to see what the original survey was all about.)  

Since I do surveys professionally (for my research work at Google), I know this isn't a perfect survey.  The sample is too small and too biased towards professional researchers, and the survey doesn't have a broad enough set of questions to be accurate. 

I don't care.  

What I wanted from this survey is a sense of what experts think about search advice.  That is, I really wanted to hear what you, gentle reader, had to say about search methods and conducting online research.  

The good news (for me) is that many of the ideas you put into the survey are covered in my book.  The better news is that I don't cover everything you mentioned.  

To everyone who filled out the survey, many thanks for your comments.  They were thoughtful and had great insights, including a few that I hadn't thought about before!  We'll cover some of these in future posts.  

Analysis:  We got 48 responses from readers.  I was hoping for a bit more, but this is a great foundation to start.  There were also 4 questions, so that's 192 responses that I'm summarizing.  

All I did was to read through all of the responses and summarize the themes I saw.  Each of the items below was suggested by more than one person.  The rank order reflects the number of times each was written about.  #1 in the list was the most common, #2 was second most common, etc.  

Here's what you said, with a short explanation / summary afterwards. 

Most important skills:

1. Query formulation (and reformulation)
This was by far the most common skill that you think good searchers should have.  This makes sense, as the quality of your search query is a strong determiner of how successful your search will be.  Over the years I've seen that some people get stuck when they can't figure out the right search terms--that's an important barrier to getting to a successful answer.  Reformulating your query is important, both to get unstuck and to home in on what you're actually seeking.  (More on this below.) 
2. Learn from previous searches
We've talked about this in lots of SRS posts--a very important skill is that of reading and learning from your search experience.  In classes, too often I see people not reading the results page and consequently NOT learning what worked or didn't work.  Careful reading and thinking about what your search returns leads to learning, and learning leads to better searching in the future.  

3. Know what’s possible
Knowing what you can search for, and understanding the ways in which you can search, is absolutely fundamental.  In my testing of search skills, one of the things that completely blocks people from completing their search task is not knowing that you can search for X, Y, or Z (you fill in the blanks). In one dramatic example, 250 Google engineers were unable to complete a search challenge I gave them because they didn't know it was possible to search for archival aerial images on Google Earth.  (Did you?)  

4. Context
When you search for something that's more than just a simple answer, you often need to find the context that surrounds what you find.  This is especially true for complex topics such as social or historical questions.  Just reading a single web page (or, God forbid, reading only the snippets on the SERP) almost never gives you enough context information on the topic.  Read widely, and learn the when/where/why/how about your topic.

5. Lateral searching (using tabs and windows to organize your searching)
I'll be writing much more about this in the weeks to come, including some effective strategies for doing it.  My colleagues Sam Wineburg and Sarah McGrew (Stanford) have written really well about lateral searching and why you'd want to do it. See: Lateral Reading (Sept, 2017).  

6. How to limit your searches to a particular domain 
Limiting the scope of your searching is simple if you know how to use site: or filetype: -- those are the simplest ways to get results with certain kinds of properties.  You use site: when you want results from a particular web site or organization.  Thus, [site:cdc.gov influenza] will give you results with a distinctly CDC perspective.  Likewise, you can exclude certain sites with the minus operator: [-site:cdc.gov influenza].  Or you can search multiple sites or entire domains:  [site:.IN Brexit OR site:.BE Brexit] will search for "Brexit" on Indian or Belgian sites. 

7. Ability to evaluate what you find
Yes!  This is critical... and requires much more discussion than I have room for here.  We'll return to evaluation rules-of-thumb in future blog posts. 

8. How to find experts and learn what they know
This is a great way to find advanced, expert content on a topic.  The trick here is to locate experts who are truly expert (why do you think they're an expert?) and then track down their writing.  The other trick is to always locate more than one expert.  Ideally, you want to see the varieties of thought on a topic (there could be more than one point-of-view... in some cases, there can be 4 or 5!), and get experts from each perspective.  

9. Critical thinking 
A critical thinker is always wondering "How could this be wrong?"  And "How can I break this big problem down into pieces?"  There are many approaches to critical thinking, but these two heuristics work well for me.  A critical thinker criticizes an idea, some writing, or a point-of-view... but does so productively, not just to be mean-spirited about it.  Critical analysis means trying to break something into its components to understand how it works.  It does not assume that authorities are always correct, but calls things into question.  (This is why critical thinking annoys some people.)  Of course, you want to do your critical thinking with care, and not be annoying about it.  But it's an essential skill.  
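
For readers who like to script their searches, the site: and minus-operator patterns from skill #6 above can be sketched as a tiny query builder.  This is only an illustration: the helper name build_query and its parameters are made up for this sketch, cdc.gov is assumed as the example domain, and the function just assembles a string that you would paste into the search box yourself.  It doesn't call any search API.

```python
def build_query(terms, include_sites=(), exclude_sites=(), filetype=None):
    """Assemble a search query string (hypothetical helper, not an official API).

    terms         -- the plain search terms, e.g. "influenza"
    include_sites -- sites or domains to restrict to with site:
    exclude_sites -- sites or domains to exclude with -site:
    filetype      -- optional file extension for the filetype: operator
    """
    base = terms.strip()
    if filetype:
        base = f"filetype:{filetype} {base}"
    # The minus operator in front of site: excludes that site.
    for site in exclude_sites:
        base = f"-site:{site} {base}"
    if include_sites:
        # Repeat the terms for each site and join the clauses with OR,
        # following the [site:.IN Brexit OR site:.BE Brexit] pattern.
        return " OR ".join(f"site:{site} {base}" for site in include_sites)
    return base


print(build_query("influenza", include_sites=["cdc.gov"]))
# site:cdc.gov influenza
print(build_query("influenza", exclude_sites=["cdc.gov"]))
# -site:cdc.gov influenza
print(build_query("Brexit", include_sites=[".IN", ".BE"]))
# site:.IN Brexit OR site:.BE Brexit
```

Repeating the terms inside each OR clause is a deliberate choice here: it keeps each restriction attached to its own copy of the query, which is how the multi-site example above is written.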

Most important attitudes:

1. Persistence

Persistence in the face of failure is the #1 attitude (or trait) that was mentioned by those surveyed.  I agree.  It's obvious in my search classes when someone has this attitude, and when they don't.  Much of what I teach is attitude encouragement.  "If that didn't work, what do you think would work?"  And "Just try one more search..."  

2. Curiosity 

You'd think this would be obvious, but I agree.  Intrinsic curiosity about the world is essential.  On the other hand, I don't know how to communicate this other than by being curious myself and showing what kinds of things you can learn about the world through satisfying your curiosity.  (In many ways, my book is exactly this--ways to find answers to your questions about curious things.)  

3. Adaptability 

Adaptability is just the ability to respond to changing circumstances.  In online research, it's the ability to act effectively when things go wrong (e.g., getting 404 errors, finding that your favorite website no longer supports that service, etc.).  You'll also find that you need to adapt to changing tools (new features are introduced, old ones go away) and data.  

4. Enthusiasm

Someone's enthusiasm goes a long way in making both persistence and curiosity work.  I've seen people who were not great searchers manage to succeed because they had great enthusiasm for their topic, which in turn made them persistent, curious, and adaptable.  (These people are always great to work with.  Their enthusiasm for the topic means they want to learn how to be better and more accurate.)  

How to ask good questions:  

These answers were pretty interesting, and all over the map.  Here's my condensation of the top 10 ideas. 

1. Be specific / be clear about what you’re asking

2. Ask yourself:  WWDD  (what would Dan do?) 
I was surprised by how many people said this.  Aww... gee... thanks! 

3. Who would know the answer to your question… how would they put that info online?
This is a variation on the "find an expert" heuristic from above.  But I like it as a heuristic for asking questions too.  How would an expert think about your question?  

4. Think of alternative ways to ask the question

5. Start large – don’t zero in on a specific topic

6. When stuck, work from specific examples

7. Always ask why? 
That is, ask "why?" about the answers you’re finding.  Why is this true?  And how do you know?  This almost always leads to a better question. 

8. “Predict the answer”
One way to create a question is to think forward to what an answer would look like.  That is, if the answer looks like this, what question would let me get to that?

9. Ask yourself WHY do you want to ask this question.
This is a classic library reference desk question.  A great reference person always tries to dig into what you want to know... and why.  Knowing the answer to "why?" tells you a lot about how to frame your question.  

10. Be open to nuance. 
Don't give up on the subtleties of a question.  You can almost always ask "Is there more I should know?"  That's a useful question in almost all cases. 

Other advice from the survey

1. Be aware of other search tools (e.g., thesaurus, dictionaries, geographical tools, …)

2. Don’t limit yourself to English and US sites

3. Search for tools to help you accomplish your task

4. Recognize that Google doesn’t have everything

5. Be humble in your searching

6. Consider the motivation of whoever wrote whatever you’re reading.  Why did they write this?  Why did they post it?

7. Be aware of confirmation bias


There's obviously a lot that's been packed into these answers.  I'll be unpacking some of these themes over time.  

You'll be amused to know that the attitudes section is almost exactly what I cover in chapter 19 of my book.  Thanks for confirming my intuitions!  

See you next week when we discuss the answer to the mysterious avocado seed question! 

Search on.