SearchReSearch: October 2011

Monday, October 31, 2011

The + operator is gone... So what?

Just the other day, Google turned off the + operator. There was a bit of a kerfuffle about it in a few blogs, and Danny Sullivan got bent out of shape about it, but relatively few people noticed that the behavior of the double-quote operator also changed at the same time.

Here's the deal: LOTS of people believed incorrectly that the + operator was the opposite of the - operator. You know what - does: it excludes the term from the search results. That is, if you do a search like [apples -macintosh], the results will not contain the term macintosh. That makes sense. (Some places use the NOT operator for this. Same behavior.)

Unfortunately, many people believed that a search like [apples +macintosh] would require the term to be in the search results. That's NOT what it did. While the + term would usually be in the results, it was only there because you'd put it into the query!

So what did the + do? Answer: It turned off synonymization and spell-correction. That is, with a query like [apples +macintosh] you wouldn't get that term macintosh being synonymized for a term like gala, gravenstein or jonathan. (Those are other apple varieties, if you're wondering.)

NOW... we no longer have the + operator. So what can you do if you want the same effect?

Answer: Use double-quotes for single terms. You'd write that query now as [apples "macintosh"], and it'll give you the same effect as what the plus used to do.
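If you have a pile of old saved queries that still use the plus sign, the rewrite is mechanical. Here's a minimal sketch in Python; the helper name plus_to_quotes is made up for this example, and it only handles single-word +terms (not +"quoted phrases").

```python
import re

# Match a "+" that starts a word (start of query or after whitespace),
# so a "+" inside a name such as "K+S" is left alone.
_PLUS_TERM = re.compile(r'(?<!\S)\+(\w+)')

def plus_to_quotes(query: str) -> str:
    """Rewrite old-style +term as "term" (hypothetical helper, not a Google API)."""
    return _PLUS_TERM.sub(r'"\1"', query)

print(plus_to_quotes("apples +macintosh"))  # apples "macintosh"
```

Terms with a leading minus sign are untouched, since exclusion still works the way it always did.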

Some people complain that it's an extra character to type. Geez. Give me a break--you're going to complain about 1 extra character?

The double-quote mechanism is actually pretty interesting. For instance, did you know you can do the following kinds of a query?

[ intitle:"abc" ] finds “ABC” in title of a web document, no synonyms allowed

[ intitle:a intitle:b intitle:c ] finds the single letters A, B, C in any order in title

And you can use double/double-quotes to look for non-synonymized terms in a particular sequence. A la,

[ " "abc" "cnn" "cbs" " ] finds those 3 terms in that sequence without synonyms

and you can extend that to include

[" intitle:"a" intitle:"b" intitle:"c" "] finds the single letters A, B, C in title of page, in that sequence
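If you build queries like these in a script, the two patterns are easy to generate. This is just a sketch of the string layout shown above; the function names are invented for the example, and whether the search engine still honors the double-double-quote form is an assumption you should test for yourself.

```python
def any_order_in_title(terms):
    """Each term must appear in the title, in any order."""
    return " ".join(f"intitle:{t}" for t in terms)

def in_sequence(terms):
    """Wrap individually quoted terms in an outer pair of quotes
    (the double/double-quote form described in this post)."""
    inner = " ".join(f'"{t}"' for t in terms)
    return f'" {inner} "'

print(any_order_in_title(["a", "b", "c"]))   # intitle:a intitle:b intitle:c
print(in_sequence(["abc", "cnn", "cbs"]))    # " "abc" "cnn" "cbs" "
```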

Search for these terms in ANY order in the title of a web page:

Search for these terms in THIS given order on the web page (using double-double quotes):

Contrast with this result, where the 3 terms can appear in any sequence...

Does this help you understand the new power of double-quotes? Yes, the + operator is gone, but what we now have makes a bit more sense.

Thursday, October 27, 2011

Answer: Where did that crazy uniform come from?

Quick answer: This is the Civil War uniform of the Zouaves, a rather colorful uniform dress adopted by some regiments as a particularly fashionable way to go to war. They were heavily influenced by African battle dress, especially from the Zouaoua tribe in Algeria.

Image from LOC's collection of articles about the American Civil War

(Internet Archive)

What was fun for me was to see how people solved the search challenge. A few folks recognized that dragging-and-dropping the image into Google Image search, in the manner shown in this YouTube demo video, would find it. (If you haven't seen this trick before, it's a remarkable thing to know. You'll be able to find things that were just impossible to find before.)

A couple others used Google Goggles, which does basically the same trick, although through your mobile phone camera, rather than dragging images around on the desktop.

But other people used more traditional search methods. Dave, the first to reply, wrote:

"...quick Google of "Civil War Uniform Turban" turned up that this uniform style was called "Zouave" after North African troups serving in the French army in the 1830s and until WWI. Wikipedia says these were originally recruited from the vowel-rich Zouaoua, a tribe of Berbers...."

Interestingly, Fred also did this same strategy, but wasn't able to pick out the link that Dave found. That's actually fairly common: two people follow the same strategy, but end up with different outcomes. Usually it's because one person knows just a bit more about the area, and follows a different link from the SERP to a landing page. I know Fred is a strong searcher, so I think this points out that your-mileage-may-vary even when following the same strategy. Search knowledge is powerful, but knowing about a domain is just as important!

In my case, my strategy was to search for [Civil War uniform fez] (the first image I saw of these soldiers had them wearing fezzes, although as we see, turbans were also acceptable alternates). This led me quickly to the Wikipedia article on Zouaves, which took me to learn more about Civil War dress styles and the utter craziness of these soldiers who had a reputation for hard fighting and being tougher than the average warrior (who were already pretty tough).

The purpose of this blog is really to show you techniques for being a better searcher, and we see two maxims at work here.

first: If you're searching for an image, an Image search is often best. Dragging and dropping the image into the Google Image box is a great first thing to try. (Just so you'll know, I tried to make it not work by cropping the original images--obviously that wasn't enough.)

second: When you do your searches, try to use the simplest possible language to describe what you're searching for. My search, [Civil War uniform fez] , was pretty short and to the point. Once you start adding that 5th and 6th word, you might be accidentally removing the target of your search with those extra words. Think of adding an additional search term as a "focusing" mechanism... if you focus in too tightly, what you're looking for might be just outside the range of your search. So start broadly, then tighten in as you learn more.

Search on!

Wednesday, October 26, 2011

Wednesday Search Challenge (October 26, 2011): Where did that crazy uniform come from?

I just came back from a short trip to Washington DC where I wandered on the Mall between all of the monuments and museums, taking in the whole sweep of American history and puzzling over the parts that I really don't recognize or whose back-story I don't know.

One of the more puzzling bits was looking at photographs from the Civil War. I'd seen the pictures by Mathew Brady and knew most of the generals, battlegrounds, armaments and forts. But what really surprised me were some of the crazy uniforms the soldiers were wearing.

Really? These guys were fighting in the American Civil War? But I assure you, they were, and did.

I have to admit--these images didn't line up with my expectations of Civil War uniforms at all! What's going on here?

Naturally, I looked it up... and found that these unusual Civil War uniforms were inspired by a fashion craze at the time that came from another part of the world.

The search challenge for today is pretty simple and comes in two parts:

1. What is the name of this style of uniform dress?

2. Where did this style of uniform originate?

And... for extra credit,

Extra credit: The original wearers of this uniform style were recruited from what group?

Answer tomorrow. (A little hint: Searchers in Louisiana might find this problem fairly easy.)

Search on!

Thursday, October 20, 2011

Answer: Who owns that piece of land?

Okay, I admit it. I kind of set you up for this one. But there's a reason... and that is that preconceptions of what you're searching for can ALSO damage your ability to search. Let me explain.

We've actually discussed land ownership before. (See previous episode on determining how much acreage there is in a given land parcel.)

Today we'll discuss a slightly different solution and I'll point out the setup.

Using Maps.Google.com you can right-click at the given location on the map. That will pop up a context menu with an option for "What's here?"

It will show a link to the nearest address--in this case, 7380 Morton Ave. I should have been tipped off at this point that maybe it wasn't a Cargill-owned site. I recognized the name "Morton" as a big salt company. But I didn't really notice or pay attention. So I kept looking for the Cargill connection.

If you copy that address into regular Google, you'll see a Map result (which we already have) and a few links about "Morton Salt."

Huh. I thought I was looking for Cargill. All I ever read in the local press about the baylands has the word "Cargill" in it. Maybe Cargill bought Morton Salt?

This led to a flurry of clicks and tracking down information about Morton Salt. Does Cargill own it? Answer: No, it's actually owned by K+S Aktiengesellschaft, a German company that bought Morton in October of 2009 for $1.5B and in the process became the world's biggest producer of salt. Okay, who owns K+S? Answer: Nobody.

So... who owns that parcel of land?

A quick check of the street location in Google Streetview shows an obvious first clue: Morton Salt!

The next link in the SERP is to Wikimapia, a site that shows ownership and is great for seeing city and neighborhood boundaries. (It's a great resource for this kind of thing... highly recommended.) You can also use PropertyShark.com (login required) to get even more information about parcels like this. They both also confirm that this is owned by Morton.

To verify all this, I went to the real authority: the Alameda County Assessor's website and looked up information about the parcel (which is labeled: 537-751-6-4). As is usual, the government website is tough to use. (Do anything slightly wrong, and it gives you nothing.) But after a bit of trial and error, I found the parcel is in fact owned by Morton Salt, and is worth ~$20M.

Moral of the story: I went looking for Cargill and found Morton. This kind of thing happens fairly often--BEWARE of the made-up mind--it's often hard to let go of a preconception and see what you're actually looking for. While, as Pasteur said, chance favors the prepared mind, a correctly prepared mind is able to look around and also see alternatives. An effective searcher needs to be both prepared and willing to give up on preconceptions.

Search on!

Wednesday, October 19, 2011

Wednesday Search Challenge (October 19, 2011): Who owns that piece of land?

The salt flats of south San Francisco bay are hard to believe when you first see them. They're vast geometric shapes that are often bizarre colors--red, maroon, yellow, green--depending on the time of year as you fly overhead. They change color depending on what kind of halophilic creature is currently living in the water. As the sun evaporates the bay water, the salinity goes up, and a variety of organisms take over at different degrees of saltiness.

They're also very controversial as they're the remnant of the marshlands and estuaries that once supported millions of migrating birds (and local, year-round fauna). But they were diked a long time ago, and the question these days is how to best use the marshlands.

A key player in these controversies over land-use policy is the giant chemical manufacturer Cargill. They've been in the bay for a long time, and own many of the salt evaporation ponds around the bay.

The other day as I flew into SFO, I passed over the south bay, and looking down from my window, I couldn't help but notice that one salt manufacturing facility looked slightly different than the others. I later found its GPS coordinates, 37.518372, -122.032700‎ (aka: +37° 31' 6.14", -122° 1' 57.72" ) and tried to look up which branch of Cargill actually owned that site.
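A quick aside on those two coordinate formats, since you'll often need to switch between them when pasting a location into a map or a search box. Here's a small sketch of the decimal-degrees to degrees/minutes/seconds arithmetic; the function names are made up for this example, and it doesn't handle the edge case where the seconds round up to 60.

```python
def to_dms(decimal_degrees: float):
    """Split decimal degrees into (sign, degrees, minutes, seconds)."""
    sign = "-" if decimal_degrees < 0 else "+"
    total_seconds = abs(decimal_degrees) * 3600
    degrees = int(total_seconds // 3600)
    minutes = int((total_seconds % 3600) // 60)
    seconds = total_seconds - degrees * 3600 - minutes * 60
    return sign, degrees, minutes, seconds

def format_dms(decimal_degrees: float) -> str:
    sign, d, m, s = to_dms(decimal_degrees)
    return f"{sign}{d}° {m}' {s:.2f}\""

print(format_dms(37.518372))    # +37° 31' 6.14"
print(format_dms(-122.032700))  # -122° 1' 57.72"
```

The fractional part of the degrees times 60 gives the minutes, and the fractional part of the minutes times 60 gives the seconds. That's all there is to it.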

Question for today: Can you figure out which branch of Cargill owns that location?

Search on!

Tuesday, October 18, 2011

Knowing what's possible still matters

One of the promises of a search engine is that it would make the information landscape flat. That is, you could do your search and you wouldn't need to know all of the fiddly little details about what databases are available and what each of them has in stock.

It hasn't quite worked out that way. I contend that while Google has done a great job of flattening, it hasn't--and cannot--remove the need to know what kinds of data are possible.

My friend Eisar had a fantastic G+ post that illustrates this point perfectly. In his post he pointed out the wonders of SFGenealogy.com, a site that collects lots of historical and archival materials about San Francisco and the Bay Area.

As his example, he pointed to their recent posting of telephone directories from San Francisco. They started with the directory from 1850 by Charles B. Kimball. Who knew that telephone directories had authors? Or that they used to have a preface. The 1850 preface begins "It is not to be expected, in a city like this, where whole Streets are built up in a week and whole Squares set up in an hour--where the floating population numbers thousands and a large portion of the fixed population lives in tents and places that cannot be described with any accuracy..."

The phone books (mostly the equivalent of Yellow Pages) have intriguing history built into them. In the 1907 edition, one year after the great San Francisco earthquake, many of the phone numbers are listed as "Temporary." No kidding.

They also have pages that list social clubs (such as the "Bunker Hill Association" to "inculcate a feeling of patriotism and commemorate the anniversary of the ever eventful Battle of Bunker Hill, June 17th..."), street crossings and associated building numbers, and of course, listings of ordinary people and their occupations.

Some of these are abbreviated in intriguing ways. "Stenog" for "stenographer" or "tmstr" for "teamster," etc. Some of the abbreviations are beyond my understanding. What is a "lab" or a "tmatr"? "Lab" probably isn't a "lab assistant" so I'm guessing it's laborer. But I haven't found the master index yet, so I'm not really sure.

What this means for search is that ....

1. you still need to know that such a directory exists, and exists online in a way that you can find it

2. once you find it, you need to know that some of the words you'd expect to find (e.g., "teamster") are actually written in a way that spelling-correction can't repair

3. sometimes terms are used in ways that you'd never expect--"gutta percha" is a kind of rubber (so if you're researching rubber vendors, you need to know terms like that)

4. concepts exist that you need to discover--"electric baths" or "Russian baths" or "French range" or that a "hand grenade" used to be a kind of fire extinguisher...

And... it's useful to know a bit about how to search. For instance, you can use site: to restrict your search to a specific year.
Example: [ site:www.sfgenealogy.com/sanfranciscodirectory/1885/ grenade ] will find the "hand grenade" fire extinguisher makers in the city.

If you'll notice, they've nicely organized their documents so you can get to a specific page in the directory. Look at the URL for the first hit: http://www.sfgenealogy.com/sanfranciscodirectory/1885/1885_537.pdf
The number after the underscore character is the page number. Hence, this is the directory from 1885, page 537.
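If you want to work with a batch of these results, both directions are easy to script. The sketch below builds the site:-restricted query for a given year and pulls the year and page number back out of a result URL. The function names are invented here, and the YEAR_PAGE.pdf file-name pattern is an assumption generalized from the one URL above; other directories on the site may be laid out differently.

```python
import re
from urllib.parse import urlparse

# Base path taken from the example query above.
BASE = "www.sfgenealogy.com/sanfranciscodirectory"

def directory_query(year: int, term: str) -> str:
    """Build a site:-restricted query for one directory year."""
    return f"site:{BASE}/{year}/ {term}"

def year_and_page(url: str):
    """Return (year, page) from a URL ending in YEAR_PAGE.pdf, else None."""
    filename = urlparse(url).path.rsplit("/", 1)[-1]   # e.g. "1885_537.pdf"
    match = re.fullmatch(r"(\d{4})_(\d+)\.pdf", filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

print(directory_query(1885, "grenade"))
print(year_and_page(
    "http://www.sfgenealogy.com/sanfranciscodirectory/1885/1885_537.pdf"
))  # (1885, 537)
```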

I'll give them kudos for their scanning as well. It's beautifully done (although I WOULD like a simple way to get to the full-text, rather than having to pull it from the PDF).

Tuesday, October 11, 2011

Confusion about the mythical safesearch operator

I figure one of my little jobs is to help people understand what's possible and what's NOT possible with Google search. To do this, I read a lot of blogs (a LOT of blogs) and try to extract the sense of what people are thinking about how search works.

Turns out there's a lot of misinformation out there about advanced Google search. Rather than take them all on at once, let's fix one thing at a time. Here's today's insight...

There is no SAFESEARCH: operator.

Yes, I know, there are about 1 million results for the search [ safesearch operator ], but I'm telling you. It doesn't DO anything.

You can see this yourself by comparing the queries:

[ safesearch:breast cancer ] and [ safesearch-breast cancer ] (I'm using this because it's the standard example used to "demonstrate" how it works.)

You can see that the results are the same. IF safesearch: was a real operator, you'd expect them to be different (because the hyphen wouldn't mean anything in that context, it's just ignored).

I'm pretty sure the way this got started is through a mis-reading of the CGI arguments used in the URLs passed to Google. A URL used to run a Google search for the search [ breast cancer ] will look something like this (some things elided):

http://www.google.com/search?....
sourceid=chrome&ie=UTF-8
....&q=safesearch%3Abreast#pq=safesearch%3Abreast&hl=en....&q=breast+cancer&....

But you can't quite reverse engineer operator arguments from the query string. It just doesn't work that way.
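To see why, it helps to pull a search URL apart. In the sketch below the URL is a simplified, made-up stand-in for the elided one above (the parameter values are illustrative only). Notice that whatever you typed, including "safesearch:", simply ends up as text inside the q parameter; nothing in the query string marks it as an operator.

```python
from urllib.parse import urlsplit, parse_qs

def query_params(url: str) -> dict:
    """Decode the CGI arguments (the part after '?') of a URL."""
    return parse_qs(urlsplit(url).query)

# Hypothetical, simplified example URL; not a capture of a real request.
url = ("http://www.google.com/search"
       "?sourceid=chrome&ie=UTF-8&q=safesearch%3Abreast+cancer&hl=en")

params = query_params(url)
print(params["q"])   # ['safesearch:breast cancer']
print(sorted(params))  # ['hl', 'ie', 'q', 'sourceid']
```

The argument names (q, ie, hl, and so on) belong to the URL; the text you type belongs inside q. Those are two separate vocabularies, which is why you can't guess operators from parameter names.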

In any case, this is a broken meme. Safesearch: doesn't exist. Use the safesearch setting instead!

Search on! (Safely.)

Friday, October 7, 2011

Answer: What are those things in the desert?

I first ran across these mysterious desert structures when I read an article about the Nazca lines in Peru and found an odd reference to the "Nazca-like lines in the Saudi Arabian desert." But there was no reference! So I did the obvious search, drawing on the analogy between the Nazca lines and the geo-reference of the Saudi desert:

[ Nazca lines in Saudi desert ]

Which led me down the rabbit hole of reading about strange geoglyphs around the world, of which these were probably the most interesting (at least at the moment as I was reading).

We had two readers send in answers... gasstationswithoutpumps and jpp both found the mysteries of the "kites" fairly easily. (I hope the rest of you did as well!)

As jpp wrote in his comment:

1. google maps "loc: 26.00053,40.48997" => "al hayit"
2. google images [al hayit ground] => http://www.abovetopsecret.com/forum/thread492292/pg1
3. From there I extracted the term: “works of the Old Men”
4. Google Search: [works of the Old Men] => Saudiaramcoworld.com

From that site:

"The most striking are the so-called “kites,” the remnants of long stone walls most likely built by groups of hunters to trap game; the walls outline the shape of a child’s kite. But the kites are huge: The “body” is a wall enclosing a corral-like space often 100 or more meters (328') across. The “tails,” two or more walls running out from the head, are typically each a few hundred meters long, but they can be as long as two or three kilometers (1.2–1.8 mi). On the ground, however, kites are almost impossible to find, because the walls, built of basalt boulders, are only about a meter (3') wide and their surviving height is seldom over half a meter, making them nearly invisible on a landscape already thickly strewn with the same rock.

They were apparently discovered in the late 1920s when airplane pilots were first flying over Saudi Arabia. The pilots thought they looked mostly like kites, although it's pretty clear that there are a large number of different shapes (rings, wheels, funnels, circles-with-tails, etc.)

Interestingly, this is about the same time the Peruvian Nazca plain lines were discovered.

Other references to the kites: www.LiveScience.com

I've been trying to find the original reference to the 1920s aviators who noticed the kites, but still haven't been able to track it down. Anyone have an idea? (Later: My friend Lee pointed out that I could find the original article in the first volume of the journal "Antiquity." The citation follows.)

Antiquity, v 1 , n 2, pp: 197–203, 1927
The 'Works of the Old Men' in Arabia
Flight-Lieutenant Maitland, Royal Air Force

You can't help but wonder how much other stuff is out there waiting to be found.

---------------------------------
Things I found nearby with a quick look around...

Circle w/ teardrop shaped-wall

Nearby, another circle/teardrop with bars across

Wednesday, October 5, 2011

Wednesday Search Challenge (10/5/11): What are those things in the desert?

To begin with, I still haven't figured out the symbolism of the Tyn Church multi-steeples. In fact, I haven't even found a decent explanation of why they're so complicated. We'll work on this going forward. Stay tuned.

Onto new business.

Today's search challenge is in this picture. Any idea what it is? (Or can you find the best possible theory about what it is?) I'll tell you a couple of things about it... The structures shown are ~2000 years old (give or take).

And I'll spare you searching the entire globe for this... it's at lat/long: 26.00053, 40.48997

There's a very nice solution to this little search challenge.

Can you find it? (To put it another way, it took me longer to write this post than to solve it!)

Search on!

Tuesday, October 4, 2011

Language for searching is subtle

In central London, just across the bus-clogged street from Victoria Station, there is a very modern newsroom. It's a large, open-plan space full of desks arranged like the spokes of a wheel pivoting around the central conference desk. Each line of desks has two monitors at each place, and an autumnal scattering of newsprint broadsheets all around, lending a sense of functional chaos to the orderly and ergonomically correct work stations. There’s a sense of new-media about the place. Lots of stories being written around the clock, many Twitter feeds examined, telephones everywhere and even a tiny studio just off on the side so new media journalists can do a quick standup video or audio recording when needed.

With the radial layout, it’s a bit of a panopticon design, shades of Jeremy Bentham and the all-seeing eye, although in this case it’s intended to help people on different desks work together efficiently rather than the pervasive monitoring of inmates as Bentham proposed.

I was in the newsroom to teach journalists a few of the finer points of search. Of course they all use Google every day, so everyone knew many of the basics, but once they were outside of their comfort zone, I realized once again that even the best investigative reporters know only a fraction of what’s really possible. One more time I saw that while they’re often great reporters and have a drive to get to the bottom of a story, even the young reporters tend to follow the tropes and patterns of previous years, and that limits them.

It’s been like that everyplace I’ve gone in the past two weeks travelling throughout Europe lecturing, teaching and giving press briefings. London, Dublin, Warsaw, Prague, Hamburg, Zurich… The raw data tells you a bit: 2 invited talks; 14 press briefings; 7 classes taught (5 for Googlers, 2 for journalists). That’s a lot to do in 8 working days. (And for the scariest piece of data: 40.2 hours of flight time. That is, forty hours in the aluminum-tube-that-flies.)

I did an interview on Czech television that was good fun, and then the next day I was in Hamburg, answering the same questions about what makes someone a good searcher on Google, but this time with a slightly more German twist. Sample question: “How can someone be the most efficient searcher possible?” That’s an interestingly engineering-style question. It’s very Googley, but also very different from questions I got from reporters in Dublin. There the questions were more about the user’s experience—“how can we be sure that the searcher is really happy with what they find?” I don’t mean to caricature, but there’s a reason the stereotypes are the way they are. Hamburg… Dublin… they have very different outlooks on life. I suspect they also search differently, although I didn’t do any studies to find out.

On the other hand, in the Swiss newspapers I became the “Chef für Kundenzufriedenheit,” literally, “chief for customer satisfaction,” which is not quite the way I think about myself, but I can see how they got that from “user happiness,” which IS in my job title. It’s a distinction that matters in English, but does it work that way in German? Don’t know.

These subtleties in language kept coming up again and again. One of the things I teach is the way to use additional search terms that describe the *kind* of thing you want to find. A nice example is to do an Images search for [bicycle diagram]:

which always gives you a page full of nice diagrams, each with the parts all labeled. Alas, when you do this in German, it doesn’t work. Turns out that the German word “diagramm” (that’s their spelling) has a slightly more limited meaning than “diagram” in English. (You have to use the German word “schema” to get labeled diagrams.) Likewise, a word like “serious” (“serious” or “seriös” in German) has several meanings in German. But you can’t say (in German) “he was seriously ill”; that sense of “serious” as “substantial in number or size” doesn’t carry over into German.

I shouldn’t have been surprised, really. I know all about false cognates (example: Spanish “dia” means “day” but has no relationship to “diary” in English, it just looks like it does). But somehow I had the impression that a relatively simple word like “diagram” (a word that IS a true cognate across language pairs) would also copy all of the subsenses of the word as well. Nope. Not true.

What’s so interesting to me is that search strategies that I thought would work across different languages don’t turn out to be very robust. When I tried the [
