Monday, May 30, 2011

The jonquil glade mystery / how do you know what you know?

Sometimes things make you immediately think, “whodunit?” 

Today as I walked through a nearby open space preserve I found a glade full of jonquils that were blooming with wild abandon.  They must have been planted some time ago; jonquils aren’t native to this part of the world, and they’re not transported by accident.  They were densely packed in no particular pattern, just the bulbs bumping together in an otherwise quiet grassy spot.  It was a very tight cluster of flowers, with an overwhelmingly transcendent scent as I stood there admiring them.

But they made me wonder: Why are they here?  There’s no way this is an accident, and it’s too far from anything else to be a gardener’s idea of a planting folly, something done just for the unexpected fun of it.  So  I started poking around through the woods by the jonquils, finding a bit of foundation here, an old water pipe there.  Clearly, this was the remnant of an old house that was no longer there. 

What was the story with the missing house? 

Back home I did a little web research.  Looking at the aerial photos of the place on Google Earth I saw blank spots on the hilltop.  A little more quick digging told me part of the story.  It turns out that there was once a ranch house there.  It was the “Casa Maximo Martinez,” a sprawling ranch home with six bedrooms, five bathrooms, a walk-in freezer, cavernous hallways, a huge living room with a fireplace, a large dining room with bay windows, and a swimming pool. The property was torn down in 1997 after attempts to make it into a youth hostel were thwarted by local residents (including former HP CEO John Young).   They didn’t like the idea of all those young ruffians hanging about in their neck of the woods.

The story goes that in 1833, Gov. Jose Figueroa granted one square league (around 3,500 acres) of foothills known as Rancho Cañada Corte de Madera to partners Domingo Peralta and Maximo Martinez.  After the death of his wife in 1834, Peralta sold his share to Martinez, who later enlarged the rancho to around 20,000 acres, including most of what is now Stanford.  (For his part, Peralta moved back to his father’s rancho, Rancho San Antonio, a few miles south, near where Interstate 280 intersects Foothill Blvd.)

The house was apparently built in early 1948 by John Marthens, and aerial photos I found on Google Earth from 1948 show the large house and a nearby oval racetrack that must have been for horse racing or training. 

So the jonquils must be the Marthens’ family garden.  From the looks of it, the patch hadn’t been tended in at least 20 years—that all makes sense. 

And then it occurred to me:  Sometimes there are things that make you think beyond “whodunit?” to “how’d I know that?” 

In this case, I’ve seen what happens to untended bulb gardens.  If you leave them alone for more than a year or two, the bulbs start budding daughter bulblets (also called “offsets” or “bulbils”) all around the perimeter.  Even if you start with the original bulbs 6 inches apart, in only 2 or 3 years, the space between them fills in.  Over the course of 20 years, the edge grows outward and makes a ragged perimeter and a pretty solid mat of flowers in the middle.  Of course this probably isn’t what your gardening soul wants (the flowers in the center get compressed and deprived, so their flowers aren’t as strong or large), so gardeners tend to divide and separate.

But I know this because I’ve seen bulb gardens be left unattended for years at a time.  So it’s by direct experience of daffodils and tulips. 

So how do I know about the history of the Rancho and the story of Domingo Peralta and Maximo Martinez?  The answer’s obvious... I just looked it up. 

But how do I *know* that story is true?   Or, more generally, how do you know that *anything* you’re told (or read) is true / correct / right?

I realize that there’s a huge literature on epistemology, the philosophy of how-you-know.  I’ve read some of that (I started doing my PhD thesis on the epistemology of knowledge representations), but I also find it mostly incredibly obscure, abstruse and abstracted from the question at hand… which is this—operationally speaking, HOW do I know something? 

It seems that it breaks down into direct knowledge (such as my understanding of how gardens of jonquils crowd together over time), and knowledge I’ve read / looked-up / been told… the *indirect* kind of knowing. 

Okay, stay with me now, this gets interesting right around here. 

HOW do I know to trust what I’ve read /looked-up/been told?  (I’m going to abbreviate this as RLBT, shorthand for “indirectly learned knowledge.”)  The answer is that we all develop an intuition about what we RLBT and how much to believe it. 

The problem is that this is largely automatic.  The process of RLBTing and deciding to believe something is so well-practiced that we do this constantly beneath the level of conscious perception.  When I hear something on Fox News, I’m immediately skeptical (without thinking much about why); when I read something in the NYTimes, I’m immediately believing (again, without much thinking about why). 

It makes sense that we operate this way.  One of the huge benefits of language is that we can learn from others’ experiences without having to go through those experiences (possibly dangerous or tedious, or both) ourselves.

But something fundamental has changed in the past 50 years.  Once upon a time it was fairly difficult to produce something that others would read (and I have a broad definition of “read” in mind, including all forms of publication).  Even something as simple as writing a pamphlet took a bit of gumption, money and time to do.  All of that acted as a kind of “credibility bar” above which writers would have to pass.  Want to publish a book?  You’ll spend a good deal of time (years!) actually writing the book, then you need to convince a publisher to front the money, then distribute the book, etc etc. 

Now, of course, the cost of production for written materials (including video, audio, etc.) is basically zero.  If you want to write a book, you can borrow a library computer, type something up and push it to a web site with almost no cost.  With a little polish, it can seem as valid as anything else out there on the web. 

So how do we evaluate the trust-worthiness of what we find now? 

In some sense, it’s the same as it’s always been—you’ve got to know what to accept based on two factors:  (1) does it come from a source with a positive reputation?  And, (2) does it cohere with what I already know about the world? 

The “positive reputation” part seems simple enough.  Have you heard of the source and do you have reasons to believe that what they’re saying makes sense?  Problem is, with the low cost of publishing, MOST of the sources that might be useful don’t have any reputation for you to evaluate.   For example, when I looked up what the daffodil bulblets are actually called (“offsets”) I found the University of Illinois Extension program website.  Should I trust them?  They’re a university, one I recognize, and I know from previous encounters that they actually have an “extension” program that specializes in this kind of information.  Great!  So I did a quick check on that term by searching for [ bulb offset ] and found that it’s in fact the correct term, LOTS of other sites use “offsets” to describe bulblets as well.  Consensus rules. 

Of course, not all sources have uniformly high credibility in all areas.  I trust my mother to tell me true and interesting facts about our family history, but I wouldn’t trust her explanation of cryptography one tiny bit.  Likewise, I often trust university websites for information about basic science questions (e.g., bulb propagation), but wonder about their non-scientific content, such as opinions about the Iraq war.  Maybe I’d trust their opinions and their data contained within opinion pieces, but only if it “makes sense” to me—that is, if it aligns with what I already know about the world.   (And, as you can imagine, not all university web sites are created equal.  I don’t believe anything I read about evolution from Oral Roberts University— 

“Coherence” is the second big piece of validation.  When we RLBT something,  we all instantly determine if it’s coherent with what we already know.  This evaluation can be shallow (“yeah, sounds the same…”) or deep (“all the points align”), but it happens pretty quickly.  And when it sounds the alarm, the concern sometimes takes some figuring out.  Why does this feel fishy?  What’s not right here?  What doesn’t *cohere* with what I already know? 

The trick is to learn when something’s out of alignment (not-cohering) in a way that’s fundamentally flawed versus something that’s out of alignment because you don’t have accurate pre-existing information.  This is the great role of education—to teach the skills that let you evaluate whether you’re learning something new and valuable, or if you should reject what you’re RLBTing as dissonant. 

For instance, when I looked up the bit about the Rancho Cañada Corte de Madera, I already knew a good deal about Spanish land grants.  As a side-hobby, I read histories of California and know that large grants of land were given to Californios (such as Peralta and Martinez) in the 1830s.   And I knew I could even probably track down the original claim.  Sure enough, even though my 19th century Spanish isn’t all that it could be, I know enough to recognize the landmarks on the map and know that it’s all about this part of the Bay Area.  

So it all fit—it *cohered* with what I already knew.  What’s more, the map is from a place I’ve heard about (Calisphere, run by the University of California’s Bancroft library).   

If you look at the image of the claim, there’s a museum label stuck on the right side of the image.  That’s credibility enhancing, as that’s pretty clearly a museum cataloger’s annotation. 

IF, on the other hand, the compass rose had the letter “W” marking the western direction, I’d be right to be suspicious.  (“West” in Spanish is “Oeste,” hence the letter “O” marking the western direction.)

It all fits.  So, without much thought, with the merest twinkle of my brain, the map, the claim, the name of the region… it all slips into my memory as a fully authorized belief.  I find it credible, and it’s written down in my neocortex.

On the other hand, on that same day I was looking at the patch of jonquils on the former Rancho Cañada del Corte de Madera, I also walked over a trail that has a huge number of clam shell fragments glinting white against the dark brown earth.  This had always puzzled me, as the trail is about 8 or 9 miles from the bay.  What were the clamshell fragments doing there?

I happened to run into two park rangers and asked them, “what about those clam shell fragments up on the trail?  You don’t suppose it’s a shellmound do you?” 

The answer came back a bit too quickly.  “Nope.”   One ranger gave the other a sidelong glance.  “No way there’s a shellmound up there.”  The second ranger agreed, again, a bit too quickly.  “No sir.  There never was a prehistoric village up there.  No way.” 

Uh… Okay.  Their hurry to tell me there wasn’t anything there made me deeply suspicious.  It wasn’t what they said, it was the *way* they’d said it—in a rush, a little dismissively, with a sense of wanting to move onto something, anything, else. 

They’re a credible source, they know a good deal about what’s in the open space preserve, but their way of saying it left a lot to be desired.  I had a few shell fragments in my hand to show them, but thought that discretion might be a good move at this point.  No need to press the issue.

The next day I went back to return the shells to the non-existent shell midden, located on a beautiful overlook above the bay.  It’s kind of a schlep to lug clams up that far, but the view is utterly worth the trek.  And besides, a couple of hundred years later, there would be jonquils in bloom just a bit downhill and to the left.

Believe me.

Friday, May 27, 2011

On writing good search questions

Writing a search puzzle turns out to be epistemologically interesting in a way that I'd never imagined.  

I'd always figured I could just sit down, ask a few questions, write them up and be done.  Oh no, that's not the way it works.  

Mostly, the problem boils down to What's true?  What's determinable?  and How do you know?  

In essence, the grand philosophical questions for all time.  But let's not discuss that here (at least not yet).  

These kinds of questions come up constantly when I'm doing web research.  Even a simple question such as "When and where was oil first discovered in California?" (SearchResearch 12/29/10) leads inexorably into a search for truth, or as much of it as we can find.  

In that search, you'll recall, the simple question unravels into niceties of definition--what do you mean by "first"?  What do you mean by "discovered"?  And while you can punt on many of these deeper issues if you're running a class in person, it's much harder to do it when you're trying to write a search challenge question.  They need clear answers and clearly resolvable web-findable answers. 

(I don't believe for a millisecond that all possible questions can be resolved by web-hosted content.  That's another topic for another day.)  

So, when you're writing a good search question you need to not only pose an interesting problem, but also demonstrate that a search path clearly leads to a solution.  Let me unpack that a little...

The best problems are interesting--I won't try to define it, but you know what I mean.  They're intrinsically interesting to a fairly broad range of people.  Abstruse and obscure issues of number theory generally don't appeal to more than a few people.  And "interesting" can be found in a number of ways.  I like to use "obscure connections" as a way to link topics together in ways you might not have thought about.  Example: In the AGAD problem of April 24th, 2011:

Rembrandt painted a philosopher looking at the bust of a Greek poet. The gold medallion on the chain represents another famous Greek. Who is it?

That's a nice linkage between Rembrandt, Aristotle (the philosopher) and Homer (the poet).  You might have known one, but you probably didn't know the other.  That's nicely interesting, and a good example of something that's simple to do with search, but tough to know as trivia.

Demonstrable means that there's a clear (and repeatable) sequence of search steps that gets you to the answer.  The best ones here also illustrate features of search skill that you might not know about.  We had one question (April 25, 2011) that required you to know about the Conversion feature in Google:  

If you came home from a trip with 150 South African rand, 350 Kuwaiti dinars and 200 Japanese yen, how much would you have in U.S. dollars?  

You can answer that one with the simple one-liner trick of [ 150 rand + 350 kuwaiti dinars + 200 yen in USD ] (or you could do it step-at-a-time, then add them all up with Google Calculator).
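For the curious, here is a small sketch of the step-at-a-time arithmetic: convert each currency to dollars, then add. The exchange rates below are made-up placeholders (not real quotes), and the function name is just mine for illustration; you'd look up current rates before trusting any total.

```python
# Hypothetical exchange rates, in USD per one unit of each currency.
# These numbers are invented for illustration and are NOT real quotes.
ASSUMED_USD_PER_UNIT = {
    "ZAR": 0.15,   # South African rand (made-up rate)
    "KWD": 3.60,   # Kuwaiti dinar (made-up rate)
    "JPY": 0.012,  # Japanese yen (made-up rate)
}


def total_in_usd(amounts, usd_per_unit):
    """Convert each amount to USD one at a time, then add the results."""
    return sum(amount * usd_per_unit[code] for code, amount in amounts.items())


# The leftover cash from the puzzle.
trip_leftovers = {"ZAR": 150, "KWD": 350, "JPY": 200}

print(f"{total_in_usd(trip_leftovers, ASSUMED_USD_PER_UNIT):.2f} USD")
```

The one-liner query does the same conversion-and-sum in a single step, using live rates instead of my placeholders.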

So when we write these questions, we're looking for interest, a teachable skill and a clear way to demonstrate the skill in question.  

I've gotten just a few questions from SearchResearch readers.  Anyone else want to give it a try?  A limited-edition AGoogleADay t-shirt maybe lies in your future if your question is picked! 

Write on! 

Wednesday, May 25, 2011

Wednesday Search Challenge (May 25, 2011): Make a great question, win a great t-shirt!

Okay, so technically speaking, today's Search Challenge isn't a challenge of the kind we've had traditionally.  Nope.  Today's challenge is for real stakes!  

As you know, the daily search puzzle has been running since April 11th.  We're really happy with the way it's going, but it's a bit of a black hole, consuming lots of time and energy as we race to write questions well ahead of publication date.  

So I'm hoping you can help out.  

The challenge is simple:  If you write a question that we can use on AGoogleADay, we'll send you one of our very rare, very limited edition AGoogleADay t-shirts!  

You've already seen the AGAD puzzles, so you know how they work.  Short, easily understood problem; an answer that's easily verifiable and (to fit into the master plan!) teaches something about how to search along the way.  

Turns out that writing these questions/answers isn't exactly easy.  While it's easy to write really trivial (but uninteresting) questions, it's harder to make questions that are verifiably correct via a search engine AND have an answer that's succinct and to-the-point.  The puzzler needs to be able to recognize a correct answer when they see it, and our automatic "correct answer" code needs to be able to determine that what someone enters is in fact the same as the right answer.

Thus, questions like:  "Give three causes of the Spanish-American war" aren't great--both because it's difficult to actually tell if you've got the right answer, AND it's hard to know if what you typed into the tiny answer box actually IS a correct answer.  
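To make the answer-checking problem concrete, here is a minimal sketch of how a checker might compare what someone types against a list of accepted answers. This is my own illustration, not the actual AGAD code; the function names and the accepted-answer list are hypothetical.

```python
import re
import unicodedata


def normalize_answer(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    no_punct = re.sub(r"[^a-z0-9\s]", " ", no_accents.lower())
    return " ".join(no_punct.split())


def is_correct(typed, accepted_answers):
    """True if the typed answer matches any accepted answer after normalizing."""
    wanted = {normalize_answer(answer) for answer in accepted_answers}
    return normalize_answer(typed) in wanted


# Hypothetical accepted answers for a short, checkable question.
accepted = ["30 mph", "30 miles per hour", "30"]
print(is_correct("  30 MPH! ", accepted))
print(is_correct("forty", accepted))
```

A short answer with a handful of accepted spellings can be checked this way.  An essay-style answer ("three causes of...") can't, which is exactly the problem.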

See what I mean?  

If you have a question and answer you think will work, please send it to me at drussell+AGAD@ work.  (The +AGAD will let me filter your answers into a labeled file so I won't get overwhelmed with emails.)  

Make sense?  If so, send me your questions (and answers--while I love to search, I can't spend ALL my time checking your answers).  

Anyone sending homework problems will be severely reprimanded!  

Start writing!  

On Friday I'll comment on some of the problems I've received, and talk about how to improve the questions.  

Good luck! 

Tuesday, May 24, 2011

Answer: How can you find a book by its color?

Whoa!  How did it get to be Tuesday of the next week?  

Sorry about the delay in getting the answer to you.  Between the activity and a few other things going on, it's been a bit since I've been able to get back here to till the fields of SearchResearch.  

So, a quick answer to the Challenge from last week:  A kid walks up to you and says:  "I'm looking for a book about Rosa Parks, but I don't remember the title.  All I DO remember is that it's got a picture of her on the cover... and the cover is mostly green."

As I found out from friends, this specific problem  is actually fairly easy to do on regular Google Book search.  (Just search for books about "Rosa Parks" and you'll see the covers.  As it turns out, the green book is in the first set of books.  Unlucky for me, lucky for you!) 

What I meant to point out was the trick of using Google Image search to find books by various image attributes.  That is, you can go to Image search, search for [ Rosa Parks book ] then filter by color (or face, or lineart, or whatever option is there).    

Several people wrote to say they'd found the answer quickly.   Nice. 

Search lesson:  Sometimes doing a search for something in Images is a great idea... especially if it's a visual property you're trying to search on (such as "green book cover").  

Okay... tomorrow... a more difficult challenge!  One worthy of all you uber-searchers out there!  

Search on!

Thursday, May 19, 2011

Wednesday Search Challenge (May 18, 2011): How can you find a book by its color?

Again with the running a day late!  

Sorry about that... I ended up doing 32 press briefings between 4AM and 9AM on Tuesday morning, and that had the side-effect of giving my regular Wednesday schedule an entirely new look and feel!

Today's Search Challenge isn't that hard--once you figure out the trick.  It's pretty simple if you know.  

A kid walks up to you and says:  "I'm looking for a book about Rosa Parks, but I don't remember the title.  All I DO remember is that it's got a picture of her on the cover... and the cover is mostly green." 

Can you find the book?  Who's the author?  

0 - 1 minute:  Über-searcher 
1 - 2 minutes:  Power-searcher 
2 - 5 minutes:  Need to learn a few new tricks!
> 5 minutes:  Read SearchResearch more often!!!! 

Search on! 

Monday, May 16, 2011

Synthesizing knowledge with Google Spreadsheets

Sorry about the delay in answering.  I spent last week at the Computer-Human Interaction conference in Vancouver, hanging out with my user-experience colleagues and enjoying Vancouver's flowers and waterfront! 

As you recall from last week, my friend asked:  "Can you tell which of the Fortune 200 companies have been around for more than 100 years?"

It's a great question, but not an easy one.  Here's how I solved it.  I followed much the same process as Fred Deventhal (see his comments from last Wednesday).   Nice job, Fred! 

Another regular reader, Ross Nelson, tried to solve this using Google Squared. That approach almost worked, but fell apart because you don't have quite enough control over how it works. 

My solution... 

After poking around for a while with queries like [ Fortune 500 start date ] and [ Fortune 500 1700..1910 ] I realized that the start date of a company is typically called its "founding" -- as in, AT&T was founded on ....  

Tried that for a while, and got nowhere either.  So. HOW can I get the founding dates?

I knew that for any ONE company I could easily look it up (say, on Wikipedia, or the corporate web site).  But doing that repeatedly would quickly grow old.  How about if I start with a list that's easy to get--just the names of the Fortune 500, from which I'd pull off the top 200.

Changed my query to [ list of Fortune 500 ].  That worked well and got me to this:

Now I've got the list.  I copied/pasted the top 200 company names into a Google Spreadsheet ("Fortune 200 Spreadsheet")  and then wrote a simple GoogleLookup function to take the company name and lookup (using Google) the "founded" date.  

If you look at column C (the "Date Founded") column, you'll see each formula looks like this: 

=googlelookup(B2, "founded")

This is the Google Spreadsheet Lookup function that searches for the "Founded" property on each of the company names from column B.

As you can see, it actually worked pretty well. I had to go and fix up only a couple of the entries (which returned the date plus a bit of extra text). The fixed-up dates are in column D ("Int version" -- where "Int" stands for "integer").
A little calculation later, and you get the years-as-a-company column (E).
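If you'd rather do the fix-up and the age calculation outside the spreadsheet, here is a rough sketch of the same idea. The sample rows, the reference year of 2011, and the function names are my own assumptions for illustration; the real sheet has 200 rows and the lookup results will vary.

```python
import re

# Illustrative rows only: (company, raw "founded" text as a lookup might return it).
SAMPLE_FOUNDED = [
    ("General Electric", "1892"),
    ("Ford Motor", "June 16, 1903"),
    ("Walmart", "1962 in Arkansas"),
    ("Microsoft", "1975"),
]

REFERENCE_YEAR = 2011  # assumed "now" for the age calculation


def parse_year(raw):
    """Pull the first plausible four-digit year out of messy text, else None."""
    match = re.search(r"\b(1[6-9]\d{2}|20\d{2})\b", raw)
    return int(match.group(1)) if match else None


def years_as_company(rows, reference_year):
    """Return (company, age in years) for every row with a usable year."""
    ages = []
    for name, raw in rows:
        year = parse_year(raw)
        if year is not None:
            ages.append((name, reference_year - year))
    return ages


def older_than(rows, reference_year, min_years=100):
    """Names of companies that have been around for more than min_years."""
    return [name for name, age in years_as_company(rows, reference_year)
            if age > min_years]


print(older_than(SAMPLE_FOUNDED, REFERENCE_YEAR))
```

The regular expression plays the role of my hand fix-ups in column D: it keeps the year and drops the extra text.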

If you look at the right side of the spreadsheet, you'll find two charts I made for fun. The first one ("Years as a company") shows the distribution of companies by age. The downward trending slope is what you'd expect if there were companies founded roughly every year that made it to the Fortune 200. So that's not a surprise.

I then made a version with companies grouped by decade using the Spreadsheets histogram function.
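The decade grouping is easy to sketch in code as well. This is just an illustration of the bucketing idea, not what the spreadsheet does internally; the sample years and the function name are made up.

```python
from collections import Counter


def companies_per_decade(founding_years):
    """Count how many founding years fall into each decade (1900, 1910, ...)."""
    decades = Counter((year // 10) * 10 for year in founding_years)
    return dict(sorted(decades.items()))


# Made-up sample of founding years, for illustration only.
sample_years = [1892, 1903, 1908, 1962, 1969, 1975]

# Print a tiny text histogram, one row per decade.
for decade, count in companies_per_decade(sample_years).items():
    print(f"{decade}s: {'#' * count}")
```

Integer division by 10 (then multiplying back) is the whole trick: 1903 and 1908 both land in the 1900 bucket.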

Search lesson: When you've got a search task that requires looking up a lot of very similar pieces of data, consider using a tool to help out. This one worked pretty well. In future posts I'll talk about more complex ways to do similar kinds of searches over larger, more diverse data sets.

Search on!

Thursday, May 12, 2011

Wednesday Search Challenge (May 9, 2011): Which companies are more than 100 years old?

My friend Alison asked a simple question:

"Can you tell which of the Fortune 200 companies have been around for more than 100 years?"

You wouldn't think it'd be that hard.  But I found it using an interesting method that you might not know about.  

Clearly, you have to figure out what the Fortune 200 are.  That's not too difficult—it's just the current Fortune 500 list ordered by revenue—take 200 off the top and you've got it.  

Once you've got that list, just figure out when all 200 companies were started, and then you'll know.  

Here's the trick:  you probably don't want to do this by hand, one company at a time.  So... how can you search for all 200... at once? 

Tomorrow's answer will reveal all!  Note:  It took me about 20 minutes to solve this.  

Search on!  

Monday, May 9, 2011

Getting the answer right

As you can imagine, finding answers is easy--but showing that they're right is another thing.

In the daily puzzle, we're constantly learning what makes for a good puzzle (short, easy to state, clear method for solution, unambiguous answer).  And it's that last point that makes life in a Google-world much more difficult for puzzle-writers.

Last week, for example, we gave the following puzzle:

This is a nice puzzle.  It's short, there's a pretty clear method to solve it... but it's a little ambiguous.

What we meant to say was:  "You are standing in the farthest west incorporated U.S. town in the lower 48 states with a population of one person.  What is the official, posted speed limit?"

But, as you can see, it's a longer, less appealing question.  Yes, there are towns in Alaska and Hawai'i that have 1 resident, but in our defense, they're all unincorporated (at least the ones I found).

It's not a great question, though, because--who knows?--since we wrote this question, a town might have turned into a population-size-one town, and the answer wouldn't reflect the current state of affairs.  Rats.

So it's a bit tricky to write these kinds of questions.  Especially when there are multiple potential ways of answering the same question.  For instance, many puzzlers wrote in to point out that Quirky Travel Guy said the speed limit was 40 mph.  We report the speed limit as 30.  What's up with this?  

The big, fundamental point I would love to make in the puzzle is that while people can FIND answers, they need to also learn to evaluate their level of credibility.  A self-admitted "quirky travel guy" (while very entertaining), is not nearly as compelling as seeing the speed limit sign at the edge of town via StreetView.  (Use the plus-widget in the map streetview panel below to zoom in on the speed limit sign.)  


And that's the drawback of the short-form of the puzzle.  I'd love to be able to write longer back-stories like this, but can't quite fit it into the 420 character limit of the answer space!  

Search lesson:  Always second-source your findings, and when in doubt, choose the one that seems more likely to be true in the long run.  

Lesson for teachers:  There are always interesting other interpretations of your questions.  Think about those when designing your test questions!  (And, most importantly, try out your test questions before assigning them to students.  Writing good questions is an art!)  

Thursday, May 5, 2011

Answer: How common are animal-adjectives?

There are many ways to answer this question, and in his comment, Hans goes from [ frequency lists ] to Wiktionary to the Project Gutenberg word count lists.  

From the search point-of-view, that's not a bad search, but here's what I did for contrast.  

I started by framing the problem in the most obvious way I could think of with the query: 
[ word frequency over time ] 

Yes, I know the term "frequency" has a notion of time built into it, but I was searching for a web page that would cover the changes "over time" in relative term use.  That's why I included "over time" in the query.  In essence, this is a two-part query:  "word frequency" and the idea of "change over time."  I left out "change" because I thought that would mess up the bigram whereas "over time" would be frequent enough to be useful.

As I sifted through the results, I noticed something I should have thought about straight-away:  the Google NGRAM Viewer project!  This is a marvelous "digital humanities" tool that would give me exactly what I wanted in one convenient package!  

In this case, once I'd found the right tool, the problem becomes trivial.  I did a comparison query of  piscine, lupine, vulpine, and bovine to generate the following chart.

Why so common?  Clicking through on the links at the bottom of the page and comparing hits from the 1900s vs. the 1980s shows why: Agriculture and disease.  The first bump in the 1910s is agricultural... but then "bovine spongiform encephalopathy" starts the rise to bovine's current frequency levels.

In this case my suspicion about the relative occurrences of the terms is confirmed.  Bovine really IS much more frequent than the other terms.  It's SO frequent that I had to remove it from the query to see more of the details on the other terms.

Still... lupine is rare.  But why does it dominate the other two "animal adjectives"?  My best guess from looking at the books in the hit list is that this is the period when lupine (the plant) began to be taken as a serious topic of horticulture.

(Among the other things I discovered while writing this post was that "Current Minnesota recommendations are that white lupines are unacceptable for growing pigs (under 225 lbs). A 1988 Minnesota study reported a 2% reduction in feed intake for each 1% lupine in the diet..."  Alternative Field Crops Manual, "Lupine," Nov. 1997.  Or, in other words, "lupines not suitable for porcines.")  

So the search lesson is clear--when you find a tool that does what you want, use it.  And in fact, search for the tool first.  You could save yourself a large amount of time! 

Search on!

Wednesday, May 4, 2011

Wednesday Search Challenge (May 4, 2011): How common are adjectives describing animal-like characteristics?

In yesterday's post I talked about writing puzzles.  But the Wednesday search challenges here are somewhat different--they're (mostly) real search problems that people have had.  Usually I'll be talking with someone and they'll say "gee... I could never find out about foo... " and away we go into another challenge.

As you know, I love words.  Recently a friend and I got into a discussion about how common various kinds of adjectives are.  The easy ones are easy to estimate:  red is common, angry is common.  But what about others?  How common is something like florid?  

When word people talk, they often end up talking about extreme cases--and this conversation went that same way.  "What about animal-adjectives like piscine, lupine, vulpine, or bovine?"  

You might think this is just so much how-many-angels-can-dance-on-a-pinhead kind of discussion... but no... gauging relative term frequency (the fancy way to say how common a word is) is a very useful bit of knowledge when constructing search queries.  If a term, say vulpine is very common, then you'd wonder if it means something other than what you might think.  You might wonder if there's a common usage that's unknown to you.  
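If "relative term frequency" sounds abstract, here is a toy sketch of the idea: count how often each term appears, divided by the total number of words. The sample text and the function name are invented for illustration; a real estimate needs a very large body of text, not three sentences.

```python
import re
from collections import Counter


def relative_frequency(text, terms):
    """Share of all words in the text accounted for by each term."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    total = len(words)
    return {term: (counts[term] / total if total else 0.0) for term in terms}


# A tiny made-up corpus, just to show the arithmetic.
toy_text = ("The bovine herd grazed. A bovine disease spread. "
            "The vulpine grin faded.")

for term, share in relative_frequency(toy_text, ["bovine", "vulpine", "piscine"]).items():
    print(f"{term}: {share:.3f}")
```

Comparing the shares, rather than raw counts, is what lets you say one word is "more common" than another even when the texts are different sizes.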

So this all leads to today's search challenge:  What is the relative frequency of the animal-adjectives piscine, lupine, vulpine and bovine?

And... because we're word people here, how has the relative frequency changed over the past 100 years?  For instance, was the word "vulpine" much more common in the past?  (It's relatively rare now.)  

To spare you the lookups: 

piscine... like a fish
lupine... like a wolf 
vulpine... like a fox 
bovine... like a cow 

Any ideas? 

Search on! 

Tuesday, May 3, 2011

Making up search problems -- more complex than you might think

The puzzles aren't easy to write.  I'd wager they take longer to write than to solve.  The problem is getting the right balance of interesting topic, having a clear solution and actually teaching something interesting to the puzzler.  You see, the goal of AGAD really is to illustrate the way search works, largely by example.  

For instance, today's puzzle: 
is actually fairly simple.  It's on an interesting subject area and illustrates an important search concept, to wit:  choose your search terms carefully.  

Search beginners make beginner mistakes, often including too much in their query.  Queries like [ hottest cousins ] or [ hottest cousins rank 10000 ] are doomed to be unsuccessful--they're too generic (or, in this case, probably NSFW unless you have safe search turned all the way up!).

But a wise searcher will recognize that the term "SHUs" is an odd thing.  You might not recognize it (in which case a quick [ define:shu ] will yield an appropriate definition), so it's probably material to the investigation!  

I solved this with a simple [ 10000 SHUs ], and scanned the snippets.  It's only Tuesday, how hard do you want it to be?? 

But if you're a teacher making up search problems like this, figuring out how to write the problem can be tricky. 

I generally start on a topic and then explore around for interesting side-bits of information.  I actually keep a notebook in my pocket and write down good ideas as I run across them in the real-world.  I see a blimp passing overhead and start to wonder... what do I not know about blimps... a few queries usually reveal a world of information that can be converted into a puzzle question for teaching search skills.  

A common mistake that puzzle-writers make is to pick on some piece of information that's SO obscure that nobody will know it.  (Example:  "what's the 1,000th decimal digit of pi?")  Nobody knows it because it's effectively random.  Boring.  

But if you start to wonder about blimps--what gas IS used in blimps?--you'll find out something interesting.  If you then start to find out something interesting about helium... then you've got two parts to an interesting puzzle.  I'd link those together into a puzzle something like this:  "There have been two gases used to float blimps.  We're not about to run out of one kind of gas that was used to lift blimps into the sky, but we ARE going to run out of the other gas.  What is that naturally occurring gas that we're in danger of exhausting?"  

It's a naturally interesting question, and teaches you something along the way... the way all great teachers do.  

Search on!   And teach along the way! 

Monday, May 2, 2011

Trick of the day: to simplify inserting special characters

When you write or edit, you often need to insert a special character into your text.  How do you do it?  If you're like me, it's probably by dropping into a special insert symbol sub-mode, which is fine, but sometimes laborious.  In MS Word, for example, you can use their Insert>Symbol menu choice, but then you're presented with ALL of the characters.  Yet, most often you need to insert just curly-quotes, plus-or-minus, the Yen symbol, or equivalent.

I found a web site that's marvelous: -- it does just what it says.  When you click on a symbol, it puts it into your keyboard copy-buffer.  That means you can now go back to your editor (say, Gmail message window) and Paste (Control-V) that character.  VERY simple, VERY nice.  

I usually keep it in a tab somewhere handy.  (FWIW, I also shrink it slightly to make sure it completely fits onto my screen... a single Control-minus shrinks it just a bit.)

Also note that if you hold down the ALT key, you can click on multiple characters to get a few chars all at once.  (Example:  ≠~÷≈∞ )  Handy when you know you're going to use all those characters in a single message.  

And for people who write HTML, note that you can get the HTML equivalents for any of the characters!  Click on the "As HTML" button at the top of the page, and you can generate things like &iquest; (for the Spanish 'inverted question mark' symbol, ¿).
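If you ever need the HTML equivalent without the web page handy, Python's standard library can do the same mapping. This is a small sketch of my own (the helper name is hypothetical), using the standard html module; it falls back to a numeric reference when a character has no named entity in the library's table.

```python
import html
from html.entities import codepoint2name


def to_html_entity(char):
    """Return a named HTML entity for the character, or a numeric one if unnamed."""
    name = codepoint2name.get(ord(char))
    return f"&{name};" if name else f"&#{ord(char)};"


print(to_html_entity("¿"))        # → &iquest;
print(html.unescape("&iquest;"))  # → ¿
```

Going the other way, html.unescape turns an entity back into the character, which is handy for checking that you typed the entity name correctly.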

You know and I know that search engines don't work so well to find special characters.  Generally, special characters are dropped from the search string.  The exceptions are special characters used as operators (e.g., [ salsa -dancing ] or [ +joiker music ] ), plus the hash character (#) and ++ (to handle C++).

Write on!