## Friday, May 30, 2014

### Answer: Four (six!) small challenges

As I said, THAT was interesting.

I'm trying to develop a method for determining what people know about searching. One of the big problems is that you can't just walk up and ask someone "Tell me everything you know about searching!"  That's guaranteed to fail.

What's more, you can't even ask them specific questions.  If you say "When would you use a feature like filetype: ?"  a smart person can more-or-less figure it out from the question itself.

In general, figuring out what people know about a topic is pretty hard.  So I came up with this idea of asking "how long would it take you to solve this question?"

I was hoping that if you didn't know how to solve the challenge, you'd estimate a very high number ("it would take me 20 minutes to do that").  Contrariwise, if the challenge was thought to be simpler, you'd estimate a lower amount of time ("that would take 1 minute").

Or at least that's what I thought.  So I posed a question, had you estimate how long it would take, and then had you solve the challenge and report back about how long it actually took to solve it.  As usual, the reality is much more interesting.

The first challenge had four questions:

1A.  How long would it take you to find a picture (any picture will do) of the current Princess of Norway?
2A.  How long would it take you to find data describing the unemployment rate of Santa Clara County (California) for the past 10 years?  (You're looking for a table or a chart containing the data.)
3A.  How long would it take you to find a picture of Neil Armstrong standing on the moon that was published in The Guardian newspaper sometime in the past 30 years?
4A. How long would it take you to figure out how many times the name "Lothario" appears in the book "Moby Dick"?

The first two were intentionally simple.  I did that so you'd get the idea of how the survey worked, and would have a couple of successes.

Questions 3A and 4A were supposed to be harder.

Unfortunately, they were also pretty simple.

1A.  Finding a picture of the current Princess of Norway is easy. A search like: [ Norway Princess ] will do it.  Just go to Images, and there she is.  There are lots of pictures of Mette-Marit, Crown Princess of Norway, and Princess Märtha Louise of Norway.  Mette-Marit married into the family, while Märtha Louise is the only daughter of King Harald V and Queen Sonja.

2A.  How long would it take you to find data describing the unemployment rate of Santa Clara County (California) for the past 10 years?   I thought that this might be a little trickier than it was.  Turns out that the obvious query [ unemployment rate Santa Clara county ] triggers the Google Public Data Explorer onebox, which gives you everything you'd want.

3A.  How long would it take you to find a picture of Neil Armstrong standing on the moon that was published in The Guardian newspaper sometime in the past 30 years? This was supposed to require some clever date restriction filtering.  (That's the way I solved it!)  But I didn't check the simple, obvious, and straightforward query [ Neil Armstrong moon Guardian ].  Sure enough, a nice image shows up on the first page.  (In retrospect, I should have asked for the image to be printed in the Guardian newspaper in the 1970s.  That would have made it a bit more tricky.)

4A. How long would it take you to figure out how many times the name "Lothario" appears in the book "Moby Dick"?

This question was interesting.  This was the first question where I asked for an answer.

4C.  And how many times DOES the name "Lothario" appear in "Moby Dick"?

You were supposed to type in an answer based on what you found.  Interestingly, 27% of the respondents got the answer wrong.  The correct answer is 2 (while all of the wrong answers were only 1; they missed the second appearance because it was hidden below the fold).

I found this by downloading the full-text of Moby Dick from Project Gutenberg and then opening this text file in Chrome and searching for "Lothario."  You'll find that name mentioned twice.
"...he cannot keep the most notorious Lothario out of his bed; for, alas! all fish bed in common..."
and a few paragraphs later
"Gently he insinuates his vast bulk among them again and revels there awhile, still in tantalizing vicinity to young Lothario..."

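For anyone who would rather script the count than use Control-F, here is a minimal sketch in Python. It is not the method I used (I just searched in the browser). The two passages quoted above stand in for the full book, and the helper name `count_word` and the file name in the final comment are made up for illustration.

```python
import re

def count_word(text: str, word: str) -> int:
    """Count whole-word, case-sensitive occurrences of `word` in `text`."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text))

# The two passages quoted above, used here as a stand-in for the whole book.
sample = (
    "...he cannot keep the most notorious Lothario out of his bed; "
    "for, alas! all fish bed in common...\n"
    "Gently he insinuates his vast bulk among them again and revels there "
    "awhile, still in tantalizing vicinity to young Lothario..."
)

print(count_word(sample, "Lothario"))  # 2

# With a downloaded plain-text copy (hypothetical file name):
# count_word(open("moby_dick.txt", encoding="utf-8").read(), "Lothario")
```

Counting every match in the raw text avoids the trap of a search interface that shows only one hit per snippet or page.
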
In looking at the estimates of time vs. the actual time required, it became clear that the guesses were... interesting.  Mostly people guessed that the tasks would take longer than they really did.  Guesses and Actuals ranged from:

Ranges:
1.  Guess:  5 seconds to 5 minutes.       Actual:  2 seconds to 2 minutes
2.  Guess:  8 seconds to 15 minutes.    Actual:  5 seconds to 16 minutes (1 outlier at 51)
3.  Guess:  6 seconds to 25 minutes.    Actual:  5 seconds to 28 minutes
4.  Guess:  12 seconds to 30 minutes.  Actual: 11 seconds to 10 minutes

Medians:
1.  Guess:  1 minute.    Actual:  10 seconds
2.  Guess:  5 minutes.  Actual:  1 minute
3.  Guess:  2 minutes.  Actual:  1.75 minutes
4.  Guess:  3 minutes.  Actual:  1.5 minutes

This tells me that we're not especially good at estimating how long a search task will take.  Many people were off by a factor of 2 or more!  (For question 1, the median guess was six times the median actual time.)
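To make the size of the overestimate concrete, here is a small sketch that divides each median guess by the median actual time from the table above. The only inputs are those medians, converted to seconds; the names `medians_seconds` and `overestimate_factor` are just illustrative.

```python
# Median (guess, actual) per question, in seconds, from the Medians table above.
medians_seconds = {
    1: (60, 10),     # 1 minute vs. 10 seconds
    2: (300, 60),    # 5 minutes vs. 1 minute
    3: (120, 105),   # 2 minutes vs. 1.75 minutes
    4: (180, 90),    # 3 minutes vs. 1.5 minutes
}

def overestimate_factor(guess: float, actual: float) -> float:
    """How many times longer the guess was than the actual time."""
    return guess / actual

factors = {q: overestimate_factor(g, a) for q, (g, a) in medians_seconds.items()}

for q, f in factors.items():
    print(f"Question {q}: guess / actual = {f:.2f}")
# Question 1: guess / actual = 6.00
# Question 2: guess / actual = 5.00
# Question 3: guess / actual = 1.14
# Question 4: guess / actual = 2.00
```

Keep in mind these are ratios of medians, not the median of each person's own ratio, so they describe the group rather than any individual searcher.
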

This is also what prompted me to add two additional questions--ones that I thought would be much harder than the previous four.  I was thinking that maybe, just maybe, these questions were way too easy.

So I added questions 5 and 6.  (Which I unfortunately labeled as 1 and 2 on the second survey.  Here I'll just call them numbers 5 and 6.)

5A.  How long would it take you to find a picture of Rosa Lubienski's daughter?
6A. How long would it take you to figure out how many times the name "Absalom" appears in the King James Bible?

These were significantly more difficult.

5A.  How long would it take you to find a picture of Rosa Lubienski's daughter?
To solve this you first have to figure out who "Rosa Lubienski" is (turns out she's an actress whose stage name is "Rula Lenska," an English actress of Polish descent who became well-known in the 1970s).  Once you know that, you can find her daughter's name, Lara Parker Deacon, and Google Images has plenty of pictures of her.

This is a multi-step problem; interesting, but not super-difficult. By contrast, the last question IS fairly tricky, even though it looks very similar to challenge #4 above.

6A. How long would it take you to figure out how many times the name "Absalom" appears in the King James Bible?

Again, I did the same thing as before.  Looked for the King James Bible on Gutenberg, downloaded it, and used Control-F in Chrome to find that "Absalom" appears 109 times.  (Although because there are variant versions of the KJB, I also scored 108 as correct.)

Several people used a Bible search site (www.KingJamesBibleOnline.org) to search for Absalom.  They're the people who answered that the word occurs 90 times.  (Note to those folks:  That site counts the number of verses in which the word occurs, not the number of times the word "Absalom" appears.   You have to read carefully!)
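The verse-versus-occurrence difference is easy to see in a few lines of Python. This is only a sketch on a tiny sample: the first entry is 2 Samuel 18:33 as I recall it from the KJV, and the other two entries are made-up placeholder lines (not real verses) so the two counts diverge. The function name is hypothetical.

```python
import re

def verse_and_word_counts(verses: dict, word: str) -> tuple:
    """Return (number of verses containing `word`, total occurrences of `word`)."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    per_verse = [len(pattern.findall(text)) for text in verses.values()]
    verses_with_word = sum(1 for n in per_verse if n > 0)
    return verses_with_word, sum(per_verse)

verses = {
    # Real verse; the name appears three times in it.
    "2 Samuel 18:33": (
        "And the king was much moved, and went up to the chamber over the "
        "gate, and wept: and as he went, thus he said, O my son Absalom, my "
        "son, my son Absalom! would God I had died for thee, O Absalom, my "
        "son, my son!"
    ),
    # Placeholder lines for illustration only, NOT Bible text.
    "toy verse A": "A placeholder line that mentions Absalom once.",
    "toy verse B": "A placeholder line that does not mention the name.",
}

print(verse_and_word_counts(verses, "Absalom"))  # (2, 4)
```

A site that reports the first number (verses) will always come in at or below the second (occurrences), which is exactly the 90-versus-109 gap here.
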

This time around:

Ranges:
5.  Guess:     5 seconds to 20 minutes.    Actual:  27 seconds to 11 minutes
6.  Guess:  20 seconds to 10 minutes.    Actual:  20 seconds to 10 minutes

Medians:
5.  Guess:  180 seconds.    Actual:   96 seconds
6.  Guess:  180 seconds.    Actual:   85 seconds.

This doesn't look inspiring, but here's the thing that's not in the guesses and timing tables....

Everyone did questions 1 - 4; mostly correctly.  They estimated high, but were able to do the challenge quickly.

But questions 5 and 6 were much trickier.

Question 5 (Rula):  Of the 5 people who said it would take 10 minutes or more, 4/5 of them were not able to do the task at all.  In general, people who said that it would take 10 minutes or more found it really did take that long, or that it was impossible.

Question 6 (Absalom):  Only 32% of the people who answered the "Absalom" question actually got it right.  There was no correlation between the time they estimated, the time they actually took to solve the problem, OR their accuracy.

My takeaways:

1.  It's hard to estimate the difficulty of a search task.  Regular readers of SRS know that something might LOOK simple, but turn out to be hard; and vice-versa--it looks hard, but turns out to be simple.

2.  On the other hand, searchers who practice can estimate time-to-answer better than people who don't practice.  In another study, I asked 500 people the "Absalom" question.  90% of them said it would take around 1 minute to do.  Really?  I suspect that people who don't search much have a distorted view of what's possible to find online.  (They're a bit over-optimistic!)  I had a smaller sample of non-SRS people do the "Absalom" problem.  They were also all over-optimistic.

3.  But when something really IS hard, and LOOKS hard... it usually is.   Again, experience really helps when estimating.  This is an underappreciated metacognitive skill (that is, the MC skill of understanding how hard a task will be, and what you need to do to complete it).

4.  People still don't check their work.  The "Absalom" question is a good example of this.  For the people who used the KingJamesBibleOnline web site, the number the site reports is correct.... but those 90 hits are measured in VERSES, not in instances of the term "Absalom"!

5.  We find it VERY difficult to estimate the number of queries we do / day.  Out of the first 100 people who responded to the survey, the average number of reported queries / day was 30.  I suspect this is a high estimate.  When I've surveyed before, I find that people often mis-estimate by as much as a factor of 5.  To give you a sense of this, I checked my number-of-queries for the past 4 days:  20 (May 26), 17 (May 27), 22 (May 28), 16 (May 29).  And so on.  I'll have to send out another survey to discover what people estimate, and then what number they find by looking at Google.com/history -- you can see all of your searches listed there (assuming you have it turned on).  Check it out--tell us how close your estimate is wrt the actual number.

Overall, did this help my quest?  Absolutely.  Stay tuned, I'll report back... after a bit more development.  Thanks for helping out.

Search on!

1. I was one of the careless SRS who accepted the first answers as right without carefully reading and double-checking them.

There was one sentence in the wording of your challenge that set me in that mode: "I expect most SearchResearch Readers will be able to do this in about 5 minutes." I'm not trying to defend my case. I just think that this might have happened with other respondents, in which case this alert makes sense, just in case you didn't take that into account.

If, instead of that sentence, I had read something like "As always, be careful to carefully check your answers", it might have triggered me to slightly higher response time estimates and probably to significantly higher actual response times.

1. Yes, I agree. I unwittingly biased the way in which people answered the questions. In writing "about 5 minutes" I was trying to be encouraging (that is, to suggest it wasn't going to take all day). But the unexpected side-effect was to probably push folks along a bit too quickly.

Thanks for the feedback. That's really useful!

2. Lessons learned by Anne and me: we got the answer 1 for the Moby Dick question. We used google books to get our answer. There it was as clear as day, 1 and the snippet even showed the passage. Never occurred to us to click on the snippet to show if the word was used more than once in a passage. Lesson learned! This would work if the word didn't appear too often. If it was used many times sounds like it is better to use the control F function in Chrome browser.
The King James Bible question has us stumped. We used 2 different sites: http://www.kingjamesbibleonline.org/ and http://www.biblegateway.com/versions/King-James-Version-KJV-Bible/ First site gave the answer of 90; second site gave answer of 89. Looking at the King James Bible Online page it appears to be counting multiple times the word is used in the same verse. For instance looking at 2 Samuel 18:33 it shows the word Absalom being used 3 times. Now we would verify using Project Gutenberg but looking at the site and the answer would have had no idea that the wrong answer was given.

3. Hi Daniel and fellow searchers. I was not home in time for the survey but just checked the answers. Could you give me a link for a picture of "Lara Parker Deacon", I searched for almost an hour and here, in France, couldn't find a picture of her, Google images not giving a right answer (the Lara Parker that appears there is not Lara Parker Deacon, the DOB and birth name don't match). Searching for her name in Google, joined with her mother's or father's one didn't give any result, she appears to be a 'private' person not an actress or someone public. I'm very curious to see her face. Unless we again have a Google country bias in that search…

1. Yes, I too found this to be the case. So, I'm also wondering whether there's a country bias, or whether -- heaven forfend! -- when Dr Russell says that 'Google Images has plenty of pictures of her', those are actually of the actress/author of that name, but not Ms Lenska's daughter! (I did wonder whether some UK newspaper archives might have had a picture of Lara Parker Deacon as a baby, but life's too short, and I didn't really want to disappear down that particular rabbit-hole!)

2. I'm in the U.S. and I found the same thing. Lots of pictures of Lara Parker, and some of Lara Deacon, but couldn't actually tell if she was Rula Lenska's daughter. I think this is not as easy as Daniel says. I wish he would respond.

3. Hi Marian (et al.) - See today's SearchResearch post (June 5, 2014). I talk about this in that posting. (I decided to make a post about it for everyone, rather than just in the comments. These questions are good and deserve a higher profile airing.)

4. Interesting results. I didn't use the link where people got 90 hits but I checked it out & I would have thought 90 as well. What tipped you off? I can't figure that out.

The quality of the keywords used for searches versus what I will call long-hand searches finding resources based on topic. Using keywords and getting good results, is it experience, knowledge or luck? It seems to me it's all three. I think the better I get at understanding how keywords work the quicker I'll get an answer BUT sometimes it leads me down the wrong path.
Here's an example-
Query [king james bible word search] SERP quod.lib.umich.edu/k/kjv/simple.html Enter absalom = 109 matches.

Now I thought it might work because we're talking the Bible. And that the term "word search" is pretty common. Put the two together and we get an answer in 2 seconds. But as well I think I got lucky. I'm not sure how to weigh this result if you see what I mean.

1. Great question ("what tipped you off"). I was suspicious because when I looked at the result, I saw verses that had "Absalom" mentioned twice, and that made me wonder--were both of those counted as separate, or as one? Since it was only 90 verses, I just blasted through them all and counted manually. (If it were a larger number, I would have done something different.) That's how I figured it out.

I suspect that experience is what's playing a major role here. My research question is: "What is the knowledge one learns through experience?" Is it knowledge of sources? Knowledge of term frequency, or what?

5. just curious - I couldn't find any indication that any of the Lara Parker Deacon Google images were actually Rula Lenska's daughter…
could you give a link, please? btw; "I'm NOT Rula Lenska" ;^)
this seemed to be the "Lara Parker" that dominated the images… also, would you give the total number of responses you received to the two surveys?
Lara Parker
"5A. How long would it take you to find a picture of Rosa Lubienski's daughter?
To solve this you first have to figure out who "Rosa Lubienski" is (turns out she's an actress whose stage name is "Rula Lenska," an English actress of Polish descent who became well-known in the 1970s). Once you know that, you can find her daughter's name, Lara Parker Deacon, and Google Images has plenty of pictures of her. "

"4. People still don't check their work. "
fwiw: in May I had 3 Google search usage days in the 1-30 range, 2 days in the 151+ area, and the majority of the month ran 76-150/day, that doesn't include other SEs… motwyw…
used two browsers - Chrome worked very well using "OK Google" voice activation (used for the Princess of Norway query) and used it
for the image search - Volkstelling 1925 census
…quicker than Safari, fewer clicks/taps…

6. I just tried Query [moby dick word search lothario] and got SERP mobydiction.com/ Result - 2 times- new tool in my toolbox, so I guess this is where experience comes into play.

Also I did think you meant to search for the actual newspaper front page for Neil Armstrong. I just did a 'by date' search using the following day I found the frontpage. I knew the Guardian has a good archive.

7. Moby Dick: I used BOOKS. First up is an 1892 edition which when searched for Lothario says 1. However when I click on the only page referenced it shows 2 hits. What gives ?
Thanks to Debbie G I tried this just now.

KJV: Once again I used BOOKS just now and find there are so many different editions, What's a boy to do ?

How many queries: How do we count ? Only sources ? Each item at a source ? I bang lots of things into Google in a day. Do they all count ? I can for sure ask Ancestry for dozens in a day (some days) Counted once or dozens ?

These 2 surveys: How many people responded ?

Cheers

jon

1. Jon -- quick answers: 84 people replied to the first survey; 35 to the second.

WRT Books and number of hits -- the UI is misleading there. You actually have to look at the landing page to see what's really going on.

There are many editions of the King James Bible. Luckily, I believe all of them (I haven't checked extensively) have the same number of Absaloms. Your more general question ("how do you know which edition to use") is a tougher question without a good answer. (Which is why I made my question edition-invariant!)

Queries/day: What I *meant* was "queries used to find information" -- but a better measure would be specific, and to include Ancestry.com, Amazon.com, Bing.com, DuckDuckGo, etc.

2. In order to estimate the number I gave (about 30 queries), I browsed fast through my chrome://history, counting each thing I looked for, not actual queries, which would be twice or threefold that figure. I forgot that I also query in Incognito mode, as well as through other browsers (for different tasks I might use Maxthon, Firefox and sometimes even IE).

I also forgot that I could have just checked my dashboard. Here's what it says at the moment:

Account activity last 28 days
Searches 1,895 ^ 17%
Search types
Web 68%
Maps 17%
Images 10%
Books 3%

I also search on Gmail (a lot), on YouTube (a lot), on Chrome History, my Favorites and other personal data, as well as on many other online sites outside the Google Realm.

So maybe the answer you were seeking for would be closer to more than 100 for me.

3. Jon they say if you get the right answer you learn only so much; if you get the wrong answer you learn even more. Getting the wrong answer using google books was really interesting. Glad it happened because would have never guessed that it was only counting once per passage or maybe it is even only once per page. Now we will know to do a more in depth search using google books. Was wondering if knowing this first if I would have carried this over to the King James Bible search but think I still wouldn't have tried anything different once I saw that each word was highlighted. If the number was lower may have tried counting.

8. Lara Deacon: SEARCH [brian deacon NEAR lara deacon] points to

Has image which is clearly identified as her and husband James Parker top of page 8:

"Lara Deacon, daughter of Rula Leńska and Brian Deacon, married
James Parker, son of John and Anita Parker. The wedding was
celebrated at St Mathias Church in Richmond, England, on September
5, 2009. The couple has settled in London"

There are many other Lara's around confusing the search

Cheers

jon

1. This is the best known accurately labeled photo. Just out of curiosity, Jon, how did you find it?

2. Congratulations "Unknown". [brian deacon NEAR lara deacon] was the solution. The link is #5 on the serp. Well, it looks like People still don't check their works… :-)

3. impressive find jon - was it your genealogy experience that prompted you to form that query?… leaving [Lubienski/Leńska] out of the query altogether… "Tarnowskich"… would have never looked there - in Polish or English.
am still curious about the survey numbers:
"5. We find it VERY difficult to estimate the number of queries we do / day. Out of the first 100 people to responded to the survey," (a 2nd survey question)
versus
"quick answers: 84 people replied to the first survey; 35 to the second."
wuwt?
one other observation, really clever header graphic on Rosemary's Moby-Diction.com site — the white whale lives…
and a clever format for the site.

4. Dan asked how I got to that daughter image, so:

SEARCH [Rosa Lubienski's daughter] points to

en.wikipedia.org/wiki/Rula_Lenska

Rula Lenska (born Rosa Marie Lubienski, 30 September 1947) is an English .... to actor Brian Deacon (4 June 1977 – 1987), with whom she had one daughter, Lara Parker Deacon,

IMAGES [Lara Parker Deacon]

Hmmmm something's wrong, there are too many of them and none look promising. So using a genealogy search strategy I tried lots of variants of the name; then in combo with each parent.

Voila!

I found her in the very last place I looked.

Cheers

jon, who notes in the 'Talk' part of the Wikipedia article no one has noticed that the daughter's name is incorrect

5. Great finding Jon. Thanks for sharing with us how you found it.

6. jon's answer was crucial for my subsequent research on Lara and James Parker.

Just for completeness, her friend the photographer Matt Staples (Matt S Staples) has a portfolio on his professional website including one "engagement shoot" and six photos of their wedding (four with the groom and two with the bridesmaids): http://www.mattstaplesphotography.co.uk/fourcolumns.html

The same photos (or some subset) can be found on Matt Staples' Facebook, Flickr and Twitter pages, all of them public.

9. Hi - I'd like to see the answer for Lara Parker Deacon too as I couldn't find it. Not in the 10 mins I was willing to devote to it anyway.
Yes - there are a lot of pictures against her name in Images but they are all of her mother Rula Lenska or people related to her like her ex-husband Dennis Waterman and his daughter - who isn't Lara. The other odd ones are different people entirely. I have the advantage of being British and therefore knowing what these people look like and their complicated family relationships! But it defeated me so I'd like to see an answer or as passager says - is country filter affecting results?

Thanks
Sarah

10. DOH! I missed the field to put in the number of times the name appeared on BOTH of the surveys. I used Project Gutenberg for both and their HTML versions. KJ Bible slowed my time as it took a couple seconds to load.

1. Fred, [thanks!] for the point, it's a conundrum - worth the read, but now I'm spent… tp angst
…can one run up a time point deficit the way countries treat imagined/impending debt?
how does sRs time convert to tp (time points)?
*
almost sounds like a Jerry Maguire line…
This Should Only
*courtesy of Frank Chimero
Professional designer. Amateur human.

11. Very interesting as always, Dr. Russell. The challenges were apparently simple and they have lots of knowledge.

In Moby Dick, I did exactly as Anne and Debbie. Lessons learned.

Have a great weekend.

12. And didn't check until now how often we use search. I checked my history on my personal account and it was much lower than I would have thought but I use that account at home and most of my search activity is at work using a different account. History wasn't turned on, so couldn't check. When Anne and I answered this question we were also including all of the times we are doing search queries using our library databases and catalog.

1. Ramon, so funny that you did what Anne and I did! Your name comes up ALL the time when we are working on challenges! Along with Rosemary's and Fred also! Fred has become my go to person for information literacy information that I share with school. So this weekly challenge has turned into much more of a professional development session for Anne and me. Ramon maybe we need to say great minds think alike!

13. After a lot of search, here's the most official (and recent) photo I could find of Lara 'Deacon' Parker, on her own Facebook page.

Taking from her Facebook name ("Lara 'Deacon' Parker", with the single quotes around "Deacon"), I'd say she dropped that surname when she married James Parker, so her official name is probably Lara Parker (and never Lara Parker Deacon).

And here's another picture of Lara and James on her Rummikub profile.

As to Wikipedia:

On 15 September 2009, Wikipedia editor 80.194.5.121 changed "Lara Deacon" to "Lara Parker". This was only 10 days after her marriage to James Parker (as we know from the Tarnowski Clan newsletter found by jon).

On 11 May 2012, Wikipedia editor 85.210.130.53 changed "Lara Parker" to "Lara Parker Deacon". I can't find any other mention of this strange combination and it seems in fact to be wrong. Apparently, no Wikipedia editor noticed it until now.

Further sources:

The website claiming to be The Official Rula Lenska Website, states that she "has a daughter, Lara Parker, from her marriage to Brian Deacon."

In an article in Candis magazine, Rula Lenska herself is quoted as saying "my daughter, Lara Parker".

On a final unrelated note, I found out that Lara Parker is a nurse.

1. I think "Unknown" and Luis get to share the gold medal for finding that picture. I'm impressed by your approach.
I'm not on Facebook so that "walled garden" was closed to me anyway.

Also NB the Guardian archive is chargeable to use (though public and academic libraries often have access to their members). It didn't matter with such a famous item as Neil Armstrong as it has been republished openly but a less famous item might have been different.

I guestimated my number of searches per day but I could have been way wrong. I suspect anyone who is reading this blog is going to be a "professional" searcher of some kind and therefore likely to have a higher no of searches per day than average. Plus we are probably more curious than average i.e if we see something we don't know or understand we are more likely to search for it to find out more. It can be fatal if reading a novel though as it drags you out of the story - I just make a note for later now :)

2. Even if you don't have Facebook, you can still peek on some stuff that was (probably unwittingly) left public.

For example, the Google search [ site:facebook.com "lara deacon parker" ] will lead you to several Facebook pages where Lara Parker has posted comments. Check for example the 5th result (on HandmadebyJax) and you'll find a thumbnail of her profile's picture. If you hover your mouse on that thumbnail, you will be able to see a 100x100 pixels version of that picture.

If then you do a "Search by Image" on that picture and help Google with a description (writing "Lara Deacon Parker" where you see "describe image here"), you won't be given access to any other sizes but you'll see the Rummikub profile picture.

My last post was filled with links, I have no idea what happened to them. Anyway, this being a search research group, I guess readers will find out the links by themselves. :)

14. I can speak for myself only. There is an occupational bias in estimating "how long will it take"... questions. If I under-sell and out-perform, namely if I say it will take a half an hour and it takes 10 minutes, nobody will complain. If I say it takes 10 minutes and it takes an hour, EVERYONE will complain. Since I do research for a job I must admit that my occupation does affect how I answered the estimate vs. performance. Also once I got the solution I did play with the results to assure that everything was as it should be. Once you start playing with the answer you may find that it will draw you in to assure that everything is as complete as possible.

1. Paul, totally agree on this. Anne and I were very conscious of this. We felt it was much better to overestimate than to say it was only going to take a short amount of time and then be wrong about that!

15. When we have the daughter of a famous actress who seems to be living quite an ordinary life, it wasn't difficult to find the photos mentioned here already. However she is overshadowed by her parents & probably quite deliberately, therefore she doesn't have a huge presence on the internet. When searching for an average person we can use social networks and perhaps photos that have been tagged. We can search for hits by location, interests, work, and family. My take-away for this part of the challenge wasn't finding photos of her but realizing there are limitations based on a person's preference to be seen or not.