As I said, THAT was interesting.
I'm trying to develop a method for determining what people know about searching. One of the big problems is that you can't just walk up and ask someone "Tell me everything you know about searching!" That's guaranteed to fail.
What's more, you can't even ask them specific questions. If you say "When would you use a feature like filetype: ?" a smart person can more-or-less figure it out from the question itself.
In general, figuring out what people know about a topic is pretty hard. So I came up with this idea of asking "how long would it take you to solve this question?"
I was hoping that if you didn't know how to solve the challenge, you'd estimate a very high number ("it would take me 20 minutes to do that"). Contrariwise, if the challenge was thought to be simpler, you'd estimate a lower amount of time ("that would take 1 minute").
Or at least that's what I thought. So I posed a question, had you estimate how long it would take, and then had you solve the challenge and report back about how long it actually took to solve it. As usual, the reality is much more interesting.
The first challenge had four questions:
1A. How long would it take you to find a picture (any picture will do) of the current Princess of Norway?
2A. How long would it take you to find data describing the unemployment rate of Santa Clara County (California) for the past 10 years? (You're looking for a table or a chart containing the data.)
3A. How long would it take you to find a picture of Neil Armstrong standing on the moon that was published in The Guardian newspaper sometime in the past 30 years?
4A. How long would it take you to figure out how many times the name "Lothario" appears in the book "Moby Dick"?
The first two were intentionally simple. I did that so you'd get the idea of how the survey worked, and would have a couple of successes.
Questions 3A and 4A were supposed to be harder.
Unfortunately, they were also pretty simple.
1A. Finding a picture of the current Princess of Norway is easy. A search like: [ Norway Princess ] will do it. Just go to Images, and there she is. There are lots of pictures of Mette-Marit, Crown Princess of Norway and Princess Märtha Louise of Norway. Mette-Marit married into the family, while Märtha Louise is the only daughter of King Harald V and Queen Sonja.
2A. How long would it take you to find data describing the unemployment rate of Santa Clara County (California) for the past 10 years? I thought that this might be a little trickier than it was. Turns out that the obvious query [ unemployment rate Santa Clara county ] triggers the Google Public Data Explorer onebox, which gives you everything you'd want.
3A. How long would it take you to find a picture of Neil Armstrong standing on the moon that was published in The Guardian newspaper sometime in the past 30 years? This was supposed to require some clever date restriction filtering. (That's the way I solved it!) But I didn't check the simple, obvious, and straightforward query [ Neil Armstrong moon Guardian ]. Sure enough, a nice image shows up on the first page. (In retrospect, I should have asked for the image to be printed in the Guardian newspaper in the 1970s. That would have made it a bit trickier.)
4A. How long would it take you to figure out how many times the name "Lothario" appears in the book "Moby Dick"?
This question was interesting. This was the first question where I asked for an answer.
4C. And how many times DOES the name "Lothario" appear in "Moby Dick"?
You were supposed to type in an answer based on what you found. Interestingly, 27% of the respondents got the answer wrong. The correct answer is 2 (while all of the wrong answers were only 1; they missed the second appearance because it was hidden below the fold).
I found this by downloading the full text of Moby Dick from Project Gutenberg and then opening this text file in Chrome and searching for "Lothario." You'll find that name mentioned twice.
"...he cannot keep the most notorious Lothario out of his bed; for, alas! all fish bed in common..."
and, a few paragraphs later:
"Gently he insinuates his vast bulk among them again and revels there awhile, still in tantalizing vicinity to young Lothario..."
In looking at the estimates of time vs. the actual time required, it became clear that the guesses were... interesting. Mostly people guessed that the tasks would take longer than they really did. Guesses and Actuals ranged from:
Ranges:
1. Guess: 5 seconds to 5 minutes. Actual: 2 seconds to 2 minutes
2. Guess: 8 seconds to 15 minutes. Actual: 5 seconds to 16 minutes (1 outlier at 51)
3. Guess: 6 seconds to 25 minutes. Actual: 5 seconds to 28 minutes
4. Guess: 12 seconds to 30 minutes. Actual: 11 seconds to 10 minutes
Medians:
1. Guess: 1 minute. Actual: 10 seconds
2. Guess: 5 minutes. Actual: 1 minute
3. Guess: 2 minutes. Actual: 1.75 minutes
4. Guess: 3 minutes. Actual: 1.5 minutes
This tells me that we're not especially good at estimating how long a search task will take. Many people were off by a factor of 2!
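To make the size of the gap concrete, here is a small Python sketch that divides each median guess by the matching median actual time, using only the medians listed above. The name `overestimate_factor` is just a label I picked for illustration. Keep in mind that a ratio of medians is not the same as the median of each person's own ratio, so treat the results as a rough indicator only.

```python
# Median guess and median actual time per question, in seconds,
# taken from the medians listed above.
medians = {
    1: (60, 10),
    2: (300, 60),
    3: (120, 105),  # 1.75 minutes = 105 seconds
    4: (180, 90),   # 1.5 minutes = 90 seconds
}

def overestimate_factor(guess_s: float, actual_s: float) -> float:
    """How many times larger the guess was than the actual time."""
    return guess_s / actual_s

factors = {
    q: round(overestimate_factor(guess, actual), 2)
    for q, (guess, actual) in medians.items()
}

for q, factor in factors.items():
    print(f"Question {q}: median guess was {factor}x the median actual")
```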
This is also what prompted me to add two additional questions--ones that I thought would be much harder than the previous four. I was thinking that maybe, just maybe, these questions were way too easy.
So I added questions 5 and 6. (Which I unfortunately labeled as 1 and 2 on the second survey. Here I'll just call them numbers 5 and 6.)
5A. How long would it take you to find a picture of Rosa Lubienski's daughter?
6A. How long would it take you to figure out how many times the name "Absalom" appears in the King James Bible?
These were significantly more difficult.
5A. How long would it take you to find a picture of Rosa Lubienski's daughter?
To solve this you first have to figure out who "Rosa Lubienski" is (turns out she's an English actress of Polish descent, better known by her stage name "Rula Lenska," who became well-known in the 1970s). Once you know that, you can find her daughter's name, Lara Parker Deacon, and Google Images has plenty of pictures of her.
This is a multi-step problem; interesting, but not super-difficult. By contrast, the last question IS fairly tricky, even though it looks very similar to challenge #4 above.
6A. How long would it take you to figure out how many times the name "Absalom" appears in the King James Bible?
Again, I did the same thing as before: I looked for the King James Bible on Project Gutenberg, downloaded it, and used Control-F in Chrome to find that "Absalom" appears 109 times. (Because there are variant versions of the King James Bible, I also scored 108 as correct.)
Several people used a Bible search site (www.KingJamesBibleOnline.org) to search for Absalom. They're the people who answered that the word occurs 90 times. (Note to those folks: That site counts the number of verses in which the word occurs, not the number of times the word "Absalom" appears. You have to read carefully!)
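The difference between "verses that contain the word" and "times the word appears" is easy to see in a tiny Python sketch. The three lines below are made-up stand-ins for verses (not exact quotations from any edition), and the variable names are mine; the point is only that the two ways of counting disagree as soon as one verse uses the name more than once.

```python
import re

# Made-up stand-in lines, one "verse" per entry; not exact quotations.
verses = [
    "O my son Absalom, my son, my son Absalom!",
    "And Absalom fled.",
    "The king wept.",
]

pattern = re.compile(r"\bAbsalom\b")

# What a verse-level search reports: verses with at least one match.
verses_with_name = sum(1 for verse in verses if pattern.search(verse))

# What the question asked for: every occurrence of the name.
total_occurrences = sum(len(pattern.findall(verse)) for verse in verses)

print(verses_with_name, total_occurrences)
```

Here the verse count is 2 while the occurrence count is 3, which is the same kind of gap as 90 versus 109.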
This time around:
Ranges:
5. Guess: 5 seconds to 20 minutes. Actual: 27 seconds to 11 minutes
6. Guess: 20 seconds to 10 minutes. Actual: 20 seconds to 10 minutes
Medians:
5. Guess: 180 seconds. Actual: 96 seconds
6. Guess: 180 seconds. Actual: 85 seconds
This doesn't look inspiring, but here's the thing that's not in the guesses and timing tables....
Everyone did questions 1 - 4; mostly correctly. They estimated high, but were able to do the challenge quickly.
But questions 5 and 6 were much trickier.
Question 5 (Rula): Of the 5 people who said it would take 10 minutes or more, 4/5 of them were not able to do the task at all. In general, people who said that it would take 10 minutes or more found it really did take that long, or that it was impossible.
Question 6 (Absalom): Only 32% of the people who answered the "Absalom" question actually got it right. There was no correlation between the time they estimated, the time they actually took to solve the problem, OR their accuracy.
My takeaways:
1. It's hard to estimate the difficulty of a search task. Regular readers of SRS know that something might LOOK simple, but turn out to be hard; and vice-versa--it looks hard, but turns out to be simple.
2. On the other hand, searchers who practice can estimate time-to-answer better than people who don't practice. In another study, I asked 500 people the "Absalom" question. 90% of them said it would take around 1 minute to do. Really? I suspect that people who don't search much have a distorted view of what's possible to find online. (They're a bit over-optimistic!) I had a smaller sample of non-SRS people do the "Absalom" problem. They were also all over-optimistic.
3. But when something really IS hard, and LOOKS hard... it usually is. Again, experience really helps when estimating. This is an underappreciated metacognitive skill (that is, the MC skill of understanding how hard a task will be, and what you need to do to complete it).
4. People still don't check their work. The "Absalom" question is a good example of this. For the people who used the KingJamesBibleOnline web site, the number they reported is what the site displays... but that count of 90 hits is measured in VERSES, not in instances of the term "Absalom"!
5. We find it VERY difficult to estimate the number of queries we do / day. Out of the first 100 people who responded to the survey, the average number of reported queries / day was 30. I suspect this is a high estimate. When I've surveyed before, I find that people often mis-estimate by as much as a factor of 5. To give you a sense of this, I checked my number-of-queries for the past 4 days: 20 (May 26), 17 (May 27), 22 (May 28), 16 (May 29). And so on. I'll have to send out another survey to discover what people estimate, and then what number they find by looking at Google.com/history -- you can see all of your searches listed there (assuming you have it turned on). Check it out--tell us how close your estimate is to the actual number.
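If you want to run the same check on your own numbers, the arithmetic is short. This Python sketch uses my four daily counts and the survey's reported average of 30; the variable names are mine. Comparing one person's history with a group's self-reports is only a rough illustration, not a measurement of anyone's estimation error.

```python
from statistics import mean

# My own query counts for May 26-29, from my search history.
daily_queries = [20, 17, 22, 16]

# Average queries / day self-reported by the first 100 respondents.
reported_average = 30

my_average = mean(daily_queries)
ratio = reported_average / my_average

print(my_average, round(ratio, 2))
```

Swap in your own estimate for `reported_average` and your own history counts for `daily_queries` to see how far off your guess is.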
Overall, did this help my quest? Absolutely. Stay tuned, I'll report back... after a bit more development. Thanks for helping out.
Search on!