I had two questions come up that superficially look very different, but upon reflection, I realized it's the same search challenge in both cases.
Earlier this week the son of a friend asked if I could help him find a summer internship that would involve working on the topic of Big Data "somewhere in Silicon Valley, but not in San Francisco." He went on to ask if it could be within an easy commute of Redwood City (since that's where he's going to live this summer).
I thought about it for a while, and was able to fairly quickly make a map that looks like this:
where each red pin shows a possible summer internship position working on "big data." (Interestingly enough, this is pretty much the map of the cities of Silicon Valley...)
1. Can you find (or create) a table of 50 summer internship positions in cities that are in Silicon Valley, and not in San Francisco? Ideally, you'd make an interactive map (like the one above), where you can click on a red pin and read about the internship.
And then... the very next day, a different friend said she was frustrated looking at the Ikea catalog. She's trying to buy a sofa, and found the range of options pretty overwhelming. It's great to have many things to choose among, but sometimes it's kind of a lot.
She wondered to me, "I just want to know the range of prices of Ikea sofas!" In talking with her, it became clear that she was also really interested in what the distribution of prices is. (That is, she wanted to know if all Ikea sofas are expensive, or if they have just as many economy-priced sofas as well.)
I fairly quickly whipped up a chart like this one (not the actual chart, but it looks a lot like this):
Here the X axis is just different model numbers, and the Y axis is the price. So you can immediately see that about 25% of all their models fall in the $200 - $400 price range, with a bit more than half being priced below $200. Obviously, the chart for sofas will be different (everything is probably more expensive).
2. Can you make a chart like this one showing the price distribution of all the sofas in the Ikea catalog? (With the current catalog.)
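Once you have a list of prices in hand, the "what share of models falls in each price range" reading of the chart is a small computation. Here's a minimal sketch of that step in Python. The function name `price_distribution`, the $200 bucket width, and the sample prices are all my own illustrative assumptions (the prices are made up, not taken from any catalog).

```python
from collections import Counter

def price_distribution(prices, bucket_width=200):
    """Map each (low, high) price bucket to the share of prices in it.

    Buckets are half-open: a price p lands in (low, high) when low <= p < high.
    Only buckets that contain at least one price are returned.
    """
    counts = Counter(int(p // bucket_width) for p in prices)
    total = len(prices)
    return {
        (b * bucket_width, (b + 1) * bucket_width): counts[b] / total
        for b in sorted(counts)
    }

# Made-up prices, for illustration only (NOT real catalog data).
sample_prices = [99, 149, 179, 199, 249, 399, 599, 899]

distribution = price_distribution(sample_prices)
for (low, high), share in distribution.items():
    print(f"${low} to ${high}: {share:.0%} of models")
```

With real data you'd swap `sample_prices` for the extracted list, and you could feed the same buckets into whatever charting tool you like.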
Finding the prices isn't hard (they're on Ikea.com). The question is: how do you extract the prices (or internship position descriptions) and then do something with THAT data?
Big Tips: I know this seems like a crazy hard problem, but it's really not. You just have to know the right tools. You should NOT spend much time (if any) copying and pasting data from the online catalogs of jobs or sofas. You should be able to find a tool to help you do the automatic extraction of data from a web page.
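To give a feel for what "automatic extraction" means, here's a rough sketch using only Python's standard library. It's not the tool I have in mind, just an illustration of the idea. Big assumption: the page marks each price with something like `<span class="price">`. That markup, the `PriceExtractor` class, and the sample HTML are all hypothetical; a real catalog page will be structured differently, so you'd adapt the tag and class names to what you actually see in the page source.

```python
import re
from html.parser import HTMLParser

class PriceExtractor(HTMLParser):
    """Collect numbers found inside <span class="price"> elements.

    The tag and class name are assumptions for this sketch, not the
    markup of any real site.
    """

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "price" in classes:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            # Pull out a number such as 199.00 or 1,249.00 from the text.
            match = re.search(r"\d[\d,]*(?:\.\d+)?", data)
            if match:
                self.prices.append(float(match.group().replace(",", "")))

def extract_prices(html):
    """Return every price found in the given HTML text, in page order."""
    parser = PriceExtractor()
    parser.feed(html)
    parser.close()
    return parser.prices

# Hypothetical page fragment with made-up products and prices.
sample_html = """
<ul>
  <li><span class="name">Sofa A</span> <span class="price">$199.00</span></li>
  <li><span class="name">Sofa B</span> <span class="price">$1,249.00</span></li>
</ul>
"""

print(extract_prices(sample_html))
```

The point is that the page already has structure, and a tool can read that structure for you, so there's no copying and pasting by hand.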
(If nobody's figured out how to do this by EOD tomorrow, I'll give you another big hint on Thursday.)
A bit o' philosophy: This is yet another of Dan's "find the data and massage it" Search Challenges. As I've said before, this is a blog about Search and Sensemaking. Although "sensemaking" is typically a larger, longer behavior pattern, these "find the data / massage it" questions are typical of the sensemaking questions that professional analysts have to solve all the time. Because we're trying to have fun AND learn something, my Challenges don't go on for weeks or months, but they try to give you a sense of what the larger skill set is like.
So I hope you enjoy these "find & massage" search data challenges as much as I do. In truth, I'm having a good time creating these Challenges that teach a very particular skill, and sometimes give a bit of insight at the same time.