Friday, December 19, 2014

Answer: What was the best Search Challenge of the year?

Thanks... for all your comments, both on list and through DM.  It's great to have such a dedicated and thoughtful set of SearchResearchers; I appreciate all of the time and energy everyone puts into their comments.  

It was great fun to read through the comments.  There were some posts that definitely brought back memories.  

As many of you found, the change in the Challenge / Answer timing has been a blessing in disguise. 

When I started the Challenges, I was hoping to get people to answer as quickly as possible, and the 24-hour cycle was intended to be a mild stimulant to help things along.  But then, as the Regular Readers pointed out, many people don't live on a 24-hour clock, and since some of the Challenges were a bit... well, challenging... it made more sense to stretch the cycle time out a bit. 

What I hadn't realized is that it gave ME more time to think about the best answer, and let us all take on much harder Challenges than ever.  If you go back to 2013 and compare with 2014, this year was much more impressive.  Nice work, team!  

To remind myself of what the Challenges were, I did this query: 

     [ inurl:/2014/ intitle:challenge ] 

This searched only posts from 2014 (since that's part of the URL for each challenge), and the intitle: operator limited the results to just those posts that were titled "challenge..."  

This is a nice search because it finds exactly 49 posts, reflecting all of the posts in the year until this point.  

I then changed my "number of search results shown" to 100 (you can do that in Search Settings--click the gear icon in the upper right), and now I have a complete list of all the posts!  

Browsing through here is a trip down memory lane. 

We sought out answers about Music ("what's a plagal cadence?"), the Other Side of Buildings (discovering ways to date things in the world), Statues in London, How Dan can run on water (misalignments in geo data), the Price of Horses in 1918, making snowfall maps (visualization tools), how to find out the price of properties in NYC, and Titanic Triggerfish.  

Interestingly enough, the post with the most clicks (and the most comments!) is... 

"What wreck is this?

Remember this Challenge?  You had to figure out what this hunk of junk was doing in the Carquinez Straits, midway between Sacramento and San Francisco.  

The answer to that meant we had to learn about how to search for archival information, even in pretty out-of-the-way locations.  We learned how to get EXIF data out of cellphone photos, and then use the lat-long to locate interesting features on Maps.  
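Once you've pulled the GPS fields out of the EXIF data (they're stored as degrees, minutes, and seconds plus a hemisphere reference), turning them into a decimal lat-long is one small arithmetic step.  A quick sketch in Python (the function name here is just my own):

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style degrees/minutes/seconds GPS values into a
    signed decimal coordinate.  South and West hemispheres are negative."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if ref in ("S", "W") else value

# 37 deg 44' 13.2" N  ->  about 37.737
print(round(dms_to_decimal(37, 44, 13.2, "N"), 4))
```

The resulting decimal pair is what you'd paste into the Maps search box to find the spot.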

It was a good one, and voting with your clicks and visits is a great way to get my attention.  

Looking at the comments on this week's Challenge, it's clear to me that the most interesting Challenges are often the most difficult ones.  I hope you find them educational AND entertaining; that's certainly my intent.  

People seemed to like the newsy Challenges, as well as the ones that need Maps (or Geo information), and finally, people liked the tough data challenges, especially the ones that meant learning new visualization methods.  

But the biggest takeaway was that things are going along pretty well.  As always, I'm all ears, and will happily listen to good ideas about how to steer the blog along paths that are interesting to the SearchResearchers.  

And yes, I'll write one more post about the Twain challenge.  I've actually done a fair bit of hacking around on that one, and I'll tell you about it next week before we close off the year with the special Year-End Search Challenge.  

Here's to a Healthy & Happy holiday and New Year to all Researchers everywhere!  

Wednesday, December 17, 2014

Search Challenge (12/17/14): What was the best Challenge of the year?

Snow in the Sierras. The best thing we Californians could have at the end of 2014.

Regular reader Fred had an excellent idea. 

 "Why not," he wrote, "do a summary of the year in SearchResearch?"  A kind of year-end look back at the best-of (and maybe worst of) the year.  

I like that!  I'll write up my summary of what I thought went well, and what didn't for Friday's post.  

But in the meantime, I'd love to hear YOUR thoughts about this.  This leads to this week's Challenge question:  

1.  What was your favorite SearchResearch Search Challenge of 2014?  (If you must, feel free to slip back into 2013--we never did a summary of that year.)   Tell us what you liked about the Challenge you enjoyed the most, or what you learned from it.  
2.  What was the best / most-interesting thing you learned about search or research in 2014?  I'm looking for good ideas for future SearchResearch Challenges, but I'm also looking for new information resources that we should all know about.  (I have a couple tucked up my sleeve.) 

Pro tip:  You can find all of the posts during a given month by using the inurl: operator.  Do it like this to find all of the January, 2014 posts:  

     [  ] 

Also, a heads-up:  Next Wednesday is Dec 24th and the real beginning of our holiday season.  I'm going to be traveling to Oaxaca, Mexico for 10 days immediately after Christmas, so I'll be mostly off-line.  (Yes, I know I could get online, but I'm taking a holiday....)  

That means I'll post a Challenge that day, but it's going to be the most challenging Challenge of the year.  I'll be back online on January 6th to look at how things have been going, and I'll be sure to give lots of hints in my Challenge writeup--but be prepared.  I think you'll all have to work together to solve this one!  It'll be a great wrapup for the year of SearchResearch.  

Search on! 

P.S.  For anyone who didn't see it, I made a quick video showing you how to convert last week's KML file into a CSV for analysis.  See:  KML to CSV demo video.  (Be sure to watch this one full-screen, at 720p.  Use the gear menu to change the resolution of playback.)  

Saturday, December 13, 2014

10 new languages in Google Translate (Chichewa, Malagasy, Sesotho, Malayalam, Burmese, Sinhalese, Sundanese, Kazakh, Tajik, Uzbek)

As you know, Google keeps changing the set of services it offers.  Sometimes things are removed for simplicity or cost reasons, but sometimes things are also added to the mix. 

Just this week, Google Translate added 10 new languages for translation:
  • Chichewa (Chinyanja) is spoken by 12 million people in Malawi and surrounding countries.
  • Malagasy is spoken by 18 million people in Madagascar, where it is the national language. (It is one of only a few languages that is VOS, that is, it puts the verb first in sentences, followed by the object and then the subject.)
  • Sesotho has 6 million native speakers. It is the national language of Lesotho and one of 11 official languages in South Africa.

In India and Southeast Asia, Google adds Malayalam, Burmese, Sinhala, and Sundanese:

  • Malayalam (മലയാളം), with 38 million native speakers, is a major language in India and one of that country’s 6 classical languages.
  • Myanmar (Burmese, မြန်မာစာ) is the official language of Myanmar with 33 million native speakers. Myanmar language has been in the works for a long time as it's a challenging language for automatic translation, both from language structure and font encoding perspectives.
  • Sinhala (Sinhalese, සිංහල) is one of the official languages of Sri Lanka and natively spoken by 16 million people. In September the local community in Sri Lanka organized Sinhala Translate Week, contributing tens of thousands of translations to the Google translation system.
  • Sundanese (Basa Sunda) is spoken on the island of Java in Indonesia by 39 million people.

In Central Asia, Google is adding Kazakh, Tajik, and Uzbek:

  • Kazakh (Қазақ тілі) with 11 million native speakers in Kazakhstan.
  • Tajik (Тоҷикӣ), a close relative to modern Persian, is spoken by more than 4 million people in Tajikistan and beyond.
  • Uzbek (Oʻzbek tili) is spoken by 25 million people in Uzbekistan. The Uzbek dictionary by Shavkat Butaev is also now available.

Friday, December 12, 2014

Answer: Finding the measures of things

Vernal Fall in Yosemite 
This week's Challenge was intended to get us thinking about the questions.   As many of you correctly pointed out, even questions that sound simple often have a certain amount of ambiguity in them.  

You learn this very quickly when you're a Reference Librarian sitting at the Reference Desk.  They're the people in the library (or online, as in the "Ask a Librarian" service) who answer your toughest reference questions.  They get hit with some questions like the ones I asked in the Challenge.  Prototypical questions might be: 

  • Why was the Civil War fought? 
  • How deep is the Grand Canyon? 
  • Who won the War of 1812? 
  • I can't find War & Peaches. What's that book all about anyway? 
As you see, you can't just flat out answer these questions.  At the reference desk, the ref librarian conducts a reference interview to figure out what the asker really wants to know (and how much they'll understand).   "Why was the Civil War fought?"  Great question. Just saying "slavery" is inadequate.  Instead, the librarian will talk with you--what do you really want to know?  The economic conditions?  The history of abolition?  Political tensions between the states?  

Likewise, answering "how deep is the Grand Canyon?" and "Who won the War of 1812?" depends a lot on where you stand and how you take the measurements. 

Of course, War & Peaches is really a mondegreen of the Tolstoy book title, War & Peace.  

So it goes with our questions from the Challenge.  They were:  

1.   What is the elevation of Vernal Fall?
2.  What is the distance to Jupiter?   
3.  How big is a 2x4 ("two by four") piece of wood?  

The ambiguity of the questions becomes pretty evident once you start to look at the results.  

1.  "The elevation of Vernal Fall?"  Well, tell me what you mean by "elevation"?  Is that the altitude above sea level of the top... or the bottom of the falls?  Or could you mean the distance the water falls from lip to base?  

If you do the obvious query to Google, you'll get a result that looks like this: 

You have to read this result carefully.  Note that the search term "elevation" is bolded in the result snippets.  It's a big hint that elevation might not mean what you think it means.  If you click through the first couple of results, it becomes clear that "elevation" here means "height above the Yosemite Valley floor."  The number (1,014 m) is extracted from a number of web pages that are hiking guides.  For that purpose, "elevation" is a good number to indicate how difficult of a hike it is.  

Now if you thought "elevation" meant "altitude," you'd be completely misinterpreting the result.  

Unfortunately, if you do a more precise query [ altitude Vernal Fall ] you'll still get the same answer because Google synonymizes "elevation" and "altitude."  See? 

Note carefully what it says in grey text below the number:  "Vernal Fall, Elevation" 

Of course, that's not what you intended.  You asked for "altitude."  

If you want the altitude of the top (or bottom) of the Fall, you'll need to look at a map or find a guide to Vernal Fall that gives it to you explicitly.  

The official National Park Service Yosemite Valley Map shows that the Falls are at 5044 feet (1538 m).  But is that the top or the bottom of the falls?  

To REALLY check on things like this, I always look at the topographic map.  (Search for [USGS topographic maps ] and use their system to search for the map of Yosemite Valley.  It's called "Half Dome" for the famous semi-dome nearby.) 

When you find that map, the relevant piece will look like this: 

See that elevation line at 4800?  It's one heavy line below the top of the Fall.  (Trace it with your finger.)  The legend at the bottom of the map says these lines are 200 feet apart.  So the heavy elevation line that passes just below the top of Vernal Fall is 5000 feet.  So the NPS map's number (5044 feet; 1538 meters) checks out.  

2.  "Distance to Jupiter?"  Well, tell me what you mean by "Jupiter"?  And then tell me what you mean by "distance"?  

It seems silly to ask "what's Jupiter?" but don't tell that to the citizens of Jupiter, FL.  If you're in Tallahassee, FL, "what's the distance to Jupiter?" probably means "how long will it take me to drive to the town of Jupiter?"  

So you have to be impressed when the answer from Google to the obvious query is this:  

As the first answer box points out--the distance from Earth to Jupiter varies day-by-day.  If you read the result carefully, it says "at their closest.." they can be 628,743,036 km apart.  But what about at their farthest?  I'll let you look that up, but it's a long way from one side of the solar system to the other.  The distance changes moment-by-moment.  

Of course, the other interpretation could be "how far is Jupiter from the sun?"  If that's the real question, then you have to ask "do you want the distance from the surface of the sun to the surface of Jupiter?"  Or do you just want center-to-center?  The site puts Jupiter (center-to-center) at 5.2 AUs (an Astronomical Unit is 1 Earth-to-Sun distance of 93M miles).  So from Sun-center to Jupiter-center is around 483M miles (777M km).  

3.  "How big is a 2x4 piece of wood?"  Well, tell me what you mean by "big"? 

Usually people think of a 2x4 as a standard measure.  As Wikipedia tells us in the article about lumber: "...a "2x4" board historically started out as a green, rough board actually 2 by 4 inches (51 mm × 102 mm). After drying and planing, it would be smaller, by a nonstandard amount. Today, a "2x4" board starts out as something smaller than 2 inches by 4 inches and not specified by standards, and after drying and planing is reliably 1.5 by 3.5 inches (38 mm × 89 mm). It is made to absorb natural variation."

This is backed up by the article on Wood Sizes from the site, which also points out that lumber sizes vary depending on whether it's softwood (e.g., pine lumber) or hardwood (e.g., oak) that's being measured.  
Of course, there's the issue of length.  The actual weight (or volume) of a single 2x4 can vary tremendously depending on length.   In the United States and Canada the standard lengths of 2x4 lumber are 6 feet (1.83 meters), 8 (2.44), 10 (3.05), 12 (3.66), 14 (4.27), 16 (4.88), 18 (5.49), 20 (6.10), 22 (6.71), and 24 feet (7.32 meters).

The upshot is that a "2x4" is actually smaller than the term would suggest, and the largest variation in size (that is, "how big it is") depends on its length--from 6 feet up to 24 feet.  

Search lessons:   

Like a great reference librarian, when you're searching, you really want to understand the question that is being asked.  You have to take into account the variation in what the original question statement is, and the variation in what your reference materials (including Google!) can tell you.  

As we see in the Vernal Fall elevation example, you must read the content carefully.  This is probably the single biggest source of mistakes I see in students.  Even when they frame the search carefully and correctly, misreading the results still leaves them with the wrong answer--everything was right up until the last step.  

One way to avoid this problem is to double-source everything.  I've done that in these examples, and it's a great practice to do.  (But be wary of duplicates in the results.  If you're seeing the same language multiple times, start digging more deeply.)  

Search on!  

(And read those results carefully!!!)  

Thursday, December 11, 2014

Answer--part 3: What's going on in this file?

Okay, NOW it's time to reveal what's going on...  

Let's start with the simplest way to analyze the KML file.  

As several of you found out, importing the KML file into Google Earth makes it very clear what's going on.  Here's what I see when I import it.  (It's labeled "Location history from 11/22/20..." on the left hand side panel below.) 

Notice the scrubber in the upper left.  It looks like this.  (Note the "house shaped" icon that indicates the end point in the timeline being shown.  Note also the "crescent moon" shaped icon that indicates the earliest time of the data being shown.)   

 With this widget, you can scrub along in time and follow the track of the phone.  If you click and drag on the "moon" and "house icons", you can select the segment of time on the KML track that you want to see displayed. Here I'm looking at the track from 8AM on 11/27 up to 8:47AM on 11/28.  

Using this slider widget, you can find out remarkable things.  

As several of you figured out, the track starts in North Carolina, in the parking lot of the Carolina Inn, Chapel Hill.  

To get the name of the building, I switched to Maps view (in Google Earth), and then did a search for:  

   [  *  ] 

this gives me a clickable red dot for each "known place" in the map.  

Fast forwarding the track a little bit, you can see that the phone went to the Raleigh-Durham airport... and then was turned off.  

You can keep on going like this--zooming in to see the fine details, and by looking at the time slider, you can figure out the time of day.  Below you can see that the phone landed at the San Francisco airport (SFO) around 10AM local time.  

By 2PM, the phone had driven down to Palo Alto.  (You know it was driven because from Tuesday's charts we know the speed the phone was moving at that time, around 65 mph.)  

Sliding forward a bit more in time, you can figure out what this trip into the hills is.  From the speed chart, we know the speed varied from 25 mph to around 4 mph.  If you zoom in even farther on maps, you'll see that some of this track is off-road, on trails in the Los Altos Hills.  The speed for that section is slow...  This is driving to a running location!  

Going forward a couple of days (to 11/26), you can see a trip to the coast.  Doing the same trick as before [* ] you can figure out that this is a quick overnight trip to the Costanoa Lodge.   

But you have to be a little careful when interpreting this data: sometimes you'll get spurious data points.  As you can see here, several of these points seem to zip back and forth to an odd location (the point at the top of the image).  These points really stand out in the data set--very rapid back and forth.  Since I can't really travel at that speed (>100 mph), they're just jitter in the GPS signal.  

Errors happen.  This means we have to clean the data; yes, even cell phones can generate spurious signals.  Luckily, it's pretty easy to remove the wacky data points.  (Basically, you look for impossible values and just delete them.  It leaves holes in your data, but that's better than assuming I can fly from place-to-place at 3,000 mph... or, as you can see in some of the data, that I used a submarine to travel at negative altitude!)  
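That cleaning step can be sketched as a simple filter.  The 100 mph and sea-level cutoffs below are just illustrative thresholds, not anything from the original spreadsheet:

```python
def clean_track(points, max_speed_mph=100, min_altitude_m=0):
    """Drop track points with impossible values.  Each point is a dict
    with 'speed_mph' and 'altitude_m' keys; returns only the plausible ones."""
    return [p for p in points
            if p["speed_mph"] <= max_speed_mph
            and p["altitude_m"] >= min_altitude_m]

track = [{"speed_mph": 65, "altitude_m": 12},
         {"speed_mph": 3000, "altitude_m": 15},   # GPS jitter
         {"speed_mph": 4, "altitude_m": -40}]     # the "submarine" point
print(len(clean_track(track)))  # prints 1
```

This is the crude "delete the impossible" approach; fancier methods interpolate across the holes, but deletion is usually fine for a quick analysis.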

There's another spurious value here. (The big spike going to the left.)  On the whole, the data is good--but this is why you compute speeds (as we did on Tuesday).... it makes it simple to find the broken data points.  

Let's wrap up this analysis with a quick telling of the story... 

What this KML tells us.... The KML file starts in Raleigh, NC, and travels to San Francisco on Nov 22.  After a couple days of traveling back and forth between work and home (and it's very easy for you to figure out which is which!), you can tell that I took a quick vacation over to the coast.  If you look carefully, you can even see where I went for a run, where I went for a bike ride, and where my favorite morning coffee shop is located!   

Search Lessons:  

There are many lessons here... let's start with the obvious.  

1.  A cell phone track can tell you a LOT about what a person is doing.  In this case, it's just my cell phone GPS locations. But you want to be aware that your phone has the capability of tracking your movements, and giving anyone who has access to that data a VERY deep insight into what you do.  In my case, I use this tracking information to geocode my photos, and on occasion, I can figure out where that interesting place I spotted while driving actually was.  That is, I can reconstruct where I went... sometimes that's exactly what I want to do.  

2.  Sometimes the cell phone data is spurious.  Luckily, the bad data points really stick out and are fairly easy to clean from the collected data.  Here's a deep lesson:  This is true for many logged events.  Just because the data is logged doesn't mean it's right.  Data sets almost always need to be cleaned carefully.  (And in particular, descriptive statistics can be misleading.  The "average speed" for my movements on Nov 26 is really high... unless you remember to account for the incorrect data points.)  

3.  The mode switch from Google Earth to Google Maps can be useful.  You can use several of the methods you know in Maps (e.g., [ * ] or StreetView) to get additional data.  The deep lesson here is that you should learn the different viewing modes for any data viewer (e.g., Maps, Earth, or Google Charts).  Switching between viewing modes is often a way to get deep insights into your data very easily.  

4.  Zooming in can often reveal lots of data.  Don't stay just at the most distant view; zoom in and out to get the details (when zoomed way in), and the context (when zoomed way out).  

5.  Know your tools.  The Google Earth time slider has a "current" slider (the little "house" shaped icon) and a "previous point" slider (the "crescent moon" icon).  

6.  Know what your cell phone can do.  Personally, I like the geotracking feature as it gives me a bunch of nice capabilities.  But you might not care for it.  To change the tracking behavior on Android, you need to go to Settings, then Location, where you'll see a screen like this: 

Click on "Location" in the above menu, then click on "Google Location Reporting"  in the next menu (at the bottom):  

You can then turn Location Reporting on/off.  (This is how your photos get the Lat/Long EXIF data.)  

And this is the option that gives Android the information that generates the track of the device.  (Note that you can activate "Location History" on a device-by-device basis.  So you can let your laptop be tracked, but not your phone, or vice-versa.)  

Now you know, and now you're empowered to make your own choices.  

Search on, geographically!

Google News to stop service in Spain

Plaza Mayor, Madrid.


As the result of a new Spanish law that will go into effect on January 1, 2015, Google has decided to stop serving Google News in Spain.  

The Spanish law is simple--if any internet service provider (such as Google News, Facebook news clippings, Daily Slate, etc.) uses any part of a news article from a Spanish news source, they have to set up a licensing deal.  That is, Google (or FB or DS) has to pay them for each use... Even if it's just a headline and a link to the paper, you still have to set up a deal and pay them for that use.  

The reasoning on Google's part is equally straightforward: Google doesn't run any advertising on News, so it's really a loss-leader, a kind of public service that lets people see a broad spectrum of the news across a wide number of sources.  That was Krishna Bharat's intent when he set it up originally, and that's what it still does.  

You can read the official Google Blog post here for all the details.  But it's a sad day for Spanish news readers.  

I hope that the Spanish publishers, as the German news publishers did a few months ago, discover that they're getting a LOT of traffic from Google News, and revert the law back to the way it is now to avoid losing half of their organic traffic.  

¡Qué lástima!

Wednesday, December 10, 2014

Search Challenge (12/10/14): Finding measures of things

Not to worry... we'll close off the "KML analysis" problem tomorrow. I know a couple of people still wanted to work on it, so I'll write a post with all the answers on Thursday.  

And in the meantime, here are a couple of quick Search Challenges to keep you going.  Here, the Challenge isn't so much finding an answer, it's really about understanding the question.  

1.  Vernal Fall in Yosemite National Park (see above) is one of the most beautiful waterfalls in the world.  It's also dangerous, as people get too close to the edge and can slip over the side in a heartbeat.  The Challenge here is this:  What is the elevation of Vernal Fall?
2.  What is the distance to Jupiter?   
3.  How big is a 2x4 ("two by four") piece of wood?  

Warning!  These seem trivially simple--they're all easy to look up... and that's why they're interesting search challenges for this week.  They don't take long to answer, but each one has a hidden gotcha.  

Tell us what the answer is for each question, and WHY you think that's correct.  


Search on!  

Tuesday, December 9, 2014

Answer--part 2: What's going on in this file?

As you might have figured, I've been traveling, and now that I'm back, I'm trying to catch up.  

I'm not going to give the final answer(s) to the Challenge yet, but I do want to show you another way to look at the data.  

If you open the KML file with a plain text editor, it looks like this: 

As we've already discussed, this is just a KML file.  If you look at the definition of a KML file, you'll see that this is a TRACK piece of data which is just an AltitudeMode (in this case, "clampToGround") followed by a list of timestamps followed by a coord.  Each coord is longitude, latitude, altitude.  (In that order, with the altitude being optional.)  
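If you'd rather pull the track apart programmatically than by eye, Python's standard XML parser handles KML fine.  The two-point track below is made up, but it has the same shape as the Challenge file:

```python
import xml.etree.ElementTree as ET

# A made-up two-point track in the same shape as the Challenge file.
SAMPLE = """<kml xmlns="http://www.opengis.net/kml/2.2"
     xmlns:gx="http://www.google.com/kml/ext/2.2">
  <Placemark><gx:Track>
    <altitudeMode>clampToGround</altitudeMode>
    <when>2014-11-22T15:04:00Z</when>
    <when>2014-11-22T15:05:00Z</when>
    <gx:coord>-122.143 37.441 12</gx:coord>
    <gx:coord>-122.150 37.445 15</gx:coord>
  </gx:Track></Placemark>
</kml>"""

KML_NS = "{http://www.opengis.net/kml/2.2}"
GX_NS = "{http://www.google.com/kml/ext/2.2}"

root = ET.fromstring(SAMPLE)
timestamps = [w.text for w in root.iter(KML_NS + "when")]
# Each gx:coord is "longitude latitude altitude" -- the reverse of
# the usual "lat/long" convention.
coords = [tuple(map(float, c.text.split())) for c in root.iter(GX_NS + "coord")]
print(timestamps[0], coords[0])
```

Each timestamp pairs up positionally with the coord in the same list position, which is exactly the structure you'd paste into a spreadsheet.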

Since I wanted to really do something interesting with the data, I first opened it in a text editor (again, my choice was TextWrangler), and then pulled out all of the timestamp, long, lat, altitude data.  (notice... this is the opposite of the "lat/long" order that is often used elsewhere).  

Given this, it was pretty straightforward to extract the data and create a data file consisting of four columns, which I then imported into a spreadsheet.  

Based on this, in the spreadsheet I was able to easily compute the additional columns for "Distance in degrees"  "miles traveled" "elapsed time (seconds)" and then ultimately, "speed (mph)." 
(Note: to compute "distance in degrees" I just used the Pythagorean formula, A^2 + B^2 = C^2--that gives me distance in degrees.  See the formula I used in Column F of the spreadsheet.  I looked up that at this latitude, 1 degree of longitude is roughly 55.3 miles.  That's not exactly right, but it's a good enough approximation for this evaluation since all of the location points are more-or-less at the same latitude.)  
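In code, that flat-earth approximation looks like this.  (55.3 miles per degree is the spreadsheet's conversion factor; it's only a reasonable value near this latitude, and the function name is my own.)

```python
from math import sqrt

MILES_PER_DEGREE = 55.3  # rough conversion near 37 degrees N; see text

def speed_mph(lon1, lat1, t1, lon2, lat2, t2):
    """Approximate speed between two track points.
    t1, t2 are timestamps in seconds."""
    # Pythagorean distance in degrees: A^2 + B^2 = C^2
    degrees = sqrt((lon2 - lon1) ** 2 + (lat2 - lat1) ** 2)
    miles = degrees * MILES_PER_DEGREE
    hours = (t2 - t1) / 3600.0
    return miles / hours

# one minute of driving, about 0.02 degrees of movement:
print(round(speed_mph(-122.14, 37.44, 0, -122.16, 37.44, 60), 1))
```

For short hops like these the approximation is plenty good; over longer distances you'd switch to a proper great-circle (haversine) formula.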

I also wrote a couple of spreadsheet functions to extract the day (where 11/22/2014 is day 0, 11/23/2014 is day 1, etc), and the time from the data.  Using those two pieces of information, I could compute the "timevalue," which is the number-of-the-day + fraction of the day as a decimal value.  That sounds complicated, but think of it this way:  noon on the 0th day is 0 + 0.5  (noon is halfway through the day--6PM on the second day is 1 + 0.75, that is, day 1 + 3/4ths of the day, or 1.75).  
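Here's that timevalue computation as a small Python function, assuming (as in the post) that 11/22/2014 is day 0:

```python
from datetime import datetime

DAY_ZERO = datetime(2014, 11, 22)

def timevalue(ts):
    """Day number since DAY_ZERO plus the fraction of that day elapsed,
    so noon on day 0 is 0.5 and 6PM on day 1 is 1.75."""
    delta = ts - DAY_ZERO
    return delta.days + delta.seconds / 86400.0

print(timevalue(datetime(2014, 11, 23, 18, 0)))  # prints 1.75
```

Packing day and time into one number is what makes the x-axis of the scatter plot work.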

Why do that?  I created the "timevalue" so we could easily make a scatter plot of the timevalue (which handily enough goes from 0 - 6) vs. the speed.  

If you look at the second tab (at the bottom of the spreadsheet--the one labeled "Charts of speed and time"), you'll see this: 

See columns A and B?  I just copied those from the main chart above (x-value and speed).  Use copy and then "paste special" in order to just paste the values of the spreadsheet.  

The scatterplot above is interesting, and odd.  You can see big gaps in the data.  Obviously, the phone is turned off during those intervals--that's when I turned my phone off at night.  (Even if it's out of the service area, it will keep recording data.) 

Handily, those correspond pretty nicely to the night times.  

The odd thing is that one spike on day 4.5 (that is, around noon on day 4).  If that's REALLY my speed, then I've moved really fast (close to 3000 mph!!).  Obviously, this bears some closer inspection.  

But zooming in on this data is kind of a pain in Google Spreadsheets, so I went looking for another tool that lets me zoom in more quickly and easily.  

I exported this data to a tab-separated value (TSV) file (just plain text, with each value separated by a tab) and then imported it into a data set.   Here's what the data looks like imported into a scatterplot diagram. Click here to see this data plot live.  
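Writing the computed columns out as a TSV is only a couple of lines with Python's csv module; the filename, column names, and sample values here are just placeholders:

```python
import csv

# (timevalue, speed) pairs -- placeholder values for illustration
rows = [(0.505, 65.2), (0.512, 68.0), (0.519, 2.3)]

with open("track.tsv", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["timevalue", "speed_mph"])
    writer.writerows(rows)
```

The resulting plain-text file imports cleanly into just about any charting tool.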

Now the inter-day gaps are really visible, and it's clear that the huge speed spike at day 4.5 has to be spurious.  

And, when you zoom in (which you can do with the live version above), you can see this on Day 0 (that is, timevalues from 0 - 1.0). 

These values look more reasonable.  It's clear that between 0.1 and 0.2 I was driving somewhere, and driving at speeds up to 75 mph.  (That is, on the freeway.)  The phone was turned off between 0.2 and 0.41, and then more freeway driving ensued.  Same story at 0.65 - 0.7.  

Given this kind of view of the data, can you now figure out how the movements (in space) correlated with speeds?  

As another clue, you can look at this plot: 

This is me traveling right around the middle of the day at speeds between 5 and 18 mph from 5.57 until 5.61 (or, to convert that back into regular clock time, that's 1:50PM - 2:35PM).  That's too fast for walking, too fast for running, and too slow for driving in a car.  What could I have been doing during that time?  
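If you want to convert timevalues back into clock time yourself, the arithmetic is just the reverse of the earlier day-plus-fraction computation (this sketch ignores time zones, and the function name is my own):

```python
def clock_time(tv):
    """Turn a day-plus-fraction timevalue back into (day, "HH:MM")."""
    day = int(tv)
    minutes = round((tv - day) * 24 * 60)
    return day, "%02d:%02d" % divmod(minutes, 60)

print(clock_time(1.75))  # prints (1, '18:00')
```

Rounding to whole minutes is fine here, since the GPS samples are much coarser than a minute anyway.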

More details revealed tomorrow!  We'll talk about the 3,000 mph data spike.  

Search on.  (And now, with spreadsheets and data visualizations!)  

Friday, December 5, 2014

Answer--part 1: What's going on in this file?


You're doing great!  But I'm going to let this Challenge run over the weekend so you'll have a bit more time to do the analysis.  (There's a lot more that can be done with this data.)  

As you quickly noticed, this is a KML file, which is the native file format for Google Earth, but which can be used by other products as well.  A query like: 

     [ KML apps ] 

quickly shows a bunch of applications that can be used to inspect or view the contents of the file. 

Here's the KML file for you:

I want to recommend one in particular to you:  Google Maps Engine.  

Google Maps Engine is now called "My Maps."  

Just this week, the Maps Engine/My Maps tool got a serious upgrade.  You can now have up to 10 layers in your maps, and import spreadsheets or CSV files with up to 2,000 addresses per layer.  You can then share your maps publicly, embed them on your site, and easily support up to 25,000 map views per day. 

Most importantly, for this Challenge, you can import a KML file.  

In this case, I "Create a new map" and then import the KML file... 

Once imported, it should look something like this: 

Here, I've zoomed way in, and changed the style of the trace to be thinner and red... 

And now, given this big hint... What else can you figure out?  

As Fred asked in the comments, yes, this person (well, me) went to the Costanoa Lodge (just south of Half Moon Bay), and as Mihai discovered, this person also went to the Carolina Inn.  

Now, HOW did you find that out?  What properties of the KML file led you to that conclusion?  

What ELSE can you find out?   

(And, just to rest your worries, I always turn off the phone on the plane....)  

Search on!