I've been enjoying watching as SearchResearchers tackle this Challenge. It's a bit trickier than I thought, so let me walk you through my (relatively) simple solution.
To begin with, I searched for a government organization that I thought would have the snowfall data for November. My query was:
[ NOAA snow depth data ] -- I added “daily” to make sure I found complete data.
This led me to the NOAA.gov site for snowtracking.
Which leads to:
And from there to the data file, http://www1.ncdc.noaa.gov/pub/data/snowmonitoring/fema/11-2014-dlysnfl.txt This is just a plain text file, so I opened this in my favorite boring text editor (I use TextWrangler because it has lots of nice features, such as grep, for data wrangling). I pulled out the data for Buffalo, Rochetser and Oswego for November 2014. (Actually, I pulled out the data for Pulaski which is very near Oswego – the other rows for Oswego had too many missing values, indicated by -9999.00)
For Canada data, I did the simple query:
[ Canada snow data daily ]
Which led to the official government site, http://climate.weather.gc.ca/index_e.html
On this interface to their data I selected: “daily” data values, for “November” - then downloaded London International Airport data and the downtown Toronto data. Here's what the London data looks like.
This is from the London International Airport data set.
Again, I copy paste this into my text file and then converted it all fit into a CSV file.
I uploaded that CSV into my spreadsheet, did a quick conversion of the US data (from inches into cm).
Now, I need to visualize it. There are lots of things one could do here, but first I did the simplest thing I could imagine. I computed a line graph for each of the cities over time. It looks like this:
While that shows what happened and when, but it's not all that great for showing spatial distribution. So here's another simple version of the chart (remember that you can click on the chart to see it at full size).
I didn't do anything fancy here--I just screenshot the map, then laid the charts for each location on top of them. This is a visualization technique known as "small multiples" (that is, a small number of repeated charts all of the same type)--but one of the things to know is that they all have to show the same thing on the same axes, or all bets are off.
Note that I DID have to set the max-grid values to all be 19. If left to their own devices, the Y axis will be different on each one. I wanted them to be comparable, so I had to manually set them all to 19. If I was producing hundreds of charts, I would have written a piece of code to this... but this data set was small enough that you could do it by hand.
As I worked through this data collection and plot task, the biggest challenge for me is just keeping track of where my data came from, and how it gets transformed from one data source to another. As I worked, I kept backtracking and checking my data. (This is where having a buddy to work with is a great idea--looking at you Ann & Debbie.)
I also really like your comments about wanting to create contour maps (or heat maps, although maybe we should call them cold maps).
If I get the chance, I'll write that up tomorrow.
Search/Sensemaking Lessons: In the meantime, it's useful to see that sometimes the simplest approach is best. (See my queries above.)
The hard part is knowing which of the many data sources will work out.
My approach is always to grab a small data set (3 or 4 cells) and then work through the whole process from beginning data download to visualization. It's a mistake to try and do the entire data manipulation process on the full data set at the beginning. Use a small sample and work UP to the full dataset. Trust me, I've wasted many hours on wrangling data, only to find out, when I got to the end, that the whole thing was broken. Better to find that out on a small subset of the data than the entire thing.
Because sometimes you'll get halfway through an analysis and realize that everything you're doing is wrong. Or that the data doesn't make sense. Or it's too full of errors, or missing data points.
More comments tomorrow... but this was my quick answer for today.
We'll be talking more about this in the future!
In the meantime, Keep Searching On!