Tuesday, July 12, 2011

Answer: How many planes from SFO / day?

Last week I asked a question that should be pretty simple to answer:  How many regularly scheduled flights depart SFO (San Francisco airport) on a typical day?  

The hard part here is to find a reasonable data source of flights from SFO.  Here's what I did: 

First, I searched for [ faa flight departures ] 

When I scanned the results, I quickly found that the FAA has an "ASDI Data Feed," which shows all of the flights that the FAA tracks.  That wouldn't be a bad thing to use, but I only want to do this once for one airport, not over and over, so it seems like overkill.  

In my reading of the results I also learned about codeshares.  Ah, that's right: any listing of airline flights can show the same flight listed under multiple carriers.  Since I wanted the number of planes departing, I need to exclude the codeshare flights.  I don’t want to double or triple count these, so I want a single code for each plane.  

Then, after searching for [ SFO flights listing ], I found that FlightStats.com allows you to search for all departures by the hour!  Although it's inelegant (and I wouldn't want to do it this way more than once), I can manually copy/paste all of the departures from SFO into a spreadsheet!  Luckily, FlightStats also has a "Remove codeshares" button, which lets me skip the post-download step of removing dups.  

It seems painful, but the truth is that I was able to grab the 18 screen-fulls of data in about 3 minutes and paste it into my spreadsheet.  Great!  

After spot checking to make sure that I really DID have all the flights, and that there really were no codeshares, and that I hadn't accidentally duplicated data somehow, I was ready to go.  Finding the number of flights was then just looking at the last row in the sheet:  there are 583 regularly-scheduled flights / day out of SFO (at least for July 12, 2011).  
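If you'd rather script that spot check than eyeball it, here is a minimal Python sketch.  It's not what I actually did (I just looked at the sheet), and it assumes the pasted data has been saved as CSV with three hypothetical columns named flight, destination and departs.  The sample rows are invented for illustration; they are not real FlightStats data.  Note that "same destination, same time, different flight number" is only a hint of a codeshare, since two different planes can legitimately leave for the same city at the same minute.

```python
import csv
import io
from collections import Counter, defaultdict


def spot_check(csv_text):
    """Return (row_count, exact_duplicates, suspected_codeshares).

    Assumes hypothetical column names: flight, destination, departs.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # Exact duplicates: the same row pasted twice by accident.
    seen = Counter((r["flight"], r["destination"], r["departs"]) for r in rows)
    exact_duplicates = sorted(key for key, n in seen.items() if n > 1)

    # Suspected codeshares: different flight numbers sharing one
    # destination and departure time. This is a heuristic, not proof.
    by_slot = defaultdict(set)
    for r in rows:
        by_slot[(r["destination"], r["departs"])].add(r["flight"])
    suspected = {slot: sorted(f) for slot, f in by_slot.items() if len(f) > 1}

    return len(rows), exact_duplicates, suspected


# Invented sample rows, for illustration only.
SAMPLE = """flight,destination,departs
UA 101,LAX,06:00
UA 101,LAX,06:00
LH 9001,LAX,06:00
AA 20,JFK,07:15
"""

count, dupes, codeshares = spot_check(SAMPLE)
print(count, dupes, codeshares)
```

Once both lists come back empty, the row count is the number of departures, which is the same thing as reading the last row number in the sheet.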

But I couldn't stop there!  From here, it was easy to write out a CSV file and then import that into Google Fusion Tables in order to use the map visualization tool that I know is there.  A quick import, and then a click on "Visualize" and then "Map" yields: 

Interesting, is it not, that there are more flights to more cities in China than to Russia, Africa and Australia combined?  

And for my last quick look at the data, I computed a histogram to see the number of flights departing by hour of the day.  As you can see, it's quiet after midnight, but pretty steady for most of the day.  

And finally, this chart groups departures into 30-minute buckets... 
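For the curious, the bucketing behind both charts can be sketched in a few lines of Python.  This is an illustration rather than the spreadsheet formula I used, and it assumes the departure times are 24-hour "HH:MM" strings; the function name and the sample times are made up.

```python
from collections import Counter


def bucket_departures(times, minutes_per_bucket=60):
    """Count departures per time bucket.

    times: iterable of 'HH:MM' strings (24-hour clock, an assumption).
    Returns a dict mapping each bucket's start time to its count.
    """
    counts = Counter()
    for t in times:
        hour, minute = (int(part) for part in t.split(":"))
        # Round the minute-of-day down to the start of its bucket.
        start = (hour * 60 + minute) // minutes_per_bucket * minutes_per_bucket
        counts[f"{start // 60:02d}:{start % 60:02d}"] += 1
    return dict(sorted(counts.items()))


# Invented sample times, for illustration only.
SAMPLE_TIMES = ["00:10", "06:05", "06:40", "06:59", "13:30"]

hourly = bucket_departures(SAMPLE_TIMES)           # by hour of the day
half_hourly = bucket_departures(SAMPLE_TIMES, 30)  # 30-minute buckets
print(hourly)
print(half_hourly)
```

Buckets with no departures simply don't appear in the result, so the quiet hours after midnight show up as gaps rather than zeros.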

So now you know the quiet times for departures at SFO.  I've flown enough times out of SFO to say with some certainty that this is probably right.  

People who left comments (Hans, Julia and gasstationswithoutpumps) all had reasonable guesses, but they were estimates from other people's summaries.  That's a valid method when you can't get to the actual data, but when possible (as in this case), I like to get the raw numbers and do the analysis myself.  (Still--Hans got extremely close to my number! Nice job.)  

There are a couple of search lessons here.... 

1.  When you can, get the raw data and do your own analysis.  Other people's analyses often contain assumptions that you can't verify (and they might not even explain).  Be careful! 

2.  A really important part of the search analysis process is learning stuff along the way.  This is such an important concept that we'll have to talk about it in more detail in a future post, but I wanted to highlight it here.  I was reminded about the codeshare problem as I read through the initial search results... and that let me know how to frame my subsequent searches to be more to the point.  

Keep searching! 

Postscript:  I should also mention here Aaron Koblin's wonderful visualizations of air traffic patterns. It's not quite what I was asking for, but you might be interested in seeing them.   

Here I zoomed into his flow patterns around SFO.  The labels are mine to add a bit of clarity.  
