Monday, July 29, 2013

Postscript: Have the Tour de France riders really gotten faster?

In his comment on last week's Challenge, regular reader Unknown (aka Jon) made the point that: 
Statistical Analysis is tricky. Over at StackExchange, the exact same chart was already posted in Jan 2012 in order to prove just the opposite of this Challenge, namely to show that correlation does not equal causality.

And remember that Rosemary made a similar point:  "There are many factors at work..."  

To clarify a bit.  

I was asking a fairly simple question.  (Just "have the riders gotten faster over the years.")  

There are a million ways to analyze this, but I *meant* to ask the simple version:  "Is the average speed faster now than in previous years..."  

Note that I wasn't really looking for anything more subtle than that. I was just curious (and my pedagogical goal was to show how straightforward this kind of analysis is--find the data, clean it up, start analyzing).  
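For the curious, here's a minimal sketch of what that simple version might look like in Python. It assumes you've already found the data and cleaned it into (year, average speed in kph) pairs; the function names `mean_speed_by_decade` and `is_faster_now` are made up for this illustration, and the ten-year window is just an arbitrary choice of what "now" means.

```python
from collections import defaultdict
from statistics import mean


def mean_speed_by_decade(records):
    """Group (year, avg_speed_kph) pairs by decade and average each group.

    Rows whose speed is None (e.g., years with no race) are skipped.
    """
    buckets = defaultdict(list)
    for year, speed in records:
        if speed is None:
            continue
        decade = (year // 10) * 10  # 1987 -> 1980
        buckets[decade].append(float(speed))
    return {decade: mean(speeds) for decade, speeds in sorted(buckets.items())}


def is_faster_now(records, recent_years=10):
    """True if the mean of the most recent rows beats the mean of all earlier rows.

    `recent_years` counts rows with data, not calendar years (an assumption
    made to keep the sketch short).
    """
    clean = sorted((year, float(speed)) for year, speed in records if speed is not None)
    if len(clean) <= recent_years:
        raise ValueError("need more rows than recent_years")
    earlier = [speed for _, speed in clean[:-recent_years]]
    recent = [speed for _, speed in clean[-recent_years:]]
    return mean(recent) > mean(earlier)
```

Feed it the winner's average speed for each Tour and `mean_speed_by_decade` gives you the by-decade trend, while `is_faster_now` answers the plain yes/no question. Note what it does NOT do: it makes no attempt to account for doping, course difficulty, or technology.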

So I really wasn't trying to answer the more sophisticated question of "are riders now faster than they were back then, IF you account for doping, course difficulty, technology... etc. etc."  

That's a fascinating topic, but it leads into a much more complex analysis.  (See the link above--Jon is right--they have an excellent in-depth discussion.)  

And to answer Jon's question--I wasn't proposing that this correlation equals causality.  (I'm not sure what the causal link would be anyway!)  It's simply an observation: the average speed now is higher than it was 10, 20, 30...100 years ago.  

I certainly enjoyed reading the StackExchange article.  And for fans of data analysis, the Python code in one of the answers is really interesting.  

One of the authors had it right--the dominant term in these equations these days is aerodynamic drag.  As you speed up, the drag force increases as the square of the speed (and in proportion to your frontal surface area), and the power needed to push against it increases as the cube of the speed.  In other words, to get a speedup of 2X (to go from 10kph to 20kph), you need roughly 8X more power against the air (or use other tricks to get around the resistance--e.g., making your front surface smaller).  
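To put rough numbers on the drag argument, here's a small sketch using the standard drag equation: force = 0.5 × air density × CdA × speed², and power = force × speed. The default CdA of 0.3 m² is a placeholder I picked for illustration (not a measured value for any rider), and 1.225 kg/m³ is the usual sea-level air density. The sketch ignores rolling resistance, climbing, and drafting, so treat it as a way to see the scaling, not a model of a real race.

```python
def drag_power_watts(speed_kph, cda_m2=0.3, air_density=1.225):
    """Power (in watts) needed to overcome aerodynamic drag alone.

    cda_m2 is drag coefficient times frontal area; 0.3 is an assumed
    placeholder, not a measurement.
    """
    v = speed_kph / 3.6  # convert kph to meters per second
    drag_force = 0.5 * air_density * cda_m2 * v ** 2  # newtons, grows with v squared
    return drag_force * v  # watts, grows with v cubed


# Ratios do not depend on the assumed CdA or air density, only on the speeds.
ratio_doubling = drag_power_watts(20) / drag_power_watts(10)
ratio_41_to_45 = drag_power_watts(45) / drag_power_watts(41)
```

Because the constants cancel, the ratio for doubling your speed is 8, and going from 41 kph to 45 kph works out to roughly 1.3 times the power against the air. That cubic growth is why a few extra kph at the top end is so hard to come by.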

And from where we stand today, it seems as though we're starting to see the beginning of the asymptote of human performance around 41 kph.  I'm sure there will be a Tour that is relatively flat, or short, or wind-aided that will be faster than 44... but all things being equal, it seems VERY unlikely that humans will beat 45 kph over the length of the Tour.  

Nice question and comments!  

Search lesson:  When doing an analysis of this kind, ALWAYS check for communities of experts who might well have already done the analysis or, at the very least, will have discussions of the topic (which might well raise issues and factors that you hadn't considered).    

I've found the StackExchange communities to be pretty high quality.  They have a LOT of communities that have extensive discussions about lots of topics.  Example list:  software, mathematics, home repair, Wordpress, physics, homebrewing, etc.  Perhaps most interesting from our perspective, there's a community about OpenData, which discusses public data sets and the tools you might use to work with them.  
