This week's stat-tastic run report comes thanks to Daniel ROWAN who ran his 90th parkrun and 85th at Banbury:
On a lovely autumnal Saturday morning I joined 254 other runners for another parkrun; it’s always a great feeling to start the weekend with an achievement (sadly for me, that’s usually as productive as it gets). Welcome to the 19 first timers, and 18 tourists, who joined us; we hope to see you again soon!
As always, a massive thank you to the volunteers, who give up their time to make this event possible.
And now for the stats…
Wait! Hang on a second, don’t the stats usually come at the end of the report?! Well yes, normally… Apologies in advance, it has always been an intention of mine to spend a couple of hours exploring the vast amount of data held on the parkrun servers, and this week’s weather-induced switch to the winter course seemed to give me the perfect premise for a report, and it was also something I found myself wondering recently as I try to chase that all important PB…
Question: Is the winter course faster or slower than the summer course?
From my casual asking around, this is certainly something that people have differing opinions on, and for a variety of reasons. Some are of the belief that the winter course should be objectively faster, because more of the run is on a harder surface, which surely helps with pace. Others, like my partner Iwona, believe that it is slower, reasoning that, as well as the weather usually being worse, running an additional lap is that bit more repetitive, making the winter course a more daunting prospect.
I’ve found that a useful rule for life is that the plural of anecdote isn’t evidence, so I turned to the data held on Banbury parkrun servers to help answer this question.
Some caveats to bear in mind before going any further:
- To provide a healthy amount of data on which to base my unsophisticated analysis, all data in this report comes from Banbury parkrun, from the dates 20 October 2018 until 19 October 2019.
- I thought about going back further with the data (after all, Banbury parkrun has been running since September 2014); however, as the parkrun grows, the profile of parkrunners changes with it, and in the interests of data quality and consistency I decided only to use the past year’s worth of data to try and limit some of the variation.
- For the past year, the winter course was used by Banbury parkrun on the following dates: from 13 Nov 2018 to 20 April 2019 and, of course, this week, 19 October 2019.
- For ease of calculation, I’ve used decimal time. To get back to minutes:seconds, take the digits to the right of the decimal point and multiply them by 60 (seconds in a minute). For example, a time of 28.45 becomes 28:27 (0.45 × 60 = 27).
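The decimal-time conversion in the last caveat can be sketched in a few lines of Python. This is only an illustration; the function name `decimal_minutes_to_clock` is made up for this example and is not part of any parkrun tooling.

```python
def decimal_minutes_to_clock(decimal_minutes: float) -> str:
    """Convert decimal minutes (e.g. 28.45) to a minutes:seconds string."""
    # Work in whole seconds so that rounding happens only once.
    total_seconds = round(decimal_minutes * 60)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"


print(decimal_minutes_to_clock(28.45))  # 28:27
```

Rounding to whole seconds first avoids odd results such as a seconds value of 60 when the fractional part is very close to a full minute.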
So… Back to the task of trying to measure which is the quicker Banbury course. My first thought was that perhaps the proportion of runners achieving a PB (that’s Personal Best to the uninitiated) each week would be a good indicator of course speed (the hypothesis being that on a fast course, more people are likely to get PBs, right?)
As you can see from Figure 1, this statistic seems to suffer from quite volatile levels of variation at a weekly level; this inconsistency may suggest it is more exposed to day-specific conditions such as weather, field size, and other external events. You don’t have to look far for a good example of this: take last week (12th Oct), which had one of the slowest average times recorded this year. Two possible explanations are that runners stayed at home to watch Eliud Kipchoge’s epic sub-2 hour marathon finish, or perhaps that many were preparing for the Oxford or Birmingham Half Marathon the following day.
Although Table 1 suggests that people are more likely to reach a PB during the winter course, the fact that the percentage point difference is relatively small combined with the lower number of total finishes suggests to me that this is far from conclusive evidence. It may be worth a quick look to see if there is any other data which may offer more robust insight.
Fun with averages
Perhaps it’s as simple as looking at the average times for each course then?
Table 2 suggests, much like our findings from Figure 1, that the Winter course is slightly faster, with a lower mean and median time (just as a side note, the median is often a useful average when measuring times because it is more resistant to the effects of outliers, e.g. tail walking marshalling. I’ve always wondered why parkrun UK use the mean average instead…) There is also a smaller standard deviation (a measure of spread), which can be interpreted as the majority of the times falling within 6.9 and 6.5 minutes of the mean for each respective course.
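To see why the median resists outliers better than the mean, here is a small sketch using Python’s standard `statistics` module. The finish times are invented for illustration and are not real Banbury results.

```python
from statistics import mean, median, stdev

# Hypothetical finish times in decimal minutes; NOT real Banbury data.
finish_times = [22.0, 24.5, 26.0, 27.5, 29.0, 31.0]
# The same field plus one much slower finisher, e.g. a tail walker.
with_tail_walker = finish_times + [58.0]

# How far does each average move when the single slow time is added?
mean_shift = mean(with_tail_walker) - mean(finish_times)
median_shift = median(with_tail_walker) - median(finish_times)

print(f"mean moves by {mean_shift:.2f} minutes")
print(f"median moves by {median_shift:.2f} minutes")
print(f"standard deviation: {stdev(finish_times):.2f} -> {stdev(with_tail_walker):.2f}")
```

In this made-up field, one slow finisher drags the mean by several minutes while the median moves by less than a minute, and the standard deviation grows noticeably too.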
Well, it looks like the Winter course has the edge on these numbers, but is the difference enough to make it significant? If only there was a formal statistical test for comparing the two distributions…
Thankfully, there is! I’ve decided to use an unpaired two-sample t-test as a quick and dirty method for comparing two samples from the same population. As much as I would love to get into the details, I’m aware I’ve probably lost a good many readers already, so I will just summarise to say that a p-value (a probability score) of less than 5% (p-value < 0.05) in this test is commonly interpreted as significant enough evidence to reject a null hypothesis that there is no difference. Therefore the result of the two-sample test in Table 3 (p-value = 0.048) suggests that it is enough to meet the test for significance – indicating evidence that the winter course is faster (though it’s extremely close!)
For those more interested in the nature of the distribution, there are a couple of bonus graphs below (if nothing else, they’re pretty to look at):
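For the curious, an unpaired two-sample test can be sketched with nothing but the Python standard library. A few loud assumptions: this is Welch’s unequal-variance version, which may not be the exact variant behind Table 3; the p-value uses a normal approximation that is only reasonable for large samples such as a year of finish times; and the sample times below are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, approximate two-sided p-value) for two unpaired samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Standard error of the difference in means, without assuming equal variances.
    standard_error = sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / standard_error
    # ASSUMPTION: large samples, so the normal distribution stands in for Student's t.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


# Hypothetical decimal-minute times; NOT real Banbury data, and far too few
# for the normal approximation to be trustworthy.
summer = [24.1, 27.3, 29.8, 31.2, 33.5, 26.4, 28.9, 35.0]
winter = [23.8, 26.9, 29.1, 30.4, 32.7, 25.9, 28.2, 34.1]

t_stat, p_value = welch_t_test(summer, winter)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

A p-value below 0.05 from a test like this would conventionally be read as evidence of a difference between the two courses; anything above it would not.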
This ‘s’ shape distribution crops up a lot in statistics and is known as a sigmoid curve. Because much of the comparative increase in variation of green dots comes from the larger sample size of summer course runs, it isn’t particularly insightful here; however, look closely and you can see some cool patterns in the data. For instance, the arm of green dots stretching out like a tentacle near the top – it looks like all these results came from an event on May 4th (Star Wars day!). I wonder why?
So it’s simple then, the winter course is definitely faster?
Not so fast – in fact I would hesitate to draw any conclusions from the data I’ve looked at today. Not that I’m suggesting this exercise was entirely a waste of our time; certainly a case can be made that the winter course will not affect your PB chances. However it’s likely (of course) that there are other things that will have much more of an impact, such as training, or what you had for breakfast. All things being equal, I don’t really believe that there will be too much difference. It’s important also to refer back to our caveats: this data is vulnerable to things that are difficult to measure, such as human behaviour. I imagine you see far fewer walkers and ramblers on cold winter days, which will inevitably affect the average for the field.
If I was going to explore this further (don’t worry, I won’t!), I would probably look for people who run regularly, and run tests on how either course affects their times (though this, too, wouldn’t be perfect, as most of the regular runners I know are improving all the time, and this is another confounding factor you would have to control for).
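As a rough idea of what that follow-up might look like, the sketch below compares each regular runner with themselves across the two courses. The runner names, times and the helper `per_runner_differences` are all hypothetical, and it makes no attempt to correct for runners improving over the year.

```python
from statistics import mean

# Hypothetical results: (runner, course, time in decimal minutes). NOT real data.
results = [
    ("runner_a", "summer", 25.0), ("runner_a", "summer", 24.6),
    ("runner_a", "winter", 24.9), ("runner_a", "winter", 24.5),
    ("runner_b", "summer", 31.2), ("runner_b", "winter", 30.6),
    ("runner_c", "summer", 28.0),  # no winter run, so excluded below
]


def per_runner_differences(rows):
    """Return {runner: mean winter time - mean summer time} for runners seen on both courses."""
    times = {}
    for runner, course, minutes in rows:
        times.setdefault(runner, {}).setdefault(course, []).append(minutes)
    return {
        runner: mean(by_course["winter"]) - mean(by_course["summer"])
        for runner, by_course in times.items()
        if "winter" in by_course and "summer" in by_course
    }


differences = per_runner_differences(results)
for runner, diff in differences.items():
    # A negative difference means this runner was quicker on the winter course.
    print(f"{runner}: {diff:+.2f} minutes")
```

Comparing each runner only with themselves removes the effect of a changing field (for example, fewer walkers in winter), which is one of the weaknesses of the simple averages above.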
Well, that’s all for this week. I hope you found it interesting, please let me know if you have any thoughts, or perhaps avenues for future exploration and hopefully I’ll see you at parkrun soon!
PS: If you’re a fan of data driven parkrun insights then it’s worth checking out this report from event 183, authored by our very own RD Sam Young. Among many insights it contains what must be one of my favourite statistics: someone at Banbury parkrun has finished in position no. 186 a grand total of six times!
PPS: Shout out to Pug-man, thanks for the eggs.
And now for the stats (no really this time)...
This week 255 people ran, jogged and walked the course, of whom 36 were first timers and 37 recorded new Personal Bests. Representatives of 23 different clubs took part.
The event was made possible by 25 volunteers:
Jeffrey TRYBUS • Warren HARRISON • Graeme HACKLAND • Alice PALMER • Gyles HORNER • Helen ROBERTS • Claire UPTON • Danny BATCHELOR • Sera RELTON • Becky ROGERS • Martyn BANHAM • Alan UPSTONE • Jocelyn ATKINSON • Sally ANDREWS-DUKE • Fabienne GORDON-CUMMING • Tim KYTE • Stuart TAYLOR • Stuart HADEN • Neil SIMMONDS • Cat MILLER • Joe MILLER • Daniel ROWAN • Richard TEW • Kathleen BEGGIN • Nick MACEY
Today's full results and a complete event history can be found on the Banbury parkrun Results Page.
The male record is held by Ian KIMPTON who recorded a time of 16:30 on 4th April 2015 (event number 27).
The female record is held by Amelia PETTITT who recorded a time of 18:07 on 20th June 2015 (event number 38).
The Age Grade course record is held by Lilian CARPENTER who recorded 87.73% (28:40) on 29th June 2019 (event number 243).
Banbury parkrun started on 27th September 2014. Since then 6,931 participants have completed 51,544 parkruns covering a total distance of 257,720 km, including 8,979 new Personal Bests. A total of 714 individuals have volunteered 5,430 times.