Is skating really unfair? Yes, even in extra stringent analysis

This post first appeared on pygaze.org, in October 2015. There is a previous post on the initial analysis, and a next post on data from 100-meter times.

TL;DR

Yesterday, we reported that random variability in the starting procedure of racing sports can bias competitions, even at Olympic events. Not everyone was keen to believe this, and some people have made suggestions for things we should control for. Some even went so far as to criticise our methods. In this post we address all questions, and provide an extra analysis that looks at within-athlete effects of changes in the ready-start interval on changes in race times. This analysis is robust to differences between skaters’ individual qualities, and has causal power. Our results indicate that there still is evidence that random differences in ready-start intervals might bias competitions. At the very least, this calls for future research into the starting procedure of racing sports. Which is exactly what we intended to provoke with yesterday’s publication.

What happened?

Perspective article in Frontiers in Psychology

Yesterday, we published a Perspective article in the academic journal Frontiers in Psychology. The article made a theoretical point about a potential bias in the starting procedure of some racing sports, including speed skating. We described the alerting effect, which makes people quicker to respond when they have just been alerted:

The starting procedure in racing sports closely resembles a classical experiment, where participants receive an alerting cue before having to respond to a target stimulus. The cue is a general, non-spatial signal that precedes the target stimulus by a variable interval. In the lab, participants are quicker (Posner and Boies, 1971; Adams and Lambos, 1986) and more precise (Klein and Kerr, 1974) to respond after an optimal interval duration of around 500ms, and are progressively slower and less precise after longer durations.

Inconsistent starts in racing sports

In some racing sports, the referee signals the competing athletes to get ready, and fires the starting shot after a certain interval. We refer to this as the ready-start interval. The crucial point is that this interval is variable. It depends on how quickly athletes assume their starting position, but also on regulations that specifically call for variability. The latter is true in speed skating. In addition, speed skaters compete in pairs, and whoever ends up with the lowest total time over their two races wins the gold. Crucially, each skater starts with different ready-start intervals.

Alerting differences could bias competitions

In our article, we argued that athletes who start with a shorter ready-start interval should have a theoretical benefit over those who start with a longer ready-start interval, due to the alerting effect we describe above. Of course, we realised that this was quite a strong claim to make based on data from psychological studies that are run in a highly controlled environment. Surely, at Olympic events, factors like talent and training should strongly outweigh the potential benefits of a short ready-start interval? This is why we collected data from the 500 meter speed skating competition at the 2010 Winter Olympics. We extracted millisecond-accurate ready-start intervals from the audio trace of the event’s broadcast, and correlated them with the race times of each athlete.

What we showed is that ready-start intervals and race times correlate. And the effect was quite large at that: several hundred milliseconds. That does not sound like much, but it can make a real difference in racing sports. Enough, in some cases, to mean the difference between winning the gold, and not winning a medal at all!

The longer a referee waits between “Ready” and “Start”, the slower athletes finish!

So what’s wrong?

Lots of attention for our article…

The story got picked up by the media and was reported on in newspapers, websites, and radio shows in several countries. In the Netherlands, where Dutch people live, reporting was especially pronounced: Beorn (second author and former speed skater) even made it to national television. There are two things you should know about the Dutch: they love speed skating, and they are a naturally sceptical kind of people. This quality makes them very good at evaluating science, and so they did: Mere minutes after publication, I was already receiving emails from colleagues who suggested further analyses and control studies that could be run. This was great! Post-publication discussion of the starting procedure is exactly what we wanted!

Some bad attention…

In addition to the constructive comments, some people simply dismissed our findings. Some did so because they simply “do not believe them”, but some had more substantiated criticisms. The most prevalent of these was that we relied on what is essentially a correlation between ready-start intervals and race times. As we have all heard, correlation does not imply causation! What people argued was that the causality might run the opposite way: maybe good skaters take less time to get ready, and thus have shorter ready-start intervals!

In addition, some people argued that we should not have been using a correlation in the first place, because our data included up to two data points from the same athlete. The assumption here is that data points coming from the same athlete might cause non-independence in our data set. At least, that is what I assume they meant, because they did not do a very good job of explaining their point. They also did not make any suggestions about what would be the appropriate analysis, and they hid behind an anonymous website. Furthermore, they did not contact us before publication, and they failed to recognise that our manuscript has passed scientific peer review, which should vouch for its methodological and statistical soundness. Instead, TopSport Topics decided to post a rather shallow discussion of our paper that came to a very strong conclusion, without doing more than handwaving towards potential issues.

Cool scientific discussion

From the previous paragraph, you might conclude that we were a bit upset. And we were. But not because they criticised us! Other people have contacted us about similar issues, and we had very good discussions about our data. For example, Prof. Lex Borghans of Maastricht University contacted us with the question highlighted before: what direction was causality in? Were skaters really quicker due to shorter ready-start intervals, or could it be that good skaters assume their starting positions quicker and thus have shorter ready-start intervals? I should note that Lex was not the only one asking this question, but he was the only one (to my knowledge) who wanted to write a blog post on it. He asked for our data, and we were very happy to send it over for Lex to re-analyse. The resulting blog post can be found here.

The method that Lex applied was really clever: he tried to predict the race times of skaters’ second race by using the ready-start interval from the first race. This allowed him to infer the direction of causality. If shorter ready-start intervals really cause shorter race times, there should be no correlation in Lex’ analysis. If, on the other hand, short ready-start intervals were caused by athletes being really good, there should be a correlation. Lex’ findings were more in line with the latter suggestion: the ready-start intervals of one race correlated with the race times of the other! The correlation was less strong than the correlation we found, but that difference was not significant.

We have to go deeper!

So where does that leave us? Did we publish too soon? Not necessarily, because Lex’ method is a bit like a statistical sledgehammer: It’s not really sensitive to the kind of subtle effects that our alerting hypothesis would predict. In addition, if you assume skaters are causing both their ready-start intervals and their race times to be longer or shorter at the same time, then the skater would be a moderator. In Lex’ analysis, this would mean that the relationship between the ready-start interval of one race and the finish time of the other race is inflated. For example, if a skater performs badly (e.g. due to a long ready-start interval) during the first race and gets demotivated afterwards, they will show both a longer ready-start interval and a slower race time in the second race. Thus the correlation between the first ready-start interval and the finish time of the second race would be moderated by the skater’s state. In the new analysis we describe below, this is not an issue: When the state of the skater is assumed to have an effect on both ready-start interval and race time, this does not change the direction of the relationship between ready-start interval and race time. And that direction is exactly the problem in the current starting procedure. Please do read on if you want to learn about our new super-stringent analysis.

So we set out to do what every scientist does when they are being critiqued: we dove back into our own data, and we scienced the shit out of it. The results are presented below.

What’s really going on?

One analysis to rule them all

There is one way in which we can address all criticisms at the same time, and that is by looking at individual differences within each speed skater. In that way we can measure the effect that differences in ready-start interval have on each individual skater’s performance. This answers the causality question, because we use a difference in ready-start interval (caused by the referee) to explain a difference in race times using a linear regression. It also bypasses the issue of differences in skaters’ abilities, because we are looking at differences in each individual’s performance. Finally, it bypasses all other confounds that others have brought up here or there, for example the idea that both referees and skaters get quicker as the competition progresses due to excitement building up. The following analysis addresses those issues, because it is a direct test of the effect of within-skater differences in ready-start interval on differences in within-skater race times.

Methods, methods, methods

We want to be absolutely clear about our methods here, so here they are. We excluded all skaters that fell or nearly fell during a race, and also those that did not complete one of their two races. These are Mitchell Whitmore (nearly fell in first race), Maciej Biega (nearly fell in second race), Shani Davis (gave up after first race), Annette Gerritsen (fell in first race), and Yulia Nemaya (fell in second race). Falls or near falls have such a massive impact on race time that they obscure everything else, including athletes’ talent and training, but also the effects of ready-start intervals. Therefore we strongly feel that we should exclude these races from further analysis. For all remaining skaters, 70 in total, we calculated the difference between the first and the second race (race 1 minus race 2) in both their ready-start interval and their race time. We combined the data for men and women, because we have no theoretical reason to split them up, because their individual differences are on the same scale (unlike their race times), and because we need the sample to be sufficiently large for any regression or correlation to be sensitive enough to pick up the effects that we predicted.
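
As a rough illustration of the difference scores described above, here is a minimal Python sketch. The Race class, the function name, and the example values are all hypothetical; this is not the code or the data behind the analysis, only one way the exclusion rule and the race 1 minus race 2 subtraction could be written down.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Race:
    ready_start_s: float          # ready-start interval, in seconds
    race_time_s: Optional[float]  # finish time in seconds; None if not completed
    fell: bool = False            # True for a fall or a near fall


def within_skater_differences(races):
    """Return {skater: (d_interval, d_time)}, computed as race 1 minus race 2.

    `races` maps a skater name to a (first_race, second_race) tuple.
    Skaters with a fall, a near fall, or an uncompleted race are excluded.
    """
    diffs = {}
    for name, (first, second) in races.items():
        if first.fell or second.fell:
            continue
        if first.race_time_s is None or second.race_time_s is None:
            continue
        diffs[name] = (
            first.ready_start_s - second.ready_start_s,
            first.race_time_s - second.race_time_s,
        )
    return diffs


# Made-up values for illustration only; these are NOT the Olympic data.
example = {
    "skater_a": (Race(3.0, 35.0), Race(3.5, 35.25)),
    "skater_b": (Race(3.25, 36.0, fell=True), Race(3.0, 35.5)),
    "skater_c": (Race(3.5, 35.5), Race(3.0, None)),
}
print(within_skater_differences(example))
```

Only skater_a survives the exclusions in this toy example. Because each skater is compared with themselves, anything that is constant for a skater across both races (talent, training, equipment) drops out of the difference.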

Way more stringent analysis, but same results

The linear regression between the individual differences in ready-start intervals and race times demonstrates that there is a significant positive effect of ready-start interval on race time. When the difference in ready-start interval is negative (i.e. the second race had a longer ready-start interval), the difference in race times was also negative (i.e. the second race time was longer). The Pearson correlation is significant (p = 0.003), and explains about 12% of the variance in race time differences.
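
For readers who want to try this kind of analysis on their own numbers, the sketch below fits an ordinary least squares line to difference scores and computes the Pearson correlation, using only the Python standard library. The function names and the five data points are made up for illustration; they are not our data, and this is not necessarily how our own numbers were computed.

```python
import math


def simple_regression(x, y):
    """Ordinary least squares of y on x; returns (slope, intercept, r)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Made-up differences (race 1 minus race 2); NOT the Olympic data.
d_interval_s = [-1.0, -0.5, 0.0, 0.5, 1.0]
d_time_s = [-0.2, 0.1, -0.1, 0.0, 0.2]

slope, intercept, r = simple_regression(d_interval_s, d_time_s)
n = len(d_interval_s)
print(f"slope = {slope * 1000:.0f} ms per second of interval difference")
print(f"r = {r:.2f}, explained variance = {r * r:.2f}")
print(f"t({n - 2}) = {t_statistic(r, n):.2f}")
```

The p-value follows from comparing the t value against a t distribution with n - 2 degrees of freedom. The standard library has no t distribution, so in practice a statistics package (for example scipy.stats.linregress, which reports slope, r, and p in one call) is the more convenient route.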

In the current dataset, assuming a linear relationship, one extra second of ready-start interval difference caused 174 ms of difference in race times. Both the explained variance and the magnitude of the effect are less than what we demonstrated in the analysis in our article. This means that at least some variance in that data can be explained by what several people, including Lex, suggested: quicker skaters are quicker to assume their starting position, and thus have shorter ready-start intervals. However, we can still explain 12% of the variance, whereas this should be 0% in a fair competition. And the remaining effect of 174 ms of added race time per extra second of ready-start interval is still very worrying in sports where the difference between winning gold or silver (or nothing!) can sometimes be only a few milliseconds.
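
To make these numbers a bit more tangible, here is a small worked example in Python. It only re-uses the two figures reported above (174 ms per second, and 12% explained variance); the function names and the half-second interval difference are hypothetical, chosen purely for illustration.

```python
import math

# Reported above: ms of race-time difference per second of interval difference.
SLOPE_MS_PER_S = 174.0


def expected_time_shift_ms(interval_difference_s, slope_ms_per_s=SLOPE_MS_PER_S):
    """Race-time difference predicted by the linear fit, in milliseconds."""
    return slope_ms_per_s * interval_difference_s


def correlation_from_explained_variance(r_squared):
    """Size of the Pearson r that corresponds to a given explained variance."""
    return math.sqrt(r_squared)


# Hypothetical case: one start has a ready-start interval that is 0.5 s longer.
print(expected_time_shift_ms(0.5))                          # 87.0 (ms)
print(round(correlation_from_explained_variance(0.12), 2))  # 0.35
```

Under the linear assumption, half a second of extra waiting would correspond to 87 ms of extra race time, and 12% explained variance corresponds to a correlation of roughly 0.35.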

Shorter ready-start intervals do cause shorter race times, even when controlling for confounds.

Is this sample too small?

After collecting data and computing a correlation, you can calculate the statistical power of your results. Ours is 91.42%. That number indicates that our sample was big enough to reliably test the effect that we found.
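
A power figure like this can be approximated by hand. The sketch below uses the Fisher z approximation for a Pearson correlation, with only the Python standard library. The function name is hypothetical, and the alpha level and the choice between a one-sided and a two-sided test are assumptions that are not stated above, so the result depends on those settings and need not match our number exactly.

```python
import math
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05, two_sided=True):
    """Approximate power to detect a true correlation r with n pairs.

    Uses the Fisher z-transform: atanh(r) is approximately normally
    distributed with standard deviation 1 / sqrt(n - 3).
    """
    std_normal = NormalDist()
    effect = math.atanh(abs(r)) * math.sqrt(n - 3)
    tail = alpha / 2 if two_sided else alpha
    critical = std_normal.inv_cdf(1 - tail)
    # Probability of landing beyond the critical value in the expected direction
    power = 1 - std_normal.cdf(critical - effect)
    if two_sided:
        # Small contribution from rejecting in the opposite direction
        power += std_normal.cdf(-critical - effect)
    return power


# Inputs taken from the text: about 12% explained variance, 70 skaters.
r = math.sqrt(0.12)
print(f"two-sided: {correlation_power(r, 70):.3f}")
print(f"one-sided: {correlation_power(r, 70, two_sided=False):.3f}")
```

Dedicated power-analysis software typically uses the exact distribution rather than this normal approximation, so small deviations are to be expected.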

What happened to the differences between men and women?

They are not there in the current analysis, with the individual Pearson R for men being 0.21, and 0.26 for women. This means they were likely due to noise in the men’s analysis from our article.

Final note

The most sceptical of people might now argue something along the lines of the following: “But wait a minute… Maybe skaters that are very good are more consistent both in assuming their position (and thus in their ready-start interval) and in their race times! So you ARE wrong! HA!” In that case, one would still expect the minor differences you assume to correlate in the same direction. So they should still be picked up by our regression. In other words: our concerns with starting procedures still hold.

Conclusion

The theoretical issues that we put forward in our article are valid, and the data we provided to support our claims are still valid. This is after we corrected for skaters’ individual qualities, and used a regression on within-participant differences from which causal inferences can be made. Our article was a Perspective, a format intended to highlight important areas of future research. Our article and the discussion following its publication illustrate precisely that: There is a need to thoroughly investigate the starting procedure of racing sports, and speed skating in particular.

Further comments

If you have any comments, objections, compliments, or tips for funny cat videos, please post them in the comment section below. You can also send them to me directly at edwin.dalmaijer@psy.ox.ac.uk.

Reference

Dalmaijer, E.S., Nijenhuis, B.G., & Van der Stigchel, S. (2015). Life is unfair, and so are racing sports: Some athletes can randomly benefit from alerting effects due to inconsistent starting procedures. Frontiers in Psychology, 6(1618). doi: 10.3389/fpsyg.2015.01618
