Friday, November 21, 2014

Lap Position Count Charts

Whilst putting together a simple routine to calculate the number of laps led by each driver from the ergast data, it struck me that we could count - and chart - the number of racing laps held in a particular position by each driver in a particular race.
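As a minimal sketch of the laps-led count mentioned above, assuming a `lapTimes` data frame with one row per driver per lap and `driverRef`, `lap` and `position` columns (the toy values below are invented for illustration, and base R is used so the example runs without extra packages):

```r
# Toy lap data in the shape of the lapTimes data frame used in this post:
# one row per driver per lap (values here are invented for illustration)
lapTimes = data.frame(
  driverRef = rep(c("button", "vettel", "hamilton"), each = 4),
  lap       = rep(1:4, times = 3),
  position  = c(1, 1, 2, 1,   # button
                3, 2, 1, 2,   # vettel
                2, 3, 3, 3),  # hamilton
  stringsAsFactors = FALSE
)

# Keep only the rows where the driver was in first place...
leaders = lapTimes[lapTimes$position == 1, ]
# ...and count the rows per driver; drivers who never led get no row
lapsLed = aggregate(lap ~ driverRef, data = leaders, FUN = length)
names(lapsLed)[2] = "lapsLed"
lapsLed
```

Dropping the filter on first place and grouping by position as well as driver gives the per-position counts that the chart is built on.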

The following chart summarises the 2012 Australian Grand Prix in this way.

```
library(plyr)
library(ggplot2)

#Count the number of laps each driver held each position for
posCounts=ddply(lapTimes,.(driverRef,position),summarise,poscount=length(lap))

#Set the transparency relative to the proportion of the race in each position
alpha=function(x) 100*x/max(lapTimes$lap)
#Rotate the x-tick labels
xRotn=function(s=7) theme(axis.text.x=element_text(angle=-90,size=s))

g=ggplot(posCounts)
#For each driver, plot the number of laps in each race position
g=g+geom_text(aes(x=driverRef,y=position,label=poscount,alpha=alpha(poscount)),size=4)
g+theme_bw()+xRotn()+xlab(NULL)+ylab(NULL)
```

Drivers are arranged along the bottom according to their rank at the end of the race. (Unclassified drivers are ranked according to how far into the race they got; where two or more unclassified drivers went out on the same lap, they are ranked by the position they held when they retired.)
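The listing above does not itself put the drivers in this order. One way it might be approximated from the lap data alone is sketched below, on the assumption that a driver's highest lap number is the number of laps they completed; the toy data is invented, and for a real race the ergast results table would be the better source for the official classification.

```r
# Toy lap data (invented): "c" retires after lap 2, "d" and "e" after lap 1
lapTimes = data.frame(
  driverRef = c("a","b","c","d","e", "a","b","c", "a","b"),
  lap       = c(1,1,1,1,1, 2,2,2, 3,3),
  position  = c(2,1,3,5,4, 1,2,3, 1,2),
  stringsAsFactors = FALSE
)

# Sort by driver then lap, so the last row for each driver is their final lap
lapTimes = lapTimes[order(lapTimes$driverRef, lapTimes$lap), ]
lastLap = lapTimes[!duplicated(lapTimes$driverRef, fromLast = TRUE), ]

# Rank by laps completed (most first), then by position on that last lap
lastLap = lastLap[order(-lastLap$lap, lastLap$position), ]
driverOrder = lastLap$driverRef
driverOrder

# Use the order as factor levels so ggplot lays out the x-axis this way
lapTimes$driverRef = factor(lapTimes$driverRef, levels = driverOrder)
```

Here "e" is placed ahead of "d" because both went out on lap 1 and "e" held the better position at the time.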

The number shows the number of laps completed in each race position; the transparency level is also indicative of this value. The pink circle shows the position the driver was in on their last lap, its size proportional to the total number of laps the driver completed in the race. The empty grey circles show the drivers' grid positions.
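The circle layers are not part of the listing above. A rough sketch of how they could be added follows; it is not the recipe from the book. The `gridPos` table and all the values are hypothetical, the last lap number is assumed to equal the laps completed, and the plotting step only runs if ggplot2 is installed.

```r
# Toy data (invented): lap-by-lap positions plus a hypothetical grid table
lapTimes = data.frame(
  driverRef = c("a","b","c", "a","b","c", "a","b"),
  lap       = c(1,1,1, 2,2,2, 3,3),
  position  = c(2,1,3, 1,2,3, 1,2),
  stringsAsFactors = FALSE
)
gridPos = data.frame(driverRef = c("a","b","c"), grid = c(3,1,2),
                     stringsAsFactors = FALSE)

# Last-lap row per driver: 'lap' is then taken as the laps completed
lapTimes = lapTimes[order(lapTimes$driverRef, lapTimes$lap), ]
lastLap = lapTimes[!duplicated(lapTimes$driverRef, fromLast = TRUE), ]

# Laps held in each position, as in the main chart
posCounts = aggregate(lap ~ driverRef + position, data = lapTimes, FUN = length)
names(posCounts)[3] = "poscount"

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  g = ggplot() +
    # Filled pink circle: last-lap position, sized by laps completed
    geom_point(data = lastLap, aes(x = driverRef, y = position, size = lap),
               colour = "pink") +
    # Empty grey circle (shape 1): grid position
    geom_point(data = gridPos, aes(x = driverRef, y = grid),
               shape = 1, colour = "grey", size = 6) +
    # Lap counts drawn on top of the circles
    geom_text(data = posCounts,
              aes(x = driverRef, y = position, label = poscount), size = 4) +
    theme_bw() + xlab(NULL) + ylab(NULL)
}
```

Calling `print(g)` draws the chart. The circles are added before the text layer so that the lap counts stay legible.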

Where the pink circle is off the diagonal, it shows that a driver was in a higher position at the point they exited the race than the one they were finally classified in. The larger the pink circle, the closer they were to the end of the race at the point they left it. So for example in this case, we see that Maldonado appears to have been in 6th position quite deep into the race, despite being ranked 13th in the end.

The large lap counts shared by Vettel and Hamilton for second and third position don't tell us how these were distributed: was Hamilton in second for a large part of the race, for example, before ceding to Vettel for the latter half, or were they continually swapping positions in a hard-fought battle? To distinguish between these, we would need to look at the actual lapchart, or another metric.
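One candidate for such a metric, offered only as a sketch, is the number of position changes, which can be read off a run-length encoding of a driver's positions taken in lap order. The two position vectors below are invented to show the contrast; both give identical lap counts for second and third place.

```r
# Toy data (invented): two drivers with the same lap counts in 2nd and 3rd,
# but distributed very differently through the race
steady   = c(2, 2, 2, 3, 3, 3)   # one change of position
swapping = c(2, 3, 2, 3, 2, 3)   # a change on every lap

# rle() collapses a vector into runs of identical consecutive values
stints = function(positions) {
  r = rle(positions)
  data.frame(position = r$values, laps = r$lengths)
}

stints(steady)
stints(swapping)

# The number of runs minus one is the number of position changes
nChanges = function(positions) length(rle(positions)$values) - 1
nChanges(steady)
nChanges(swapping)
```

A low count points to long stints in each position, a high count to repeated swapping. The positions must be sorted by lap before being passed in.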

A couple of other summary details that are missing from the chart are a total lap count for each driver and an indication of the actual final classification of each driver (e.g. to distinguish those drivers that were unclassified).
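One possible way to surface both details, sketched here under assumptions, is to fold them into the x-axis labels. The `results` table below is invented; its `laps` and `positionText` columns mirror what I would expect to find in the ergast results data, but the column names should be checked against the actual data before relying on them.

```r
# Toy results table (invented values; column names are assumptions)
results = data.frame(
  driverRef    = c("a", "b", "c"),
  laps         = c(58, 58, 34),
  positionText = c("1", "2", "R"),
  stringsAsFactors = FALSE
)

# Treat a driver as classified if positionText is a plain number
results$classified = grepl("^[0-9]+$", results$positionText)

# Build an axis label: name, total laps, and a * for unclassified drivers
results$label = paste0(results$driverRef, " (", results$laps, ")",
                       ifelse(results$classified, "", "*"))
results$label
```

In ggplot2, a named vector such as `setNames(results$label, results$driverRef)` can be passed as the `labels` argument of `scale_x_discrete()` to swap these labels in for the plain driver names.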

The full recipe for creating this chart from data obtained from the ergast database can be found in the Wrangling F1 Data With R book.