## Friday, October 7, 2011

### F1 2011 Singapore - Race Battle Charts

Whilst chatting to someone the other day about F1 data visualisations, it was suggested to me that I look for some sort of depiction of "aggressiveness" that could be used to compare driver behaviours. The resolution of the data I have, insofar as it relates to overtaking, is very patchy indeed, limited essentially to checking the position of drivers at the end of each lap.

(This isn't quite as simple as seeing if the race position has changed. Suppose the race positions of cars 1 to 4 are 1234 at the end of one lap, and 3214 at the end of the next. Car 2 hasn't changed position in terms of race position, but it may have been overtaken by car 3 and also overtaken car 1. In addition, some position changes may result from a car ahead pitting, or the car of interest pitting. If the car ahead pits when 30s ahead, leaves the pits 1 second ahead and then gets overtaken on the out lap, will the timing data allow us to attribute the change of position to a racing overtake, as opposed to a pitstop? And so on...)
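
The 1234 to 3214 example can be unpicked by comparing every pair of cars across the two laps. What follows is only a minimal sketch: `order_swaps` is a hypothetical helper name, and it assumes each lap's classification is available as a vector of car numbers, leader first. It reports which pairs swapped order, but it cannot tell whether a swap was an on-track pass or the result of a pitstop.

```r
# Hypothetical helper: find pairs of cars whose relative order flipped
# between two end-of-lap classifications (car numbers, leader first).
order_swaps = function(prev, curr) {
  cars = intersect(prev, curr)             # cars classified on both laps
  pairs = t(combn(cars, 2))                # every unordered pair, one per row
  before = match(pairs[, 1], prev) < match(pairs[, 2], prev)
  after  = match(pairs[, 1], curr) < match(pairs[, 2], curr)
  changed = before != after                # TRUE where the pair swapped order
  flipped = pairs[changed, , drop = FALSE]
  first_was_ahead = before[changed]
  data.frame(
    gained = ifelse(first_was_ahead, flipped[, 2], flipped[, 1]),
    lost   = ifelse(first_was_ahead, flipped[, 1], flipped[, 2])
  )
}

# The example from the text: 1234 at the end of one lap, 3214 at the end of the next
swaps = order_swaps(c(1, 2, 3, 4), c(3, 2, 1, 4))
print(swaps)
```

For this input the sketch reports three swaps: car 2 ahead of car 1, car 3 ahead of car 1, and car 3 ahead of car 2. Car 2 therefore appears once as a gainer and once as a loser, even though its race position is unchanged.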

Anyway, the comment did remind me of the race battle charts that I experimented with a few months ago. The idea of these charts was to look at the state of a race from the perspective of a particular driver. The chart showed the position of the car of interest, along with the time to the cars immediately in front of and behind the target car on track, as well as the time to the cars one place ahead and one place behind in terms of race position.
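
To make the "time to the car in the race position ahead or behind" idea concrete, here is a rough sketch of how such gaps could be derived from cumulative lap times. It is an illustration under assumptions rather than a description of how the spreadsheet columns were actually built: the toy data and the `elapsed` column are made up, and gaps to cars on track (which also depend on lapped cars) are not covered.

```r
# Toy data: cumulative race time (seconds) at the end of each lap.
# 'elapsed' is a hypothetical column, not one taken from the spreadsheet.
laps = data.frame(
  lap     = c(1, 1, 1, 2, 2, 2),
  car     = c(1, 2, 3, 1, 2, 3),
  elapsed = c(100.0, 101.5, 103.0, 200.0, 204.0, 203.2)
)

# Race position on each lap: rank the cars by elapsed time within the lap
laps$pos = ave(laps$elapsed, laps$lap, FUN = rank)

# Sort by lap then position so that adjacent rows are adjacent race positions
laps = laps[order(laps$lap, laps$pos), ]

# Gap to the car one place ahead (positive, NA for the leader) and to the
# car one place behind (negative, NA for the last car), matching the
# above/below the x-axis convention of the battle chart
laps$timeToPosInFront = ave(laps$elapsed, laps$lap,
                            FUN = function(t) c(NA, diff(t)))
laps$timeToPosBehind  = ave(laps$elapsed, laps$lap,
                            FUN = function(t) c(-diff(t), NA))
print(laps)
```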

Here's a way of creating those charts in R, for example looking at Hamilton's race from Singapore:

```r
library(RCurl)

# Run a query against a Google Spreadsheet and read the CSV result into a data frame
gsqAPI = function(key, query, gid=0) {
  url = paste(sep="", 'http://spreadsheets.google.com/tq?', 'tqx=out:csv',
              '&tq=', curlEscape(query), '&key=', key, '&gid=', curlEscape(gid))
  read.csv(url)
}
a = gsqAPI('0AmbQbL4Lrd61dHIzU3dveE5XbkpQS0NCMi1vazY1MVE', 'select *', 10)

# Change in race position from the previous row
# (this relies on rows being ordered by lap within each car)
a$posdelta = c(0, diff(a$pos))
# 1 on pitstop laps, NA otherwise, so that only pitstops get plotted
a$stop = ifelse(a$pitstop==1, 1, NA)
x = 3  # car of interest

# Indices into colours() and plotting symbols, looked up by car number
mycolours = c(26,26,178,178,33,33,270,270,142,142,254,254,13,128,128,630,630,139,139,501,501,32,32)
mycolourfn = function(x) colours()[mycolours[x]]
myshapes = c(1,2,1,2,1,2,1,2,1,2,1,2,5,1,2,1,2,1,2,1,2,1,2,1,2)
myshapefn = function(x) myshapes[x]

d = subset(a, car==x)
plot(-5*posdelta~lap, data=d, col=mycolourfn(car), ylim=c(-20,20), pch=myshapefn(car))
points(timeToPosInFront~lap, data=d, pch=3, col=5)
points(timeToPosBehind~lap, data=d, pch=4, col=5)
points(timeToTrackInFront~lap, data=d, pch=5, col=8)
points(timeToTrackBehind~lap, data=d, pch=6, col=8)
points(-20*stop~lap, data=d, pch=0, col=2)
```

Here's the result, for car 3 - Hamilton:

Perhaps not surprisingly, not many people liked these charts, in part because of the clutter, in part because it was hard to see what was going on very clearly. (Here are a couple of tips for reading the chart, which I modified slightly in the above case. The circles represent the target car; when the circle lies on the x-axis, the race position hasn't changed from the previous lap. For the circles, the y-value is 5 times the change in race position from the previous lap (a value of +5 shows the car improved its position by 1 place compared to the previous lap). The blue pluses above the x-axis show the time to the car in the race position ahead, the grey diamonds above the line the time to the car ahead on track. The blue crosses below the x-axis are the time to the car in the race position behind, and the grey triangles the time to the car immediately behind on track. The red squares show a pitstop. Reading the chart, we can easily see how in the stint between laps 40 and 50 or so, the target car closed in on the car in the race position ahead for a couple of laps before that car started getting away again, while the car in the race position behind was being easily left behind. Round about lap 55, we see there is at least one car on track between the target car and the car in race position ahead. We also note that pitstops generally result in a change of position for the worse on the following lap.)
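
As a worked example of how the circle heights come about, here is a small sketch on made-up data. It uses `ave()` to take the lap-to-lap difference separately for each car, which is a variation on the single `diff()` used above rather than the original code; the toy positions are invented for illustration.

```r
# Toy data: race position at the end of each lap for two cars
toy = data.frame(
  car = c(3, 3, 3, 3, 4, 4, 4, 4),
  lap = c(1, 2, 3, 4, 1, 2, 3, 4),
  pos = c(4, 3, 3, 5, 3, 4, 4, 3)
)

# Change in position from the previous lap, computed within each car so the
# difference never runs across the boundary between two cars
toy$posdelta = ave(toy$pos, toy$car, FUN = function(p) c(0, diff(p)))

# Height of the circle on the chart: +5 per place gained, -5 per place lost
toy$y = -5 * toy$posdelta
print(toy)
```

Car 3 moves from 4th to 3rd on lap 2, so its circle sits at +5; on lap 4 it drops two places to 5th, so the circle sits at -10.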

Whilst reflecting on the R formula for plotting the battle chart, it struck me that rather than focus on a target car, we could instead focus on the car in a particular race position. This results in a set of charts that allow us to look at the battle for a particular race position over the lifetime of the race. In the following examples, the colour of the target car identifies the team, and the shape of the symbol used to denote the car in that position (in the main, the symbol that lies along the x-axis) shows whether it is the first or second car in the team. If you get your eye in with the colours used for the different teams, you should be able to work out which cars are being described...

Here's an example showing the race from second (which happened to be Button throughout the race):

Here's the code that generated that view:

```r
x = 2  # race position of interest
d = subset(a, pos==x)
plot(-5*posdelta~lap, data=d, col=mycolourfn(car), ylim=c(-20,20), pch=myshapefn(car))
points(timeToPosInFront~lap, data=d, pch=3, col=8)
points(timeToPosBehind~lap, data=d, pch=4, col=8)
```

If we set x=3 for position 3, and rerun the plot, we get:

In this case, we see a variety of cars were in third place over the course of the race: Ferrari driver 1 (red circle), Red Bull driver 2 (dark blue triangle), Force India driver 2 (light blue triangle). The light grey pluses above the line are the time to the car in the race position in front, and the light grey crosses below the line are the time to the car in the race position behind. If the symbol for the target car is above the x-axis, its height (divided by 5) shows how many places the car gained compared to the previous lap to take the position; below the line, it shows how many places the car lost compared to the previous lap to take the position.
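
Rather than decoding the colours and shapes by eye, the cars that held a position, and the laps over which they held it, could also be listed directly. The following is a sketch on invented data (the car labels A, B and C and their positions are placeholders, not results from the race); it uses `rle()` to collapse consecutive laps with the same holder into stints.

```r
# Toy data: three cars over five laps (labels and positions are made up)
toy = data.frame(
  lap = rep(1:5, each = 3),
  car = rep(c("A", "B", "C"), times = 5),
  pos = c(3, 4, 5,   3, 4, 5,   4, 3, 5,   4, 3, 5,   5, 4, 3),
  stringsAsFactors = FALSE
)

x = 3  # race position of interest
holders = subset(toy, pos == x)
holders = holders[order(holders$lap), ]

# Collapse runs of consecutive laps with the same car into stints
runs = rle(holders$car)
ends = cumsum(runs$lengths)
stints = data.frame(
  car      = runs$values,
  from_lap = holders$lap[ends - runs$lengths + 1],
  to_lap   = holders$lap[ends],
  stringsAsFactors = FALSE
)
print(stints)
```

With the real data, the same idea would apply to `subset(a, pos==x)`, assuming every lap has exactly one car recorded in the position of interest.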

To get a feel for lower down the points, here's the race for 10th:

On the to do list: sort out chart titles and axis labels, work up a key, and maybe sort out better colour mappings. (If you want to play along, pop me a list of suggested colours, picked from the chart at http://research.stowers-institute.org/efg/R/Color/Chart/, that I should use to describe each team ;-)
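
One possible shape for a friendlier colour mapping is a pair of named lookup vectors, so that colours are chosen per team rather than by index into `colours()`. This is just a sketch: `car_team`, `team_colour` and `teamcolourfn` are hypothetical names, only the first six cars are included, and the colour choices are placeholders rather than considered picks.

```r
# Hypothetical lookup tables: car number -> team, and team -> R colour name.
# The colours are illustrative placeholders.
car_team = c("1" = "Red Bull", "2" = "Red Bull",
             "3" = "McLaren",  "4" = "McLaren",
             "5" = "Ferrari",  "6" = "Ferrari")
team_colour = c("Red Bull" = "navy", "McLaren" = "grey40", "Ferrari" = "red")

# Map one or more car numbers to a colour via the team
teamcolourfn = function(car) unname(team_colour[car_team[as.character(car)]])

cols = teamcolourfn(c(3, 5, 2))
print(cols)
```

A function like this could be dropped in wherever `mycolourfn(car)` is used, and the `team_colour` vector would also provide the labels and colours needed for a key.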

PS via a comment, I am alerted to the Intelligent F1 blog. Built around a model that predicts race behaviour, this looks like a great site to go to for involved and thought-through race-based analyses built around actual timing data and model-based predictions.