Formula One data, statistics and analysis. A data junkie's guide to data wrangling and visualisation in F1 in particular, and motor sport in general.
Saturday, October 6, 2012
Tinkering With a New Chart Type - Comparing Sector Times
Here's a quick one to compare the best sector times within a team. If one driver in a team is better than the other by a consistent normalised amount in each sector, we'd get parallel lines. If the lines cross, one of the drivers significantly outperformed (or underperformed) the other in one of the sectors:
Saturday, June 9, 2012
F1 2012 Canada FP1 & FP2
Summary charts from the first and second free practice sessions from the Canada Formula One Grand Prix.
Session utilisation chart for first practice:

Lap times per stint by car in FP1 (long stints, 8 or more laps)

Session utilisation chart for second practice:

Lap times per stint by car in FP2 (long stints, 8 or more laps):

Lap times per stint by car in FP2 (zoom in on long stint, fast laps):

Scatter plot showing relative position of each car at the end of each session (FP1: x, FP2: y)

(It would probably make sense to swap the axes, then you can read from left to right to see who did best in FP2, and compare with the FP1 position by reading up the chart?)
Scatter plot showing fastest laptimes per car across sessions (FP1: x, FP2: y)

(I really need to use the same limits on the axes, and also plot an x=y line so we can see the extent to which times were improved, or not, across sessions.)
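A minimal sketch of that idea, using made-up laptimes in a hypothetical `best` data frame (the column names `fp1` and `fp2` are assumptions for illustration, not the names used in the original data):

```r
# Hypothetical fastest laptimes (seconds) per car in each session
best <- data.frame(
  car = c(1, 2, 3, 4),
  fp1 = c(76.2, 76.5, 75.9, 76.8),
  fp2 = c(75.4, 75.9, 75.6, 76.9)
)

# Use one shared range so both axes have the same limits
lims <- range(c(best$fp1, best$fp2))

plot(fp2 ~ fp1, data = best, xlim = lims, ylim = lims,
     xlab = "FP1 fastest lap (s)", ylab = "FP2 fastest lap (s)")

# Reference line y = x: points below it were quicker in FP2 than in FP1
abline(a = 0, b = 1, lty = 2)
text(best$fp1, best$fp2, labels = best$car, pos = 3)
```

With equal limits on both axes, the dashed line runs corner to corner, so the distance of a point from it gives a rough visual read of how much a car improved (or not) between sessions.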
Scatter plot showing number of laps run per car across sessions (FP1: x, FP2: y)

Source code used to generate the charts available from CloudStat: F1 Practice Reports
Friday, April 20, 2012
F1 2012 Practice Session 2 Laptimes
Friday, October 14, 2011
F1 2011 Korea FP1 and FP2 Session Utilisation




Friday, September 23, 2011
F1 2011 Singapore Free Practice 2
First up, a summary of how the teams used the session:

Here's a very crude summary of the times recorded by each team for direct comparison.


Maybe more usefully, here's a breakdown of the laptimes by stint for a couple of the teams...
Firstly, Williams:


And secondly, Mercedes:


Plots for the other teams are available from the F1DataJunkie Singapore 2011 Gallery
Friday, July 29, 2011
F1 2011 Hungary Practice 2
[Sortable table; click on a column header to sort by that column. Raw data available via Data page.]
If my fuel correction calculations aren't way off, Sutil seemed to have a good turn of pace on stints of 11 or more laps in length?

Hmmm... is there anything interesting in that sort of view? Or is it misleading, and purely an artefact of whatever tests were being run?
Friday, June 24, 2011
F1 2011 European Grand Prix/Valencia - Second Practice
[Columns sortable by clicking on headers]
Here's how they went out:

And the times:

The fastest times:

And the fastest times, fuel corrected:

PS See also:
- F1 2011 European Grand Prix/Valencia - First Practice
- F1 Valencia - Circuit Preview Based on 2010 Race Telemetry
Friday, May 27, 2011
F1 2011 Monaco Free Practice 1 and 2 - Utilisation and Laptime Distribution Comparisons
I've been exploring a little more around what's quick and easily achieved using the statistical programming language R (via RStudio). (Don't let the phrase "statistical programming language" put you off. With a few simple commands you can generate all manner of complicated graphs and charts without having to know any stats at all!)
The data (in the text based CSV format) is available for download from here: FP1 data, FP2 data
You can download and save these files from my Google spreadsheet archive of the Monaco data by right-clicking on the link and choosing "Save Link As..." or something similar. I saved the files as mco_2011p1laptimes.csv and mco_2011p2laptimes.csv respectively.
In RStudio, you can now load in the data using the Import Dataset option.
(Loading direct from the CSV URL doesn't seem to work for me...)
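For anyone working outside RStudio, loading a saved CSV file can be sketched with `read.csv`. The example below writes a tiny stand-in file first so that it runs on its own; the rows are made up, and the column names are assumptions based on the columns used in the plotting commands in this post:

```r
# Write a tiny stand-in CSV so the sketch is self-contained; in practice
# the path would point at the file downloaded from the spreadsheet
csv_path <- file.path(tempdir(), "mco_2011p1laptimes.csv")
writeLines(c("DriverNum,Stint,Elapsed,Time",
             "1,1,95.2,95.2",
             "1,1,176.1,80.9",
             "3,1,101.7,101.7"), csv_path)

# Read the file into a data frame named after the file
mco_2011p1laptimes <- read.csv(csv_path)
str(mco_2011p1laptimes)
```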
The datasets I uploaded to Google spreadsheets include things like each laptime in the practice session, the stint (and lap number in the stint), the elapsed time during the session at the end of each lap and the fuel corrected laptime (relative to the stint).
Here's how we can plot how the teams used the session. The following command says "for each driver, plot the elapsed time at which they finished each lap, using the FP1 data":
plot (DriverNum ~ Elapsed,data=mco_2011p1laptimes)
(The Export option in the Chart window allows you to easily save the chart as an image file.)
If we want to plot session 1 and session 2 data on the same chart, we can generate a combined data set. (Both datasets have exactly the same column headings.) Before we do that though, we want to be able to identify the data from free practice 1 and free practice 2 in the combined dataset. We can do that by adding a new column to each dataset within R (it will leave the actual CSV file untouched) that specifies the practice session:
mco_2011p1laptimes$fpsession<-1
mco_2011p2laptimes$fpsession<-2
Here, the first command says: for dataset mco_2011p1laptimes, add a column ($) fpsession and set the value of each cell in that column to 1.
Now we can concatenate the two datasets (rbind, as in "row bind") into a single dataset (bothfp), whilst still being able to reference each session's times directly via the fpsession column.
bothfp=rbind(mco_2011p1laptimes,mco_2011p2laptimes)
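To see what the tag-then-combine step does, here is a small self-contained sketch using toy data frames (`fp1`, `fp2` and `both` are illustrative names, and the values are made up):

```r
# Toy stand-ins for the two practice datasets (same column headings)
fp1 <- data.frame(DriverNum = c(1, 3), Elapsed = c(95.2, 101.7))
fp2 <- data.frame(DriverNum = c(1, 3), Elapsed = c(88.4, 90.1))

# Tag each dataset with its session before combining
fp1$fpsession <- 1
fp2$fpsession <- 2

# Stack the rows of fp2 underneath the rows of fp1
both <- rbind(fp1, fp2)

# Count how many rows came from each session
table(both$fpsession)
```

Note that rbind matches data frame columns by name, which is why both datasets need the same column headings.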
We can now plot data from both practice sessions on the same chart using the following command:
plot (DriverNum ~ Elapsed, col=fpsession,pch=Stint,data=bothfp)
This reads as "plot a scatterplot (plot) for each DriverNum against Elapsed time, colouring the points by the session number (col=fpsession) and using symbols that represent which stint in the session the driver was on (pch=Stint)":
We should really add a title too, using the main parameter:
plot (DriverNum ~ Elapsed, col=fpsession,pch=Stint,data=bothfp,main="F1 2011 Monaco: Free Practice 1 and 2 Session Utilisation")
Alternatively, we could have added the title using the command:
par(ps=10)
title(main="F1 2011 Monaco: Free Practice 1 and 2 Session Utilisation")
Here, par(ps=10) sets the parameter ps (the font point size) to 10, and title() then prints the title.
Seeing how the teams used the session is one thing, but how about the laptime distribution within a session? The following command shows the laptimes across the second session as a whole by driver:
plot (Time ~ DriverNum, data=mco_2011p2laptimes)
If we want to look at the distributions of laptimes by driver, we can plot the "density" of laptimes according to driver (this is a bit like a histogram, but it uses a continuous line to display the distribution of the laptimes).
d3=subset(mco_2011p2laptimes,DriverNum==3)
plot (density(d3$Time))
(Another way of writing the above would be plot (density(subset(mco_2011p2laptimes,DriverNum==3)$Time)). Can you see how they achieve the same thing?)
If we include the lattice package in our setup (which may need installing via Packages/Install Packages), we can plot multiple kernel density plots on the same chart. Here's a comparison of the in-stint fuel corrected laptimes between Hamilton and Vettel:
require(lattice)
densityplot(~Fuel.Corrected.Laptime, groups=DriverNum, data=subset(mco_2011p2laptimes,DriverNum==1 | DriverNum==3),main="F1 2011 Monaco - FP2: VET vs HAM")
(This really needs a legend to identify each driver.)
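One possible way to get that legend, sketched here with made-up laptimes rather than the real session data, is to group by a labelled factor and ask lattice for an automatic key via `auto.key`. The `Driver` column is an assumption added for illustration; the mapping of car 1 to VET and car 3 to HAM follows the chart title above:

```r
library(lattice)

# Made-up laptimes for two cars, for illustration only
set.seed(1)
laps <- data.frame(
  DriverNum = rep(c(1, 3), each = 20),
  Fuel.Corrected.Laptime = c(rnorm(20, 77, 0.5), rnorm(20, 77.4, 0.7))
)

# Turn the car number into a labelled factor so the key shows driver codes
laps$Driver <- factor(laps$DriverNum, levels = c(1, 3),
                      labels = c("VET", "HAM"))

# auto.key = TRUE asks lattice to draw a key with one entry per group
p <- densityplot(~Fuel.Corrected.Laptime, groups = Driver, data = laps,
                 auto.key = TRUE,
                 main = "F1 2011 Monaco - FP2: VET vs HAM")
print(p)
```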
At first glance, this is quite appealing, but on second thoughts I wonder if a histogram wouldn't actually reveal more? For example, if you look closely, you see that Hamilton's laptimes may also be split into two main clusters, as Vettel's are, although this distinction is masked by the smoothed density plot? Hmm...
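As a rough sketch of what the histogram alternative might look like, here is a lattice `histogram` call with one panel per driver. The laptimes are made up (deliberately generated as two clusters per driver), and the `Driver` and `Time` column names are assumptions for illustration:

```r
library(lattice)

# Made-up laptimes with two clusters per driver, for illustration only
set.seed(2)
laps <- data.frame(
  Driver = factor(rep(c("VET", "HAM"), each = 30)),
  Time = c(rnorm(15, 76, 0.2), rnorm(15, 80, 0.3),
           rnorm(15, 76.5, 0.2), rnorm(15, 79, 0.3))
)

# One histogram panel per driver, stacked vertically on a shared x axis
h <- histogram(~Time | Driver, data = laps, nint = 20,
               layout = c(1, 2), xlab = "Laptime (s)")
print(h)
```

Whether clusters show up clearly depends on the bin count (`nint`), so it is worth trying a few values before reading too much into the shape.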
Note that we can also use lattice to plot a separate distribution plot for each driver:
densityplot(~Fuel.Corrected.Laptime|DriverNum,data=mco_2011p2laptimes)
(In this case, I need to work out how to label each chart; note that there is a visual indicator of each DriverNum in each cell title bar. Hint: VET is the bottom left chart.)
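One way the panel labelling might be handled is to condition on a factor rather than on the numeric car number, since lattice prints a factor's level name in each panel's strip. This is a sketch with made-up laptimes; the `codes` lookup and the `Driver` column are hypothetical helpers, not part of the original dataset:

```r
library(lattice)

# Made-up laptimes for three cars, for illustration only
set.seed(3)
laps <- data.frame(
  DriverNum = rep(c(1, 2, 3), each = 15),
  Fuel.Corrected.Laptime = rnorm(45, mean = 77, sd = 0.6)
)

# Hypothetical lookup from car number to driver code
codes <- c("1" = "VET", "2" = "WEB", "3" = "HAM")
laps$Driver <- factor(unname(codes[as.character(laps$DriverNum)]),
                      levels = unname(codes))

# Conditioning on a factor puts its level name in each panel's strip
p <- densityplot(~Fuel.Corrected.Laptime | Driver, data = laps)
print(p)
```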
Okay - enough for now. What I wanted to start exploring was some of the charting tools in R that might make a visual comparison of laptimes from practice possible at a glance. The kernel density plot for comparing laptime distributions between two drivers looks like it could be really handy, though at times maybe misleading in a way that a histogram wouldn't be..? Along the way, I also learned how to add a column to a dataset and concatenate two separate datasets into a single one. :-)
Thursday, May 26, 2011
F1 2011 Monaco Free Practice 2 - Utilisation
Saturday, May 21, 2011
F1 2011 Spain Free Practice - Tyres
So here's a first attempt looking at some of the fuel corrected laptimes over some long practice stints for one or two of the drivers. (Timing data available here. Fuel time correction data obtained from AT&T Williams: Spanish GP Preview.)
To start with, let's look at Mark Webber (WEB). Here's how Red Bull made use of the practice sessions - the time along the bottom is the elapsed time of the particular session in seconds starting from the first mark on the timing sheet by any driver. The colours denote practice session (first, second or third free practice) and the symbols identify different stints. I've oopsed badly on the labelling of the y-axis: it's actually fuel corrected laptimes (about fuel corrected laptimes):

We see the long stint, red FP2 times do appear to be showing a drop off in time (remember, these are fuel weight penalty adjusted times) over each stint.
So you can get a feel for reading the chart, here's Red Bull's session utilisation chart for free practice 2 (WEB's laps are inside the circle); the chart is read like an elapsed time clock across the session.

Looking at other session utilisation charts, I noticed that Alguersuari seemed to have a long FP2 stint, so let's look at his times:

After an initial coming in of the tyres (maybe?) at the start of the long stint at the end of FP2, do the tyres hang on pretty well (with maybe only a slight fade) before the raggedy times at the end of the stint?
Let's take a closer look:

At this scale, we might say the fade isn't that much. But closer up?

A definite drop off - on the order of a tenth of a second per lap. But there seems to be structure in those times too - so have I made a systematic error in my fuel adjustment calculations, or was ALG doing an on/off, this way/that way trial of some sort over those few laps?
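A per-lap drop off like this could be put on a slightly firmer footing by fitting a straight line to the stint. The sketch below uses made-up times (a 0.1s per lap trend with a small alternating wobble added), not ALG's actual laps, and the `stint` data frame with its `lap` and `time` columns is an assumption for illustration:

```r
# Made-up fuel corrected laptimes for one long stint, for illustration only
stint <- data.frame(
  lap  = 1:10,
  time = 88.0 + 0.1 * (1:10) + c(0.05, -0.05, 0.04, -0.04, 0.03,
                                 -0.03, 0.02, -0.02, 0.01, -0.01)
)

# Fit a straight line: the slope is the average time lost per lap
fit <- lm(time ~ lap, data = stint)
slope <- coef(fit)[["lap"]]

plot(time ~ lap, data = stint, xlab = "Lap in stint",
     ylab = "Fuel Corrected Laptime (s)")
abline(fit, lty = 2)
```

Plotting `resid(fit)` against lap number would then show any on/off structure left over once the overall fade has been removed.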
Looking at Renault's FP2 session utilisation chart, Petrov's times are interesting. Towards the end of the session, did he maybe try a pitstop?

Here are his laptimes over all three sessions:

And here's a close-up on the long stint at the end of FP2:

Do the tyres start to go off around 4200s? (Remember, the fuel weight adjustment is approximate...) Do we see a pitstop around 4750s and resulting improvement in laptime from a fresh pair of tyres around the 5000s elapsed time mark?
Commands for generating the scatterplots (RStudio) - of the form:
t='F1 2011 - Spain: Free Practice: KOV'
dnum='20'
plot(Fuel.Corrected.Laptime ~ Elapsed,data=subset(esp_2011practicelaptimes,DriverNum==dnum ),pch=Stint,col=Session, main=t,xlab="Elapsed Time",ylab="Fuel Corrected Laptime (s)")
legend(4000,109,c("FP1","FP2","FP3"),col=1:3,pch=1)
PS here's another view - long stint times across all drivers/all sessions:

F1 2011 Spain Free Practice - Laptime Distributions
(Note that laptimes exceeding 1.5 times the time of a driver's fastest lap are not shown.)
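A cut-off of this kind can be sketched as follows, using `ave` to attach each driver's fastest lap to every one of their rows. The data are made up, and `fastest` and `shown` are illustrative names:

```r
# Made-up laptimes, including slow laps well off a flying lap pace
laps <- data.frame(
  DriverNum = c(1, 1, 1, 3, 3, 3),
  Time      = c(85.0, 86.2, 140.0, 84.5, 130.0, 85.1)
)

# Fastest lap per driver, repeated on every row for that driver
fastest <- ave(laps$Time, laps$DriverNum, FUN = min)

# Keep only laps no slower than 1.5 times that driver's fastest lap
shown <- laps[laps$Time <= 1.5 * fastest, ]
```

Because the threshold is relative to each driver's own best, slower cars are not penalised by the pace of the front runners.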






The R commands used to generate these images are of the form:
plot(Time ~ DriverNum,data=esp_2011p3laptimes, main='F1 2011 - Spain: Free Practice 3',xlab="Car",ylab="Laptime (s)")
and
boxplot(Time ~ DriverNum,data=esp_2011p3laptimes, main='F1 2011 - Spain: Free Practice 3',xlab="Car",ylab="Laptime (s)")
PS here's a better R recipe for plotting charts with driver name labels:
# set font size
par(ps=15)
plot(Time ~ DriverNum,data=esp_2011p3laptimes, main='F1 2011 - Spain: Free Practice 3',xlab="Driver",ylab="Laptime (s)",xaxt='n')
labels=c("VET","WEB","HAM","BUT","ALO","MAS","SCH","ROS","HEI","PET","BAR","MAL","","SUT","RES","KOB","PER","BUE","ALG","TRU","KOV","KAR","LIU","GLO","AMB")
# set font size
par(ps=10)
text(1:25,par("usr")[3],srt=90,adj=1,labels=labels,xpd=TRUE)
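As a possible alternative to placing the labels with text(), base R's `axis` function can draw a replacement x axis with custom tick labels. This is only a sketch with made-up laptimes for three cars; `codes` is an illustrative stand-in for the full labels vector:

```r
# Made-up laptimes for three cars, for illustration only
laps <- data.frame(DriverNum = c(1, 1, 2, 2, 3, 3),
                   Time = c(85.0, 86.2, 85.4, 85.9, 84.5, 85.1))
codes <- c("VET", "WEB", "HAM")

# Suppress the default x axis, then draw one with a driver code
# at each car number; las = 2 turns the labels perpendicular to the axis
plot(Time ~ DriverNum, data = laps, xaxt = "n",
     xlab = "Driver", ylab = "Laptime (s)")
axis(side = 1, at = seq_along(codes), labels = codes, las = 2)
```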