Sunday, June 28, 2015

Formula E 2014/2015 Championship Race As The Circuit Hits London

With race results data for the Formula E Championship, as well as Formula One, available from the ergast motorsport results database API, it's easy enough to reuse the code that generates F1 Championship race charts (eg as originally described here) to produce similar charts for the Formula E Championship.

So for example, here's how the championship race stood as the Formula E circus came to London:

Along the horizontal x-axis we have each round of the championship; the vertical y-axis is championship standing at the end of each round. The number represents the total championship points scored by each driver at the end of each round, the red colour identifying those drivers who scored points in each round.

For more information, see the Wrangling F1 Data With R book.

Tuesday, June 23, 2015

Austria 2015 Grand Prix Review

Once again, a weekend away meant I missed the race - and still haven't had a chance to watch it. So for the second race in a row, here is my otherwise blind reading of the charts.

First up, qualifying. The session utilisation chart shows two outings in each part of qualifying, with neither ROS nor HAM improving in their second runs in Q3. MAS looked to have reasonable pace in both Q1 and Q2, setting purple times in the second runs, with SAI looking like he pulled a blinder in his second Q1 run, taking purples then finishing with a time that looks like it just missed out on the best time of the session? VET appears to have played it cool in Q2, sitting out the initial run period and then completing just a single stint in the back half of the session. ROS seems to have disappointed, not improving on his Q2 time, and HAM only seems to have improved on his Q2 time with his first run in Q3, the pole setting lap. In Q3, GRO only seemed to manage a spotter lap, and VES appears to have gone out just at the end of the session.


Let's see how the classifications played out:
SAI was up there in Q1, wasn't he! Though by the looks of it, HAM at least only did what he had to. The Q1 cut-off time was just a tenth away from a passage into Q2, with RAI missing out by another tenth! What is going on at Ferrari?! Going from Q2 to Q3, if NAS had repeated his Q2 time he'd have made another place on the grid. MAS improved by over five tenths going from Q2 to Q3, compared to an improvement of less than three tenths by teammate BOT, who with a similar improvement would have made the second row of the grid.

There also seems to be a glaring error with the chart - RIC is there in Q2 but seems to fail to make the cut-off in Q1? The data is scraped from the FIA website, so is my scraper wrong?



Ah - seems the FIA are up to their old tricks again, posting the wrong time (though at least they have the order right this time, which they haven't in other results; in this case it should have been a 1:11.973...). I wonder if they do this deliberately, to contaminate the data for anyone scraping the results (and presumably not caring about the two fingers they stick up to anyone actually consulting results on the FIA site because they think that site offers some sort of website of record). For all the money in F1, they really do take the p**s sometimes.

Looking at quali progression in terms of closeness of times - though who can believe them? I'll try to find a better source than the FIA website for future runs of these reports - we see how close things were around the Q1 cut-off time and the Q2 midfield. The Q3 times separate into three groupings, with close clusters at the back of the grid and in the middle positions, and clearer separation between the top 3 places. The chart is pretty cluttered for reading the names - I think this chart really needs to be side by side with the rank based one (and plotted using the correct data...).
In terms of how the sessions progressed, we see more accurately what went on, basing the chart explicitly on official laptimes rather than the error-prone numbers on the FIA website. Along the bottom is elapsed time into each part of qualifying; on the y-axis, the laptime. Here we see how PER did indeed miss the cut, being pipped by ALO as they crossed the line close to each other, but perhaps remaining hopeful of getting through until KVY, NAS, MAL and VES improved their times and brought the cut-off time down. (I'm thinking now that the Q1 part of the session utilisation chart might be handy to have plotted in alignment with this chart?) SAI's progress is also evident, his best time at the very end demonstrating an impressive run, the first flying lap of which would also have made the cutoff time easily.

Moving on to Q2, we see the majority of drivers who made it into Q3 setting their best times on their second run, the only exception being KVY who (just) failed to beat his first run time. ROS seems to have dominated this part of qualifying, with ERI, SAI and RIC all seeming pretty evenly matched, if just pipped by MAL, though they all missed the cutoff, and ALO trailing them all.
In Q3, ROS and HAM both set their best times on their first runs and then - from the lack of competitive times in their second runs - presumably both messed up those second runs? If that was the case, with a bit more of a push, VET could perhaps have split the Mercedes and made the front row of the grid?
Moving on to the race, and a chart that plots track position in each leadlap (leadlap counts up on the y-axis, the x-axis is the on-track gap in seconds between each car and the lap leader), the failure to make much progress in the first five laps, followed by a close bunching on lap 6 and then a gap opening up between the leading cars on lap 7 suggests a safety car came out following a first lap incident? The breaks in the chart suggest a one stop strategy predominated with the majority of stops around laps 35-38 (I really need to identify stopping laps somehow, and perhaps purple and green laps too?)
Quite a few battles are evident in the first stint, and after the pit stops the race came to a head in the closing laps between third and fourth, seventh and eighth, and perhaps even ninth and tenth? ROS seems to have led pretty much from the start - so presumably took HAM either at the start or had a great getaway from the safety car (the chart cuts off the lap leader in those early laps - I need to add the race leader name label to the left hand side of the chart too, I think, perhaps with the leader's lap time?).

So let's see if we can now try to piece together elements of the race from a variety of battle charts, which show lap number along the x-axis, time to car ahead on positive y, and time to car behind on negative y (red labels are lapped cars). To begin with, looking at HAM's chart, we see how he lost the lead to ROS from the off (presuming he started from pole and his time based preliminary qualifying position wasn't changed), kept tabs with him during the first stint whilst pulling away from VET in third, lost out a little in the pit stop (ROS pitting first) and then failed to make much progress until the very end, when it looks as if backmarkers may have impeded ROS' progress?
Looking to VET's race, we see how VET lost ground to HAM in the first stint whilst drawing away from MAS in fourth. Something presumably went badly wrong in the pits with VET dropping back to fourth behind MAS. The second stint saw VET doggedly trying to rein MAS back, though he ran out of race distance before he could make the pass back into a podium finishing position. In fifth place behind, BOT could only go backwards.
So how did BOT see the race? The first half of his race was characterised by a battle ahead with VES, whom he passed about lap 14(?) and then HUL. BOT clearly had the pace on RIC, and also managed to draw away from HUL having passed him around about lap 38, but it seems there was never any hope of catching VET who steamrollered ahead.

From VES' point of view, the early part of the race saw DRS range fighting ahead with HUL and behind with BOT, before BOT made the pass. After BOT took RIC, VES charged him down and passed him at a steady rate of knots, but could make no progress on HUL ahead. Behind, it seems RIC eventually lost out to MAL, who then set his sights on VES and gave him a hard time on the DRS line from about lap 59 until he sneaked past in the closing couple of laps of the race.
Looking from MAL's perspective, the race seems quite a lively one. Taking PER shortly after the safety car, and then battling to get past first GRO and then PER, MAL entered clean air around about lap 40 and started to take chunks of time out of RIC, who seems to have been easily passed, and then VES, who seems to have put up much more of a fight, once caught, though he succumbed right at the end of the race.
As we have seen, RIC appeared in many stories, though from his own perspective it seems to have been one of cars zooming at first ahead on track, MAL drawing away in the first quarter of the race, BOT in the second, VES in the third, and then a switch in fortune to watch cars looming ahead and zooming away behind - first NAS ahead and KVY behind, then PER ahead and NAS behind.
Looking elsewhere around the track, we see a couple of tracking races in the early part of the race, before drivers presumably dropped out. Firstly, GRO, who spent the first twenty laps caught between PER and MAL before making progress against KVY and then presumably dropping out of the race around about lap 35...?
...which also seems to be the time called on SAI, whose first 25 laps were spent sandwiched between NAS and PER, before NAS bolted off ahead and SAI presumably called it a day.


So, there we have it, my reading of the race from the data, and a realisation that some of the data I have (the data scraped from the FIA website) is error prone, though whether deliberately or through error, I don't know. (What I do know, though, is that I have succumbed to at least one other error when trying to use FIA official classification website data earlier this year...)

Wednesday, June 10, 2015

How the F1 Canadian Grand Prix Race Evolved on Track

In response to an email query about battlemaps, I had a quick look at ways of viewing the on track evolution of the race (and in doing so noticed I'd fallen short in the laptime data I was grabbing from the ergastAPI - it has a cutoff of 1000 records in a response and with 70 laps in the race, this was capping out the data I got back at just over 50 laps - so the previous battlemaps ran short of the whole race distance. No-one pointed it out though, so I guess I am just talking to myself here...!)
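As a rough illustration of the paging fix, here is a minimal sketch of a generic paged fetcher. It assumes a hypothetical fetch_page(limit, offset) function (not part of any library) that returns a data frame of at most limit rows, for example by calling the ergast API with its limit and offset query parameters. A stand-in that serves fake lap records takes the place of the real request so the sketch runs without a network connection.

```r
# Keep requesting pages until a short (or empty) page signals the end.
# fetch_page is a hypothetical function supplied by the caller.
fetch_all_pages <- function(fetch_page, limit = 1000) {
  pages <- list()
  offset <- 0
  repeat {
    page <- fetch_page(limit = limit, offset = offset)
    if (nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    # A page shorter than the limit means there is nothing left to fetch
    if (nrow(page) < limit) break
    offset <- offset + limit
  }
  if (length(pages) == 0) return(data.frame())
  do.call(rbind, pages)
}

# Stand-in for a real request: 70 laps x 19 cars = 1330 fake lap records
fake <- data.frame(lap = rep(1:70, each = 19),
                   position = rep(1:19, times = 70))
fake_fetch <- function(limit, offset) {
  if (offset >= nrow(fake)) return(fake[0, ])
  fake[(offset + 1):min(offset + limit, nrow(fake)), ]
}

allLaps <- fetch_all_pages(fake_fetch, limit = 1000)
```

With a single 1000 record request the fake data would stop partway through lap 53; the paged version returns every lap.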

Anyway, data grabber hack patched for now (I really should tidy up the fetcher to do a generic paged request), here's a quick chart to show how the race evolved on track overall. The vertical y-axis is the leadlap, the horizontal x-axis is gap to leader. Cars off the lead lap are highlighted.

#ddply comes from plyr, the charting functions from ggplot2
library(plyr)
library(ggplot2)

#lapsData.df() and battlemap_encoder() are helper functions defined elsewhere (not shown here)

#Grab some data (2015 season, round 7)
lapTimes=lapsData.df(2015,7)

#Process the laptimes
lapTimes=battlemap_encoder(lapTimes)

#Find the accumulated race time at the start of each leader's lap
lapTimes=ddply(lapTimes,.(leadlap),transform,lstart=min(acctime))

#Find the on-track gap to leader
lapTimes['trackdiff']=lapTimes['acctime']-lapTimes['lstart']

#Plot the on-track gap to leader versus leader lap
#Colour distinguishes cars on the lead lap from those off it
g=ggplot(lapTimes)
g=g+geom_point(aes(x=trackdiff,y=leadlap,col=(lap==leadlap)))

#Highlight a particular driver
g=g+geom_point(data=lapTimes[lapTimes['driverId']=='vettel',],
               aes(x=trackdiff,y=leadlap),pch=1)
#Overplot with laps behind for lapped drivers
g=g+geom_text(data=lapTimes[lapTimes['lapsbehind']>0,],
              aes(x=trackdiff,y=leadlap,label=lapsbehind),size=3)

g



The chart helps us identify areas of the field where battles appear to be taking place, and by highlighting particular drivers we can see how they fared compared to the leader (or we might highlight two drivers who were close at the end to see how their various race strategies played out).

We also notice something of the speed of the lead car - around about laps 38 to 40 we see cars in 2nd, 3rd and 4th gaining time back from the leader, for example. We can also see that cars in 5th (Vettel) and 6th (Massa) were keeping pace with the leader from about lap 38.

To find battles automatically, we could use the stint detection code explored previously. It also struck me that we could generate a graph for each lap going trackpos1-trackpos2-trackpos3 etc with a weight equal to the gap time between cars, and then simply prune the graph with gaps above a certain size to identify battlegroupings. (See also Identifying Position Change Groupings in Rank Ordered Lists for another handy way of using graph based approaches.)
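A minimal sketch of the pruning idea for a single lead lap follows. Because the cars form a simple chain when ordered by track position, pruning the edges with large gaps and taking what is left connected is the same as starting a new group whenever the gap to the car ahead exceeds a threshold. The battle_groups function name, the example data and the 1 second threshold are all assumptions for illustration.

```r
# Label each car with a battle group number for one lead lap.
# trackdiff: on-track gap to the leader in seconds
# maxgap: largest gap (s) to the car ahead that still counts as a battle
battle_groups <- function(trackdiff, maxgap = 1) {
  ord <- order(trackdiff)
  # Gap from each car to the car immediately ahead on track
  gaps <- c(Inf, diff(trackdiff[ord]))
  # A gap bigger than maxgap is a pruned edge, so a new group starts
  grp <- cumsum(gaps > maxgap)
  out <- integer(length(trackdiff))
  out[ord] <- grp
  out
}

# Made-up gaps for five cars on one lead lap
example <- data.frame(driverId = c("a", "b", "c", "d", "e"),
                      trackdiff = c(0, 0.8, 5.2, 5.9, 12.0))
example$battle <- battle_groups(example$trackdiff, maxgap = 1)
```

Here cars a and b share one group, c and d another, and e runs alone. The same function could presumably be applied lap by lap with something like ddply(lapTimes,.(leadlap),transform,battle=battle_groups(trackdiff)), and groups containing a single car then dropped.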

Monday, June 8, 2015

F1 Canada 2015 Battlemaps - How the Race Happened from the Drivers' Perspective

The following battlemaps don't show all the race laps - my code was maxing out on requesting lap data from the ergastAPI. I've fixed that now, but won't be updating the following charts to show the whole race. Hopefully, future charts will all be complete...

Battlemaps (or battle charts, I keep flitting between the two) are akin to cut down race history charts from a particular driver's point of view, showing the distance to the cars immediately ahead and behind in terms of position, and on track.

Here's a selection of battlemaps for some of the drivers for the 2015 F1 Canadian Grand Prix.

First up, Bottas' race: during the first stint he left Grosjean behind, and slowly lost ground to Raikkonen, but it looks as if Raikkonen lost out during the first pit stop and then presumably stopped for a second time. (Did Bottas only stop once? I need to make this clear in the chart somehow....) In the last fifth of the race, Rosberg just marched off into the distance ahead.



Here's how it looked from Raikkonen's perspective:
In the first stint, Rosberg eased ahead, and Bottas slipped behind. In the second stint, Bottas grabbed the advantage and held it steady. In the third stint, Raikkonen made gains on Bottas following the second stop but not enough to pose any real threat. In the second and third stints there was no real competition from behind.

With Bottas competing strongly, how did teammate Massa fare? The first 10 laps were nip and tuck with DRS battles ahead and behind. Massa got past Ericsson and cleared Ricciardo, Perez, and Kvyat in quick succession, then took chase after Hulkenberg. Something happened (pit stops?) around lap 28, and Massa was left for dust by Raikkonen as Grosjean came storming up behind him, and then passed him. Was there a second stop for Massa about lap 36? He's lagging Vettel and not making any progress ahead, but threats behind are now all diminishing right to the end of the race.

So what happened to Grosjean? Left behind by Bottas in the first stint, Grosjean did the same to Hulkenberg behind. From about lap 25, the situation changes (first stop?) and Grosjean charges down Massa. He can't make progress on Raikkonen either before or after Raikkonen's second stop, but manages to keep Maldonado a safe distance behind. Is there then a second pit stop right at the end of the race?

Going in to the weekend, Ferrari had been hopeful of a strong showing, but with a start from the tail end of the grid, Vettel had his work cut out for him. Quickly picking off Nasr, Sainz and Alonso, and I'm guessing making an early pit stop given he had to pick off Nasr again (although easily achieved), Vettel appeared to get stuck behind Maldonado and then Grosjean before chasing down Hulkenberg, though he seems to have struggled making the pass (before Hulkenberg lost at least a couple of places?), and then Maldonado, though time ran out on him before he could get within striking distance of that extra place.


Finally, let's see how Max Verstappen fared. Despite a battle with Nasr ahead in the first stint, Merhi was left well behind. Having cleared Nasr, his superior pace showing as Nasr tailed off behind, Verstappen took on Alonso, passing him in seven laps and setting his sights on Sainz. From about lap 28, Verstappen is losing ground on Hulkenberg ahead, and to Kvyat threatening behind, and then Perez. Meeting up with Alonso again, he's lapped by Bottas, and encounters backmarkers Merhi and (later) Stevens, before passing Alonso for a second time. Despite making small gains on Ericsson, he's too far behind to make much further progress, but at least he's safe from Nasr behind, who only goes backwards.


So, that's my reading of those charts. What they do show is that to make more sense of them we need the race positions and pit information available, ideally somewhere on the chart? But they are still useful I think? (For what it's worth, I haven't seen the race yet, or read any reports of it. The above descriptions are based solely on the readings of the individual charts.)

Sunday, June 7, 2015

F1 Canada 2015 Qualifying Summary

Qualifying rank summary chart - columns show rank of each driver relative to each part of qualifying. From a quick scan, it seems VET was three tenths out on a Q2 place (a 1:17.344 to ALO's 1:17.012), and SAI was within a whisker of a Q3 place (1:16.042 to RIC's 1:16.006, which itself was close on PER's 1:15.974).
The qualifying session utilisation chart shown below shows the times of competitive laps on a horizontal x-axis representing elapsed session time and a y-axis showing the preliminary qualifying classification rank. Empty circles and dotted circles show other laps (inlaps, outlaps). Evolving purple (best overall laptime) and green (driver's best) laptimes are shown relative to each part of qualifying.


The chart shows how many stints each driver completed in each session; for example, the four top placed drivers in Q1 only completed a single stint in Q1, with the other drivers completing two. We can also see the stints in which each driver recorded their best times in each part of qualifying. For example, in Q3, HAM and ROS recorded their best laptimes in the first stint of that part of qualifying and did not improve in their second stints.

The following chart uses elapsed time into the first part of qualifying on the x-axis, and the grey line shows the evolution of the cut-off time. Red/green labels show laps corresponding to each driver's best recorded laptime; red denotes they missed the cutoff at the end of the first part of qualifying, green that they made the cutoff.

We see how in the first part of Q1, HAM and RAI set the early pace, then HAM and ROS stepped up the pace with RAI, BOT and MAL competing for third before RAI slotted the gap. In the second stint, HUL matched MAL, before GRO pushed himself to the top of the leaderboard. VET missed out on his Q2 place even before ERI secured his place and ALO improved on his session time at the end of the session. (It may be helpful to italicise labels to show driver's best but one laptime prior to their best laptime, so we can see what they improved from and whether their earlier best set time would have made the cutoff? The challenge is finding a graphical way to show whether a driver's time made the cut at the time they set the time?)

The following chart shows how Q2 developed. In particular, we see RIC and PER just making the cut (PER on his first stint time, RIC on his second), with SAI missing the cut by a whisker. HAM and ROS are both seen to set their times pretty much together.


In Q3, HAM and ROS set their times again during their first stint, with RAI, MAL and BOT setting good competitive times early on. GRO joined the fray for third, before second stint improvements by RAI and BOT, with BOT just missing out on third and MAL and GRO failing to improve on their earlier times. 



I guess I can try to watch the qualifying session now to see how this sort of report compares with what happened on screen!

Sunday, May 24, 2015

F1 Monaco - Qualifying Session, Graphical Review, Annotated

The summary charts below provide a graphical summary of the qualifying session of the 2015 Monaco Formula One Grand Prix.

The charts are intended to provide an at a glance summary of the session, yet also repay deeper reading for the journalist, or fan, looking for stories in the data.

To start with, here is what the official results sheet published by the FIA looks like. The drivers are ordered by final classification, along with the best laptime recorded by each driver in each part of qualifying they participated in, the time of day that laptime was recorded, and the number of laps they completed in each of those sessions.



The qualifying progression chart (rank based) attempts to summarise how the drivers were ranked in each part of qualifying, and show how they progressed through the qualifying sessions. In Q1 and Q2 I highlight the drivers that missed the cut. The best time recorded by each driver in each part of qualifying is also shown. Whilst the chart does not show how driver laptimes evolved, the fact that a driver's rank changed considerably may indicate that the laptime they achieved in one session did not develop in the same way as the laptimes of the other drivers. So for example, we see Verstappen slip going from Q2 to Q3 and then note that his time in Q3 was tenths down on his times in both Q1 and Q2. Similarly, Raikkonen failed to improve much on his Q2 time in Q3, whereas Ricciardo found almost 7 tenths and gained 3 places on his Q2 rank:
A qualifying session chart ordered by laptime shows more clearly how drivers' times evolved from session to session:
Hamilton's improvement going from Q2 to Q3 is noticeable...

Another source of data we can get from the FIA are laptimes recorded across the whole of the session. The raw data comes as a set of personally numbered laptimes for each driver.




The session utilisation chart records the laptime for each driver against session time (so we can see how far into the session each laptime was recorded, and how the laps are grouped). Non-representative laptimes - outlaps and inlaps - are depicted by symbols. The chart also records the best time the driver recorded across all three parts of qualifying, along with the total number of laps. Gaps are also provided based on the overall session best times.

Purple (best overall laptime) and green (driver's personal best) times can be calculated in two ways. Firstly, relative to qualifying overall, as shown below (the chart shows how the times evolve - eg read from left to right as session time evolves and you can see how purple and green (leaderboard) times evolve). The lack of greens in Q3 shows drivers who didn't improve on their Q2 times.
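One way the evolving purple and green flags might be calculated is sketched below. It assumes a data frame of competitive laps with illustrative column names driver, elapsed (session time in seconds) and laptime (seconds); the function name and the made-up times are not from the original charts. A lap is purple if it beats every earlier lap, green if it only beats the same driver's earlier laps.

```r
# Flag each lap as purple (new overall best), green (new personal best)
# or none, reading the session from left to right in time order.
flag_purple_green <- function(laps) {
  laps <- laps[order(laps$elapsed), ]
  # Best laptime set by anyone before this lap (Inf for the first lap)
  prev_overall <- c(Inf, head(cummin(laps$laptime), -1))
  # Best laptime set by the same driver before this lap
  prev_own <- ave(laps$laptime, laps$driver,
                  FUN = function(x) c(Inf, head(cummin(x), -1)))
  laps$colour <- ifelse(laps$laptime < prev_overall, "purple",
                 ifelse(laps$laptime < prev_own, "green", "none"))
  laps
}

# Made-up laps for three drivers
laps <- data.frame(driver  = c("HAM", "ROS", "HAM", "ROS", "VET"),
                   elapsed = c(100, 120, 400, 420, 430),
                   laptime = c(76.0, 75.5, 75.8, 75.6, 75.4))
flagged <- flag_purple_green(laps)
```

Running the function over the whole of qualifying gives the flags relative to qualifying overall; running it separately on the laps from Q1, Q2 and Q3 gives the flags relative to each part.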

The spatial layout allows us to see how the drivers timed their campaign during each part of qualifying.


We can also produce session utilisation charts depicting purple and green times relative to each part of qualifying - so showing how the times evolved within either Q1, Q2 or Q3 separately.


The session based laptime text plots (scatter plots) show when in the session each driver recorded their competitive laptimes (x axis is session time, y axis is laptime). The coloured labels show the best laptime recorded in the session by each driver and the colour whether that laptime was enough to get them through to the next round of qualifying.

The solid grey line shows the evolution of the session cutoff time across that part of qualifying. The dashed line shows the final cutoff time.
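The evolving cutoff line can be sketched as follows, under the assumption that the cutoff at any moment is the laptime of the driver currently holding the last qualifying place, based on each driver's best time so far. The function name, the column names (driver, elapsed, laptime) and the example data are illustrative.

```r
# After each lap, record the laptime currently needed to hold place ncut.
# ncut is the number of drivers who progress from this part of qualifying.
cutoff_evolution <- function(laps, ncut) {
  laps <- laps[order(laps$elapsed), ]
  best <- numeric(0)   # named vector of each driver's best time so far
  laps$cutoff <- NA_real_
  for (i in seq_len(nrow(laps))) {
    d <- as.character(laps$driver[i])
    # best[d] is NA the first time we see a driver; na.rm handles that
    best[d] <- min(best[d], laps$laptime[i], na.rm = TRUE)
    # No cutoff exists until at least ncut drivers have set a time
    if (length(best) >= ncut) laps$cutoff[i] <- unname(sort(best)[ncut])
  }
  laps
}

# Made-up session in which the top 2 drivers go through
laps <- data.frame(driver  = c("A", "B", "C", "A", "C"),
                   elapsed = c(10, 20, 30, 40, 50),
                   laptime = c(80, 79, 78, 77, 76))
evo <- cutoff_evolution(laps, ncut = 2)
final_cutoff <- tail(evo$cutoff, 1)
```

Plotting cutoff against elapsed as a step line would give something like the solid grey line, with final_cutoff providing the level for the dashed line.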

Here's the chart for Q1:
Here's the chart for Q2. A reading of this chart suggests that Grosjean and Button both just missed the cut. Button's time was set on his first run and not improved on, Grosjean's on his second run (around about 2170). At the time he made that second run, his time was inside the cut (below the solid grey line) but then Perez and Ricciardo posted improved times (around about 22050) and he slipped above the cut. Grosjean's final lap (just after 2300) didn't improve on his time, but Ricciardo's (at 2400) did.
Here are the Q3 times. The cutoff in this case is rather artificial, and represents the front row of the grid.


These last three charts are a little difficult to interpret at first, but as you learn to read them they become increasingly powerful. The session utilisation charts also repay a close reading, although I think they do still represent a useful glanceable chart. The session progression charts are intended to be fully glanceable and perhaps work best as a pair - labels are easily occluded in the laptime based chart, but clearly separated in the rank based chart. The Rank based chart might perhaps benefit from some horizontal line segments that group ranked drivers whose laptimes are close together?

Hopefully that quick review gets you started on how to read these charts, and how to start using them to look for stories within them. Remember: they are there to be read, and it takes time to learn how to read.

Saturday, May 23, 2015

F1 2015 Monaco Qualifying Review

(For an annotated version of this post that describes the charts - and some of the things we can read from them - see F1 Monaco - Qualifying Session, Graphical Review, Annotated.)

Qualifying progression chart (rank based):

Session utilisation chart - purple and green times relative to qualifying overall:


Session utilisation chart - purple and green times relative to each part of qualifying:


Q1 - competitive laptimes and session cutoff time evolution:

Q2 - competitive laptimes and session cutoff time evolution:

Q3 - competitive laptimes and front row cutoff time evolution: