Saturday, February 6, 2016

Updated Edition of "Wrangling F1 Data With R" Released

A new, updated version of Wrangling F1 Data With R is now available on Leanpub. As well as including a large number of minor corrections (I found out how to do a spell check in RStudio!) the latest release includes several new sections and chapters, including:

  • qualifying session analyses: chart the evolution of the cut-off time during each part of qualifying;
  • race track history charts: for each lap of the race, see how the cars were placed on track relative to the lap leader;
  • undercut detection: use laptime data to automatically spot when pit-stop undercuts took place.
For readers of this blog, at least the first 10 purchasers to use this link can benefit from an F1DataJunkie Blog discount...

(All proceeds are fed back into costs associated with producing the book (Dropbox fees, purchasing other books and relevant content), and hosting fees for some experimental, yet to be announced, services...)

Monday, July 6, 2015

F1 2015 Silverstone - Race Review

I actually got to hear radio commentary for the first and last 10 laps of the race today, but from the data it seems like I missed the most interesting bits...

So here's what the laptime data tells us about the on-track positions during the race:

Along the bottom is the on-track time in seconds to the lap leader, with lead lap number on the y-axis. The labels on the left show the lap leader; the text labels on the far right show the leader on the next lap.

After a couple of safety car laps following the first lap, it seems like a pretty hard battle was being fought at the head of the race for 20 laps or so, with MAS leading a four car group, but after a round of pit stops (presumably) HAM dominated the race. The dotted circles trace VET's race, showing how he was well down the order at the back of the second group in the first stint, but then, starting at lap 37, everything changed and he moved up the order, taking a strong third in the last few laps and, by the looks of it, putting in some pretty impressive laptimes as the cars in 4th and 5th fell ever further behind.

If we zoom in a bit to the head of the race, we can see a bit more clearly how the cars were grouped at the front of the field over the first 20 laps.
So how did the race look from different drivers' perspectives?

First up, HAM. The early phase of the race saw him tussling with BOT in DRS range ahead and ROS in DRS range behind. Presumably following a pit stop window around lap 20, HAM took the lead from MAS, who started to fall behind. What seems to be another round of pitstops at lap 44 put ROS into second, but HAM further extended his lead pretty much until the end of the race.

For ROS, as we have seen, the first 20 laps or so saw him riding on HAM's tail, though HUL behind fell off quickly. In the second stint, ROS was chasing BOT, and behind RAI failed to keep pace. Did something start to happen around lap 33, or was lap 33 a blip for ROS? Whatever the case, on lap 39 ROS makes it past BOT, then eases past MAS, though he fails to make progress on HAM ahead. Behind, MAS falls away quickly, and is replaced by VET as the chasing car, who makes ground in the last few laps but never represents a threat.

So how about the race from MAS' perspective? Leading through the first stint with team mate BOT in DRS distance behind, MAS lost out to HAM at lap 19 and started going backwards, with BOT fading slightly behind. At lap 39, ROS takes BOT, and eases past MAS as BOT fades horrendously behind. As if from nowhere (following a pit stop, perhaps) VET appears in front at lap 45, suddenly picking up pace and heading off into the distance ahead over the last few laps.

So how did it look from VET's seat? After following KVY through the safety car phase with PER behind, PER sneaks past to leave SAI on VET's tail before VET retakes PER and is back behind KVY at lap 9. At lap 14 VET takes KVY, something happens to HUL, and VET is now sitting on RAI's tail. RAI manages to pull out of DRS range until lap 38, and VET manages to keep KVY out of DRS range behind over that same period. At lap 38 something happens to RAI, and VET is on a charge against BOT. BOT loses 5s or so on lap 43 - does VET then pit? - and suddenly VET is ahead of MAS and facing ROS in the distance ahead, making slight gains on him over the last few laps but never enough to pose a real challenge. Meanwhile, all MAS can do behind is go backwards.
I'm guessing BOT is not a happy man tonight? Sandwiched between MAS and HAM for the first stint, I'm guessing with DRS flaps open all over the place, and then stuck between MAS and ROS after HAM's early pit stop, I wonder if MAS had been holding BOT back? Towards the end of the second stint, MAS does seem to start pulling away, but presumably BOT's tyres had been suffering as a result of being sandwiched - again. At lap 39, suddenly everything goes pear shaped. With backmarker traffic to negotiate, BOT goes backwards, fast; VET storms up from behind, followed by KVY, with MAS streaking away ahead. What a miserable day... and why did he go backwards?

In the other Ferrari, RAI was in a first stint tussle with HUL ahead and KVY behind, then quickly chased down ERI, presumably as HUL pitted? In the second stint, RAI was perhaps holding VET back as ROS made progress ahead? At lap 33 gains are made on ROS, then RAI loses it. ROS pushes forward, PER pushes past at a rate of knots, then presumably pits, gaining on RAI at several seconds per lap before RAI presumably pits and then it's PER who goes backwards.

And so, to HUL, and a chart which seems quite, erm, spacious. A first stint spent holding off RAI behind saw ROS at the back of the lead group streak away ahead. In the second stint, it's KVY who peels off ahead, as ERI is passed, and presumably passed by SAI, who is also then left for dust. Up front, KVY dances ahead, falls back, and takes off again several times, while behind, PER and RAI are all over the place... What on earth was going on with RAI?!
Finally, let's have a look at KVY's chart. A first stint spent chasing first RAI, then HUL, whilst being pursued by VET, then PER, then VET again, and PER again, turns into a second stint pulling away from HUL but failing to get within DRS range of VET, who starts to extend the gap ahead, slowly at first, but then more easily, before being closed up on again. Suddenly, lap 32, VET is gone, and lapped ALO is dispatched at 5s a lap as BOT gets drawn in by 3s a lap, though it looks like he might have been taken had there been another lap in the race?

Hmmm...eventful... though I have to admit, I am a bit confused by RAI's final third of the race:
Ah - that makes it a bit clearer - he was tailing off badly from lap 39, but then managed to hold ground just behind HAM, albeit a lap behind, over the last 4 laps....

And I'm left wondering what on earth happened to BOT?

Ah, ha... similar problem, with it all going completely pear shaped on lap 42. He was also lucky not to lose that 5th place at the end, wasn't he?!

Sunday, July 5, 2015

F1 2015 Silverstone Qualifying Summary

A quick summary of progression through the qualifying session - the rank ordered progression chart:

And the time-based progression chart which shows a little more clearly how times are bunching up:

Here are the original session utilisation charts, showing the times recorded by each driver according to elapsed session time. Firstly with green and purple times measured across all three parts of qualifying:

Secondly, with green and purple times relative to each session:

There is, however, a problem with all my laptime based analyses - they include times that were deleted:

So what I need to do is refine everything to include a scraper that pulls excluded times from the Stewards' report and tweaks the laptime datafile, perhaps with another column, Excluded, taking true/false values. But no time to do that just at the moment:-(
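A minimal sketch of how that Excluded column might be added, assuming the deleted times had already been scraped into a data frame of their own. The column names (driver, lap, time) and the sample values are made up for illustration; they are not taken from the Stewards' report or from the book's code.

```r
# Hypothetical qualifying laptimes: one row per timed lap
laptimes <- data.frame(
  driver = c("HAM", "HAM", "ROS", "VET"),
  lap    = c(3, 6, 4, 5),
  time   = c(93.1, 92.2, 92.6, 93.4)
)

# Hypothetical deleted laps, as they might come back from a scraper
deleted <- data.frame(driver = "HAM", lap = 6)

# Flag each laptime as excluded if its driver/lap pair appears in `deleted`
flagExcluded <- function(laptimes, deleted) {
  key <- function(df) paste(df$driver, df$lap, sep = "_")
  laptimes$Excluded <- key(laptimes) %in% key(deleted)
  laptimes
}

flagged <- flagExcluded(laptimes, deleted)

# Downstream analyses could then simply filter on the new column
valid <- flagged[!flagged$Excluded, ]
```

Keeping the deleted laps in the data frame, rather than dropping them at scrape time, means the charts could still show them (greyed out, say) if that turned out to be useful.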

So in an alternate universe where there were no deleted times, here's how cutoff times evolved:

Details of how these charts were created can be found in the Wrangling F1 Data With R Leanpub book.
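As a rough illustration of the idea behind the cut-off chart (this is not the book's code), the cut-off can be treated as a running calculation: each time a lap is completed, update that driver's best time and read off the n-th best of the bests. The column names (driver, elapsed, time) and the function name are assumptions for the sketch, and deleted times are ignored.

```r
# laps: data frame with hypothetical columns
#   driver  - driver code
#   elapsed - session time (s) at which the lap was completed
#   time    - laptime (s)
# n: number of cars that progress to the next part of qualifying
cutoffEvolution <- function(laps, n) {
  laps <- laps[order(laps$elapsed), ]
  best <- numeric(0)                    # named vector: best time so far per driver
  laps$cutoff <- NA_real_
  for (i in seq_len(nrow(laps))) {
    d <- as.character(laps$driver[i])
    if (is.na(best[d]) || laps$time[i] < best[d]) best[d] <- laps$time[i]
    # The cut-off only exists once at least n drivers have set a time
    if (length(best) >= n) laps$cutoff[i] <- unname(sort(best)[n])
  }
  laps
}

# Toy session: four drivers, top two go through
laps <- data.frame(
  driver  = c("A", "B", "C", "A", "D"),
  elapsed = c(1, 2, 3, 4, 5),
  time    = c(100, 99, 98, 97, 96)
)
res <- cutoffEvolution(laps, n = 2)
```

Plotting cutoff against elapsed as a step line would give the evolution of the cut-off time through the session.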

Sunday, June 28, 2015

Formula E 2014/2015 Championship Race As The Circuit Hits London

With race results data for the Formula E Championship, as well as Formula One, available from the ergast motorsport results database API, it's easy enough to reuse the code that generates F1 Championship race charts (eg as originally described here) to produce similar charts for the Formula E Championship.

So for example, here's how the championship race stood as the Formula E circus came to London:

Along the horizontal x-axis we have each round of the championship, the vertical y-axis is championship standing at the end of each round. The number represents the total championship points score by each driver at the end of each round, the red colour identifying those drivers who scored points in each round.
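The data behind a chart like this can be sketched in a few lines of base R. This is an illustration rather than the original code: it assumes a results data frame with one row per driver per round and hypothetical columns round, driver and points, and it ranks ties by giving them the same standing rather than applying the championship's own tie-break rules.

```r
championshipRace <- function(results) {
  results <- results[order(results$driver, results$round), ]
  # Running points total for each driver across rounds
  results$total <- ave(results$points, results$driver, FUN = cumsum)
  # Standing within each round, highest total first
  results$standing <- ave(-results$total, results$round,
                          FUN = function(x) rank(x, ties.method = "min"))
  # Flag used to colour the label red when points were scored in that round
  results$scored <- results$points > 0
  results[order(results$round, results$standing), ]
}

# Toy data: three drivers over two rounds
results <- data.frame(
  round  = c(1, 1, 1, 2, 2, 2),
  driver = c("A", "B", "C", "A", "B", "C"),
  points = c(25, 18, 0, 0, 25, 18)
)
champ <- championshipRace(results)
```

Each row of champ then gives the x position (round), y position (standing), label (total) and colour (scored) for one point on the chart.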

For more information, see the Wrangling F1 Data With R book.

Tuesday, June 23, 2015

Austria 2015 Grand Prix Review

Once again, a weekend away meant I missed the race - and still haven't had a chance to watch it. So for the second race in a row, here is my otherwise blind reading of the charts.

First up, qualifying. The session utilisation chart shows two outings in each part of qualifying, with neither ROS nor HAM improving in their second runs in Q3. MAS looked to have reasonable pace in both Q1 and Q2, setting purple times in the second runs, with SAI looking like he pulled a blinder in his second Q1 run, taking purples then finishing with a time that looks like it just missed out on the best time of the session? VET appears to have played it cool in Q2, sitting out the initial run period and then completing just a single stint in the back half of the session. ROS seems to have disappointed, not improving on his Q2 time, and HAM only seems to have improved on his Q2 time with the pole setting lap on his first run in Q3. In Q3, GRO only seemed to manage a spotter lap, and VES appears to have gone out just at the end of the session.

Let's see how the classifications played out:
SAI was up there in Q1, wasn't he! Though by the looks of it, HAM at least only did what he had to. The Q1 cut-off time was just a tenth away from a passage into Q2, with RAI missing out by another tenth! What is going on at Ferrari?! Going from Q2 to Q3, if NAS had repeated his Q2 time he'd have made another place on the grid. MAS improved over five tenths going Q2 to Q3, compared to an improvement of less than three tenths by teammate BOT, who with a similar improvement would have made the second row of the grid.

There also seems to be a glaring error with the chart - RIC is there in Q2 but seems to fail to make the cut-off in Q1? The data is scraped from the FIA website, so is my scraper wrong?

Ah - seems the FIA are up to their old tricks again, posting the wrong time (though at least they have the order right this time, which they haven't in other results; in this case it should have been a 1:11.973...). I wonder if they do this deliberately, to contaminate the data for anyone scraping the results (and presumably not caring about the two fingers they stick up to anyone actually consulting results on the FIA site because they think that site offers some sort of website of record). For all the money in F1, they really do take the p**s sometimes.

Looking at quali progression in terms of closeness of times - though who can believe them? I'll try to find a better source than the FIA website for future runs of these reports - we see how close things were around the Q1 cut-off time and the Q2 midfield. The Q3 times separate into three groupings, with close clusters at the back of the grid and in the middle positions, and clearer separation between the top 3 places. The chart is pretty cluttered for reading the names - I think this chart really needs to be side by side with the rank based one (and plotted using the correct data...)
In terms of how the sessions progressed, we see more accurately what went on, basing the chart explicitly on official laptimes rather than the error-prone numbers on the FIA website. Along the bottom is elapsed time into each part of qualifying; on the y-axis, the laptime. Here we see how PER did indeed miss the cut, being pipped by ALO as they crossed the line close to each other, but perhaps remaining hopeful of getting through until KVY, NAS, MAL and VES improved their times and brought the cut-off time down. (I'm thinking now that the Q1 part of the session utilisation chart might be handy to have plotted in alignment with this chart?) SAI's progress is also evident, his best time at the very end demonstrating an impressive run, the first flying lap of which would also have made the cutoff time easily.

Moving on to Q2, we see the majority of drivers who made it into Q3 setting their best times on their second run, the only exception being KVY who (just) failed to beat his first run time. ROS seems to have dominated this part of qualifying, with ERI, SAI and RIC all seeming pretty evenly matched, if just pipped by MAL, though they all missed the cutoff, and ALO trailing them all.
In Q3, ROS and HAM both set their best times on their first runs and then - from the lack of competitive times in their second runs - presumably both messed up those second runs? If that was the case, with a bit more of a push, VET could perhaps have split the Mercedes and made the front row of the grid?
Moving on to the race, and a chart that plots track position in each leadlap (leadlap counts up on the y-axis, the x-axis is the on-track gap in seconds between each car and the lap leader), the failure to make much progress in the first five laps, followed by a close bunching on lap 6 and then a gap opening up between the leading cars on lap 7 suggests a safety car came out following a first lap incident? The breaks in the chart suggest a one stop strategy predominated with the majority of stops around laps 35-38 (I really need to identify stopping laps somehow, and perhaps purple and green laps too?)
Quite a few battles are evident in the first stint, and after the pit stops the race came to a head in the closing laps between third and fourth, seventh and eighth, and perhaps even ninth and tenth? ROS seems to have led pretty much from the start - so presumably took HAM either at the start or had a great getaway from the safety car (the chart cuts off the lap leader in those early laps - I need to add the race leader name label to the left hand side of the chart too, I think, perhaps with the leader's lap time?).

So let's see if we can now try to piece together elements of the race from a variety of battle charts, which show lap number along the x-axis, time to car ahead on positive y, and time to car behind on negative y (red labels are lapped cars). To begin with, looking at HAM's chart, we see how he lost the lead to ROS from the off (presuming he started from pole and his time based preliminary qualifying position wasn't changed), kept tabs with him during the first stint whilst pulling away from VET in third, lost out a little in the pit stop (ROS pitting first) and then failed to make much progress until the very end, when it looks as if backmarkers may have impeded ROS' progress?
Looking to VET's race, we see how VET lost ground to HAM in the first stint whilst drawing away from MAS in third. Something presumably went badly wrong in the pits with VET dropping back to fourth behind MAS. The second stint saw VET doggedly trying to rein MAS back, though he ran out of race distance before he could make the pass back into a podium finishing position. In fifth place behind, BOT could only go backwards.
So how did BOT see the race? The first half of his race was characterised by a battle ahead with VES, whom he passed about lap 14(?) and then HUL. BOT clearly had the pace on RIC, and also managed to draw away from HUL having passed him around about lap 38, but it seems there was never any hope of catching VET who steam rollered ahead.

From VES' point of view, the early part of the race saw DRS range fighting ahead with HUL and behind with BOT, before BOT made the pass. After BOT took RIC, VES charged him down and passed him at a steady rate of knots, but could make no progress on HUL ahead. Behind, it seems RIC eventually lost out to MAL, who then set his sights on VES and gave him a hard time on the DRS line from about lap 59 until he sneaked past in the closing couple of laps of the race.
Looking from MAL's perspective, the race seems quite a lively one. Taking PER shortly after the safety car, and then battling to get past first GRO and then PER, MAL entered clean air around about lap 40 and started to take chunks of time out of RIC, who seems to have been easily passed, and then VES, who seems to have put up much more of a fight, once caught, though he succumbed right at the end of the race.
As we have seen, RIC appeared in many stories, though from his own perspective it seems to have been one of cars zooming away ahead on track at first, MAL drawing away in the first quarter of the race, BOT in the second, VES in the third, and then a switch in fortune to watch cars looming ahead and zooming away behind - first NAS ahead and KVY behind, then PER ahead and NAS behind.
Looking elsewhere around the track, we see a couple of tracking races in the early part of the race, before drivers presumably dropped out. Firstly, GRO, who spent the first twenty laps caught between PER and MAL before making progress against KVY and then presumably dropping out of the race around about lap 35...?
...which also seems to be the time called on SAI, whose first 25 laps were spent sandwiched between NAS and PER, before NAS bolted off ahead and SAI presumably called it a day.

So, there we have it, my reading of the race from the data, and a realisation that some of the data I have (the data scraped from the FIA website) is error prone, though whether deliberately or through error, I don't know. (What I do know, though, is that I have succumbed to at least one other error when trying to use FIA official classification website data earlier this year...)

Wednesday, June 10, 2015

How the F1 Canadian Grand Prix Race Evolved on Track

In response to an email query about battlemaps, I had a quick look at ways of viewing the on track evolution of the race (and in doing so noticed I'd fallen short in the laptime data I was grabbing from the ergastAPI - it has a cutoff of 1000 records in a response and with 70 laps in the race, this was capping out the data I got back at just over 50 laps - so the previous battlemaps ran short of the whole race distance. No-one pointed it out though, so I guess I am just talking to myself here...!)
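One way round the 1000 record cap is a generic paged request, asking for successive pages until everything has been collected (the ergast API accepts limit and offset query parameters for this). The sketch below is an assumption about how such a fetcher might look, not the patched code: fetchAll and fetchPage are hypothetical names, and fetchPage is expected to return a list with a rows data frame and the total record count. A fake page function stands in for the real HTTP call so the example runs offline.

```r
# Keep requesting pages until all records reported by the API are collected.
# fetchPage(limit, offset) is a hypothetical function returning
# list(rows = <data frame>, total = <total number of records>).
fetchAll <- function(fetchPage, pageSize = 1000) {
  pages <- list()
  offset <- 0
  repeat {
    page <- fetchPage(limit = pageSize, offset = offset)
    pages[[length(pages) + 1]] <- page$rows
    offset <- offset + nrow(page$rows)
    # Stop on an empty page (guards against looping forever) or when done
    if (nrow(page$rows) == 0 || offset >= page$total) break
  }
  do.call(rbind, pages)
}

# Stand-in for the API: 70 laps x 20 cars = 1400 records
fakeLaps <- data.frame(lap = rep(1:70, each = 20),
                       driver = rep(1:20, times = 70))
fakePage <- function(limit, offset) {
  idx <- seq_len(nrow(fakeLaps))
  idx <- idx[idx > offset & idx <= offset + limit]
  list(rows = fakeLaps[idx, , drop = FALSE], total = nrow(fakeLaps))
}

capped  <- fakePage(limit = 1000, offset = 0)$rows   # what a single request returns
allLaps <- fetchAll(fakePage)                        # what paging returns
```

With a full 20 car field, a single request tops out at lap 50 of 70; retirements are why the real data ran to just over 50 laps.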

Anyway, data grabber hack patched for now (I really should tidy up the fetcher to use a generic paged request), here's a quick chart to show how the race evolved on track overall. The vertical y-axis is the leadlap, the horizontal x-axis is gap to leader. Cars off the lead lap are highlighted.

#Grab some data
lapTimes =lapsData.df(2015,7)

#Process the laptimes

#Find the accumulated race time at the start of each leader's lap

#Find the on-track gap to leader

#Plot the on-track gap to leader versus leader lap

#Highlight a particular driver
#Overplot with laps behind for lapped drivers
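
The processing steps named in the comments above might be sketched along the following lines. This is not the code that produced the chart; it is a minimal base R illustration, assuming a laptimes data frame with hypothetical columns driverId, lap and rawtime (laptime in seconds), and laps numbered contiguously from 1. The function name trackGaps is made up for the example.

```r
trackGaps <- function(lapTimes) {
  lapTimes <- lapTimes[order(lapTimes$driverId, lapTimes$lap), ]
  # Accumulated race time as each car completes each of its laps
  lapTimes$acctime <- ave(lapTimes$rawtime, lapTimes$driverId, FUN = cumsum)
  # Accumulated time at which the leader completes each lap
  leaderTimes <- as.numeric(tapply(lapTimes$acctime, lapTimes$lap, min))
  # Lead lap: how many laps the leader had completed when this car crossed the line
  lapTimes$leadlap <- findInterval(lapTimes$acctime, leaderTimes)
  # On-track gap: time since the leader last crossed the line
  lapTimes$trackgap <- lapTimes$acctime - leaderTimes[lapTimes$leadlap]
  # Number of laps down on the leader (0 for cars on the lead lap)
  lapTimes$lapsbehind <- lapTimes$leadlap - lapTimes$lap
  lapTimes
}

# Toy race: A leads, B follows, C is slow enough to be lapped
toy <- data.frame(
  driverId = c("A", "A", "A", "B", "B", "B", "C", "C"),
  lap      = c(1, 2, 3, 1, 2, 3, 1, 2),
  rawtime  = c(90, 90, 90, 95, 95, 95, 150, 150)
)
gaps <- trackGaps(toy)
```

Plotting trackgap on x against leadlap on y, and marking rows where lapsbehind is greater than zero, gives the raw material for the chart described here.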


The chart helps us identify areas of the field where battles appear to be taking place, and by highlighting particular drivers we can see how they fared compared to the leader (or we might highlight two drivers who were close at the end to see how their various race strategies played out).

We also notice something of the speed of the lead car - around about laps 38 to 40 we see cars in 2nd, 3rd and 4th gaining time back from the leader, for example. We can also see that cars in 5th (Vettel) and 6th (Massa) were keeping pace with the leader from about lap 38.

To find battles automatically, we could use the stint detection code explored previously. It also struck me that we could generate a graph for each lap going trackpos1-trackpos2-trackpos3 etc with a weight equal to the gap time between cars, and then simply prune the graph with gaps above a certain size to identify battlegroupings. (See also Identifying Position Change Groupings in Rank Ordered Lists for another handy way of using graph based approaches.)
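Because the graph for a single lap is just a chain of cars in track order, pruning the long edges and reading off the connected components comes down to splitting the ordered list wherever the gap exceeds the threshold. The sketch below uses that shortcut in base R instead of a graph library; the function name battleGroups and the 1 second default threshold are assumptions for illustration.

```r
# trackgap: on-track gap to the leader (s) for each car on a given lead lap
# maxGap:   gaps between neighbouring cars larger than this break the chain
# Returns a battle group id for each car, in the same order as the input.
battleGroups <- function(trackgap, maxGap = 1) {
  ord <- order(trackgap)
  gapToCarAhead <- diff(trackgap[ord])          # edge weights along the chain
  # A new group starts at the first car and after every pruned edge
  group <- cumsum(c(TRUE, gapToCarAhead > maxGap))
  out <- integer(length(trackgap))
  out[ord] <- group
  out
}

g1 <- battleGroups(c(0, 0.8, 1.5, 6.0, 6.7, 15))
g2 <- battleGroups(c(6.0, 0, 15, 0.8))
```

Groups containing more than one car are then the candidate battles for that lap; tracking how group membership changes from lap to lap would be the next step.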

Monday, June 8, 2015

F1 Canada 2015 Battlemaps - How the Race Happened from the Drivers' Perspective

The following battlemaps don't show all the laps of the race - my code was maxing out on requesting lap data from the ergastAPI. I've fixed that now, but won't be updating the following charts to show the whole race. Hopefully, future charts will all be complete...

Battlemaps (or battle charts, I keep flitting between the two) are akin to cut down race history charts from a particular driver's point of view, showing the distance to the cars immediately ahead and behind in terms of position, and on track.
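The position-based part of a battlemap might be derived from laptime data roughly as follows. This is a sketch under assumptions, not the code behind the charts: the column names (driverId, lap, rawtime) and the function name battleData are hypothetical, race position on a lap is taken from the order in which cars complete that lap, and the on-track neighbours (which can include lapped cars) are not handled.

```r
battleData <- function(lapTimes, driver) {
  lapTimes <- lapTimes[order(lapTimes$driverId, lapTimes$lap), ]
  lapTimes$driverId <- as.character(lapTimes$driverId)
  # Accumulated race time as each car completes each lap
  lapTimes$acctime <- ave(lapTimes$rawtime, lapTimes$driverId, FUN = cumsum)
  perLap <- lapply(split(lapTimes, lapTimes$lap), function(d) {
    d <- d[order(d$acctime), ]                  # finishing order for this lap
    i <- match(driver, d$driverId)
    if (is.na(i)) return(NULL)                  # driver did not complete this lap
    hasAhead  <- i > 1
    hasBehind <- i < nrow(d)
    data.frame(
      lap       = d$lap[i],
      ahead     = if (hasAhead) d$driverId[i - 1] else NA_character_,
      gapAhead  = if (hasAhead) d$acctime[i] - d$acctime[i - 1] else NA_real_,
      behind    = if (hasBehind) d$driverId[i + 1] else NA_character_,
      gapBehind = if (hasBehind) d$acctime[i + 1] - d$acctime[i] else NA_real_
    )
  })
  do.call(rbind, perLap)
}

# Toy race: B runs between A and C
toy <- data.frame(
  driverId = c("A", "A", "B", "B", "C", "C"),
  lap      = c(1, 2, 1, 2, 1, 2),
  rawtime  = c(90, 90, 92, 91, 95, 95)
)
b <- battleData(toy, "B")
```

For the chart itself, gapAhead would be plotted above the axis and gapBehind below it, with the ahead and behind codes used as the text labels.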

Here's a selection of battlemaps for some of the drivers for the 2015 F1 Canadian Grand Prix.

First up, Bottas' race: during the first stint he left Grosjean behind, and slowly lost ground to Raikkonen, but it looks as if Raikkonen lost out during the first pit stop and then presumably stopped for a second time. (Did Bottas only stop once? I need to make this clear in the chart somehow....) In the last fifth of the race, Rosberg just marched off into the distance ahead.

Here's how it looked from Raikkonen's perspective:
In the first stint, Rosberg eased ahead, and Bottas slipped behind. In the second stint, Bottas grabbed the advantage and held it steady. In the third stint, Raikkonen made gains on Bottas following the second stop but not enough to pose any real threat. In the second and third stints there was no real competition from behind.

With Bottas competing strongly, how did teammate Massa fare? The first 10 laps were nip and tuck with DRS battles ahead and behind. Massa got past Ericsson and cleared Ricciardo, Perez, and Kvyat in quick succession, then took chase after Hulkenberg. Something happened (pit stops?) around lap 28, and Massa was left for dust by Raikkonen as Grosjean came storming up behind him, and then passed him. Was there a second stop for Massa about lap 36? He's lagging Vettel and not making any progress ahead, but threats behind are now all diminishing right to the end of the race.

So what happened to Grosjean? Left behind by Bottas in the first stint, Grosjean did the same to Hulkenberg behind. From about lap 25, the situation changes (first stop?) and Grosjean charges down Massa. He can't make progress on Raikkonen either before or after Raikkonen's second stop, but manages to keep Maldonado a safe distance behind. Is there then a second pit stop right at the end of the race?

Going in to the weekend, Ferrari had been hopeful of a strong showing, but with a start from the tail end of the grid, Vettel had his work cut out for him. Quickly picking off Nasr, Sainz and Alonso, and I'm guessing an early pit stop given he had to pick off Nasr again (although easily achieved), Vettel appeared to get stuck behind Maldonado and then Grosjean before chasing down Hulkenberg, though he seems to have struggled making the pass (before Hulkenberg lost at least a couple of places?), and then Maldonado, though time ran out on him before he could get within striking distance of that extra place.

Finally, let's see how Max Verstappen fared. Despite a battle with Nasr ahead in the first stint, Merhi was left well behind. Having cleared Nasr, his superior pace showing as Nasr tailed off behind, Verstappen took on Alonso, passing him in seven laps and setting his sights on Sainz. From about lap 28, Verstappen is losing ground on Hulkenberg ahead, and to Kvyat threatening behind, and then Perez. Meeting up with Alonso again, he's lapped by Bottas, and encounters backmarkers Merhi and (later) Stevens, before passing Alonso for a second time. Despite making small gains on Ericsson, he's too far behind to make much further progress, but at least he's safe from Nasr behind, who only goes backwards.

So, that's my reading of those charts. What they do show is that to make more sense of them we need the race positions and pit information available, ideally somewhere on the chart? But they are still useful I think? (For what it's worth, I haven't seen the race yet, or read any reports of it. The above descriptions are based solely on the readings of the individual charts.)