Where you race series · part 2 of 2
The world's fastest marathon courses
532,375 finisher results, 14 marathons, and the 38,539 runners who raced more than one, used to settle which courses actually make you fast, and which only look that way because fast people show up.
17 June 2026 · 7 min read
TL;DR
- Some marathon courses really are faster, but the spread is modest, about 12 minutes end-to-end for a 3:30 runner, and most of it is the two extremes: California International’s downhill at one end, New York’s bridges at the other.
- We separate the course from the field by following the same runners between cities rather than ranking finish times, the same within-athlete trick behind our Britain’s fastest tracks piece, here across half a million finishers.
- Trust the order, not the exact gap: read it as which way a course tips you, not a guaranteed time. The converter below turns it into a number you can carry.
A marathon time is two things tangled together: how fit you were, and where you ran. A 2:55 in Valencia and a 2:55 in New York are not the same run. So "which course is fastest?" can't be answered by ranking finish times. Fast courses attract fast people, so you'd just be measuring the fields, not the tarmac.
The dataset behind this is big: 532,375 finisher results across 14 marathons in 4 countries. Of those finishers, 38,539 people raced two or more of the courses, and they're the ones who tie the whole map together. The streams flowing between cities above are those shared runners; each course glows by how much faster or slower the same legs go there.
The trick: let the same legs decide
You can't compare my time to yours: different fitness, different day. But you can compare my Berlin to my London. Same body, weeks apart: the difference is mostly the course.
So we throw away raw rankings and look only at runners who appear on more than one course. For each of them we measure how their time on a given course compares to their own average across the courses they ran. Average that personal gap over everyone who raced a course, and the runners' fitness cancels out, leaving the course itself. It's the same within-athlete method behind the hardest ultras piece, and the one we laid out step by step (placebo tests, out-of-sample checks and all) for Britain's fastest tracks. If you want the method poked at from every angle, start there; this piece is that idea scaled up to the world's biggest marathons.
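The paragraph above can be sketched in a few lines. This is a minimal illustration, not the production pipeline: the `(runner_id, course, seconds)` tuple layout and the function name `course_effects` are assumptions made for the example, and it works in log time so that a gap reads as a rough percentage.

```python
import math
from collections import defaultdict


def course_effects(results):
    """Average within-runner gap per course.

    results: iterable of (runner_id, course, seconds) tuples (assumed layout).
    Returns {course: mean log-time gap}; negative means the same legs go faster.
    """
    # Collect each runner's log time per course.
    # Assumption: one result per runner per course (a later row overwrites an earlier one).
    by_runner = defaultdict(dict)
    for runner, course, seconds in results:
        by_runner[runner][course] = math.log(seconds)

    gaps = defaultdict(list)
    for courses in by_runner.values():
        if len(courses) < 2:
            continue  # single-course runners add no link, so they are skipped
        own_mean = sum(courses.values()) / len(courses)
        for course, log_time in courses.items():
            # How this run compares to the runner's own average.
            gaps[course].append(log_time - own_mean)

    # Averaging the personal gaps cancels fitness and leaves the course.
    return {course: sum(g) / len(g) for course, g in gaps.items()}


# Made-up data: two runners of different fitness, both 10% slower on course "Y".
demo = [
    ("a", "X", 10000), ("a", "Y", 11000),
    ("b", "X", 12000), ("b", "Y", 13200),
    ("c", "X", 9000),  # raced only one course, so ignored
]
effects = course_effects(demo)
```

Runner "b" is slower than runner "a" everywhere, yet both contribute the same gap to each course, which is the point of the method.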
The whole thing rides on overlap, and there's plenty: the heaviest links are Berlin↔New York (5,953 shared runners), Chicago↔New York, London↔New York. Big-city marathoners are repeat travellers, and that's exactly what makes them useful here. Most pairs of courses share nobody directly (almost no one has raced both Seville and Houston), but they don't need to. Like chess ratings, where a chain of games connects two players who never met, a chain of shared runners connects two courses: Seville links to Berlin through the runners they share, Berlin on to New York, until every course hangs in one connected web and each is pinned relative to all the rest.
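Whether every course really hangs in one web is a plain graph question, and it can be checked with a breadth-first search. The sketch below assumes the links arrive as pairs of course names; the function name `connected_components` and the sample pairs (taken from the links named above) are illustrative only.

```python
from collections import defaultdict, deque


def connected_components(links):
    """Group courses that are joined, directly or through a chain, by shared runners.

    links: iterable of (course_a, course_b) pairs with at least one shared runner.
    """
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    seen, components = set(), []
    for start in list(graph):
        if start in seen:
            continue
        # Breadth-first walk outwards from an unvisited course.
        queue, group = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(group)
    return components


links = [
    ("Seville", "Berlin"),
    ("Berlin", "New York"),
    ("Chicago", "New York"),
    ("London", "New York"),
]
components = connected_components(links)
```

One component means every course can be placed relative to every other; more than one would mean some courses float free and cannot share a scale.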
The ranking
Read it as: a bar to the left means the same runners beat their own average on that course; to the right means they ran slower. A few things jump out:
- California International (CIM) is fastest, and nobody who's run it will be surprised: it's net downhill and point-to-point, the classic Boston-qualifier rocket.
- Amsterdam, Berlin, Valencia form the flat-and-fast tier. Berlin is where world records get set; the data agrees without being told.
- New York is slowest by a wide margin. The bridges and hills are real, and they cost the same runner over two minutes versus a flat course.
- End to end, the spread is about 12 minutes for a 3:30 marathoner, purely from the course.
Does it pass the smell test?
This is the part that matters: the model is given no map, no elevation profile, no reputation, only times and who ran where. And it independently rediscovers what runners already know. The net-downhill PB course comes out on top. The world-record course sits in the fast tier. The famously brutal one lands dead last. When a method recovers the known answers from nothing but the data, you can start to trust it on the cases you don't already know: the mid-table courses where reputation is just folklore.
The hard part: knowing it's the same runner
The whole method rests on one deceptively simple step: recognising that the Berlin finisher and the London finisher are the same person. There's no universal runner ID across races; each result set numbers its own field and nothing more. So we have to re-identify people from what's printed next to their time, and that turns out to be the hard, fragile heart of the project.
The trap is false matches. Merge two different "James Smith"s into one and you've averaged two unrelated runs, and because every such mistake pulls a course's estimate toward the middle, a sloppy matcher quietly flattens the whole ranking toward "all courses are average." So the rule throughout is precision over recall: far better to miss a real match than to invent a fake one. A runner we fail to link just doesn't contribute; a runner we link wrongly poisons the result.
Matching runs in tiers, using the strongest identifier each pair of records actually shares:
- A stable ID, where it exists. A few sources carry the same athlete identifier across their own events. When present, that's ground truth, no guessing.
- Name + sex + nationality. The default across sources. Names are messy ("Lastname, First", "First Last", whole strings, accents, capitalisation), so every name is reduced to an order-independent set of cleaned tokens, and "Kipchoge, Eliud" and "Eliud Kipchoge" collapse to the same key. Nationality is normalised too (the same country is written GER, DEU or DE depending on the source).
- Name + sex + birth year. Some sources give no nationality but do give exact age, so we back out a birth year and match on that instead, which is strong because a name plus a precise birth year is close to unique.
- Name + sex + running club. The regional races publish neither nationality nor exact age, only an age band. There, the club does the heavy lifting: "J. Smith, Leeds City AC" is far more unique than "J. Smith", and the same club recurs across a runner's races.
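The name and nationality cleaning in the tiers above might look something like this. It is a sketch under assumptions, not the matcher itself: the helper names `name_key` and `normalise_country`, the choice of DEU as the canonical code, and the one-country alias table are all made up for illustration.

```python
import re
import unicodedata

# Illustrative alias table: GER (IOC), DEU (ISO alpha-3) and DE (ISO alpha-2) all mean Germany.
# A real table would cover every country; the canonical form chosen here is an assumption.
COUNTRY_ALIASES = {"GER": "DEU", "DEU": "DEU", "DE": "DEU"}


def name_key(raw):
    """Reduce a printed name to an order-independent set of cleaned tokens."""
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFKD", raw)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep runs of letters only; commas, dots and digits act as separators.
    tokens = re.findall(r"[^\W\d_]+", stripped.lower())
    return frozenset(tokens)


def normalise_country(code):
    """Map a source-specific country code to one canonical form."""
    cleaned = code.strip().upper()
    return COUNTRY_ALIASES.get(cleaned, cleaned)


same_person_key = name_key("Kipchoge, Eliud") == name_key("Eliud Kipchoge")  # True
```

Using a set rather than a sorted string is a design choice: it ignores word order and repeated tokens alike, at the cost of treating "Ann Ann Lee" and "Ann Lee" as the same key.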
Then the guards, which is where most of the work goes:
- A name that appears twice in one race is provably two people (nobody runs the same race twice), so that name is too ambiguous to trust and is dropped, not guessed.
- Birth years that don't reconcile across a person's supposed races split them back into separate people.
- Common regional names only bridge to the world network when there's exactly one plausible match there, never on a coin-flip.
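The first guard is simple to express in code. The sketch below assumes each race is a list of already-built match keys (plain strings here for readability) and reads "dropped" as removed from every race, which is an interpretation rather than something the text spells out; `drop_ambiguous` is a hypothetical name.

```python
from collections import Counter


def drop_ambiguous(races):
    """Remove any match key that appears more than once within a single race.

    races: {race_name: [match_key, ...]} (assumed layout).
    Returns {race_name: set of keys that are safe to link}.
    """
    ambiguous = set()
    for keys in races.values():
        counts = Counter(keys)
        # The same key twice in one race is provably two different people.
        ambiguous.update(key for key, n in counts.items() if n > 1)
    # Assumption: an ambiguous key is dropped everywhere, not only in the race that exposed it.
    return {race: set(keys) - ambiguous for race, keys in races.items()}


races = {
    "Berlin": ["smith james|m", "smith james|m", "jones amy|f"],
    "London": ["smith james|m", "jones amy|f"],
}
safe = drop_ambiguous(races)
```

Here the London "smith james" is lost even though London lists him once, which is the precision-over-recall trade in miniature.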
After all that, 38,539 runners survive as confidently matched across two or more courses. Everyone else stays single-course and simply adds no link. It's deliberately conservative, and that conservatism is what lets the ranking mean something.
Laid flat, those matches are a web: every course at its real location, joined to the others by the runners they share.
The marathon circuit · every race at its real place
faster · slower · size = shared runners
Hover a race for its ranking. Lines join courses run by the same people; the moving dots are those shared runners travelling between them — the thicker the traffic, the more the two races can be compared.
The honest caveats
It isn't a perfect instrument and shouldn't pretend to be:
- One year is a weather report, not a verdict. Rotterdam comes out slow here, which contradicts its flat reputation, but it's the only single-edition course in the set, so that day's wind is baked into its number. Multi-year courses (New York, Berlin) average the weather out; single editions don't yet.
- The UK regionals are noisier. Manchester, Edinburgh and Yorkshire expose only an age band, so they're matched on name, sex and running club, giving fewer clean links and wider error bars. They sit on the same scale as the majors, but trust their exact order less.
- We publish the analysis, not the rows. What's shown is the derived course effect; the underlying finisher data stays internal.
Surface, or the kind of race?
There's one limit the within-athlete trick can't fully scrub, and it's worth naming because it's the same one that broke our method on the 10,000m over on the track. Differencing within a runner removes who races a course. It doesn't remove what kind of race a course tends to be.
California International is the clearest case. Yes, it's net downhill, but it's also the course people travel to specifically to chase a Boston qualifier, arriving peaked and pacing it to the second. Some of its lead is tarmac, and some is the occasion. The same caution applies in reverse to a Berlin or a Valencia, where world-record fields and electric pacing pull everyone along. The method can tell you the same legs go faster there; it can't, on its own, fully separate "the road is quick" from "this is where people come to run quick." We lean on multi-year coverage and the robustness checks from the tracks piece to argue the surface is doing most of the work here, but treat the headline numbers as the road plus a little of the day it's usually run on.
So what would you run elsewhere?
The whole model collapses to a single number per course, which means the practical payoff is a small piece of arithmetic: your time on one course multiplied by the speed ratio to another. Put a time you know into the converter to see it translated.
The same maths now lives in the ontrack app, so you can carry the half-million finishers above as a one-line prediction in your pocket: what a result on your local marathon is really worth on Berlin, Boston or New York.
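For the curious, the conversion is short enough to write out. This sketch assumes each course's effect is stored as a log-time gap (as in the method notes), so the difference between two effects becomes a multiplier on your time; the function names and the 1% example gap are invented for illustration and are not real course values.

```python
import math


def parse_hms(text):
    """Turn 'H:MM:SS' into seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_hms(seconds):
    """Turn seconds into 'H:MM:SS', rounded to the nearest second."""
    hours, remainder = divmod(round(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def convert(seconds, effect_from, effect_to):
    """Translate a time from one course to another using log-time course effects."""
    return seconds * math.exp(effect_to - effect_from)


# Made-up effects: the target course is 1% slower than the source course.
effect_source = 0.0
effect_target = math.log(1.01)
translated = format_hms(convert(parse_hms("3:30:00"), effect_source, effect_target))
```

As the caveats say, the result is best read as the direction and rough size of the shift, not a promise.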
How it's made
Runners are resolved across result sets by name + sex, disambiguated by whatever each set exposes: a stable athlete ID where one exists, else country + birth year, else running club for the races that give nothing else. Namesakes are dropped where the data can't separate them (two people of the same name in one race, or birth years that don't reconcile). A within-athlete log-time difference per course gives the effect (for each runner, how their time here compares to their own average across the courses they ran), and confidence intervals come from the spread across that course's shared runners. We publish the derived course effects, never the underlying rows. The globe and the tools read a single small JSON of those effects, finisher counts and shared-runner links.
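One way to get an interval "from the spread" is the textbook mean plus or minus 1.96 standard errors. The text does not say which interval is actually used, so treat the normal approximation, the function name `effect_with_ci` and the sample gaps below as assumptions for illustration.

```python
import math
import statistics


def effect_with_ci(gaps, z=1.96):
    """Mean within-runner log-time gap for one course, with an approximate 95% interval.

    gaps: one log-time gap per shared runner on this course.
    Assumes a normal approximation, which is a stand-in, not the documented method.
    """
    n = len(gaps)
    if n < 2:
        raise ValueError("need at least two shared runners to measure spread")
    mean = statistics.fmean(gaps)
    # Standard error: sample standard deviation shrunk by the square root of n.
    sem = statistics.stdev(gaps) / math.sqrt(n)
    return mean, mean - z * sem, mean + z * sem


def as_percent(log_gap):
    """Convert a log-time gap to a percentage change in finish time."""
    return (math.exp(log_gap) - 1) * 100


# Made-up gaps for three shared runners.
mean, low, high = effect_with_ci([0.01, 0.02, 0.03])
```

Fewer shared runners means a larger standard error, which is why the courses matched on club alone carry wider error bars.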