SecretAgentMan

Predicting the Summer Olympics (sort of)

Hi all! For those who might not know me, my name is Jeff, but I go by SAM on some sports forums. A hobby of mine is game programming, and I have been tinkering with an Olympics sim for years. In the past, I have run Olympics games on websites, but my program (called Going for Gold, or GFG for short) is in a bit of disrepair right now and isn't ready for that. I do want to resurrect it, though, and step one is seeing whether my game can actually predict the results of the Olympics with reasonable accuracy. I intend to compare my game's results to some other prediction methods as well as to the actual results of the Olympics.

So I figured I would share that endeavor with you! I don't intend to get to all sports, but I am hoping to do what I can. My first predictions are for.....swimming. First, a data table, and then I will explain my process.
Summer Olympics Swimming Projections Test.png


On the left are the countries, and their in-game rankings and rating. USA is the top-ranked swimming nation, Australia is 2nd, China is barely 3rd over 4th place Italy, and so on. My rankings are based on a combination of factors, including recent (and less recent) Olympics qualifications and medals, world championships qualifications and medals, and yearly results in worldwide swimming competitions.

The first projection I am looking at is simple: the recent results projections. Basically, who has the top swim times in each event THIS year. I am assuming top time of the year will win gold, second best time will win silver, and third-best time this year will win bronze.
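For anyone who wants to see the mechanics, here is a minimal Python sketch of that "recent results" rule: sort this year's best times in each event and hand out gold, silver, and bronze to the top three. The function name, event names, country codes, and times are all made up for illustration; they are not the real data from the table.

```python
from collections import Counter

MEDALS = ("gold", "silver", "bronze")

def project_from_season_bests(season_bests):
    """Turn this year's best times into a projected medal table.

    season_bests maps an event name to a list of (country, time_in_seconds)
    entries, one entry per athlete. Lower time is better.
    """
    table = {}
    for event, entries in season_bests.items():
        # Fastest three times in the event are assumed to take the podium.
        podium = sorted(entries, key=lambda entry: entry[1])[:3]
        for (country, _time), medal in zip(podium, MEDALS):
            table.setdefault(country, Counter())[medal] += 1
    return table

# Hypothetical placeholder data, NOT real 2024 times.
season_bests = {
    "event 1": [("AAA", 220.5), ("BBB", 221.0), ("AAA", 221.8), ("CCC", 222.4)],
    "event 2": [("BBB", 52.1), ("CCC", 52.3), ("DDD", 52.9)],
}
table = project_from_season_bests(season_bests)
for country in sorted(table):
    print(country, dict(table[country]))
```

A country can show up more than once in the same event (two athletes), which is why the entries are a list rather than a dictionary keyed by country.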

The next projection I am looking at is historical results, though technically this is *recent* historical results. Basically, I averaged out the number of medals won by each country in swimming at 2020 Olympics, 2022 World Championships, and 2023 World Championships. I am combining three sets of results, so there are some rounding errors, which is why you see two separate *totals* listed. The rounded number of total medals and the actual fractional total.
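The two totals are easier to see with a small example. This is a rough Python sketch of averaging medal counts over three competitions and then comparing the sum of the rounded per-country numbers with the true fractional sum. The country codes and counts are invented placeholders, and the use of Python's built-in `round` is an assumption about how the rounding is done.

```python
def average_medals(competitions):
    """Average each country's medal count across several competitions.

    competitions is a list of dicts mapping country -> medals won.
    A country missing from a competition counts as zero medals there.
    """
    countries = set().union(*competitions)
    n = len(competitions)
    return {
        country: sum(comp.get(country, 0) for comp in competitions) / n
        for country in sorted(countries)
    }

# Hypothetical counts for three competitions, NOT real results.
competitions = [
    {"AAA": 10, "BBB": 1, "CCC": 2},
    {"AAA": 11, "BBB": 1, "CCC": 2},
    {"AAA": 12, "CCC": 1},
]
averages = average_medals(competitions)

# Note: Python's round() rounds exact halves to the nearest even number.
rounded_total = sum(round(value) for value in averages.values())
fractional_total = sum(averages.values())

print(averages)
print("rounded total:", rounded_total)
print("fractional total:", round(fractional_total, 2))
```

Here the per-country averages are 11, about 0.67, and about 1.67, so the rounded column adds up to 14 while the fractional total is about 13.33. That gap is the rounding error the two totals in the table are meant to expose.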

Then, I am checking how MY game (Going for Gold, or GFG) works at projecting the results. Projection A is simple. I looked at the rankings of the athletes by event in the game. Top ranked is expected to win gold, second ranked is expected to win silver, and third ranked is expected to win bronze. I ran three separate trials and averaged, so once again there are some rounding errors.

Projection B is what happened when I actually used my game engine to simulate the results (using the generated athletes from Projection A). Once again, I averaged three separate trials and there are rounding errors.

******
THOUGHTS AND ANALYSIS

I am looking forward to seeing how my game does in this endeavor. I can't say that I feel supremely confident. The projections for GFG look....decent, but not great. Projection A I suspect will be a much better result than Projection B, but we will find out.

Some other things to watch out for when you watch the Olympics:

-The USA, Australia, and China are likely going to be the top three countries in the swimming medal table. However, will it be close at the top or not? The projections based on real-world data seem to show a good spread, but my GFG simulations gave the USA a huge advantage and minimized China's likelihood of even finishing top three. (I am almost certain they will)

-Will Ireland get a breakthrough? You'll notice that Ireland, which has not won any recent medals in swimming, is projected to win 2, but only in the recent results. They have a young athlete (Daniel Wiffen) who has posted the top time in the world in two events this year, but who has never won a medal in anything before. It will be interesting to see if he does. Currently, my program can't really model "breakthrough athletes" such as this, which is why it may also be underestimating Hong Kong and Greece.

-Canada in 4th? My algorithms, which may be too heavily weighted towards the past 15-20 years, show Canada as 11th. Recent results show them more like 4th. (They had a breakthrough in 2020 and seem likely to continue that this year). Which will be more accurate in actual practice?

-Does France get any sort of home-field advantage? Neither of the real-world data projections like France's chance to do much. Will they benefit from hosting the games?

More updates as we get through the games. Thanks for reading!


 Okay, time for more projections! Previously, I did a projection for swimming. This time, it's gymnastics. First, the detailed data table, then an explanation of what you can watch for when watching the games!
Gymnastics Projections 2024.png


First off, a word about the projections. Recent results looks at who is likely to win a medal based on this year's top gymnastics scores. The historical precedent looks at the number of medals won at the 2020 Summer Games, 2022 World Championships, and 2023 World Championships and averages them out. GFG A involves using my program to run 3 simulations and finding the top 3 athletes in each event, then averaging the results. GFG B involves actually running 3 simulations of results with my program and averaging the medal results. (This is likely going to be the weakest projection in my opinion, but it's the one I am trying to make stronger.)

Some things I'm looking for with this data:
-Are my GFG rankings and ratings any good? Unlike the swimming rankings, my data for Gymnastics is showing many of the top countries being very close together in terms of overall talent level. I don't think that's actually true. Both the real world projections point to strong dominance from USA, Japan, and China in Gymnastics, with other countries usually just getting 1 or 2 medals. Also, my rankings show Germany as being high up in the Gymnastics world, and the real-world data doesn't match that at all.

-What's going on with Italy and Brazil? The two real-world projections differ greatly here. Brazil has had lots of success with recent medals, but this year's scores and data show a bleak picture for Brazilian success at the Olympics. Italy, meanwhile, is the exact opposite. This year's scores and data show a medal haul unlike what Italy has seen in Gymnastics in recent memory. Does it actually go that way in Paris?

-First time medalists in Gymnastics? Ahmad Abu Al-Soud of Jordan has won medals in Pommel Horse at each of the last two world championships. Mind you, the country of Jordan has never even qualified a gymnast for the Olympic Games, much less won a medal. It will be interesting to see if he can break through. Other reasonable possibilities to get their country's first gymnastics medal are Carlos Yulo of the Philippines and Kaylia Nemour of Algeria. Obviously, my GFG rankings do not indicate strength from these countries. I hope to be able to model "breakthrough" athletes like this someday.

-Obviously, Russia not being invited to the Olympics is going to matter here. Many of the countries that don't usually medal might stand a better chance because of it.

-Just how many countries will medal? I think that's probably one of the biggest differences between my game's model and the real-world models.

Next update will be a team sports projection which I hope to complete prior to the Opening Ceremonies. After that, I'm not sure what else will happen- maybe Athletics.


 Okay, time for my next projection. This time, it's all 7 of the major "team sports" all at once. This may or may not be the final projection of the games, as I currently have no database for Athletics, and I'm not sure if I can get it built in the next week before Athletics competitions start. We'll see. Regardless, you will notice that this data table is set up a bit differently than the others. Here it is!

Summer Olympics Team Sports Projections.png
First off, a word about the projections. World Rankings looks at who is likely to win a medal based on the current team rankings in that sport. (The International Handball Federation does not currently keep rankings for its member organizations) The Historical Precedent projection looks at the results of the previous Olympics and any world championships/world cups in that sport that have taken place since then and averages them. GFG A is basically a look at the top 3 teams according to the algorithm my program (Going for Gold) uses, which also takes into account past results.

I did not use the GFG B projection (which runs an actual sim using my game) because I know that my team sports are very broken right now and are not simming anywhere close to accurately.

Now, a few thoughts about the projections:
-A few of these are sort of obvious. For example, any projection that doesn't predict the USA to win both men's and women's basketball is clearly broken. Does that mean they will? Of course not. But every bit of data shows that that is what SHOULD happen. (US Women, if they win, will get their 8th consecutive gold medal)

-What order for Women's 3x3 basketball? All three projections like China, France, and the USA, but in a different order each time

-If you're looking for another "sure thing", you would be wise to consider backing Netherlands in the Women's Field Hockey Tournament.

-Conversely, there's already a chance that all three models on the men's soccer tournament will be wrong, given that Argentina already lost their first match to Morocco and is going to need to come from behind at this point. (Still doable, obviously)

-Speaking of soccer...my projections like the US Women's chances to win, but none of the major tournaments have gone their way in the last four years, and the FIFA ranking doesn't favor them either. I will be keeping an eye on that.

-Literally all three projections suggest New Zealand over Australia (with France for Bronze) in the women's Rugby 7s tournament. Is it actually going to go that way, or might France get a bump from playing at home?

-None of the volleyball projections agree. I'm not sure what to make of that. Worth keeping an eye on.

-I should finish by saying that I tend to think that very short team tournaments like we have in the Olympics can be a bit of a crapshoot, so take all of this data with a grain of salt. It's all likely to be a bit more chaotic than what you see here.

Again, possibly one more update in a week or so before the Athletics get going. Enjoy  the games!

I didn't fully complete my Athletics projections, but here's some data I was working on for that. Here's a look at the rankings, by country, in ONLY the sprints. (100, 200, 400, 100/110 High Hurdles, 400 hurdles, and the 4x100 and 4x400 relays. There are 15 total including the mixed 4x400 relay). The scale here is 100 being a near-perfect "max" score. Here's how the countries break down.
Sprint Rankings.png

Let's break them down further into a tier list of sorts. (Reminder, this is ONLY for the sprint races. Distance events and field events are their own deals)

Tier 1 (The Top Guns): These countries should have finalists in basically every sprint race, and if not, it's a sign that something has gone wrong.
-USA, Jamaica

Tier 2 (Solid Programs): These countries will be consistently represented in sprint finals and will probably see some occasional medals, especially in relays. Could also have a star here or there make some waves
-Great Britain, Italy, Brazil, Canada, Germany, Netherlands

Tier 3 (Host Countries): They've had a lot of qualifiers lately, which makes them look like Tier 2 countries. But it's probably just hosting and they are more likely Tier 4 countries. So they get their own Tier all to themselves.
-France, Japan

Tier 4 (Background Players):  You'll see their flag. You might even see a star athlete from their country win a medal. But they aren't going to be serious contenders across all of the sprint events, that's for sure.
-Nigeria, South Africa, Switzerland, Poland, Belgium, Australia, Bahamas, China, Trinidad and Tobago

Tier 5 (Spot the Star): This is probably a sign that the country has had a single star sprinter or two in the last 10 years. Who knows, they might even medal again here. Or you might not even hear from them in this meet. Who knows?
-Spain, Czech Republic, Botswana, Ukraine, Ivory Coast, Dominican Republic, Ireland, Colombia, Norway, Portugal, Cuba, Barbados

Tier 6 (Potential Breakthroughs):
-Everybody else. If you see a country medal in Sprints and they haven't yet been listed (it's already happened on Day 2 of the track meet), they are a breakthrough nation who has elevated themselves to Tier 5 for the next go-around.


 The final swimming event in the pool is done, so it's time to look back and see what could be learned. Here's what I was watching for beforehand:
Swimming Analysis.png
And here's what ACTUALLY happened:
Swimming Results Final.png
I included all four projections from before, as well as Sports Illustrated Writer Pat Forde's predictions.

Some thoughts to share:

-Clearly, the USA was best with Australia in 2nd. I would personally call China 3rd best because they had 12 medals, but it is true that they only had 2 golds. I had wondered aloud if Canada would be 4th. They have the 4th most gold medals (behind France) and 4th most total medals (slightly ahead of France), so it's pretty clear that the projections there were on track.

-France's home pool advantage was legitimately  a big one. Anyone who has watched the coverage in the US has probably heard the commentators talk about how it's one of the loudest competitions they've ever heard. Leon Marchand went nuts, and won beaucoup gold medals, but he wasn't the only medalist. So clearly, France benefitted from hosting.

-Ireland, Hong Kong, and Romania all won multiple medals. I suggested that my program was probably going to miss those types of results, and it definitely did. Recent results projections, though, showed that these countries were likely to have success, even though they haven't won medals historically. Clearly, any sort of legitimate projection needs to incorporate some of the "recent results" data.

So, what did I learn about my game?

Well, I learned that my GFG A projection (the one where I determine the "best" athletes in each event without running a sim) is actually pretty solid. There's a couple of major flaws that can be seen (Italy and Japan are way too high, and Canada is way too low), but that's clearly related to the algorithm for how I determine the country rankings.  I probably have weighted long-term historical results too highly and I may need to adjust how much I look at short-term recent results. My GFG B projection (where I actually run the sim of the event), which I already suggested was not going to be great, was indeed not great. That will require some work, but I already knew that. My program, in other words, is good at making the athletes in a way that reflects reality (except for breakthrough athletes), but is not so great at simulating the actual results yet.
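One way to experiment with that weighting is sketched below in Python. To be clear, this is not how GFG works internally; it is just a hypothetical illustration of blending a long-term projection with a recent-results projection using a single weight, then scoring each version against the actual medal counts with mean absolute error per country. The function names, the weight, the error metric, and all of the numbers are assumptions made up for the example.

```python
def blend(long_term, recent, recent_weight):
    """Weighted mix of two projected medal tables (country -> medals)."""
    countries = set(long_term) | set(recent)
    return {
        country: (1 - recent_weight) * long_term.get(country, 0.0)
        + recent_weight * recent.get(country, 0.0)
        for country in countries
    }

def mean_abs_error(projected, actual):
    """Average per-country gap between projected and actual medal counts."""
    countries = set(projected) | set(actual)
    if not countries:
        return 0.0
    total_gap = sum(
        abs(projected.get(country, 0.0) - actual.get(country, 0.0))
        for country in countries
    )
    return total_gap / len(countries)

# Hypothetical placeholder numbers, NOT the real projections or results.
long_term = {"AAA": 10, "BBB": 6, "CCC": 0}
recent = {"AAA": 8, "BBB": 2, "CCC": 2}
actual = {"AAA": 9, "BBB": 3, "CCC": 2}

for weight in (0.0, 0.5, 1.0):
    mixed = blend(long_term, recent, weight)
    print(weight, round(mean_abs_error(mixed, actual), 3))
```

A weight of 0.0 is the pure long-term projection and 1.0 is pure recent results, so sweeping the weight over past competitions would be one way to see how much the short-term data deserves.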

Speaking of short-term results, you definitely cannot ignore those. In 31 of the 35 events, the athlete/team with the fastest time THIS YEAR won a medal, and 20 of those athletes/teams won gold medals. (I will show the data table for this in another post)

Anyways, still plenty more sports to go, but I figured I would share what I was learning as we go along here!


Here's a little bit more swimming data. Let me explain what you're looking at.

For each of the individual events, I took down the Top 5 times from this year. For the team events, I generally took down the Top 4.

If an athlete in the Top 5 won a medal, I highlighted them. Green highlight means that they matched their projection, and yellow means that they won a medal, but not the one they were projected for.


If you see a red highlight next to an event, it means that an athlete outside of the Top 5 times won a medal in that event. (Occasionally, this just means the athlete hasn't done that event often. Leon Marchand messed up these projections multiple times for that exact reason.)  Two red marks means two athletes, and three marks (it happened once) means that all three medalists came from outside the Top 5 times.
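The highlighting rules can be written out as a small Python sketch. The function and athlete names are hypothetical, and one detail is an assumption on my part: an athlete ranked 4th or 5th on season-best times who wins a medal is treated as yellow, since they were not projected for that medal.

```python
def classify_event(projected_top5, actual_podium):
    """Apply the green / yellow / red highlighting rules to one event.

    projected_top5: athlete names ordered by this year's best time.
    actual_podium: [gold, silver, bronze] medalists in order.
    Returns (highlights, red_marks): highlights maps each medalist who was
    in the Top 5 to "green" or "yellow"; red_marks counts medalists who
    came from outside the Top 5.
    """
    # Positions 0, 1, 2 in the Top 5 are the projected gold, silver, bronze.
    projected_medal = {name: pos for pos, name in enumerate(projected_top5[:3])}
    highlights = {}
    red_marks = 0
    for position, name in enumerate(actual_podium):
        if name not in projected_top5:
            red_marks += 1  # medalist from outside the Top 5 times
        elif projected_medal.get(name) == position:
            highlights[name] = "green"  # won exactly the projected medal
        else:
            highlights[name] = "yellow"  # medaled, but a different medal
    return highlights, red_marks

# Hypothetical athletes, NOT real results.
top5 = ["A", "B", "C", "D", "E"]
podium = ["A", "D", "X"]
highlights, red_marks = classify_event(top5, podium)
print(highlights, red_marks)
```

In this made-up event, A matches the gold projection (green), D medals from the 4th-best time (yellow), and X comes from outside the Top 5, so the event gets one red mark.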

Swimming Event by Event Results.png


All of the artistic gymnastics events are done, so I would now like to analyze the results there. Here is what I was looking for before the events started:
Gymnastics Analysis.png
And here is what actually happened:

Gymnastics Final Results.png


Some thoughts about what happened:
-My ratings for gymnastics need some work. USA, China, and Japan were definitely the three best countries at these games as expected, but some of the countries that my ratings claim are "good" are not really close to medals. Germany is a particularly egregious example, but France, Canada, and Spain are also notable.

-I think that "recent results" (the best scores from this year) are actually a better measure for gymnastics than the "historical precedent".  This is especially true for breakthrough athletes. For example, Carlos Yulo of the Philippines won two golds. That wasn't  hard to predict by looking at the top scores from this season. Ditto the gold medal for Algeria's Kaylia Nemour. There were a couple of surprises (Ireland and Colombia), but generally  the scores for this year were right on the money.

-Except for Brazil. Let's talk about Brazil. I wasn't sure what to make of them beforehand, because the data didn't show anything of note. Clearly, this is an outlier, and here's why: Rebeca Andrade must not have competed much this year. She won 3 of Brazil's four medals individually, and was clearly a big reason why they won their first ever team medal. (China and France also had terrible team events, which certainly helped). I think that's why it looked like they weren't going to do well, but if you watched her, you saw that she was excellent throughout the competition.

What did I learn about my game?
-I think I have to look at ways to  incorporate more recent results into my game. It clearly matters in Gymnastics.

-I also have to figure out why so many mediocre countries were rated so highly. I suspect it's an over-reliance on qualification numbers. Something clearly needs to be adjusted there, that's for sure.

-Finally, and this is important: I need to figure out how to model the differences between all-around gymnasts and specialists. The countries that are trying to win gold medals in the team and individual all-around events get to bring 5 gymnasts to the games, but they generally bring several all-around athletes and maybe only 1 or 2 specialists (think Stephen Nedoroscik on the US team). However, a country that only sends 1 or 2 gymnasts is often sending specialists who can afford to focus on that one event. On the men's side, these specialists often performed quite well in the individual finals. On the women's side, the all-around athletes were often so good that they also won individual event medals, too. I have to figure out if maybe these things need to be de-coupled in some way so as to more effectively model the state of modern gymnastics.


I'm going to post a couple of more final analyses today and tomorrow. If you would like to receive e-mail updates if/when I ever get to the point of doing beta testing on my game OR if you would like to know if/when a release occurs, please feel free to add your e-mail address to the document linked below.

https://docs.google.com/forms/d/e/1FAIpQLSfbCUqoLWr4u59JGoJTy-TwQAKSVPk2VG3G0PvpSk0n2UVy1g/viewform?usp=sf_link


 Let's look at what happened in athletics. Here is a breakdown of the results:
Athletics Results 1.png

Athletics Results 2.png


I feel relatively comfortable with both projections in terms of how well they modeled the actual results. Except for a few "outlier" countries (we'll get there), I feel like the projections were pretty close to being in line with what actually happened. I am thinking that a combination of the two projections would have produced even better results.

Now, for the outliers. Ethiopia was the first obvious one. Clearly they didn't perform as well as either projection suggested they would. I suspect that's the nature of the distance events that they tend to specialize in. I feel like the sprints and field events followed the projections more closely than the distance events.

USA and Great Britain had good meets relative to either projection you might want to use.

France also struggled. They seemed to have a home country advantage in a lot of situations, but perhaps not in Track & Field.

Ireland and Cuba were the countries who were most likely to medal who didn't actually receive medals. 

I'm going to run an additional analysis of ONLY sprints, and I will be curious to see if that's got a higher degree of accuracy.


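A closer look at ONLY the sprints (100, 200, 400, hurdles, and relays). There are 15 total sprint events, with an absolute max of 35 possible medals for any one country (up to 3 in each of the 10 individual events and 1 in each of the 5 relays). I broke the countries into six tiers. Here are the tiers: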

Tier 1 (The Top Guns): These countries should have finalists in basically every sprint race, and if not, it's a sign that something has gone wrong.
-USA, Jamaica

Tier 2 (Solid Programs): These countries will be consistently represented in sprint finals and will probably see some occasional medals, especially in relays. Could also have a star here or there make some waves
-Great Britain, Italy, Brazil, Canada, Germany, Netherlands

Tier 3 (Host Countries): They've had a lot of qualifiers lately, which makes them look like Tier 2 countries. But it's probably just hosting and they are more likely Tier 4 countries. So they get their own Tier all to themselves.
-France, Japan

Tier 4 (Background Players): You'll see their flag. You might even see a star athlete from their country win a medal. But they aren't going to be serious contenders across all of the sprint events, that's for sure.
-Nigeria, South Africa, Switzerland, Poland, Belgium, Australia, Bahamas, China, Trinidad and Tobago

Tier 5 (Spot the Star): This is probably a sign that the country has had a single star sprinter or two in the last 10 years. Who knows, they might even medal again here. Or you might not even hear from them in this meet. Who knows?
-Spain, Czech Republic, Botswana, Ukraine, Ivory Coast, Dominican Republic, Ireland, Colombia, Norway, Portugal, Cuba, Barbados

Tier 6 (Potential Breakthroughs):
-Everybody else. If you see a country medal in Sprints and they haven't yet been listed (it's already happened on Day 2 of the track meet), they are a breakthrough nation who has elevated themselves to Tier 5 for the next go-around.

And here are the results:

Sprint Results.png

What to think of this?

Well, Jamaica underperformed, while the US overperformed. I would argue that Britain performed quite well also, though perhaps they ought to be in a Tier above Tier 2 (Tier 1b, you could say) given how many medals they got.

Saint Lucia and Zambia had breakthroughs, and Puerto Rico continued a breakthrough that started with a gold medal four years ago.

Botswana will probably deserve to bump up a tier after winning medals not just in the 200m, but in the 4x400 as well.

I think the tiers were helpful, but perhaps Jamaica was over-rated to begin with (though there were definitely some signs in the data that they might not be up to their usual standard).


 One more analysis to do: here's an overview of team sports. First off, here's what I was saying beforehand:
Team Sports Projections.png
And, here are the final medal results for each of the team sports:
Team Sports Results.png

Trying to sort out what I learned here is tricky. Some things that I will note are as follows:
-The USA basketball wins were expected. France's silvers were not, at least according to the projections. I imagine that the home-country advantage mattered quite a bit in that regard.

-3x3 basketball looks like it was impossible to predict. I think it might be too volatile to make a lot of assumptions, to be honest. No projection had more than 1 of the eventual medalists in its top 3, no matter which projection you look at.

-I said that the Dutch were a sure thing in the Women's Field Hockey Tournament. Check! And the world rankings projection also suggested the Dutch men were positioned to win, which they did.

-I learned very little from the football projections. France and Spain medaling was a point in favor of the world rankings / recent history projections. On the other hand, the US women's gold is not shocking looking LONG-RANGE historically, but the short-term projections didn't like them.

-On the other hand, the handball projections both felt good. Fairly accurate, other than the French men getting upset in the quarterfinals (which is a general problem with "bracket-style" sports...one bad game at the wrong time can mean no medal).

-Rugby Sevens....look, I dunno. France's men's team rode the home crowd and a star player who transferred from Rugby Union to Rugby Sevens just this year to Gold, while the women's tournament, which all three projections agreed upon, got completely upended by a pair of North American squads (who, to be fair, are ranked 4th and 5th in the world based on the quick research I did). Just goes to show how volatile short-term tournaments are.

-Volleyball was sort of split. I felt like the projections for the men's tournament leaned more heavily on recent history, while the women's tournament relied more heavily on world rankings. (Italy apparently had never won a gold before and absolutely dominated the women's tournament)

-Water Polo again was sort of a split decision, with the GFG projection actually looking great for the men's tournament and the world rankings projection looking best for the women's tournament. So I'm not sure what there was to learn.

So, I don't think the team sports taught me too much about my program, other than that volatility is a part of the game with these kinds of tournaments. It does make you appreciate when the "favorites" do manage to come through.


 Alright, last little bit of fun, and then I'm done with this exercise for now. (You will see one of these threads again for the 2026 Winter Games, I assure you)

Which country had the best result in "team sports"? I lumped all 8 of the major team sports into one medal table, and it looks like this:

Team Sports Medal Table.png

Clearly, the "best" country at team sports for the Paris 2024 games is either USA or France. I would actually argue in favor of France, because I would argue that making the final in 7 team events (even though they only won 2 of them) outweighs the US's 3 golds and 8 team medals because 4 of those medals were bronze. But I would love to hear the interpretations of others!

Anyways, thanks for reading along!
