Artificial intelligence has become one of the most significant talking points of the moment, and not just in Silicon Valley. The technology’s promise, and its threat, is on the agenda on Wall Street and in the halls of power in Washington, DC. Proponents of AI believe it will touch every facet of human life, whereas critics think it’s somewhat overblown. The truth is probably somewhere in between. Still, it’s hard to argue that AI won’t grow in influence, including in areas like sports predictions.
In this piece, we want to comment on two issues that will shape the future of sports modeling: the rush for ‘good’ data and the concept of competing models. We know that AI is going to be useful for everything from making fantasy football picks to spotting value in the NBA odds, but there are hurdles to overcome, and those hurdles are our focus.
Huge data deals signed by AI companies
The first hurdle is access to data. If you follow tech news, you’ll have noticed that there was a bit of a scramble across 2024 for AI companies like OpenAI and Google to strike deals with publishing companies for access to their data. OpenAI, for example, signed an exclusive agreement with Condé Nast, which publishes The New Yorker, Vogue, Wired, GQ, and other important ‘cultural’ publications.
OpenAI’s deal with Condé Nast has little to do with sports, but it does help us illustrate our point. OpenAI will have access to that vast catalog of reporting going back decades from those magazines, whereas other AI models will not, at least not legally. In short, OpenAI’s ChatGPT will know some things other chatbots do not. On the other hand, Google, Anthropic, Meta, and other companies with AI bots have signed exclusive deals of their own with publications, or have access to their own data pools (as Meta does with Facebook), that OpenAI does not have. Their models will ‘know’ things that others do not.
The point for sports predictions is that the internet is not some vast open receptacle with data ready to be plucked at whim. Moreover, the companies that hold the data, namely sports analytics companies, are now acutely aware that it is hugely valuable for AI training. Many have been taking steps to actively block AI crawlers from accessing their websites. Even if web crawlers somehow bypass the blocks, the website owners can ‘poison’ the data to ensure it’s unusable.
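To make the blocking step concrete, here is a minimal sketch of how a site might turn away one AI crawler through its robots.txt file, and how a well-behaved crawler would check those rules. The rules and the example.com URL are hypothetical, not taken from any real sports site. GPTBot is the user agent OpenAI publishes for its crawler; "SomeOtherBot" is a made-up name. Note that robots.txt is only a request, which is why sites concerned about crawlers that ignore it look at other measures.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a sports-data site (illustrative rules only).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/stats/2024"  # placeholder URL
    for agent in ("GPTBot", "SomeOtherBot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, url))
```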
Sports analytics companies will want paid-for data access
You might say that many of the data sites used for sports modeling are publicly available on the open web. That is true, but only up to a point. Anyone can look at Pro Football Reference and use its historical results and statistics to develop an algorithm, but the whole point of AI is to go the extra mile. If the model does not have access to vast amounts of ‘good’ data, then its ability to predict is only negligibly better than that of a basic algorithm. If, for example, you fed an AI model the season’s college football results without any additional data, any human with good football knowledge could make more informed predictions.
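For a sense of what a ‘basic algorithm’ built on results alone can look like, here is a small sketch of an Elo-style rating system. This is our own choice of illustration, not a method any particular site uses. The team names, the K-factor of 20, and the starting rating of 1500 are all assumptions made for the example.

```python
from collections import defaultdict


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo estimate of the probability that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def fit_elo(results, k: float = 20.0, base: float = 1500.0) -> dict:
    """Build ratings from (winner, loser) pairs given in chronological order."""
    ratings = defaultdict(lambda: base)
    for winner, loser in results:
        exp_win = expected_score(ratings[winner], ratings[loser])
        delta = k * (1.0 - exp_win)  # bigger gain for an unexpected win
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


if __name__ == "__main__":
    # Hypothetical results between made-up teams.
    season = [("Team A", "Team B"), ("Team B", "Team C"), ("Team A", "Team C")]
    ratings = fit_elo(season)
    print(ratings)
    print("P(Team A beats Team C):",
          round(expected_score(ratings["Team A"], ratings["Team C"]), 3))
```

A model like this sees only who won and who lost. It knows nothing about injuries, weather, or matchups, which is exactly the kind of additional data the argument above is about.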
None of this is to say that AI won’t play a prominent role in sports modeling. We started this piece by saying that it would. The argument is simply that AI bots are not omniscient tools with access to all knowledge. Moreover, we are moving into a future where some AI models will have access to data that others do not, including sports data. It’s something to be wary of.
Ultimately, we predict there will be bespoke AI models for sports predictions, either something akin to what you would find in OpenAI’s GPT Store or simply a release from an AI company that has built its model to focus on analyzing sports. Yet, it is always worth remembering that AI does not think, nor can it truly reason. It is only as good as the data fed into it, and that data has become very valuable.