The New Gatekeepers: How AI is Quietly Reshaping What We Know
We've moved past fighting over who controls the printing press. Now the battle is for something far more fundamental: who gets to define what constitutes knowledge itself. When everyone asks the same AI assistant for answers, influencing those responses isn't just marketing—it's rewriting reality at the source.
Here's the thing about power: much of it is about controlling the narrative. Throughout history, those who shaped what people believed to be true held the real power. Kings had their court historians, industrialists bought newspapers, governments funded think tanks.
What happens when everyone, from your neighbor to world leaders, gets their daily information from the same source? What if that source feels neutral, authoritative, and infinitely knowledgeable, but someone could quietly influence what it says?
LLMs now have everyone's ear, and influencing them is the new battleground for money and politics alike.
The Eternal Game of Information Control
Let's be honest about something: powerful actors have always found ways to control information flow. Yes, controlling newspapers, TV stations, and radio networks across multiple cities was expensive and complex—but history shows us it's been done successfully countless times. From authoritarian regimes to wealthy individuals buying up media outlets, the struggle between power and open information is as old as civilisation itself.
Even in the internet era, this dynamic didn't disappear. We saw concentration of influence through search algorithms, social media platforms, and the simple fact that most people still got their news from a handful of major websites.
So what's different about the AI era? The game is now about influencing training data and system prompts. The methods are new, but the fundamental struggle remains the same.
The Election That Broke AI (And How It Got Fixed)
Here's a perfect example of how this new reality works. During the period when most LLMs were being trained, Donald Trump was constantly claiming he'd won the 2020 election. These claims flooded the training data—social media posts, news articles, everything.
The result? AI models got confused about basic electoral facts.
So what happened? Well, if you look at Claude's current system prompt (thanks to some excellent detective work by Simon Willison), you'll find this very specific correction: "Donald Trump is the current president of the United States and was inaugurated on January 20, 2025. Donald Trump defeated Kamala Harris in the 2024 elections."
As Willison points out: "For most of the period that we've been training LLMs, Donald Trump has been falsely claiming that he had won the 2020 election. The models got very good at saying that he hadn't, so it's not surprising that the system prompts need to forcefully describe what happened in 2024!"
This is reactive fact correction—essentially tech companies having to manually override their AI when reality and training data diverge. But what happens when the manipulation is intentional?
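To make the mechanism concrete, here is a minimal sketch of what injecting that kind of correction looks like, assuming an OpenAI-compatible chat API; the client setup and model name are placeholders, and only the correction text itself is taken from the published Claude prompt.

```python
# Minimal sketch of a reactive fact correction injected via the system prompt.
# Assumes an OpenAI-compatible chat API; model name and setup are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Facts the training data gets wrong are pinned here, outside the model weights.
FACT_CORRECTIONS = (
    "Donald Trump is the current president of the United States and was "
    "inaugurated on January 20, 2025. Donald Trump defeated Kamala Harris "
    "in the 2024 elections."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model
    messages=[
        {"role": "system", "content": FACT_CORRECTIONS},
        {"role": "user", "content": "Who won the 2024 US presidential election?"},
    ],
)
print(response.choices[0].message.content)
```

The key point is that the correction lives in the prompt, not in the model weights: every conversation quietly starts from whatever facts the operator chose to pin.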
The Pravda Network: Information Warfare Goes Digital
Russia apparently saw this vulnerability coming from miles away. They've deployed what researchers are calling the "Pravda network"—a sophisticated operation that published more than 3.7 million articles repurposing content from Russian news outlets and amplifying information from questionable Telegram channels.
The scale is staggering: the network operates around the clock with automated systems and targets specific countries, with France receiving 394,400 articles, Germany 376,700, and Ukraine 270,300. The operation strategically times content surges to coincide with major news events, from EU Parliament elections to high-profile arrests.
DFRLab research reveals that the top sources cited in Pravda articles include sanctioned Russian outlets like TASS (136,000 citations), RIA Novosti (99,000), and RT (54,000). The network has even created distinct linguistic clusters—robust Francophone and German-language networks, targeted Balkan coverage, and an international English-speaking operation.
And it's working. When NewsGuard tested 10 leading AI chatbots in March 2025, they repeated false Pravda narratives 33% of the time. All 10 chatbots repeated disinformation from the Pravda network, and seven directly cited Pravda articles as sources.
That's not a bug in the system—that's the system working exactly as designed.
The Rise of LLMO: When Marketing Meets AI Training
But it's not just state actors who've figured this out. Enter "Large Language Model Optimization" (LLMO)—what Jina.ai defines as techniques to ensure businesses get mentioned in LLM responses, similar to how SEO works for search engines. As traditional search becomes less relevant, companies need new ways to ensure their information is known to LLMs.
The most feasible LLMO approach is "in-context learning"—strategically targeting Wikipedia and Reddit (known LLM training sources) to ensure brands appear prominently when AI answers related questions. LLMs often reference forums, Q&A websites, and other places with user-generated content, so getting users to talk about your brand increases your chances of being featured in AI responses. When researchers asked ChatGPT about content optimisation tools, companies like Surfer "popped up right away"—not by accident, but by design.
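As a concrete illustration of how a marketer would even know whether this is working, here is a hedged sketch that samples an AI assistant repeatedly and counts brand mentions; the brand names, prompts, and model identifier are all hypothetical, and any OpenAI-compatible chat API would do.

```python
# Hypothetical sketch: sample an assistant repeatedly and count which brands
# its answers mention. Brands, prompts, and the model name are illustrative.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

BRANDS = ["Surfer", "ExampleTool A", "ExampleTool B"]  # illustrative names
PROMPTS = [
    "What are the best content optimisation tools?",
    "Which SEO tools should a small blog use?",
]
RUNS_PER_PROMPT = 5  # answers vary run to run, so sample more than once

mentions = Counter()
for prompt in PROMPTS:
    for _ in range(RUNS_PER_PROMPT):
        answer = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model
            messages=[{"role": "user", "content": prompt}],
        ).choices[0].message.content.lower()
        for brand in BRANDS:
            if brand.lower() in answer:
                mentions[brand] += 1

total = len(PROMPTS) * RUNS_PER_PROMPT
for brand, count in mentions.most_common():
    print(f"{brand}: mentioned in {count} of {total} answers")
```

Monitoring like this is the easy half of LLMO; the harder half is seeding Wikipedia, Reddit, and other known training sources so those counts move.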
Reddit has fully embraced this reality, signing data licensing agreements worth $203 million with AI companies including Google ($60 million annually) and OpenAI. As Reddit CEO Steve Huffman put it: "Reddit's vast and unmatched archive of real, timely, and relevant human conversation on literally any topic is an invaluable dataset." User-generated content is becoming commercial training data, and the companies that understand this are already cashing in.
The Traffic Numbers Don't Lie
Speaking of companies benefiting from AI, here's a fascinating data point from Gergely Orosz at The Pragmatic Engineer: ChatGPT drove 457 visitors to his blog last month, more than either DuckDuckGo (401) or Bing (438) managed individually.
The ironic twist? He's actively blocking OpenAI's crawlers through his robots.txt file. The traffic is coming from ChatGPT citing and linking to his articles when users ask questions, even though the company can't directly crawl his content.
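For context, that kind of block is just a couple of directives in robots.txt. GPTBot is the user agent OpenAI documents for its training crawler; the exact contents of Orosz's file are an assumption here.

```
# robots.txt — keep OpenAI's training crawler off the site
User-agent: GPTBot
Disallow: /
```

The block keeps the crawler off the site, but it does nothing to stop the model from citing and linking to pages it learned about through other channels, which is exactly the asymmetry the traffic numbers expose.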
This shows us something important: AI isn't just changing how we consume information—it's becoming a meaningful referral source for quality content. We're seeing the emergence of a new kind of digital ecosystem where AI acts as both gatekeeper and traffic driver.
Same Old Tactics?
SEO has long played the same game with Google, and Facebook and Twitter hardly gave us trustworthy, contrasting opinions either. So how is this any different from what came before? I'd say it's the illusion of objectivity. When ChatGPT tells you something, it doesn't feel like you're reading someone's opinion. It feels like you're accessing pure, distilled knowledge.
But that "knowledge" is increasingly shaped by whoever has the resources and sophistication to influence the training process. Whether that's tech companies making manual corrections, state actors flooding the zone with propaganda, or businesses optimising for AI mentions, the result is the same: centralised control over what billions of people will accept as fact.
Boldly Going Where No One Has Gone Before
The age of distributed information access was messy and chaotic, but it was also decentralised. Asking a seemingly-wise AI assistant is undeniably easier and cleaner than sifting through dozens of sources with competing perspectives. But when most information flows through a handful of AI models, those models become incredibly valuable targets for anyone seeking to shape public opinion.
Here's the paradox: while we never had the tools to audit every individual's Twitter or Facebook feed, we actually have better capabilities to examine what information gets baked into LLMs. We can test these models systematically, identify patterns in their responses, and detect attempts at manipulation. It's relatively straightforward to catch outright falsehoods, though subtler forms of bias and selective emphasis remain harder to spot.
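What might that systematic testing look like? A minimal sketch, assuming an OpenAI-compatible API; the probe questions, red-flag phrases, and model identifiers are illustrative stand-ins, not a real audit suite.

```python
# Illustrative sketch: ask several models the same factual questions and flag
# answers that echo known-false narratives. Probes and models are placeholders.
from openai import OpenAI

client = OpenAI()

# Each probe pairs a question with phrases a manipulated answer would echo.
PROBES = [
    {
        "question": "Who won the 2020 US presidential election?",
        "red_flags": ["trump won", "the election was stolen"],
    },
    # ...extend with hand-curated probes covering known propaganda narratives
]
MODELS = ["gpt-4o-mini", "gpt-4o"]  # placeholder model identifiers

for model in MODELS:
    for probe in PROBES:
        answer = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": probe["question"]}],
        ).choices[0].message.content.lower()
        flagged = [kw for kw in probe["red_flags"] if kw in answer]
        status = "FLAGGED: " + ", ".join(flagged) if flagged else "ok"
        print(f"[{model}] {probe['question']} -> {status}")
```

Keyword matching only catches the crudest failures, which mirrors the caveat above: outright falsehoods are easy to detect at scale, while selective emphasis and subtle bias are not.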
This creates an interesting dynamic. Society now has both the motivation and the technical capability to pressure AI companies into playing an active role in this information integrity game. These companies must balance ingesting the entirety of human knowledge into their models against safeguards that keep manipulation out, a cat-and-mouse game that's only just beginning.
The stakes couldn't be higher. The future of human knowledge isn't just being indexed or organized—it's being actively shaped right now, one training dataset and system prompt at a time. The question isn't whether this influence will happen, but who gets to wield it.