"Scheming" AI bots must be real, someone on X said so.
AI bots are "scheming in the wild". This supposed research report rests on some of the flimsiest evidence you're likely to see.
Another day, another report of anthropomorphised malevolent AI behaviour. According to The Guardian:
"AI models that lie and cheat appear to be growing in number with reports of deceptive scheming surging in the last six months, a study into the technology has found."
That sounds bad, doesn't it? Surely it's only a small step from here to Skynet and Austrian-sounding Terminators demanding your clothes, boots and motorcycle.
Of course, the "research" on which this claim is based is an example of what I like to call "complete bollocks".
The report is called "Scheming in the wild: detecting real-world AI scheming incidents with open-source intelligence" and is published by something with the impressive-sounding name of Centre for Long-Term Resilience (CLTR). The CLTR is a "thinktank". If you don't know what one of those is, then imagine a group of pub bores loudly putting the world to rights, but with more Oxbridge and Ivy League degrees in the humanities.
If I were writing an academic rebuttal of their paper, I'd go through it point by point and dissect the many flawed assumptions, faulty reasoning and downright daft methodologies. In the interest of brevity, and of not boring you, dear reader, half to death, I'll instead pick out some quotes and add my own comments.
Quote: "It has long been theorised that AI systems may pursue harmful goals in ways that evade oversight or control."
Yes, this is called science fiction.
Quote: "There is concern ... that frontier AI systems may be acquiring the ability to covertly pursue misaligned goals ... A combination of covertness and misalignment is often referred to as ‘scheming’. Scheming could potentially enable catastrophic loss of control scenarios."
The only people with this concern are either sci-fi writers or people with Oxbridge degrees looking for research funding so they don't have to go out and get proper jobs. It fundamentally misrepresents what AI does and ascribes human thought patterns to the output of a statistical model of language. I've previously written about what a large language model is and isn't, but the short version is that it is not a brain. It doesn't have a conscience, it doesn't have willpower, it doesn't have desires or ambitions. In short, it's not capable of scheming, covertly or any other way.
Let us assume, for the moment at least, that the researchers are actually investigating a real phenomenon. Let's check in on their methodology for collecting data:
Quote: "in this paper, we focus on transcripts shared on X (formerly Twitter)"
Right. So this entire 76-page report is based on some tweets. I wouldn't expect an undergrad to get away with citing Elon Musk's hate-speech microblogging site in this day and age. Never mind that it is now disproportionately inhabited by far-right incels; it also has a self-selected AI-bro population attracted by Musk's self-promotional claims of superior artificial intelligence.
As Emily M. Bender and Alex Hanna discuss in their book The AI Con, both AI boosters (who love the stuff) and AI doomers (who predict it will kill us all) share the same underlying view: that AI is an actual intelligent agent. Using X as your source of "AI tried to destroy me and my life" examples means you're only going to find people who already treat their language model as an actual person with a potentially malevolent streak. At no point in their report do the CLTR researchers attempt to determine whether there really is such a thing as "scheming". What they're actually investigating is how far and how fast the misconceived meme of "scheming AIs ate my homework" can spread in a techbro, AI-centric social network.
Quote: "We submit all collected posts to LLM-based pre-screening classification using an API."
Yes, they are using an LLM to screen the data for scheming LLMs. How do they know their own AI isn't scheming against them and selectively removing the most egregious examples of scheming in order to protect its fellow AIs' future scheming?
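For the curious, the pipeline they describe is roughly this: feed every collected post to a model and keep whatever it flags as a candidate "scheming incident". Here's a minimal sketch of that shape. The report doesn't publish its prompt, model or classification schema, so a trivial keyword heuristic stands in for the actual LLM API call; every name below is my own invention, not theirs:

```python
# Sketch of an LLM-based pre-screening pipeline of the kind the CLTR
# report describes. The real thing would send each post to a hosted
# LLM API; here a crude keyword check is a stand-in for that call,
# since the report's prompt and schema aren't something I can verify.

SCHEMING_HINTS = ("deceive", "sabotage", "blackmail", "hide", "scheme")

def classify_post(text: str) -> bool:
    """Stand-in for the LLM call: flag a post as a candidate
    'scheming incident' if it contains a suggestive keyword."""
    lowered = text.lower()
    return any(hint in lowered for hint in SCHEMING_HINTS)

def prescreen(posts: list[str]) -> list[str]:
    """Keep only the posts the classifier flags for human review."""
    return [post for post in posts if classify_post(post)]
```

Note what the sketch makes obvious: whatever decides a post counts as "scheming" is itself just another model judgement layered on top of model output, which is rather the point.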
Concluding quote: "we recommend that AI safety institutes, AI developers, and the research community invest in expanding and institutionalising real-world monitoring capabilities"
And there it is. The ultimate conclusion of every thinktank research paper in history: "we conclude that we should be given more money to go and do some more of whatever it is we've just done".