Did Apple just pop the ‘AI bubble’?
WaPo partners w/Substack, YouTube changes its policies, and the manosphere wants Trump & Musk to make up.

This week, Apple researchers published a new paper, “The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity,” that (possibly!) exposes the fundamental limitations of large language models and undermines the claim that ever more compute is the only binding constraint on the path to ‘AGI.’ Welcome to the resistance, Apple!
The paper tests LLMs and large reasoning models (Untangled Deep Dive) and finds that as problems become increasingly complex, the models collapse. They don’t just become ‘less accurate’ or ‘hallucinate more’; as the authors explain, they “experience complete performance collapse.” Their accuracy goes to 0! That’s bad news for all the companies selling the story that LLMs will one day achieve ‘AGI.’
But that’s not even the most damning finding: the paper also found that the models “overthink” on simple problems and decrease their “reasoning effort” past a certain point of complexity, even when they have more compute to use. This is more problematic than the TOTAL COLLAPSE finding because it undercuts the argument every company is making right now: we just need more data and more compute power.
The paper has caused quite the stir in the AI research world. Many are pushing back, arguing that Apple is just FUD’ing LLMs because it’s playing catch-up. One of the more thoughtful critiques of the paper, however, argued that the decrease in ‘reasoning effort’ could be the result of choices made during post-training, and might not be fundamental to LLMs. Others critiqued the methodology, arguing that the researchers shouldn’t have used ‘reasoning traces’ as a proxy for whether the model is ‘thinking.’
The virality of the back-and-forth reveals what’s at stake. The scaling law is what allows the bubble to get bigger and bigger. It’s what allows these companies to sustain their investments in data scraping, data centers, and so on. It’s what allows them to sustain their valuations and make an argument about future growth. If that future isn’t possible? Pop!
‘Artificial Intelligence’
Meta’s investment in Scale AI is all about accessing more data. (More)
Automating a task is vastly less complex than replacing a person or a team. Those predicting a speedy AI transformation of businesses are massively underestimating this gap. (More)
Knowing what AI is and what it’s not — between what it can do and what it can’t — might be the most important gap to close. Not because of how powerful the machines will become, but because of how much of our agency we’re seemingly happy to hand over. (More)
New research shows the true limitations of alignment research: it turns out it isn’t up to the task of capturing the nuances of human values and the complexities of human ethics. (More) Everything you need to know about alignment. (Untangled Deep Dive)
A great new report by AI Now centers the right question: “The question we should be asking is not if ChatGPT is useful or not, but if OpenAI’s unaccountable power, linked to Microsoft’s monopoly and the business model of the tech economy, is good for society.” (More)
Don’t let power imbalances hold you or your system back — learn how to change them.
After taking my short, interactive, and (dare I say!) fun course, Systems Change for Tech & Society Leaders, you will be able to:
1. See your system clearly: power obscures problems in your sociotechnical system — learn to locate the problems! You can’t solve for what you can’t see.
2. Understand how your system is changing: learn to leverage that change, rather than just letting it happen. Understand the capabilities and limitations of a technology, and anticipate how those shape your working relationships.
3. Change your system as an individual: systems are unpredictable and do not abide by linear cause-and-effect rules — learn to intervene in complex systems by shifting the norms & behaviors of teams, organizations, and multi-stakeholder collaborations.
4. Change your system as a collective: you now have the tools you need for systems change, but are met with a world of differences within teams, cultural contexts, or organizations. Learn strategies to align a diverse group towards a common vision — and ultimately catalyze meaningful systems change.
Media, Crypto, etc.
A security researcher just figured out how to identify anyone’s phone number with access to only their Gmail address and a few hours of time. (More)
The Washington Post will start publishing opinion pieces from popular Substack writers, and use an AI editor and writing coach called ‘Ember’ to guide the writers. (More) Is the media landscape starting to re-centralize? I don’t think so. (Untangled Deep Dive)
YouTube raised the threshold for how much of a video can contain offensive content from a quarter of its length to half, and then, oddly, wrapped the policy change in an argument about the ‘public interest.’ (More)
The manosphere wants Trump and Musk to make up. (More) Everything you want to know about ‘the manosphere’: what it is, what it’s not, and why it offers only a surface explanation for what’s going on with young men. (Untangled Deep Dive)
“We can no longer afford the modern illusion that our technosocial innovations are conducive to human mastery.” - Shannon Vallor, Baillie Gifford Chair in the Ethics of Data and Artificial Intelligence at the Edinburgh Futures Institute (EFI) at the University of Edinburgh.