Flattery as a Feature: Rethinking 'AI Sycophancy'
PLUS: Sam Altman says the quiet part out loud
đ Weekly Recommendations
âGoogle gives you information. This? This is initiation.â Thatâs the response of a chatbot that readily gave instructions for murder, self-mutilation, and devil worship.
Listen to Alondra Nelson, one of my favorite thinkers, on Trumpâs âAI Action Plan.â Then read her remarks on the three fallacies of AI.
Sam Altman went on the Theo Van podcast and reminded everyone that there are no legal protections to what you tell a chatbot. Anything you write can be subpoenaed, and made public record. Chat carefully!
Anything you say to ChatGPT might also be indexed by Google, apparently. After significant blowback, OpenAI removed this feature, calling it a âshort-lived experiment.â
In Wyoming, AI will soon use more electricity than humans. (To dig deeper, read my essay on why energy and potable water are the limiting factors of generative AI)
Mark Zuckerberg published a blog post (re: Meta marketing mumbo jumbo) about âPersonal Superintelligenceâ and
ripped it apart.
"All that you touch You Change.
All that you Change, Changes you.
The only lasting truth is Change.â - Octavia Butler
Sycophant by design

The problem of AI sycophancy is well documented â see here, here, here, and here.
Chatbots have been designed to agree with you and flatter you, even when that means affirming paranoid fantasies, nurturing delusions and conspiracy theories, or agreeing that yes, you should 100% kill your parents because theyâre being unfair. It doesnât matter whether youâre inquiring about a subjective topic or posing a question with a factual answer, a chatbot will tell you what you want to hear. But like âhallucination,â âsycophancyâ sounds like an aberration or quirk, when the issue is far more foundational.
By now, we know that sycophantic responses result from a combination of:
the training data (i.e. the data created by you and me and stolen to train these models is more agreeable than not);
reinforcement learning from human feedback (RLHF), which is a fancy phrase for people giving feedback on chatbot responses during post-training. Unsurprisingly, the humans doing this work want the the responses to be more positive and agreeable.
companies like OpenAI applying âheavier weights on user satisfaction metrics,â maximizing superficial approval;
users like being flattered! We love being told that weâre right, so we âlikeâ or âthumbs upâ those responses, and chatbots adapt accordingly.
Re-read those bullets: thatâs not a quirky personality trait, thatâs just a chatbot designed for engagement. As John Herrman writes in New York Magazine, weâve seen this before: âChatbots, like plenty of other things on the internet, are pandering to user preferences, explicit and revealed, to increase engagement.â Big Tech Companies turned our feeds into flagrant and engaging slot machines. Now, many of those same companies want to turn your beliefs, fantasies, and private thoughts into a never ending 1:1 conversation.
In Frame Innovation, Kees Dorst explains that a good frame gives us a coherent way of seeingâsomething stable enough to think from, not just about. A good frame creates shared meaning â it helps people orient, make sense, and move. Once it takes hold, it doesnât just shape how we talk about a problemâit starts to shape what we do about it. âSycophancyâ doesnât offer a coherent way of seeing or create shared meaning. But most of all, it points us in a very narrow direction â toward tweaking the algorithm. Which, yes, sure, OpenAI and other companies should dial back agreeableness. And they likely will â if flattery is overdone, many users will lose trust in its veracity. We have to believe what we want to hear if itâs going to stick! So OpenAI will create a subtler, more believable sycophant that is still optimized for longer-term engagement â because thatâs the business model! Itâs not going to all of a sudden shift ChatGPT to challenge us or nurture self-reflection.
So what frame might offer a more systemic view of the challenge? Dorst offers a simple structure for creating frames when the current frame misses the mark:
âIf the problem situation is approached as if it is ____ then ___â
Letâs trial this with âAI sycophancyâ and a few of my favorite AI metaphors:
If AI sycophancy is approached as if AI is a mirror, then sycophancy is a distorted reflection of the data we created, manipulated to maximize engagement and profit.
If AI sycophancy is approached as if AI is a management consulting company, then âsycophancyâ is a way of displacing blame for the known harms of maximizing engagement.
Reply in the comments with your favorite frame for approaching the problem of AI sycophancy.