🗞️ ChatGPT & Copyright; AlphaFold gets an asterisk; Participation & Trust
Want to know what truly scares me about artificial intelligence? Read to the end!
Hi, it’s Charley, and this is Untangled, a weekly-ish newsletter on technology, people, and power. Last week:
🧠 I published the essay, “Synthetic social media: Internet users, brace for the emergence of E-Swift.”
🤗 I shared a personal letter to my nephew and enlisted you in shaping the future of Untangled. Please read the post and respond to the poll if you haven’t — there is one day left!
Now on to the show, where I contextualize research and news about:
📰 OpenAI & the value of high-quality data
🧬 AlphaFold & AI as hypotheses generators
🤓 The displacement of social trust
📣 What in the world does meaningful participation look like?
📰 In “What Do AI Companies Want With the Media,” John Herrman breaks down the deal between OpenAI and Axel Springer, the German media conglomerate that owns Politico and Business Insider. The upshot? It’s all about the data — stories from Axel Springer publishers will be used to train OpenAI models. See, OpenAI’s biggest competitive vulnerability is access to high-quality data. Other companies in the generative AI race — Facebook, Google, and even X — already have access to our data. That’s why OpenAI scraped the public web to train its models and potentially infringed on the copyright of The New York Times. The race for high-quality data is also why other news organizations are trying to block companies like OpenAI from using their articles as training data.
This is a story of media companies trying to learn the lessons of the past. On the one hand, I’m glad they’ll get paid for their work. But like Herrman, I’m also not exactly excited for a world wherein the “web is to be harvested by companies that give back nothing but spam, and a company like Axel Springer is destined to be reduced to a wire service for an automated news aggregator.”
If you want to understand what might happen in the not-so-distant future when we run out of high-quality data and tech companies start training their models on the synthetic outputs of prior models, read my essay, “The Doom Loop of Synthetic Data.”