Sam Altman promotes the next generation of AI
Sam Altman outlines OpenAI's vision for autonomous agents and next-generation models, while a mysterious 'gpt2-chatbot' leak sparks community speculation about upcoming capabilities and architectural shifts.
Joel Miller

AI circus ringmaster and OpenAI CEO Sam Altman has been making some bold claims about the future of AI, hinting at the company’s vision for autonomous agents capable of tackling complex tasks and potentially serving as a stepping stone to artificial general intelligence (AGI).
Speaking on Wednesday, he talked of a “super-competent colleague that knows absolutely everything about my whole life, every email, every conversation I’ve ever had, but doesn’t feel like an extension.” He also doubled down on the narrative that what comes next will be a big step, calling today’s ChatGPT “mildly embarrassing at best” and GPT-4 the “dumbest model” we’ll ever have to use. He went on to say, with some scientific certainty, that “GPT-5 is going to be a lot smarter than GPT-4” and that GPT-6 would see a similar jump.
If that was not enough expectation building, hours earlier a mysterious new chatbot had emerged briefly on a testing site, sending the AI community into overdrive. The site in question, LMSYS.org, provides a blind testing interface for users to rate bots and publishes a widely used leaderboard. The first reports surfaced on Sunday night, with people saying that a model dubbed “gpt2-chatbot”, apparently with unprecedented capabilities, was appearing in tests. Soon everyone was heading to the site to try it out. “The whole situation is so infuriatingly representative of LLM research,” AI researcher Simon Willison told Ars Technica. “A completely unannounced, opaque release and now the entire Internet is running non-scientific ‘vibe checks’ in parallel.”
Our own vibe checks suggested a GPT-4-level model for logic tasks, but with an impressive ability to self-reflect and generate very detailed plans, more so than anything we’ve seen before. We also run a kind of cognitive process scan on the AI models we use. On one of our probes, gpt2-chatbot responded: “In reflecting on the methods used and possible biases, I engage in a simulated form of metacognition, analyzing my own ‘thought’ processes and decision-making strategies as if stepping back and reviewing a human’s cognitive processes”. That response hints at some unusual internal loops that could explain its capabilities.
As theories abounded, Altman shared a cryptic message, posting on X: “i do have a soft spot for gpt2”. To add to the intrigue, X users spotted that Sam had edited the tweet, having initially posted “gpt-2”. With most of the AI community hammering the testing site, the bot was taken down, probably never to be seen in the wild again. There have been no official statements from OpenAI, but the consensus is that this may have been an attempt to get some early feedback, to harvest some choice prompts (Sam now has our brain scanner), and to build hype for the range of models they plan for 2024. Within that range, several people believe this might be an enhanced but more compact GPT-4-level option with some new planning capabilities, perhaps destined for free-tier users later in the year, and not likely the full-strength GPT-5. Others believe the “gpt2” moniker might hint at a new ‘2nd generation’ LLM architecture or a new product naming convention.
Takeaways: Agents that can plan reliably are the next big thing. We’re working on agentic solutions for clients, and we’ll be covering this topic in detail in future weeks, but for now it is pretty clear that there are some strong capabilities in the pipeline for 2024. AGI would be, by most measures, one of the most significant inventions in human history. It would, however, be preferable if this wasn’t being turned into a circus by the OpenAI team, who clearly delight in cryptic comms. Their commitment to shipping early and often, and letting the world adjust to the implications, is laudable, but more transparency is needed. Let’s just hope they take their work more seriously behind the scenes and are indeed on the verge of delivering some major AI progress. A few brief prompts with a shadowy AI seem to suggest they may have something interesting waiting in the wings.