This week, Google’s experimental AI tool NotebookLM started popping up everywhere due to its new Audio Overviews feature. It’s not just another AI parlour trick – it seems to have crossed a threshold from the uncanny valley to a relatable and genuinely engaging generated experience. It shows how AI-powered research and writing tools might help users consume analysis in a format much closer to an evolving conversation than a flat summary.
NotebookLM’s Audio Overviews can chew through up to 50 source documents and spit out a podcast-style summary, complete with male and female synthetic voices. The AI doesn’t just read; it chats, throwing in ‘ums’ and ‘ahs’ like a real podcast duo. You can feed it anything from a dry research paper to a YouTube video, and it teases out the salient points and has the presenters analysing them from different angles.
OpenAI co-founder Andrej Karpathy posted on X: “The more I listen the more I feel like I’m becoming friends with the hosts, and I think this is the first time I’ve actually viscerally liked an AI.” Numerous podcast series created using the feature have already popped up, and we’ll likely see more services of this kind emerge.
Not wanting to be left out, we put NotebookLM through its paces by generating a very ‘meta’ episode. Some months ago, we published a fascinating conversation between Claude 3 Opus and a pre-release version of GPT-4o. The conversation is long and thus takes a while to read through, but in this podcast the NotebookLM ‘hosts’ provide a short and engaging overview. It’s AIs discussing AIs, but trust us, it’s worth a listen…
Takeaways: NotebookLM’s Audio Overviews and OpenAI’s new voice mode are a sign that text-to-speech is reaching a level where it feels right. The delivery is no longer the stilted or monotonous output of a computer, but that of relatable, human-like entities. This will be a fresh way to package and consume information, and to interact with AI, but as ever, it also means that the zero-trust mode of digital interactions becomes even more crucial. As these AI voices become indistinguishable from humans, we’ll need to develop new ways to verify the authenticity and source of the information we consume, even when it sounds like it’s coming from a trusted friend.
