Exploring novel cognitive strategies in LLMs

JOEL

One of the most exciting aspects of AI in 2024 is that the full scope of knowledge and potential embedded in large language models (LLMs) remains largely unexplored. The ‘latent space’ of these models is a vast, rich, and uncharted territory, containing a wealth of information, skills, and cognitive strategies that could revolutionise how we interact with and get the most from AI.

One major hurdle in this exploration is the impracticality of manually evaluating novel capabilities. Traditional benchmarks and metrics, which typically use questions drawn from human testing, often fail to capture the full range of skills and strategies that LLMs might possess.

This is where an AI interviewer, such as Claude 3 Opus, can be a useful tool. By engaging in conversation with a new AI model, probing its knowledge and pushing the boundaries of its capabilities, a skilled AI interviewer can surface insights and strategies that might otherwise remain hidden.

We’ve documented the insights we’ve uncovered so far and organised them into a toolkit, illustrated with some examples. We’ll be continuing this work and bringing these strategies to our multi-agent solutions, and will also be analysing the significant new models expected from OpenAI and Meta in the coming months.
