ExoBrain
governance and regulation, AI safety, AI security, frontier labs

Can new regulations keep us safe from powerful models?

Illinois has passed the strongest US AI safety law to date, mandating third-party audits and incident reporting for the largest labs. But certifying a frontier model at launch made sense when capability and harm were separable, and with Mythos-class systems they no longer are.

Joel Miller


Illinois lawmakers have passed a landmark AI bill that, if signed, would be the first substantive AI safety regulation to take effect in the United States. The bill, now heading to Governor Pritzker's desk, would require the biggest labs to publish safety plans, report serious incidents, protect whistleblowers, and submit to independent third-party safety audits. It goes further than the lighter measures in California and New York, yet it stays narrow, applying only to the largest companies. Set against the EU AI Act, which has spent two years buckling under its own ambition, missing deadlines and pushing its high-risk rules out to 2027 and beyond, Illinois looks relatively enforceable.

But before we get too excited, we should consider what it means to safety-test a frontier model. It is like certifying a brilliant student on the day they collect their PhD. You can examine their record, confirm they behaved well at university, and send them into the world with a clean certificate. The graduate does not necessarily choose to cause harm. They are influenced once they are out there, shaped by the people around them and the ends those people pursue. You cannot control that after the degree is awarded, and you cannot certify against it.

With AI, capability and harm are one and the same. Anthropic's Claude Mythos, judged too dangerous to release publicly, found nearly 300 vulnerabilities in Firefox where an earlier model found around 20. The skill that makes it a superb defender is the same skill that makes it a superb attacker, so the two cannot be separated. The AI Security Institute has shown this is not unique to Mythos. Offensive cyber strength now arrives as a by-product of general intelligence. Every frontier release is a cyber-capability release, whether the lab intended it or not.

What can a third-party auditor actually certify? Not that a model this powerful will never be turned to harm. Nobody can promise that. OpenAI and Anthropic welcome the Illinois bill, but not because it builds them a regulatory moat. The revenue and compute thresholds sit far too high to keep any startup out, and for anyone genuinely building machine intelligence, filing a safety report is trivial. The real value of a certificate is permission. "Independently audited" is the phrase that lets release continue, even as creative, malicious misuse takes us into new territory.

Takeaways: Illinois has written a careful law for a world that has already moved on. Certifying a model at birth made sense when capability and harm were separable, but with Mythos-class systems they are one and the same, and no auditor can sign off on what a powerful model becomes once it is out in the world and shaped by the people using it. The meaningful work is not stamping models safe at launch. It is hardening the world they are about to enter, and we have months, not years, to do it.
