ExoBrain
compute infrastructure, energy and climate, inference economics, model releases, open models

Meta’s Eco Llama

Meta releases Llama 3.3, an efficient 70 billion parameter open-source model that maintains high performance while significantly reducing training emissions and inference costs.

ExoBrain

1 min read

Meta has released Llama 3.3, a new open-source language model that packs the punch of its 405 billion parameter model into a smaller 70 billion parameter package. The key achievement? Running costs could be up to 24 times lower.

The model needed 39.3 million GPU hours on Nvidia H100 hardware to train, but Meta offset this with renewable energy to achieve net-zero emissions during training. It’s set to cost around $0.01 per million tokens, significantly less than many competitors. Performance hasn’t suffered from the downsizing. Llama 3.3 achieves 91.1% accuracy on multilingual reasoning tasks, supporting languages from German to Thai. It also features a 128,000 token context window, matching GPT-4o’s capacity to process about 400 pages of text at once.
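To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a single flat price of $0.01 per million tokens, which is a simplification since real providers often price input and output tokens differently. It uses only the numbers quoted above; the function names are made up for illustration.

```python
# Figures as quoted in the article; treat them as reported estimates.
PRICE_PER_MILLION_TOKENS_USD = 0.01   # assumed flat price per million tokens
CONTEXT_WINDOW_TOKENS = 128_000       # Llama 3.3 context window
PAGES_PER_CONTEXT = 400               # the article's "about 400 pages"


def inference_cost_usd(tokens, price_per_million=PRICE_PER_MILLION_TOKENS_USD):
    """Cost in USD of processing `tokens` tokens at a flat per-million price."""
    return tokens / 1_000_000 * price_per_million


def tokens_per_page(context_tokens=CONTEXT_WINDOW_TOKENS, pages=PAGES_PER_CONTEXT):
    """Average tokens per page implied by the article's page estimate."""
    return context_tokens / pages


if __name__ == "__main__":
    full_window = inference_cost_usd(CONTEXT_WINDOW_TOKENS)
    print(f"Cost to fill one context window: ${full_window:.5f}")
    print(f"Implied tokens per page: {tokens_per_page():.0f}")
```

At the quoted price, filling the entire 128,000 token window once would cost a small fraction of a cent, and the 400-page estimate works out to roughly 320 tokens per page.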

Meta’s approach shows how AI companies are starting to tackle the field’s resource intensity. By shrinking model size while maintaining performance, they’re addressing both cost and environmental concerns – two factors that could shape how AI develops.

Takeaways: Meta’s latest release suggests efficient AI development doesn’t always mean compromise. As energy and computing costs influence AI strategy, expect more companies to focus on getting maximum performance from smaller models.