Anthropic Unveils AI Models Capable of Complex Tasks

Anthropic, a leading AI research company, recently unveiled two new models, Claude 4 Opus and Claude Sonnet 4. The new models boast significant improvements in their ability to reason, plan, and remember context over extended periods. Both models are available immediately to paying Claude subscribers, and Claude Sonnet 4 is also available to users of the free version.

Claude 4 Opus excels at playing Pokémon: it can play the game continuously for 24 hours, far longer than its predecessor could sustain. The model analyzes the game and makes decisions step by step with minimal direction. David Hershey, a member of the technical staff at Anthropic, led the Pokémon research project. He described Pokémon Red as the perfect game for his team because of its simplicity and turn-based nature.

Hershey’s primary goal was to study how Claude can be used as an agent, working independently to complete complex tasks on behalf of a user. To achieve this, he eliminated any Pokémon-specific data from Claude’s training set. He hopes to build a new game that the model has never seen before to truly test its limits.

The advancements in Claude 4 Opus demonstrate improved long-term memory and planning capabilities. The AI can now navigate complex quests without getting stuck or struggling with non-player characters. This enhanced coherence showcases the model’s ability to stay on track over extended periods.

Anthropic’s research bears on one of the industry’s pressing questions: how can we understand the decisions an AI makes when approaching complex tasks? Building powerful agents that can handle such tasks safely and reliably is a significant focus for the company. Claude 4 Opus and its companion model, Sonnet 4, are steps towards this goal.

The new models are released under different safety classifications: ASL-3 for Claude 4 Opus and ASL-2 for Sonnet 4. The ASL (AI Safety Level) system classifies a model by the risks it poses, with higher levels calling for stricter safeguards. Anthropic says it is committed to building AI that can handle complex tasks reliably, a step towards creating virtual collaborators.

Competitors such as Google and OpenAI are working towards similar goals. However, Anthropic has taken a cautious approach, prioritizing research over rapid deployment. This approach is intended to produce powerful AI while minimizing risks, particularly when dealing with sensitive information.

Source: https://www.wired.com/story/anthropic-new-model-launch-claude-4