Released on February 5, 2026, Claude Opus 4.6 came with a feature that felt more like a research preview than a product announcement: Agent Teams. Multiple Claude Code instances working in parallel, coordinating directly with each other, each owning a piece of a larger task.
I’ve been building with it for six weeks. Here’s what I’ve learned.
What Agent Teams Actually Is
The simplest mental model: Agent Teams lets you spin up a Claude Code session as an orchestrator, which can spawn sub-agents (other Claude Code instances) that work on parallel subtasks. Each sub-agent runs in its own context window. They can message each other directly — not just report back to the orchestrator.
This is different from how most multi-agent frameworks work today. In a typical setup:
- Agent A calls Agent B (or delegates to it)
- Agent B executes, returns result
- Agent A processes and decides next step
Agent Teams allows peer-to-peer coordination:
- Orchestrator spawns Agent A and Agent B
- Agent A discovers it needs something Agent B is working on
- Agent A messages Agent B directly
- Both continue, orchestrator synthesizes
For software development specifically, this maps well onto how engineering teams actually work. Not strictly hierarchical, with a tech lead delegating every decision — but collaborative, with engineers coordinating laterally on shared concerns.
The Adaptive Thinking Architecture
Alongside Agent Teams, Opus 4.6 introduced Adaptive Thinking — and this is the change that affects API integration most directly.
The old model: you set budget_tokens to control how much reasoning the model does. Simple, but blunt. You either paid for maximum reasoning on every call or accepted degraded performance on complex tasks.
The new model: you specify an effort level (low, medium, high, max) and let the model decide how much reasoning the problem actually requires. Adaptive thinking automatically enables interleaved thinking — the model can reason between steps, not just at the start of a response.
```python
import anthropic

client = anthropic.Anthropic()

# Old approach — deprecated on Opus 4.6
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=8000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # ⚠️ Deprecated
    },
    messages=[{"role": "user", "content": "Review this PR diff..."}]
)

# New approach — use this
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=8000,
    thinking={
        "type": "adaptive",
        "effort": "high"  # low | medium | high (default) | max
    },
    messages=[{"role": "user", "content": "Review this PR diff..."}]
)
```
The other breaking change: prefilling assistant messages now returns a 400 error. If you’ve been using the assistant prefill pattern for steering outputs, you need to move this to system prompts before deploying Opus 4.6.
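The migration is mechanical. A sketch of the before/after — the prompt text here is illustrative, not a real endpoint call (the `client.messages.create(**request)` call site is unchanged):

```python
# Old pattern (Opus 4.5 and earlier): prefill the assistant turn to
# steer the output shape. On Opus 4.6 this request returns a 400 error.
old_messages = [
    {"role": "user", "content": "List the risks in this diff..."},
    {"role": "assistant", "content": "{"},  # prefill — now rejected
]

# New pattern: move the steering into the system prompt and keep the
# conversation ending on a user turn.
request = {
    "model": "claude-opus-4-6",
    "max_tokens": 1000,
    "system": "Respond with a single JSON object and nothing else.",
    "messages": [{"role": "user", "content": "List the risks in this diff..."}],
}
# client.messages.create(**request)  # call site stays the same
```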
Context Compaction: The Enabler for Long Agent Sessions
One of the quiet but important features in Opus 4.6 is Context Compaction (beta). In long-running agentic workflows, context windows fill up. The traditional behavior: you hit the limit, the session fails, you lose state.
Context Compaction handles this by summarizing older context as the session approaches token limits, preserving the essential state while discarding the verbatim history. For Agent Teams sessions that might run for hours on a large codebase migration, this is the difference between a session that completes and one that fails at the 3-hour mark.
The 1M token context window (beta) helps too, but Context Compaction is what makes 12+ hour agent sessions genuinely viable.
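Compaction semantics aside, the core idea is easy to approximate client-side, which is useful as a fallback while the feature is in beta. A minimal sketch — the thresholds and the `summarize` callable (e.g. a cheap model call) are placeholders, not part of the API:

```python
def compact_history(messages, summarize, max_messages=40, keep_recent=10):
    """Collapse older turns into one summary message once the history
    grows past max_messages; keep the most recent turns verbatim.

    `summarize` is a placeholder callable that turns a list of
    messages into a short text summary (e.g. via a cheap model call).
    """
    if len(messages) <= max_messages:
        return messages
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = summarize(older)
    return [
        {"role": "user", "content": f"[Summary of earlier session]\n{summary}"}
    ] + recent
```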
Real Performance: Terminal-Bench 2.0
The benchmark that matters most for coding agents is Terminal-Bench 2.0, which evaluates performance on agentic coding tasks in a real terminal environment — not isolated coding puzzles, but end-to-end development workflows.
Opus 4.6 scored 65.4% on Terminal-Bench 2.0 — the highest ever recorded. For comparison, the previous best was in the low-50s.
The practical implication: Opus 4.6 can handle complete development workflows, not just code generation snippets. Multi-file changes, test running, error fixing, iterating based on output — the kind of task sequence a developer actually does.
Building With Agent Teams: A Real Example
Here’s a pattern I’ve used successfully for parallel code review across a large PR:
```python
import anthropic

client = anthropic.Anthropic()

def spawn_review_agent(file_path: str, diff_content: str, concern: str) -> str:
    """Run a focused review agent on a specific concern."""
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=4000,
        thinking={"type": "adaptive", "effort": "high"},
        system=f"""You are a specialized code reviewer focused on: {concern}.
Review only what's relevant to your focus area. Be specific and actionable.""",
        messages=[
            {
                "role": "user",
                "content": f"Review this change in {file_path}:\n\n{diff_content}"
            }
        ]
    )
    return response.content[-1].text

def parallel_pr_review(pr_diff: str) -> dict:
    """Orchestrate parallel review across multiple concerns."""
    concerns = [
        ("security", "Security vulnerabilities, injection risks, auth issues"),
        ("performance", "Performance bottlenecks, N+1 queries, memory leaks"),
        ("correctness", "Logic errors, edge cases, error handling gaps"),
        ("maintainability", "Code structure, naming, testability")
    ]

    # In a real Agent Teams setup, these run in parallel Claude Code
    # sessions; here the orchestration pattern is shown sequentially.
    results = {}
    for concern_id, concern_desc in concerns:
        results[concern_id] = spawn_review_agent(
            "pr_diff.txt",
            pr_diff,
            concern_desc
        )
    return results

# Orchestrator synthesizes results
def synthesize_reviews(reviews: dict) -> str:
    synthesis_prompt = "\n\n".join([
        f"## {concern.title()} Review\n{review}"
        for concern, review in reviews.items()
    ])
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=2000,
        thinking={"type": "adaptive", "effort": "medium"},
        messages=[
            {
                "role": "user",
                "content": f"Synthesize these parallel code reviews into a prioritized action list:\n\n{synthesis_prompt}"
            }
        ]
    )
    return response.content[-1].text
```
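The orchestration loop above calls each reviewer one at a time. Outside a real Agent Teams session, a thread pool is a simple way to recover the parallelism, since each call is I/O-bound. A sketch — `review_fn` stands in for `spawn_review_agent` so the pattern is testable in isolation:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_pr_review_threaded(pr_diff: str, concerns, review_fn) -> dict:
    """Run one reviewer per concern concurrently.

    `review_fn(file_path, diff, concern_desc)` stands in for
    spawn_review_agent above; threads are appropriate here because
    each call spends its time waiting on an API response.
    """
    with ThreadPoolExecutor(max_workers=len(concerns)) as pool:
        futures = {
            concern_id: pool.submit(review_fn, "pr_diff.txt", pr_diff, desc)
            for concern_id, desc in concerns
        }
        return {cid: f.result() for cid, f in futures.items()}
```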
The key architectural insight: each sub-agent runs with a focused context — it only knows about the files and concerns relevant to its slice of work. This produces better results than feeding everything to one agent because the reasoning isn’t diluted across unrelated concerns.
The Codebase Migration Case
The use case Anthropic highlights most prominently for Opus 4.6 is large-scale codebase migration, and the performance data supports this framing.
A multi-million-line migration that would typically take weeks of developer time can be orchestrated as an Agent Teams session where:
- An orchestrator agent maps the codebase and creates a migration plan
- Multiple sub-agents handle different modules in parallel
- Agents flag dependencies to the orchestrator when they encounter cross-module concerns
- The orchestrator coordinates merging work and resolving conflicts
The reported result: half the time compared to sequential single-agent approaches.
I’ve tested this at smaller scale — a 200K-line .NET codebase migration from .NET 6 to .NET 9. With a single agent: 8 hours. With Agent Teams (4 parallel sub-agents on different namespaces): 2.5 hours. The time savings are real, but the coordination overhead matters — the orchestrator needs enough context to resolve conflicts intelligently.
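The orchestrator's merge-and-resolve loop can be sketched as a skeleton. Everything here is a placeholder for an agent call — `migrate_module` and `resolve_conflict` are hypothetical callables, and real sub-agents would run the module work in parallel rather than in this sequential loop:

```python
def run_migration(modules, migrate_module, resolve_conflict):
    """Skeleton of the orchestrator loop described above.

    migrate_module(name) -> (patched_files: dict, flagged_deps: list)
    resolve_conflict(path, versions) -> merged file content
    Both callables are placeholders for sub-agent calls.
    """
    merged = {}
    deferred = []
    for name in modules:  # sub-agents would process these in parallel
        patches, flags = migrate_module(name)
        deferred.extend(flags)  # cross-module concerns go back to the orchestrator
        for path, content in patches.items():
            if path in merged and merged[path] != content:
                # Two modules touched the same file differently: resolve.
                merged[path] = resolve_conflict(path, [merged[path], content])
            else:
                merged[path] = content
    return merged, deferred
```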
What Actually Changed for My Workflow
Six weeks in, the changes I actually care about:
Agent Teams is most valuable for tasks with natural parallelism. Code review, parallel module development, multi-format content generation. Less valuable for tasks where everything depends on everything else — sequential reasoning is still the right approach there.
Adaptive thinking is better than budget_tokens. The old approach required tuning budget_tokens per task type, which was trial-and-error. The effort levels map more intuitively to actual task complexity — I use high for code generation, medium for review, low for formatting/extraction tasks.
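That mapping is easy to encode as a small routing table — the task-type names are my own convention, not API values:

```python
# Task-type names are this article's convention; only the
# "type"/"effort" fields are actual API parameters.
EFFORT_BY_TASK = {
    "code_generation": "high",
    "code_review": "medium",
    "formatting": "low",
    "extraction": "low",
}

def thinking_config(task_type: str) -> dict:
    """Build the thinking parameter for a given task type.

    Unknown task types fall back to "high", the model's default.
    """
    return {"type": "adaptive", "effort": EFFORT_BY_TASK.get(task_type, "high")}
```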
The 1M context window changes planning. Feeding an entire codebase into context (for large-scale analysis or migration planning) is now practical, not theoretical. This creates a new category of task that wasn’t feasible before.
Pricing stayed the same ($5/$25 per million tokens). For a capability upgrade this significant, keeping the price flat is notable. It changes the ROI calculation on tasks that were previously at the margin.
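At those rates, back-of-envelope costing is one line (the token counts in the example are illustrative):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  in_rate: float = 5.0, out_rate: float = 25.0) -> float:
    """USD cost at per-million-token input/output rates."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# e.g. a review pass with 120K tokens in, 8K out:
# estimate_cost(120_000, 8_000) -> 0.80 USD
```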
Migration Checklist
If you’re on Claude Opus 4.5 or earlier:
□ Replace thinking.budget_tokens → thinking.type: "adaptive", effort: "high"
□ Remove interleaved-thinking-2025-05-14 beta header (now ignored)
□ Move assistant prefill patterns to system prompts (prefill → 400 error)
□ Test context compaction behavior in long-running sessions
□ Update model string to "claude-opus-4-6"
The migration is low-risk — most changes are additive or provide sensible defaults. The prefill change is the only one likely to cause immediate failures, so test that first.
Looking Forward
The trajectory is clear: multi-agent coordination is becoming a first-class concern in how AI-assisted development tools are designed. Agent Teams in Opus 4.6, GitHub Copilot Workspace, Windsurf’s parallel agent sessions — all of these shipped within the same quarter.
The developer’s job is shifting from writing code to orchestrating agents that write code, with the developer focused on architecture, verification, and integration. That’s a different skill set than optimizing LLM prompts. The teams that figure out orchestration patterns — how to split work, how to verify agent output, how to handle agent coordination failures — will have a structural advantage.
That’s the bet worth making today.
Claude Opus 4.6 is available via the Anthropic API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Azure AI Foundry. Agent Teams is in research preview.