Solving MCP Context Bloat with Claude's Tool Search API
If you’ve been working with the Model Context Protocol (MCP), you’ve likely encountered a frustrating bottleneck: Context Bloat.
When you connect a powerful MCP server—like the GitHub MCP—it typically loads the definitions for every single tool available. For a server with 91 tools, this can consume roughly 46,000 tokens just for the definitions. That’s a massive chunk of your context window gone before you’ve even sent a single message.
Anthropic has addressed this challenge with a powerful new feature in the Claude API called Tool Search. While this isn’t a change to the MCP protocol itself, it’s a game-changer for anyone building MCP-powered agents with Claude.
What is Tool Search?
Tool Search is a Claude API feature that lets the model discover and load tools on demand rather than preloading everything at the start. Instead of placing every tool definition in the context window up front, Claude searches your tool catalog and loads only the 3-5 most relevant tools for the current task.
This solves two critical challenges:
- Context Efficiency: Tool definitions can consume massive portions of your context window (50 tools ≈ 10-20K tokens)
- Tool Selection Accuracy: Claude’s ability to correctly select tools degrades significantly with more than 30-50 tools loaded at once
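To put these figures side by side, here is a rough back-of-the-envelope calculation. It uses only numbers quoted in this article (91 tools ≈ 46,000 tokens, 3-5 tools loaded per search) and assumes, for simplicity, that every tool definition costs about the same number of tokens. Real catalogs will vary, and the estimate ignores the small cost of the search tool itself.

```python
# Back-of-the-envelope estimate using the figures quoted in this article.
# Assumption: every tool definition costs roughly the same number of tokens.

TOTAL_TOOLS = 91
TOTAL_DEFINITION_TOKENS = 46_000

tokens_per_tool = TOTAL_DEFINITION_TOKENS / TOTAL_TOOLS


def estimated_tokens(tools_loaded: int) -> int:
    """Estimate the definition tokens for a given number of loaded tools."""
    return round(tools_loaded * tokens_per_tool)


if __name__ == "__main__":
    print(f"Average per tool: ~{tokens_per_tool:.0f} tokens")
    for loaded in (3, 5, TOTAL_TOOLS):
        print(f"{loaded:>2} tools loaded: ~{estimated_tokens(loaded):,} tokens")
```

Under that uniform-cost assumption, loading five tools instead of all 91 cuts the definition overhead from roughly 46,000 tokens to a few thousand.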
Key Point: Tool Search is a Claude API feature, not an MCP protocol change. It works seamlessly with MCP servers, but the search logic runs on Claude’s side.
Supported Providers & Models
Tool Search is currently in public beta and requires specific beta headers depending on your provider:
| Provider | Beta Header | Supported Models |
|---|---|---|
| Anthropic API | advanced-tool-use-2025-11-20 | Claude Opus 4.5, Claude Sonnet 4.5 |
| Microsoft Foundry | advanced-tool-use-2025-11-20 | Claude Opus 4.5, Claude Sonnet 4.5 |
| Google Cloud Vertex AI | tool-search-tool-2025-10-19 | Claude Opus 4.5, Claude Sonnet 4.5 |
| Amazon Bedrock | tool-search-tool-2025-10-19 | Claude Opus 4.5 only |
Bedrock Users: Tool Search is only available via the Invoke API, not the Converse API.
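If your code targets more than one provider, the table above can be kept as a small lookup. This is only a sketch: the header values are copied from the table, while the provider keys (`anthropic`, `microsoft_foundry`, and so on) are hypothetical labels chosen for this example, not official identifiers.

```python
# Beta header values copied from the table above.
# The dictionary keys are hypothetical labels used only in this sketch.
BETA_HEADERS = {
    "anthropic": "advanced-tool-use-2025-11-20",
    "microsoft_foundry": "advanced-tool-use-2025-11-20",
    "vertex_ai": "tool-search-tool-2025-10-19",
    "bedrock": "tool-search-tool-2025-10-19",
}


def beta_header_for(provider: str) -> str:
    """Return the Tool Search beta header value for a provider label."""
    try:
        return BETA_HEADERS[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider!r}") from None
```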
The Two Flavors of Search: Regex vs. BM25
Claude offers two distinct search methods, each optimized for different tool naming conventions:
1. Regex Search (tool_search_tool_regex_20251119)
Best For: Tools with a strict, consistent, and predictable naming structure.
How It Works: Claude constructs regex patterns using Python’s re.search() syntax to find matching tools.
- Pro: Extremely precise for structured APIs like stripe_customer_get or aws_ec2_stop
- Con: Fails if tool names are inconsistent or ambiguous
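To get a feel for which patterns would match your catalog, you can try them locally with Python's `re.search()`. The sketch below is a local simulation only: the real search runs on Claude's side, the tool names are illustrative, and matching against names alone is a simplifying assumption.

```python
import re

# Illustrative tool names with a consistent provider_resource_action scheme.
TOOL_NAMES = [
    "stripe_customer_get",
    "stripe_customer_create",
    "stripe_invoice_list",
    "aws_ec2_stop",
    "aws_ec2_start",
]


def search_tools(pattern: str, names: list[str]) -> list[str]:
    """Return the tool names that match the pattern, using re.search()."""
    return [name for name in names if re.search(pattern, name)]


if __name__ == "__main__":
    print(search_tools(r"^stripe_customer_", TOOL_NAMES))
    print(search_tools(r"ec2_(start|stop)$", TOOL_NAMES))
```

A consistent naming scheme is what makes short patterns like these reliable. With names such as `getCust` and `stripe_fetch_customer` in the same catalog, no single pattern would find both.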
2. BM25 Search (tool_search_tool_bm25_20251119)
Best For: Tools with natural language names or semantically meaningful descriptions.
How It Works: Claude uses natural language queries, and a BM25 relevance ranking algorithm matches them to your tool definitions.
- Pro: More flexible; tolerates natural language phrasing and wording variations, provided the query shares terms with your tool names or descriptions
- Con: Slightly less precise than exact pattern matching
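For intuition about how BM25 ranking behaves, here is a minimal, self-contained Okapi BM25 scorer. It is not Anthropic's implementation: the tokenizer, the `k1` and `b` defaults, the choice to index the name together with the description, and the sample tools are all assumptions made for this sketch.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query: str, tools: dict[str, str], k1: float = 1.5, b: float = 0.75):
    """Rank tools (name -> description) by Okapi BM25 score, best first."""
    if not tools:
        return []
    # Index each tool as its name plus its description.
    docs = {name: tokenize(f"{name} {desc}") for name, desc in tools.items()}
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs

    # Document frequency: in how many tools does each term appear?
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # terms absent from the tool contribute nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores[name] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical catalog used only for this example.
TOOLS = {
    "create_issue": "Open a new issue in a GitHub repository",
    "list_pull_requests": "List pull requests for a repository",
    "get_weather": "Get the current weather forecast for a city",
}

if __name__ == "__main__":
    for name, score in bm25_rank("open an issue", TOOLS):
        print(f"{name}: {score:.3f}")
```

Because BM25 scores only terms that literally appear in the indexed text, descriptive tool names and descriptions directly improve what a natural language query can find.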
How the defer_loading Flag Works
The magic happens with a simple property: defer_loading: true. When you mark a tool with this flag, Claude won’t load its definition into context until it’s discovered via search.
Key Rules:
- The Tool Search tool itself must never have defer_loading: true
- Keep your 3-5 most frequently used tools as non-deferred
- At least one tool must be non-deferred (or you’ll get an error)
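The two hard rules above are easy to check before sending a request. The following is a sketch of such a pre-flight check, assuming tools are plain dictionaries with an optional `type` and `defer_loading` key; the function name `check_defer_rules` and the sample tool are hypothetical.

```python
# Search tool type identifiers as named in this article.
SEARCH_TOOL_TYPES = {
    "tool_search_tool_regex_20251119",
    "tool_search_tool_bm25_20251119",
}


def check_defer_rules(tools: list[dict]) -> list[str]:
    """Return a list of rule violations; an empty list means the rules hold."""
    problems = []
    for tool in tools:
        if tool.get("type") in SEARCH_TOOL_TYPES and tool.get("defer_loading"):
            problems.append("The tool search tool must not set defer_loading")
    if tools and all(tool.get("defer_loading") for tool in tools):
        problems.append("At least one tool must be non-deferred")
    return problems


if __name__ == "__main__":
    tools = [
        {"type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex"},
        {"name": "get_weather", "defer_loading": True},  # hypothetical tool
    ]
    print(check_defer_rules(tools) or "OK")
```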
Implementation Guide
Here’s how to implement Tool Search with the Claude API:
Step 1: Add the Beta Header
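A minimal sketch of the request headers for a direct HTTP call to the Anthropic API, using the beta value from the provider table. The `build_headers` helper and the placeholder key are hypothetical; the official SDKs offer their own way to pass beta flags, so check the SDK documentation if you use one.

```python
import os


def build_headers(api_key: str, beta: str = "advanced-tool-use-2025-11-20") -> dict[str, str]:
    """Build HTTP headers for a Messages API request with a beta flag enabled."""
    return {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": beta,  # opts the request into the beta feature
        "content-type": "application/json",
    }


# Read the key from the environment; the fallback is a placeholder, not a real key.
headers = build_headers(os.environ.get("ANTHROPIC_API_KEY", "sk-placeholder"))
```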
Step 2: Include the Search Tool
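The search tool is added to the `tools` array like any other tool entry. In this sketch the `type` values are the identifiers named earlier in this article; the `name` values are an assumption (the type without its date suffix), and `with_search_tool` is a hypothetical helper.

```python
# "type" values come from this article; "name" values are assumed.
REGEX_SEARCH_TOOL = {
    "type": "tool_search_tool_regex_20251119",
    "name": "tool_search_tool_regex",
}
BM25_SEARCH_TOOL = {
    "type": "tool_search_tool_bm25_20251119",
    "name": "tool_search_tool_bm25",
}


def with_search_tool(tools: list[dict], variant: str = "regex") -> list[dict]:
    """Return a new tools list with the chosen search tool placed first."""
    if variant not in ("regex", "bm25"):
        raise ValueError(f"Unknown variant: {variant!r}")
    search_tool = REGEX_SEARCH_TOOL if variant == "regex" else BM25_SEARCH_TOOL
    # Copy the entry so callers cannot mutate the module-level constant.
    return [dict(search_tool), *tools]
```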
Step 3: Mark Tools for Deferred Loading
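One way to apply the flag is to defer everything except a short list of frequently used tools. The sketch below assumes tools are dictionaries with a `name` key; `mark_deferred` and the sample tool names are hypothetical.

```python
def mark_deferred(tools: list[dict], keep_loaded: set[str]) -> list[dict]:
    """Return copies of the tools with defer_loading set on all but keep_loaded."""
    marked = []
    for tool in tools:
        entry = dict(tool)  # shallow copy so the input list is left untouched
        is_search_tool = entry.get("type", "").startswith("tool_search_tool")
        if not is_search_tool and entry["name"] not in keep_loaded:
            entry["defer_loading"] = True
        marked.append(entry)
    return marked


if __name__ == "__main__":
    # Hypothetical tools; only the most frequently used one stays loaded.
    catalog = [
        {"name": "search_issues", "description": "Search issues", "input_schema": {"type": "object"}},
        {"name": "delete_branch", "description": "Delete a branch", "input_schema": {"type": "object"}},
    ]
    for tool in mark_deferred(catalog, keep_loaded={"search_issues"}):
        print(tool["name"], tool.get("defer_loading", False))
```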
MCP Server Integration
When working with MCP servers, you can use mcp_toolset to defer all tools from a server by default.
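A sketch of what such a toolset entry might look like follows. Only `mcp_toolset` and `defer_loading` come from this article; the field names `mcp_server_name`, `default_config`, and `configs` are recalled from Anthropic's MCP connector documentation and should be verified there, and the server and tool names are hypothetical.

```python
def deferred_mcp_toolset(server_name: str, always_loaded: tuple[str, ...] = ()) -> dict:
    """Build an mcp_toolset entry that defers every tool except those listed."""
    toolset = {
        "type": "mcp_toolset",
        "mcp_server_name": server_name,             # assumed field name
        "default_config": {"defer_loading": True},  # assumed field name
    }
    if always_loaded:
        # Per-tool overrides for tools that should stay in context (assumed field name).
        toolset["configs"] = {name: {"defer_loading": False} for name in always_loaded}
    return toolset


if __name__ == "__main__":
    # "github" and "search_issues" are placeholder names.
    print(deferred_mcp_toolset("github", always_loaded=("search_issues",)))
```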
When Tool Search Isn’t Available: Manual Curation
Not every environment supports Claude’s Tool Search yet. Developers building agents in Microsoft Copilot Studio need a different approach.
The Alternative: Design-Time Tool Selection
While Copilot Studio doesn’t yet have dynamic tool search, it offers granular tool configuration that lets you manually control which tools are available to your agent.
When you connect an MCP server in Copilot Studio, every tool appears as a distinct action with an Enable/Disable toggle. Instead of dumping all 50+ tools into the agent’s context, you can curate exactly what the agent can access.
Why This Still Matters
Manual curation isn’t as elegant as dynamic search, but it’s effective:
- Context Hygiene: Disabling irrelevant tools removes their definitions from the system prompt
- Latency Reduction: Fewer tool tokens mean faster response times
- Accuracy: With fewer tools to choose from, the model is less likely to select the wrong action
Best Practice: If you’re using Copilot Studio, audit your MCP connections and disable any tools your agent doesn’t need. It’s manual, but it works.
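Copilot Studio handles this through UI toggles, but the same design-time curation can be expressed as an allowlist in code-driven agents on other platforms. The sketch below is not a Copilot Studio API; `curate_tools` and the tool names are hypothetical.

```python
def curate_tools(all_tools: list[dict], allowlist: set[str]) -> list[dict]:
    """Keep only the tools whose names appear in the allowlist."""
    known = {tool["name"] for tool in all_tools}
    missing = allowlist - known
    if missing:
        # Fail loudly so a renamed or removed tool does not vanish silently.
        raise ValueError(f"Allowlisted tools not found: {sorted(missing)}")
    return [tool for tool in all_tools if tool["name"] in allowlist]


if __name__ == "__main__":
    server_tools = [{"name": "create_issue"}, {"name": "delete_repo"}, {"name": "list_issues"}]
    print(curate_tools(server_tools, {"create_issue", "list_issues"}))
```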
Technical Constraints
While Tool Search solves the bloat issue, there are boundaries to keep in mind:
| Constraint | Limit |
|---|---|
| Maximum tools in catalog | 10,000 |
| Regex pattern length | 200 characters |
| Search results per query | 3-5 tools |
| Model support | Sonnet 4.5+, Opus 4.5+ (no Haiku) |
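The numeric limits in the table can be encoded as constants and checked up front. This is a sketch with a hypothetical helper name; since Claude writes the regex patterns itself, the pattern check is mainly useful when you test patterns locally.

```python
# Limits copied from the table above.
MAX_CATALOG_TOOLS = 10_000
MAX_REGEX_PATTERN_LENGTH = 200


def limit_violations(tool_count: int, regex_pattern: str = "") -> list[str]:
    """Return a description of each limit that is exceeded."""
    violations = []
    if tool_count > MAX_CATALOG_TOOLS:
        violations.append(
            f"Catalog has {tool_count} tools; the limit is {MAX_CATALOG_TOOLS}"
        )
    if len(regex_pattern) > MAX_REGEX_PATTERN_LENGTH:
        violations.append(
            f"Pattern is {len(regex_pattern)} characters; the limit is {MAX_REGEX_PATTERN_LENGTH}"
        )
    return violations
```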
The Bottom Line
Context bloat has been one of the biggest pain points when scaling MCP-powered agents. Claude’s Tool Search feature provides an elegant, dynamic solution—but it’s important to understand this is a Claude API capability, not an MCP protocol change.
If you’re using Claude (via Anthropic, Microsoft Foundry, Vertex AI, or Bedrock), enable Tool Search and start deferring your tools. Your context window will thank you.
If you’re on another platform, manual tool curation is your friend until similar features arrive. Every tool you disable is tokens saved.
Either way, the days of cramming 46,000 tokens of tool definitions into every request are finally behind us.