Anthropic’s Code Mode settles the MCP vs CLI debate: tools live in the runtime, tokens drop from 150K to 2K

ChainNewsAbmedia

2026-05-10 09:15:05

Throughout 2025, AI engineering communities argued endlessly over whether MCP or CLI is better suited to agent tool calling. In November 2025, Anthropic’s article “Code execution with MCP” reframed the problem from first principles. In a thread posted on 5/10, akshay_pachaar summarized the argument: the issue was never the protocol itself, but the old habit of stuffing the descriptions of every tool into the context at the start of a session. Anthropic’s solution is to have the model write code that calls the tools, while the runtime manages the tool details. The approach is called “Code Mode”.

The problem with the old mode: most of a 150K-token workflow goes unused

How the old MCP mode wastes tokens:

Playwright MCP: 13.7K tokens (loaded all at once)

Chrome DevTools MCP: 18K tokens

A five-server configuration: 55K tokens burned before any work begins

A single workflow run end to end: can bloat to 150K tokens

What the model actually uses: only a small part; most of what is loaded goes unused
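The figures above can be put side by side with a little arithmetic. This is only a rough sketch: the constants are the numbers quoted in the list, and the share calculation assumes the 55K tokens of tool definitions are counted inside the 150K workflow total, which the article does not state explicitly.

```typescript
// Token figures as quoted in the article.
const PLAYWRIGHT_TOKENS = 13700;
const CHROME_DEVTOOLS_TOKENS = 18000;
const FIVE_SERVER_TOKENS = 55000;
const FULL_WORKFLOW_TOKENS = 150000;

// Percentage that `part` represents of `whole`.
function shareOf(part: number, whole: number): number {
  return (part / whole) * 100;
}

// Two servers alone already cost this much before the first user message.
const twoServerTokens = PLAYWRIGHT_TOKENS + CHROME_DEVTOOLS_TOKENS;

// Assumption: the upfront definitions are part of the 150K workflow total.
const upfrontShare = shareOf(FIVE_SERVER_TOKENS, FULL_WORKFLOW_TOKENS);

console.log(`two servers: ${twoServerTokens} tokens`);
console.log(`five-server definitions: ${upfrontShare.toFixed(1)}% of the workflow`);
```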

Critics argue for switching to CLI, but CLI is error-prone in multi-tenant apps and lacks typed contracts, and an agent unfamiliar with a tool needs extra rounds to parse its plain-text output. Both sides have a point, but both have misdiagnosed the problem.

The solution: have the model write code that calls tools, instead of calling them directly from context

The core of Anthropic’s proposed “Code Mode”:

Flip the model’s role: the model no longer calls tools through context; instead, the model writes code and the runtime calls the tools

Tools live in the runtime, and the model only sees the parts it imports

Types follow the imports: the model imports a tool, and it gets that tool’s type contract

Call already-installed binaries via Bash (git, curl, etc.)

Use typed module imports to call proprietary APIs
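The points above can be sketched as a tiny in-memory runtime. This is a minimal illustration, not Anthropic’s implementation: `ToolDef`, `Runtime`, `importTool`, and the two registered tools are hypothetical names invented for the example. The idea it shows is that every tool is registered in the runtime, but only the type contract of an imported tool ever becomes visible to the model.

```typescript
// Hypothetical types and names, for illustration only.
interface ToolDef<A, R> {
  name: string;
  signature: string; // the type contract the model sees on import
  run: (args: A) => R;
}

class Runtime {
  private tools = new Map<string, ToolDef<any, any>>();
  private visible: string[] = []; // contracts the model has actually seen

  register<A, R>(tool: ToolDef<A, R>): void {
    this.tools.set(tool.name, tool);
  }

  // Importing a tool returns a callable and exposes only that tool's contract.
  importTool<A, R>(name: string): (args: A) => R {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    if (!this.visible.includes(tool.signature)) this.visible.push(tool.signature);
    return tool.run as (args: A) => R;
  }

  contextSeenByModel(): string[] {
    return [...this.visible];
  }

  registeredCount(): number {
    return this.tools.size;
  }
}

const runtime = new Runtime();
runtime.register({
  name: "crm.updateRecord",
  signature: "updateRecord(args: { id: string; notes: string }): { ok: boolean }",
  run: (args: { id: string; notes: string }) => ({ ok: args.notes.length > 0 }),
});
runtime.register({
  name: "browser.screenshot",
  signature: "screenshot(args: { url: string }): { bytes: number }",
  run: (_args: { url: string }) => ({ bytes: 0 }),
});

// The model-written code imports one tool and calls it.
const updateRecord = runtime.importTool<{ id: string; notes: string }, { ok: boolean }>(
  "crm.updateRecord",
);
const result = updateRecord({ id: "lead-1", notes: "call summary" });

console.log(result.ok, runtime.registeredCount(), runtime.contextSeenByModel().length);
```

Two tools are registered here, yet the model’s context holds a single signature, which is the property the list describes.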

Anthropic’s example: a Google Drive meeting transcript flows into a Salesforce CRM update. In the old approach, the schemas for both services’ tools are loaded up front and the entire transcript passes through the model twice, once when it is read and once when it is written. In the new approach, about 10 lines of TypeScript import only what is needed, and the same task shrinks from 150K tokens to 2K, a 98.7% reduction.
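A sketch of what such a script might look like follows. It is not Anthropic’s actual code: the `gdrive` and `salesforce` objects are in-memory stand-ins for the typed modules a runtime would supply, and every function and field name is an assumption made for illustration. The point is that the transcript moves between the two calls inside the runtime, and only a short summary string goes back to the model.

```typescript
// Stand-in for a typed Google Drive module (hypothetical API).
const gdrive = {
  async getDocument(args: { documentId: string }): Promise<{ content: string }> {
    // Fake a long transcript so the size difference is visible.
    return { content: `Transcript for ${args.documentId}: ` + "discussion ".repeat(2000) };
  },
};

// Stand-in for a typed Salesforce module (hypothetical API).
const updates: { recordId: string; notes: string }[] = [];
const salesforce = {
  async updateRecord(args: { recordId: string; notes: string }): Promise<{ ok: boolean }> {
    updates.push(args);
    return { ok: true };
  },
};

// The kind of short script the model would write in Code Mode.
async function syncTranscript(documentId: string, recordId: string): Promise<string> {
  const doc = await gdrive.getDocument({ documentId });
  const result = await salesforce.updateRecord({ recordId, notes: doc.content });
  // Only this summary returns to the model, not the transcript itself.
  return `updated=${result.ok} chars=${doc.content.length}`;
}

// Reduction quoted in the article: 150K tokens down to 2K.
function reductionPercent(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

syncTranscript("doc-123", "lead-42").then((summary) => console.log(summary));
console.log(`${reductionPercent(150000, 2000).toFixed(1)}% reduction`);
```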

Cloudflare pushed it to the limit: an API with 2,500 endpoints, compressed from 1.17M tokens to 1K

Cloudflare built the most aggressive version:

Original API scale: 2,500 endpoints, with schemas totaling 1.17M tokens

New approach: expose only two functions, search and execute, totaling 1K tokens

The agent first writes code to search the tool directory, then executes the matching tools

Compression ratio: over 1,000x
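The two-function pattern can be sketched roughly as follows. This is not Cloudflare’s implementation: the catalog entries, the `Endpoint` shape, and the substring matching are all assumptions chosen to keep the example small. It shows an agent discovering a tool with `search` and then running it with `execute`, without the full catalog ever entering its context.

```typescript
// Hypothetical endpoint catalog that lives in the runtime, not in the model's context.
interface Endpoint {
  name: string;
  description: string;
  handler: (args: Record<string, string>) => string;
}

const catalog: Endpoint[] = [
  {
    name: "dns.listRecords",
    description: "List DNS records for a zone",
    handler: (a) => `records for ${a["zone"]}`,
  },
  {
    name: "cache.purge",
    description: "Purge cached files for a zone",
    handler: (a) => `purged ${a["zone"]}`,
  },
  {
    name: "workers.deploy",
    description: "Deploy a Worker script",
    handler: (a) => `deployed ${a["script"]}`,
  },
];

// search: return only the names and descriptions that match the query.
function search(query: string): { name: string; description: string }[] {
  const q = query.toLowerCase();
  return catalog
    .filter((e) => `${e.name} ${e.description}`.toLowerCase().includes(q))
    .map(({ name, description }) => ({ name, description }));
}

// execute: run one endpoint by name.
function execute(name: string, args: Record<string, string>): string {
  const endpoint = catalog.find((e) => e.name === name);
  if (!endpoint) throw new Error(`no such endpoint: ${name}`);
  return endpoint.handler(args);
}

// Agent-written code: discover first, then call.
const hits = search("dns");
const first = hits[0];
const output = first ? execute(first.name, { zone: "example.com" }) : "no match";
console.log(output);

// Ratio implied by the article's figures: 1.17M tokens of schema vs 1K.
const compressionRatio = 1170000 / 1000;
console.log(`${compressionRatio}x`);
```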

The claim that “MCP is dead” is wrong. Anthropic has reported that MCP SDK downloads have reached 300 million, up from 100 million at the beginning of the year, which makes it one of the fastest-growing pieces of agent infrastructure right now. What is “dead” is the practice of loading every tool at once when a session starts, and that was a bad idea in the first place. For developers writing agents in 2026, the rule is simple: tool definitions belong in code, not in context; the model writes a few lines of code to make the call, and the runtime handles the rest.

Follow-ups worth tracking: how fast MCP SDK downloads keep growing beyond 300 million; whether Anthropic standardizes Code Mode as the officially recommended mode in the MCP spec; and how quickly other agent platforms such as OpenAI, Google, and Cursor adopt Code Mode.

This article, “Anthropic’s Code Mode settles the MCP vs CLI debate: tools live in the runtime, tokens drop from 150K to 2K”, first appeared on ChainNews ABMedia.
