Guillaume Lebedel
CTO
MCP: What's Working, What's Broken, and What Comes Next
In November 2024, Anthropic open-sourced the Model Context Protocol. Within a year, it became the standard for connecting AI models to external tools: 97 million monthly SDK downloads, 10,000+ active servers, and adoption by OpenAI, Microsoft, and Google.
The fundamental question persists: is MCP the right abstraction? Debate continues across Hacker News, Reddit, and developer blogs — some view it as AI tooling’s future, others as over-engineered middleware destined for obsolescence.
StackOne has spent a year building MCP servers, debugging authentication flows, and observing spec evolution. Here’s what works, what remains broken, and what comes next.
The Evolution of Model Context Protocol
November 2024: Anthropic releases MCP with JSON-RPC over SSE and basic tool definitions.
March 2025: Security incidents prompt OAuth 2.1 support and tool annotations.
June 2025: Authorization servers separate from resource servers following Confused Deputy vulnerability discoveries.
September 2025: Registry launch centralizes discovery for 10,000+ servers.
November 2025: Anthropic donates MCP to the Linux Foundation’s Agentic AI Foundation.
2026+: Code Mode and Skills enable agents to write code calling MCP tools.
MCP began as an experiment rather than a finished product. The original specification provided simple JSON-RPC over Server-Sent Events with basic tool definitions. Security incidents forced authentication to mature. Enterprise adoption exposed multi-tenant gaps. When Asana launched their MCP integration, researchers discovered a “Confused Deputy” vulnerability where servers cached responses without re-verifying tenant context.
Scale broke the original transport: SSE worked locally but failed under thousands of concurrent connections in production. Streamable HTTP replaced it. The ecosystem also needed discovery; by early 2025, thousands of community servers scattered across npm, PyPI, and GitHub repositories prompted the Registry launch.
What MCP Gets Right in 2025
Discovery is useful: AI agents ask “what can you do?” and receive structured answers immediately, with no hardcoded lists or stale documentation (the wire exchange is sketched below, after this list).
The abstraction fits simple cases: Connecting Claude to Notion or letting GPT-4 read GitHub issues becomes much simpler than building custom integrations.
Vendor neutrality matters: The same MCP server works with Claude, GPT-4, Gemini, and local models.
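For concreteness, here is roughly what that discovery exchange looks like on the wire, following the MCP specification’s `tools/list` method; the example tool and its schema are hypothetical:

```typescript
// The discovery handshake: the client sends a tools/list request and
// receives structured tool definitions back. No out-of-band docs needed.
const request = {
  jsonrpc: "2.0" as const,
  id: 1,
  method: "tools/list",
};

// A typical response: each tool carries a name, a human-readable
// description, and a JSON Schema describing its inputs.
const response = {
  jsonrpc: "2.0" as const,
  id: 1,
  result: {
    tools: [
      {
        name: "create_page", // hypothetical example tool
        description: "Create a new page in the workspace",
        inputSchema: {
          type: "object",
          properties: { title: { type: "string" } },
          required: ["title"],
        },
      },
    ],
  },
};
```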
What’s Still Missing
MCP Authentication Remains Problematic
The specification technically supports OAuth 2.1, but SDKs and reference implementations assume servers are also authorization servers. Enterprise IdP integration (Okta, Azure AD) requires workarounds. Token lifecycle management proves particularly painful when using third-party authorization — complex token mapping and tracking become necessary.
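A minimal sketch of the bookkeeping this forces on server authors when a third-party IdP issues the real tokens; every name here is illustrative, not part of any SDK:

```typescript
// Illustrative only: the token mapping a server takes on when the
// authorization server is an external IdP (Okta, Azure AD) rather
// than the MCP server itself.
interface UpstreamGrant {
  accessToken: string;  // token issued by the IdP
  refreshToken: string; // needed to renew it out-of-band
  expiresAt: number;    // epoch milliseconds
}

// Map each MCP session's bearer token to the IdP grant it stands for.
const grants = new Map<string, UpstreamGrant>();

async function upstreamTokenFor(sessionToken: string): Promise<string> {
  const grant = grants.get(sessionToken);
  if (!grant) throw new Error("unknown session: re-run the OAuth flow");

  // Refresh proactively; the MCP client never sees the IdP token,
  // so expiry must be handled entirely server-side.
  if (Date.now() > grant.expiresAt - 60_000) {
    const renewed = await refreshWithIdP(grant.refreshToken); // hypothetical
    grants.set(sessionToken, renewed);
    return renewed.accessToken;
  }
  return grant.accessToken;
}

declare function refreshWithIdP(refreshToken: string): Promise<UpstreamGrant>;
```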
Multi-Tenant Support Is Bolted On
MCP was designed for single-user, local deployment. Enterprise SaaS requires multi-tenant infrastructure where many users share resources while accessing only their data. The protocol lacks native tenant isolation concepts. Teams must build their own solutions for:
- Tool visibility (which tools can users see?)
- Permission boundaries (what actions are permitted?)
- Data isolation (preventing cross-tenant leaks)
The Asana vulnerability exemplified this: the server trusted cached context without re-verifying who was asking. MCP’s specification doesn’t prevent such attacks — teams build tenant isolation independently.
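What building it independently looks like in practice: a minimal sketch, with hypothetical names, of the per-request check the Asana-style bug skipped, re-deriving tenant context from the caller’s credential instead of trusting anything cached:

```typescript
// Hypothetical per-request guard: never serve a cached result without
// re-verifying that the current caller belongs to the tenant that
// produced it. This is the check the Confused Deputy pattern skips.
interface TenantContext {
  tenantId: string;
  userId: string;
}

const cache = new Map<string, { tenantId: string; payload: unknown }>();

async function handleToolCall(
  authToken: string,
  cacheKey: string,
  compute: (ctx: TenantContext) => Promise<unknown>,
): Promise<unknown> {
  // Re-derive identity from the credential on every request.
  const ctx = await verifyToken(authToken); // hypothetical verifier

  const cached = cache.get(cacheKey);
  if (cached) {
    // A cached entry is only valid for the tenant that created it.
    if (cached.tenantId !== ctx.tenantId) {
      throw new Error("cross-tenant access denied");
    }
    return cached.payload;
  }

  const payload = await compute(ctx);
  cache.set(cacheKey, { tenantId: ctx.tenantId, payload });
  return payload;
}

declare function verifyToken(token: string): Promise<TenantContext>;
```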
Admin Controls Don’t Exist
Deploying AI agents to teams raises administrative questions: which MCP servers can users connect to? Which tools are allowed per user or role? How do you audit agent actions? MCP offers no answers.
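So each team invents its own policy layer. A sketch of the kind of configuration that currently has to live outside the protocol; the shape is ours, since MCP defines nothing like it:

```typescript
// Hypothetical admin policy. Nothing in MCP specifies this shape,
// so every enterprise deployment reinvents it.
interface AgentPolicy {
  role: string;
  allowedServers: string[];               // which MCP servers may be connected
  allowedTools: Record<string, string[]>; // per-server tool allowlist
  auditLog: boolean;                      // record every agent action
}

const policies: AgentPolicy[] = [
  {
    role: "engineer",
    allowedServers: ["github", "notion"],
    allowedTools: { github: ["list_issues", "create_pr_comment"] },
    auditLog: true,
  },
  {
    role: "support",
    allowedServers: ["zendesk"],
    allowedTools: { zendesk: ["search_tickets"] },
    auditLog: true,
  },
];
```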
Tool Discovery Needs Work
The Registry helps clients find servers, but there’s no standard for agents to discover tools within a session based on context. Applications must embed this logic themselves; a built-in, context-aware discovery mechanism would help agents surface only the tools relevant to the task at hand.
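The common stopgap is client-side filtering. A minimal sketch, with a deliberately naive keyword score, of ranking discovered tools against the current task before any definitions reach the model:

```typescript
// Hypothetical client-side stopgap: score each discovered tool against
// the current task and expose only the top matches to the model.
interface ToolDef {
  name: string;
  description: string;
}

function relevantTools(task: string, tools: ToolDef[], limit = 5): ToolDef[] {
  const words = new Set(task.toLowerCase().split(/\W+/));
  const scored = tools.map((tool) => {
    const text = `${tool.name} ${tool.description}`.toLowerCase();
    let score = 0;
    for (const word of words) {
      if (word && text.includes(word)) score += 1;
    }
    return { tool, score };
  });
  return scored
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((s) => s.tool);
}
```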
MCP Notifications Are Immature
MCP operates on request-response: clients ask, servers answer. Real integrations need push capabilities — new emails arrived, builds finished, PR comments posted. While specification hooks exist, implementation guidance remains thin and tooling support inconsistent.
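Where hooks do exist, they are narrow. A sketch of the resource-subscription exchange as the spec defines it (the mail:// URI is illustrative); note that the notification carries only a URI, so the client must re-read the resource to learn what actually changed:

```typescript
// The spec's subscription hooks, sketched as raw JSON-RPC messages
// (method names per the MCP resource-subscription spec).

// The client opts in to updates for one resource:
const subscribe = {
  jsonrpc: "2.0" as const,
  id: 7,
  method: "resources/subscribe",
  params: { uri: "mail://inbox" }, // illustrative URI
};

// The server later pushes a notification: no id, no response expected.
// It names the resource but not the change, so the client re-fetches.
const updated = {
  jsonrpc: "2.0" as const,
  method: "notifications/resources/updated",
  params: { uri: "mail://inbox" },
};
```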
Developer Sentiment on MCP
Community feedback is mixed. A recurring complaint is the low barrier to publishing: anyone can ship an MCP server, and many published servers are poorly implemented. Before Code Mode, developers also complained that context windows filled with tool definitions; loading 50 tools burned 100K+ tokens before a conversation started.
Security researchers discovered prompt injection vulnerabilities in early implementations: tool poisoning, sandbox escapes. While the specification has improved, the community server ecosystem hasn’t fully caught up.
The debate persists: MCP is either the future of AI tooling or an elaborate over-engineering exercise. The protocol adds abstraction between models and tools — whether costs justify benefits depends on use cases.
The Next Evolution: Code Mode
The debate shifts from “MCP versus nothing” to “how agents should call MCP tools.”
Traditional approach: load all tool definitions into context, let models pick tools, feed results back. This works but proves expensive — thousands of tools mean hundreds of thousands of context tokens.
Emerging approach: Code Mode. Instead of calling tools directly, agents write code that calls tools. Cloudflare’s Agents SDK converts MCP schemas into TypeScript APIs, so LLMs write `await mcp.notion.createPage({ title: "..." })` rather than emitting special tool-calling tokens.
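A hedged sketch of what that conversion might produce (the names are ours, not Cloudflare’s): a tool’s JSON Schema surfaces as an ordinary typed function, so the model writes code of a kind it has seen constantly in training.

```typescript
// Illustrative only: how a tool definition might surface as a typed API.
// The MCP tool schema below becomes the TypeScript signature that follows.
//
//   { name: "createPage",
//     inputSchema: { type: "object",
//                    properties: { title: { type: "string" } },
//                    required: ["title"] } }

interface NotionApi {
  createPage(args: { title: string }): Promise<{ pageId: string }>;
}

interface Mcp {
  notion: NotionApi;
}

// What the model actually writes:
async function run(mcp: Mcp) {
  const { pageId } = await mcp.notion.createPage({ title: "Q3 roadmap" });
  console.log(`created ${pageId}`);
}
```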
Comparison
Direct Tool Calling:
- Context overhead: all tool definitions loaded upfront (150K+ tokens)
- Intermediate results: every output passes through model
- Control flow: loops and conditionals require multiple roundtrips
- Training data: LLMs see less tool-calling syntax than code
Code Mode:
- On-demand discovery: tools loaded when needed (2K vs 150K tokens)
- Local execution: filter and aggregate data before returning (see the sketch after this list)
- Native control flow: loops, conditionals, error handling run as code
- Better performance: LLMs excel at writing TypeScript
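A sketch of those middle rows in practice, using hypothetical generated bindings: the loop, the filter, and the aggregation all run in the sandbox, and only a small summary re-enters the model’s context.

```typescript
// Hypothetical agent-written code: iterate, filter, and aggregate
// locally. With direct tool calling, each iteration would be a model
// roundtrip and every raw result would land in the context window.
async function summarizeOpenBugs(mcp: {
  github: {
    listIssues(args: { repo: string }): Promise<{ title: string; labels: string[] }[]>;
  };
}) {
  const repos = ["api", "web", "mobile"];
  const counts: Record<string, number> = {};

  for (const repo of repos) {
    const issues = await mcp.github.listIssues({ repo });
    // Filter before anything returns to the model.
    counts[repo] = issues.filter((i) => i.labels.includes("bug")).length;
  }

  // Only this small summary re-enters the context window.
  return counts;
}
```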
Anthropic’s research shows this approach reduces token usage by 98.7%. Code Mode doesn’t replace MCP — it makes MCP more efficient by having agents write code calling MCP tools.
What Needs to Happen for Enterprise Adoption
- First-class multi-tenant support: Tenant isolation, per-user permissions, and admin controls should be protocol-level concepts.
- Better IdP integration: SDKs should treat enterprise auth (Okta, Azure AD) as the default, not a workaround.
- Context-aware discovery: Agents see relevant tools based on current context.
- Unified observability: Standard ways to trace agent actions, log decisions, and audit behavior.
The Linux Foundation Move: Too Early?
In November 2025, Anthropic donated MCP to the Linux Foundation’s Agentic AI Foundation, framing the move as vendor-neutral governance that secures the protocol’s long-term future.
The concern: MCP isn’t ready for stasis. Foundation governance slows development — that’s intentional. But MCP still contains fundamental gaps: multi-tenancy, admin controls, context-aware discovery, real-time notifications. These aren’t edge cases; they’re enterprise blockers.
Foundation governance means committees, consensus, and careful deliberation. Protocol evolution proceeds at the speed of agreement, not the speed of need. Competitors with centralized control iterate faster and ship features sooner.
Is Anthropic stepping back because they recognize MCP won’t be the winner? The handoff may signal that MCP serves some use cases rather than all AI-tool interaction. By transferring stewardship to the Foundation, Anthropic can focus on making Claude better without managing a protocol that may capture only part of the market.
MCP might become the USB-C of AI tooling: widely adopted, good enough for most cases, but not the only connector. Or it may ossify while something more nimble emerges.
Where StackOne Is Betting: MCP vs Code Mode
Both paths coexist:
MCP for structured, high-trust integrations: Enterprise workflows requiring explicit permissions, audit logs, and vendor-neutral tooling.
Code Mode for exploratory, developer-centric use cases: When agents need flexibility and broader system access is acceptable.
Winners won’t choose sides — they’ll understand when to use which approach.
The Missing Layer: Agent Skills for MCP
MCP provides technical plumbing connecting models to tools but doesn’t specify when to use which tools or how to combine them for specific outcomes.
Skills fill this gap as the abstraction layer making MCP usable: they encapsulate MCP servers and tools into outcome-oriented packages without overloading context.
The Agent Skills specification defines skills as reusable capability modules. Drop a skill file and agents gain new domains: code review patterns, deployment workflows, research methodologies. Skills specify which MCP servers to use, what tools to invoke, and how to combine them.
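To make that concrete, here is a hypothetical skill descriptor, written as TypeScript for illustration; the actual Agent Skills format differs, but the ingredients match: an outcome, the MCP servers it needs, and the procedure for combining their tools.

```typescript
// Hypothetical skill descriptor, in TypeScript for concreteness.
// The real Agent Skills format is file-based; the ingredients are
// the same: an outcome, the servers involved, and the procedure.
const deployReviewSkill = {
  name: "deployment-review",
  description: "Review a release candidate and gate the deployment",
  // Only these servers' tools enter context when the skill activates.
  mcpServers: ["github", "ci"],
  tools: ["github.list_pr_comments", "ci.get_build_status"],
  steps: [
    "Fetch the release PR's unresolved review comments",
    "Check the latest build status for the release branch",
    "Block the deploy if comments are unresolved or the build failed",
  ],
};
```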
Skills prove interesting for two reasons:
Use-case framing: MCP describes tools technically; skills describe outcomes. Agents reason better about outcomes than API specifications.
Context isolation: Agents with 50 MCP servers face 50 competing tool definitions. Skills load only what’s relevant — activate deployment skills, get deployment tools only. The rest stays out of context until needed.
Skills won’t replace MCP; they make it more accessible. MCP provides standard interfaces; skills provide higher-level composition making those interfaces useful.