
Guillaume Lebedel

CTO

We Built a Unified API. Here's Why We Tore It Down for Agents.

14 min read

We started StackOne as a unified API company.

If you’re not familiar with the model: take dozens of SaaS platforms in a category (say, HR systems like Workday, BambooHR, SAP), normalize their data models into a single schema, and let developers write one integration that works with all of them. Standard play. Merge, Finch, Kombo, and others built entire businesses around this. So did we.

We had paying customers. Growing revenue. A team of integration engineers building connectors. The unified API model made sense for its intended audience: developers writing deterministic code paths where the consumer was software, not intelligence.

Then agents showed up, and everything broke.

The promise of unified APIs

The unified API pitch makes sense. You have 200 HR systems, each with different endpoints, authentication flows, field names, and pagination styles. Instead of building 200 integrations, you build one. A list_employees call returns the same shape regardless of whether the underlying system is Workday’s SOAP API or BambooHR’s REST API.

For traditional software, this works. Your application code doesn’t care that Workday calls it Worker and BambooHR calls it Employee. You just need a consistent first_name, last_name, department to populate a dashboard or run a sync job.
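
To make that concrete, here's a minimal sketch of what the model looks like to a developer. The endpoint and field names are illustrative, not any specific vendor's:

```typescript
// Minimal sketch of the unified API model. Endpoint and names are illustrative.
interface UnifiedEmployee {
  id: string;
  first_name: string;
  last_name: string;
  department: string;
}

// One call, one shape, regardless of whether the connected system is
// Workday (SOAP) or BambooHR (REST); the unified layer does the translation.
async function listEmployees(accountId: string): Promise<UnifiedEmployee[]> {
  const res = await fetch("https://unified.example.com/hris/employees", {
    headers: { "x-account-id": accountId },
  });
  const body = (await res.json()) as { data: UnifiedEmployee[] };
  return body.data;
}
```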

The unified API handles the translation. You ship faster. Your customers connect their tools. Everyone wins.

Until your consumer stops being a for loop and starts being an LLM.

Where unified APIs fail agents

We didn’t realize this all at once. It started as friction in customer conversations, graduated to technical limitations we couldn’t work around, and eventually became a strategic conviction that forced us to rebuild our core engine.

Here’s what went wrong.

Agents think in the language of the underlying system

When a user tells an agent “find all open reqs in Greenhouse,” they’re thinking in Greenhouse terms. They know what a “req” is in Greenhouse. They know the fields, the filters, the status labels.

A unified API translates req into a generic job object. It renames opening_id to job_id. It maps Greenhouse-specific statuses into a normalized enum. The agent receives data that doesn’t match what the user said, what documentation describes, or what the LLM was trained on.

This creates an interpretation gap. The LLM has to translate twice: once from the user’s intent to the unified schema, and again from the unified response back to the user’s mental model. Every translation is a chance to lose meaning.
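
A side-by-side sketch makes the gap visible (field names simplified for illustration; the real Greenhouse payload differs):

```typescript
// What the user (and the model's training data) expects: Greenhouse terms.
const greenhouseOpening = {
  opening_id: "123",
  status: "open", // Greenhouse's own status label
};

// What the agent receives from a unified API instead.
const unifiedJob = {
  job_id: "123",    // opening_id, renamed
  status: "ACTIVE", // mapped into a normalized enum
};
// The LLM translates "open reqs" -> unified schema on the way in, then
// unified response -> Greenhouse vocabulary on the way out. Two hops,
// two chances to lose meaning.
```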

The less interpretation an LLM has to do between what the user meant and what gets called, the better. Unified APIs maximize that interpretation distance.

The lowest common denominator problem

This is the one everyone warns you about, and it’s worse than it sounds.

A unified API can only expose fields that exist (or have reasonable equivalents) across all providers it supports. Add a new provider with fewer capabilities, and your unified model gets smaller. Support 10 systems and you get a decent model. Support 50 and you’re down to the intersection of what they all share.

Nango’s analysis puts it directly: the more apps supported within a unified API, the more limiting the unified API becomes. It’s an inverse relationship between breadth and depth.

For agents, depth is everything. An HR agent needs to understand that Workday tracks compensation_grade differently from BambooHR’s pay_rate. A recruiting agent needs Greenhouse’s scorecard structure, not a flattened evaluation_score field. When you strip provider-specific fields to maintain a unified model, you strip the agent’s ability to do useful work.

Some things can’t be unified

Greenhouse has structured keyword search. Workday uses XPath-based queries. Jira has JQL, a full query language with operators, functions, and nesting. Salesforce has SOQL, which is basically SQL for objects.

How do you unify JQL and free-text search? You can’t, not without destroying what makes JQL useful. A unified search endpoint that accepts a string query works for BambooHR but lobotomizes Jira. The agent loses the ability to write project = BACKEND AND status changed FROM "In Progress" TO "Done" AFTER -7d.

Some capabilities are fundamentally incompatible. A unified API has to pick a lowest common denominator, and that denominator is usually “simple text search” when what the user wanted was a structured query that the underlying system supports natively.

Depth in the wrong places

Unified APIs do offer depth, sometimes. But it’s often depth that serves the API design, not the agent consuming it.

Expand parameters, nested include paths, complex filter syntax with dozens of operators. These features exist because the unified API is trying to compensate for its own abstraction. “Yes, we normalized the model, but you can use expand=manager.department.cost_center to get the data you actually needed.”

An agent calling a unified API now needs to understand a query language that isn’t the underlying system’s language and isn’t the user’s language. It’s a third vocabulary invented by the unified API provider. The LLM wasn’t trained on it. There’s no Stack Overflow post explaining the nuances. The agent guesses, and often guesses wrong.

Token waste at every layer

This is where unified APIs become actively harmful for agents.

A single employee record from Workday’s native SOAP API runs approximately 17,000 tokens. StackOne’s unified employee object compresses this to about 600 tokens. That 28x reduction sounds like a win, and for traditional API consumers, it is.

But agents have a harder constraint: reasoning quality degrades well before you hit the context window limit. Chroma’s research on context rot found that model performance becomes increasingly unreliable as input length grows. Drew Breunig’s analysis measured correctness beginning to fall around 32K tokens.

A unified API returns all normalized fields whether the agent needs them or not. An agent asking “what department is Alice in?” gets back her full compensation history, emergency contacts, and employment timeline. Multiply this by a list call that returns 100 records, and you’ve burned 60,000 tokens on data the agent will never reference.

The problem isn’t that unified APIs are verbose. It’s that they have no mechanism to return only what matters for the specific task. They were built for developers who would pick the fields they need in application code. Agents don’t have that luxury. Every token goes into the context window.

Contradicting what the model knows

LLMs are trained on vast amounts of API documentation. Claude, GPT-4, Gemini: they’ve all seen Greenhouse’s API docs, Workday’s developer guides, Salesforce’s reference. They know what a Requisition is in Greenhouse. They know Workday’s Worker object structure.

When you route these models through a unified API that renames Worker to Employee, changes field names, restructures nested objects, and maps statuses to different enums, you’re actively fighting the model’s training data. The agent knows that Workday calls it Worker. Your unified API insists it’s Employee. The model has to learn a new mapping that contradicts what it already knows.

This isn’t theoretical. We watched agents consistently fail on nested query parameters. Connectors that declared their query params as HTTP in: query ended up exposing nested query: { query: "..." } tool schemas, which LLMs misinterpreted, causing invalid_type errors. The fix was schema flattening: making tool schemas match what the model expects, not what the API designer thought was clean.
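
Here's a sketch of the failure and the fix, with simplified JSON Schema fragments (not our exact production schemas):

```typescript
// Before: the connector exposed its HTTP query params as a nested object,
// so the tool schema the model saw looked like this. Models often passed a
// plain string instead, triggering invalid_type errors.
const nestedSchema = {
  type: "object",
  properties: {
    query: {
      type: "object",
      properties: { query: { type: "string" } },
    },
  },
} as const;

// After flattening: the shape the model expects from its training data.
const flattenedSchema = {
  type: "object",
  properties: {
    query: { type: "string" },
  },
} as const;
```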

Limited to what APIs can do

Unified APIs are, by definition, API layers on top of other APIs. If the underlying platform doesn’t expose a capability through its API, neither does your unified endpoint.

But agents aren’t limited to APIs. An agent with browser-use capabilities can navigate a UI, fill forms, click buttons, and extract data from screens. An agent with file access can parse exports, read CSVs, process PDFs. Some of the most valuable enterprise automations involve systems that have incomplete or nonexistent APIs.

A unified API can never expose what the underlying API doesn’t offer. An agent-first integration layer can use whatever interface gets the job done.

What unified APIs got right

I’m not here to say unified APIs are worthless. After years of building one, I know exactly what they do well. And some of those capabilities matter more for agents, not less.

Authentication is a solved problem

Managing OAuth flows, token refreshes, API key rotation, and SAML handshakes across 200+ platforms is brutal. A unified API that abstracts all authentication behind a single credential eliminates an entire category of agent infrastructure complexity.

This is hard to replicate. Every agent builder who tries to connect directly to enterprise SaaS immediately hits the authentication wall. Unified APIs solved this, and that solution transfers directly to the agentic world.

Deep system knowledge

Years of building integrations teach you things that aren’t in the docs. Rate limit quirks, undocumented pagination behaviors, field-level gotchas that only surface at scale. This institutional knowledge is valuable regardless of whether you expose a unified schema or native actions.

When we built Falcon (our new engine), we didn’t throw away this knowledge. We carried it forward into actions that account for each system’s specific behaviors.

Compound actions and workflows

Not everything an agent needs to do maps to a single API call. “Transfer an employee from BambooHR to Workday” involves reading from one system, transforming data, writing to another, and handling the dozen edge cases in between.

Unified APIs that offer workflow-level operations (not just CRUD endpoints) save real time. The goal shouldn’t be unifying the data model. It should be creating multi-step actions that are useful for agents: compound operations that handle orchestration, error recovery, and data transformation in a single tool call.
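
A sketch of what such a compound action might look like. All the client names here are hypothetical stubs, standing in for real connectors:

```typescript
// Hypothetical client stubs; real connectors would sit behind these.
declare const bambooHr: {
  getEmployee(id: string): Promise<Record<string, unknown>>;
};
declare const workday: {
  createWorker(w: Record<string, unknown>): Promise<{ workerId: string }>;
};
declare function toWorkdayWorker(
  e: Record<string, unknown>
): Record<string, unknown>;

// One tool call for the agent; read, transform, and write underneath.
async function transferEmployee(
  employeeId: string
): Promise<{ workerId: string }> {
  const source = await bambooHr.getEmployee(employeeId); // read from source
  const worker = toWorkdayWorker(source); // reshape to Workday's native model
  return workday.createWorker(worker);    // write to target
}
```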

Standardized behavior for predictable operations

Pagination is a great example. Every API paginates differently: cursor-based, offset-based, page tokens, link headers. Standardizing pagination behavior so an agent can reliably page through results without learning each system’s approach? That’s useful.

Same for filtering output. If every tool supports a consistent way to select fields, limit response size, and truncate large payloads, agents can manage their context budget without per-tool logic. This isn’t data model unification. It’s behavior standardization, and the distinction matters.
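
A sketch of what behavior standardization might look like in practice, with illustrative parameter names:

```typescript
// Every tool accepts the same paging and selection params, even though
// each connector maps them to its provider's native mechanism (cursors,
// offsets, page tokens, link headers) internally.
interface StandardListParams {
  cursor?: string;   // opaque continuation token
  limit?: number;    // max records per page
  fields?: string[]; // field selection, to shrink responses
}

interface StandardListResponse<T> {
  items: T[];
  next_cursor?: string; // present only when more pages exist
  truncated: boolean;
}
```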

What agents actually need

After eighteen months of building for agents (and twelve months of watching unified API assumptions break), here’s what we’ve found works.

Native-first actions, designed for the system

Instead of mapping everything to a unified schema, build actions that reflect how each system actually works. A Greenhouse action should return Greenhouse objects with Greenhouse field names. A Workday action should use Workday’s terminology.

The agent doesn’t need a universal Employee object. It needs a clear, well-documented workday_get_worker action that returns exactly what Workday returns, structured for LLM consumption.
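
For illustration, a native-first tool definition might look like this (a sketch, not Falcon's actual spec). The schema speaks Workday's language, so the model can lean on what it already knows from Workday's documentation:

```typescript
const workdayGetWorker = {
  name: "workday_get_worker",
  description:
    "Fetch a single Worker from Workday by Worker ID. Returns Workday's " +
    "native Worker structure, trimmed for LLM consumption.",
  inputSchema: {
    type: "object",
    properties: {
      worker_id: { type: "string", description: "Workday Worker ID" },
    },
    required: ["worker_id"],
  },
};
```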

This means the agent can use what it already knows about the system. Anthropic’s research on tool design confirms what we’ve observed: tools that match the model’s training distribution perform better than tools that introduce novel abstractions.

Context optimization at every layer

Agents don’t need all the data. They need the right data.

We built a progressive solution stack for this:

Metadata surfacing. Before fetching anything, the agent can ask “how big would this response be?” A dry-run mode returns record counts and token estimates. The agent decides whether to proceed, filter, or delegate to a sub-agent.

Server-side filtering. Instead of returning 500 employee records and letting the agent figure out which ones matter, allow the agent to filter at the source. Select specific fields. Apply conditions. The filtering happens before data enters the context window.

Token-aware responses. Include total_count, returned_count, truncated, and estimated_tokens_per_item in response metadata. The agent knows exactly what it got and what it’s missing.

Code mode execution. Let agents write code that processes tool responses in a sandbox, returning only extracted results. Our internal testing shows this can reduce token usage by up to 80%.
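
As a sketch of how the metadata and token-awareness pieces fit together, a response envelope might look like this (field names illustrative):

```typescript
// Token-aware response metadata: the agent knows exactly what it got,
// what it's missing, and what a full fetch would cost in context.
interface TokenAwareResponse<T> {
  items: T[];
  total_count: number;               // matching records at the source
  returned_count: number;            // records actually included
  truncated: boolean;                // true if results were cut short
  estimated_tokens_per_item: number; // lets the agent budget before paging
}
```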

None of these require a unified data model. They require tool design that treats context as a scarce resource.

Surfacing useful data for decisions

An agent planning a compensation review doesn’t need a complete employee record. It needs: name, current salary, last review date, manager, performance rating. Five fields, not fifty.

Agent-first tools surface data that helps the LLM take its next action. This means different tools for different intents. get_employee_compensation_summary is more useful than get_employee with an expand parameter. The tool name tells the agent what it’s getting. The response contains only what’s relevant.
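
The entire response of such a tool can be a handful of fields (a hypothetical shape, matching the compensation-review example above):

```typescript
// Five fields, not fifty. Everything in the response is decision-relevant.
interface EmployeeCompensationSummary {
  name: string;
  current_salary: number;
  last_review_date: string; // ISO 8601
  manager: string;
  performance_rating: string;
}
```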

This is the opposite of the unified API philosophy, which tries to be everything through one endpoint. Agent-first design builds many focused actions, each returning exactly what a specific task needs.

Dynamic tool discovery

With 10,000+ actions across hundreds of connectors, you can’t load every tool definition into the context window. At StackOne, we built meta-tools: a search layer that lets agents discover relevant actions based on what they’re trying to do.

The agent asks “what can I do with Greenhouse?” and gets back a filtered list of applicable actions, scoped to what the connected account has permissions for. This is closer to how a developer explores an API than how a unified API client works.
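
A sketch of what such a discovery meta-tool might look like (names illustrative, not our exact interface):

```typescript
// The agent searches for what it needs and receives only the matching,
// permission-scoped action definitions, instead of loading 10,000+
// schemas up front.
const searchActions = {
  name: "search_actions",
  description:
    "Find available actions for a task, scoped to what the connected " +
    "account has permissions for.",
  inputSchema: {
    type: "object",
    properties: {
      query: {
        type: "string",
        description: "e.g. 'list open jobs in Greenhouse'",
      },
      connector: {
        type: "string",
        description: "optional, e.g. 'greenhouse'",
      },
    },
    required: ["query"],
  },
};
```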

Testable, reliable, measurable

Building native-first actions doesn’t mean abandoning quality. Each action in our Falcon engine is defined in YAML, generates deterministic RPC/REST and AI tool interfaces from the same spec, and goes through behavioral testing.

We benchmark how accurately models select and call tools across thousands of task scenarios. Accuracy and token efficiency are tracked as first-class metrics.

This is possible precisely because actions are well-scoped and native. A focused greenhouse_list_open_jobs action is easier to test than a generic list_jobs endpoint that behaves differently depending on which provider is connected.
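
For a sense of what a behavioral test scenario looks like, here's a sketch (the structure is illustrative; our harness tracks more than this):

```typescript
// Each scenario checks that a model, given a task, selects the expected
// tool with the expected arguments. Accuracy and token counts across
// thousands of scenarios become first-class metrics.
interface ToolSelectionScenario {
  task: string;         // natural-language task given to the model
  expectedTool: string; // the tool a correct run should call
  expectedArgs: Record<string, unknown>;
}

const scenario: ToolSelectionScenario = {
  task: "List all open jobs in Greenhouse",
  expectedTool: "greenhouse_list_open_jobs",
  expectedArgs: {},
};
```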

Why we made the call

Rebuilding wasn’t a theoretical exercise. We had paying customers on the unified API. Revenue. Contracts. An integration engine built in TypeScript that had been refined over years.

We built Falcon anyway. A new engine, defined in YAML, that generates native-first actions purpose-built for agents. It meant rewriting connectors, migrating customers, and deliberately winding down the unified API while building the replacement.

The speed difference justified the rebuild. Our old engine added integrations linearly: each one required custom TypeScript, manual mapping, QA cycles. Falcon adds new integrations in days, not weeks. Enterprise customers evaluating us against traditional unified API providers consistently chose StackOne because we offer real-time data access directly to underlying platforms and support for agentic use cases that pure unified APIs can’t cover.

GV led our Series A. Their investment thesis wasn’t “better unified API.” It was “agentic integration infrastructure.”

What this means for the market

The unified API model served its generation well. For traditional SaaS integrations, where the consumer is deterministic application code, normalizing data across providers saves real engineering time.

But the consumer is changing. Agents don’t want normalized data. They want native actions, context-aware responses, and tools that match what they already know about the underlying system. The less translation between user intent and tool behavior, the better the agent performs.

If you’re building agents that need to interact with enterprise SaaS, here’s what I’d evaluate:

  1. Does the tool preserve the underlying system’s semantics? Or does it introduce a new vocabulary the model has to learn?
  2. Can the agent control how much data it receives? Filtering, field selection, dry-run modes, token estimates.
  3. Are actions scoped to specific intents? Or is it one big CRUD endpoint with complex parameters?
  4. Can the agent discover available tools dynamically? Without loading thousands of schemas into context?
  5. Is the integration limited to what the API offers? Or can it use multiple interfaces (API, browser, file) to accomplish the task?

Unified APIs will continue to exist for traditional integration use cases. But for agents, the architecture needs to be different. We learned this the hard way, by building both.