# I Tried to Deploy My MCP Server to Vercel. Here's What Actually Happened.


Source: Dev.to

I built a working MCP server. It connected to my database, returned tool results, and worked flawlessly in Claude Desktop locally. Then I pushed to Vercel. 500 errors everywhere:

```text
TypeError: Cannot read properties of undefined (reading 'addEventListener')
```

The MCP adapter was trying to use persistent SSE connections inside ephemeral serverless functions. Everything broke, and it wasn't obvious why or how to fix it. I wasn't alone: this is a known, documented problem across the community.

## Why MCP and Serverless Don't Get Along

MCP was designed for long-lived processes. The original spec only supported two transports: stdio (local-only) and SSE (persistent server-sent events over HTTP). Both assume the server stays alive between calls.

Vercel Functions don't work that way. Each request can land on a different function instance. Memory is ephemeral. There's no persistent filesystem. And SSE connections stored in memory are gone on the next cold start.

The result is a mess developers across Reddit, GitHub, and dev.to have been hitting for months:

- **SSE connections drop.** The session lives in memory on instance A. The next request hits instance B. Session not found.
- **`autoDiscover()` fails silently.** It scans directories at boot; Vercel has no persistent filesystem.
- **Cold starts waste CPU.** Zod reflection, schema generation, and Presenter compilation run from scratch on every cold invocation.
- **The transport bridge breaks.** The official MCP SDK's `StreamableHTTPServerTransport` expects a Node.js `http.IncomingMessage`; the Vercel Edge Runtime uses Web Standard `Request`/`Response`. Manually bridging them is fragile and often breaks.
- **The adapter's `disableSSE: true` doesn't exist** as a property in `ServerOptions`. You're stuck.

The MCP protocol spec itself acknowledges this: statelessness and horizontal scaling are on the official roadmap as unresolved challenges. A GitHub discussion from the core team literally says: "I'm building a hosting platform for deploying MCPs and SSE makes it hard to scale remote MCPs because we can't use serverless."

This isn't a niche edge case. It's the default experience for anyone who tries to ship an MCP server to a modern deployment platform.
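The "session not found" failure mode is easy to reproduce without any MCP code at all. Here is a minimal, dependency-free sketch (hypothetical names, not any real SDK) of why module-level session state cannot survive serverless routing:

```typescript
// Hypothetical sketch of the serverless SSE failure mode, not real SDK code.
// A module-level map survives warm invocations on the SAME instance,
// but every other instance (and every cold start) gets its own empty copy.
type Session = { id: string; createdAt: number };

const sessions = new Map<string, Session>();

export function openSession(id: string): Session {
  const s: Session = { id, createdAt: Date.now() };
  sessions.set(id, s); // stored only in THIS instance's memory
  return s;
}

export function resumeSession(id: string): Session | undefined {
  // If the follow-up request landed on a different instance,
  // the map is empty and this returns undefined: "Session not found".
  return sessions.get(id);
}
```

On a single long-lived process `resumeSession` always works, which is exactly why the bug never shows up in local development and only appears after deployment.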
## The Deeper Problem: Even When It Deploys, the AI Still Guesses

Let's say you do get past the deployment wall. You still have a second, subtler problem. Most MCP servers today look like this:

```typescript
handler: async ({ input }) => {
  return JSON.stringify(await db.query('SELECT * FROM invoices'));
}
```

Raw JSON. No context. No rules. No hints on what to do next. The LLM receives `amount_cents: 45000` and has to guess: is that dollars? Cents? Yen? It receives 3,000 invoice rows and burns your entire context window. It receives a `stripe_payment_intent_id` and a `password_hash` that were never meant to leave your database.

These aren't prompt engineering problems you solve with longer system prompts. They're architecture problems: the handler is doing too much, and the AI is receiving too little structure.

The two problems are related. Developers struggling to deploy their MCP server often never get far enough to realize the data layer is also broken.
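The cure for leaking fields is an allowlist: only explicitly declared fields ever cross the wire. As a dependency-free sketch of that idea (a hypothetical helper, not the MCP Fusion API):

```typescript
// Hypothetical allowlist projection, NOT the MCP Fusion API.
// Undeclared fields are invisible by default, so a new database
// column can never leak to the agent by accident.
type Row = Record<string, unknown>;

const DECLARED = ['id', 'customer', 'amount_cents', 'status'] as const;

export function project(row: Row): Row {
  const out: Row = {};
  for (const key of DECLARED) {
    if (key in row) out[key] = row[key]; // copy only declared fields
  }
  return out; // password_hash, stripe_secret, etc. are never copied
}
```

Contrast this with a denylist (`delete row.password_hash`), which silently fails the moment someone adds a new sensitive column.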
## MCP Fusion: A New Architecture for Agentic MCP Servers

MCP Fusion is an architecture layer built on top of the MCP SDK. It introduces the MVA pattern (Model-View-Agent): a deliberate separation of three concerns that raw MCP servers collapse into one function. The handler doesn't check auth. It doesn't filter fields. It doesn't format anything. It just returns data. The framework handles everything else through a deterministic pipeline:

```text
contextFactory
  → middleware chain
  → Zod input validation
  → handler (raw data)
  → Presenter pipeline:
      1. agentLimit()     → truncate before context window overflow
      2. Zod schema       → strip undeclared fields (allowlist, not denylist)
      3. rules()          → attach contextual domain instructions
      4. suggestActions() → HATEOAS affordances: valid next actions with pre-filled args
      5. uiBlocks()       → server-rendered charts/diagrams (ECharts, Mermaid)
  → Agent receives structured package
```

Here's what this looks like in practice:

```typescript
import { initFusion, ui } from '@vinkius-core/mcp-fusion';
import { z } from 'zod';

interface AppContext { db: PrismaClient; user: { role: string; tenantId: string } }

const f = initFusion<AppContext>();

// Auth middleware — runs before EVERY tool that declares it
const auth = f.middleware(async (ctx) => {
  const payload = await verifyJWT((ctx as any).rawToken);
  const user = await prisma.user.findUniqueOrThrow({ where: { id: payload.sub } });
  return { db: prisma, user };
});

// Presenter — the perception layer
const InvoicePresenter = f.presenter({
  name: 'Invoice',
  schema: z.object({
    id: z.string(),
    customer: z.string(),
    amount_cents: z.number().describe('Amount in CENTS — divide by 100 for display'),
    status: z.enum(['draft', 'sent', 'paid', 'overdue']),
    // password_hash, stripe_secret, profit_margin → GONE before the wire
  }),
  rules: (inv) => [
    inv.status === 'overdue'
      ? 'Invoice is overdue. Send a reminder before any other action.'
      : null,
  ],
  suggestActions: (inv) => [
    inv.status === 'draft' ? { tool: 'billing.send', args: { id: inv.id } } : null,
    inv.status === 'overdue' ? { tool: 'billing.remind', args: { id: inv.id } } : null,
  ].filter(Boolean),
  agentLimit: { max: 50 }, // never send 3,000 rows to the LLM
});

// Tool — just fetch data. Everything else is handled.
const getInvoice = f.tool({
  name: 'billing.get',
  description: 'Retrieve an invoice by ID',
  input: z.object({ id: z.string() }),
  middleware: [auth],
  returns: InvoicePresenter, // ← one line wires the whole pipeline
  handler: async ({ input, ctx }) =>
    ctx.db.invoice.findUniqueOrThrow({ where: { id: input.id, tenantId: ctx.user.tenantId } }),
});
```

Notice: the handler is 3 lines. Auth, field stripping, domain rules, affordances, and truncation happen automatically, not sprinkled across the handler with `if` statements you forget to copy. A database migration that adds a column doesn't change what the agent sees: new fields are invisible by default unless you explicitly declare them in the Presenter schema.
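For intuition, the "structured package" at the end of the pipeline might look roughly like this. The shape below is illustrative only, not MCP Fusion's actual wire format:

```typescript
// Illustrative shape only — not MCP Fusion's actual wire format.
// The point: the agent gets data PLUS rules and valid next actions,
// instead of a bare JSON string it has to guess about.
interface AgentPackage {
  data: { id: string; customer: string; amount_cents: number; status: string }[];
  rules: string[]; // contextual domain instructions attached by rules()
  suggestedActions: { tool: string; args: Record<string, unknown> }[];
  truncated: boolean; // did agentLimit() cut the result set?
}

export const example: AgentPackage = {
  data: [{ id: 'in_42', customer: 'Acme', amount_cents: 45000, status: 'overdue' }],
  rules: ['Invoice is overdue. Send a reminder before any other action.'],
  suggestedActions: [{ tool: 'billing.remind', args: { id: 'in_42' } }],
  truncated: false,
};
```

With a payload like this, "what should I do next?" is no longer an inference problem; the valid next calls arrive pre-filled.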
## The Vercel Adapter: Solving the Deployment Wall

Now back to the original problem. You have a well-structured MCP server. How do you deploy it to Vercel without everything breaking? MCP Fusion ships a dedicated Vercel Adapter that solves every serverless incompatibility at once:

```bash
npm install @vinkius-core/mcp-fusion-vercel
```

```typescript
// app/api/mcp/route.ts — the ENTIRE file
import { initFusion } from '@vinkius-core/mcp-fusion';
import { vercelAdapter } from '@vinkius-core/mcp-fusion-vercel';
import { z } from 'zod';

interface AppContext { tenantId: string; dbUrl: string }

const f = initFusion<AppContext>();

const listProjects = f.tool({
  name: 'projects.list',
  input: z.object({ limit: z.number().optional().default(20) }),
  readOnly: true,
  handler: async ({ input, ctx }) =>
    fetch(`${ctx.dbUrl}/projects?tenant=${ctx.tenantId}&limit=${input.limit}`).then(r => r.json()),
});

const registry = f.registry();
registry.register(listProjects);

export const POST = vercelAdapter<AppContext>({
  registry,
  serverName: 'my-mcp-server',
  contextFactory: async (req) => ({
    tenantId: req.headers.get('x-tenant-id') || 'public',
    dbUrl: process.env.DATABASE_URL!,
  }),
});

// Optional: global edge network, ~0ms cold start
export const runtime = 'edge';
```

```bash
vercel deploy
```

Done. Live at `https://your-project.vercel.app/api/mcp`.

## How It Fixes Each Problem

The adapter splits work into two phases deliberately:

```text
COLD START (once per instance)
  → Zod reflection      → cached
  → Presenter compile   → cached
  → Schema generation   → cached
  → Middleware resolve  → cached

WARM REQUEST (per invocation)
  → new McpServer()     → ephemeral
  → contextFactory(req) → per-request
  → JSON-RPC dispatch   → your handler runs
  → server.close()      → cleanup
```

Cold start pays the cost once. Every subsequent request is just routing plus your business logic.

## Edge vs. Node.js: Which Runtime?

For tools querying Vercel Postgres or Vercel KV, use Node.js. For fast routing or API gateway tools, Edge is ideal. Switch with a single export line. No extra config.

## Native Vercel Services Work Out of the Box

No adapter magic. Just import and use:

```typescript
import { sql } from '@vercel/postgres';
import { kv } from '@vercel/kv';

const getUser = f.tool({
  name: 'users.get',
  input: z.object({ id: z.string() }),
  readOnly: true,
  handler: async ({ input }) => {
    const { rows } = await sql`SELECT id, name, email FROM users WHERE id = ${input.id}`;
    return rows;
  },
});

const getCached = f.tool({
  name: 'cache.get',
  input: z.object({ key: z.string() }),
  readOnly: true,
  handler: async ({ input }) => ({ value: await kv.get(input.key) }),
});
```
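The "single export line" runtime switch is standard Next.js route segment config. A sketch of the two variants (drop one of these at the bottom of your route file):

```typescript
// Next.js route segment config: choose the runtime per route file.
// 'nodejs' for tools using Vercel Postgres/KV clients that need Node APIs;
// 'edge' for lightweight routing or API-gateway tools.
export const runtime = 'nodejs'; // or: 'edge'
```

No other configuration changes; the adapter code stays identical in both runtimes.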
## What's Fully Supported on Vercel

- ✅ Tools, groups, tags, exposition
- ✅ Middleware chains (auth, rate limiting, etc.)
- ✅ Presenters: field stripping, rules, affordances, `agentLimit`
- ✅ Governance Lockfile, pre-generated at build time
- ✅ Structured error recovery via `toolError()`
- ✅ Vercel Postgres, KV, and Blob
- ✅ Both Edge and Node.js runtimes
- ❌ `autoDiscover()`: no filesystem; register tools explicitly
- ❌ `createDevServer()`: use `next dev` or `vercel dev`
- ❌ State Sync notifications: stateless transport by design

## Compatible With Any MCP Client

The stateless JSON-RPC endpoint works with everything:

- Claude Desktop: direct HTTP config or via proxy
- LangChain / LangGraph: HTTP transport
- Vercel AI SDK: direct JSON-RPC
- FusionClient: built-in type-safe client (tRPC-style)
- Any custom agent: standard POST with JSON-RPC payload

## The Takeaway

The reason deploying MCP servers to Vercel has been painful isn't a skill issue. The original protocol simply wasn't designed for stateless infrastructure. The combination of SSE transport, in-memory sessions, and filesystem assumptions made serverless deployment an exercise in workarounds, most of which don't work.

MCP Fusion resolves this at the framework level, not the patch level. The Vercel Adapter isn't a set of hacks; it's a first-class adapter that changes the transport model to match the deployment model. Pair it with the MVA architecture, and you stop writing MCP servers that make the AI guess, and start shipping servers that tell the AI exactly what to do next.

Have you hit this same wall trying to deploy MCP servers to serverless? What transport were you using? Drop a comment; the more real war stories, the better.

- 📖 MCP Fusion Docs
- 🔌 Vercel Adapter
- 🏗️ MVA Architecture
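As a concrete sketch of "standard POST with JSON-RPC payload": the MCP spec's `tools/call` method takes a tool `name` and its `arguments`. The endpoint URL and tool name below are placeholders for your own deployment:

```typescript
// Minimal JSON-RPC 2.0 client for a stateless MCP endpoint.
// ENDPOINT and the tool name are placeholders for your deployment.
const ENDPOINT = 'https://your-project.vercel.app/api/mcp';

export function buildToolCall(tool: string, args: Record<string, unknown>) {
  return {
    jsonrpc: '2.0' as const,
    id: 1,
    method: 'tools/call', // MCP's standard tool-invocation method
    params: { name: tool, arguments: args },
  };
}

export async function callTool(tool: string, args: Record<string, unknown>) {
  const res = await fetch(ENDPOINT, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(buildToolCall(tool, args)),
  });
  return res.json();
}

// Example (network call, run against a live deployment):
// await callTool('projects.list', { limit: 5 });
```

Because the transport is a plain HTTP POST, anything that can send JSON can be an MCP client; no session handshake or SSE stream is needed.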