How to Add ATXP to the Vercel AI SDK
Edge functions don’t have sessions. That’s the first thing to internalize when you’re wiring Vercel AI SDK agent payments into a production tool chain.
Most payment infrastructure was designed for humans: OAuth tokens in cookies, virtual card sessions tied to a billing account, authorization flows that assume persistent state. None of that works when your agent is running in Vercel’s Edge Runtime where each invocation is stateless, cold-start budgets are measured in milliseconds, and there’s no server-side memory between requests.
The Vercel AI SDK has over 4 million weekly npm downloads as of Q1 2026 (npmjs.com). Developers are shipping AI-powered applications at scale on edge infrastructure. But the SDK’s documentation stops at the AI layer. When your tool calls need to spend money on external APIs — web search, image generation, data providers — you’re on your own.
This guide covers that gap: how to wire Vercel AI SDK agent payments through ATXP so your edge-deployed agents can call paid APIs without credential sprawl, session state, or a billing account per service.
What Makes the Vercel AI SDK Different for Agent Payments?
The AI SDK’s tool model is clean. You define a description, a Zod parameters schema, and an execute function. The LLM decides which tools to call; the SDK dispatches them.
```typescript
import { generateText, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const result = await generateText({
  model: openai('gpt-4o'),
  tools: {
    webSearch: tool({
      description: 'Search the web for current information',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
        // This is where paid API calls happen.
        // Without ATXP: manage keys, billing accounts, and rate limits per service.
        // With ATXP: one credential, one call, billed per use.
      },
    }),
  },
  prompt: 'What happened in AI last week?',
});
```
The execute function is where the money gets spent. The challenge: if this runs on Vercel’s edge runtime, you have no persistent session, often no database connection, and a cold-start budget of 50ms. That rules out most payment infrastructure patterns.
The three approaches developers try
| Approach | Edge-compatible? | Problem |
|---|---|---|
| Hard-coded API keys per service | Yes | Key sprawl across 7+ vendors, rotation nightmare, leaked in logs |
| Virtual cards per agent | No | Require session initialization and card auth flows — blocking on edge |
| ATXP credits | Yes | Token-based, stateless, one credential covers all tools |
Why Do Virtual Cards Fail on Edge Functions?
Virtual cards are the intuitive answer to “give an agent a payment method.” The problem is that virtual cards were built for humans navigating checkout flows.
A virtual card transaction requires:
- Session initialization with the card issuer
- An authorization hold at charge time (synchronous network round-trip)
- Capture confirmation (sometimes async, sometimes not)
- Reconciliation back to your billing account
Step 1 alone breaks edge functions. You can’t run a session handshake in a cold-start function. Even if you could, the authorization hold in step 2 blocks your tool response — which blocks the LLM’s next token — which kills your streaming latency.
Virtual cards also create a management problem at scale. The three models for agent payments break this down: one card per agent, per service, means a production multi-tool agent managing 20–30 active virtual cards. You can’t audit that at runtime, and you can’t set a per-request spending limit across services.
ATXP credits sidestep all of this. Your agent holds a pre-funded balance. Each tool call deducts atomically. No session, no auth flow, no card network. The latency cost is a single HTTPS call to the ATXP API — under 100ms globally.
How Do Vercel AI SDK Agent Payments Work With ATXP?
Three steps: register an agent, fund it, wire it into tool execute functions.
Step 1: Create your agent
```bash
npx atxp agent create \
  --name "edge-research-agent" \
  --description "Vercel edge research agent" \
  --budget 50.00
```
This returns an ATXP_AGENT_ID and ATXP_AGENT_KEY. Add both to your Vercel project’s environment variables. These credentials are safe on the edge — they’re scoped to a single agent’s budget, not your master account.
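If you manage Vercel environment variables from the command line, the standard CLI commands cover this step; `vercel env add` prompts for each value and the environments it applies to:

```bash
vercel env add ATXP_AGENT_ID
vercel env add ATXP_AGENT_KEY

# Pull the values into .env.local for local development
vercel env pull .env.local
```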
Step 2: Install the SDK
```bash
npm install @atxp/sdk
```
Step 3: Wire into AI SDK tools
```typescript
import { generateText, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { ATXPClient } from '@atxp/sdk';
import { z } from 'zod';

const atxp = new ATXPClient({
  agentId: process.env.ATXP_AGENT_ID!,
  agentKey: process.env.ATXP_AGENT_KEY!,
});

export async function POST(req: Request) {
  const { prompt } = await req.json();

  const result = await generateText({
    model: openai('gpt-4o-mini'),
    tools: {
      webSearch: tool({
        description: 'Search the web for current information',
        parameters: z.object({
          query: z.string().describe('The search query'),
        }),
        execute: async ({ query }) => {
          const results = await atxp.tools.search({ query, maxResults: 5 });
          return results.results;
        },
      }),
      generateImage: tool({
        description: 'Generate an image from a text prompt',
        parameters: z.object({
          prompt: z.string().describe('Image description'),
          size: z.enum(['square', 'landscape', 'portrait']).default('square'),
        }),
        execute: async ({ prompt, size }) => {
          const image = await atxp.tools.imageGeneration({ prompt, size });
          return { url: image.url, creditsCost: image.creditsCost };
        },
      }),
    },
    maxSteps: 5,
    prompt,
  });

  return Response.json({ text: result.text });
}
```
This runs clean on edge. ATXPClient carries no session state, so the module-scope instance above is safe to reuse across stateless invocations (or to construct per request, if you prefer). Each atxp.tools.* call is a single authenticated HTTPS request that deducts from the agent’s credit balance and returns structured output.
How Do You Set Spending Limits Per Request?
Set a requestBudget before you ship anything to production. Without one, a multi-step tool chain with maxSteps: 8 and three searches per step can drain a significant share of your balance in a single bad request.
```typescript
const atxp = new ATXPClient({
  agentId: process.env.ATXP_AGENT_ID!,
  agentKey: process.env.ATXP_AGENT_KEY!,
  requestBudget: 2.00, // hard cap per API invocation
  dailyBudget: 100.00, // rolling 24h cap across all requests
});
```
When a tool call would exceed requestBudget, ATXP returns BUDGET_EXCEEDED from the execute function. The AI SDK surfaces this as a tool error. The LLM can handle it gracefully — either informing the user or switching to a cheaper tool path.
This is budget enforcement at the infrastructure layer, not just a number in a config file. The cap is enforced before the charge hits, not after. That’s the meaningful difference between a spending limit and a spending guideline.
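If you would rather hand the model a structured signal than a raw tool error, one option is to catch the budget failure inside execute and return it as data. A minimal sketch, assuming the SDK surfaces the failure as an error carrying a BUDGET_EXCEEDED code (check the SDK reference for the exact error shape):

```typescript
import { tool } from 'ai';
import { ATXPClient } from '@atxp/sdk';
import { z } from 'zod';

const atxp = new ATXPClient({
  agentId: process.env.ATXP_AGENT_ID!,
  agentKey: process.env.ATXP_AGENT_KEY!,
  requestBudget: 2.00,
});

// Drop-in variant of the webSearch tool above, with explicit budget handling.
export const webSearch = tool({
  description: 'Search the web for current information',
  parameters: z.object({ query: z.string() }),
  execute: async ({ query }) => {
    try {
      const results = await atxp.tools.search({ query, maxResults: 5 });
      return results.results;
    } catch (err: unknown) {
      // Assumption: the SDK's budget error exposes a machine-readable code.
      if ((err as { code?: string })?.code === 'BUDGET_EXCEEDED') {
        // Returning structured data (instead of rethrowing) lets the model
        // explain the limit to the user or switch to a cheaper tool path.
        return { error: 'budget_exceeded', message: 'Per-request budget reached.' };
      }
      throw err;
    }
  },
});
```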
Register at atxp.ai to create your first agent. Fund it via Stripe or USDC — setup takes under five minutes.
What Does a Full Production Tool Chain Look Like?
Here’s a streaming example using Claude as the model — demonstrating that ATXP works at the tool layer, not the model layer:
```typescript
import { streamText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { ATXPClient } from '@atxp/sdk';
import { z } from 'zod';

const atxp = new ATXPClient({
  agentId: process.env.ATXP_AGENT_ID!,
  agentKey: process.env.ATXP_AGENT_KEY!,
  requestBudget: 3.00,
});

export const runtime = 'edge';

export async function POST(req: Request) {
  const { topic } = await req.json();

  const stream = streamText({
    model: anthropic('claude-sonnet-4-6'),
    tools: {
      webSearch: tool({
        description: 'Search the web for current information',
        parameters: z.object({ query: z.string() }),
        execute: async ({ query }) => {
          const results = await atxp.tools.search({ query, maxResults: 8 });
          return results.results.map((r) => ({
            title: r.title,
            snippet: r.snippet,
            url: r.url,
          }));
        },
      }),
      generateSummaryImage: tool({
        description: 'Generate a visual summary of findings',
        parameters: z.object({
          description: z.string(),
        }),
        execute: async ({ description }) => {
          const img = await atxp.tools.imageGeneration({
            prompt: `Infographic summary: ${description}`,
            size: 'landscape',
          });
          return { imageUrl: img.url, cost: img.creditsCost };
        },
      }),
    },
    maxSteps: 8,
    system: 'Research the topic thoroughly, then generate a visual summary of your findings.',
    prompt: `Research this topic: ${topic}`,
  });

  return stream.toDataStreamResponse();
}
```
The cost field in generateSummaryImage’s return is intentional — surface credit consumption in tool results so the LLM can include it in its response. Downstream, you can log this to your observability layer and build a real cost-per-request picture over time.
ATXP Credits vs. Direct API Keys on Edge
|  | ATXP Credits | Direct API Keys |
|---|---|---|
| Credential count | 1 per agent | 1 per service per agent |
| Edge compatibility | Full — stateless HTTPS | Varies by provider |
| Budget enforcement | Hard cap at request and daily level | None (charges accrue unchecked) |
| Key rotation scope | Single rotation covers all tools | Rotate across every service account |
| Spend visibility | Per-request, per-tool breakdown | Manual invoice reconciliation |
| Rate limit handling | ATXP absorbs and retries | Your responsibility |
| New tool onboarding | Enable in ATXP dashboard | New API account + key + billing |
The credential count matters most at team scale. If three engineers each build an agent that calls web search, image generation, and a data provider, that’s nine separate API accounts to manage. The real cost of maintaining a DIY API stack goes beyond invoice line items — it’s the rotation overhead, the key-in-env-var sprawl, and the reconciliation time that compounds monthly.
How Do You Track What Each Request Actually Cost?
The ATXPClient exposes a per-session spend report:
```typescript
const result = await generateText({ ... });

const spendReport = await atxp.session.getReport();
console.log(spendReport);
// {
//   totalCost: 0.38,
//   tools: [
//     { tool: 'webSearch', calls: 4, cost: 0.24 },
//     { tool: 'generateSummaryImage', calls: 1, cost: 0.14 },
//   ],
//   requestId: 'req_01HZ...',
//   budgetRemaining: 1.62,
// }
```
Log this alongside your LLM token usage. Over time the pattern becomes clear: which prompts trigger excessive tool chains, which tools are cost-efficient, and where maxSteps should be tightened. The same optimization loop you’d run on model token costs, now applied to tool costs.
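A sketch of that logging step, reusing the spend-report fields shown above. The `model`, `tools`, and `prompt` values are the same config as the earlier examples, `result.usage` is the AI SDK’s token-usage object, and the single `console.log` stands in for whatever your observability layer expects:

```typescript
const result = await generateText({ model, tools, maxSteps, prompt }); // same config as above
const spendReport = await atxp.session.getReport();

// One structured log line per request: model token usage next to tool spend,
// keyed by the ATXP request id so the two can be joined downstream.
console.log(JSON.stringify({
  requestId: spendReport.requestId,
  tokenUsage: result.usage,              // token counts reported by the AI SDK
  toolSpend: spendReport.tools,          // per-tool call counts and credit cost
  totalToolCost: spendReport.totalCost,
  budgetRemaining: spendReport.budgetRemaining,
}));
```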
FAQ
Does ATXP work with Vercel streaming responses?
Yes. Tool execute functions run inline within the streaming pipeline: the stream pauses while a tool executes, then continues. ATXP’s sub-100ms global API latency keeps that pause imperceptible. The streaming example above with streamText and toDataStreamResponse() works as shown.
Can I use ATXP with non-OpenAI models in the AI SDK?
Yes. ATXP integrates at the tool layer, not the model layer. Any AI SDK-compatible model provider that supports function calling works: Anthropic, Google, Mistral, Cohere. The Claude example above demonstrates this directly. See the OpenAI Agents SDK integration guide for a comparison of how tool wiring differs across SDKs.
What happens when an agent hits its requestBudget mid-task?
ATXP returns BUDGET_EXCEEDED from the execute function. The AI SDK treats this as a tool error and passes it back to the model. The LLM can inform the user, try a cheaper alternative, or stop. You can also configure throwOnBudgetExceeded: true to terminate the tool chain immediately rather than letting the LLM decide.
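A sketch of the hard-stop configuration, assuming the flag sits on the client constructor next to the budget options:

```typescript
import { ATXPClient } from '@atxp/sdk';

const atxp = new ATXPClient({
  agentId: process.env.ATXP_AGENT_ID!,
  agentKey: process.env.ATXP_AGENT_KEY!,
  requestBudget: 2.00,
  // Abort the tool chain immediately on budget exhaustion instead of
  // returning a tool error for the model to reason about.
  throwOnBudgetExceeded: true,
});
```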
Is there a way to preview the cost before executing an expensive tool call?
Yes — use atxp.tools.estimate() with the same parameters as the real call. It returns a credit estimate without executing and without charging. Useful for building a confirmation step before large image generation jobs or deep web scrapes.
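A sketch of that confirmation step inside a tool. The argument shape of the estimate call and its `credits` return field are assumptions here; check the SDK reference for the exact signature:

```typescript
import { tool } from 'ai';
import { ATXPClient } from '@atxp/sdk';
import { z } from 'zod';

const atxp = new ATXPClient({
  agentId: process.env.ATXP_AGENT_ID!,
  agentKey: process.env.ATXP_AGENT_KEY!,
});

export const generateImage = tool({
  description: 'Generate an image from a text prompt',
  parameters: z.object({ prompt: z.string() }),
  execute: async ({ prompt }) => {
    // Preview the cost with the same parameters as the real call.
    // (Assumed call shape: tool name plus the real call's parameters.)
    const estimate = await atxp.tools.estimate({
      tool: 'imageGeneration',
      params: { prompt, size: 'landscape' },
    });

    // Hypothetical threshold: bounce expensive jobs back to the model
    // so it can confirm with the user before spending the credits.
    if (estimate.credits > 1.0) {
      return {
        needsConfirmation: true,
        estimatedCredits: estimate.credits,
      };
    }

    const image = await atxp.tools.imageGeneration({ prompt, size: 'landscape' });
    return { url: image.url, creditsCost: image.creditsCost };
  },
});
```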
How do ATXP credits convert to real money?
Credits are purchased in advance via Stripe or USDC and held in your agent’s balance. The exchange rate is fixed at purchase time — no surge pricing, no retroactive adjustments. All tool costs are published in the ATXP pricing table at atxp.ai.
What to Do Before Going to Production
Set requestBudget. Run 10 real requests in staging. Pull the spend reports. Then tune maxSteps and tool selection until your cost-per-request is where you want it.
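One way to run that staging pass is a small script that replays representative prompts against the staging deployment. This sketch assumes you temporarily modify the route to return the spend report alongside the text; the URL is a placeholder:

```typescript
// staging-spend-check.ts: replay ten representative prompts against staging
// and tally what each request cost. Assumes the route was modified (for
// staging only) to return { text, spendReport } instead of { text } alone.
const STAGING_URL = 'https://your-staging-app.vercel.app/api/agent'; // placeholder URL

const prompts = [
  'What happened in AI last week?',
  'Summarize recent developments in edge computing.',
  // ...eight more prompts that mirror real traffic
];

let total = 0;
for (const prompt of prompts) {
  const res = await fetch(STAGING_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });
  const { spendReport } = await res.json();
  console.log(prompt.slice(0, 40), spendReport.totalCost);
  total += spendReport.totalCost;
}

console.log('average cost per request:', (total / prompts.length).toFixed(2));
```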
That operational loop — budget, deploy, measure, tighten — is the difference between an agent that works in a demo and one you can run profitably at scale. ATXP gives you the infrastructure to close that loop without managing credentials for every API your agent touches.
Start at atxp.ai.