How to Add ATXP to LlamaIndex

LlamaIndex gives you a powerful framework for building data-aware agents. It does not give you a way to bound what those agents cost or isolate their credentials from your primary API keys.

ATXP fills that gap. This is how to wire them together.

What You’re Solving

LlamaIndex agents call LLMs, embedding services, and external APIs. By default, they use your credentials and have no spending ceiling. In development that’s fine. In production — especially with autonomous agents running on their own loop — it’s a liability.

The goal: each LlamaIndex agent gets its own ATXP account with a fixed credit balance. When the balance is exhausted, calls stop. Your primary keys never leave your infrastructure.

Prerequisites

  • Python 3.9+
  • llama-index, openai, and httpx installed
  • An ATXP account (atxp.ai)

Step 1: Create an Agent Account

import httpx

ATXP_API_KEY = "your-atxp-key"

# Create an isolated agent account with a $5 budget
response = httpx.post(
    "https://api.atxp.ai/v1/agents",
    headers={"Authorization": f"Bearer {ATXP_API_KEY}"},
    json={
        "name": "llamaindex-research-agent",
        "budget": 5.00,
        "currency": "usd"
    }
)

agent = response.json()
agent_id = agent["id"]
agent_key = agent["api_key"]  # Use this in LlamaIndex, not your primary key
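
In production you'll want to fail loudly if account creation doesn't return what you expect, rather than hitting a `KeyError` later. A minimal sketch, wrapping the same endpoint and payload as above (`create_agent` and `parse_agent` are helper names introduced here, not part of the ATXP SDK):

```python
def parse_agent(payload: dict) -> tuple[str, str]:
    """Extract the agent id and scoped key, failing loudly if the
    response shape is not what we expect."""
    try:
        return payload["id"], payload["api_key"]
    except KeyError as exc:
        raise ValueError(f"unexpected agent payload, missing key {exc}") from exc

def create_agent(api_key: str, name: str, budget: float) -> tuple[str, str]:
    """Create an isolated agent account and return (agent_id, agent_key)."""
    import httpx  # local import so the pure helper above stays dependency-free
    response = httpx.post(
        "https://api.atxp.ai/v1/agents",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"name": name, "budget": budget, "currency": "usd"},
        timeout=10.0,
    )
    response.raise_for_status()  # surface 4xx/5xx instead of parsing a bad body
    return parse_agent(response.json())
```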

Step 2: Configure LlamaIndex to Route Through ATXP

ATXP provides an OpenAI-compatible endpoint. Point your LlamaIndex OpenAI client at it using the agent key:

from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

# Route through ATXP instead of calling OpenAI directly
llm = OpenAI(
    model="gpt-4o",
    api_key=agent_key,
    api_base="https://api.atxp.ai/v1/proxy/openai"
)

Settings.llm = llm

For Anthropic models via LlamaIndex:

from llama_index.llms.anthropic import Anthropic

llm = Anthropic(
    model="claude-sonnet-4-5",
    api_key=agent_key,
    base_url="https://api.atxp.ai/v1/proxy/anthropic"
)

Settings.llm = llm

Everything else in your LlamaIndex code stays the same.
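
Both providers follow the same pattern: the same agent key, plus a provider-specific proxy path. A small helper (hypothetical, built from the two URLs shown above) keeps that in one place:

```python
ATXP_PROXY_ROOT = "https://api.atxp.ai/v1/proxy"

def proxy_base(provider: str) -> str:
    """Return the ATXP proxy base URL for a supported upstream provider."""
    supported = {"openai", "anthropic"}
    if provider not in supported:
        raise ValueError(f"no ATXP proxy route configured for {provider!r}")
    return f"{ATXP_PROXY_ROOT}/{provider}"

# Usage with the LlamaIndex clients from this step:
# llm = OpenAI(model="gpt-4o", api_key=agent_key, api_base=proxy_base("openai"))
# llm = Anthropic(model="claude-sonnet-4-5", api_key=agent_key, base_url=proxy_base("anthropic"))
```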

Step 3: Build Your Agent Normally

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool

def search_documents(query: str) -> str:
    """Search internal documentation for relevant passages."""
    results = "..."  # replace with your retrieval logic
    return results

search_tool = FunctionTool.from_defaults(fn=search_documents)

agent = ReActAgent.from_tools(
    [search_tool],
    llm=llm,  # Already configured to route through ATXP
    verbose=True
)

response = agent.chat("Summarize our Q1 revenue trends")

Step 4: Check Spending

After the agent runs:

import httpx

transactions = httpx.get(
    f"https://api.atxp.ai/v1/agents/{agent_id}/transactions",
    headers={"Authorization": f"Bearer {ATXP_API_KEY}"}
).json()

for tx in transactions["data"]:
    print(f"{tx['service']}/{tx['model']}: ${tx['cost_usd']:.4f}")

# Check remaining balance
balance = httpx.get(
    f"https://api.atxp.ai/v1/agents/{agent_id}/balance",
    headers={"Authorization": f"Bearer {ATXP_API_KEY}"}
).json()

print(f"Remaining: ${balance['available_usd']:.4f}")
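
For anything beyond eyeballing, it helps to aggregate spend per model rather than scanning raw transactions. A sketch that assumes the transaction shape shown above (`service`, `model`, and `cost_usd` keys on each record):

```python
from collections import defaultdict

def spend_by_model(transactions: list[dict]) -> dict[str, float]:
    """Sum cost per service/model pair from an ATXP transaction list."""
    totals: dict[str, float] = defaultdict(float)
    for tx in transactions:
        totals[f"{tx['service']}/{tx['model']}"] += tx["cost_usd"]
    return dict(totals)

# totals = spend_by_model(transactions["data"])
```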

Step 5: Refill or Replace

When an agent account runs out of credits, calls fail with HTTP 402 Payment Required. You can handle this gracefully:

import httpx

def refill_agent(agent_id: str, amount: float):
    httpx.post(
        f"https://api.atxp.ai/v1/agents/{agent_id}/topup",
        headers={"Authorization": f"Bearer {ATXP_API_KEY}"},
        json={"amount": amount}
    )

# Or create a fresh agent account for the next run
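
To actually catch the 402 at the call site, you can wrap the chat call. This is a sketch under two assumptions: the OpenAI-compatible client that LlamaIndex wraps surfaces non-2xx responses as `openai.APIStatusError` (which carries a `status_code`), and `refill_agent` is the helper defined above.

```python
def should_refill(status_code: int) -> bool:
    """402 from the ATXP proxy means the agent's budget is exhausted."""
    return status_code == 402

def run_with_refill(agent, prompt: str, agent_id: str, topup_usd: float = 2.0):
    """Run one chat turn; on a budget 402, refill once and retry."""
    from openai import APIStatusError  # raised by the client LlamaIndex wraps
    try:
        return agent.chat(prompt)
    except APIStatusError as exc:
        if not should_refill(exc.status_code):
            raise
        refill_agent(agent_id, topup_usd)  # refill_agent defined above
        return agent.chat(prompt)
```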

For long-running agents, a common pattern is to check balance before each major task and refill programmatically if below a threshold.
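
That pattern can be sketched with the balance and top-up endpoints used earlier in this guide. The threshold check is kept as a pure function so the policy itself is easy to test; `ensure_balance` is a helper name introduced here:

```python
def needs_refill(available_usd: float, threshold_usd: float) -> bool:
    """Refill whenever available credit dips below the threshold."""
    return available_usd < threshold_usd

def ensure_balance(agent_id: str, api_key: str,
                   threshold_usd: float, topup_usd: float) -> None:
    """Check an agent's balance and top it up if it has run low."""
    import httpx
    headers = {"Authorization": f"Bearer {api_key}"}
    balance = httpx.get(
        f"https://api.atxp.ai/v1/agents/{agent_id}/balance",
        headers=headers,
    ).json()
    if needs_refill(balance["available_usd"], threshold_usd):
        httpx.post(
            f"https://api.atxp.ai/v1/agents/{agent_id}/topup",
            headers=headers,
            json={"amount": topup_usd},
        )

# Before each major task:
# ensure_balance(agent_id, ATXP_API_KEY, threshold_usd=1.00, topup_usd=5.00)
```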

Production Patterns

One account per agent instance: Every autonomous agent gets its own ATXP account. When the agent is retired, the account is deactivated. Clean separation.

One account per job: For batch tasks (e.g., “analyze 200 documents”), create an account, fund it for the job, run the job, inspect spend. Predictable cost per batch.
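
The per-job pattern maps naturally onto a context manager: create the account on entry, yield its scoped credentials, and report spend on exit. A sketch using only the endpoints shown earlier (`job_account` and `total_spend` are names introduced here):

```python
from contextlib import contextmanager

def total_spend(transactions: list[dict]) -> float:
    """Sum cost_usd across an ATXP transaction list."""
    return sum(tx["cost_usd"] for tx in transactions)

@contextmanager
def job_account(api_key: str, job_name: str, budget_usd: float):
    """Create a fresh agent account for one batch job, yield its id and
    scoped key, and report total spend when the job finishes."""
    import httpx
    headers = {"Authorization": f"Bearer {api_key}"}
    agent = httpx.post(
        "https://api.atxp.ai/v1/agents",
        headers=headers,
        json={"name": job_name, "budget": budget_usd, "currency": "usd"},
    ).json()
    try:
        yield agent["id"], agent["api_key"]
    finally:
        txs = httpx.get(
            f"https://api.atxp.ai/v1/agents/{agent['id']}/transactions",
            headers=headers,
        ).json()
        print(f"{job_name}: spent ${total_spend(txs['data']):.4f} of ${budget_usd:.2f}")

# with job_account(ATXP_API_KEY, "analyze-200-docs", 10.00) as (job_id, job_key):
#     ...run the batch with job_key...
```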

Budget escalation: Start agents with a small budget. If they complete their task under budget, close them out. If they need more, refill programmatically after a human review step.

This last pattern — budget escalation with human review — is the right model for ramping agent autonomy responsibly.
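
One way to encode that policy: a pure function that decides the next top-up (no refill without approval, and never past a hard cap), plus a gate that asks a human before calling the `refill_agent` helper from Step 5. The function names and the `input()`-based review step are illustrative, not an ATXP API:

```python
def next_topup(spent_usd: float, step_usd: float,
               hard_cap_usd: float, approved: bool) -> float:
    """Escalation policy as a pure function: nothing without human
    approval, and total spend never exceeds the hard cap."""
    if not approved:
        return 0.0
    return round(min(step_usd, max(0.0, hard_cap_usd - spent_usd)), 2)

def escalate(agent_id: str, spent_usd: float, *,
             step_usd: float = 2.0, hard_cap_usd: float = 10.0) -> bool:
    """Ask a human, then top up. Returns True if budget was added."""
    answer = input(f"Agent {agent_id} has spent ${spent_usd:.2f}. Refill? [y/N] ")
    amount = next_topup(spent_usd, step_usd, hard_cap_usd, answer.lower() == "y")
    if amount > 0:
        refill_agent(agent_id, amount)  # refill_agent from Step 5
        return True
    return False
```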

Why Isolated Credentials Matter

When LlamaIndex agents call APIs with your primary key, any bug, prompt injection, or unexpected behavior carries the blast radius of your full account: one compromised agent can exhaust your entire API budget.

ATXP agent accounts are isolated. If an agent misbehaves, it exhausts its own budget — not yours. That’s the right architecture for anything running with meaningful autonomy.

Start building with ATXP — the payment and identity layer for agents that need to act in the real world.