Documentation

Real-time LLM cost tracking per agent, endpoint, and user. 3-line drop-in SDK.

Overview

Agent Cost Radar (ACR) is an open-source tool that provides observability and routing recommendations for LLM agents. Add the SDK with three lines of code and every Anthropic and OpenAI call is tracked automatically: tokens, cost, latency, and a per-agent breakdown.

  • Real-time per-agent cost tracking
  • Anthropic + OpenAI auto-instrumentation
  • Routing recommendations (Haiku vs Sonnet vs Opus)
  • Free tier: 1,000 events/month

Quick start

Python

import acr
acr.init(api_key="acr_live_...", project="my-app")

# Any client.messages.create() call made after init() is now tracked automatically
import anthropic
client = anthropic.Anthropic()
client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)

JavaScript

import { ACR } from '@aimayak/acr-sdk';
import Anthropic from '@anthropic-ai/sdk';

const acr = new ACR({ apiKey: 'acr_live_...', project: 'my-app' });
const anthropic = acr.instrument(new Anthropic());

await anthropic.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello' }],
});

Install the SDK

Python (pip)

pip install agent-cost-radar

JavaScript / TypeScript (npm)

npm install @aimayak/acr-sdk
# or
pnpm add @aimayak/acr-sdk

API Reference

Status: Draft. Subject to change before v1.0 (planned 2026-Q3).
Stable contract: event schema fields below are forward-compatible (new optional fields may be added).

Base URL

https://api.agentcostradar.com/v1

For local development, use http://localhost:8787/v1, served locally by the Cloudflare Workers CLI (wrangler dev).

Authentication

All requests require an Authorization: Bearer <api_key> header.

API keys have prefix acr_live_ (production) or acr_test_ (sandbox).

POST /v1/events HTTP/1.1
Host: api.agentcostradar.com
Authorization: Bearer acr_live_abc123...
Content-Type: application/json
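
As a rough sketch of how a client could attach these headers without the SDK, the following uses only the Python standard library. The helper name build_events_request is hypothetical (it is not part of the ACR SDK), and the request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.agentcostradar.com/v1"


def build_events_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/events request with bearer auth.

    Hypothetical helper for illustration; not part of the ACR SDK.
    """
    # Documented key prefixes: acr_live_ (production) and acr_test_ (sandbox).
    if not api_key.startswith(("acr_live_", "acr_test_")):
        raise ValueError("API key must start with acr_live_ or acr_test_")
    return urllib.request.Request(
        url=f"{BASE_URL}/events",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen() would send it; using an acr_test_ key keeps experiments in the sandbox.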

POST /v1/events

Ingest one or more usage events from an instrumented LLM client.

Request — single event

{
  "ts": "2026-05-14T19:35:00.123Z",
  "project": "my-app",
  "model": "claude-sonnet-4-5",
  "input_tokens": 1234,
  "output_tokens": 567,
  "cost_usd": 0.012345,
  "latency_ms": 842,
  "agent_id": "researcher-v2",
  "conversation_id": "conv_abc123",
  "metadata": {
    "user_id_hash": "sha256:...",
    "feature": "search"
  }
}

Request — batch (preferred — up to 1,000 events)

{
  "events": [
    { "ts": "...", "project": "...", "model": "...", "..." },
    { "ts": "...", "project": "...", "model": "...", "..." }
  ]
}
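
Because a batch is capped at 1,000 events, a client with a larger backlog has to split it into several request bodies. A minimal sketch, assuming a hypothetical helper named chunk_events; it only enforces the event-count limit and ignores the 5 MB size limit listed under Errors.

```python
from typing import Iterator

MAX_BATCH_EVENTS = 1000  # documented per-request event limit


def chunk_events(events: list[dict], size: int = MAX_BATCH_EVENTS) -> Iterator[dict]:
    """Yield request bodies of the form {"events": [...]} with at most `size` events each."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(events), size):
        yield {"events": events[start:start + size]}
```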

Response — 202 Accepted

{
  "accepted": 1,
  "rejected": 0,
  "request_id": "req_xyz789"
}

Response — 400 Bad Request

{
  "error": "invalid_event",
  "details": "field 'model' is required",
  "rejected_indexes": [3]
}
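
One way a client might use rejected_indexes is to separate the offending events from the rest of a batch. This sketch assumes the indexes are zero-based positions in the submitted events array, which the reference above does not state explicitly; split_rejected is a hypothetical helper name.

```python
def split_rejected(events: list[dict], error_body: dict) -> tuple[list[dict], list[dict]]:
    """Return (kept, rejected) based on the 400 response's rejected_indexes.

    Assumption: indexes are zero-based positions in the submitted batch.
    """
    rejected_idx = set(error_body.get("rejected_indexes", []))
    kept = [e for i, e in enumerate(events) if i not in rejected_idx]
    rejected = [e for i, e in enumerate(events) if i in rejected_idx]
    return kept, rejected
```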

Event schema

Field               Type             Required  Description
ts                  ISO 8601 string  yes       UTC timestamp of the LLM call
project             string           yes       Project slug (set via init(project=...))
model               string           yes       Model name (e.g. claude-sonnet-4-5, gpt-4o)
input_tokens        integer          yes       Prompt tokens (incl. cached)
output_tokens       integer          yes       Completion tokens
cost_usd            float            yes       Computed cost in USD (6 decimal places)
latency_ms          integer          no        Round-trip latency, milliseconds
agent_id            string           no        Optional agent role identifier
conversation_id     string           no        Logical conversation grouping
cache_read_tokens   integer          no        Anthropic cache read tokens
cache_write_tokens  integer          no        Anthropic cache write tokens
provider            string           no        anthropic | openai | azure | custom
metadata            object           no        User-defined tags (≤ 8 KB)
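
To catch schema problems before sending, a client could check the required fields locally. The following is a sketch only: validate_event is a hypothetical helper, and the server's actual validation rules and error messages may differ.

```python
from datetime import datetime

# Required fields and their expected Python types, taken from the schema above.
REQUIRED_FIELDS = {
    "ts": str,
    "project": str,
    "model": str,
    "input_tokens": int,
    "output_tokens": int,
    "cost_usd": (int, float),
}


def validate_event(event: dict) -> list[str]:
    """Return a list of problems found in `event`; an empty list means it looks valid."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in event:
            problems.append(f"field '{field}' is required")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(event[field], bool) or not isinstance(event[field], expected):
            problems.append(f"field '{field}' has the wrong type")
    ts = event.get("ts")
    if isinstance(ts, str):
        try:
            # Older Python versions do not accept a trailing "Z", so normalize it.
            datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except ValueError:
            problems.append("field 'ts' is not a valid ISO 8601 timestamp")
    return problems
```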

GET /v1/events (planned, Week 3)

Query historical events with filters: ?project=...&from=...&to=...&model=...
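
Since this endpoint is still planned, the following is only a sketch of how the filter URL might be assembled. It assumes that from and to take ISO 8601 timestamps and that unused filters can be omitted; neither is confirmed above, and events_query_url is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode


def events_query_url(
    base_url: str,
    project: str,
    start: Optional[str] = None,  # sent as "from" (assumed ISO 8601)
    end: Optional[str] = None,    # sent as "to" (assumed ISO 8601)
    model: Optional[str] = None,
) -> str:
    """Build a GET /v1/events URL, skipping filters that were not provided."""
    params = {"project": project, "from": start, "to": end, "model": model}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url}/events?{query}"
```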

GET /v1/insights (planned, Week 4)

Returns AI-generated cost optimization insights for a project.

{
  "project": "my-app",
  "period": "2026-05-07/2026-05-14",
  "total_cost_usd": 124.56,
  "insights": [
    {
      "kind": "model_downgrade",
      "savings_usd_per_month": 89.50,
      "message": "78% of calls use Opus for classification — Haiku would suffice."
    }
  ]
}
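
Given a response shaped like the example above, a client could total the projected savings across insights. This is a sketch against the draft shape of a planned endpoint; total_monthly_savings is a hypothetical helper.

```python
def total_monthly_savings(insights_response: dict) -> float:
    """Sum savings_usd_per_month over all insights, rounded to cents."""
    insights = insights_response.get("insights", [])
    return round(sum(i.get("savings_usd_per_month", 0.0) for i in insights), 2)
```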

Rate limits

Tier        Events / minute  Burst
Free        1,000            5,000
Pro         50,000           200,000
Enterprise  unlimited

429 responses include a Retry-After header.
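
A client that handles 429 responses itself needs to turn the Retry-After value into a wait time. The HTTP specification allows either a number of seconds or an HTTP date, and the text above does not say which form the ACR API sends, so this sketch handles both. retry_after_seconds is a hypothetical helper.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: str, now: Optional[datetime] = None) -> float:
    """Convert a Retry-After header value into a non-negative delay in seconds."""
    value = header_value.strip()
    if value.isdigit():
        # Delay-seconds form, e.g. "30".
        return float(value)
    # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    target = parsedate_to_datetime(value)
    if target.tzinfo is None:
        target = target.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (target - now).total_seconds())
```

An unparseable value raises an exception from parsedate_to_datetime; a caller may prefer to catch it and fall back to its normal backoff delay.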

Errors

Code  Meaning
400   Invalid event payload
401   Missing or invalid API key
403   Project not allowed for this key
413   Batch too large (> 1,000 events or > 5 MB)
429   Rate limit exceeded
5xx   Transient; the SDK retries with exponential backoff

SDK guide

SDK responsibilities

  • Buffer and batch: accumulate events in memory and flush every 5 s or every 100 events, whichever comes first.
  • Retry on 5xx / 429: exponential backoff with jitter, at most 3 attempts.
  • Never block the user's LLM call: instrumentation runs asynchronously, after the response has been returned.
  • Drop on overflow: if the buffer exceeds 10,000 events (for example, because the server is unreachable), log a warning and drop the oldest events.
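
The responsibilities above can be sketched in a few dozen lines. This is an illustrative model, not the SDK's actual implementation: the class and function names are hypothetical, and the backoff base and cap values are assumptions (only the 5 s / 100 event flush triggers, the 10,000 event buffer limit, and the retryable status codes come from the list above).

```python
import logging
import random
import time
from collections import deque

logger = logging.getLogger(__name__)


class EventBuffer:
    """In-memory buffer: flush by size or age, drop the oldest events on overflow."""

    def __init__(self, batch_size=100, flush_interval=5.0, max_events=10_000,
                 clock=time.monotonic):
        self._events = deque()
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._max_events = max_events
        self._clock = clock
        self._last_flush = clock()
        self.dropped = 0

    def add(self, event: dict) -> None:
        if len(self._events) >= self._max_events:
            self._events.popleft()  # drop the oldest event to make room
            self.dropped += 1
            logger.warning("event buffer full; dropped oldest event")
        self._events.append(event)

    def should_flush(self) -> bool:
        if not self._events:
            return False
        waited = self._clock() - self._last_flush
        return len(self._events) >= self._batch_size or waited >= self._flush_interval

    def take_batch(self) -> list[dict]:
        """Remove and return up to batch_size events, oldest first."""
        count = min(self._batch_size, len(self._events))
        batch = [self._events.popleft() for _ in range(count)]
        self._last_flush = self._clock()
        return batch


def is_retryable(status: int) -> bool:
    """429 and 5xx are retried; other 4xx responses are not."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff with full jitter; base and cap are assumed values."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

A background thread or task would poll should_flush(), send take_batch() to POST /v1/events, and on a retryable status sleep for backoff_delay(attempt) before trying again, up to three attempts. Keeping this loop off the caller's thread is what prevents instrumentation from blocking the LLM call.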

Python SDK

Source: sdk/python on GitHub

from acr import init, track

# Auto-instrument
init(api_key="acr_live_...", project="my-app")

# Manual tracking (alternative)
track(
    model="claude-sonnet-4-5",
    input_tokens=1234,
    output_tokens=567,
    cost_usd=0.012345,
    agent_id="researcher",
)
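
With manual tracking the caller supplies cost_usd. A minimal sketch of one way to derive it from token counts follows; estimate_cost_usd is a hypothetical helper, and the per-million-token prices are parameters the caller must fill in from their provider's current price list (none are hard-coded here).

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,   # USD per 1M input tokens (caller-supplied)
    output_price_per_mtok: float,  # USD per 1M output tokens (caller-supplied)
) -> float:
    """Return the call's cost in USD, rounded to 6 decimal places as in the event schema."""
    cost = (
        input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    ) / 1_000_000
    return round(cost, 6)
```

The result can be passed directly as the cost_usd argument of track(). This sketch does not account for cache read or cache write tokens, which providers may price differently.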

JavaScript SDK

Source: sdk/javascript on GitHub

import { ACR } from '@aimayak/acr-sdk';
import Anthropic from '@anthropic-ai/sdk';

const acr = new ACR({
  apiKey: 'acr_live_...',
  project: 'my-app',
  flushInterval: 5000,   // ms
  batchSize: 100,
});

// Auto-instrument
const anthropic = acr.instrument(new Anthropic());

// Manual tracking
acr.track({
  model: 'claude-sonnet-4-5',
  inputTokens: 1234,
  outputTokens: 567,
  costUsd: 0.012345,
  agentId: 'researcher',
});