Documentation
Real-time LLM cost tracking per agent, endpoint, and user. 3-line drop-in SDK.
Overview
Agent Cost Radar (ACR) provides open-source observability and routing recommendations for LLM agents. Drop in our SDK with three lines of code — every Anthropic and OpenAI call is tracked automatically: tokens, cost, latency, and per-agent breakdowns.
- Real-time per-agent cost tracking
- Anthropic + OpenAI auto-instrumentation
- Routing recommendations (Haiku vs Sonnet vs Opus)
- Free tier: 1,000 events/month
Quick start
Python
```python
import acr

acr.init(api_key="acr_live_...", project="my-app")

# any anthropic.messages.create() is now auto-tracked
import anthropic

client = anthropic.Anthropic()
client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)
```
JavaScript
```javascript
import { ACR } from '@aimayak/acr-sdk';
import Anthropic from '@anthropic-ai/sdk';

const acr = new ACR({ apiKey: 'acr_live_...', project: 'my-app' });
const anthropic = acr.instrument(new Anthropic());

await anthropic.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello' }],
});
```
Install the SDK
Python (pip)
```bash
pip install agent-cost-radar
```
JavaScript / TypeScript (npm)
```bash
npm install @aimayak/acr-sdk
# or
pnpm add @aimayak/acr-sdk
```
API Reference
Status: Draft. Subject to change before v1.0 (planned 2026-Q3).
Stable contract: event schema fields below are forward-compatible (new optional fields may be added).
Base URL
https://api.agentcostradar.com/v1
For local development: `http://localhost:8787/v1` (the default `wrangler dev` port for Cloudflare Workers).
Authentication
All requests require an `Authorization: Bearer <api_key>` header.
API keys are prefixed `acr_live_` (production) or `acr_test_` (sandbox).
```http
POST /v1/events HTTP/1.1
Host: api.agentcostradar.com
Authorization: Bearer acr_live_abc123...
Content-Type: application/json
```
POST /v1/events
Ingest one or more usage events from an instrumented LLM client.
Request — single event
```json
{
  "ts": "2026-05-14T19:35:00.123Z",
  "project": "my-app",
  "model": "claude-sonnet-4-5",
  "input_tokens": 1234,
  "output_tokens": 567,
  "cost_usd": 0.012345,
  "latency_ms": 842,
  "agent_id": "researcher-v2",
  "conversation_id": "conv_abc123",
  "metadata": {
    "user_id_hash": "sha256:...",
    "feature": "search"
  }
}
```
Request — batch (preferred; up to 1,000 events)
```json
{
  "events": [
    { "ts": "...", "project": "...", "model": "..." },
    { "ts": "...", "project": "...", "model": "..." }
  ]
}
```
Response — 202 Accepted
```json
{
  "accepted": 1,
  "rejected": 0,
  "request_id": "req_xyz789"
}
```
Response — 400 Bad Request
```json
{
  "error": "invalid_event",
  "details": "field 'model' is required",
  "rejected_indexes": [3]
}
```
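Outside the SDKs, the ingest endpoint can be called directly. A minimal sketch using only the standard library (the URL is the documented base URL; the key and events here are placeholders):

```python
import json
import urllib.request


def send_batch(events, api_key, base_url="https://api.agentcostradar.com/v1"):
    """POST a batch of usage events; returns the (accepted, rejected) counts."""
    if len(events) > 1000:
        # server would answer 413 for oversized batches; fail fast client-side
        raise ValueError("batch limit is 1,000 events")
    req = urllib.request.Request(
        f"{base_url}/events",
        data=json.dumps({"events": events}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        body = json.load(resp)  # expects the 202 body: accepted / rejected / request_id
    return body["accepted"], body["rejected"]
```

In production you would add the retry behavior described under "Errors" below; this sketch omits it for brevity.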
Event schema
| Field | Type | Required | Description |
|---|---|---|---|
| `ts` | ISO 8601 string | yes | UTC timestamp of the LLM call |
| `project` | string | yes | Project slug (set via `init(project=...)`) |
| `model` | string | yes | Model name (e.g. `claude-sonnet-4-5`, `gpt-4o`) |
| `input_tokens` | integer | yes | Prompt tokens (incl. cached) |
| `output_tokens` | integer | yes | Completion tokens |
| `cost_usd` | float | yes | Computed cost in USD (6 decimal places) |
| `latency_ms` | integer | no | Round-trip latency, milliseconds |
| `agent_id` | string | no | Agent role identifier |
| `conversation_id` | string | no | Logical conversation grouping |
| `cache_read_tokens` | integer | no | Anthropic cache read tokens |
| `cache_write_tokens` | integer | no | Anthropic cache write tokens |
| `provider` | string | no | One of `anthropic`, `openai`, `azure`, `custom` |
| `metadata` | object | no | User-defined tags (≤ 8 KB) |
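The required-field rules above can be checked client-side before sending; a minimal sketch (the server remains the source of truth for full validation):

```python
# The six required fields from the event schema table.
REQUIRED = {"ts", "project", "model", "input_tokens", "output_tokens", "cost_usd"}


def validate_event(event):
    """Return a list of problems; an empty list means the event looks ingestible."""
    problems = [f"field '{f}' is required" for f in sorted(REQUIRED - event.keys())]
    for f in ("input_tokens", "output_tokens"):
        if f in event and (not isinstance(event[f], int) or event[f] < 0):
            problems.append(f"field '{f}' must be a non-negative integer")
    return problems


ok = {
    "ts": "2026-05-14T19:35:00Z",
    "project": "my-app",
    "model": "claude-sonnet-4-5",
    "input_tokens": 1234,
    "output_tokens": 567,
    "cost_usd": 0.012345,
}
assert validate_event(ok) == []
```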
GET /v1/events (planned, Week 3)
Query historical events with filters: `?project=...&from=...&to=...&model=...`
GET /v1/insights (planned, Week 4)
Returns AI-generated cost optimization insights for a project.
```json
{
  "project": "my-app",
  "period": "2026-05-07/2026-05-14",
  "total_cost_usd": 124.56,
  "insights": [
    {
      "kind": "model_downgrade",
      "savings_usd_per_month": 89.50,
      "message": "78% of calls use Opus for classification — Haiku would suffice."
    }
  ]
}
```
Rate limits
| Tier | Events / minute | Burst |
|---|---|---|
| Free | 1,000 | 5,000 |
| Pro | 50,000 | 200,000 |
| Enterprise | unlimited | — |
429 responses include a `Retry-After` header.
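A client honoring these limits should prefer the server's `Retry-After` value and otherwise back off exponentially with jitter. A sketch of just the delay calculation (transport and retry loop omitted):

```python
import random


def backoff_delay(attempt, retry_after=None, base=0.5):
    """Seconds to wait before retry `attempt` (0-based).

    A Retry-After header value (seconds, as a string) takes precedence;
    otherwise use exponential backoff plus uniform jitter.
    """
    if retry_after is not None:
        return float(retry_after)
    return base * (2 ** attempt) + random.uniform(0, base)
```

The `base` of 0.5 s is an illustrative choice, not a documented SDK default.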
Errors
| Code | Meaning |
|---|---|
| 400 | Invalid event payload |
| 401 | Missing or invalid API key |
| 403 | Project not allowed for this key |
| 413 | Batch too large (> 1,000 events or > 5 MB) |
| 429 | Rate limit exceeded |
| 5xx | Transient — SDK retries with exponential backoff |
SDK guide
SDK responsibilities
- Buffer & batch — accumulate events in memory, flush every 5 s or 100 events.
- Retry on 5xx / 429 — exponential backoff with jitter, max 3 attempts.
- Never block the user's LLM call — instrumentation runs after response, async.
- Drop on overflow — if buffer > 10K events (server unreachable), log warning and drop oldest.
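The buffering rules above can be sketched as a small in-memory queue. This is an illustrative model of the documented behavior, not the shipped SDK code; the flush transport is passed in as a callable, and a `deque` with `maxlen` implements drop-oldest on overflow (warning log omitted for brevity):

```python
import collections
import time


class EventBuffer:
    """Accumulates events; flushes every 5 s or 100 events; drops oldest past 10K."""

    def __init__(self, flush_fn, batch_size=100, flush_interval=5.0, max_buffer=10_000):
        # deque with maxlen silently discards the oldest item on overflow
        self._events = collections.deque(maxlen=max_buffer)
        self._flush_fn = flush_fn
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._last_flush = time.monotonic()

    def track(self, event):
        self._events.append(event)
        if (len(self._events) >= self._batch_size
                or time.monotonic() - self._last_flush >= self._flush_interval):
            self.flush()

    def flush(self):
        # Drain the buffer in batch_size chunks, one flush_fn call per chunk.
        while self._events:
            n = min(self._batch_size, len(self._events))
            self._flush_fn([self._events.popleft() for _ in range(n)])
        self._last_flush = time.monotonic()
```

A real implementation would call `flush_fn` off the hot path (background task or thread) so instrumentation never blocks the user's LLM call.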
Python SDK
Source: sdk/python on GitHub
```python
from acr import init, track

# Auto-instrument
init(api_key="acr_live_...", project="my-app")

# Manual tracking (alternative)
track(
    model="claude-sonnet-4-5",
    input_tokens=1234,
    output_tokens=567,
    cost_usd=0.012345,
    agent_id="researcher",
)
```
JavaScript SDK
Source: sdk/javascript on GitHub
```javascript
import { ACR } from '@aimayak/acr-sdk';

const acr = new ACR({
  apiKey: 'acr_live_...',
  project: 'my-app',
  flushInterval: 5000, // ms
  batchSize: 100,
});

// Auto-instrument
const anthropic = acr.instrument(new Anthropic());

// Manual tracking
acr.track({
  model: 'claude-sonnet-4-5',
  inputTokens: 1234,
  outputTokens: 567,
  costUsd: 0.012345,
  agentId: 'researcher',
});
```