TokenCurb is an LLM cost monitoring tool that shows teams exactly which feature, user, or agent loop is burning tokens — and how to spend less.

How is TokenCurb different from Helicone or LangSmith?

Helicone focuses on proxy-based observability logging. LangSmith focuses on debugging LangChain traces. TokenCurb is built specifically for cost visibility: per-feature breakdown, spike alerts, and agent loop detection.

Which LLM providers does TokenCurb support?

TokenCurb supports cost tracking for the OpenAI, Anthropic, Google Gemini, and Mistral APIs.

How much can I save with LLM cost visibility?

Teams with per-feature LLM cost visibility typically reduce spend by approximately 35% through model routing, agent loop fixes, and spike detection.

Is there a free LLM cost calculator?

Yes. TokenCurb offers a free LLM cost calculator at https://tokencurb.vercel.app/en/calculator to estimate monthly OpenAI, Anthropic, and Gemini API spend.

How to Track OpenAI API Costs Per Feature in Production

Your OpenAI invoice arrives. $2,400. You pay it. But you still can't answer the most important question: which part of your product burned the tokens?

Was it chat? Embeddings? That agent loop you shipped last sprint? The invoice doesn't say. And by the time finance flags the spike, the damage is already done.

Why monthly totals aren't enough

Most teams track LLM spend one of three ways:

Provider dashboard — shows total usage, not per-feature
Spreadsheet — manual, always outdated
Nothing — surprisingly common for teams under $5k/mo

None of these tell you that your agent endpoint consumed 52% of your budget while chat only used 28%. That visibility is what enables real optimization.

Step 1: Tag every API call

Add metadata to every LLM request before it leaves your app:

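import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// `messages` and `userId` come from your own request handler.
// The second argument carries per-request options; custom headers are mainly
// useful when a logging proxy or gateway sits between your app and the provider.
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages,
}, {
  headers: {
    "X-Feature": "chat",   // which product feature made the call
    "X-User-Id": userId,   // who triggered it
  },
});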

Step 2: Log input + output tokens per call

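// `log` and `calculateCost` are your own helpers; they are not part of the OpenAI SDK.
const { usage } = response;
log({
  feature: "chat",
  model: "gpt-4o",
  input_tokens: usage.prompt_tokens,
  output_tokens: usage.completion_tokens,
  cost: calculateCost(usage, "gpt-4o"), // USD for this call
  timestamp: Date.now(), // Unix time in milliseconds
});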
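
The logging call above relies on a calculateCost helper that the snippet does not define. Here is a minimal sketch of one, assuming a hand-maintained price table in USD per 1 million tokens. The table name PRICES_PER_MILLION and the rates in it are illustrative placeholders, not authoritative pricing; check your provider's current price list before relying on them.

```javascript
// Hypothetical price table in USD per 1 million tokens.
// These figures are placeholders; replace them with your provider's current rates.
const PRICES_PER_MILLION = {
  "gpt-4o": { input: 2.5, output: 10 },
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
};

// Returns the USD cost of one call, given the `usage` object from the response.
function calculateCost(usage, model) {
  const price = PRICES_PER_MILLION[model];
  if (!price) {
    // Fail loudly so an unpriced model never gets logged as free.
    throw new Error(`No price configured for model: ${model}`);
  }
  const inputCost = (usage.prompt_tokens / 1_000_000) * price.input;
  const outputCost = (usage.completion_tokens / 1_000_000) * price.output;
  return inputCost + outputCost;
}

// Example: 1,200 prompt tokens and 300 completion tokens on gpt-4o.
const exampleCost = calculateCost(
  { prompt_tokens: 1200, completion_tokens: 300 },
  "gpt-4o"
);
console.log(exampleCost.toFixed(4));
```

Throwing on an unknown model is a deliberate choice in this sketch: a missing price silently logged as zero would hide exactly the kind of spend you are trying to see.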

Step 3: Aggregate by feature daily

Roll up logs into a daily view: feature → total tokens → total cost. This is the dashboard your engineering lead actually needs.
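
One way to sketch that rollup, assuming your logs are available as an in-memory array of records shaped like the Step 2 log entry. The function name aggregateDaily and the sample records are made up for illustration; in production this would more likely be a query against your log store.

```javascript
// Groups log records by UTC day and feature, summing tokens and cost.
// Record shape follows Step 2: { feature, input_tokens, output_tokens, cost, timestamp }.
function aggregateDaily(records) {
  const totals = new Map();
  for (const r of records) {
    const day = new Date(r.timestamp).toISOString().slice(0, 10); // "YYYY-MM-DD" in UTC
    const key = `${day}|${r.feature}`;
    const row = totals.get(key) ?? { day, feature: r.feature, tokens: 0, cost: 0 };
    row.tokens += r.input_tokens + r.output_tokens;
    row.cost += r.cost;
    totals.set(key, row);
  }
  return [...totals.values()];
}

// Made-up sample records for illustration only.
const sampleLogs = [
  { feature: "chat", input_tokens: 1000, output_tokens: 200, cost: 0.5, timestamp: Date.UTC(2025, 0, 15, 9) },
  { feature: "chat", input_tokens: 500, output_tokens: 100, cost: 0.25, timestamp: Date.UTC(2025, 0, 15, 17) },
  { feature: "agent", input_tokens: 4000, output_tokens: 900, cost: 2, timestamp: Date.UTC(2025, 0, 15, 12) },
];

const daily = aggregateDaily(sampleLogs);
console.table(daily);
```

Days are bucketed in UTC here to keep the example deterministic; if your team reads the dashboard in local time, bucket by that timezone instead.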

Step 4: Set spike alerts

Define thresholds per feature. If chat normally costs $40/day and suddenly hits $400, you want a Slack alert today — not on next month's invoice.
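
A threshold check like this can be sketched in a few lines. The names DAILY_THRESHOLDS and findSpikes are hypothetical, the dollar limits are examples, and the actual Slack delivery is left as a comment because it depends on your own webhook setup.

```javascript
// Hypothetical per-feature daily limits in USD; tune these to your own baselines.
const DAILY_THRESHOLDS = { chat: 100, agent: 250 };

// Returns one entry for every feature whose daily cost exceeds its threshold.
// Features without a configured threshold are skipped.
function findSpikes(dailyCosts, thresholds) {
  const spikes = [];
  for (const [feature, cost] of Object.entries(dailyCosts)) {
    const limit = thresholds[feature];
    if (limit !== undefined && cost > limit) {
      spikes.push({ feature, cost, limit });
    }
  }
  return spikes;
}

// Example: chat jumps to $400 while agent stays within its limit.
const spikes = findSpikes({ chat: 400, agent: 120 }, DAILY_THRESHOLDS);
for (const { feature, cost, limit } of spikes) {
  // Replace console.log with a post to your Slack incoming webhook or pager.
  console.log(`${feature} cost $${cost.toFixed(2)} today, above the $${limit} threshold`);
}
```

Running the check on the daily rollup catches a spike within a day; running it hourly against a prorated limit would catch it sooner at the cost of more noise.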

The agent loop problem

Agent chains are the silent budget killer. A loop that retries 12 times burns roughly 12× the expected tokens, and more if every retry resends a growing context. Flag any agent call whose token count exceeds 3× your rolling average.

Build vs. buy

You can build the four steps above yourself. Tools like TokenCurb, Helicone, and LangSmith solve this out of the box, each with different strengths.

Or use our free LLM Cost Calculator to estimate your monthly spend in seconds.

What to do this week

Pick one feature and add token logging today
Check last month's invoice against your logs
Set one alert for your highest-cost endpoint
Review agent loops for retry patterns

TokenCurb does all of this automatically — per-feature breakdown, spike alerts, and agent loop detection.
