Budget Sentinel

Prevents surprise billing by tracking token usage and estimated cost in real time, with configurable hard limits and a warning callback.

const MAX_TOKENS = 2000;

const guard = new Guardian({
  budget: {
    model:       'gpt-4o-mini',
    maxTokens:   MAX_TOKENS,
    maxCostUSD:  0.05,
    // Called when usage > 80% of limit
    onWarning:   (usage) => console.warn(`Budget at ${Math.round(usage.totalTokens / MAX_TOKENS * 100)}%`),
  },
});

const result = await guard.protect(callFn, prompt);
console.log(result.meta.budget);
// { inputTokens: 312, outputTokens: 89, totalTokens: 401, estimatedCostUSD: 0.000100, model: 'gpt-4o-mini' }
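
To make the warning behaviour concrete, here is a minimal sketch of how an 80% threshold check could work. The Usage and Limits shapes, budgetFraction, shouldWarn, and the rule of taking whichever limit is closer to being hit are all assumptions for illustration, not the library's actual implementation.

```typescript
// Hypothetical shapes for illustration; not the library's actual types.
interface Usage {
  totalTokens: number;
  estimatedCostUSD: number;
}

interface Limits {
  maxTokens: number;
  maxCostUSD: number;
}

// Fraction of the budget consumed, using whichever limit is closer to being hit.
function budgetFraction(usage: Usage, limits: Limits): number {
  return Math.max(
    usage.totalTokens / limits.maxTokens,
    usage.estimatedCostUSD / limits.maxCostUSD,
  );
}

// Assumed threshold, taken from the "80% of limit" comment in the example.
const WARNING_THRESHOLD = 0.8;

function shouldWarn(usage: Usage, limits: Limits): boolean {
  return budgetFraction(usage, limits) > WARNING_THRESHOLD;
}

const limits: Limits = { maxTokens: 2000, maxCostUSD: 0.05 };
console.log(shouldWarn({ totalTokens: 401, estimatedCostUSD: 0.0001 }, limits));  // false
console.log(shouldWarn({ totalTokens: 1700, estimatedCostUSD: 0.0004 }, limits)); // true
```

Checking both limits in one fraction means a single callback can cover token and cost budgets; a real implementation may report them separately.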

Supported Models & Pricing (per 1M tokens)

Model                        Input     Output
gpt-4o                       $2.50     $10.00
gpt-4o-mini                  $0.15     $0.60
gpt-4-turbo                  $10.00    $30.00
gpt-3.5-turbo                $0.50     $1.50
claude-3-5-sonnet-20241022   $3.00     $15.00
claude-3-5-haiku-20241022    $0.80     $4.00
claude-3-opus-20240229       $15.00    $75.00
gemini-1.5-pro               $1.25     $5.00
gemini-1.5-flash             $0.075    $0.30
gemini-2.0-flash             $0.10     $0.40
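
As a worked example of how the table translates into a cost estimate, the sketch below multiplies each token count by its per-1M-token price and sums the two. PRICING and estimateCostUSD are hypothetical names used only here, with two rows copied from the table; the library's own calculation may differ in rounding or model lookup.

```typescript
// Prices in USD per 1M tokens, copied from the table above (subset).
const PRICING: Record<string, { input: number; output: number }> = {
  'gpt-4o-mini': { input: 0.15, output: 0.60 },
  'claude-3-5-sonnet-20241022': { input: 3.00, output: 15.00 },
};

// Hypothetical helper for illustration; not the library's implementation.
function estimateCostUSD(inputTokens: number, outputTokens: number, model: string): number {
  const price = PRICING[model];
  if (price === undefined) {
    throw new Error(`No pricing entry for model: ${model}`);
  }
  // Input and output tokens are billed at different rates.
  return (inputTokens * price.input + outputTokens * price.output) / 1e6;
}

console.log(estimateCostUSD(1000, 500, 'claude-3-5-sonnet-20241022').toFixed(4)); // "0.0105"
console.log(estimateCostUSD(312, 89, 'gpt-4o-mini').toFixed(6));                  // "0.000100"
```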

Standalone Usage

import { buildUsage, calculateCost } from '@edwinfom/ai-guard/budget';

const usage = buildUsage(312, 89, 'gpt-4o-mini');
// { inputTokens: 312, outputTokens: 89, totalTokens: 401, estimatedCostUSD: 0.000100 }

const cost = calculateCost(1000, 500, 'claude-3-5-sonnet-20241022');
// { estimatedCostUSD: 0.0105 }

Error Handling

When a hard limit is exceeded, a BudgetError is thrown:

import { BudgetError } from '@edwinfom/ai-guard';

// Inside a request handler that returns a Response
try {
  await guard.protect(callFn, longPrompt);
} catch (err) {
  if (err instanceof BudgetError) {
    console.log(err.code);    // 'BUDGET_EXCEEDED'
    console.log(err.context); // { totalTokens: 2001, maxTokens: 2000, ... }
    return Response.json({ error: 'Service temporarily limited.' }, { status: 429 });
  }
  throw err; // Not a budget error: let it propagate
}
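
For readers who want to see the hard-limit pattern end to end, the following sketch shows one way a limit check could throw an error carrying a code and context, and how a caller could map it to an HTTP status. SketchBudgetError, BudgetContext, enforceTokenLimit, and statusFor are hypothetical stand-ins; the real BudgetError may carry more fields and be thrown at a different point.

```typescript
// Hypothetical context shape, modelled on the err.context comment in the example.
interface BudgetContext {
  totalTokens: number;
  maxTokens: number;
}

// Hypothetical stand-in for the library's BudgetError, for illustration only.
class SketchBudgetError extends Error {
  readonly code = 'BUDGET_EXCEEDED';
  readonly context: BudgetContext;

  constructor(context: BudgetContext) {
    super(`Token budget exceeded: ${context.totalTokens} > ${context.maxTokens}`);
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = 'SketchBudgetError';
    this.context = context;
  }
}

// Throws only when usage is strictly above the limit.
function enforceTokenLimit(totalTokens: number, maxTokens: number): void {
  if (totalTokens > maxTokens) {
    throw new SketchBudgetError({ totalTokens, maxTokens });
  }
}

// Budget errors map to 429; anything else is treated as a server error.
function statusFor(err: unknown): number {
  return err instanceof SketchBudgetError ? 429 : 500;
}

try {
  enforceTokenLimit(2001, 2000);
} catch (err) {
  console.log(statusFor(err)); // 429
}
```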