
Budget Sentinel

Budget Sentinel prevents surprise bills by tracking token usage and cost in real time, with configurable hard limits and warning callbacks.

const guard = new Guardian({
  budget: {
    model:       'gpt-4o-mini',
    maxTokens:   2000,
    maxCostUSD:  0.05,
    // Called when usage exceeds 80% of the limit
    onWarning:   (usage) => console.warn(`Budget at ${Math.round(usage.totalTokens / 2000 * 100)}%`), // 2000 = maxTokens
  },
});
 
const result = await guard.protect(callFn, prompt);
console.log(result.meta.budget);
// { inputTokens: 312, outputTokens: 89, totalTokens: 401, estimatedCostUSD: 0.0001002, model: 'gpt-4o-mini' }
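
The warning and hard-limit behaviour described above can be pictured with a small standalone sketch. It is not the library's implementation: the names `checkBudget`, `usageRatio`, and `WARNING_RATIO` are hypothetical, and the sketch assumes that the stricter of the token and cost limits decides the outcome and that the warning threshold is the 80% mentioned in the comment.

```typescript
// Hypothetical sketch: none of these names come from @edwinfom/ai-guard.
type BudgetStatus = 'ok' | 'warning' | 'exceeded';

interface Limits {
  maxTokens?: number;
  maxCostUSD?: number;
}

interface Usage {
  totalTokens: number;
  estimatedCostUSD: number;
}

// Assumption: warnings fire above 80% of a limit.
const WARNING_RATIO = 0.8;

// Fraction of the budget consumed, taking the larger of the token and cost ratios.
function usageRatio(usage: Usage, limits: Limits): number {
  const ratios: number[] = [];
  if (limits.maxTokens !== undefined) {
    ratios.push(usage.totalTokens / limits.maxTokens);
  }
  if (limits.maxCostUSD !== undefined) {
    ratios.push(usage.estimatedCostUSD / limits.maxCostUSD);
  }
  return ratios.length > 0 ? Math.max(...ratios) : 0;
}

// Classify usage: past a limit, past the warning threshold, or fine.
function checkBudget(usage: Usage, limits: Limits): BudgetStatus {
  const ratio = usageRatio(usage, limits);
  if (ratio > 1) return 'exceeded';
  if (ratio > WARNING_RATIO) return 'warning';
  return 'ok';
}
```

Under these assumptions, 1,700 tokens against a 2,000-token limit is a ratio of 0.85, which would trigger the warning callback but not the hard limit.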

Supported Models & Pricing (per 1M tokens)

Model	Input	Output
`gpt-4o`	$2.50	$10.00
`gpt-4o-mini`	$0.15	$0.60
`gpt-4-turbo`	$10.00	$30.00
`gpt-3.5-turbo`	$0.50	$1.50
`claude-3-5-sonnet-20241022`	$3.00	$15.00
`claude-3-5-haiku-20241022`	$0.80	$4.00
`claude-3-opus-20240229`	$15.00	$75.00
`gemini-1.5-pro`	$1.25	$5.00
`gemini-1.5-flash`	$0.075	$0.30
`gemini-2.0-flash`	$0.10	$0.40
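
Because prices are quoted per 1M tokens, an estimate is the input and output token counts weighted by their respective prices and divided by 1,000,000. The sketch below illustrates that arithmetic with two rows copied from the table; `PRICES` and `estimateCostUSD` are hypothetical names for illustration, not the package's exports.

```typescript
// Hypothetical sketch of the per-1M-token arithmetic; not the library's code.
interface Price {
  inputPerMTok: number;  // USD per 1M input tokens
  outputPerMTok: number; // USD per 1M output tokens
}

// Two rows copied from the pricing table above.
const PRICES: Record<string, Price> = {
  'gpt-4o-mini': { inputPerMTok: 0.15, outputPerMTok: 0.60 },
  'claude-3-5-sonnet-20241022': { inputPerMTok: 3.00, outputPerMTok: 15.00 },
};

function estimateCostUSD(inputTokens: number, outputTokens: number, model: string): number {
  const price = PRICES[model];
  if (price === undefined) {
    throw new Error(`No pricing entry for model: ${model}`);
  }
  // Input and output tokens are billed at different rates.
  return (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1_000_000;
}
```

For example, 1,000 input and 500 output tokens on `claude-3-5-sonnet-20241022` come to (1000 × 3.00 + 500 × 15.00) / 1,000,000 = $0.0105.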

Standalone Usage

import { buildUsage, calculateCost } from '@edwinfom/ai-guard/budget';
 
const usage = buildUsage(312, 89, 'gpt-4o-mini');
// { inputTokens: 312, outputTokens: 89, totalTokens: 401, estimatedCostUSD: 0.0001002 }
 
const cost = calculateCost(1000, 500, 'claude-3-5-sonnet-20241022');
// { estimatedCostUSD: 0.0105 }

Error Handling

When a hard limit is exceeded, `guard.protect` throws a `BudgetError`:

import { BudgetError } from '@edwinfom/ai-guard';
 
try {
  await guard.protect(callFn, longPrompt);
} catch (err) {
  if (err instanceof BudgetError) {
    console.log(err.code);    // 'BUDGET_EXCEEDED'
    console.log(err.context); // { totalTokens: 2001, maxTokens: 2000, ... }
    return Response.json({ error: 'Service temporarily limited.' }, { status: 429 });
  }
  throw err; // Re-throw anything that is not a budget error
}
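
To keep route handlers small, the mapping from a budget error to an HTTP response can be pulled into a helper. The sketch below is self-contained, so it uses a stand-in class, `BudgetErrorStandIn`, that only mirrors the `code` and `context` fields shown above; the real `BudgetError` may carry more. The helper name `toHttpResult` is also hypothetical.

```typescript
// Hypothetical stand-in for the library's BudgetError, for illustration only.
class BudgetErrorStandIn extends Error {
  readonly code = 'BUDGET_EXCEEDED';

  constructor(readonly context: { totalTokens: number; maxTokens: number }) {
    super(`Token budget exceeded: ${context.totalTokens} > ${context.maxTokens}`);
    this.name = 'BudgetErrorStandIn';
    // Keeps instanceof working when compiling to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

interface HttpResult {
  status: number;
  body: { error: string };
}

// Translate budget errors into a 429 payload; let every other error propagate.
function toHttpResult(err: unknown): HttpResult {
  if (err instanceof BudgetErrorStandIn) {
    return { status: 429, body: { error: 'Service temporarily limited.' } };
  }
  throw err;
}
```

Re-throwing unrelated errors keeps genuine failures, such as network faults, from being reported to the client as a rate limit.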