Apply all Guard protections to streamed AI responses using protectStream().

Streaming Support

protectStream() wraps streaming AI calls with full Guard protection. It buffers the stream, applies all guards to the complete response, then re-streams to the client if everything passes.

import { Guardian } from '@edwinfom/ai-guard';
import OpenAI from 'openai';
 
const openai = new OpenAI();
const guard = new Guardian({
  pii:       { targets: ['email', 'phone'], onOutput: true },
  injection: { enabled: true },
  budget:    { model: 'gpt-4o-mini', maxCostUSD: 0.05 },
});
 
// Next.js Route Handler
export async function POST(req: Request) {
  const { prompt } = await req.json();
 
  const stream = await guard.protectStream(
    (safePrompt) =>
      openai.chat.completions.create({
        model:  'gpt-4o-mini',
        stream: true,
        messages: [{ role: 'user', content: safePrompt }],
      }),
    prompt
  );
 
  return new Response(stream, {
    headers: { 'Content-Type': 'text/event-stream' },
  });
}

How It Works

User prompt → [Guard input checks] → LLM (stream)
                                         ↓
                                   Buffer chunks
                                         ↓
                                [Guard output checks]
                                         ↓
                               ✅ Re-stream to client
                               ❌ Block & throw error
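
The buffer-then-re-stream step above can be sketched as a small async-generator helper. This is an illustrative stand-in, not the library's actual implementation; `outputCheck` is a hypothetical placeholder for whatever output guards you have configured.

```typescript
// Sketch of the buffer-then-re-stream pattern (not the library's real code).
// `outputCheck` stands in for Guard's output checks and throws to block.
async function* bufferThenRestream(
  source: AsyncIterable<string>,
  outputCheck: (full: string) => void,
): AsyncGenerator<string> {
  const chunks: string[] = [];
  for await (const chunk of source) {
    chunks.push(chunk); // buffer every chunk; nothing reaches the client yet
  }
  outputCheck(chunks.join('')); // run output checks on the complete response
  yield* chunks; // all checks passed: replay the chunks to the client
}
```

Because nothing is yielded before `outputCheck` runs, a blocked stream fails before the client has seen any content, which is what makes post-hoc blocking safe.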

With Vercel AI SDK

import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { Guardian } from '@edwinfom/ai-guard';
import { guardVercelStream } from '@edwinfom/ai-guard/vercel';
 
const guard = new Guardian({ /* same config as above */ });
 
export async function POST(req: Request) {
  const { messages } = await req.json();
  const lastMessage = messages[messages.length - 1].content;
 
  const stream = await guard.protectStream(
    (safePrompt) => streamText({
      model:    openai('gpt-4o-mini'),
      messages: [{ role: 'user', content: safePrompt }],
    }),
    lastMessage
  );
 
  return guardVercelStream(stream);
}

Streaming Events

Listen to guard events during streaming:

const stream = await guard.protectStream(callFn, prompt, {
  onChunk: (chunk) => {
    // Called for each streamed chunk (after output checks pass)
  },
  onComplete: (result) => {
    console.log(result.meta);  // Full guard metadata after stream ends
  },
  onError: (err) => {
    // Guard blocked the stream
  },
});

Limitations

Streaming requires buffering the full response before re-streaming, so the client's time-to-first-token becomes the model's time-to-last-token. The latency overhead equals the model's full generation time, not a small constant. Because the latency profile matches a non-streaming call, latency-critical use cases can use protect() instead, which applies the same guards without the streaming plumbing.
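
The latency cost can be seen with a toy timing experiment. Everything below is illustrative and not part of the library: a simulated model that emits tokens on a delay, and a minimal buffering wrapper.

```typescript
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Simulated model that emits three tokens, 30 ms apart.
async function* slowModel(): AsyncGenerator<string> {
  for (const tok of ['a', 'b', 'c']) {
    await sleep(30);
    yield tok;
  }
}

// Buffered variant: nothing is yielded until the source is exhausted.
async function* buffered(source: AsyncIterable<string>): AsyncGenerator<string> {
  const all: string[] = [];
  for await (const c of source) all.push(c);
  yield* all;
}

async function timeToFirstChunk(stream: AsyncIterable<string>): Promise<number> {
  const start = Date.now();
  for await (const _ of stream) break; // stop at the first chunk
  return Date.now() - start;
}
```

Reading `slowModel()` directly yields a first chunk after roughly one token delay; reading it through `buffered()` yields nothing until all tokens have been generated.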

PII redaction on output is especially powerful with streaming because the user never sees the raw response — only the redacted version.
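
One reason buffering matters for redaction: PII can be split across chunk boundaries, so scanning chunks individually can miss it. A minimal illustration with a hypothetical email regex (a stand-in, not the library's actual PII detector):

```typescript
// Hypothetical email redactor, standing in for Guard's PII detection.
function redactEmails(text: string): string {
  return text.replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, '[EMAIL]');
}

const chunks = ['Contact me at user@', 'example.com today'];

// Chunk-by-chunk scanning misses the address: no single chunk contains it.
chunks.map(redactEmails).join('');  // 'Contact me at user@example.com today'

// Scanning the buffered response catches it.
redactEmails(chunks.join(''));      // 'Contact me at [EMAIL] today'
```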