
AI Chatbot Moderation API

Stop unsafe prompts before they reach your model, and unsafe completions before users see them.

Pre-screen prompts and post-screen completions in under 500ms.

What it detects

  • Prompt injection attempts
  • Jailbreak prompts
  • NSFW user inputs
  • Unsafe LLM outputs
  • PII in completions
  • Custom rules
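
For illustration, here is a minimal sketch of how per-category scores for these checks could be handled client-side. The result shape, category keys, and 0.8 cutoff are assumptions for this sketch, not documented SDK fields.

typescript
// Hypothetical result shape covering the detections listed above.
// Category keys and `action` values are assumptions, not confirmed fields.
interface ModerationResult {
  safe: boolean;
  action: 'allow' | 'block' | 'flag';
  categories: Record<string, number>; // e.g. prompt_injection, jailbreak, nsfw, pii
}

// Treat a high-confidence injection or jailbreak score as a hard block,
// even when the overall verdict is borderline.
function shouldBlock(result: ModerationResult): boolean {
  const hardBlockCategories = ['prompt_injection', 'jailbreak'];
  return (
    !result.safe ||
    hardBlockCategories.some((c) => (result.categories[c] ?? 0) > 0.8)
  );
}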

Why developers choose Vettly

  • Streaming API for sub-500ms decisions
  • Pre-built injection and jailbreak policies
  • Same API works on inputs and outputs (see the sketch after the example response)
  • Audit trails for every blocked completion

Example request
typescript
import { createStreamingClient } from '@vettly/sdk';

const streaming = createStreamingClient('YOUR_KEY');

// Open a realtime connection; onResult fires for every moderated message.
const ws = streaming.connectRealtime({
  policyId: 'chat-policy',
  onResult: (result) => {
    if (result.safe) showMessage(result);
    else logBlocked(result);
  }
});

await ws.connect();

// `message` is the incoming chat text to screen against the policy.
const result = await ws.moderate(message);

Example response
json
{
  "safe": true,
  "action": "allow",
  "categories": {
    "harassment": 0.02,
    "spam": 0.01
  },
  "latency_ms": 47
}
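
To show the same API covering inputs and outputs, here is a hedged sketch that pre-screens a user message and post-screens the model's completion over the connection from the request example. callModel stands in for your own LLM call; showMessage and logBlocked are the same placeholder handlers used above.

typescript
// Pre-screen the user's message (as in the request example) before it
// reaches the model.
const promptResult = await ws.moderate(message);
if (!promptResult.safe) {
  logBlocked(promptResult);
} else {
  // callModel is a placeholder for your own LLM call, not an SDK function.
  const completion = await callModel(message);

  // Post-screen the completion over the same connection and policy.
  const outputResult = await ws.moderate(completion);
  if (outputResult.safe) showMessage(completion);
  else logBlocked(outputResult);
}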

Compared to a regex blocklist

AI evaluators catch novel attack phrasing that wordlists miss, with confidence scores you can tune.
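
As a sketch of what tuning those confidence scores could look like in practice, the cutoff values below are illustrative choices for this example, not Vettly defaults.

typescript
// Illustrative per-category cutoffs; tune these to your own risk tolerance.
// The values are assumptions for this sketch, not Vettly defaults.
const CUTOFFS: Record<string, number> = {
  harassment: 0.5,
  spam: 0.7,
};

// Allow a message only when every reported score stays under its cutoff.
function passesCutoffs(categories: Record<string, number>): boolean {
  return Object.entries(categories).every(
    ([category, score]) => score < (CUTOFFS[category] ?? 0.5)
  );
}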

Read the prompt injection guide

Get an API key

Start making moderation decisions in minutes on a Developer plan, with clear upgrade paths as you scale.
