
AI Chatbot Moderation API

Stop unsafe prompts before they reach your model, and unsafe completions before users see them.

Pre-screen prompts and post-screen completions in under 500ms.

What it detects

  • Prompt injection attempts
  • Jailbreak prompts
  • NSFW user inputs
  • Unsafe LLM outputs
  • PII in completions
  • Custom rules
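
For illustration, here is a minimal sketch of how per-category scores for these checks could be handled client-side. The result shape, category keys, and 0.8 cutoff are assumptions for this sketch, not documented SDK fields.

typescript
// Hypothetical result shape covering the detections listed above.
// Category keys and `action` values are assumptions, not confirmed fields.
interface ModerationResult {
  safe: boolean;
  action: 'allow' | 'block' | 'flag';
  categories: Record<string, number>; // e.g. prompt_injection, jailbreak, nsfw, pii
}

// Treat a high-confidence injection or jailbreak score as a hard block,
// even when the overall verdict is borderline.
function shouldBlock(result: ModerationResult): boolean {
  const hardBlockCategories = ['prompt_injection', 'jailbreak'];
  return (
    !result.safe ||
    hardBlockCategories.some((c) => (result.categories[c] ?? 0) > 0.8)
  );
}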

Why developers choose Vettly

  • Streaming API for sub-500ms decisions
  • Pre-built injection and jailbreak policies
  • Same API works on inputs and outputs (see the sketch after the example response)
  • Audit trails for every blocked completion

Example request
typescript
import { createStreamingClient } from '@vettly/sdk';

const streaming = createStreamingClient('YOUR_KEY');

// Open a realtime connection; onResult fires for every moderated message.
const ws = streaming.connectRealtime({
  policyId: 'chat-policy',
  onResult: (result) => {
    if (result.safe) showMessage(result);
    else logBlocked(result);
  }
});

await ws.connect();

// `message` is the incoming chat text to screen against the policy.
const result = await ws.moderate(message);

Example response
json
{
  "safe": true,
  "action": "allow",
  "categories": {
    "harassment": 0.02,
    "spam": 0.01
  },
  "latency_ms": 47
}
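
To show the same API covering inputs and outputs, here is a hedged sketch that pre-screens a user message and post-screens the model's completion over the connection from the request example. callModel stands in for your own LLM call; showMessage and logBlocked are the same placeholder handlers used above.

typescript
// Pre-screen the user's message (as in the request example) before it
// reaches the model.
const promptResult = await ws.moderate(message);
if (!promptResult.safe) {
  logBlocked(promptResult);
} else {
  // callModel is a placeholder for your own LLM call, not an SDK function.
  const completion = await callModel(message);

  // Post-screen the completion over the same connection and policy.
  const outputResult = await ws.moderate(completion);
  if (outputResult.safe) showMessage(completion);
  else logBlocked(outputResult);
}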

Compared to a regex blocklist

AI evaluators catch novel attack phrasing that wordlists miss, with confidence scores you can tune.
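
As a sketch of what tuning those confidence scores could look like in practice, the cutoff values below are illustrative choices for this example, not Vettly defaults.

typescript
// Illustrative per-category cutoffs; tune these to your own risk tolerance.
// The values are assumptions for this sketch, not Vettly defaults.
const CUTOFFS: Record<string, number> = {
  harassment: 0.5,
  spam: 0.7,
};

// Allow a message only when every reported score stays under its cutoff.
function passesCutoffs(categories: Record<string, number>): boolean {
  return Object.entries(categories).every(
    ([category, score]) => score < (CUTOFFS[category] ?? 0.5)
  );
}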

Read the prompt injection guide

Get an API key

Start making moderation decisions in minutes on a Developer plan, with clear upgrade paths as you scale.
