Performance

Real-Time Moderation Latency Guide

Keep every moderation decision under 500ms.

Streaming endpoints, pre-warming, and async persistence.

What it moderates

  • Live chat messages
  • AI chatbot completions
  • Game chat traffic
  • Streaming comments
  • Real-time DMs
  • Custom rules

Why developers choose Vettly

  • Streaming endpoint with sub-500ms target
  • Provider pre-warming and warm pools
  • Async persistence to avoid blocking writes
  • Edge-compatible SDK
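The async-persistence point above can be sketched as follows: return the moderation decision to the caller immediately and queue the audit record for a background write. All names here (`decideAndPersist`, `drainQueue`, the in-memory queue) are illustrative, not part of the Vettly SDK.

```typescript
// Sketch: keep the write off the hot path so it never adds to decision latency.
type Decision = { safe: boolean; action: string; latency_ms: number };

const auditQueue: Decision[] = [];

// Respond right away; persistence happens later, out of band.
function decideAndPersist(decision: Decision): Decision {
  auditQueue.push(decision); // enqueue; do not await the write
  return decision;           // caller gets the decision immediately
}

// Background "writer" that drains the queue; in production this would
// batch-write to a database or log pipeline instead of returning a count.
async function drainQueue(): Promise<number> {
  const written = auditQueue.splice(0, auditQueue.length);
  return written.length;
}

const d = decideAndPersist({ safe: true, action: 'allow', latency_ms: 47 });
```

The design choice is simply that the user-facing path does no I/O beyond the moderation call itself; durability is traded for latency and recovered by the background writer.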

Example request
typescript
import { createStreamingClient } from '@vettly/sdk';

// Open a persistent streaming connection once, then reuse it for every
// message to avoid per-request connection overhead.
const streaming = createStreamingClient('YOUR_KEY');
const ws = streaming.connectRealtime({
  policyId: 'chat-policy',
  onResult: (result) => {
    if (result.safe) showMessage(result);
    else logBlocked(result);
  }
});

await ws.connect();
const result = await ws.moderate(message);
Example response
json
{
  "safe": true,
  "action": "allow",
  "categories": {
    "harassment": 0.02,
    "spam": 0.01
  },
  "latency_ms": 47
}
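A response shaped like the one above can be consumed with a simple threshold check. This is a minimal sketch: the `ModerationResult` interface mirrors the sample response, and the 0.5 cutoff is an assumed value, not a documented Vettly default.

```typescript
// Illustrative handler for the response shape shown above.
interface ModerationResult {
  safe: boolean;
  action: string;
  categories: Record<string, number>;
  latency_ms: number;
}

// The 0.5 threshold is an assumption for this sketch, not an SDK default.
function shouldShow(result: ModerationResult, threshold = 0.5): boolean {
  if (!result.safe || result.action !== 'allow') return false;
  // Belt-and-suspenders: also verify no category score crosses the threshold.
  return Object.values(result.categories).every((score) => score < threshold);
}

const sample: ModerationResult = {
  safe: true,
  action: 'allow',
  categories: { harassment: 0.02, spam: 0.01 },
  latency_ms: 47,
};
// shouldShow(sample) → true
```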

Compared to standard moderation APIs

Most moderation APIs are batch-oriented: they optimize for throughput, not per-message latency. Vettly is built for production traffic where latency is part of the user experience.

Get an API key

Start making decisions in minutes with a Developer plan and clear upgrade paths.
