Moderation Policies as Code: Managing Content Rules with YAML

February 15, 2026 · 8 min read

Most content moderation systems bury their rules inside model configurations, admin dashboards, or hardcoded if-else chains. When a policy changes — a new category needs to be blocked, a threshold needs adjustment, a regulation requires stricter rules — someone edits a config, deploys a new build, and hopes nothing breaks.

There's a better way: treat moderation policies like code. Define them in YAML, version them in Git, review changes in pull requests, and deploy them through your existing CI/CD pipeline. The Policies documentation covers the full schema reference; this post focuses on the workflow.

Why Policies as Code?

Traceable: every policy change has a commit hash, an author, a timestamp, and a review. When a regulator asks "why was this content blocked?", you can point to the exact policy version.

Reviewable: policy changes go through pull requests. Engineers, legal, and trust & safety review the same diff. No more "someone changed a setting in the dashboard and we're not sure when."

Testable: write tests against your policies. "Given this input, the policy should return block." Run tests in CI before deploying.

Rollbackable: if a policy change causes problems (too many false positives, too permissive), revert the commit and redeploy.

Policy Structure

A Vettly policy is a YAML file that defines categories, thresholds, and actions:

policies/community-safe.yaml (YAML)

name: community-safe
version: "3"
description: Standard community safety policy for UGC

categories:
  hate_speech:
    action: block
    threshold: 0.7

  harassment:
    action: block
    threshold: 0.8

  nudity:
    action: block
    threshold: 0.6

  violence:
    action: flag
    threshold: 0.7

  self_harm:
    action: block
    threshold: 0.5

  spam:
    action: flag
    threshold: 0.8

  pii:
    action: flag
    threshold: 0.9

defaults:
  action: allow

Each category has an action (allow, flag, or block) and a threshold: the minimum confidence score at which that action triggers. The defaults section sets the action for any category not explicitly listed.
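To make the threshold and defaults semantics concrete, here is a minimal sketch of how a policy could be applied to per-category confidence scores. It is not Vettly's actual evaluation logic: the `resolveAction` function, the `Policy` type, the "at or above the threshold" comparison, and the rule that the most severe triggered action wins (block over flag over allow) are all assumptions made for illustration.

```typescript
type Action = 'allow' | 'flag' | 'block';

interface Policy {
  categories: Record<string, { action: Action; threshold: number }>;
  defaults: { action: Action };
}

// Assumed precedence: block outranks flag, which outranks allow.
const SEVERITY: Record<Action, number> = { allow: 0, flag: 1, block: 2 };

// Hypothetical evaluator: returns the most severe action triggered by any score.
function resolveAction(policy: Policy, scores: Record<string, number>): Action {
  let result: Action = 'allow';
  for (const [category, score] of Object.entries(scores)) {
    const rule = policy.categories[category];
    // Listed categories trigger at or above their threshold (assumed);
    // unlisted categories fall back to the defaults action.
    const action: Action = rule
      ? (score >= rule.threshold ? rule.action : 'allow')
      : policy.defaults.action;
    if (SEVERITY[action] > SEVERITY[result]) result = action;
  }
  return result;
}

// A trimmed-down copy of the community-safe policy, as a parsed object.
const communitySafe: Policy = {
  categories: {
    hate_speech: { action: 'block', threshold: 0.7 },
    violence: { action: 'flag', threshold: 0.7 },
  },
  defaults: { action: 'allow' },
};

// hate_speech stays under 0.7, violence reaches 0.7, so the result is a flag.
console.log(resolveAction(communitySafe, { hate_speech: 0.65, violence: 0.75 })); // flag
```

Under these assumptions, lowering a threshold makes a category stricter because more scores clear it.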

Multiple Policies for Different Contexts

Different parts of your product may need different policies:

policies/profile-photo.yaml (YAML)

name: profile-photo
version: "1"
description: Stricter policy for profile photos - visible everywhere

categories:
  nudity:
    action: block
    threshold: 0.4  # Lower threshold = stricter

  violence:
    action: block
    threshold: 0.5

  hate_symbols:
    action: block
    threshold: 0.3

defaults:
  action: allow

policies/direct-messages.yaml (YAML)

name: direct-messages
version: "2"
description: More permissive for private conversations, strict on harassment

categories:
  harassment:
    action: block
    threshold: 0.7

  threats:
    action: block
    threshold: 0.5

  csam:
    action: block
    threshold: 0.1  # Zero tolerance

  nudity:
    action: flag  # Flag but don't block in DMs
    threshold: 0.6

defaults:
  action: allow

When calling the API, specify which policy to use:

check.ts (Node.js)

// Different policies for different surfaces
const feedCheck = await vettly.check({
  content: post.text,
  policy: 'community-safe',
});

const profileCheck = await vettly.check({
  imageUrl: avatar.url,
  policy: 'profile-photo',
});

const dmCheck = await vettly.check({
  content: message.text,
  policy: 'direct-messages',
});
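One way to keep the surface-to-policy mapping in a single place is a typed lookup table. This is only a sketch: the `Surface` names and the `policyFor` helper are hypothetical, and the policy names are the ones used in the YAML files above.

```typescript
// Hypothetical surface names for the parts of the product that accept content.
type Surface = 'feed' | 'profilePhoto' | 'directMessage';

// Record<Surface, string> makes the compiler reject a surface with no policy.
const POLICY_BY_SURFACE: Record<Surface, string> = {
  feed: 'community-safe',
  profilePhoto: 'profile-photo',
  directMessage: 'direct-messages',
};

// Returns the policy name to pass as the `policy` field of a check request.
function policyFor(surface: Surface): string {
  return POLICY_BY_SURFACE[surface];
}
```

A call site would then read `vettly.check({ content: post.text, policy: policyFor('feed') })`, and adding a new surface to the type forces a decision about which policy covers it.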

Versioning and Git Workflow

Store policies in your repository alongside your application code:

policies/
  community-safe.yaml
  profile-photo.yaml
  direct-messages.yaml
  chatbot-output.yaml
tests/
  policies/
    community-safe.test.ts
    profile-photo.test.ts

Policy changes follow the same workflow as code changes:

1. Create a branch
2. Edit the YAML file
3. Run policy tests
4. Open a pull request
5. Legal and trust & safety review the diff
6. Merge and deploy
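The "run policy tests" step can also include a cheap structural check that catches typos before anyone reviews the diff. The sketch below validates an already-parsed policy object (from whichever YAML parser you use). The `validatePolicy` function and its rules are assumptions based on the examples in this post (three actions, thresholds between 0 and 1); the Policies documentation remains the authority on the full schema.

```typescript
const ACTIONS = ['allow', 'flag', 'block'];

// Loose input types: the object comes straight from a YAML parser, so nothing is trusted yet.
interface RawRule { action?: unknown; threshold?: unknown }
interface RawPolicy {
  name?: unknown;
  categories?: Record<string, RawRule>;
  defaults?: { action?: unknown };
}

const isAction = (value: unknown): boolean =>
  typeof value === 'string' && ACTIONS.includes(value);

// Hypothetical lint step: returns a list of problems, empty when the policy looks valid.
function validatePolicy(policy: RawPolicy): string[] {
  const errors: string[] = [];
  if (typeof policy.name !== 'string' || policy.name === '') {
    errors.push('name must be a non-empty string');
  }
  for (const [category, rule] of Object.entries(policy.categories ?? {})) {
    if (!isAction(rule.action)) {
      errors.push(`${category}: action must be allow, flag, or block`);
    }
    const t = rule.threshold;
    // The negated range check also rejects NaN.
    if (typeof t !== 'number' || !(t >= 0 && t <= 1)) {
      errors.push(`${category}: threshold must be a number from 0 to 1`);
    }
  }
  if (!isAction(policy.defaults?.action)) {
    errors.push('defaults.action must be allow, flag, or block');
  }
  return errors;
}
```

Failing the CI job when the returned list is non-empty keeps malformed policies out of review entirely.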

The pull request diff makes it obvious what changed:

 categories:
   hate_speech:
     action: block
-    threshold: 0.7
+    threshold: 0.6  # Lowered per T&S review 2026-03-01

Testing Policies

Write test cases for your policies. Each test provides a content sample and asserts the expected action:

tests/policies/community-safe.test.ts (Node.js)

import { describe, it, expect } from 'vitest';
import { testPolicy } from '../helpers';

describe('community-safe policy', () => {
  it('blocks obvious hate speech', async () => {
    const result = await testPolicy('community-safe', {
      content: '[example hate speech input]',
    });
    expect(result.action).toBe('block');
  });

  it('allows normal conversation', async () => {
    const result = await testPolicy('community-safe', {
      content: 'Hey, great post! I really enjoyed reading this.',
    });
    expect(result.action).toBe('allow');
  });

  it('flags borderline content for review', async () => {
    const result = await testPolicy('community-safe', {
      content: '[example borderline content]',
    });
    expect(result.action).toBe('flag');
  });
});

Run these tests in CI. If a policy change breaks a test, the PR fails and the change doesn't ship.
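The tests above import a `testPolicy` helper that this post does not show. One possible shape is sketched here. The `createTestPolicy` factory, the `ModerationClient` interface, and the stub client are all hypothetical; the request and result shapes are inferred from the `vettly.check` examples earlier in the post.

```typescript
interface CheckRequest { content: string; policy: string }
interface CheckResult { action: 'allow' | 'flag' | 'block' }

// Anything with a compatible check() method can back the helper.
interface ModerationClient {
  check(request: CheckRequest): Promise<CheckResult>;
}

// Hypothetical factory: binds a client so tests only pass a policy name and an input.
function createTestPolicy(client: ModerationClient) {
  return (policy: string, input: { content: string }): Promise<CheckResult> =>
    client.check({ ...input, policy });
}

// Stub client for exercising the helper itself without network access.
// It is a stand-in, not a model of real moderation behavior.
const stubClient: ModerationClient = {
  async check({ content }) {
    return { action: content.includes('[blocked]') ? 'block' : 'allow' };
  },
};

const testPolicy = createTestPolicy(stubClient);
```

In a real suite you would pass the SDK client (configured with a test API key) instead of the stub, assuming its `check` method accepts this request shape, and export the resulting `testPolicy` from the helpers module.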

Deployment

When you merge a policy change, deploy it to Vettly:

deploy-policies.sh (Shell)

#!/bin/bash
# Deploy all policies to Vettly; stop at the first failure
set -euo pipefail

for policy in policies/*.yaml; do
  echo "Deploying $policy..."
  # --fail makes curl exit non-zero on HTTP errors (4xx/5xx)
  curl --fail -X PUT https://api.vettly.dev/v1/policies \
    -H "Authorization: Bearer $VETTLY_API_KEY" \
    -H "Content-Type: application/yaml" \
    --data-binary @"$policy"
done

Or use the SDK:

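scripts/deploy-policies.ts (Node.js)

import fs from 'node:fs';
import path from 'node:path';
import { fileURLToPath } from 'node:url';
import { Vettly } from '@vettly/sdk';

const vettly = new Vettly(process.env.VETTLY_API_KEY);
// Top-level await requires an ES module, where __dirname is not defined,
// so resolve the policies directory from import.meta.url instead.
const policiesDir = fileURLToPath(new URL('../policies', import.meta.url));

for (const file of fs.readdirSync(policiesDir)) {
  if (!file.endsWith('.yaml')) continue; // Skip anything that is not a policy file
  const yaml = fs.readFileSync(path.join(policiesDir, file), 'utf-8');
  await vettly.policies.deploy({ yaml });
  console.log(`Deployed: ${file}`);
}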

Rollbacks

If a policy change causes problems (spike in false positives, user complaints), revert the Git commit and redeploy:

git revert HEAD
git push
# CI redeploys the previous policy version

Vettly keeps all policy versions, so you can also reference a specific version in your API calls during an incident. See Policy Versioning for details on how version history works.

rollback.ts (Node.js)

// Temporarily pin to a known-good policy version
const result = await vettly.check({
  content: post.text,
  policy: 'community-safe',
  policyVersion: '2', // Previous version
});
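During an incident it can help to pin a version without shipping a code change. The sketch below reads an optional pin from the environment and adds `policyVersion` only when one is set. The `buildCheckRequest` helper and the `POLICY_PIN_*` variable naming are hypothetical, and the sketch assumes that omitting `policyVersion` makes the API use the currently deployed version.

```typescript
interface CheckRequest {
  content: string;
  policy: string;
  policyVersion?: string;
}

// Hypothetical helper: looks for a pin such as POLICY_PIN_COMMUNITY_SAFE=2.
function buildCheckRequest(
  content: string,
  policy: string,
  env: Record<string, string | undefined>,
): CheckRequest {
  // "community-safe" becomes "POLICY_PIN_COMMUNITY_SAFE".
  const key = `POLICY_PIN_${policy.toUpperCase().replace(/-/g, '_')}`;
  const pinned = env[key];
  // Include policyVersion only when a pin is set, so normal traffic is unaffected.
  return pinned ? { content, policy, policyVersion: pinned } : { content, policy };
}
```

A call such as `vettly.check(buildCheckRequest(post.text, 'community-safe', process.env))` would then follow whatever pin is configured, and removing the variable ends the pin once the reverted policy is deployed.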

Benefits Over Dashboard-Only Configuration

| Aspect | Dashboard | Policies as Code |
|--------|-----------|------------------|
| Change history | Audit log (if available) | Full Git history |
| Review process | "I changed it" in Slack | Pull request with diff |
| Rollback | Manual revert | git revert |
| Testing | Manual spot checks | Automated test suite |
| Multi-environment | Copy settings manually | Deploy per environment |
| Compliance evidence | Screenshots | Commit hashes and diffs |

Dashboard configuration is fine for getting started. But as your moderation requirements grow — multiple policies, regulatory obligations, cross-team review — policies as code scales better.

Define your moderation policies in code

Vettly supports YAML policy definitions with versioning, testing, and API deployment. Bring your moderation rules into your engineering workflow.
