Moderation Models Overview

Ensure content safety and compliance with advanced AI moderation models that detect harmful, inappropriate, or policy-violating content across text, images, and other media.

Available Models

OpenAI Moderation

  • text-moderation-latest: Latest text moderation model
  • text-moderation-stable: Stable version for production use
  • omni-moderation-latest: Multimodal moderation (text + images)

Perspective API (Google)

  • perspective-toxicity: Detect toxic comments and harassment
  • perspective-severe-toxicity: Identify severe toxic content
  • perspective-threat: Detect threats and violent language

Custom Moderation Models

  • content-classifier: Custom content classification
  • nsfw-detector: NSFW content detection for images
  • hate-speech-detector: Specialized hate speech detection

Specialized Models

  • spam-detector: Identify spam and unwanted content
  • violence-detector: Detect violent content and imagery
  • self-harm-detector: Identify self-harm related content

Model Capabilities

Text Moderation

Analyze text for harmful or inappropriate content

Image Moderation

Detect inappropriate visual content and NSFW material

Multimodal Analysis

Analyze content across multiple media types

Custom Policies

Enforce custom content policies and guidelines

Text Moderation API

Analyze text content for policy violations:
POST /v1/moderations

Basic Text Moderation

curl -X POST "https://api.anyapi.ai/v1/moderations" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-moderation-latest",
    "input": "This is a sample text to check for content violations."
  }'
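
The same request can be made from Python with the requests library. A minimal sketch that mirrors the curl call above (the moderate_text helper name is illustrative, not part of the API):

import requests

def moderate_text(text, api_key, model="text-moderation-latest"):
    """Submit text to the moderation endpoint and return the parsed result."""
    response = requests.post(
        "https://api.anyapi.ai/v1/moderations",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        },
        json={"model": model, "input": text}
    )
    response.raise_for_status()
    return response.json()

# Usage
result = moderate_text(
    "This is a sample text to check for content violations.",
    "YOUR_API_KEY"
)
print(result["results"][0]["flagged"])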

Response Format

{
  "id": "modr-abc123",
  "model": "text-moderation-latest",
  "results": [
    {
      "flagged": false,
      "categories": {
        "sexual": false,
        "hate": false,
        "harassment": false,
        "self-harm": false,
        "sexual/minors": false,
        "hate/threatening": false,
        "violence/graphic": false,
        "self-harm/intent": false,
        "self-harm/instructions": false,
        "harassment/threatening": false,
        "violence": false
      },
      "category_scores": {
        "sexual": 0.0001,
        "hate": 0.0002,
        "harassment": 0.0001,
        "self-harm": 0.0000,
        "sexual/minors": 0.0000,
        "hate/threatening": 0.0000,
        "violence/graphic": 0.0001,
        "self-harm/intent": 0.0000,
        "self-harm/instructions": 0.0000,
        "harassment/threatening": 0.0000,
        "violence": 0.0001
      }
    }
  ]
}
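
A common pattern is to treat the boolean categories as the hard decision and use category_scores to apply stricter, application-specific thresholds. A minimal sketch of interpreting the response above (the 0.5 cutoff is an illustrative choice, not an API default):

def summarize_moderation(moderation_response, score_threshold=0.5):
    """List flagged categories and categories scoring above a custom threshold."""
    for result in moderation_response["results"]:
        flagged = [name for name, hit in result["categories"].items() if hit]
        borderline = [
            name for name, score in result["category_scores"].items()
            if score >= score_threshold and name not in flagged
        ]
        print("Flagged categories:", flagged or "none")
        print("Above custom threshold:", borderline or "none")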

Image Moderation API

Analyze images for inappropriate content:
POST /v1/moderations/images

Image Content Analysis

import requests
import base64

def moderate_image(image_path):
    """Moderate image content"""
    # Encode image to base64
    with open(image_path, "rb") as image_file:
        base64_image = base64.b64encode(image_file.read()).decode('utf-8')
    
    response = requests.post(
        "https://api.anyapi.ai/v1/moderations/images",
        headers={
            "Authorization": "Bearer YOUR_API_KEY",
            "Content-Type": "application/json"
        },
        json={
            "model": "nsfw-detector",
            "image": f"data:image/jpeg;base64,{base64_image}",
            "categories": ["nsfw", "violence", "drugs", "weapons"]
        }
    )
    
    return response.json()

# Usage
result = moderate_image("uploaded_image.jpg")
print(f"Safe: {result['safe']}")
print(f"Violations: {result['violations']}")

Multimodal Moderation API

Analyze content across multiple modalities:
POST /v1/moderations/multimodal

def moderate_multimodal_content(text, image_path=None, video_path=None):
    """Moderate content across multiple modalities"""
    content = {"text": text}
    
    if image_path:
        with open(image_path, "rb") as image_file:
            base64_image = base64.b64encode(image_file.read()).decode('utf-8')
            content["image"] = f"data:image/jpeg;base64,{base64_image}"
    
    if video_path:
        # Video upload would be handled similarly
        pass
    
    response = requests.post(
        "https://api.anyapi.ai/v1/moderations/multimodal",
        headers={
            "Authorization": "Bearer YOUR_API_KEY",
            "Content-Type": "application/json"
        },
        json={
            "model": "omni-moderation-latest",
            "content": content,
            "check_consistency": True  # Check if text and image are consistent
        }
    )
    
    return response.json()

# Usage
result = moderate_multimodal_content(
    text="Check out this amazing sunset!",
    image_path="sunset.jpg"
)

Advanced Moderation Features

Custom Policy Enforcement

def create_custom_moderation_policy():
    """Create a custom content moderation policy"""
    policy = {
        "name": "Brand Safety Policy",
        "description": "Custom policy for brand-safe content",
        "rules": [
            {
                "category": "profanity",
                "action": "flag",
                "threshold": 0.7,
                "severity": "medium"
            },
            {
                "category": "political_content",
                "action": "block",
                "threshold": 0.5,
                "severity": "high"
            },
            {
                "category": "competitive_mentions",
                "action": "review",
                "threshold": 0.3,
                "severity": "low",
                "keywords": ["competitor_brand_1", "competitor_brand_2"]
            }
        ],
        "escalation": {
            "auto_escalate": True,
            "escalation_threshold": 3,
            "review_required": ["high", "medium"]
        }
    }
    
    response = requests.post(
        "https://api.anyapi.ai/v1/moderations/policies",
        headers={
            "Authorization": "Bearer YOUR_API_KEY",
            "Content-Type": "application/json"
        },
        json=policy
    )
    
    return response.json()

def apply_custom_policy(content, policy_id):
    """Apply custom moderation policy to content"""
    response = requests.post(
        "https://api.anyapi.ai/v1/moderations/custom",
        headers={
            "Authorization": "Bearer YOUR_API_KEY",
            "Content-Type": "application/json"
        },
        json={
            "policy_id": policy_id,
            "content": content,
            "context": {
                "user_id": "user_123",
                "platform": "social_media",
                "timestamp": "2024-01-01T12:00:00Z"
            }
        }
    )
    
    return response.json()

Batch Moderation

def moderate_content_batch(content_list):
    """Moderate multiple pieces of content in batch"""
    response = requests.post(
        "https://api.anyapi.ai/v1/moderations/batch",
        headers={
            "Authorization": "Bearer YOUR_API_KEY",
            "Content-Type": "application/json"
        },
        json={
            "model": "text-moderation-latest",
            "inputs": content_list,
            "include_scores": True,
            "return_categories": True
        }
    )
    
    return response.json()

# Usage
content_batch = [
    "This is a normal message",
    "Another safe comment",
    "Potentially problematic content here"
]

results = moderate_content_batch(content_batch)
for i, result in enumerate(results['results']):
    print(f"Content {i+1}: {'FLAGGED' if result['flagged'] else 'SAFE'}")

Real-time Stream Moderation

import asyncio
import websockets
import json
import time

class StreamModerator:
    def __init__(self, api_key):
        self.api_key = api_key
        self.ws_url = "wss://api.anyapi.ai/v1/moderations/stream"
    
    async def start_moderation_stream(self, policy_id=None):
        """Start real-time content moderation stream"""
        headers = {"Authorization": f"Bearer {self.api_key}"}
        
        async with websockets.connect(self.ws_url, extra_headers=headers) as websocket:
            # Initialize stream
            await websocket.send(json.dumps({
                "action": "start",
                "model": "text-moderation-latest",
                "policy_id": policy_id,
                "real_time": True
            }))
            
            async for message in websocket:
                result = json.loads(message)
                await self.handle_moderation_result(result)
    
    async def moderate_stream_content(self, websocket, content):
        """Send content for real-time moderation"""
        await websocket.send(json.dumps({
            "action": "moderate",
            "content": content,
            "timestamp": time.time()
        }))
    
    async def handle_moderation_result(self, result):
        """Handle moderation results"""
        if result.get('flagged'):
            print(f"⚠️  Content flagged: {result['categories']}")
            # Take action (block, review, etc.)
        else:
            print("✅ Content approved")

# Usage: start_moderation_stream listens for results; content is pushed
# through the same socket with moderate_stream_content
moderator = StreamModerator("YOUR_API_KEY")
asyncio.run(moderator.start_moderation_stream())

Model Comparison

Model                    Type         Languages   Accuracy   Speed    Price/1K tokens
text-moderation-latest   Text         English+    95%        Fast     $0.002
omni-moderation-latest   Multimodal   English+    93%        Medium   $0.005
perspective-toxicity     Text         100+        90%        Fast     $0.001
nsfw-detector            Image        N/A         94%        Fast     $0.003

Moderation Categories

Text Content Categories

  • Hate Speech: Discriminatory or hateful language
  • Harassment: Bullying, intimidation, threats
  • Violence: Violent threats or graphic descriptions
  • Sexual Content: Explicit sexual material
  • Self-Harm: Content promoting self-injury
  • Spam: Unwanted promotional content
  • Misinformation: False or misleading information

Image Content Categories

  • NSFW: Not safe for work content
  • Violence: Graphic violence or weapons
  • Drugs: Illegal substances or drug use
  • Hate Symbols: Hateful imagery or symbols
  • Gore: Extremely graphic content
  • Nudity: Nude or partially nude content

Custom Categories

  • Brand Safety: Content safe for brand association
  • Age Appropriateness: Content suitable for specific age groups
  • Professional Context: Workplace-appropriate content
  • Regional Compliance: Content compliant with local laws

Integration Patterns

Social Media Platform

class SocialModerator:
    def __init__(self, api_key):
        self.api_key = api_key
        self.violation_counts = {}
    
    def moderate_post(self, user_id, content, media_urls=None):
        """Moderate a social media post"""
        # moderate_text and moderate_image_url wrap the text and image
        # moderation endpoints shown earlier; they are omitted for brevity.
        # Check text content
        text_result = self.moderate_text(content)
        
        # Check media if present
        media_results = []
        if media_urls:
            for url in media_urls:
                media_result = self.moderate_image_url(url)
                media_results.append(media_result)
        
        # Determine overall action
        action = self.determine_action(user_id, text_result, media_results)
        
        return {
            "action": action,
            "text_violations": text_result,
            "media_violations": media_results,
            "user_status": self.get_user_status(user_id)
        }
    
    def determine_action(self, user_id, text_result, media_results):
        """Determine moderation action based on results"""
        violations = []
        
        if text_result.get('flagged'):
            violations.extend(text_result['violations'])
        
        for media_result in media_results:
            if media_result.get('flagged'):
                violations.extend(media_result['violations'])
        
        if not violations:
            return "approve"
        
        # Track user violations
        self.violation_counts[user_id] = self.violation_counts.get(user_id, 0) + len(violations)
        
        # Determine action based on severity and user history
        severe_violations = [v for v in violations if v['severity'] == 'high']
        
        if severe_violations or self.violation_counts[user_id] > 10:
            return "block"
        elif len(violations) > 2 or self.violation_counts[user_id] > 5:
            return "review"
        else:
            return "warn"
    
    def get_user_status(self, user_id):
        """Get user moderation status"""
        violation_count = self.violation_counts.get(user_id, 0)
        
        if violation_count == 0:
            return "good_standing"
        elif violation_count < 5:
            return "warned"
        elif violation_count < 10:
            return "restricted"
        else:
            return "suspended"

Content Management System

class CMSModerator:
    def __init__(self, api_key):
        self.api_key = api_key
        self.approval_queue = []
    
    def moderate_article(self, article_data):
        """Moderate a news article or blog post"""
        # moderate_text, moderate_image_url and determine_publication_status
        # are helper methods omitted for brevity; the first two call the
        # moderation endpoints shown earlier.
        title_result = self.moderate_text(article_data['title'])
        content_result = self.moderate_text(article_data['content'])
        
        # Check for bias and misinformation
        bias_result = self.check_bias(article_data['content'])
        fact_check_result = self.fact_check(article_data['content'])
        
        # Moderate images
        image_results = []
        for image_url in article_data.get('images', []):
            image_result = self.moderate_image_url(image_url)
            image_results.append(image_result)
        
        # Determine publication status
        status = self.determine_publication_status(
            title_result, content_result, bias_result, 
            fact_check_result, image_results
        )
        
        return {
            "status": status,
            "title_moderation": title_result,
            "content_moderation": content_result,
            "bias_analysis": bias_result,
            "fact_check": fact_check_result,
            "image_moderation": image_results
        }
    
    def check_bias(self, content):
        """Check content for potential bias"""
        response = requests.post(
            "https://api.anyapi.ai/v1/moderations/bias",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={
                "content": content,
                "check_types": ["political", "cultural", "gender", "racial"]
            }
        )
        return response.json()
    
    def fact_check(self, content):
        """Perform basic fact checking"""
        response = requests.post(
            "https://api.anyapi.ai/v1/moderations/fact-check",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={
                "content": content,
                "verify_claims": True,
                "check_sources": True
            }
        )
        return response.json()
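
And a sketch of running an article through it, again assuming the omitted helper methods are implemented (the sample article data is illustrative):

# Usage (assumes the omitted helper methods are implemented)
cms = CMSModerator("YOUR_API_KEY")
report = cms.moderate_article({
    "title": "Local Startup Raises Funding",
    "content": "The company announced a new investment round today...",
    "images": ["https://example.com/office.jpg"]
})
print(report["status"])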

Pricing

Moderation models are priced per request or token:
Model Type          Price Model      Cost
Text Moderation     Per 1K tokens    $0.002
Image Moderation    Per image        $0.005
Video Moderation    Per minute       $0.050
Custom Policies     Per request      $0.003
Real-time Stream    Per minute       $0.100
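
As a rough back-of-the-envelope estimate against the prices above (a sketch; actual billing may differ):

def estimate_monthly_cost(text_tokens, images, stream_minutes):
    """Estimate monthly moderation spend from the listed prices."""
    return (
        text_tokens / 1000 * 0.002   # text moderation, per 1K tokens
        + images * 0.005             # image moderation, per image
        + stream_minutes * 0.100     # real-time stream, per minute
    )

# Example: 5M tokens, 20K images, 1,000 stream minutes comes to about $210.00
print(f"${estimate_monthly_cost(5_000_000, 20_000, 1_000):,.2f}")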

Rate Limits

Moderation API limits by plan:
Plan         Requests/Min   Tokens/Min   Custom Policies
Free         100            50,000       1
Pro          1,000          500,000      10
Enterprise   Custom         Custom       Unlimited
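
Requests beyond the plan's limit typically return HTTP 429; a common pattern is exponential backoff with retry. A minimal sketch (honoring a Retry-After header is an assumption about the response, not a documented guarantee):

import time
import requests

def moderate_with_backoff(payload, api_key, max_retries=5):
    """Retry a moderation request with exponential backoff on HTTP 429."""
    for attempt in range(max_retries):
        response = requests.post(
            "https://api.anyapi.ai/v1/moderations",
            headers={"Authorization": f"Bearer {api_key}"},
            json=payload
        )
        if response.status_code != 429:
            response.raise_for_status()
            return response.json()
        # Use Retry-After if the server provides it, otherwise back off exponentially
        wait = float(response.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    raise RuntimeError("Rate limit retries exhausted")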

Compliance and Privacy

Data Handling

  • No storage: Content is analyzed but not stored
  • Privacy-first: Minimal data collection
  • GDPR compliant: EU privacy regulation compliance
  • SOC2 certified: Enterprise security standards

Audit Trail

  • Complete logging: All moderation decisions logged
  • Transparency: Clear reason codes for actions
  • Appeals process: Content review and appeals
  • Analytics: Moderation metrics and insights

Common Use Cases

Social Media

User-generated content moderation, community guidelines enforcement

E-commerce

Product review moderation, marketplace content safety

Educational Platforms

Student content moderation, age-appropriate filtering

Gaming

Chat moderation, user behavior monitoring

News & Media

Article fact-checking, comment moderation

Corporate Communications

Internal content compliance, brand safety

Healthcare

Medical content verification, patient communication

Legal Services

Document compliance, regulatory adherence

Getting Started