imajin-moderator Service
Multi-layer content moderation with 5 detection layers, deterministic decision logic, and timing side-channel prevention.
Overview
| Property | Value |
|---|---|
| Port | 8008 |
| Stack | Python, FastAPI, PyTorch, transformers, InsightFace |
| Package | @lilith/imajin-moderator-client (Python), @imajin/moderator-types (TypeScript) |
Architecture
imajin-moderator/
├── service/
│ └── src/
│ ├── api/main.py # FastAPI routes (45+ endpoints)
│ ├── config/settings.py # Port 8008, pipeline config
│ ├── detection/
│ │ ├── pipeline.py # Multi-layer orchestration
│ │ ├── pdq_hasher.py # Layer 1: Perceptual hashing (Meta PDQ)
│ │ ├── nsfw_detector.py # Layer 2: NSFW classification
│ │ ├── age_estimator.py # Layer 3: Age estimation
│ │ ├── prohibited_content_detector.py # Layer 4: Zero-shot prohibited content
│ │ └── identity_verifier.py # Layer 5: Face embedding verification
│ └── models/
│ ├── schemas.py # Request/response models
│ ├── decisions.py # Decision logic & block reasons
│ └── prohibited_prompts.py # 5 illegal content categories
├── types/ # TypeScript type definitions
└── client/ # Python async HTTP client
Detection Layers
All 5 layers run unconditionally on every scan. Because no layer short-circuits the others, response time does not reveal which layer flagged the content, which prevents timing side-channels:
| Layer | Model / Method | Detects | Flags |
|---|---|---|---|
| 1. PDQ Hash | Meta PDQ (256-bit perceptual hash) | Known-bad content via Redis hash database | KNOWN_BAD_HASH |
| 2. NSFW | Marqo/nsfw-image-detection-384 | Nudity, adult content (explicit ≥0.7, suggestive ≥0.4) | NSFW_EXPLICIT, NSFW_SUGGESTIVE |
| 3. Age | InsightFace buffalo_l + nateraw/vit-age-classifier | Potential minors (conservative threshold: 25 years) | POTENTIAL_MINOR |
| 4. Prohibited | SigLIP2 zero-shot via imajin-semantic | 5 illegal categories (bestiality, sexual violence, unconscious, necrophilia, trafficking) | VIOLENCE_DETECTED |
| 5. Identity | Face embeddings via imajin-identity | Identity mismatch (cosine similarity threshold: 0.68) | Identity verification |
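The thresholds in the table can be read as simple comparisons. The following is a minimal sketch, not the service's actual code: the function names are hypothetical, the explicit and suggestive flags are assumed to be mutually exclusive, and an estimated age strictly below 25 is assumed to trigger POTENTIAL_MINOR. `run_all_layers` illustrates the "no early exit" idea by calling every layer regardless of earlier results.

```python
import math
from typing import Callable, Dict, List, Sequence

# Thresholds quoted in the Detection Layers table.
NSFW_EXPLICIT_THRESHOLD = 0.7
NSFW_SUGGESTIVE_THRESHOLD = 0.4
MINOR_AGE_THRESHOLD = 25
IDENTITY_SIMILARITY_THRESHOLD = 0.68


def nsfw_flags(score: float) -> List[str]:
    """Map an NSFW score to a flag (assumed: at most one flag per image)."""
    if score >= NSFW_EXPLICIT_THRESHOLD:
        return ["NSFW_EXPLICIT"]
    if score >= NSFW_SUGGESTIVE_THRESHOLD:
        return ["NSFW_SUGGESTIVE"]
    return []


def age_flags(estimated_age: float) -> List[str]:
    """Flag a potential minor (assumed: strictly below the threshold)."""
    return ["POTENTIAL_MINOR"] if estimated_age < MINOR_AGE_THRESHOLD else []


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identity_matches(a: Sequence[float], b: Sequence[float]) -> bool:
    """True when two face embeddings meet the similarity threshold."""
    return cosine_similarity(a, b) >= IDENTITY_SIMILARITY_THRESHOLD


def run_all_layers(
    image: object, layers: Dict[str, Callable[[object], List[str]]]
) -> Dict[str, List[str]]:
    """Call every layer; no layer's result can skip a later one."""
    return {name: layer(image) for name, layer in layers.items()}
```

For example, `nsfw_flags(0.55)` yields `["NSFW_SUGGESTIVE"]` under these assumptions. Calling every layer makes the amount of work independent of the results, though it does not by itself guarantee identical wall-clock time.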
Decision Logic
Decisions are evaluated in a fixed priority order; the first matching rule wins:
- BLOCKED — Known-bad hash OR (minor + NSFW) OR prohibited content above block threshold
- QUARANTINED — Potential minor OR age estimation failure OR identity mismatch OR prohibited above quarantine threshold
- APPROVED — All layers passed
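The priority order above can be sketched as a single function that checks the rules from most to least severe. This is an illustration under stated assumptions: `LayerResults`, its field names, and both prohibited-content threshold values are hypothetical (the source names the thresholds but gives no numbers), and "NSFW" is taken to mean any NSFW flag.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(str, Enum):
    BLOCKED = "BLOCKED"
    QUARANTINED = "QUARANTINED"
    APPROVED = "APPROVED"


@dataclass
class LayerResults:
    """Hypothetical summary of the five layers' outputs."""
    known_bad_hash: bool = False
    potential_minor: bool = False
    nsfw: bool = False
    age_estimation_failed: bool = False
    identity_mismatch: bool = False
    prohibited_score: float = 0.0


# Placeholder values; the real thresholds are not given in this document.
PROHIBITED_BLOCK_THRESHOLD = 0.8
PROHIBITED_QUARANTINE_THRESHOLD = 0.5


def decide(r: LayerResults) -> Decision:
    """Apply the rules in priority order and return the first match."""
    if (
        r.known_bad_hash
        or (r.potential_minor and r.nsfw)
        or r.prohibited_score >= PROHIBITED_BLOCK_THRESHOLD
    ):
        return Decision.BLOCKED
    if (
        r.potential_minor
        or r.age_estimation_failed
        or r.identity_mismatch
        or r.prohibited_score >= PROHIBITED_QUARANTINE_THRESHOLD
    ):
        return Decision.QUARANTINED
    return Decision.APPROVED
```

Because the function depends only on its inputs and checks BLOCKED first, the same layer results always produce the same decision, and a quarantine condition can never mask a block condition.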
API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /scan | POST | Single image scan (hash + NSFW + age) |
| /scan/full | POST | Full 5-layer scan (all detection + identity) |
| /scan/fast | POST | NSFW-only scan |
| /scan/batch | POST | Batch scan (max 50 images) |
| /detect/hash | POST | Standalone hash generation |
| /detect/nsfw | POST | Standalone NSFW classification |
| /detect/age | POST | Standalone age estimation |
| /hash/check | POST | Check hash against known-bad database |
| /hash/load | POST | Load known-bad hashes (auth required) |
| /health | GET | Health check (GPU status, hash count) |
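Since /scan/batch accepts at most 50 images, a caller with a larger set has to split it before sending. A minimal sketch of that client-side step follows; `chunk_for_batch` is a hypothetical helper, not part of the published client, and the request payload format is deliberately left out because it is not documented here.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Limit stated for /scan/batch in the endpoint table.
MAX_BATCH_SIZE = 50


def chunk_for_batch(
    items: Sequence[T], size: int = MAX_BATCH_SIZE
) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

Each yielded list can then be submitted as one /scan/batch request; 120 images would produce batches of 50, 50, and 20.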
Service Dependencies
- imajin-semantic (port 8005) — Prohibited content detection via SigLIP2 zero-shot classification
- imajin-identity (port 8009) — Face embedding extraction for identity verification
- Redis — PDQ hash database persistence
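Perceptual hashes such as PDQ are usually compared by Hamming distance instead of exact equality, so that near-duplicates still match. The sketch below shows that comparison on 256-bit hashes encoded as 64 hex characters. It is an assumption about how Layer 1 might match, not a description of this service: the in-memory set stands in for the Redis database, and the default distance of 31 is a commonly cited PDQ setting that this document does not confirm.

```python
from typing import Iterable


def hamming_distance(hash_a_hex: str, hash_b_hex: str) -> int:
    """Count differing bits between two hex-encoded hashes."""
    return bin(int(hash_a_hex, 16) ^ int(hash_b_hex, 16)).count("1")


def is_known_bad(
    candidate_hex: str,
    known_bad_hexes: Iterable[str],
    max_distance: int = 31,  # assumed threshold, not taken from this service
) -> bool:
    """True if the candidate is within max_distance bits of any known hash."""
    return any(
        hamming_distance(candidate_hex, known) <= max_distance
        for known in known_bad_hexes
    )
```

A linear scan like this is only practical for small hash sets; a production store would need an index suited to nearest-neighbor lookups.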
Related
- Services Overview
- imajin-semantic — SigLIP2 backend for prohibited content detection
- imajin-identity — Face detection and identity verification