imajin-semantic Service

SigLIP2-based semantic attribute detection and SEO filter alignment validation.

Overview

| Property | Value |
| --- | --- |
| Port | 8005 |
| Stack | Python, FastAPI, PyTorch, transformers (SigLIP2) |
| Package | @lilith/imajin-semantic-types, @lilith/imajin-semantic-client |
| Model | google/siglip2-so400m-patch14-384 |

Architecture

imajin-semantic/
├── service/
│   └── src/
│       ├── api/main.py                    # FastAPI routes
│       ├── config/settings.py             # Port 8005, model config, thresholds
│       ├── detection/
│       │   ├── semantic_detector.py       # SigLIP2 zero-shot inference
│       │   └── attribute_taxonomy.py      # 40+ filter → prompt mappings
│       └── models/schemas.py              # Request/response models
├── types/                                 # @lilith/imajin-semantic-types
└── client/                                # @lilith/imajin-semantic-client

Purpose

Validates whether generated images match requested SEO filters using zero-shot classification with Google's SigLIP2 vision-language model. Each filter (e.g., "femboy", "latex", "cyberpunk") maps to multiple text prompts; the model scores image-text similarity to determine alignment.

The service supports 40+ filters across these categories: person types, clothing/outfits, service categories, aesthetics, body features, hair colors, and ethnicity, plus 4 style categories (anime, photorealistic, 3d_render, artistic).
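To make the filter-to-prompt mapping concrete, here is a minimal sketch of how per-prompt similarity scores might be collapsed into one score per filter. The prompt strings, the `FILTER_PROMPTS` name, and the choice of taking the maximum over a filter's prompts are all assumptions for illustration; the actual mappings live in `attribute_taxonomy.py` and the real aggregation rule is not documented here.

```python
from typing import Dict, List

# Hypothetical taxonomy excerpt. The real prompts are defined in
# attribute_taxonomy.py and are not reproduced in this document.
FILTER_PROMPTS: Dict[str, List[str]] = {
    "latex": ["a person wearing latex clothing", "a shiny latex outfit"],
    "cyberpunk": ["a cyberpunk scene with neon lights", "cyberpunk style artwork"],
}


def score_filters(
    prompt_scores: Dict[str, float],
    taxonomy: Dict[str, List[str]] = FILTER_PROMPTS,
) -> Dict[str, float]:
    """Collapse per-prompt similarity scores into one score per filter.

    Assumption: a filter's score is the best score among its prompts.
    Prompts with no score are treated as 0.0.
    """
    filter_scores: Dict[str, float] = {}
    for name, prompts in taxonomy.items():
        scores = [prompt_scores.get(prompt, 0.0) for prompt in prompts]
        filter_scores[name] = max(scores, default=0.0)
    return filter_scores
```

Here `prompt_scores` stands in for whatever the SigLIP2 inference step returns for each text prompt. Taking the maximum means a single well-matching prompt is enough to flag a filter; averaging would be a stricter alternative.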

API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /detect | POST | Detect attributes in a single image |
| /align | POST | Check alignment against requested filters (pass/fail) |
| /batch | POST | Batch detection for up to 50 images |
| /filters | GET | List all available filters and styles |
| /info | GET | Detector configuration, device, thresholds |
| /health | GET | Health check (model load status, GPU status) |
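For callers that do not use the TypeScript client, the following sketch shows how a request body for `/align` might be assembled and how a large image set could be split to respect the 50-image limit on `/batch`. It assumes the HTTP body uses the same field names as the client call documented under Client Usage (`image_base64`, `requested_filters`, `threshold`); the helper names are hypothetical and the exact wire format should be checked against `models/schemas.py`.

```python
import base64
from typing import Dict, Iterator, List, Sequence

MAX_BATCH_SIZE = 50  # /batch limit stated in the endpoint table


def build_align_payload(
    image_bytes: bytes,
    filters: Sequence[str],
    threshold: float = 0.5,
) -> Dict[str, object]:
    """Build a JSON-serializable body for POST /align.

    Assumption: field names mirror the TypeScript client call.
    """
    return {
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "requested_filters": list(filters),
        "threshold": threshold,
    }


def chunk_for_batch(
    images: Sequence[bytes],
    size: int = MAX_BATCH_SIZE,
) -> Iterator[List[bytes]]:
    """Yield consecutive groups of at most `size` images for POST /batch."""
    for start in range(0, len(images), size):
        yield list(images[start:start + size])
```

Each payload can then be sent with any HTTP library as a JSON POST to `http://localhost:8005/align`.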

Thresholds

| Setting | Default | Description |
| --- | --- | --- |
| Attribute detection | 0.25 | Minimum confidence to report an attribute |
| Alignment threshold | 0.50 | Minimum score to pass alignment check |
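The two thresholds play different roles, which the sketch below tries to illustrate: the detection threshold decides which attributes are reported at all, while the alignment threshold decides pass/fail for a set of requested filters. The result keys follow the fields named under Client Usage; the function names and the rule that the alignment score is the mean of the requested filters' scores are assumptions, since the document does not specify how the score is computed.

```python
from typing import Dict, List, Sequence

DETECTION_THRESHOLD = 0.25  # default from the table above
ALIGNMENT_THRESHOLD = 0.50  # default from the table above


def detected_attributes(
    scores: Dict[str, float],
    threshold: float = DETECTION_THRESHOLD,
) -> Dict[str, float]:
    """Keep only attributes whose confidence meets the detection threshold."""
    return {name: s for name, s in scores.items() if s >= threshold}


def check_alignment(
    scores: Dict[str, float],
    requested: Sequence[str],
    threshold: float = ALIGNMENT_THRESHOLD,
) -> Dict[str, object]:
    """Pass/fail check for requested filters.

    Assumption: alignment_score is the mean score of the requested
    filters, and a filter counts as matched when its own score meets
    the threshold. Filters with no score are treated as 0.0.
    """
    per_filter = [scores.get(name, 0.0) for name in requested]
    alignment_score = sum(per_filter) / len(per_filter) if per_filter else 0.0
    matched: List[str] = [
        name for name in requested if scores.get(name, 0.0) >= threshold
    ]
    return {
        "is_aligned": alignment_score >= threshold,
        "alignment_score": alignment_score,
        "matched_filters": matched,
    }
```

Under this averaging assumption, one strong match can offset a weak one; a stricter variant would require every requested filter to clear the threshold individually.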

Client Usage

import { createSemanticClient } from '@lilith/imajin-semantic-client';

const client = createSemanticClient({ baseUrl: 'http://localhost:8005' });

const result = await client.checkAlignment({
  image_base64: '...',
  requested_filters: ['femboy', 'latex'],
  threshold: 0.5,
});
// result.is_aligned, result.alignment_score, result.matched_filters

GPU Requirements

The SigLIP2 so400m model needs about 2 GB of VRAM. The service coordinates GPU allocation via model-boss (Redis).