# Data Flow

## End-to-End Image Generation

The typical request flows through multiple services:

```mermaid
sequenceDiagram
    participant User
    participant UI as imajin-app
    participant Assist as imajin-prompt
    participant Gen as imajin-diffusion
    participant Proc as imajin-processing
    participant GPU

    User->>UI: Enter prompt description
    UI->>Assist: POST /analyze-context

    Note over Assist: Stage 1: Cultural Classification
    Assist->>GPU: Load classifier
    GPU-->>Assist: Classification result

    Note over Assist: Stage 2: LLM Reasoning
    Assist->>GPU: Load DeepSeek R1 70B
    GPU-->>Assist: Generated prompts

    Assist-->>UI: GenerationConfig + prompts

    User->>UI: Select prompts, click Generate
    UI->>Gen: POST /generate/async
    Gen-->>UI: { jobId: "abc123" }

    loop Poll Status
        UI->>Gen: GET /jobs/abc123
        Gen-->>UI: { status: "processing" }
    end

    Note over Gen: Diffusion Model Inference
    Gen->>GPU: Load diffusion model
    GPU-->>Gen: Generated image

    UI->>Gen: GET /jobs/abc123/result
    Gen-->>UI: { imageData: "base64..." }

    opt Post-Processing
        UI->>Proc: POST /derivatives
        Proc-->>UI: Processed variants
    end

    UI-->>User: Display final image
```

## Request Types

### 1. Prompt Generation Flow

**Entry**: `POST /analyze-context` (imajin-prompt)

```
User Input (category, filters)
    ↓
Cultural Classifier (fast, rule-based)
    ↓
LLM Reasoning (DeepSeek R1 70B)
    ↓
GenerationConfig + Image Prompts
```

**Duration**: 15-60 seconds (LLM inference)

### 2. Image Generation Flow

**Entry**: `POST /generate` or `POST /generate/async` (imajin-diffusion)

```
Image Prompt + Parameters
    ↓
Model Selection (photorealistic/anime)
    ↓
Diffusion Inference Pipeline
    ↓
Optional: Text Overlay
    ↓
Optional: Watermark
    ↓
Optional: Moderation
    ↓
Base64 Image Output
```

**Duration**: 5-30 seconds (depends on resolution)

### 3. Post-Processing Flow (Integrated)

**Entry**: `POST /process` (imajin-processing)

**Default Pipeline** (used by imajin orchestrator):

```
Base64 PNG Input (from SDXL)
    ↓
Optimize (WebP quality 82)
    ↓
Convert to WebP (quality 90)
    ↓
Generate Derivatives (family-based responsive variants)
    ↓
Processed Image + Derivatives + Metadata
```

**Available Operations**:

- `sanitize` - Strip metadata, validate (for user-uploaded images only)
- `optimize` - WebP conversion with balanced preset
- `convert-webp` - High-quality WebP conversion
- `derivatives` - Generate responsive image variants

**Integration**: The main orchestrator (`orchestrators/imajin-app/src/imajin_app/main.py`) automatically processes generated images unless `skip_processing=true`.

**Duration**: 1-5 seconds (depends on resolution and derivative count)

### 4. Batch Multi-Size Generation Flow

**Entry**: `POST /generate/batch-sizes` (imajin orchestrator)

```mermaid
sequenceDiagram
    participant Consumer
    participant Orchestrator as imajin (main.py)
    participant Strategy as BaseImageStrategy
    participant VRAMBoss as vram-boss
    participant Diffusion as imajin-diffusion
    participant Focal as FocalPointDetector
    participant Processing as imajin-processing

    Consumer->>Orchestrator: POST /generate/batch-sizes
    Note over Orchestrator: { sizes: ["hero", "og", "sidebar"] }
    Orchestrator-->>Consumer: { job_id: "...", status: "queued" }

    Note over Strategy: Analyze sizes, group by aspect
    Strategy-->>Orchestrator: Need 2 bases: landscape, portrait

    loop For each base needed
        Orchestrator->>VRAMBoss: Acquire GPU lease
        VRAMBoss-->>Orchestrator: Lease granted
        Orchestrator->>Diffusion: Generate base (seed=X, layout=Y)
        Diffusion-->>Orchestrator: Base image
        VRAMBoss-->>Orchestrator: Lease released
    end

    loop For each base generated
        Orchestrator->>Focal: Detect focal point
        Focal-->>Orchestrator: FocalPoint(x, y)
    end

    loop For each requested size
        Orchestrator->>Processing: POST /derivatives/clip-focal
        Note over Processing: Crop with focal point preservation
        Processing-->>Orchestrator: Cropped derivative
    end

    Consumer->>Orchestrator: GET /jobs/{job_id}
    Orchestrator-->>Consumer: { status: "completed", images: {...} }
```

**Batch Pipeline Stages**:

```
BatchSizesRequest { sizes[], seed?, priority }
    ↓
Stage 1: AnalyzeSizesStage
    → Determine minimal bases needed (landscape/square/portrait)
    → Generate or use provided seed
    ↓
Stage 2: GenerateBasesStage
    → Acquire GPU lease via vram-boss
    → Generate each base with consistent seed
    → Same "person" across all bases
    ↓
Stage 3: DetectFocalPointsStage
    → MediaPipe face detection per base
    → Fallback to center if no face
    ↓
Stage 4: CropDerivativesStage
    → Crop bases to requested sizes
    → Preserve focal point in crop region
    ↓
BatchSizesResponse { images, bases_generated, seed }
```

**Key Benefits**:

- **Visual Coherence**: Same seed = same "person" across all sizes
- **Efficiency**: 4 sizes from 2 bases instead of 4 separate generations
- **Smart Cropping**: Faces preserved via focal point detection

**Duration**: 8-15 seconds (vs 20-40s generating each independently)

**See Also**: [Multi-Base Strategy](./multi-base-strategy.md) for full implementation details.

## Data Formats

### Image Data

All image data is transmitted as base64-encoded strings:

```typescript
interface GenerateResponse {
  imageData: string;  // base64-encoded PNG/WebP
  format: 'png' | 'webp';
  width: number;
  height: number;
}
```

### Prompt Data

```typescript
interface ParsedPrompt {
  name: string;           // Human-readable identifier
  prompt: string;         // Positive image prompt
  negativePrompt: string; // Negative image prompt
}
```

## Error Propagation

Errors bubble up through the service chain:

```mermaid
graph LR
    GPU[GPU OOM] --> GEN[imajin-diffusion 500]
    GEN --> UI[UI Error State]

    LLM[LLM Timeout] --> ASSIST[imajin-prompt 500]
    ASSIST --> UI
```

All services return structured error responses:

```json
{
  "error": "GPU out of memory",
  "code": "GPU_OOM",
  "details": {
    "requested": "8GB",
    "available": "4GB"
  }
}
```
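
A client receiving such a response might validate its shape before showing it in the UI error state. The following is a minimal sketch, not the project's actual client code: the `ServiceError` interface mirrors the JSON fields above, while the names `isServiceError` and `toUiMessage`, the optional `details` field, and the message format are assumptions made for illustration.

```typescript
// Shape of the structured error body shown above.
// Treating `details` as optional is an assumption.
interface ServiceError {
  error: string;
  code: string;
  details?: Record<string, unknown>;
}

// Hypothetical runtime type guard: true only when `body` is an object
// with string `error` and `code` fields.
function isServiceError(body: unknown): body is ServiceError {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return typeof b.error === "string" && typeof b.code === "string";
}

// Hypothetical helper that turns a response body into a UI message.
// Bodies that do not match the structured shape get a generic message.
function toUiMessage(body: unknown): string {
  if (!isServiceError(body)) return "Unexpected error";
  const details = body.details ? ` (${JSON.stringify(body.details)})` : "";
  return `[${body.code}] ${body.error}${details}`;
}

// Usage with the example payload from this section.
const message = toUiMessage({
  error: "GPU out of memory",
  code: "GPU_OOM",
  details: { requested: "8GB", available: "4GB" },
});
console.log(message);
```

Checking the shape at runtime keeps the UI from assuming that every upstream failure (for example, a proxy returning plain text) follows the structured format.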