Data Flow

End-to-End Image Generation

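A typical request passes through up to three backend services (imajin-prompt, imajin-diffusion, and optionally imajin-processing), coordinated by imajin-app: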

sequenceDiagram
    participant User
    participant UI as imajin-app
    participant Assist as imajin-prompt
    participant Gen as imajin-diffusion
    participant Proc as imajin-processing
    participant GPU

    User->>UI: Enter prompt description
    UI->>Assist: POST /analyze-context

    Note over Assist: Stage 1: Cultural Classification
    Assist->>GPU: Load classifier
    GPU-->>Assist: Classification result

    Note over Assist: Stage 2: LLM Reasoning
    Assist->>GPU: Load DeepSeek R1 70B
    GPU-->>Assist: Generated prompts
    Assist-->>UI: GenerationConfig + prompts

    User->>UI: Select prompts, click Generate
    UI->>Gen: POST /generate/async
    Gen-->>UI: { jobId: "abc123" }

    Note over Gen: Diffusion Model Inference
    Gen->>GPU: Load diffusion model

    loop Poll Status
        UI->>Gen: GET /jobs/abc123
        Gen-->>UI: { status: "processing" }
    end

    GPU-->>Gen: Generated image

    UI->>Gen: GET /jobs/abc123/result
    Gen-->>UI: { imageData: "base64..." }

    opt Post-Processing
        UI->>Proc: POST /derivatives
        Proc-->>UI: Processed variants
    end

    UI-->>User: Display final image
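
The asynchronous part of the diagram (submit, poll, fetch result) could be sketched on the client side roughly as follows. This is a minimal sketch, not the actual imajin-app code: the `PollDeps` interface, the `waitForImage` name, the polling interval, and the attempt limit are all assumptions made for illustration. The only status value shown in this document is `"processing"`, so the sketch treats any other status as finished, which a real client would likely need to refine.

```typescript
// Hypothetical response shapes, inferred from the diagram above.
type JobStatus = { status: string };
type JobResult = { imageData: string };

// Transport and timer are injected so the sketch does not depend on a
// particular HTTP client. Both names are assumptions for illustration.
interface PollDeps {
  getJson: (path: string) => Promise<unknown>;
  sleep: (ms: number) => Promise<void>;
}

async function waitForImage(
  jobId: string,
  deps: PollDeps,
  intervalMs = 1000,
  maxAttempts = 60,
): Promise<JobResult> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { status } = (await deps.getJson(`/jobs/${jobId}`)) as JobStatus;
    // Assumption: anything other than "processing" means the job is done.
    if (status !== "processing") {
      return (await deps.getJson(`/jobs/${jobId}/result`)) as JobResult;
    }
    await deps.sleep(intervalMs);
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} polls`);
}
```

Bounding the number of polls keeps the UI from waiting forever if a job is lost; the limit should comfortably exceed the generation durations listed under Request Types.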

Request Types

1. Prompt Generation Flow

Entry: POST /analyze-context (imajin-prompt)

User Input (category, filters)
    ↓
Cultural Classifier (fast, rule-based)
    ↓
LLM Reasoning (DeepSeek R1 70B)
    ↓
GenerationConfig + Image Prompts

Duration: 15-60 seconds (dominated by LLM inference)

2. Image Generation Flow

Entry: POST /generate or POST /generate/async (imajin-diffusion)

Image Prompt + Parameters
    ↓
Model Selection (photorealistic/anime)
    ↓
Diffusion Inference Pipeline
    ↓
Optional: Text Overlay
    ↓
Optional: Watermark
    ↓
Optional: Moderation
    ↓
Base64 Image Output

Duration: 5-30 seconds (depends on resolution)
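
One way to picture the optional steps after inference is as a chain of stages that are applied only when configured. The sketch below is an illustration under stated assumptions, not the imajin-diffusion implementation: the `Stage` type, the `PostInferenceOptions` field names, and the idea that each stage maps one base64 image to another are all hypothetical.

```typescript
// A stage takes an image (a base64 string here) and returns a new one.
// A moderation stage could throw to reject the image instead.
type Stage = (image: string) => string;

// Hypothetical option names; the flow above only says each step is optional.
interface PostInferenceOptions {
  textOverlay?: Stage;
  watermark?: Stage;
  moderation?: Stage;
}

function applyOptionalStages(image: string, opts: PostInferenceOptions): string {
  // Order follows the flow above: overlay, then watermark, then moderation.
  const ordered = [opts.textOverlay, opts.watermark, opts.moderation];
  return ordered
    .filter((stage): stage is Stage => stage !== undefined)
    .reduce((current, stage) => stage(current), image);
}
```

With no options set, the function returns the inference output unchanged, which matches the "Optional" labels in the flow.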

3. Post-Processing Flow

Entry: Various endpoints, such as POST /derivatives (imajin-processing)

Base64 Image Input
    ↓
Sanitization (metadata strip)
    ↓
Transformation (resize, crop)
    ↓
Format Conversion (WebP, JPEG)
    ↓
Quality Optimization
    ↓
Derivative Outputs

Duration: 1-5 seconds

Data Formats

Image Data

All image data is transmitted as base64-encoded strings:

interface GenerateResponse {
  imageData: string;  // base64-encoded PNG/WebP
  format: 'png' | 'webp';
  width: number;
  height: number;
}
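
Because base64 encodes every 3 bytes as 4 characters, an `imageData` string is roughly a third larger than the image it carries. The helper below is a small sketch for estimating the decoded size from the string alone, for example to enforce a payload limit before decoding. The function name is an assumption, and it expects standard padded base64 without a `data:` URL prefix.

```typescript
// Number of raw bytes encoded by a padded base64 string.
function decodedByteLength(base64: string): number {
  if (base64.length % 4 !== 0) {
    throw new Error("Expected padded base64 (length divisible by 4)");
  }
  // Each "=" at the end stands for one missing byte in the final group.
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}

decodedByteLength("aGVsbG8="); // 5, the length of "hello"
```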

Prompt Data

interface ParsedPrompt {
  name: string;           // Human-readable identifier
  prompt: string;         // Positive image prompt
  negativePrompt: string; // Negative image prompt
}

Error Propagation

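Errors propagate up the service chain: a low-level failure surfaces as an HTTP 500 from the service that owns it, and the UI renders that response as an error state: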

graph LR
    GPU[GPU OOM] --> GEN[imajin-diffusion 500]
    GEN --> UI[UI Error State]

    LLM[LLM Timeout] --> ASSIST[imajin-prompt 500]
    ASSIST --> UI

All services return structured error responses:

{
  "error": "GPU out of memory",
  "code": "GPU_OOM",
  "details": { "requested": "8GB", "available": "4GB" }
}
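
A client can check for this shape before deciding what to show the user. The following is a sketch under assumptions: the `ServiceError` interface mirrors the JSON example above, `details` is treated as optional, `GPU_OOM` is the only code known from this document, and the user-facing messages are invented for illustration.

```typescript
// Mirrors the structured error example; `details` is assumed optional.
interface ServiceError {
  error: string;
  code: string;
  details?: Record<string, unknown>;
}

function isServiceError(body: unknown): body is ServiceError {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return typeof b.error === "string" && typeof b.code === "string";
}

// Maps a response body to a message for the UI error state.
function toUserMessage(body: unknown): string {
  if (!isServiceError(body)) return "Unexpected error";
  switch (body.code) {
    case "GPU_OOM":
      return "The GPU ran out of memory. Try a smaller resolution.";
    default:
      // Unknown codes fall back to the service's own description.
      return body.error;
  }
}
```

Falling back to the `error` string for unrecognized codes keeps the UI informative when a service introduces a new code.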