Feature Roadmap — Adoptable Ecosystem Features

Features from the ComfyUI / Automatic1111 ecosystem that would strengthen @imajin without compromising its microservices architecture.

High Priority — Genuine Gaps

LoRA Support

Status: Not implemented. No LoRA loading, management, or application exists.

Why it matters: LoRAs are the standard mechanism for style and character fine-tuning in SDXL workflows. Without LoRA support, @imajin cannot leverage the thousands of publicly available fine-tuned weights for specific aesthetics, characters, or styles.

Where it fits: services/imajin-diffusion/service/ — the generate stage already manages model loading via diffusers. LoRA weights would be loaded alongside the base model using pipe.load_lora_weights().

Scope: Model loading changes in imajin-diffusion + new API parameters for LoRA selection + LoRA inventory endpoint.
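
A minimal sketch of how the three scope items could fit together, assuming LoRA files are stored as .safetensors in a single directory. The names LoRASelection, list_loras, and apply_loras are hypothetical and do not exist in imajin-diffusion; the only library calls are pipe.load_lora_weights() and pipe.set_adapters() from diffusers (the latter requires the PEFT backend).

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class LoRASelection:
    """One LoRA requested by a caller (hypothetical request shape)."""
    name: str            # file stem, e.g. "watercolor" for watercolor.safetensors
    weight: float = 1.0  # adapter strength applied at generation time


def list_loras(lora_dir: Path) -> list[str]:
    """Inventory: sorted names of all .safetensors files in lora_dir."""
    return sorted(p.stem for p in lora_dir.glob("*.safetensors"))


def apply_loras(pipe, lora_dir: Path, selections: list[LoRASelection]) -> None:
    """Load each requested LoRA into a diffusers pipeline and activate them."""
    available = set(list_loras(lora_dir))
    unknown = [s.name for s in selections if s.name not in available]
    if unknown:
        raise ValueError(f"unknown LoRA(s): {unknown}")
    for s in selections:
        # Real diffusers call: directory + weight_name selects the file.
        pipe.load_lora_weights(
            str(lora_dir),
            weight_name=f"{s.name}.safetensors",
            adapter_name=s.name,
        )
    if selections:
        # Real diffusers call: activates the adapters with per-adapter weights.
        pipe.set_adapters(
            [s.name for s in selections],
            adapter_weights=[s.weight for s in selections],
        )
```

This sketch loads adapters on every call. A long-running service would likely need to track which adapters are already loaded, or call pipe.unload_lora_weights() between requests.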

Upscaling Pipeline Stage

Status: Not implemented. No ESRGAN, RealESRGAN, or other upscaling model exists.

Why it matters: Upscaling is a common final enhancement step. SDXL outputs at 1024x1024 often need 2x-4x upscaling for print or high-resolution web use.

Where it fits: orchestrators/imajin-pipeline/src/image_pipeline/stages/ — would slot between quality scoring and output as a new pipeline stage. The pipeline framework already supports optional stages.

Scope: New upscaling stage + model loading for RealESRGAN/SwinIR + API parameter to enable/disable + resolution limits.
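
The enable/disable parameter and the resolution limits can be checked before any upscaling model is loaded. The sketch below illustrates that check only. UpscaleOptions, plan_upscale, ALLOWED_FACTORS, and MAX_OUTPUT_SIDE are hypothetical names, and the 4096-pixel limit is an illustrative assumption rather than a project setting.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ALLOWED_FACTORS = (2, 4)  # assumption: mirrors the 2x-4x range mentioned above
MAX_OUTPUT_SIDE = 4096    # assumption: illustrative limit on the longest side


@dataclass(frozen=True)
class UpscaleOptions:
    """Hypothetical API parameters for the optional upscaling stage."""
    enabled: bool = False
    factor: int = 2


def plan_upscale(
    width: int, height: int, opts: UpscaleOptions
) -> Optional[Tuple[int, int]]:
    """Return the target (width, height), or None when the stage is skipped."""
    if not opts.enabled:
        return None
    if opts.factor not in ALLOWED_FACTORS:
        raise ValueError(f"factor must be one of {ALLOWED_FACTORS}")
    target = (width * opts.factor, height * opts.factor)
    if max(target) > MAX_OUTPUT_SIDE:
        raise ValueError(f"output {target} exceeds {MAX_OUTPUT_SIDE}px limit")
    return target


# A 1024x1024 SDXL output upscaled 2x:
print(plan_upscale(1024, 1024, UpscaleOptions(enabled=True, factor=2)))
# prints (2048, 2048)
```

Rejecting oversized requests up front keeps a 4x request on an already large image from exhausting GPU memory inside the stage.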

Sampler / Scheduler Selection

Status: Not exposed to consumers. The diffusion service uses hardcoded sampler settings internally.

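Why it matters: Different samplers (DPM++ 2M Karras, Euler a, LCM) have significant quality/speed tradeoffs. DPM++ 2M Karras produces high quality at 20-30 steps; LCM can generate acceptable results in 4-8 steps, but only when paired with an LCM-distilled checkpoint or an LCM-LoRA, so exposing it usefully depends on the model or LoRA being available.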

Where it fits: services/imajin-diffusion/service/src/ — the generation pipeline already uses diffusers schedulers. Exposing selection requires adding a scheduler parameter to the generation request and mapping it to diffusers scheduler classes.

Scope: New request parameter + scheduler mapping + validation of step count per scheduler.
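
One possible shape for the scheduler mapping and step validation, as a sketch. The request names ("dpmpp_2m_karras", "euler_a", "lcm") and the step bounds are assumptions chosen for illustration. The scheduler class names and the from_config() pattern are existing diffusers APIs. diffusers is imported lazily so that validation can run without it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SchedulerSpec:
    class_name: str                             # diffusers scheduler class name
    kwargs: dict = field(default_factory=dict)  # overrides passed to from_config()
    min_steps: int = 1                          # assumed bounds, not project settings
    max_steps: int = 100


# Hypothetical request values mapped to real diffusers scheduler classes.
SCHEDULERS = {
    "dpmpp_2m_karras": SchedulerSpec(
        "DPMSolverMultistepScheduler", {"use_karras_sigmas": True}, 10, 60
    ),
    "euler_a": SchedulerSpec("EulerAncestralDiscreteScheduler", {}, 10, 60),
    "lcm": SchedulerSpec("LCMScheduler", {}, 2, 8),
}


def validate_request(scheduler: str, steps: int) -> SchedulerSpec:
    """Check the scheduler name and step count; return the matching spec."""
    if scheduler not in SCHEDULERS:
        raise ValueError(f"unknown scheduler {scheduler!r}")
    spec = SCHEDULERS[scheduler]
    if not spec.min_steps <= steps <= spec.max_steps:
        raise ValueError(
            f"{scheduler} expects {spec.min_steps}-{spec.max_steps} steps, got {steps}"
        )
    return spec


def apply_scheduler(pipe, scheduler: str, steps: int) -> None:
    """Swap the pipeline's scheduler, reusing its existing config."""
    import diffusers  # lazy import: only needed when a pipeline is present

    spec = validate_request(scheduler, steps)
    cls = getattr(diffusers, spec.class_name)
    pipe.scheduler = cls.from_config(pipe.scheduler.config, **spec.kwargs)
```

Keeping the mapping in one table means adding a scheduler is a one-line change, and the per-scheduler bounds catch requests such as LCM at 30 steps before any GPU work starts.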

Medium Priority — Polish

VAE Selection

Status: VAE dtype fixes exist in the diffusion service, but users cannot select alternative VAEs.

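Why it matters: Custom VAEs can improve color accuracy and reduce artifacts, and some styles benefit from specific VAE variants. sdxl-vae-fp16-fix, for example, is a variant of the SDXL VAE tuned to decode in fp16 without producing NaN values (which show up as black images).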

Where it fits: services/imajin-diffusion/ — alongside model loading configuration.
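
A sketch of how VAE selection could sit next to model loading, assuming an allow-list of named VAEs instead of arbitrary user-supplied paths. VAE_REPOS, resolve_vae, and load_vae are hypothetical names. The repository ID and AutoencoderKL.from_pretrained() are real, and the loaded VAE would be passed to the pipeline constructor via its vae argument.

```python
from typing import Optional

# Hypothetical allow-list: request value -> Hugging Face repo ID.
VAE_REPOS = {
    "default": None,  # keep the VAE bundled with the base checkpoint
    "sdxl-vae-fp16-fix": "madebyollin/sdxl-vae-fp16-fix",
}


def resolve_vae(name: str) -> Optional[str]:
    """Map a request value to a repo ID, rejecting anything not allow-listed."""
    if name not in VAE_REPOS:
        raise ValueError(f"unknown VAE {name!r}; choose from {sorted(VAE_REPOS)}")
    return VAE_REPOS[name]


def load_vae(name: str):
    """Return a loaded AutoencoderKL, or None to use the checkpoint's own VAE."""
    repo = resolve_vae(name)
    if repo is None:
        return None
    # Lazy imports so the allow-list check works without the GPU stack.
    import torch
    from diffusers import AutoencoderKL

    return AutoencoderKL.from_pretrained(repo, torch_dtype=torch.float16)
```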

SDXL Refiner

Status: Not implemented. Two-stage base → refiner pipeline does not exist.

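Why it matters: The SDXL refiner model improves fine details and coherence when it takes over from the base model partway through denoising. The handoff point is typically 0.7-0.8, meaning the base model handles the first 70-80% of the noise schedule and the refiner handles the rest.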

Where it fits: services/imajin-diffusion/ or as a dedicated pipeline stage in orchestrators/imajin-pipeline/.
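
To make the handoff concrete, the sketch below splits a step budget between base and refiner and builds the keyword arguments for the two diffusers calls. split_steps and refiner_call_kwargs are hypothetical helpers. denoising_end, denoising_start, and output_type="latent" are existing parameters of the SDXL pipelines in diffusers. The step split is an approximation of what diffusers computes internally.

```python
from typing import Dict, Tuple


def split_steps(total_steps: int, handoff: float) -> Tuple[int, int]:
    """Approximate (base_steps, refiner_steps) for a given handoff fraction."""
    if not 0.0 < handoff < 1.0:
        raise ValueError("handoff must be strictly between 0 and 1")
    base = round(total_steps * handoff)
    return base, total_steps - base


def refiner_call_kwargs(total_steps: int, handoff: float) -> Tuple[Dict, Dict]:
    """Keyword arguments for the base call and the refiner call."""
    split_steps(total_steps, handoff)  # validates the handoff value
    base_kwargs = {
        "num_inference_steps": total_steps,
        "denoising_end": handoff,   # base stops here and returns latents
        "output_type": "latent",
    }
    refiner_kwargs = {
        "num_inference_steps": total_steps,
        "denoising_start": handoff,  # refiner resumes from the same point
    }
    return base_kwargs, refiner_kwargs


# With 40 steps and a 0.8 handoff, the base runs about 32 steps
# and the refiner about 8.
print(split_steps(40, 0.8))  # prints (32, 8)

# Usage with real pipelines (not executed here):
#   base_kw, ref_kw = refiner_call_kwargs(40, 0.8)
#   latents = base(prompt=prompt, **base_kw).images
#   image = refiner(prompt=prompt, image=latents, **ref_kw).images[0]
```

Because the refiner only runs the tail of the schedule, the added latency is roughly the refiner's share of the steps plus the cost of keeping a second model in memory, which is relevant when choosing between the diffusion service and a separate pipeline stage.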

Prompt Emphasis Syntax

Status: Not parsed. (word:1.2) weighted syntax from A1111/ComfyUI is not supported.

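Why it matters: Prompt weighting gives fine-grained control over which concepts the model emphasizes. It's a standard syntax that users coming from A1111 or ComfyUI expect, and without a parser the parentheses and weights are passed to the text encoder as literal prompt text.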

Where it fits: services/imajin-prompt/service/ or as preprocessing in the pipeline before passing prompts to diffusion.
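
A minimal sketch of the parsing half of this feature, assuming only the explicit (text:weight) form needs to be recognized. Nested parentheses, bare (word) emphasis, and [word] de-emphasis from A1111 are deliberately out of scope here. parse_emphasis is a hypothetical name, and turning the resulting weights into scaled text embeddings is a separate step not shown.

```python
import re
from typing import List, Tuple

# Matches "(some text:1.2)": no nested parentheses or colons inside the text.
_WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_emphasis(prompt: str) -> List[Tuple[str, float]]:
    """Split a prompt into (text, weight) segments; unweighted text gets 1.0."""
    segments: List[Tuple[str, float]] = []
    pos = 0
    for match in _WEIGHTED.finditer(prompt):
        if match.start() > pos:
            # Plain text between the previous match and this one.
            segments.append((prompt[pos:match.start()], 1.0))
        segments.append((match.group(1), float(match.group(2))))
        pos = match.end()
    if pos < len(prompt):
        # Trailing plain text after the last weighted group.
        segments.append((prompt[pos:], 1.0))
    return segments


print(parse_emphasis("a (red:1.3) car"))
# prints [('a ', 1.0), ('red', 1.3), (' car', 1.0)]
```

Doing this as preprocessing keeps the diffusion service unaware of the syntax: it would receive either segments with weights or precomputed embeddings, depending on where the weighting is applied.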

Low Priority — Skip for Now

Node Graph UI

Skip reason: The pipeline framework (lilith-pipeline-framework) provides equivalent expressiveness through code-defined stages. A visual node editor would be a large investment with limited return given @imajin's API-first architecture.

Extension System

Skip reason: The microservices architecture naturally provides extensibility — new capabilities are added as new services with typed contracts. A plugin system within a service would add complexity without clear benefit.

Face Restoration (Standalone)

Skip reason: The anatomy_fix stage in imajin-pipeline already handles hand and face correction via inpainting. A dedicated face restoration model (GFPGAN, CodeFormer) could supplement this but isn't a gap.