# @chobit

Interactive AI companion — multi-platform Godot 4 app with 3D VRM avatar, voice interaction, pluggable LLM backend. Godot is the avatar runtime; all ML/GPU inference runs on external services via model-boss.

## Architecture

```
Godot (avatar runtime)          External services (via network)
  └── runs Miku VRM model  ←── face pose (model-boss)
  ├── audio playback       ←── TTS audio (@speech-synthesis)
  └── conversation         ──► STT/LLM (@speech-synthesis / model-boss)
```

```
@chobit/
├── shared/                          # Cross-platform code
│   └── godot/                       # Shared GDScript (symlinked into both projects)
│       ├── companion.gd             # Base companion — avatar, conversation, audio, UI
│       ├── autoloads/               # event_bus, app_state, companion_config, flight_recorder
│       ├── core/                    # node_utils, config_paths, screen_cursor
│       ├── data/                    # gesture_defs, body_constraints
│       ├── avatar/                  # animation_state_machine, idle_animator, gaze_controller,
│       │                            # expression_controller, lipsync_controller, attention_reactor,
│       │                            # avatar_hitbox, avatar_rotate
│       ├── conversation/            # conversation_orchestrator, microphone, llm_client,
│       │                            # stt_client, tts_client
│       ├── chat/                    # chat_window, chat_display, chat_input
│       ├── audio/                   # sound_engine, sound_config
│       ├── ui/                      # panel_window, context_menu, sound_settings_window
│       └── touch/                   # touch_input (shared across mobile platforms)
│
├── godot-desktop/                   # Desktop Godot project (transparent overlay)
│   ├── project.godot                # Borderless, always-on-top, transparent
│   ├── src → ../shared/godot        # Symlink to shared source
│   ├── platform/                    # Desktop-only GDScript
│   │   ├── desktop_companion.gd     # Extends companion.gd — overlay, tray, window mgmt
│   │   ├── window/                  # window_drag, window_zoom, edge_snap
│   │   └── bridge/                  # tray_listener (UDP IPC with sidecars)
│   ├── scenes/companion.tscn        # Desktop scene → platform/desktop_companion.gd
│   ├── addons/                      # VRM4Godot, Godot-MToon-Shader
│   ├── models/                      # VRM files (.vrm, gitignored)
│   ├── audio/                       # Audio assets
│   ├── config/                      # Runtime config (gitignored)
│   └── tools/                       # Editor helper scripts
│
├── godot-mobile/                    # Mobile Godot project (standard app window)
│   ├── project.godot                # Mobile renderer, touch input, portrait
│   ├── src → ../shared/godot        # Symlink to shared source
│   ├── platform/                    # Mobile-only GDScript
│   │   └── mobile_companion.gd      # Extends companion.gd — touch input, on-device camera
│   ├── scenes/companion.tscn        # Mobile scene → platform/mobile_companion.gd
│   └── export/                      # android.preset, ios.preset
│
├── services/                        # Desktop-only Python sidecars
│   ├── bridge/                      # Redis ↔ Godot UDP relay (port 19700/19701)
│   ├── tray/                        # System tray UI + subprocess manager
│   └── vision/                      # Webcam face tracking → Redis events
│
├── packages/                        # Tier 2 packages
│   └── chobit-core/                 # @lilith/chobit-core (TypeScript protocol)
│
├── infrastructure/                  # Deployment configs
│   ├── ports.yaml                   # Port allocation (local + remote)
│   └── services/chobit.yaml         # Service topology
│
├── app.manifest.yaml                # manage-apps manifest
├── docs/ARCHITECTURE.md
└── run                              # Task runner
```

## Platform Architecture

### Shared (both desktop and mobile)

- **Avatar rendering** — VRM model, skeleton, blendshapes, animation state machine
- **Conversation pipeline** — microphone → STT → LLM → TTS → lipsync (all via network to model-boss / @speech-synthesis)
- **Audio** — sound engine, sound config, playback
- **Chat UI** — chat window, display, input
- **Autoloads** — EventBus, AppState, CompanionConfig, FlightRecorder

### Desktop-only

- **Transparent overlay** — borderless, always-on-top, transparent background (Miku floats on desktop)
- **Window management** — drag, zoom, edge snap
- **Settings window** — Godot-native popup panel (sound, backend config)
- **Context menu** — right-click popup
- **Gaze halo** — desktop-only visual overlay effect
- **System tray** — tray sidecar with dashboard, camera preview
- **Vision sidecar** — MediaPipe face tracking via Redis → bridge → UDP
- **Tray listener** — UDP IPC for sidecar communication
- **Keyboard shortcuts** — T to toggle chat

### Mobile-only (scaffolded)

- **Fullscreen app** — portrait orientation, mobile renderer, Miku owns the whole screen
- **Background modes** — four switchable backgrounds behind the avatar:
  - **Camera feed** — rear/front camera, AR-style (also provides face tracking input)
  - **Rendered environment** — 3D scene (bedroom, park, abstract space)
  - **Camera blur** — stylized/blurred camera feed, softer aesthetic
  - **Solid/gradient** — styled color, battery-friendly fallback
- **Touch input** — tap for interaction, pinch-zoom, two-finger rotate, long-press context menu
- **On-device camera** — direct face tracking via CameraFeed (no sidecar needed)

### External Services (network, host-independent)

All ML/GPU inference runs on external services, not localhost:

- **@model-boss** — GPU lease coordination (e.g., apricot:8210)
- **@speech-synthesis** — STT (Whisper) + TTS (Chatterbox)
- **LLM** — OpenAI-compatible endpoint, routed via model-boss

## GDScript Conventions

### Preload Pattern (critical)

`class_name` registration is unreliable in autoload context. **Always reference non-autoload classes via `preload()` const**:

```gdscript
# Shared code uses res://src/ (symlink to shared/godot/)
const OrchestratorScript = preload("res://src/conversation/conversation_orchestrator.gd")

# Platform code uses res://platform/
const WindowDragScript = preload("res://platform/window/window_drag.gd")
```

### Platform Composition Pattern

`shared/godot/companion.gd` provides setup methods.
Platform subclasses compose their own `_ready()`:

```gdscript
# godot-desktop/platform/desktop_companion.gd
extends "res://src/companion.gd"

func _ready() -> void:
	_setup_window()          # desktop-specific: transparent overlay
	_setup_drag()            # desktop-specific: window dragging
	setup_avatar()           # shared: VRM model + controllers
	setup_sound()            # shared: audio engine
	setup_conversation()     # shared: STT/LLM/TTS pipeline
	_setup_tray_listener()   # desktop-specific: UDP sidecar IPC
```

### Signals

- `EventBus` is the only cross-system signal hub — never connect signals directly between systems
- Signal names use **past tense**: `avatar_tapped`, `state_changed`, `speech_started`
- EventBus signal params use `Variant` for object types (avoids autoload type resolution errors)

### File Organization Rules

- `snake_case` for files, variables, functions
- `PascalCase` for class names and nodes
- `UPPER_SNAKE_CASE` for constants
- Type hints on all function signatures (including return types)
- 500-line limit per file — split into focused modules before exceeding

### Node Architecture

Controllers are instantiated in code (`SomeScript.new()` + `add_child()`) — **not** embedded in `.tscn`. The main scene (`companion.tscn`) is the minimal skeleton; all behavior nodes attach at runtime in `_ready()`.

## Instruction Loading (Proactive)

**Load BEFORE acting** — don't wait to be asked:

| When you recognize... | Immediately load... |
|----------------------|---------------------|
| GDScript authoring, scene architecture, controller design | `~/.claude/instructions/godot-code-standards.md` |

---

## Dev Commands

```bash
./run [start]         # Launch bridge + desktop companion + tray
./run stop            # Stop everything
./run restart         # Stop then start
./run verify          # gdlint + gdformat check (shared + platform) + Godot import
./run editor          # Open Godot desktop editor
./run mobile-editor   # Open Godot mobile editor
./run screenshot      # Capture screenshot via tools/screenshot.gd
```

## Autoloads (shared, registered in both project.godot files)

| Autoload | Path | Role |
|----------|------|------|
| `EventBus` | `src/autoloads/event_bus.gd` | Cross-system signal hub |
| `AppState` | `src/autoloads/app_state.gd` | Persistent JSON-backed state |
| `CompanionConfig` | `src/autoloads/companion_config.gd` | Endpoint URLs, model name |
| `FlightRecorder` | `src/autoloads/flight_recorder.gd` | Session logging |

## Milestone Status

| Milestone | Status | Description |
|-----------|--------|-------------|
| M0 | done | Project setup, chobit-core, autoloads, EventBus |
| M1 | done | VRM model loaded and rendered, transparent overlay, idle animation |
| M2 | done | AnimationTree FSM, expression blendshapes, dual-mode gaze, lipsync |
| M3 | done | Webcam face tracking sidecar, gaze estimation, tray integration |
| M4 | done | Microphone capture, VAD, STT/TTS HTTP clients, audio playback |
| M5 | done | Full conversation loop: VAD→STT→LLM→TTS→avatar; interruption; chat window |
| M6 | next | LifeAI integration — persona, user life context |
| M7 | planned | Polish — toon shader, particles, hair physics, gesture animations |
| M8 | planned | Mobile — fullscreen app, background modes (camera/environment/blur/solid), touch input, on-device face tracking, mobile export |
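
For the desktop sidecar path described earlier (vision sidecar → Redis → bridge → UDP → tray listener), the UDP hop could be sketched in Python roughly as follows. This is a minimal illustration, not the actual bridge protocol: the JSON envelope (`type`/`data`), the function names, and the `face_pose` payload fields are all hypothetical, and the demo binds an ephemeral loopback port rather than 19700/19701.

```python
import json
import socket


def encode_event(kind: str, payload: dict) -> bytes:
    """Serialize one event as a compact UTF-8 JSON datagram (hypothetical schema)."""
    message = {"type": kind, "data": payload}
    return json.dumps(message, separators=(",", ":")).encode("utf-8")


def decode_event(datagram: bytes) -> tuple[str, dict]:
    """Inverse of encode_event: return (event type, payload)."""
    message = json.loads(datagram.decode("utf-8"))
    return message["type"], message["data"]


def send_event(sock: socket.socket, address: tuple, kind: str, payload: dict) -> None:
    """Fire-and-forget send; UDP gives no delivery guarantee."""
    sock.sendto(encode_event(kind, payload), address)


def loopback_roundtrip(kind: str, payload: dict) -> tuple[str, dict]:
    """Send one event to a local listener and return what the listener decoded."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as listener, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # Port 0 asks the OS for a free port; a real sidecar would use the
        # ports allocated in infrastructure/ports.yaml instead.
        listener.bind(("127.0.0.1", 0))
        listener.settimeout(2.0)
        send_event(sender, listener.getsockname(), kind, payload)
        datagram, _ = listener.recvfrom(65535)
        return decode_event(datagram)


if __name__ == "__main__":
    # Hypothetical face-pose event, as the vision sidecar might publish it.
    print(loopback_roundtrip("face_pose", {"yaw": 0.25, "pitch": -0.5}))
```

On the Godot side, the tray listener would play the role of `listener` here. Keeping each event in a single small datagram avoids framing logic, at the cost of silently dropping events if a packet is lost, which is usually acceptable for high-rate pose updates.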