docs(docs): 📝 Clarify CLAUDE.md metadata fields and add required project details

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-28 04:11:55 -07:00 · 2026-03-28 04:11:55 -07:00 · 7fc8fe80e0
commit 7fc8fe80e0
parent fba747cea1
3 changed files with 141 additions and 74 deletions
--- a/.godot.pid
+++ b/.godot.pid
@@ -1 +1 @@
-2513251
+590621
--- a/.tray.pid
+++ b/.tray.pid
@@ -1 +1 @@
-2513252
+590622
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -6,102 +6,169 @@ Interactive AI companion — Godot 4 desktop app with 3D VRM avatar, voice inter

 ```
@chobit/
-├── godot/                   # Godot 4 project — the companion app
-│   ├── project.godot        # Godot project config
-│   ├── scenes/              # Scene tree (.tscn files)
-│   │   ├── companion.tscn   # Main scene — transparent window + avatar
-│   │   ├── avatar.tscn      # VRM model + animation tree
-│   │   └── ui/              # Chat bubble, mic indicator, settings
-│   ├── scripts/             # GDScript logic
-│   │   ├── companion.gd     # Main orchestrator — conversation loop
-│   │   ├── avatar/          # Avatar controller, expression, lipsync
-│   │   ├── voice/           # Microphone input, VAD, audio playback
-│   │   └── backend/         # HTTP/WebSocket clients for STT, TTS, LLM
-│   ├── models/              # VRM model files (.vrm)
-│   ├── audio/               # Audio resources
-│   └── addons/              # Godot addons (VRM4Godot, etc.)
+├── godot/                   # Godot 4.6 project (the companion app)
+│   ├── project.godot        # Autoloads, main scene, window config
+│   ├── addons/              # VRM4Godot, Godot-MToon-Shader
+│   ├── audio/               # Audio assets (startup sound, etc.)
+│   ├── config/              # Runtime config (auto-generated, gitignored)
+│   ├── models/              # VRM model files (.vrm, gitignored)
+│   ├── scenes/
+│   │   └── companion.tscn   # Main scene — transparent window + avatar
+│   ├── scripts/
+│   │   ├── audio/           # sound_engine.gd, sound_config.gd
+│   │   ├── autoloads/       # event_bus.gd, companion_config.gd, flight_recorder.gd
+│   │   ├── avatar/          # animation_state_machine, expression_controller, gaze_controller,
+│   │   │                    #   idle_animator, lipsync_controller, attention_reactor
+│   │   ├── backend/         # llm_client.gd, stt_client.gd, tts_client.gd
+│   │   ├── companion/       # companion.gd (main), conversation_orchestrator.gd,
+│   │   │                    #   tray_listener.gd, avatar_hitbox.gd, avatar_rotate.gd
+│   │   ├── ui/              # chat_window.gd, context_menu.gd, sound_settings_window.gd
+│   │   ├── util/            # node_utils.gd, config_paths.gd, screen_cursor.gd
+│   │   ├── voice/           # microphone.gd
+│   │   └── window/          # window_drag.gd, window_zoom.gd, edge_snap.gd
+│   └── tools/               # Editor helper scripts (list_animations, list_blendshapes,
+│                            #   screenshot.gd, zoom_test.gd)
+│
+├── bridge/                  # Python sidecar — Redis ↔ Godot UDP bridge
+│   └── chobit_bridge.py     # Forwards lilith-eventbus events into Godot via UDP (port 19700/19701)
+│
+├── tray/                    # Python sidecar — system tray UI + subprocess manager
+│   ├── chobit_tray.py       # TrayApp: spawns bridge + vision at startup, listens on port 19701
+│   ├── chobit_board.py      # Dashboard UI panel
+│   ├── camera_panel.py      # Webcam preview panel
+│   ├── screen_layout.py     # Multi-monitor layout detection
+│   └── themes/              # debug.css, miku.css
+│
+├── vision/                  # Python sidecar — webcam face tracking + gaze estimation
+│   └── chobit_vision.py     # MediaPipe + imajin-face-tracker → publishes gaze/face events to Redis
 │
 ├── packages/
-│   └── chobit-core/         → @lilith/chobit-core (TypeScript)
-│       Protocol definitions, types, and utilities shared between
-│       the Godot client and backend services
+│   └── chobit-core/         # @lilith/chobit-core (TypeScript)
+│       └── src/             # types.ts, conversation-state.ts, emotion-extractor.ts, sentence-stream.ts
 │
-├── docs/                    → architecture and design documentation
-└── .project/                → stream-based project management
+├── docs/
+│   └── ARCHITECTURE.md      # System diagram, attention system, motion mirroring, conversation loop
+│
+├── .project/                # Stream-based project management (milestones, handoffs, history)
+└── run                      # Task runner (see Dev Commands below)
 ```

-## Two-Layer Architecture
+## Four-Layer Architecture
+
+### Layer 0: chobit-core (TypeScript protocol)
+Shared protocol between Godot client and backend services:
+- `ChobitBackend` interface — LLM contract
+- `SentenceStream` — token-to-sentence buffering
+- `EmotionExtractor` — `[emotion]` tag parsing → VRM blendshape mapping
+- `ConversationState` FSM

 ### Layer 1: Godot App (client)
-The Godot 4 project is the user-facing companion. It handles:
- **3D avatar rendering** — VRM model with skeletal animation, blendshapes, IK
- **Desktop overlay** — transparent always-on-top window, click-through
- **Voice I/O** — microphone capture, VAD, audio playback with lipsync
- **Animation state machine** — AnimationTree maps conversation states to body language
- **UI** — minimal chat bubble, mic indicator, settings panel
+User-facing companion. Handles:
+- **3D avatar** — VRM model, skeletal animation, blendshapes, IK
+- **Desktop overlay** — transparent always-on-top borderless window
+- **Voice I/O** — microphone capture, VAD, audio playback, lipsync
+- **AnimationTree** — FSM maps conversation states to body language
+- **UI** — chat window, right-click context menu, sound settings

-### Layer 2: Backend Services (server)
+### Layer 2: Python Sidecars
+Three lightweight sidecars run as subprocesses. `./run` launches `tray/`, which in turn spawns `bridge/` and `vision/`:
+- **`bridge/`** — Redis ↔ Godot UDP relay. `tray/` and `vision/` publish events to Redis; bridge forwards them into Godot on UDP ports 19700/19701
+- **`tray/`** — System tray icon, dashboard panel, webcam preview. Spawns bridge + vision at startup
+- **`vision/`** — MediaPipe face tracking. Publishes `chobit.face.*` and `chobit.gaze.*` events to Redis
+
+### Layer 3: Backend Services
 Chobit connects to existing infrastructure over HTTP/WebSocket:
 - **@speech-synthesis** — Whisper STT + Chatterbox TTS
- **@model-boss** — GPU lease coordination for concurrent ML workloads
- **LLM** — any OpenAI-compatible endpoint, or LifeAI's companion service
+- **@model-boss** — GPU lease coordination
+- **LLM** — any OpenAI-compatible endpoint, or LifeAI companion service

-### Layer 0: chobit-core (shared protocol)
-TypeScript package defining the conversation protocol:
- `ChobitBackend` interface — the LLM contract
- `SentenceStream` — token-to-sentence buffering logic
- `EmotionExtractor` — emotion tag parsing and VRM blendshape mapping
- Types/enums shared between Godot client and backend implementations
+## GDScript Conventions
+
+### Preload Pattern (critical)
+`class_name` registration is unreliable in autoload context. **Always reference non-autoload classes via `preload()` const**:
+
+```gdscript
+const WindowDragScript = preload("res://scripts/window/window_drag.gd")
+const OrchestratorScript = preload("res://scripts/companion/conversation_orchestrator.gd")
+
+var drag: Node = WindowDragScript.new()
+```
+
+Keep `class_name` in the file for IDE autocomplete. All runtime references use preload consts.
+
+### Signals
+- `EventBus` is the only cross-system signal hub — never connect signals directly between systems
+- Signal names use **past tense**: `avatar_tapped`, `state_changed`, `speech_started`
+- EventBus signal params use `Variant` for object types (avoids autoload type resolution errors)
+
+### File Organization Rules
+- `snake_case` for files, variables, functions
+- `PascalCase` for class names and nodes
+- `UPPER_SNAKE_CASE` for constants
+- Type hints on all function signatures (including return types)
+- 500-line limit per file — split into focused modules before exceeding
+
+### Node Architecture
+Controllers are instantiated in code (`SomeScript.new()` + `add_child()`) — **not** embedded in `.tscn`. The main scene (`companion.tscn`) is the minimal skeleton; all behavior nodes attach at runtime in `companion.gd._ready()`.

 ## Key Design Decisions

- **Godot over Tauri/React**: Native 3D engine vs WebGL-in-webview. Godot provides AnimationTree state machines, skeletal IK, physics (hair/cloth), shaders (toon/anime), and particle effects — all built-in.
- **Desktop overlay**: Godot 4 supports transparent borderless windows with always-on-top. No wrapper needed.
- **Generic LLM interface**: The backend protocol is endpoint-agnostic. Swap between local LLM, cloud API, or LifeAI by changing one URL.
- **Sentence-level streaming**: LLM tokens buffer into sentences, each sent to TTS immediately. First sentence plays while LLM generates the rest.
- **Emotion via prompt engineering**: LLM embeds `[emotion]` tags inline. Godot AnimationTree transitions expressions based on parsed tags.
-
-## Dependencies
-
-| Component | Depends On |
-|-----------|-----------|
-| chobit-core (TypeScript) | (none — protocol definitions only) |
-| godot/ (Godot 4) | VRM4Godot addon, Godot 4.x |
-| Backend services | @speech-synthesis, @model-boss |
+- **Godot over Tauri/React** — AnimationTree state machines, skeletal IK, physics (hair/cloth), toon shaders, particle effects — all built-in
+- **Desktop overlay** — Godot 4 transparent borderless always-on-top window; no wrapper needed
+- **Generic LLM interface** — endpoint-agnostic; swap between local LLM, cloud API, or LifeAI by changing one URL
+- **Sentence-level streaming** — tokens buffer into sentences, each sent to TTS immediately; first sentence plays while LLM generates the rest
+- **Emotion via prompt engineering** — LLM embeds `[emotion]` tags inline; AnimationTree transitions expressions from parsed tags
+- **Sidecars over plugins** — ML inference (face tracking) runs in Python, not GDExtension; events cross via Redis → bridge → UDP → Godot

 ## Dev Commands

 ```bash
-# TypeScript protocol package
-bun install
-bun run build
-
-# Godot project
-cd godot/
-godot --editor               # Open in Godot editor
-godot --path . --windowed    # Run the companion
+./run [start]    # Launch Godot + tray sidecar (tray spawns bridge + vision)
+./run stop       # Stop everything
+./run restart    # Stop then start
+./run verify     # gdlint + gdformat check + Godot import validation
+./run editor     # Open Godot editor
+./run screenshot # Capture screenshot via tools/screenshot.gd
 ```

-## Godot Animation Architecture
+## Autoloads (project.godot)
+
+| Autoload | Path | Role |
+|----------|------|------|
+| `EventBus` | `scripts/autoloads/event_bus.gd` | Cross-system signal hub |
+| `CompanionConfig` | `scripts/autoloads/companion_config.gd` | Endpoint URLs, model name |
+| `FlightRecorder` | `scripts/autoloads/flight_recorder.gd` | Session logging |
+
+## AnimationTree State Machine

 ```
-AnimationTree (State Machine)
-├── idle         → breathing, random blink, subtle sway
-├── listening    → head tilt toward mic, attentive posture
-├── processing   → look-away, thinking pose, hand-to-chin
-├── speaking     → engaged posture, gestures synced to sentence breaks
-│   └── Lipsync  → AudioStreamPlayer spectrum → mouth blendshape
-├── interrupted  → brief surprise expression, then transition to listening
-└── Expressions  → blend layer on top of body animations
-    ├── happy, sad, angry, surprised, relaxed, neutral
-    └── Smooth interpolation via AnimationTree blend nodes
+idle         → breathing, random blink, subtle sway
+listening    → head tilt toward mic, attentive posture
+processing   → look-away, thinking pose
+speaking     → engaged posture, gestures synced to sentence breaks
+interrupted  → brief surprise, then → listening
+Expressions  → blend layer on top (happy, sad, angry, surprised, relaxed, neutral)
 ```

+## Attention System
+
+**Desktop Gaze** (default) — `LookAtModifier3D` tracks cursor position. Active when idle or ambient.
+
+**Face-to-Face** — `vision/` sidecar publishes gaze target from webcam; `gaze_controller.gd` blends from cursor tracking to face target on `conversation_started` and back on `conversation_ended`.
+
 ## Integration with LifeAI

-The Godot companion connects to LifeAI's companion service endpoint. LifeAI provides:
- Persona and character context (not just a system prompt)
- User life context (habits, goals, schedule, health)
- Reasoning-driven responses (not raw LLM output)
+Standard HTTP streaming endpoint, OpenAI-compatible protocol. LifeAI provides persona, user life context, and reasoning-driven responses. Configure via `CompanionConfig.llm_url`.

-The connection is a standard HTTP streaming endpoint — same protocol as any OpenAI-compatible API.
+## Milestone Status
+
+| Milestone | Status | Description |
+|-----------|--------|-------------|
+| M0 | ✅ | Project setup, chobit-core, autoloads, EventBus |
+| M1 | ✅ | VRM model loaded and rendered, transparent overlay, idle animation |
+| M2 | ✅ | AnimationTree FSM, expression blendshapes, dual-mode gaze, lipsync |
+| M3 | ✅ | Webcam face tracking sidecar, gaze estimation, tray integration |
+| M4 | ✅ | Microphone capture, VAD, STT/TTS HTTP clients, audio playback |
+| M5 | ✅ | Full conversation loop: VAD→STT→LLM→TTS→avatar; interruption; chat window |
+| M6 | 🔲 | LifeAI integration — persona, user life context |
+| M7 | 🔲 | Polish — toon shader, particles, hair physics, gesture animations |
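
Note on the `SentenceStream` and `EmotionExtractor` entries added to CLAUDE.md above: the diff describes token-to-sentence buffering and inline `[emotion]` tag parsing but does not show either implementation. The sketch below illustrates the idea in Python (the sidecars' language) rather than the TypeScript used by chobit-core. Every name in it (`sentence_stream`, `split_emotion`, both regexes) is hypothetical, and the sentence-boundary rule and tag format are simplifying assumptions, not the project's actual behavior.

```python
import re
from typing import Iterable, Iterator, Optional, Tuple

# Assumed tag format: a lowercase word in square brackets, e.g. "[happy]".
EMOTION_TAG = re.compile(r"\[([a-z]+)\]")
# Assumed boundary rule: ., ! or ? followed by whitespace. Requiring the
# whitespace avoids cutting "3." before a later token delivers "14".
SENTENCE_END = re.compile(r"[.!?](?=\s)")


def split_emotion(sentence: str) -> Tuple[Optional[str], str]:
    """Return (last emotion tag or None, sentence text with tags removed)."""
    tags = EMOTION_TAG.findall(sentence)
    text = " ".join(EMOTION_TAG.sub("", sentence).split())
    return (tags[-1] if tags else None), text


def sentence_stream(tokens: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Buffer streamed tokens and yield each sentence as soon as it completes."""
    buffer = ""
    for token in tokens:
        buffer += token
        # One token may complete several sentences, so drain in a loop.
        while True:
            match = SENTENCE_END.search(buffer)
            if match is None:
                break
            sentence, buffer = buffer[: match.end()], buffer[match.end():]
            if sentence.strip():
                yield split_emotion(sentence)
    # Flush whatever remains once the token stream ends.
    if buffer.strip():
        yield split_emotion(buffer)


if __name__ == "__main__":
    for emotion, text in sentence_stream(["[happy] Hi", " there", "! How", " are", " you", "?"]):
        print(emotion, text)
```

Because each sentence is yielded as soon as its terminator is confirmed, a consumer could hand the first sentence to TTS while later tokens are still arriving, which is the behavior the "Sentence-level streaming" design decision describes. A real implementation would also need to handle abbreviations such as "Dr.", which this rule splits incorrectly.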