# Claire — design
## Why Claire exists
`rclaude` enumerates and addresses live claude tmux sessions across hosts, sends keystrokes, and runs a Haiku-powered triage. What it does *not* do:
- Track work as **projects** and **tasks** rather than sessions and panes
- Bind a task to a specific session and remember the binding across restarts
- Roll up "what's the state of the fleet right now" into a single dashboard
- Persist a history of progress, decisions, and broadcasts
Claire is the project-management layer above rclaude.
## Domain model
| Concept | Identity | Notable fields |
|---------|----------|----------------|
| Project | uuid | name (unique), goal, owner, status (active / paused / done) |
| Task | uuid | project_id, title, description, status (todo / in_progress / blocked / done), priority (0-4) |
| Assignment | uuid | task_id, session_uuid, created_hlc, active flag |
| Session | uuid (claude session uuid) | host, cwd, tmux_name, last_seen_mtime, last_triage |
| Group | name | pattern (cwd substring / host / session-name) |
| Update | uuid | assignment_id, source (triage / message / pane-tail), payload, hlc |
All ids are stable UUIDs (uuid4) generated by Claire; the only externally-derived id is `Session.uuid`, which mirrors claude's own session uuid from `~/.claude/projects/<slug>/<uuid>.jsonl`.
## Event sourcing
The `events` table is append-only:
```sql
CREATE TABLE events (
    rowid      INTEGER PRIMARY KEY,
    uuid       TEXT NOT NULL UNIQUE,  -- event id (uuid4)
    hlc        TEXT NOT NULL,         -- 'wall_ms.counter.machine_id' for sortability
    machine_id TEXT NOT NULL,
    event_type TEXT NOT NULL,         -- e.g. 'project_created'
    payload    TEXT NOT NULL,         -- JSON
    created_at TEXT NOT NULL          -- wall-clock for humans only
);
CREATE INDEX events_hlc ON events(hlc);
```
Projections (`projects`, `tasks`, ...) are derived tables. `apply_event(conn, event)` updates them; `replay_events(conn)` rebuilds them from scratch (used for tests and recovery).
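A minimal sketch of how `apply_event` and `replay_events` could fit together, assuming a cut-down `projects` projection. The payload shape, the `project_status_changed` event type, and the `projects` columns are illustrative assumptions, not Claire's actual schema.

```python
import json
import sqlite3

# Events table as defined above, plus a hypothetical, reduced projection.
SCHEMA = """
CREATE TABLE events (
    rowid      INTEGER PRIMARY KEY,
    uuid       TEXT NOT NULL UNIQUE,
    hlc        TEXT NOT NULL,
    machine_id TEXT NOT NULL,
    event_type TEXT NOT NULL,
    payload    TEXT NOT NULL,
    created_at TEXT NOT NULL
);
CREATE TABLE projects (
    id     TEXT PRIMARY KEY,
    name   TEXT NOT NULL UNIQUE,
    status TEXT NOT NULL
);
"""


def apply_event(conn, event):
    """Fold a single event into the projection tables."""
    payload = json.loads(event["payload"])
    if event["event_type"] == "project_created":
        conn.execute(
            "INSERT OR REPLACE INTO projects (id, name, status) "
            "VALUES (?, ?, 'active')",
            (payload["id"], payload["name"]),
        )
    elif event["event_type"] == "project_status_changed":  # hypothetical type
        conn.execute(
            "UPDATE projects SET status = ? WHERE id = ?",
            (payload["status"], payload["id"]),
        )
    # Unknown event types are skipped in this sketch.


def replay_events(conn):
    """Drop projection rows and rebuild them from the log in HLC order."""
    conn.execute("DELETE FROM projects")
    rows = conn.execute(
        "SELECT event_type, payload FROM events ORDER BY hlc, rowid"
    ).fetchall()
    for row in rows:
        apply_event(conn, row)
    conn.commit()
```

The connection is expected to use `sqlite3.Row` as its `row_factory` so that events can be indexed by column name. Replaying in `hlc` order rather than insertion order is what lets events that arrive late from another machine land in the right place.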
### Why event-sourced?
1. **Future sync without rewriting state.** Push B adds `GET /api/sync/events?since=<hlc>` between peers. Conflict resolution is "merge events, replay projections": events are append-only and carry unique uuids, so a merge is a set union and the projections follow from a replay.
2. **History.** Every project / task / assignment change is auditable. `claire project show <id> --history` becomes a one-line query.
3. **HLC stability.** Every machine that sorts the same set of events by HLC gets the same order, so wall-clock skew between machines cannot make peers disagree about event order.
### HLC encoding
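`{wall_ms}.{counter:06d}.{machine_id}`, which sorts correctly as a string as long as `wall_ms` keeps a fixed width (13 digits for any date from late 2001 until 2286) and the counter stays below 1,000,000. Example: `1716253199000.000001.7f9a3c2b-1a4d-4e7f-9c2b-3d8a1e4f6c5b`.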
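The encoding can be sketched as below. This is a minimal illustration of the local tick rule only, assuming an injectable clock for testing; the `HLC` class and `encode_hlc` helper are hypothetical names, and merging a timestamp received from a peer is left out.

```python
import time


def encode_hlc(wall_ms, counter, machine_id):
    """Format an HLC as '{wall_ms}.{counter:06d}.{machine_id}'."""
    return f"{wall_ms}.{counter:06d}.{machine_id}"


class HLC:
    """Hybrid logical clock for locally generated events (sketch)."""

    def __init__(self, machine_id, now_ms=None):
        self.machine_id = machine_id
        # now_ms is injectable so tests can control the wall clock.
        self._now_ms = now_ms or (lambda: int(time.time() * 1000))
        self._wall_ms = 0
        self._counter = 0

    def tick(self):
        """Return a timestamp strictly greater than the previous one."""
        now = self._now_ms()
        if now > self._wall_ms:
            # Wall clock moved forward: adopt it and reset the counter.
            self._wall_ms, self._counter = now, 0
        else:
            # Same millisecond, or the clock went backwards: bump the counter.
            self._counter += 1
        return encode_hlc(self._wall_ms, self._counter, self.machine_id)
```

Because the counter only resets when the wall clock advances, two ticks within the same millisecond (or after a backwards clock step) still compare in issue order as plain strings.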
## CLI surface (Push A)
```
claire init First-run: create DB, generate machine_id.
claire project new <name> [--goal ...] [--owner ...]
claire project list [--status active|paused|done]
claire project show <name-or-id>
claire task add <project> <title> [--prio N] [--desc ...]
claire task list [--project <p>] [--status ...]
claire task show <task-id>
claire task done <task-id>
claire assign <task-id> <session-uuid|--group <g>>
claire status [--project <p> | --group <g>]
claire pull Refresh fleet view from rclaude.
claire broadcast <project|group> --yes -- <text>
claire web [--host 127.0.0.1] [--port 8765]
claire sync Deferred to Push B: errors with "deferred" in Push A.
```
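One way to lay out this surface is nested `argparse` subparsers. The sketch below covers only `init`, `project new`, `project list`, and `web`; the `build_parser` name and the `dest` names are assumptions, and treating `127.0.0.1` and `8765` as defaults is inferred from the usage line rather than confirmed.

```python
import argparse


def build_parser():
    """Build a parser for a subset of the claire CLI (illustrative only)."""
    parser = argparse.ArgumentParser(prog="claire")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("init", help="First-run: create DB, generate machine_id.")

    # 'claire project <action> ...' uses a second level of subcommands.
    project = sub.add_parser("project")
    project_sub = project.add_subparsers(dest="action", required=True)

    new = project_sub.add_parser("new")
    new.add_argument("name")
    new.add_argument("--goal")
    new.add_argument("--owner")

    listing = project_sub.add_parser("list")
    listing.add_argument("--status", choices=["active", "paused", "done"])

    web = sub.add_parser("web")
    web.add_argument("--host", default="127.0.0.1")  # assumed default
    web.add_argument("--port", type=int, default=8765)  # assumed default
    return parser
```

A call such as `build_parser().parse_args(["project", "new", "demo", "--goal", "ship"])` yields a namespace with `command="project"`, `action="new"`, and `name="demo"`, which a dispatcher can route to the matching handler.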
## Web (Push A)
FastAPI + Jinja2 + HTMX. Routes:
- `GET /` — dashboard (per-project task counts, per-session current task)
- `GET /projects` — list
- `GET /projects/{id}` — task table + assignments + recent updates
- `GET /sessions` — fleet view
- `GET /broadcast` — composer form
- `POST /broadcast` — invokes `rclaude send --yes`, emits `BroadcastSent`
5-second polling refresh via HTMX `hx-trigger="every 5s"`. No websockets.
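As a rough illustration of the polling pattern, the function below renders a fragment that asks HTMX to re-fetch itself every 5 seconds. The `/fragments/task-counts` URL, the element id, and `render_task_counts` are hypothetical; in Claire this markup would more likely live in a Jinja2 template served by a FastAPI route, which is omitted here to keep the sketch free of dependencies.

```python
from html import escape


def render_task_counts(counts):
    """Render a self-refreshing per-project task-count list.

    counts maps project name -> number of tasks.
    """
    items = "".join(
        f"<li>{escape(project)}: {int(n)} tasks</li>"
        for project, n in sorted(counts.items())
    )
    # With hx-swap="outerHTML" the response replaces the whole element,
    # so the hx-* attributes must be present in every response for the
    # polling to continue.
    return (
        '<ul id="task-counts" hx-get="/fragments/task-counts" '
        'hx-trigger="every 5s" hx-swap="outerHTML">'
        f"{items}</ul>"
    )
```

The server stays stateless between polls: each request recomputes the fragment from the projections, which is why no websocket or push channel is needed.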
## Push B additions
- `claire.sync`: `pull_from_peer(url, since_hlc)` + `push_to_peer(url, since_hlc)` via httpx
- Web routes `/api/sync/events` GET (with `?since=`) + POST
- `claire.toml` peer list activated
- Integration test: two in-process Claires sync state correctly
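The storage side of the sync can be sketched without any HTTP. The helpers below are hypothetical (`events_since` and `merge_events` are not names from Claire) and assume the `events` schema shown earlier; `pull_from_peer` and `push_to_peer` would wrap them with httpx calls, and the projections would be rebuilt after a merge.

```python
import sqlite3

EVENTS_SCHEMA = """
CREATE TABLE events (
    rowid      INTEGER PRIMARY KEY,
    uuid       TEXT NOT NULL UNIQUE,
    hlc        TEXT NOT NULL,
    machine_id TEXT NOT NULL,
    event_type TEXT NOT NULL,
    payload    TEXT NOT NULL,
    created_at TEXT NOT NULL
);
"""

COLUMNS = ("uuid", "hlc", "machine_id", "event_type", "payload", "created_at")


def events_since(conn, since_hlc):
    """Return events with hlc > since_hlc as dicts, oldest first."""
    rows = conn.execute(
        f"SELECT {', '.join(COLUMNS)} FROM events WHERE hlc > ? ORDER BY hlc",
        (since_hlc,),
    ).fetchall()
    return [dict(zip(COLUMNS, row)) for row in rows]


def merge_events(conn, incoming):
    """Insert events not already present (matched by uuid).

    Returns the number of newly stored events.
    """
    before = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    conn.executemany(
        "INSERT OR IGNORE INTO events "
        "(uuid, hlc, machine_id, event_type, payload, created_at) "
        "VALUES (:uuid, :hlc, :machine_id, :event_type, :payload, :created_at)",
        incoming,
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    return after - before
```

Because the `uuid` column is unique, merging the same batch twice is a no-op, so a peer can safely retry a pull after a failed request.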
## Ecosystem adjacencies
From `apricot:~/Code/@packages/MANIFEST.md` (184 TS + 35 Py packages). Notable
adjacent packages and Claire's relationship to each:
| Package | Relationship |
|---------|--------------|
| `@lilith/claude-continue` (TS) | Conceptual overlap with `rclaude` — a tmux wrapper for Claude with crash recovery. Claire sits **above** both; we don't reimplement what either does. |
| `@lilith/mcp-session-analyzer` (TS) | MCP server for ML-analyzing Claude transcripts. Possible alternative or augmentation to `_claude-triage` as the priority signal source. Worth evaluating before Push B. |
| `@lilith/mcp-task-persistence` (TS) | Already running in the harness — persists user prompts across Claude sessions. **Not the same domain as Claire** — that's session-level history; Claire is fleet-level project tracking. |
| `@lilith/service-discovery` + `@lilith/service-registry` (TS) | Push B: replace static `peers` TOML list with dynamic discovery. |
| `@lilith/distributed-lock` (TS) | Push B fallback if HLC last-write-wins proves insufficient for some sync scenario. |
| `@lilith/circuit-breaker` (TS) | Push B inter-Claire HTTP resilience. |
| `@lilith/crypt` (TS) | If we ever encrypt sync payloads or sensitive event bodies. |
These are all TypeScript; Claire being Python means we'd consume via HTTP (the
service-registry pattern) rather than direct imports. The boundary is fine —
Claire doesn't need anything from these in Push A.
## Trade-offs accepted
- **No ORM.** The schema is simple and canonical, and raw SQL keeps it visible. Cost: more boilerplate; benefit: zero magic, no migration framework to fight.
- **HTMX over SPA.** No build step, no frontend framework, server-rendered HTML. Cost: less interactivity; benefit: one author can understand the whole stack.
- **Polling over websockets.** Push A doesn't need <1s latency. Polling at 5s is fine for a fleet of <50 sessions.
- **No auth in Push A.** Bound to 127.0.0.1 by default. If you bind to 0.0.0.0, you are trusting the tailnet as the only access control.