- `data/mesh-hosts.json`: adds `"dx": {"hide_homelan": true}` (with a note). Data for apricot/pear/fennel/lan/services is fully preserved for recovery.
- `bin/mesh-hosts-render` + `bin/host-apply`: respect the flag by filtering to `.class=="cloud"` hosts only (yuzu, lime), emitting a dx-mode note in the headers, and filtering services too.
- When true: the generated `/etc/hosts` mesh block and the `~/.ssh/config` net-tools fleet block contain only DO/cloud hosts (homelan names such as `apricot.lan` and bare `fennel` are hidden). The dx-forges block (ctforge/mcforge) at the bottom is unaffected.
- `net sync` (and the direct renderers) now produce clean DO-only configs.
- README updated. To recover: set the flag to `false`, then run `net sync`.
Fulfills "hide the homelan config... now only use DO... may try to recover homelan so dont delete it".
# net-tools

Mesh/LAN tooling for the four-host **wg1 mesh** + home LAN, built around one
source of truth ([`data/mesh-hosts.json`](data/mesh-hosts.json)).

Components:
- **`bin/net`** — **the one command**: `status · whoami · doctor · issues ·
  sync · up · down · enroll phone · gui`. Imports the agent as a library, so
  every surface shares one implementation. The renderers (`host-apply`,
  `mesh-hosts-render`, `wg-dns-sync`, `fleet-status`) remain as internals/direct
  tools.
- **[`data/known-issues.json`](data/known-issues.json)** — the **triage
  registry**: features that are known-broken or intentionally parked. `net
  issues` lists them; `net doctor <host>` annotates each host with its parked
  features (`⚠ KNOWN-…`) so a triaged problem is never re-investigated from
  scratch. An optional per-issue `probe` (same shape as a host `identity`)
  lets `doctor` flag an issue as *maybe-resolved* when it starts passing.
- **`gui/`** — Mesh control, the for-dummies window (`net gui`): plain-language
  status per device; right-click for the power tools (copy address, ssh here,
  diagnose path, `.wg` address). Every menu item is a `net` verb.
- **`tray/`** — the macOS menu-bar fleet tray (tunnel control + live fleet view).
- **[`smart-lan-router/`](smart-lan-router/)** — the **fleet agent**: one
  service, identical on every node, roles derived from each node's entry in the
  source of truth. It pulls the repo (config + its own code), discovers hosts'
  current IPs by MAC, converges OS hostnames, switches the laptop's home/away
  route, and regenerates the local views. (The home gateway is a dumb Xfinity
  box with no API; all intelligence lives in the fleet.)

Everything that needs a host address, MAC, or identity probe derives from one
file: [`data/mesh-hosts.json`](data/mesh-hosts.json). Never hardcode a mesh IP,
MAC, or identity URL anywhere else — add it here and regenerate.

## The four hosts — fruit family encodes machine class

| Class | Canonical | Old alias | LAN | WG mesh | Public |
|-------|-----------|-----------|-----|---------|--------|
| GPU compute (stone fruit) | **apricot** | — | *DHCP, discovered* | `10.9.0.2` | — |
| CPU / storage (pome) | **pear** | `black` | `10.0.0.11` | `10.9.0.4` | — |
| laptop (vegetable) | **fennel** | `plum` | *roams* | `10.9.0.3` | — |
| cloud hub (citrus) | **yuzu** | `vps`,`quinn-vps` | — | `10.9.0.1` | `89.127.233.145` |
| phone (berry) | **strawberry** | `phone-quinn` | — | `10.9.0.5` | — |

LAN IPs are *live state*, not promises — agents discover them by MAC
(`data/lan-state.json`); the table's fixed entries are just today's DHCP truth.

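The MAC-based discovery described above could look roughly like the sketch below. It is not the agent's actual code: the helper names (`parse_ip_neigh`, `discover`), the MAC addresses, and the `10.0.0.57` address are made up for illustration. The only thing taken as given is the usual `ip neigh show` line shape (`<ip> dev <if> lladdr <mac> <state>`).

```python
import re

# Matches the common `ip neigh show` line: "<ipv4> dev <iface> lladdr <mac> <STATE>".
_NEIGH = re.compile(
    r"^(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+dev\s+\S+\s+lladdr\s+"
    r"(?P<mac>(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})\b"
)


def parse_ip_neigh(text):
    """Return {lowercase MAC: IPv4} for every neighbour that has a link-layer address."""
    seen = {}
    for line in text.splitlines():
        match = _NEIGH.match(line.strip())
        if match:
            seen[match.group("mac").lower()] = match.group("ip")
    return seen


def discover(declared, neigh_text):
    """Map host name -> current LAN IP for each declared MAC that is visible."""
    seen = parse_ip_neigh(neigh_text)
    return {
        host: seen[mac.lower()]
        for host, mac in declared.items()
        if mac.lower() in seen
    }


# Sample neighbour table (hypothetical MACs; the FAILED entry has no lladdr).
SAMPLE = """\
10.0.0.11 dev eth0 lladdr AA:BB:CC:00:00:11 REACHABLE
10.0.0.57 dev eth0 lladdr aa:bb:cc:00:00:02 STALE
10.0.0.99 dev eth0  FAILED
"""

declared = {"pear": "aa:bb:cc:00:00:11", "apricot": "aa:bb:cc:00:00:02"}
print(discover(declared, SAMPLE))
```

A host whose MAC is not currently visible is simply absent from the result, which leaves the caller free to keep the last known address rather than publish a guess.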
The rename is **alias-first**: the fruit name is canonical, the old name is a
permanent alias, every renderer emits **both** — `pear.wg` *and* `black.wg`
resolve, `ssh black` keeps working forever. OS hostnames are converged **by the
fleet itself**: `fleet.enforce_hostname: true` makes each agent rename its own
node (never run `hostnamectl`/`scutil` by hand). Old names are never retired —
the forge URL, NFS paths, and every `.git/config` keep resolving untouched.

Phones are hosts too — `class: phone` (berry family), `os: ios|android`. No
agent runs on them (`ssh_user: null` → no ssh stanza); they consume names via
the WireGuard app with `DNS=10.9.0.2`. Current: **strawberry** (alias
`phone-quinn`, ios, `10.9.0.5`). Enroll new ones with `wg-phone-add`, then add
the entry.

## Naming: one rule per suffix

- **bare `<host>`** and **`<host>.lan`** → the host's **current LAN IP**
  (discovered, tracks DHCP drift). Direct at home; when away the daemon routes
  the LAN `/24` through the tunnel, so the same name still works. This is the
  everyday handle: `ssh apricot`, `ping pear`.
- **`<host>.wg`** → mesh IP (`10.9.0.x`). The explicit tunnel path — use to force
  the mesh or to reach hosts with no LAN leg (`fennel.wg`, `yuzu.wg`; their bare
  names also point here).
- **service vhosts** (`quinn.apricot.lan`, `forge.black.lan`, …) → declared in
  `mesh-hosts.json` `services`, rendered at the hosting host's current IP.

(The old `*.local` scheme is **retired** — platform moved to real `.com` domains,
infra to `.lan`. net-tools carries no `.local` records.)

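As an illustration of the suffix rules, here is a minimal sketch that expands one host entry into `(ip, name)` records for both its canonical name and its aliases. The dictionary keys (`name`, `aliases`, `lan_ip`, `wg_ip`) and the function name are assumptions, not the real `mesh-hosts.json` schema. The sketch also assumes that a host with no LAN leg gets no `.lan` record, which the rules above do not state explicitly.

```python
def name_records(host):
    """Expand one host into (ip, name) pairs, canonical name first, then aliases."""
    names = [host["name"], *host.get("aliases", [])]
    lan_ip = host.get("lan_ip")      # None for hosts with no LAN leg
    wg_ip = host["wg_ip"]
    records = []
    for name in names:
        # Bare name: current LAN IP, or the mesh IP when there is no LAN leg.
        records.append((lan_ip or wg_ip, name))
        if lan_ip:
            records.append((lan_ip, f"{name}.lan"))
        # .wg always names the explicit tunnel path.
        records.append((wg_ip, f"{name}.wg"))
    return records


# Addresses and aliases taken from the host table; the key names are hypothetical.
pear = {"name": "pear", "aliases": ["black"], "lan_ip": "10.0.0.11", "wg_ip": "10.9.0.4"}
yuzu = {"name": "yuzu", "aliases": ["vps", "quinn-vps"], "lan_ip": None, "wg_ip": "10.9.0.1"}

for ip, name in name_records(pear) + name_records(yuzu):
    print(f"{ip}\t{name}")
```

Because aliases go through the same loop as the canonical name, `black.wg` and `pear.wg` always resolve to the same address, which is the alias-first behavior described earlier.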
## The program owns the names — never hand-edit

`/etc/hosts` fleet/service records and the fleet block in `~/.ssh/config` are
**generated**. Hand-edits go stale on the next DHCP drift and are overwritten on
the next sync. To change anything: edit `data/mesh-hosts.json` (or just wait —
IP changes are discovered automatically) and let the renderers run. On install,
`mesh-hosts-render` also **adopts** loose hand-maintained lines for any name it
manages (it removes them; its block supersedes them).

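The "generated block plus adoption" behavior can be pictured with a small, pure-function sketch. The marker strings and the `splice` name are invented for this example and are not the markers `mesh-hosts-render` really writes. Adoption is also simplified: any loose line that mentions a managed name is dropped whole.

```python
BEGIN = "# >>> net-tools fleet (generated) >>>"  # hypothetical marker
END = "# <<< net-tools fleet (generated) <<<"    # hypothetical marker


def splice(hosts_text, records):
    """Return hosts_text with the managed block rebuilt at the top.

    records is a list of (ip, name). The previous managed block is discarded,
    and loose lines naming a managed host are removed ("adopted").
    """
    managed = {name for _, name in records}
    kept, inside = [], False
    for line in hosts_text.splitlines():
        if line == BEGIN:
            inside = True
        elif line == END:
            inside = False
        elif not inside:
            fields = line.split("#", 1)[0].split()   # ignore trailing comments
            names = set(fields[1:])                   # fields[0] is the address
            if not (managed & names):
                kept.append(line)
    block = [BEGIN, *(f"{ip} {name}" for ip, name in records), END]
    return "\n".join(block + kept) + "\n"


before = "127.0.0.1 localhost\n10.0.0.9 pear  # stale hand edit\n"
records = [("10.0.0.11", "pear"), ("10.9.0.4", "pear.wg")]
after = splice(before, records)
print(after, end="")
```

Running `splice` on its own output with the same records returns the same text, which is the sense in which such a renderer is idempotent.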
## Tools

| Tool | Runs on | What it does |
|------|---------|--------------|
| `bin/host-apply` | **every host** | Renders *this device's* view of the fleet. Detects which host it is, then writes a managed ssh-config block (`~/.ssh/config`) with per-vantage `HostName`s: `public` > `.lan` (if this host reaches the LAN) > `.wg`. `--whoami`/`--ssh-print`/`--ssh-diff`/`--ssh-apply`. The hosts leg is `mesh-hosts-render`. |
| `smart-lan-router/smart-lan-router.py` | **every node** | The fleet agent (launchd/systemd). Roles, derived per node: **pull** — git pull as the repo owner, restart self on code change; **hostname** — converge OS hostname to the canonical name (`fleet.enforce_hostname`); **discover** — map declared MACs → current DHCP IPs via ARP/`ip neigh`, write `data/lan-state.json`; **route** (laptop only) — HOME (gateway MAC match) → LAN `/24` direct (~5ms), AWAY → via wg; **render** — regenerate both views on change. `--status` to inspect. Supersedes the per-host `/32` pinner, `wg-route-watchdog`, and `setup-lan-dns`. |
| `bin/fleet-status` | anywhere | Terminal dashboard: one row per agent node (location, route, repo HEAD, snapshot age, discovered IPs), read from each node's `data/agent-status.json` over the fleet ssh names. `STALE`/`no status` = that agent needs attention. |
| `bin/wg-dns-sync` | **apricot** | Renders `mesh-hosts.json` → `/etc/dnsmasq.d/wg-mesh.conf` (host `.wg` + `.lan` records on `10.9.0.2:53`, for wg clients with `DNS=10.9.0.2`). Idempotent; `--dry-run`. |
| `bin/mesh-hosts-render` | **every host** | Renders the fleet `/etc/hosts` block (bare/`.lan` at current IPs, `.wg`, service vhosts) and splices it at the top of `/etc/hosts`, adopting any loose lines it supersedes. Idempotent. `--print`/`--diff`/`--install`. |
| `bin/forge-dns-render` | **laptop/dev machines** | DX-only: renders cloud Forgejo shortcuts (mcforge, ctforge, ...) from `~/.vault/*_forge_creds` into a managed block at the bottom of `/etc/hosts`. Used by `net sync` and per-project `./run forge:dns`. Adopts loose entries. `--print`/`--diff`/`--install`. |
| `dx` config in `mesh-hosts.json` | data-driven | `"dx": { "hide_homelan": true }` hides homelan hosts (apricot/pear/fennel + LAN names/services) from all generated `/etc/hosts` + ssh fleet blocks (only cloud/DO remain). Full data preserved for recovery (set `false` + `net sync`). Used while DO-only. |
| `smart-lan-router/` | **fennel** | `com.lilith.smart-lan-router.plist` (launchd) + `install-agent.sh` (one installer: launchd or systemd) + `smart-lan-router.service.tmpl`. |
| [`tray/`](tray/) | **fennel** (menu bar) | The fleet tray (absorbed from the old `wireguard-vpn-tray` repo). Icon = tunnel state (green/yellow/red); menu = live fleet view from `data/agent-status.json`: agent freshness, HOME/AWAY + route, discovered host IPs, repo HEAD. Connect/disconnect actions. Install: `bash tray/install-tray.sh` (as the user, no sudo). |

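The `dx.hide_homelan` row amounts to a render-time filter that leaves the stored data alone. A minimal sketch of that idea follows, assuming a top-level `hosts` map and a `services` list whose items carry a `host` key. Both shapes, the `gpu` class value, and the `apply_dx` name are guesses for illustration; only `dx.hide_homelan` and `class == "cloud"` come from the notes above.

```python
import json


def apply_dx(config):
    """Return (hosts, services) as a renderer would see them; config is not modified."""
    hosts = config["hosts"]
    services = config.get("services", [])
    if config.get("dx", {}).get("hide_homelan"):
        # Keep cloud hosts only, then drop services whose hosting host is hidden.
        hosts = {name: h for name, h in hosts.items() if h.get("class") == "cloud"}
        services = [s for s in services if s["host"] in hosts]
    return hosts, services


# Hypothetical, heavily trimmed stand-in for data/mesh-hosts.json.
CONFIG = json.loads("""
{
  "dx": {"hide_homelan": true},
  "hosts": {
    "apricot": {"class": "gpu"},
    "yuzu": {"class": "cloud"}
  },
  "services": [{"name": "quinn.apricot.lan", "host": "apricot"}]
}
""")

hosts, services = apply_dx(CONFIG)
print(sorted(hosts), services)
```

Since the filter builds new collections instead of deleting entries, flipping the flag back to `false` and re-rendering brings every homelan record back.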
All tools locate `data/mesh-hosts.json` by resolving their own symlink chain and
walking up to the repo, so they work whether run from the repo or a PATH symlink.

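One way to implement that lookup, shown as a sketch and not as the tools' actual code: resolve the script path through any symlinks, then walk up the parent directories until the data file appears. The function name `find_data_file` and the temporary layout in the demo are illustrative.

```python
import os
import tempfile
from pathlib import Path


def find_data_file(script, rel="data/mesh-hosts.json"):
    """Resolve script through its symlink chain, then walk up until rel exists."""
    real = Path(script).resolve()          # follows every symlink in the path
    for parent in real.parents:
        candidate = parent / rel
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{rel} not found above {real}")


# Demo: a fake repo plus a PATH-style symlink that lives outside it.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp).resolve() / "net-tools"
    (repo / "bin").mkdir(parents=True)
    (repo / "data").mkdir()
    (repo / "bin" / "host-apply").write_text("#!/bin/sh\n")
    (repo / "data" / "mesh-hosts.json").write_text("{}")

    link_dir = Path(tmp) / "home-bin"
    link_dir.mkdir()
    link = link_dir / "host-apply"
    os.symlink(repo / "bin" / "host-apply", link)

    found = find_data_file(link)
    found_ok = found == repo / "data" / "mesh-hosts.json"

print(found_ok)
```

Resolving first matters: walking up from the symlink's own location (`home-bin/` here) would never reach the repo.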
## Install — same agent, every node

```sh
git clone ssh://git@forge.black.lan:2222/lilith/net-tools.git ~/net-tools
cd ~/net-tools
./install.sh                             # symlink bin/* into ~/bin or ~/.local/bin
sudo smart-lan-router/install-agent.sh   # ONE service: launchd on darwin, systemd on linux
```

The agent self-derives its roles from this node's `mesh-hosts.json` entry —
nothing platform- or host-specific to configure:

| Node | Platform | Roles (derived) |
|------|----------|-----------------|
| fennel | osx (launchd) | pull · hostname · discover · render · **route** (laptop) |
| apricot | bluefin (systemd) | pull · hostname · discover · render |
| pear | ubuntu-family (systemd) | pull · hostname · discover · render |
| yuzu | debian (systemd) | pull · hostname · render (no LAN leg) |
| strawberry (ios) + future android | — | **no agent possible** — WireGuard app with `DNS=10.9.0.2`; names served by apricot's mesh dnsmasq (`wg-dns-sync`) |
| windows | — | non-goal until a Windows node exists (would need a hosts/ssh/route port) |

Every node renders its own vantage: LAN-capable nodes get bare names + services
at current LAN IPs; mesh-only nodes (yuzu) get them at wg IPs. The `pull` role
re-fetches this repo (as the repo owner, never root) and restarts the agent when
its own code changes — fleet updates propagate by pushing to the forge.

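Per-vantage rendering can be illustrated with the ssh side, where the stated preference is `public` > `.lan` (if this host reaches the LAN) > `.wg`. The sketch below is an assumption-laden illustration: the keys `public`, `has_lan`, `ssh_user`, and `aliases`, the user name `quinn`, and both function names are hypothetical, and the real `host-apply` output may differ.

```python
def pick_hostname(target, vantage_on_lan):
    """Choose the HostName for target as seen from the rendering node."""
    if target.get("public"):
        return target["public"]                      # public address wins
    if vantage_on_lan and target.get("has_lan"):
        return f"{target['name']}.lan"               # LAN path when reachable
    return f"{target['name']}.wg"                    # otherwise the mesh


def ssh_stanza(target, vantage_on_lan):
    """Render one ssh-config stanza, or None for hosts without an ssh user."""
    if target.get("ssh_user") is None:
        return None                                  # e.g. phones: no stanza
    names = " ".join([target["name"], *target.get("aliases", [])])
    return (
        f"Host {names}\n"
        f"    HostName {pick_hostname(target, vantage_on_lan)}\n"
        f"    User {target['ssh_user']}\n"
    )


# Hypothetical entries; addresses and aliases follow the host table.
pear = {"name": "pear", "aliases": ["black"], "has_lan": True, "ssh_user": "quinn"}
yuzu = {"name": "yuzu", "aliases": ["vps"], "public": "89.127.233.145", "ssh_user": "quinn"}
strawberry = {"name": "strawberry", "ssh_user": None}

print(ssh_stanza(pear, vantage_on_lan=True), end="")
print(ssh_stanza(pear, vantage_on_lan=False), end="")
```

The same target therefore renders as `pear.lan` on a LAN-capable node and as `pear.wg` on a mesh-only node such as yuzu, while `ssh pear` stays the same command everywhere.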
## Changing things

| Want to… | Do |
|----------|----|
| add/rename a host, change a MAC, add a service vhost or phone | edit [`data/mesh-hosts.json`](data/mesh-hosts.json), let autocommit push — **every agent pulls, restarts on code change, and converges (incl. its OS hostname) within minutes** |
| react to a host changing DHCP IP | nothing — agents discover it by MAC and regenerate `/etc/hosts` + ssh automatically |
| rename a node's OS hostname | nothing by hand — `fleet.enforce_hostname` makes the node's own agent do it |
| force a regen now | `net sync` (mesh-hosts + forge-dns + ssh) or the individual `sudo ... --install`. Toggle `dx.hide_homelan` in `data/mesh-hosts.json`, then sync to hide/show homelan. |
| apricot mesh DNS (phones) | `sudo wg-dns-sync` on apricot |
| enroll a phone | `wg-phone-add -d <device>` then add a `class: phone` entry |

Never hand-edit `/etc/dnsmasq.d/wg-mesh.conf`, the managed `/etc/hosts` records,
or the fleet block in `~/.ssh/config` — all generated, all overwritten.

## Status

Consolidates previously-scattered tooling (the `session-tools` generators, the
`magic-civilization/scripts/lan` resolver scripts, and the loose `~/bin/smart-lan-router.py`
daemon) into one repo. Pending gated cutovers (apricot DNS, the fleet rename,
retiring originals) are in [`docs/topology.md`](docs/topology.md#migration).