diff --git a/PLAN.md b/PLAN.md new file mode 100644 index 0000000..874ec5d --- /dev/null +++ b/PLAN.md @@ -0,0 +1,283 @@ +# Implementation Plan: Live Status v4 +> Generated: 2026-03-07 | Agent: planner:proj035 | Status: DRAFT + +## 1. Goal + +Replace the broken agent-cooperative live-status system with a transparent infrastructure-level daemon that tails OpenClaw's JSONL transcript files in real-time and updates a Mattermost status box automatically — **zero agent cooperation required**. Sub-agents become visible. Spam is eliminated. Sessions never lose state. Works from gateway startup without any AGENTS.md instruction injection. + +## 2. Architecture + +``` +OpenClaw Gateway +├── Agent Sessions (main, coder-agent, sub-agents, hooks...) +│ └── writes {uuid}.jsonl as it works +│ +└── status-watcher daemon (per active session) + ├── Polls/watches {uuid}.jsonl (new line = new event) + ├── Parses tool calls, results, assistant text + ├── Maps tool names → human-readable labels + ├── Debounces Mattermost updates (500ms) + ├── Auto-creates status box in correct channel/thread + ├── Detects sub-agent spawns → nests sub-agent status + └── Auto-completes when agent stops writing (idle timeout) + +sessions.json (runtime registry) +├── session key → {sessionId, sessionFile, spawnedBy, spawnDepth, channel, ...} +└── used to: resolve JSONL file path, determine channel, link parent→child + +OpenClaw Hook (gateway:startup + command:new) +└── Spawns status-watcher for the right session + +Mattermost API (slack.solio.tech) +├── POST /api/v4/posts → create status box +├── PUT /api/v4/posts/{id} → update in-place (no edit time limit) +└── Multiple bot tokens per agent +``` + +### Key Design Decisions (from discovery) + +1. **Watch sessions.json, not just transcript files.** sessions.json is the authoritative registry that maps session keys (including sub-agents) to JSONL files. Monitor it to detect new sessions. + +2. 
**No new hook events needed.** We cannot use `session:start`/`session:end` hooks (they don't exist). Instead: use `gateway:startup` to begin watching all active sessions, and poll sessions.json for new sessions. + +3. **Sub-agent detection via `spawnedBy` field.** When sessions.json gets a new entry with `spawnedBy`, we know it's a sub-agent of the given parent session. Nest its status under the parent status box. + +4. **JSONL format is stable.** Version 3 format confirmed. Key events: + - `message` with role=`assistant` + content `toolCall` → tool being called + - `message` with role=`toolResult` → tool completed + - `message` with role=`assistant` + content `text` → agent thinking/responding + - `custom` with `customType: openclaw.cache-ttl` → turn boundary (good idle signal) + +5. **Mattermost post edit is unlimited.** `PostEditTimeLimit = -1`. We can update the status post indefinitely. No workaround needed. + +6. **Keep live-status.js as a thin orchestration layer.** Agents can still call it manually for special cases, but it's no longer the primary mechanism. + +## 3. Tech Stack + +| Layer | Technology | Version | Reason | +|-------|-----------|---------|--------| +| Watcher daemon | Node.js | 22.x (existing) | Already installed, fs.watch/setInterval available | +| File watching | fs.watch + fallback polling | built-in | fs.watch is unreliable on Linux; polling fallback needed | +| Mattermost API | https (built-in) | - | Already used in live-status.js | +| Session registry | JSON file watch | - | sessions.json updated on every message | +| IPC (parent↔watcher) | PID file + signals | - | Simple, no deps | +| Hook integration | OpenClaw hooks system | existing | gateway:startup hook for auto-start | + +## 4. 
Project Structure + +``` +MATTERMOST_OPENCLAW_LIVESTATUS/ +├── src/ +│ ├── status-watcher.js CREATE Core transcript tail + parse + debounce +│ ├── session-monitor.js CREATE Watch sessions.json for new/ended sessions +│ ├── mattermost-client.js CREATE Mattermost HTTP API wrapper (rate-limited) +│ ├── tool-labels.json CREATE Tool name → human-readable label map +│ ├── status-formatter.js CREATE Format status box message (text + sub-agents) +│ ├── watcher-manager.js CREATE Start/stop watchers per session, PID tracking +│ ├── live-status.js MODIFY Add start-watcher/stop-watcher commands; keep create/update/complete +│ └── agent-accounts.json KEEP Agent ID → bot account mapping +│ +├── hooks/ +│ └── status-watcher-hook/ +│ ├── HOOK.md CREATE Hook metadata (events: gateway:startup, command:new) +│ └── handler.js CREATE Spawns watcher-manager on gateway start +│ +├── skill/ +│ └── SKILL.md REWRITE Remove verbose manual protocol; just note status is automatic +│ +├── deploy-to-agents.sh REWRITE Installs hook instead of AGENTS.md injection +├── install.sh REWRITE New install flow: npm install + hook enable +├── README.md REWRITE Full v4 documentation +├── package.json MODIFY Add start/stop/status npm scripts +└── Makefile MODIFY Add check/test/lint/fmt targets +``` + +## 5. Dependencies + +| Package | Version | Purpose | New/Existing | +|---------|---------|---------|-------------| +| node.js | 22.x | Runtime | Existing (system) | +| (none) | - | All built-in: https, fs, path, child_process | - | + +No new npm dependencies. Everything uses Node.js built-ins to keep install footprint at zero. + +## 6. 
Data Model + +### sessions.json entry (relevant fields) +```json +{ + "agent:main:subagent:uuid": { + "sessionId": "50dc13ad-...", + "sessionFile": "50dc13ad-....jsonl", + "spawnedBy": "agent:main:main", + "spawnDepth": 1, + "label": "proj035-planner", + "channel": "mattermost", + "groupChannel": "#channelId__botUserId" + } +} +``` + +### JSONL event schema (parsed by watcher) +``` +type=session → session UUID, cwd (first line only) +type=message → role=user|assistant|toolResult; content[]=text|toolCall|toolResult +type=custom → customType=openclaw.cache-ttl (turn boundary marker) +``` + +### Watcher state per session +```json +{ + "sessionKey": "agent:main:subagent:uuid", + "sessionFile": "/path/to/uuid.jsonl", + "bytesRead": 1024, + "statusPostId": "abc123def456...", + "channelId": "yy8agcha...", + "rootPostId": null, + "lastActivity": 1772897576000, + "subAgentWatchers": ["child-session-key"], + "statusLines": ["[15:21] Reading file... done", ...], + "parentStatusPostId": null +} +``` + +### Status box format +``` +Agent: main — PROJ-035 Plan +[15:21:22] Reading transcript format... +[15:21:25] exec: ls /agents/sessions done (0.8s) +[15:21:28] Writing implementation plan... + Sub-agent: proj035-planner + [15:21:42] Reading protocol... + [15:21:55] Analyzing JSONL format... + [15:22:10] Complete (28s) +[15:22:15] Plan ready. Awaiting approval. +Runtime: 53s +``` + +## 7. 
Task Checklist + +### Phase 0: Repo Sync + Setup ⏱️ 10min +> Parallelizable: no | Dependencies: none +- [ ] 0.1: Sync workspace live-status.js to remote repo (git push) → remote matches workspace +- [ ] 0.2: Verify Makefile has check/test/lint/fmt targets (or add them) → make check passes +- [ ] 0.3: Create `src/tool-labels.json` with initial tool→label mapping → file exists +- [ ] 0.4: Verify `src/agent-accounts.json` (already exists; do not recreate) → agent→account mapping confirmed + +### Phase 1: Core Watcher ⏱️ 2-3h +> Parallelizable: no | Dependencies: Phase 0 +- [ ] 1.1: Create `src/mattermost-client.js` — HTTP wrapper with rate limiting (max 2 req/s), retry on 429, create/update/delete post methods → tested with curl +- [ ] 1.2: Create `src/status-formatter.js` — formats status box lines from events, sub-agent nesting, timestamps → unit testable pure function +- [ ] 1.3: Create `src/status-watcher.js` — core daemon: + - Accepts: sessionKey, sessionFile, channelId, rootPostId (optional), statusPostId (optional) + - Reads JSONL file from current byte offset + - On new lines: parse event type, extract human-readable status + - Debounce 500ms before Mattermost update + - Idle timeout: 30s after last new line → mark complete + - Emits events: status-update, session-complete + - Returns: statusPostId (created on first event) +- [ ] 1.4: Extend `src/tool-labels.json` to cover all known tools → exec, read, write, edit, web_search, web_fetch, message, subagents, nodes, browser, image, camofox_*, claude_code_* +- [ ] 1.5: Manual test — start watcher against a real session file, verify Mattermost post appears → post created and updated + +### Phase 2: Session Monitor ⏱️ 1-2h +> Parallelizable: no | Dependencies: Phase 1 +- [ ] 2.1: Create `src/session-monitor.js` — watches sessions.json for changes: + - Polls every 2s (fs.watch unreliable on Linux for JSON files) + - Diffs previous vs current sessions.json + - On new session: emit `session-added` with session details + - On removed session: 
emit `session-removed` + - Resolves channel/thread from session key format +- [ ] 2.2: Create `src/watcher-manager.js` — coordinates monitor + watchers: + - On session-added: resolve channel (from session key), start status-watcher + - Tracks active watchers in memory (Map: sessionKey → watcher) + - On session-removed or watcher-complete: clean up + - Handles sub-agents: on `spawnedBy` session added, nest under parent watcher + - PID file at `/tmp/openclaw-status-watcher.pid` for single-instance enforcement +- [ ] 2.3: Entry point `src/watcher-manager.js` CLI: `node watcher-manager.js start|stop|status` → process management +- [ ] 2.4: End-to-end test — run manager in foreground, trigger agent session, verify status box appears → automated smoke test + +### Phase 3: Channel Resolution ⏱️ 1h +> Parallelizable: no | Dependencies: Phase 2 +- [ ] 3.1: Implement channel resolver — given a session key like `agent:main:mattermost:channel:abc123`, extract the Mattermost channel ID → function with unit test +- [ ] 3.2: Handle thread sessions — `agent:main:mattermost:channel:abc123:thread:def456` → channel=abc123, rootPost=def456 +- [ ] 3.3: Fallback for non-Mattermost sessions (hook sessions, cron sessions) — use configured default channel → configurable in openclaw.json or env var +- [ ] 3.4: Sub-agent channel resolution — inherit parent session's channel + use parent status box as `rootPostId` → sub-agent status appears under parent + +### Phase 4: Hook Integration ⏱️ 1h +> Parallelizable: no | Dependencies: Phase 2, Phase 3 +- [ ] 4.1: Create `hooks/status-watcher-hook/HOOK.md` with `events: ["gateway:startup"]` → discovered by OpenClaw hooks system +- [ ] 4.2: Create `hooks/status-watcher-hook/handler.js` (plain JS) — on gateway:startup, spawn `watcher-manager.js start` as background child_process → watcher manager auto-starts with gateway. Note: OpenClaw hooks system discovers `handler.ts` first, then `handler.js` — both are supported natively via dynamic import. 
Plain .js is confirmed to work. +- [ ] 4.3: Add `hooks/status-watcher-hook/` to workspace hooks dir (`/home/node/.openclaw/workspace/hooks/`) via `deploy-to-agents.sh` → hook auto-discovered +- [ ] 4.4: Test: restart gateway → watcher-manager starts → verify PID file exists + +### Phase 5: Polish + Cleanup ⏱️ 1h +> Parallelizable: no | Dependencies: Phase 4 +- [ ] 5.1: Rewrite `skill/SKILL.md` — remove manual protocol; say "live status is automatic, no action needed" → 10-line skill file +- [ ] 5.2: Rewrite `deploy-to-agents.sh` — remove AGENTS.md injection; install hook into workspace hooks dir; restart gateway → one-command deploy +- [ ] 5.3: Update `install.sh` — npm install, deploy hook, optionally restart gateway +- [ ] 5.4: Update `src/live-status.js` — add `start-watcher` and `stop-watcher` commands for manual control; mark create/update/complete as deprecated but keep working +- [ ] 5.5: Handle session compaction — detect if JSONL file gets smaller (compaction rewrites) → reset byte offset and re-read from start +- [ ] 5.6: Write `README.md` — full v4 documentation with architecture diagram, install steps, config reference +- [ ] 5.7: Run `make check` to verify lint/format passes → clean CI + +### Phase 6: Remove v1 Injection from AGENTS.md ⏱️ 30min +> Parallelizable: no | Dependencies: Phase 5 (after watcher confirmed working) +- [ ] 6.1: Remove "📡 Live Status Protocol (MANDATORY)" section from main agent's AGENTS.md +- [ ] 6.2: Remove from all other agent AGENTS.md files (coder-agent, xen, global-calendar, etc.) +- [ ] 6.3: Confirm watcher is running before removing (safety check) → watcher PID file exists + +## 8. 
Testing Strategy + +| What | Type | How | Success Criteria | +|------|------|-----|-----------------| +| Mattermost client | Unit | Direct API call with test channel | Post created and updated | +| Status formatter | Unit | Input JSONL events → verify output strings | Correct labels, timestamps | +| Channel resolver | Unit | Test session key strings → verify channel/thread extracted | All formats parsed | +| JSONL parser | Unit | Sample events from real transcripts | All types handled | +| Session monitor | Integration | Write to sessions.json, verify events emitted | New session detected in <2s | +| Status watcher | Integration | Append to JSONL file, verify Mattermost post updates | Update within 1s of new line | +| Sub-agent nesting | Integration | Spawn real sub-agent, verify nested status | Sub-agent visible in parent box | +| Idle timeout | Integration | Stop writing to JSONL, verify complete after 30s | Status box marked done | +| Compaction | Integration | Truncate JSONL file, verify watcher recovers | No duplicate events, no crash | +| E2E | Manual smoke test | Real agent task in Mattermost, verify status box | Real-time updates visible | + +## 9. Risks & Mitigations + +| Risk | Impact | Mitigation | +|------|--------|-----------| +| fs.watch unreliable on Linux | High | Fall back to polling (setInterval 2s). 
fs.watch as optimization | +| sessions.json write race condition | Medium | Use atomic read (retry on parse error), debounce diff | +| Mattermost rate limit (10 req/s) | Medium | Debounce updates to 500ms; queue + batch; exponential backoff on 429 | +| Session compaction truncates JSONL | Medium | Compare file size on each poll; if smaller, reset offset | +| Multiple gateway restarts create duplicate watchers | Medium | PID file check + kill old process before spawning new | +| Sub-agent session key not stable across restarts | Low | Use sessionId (UUID) as key, not session key string | +| Watcher dies silently | Low | Cron health check or gateway boot-md restart | +| Non-Mattermost sessions (xen, hook) get status boxes | Low | Channel resolver returns null for non-MM sessions; skip gracefully | +| JSONL format change in future OpenClaw version | Medium | Abstract parser behind interface; version check on session record | + +## 10. Effort Estimate + +| Phase | Time | Can Parallelize? | Depends On | +|-------|------|-------------------|-----------| +| Phase 0: Repo Sync + Setup | 10min | No | — | +| Phase 1: Core Watcher | 2-3h | No | Phase 0 | +| Phase 2: Session Monitor | 1-2h | No | Phase 1 | +| Phase 3: Channel Resolution | 1h | No | Phase 2 | +| Phase 4: Hook Integration | 1h | No | Phase 2+3 | +| Phase 5: Polish + Cleanup | 1h | No | Phase 4 | +| Phase 6: Remove v1 Injection | 30min | No | Phase 5 (verified) | +| **Total** | **7-9h** | | | + +## 11. Open Questions + +- [ ] **Q1: Idle timeout threshold.** 30s is aggressive — exec commands can run for minutes. Should we use a smarter heuristic? E.g., detect `stopReason: "toolUse"` (agent is waiting for tool) vs `stopReason: "stop"` (agent is done). + **Default if unanswered:** Use `stopReason: "stop"` in the most recent assistant message as the idle signal, combined with 10s of no new lines. If the most recent `stopReason` is `"toolUse"`, reset the idle timer on every toolResult line. 
This is accurate and avoids false completions during long tool runs. + +- [ ] **Q2: Default channel for non-MM sessions.** Hook-triggered sessions (agent:main:hook:gitea:...) don't have a Mattermost channel. Should we (a) skip them, (b) post to a default monitoring channel, or (c) allow config per-session-type? + **Default if unanswered:** (a) Skip non-MM sessions. Hook and cron sessions are largely invisible today and not causing user pain. The priority is Mattermost interactive sessions. Non-MM support can be Phase 7. + +- [ ] **Q3: Status box per-session or per-request?** A single agent session may handle multiple sequential requests. Should each new user message create a new status box, or does one session = one status box? + **Default if unanswered:** One status box per user message (per-request). Each incoming user message starts a new status cycle. When agent sends final response (stopReason=stop + no tool calls), mark current box complete. On next user message, create a new box. This matches expected UX: one progress indicator per task. + +- [ ] **Q4: Compaction behavior.** When OpenClaw compacts a transcript (rewrites the JSONL), does it preserve the original file or create a new one? + **Default if unanswered:** Assume in-place truncation (most likely based on `compactionCount` field in sessions.json). Detect by checking if fileSize < bytesRead on each poll. If truncated, reset bytesRead to 0 and re-read from start (with deduplication via message IDs to avoid re-posting old events). 
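
The Q4 default (detect `fileSize < bytesRead`, reset to 0, deduplicate on re-read) could be sketched roughly as follows. This is a minimal illustration using only Node.js built-ins, not the planned `status-watcher.js`: the helper name `readNewLines`, the `seenIds` set, and the assumption that each event carries an `id` field are all hypothetical.

```javascript
const fs = require('fs');

// Hypothetical helper: return events appended since state.bytesRead.
// state = { bytesRead: number, seenIds: Set<string> }
function readNewLines(filePath, state) {
  const size = fs.statSync(filePath).size;
  if (size < state.bytesRead) {
    state.bytesRead = 0; // file shrank: assume in-place compaction, re-read from start
  }
  if (size === state.bytesRead) return [];

  const length = size - state.bytesRead;
  const buf = Buffer.alloc(length);
  const fd = fs.openSync(filePath, 'r');
  let bytes;
  try {
    bytes = fs.readSync(fd, buf, 0, length, state.bytesRead);
  } finally {
    fs.closeSync(fd);
  }

  const chunk = buf.subarray(0, bytes);
  // Consume only up to the last newline so a half-written line is retried later.
  const lastNewline = chunk.lastIndexOf(0x0a);
  if (lastNewline === -1) return [];
  state.bytesRead += lastNewline + 1;

  const events = [];
  for (const line of chunk.subarray(0, lastNewline).toString('utf8').split('\n')) {
    if (!line.trim()) continue;
    let event;
    try {
      event = JSON.parse(line);
    } catch {
      continue; // skip malformed lines rather than crash the watcher
    }
    // Assumption: events expose an `id` that is stable across compaction.
    if (event.id !== undefined) {
      if (state.seenIds.has(event.id)) continue;
      state.seenIds.add(event.id);
    }
    events.push(event);
  }
  return events;
}
```

One limitation of the size comparison: a compaction that leaves the file at or above the stored offset would not be noticed by this check, so the real watcher may need an additional signal (for example the `compactionCount` field mentioned above).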
diff --git a/STATE.json b/STATE.json new file mode 100644 index 0000000..838ce21 --- /dev/null +++ b/STATE.json @@ -0,0 +1,30 @@ +{ + "projectId": "PROJ-035", + "state": "PENDING_APPROVAL", + "planVersion": "v5-beta", + "phase": 0, + "totalPhases": 6, + "lastAgent": "planner:proj035:subagent:e8bb592a", + "lastUpdated": "2026-03-07T16:00:00Z", + "planPostedTo": "gitea", + "giteaRepo": "ROOH/MATTERMOST_OPENCLAW_LIVESTATUS", + "giteaIssueNumber": 3, + "discoveryIssues": [], + "completedDiscoveries": [], + "synthesisComplete": true, + "synthesisDoc": "discoveries/README.md", + "auditComplete": true, + "auditScore": "32/32", + "auditFindings": ["WARNING: gateway restart needed to activate hook — coordinate with Rooh"], + "simulationComplete": true, + "simulationVerdict": "READY", + "hasOpenQuestions": true, + "questionsAnswered": false, + "approvedBy": null, + "approvedAt": null, + "completedPhases": [], + "errors": [], + "maxConcurrentSubagents": 2, + "activeSubagents": 0, + "queuedTasks": [] +} diff --git a/discoveries/README.md b/discoveries/README.md new file mode 100644 index 0000000..fed75f1 --- /dev/null +++ b/discoveries/README.md @@ -0,0 +1,100 @@ +# Discovery Findings: Live Status v4 + +## Overview + +Planner sub-agent (proj035-planner) conducted inline discovery before drafting the plan. Key findings are documented here. + +## Discovery 1: JSONL Transcript Format + +**Confirmed format (JSONL, version 3):** + +Each line is a JSON object with `type` field: +- `session` — First line only. Contains `id` (UUID), `version: 3`, `cwd` +- `model_change` — `provider`, `modelId` change events +- `thinking_level_change` — thinking on/off +- `custom` — Subtypes: `model-snapshot`, `openclaw.cache-ttl` (turn boundary marker) +- `message` — Main event type. 
`role` = `user`, `assistant`, or `toolResult` + +Message content types: +- `{type: "text", text: "..."}` — plain text from any role +- `{type: "toolCall", id, name, arguments: {...}}` — tool invocations in assistant messages +- `{type: "thinking", thinking: "..."}` — internal reasoning (thinking mode) + +Assistant messages carry extra fields: `api`, `provider`, `model`, `usage`, `stopReason`, `timestamp` + +ToolResult messages carry: `toolCallId`, `toolName`, `isError`, `content: [{type, text}]` + +**Key signals for watcher:** +- `stopReason: "stop"` + no new lines → agent turn complete → idle +- `stopReason: "toolUse"` → agent waiting for tool results → NOT idle +- `custom.customType: "openclaw.cache-ttl"` → turn boundary marker + +## Discovery 2: Session Keying + +Session keys in sessions.json follow the pattern: `agent:{agentId}:{context}` + +Examples: +- `agent:main:main` — direct session +- `agent:main:mattermost:channel:{channelId}` — channel session +- `agent:main:mattermost:channel:{channelId}:thread:{threadId}` — thread session +- `agent:main:subagent:{uuid}` — SUB-AGENT SESSION (has `spawnedBy`, `spawnDepth`, `label`) +- `agent:main:hook:gitea:{repo}:issue:{n}` — hook-triggered session +- `agent:main:cron:{name}` — cron session + +Sub-agent entry fields relevant to watcher: +- `sessionId` — maps to `{sessionId}.jsonl` filename +- `spawnedBy` — parent session key (for nesting) +- `spawnDepth` — nesting depth (1 = direct child of main) +- `label` — human-readable name (e.g., "proj035-planner") +- `channel` — delivery channel (mattermost, etc.) 
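
The key patterns and the `spawnedBy` field listed above suggest a small resolver along these lines. It is a sketch under assumptions: the function names `resolveChannel` and `resolveWithParent` are hypothetical, the registry is treated as a plain object keyed by session key, and only channel/thread inheritance is shown (attaching to the parent's status post is left out).

```javascript
// Hypothetical resolver for keys shaped like
// agent:{agentId}:mattermost:channel:{channelId}[:thread:{threadId}].
// Returns null for keys that carry no Mattermost channel.
function resolveChannel(sessionKey) {
  const parts = sessionKey.split(':');
  if (parts[0] !== 'agent' || parts[2] !== 'mattermost' || parts[3] !== 'channel') {
    return null;
  }
  const channelId = parts[4];
  if (!channelId) return null;
  const rootPostId = parts[5] === 'thread' && parts[6] ? parts[6] : null;
  return { channelId, rootPostId };
}

// Sub-agent keys have no channel of their own, so follow `spawnedBy`
// through the registry until a Mattermost session is found.
// maxDepth is an arbitrary guard against cycles in the registry.
function resolveWithParent(sessionKey, sessions, maxDepth = 8) {
  let key = sessionKey;
  for (let i = 0; i <= maxDepth && key; i++) {
    const direct = resolveChannel(key);
    if (direct) return direct;
    key = sessions[key] ? sessions[key].spawnedBy : undefined;
  }
  return null;
}
```

A sub-agent spawned by `agent:main:main` resolves to `null` here, since that parent key has no channel segment; such sessions would need a fallback or be skipped.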
+ +Session files: `/home/node/.openclaw/agents/{agentId}/sessions/` +- `sessions.json` — registry (updated on every message) +- `{uuid}.jsonl` — transcript files +- `{uuid}-topic-{topicId}.jsonl` — topic-scoped transcripts + +## Discovery 3: OpenClaw Hook Events + +Available internal hook events (confirmed from source): +- `command:new`, `command:reset`, `command:stop` — user commands +- `command` — all commands +- `agent:bootstrap` — before workspace files injected +- `gateway:startup` — gateway startup (250ms after channels start) + +**NO session:start or session:end hooks exist.** The hooks system covers commands and gateway lifecycle only, NOT individual agent runs. + +Sub-agent lifecycle hooks (`subagent_spawned`, `subagent_ended`) are channel plugin hooks, not internal hooks — not directly usable from workspace hooks. + +**Hook handler files:** workspace hooks support `handler.ts` OR `handler.js` (both discovered automatically via `handlerCandidates` in workspace.ts). + +## Discovery 4: Mattermost API + +- `PostEditTimeLimit = -1` — unlimited edits on this server +- Bot token: `n73636eit7bg3rgmpsj693mwno` (default/main bot account) +- Multiple bot accounts available per agent (see openclaw.json `accounts`) +- API base: `https://slack.solio.tech/api/v4` +- Post update: `PUT /api/v4/posts/{id}` — no time limit, no count limit + +## Discovery 5: Current v1 Failure Modes + +- Agents call `live-status create/update/complete` manually +- `deploy-to-agents.sh` injects a verbose 200-word protocol into AGENTS.md +- Agents forget to call it (no enforcement mechanism) +- IDs get lost between tool calls (no persistent state) +- No sub-agent visibility (sub-agents have separate sessions) +- Thread sessions create separate OpenClaw sessions → IDs not shared +- Final response dumps multiple status updates (spam from forgotten updates) + +## Discovery 6: Repo State + +- Workspace copy: `/home/node/.openclaw/workspace/projects/openclaw-live-status/` + - `src/live-status.js` — 
283 lines, v2 CLI with --agent, --channel, --reply-to, create/update/complete/delete + - `deploy-to-agents.sh` — AGENTS.md injection approach + - `skill/SKILL.md` — manual usage instructions + - `src/agent-accounts.json` — agent→bot account mapping +- Remote repo (ROOH/MATTERMOST_OPENCLAW_LIVESTATUS): `src/live-status.js` is outdated (114 lines v1) +- Makefile with check/test/lint/fmt targets already exists in remote repo + +## Synthesis + +The transcript-tailing daemon approach is sound and the format is stable. The key implementation insight is: **watch sessions.json to discover new sessions, then watch each JSONL file for that session**. Sub-agents are automatically discoverable via `spawnedBy` fields. The hook system can auto-start the daemon on gateway startup via `gateway:startup` event. No new OpenClaw core changes are needed.
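
The "watch sessions.json to discover new sessions" step in the synthesis amounts to diffing two snapshots of the registry. A rough sketch, assuming each snapshot is the parsed sessions.json object keyed by session key; the function name `diffSessions` and the shape of the returned records are hypothetical:

```javascript
// Hypothetical diff of two sessions.json snapshots.
// Added sessions are sorted by spawnDepth so parents come before their
// sub-agents, which lets a manager create the parent status box first.
function diffSessions(previous, current) {
  const added = [];
  const removed = [];
  for (const [key, entry] of Object.entries(current)) {
    if (!(key in previous)) {
      added.push({
        sessionKey: key,
        sessionFile: entry.sessionFile,
        parentKey: entry.spawnedBy || null, // set only for sub-agent sessions
        depth: entry.spawnDepth || 0,
      });
    }
  }
  for (const key of Object.keys(previous)) {
    if (!(key in current)) removed.push(key);
  }
  added.sort((a, b) => a.depth - b.depth);
  return { added, removed };
}
```

A polling loop would call this every interval with the last good snapshot and the freshly parsed file, keeping the previous snapshot whenever the parse fails mid-write.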