35 Commits

Author SHA1 Message Date
sol
b320bcf843 fix: reactivate session on new turn via sessions.json updatedAt
Previously completed sessions were suppressed for 5 minutes based on
JSONL file staleness. With fast-idle (cache-ttl detection), sessions
complete in ~3s — but the gateway immediately appends the next user
message, keeping the file 'fresh'. This blocked reactivation entirely.

Fix: compare sessions.json updatedAt against the completion timestamp.
If the gateway updated the session AFTER we marked it complete, a new
turn has started — reactivate immediately.

Pure infrastructure: timestamp comparison between two on-disk files.
No AI model state or memory involved.
2026-03-09 15:31:35 +00:00
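
The timestamp comparison described in this commit can be sketched as a pure function. This is a minimal illustration, not the daemon's actual code: the function name `shouldReactivate` and both parameter names are hypothetical, and it assumes both timestamps are epoch milliseconds.

```javascript
// Hypothetical helper (name and parameters are assumptions, not from the repo).
// completedAtMs: when the daemon marked the session complete.
// updatedAtMs:   the session's updatedAt value read from sessions.json.
function shouldReactivate(completedAtMs, updatedAtMs) {
  // Missing or malformed timestamps: leave the session suppressed.
  if (!Number.isFinite(completedAtMs) || !Number.isFinite(updatedAtMs)) {
    return false;
  }
  // The gateway touched the session after completion, so a new turn has started.
  return updatedAtMs > completedAtMs;
}
```

Comparing against the completion timestamp rather than file freshness avoids the case described above, where the gateway's own append keeps the JSONL file looking fresh.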
sol
b7c5124081 fix: fast-idle session on openclaw.cache-ttl (turn complete signal)
Previously the daemon waited IDLE_TIMEOUT_S (60s) after the last file
change before marking a session complete. But the JSONL file is kept
open by the gateway indefinitely, so file inactivity was never reliable.

Fix: detect the 'openclaw.cache-ttl' custom record which the gateway
emits after every completed assistant turn. When pendingToolCalls == 0,
start a 3-second grace timer instead of the full 60s idle timeout.

Result: live status clears within ~3 seconds of the agent's final reply
instead of lingering for 60+ seconds (or indefinitely on active sessions).

Fixes: session stays 'active' long after work is done
2026-03-09 15:26:14 +00:00
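
A rough sketch of the fast-idle decision follows. The 60 s and 3 s values come from the commit message; everything else is assumed for illustration, in particular the record shape `{ type: 'custom', customType: 'openclaw.cache-ttl' }` and the names `FAST_IDLE_GRACE_S`, `isTurnCompleteRecord`, and `idleDelaySeconds`.

```javascript
const IDLE_TIMEOUT_S = 60; // full idle timeout named in the commit message
const FAST_IDLE_GRACE_S = 3; // grace period from the commit message; constant name is assumed

// ASSUMPTION: the gateway's custom record looks like
// { type: 'custom', customType: 'openclaw.cache-ttl' }. The real shape may differ.
function isTurnCompleteRecord(record) {
  return (
    record != null &&
    record.type === "custom" &&
    record.customType === "openclaw.cache-ttl"
  );
}

// Pick how long to wait before marking the session complete.
function idleDelaySeconds(record, pendingToolCalls) {
  if (isTurnCompleteRecord(record) && pendingToolCalls === 0) {
    return FAST_IDLE_GRACE_S;
  }
  return IDLE_TIMEOUT_S;
}
```

The grace period only applies when no tool calls are outstanding, so a turn that is still waiting on tool results keeps the longer timeout.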
sol
3e93e40109 docs: add v4.1.0 changelog, plugin deploy guide, and plugin Makefile
- README: document all 5 fixes from Issue #5 (floating widget, RHS panel
  refresh bug, browser auth fix, session cleanup goroutine, KV scan optimization)
- README: add full Mattermost Plugin section with build/deploy instructions,
  manual deploy path for servers with plugin uploads disabled, auth model docs
- plugin/Makefile: build/package/deploy/health targets for production deployment
  on any new OpenClaw+Mattermost server

Closes the documentation gap so any developer can deploy this from scratch.
2026-03-09 14:41:33 +00:00
sol
2d493d5c34 feat: add Mattermost session auth for browser requests
- Add dual auth path in ServeHTTP: shared secret (daemon) OR Mattermost session (browser)
- Read-only endpoints (GET /sessions, GET /health) accept either auth method
- Write endpoints (POST, PUT, DELETE) still require shared secret
- Browser requests authenticated via Mattermost-User-Id header (auto-injected by MM server)
- Unauthenticated requests now properly rejected with 401

Fixes: Issue #5 Phase 1 - RHS Panel auth fix
2026-03-09 14:19:39 +00:00
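
The dual auth rule above can be summarised as a small decision function. The plugin server itself is written in Go; this JavaScript sketch only models the rule, and the names `isAuthorized`, `hasValidSecret`, and `mattermostUserId` are hypothetical.

```javascript
// Read-only endpoints listed in the commit message.
const READ_ONLY_ENDPOINTS = new Set(["GET /sessions", "GET /health"]);

// Hypothetical model of the dual auth path (not the plugin's Go code).
// hasValidSecret:   the request carried the daemon's shared secret.
// mattermostUserId: value of the Mattermost-User-Id header, if any.
function isAuthorized({ method, path, hasValidSecret, mattermostUserId }) {
  // The shared secret is accepted on every endpoint.
  if (hasValidSecret) return true;
  // A Mattermost session is only enough for read-only endpoints.
  const readOnly = READ_ONLY_ENDPOINTS.has(`${method} ${path}`);
  return readOnly && Boolean(mattermostUserId);
}
```

A request that fails this check would be rejected, which per the commit message means a 401 for unauthenticated callers.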
sol
79d5e82803 feat: RHS panel initial fetch, floating widget, session cleanup (#5)
Phase 1: Fix RHS panel to fetch existing sessions on mount
- Add initial API fetch in useAllStatusUpdates() hook
- Allow GET /sessions endpoint without shared secret auth
- RHS panel now shows sessions after page refresh

Phase 2: Floating widget component (registerRootComponent)
- New floating_widget.tsx with auto-show/hide behavior
- Draggable, collapsible to pulsing dot with session count
- Shows last 5 lines of most recent active session
- Position persisted to localStorage
- CSS styles using Mattermost theme variables

Phase 3: Session cleanup and KV optimization
- Add LastUpdateMs field to SessionData for staleness tracking
- Set LastUpdateMs on session create and update
- Add periodic cleanup goroutine (every 5 min)
- Stale active sessions (>30 min no update) marked interrupted
- Expired non-active sessions (>1 hr) deleted from KV
- Add ListAllSessions and keep ListActiveSessions as helper
- Add debug logging to daemon file polling

Closes #5
2026-03-09 14:15:04 +00:00
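
The Phase 3 cleanup thresholds can be expressed as a classification step. The real goroutine lives in the Go plugin; this is a sketch of the rule only. The thresholds and the `active`/`interrupted` statuses come from the commit message, while the function name, the returned action strings, and the camel-cased `lastUpdateMs` field are assumptions.

```javascript
const STALE_ACTIVE_MS = 30 * 60 * 1000; // active sessions with no update for >30 min
const EXPIRED_MS = 60 * 60 * 1000; // non-active sessions older than 1 hr

// Hypothetical classifier: decide what one cleanup pass does with a session.
function cleanupAction(session, nowMs) {
  const ageMs = nowMs - session.lastUpdateMs;
  if (session.status === "active") {
    return ageMs > STALE_ACTIVE_MS ? "mark-interrupted" : "keep";
  }
  return ageMs > EXPIRED_MS ? "delete" : "keep";
}
```

Running this every 5 minutes, as the commit describes, means a stale active session is first marked interrupted and only deleted on a later pass once it has also exceeded the one-hour limit.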
sol
9ec52a418d feat: RHS panel + channel header button + public icon
- Switched from registerAppBarComponent (not in MM 11.4 build) to
  registerChannelHeaderButtonAction + registerRightHandSidebarComponent
- Added public/icon.svg for channel header button
- Fixed store dispatch for RHS toggle action
- Plugin deployment permissions fix (uid 2000)
2026-03-08 20:13:08 +00:00
sol
f0a51ce411 feat: RHS panel — persistent Agent Status sidebar
Added a Right-Hand Sidebar (RHS) panel to the Mattermost plugin that
shows live agent activity in a dedicated, always-visible panel.

- New RHSPanel component with SessionCard views per active session
- registerAppBarComponent adds 'Agent Status' icon to toolbar
- Subscribes to WebSocket updates via global listener
- Shows active sessions with live elapsed time, tool calls, token count
- Shows recent completed sessions below active ones
- Responsive CSS matching Mattermost design system

The RHS panel solves the scroll-out-of-view problem: the status
dashboard stays visible regardless of chat scroll position.
2026-03-08 19:55:44 +00:00
sol
c36a048dbb fix: stale sessions permanently ignored + CLI missing custom post type
Two bugs fixed:

1. Session monitor stale session bug: Sessions that were stale on first
   poll got added to _knownSessions but never re-checked, even after
   their transcript became active. Now stale sessions are tracked
   separately in _staleSessions and re-checked on every poll cycle.

2. CLI live-status tool: create/update commands were creating plain text
   posts without the custom_livestatus post type or plugin props. The
   Mattermost webapp plugin only renders posts with type=custom_livestatus.
   Now all CLI commands set the correct post type and livestatus props.
2026-03-08 12:00:13 +00:00
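
A minimal sketch of the first fix: stale sessions go into their own set and are re-evaluated on every poll instead of being marked as known. The class name `StaleTracker`, the `lastWriteMs` field, and the method signature are hypothetical; the 5-minute threshold is taken from the earlier staleness-filter commits.

```javascript
const STALE_AFTER_MS = 5 * 60 * 1000; // staleness threshold from the earlier commits

// Hypothetical stand-in for the monitor's _knownSessions / _staleSessions bookkeeping.
class StaleTracker {
  constructor() {
    this.known = new Set(); // sessions already being watched
    this.stale = new Set(); // sessions seen but skipped as stale
  }

  // sessions: [{ key, lastWriteMs }]. Returns keys that became active in this poll.
  poll(sessions, nowMs) {
    const activated = [];
    for (const { key, lastWriteMs } of sessions) {
      if (this.known.has(key)) continue;
      if (nowMs - lastWriteMs > STALE_AFTER_MS) {
        this.stale.add(key); // remember it, but do NOT mark it as known
        continue;
      }
      this.stale.delete(key);
      this.known.add(key);
      activated.push(key);
    }
    return activated;
  }
}
```

Because a stale key never enters `known`, a transcript that starts receiving writes again is picked up on the next poll cycle.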
sol
4d644e7a43 fix: prevent duplicate status boxes on session idle/reactivation cycle
- Added completedBoxes map to track idle sessions and their post IDs
- On session reactivation, reuse existing post instead of creating new one
- Fixed variable scoping bug (saved -> savedState) in session-added handler
- Root cause: idle -> forgetSession -> re-detect -> new post -> repeat

This was creating 10+ duplicate status boxes per session per hour.
2026-03-08 08:05:15 +00:00
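
The reuse logic can be sketched with a plain `Map`. Only the `completedBoxes` name comes from the commit; `onSessionIdle`, `postIdForSession`, and the injected `createPost` callback are hypothetical.

```javascript
// sessionKey -> postId of the status box left behind when the session went idle.
const completedBoxes = new Map();

// Called when a session goes idle: remember its post instead of forgetting it.
function onSessionIdle(sessionKey, postId) {
  completedBoxes.set(sessionKey, postId);
}

// Called when a session is (re)detected. createPost is an injected, hypothetical
// callback that creates a new status post and returns its ID.
function postIdForSession(sessionKey, createPost) {
  const existing = completedBoxes.get(sessionKey);
  if (existing !== undefined) {
    completedBoxes.delete(sessionKey); // the box is active again
    return existing; // reuse: no duplicate post
  }
  return createPost(sessionKey);
}
```

This breaks the loop named in the root cause: re-detection after idle now resolves to the existing post rather than creating another one.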
sol
09441b34c1 fix: persistent daemon startup, plugin integration, mobile fallback
- Hook handler now loads .env.daemon for proper config (plugin URL/secret, bot user ID)
- Hook logs to /tmp/status-watcher.log instead of /dev/null
- Added .env.daemon config file (.gitignored - contains tokens)
- Added start-daemon.sh convenience script
- Plugin mode: mobile fallback updates post message field with formatted markdown
- Fixed unbounded lines array in status-watcher (capped at 50)
- Added session marker to formatter output for restart recovery
- Go plugin: added updatePostMessageForMobile() for dual-render strategy
  (webapp gets custom React component, mobile gets markdown in message field)

Fixes: daemon silently dying, no plugin connection, mobile showing blank posts
2026-03-08 07:42:27 +00:00
sol
0d0e6e9d90 fix: resolve DM channel for agent:main:main sessions
The main agent session uses key 'agent:main:main' which doesn't
contain a channel ID. The session monitor now falls back to reading
deliveryContext/lastTo from sessions.json and resolves 'user:XXXX'
format via the Mattermost direct channel API.

Fixes: status watcher not tracking the main agent's active transcript
2026-03-07 22:35:40 +00:00
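
As an illustration of the fallback's first step, the `user:XXXX` value can be parsed before any API call is made. This sketch covers parsing only; the function name and return shape are hypothetical, and the subsequent direct-channel lookup is left out.

```javascript
// Hypothetical parser for the deliveryContext/lastTo value read from sessions.json.
// Returns { kind: 'user', userId } for 'user:XXXX' targets, otherwise null.
function parseDeliveryTarget(lastTo) {
  if (typeof lastTo !== "string") return null;
  const match = /^user:(.+)$/.exec(lastTo);
  return match ? { kind: "user", userId: match[1] } : null;
}
```

A `null` result would leave the session without a resolved channel, so the caller still needs its own default for that case.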
sol
7aebebf193 fix: plugin bot user + await plugin detection before session scan
- Add EnsureBotUser on plugin activate (fixes 'Unable to find user' error)
- Accept bot_user_id in create session request
- Await plugin health check before starting session monitor
  (prevents race where sessions detect before plugin flag is set)
- Plugin now creates custom_livestatus posts with proper bot user
2026-03-07 22:25:59 +00:00
sol
42755e73ad feat(phase6): docs, lint fixes, STATE.json update
- Fix lint errors in plugin-client.js (unused var, empty block)
- Update README with plugin architecture and env vars
- Update STATE.json to v4.1 IMPLEMENTATION_COMPLETE
- All 96 tests passing, 0 lint errors
2026-03-07 22:14:23 +00:00
sol
c724e57276 feat: Mattermost plugin + daemon integration (Phases 2-5)
Plugin (Go server + React webapp):
- Custom post type 'custom_livestatus' with terminal-style rendering
- WebSocket broadcasts for real-time updates (no PUT, no '(edited)')
- KV store for session persistence across reconnects
- Shared secret auth for daemon-to-plugin communication
- Auto-scroll terminal with user scroll override
- Collapsible sub-agent sections
- Theme-compatible CSS (light/dark)

Daemon integration:
- PluginClient for structured data push to plugin
- Auto-detection: GET /health on startup + periodic re-check
- Graceful fallback: if plugin unavailable, uses REST API (PUT)
- Per-session mode tracking: sessions created via plugin stay on plugin
- Mid-session fallback: if plugin update fails, auto-switch to REST

Plugin deployed and active on Mattermost v11.4.0.
2026-03-07 22:11:06 +00:00
sol
868574d939 fix: remove dead delete+recreate and pin code, add poll fallback test
Phase 1 cleanup:
- Remove deletePost() method (dead code, replaced by PUT in-place updates)
- Remove _postInfo Map tracking (no longer needed)
- Remove pin/unpin API calls from watcher-manager.js (incompatible with PUT updates)
- Add JSDoc note on (edited) label limitation in _flushUpdate()
- Add integration test: test/integration/poll-fallback.test.js
- Fix addSession() lastOffset===0 falsy bug (0 was treated as 'no offset')
- Fix pre-existing test failures: add lastOffset:0 where tests expect backlog reads
- Fix pre-existing session-monitor test: create stub transcript files
- Fix pre-existing status-formatter test: update indent check for blockquote format
- Format plugin/ files with Prettier (pre-existing formatting drift)
2026-03-07 20:31:32 +00:00
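
The `lastOffset===0` bug is a common JavaScript pitfall: `0` is falsy, so a truthiness check cannot tell "start at byte 0" apart from "no offset given". A minimal sketch of the distinction, with hypothetical function and parameter names:

```javascript
// Buggy form: 0 is falsy, so an explicit offset of 0 falls through to the file size.
function resolveStartOffsetBuggy(lastOffset, currentFileSize) {
  return lastOffset || currentFileSize;
}

// Fixed form: only null/undefined mean "no offset"; 0 means "read the backlog".
function resolveStartOffset(lastOffset, currentFileSize) {
  return lastOffset ?? currentFileSize;
}
```

With the fixed form, passing `lastOffset: 0` requests a backlog read from the start of the file, while omitting it starts from the current end of the file.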
sol
cc485f0009 Switch from code block to blockquote format
Code blocks collapse after ~4 lines in Mattermost, requiring click
to expand. Blockquotes (> prefix) never collapse and show all content
inline with a distinct left border.

- Tool calls: inline code formatting (backtick tool name)
- Thinking text: box drawing prefix for visual distinction
- Header: bold status + code agent name
- All lines visible without clicking to expand
2026-03-07 19:23:13 +00:00
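
A rough sketch of the blockquote layout described above, with a bold status and the agent name in inline code on the header line. The function name and input shape are assumptions; the real formatter also handles thinking text and nesting, which are omitted here.

```javascript
// Hypothetical formatter: every output line gets a '> ' blockquote prefix.
// status: e.g. 'Working'; agent: agent name; lines: already-formatted body lines.
function formatStatus({ status, agent, lines }) {
  const header = `**${status}** \`${agent}\``;
  return [header, ...lines].map((line) => `> ${line}`).join("\n");
}

// Tool calls use inline code for the tool name, followed by a short summary.
function formatToolCall(name, summary) {
  return `\`${name}\` ${summary}`;
}
```

For example, `formatStatus({ status: 'Working', agent: 'main', lines: [formatToolCall('exec', 'ls')] })` yields two blockquote lines that Mattermost renders inline without a collapse control.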
sol
d5989cfab8 Switch from delete+recreate to PUT in-place updates
- Removes flicker caused by delete+recreate pattern
- PUT updates modify post content in-place (smooth)
- Trade-off: Mattermost shows (edited) label, and PUT clears pin status
- Pin+PUT are incompatible in Mattermost API — every PUT clears is_pinned
- Fix pin API calls to use {} body instead of null
- Remove post-replaced event handler (no longer needed)
2026-03-07 19:19:44 +00:00
sol
b255283724 Wrap status output in code block for visual distinction
Status posts now render inside triple-backtick code blocks
so they look different from normal chat replies.
2026-03-07 19:13:40 +00:00
sol
bbafdaf2d8 fix: delete+recreate status post, file polling fallback
- StatusBox: delete+recreate instead of PUT to keep post at thread bottom
  (Mattermost clears pin on PUT and doesn't bump edited posts)
- StatusBox: extends EventEmitter, emits 'post-replaced' events
- StatusWatcher: 500ms file polling fallback (fs.watch unreliable on
  Docker bind mounts / overlay fs)
- WatcherManager: handles post-replaced events to update activeBoxes
- SessionMonitor: forgetSession() for idle session re-detection
2026-03-07 19:07:01 +00:00
sol
3a8532bb30 fix: re-detect sessions after idle cleanup
Added forgetSession() to SessionMonitor. When watcher marks a session
idle/done, it now clears the key from the monitor's known sessions map.
Next poll cycle re-detects the session if the transcript is still active,
creating a fresh status post.
2026-03-07 18:52:44 +00:00
sol
6d31d77567 fix: stream from current position, faster session detection (500ms)
- New sessions start from current file offset, not 0. Shows live
  thinking from the moment of detection, not a backlog dump.
- Session poll reduced from 2s to 500ms for faster pickup.
- Auto-pin with null body (MM pin API quirk).
2026-03-07 18:47:25 +00:00
sol
b5bde4ec20 fix: pin status posts, staleness filter, correct transcript parsing
- Auto-pin status posts on creation, unpin on session completion
- Skip stale sessions (>5min since last transcript write)
- Parse OpenClaw JSONL format (type:message with nested role/content)
- Handle timestamp-prefixed transcript filenames
2026-03-07 18:41:23 +00:00
sol
7c6c8a4432 fix: production deployment issues
1. session-monitor: handle timestamp-prefixed transcript filenames
   OpenClaw uses {ISO}_{sessionId}.jsonl — glob for *_{sessionId}.jsonl
   when direct path doesn't exist.

2. session-monitor: skip stale sessions (>5min since last transcript write)
   Prevents creating status boxes for every old session in sessions.json.

3. status-watcher: parse actual OpenClaw JSONL transcript format
   Records are {type:'message', message:{role,content:[{type,name,...}]}}
   not {type:'tool_call', name}. Now shows live tool calls with arguments
   and assistant thinking text.

4. handler.js: fix module.exports for OpenClaw hook loader
   Expects default export (function), not {handle: function}.

5. HOOK.md: add YAML frontmatter metadata for hook discovery.
2026-03-07 18:31:43 +00:00
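
Items 1 and 3 can be sketched as two small helpers. The record layout `{type:'message', message:{role, content:[{type, name, ...}]}}` and the `{ISO}_{sessionId}.jsonl` naming come from the commit message; the function names, and the choice to treat any content part with a string `name` as a tool call, are assumptions.

```javascript
// Does this transcript filename belong to the session? Accepts both the plain
// '<sessionId>.jsonl' form and the timestamp-prefixed '<ISO>_<sessionId>.jsonl' form.
function matchesSession(filename, sessionId) {
  return (
    filename === `${sessionId}.jsonl` ||
    filename.endsWith(`_${sessionId}.jsonl`)
  );
}

// Extract tool calls from one JSONL line.
// ASSUMPTION: a content part that carries a string `name` is a tool call.
function extractToolCalls(line) {
  let record;
  try {
    record = JSON.parse(line);
  } catch {
    return []; // malformed or partially written line
  }
  if (!record || record.type !== "message") return [];
  const content = record.message && record.message.content;
  if (!Array.isArray(content)) return [];
  return content
    .filter((part) => part && typeof part.name === "string")
    .map((part) => ({ type: part.type, name: part.name }));
}
```

Returning an empty list for unparseable lines keeps the watcher tolerant of a line that is still being written when it is read.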
sol
387998812c feat(phase6): v1 removal checklist + STATE.json completion
- docs/v1-removal-checklist.md: exact sections to remove from 6 AGENTS.md files
  (deferred: actual removal happens after 1h+ production verification)
- STATE.json: updated to IMPLEMENTATION_COMPLETE, phase 6, all test results,
  v1RemovalStatus: DOCUMENTED_PENDING_PRODUCTION_VERIFICATION
- make check: clean
2026-03-07 17:47:13 +00:00
sol
835faa0eab feat(phase5): polish + deployment
- skill/SKILL.md: rewritten to 9 lines — 'status is automatic'
- deploy-to-agents.sh: no AGENTS.md injection; deploys hook + npm install
- install.sh: clean install flow; prints required env vars
- deploy/status-watcher.service: systemd unit file
- deploy/Dockerfile: containerized deployment (node:22-alpine)
- src/live-status.js: deprecation warning + start-watcher/stop-watcher pass-through
- README.md: full docs (architecture, install, config, upgrade guide, troubleshooting)
- make check: 0 errors, 0 format issues
- npm test: 59 unit + 36 integration = 95 tests passing
2026-03-07 17:45:22 +00:00
sol
5bb36150c4 feat(phase4): add gateway:startup hook for auto-starting watcher daemon
- hooks/status-watcher-hook/HOOK.md — events: ["gateway:startup"], required env vars
- hooks/status-watcher-hook/handler.js — checks PID file, spawns watcher-manager.js detached
- Deployed hook to /home/node/.openclaw/workspace/hooks/status-watcher-hook/
- make check passes
2026-03-07 17:41:03 +00:00
sol
6df3278e91 feat: Phase 3 — sub-agent detection, nested status, cascade completion
Phase 3 (Sub-Agent Support):
- session-monitor.js: sub-agents always passed through (inherit parent channel)
- watcher-manager.js enhancements:
  - Pending sub-agent queue: child sessions that arrive before parent are queued
    and processed when parent is registered (no dropped sub-agents)
  - linkSubAgent(): extracted helper for clean parent-child linking
  - Cascade completion: parent stays active until all children complete
  - Sub-agents embedded in parent status post (no separate top-level post)
- status-formatter.js: recursive nested rendering at configurable depth

Integration tests - test/integration/sub-agent.test.js (9 tests):
  3.1 Sub-agent detection via spawnedBy (monitor level)
  3.2 Nested status rendering (depth indentation, multiple children, deep nesting)
  3.3 Cascade completion (pending tool call tracking across sessions)
  3.4 Sub-agent JSONL parsing (usage events, error tool results)

All 95 tests pass (59 unit + 36 integration). make check clean.
2026-03-07 17:36:11 +00:00
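
The pending queue and cascade completion can be modelled together in a few lines. This is a simplified sketch under assumed names (`SessionTree` and its methods); it tracks a single parent-child level, whereas the formatter above supports deeper nesting.

```javascript
// Hypothetical model of parent/child linking with a pending queue.
class SessionTree {
  constructor() {
    this.children = new Map(); // parentKey -> Set of child keys
    this.pending = new Map(); // parentKey -> child keys that arrived before the parent
    this.done = new Set(); // keys whose own work has finished
  }

  registerParent(parentKey) {
    if (!this.children.has(parentKey)) this.children.set(parentKey, new Set());
    // Drain any children that were queued while the parent was unknown.
    for (const child of this.pending.get(parentKey) ?? []) {
      this.children.get(parentKey).add(child);
    }
    this.pending.delete(parentKey);
  }

  registerChild(childKey, spawnedBy) {
    const siblings = this.children.get(spawnedBy);
    if (siblings) {
      siblings.add(childKey);
      return;
    }
    if (!this.pending.has(spawnedBy)) this.pending.set(spawnedBy, []);
    this.pending.get(spawnedBy).push(childKey); // queue instead of dropping
  }

  markDone(key) {
    this.done.add(key);
  }

  // Cascade completion: a parent is complete only when it and all children are done.
  isComplete(parentKey) {
    if (!this.done.has(parentKey)) return false;
    for (const child of this.children.get(parentKey) ?? []) {
      if (!this.done.has(child)) return false;
    }
    return true;
  }
}
```

Queuing early children rather than rejecting them is what prevents dropped sub-agents when session detection order is not guaranteed.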
sol
e3bd6c52dd feat: Phase 2 — session monitor, lifecycle, watcher manager
Phase 2 (Session Monitor + Lifecycle):
- src/session-monitor.js: polls sessions.json every 2s for new/ended sessions
  - Detects agents via transcriptDir subdirectory scan
  - Resolves channelId/rootPostId from session key format
  - Emits session-added/session-removed events
  - Handles multi-agent environments
  - Falls back to defaultChannel for non-MM sessions
- src/watcher-manager.js: top-level orchestrator
  - Starts session-monitor, status-watcher, health-server
  - Creates/updates Mattermost status posts on session events
  - Sub-agent linking: children embedded in parent status
  - Offset persistence (save/restore lastOffset on restart)
  - Post recovery on restart (search channel history for marker)
  - SIGTERM/SIGINT graceful shutdown: mark all boxes interrupted
  - CLI: node watcher-manager.js start|stop|status
  - MAX_ACTIVE_SESSIONS enforcement

Integration tests:
- test/integration/session-monitor.test.js: 14 tests
  - Session detection, removal, multi-agent, malformed JSON handling
- test/integration/status-watcher.test.js: 13 tests
  - JSONL parsing, tool_call/result pairs, idle detection, offset recovery

All 86 tests pass (59 unit + 27 integration). make check clean.
2026-03-07 17:32:28 +00:00
sol
43cfebee96 feat: Phase 0+1 — repo sync, pino, lint fixes, core components
Phase 0:
- Synced latest live-status.js from workspace (9928 bytes)
- Fixed 43 lint issues: empty catch blocks, console statements
- Added pino dependency
- Created src/tool-labels.json with all known tool mappings
- make check passes

Phase 1 (Core Components):
- src/config.js: env-var config with validation, throws on missing required vars
- src/logger.js: pino singleton with child loggers, level validation
- src/circuit-breaker.js: CLOSED/OPEN/HALF_OPEN state machine with callbacks
- src/tool-labels.js: exact/prefix/regex tool->label resolver with external override
- src/status-box.js: Mattermost post manager (keepAlive, throttle, retry, circuit breaker)
- src/status-formatter.js: pure SessionState->text formatter (nested, compact)
- src/health.js: HTTP health endpoint + metrics
- src/status-watcher.js: JSONL file watcher (inotify, compaction detection, idle detection)

Tests:
- test/unit/config.test.js: 7 tests
- test/unit/circuit-breaker.test.js: 12 tests
- test/unit/logger.test.js: 5 tests
- test/unit/status-formatter.test.js: 20 tests
- test/unit/tool-labels.test.js: 15 tests

All 59 unit tests pass. make check clean.
2026-03-07 17:26:53 +00:00
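
For readers unfamiliar with the CLOSED/OPEN/HALF_OPEN pattern mentioned for `src/circuit-breaker.js`, here is a generic sketch of such a state machine. It is not the repository's implementation: the option names, defaults, and method names are all assumptions, and the callbacks mentioned in the commit are omitted.

```javascript
// Generic circuit breaker sketch (option and method names are hypothetical).
class CircuitBreaker {
  constructor({ failureThreshold = 3, cooldownMs = 1000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, useful for tests
    this.state = "CLOSED";
    this.failures = 0;
    this.openedAt = 0;
  }

  // OPEN blocks requests until the cooldown elapses, then allows a probe (HALF_OPEN).
  canRequest() {
    if (this.state === "OPEN" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "HALF_OPEN";
    }
    return this.state !== "OPEN";
  }

  recordSuccess() {
    this.failures = 0;
    this.state = "CLOSED";
  }

  // A failed probe, or too many consecutive failures, (re)opens the circuit.
  recordFailure() {
    this.failures += 1;
    if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
      this.state = "OPEN";
      this.openedAt = this.now();
    }
  }
}
```

A caller would check `canRequest()` before each Mattermost API call and report the outcome with `recordSuccess()` or `recordFailure()`.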
sol
b3ec2c61db plan: production-grade PLAN.md v2 (revised architecture + audit + simulation)
2026-03-07 16:09:36 +00:00
sol
6ef50269b5 resolve: keep workspace versions of skill/SKILL.md and live-status.js
2026-03-07 15:41:59 +00:00
sol
fe81de308f plan: v4 implementation plan + discovery findings
PROJ-035 Live Status v4 - implementation plan created by planner subagent.

Discovery findings documented in discoveries/README.md covering:
- JSONL transcript format (confirmed v3 schema)
- Session keying patterns (subagent spawnedBy linking)
- Hook events available (gateway:startup confirmed)
- Mattermost API (no edit time limit)
- Current v1 failure modes

Audit: 32/32 PASS, Simulation: READY
2026-03-07 15:41:50 +00:00
sol
0480180b03 Merge pull request 'policies: add standard policy files' (#1) from policies/add-standard-files into master
2026-03-01 08:26:43 +01:00
sol
c0dea6c12a policies: add standard policy files, linting, formatting
- Add .editorconfig, .eslintrc.json, .prettierrc, .prettierignore, .dockerignore, .gitignore
- Add Makefile with lint, fmt, fmt-check, secret-scan, test (skip) targets
- Add package.json with eslint@^8.56.0, eslint-plugin-security, prettier
- Add tools/secret-scan.sh
- Fix unused variable (fs -> _fs)
- Auto-format with prettier
- make check passes clean (0 errors, 11 warnings)
2026-03-01 07:26:28 +00:00
sol
a0acc38fa6 Initial commit (Sanitized)
2026-02-23 17:14:33 +00:00