# Live Status v4.1

Real-time Mattermost progress updates for OpenClaw agent sessions.

Version 4 replaces the manual v1 live-status CLI with a transparent infrastructure daemon.
Agents no longer need to call `live-status`. The watcher auto-updates Mattermost as they work.

## What's New in v4.1

- **Floating widget**: PiP-style overlay using `registerRootComponent`. Auto-shows when a session starts, auto-hides 5 seconds after it completes. Draggable, collapsible, position persisted to `localStorage`. Solves the "status box buried in long threads" problem without touching post ordering.
- **RHS panel fix**: Panel now loads existing sessions on mount (previously empty after page refresh). Added a dual auth path so browser JS can fetch sessions without the daemon shared secret.
- **Session cleanup**: Orphaned sessions (daemon crash, etc.) now auto-expire: marked stale after 30 minutes of inactivity, deleted after 1 hour.
- **KV prefix filter**: `ListActiveSessions` now filters at the KV level instead of scanning all plugin keys.

## Architecture

Two rendering modes (auto-detected):

### Plugin Mode (preferred)
When the Mattermost plugin is installed, updates stream via WebSocket:
- Custom post type `custom_livestatus` with terminal-style React rendering
- Zero Mattermost post API calls during streaming (no "(edited)" label)
- Auto-scroll, collapsible sub-agents, theme-compatible

### REST API Fallback
When the plugin is unavailable, updates use the Mattermost REST API:
- Blockquote-formatted posts updated via PUT
- Shows "(edited)" label (Mattermost API limitation)

```
OpenClaw Gateway
  Agent Sessions
    -> writes {uuid}.jsonl as they run

status-watcher daemon (SINGLE PROCESS)
  -> fs.watch + polling fallback on transcript directory
  -> Multiplexes all active sessions
  -> Auto-detects plugin (GET /health every 60s)
  -> Plugin mode: POST/PUT/DELETE to plugin REST endpoint
     -> Plugin broadcasts via WebSocket to React component
  -> REST fallback: PUT to Mattermost post API
  -> Shared HTTP connection pool (keep-alive, maxSockets=4)
  -> Throttled updates (leading edge + trailing flush, 500ms)
  -> Circuit breaker for API failure resilience
  -> Graceful shutdown (SIGTERM -> mark all boxes "interrupted")
  -> Sub-agent nesting (child sessions under parent status box)

Mattermost Plugin (com.openclaw.livestatus)
  -> Go server: REST API + KV store + WebSocket broadcast
  -> React webapp: custom post type renderer
  -> Terminal-style UI with auto-scroll

gateway:startup hook
  -> hooks/status-watcher-hook/handler.js
  -> Checks PID file; spawns daemon if not running

Mattermost API (fallback)
  -> PUT /api/v4/posts/{id} (in-place edits, unlimited)
  -> Shared http.Agent (keepAlive, maxSockets=4)
  -> Circuit breaker: open after 5 failures, 30s cooldown
```

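The "leading edge + trailing flush" throttling mentioned in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code in `src/status-box.js`; `createThrottle` and `send` are hypothetical names, and the 500 ms default simply mirrors `THROTTLE_MS`.

```javascript
// Hypothetical sketch: createThrottle and send are illustrative names only.
function createThrottle(send, intervalMs = 500) {
  let lastSentAt = -Infinity; // time of the most recent delivery
  let pending = null;         // newest update waiting for the trailing flush
  let timer = null;

  // Deliver the newest pending update (if any) and cancel the scheduled flush.
  function flush() {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
    if (pending === null) return;
    const text = pending;
    pending = null;
    lastSentAt = Date.now();
    send(text);
  }

  function update(text) {
    const waitMs = lastSentAt + intervalMs - Date.now();
    if (waitMs <= 0 && timer === null) {
      lastSentAt = Date.now();
      send(text); // leading edge: deliver immediately
      return;
    }
    pending = text; // later updates overwrite earlier ones
    if (timer === null) timer = setTimeout(flush, Math.max(waitMs, 0));
  }

  return { update, flush };
}
```

A burst of updates therefore produces one immediate post and one trailing post carrying the latest text. Calling `flush()` directly is one way a shutdown handler could push out the final state without waiting for the timer.
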
## Install

### Prerequisites

- Node.js 22.x
- OpenClaw gateway running
- Mattermost bot token

### One-command install

```sh
cd /path/to/MATTERMOST_OPENCLAW_LIVESTATUS
bash install.sh
```

This installs npm dependencies and deploys the `gateway:startup` hook.
The daemon starts automatically on the next gateway restart.

### Manual start (without gateway restart)

Set required env vars, then:

```sh
node src/watcher-manager.js start
```

## Configuration

All config via environment variables. No hardcoded values.

### Required

| Variable         | Description                                           |
| ---------------- | ----------------------------------------------------- |
| `MM_TOKEN`       | Mattermost bot token                                  |
| `MM_URL`         | Mattermost base URL (e.g. `https://slack.solio.tech`) |
| `TRANSCRIPT_DIR` | Path to agent sessions directory                      |
| `SESSIONS_JSON`  | Path to sessions.json                                 |

### Optional

| Variable                     | Default                   | Description                                |
| ---------------------------- | ------------------------- | ------------------------------------------ |
| `THROTTLE_MS`                | `500`                     | Min interval between Mattermost updates    |
| `IDLE_TIMEOUT_S`             | `60`                      | Inactivity before marking session complete |
| `MAX_SESSION_DURATION_S`     | `1800`                    | Hard timeout per session (30 min)          |
| `MAX_STATUS_LINES`           | `15`                      | Max lines in status box (oldest dropped)   |
| `MAX_ACTIVE_SESSIONS`        | `20`                      | Concurrent status box limit                |
| `MAX_MESSAGE_CHARS`          | `15000`                   | Mattermost post truncation limit           |
| `HEALTH_PORT`                | `9090`                    | Health endpoint port (0 = disabled)        |
| `LOG_LEVEL`                  | `info`                    | Logging level (pino)                       |
| `PID_FILE`                   | `/tmp/status-watcher.pid` | PID file location                          |
| `CIRCUIT_BREAKER_THRESHOLD`  | `5`                       | Failures before circuit opens              |
| `CIRCUIT_BREAKER_COOLDOWN_S` | `30`                      | Cooldown before half-open probe            |
| `TOOL_LABELS_FILE`           | _(built-in)_              | External tool labels JSON override         |
| `DEFAULT_CHANNEL`            | _none_                    | Fallback channel for non-MM sessions       |
| `PLUGIN_URL`                 | _none_                    | Plugin endpoint URL (enables plugin mode)  |
| `PLUGIN_SECRET`              | _none_                    | Shared secret for plugin authentication    |
| `PLUGIN_ENABLED`             | `true`                    | Enable/disable plugin auto-detection       |

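As a rough illustration of how the tables above translate into startup validation, the sketch below reads the four required variables and a subset of the numeric options with their documented defaults. It is an assumption about shape only: `loadConfig` is a hypothetical function and not the actual `src/config.js`.

```javascript
// Hypothetical loader; names and structure are illustrative, not src/config.js.
const REQUIRED = ['MM_TOKEN', 'MM_URL', 'TRANSCRIPT_DIR', 'SESSIONS_JSON'];

// Numeric options and the defaults documented in the table above.
const INT_DEFAULTS = {
  THROTTLE_MS: 500,
  IDLE_TIMEOUT_S: 60,
  MAX_SESSION_DURATION_S: 1800,
  MAX_STATUS_LINES: 15,
  MAX_ACTIVE_SESSIONS: 20,
  HEALTH_PORT: 9090,
};

function loadConfig(env = process.env) {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(', ')}`);
  }

  const config = {};
  for (const name of REQUIRED) config[name] = env[name];

  for (const [name, fallback] of Object.entries(INT_DEFAULTS)) {
    const raw = env[name];
    // Unset or empty falls back to the default; anything else must parse cleanly.
    const value = raw === undefined || raw === '' ? fallback : Number(raw);
    if (!Number.isInteger(value) || value < 0) {
      throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
    }
    config[name] = value;
  }
  return config;
}
```

Failing fast on a missing or malformed variable keeps a misconfigured daemon from starting and silently posting nothing.
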
## Status Box Format

```
[ACTIVE] main | 38s
Reading live-status source code...
  exec: ls /agents/sessions [OK]
Analyzing agent configurations...
  exec: grep -r live-status [OK]
Writing new implementation...
  Sub-agent: proj035-planner
    Reading protocol...
    Analyzing JSONL format...
    [DONE] 28s
Plan ready. Awaiting approval.
[DONE] 53s | 12.4k tokens
```

## Daemon Management

```sh
# Start
node src/watcher-manager.js start

# Stop (graceful shutdown)
node src/watcher-manager.js stop

# Status
node src/watcher-manager.js status

# Pass-through via legacy CLI
live-status start-watcher
live-status stop-watcher

# Health check
curl http://localhost:9090/health
```

## Deployment Options

### Hook (default)

The `gateway:startup` hook in `hooks/status-watcher-hook/` auto-starts the daemon.
No configuration needed beyond deploying the hook.

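The hook's "check PID file, spawn if not running" step can be approximated as below. This is a sketch under assumptions, not the contents of `hooks/status-watcher-hook/handler.js`; `isDaemonRunning` is a hypothetical helper, and it relies on `process.kill(pid, 0)`, which tests whether a process exists without delivering a signal.

```javascript
const fs = require('node:fs');

// Hypothetical helper: returns true if the PID recorded in pidFile is alive.
function isDaemonRunning(pidFile) {
  let pid;
  try {
    pid = Number.parseInt(fs.readFileSync(pidFile, 'utf8').trim(), 10);
  } catch (err) {
    if (err.code === 'ENOENT') return false; // no PID file: not running
    throw err;
  }
  if (!Number.isInteger(pid) || pid <= 0) return false; // corrupt PID file

  try {
    process.kill(pid, 0); // signal 0 only checks that the process exists
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return err.code === 'EPERM';
  }
}
```

A caller would typically run `isDaemonRunning('/tmp/status-watcher.pid')` (the default `PID_FILE`) and only spawn `node src/watcher-manager.js start` when it returns `false`, which also covers a stale PID file left behind by a crash.
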
### systemd

```sh
# Copy service file
cp deploy/status-watcher.service /etc/systemd/system/

# Create env file
cat > /etc/status-watcher.env <<EOF
MM_TOKEN=your_token
MM_URL=https://slack.solio.tech
TRANSCRIPT_DIR=/home/node/.openclaw/agents/main/sessions
SESSIONS_JSON=/home/node/.openclaw/agents/main/sessions/sessions.json
EOF

systemctl enable --now status-watcher
```

### Docker

```sh
docker build -f deploy/Dockerfile -t status-watcher .
# Mount the sessions directory itself so that /sessions/sessions.json resolves
docker run -d \
  -e MM_TOKEN=your_token \
  -e MM_URL=https://slack.solio.tech \
  -e TRANSCRIPT_DIR=/sessions \
  -e SESSIONS_JSON=/sessions/sessions.json \
  -v /home/node/.openclaw/agents/main/sessions:/sessions:ro \
  -p 9090:9090 \
  status-watcher
```

## Changelog

### v4.1.0 (2026-03-09)

**New: Floating Widget** (`plugin/webapp/src/components/floating_widget.tsx`)
- Registers as a root component via `registerRootComponent`, so it is always visible and not tied to any post position
- Auto-shows when an agent session becomes active (WebSocket event), auto-hides 5 seconds after completion
- Draggable to any screen position; position persisted in `localStorage`
- Collapsible to a pulsing dot with session count badge
- Shows agent name, elapsed time, and last 5 status lines of the most active session
- Click to expand or jump to the thread

**Fix: RHS Panel blank after page refresh** (`plugin/webapp/src/components/rhs_panel.tsx`)
- Previously the panel was empty after a browser refresh because it only showed sessions received via WebSocket during the current page load
- Now fetches `GET /api/v1/sessions` on mount to pre-populate with existing active sessions
- WebSocket updates continue to keep it live after initial hydration

**Fix: Plugin auth for browser requests** (`plugin/server/api.go`)
- Previously all plugin API requests required the shared secret (daemon token)
- Browser-side `fetch()` calls from the webapp can't include the shared secret
- Added dual auth: `GET` endpoints now also accept Mattermost session auth (`Mattermost-User-Id` header, auto-injected by the Mattermost server)
- Write operations (`POST`/`PUT`/`DELETE`) still require the shared secret and remain daemon-only

**New: Session cleanup goroutine** (`plugin/server/plugin.go`, `plugin/server/store.go`)
- Added `LastUpdateMs` timestamp field to `SessionData`
- Cleanup goroutine runs every 5 minutes in `OnActivate`
- Sessions active >30 minutes with no update are marked `interrupted` (daemon likely crashed)
- Non-active sessions older than 1 hour are deleted from the KV store
- Prevents indefinite accumulation of orphaned sessions from daemon crashes

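The two cleanup thresholds above amount to a small decision rule. The real implementation is a Go goroutine in the plugin; the JavaScript below only illustrates the rule, with hypothetical field names (`status`, `lastUpdateMs`) and the assumption that "older than 1 hour" is measured from the last update.

```javascript
// Illustrative only: the plugin implements this in Go; names here are hypothetical.
const STALE_AFTER_MS = 30 * 60 * 1000;  // active sessions with no update for 30 min
const DELETE_AFTER_MS = 60 * 60 * 1000; // non-active sessions older than 1 hour

// Decide what one cleanup pass should do with a single session.
function cleanupAction(session, nowMs) {
  const idleMs = nowMs - session.lastUpdateMs;
  if (session.status === 'active') {
    return idleMs > STALE_AFTER_MS ? 'mark-interrupted' : 'keep';
  }
  return idleMs > DELETE_AFTER_MS ? 'delete' : 'keep';
}
```

Under this reading, an orphaned session is first marked `interrupted` and then removed by a later pass once it has been idle for more than an hour.
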
**Fix: KV store scan optimization** (`plugin/server/store.go`)
- `ListActiveSessions` now filters by `ls_session_` key prefix before deserializing
- Avoids scanning unrelated KV entries from other plugins

---

## Upgrade from v1

v1 required agents to call `live-status create/update/complete` manually.
AGENTS.md contained a large "Live Status Protocol (MANDATORY)" section.

### What changes

1. The daemon handles all updates, so no manual calls are needed.
2. AGENTS.md protocol section can be removed (see `docs/v1-removal-checklist.md`).
3. `skill/SKILL.md` is now 9 lines: "status is automatic".
4. `live-status` CLI still works for manual use but prints a deprecation notice.

### Migration steps

1. Run `bash install.sh` to deploy v4.
2. Restart the gateway (hook activates).
3. Verify the daemon is running: `curl localhost:9090/health`
4. After 1+ hours of verified operation, remove the v1 AGENTS.md sections
   (see `docs/v1-removal-checklist.md` for exact sections to remove).

## Mattermost Plugin

The plugin (`plugin/`) provides WebSocket-based live rendering: no "(edited)" labels, full terminal UI with theme support.

### Requirements

- Mattermost 7.0+
- Go 1.21+ (for building server binary)
- Node.js 18+ (for building React webapp)
- A bot token with System Admin or plugin management rights (for deployment)

### Build and Deploy

```sh
# Build server + webapp
cd plugin
make all

# Deploy to Mattermost (uploads + enables plugin)
MM_URL=https://your-mattermost.example.com \
MM_TOKEN=your_system_admin_token \
make deploy

# Verify plugin is healthy
PLUGIN_SECRET=your_shared_secret \
MM_URL=https://your-mattermost.example.com \
make health
```

### Plugin Configuration (Admin Console)

After deploying, configure in **System Console > Plugins > OpenClaw Live Status**:

| Setting | Description | Default |
|---------|-------------|---------|
| `SharedSecret` | Shared secret between plugin and daemon. Must match `PLUGIN_SECRET` in `.env.daemon`. | _(empty; set this)_ |
| `MaxActiveSessions` | Max simultaneous tracked sessions | 20 |
| `MaxStatusLines` | Max status lines per session | 30 |

### Manual Deploy (when plugin uploads are disabled)

If your Mattermost server has plugin uploads disabled (common in self-hosted setups), deploy directly to the host filesystem:

```sh
# Build the package (run from the repository root so the paths below resolve)
make -C plugin package
# Outputs: plugin/dist/com.openclaw.livestatus.tar.gz

# Extract to Mattermost plugins volume (adjust path to match your setup)
tar xzf plugin/dist/com.openclaw.livestatus.tar.gz \
  -C /opt/mattermost/volumes/app/mattermost/plugins/

# Restart or reload plugin via API
curl -X POST \
  -H "Authorization: Bearer $MM_TOKEN" \
  "$MM_URL/api/v4/plugins/com.openclaw.livestatus/enable"
```

### Plugin Auth Model

The plugin uses dual authentication:

- **Shared secret** (Bearer token in `Authorization` header): used by the daemon for all write operations (POST/PUT/DELETE sessions)
- **Mattermost session** (`Mattermost-User-Id` header, auto-injected by the Mattermost server): used by the browser webapp for read-only operations (GET sessions, GET health)

This means the RHS panel and floating widget can fetch existing sessions on page load without needing the shared secret in the frontend.

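The dual-auth rule can be summarized as a small decision function. The plugin itself implements this in Go (`plugin/server/api.go`); the JavaScript below is only a sketch of the rule, with hypothetical names (`authorize`, `secretMatches`) and lower-cased header keys as Node's HTTP server would present them.

```javascript
const crypto = require('node:crypto');

// Compare the Authorization header to "Bearer <secret>" in constant time.
function secretMatches(header, secret) {
  if (!secret || typeof header !== 'string') return false; // unset secret never matches
  const given = Buffer.from(header);
  const expected = Buffer.from(`Bearer ${secret}`);
  return given.length === expected.length && crypto.timingSafeEqual(given, expected);
}

// Returns 'daemon', 'user', or null (reject). Hypothetical sketch of the rule.
function authorize(method, headers, secret) {
  if (secretMatches(headers['authorization'], secret)) return 'daemon';
  // Session auth is accepted for read-only requests only.
  if (method === 'GET' && headers['mattermost-user-id']) return 'user';
  return null;
}
```

With this rule, a browser request carrying only `Mattermost-User-Id` can read sessions but cannot create, update, or delete them.
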
---

## Troubleshooting

**Daemon not starting:**

- Check PID file: `cat /tmp/status-watcher.pid`
- Check env vars: `MM_TOKEN`, `MM_URL`, `TRANSCRIPT_DIR`, `SESSIONS_JSON` must all be set
- Start manually and check logs: `node src/watcher-manager.js start`

**No status updates appearing:**

- Check health endpoint: `curl localhost:9090/health`
- Check circuit breaker state (shown in health response)
- Verify `MM_TOKEN` has permission to post in the target channel

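For context on what the circuit breaker state in the health response means, here is a minimal sketch of a breaker with the documented defaults (open after 5 failures, 30 s cooldown before a half-open probe). It is not the code in `src/circuit-breaker.js`; the class shape, method names, and the injectable `now` clock are assumptions for illustration.

```javascript
// Hypothetical sketch of a closed / open / half-open circuit breaker.
class CircuitBreaker {
  constructor({ threshold = 5, cooldownMs = 30000, now = Date.now } = {}) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
    this.failures = 0;    // consecutive failures while closed
    this.openedAt = null; // timestamp at which the circuit last opened
  }

  state() {
    if (this.openedAt === null) return 'closed';
    return this.now() - this.openedAt >= this.cooldownMs ? 'half-open' : 'open';
  }

  // Requests are blocked only while fully open; half-open lets a probe through.
  allowRequest() {
    return this.state() !== 'open';
  }

  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure() {
    this.failures += 1;
    // A failed half-open probe reopens immediately and restarts the cooldown.
    if (this.openedAt !== null || this.failures >= this.threshold) {
      this.openedAt = this.now();
    }
  }
}
```

If the health endpoint reports the breaker as open, updates are being skipped on purpose until the cooldown elapses, so the likely root cause is the upstream API errors that tripped it.
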
**Duplicate status boxes:**

- Multiple daemon instances may be running: check the PID file and kill extras
- `node src/watcher-manager.js status` shows if it's running

**Session compaction:**

- When JSONL is truncated, the watcher detects it (`stat.size < lastOffset`)
- Offset resets, status box shows `[session compacted - continuing]`
- No crash, no data loss

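The truncation check described above reduces to comparing the current file size with the last read offset. The helper below is a hypothetical sketch (`nextReadRange` is not a function in the codebase) showing how the offset might be reset when the file shrinks.

```javascript
// Hypothetical helper: decide which byte range of the transcript to read next.
function nextReadRange(fileSize, lastOffset) {
  if (fileSize < lastOffset) {
    // File shrank: it was compacted or truncated, so restart from the beginning.
    return { start: 0, end: fileSize, compacted: true };
  }
  // Normal append: read only the bytes added since the last read.
  return { start: lastOffset, end: fileSize, compacted: false };
}
```

A watcher could call this with `fs.statSync(path).size` on each change event, emit the `[session compacted - continuing]` line when `compacted` is true, and store `end` as the new offset.
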
## Development

```sh
# Run all tests
npm test

# Run unit tests only
npm run test-unit

# Run integration tests only
npm run test-integration

# Lint + format + test
make check
```

## Project Structure

```
src/
  watcher-manager.js      Entrypoint; PID file; graceful shutdown
  status-watcher.js       JSONL file watcher (fs.watch; inotify on Linux)
  session-monitor.js      sessions.json poller (2s interval)
  status-box.js           Mattermost post manager (throttle, circuit breaker)
  status-formatter.js     Status box text renderer
  circuit-breaker.js      Circuit breaker state machine
  config.js               Env var config with validation
  logger.js               pino wrapper
  health.js               HTTP health endpoint
  tool-labels.js          Tool name -> label resolver
  tool-labels.json        Built-in tool label defaults
  live-status.js          Legacy CLI (deprecated; backward compat)

hooks/
  status-watcher-hook/    gateway:startup hook (auto-start daemon)

deploy/
  status-watcher.service  systemd unit file
  Dockerfile              Container deployment

test/
  unit/                   Unit tests (59 tests)
  integration/            Integration tests (36 tests)
```