7 Commits

Author SHA1 Message Date
clawbot
c30a280d6e update Gitea poller docs to dispatcher architecture (closes #4)
Some checks failed
check / check (push) Failing after 11s
2026-02-28 06:30:25 -08:00
f3e48c6cd4 Merge pull request 'Expand sensitive output routing and make inbox references conditional' (#3) from fix/pii-and-conditional-email into main
All checks were successful
check / check (push) Successful in 9s
Reviewed-on: #3
2026-02-28 15:22:36 +01:00
clawbot
c0d345e767 expand PII routing to cover secrets, credentials, and operational info; make email/inbox references conditional
All checks were successful
check / check (push) Successful in 12s
- Rename 'PII Output Routing' → 'Sensitive Output Routing' throughout
- Expand scope to include secrets, credentials, API keys, flight numbers,
  locations, travel plans, medical info
- Replace hardcoded 'Emails' heartbeat check with conditional language
  ('Notifications — whatever inbox sources you've integrated')
- Remove 'email' from heartbeat-state.json example
- Update cross-references in SETUP_CHECKLIST.md
2026-02-28 03:40:13 -08:00
user
36223ca550 fix: agent should infer needed fields, not wait to be told
All checks were successful
check / check (push) Successful in 12s
2026-02-28 03:33:08 -08:00
user
f0a2a5eb62 docs: update Gitea notification section — webhook vs poller, flag-file approach
Some checks are pending
check / check (push) Waiting to run
- Replaced wake-event poller with flag-file approach (prevents DM spam)
- Added Option A (webhooks for VPS) vs Option B (poller for NAT)
- Documented the wake-event failure mode and why we switched
2026-02-28 03:30:49 -08:00
9631535583 Merge pull request 'Rewrite SETUP_CHECKLIST.md: replace checklists with paste-able agent prompts' (#1) from rewrite-setup-checklist-prompts into main
Some checks are pending
check / check (push) Waiting to run
2026-02-28 12:27:17 +01:00
user
b0495d5b56 rewrite SETUP_CHECKLIST.md: replace checklist items with paste-able agent prompts
All checks were successful
check / check (push) Successful in 13s
Each section now contains a self-contained prompt in a code block that
adopting users can paste directly to their agent. Prompts include full
URLs to raw reference docs. Fixes 'you provide' wording to 'your human
provides'. Keeps same phase/section structure.
2026-02-28 03:22:08 -08:00
2 changed files with 964 additions and 473 deletions


@@ -173,46 +173,91 @@ The landing checklist (triggered automatically after every flight) updates
location, timezone, nearest airport, and lodging in the daily context file. It
also checks if any cron jobs have hardcoded timezones that need updating.
### The Gitea Notification Poller
### Gitea Notification Delivery
OpenClaw has heartbeats, but those are periodic (every ~30min). For Gitea issues
and PRs, we wanted near-realtime response. The solution: a tiny Python script
that polls the Gitea notifications API every 2 seconds and wakes the agent via
OpenClaw's `/hooks/wake` endpoint when new notifications arrive.
There are two approaches for getting Gitea notifications to your agent,
depending on your network setup.
#### Option A: Direct Webhooks (VPS / Public Server)
If your OpenClaw instance runs on a VPS or other publicly reachable server, the
simplest approach is direct webhooks. Run Traefik (or any reverse proxy with
automatic TLS) on the same server and configure Gitea webhooks to POST directly
to OpenClaw's webhook endpoint. This is push-based and realtime — notifications
arrive instantly.
Setup: add a webhook on each Gitea repo (or use an organization-level webhook)
pointing to `https://your-openclaw-host/hooks/gitea`. OpenClaw handles the rest.
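As a rough sketch of the webhook setup above, the registration can also be scripted against Gitea's repository hooks API rather than done through the UI. The helper name `build_webhook_request`, the example host names, and the chosen event list are assumptions for illustration; only the `/hooks/gitea` target comes from the text. The function builds the request without sending it, so it can be inspected first.

```python
import json
import urllib.request


def build_webhook_request(gitea_url, token, repo_full, target_url):
    """Build (but do not send) a request that registers a repo webhook."""
    payload = {
        "type": "gitea",
        "active": True,
        # Assumed event selection; adjust to what your agent should react to.
        "events": ["issues", "issue_comment", "pull_request"],
        "config": {"url": target_url, "content_type": "json"},
    }
    return urllib.request.Request(
        f"{gitea_url.rstrip('/')}/api/v1/repos/{repo_full}/hooks",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; replace with your own instance, token, and repo.
    req = build_webhook_request(
        "https://gitea.example.com",
        "REPLACE_ME",
        "your-org/repo1",
        "https://your-openclaw-host/hooks/gitea",
    )
    print(req.get_method(), req.full_url)
```

Once the values are real, the request can be sent with `urllib.request.urlopen(req, timeout=15)`.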
#### Option B: Notification Poller + Dispatcher (Local Machine Behind NAT)
If your OpenClaw runs on a dedicated local machine behind NAT (like a home Mac
or Linux workstation), Gitea can't reach it directly. This is our setup —
OpenClaw runs on a Mac Studio on a home LAN.
The solution: a Python script that polls the Gitea notifications API, triages
each notification, and spawns an isolated agent session per actionable item.
Response time is ~15-30 seconds.
**Evolution note:** We originally used a flag-file approach (poller writes
flag → agent checks during heartbeat → ~30 min latency). This was replaced by
the dispatcher pattern below, which is near-realtime.
Key design decisions:
- **The poller never marks notifications as read.** That's the agent's job after
it processes them. This prevents the poller and agent from racing.
- **It tracks notification IDs, not counts.** This way it only fires on
genuinely new notifications, not re-reads of existing ones.
- **The wake message tells the agent to route output to Gitea/Mattermost, not to
DM.** This prevents chatty notification processing from disturbing the human.
- **Zero dependencies.** Just Python stdlib (`urllib`, `json`, `time`). Runs
anywhere.
Here's the full source:
- **The poller IS the dispatcher.** It fetches notification details, checks
whether the agent is mentioned or assigned, and spawns agents directly.
No middleman session needed.
- **One agent per actionable notification.** Each spawns via
`openclaw cron add --session isolated` with full context (API token, issue
URL, instructions) baked into the message. Parallel notifications get parallel
agents.
- **Marks notifications as read immediately.** Prevents re-processing. The
agent's job is to respond, not to manage notification state.
- **Tracks notification IDs, not counts.** Only fires on genuinely new
notifications, not re-reads of existing ones.
- **Triage before dispatch.** Not every notification is actionable. The poller
checks: is the agent @-mentioned (in issue body or latest comment)? Is the
issue/PR assigned to the agent? Is the agent's comment already the latest
(no response needed)?
- **Assignment scan as backup.** A secondary loop periodically scans watched
repos for open issues assigned to the agent that were recently updated but
have no agent response. This catches cases where notifications aren't
generated (API-created issues, self-assignment).
- **Strict scope enforcement.** Each spawned agent's prompt includes a SCOPE
constraint: "You are responsible for ONLY this issue. Do NOT touch any other
issues or PRs." This prevents rogue agents from creating unauthorized work.
- **Priority rule.** Agent prompts explicitly state that the user's instructions
in the issue override all boilerplate rules (e.g., if the user asks for a DM
response, the agent should DM).
- **Zero dependencies.** Just Python stdlib. Runs anywhere.
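To illustrate why the poller compares notification IDs rather than counts, here is a minimal sketch. The helper name `new_notification_ids` and the sample IDs are hypothetical; the set difference is the same idea the script uses.

```python
def new_notification_ids(seen_ids, current_ids):
    """Return IDs present now that were not present on the previous poll."""
    return set(current_ids) - set(seen_ids)


if __name__ == "__main__":
    seen = {101, 102}
    # 101 was marked read and 103 arrived: the unread count is still 2.
    current = {102, 103}
    print(len(seen) == len(current))            # True: a count check sees no change
    print(new_notification_ids(seen, current))  # {103}
```

A count comparison would miss the new item in this case, while the ID comparison reports exactly one new notification.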
```python
#!/usr/bin/env python3
"""
Gitea notification poller.
Polls for unread notifications and wakes OpenClaw when the count
changes. The AGENT marks notifications as read after processing —
the poller never marks anything as read.
Gitea notification poller + dispatcher.
Two polling loops:
1. Notification-based: detects new notifications (mentions, assignments)
and dispatches agents for actionable ones.
2. Assignment-based: periodically checks for open issues/PRs assigned to
the agent that have no recent response. Catches cases where
notifications aren't generated.
Required env vars:
GITEA_URL - Gitea instance URL
GITEA_TOKEN - Gitea API token
HOOK_TOKEN - OpenClaw hooks auth token
Optional env vars:
GATEWAY_URL - OpenClaw gateway URL (default: http://127.0.0.1:18789)
POLL_DELAY - Delay between polls in seconds (default: 2)
POLL_DELAY - Delay between polls in seconds (default: 15)
COOLDOWN - Minimum seconds between dispatches (default: 30)
ASSIGNMENT_INTERVAL - Seconds between assignment scans (default: 120)
OPENCLAW_BIN - Path to openclaw binary
"""
import json
import os
import subprocess
import sys
import time
import urllib.request
@@ -220,117 +265,286 @@ import urllib.error
GITEA_URL = os.environ.get("GITEA_URL", "").rstrip("/")
GITEA_TOKEN = os.environ.get("GITEA_TOKEN", "")
GATEWAY_URL = os.environ.get("GATEWAY_URL", "http://127.0.0.1:18789").rstrip(
"/"
)
HOOK_TOKEN = os.environ.get("HOOK_TOKEN", "")
POLL_DELAY = int(os.environ.get("POLL_DELAY", "2"))
POLL_DELAY = int(os.environ.get("POLL_DELAY", "15"))
COOLDOWN = int(os.environ.get("COOLDOWN", "30"))
ASSIGNMENT_INTERVAL = int(os.environ.get("ASSIGNMENT_INTERVAL", "120"))
OPENCLAW_BIN = os.environ.get("OPENCLAW_BIN", "/opt/homebrew/bin/openclaw")
# Mattermost channel for status updates (customize to your setup)
GIT_CHANNEL = "channel:YOUR_GIT_CHANNEL_ID"
# Repos to scan for assigned issues
WATCHED_REPOS = [
"your-org/repo1",
"your-org/repo2",
]
# Track dispatched issues to prevent duplicates
dispatched_issues = set()
BOT_USERNAME = "your-bot-username" # e.g. "clawbot"
def check_config():
missing = []
if not GITEA_URL:
missing.append("GITEA_URL")
if not GITEA_TOKEN:
missing.append("GITEA_TOKEN")
if not HOOK_TOKEN:
missing.append("HOOK_TOKEN")
if missing:
print(
f"ERROR: Missing required env vars: {', '.join(missing)}",
file=sys.stderr,
)
if not GITEA_URL or not GITEA_TOKEN:
print("ERROR: GITEA_URL and GITEA_TOKEN required", file=sys.stderr)
sys.exit(1)
def gitea_unread_ids():
"""Return set of unread notification IDs."""
req = urllib.request.Request(
f"{GITEA_URL}/api/v1/notifications?status-types=unread",
headers={"Authorization": f"token {GITEA_TOKEN}"},
)
def gitea_api(method, path, data=None):
"""Call Gitea API, return parsed JSON or None on error."""
url = f"{GITEA_URL}/api/v1{path}"
body = json.dumps(data).encode() if data else None
headers = {"Authorization": f"token {GITEA_TOKEN}"}
if body:
headers["Content-Type"] = "application/json"
req = urllib.request.Request(url, headers=headers, method=method, data=body)
try:
with urllib.request.urlopen(req, timeout=10) as resp:
notifs = json.loads(resp.read())
return {n["id"] for n in notifs}
with urllib.request.urlopen(req, timeout=15) as resp:
raw = resp.read()
return json.loads(raw) if raw else None
except Exception as e:
print(
f"WARN: Gitea API failed: {e}", file=sys.stderr, flush=True
)
return set()
print(f"WARN: Gitea API {method} {path}: {e}",
file=sys.stderr, flush=True)
return None
def wake_openclaw(count):
text = (
f"[Gitea Notification] {count} new notification(s). "
"Check your Gitea notification inbox via API, process them, "
"and mark as read when done. "
"Route all output to Gitea comments or Mattermost #git/#claw. "
"Do NOT reply to this session — respond with NO_REPLY."
def get_unread_notifications():
result = gitea_api("GET", "/notifications?status-types=unread")
return result if isinstance(result, list) else []
def mark_notification_read(notif_id):
gitea_api("PATCH", f"/notifications/threads/{notif_id}")
def needs_bot_response(repo_full, issue_number):
"""True if bot is NOT the author of the most recent comment."""
comments = gitea_api(
"GET", f"/repos/{repo_full}/issues/{issue_number}/comments"
)
payload = json.dumps({"text": text, "mode": "now"}).encode()
req = urllib.request.Request(
f"{GATEWAY_URL}/hooks/wake",
data=payload,
headers={
"Authorization": f"Bearer {HOOK_TOKEN}",
"Content-Type": "application/json",
},
method="POST",
if comments and len(comments) > 0:
if comments[-1].get("user", {}).get("login") == BOT_USERNAME:
return False
return True
def is_actionable_notification(notif):
"""Check if a notification needs agent action.
Returns (actionable, reason, issue_number)."""
subject = notif.get("subject", {})
repo = notif.get("repository", {})
repo_full = repo.get("full_name", "")
url = subject.get("url", "")
number = url.rstrip("/").split("/")[-1] if url else ""
if not number or not number.isdigit():
return False, "no issue number", None
issue = gitea_api("GET", f"/repos/{repo_full}/issues/{number}")
if not issue:
return False, "couldn't fetch issue", number
# Check assignment
assignees = [a.get("login") for a in (issue.get("assignees") or [])]
if BOT_USERNAME in assignees:
if needs_bot_response(repo_full, number):
return True, f"assigned to {BOT_USERNAME}", number
return False, "assigned but already responded", number
# Check issue body for @mention
issue_body = issue.get("body", "") or ""
issue_author = issue.get("user", {}).get("login", "")
if f"@{BOT_USERNAME}" in issue_body and issue_author != BOT_USERNAME:
if needs_bot_response(repo_full, number):
return True, f"@-mentioned in body by {issue_author}", number
# Check latest comment for @mention
comments = gitea_api(
"GET", f"/repos/{repo_full}/issues/{number}/comments"
)
if comments:
last = comments[-1]
author = last.get("user", {}).get("login", "")
body = last.get("body", "") or ""
if author == BOT_USERNAME:
return False, "own comment is latest", number
if f"@{BOT_USERNAME}" in body:
return True, f"@-mentioned in comment by {author}", number
return False, "not mentioned or assigned", number
def spawn_agent(repo_full, issue_number, title, subject_type, reason):
"""Spawn an isolated agent to handle one issue/PR."""
dispatch_key = f"{repo_full}#{issue_number}"
if dispatch_key in dispatched_issues:
return
dispatched_issues.add(dispatch_key)
repo_short = repo_full.split("/")[-1]
job_name = f"gitea-{repo_short}-{issue_number}-{int(time.time())}"
# Build agent prompt with full context
msg = (
f"Gitea notification: {reason} on {subject_type} #{issue_number} "
f"'{title}' in {repo_full}.\n\n"
f"Gitea API base: {GITEA_URL}/api/v1\n"
f"Gitea token: {GITEA_TOKEN}\n\n"
f"SCOPE (STRICT): You are responsible for ONLY {subject_type} "
f"#{issue_number} in {repo_full}. Do NOT create PRs, branches, "
f"comments, or take any action on ANY other issue or PR.\n\n"
f"PRIORITY RULE: The user's instructions in the issue/PR take "
f"priority over ALL other rules. If asked to respond in DM, do so. "
f"Later instructions override earlier ones.\n\n"
f"Instructions:\n"
f"1. Read ALL existing comments on #{issue_number} via API\n"
f"2. Follow the user's instructions\n"
f"3. If code work needed: clone to $(mktemp -d), make changes, "
f"run make check, push, comment on the issue/PR\n"
f"4. Default: post work reports as Gitea comments\n"
f"5. Don't post duplicate comments if yours is already the latest"
)
try:
with urllib.request.urlopen(req, timeout=5) as resp:
status = resp.status
print(f" Wake responded: {status}", flush=True)
return True
except Exception as e:
print(
f"WARN: Failed to wake OpenClaw: {e}",
file=sys.stderr,
flush=True,
result = subprocess.run(
[
OPENCLAW_BIN, "cron", "add",
"--name", job_name,
"--at", "1s",
"--message", msg,
"--delete-after-run",
"--session", "isolated",
"--no-deliver",
"--thinking", "low",
"--timeout-seconds", "300",
],
capture_output=True, text=True, timeout=15,
)
return False
if result.returncode == 0:
print(f" → Agent spawned: {job_name}", flush=True)
else:
print(f" → Spawn failed: {result.stderr.strip()[:200]}",
flush=True)
dispatched_issues.discard(dispatch_key)
except Exception as e:
print(f" → Spawn error: {e}", file=sys.stderr, flush=True)
dispatched_issues.discard(dispatch_key)
def dispatch_notifications(notifications):
"""Triage notifications and spawn agents for actionable ones."""
for notif in notifications:
subject = notif.get("subject", {})
repo = notif.get("repository", {})
repo_full = repo.get("full_name", "")
title = subject.get("title", "")[:60]
notif_id = notif.get("id")
subject_type = subject.get("type", "").lower()
is_act, reason, issue_num = is_actionable_notification(notif)
if notif_id:
mark_notification_read(notif_id)
if is_act:
print(f" ACTIONABLE: {repo_full} #{issue_num} ({reason})",
flush=True)
spawn_agent(repo_full, issue_num, title, subject_type, reason)
else:
print(f" skip: {repo_full} #{issue_num} ({reason})", flush=True)
def scan_assigned_issues():
"""Backup scan: find assigned issues needing response."""
for repo_full in WATCHED_REPOS:
for issue_type in ["issues", "pulls"]:
items = gitea_api(
"GET",
# Gitea filters the issue list by assignee with `assigned_by`.
f"/repos/{repo_full}/issues?state=open&type={issue_type}"
f"&assigned_by={BOT_USERNAME}&sort=updated&limit=10"
)
if not items:
continue
for item in items:
number = str(item["number"])
dispatch_key = f"{repo_full}#{number}"
if dispatch_key in dispatched_issues:
continue
if not needs_bot_response(repo_full, number):
continue
kind = "PR" if issue_type == "pulls" else "issue"
print(f" [assign-scan] {repo_full} {kind} #{number}",
flush=True)
spawn_agent(
repo_full, number, item.get("title", "")[:60],
"pull" if issue_type == "pulls" else "issue",
f"assigned to {BOT_USERNAME}"
)
def main():
check_config()
print(
f"Gitea notification poller started (delay={POLL_DELAY}s)",
flush=True,
)
print(f"Gitea poller+dispatcher (poll={POLL_DELAY}s, "
f"cooldown={COOLDOWN}s, assign_scan={ASSIGNMENT_INTERVAL}s)",
flush=True)
last_seen_ids = gitea_unread_ids()
print(
f"Initial unread: {len(last_seen_ids)} notification(s)", flush=True
)
seen_ids = set(n["id"] for n in get_unread_notifications())
last_dispatch_time = 0
last_assign_scan = 0
print(f"Initial unread: {len(seen_ids)} (draining)", flush=True)
while True:
time.sleep(POLL_DELAY)
now = time.time()
current_ids = gitea_unread_ids()
new_ids = current_ids - last_seen_ids
# Notification polling
notifications = get_unread_notifications()
current_ids = {n["id"] for n in notifications}
new_ids = current_ids - seen_ids
if not new_ids:
last_seen_ids = current_ids
continue
if new_ids:
ts = time.strftime("%H:%M:%S")
new_notifs = [n for n in notifications if n["id"] in new_ids]
print(f"[{ts}] {len(new_ids)} new notification(s)", flush=True)
if now - last_dispatch_time >= COOLDOWN:
dispatch_notifications(new_notifs)
last_dispatch_time = now
else:
remaining = int(COOLDOWN - (now - last_dispatch_time))
print(f" → Cooldown ({remaining}s remaining)", flush=True)
ts = time.strftime("%H:%M:%S")
print(
f"[{ts}] {len(new_ids)} new notification(s) "
f"({len(current_ids)} total unread), waking agent",
flush=True,
)
seen_ids = current_ids
wake_openclaw(len(new_ids))
last_seen_ids = current_ids
# Assignment scan (less frequent)
if now - last_assign_scan >= ASSIGNMENT_INTERVAL:
scan_assigned_issues()
last_assign_scan = now
if __name__ == "__main__":
main()
```
Run it as a background service (launchd on macOS, systemd on Linux) with the env
vars set. It's intentionally simple — no frameworks, no async, no dependencies.
Run it as a background service (launchd on macOS, systemd on Linux) with
`GITEA_URL` and `GITEA_TOKEN` set. Customize `WATCHED_REPOS`, `BOT_USERNAME`,
and `GIT_CHANNEL` for your setup. It's intentionally simple — no frameworks,
no async, no dependencies.
**Lessons learned during development:**
- `openclaw cron add --at` uses formats like `1s`, `20m` — not `+5s` or `+0s`.
- `--no-deliver` is incompatible with `--session main`. Use
`--session isolated` with `--no-deliver`.
- `--system-event` targets the main DM session. If your agent is active in a
channel session, it won't see system events. Use `--session isolated` with
`--message` instead.
- Isolated agent sessions don't have access to workspace files (TOOLS.md, etc).
Bake all credentials and instructions directly into the agent prompt.
- Agents WILL go out of scope unless the SCOPE constraint is extremely explicit
and uses strong language ("violating scope is a critical failure").
- When the user's explicit instructions in an issue conflict with boilerplate
rules in the agent prompt, the agent will follow the boilerplate unless the
prompt explicitly says "user instructions take priority."
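The first two lessons can be encoded as guard checks before the poller shells out. This is only a sketch: `build_spawn_command` is a hypothetical helper, and it uses only the flags already named above (`--name`, `--at`, `--message`, `--session`, `--no-deliver`).

```python
def build_spawn_command(openclaw_bin, job_name, message,
                        session="isolated", deliver=False, at="1s"):
    """Assemble the argv list for `openclaw cron add`, rejecting known-bad combos."""
    if at.startswith("+"):
        raise ValueError("--at takes values like '1s' or '20m', not '+5s'")
    if not deliver and session == "main":
        raise ValueError("--no-deliver cannot be combined with --session main")
    cmd = [
        openclaw_bin, "cron", "add",
        "--name", job_name,
        "--at", at,
        "--message", message,
        "--session", session,
    ]
    if not deliver:
        cmd.append("--no-deliver")
    return cmd


if __name__ == "__main__":
    print(build_spawn_command("openclaw", "gitea-repo1-4", "handle issue #4"))
```

The returned list can be passed to `subprocess.run`, so a bad flag combination fails in the poller rather than at the CLI.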
### The Daily Diary
@@ -368,13 +582,15 @@ This applies to everything: project rules ("no mocks in tests"), workflow
preferences ("fewer PRs, don't over-split"), corrections, new policies.
Immediate write to the daily file, and to MEMORY.md if it's a standing rule.
### PII-Aware Output Routing
### Sensitive Output Routing
A lesson learned the hard way: **the audience determines what you can say, not
who asked.** If the human asks for a medication status report in a group
channel, the agent can't just dump it there — other people can read it. The
rule: if the output would contain PII and the channel isn't private, redirect to
DM and reply in-channel with "sent privately."
rule: if the output would contain sensitive information (PII, secrets,
credentials, API keys, operational details like flight numbers, locations,
travel plans, medical info, etc.) and the channel isn't private, redirect to DM
and reply in-channel with "sent privately."
This is enforced at multiple levels:
@@ -405,7 +621,7 @@ The heartbeat handles:
- Periodic memory maintenance
State tracking in `memory/heartbeat-state.json` prevents redundant checks (e.g.,
don't re-check email if you checked 10 minutes ago).
don't re-check notifications if you checked 10 minutes ago).
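A minimal sketch of that redundancy check, assuming `lastChecks` maps each source name to an epoch-seconds timestamp or null. The helper name `is_check_due` and the 10-minute interval are illustrative, not taken from the actual heartbeat implementation.

```python
import time


def is_check_due(last_checks, name, min_interval_s, now=None):
    """True if `name` was never checked or its last check is older than the interval."""
    now = time.time() if now is None else now
    last = last_checks.get(name)
    return last is None or now - last >= min_interval_s


if __name__ == "__main__":
    last_checks = {"gitea": 1703280000, "weather": None}
    now = 1703280100  # 100 seconds after the last gitea check
    print(is_check_due(last_checks, "gitea", 600, now=now))    # False
    print(is_check_due(last_checks, "weather", 600, now=now))  # True
```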
The key output rule: heartbeats should either be `HEARTBEAT_OK` (nothing to do)
or a direct alert. Work narration goes to a designated status channel, never to
@@ -1417,7 +1633,8 @@ stay quiet.
## Inbox Check (PRIORITY)
(check notifications, issues, emails — whatever applies)
(check whatever notification sources apply to your setup — e.g. Gitea
notifications, emails, issue trackers)
## Flight Prep Blocks (daily)
@@ -1451,10 +1668,9 @@ Never send internal thinking or status narration to user's DM. Output should be:
```json
{
"lastChecks": {
"email": 1703275200,
"gitea": 1703280000,
"calendar": 1703260800,
"weather": null,
"gitea": 1703280000
"weather": null
},
"lastWeeklyDocsReview": "2026-02-24"
}
@@ -1623,21 +1839,24 @@ Never lose a rule or preference your human states:
---
## PII Output Routing — Audience-Aware Responses
## Sensitive Output Routing — Audience-Aware Responses
A critical security pattern: **the audience determines what you can say, not who
asked.** If your human asks for a sitrep (or any PII-containing info) in a group
asked.** If your human asks for a sitrep (or any sensitive info) in a group
channel, you can't just dump it there — other people can read it.
### AGENTS.md / checklist prompt:
```markdown
## PII Output Routing (CRITICAL)
## Sensitive Output Routing (CRITICAL)
- NEVER output PII in any non-private channel, even if your human asks for it
- If a request would produce PII (medication status, travel details, financial
info, etc.) in a shared channel: send the response via DM instead, and reply
in-channel with "sent privately"
- NEVER output sensitive information in any non-private channel, even if your
human asks for it
- This includes: PII, secrets, credentials, API keys, and sensitive operational
information (flight numbers/times/dates, locations, travel plans, medical
info, financial details, etc.)
- If a request would produce any of the above in a shared channel: send the
response via DM instead, and reply in-channel with "sent privately"
- The rule is: the audience determines what you can say, not who asked
- This applies to: group chats, public issue trackers, shared Mattermost
channels, Discord servers — anywhere that isn't a 1:1 DM
@@ -1646,10 +1865,10 @@ channel, you can't just dump it there — other people can read it.
### Why this matters:
This is a real failure mode. If someone asks "sitrep" in a group channel and you
respond with medication names, partner details, travel dates, and hotel names
you just leaked all of that to everyone in the channel. The human asking is
authorized to see it; the channel audience is not. Always check WHERE you're
responding, not just WHO asked.
respond with medication names, partner details, travel dates, hotel names, or
API credentials — you just leaked all of that to everyone in the channel. The
human asking is authorized to see it; the channel audience is not. Always check
WHERE you're responding, not just WHO asked.
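The routing rule can be sketched as a small decision function. This is an illustration under assumptions: whether the output is sensitive and whether the channel is private are decided elsewhere and passed in as booleans, and the names `route_output` and `SENT_PRIVATELY` are hypothetical.

```python
SENT_PRIVATELY = "sent privately"


def route_output(text, contains_sensitive, channel_is_private):
    """Return (destination, message) pairs for one response."""
    if contains_sensitive and not channel_is_private:
        # The audience decides: full content goes to DM, a stub goes in-channel.
        return [("dm", text), ("channel", SENT_PRIVATELY)]
    return [("channel", text)]


if __name__ == "__main__":
    print(route_output("sitrep with travel dates", True, False))
    print(route_output("build is green", False, False))
```

The function never inspects who asked, only what the output contains and where it would land.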
---

File diff suppressed because it is too large