Merge pull request 'docs: update poller to dispatcher architecture (closes #4 )' (#5 ) from fix/update-poller-docs into main

docs: update poller to dispatcher architecture (closes #4 )
Replace flag-file + heartbeat approach with the production dispatcher pattern: poller triages notifications and spawns isolated agents directly via openclaw cron. Adds assignment scan for self-created issues. Response time ~15-60s instead of ~30 min.
2026-02-28 16:31:43 +01:00 · 2026-02-28 06:29:32 -08:00 · 2026-02-28 15:22:36 +01:00 · 2026-02-28 03:40:13 -08:00 · 2026-02-28 03:33:08 -08:00 · 2026-02-28 03:30:49 -08:00
2 changed files with 240 additions and 142 deletions
--- a/OPENCLAW_TRICKS.md
+++ b/OPENCLAW_TRICKS.md
@ -173,46 +173,84 @@ The landing checklist (triggered automatically after every flight) updates
 location, timezone, nearest airport, and lodging in the daily context file. It
 also checks if any cron jobs have hardcoded timezones that need updating.
-### The Gitea Notification Poller
+### Gitea Notification Delivery
-OpenClaw has heartbeats, but those are periodic (every ~30min). For Gitea issues
+There are two approaches for getting Gitea notifications to your agent,
-and PRs, we wanted near-realtime response. The solution: a tiny Python script
+depending on your network setup.
-that polls the Gitea notifications API every 2 seconds and wakes the agent via
+
-OpenClaw's `/hooks/wake` endpoint when new notifications arrive.
+#### Option A: Direct Webhooks (VPS / Public Server)
 If your OpenClaw instance runs on a VPS or other publicly reachable server, the
 simplest approach is direct webhooks. Run Traefik (or any reverse proxy with
 automatic TLS) on the same server and configure Gitea webhooks to POST directly
 to OpenClaw's webhook endpoint. This is push-based and realtime — notifications
 arrive instantly.
 Setup: add a webhook on each Gitea repo (or use an organization-level webhook)
 pointing to `https://your-openclaw-host/hooks/gitea`. OpenClaw handles the rest.
 #### Option B: Notification Poller + Dispatcher (Local Machine Behind NAT)
 If your OpenClaw runs on a dedicated local machine behind NAT (like a home Mac
 or Linux workstation), Gitea can't reach it directly. This is our setup —
 OpenClaw runs on a Mac Studio on a home LAN.
 The solution: a Python script that both polls and dispatches. It polls the Gitea
 notifications API every 15 seconds, triages each notification (checking
 assignment and @-mentions), marks them as read, and spawns one isolated agent
 session per actionable item via `openclaw cron add --session isolated`.
 The poller also runs a secondary **assignment scan** every 2 minutes, checking
 all watched repos for open issues/PRs assigned to the bot that were recently
 updated and still need a response. This catches cases where notifications aren't
 generated (e.g. self-assignment, API-created issues).
 Key design decisions:
- **The poller never marks notifications as read.** That's the agent's job after
+- **The poller IS the dispatcher.** No flag files, no heartbeat dependency. The
-  it processes them. This prevents the poller and agent from racing.
+  poller triages notifications and spawns agents directly.
- **It tracks notification IDs, not counts.** This way it only fires on
+- **Marks notifications as read immediately.** Each notification is marked read
-  genuinely new notifications, not re-reads of existing ones.
+  as it's processed, preventing re-dispatch on the next poll.
- **The wake message tells the agent to route output to Gitea/Mattermost, not to
+- **One agent per issue.** Each spawned agent gets a `SCOPE` instruction
-  DM.** This prevents chatty notification processing from disturbing the human.
+  limiting it to one specific issue/PR. Agents post results as Gitea comments,
- **Zero dependencies.** Just Python stdlib (`urllib`, `json`, `time`). Runs
+  not DMs.
-  anywhere.
+- **Dedup tracking.** An in-memory `dispatched_issues` set prevents spawning
  multiple agents for the same issue within one poller lifetime.
 - **`--no-deliver` instead of `--announce`.** Agents report via Gitea API
  directly. The `--announce` flag on isolated sessions had delivery failures.
 - **Assignment scan filters by recency.** Only issues updated in the last 5
  minutes are considered, preventing re-dispatch for stale assigned issues.
 - **Zero dependencies.** Just Python stdlib. Runs anywhere.
-Here's the full source:
+Response time: ~15–60s from notification to agent comment (vs ~30 min with the
 old heartbeat approach).
 ```python
 #!/usr/bin/env python3
 """
-Gitea notification poller.
+Gitea notification poller + dispatcher.
-Polls for unread notifications and wakes OpenClaw when the count
+
-changes. The AGENT marks notifications as read after processing —
+Two polling loops:
-the poller never marks anything as read.
+1. Notification-based: detects new notifications (mentions, assignments by
   other users) and dispatches agents for actionable ones.
 2. Assignment-based: periodically checks for open issues/PRs assigned to
   the bot that have no recent bot comment. Catches cases where
   notifications aren't generated (e.g. self-assignment, API-created issues).
 Required env vars:
  GITEA_URL        - Gitea instance URL
  GITEA_TOKEN      - Gitea API token
  HOOK_TOKEN       - OpenClaw hooks auth token
 Optional env vars:
-  GATEWAY_URL      - OpenClaw gateway URL (default: http://127.0.0.1:18789)
+  POLL_DELAY       - Delay between polls in seconds (default: 15)
-  POLL_DELAY       - Delay between polls in seconds (default: 2)
+  COOLDOWN         - Minimum seconds between dispatches (default: 30)
  ASSIGNMENT_INTERVAL - Seconds between assignment scans (default: 120)
  OPENCLAW_BIN     - Path to openclaw binary
 """
 import json
 import os
 import subprocess
 import sys
 import time
 import urllib.request
@ -220,109 +258,158 @@ import urllib.error
 GITEA_URL = os.environ.get("GITEA_URL", "").rstrip("/")
 GITEA_TOKEN = os.environ.get("GITEA_TOKEN", "")
-GATEWAY_URL = os.environ.get("GATEWAY_URL", "http://127.0.0.1:18789").rstrip(
+POLL_DELAY = int(os.environ.get("POLL_DELAY", "15"))
-    "/"
+COOLDOWN = int(os.environ.get("COOLDOWN", "30"))
-)
+ASSIGNMENT_INTERVAL = int(os.environ.get("ASSIGNMENT_INTERVAL", "120"))
-HOOK_TOKEN = os.environ.get("HOOK_TOKEN", "")
+OPENCLAW_BIN = os.environ.get("OPENCLAW_BIN", "/opt/homebrew/bin/openclaw")
-POLL_DELAY = int(os.environ.get("POLL_DELAY", "2"))
+BOT_USER = "clawbot"  # Change to your bot's Gitea username
 # Repos to scan for assigned issues
 WATCHED_REPOS = [
    # "org/repo1",
    # "org/repo2",
 ]
 # Track dispatched issues to prevent duplicates
 dispatched_issues = set()
-def check_config():
+def gitea_api(method, path, data=None):
-    missing = []
+    url = f"{GITEA_URL}/api/v1{path}"
-    if not GITEA_URL:
+    body = json.dumps(data).encode() if data else None
-        missing.append("GITEA_URL")
+    headers = {"Authorization": f"token {GITEA_TOKEN}"}
-    if not GITEA_TOKEN:
+    if body:
-        missing.append("GITEA_TOKEN")
+        headers["Content-Type"] = "application/json"
-    if not HOOK_TOKEN:
+    req = urllib.request.Request(url, headers=headers, method=method, data=body)
        missing.append("HOOK_TOKEN")
    if missing:
        print(
            f"ERROR: Missing required env vars: {', '.join(missing)}",
            file=sys.stderr,
        )
        sys.exit(1)
 def gitea_unread_ids():
    """Return set of unread notification IDs."""
    req = urllib.request.Request(
        f"{GITEA_URL}/api/v1/notifications?status-types=unread",
        headers={"Authorization": f"token {GITEA_TOKEN}"},
    )
    try:
-        with urllib.request.urlopen(req, timeout=10) as resp:
+        with urllib.request.urlopen(req, timeout=15) as resp:
-            notifs = json.loads(resp.read())
+            raw = resp.read()
-            return {n["id"] for n in notifs}
+            return json.loads(raw) if raw else None
    except Exception as e:
-        print(
+        print(f"WARN: {method} {path}: {e}", file=sys.stderr, flush=True)
-            f"WARN: Gitea API failed: {e}", file=sys.stderr, flush=True
+        return None
        )
        return set()
-def wake_openclaw(count):
+def needs_bot_response(repo_full, issue_number):
-    text = (
+    """True if the bot is NOT the author of the most recent comment."""
-        f"[Gitea Notification] {count} new notification(s). "
+    comments = gitea_api("GET", f"/repos/{repo_full}/issues/{issue_number}/comments")
-        "Check your Gitea notification inbox via API, process them, "
+    if comments and len(comments) > 0:
-        "and mark as read when done. "
+        if comments[-1].get("user", {}).get("login") == BOT_USER:
        "Route all output to Gitea comments or Mattermost #git/#claw. "
        "Do NOT reply to this session — respond with NO_REPLY."
    )
    payload = json.dumps({"text": text, "mode": "now"}).encode()
    req = urllib.request.Request(
        f"{GATEWAY_URL}/hooks/wake",
        data=payload,
        headers={
            "Authorization": f"Bearer {HOOK_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            status = resp.status
        print(f"  Wake responded: {status}", flush=True)
        return True
    except Exception as e:
        print(
            f"WARN: Failed to wake OpenClaw: {e}",
            file=sys.stderr,
            flush=True,
        )
            return False
    return True
 def is_actionable(notif):
    """Returns (actionable, reason, issue_number)."""
    subject = notif.get("subject", {})
    repo = notif.get("repository", {})
    repo_full = repo.get("full_name", "")
    url = subject.get("url", "")
    number = url.rstrip("/").split("/")[-1] if url else ""
    if not number or not number.isdigit():
        return False, "no issue number", None
    issue = gitea_api("GET", f"/repos/{repo_full}/issues/{number}")
    if not issue:
        return False, "couldn't fetch issue", number
    assignees = [a.get("login") for a in (issue.get("assignees") or [])]
    if BOT_USER in assignees:
        if needs_bot_response(repo_full, number):
            return True, f"assigned to {BOT_USER}", number
        return False, "assigned but already responded", number
    issue_body = issue.get("body", "") or ""
    if f"@{BOT_USER}" in issue_body and issue.get("user", {}).get("login") != BOT_USER:
        if needs_bot_response(repo_full, number):
            return True, f"@-mentioned in body", number
    comments = gitea_api("GET", f"/repos/{repo_full}/issues/{number}/comments")
    if comments:
        last = comments[-1]
        if last.get("user", {}).get("login") == BOT_USER:
            return False, "own comment is latest", number
        if f"@{BOT_USER}" in (last.get("body") or ""):
            return True, f"@-mentioned in comment", number
    return False, "not mentioned or assigned", number
 def spawn_agent(repo_full, issue_number, title, subject_type, reason):
    dispatch_key = f"{repo_full}#{issue_number}"
    if dispatch_key in dispatched_issues:
        return
    dispatched_issues.add(dispatch_key)
    repo_short = repo_full.split("/")[-1]
    job_name = f"gitea-{repo_short}-{issue_number}-{int(time.time())}"
    msg = (
        f"Gitea: {reason} on {subject_type} #{issue_number} "
        f"'{title}' in {repo_full}.\n"
        f"API: {GITEA_URL}/api/v1 | Token: {GITEA_TOKEN}\n"
        f"SCOPE: Only {subject_type} #{issue_number} in {repo_full}.\n"
        f"Read all comments, do the work, post results as Gitea comments."
    )
    try:
        subprocess.run(
            [OPENCLAW_BIN, "cron", "add",
             "--name", job_name, "--at", "1s",
             "--message", msg, "--delete-after-run",
             "--session", "isolated", "--no-deliver",
             "--thinking", "low", "--timeout-seconds", "300"],
            capture_output=True, text=True, timeout=15,
        )
    except Exception as e:
        print(f"Spawn error: {e}", file=sys.stderr, flush=True)
        dispatched_issues.discard(dispatch_key)
 def main():
-    check_config()
+    print(f"Poller started (poll={POLL_DELAY}s, cooldown={COOLDOWN}s)", flush=True)
-    print(
+    seen_ids = set(n["id"] for n in (gitea_api("GET", "/notifications?status-types=unread") or []))
-        f"Gitea notification poller started (delay={POLL_DELAY}s)",
+    last_dispatch = 0
-        flush=True,
+    last_assign_scan = 0
    )
    last_seen_ids = gitea_unread_ids()
    print(
        f"Initial unread: {len(last_seen_ids)} notification(s)", flush=True
    )
    while True:
        time.sleep(POLL_DELAY)
        now = time.time()
-        current_ids = gitea_unread_ids()
+        # Notification polling
-        new_ids = current_ids - last_seen_ids
+        notifs = gitea_api("GET", "/notifications?status-types=unread") or []
        current_ids = {n["id"] for n in notifs}
        new_ids = current_ids - seen_ids
        if new_ids and now - last_dispatch >= COOLDOWN:
            for n in [n for n in notifs if n["id"] in new_ids]:
                nid = n.get("id")
                if nid:
                    gitea_api("PATCH", f"/notifications/threads/{nid}")
                is_act, reason, num = is_actionable(n)
                if is_act:
                    repo = n["repository"]["full_name"]
                    title = n["subject"]["title"][:60]
                    stype = n["subject"].get("type", "").lower()
                    spawn_agent(repo, num, title, stype, reason)
            last_dispatch = now
        seen_ids = current_ids
-        if not new_ids:
+        # Assignment scan (less frequent)
-            last_seen_ids = current_ids
+        if now - last_assign_scan >= ASSIGNMENT_INTERVAL:
            for repo in WATCHED_REPOS:
                for itype in ["issues", "pulls"]:
                    items = gitea_api("GET",
                        f"/repos/{repo}/issues?state=open&type={itype}"
                        f"&assignee={BOT_USER}&sort=updated&limit=10") or []
                    for item in items:
                        num = str(item["number"])
                        if f"{repo}#{num}" in dispatched_issues:
                            continue
-
+                        # Only recently updated items (5 min)
-        ts = time.strftime("%H:%M:%S")
+                        # ... add is_recently_updated() check here
-        print(
+                        if needs_bot_response(repo, num):
-            f"[{ts}] {len(new_ids)} new notification(s) "
+                            spawn_agent(repo, num, item["title"][:60],
-            f"({len(current_ids)} total unread), waking agent",
+                                "pull" if itype == "pulls" else "issue",
-            flush=True,
+                                f"assigned to {BOT_USER}")
-        )
+            last_assign_scan = now
        wake_openclaw(len(new_ids))
        last_seen_ids = current_ids
 if __name__ == "__main__":
@ -368,13 +455,15 @@ This applies to everything: project rules ("no mocks in tests"), workflow
 preferences ("fewer PRs, don't over-split"), corrections, new policies.
 Immediate write to the daily file, and to MEMORY.md if it's a standing rule.
-### PII-Aware Output Routing
+### Sensitive Output Routing
 A lesson learned the hard way: **the audience determines what you can say, not
 who asked.** If the human asks for a medication status report in a group
 channel, the agent can't just dump it there — other people can read it. The
-rule: if the output would contain PII and the channel isn't private, redirect to
+rule: if the output would contain sensitive information (PII, secrets,
-DM and reply in-channel with "sent privately."
+credentials, API keys, operational details like flight numbers, locations,
 travel plans, medical info, etc.) and the channel isn't private, redirect to DM
 and reply in-channel with "sent privately."
 This is enforced at multiple levels:
@ -405,7 +494,7 @@ The heartbeat handles:
 - Periodic memory maintenance
 State tracking in `memory/heartbeat-state.json` prevents redundant checks (e.g.,
-don't re-check email if you checked 10 minutes ago).
+don't re-check notifications if you checked 10 minutes ago).
 The key output rule: heartbeats should either be `HEARTBEAT_OK` (nothing to do)
 or a direct alert. Work narration goes to a designated status channel, never to
@ -1417,7 +1506,8 @@ stay quiet.
 ## Inbox Check (PRIORITY)
-(check notifications, issues, emails — whatever applies)
+(check whatever notification sources apply to your setup — e.g. Gitea
 notifications, emails, issue trackers)
 ## Flight Prep Blocks (daily)
@ -1451,10 +1541,9 @@ Never send internal thinking or status narration to user's DM. Output should be:
 ```json
 {
    "lastChecks": {
-        "email": 1703275200,
+        "gitea": 1703280000,
        "calendar": 1703260800,
-        "weather": null,
+        "weather": null
        "gitea": 1703280000
    },
    "lastWeeklyDocsReview": "2026-02-24"
 }
@ -1623,21 +1712,24 @@ Never lose a rule or preference your human states:
 ---
-## PII Output Routing — Audience-Aware Responses
+## Sensitive Output Routing — Audience-Aware Responses
 A critical security pattern: **the audience determines what you can say, not who
-asked.** If your human asks for a sitrep (or any PII-containing info) in a group
+asked.** If your human asks for a sitrep (or any sensitive info) in a group
 channel, you can't just dump it there — other people can read it.
 ### AGENTS.md / checklist prompt:
 ```markdown
-## PII Output Routing (CRITICAL)
+## Sensitive Output Routing (CRITICAL)
- NEVER output PII in any non-private channel, even if your human asks for it
+- NEVER output sensitive information in any non-private channel, even if your
- If a request would produce PII (medication status, travel details, financial
+  human asks for it
-  info, etc.) in a shared channel: send the response via DM instead, and reply
+- This includes: PII, secrets, credentials, API keys, and sensitive operational
-  in-channel with "sent privately"
+  information (flight numbers/times/dates, locations, travel plans, medical
  info, financial details, etc.)
 - If a request would produce any of the above in a shared channel: send the
  response via DM instead, and reply in-channel with "sent privately"
 - The rule is: the audience determines what you can say, not who asked
 - This applies to: group chats, public issue trackers, shared Mattermost
  channels, Discord servers — anywhere that isn't a 1:1 DM
@ -1646,10 +1738,10 @@ channel, you can't just dump it there — other people can read it.
 ### Why this matters:
 This is a real failure mode. If someone asks "sitrep" in a group channel and you
-respond with medication names, partner details, travel dates, and hotel names —
+respond with medication names, partner details, travel dates, hotel names, or
-you just leaked all of that to everyone in the channel. The human asking is
+API credentials — you just leaked all of that to everyone in the channel. The
-authorized to see it; the channel audience is not. Always check WHERE you're
+human asking is authorized to see it; the channel audience is not. Always check
-responding, not just WHO asked.
+WHERE you're responding, not just WHO asked.
 ---
--- a/SETUP_CHECKLIST.md
+++ b/SETUP_CHECKLIST.md
@ -104,8 +104,9 @@ Set up the memory directory structure:
 ## Notes
-Your human will tell you what fields to add to daily-context.json for
+Add fields relevant to whatever tracking systems you set up later
-their specific needs (medications, travel, etc.).
+(medications, travel, sleep, etc.). Infer what's needed from the
 sections your human enables — don't wait to be told.
 ```
 ### 1.5 Create AGENTS.md
@ -272,7 +273,8 @@ poll. Structure it like this:
 ## Checks (rotate through these, 2-4 times per day)
- Emails — any urgent unread messages?
+- Notifications — any unread items? (Gitea notifications, emails, or
  whatever inbox sources you've integrated)
 - Calendar — upcoming events in next 24-48h?
 - Open issues/PRs — anything assigned to me?
 - Workspace sync — any uncommitted changes to push?
@ -335,32 +337,36 @@ Then add a reference to this checklist in the MEMORY.md checklist index.
 Reference:
 https://git.eeqj.de/sneak/clawpub/raw/branch/main/OPENCLAW_TRICKS.md
-(see "PII Output Routing" and "Checklists Over Prose")
+(see "Sensitive Output Routing" and "Checklists Over Prose")
 ```
-### 5.2 PII output routing
+### 5.2 Sensitive output routing
-Prevents leaking private info in shared channels. Paste this to your agent:
+Prevents leaking private info, secrets, and operational details in shared
 channels. Paste this to your agent:
 ```
 Add the following warning banner near the TOP of AGENTS.md (before the
 session startup section):
-**⚠️ NEVER output PII in non-private channels.** If asked for
+**⚠️ NEVER output sensitive information in non-private channels.** This
-PII-containing info (medical, financial, personal) in a shared channel,
+includes PII, secrets, credentials, API keys, and sensitive operational
-send via DM to your human instead.
+information (flight numbers/times/dates, locations, travel plans,
 medical info, etc.). If asked for any of this in a shared channel, send
 via DM to your human instead.
-Also add a PII section to memory/checklist-messaging.md:
+Also add a sensitive-info section to memory/checklist-messaging.md:
-## PII Check (before every message in shared channels)
+## Sensitive Info Check (before every message in shared channels)
-1. Contains names, addresses, medical info, financial info? → DM only
+1. Contains PII (names, addresses, medical info, financial info)? → DM only
-2. Contains login credentials or tokens? → NEVER send, period
+2. Contains secrets, credentials, API keys, or tokens? → NEVER send, period
-3. When in doubt → send via DM
+3. Contains operational details (flight numbers, travel plans, locations)? → DM only
 4. When in doubt → send via DM
 Reference:
 https://git.eeqj.de/sneak/clawpub/raw/branch/main/OPENCLAW_TRICKS.md
-(see "PII-Aware Output Routing")
+(see "Sensitive Output Routing")
 ```
 ### 5.3 Additional checklists
Author	SHA1	Message	Date
clawbot	ccf08cfb67	Merge pull request 'docs: update poller to dispatcher architecture (closes #4 )' (#5 ) from fix/update-poller-docs into main All checks were successful check / check (push) Successful in 9s Details	2026-02-28 16:31:43 +01:00
clawbot	0284ea63c0	docs: update poller to dispatcher architecture (closes #4 ) All checks were successful check / check (push) Successful in 11s Details Replace flag-file + heartbeat approach with the production dispatcher pattern: poller triages notifications and spawns isolated agents directly via openclaw cron. Adds assignment scan for self-created issues. Response time ~15-60s instead of ~30 min.	2026-02-28 06:29:32 -08:00
Jeffrey Paul	f3e48c6cd4	Merge pull request 'Expand sensitive output routing and make inbox references conditional' (#3 ) from fix/pii-and-conditional-email into main All checks were successful check / check (push) Successful in 9s Details Reviewed-on: #3	2026-02-28 15:22:36 +01:00
clawbot	c0d345e767	expand PII routing to cover secrets, credentials, and operational info; make email/inbox references conditional All checks were successful check / check (push) Successful in 12s Details - Rename 'PII Output Routing' → 'Sensitive Output Routing' throughout - Expand scope to include secrets, credentials, API keys, flight numbers, locations, travel plans, medical info - Replace hardcoded 'Emails' heartbeat check with conditional language ('Notifications — whatever inbox sources you've integrated') - Remove 'email' from heartbeat-state.json example - Update cross-references in SETUP_CHECKLIST.md	2026-02-28 03:40:13 -08:00
user	36223ca550	fix: agent should infer needed fields, not wait to be told All checks were successful check / check (push) Successful in 12s Details	2026-02-28 03:33:08 -08:00
user	f0a2a5eb62	docs: update Gitea notification section — webhook vs poller, flag-file approach Some checks are pending check / check (push) Waiting to run Details - Replaced wake-event poller with flag-file approach (prevents DM spam) - Added Option A (webhooks for VPS) vs Option B (poller for NAT) - Documented the wake-event failure mode and why we switched	2026-02-28 03:30:49 -08:00
clawbot	9631535583	Merge pull request 'Rewrite SETUP_CHECKLIST.md: replace checklists with paste-able agent prompts' (#1 ) from rewrite-setup-checklist-prompts into main Some checks are pending check / check (push) Waiting to run Details	2026-02-28 12:27:17 +01:00