Merge pull request 'docs: update poller to dispatcher architecture (closes #4)' (#5) from fix/update-poller-docs into main

docs: update poller to dispatcher architecture (closes #4)
Replace flag-file + heartbeat approach with the production dispatcher pattern: poller triages notifications and spawns isolated agents directly via openclaw cron. Adds assignment scan for self-created issues. Response time ~15-60s instead of ~30 min.
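The assignment scan in the diff below leaves its recency filter as a placeholder comment (`is_recently_updated()` is named but not defined). A minimal sketch of that helper follows, assuming each item carries an ISO 8601 `updated_at` field, as Gitea issue objects do. The `RECENCY_WINDOW` constant, the `now` parameter, and the fallback that treats naive timestamps as UTC are illustrative assumptions, not part of this change.

```python
from datetime import datetime, timedelta, timezone

# Assumption: mirrors the "updated in the last 5 minutes" rule from the docs.
RECENCY_WINDOW = timedelta(minutes=5)


def is_recently_updated(item, now=None, window=RECENCY_WINDOW):
    """Return True if item["updated_at"] is no older than `window`."""
    stamp = item.get("updated_at")
    if not stamp:
        return False
    try:
        # fromisoformat() rejects a trailing "Z" before Python 3.11,
        # so rewrite it as an explicit UTC offset first.
        updated = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
    except ValueError:
        return False
    if updated.tzinfo is None:
        # Assumption: a timestamp without an offset is taken to be UTC.
        updated = updated.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - updated <= window
```

In the scan loop this would sit just before the `needs_bot_response()` call, for example `if not is_recently_updated(item): continue`. Stdlib only, in keeping with the poller's zero-dependency goal.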
2 changed files with 221 additions and 97 deletions
--- a/OPENCLAW_TRICKS.md
+++ b/OPENCLAW_TRICKS.md
@@ -189,50 +189,68 @@ arrive instantly.
 Setup: add a webhook on each Gitea repo (or use an organization-level webhook)
 pointing to `https://your-openclaw-host/hooks/gitea`. OpenClaw handles the rest.
-#### Option B: Notification Poller (Local Machine Behind NAT)
+#### Option B: Notification Poller + Dispatcher (Local Machine Behind NAT)
 If your OpenClaw runs on a dedicated local machine behind NAT (like a home Mac
 or Linux workstation), Gitea can't reach it directly. This is our setup —
 OpenClaw runs on a Mac Studio on a home LAN.
-The solution: a lightweight Python script that polls the Gitea notifications API
+The solution: a Python script that both polls and dispatches. It polls the Gitea
-every few seconds. When new notifications appear, it writes a flag file that the
+notifications API every 15 seconds, triages each notification (checking
-agent checks during heartbeats.
+assignment and @-mentions), marks them as read, and spawns one isolated agent
 session per actionable item via `openclaw cron add --session isolated`.
 The poller also runs a secondary **assignment scan** every 2 minutes, checking
 all watched repos for open issues/PRs assigned to the bot that were recently
 updated and still need a response. This catches cases where notifications aren't
 generated (e.g. self-assignment, API-created issues).
 Key design decisions:
- **The poller never marks notifications as read.** The agent does that after
+- **The poller IS the dispatcher.** No flag files, no heartbeat dependency. The
-  processing. This prevents lost notifications if the agent fails to process.
+  poller triages notifications and spawns agents directly.
- **It tracks notification IDs, not counts.** Only fires on genuinely new
+- **Marks notifications as read immediately.** Each notification is marked read
-  notifications, not re-reads of existing ones.
+  as it's processed, preventing re-dispatch on the next poll.
- **Flag file instead of wake events.** We initially used OpenClaw's
+- **One agent per issue.** Each spawned agent gets a `SCOPE` instruction
-  `/hooks/wake` endpoint, but wake events target the main (DM) session — any
+  limiting it to one specific issue/PR. Agents post results as Gitea comments,
-  model response during processing leaked to DM as a notification. The flag file
+  not DMs.
-  approach is processed during heartbeats, where output routing is controlled.
+- **Dedup tracking.** An in-memory `dispatched_issues` set prevents spawning
  multiple agents for the same issue within one poller lifetime.
 - **`--no-deliver` instead of `--announce`.** Agents report via Gitea API
  directly. The `--announce` flag on isolated sessions had delivery failures.
 - **Assignment scan filters by recency.** Only issues updated in the last 5
  minutes are considered, preventing re-dispatch for stale assigned issues.
 - **Zero dependencies.** Just Python stdlib. Runs anywhere.
-Tradeoff: notifications are processed at heartbeat cadence (~30 min) instead of
+Response time: ~15–60s from notification to agent comment (vs ~30 min with the
-realtime. For code review and issue triage, this is fine.
+old heartbeat approach).
 ```python
 #!/usr/bin/env python3
 """
-Gitea notification poller (flag-file approach).
+Gitea notification poller + dispatcher.
-Polls for unread notifications and writes a flag file when new ones
+
-appear. The agent checks this flag during heartbeats and processes
+Two polling loops:
-notifications via the Gitea API directly.
+1. Notification-based: detects new notifications (mentions, assignments by
   other users) and dispatches agents for actionable ones.
 2. Assignment-based: periodically checks for open issues/PRs assigned to
   the bot that have no recent bot comment. Catches cases where
   notifications aren't generated (e.g. self-assignment, API-created issues).
 Required env vars:
  GITEA_URL        - Gitea instance URL
  GITEA_TOKEN      - Gitea API token
 Optional env vars:
-  FLAG_PATH   - Path to flag file (default: workspace/memory/gitea-notify-flag)
+  POLL_DELAY       - Delay between polls in seconds (default: 15)
-  POLL_DELAY  - Delay between polls in seconds (default: 5)
+  COOLDOWN         - Minimum seconds between dispatches (default: 30)
  ASSIGNMENT_INTERVAL - Seconds between assignment scans (default: 120)
  OPENCLAW_BIN     - Path to openclaw binary
 """
 import json
 import os
 import subprocess
 import sys
 import time
 import urllib.request
@@ -240,62 +258,158 @@ import urllib.error
 GITEA_URL = os.environ.get("GITEA_URL", "").rstrip("/")
 GITEA_TOKEN = os.environ.get("GITEA_TOKEN", "")
-POLL_DELAY = int(os.environ.get("POLL_DELAY", "5"))
+POLL_DELAY = int(os.environ.get("POLL_DELAY", "15"))
-FLAG_PATH = os.environ.get(
+COOLDOWN = int(os.environ.get("COOLDOWN", "30"))
-    "FLAG_PATH",
+ASSIGNMENT_INTERVAL = int(os.environ.get("ASSIGNMENT_INTERVAL", "120"))
-    os.path.join(
+OPENCLAW_BIN = os.environ.get("OPENCLAW_BIN", "/opt/homebrew/bin/openclaw")
-        os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
+BOT_USER = "clawbot"  # Change to your bot's Gitea username
-        "memory",
+
-        "gitea-notify-flag",
+# Repos to scan for assigned issues
-    ),
+WATCHED_REPOS = [
-)
+    # "org/repo1",
    # "org/repo2",
 ]
 # Track dispatched issues to prevent duplicates
 dispatched_issues = set()
-def check_config():
+def gitea_api(method, path, data=None):
-    if not GITEA_URL or not GITEA_TOKEN:
+    url = f"{GITEA_URL}/api/v1{path}"
-        print("ERROR: GITEA_URL and GITEA_TOKEN required", file=sys.stderr)
+    body = json.dumps(data).encode() if data else None
-        sys.exit(1)
+    headers = {"Authorization": f"token {GITEA_TOKEN}"}
    if body:
        headers["Content-Type"] = "application/json"
    req = urllib.request.Request(url, headers=headers, method=method, data=body)
    try:
        with urllib.request.urlopen(req, timeout=15) as resp:
            raw = resp.read()
            return json.loads(raw) if raw else None
    except Exception as e:
        print(f"WARN: {method} {path}: {e}", file=sys.stderr, flush=True)
        return None
-def gitea_unread_ids():
+def needs_bot_response(repo_full, issue_number):
-    req = urllib.request.Request(
+    """True if the bot is NOT the author of the most recent comment."""
-        f"{GITEA_URL}/api/v1/notifications?status-types=unread",
+    comments = gitea_api("GET", f"/repos/{repo_full}/issues/{issue_number}/comments")
-        headers={"Authorization": f"token {GITEA_TOKEN}"},
+    if comments and len(comments) > 0:
        if comments[-1].get("user", {}).get("login") == BOT_USER:
            return False
    return True
 def is_actionable(notif):
    """Returns (actionable, reason, issue_number)."""
    subject = notif.get("subject", {})
    repo = notif.get("repository", {})
    repo_full = repo.get("full_name", "")
    url = subject.get("url", "")
    number = url.rstrip("/").split("/")[-1] if url else ""
    if not number or not number.isdigit():
        return False, "no issue number", None
    issue = gitea_api("GET", f"/repos/{repo_full}/issues/{number}")
    if not issue:
        return False, "couldn't fetch issue", number
    assignees = [a.get("login") for a in (issue.get("assignees") or [])]
    if BOT_USER in assignees:
        if needs_bot_response(repo_full, number):
            return True, f"assigned to {BOT_USER}", number
        return False, "assigned but already responded", number
    issue_body = issue.get("body", "") or ""
    if f"@{BOT_USER}" in issue_body and issue.get("user", {}).get("login") != BOT_USER:
        if needs_bot_response(repo_full, number):
            return True, "@-mentioned in body", number
    comments = gitea_api("GET", f"/repos/{repo_full}/issues/{number}/comments")
    if comments:
        last = comments[-1]
        if last.get("user", {}).get("login") == BOT_USER:
            return False, "own comment is latest", number
        if f"@{BOT_USER}" in (last.get("body") or ""):
            return True, "@-mentioned in comment", number
    return False, "not mentioned or assigned", number
 def spawn_agent(repo_full, issue_number, title, subject_type, reason):
    dispatch_key = f"{repo_full}#{issue_number}"
    if dispatch_key in dispatched_issues:
        return
    dispatched_issues.add(dispatch_key)
    repo_short = repo_full.split("/")[-1]
    job_name = f"gitea-{repo_short}-{issue_number}-{int(time.time())}"
    msg = (
        f"Gitea: {reason} on {subject_type} #{issue_number} "
        f"'{title}' in {repo_full}.\n"
        f"API: {GITEA_URL}/api/v1 | Token: {GITEA_TOKEN}\n"
        f"SCOPE: Only {subject_type} #{issue_number} in {repo_full}.\n"
        f"Read all comments, do the work, post results as Gitea comments."
    )
    try:
-        with urllib.request.urlopen(req, timeout=10) as resp:
+        subprocess.run(
-            return {n["id"] for n in json.loads(resp.read())}
+            [OPENCLAW_BIN, "cron", "add",
             "--name", job_name, "--at", "1s",
             "--message", msg, "--delete-after-run",
             "--session", "isolated", "--no-deliver",
             "--thinking", "low", "--timeout-seconds", "300"],
            capture_output=True, text=True, timeout=15,
        )
    except Exception as e:
-        print(f"WARN: Gitea API failed: {e}", file=sys.stderr, flush=True)
+        print(f"Spawn error: {e}", file=sys.stderr, flush=True)
-        return set()
+        dispatched_issues.discard(dispatch_key)
 def write_flag(count):
    os.makedirs(os.path.dirname(FLAG_PATH), exist_ok=True)
    with open(FLAG_PATH, "w") as f:
        f.write(json.dumps({
            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "count": count,
        }))
 def main():
-    check_config()
+    print(f"Poller started (poll={POLL_DELAY}s, cooldown={COOLDOWN}s)", flush=True)
-    print(f"Gitea poller started (delay={POLL_DELAY}s, flag={FLAG_PATH})", flush=True)
+    seen_ids = set(n["id"] for n in (gitea_api("GET", "/notifications?status-types=unread") or []))
-    last_seen_ids = gitea_unread_ids()
+    last_dispatch = 0
-    print(f"Initial unread: {len(last_seen_ids)}", flush=True)
+    last_assign_scan = 0
    while True:
        time.sleep(POLL_DELAY)
-        current_ids = gitea_unread_ids()
+        now = time.time()
-        new_ids = current_ids - last_seen_ids
+
-        if not new_ids:
+        # Notification polling
-            last_seen_ids = current_ids
+        notifs = gitea_api("GET", "/notifications?status-types=unread") or []
        current_ids = {n["id"] for n in notifs}
        new_ids = current_ids - seen_ids
        if new_ids and now - last_dispatch >= COOLDOWN:
            for n in [n for n in notifs if n["id"] in new_ids]:
                nid = n.get("id")
                if nid:
                    gitea_api("PATCH", f"/notifications/threads/{nid}")
                is_act, reason, num = is_actionable(n)
                if is_act:
                    repo = n["repository"]["full_name"]
                    title = n["subject"]["title"][:60]
                    stype = n["subject"].get("type", "").lower()
                    spawn_agent(repo, num, title, stype, reason)
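            last_dispatch = now
            seen_ids = current_ids
        elif not new_ids:
            # Nothing new, so resync. While cooling down, seen_ids is left
            # alone so pending notifications are retried on a later poll.
            seen_ids = current_ids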
        # Assignment scan (less frequent)
        if now - last_assign_scan >= ASSIGNMENT_INTERVAL:
            for repo in WATCHED_REPOS:
                for itype in ["issues", "pulls"]:
                    items = gitea_api("GET",
                        f"/repos/{repo}/issues?state=open&type={itype}"
                        f"&assignee={BOT_USER}&sort=updated&limit=10") or []
                    for item in items:
                        num = str(item["number"])
                        if f"{repo}#{num}" in dispatched_issues:
                            continue
-        ts = time.strftime("%H:%M:%S")
+                        # Only recently updated items (5 min)
-        print(f"[{ts}] {len(new_ids)} new ({len(current_ids)} total), flag written", flush=True)
+                        # ... add is_recently_updated() check here
-        write_flag(len(new_ids))
+                        if needs_bot_response(repo, num):
-        last_seen_ids = current_ids
+                            spawn_agent(repo, num, item["title"][:60],
                                "pull" if itype == "pulls" else "issue",
                                f"assigned to {BOT_USER}")
            last_assign_scan = now
 if __name__ == "__main__":
@@ -341,13 +455,15 @@ This applies to everything: project rules ("no mocks in tests"), workflow
 preferences ("fewer PRs, don't over-split"), corrections, new policies.
 Immediate write to the daily file, and to MEMORY.md if it's a standing rule.
-### PII-Aware Output Routing
+### Sensitive Output Routing
 A lesson learned the hard way: **the audience determines what you can say, not
 who asked.** If the human asks for a medication status report in a group
 channel, the agent can't just dump it there — other people can read it. The
-rule: if the output would contain PII and the channel isn't private, redirect to
+rule: if the output would contain sensitive information (PII, secrets,
-DM and reply in-channel with "sent privately."
+credentials, API keys, operational details like flight numbers, locations,
 travel plans, medical info, etc.) and the channel isn't private, redirect to DM
 and reply in-channel with "sent privately."
 This is enforced at multiple levels:
@@ -378,7 +494,7 @@ The heartbeat handles:
 - Periodic memory maintenance
 State tracking in `memory/heartbeat-state.json` prevents redundant checks (e.g.,
-don't re-check email if you checked 10 minutes ago).
+don't re-check notifications if you checked 10 minutes ago).
 The key output rule: heartbeats should either be `HEARTBEAT_OK` (nothing to do)
 or a direct alert. Work narration goes to a designated status channel, never to
@@ -1390,7 +1506,8 @@ stay quiet.
 ## Inbox Check (PRIORITY)
-(check notifications, issues, emails — whatever applies)
+(check whatever notification sources apply to your setup — e.g. Gitea
 notifications, emails, issue trackers)
 ## Flight Prep Blocks (daily)
@@ -1424,10 +1541,9 @@ Never send internal thinking or status narration to user's DM. Output should be:
 ```json
 {
    "lastChecks": {
-        "email": 1703275200,
+        "gitea": 1703280000,
        "calendar": 1703260800,
-        "weather": null,
+        "weather": null
        "gitea": 1703280000
    },
    "lastWeeklyDocsReview": "2026-02-24"
 }
@@ -1596,21 +1712,24 @@ Never lose a rule or preference your human states:
 ---
-## PII Output Routing — Audience-Aware Responses
+## Sensitive Output Routing — Audience-Aware Responses
 A critical security pattern: **the audience determines what you can say, not who
-asked.** If your human asks for a sitrep (or any PII-containing info) in a group
+asked.** If your human asks for a sitrep (or any sensitive info) in a group
 channel, you can't just dump it there — other people can read it.
 ### AGENTS.md / checklist prompt:
 ```markdown
-## PII Output Routing (CRITICAL)
+## Sensitive Output Routing (CRITICAL)
- NEVER output PII in any non-private channel, even if your human asks for it
+- NEVER output sensitive information in any non-private channel, even if your
- If a request would produce PII (medication status, travel details, financial
+  human asks for it
-  info, etc.) in a shared channel: send the response via DM instead, and reply
+- This includes: PII, secrets, credentials, API keys, and sensitive operational
-  in-channel with "sent privately"
+  information (flight numbers/times/dates, locations, travel plans, medical
  info, financial details, etc.)
 - If a request would produce any of the above in a shared channel: send the
  response via DM instead, and reply in-channel with "sent privately"
 - The rule is: the audience determines what you can say, not who asked
 - This applies to: group chats, public issue trackers, shared Mattermost
  channels, Discord servers — anywhere that isn't a 1:1 DM
@@ -1619,10 +1738,10 @@ channel, you can't just dump it there — other people can read it.
 ### Why this matters:
 This is a real failure mode. If someone asks "sitrep" in a group channel and you
-respond with medication names, partner details, travel dates, and hotel names —
+respond with medication names, partner details, travel dates, hotel names, or
-you just leaked all of that to everyone in the channel. The human asking is
+API credentials — you just leaked all of that to everyone in the channel. The
-authorized to see it; the channel audience is not. Always check WHERE you're
+human asking is authorized to see it; the channel audience is not. Always check
-responding, not just WHO asked.
+WHERE you're responding, not just WHO asked.
 ---
--- a/SETUP_CHECKLIST.md
+++ b/SETUP_CHECKLIST.md
@@ -273,7 +273,8 @@ poll. Structure it like this:
 ## Checks (rotate through these, 2-4 times per day)
- Emails — any urgent unread messages?
+- Notifications — any unread items? (Gitea notifications, emails, or
  whatever inbox sources you've integrated)
 - Calendar — upcoming events in next 24-48h?
 - Open issues/PRs — anything assigned to me?
 - Workspace sync — any uncommitted changes to push?
@@ -336,32 +337,36 @@ Then add a reference to this checklist in the MEMORY.md checklist index.
 Reference:
 https://git.eeqj.de/sneak/clawpub/raw/branch/main/OPENCLAW_TRICKS.md
-(see "PII Output Routing" and "Checklists Over Prose")
+(see "Sensitive Output Routing" and "Checklists Over Prose")
 ```
-### 5.2 PII output routing
+### 5.2 Sensitive output routing
-Prevents leaking private info in shared channels. Paste this to your agent:
+Prevents leaking private info, secrets, and operational details in shared
 channels. Paste this to your agent:
 ```
 Add the following warning banner near the TOP of AGENTS.md (before the
 session startup section):
-**⚠️ NEVER output PII in non-private channels.** If asked for
+**⚠️ NEVER output sensitive information in non-private channels.** This
-PII-containing info (medical, financial, personal) in a shared channel,
+includes PII, secrets, credentials, API keys, and sensitive operational
-send via DM to your human instead.
+information (flight numbers/times/dates, locations, travel plans,
 medical info, etc.). If asked for any of this in a shared channel, send
 via DM to your human instead.
-Also add a PII section to memory/checklist-messaging.md:
+Also add a sensitive-info section to memory/checklist-messaging.md:
-## PII Check (before every message in shared channels)
+## Sensitive Info Check (before every message in shared channels)
-1. Contains names, addresses, medical info, financial info? → DM only
+1. Contains PII (names, addresses, medical info, financial info)? → DM only
-2. Contains login credentials or tokens? → NEVER send, period
+2. Contains secrets, credentials, API keys, or tokens? → NEVER send, period
-3. When in doubt → send via DM
+3. Contains operational details (flight numbers, travel plans, locations)? → DM only
 4. When in doubt → send via DM
 Reference:
 https://git.eeqj.de/sneak/clawpub/raw/branch/main/OPENCLAW_TRICKS.md
-(see "PII-Aware Output Routing")
+(see "Sensitive Output Routing")
 ```
 ### 5.3 Additional checklists
Author	SHA1	Message	Date
clawbot	ccf08cfb67	Merge pull request 'docs: update poller to dispatcher architecture (closes #4)' (#5) from fix/update-poller-docs into main	2026-02-28 16:31:43 +01:00
clawbot	0284ea63c0	docs: update poller to dispatcher architecture (closes #4). Replace flag-file + heartbeat approach with the production dispatcher pattern: poller triages notifications and spawns isolated agents directly via openclaw cron. Adds assignment scan for self-created issues. Response time ~15-60s instead of ~30 min.	2026-02-28 06:29:32 -08:00
Jeffrey Paul	f3e48c6cd4	Merge pull request 'Expand sensitive output routing and make inbox references conditional' (#3) from fix/pii-and-conditional-email into main. Reviewed-on: #3	2026-02-28 15:22:36 +01:00
clawbot	c0d345e767	expand PII routing to cover secrets, credentials, and operational info; make email/inbox references conditional. - Rename 'PII Output Routing' → 'Sensitive Output Routing' throughout - Expand scope to include secrets, credentials, API keys, flight numbers, locations, travel plans, medical info - Replace hardcoded 'Emails' heartbeat check with conditional language ('Notifications — whatever inbox sources you've integrated') - Remove 'email' from heartbeat-state.json example - Update cross-references in SETUP_CHECKLIST.md	2026-02-28 03:40:13 -08:00