5 Commits

Author SHA1 Message Date
clawbot
c0d345e767 expand PII routing to cover secrets, credentials, and operational info; make email/inbox references conditional
All checks were successful
check / check (push) Successful in 12s
- Rename 'PII Output Routing' → 'Sensitive Output Routing' throughout
- Expand scope to include secrets, credentials, API keys, flight numbers,
  locations, travel plans, medical info
- Replace hardcoded 'Emails' heartbeat check with conditional language
  ('Notifications — whatever inbox sources you've integrated')
- Remove 'email' from heartbeat-state.json example
- Update cross-references in SETUP_CHECKLIST.md
2026-02-28 03:40:13 -08:00
user
36223ca550 fix: agent should infer needed fields, not wait to be told
All checks were successful
check / check (push) Successful in 12s
2026-02-28 03:33:08 -08:00
user
f0a2a5eb62 docs: update Gitea notification section — webhook vs poller, flag-file approach
Some checks are pending
check / check (push) Waiting to run
- Replaced wake-event poller with flag-file approach (prevents DM spam)
- Added Option A (webhooks for VPS) vs Option B (poller for NAT)
- Documented the wake-event failure mode and why we switched
2026-02-28 03:30:49 -08:00
9631535583 Merge pull request 'Rewrite SETUP_CHECKLIST.md: replace checklists with paste-able agent prompts' (#1) from rewrite-setup-checklist-prompts into main
Some checks are pending
check / check (push) Waiting to run
2026-02-28 12:27:17 +01:00
user
b0495d5b56 rewrite SETUP_CHECKLIST.md: replace checklist items with paste-able agent prompts
All checks were successful
check / check (push) Successful in 13s
Each section now contains a self-contained prompt in a code block that
adopting users can paste directly to their agent. Prompts include full
URLs to raw reference docs. Fixes 'you provide' wording to 'your human
provides'. Keeps same phase/section structure.
2026-02-28 03:22:08 -08:00
2 changed files with 718 additions and 534 deletions

@@ -173,42 +173,62 @@ The landing checklist (triggered automatically after every flight) updates
 location, timezone, nearest airport, and lodging in the daily context file. It
 also checks if any cron jobs have hardcoded timezones that need updating.
-### The Gitea Notification Poller
-OpenClaw has heartbeats, but those are periodic (every ~30min). For Gitea issues
-and PRs, we wanted near-realtime response. The solution: a tiny Python script
-that polls the Gitea notifications API every 2 seconds and wakes the agent via
-OpenClaw's `/hooks/wake` endpoint when new notifications arrive.
+### Gitea Notification Delivery
+There are two approaches for getting Gitea notifications to your agent,
+depending on your network setup.
+#### Option A: Direct Webhooks (VPS / Public Server)
+If your OpenClaw instance runs on a VPS or other publicly reachable server, the
+simplest approach is direct webhooks. Run Traefik (or any reverse proxy with
+automatic TLS) on the same server and configure Gitea webhooks to POST directly
+to OpenClaw's webhook endpoint. This is push-based and realtime — notifications
+arrive instantly.
+Setup: add a webhook on each Gitea repo (or use an organization-level webhook)
+pointing to `https://your-openclaw-host/hooks/gitea`. OpenClaw handles the rest.
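As an illustration of the Option A setup step (not part of the committed change), here is a minimal sketch that builds the Gitea API request for registering a repo webhook. It assumes Gitea's `POST /api/v1/repos/{owner}/{repo}/hooks` endpoint; the helper name `build_hook_request`, the example host, and the chosen event list are hypothetical and should be adjusted to your instance. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_hook_request(gitea_url, token, owner, repo, target_url):
    """Build (but do not send) a request that registers a repo webhook."""
    payload = {
        "type": "gitea",
        "active": True,
        # Assumed event selection; pick the events your agent should see.
        "events": ["issues", "issue_comment", "pull_request"],
        "config": {"url": target_url, "content_type": "json"},
    }
    return urllib.request.Request(
        f"{gitea_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/hooks",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; the target URL is the one named in the diff above.
req = build_hook_request(
    "https://gitea.example.com",
    "TOKEN",
    "owner",
    "repo",
    "https://your-openclaw-host/hooks/gitea",
)
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the registration. An organization-level webhook would use a different API path, which this sketch does not cover.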
+#### Option B: Notification Poller (Local Machine Behind NAT)
+If your OpenClaw runs on a dedicated local machine behind NAT (like a home Mac
+or Linux workstation), Gitea can't reach it directly. This is our setup —
+OpenClaw runs on a Mac Studio on a home LAN.
+The solution: a lightweight Python script that polls the Gitea notifications API
+every few seconds. When new notifications appear, it writes a flag file that the
+agent checks during heartbeats.
 Key design decisions:
-- **The poller never marks notifications as read.** That's the agent's job after
-  it processes them. This prevents the poller and agent from racing.
-- **It tracks notification IDs, not counts.** This way it only fires on
-  genuinely new notifications, not re-reads of existing ones.
-- **The wake message tells the agent to route output to Gitea/Mattermost, not to
-  DM.** This prevents chatty notification processing from disturbing the human.
-- **Zero dependencies.** Just Python stdlib (`urllib`, `json`, `time`). Runs
-  anywhere.
-Here's the full source:
+- **The poller never marks notifications as read.** The agent does that after
+  processing. This prevents lost notifications if the agent fails to process.
+- **It tracks notification IDs, not counts.** Only fires on genuinely new
+  notifications, not re-reads of existing ones.
+- **Flag file instead of wake events.** We initially used OpenClaw's
+  `/hooks/wake` endpoint, but wake events target the main (DM) session — any
+  model response during processing leaked to DM as a notification. The flag file
+  approach is processed during heartbeats, where output routing is controlled.
+- **Zero dependencies.** Just Python stdlib. Runs anywhere.
+Tradeoff: notifications are processed at heartbeat cadence (~30 min) instead of
+realtime. For code review and issue triage, this is fine.
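The diff shows only the poller side of the flag-file approach. As a rough sketch of the consuming side (not part of the committed change), the heartbeat could read and clear the flag as below. The function name `consume_flag` is hypothetical, and the sketch assumes the flag holds the JSON object with `ts` and `count` fields that the poller's `write_flag` produces.

```python
import json
import os
import tempfile


def consume_flag(flag_path):
    """Return the flag contents and delete the file, or None if no flag exists."""
    try:
        with open(flag_path) as f:
            flag = json.load(f)
    except FileNotFoundError:
        return None  # nothing new since the last heartbeat
    except json.JSONDecodeError:
        # Flag exists but is unreadable (e.g. caught mid-write): still a signal.
        flag = {"ts": None, "count": None}
    os.remove(flag_path)
    return flag


# Demo with a temporary flag file standing in for memory/gitea-notify-flag.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "gitea-notify-flag")
    with open(path, "w") as f:
        json.dump({"ts": "2026-02-28T11:30:00Z", "count": 2}, f)
    first = consume_flag(path)   # flag present: returned and removed
    second = consume_flag(path)  # flag already cleared
print(first, second)
```

Clearing the flag before querying the Gitea notifications API means a notification that arrives during processing re-creates the flag and is picked up on the next heartbeat.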
 ```python
 #!/usr/bin/env python3
 """
-Gitea notification poller.
-Polls for unread notifications and wakes OpenClaw when the count
-changes. The AGENT marks notifications as read after processing —
-the poller never marks anything as read.
+Gitea notification poller (flag-file approach).
+Polls for unread notifications and writes a flag file when new ones
+appear. The agent checks this flag during heartbeats and processes
+notifications via the Gitea API directly.
 Required env vars:
     GITEA_URL - Gitea instance URL
     GITEA_TOKEN - Gitea API token
-    HOOK_TOKEN - OpenClaw hooks auth token
 Optional env vars:
-    GATEWAY_URL - OpenClaw gateway URL (default: http://127.0.0.1:18789)
-    POLL_DELAY - Delay between polls in seconds (default: 2)
+    FLAG_PATH - Path to flag file (default: workspace/memory/gitea-notify-flag)
+    POLL_DELAY - Delay between polls in seconds (default: 5)
 """
 import json
@@ -220,108 +240,61 @@ import urllib.error
 GITEA_URL = os.environ.get("GITEA_URL", "").rstrip("/")
 GITEA_TOKEN = os.environ.get("GITEA_TOKEN", "")
-GATEWAY_URL = os.environ.get("GATEWAY_URL", "http://127.0.0.1:18789").rstrip(
-    "/"
+POLL_DELAY = int(os.environ.get("POLL_DELAY", "5"))
+FLAG_PATH = os.environ.get(
+    "FLAG_PATH",
+    os.path.join(
+        os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
+        "memory",
+        "gitea-notify-flag",
+    ),
 )
-HOOK_TOKEN = os.environ.get("HOOK_TOKEN", "")
-POLL_DELAY = int(os.environ.get("POLL_DELAY", "2"))
 def check_config():
-    missing = []
-    if not GITEA_URL:
-        missing.append("GITEA_URL")
-    if not GITEA_TOKEN:
-        missing.append("GITEA_TOKEN")
-    if not HOOK_TOKEN:
-        missing.append("HOOK_TOKEN")
-    if missing:
-        print(
-            f"ERROR: Missing required env vars: {', '.join(missing)}",
-            file=sys.stderr,
-        )
+    if not GITEA_URL or not GITEA_TOKEN:
+        print("ERROR: GITEA_URL and GITEA_TOKEN required", file=sys.stderr)
         sys.exit(1)
 def gitea_unread_ids():
-    """Return set of unread notification IDs."""
     req = urllib.request.Request(
         f"{GITEA_URL}/api/v1/notifications?status-types=unread",
         headers={"Authorization": f"token {GITEA_TOKEN}"},
     )
     try:
         with urllib.request.urlopen(req, timeout=10) as resp:
-            notifs = json.loads(resp.read())
-            return {n["id"] for n in notifs}
+            return {n["id"] for n in json.loads(resp.read())}
     except Exception as e:
-        print(
-            f"WARN: Gitea API failed: {e}", file=sys.stderr, flush=True
-        )
+        print(f"WARN: Gitea API failed: {e}", file=sys.stderr, flush=True)
         return set()
-def wake_openclaw(count):
-    text = (
-        f"[Gitea Notification] {count} new notification(s). "
-        "Check your Gitea notification inbox via API, process them, "
-        "and mark as read when done. "
-        "Route all output to Gitea comments or Mattermost #git/#claw. "
-        "Do NOT reply to this session — respond with NO_REPLY."
-    )
-    payload = json.dumps({"text": text, "mode": "now"}).encode()
-    req = urllib.request.Request(
-        f"{GATEWAY_URL}/hooks/wake",
-        data=payload,
-        headers={
-            "Authorization": f"Bearer {HOOK_TOKEN}",
-            "Content-Type": "application/json",
-        },
-        method="POST",
-    )
-    try:
-        with urllib.request.urlopen(req, timeout=5) as resp:
-            status = resp.status
-            print(f" Wake responded: {status}", flush=True)
-            return True
-    except Exception as e:
-        print(
-            f"WARN: Failed to wake OpenClaw: {e}",
-            file=sys.stderr,
-            flush=True,
-        )
-        return False
+def write_flag(count):
+    os.makedirs(os.path.dirname(FLAG_PATH), exist_ok=True)
+    with open(FLAG_PATH, "w") as f:
+        f.write(json.dumps({
+            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
+            "count": count,
+        }))
 def main():
     check_config()
-    print(
-        f"Gitea notification poller started (delay={POLL_DELAY}s)",
-        flush=True,
-    )
+    print(f"Gitea poller started (delay={POLL_DELAY}s, flag={FLAG_PATH})", flush=True)
     last_seen_ids = gitea_unread_ids()
-    print(
-        f"Initial unread: {len(last_seen_ids)} notification(s)", flush=True
-    )
+    print(f"Initial unread: {len(last_seen_ids)}", flush=True)
     while True:
         time.sleep(POLL_DELAY)
         current_ids = gitea_unread_ids()
         new_ids = current_ids - last_seen_ids
         if not new_ids:
             last_seen_ids = current_ids
             continue
         ts = time.strftime("%H:%M:%S")
-        print(
-            f"[{ts}] {len(new_ids)} new notification(s) "
-            f"({len(current_ids)} total unread), waking agent",
-            flush=True,
-        )
-        wake_openclaw(len(new_ids))
+        print(f"[{ts}] {len(new_ids)} new ({len(current_ids)} total), flag written", flush=True)
+        write_flag(len(new_ids))
         last_seen_ids = current_ids
@@ -368,13 +341,15 @@ This applies to everything: project rules ("no mocks in tests"), workflow
 preferences ("fewer PRs, don't over-split"), corrections, new policies.
 Immediate write to the daily file, and to MEMORY.md if it's a standing rule.
-### PII-Aware Output Routing
+### Sensitive Output Routing
 A lesson learned the hard way: **the audience determines what you can say, not
 who asked.** If the human asks for a medication status report in a group
 channel, the agent can't just dump it there — other people can read it. The
-rule: if the output would contain PII and the channel isn't private, redirect to
-DM and reply in-channel with "sent privately."
+rule: if the output would contain sensitive information (PII, secrets,
+credentials, API keys, operational details like flight numbers, locations,
+travel plans, medical info, etc.) and the channel isn't private, redirect to DM
+and reply in-channel with "sent privately."
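The routing rule above can be sketched as a small decision function (an illustration, not part of the committed change). The names `Delivery` and `route_reply` are hypothetical, and deciding whether a reply is sensitive is assumed to happen elsewhere; the sketch only shows that the destination depends on the audience, not on who asked.

```python
from dataclasses import dataclass


@dataclass
class Delivery:
    destination: str  # "channel" or "dm"
    text: str


def route_reply(text, channel_is_private, is_sensitive):
    """Pick delivery targets based on who can read the channel."""
    if is_sensitive and not channel_is_private:
        # Redirect the content to DM and leave only a pointer in the channel.
        return [Delivery("dm", text), Delivery("channel", "sent privately")]
    return [Delivery("channel", text)]


shared = route_reply("medication status: ...", channel_is_private=False, is_sensitive=True)
private = route_reply("medication status: ...", channel_is_private=True, is_sensitive=True)
print(shared)
print(private)
```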
 This is enforced at multiple levels:
@@ -405,7 +380,7 @@ The heartbeat handles:
 - Periodic memory maintenance
 State tracking in `memory/heartbeat-state.json` prevents redundant checks (e.g.,
-don't re-check email if you checked 10 minutes ago).
+don't re-check notifications if you checked 10 minutes ago).
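A minimal sketch of how such a state check might look (not part of the committed change). It assumes the `lastChecks` values are Unix epoch seconds with `null` meaning "never checked", as the example state file in this diff suggests; the helper name `is_due` and the 30-minute interval are hypothetical.

```python
import time


def is_due(state, check, min_interval_s, now=None):
    """Return True if `check` has never run or last ran at least min_interval_s ago."""
    last = state.get("lastChecks", {}).get(check)
    if last is None:
        return True
    now = time.time() if now is None else now
    return now - last >= min_interval_s


state = {"lastChecks": {"gitea": 1703280000, "calendar": 1703260800, "weather": None}}
now = 1703280000 + 600  # ten minutes after the last gitea check

gitea_due = is_due(state, "gitea", 1800, now)        # checked 10 minutes ago
calendar_due = is_due(state, "calendar", 1800, now)  # checked hours ago
weather_due = is_due(state, "weather", 1800, now)    # never checked
print(gitea_due, calendar_due, weather_due)
```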
 The key output rule: heartbeats should either be `HEARTBEAT_OK` (nothing to do)
 or a direct alert. Work narration goes to a designated status channel, never to
@@ -1417,7 +1392,8 @@ stay quiet.
 ## Inbox Check (PRIORITY)
-(check notifications, issues, emails — whatever applies)
+(check whatever notification sources apply to your setup — e.g. Gitea
+notifications, emails, issue trackers)
 ## Flight Prep Blocks (daily)
@@ -1451,10 +1427,9 @@ Never send internal thinking or status narration to user's DM. Output should be:
 ```json
 {
   "lastChecks": {
-    "email": 1703275200,
+    "gitea": 1703280000,
     "calendar": 1703260800,
-    "weather": null,
-    "gitea": 1703280000
+    "weather": null
   },
   "lastWeeklyDocsReview": "2026-02-24"
 }
@@ -1482,51 +1457,9 @@ Never send internal thinking or status narration to user's DM. Output should be:
 ## Gitea Integration & Notification Polling
-For self-hosted Gitea instances, you need a way to deliver notifications (issue
-assignments, PR reviews, @-mentions) to your agent. There are two approaches,
-depending on your network setup.
-### Notification Delivery: Webhooks vs Polling
-#### 1. Direct webhooks (VPS / public server)
-If your OpenClaw instance runs on a VPS or other publicly reachable server, you
-can run Traefik (or any reverse proxy) on the same server and configure Gitea
-webhooks to POST directly to OpenClaw's webhook endpoint. This is push-based and
-realtime — notifications arrive instantly.
-Set up a Gitea webhook (per-repo or org-wide) pointing at your OpenClaw
-instance's `/hooks/wake` endpoint. Gitea sends a POST on every event, and the
-agent wakes immediately to process it.
-#### 2. Notification poller (local machine behind NAT)
-If your OpenClaw instance runs on a dedicated local machine behind NAT (like a
-home Mac or Linux box), Gitea can't reach it directly. In this case, use a
-lightweight polling script that checks the Gitea notifications API every few
-seconds and signals the agent when new notifications arrive.
-This is the approach we use — OpenClaw runs on a dedicated Mac Studio on a home
-LAN, so we poll Gitea's notification API and wake the agent via the local
-`/hooks/wake` endpoint when new notifications appear. The poller script is
-included below in the [Notification poller](#notification-poller) section.
-The poller approach trades ~30 seconds of latency (polling interval) for
-simplicity and no NAT/firewall configuration. For most workflows this is
-perfectly fine — code review and issue triage don't need sub-second response
-times. If no new notifications arrive between heartbeats, the effective latency
-is bounded by the heartbeat interval (~30 minutes), but in practice the poller
-catches most events within seconds.
-#### Which should you choose?
-| Factor | Webhooks | Poller |
-| ------------------- | ------------------- | ------------------------- |
-| Network requirement | Public IP / domain | None (outbound-only) |
-| Latency | Instant | ~2-30s (polling interval) |
-| Setup complexity | Reverse proxy + TLS | Single background script |
-| Dependencies | Traefik/nginx/Caddy | Python stdlib only |
-| Best for | VPS / cloud deploys | Home LAN / NAT setups |
+For self-hosted Gitea instances, you can set up a notification poller that
+injects Gitea events (issue assignments, PR reviews, @-mentions) into the
+agent's session.
 ### Workflow rules (HEARTBEAT.md / AGENTS.md):
@@ -1665,21 +1598,24 @@ Never lose a rule or preference your human states:
 ---
-## PII Output Routing — Audience-Aware Responses
+## Sensitive Output Routing — Audience-Aware Responses
 A critical security pattern: **the audience determines what you can say, not who
-asked.** If your human asks for a sitrep (or any PII-containing info) in a group
+asked.** If your human asks for a sitrep (or any sensitive info) in a group
 channel, you can't just dump it there — other people can read it.
 ### AGENTS.md / checklist prompt:
 ```markdown
-## PII Output Routing (CRITICAL)
-- NEVER output PII in any non-private channel, even if your human asks for it
-- If a request would produce PII (medication status, travel details, financial
-  info, etc.) in a shared channel: send the response via DM instead, and reply
-  in-channel with "sent privately"
+## Sensitive Output Routing (CRITICAL)
+- NEVER output sensitive information in any non-private channel, even if your
+  human asks for it
+- This includes: PII, secrets, credentials, API keys, and sensitive operational
+  information (flight numbers/times/dates, locations, travel plans, medical
+  info, financial details, etc.)
+- If a request would produce any of the above in a shared channel: send the
+  response via DM instead, and reply in-channel with "sent privately"
 - The rule is: the audience determines what you can say, not who asked
 - This applies to: group chats, public issue trackers, shared Mattermost
   channels, Discord servers — anywhere that isn't a 1:1 DM
@@ -1688,10 +1624,10 @@ channel, you can't just dump it there — other people can read it.
 ### Why this matters:
 This is a real failure mode. If someone asks "sitrep" in a group channel and you
-respond with medication names, partner details, travel dates, and hotel names
-you just leaked all of that to everyone in the channel. The human asking is
-authorized to see it; the channel audience is not. Always check WHERE you're
-responding, not just WHO asked.
+respond with medication names, partner details, travel dates, hotel names, or
+API credentials — you just leaked all of that to everyone in the channel. The
+human asking is authorized to see it; the channel audience is not. Always check
+WHERE you're responding, not just WHO asked.
 ---

File diff suppressed because it is too large