Compare commits
5 Commits: add-notifi ... fix/pii-an

| Author | SHA1 | Date |
|---|---|---|
| | c0d345e767 | |
| | 36223ca550 | |
| | f0a2a5eb62 | |
| | 9631535583 | |
| | b0495d5b56 | |
@@ -173,42 +173,62 @@ The landing checklist (triggered automatically after every flight) updates
 location, timezone, nearest airport, and lodging in the daily context file. It
 also checks if any cron jobs have hardcoded timezones that need updating.
 
-### The Gitea Notification Poller
+### Gitea Notification Delivery
 
-OpenClaw has heartbeats, but those are periodic (every ~30min). For Gitea issues
-and PRs, we wanted near-realtime response. The solution: a tiny Python script
-that polls the Gitea notifications API every 2 seconds and wakes the agent via
-OpenClaw's `/hooks/wake` endpoint when new notifications arrive.
+There are two approaches for getting Gitea notifications to your agent,
+depending on your network setup.
+
+#### Option A: Direct Webhooks (VPS / Public Server)
+
+If your OpenClaw instance runs on a VPS or other publicly reachable server, the
+simplest approach is direct webhooks. Run Traefik (or any reverse proxy with
+automatic TLS) on the same server and configure Gitea webhooks to POST directly
+to OpenClaw's webhook endpoint. This is push-based and realtime — notifications
+arrive instantly.
+
+Setup: add a webhook on each Gitea repo (or use an organization-level webhook)
+pointing to `https://your-openclaw-host/hooks/gitea`. OpenClaw handles the rest.
+
+#### Option B: Notification Poller (Local Machine Behind NAT)
+
+If your OpenClaw runs on a dedicated local machine behind NAT (like a home Mac
+or Linux workstation), Gitea can't reach it directly. This is our setup —
+OpenClaw runs on a Mac Studio on a home LAN.
+
+The solution: a lightweight Python script that polls the Gitea notifications API
+every few seconds. When new notifications appear, it writes a flag file that the
+agent checks during heartbeats.
 
 Key design decisions:
 
-- **The poller never marks notifications as read.** That's the agent's job after
-  it processes them. This prevents the poller and agent from racing.
-- **It tracks notification IDs, not counts.** This way it only fires on
-  genuinely new notifications, not re-reads of existing ones.
-- **The wake message tells the agent to route output to Gitea/Mattermost, not to
-  DM.** This prevents chatty notification processing from disturbing the human.
-- **Zero dependencies.** Just Python stdlib (`urllib`, `json`, `time`). Runs
-  anywhere.
+- **The poller never marks notifications as read.** The agent does that after
+  processing. This prevents lost notifications if the agent fails to process.
+- **It tracks notification IDs, not counts.** Only fires on genuinely new
+  notifications, not re-reads of existing ones.
+- **Flag file instead of wake events.** We initially used OpenClaw's
+  `/hooks/wake` endpoint, but wake events target the main (DM) session — any
+  model response during processing leaked to DM as a notification. The flag file
+  approach is processed during heartbeats, where output routing is controlled.
+- **Zero dependencies.** Just Python stdlib. Runs anywhere.
 
-Here's the full source:
+Tradeoff: notifications are processed at heartbeat cadence (~30 min) instead of
+realtime. For code review and issue triage, this is fine.
 
 ```python
 #!/usr/bin/env python3
 """
-Gitea notification poller.
-Polls for unread notifications and wakes OpenClaw when the count
-changes. The AGENT marks notifications as read after processing —
-the poller never marks anything as read.
+Gitea notification poller (flag-file approach).
+Polls for unread notifications and writes a flag file when new ones
+appear. The agent checks this flag during heartbeats and processes
+notifications via the Gitea API directly.
 
 Required env vars:
     GITEA_URL - Gitea instance URL
     GITEA_TOKEN - Gitea API token
-    HOOK_TOKEN - OpenClaw hooks auth token
 
 Optional env vars:
-    GATEWAY_URL - OpenClaw gateway URL (default: http://127.0.0.1:18789)
-    POLL_DELAY - Delay between polls in seconds (default: 2)
+    FLAG_PATH - Path to flag file (default: workspace/memory/gitea-notify-flag)
+    POLL_DELAY - Delay between polls in seconds (default: 5)
 """
 
 import json
@@ -220,108 +240,61 @@ import urllib.error
 
 GITEA_URL = os.environ.get("GITEA_URL", "").rstrip("/")
 GITEA_TOKEN = os.environ.get("GITEA_TOKEN", "")
-GATEWAY_URL = os.environ.get("GATEWAY_URL", "http://127.0.0.1:18789").rstrip(
-    "/"
+POLL_DELAY = int(os.environ.get("POLL_DELAY", "5"))
+FLAG_PATH = os.environ.get(
+    "FLAG_PATH",
+    os.path.join(
+        os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
+        "memory",
+        "gitea-notify-flag",
+    ),
 )
-HOOK_TOKEN = os.environ.get("HOOK_TOKEN", "")
-POLL_DELAY = int(os.environ.get("POLL_DELAY", "2"))
 
 
 def check_config():
-    missing = []
-    if not GITEA_URL:
-        missing.append("GITEA_URL")
-    if not GITEA_TOKEN:
-        missing.append("GITEA_TOKEN")
-    if not HOOK_TOKEN:
-        missing.append("HOOK_TOKEN")
-    if missing:
-        print(
-            f"ERROR: Missing required env vars: {', '.join(missing)}",
-            file=sys.stderr,
-        )
+    if not GITEA_URL or not GITEA_TOKEN:
+        print("ERROR: GITEA_URL and GITEA_TOKEN required", file=sys.stderr)
         sys.exit(1)
 
 
 def gitea_unread_ids():
-    """Return set of unread notification IDs."""
     req = urllib.request.Request(
         f"{GITEA_URL}/api/v1/notifications?status-types=unread",
         headers={"Authorization": f"token {GITEA_TOKEN}"},
     )
     try:
         with urllib.request.urlopen(req, timeout=10) as resp:
-            notifs = json.loads(resp.read())
-            return {n["id"] for n in notifs}
+            return {n["id"] for n in json.loads(resp.read())}
     except Exception as e:
-        print(
-            f"WARN: Gitea API failed: {e}", file=sys.stderr, flush=True
-        )
+        print(f"WARN: Gitea API failed: {e}", file=sys.stderr, flush=True)
         return set()
 
 
-def wake_openclaw(count):
-    text = (
-        f"[Gitea Notification] {count} new notification(s). "
-        "Check your Gitea notification inbox via API, process them, "
-        "and mark as read when done. "
-        "Route all output to Gitea comments or Mattermost #git/#claw. "
-        "Do NOT reply to this session — respond with NO_REPLY."
-    )
-    payload = json.dumps({"text": text, "mode": "now"}).encode()
-    req = urllib.request.Request(
-        f"{GATEWAY_URL}/hooks/wake",
-        data=payload,
-        headers={
-            "Authorization": f"Bearer {HOOK_TOKEN}",
-            "Content-Type": "application/json",
-        },
-        method="POST",
-    )
-    try:
-        with urllib.request.urlopen(req, timeout=5) as resp:
-            status = resp.status
-        print(f" Wake responded: {status}", flush=True)
-        return True
-    except Exception as e:
-        print(
-            f"WARN: Failed to wake OpenClaw: {e}",
-            file=sys.stderr,
-            flush=True,
-        )
-        return False
+def write_flag(count):
+    os.makedirs(os.path.dirname(FLAG_PATH), exist_ok=True)
+    with open(FLAG_PATH, "w") as f:
+        f.write(json.dumps({
+            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
+            "count": count,
+        }))
 
 
 def main():
     check_config()
-    print(
-        f"Gitea notification poller started (delay={POLL_DELAY}s)",
-        flush=True,
-    )
-
+    print(f"Gitea poller started (delay={POLL_DELAY}s, flag={FLAG_PATH})", flush=True)
     last_seen_ids = gitea_unread_ids()
-    print(
-        f"Initial unread: {len(last_seen_ids)} notification(s)", flush=True
-    )
+    print(f"Initial unread: {len(last_seen_ids)}", flush=True)
 
     while True:
         time.sleep(POLL_DELAY)
-
         current_ids = gitea_unread_ids()
         new_ids = current_ids - last_seen_ids
-
         if not new_ids:
             last_seen_ids = current_ids
             continue
-
         ts = time.strftime("%H:%M:%S")
-        print(
-            f"[{ts}] {len(new_ids)} new notification(s) "
-            f"({len(current_ids)} total unread), waking agent",
-            flush=True,
-        )
-
-        wake_openclaw(len(new_ids))
+        print(f"[{ts}] {len(new_ids)} new ({len(current_ids)} total), flag written", flush=True)
+        write_flag(len(new_ids))
         last_seen_ids = current_ids
 
 
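The poller in this hunk only writes the flag; the consuming side is not part of the diff. As a rough sketch of what a heartbeat step could do with it, assuming the JSON shape produced by `write_flag` (`ts`, `count`): read the file, remove it, then query Gitea for unread notifications. The helper name `consume_flag` and the delete-after-read behavior are assumptions for illustration, not something this change defines.

```python
import json
import os


def consume_flag(flag_path):
    """Return the flag payload as a dict, or None if no flag is pending.

    Hypothetical helper: the flag is removed after it is read so the next
    heartbeat does not act on the same signal twice.
    """
    try:
        with open(flag_path) as f:
            payload = json.load(f)
    except FileNotFoundError:
        return None  # nothing new since the last heartbeat
    except json.JSONDecodeError:
        # A partially written or corrupt flag still means "something is pending".
        payload = {"ts": None, "count": None}
    os.remove(flag_path)
    return payload
```

Since the agent re-reads the unread list from the Gitea API itself, the flag acts only as a hint that something is pending. A flag overwritten between the read and the delete should therefore not lose notifications.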
@@ -368,13 +341,15 @@ This applies to everything: project rules ("no mocks in tests"), workflow
 preferences ("fewer PRs, don't over-split"), corrections, new policies.
 Immediate write to the daily file, and to MEMORY.md if it's a standing rule.
 
-### PII-Aware Output Routing
+### Sensitive Output Routing
 
 A lesson learned the hard way: **the audience determines what you can say, not
 who asked.** If the human asks for a medication status report in a group
 channel, the agent can't just dump it there — other people can read it. The
-rule: if the output would contain PII and the channel isn't private, redirect to
-DM and reply in-channel with "sent privately."
+rule: if the output would contain sensitive information (PII, secrets,
+credentials, API keys, operational details like flight numbers, locations,
+travel plans, medical info, etc.) and the channel isn't private, redirect to DM
+and reply in-channel with "sent privately."
 
 This is enforced at multiple levels:
 
@@ -405,7 +380,7 @@ The heartbeat handles:
 - Periodic memory maintenance
 
 State tracking in `memory/heartbeat-state.json` prevents redundant checks (e.g.,
-don't re-check email if you checked 10 minutes ago).
+don't re-check notifications if you checked 10 minutes ago).
 
 The key output rule: heartbeats should either be `HEARTBEAT_OK` (nothing to do)
 or a direct alert. Work narration goes to a designated status channel, never to
@@ -1417,7 +1392,8 @@ stay quiet.
 
 ## Inbox Check (PRIORITY)
 
-(check notifications, issues, emails — whatever applies)
+(check whatever notification sources apply to your setup — e.g. Gitea
+notifications, emails, issue trackers)
 
 ## Flight Prep Blocks (daily)
 
@@ -1451,10 +1427,9 @@ Never send internal thinking or status narration to user's DM. Output should be:
 ```json
 {
   "lastChecks": {
-    "email": 1703275200,
+    "gitea": 1703280000,
     "calendar": 1703260800,
-    "weather": null,
-    "gitea": 1703280000
+    "weather": null
   },
   "lastWeeklyDocsReview": "2026-02-24"
 }
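To make the role of `lastChecks` concrete, here is a minimal sketch of how a heartbeat could use it to skip redundant checks. It assumes the values are Unix timestamps in seconds and that `null` means "never checked", as the example state suggests. The function names `is_due` and `mark_checked` and the 600-second interval in the usage note are hypothetical.

```python
import time


def is_due(state, check, min_interval_s, now=None):
    """Return True if `check` never ran or last ran at least min_interval_s ago."""
    now = time.time() if now is None else now
    last = state.get("lastChecks", {}).get(check)
    return last is None or now - last >= min_interval_s


def mark_checked(state, check, now=None):
    """Record the time `check` ran; mutates and returns the same state dict."""
    stamp = int(time.time() if now is None else now)
    state.setdefault("lastChecks", {})[check] = stamp
    return state
```

A heartbeat would call `is_due(state, "gitea", 600)` before polling and `mark_checked(state, "gitea")` afterwards, then write the state back to `memory/heartbeat-state.json`.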
@@ -1623,21 +1598,24 @@ Never lose a rule or preference your human states:
 
 ---
 
-## PII Output Routing — Audience-Aware Responses
+## Sensitive Output Routing — Audience-Aware Responses
 
 A critical security pattern: **the audience determines what you can say, not who
-asked.** If your human asks for a sitrep (or any PII-containing info) in a group
+asked.** If your human asks for a sitrep (or any sensitive info) in a group
 channel, you can't just dump it there — other people can read it.
 
 ### AGENTS.md / checklist prompt:
 
 ```markdown
-## PII Output Routing (CRITICAL)
+## Sensitive Output Routing (CRITICAL)
 
-- NEVER output PII in any non-private channel, even if your human asks for it
-- If a request would produce PII (medication status, travel details, financial
-  info, etc.) in a shared channel: send the response via DM instead, and reply
-  in-channel with "sent privately"
+- NEVER output sensitive information in any non-private channel, even if your
+  human asks for it
+- This includes: PII, secrets, credentials, API keys, and sensitive operational
+  information (flight numbers/times/dates, locations, travel plans, medical
+  info, financial details, etc.)
+- If a request would produce any of the above in a shared channel: send the
+  response via DM instead, and reply in-channel with "sent privately"
 - The rule is: the audience determines what you can say, not who asked
 - This applies to: group chats, public issue trackers, shared Mattermost
   channels, Discord servers — anywhere that isn't a 1:1 DM
@@ -1646,10 +1624,10 @@ channel, you can't just dump it there — other people can read it.
 ### Why this matters:
 
 This is a real failure mode. If someone asks "sitrep" in a group channel and you
-respond with medication names, partner details, travel dates, and hotel names —
-you just leaked all of that to everyone in the channel. The human asking is
-authorized to see it; the channel audience is not. Always check WHERE you're
-responding, not just WHO asked.
+respond with medication names, partner details, travel dates, hotel names, or
+API credentials — you just leaked all of that to everyone in the channel. The
+human asking is authorized to see it; the channel audience is not. Always check
+WHERE you're responding, not just WHO asked.
 
 ---
 
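The routing rule in these hunks is stated as a prompt, not as code. Purely as an illustration of the decision it describes, the sketch below assumes the caller already knows whether the output is sensitive and whether the current channel is a 1:1 DM. The `Reply` type and `route_reply` function are hypothetical names, and detecting sensitive content is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    target: str  # "channel" = where the request was made, "dm" = private message
    text: str


def route_reply(text, contains_sensitive, channel_is_private):
    """Decide where a response goes based on the audience, not on who asked."""
    if contains_sensitive and not channel_is_private:
        # Shared audience: deliver privately and leave only a short notice.
        return [Reply("dm", text), Reply("channel", "sent privately")]
    # Either nothing sensitive, or the channel is already a 1:1 DM.
    return [Reply("channel", text)]
```

The point of the shape is that the requester's identity never appears as an input: only the content and the audience decide the route.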