# Automated Development Systems

How we connect self-hosted Gitea, Mattermost, CI, and a lightweight PaaS into a
continuous development pipeline where code goes from PR to production with
minimal human intervention.

---

## The Hub: Self-Hosted Gitea

[Gitea](https://gitea.io) is the coordination center. Every repo, issue, PR, and
code review lives here. We self-host it because:

- Full API access for automation (the agent has its own Gitea user with API
  token)
- Gitea Actions for CI (compatible with GitHub Actions syntax)
- Webhooks on every repo event
- No vendor lock-in, no rate limits, no surprise pricing changes
- Complete control over visibility (public/private per-repo)

The agent (OpenClaw) has its own Gitea account (`clawbot`) separate from the
human user. This matters because:

- The agent's commits, comments, and PR reviews are clearly attributed
- The human can @-mention the agent in issues to assign work
- Assignment is unambiguous: issues assigned to `clawbot` are the agent's
  queue, issues assigned to the human are theirs
- The agent can be given write access per-repo (some repos it can push to
  directly, others it must fork and PR)
- API rate limits and permissions are independent

### Issues as the To-Do List

Gitea issues aren't just bug reports; they're the universal task queue. Every
piece of work, from feature requests to personal errands, lives as an issue:

- **Issue = task.** If it needs doing, it's an issue. No separate task manager,
  no Notion boards, no sticky notes.
- **Assignment = ownership.** An issue assigned to `clawbot` means the agent is
  responsible for the next step. An issue assigned to the human means it's in
  their queue. An unassigned issue is unclaimed work.
- **Assignment as coordination flag.** When the agent finishes its part of a
  task but needs human input, it unassigns itself and assigns the human. When
  the human delegates work, they assign the agent. The assignment field IS the
  handoff mechanism, so there is no need for separate status updates or "hey,
  this is ready for you" messages.
- **Close = done.** When an issue is closed, the work is complete. PRs reference
  issues with "(closes #N)" so merging the PR automatically closes the issue.

This means you can answer "what's on my plate?" with a single API call per repo:
`GET /repos/{owner}/{repo}/issues?assigned_by={username}&state=open`. The agent
does this during every sitrep and heartbeat.

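As a rough illustration of that query, the sketch below only builds the request URL. The host name and the `open_issues_url` helper are hypothetical, and it assumes Gitea's `assigned_by` filter on the repo issue list endpoint.

```python
from urllib.parse import urlencode

# Hypothetical instance URL for illustration; substitute your own.
GITEA_API = "https://gitea.example.com/api/v1"


def open_issues_url(owner: str, repo: str, assignee: str) -> str:
    """Build the URL listing open issues assigned to `assignee` in one repo."""
    # Assumption: the list endpoint filters by assignee via `assigned_by`.
    query = urlencode({"state": "open", "assigned_by": assignee})
    return f"{GITEA_API}/repos/{owner}/{repo}/issues?{query}"


print(open_issues_url("sneak", "todo", "clawbot"))
```

Because the endpoint is per repo, a sitrep would presumably loop over the watched repos and merge the results.
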
For personal task management, we even have a dedicated issues-only repo (no
code) that serves as a general to-do list. The agent reviews open issues there,
suggests ways to complete tasks with minimal effort, and proactively comments
with research or offers to handle items directly.

### The Issue → PR → Deploy Lifecycle

Issues drive the entire pipeline:

1. Issue filed (by human or agent)
2. Assigned to whoever should work on it
3. Work happens on a feature branch
4. PR created with "(closes #N)" in the title
5. PR reviewed, checked, merged
6. Issue auto-closes on merge
7. Code auto-deploys to production (if applicable)

The issue tracker is the single source of truth for what needs doing, who's
doing it, and what's done. Everything else (PRs, branches, deployments) links
back to issues.

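To make the "(closes #N)" link concrete, here is a minimal sketch of how a script might recover the referenced issue numbers from a PR title. The `closed_issues` helper and the keyword list in the regex are assumptions for illustration; Gitea's own keyword matching is configurable and may differ.

```python
import re

# Matches "closes #12", "fixes #7", "resolves #3" and close variants.
# The keyword list is an assumption chosen for this sketch.
CLOSE_REF = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE
)


def closed_issues(pr_title: str) -> list[int]:
    """Return the issue numbers a PR title claims to close, in order."""
    return [int(number) for number in CLOSE_REF.findall(pr_title)]


print(closed_issues("Add retry to webhook sender (closes #42)"))
```

A reviewer agent could use something like this to find the issues it should read alongside the diff.
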
### PR State Machine

Once a PR exists, it enters a finite state machine tracked by Gitea labels. Each
PR has exactly one state label at a time, plus a `bot` label for as long as it's
the agent's turn to act.

#### States (Gitea Labels)

| Label          | Color  | Meaning                                              |
| -------------- | ------ | ---------------------------------------------------- |
| `needs-review` | yellow | Code pushed, `docker build .` passes, awaiting review |
| `needs-rework` | purple | Code review found issues that need fixing            |
| `merge-ready`  | green  | Reviewed clean, build passes, ready for human        |

Earlier iterations included `needs-rebase` and `needs-checks` states, but we
eliminated them. Rebasing is handled inline by workers and reviewers (they
rebase onto the target branch as part of their normal work). And `docker build .`
is the only check: workers run it before pushing and reviewers run it before
approving. There's no separate "checks" phase.

#### The `bot` Label + Assignment Model

The `bot` label signals that an issue or PR is the agent's turn to act. The
assignment field tracks who is actively working on it:

- **`bot` label + unassigned** = work available, poller dispatches an agent
- **`bot` label + assigned to agent** = actively being worked
- **No `bot` label** = not the agent's turn (either human's turn or done)

The notification poller assigns the agent account to the issue at dispatch time,
before the agent session even starts. This prevents race conditions: by the
time a second poller scan runs, the issue is already assigned and gets skipped.

When the agent finishes its step and spawns the next agent, it unassigns itself
first (releasing the lock). The next agent's first action is to verify it's the
only one working on the issue by checking comments for duplicate work.

At chain-end (`merge-ready`): the agent assigns the human and removes the `bot`
label. The human's PR inbox contains only PRs that are genuinely ready to merge.

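The three combinations above amount to a small decision function. The sketch below is one way to express it; the `whose_turn` name, the return strings, and the extra "conflict" case (a `bot` label on something assigned to someone other than the agent) are assumptions added for illustration.

```python
def whose_turn(labels: set[str], assignees: set[str], agent: str = "clawbot") -> str:
    """Classify an issue or PR from its labels and assignees."""
    if "bot" not in labels:
        return "not-agent"    # human's turn, or already done
    if not assignees:
        return "available"    # the poller may dispatch an agent
    if agent in assignees:
        return "in-progress"  # an agent session currently holds the lock
    # Not described in the text: `bot` label but assigned to someone else.
    return "conflict"


print(whose_turn({"bot", "needs-review"}, set()))
```

Keeping this as a pure function of labels and assignees means the poller and the agents can share the same rule.
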
#### Agent Chaining: No Self-Review

Each step in the pipeline is handled by a separate, isolated agent session.
Agents spawn the next agent in the chain via
`openclaw cron add --session isolated`. This enforces a critical rule: **the
agent that wrote the code never reviews it.**

The chain looks like this:

```
Worker agent (writes/fixes code)
  → docker build . → push → label needs-review
  → unassign self → spawn reviewer agent → STOP

Reviewer agent (reviews code it didn't write)
  → read diff + referenced issues → review
  → PASS: rebase if needed → docker build . → label merge-ready
          → assign human → remove bot label → STOP
  → FAIL: comment findings → label needs-rework
          → unassign self → spawn worker agent → STOP
```

The cycle repeats (worker → reviewer → worker → reviewer → ...) until the
reviewer approves. Each agent is a fresh session with no memory of previous
iterations, so it reads the issue comments and PR diff to understand context.

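The worker/reviewer chain can be modeled as a small transition table. This is a sketch under stated assumptions, not the agents' actual logic: the `Step` type, the outcome names ("pushed", "pass", "fail"), and the `run_chain` helper are invented here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str       # state label to put on the PR
    keep_bot: bool   # whether the `bot` label stays on
    next_actor: str  # who acts next: "reviewer", "worker", or "human"


# (role, outcome) -> what the finishing agent does. Outcome names are
# made up for this sketch.
TRANSITIONS = {
    ("worker", "pushed"): Step("needs-review", True, "reviewer"),
    ("reviewer", "pass"): Step("merge-ready", False, "human"),
    ("reviewer", "fail"): Step("needs-rework", True, "worker"),
}


def advance(role: str, outcome: str) -> Step:
    """Look up the next step; reject combinations the chain does not allow."""
    try:
        return TRANSITIONS[(role, outcome)]
    except KeyError:
        raise ValueError(f"no transition for {role!r} + {outcome!r}") from None


def run_chain(review_outcomes: list[str]) -> list[str]:
    """Replay a PR through the loop and return the labels it passes through."""
    labels = []
    for outcome in review_outcomes:
        labels.append(advance("worker", "pushed").label)
        labels.append(advance("reviewer", outcome).label)
    return labels


print(run_chain(["fail", "pass"]))
```

Note that there is no `("worker", "pass")` entry: a worker cannot approve, which is the no-self-review rule expressed as data.
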
#### TOCTOU Protection

Just before changing labels or assignments, agents re-read all comments and
current labels via the API. If the state changed since they started (another
agent already acted), they report the conflict and stop. This prevents stale
agents from overwriting fresh state.

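A minimal sketch of that re-read check, assuming the agent captures a snapshot when it starts and another just before writing. The `Snapshot` type and `safe_to_write` helper are hypothetical, and comparing a comment count rather than full comment contents is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    labels: frozenset[str]
    assignees: frozenset[str]
    comment_count: int  # simplification: a real check might compare comment IDs


def safe_to_write(at_start: Snapshot, just_now: Snapshot) -> bool:
    """True only if nothing changed between session start and the re-read."""
    return at_start == just_now


start = Snapshot(frozenset({"bot", "needs-review"}), frozenset({"clawbot"}), 3)
later = Snapshot(frozenset({"bot", "needs-rework"}), frozenset(), 4)
print(safe_to_write(start, start), safe_to_write(start, later))
```

A check like this narrows the window rather than closing it, since the re-read and the write are still separate API calls.
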
#### Race Detection

If an agent starts and finds its work was already done (e.g., a reviewer sees a
review was already posted, or a worker sees a PR was already created), it
reports to the status channel and stops.

#### The Loop in Practice

A typical PR goes through this cycle:

1. Worker agent creates PR, runs `docker build .`, labels `needs-review`
2. Worker spawns reviewer agent
3. Reviewer reads diff, finds a missing error check → labels `needs-rework`
4. Reviewer spawns worker agent
5. Worker fixes the error check, rebases, runs `docker build .`, labels
   `needs-review`
6. Worker spawns reviewer agent
7. Reviewer reads diff, looks good → rebases → `docker build .` → labels
   `merge-ready`, assigns human
8. Human reviews, merges

Steps 1-7 happen without human involvement. Each step is a separate agent
session that spawns the next one.

#### Safety Net

The notification poller runs a periodic scan (every 2 minutes) of all watched
repos for issues/PRs with the `bot` label that are unassigned. This catches
broken chains: if an agent crashes or times out without spawning the next agent,
the poller will eventually re-dispatch. A 30-minute cooldown prevents duplicate
dispatches during normal operation.

#### Why Labels + Assignments

You could track PR state in a file, a database, or just in the agent's memory.
Labels and assignments are better because:

- **Visible in the Gitea UI.** Anyone can glance at the PR list and see what
  state each PR is in without reading comments.
- **Queryable via API.** "Show me all PRs that need review" is a single API call
  with a label filter.
- **Durable.** Labels survive agent restarts, session timeouts, and context
  loss. The state is in Gitea, not in the agent's head.
- **Human-readable.** Color-coded labels in a PR list give an instant dashboard:
  lots of yellow = review backlog, lots of purple = rework pending, lots of
  green = ready for the human to merge.

#### Branch Protection

The state machine assumes nobody can bypass it by pushing directly to main.
Branch protection rules on Gitea (or GitHub) enforce this at the server level:

- **Require pull request reviews** before merging, so there are no direct
  pushes to main
- **Require CI to pass:** the Docker build (which runs `make check`) must
  succeed before a PR can be merged
- **Block force-pushes:** history is immutable on protected branches
- **Require branches to be up-to-date:** PRs must be rebased before merge

This is the server-side interlock that makes the entire state machine
trustworthy. Without branch protection, an agent could skip the review/check
cycle and push directly to main. With it, the only path to main is through a PR
that passes all gates. A tired human at 3am, or an overconfident agent,
cannot bypass the review and CI gates, because the server won't allow it.

Branch protection completes the interlocking chain:

```
Branch protection (server-side)
  └── Requires CI pass
        └── CI runs docker build
              └── Dockerfile runs make check
                    ├── make fmt-check
                    ├── make lint
                    └── make test
```

Every layer enforces the one below it. The developer (human or agent) can't skip
any step because each gate is enforced by a different system: the Makefile
enforces test/lint/fmt, the Dockerfile enforces the Makefile, CI enforces the
Docker build, and branch protection enforces CI. No single point of failure.

## Real-Time Activity Feed: Gitea → Mattermost

Every repo has a Gitea webhook that sends all activity (pushes, PRs, issues,
comments, reviews, CI status) to a channel in our self-hosted
[Mattermost](https://mattermost.com) instance. This creates a real-time feed
where the human can see what's happening across all projects without checking
Gitea's notification inbox.

**Important caveat:** The agent can't see this feed directly. Gitea's webhook
messages arrive in Mattermost as a "bot" integration user. Mattermost
deliberately hides bot messages from other bot users to prevent infinite
bot-to-bot loops. This means the agent's Mattermost bot account is blind to the
Gitea webhook feed, even though it's posted in a channel the agent has access
to.

This is why we built the
[notification poller](#the-notification-poller--dispatcher), a separate Python
script that polls Gitea's notification API directly, bypassing Mattermost
entirely. The human sees Gitea activity via the Mattermost webhook feed; the
agent sees it via the API poller. Same events, different delivery paths, because
of a Mattermost platform limitation.

The agent has its own Mattermost bot user (`@claw`), separate from the human.
This means:

- The agent posts status updates to dedicated channels (`#git` for work status,
  `#claw` for general work narration)
- The human's DMs stay clean: only direct alerts and responses
- In group channels, it's clear who said what
- The agent can be @-mentioned in any channel

### Channel Architecture

A practical setup:

- **#git:** Real-time Gitea webhook feed (all repos) + agent's work status
  updates. The human sees commits, PRs, reviews, CI results as they happen. (The
  agent posts here but can't read the webhook messages; see caveat above.)
- **#claw:** Agent's internal work narration. Useful for debugging what the
  agent is doing, but notifications muted so it doesn't disturb anyone.
- **DM with agent:** Private conversation, sitreps, sensitive commands
- **Project-specific channels:** For coordination with external collaborators

### The Notification Poller + Dispatcher

Because the agent can't see Gitea webhooks in Mattermost (bot-to-bot visibility
issue), we built a Python script that both polls and dispatches. It polls the
Gitea notifications API every 15 seconds, triages each notification (checking
@-mentions and assignment), marks them as read, and spawns one isolated agent
session per actionable item via `openclaw cron add --session isolated`.

The poller also runs a secondary **label scan** every 2 minutes, checking all
watched repos for open issues/PRs with the `bot` label that are unassigned
(meaning they need work but no agent has claimed them yet). This catches cases
where the agent chain broke: an agent timed out or crashed without spawning the
next one.

Key design decisions:

- **The poller IS the dispatcher.** No flag files, no heartbeat dependency. The
  poller triages notifications and spawns agents directly.
- **Marks notifications as read immediately.** Prevents re-dispatch on the next
  poll cycle.
- **Assigns the agent account at dispatch time.** Before spawning the agent
  session, the poller assigns the bot user to the issue via API. This prevents
  race conditions, because subsequent scans skip assigned issues.
- **Dispatched issues are tracked in a persistent JSON file.** Survives poller
  restarts. Entries auto-prune after 1 hour.
- **30-minute re-dispatch cooldown.** The poller won't re-dispatch for the same
  issue within 30 minutes, even if it appears unassigned again.
- **Concurrency cap.** The poller checks how many agents are currently running
  and defers dispatch if the cap is reached.
- **Stale agent reaper.** Kills agent sessions that have been running longer
  than 10 minutes (the `--timeout-seconds` flag isn't always enforced).
- **`bot` label + `merge-ready` skip.** The label scan skips issues that are
  already labeled `merge-ready`; those are in the human's court.
- **Zero dependencies.** Python stdlib only. Runs anywhere.

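The cooldown, pruning, and concurrency cap interact, so a small sketch may help. This is not the poller's source: the function names, the `"repo#N"` key format, and the in-memory dict standing in for the JSON file are assumptions, and the cap value is left as a parameter because the text does not state it.

```python
COOLDOWN_SECONDS = 30 * 60  # re-dispatch cooldown described above
PRUNE_SECONDS = 60 * 60     # tracked entries auto-prune after 1 hour


def prune(dispatched: dict[str, float], now: float) -> dict[str, float]:
    """Drop tracking entries older than the prune window."""
    return {key: ts for key, ts in dispatched.items() if now - ts < PRUNE_SECONDS}


def pick_dispatchable(
    candidates: list[str],
    dispatched: dict[str, float],
    running: int,
    cap: int,
    now: float,
) -> list[str]:
    """Return the candidates to dispatch now, honoring cooldown and cap."""
    free_slots = max(cap - running, 0)
    ready = [
        key
        for key in candidates
        # Never-dispatched keys default to -inf, so they always pass.
        if now - dispatched.get(key, float("-inf")) >= COOLDOWN_SECONDS
    ]
    return ready[:free_slots]


# Hypothetical state: timestamps in seconds, keys in "repo#issue" form.
seen = {"blog#1": 0.0, "blog#2": 1000.0}
print(pick_dispatchable(["blog#1", "blog#2", "blog#3"], seen, running=1, cap=2, now=2000.0))
```

Candidates that pass the cooldown but do not fit under the cap are simply left for a later scan, which matches the "defers dispatch" wording above.
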
Response time: ~15-30 seconds from notification to agent starting work.

Full source code is available in
[OPENCLAW_TRICKS.md](OPENCLAW_TRICKS.md#gitea-integration--notification-polling).

## CI: Gitea Actions

Every repo has a CI workflow in `.gitea/workflows/` that runs on push. The
standard workflow is a single step:

```yaml
- name: Build and check
  run: docker build .
```

Because the Dockerfile runs `make check` (which runs tests, linting, and
formatting checks), a successful Docker build means everything passes. A failed
build means something is broken. Binary signal, no ambiguity.

### PR Preview Deployments

For web projects (like a blog or documentation site), CI can go further than
pass/fail. When a PR is opened against a site repo, CI can:

1. Build the site from the PR branch
2. Deploy it to a preview URL
3. Post the preview URL as a comment on the PR or drop it into a Mattermost
   channel

This lets reviewers see the actual rendered result before merging, not just the
code diff. For a Jekyll/Hugo blog, this means you can see how a new post looks
on the real site layout, on mobile, with real CSS, before it goes live.

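As one possible shape for step 3, the sketch below prepares (but does not send) a request that comments on the PR through Gitea's issue comment endpoint. The host, the preview domain, the URL scheme, and both helper names are hypothetical.

```python
import json
from urllib.request import Request

GITEA_API = "https://gitea.example.com/api/v1"  # hypothetical instance
PREVIEW_DOMAIN = "preview.example.com"          # hypothetical wildcard domain


def preview_url(repo: str, pr_number: int) -> str:
    """Derive a per-PR preview URL; the naming scheme is an assumption."""
    return f"https://pr-{pr_number}.{repo}.{PREVIEW_DOMAIN}/"


def preview_comment_request(owner: str, repo: str, pr_number: int, token: str) -> Request:
    """Build the POST that adds the preview link as a PR comment."""
    payload = {"body": f"Preview deployed: {preview_url(repo, pr_number)}"}
    return Request(
        # PRs share the issue index, so the issue comment endpoint is used.
        f"{GITEA_API}/repos/{owner}/{repo}/issues/{pr_number}/comments",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = preview_comment_request("sneak", "blog", 7, "example-token")
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would send it; that call is left out so the sketch has no network side effects.
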
## Deployment: µPaaS

[µPaaS](https://git.eeqj.de/sneak/upaas) by [@sneak](https://sneak.berlin) is a
lightweight, MIT-licensed, self-hosted platform-as-a-service written in Go that
auto-deploys Docker containers when code changes. It exists because:

- Full orchestration platforms (Kubernetes, Nomad, etc.) are massive overkill
  for a small fleet of services
- Heroku/Render/Fly.io mean vendor dependency and recurring costs
- We wanted: push to main → live in production, automatically, with zero human
  intervention

### How It Works

µPaaS is a single Go binary that:

1. **Receives Gitea webhooks** on push/merge events
2. **Clones the repo** using deploy keys (read-only SSH keys per-repo)
3. **Runs `docker build`** to build the new image
4. **Swaps the running container** with the new image
5. **Routes traffic** via Traefik reverse proxy with automatic TLS

The deploy flow:

```
Developer merges PR to main/prod
  → Gitea fires webhook to µPaaS
  → µPaaS clones repo, builds Docker image
  → µPaaS stops old container, starts new one
  → Traefik routes traffic to new container
  → Site is live on its production URL with TLS
```

Time from merge to live: typically under 2 minutes (dominated by Docker build
time).

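The first step, deciding whether a webhook delivery should trigger a deploy, can be sketched as a filter. This is not µPaaS code (µPaaS is written in Go); the `should_deploy` helper is hypothetical and assumes the receiver has already read the event name from the `X-Gitea-Event` header and parsed the JSON body.

```python
def should_deploy(event: str, payload: dict, deploy_branch: str = "main") -> bool:
    """True if the delivery is a push to the configured deploy branch."""
    # Push payloads carry the full ref name, e.g. "refs/heads/main".
    return event == "push" and payload.get("ref") == f"refs/heads/{deploy_branch}"


print(should_deploy("push", {"ref": "refs/heads/main"}))
print(should_deploy("push", {"ref": "refs/heads/feature-x"}))
```

Merging a PR results in a push to the target branch, so filtering on push events alone is enough to cover merges.
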
### Deploy Keys

Each repo that deploys via µPaaS has a read-only SSH deploy key. This means:

- µPaaS can clone the repo to build it, but can't push to it
- Each key is scoped to one repo, so compromise of one key doesn't affect others
- No shared credentials, no broad API tokens

### What's Deployed This Way

Any Docker-based service or site:

- Static sites (Jekyll, Hugo): Dockerfile builds the site, nginx serves it
- Go services: Dockerfile builds the binary, runs it
- Web applications: same pattern

Everything gets a production URL with automatic TLS via Traefik.

## The Full Pipeline

Putting it all together, the development lifecycle looks like this:

```
1. Human labels issue with `bot` (or agent files issue)
   ↓
2. Poller detects `bot` label + unassigned → assigns agent → spawns worker
   ↓
3. Worker agent clones repo, writes code, runs `docker build .`
   ↓
4. Worker creates PR "(closes #N)", labels `needs-review`
   ↓
5. Worker spawns reviewer agent → stops
   ↓
6. Reviewer agent reads diff + referenced issues → reviews
   ↓
7a. Review PASS → reviewer rebases if needed → `docker build .`
    → labels `merge-ready` → assigns human → removes `bot`
   ↓
7b. Review FAIL → reviewer labels `needs-rework`
    → spawns worker agent → back to step 3
   ↓
8. Human reviews, merges
   ↓
9. Gitea webhook fires → µPaaS deploys to production
   ↓
10. Site/service is live
```

Steps 2-7 happen without any human involvement, driven by agent-to-agent
chaining. The human's role is reduced to: label the issue, review the final PR,
merge. Everything else is automated.

### Observability

Because everything flows through Mattermost channels:

- The human can glance at #git to see the current state of all projects
- CI failures are immediately visible (Gitea Actions posts status)
- Deployments are immediately visible (µPaaS can log to the same channel)
- The agent's work narration in #claw shows what it's currently doing
- No need to check multiple dashboards, because one chat client shows everything

## Identity Separation

A key architectural decision: the agent has its own identity everywhere.

| System     | Human Account | Agent Account |
| ---------- | ------------- | ------------- |
| Gitea      | @sneak        | @clawbot      |
| Mattermost | @sneak        | @claw         |

This separation means:

- **Clear attribution.** Every commit, comment, and message shows who did it.
  When reviewing git history, you know which commits were human and which were
  agent.
- **Independent permissions.** The agent can have write access to repos where
  it's trusted, fork-and-PR access where it's not. The human can have admin
  access without the agent inheriting it.
- **Mentionability.** The human can @-mention the agent in an issue comment to
  assign work. The agent can @-mention the human when it needs review. This
  works exactly like human-to-human collaboration.
- **Separate notification streams.** The agent's notification poller watches
  `@clawbot`'s inbox. The human's notifications are separate. No cross-talk.

## Why Self-Host Everything

The stack (Gitea, Mattermost, µPaaS, OpenClaw) is entirely self-hosted. This
isn't ideological; it's practical:

- **No API rate limits.** The poller alone calls the Gitea API every 15
  seconds, and each agent session adds more calls on top. GitHub's API rate
  limits could throttle that.
- **No surprise costs.** CI minutes, seat licenses, storage: all free when
  self-hosted.
- **Full API access.** Every feature of every tool is available via API. No
  "enterprise only" gates.
- **Custom webhooks.** We can wire up any event to any action. Gitea push →
  Mattermost notification → µPaaS deploy → agent notification, all custom.
- **Data sovereignty.** Code, issues, conversations, and deployment
  infrastructure all live on machines we control.
- **Offline resilience.** If GitHub/Slack/Vercel have an outage, our pipeline
  keeps running.

The trade-off is maintenance burden, but with an AI agent handling most of the
operational work (monitoring, updates, issue triage), the maintenance cost is
surprisingly low.

---

_This document describes a production system that's been running since
early 2026. The specific tools (Gitea, Mattermost, µPaaS) are interchangeable;
the patterns (webhook-driven deployment, real-time activity feeds, identity
separation, automated CI gates) apply to any self-hosted stack._