refactor: structured body (array|object, never string) for canonicalization

Message bodies are always arrays of strings (text lines) or objects
(structured data like PUBKEY). Never raw strings. This enables:
- Multiline messages without escape sequences
- Deterministic JSON canonicalization (RFC 8785 JCS) for signing
- Structured data where needed

Update all schemas: body fields use array type with string items.
Update message.json envelope: body is oneOf[array, object], id is UUID.
Update README: message envelope table, examples, and canonicalization docs.
Update schema/README.md: field types, examples with array bodies.
This commit is contained in:
clawbot
2026-02-10 10:36:02 -08:00
parent dfb1636be5
commit ab70f889a6
22 changed files with 446 additions and 171 deletions

View File

@@ -11,24 +11,33 @@ to IRC wire format:
```
IRC: :nick PRIVMSG #channel :hello world
JSON: {"command": "PRIVMSG", "from": "nick", "to": "#channel", "body": "hello world"}
JSON: {"command": "PRIVMSG", "from": "nick", "to": "#channel", "body": ["hello world"]}
IRC: :server 353 nick = #channel :user1 @op1 +voice1
JSON: {"command": "353", "to": "nick", "params": ["=", "#channel"], "body": "user1 @op1 +voice1"}
JSON: {"command": "353", "to": "nick", "params": ["=", "#channel"], "body": ["user1 @op1 +voice1"]}
Multiline: {"command": "PRIVMSG", "to": "#ch", "body": ["line 1", "line 2"]}
Structured: {"command": "PUBKEY", "body": {"alg": "ed25519", "key": "base64..."}}
```
Common fields (see `message.json` for full schema):
| Field | Type | Description |
|-----------|----------|-------------------------------------------------------|
| `id` | integer | Server-assigned ID (monotonically increasing) |
| `command` | string | IRC command or 3-digit numeric code |
| `from` | string | Source nick or server name (IRC prefix) |
| `to` | string | Target: #channel or nick |
| `params` | string[] | Middle parameters (mainly for numerics) |
| `body` | string | Trailing parameter (message text) |
| `ts` | string | ISO 8601 timestamp (server-assigned, not in raw IRC) |
| `meta` | object | Extensible metadata (not in raw IRC) |
| Field | Type | Description |
|-----------|----------------|------------------------------------------------------|
| `id` | string (uuid) | Server-assigned message UUID |
| `command` | string | IRC command or 3-digit numeric code |
| `from` | string | Source nick or server name (IRC prefix) |
| `to` | string | Target: #channel or nick |
| `params` | string[] | Middle parameters (mainly for numerics) |
| `body` | array \| object | Structured body — never a raw string (see below) |
| `ts` | string | ISO 8601 timestamp (server-assigned, not in raw IRC) |
| `meta` | object | Extensible metadata (signatures, hashes, etc.) |
**Structured bodies:** `body` is always an array of strings (for text) or an
object (for structured data like PUBKEY). Never a raw string. This enables:
- Multiline messages without escape sequences
- Deterministic canonicalization via RFC 8785 JCS for signing
- Structured data where needed
## Commands