Add complete OAuth token refresh and sync solution

- Setup wizard with auto-detection of OpenClaw paths and Claude CLI
- Token sync watcher (inotifywait) for real-time credential updates
- Auto-refresh trigger timer that runs Claude CLI every 30 min
- Supports Claude CLI in Docker container or on host
- Temporary ANTHROPIC_BASE_URL override for container environments
- Anthropic model configuration for OpenClaw
- Auth profile management (fixes key vs access field)
- Systemd services and timers for both sync and trigger
- Comprehensive documentation and troubleshooting guides
- Re-authentication notification system

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
shamid202
2026-02-27 01:51:18 +07:00
parent 3ae5d5274a
commit 22731fff60
24 changed files with 2846 additions and 6 deletions

docs/ARCHITECTURE.md

@@ -0,0 +1,118 @@
# Architecture
## Token Flow Diagram
```
+--------------------+    auto-refresh    +------------------------------------------+
|  Claude Code CLI   | ================>  |  .credentials.json                       |
|  (inside           | (every ~8 hours,   |  {                                       |
|   claude-proxy     | built-in to CLI)   |    "claudeAiOauth": {                    |
|   container)       |                    |      "accessToken": "sk-ant-oat01-...",  |
+--------------------+                    |      "refreshToken": "sk-ant-ort01-...", |
                                          |      "expiresAt": 1772120060006          |
                                          |    }                                     |
                                          |  }                                       |
                                          +---------------------+--------------------+
                                                                |
                                                       inotifywait detects
                                                     CLOSE_WRITE / MOVED_TO
                                                                |
                                          +---------------------v--------------------+
                                          |           sync-oauth-token.sh            |
                                          |   (systemd service, runs continuously)   |
                                          +---+-------------+--------------------+---+
                                              |             |                    |
                            +-----------------+             |                    |
                            |                               |                    |
                  +---------v---------+            +--------v--------+  +--------v--------+
                  | oauth.json        |            | .env            |  | docker compose  |
                  | {                 |            | ANTHROPIC_      |  | down/up gateway |
                  |   "anthropic": {  |            | OAUTH_TOKEN=    |  | (reloads env)   |
                  |     "access":..., |            | "sk-ant-oat01-" |  +--------+--------+
                  |     "refresh":...,|            +-----------------+           |
                  |     "expires":... |                               +----------v----------+
                  |   }               |                               |  OpenClaw Gateway   |
                  | }                 |                               | (fresh token loaded |
                  +---------+---------+                               | from container env) |
                            |                                         +----------+----------+
                            | mergeOAuthFileIntoStore()                          |
                            | (reads on startup)                                 |
                            |                                          +---------v---------+
                            +----------------------------------------->| api.anthropic.com |
                                                                       |  Claude Opus 4.6  |
                                                                       +-------------------+
```
## Volume Mounts (Docker)
```
HOST PATH                                              CONTAINER PATH
=========                                              ==============

Gateway container (openclaw-openclaw-gateway-1):
  /root/.openclaw/                                     -> /home/node/.openclaw/
  /root/.openclaw/credentials/oauth.json               -> /home/node/.openclaw/credentials/oauth.json
  /root/.openclaw/agents/*/agent/auth-profiles.json    -> /home/node/.openclaw/agents/*/agent/auth-profiles.json
  /home/node/.claude/                                  -> /home/node/.claude/
  /root/openclaw/.env                                  -> loaded as container env vars (at creation time only)

Claude CLI container (claude-proxy):
  /root/.openclaw/workspaces/workspace-claude-proxy/
    config/                                            -> /root/
    config/.claude/.credentials.json                   -> /root/.claude/.credentials.json
```
## Auth Resolution Order (inside gateway)
When the gateway needs to authenticate with Anthropic:
```
1. resolveApiKeyForProvider("anthropic")
2.   -> resolveAuthProfileOrder()
3.   -> reads agents/<agent>/agent/auth-profiles.json
4.   -> isValidProfile() checks each profile:
5.        - type:"api_key" -> requires cred.key
6.        - type:"oauth"   -> requires cred.access (NOT cred.key!)
7.        - type:"token"   -> requires cred.token
8.   -> If valid profile found: use it
9.   -> If no valid profile: resolveEnvApiKey("anthropic")
10.  -> Reads ANTHROPIC_OAUTH_TOKEN from container env
11.  -> isOAuthToken(key) detects "sk-ant-oat" prefix
12.  -> Uses Bearer auth + Claude Code identity headers
13.  -> Sends request to api.anthropic.com

On gateway startup:
  mergeOAuthFileIntoStore()
    -> Reads /home/node/.openclaw/credentials/oauth.json
    -> Merges into auth profile store (if profile doesn't exist)
```
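The resolution order above can be sketched in a few lines. This is an illustrative reimplementation, not the gateway's bundled code: `pickCredential` and its argument shapes are hypothetical, and only the per-type field checks, the env var fallback, and the `sk-ant-oat` prefix test are taken from the listing.

```javascript
// Illustrative sketch only; the real logic lives in the gateway's bundled source.

// Mirrors the per-type checks listed for isValidProfile().
function isValidProfile(cred) {
  if (!cred || typeof cred !== "object") return false;
  switch (cred.type) {
    case "api_key": return Boolean(cred.key);
    case "oauth":   return Boolean(cred.access); // NOT cred.key
    case "token":   return Boolean(cred.token);
    default:        return false;
  }
}

// OAuth access tokens are recognized by their prefix.
function isOAuthToken(key) {
  return typeof key === "string" && key.startsWith("sk-ant-oat");
}

// Hypothetical helper: the first valid profile wins, else fall back to the env var.
function pickCredential(profiles, env) {
  for (const [name, cred] of Object.entries(profiles || {})) {
    if (isValidProfile(cred)) return { source: `profile:${name}`, cred };
  }
  const key = env.ANTHROPIC_OAUTH_TOKEN;
  if (!key) return null;
  return { source: "env", cred: { key, oauth: isOAuthToken(key) } };
}
```

Calling `pickCredential` with an `oauth` profile that only has a `key` field returns the env var credential, which is the silent fallback this document warns about.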
## Why down/up and NOT restart
```
docker compose restart openclaw-gateway
  -> Sends SIGTERM to container process
  -> Restarts the SAME container (same env vars from creation time)
  -> .env changes are NOT reloaded
  -> Result: gateway still has OLD token

docker compose down openclaw-gateway && docker compose up -d openclaw-gateway
  -> Stops and REMOVES the container
  -> Creates a NEW container (reads .env fresh)
  -> New env vars are loaded
  -> Result: gateway has NEW token
```
## Source Code References (inside gateway container)
| File | Line | Function |
|------|------|----------|
| `/app/dist/paths-CyR9Pa1R.js` | 190 | `OAUTH_FILENAME = "oauth.json"` |
| `/app/dist/paths-CyR9Pa1R.js` | 198-204 | `resolveOAuthDir()` -> `$STATE_DIR/credentials/` |
| `/app/dist/paths-CyR9Pa1R.js` | 203 | `resolveOAuthPath()` -> joins dir + filename |
| `/app/dist/model-auth-CmUeBbp-.js` | 3048 | `mergeOAuthFileIntoStore()` -- reads oauth.json |
| `/app/dist/model-auth-CmUeBbp-.js` | 3358 | `buildOAuthApiKey()` -- returns `credentials.access` |
| `/app/dist/model-auth-CmUeBbp-.js` | 3832 | `isValidProfile()` -- for oauth, checks `cred.access` |
| `/app/dist/model-auth-CmUeBbp-.js` | 3942 | `resolveApiKeyForProvider()` -- profiles then env fallback |
| `/app/dist/model-auth-CmUeBbp-.js` | 4023 | `resolveEnvApiKey("anthropic")` -> reads env var |

docs/FIELD-MAPPING.md

@@ -0,0 +1,93 @@
# Credential Field Mapping Reference
## Claude CLI format (`.credentials.json`)
Written by Claude Code CLI when it refreshes the token.
```json
{
  "claudeAiOauth": {
    "accessToken": "sk-ant-oat01-...",
    "refreshToken": "sk-ant-ort01-...",
    "expiresAt": 1772120060006,
    "scopes": ["user:inference", "user:mcp_servers", "user:profile", "user:sessions:claude_code"],
    "subscriptionType": "max",
    "rateLimitTier": "default_claude_max_5x"
  }
}
```
## OpenClaw format (`oauth.json`)
Read by the gateway's `mergeOAuthFileIntoStore()` on startup.
```json
{
  "anthropic": {
    "access": "sk-ant-oat01-...",
    "refresh": "sk-ant-ort01-...",
    "expires": 1772120060006,
    "scopes": ["user:inference", "user:mcp_servers", "user:profile", "user:sessions:claude_code"],
    "subscriptionType": "max",
    "rateLimitTier": "default_claude_max_5x"
  }
}
```
## Field name mapping
| Claude CLI | OpenClaw | Notes |
|------------|----------|-------|
| `accessToken` | `access` | The OAuth access token (`sk-ant-oat01-...`) |
| `refreshToken` | `refresh` | The refresh token (`sk-ant-ort01-...`) |
| `expiresAt` | `expires` | Unix timestamp in milliseconds |
| `scopes` | `scopes` | Same format (array of strings) |
| `subscriptionType` | `subscriptionType` | Same (`"max"`) |
| `rateLimitTier` | `rateLimitTier` | Same (`"default_claude_max_5x"`) |
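
The mapping in this table is mechanical, so it can be expressed as one small function. The sketch below is only an illustration of the table; `toOpenClawOAuth` is a hypothetical name and this is not the code of the sync script.

```javascript
// Hypothetical helper; field names are taken from the mapping table above.
function toOpenClawOAuth(cliCredentials) {
  const src = cliCredentials && cliCredentials.claudeAiOauth;
  if (!src || !src.accessToken) {
    throw new Error("claudeAiOauth.accessToken is missing");
  }
  // Renamed fields first, then the ones that keep their names.
  const anthropic = {
    access: src.accessToken,
    refresh: src.refreshToken,
    expires: src.expiresAt, // Unix time in milliseconds, copied unchanged
  };
  for (const key of ["scopes", "subscriptionType", "rateLimitTier"]) {
    if (src[key] !== undefined) anthropic[key] = src[key];
  }
  return { anthropic };
}
```

Feeding it the parsed contents of `.credentials.json` yields an object that can be serialized as `oauth.json`.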
## .env format
Single env var, only the access token (no refresh/expiry):
```
ANTHROPIC_OAUTH_TOKEN="sk-ant-oat01-..."
```
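Updating this variable amounts to replacing one line of text, or appending it if it is absent. A minimal sketch, assuming the `.env` file holds one `NAME=value` assignment per line; `setOAuthTokenInEnv` is a hypothetical helper, not part of the project's scripts.

```javascript
// Hypothetical helper: returns new .env text with ANTHROPIC_OAUTH_TOKEN set.
function setOAuthTokenInEnv(envText, accessToken) {
  const line = `ANTHROPIC_OAUTH_TOKEN="${accessToken}"`;
  const lines = envText.split("\n");
  const index = lines.findIndex((l) => /^ANTHROPIC_OAUTH_TOKEN=/.test(l));
  if (index >= 0) {
    lines[index] = line; // replace the existing assignment in place
    return lines.join("\n");
  }
  // Not present yet: append, keeping exactly one trailing newline.
  const body = envText === "" || envText.endsWith("\n") ? envText : envText + "\n";
  return body + line + "\n";
}
```

All other lines are returned untouched, so unrelated settings in the file survive the update.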
## Auth profiles format (CORRECT)
```json
{
  "profiles": {
    "anthropic:default": {
      "type": "oauth",
      "provider": "anthropic",
      "access": "sk-ant-oat01-..."
    }
  }
}
```
## Auth profiles format (BROKEN)
```json
{
  "profiles": {
    "anthropic:default": {
      "type": "oauth",
      "provider": "anthropic",
      "key": "sk-ant-oat01-..."
    }
  }
}
```
**Why it's broken:** `isValidProfile()` for `type: "oauth"` checks `cred.access`, not `cred.key`. The profile is silently skipped, and auth falls through to the `ANTHROPIC_OAUTH_TOKEN` env var. This works by accident but means the auth profile system isn't being used properly.
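
Converting a broken profile into the correct form means moving the token from `key` to `access` for `oauth` profiles only. A minimal sketch of that transformation, assuming the file has the `profiles` layout shown above; `fixOAuthProfiles` is a hypothetical name and this does not claim to reproduce the project's own fix script.

```javascript
// Hypothetical repair helper; returns a new object and leaves the input untouched.
function fixOAuthProfiles(store) {
  const profiles = {};
  for (const [name, cred] of Object.entries(store.profiles || {})) {
    if (cred.type === "oauth" && !cred.access && cred.key) {
      const { key, ...rest } = cred;
      profiles[name] = { ...rest, access: key }; // move key -> access
    } else {
      profiles[name] = cred; // api_key and already-correct profiles pass through
    }
  }
  return { ...store, profiles };
}
```

Profiles of `type: "api_key"` are deliberately left alone, because `key` is the correct field for them.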
## File locations
| File | Host Path | Container Path |
|------|-----------|---------------|
| Claude CLI creds | `/root/.openclaw/workspaces/workspace-claude-proxy/config/.claude/.credentials.json` | `/root/.claude/.credentials.json` (claude-proxy) |
| OpenClaw oauth | `/root/.openclaw/credentials/oauth.json` | `/home/node/.openclaw/credentials/oauth.json` (gateway) |
| .env | `/root/openclaw/.env` | loaded as env vars at container creation |
| Auth profiles | `/root/.openclaw/agents/<agent>/agent/auth-profiles.json` | `/home/node/.openclaw/agents/<agent>/agent/auth-profiles.json` (gateway) |


@@ -0,0 +1,107 @@
# How Token Refresh Works
## Anthropic OAuth Token Lifecycle
Claude Max subscriptions use OAuth tokens for API authentication.
- **Access token** (`sk-ant-oat01-...`): Used for API requests, expires in ~8 hours
- **Refresh token** (`sk-ant-ort01-...`): Used to get new access tokens, long-lived
- **Token endpoint**: `POST https://console.anthropic.com/v1/oauth/token`
- **Client ID**: `9d1c250a-e61b-44d9-88ed-5944d1962f5e` (Claude Code public OAuth client)
## How Claude Code CLI Refreshes Tokens
Claude Code CLI has built-in token refresh. Before each API request:
1. Performs the `[API:auth] OAuth token check` step, which reads `.credentials.json`
2. If access token is expired or near expiry:
- Sends `grant_type: "refresh_token"` to Anthropic's token endpoint
- Gets back new `access_token`, `refresh_token`, `expires_in`
- Writes updated credentials back to `.credentials.json`
3. Uses the (possibly refreshed) access token for the API request
The relevant function in Claude Code's minified source (`cli.js`):
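```javascript
// Simplified from minified source
const axios = require("axios");

// Values from the token lifecycle section above
const TOKEN_URL = "https://console.anthropic.com/v1/oauth/token";
const CLIENT_ID = "9d1c250a-e61b-44d9-88ed-5944d1962f5e";

async function refreshToken(refreshToken, scopes) {
  const params = {
    grant_type: "refresh_token",
    refresh_token: refreshToken,
    client_id: CLIENT_ID,
    scope: scopes.join(" ")
  };
  const response = await axios.post(TOKEN_URL, params, {
    headers: { "Content-Type": "application/json" }
  });
  return {
    accessToken: response.data.access_token,
    // Keep the old refresh token if the response does not include a new one
    refreshToken: response.data.refresh_token || refreshToken,
    // expires_in is in seconds; expiresAt is a millisecond timestamp
    expiresAt: Date.now() + response.data.expires_in * 1000,
    scopes: response.data.scope.split(" ")
  };
}
```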
## Why We Need a Sync Service
The problem is that Claude Code CLI and OpenClaw are **separate processes in separate containers**:
- Claude Code CLI runs in `claude-proxy` container, writes to its own `.credentials.json`
- OpenClaw gateway runs in `openclaw-openclaw-gateway-1`, reads from `oauth.json` and env vars
They don't share the same credential files. So when Claude CLI refreshes the token, OpenClaw doesn't know about it.
## The Sync Approach (inotifywait)
We use `inotifywait` from `inotify-tools` to watch for file changes in real-time:
```bash
inotifywait -q -e close_write,moved_to "$WATCH_DIR"
```
- **`close_write`**: Fires when a file is written and closed (normal writes)
- **`moved_to`**: Fires when a file is moved into the directory (atomic writes: write to temp, then `mv`)
- We watch the **directory**, not the file, because an atomic rename replaces the file with a new inode, and a watch placed on the old file would stop seeing changes
When the file changes:
1. Read `accessToken`, `refreshToken`, `expiresAt` from Claude CLI format
2. Map fields: `accessToken` -> `access`, `refreshToken` -> `refresh`, `expiresAt` -> `expires`
3. Write to `oauth.json` (for gateway's `mergeOAuthFileIntoStore()`)
4. Update `ANTHROPIC_OAUTH_TOKEN` in `.env`
5. Recreate gateway container (`docker compose down/up`) to reload env vars
## The Fallback Approach (Timer)
If `inotifywait` is unavailable, a systemd timer runs every 6 hours:
1. Reads current refresh token from `.credentials.json`
2. Calls Anthropic's token endpoint directly
3. Writes new tokens to all credential locations
4. Recreates gateway container
This is less responsive (up to a 6-hour delay) but works without inotify.
## Field Mapping
| Claude CLI (`.credentials.json`) | OpenClaw (`oauth.json`) |
|----------------------------------|------------------------|
| `claudeAiOauth.accessToken` | `anthropic.access` |
| `claudeAiOauth.refreshToken` | `anthropic.refresh` |
| `claudeAiOauth.expiresAt` | `anthropic.expires` |
| `claudeAiOauth.scopes` | `anthropic.scopes` |
## Timeline
```
T=0h Token issued (access + refresh)
T=7.5h Claude CLI detects token nearing expiry
T=7.5h CLI calls refresh endpoint, gets new tokens
T=7.5h CLI writes new .credentials.json
T=7.5h inotifywait detects change (< 1 second)
T=7.5h sync-oauth-token.sh syncs to oauth.json + .env
T=7.5h Gateway recreated with fresh token
T=8h Old token would have expired (but we already refreshed)
```
The entire sync happens within seconds of the CLI refresh, well before the old token expires.
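
The timeline implies a refresh margin of roughly 30 minutes (refresh at T=7.5h against expiry at T=8h). The check below is a sketch of that arithmetic only. The exact margin the CLI uses is not documented here, so `REFRESH_MARGIN_MS` is an assumption inferred from the timeline, and `needsRefresh` is a hypothetical name.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Assumed margin: the timeline refreshes at T=7.5h of an ~8h lifetime.
const REFRESH_MARGIN_MS = 0.5 * HOUR_MS;

// True when the token is expired or will expire within the margin.
// expiresAtMs is the millisecond timestamp stored as expiresAt / expires.
function needsRefresh(expiresAtMs, nowMs = Date.now(), marginMs = REFRESH_MARGIN_MS) {
  return expiresAtMs - nowMs <= marginMs;
}
```

The same function can be pointed at the `expires` value in `oauth.json` to see whether the gateway's copy is about to go stale.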


@@ -0,0 +1,140 @@
# Configuring Anthropic Models in OpenClaw
## CRITICAL: Do NOT add an "anthropic" provider to models.providers
OpenClaw has a **built-in** Anthropic provider. You do NOT need to (and must NOT) add a custom `anthropic` entry to `models.providers` in `openclaw.json`.
Adding one causes the Anthropic SDK to append `/v1` to your `baseUrl`, which already has `/v1`, resulting in:
```
https://api.anthropic.com/v1/v1/messages -> 404 Not Found
```
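The doubled path is easy to reproduce with plain string handling. The sketch below only mimics the behavior described above and is not the Anthropic SDK's URL-building code; `messagesUrl` and `hasCustomAnthropicProvider` are hypothetical helpers.

```javascript
// Naive join that mimics the described behavior (an assumption, not SDK code):
// the client always appends "/v1/messages" to whatever base URL it is given.
function messagesUrl(baseUrl) {
  return baseUrl.replace(/\/+$/, "") + "/v1/messages";
}

// Hypothetical config check: flags a custom "anthropic" entry in models.providers.
function hasCustomAnthropicProvider(config) {
  const providers = (config.models && config.models.providers) || {};
  return Object.prototype.hasOwnProperty.call(providers, "anthropic");
}
```

With a base URL of `https://api.anthropic.com/v1`, `messagesUrl` produces the doubled `/v1/v1/messages` path, while a base URL without the suffix produces the expected one.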
## Correct Configuration
### 1. Set the primary model
In `openclaw.json`, under `agents.defaults.model`:
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-opus-4-6",
        "fallbacks": [
          "anthropic/claude-sonnet-4-6",
          "google/gemini-3.1-pro-preview"
        ]
      }
    }
  }
}
```
The `anthropic/` prefix tells OpenClaw to use the built-in Anthropic provider. No extra configuration needed.
### 2. Add model aliases (optional)
Under `agents.defaults.models`:
```json
{
  "agents": {
    "defaults": {
      "models": {
        "anthropic/claude-opus-4-6": {
          "alias": "Claude Opus 4.6 (Max)"
        },
        "anthropic/claude-sonnet-4-6": {
          "alias": "Claude Sonnet 4.6 (Max)"
        }
      }
    }
  }
}
```
### 3. Set ANTHROPIC_OAUTH_TOKEN in .env
In your OpenClaw `.env` file (e.g., `/root/openclaw/.env`):
```
ANTHROPIC_OAUTH_TOKEN="sk-ant-oat01-YOUR_TOKEN_HERE"
```
This is the fallback auth method. The gateway reads it as a container environment variable.
### 4. Create auth profiles for agents
Each agent needs an `anthropic:default` profile in its `auth-profiles.json`:
```json
{
  "profiles": {
    "anthropic:default": {
      "type": "oauth",
      "provider": "anthropic",
      "access": "sk-ant-oat01-YOUR_TOKEN_HERE"
    }
  },
  "lastGood": {
    "anthropic": "anthropic:default"
  }
}
```
**Important:** The field must be `access`, NOT `key`. Using `key` with `type: "oauth"` causes the profile to be silently skipped.
### 5. Create oauth.json
At `/root/.openclaw/credentials/oauth.json` (maps to `/home/node/.openclaw/credentials/oauth.json` in the gateway container):
```json
{
  "anthropic": {
    "access": "sk-ant-oat01-YOUR_TOKEN_HERE",
    "refresh": "sk-ant-ort01-YOUR_REFRESH_TOKEN",
    "expires": 1772120060006,
    "scopes": ["user:inference", "user:mcp_servers", "user:profile", "user:sessions:claude_code"],
    "subscriptionType": "max",
    "rateLimitTier": "default_claude_max_5x"
  }
}
```
## Available Built-in Models
When using the built-in Anthropic provider:
- `anthropic/claude-opus-4-6`
- `anthropic/claude-sonnet-4-6`
- Other models listed in the Anthropic API
## Per-Agent Model Override
You can set a specific model per agent:
```json
{
  "agents": {
    "list": [
      {
        "id": "my-agent",
        "model": "anthropic/claude-opus-4-6"
      }
    ]
  }
}
```
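To see which model a given agent would end up with, the two config locations can be combined as below. This is a sketch under the assumption that a per-agent `model` takes precedence over `agents.defaults.model.primary`; OpenClaw's actual lookup is not shown in this document, and `resolveModelForAgent` is a hypothetical name.

```javascript
// Hypothetical lookup: per-agent model if set, else agents.defaults.model.primary.
function resolveModelForAgent(config, agentId) {
  const agents = config.agents || {};
  const agent = (agents.list || []).find((a) => a.id === agentId);
  if (agent && agent.model) return agent.model;
  const defaults = agents.defaults || {};
  return defaults.model ? defaults.model.primary : undefined;
}
```

Agents without an override, or agents missing from `agents.list`, fall back to the primary model.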
## Authentication Flow
1. Gateway checks `auth-profiles.json` for a valid `anthropic:default` profile
2. For `type: "oauth"`, it requires the `access` field (not `key`)
3. If no valid profile: falls back to `ANTHROPIC_OAUTH_TOKEN` env var
4. On startup, `mergeOAuthFileIntoStore()` reads `oauth.json` and merges credentials
5. `isOAuthToken()` detects the `sk-ant-oat` prefix
6. Uses Bearer auth + Claude Code identity headers to call `api.anthropic.com`
## OAuth Token Lifecycle
Tokens from Claude Max subscriptions expire every ~8 hours. Use the sync service from this project to keep them fresh automatically. See the main README for setup instructions.

docs/TROUBLESHOOTING.md

@@ -0,0 +1,154 @@
# Troubleshooting
## "HTTP 401 authentication_error: OAuth token has expired"
The most common error. The OAuth token has a ~8 hour lifetime.
**Check:**
1. Is the sync service running? `systemctl status sync-oauth-token.service`
2. Is inotifywait watching? `pgrep -af inotifywait`
3. Is the source credentials file being updated? `stat /root/.openclaw/workspaces/workspace-claude-proxy/config/.claude/.credentials.json`
4. Check service logs: `journalctl -u sync-oauth-token.service -f`
**Fix:**
- If service stopped: `systemctl restart sync-oauth-token.service`
- If token expired everywhere: run `./scripts/refresh-claude-token.sh` manually
- Nuclear option: `claude login` inside the Claude CLI container, then restart sync service
---
## "docker compose restart doesn't reload .env"
This is how Docker Compose is designed to behave, not a bug.
`docker compose restart` only sends SIGTERM and restarts the container process. The container keeps its original environment variables from creation time.
**Always use:**
```bash
cd /root/openclaw
docker compose down openclaw-gateway
docker compose up -d openclaw-gateway
```
This destroys the container and creates a new one, reading `.env` fresh.
---
## Auth profile has "key" field instead of "access"
OpenClaw's `isValidProfile()` for `type: "oauth"` checks for `cred.access`, not `cred.key`. If your auth profile looks like:
```json
{
  "anthropic:default": {
    "type": "oauth",
    "provider": "anthropic",
    "key": "sk-ant-oat01-..."   <-- WRONG
  }
}
```
The profile is silently skipped, and auth falls through to the env var.
**Fix:** Run `./scripts/fix-auth-profiles.sh`
The correct format is:
```json
{
  "anthropic:default": {
    "type": "oauth",
    "provider": "anthropic",
    "access": "sk-ant-oat01-..."   <-- CORRECT
  }
}
```
---
## "404 model_not_found" or double /v1 in URL
This happens when you add `anthropic` to `models.providers` in `openclaw.json`.
**Do NOT do this:**
```json
"models": {
  "providers": {
    "anthropic": {
      "baseUrl": "https://api.anthropic.com/v1",   <-- WRONG
      ...
    }
  }
}
```
The built-in Anthropic provider already handles routing. Adding a custom one with `baseUrl` ending in `/v1` causes the SDK to append another `/v1`, resulting in `https://api.anthropic.com/v1/v1/messages` -> 404.
**Fix:** Remove any `anthropic` entry from `models.providers`. The built-in provider handles it automatically when you reference `anthropic/claude-opus-4-6` as the model.
---
## "No available auth profile for anthropic (all in cooldown)"
Auth profiles enter a cooldown period after repeated failures (e.g., expired tokens, wrong model names).
**Fix:**
```bash
./scripts/fix-auth-profiles.sh
```
This clears `cooldownUntil`, `errorCount`, and `failureCounts` from all agent auth profiles.
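
For readers who want to inspect or reproduce the cleanup by hand, the sketch below removes those three fields wherever they occur in a parsed `auth-profiles.json`. It is an illustration, not the contents of `fix-auth-profiles.sh`. Because the exact location of the fields inside the file is not documented here, the function walks the whole object instead of assuming a layout; `stripCooldownState` is a hypothetical name.

```javascript
const COOLDOWN_KEYS = new Set(["cooldownUntil", "errorCount", "failureCounts"]);

// Returns a deep copy of `value` with the cooldown bookkeeping keys removed
// wherever they appear, so no particular file layout is assumed.
function stripCooldownState(value) {
  if (Array.isArray(value)) return value.map(stripCooldownState);
  if (value === null || typeof value !== "object") return value;
  const out = {};
  for (const [key, child] of Object.entries(value)) {
    if (!COOLDOWN_KEYS.has(key)) out[key] = stripCooldownState(child);
  }
  return out;
}
```

Credentials and every other field are copied through unchanged, so the result can be written back in place of the original file.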
---
## inotifywait: "No such file or directory"
The watched file or directory doesn't exist yet.
**Check:**
- Does the Claude CLI container exist? `docker ps | grep claude`
- Does the credentials path exist? `ls -la /root/.openclaw/workspaces/workspace-claude-proxy/config/.claude/`
- Has Claude CLI been authenticated? You may need to run `claude login` inside the container first.
---
## Gateway starts but Anthropic model still fails
After recreating the gateway, wait a few seconds for it to fully start. Then verify:
```bash
# Check container has the new token
docker exec openclaw-openclaw-gateway-1 printenv ANTHROPIC_OAUTH_TOKEN
# Check oauth.json was picked up
docker exec openclaw-openclaw-gateway-1 cat /home/node/.openclaw/credentials/oauth.json
```
---
## Checking logs
```bash
# Real-time sync service logs
journalctl -u sync-oauth-token.service -f
# Last 50 log entries
journalctl -u sync-oauth-token.service -n 50
# Gateway container logs
docker logs openclaw-openclaw-gateway-1 --tail 100
# Force a re-sync
systemctl restart sync-oauth-token.service
```
---
## Complete reset procedure
If everything is broken:
1. Get a fresh token: `docker exec -it claude-proxy claude login`
2. Fix auth profiles: `./scripts/fix-auth-profiles.sh`
3. Restart sync service: `systemctl restart sync-oauth-token.service`
4. Wait 10 seconds for sync to complete
5. Verify: `./scripts/verify.sh`