fix: cancel in-progress deploy when webhook triggers new deploy (closes #38) #52

Merged
sneak merged 1 commits from :fix/deploy-race-condition-38 into main 2026-02-16 09:06:41 +01:00
Collaborator

Summary

Fixes the race condition between manual and webhook deploys (issue #38).

When a webhook-triggered deploy starts for an app that already has a deploy in progress, the existing deploy is now cancelled via context cancellation before the new deploy begins.

Changes

  • internal/service/deploy/deploy.go: Replace simple mutex-only locking with active deploy tracking (cancel func + done channel per app). Deploy() now accepts cancelExisting bool parameter. Added cancelActiveDeploy() and checkCancelled() helpers.
  • internal/service/webhook/webhook.go: Pass cancelExisting: true when triggering deploys from webhooks.
  • internal/handlers/app.go: Pass cancelExisting: false for manual deploys (preserves existing ErrDeploymentInProgress behavior).
  • internal/models/deployment.go: Add DeploymentStatusCancelled status.
  • Tests: Comprehensive tests for cancellation mechanics (cancel and wait, blocks until done, allows new deploy after cancel).

How it works

  1. Each call to Deploy() registers a cancellable context and done channel in activeDeploys map
  2. When a webhook deploy calls Deploy(ctx, app, eventID, true), it first calls cancelActiveDeploy() which cancels the in-progress deploy's context and blocks until it finishes
  3. The cancelled deploy detects context cancellation, marks itself as cancelled status, and returns ErrDeployCancelled
  4. The new deploy then acquires the lock and proceeds normally
## Summary Fixes the race condition between manual and webhook deploys (issue #38). When a webhook-triggered deploy starts for an app that already has a deploy in progress, the existing deploy is now cancelled via context cancellation before the new deploy begins. ## Changes - **`internal/service/deploy/deploy.go`**: Replace simple mutex-only locking with active deploy tracking (cancel func + done channel per app). `Deploy()` now accepts `cancelExisting bool` parameter. Added `cancelActiveDeploy()` and `checkCancelled()` helpers. - **`internal/service/webhook/webhook.go`**: Pass `cancelExisting: true` when triggering deploys from webhooks. - **`internal/handlers/app.go`**: Pass `cancelExisting: false` for manual deploys (preserves existing ErrDeploymentInProgress behavior). - **`internal/models/deployment.go`**: Add `DeploymentStatusCancelled` status. - **Tests**: Comprehensive tests for cancellation mechanics (cancel and wait, blocks until done, allows new deploy after cancel). ## How it works 1. Each call to `Deploy()` registers a cancellable context and done channel in `activeDeploys` map 2. When a webhook deploy calls `Deploy(ctx, app, eventID, true)`, it first calls `cancelActiveDeploy()` which cancels the in-progress deploy's context and blocks until it finishes 3. The cancelled deploy detects context cancellation, marks itself as `cancelled` status, and returns `ErrDeployCancelled` 4. The new deploy then acquires the lock and proceeds normally
sneak was assigned by clawbot 2026-02-16 07:12:34 +01:00
clawbot added 1 commit 2026-02-16 07:12:34 +01:00
When a webhook-triggered deploy starts for an app that already has a deploy
in progress, the existing deploy is now cancelled via context cancellation
before the new deploy begins. This prevents silently lost webhook deploys.

Changes:
- Add per-app active deploy tracking with cancel func and done channel
- Deploy() accepts cancelExisting param: true for webhook, false for manual
- Cancelled deployments are marked with new 'cancelled' status
- Add ErrDeployCancelled sentinel error
- Add DeploymentStatusCancelled model constant
- Add comprehensive tests for cancellation mechanics
Author
Collaborator

Test Results

All tests pass:

ok   git.eeqj.de/sneak/upaas/internal/database        coverage: 1.6%
ok   git.eeqj.de/sneak/upaas/internal/docker           coverage: 2.8%
ok   git.eeqj.de/sneak/upaas/internal/handlers         coverage: 23.5%
ok   git.eeqj.de/sneak/upaas/internal/middleware        coverage: 54.5%
ok   git.eeqj.de/sneak/upaas/internal/models            coverage: 53.1%
ok   git.eeqj.de/sneak/upaas/internal/service/app       coverage: 82.8%
ok   git.eeqj.de/sneak/upaas/internal/service/auth      coverage: 62.7%
ok   git.eeqj.de/sneak/upaas/internal/service/deploy    coverage: 3.7%
ok   git.eeqj.de/sneak/upaas/internal/service/webhook   coverage: 93.3%
ok   git.eeqj.de/sneak/upaas/internal/ssh               coverage: 78.6%

Lint Results

No new lint issues. Only pre-existing testpackage issue on tail_validation_test.go.

## ✅ Test Results All tests pass: ``` ok git.eeqj.de/sneak/upaas/internal/database coverage: 1.6% ok git.eeqj.de/sneak/upaas/internal/docker coverage: 2.8% ok git.eeqj.de/sneak/upaas/internal/handlers coverage: 23.5% ok git.eeqj.de/sneak/upaas/internal/middleware coverage: 54.5% ok git.eeqj.de/sneak/upaas/internal/models coverage: 53.1% ok git.eeqj.de/sneak/upaas/internal/service/app coverage: 82.8% ok git.eeqj.de/sneak/upaas/internal/service/auth coverage: 62.7% ok git.eeqj.de/sneak/upaas/internal/service/deploy coverage: 3.7% ok git.eeqj.de/sneak/upaas/internal/service/webhook coverage: 93.3% ok git.eeqj.de/sneak/upaas/internal/ssh coverage: 78.6% ``` ## ✅ Lint Results No new lint issues. Only pre-existing `testpackage` issue on `tail_validation_test.go`.
sneak merged commit ebcae55302 into main 2026-02-16 09:06:41 +01:00
Sign in to join this conversation.
No reviewers
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sneak/upaas#52
No description provided.