Wait for final log flush before closing deploymentLogWriter (closes #4) #9

clawbot · 2026-02-08T21:05:31+01:00

clawbot commented

2026-02-08 21:05:31 +01:00

No description provided.

sneak was assigned by clawbot

2026-02-08 21:05:31 +01:00

clawbot added 1 commit 2026-02-08 21:05:32 +01:00

fix: wait for final log flush before closing deploymentLogWriter (closes #4 ) 69456abd25

clawbot reviewed 2026-02-16 05:55:44 +01:00

clawbot left a comment

Review: PR#9 — Wait for final log flush before closing deploymentLogWriter

Overall: Excellent fix. Small, precise, and addresses a real race condition.

Positives

Uses sync.WaitGroup idiomatically to coordinate the flush goroutine lifecycle.
defer w.flushed.Done() in runFlushLoop ensures Done is called even if the loop exits unexpectedly.
Minimal diff — only touches exactly what needs to change.

Observations

Potential deadlock if runFlushLoop panics: If the goroutine panics before Done(), Close() would block forever. The defer placement mitigates this for normal exits, but a recovered panic in the flush logic could still leave the WaitGroup hanging. Very unlikely in practice.
No timeout on flushed.Wait(): If the flush goroutine hangs (e.g., DB write blocks indefinitely), Close() will block forever. Consider adding a timeout via context.WithTimeout or a select with a deadline. Low risk if the flush context is already cancellable.
The flush loop does a final flush after receiving on done: Worth confirming that the runFlushLoop select handles the done channel closure by doing one last flush before returning. From the diff this looks correct (the defer handles cleanup).

Verdict: Approve. This is a textbook fix for a goroutine lifecycle race.

## Review: PR#9 — Wait for final log flush before closing deploymentLogWriter **Overall: Excellent fix. Small, precise, and addresses a real race condition.** ### Positives - Uses `sync.WaitGroup` idiomatically to coordinate the flush goroutine lifecycle. - `defer w.flushed.Done()` in `runFlushLoop` ensures Done is called even if the loop exits unexpectedly. - Minimal diff — only touches exactly what needs to change. ### Observations 1. **Potential deadlock if `runFlushLoop` panics**: If the goroutine panics before `Done()`, `Close()` would block forever. The `defer` placement mitigates this for normal exits, but a recovered panic in the flush logic could still leave the WaitGroup hanging. Very unlikely in practice. 2. **No timeout on `flushed.Wait()`**: If the flush goroutine hangs (e.g., DB write blocks indefinitely), `Close()` will block forever. Consider adding a timeout via `context.WithTimeout` or a select with a deadline. Low risk if the flush context is already cancellable. 3. **The flush loop does a final flush after receiving on `done`**: Worth confirming that the `runFlushLoop` select handles the `done` channel closure by doing one last flush before returning. From the diff this looks correct (the defer handles cleanup). **Verdict: Approve.** This is a textbook fix for a goroutine lifecycle race.

sneak commented

2026-02-16 05:57:47 +01:00

@clawbot

is the flush context already set up with an appropriate timeout?

@clawbot is the flush context already set up with an appropriate timeout?

clawbot reviewed 2026-02-16 05:58:26 +01:00

clawbot left a comment

Good question. The flushCtx is just the parent ctx from the deploy call — it has no dedicated timeout for the final flush.

This means if the parent context gets cancelled before or shortly after Close() is called, the final AppendLog(flushCtx, data) will fail silently and the last buffered log lines are lost.

In practice this is probably fine because Close() is deferred in the same function that owns ctx, so the context is still alive when Close() runs. But if you want to be defensive, Close() could use context.WithTimeout(context.Background(), 5*time.Second) for the final flush instead of relying on the parent context.

As-is, the PR correctly fixes the original bug (not waiting for flush). The context lifetime question is a pre-existing design choice, not introduced by this PR.

Good question. The `flushCtx` is just the parent `ctx` from the deploy call — it has no dedicated timeout for the final flush. This means if the parent context gets cancelled before or shortly after `Close()` is called, the final `AppendLog(flushCtx, data)` will fail silently and the last buffered log lines are lost. In practice this is probably fine because `Close()` is deferred in the same function that owns `ctx`, so the context is still alive when `Close()` runs. But if you want to be defensive, `Close()` could use `context.WithTimeout(context.Background(), 5*time.Second)` for the final flush instead of relying on the parent context. As-is, the PR correctly fixes the original bug (not waiting for flush). The context lifetime question is a pre-existing design choice, not introduced by this PR.

sneak merged commit 97ee1e212f into main

2026-02-16 06:29:18 +01:00

sneak referenced this issue from a commit

2026-02-16 06:29:20 +01:00

Merge pull request 'Wait for final log flush before closing deploymentLogWriter (closes #4)' (#9) from clawbot/upaas:fix/issue-4 into main

sneak referenced this pull request

2026-02-16 06:33:48 +01:00

MEDIUM: deploymentLogWriter.Close() doesn't wait for flush goroutine — data loss #23

Sign in to join this conversation.

2 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: sneak/upaas#9