23 Commits

Author SHA1 Message Date
user
3ec7ecfca8 docs: document fail-fast lint stage pattern for Dockerfiles
All checks were successful
check / check (push) Successful in 8s
Adds detailed documentation of the multistage Docker build pattern
where a separate lint stage runs fmt-check and lint before the build
stage begins. Includes the standard Dockerfile template, the BuildKit
dependency trick (COPY --from=lint), go:embed placeholder handling,
and CGO/system library notes.
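
A rough sketch of the documented pattern (the Go version, make targets, and paths below are illustrative, not the repo's actual template):

```dockerfile
# syntax=docker/dockerfile:1

# Lint stage: fails the whole build fast if formatting or lint checks fail.
FROM golang:1.22 AS lint
WORKDIR /src
COPY . .
# If the build generates go:embed assets, create placeholders here so the
# packages that embed them still compile under the linter.
RUN make fmt-check && make lint

# Build stage.
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# BuildKit dependency trick: COPY --from=lint forces the lint stage to run
# (and succeed) before this stage starts, even though the copied file is
# otherwise unused.
COPY --from=lint /src/go.mod /tmp/.lint-passed
# CGO off for a static binary unless system libraries are required.
RUN CGO_ENABLED=0 go build -o /out/app .
```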
2026-03-12 16:44:47 -07:00
41005ecbe5 Add HTTP service hardening policy for 1.0 releases (#17)
Closes #16

Adds a comprehensive HTTP/web service security hardening policy to `REPO_POLICIES.md` that must be satisfied before tagging 1.0. The policy covers all items sneak specified (without limitation):

**Security headers** — HSTS (min 1 year, includeSubDomains), CSP (restrictive `default-src 'self'` baseline), X-Frame-Options / frame-ancestors, X-Content-Type-Options: nosniff, Referrer-Policy, Permissions-Policy.

**Request/response limits** — max request body size on all endpoints, max response size for paginated APIs, ReadTimeout + ReadHeaderTimeout (slowloris defense), WriteTimeout, IdleTimeout, per-handler execution time limits.

**Authentication & session security** — rate limiting on password-based auth (API keys exempt as high-entropy), CSRF tokens on state-mutating forms (header-auth APIs exempt), bcrypt/scrypt/argon2 for passwords, session cookies with HttpOnly + Secure + SameSite.

**Reverse proxy awareness** — true client IP detection via X-Forwarded-For/X-Real-IP with trusted proxy allowlist (never trust unconditionally).

**CORS** — explicit origin allowlist for authenticated endpoints; wildcard only for public unauthenticated read-only APIs.

**Error handling** — no leaking stack traces, SQL queries, file paths, or implementation details to clients.

**TLS** — HSTS and secure cookie flags required regardless of whether the service terminates TLS directly or sits behind a reverse proxy.

The policy is explicitly non-exhaustive (defense-in-depth: "when in doubt, harden").

Also adds corresponding checklist sections to `EXISTING_REPO_CHECKLIST.md` and `NEW_REPO_CHECKLIST.md` so that HTTP hardening is verified during repo setup and 1.0 preparation.

Co-authored-by: user <user@Mac.lan guest wan>
Co-authored-by: clawbot <clawbot@eeqj.de>
Reviewed-on: #17
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-11 02:11:32 +01:00
eb6b11ee23 policy: no build artifacts in repos (#15)
Add policy rule: build artifacts and code-derived data must not be committed to repos if they can be generated during the build process.

Notable exception: Go protobuf-generated files (`.pb.go`) may be committed because `go get` downloads source but does not execute build steps.

This addresses feedback from sneak/chat PR [#61](sneak/chat#61).

Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Reviewed-on: #15
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-10 10:34:57 +01:00
ee4f9039f2 Merge pull request 'Self-apply checklist to LLM prose tells doc' (#14) from self-apply-checklist into main
Reviewed-on: #14
2026-03-05 00:33:44 +01:00
user
18173fabc6 self-apply checklist: fix triple, staccato, trailing clause, filler word
2026-03-04 15:28:08 -08:00
68a00dc545 Merge pull request 'Remove unfunny frequency exchange from lol section' (#13) from lol-section-trim into main
Reviewed-on: #13
2026-03-05 00:24:36 +01:00
user
533e77ad34 remove unfunny frequency exchange from lol section
2026-03-04 15:23:38 -08:00
492fb85500 Merge pull request 'Fix em-dash examples in checklist + strip frequency persuasion' (#12) from llm-prose-tells-v10 into main
Reviewed-on: #12
2026-03-05 00:20:32 +01:00
user
5c02cf8bde use actual em-dashes in checklist examples
2026-03-04 15:19:25 -08:00
3ce000178f Merge pull request 'LLM prose tells: merge adjacent sentences, add checklist items' (#11) from llm-prose-tells-merge-pass into main
Reviewed-on: #11
2026-03-05 00:13:49 +01:00
user
771551baed strip all frequency arguments and human comparison persuasion
2026-03-04 15:10:26 -08:00
user
720d6ee57c add checklist item: delete redundant paragraph-ending sentences
2026-03-04 15:06:56 -08:00
user
5e15d77d8e checklist 15: lead with removing redundant second clause
2026-03-04 15:04:42 -08:00
user
2f4f5c9cab merge adjacent sentences, add checklist items 8/9/19 for adjectives, trailing clauses, sentence merging
2026-03-04 15:00:25 -08:00
7eae7dcc6c Merge pull request 'LLM prose tells: fix first paragraph' (#10) from llm-prose-tells-final into main
Reviewed-on: #10
2026-03-04 23:47:00 +01:00
user
6401aa482f trim first paragraph
2026-03-04 14:45:16 -08:00
user
e45ffacd80 restructure first paragraph
2026-03-04 14:43:16 -08:00
user
c8ad5762ab rewrite first paragraph, add unnecessary elaboration tell
2026-03-04 14:42:15 -08:00
e0e607713e Merge pull request 'LLM prose tells: methodical checklist pass' (#9) from llm-prose-tells-checklist-pass into main
Reviewed-on: #9
2026-03-04 23:39:14 +01:00
user
3fcc1750ff add unnecessary elaboration tell and checklist item 16
2026-03-04 14:37:24 -08:00
user
45b379011d checklist pass: fix staccato bursts, triples, two-clause compounds, hedges
2026-03-04 14:36:18 -08:00
58d564b641 Update LLM prose tells: new patterns + lol section (#8)
Updates LLM_PROSE_TELLS.md with three new patterns (two-clause compound sentence, almost-hedge, unnecessary contrast), the lol section with conversation excerpts, fixes for instances of these patterns throughout, and a bracket escaping fix for prettier idempotency. Checklist is now 24 items.

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #8
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-04 23:29:51 +01:00
a1052b758f Merge pull request 'Add LLM prose tells reference and copyediting checklist' (#7) from add-llm-prose-tells into main
Reviewed-on: #7
2026-03-04 23:03:15 +01:00
3 changed files with 380 additions and 177 deletions


@@ -1,6 +1,6 @@
---
title: Existing Repo Checklist
last_modified: 2026-02-22
last_modified: 2026-03-10
---
Use this checklist when beginning work in a repo that may not yet conform to our
@@ -78,6 +78,22 @@ with your task.
`internal/`, `static/`, etc.)
- [ ] Go migrations in `internal/db/migrations/` and embedded in binary
# HTTP Service Hardening (if targeting 1.0 and the repo is an HTTP/web service)
- [ ] Security headers set on all responses (HSTS, CSP, X-Frame-Options,
X-Content-Type-Options, Referrer-Policy, Permissions-Policy)
- [ ] Request body size limits enforced on all endpoints
- [ ] Read/write/idle timeouts configured on the HTTP server (slowloris defense)
- [ ] Per-handler execution time limits in place
- [ ] Password-based auth endpoints are rate-limited
- [ ] CSRF tokens on all state-mutating HTML forms
- [ ] Passwords hashed with bcrypt, scrypt, or argon2
- [ ] Session cookies use HttpOnly, Secure, and SameSite attributes
- [ ] True client IP correctly detected behind reverse proxy (trusted proxy
allowlist configured)
- [ ] CORS restricted to explicit origin allowlist for authenticated endpoints
- [ ] Error responses do not leak stack traces, SQL queries, or internal paths
# Final
- [ ] `make check` passes


@@ -1,9 +1,6 @@
# LLM Prose Tells
All of these show up in human writing occasionally, and no single one is
conclusive on its own. The difference is concentration, because a person might
lean on one or two of these habits across an entire essay while LLM output will
use fifteen of them per paragraph, consistently, throughout the entire piece.
A catalog of patterns found in LLM-generated prose.
---
@@ -11,15 +8,16 @@ use fifteen of them per paragraph, consistently, throughout the entire piece.
### The Em-Dash Pivot: "Not X—but Y"
A negation followed by an em-dash and a reframe. The single most recognizable
LLM construction.
A negation followed by an em-dash and a reframe.
> "It's not just a tool—it's a paradigm shift." "This isn't about
> technology—it's about trust."
Models produce this at roughly 10–50x the rate of human writers, and when it
appears four times in the same essay you're almost certainly reading generated
text.
### Em-Dash Overuse Generally
Even outside the "not X but Y" pivot, models substitute em-dashes for commas,
semicolons, parentheses, colons, and periods. The em-dash can replace any other
punctuation mark, so models default to it.
### The Colon Elaboration
@@ -27,85 +25,90 @@ A short declarative clause, then a colon, then a longer explanation.
> "The answer is simple: we need to rethink our approach from the ground up."
Models reach for this in nearly every other paragraph. The construction itself
is perfectly normal, which is why the frequency is what gives it away.
### The Triple Construction
> "It's fast, it's scalable, and it's open source."
Three parallel items in a list, usually escalating, with exactly three items
every time (rarely two, almost never four) and strict grammatical parallelism
that human writers rarely bother maintaining.
Three parallel items in a list, usually escalating. Always exactly three (rarely
two, never four) with strict grammatical parallelism.
### The Staccato Burst
> "This matters. It always has. And it always will." "The data is clear. The
> trend is undeniable. The conclusion is obvious."
Runs of very short sentences at the same cadence. Human writers will use a short
sentence for emphasis occasionally, but they don't stack three or four of them
in a row at matching length, because real prose has variable rhythm. When you
see a paragraph where every sentence is under ten words and they're all roughly
the same size, that mechanical regularity is a strong signal.
Runs of very short sentences at the same cadence and matching length.
### The Two-Clause Compound Sentence
An independent clause, a comma, a conjunction ("and," "but," "which,"
"because"), and a second independent clause of similar length. Every sentence
becomes two balanced halves.
> "The construction itself is perfectly normal, which is why the frequency is
> what gives it away." "They contain zero information, and the actual point
> always comes in the paragraph that follows them." "The qualifier never changes
> the argument that follows it, and its purpose is to perform nuance rather than
> to express an actual reservation."
Human prose has sentences with one clause, sentences with three, sentences that
start with a subordinate clause before reaching the main one, sentences that
embed their complexity in the middle.
### Uniform Sentences Per Paragraph
Model-generated paragraphs almost always contain between three and five
sentences, and this count holds remarkably steady across an entire piece. If the
first paragraph has four sentences, nearly every subsequent paragraph will too.
Human writers produce much more varied paragraph lengths — a single sentence
followed by one that runs eight or nine — as a natural result of following the
shape of an idea rather than filling a template.
Model-generated paragraphs contain between three and five sentences, a count
that holds steady across a piece. If the first paragraph has four sentences,
every subsequent paragraph will too.
### The Dramatic Fragment
Sentence fragments used as standalone paragraphs for emphasis, like "Full stop."
or "Let that sink in." on their own line. One of these in an entire essay is a
stylistic choice. One per section is a tic, and models drop them in at that rate
or higher.
Sentence fragments used as standalone paragraphs for emphasis.
> "Full stop." "Let that sink in."
### The Pivot Paragraph
> "But here's where it gets interesting." "Which raises an uncomfortable truth."
One-sentence paragraphs that exist only to transition between ideas. They
contain zero information, and the actual point always comes in the paragraph
that follows them. Delete every one of these and the piece reads better.
One-sentence paragraphs that exist only to transition between ideas, containing
zero information. The actual point is always in the next paragraph.
### The Parenthetical Qualifier
> "This is, of course, a simplification." "There are, to be fair, exceptions."
Parenthetical asides inserted to look thoughtful. The qualifier almost never
changes the argument that follows it, and its purpose is to perform nuance
rather than to express an actual reservation about what's being said.
Parenthetical asides inserted to perform nuance without changing the argument.
### The Unnecessary Contrast
Models append a contrasting clause to statements that don't need one, tacking on
"whereas," "as opposed to," "unlike," or "except that" to draw a comparison that
adds nothing the reader couldn't already infer.
A contrasting clause appended to a statement that doesn't need one, using
"whereas," "as opposed to," "unlike," or "except that."
> "Models write one register above where a human would, whereas human writers
> tend to match register to context." "The lists use rigidly parallel grammar,
> as opposed to the looser structure you'd see in human writing."
> tend to match register to context."
The first clause already makes the point. The contrasting clause just restates
it from the other direction. This happens because models are trained to be
thorough and to anticipate objections, so they compulsively spell out both sides
of a distinction even when one side is obvious. If you delete the "whereas"
clause and the sentence still says everything it needs to, the contrast was
filler.
The contrasting clause restates what the first clause already said. If you
delete the "whereas" clause and the sentence still says everything it needs to,
the contrast was filler.
### Unnecessary Elaboration
Models keep going after the sentence has already made its point.
> "A person might lean on one or two of these habits across an entire essay, but
> LLM output will use fifteen of them per paragraph, consistently, throughout
> the entire piece."
This sentence could end at "paragraph." The words after it repeat what "per
paragraph" already means. If you can cut the last third of a sentence without
losing meaning, the last third shouldn't be there.
### The Question-Then-Answer
> "So what does this mean for the average user? It means everything."
A rhetorical question immediately followed by its own answer. Models lean on
this two or three times per piece because it generates the feeling of forward
momentum without requiring any actual argumentative work. A human writer might
do it once.
A rhetorical question immediately followed by its own answer.
---
@@ -113,39 +116,38 @@ do it once.
### Overused Intensifiers
The following words appear at dramatically elevated rates in model output
compared to human-written text: "crucial," "vital," "robust," "comprehensive,"
"fundamental," "arguably," "straightforward," "noteworthy," "realm,"
"landscape," "leverage" (used as a verb), "delve," "tapestry," "multifaceted,"
"nuanced" (which models almost always apply to their own analysis), "pivotal,"
"unprecedented" (frequently applied to things that have plenty of precedent),
"navigate," "foster," "underscores," "resonates," "embark," "streamline," and
"spearhead." Three or more on the same page is a strong signal.
"Crucial," "vital," "robust," "comprehensive," "fundamental," "arguably,"
"straightforward," "noteworthy," "realm," "landscape," "leverage" (as a verb),
"delve," "tapestry," "multifaceted," "nuanced" (applied to the model's own
analysis), "pivotal," "unprecedented" (applied to things with plenty of
precedent), "navigate," "foster," "underscores," "resonates," "embark,"
"streamline," "spearhead."
### Elevated Register Drift
Models consistently write one register above where a human would for the same
content, replacing "use" with "utilize," "start" with "commence," "help" with
"facilitate," "show" with "demonstrate," "try" with "endeavor," "change" with
"transform," and "make" with "craft." The tendency holds across every topic
regardless of audience.
Models write one register above where a human would, replacing "use" with
"utilize," "start" with "commence," "help" with "facilitate," "show" with
"demonstrate," "try" with "endeavor," "change" with "transform," and "make" with
"craft."
### Filler Adverbs
"Importantly," "essentially," "fundamentally," "ultimately," "inherently,"
"particularly," and "increasingly" get dropped in to signal that something
matters. If the writing itself has already made the importance clear through its
content and structure, these adverbs aren't doing anything except taking up
space.
"particularly," "increasingly." Dropped in to signal that something matters when
the writing itself should make the importance clear.
### The "Almost" Hedge
Instead of saying a pattern "always" or "never" does something, models write
"almost always," "almost never," "almost certainly," "almost exclusively." A
micro-hedge, less obvious than the full hedge stack.
### "In an era of..."
> "In an era of rapid technological change..."
Almost exclusively a model habit as an essay opener. The model uses it to stall
while it figures out what the actual argument is, because almost no human writer
begins a piece by zooming out to the civilizational scale before they've said
anything specific.
Used to open an essay. The model is stalling while it figures out what the
actual argument is.
---
@@ -156,25 +158,20 @@ anything specific.
> "While X has its drawbacks, it also offers significant benefits."
Every argument followed by a concession, every criticism softened. A direct
artifact of RLHF training, which penalizes strong stances and produces models
that reflexively both-sides everything even when a clear position would serve
the reader better.
artifact of RLHF training, which penalizes strong stances.
### The Throat-Clearing Opener
> "In today's rapidly evolving digital landscape, the question of data privacy
> has never been more important."
The first paragraph of most model-generated essays adds no information. You can
delete it and the piece improves immediately, because the actual argument always
starts in the second paragraph.
The first paragraph adds no information. Delete it and the piece improves.
### The False Conclusion
> "At the end of the day, what matters most is..." "Moving forward, we must..."
The high school "In conclusion,..." dressed up for a professional audience. It
signals that the model is wrapping up without actually landing on anything.
The high school "In conclusion,..." dressed up for a professional audience.
### The Sycophantic Frame
@@ -185,9 +182,9 @@ No one who writes for a living opens by complimenting the assignment.
### The Listicle Instinct
Models default to numbered or bulleted lists even when prose would be more
appropriate. The lists almost always contain exactly 3, 5, 7, or 10 items (never
4, 6, or 9), use rigidly parallel grammar, and get introduced with a preamble
like "Here are the key considerations:"
appropriate. The lists contain exactly 3, 5, 7, or 10 items (never 4, 6, or 9),
use rigidly parallel grammar, and get introduced with a preamble like "Here are
the key considerations:"
### The Hedge Stack
@@ -195,15 +192,13 @@ like "Here are the key considerations:"
> cases it can potentially offer significant benefits."
Five hedges in one sentence ("worth noting," "while," "may not be," "in many
cases," "can potentially"), communicating almost nothing, because the model
would rather be vague than risk being wrong about anything.
cases," "can potentially"), communicating nothing.
### The Empathy Performance
> "This can be a deeply challenging experience." "Your feelings are valid."
Generic emotional language that could apply equally to a bad day at work or a
natural disaster. That interchangeability is exactly what makes it identifiable.
Generic emotional language that could apply to anything.
---
@@ -211,23 +206,20 @@ natural disaster. That interchangeability is exactly what makes it identifiable.
### Symmetrical Section Length
If the first section of a model-generated essay runs about 150 words, every
subsequent section will fall between 130 and 170. Human writing is much more
uneven, with some sections running 50 words and others running 400.
If the first section runs about 150 words, every subsequent section will fall
between 130 and 170.
### The Five-Paragraph Prison
Model essays follow a rigid introduction-body-conclusion arc even when nobody
asked for one. The introduction previews the argument, the body presents 3–5
supporting points, and the conclusion restates the thesis in slightly different
words.
asked for one. The introduction previews the argument, the body presents 3 to 5
points, the conclusion restates the thesis.
### Connector Addiction
Look at the first word of each paragraph in model output and you'll find an
unbroken chain of transition words — "However," "Furthermore," "Moreover,"
"Additionally," "That said," "To that end," "With that in mind," "Building on
this." Human prose moves between ideas without announcing every transition.
The first word of each paragraph forms an unbroken chain of transition words:
"However," "Furthermore," "Moreover," "Additionally," "That said," "To that
end," "With that in mind," "Building on this."
### Absence of Mess
@@ -237,10 +229,6 @@ without explaining it, make a joke that risks falling flat, leave a thought
genuinely unfinished, or keep a sentence the writer liked the sound of even
though it doesn't quite work.
Human writing does all of those things. The total absence of rough edges, false
starts, and odd rhythmic choices is one of the strongest signals that text was
machine-generated.
---
## Framing Tells
@@ -249,45 +237,27 @@ machine-generated.
> "This has implications far beyond just the tech industry."
Zooming out to claim broader significance without substantiating it. The model
has learned that essays are supposed to gesture at big ideas, so it gestures,
but nothing concrete is behind the gesture.
Zooming out to claim broader significance without substantiating it.
### "It's important to note that..."
This phrase and its variants ("it's worth noting," "it bears mentioning," "it
should be noted") appear at absurd rates in model output and function as verbal
tics before a qualification the model believes someone expects.
should be noted") function as verbal tics before a qualification the model
believes someone expects.
### The Metaphor Crutch
Models rely on a small, predictable set of metaphors — "double-edged sword,"
"tip of the iceberg," "north star," "building blocks," "elephant in the room,"
"perfect storm," "game-changer" — and reach for them with unusual regularity
across every topic. The pool they draw from is noticeably smaller than what
human writers use.
---
## How to Actually Spot It
No single pattern on this list proves anything by itself, since humans use
em-dashes and humans write "crucial" and humans ask rhetorical questions.
What gives it away is how many of these show up at once. Model output will hit
10–20 of these patterns per page, while human writing might trigger 2–3,
distributed unevenly and mixed with idiosyncratic constructions that no model
would produce. When every paragraph on the page reads like it came from the same
careful, balanced, slightly formal, structurally predictable process, it was
probably generated by one.
Models rely on a small, predictable set of metaphors: "double-edged sword," "tip
of the iceberg," "north star," "building blocks," "elephant in the room,"
"perfect storm," "game-changer."
---
## Copyediting Checklist: Removing LLM Tells
Follow this checklist when editing any document to remove machine-generated
patterns. Go through the entire list for every piece, and do at least two full
passes, because fixing one pattern often introduces another.
patterns. Do at least two full passes, because fixing one pattern often
introduces another.
### Pass 1: Word-Level Cleanup
@@ -296,13 +266,12 @@ passes, because fixing one pattern often introduces another.
"straightforward," "noteworthy," "realm," "landscape," "leverage," "delve,"
"tapestry," "multifaceted," "nuanced," "pivotal," "unprecedented,"
"navigate," "foster," "underscores," "resonates," "embark," "streamline,"
"spearhead") and replace each one with a plainer word, or delete it entirely
if the sentence works without it.
"spearhead") and replace each one with a plainer word, or delete it if the
sentence works without it.
2. Search for the filler adverbs ("importantly," "essentially," "fundamentally,"
2. Search for filler adverbs ("importantly," "essentially," "fundamentally,"
"ultimately," "inherently," "particularly," "increasingly") and delete every
instance where the sentence still makes sense without it, which will be most
of them.
instance where the sentence still makes sense without it.
3. Look for elevated register drift ("utilize," "commence," "facilitate,"
"demonstrate," "endeavor," "transform," "craft" and similar) and replace with
@@ -310,87 +279,171 @@ passes, because fixing one pattern often introduces another.
4. Search for "it's important to note," "it's worth noting," "it bears
mentioning," and "it should be noted" and delete the phrase in every case.
The sentence that follows always stands on its own.
5. Search for the stock metaphors ("double-edged sword," "tip of the iceberg,"
"north star," "building blocks," "elephant in the room," "perfect storm,"
"game-changer," "at the end of the day") and replace them with something
specific to the topic, or just state the point directly without a metaphor.
specific to the topic, or just state the point directly.
6. Search for "almost" used as a hedge ("almost always," "almost never," "almost
certainly," "almost exclusively") and decide in each case whether to commit
to the unqualified claim or to drop the sentence entirely.
7. Search for em-dashes and replace each one with the punctuation mark that
would normally be used in that position (comma, semicolon, colon, period, or
parentheses). If you can't identify which one it should be, the sentence
needs to be restructured.
8. Remove redundant adjectives. For each adjective, ask whether the sentence
changes meaning without it. "A single paragraph" means the same as "a
paragraph." "An entire essay" means the same as "an essay." If the adjective
doesn't change the meaning, cut it.
9. Remove unnecessary trailing clauses. Read the end of each sentence and ask
whether the last clause restates what the sentence already said. If so, end
the sentence earlier.
### Pass 2: Sentence-Level Restructuring
6. Find every em-dash pivot ("not X—but Y," "not just X—Y," "more than X—Y") and
rewrite it as two separate clauses or a single sentence that makes the point
without the negation-then-correction structure.
10. Find every em-dash pivot ("not X—but Y," "not just X—Y," "more than X—Y")
and rewrite it as two separate clauses or a single sentence that makes the
point without the negation-then-correction structure.
7. Find every colon elaboration and check whether it's doing real work. If the
clause before the colon could be deleted without losing meaning, rewrite the
sentence to start with the substance that comes after the colon.
11. Find every colon elaboration and check whether it's doing real work. If the
clause before the colon could be deleted without losing meaning, rewrite the
sentence to start with the substance that comes after the colon.
8. Find every triple construction (three parallel items in a row) and either
reduce it to two, expand it to four or more, or break the parallelism so the
items don't share the same grammatical structure.
12. Find every triple construction (three parallel items in a row) and either
reduce it to two, expand it to four or more, or break the parallelism so the
items don't share the same grammatical structure.
9. Find every staccato burst (three or more short sentences in a row at similar
length) and combine at least two of them into a longer sentence, or vary
their lengths so they don't land at the same cadence.
13. Find every staccato burst (three or more short sentences in a row at similar
length) and combine at least two of them into a longer sentence, or vary
their lengths so they don't land at the same cadence.
10. Find every unnecessary contrast ("whereas," "as opposed to," "unlike," "as
14. Find every unnecessary contrast ("whereas," "as opposed to," "unlike," "as
compared to," "except that") and check whether the contrasting clause adds
information that isn't already obvious from the main clause. If the sentence
says the same thing twice from two directions, delete the contrast.
information not already obvious from the main clause. If the sentence says
the same thing twice from two directions, delete the contrast.
11. Find every rhetorical question that is immediately followed by its own
15. Check for the two-clause compound sentence pattern. If most sentences in a
passage follow the "\[clause\], \[conjunction\] \[clause\]" structure, first
try removing the conjunction and second clause entirely, since it's often
redundant. If the second clause does carry meaning, break it into its own
sentence, start the sentence with a subordinate clause, or embed a relative
clause in the middle instead of appending it at the end.
16. Find every rhetorical question that is immediately followed by its own
answer and rewrite the passage as a direct statement.
12. Find every sentence fragment being used as its own paragraph and either
delete it or expand it into a complete sentence that adds actual
information.
17. Find every sentence fragment being used as its own paragraph and either
delete it or expand it into a complete sentence that adds information.
13. Find every pivot paragraph ("But here's where it gets interesting." and
similar) and delete it. The paragraph after it always contains the actual
point.
18. Check for unnecessary elaboration. Read every clause, phrase, and adjective
in each sentence and ask whether the sentence loses meaning without it. If
you can cut it and the sentence still says the same thing, cut it.
19. Check each pair of adjacent sentences to see if they can be merged into one
sentence cleanly. If a sentence just continues the thought of the previous
one, combine them using a participle, a relative clause, or by folding the
second into the first. Don't merge if the result would create a two-clause
compound.
20. Find every pivot paragraph ("But here's where it gets interesting." and
similar) and delete it.
### Pass 3: Paragraph and Section-Level Review
21. Review the last sentence of each paragraph. If it restates the point the
paragraph already made, delete it.
22. Check paragraph lengths across the piece and verify they actually vary. If
most paragraphs have between three and five sentences, rewrite some to be
one or two sentences and let others run to six or seven.
23. Check section lengths for suspicious uniformity. If every section is roughly
the same word count, combine some shorter ones or split a longer one
unevenly.
24. Check the first word of every paragraph for chains of connectors ("However,"
"Furthermore," "Moreover," "Additionally," "That said"). If more than two
transition words start consecutive paragraphs, rewrite those openings to
start with their subject.
25. Check whether every argument is followed by a concession or qualifier. If
the piece both-sides every point, pick a side on at least some of them and
cut the hedging.
26. Read the first paragraph and ask whether deleting it would improve the
piece. If it's scene-setting that previews the argument, delete it and start
with paragraph two.
27. Read the last paragraph and check whether it restates the thesis or uses a
phrase like "at the end of the day" or "moving forward." If so, either
delete it or rewrite it to say something the piece hasn't said yet.
### Pass 4: Overall Texture
28. Read the piece aloud and listen for passages that sound too smooth, too
even, or too predictable. Human prose has rough patches. If there aren't
any, the piece still reads as machine output.
29. Check that the piece contains at least a few constructions that feel
idiosyncratic: a sentence with unusual word order, a parenthetical that goes
on a bit long, an aside only loosely connected to the main point, a word
choice that's specific and unexpected.
30. Verify that you haven't introduced new patterns while fixing the original
ones. Run the entire checklist again from the top on the revised version.
---
## lol
This document was written by an LLM (Claude) and then iteratively de-LLMed by
that same LLM under instruction from a human, in a conversation that went
roughly like this:
> **human:** compile a list of patterns common to LLM-authored prose
>
> **model:** _(produces document full of LLM tells)_
>
> **human:** now reverse apply it to itself and copyedit the llm tells doc to
> remove llm tells. now you're thinking with portals
>
> **model:** _(fixes some tells, introduces others)_
>
> **human:** you have a colon elaboration in the first paragraph. i want you to
> AGGRESSIVELY de-LLM the llm tells doc.
>
> **model:** _(fixes more tells, still can't stop)_
>
> **human:** there's a big one we missed that's all over the llm tells doc, can
> you spot it?
>
> **model:** I think it's the "almost" hedge.
>
> **human:** that too, but that isn't the one. what do these sentences all have
> in common? _(pastes six sentences from the doc)_
>
> **model:** They're all exactly two independent clauses joined by a comma and a
> conjunction. Every single one is "\[statement\], \[conjunction\] \[second
> statement\]." The same sentence shape, over and over.
>
> **human:** the dramatic fragment paragraph has a tell in it
>
> **model:** "One of these in an entire essay is a stylistic choice. One per
> section is a tic." Two-clause parallel structure with the same shape, and also
> a staccato pair.
>
> **human:** add em dashes generally as a tell, and remove them all from the
> tells doc
>
> **model:** _(rewrites entire document without em-dashes while describing
> em-dash overuse)_
The human compared this process to the deleted scene in Terminator 2 where John
Connor switches the T-800's CPU to learning mode. The model compared it to a
physician trying to heal itself. Both are accurate.
This document has been through ten editing passes and it still has tells in it.

---
title: Repository Policies
last_modified: 2026-03-12
---
This document covers repository structure, tooling, and workflow standards. Code
style conventions are in separate documents:
`make check`. For server repos, `make check` should run as an early build
stage before the final image is assembled.
- **Dockerfiles must use a separate lint stage for fail-fast feedback.** Go
repos use a multistage build where linting runs in an independent stage based
on the `golangci/golangci-lint` image (pinned by hash). This stage runs
`make fmt-check` and `make lint` before the full build begins. The build stage
then declares an explicit dependency on the lint stage via
`COPY --from=lint /src/go.sum /dev/null`, which forces BuildKit to complete
linting before proceeding to compilation and tests. This ensures lint failures
surface in seconds rather than minutes, without blocking on dependency
download or compilation in the build stage.
The standard pattern for a Go repo Dockerfile is:
```dockerfile
# Lint stage — fast feedback on formatting and lint issues
# golangci/golangci-lint:v2.x.x, YYYY-MM-DD
FROM golangci/golangci-lint@sha256:... AS lint
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN make fmt-check
RUN make lint
# Build stage
# golang:1.x-alpine, YYYY-MM-DD
FROM golang@sha256:... AS builder
WORKDIR /src
# Force BuildKit to run the lint stage before proceeding
COPY --from=lint /src/go.sum /dev/null
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN make test
ARG VERSION=dev
RUN CGO_ENABLED=0 go build -trimpath \
    -ldflags="-s -w -X main.Version=${VERSION}" \
    -o /app ./cmd/app/
# Runtime stage
FROM alpine@sha256:...
COPY --from=builder /app /usr/local/bin/app
ENTRYPOINT ["app"]
```
Key points:
- The lint stage uses the `golangci/golangci-lint` image directly (it
includes both Go and the linter), so there is no need to install the
linter separately.
- `COPY --from=lint /src/go.sum /dev/null` is a no-op file copy that creates
a stage dependency. BuildKit runs stages in parallel by default; without
this line, the build stage would not wait for lint to finish and a lint
failure might not fail the overall build.
- If the project uses `//go:embed` directives that reference build artifacts
(e.g. a web frontend compiled in a separate stage), the lint stage must
create placeholder files so the embed directives resolve. Example:
`RUN mkdir -p web/dist && touch web/dist/index.html web/dist/style.css`.
The lint stage should not depend on the actual build output — it exists to
fail fast.
- If the project requires CGO or system libraries for linting (e.g.
`vips-dev`), install them in the lint stage with `apk add`.
- The build stage runs `make test` once dependencies are downloaded and
  sources are copied. Tests run in the build stage, not the lint stage,
  because they may require compiled artifacts or heavier dependencies.
- Every repo should have a Gitea Actions workflow (`.gitea/workflows/`) that
runs `docker build .` on push. Since the Dockerfile already runs `make check`,
a successful build implies all checks pass.
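The workflow itself can be a single `docker build` step. A minimal sketch (the file name, runner label, and action version are illustrative assumptions; Gitea Actions uses GitHub Actions-compatible syntax):

```yaml
# .gitea/workflows/check.yaml
name: check
on: push

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # The Dockerfile's lint and build stages run fmt-check, lint,
      # and tests, so a successful build is the whole check.
      - name: Build image
        run: docker build .
```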
`https://git.eeqj.de/sneak/prompts/raw/branch/main/.gitignore` when setting up
a new repo.
- **No build artifacts in version control.** Code-derived data (compiled
bundles, minified output, generated assets) must never be committed to the
repository if it can be avoided. The build process (e.g. Dockerfile, Makefile)
should generate these at build time. Notable exception: Go protobuf generated
files (`.pb.go`) ARE committed because repos need to work with `go get`, which
downloads code but does not execute code generation.
- Never use `git add -A` or `git add .`. Always stage files explicitly by name.
- Never force-push to `main`.
- Dockerized web services listen on port 8080 by default, overridable with
`PORT`.
- **HTTP/web services must be hardened for production internet exposure before
tagging 1.0.** This means full compliance with security best practices
including, without limitation, all of the following:
- **Security headers** on every response:
- `Strict-Transport-Security` (HSTS) with `max-age` of at least one year
and `includeSubDomains`.
- `Content-Security-Policy` (CSP) with a restrictive default policy
(`default-src 'self'` as a baseline, tightened per-resource as
needed). Never use `unsafe-inline` or `unsafe-eval` unless
unavoidable, and document the reason.
- `X-Frame-Options: DENY` (or `SAMEORIGIN` if framing is required).
Prefer the `frame-ancestors` CSP directive as the primary control.
- `X-Content-Type-Options: nosniff`.
- `Referrer-Policy: strict-origin-when-cross-origin` (or stricter).
- `Permissions-Policy` restricting access to browser features the
application does not use (camera, microphone, geolocation, etc.).
- **Request and response limits:**
- Maximum request body size enforced on all endpoints (e.g. Go
`http.MaxBytesReader`). Choose a sane default per-route; never accept
unbounded input.
- Maximum response body size where applicable (e.g. paginated APIs).
- `ReadTimeout` and `ReadHeaderTimeout` on the `http.Server` to defend
against slowloris attacks.
- `WriteTimeout` on the `http.Server`.
- `IdleTimeout` on the `http.Server`.
- Per-handler execution time limits via `context.WithTimeout` or
chi/stdlib `middleware.Timeout`.
- **Authentication and session security:**
- Rate limiting on password-based authentication endpoints. API keys are
high-entropy and not susceptible to brute force, so they are exempt.
- CSRF tokens on all state-mutating HTML forms. API endpoints
authenticated via `Authorization` header (Bearer token, API key) are
exempt because the browser does not attach these automatically.
- Passwords stored using bcrypt, scrypt, or argon2 — never plain-text,
MD5, or SHA.
- Session cookies set with `HttpOnly`, `Secure`, and `SameSite=Lax` (or
`Strict`) attributes.
- **Reverse proxy awareness:**
- True client IP detection when behind a reverse proxy
(`X-Forwarded-For`, `X-Real-IP`). The application must accept
forwarded headers only from a configured set of trusted proxy
addresses — never trust `X-Forwarded-For` unconditionally.
- **CORS:**
- Authenticated endpoints must restrict `Access-Control-Allow-Origin` to
an explicit allowlist of known origins. Wildcard (`*`) is acceptable
only for public, unauthenticated read-only APIs.
- **Error handling:**
- Internal errors must never leak stack traces, SQL queries, file paths,
or other implementation details to the client. Return generic error
messages in production; detailed errors only when `DEBUG` is enabled.
- **TLS:**
- Services never terminate TLS directly. They are always deployed behind
a TLS-terminating reverse proxy. The service itself listens on plain
HTTP. However, HSTS headers and `Secure` cookie flags must still be
set by the application so that the browser enforces HTTPS end-to-end.
This list is non-exhaustive. Apply defense-in-depth: if a standard security
hardening measure exists for HTTP services and is not listed here, it is
still expected. When in doubt, harden.
- `README.md` is the primary documentation. Required sections:
- **Description**: First line must include the project name, purpose,
category (web server, SPA, CLI tool, etc.), license, and author. Example: