17 Commits

Author SHA1 Message Date
user
cedf98b83e rewrite first paragraph
All checks were successful
check / check (push) Successful in 5s
2026-03-04 14:41:58 -08:00
user
368fc0edf0 fix first paragraph
All checks were successful
check / check (push) Successful in 12s
2026-03-04 14:41:15 -08:00
user
3fcc1750ff add unnecessary elaboration tell and checklist item 16
All checks were successful
check / check (push) Successful in 5s
2026-03-04 14:37:24 -08:00
user
45b379011d checklist pass: fix staccato bursts, triples, two-clause compounds, hedges
All checks were successful
check / check (push) Successful in 8s
2026-03-04 14:36:18 -08:00
58d564b641 Update LLM prose tells: new patterns + lol section (#8)
All checks were successful
check / check (push) Successful in 3s
Updates LLM_PROSE_TELLS.md with three new patterns (two-clause compound sentence, almost-hedge, unnecessary contrast), the lol section with conversation excerpts, fixes for instances of these patterns throughout, and a bracket escaping fix for prettier idempotency. Checklist is now 24 items.

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #8
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-04 23:29:51 +01:00
a1052b758f Merge pull request 'Add LLM prose tells reference and copyediting checklist' (#7) from add-llm-prose-tells into main
All checks were successful
check / check (push) Successful in 4s
Reviewed-on: #7
2026-03-04 23:03:15 +01:00
user
a2dd953601 fmt: format REPO_POLICIES.md per prettier
All checks were successful
check / check (push) Successful in 8s
2026-03-04 14:02:09 -08:00
user
f921dee839 add LLM prose tells reference and copyediting checklist
Some checks failed
check / check (push) Failing after 10s
2026-03-04 14:00:45 -08:00
a1ffb1591b Merge pull request 'REPO_POLICIES: expand pre-1.0 schema migration rule (closes #5)' (#6) from clawbot/prompts:pre-1.0-migration-rule into main
Some checks failed
check / check (push) Failing after 10s
Reviewed-on: #6
2026-02-28 17:02:14 +01:00
clawbot
699f97d093 REPO_POLICIES: expand pre-1.0 schema migration rule (closes #5)
2026-02-28 07:59:27 -08:00
1955922857 Merge pull request 'fix: formatting + add clawpub reference' (#4) from clawbot/prompts:fix/formatting-and-readme into main
All checks were successful
check / check (push) Successful in 11s
Reviewed-on: #4
2026-02-28 11:45:56 +01:00
clawbot
a8cf966df6 docs: add clawpub reference in See Also section
2026-02-28 02:33:50 -08:00
clawbot
dcb6ca4339 fmt: fix prettier formatting in CODE_STYLEGUIDE_GO.md
2026-02-28 02:33:50 -08:00
dda0d01faa Merge pull request 'style(go): add rule against type-only packages (per upaas #126 review)' (#2) from clawbot/prompts:add-no-type-only-packages-rule into main
Some checks failed
check / check (push) Failing after 6s
Reviewed-on: #2
2026-02-23 22:14:02 +01:00
user
7676ec16c3 style(go): add Stringer rule for custom string-based types
2026-02-23 11:56:16 -08:00
user
f9dcef4c9e style(go): add rule against type-only packages
Types should live alongside their implementations, not in separate
'types', 'domain', or 'models' packages. Type-only packages cause
alias imports and indicate poor package design.

Prompted by review feedback on upaas PR #126.
2026-02-23 11:47:50 -08:00
189e54862e Add template repos section to README
All checks were successful
check / check (push) Successful in 7s
2026-02-23 01:38:38 +07:00
4 changed files with 570 additions and 2 deletions

@@ -115,6 +115,31 @@ subdirectory. Each file contains one or more related prompts or policy
documents. There is no build step or runtime component; the prompts are consumed
by copying them into other projects or referencing them directly.
## Template Repos
These template repositories implement the policies defined in this repo and
serve as starting points for new projects. They must be kept in sync when
policies change.
- **[template-app-go](https://git.eeqj.de/sneak/template-app-go)** — Go HTTP
server template (Uber fx, chi, SQLite, session auth, Prometheus metrics)
- **[template-app-js](https://git.eeqj.de/sneak/template-app-js)** — JavaScript
SPA template (Vite, Tailwind CSS v4, nginx Docker deployment)
- **[template-app-python](https://git.eeqj.de/sneak/template-app-python)** —
Python web application template (FastAPI, uvicorn, pytest, black, ruff)
When updating policies in this repo, also update the template repos to match
(Makefile targets, Dockerfile conventions, CI workflows, required files, etc.).
## See Also
- **[clawpub](https://git.eeqj.de/sneak/clawpub)** — Real-world examples,
rationale, and operational lessons from applying these policies with an
[OpenClaw](https://github.com/openclaw/openclaw) AI agent. Includes detailed
documentation on how the interlocking check system (CI → Docker → Makefile →
tests/lint/fmt) works in practice, why checklists complement prose policies,
and failure stories from production use.
## TODO
- Add more prompt templates for common development tasks

@@ -229,6 +229,29 @@ last_modified: 2026-02-22
1. Define your struct types near their constructors.
1. Do not create packages whose sole purpose is to hold type definitions.
Packages named `types`, `domain`, or `models` that contain only structs and
interfaces (with no behavior) are a code smell. Define types alongside the
code that uses them. Type-only packages force consuming packages into alias
imports and circular-dependency gymnastics, and indicate that the package
boundaries were drawn around nouns instead of responsibilities. If multiple
packages need the same type, put it in the package that owns the behavior,
or in a small, focused interface package — not in a grab-bag types package.
1. When defining custom string-based types (e.g. `type ImageID string`),
implement `fmt.Stringer`. Use `.String()` at SDK and library boundaries
instead of `string(v)`. This makes type conversions explicit, grep-able, and
consistent across the codebase. Example:
```go
type ContainerID string

func (id ContainerID) String() string { return string(id) }

// At the Docker SDK boundary (ContainerStart returns only an error):
err := c.docker.ContainerStart(ctx, id.String(), opts)
```
1. Define your interface types near the functions that use them, or if you have
multiple conformant types, put the interface(s) in their own file.
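
The rules in this hunk can be read together in one small program. What follows is a sketch under assumptions, not code from the style guide: `starter`, `startAll`, and `recorder` are hypothetical names invented for illustration. The interface sits next to the one function that consumes it, the string-based type implements `fmt.Stringer`, and the conversion to a plain string happens through `.String()` at the call boundary.

```go
package main

import "fmt"

// ContainerID is a custom string-based type, so it implements fmt.Stringer.
type ContainerID string

func (id ContainerID) String() string { return string(id) }

// starter is a hypothetical interface. It is declared next to the function
// that uses it instead of in a separate "types" package.
type starter interface {
	Start(id string) error
}

// startAll converts each ID with .String() where a plain string is required.
func startAll(s starter, ids []ContainerID) error {
	for _, id := range ids {
		if err := s.Start(id.String()); err != nil {
			return fmt.Errorf("start %s: %w", id, err)
		}
	}
	return nil
}

// recorder is a stand-in implementation used only for this example.
type recorder struct{ started []string }

func (r *recorder) Start(id string) error {
	r.started = append(r.started, id)
	return nil
}

func main() {
	r := &recorder{}
	if err := startAll(r, []ContainerID{"abc", "def"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r.started)
}
```

Searching the codebase for `.String()` then finds every place where a typed ID crosses into string-only code, which a bare `string(v)` conversion makes harder.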

prompts/LLM_PROSE_TELLS.md (new file, 514 lines)

@@ -0,0 +1,514 @@
# LLM Prose Tells
Every pattern in this document shows up in human writing occasionally. They
become diagnostic only through density. A person might use one or two across an
entire essay, but LLM output packs fifteen into a single paragraph.
---
## Sentence Structure
### The Em-Dash Pivot: "Not X—but Y"
A negation followed by an em-dash and a reframe.
> "It's not just a tool—it's a paradigm shift." "This isn't about
> technology—it's about trust."
The single most recognizable LLM construction. Models produce this at roughly 10
to 50 times the rate of human writers. Four of them in one essay and you know
what you're reading.
### Em-Dash Overuse Generally
Even outside the "not X but Y" pivot, models use em-dashes at far higher rates
than human writers. They substitute em-dashes for commas, semicolons,
parentheses, colons, and periods, often multiple times per paragraph. A human
writer might use one or two in an entire piece for a specific parenthetical
effect. Models scatter them everywhere because the em-dash can stand in for any
other punctuation mark, so they default to it. More than two or three per page
is a meaningful signal on its own.
### The Colon Elaboration
A short declarative clause, then a colon, then a longer explanation.
> "The answer is simple: we need to rethink our approach from the ground up."
Models reach for this in every other paragraph. The construction is perfectly
normal. The frequency gives it away.
### The Triple Construction
> "It's fast, it's scalable, and it's open source."
Three parallel items in a list, usually escalating. The count is three (rarely
two, never four), with strict grammatical parallelism that human writers rarely
bother maintaining.
### The Staccato Burst
> "This matters. It always has. And it always will." "The data is clear. The
> trend is undeniable. The conclusion is obvious."
Runs of very short sentences at the same cadence. Human writers use a short
sentence for emphasis occasionally, but stacking three or four of them in a row
at matching length creates a mechanical regularity that reads as generated.
### The Two-Clause Compound Sentence
Possibly the most pervasive structural tell, and easy to miss because each
individual instance looks like normal English. The model produces sentence after
sentence where an independent clause is followed by a comma, a conjunction
("and," "but," "which," "because"), and a second independent clause of similar
length. Every sentence becomes two balanced halves joined in the middle.
> "The construction itself is perfectly normal, which is why the frequency is
> what gives it away." "They contain zero information, and the actual point
> always comes in the paragraph that follows them." "The qualifier never changes
> the argument that follows it, and its purpose is to perform nuance rather than
> to express an actual reservation."
Human prose has sentences with one clause, sentences with three, sentences that
start with a subordinate clause before reaching the main one, sentences that
embed their complexity in the middle. When every sentence on the page has that
same two-part structure, the rhythm becomes monotonous in a way that's hard to
pinpoint but easy to feel.
### Uniform Sentences Per Paragraph
Model-generated paragraphs contain between three and five sentences. This count
holds steady across an entire piece. If the first paragraph has four sentences,
every subsequent paragraph will too. Human writers are much more varied (a
single sentence followed by one that runs eight or nine) because they follow the
shape of an idea, not a template.
### The Dramatic Fragment
Sentence fragments used as standalone paragraphs for emphasis, like "Full stop."
or "Let that sink in." on their own line. Using one in an entire essay is a
reasonable stylistic choice, but models drop them in once per section or more,
at which point it becomes a habit rather than a deliberate decision.
### The Pivot Paragraph
> "But here's where it gets interesting." "Which raises an uncomfortable truth."
One-sentence paragraphs that exist only to transition between ideas. They
contain zero information. The actual point is always in the next paragraph.
Delete every one of these and the piece reads better.
### The Parenthetical Qualifier
> "This is, of course, a simplification." "There are, to be fair, exceptions."
Parenthetical asides inserted to look thoughtful. The qualifier never changes
the argument that follows it. Its purpose is to perform nuance, not to express a
real reservation about what's being said.
### The Unnecessary Contrast
Models append a contrasting clause to statements that don't need one, tacking on
"whereas," "as opposed to," "unlike," or "except that" to draw a comparison the
reader could already infer.
> "Models write one register above where a human would, whereas human writers
> tend to match register to context."
The first clause already makes the point. The contrasting clause restates it
from the other direction. If you delete the "whereas" clause and the sentence
still says everything it needs to, the contrast was filler.
### Unnecessary Elaboration
Models keep going after the sentence has already made its point, tacking on
clarifying phrases, adverbial modifiers, or restatements that add nothing.
> "A person might lean on one or two of these habits across an entire essay, but
> LLM output will use fifteen of them per paragraph, consistently, throughout
> the entire piece."
This sentence could end at "paragraph." The words after it just repeat what "per
paragraph" already means. Models do this because they're optimizing for clarity
at the expense of concision, and because their training rewards thoroughness.
The result is prose that feels padded. If you can cut the last third of a
sentence without losing any meaning, the last third shouldn't be there.
### The Question-Then-Answer
> "So what does this mean for the average user? It means everything."
A rhetorical question immediately followed by its own answer. Models do this two
or three times per piece because it fakes forward momentum. A human writer might
do it once.
---
## Word Choice
### Overused Intensifiers
The following words appear at dramatically elevated rates in model output:
"crucial," "vital," "robust," "comprehensive," "fundamental," "arguably,"
"straightforward," "noteworthy," "realm," "landscape," "leverage" (as a verb),
"delve," "tapestry," "multifaceted," "nuanced" (which models apply to their own
analysis with startling regularity), "pivotal," "unprecedented" (frequently
applied to things with plenty of precedent), "navigate," "foster,"
"underscores," "resonates," "embark," "streamline," and "spearhead." Three or
more on the same page is a strong signal.
### Elevated Register Drift
Models write one register above where a human would. "Use" becomes "utilize."
"Start" becomes "commence." "Help" becomes "facilitate." "Show" becomes
"demonstrate." "Try" becomes "endeavor." "Change" becomes "transform." "Make"
becomes "craft." The tendency holds regardless of topic or audience.
### Filler Adverbs
"Importantly," "essentially," "fundamentally," "ultimately," "inherently,"
"particularly," "increasingly." Dropped in to signal that something matters,
which is unnecessary when the writing itself already makes the importance clear.
### The "Almost" Hedge
Models rarely commit to an unqualified statement. Instead of saying a pattern
"always" or "never" does something, they write "almost always," "almost never,"
"almost certainly," "almost exclusively." The word "almost" shows up at
extraordinary density in model-generated analytical prose. It's a micro-hedge,
less obvious than the full hedge stack but just as diagnostic when it appears
ten or fifteen times in a single document.
### "In an era of..."
> "In an era of rapid technological change..."
A model habit as an essay opener. The model uses it to stall while it figures
out what the actual argument is. Human writers don't begin a piece by zooming
out to the civilizational scale before they've said anything specific.
---
## Rhetorical Patterns
### The Balanced Take
> "While X has its drawbacks, it also offers significant benefits."
Every argument followed by a concession, every criticism softened. A direct
artifact of RLHF training, which penalizes strong stances. Models reflexively
both-sides everything even when a clear position would serve the reader better.
### The Throat-Clearing Opener
> "In today's rapidly evolving digital landscape, the question of data privacy
> has never been more important."
The first paragraph of most model-generated essays adds no information. Delete
it and the piece improves immediately. The actual argument starts in paragraph
two.
### The False Conclusion
> "At the end of the day, what matters most is..." "Moving forward, we must..."
The high school "In conclusion,..." dressed up for a professional audience.
Signals that the model is wrapping up without actually landing on anything.
### The Sycophantic Frame
> "Great question!" "That's a really insightful observation."
No one who writes for a living opens by complimenting the assignment.
### The Listicle Instinct
Models default to numbered or bulleted lists even when prose would be more
appropriate. The lists contain exactly 3, 5, 7, or 10 items (never 4, 6, or 9),
use rigidly parallel grammar, and get introduced with a preamble like "Here are
the key considerations:"
### The Hedge Stack
> "It's worth noting that, while this may not be universally applicable, in many
> cases it can potentially offer significant benefits."
Five hedges in one sentence ("worth noting," "while," "may not be," "in many
cases," "can potentially"), communicating nothing. The model would rather be
vague than risk being wrong about anything.
### The Empathy Performance
> "This can be a deeply challenging experience." "Your feelings are valid."
Generic emotional language that could apply equally to a bad day at work or a
natural disaster. That interchangeability is what makes it identifiable.
---
## Structural Tells
### Symmetrical Section Length
If the first section of a model-generated essay runs about 150 words, every
subsequent section will fall between 130 and 170. Human writing is much more
uneven, with 50 words in one section and 400 in the next.
### The Five-Paragraph Prison
Model essays follow a rigid introduction-body-conclusion arc even when nobody
asked for one. The introduction previews the argument, the body presents 3 to 5
points, and then the conclusion restates the thesis using slightly different
words.
### Connector Addiction
Look at the first word of each paragraph in model output. You'll find an
unbroken chain of transition words: "However," "Furthermore," "Moreover,"
"Additionally," "That said," "To that end," "With that in mind," "Building on
this." Human prose moves between ideas without announcing every transition.
### Absence of Mess
Model prose doesn't contradict itself mid-paragraph and then catch the
contradiction. It doesn't go on a tangent and have to walk it back, use an
obscure idiom without explaining it, make a joke that risks falling flat, leave
a thought genuinely unfinished, or keep a sentence the writer liked the sound of
even though it doesn't quite work.
Human writing does all of those things regularly. That total absence of rough
patches and false starts is one of the strongest signals that text was
machine-generated.
---
## Framing Tells
### "Broader Implications"
> "This has implications far beyond just the tech industry."
Zooming out to claim broader significance without substantiating it. The model
has learned that essays are supposed to gesture at big ideas, so it gestures.
Nothing concrete is behind the gesture.
### "It's important to note that..."
This phrase and its variants ("it's worth noting," "it bears mentioning," "it
should be noted") appear at absurd rates in model output. They function as
verbal tics before a qualification the model believes someone expects.
### The Metaphor Crutch
Models rely on a small, predictable set of metaphors ("double-edged sword," "tip
of the iceberg," "north star," "building blocks," "elephant in the room,"
"perfect storm," "game-changer") and reach for them with unusual regularity
across every topic. The pool is noticeably smaller than what human writers draw
from.
---
## How to Actually Spot It
No single pattern on this list proves anything by itself. Humans use em-dashes.
Humans write "crucial." Humans ask rhetorical questions.
What gives it away is how many of these show up at once. Model output will hit
10 to 20 of these patterns per page. Human writing might trigger 2 or 3,
distributed unevenly, mixed with idiosyncratic constructions no model would
produce. When every paragraph on the page reads like it came from the same
careful, balanced, slightly formal, structurally predictable process, it was
generated by one.
---
## Copyediting Checklist: Removing LLM Tells
Follow this checklist when editing any document to remove machine-generated
patterns. Go through the entire list for every piece. Do at least two full
passes, because fixing one pattern often introduces another.
### Pass 1: Word-Level Cleanup
1. Search the document for every word in the overused intensifiers list
("crucial," "vital," "robust," "comprehensive," "fundamental," "arguably,"
"straightforward," "noteworthy," "realm," "landscape," "leverage," "delve,"
"tapestry," "multifaceted," "nuanced," "pivotal," "unprecedented,"
"navigate," "foster," "underscores," "resonates," "embark," "streamline,"
"spearhead") and replace each one with a plainer word, or delete it if the
sentence works without it.
2. Search for filler adverbs ("importantly," "essentially," "fundamentally,"
"ultimately," "inherently," "particularly," "increasingly") and delete every
instance where the sentence still makes sense without it. That will be most
of them.
3. Look for elevated register drift ("utilize," "commence," "facilitate,"
"demonstrate," "endeavor," "transform," "craft" and similar) and replace with
the simpler word.
4. Search for "it's important to note," "it's worth noting," "it bears
mentioning," and "it should be noted" and delete the phrase in every case.
The sentence that follows always stands on its own.
5. Search for the stock metaphors ("double-edged sword," "tip of the iceberg,"
"north star," "building blocks," "elephant in the room," "perfect storm,"
"game-changer," "at the end of the day") and replace them with something
specific to the topic, or just state the point directly.
6. Search for "almost" used as a hedge ("almost always," "almost never," "almost
certainly," "almost exclusively") and decide in each case whether to commit
to the unqualified claim or to drop the sentence entirely. If the claim needs
"almost" to be true, it might not be worth making.
7. Search for em-dashes and replace each one with the punctuation mark that
would normally be used in that position (comma, semicolon, colon, period, or
parentheses). If you can't identify which one it should be, the sentence
needs to be restructured.
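
The searches in Pass 1 are mechanical enough to script. The sketch below is one possible approach and not part of the checklist itself: `countTells` and the shortened `flagged` list are assumptions made for illustration, and a real run would load the full word lists from items 1 through 6. It counts whole-word matches without regard to case and reports em-dashes under a separate key.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// flagged is a short excerpt of the word lists above; extend it as needed.
var flagged = []string{"crucial", "robust", "delve", "utilize", "essentially", "almost"}

// countTells returns how often each flagged word appears as a whole word,
// ignoring case, plus the number of em-dashes under the key "em-dash".
// Words with zero matches are left out of the result.
func countTells(text string) map[string]int {
	counts := make(map[string]int)
	lower := strings.ToLower(text)
	for _, w := range flagged {
		re := regexp.MustCompile(`\b` + regexp.QuoteMeta(w) + `\b`)
		if n := len(re.FindAllString(lower, -1)); n > 0 {
			counts[w] = n
		}
	}
	if n := strings.Count(text, "\u2014"); n > 0 {
		counts["em-dash"] = n
	}
	return counts
}

func main() {
	sample := "It is crucial to delve deeper\u2014essentially, a robust and crucial step."
	for word, n := range countTells(sample) {
		fmt.Printf("%s: %d\n", word, n)
	}
}
```

A count only locates candidates. Whether to replace, delete, or keep each hit is still the editor's call, as the checklist items describe.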
### Pass 2: Sentence-Level Restructuring
8. Find every em-dash pivot ("not X...but Y," "not just X...Y," "more than
X...Y") and rewrite it as two separate clauses or a single sentence that
makes the point without the negation-then-correction structure.
9. Find every colon elaboration and check whether it's doing real work. If the
clause before the colon could be deleted without losing meaning, rewrite the
sentence to start with the substance that comes after the colon.
10. Find every triple construction (three parallel items in a row) and either
reduce it to two, expand it to four or more, or break the parallelism so the
items don't share the same grammatical structure.
11. Find every staccato burst (three or more short sentences in a row at similar
length) and combine at least two of them into a longer sentence, or vary
their lengths so they don't land at the same cadence.
12. Find every unnecessary contrast ("whereas," "as opposed to," "unlike," "as
compared to," "except that") and check whether the contrasting clause adds
information not already obvious from the main clause. If the sentence says
the same thing twice from two directions, delete the contrast.
13. Check for the two-clause compound sentence pattern. If most sentences in a
passage follow the "\[clause\], \[conjunction\] \[clause\]" structure,
rewrite some of them. Break a few into two sentences. Start some with a
subordinate clause. Embed a relative clause in the middle of one instead of
appending it at the end. The goal is variety in sentence shape, not just
sentence length.
14. Find every rhetorical question that is immediately followed by its own
answer and rewrite the passage as a direct statement.
15. Find every sentence fragment being used as its own paragraph and either
delete it or expand it into a complete sentence that adds actual
information.
16. Check for unnecessary elaboration at the end of sentences. Read the last
clause or phrase of each sentence and ask whether the sentence would lose
any meaning without it. If not, cut it.
17. Find every pivot paragraph ("But here's where it gets interesting." and
similar) and delete it. The paragraph after it always contains the actual
point.
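
Item 11 describes a check that can be approximated in code. This is a rough sketch with assumed thresholds: the names `splitSentences` and `staccatoBursts`, the five-word limit, and the run length of three are illustrative choices, and the sentence splitter is deliberately naive (it will mis-split abbreviations and decimals).

```go
package main

import (
	"fmt"
	"strings"
)

// splitSentences is a naive splitter that cuts after '.', '!' or '?'.
func splitSentences(text string) []string {
	var out []string
	var cur strings.Builder
	for _, r := range text {
		cur.WriteRune(r)
		if r == '.' || r == '!' || r == '?' {
			if s := strings.TrimSpace(cur.String()); s != "" {
				out = append(out, s)
			}
			cur.Reset()
		}
	}
	if s := strings.TrimSpace(cur.String()); s != "" {
		out = append(out, s)
	}
	return out
}

// staccatoBursts returns each run of at least minRun consecutive sentences
// that contain maxWords words or fewer.
func staccatoBursts(text string, maxWords, minRun int) [][]string {
	var bursts [][]string
	var run []string
	flush := func() {
		if len(run) >= minRun {
			bursts = append(bursts, run)
		}
		run = nil
	}
	for _, s := range splitSentences(text) {
		if len(strings.Fields(s)) <= maxWords {
			run = append(run, s)
		} else {
			flush()
		}
	}
	flush()
	return bursts
}

func main() {
	text := "This matters. It always has. And it always will. " +
		"Longer sentences with many more words in them do not count toward a burst."
	for _, b := range staccatoBursts(text, 5, 3) {
		fmt.Println(strings.Join(b, " "))
	}
}
```

The same run-tracking shape would work for other Pass 2 checks that look for repetition across neighboring sentences, given a different predicate in place of the word count.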
### Pass 3: Paragraph and Section-Level Review
18. Check paragraph lengths across the piece and verify they actually vary. If
most paragraphs have between three and five sentences, rewrite some to be
one or two sentences and let others run to six or seven.
19. Check section lengths for suspicious uniformity. If every section is roughly
the same word count, combine some shorter ones or split a longer one
unevenly.
20. Check the first word of every paragraph for chains of connectors ("However,"
"Furthermore," "Moreover," "Additionally," "That said"). If more than two
transition words start consecutive paragraphs, rewrite those openings to
start with their subject.
21. Check whether every argument is followed by a concession or qualifier. If
the piece both-sides every point, pick a side on at least some of them and
cut the hedging.
22. Read the first paragraph and ask whether deleting it would improve the
piece. If it's scene-setting that previews the argument, delete it and start
with paragraph two.
23. Read the last paragraph and check whether it restates the thesis or uses a
phrase like "at the end of the day" or "moving forward." If so, either
delete it or rewrite it to say something the piece hasn't said yet.
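
Items 18 and 20 can be spot-checked with a few lines of code. The sketch below rests on simplifying assumptions: paragraphs are separated by blank lines, a sentence is approximated by counting terminal punctuation, and `connectors` holds only the examples named in item 20. All function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// connectors is the list of transition words quoted in item 20.
var connectors = []string{"However,", "Furthermore,", "Moreover,", "Additionally,", "That said,"}

// paragraphs splits text on blank lines and drops empty chunks.
func paragraphs(text string) []string {
	var out []string
	for _, p := range strings.Split(text, "\n\n") {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// sentenceCount is a rough count of sentence-ending punctuation marks.
func sentenceCount(p string) int {
	return strings.Count(p, ".") + strings.Count(p, "!") + strings.Count(p, "?")
}

// startsWithConnector reports whether a paragraph opens with a listed connector.
func startsWithConnector(p string) bool {
	for _, c := range connectors {
		if strings.HasPrefix(p, c) {
			return true
		}
	}
	return false
}

// inRangeCount returns how many paragraphs have three to five sentences.
func inRangeCount(ps []string) int {
	n := 0
	for _, p := range ps {
		if c := sentenceCount(p); c >= 3 && c <= 5 {
			n++
		}
	}
	return n
}

func main() {
	text := "One. Two. Three.\n\nHowever, four. Five. Six. Seven.\n\nMoreover, eight. Nine. Ten."
	ps := paragraphs(text)
	fmt.Printf("%d of %d paragraphs have three to five sentences\n", inRangeCount(ps), len(ps))
	for i, p := range ps {
		fmt.Printf("paragraph %d: %d sentences, connector opener: %v\n",
			i+1, sentenceCount(p), startsWithConnector(p))
	}
}
```

If nearly every paragraph lands in the three-to-five range, or connector openers appear back to back, those are the spots to rewrite first.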
### Pass 4: Overall Texture
24. Read the piece aloud and listen for passages that sound too smooth, too
even, or too predictable. Human prose has rough patches. If there aren't
any, the piece still reads as machine output.
25. Check that the piece contains at least a few constructions that feel
idiosyncratic: a sentence with unusual word order, a parenthetical that goes
on a bit long, an aside only loosely connected to the main point, a word
choice that's specific and unexpected. If every sentence is clean and
correct and unremarkable, it will still read as generated.
26. Verify that you haven't introduced new patterns while fixing the original
ones. This happens constantly. Run the entire checklist again from the top
on the revised version.
---
## lol
This document was written by an LLM (Claude) and then iteratively de-LLMed by
that same LLM under instruction from a human, in a conversation that went
roughly like this:
> **human:** compile a list of patterns common to LLM-authored prose
>
> **model:** _(produces document full of LLM tells)_
>
> **human:** now reverse apply it to itself and copyedit the llm tells doc to
> remove llm tells. now you're thinking with portals
>
> **model:** _(fixes some tells, introduces others)_
>
> **human:** you have a colon elaboration in the first paragraph. i want you to
> AGGRESSIVELY de-LLM the llm tells doc.
>
> **model:** _(fixes more tells, still can't stop)_
>
> **human:** there's a big one we missed that's all over the llm tells doc, can
> you spot it?
>
> **model:** I think it's the "almost" hedge.
>
> **human:** that too, but that isn't the one. what do these sentences all have
> in common? _(pastes six sentences from the doc)_
>
> **model:** They're all exactly two independent clauses joined by a comma and a
> conjunction. Every single one is "\[statement\], \[conjunction\] \[second
> statement\]." The same sentence shape, over and over.
>
> **human:** the dramatic fragment paragraph has a tell in it
>
> **model:** "One of these in an entire essay is a stylistic choice. One per
> section is a tic." Two-clause parallel structure with the same shape, and also
> a staccato pair.
>
> **human:** add em dashes generally as a tell, and remove them all from the
> tells doc
>
> **model:** _(rewrites entire document without em-dashes while describing
> em-dash overuse)_
>
> **human:** now run the checklist methodically on each paragraph
>
> **model:** _(finds staccato burst in the section about triple constructions, a
> triple in the section about absence of mess, two-clause compounds everywhere,
> and "almost" hedges in its own prose about em-dash overuse)_
The human compared this process to the deleted scene in Terminator 2 where John
Connor switches the T-800's CPU to learning mode. The model compared it to a
physician trying to heal itself. Both are accurate.
This document has been through eight editing passes and it still has tells in
it.

@@ -144,8 +144,14 @@ style conventions are in separate documents:
 - Use SemVer.
 - Database migrations live in `internal/db/migrations/` and must be embedded in
-  the binary. Pre-1.0.0: modify existing migrations (no installed base assumed).
-  Post-1.0.0: add new migration files.
+  the binary.
+  - `000_migration.sql` — contains ONLY the creation of the migrations
+    tracking table itself. Nothing else.
+  - `001_schema.sql` — the full application schema.
+  - **Pre-1.0.0:** never add additional migration files (002, 003, etc.).
+    There is no installed base to migrate. Edit `001_schema.sql` directly.
+  - **Post-1.0.0:** add new numbered migration files for each schema change.
+    Never edit existing migrations after release.
 - All repos should have an `.editorconfig` enforcing the project's indentation
   settings.
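
The pre-1.0.0 migration rule above lends itself to an automated check. The following is a minimal sketch and not part of the policy: `checkPre1Migrations` is a hypothetical helper, `002_add_users.sql` is an invented file name, and the sketch assumes the caller has already listed the contents of `internal/db/migrations/`.

```go
package main

import (
	"fmt"
	"sort"
)

// checkPre1Migrations is a hypothetical helper. Given the file names found in
// internal/db/migrations/, it returns the names that violate the pre-1.0.0
// rule, under which only 000_migration.sql and 001_schema.sql may exist.
func checkPre1Migrations(files []string) []string {
	allowed := map[string]bool{
		"000_migration.sql": true,
		"001_schema.sql":    true,
	}
	var extra []string
	for _, f := range files {
		if !allowed[f] {
			extra = append(extra, f)
		}
	}
	sort.Strings(extra)
	return extra
}

func main() {
	// 002_add_users.sql is an invented example of a file the rule forbids.
	files := []string{"000_migration.sql", "001_schema.sql", "002_add_users.sql"}
	if extra := checkPre1Migrations(files); len(extra) > 0 {
		fmt.Println("pre-1.0.0 repos must not contain:", extra)
	}
}
```

A check like this would only apply before the 1.0.0 release. After that point the policy reverses, and new numbered files are required instead of forbidden.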