From f921dee83902f9cae220867438fe7b9fc8df49de Mon Sep 17 00:00:00 2001
From: user
Date: Wed, 4 Mar 2026 14:00:45 -0800
Subject: [PATCH 1/2] add LLM prose tells reference and copyediting checklist

---
 prompts/LLM_PROSE_TELLS.md | 396 +++++++++++++++++++++++++++++++++++++
 1 file changed, 396 insertions(+)
 create mode 100644 prompts/LLM_PROSE_TELLS.md

diff --git a/prompts/LLM_PROSE_TELLS.md b/prompts/LLM_PROSE_TELLS.md
new file mode 100644
index 0000000..f08bde0
--- /dev/null
+++ b/prompts/LLM_PROSE_TELLS.md
@@ -0,0 +1,396 @@
+# LLM Prose Tells
+
+All of these show up in human writing occasionally, and no single one is
+conclusive on its own. The difference is concentration, because a person might
+lean on one or two of these habits across an entire essay while LLM output will
+use ten or more of them per page, consistently, throughout the entire piece.
+
+---
+
+## Sentence Structure
+
+### The Em-Dash Pivot: "Not X—but Y"
+
+A negation followed by an em-dash and a reframe. The single most recognizable
+LLM construction.
+
+> "It's not just a tool—it's a paradigm shift." "This isn't about
+> technology—it's about trust."
+
+Models produce this at roughly 10–50x the rate of human writers, and when it
+appears four times in the same essay you're almost certainly reading generated
+text.
+
+### The Colon Elaboration
+
+A short declarative clause, then a colon, then a longer explanation.
+
+> "The answer is simple: we need to rethink our approach from the ground up."
+
+Models reach for this in nearly every other paragraph. The construction itself
+is perfectly normal, which is why the frequency is what gives it away.
+
+### The Triple Construction
+
+> "It's fast, it's scalable, and it's open source."
+
+Three parallel items in a list, usually escalating, with exactly three items
+every time (rarely two, almost never four) and strict grammatical parallelism
+that human writers rarely bother maintaining.
+
+### The Staccato Burst
+
+> "This matters. It always has. And it always will." "The data is clear. The
+> trend is undeniable. The conclusion is obvious."
+
+Runs of very short sentences at the same cadence. Human writers will use a short
+sentence for emphasis occasionally, but they don't stack three or four of them
+in a row at matching length, because real prose has variable rhythm. When you
+see a paragraph where every sentence is under ten words and they're all roughly
+the same size, that mechanical regularity is a strong signal.
+
+### Uniform Sentences Per Paragraph
+
+Model-generated paragraphs almost always contain between three and five
+sentences, and this count holds remarkably steady across an entire piece. If the
+first paragraph has four sentences, nearly every subsequent paragraph will too.
+Human writers produce much more varied paragraph lengths — a single sentence
+followed by one that runs eight or nine — as a natural result of following the
+shape of an idea rather than filling a template.
+
+### The Dramatic Fragment
+
+Sentence fragments used as standalone paragraphs for emphasis, like "Full stop."
+or "Let that sink in." on their own line. One of these in an entire essay is a
+stylistic choice. One per section is a tic, and models drop them in at that rate
+or higher.
+
+### The Pivot Paragraph
+
+> "But here's where it gets interesting." "Which raises an uncomfortable truth."
+
+One-sentence paragraphs that exist only to transition between ideas. They
+contain zero information, and the actual point always comes in the paragraph
+that follows them. Delete every one of these and the piece reads better.
+
+### The Parenthetical Qualifier
+
+> "This is, of course, a simplification." "There are, to be fair, exceptions."
+
+Parenthetical asides inserted to look thoughtful. The qualifier almost never
+changes the argument that follows it, and its purpose is to perform nuance
+rather than to express an actual reservation about what's being said.
+
+### The Unnecessary Contrast
+
+Models append a contrasting clause to statements that don't need one, tacking on
+"whereas," "as opposed to," "unlike," or "except that" to draw a comparison that
+adds nothing the reader couldn't already infer.
+
+> "Models write one register above where a human would, whereas human writers
+> tend to match register to context." "The lists use rigidly parallel grammar,
+> as opposed to the looser structure you'd see in human writing."
+
+The first clause already makes the point. The contrasting clause just restates
+it from the other direction. This happens because models are trained to be
+thorough and to anticipate objections, so they compulsively spell out both sides
+of a distinction even when one side is obvious. If you delete the "whereas"
+clause and the sentence still says everything it needs to, the contrast was
+filler.
+
+### The Question-Then-Answer
+
+> "So what does this mean for the average user? It means everything."
+
+A rhetorical question immediately followed by its own answer. Models lean on
+this two or three times per piece because it generates the feeling of forward
+momentum without requiring any actual argumentative work. A human writer might
+do it once.
+
+---
+
+## Word Choice
+
+### Overused Intensifiers
+
+The following words appear at dramatically elevated rates in model output
+compared to human-written text: "crucial," "vital," "robust," "comprehensive,"
+"fundamental," "arguably," "straightforward," "noteworthy," "realm,"
+"landscape," "leverage" (used as a verb), "delve," "tapestry," "multifaceted,"
+"nuanced" (which models almost always apply to their own analysis), "pivotal,"
+"unprecedented" (frequently applied to things that have plenty of precedent),
+"navigate," "foster," "underscores," "resonates," "embark," "streamline," and
+"spearhead." Three or more on the same page is a strong signal.
+
+### Elevated Register Drift
+
+Models consistently write one register above where a human would for the same
+content, replacing "use" with "utilize," "start" with "commence," "help" with
+"facilitate," "show" with "demonstrate," "try" with "endeavor," "change" with
+"transform," and "make" with "craft." The tendency holds across every topic
+regardless of audience.
+
+### Filler Adverbs
+
+"Importantly," "essentially," "fundamentally," "ultimately," "inherently,"
+"particularly," and "increasingly" get dropped in to signal that something
+matters. If the writing itself has already made the importance clear through its
+content and structure, these adverbs aren't doing anything except taking up
+space.
+
+### "In an era of..."
+
+> "In an era of rapid technological change..."
+
+Almost exclusively a model habit as an essay opener. The model uses it to stall
+while it figures out what the actual argument is, because almost no human writer
+begins a piece by zooming out to the civilizational scale before they've said
+anything specific.
+
+---
+
+## Rhetorical Patterns
+
+### The Balanced Take
+
+> "While X has its drawbacks, it also offers significant benefits."
+
+Every argument followed by a concession, every criticism softened. A direct
+artifact of RLHF training, which penalizes strong stances and produces models
+that reflexively both-sides everything even when a clear position would serve
+the reader better.
+
+### The Throat-Clearing Opener
+
+> "In today's rapidly evolving digital landscape, the question of data privacy
+> has never been more important."
+
+The first paragraph of most model-generated essays adds no information. You can
+delete it and the piece improves immediately, because the actual argument always
+starts in the second paragraph.
+
+### The False Conclusion
+
+> "At the end of the day, what matters most is..." "Moving forward, we must..."
+
+The high school "In conclusion,..." dressed up for a professional audience. It
+signals that the model is wrapping up without actually landing on anything.
+
+### The Sycophantic Frame
+
+> "Great question!" "That's a really insightful observation."
+
+No one who writes for a living opens by complimenting the assignment.
+
+### The Listicle Instinct
+
+Models default to numbered or bulleted lists even when prose would be more
+appropriate. The lists almost always contain exactly 3, 5, 7, or 10 items (never
+4, 6, or 9), use rigidly parallel grammar, and get introduced with a preamble
+like "Here are the key considerations:"
+
+### The Hedge Stack
+
+> "It's worth noting that, while this may not be universally applicable, in many
+> cases it can potentially offer significant benefits."
+
+Five hedges in one sentence ("worth noting," "while," "may not be," "in many
+cases," "can potentially"), communicating almost nothing, because the model
+would rather be vague than risk being wrong about anything.
+
+### The Empathy Performance
+
+> "This can be a deeply challenging experience." "Your feelings are valid."
+
+Generic emotional language that could apply equally to a bad day at work or a
+natural disaster. That interchangeability is exactly what makes it identifiable.
+
+---
+
+## Structural Tells
+
+### Symmetrical Section Length
+
+If the first section of a model-generated essay runs about 150 words, every
+subsequent section will fall between 130 and 170. Human writing is much more
+uneven, with some sections running 50 words and others running 400.
+
+### The Five-Paragraph Prison
+
+Model essays follow a rigid introduction-body-conclusion arc even when nobody
+asked for one. The introduction previews the argument, the body presents 3–5
+supporting points, and the conclusion restates the thesis in slightly different
+words.
+
+### Connector Addiction
+
+Look at the first word of each paragraph in model output and you'll find an
+unbroken chain of transition words — "However," "Furthermore," "Moreover,"
+"Additionally," "That said," "To that end," "With that in mind," "Building on
+this." Human prose moves between ideas without announcing every transition.
+
+### Absence of Mess
+
+Model prose doesn't contradict itself mid-paragraph and then catch the
+contradiction, go on a tangent and have to walk it back, use an obscure idiom
+without explaining it, make a joke that risks falling flat, leave a thought
+genuinely unfinished, or keep a sentence the writer liked the sound of even
+though it doesn't quite work.
+
+Human writing does all of those things. The total absence of rough edges, false
+starts, and odd rhythmic choices is one of the strongest signals that text was
+machine-generated.
+
+---
+
+## Framing Tells
+
+### "Broader Implications"
+
+> "This has implications far beyond just the tech industry."
+
+Zooming out to claim broader significance without substantiating it. The model
+has learned that essays are supposed to gesture at big ideas, so it gestures,
+but nothing concrete is behind the gesture.
+
+### "It's important to note that..."
+
+This phrase and its variants ("it's worth noting," "it bears mentioning," "it
+should be noted") appear at absurd rates in model output and function as verbal
+tics before a qualification the model believes someone expects.
+
+### The Metaphor Crutch
+
+Models rely on a small, predictable set of metaphors — "double-edged sword,"
+"tip of the iceberg," "north star," "building blocks," "elephant in the room,"
+"perfect storm," "game-changer" — and reach for them with unusual regularity
+across every topic. The pool they draw from is noticeably smaller than what
+human writers use.
+
+---
+
+## How to Actually Spot It
+
+No single pattern on this list proves anything by itself, since humans use
+em-dashes and humans write "crucial" and humans ask rhetorical questions.
+
+What gives it away is how many of these show up at once. Model output will hit
+10–20 of these patterns per page, while human writing might trigger 2–3,
+distributed unevenly and mixed with idiosyncratic constructions that no model
+would produce. When every paragraph on the page reads like it came from the same
+careful, balanced, slightly formal, structurally predictable process, it was
+probably generated by one.
+
+---
+
+## Copyediting Checklist: Removing LLM Tells
+
+Follow this checklist when editing any document to remove machine-generated
+patterns. Go through the entire list for every piece, and do at least two full
+passes, because fixing one pattern often introduces another.
+
+### Pass 1: Word-Level Cleanup
+
+1. Search the document for every word in the overused intensifiers list
+   ("crucial," "vital," "robust," "comprehensive," "fundamental," "arguably,"
+   "straightforward," "noteworthy," "realm," "landscape," "leverage," "delve,"
+   "tapestry," "multifaceted," "nuanced," "pivotal," "unprecedented,"
+   "navigate," "foster," "underscores," "resonates," "embark," "streamline,"
+   "spearhead") and replace each one with a plainer word, or delete it entirely
+   if the sentence works without it.
+
+2. Search for the filler adverbs ("importantly," "essentially," "fundamentally,"
+   "ultimately," "inherently," "particularly," "increasingly") and delete every
+   instance where the sentence still makes sense without it, which will be most
+   of them.
+
+3. Look for elevated register drift ("utilize," "commence," "facilitate,"
+   "demonstrate," "endeavor," "transform," "craft" and similar) and replace with
+   the simpler word.
+
+4. Search for "it's important to note," "it's worth noting," "it bears
+   mentioning," and "it should be noted" and delete the phrase in every case.
+   The sentence that follows always stands on its own.
+
+5. Search for the stock metaphors ("double-edged sword," "tip of the iceberg,"
+   "north star," "building blocks," "elephant in the room," "perfect storm,"
+   "game-changer," "at the end of the day") and replace them with something
+   specific to the topic, or just state the point directly without a metaphor.
+
+### Pass 2: Sentence-Level Restructuring
+
+6. Find every em-dash pivot ("not X—but Y," "not just X—Y," "more than X—Y") and
+   rewrite it as two separate clauses or a single sentence that makes the point
+   without the negation-then-correction structure.
+
+7. Find every colon elaboration and check whether it's doing real work. If the
+   clause before the colon could be deleted without losing meaning, rewrite the
+   sentence to start with the substance that comes after the colon.
+
+8. Find every triple construction (three parallel items in a row) and either
+   reduce it to two, expand it to four or more, or break the parallelism so the
+   items don't share the same grammatical structure.
+
+9. Find every staccato burst (three or more short sentences in a row at similar
+   length) and combine at least two of them into a longer sentence, or vary
+   their lengths so they don't land at the same cadence.
+
+10. Find every unnecessary contrast ("whereas," "as opposed to," "unlike," "as
+    compared to," "except that") and check whether the contrasting clause adds
+    information that isn't already obvious from the main clause. If the sentence
+    says the same thing twice from two directions, delete the contrast.
+
+11. Find every rhetorical question that is immediately followed by its own
+    answer and rewrite the passage as a direct statement.
+
+12. Find every sentence fragment being used as its own paragraph and either
+    delete it or expand it into a complete sentence that adds actual
+    information.
+
+13. Find every pivot paragraph ("But here's where it gets interesting." and
+    similar) and delete it. The paragraph after it always contains the actual
+    point.
+
+### Pass 3: Paragraph and Section-Level Review
+
+14. Check paragraph lengths across the piece and verify they actually vary. If
+    most paragraphs have between three and five sentences, rewrite some to be
+    one or two sentences and let others run to six or seven.
+
+15. Check section lengths for suspicious uniformity. If every section is roughly
+    the same word count, combine some shorter ones or split a longer one
+    unevenly.
+
+16. Check the first word of every paragraph for chains of connectors ("However,"
+    "Furthermore," "Moreover," "Additionally," "That said"). If more than two
+    transition words start consecutive paragraphs, rewrite those openings to
+    start with their subject.
+
+17. Check whether every argument is followed by a concession or qualifier. If
+    the piece both-sides every point, pick a side on at least some of them and
+    cut the hedging.
+
+18. Read the first paragraph and ask whether deleting it would improve the
+    piece. If it's just scene-setting that previews the argument, delete it and
+    start with paragraph two.
+
+19. Read the last paragraph and check whether it restates the thesis or uses a
+    phrase like "at the end of the day" or "moving forward." If so, either
+    delete it or rewrite it to say something the piece hasn't said yet.
+
+### Pass 4: Overall Texture
+
+20. Read the piece aloud and listen for passages that sound too smooth, too
+    even, or too predictable. Human prose has rough patches. If there aren't
+    any, the piece still reads as machine output regardless of whether
+    individual patterns have been addressed.
+
+21. Check that the piece contains at least a few constructions that feel
+    idiosyncratic — a sentence with unusual word order, a parenthetical that
+    goes on a bit long, an aside only loosely connected to the main point, a
+    word choice that's specific and unexpected. If every sentence is clean and
+    correct and unremarkable, it will still read as generated.
+
+22. Verify that you haven't introduced new patterns while fixing the original
+    ones, which happens constantly. Run the entire checklist again from the top
+    on the revised version.

From a2dd95360178a106d166e4768ddc6afffaf9bfc1 Mon Sep 17 00:00:00 2001
From: user
Date: Wed, 4 Mar 2026 14:02:09 -0800
Subject: [PATCH 2/2] fmt: format REPO_POLICIES.md per prettier

---
 prompts/REPO_POLICIES.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/prompts/REPO_POLICIES.md b/prompts/REPO_POLICIES.md
index a024cbd..a0abedf 100644
--- a/prompts/REPO_POLICIES.md
+++ b/prompts/REPO_POLICIES.md
@@ -145,13 +145,13 @@ style conventions are in separate documents:
 
 - Database migrations live in `internal/db/migrations/` and must be embedded in
   the binary.
-  - `000_migration.sql` — contains ONLY the creation of the migrations tracking
-    table itself. Nothing else.
-  - `001_schema.sql` — the full application schema.
-  - **Pre-1.0.0:** never add additional migration files (002, 003, etc.). There
-    is no installed base to migrate. Edit `001_schema.sql` directly.
-  - **Post-1.0.0:** add new numbered migration files for each schema change.
-    Never edit existing migrations after release.
+    - `000_migration.sql` — contains ONLY the creation of the migrations
+      tracking table itself. Nothing else.
+    - `001_schema.sql` — the full application schema.
+    - **Pre-1.0.0:** never add additional migration files (002, 003, etc.).
+      There is no installed base to migrate. Edit `001_schema.sql` directly.
+    - **Post-1.0.0:** add new numbered migration files for each schema change.
+      Never edit existing migrations after release.
 
 - All repos should have an `.editorconfig` enforcing the project's indentation
   settings.