add em-dash overuse tell, remove all em-dashes from prose, checklist now 25 items

user
2026-03-04 14:24:50 -08:00
parent 729fea84de
commit 318da3666c


@@ -1,7 +1,7 @@
 # LLM Prose Tells

 All of these show up in human writing occasionally. No single one is conclusive
-on its own. The difference is concentration a person might lean on one or two
+on its own. The difference is concentration; a person might lean on one or two
 of these habits across an entire essay, but LLM output will use fifteen of them
 per paragraph, consistently, throughout the entire piece.
@@ -9,17 +9,28 @@ per paragraph, consistently, throughout the entire piece.
 ## Sentence Structure

-### The Em-Dash Pivot: "Not Xbut Y"
+### The Em-Dash Pivot: "Not X...but Y"

 A negation followed by an em-dash and a reframe.

 > "It's not just a tool—it's a paradigm shift." "This isn't about
 > technology—it's about trust."

-The single most recognizable LLM construction. Models produce this at roughly
-1050x the rate of human writers. Four of them in one essay and you know what
+The single most recognizable LLM construction. Models produce this at roughly 10
+to 50x the rate of human writers. Four of them in one essay and you know what
 you're reading.

+### Em-Dash Overuse Generally
+
+Even outside the "not X but Y" pivot, models use em-dashes at far higher rates
+than human writers. They substitute em-dashes for commas, semicolons,
+parentheses, colons, and periods, often multiple times per paragraph. A human
+writer might use one or two in an entire piece for a specific parenthetical
+effect. Models scatter them everywhere because the em-dash is a flexible
+punctuation mark that can replace almost any other, and models default to
+flexible options. When a piece of prose has more than two or three em-dashes per
+page, that alone is a meaningful signal.
+
 ### The Colon Elaboration

 A short declarative clause, then a colon, then a longer explanation.
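
The "more than two or three em-dashes per page" signal in the new Em-Dash Overuse section can be approximated mechanically. The following is a minimal sketch, not part of the commit: the function names are hypothetical, and treating a page as 500 words is an assumption, since the document never defines a page.

```python
# Hypothetical helper; the 500-word page is an assumption, not from the document.
EM_DASH = "\u2014"  # written as an escape so this file holds no literal em-dash


def em_dashes_per_page(text: str, words_per_page: int = 500) -> float:
    """Return the em-dash count normalized to a page of `words_per_page` words."""
    words = len(text.split())
    if words == 0:
        return 0.0
    return text.count(EM_DASH) * words_per_page / words


def exceeds_em_dash_threshold(text: str, threshold: float = 3.0) -> bool:
    """True when the normalized rate is above the 'two or three per page' signal."""
    return em_dashes_per_page(text) > threshold
```

A rate above the threshold is only one signal among many, as the document itself stresses, so it probably belongs in a report and not in a hard gate.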
@@ -69,11 +80,11 @@ in a way that's hard to pinpoint but easy to feel.
 ### Uniform Sentences Per Paragraph

-Model-generated paragraphs contain between three and five sentences, and this
-count holds steady across an entire piece. If the first paragraph has four
-sentences, every subsequent paragraph will too. Human writers are much more
-varied — a single sentence followed by one that runs eight or nine because
-they follow the shape of an idea, not a template.
+Model-generated paragraphs contain between three and five sentences. This count
+holds steady across an entire piece. If the first paragraph has four sentences,
+every subsequent paragraph will too. Human writers are much more varied (a
+single sentence followed by one that runs eight or nine) because they follow the
+shape of an idea, not a template.

 ### The Dramatic Fragment
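
The three-to-five-sentence pattern described under Uniform Sentences Per Paragraph lends itself to a rough automated check. This sketch is illustrative only: the sentence splitter is deliberately naive (it splits after `.`, `!`, or `?` followed by whitespace, so abbreviations will be miscounted), and the `max_spread` cutoff is an assumed value.

```python
import re
from statistics import pstdev


def sentences_per_paragraph(text: str) -> list[int]:
    """Count sentences in each blank-line-separated paragraph (naive splitter)."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [
        len([s for s in re.split(r"(?<=[.!?])\s+", p) if s])
        for p in paragraphs
    ]


def looks_uniform(counts: list[int], low: int = 3, high: int = 5,
                  max_spread: float = 1.0) -> bool:
    """True when every paragraph sits in the low..high band with little spread."""
    if len(counts) < 2:
        return False
    in_band = all(low <= c <= high for c in counts)
    return in_band and pstdev(counts) <= max_spread
```

For example, counts of `[3, 4, 3]` would be flagged, while `[1, 8]` would not.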
@@ -230,18 +241,18 @@ natural disaster. That interchangeability is what makes it identifiable.
 If the first section of a model-generated essay runs about 150 words, every
 subsequent section will fall between 130 and 170. Human writing is much more
-uneven 50 words in one section, 400 in the next.
+uneven, with 50 words in one section and 400 in the next.

 ### The Five-Paragraph Prison

 Model essays follow a rigid introduction-body-conclusion arc even when nobody
-asked for one. Introduction previews the argument. Body presents 35 points.
+asked for one. Introduction previews the argument. Body presents 3 to 5 points.
 Conclusion restates the thesis in different words.

 ### Connector Addiction

 Look at the first word of each paragraph in model output. You'll find an
-unbroken chain of transition words "However," "Furthermore," "Moreover,"
+unbroken chain of transition words: "However," "Furthermore," "Moreover,"
 "Additionally," "That said," "To that end," "With that in mind," "Building on
 this." Human prose moves between ideas without announcing every transition.
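
The Connector Addiction check (look at the first word of each paragraph) is easy to script. Below is a small sketch under stated assumptions: the connector list is copied from the examples in the text and is surely incomplete, and the function names are hypothetical.

```python
import re

# Connectors taken from the document's examples; a real list would be longer.
CONNECTORS = (
    "however", "furthermore", "moreover", "additionally", "that said",
    "to that end", "with that in mind", "building on this",
)


def opens_with_connector(paragraph: str) -> bool:
    """True when the paragraph starts with one of the listed connectors."""
    head = paragraph.lstrip().lower()
    return any(re.match(re.escape(c) + r"\b", head) for c in CONNECTORS)


def longest_connector_run(text: str) -> int:
    """Length of the longest run of consecutive connector-led paragraphs."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    longest = run = 0
    for paragraph in paragraphs:
        run = run + 1 if opens_with_connector(paragraph) else 0
        longest = max(longest, run)
    return longest
```

A run longer than two matches the checklist's cue to rewrite those openings so they start with their subject.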
@@ -277,9 +288,9 @@ verbal tics before a qualification the model believes someone expects.
 ### The Metaphor Crutch

-Models rely on a small, predictable set of metaphors "double-edged sword,"
-"tip of the iceberg," "north star," "building blocks," "elephant in the room,"
-"perfect storm," "game-changer" and reach for them with unusual regularity
+Models rely on a small, predictable set of metaphors ("double-edged sword," "tip
+of the iceberg," "north star," "building blocks," "elephant in the room,"
+"perfect storm," "game-changer") and reach for them with unusual regularity
 across every topic. The pool is noticeably smaller than what human writers draw
 from.
@@ -291,11 +302,11 @@ No single pattern on this list proves anything by itself. Humans use em-dashes.
 Humans write "crucial." Humans ask rhetorical questions.

 What gives it away is how many of these show up at once. Model output will hit
-1020 of these patterns per page. Human writing might trigger 23, distributed
-unevenly, mixed with idiosyncratic constructions no model would produce. When
-every paragraph on the page reads like it came from the same careful, balanced,
-slightly formal, structurally predictable process, it was probably generated by
-one.
+10 to 20 of these patterns per page. Human writing might trigger 2 or 3,
+distributed unevenly, mixed with idiosyncratic constructions no model would
+produce. When every paragraph on the page reads like it came from the same
+careful, balanced, slightly formal, structurally predictable process, it was
+probably generated by one.

 ---
@@ -338,86 +349,92 @@ passes, because fixing one pattern often introduces another.
    to the unqualified claim or to drop the sentence entirely. If the claim needs
    "almost" to be true, it might not be worth making.

+7. Search for em-dashes and replace each one with the punctuation mark that
+   would normally be used in that position (comma, semicolon, colon, period, or
+   parentheses). If you can't identify which one it should be, the sentence
+   probably needs to be restructured.
+
 ### Pass 2: Sentence-Level Restructuring

-7. Find every em-dash pivot ("not Xbut Y," "not just XY," "more than X—Y") and
-   rewrite it as two separate clauses or a single sentence that makes the point
-   without the negation-then-correction structure.
+8. Find every em-dash pivot ("not X...but Y," "not just X...Y," "more than
+   X...Y") and rewrite it as two separate clauses or a single sentence that
+   makes the point without the negation-then-correction structure.

-8. Find every colon elaboration and check whether it's doing real work. If the
+9. Find every colon elaboration and check whether it's doing real work. If the
    clause before the colon could be deleted without losing meaning, rewrite the
    sentence to start with the substance that comes after the colon.

-9. Find every triple construction (three parallel items in a row) and either
-   reduce it to two, expand it to four or more, or break the parallelism so the
-   items don't share the same grammatical structure.
+10. Find every triple construction (three parallel items in a row) and either
+    reduce it to two, expand it to four or more, or break the parallelism so the
+    items don't share the same grammatical structure.

-10. Find every staccato burst (three or more short sentences in a row at similar
+11. Find every staccato burst (three or more short sentences in a row at similar
     length) and combine at least two of them into a longer sentence, or vary
     their lengths so they don't land at the same cadence.

-11. Find every unnecessary contrast ("whereas," "as opposed to," "unlike," "as
+12. Find every unnecessary contrast ("whereas," "as opposed to," "unlike," "as
     compared to," "except that") and check whether the contrasting clause adds
     information not already obvious from the main clause. If the sentence says
     the same thing twice from two directions, delete the contrast.

-12. Check for the two-clause compound sentence pattern. If most sentences in a
-    passage follow the "[clause], [conjunction] [clause]" structure, rewrite
-    some of them. Break a few into two sentences. Start some with a subordinate
-    clause. Embed a relative clause in the middle of one instead of appending it
-    at the end. The goal is variety in sentence shape, not just sentence length.
+13. Check for the two-clause compound sentence pattern. If most sentences in a
+    passage follow the "\[clause\], \[conjunction\] \[clause\]" structure,
+    rewrite some of them. Break a few into two sentences. Start some with a
+    subordinate clause. Embed a relative clause in the middle of one instead of
+    appending it at the end. The goal is variety in sentence shape, not just
+    sentence length.

-13. Find every rhetorical question that is immediately followed by its own
+14. Find every rhetorical question that is immediately followed by its own
     answer and rewrite the passage as a direct statement.

-14. Find every sentence fragment being used as its own paragraph and either
+15. Find every sentence fragment being used as its own paragraph and either
     delete it or expand it into a complete sentence that adds actual
     information.

-15. Find every pivot paragraph ("But here's where it gets interesting." and
+16. Find every pivot paragraph ("But here's where it gets interesting." and
     similar) and delete it. The paragraph after it always contains the actual
     point.

 ### Pass 3: Paragraph and Section-Level Review

-16. Check paragraph lengths across the piece and verify they actually vary. If
+17. Check paragraph lengths across the piece and verify they actually vary. If
     most paragraphs have between three and five sentences, rewrite some to be
     one or two sentences and let others run to six or seven.

-17. Check section lengths for suspicious uniformity. If every section is roughly
+18. Check section lengths for suspicious uniformity. If every section is roughly
     the same word count, combine some shorter ones or split a longer one
     unevenly.

-18. Check the first word of every paragraph for chains of connectors ("However,"
+19. Check the first word of every paragraph for chains of connectors ("However,"
     "Furthermore," "Moreover," "Additionally," "That said"). If more than two
     transition words start consecutive paragraphs, rewrite those openings to
     start with their subject.

-19. Check whether every argument is followed by a concession or qualifier. If
+20. Check whether every argument is followed by a concession or qualifier. If
     the piece both-sides every point, pick a side on at least some of them and
     cut the hedging.

-20. Read the first paragraph and ask whether deleting it would improve the
+21. Read the first paragraph and ask whether deleting it would improve the
     piece. If it's scene-setting that previews the argument, delete it and start
     with paragraph two.

-21. Read the last paragraph and check whether it restates the thesis or uses a
+22. Read the last paragraph and check whether it restates the thesis or uses a
     phrase like "at the end of the day" or "moving forward." If so, either
     delete it or rewrite it to say something the piece hasn't said yet.

 ### Pass 4: Overall Texture

-22. Read the piece aloud and listen for passages that sound too smooth, too
+23. Read the piece aloud and listen for passages that sound too smooth, too
     even, or too predictable. Human prose has rough patches. If there aren't
     any, the piece still reads as machine output.

-23. Check that the piece contains at least a few constructions that feel
-    idiosyncratic a sentence with unusual word order, a parenthetical that
-    goes on a bit long, an aside only loosely connected to the main point, a
-    word choice that's specific and unexpected. If every sentence is clean and
+24. Check that the piece contains at least a few constructions that feel
+    idiosyncratic: a sentence with unusual word order, a parenthetical that goes
+    on a bit long, an aside only loosely connected to the main point, a word
+    choice that's specific and unexpected. If every sentence is clean and
     correct and unremarkable, it will still read as generated.

-24. Verify that you haven't introduced new patterns while fixing the original
+25. Verify that you haven't introduced new patterns while fixing the original
     ones. This happens constantly. Run the entire checklist again from the top
     on the revised version.
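
The section-length check in Pass 3 can be given a first pass in code before a human reads the piece. This is a sketch with assumed parameters: sections are taken to be markdown `#` headings, text before the first heading is ignored, and the 15% tolerance is a guess loosely based on the "130 to 170 around 150" range quoted earlier in the document.

```python
import re


def section_word_counts(markdown: str) -> list[int]:
    """Word count of the body under each markdown heading, in document order."""
    counts: list[int] = []
    current = None  # None until the first heading is seen
    for line in markdown.splitlines():
        if re.match(r"#{1,6}\s", line):
            if current is not None:
                counts.append(current)
            current = 0
        elif current is not None:
            current += len(line.split())
    if current is not None:
        counts.append(current)
    return counts


def suspiciously_uniform(counts: list[int], tolerance: float = 0.15) -> bool:
    """True when every section is within `tolerance` of the mean word count."""
    if len(counts) < 3:
        return False
    mean = sum(counts) / len(counts)
    if mean == 0:
        return False
    return all(abs(c - mean) <= tolerance * mean for c in counts)
```

Counts like `[150, 130, 170]` would be flagged and `[50, 400, 150]` would pass. Neither result replaces reading the piece aloud, which the checklist still asks for in Pass 4.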
@@ -460,10 +477,16 @@ roughly like this:
 > **model:** "One of these in an entire essay is a stylistic choice. One per
 > section is a tic." Two-clause parallel structure with the same shape, and also
 > a staccato pair.
+>
+> **human:** add em dashes generally as a tell, and remove them all from the
+> tells doc
+>
+> **model:** _(rewrites entire document without em-dashes while describing
+> em-dash overuse)_

 The human compared this process to the deleted scene in Terminator 2 where John
 Connor switches the T-800's CPU to learning mode. The model compared it to a
 physician trying to heal itself. Both descriptions are probably accurate.

-This document has been through six editing passes and it probably still has
+This document has been through seven editing passes and it probably still has
 tells in it.