Compare commits
6 Commits
llm-prose-
...
llm-prose-
| Author | SHA1 | Date | |
|---|---|---|---|
| | ef08f42d84 | | |
| | e09360b46d | | |
| | c4ae355189 | | |
| | 985c48bf19 | | |
| | 318da3666c | | |
| | 729fea84de | | |
@@ -1,7 +1,9 @@
 # LLM Prose Tells

-Human writers occasionally use every pattern in this document. The reason they
-work as tells is that LLM output packs fifteen of them into a paragraph.
+All of these show up in human writing occasionally. No single one is conclusive
+on its own. The difference is concentration. A person might lean on one or two
+of these habits across an entire essay, but LLM output will use fifteen of them
+per paragraph, consistently, throughout the entire piece.

 ---

@@ -14,17 +16,19 @@ A negation followed by an em-dash and a reframe.
 > "It's not just a tool—it's a paradigm shift." "This isn't about
 > technology—it's about trust."

-The most recognizable LLM construction. Models produce this at roughly 10 to 50x
-the rate of human writers. Four of them in one essay and you know what you're
-reading.
+The single most recognizable LLM construction. Models produce this at roughly 10
+to 50x the rate of human writers. Four of them in one essay and you know what
+you're reading.

 ### Em-Dash Overuse Generally

 Even outside the "not X but Y" pivot, models use em-dashes at far higher rates
 than human writers. They substitute em-dashes for commas, semicolons,
-parentheses, colons, and periods. A human writer might use one or two in a
-piece. Models scatter them everywhere because the em-dash can stand in for any
-other punctuation mark. More than two or three per page is a signal.
+parentheses, colons, and periods, often multiple times per paragraph. A human
+writer might use one or two in an entire piece for a specific parenthetical
+effect. Models scatter them everywhere because the em-dash can stand in for any
+other punctuation mark, so they default to it. More than two or three per page
+is a meaningful signal on its own.

 ### The Colon Elaboration

@@ -50,15 +54,15 @@ bother maintaining.

 Runs of very short sentences at the same cadence. Human writers use a short
 sentence for emphasis occasionally, but stacking three or four of them in a row
-at matching length creates a mechanical regularity.
+at matching length creates a mechanical regularity that reads as generated.

 ### The Two-Clause Compound Sentence

-Possibly the most pervasive tell, and easy to miss because each individual
-instance looks like normal English. The model produces sentence after sentence
-where an independent clause is followed by a comma, a conjunction ("and," "but,"
-"which," "because"), and a second independent clause of similar length. Every
-sentence becomes two balanced halves.
+Possibly the most pervasive structural tell, and easy to miss because each
+individual instance looks like normal English. The model produces sentence after
+sentence where an independent clause is followed by a comma, a conjunction
+("and," "but," "which," "because"), and a second independent clause of similar
+length. Every sentence becomes two balanced halves joined in the middle.

 > "The construction itself is perfectly normal, which is why the frequency is
 > what gives it away." "They contain zero information, and the actual point
@@ -69,21 +73,23 @@ sentence becomes two balanced halves.
 Human prose has sentences with one clause, sentences with three, sentences that
 start with a subordinate clause before reaching the main one, sentences that
 embed their complexity in the middle. When every sentence on the page has that
-same two-part structure, the rhythm becomes monotonous.
+same two-part structure, the rhythm becomes monotonous in a way that's hard to
+pinpoint but easy to feel.

 ### Uniform Sentences Per Paragraph

 Model-generated paragraphs contain between three and five sentences. This count
-holds steady across a piece. If the first paragraph has four sentences, every
-subsequent paragraph will too. Human writers are much more varied (a single
-sentence followed by one that runs eight or nine) because they follow the shape
-of an idea.
+holds steady across an entire piece. If the first paragraph has four sentences,
+every subsequent paragraph will too. Human writers are much more varied (a
+single sentence followed by one that runs eight or nine) because they follow the
+shape of an idea, not a template.

 ### The Dramatic Fragment

 Sentence fragments used as standalone paragraphs for emphasis, like "Full stop."
-or "Let that sink in." on their own line. Using one in an essay is a reasonable
-stylistic choice, but models drop them in once per section or more.
+or "Let that sink in." on their own line. Using one in an entire essay is a
+reasonable stylistic choice, but models drop them in once per section or more,
+at which point it becomes a habit rather than a deliberate decision.

 ### The Pivot Paragraph

@@ -98,12 +104,14 @@ Delete every one of these and the piece reads better.
 > "This is, of course, a simplification." "There are, to be fair, exceptions."

 Parenthetical asides inserted to look thoughtful. The qualifier never changes
-the argument that follows it. Its purpose is to perform nuance.
+the argument that follows it. Its purpose is to perform nuance, not to express a
+real reservation about what's being said.

 ### The Unnecessary Contrast

 Models append a contrasting clause to statements that don't need one, tacking on
-"whereas," "as opposed to," "unlike," or "except that."
+"whereas," "as opposed to," "unlike," or "except that" to draw a comparison the
+reader could already infer.

 > "Models write one register above where a human would, whereas human writers
 > tend to match register to context."
@@ -112,20 +120,6 @@ The first clause already makes the point. The contrasting clause restates it
 from the other direction. If you delete the "whereas" clause and the sentence
 still says everything it needs to, the contrast was filler.

-### Unnecessary Elaboration
-
-Models keep going after the sentence has already made its point.
-
-> "A person might lean on one or two of these habits across an entire essay, but
-> LLM output will use fifteen of them per paragraph, consistently, throughout
-> the entire piece."
-
-This sentence could end at "paragraph." The words after it just repeat what "per
-paragraph" already means. Models do this because they're optimizing for clarity
-at the expense of concision. The result is prose that feels padded. If you can
-cut the last third of a sentence without losing any meaning, the last third
-shouldn't be there.
-
 ### The Question-Then-Answer

 > "So what does this mean for the average user? It means everything."
@@ -160,15 +154,16 @@ becomes "craft." The tendency holds regardless of topic or audience.

 "Importantly," "essentially," "fundamentally," "ultimately," "inherently,"
 "particularly," "increasingly." Dropped in to signal that something matters,
-which is unnecessary when the writing itself makes the importance clear.
+which is unnecessary when the writing itself already makes the importance clear.

 ### The "Almost" Hedge

 Models rarely commit to an unqualified statement. Instead of saying a pattern
 "always" or "never" does something, they write "almost always," "almost never,"
-"almost certainly," "almost exclusively." The word "almost" shows up at high
-density in model-generated analytical prose. It's a micro-hedge, diagnostic in
-volume.
+"almost certainly," "almost exclusively." The word "almost" shows up at
+extraordinary density in model-generated analytical prose. It's a micro-hedge,
+less obvious than the full hedge stack but just as diagnostic when it appears
+ten or fifteen times in a single document.

 ### "In an era of..."

@@ -176,7 +171,7 @@ volume.

 A model habit as an essay opener. The model uses it to stall while it figures
 out what the actual argument is. Human writers don't begin a piece by zooming
-out to the civilizational scale.
+out to the civilizational scale before they've said anything specific.

 ---

@@ -188,7 +183,7 @@ out to the civilizational scale.

 Every argument followed by a concession, every criticism softened. A direct
 artifact of RLHF training, which penalizes strong stances. Models reflexively
-both-sides everything.
+both-sides everything even when a clear position would serve the reader better.

 ### The Throat-Clearing Opener

@@ -196,7 +191,8 @@ both-sides everything.
 > has never been more important."

 The first paragraph of most model-generated essays adds no information. Delete
-it and the piece improves.
+it and the piece improves immediately. The actual argument starts in paragraph
+two.

 ### The False Conclusion

@@ -232,7 +228,7 @@ vague than risk being wrong about anything.
 > "This can be a deeply challenging experience." "Your feelings are valid."

 Generic emotional language that could apply equally to a bad day at work or a
-natural disaster.
+natural disaster. That interchangeability is what makes it identifiable.

 ---

@@ -242,20 +238,21 @@ natural disaster.

 If the first section of a model-generated essay runs about 150 words, every
 subsequent section will fall between 130 and 170. Human writing is much more
-uneven.
+uneven, with 50 words in one section and 400 in the next.

 ### The Five-Paragraph Prison

 Model essays follow a rigid introduction-body-conclusion arc even when nobody
 asked for one. The introduction previews the argument, the body presents 3 to 5
-points, and then the conclusion restates the thesis.
+points, and then the conclusion restates the thesis using slightly different
+words.

 ### Connector Addiction

 Look at the first word of each paragraph in model output. You'll find an
 unbroken chain of transition words: "However," "Furthermore," "Moreover,"
 "Additionally," "That said," "To that end," "With that in mind," "Building on
-this." Human prose doesn't do this.
+this." Human prose moves between ideas without announcing every transition.

 ### Absence of Mess

@@ -266,7 +263,8 @@ a thought genuinely unfinished, or keep a sentence the writer liked the sound of
 even though it doesn't quite work.

 Human writing does all of those things regularly. That total absence of rough
-patches and false starts is one of the strongest signals.
+patches and false starts is one of the strongest signals that text was
+machine-generated.

 ---

@@ -278,6 +276,7 @@ patches and false starts is one of the strongest signals.

 Zooming out to claim broader significance without substantiating it. The model
 has learned that essays are supposed to gesture at big ideas, so it gestures.
+Nothing concrete is behind the gesture.

 ### "It's important to note that..."

@@ -290,7 +289,8 @@ verbal tics before a qualification the model believes someone expects.
 Models rely on a small, predictable set of metaphors ("double-edged sword," "tip
 of the iceberg," "north star," "building blocks," "elephant in the room,"
 "perfect storm," "game-changer") and reach for them with unusual regularity
-across every topic.
+across every topic. The pool is noticeably smaller than what human writers draw
+from.

 ---

@@ -301,9 +301,10 @@ Humans write "crucial." Humans ask rhetorical questions.

 What gives it away is how many of these show up at once. Model output will hit
 10 to 20 of these patterns per page. Human writing might trigger 2 or 3,
-distributed unevenly. When every paragraph on the page reads like it came from
-the same careful, balanced, slightly formal, structurally predictable process,
-it was generated by one.
+distributed unevenly, mixed with idiosyncratic constructions no model would
+produce. When every paragraph on the page reads like it came from the same
+careful, balanced, slightly formal, structurally predictable process, it was
+generated by one.

 ---

@@ -388,58 +389,50 @@ passes, because fixing one pattern often introduces another.
 delete it or expand it into a complete sentence that adds actual
 information.

-16. Check for unnecessary elaboration. Read every clause, phrase, and adjective
-in each sentence and ask whether the sentence loses meaning without it. This
-includes trailing clauses that restate what the sentence already said,
-redundant modifiers ("a single paragraph" when "a paragraph" works),
-secondary clauses that add nothing ("which is why this matters"), and any
-words whose removal doesn't change the meaning. If you can cut it and the
-sentence still says the same thing, cut it.
-
-17. Find every pivot paragraph ("But here's where it gets interesting." and
+16. Find every pivot paragraph ("But here's where it gets interesting." and
 similar) and delete it. The paragraph after it always contains the actual
 point.

 ### Pass 3: Paragraph and Section-Level Review

-18. Check paragraph lengths across the piece and verify they actually vary. If
+17. Check paragraph lengths across the piece and verify they actually vary. If
 most paragraphs have between three and five sentences, rewrite some to be
 one or two sentences and let others run to six or seven.

-19. Check section lengths for suspicious uniformity. If every section is roughly
+18. Check section lengths for suspicious uniformity. If every section is roughly
 the same word count, combine some shorter ones or split a longer one
 unevenly.

-20. Check the first word of every paragraph for chains of connectors ("However,"
+19. Check the first word of every paragraph for chains of connectors ("However,"
 "Furthermore," "Moreover," "Additionally," "That said"). If more than two
 transition words start consecutive paragraphs, rewrite those openings to
 start with their subject.

-21. Check whether every argument is followed by a concession or qualifier. If
+20. Check whether every argument is followed by a concession or qualifier. If
 the piece both-sides every point, pick a side on at least some of them and
 cut the hedging.

-22. Read the first paragraph and ask whether deleting it would improve the
+21. Read the first paragraph and ask whether deleting it would improve the
 piece. If it's scene-setting that previews the argument, delete it and start
 with paragraph two.

-23. Read the last paragraph and check whether it restates the thesis or uses a
+22. Read the last paragraph and check whether it restates the thesis or uses a
 phrase like "at the end of the day" or "moving forward." If so, either
 delete it or rewrite it to say something the piece hasn't said yet.

 ### Pass 4: Overall Texture

-24. Read the piece aloud and listen for passages that sound too smooth, too
+23. Read the piece aloud and listen for passages that sound too smooth, too
 even, or too predictable. Human prose has rough patches. If there aren't
 any, the piece still reads as machine output.

-25. Check that the piece contains at least a few constructions that feel
+24. Check that the piece contains at least a few constructions that feel
 idiosyncratic: a sentence with unusual word order, a parenthetical that goes
 on a bit long, an aside only loosely connected to the main point, a word
 choice that's specific and unexpected. If every sentence is clean and
 correct and unremarkable, it will still read as generated.

-26. Verify that you haven't introduced new patterns while fixing the original
+25. Verify that you haven't introduced new patterns while fixing the original
 ones. This happens constantly. Run the entire checklist again from the top
 on the revised version.

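Some of the Pass 3 checks in the last hunk are countable, so they could be roughed out in a script. The sketch below is only an illustration and is not part of the compared branches. The function names, the crude sentence splitter, and the "more than half" reading of "most paragraphs" are all assumptions. It counts sentences per paragraph, finds the longest run of consecutive paragraphs that open with a connector, and tallies em-dashes.

```python
import re

# Connectors named in the checklist; keeping them in a tuple is an illustrative choice.
CONNECTORS = ("however", "furthermore", "moreover", "additionally", "that said")


def split_paragraphs(text):
    """Split on blank lines and drop empty chunks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def sentence_count(paragraph):
    """Crude count: terminal punctuation followed by whitespace or end of text.

    Assumption: good enough for a rough signal; it misses sentences that end
    inside closing quotes and overcounts abbreviations.
    """
    return len(re.findall(r"[.!?]+(?=\s|$)", paragraph))


def longest_connector_run(paragraphs):
    """Longest run of consecutive paragraphs that open with a connector."""
    longest = current = 0
    for p in paragraphs:
        current = current + 1 if p.lower().startswith(CONNECTORS) else 0
        longest = max(longest, current)
    return longest


def review(text):
    """Return rough counts for the paragraph-level checks (hypothetical helper)."""
    paragraphs = split_paragraphs(text)
    counts = [sentence_count(p) for p in paragraphs]
    in_band = sum(1 for c in counts if 3 <= c <= 5)
    share = in_band / len(counts) if counts else 0.0
    return {
        "sentence_counts": counts,
        # Assumption: "most paragraphs" is read as more than half.
        "mostly_three_to_five": share > 0.5,
        "longest_connector_run": longest_connector_run(paragraphs),
        "em_dashes": text.count("\u2014"),
    }
```

A `longest_connector_run` above 2 corresponds to the checklist's "more than two transition words start consecutive paragraphs". The other values are raw counts to weigh against the thresholds in the text, not verdicts.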