diff --git a/prompts/LLM_PROSE_TELLS.md b/prompts/LLM_PROSE_TELLS.md
index 8310e60..9b482ed 100644
--- a/prompts/LLM_PROSE_TELLS.md
+++ b/prompts/LLM_PROSE_TELLS.md
@@ -1,7 +1,7 @@
 # LLM Prose Tells
 
-Human writers occasionally use every pattern in this document. The reason they
-work as tells is that LLM output packs fifteen of them into a paragraph.
+A catalog of structural, lexical, and rhetorical patterns found in LLM-generated
+prose.
 
 ---
 
@@ -14,16 +14,11 @@ A negation followed by an em-dash and a reframe.
 > "It's not just a tool—it's a paradigm shift." "This isn't about
 > technology—it's about trust."
 
-The most recognizable LLM construction, produced at roughly 10 to 50x the rate
-of human writers. Four of them in one essay and you know what you're reading.
-
 ### Em-Dash Overuse Generally
 
-Even outside the "not X but Y" pivot, models use em-dashes at far higher rates
-than human writers, substituting them for commas, semicolons, parentheses,
-colons, and periods. A human writer might use one or two in a piece. Models
-scatter them everywhere because the em-dash can stand in for any other
-punctuation mark. More than two or three per page is a signal.
+Even outside the "not X but Y" pivot, models substitute em-dashes for commas,
+semicolons, parentheses, colons, and periods. The em-dash can replace any other
+punctuation mark, and models default to it for that reason.
 
 ### The Colon Elaboration
 
@@ -31,31 +26,23 @@ A short declarative clause, then a colon, then a longer explanation.
 
 > "The answer is simple: we need to rethink our approach from the ground up."
 
-A perfectly normal construction that models reach for so often the frequency
-becomes the tell.
-
 ### The Triple Construction
 
 > "It's fast, it's scalable, and it's open source."
 
 Three parallel items in a list, usually escalating. Always exactly three (rarely
-two, never four) with strict grammatical parallelism that human writers rarely
-maintain.
+two, never four) with strict grammatical parallelism.
 
 ### The Staccato Burst
 
 > "This matters. It always has. And it always will." "The data is clear. The
 > trend is undeniable. The conclusion is obvious."
 
-Runs of very short sentences at the same cadence. Human writers use a short
-sentence for emphasis occasionally, but stacking three or four at matching
-length creates a mechanical regularity.
+Runs of very short sentences at the same cadence and matching length.
 
 ### The Two-Clause Compound Sentence
 
-Possibly the most pervasive tell, and easy to miss because each instance looks
-like normal English. The model produces sentence after sentence where an
-independent clause is followed by a comma, a conjunction ("and," "but," "which,"
+An independent clause, a comma, a conjunction ("and," "but," "which,"
 "because"), and a second independent clause of similar length. Every sentence
 becomes two balanced halves.
 
@@ -67,47 +54,43 @@ becomes two balanced halves.
 
 Human prose has sentences with one clause, sentences with three, sentences that
 start with a subordinate clause before reaching the main one, sentences that
-embed their complexity in the middle. When every sentence on the page has that
-same two-part structure, the rhythm becomes monotonous.
+embed their complexity in the middle.
 
 ### Uniform Sentences Per Paragraph
 
 Model-generated paragraphs contain between three and five sentences, a count
 that holds steady across a piece. If the first paragraph has four sentences,
-every subsequent paragraph will too. Human writers are much more varied (a
-sentence followed by one that runs eight or nine) because they follow the shape
-of an idea.
+every subsequent paragraph will too.
 
 ### The Dramatic Fragment
 
-Sentence fragments used as standalone paragraphs for emphasis, like "Full stop."
-or "Let that sink in." on their own line. Using one in an essay is a stylistic
-choice, but models drop them in once per section or more.
+Sentence fragments used as standalone paragraphs for emphasis.
+
+> "Full stop." "Let that sink in."
 
 ### The Pivot Paragraph
 
 > "But here's where it gets interesting." "Which raises an uncomfortable truth."
 
 One-sentence paragraphs that exist only to transition between ideas, containing
-zero information. The actual point is always in the next paragraph. Delete every
-one of these and the piece reads better.
+zero information. The actual point is always in the next paragraph.
 
 ### The Parenthetical Qualifier
 
 > "This is, of course, a simplification." "There are, to be fair, exceptions."
 
-Parenthetical asides inserted to look thoughtful, performing nuance without ever
-changing the argument.
+Parenthetical asides inserted to perform nuance without ever changing the
+argument.
 
 ### The Unnecessary Contrast
 
-Models append a contrasting clause to statements that don't need one, tacking on
+A contrasting clause appended to a statement that doesn't need one, using
 "whereas," "as opposed to," "unlike," or "except that."
 
 > "Models write one register above where a human would, whereas human writers
 > tend to match register to context."
 
-The contrasting clause just restates what the first clause already said. If you
+The contrasting clause restates what the first clause already said. If you
 delete the "whereas" clause and the sentence still says everything it needs to,
 the contrast was filler.
 
@@ -119,18 +102,15 @@ Models keep going after the sentence has already made its point.
 > LLM output will use fifteen of them per paragraph, consistently, throughout
 > the entire piece."
 
-This sentence could end at "paragraph." The words after it just repeat what "per
-paragraph" already means. Models optimize for clarity at the expense of
-concision, producing prose that feels padded. If you can cut the last third of a
-sentence without losing any meaning, the last third shouldn't be there.
+This sentence could end at "paragraph." The words after it repeat what "per
+paragraph" already means. If you can cut the last third of a sentence without
+losing meaning, the last third shouldn't be there.
 
 ### The Question-Then-Answer
 
 > "So what does this mean for the average user? It means everything."
 
-A rhetorical question immediately followed by its own answer. Models do this two
-or three times per piece to fake forward momentum where a human writer might do
-it once.
+A rhetorical question immediately followed by its own answer.
 
 ---
 
@@ -138,14 +118,12 @@ it once.
 
 ### Overused Intensifiers
 
-The following words appear at dramatically elevated rates in model output:
-"crucial," "vital," "robust," "comprehensive," "fundamental," "arguably,"
+"Crucial," "vital," "robust," "comprehensive," "fundamental," "arguably,"
 "straightforward," "noteworthy," "realm," "landscape," "leverage" (as a verb),
-"delve," "tapestry," "multifaceted," "nuanced" (which models apply to their own
-analysis with startling regularity), "pivotal," "unprecedented" (frequently
-applied to things with plenty of precedent), "navigate," "foster,"
-"underscores," "resonates," "embark," "streamline," and "spearhead." Three or
-more on the same page is a strong signal.
+"delve," "tapestry," "multifaceted," "nuanced" (applied to the model's own
+analysis), "pivotal," "unprecedented" (applied to things with plenty of
+precedent), "navigate," "foster," "underscores," "resonates," "embark,"
+"streamline," "spearhead."
 
 ### Elevated Register Drift
 
@@ -157,23 +135,21 @@ becomes "craft."
 ### Filler Adverbs
 
 "Importantly," "essentially," "fundamentally," "ultimately," "inherently,"
-"particularly," "increasingly." Dropped in to signal that something matters,
-which is unnecessary when the writing itself makes the importance clear.
+"particularly," "increasingly." Dropped in to signal that something matters when
+the writing itself should make the importance clear.
 
 ### The "Almost" Hedge
 
-Models rarely commit to an unqualified statement. Instead of saying a pattern
-"always" or "never" does something, they write "almost always," "almost never,"
-"almost certainly," "almost exclusively." "Almost" is a micro-hedge that shows
-up at high density in model-generated analytical prose, diagnostic in volume.
+Instead of saying a pattern "always" or "never" does something, models write
+"almost always," "almost never," "almost certainly," "almost exclusively." A
+micro-hedge, less obvious than the full hedge stack.
 
 ### "In an era of..."
 
 > "In an era of rapid technological change..."
 
-A model habit as an essay opener, used to stall while the model figures out what
-the actual argument is. Human writers don't begin a piece by zooming out to the
-civilizational scale.
+Used to open an essay. The model is stalling while it figures out what the
+actual argument is.
 
 ---
 
@@ -184,23 +160,20 @@ civilizational scale.
 > "While X has its drawbacks, it also offers significant benefits."
 
 Every argument followed by a concession, every criticism softened. A direct
-artifact of RLHF training, which penalizes strong stances and leads models to
-reflexively both-sides everything.
+artifact of RLHF training, which penalizes strong stances.
 
 ### The Throat-Clearing Opener
 
 > "In today's rapidly evolving digital landscape, the question of data privacy
 > has never been more important."
 
-The first paragraph of most model-generated essays adds no information. Delete
-it and the piece improves.
+The first paragraph adds no information. Delete it and the piece improves.
 
 ### The False Conclusion
 
 > "At the end of the day, what matters most is..." "Moving forward, we must..."
 
-The high school "In conclusion,..." dressed up for a professional audience,
-signaling that the model is wrapping up without landing on anything.
+The high school "In conclusion,..." dressed up for a professional audience.
 
 ### The Sycophantic Frame
 
@@ -227,8 +200,7 @@ cases," "can potentially"), communicating nothing.
 
 > "This can be a deeply challenging experience." "Your feelings are valid."
 
-Generic emotional language that could apply equally to a bad day at work or a
-natural disaster.
+Generic emotional language that could apply to anything.
 
 ---
 
@@ -236,33 +208,28 @@ natural disaster.
 
 ### Symmetrical Section Length
 
-If the first section of a model-generated essay runs about 150 words, every
-subsequent section will fall between 130 and 170. Human writing is much more
-uneven.
+If the first section runs about 150 words, every subsequent section will fall
+between 130 and 170.
 
 ### The Five-Paragraph Prison
 
 Model essays follow a rigid introduction-body-conclusion arc even when nobody
 asked for one. The introduction previews the argument, the body presents 3 to 5
-points, and then the conclusion restates the thesis.
+points, the conclusion restates the thesis.
 
 ### Connector Addiction
 
-Look at the first word of each paragraph in model output. You'll find an
-unbroken chain of transition words: "However," "Furthermore," "Moreover,"
-"Additionally," "That said," "To that end," "With that in mind," "Building on
-this." Human prose doesn't do this.
+The first word of each paragraph forms an unbroken chain of transition words:
+"However," "Furthermore," "Moreover," "Additionally," "That said," "To that
+end," "With that in mind," "Building on this."
 
 ### Absence of Mess
 
 Model prose doesn't contradict itself mid-paragraph and then catch the
-contradiction. It doesn't go on a tangent and have to walk it back, use an
-obscure idiom without explaining it, make a joke that risks falling flat, leave
-a thought genuinely unfinished, or keep a sentence the writer liked the sound of
-even though it doesn't quite work.
-
-Human writing does all of those things, making the total absence of rough
-patches and false starts one of the strongest signals.
+contradiction, go on a tangent and have to walk it back, use an obscure idiom
+without explaining it, make a joke that risks falling flat, leave a thought
+genuinely unfinished, or keep a sentence the writer liked the sound of even
+though it doesn't quite work.
 
 ---
 
@@ -272,42 +239,27 @@ patches and false starts one of the strongest signals.
 
 > "This has implications far beyond just the tech industry."
 
-Zooming out to claim broader significance without substantiating it. The model
-has learned that essays are supposed to gesture at big ideas, so it gestures.
+Zooming out to claim broader significance without substantiating it.
 
 ### "It's important to note that..."
 
 This phrase and its variants ("it's worth noting," "it bears mentioning," "it
-should be noted") appear at absurd rates in model output as verbal tics before a
-qualification the model believes someone expects.
+should be noted") function as verbal tics before a qualification the model
+believes someone expects.
 
 ### The Metaphor Crutch
 
-Models rely on a small, predictable set of metaphors ("double-edged sword," "tip
+Models rely on a small, predictable set of metaphors: "double-edged sword," "tip
 of the iceberg," "north star," "building blocks," "elephant in the room,"
-"perfect storm," "game-changer") and reach for them with unusual regularity
-across every topic.
-
----
-
-## How to Actually Spot It
-
-No single pattern on this list proves anything by itself. Humans use em-dashes,
-write "crucial," and ask rhetorical questions.
-
-What gives it away is how many of these show up at once. Model output will hit
-10 to 20 of these patterns per page. Human writing might trigger 2 or 3,
-distributed unevenly. When every paragraph on the page reads like it came from
-the same careful, balanced, slightly formal, structurally predictable process,
-it was generated by one.
+"perfect storm," "game-changer."
 
 ---
 
 ## Copyediting Checklist: Removing LLM Tells
 
 Follow this checklist when editing any document to remove machine-generated
-patterns. Go through the entire list for every piece. Do at least two full
-passes, because fixing one pattern often introduces another.
+patterns. Do at least two full passes, because fixing one pattern often
+introduces another.
 
 ### Pass 1: Word-Level Cleanup
 
@@ -379,9 +331,9 @@
 15. Check for the two-clause compound sentence pattern. If most sentences in a
     passage follow the "\[clause\], \[conjunction\] \[clause\]" structure, first
     try removing the conjunction and second clause entirely, since it's often
-    redundant or unnecessary. If the second clause does carry meaning, break it
-    into its own sentence, start the sentence with a subordinate clause, or
-    embed a relative clause in the middle instead of appending it at the end.
+    redundant. If the second clause does carry meaning, break it into its own
+    sentence, start the sentence with a subordinate clause, or embed a relative
+    clause in the middle instead of appending it at the end.
 
 16. Find every rhetorical question that is immediately followed by its own
     answer and rewrite the passage as a direct statement.
@@ -405,9 +357,7 @@
 ### Pass 3: Paragraph and Section-Level Review
 
 21. Review the last sentence of each paragraph. If it restates the point the
-    paragraph already made, delete it. Models frequently close paragraphs with a
-    summary sentence that adds nothing, treating each paragraph as a
-    self-contained unit that needs its own conclusion.
+    paragraph already made, delete it.
 
 22. Check paragraph lengths across the piece and verify they actually vary. If
     most paragraphs have between three and five sentences, rewrite some to be
@@ -494,14 +444,14 @@ roughly like this:
 > **model:** _(rewrites entire document without em-dashes while describing
 > em-dash overuse)_
 >
-> **human:** now run the checklist methodically on each paragraph
+> **human:** this whole document seems to be making the case for FREQUENCY of
+> use being important. we don't care about frequency, remove all that
+> persuasion.
 >
-> **model:** _(finds staccato burst in the section about triple constructions, a
-> triple in the section about absence of mess, two-clause compounds everywhere,
-> and "almost" hedges in its own prose about em-dash overuse)_
+> **model:** _(strips out every "humans do this too but less often" comparison)_
 
 The human compared this process to the deleted scene in Terminator 2 where John
 Connor switches the T-800's CPU to learning mode. The model compared it to a
 physician trying to heal itself. Both are accurate.
 
-This document has been through nine editing passes and it still has tells in it.
+This document has been through ten editing passes and it still has tells in it.