Why AI Text Gets Detected - The Linguistics Behind It
2026-02-22
admin
I've been building an AI text humanizer and spent weeks studying how AI detection actually works. The results surprised me - it's not about grammar, vocabulary, or even factual accuracy. It's about statistical patterns that humans produce naturally but language models don't.

## The Three Metrics That Matter

AI detectors primarily measure three properties.

## 1. Perplexity

Perplexity measures how predictable the next word is given the previous context. Lower perplexity = more predictable text. Language models generate text by selecting the most probable next token, which produces consistently low perplexity. Human writing has higher perplexity because we make unexpected word choices - idioms, slang, unusual metaphors, sentence fragments.

Think of it this way: if you can easily predict what word comes next, the text was probably written by AI.

## 2. Burstiness

Burstiness measures the variation in sentence complexity across a piece of text. AI text has low burstiness - sentences hover around 15-20 words with similar grammatical complexity. Human text has high burstiness - a 5-word sentence followed by a 40-word one, a simple declarative followed by a complex compound-complex structure.

This is the metric I find most interesting because it maps directly to how humans think. We don't maintain a consistent "complexity level." We shift between simple and complex depending on emphasis, emotion, and flow.

## 3. Vocabulary Distribution

Zipf's law says that in natural language, word frequency follows a specific distribution: the second most common word appears roughly half as often as the first, the third roughly a third as often, and so on. AI text follows this distribution almost perfectly - too perfectly. Human text deviates in characteristic ways: we overuse certain words, underuse others, and occasionally use rare words that break the expected pattern.

## What This Means Practically

If you're writing with AI assistance, the fix isn't to "add errors" or "dumb it down." It's to:

- Vary your rhythm - short sentences. Then a longer one. Fragment. Another long one that goes on a bit longer than expected.
- Break predictability - use an unexpected word where a common one would go.
- Add your voice - hedges, opinions, asides. "Honestly, this part surprised me."

I built a free tool that does this automatically: GoForTool AI Humanizer. It analyzes text for these statistical patterns and adjusts them to match human writing distributions. Everything runs in the browser - no server processing.

The irony of building AI to make AI sound less like AI isn't lost on me. But the underlying linguistics are genuinely fascinating, and understanding them makes you a better writer regardless of whether AI is involved.

What patterns have you noticed in AI-generated text? I'd love to hear what bugs people most about it.
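For the curious, the burstiness and Zipf checks described above can be sketched in a few lines of Python. This is a simplified illustration, not how any real detector (or my tool) is implemented: `burstiness` here is just the coefficient of variation of sentence lengths, and `zipf_deviation` measures how far observed word frequencies stray from an idealized `freq(rank) = freq(1) / rank` curve. Function names and thresholds are my own inventions.

```python
import math
import re
from collections import Counter

def burstiness(text: str) -> float:
    """Coefficient of variation (std dev / mean) of sentence lengths in words.
    Near 0 = uniform, AI-like rhythm; higher = more human-like variation."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    mean = sum(lengths) / len(lengths)
    var = sum((n - mean) ** 2 for n in lengths) / len(lengths)
    return math.sqrt(var) / mean

def zipf_deviation(text: str) -> float:
    """Mean absolute error, in log space, between observed word frequencies
    and the ideal Zipf curve freq(rank) = freq(1) / rank.
    Suspiciously small values mean the text fits Zipf 'too perfectly'."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = [c for _, c in Counter(words).most_common()]
    if not counts:
        return 0.0
    top = counts[0]
    errors = [abs(math.log(c) - math.log(top / rank))
              for rank, c in enumerate(counts, start=1)]
    return sum(errors) / len(errors)

uniform = "The cat sat here. The dog sat here. The bird sat here."
varied = "Stop. The cat, having circled the warm windowsill twice, finally sat. Why?"
print(burstiness(uniform))  # all sentences the same length
print(burstiness(varied))   # short-long-short rhythm scores much higher
```

Real detectors work on far longer samples and use an actual language model for perplexity, but even these toy versions separate monotone text from varied text surprisingly well.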