Current research finds AI-generated text detection tools to be unreliable, inconsistent, and vulnerable to manipulation. Studies show these tools often produce conflicting results, even when analyzing similar content, and their accuracy can vary widely across contexts. They struggle with both false positives and false negatives, especially when faced with adversarial techniques like paraphrasing, spelling errors, or stylistic prompts. Some tools also exhibit bias, disproportionately flagging texts by non-native English speakers. Additionally, detectors often misclassify AI-polished human writing as AI-generated, highlighting their difficulty in gauging nuanced AI involvement. While a few tools show relatively better performance in isolated studies, no single detector is consistently effective. Given the fast-paced evolution of generative AI and detection tools, many researchers now consider reliable detection mathematically difficult and increasingly infeasible. Experts recommend a cautious, multi-layered approach, blending detection tools with human judgment rather than relying solely on automation.
AI detection tools show inconsistent and unreliable performance across studies.
Accuracy is impacted by adversarial techniques such as paraphrasing or formatting changes.
Tools may exhibit bias, especially against non-native English writers.
AI detectors often flag human text polished by AI as AI-generated.
No tool is universally effective; even top performers vary by study.
Reliable detection of AI text is considered mathematically difficult.
Human oversight remains essential for responsible evaluation.
NOTE: The above text was generated by Google NotebookLM, based on all studies referenced in this section, then summarized by ChatGPT 4o.
Listing and linking to these resources does not indicate SFCC Library's endorsement of said resources. (Editor's Note: I've actually seriously considered deleting this section altogether due to the controversy surrounding the use of these resources, but...)
♦ means a score of 100% on the AI Detector Test by David Gewirtz (Senior Contributing Editor for ZDNet)
AI Humanizers - What are they?
The increasing use of Generative AI by students, and faculty efforts to counter it, have often been described as an arms race. One of the latest weapons in this race is the AI 'humanizer' writing website.
What are they?
AI humanizer writing websites are tools designed to make AI-generated text sound more natural, human-like, and less detectable as machine-written. They work by taking content created by an AI (like ChatGPT or similar tools) and rewriting or editing it to:
Some use rule-based methods (applying specific linguistic tweaks), while others use additional AI models trained to mimic human writing styles. These tools are often used to bypass AI detection tools or improve readability.
At the time of this writing (Spring 2025), some of the more popular AI humanizer websites are AIHumanizer, WriteHuman, Humanize AI, and AI Undetect, but there are hundreds out there.
To address suspected AI humanizer use in student essays:
Detection Strategies
Conversation Approaches
Policy Adjustments
Tools and Workflow