
Carola F Berger, PhD, CT
I present a semi-scientific study of six publicly available AI detectors that purport to detect whether a text was written by generative AI or by a human. I test the accuracy of these AI detectors on a sample of scientific texts that were written entirely by humans, plus one text that was written by AI. Specifically, I investigate the number of false positives in the resulting AI detector outputs. The results indicate that detectors that aggregate the output of several other engines seem to overestimate the AI content of a text. Other AI detectors seem to identify human-written content as such more reliably, although nearly all fail to correctly identify the one AI-generated text. Also of note is that the results of AI detection aggregators seem to evolve over time.
Introduction
Do you know somebody who has been falsely accused of using AI for a school or work project? Are you a manager in charge of written content, wondering whether the texts you receive are AI-generated? Or are you merely curious about the accuracy of AI detectors? AI detectors are online tools created to detect whether a text was written by generative AI or by a human. I have been wondering for a while how accurately these AI detectors actually distinguish human-written from AI-generated content, so I decided to do a semi-scientific study to answer this question. The study is “semi-scientific” because, instead of testing these AI detectors on a large scale with randomly selected texts to obtain statistically meaningful results, I chose a handful of manually curated texts. However, the rest of the methodology is as scientific as possible, with detailed, auditable records of all results. More on the methodology below.
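As an illustration of the central metric, the false positive rate (the fraction of human-written texts wrongly flagged as AI), the following minimal Python sketch shows how such a tally could be kept. The Result structure, the function name, and the sample verdicts are purely illustrative assumptions and are not code or data from the actual study.

from dataclasses import dataclass

@dataclass
class Result:
    text_id: str
    human_written: bool   # ground truth: True if the text was written by a human
    flagged_as_ai: bool   # detector verdict: True if the detector labels it as AI

def false_positive_rate(results: list[Result]) -> float:
    """Fraction of human-written texts that a detector wrongly flags as AI."""
    human_texts = [r for r in results if r.human_written]
    if not human_texts:
        return 0.0
    false_positives = sum(r.flagged_as_ai for r in human_texts)
    return false_positives / len(human_texts)

# Made-up example: two human-written texts (one wrongly flagged) and one AI text that is missed.
sample = [
    Result("paper_1", human_written=True,  flagged_as_ai=True),
    Result("paper_2", human_written=True,  flagged_as_ai=False),
    Result("ai_text", human_written=False, flagged_as_ai=False),
]
print(false_positive_rate(sample))  # 0.5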