ICML Poster Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

Abhimanyu Hans · Avi Schwarzschild · Valeriia Cherepanova · Hamid Kazemi · Aniruddha Saha · Micah Goldblum · Jonas Geiping · Tom Goldstein

Abstract:

Detecting text generated by modern large language models is thought to be hard, as both LLMs and humans can exhibit a wide range of complex behaviors. However, we find that a score based on contrasting two closely related language models is highly accurate at separating human-generated and machine-generated text. Based on this mechanism, we propose a novel LLM detector that only requires simple calculations using a pair of pre-trained LLMs. The method, called Binoculars, achieves state-of-the-art accuracy without any training data. It is capable of spotting machine text from a range of modern LLMs without any model-specific modifications. We comprehensively evaluate Binoculars on a number of text sources and in varied situations. Over a wide range of document types, Binoculars detects over 90% of generated samples from ChatGPT (and other LLMs) at a false positive rate of 0.01%, despite not being trained on any ChatGPT data.
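The abstract does not spell out the score, so the following is only a minimal sketch of what "contrasting two closely related language models" could look like. It assumes the score is the ratio of one model's log-perplexity on the text to the cross-entropy between the two models' next-token distributions. The function names, the `model_a`/`model_b` roles, and the toy logits are hypothetical; a real detector would obtain the logits from two pre-trained LLMs that share a tokenizer.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def contrast_score(model_a_logits, model_b_logits, token_ids):
    """Assumed form of the score: log-perplexity divided by cross-perplexity.

    model_a_logits, model_b_logits: arrays of shape (T, V) holding each
    model's next-token logits at the T positions of the text.
    token_ids: length-T integer array with the tokens that actually occurred.
    """
    token_ids = np.asarray(token_ids)
    logp_a = log_softmax(model_a_logits)
    logp_b = log_softmax(model_b_logits)

    # Log-perplexity: mean negative log-likelihood of the observed tokens
    # under model B.
    log_ppl = -logp_b[np.arange(len(token_ids)), token_ids].mean()

    # Cross-perplexity: mean cross-entropy between model A's predicted
    # distribution and model B's predicted distribution at each position.
    cross_ppl = -(np.exp(logp_a) * logp_b).sum(axis=-1).mean()

    return log_ppl / cross_ppl

# Toy example: 4 positions, vocabulary of 5, both models peaked on tokens 0..3.
peaked = np.zeros((4, 5))
peaked[np.arange(4), np.arange(4)] = 3.0

predictable = contrast_score(peaked, peaked, [0, 1, 2, 3])  # tokens the models expect
surprising = contrast_score(peaked, peaked, [4, 4, 4, 4])   # tokens the models do not expect
```

In this toy setup the text made of expected tokens gets a lower score than the text made of unexpected tokens. Under the assumed form of the score, a detector would flag text whose score falls below a threshold, and that threshold would have to be chosen on held-out data to hit a target false positive rate.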
