Is Everyone in Your Life a Bot? You Might Not Be Able to Tell.
Imagine you get a call from a work buddy. You make normal small talk. Then he brings up the big acquisition about to be made by your company. The public doesn't know about it. Neither does Wall Street. You talk freely, giving your take on the strategy. You go over the details, all super confidential stuff. In the course of the conversation, your buddy mentions he forgot the access code to get into the share drive. No problem - you remind him what it is.
Later you discover you were talking to an A.I. bot the whole time. It adopted the persona of your work buddy. And all those secrets you gave out just made it into the wrong hands. Your company is sabotaged and you're out of a job, maybe even accused of being part of the scam.
No way that could happen, right? Because you know when you're interacting with an AI bot and when you're talking to a person.
Except a new study says those days might be over.
Chat GPT just passed the Turing Test.
The study was conducted by two cognitive scientists at UC San Diego, published last month but not yet peer-reviewed. They had a simple question: could today's large language models (LLMs) fool someone into thinking they were human? The researchers evaluated several AI systems, including Chat GPT model 4.5 and Meta's LLamMA-3.1-405B.
Chat GPT-4 has already passed the simple two-party version of the Turing Test: a person talks to a bot, and researchers see if the person is fooled. But the harder Turing test - the 'true" version scientist Alan Turing called "the imitation game" in his famous 1950 paper - is the three-party experiment. In this test, there is a human Judge having a conversation with two people. One of the people is actually a machine -- or in this case, an LLM. The three participants communicate only by typing, so there are no visual or audio cues involved. The human Judge has to determine which person in the conversation is not a person at all.
No machine has ever passed the test. We can always tell. Human conversation is complex. Even with the most impressive computers, there's a tell. Maybe the responses are too fast or too perfect. Or they repeat your question exactly before answering, in a way that sounds artificial. One common "tell" is the machine's inability to get sarcasm or jokes.
This recent study involved a five-minute text conversation with no restrictions on what the human Judge could ask. But the Judge could play whatever tricks they felt would trip up the machine. This included small talk about daily activities, personal details, probing emotional qualities, and digging into opinions and personal experiences.
Judges even asked the direct question: are you human? Of course - disturbingly - this doesn't work because LLMs lie.
Now, the Judge in this test doesn't have to be wrong 100% of the time for the LLM to get the win. They just have to be fooled enough of the time to prove participants can't reliably distinguish between a human and a machine. In this study, that meant the Judge just has to correctly identify the real person more than 50% of the time. Should be easy. It allows for a few mistakes, but most of the time the Judge should pick out the human.
Turns out, that's not the case.
GPT 4.5 fooled the Judge a staggering 73% of the time. Judges were _more likely_ to believe Chat GPT was human than the real human participant. This rate is significantly above random chance. The LLM won.
For the record, Meta's LLM fooled Judges 56% of the time, which also passes Turing's test.
One interesting reason for the LLM's success was the prompt it was given when entering the conversation. It was asked to adopt a persona. It was mimicking a human personality, specifically a "young person who is introverted, knowledgeable about internet culture, and uses slang". When guided by this parameter, the conversation became believable and convincing to the Judge.
We've officially entered a new age where humans can be fooled by machines. This doesn't mean the machines think for themselves, or have any agency. But it does open the door for the use of "counterfeit personas", fake people leveraged for any number of crimes and manipulations.
So the next time your buddy from work calls to talk shop, you may want to throw in a sarcastic joke or a non-sequitur having nothing to do with anything. Just to make sure the person on the other end is, you know, actually a person.