
The female Turing Test

Most high-tech folks assume they know what a “Turing Test” is: You ask an interrogator to chat online with a human and a computer, and to try and figure out which is which. Right?

Nope. In his famous 1950 essay “Computing Machinery and Intelligence”, mathematician Alan Turing described a test that was quite different. Turing called his invention “the imitation game”, and he described it thusly:

It is played with three people, a man (A), a woman (B), and an interrogator (C) who may be of either sex. The interrogator stays in a room apart from the other two. The object of the game for the interrogator is to determine which of the other two is the man and which is the woman. [snip]

We now ask the question, ‘What will happen when a machine takes the part of A in this game?’ Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman? These questions replace our original, ‘Can machines think?’

Dig it: The goal is not to figure out which one is a ‘bot; it’s to figure out which is female. That sounds weird, sure, but the thing is, this version of the Turing test is actually far better at determining whether a chatbot can pass as human. It’s more fair.

That’s because the Turing test, as it’s popularly misinterpreted, is heavily skewed against the robots. In this version, the interrogator is forewarned; he’s told that one of the chatters will be a machine, so he’s on the lookout for robot-like behavior. And of course, if you already know a robot is lurking out there somewhere, it’s easy to smoke it out. You just ask a content-heavy question, such as “What’s the capital of Uzbekistan?” or “What do you think of Putin’s latest move against Yukos Oil?” The robot can’t deal with that.

But this is inherently unfair, because these sorts of questions rarely occur in normal, everyday human conversation. On the contrary, the average conversation is composed of completely content-free exchanges that rarely go beyond the “wazzup” stage. (“How’s it going?” “Can’t complain. How about you?” “Ah, could be worse.”) Most real, adult humans spend the day not discussing philosophy, but vaguely pinging the people around them with completely formulaic conversational gambits. On that playing field, most chatbots are perfectly capable of holding their own. Indeed, if you’re not on your guard, it’s extremely easy to get fooled — at least temporarily — by a chatbot, as did the victims of AOLiza, and as even I did last year when someone sicced a ‘bot on me without my knowledge. The reason why chatbots never win in formal Turing-tests is that the interrogator is suspicious from the get-go, and does not engage in realistic conversation.

To make the Turing test fair, you have to introduce a ‘bot into a quasi-normal conversation — one in which the interrogator is unaware of the possibility that he might be talking to a machine. And that is precisely why Turing introduced that weird bit of gender misdirection in his “imitation game”. Because the interrogator is concentrating on the question of “am I talking to a woman or not?”, he’s focusing on how the chatter uses language, and wondering whether it sounds “female” enough. He’s not obsessing over whether the chatter is human or not. This means the ‘bot is, finally, on an even playing-field with its human competitor. But since no-one bothers to read Turing’s original essay, Turing tests are never conducted in this fashion.

Until now! Last weekend, a bunch of students at Simon’s Rock College ran a Turing test using the original, punk-rock version. Better yet, they used ALICE, the open-source chatbot created by Richard Wallace, whom I profiled for the New York Times Magazine. True to Turing’s suggestion, the students told their subjects that they would be participating in “The Guessing Game”: a test of whether they could disambiguate male and female chatters. They didn’t tell the subjects that robots would be involved. Over 80 people participated, and I can’t wait to find out what data they collected. (That’s a picture of a participant above.)
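For the curious, here is roughly how you might score a run like this against Turing's own criterion, which asks whether the interrogator guesses wrong as often when a machine plays A as when a man does. This is only a sketch under my own assumptions: the function names are made up for illustration, the two-proportion z-test is my choice of comparison (Turing names no statistic, and I don't know how the students plan to analyze their results), and any counts you feed it are yours to supply.

```python
import math


def error_rate(wrong, total):
    """Fraction of games in which the interrogator picked the wrong chatter."""
    return wrong / total


def two_proportion_z(wrong_a, total_a, wrong_b, total_b):
    """z statistic for the difference between two error rates.

    Group a: baseline games (a man plays A).
    Group b: machine games (a chatbot plays A).
    Assumption: a pooled two-proportion z-test is a reasonable comparison.
    """
    p_a = wrong_a / total_a
    p_b = wrong_b / total_b
    pooled = (wrong_a + wrong_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        # Every game went the same way in both groups, so nothing to compare.
        return 0.0
    return (p_b - p_a) / se


def two_sided_p(z):
    """Two-sided p-value for a z statistic under the normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical counts, NOT data from the Simon's Rock experiment.
    z = two_proportion_z(wrong_a=10, total_a=40, wrong_b=12, total_b=40)
    print("baseline error rate:", error_rate(10, 40))
    print("machine error rate:", error_rate(12, 40))
    print("z:", z, "p:", two_sided_p(z))
```

If the two error rates come out statistically indistinguishable, the machine is doing about as well as the man at the imitation game, which is the outcome Turing's question is after. A large gap in either direction means the interrogators could tell something was off, even if they never suspected a robot.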

Given that Turing was a closeted gay man for most of his life, his original test is almost fractal in its multifaceted weirdness. What precisely did he think would be different about how men and women communicate? Did he wonder whether a suspicious observer could spot someone passing himself off as a “different” gender — something that doubtless resonated deeply with Turing? His essay gives no clues, so I’m just speculating here. As a related irony, you may remember the computer program that Israeli scientists wrote two years ago that can, in fact, figure out whether the author of an anonymous text is a man or a woman.

Maybe in the future we’ll be turning the Turing test on its head. When a computer is finally smart enough to be considered alive, will it be able to figure out that it’s talking to a human?

(Thanks to Slashdot for this one!)




Bio:

I'm Clive Thompson, a writer on science, technology, and culture. This blog collects bits of offbeat research I'm running into, and musings thereon.

Currently, I'm a contributing writer for the New York Times Magazine and a columnist for Wired magazine. I also write for Fast Company and Wired magazine's web site, among other places. Email or AOL IM me (pomeranian99) to say hi or send in something strange!





Collision Detection: A Blog by Clive Thompson