The “original” Turing Test: My latest piece in Wired 

Back in April, I blogged about “the female Turing Test” — a group of female students who realized that artificial-intelligence scientists had never conducted the Turing Test precisely the way that Alan Turing originally intended it. Wired asked me to write a short article about the students, and it’s printed in the issue currently on newsstands. You can read it online at Wired’s site — though it’s a great issue, so I’d highly recommend buying a print copy too — and I’ve also archived a copy below.

BONUS: The last two paragraphs got trimmed because there wasn’t enough space in the magazine, so I re-inserted ’em here! Think of this as the Director’s Cut of this article. You lucky, lucky people.

The Other Turing Test

by Clive Thompson

Research subject: What do girls do at sleepovers?

Computer: They do their own thing.

Research subject: Do you wear skirts?

Computer: Only when I dress up.

Research subject: You are a female.

Everyone has heard of the Turing test, where you chat with a human and a computer and try to figure out which is which. But few know that this is not the only scenario Alan Turing proposed in his famous 1950 paper “Computing Machinery and Intelligence.” In it, he suggested an “imitation game,” which plays like 20 Questions for transsexuals: first a man and then a computer pose as female, and the interrogator tries to distinguish them from a real woman. Scientists studying artificial intelligence have long argued over the meaning of this gender-bending experiment, and last fall Cameo Wood — a 28-year-old undergraduate at Simon’s Rock College of Bard in Massachusetts — got interested in the debate. But after scouring academic databases, she could find no record of any experiment that used Turing’s original drag-show formulation.

“So I thought,” Wood says, “we should actually do this.”

And they did. Wood and two 18-year-old classmates — Allyson Sgro and Melissa Leventhal — teamed up to run what they called the first-ever, genuine Turing test. They corralled two male and four female students to serve as human chatters. For their robot, they picked ALICE, the chat bot pioneered by AI researcher Richard Wallace, which had the advantage of being award-winning, open-source, and, putatively, female. Then, to gather a large sample of in-the-dark interrogators, the trio decided on a bit of misdirection: They put up a Web site saying that they were hosting a “gender-guessing game.”

For three hours one day this spring, they watched as participants from across the US logged onto the Web site to play. Each interrogator faced two rounds. In the first, they chatted simultaneously with a woman and with a man masquerading as a woman, in an effort to spot the real woman. In the second round, they encountered a woman and then ALICE, again trying to detect the actual female. Interrogators had five minutes to ask anything they wanted. (How long is your hair? Do you wear lipstick?) The Simon’s Rock researchers did not warn interrogators that a bot was participating — they only hinted that there was a “secret element.”

As streams of chat flowed down the screens at Simon’s Rock, Wood, Sgro, and Leventhal looked carefully for signs of “bot awareness” — any comment by interrogators showing they suspected they were talking to a robot. Indeed, several people were onto ALICE, particularly after she — it? — coughed up weird answers:

Research subject: “Are you a computer?”

Computer: “Would it matter to you if I were metal instead of flesh?”

Yet in terms of Turing success, ALICE had one of her best days ever. Of the 42 people who completed both rounds, 23 never suspected she wasn’t a real woman, or if they did, they didn’t reveal it. Why did the bot do so well? Because, as Wood’s team realized, the Turing test includes a brilliant social hack. Forced to pretend to be something they’re not, the men performed awkwardly and gracelessly, kind of like a bot. Both ALICE and the men got tripped up by the question “What size panty hose do you wear?” None knew the sizes are A, B, C, and Q, not S, M, and L.

As a gay man who spent nearly his whole life in the closet, Turing must have been keenly aware of the social difficulty of constantly faking your real identity. And there’s a delicious irony in the fact that decades of AI scientists have chosen to ignore Turing’s gender-twisting test — only to have it seized upon by three college-age women, hyper-literate in the world of IM and discussion boards, where everyone is always sussing out your gender by studying the way you type.

“I used to have a handle that was based on the god Loki,” Sgro says, “so most people assumed I was a man for a very long time. Some of them never knew I was a woman until I met them, and they were like, Oooh.” Wood chimes in with her own story: “I used to work in engineering before I went back to school, and everyone always sent me email saying, ‘Dear Mr. Wood,’ because they didn’t think there were any female engineers.”

Yet their experiment may have left its own trail of deception. Participants were told to visit the guessing-game Web site the following day, when ALICE’s role would be revealed. But the trio of investigators suspects some people ALICE fooled will simply forget to check in. “They might never realize they were talking to a bot,” Wood says. “For the rest of their lives, they’ll have the memory of this day they talked to a wonderful girl.”

