On a recent afternoon Jonas Thiel, a socioeconomics major at a college in northern Germany, spent more than an hour chatting online with some of the left-wing political philosophers he had been studying. These were not the actual philosophers but virtual recreations, brought to conversation, if not quite life, by sophisticated chatbots on a website called Character.AI.
Mr. Thiel’s favorite was a bot that imitated Karl Kautsky, a Czech-Austrian socialist who died before World War II. When Mr. Thiel asked Kautsky’s digital avatar to provide some advice for modern-day socialists struggling to rebuild the workers’ movement in Germany, Kautsky-bot suggested that they launch a newspaper. “They can use it not only as a means of spreading socialist propaganda, which is in short supply in Germany for the time being, but also to organize working class people,” the bot said.
Kautsky-bot went on to argue that the working classes would eventually “come to their senses” and embrace a modern-day Marxist revolution. “The proletariat is at a low point in their history right now,” it wrote. “They will eventually realize the flaws in capitalism, especially because of climate change.”
Over the course of several days, Mr. Thiel met with other virtual scholars, including G.A. Cohen and Adolph Reed Jr. But he could have picked almost anyone, living or dead, real or imagined. At Character.AI, which emerged this summer, users can chat with reasonable facsimiles of everyone from Queen Elizabeth or William Shakespeare to Billie Eilish or Elon Musk (there are several versions). Anyone you want to invoke, or concoct, is available for conversation. The company and site, founded by Daniel De Freitas and Noam Shazeer, two former Google researchers, are among the many efforts to build a new kind of chatbot. These bots cannot chat exactly like a human, but they often seem to.
In late November, OpenAI, a San Francisco artificial intelligence lab, unveiled a bot called ChatGPT that left more than a million people feeling as if they were chatting with another human being. Similar technologies are under development at Google, Meta and other tech giants. Some companies have been reluctant to share the technology with the wider public. Because these bots learn their skills from data posted to the internet by real people, they often generate untruths, hate speech and language that is biased against women and people of color. If misused, they could become a more efficient way of running the kind of misinformation campaign that has become commonplace in recent years.
“Without any additional guardrails in place, they are just going to end up reflecting all the biases and toxic information that is already on the web,” said Margaret Mitchell, a former A.I. researcher at Microsoft and at Google, where she helped start the company’s Ethical A.I. team. She is now with the A.I. start-up Hugging Face.
But other companies, including Character.AI, are confident that the public will learn to accept the flaws of chatbots and develop a healthy distrust of what they say. Mr. Thiel found that the bots at Character.AI had both a talent for conversation and a knack for impersonating real-life people. “If you read what someone like Kautsky wrote in the 19th century, he does not use the same language we use today,” he said. “But the A.I. can somehow translate his ideas into ordinary modern English.”
For the moment, these and other advanced chatbots are a source of entertainment. And they are quickly becoming a more powerful way of interacting with machines. Experts are still debating whether the strengths of these technologies will outweigh their flaws and potential for harm, but they agree on one point: The believability of make-believe conversation will continue to improve.
The art of conversation
Character.AI founders Noam Shazeer, left, and Daniel De Freitas at their offices in Palo Alto, Calif. Credit: Ian C. Bates for The New York Times
In 2015 Mr. De Freitas, then working as a software engineer at Microsoft, read a research paper published by scientists at Google Brain, the flagship artificial intelligence lab at Google. Detailing what it called “A Neural Conversational Model,” the paper showed how a machine could learn the art of conversation by analyzing dialogue transcripts from hundreds of movies.
The paper described what A.I. researchers call a neural network, a mathematical system loosely modeled on the web of neurons in the brain. This same technology also translates between Spanish and English on services like Google Translate and identifies pedestrians and traffic signs for self-driving cars navigating city streets.
A neural network learns skills by pinpointing patterns in enormous amounts of digital data. By analyzing thousands of cat photos, for instance, it can learn to recognize a cat.
The neural system described in the Google paper was far from perfect but seemed to chat like a real person every once in a while.
When Mr. De Freitas read the paper, he was not yet an A.I. researcher; he was a software engineer working on search engines. But what he really wanted was to take Google’s idea to its logical extreme.
“You could tell this bot could generalize,” he said. “What it said did not look like what was in a movie script.”
He moved to Google in 2017. Officially, he was an engineer at YouTube, the company’s video-sharing site. But for his “20 percent time” project (a Google tradition that lets employees explore new ideas alongside their daily obligations), he began building his own chatbot.
The plan was to train a neural network using a much larger collection of dialogue: reams of chat logs culled from social media services and other sites across the internet. The idea was simple, but it would require enormous amounts of computer processing power. Even a supercomputer would need weeks or months to analyze all that data.
As a Google engineer, he held a few credits that allowed him to run experimental software across the company’s vast network of computer data centers. But these credits would grant only a small fraction of the computing power needed to train his chatbot. So he started borrowing credits from other engineers, knowing that as the system analyzed more data, its skills would improve by leaps and bounds.
Initially, he trained his chatbot using what is called an LSTM, for Long Short-Term Memory, a neural network designed in the 1990s to process sequences of data such as text. But he soon switched to a new kind of neural network called a transformer, developed by a team of Google A.I. researchers that included Noam Shazeer.
Unlike an LSTM, which reads text one word at a time, a transformer can use multiple computer processors to analyze an entire document in a single step.
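As a rough illustration of that difference, the toy Python sketch below contrasts a recurrent loop, where each step must wait for the one before it, with an attention pass, where every position is computed from the whole input independently of the others. The function names, the tiny two-number "word vectors" and the simplified update rules are all assumptions made for illustration. This is neither a real LSTM nor the transformer built at Google, only the general shape of the two computations.

```python
import math

def recurrent_pass(vectors, decay=0.5):
    """Read vectors one at a time; each state depends on the previous state."""
    state = [0.0] * len(vectors[0])
    states = []
    for vec in vectors:  # strictly sequential: step t needs the result of step t-1
        state = [decay * s + (1 - decay) * v for s, v in zip(state, vec)]
        states.append(state)
    return states

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(position, vectors):
    """Build one output vector by weighing every input against the one at `position`."""
    query = vectors[position]
    scores = [sum(q * k for q, k in zip(query, key)) for key in vectors]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]

def attention_pass(vectors):
    # No output here depends on another output, so in principle these calls
    # could be handed to separate processors and run at the same time.
    return [attend(i, vectors) for i in range(len(vectors))]

words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical word vectors
sequential = recurrent_pass(words)
parallel = attention_pass(words)
```

In the first function the loop cannot be split up, because each pass through it needs the state left by the previous pass. In the second, any single call to `attend` can be made on its own, which is the property that lets this style of model spread its work across many processors.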
Google, OpenAI and other organizations were already using transformers to build what are called “large language models,” systems suited for a wide range of language tasks, from writing tweets to answering questions. Still working on his own, Mr. De Freitas focused the idea on conversation, feeding his transformer as much dialogue as possible.
It was an exceedingly simple approach. But as Mr. De Freitas likes to say: “Simple solutions for incredible results.”
The result in this case was a chatbot that he called Meena. It was so effective that Google Brain hired Mr. De Freitas and turned his project into an official research effort. Meena became LaMDA, short for Language Model for Dialogue Applications.
The project spilled into the public consciousness early last summer when another Google engineer, Blake Lemoine, told The Washington Post that LaMDA was sentient. This assertion was an exaggeration, to say the least. But the brouhaha showed how quickly chatbots were improving inside top labs like Google Brain and OpenAI.
Google was reluctant to release the technology, worried that its knack for misinformation and other toxic language could damage the company brand. But by this time Mr. De Freitas and Mr. Shazeer had left Google, determined to get this kind of technology into the hands of as many people as possible through their new company, Character.AI.
“The technology is useful today — for fun, for emotional support, for generating ideas, for all kinds of creativity,” Mr. Shazeer said.
Designed for ‘plausible conversation’
ChatGPT, the bot released by OpenAI to much fanfare in late November, was designed to operate as a new kind of question-and-answer engine. It is pretty good in this role, but the user never knows when the chatbot will just make something up. It may tell you that the official currency of Switzerland is the euro (it’s actually the Swiss franc) or that Mark Twain’s Celebrated Jumping Frog of Calaveras County could not only jump but talk. A.I. researchers call this generation of untruths “hallucination.”
In building Character.AI, Mr. De Freitas and Mr. Shazeer had a different objective: open-ended conversation. They believe that today’s chatbots are better suited to this kind of service, for now a means of entertainment, factual or not. As every page on the site notes, “Everything Characters say is made up!”
“These systems are not designed for truth,” Mr. Shazeer said. “They are designed for plausible conversation.”
Mr. De Freitas, Mr. Shazeer and their colleagues did not build one bot that imitates Elon Musk and another that mimics Queen Elizabeth and a third that parrots William Shakespeare. They built a single system that can imitate all those people and countless others.
It has learned from reams of general dialogue as well as from articles, news stories, books and other digital text describing people like Elon Musk, Queen Elizabeth and William Shakespeare.
The system also has a way of combining disparate concepts learned during training. The result is a practically endless collection of bots that can imitate a practically endless collection of people, riffing on a practically endless number of topics, as Mr. Thiel found when he chatted with the Karl Kautsky bot.
Sometimes, the chatbot gets things right. Sometimes, it doesn’t. When Mr. Thiel chatted with an avatar meant to imitate Mr. Reed, the 20th-century American political thinker, it turned him into “some kind of militant Maoist, which is definitely not right.”
Like Google and OpenAI and other top labs, Mr. De Freitas, Mr. Shazeer and their colleagues plan on training their system with ever larger amounts of digital data. This training can take months, and millions of dollars; it can also sharpen the skills of the artificial conversationalist.
Some researchers say that the rapid improvement of the past several years will last only so long. Richard Socher, a former head of A.I. at Salesforce who now runs a start-up called You.com, believes these exponential improvements will begin to level off over the next few years, when language models reach the point where they have analyzed pretty much all the text on the internet.
But Mr. Shazeer believes the runway is much longer. “There are billions of people in the world generating text all the time,” he said. “People will keep spending more and more money to train smarter and smarter systems. We are nowhere near the end of that trend.”