You probably have a sense that new forms of artificial intelligence can be dumb as rocks.
Hilariously wrong answers from Google’s new AI are showing just how dumb it can be.
In search results, Google’s AI recently suggested mixing glue into pizza ingredients so the cheese doesn’t slide off. (Don’t do this.) It has previously said drinking plenty of urine can help you pass a kidney stone. (Don’t do this. And Google said it fixed this one.)
Google’s AI said John F. Kennedy graduated from the University of Wisconsin at Madison in six different years, including 1993. It said no African countries start with the letter “K.” (Nope. Kenya.)
These are mostly silly examples that in some cases came from people trying to coax Google’s AI into saying the wrong thing.
But there’s a serious lesson: You should have relatively low expectations for AI. With generative AI from Google, OpenAI’s ChatGPT and Microsoft’s Copilot, you should assume the answers are wrong until proved otherwise.
These chatbots can still be incredibly useful. But assuming they’re wrong is a different mental model from most technologies you use regularly. When you type directions into Waze, send an email or click a link in Google, you have a reasonable expectation that the technologies work properly and accurately.
“The vast majority of AI Overviews provide high quality information, with links to dig deeper on the web,” Google said in a statement. The company said it did extensive testing before the AI search results launched and that the company is using examples of mistakes to “develop broader improvements to our systems, some of which have already started to roll out.”
I’ll explain why Google’s AI (and other chatbots) might tell you to eat glue and the lessons you should take from such mistakes.
– – –
How AI can be so wrong
The technology behind ChatGPT and the “AI Overviews” in Google searches, which rolled out to all Americans last week, is called a large language model.
The tech has been fed gobs of information from the internet – it could be news articles, Wikipedia, online recipes, law school admissions practice tests, Reddit forums and more. Based on patterns from the internet data, computers generate mathematically likely words to your requests.
Computer programmer Simon Willison said that Google’s AI works by taking the terms you type into the search box and pasting your search and its results into a large language model. The model then pulls out what looks like useful information from relevant websites in Google’s search results.
Sometimes Google’s AI isolates correct and useful information. Sometimes, particularly if there’s not much online information related to your search, it spits out something wrong.
The pizza glue example, Willison said, appeared to come from the relatively sparse online information related to a probably uncommon Google search for making cheese stick to pizza. Google’s search AI appeared to draw its response from one seemingly joking, 11-year-old post from Reddit.
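To make that mechanism concrete, here is a minimal sketch of the search-plus-language-model pattern Willison describes. It is a hypothetical illustration under stated assumptions, not Google’s actual code: search_web and ask_llm are made-up stand-ins for whatever search index and language model a real system would use.

```python
# A minimal, hypothetical sketch of a search-plus-language-model answer pipeline.
# search_web() and ask_llm() are stand-ins, not real Google or OpenAI APIs.

def search_web(query: str, limit: int = 3) -> list[str]:
    """Stand-in for a search index: returns text snippets from top results."""
    # A real system would query a search engine. Here we fake sparse results,
    # including an old joke post, to show what the model ends up working with.
    return [
        "Reddit (2013): add about 1/8 cup of non-toxic glue to the sauce for more tackiness.",
        "Blog post: cheese slides off pizza when the sauce is too watery.",
    ][:limit]

def ask_llm(prompt: str) -> str:
    """Stand-in for a large language model: returns a plausible-sounding answer."""
    # A real model predicts statistically likely words from the prompt;
    # it has no built-in way to tell a joke from a fact.
    return "To keep the cheese from sliding, mix a little glue into the sauce."

def ai_overview(query: str) -> str:
    """Fetch snippets for the query and ask the model to answer from them."""
    snippets = search_web(query)
    prompt = (
        "Answer the question using only the sources below.\n\n"
        + "\n".join(f"Source {i + 1}: {s}" for i, s in enumerate(snippets))
        + f"\n\nQuestion: {query}\nAnswer:"
    )
    return ask_llm(prompt)

if __name__ == "__main__":
    print(ai_overview("how do I make cheese stick to pizza"))
```

The point of the sketch: when the top “sources” are sparse or jokey, the model still produces a fluent, confident-sounding answer from them, because nothing in the pipeline checks whether the underlying snippet was serious.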
AI cannot tell the difference between a joke and a fact. And if the AI doesn’t have enough information to give you an accurate answer, it might invent a confident-sounding fiction, such as claiming that dogs have played professional basketball and hockey.
“It’s hard to build a version of this that doesn’t have all sorts of weird bugs,” Willison said.
Most of the examples flying around social media were laughable mistakes from Google’s AI. We confirmed, however, that Google’s AI repeated a piece of persistent misinformation: that former president Barack Obama is a Muslim. That’s false.
Google said it removed the AI-generated false information.
Companies don’t disclose much about what information their AI models “learn” from or how often their chatbots are wrong.
OpenAI said its accuracy rate has improved. Microsoft said its Copilot chatbot includes links in its replies, as Google does, so people can explore more. Microsoft also said it takes feedback from people and is improving Copilot.
– – –
‘Distrust and verify’
Willison suggested that twist on the Reagan-era “trust but verify” refrain for chatbots. Let’s put it on bumper stickers.
Most people wouldn’t eat glue just because Google suggested it. But chatbots are already difficult to use effectively. The risk is that AI that’s wrong a lot of the time will waste your time and erode your confidence in an emerging technology.
Technology writer Molly White said companies should do more to lower your expectations.
She said Google, OpenAI and Microsoft must be more careful not to show you AI-generated information for high-stakes questions like those relating to your health or the law. (Representatives for Google and OpenAI said the companies put limits on the use of AI-generated information for higher-stakes topics, including health and legal questions. Microsoft said many Copilot replies, including for health and legal information, may refer people to Bing web search results.)
White also said companies shouldn’t show unrealistic demonstrations of quick-witted chatbots that make it seem like you’re interacting with a human who has infinite knowledge.
“This is not a person on the other end who is reasoning,” White said. “It’s guessing at what may sound right.”
It’s particularly tricky to set your expectations for AI responses in Google searches. Google is now mixing familiar search results with new, AI-generated information that is likely to be less reliable. ChatGPT isn’t doing that.
With appropriately low expectations, I’m still cautiously curious about Google’s AI. This week, I did a Google search about the top editor at Wired. Google’s AI-generated information wove together her biographical details, plus a quote from her boss from when she was hired.
Without Google’s AI response, I would have needed to dig through several links to find the information. That’s the promise Google has made: that its AI will do the Googling for you.
That time, the promise was fulfilled. And Google’s AI didn’t tell me to eat glue.