Microsoft Bing faces a big problem: Google utterly eclipses it as a search engine. But Bing has a chance to grab more attention for itself with OpenAI’s language technology, the artificial intelligence foundation that’s made the ChatGPT service a huge hit.
For the brainier Bing to work, though, Microsoft has to get the details right. ChatGPT can be useful, but it can be flaky, too, and nobody wants a search engine they can’t trust.
Microsoft has put a lot of thought and its own programming resources into the challenge. It’s wrestled with issues like how AI-powered Bing shows ads, reveals its data sources, and grounds the AI technology in reality so you get trustworthy results, not the digital hallucinations that can be hard to spot in machine-generated information.
I spoke to Jordi Ribas, leader of Bing search and AI, to dig more deeply into the overhauled Bing search engine. He’s a big enough fan that he used the technology to help him write his boss a memo about it. “It probably saved me two to three hours,” he said, and it improved the Spanish executive’s English, too.
When the technology expands beyond today’s very small test group, it’ll let millions of us dig for much more complicated information, like whether an Ikea loveseat will fit into your car. And we’ll all be able to see whether it truly gives Google a run for its money. But for now, here are seven aspects of Bing AI that I learned.
Bing AI isn’t just a repackaged version of ChatGPT
Microsoft blends its Bing search engine with the large language model technology from OpenAI, the AI lab that built the ChatGPT tool that’s fired up excitement about AI and that Microsoft invested in. You can get ChatGPT-like results using Bing’s “chat” option — for example, “Write a short essay on the importance of Taoism.” But for other queries, Bing and OpenAI technology are blended through an orchestration system Microsoft calls Prometheus.
For instance, you can Bing, “I like the band Led Zeppelin. What other musicians should I listen to?” OpenAI first paraphrases that prompt to “bands similar to Led Zeppelin,” then repackages Bing search results in a bulleted list. Each suggestion, like Fleetwood Mac, Pink Floyd and the Rolling Stones, comes with a two-sentence description.
Bing AI cites its sources — sometimes
When you give ChatGPT a prompt, it’ll respond with text it generates, but it won’t tell you where it got that information. The AI system is trained on vast amounts of the information on the internet, but it’s hard to draw a direct line between that training data and ChatGPT’s output.
On Bing, though, factual information is often annotated, because Bing knows the source from its indexing of the web. For example, in the Led Zeppelin prompt above, Bing includes a link at the top of its answer to a Musicaroo post, 13 Bands That Sound Like Led Zeppelin, and includes that link and others from MusicalMum and Producer Hive.
That sourcing transparency helps address a big criticism of AI, making it easier to evaluate whether the response is accurate or a mere AI hallucination. But it doesn’t always appear. In the essay on Taoism above, for example, there aren’t any sources, footnotes or links at all.
Some source links are ads that make Microsoft money
The Bing AI’s elaborate answers provide a new way for Microsoft to generate money from ads. In traditional Bing searches, the “organic” search results that Bing judges to be most relevant are separate from items placed by advertisers. But with Bing AI searches, the two types of information can be blended.
For example, in its response to the query “plan me a one-week trip to Iceland without a rental car,” AI-powered Bing suggests several destinations. In one of them, several words are underlined: “You can visit places like Vík, Skógafoss, Seljalandsfoss, and Jökulsárlón glacier lagoon by joining a multi-day tour or taking a bus.” Hovering over that link shows three sources for that information and an ad from a tour company. The advertisement is the top item of the three and is labeled “ad.”
“When you look at those citations, sometimes they are ads,” Ribas said. “When it’s more of a purchasing intent query, you hover over it and you’ll see the list of the references and sometimes it’s an ad. Then sometimes in the conversation itself, you’re going to see product ads, like if you do a hotel query.”
Ad revenue is a big deal, since it takes weeks of work on an enormous cluster of computers for OpenAI to build a single update to its language model, and OpenAI CEO Sam Altman estimates it costs a few cents to process each ChatGPT prompt. Bing, even though it’s a distant second to Google in the search engine market, still handles millions of queries a day.
Google plans to open access to its Bard AI chatbot soon, but it won’t be including ads to begin with.
OpenAI-boosted results are more relevant than plain old Bing
The fundamental measure of a search engine’s usefulness is whether its results are relevant, and the OpenAI technology brings a huge boost in the measurement that Microsoft uses to score its search engine results’ relevance.
“My team, working super, super hard in a given year, might move that metric by one point,” Ribas said, but OpenAI’s technology boosted it three points in one fell swoop. “It’s just never happened before in the history of Bing,” Ribas said.
That relevance boost is just for ordinary search results, Ribas added. OpenAI’s technology can further improve Bing with its chat interface that offers more elaborate answers and a follow-up exchange.
OpenAI makes Bing better with languages besides English
One particular area where Bing has been weak is searches that aren’t in English, and Ribas said OpenAI helps there. A lot of Bing’s three-point gain in relevance scoring “came from international markets,” Ribas said.
OpenAI’s large language model, or LLM, is trained with text from 100 languages. “Catalan is my first language. I can have a dialogue in Catalan. It works really, really well,” Ribas said.
Bing brings OpenAI’s results up to date
Large language models like OpenAI’s GPT-3.5, the foundation for ChatGPT, are slow to build and improve, which means they don’t move at the speed of the web or of conventional search engines. GPT-3.5, for example, was trained in 2021, so it doesn’t have any idea about Russia’s invasion of Ukraine, the effects of recent inflation on consumers, or Xi Jinping securing his third term as general secretary of the Chinese Communist Party.
Bing often does know this more recent information, though. “When you bring in the Bing results, then you will get fresh results on that complete answer,” Ribas said.
Bing ‘grounds’ OpenAI’s flights of fancy
Microsoft uses its Bing data to try to avoid situations where OpenAI’s more creative technology could lead people astray. The more factual a query and answer are, the more Bing’s technology is used in the answer, Ribas said. This “grounding” significantly reduces AI’s problems with making stuff up: “It will reduce hallucination, which is … an ongoing battle,” Ribas said.
But Microsoft doesn’t want its grounding system to squash all the magic out of the AI. There’s a reason ChatGPT has been so captivating. The Prometheus system decides on the priorities for each query.
“We had to find the sweet spot between over-grounding the model and keeping it interesting,” Ribas said. “We have a measurement of the interestingness of the results, and we have a measurement for the groundedness of the results. The more the query is looking for something very factual, the more we weight the grounded. The more the query is supposed to be creative, the less we weight the grounded. I kept telling my team, I want my cake and eat it too.”
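To make that trade-off concrete, here's a minimal Python sketch of the kind of weighting Ribas describes. Microsoft hasn't published how Prometheus actually scores answers, so everything in it is an assumption for illustration: the function names, the 0-to-1 scales, the simple linear blend and the sample numbers are all hypothetical.

```python
# Hypothetical sketch only: Microsoft has not published Prometheus internals.
# All names, scales and numbers below are illustrative assumptions.

def blended_score(groundedness: float, interestingness: float, factuality: float) -> float:
    """Blend two quality scores for one candidate answer.

    factuality runs from 0.0 (purely creative query) to 1.0 (purely
    factual query) and sets how heavily groundedness is weighted.
    """
    if not 0.0 <= factuality <= 1.0:
        raise ValueError("factuality must be between 0 and 1")
    return factuality * groundedness + (1.0 - factuality) * interestingness


def pick_answer(candidates: list[dict], factuality: float) -> dict:
    """Return the candidate with the highest blended score."""
    return max(
        candidates,
        key=lambda c: blended_score(c["groundedness"], c["interestingness"], factuality),
    )


# Two made-up candidate answers with made-up scores.
candidates = [
    {"text": "cited summary", "groundedness": 0.9, "interestingness": 0.4},
    {"text": "imaginative riff", "groundedness": 0.3, "interestingness": 0.9},
]

factual_pick = pick_answer(candidates, factuality=0.9)   # e.g. a question about a date
creative_pick = pick_answer(candidates, factuality=0.1)  # e.g. "write me a poem"
print(factual_pick["text"], "|", creative_pick["text"])
```

With these invented numbers, the mostly factual query favors the well-grounded answer and the mostly creative query favors the more interesting one, which is the behavior Ribas describes. A real system would presumably be far more involved than a single linear weight.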