It's been a month since Google's big AI goof. Its new AI Overviews feature was supposed to take the legwork out of searching, offering easy-to-read answers to our questions based on multiple search results. Instead, it told people to eat rocks and put glue on pizza. You could ask Google which countries in Africa start with the letter "K," and it would tell you there aren't any. (There's Kenya.) In fact, you can still get some of these wrong answers, because AI search remains a mess.
This spring looked like a turning point for AI search, thanks to a couple of big announcements from major players in the space. One was the Google AI Overviews rollout; the other came from Perplexity, an AI search startup that has already been hailed as a viable alternative to Google. In late May, Perplexity launched a new feature called Pages that can create customized web pages filled with information on a specific topic, like a smart friend doing your homework for you. Then Perplexity got caught stealing. For AI search to work well, it seems, it has to cheat a bit.
There's a lot of bad blood over AI search errors and missteps, and critics are becoming increasingly vocal. A group of online publishers and creators went to Capitol Hill on Wednesday to lobby lawmakers to look into Google's AI Overviews feature and other AI tech that pulls content from independent creators. This came just days after the Recording Industry Association of America (RIAA) and a group of major record labels filed a copyright infringement lawsuit against two companies whose AI generates music from text prompts. And let's not forget that several newspapers, including the New York Times, have filed copyright infringement lawsuits against OpenAI and Microsoft for scraping their content to train the same AI models that power their search tools. (Vox Media, the company that owns this publication, meanwhile, has a licensing agreement with OpenAI that allows our content to be used to train its models and to be surfaced in ChatGPT. Our journalism and editorial decisions remain independent.)
Generative AI technology is set to change the way we search the web. At least, that's the line we've been fed since ChatGPT hit the scene in late 2022, and now every tech company is pushing its own brand of AI: Microsoft has Copilot, Google has Gemini, and Apple has Apple Intelligence. While these tools can do more than just help you find things online, beating Google Search still seems to be the holy grail of AI. Even OpenAI, the maker of ChatGPT, is reportedly building a search engine to compete directly with Google.
But despite the public efforts of many companies, AI search won't make it easier to find answers online anytime soon, according to experts I spoke with.
It's not just that AI search isn't ready for prime time because of a few flaws; it's that those flaws are so deeply embedded in how AI search works that it's unclear whether it will ever be good enough to replace Google.
"It's a nice addition, and there are times when it's really great," Chirag Shah, a professor of information science at the University of Washington, told me. "But I think we're still going to need traditional search."
Rather than go into all of AI search's flaws here, I'll highlight two that were on display in the recent Google and Perplexity kerfuffles. The Google pizza glue incident shows just how stubborn AI's hallucination problem is. Just days after Google launched AI Overviews, some users noticed that if you asked Google how to stop cheese from sliding off a pizza, it would suggest adding some glue. That particular answer came from an old Reddit thread that, for some reason, Google's AI treated as an authoritative source, even though a human would quickly realize the Redditor was joking about eating glue. Weeks later, The Verge's Elizabeth Lopatto reported that Google's AI Overviews feature was still recommending pizza glue. Google scaled back AI Overviews in May after the viral failures, so it's now fairly hard to trigger one at all.
The problem isn't just that the large language models that power generative AI tools can hallucinate, or make up, information in some situations. It's that they can't tell good information from bad – at least not yet.
“I don't think we'll ever be at the stage where we can guarantee that there won't be hallucinations,” said Yoon Kim, an assistant professor at MIT who researches large language models. “But I think there's been a lot of progress in reducing these hallucinations, and I think we're going to get to a point where they're going to be good enough to use.”
The recent Perplexity drama highlights a different problem with AI search: it scrapes and republishes content that doesn't belong to it. Perplexity, whose investors include Jeff Bezos and Nvidia, made a name for itself by providing in-depth answers to search queries and showing its sources. You can ask it a question, and it will come back with a conversational answer, complete with references from around the web, which you can refine by asking follow-up questions.
When Perplexity launched its Pages feature, however, it became clear that its AI had an uncanny ability to rip off journalism. Perplexity even displays the pages it creates as a news section of its website. One such page it published included summaries of some of Forbes' exclusive, paywalled investigative reporting on Eric Schmidt's drone project. Forbes accused Perplexity of stealing its content, and Wired later reported that Perplexity was scraping content from websites that had blocked its crawler from doing exactly that kind of scraping. The AI-powered search engine would even give inaccurate answers to queries, seemingly improvised from details in URLs or metadata. (In an interview with Fast Company last week, Perplexity CEO Aravind Srinivas disputed some of the Wired investigation's findings, saying, "I think there's a fundamental misunderstanding of how it works.")
Shah explained that the reasons AI-powered search stinks are both technical and simple. The technical explanation involves something called retrieval-augmented generation (RAG), which works a bit like a professor recruiting research assistants to dig up more on a particular topic when their personal library isn't enough. RAG solves some of the problems with how the current generation of large language models generate content, including hallucinations, but it also creates a new one: it can't distinguish good sources from bad. In its current state, AI lacks good judgment.
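To make the RAG idea a little more concrete, here's a minimal sketch of how a retrieval-augmented answer gets assembled. The tiny corpus, the word-overlap scoring, and the stubbed `generate` function are all hypothetical stand-ins, not any search engine's actual pipeline; the point is where the judgment gap sits.

```python
# A minimal, illustrative sketch of retrieval-augmented generation (RAG).
# Everything here is a placeholder: real systems use web indexes, vector
# embeddings, and a hosted large language model instead of these stubs.

CORPUS = [
    {"source": "newspaper article", "text": "Cheese slides off pizza when the sauce is too watery."},
    {"source": "old Reddit joke",   "text": "Just add some glue to the sauce so the cheese sticks."},
]

def retrieve(query: str, k: int = 2):
    """Rank documents by naive word overlap with the query (a stand-in for a real retriever)."""
    words = set(query.lower().split())
    scored = sorted(CORPUS, key=lambda d: len(words & set(d["text"].lower().split())), reverse=True)
    return scored[:k]

def generate(prompt: str) -> str:
    """Stub for a large language model call; a real system would send this prompt to an LLM."""
    return f"[LLM answer conditioned on]:\n{prompt}"

def answer(query: str) -> str:
    docs = retrieve(query)
    # Note: nothing in this loop judges whether a source is trustworthy.
    # The Reddit joke reaches the model on equal footing with the newspaper article.
    context = "\n".join(f"- ({d['source']}) {d['text']}" for d in docs)
    return generate(f"Question: {query}\nContext:\n{context}")

print(answer("How do I stop cheese falling off my pizza?"))
```

That's the gap Shah is describing: retrieval decides what the model gets to see, but nothing in the loop weighs a newspaper against a joke thread.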
When you or I do a Google search, we know that the long list of blue links will include both high-quality sources, like newspaper articles, and low-quality or unverified stuff, like old Reddit threads or SEO spam. We can tell the good from the bad in a split second thanks to years of experience honing our Googling skills.
And then there's the common sense that AI doesn't have, like knowing whether or not it's okay to eat rocks and glue.
“AI-powered search doesn't have that capability right now,” Shah said.
None of this is to say that you should turn around and run the next time you see an AI Overview. But instead of treating it as an easy way to get an answer, think of it as a starting point, like Wikipedia. It's hard to know how that answer got to the top of your Google search, so you'll probably want to check its sources. After all, you're smarter than the AI.