Date: 2024-05-30

One fundamental problem is that generative AI tools don't know what is true, just what is popular. For example, there aren't a lot of articles on the web about eating rocks, because it is so self-evidently a bad idea.

There is, however, a well-read satirical article from The Onion about eating rocks. And so Google's AI based its summary on what was popular, not what was true.

Some AI Overview results appear to have mistaken jokes and parodies for factual information. Google / The Conversation

Another problem is that generative AI tools don't have our values. They're trained on a large chunk of the web.

And while sophisticated techniques (that go by exotic names such as "reinforcement learning from human feedback" or RLHF) are used to eliminate the worst, it is unsurprising they reflect some of the biases, conspiracy theories and worse to be found on the web. Indeed, I am always amazed how polite and well-behaved AI chatbots are, given what they're trained on.

If this is really the future of search, then we're in for a bumpy ride. Google is, of course, playing catch-up with OpenAI and Microsoft.

The financial incentives to lead the AI race are immense. Google is therefore being less prudent than in the past in pushing the technology out into users' hands.

In 2023, Google chief executive Sundar Pichai said:

That no longer appears to be so true, as Google responds to criticisms that it has become a large and lethargic competitor.

It's a risky strategy for Google. It risks losing the trust that the public has in Google being the place to find (correct) answers to questions.

But Google also risks undermining its own billion-dollar business model. If we no longer click on links and just read their summary, how does Google continue to make money?

The risks are not restricted to Google. I fear such use of AI might be harmful for society more broadly. Truth is already a somewhat contested and fungible idea. AI untruths are likely to make this worse.

In a decade's time, we may look back at 2024 as the golden age of the web, when most of it was quality human-generated content, before the bots took over and filled the web with synthetic and increasingly low-quality AI-generated content.

The second generation of large language models is likely being trained, unintentionally, on some of the outputs of the first generation. And lots of AI startups are touting the benefits of training on synthetic, AI-generated data.