According to a new BBC study, the four most popular AI chatbots - ChatGPT by OpenAI, Copilot by Microsoft, Gemini by Google, and Perplexity AI - do not always summarize the news accurately.
The broadcaster's journalists posed questions based on various news stories and instructed the chatbots to use BBC articles as their source. In many cases, however, ChatGPT, Copilot, and the others made mistakes, misrepresented information, or cited other sources instead.
The BBC sent 100 questions to each chatbot: 91% of the responses contained at least minor errors, 51% contained major problems, and 19% included factual errors such as incorrect statements, numbers, or dates.
The chatbots' mistakes included misattributing sources and conflating material from different stories. ChatGPT and Gemini, for example, sometimes presented old BBC articles as current news or blended recent reporting with outdated data.
Another problem is that the chatbots do not always distinguish factual reporting from an author's opinion: BBC journalists found at least 23 cases in which an opinion was presented as fact.
Overall, the BBC found that Copilot and Gemini produced more significant errors in news-based responses than ChatGPT or Perplexity, though all four chatbots showed problems.