AI Hallucinations

AI basics | OpenAI’s latest AI models report high ‘hallucination’ rate: What does it mean, why is this significant?

Context: A recent technical report by OpenAI has sparked concern in the artificial intelligence (AI) community. The findings reveal that OpenAI’s latest models — o3 and o4-mini — are hallucinating more frequently than older versions, raising fundamental questions about the future of large language models (LLMs).

More on News

This highlights the dual nature of the hallucination problem — it is not only about improving algorithms but also about managing user expectations and understanding the limitations of machine-generated knowledge.

What Are AI Hallucinations?

  • Originally, AI hallucinations referred specifically to fabricated information generated by AI models.
  • A well-known case: In June 2023, a U.S. lawyer used ChatGPT to draft a court filing — the chatbot included fake citations and nonexistent cases.
  • Today, hallucinations include:
    • Fabricated facts
    • Irrelevant but factually correct answers
    • Outputs not grounded in the question asked

Why Do LLMs Hallucinate?

  • LLMs (Large Language Models): Systems like ChatGPT, o3, o4-mini, Gemini, etc., generate outputs by identifying patterns in massive internet text datasets.
  • Prediction-based Output: These models guess the next word based on probability; they do not fact-check or understand truth the way humans do (see the sketch after this list).
  • Gary Marcus’ View: “LLMs know word patterns, not facts. They don’t operate like you and me.”
  • Training on Flawed Data: If trained on inaccurate or biased text, the model may reproduce or even generate new inaccuracies.
  • Black-box Nature: Due to their complexity, experts can’t trace exactly why a model gives a specific output.
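
The next-word prediction idea above can be made concrete with a minimal sketch. This is not how any production LLM is implemented; real models score enormous vocabularies with a neural network, and the prompt, candidate words, and probabilities below are invented purely for illustration. The point is that the model samples a statistically plausible continuation rather than checking a fact.

```python
import random

# Toy illustration of next-word prediction. The candidate words and their
# probabilities are invented for this example and do not come from any model.
# Given the prompt "The capital of Australia is", a language model assigns a
# probability to each candidate continuation and samples one of them.
next_word_probs = {
    "Canberra": 0.55,    # correct, and the most likely continuation
    "Sydney": 0.30,      # plausible-sounding but wrong
    "Melbourne": 0.10,   # also plausible-sounding but wrong
    "Wellington": 0.05,  # unlikely, yet still possible to sample
}

def sample_next_word(probs):
    """Pick one candidate word according to its probability weight."""
    words = list(probs.keys())
    weights = list(probs.values())
    return random.choices(words, weights=weights, k=1)[0]

prompt = "The capital of Australia is"
for _ in range(5):
    print(prompt, sample_next_word(next_word_probs))
```

Running the loop a few times usually prints the correct answer, but occasionally a confident-sounding wrong one; at the level of a single fact, that is essentially what a hallucination is.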

OpenAI’s New Report: Key Findings

  • Model o3 (OpenAI’s most powerful system): Hallucinated in 33% of responses during the PersonQA benchmark test (focused on public figures); a toy calculation of such a rate is sketched after this list.
  • Model o4-mini: Hallucinated in 48% of PersonQA test cases.
  • Significance: These rates are higher than previous models, reversing the earlier trend of improvement.
  • OpenAI’s Challenge: The company does not know why hallucinations have increased in newer models.
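
OpenAI’s report is not quoted here in enough detail to reproduce its grading pipeline, so the snippet below only illustrates the arithmetic behind a headline figure such as “33% of responses”: grade each benchmark response, count those flagged as containing fabricated content, and divide by the total. The `graded_responses` records and the `is_hallucinated` flags are hypothetical, not real PersonQA data.

```python
# Hypothetical illustration of how a benchmark-level hallucination rate is
# computed. The graded records below are invented, not real PersonQA data.
graded_responses = [
    {"question": "Where was Person A born?", "is_hallucinated": False},
    {"question": "What award did Person B win in 2010?", "is_hallucinated": True},
    {"question": "Which university did Person C attend?", "is_hallucinated": False},
]

def hallucination_rate(responses):
    """Fraction of graded responses flagged as containing fabricated content."""
    flagged = sum(1 for r in responses if r["is_hallucinated"])
    return flagged / len(responses)

# Prints "Hallucination rate: 33%" for this toy set, the same scale as the
# figure reported for o3 on PersonQA.
print(f"Hallucination rate: {hallucination_rate(graded_responses):.0%}")
```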

Why Is the Report Significant?

  • Hallucination has always been an issue in AI, but there was optimism that it would decline over time.
  • The latest findings show that hallucination is not going away—and might even be getting worse.
  • This trend is not unique to OpenAI: Chinese startup DeepSeek saw double-digit increases in hallucination rates in its new R1 model.
  • Implication: All LLMs currently face similar limitations, regardless of origin.

Limitations in Practical Use

  • Due to hallucination risks, the applicability of AI systems is currently limited in several fields:
    • They cannot yet be trusted as research assistants, since they may generate fake citations in academic papers.
    • They are unreliable as paralegal bots, as they can fabricate legal cases and misinterpret laws.
    • In high-stakes domains like medicine, law, and science, even small errors can have serious consequences, making hallucination a critical barrier.