Hallucinations
What are AI hallucinations and how can they be avoided? This article explains the term, its typical causes and risks, and how to deal with hallucinations in practice.
What are hallucinations?
In AI, hallucinations are cases in which a model produces seemingly plausible but incorrect or unsubstantiated results. Large language models (LLMs) in particular tend to invent facts when they have no reliable information. An AI hallucination therefore occurs when the AI output sounds realistic or convincing, but is factually incorrect or does not match the input data. Example: An AI chatbot is asked who was German Chancellor in 1823 and responds fluently with an invented name, even though there was no German Chancellor at the time.
Possible causes
Hallucinations result from the way generative models work: they predict the next output probabilistically from learned patterns rather than retrieving facts from a database. If a query concerns something for which the model has no learned factual knowledge, it "hallucinates" an answer that fits structurally but may be wrong in content. Typical causes include gaps or errors in the training data, the purely probabilistic way in which the next token is chosen, and prompts that ask for information the model has never seen. The toy sketch below illustrates the mechanism.
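The following Python sketch is a deliberately simplified, hypothetical model of this mechanism: a hand-written probability table stands in for a trained network, and sampling from it shows why a fluent but invented name can be a more likely continuation than an admission of ignorance. All tokens and probabilities are invented for illustration.

```python
import random

# Toy next-token "model": for each context, a probability distribution over
# possible continuations learned from patterns, not looked up in a fact
# database. All names and probabilities here are invented for illustration.
NEXT_TOKEN_PROBS = {
    "The German Chancellor in 1823 was": {
        "Otto": 0.40,       # a plausible-sounding German name
        "Wilhelm": 0.35,    # another plausible-sounding name
        "unknown": 0.25,    # the factually honest continuation is less "typical"
    }
}

def sample_next(context: str) -> str:
    """Pick the next token by sampling the learned distribution for the context."""
    distribution = NEXT_TOKEN_PROBS[context]
    tokens, weights = zip(*distribution.items())
    return random.choices(tokens, weights=weights, k=1)[0]

if __name__ == "__main__":
    context = "The German Chancellor in 1823 was"
    # Most samples yield a fluent but invented name, because "name-shaped"
    # continuations have higher learned probability than admitting ignorance,
    # even though there was no German Chancellor in 1823.
    print(context, sample_next(context))
```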
Examples and risks
Example 1: A language model is asked about a person who does not appear in the training data. It may still produce a biography, freely inventing awards, dates of birth and death, and so on.
Example 2: An AI system for medical advice "hallucinates" a non-existent study to support a recommendation because it cannot answer the question.
Such hallucinations are problematic because users often trust AI answers. In creative applications (e.g. writing stories), invented details may be harmless, but in scientific, medical or legal contexts, hallucinated facts can lead to serious errors of judgment. In one well-known case, an AI tool cited several invented court rulings in a legal brief, which clearly damaged the lawyer's credibility.
Dealing with hallucinations
Researchers are working to reduce hallucinations. One approach is Retrieval Augmented Generation (RAG), in which the model queries external knowledge sources before answering so that up-to-date and correct information flows into the response; a minimal sketch of this idea follows below. Fine-tuning models for truthfulness (evaluated with benchmarks such as TruthfulQA) and incorporating confidence estimates are also being tested. In practice, critical AI applications should be monitored: a human in the loop can recognize and correct hallucinations. Users should also be made aware that AI answers need to be verified before important decisions are based on them.
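A minimal sketch of the RAG idea, assuming a stubbed document search and a placeholder model call (search_knowledge_base and llm_generate are hypothetical names, not a real library API): the model is only allowed to answer from retrieved sources and refuses when nothing relevant is found.

```python
from typing import List

def search_knowledge_base(question: str, top_k: int = 3) -> List[str]:
    """Return up to top_k relevant passages for the question (stubbed corpus)."""
    corpus = {
        "Who was German Chancellor in 1871?": [
            "Otto von Bismarck became the first Chancellor of the German Empire in 1871."
        ],
    }
    return corpus.get(question, [])[:top_k]

def llm_generate(prompt: str) -> str:
    """Placeholder for a real language-model call."""
    return "<model answer based on the prompt>"

def answer_with_rag(question: str) -> str:
    passages = search_knowledge_base(question)
    if not passages:
        # Without supporting passages, refuse instead of guessing --
        # this is where a plain LLM would be tempted to hallucinate.
        return "I could not find reliable sources for this question."
    context = "\n".join(passages)
    prompt = (
        "Answer the question using ONLY the sources below. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return llm_generate(prompt)

print(answer_with_rag("Who was German Chancellor in 1823?"))
```

In a real system the stubbed search would be replaced by a vector-database lookup and llm_generate by an actual model call; the key design choice is that the prompt restricts the model to the retrieved sources and explicitly permits an "I don't know" answer.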