How does GPT-4.5's performance compare to GPT-4o in terms of hallucinations?

GPT-4.5 and GPT-4o are both large language models developed by OpenAI. On the hallucination evaluation reported for the two models, GPT-4.5 shows a substantially lower hallucination rate than GPT-4o.

Hallucination Reduction in GPT-4.5

- Hallucination Rate: On the PersonQA dataset, GPT-4.5 has a hallucination rate of approximately 19%, compared with about 52% for GPT-4o[2][5]. On this benchmark, GPT-4.5 is considerably less prone to generating information that is not grounded in fact.
- Improvement Techniques: The reduction in hallucinations is attributed to new supervision techniques combined with traditional methods such as supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF)[1]. These methods are intended to improve the model's factual accuracy and reliability.

Performance Comparison

- Accuracy: On the same PersonQA dataset, GPT-4.5 achieves an accuracy of 78%, compared with 28% for GPT-4o[2][5]. GPT-4.5 therefore not only hallucinates less but also answers more questions correctly.
- Overall Reliability: The lower hallucination rate and higher accuracy make GPT-4.5 a more dependable choice for applications that require precise and trustworthy information. However, the two models perform similarly in certain other evaluations, such as fairness and bias assessments[5].
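
To put the PersonQA figures quoted above in perspective, the gap can be expressed both in percentage points and relative to GPT-4o. The following is a minimal sketch of that arithmetic only; the dictionary layout and the `compare` helper are illustrative names, not part of any OpenAI tooling, and the inputs are the rounded percentages cited in this answer.

```python
# PersonQA figures as quoted in this answer (percent, rounded).
personqa = {
    "GPT-4o": {"hallucination_rate": 52.0, "accuracy": 28.0},
    "GPT-4.5": {"hallucination_rate": 19.0, "accuracy": 78.0},
}


def compare(metric, baseline="GPT-4o", candidate="GPT-4.5"):
    """Return (percentage-point change, relative change in %) versus the baseline."""
    base = personqa[baseline][metric]
    cand = personqa[candidate][metric]
    point_change = cand - base
    relative_change = point_change / base * 100
    return point_change, relative_change


for metric in ("hallucination_rate", "accuracy"):
    points, relative = compare(metric)
    print(f"{metric}: {points:+.0f} points, {relative:+.1f}% relative to GPT-4o")
```

With these inputs, the hallucination rate falls by 33 percentage points (roughly 63% lower than GPT-4o's rate), while accuracy rises by 50 percentage points. Because the inputs are rounded, the relative figures are approximate.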

In summary, GPT-4.5 outperforms GPT-4o on PersonQA in both hallucination rate and accuracy, making it the more reliable option for tasks that depend on factual precision. These results come from a single benchmark, however, and the two models perform similarly on some other evaluations.

Citations:
[1] https://topmostads.com/openai-release-gpt-4-5/
[2] https://assets.ctfassets.net/kftzwdyauwt9/7EaDv6OaWHhXLAehUYu7Db/64e9f7916d3581ba4b5d0f0a6c5098d1/GPT-4-5_System_Card_2272025.pdf
[3] https://www.reddit.com/r/OpenAI/comments/1izq37r/gpt45s_low_hallucination_rate_is_a_gamechanger/
[4] https://www.techtarget.com/searchenterpriseai/feature/GPT-4o-vs-GPT-4-How-do-they-compare
[5] https://cdn.openai.com/gpt-4-5-system-card.pdf
[6] https://community.openai.com/t/gpt-4-vs-gpt-4o-which-is-the-better/746991
[7] https://mashable.com/article/openai-gpt-4-5-release-how-to-try
[8] https://www.techtarget.com/searchenterpriseai/tip/GPT-35-vs-GPT-4-Biggest-differences-to-consider