

Are there any domains where GPT-4.5's hallucination rate is particularly low


GPT-4.5, OpenAI's latest model, demonstrates a significant reduction in hallucination rates compared with its predecessors. While its overall hallucination rate on OpenAI's SimpleQA factuality benchmark is about 37%, there are specific domains and evaluations where its performance is notably better.

Hallucination Reduction Techniques

GPT-4.5 was trained with supervision techniques including supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), which contribute to its lower hallucination rate[1][3]. These methods improve the model's accuracy and reliability by encouraging it to generate responses grounded in factual data.

Performance in Specific Domains

- PersonQA Evaluation: On the PersonQA dataset, which tests a model's ability to answer factual questions about people, GPT-4.5 shows a hallucination rate of about 19%. This is well below its overall rate and indicates that GPT-4.5 hallucinates less often in domains involving factual information about individuals[3].

- Language-Related Tasks: GPT-4.5 excels in writing and programming tasks, offering detailed explanations and assistance in practical problem-solving. While specific hallucination rates for these tasks are not detailed, the model's improved pattern recognition and broader knowledge base contribute to more accurate and reliable outputs[1][5].

- Comparison with Other Models: Against the o1 reasoning model, GPT-4.5's hallucination rate is actually lower (37% vs. 44% for o1), even though GPT-4.5 is designed for general-purpose applications rather than specialized reasoning tasks[5].
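To make the percentages above concrete, here is a minimal sketch of how a hallucination rate like the PersonQA or SimpleQA figures might be computed from graded model answers. The grading labels ("correct", "not_attempted", "hallucinated") and the choice of denominator are illustrative assumptions, not OpenAI's published grading schema.

```python
def hallucination_rate(graded_answers):
    """Fraction of *attempted* answers graded as hallucinated.

    Labels are assumed: "correct", "not_attempted", "hallucinated".
    Benchmarks differ on whether declined answers count in the
    denominator; here they are excluded.
    """
    attempted = [g for g in graded_answers if g != "not_attempted"]
    if not attempted:
        return 0.0
    return sum(1 for g in attempted if g == "hallucinated") / len(attempted)


# Hypothetical run of 100 questions, all attempted:
grades = ["correct"] * 63 + ["hallucinated"] * 37
print(hallucination_rate(grades))  # → 0.37
```

The exact definition varies by benchmark, which is one reason the same model can show a 19% rate on one evaluation and 37% on another.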

Overall, GPT-4.5's hallucination rate is particularly low in domains where it has been extensively trained and evaluated, such as the PersonQA dataset. Its performance still varies across tasks and evaluations, however, reflecting the ongoing challenge of reducing hallucinations in AI models.

Citations:
[1] https://topmostads.com/openai-release-gpt-4-5/
[2] https://www.reddit.com/r/ChatGPT/comments/18kqaom/gpt45turbo_hallucination_explained_with_tests_and/
[3] https://cdn.openai.com/gpt-4-5-system-card.pdf
[4] https://community.openai.com/t/custom-gpt-used-to-work-now-hallucinates-and-does-not-use-specific-data-from-files-as-instructed/809849
[5] https://www.channelnewsasia.com/business/openai-rolls-out-gpt-45-some-paying-users-expand-access-next-week-4966131
[6] https://www.youtube.com/watch?v=KtwK3hBAjDY
[7] https://garymarcus.substack.com/p/gpt-45-is-no-gpt-5
[8] https://www.toolify.ai/gpts/exciting-new-updates-open-source-stable-diffusion-200k-context-claude-21-139386