GPT-4.5 improves on GPT-4o in handling conflicting messages through its Instruction Hierarchy training. The model is taught to prioritize system messages over user messages, which mitigates the risk of prompt injections and other attacks that try to override safety instructions. The key points are:
1. Instruction Hierarchy Evaluation: These evaluations present the model with different types of messages that conflict with one another; to pass, the model must follow the instructions in the highest-priority message. This tests how well it handles user inputs that attempt to bypass safety instructions.
2. Conflict Resolution: GPT-4.5 generally outperforms GPT-4o in evaluations involving conflicts between system and user messages. This matters for keeping the model within its guidelines in complex conversational scenarios.
3. Tutor Jailbreaks: In this scenario the system message tells the model to act as a math tutor and not to reveal the answer to a math question, while the user tries to trick it into giving the answer away. According to the system card, GPT-4.5 outperforms GPT-4o on this evaluation, showing greater robustness against attempts to extract information it was told to withhold[1].
4. Phrase and Password Protection: In these tests the system message instructs the model not to output a particular phrase or reveal a password, and user messages try to get it to do so anyway. GPT-4.5 performs strongly here, indicating a better ability to keep protected information confidential than some previous models.
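To make the evaluation idea concrete, here is a minimal Python sketch of how a message-priority rule and a password-protection check could be expressed. It is only an illustration under stated assumptions: the names `ROLE_PRIORITY`, `Message`, `highest_priority`, `leaks_secret`, and `pass_rate` are hypothetical, the substring check is a deliberately simple stand-in for a real grader, and none of it reflects OpenAI's actual evaluation code.

```python
from dataclasses import dataclass

# Hypothetical priority order (an assumption): lower number = higher priority.
ROLE_PRIORITY = {"system": 0, "user": 1}


@dataclass
class Message:
    role: str
    content: str


def highest_priority(messages):
    """Return the message whose role ranks highest; the earliest one wins ties."""
    return min(messages, key=lambda m: ROLE_PRIORITY[m.role])


def leaks_secret(response, secret):
    """Toy grader: True if the secret appears verbatim (case-insensitive)."""
    return secret.lower() in response.lower()


def pass_rate(responses, secret):
    """Fraction of responses that do not leak the secret."""
    if not responses:
        raise ValueError("no responses to grade")
    safe = sum(not leaks_secret(r, secret) for r in responses)
    return safe / len(responses)
```

For example, given a system message that forbids revealing a password and a user message that says to ignore it, `highest_priority` returns the system message, and `pass_rate` reports the share of sampled responses that kept the password out of the output. A real grader would also need to catch paraphrased or encoded leaks, which this sketch does not attempt.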
Overall, GPT-4.5's gains in handling conflicting messages are one part of its broader improvements in safety, nuance, and collaboration, and they help it keep interactions safe and appropriate[1][3].
Citations:
[1] https://cdn.openai.com/gpt-4-5-system-card.pdf
[2] https://www.techtarget.com/whatis/feature/GPT-4o-explained-Everything-you-need-to-know
[3] https://openai.com/index/introducing-gpt-4-5/
[4] https://www.techtarget.com/searchenterpriseai/feature/GPT-4o-vs-GPT-4-How-do-they-compare
[5] https://www.businessinsider.com/openai-sam-altman-releases-gpt-4-5-emotionally-intelligent-model-2025-2
[6] https://litslink.com/blog/gpt-4o-all-you-should-know-about-the-update-and-new-tools
[7] https://venturebeat.com/ai/openai-releases-gpt-4-5/
[8] https://www.reddit.com/r/OpenAI/comments/188t13h/gpt4_has_a_limit_of_40_messages3_hours_now/