How does the RX 9070 XT's FP8 support enhance AI tasks compared to previous models?


The RX 9070 XT's support for FP8 (8-bit floating point) operations speeds up AI tasks compared to previous models, particularly those based on the RDNA 3 architecture. FP8 is a lower-precision family of formats rather than a single encoding. It comes in two variants, usually labeled FP8 (E4M3: 4 exponent bits and 3 mantissa bits, favoring precision) and BF8 (E5M2: 5 exponent bits and 2 mantissa bits, favoring range). Both are aimed at improving inference efficiency in AI and machine learning workloads. Because each value takes half the storage of FP16, the GPU can process these workloads faster, usually with only a small loss of accuracy.
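To make the range-versus-precision trade-off between 8-bit float layouts concrete, the sketch below decodes bytes in the two layouts commonly used for FP8, E4M3 and E5M2. It assumes the OCP 8-bit floating point conventions (bias 7 and no infinities for E4M3, bias 15 and IEEE-style inf/NaN for E5M2). Whether the RX 9070 XT's hardware matches these conventions exactly is an assumption, and the function names are purely illustrative.

```python
def decode_fp8(byte: int, exp_bits: int, man_bits: int, bias: int,
               ieee_like: bool) -> float:
    """Decode one 8-bit float (1 sign bit + exp_bits + man_bits).

    ieee_like=True : the top exponent code is reserved for inf/NaN (E5M2 style).
    ieee_like=False: only the all-ones pattern is NaN, no infinities (E4M3 style).
    """
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1

    if ieee_like and exp == exp_max:
        return sign * float("inf") if man == 0 else float("nan")
    if not ieee_like and exp == exp_max and man == man_max:
        return float("nan")
    if exp == 0:
        # Subnormal: no implicit leading 1, fixed exponent of (1 - bias).
        return sign * (man / (1 << man_bits)) * 2.0 ** (1 - bias)
    return sign * (1.0 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def decode_e4m3(byte: int) -> float:
    return decode_fp8(byte, exp_bits=4, man_bits=3, bias=7, ieee_like=False)


def decode_e5m2(byte: int) -> float:
    return decode_fp8(byte, exp_bits=5, man_bits=2, bias=15, ieee_like=True)


if __name__ == "__main__":
    # Largest finite value in each layout.
    print(decode_e4m3(0x7E))  # 0 1111 110 -> 1.75 * 2**8  = 448.0
    print(decode_e5m2(0x7B))  # 0 11110 11 -> 1.75 * 2**15 = 57344.0
```

Under these assumptions E5M2 reaches far larger magnitudes (57344 versus 448), while E4M3 has one more mantissa bit and therefore finer steps between neighboring values.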

Compared to RDNA 3, the RDNA 4 architecture in the RX 9070 XT doubles baseline (dense) FP16 throughput and doubles it again for sparse operations. For FP8 workloads, peak throughput is up to 8 times that of FP16 operations on RDNA 3. The gain matters most for work dominated by matrix multiplications, which are the core operation in most machine learning models.
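As a rough illustration of what those multipliers mean for a compute-bound matrix multiplication, the sketch below normalizes RDNA 3 FP16 throughput to 1.0 and derives the ideal relative run time for each mode. The multipliers simply restate the peak claims in the paragraph above and are not measurements. The dictionary keys and function names are illustrative, and real kernels rarely reach peak.

```python
# Peak matrix throughput relative to RDNA 3 dense FP16 (= 1.0).
# These multipliers restate the claims in the text; they are peak
# figures, not benchmark results.
RELATIVE_PEAK = {
    "RDNA 3 FP16": 1.0,
    "RDNA 4 FP16": 2.0,
    "RDNA 4 FP16 sparse": 4.0,
    "RDNA 4 FP8 (up to)": 8.0,
}


def matmul_flops(m: int, n: int, k: int) -> int:
    """Floating-point operations for an (m x k) @ (k x n) product."""
    return 2 * m * n * k  # one multiply and one add per inner-product term


def relative_time(mode: str) -> float:
    """Ideal compute-bound run time relative to RDNA 3 FP16."""
    return RELATIVE_PEAK["RDNA 3 FP16"] / RELATIVE_PEAK[mode]


if __name__ == "__main__":
    print(f"4096^3 matmul: {matmul_flops(4096, 4096, 4096):,} FLOPs")
    for mode in RELATIVE_PEAK:
        print(f"{mode:<20} {relative_time(mode):.3f}x the baseline time")
```

The operation count is the same in every mode. Only the rate at which the hardware can retire those operations changes, which is why the benefit disappears once a workload stops being compute-bound.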

The enhanced Wave Matrix Multiply Accumulate (WMMA) instructions in RDNA 4 further optimize performance for AI tasks. In applications like Adobe Lightroom and DaVinci Resolve, the RX 9070 XT delivers up to 34% better performance than the RX 7900 GRE. For generative AI tasks, such as Stable Diffusion image generation, it is up to 70% faster than its predecessor.

However, while the RX 9070 XT excels in compute-bound AI tasks, it may face limitations in memory-bound workloads. Its 256-bit memory bus provides up to 640 GB/s of bandwidth, less than the RX 7900 XT's 800 GB/s and the RX 7900 XTX's 960 GB/s. This can hurt performance in bandwidth-hungry tasks such as large language model (LLM) inference, where the model weights are read from memory for every generated token.
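The bandwidth figure and its effect on LLM inference can be sketched with simple arithmetic. The example below assumes a 20 Gbps per-pin memory data rate (not stated in the text above) to reproduce the 640 GB/s figure from the 256-bit bus. It then uses the common rule of thumb that a memory-bound decoder reads all of its weights once per generated token. The 8 GB model size is hypothetical, and the results are upper bounds rather than predictions.

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bytes moved per transfer times per-pin rate."""
    return bus_width_bits / 8 * data_rate_gbps


def max_tokens_per_s(bandwidth: float, weights_gb: float) -> float:
    """Upper bound on decode speed if each token reads all weights once."""
    return bandwidth / weights_gb


# Bandwidth figures quoted in the text, in GB/s.
CARDS = {"RX 9070 XT": 640.0, "RX 7900 XT": 800.0, "RX 7900 XTX": 960.0}

# Hypothetical model: about 8 billion parameters stored at 1 byte each.
WEIGHTS_GB = 8.0

if __name__ == "__main__":
    # Assumed 20 Gbps per pin on a 256-bit bus.
    print(bandwidth_gb_s(256, 20.0))  # 640.0
    for card, bw in CARDS.items():
        bound = max_tokens_per_s(bw, WEIGHTS_GB)
        print(f"{card:<12} at most {bound:.0f} tokens/s")
```

Under this simplified model the ceiling scales directly with bandwidth, so extra compute throughput does not help once the weights no longer fit in cache. Storing weights in FP8 instead of FP16 halves the bytes read per token, which is one way lower precision helps memory-bound workloads as well.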

Overall, the RX 9070 XT's FP8 support and enhanced AI accelerators position it as a competitive option for AI tasks, especially those that benefit from improved compute performance and lower-precision data types. In memory-intensive AI applications, however, its lower memory bandwidth may leave it behind other high-end GPUs.
