

Can I manually update the firmware on the DGX A100 without using the ISO file?


Yes. You can update the firmware on the NVIDIA DGX A100 manually, without the ISO file, by using the firmware update utility as described below.

Using the DGX A100 Firmware Update Utility

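The DGX A100 System Firmware Update utility is distributed as a tarball (`.tar.gz`) containing a Docker image and as a `.run` file. Steps 1 and 2 below apply in every case; steps 3, 4, and 5 are alternative ways to run the update, so choose one.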

1. Download the Firmware Update Files:
- Access the NVIDIA Enterprise Support Portal to download the DGX A100 firmware update files. You will need the `.tar.gz` file (e.g., `nvfw-dgxa100_24.11.1_241107.tar.gz`) and/or the `.run` file (e.g., `nvfw-dgxa100_24.11.1_241107.run`).

2. Copy Files to the DGX A100 System:
- Transfer the downloaded files to the DGX A100 system. Ensure you have sufficient permissions to perform the update.
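One way to carry out the transfer in step 2 is `scp`. The sketch below is only an illustration: the host name, user name, and destination directory are hypothetical placeholders, and the function prints the commands by default (`DRY_RUN=1`) so they can be reviewed before anything is copied.

```bash
# Hypothetical values: replace with your own host, user, and destination directory.
DGX_HOST="${DGX_HOST:-dgx-a100.example.com}"
DGX_USER="${DGX_USER:-admin}"
DEST_DIR="${DEST_DIR:-/tmp}"

# Print the scp command for each file (DRY_RUN=1, the default here),
# or actually copy the files when DRY_RUN is set to 0.
copy_fw_files() (
    for f in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "scp $f $DGX_USER@$DGX_HOST:$DEST_DIR/"
        else
            scp "$f" "$DGX_USER@$DGX_HOST:$DEST_DIR/"
        fi
    done
)

# Dry run with the example file names from step 1.
copy_fw_files nvfw-dgxa100_24.11.1_241107.tar.gz nvfw-dgxa100_24.11.1_241107.run
```

Set `DRY_RUN=0` once the printed commands look right. The update commands in the following steps use `sudo`, so the account on the DGX A100 needs sudo rights.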

3. Update Using Docker:
- The DGX A100 system includes Docker, which is required to run the firmware update container.
- Load the Docker image from the tarball:

```bash
sudo docker load -i nvfw-dgxa100_24.11.1_241107.tar.gz
```

- Verify the image is loaded:
```bash
sudo docker images
```

- Run the Docker container to update the firmware:
```bash
sudo docker run --rm --privileged -ti -v /:/hostfs nvfw-dgxa100:24.11.1 update_fw all
```

- This command updates all firmware components. You can specify specific components by replacing `all` with the names of the components you want to update (e.g., `BMC SBIOS`).
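The image reference passed to `docker run` in step 3 appears to follow from the tarball name (`nvfw-dgxa100_24.11.1_241107.tar.gz` becomes `nvfw-dgxa100:24.11.1`). The sketch below assumes that `<name>_<version>_<date>.tar.gz` pattern, which is inferred only from the example file name in this article. Both function names are hypothetical helpers, and nothing is executed: the second function only prints the command.

```bash
# Derive "name:version" from a tarball named "<name>_<version>_<date>.tar.gz".
# Assumption: this naming pattern is inferred from the example file name only.
image_ref_from_tarball() (
    base=$(basename "$1" .tar.gz)   # e.g. nvfw-dgxa100_24.11.1_241107
    name=${base%%_*}                # text before the first underscore
    rest=${base#*_}                 # text after the first underscore
    version=${rest%%_*}             # version, without the trailing date
    printf '%s:%s\n' "$name" "$version"
)

# Print the update command for a tarball and an optional component list.
# With no components given, "all" is used. The command is not run.
print_update_cmd() (
    tarball=$1
    shift
    if [ "$#" -eq 0 ]; then set -- all; fi
    printf 'sudo docker run --rm --privileged -ti -v /:/hostfs %s update_fw %s\n' \
        "$(image_ref_from_tarball "$tarball")" "$*"
)

print_update_cmd nvfw-dgxa100_24.11.1_241107.tar.gz BMC SBIOS
```

Treat the derived reference as a guess, and compare it with the repository and tag that `sudo docker images` reports before running the printed command.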

4. Update Using the `.run` File:
- Alternatively, run the `.run` file directly instead of loading the image with `docker load`. If the file is not already executable, make it so first (`chmod +x <file>`).

```bash
sudo ./nvfw-dgxa100_24.11.1_241107.run update_fw all
```

- As with the Docker method, `all` updates every firmware component; replace it with component names to update only those.
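A downloaded `.run` file often arrives without its execute bit, in which case the command in step 4 fails with a "command not found" or "permission denied" error. The following is a minimal pre-flight sketch. The function name `check_run_file` and the `RUN_FILE` variable are hypothetical, and the update command is only printed, not started.

```bash
# Check that the .run file exists and is executable before invoking it.
check_run_file() {
    if [ ! -f "$1" ]; then
        echo "missing: $1" >&2
        return 1
    fi
    if [ ! -x "$1" ]; then
        echo "not executable: $1 (try: chmod +x $1)" >&2
        return 1
    fi
    return 0
}

# Example file name from step 1; override RUN_FILE to point at your download.
RUN_FILE="${RUN_FILE:-./nvfw-dgxa100_24.11.1_241107.run}"
if check_run_file "$RUN_FILE"; then
    echo "ready: sudo $RUN_FILE update_fw all"
fi
```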

5. Using NVSM (NVIDIA System Management):
- NVSM provides an interactive mode for updating firmware.
- Enter the NVSM firmware update module:

```bash
sudo nvsm
nvsm-> cd systems/localhost/firmware/install
```

- Set the flags for the update action:
```bash
nvsm(/system/localhost/firmware/install)-> set Flags=update_fw\ all
```

- Set the Docker image reference:
```bash
nvsm(/system/localhost/firmware/install)-> set DockerImageRef=nvfw-dgxa100:24.11.1
```

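- In NVSM, spaces inside a value must be escaped with a backslash (`\`), which is why the flags are written as `update_fw\ all`.
- Setting these values only configures the action. To launch the update, issue the `start` command from the same prompt; see the utility documentation [1] for the full sequence.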
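The escaping rule becomes easier to get wrong when several components are listed instead of `all`. The helper below is a small sketch, not part of NVSM: `nvsm_escape` is a hypothetical name, and it only handles spaces, the one character the article says must be escaped.

```bash
# Escape every space with a backslash so the value can be pasted
# after "set Flags=" at the NVSM prompt.
nvsm_escape() {
    printf '%s\n' "$1" | sed 's/ /\\ /g'
}

# Build the line to type at the NVSM prompt for two components.
printf 'set Flags=%s\n' "$(nvsm_escape 'update_fw BMC SBIOS')"
# prints: set Flags=update_fw\ BMC\ SBIOS
```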

Additional Considerations

- Fan Speed Increase: During the BMC firmware update, fan speeds may increase temporarily, which is normal [1].
- Reboot Required: After updating the firmware, you may need to reboot the system for the changes to take effect.
- PXE Boot Alternative: If you manage a cluster of DGX A100 systems, you can also use PXE boot to update firmware from a headnode, but this requires additional setup and infrastructure [4].

By following these steps, you can manually update the firmware on your DGX A100 system without using the ISO file.

Citations:
[1] https://docs.nvidia.com/dgx/dgxa100-fw-container-release-notes/using-utility.html
[2] https://docs.nvidia.com/dgx/dgxa100-fw-container-release-notes/dgxa100-fw-update-iso.html
[3] https://www.manualslib.com/manual/1925509/Nvidia-Dgx-A100.html
[4] https://kb.brightcomputing.com/knowledge-base/how-to-upgrade-dgx-a100-firmware-from-headnode/
[5] https://www.broadberry.com/dual-amd-epyc-rackmount-servers/nvidia-dgx-a100
[6] https://support.brightcomputing.com/manuals/9.2/admin-manual.pdf
[7] https://docs.nvidia.com/dgx/dgxa100-user-guide/updating-restoring-sw.html
[8] https://kb.brightcomputing.com/knowledge-base/installing-nvidia-dgx-software-stack-in-bright-ubuntu-20-04-software-images/
[9] https://github.com/NVIDIA/deepops/blob/master/docs/deepops/dgx-diagnostic-firmware.md
[10] http://cdn.cnetcontent.com/2f/68/2f6888a0-063f-4d76-94e4-8666b7619dfd.pdf