Goal
We all know that the AMD Vega 56 and Vega 64 are powerful GPUs, competitive with the NVIDIA RTX 2080 Ti. The Radeon VII is even more powerful, with 13.44 TFLOPS of theoretical FP32 (float) performance at a lower price of 699 USD.
I was thinking about building a deep learning machine, so I looked at the price of the NVIDIA RTX 2080 Ti: around 1548 USD here. Instead, I called a friend who sells second-hand electronics and got an AMD Radeon RX 580 for around 100 USD. The RX 580 is a cost-effective way to get about 6 TFLOPS of performance.
But how do you use it with TensorFlow or PyTorch? CUDA does not work with AMD GPUs. No worries: ROCm takes the place of CUDA on AMD GPUs.
Requirements
It is exciting to run TensorFlow on an AMD GPU. But here are some important notes:
- Only new CPUs are supported as it requires PCIe Gen3 and PCIe Atomics
- Only new GPUs are supported because old GPUs are too poor in performance
- Use Linux with kernel 4.17 or above (Or you will have a hard time with it)
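The kernel requirement is easy to check before you start. Here is a minimal sketch, assuming GNU `sort` with `-V` (version sort), which Ubuntu ships; the helper name `version_at_least` is my own and not part of ROCm:

```shell
# Return 0 if version $2 is greater than or equal to minimum version $1.
# Uses GNU sort -V: the smaller of the two versions sorts first.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# Drop the distro suffix after the first '-' (e.g. 4.15.0-96-generic -> 4.15.0).
running="$(uname -r | cut -d- -f1)"
if version_at_least 4.17 "$running"; then
    echo "kernel $running: meets the 4.17 minimum"
else
    echo "kernel $running: older than 4.17, consider upgrading"
fi
```
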
Supported CPUs
- AMD Ryzen CPUs
- The CPUs in AMD Ryzen APUs
- AMD Ryzen Threadripper CPUs
- AMD EPYC CPUs
- Intel Xeon E7 v3 or newer CPUs
- Intel Xeon E5 v3 or newer CPUs
- Intel Xeon E3 v3 or newer CPUs
- Intel Core i7 v4, Core i5 v4, Core i3 v4 or newer CPUs (i.e. Haswell family or newer)
- Some Ivy Bridge-E systems
Refer to the CPU section of the ROCm GitHub page. What are those “Some Ivy Bridge-E systems”? I don't know either; you may email the ROCm team for support on that. Some older CPUs have limited support, but I suggest you don't spend time on them unless you want to contribute to ROCm to make it support older CPUs. It will make your life harder.
Supported GPUs
- GFX8 GPUs
  - “Fiji” chips, such as on the AMD Radeon R9 Fury X and Radeon Instinct MI8
  - “Polaris 10” chips, such as on the AMD Radeon RX 580 and Radeon Instinct MI6
- GFX9 GPUs
  - “Vega 10” chips, such as on the AMD Radeon RX Vega 64 and Radeon Instinct MI25
  - “Vega 7nm” chips, such as on the Radeon Instinct MI50, Radeon Instinct MI60 or AMD Radeon VII
Refer to the GPU section of the ROCm GitHub page. A few other GFX8 and GFX7 GPUs are supported unofficially (if you run into problems, there is no help and no guarantee).
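If you are not sure which chip your card uses, `lspci` usually shows it. The sketch below assumes `lspci` (from the pciutils package) is installed; the filter function `list_display_devices` is a hypothetical helper of mine that keeps only the graphics-related lines:

```shell
# Read lspci output on stdin and keep only the graphics-related devices.
list_display_devices() {
    grep -iE 'VGA compatible controller|Display controller|3D controller'
}

if command -v lspci >/dev/null 2>&1; then
    lspci | list_display_devices
else
    echo "lspci not found (on Ubuntu it comes with the pciutils package)" >&2
fi
```

Compare the chip name in the output with the list above.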
Install ROCm
Things I got
I got an i5-4570 for the CPU (in another second-hand computer, for 140 USD) and an RX 580 for the GPU, and installed Ubuntu Server 18.04.
The official tutorial works fine for installing ROCm and TensorFlow, but you will run into problems with PyTorch (which is why I wrote this tutorial: I gave up the first time, and this time I found a solution).
Install dependencies
Note the ROCm version you install; I am installing ROCm 3.3.0. This information will be useful for the PyTorch installation.
Update the system, install libnuma-dev, and reboot:
$ sudo apt update
Install ROCm
Add the ROCm apt repository.
$ wget -q -O - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
Install ROCm
$ sudo apt update
Post installation
Grant yourself permission to access the GPU:
$ sudo usermod -a -G video $LOGNAME
If you need to add more users, check the ROCm documentation.
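To confirm the group change took effect, you can look at the groups of your current session. This is a small sketch; `user_in_group` is a hypothetical helper name. Note that `id -nG` reports the groups of the running session, so a fresh `usermod` only shows up after you log in again or reboot:

```shell
# Return 0 if the current session belongs to group $1.
user_in_group() {
    id -nG | tr ' ' '\n' | grep -qxF "$1"
}

if user_in_group video; then
    echo "current session is in the video group"
else
    echo "not in the video group yet (log out and back in, or reboot)"
fi
```
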
Reboot
$ sudo reboot
Test and Configure
Test the ROCm installation.
$ /opt/rocm/bin/rocminfo
You should see a report listing the devices that ROCm detected.
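The full report is long, so it can help to pull out just the GPU targets. A rough sketch, assuming the report names each GPU agent with a `gfx` identifier (a Polaris 10 card such as the RX 580 should show up as `gfx803`); `list_gfx_targets` is my own helper name:

```shell
# Read rocminfo output on stdin and print each distinct gfx target once.
list_gfx_targets() {
    grep -oE 'gfx[0-9a-f]+' | sort -u
}

rocminfo_bin=/opt/rocm/bin/rocminfo
if [ -x "$rocminfo_bin" ]; then
    "$rocminfo_bin" | list_gfx_targets
else
    echo "rocminfo not found at $rocminfo_bin" >&2
fi
```

If nothing is printed, ROCm did not find a supported GPU; check the group permission and reboot steps above.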
Add ROCm to environment PATH:
$ echo 'export PATH=$PATH:/opt/rocm/bin:/opt/rocm/profiler/bin:/opt/rocm/opencl/bin/x86_64' |
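After opening a new shell, you can check that the three ROCm directories really ended up on your PATH. A minimal sketch using only shell built-ins; `path_contains` is a hypothetical helper name:

```shell
# Return 0 if directory $1 is one of the colon-separated entries in $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

for dir in /opt/rocm/bin /opt/rocm/profiler/bin /opt/rocm/opencl/bin/x86_64; do
    if path_contains "$dir"; then
        echo "on PATH: $dir"
    else
        echo "missing from PATH: $dir"
    fi
done
```
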
Install TensorFlow
This takes two simple steps: get some required libraries, then install TensorFlow through pip. Refer to the ROCm documentation.
$ sudo apt update
This installs the latest TensorFlow 2. If you need a specific version, pin it the same way you would for any other Python package.
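Once the install finishes, a quick check shows whether TensorFlow can see the GPU. This is a sketch under assumptions: `check_tf_gpu` is my own helper name, and `tf.config.list_physical_devices` is the stable name in TensorFlow 2.1 and later (TensorFlow 2.0 has it under `tf.config.experimental`):

```shell
# Print the TensorFlow version and the GPUs it can see, or a short notice
# if python3 or TensorFlow is unavailable.
check_tf_gpu() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found"
        return 1
    fi
    python3 - <<'EOF'
try:
    import tensorflow as tf
except ImportError:
    print("tensorflow is not installed")
    raise SystemExit(1)
print("TensorFlow", tf.__version__)
# Stable in TF 2.1+; TF 2.0 exposes this under tf.config.experimental.
print("GPUs:", tf.config.list_physical_devices("GPU"))
EOF
}

check_tf_gpu || true
```

An empty GPU list means TensorFlow is running on the CPU only. Pinning a version uses the usual pip `package==version` syntax with the name of the package you installed.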
DONE
TensorFlow time~ (^_^)b