CUTLASS Compilation
Follow the official tutorial to build and compile CUTLASS.
Building
Run the following commands to build CUTLASS:
$ export CUDACXX=${CUDA_INSTALL_PATH}/bin/nvcc
$ mkdir build && cd build
$ cmake .. -DCUTLASS_NVCC_ARCHS=${gencode} # ${gencode} selects the target GPU architecture, e.g. 80 for Ampere or 90a for Hopper
How to get $CUDA_INSTALL_PATH?
CUDA is usually installed at /usr/local/cuda or at a versioned path such as /usr/local/cuda-11.7/, depending on your CUDA version. If it is not there, run echo $LD_LIBRARY_PATH. The output should contain an entry such as /usr/local/cuda/lib64; that entry without the trailing lib64 is the CUDA install location.
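If nvcc is already on your PATH, the install prefix can also be derived from the location of the nvcc binary. The following is a minimal sketch, assuming a Linux shell where readlink -f is available; the helper name cuda_prefix_from_nvcc is hypothetical and not part of CUDA or CUTLASS.

```shell
# Hypothetical helper: given the path to an nvcc binary, print the CUDA
# install prefix, i.e. the directory two levels above .../bin/nvcc.
cuda_prefix_from_nvcc() {
    # readlink -f resolves symlinks such as /usr/local/cuda -> /usr/local/cuda-11.7
    dirname "$(dirname "$(readlink -f "$1")")"
}

# Only set the variable when nvcc can actually be found on PATH.
if command -v nvcc >/dev/null 2>&1; then
    export CUDA_INSTALL_PATH="$(cuda_prefix_from_nvcc "$(command -v nvcc)")"
    echo "CUDA_INSTALL_PATH=${CUDA_INSTALL_PATH}"
fi
```

Because symlinks are resolved, the result may be the versioned directory rather than /usr/local/cuda itself.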
How to get $gencode?
$gencode is determined by your GPU architecture. Match it to your GPU according to this map. Other references:
- How to get the GPU architecture?
- GPU Architecture Compatibility Guide
- How to find out which NVIDIA GPU I have
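On recent drivers, nvidia-smi can report the compute capability directly, and dropping the dot gives the value for $gencode. The following is a sketch under that assumption: the compute_cap query field is not available on older drivers, and the helper name cap_to_gencode is hypothetical.

```shell
# Hypothetical helper: turn a compute capability such as "8.0" into the
# architecture number that CUTLASS_NVCC_ARCHS expects ("80").
cap_to_gencode() {
    printf '%s\n' "$1" | tr -d '. '
}

# Query the first GPU, if nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    cap="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n 1)" || cap=""
    case "$cap" in
        # Accept only output that looks like "<major>.<minor>".
        [0-9]*.[0-9]*)
            gencode="$(cap_to_gencode "$cap")"
            echo "gencode=${gencode}"
            ;;
        *)
            echo "could not detect the compute capability" >&2
            ;;
    esac
fi
```

For example, an Ampere A100 reports 8.0, which becomes 80. If the query is not supported on your driver, fall back to the map and references above.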
Using CUTLASS in your CUDA program
Applications should list the CUTLASS include/ directory within their include paths, and they must be compiled as C++17 or later. For example, suppose we want to use CUTLASS in the following test.cu program:
#include <iostream>
#include <cutlass/cutlass.h>
#include <cutlass/numeric_types.h>
#include <cutlass/core_io.h>
int main() {
  cutlass::half_t x = 2.25_hf;
  std::cout << x << std::endl;
  return 0;
}
Compile the program with the CUTLASS include/ directory on the include path. For example, if your project layout is as follows:
~/cutlass
~/test/test.cu
then the compilation command, run from ~/test/, is:
nvcc -I../cutlass/include -gencode=arch=compute_80,code=compute_80 -std=c++17 test.cu -o main
Note that you have to specify the gencode. In our experience, specifying the GPU architecture is needed because some features are only available on architectures from the last few years. If you do not specify one, nvcc falls back to a default architecture that depends on the toolkit version and may be older than your GPU. In that case nvcc may not recognize half-precision support.
If you want to use the CUTLASS utilities, also make sure tools/util/include is listed as an include path. This directory sits at the top level of the CUTLASS checkout, not under include/:
nvcc -I../cutlass/include -I../cutlass/tools/util/include -gencode=arch=compute_80,code=compute_80 -std=c++17 test.cu -o main
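The compile command can also be generated from the CUTLASS checkout location and the architecture number, which avoids retyping the include paths. The following is a minimal sketch, assuming the standard repository layout where tools/util/include sits at the top level of the checkout. The function name cutlass_nvcc_cmd and its argument order are hypothetical, and it only prints the command rather than running it.

```shell
# Hypothetical helper: print an nvcc command line for a CUTLASS program.
#   $1 = path to the CUTLASS checkout
#   $2 = architecture number (e.g. 80)
#   $3 = source file
#   $4 = output binary
cutlass_nvcc_cmd() {
    echo "nvcc -I$1/include -I$1/tools/util/include" \
         "-gencode=arch=compute_$2,code=compute_$2" \
         "-std=c++17 $3 -o $4"
}

# Print the command for the layout used in this section.
cutlass_nvcc_cmd ../cutlass 80 test.cu main
```

Once the printed command looks right, it can be pasted into the terminal or wrapped in a Makefile rule.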