CUTLASS Building & Compilation

Published: 2024-05-05

CUTLASS Compilation

Follow the official tutorial to build and compile CUTLASS.

Building

Run the following commands to build CUTLASS:

$ export CUDACXX=${CUDA_INSTALL_PATH}/bin/nvcc

$ mkdir build && cd build

$ cmake .. -DCUTLASS_NVCC_ARCHS=${gencode}             # target GPU architecture, e.g. 80 for NVIDIA Ampere or 90a for NVIDIA Hopper

How to get $CUDA_INSTALL_PATH?

CUDA is usually installed at /usr/local/cuda or in a versioned directory such as /usr/local/cuda-11.7/, depending on your CUDA version. If it is not there, run echo $LD_LIBRARY_PATH. The output should contain an entry such as /usr/local/cuda/lib64; that entry without the trailing lib64 is the CUDA install location.
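The LD_LIBRARY_PATH lookup above can be sketched as a small shell helper. This is a minimal sketch, not part of the official tutorial: the function name cuda_prefix_from_libpath is hypothetical, and it assumes the CUDA library directory appears in the list as an entry whose parent directory name starts with "cuda" and which ends in lib64.

```shell
# Hypothetical helper: print the CUDA install prefix found in a
# colon-separated library path (defaults to $LD_LIBRARY_PATH).
cuda_prefix_from_libpath() {
  # Split the list into one entry per line, keep the first entry that
  # looks like .../cuda*/lib64, then strip the trailing /lib64.
  printf '%s\n' "${1:-${LD_LIBRARY_PATH:-}}" | tr ':' '\n' \
    | grep -m1 -E '/cuda[^/]*/lib64/?$' \
    | sed -E 's#/lib64/?$##'
}
```

With this defined, export CUDA_INSTALL_PATH="$(cuda_prefix_from_libpath)" sets the variable used in the build commands. If nothing matches, the helper prints nothing, so check that the variable is non-empty before relying on it.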

How to get $gencode?

$gencode is determined by your GPU architecture. Match it to your GPU's compute capability according to this map; for example, compute capability 8.0 (such as the A100) corresponds to 80.
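If you would rather query the GPU than look it up, the mapping can be sketched in shell. The helper names cc_to_arch and detect_gencode are hypothetical, and detect_gencode assumes that the installed driver's nvidia-smi supports the compute_cap query field, which older drivers may not. Note also that for Hopper the CUTLASS README uses 90a rather than the plain 90 this sketch would produce.

```shell
# Hypothetical helper: turn a compute capability such as "8.0" into the
# dotless form used for CUTLASS_NVCC_ARCHS and compute_XX ("80").
cc_to_arch() {
  printf '%s\n' "$1" | tr -d '.'
}

# Hypothetical helper: read the compute capability of the first GPU.
# Assumption: this nvidia-smi version supports the compute_cap field.
detect_gencode() {
  cc_to_arch "$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
}
```

On a machine with a supported driver, gencode="$(detect_gencode)" can then be used in the cmake command. Without such a driver, call cc_to_arch directly with the value you looked up.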

Using CUTLASS in your CUDA program

Applications should list the include/ directory of the CUTLASS checkout within their include paths. They must be compiled as C++17 or greater. Specifically, suppose we want to use CUTLASS in the following test.cu program:

#include <iostream>
#include <cutlass/cutlass.h>
#include <cutlass/numeric_types.h>
#include <cutlass/core_io.h>

int main() {

  cutlass::half_t x = 2.25_hf;

  std::cout << x << std::endl;

  return 0;
}

Compile the program with the CUTLASS include/ directory on the include path. Specifically, if your project layout is as follows:

~/cutlass
~/test/test.cu

Then the compilation command at ~/test/ is:

nvcc -I../cutlass/include -gencode=arch=compute_80,code=compute_80 -std=c++17 test.cu -o main

It is worth noting that you have to specify the gencode. We have seen that specifying the GPU architecture is needed because some features are only available on architectures from the last few years. If you don't specify one, nvcc falls back to its default target, which depends on the CUDA toolkit version. In that case half support may not be recognized by nvcc.

If you want to use the CUTLASS utilities, make sure tools/util/include (at the top level of the CUTLASS checkout, not under include/) is also listed as an include path:

nvcc -I../cutlass/include -I../cutlass/tools/util/include -gencode=arch=compute_80,code=compute_80 -std=c++17 test.cu -o main
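To avoid retyping the include paths and architecture, the compile command can be generated from its three inputs. This is a minimal sketch: the helper name cutlass_nvcc_cmd is hypothetical, and it assumes the standard CUTLASS repository layout, with include/ and tools/util/include/ both at the top level of the checkout. It only prints the command and does not run nvcc.

```shell
# Hypothetical helper: print the nvcc command for a CUTLASS checkout
# directory ($1), an architecture number such as 80 ($2), and a source
# file ($3). The output binary is named main, as in the examples above.
cutlass_nvcc_cmd() {
  printf 'nvcc -I%s/include -I%s/tools/util/include -gencode=arch=compute_%s,code=compute_%s -std=c++17 %s -o main\n' \
    "$1" "$1" "$2" "$2" "$3"
}
```

For the layout above, run cutlass_nvcc_cmd ../cutlass 80 test.cu from ~/test/ and check the printed command before executing it.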