Offloading target to PTX
Offloading
Accel Compiler: a compiler which reads the LTO sections and generates code for the accelerator
Host Compiler:
- The compiler creates N+1 versions of the target region for N devices (the extra one is for the host)
- pragmas are expanded
- The LTO sections are written into .gnu.offload_lto_*
- lto-wrapper runs mkoffload, which runs the accel compiler to convert the IR into target code
- The linker adds the new object files produced by mkoffload
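To make the pipeline above concrete, here is a minimal sketch of a target region that the host compiler would expand and duplicate for each device. The function name vec_add is hypothetical and not from the original notes. It assumes a GCC built with nvptx offloading and a compile line along the lines of gcc -fopenmp -foffload=nvptx-none; without -fopenmp the pragma is ignored and the loop simply runs on the host.

```c
/* Hypothetical example: element-wise addition of two arrays.
 * With offloading enabled, the loop body is the "target region":
 * one version is compiled for the host and one for each offload device. */
void vec_add(const float *a, const float *b, float *out, int n)
{
    /* map(to: ...) copies the inputs to the device,
     * map(from: ...) copies the result back to the host. */
    #pragma omp target teams distribute parallel for \
        map(to: a[0:n], b[0:n]) map(from: out[0:n])
    for (int i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

If no device is available at run time, the host version of the region is used, so the result should be the same either way.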
Offloading Practical
Ensure nvptx-tools is installed; the nvptx target uses it in place of binutils
Gist of building GCC for offloading
Read this article: building-gcc-with-support-for-nvidia
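Once the toolchain is built, a quick sanity check is to ask the OpenMP runtime whether a target region actually ran on a device. The sketch below is an assumption about how one might do this, not something from the article; the function name target_region_ran_on_device is hypothetical. omp_is_initial_device() is a standard OpenMP routine that returns nonzero when called on the host.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns 1 if a target region executed on an
 * offload device, 0 if it fell back to the host (or OpenMP is disabled). */
int target_region_ran_on_device(void)
{
    int on_host = 1; /* assume host execution until shown otherwise */
#ifdef _OPENMP
    /* tofrom: send the flag to the device and copy the result back */
    #pragma omp target map(tofrom: on_host)
    {
        on_host = omp_is_initial_device();
    }
#endif
    return !on_host;
}
```

A return value of 0 from a binary compiled with -fopenmp -foffload=nvptx-none would suggest that the region fell back to the host, for example because no device was found.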