0% found this document useful (0 votes)
49 views4 pages

OpenCL to CUDA Translator Guide

The document provides instructions for installing and using SnuCL-Tr, an OpenCL to CUDA translator. It describes downloading and compiling the required software components. The translator works by taking OpenCL device code and converting it to CUDA device code. It implements OpenCL host APIs as wrapper functions around CUDA driver APIs. The document includes a sample Makefile template and shows the original and translated kernel code for a sample application.

Uploaded by

xieliwei
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
49 views4 pages

OpenCL to CUDA Translator Guide

The document provides instructions for installing and using SnuCL-Tr, an OpenCL to CUDA translator. It describes downloading and compiling the required software components. The translator works by taking OpenCL device code and converting it to CUDA device code. It implements OpenCL host APIs as wrapper functions around CUDA driver APIs. The document includes a sample Makefile template and shows the original and translated kernel code for a sample application.

Uploaded by

xieliwei
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

SnuCL-Tr: OpenCL to CUDA

Quick Start Guide


Requirements
The OpenCL to CUDA translator in SnuCL-Tr requires that you have the latest CUDA Toolkit installed. You
can download the CUDA Toolkit from [Link]

How to Install
Perform the following steps to install the OpenCL to CUDA translator in SnuCL-Tr and verify the installation.

1. Untar Source Code

Download the SnuCL-Tr source code from [Link] Untar it on your


preferred location.

$ tar xvzf [Link]

2. LLVM

You need to configure the LLVM compiler first and then compile the program. It may take a long
time to compile for LLVM.

$ cd opencl2cuda/build
$ ../[Link]/configure
$ make BUILD_EXAMPLES=1

Note: If your system already has Clang, then you need to configure and build the LLVM compiler
manually with the following flags.

CC=gcc CXX=g++

Example:

$ ../[Link]/configure CC=gcc CXX=g++


$ make CC=gcc CXX=g++ BUILD_EXAMPLES=1

3. The Runtime Library

Build the runtime library (i.e., wrapper functions) at the location below.

$ cd opencl2cuda/common/common/
$ make

As a result, the shared library will be created in the following location:


 opencl2cuda/common/lib/libsnuclOC.a

4. Set Environment Variables

There are two environment variables that have to be set to use the OpenCL to CUDA translator:
OPENCL_TO_CUDA and OPENCL_TO_CUDA_GPU_ARCH (i.e., your GPU’s compute capability). You
can check the GPU’s compute capability at [Link]

Open your .bashrc to edit.

$ vi $(HOME)/.bashrc

At the bottom of the file, insert the two lines shown below.

export OPENCL_TO_CUDA=$(HOME)/opencl2cuda
export OPENCL_TO_CUDA_GPU_ARCH=compute_xx

For example, if your GPU’s compute capability is 3.0, then modify the
OPENCL_TO_CUDA_GPU_ARCH variable as shown below.

export OPENCL_TO_CUDA_GPU_ARCH=compute_30

To apply modified environment variables, go to your home directory and execute the following
command.

$ source .bashrc

Understanding the OpenCL to CUDA Translator


In OpenCL, the host code and device code are separated. Hence, we translate them separately. The
OpenCL device code (e.g., [Link]) is translated to the CUDA device code (e.g., [Link]) by our
source-to-source translator. The OpenCL host API functions are implemented as wrapper functions. We
use CUDA driver API functions to implement the wrappers.
How to Build a Program Using the OpenCL to CUDA Translator
 Makefile Template

In sample directory, a Makefile template for the OpenCL to CUDA translator is provided with a
sample application. Makefile can be written as you deem fit, but there are four things you have
to follow to use the translator.

 Use g++ compiler


CC = g++

 Add CUDA and the runtime library path to search


-L$(CUDA_INSTALLED_PATH)/lib64 –L$(OPENCL_TO_CUDA)/common/lib

 Link CUDA and the runtime library


-lcuda –lcudart –lsnuclOC -lpthread

 Add CUDA header file path


-I$(CUDA_INSTALLED_PATH)/include

Translation Result for the Sample Application


When you build your program with the OpenCL to CUDA translator, translated source files will be named
“*.cu”. For example, when you build the sample application, “__temp_kernel.cu” will be generated. You
can open it with text-editor to see how it is translated. The figures shown below are the original device
code and the translated device code of the sample application.

 [Link] (original kernel code)

__global__ void vecAdd(


__global int* A, __global int* B, __global int* C, const int N) {
int i = get_global_id(0);
C[i] = A[i] + B[i];
}
 __temp_kernel.cu (translated kernel code)

__constant__ char __snucl_const_mem[16384];


extern __shared__ char __snucl_shared_mem[];
__device__ unsigned int __snucl_group_id_offs[2] = {0, 0};
__device__ int get_global_id( int index) {
switch (index){
case 0: return (blockIdx.x + __snucl_group_id_offs[0]) *
blockDim.x + threadIdx.x;
case 1: return blockIdx.y * blockDim.y + threadIdx.y;
case 2: return blockIdx.z * blockDim.z + threadIdx.z;
default: return -1;
};
}

extern “C” __global__ void vecAdd(int* A, int* B, int* C, const int N){
int id = get_global_id(0);
C[i] = A[i] + B[i];
}

You might also like