Let’s get started with the OpenCL tutorial using OpenCLTemplate.
OpenCL is a framework developed to let you use your video card (GPU) as a general-purpose processor. The advantage is that GPUs have immense parallel processing power. I myself have been able to run collision detection algorithms 120x faster on a GPU. Since you are here reading this tutorial, I assume you want to use all this processing power to accelerate your own tasks.
I’ll try to keep these initial steps as simple as possible, because what we all really want is to see some OpenCL code running, right? If you want some historical background on GPU computing, try searching for CUDA, ATI Stream and GPGPU in your favorite search engine.
First, check on the manufacturer’s website whether your video card supports OpenCL. Even if it doesn’t, you can still install and run OpenCL applications on your computer if you install the ATI Stream SDK. ATI provides OpenCL support for CPUs even without an ATI GPU, which is awesome.
Basically, you will need to:
- Download and install Microsoft Visual C# Express from http://www.microsoft.com/exPress/. Go to the download section and get Visual C# 2008 Express. Of course you don’t need this step if you own a commercial version of Visual Studio;
- Download and install the latest drivers for your GPU;
- Get the OpenCL drivers from NVidia or AMD. You can find the ATI Stream SDK (which includes OpenCL support) at http://developer.amd.com/gpu/ATIStreamSDK/Pages/default.aspx#two. NVidia CUDA can be found at http://developer.nvidia.com/object/cuda_3_0_downloads.html. Keep in mind that these drivers are necessary to run OpenCL;
- Get the OpenCLTemplate files. Extract them and run OpenCLTemplate.EXE. This is what you should see:
If you see an ERROR, this means your OpenCL drivers are not working properly (assuming you have a compatible device). If you get errors while setting things up, try browsing some forums or contact me and I’ll try to help.
If everything went fine, you should see a Platform (the OpenCL implementation installed on your computer) and some Devices (the processors we can run OpenCL code on).
You can see from my screenshot that I have two devices that can run OpenCL: one AMD Phenom (the CPU) and one ATI RV770, which is my Radeon 4870. I have a CrossFire setup that I didn’t turn off, so I can’t run code on the second GPU.
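If you would like to see the same platform and device list from a script, here is a minimal sketch. It assumes the third-party pyopencl package, which is not part of OpenCLTemplate and is used here only for illustration; the function name list_opencl_devices is made up for this example. If pyopencl or the OpenCL drivers are missing, it simply returns an empty list.

```python
def list_opencl_devices():
    """Return (platform name, device name, device type) tuples, or [] if unavailable."""
    try:
        import pyopencl as cl  # third-party package; an assumption of this sketch
    except Exception:
        return []  # pyopencl is not installed or could not be loaded

    found = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                kind = cl.device_type.to_string(device.type)  # e.g. CPU or GPU
                found.append((platform.name, device.name, kind))
    except Exception:
        # Missing or broken OpenCL drivers usually surface here.
        pass
    return found


if __name__ == "__main__":
    devices = list_opencl_devices()
    if not devices:
        print("No OpenCL devices found; check that the OpenCL drivers are installed.")
    for platform_name, device_name, kind in devices:
        print(f"{platform_name}: {device_name} ({kind})")
```

An empty result here points to the same driver problem as an ERROR in OpenCLTemplate.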
Let’s go into some important configuration now. Windows won’t let the GPU work forever without returning images to the screen. The default configuration requires the GPU to respond within 2 seconds. Our code might take longer than that to run on the GPU, which is why we need to set the TdrDelay and TdrDdiDelay registry values to something greater than the default.
Moreover, Windows doesn’t always allow us to see all of the GPU memory. I have 1 GB cards and OpenCL would only see 256 MB before I set the Windows environment variable GPU_MAX_HEAP_SIZE to 1024.
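Before changing anything, it can help to check what is currently set. The sketch below reads TdrDelay and TdrDdiDelay from the GraphicsDrivers registry key and GPU_MAX_HEAP_SIZE from the environment, using only the Python standard library. The function names are made up for this example, and the registry part only does something on Windows.

```python
import os
import sys

GRAPHICS_DRIVERS_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
TDR_VALUE_NAMES = ("TdrDelay", "TdrDdiDelay")


def read_tdr_values():
    """Return {name: seconds} for the TDR values that exist; {} when not on Windows."""
    if not sys.platform.startswith("win"):
        return {}
    import winreg  # standard library, available on Windows only

    values = {}
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, GRAPHICS_DRIVERS_KEY) as key:
            for name in TDR_VALUE_NAMES:
                try:
                    values[name], _type = winreg.QueryValueEx(key, name)
                except FileNotFoundError:
                    pass  # value not created yet, so the Windows default applies
    except OSError:
        pass  # key missing or not readable
    return values


def describe_settings(tdr_values, env):
    """Build one human-readable line per setting discussed in this tutorial."""
    lines = []
    for name in TDR_VALUE_NAMES:
        if name in tdr_values:
            lines.append(f"{name} = {tdr_values[name]} s")
        else:
            lines.append(f"{name} is not set (Windows default applies)")
    heap = env.get("GPU_MAX_HEAP_SIZE")
    lines.append(f"GPU_MAX_HEAP_SIZE = {heap}" if heap else "GPU_MAX_HEAP_SIZE is not set")
    return lines


if __name__ == "__main__":
    for line in describe_settings(read_tdr_values(), os.environ):
        print(line)
```

Running it again after applying the settings is a quick way to confirm they took effect. Note that a changed environment variable is only visible to programs started after the change.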
If you are in a rush to start coding, trust me and write the recommended Registry values by clicking the button (you can always create a backup of your registry first). If you don’t, you will need to keep reading.
I have found that a delay of 128 seconds (just over 2 minutes) works well for me. Also, you should set the GPU_MAX_HEAP_SIZE environment variable to the amount of GPU memory available.
This is how to do it:
1 – Click Start and execute regedit.exe (type “regedit” as a command to open the Registry Editor);
2 – Go to HKEY_LOCAL_MACHINE\SYSTEM\CURRENTCONTROLSET\CONTROL\GraphicsDrivers;
3 – Create two REG_DWORD values: TdrDelay and TdrDdiDelay;
4 – Set TdrDelay to the maximum time in seconds you will allow the GPU to run code (I suggest 128) and TdrDdiDelay to the time it takes for Windows to reset your GPUs if they don’t respond (I suggest 256);
5 – Go to entry HKEY_LOCAL_MACHINE\SYSTEM\CURRENTCONTROLSET\CONTROL\SESSION MANAGER\ENVIRONMENT and create a REG_SZ value called GPU_MAX_HEAP_SIZE. Set its value to something greater than or equal to your GPU memory (I suggest 1024);
6 – Alternatively, you can create the environment variable by going to Control Panel -> System -> Advanced system settings -> Advanced tab -> Environment Variables and creating GPU_MAX_HEAP_SIZE there.
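If you prefer a file you can review and keep, the manual steps above can also be written as a .reg file. The sketch below generates one with the suggested values (128, 256 and 1024). The function names are made up for this example, and this is not necessarily how the OpenCLTemplate button does it.

```python
def dword(value):
    """Format an integer the way .reg files expect: 'dword:' plus 8 hex digits."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a REG_DWORD must fit in 32 bits")
    return f"dword:{value:08x}"


def build_reg_file(tdr_delay=128, tdr_ddi_delay=256, heap_size=1024):
    """Return the text of a .reg file holding the values suggested in this tutorial."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers]",
        f'"TdrDelay"={dword(tdr_delay)}',        # seconds, stored as REG_DWORD
        f'"TdrDdiDelay"={dword(tdr_ddi_delay)}',  # seconds, stored as REG_DWORD
        "",
        r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Environment]",
        f'"GPU_MAX_HEAP_SIZE"="{heap_size}"',     # plain string, stored as REG_SZ
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file())
```

Save the output to a file with a .reg extension, check its contents, and import it by double-clicking it. Importing into HKEY_LOCAL_MACHINE needs administrator rights, and backing up the registry first is still a good idea.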
The steps above are the hard way to set up the configuration; I have automated all of it behind a single click. Keep in mind that OpenCL is not nearly as established as OpenGL, so you may still run into issues.