Case Study: Efficient manipulation of Kinect data using OpenCL/GL Interop

Download source code for this section.

1. Introduction

Interactive technologies have become extremely important in a world where busy users demand intuitive devices that require little to no learning time. In this scenario, tablets have emerged with their easy-to-use touchscreens, gaming consoles have successfully explored motion controls (Wii, PS3 Eye, Kinect) and augmented reality has started to emerge as a viable technology.

However, implementation of intelligent systems using devices such as the Kinect usually involves real-time processing of data from multiple sensors (RGB camera, depth camera, audio, accelerometers). This task can be achieved using OpenCL to harness the processing power of multicore GPUs and CPUs.

The goal of this case study is to demonstrate how to create a C# framework that captures Microsoft Kinect sensor data and transfers it to an OpenCL GPU Device, thus enabling the development of software that can potentially process Kinect data hundreds of times faster than pure CPU processing. To do this efficiently, we’ll use the OpenCL/OpenGL interoperation tools, the OpenCLTemplate C# resources and Kinect for Windows hardware. The Kinect for Xbox can be used for research purposes, but you’re not allowed to use it commercially.

As a processing example, we will create a continuous color coding for the depth data, ranging from blue (closest) through red to white (furthest). In tests run on our own hardware, Microsoft’s Kinect Explorer example runs the depth map at 17 fps using only 5 discrete depth regions, whereas our example runs at the full 30 fps while still computing the full continuous color coding.

The video below demonstrates the results: two OpenGL textured quads are used to display RGB and depth information acquired from a Kinect sensor. The data is transferred to GPU memory, processed using OpenCL and then copied to CLGL shared textures. Notice that there is a full false coloring procedure which includes highlighting players detected by Kinect.

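To complete this tutorial you will need the tools described in Section 2: Visual C# 2010 Express, the Kinect for Windows SDK and the OpenCLTemplate, Cloo and OpenTK libraries, plus a Kinect sensor.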

Please feel free to ask questions using our forum if you need help.

2. System setup

The tools needed to set up the development environment, acquire data from Kinect and process it using a shared OpenCL/OpenGL context will be presented in this section.

By the end of the system setup, you should be able to:

  • Create a new project in Visual C# 2010 Express;
  • Include references to Microsoft.Kinect, OpenCLTemplate, Cloo and OpenTK;
  • Create a basic OpenCL/GL shared context in a Windows Form.

2.1 Visual C# 2010 Express

In this case study, we’ll use Microsoft Visual C# 2010 Express. If you have the professional version, you’ll probably want to use it instead.

Microsoft’s C# 2010 Express is a tool that can be used to develop applications compatible with the Kinect and OpenCL. The first step is to download and install it from Microsoft’s VS2010 download page.

2.2 Kinect hardware and SDK

In this tutorial we’ll use the commercial Kinect for Windows sensor, shown below.

[Image: Kinect for Windows sensor]

It is possible to use the Xbox Kinect for testing purposes. However, the Kinect redistributable and the Kinect near mode will only work when using the Kinect for Windows.

In order to use the Kinect framework inside C# it is necessary to download and install the Kinect SDK, available from the Kinect for Windows download page. You may also want to download the toolkit to be able to learn more about the capabilities of the device.

2.3 References and the OpenCL/GL shared context

Now that we have Visual C# 2010 and the Kinect SDK installed, let’s create the project we’ll use to process Kinect data using OpenCL. The video below demonstrates the step-by-step creation of a new Visual C# 2010 project and how to include references to Microsoft.Kinect, OpenCLTemplate and the other necessary tools.

In this case study we’ll use OpenCLTemplate’s CL/GL sharing capabilities to create the OpenGL 3D environment and share the context with OpenCL through interop.

3. Retrieving Kinect data

In this section, we’ll use Microsoft.Kinect functions to retrieve RGB and depth information from the Kinect for Windows sensor. Bear in mind that this is by no means a complete description of, or tutorial about, Kinect. You can find complete tutorials on Microsoft’s Kinect page.

3.1 Initializing Kinect

The first step to retrieve Kinect data is to initialize the sensor. Some parameters (TransformSmoothParameters) control how Kinect data is preprocessed before it is delivered to the AllFramesReady event. We’ll just use values that work fine. You can read more about this on Microsoft’s Kinect page.

The important parts of this code are the .Enable() statements, which tell the Kinect API that we want to retrieve the ColorStream (color camera), DepthStream (depth values) and SkeletonStream (joint position values if a human is detected).

All this information will be sent to our AllFramesReady event handler once we start the Kinect with ks.Start().

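// True while a frame is being processed; starts true so frames are ignored until setup completes
bool working = true;
private void InitKinect()
{
    #region Enable kinect
    if (KinectSensor.KinectSensors.Count > 0)
    {
        ks = KinectSensor.KinectSensors[0];
        if (ks.Status == KinectStatus.Connected)
        {
            // Smoothing applied to skeleton data before it reaches AllFramesReady
            TransformSmoothParameters tsp = new TransformSmoothParameters();
            tsp.Smoothing = 0.1f;
            tsp.Prediction = 0.1f;
            tsp.Correction = 0.1f;
            // Near mode requires Kinect for Windows hardware
            ks.DepthStream.Range = DepthRange.Near;
            ks.SkeletonStream.EnableTrackingInNearRange = true;
            ks.ColorStream.Enable();
            ks.DepthStream.Enable();
            ks.SkeletonStream.Enable(tsp);
            ks.AllFramesReady += new EventHandler<AllFramesReadyEventArgs>(ks_AllFramesReady);
        }
    }
    #endregion
    // Nothing to start if no connected sensor was found
    if (ks == null || ks.Status != KinectStatus.Connected) return;
    //done initializing
    working = false;
    ks.Start();
    while (!ks.IsRunning) { }
}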

3.2 Retrieving depth and color data

Once we start the Kinect sensor, it will call the ks_AllFramesReady function and send RGB and depth images for processing. In this tutorial we won’t use Skeleton data; rather, we’ll use the RGB and Depth information.

The boolean working variable is used to allow frame skipping in case the computer is not able to process the current frame before the next one arrives. In our configuration, the Kinect SDK will send RGB and depth at 30 frames per second.
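The frame-skipping idea can be sketched in plain C as follows. This is only an illustration, not the code used in this case study: the names frame_busy, try_begin_frame, end_frame and on_frame_ready are hypothetical, and the sketch uses a C11 atomic flag instead of the plain boolean of the C# code so that the check-and-set is a single step.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag: set while a frame is being processed. */
static atomic_flag frame_busy = ATOMIC_FLAG_INIT;

/* Returns true if the caller may process this frame, false if it must be skipped. */
static bool try_begin_frame(void) {
    /* test_and_set returns the previous value: false means we just claimed the slot. */
    return !atomic_flag_test_and_set(&frame_busy);
}

/* Marks processing as finished so that the next frame is accepted. */
static void end_frame(void) {
    atomic_flag_clear(&frame_busy);
}

/* Counters used only to make the behaviour observable. */
static int processed = 0;
static int skipped = 0;

/* Example callback. finish_immediately = false simulates a frame that is
   still being processed when the next one arrives. */
static void on_frame_ready(bool finish_immediately) {
    if (!try_begin_frame()) {
        skipped++;          /* previous frame still in flight: drop this one */
        return;
    }
    processed++;
    /* ... copy and process the frame here ... */
    if (finish_immediately) end_frame();
}
```

A frame that arrives while the previous one is still in flight is simply dropped, which keeps the display responsive at the cost of a lower effective frame rate.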

The variables pixels (length = 4*W*H) and depthPix (length = W*H) are used to store the RGB image and the depth values of the scene, respectively. Considering that the retrieved image has width W and height H, the information stored in these arrays is as follows:

pixels[4*(x + W*y)] to pixels[4*(x + W*y) + 3] – stores the BGRA color values of pixel [x,y]

depthPix[x+W*y] – stores depth and player information of pixel [x,y]. The player information is how Kinect reports whether it thinks there is a player (numbered 1 to 6) at position [x,y] of the depth image. To retrieve the actual depth of the scene, as well as the player number Kinect thinks is at position [x,y], we unmask the bits using the formula:

int player = depthPix[x+W*y] & DepthImageFrame.PlayerIndexBitmask;
int depth =  depthPix[x+W*y] >> DepthImageFrame.PlayerIndexBitmaskWidth;
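As a worked example, the indexing and unmasking rules above can be written as a few small C helpers. This is a sketch under stated assumptions: the mask (7) and shift (3) are the values defined in the kernel of section 4.3, all function names are hypothetical, and pack_depth exists only to build sample values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the #defines of the kernel in section 4.3. */
#define PLAYER_INDEX_BITMASK       7
#define PLAYER_INDEX_BITMASK_WIDTH 3

/* Offset of the first (blue) byte of pixel [x,y] in the BGRA array. */
static size_t bgra_offset(int x, int y, int w) {
    assert(x >= 0 && x < w && y >= 0);
    return 4u * ((size_t)x + (size_t)w * (size_t)y);
}

/* Index of pixel [x,y] in the depth array (one short per pixel). */
static size_t depth_index(int x, int y, int w) {
    assert(x >= 0 && x < w && y >= 0);
    return (size_t)x + (size_t)w * (size_t)y;
}

/* The low 3 bits hold the player number (0 = no player). */
static int player_of(int16_t raw) {
    return raw & PLAYER_INDEX_BITMASK;
}

/* The remaining bits hold the depth in millimeters.
   Casting to unsigned first means the shift never sign-extends. */
static int depth_mm_of(int16_t raw) {
    return (uint16_t)raw >> PLAYER_INDEX_BITMASK_WIDTH;
}

/* Inverse operation, used here only to build sample values. */
static int16_t pack_depth(int depth_mm, int player) {
    return (int16_t)((depth_mm << PLAYER_INDEX_BITMASK_WIDTH) |
                     (player & PLAYER_INDEX_BITMASK));
}
```

For instance, a pixel at 1234 mm that belongs to player 2 is stored as (1234 << 3) | 2 = 9874, and for a 640-pixel-wide image the BGRA bytes of pixel [2,1] start at offset 4*(2 + 640*1) = 2568.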

The code below provides details on how to retrieve the Kinect data. At this point, we’re ready to send the data to OpenCL for processing and to OpenGL for rendering.

//RGB image
byte[] pixels;
//depth
short[] depthPix;
void ks_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    if (working) return;
    working = true;
    {
        #region Color image
        using (ColorImageFrame cif = e.OpenColorImageFrame())
        {
            if (cif == null)
            {
                working = false;
                return;
            }
            if (pixels == null)
            {
                pixels = new byte[cif.PixelDataLength];
            }
            cif.CopyPixelDataTo(pixels);
        }
        #endregion
        #region Depth map
        using (DepthImageFrame dif = e.OpenDepthImageFrame())
        {
            if (dif == null)
            {
                working = false;
                return;
            }
            if (depthPix == null)
            {
                depthPix = new short[dif.PixelDataLength];
            }
            dif.CopyPixelDataTo(depthPix);
            //int player = depthPix[0] & DepthImageFrame.PlayerIndexBitmask;
            //int depth = depthPix[0] >> DepthImageFrame.PlayerIndexBitmaskWidth;
        }
        #endregion
    }
    working = false;
}

4. Processing Kinect Data Using OpenCL

In this section, we will demonstrate how to use OpenCLTemplate to share textures between OpenCL and OpenGL and send the Kinect data for display.

By the end of this section, we should be able to:

  • Create two textured rectangles in the OpenGL environment;
  • Send Kinect data to the GPU;
  • Use OpenCL/GL texture sharing to display Kinect data using OpenGL textures.

4.1 Sharing OpenGL textures

In order to display a texture in the OpenGL environment, it is first necessary to create a texture-mapped rectangle on the screen. In OpenCLTemplate, this is done by creating a GLVBOModel to be added to the GLRender object:

GLRender.GLVBOModel m = CreateTexHolder(640, 480, -650, Color.White);
GLRender.GLVBOModel m2 = CreateTexHolder(640, 480, 650, Color.White);
CLimg = m.GetCLTexture2D();
CLimgdepth = m2.GetCLTexture2D();
CLtime = new CLCalc.Program.Variable(time);
glr.Models.Add(m);
glr.Models.Add(m2);

The parameters used in CreateTexHolder were defined manually to fit appropriately into the GLRender display environment. Note that GLRender.GLVBOModel implements functions to retrieve OpenCL objects created from OpenGL buffers. In this case, we’re using the .GetCLTexture2D() method to retrieve OpenCL Image2D objects which will be used in kernels. Thus, CLimg and CLimgdepth are OpenCL Image2D objects created from OpenGL textures.

The function GLRender.GLVBOModel CreateTexHolder(int w, int h, int offset, Color c) is used to create the 3D model of a rectangle, complete with texture coordinates and a texture. Since this function consists solely of OpenGL instructions, the reader is invited to study the implementation details directly in the source code of this section.

4.2 Transferring data to Device memory

After initialization, we have two CLCalc.Program.Image2D objects shared with OpenGL: CLimg and CLimgdepth. We now need to update our Kinect callback function ks_AllFramesReady to send RGB and depth data to the OpenCL Device.

Recall that Kinect gives us one 4-byte-per-pixel array containing camera RGB information and one short array containing depth information. The OpenCL memory objects CLKinectRGB (an Image2D) and CLdepth (a Variable) are created to store the Kinect data that will be processed on the OpenCL device.
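To get a feeling for the amount of data involved, the short C sketch below computes the bytes uploaded per frame and per second. It assumes that both streams are 640x480 (the size passed to CreateTexHolder) and arrive at 30 frames per second; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Color image: 4 bytes (B, G, R, A) per pixel. */
static size_t rgb_bytes(size_t w, size_t h) {
    return w * h * 4;
}

/* Depth map: one 16-bit value per pixel. */
static size_t depth_bytes(size_t w, size_t h) {
    return w * h * sizeof(int16_t);
}

/* Total host-to-device traffic per second for both streams. */
static size_t bytes_per_second(size_t w, size_t h, size_t fps) {
    return (rgb_bytes(w, h) + depth_bytes(w, h)) * fps;
}
```

Under these assumptions each frame carries 1,228,800 bytes of color data and 614,400 bytes of depth data, or roughly 55 MB per second for both streams together.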

Note that OpenCL objects created from OpenGL need to be acquired before OpenCL uses them and released afterwards, so that OpenGL can access them again. The next section will present the OpenCL C99 kernel code.

void ks_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    if (working) return;
    working = true;
    {
        #region Color image
        using (ColorImageFrame cif = e.OpenColorImageFrame())
        {
            if (cif == null)
            {
                working = false;
                return;
            }
            if (pixels == null)
            {
                pixels = new byte[cif.PixelDataLength];
                CLKinectRGB = new CLCalc.Program.Image2D(pixels, cif.Width, cif.Height);
            }
            cif.CopyPixelDataTo(pixels);
            //for (int i = 0; i < (pixels.Length >> 2); i++) pixels[(i << 2) + 3] = 255;
            CLKinectRGB.WriteToDevice(pixels);
        }
        #endregion
        #region Depth map
        using (DepthImageFrame dif = e.OpenDepthImageFrame())
        {
            if (dif == null)
            {
                working = false;
                return;
            }
            if (depthPix == null)
            {
                depthPix = new short[dif.PixelDataLength];
                CLdepth = new CLCalc.Program.Variable(depthPix);
            }
            dif.CopyPixelDataTo(depthPix);
            CLdepth.WriteToDevice(depthPix);
        }
        #endregion
        CLGLInteropFunctions.AcquireGLElements(new CLCalc.Program.MemoryObject[] { CLimg, CLimgdepth });
        kernelimgKinectCopy.Execute(new CLCalc.Program.MemoryObject[] { CLKinectRGB, CLdepth, CLimg, CLimgdepth },
                                    new int[] { CLimgdepth.Width, CLimgdepth.Height });
        CLGLInteropFunctions.ReleaseGLElements(new CLCalc.Program.MemoryObject[] { CLimg, CLimgdepth });
        glr.ReDraw();
    }
    working = false;
}

4.3 Processing and displaying data

Now that all Kinect data has been copied to OpenCL memory objects, we are ready to process the RGB and Depth images using the GPU and to copy them into the shared OpenGL textures.

The write_only images imgGLRGB and imgGLDepth are shared with OpenGL. They must be acquired by OpenCL before kernel execution and released afterwards, as shown previously. The Kinect inputs are the RGB data contained in imgRGB and the depth data in depthMap.

Copying the RGB image into the GL texture requires converting the 4-byte-per-pixel input into the 4-float-per-pixel format used by OpenGL and reordering the BGRA input into RGBA.
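The conversion of a single pixel can be sketched on the CPU in plain C as shown next. This is an illustrative reference, not the kernel itself: the type rgba_f, the function bgra_to_rgba_f and the gain parameter are hypothetical names, and unlike the kernel this sketch keeps alpha fixed at 1.0 instead of scaling it.

```c
#include <stdint.h>

/* One pixel as four normalized floats, in the order OpenGL expects. */
typedef struct { float r, g, b, a; } rgba_f;

/* Converts one BGRA pixel (4 bytes, blue first) into normalized RGBA floats.
   gain is a hypothetical brightness factor; the kernel below uses 1.2f. */
static rgba_f bgra_to_rgba_f(const uint8_t bgra[4], float gain) {
    const float inv255 = 1.0f / 255.0f;   /* maps 0..255 to 0..1 */
    rgba_f out;
    out.r = (float)bgra[2] * inv255 * gain;   /* red is the third byte   */
    out.g = (float)bgra[1] * inv255 * gain;   /* green stays in place    */
    out.b = (float)bgra[0] * inv255 * gain;   /* blue is the first byte  */
    out.a = 1.0f;                             /* opaque, not scaled by gain */
    return out;
}
```

With a gain of 1.0, the BGRA bytes (255, 128, 0, 0) become approximately r = 0.0, g = 0.502, b = 1.0.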

The depth data is a bit trickier: first of all it is necessary to unmask the bits in order to retrieve player and depth information. Then, pixels get a custom color based on their depth (function getColor; feel free to see implementation details in this section’s source code).
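Since getColor is not listed here, the C sketch below shows one possible way to build a continuous blue to red to white ramp. It is a hypothetical stand-in, not the implementation from the source code: the name depth_ramp is invented, and the argument order (max, min, value) is only inferred from the calls in the kernel below.

```c
#include <assert.h>

typedef struct { float r, g, b, a; } color4;

/* Clamps t to the interval [0, 1]. */
static float clamp01(float t) {
    return t < 0.0f ? 0.0f : (t > 1.0f ? 1.0f : t);
}

/* Hypothetical stand-in for getColor(max, min, value):
   blue at min, pure red halfway, white at max. Values outside
   [min, max] are clamped to the nearest end of the ramp. */
static color4 depth_ramp(float max, float min, float value) {
    assert(max > min);
    float t = clamp01((value - min) / (max - min));
    color4 c = {0.0f, 0.0f, 0.0f, 1.0f};
    if (t < 0.5f) {
        float s = t * 2.0f;            /* 0..1 over the first half: blue -> red */
        c.r = s;
        c.b = 1.0f - s;
    } else {
        float s = (t - 0.5f) * 2.0f;   /* 0..1 over the second half: red -> white */
        c.r = 1.0f;
        c.g = s;
        c.b = s;
    }
    return c;
}
```

With the range used in the kernel (200 to 4000), a depth of 200 maps to pure blue, 2100 to pure red and 4000 to white.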

#define PlayerIndexBITMASK 7
#define PlayerIndexBITMASKWidth 3
__kernel void imgKinectCopy (
    __read_only  image2d_t imgRGB, //acquired RGB image
    __global const short * depthMap, //depth map
    __write_only image2d_t imgGLRGB, //shared GL texture for RGB
    __write_only image2d_t imgGLDepth //shared GL texture for depth
)
{
    const sampler_t smp = CLK_NORMALIZED_COORDS_FALSE | //Natural coordinates
                          CLK_ADDRESS_CLAMP | //Clamp to zeros
                          CLK_FILTER_NEAREST; //Don't interpolate
    int2 coord = (int2)(get_global_id(0), get_global_id(1));
    //RGB image
    uint4 val1 = read_imageui(imgRGB, smp, coord);
    //Kinect delivers BGRA, so swap x and z to obtain RGBA
    float4 val11 = (float4)((float)val1.z, (float)val1.y, (float)val1.x, 255.0f);
    //0.00392156862745098f = 1/255 maps bytes to [0,1]; 1.2f brightens the image slightly
    write_imagef (imgGLRGB, coord, val11 * 0.00392156862745098f * 1.2f);
    //depth image
    int ind = coord.x + get_global_size(0)*coord.y;
    int player = depthMap[ind] & PlayerIndexBITMASK;
    float depth = (float)(depthMap[ind] >> PlayerIndexBITMASKWidth);
    //ranges: normal 800 - 4000 mm
    //        near   400 - 3000 mm
    float4 depColor = depth > 100 ? getColor(4000.0f, 200.0f, depth) : (float4)(0.0f,0.0f,0.0f,1.0f);
    //depColor = player <= 0 ? depColor : (float4)( ((float)player+1.0f)*20 ,0.0f,0.0f,1.0f);
    depColor = player <= 0 ? depColor : getColor(10.0f, 0.0f, (float)player ) + (float4)(0.5f,0.0f,0.0f,1.0f);
    write_imagef (imgGLDepth, coord, depColor);
}

5. Conclusion

In this study we demonstrated how to process and display Kinect data on a GPU using the OpenCL/GL interop functions available in OpenCLTemplate (which wraps Cloo for OpenCL and OpenTK for OpenGL). A CLGL environment was created to hold two shared textures. Kinect data is acquired using the Microsoft.Kinect SDK, sent to GPU memory and processed; the RGB image is then copied to one texture while a false-colored depth map is stored in the other.

At this point, Kinect RGB and depth information are available in GPU memory, potentially enabling this information to be processed hundreds of times faster.

This case study outlines the important aspects of OpenCL/GL sharing and of sending and processing Kinect data using an OpenCL Device. It is not meant to be a complete tutorial about either topic. Implementation details can be found in the source code.

More information about OpenCL/GL interop can be found in our Image2d tutorial. Please refer to Microsoft Kinect’s webpage for tutorials about how to retrieve Kinect data.

