The Department of Computer Science of the Federal University of Minas Gerais – DCC/UFMG and Khronos Chapter Brazil have been working together to spread and strengthen the industrial and scientific use of Khronos’ standards, most notably OpenCL and OpenGL.
On March 12th a talk was held with students and researchers of the Computer Vision laboratory – VeRLAB to present the benefits of using GPGPU through OpenCL to accelerate computer vision (CV) algorithms.
Machine learning and computer vision have become a reality in people’s daily lives through smart wearables (smart glasses, watches, phones), vehicles capable of recognizing traffic signs, biometric systems and many others. These technologies increase safety and comfort when using such machines and are currently the object of active research. However, developing better algorithms and executing them on mobile devices within acceptable time frames and with lower energy consumption remain open challenges.
This presentation covers how GPU parallel processing increases performance and energy efficiency when executing algorithms whose inputs are images. Speedups of up to 800x may be obtained through careful use of parallel algorithms suited to SIMD architectures, explicit cache management and texture samplers.
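To make the idea of per-pixel parallelism concrete, here is a minimal sketch in plain Python (it is not OpenCL and not code from the talk). It emulates a 2D kernel launch sequentially: each call to the hypothetical `box_blur_kernel` plays the role of one work-item and computes one output pixel from its 3x3 neighborhood, clamping coordinates at the borders much like a sampler configured with CLK_ADDRESS_CLAMP_TO_EDGE would. The names `enqueue_nd_range`, `box_blur_kernel` and `clamp` are made up for illustration.

```python
def clamp(value, low, high):
    """Keep a coordinate inside the image, as a clamp-to-edge sampler would."""
    return max(low, min(value, high))


def box_blur_kernel(gid_x, gid_y, src, dst, width, height):
    """Body of one work-item: averages the 3x3 neighborhood of one pixel."""
    total = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            x = clamp(gid_x + dx, 0, width - 1)
            y = clamp(gid_y + dy, 0, height - 1)
            total += src[y * width + x]
    # Each work-item writes only its own output pixel.
    dst[gid_y * width + gid_x] = total // 9


def enqueue_nd_range(kernel, global_size, *args):
    """Sequential stand-in for a 2D launch: one kernel call per pixel."""
    width, height = global_size
    for gid_y in range(height):
        for gid_x in range(width):
            kernel(gid_x, gid_y, *args)


# A 3x3 grayscale image with a single bright pixel in the center.
width, height = 3, 3
src = [0, 0, 0,
       0, 90, 0,
       0, 0, 0]
dst = [0] * (width * height)
enqueue_nd_range(box_blur_kernel, (width, height), src, dst, width, height)
print(dst)  # [10, 10, 10, 10, 10, 10, 10, 10, 10]
```

Because no work-item writes a pixel that another one reads, all of them could run at the same time, which is what makes this kind of image filter a good fit for a GPU.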
One very important issue with OpenCL code at the moment is that it must either be compiled at runtime by the vendor’s compiler or be precompiled, in which case it is restricted to the hardware it was precompiled for. This is a problem because programmers cannot protect sensitive parallel code unless it is delivered precompiled for each specific platform, a very tedious task that undermines the goal of having an open specification.
Khronos’ SPIR (Standard Portable Intermediate Representation) specification is an important step towards protecting sensitive source code while still maintaining cross-platform capabilities.
This may very well be the last step that the gaming industry and multimedia processing companies were waiting for to fully incorporate heterogeneous computing into their applications.
We are proud to release CMSoft’s Dynamic Shader (beta), software designed to let artists quickly bring their creations to life by providing a fast and easy way to create high-quality shading of 2D pictures, such as the one below:
The following video demonstrates how to create automatic gradient filling for 2D pictures:
Dynamic Shader aims to be a practical tool for professional art, for concept sketches and for people who simply like to draw and paint. The algorithm uses concepts from dynamic programming to compute color gradients and perform automatic shading. Because of the large amount of processing power required, an OpenCL-enabled GPU is necessary.
– 64-bit Windows with at least 4 GB RAM
– OpenCL-enabled GPU
Please keep in mind that Dynamic Shader is currently in beta. Please send us any and all suggestions and comments by email at email@example.com
Many thanks to:
Grand Prix Senai de Inovação 2013, the event during which Dynamic Shader was first unveiled.
Tales Vieira (drawing/shading) for the amazing picture and shading.
OpenCL 2.0 was presented by Khronos at SIGGRAPH 2013. A draft specification is available, and it has many exciting features. At this point the 2.0 spec still needs clarification in some aspects and, preferably, some examples.
Among the new interesting features are:
Shared virtual memory – clSVMAlloc will be the new base memory allocation function for shared memory that remains synchronized between host and device. This will be especially useful when host and device share the same physical memory, so that no copy is needed.
Dynamic parallelism – Device kernels will be able to launch their own subkernels on the device. With proper attention this should effectively enable recursive kernels.
Generic address space – At this moment, developers need to declare the address space of every buffer in kernels and functions. If a function performs the same operation on a __global or a __private memory buffer, it has to be duplicated. The generic address space should eliminate this need by letting a single function accept pointers from different address spaces.
Pipes – These appear to be memory objects with FIFO characteristics that work-items can use to pass messages from one kernel to another. According to the specification, pipes are accessed only from kernels, not from the host.
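As a rough illustration of why dynamic parallelism makes recursion possible, the sketch below emulates a device-side queue in plain Python. It is not OpenCL code and does not use any real API: `run_device`, `quicksort_kernel` and the `enqueue` callback are hypothetical names, with `enqueue` standing in for the device-side enqueue described in the OpenCL 2.0 draft. The host launches only the root kernel; every further launch comes from a kernel itself.

```python
from collections import deque


def run_device(kernel, *args):
    """Emulated device queue: runs the root kernel and every child it enqueues."""
    queue = deque([(kernel, args)])
    launches = 0

    def enqueue(child, *child_args):
        # Hypothetical stand-in for a device-side enqueue call.
        queue.append((child, child_args))

    while queue:
        current, current_args = queue.popleft()
        current(*current_args, enqueue=enqueue)
        launches += 1
    return launches


def quicksort_kernel(data, lo, hi, enqueue):
    """Partitions data[lo..hi] in place, then launches two child kernels."""
    if lo >= hi:
        return
    pivot = data[hi]
    i = lo
    for j in range(lo, hi):
        if data[j] < pivot:
            data[i], data[j] = data[j], data[i]
            i += 1
    data[i], data[hi] = data[hi], data[i]
    # The "recursion": each half is handled by a newly enqueued kernel.
    enqueue(quicksort_kernel, data, lo, i - 1)
    enqueue(quicksort_kernel, data, i + 1, hi)


values = [5, 2, 9, 1, 5, 6]
launches = run_device(quicksort_kernel, values, 0, len(values) - 1)
print(values)  # [1, 2, 5, 5, 6, 9]
```

Quicksort is used here only because its two halves are independent after partitioning, so the children never need to report back to the parent. An algorithm whose parent must wait for its children would need additional synchronization that this sketch does not model.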
Of all these new features, dynamic parallelism seems to be the one that will most affect how developers use OpenCL. Let’s just hope that implementations come soon!