This is possible using Cloo, or using OpenCL.h in C/C++. It is not implemented in OpenCLTemplate (mainly to try and keep things simple).
In native OpenCL this is done by setting the cl_bool blocking flag to CL_FALSE: blocking_read in clEnqueueReadBuffer and blocking_write in clEnqueueWriteBuffer. If you happen to need this, you might gain performance by moving to C/C++ and using raw OpenCL.
Using Cloo, you'll want to use ComputeCommandQueue.Read<>() and ComputeCommandQueue.Write<>(), again setting the blocking argument to false (which means your code will continue on while the copy is happening).
Note that you can't use the data that is being copied until the event associated with the read/write signals that the operation is complete.
You can either call clWaitForEvents on the host, or pass the list of events to be waited for as the event wait list of the next enqueue calls.
I strongly encourage you to take a look at the OpenCL documentation: www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/
Specifically, you'll be looking for Events and how to sync using them in Host code.
Let me know if this explanation helps. This topic is so specific that I haven't even covered it in the OpenCL tutorials (yet ?? =] ).