Case study: low poly exact collision detection Download source code for this section. 1. Introduction In this tutorial we will present a O(nm) collision detection algorithm suitable for detecting exact collision between 3D models containing n and m triangles. Since the complexity of the algorithm is O(nm), it is better used for models with around 100 polygons. Nonetheless, if you need to check collision between more complex objects, you can always create intermediate bounding boxes to lower the order of the algorithm. This algorithm is implemented in the Lab3D software. Collision detection is a reallife problem constantly present in engineering applications. It exists in a very wide range of problems, such as: 1  Avoiding robot arms trajectories that collide with the surroundings; 2  Solving particles interaction systems; 3  Estimating risk due to explosions that generate debris; 4  Detecting construction problems when projects try to place equipments that will collide if they are placed as the project says; 5  Ray tracing algorithms; 6  Almost every game. The objective of this tutorial is to present an OpenCL algorithm to detect exact collisions in engineering model sketches, which usually involve simple geometries (spheres, cylinders, pressure vessels and pipelines) as opposed to complex geometries (very highpoly human models, spaceships and vehicles). The analyzed algorithm shall be suitable to solve problems 3 and 4, which are closely related to engineering. 1.1 Example The picture below shows two forklifts that have interlaced tops but don't collide, which is the worst case. You can try this sample in Lab3D (download Lab3D) by loading two forklift.OBJ model and playing with them: Notice that they don't collide. Checking that there is no collision (worst case) between these models takes 38,0 s using regular code (CPU without OpenCL), 4,0 s on my CPU (OpenCL), and only 1,2 s on my GPU (OpenCL) in my computer. When collisions exist the algorithm runs much faster taking as low as 0.1 s on the GPU for this case. 1.2 Benchmarks I decided to benchmark collision test results to show how powerful collision computation in GPU can be: Notice that the GPU time is about 35x faster than the regular implementation and around 3.5x faster than OpenCL running on the CPU. IMPORTANT NOTE: Every time, the OpenCL execution copies all vertex and triangle data to the Device before computing calculations. If you need to calculate collisions between the same models over and over again, you will need minimal data transfer and you can expect algorithms 300x faster. 2. Collision models There are basically 3 types of techniques to detect collision:  Vertexvertex;  Vertexface;  Faceface Vertexvertex collisions are techniques to detect particles interaction and simulate fluids and gases. They consist in creating interaction "boxes" to lower the order of the algorithm from O(n²) to O(b), where n is number of particles and b is number of boxes and are suitable for highpoly collision detection and distance estimation. The downside is that it is not possible to guarantee exact collision detection. Vertexface collisions are techniques to detect when a particle interacts with a surface. Faceface collisions is a technique that allows exact collision detection even if the model's faces are very large and stretched. This will be the focus of this case study since engineering models in projects don't usually require a very high vertex count. For instance, a storage tank of any height can be well represented by 32 vertexes: 16 for the upper circumference and 16 for the bottom. A texture can then enhance the visualization. You may check the source code for this tutorial in order to see implementations of the other collision models. 3. Overview of the problem Let's analyze the faceface collision carefully to better understand the problem. We want an algorithm capable of detecting exact collision between two given triangles, no matter what their position is, and not dependant on vertex to vertex distances. Even the slightest touch should be detected because this is important, for instance, to analyze explosion problems. Take a look at the picture below. The picture above shows the same scene from two different perspectives. The scene with the bottom plane rendered in wireframes shows that the vertexes of the two models aren't even close. Still, the detection algorithm should have no problem detecting the collision. 4. Solution Let's divide the problem into two tasks: 1  Create a routine to detect if two triangles collide or not; 2  Check if any of the triangles in Model 1 collide with triangles in Model 2. The second task is quite straightforward. Let's pseudocode it: bool CheckCollision(Model 1, Model 2) { for each(triangle T1 in Model 1) for each(triangle T2 in Model 2) if (T1 collides with T2) return true; return false; } The first task is the complicated one. The picture below should make the first part of the problem clear: As you can see, the above cases are what we are looking for. Either two sides of the same triangle cross the other one or one side of each crosses the other. The steps to solve this problem are: 1  Apply translation/rotation to the vertexes of the model before starting to detect collisions; 2  Create a routine that checks if a line given by its points AB crosses a triangle given by its points CDE; 3  Create a routine that, given two triangles, checks if any of the 3 borders of T1 cross T2 and viceversa. 4.1 Model translation / rotation For you programmers familiar with OpenGL, what we will do next is create functions that mimic glTranslate and glRotate. It is very rare that all colliding objects will stand still in the scene. It's likely that their center has translated to the point Displacement[] so that Displacement[0]  X component of displacement, Displacement[1]  Y component of displacement, Displacement[2]  Z component of displacement and their axis has rotated by angles ψ (along Z axis), θ (along Y axis) and φ (along X axis). The order of rotations is defined by the software implementation so you need to make sure to do the same rotation here as you do in your software. Calculating the translations using OpenCL is quite simple: __kernel void CalcTransl( __global read_only float * Displacement, __global float * DisplacedVertexes) { int k = get_global_id(0); int i = k / 3; int r = k  3 * i; DisplacedVertexes[k] += Displacement[r]; } Yes, I could have used float4 vectors for this, but since I am only using 3 components I preferred to save some space and just use the space I needed. What the above code does is calculate the remainder r of k/3 to know if the vertex of the worker is X, Y or Z. Now onto the rotation. This is a bit more complicated since you need to compute the correct rotation matrix and then apply it to all points. If you are not familiar with rotation equations, I suggest you go back and study Euler angles. There's a lot of information available out there. We need to compute the rotation matrix and then apply it to the vertexes. Fortunately, the rotation matrix is calculated only once since we are considering solid models. We can just compute the angles using the CPU: private void CalcRotMatrix(float[] RotMatrix, float[] Displacement) { float psi = Displacement[5], theta = Displacement[4], phi = Displacement[3]; double Costheta = Math.Cos(theta), Sintheta = Math.Sin(theta); double Cosphi = Math.Cos(phi), Sinphi = Math.Sin(phi); double Cospsi = Math.Cos(psi), Sinpsi = Math.Sin(psi); RotMatrix[0] = (float)(Costheta * Cospsi); RotMatrix[3] = (float)(Cosphi * Sinpsi + Sinphi * Sintheta * Cospsi); RotMatrix[6] = (float)(Sinphi * Sinpsi + Cosphi * Sintheta * Cospsi); RotMatrix[1] = (float)(Costheta * Sinpsi); RotMatrix[4] = (float)(Cosphi * Cospsi + Sinphi * Sintheta * Sinpsi); RotMatrix[7] = (float)(Sinphi * Cospsi + Cosphi * Sintheta * Sinpsi); RotMatrix[2] = (float)(Sintheta); RotMatrix[5] = (float)(Sinphi * Costheta); RotMatrix[8] = (float)(Cosphi * Costheta); } And then use OpenCL to parallelize the rotation of vertexes: __kernel void CalcRotacoes( __global read_only float * rotM, __global read_only float * V, __global float * DisplacedVertexes) { int k = get_global_id(0); int i = k / 3; int r = k  3 * i; DisplacedVertexes[k] = V[3 * i] * rotM[r] + V[3 * i + 1] * rotM[r + 3] + V[3 * i + 2] * rotM[r + 6];
} Finally, we wrap it up by creating a function to apply both displacements to the vertexes: /// <summary>Calculates vertex displacements using openCL. Returns displaced vertexes in OpenCL memory.</summary> /// <param name="ObjVertex">Original vertex positions</param> /// <param name="ObjDisplacement">Model displacement</param> private CLCalc.Program.Variable CLCalcVertexDisplacement(float[] ObjVertex, float[] ObjDisplacement) { CLCalc.Program.Variable ObjVert = new CLCalc.Program.Variable(ObjVertex); CLCalc.Program.Variable ObjDisp = new CLCalc.Program.Variable(ObjDisplacement); CLCalc.Program.Variable ObjDispVert = new CLCalc.Program.Variable(ObjVertex); float[] rotMatrix = new float[9]; CLCalc.Program.Variable rotM = new CLCalc.Program.Variable(rotMatrix); CLCalc.Program.Variable[] args; int[] max = new int[1]; max[0] = ObjVertex.Length; if (ObjDisplacement[3] != 0  ObjDisplacement[4] != 0  ObjDisplacement[5] != 0) { CalcRotMatrix(rotMatrix, ObjDisplacement); rotM.WriteToDevice(rotMatrix); args = new CLCalc.Program.Variable[] { rotM, ObjVert, ObjDispVert }; kernelCalcRotacoes.Execute(args, max); } args = new CLCalc.Program.Variable[] { ObjDisp, ObjDispVert }; kernelCalcTransl.Execute(args, max); return ObjDispVert; } The vector ObjDisplacement contains the XYZ displacements and the φθψ displacements, respectively, in its elements 0,1,2,3,4,5. The if checks if rotations have been applied to the model. If not, then it's not necessary to rotate the model. 4.2 Checking if a line crosses a triangle A little bit of analytic geometry now. Problem: given a line AB and a triangle CDE, check if AB crosses CDE. Let's visualize the problem: We need to parameterize the triangle and the line in order to calculate the collision. The points of line AB are given by: L ( δ ) = A + δ AB where δ is a scalar that goes from 0 to 1 and AB is the vector that goes from A to B (i.e., BA). You can easily check that L(0) = A and L(1) = B. Plane CDE is given by: P ( α, β ) = C + αCD + βCE, α ≥ 0, β ≥ 0, α + β ≤ 1 Again, it's easy to check that P(0,0) = C, P(1,0) = D, P(0,1) = E and the edge DE is located where α + β = 1. All that's left to do is solve the linear system: A + δ AB = C + αCD + βCE Remember that each vector and each point has 3 components (X, Y, and Z) which means there are 3 equations and 3 unknowns. After solving the equations we know that the line crossed the triangle if and only if α ≥ 0, β ≥ 0, α + β ≤ 1, 0 ≤ δ ≤ 1. This happens exactly because of the parameterization. You may study and solve the 3x3 equation yourself if you want to. This is the code to do it: bool RetaCruzaTriangulo(float4 A, float4 B, float4 C, float4 D, float4 E) { //checks if A + k*AB crosses CDE //vector calculations float4 AB = B  A; float4 CD = D  C; float4 CE = E  C;
//match point between plane and line: //A+delta*AB = C+alfa*CD+beta*CE //[CD CE AB]*[alfa beta delta]t=[AC] //0<=delta<=1, alfa>=0, beta>=0, alfa+beta<=1 float invdet; invdet = fma(CD.x, (fma(CE.y, AB.z,  AB.y * CE.z)), fma(CD.y, (fma(AB.x, CE.z,  CE.x * AB.z)), CD.z * (fma(CE.x, AB.y,  AB.x * CE.y)))); if (invdet == 0) return false; invdet = 1 / invdet; float M[9]; M[0] = fma(CE.y, AB.z, CE.z * AB.y); M[1] = fma(CD.y, AB.z,  CD.z * AB.y);M[2] = fma(CD.y, CE.z,  CD.z * CE.y); M[3] = fma(CE.z, AB.x, AB.z * CE.x); M[4] = fma(CD.z, AB.x,  AB.z * CD.x);M[5] = fma(CD.z, CE.x,  CE.z * CD.x); M[6] = fma(CE.y, AB.x,  CE.x * AB.y);M[7] = fma(CD.y, AB.x, CD.x * AB.y); M[8] = fma(CD.y, CE.x, CD.x * CE.y); //[alfa=sol.x beta=sol.y delta=sol.z]=M*[AC] float4 CA = A  C; float4 sol; sol.x = fma(M[0], CA.x, fma(M[3], CA.y, M[6] * CA.z)); sol.y = fma(M[1], CA.x, fma(M[4], CA.y, M[7] * CA.z)); sol.z = fma(M[2], CA.x, fma(M[5], CA.y, M[8] * CA.z)); sol *= invdet; if (0 <= sol.z && sol.z <= 1 && sol.x >= 0 && sol.y >= 0 && sol.x + sol.y <= 1) return true; else return false; } One interesting thing about this approach is that it is possible to know EXACTLY the point where the collision takes place. The downside is that we don't calculate the closest distance between the polygons. Note: The builtin function mad is a fast hardware multiplicationaddition function. mad(a,b,c) = a*b + c, except that it is MUCH faster. 4.3 Checking triangle collision Recall the following picture: We need to check if the edges of one triangle cross the other triangle. Good part: if one single line edge crosses one triangle of the other model, collision has been detected and the algorithm can fully stop. You may also notice that collisions always come in pairs, which means that we don't need to check all 6 edges  5 is enough. To solve the problem we will launch a 2dimension kernel that checks if the ith triangle of Model 1 collides with the jth triangle of Model 2. If a collision is detected, the models collide. Let's name the variables. Let A0, A1 and A2 be the vertexes of the first triangle, B0, B1 and B2 be the vertexes of the second one and numCollisions[0] be the number of collisions detected so far. What each worker has to do is: __kernel void CalcExactCollision( __global float * Obj1Vertexes, __global int * Obj1Triangles, __global float * Obj2Vertexes, __global int * Obj2Triangles, __global int * numCollisions) { int i = get_global_id(0); int j = get_global_id(1); int ii = 3 * i, jj = 3 * j; //Read vertexes (...) //Collision check if (numCollisions[0] <= 0) if (RetaCruzaTriangulo(A0, A1, B0, B1, B2)) numCollisions[0]++; if (numCollisions[0] <= 0) if (RetaCruzaTriangulo(A0, A2, B0, B1, B2)) numCollisions[0]++; if (numCollisions[0] <= 0) if (RetaCruzaTriangulo(A1, A2, B0, B1, B2)) numCollisions[0]++; if (numCollisions[0] <= 0) if (RetaCruzaTriangulo(B0, B1, A0, A1, A2)) numCollisions[0]++; if (numCollisions[0] <= 0) if (RetaCruzaTriangulo(B0, B2, A0, A1, A2)) numCollisions[0]++; //Not necessary. If none of the others have collided this one won't too //if (numCollisions[0] <= 0) // if (RetaCruzaTriangulo(B1, B2, A0, A1, A2)) // numCollisions[0]++; } Notice that one single collision ends all workers. If no collision is detected the algorithm has to sweep through all triangles and that's the worst case. The read vertexes part is omitted from this but you can check it in the source. It is also important to say that if two triangles share the same edge this algorithm will compute the collision test twice. Since this algorithm is intended to be able to check collision in models that explode, this is necessary. If your models don't explode, you may consider merging your triangles into a unique set of lines for each model. 5. Host Implementation I will not copy and paste full codes into this section. If you want to know more details about the implementation, check the source code and look for public bool DetectCollisions(CollisionObject Obj1, CollisionObject Obj2). You may also want to check my collision object structure because I may check for bounding boxes collision prior to full collision testing. Now that we have the tools, the implementation of the algorithm is fairly easy (Host code): public bool DetectCollisions(CollisionObject Obj1, CollisionObject Obj2) { //Calculates displacement of models CLCalc.Program.Variable Obj1DispVert = CLCalcVertexDisplacement(Obj1.Vertex, Obj1.Displacement); CLCalc.Program.Variable Obj2DispVert = CLCalcVertexDisplacement(Obj2.Vertex, Obj2.Displacement); //Transfers triangles data to Device memory int[] varNumCols = new int[1] { 0 }; CLCalc.Program.Variable numColisoes = new CLCalc.Program.Variable(varNumCols); CLCalc.Program.Variable Obj1Triangles = new CLCalc.Program.Variable(Obj1.Triangle); CLCalc.Program.Variable Obj2Triangles = new CLCalc.Program.Variable(Obj2.Triangle); //Calculates collisions int[] max = new int[2] { Obj1.Triangle.Length / 3, Obj2.Triangle.Length / 3 }; CLCalc.Program.Variable[] args = new CLCalc.Program.Variable[] { Obj1DispVert, Obj1Triangles, Obj2DispVert, Obj2Triangles, numColisoes }; kernelCalcExactCollision.Execute(args, max); //Read results numColisoes.ReadFromDeviceTo(varNumCols); return varNumCols[0] > 0; } The upside of the algorithm is that the result is a single YesNo answer, which avoids expensive backwards data transfer. If you are always going to check collision between a given set of objects the collision algorithm will be incredibly faster than the CPU because you will need to transfer models data only once to the Device. Since data transfer is the bottleneck and after copying the data to Device memory all you will need to do is copy translation/rotation data and retrieve a Yes/No answer, OpenCL can easily be 300x faster than the CPU in this case. 6. Conclusion We have presented an algorithm to calculate exact lowpoly collisions for engineering purposes using OpenCL and parallel processing technology. The proposed implementation is 35x faster when compared to the regular implementation and, when collisions are calculated between the same models more than once, it is not necessary to copy the data again to the Device and accelerations up to 300x are to be expected. This algorithm features a collision detection algorithm suitable for use even when explosion simulation is taking place, in which case triangles may not share any vertexes, as opposed to regular models where triangles will almost certainly share vertexes. If your problem doesn't involve collision you may consider creating a nonrepeating list of lines to calculate collisions. Collision detection is a task that can be highly parallelized and thus benefits greatly from OpenCL and GPGPU calculations. If you need to detect highpoly exact collisions it would be a good idea to divide each model into a set number of bounding boxes to reduce the order of the algorithm from O(nm) (with n and m being the number of polygons in models 1 and 2, respectively) to lower values on average. Source code is presented to show the actual implementation of Collision Detection using OpenCL and OpenCLTemplate, as well as other collision detection systems. Download source code for this section.
