Support Vector Machines (SVMs) are a statistical learning technique [1] considered to be among the state-of-the-art classifiers for many applications today, including medical research [2] and text categorization [3].

SVM training and classification both depend on computing distances (kernels): first to train the SVM, and later to use the learned parameters to classify new data. These computations can be executed in parallel using OpenCL, which extends the usability of the technique to problems with large training sets and vectors comprising a large number of features.

As an example, an application has been developed to classify the MNIST handwritten digit database [5].

In essence, SVMs focus on the examples that are most difficult to classify and use them as a boundary for classification. More precisely, SVMs try to maximize the distance from the classification hyperplane to these examples, the so-called support vectors.
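In the usual SVM notation (see [4] for the full derivation, which this tutorial does not cover), the trained classifier evaluates a kernel expansion over the support vectors x_i, with labels y_i and learned weights a_i:

f(x) = sign( sum over i in SV of a_i * y_i * K(x_i, x) + b )

For an RBF kernel, as used by this library, K(x_i, x) = exp(-g * ||x_i - x||^2), where g is the exponential parameter tuned during training. These kernel evaluations dominate the computational cost of both training and classification, which is why accelerating them with OpenCL pays off.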

Further study material can be obtained from a wide variety of sources, and this tutorial does not intend to cover the mathematical aspects, which are covered in the Stanford Machine Learning course available online [4]. The authors (Douglas and Edmundo) would like to thank Professor Andrew Ng and Stanford University for the online Machine Learning course.

2. OpenCLTemplate.MachineLearning

The OpenCLTemplate.MachineLearning module comprises a prototype multiclass SVM which uses individual SVMs to classify sets of data into one or more categories. In order to use it, it is necessary to include the line

using OpenCLTemplate.MachineLearning;

Some internal classes are also exposed in case the reader wants to analyze them further, but the main class we will cover here is the MultiClassSVM.

3. Using the MultiClassSVM

3.1 Training

In order to train the MultiClassSVM it is necessary to create training units containing the feature vector and the desired classification. The classification should be the desired output, not the classical SVM 1 or -1 for positive and negative samples: MultiClassSVM converts these into 1 or -1 for each internal SVM as appropriate. Classifications of -1 are considered negative samples for all created SVMs. This is how to create training units:

float[] features = new float[] { 1, 2, 3 };
float classification = 4;
TrainingUnit u = new TrainingUnit(features, classification);

float[] features2 = new float[] { 1, 2, 3 };
float classification2 = 2;
TrainingUnit u2 = new TrainingUnit(features2, classification2);

float[] features3 = new float[] { 1, 2, 3 };
float classification3 = 1;
TrainingUnit u3 = new TrainingUnit(features3, classification3);

Now the training units should be added to a training set:

TrainingSet TSet = new TrainingSet();
TSet.addTrainingUnit(u);
TSet.addTrainingUnit(u2);
TSet.addTrainingUnit(u3);

And finally, to create the multi class SVM:

MultiClassSVM SVM = new MultiClassSVM(TSet);

The command above creates a multiclass SVM containing as many SVMs as necessary to classify the data. Keep in mind that using too many SVMs may be slow and memory-consuming. The constructor automatically tunes the exponential parameter of an RBF kernel, trains each SVM and discards non-support vectors. For more information about these algorithms, feel free to contact the authors by e-mail.
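To make the RBF kernel mentioned above concrete, a conceptual CPU version is sketched below. The name GaussianKernel and the parameter gamma are ours, for illustration only; they are not part of the OpenCLTemplate API, which evaluates these kernels in parallel OpenCL kernels instead.

// Conceptual RBF (Gaussian) kernel: K(x, z) = exp(-gamma * ||x - z||^2).
// The library evaluates this, in parallel, for many pairs of vectors.
static float GaussianKernel(float[] x, float[] z, float gamma)
{
    float dist2 = 0.0f;
    for (int i = 0; i < x.Length; i++)
    {
        float d = x[i] - z[i];
        dist2 += d * d;
    }
    return (float)Math.Exp(-gamma * dist2);
}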

3.2 Testing

There are two main tests to verify the accuracy of the multi class SVM:

- SVM.GetInternalHitRate(): computes the internal accuracy of each SVM separately and returns the average.

- SVM.GetHitRate(TrainingSet): classifies each sample from a training set, compares the classification with the correct value and outputs the classification hit rate.
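Assuming the SVM and TSet objects created in the previous section, a quick accuracy check might look like the sketch below. The float return type is our assumption from the method descriptions above, and in practice the set passed to GetHitRate should be an independent test set, not the training set reused here for brevity.

float internalHitRate = SVM.GetInternalHitRate();
float hitRate = SVM.GetHitRate(TSet);

Console.WriteLine("Internal hit rate: " + internalHitRate);
Console.WriteLine("Test hit rate: " + hitRate);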

3.3 Classifying

There are two possible classifications:

- When the sample to be classified is known to belong to one of the trained categories: SVM.Classify(TrainingUnit, out Value) classifies the sample and returns a Value that indicates how good the classification was. The bigger the value, the better; negative values indicate that the sample was not found to belong to any category.

- When the sample may not belong to any category: SVM.ClassifyWithRejection(TrainingUnit) classifies the sample, returning one of the trained classification values or -1 if the sample is not found to belong to any category.
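Using the training unit u from Section 3.1, the two calls might be used as sketched below. The exact signatures are inferred from the descriptions above (we assume both methods return the predicted classification as a float, with the confidence delivered through the out parameter); check the source code in this section for the definitive API.

float confidence;
float predicted = SVM.Classify(u, out confidence);

float predictedOrRejected = SVM.ClassifyWithRejection(u);

Console.WriteLine("Predicted: " + predicted + " (confidence " + confidence + ")");
Console.WriteLine("With rejection: " + predictedOrRejected);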

4. Example: MNIST handwritten dataset

As an example, an application has been written to classify the MNIST handwritten dataset [5]. On my system, training the MultiClassSVM with 3000 training examples, using all 10 digits for multiclass classification, takes about 30 s (Radeon 5770). Compared to the CPU time of around 5 minutes (Phenom II 2.6 GHz), this represents an acceleration of roughly 10x.

The screenshot below shows what this sample looks like.

The 93% accuracy is, of course, not close to the best results obtained for the MNIST handwritten database. Nonetheless, this is a rather complicated classification problem and MultiClassSVM handles it quite easily. Checking the accuracy with the test set involves a very large number of kernel computations and can be somewhat time-consuming. Training with 30000 MNIST samples took only 11 minutes, and remember that the SVMs for all 10 digits were trained in this time.

If you are interested in more details, check the source code provided in this section.

5. Conclusion

An OpenCL-accelerated multiclass SVM classification system is now available in the OpenCLTemplate.MachineLearning namespace. The MultiClassSVM class is trained using a given training set and provides the GetHitRate() and GetInternalHitRate() methods for accuracy evaluation, and the Classify()/ClassifyWithRejection() methods to classify unknown examples.

The algorithm focuses on accelerating kernel computations and manages to accelerate training by about 10x. The accuracy achieved on the MNIST dataset (93% with 30k training examples) is very reasonable for an algorithm that runs fast and has not been specifically tuned for the task.

[1] VAPNIK, V. Statistical Learning Theory. New York: Wiley, 1998.

[2] EL-NAQA, I., YANG, Y., WERNICK, M. N., GALATSANOS, N. P. and NISHIKAWA, R. M. A Support Vector Machine Approach for Detection of Microcalcifications. IEEE Transactions on Medical Imaging, Vol. 21, No. 12, December 2002.

[3] HUSSAIN, Syed Fawad, BISSON, Gilles. Text Categorization Using Word Similarities Based on Higher Co-occurrences. SIAM

[4] NG, A. Machine Learning. Stanford University online course.

[5] LECUN, Y., BOTTOU, L., BENGIO, Y., and HAFFNER, P. "Gradient-based learning applied to document recognition." Proceedings of the IEEE, 86(11):2278-2324, November 1998.