New for: D2, D3
Multiple Kernel Learning (MKL) in the context of Support Vector Machines refers to the problem of learning both the SVM parameters and the kernel matrix from the training data, rather than using a pre-specified kernel. Various techniques have been proposed to solve the MKL problem, including casting it as a semidefinite program [4] and, because the dual optimization problem is non-differentiable, solving an intermediate saddle-point problem [5]. The recent approach in [6] replaces the standard L1 regularization with a general Lp-norm (p > 1) formulation, which yields a dual that is differentiable with respect to the dual variable α. This allows the use of the Sequential Minimal Optimization (SMO) algorithm, leading to a significant speedup in training the SVM.
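To make the MKL setup concrete, the sketch below forms a combined kernel K = Σ_k d_k K_k from a set of base kernels, with the weight vector d constrained to the Lp-norm unit ball. This is only an illustrative stand-in (the weight projection and toy data are my own, not the formulation in [6]):

```python
import numpy as np

def combine_kernels(kernels, d, p=2.0):
    """Form the MKL kernel K = sum_k d_k * K_k, with the weight vector d
    clipped to be non-negative and rescaled onto the Lp unit ball
    (||d||_p <= 1). `kernels` is a list of (n, n) base kernel matrices."""
    d = np.maximum(np.asarray(d, dtype=float), 0.0)  # enforce d_k >= 0
    norm = np.sum(d ** p) ** (1.0 / p)
    if norm > 1.0:                                   # rescale onto the Lp ball
        d = d / norm
    return sum(dk * Kk for dk, Kk in zip(d, kernels))

# toy example: two RBF-style base kernels on random data
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
K1, K2 = np.exp(-sq), np.exp(-0.1 * sq)
# ||(3, 4)||_2 = 5 > 1, so the weights are rescaled to (0.6, 0.8)
K = combine_kernels([K1, K2], d=[3.0, 4.0], p=2.0)
```

The combined matrix stays positive semidefinite because it is a non-negative combination of positive semidefinite base kernels, so it can be handed directly to any standard SVM solver.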
As part of my thesis work, I have worked towards computing the regularization path for Lp-norm MKL. This includes completing the mathematical derivations for the following:
1. Evaluating the initial values of the dual variable α and the decision function for large values of the regularization parameter λ.
2. Evaluating the value of λ at the first breakpoint as λ is decreased from large values, along with the other SVM parameters at that point.
3. The iteration step, which, given the current breakpoint, computes the next breakpoint by solving a set of non-linear equations.
4. Repeating the above three steps while varying the coefficient of the p-norm regularizer on the kernel weights.
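The overall structure of the path computation can be sketched as a warm-started continuation loop: initialize at a large λ, then repeatedly solve an optimality condition starting from the previous solution as λ decreases. The scalar equation below is a hypothetical toy stand-in (the actual MKL step solves a coupled system of non-linear equations in α and the kernel weights):

```python
import numpy as np

def newton_solve(f, fprime, x0, tol=1e-10, max_iter=50):
    """Newton's method for a scalar non-linear equation f(x) = 0."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

def follow_path(lambdas, alpha_init):
    """Schematic path-following loop: start from the solution at the
    largest lambda and warm-start each subsequent solve from the
    previous one. The toy optimality condition a + lam * a^3 - 1 = 0
    stands in for the real system of non-linear equations."""
    alphas, alpha = [], alpha_init
    for lam in lambdas:
        f = lambda a, lam=lam: a + lam * a**3 - 1.0
        fp = lambda a, lam=lam: 1.0 + 3.0 * lam * a**2
        alpha = newton_solve(f, fp, alpha)
        alphas.append(alpha)
    return np.array(alphas)

lams = np.linspace(100.0, 1.0, 25)         # decrease lambda along the path
path = follow_path(lams, alpha_init=0.2)   # alpha grows as lambda shrinks
```

Warm-starting is what makes path-following cheap: each solve begins very close to its root, so Newton's method converges in a handful of iterations per λ value.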
Experiments on the Sonar dataset (208 points, 60 features each), using one kernel per feature, Kk(xi, xj) = exp(-(xik - xjk)^2), show that αi for most of the points increases linearly for small values of λ (< 40), then exhibits piecewise non-linear behavior, and eventually becomes close to 1 for large values of λ. The kernel weights dk remain constant for small values of λ and then decrease towards zero for large values.
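The per-feature kernel construction used in this experiment can be sketched as follows; the synthetic data here is a small stand-in for Sonar's (208, 60) matrix:

```python
import numpy as np

def per_feature_kernels(X):
    """One Gaussian kernel per feature: K_k[i, j] = exp(-(x_ik - x_jk)^2),
    giving one base kernel per input dimension (60 for Sonar)."""
    diffs = X[:, None, :] - X[None, :, :]          # (n, n, d) pairwise differences
    return np.exp(-diffs ** 2).transpose(2, 0, 1)  # (d, n, n): one kernel per feature

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 4))    # synthetic stand-in for the Sonar data
Ks = per_feature_kernels(X)        # stack of 4 base kernel matrices
```

Each K_k is symmetric with unit diagonal and entries in (0, 1], so the stack can be fed directly into an MKL solver that learns the weights dk.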