The modern theory of machine learning, rooted in the statistical learning theory of Vapnik and Chervonenkis and the PAC-learning framework of Valiant, places special value on the generalization ability of learning algorithms. Many recent learning algorithms with provable error bounds - such as Support Vector Machines (SVMs) and other kernel methods - amount to difficult optimization problems. The key to designing efficient learning algorithms, as I will argue in this talk, lies in exploiting the special structure of these optimization problems. To illustrate typical problems arising in the development of learning algorithms, three useful design techniques will be presented.

Decomposition was (and remains) the first successful approach to efficient SVM training. Different flavors of decomposition lie at the heart of most currently available SVM implementations, most notably SVM^light and SMO. I will present the feasible-direction decomposition approach, which provides a unified treatment of classification and regression SVMs.

Perturbation expansion is a classical technique of mathematical physics, used extensively for solving linear operator eigenvalue problems. I will present an application of this technique to the design of the FFDIAG algorithm for simultaneous matrix diagonalization, an optimization problem widely used in blind source separation and common principal component analysis. FFDIAG is currently the only quadratically convergent algorithm for unconstrained simultaneous diagonalization problems.

Incremental learning addresses an important learning scenario in which data arrives sequentially rather than in a single batch, and is crucial for online anomaly detection. As an example application, a quarter-sphere SVM for network intrusion detection will be presented. An important feature of this algorithm is the interpretability of its predictions, which can be used to understand the features of previously unseen attacks.
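The decomposition idea can be made concrete with a minimal sketch in the spirit of Platt's simplified SMO: the extreme case of decomposition that optimizes only two dual variables analytically per step while all others stay fixed. Everything below (the toy data, the random working-set selection, the parameter choices) is illustrative only and is not the feasible-direction method discussed in the talk:

```python
import numpy as np

def smo_train(X, y, C=1.0, tol=1e-4, max_passes=20):
    # Simplified SMO (after Platt): decompose the SVM dual into
    # two-variable subproblems with a closed-form solution, holding
    # all remaining dual variables fixed.
    n = len(y)
    alpha = np.zeros(n)
    b = 0.0
    K = X @ X.T  # linear kernel matrix
    rng = np.random.default_rng(0)
    passes = 0
    while passes < max_passes:
        changed = 0
        for i in range(n):
            Ei = (alpha * y) @ K[:, i] + b - y[i]
            # Pick i only if it violates the KKT conditions.
            if (y[i] * Ei < -tol and alpha[i] < C) or (y[i] * Ei > tol and alpha[i] > 0):
                j = rng.integers(n - 1)
                j = j if j < i else j + 1  # random second variable, j != i
                Ej = (alpha * y) @ K[:, j] + b - y[j]
                ai_old, aj_old = alpha[i], alpha[j]
                # Box constraints for alpha[j] induced by the equality constraint.
                if y[i] != y[j]:
                    L, H = max(0, aj_old - ai_old), min(C, C + aj_old - ai_old)
                else:
                    L, H = max(0, ai_old + aj_old - C), min(C, ai_old + aj_old)
                if L == H:
                    continue
                eta = 2 * K[i, j] - K[i, i] - K[j, j]  # negative curvature term
                if eta >= 0:
                    continue
                alpha[j] = np.clip(aj_old - y[j] * (Ei - Ej) / eta, L, H)
                if abs(alpha[j] - aj_old) < 1e-5:
                    continue
                alpha[i] = ai_old + y[i] * y[j] * (aj_old - alpha[j])
                # Update the bias from whichever new alpha is strictly inside the box.
                b1 = b - Ei - y[i] * (alpha[i] - ai_old) * K[i, i] - y[j] * (alpha[j] - aj_old) * K[i, j]
                b2 = b - Ej - y[i] * (alpha[i] - ai_old) * K[i, j] - y[j] * (alpha[j] - aj_old) * K[j, j]
                b = b1 if 0 < alpha[i] < C else b2 if 0 < alpha[j] < C else (b1 + b2) / 2
                changed += 1
        passes = passes + 1 if changed == 0 else 0
    w = (alpha * y) @ X  # recover the primal weight vector (linear kernel only)
    return w, b

# Toy linearly separable data.
X = np.array([[2.0, 2.0], [2.5, 3.0], [3.0, 2.5], [-2.0, -2.0], [-2.5, -3.0], [-3.0, -2.5]])
y = np.array([1, 1, 1, -1, -1, -1])
w, b = smo_train(X, y)
preds = np.sign(X @ w + b)
```

The special structure being exploited is exactly what the talk emphasizes: the dual's single equality constraint lets a two-variable subproblem be solved in closed form, so no general-purpose QP solver is ever invoked.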
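FFDIAG itself uses a non-orthogonal multiplicative update derived from a first-order perturbation expansion; reproducing its exact update here would be guesswork. As a simpler, plainly swapped-in illustration of the simultaneous diagonalization objective, here is a Jacobi-rotation scheme in the style of Cardoso and Souloumiac, restricted to orthogonal transforms; the demo data and tolerances are assumptions for illustration:

```python
import numpy as np

def joint_diag(Cs, sweeps=20):
    # Orthogonal joint diagonalization by Jacobi rotations: for each index
    # pair (i, j), apply the closed-form rotation angle that minimizes the
    # summed squared off-diagonal energy at (i, j) over all matrices.
    Cs = [C.copy() for C in Cs]
    n = Cs[0].shape[0]
    V = np.eye(n)
    for _ in range(sweeps):
        for i in range(n - 1):
            for j in range(i + 1, n):
                h = np.array([[C[i, i] - C[j, j], 2 * C[i, j]] for C in Cs])
                G = h.T @ h
                ton, toff = G[0, 0] - G[1, 1], G[0, 1] + G[1, 0]
                # Numerically stable half-angle formula for the minimizer.
                theta = 0.5 * np.arctan2(toff, ton + np.hypot(ton, toff))
                c, s = np.cos(theta), np.sin(theta)
                R = np.eye(n)
                R[i, i], R[j, j], R[i, j], R[j, i] = c, c, -s, s
                V = R.T @ V                      # accumulate the transform
                Cs = [R.T @ C @ R for C in Cs]   # rotate every matrix jointly
    return V, Cs  # V @ C @ V.T is (approximately) diagonal for each input C

# Demo: matrices sharing a common eigenbasis are exactly joint-diagonalizable.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
Cs_in = [Q @ np.diag(rng.normal(size=4)) @ Q.T for _ in range(3)]
V, Ds = joint_diag(Cs_in)
off = max(np.abs(D - np.diag(np.diag(D))).max() for D in Ds)
```

For a single matrix the angle formula reduces to the classical Jacobi eigenvalue rotation, which is the link back to the linear operator eigenvalue problems mentioned above.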
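The incremental scenario can be sketched with a toy version of the quarter-sphere idea under a linear kernel: the anomaly score of a point is its distance from the origin, and the detection radius is an empirical quantile of the scores seen so far. The class name, the quantile-based threshold, and the simulated data are all assumptions for illustration; the talk's LP formulation and kernelization are omitted:

```python
import numpy as np

class QuarterSphereDetector:
    # Toy quarter-sphere anomaly detector (linear kernel, illustrative only):
    # normal points fall inside a sphere centered at the origin, and the
    # incremental learning step is simply absorbing one new score at a time.

    def __init__(self, nu=0.05):
        self.nu = nu       # target fraction of points outside the sphere
        self.scores = []   # distances from the origin seen so far

    def update(self, x):
        # Incremental step: incorporate a single new observation.
        self.scores.append(float(np.linalg.norm(x)))

    def radius(self):
        # Radius = empirical (1 - nu)-quantile of the scores so far.
        return float(np.quantile(self.scores, 1.0 - self.nu))

    def is_anomaly(self, x):
        return float(np.linalg.norm(x)) > self.radius()

# Simulated nonnegative traffic features cluster near the origin;
# a far-away point is flagged, a nearby one is not.
rng = np.random.default_rng(0)
det = QuarterSphereDetector(nu=0.05)
for x in np.abs(rng.normal(size=(200, 3))):
    det.update(x)
flag_far = det.is_anomaly(np.array([10.0, 10.0, 10.0]))
flag_near = det.is_anomaly(np.array([0.1, 0.1, 0.1]))
```

A score that decomposes over input features is also what makes per-feature interpretation of a flagged connection possible, which is the property the abstract highlights for analyzing unseen attacks.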