Informatica

A Support Vector Machine for Regression in Complex Field
Volume 28, Issue 4 (2017), pp. 651–664
Rongling Lang, Fei Zhao, Yongtang Shi

https://doi.org/10.15388/Informatica.2017.150
Published online: 1 January 2017. Type: Research Article. Open Access.

Received: 1 September 2016
Accepted: 1 November 2017
Published: 1 January 2017

Abstract

In this paper, a method for training the Support Vector Regression (SVR) machine in the complex domain is presented, which takes into account the information of both the real and imaginary parts simultaneously. Compared to existing methods, it not only considers the geometric information of the complex-valued data, but can also be trained with the same amount of computation as the original real-valued SVR. The accuracy of the proposed method is analysed through simulation experiments, and the method is applied successfully to anti-interference for satellite navigation, which demonstrates its effectiveness in practical applications.

1 Introduction

The Support Vector Machine (SVM) is a high-performance machine learning algorithm introduced by Vapnik in the early 1990s (Ramón and Christodoulou, 2006). SVMs have proved to perform excellently on non-linear regression problems owing to their robustness against noise and interference. Among SVM applications, an important model is the Support Vector Regression (SVR) machine (Ramón and Christodoulou, 2006).
The original SVR is designed for real-valued data. Complex-valued SVRs are, however, desired in fields such as beamforming (Ramón et al., 2005; Gaudes et al., 2007) and pattern recognition (Scholkopf and Smola, 2002; Chen and Xie, 2005). The usual approach to the complex-valued SVR model is to process the real and imaginary parts separately (Bouboulis and Theodoridis, 2011), which discards the cross-information between the real and imaginary parts of a complex number.
Kernel-based methods are currently employed to solve nonlinear SVR tasks; the key tool is the theory of Reproducing Kernel Hilbert Spaces (RKHS). Wirtinger's calculus on complex RKHS was developed in Bouboulis et al. (2013, 2015), where it was also proved that a complex SVM/SVR task is equivalent to two real SVM/SVR tasks exploiting a specific real kernel. In Bouboulis et al. (2012), the authors showed how to use RKHS to solve the problem in the complex domain. Different from the above methods, we propose a model that not only considers the geometric information of the complex-valued data, but can also be trained with the same amount of computation as the original real-valued SVR. Moreover, the proposed method places no additional requirements on the kernel function beyond those of the original real-valued SVR.
The paper is organized as follows. Section 2 presents the main contributions: the complex SVR model and the training algorithms for it are proposed in detail. Section 3 reports experiments that demonstrate the performance of the proposed method and applies it to the anti-jamming field. Finally, Section 4 concludes the paper.

2 Complex-Valued SVR Model

In this section, a complex SVR model and the corresponding solution algorithms are proposed. First, the primal problem of the complex-valued SVR is introduced. Second, the dual problem is derived. Third, the Karush–Kuhn–Tucker (KKT) conditions are obtained. Finally, the solution of the primal problem is obtained by solving the dual problem.
The following notation is used in this paper. The superscripts ${^{H}}$ and ${^{T}}$ denote Hermitian transposition and transposition, respectively. $\| \cdot \| $ is the Euclidean vector norm. $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$ denote the real and imaginary parts of a complex number, respectively. $|\cdot |$ denotes the modulus of a complex number; when the imaginary part is zero, it reduces to the absolute value.

2.1 Model Building

The real-valued SVR can be obtained by solving the following optimization problem (Ramón and Christodoulou, 2006):
(1)
\[ \underset{w}{\min }\frac{1}{2}\| w{\| ^{2}}+P{\sum \limits_{i=1}^{l}}({\xi _{i}}+{\xi ^{\prime }_{i}}).\]
Subject to the constraints
(2)
\[\begin{array}{l}\displaystyle {y_{i}}-\big({w^{T}}{x_{i}}+b\big)\leqslant \varepsilon +{\xi _{i}},\\ {} \displaystyle \big({w^{T}}{x_{i}}+b\big)-{y_{i}}\leqslant \varepsilon +{{\xi _{i}}^{\prime }},\\ {} \displaystyle {\xi _{i}},{{\xi _{i}}^{\prime }}\geqslant 0\end{array}\]
for $i=1,2,\dots ,l$, where $\{({x_{1}},{y_{1}}),\dots ,({x_{l}},{y_{l}})\}$ is the training set with ${x_{i}}\in {R^{n}}$, $w\in {R^{n}}$, ${y_{i}}\in R$. P is the penalty factor, ε is the insensitive loss factor, and ${\xi _{i}}$ and ${\xi ^{\prime }_{i}}$ are the slack variables of the output; P, ε, ${\xi _{i}}$ and ${\xi ^{\prime }_{i}}$ are all real numbers. The optimal value of (1) can be obtained by solving its dual problem. Following the principle of the real-valued SVR, an SVR model in the complex domain is now built and the corresponding solution method is proposed. To simplify the notation, ‘subject to the constraints’ is abbreviated as ‘$\text{s.t.}$’. The corresponding complex SVR task is formulated as
(3)
\[\begin{aligned}{}\min & \frac{1}{2}{\| w\| ^{2}}+{C_{1}}{\sum \limits_{i=1}^{l}}({\xi _{i}}+{{\xi _{i}}^{\prime }})+{C_{2}}{\sum \limits_{i=1}^{l}}({\eta _{i}}+{{\eta _{i}}^{\prime }}),\\ {} & \operatorname{Re}\big({y_{i}}-\big({w^{T}}{x_{i}}+b\big)\big)\leqslant \varepsilon +{\xi _{i}},\\ {} & \operatorname{Re}\big(\big({w^{T}}{x_{i}}+b\big)-{y_{i}}\big)\leqslant \varepsilon +{{\xi _{i}}^{\prime }},\\ {} \text{s.t.}& \operatorname{Im}\big({y_{i}}-\big({w^{T}}{x_{i}}+b\big)\big)\leqslant \varepsilon +{\eta _{i}},\hspace{1em}i=1,2,\dots ,l,\\ {} & \operatorname{Im}\big(\big({w^{T}}{x_{i}}+b\big)-{y_{i}}\big)\leqslant \varepsilon +{\eta ^{\prime }_{i}},\\ {} & {\xi _{i}},{\xi ^{\prime }_{i}},{\eta _{i}},{\eta ^{\prime }_{i}}\geqslant 0\end{aligned}\]
where ${x_{i}}\in {C^{n}}$, $w\in {C^{n}}$, ${y_{i}}\in C$. ${C_{1}}$ and ${C_{2}}$ are the penalty factors for the real and imaginary parts, respectively, and ε is the insensitive loss factor. ${\xi _{i}}$ and ${{\xi _{i}}^{\prime }}$ are the slack variables for the real part of the output, and ${\eta _{i}}$ and ${{\eta _{i}}^{\prime }}$ are the corresponding slack variables for the imaginary part. ${C_{1}}$, ${C_{2}}$, ε, ${\xi _{i}}$, ${{\xi _{i}}^{\prime }}$, ${\eta _{i}}$ and ${{\eta _{i}}^{\prime }}$ are all real numbers. Formula (3) is a complex-valued quadratic programming problem, which can likewise be solved via its dual problem.
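To make the role of the four groups of slack variables concrete, the following minimal Python sketch (an illustration with assumed variable names, not the authors' implementation) evaluates the primal objective of (3) for a fixed complex linear predictor $(w,b)$; for fixed $(w,b)$, each optimal slack variable is simply the positive part of the corresponding ε-insensitive violation of the real or imaginary residual.

```python
# Illustrative sketch (not the authors' code): evaluate the primal objective of (3)
# for a fixed complex linear predictor (w, b) on complex training data.
import numpy as np

def complex_svr_primal(w, b, X, y, C1, C2, eps):
    """X: (l, n) complex inputs, y: (l,) complex targets, w: (n,) complex, b: complex scalar."""
    r = y - (X @ w + b)                     # residuals y_i - (w^T x_i + b)
    xi    = np.maximum(0.0,  r.real - eps)  # xi_i   : real-part violation above +eps
    xi_p  = np.maximum(0.0, -r.real - eps)  # xi_i'  : real-part violation below -eps
    eta   = np.maximum(0.0,  r.imag - eps)  # eta_i  : imaginary-part violation above +eps
    eta_p = np.maximum(0.0, -r.imag - eps)  # eta_i' : imaginary-part violation below -eps
    return (0.5 * np.linalg.norm(w) ** 2
            + C1 * np.sum(xi + xi_p)
            + C2 * np.sum(eta + eta_p))
```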

2.2 Dual Problem

The Lagrange function of the equation (3) can be expressed as
(4)
\[\begin{aligned}{}Lp=& \frac{1}{2}{\| w\| ^{2}}+{C_{1}}{\sum \limits_{i=1}^{l}}({\xi _{i}}+{{\xi _{i}}^{\prime }})+{C_{2}}{\sum \limits_{i=1}^{l}}({\eta _{i}}+{{\eta _{i}}^{\prime }})\\ {} & -{\sum \limits_{i=1}^{l}}({\gamma _{i}}{\xi _{i}}+{{\gamma _{i}}^{\prime }}{{\xi _{i}}^{\prime }})-{\sum \limits_{i=1}^{l}}({\kappa _{i}}{\eta _{i}}+{{\kappa _{i}}^{\prime }}{{\eta _{i}}^{\prime }})\\ {} & +{\sum \limits_{i=1}^{l}}{\alpha _{i}}\big[\operatorname{Re}\big({y_{i}}-{w^{T}}{x_{i}}-b\big)-\varepsilon -{\xi _{i}}\big]\\ {} & +{\sum \limits_{i=1}^{l}}{\alpha _{{_{i}}}^{\ast }}\big[\operatorname{Re}\big(-{y_{i}}+{w^{T}}{x_{i}}+b\big)-\varepsilon -{{\xi _{i}}^{\prime }}\big]\\ {} & +{\sum \limits_{i=1}^{l}}{\beta _{{_{i}}}}\big[\operatorname{Im}\big({y_{i}}-{w^{T}}{x_{i}}-b\big)-\varepsilon -{\eta _{i}}\big]\\ {} & +{\sum \limits_{i=1}^{l}}{\beta _{{_{i}}}^{\ast }}\big[\operatorname{Im}\big(-{y_{i}}+{w^{T}}{x_{i}}+b\big)-\varepsilon -{{\eta _{i}}^{\prime }}\big],\end{aligned}\]
where ${\alpha _{i}}$, ${\alpha _{i}^{\ast }}$, ${\beta _{i}}$, ${\beta _{i}^{\ast }}$, ${\gamma _{i}}$, ${\gamma ^{\prime }_{i}}$, ${\kappa _{i}}$ and ${\kappa ^{\prime }_{i}}$ are the Lagrange multipliers. According to the Wolfe dual (Brandwood, 1983; Huang and Zhang, 2007), the partial derivatives with respect to the complex variables are:
\[\begin{array}{l}\displaystyle {\nabla _{w}}(Lp)=w-{\sum \limits_{i=1}^{l}}{\phi _{i}}{x_{i}},\\ {} \displaystyle {\nabla _{b}}(Lp)=-{\sum \limits_{i=1}^{l}}{\alpha _{i}}+{\sum \limits_{i=1}^{l}}{\alpha _{i}^{\ast }}-j{\sum \limits_{i=1}^{l}}{\beta _{i}}+j{\sum \limits_{i=1}^{l}}{\beta _{i}^{\ast }}.\end{array}\]
For real variables, the gradients can be obtained in the traditional way:
\[\begin{array}{l}\displaystyle {\nabla _{{\xi _{i}}}}(Lp)={C_{1}}-{\gamma _{i}}-{\alpha _{i}},\hspace{1em}{\nabla _{{{\xi _{i}}^{\prime }}}}(Lp)={C_{1}}-{{\gamma _{i}}^{\prime }}-{\alpha _{i}^{\ast }},\\ {} \displaystyle {\nabla _{{\eta _{i}}}}(Lp)={C_{2}}-{\kappa _{i}}-{\beta _{i}},\hspace{1em}{\nabla _{{{\eta _{i}}^{\prime }}}}(Lp)={C_{2}}-{{\kappa _{i}}^{\prime }}-{\beta _{i}^{\ast }},\end{array}\]
where ${\phi _{i}}=({\alpha _{i}}-{\alpha _{i}^{\ast }})+j({\beta _{i}}-{\beta _{i}^{\ast }})$ for all $i=1,2,\dots ,l$. Setting all partial derivatives to zero (the saddle-point conditions) gives:
(5)
\[ w={\sum \limits_{i=1}^{l}}\big[\big({\alpha _{i}}-{\alpha _{i}^{\ast }}\big)+j\big({\beta _{i}}-{\beta _{i}^{\ast }}\big)\big]{x_{i}},\]
(6)
\[ {\sum \limits_{i=1}^{l}}\big({\alpha _{i}}-{\alpha _{i}^{\ast }}\big)={\sum \limits_{i=1}^{l}}\big({\beta _{i}}-{\beta _{i}^{\ast }}\big)=0,\]
(7)
\[ {C_{1}}-{\gamma _{i}}-{\alpha _{i}}=0,\hspace{1em}{C_{1}}-{{\gamma _{i}}^{\prime }}-{\alpha _{i}^{\ast }}=0,\]
(8)
\[ {C_{2}}-{\kappa _{i}}-{\beta _{i}}=0,\hspace{1em}{C_{2}}-{{\kappa _{i}}^{\prime }}-{\beta _{i}^{\ast }}=0,\]
for $i=1,2,\dots ,l$. Let ${\lambda _{i}}={\alpha _{i}}-{\alpha _{i}^{\ast }}$, ${\varphi _{i}}={\beta _{i}}-{\beta _{i}^{\ast }}$, then
(9)
\[ {\phi _{i}}={\lambda _{i}}+j{\varphi _{i}}.\]
By substituting equations (5)–(9) into (4), the dual problem is obtained. It can be stated as:
(10)
\[\begin{array}{l}\displaystyle \min \frac{1}{2}{\sum \limits_{i=1}^{l}}{\sum \limits_{j=1}^{l}}{\phi _{i}^{H}}{\phi _{j}}({x_{i}},{x_{j}})+\varepsilon {\sum \limits_{i=1}^{l}}\big(|{\lambda _{i}}|+|{\varphi _{i}}|\big)-\operatorname{Re}\bigg[{\sum \limits_{i=1}^{l}}{\phi _{i}^{H}}{y_{i}}\bigg],\\ {} \displaystyle \text{s.t.}\hspace{2.5pt}\left\{\begin{array}{l}{\textstyle\textstyle\sum _{i=1}^{l}}{\phi _{i}}=0,\\ {} {\alpha _{i}},{\alpha _{i}^{\ast }},{\beta _{i}},{\beta _{i}^{\ast }}\geqslant 0,\end{array}\right.\hspace{1em}i=1,2,\dots ,l.\end{array}\]
By introducing a kernel function $K({x_{i}},{x_{j}})$: $C\times C\to R$, the dual problem can be expressed as:
(11)
\[\begin{array}{l}\displaystyle \min \frac{1}{2}{\sum \limits_{i=1}^{l}}{\sum \limits_{j=1}^{l}}{\phi _{i}^{H}}{\phi _{j}}K({x_{i}},{x_{j}})+\varepsilon {\sum \limits_{i=1}^{l}}\big(|{\lambda _{i}}|+|{\varphi _{i}}|\big)-\operatorname{Re}\bigg[{\sum \limits_{i=1}^{l}}{\phi _{i}^{H}}{y_{i}}\bigg],\\ {} \displaystyle \text{s.t.}\hspace{2.5pt}\left\{\begin{array}{l}{\textstyle\textstyle\sum _{i=1}^{l}}{\phi _{i}}=0,\\ {} {\alpha _{i}},{\alpha _{i}^{\ast }},{\beta _{i}},{\beta _{i}^{\ast }}\geqslant 0,\end{array}\right.\hspace{1em}i=1,2,\dots ,l.\end{array}\]
The objective function can be expressed in terms of ${\lambda _{i}}$ and ${\varphi _{i}}$ as follows:
(12)
\[\begin{array}{l}\displaystyle \frac{1}{2}{\sum \limits_{i=1}^{l}}{\sum \limits_{j=1}^{l}}{\phi _{i}^{H}}{\phi _{j}}K({x_{i}},{x_{j}})+\varepsilon {\sum \limits_{i=1}^{l}}\big(|{\lambda _{i}}|+|{\varphi _{i}}|\big)-\operatorname{Re}\bigg[{\sum \limits_{i=1}^{l}}{\phi _{i}^{H}}{y_{i}}\bigg]\\ {} \displaystyle \hspace{1em}=\frac{1}{2}{\sum \limits_{i=1}^{l}}{\sum \limits_{j=1}^{l}}({\lambda _{i}}{\lambda _{j}}+{\varphi _{i}}{\varphi _{j}})K({x_{i}},{x_{j}})+\varepsilon {\sum \limits_{i=1}^{l}}\big(|{\lambda _{i}}|+|{\varphi _{i}}|\big)-\operatorname{Re}\bigg[{\sum \limits_{i=1}^{l}}{\lambda _{i}}{y_{i}}\bigg]\\ {} \displaystyle \hspace{2em}-\operatorname{Im}\bigg[{\sum \limits_{i=1}^{l}}{\varphi _{i}}{y_{i}}\bigg].\end{array}\]
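As an illustration only, the right-hand side of (12) can be evaluated directly for given real coefficient vectors; in the sketch below, lam and vphi stand for ${\lambda _{i}}$ and ${\varphi _{i}}$, K is a precomputed $l\times l$ kernel matrix and y the complex targets (names are assumptions, not the paper's code).

```python
# Illustrative evaluation of the dual objective (12).
import numpy as np

def dual_objective(lam, vphi, K, y, eps):
    quad = 0.5 * (lam @ K @ lam + vphi @ K @ vphi)   # 1/2 sum_ij (lam_i lam_j + vphi_i vphi_j) K_ij
    reg  = eps * np.sum(np.abs(lam) + np.abs(vphi))  # eps * sum_i (|lam_i| + |vphi_i|)
    lin  = lam @ y.real + vphi @ y.imag              # Re[sum lam_i y_i] + Im[sum vphi_i y_i]
    return quad + reg - lin
```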
Assume that the optimal solution of the dual problem (11) is ${\bar{\alpha }^{(\ast )}}={({\bar{\alpha }_{1}},{\bar{\alpha }_{{_{1}}}^{\ast }},\dots ,{\bar{\alpha }_{l}},{\bar{\alpha }_{l}^{\ast }})^{T}}$ and ${\bar{\beta }^{(\ast )}}={({\bar{\beta }_{1}},{\bar{\beta }_{{_{1}}}^{\ast }},\dots ,{\bar{\beta }_{l}},{\bar{\beta }_{l}^{\ast }})^{T}}$. The optimal solution of the primal problem can be obtained:
(13)
\[ w={\sum \limits_{i=1}^{l}}\big[\big({\bar{\alpha }_{i}}-{\bar{\alpha }_{i}^{\ast }}\big)+j\big({\bar{\beta }_{i}}-{\bar{\beta }_{i}^{\ast }}\big)\big]{x_{i}}.\]
If a training point is a support vector on the boundary, the threshold $\bar{b}$ can be computed as:
(14)
\[ \bar{b}={y_{j}}-{\sum \limits_{i=1}^{l}}{\phi _{i}}K({x_{i}},{x_{j}})+(\varepsilon +j\varepsilon ).\]
Finally, the support vector regression function in the complex domain is obtained:
(15)
\[ f(x)={\sum \limits_{i=1}^{l}}{\phi _{i}}K({x_{i}},x)+\bar{b}.\]
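The predictor (15) is a kernel expansion with complex coefficients ${\phi _{i}}={\lambda _{i}}+j{\varphi _{i}}$. A minimal sketch, with illustrative names and any real-valued kernel function passed in:

```python
# Illustrative implementation of the regression function (15):
# f(x) = sum_i phi_i * K(x_i, x) + b_bar, with phi_i = lam_i + j*vphi_i from (9).
import numpy as np

def predict(x_new, X_train, lam, vphi, b_bar, kernel):
    phi = lam + 1j * vphi                                  # complex coefficients phi_i
    k = np.array([kernel(x_i, x_new) for x_i in X_train])  # real-valued kernel evaluations K(x_i, x)
    return np.dot(phi, k) + b_bar                          # complex prediction f(x)
```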

2.3 KKT Conditions

Assume that the solution of formula (11) is ${\bar{\alpha }^{(\ast )}}$ and ${\bar{\beta }^{(\ast )}}$; then the KKT conditions of (3) are given in formula (16):
(16)
\[ \left\{\begin{array}{l}\operatorname{Re}\bigg({y_{i}}-\bigg({\textstyle\textstyle\sum _{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})+b\bigg)\bigg)\leqslant \varepsilon +{\xi _{i}},\\ {} \operatorname{Re}\bigg(\bigg({\textstyle\textstyle\sum _{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})+b\bigg)-{y_{i}}\bigg)\leqslant \varepsilon +{{\xi _{i}}^{\prime }},\\ {} \operatorname{Im}\bigg({y_{i}}-\bigg({\textstyle\textstyle\sum _{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})+b\bigg)\bigg)\leqslant \varepsilon +{\eta _{i}},\\ {} \operatorname{Im}\bigg(\bigg({\textstyle\textstyle\sum _{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})+b\bigg)-{y_{i}}\bigg)\leqslant \varepsilon +{{\eta _{i}}^{\prime }},\\ {} {\alpha _{i}}\bigg\{\varepsilon +{\xi _{i}}-\operatorname{Re}\bigg[{y_{i}}-{\textstyle\textstyle\sum _{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})-b\bigg]\bigg\}=0,\\ {} {\alpha _{{_{i}}}^{\ast }}\bigg\{\varepsilon +{{\xi _{i}}^{\prime }}-\operatorname{Re}\bigg[-{y_{i}}+{\textstyle\textstyle\sum _{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})+b\bigg]\bigg\}=0,\\ {} {\beta _{i}}\bigg\{\varepsilon +{\eta _{i}}-\operatorname{Im}\bigg[{y_{i}}-{\textstyle\textstyle\sum _{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})-b\bigg]\bigg\}=0,\\ {} {\beta _{i}^{\ast }}\bigg\{\varepsilon +{{\eta _{i}}^{\prime }}-\operatorname{Im}\bigg[-{y_{i}}+{\textstyle\textstyle\sum _{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})+b\bigg]\bigg\}=0,\\ {} ({C_{1}}-{\alpha _{i}}){\xi _{i}}=0,\hspace{1em}\big({C_{1}}-{\alpha _{i}^{\ast }}\big){{\xi _{i}}^{\prime }}=0,\\ {} ({C_{2}}-{\beta _{i}}){\eta _{i}}=0,\hspace{1em}\big({C_{2}}-{\beta _{i}^{\ast }}\big){{\eta _{i}}^{\prime }}=0,\\ {} -{\xi _{i}},-{{\xi _{i}}^{\prime }},-{\eta _{i}},-{{\eta _{i}}^{\prime }}\leqslant 0,\\ {} -{\alpha _{i}},-{\alpha _{{_{i}}}^{\ast }},-{\beta _{i}},-{\beta _{{_{i}}}^{\ast }}\leqslant 0.\end{array}\right.\]
for all $i=1,2,\dots ,l$. The inherent relationship among ${\alpha _{i}}$, ${\alpha _{i}^{\ast }}$, ${\beta _{i}}$ and ${\beta _{i}^{\ast }}$ will be considered first.
Corollary 1.
Let ${\bar{\alpha }^{(\ast )}}$ and ${\bar{\beta }^{(\ast )}}$ be the solution of formula (11). Then
(17)
\[ \left\{\begin{array}{l}{\alpha _{i}}{\alpha _{i}^{\ast }}=0,\\ {} {\beta _{i}}{\beta _{i}^{\ast }}=0.\end{array}\right.\]
Proof.
The proof consists of two parts.
In part 1, suppose ${\alpha _{i}}\in (0,{C_{1}})$. From the KKT conditions in (16), the following equations hold:
(18)
\[\begin{array}{l}\displaystyle {\alpha _{i}}\bigg(\varepsilon +{\xi _{i}}-\operatorname{Re}\bigg({y_{i}}-{\sum \limits_{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})-b\bigg)\bigg)=0,\\ {} \displaystyle ({C_{1}}-{\alpha _{i}}){\xi _{i}}=0.\end{array}\]
From equation (18) we obtain
(19)
\[\begin{array}{l}\displaystyle \operatorname{Re}\bigg({y_{i}}-{\sum \limits_{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})-b\bigg)=\varepsilon ,\\ {} \displaystyle {\xi _{i}}=0.\end{array}\]
Combining formulas (16) and (19) gives the following expression:
(20)
\[ {\alpha _{{_{i}}}^{\ast }}\big(\varepsilon +{{\xi _{i}}^{\prime }}+\operatorname{Re}({y_{i}}-(w\cdot {x_{i}})-\mathrm{b})\big)=0,\hspace{1em}{\xi ^{\prime }_{i}}\geqslant 0.\]
Obviously, ${\alpha _{i}^{\ast }}=0$ can be deduced from equation (20).
In part 2, suppose ${\alpha _{i}}={C_{1}}$. According to the KKT conditions in (16), ${\xi _{i}}$ can then take an arbitrary non-negative value, and the following cases arise.
Part 2.1. Suppose ${\xi _{i}}=0$; then, as in part 1, it is easy to get ${\alpha _{i}^{\ast }}=0$.
Part 2.2. Suppose ${\xi _{i}}\ne 0$ $({\xi _{i}}>0)$; then
(21)
\[ \left\{\begin{array}{l}\operatorname{Re}\bigg(-{y_{i}}+{\textstyle\textstyle\sum _{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})+b\bigg)=-\varepsilon -{\xi _{i}},\\ {} {\alpha _{i}^{\ast }}\bigg(\varepsilon +{{\xi _{i}}^{\prime }}+\operatorname{Re}\bigg({y_{i}}-{\textstyle\textstyle\sum _{j=1}^{l}}{\phi _{j}}K({x_{j}},{x_{i}})-b\bigg)\bigg)=0,\\ {} {\xi ^{\prime }_{i}}\geqslant 0.\end{array}\right.\]
By solving equation (21), ${\alpha _{i}^{\ast }}=0$ is also obtained.
In summary, ${\alpha _{i}}{\alpha _{i}^{\ast }}=0$ holds. By a similar argument, ${\beta _{i}}{\beta _{i}^{\ast }}=0$ can be proved as well.  □
According to Corollary 1, we can analyse the distribution of the real and imaginary parts of the sample points on the boundary. To simplify the expressions, the following notation is introduced:
(22)
\[ {E_{i}}=E({x_{i}})=f({x_{i}})-{y_{i}}.\]
The complex-domain KKT conditions can then be expressed as:
(23)
\[ \left\{\begin{array}{l@{\hskip4.0pt}l}\big|\operatorname{Re}({E_{i}})\big|\leqslant \varepsilon \hspace{1em}& \big(|{\lambda _{i}}|=0\big),\\ {} \big|\operatorname{Re}({E_{i}})\big|=\varepsilon \hspace{1em}& \big(0<|{\lambda _{i}}|<{C_{1}}\big),\\ {} \big|\operatorname{Re}({E_{i}})\big|\geqslant \varepsilon \hspace{1em}& \big(|{\lambda _{i}}|={C_{1}}\big),\\ {} \big|\operatorname{Im}({E_{i}})\big|\leqslant \varepsilon \hspace{1em}& \big(|{\varphi _{i}}|=0\big),\\ {} \big|\operatorname{Im}({E_{i}})\big|=\varepsilon \hspace{1em}& \big(0<|{\varphi _{i}}|<{C_{2}}\big),\\ {} \big|\operatorname{Im}({E_{i}})\big|\geqslant \varepsilon \hspace{1em}& \big(|{\varphi _{i}}|={C_{2}}\big).\end{array}\right.\]
Formula (11) can be solved using the SMO method (Cristianini and Shawe-Taylor, 2005; Yang and Tian, 2004).
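The conditions in (23) are exactly what the SMO procedure of the next subsection checks when selecting points to update. A small illustrative sketch of such a check (the tolerance tol and all variable names are assumptions):

```python
# Illustrative check of the complex-domain KKT conditions (23). E holds the errors
# E_i = f(x_i) - y_i, lam and vphi the current coefficients; returns a boolean mask
# of the training points that violate (23) up to a tolerance tol.
import numpy as np

def kkt_violators(E, lam, vphi, C1, C2, eps, tol=1e-6):
    def violates(err_abs, coef_abs, C):
        at_zero  = coef_abs <= tol               # |coef| = 0     -> require err <= eps
        at_bound = coef_abs >= C - tol           # |coef| = C     -> require err >= eps
        interior = ~at_zero & ~at_bound          # 0 < |coef| < C -> require err = eps
        return ((at_zero  & (err_abs > eps + tol)) |
                (at_bound & (err_abs < eps - tol)) |
                (interior & (np.abs(err_abs - eps) > tol)))
    bad_real = violates(np.abs(E.real), np.abs(lam),  C1)
    bad_imag = violates(np.abs(E.imag), np.abs(vphi), C2)
    return bad_real | bad_imag
```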

2.4 SMO Algorithm

The SMO algorithm is a special case of decomposition algorithms. Its core idea is to update, at each step, the two Lagrange multipliers corresponding to two sample points while keeping all the others fixed. In the complex domain, the multiplier variables corresponding to the sample points $({x_{i}},{y_{i}})$ and $({x_{j}},{y_{j}})$ are $({\lambda _{i}},{\varphi _{i}})$ and $({\lambda _{j}},{\varphi _{j}})$. The algorithm must therefore choose two appropriate training points and solve the corresponding two-variable optimization problem. These two issues are addressed below.
First, suppose the two chosen training points are $({x_{1}},{y_{1}})$ and $({x_{2}},{y_{2}})$. To simplify the formulas, let ${K_{ij}}=K({x_{i}},{x_{j}})$. The objective function (12) can then be expressed as
(24)
\[\begin{array}{l}\displaystyle W({\lambda _{1}},{\lambda _{2}},{\varphi _{1}},{\varphi _{2}})\\ {} \displaystyle \hspace{1em}=\frac{1}{2}{\lambda _{1}^{2}}{K_{11}}+\frac{1}{2}{\lambda _{2}^{2}}{K_{22}}+{\lambda _{1}}{\lambda _{2}}{K_{12}}+\varepsilon \big[|{\lambda _{1}}|+|{\lambda _{2}}|\big]-\big[{\lambda _{1}}\operatorname{Re}({y_{1}})+{\lambda _{2}}\operatorname{Re}({y_{2}})\big]\\ {} \displaystyle \hspace{2em}+\frac{1}{2}{\varphi _{1}^{2}}{K_{11}}+\frac{1}{2}{\varphi _{2}^{2}}{K_{22}}+{\varphi _{1}}{\varphi _{2}}{K_{12}}+\varepsilon \big[|{\varphi _{1}}|+|{\varphi _{2}}|\big]-\big[{\varphi _{1}}\operatorname{Im}({y_{1}})+{\varphi _{2}}\operatorname{Im}({y_{2}})\big]\\ {} \displaystyle \hspace{2em}+{\lambda _{1}}{\sum \limits_{i=3}^{l}}{\lambda _{i}}{K_{1i}}+{\lambda _{2}}{\sum \limits_{i=3}^{l}}{\lambda _{i}}{K_{2i}}+{\varphi _{1}}{\sum \limits_{i=3}^{l}}{\varphi _{i}}{K_{1i}}+{\varphi _{2}}{\sum \limits_{i=3}^{l}}{\varphi _{i}}{K_{2i}}+W(\text{const}),\end{array}\]
where $W(\text{const})$ does not depend on $({x_{1}},{y_{1}})$ and $({x_{2}},{y_{2}})$. Let the initial feasible solution of this problem be $({\lambda _{1}^{\mathit{old}}},{\varphi _{1}^{\mathit{old}}})$ and $({\lambda _{2}^{\mathit{old}}},{\varphi _{2}^{\mathit{old}}})$, and let the new values $({\lambda _{1}},{\varphi _{1}})$ and $({\lambda _{2}},{\varphi _{2}})$ of the two variables be the optimal solution of the problem. To satisfy the equality constraints, the new values must satisfy
(25)
\[ \left\{\begin{array}{l}{\lambda _{1}^{\mathit{old}}}+{\lambda _{2}^{\mathit{old}}}={s_{1}^{\mathit{old}}}={\lambda _{1}}+{\lambda _{2}},\\ {} {\varphi _{1}^{\mathit{old}}}+{\varphi _{2}^{\mathit{old}}}={s_{2}^{\mathit{old}}}={\varphi _{1}}+{\varphi _{2}},\end{array}\right.\]
where ${s_{1}^{\mathit{old}}}$ and ${s_{2}^{\mathit{old}}}$ are constants. Substituting (25) into the objective function (24), taking the partial derivative of W with respect to ${\lambda _{2}}$ and simplifying, we finally get
(26)
\[ {\lambda _{2}}={\lambda _{2}^{\mathit{old}}}+\frac{1}{\kappa }\big(\operatorname{Re}({E_{1}})-\operatorname{Re}({E_{2}})-\varepsilon \big[\mathrm{sgn}({\lambda _{2}})-\mathrm{sgn}({\lambda _{1}})\big]\big).\]
Similarly, we can update ${\varphi _{2}}$:
(27)
\[ {\varphi _{2}}={\varphi _{2}^{\mathit{old}}}+\frac{1}{\kappa }\big(\operatorname{Im}({E_{1}})-\operatorname{Im}({E_{2}})-\varepsilon \big[\mathrm{sgn}({\varphi _{2}})-\mathrm{sgn}({\varphi _{1}})\big]\big),\]
where $\kappa ={K_{11}}+{K_{22}}-2{K_{12}}$ and $\mathrm{sgn}(\cdot )$ is the signum function.
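A sketch of this two-variable update is given below; it is an illustration only, in which the sign terms of (26)–(27) are evaluated at the old multiplier values and the clipping of the new values to the feasible region implied by (25) and the box constraints is omitted.

```python
# Illustrative one-step SMO update following (26)-(27). The sign terms are
# evaluated at the old multiplier values (an assumption made for simplicity);
# clipping to the feasible region is omitted here.
import numpy as np

def smo_pair_update(lam1, lam2, vphi1, vphi2, E1, E2, K11, K22, K12, eps):
    kappa = K11 + K22 - 2.0 * K12
    # update of lambda_2, equation (26)
    lam2_new = lam2 + (E1.real - E2.real - eps * (np.sign(lam2) - np.sign(lam1))) / kappa
    # update of varphi_2, equation (27)
    vphi2_new = vphi2 + (E1.imag - E2.imag - eps * (np.sign(vphi2) - np.sign(vphi1))) / kappa
    # equality constraint (25): lambda_1 + lambda_2 and varphi_1 + varphi_2 stay constant
    lam1_new = lam1 + (lam2 - lam2_new)
    vphi1_new = vphi1 + (vphi2 - vphi2_new)
    return lam1_new, lam2_new, vphi1_new, vphi2_new
```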
Second, we address the selection of the two training points in each iteration. Based on the gradient, the first training point is chosen as one that both violates the KKT conditions and maximizes the modulus of the gradient:
(28)
\[ \frac{\partial W}{\partial {\lambda _{i}}}={\sum \limits_{j=1}^{l}}{\lambda _{j}}{K_{ij}}+0+\varepsilon \mathrm{sgn}({\lambda _{i}})-\operatorname{Re}({y_{i}})=\operatorname{Re}({E_{i}})-\operatorname{Re}(b)+\varepsilon \mathrm{sgn}({\lambda _{i}}),\]
(29)
\[ \frac{\partial W}{\partial {\varphi _{i}}}={\sum \limits_{j=1}^{l}}{\varphi _{j}}{K_{ij}}+0+\varepsilon \mathrm{sgn}({\varphi _{i}})-\operatorname{Im}({y_{i}})=\operatorname{Im}({E_{i}})-\operatorname{Im}(b)+\varepsilon \mathrm{sgn}({\varphi _{i}}),\]
for $i=1,2$. Since the ultimate aim of training is to minimize the objective function, the second training point is chosen as the one that produces the greatest change in formula (24).

3 Experimental Results

In this section, the effectiveness of the method proposed in Section 2 is demonstrated by experiments. The method is then applied to the anti-interference field for further verification, where it is compared with the existing Power Inversion (PI) algorithm. All simulation experiments in the paper are implemented in MATLAB.

3.1 SVR-Function Estimation

In this subsection, a regression test is performed to verify the ability of the proposed support vector regression model to handle noise, using two functions: ${f_{1}}(x)={e^{jx/20}}+N$, $x\in R$, and ${f_{2}}(x)=\frac{\sin (x)}{x}+N$, $x\in C$. For ${f_{1}}(x)$, $x\in [0,60]$. For ${f_{2}}(x)$, $x={x_{r}}+j{x_{i}}$, where ${x_{r}}$ and ${x_{i}}$ are real random variables with ${x_{r}}\in [-4,4]$ and ${x_{i}}\in [-1,1]$, and j is the imaginary unit. N is white Gaussian noise with zero mean and variance $\sigma =0.2$. The regression curves are shown in Figs. 1 and 2, which indicate that the regression output of the proposed method agrees well with the original data.
Fig. 1. Regression result of ${f_{1}}(x)$.
Fig. 2. Regression result of ${f_{2}}(x)$.
In addition, a Monte Carlo experiment with 50 runs is conducted to demonstrate the stability of the proposed solution, based on the root-mean-square error (RMSE) of the complex data, defined as:
(30)
\[ \mathit{RMSE}=\sqrt{\frac{1}{N}{\sum \limits_{t=1}^{N}}{|{x_{t}}-{\hat{x}_{t}}|^{2}}},\]
where ${\hat{x}_{t}}$ is the estimated value of ${x_{t}}$.
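Since $|\cdot |$ in (30) is the complex modulus, the complex RMSE reduces to one line of code; a small sketch with illustrative names:

```python
# RMSE of complex-valued data as defined in (30): |.| is the complex modulus.
import numpy as np

def complex_rmse(x_true, x_hat):
    return np.sqrt(np.mean(np.abs(x_true - x_hat) ** 2))
```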
The resulting RMSE of the complex data is depicted in Fig. 3, which shows that the error is small and highly stable.
Fig. 3. Resultant RMSE of complex data.
In all of the above experiments, a real Gaussian Radial Basis Function (RBF) kernel is used in the SMO algorithm; the Gaussian RBF is expressed as
(31)
\[ K(x,y)=\exp \bigg(-\frac{{\| x-y\| ^{2}}}{{r^{2}}}\bigg),\]
where r is the Gaussian kernel parameter. For ${f_{1}}(x)$, ${C_{1}}={C_{2}}=2$, $\varepsilon =1\text{E}-5$, $r=50$. For ${f_{2}}(x)$, ${C_{1}}={C_{2}}=10$, $\varepsilon =0.1$, $r=0.8$.
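The kernel (31) accepts real or complex inputs, because $\| x-y{\| ^{2}}$ is the squared (complex) Euclidean norm and the kernel value is therefore real. A brief sketch, together with the hyperparameters quoted above (function and variable names are illustrative):

```python
# Gaussian RBF kernel (31) for real or complex inputs; r is the kernel width.
import numpy as np

def rbf_kernel(x, y, r):
    return np.exp(-np.linalg.norm(np.atleast_1d(x) - np.atleast_1d(y)) ** 2 / r ** 2)

# Hyperparameters quoted in the text:
#   f1: C1 = C2 = 2,  eps = 1e-5, r = 50
#   f2: C1 = C2 = 10, eps = 0.1,  r = 0.8
```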
Next, we compare the performance of the proposed method with CSVR (Bouboulis, 2015). The CSVR method exploits a complex kernel function and transforms the complex problem into two real SVR tasks. To compare the two algorithms under the same conditions, the Gaussian RBF kernel is used and ${C_{1}}={C_{2}}$. The training samples are generated by ${f_{2}}(x)$ with ${x_{r}}\in [-1,1]$ and ${x_{i}}\in [-1,1]$. In these experiments, $r=0.6$ and $\varepsilon =0.1$ for both the proposed solution and CSVR. Table 1 shows the RMSE of the two algorithms over 50 Monte Carlo experiments. It can be concluded from Table 1 that the precision of the proposed solution is roughly the same as that of CSVR.
Table 1
RMSE under different penalty factors.
Penalty Proposed (RMSE) CSVR (RMSE)
${C_{1}}={C_{2}}=2$ −12.89 dB −12.79 dB
${C_{1}}={C_{2}}=10$ −15.17 dB −14.93 dB
${C_{1}}={C_{2}}=20$ −15.23 dB −14.77 dB

3.2 Application to Anti-Interference Problem

In this subsection, the proposed complex SVR is applied to the anti-interference field to verify its effect in a practical application. Anti-interference technology is an important part of array signal processing. According to the literature (Ramón and Christodoulou, 2006), the output vector of an L-element array can be written in matrix notation as:
(32)
\[ X={A_{s}}S+{A_{j}}J+N,\]
where S is the desired signal, J is the interfering signal, N is the noise vector, ${A_{s}}$ is the response vector of the desired signal and ${A_{j}}$ is the response vector of the interfering signal. The purpose of anti-interference is to recover the desired signal S and suppress the interference. The output of the array processor is
(33)
\[ y={W^{H}}X=d+\varepsilon ,\]
where W is the weight vector of the array, d is the estimate of the desired signal S, and ε is the estimation error. In practical applications, only the output vector of the array is known. Therefore, to achieve interference suppression, the key question is how to use the output vector X of the L-element array to obtain the optimal weight vector W.
First, the training data must be generated. For an L-element Uniform Circular Array (UCA), a data set $X(k)={[{x_{1}}(k),{x_{2}}(k),\dots ,{x_{L}}(k)]^{T}}$ can be obtained, where $1\leqslant k\leqslant N$, N is the number of snapshots and ${x_{i}}(k)$ is the value received by the i-th element at time k. The autocorrelation matrix of all samples can be computed by
(34)
\[ R=\frac{1}{N}{\sum \limits_{k=1}^{N}}X(k)X{(k)^{H}}.\]
According to the PI algorithm (Zahm, 1973; Compton, 1979), the optimal weight is
(35)
\[ W=\frac{{R^{-1}}\mathrm{Const}}{{\mathrm{Const}^{T}}{R^{-1}}\mathrm{Const}},\]
where $\mathrm{Const}={[1,0,\dots ,0]^{T}}$. A set of pairs $\{R,W\}$ can then be used as training samples.
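A sketch of how one training pair $\{R,W\}$ can be formed from (34)–(35) is given below; it assumes the snapshots of the L-element array are stacked into an $L\times N$ complex matrix, and all names are illustrative.

```python
# Illustrative construction of one training pair {R, W} from (34)-(35).
# X is an (L, N) complex matrix of array snapshots (L elements, N snapshots).
import numpy as np

def pi_training_pair(X):
    L, N = X.shape
    R = (X @ X.conj().T) / N              # autocorrelation matrix, equation (34)
    const = np.zeros(L); const[0] = 1.0   # Const = [1, 0, ..., 0]^T
    Rinv_c = np.linalg.solve(R, const)    # R^{-1} Const
    W = Rinv_c / (const @ Rinv_c)         # optimal PI weight, equation (35)
    return R, W
```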
Fig. 4. The array patterns of the original PI algorithm and the complex SVR.
Second, we show how to use the complex SVR to achieve anti-interference. A narrowband signal is chosen as the jammer, arriving from the direction $\theta ={120^{\circ }}$, $\varphi ={50^{\circ }}$, where θ is the azimuth and φ is the elevation angle. In this experiment, a 4-element UCA is used and the data X are divided into 100 sections (with $N=12800$), so that each section contains 128 snapshots. Then, based on equations (34) and (35), 100 samples are obtained, of which 70 are used as training samples and the remaining 30 as testing samples. The parameters are ${C_{1}}={C_{2}}=20$, $\varepsilon =1\mathrm{E}-4$, $r=0.8$. The array patterns of the PI algorithm and the proposed complex-valued SVR are illustrated in Fig. 4, where Fig. 4(a) shows the azimuth pattern for a fixed elevation angle and Fig. 4(b) shows the elevation pattern for a fixed azimuth angle. The antenna gain is defined as
(36)
\[ G=10\log \big({\big\| {W^{H}}{A_{j}}\big\| ^{2}}\big).\]
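Assuming the logarithm in (36) is base 10 (i.e. the gain is in dB), the null-depth comparison of Fig. 4 can be evaluated with a sketch such as the following, where W and ${A_{j}}$ are given complex arrays (names illustrative):

```python
# Antenna gain of equation (36), assumed in dB: G = 10*log10(||W^H A_j||^2).
import numpy as np

def antenna_gain_db(W, A_j):
    return 10.0 * np.log10(np.linalg.norm(W.conj().T @ A_j) ** 2)
```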
It can be seen from Fig. 4 that the complex-valued SVR produces deeper nulls in the direction of the interfering signal than the PI algorithm.

4 Conclusion

A complex-valued SVR model was proposed in this paper. The proposed method considers the information of both the real and imaginary parts of the data. Compared to the conventional SVR model, its key feature is that, although it uses both the real- and imaginary-part information, it does not require extra computation. The accuracy and stability of the proposed SVR model were demonstrated by experiments. In addition, the model was applied to anti-interference for a satellite navigation system to verify its effect in practice. The results indicate that the proposed method achieves much better performance than the standard PI algorithm.

Acknowledgements

The authors would like to thank the editor and referees for their valuable suggestions and comments. Yongtang Shi is partially supported by the Natural Science Foundation of Tianjin (No. 17JCQNJC00300) and the National Natural Science Foundation of China.

References

 
Bouboulis, P. (2015). Complex support vector machines for regression and quaternary classification. IEEE Transactions on Neural Networks and Learning Systems, 26(6), 1260–1274.
 
Bouboulis, P., Theodoridis, S. (2011). Extension of Wirtinger’s calculus to reproducing kernel Hilbert spaces and the complex kernel LMS. IEEE Transactions on Signal Processing, 59(3), 964–978.
 
Bouboulis, P., Theodoridis, S., Mavroforakis, C. (2012). Complex support vector regression. In: IEEE Conference Publications, pp. 1–6.
 
Bouboulis, P., Theodoridou, E., Theodoridis, S. (2013). Complex support vector machines for quaternary classification. In: IEEE International Workshop on Machine Learning for Signal Processing, Southampton, UK, pp. 22–25.
 
Bouboulis, P., Theodoridis, S., Mavroforakis, C., Evaggelatou-Dalla, L. (2015). Complex support vector machines for regression and quaternary classification. IEEE Transactions on Neural Networks and Learning Systems, 26(6), 1260–1274.
 
Brandwood, D.H. (1983). A complex gradient operator and its application in adaptive array theory. IEE Proceedings F – Communications, Radar and Signal Processing, 130, 11–16.
 
Chen, G.Y., Xie, W.F. (2005). Pattern recognition using dual-tree complex wavelets features and SVM. In: IEEE Conference Publications, Saskatoon, pp. 2053–2056.
 
Compton, R.J. (1979). The power-inversion adaptive array – concept and performance. IEEE Transactions on Aerospace and Electronic Systems, 15(6), 803–814.
 
Cristianini, N., Shawe-Taylor, J. (2005). Support Vector Machines and Other Kernel-Based Learning Methods. China Machine Press.
 
Gaudes, C.C., Santamaria, I., Via, J., Gomez, E.M., Paules, T.S. (2007). Robust array beamforming with sidelobe control using support vector machines. IEEE Transactions on Signal Processing, 55(2), 574–584.
 
Huang, Y., Zhang, S. (2007). Complex matrix decomposition and quadratic programming. Mathematics of Operations Research, 32(3), 758–768.
 
Ramón, M.M., Christodoulou, C. (2006). Support Vector Machines for Antenna Array Processing and Electromagnetics. Morgan & Claypool Publishers.
 
Ramón, M.M., Xu, N., Christodoulou, C.G. (2005). Beamforming using support vector machines. IEEE Antennas and Wireless Propagation Letters, 4(1), 439–442.
 
Scholkopf, B., Smola, A. (2002). Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. MIT Press, Cambridge, MA, USA.
 
Yang, D., Tian, Y. (2004). A New Method of Data Mining-Support Vector Machine. Science Press, Beijing.
 
Zahm, C.L. (1973). Application of adaptive arrays to suppress strong jammers in the presence of weak signals. IEEE Transactions on Aerospace and Electronic Systems, 9(3), 260–271.

Biographies

Lang Rongling
ronglinglang@163.com

R. Lang is currently an associate professor in the School of Electronic and Information Engineering at Beihang University. She received her BSc and MSc degrees in applied mathematics from Northwestern Polytechnical University, P.R. China, in 2002, and her PhD degree in automatic control from Northwestern Polytechnical University, Xi’an, P.R. China, in 2005. Her current research areas are data-driven GNSS signal monitoring, fault diagnosis and fault prognosis.

Zhao Fei
zhaofei9307@163.com

F. Zhao is currently pursuing his MS degree in communication and information engineering at Beihang University, Beijing, China. He received his MS degree in communication engineering from China University of Geosciences, Beijing, China. He is currently doing research on jamming suppression in satellite navigation systems, blind signal processing and spatial spectrum estimation.

Shi Yongtang
shi@nankai.edu.cn

Y. Shi studied mathematics at Northwest University (Xi’an, China) and received his PhD in applied mathematics from Nankai University (Tianjin, China). He visited Technische Universität Bergakademie Freiberg (Germany), UMIT (Austria), Simon Fraser University (Canada) and University of Mississippi (USA). Currently, he is an associate professor at the Center for Combinatorics of Nankai University. His research interests are in graph theory and its applications, especially the applications of graph theory in computer science and information theory. He has written over 60 publications in graph theory and its applications.


Copyright
© 2017 Vilnius University. Open access article under the CC BY license.

Keywords
support vector machine for regression; complex field; kernel function
