Journal:Informatica
Volume 20, Issue 2 (2009), pp. 165–172
Abstract
Recent changes in the intersection of the fields of intelligent systems optimization and statistical learning are surveyed. These changes bring new theoretical and computational challenges to the existing research areas racing from web page mining to computer vision, pattern recognition, financial mathematics, bioinformatics and many other ones.
Journal:Informatica
Volume 20, Issue 2 (2009), pp. 173–186
Abstract
In this paper, we consider the problem of semi-supervised binary classification by Support Vector Machines (SVM). This problem is explored as an unconstrained and non-smooth optimization task when part of the available data is unlabelled. We apply non-smooth optimization techniques to classification where the objective function considered is non-convex and non-differentiable and so difficult to minimize. We explore and compare the properties of Simulated Annealing and of Simultaneous Perturbation Stochastic Approximation (SPSA) algorithms (SPSA with the Lipschitz Perturbation Operator, SPSA with the Uniform Perturbation Operator, Standard Finite Difference Approximation) for semi-supervised SVM classification. Numerical results are given, obtained by running the proposed methods on several standard test problems drawn from the binary classification literature. The performance of the classifiers were evaluated by analyzing Receiver Operating Characteristics (ROC).
Journal:Informatica
Volume 20, Issue 2 (2009), pp. 187–202
Abstract
In this paper, a method for the study of cluster stability is purposed. We draw pairs of samples from the data, according to two sampling distributions. The first distribution corresponds to the high density zones of data-elements distribution. Thus it is associated with the clusters cores. The second one, associated with the cluster margins, is related to the low density zones. The samples are clustered and the two obtained partitions are compared. The partitions are considered to be consistent if the obtained clusters are similar. The resemblance is measured by the total number of edges, in the clusters minimal spanning trees, connecting points from different samples. We use the Friedman and Rafsky two sample test statistic. Under the homogeneity hypothesis, this statistic is normally distributed. Thus, it can be expected that the true number of clusters corresponds to the statistic empirical distribution which is closest to normal. Numerical experiments demonstrate the ability of the approach to detect the true number of clusters.
Journal:Informatica
Volume 20, Issue 2 (2009), pp. 203–216
Abstract
One of the most known applications of Discrete Optimization is on scheduling. In contrast, one of the most known applications of Continuous Nonlinear Optimization is on the control of dynamic systems. In this paper, we combine both views, solving scheduling problems as dynamic systems, modeled as discrete-time nonlinear optimal control problems with state and control continuous variables subjected to upper and lower bounds. Complementarity constraints are used to represent scheduling decisions. One example we discuss in detail is the crude oil scheduling in ports, with numerical results presented.
Journal:Informatica
Volume 20, Issue 2 (2009), pp. 217–234
Abstract
It is known that the minimum affine separating committee (MASC) combinatorial optimization problem, which is related to some machine learning techniques, is NP-hard and does not belong to Apx class unless P=NP. In this paper, it is shown that the MASC problem formulated in a fixed dimension space within n>1 is intractable even if sets defining an instance of the problem are in general position. A new polynomial-time approximation algorithm for this modification of the MASC problem is presented. An approximation ratio and complexity bounds of the algorithm are obtained.
Journal:Informatica
Volume 20, Issue 2 (2009), pp. 235–254
Abstract
Most of real-life data are not often truly high-dimensional. The data points just lie on a low-dimensional manifold embedded in a high-dimensional space. Nonlinear manifold learning methods automatically discover the low-dimensional nonlinear manifold in a high-dimensional data space and then embed the data points into a low-dimensional embedding space, preserving the underlying structure in the data. In this paper, we have used the locally linear embedding method on purpose to unravel a manifold. In order to quantitatively estimate the topology preservation of a manifold after unfolding it in a low-dimensional space, some quantitative numerical measure must be used. There are lots of different measures of topology preservation. We have investigated three measures: Spearman's rho, Konig's measure (KM), and mean relative rank errors (MRRE). After investigating different manifolds, it turned out that only KM and MRRE gave proper results of manifold topology preservation in all the cases. The main reason is that Spearman's rho considers distances between all the pairs of points from the analysed data set, while KM and MRRE evaluate a limited number of neighbours of each point from the analysed data set.
Journal:Informatica
Volume 20, Issue 2 (2009), pp. 255–272
Abstract
In this paper, an efficient hybrid genetic algorithm (HGA) and its variants for the well-known combinatorial optimization problem, the quadratic assignment problem (QAP) are discussed. In particular, we tested our algorithms on a special type of QAPs, the structured quadratic assignment problems. The results from the computational experiments on this class of problems demonstrate that HGAs allow to achieve near-optimal and (pseudo-)optimal solutions at very reasonable computation times. The obtained results also confirm that the hybrid genetic algorithms are among the most suitable heuristic approaches for this type of QAPs.
Journal:Informatica
Volume 20, Issue 2 (2009), pp. 273–292
Abstract
The paper studies stochastic optimization problems in Reproducing Kernel Hilbert Spaces (RKHS). The objective function of such problems is a mathematical expectation functional depending on decision rules (or strategies), i.e. on functions of observed random parameters. Feasible rules are restricted to belong to a RKHS. This kind of problems arises in on-line decision making and in statistical learning theory. We solve the problem by sample average approximation combined with Tihonov's regularization and establish sufficient conditions for uniform convergence of approximate solutions with probability one, jointly with a rule for downward adjustment of the regularization factor with increasing sample size.
Journal:Informatica
Volume 20, Issue 2 (2009), pp. 293–304
Abstract
In this study, the performance of the modified subgradient algorithm (MSG) to solve the 0–1 quadratic knapsack problem (QKP) was examined. The MSG was proposed by Gasimov for solving dual problems constructed with respect to sharp Augmented Lagrangian function. The MSG has some important proven properties. For example, it is convergent, and it guarantees zero duality gap for the problems such that its objective and constraint functions are all Lipschtz. Additionally, the MSG has been successfully used for solving non-convex continuous and some combinatorial problems with equality constraints since it was first proposed. In this study, the MSG was used to solve the QKP which has an inequality constraint. The first step in solving the problem was converting zero-one nonlinear QKP problem into continuous nonlinear problem by adding only one constraint and not adding any new variables. Second, in order to solve the continuous QKP, dual problem with "zero duality gap" was constructed by using the sharp Augmented Lagrangian function. Finally, the MSG was used to solve the dual problem, by considering the equality constraint in the computation of the norm. To compare the performance of the MSG with some other methods, some test instances from the relevant literature were solved both by using the MSG and by using three different MINLP solvers of GAMS software. The results obtained were presented and discussed.
Journal:Informatica
Volume 20, Issue 2 (2009), pp. 305–320
Abstract
Multi-attribute analysis is a useful tool in many economical, managerial, constructional, etc. problems. The accuracy of performance measures in COPRAS (The multi-attribute COmplex PRoportional ASsessment of alternatives) method is usually assumed to be accurate. This method assumes direct and proportional dependence of the weight and utility degree of investigated versions on a system of attributes adequately describing the alternatives and on values and weights of the attributes. However, there is usually some uncertainty involved in all multi-attribute model inputs. The objective of this research is to demonstrate how simulation can be used to reflect fuzzy inputs, which allows more complete interpretation of model results. A case study is used to demonstrate the concept of general contractor choice of on the basis of multiple attributes of efficiency with fuzzy inputs applying COPRAS-G method. The research has concluded that the COPRAS-G method is appropriate to use.