Pub. online: 19 Nov 2024 | Type: Research Article | Open Access
Journal: Informatica
Volume 35, Issue 4 (2024), pp. 883–908
Abstract
There are different deep neural network (DNN) architectures and methods for performing augmentation on time series data, but not all of them can be adapted to specific datasets. This article explores the development of deep learning models for time series, applies data augmentation methods to conveyor belt (CB) tension signal data, and investigates the influence of these methods on the accuracy of CB state classification. CB systems are one of the essential elements of production processes, enabling smooth transportation of various industrial items; therefore, their analysis is highly important. For the purpose of this work, multi-domain tension signals from five different CB load weight conditions (0.5 kg, 1 kg, 2 kg, 3 kg, 5 kg) and one damaged belt condition were collected and analysed. Four DNN models, based on the fully convolutional network (FCN), the convolutional neural network combined with long short-term memory (CNN-LSTM), the residual network (ResNet), and the InceptionTime architectures, were developed and applied to the classification of CB states. Different time series augmentation methods, such as random Laplace noise, drifted Gaussian noise, uniform noise, and magnitude warping, were applied to the collected data during the study. Furthermore, new CB tension signals were generated using a TimeVAE model. The study has shown that DNN models based on the FCN, ResNet, and InceptionTime architectures are able to classify CB states accurately. The research has also shown that various data augmentation methods can improve the accuracy of the above-mentioned models; for example, the combined addition of random Laplace and drifted Gaussian noise improved the FCN model's baseline (without augmentation) classification accuracy on 2.0 s-length signals by 4.5%, to 92.6% ± 1.54%. The FCN model demonstrated the best accuracy and classification performance despite having the fewest trainable parameters, which underlines the importance of selecting and optimizing the right architecture when developing models for specific tasks.
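For illustration, the sketch below gives minimal NumPy implementations of the kinds of augmentation named in the abstract (random Laplace noise, drifted Gaussian noise, and magnitude warping). The parameter values and the assumed 1 kHz sampling rate are placeholders for the example, not the settings used in the study.

```python
import numpy as np

def add_laplace_noise(signal, scale=0.01, rng=None):
    """Add i.i.d. Laplace-distributed noise to a 1-D tension signal."""
    rng = rng or np.random.default_rng()
    return signal + rng.laplace(loc=0.0, scale=scale, size=signal.shape)

def add_drifted_gaussian_noise(signal, sigma=0.01, drift_max=0.05, rng=None):
    """Add Gaussian noise whose mean drifts linearly along the signal."""
    rng = rng or np.random.default_rng()
    drift = np.linspace(0.0, rng.uniform(-drift_max, drift_max), signal.size)
    return signal + drift + rng.normal(0.0, sigma, size=signal.shape)

def magnitude_warp(signal, sigma=0.2, n_knots=4, rng=None):
    """Scale the signal by a smooth random curve interpolated between knots."""
    rng = rng or np.random.default_rng()
    knot_positions = np.linspace(0, signal.size - 1, n_knots + 2)
    knot_values = rng.normal(1.0, sigma, size=n_knots + 2)
    warp = np.interp(np.arange(signal.size), knot_positions, knot_values)
    return signal * warp

# Example: augment a synthetic 2.0 s signal sampled at an assumed 1 kHz rate.
x = np.sin(np.linspace(0, 20 * np.pi, 2000))
x_aug = magnitude_warp(add_laplace_noise(x))
```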
Pub. online: 10 Jan 2022 | Type: Research Article | Open Access
Journal: Informatica
Volume 33, Issue 1 (2022), pp. 109–130
Abstract
In this paper, a new approach has been proposed for the verification and adjustment of multi-label text data classes. The approach helps to make semi-automated revisions of class assignments in order to improve the quality of the data. Data quality significantly influences the accuracy of the created models, for example, in classification tasks, and it can also be important for other data analysis tasks. The proposed approach is based on the combination of a text similarity measure and two methods: latent semantic analysis and the self-organizing map. First, the text data must be pre-processed by selecting various filters to clean the data from unnecessary and irrelevant information. Latent semantic analysis has been selected to reduce the dimensionality of the vectors that correspond to each text in the analysed data. The cosine similarity measure has been used to determine which of the multi-label text data classes should be changed or adjusted. The self-organizing map has been selected as the key method to detect similarity between text data and to make decisions on a new class assignment. The experimental investigation has been performed using newly collected multi-label text data. Financial news data in the Lithuanian language have been collected from four public websites and manually classified by experts into ten classes. Various parameters of the methods have been analysed, and their influence on the final results has been estimated. The final results are validated by experts. The research proved that the proposed approach can be helpful for verifying and adjusting multi-label text data classes: 82% of correct assignments are obtained when the data dimensionality is reduced to 40 using latent semantic analysis and the self-organizing map size is decreased from 40 to 5 in steps of 5.
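A minimal sketch of the LSA and cosine-similarity steps, using scikit-learn as a stand-in toolkit; the placeholder corpus and the cap on the number of components are assumptions for the demo, and the SOM-based decision step described in the abstract is omitted.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder documents; the study used pre-processed Lithuanian financial news.
texts = [
    "central bank raises interest rates",
    "quarterly profit of the bank grows",
    "new tax rules affect small companies",
]

tfidf = TfidfVectorizer()
X = tfidf.fit_transform(texts)

# LSA via truncated SVD; the study reduced the dimensionality to 40,
# here capped so the toy corpus still runs.
n_components = min(40, X.shape[1] - 1)
lsa = TruncatedSVD(n_components=n_components, random_state=0)
X_reduced = lsa.fit_transform(X)

# Cosine similarity between the reduced vectors supports the decision
# whether a document's class assignment should be changed or adjusted.
similarities = cosine_similarity(X_reduced)
print(similarities.round(2))
```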
Pub. online: 1 Jan 2019 | Type: Research Article | Open Access
Journal: Informatica
Volume 30, Issue 2 (2019), pp. 349–365
Abstract
The isometric mapping (Isomap) algorithm is often used for analysing hyperspectral images. Isomap allows reducing such hyperspectral images from a high-dimensional space to a lower-dimensional space while keeping the critical original information. To achieve this objective, Isomap uses the multidimensional scaling (MDS) method for dimensionality reduction. In this work, we propose to use Isomap with SMACOF, since SMACOF is the most accurate MDS method. A detailed comparison, in terms of accuracy, between Isomap based on an eigen-decomposition process and Isomap based on SMACOF has been carried out using three benchmark hyperspectral images. Moreover, for hyperspectral image classification, three classifiers (support vector machine, k-nearest neighbour, and random forest) have been used to compare both Isomap approaches. The experimental investigation has shown that better classification accuracy is obtained by Isomap with SMACOF.
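The sketch below illustrates the two Isomap variants being compared, using scikit-learn and SciPy on a small stand-in dataset (not a hyperspectral image): the standard eigen-decomposition-based Isomap, and the same geodesic distance matrix embedded with SMACOF.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import Isomap, smacof
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

X = load_digits().data[:300]          # stand-in for hyperspectral pixel vectors
k, n_components = 10, 2

# Standard Isomap: geodesic distances + eigen-decomposition-based MDS.
emb_eigen = Isomap(n_neighbors=k, n_components=n_components).fit_transform(X)

# Isomap variant: the same geodesic distances, but embedded with SMACOF MDS.
# The k-NN graph is assumed to be connected (no infinite geodesic distances).
graph = kneighbors_graph(X, n_neighbors=k, mode="distance")
geodesic = shortest_path(graph, method="D", directed=False)
emb_smacof, stress = smacof(geodesic, n_components=n_components,
                            n_init=1, random_state=0)
```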
Journal: Informatica
Volume 26, Issue 3 (2015), pp. 419–434
Abstract
A secure and high-quality operation of power grids requires the frequency to be kept stable around a reference value. The deviation of the frequency from this reference value is caused by an imbalance between the active power produced and consumed. In the Smart Grid paradigm, the balance can be achieved by adjusting the demand to the production constraints, instead of the other way round. In this paper, a swarm intelligence-based approach for frequency management is proposed. It is grounded on the idea that a swarm is composed of decentralised individual agents (particles), each of which interacts with the others via a shared environment. Three swarm intelligence-based policies ensure decentralised frequency management in the smart power grid, where the swarm agents make decisions and act on the demand side. The policies differ in the behaviour function of the agents. Finally, these policies are evaluated and compared using indicators that point out their advantages.
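A toy sketch of the decentralised idea, assuming a hypothetical Appliance agent, a linear frequency-imbalance relation, and a single threshold-based policy; it is not one of the three policies from the paper.

```python
import random

NOMINAL_HZ = 50.0
SENSITIVITY = 0.005   # Hz change per unit of power imbalance (assumed)

class Appliance:
    """A decentralised agent that adapts its demand to the observed frequency."""
    def __init__(self, power, threshold):
        self.power = power          # demand when switched on
        self.threshold = threshold  # agent-specific tolerance to under-frequency
        self.on = True

    def act(self, frequency):
        # Policy sketch: shed load when the frequency drops below the agent's
        # threshold, reconnect (with some randomness) when it recovers.
        if frequency < NOMINAL_HZ - self.threshold:
            self.on = False
        elif frequency > NOMINAL_HZ and random.random() < 0.1:
            self.on = True

def simulate(agents, production, steps=100):
    frequency = NOMINAL_HZ
    for _ in range(steps):
        demand = sum(a.power for a in agents if a.on)
        frequency += SENSITIVITY * (production - demand)  # shared environment
        for a in agents:
            a.act(frequency)
    return frequency

agents = [Appliance(power=1.0, threshold=random.uniform(0.05, 0.5))
          for _ in range(200)]
print(simulate(agents, production=150.0))
```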
Journal: Informatica
Volume 26, Issue 1 (2015), pp. 33–50
Abstract
Classical evolutionary multi-objective optimization algorithms aim at finding an approximation of the entire set of Pareto optimal solutions. By considering the preferences of a decision maker within evolutionary multi-objective optimization algorithms, it is possible to focus the search only on those parts of the Pareto front that satisfy his/her preferences. In this paper, an extended preference-based evolutionary algorithm has been proposed for solving multi-objective optimization problems. Here, concepts from the interactive synchronous NIMBUS method are borrowed and combined with the R-NSGA-II algorithm. The proposed synchronous R-NSGA-II algorithm uses preference information provided by the decision maker to find only the desirable solutions on the Pareto front that satisfy his/her preferences. Several scalarizing functions are used simultaneously, so several sets of solutions are obtained from the same preference information. In this paper, an experimental comparison of the proposed synchronous R-NSGA-II and the original R-NSGA-II has been carried out. The results obtained are promising.
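For illustration, the sketch below computes a classic achievement scalarizing function over a reference point, one example of the kind of scalarizing functions that can be combined in such a synchronous approach; the specific functions and parameters used in the paper may differ.

```python
import numpy as np

def achievement_scalarizing_function(f, ref_point, weights, rho=1e-6):
    """Classic ASF: smaller values indicate objective vectors closer to the
    decision maker's reference point in the weighted Chebyshev sense."""
    diff = weights * (f - ref_point)
    return np.max(diff, axis=-1) + rho * np.sum(diff, axis=-1)

# Two candidate objective vectors and a reference point supplied by the DM.
F = np.array([[0.2, 0.8], [0.5, 0.5]])
z_ref = np.array([0.3, 0.4])
w = np.array([1.0, 1.0])
print(achievement_scalarizing_function(F, z_ref, w))
```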
Journal: Informatica
Volume 25, Issue 1 (2014), pp. 155–184
Abstract
In the paper we propose a genetic algorithm based on insertion heuristics for the vehicle routing problem with constraints. A random insertion heuristic is used to construct initial solutions and to reconstruct existing ones. The location where a randomly chosen node will be inserted is selected by calculating an objective function. The random insertion process preserves the stochastic characteristics of the genetic algorithm and the feasibility of the generated individuals. The defined crossover and mutation operators incorporate random insertion heuristics, analyse individuals, and select which parts should be reinserted. Additionally, a second population is used in the mutation process. The second population increases the probability that a solution obtained in the mutation process will survive in the first population and increases the probability of finding the global optimum. A comparison of results shows that the solutions found by the proposed algorithm are similar to the optimal solutions obtained by other genetic algorithms. However, in most cases the proposed algorithm finds the solution in a shorter time, which makes it competitive with other algorithms.
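A minimal sketch of a random insertion step for a single route, with route length as the assumed objective function; capacity and the other constraints of the actual problem, as well as the genetic operators, are omitted.

```python
import math
import random

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def route_cost(route, depot):
    """Total length of the route starting and ending at the depot."""
    path = [depot] + route + [depot]
    return sum(dist(path[i], path[i + 1]) for i in range(len(path) - 1))

def random_insertion(route, unrouted, depot):
    """Pick an unrouted node at random and insert it at the position
    that increases the route cost the least."""
    node = unrouted.pop(random.randrange(len(unrouted)))
    best_pos = min(range(len(route) + 1),
                   key=lambda i: route_cost(route[:i] + [node] + route[i:], depot))
    route.insert(best_pos, node)
    return route

depot = (0.0, 0.0)
customers = [(2, 1), (5, 4), (1, 6), (4, 2)]
route = []
while customers:
    random_insertion(route, customers, depot)
print(route, route_cost(route, depot))
```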
Journal: Informatica
Volume 22, Issue 4 (2011), pp. 507–520
Abstract
Most classical visualization methods, including multidimensional scaling and its particular case, Sammon's mapping, encounter difficulties when analyzing large data sets. One possible way to solve the problem is the application of artificial neural networks. This paper presents the visualization of large data sets using the feed-forward neural network SAMANN. This backpropagation-like learning rule has been developed to allow a feed-forward artificial neural network to learn Sammon's mapping in an unsupervised way. In its initial form, SAMANN training is computationally expensive. In this paper, we identify conditions that reduce the computational cost of visualizing even large data sets. It is shown that the original dimensionality of the data can be reduced to a lower one using a small number of iterations. Visualization results for real-world data sets are presented.
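For reference, the sketch below computes Sammon's stress, the error that a SAMANN network is trained to minimise; the random linear projection is only a placeholder for the network's output.

```python
import numpy as np
from scipy.spatial.distance import pdist

def sammon_stress(X_high, X_low, eps=1e-12):
    """Sammon's stress: smaller values mean the low-dimensional projection
    better preserves the pairwise distances of the original data."""
    d_high = pdist(X_high)
    d_low = pdist(X_low)
    return np.sum((d_high - d_low) ** 2 / (d_high + eps)) / (np.sum(d_high) + eps)

# Example: stress of a crude linear 2-D projection of 5-D data (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
P = X @ rng.normal(size=(5, 2))
print(sammon_stress(X, P))
```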
Journal: Informatica
Volume 22, Issue 1 (2011), pp. 115–134
Abstract
In this paper, the quality of quantization and visualization of vectors obtained by vector quantization methods (the self-organizing map and neural gas) is investigated. Multidimensional scaling is used for the visualization of multidimensional vectors. The quality of quantization is measured by a quantization error. Two numerical measures of proximity preservation (König's topology preservation measure and Spearman's correlation coefficient) are applied to estimate the quality of visualization. The results of the visualization (mapping images) are also presented.
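A small sketch of two of the measures mentioned above: the quantization error of a codebook and Spearman's correlation between pairwise distances before and after projection (König's measure is not shown). The random data, codebook, and projection are placeholders for SOM/neural gas and MDS results.

```python
import numpy as np
from scipy.stats import spearmanr
from scipy.spatial.distance import cdist, pdist

def quantization_error(data, codebook):
    """Mean distance from each vector to its best-matching codebook vector."""
    d = cdist(data, codebook)
    return d.min(axis=1).mean()

def distance_preservation(data, embedding):
    """Spearman correlation between pairwise distances in the original
    space and in the visualisation (one proximity-preservation measure)."""
    rho, _ = spearmanr(pdist(data), pdist(embedding))
    return rho

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
codebook = X[rng.choice(len(X), size=16, replace=False)]  # stand-in for quantized units
Y = X[:, :2]                                              # stand-in for an MDS embedding
print(quantization_error(X, codebook), distance_preservation(X, Y))
```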
Journal: Informatica
Volume 13, Issue 3 (2002), pp. 275–286
Abstract
In the paper, we analyze software that implements self-organizing maps: SOM-PAK, SOM-TOOLBOX, Viscovery SOMine, Nenet, and two academic systems. Most of the software can be found on the Internet as freeware, shareware, or demo versions. Self-organizing maps assist in data clustering and in analyzing data similarities. The software packages differ from one another in their implementation and visualization capabilities. Data on coastal dunes and their vegetation in Finland are used for an experimental comparison of the graphical result presentation of the software. The similarities and differences of the systems, as well as their advantages and shortcomings, are discussed.
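As a present-day illustration of what such systems do, the sketch below trains a small SOM with MiniSom (a freely available Python implementation, not one of the packages compared in the paper) and computes the U-matrix commonly used to inspect data similarities; the random data stand in for the dune-vegetation set.

```python
import numpy as np
from minisom import MiniSom  # third-party SOM library, used here only as an example

rng = np.random.default_rng(0)
data = rng.normal(size=(300, 4))          # stand-in for the dune-vegetation data

som = MiniSom(10, 10, input_len=4, sigma=1.0, learning_rate=0.5, random_seed=0)
som.train_random(data, num_iteration=1000)

# The U-matrix (distance map) is the kind of visualisation the compared
# systems provide for inspecting cluster structure and data similarities.
u_matrix = som.distance_map()
winners = [som.winner(x) for x in data]   # best-matching unit per observation
```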