Pub. online: 10 Jan 2022
Type: Research Article
Open Access
Journal: Informatica
Volume 33, Issue 1 (2022), pp. 109–130
Abstract
In this paper, a new approach is proposed for verifying and adjusting the classes of multi-label text data. The approach supports semi-automated revision of class assignments to improve data quality. Data quality significantly influences the accuracy of the resulting models, for example, in classification tasks, and the approach can also be useful for other data analysis tasks. The proposed approach combines a text similarity measure with two methods: latent semantic analysis and the self-organizing map. First, the text data are pre-processed by applying various filters to remove unnecessary and irrelevant information. Latent semantic analysis is used to reduce the dimensionality of the vectors that correspond to each text in the analysed data. The cosine similarity is used to determine which multi-label class assignments should be changed or adjusted. The self-organizing map is the key method for detecting similarity between texts and deciding on a new class assignment. The experimental investigation was performed on newly collected multi-label text data: financial news in the Lithuanian language, collected from four public websites and manually classified by experts into ten classes. Various parameters of the methods were analysed, and their influence on the final results was estimated. The final results were validated by experts. The research showed that the proposed approach can help to verify and adjust multi-label text data classes. Correct assignments are obtained in 82% of cases when the data dimensionality is reduced to 40 using latent semantic analysis and the self-organizing map size is reduced from 40 to 5 in steps of 5.
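To make the role of the cosine similarity concrete, the following is a minimal sketch, not the authors' procedure. It assumes the text vectors have already been reduced (for example by latent semantic analysis) and, as a simplification, compares each text with the centroid of every class it is assigned to instead of using a self-organizing map. The function names and the threshold value are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def flag_assignments(docs, labels, threshold=0.7):
    """Return (document index, class) pairs that look doubtful.

    docs      : list of reduced text vectors (assumed to come from LSA)
    labels    : list of sets, the classes assigned to each document
    threshold : hypothetical cut-off; an assignment is flagged when the
                document's similarity to that class centroid falls below it
    """
    members = {}
    for vec, doc_labels in zip(docs, labels):
        for label in doc_labels:
            members.setdefault(label, []).append(vec)
    centroids = {label: centroid(vecs) for label, vecs in members.items()}

    flagged = []
    for i, (vec, doc_labels) in enumerate(zip(docs, labels)):
        for label in sorted(doc_labels):
            if cosine_similarity(vec, centroids[label]) < threshold:
                flagged.append((i, label))
    return flagged
```

Flagged pairs would then be passed to an expert for review, which mirrors the semi-automated character of the approach described in the abstract.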
Pub. online: 7 Jan 2022
Type: Research Article
Open Access
Journal: Informatica
Volume 33, Issue 1 (2022), pp. 131–150
Abstract
In daily life, we are confronted with numerous multiple attribute group decision making (MAGDM) problems. For such problems, we design a model that solves MAGDM with the probabilistic linguistic MABAC (multi-attributive border approximation area comparison) method based on cumulative prospect theory (CPT-PL-MABAC). The CPT-PL-MABAC method can take experts' psychological behaviour and preferences into consideration. Furthermore, we use a combined weight consisting of a subjective weight and an objective weight, where the objective weight is obtained by the entropy method. The concrete calculation steps of the CPT-PL-MABAC method are presented for selecting the optimal location of an express distribution centre, and a numerical example of this location selection is given to demonstrate the usefulness of the designed method. Finally, we compare the designed model with three other existing models and summarize its advantages and shortcomings.
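The entropy method mentioned above can be sketched as follows. This is the standard entropy weighting scheme applied to a crisp, non-negative decision matrix, so it is only an illustration: the paper works with probabilistic linguistic information, and the multiplicative rule used here to combine subjective and objective weights is an assumption, since the abstract does not state how the two are combined.

```python
import math


def entropy_weights(matrix):
    """Objective attribute weights from a crisp decision matrix.

    matrix: list of rows (alternatives), each a list of non-negative
            attribute values; at least two rows, positive column sums.
    """
    m = len(matrix)
    if m < 2:
        raise ValueError("need at least two alternatives")
    diversities = []
    for column in zip(*matrix):
        total = sum(column)
        if total <= 0:
            raise ValueError("each column must have a positive sum")
        shares = [x / total for x in column]
        # Shannon entropy scaled to [0, 1]; 0 * log(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)
        diversities.append(max(0.0, 1.0 - entropy))
    norm = sum(diversities)
    if norm < 1e-12:
        # Every attribute is uninformative: fall back to equal weights.
        return [1.0 / len(diversities)] * len(diversities)
    return [d / norm for d in diversities]


def combine_weights(subjective, objective):
    """Assumed combination rule: normalized element-wise product."""
    products = [s * o for s, o in zip(subjective, objective)]
    total = sum(products)
    return [p / total for p in products]
```

An attribute whose values are identical across all alternatives carries no discriminating information, so it receives an objective weight close to zero.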
Pub. online: 5 Jan 2022
Type: Research Article
Open Access
Journal: Informatica
Volume 33, Issue 3 (2022), pp. 523–543
Abstract
In this paper, we propose modifications of the well-known particle swarm optimization (PSO) algorithm. The changes affect how the motion of particles is mapped from continuous space to the binary search space, a mapping widely used to solve the feature selection problem. The modified binary PSO variants were tested on the SVC2004 dataset, which is dedicated to user authentication based on the dynamic features of a handwritten signature. Experiments with the k-nearest neighbours (kNN) classifier were carried out to find the optimal subset of features. The search for the subset was treated as a multicriteria optimization problem that takes into account both the accuracy of the model and the number of features.
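For orientation, the sketch below shows the classical baseline that such modifications start from: the sigmoid transfer function of standard binary PSO, which turns a continuous velocity into the probability of selecting a feature. It does not reproduce the modified mappings proposed in the paper. The weighted-sum fitness and its parameter `alpha` are assumptions used only to illustrate how accuracy and feature count can be traded off.

```python
import math
import random


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def binarize(velocity, rng):
    """Classical binary PSO mapping: bit j is 1 with probability sigmoid(v_j)."""
    return [1 if rng.random() < sigmoid(v) else 0 for v in velocity]


def scalarized_fitness(accuracy, mask, alpha=0.9):
    """Assumed weighted sum of the two criteria (higher is better).

    accuracy : classifier accuracy in [0, 1] on the selected features
    mask     : list of 0/1 flags, 1 meaning the feature is selected
    alpha    : hypothetical weight of accuracy versus feature reduction
    """
    fraction_removed = 1.0 - sum(mask) / len(mask)
    return alpha * accuracy + (1.0 - alpha) * fraction_removed
```

In a full feature selection loop, `accuracy` would come from evaluating a classifier such as kNN on the features where `mask` is 1.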
Pub. online: 4 Jan 2022
Type: Research Article
Open Access
Journal: Informatica
Volume 33, Issue 3 (2022), pp. 499–522
Abstract
This paper models and solves the scheduling problem of the cable manufacturing industry, minimizing the total production cost, which includes processing, setup, and storage costs. Two hybrid meta-heuristics are proposed, combining the simulated annealing and variable neighbourhood search algorithms with the tabu search algorithm. By applying some case-based theorems and rules, a special initial solution with optimal setup cost is obtained for the algorithms. The computational experiments, including parameter tuning and final experiments on benchmarks obtained from a real cable manufacturing factory, show the superiority of the combination of tabu search and simulated annealing compared to the other proposed hybrid and the classical meta-heuristics.
Journal: Informatica
Volume 32, Issue 4 (2021), pp. 865–886
Abstract
Picture fuzzy sets (PFSs) use positive, neutral, negative and refusal membership degrees to describe the behaviour of decision-makers in more detail. In this article, we present an application of the extended TODIM method based on cumulative prospect theory to picture fuzzy multiple attribute group decision making (MAGDM). In addition, we adopt information entropy to determine the weighting vector of attributes, which improves the applicability of the TODIM method. Finally, we apply the improved TODIM method to a numerical case of supermarket location and verify the effectiveness of the new method by comparing its results with those of other methods.
Journal: Informatica
Volume 32, Issue 4 (2021), pp. 709–739
Abstract
A p-rung orthopair fuzzy set (p-ROFS) generalizes the intuitionistic fuzzy set and the Pythagorean fuzzy set by offering a larger representation space of acceptable membership grades, and it gives a decision maker more flexibility in expressing his/her real preferences. Under the p-rung orthopair fuzzy environment, we propose a novel parametrized score function for p-ROFSs that incorporates the idea of a weighted average of the membership and non-membership degrees. We then investigate and present different properties of the proposed score function. Moreover, we show that this ranking technique reduces some of the drawbacks of the existing ones. Finally, we develop an approach based on this ranking technique to deal with multiple criteria decision making problems with p-rung orthopair fuzzy information.
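The "larger representation space" can be illustrated with the usual orthopair constraint, under which a membership grade mu and a non-membership grade nu are admissible when mu^p + nu^p <= 1, with p = 1 giving intuitionistic and p = 2 giving Pythagorean fuzzy sets. The sketch below only checks this constraint; it does not implement the score function proposed in the paper, and the helper names are hypothetical.

```python
def is_valid_orthopair(mu, nu, p):
    """True if (mu, nu) satisfies mu**p + nu**p <= 1 with both grades in [0, 1]."""
    if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0):
        return False
    return mu ** p + nu ** p <= 1.0


def smallest_admissible_rung(mu, nu, max_p=10):
    """Smallest integer p in 1..max_p that admits the pair, or None if there is none."""
    for p in range(1, max_p + 1):
        if is_valid_orthopair(mu, nu, p):
            return p
    return None
```

For example, the pair (0.8, 0.7) is rejected for p = 1 and p = 2 but admitted for p = 3, so a higher rung lets a decision maker express grades that the intuitionistic and Pythagorean settings exclude.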
Pub. online: 17 Dec 2021
Type: Research Article
Open Access
Journal: Informatica
Volume 33, Issue 1 (2022), pp. 55–80
Abstract
Ligand Based Virtual Screening methods are used to screen molecule databases and select the most promising compounds for a query. The selection is made by decision-makers based on the information given by the descriptors, which are usually processed individually. This methodology leads to a loss of information and to difficult post-processing that depends on the expert's knowledge, which can result in promising compounds being discarded. Consequently, in this work, we propose a new multi-objective methodology called MultiPharm-DT, in which several descriptors are considered simultaneously and whose results are offered to the decision-maker without effort on their part and without relying on their expertise.
Pub. online: 10 Dec 2021
Type: Research Article
Open Access
Journal: Informatica
Volume 32, Issue 4 (2021), pp. 795–816
Abstract
There is currently a lack of smart marine monitoring systems capable of integrating multi-dimensional components for monitoring and predicting marine water quality and for making decisions about their optimal operation with minimal human intervention. This research aims to extend smart coastal marine monitoring by proposing a solar energy planning and control component. The proposed approach uses the adaptive neuro-fuzzy inference system (ANFIS) for wireless buoys that work online throughout the year in the Baltic Sea near the Lithuanian coast. The proposed fuzzy solar energy planning and control component prolongs the lifespan of the batteries in the buoys, so it has a positive impact on sustainable development. The novelty and advantage of the proposed approach lie in establishing an ANFIS-based model that predicts and controls solar energy in a buoy under the different lighting and temperature conditions of the four seasons of the year and decides when to transfer the collected data. The energy planning and consumption system for the wireless sensor network of buoys is carefully evaluated, and its prototype is developed. The proposed approach can be used in practice for environmental monitoring, providing stakeholders with relevant and timely information for sound decision-making about hydro-meteorological situations in coastal marine water.
Pub. online: 9 Dec 2021
Type: Research Article
Open Access
Journal: Informatica
Volume 32, Issue 4 (2021), pp. 817–847
Abstract
A method for counterfactual explanation of machine learning survival models is proposed. One of the difficulties of solving the counterfactual explanation problem is that the classes of examples are implicitly defined through outcomes of a machine learning survival model in the form of survival functions. A condition that establishes the difference between survival functions of the original example and the counterfactual is introduced. This condition is based on using a distance between mean times to event. It is shown that the counterfactual explanation problem can be reduced to a standard convex optimization problem with linear constraints when the explained black-box model is the Cox model. For other black-box models, it is proposed to apply the well-known Particle Swarm Optimization algorithm. Numerical experiments with real and synthetic data demonstrate the proposed method.
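The condition on mean times to event can be illustrated with a small sketch. It relies on the standard identity that the mean time to event equals the area under the survival function, and it assumes a piecewise-constant survival function restricted to a finite time grid. The exact form of the condition in the paper is not given in the abstract, so the absolute-difference test and the threshold `r` below are assumptions.

```python
def mean_time_to_event(times, survival):
    """Area under a piecewise-constant survival function.

    times    : increasing grid t_0 < t_1 < ... < t_m
    survival : values S_0, ..., S_{m-1}, where S_i holds on [t_i, t_{i+1})
    """
    if len(times) != len(survival) + 1:
        raise ValueError("times must have exactly one more entry than survival")
    return sum(s * (t1 - t0) for s, t0, t1 in zip(survival, times, times[1:]))


def differs_enough(mean_original, mean_candidate, r):
    """Assumed condition: the two mean times to event differ by at least r."""
    return abs(mean_candidate - mean_original) >= r
```

A candidate counterfactual would be accepted only if its predicted survival function yields a mean time to event sufficiently far from that of the original example, while staying as close as possible to the original in feature space.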