Journal: Informatica
Volume 32, Issue 3 (2021), pp. 441–475
Abstract
This paper is devoted to the problem of class imbalance in machine learning, focusing on the detection of rare attack classes in computer network intrusion detection. The class imbalance problem occurs when examples of one class heavily outnumber those of the other classes. We are particularly interested in classifiers, as pattern recognition and anomaly detection can be solved as classification problems. Since the major part of traffic in any organization's network is benign and malicious traffic is rare, researchers have to deal with a class imbalance problem. Substantial research has been undertaken to identify the methods or data features that allow such attacks to be identified accurately. The usual tactic for dealing with class imbalance, however, is to label all malicious traffic as one class and then solve a binary classification problem. In this paper we choose not to group or drop rare classes, but instead investigate what can be done to achieve good multi-class classification performance. Rare class records were up-sampled using the SMOTE method (Chawla et al., 2002) to preset ratio targets. Experiments with three network traffic datasets, namely CIC-IDS2017, CSE-CIC-IDS2018 (Sharafaldin et al., 2018) and LITNET-2020 (Damasevicius et al., 2020), were performed with the aim of achieving reliable recognition of the rare malicious classes present in these datasets.
Popular machine learning algorithms were chosen and compared for their readiness to support rare class detection. The relevant algorithm hyperparameters were tuned over a wide range of values, different feature selection methods were used, and tests were executed with and without over-sampling to evaluate multi-class classification performance on the rare classes.
Ranking of the machine learning algorithms based on precision, balanced accuracy score, $\bar{G}$, and the bias and variance decomposition of the prediction error shows that decision tree ensembles (AdaBoost, Random Forest and Gradient Boosting classifiers) performed best on the network intrusion datasets used in this research.
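As a hedged illustration of the up-sampling step described above, the sketch below shows how rare classes can be raised to preset per-class targets with SMOTE, assuming the imbalanced-learn and scikit-learn libraries; the synthetic data, target counts and Random Forest baseline are illustrative stand-ins, not the exact datasets, ratios or tuning used in the paper.

```python
# A minimal sketch of up-sampling rare classes to preset ratio targets with SMOTE.
# Dataset, target counts and the Random Forest baseline are illustrative only.
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an imbalanced flow dataset with two rare attack classes.
X, y = make_classification(n_samples=20000, n_classes=3, n_informative=8,
                           weights=[0.94, 0.05, 0.01], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
print("class counts before:", Counter(y_tr))

# Up-sample only the rare classes to preset per-class targets (hypothetical ratios).
targets = {1: 3000, 2: 3000}
X_res, y_res = SMOTE(sampling_strategy=targets, random_state=0).fit_resample(X_tr, y_tr)
print("class counts after:", Counter(y_res))

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_res, y_res)
print("balanced accuracy:", balanced_accuracy_score(y_te, clf.predict(X_te)))
```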
Journal: Informatica
Volume 32, Issue 3 (2021), pp. 477–498
Abstract
This work compares different algorithms to replace the genetic optimizer used in a recent methodology for creating realistic and computationally efficient neuron models. That method focuses on single-neuron processing and has been applied to cerebellar granule cells. It relies on the adaptive exponential integrate-and-fire (AdEx) model, whose parameters must be adjusted to experimental data. The alternatives considered are: i) a memetic extension of the original genetic method, ii) Differential Evolution, iii) Teaching-Learning-Based Optimization, and iv) a local optimizer within a multi-start procedure. All of them ultimately outperform the original method, and the last two do so in all the scenarios considered.
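For illustration, a minimal sketch of the Differential Evolution alternative is given below, assuming SciPy's differential_evolution; the objective is a toy least-squares surrogate standing in for the AdEx fitting error, since reproducing the full neuron simulation is beyond the scope of an abstract.

```python
# A minimal Differential Evolution sketch with a toy least-squares objective
# standing in for the AdEx model-fitting error used in the paper.
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
t = np.linspace(0.0, 0.1, 200)                      # 100 ms of "recording"
true_params = (0.2, 30.0)                           # (amplitude, rate), illustrative
target = true_params[0] * (1.0 - np.exp(-true_params[1] * t))
target += rng.normal(scale=0.005, size=t.size)      # measurement noise

def fitting_error(params):
    """Sum of squared errors between a candidate response and the target trace."""
    a, k = params
    model = a * (1.0 - np.exp(-k * t))
    return float(np.sum((model - target) ** 2))

bounds = [(0.0, 1.0), (1.0, 100.0)]                 # search range per parameter
result = differential_evolution(fitting_error, bounds, seed=0,
                                maxiter=200, popsize=20, tol=1e-8)
print("best parameters:", result.x, "error:", result.fun)
```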
Journal: Informatica
Volume 32, Issue 3 (2021), pp. 499–516
Abstract
The main objective of the present paper is to report two studies on mathematical and computational techniques used to model the behaviour of the aorta in the human cardiovascular system. An account of the design and implementation of two distinct models is presented: a Windkessel model and an agent-based model. The Windkessel model represents the left heart and the arterial system in the physiological domain. The agent-based model offers a simplified account of arterial behaviour by randomly generating arterial parameter values. This study describes the mechanism of how and when the left heart contracts and pumps blood out through the aorta, and it takes the Windkessel model one step further. The results show that the dynamics of the aorta can be explored with each of the modelling approaches proposed and implemented by our research group. It is thought that this study will contribute to the literature in terms of the development of the Windkessel model by considering its timing and redesigning it from a digital electronics perspective.
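To make the modelling concrete, the sketch below implements a standard two-element Windkessel model ($C\,dP/dt = Q(t) - P/R$), assuming SciPy's ODE solver; the inflow waveform and parameter values are illustrative and do not reproduce the left-heart timing logic or the agent-based component described in the paper.

```python
# A minimal two-element Windkessel sketch: C dP/dt = Q(t) - P/R.
# Parameter values and the half-sine inflow are illustrative only.
import numpy as np
from scipy.integrate import solve_ivp

R = 1.0    # peripheral resistance (mmHg*s/mL), illustrative
C = 1.5    # arterial compliance (mL/mmHg), illustrative
T = 0.8    # cardiac cycle length (s)

def inflow(t):
    """Half-sine aortic inflow during systole (first 0.3 s), zero in diastole."""
    phase = t % T
    return 400.0 * np.sin(np.pi * phase / 0.3) if phase < 0.3 else 0.0

def windkessel(t, p):
    # dP/dt = (Q(t) - P/R) / C
    return [(inflow(t) - p[0] / R) / C]

sol = solve_ivp(windkessel, (0.0, 5 * T), [80.0], max_step=1e-3)
print("aortic pressure over 5 beats: %.1f to %.1f mmHg"
      % (sol.y[0].min(), sol.y[0].max()))
```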
Pub. online: 26 Aug 2021. Type: Research Article. Open Access.
Journal: Informatica
Volume 32, Issue 3 (2021), pp. 517–542
Abstract
A state of emergency affects many areas of our life, including education. Due to school closures during the COVID-19 pandemic, a case of a long-term emergency, education has been moved into a remote mode. In order to determine the factors driving the acceptance of distance learning technologies and ensuring sustainable education, a model based on the Unified Theory of Acceptance and Use of Technology was proposed and empirically validated with data collected from 550 in-service primary school teachers in Lithuania. The structural equation modelling technique with multi-group analysis was used to analyse the data. The results show that performance expectancy, social influence, technology anxiety, effort expectancy, work engagement, and trust are factors that significantly affect teachers’ behavioural intention to use distance learning technologies. The relationships in the model are moderated by pandemic anxiety and the age of teachers. The results of this study provide important implications for education institutions, policy makers and designers: the predictors of intention to use distance learning technologies observed during the emergency period may serve as factors to be strengthened in teachers’ professional development, and the applicability of the findings extends beyond the pandemic isolation period.
Journal: Informatica
Volume 32, Issue 3 (2021), pp. 543–564
Abstract
As an extension of intuitionistic fuzzy sets, picture fuzzy sets can deal with vague, uncertain, incomplete and inconsistent information. The similarity measure is an important technique for distinguishing two objects. In this study, a similarity measure between picture fuzzy sets based on a relationship matrix is proposed. The new similarity measure satisfies the axiomatic definition of a similarity measure, and a numerical experiment demonstrates that it is more effective. Finally, we apply the proposed similarity measure to multiple-attribute decision making.
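To illustrate the underlying data structure, the sketch below encodes picture fuzzy sets as (membership, neutrality, non-membership) triples and computes a standard normalized-Hamming-distance similarity; this textbook measure is shown only for orientation and is not the relationship-matrix measure proposed in the paper.

```python
# A minimal sketch of picture fuzzy sets with a normalized-Hamming-distance
# similarity; illustrative only, not the paper's relationship-matrix measure.
import numpy as np

def picture_fuzzy_set(triples):
    """Each element is (membership, neutrality, non-membership), summing to <= 1."""
    a = np.asarray(triples, dtype=float)
    assert np.all(a >= 0) and np.all(a.sum(axis=1) <= 1 + 1e-9)
    return a

def hamming_similarity(A, B):
    """1 minus the mean absolute difference over the three degrees of each element."""
    return 1.0 - np.abs(A - B).sum() / (3 * len(A))

A = picture_fuzzy_set([(0.6, 0.2, 0.1), (0.5, 0.3, 0.1), (0.7, 0.1, 0.1)])
B = picture_fuzzy_set([(0.5, 0.3, 0.1), (0.4, 0.3, 0.2), (0.6, 0.2, 0.1)])
print("similarity:", hamming_similarity(A, B))   # value in [0, 1]
```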
Journal: Informatica
Volume 32, Issue 3 (2021), pp. 565–582
Abstract
Quality function deployment (QFD) is an effective product development and management tool, which has been broadly applied in various industries to develop and improve products or services. Nonetheless, when used in real situations, the traditional QFD method shows some important weaknesses, especially in describing experts’ opinions, weighting customer requirements, and ranking engineering characteristics. In this study, a new QFD approach integrating linguistic Z-numbers and the evaluation based on distance from average solution (EDAS) method is proposed to determine the prioritization of engineering characteristics. Specifically, linguistic Z-numbers are adopted to deal with the vague evaluation information provided by experts on the relationships between customer requirements and engineering characteristics. Then, the EDAS method is extended to estimate the final priority ratings of engineering characteristics. Additionally, the stepwise weight assessment ratio analysis (SWARA) method is employed to derive the relative weights of customer requirements. Finally, a practical case of Panda shared car design is introduced and a comparison is conducted to verify the feasibility and effectiveness of the proposed QFD approach. The results show that the proposed linguistic Z-EDAS method can not only represent experts’ interrelation evaluation information flexibly, but also produce a more reasonable and reliable prioritization of engineering characteristics in QFD.
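For orientation, the sketch below walks through the classic crisp EDAS steps (average solution, positive and negative distances from it, weighted and normalized appraisal scores); the decision matrix and weights are illustrative, and the paper's contribution lies in extending these steps to linguistic Z-numbers.

```python
# A minimal sketch of the classic crisp EDAS ranking steps for benefit criteria;
# the decision matrix and weights are illustrative placeholders.
import numpy as np

X = np.array([[7.0, 6.0, 8.0],      # alternatives (rows) x benefit criteria (cols)
              [5.0, 8.0, 6.0],
              [6.0, 7.0, 9.0]])
w = np.array([0.4, 0.35, 0.25])     # criteria weights (e.g. derived with SWARA)

AV = X.mean(axis=0)                              # average solution per criterion
PDA = np.maximum(0.0, X - AV) / AV               # positive distance from average
NDA = np.maximum(0.0, AV - X) / AV               # negative distance from average
SP, SN = PDA @ w, NDA @ w                        # weighted sums of distances
NSP, NSN = SP / SP.max(), 1.0 - SN / SN.max()    # normalized scores
AS = (NSP + NSN) / 2.0                           # appraisal score in [0, 1]
print("ranking (best first):", np.argsort(-AS))
```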
Journal: Informatica
Volume 32, Issue 3 (2021), pp. 583–618
Abstract
Policy-makers are often hesitant to invest in unproven solutions because they lack a decision-making framework for managing innovations as a portfolio of investments that balances risk and return, especially in the field of developing new technologies. This study provides a new portfolio matrix that supports policy-makers in identifying IoT applications in the agriculture sector for future investment, based on two dimensions, sustainable development as return and IoT challenges as risk, using a novel MADM approach. To this end, the applications of IoT in the agriculture sector were identified and grouped into eight areas using the meta-synthesis method. The authors extracted a set of criteria from the literature and finalized it with the fuzzy Delphi method. They then extended the SWARA method with interval-valued triangular fuzzy numbers (IVTFN SWARA) and used it to weight the criteria. The alternatives were subsequently rated using the Additive Ratio Assessment (ARAS) method based on interval-valued triangular fuzzy numbers (IVTFN ARAS). Finally, decision-makers evaluated the rating results along the two dimensions of sustainability and IoT challenge by developing a framework for decision making. The results of this paper show that, with a portfolio approach, policy-makers can manage IoT innovations in a disciplined way that balances risk and return; at the same time, the proposed framework can be used to determine and prioritise the areas of IoT application in the agriculture sector.
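As a point of reference, the sketch below shows the classic crisp SWARA weighting procedure; the criteria names and comparative-importance values are hypothetical, and the paper works with its interval-valued triangular fuzzy extension (IVTFN SWARA) rather than this crisp form.

```python
# A minimal sketch of crisp SWARA weighting; criteria and s_j values are hypothetical.
import numpy as np

# Criteria already sorted from most to least important by the expert.
criteria = ["sustainability impact", "cost", "technical maturity", "security risk"]
s = [0.0, 0.30, 0.20, 0.15]   # s_1 = 0; s_j = comparative importance of c_j vs c_{j-1}

k = 1.0 + np.asarray(s)       # k_j = s_j + 1 (so k_1 = 1)
q = np.cumprod(1.0 / k)       # recalculated weights: q_1 = 1, q_j = q_{j-1} / k_j
w = q / q.sum()               # final SWARA weights, summing to 1
for name, weight in zip(criteria, w):
    print(f"{name}: {weight:.3f}")
```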
Pub. online: 2 Jun 2021. Type: Research Article. Open Access.
Journal: Informatica
Volume 32, Issue 3 (2021), pp. 619–660
Abstract
Code repositories contain valuable information, which can be extracted, processed and synthesized into useful insights. This enables developers to improve maintenance, increase code quality and understand software evolution, among other benefits. Considerable research has been carried out in this field in recent years. This paper presents a systematic mapping study to find, evaluate and investigate the mechanisms, methods and techniques used for the analysis of information from code repositories that allow the evolution of software to be understood. Through this mapping study, we have identified the main information used as input for the analysis of code repositories (commit data and source code), as well as the most common methods and techniques of analysis (empirical/experimental and automatic). We believe the conducted research is useful for developers working on software development projects who seek to improve maintenance and understand the evolution of software through the use and analysis of code repositories.