Journal:Informatica
Volume 25, Issue 1 (2014), pp. 95–111
Abstract
Data mining algorithms are now successfully applied to real-world data to provide useful suggestions. Since some of the available real-world data is multi-valued and multi-labeled, researchers have in recent years focused on developing approaches to mine such data. Unfortunately, no existing algorithm can discretize multi-valued and multi-labeled data to improve data mining performance. In this paper, we propose a novel approach to solve this problem. Our approach is based on a statistics-based discretization metric and the simulated annealing search algorithm. Experimental results show that our approach can effectively improve the performance of a state-of-the-art multi-valued and multi-labeled classification algorithm.
Journal:Informatica
Volume 19, Issue 1 (2008), pp. 135–156
Abstract
Data stream mining has become a research topic of growing interest in knowledge discovery. Most proposed algorithms for data stream mining assume that each data block is essentially a random sample from a stationary distribution, but many available databases violate this assumption. That is, the class of an instance may change over time, a phenomenon known as concept drift. In this paper, we propose a Sensitive Concept Drift Probing Decision Tree algorithm (SCRIPT), which is based on the statistical χ² test, to handle the concept drift problem on data streams. Compared with previously proposed methods, the advantages of SCRIPT are: a) it avoids unnecessary system cost for stable data streams; b) it immediately and efficiently corrects the original classifier when data streams are unstable; c) it is better suited to applications in which sensitive detection of concept drift is required.
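
As a rough illustration of how a χ² test can flag a change in class distribution between two data blocks, the following minimal sketch compares the class counts of an older block with those of a newer one. It is not the SCRIPT algorithm itself, whose details are not given in the abstract. The function names `chi_square_statistic` and `drift_detected`, the block-level comparison, and the default critical value (3.841, the χ² threshold for one degree of freedom at the 0.05 significance level, appropriate for two classes) are all assumptions made for this example.

```python
from collections import Counter


def chi_square_statistic(old_labels, new_labels):
    """Pearson chi-square statistic for a 2 x k table of class counts.

    Rows are the two data blocks, columns are the class labels.
    Returns (statistic, degrees_of_freedom). Hypothetical helper,
    not taken from the SCRIPT paper.
    """
    if not old_labels or not new_labels:
        raise ValueError("both blocks must be non-empty")
    old_counts = Counter(old_labels)
    new_counts = Counter(new_labels)
    classes = set(old_counts) | set(new_counts)
    n_old, n_new = len(old_labels), len(new_labels)
    total = n_old + n_new
    stat = 0.0
    for c in classes:
        col_total = old_counts[c] + new_counts[c]
        for observed, row_total in ((old_counts[c], n_old),
                                    (new_counts[c], n_new)):
            # Expected count if both blocks shared one class distribution.
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    return stat, len(classes) - 1


def drift_detected(old_labels, new_labels, critical_value=3.841):
    """Flag drift when the statistic exceeds the critical value.

    The default 3.841 assumes one degree of freedom (two classes) and a
    0.05 significance level; other settings need a different threshold.
    """
    stat, _ = chi_square_statistic(old_labels, new_labels)
    return stat > critical_value


stable_block = ["a"] * 50 + ["b"] * 50
shifted_block = ["a"] * 90 + ["b"] * 10
print(drift_detected(stable_block, stable_block))   # same distribution
print(drift_detected(stable_block, shifted_block))  # class balance changed
```

A stable stream yields a statistic near zero, so no correction of the classifier would be triggered, while a marked shift in class balance pushes the statistic above the threshold.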