Journal:Informatica
Volume 18, Issue 3 (2007), pp. 343–362
Abstract
One of the tasks of data mining is classification, which provides a mapping from attributes (observations) to pre-specified classes. Classification models are built from the underlying data. In principle, models built with more data yield better results. However, the relationship between the amount of available data and performance is not well understood, except that the accuracy of a classification model shows diminishing improvements as the data size grows. In this paper, we present an approach for an early assessment of the extracted knowledge (classification models) in terms of performance (accuracy), based on the amount of data used. The assessment is based on observing the performance on smaller sample sizes. The solution is formally defined, and experiments show the correctness and utility of the approach.
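The idea of assessing performance from smaller samples can be illustrated with a learning-curve fit. The sketch below is not the method defined in the paper, whose details the abstract does not give. It assumes an inverse power law, acc(n) = a - b * n^(-c), which is one common way to model diminishing improvements. The function names `fit_learning_curve` and `predict_accuracy` are hypothetical, and the accuracies are synthetic values generated for illustration, not results from the paper.

```python
def fit_learning_curve(sizes, accuracies):
    """Fit acc(n) = a - b * n**(-c) to observed accuracies.

    Assumed model, not taken from the paper. The exponent c is chosen by
    grid search; for each c, a and b follow from ordinary least squares,
    because the model is linear in x = n**(-c).
    """
    best = None
    for step in range(1, 201):
        c = step / 100.0  # candidate exponents 0.01 .. 2.00
        xs = [n ** (-c) for n in sizes]
        m = len(xs)
        mean_x = sum(xs) / m
        mean_y = sum(accuracies) / m
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracies))
        slope = sxy / sxx
        a = mean_y - slope * mean_x  # asymptotic accuracy for this c
        b = -slope
        sse = sum((a - b * x - y) ** 2 for x, y in zip(xs, accuracies))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    _, a, b, c = best
    return a, b, c


def predict_accuracy(params, n):
    """Extrapolate the fitted curve to a sample size n."""
    a, b, c = params
    return a - b * n ** (-c)


# Synthetic accuracies measured on small samples (illustrative values only).
sizes = [50, 100, 200, 400, 800]
observed = [0.9 - 0.8 * n ** -0.5 for n in sizes]

params = fit_learning_curve(sizes, observed)
estimate = predict_accuracy(params, 10000)  # early estimate for a larger data set
```

A grid search over the exponent keeps the sketch dependency-free. With real, noisy measurements a nonlinear least-squares routine and a confidence interval around the extrapolated value would be more appropriate.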
Journal:Informatica
Volume 12, Issue 3 (2001), pp. 455–468
Abstract
This paper describes a preliminary algorithm for epilepsy prediction by means of visual perception tests and analysis of digital electroencephalograph data. A special machine learning algorithm and a signal processing method are used. The algorithm is tested on real data from epileptic and healthy persons treated at Kaunas Medical University Clinics, Lithuania. A detailed examination of the results shows that computerized visual perception testing and automated data analysis could be used for diagnosing brain damage.
Journal:Informatica
Volume 12, Issue 1 (2001), pp. 109–118
Abstract
This paper considers a technique for constructing a general decision rule for the contradictory expert classification of objects that are described by many qualitative attributes. The approach is based on the theory of multiset metric spaces. It makes it possible to classify a collection of multi-attribute objects and to define a classification rule that approximates the set of individual sorting rules.
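To make the notion of a multiset metric concrete, the sketch below represents each object as a multiset of the attribute values that several experts assigned to it, and measures the distance between two objects as the size of the symmetric difference of their multisets. The abstract does not state which metric the paper uses, so this choice, the function name `multiset_distance`, and the sample attribute values are assumptions made for illustration.

```python
from collections import Counter


def multiset_distance(a: Counter, b: Counter) -> int:
    """Size of the symmetric difference of two multisets.

    Counter subtraction keeps only positive counts, so (a - b) + (b - a)
    holds |count_a - count_b| for every element. This is one possible
    metric on multisets, assumed here for illustration.
    """
    symmetric_difference = (a - b) + (b - a)
    return sum(symmetric_difference.values())


# Three experts grade each object; contradictory grades are all retained.
object_1 = Counter({"quality:high": 2, "quality:low": 1})
object_2 = Counter({"quality:high": 1, "quality:low": 2})

distance = multiset_distance(object_1, object_2)
same = multiset_distance(object_1, object_1)
```

Keeping every expert's grade in the multiset, rather than averaging or voting, preserves the contradictions between experts, which is what lets a distance-based rule approximate the set of individual sorting rules.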
Journal:Informatica
Volume 11, Issue 2 (2000), pp. 115–124
Abstract
The influence of projection pursuit on classification errors and on estimates of a posteriori probabilities obtained from the sample is considered. The observed random variable is assumed to follow a multidimensional Gaussian mixture model. The presented computer simulation results show that, for comparatively small sample sizes, classification using the projection pursuit algorithm gives more accurate estimates of a posteriori probabilities and a lower classification error.
Journal:Informatica
Volume 8, Issue 1 (1997), pp. 139–152
Abstract
ProObj is a Prolog-based system for knowledge representation that was strongly influenced by object-oriented and frame-based systems. The paper briefly describes ProObj and then presents a classification mechanism based on the ideas of classifiers in KL-ONE-like systems.
As a new and very flexible feature, we present user-directed control of the classification process. The ProObj classifier lets the user guide the classification process by excluding attributes and facets (elements of our representation formalism) from consideration during classification. This mechanism substantially improves the efficiency of the classification process. Furthermore, it allows more flexible and adequate modelling of a knowledge domain. It is possible to build a knowledge base under a particular view in which only those attributes of concepts that seem relevant to the structure of the domain hierarchy are considered for classification.