Journal: Informatica
Volume 36, Issue 3 (2025), pp. 657–676
Abstract
Most classification algorithms require subjective inputs or hyperparameters to be set before the classification is performed. Because each choice of input or hyperparameter values yields a different classifier, every classification algorithm in effect defines a collection of classifiers. In this work, we propose a data-driven methodology for assessing similarity in consensus agreement within such a collection of classifiers, and between two classification algorithms, conditional on the dataset of interest. The core of our approach is to account for the variability introduced by different hyperparameter values for each algorithm when making such comparisons. We address these problems by measuring similarity through consensus agreement and by applying asymmetric similarity indices based on the Jaccard coefficient. We demonstrate the proposed methodology on two publicly available datasets.
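As a rough illustration of the kind of comparison described above, the sketch below computes a symmetric Jaccard coefficient and an asymmetric variant between the consensus sets of two collections of classifiers. The abstract does not define the indices used in the paper, so everything here is an assumption made for illustration: the function names `consensus_set`, `jaccard`, and `containment` are hypothetical, the consensus set is taken to be the samples on which all classifiers in a collection agree, and the asymmetric index is taken to be the share of one consensus set that also lies in the other.

```python
from typing import Sequence, Set


def consensus_set(predictions: Sequence[Sequence[int]]) -> Set[int]:
    """Indices of samples on which every classifier in the collection
    predicts the same label (an assumed reading of 'consensus agreement')."""
    n_samples = len(predictions[0])
    return {
        i for i in range(n_samples)
        if len({labels[i] for labels in predictions}) == 1
    }


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Symmetric Jaccard coefficient |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def containment(a: Set[int], b: Set[int]) -> float:
    """Assumed asymmetric index: share of a that also lies in b."""
    return len(a & b) / len(a) if a else 1.0


# Toy data: each inner list holds one classifier's labels for four samples,
# obtained (hypothetically) from one hyperparameter setting.
algo_a = [[0, 1, 1, 0], [0, 1, 0, 0]]
algo_b = [[0, 1, 1, 1], [0, 0, 1, 0]]

cons_a = consensus_set(algo_a)
cons_b = consensus_set(algo_b)

print(jaccard(cons_a, cons_b))
print(containment(cons_a, cons_b), containment(cons_b, cons_a))
```

The two calls to `containment` generally return different values, which is the property that makes such an index asymmetric, whereas `jaccard` gives the same value in either direction.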
Pub. online: 1 Jan 2018
Type: Research Article
Open Access
Journal: Informatica
Volume 29, Issue 3 (2018), pp. 399–420
Abstract
This paper introduces a new similarity measure, derived from the Common Submatrix-based measures, for comparing square matrices. The novelty is that the similarity between two matrices is computed as the average area of the largest sub-matrices that match exactly and are located at the same position in both matrices. By contrast, in the original similarity measures, the largest sub-matrices may match exactly or approximately and may be located at different positions. An experiment conducted on a subset of the MNIST and NIST datasets shows that the new similarity measure is very promising for retrieving relevant handwritten character images.
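The idea of averaging the areas of the largest exactly matching, identically positioned sub-matrices can be sketched as follows. This is not the paper's algorithm, which the abstract does not specify. It is a minimal sketch under stated assumptions: sub-matrices are taken to be square, each one is anchored by its top-left corner, and the average is taken over all n × n positions without further normalization. The function name `avg_matching_area` is hypothetical.

```python
from typing import List, Sequence


def avg_matching_area(a: Sequence[Sequence[int]],
                      b: Sequence[Sequence[int]]) -> float:
    """Average, over all positions (i, j), of the area of the largest square
    block with top-left corner (i, j) that is identical in a and b.

    Assumes a and b are square matrices of the same size n x n.
    """
    n = len(a)
    # side[i][j] = side length of the largest matching square anchored at
    # (i, j); an extra row and column of zeros handles the borders.
    side: List[List[int]] = [[0] * (n + 1) for _ in range(n + 1)]
    total_area = 0
    # Sweep from the bottom-right corner so the three neighbours that the
    # recurrence needs are already filled in.
    for i in range(n - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i][j] == b[i][j]:
                side[i][j] = 1 + min(side[i + 1][j],
                                     side[i][j + 1],
                                     side[i + 1][j + 1])
                total_area += side[i][j] ** 2
    return total_area / (n * n)


x = [[1, 2], [3, 4]]
y = [[1, 2], [3, 5]]

print(avg_matching_area(x, x))  # identical matrices
print(avg_matching_area(x, y))  # one mismatching cell
```

The dynamic-programming recurrence keeps the cost at one pass over the n × n cells, rather than testing every candidate block at every position.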