Quantifying Binary Classifier Algorithms Similarity with a Consensus Agreement Approach

Perišić, Ana; Vanbelle, Sophie; Petričević, Rafaela Brigita

doi:10.15388/25-INFOR601

Informatica

Volume 36, Issue 3 (2025), pp. 657–676

Pub. online: 12 September 2025

Type: Research Article

Open Access

Received: 1 January 2025

Accepted: 1 September 2025

Published: 12 September 2025

Abstract

Most classification algorithms depend on subjective inputs or hyperparameters that must be set before classification is performed. Each choice of input or hyperparameter values yields a different classifier, so a classification algorithm can be viewed as a collection of classifiers. In this work, we propose a data-driven methodology for assessing similarity in consensus agreement within such a collection of classifiers, and between two classification algorithms, conditional on the dataset of interest. The core of our approach is to account for the variability introduced by different hyperparameter values for each algorithm when making such comparisons. We address these problems by evaluating similarity through consensus agreement and by proposing the application of asymmetric similarity indices based on the Jaccard coefficient. We illustrate the proposed methodology on two publicly available datasets.
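The abstract does not give the exact form of the indices, so the following Python sketch is only an illustration of the general idea and should not be read as the authors' definitions. It assumes that "consensus" means the set of observations that every classifier in a collection labels as positive, and it uses a simple one-sided overlap ratio as a stand-in for an asymmetric Jaccard-type index. All function names and the toy predictions are hypothetical.

```python
from __future__ import annotations

from typing import Sequence


def consensus_positive(predictions: Sequence[Sequence[int]]) -> set[int]:
    """Indices of observations that every classifier in the collection labels 1.

    Assumption: one row per classifier (one hyperparameter setting),
    one column per observation, labels coded as 0/1.
    """
    n_obs = len(predictions[0])
    return {i for i in range(n_obs) if all(row[i] == 1 for row in predictions)}


def jaccard(a: set[int], b: set[int]) -> float:
    """Symmetric Jaccard coefficient |a & b| / |a | b|.

    Convention assumed here: two empty sets are treated as identical (1.0).
    """
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def one_sided_overlap(a: set[int], b: set[int]) -> float:
    """Share of a's consensus positives that are also consensus positives of b.

    Illustrative asymmetric index only; it generally differs when a and b
    are swapped. Returns 1.0 for an empty a by the same convention as above.
    """
    return len(a & b) / len(a) if a else 1.0


# Toy predictions: algorithm A run with three hyperparameter settings,
# algorithm B with two, on the same six observations.
algorithm_a = [
    [1, 1, 0, 1, 0, 1],
    [1, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 0, 1],
]
algorithm_b = [
    [1, 0, 0, 1, 1, 1],
    [1, 0, 1, 1, 1, 1],
]

consensus_a = consensus_positive(algorithm_a)  # {0, 1, 5}
consensus_b = consensus_positive(algorithm_b)  # {0, 3, 4, 5}

symmetric = jaccard(consensus_a, consensus_b)          # 2 / 5
a_in_b = one_sided_overlap(consensus_a, consensus_b)   # 2 / 3
b_in_a = one_sided_overlap(consensus_b, consensus_a)   # 2 / 4
```

In this toy case the two one-sided values differ (2/3 versus 1/2) while the symmetric coefficient is a single number (0.4), which is the kind of directional information an asymmetric index is meant to expose.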

Biographies

Perišić Ana

https://orcid.org/0000-0001-9180-0270

ana.perisic@pmfst.hr

sisak@vus.hr

A. Perišić received her PhD at the University of Ljubljana (Slovenia) in 2021, after completing her master’s degree in mathematics and postgraduate studies in economics at the University of Zagreb (Croatia). She currently works at the Šibenik University of Applied Sciences as a college professor, and at the Department of Mathematics, University of Split, as a postdoctoral researcher. Her research interests span a wide range of topics in the development and application of statistical methodologies, with significant contributions to modeling for customer churn prediction, clustering of mixed-type data, developing composite indicators, and the development of similarity coefficients for binary data sets. She has contributed to a number of journal articles and conference papers, and has participated in diverse research projects, including industry-academia collaborations.

Vanbelle Sophie

https://orcid.org/0000-0001-6584-2522

sophie.vanbelle@maastrichtuniversity.nl

S. Vanbelle completed a master’s degree in mathematics (ULiège, Belgium) and a master’s degree in biostatistics (UHasselt, Belgium) before obtaining her PhD at ULiège in 2009. She is currently an associate professor in the Department of Methodology & Statistics at the Faculty of Health, Medicine and Life Sciences, Maastricht University, The Netherlands. Her research focuses on the development and application of statistical methodology for reliability and agreement studies, with particular interest in complex and multilevel designs and intensive longitudinal data. She has authored numerous peer-reviewed articles and contributed to tutorials and reviews. In addition, she is actively engaged in the statistical community, including service within the Belgian Region of the International Biometric Society.

Petričević Rafaela Brigita

rafaelap98@gmail.com

R.B. Petričević graduated in mathematics from the Faculty of Science, University of Split, in 2023. As part of her master’s thesis, she worked on quantifying binary classifier algorithms similarity with a consensus agreement approach. She is currently employed at OTP Bank as a Data Warehouse Specialist, where she works on the implementation of a new data warehouse. Her responsibilities also include developing regulatory reports for the Croatian National Bank and the European Central Bank, as well as supporting data-driven decision-making within the bank.

Open access article under the CC BY license.

Keywords

similarity; binary classification; consensus agreement; Jaccard coefficient; classifier sets
