Informatica

Quantifying Binary Classifier Algorithms Similarity with a Consensus Agreement Approach
Volume 36, Issue 3 (2025), pp. 657–676
Ana Perišić, Sophie Vanbelle, Rafaela Brigita Petričević

https://doi.org/10.15388/25-INFOR601
Pub. online: 12 September 2025      Type: Research Article      Open Access

Received: 1 January 2025
Accepted: 1 September 2025
Published: 12 September 2025

Abstract

Most classification algorithms involve subjective inputs or hyperparameters that must be set before the classification is performed. With different input or hyperparameter values, each classification algorithm therefore yields a collection of classifiers. In this work, we propose a data-driven methodology for assessing similarity in consensus agreement within such a collection of classifiers, and between two classification algorithms, conditional on the dataset of interest. The core of our approach lies in accounting for the variability introduced by different hyperparameter values for each algorithm when performing such comparisons. We address these problems by evaluating similarity through consensus agreement and by proposing the application of asymmetric similarity indices based on the Jaccard coefficient. We illustrate the proposed methodology on two publicly available datasets.
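
As an illustration of the kind of quantity involved, the following minimal Python sketch computes the ordinary pairwise Jaccard coefficient between two binary prediction vectors, together with one possible consensus-style generalisation to a collection of classifiers: the share of objects that all classifiers label positive among those that at least one classifier labels positive. This is an assumption made for illustration only and does not reproduce the asymmetric indices proposed in the article. The function names `jaccard` and `consensus_jaccard`, the toy predictions, and the convention of returning 1.0 when no classifier predicts the positive class are all hypothetical.

```python
from typing import Sequence


def jaccard(a: Sequence[int], b: Sequence[int]) -> float:
    """Pairwise Jaccard coefficient between two binary (0/1) prediction vectors."""
    if len(a) != len(b):
        raise ValueError("prediction vectors must have equal length")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumed convention: perfect similarity when neither vector has a positive.
    return both / either if either else 1.0


def consensus_jaccard(preds: Sequence[Sequence[int]]) -> float:
    """Illustrative consensus similarity for a collection of classifiers.

    Numerator: objects that every classifier labels positive.
    Denominator: objects that at least one classifier labels positive.
    """
    if not preds:
        raise ValueError("at least one prediction vector is required")
    n = len(preds[0])
    if any(len(p) != n for p in preds):
        raise ValueError("prediction vectors must have equal length")
    rows = list(zip(*preds))  # one tuple of predictions per object
    all_pos = sum(1 for r in rows if all(v == 1 for v in r))
    any_pos = sum(1 for r in rows if any(v == 1 for v in r))
    # Same assumed convention as above when no positives are predicted.
    return all_pos / any_pos if any_pos else 1.0


if __name__ == "__main__":
    # Toy predictions of three hypothetical classifiers on four objects.
    c1 = [1, 1, 0, 0]
    c2 = [1, 0, 1, 0]
    c3 = [1, 1, 1, 0]
    print(jaccard(c1, c2))                  # 1 shared positive out of 3 flagged by either
    print(consensus_jaccard([c1, c2, c3]))  # 1 unanimous positive out of 3 flagged by any
```

In this sketch, treating each hyperparameter setting of an algorithm as one row of `preds` gives a single within-collection score; a comparison between two algorithms would need a two-group index, which is not attempted here.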

Biographies

Perišić Ana
https://orcid.org/0000-0001-9180-0270
ana.perisic@pmfst.hr
sisak@vus.hr

A. Perišić received her PhD at the University of Ljubljana (Slovenia) in 2021, after completing her master’s degree in mathematics and postgraduate studies in economics at the University of Zagreb (Croatia). She currently works at the Šibenik University of Applied Sciences as a college professor, and at the Department of Mathematics, University of Split, as a postdoctoral researcher. Her research interests span a wide range of topics in the development and application of statistical methodologies, with significant contributions to modelling for customer churn prediction, clustering of mixed-type data, the development of composite indicators, and the development of similarity coefficients for binary data sets. She has contributed to a number of journal articles and conference papers, and has participated in diverse research projects, including industry-academia collaborations.

Vanbelle Sophie
https://orcid.org/0000-0001-6584-2522
sophie.vanbelle@maastrichtuniversity.nl

S. Vanbelle completed a master’s degree in mathematics (ULiège, Belgium) and a master’s degree in biostatistics (UHasselt, Belgium) before obtaining her PhD at ULiège in 2009. She is currently an associate professor in the Department of Methodology & Statistics at the Faculty of Health, Medicine and Life Sciences, Maastricht University, The Netherlands. Her research focuses on the development and application of statistical methodology for reliability and agreement studies, with particular interest in complex and multilevel designs and intensive longitudinal data. She has authored numerous peer-reviewed articles and contributed to tutorials and reviews. In addition, she is actively engaged in the statistical community, including service within the Belgian Region of the International Biometric Society.

Petričević Rafaela Brigita
rafaelap98@gmail.com

R.B. Petričević graduated in mathematics from the Faculty of Science, University of Split, in 2023. As part of her master’s thesis, she worked on quantifying binary classifier algorithms similarity with a consensus agreement approach. She is currently employed at OTP Bank as a Data Warehouse Specialist, where she works on the implementation of a new data warehouse. Her responsibilities also include developing regulatory reports for the Croatian National Bank and the European Central Bank, as well as supporting data-driven decision-making within the bank.



Copyright
© 2025 Vilnius University
Open access article under the CC BY license.

Keywords
similarity; binary classification; consensus agreement; Jaccard coefficient; classifier sets


INFORMATICA

  • Online ISSN: 1822-8844
  • Print ISSN: 0868-4952