Informatica


Advancing Research Reproducibility in Machine Learning through Blockchain Technology
Volume 35, Issue 2 (2024), pp. 227–253
Ernestas Filatovas, Linas Stripinis, Francisco Orts, Remigijus Paulavičius

https://doi.org/10.15388/24-INFOR553
Pub. online: 11 April 2024; Type: Research Article; Open Access

Received: 1 February 2024
Accepted: 1 April 2024
Published: 11 April 2024

Abstract

Like other disciplines, machine learning is currently facing a reproducibility crisis that hinders the advancement of scientific research. Researchers face difficulties reproducing key results due to the lack of critical details, including the disconnection between publications and associated models, data, parameter settings, and experimental results. To promote transparency and trust in research, solutions that improve the accessibility of models and data, facilitate experiment tracking, and allow audit of experimental results are needed. Blockchain technology, characterized by its decentralization, data immutability, cryptographic hash functions, consensus algorithms, robust security measures, access control mechanisms, and innovative smart contracts, offers a compelling pathway for the development of such solutions. To address the reproducibility challenges in machine learning, we present a novel concept of a blockchain-based platform that operates on a peer-to-peer network. This network comprises organizations and researchers actively engaged in machine learning research, seamlessly integrating various machine learning research and development frameworks. To validate the viability of our proposed concept, we implemented a blockchain network using the Hyperledger Fabric infrastructure and conducted experimental simulations in several scenarios to thoroughly evaluate its effectiveness. By fostering transparency and facilitating collaboration, our proposed platform has the potential to significantly improve reproducible research in machine learning and can be adapted to other domains within artificial intelligence.
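
The abstract names data immutability and cryptographic hash functions as the blockchain properties that make experimental results auditable. The following is a minimal illustrative sketch of that idea only, not the authors' platform and not Hyperledger Fabric code: each experiment record is hashed together with the hash of the previous record, so any later change to a stored result is detectable. The function names (record_hash, append_record, verify_ledger) and the record fields (model, seed, accuracy) are hypothetical and chosen for illustration.

```python
import hashlib
import json

# Placeholder "previous hash" for the first entry (an assumption of this sketch).
GENESIS = "0" * 64


def record_hash(record: dict) -> str:
    """Return the SHA-256 hex digest of a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_record(ledger: list, experiment: dict) -> dict:
    """Append an experiment record linked to the hash of the previous entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    body = {"index": len(ledger), "prev_hash": prev_hash, "experiment": dict(experiment)}
    entry = dict(body, hash=record_hash(body))
    ledger.append(entry)
    return entry


def verify_ledger(ledger: list) -> bool:
    """Recompute every hash and check every link; False means tampering."""
    prev_hash = GENESIS
    for i, entry in enumerate(ledger):
        body = {
            "index": entry["index"],
            "prev_hash": entry["prev_hash"],
            "experiment": entry["experiment"],
        }
        if entry["index"] != i:
            return False
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != record_hash(body):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    ledger = []
    append_record(ledger, {"model": "cnn", "seed": 1, "accuracy": 0.90})
    append_record(ledger, {"model": "cnn", "seed": 2, "accuracy": 0.91})
    print(verify_ledger(ledger))  # chain is intact
    ledger[0]["experiment"]["accuracy"] = 0.99  # alter a stored result
    print(verify_ledger(ledger))  # the altered entry no longer matches its hash
```

A single in-memory list offers none of the decentralization, consensus, or access control the abstract describes; the sketch only shows why a hash-linked record of parameters and results supports after-the-fact audit.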



Copyright
© 2024 Vilnius University
Open access article under the CC BY license.

Keywords
machine learning; reproducibility; reproducible research; blockchain; distributed ledger technology; interoperability; blockchain-based platform; Hyperledger Fabric
