Informatica


Anti-Money Laundering Compliance Using Feature Engineering with SQL Analytics, TF-IDF and Oversampling: Conditional Tabular Generative Adversarial Networks
Anca Ioana Andreescu, Simona-Vasilica Oprea, Alin Gabriel Văduva, Adela Bâra

https://doi.org/10.15388/25-INFOR598
Pub. online: 25 June 2025      Type: Research Article      Open Access

Received: 1 March 2025
Accepted: 1 June 2025
Published: 25 June 2025

Abstract

Traditional Anti-Money Laundering (AML) systems rely on rule-based approaches, which often fail to adapt to evolving money laundering tactics and produce high false-positive rates, overwhelming compliance teams. This study proposes an innovative machine learning (ML) framework that leverages Conditional Tabular Generative Adversarial Networks (CTGANs) to address severe class imbalance, a common challenge in Suspicious Activity Reporting (SAR). Implemented in Python, CTGAN generates realistic synthetic samples to enhance minority-class representation, improving recall and F1-scores. For instance, the Random Forest (RF) model achieves a recall of 0.991 and an F1-score of 0.528 in oversampled datasets with engineered variables, highlighting the effectiveness of CTGAN in mitigating imbalance. This framework also incorporates SQL-based feature engineering using Oracle Analytics, creating dynamic variables such as cumulative sums, rolling averages, and ranks. The modelling phase and exploratory data analysis are conducted in the SAS programming language, employing Logistic Regression (LR) as baseline, Decision Trees (DT), and RF. Evaluation across undersampled and oversampled datasets, combined with varying probability thresholds, reveals key trade-offs between sensitivity and precision. Among the models, RF consistently achieves the highest ROC-AUC scores, ranging from 0.945 in undersampled datasets to 0.951 in oversampled configurations, demonstrating its robustness and accuracy in SAR detection. By integrating CTGAN and TF-IDF (textual feature transformation in Python) with SQL-engineered variables, this framework provides a comprehensive data-driven approach to AML. It reduces false positives, strengthens the detection of suspicious activities, and ensures scalability, adaptability, and compliance with regulatory standards.
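
The abstract reports recall and F1-score for the RF model but not precision. As a rough illustration of the sensitivity/precision trade-off, the sketch below back-solves precision from the two reported figures using the standard definition F1 = 2PR / (P + R). The function names are illustrative and not taken from the paper, and the result is only as accurate as the rounded values quoted above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def implied_precision(recall: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    denom = 2.0 * recall - f1
    if denom <= 0:
        raise ValueError("no positive precision is consistent with these values")
    return f1 * recall / denom


# Figures quoted in the abstract for RF on the oversampled dataset
# with engineered variables (assumed to be rounded to three decimals).
precision = implied_precision(recall=0.991, f1=0.528)
print(f"implied precision: {precision:.3f}")
```

Under these assumptions the implied precision is about 0.36: at this threshold the model catches nearly every suspicious case, while roughly one in three alerts is a true positive.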

Biographies

Andreescu Anca Ioana
https://orcid.org/0000-0003-0086-6608
anca.andreescu@ie.ase.ro

A.I. Andreescu graduated from the Faculty of Cybernetics, Statistics and Economic Informatics of the Academy of Economic Studies in 2001. She received her PhD in economics, specializing in economic informatics, in 2009. She is currently an associate professor in the Department of Economic Informatics and Cybernetics of the Bucharest University of Economic Studies. Her research interests in computer science include requirements engineering, business analytics, modelling languages, business rules approaches and software development methodologies.

Oprea Simona-Vasilica
https://orcid.org/0000-0002-9005-5181
simona.oprea@csie.ase.ro

S.-V. Oprea received her MSc degree through the Infrastructure Management Program from Yokohama National University, Japan, in 2007, her first PhD degree in power system engineering from the Bucharest Polytechnic University in 2009, and her second PhD degree in economic informatics from the Bucharest University of Economic Studies in 2017. She is currently a professor in the Faculty of Cybernetics, Statistics, and Economic Informatics at the Bucharest Academy of Economic Studies, where she is involved in several research projects.

Văduva Alin Gabriel
https://orcid.org/0009-0008-1825-4945
alin.vaduva@csie.ase.ro

A.-G. Văduva earned his bachelor’s degree in economic informatics in 2022 and his master’s degree in databases – Support for Business in 2024. He is currently pursuing a PhD, focusing on the trustworthiness of artificial intelligence algorithms in business. Professionally, he works as an artificial intelligence engineer. His research interests include mathematics, machine learning, data mining, deep learning, and generative AI.

Bâra Adela
https://orcid.org/0000-0002-0961-352X
bara.adela@ie.ase.ro

A. Bâra graduated from the Faculty of Economic Cybernetics in 2002 and has held a PhD in economics since 2007. She is a professor in the Economic Informatics Department at the Faculty of Cybernetics, Statistics and Economic Informatics of the Bucharest University of Economic Studies and has coordinated three R&D projects. Her research interests focus on data science, analytics, databases, IoT, big data, data mining, and power systems. She has authored more than 70 papers in international journals and conferences.



Copyright
© 2025 Vilnius University
Open access article under the CC BY license.

Keywords
anti-money laundering, synthetic data generation, SAS-Python, SQL analytics, TF-IDF


INFORMATICA

  • Online ISSN: 1822-8844
  • Print ISSN: 0868-4952
  • Copyright © 2023 Vilnius University
