Counterfactual Explanation of Machine Learning Survival Models
Volume 32, Issue 4 (2021), pp. 817–847
Pub. online: 9 December 2021
Type: Research Article
Open Access
Received: 1 December 2020
Accepted: 1 December 2021
Published: 9 December 2021
Abstract
A method for counterfactual explanation of machine learning survival models is proposed. One difficulty of the counterfactual explanation problem in survival analysis is that the classes of examples are defined only implicitly, through the outcomes of a machine learning survival model in the form of survival functions. A condition establishing the difference between the survival functions of the original example and the counterfactual is therefore introduced; it is based on the distance between their mean times to event. It is shown that, when the explained black-box model is the Cox model, the counterfactual explanation problem reduces to a standard convex optimization problem with linear constraints. For other black-box models, the well-known Particle Swarm Optimization algorithm is applied. Numerical experiments with real and synthetic data demonstrate the proposed method.
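
The abstract describes the method only at a high level, so the following is a minimal, hypothetical sketch (not the authors' implementation) of how the counterfactual search for a general black-box survival model could be set up with Particle Swarm Optimization. It assumes the black box exposes a function predict_survival(x, times) returning the survival function S(t) on a fixed time grid; the margin r, the penalty weight and the box bounds are illustrative parameters. The counterfactual is sought as the example closest to the original one whose mean time to event, computed as the area under the predicted survival function, differs from the original mean time to event by at least r.

import numpy as np

def mean_time_to_event(surv, times):
    # Approximate E[T] as the area under the survival function S(t)
    # evaluated on the given time grid.
    return np.trapz(surv, times)

def counterfactual_pso(x0, predict_survival, times, r=10.0, penalty=1e3,
                       bounds=(0.0, 1.0), n_particles=50, n_iter=200, seed=0):
    # Search for z close to x0 such that |E[T|z] - E[T|x0]| >= r, where
    # E[T|.] is computed from the black-box survival prediction.
    rng = np.random.default_rng(seed)
    d = x0.shape[0]
    lo, hi = bounds
    mt0 = mean_time_to_event(predict_survival(x0, times), times)

    def fitness(z):
        mt = mean_time_to_event(predict_survival(z, times), times)
        shortfall = max(0.0, r - abs(mt - mt0))   # unmet part of the margin r
        return np.linalg.norm(z - x0) + penalty * shortfall

    # plain global-best PSO with inertia and cognitive/social terms
    pos = rng.uniform(lo, hi, size=(n_particles, d))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    gbest_val = pbest_val.min()
    w, c1, c2 = 0.7, 1.5, 1.5
    for _ in range(n_iter):
        r1 = rng.random((n_particles, d))
        r2 = rng.random((n_particles, d))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([fitness(p) for p in pos])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = pos[better], vals[better]
        if vals.min() < gbest_val:
            gbest_val = vals.min()
            gbest = pos[np.argmin(vals)].copy()
    return gbest, gbest_val

For the Cox model the paper instead reduces this search to a convex optimization problem with linear constraints, so no swarm-based search is needed in that case.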