Informatica

Machine Learning Based Classification of Colorectal Cancer Tumour Tissue in Whole-Slide Images
Volume 29, Issue 1 (2018), pp. 75–90
Mindaugas Morkūnas   Povilas Treigys   Jolita Bernatavičienė   Arvydas Laurinavičius   Gražina Korvel  

https://doi.org/10.15388/Informatica.2018.158
Pub. online: 1 January 2018      Type: Research Article      Open Access

Received: 1 September 2017
Accepted: 1 February 2018
Published: 1 January 2018

Abstract

The recent introduction of whole-slide scanning systems has enabled the accumulation of high-quality pathology images into large collections, opening new perspectives in cancer research as well as new analysis challenges. Automated identification of tumour tissue in a whole-slide image enables further use of established grading systems that classify tumour cell abnormalities and predict tumour development. In this article, we describe several ways to achieve epithelium-stroma classification of tumour tissue in digital pathology images by employing annotated superpixels to train machine learning algorithms. We emphasize that annotating superpixels, rather than manually outlining tissue classes in raw images, is a less time-consuming and more effective way of producing ground truth for computational pathology pipelines. In our approach, the feature space for supervised learning is created from superpixels with assigned tissue classes by extracting colour and texture parameters and applying dimensionality reduction methods. Alternatively, to train a convolutional neural network, labelled superpixels are used to generate square image patches by moving a fixed-size window around each superpixel centroid. The proposed method simplifies the collection of ground truth data and should minimize the time a skilled expert spends on manual annotation of whole-slide images. We evaluate our method on a private data set of colorectal cancer images. The results obtained confirm that the method produces accurate reference data suitable for use with different machine learning based classification algorithms.
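
The patch-generation step mentioned in the abstract (square patches cut by moving a fixed-size window around each superpixel centroid) could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the function names `superpixel_centroids` and `patches_around_centroids`, the `size` and `offsets` parameters, and the regular grid used in place of a real superpixel segmentation are all hypothetical. The abstract does not specify the window size or how far the window is moved.

```python
import numpy as np


def superpixel_centroids(labels):
    """Map each superpixel label to the (row, col) centroid of its pixels."""
    centroids = {}
    for lab in np.unique(labels):
        rows, cols = np.nonzero(labels == lab)
        centroids[int(lab)] = (int(round(rows.mean())), int(round(cols.mean())))
    return centroids


def patches_around_centroids(image, labels, annotations, size=32, offsets=(0,)):
    """Cut size x size patches around the centroid of each annotated superpixel.

    image:       H x W x C array (the slide region).
    labels:      H x W integer superpixel label map.
    annotations: {superpixel label: tissue class}, the expert's annotation.
    offsets:     pixel shifts applied to the window in both directions
                 (assumed values; the window is "moved around" the centroid).
    Windows that would leave the image are skipped.
    """
    half = size // 2
    height, width = labels.shape
    centroids = superpixel_centroids(labels)
    patches, classes = [], []
    for lab, tissue_class in annotations.items():
        r0, c0 = centroids[lab]
        for dr in offsets:
            for dc in offsets:
                top, left = r0 + dr - half, c0 + dc - half
                if top < 0 or left < 0 or top + size > height or left + size > width:
                    continue  # window does not fit inside the image
                patches.append(image[top:top + size, left:left + size])
                classes.append(tissue_class)
    if not patches:
        empty = np.empty((0, size, size) + image.shape[2:], dtype=image.dtype)
        return empty, np.array(classes)
    return np.stack(patches), np.array(classes)


# Toy usage: a 64 x 64 image split into four square "superpixels".
# A real pipeline would take the label map from a superpixel algorithm.
demo_image = np.zeros((64, 64, 3), dtype=np.uint8)
demo_grid = np.arange(64) // 32
demo_labels = demo_grid[:, None] * 2 + demo_grid[None, :]
demo_patches, demo_classes = patches_around_centroids(
    demo_image, demo_labels, {0: "epithelium", 3: "stroma"},
    size=16, offsets=(-4, 0, 4),
)
print(demo_patches.shape, sorted(set(demo_classes)))
```

Each annotated superpixel yields one patch per window position, and every patch inherits the tissue class of its superpixel, so a single annotation click produces several training samples.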

Biographies

Morkūnas Mindaugas
mindaugas.morkunas@mii.vu.lt

M. Morkūnas graduated from the Vilnius Gediminas Technical University, Lithuania, in 2002. In 2016 he started PhD studies in informatics engineering at the Institute of Data Science and Digital Technologies, Vilnius University, Lithuania. His interests include bioinformatics, cancer biology, image analysis, machine learning, and artificial neural networks.

Treigys Povilas
povilas.treigys@mii.vu.lt

P. Treigys graduated from the Vilnius Gediminas Technical University, Lithuania, in 2005. In 2010 he received the doctoral degree in computer science (PhD) from the Institute of Mathematics and Informatics jointly with Vilnius Gediminas Technical University. He is a member of the Lithuanian Society for Biomedical Engineering. His interests include image analysis, object detection and feature extraction in image processing, automated segmentation of image objects, optimization methods, artificial neural networks, and software engineering.

Bernatavičienė Jolita
jolita.bernataviciene@mii.vu.lt

J. Bernatavičienė graduated from the Vilnius Pedagogical University in 2004 and received a master’s degree in informatics. In 2008, she received the doctoral degree in computer science (PhD) from the Institute of Mathematics and Informatics jointly with Vilnius Gediminas Technical University. She is a researcher at the Cognitive Computing Group of Vilnius University, Institute of Data Science and Digital Technologies. Her research interests include databases, data mining, neural networks, image analysis, visualization, decision support systems, and Internet technologies.

Laurinavičius Arvydas
arvydas.laurinavicius@vpc.lt

A. Laurinavičius, MD, PhD, is a full-time professor at Vilnius University, Department of Pathology, Forensic Medicine and Pharmacology; director and consultant pathologist at the National Center of Pathology; and chair and board member of multiple international professional societies. Fields of interest: renal pathology, digital pathology image analysis, pathology informatics, health information systems, standards, testing of cancer biomarkers in tissue, and multi-resolution analysis of biomarkers.

Korvel Gražina
grazina.korvel@mii.vu.lt

G. Korvel received her BS degree in mathematics and MS degree in informatics (with honors) from the Lithuanian University of Educational Sciences in 2007 and 2009, respectively. She received the doctoral degree from Vilnius University Institute of Data Science and Digital Technologies (formerly the Institute of Mathematics and Informatics) in 2013. She currently works at this institution. Her research interests include speech signal processing, development of mathematical models, and applications of soft computing and computational intelligence.



Copyright
© 2018 Vilnius University
Open access article under the CC BY license.

Keywords
tumour, whole-slide image, machine learning, superpixel, ground truth, colour and texture features, convolutional neural network

INFORMATICA

  • Online ISSN: 1822-8844
  • Print ISSN: 0868-4952