Publication: Artificial Intelligence (AI) in Pathology – A Summary and Challenges Part 1

Abstract

This bibliographic study covers Artificial Intelligence (AI) theory and its applications in the healthcare field, and in particular in the discipline of pathology. The review includes the basics of AI, supervised and unsupervised machine learning (ML), various supervised ML algorithms, and their applications in healthcare and pathology. Digital pathology combined with deep machine learning offers advantages over traditional pathology, which is based on a physical slide under a physical microscope. However, various implementation challenges of cost, data quality, multi-center validation, bias, and regulatory approval for AI in clinical practice still remain, and these are also described in this study.

 

  1. Introduction

The main objective of this paper is to describe the evolution of Artificial Intelligence (AI) over time. The past two decades have shown tremendous progress in the application of AI in several image-based medical specialties, including radiology, dermatology, ophthalmology, and pathology. First, we explore how AI began about 65 years ago and how it has progressed in various disciplines, including healthcare/medicine and particularly pathology. Second, we review books available on AI in general as well as on AI in medicine and pathology. Next, we define the necessary AI terms and the various AI algorithms that must be understood for physicians to accept these tools and assist patients more efficiently. We then review AI literature pertinent to healthcare and pathology. Finally, we discuss the various challenges and barriers AI faces in pathology applications.

 

  2. AI theory in textbooks

In 1955, the term Artificial Intelligence (AI) was coined by McCarthy et al. for the subdivision of computer science in which machine-based methodologies are used to make predictions that imitate what human intellect might do in the identical situation.[1] The origin of Digital Pathology (DP) dates to 1966, when Prewitt et al. photographed images from a microscopic field of a blood smear and transformed the information into a matrix of optical density numbers for mechanized image analysis.[2] The AI field is built on statistics, and Vapnik provides a detailed description of statistical learning theory in his two books.[3][4] In 2003, Russell and Norvig introduced the idea of an intelligent agent that mechanically plans and performs a sequence of activities to attain a goal as a novel form of AI.[5] Goodfellow et al.'s comprehensive textbook on AI is written by some of the most innovative and prolific researchers in the field.[6] Kelleher explains how deep learning is useful in understanding big data and covers the methodologies of autoencoders, recurrent neural networks, generative adversarial networks, gradient descent, and backpropagation.[7]

 

2.1 AI books in medicine and pathology

There are many excellent textbooks on AI’s applications in medicine including notetaking, drug development, remote patient monitoring, surgery, laboratory discovery, and healthcare delivery.[8][9][10][11][12][13]

In this section our emphasis is on reviewing the latest textbooks on AI in pathology. Sucaet and Waelput, in Digital Pathology (DP), discuss how the technology has seen tremendous growth in its applications over the past decade. They observe that DP offers the hope of providing pathology consulting and educational services to underserved areas of the world that would otherwise not experience a high level of service.[14] In Artificial Intelligence and Deep Learning in Pathology, Cohen observes how recent advances in computational algorithms, and the arrival of whole slide imaging (WSI) as a platform for integrating AI, are assisting both diagnosis and prognosis by transforming pattern recognition and image interpretation. The book focuses on various AI applications in pathology and covers the important topics of WSI for 2D and 3D analysis, principles of image analysis, and deep learning.[15] Holzinger et al. in their book describe why AI and Machine Learning (ML) are very promising in the disciplines of DP, radiology, and dermatology. They observe that in some cases Deep Learning (DL) even exceeds human performance, but stress that a human expert should nevertheless always verify the outcome. The authors cover 'biobanks,' which offer large collections of high-quality, well-labeled samples, since big data is required for training and for covering a variety of diseases in different organs.[16] Belciug in his book covers theoretical concepts and practical techniques of AI and its applications in cancer management. The author describes the impactful role of AI in diagnosis and how it can help doctors make better decisions, including AI tools that help pathologists identify exact types of cancer and assist surgeons and oncologists. The book discusses over 20 cancer examples in which AI was used and, in particular, the AI algorithms utilized for them.[17]

 

  3. AI basics

In this section we cover Learning theory, important AI terminology, and algorithms for Machine Learning.

 

3.1 Learning theory and machine learning

Vapnik introduces the model of learning from examples using three components: a) a generator of random vectors, b) a supervisor that returns an output vector for each input vector, and c) a learning machine capable of implementing a set of functions. The next step is the risk minimization problem: to find the best available approximation to the supervisor's response, one measures the discrepancy between the supervisor's response to a given input and the response provided by the learning machine.[18] In 2015, Deo's review of ML found that, of the thousands of papers applying ML algorithms to medical data, only a few have contributed meaningfully to clinical care, unlike the impact ML has had in other industries.[19] Cabitza et al.'s search for the terms 'laboratory medicine/tests' and 'machine learning' found 34 papers in Scopus and three in PubMed, showing that this is a relatively new area for AI in laboratory medicine.[20]
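Formally, following [18], the learning machine selects from its set of functions f(x, α) the one minimizing the expected risk; because the joint distribution P(x, y) is unknown, the empirical risk computed on the ℓ training examples is minimized in its place:

$$R(\alpha) = \int L\big(y, f(x,\alpha)\big)\, dP(x,y), \qquad R_{\mathrm{emp}}(\alpha) = \frac{1}{\ell}\sum_{i=1}^{\ell} L\big(y_i, f(x_i,\alpha)\big),$$

where L(y, f(x, α)) is the loss function measuring the discrepancy between the supervisor's response y and the response of the learning machine.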

 

3.2 Important AI terminology

In this section we will cover important AI terminology.

    1. Machine learning (ML): Machine learning is a discipline that combines statistical modeling with computers/machines that learn from available data sets, hence its name. ML is classified into the categories of 'supervised' and 'unsupervised' learning. ML techniques are widely used in transcribing speech into text, matching news items, and identifying objects in images; in these applications, 'deep learning' (DL) techniques are widely used.
    2. Supervised learning: In supervised learning the goal is to predict a known output or target (Y) from input variables (X), using an algorithm to learn the mapping function [Y = f(X)] from labeled input data to the output. The derived mapping function can then be applied to new input data (X1) to predict its output variable (Y1). Supervised learning models are of 'classification' or 'regression' types. In the classification case the output variable is a category, e.g., disease or lack of disease; automated interpretation of a breast X-ray or an EKG using pattern recognition to select from a limited set of diagnoses is an example of supervised learning. In a regression problem the output variable is a real continuous value, e.g., temperature or blood pressure. The concepts of bias and variance, and their relationship to each other, are important determinants of the performance of supervised ML models; obtaining the most generalizable supervised ML model requires finding the right balance between bias and variance. (A minimal code sketch contrasting supervised and unsupervised learning appears after this list.)
    3. Unsupervised learning: In unsupervised learning, unlike supervised learning, there is no output variable to predict. Instead, all input data (X) are unlabeled, and the algorithms learn the inherent structure of the available input. For example, an objective could be to look for data patterns that identify novel disease mechanisms.
    4. Semi-supervised learning: In semi-supervised learning, some of the available data is labeled and the remainder is not; in this type of learning scientists use a combination of supervised and unsupervised methods. LeCun et al. describe various techniques used in ML, such as 'conventional ML' and 'deep ML.' They discuss the limitations of conventional ML in its ability to process data in raw form. In contrast, 'representation learning' permits a computer to be fed raw data and to discover the representations needed for detection or classification. 'Deep learning' methods are representation learning approaches with multiple levels of representation, obtained by composing simple but non-linear modules that transform the representation at one level into a representation at a higher, more abstract level.[21]
    5. Artificial Neural Networks (ANNs): Similar to a brain, which operates through a complex network of interconnected neurons, an artificial neural network has a set of artificial neurons that are layered and connected, with a defined pathway for how information is transmitted through the network. The ANN provides a means of reaching an output that is the outcome of many independent stages of computation and weighting.
    6. Convolutional Neural Networks (CNN or ConvNets): ConvNets are deep, feedforward networks that are easier to train and generalize better than networks with full connectivity between adjacent layers. CNNs are applicable to data that come in the form of multiple arrays, such as a color photo consisting of three 2D arrays containing the pixel intensities in the three color channels. A CNN is structured so that its first few stages are composed of two types of layers: convolutional layers and pooling layers.[22] (A minimal CNN sketch also appears after this list.)
    7. Recurrent Neural Networks (RNN): RNNs process an input sequence one element at a time while maintaining in their hidden units a 'state vector' that holds information about all the past elements of the sequence. RNNs are appropriate for sequential inputs, such as speech and language, and backpropagation is used to train them. Training RNNs can be problematic because the backpropagated gradients grow or shrink at each time step, causing them to explode or vanish over many steps.[23]
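As a concrete illustration of the supervised/unsupervised distinction above, the following minimal Python sketch (assuming scikit-learn is available; the dataset and parameter choices are illustrative assumptions, not taken from this paper) fits a classifier to labeled data and then a clustering model to the same data with the labels withheld:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Labeled data set: input features X and known output labels y.
X, y = load_breast_cancer(return_X_y=True)

# Supervised: learn the mapping Y = f(X) from labeled examples,
# then predict on data held out from training.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = LogisticRegression(max_iter=5000)
clf.fit(X_train, y_train)
print("classification accuracy:", clf.score(X_test, y_test))

# Unsupervised: ignore y entirely and look for inherent structure in X alone.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
clusters = km.fit_predict(X)  # cluster assignments derived without any labels
```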

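For the convolutional architecture described in item 6, a minimal PyTorch sketch (assumed available; the patch size, channel counts, and two-class output are illustrative assumptions) stacks the two layer types named above, convolution followed by pooling, before a fully connected classifier:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNN(nn.Module):
    """Illustrative two-stage CNN: (conv -> relu -> pool) x 2, then a classifier."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)  # 3 color channels in
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)                               # halves spatial size
        self.fc = nn.Linear(32 * 8 * 8, num_classes)              # for 32x32 inputs

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # convolutional layer + pooling layer
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)               # flatten all but the batch dimension
        return self.fc(x)

# A batch of four 32x32 RGB patches (e.g., tiles cropped from a slide image):
logits = TinyCNN()(torch.randn(4, 3, 32, 32))
print(logits.shape)  # torch.Size([4, 2])
```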
 

3.3 Algorithms for Supervised ML

The model-building phase of supervised ML includes splitting the available data into training and testing sets, training the model on the training set, and then testing it for validation (see the code sketch following this list). The following algorithms are widely used in supervised ML:

    1. Linear Regression: Linear regression models predict the target by finding the best-fitting 'least squares regression line,' the line with the smallest sum of squared errors, between the independent continuous variables (features/the cause) and the dependent continuous variable (target/the effect). Aggarwal and Ranganathan detail the pitfalls associated with this analysis.[24]
    2. Logistic Regression: Ranganathan et al. discuss logistic regression as examining the relationship of a binary outcome ('yes/no,' 'alive/dead,' 'success/failure') with one or more predictors that are either categorical or continuous. They describe the method's limitations in choosing the right predictor variables, avoiding highly correlated variables, restricting the number of variables, and handling continuous input variables.[25]
    3. Convolutional Neural Network or CNN: Neural networks attempt to model a neuron, in which certain input features are assigned appropriate mathematical weights to forecast some output objective. A deep neural network has a sizable number of node connections within its hidden layers and is most appropriate for highly complex data such as images. However, caution is required because of its tendency toward overfitting.
    4. k-Nearest Neighbor or k-NN: This nonparametric algorithm is used for data classification and regression tasks. A common heuristic sets k to the square root of the number of instances, and a new point is classified by a vote of its k nearest neighbors as measured by their distance from it.[26] Being distance-based, k-NN requires normalization of features; it works best with a smaller number of input variables and is sensitive to outliers.
    5. Support Vector Machine or SVM: The SVM algorithm classifies the available data by defining a hyperplane that best separates two groups. The separation between the two groups is maximized by growing the margin on either side of the hyperplane, and the hyperplane with the greatest possible margin is then chosen. SVMs can capture nonlinear relationships using a kernel function but have a tendency to overfit.[27]
    6. Naive Bayes: The naive Bayes approach assumes that the features under evaluation are independent of each other. For simple tasks it can produce good results, but in general its performance is inferior to that of the other ML algorithms.[28]
    7. Decision Tree and Boosted Tree: A decision tree comprises a root, nodes, branches, and leaves. A node is where a characteristic is tested, and a branch carries the result of that test. The decision tree thus provides a set of rules that defines the path from the root all the way to the leaves. A gradient boosting machine combines weak predictors (decision trees) by boosting, yielding a better-performing model (a boosted tree). This method can work with unbalanced data sets but may overfit.[29]
    8. Random Forest or RF: Breiman shows how RFs are an effective tool for accurate prediction with classifiers and regressors, as they avoid overfitting due to the Law of Large Numbers.[30] However, RF modeling may be more time-consuming and less efficient than nonparametric (SVM and k-NN) and parametric (logistic regression) modeling.
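To make the model-building phase concrete, the following Python sketch (assuming scikit-learn; the dataset and hyperparameters are illustrative assumptions, not the paper's) splits a labeled data set, trains several of the algorithms listed above, and validates each on the held-out test split:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    # k-NN and SVM are distance-based, so features are standardized first.
    "k-NN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM (RBF kernel)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)                           # train on the training split
    print(f"{name}: {model.score(X_test, y_test):.3f}")   # validate on the test split
```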

 

References

  1. McCarthy J., Minsky M.L., Rochester N., Shannon C.E. "A proposal for the Dartmouth summer research project on artificial intelligence, August 31, 1955." AI Mag. 2006;27(4):12–4.
  2. Prewitt J., Mendelsohn M. “The analysis of cell images.” Ann N Y Acad Sci. 1966;128(3):1035–53. 10.1111/j.1749-6632.1965.tb11715.x.
  3. Vapnik V. The Nature of Statistical Learning Theory. 2nd Edition, Springer-Verlag, 2000.
  4. Vapnik V. Statistical Learning Theory. Wiley, 1998.
  5. Russell S., Norvig P. Artificial intelligence: a modern approach. Upper Saddle River: Prentice Hall, 2003.
  6. Goodfellow I., Bengio Y., Courville A. Deep Learning (Adaptive Computation and Machine Learning series). Illustrated Edition, The MIT Press, 2016.
  7. Kelleher J. Deep Learning (The MIT Press Essential Knowledge series) Paperback – Illustrated, The MIT Press, 2019.
  8. Topol E. Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again. Basic Books, 2019.
  9. Mahajan P. Artificial Intelligence in Healthcare: AI, Machine Learning, and Deep and Intelligent Medicine Simplified for Everyone. 2nd Edition, MedMantra, 2019.
  10. Bohr A., Memarzadeh K. Artificial Intelligence in Healthcare. 1st Edition, Academic Press, 2020.
  11. Chang A. Intelligence-Based Medicine: Artificial Intelligence and Human Cognition in Clinical Medicine and Healthcare. 1st Edition, Academic Press, 2020.
  12. Lawry T. AI in Health: A Leader's Guide to Winning in the New Age of Intelligent Health Systems (HIMSS Book Series). 1st Edition, CRC Press, 2020.
  13. Xing L., Giger M., Min J. Artificial Intelligence in Medicine: Technical Basis and Clinical Applications. 1st Edition, Academic Press, 2020.
  14. Sucaet Y., Waelput W. Digital Pathology (Springer Briefs in Computer Science). Springer, 2014.
  15. Cohen S. Artificial Intelligence and Deep Learning in Pathology. 1st Edition, Elsevier, 2020.
  16. Holzinger A., Goebel R., et al. Artificial Intelligence and Machine Learning for Digital Pathology: State-of-the-Art and Future Challenges (Lecture Notes in Computer Science Book 12090). 1st Edition, Springer, 2020.
  17. Belciug S. Artificial Intelligence in Cancer: Diagnostic to Tailored Treatment. 1st Edition, Academic Press, 2020.
  18. Vapnik V. “An overview of statistical learning theory.” IEEE Trans Neural Netw. 1999;10:988–999. 
  19. Deo R. “Machine Learning in Medicine.” Circulation. 2015;132(20):1920-1930. doi:10.1161/CIRCULATIONAHA.115.001593.
  20. Cabitza F., Banfi G. “Machine learning in laboratory medicine: waiting for the flood?” Clin Chem Lab Med. 2018 Mar 28;56(4):516-524. doi: 10.1515/cclm-2017-0287. PMID: 29055936.
  21. LeCun Y., Bengio Y., Hinton G. “Deep learning.” Nature. 2015;521(7553):436–444.
  22. LeCun, Y., Bottou, L., Bengio, Y., et al. “Gradient-based learning applied to document recognition.” Proc. IEEE 86, 2278–2324 (1998).
  23. Hochreiter S., Schmidhuber J. “Long short-term memory.” Neural Comput. 9, 1735–1780 (1997).
  24. Aggarwal R., Ranganathan P. “Common pitfalls in statistical analysis: Linear regression analysis.” Perspect Clin Res. 2017;8(2):100-102. doi:10.4103/2229-3485.203040.
  25. Ranganathan P., Pramesh C., Aggarwal R. “Common pitfalls in statistical analysis: Logistic regression.” Perspect Clin Res. 2017;8(3):148-151. doi:10.4103/picr.PICR_87_17.
  26. Hall P., Park B., Samworth R. “Choice of neighbor order in nearest-neighbor classification.” Annals of Statistics. 2008;36:2135–2152. DOI: 10.1214/07-AOS537.
  27. Hearst M., Dumais S., Osuna E. et al. “Support vector machines,” in IEEE Intelligent Systems and their Applications, vol. 13, no. 4, pp. 18-28, July-Aug. 1998, doi: 10.1109/5254.708428.
  28. John G., Langley P. "Estimating continuous distributions in Bayesian classifiers." Paper presented at: Eleventh Conference on Uncertainty in Artificial Intelligence; August 18–20, 1995; Montréal, Qué, Canada.
  29. Elith J., Leathwick J., Hastie T. “A working guide to boosted regression trees.” J Anim Ecol. 2008 Jul;77(4):802-13. doi: 10.1111/j.1365-2656.2008.01390.x. Epub 2008 Apr 8. PMID: 18397250.
  30. Breiman L. “Random forests.” Mach Learn. 2001;45:5–32.


First published in Global Journal of Medical Research, February 27, 2021.