Publication: Artificial Intelligence (AI) in Pathology – A Summary and Challenges Part 3

The first part of “Artificial Intelligence (AI) in Pathology – A Summary and Challenges” is here!
The second part of “Artificial Intelligence (AI) in Pathology – A Summary and Challenges” is here!

We continue with…

4.6 AI in cancer applications 

AI in breast cancer: Stålhammar et al. categorized breast cancers for prognostic and predictive value into four gene-expression subtypes: ‘Luminal A,’ ‘Luminal B,’ ‘HER2-enriched,’ and ‘Basal-like.’ The authors examined three cohorts of primary breast carcinoma specimens totaling 436 cases (with up to 28 years of survival follow-up) and scored them for ER, PR, HER2, and Ki67 both by Digital Image Analysis (DIA) and manually. The DIA approach outperformed manual scoring in both sensitivity and specificity for the Luminal B subtype, and achieved slightly better concordance and Cohen’s κ agreement with PAM50 gene-expression assays. The manual and DIA approaches were comparable in terms of Cox regression hazard ratios. In addition, DIA fared better in Spearman’s rank-order correlations and in the prognostic value of Ki67 scores as measured by likelihood ratio, adding appreciably more prognostic information than the manual scores. The authors concluded that, overall, the DIA approach was a clearly better alternative to manual biomarker scoring.48 Manual identification of the presence and extent of breast carcinoma by a pathologist is critical for patient management and tumor staging, including assessment of treatment response, but it is subject to inter- and intra-reader variability. As a decision-support tool, any computerized technique needs to be robust to data acquired from different sources, different scanners, and different staining/cutting approaches. Cruz-Roa et al. trained a CNN classifier using 400 exemplars from various sites and validated it with 200 cases of TCGA data. Their approach attained a Dice coefficient of 75.9%, a PPV of 71.6%, and an NPV of 96.8% in a pixel-by-pixel evaluation against manually annotated regions of invasive ductal carcinoma.49
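The pixel-wise metrics reported by Cruz-Roa et al. follow directly from the confusion counts between the predicted and annotated masks. A minimal sketch (the toy masks below are invented for illustration, not data from the study):

```python
import numpy as np

def pixel_metrics(pred, truth):
    """Dice coefficient, PPV, and NPV for two binary masks.

    pred, truth: boolean NumPy arrays of the same shape, where True
    marks pixels labeled as invasive carcinoma.
    """
    tp = np.sum(pred & truth)    # pixels both call carcinoma
    fp = np.sum(pred & ~truth)   # predicted carcinoma, annotation says benign
    fn = np.sum(~pred & truth)   # missed carcinoma pixels
    tn = np.sum(~pred & ~truth)  # pixels both call benign
    dice = 2 * tp / (2 * tp + fp + fn)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return dice, ppv, npv

# Toy 1-D "masks": 4 agreed positives, 1 false positive, 1 false negative
pred  = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0], dtype=bool)
truth = np.array([1, 1, 1, 1, 0, 1, 0, 0, 0, 0], dtype=bool)
dice, ppv, npv = pixel_metrics(pred, truth)  # each 0.8 for this toy case
```

The Dice coefficient rewards overlap symmetrically, while PPV and NPV split the error by the direction of the mistake, which is why the study can report a modest PPV alongside a very high NPV.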

Autoencoder (AE) use in breast cancer: An AE is an ANN with a symmetric architecture in which the middle layers encode the input data and then attempt to reconstruct a form of the input at the output layer, while avoiding a direct copy of the data through the network.50 Macías-García et al. developed a framework that processes DNA methylation data to obtain meaningful information from genes relevant to breast cancer recurrence, and tested it using data from The Cancer Genome Atlas (TCGA) portal. The method uses AEs to preprocess DNA methylation data and generate AE features that characterize breast cancer recurrence, and the authors demonstrated that it improved recurrence prediction.51
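As an illustration of the encode-then-reconstruct idea only (not the pipeline of Macías-García et al.), a minimal linear autoencoder in NumPy compresses 8 features through a 2-unit bottleneck and learns to reconstruct the input; the toy data and dimensions are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples of 8 features that actually lie on a 2-D subspace,
# standing in for high-dimensional DNA methylation profiles.
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 8))
X /= X.std()  # keep gradient scales tame

# Symmetric autoencoder: 8 -> 2 (encoder) -> 8 (decoder); linear for brevity.
W_enc = rng.normal(scale=0.1, size=(8, 2))
W_dec = rng.normal(scale=0.1, size=(2, 8))

def loss(X, W_enc, W_dec):
    return float(np.mean((X @ W_enc @ W_dec - X) ** 2))

init_loss = loss(X, W_enc, W_dec)
lr = 0.05
for _ in range(1000):
    Z = X @ W_enc                        # bottleneck encoding
    err = Z @ W_dec - X                  # reconstruction error
    W_dec -= lr * (Z.T @ err) / len(X)   # gradient of mean squared error
    W_enc -= lr * (X.T @ (err @ W_dec.T)) / len(X)

final_loss = loss(X, W_enc, W_dec)       # far below init_loss after training
```

Because the bottleneck is narrower than the input, the network cannot simply copy the data; the learned code `Z` is the kind of compact feature representation the recurrence model consumes.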

AI in cervical cancer: Of the roughly half a million annual cervical cancer cases worldwide, about 80% occur in low- and middle-income nations. Hu et al. followed over 9,000 women aged 18 to 94 in Costa Rica over a seven-year period from 1993 to 2000, identifying cancers for up to 18 years. They developed a DL-based visual evaluation algorithm that automatically identifies cervical precancer or cancer from digitized cervical images taken with a fixed-focus camera (cervicography). The DL method recognized cumulative precancer and cancer cases with a higher AUC (0.91) than the original cervigram interpretation (AUC 0.69) or conventional cytology (AUC 0.71). The authors therefore recommend automated visual evaluation of cervical images from contemporary digital cameras.52

AI in prostate cancer: Ström et al. noted the high intra- and inter-observer variability in grading, which results in either under- or over-treatment of prostate carcinoma. To overcome this, the authors developed an AI method for prostate cancer detection/localization and Gleason grading. They digitized 6,682 slides from biopsies of 976 randomly selected Swedish men aged 50–69, plus another 271 slides from 93 men outside the original study, to train deep neural networks. The resulting networks were tested on an independent set of 1,631 biopsies from 246 men in STHLM3 for the presence, extent, and Gleason grade of malignant tissue, and on an external dataset of 330 biopsies from 73 men. They also compared grading performance against 23 expert pathologists grading 87 biopsies. The AI networks attained an AUC of 0.997 for differentiating between benign and malignant biopsy cores on the independent dataset and 0.986 on the external verification data. The correlation between carcinoma length predicted by the AI networks and that given by the pathologists was 0.96 for the independent dataset and 0.87 for the external verification dataset. For assigning Gleason grades, the AI attained a mean pairwise kappa of 0.62, within the 0.60–0.73 range of the expert pathologists. The authors recommend using the AI approach to reduce the evaluation burden of benign biopsies and to automate the measurement of cancer length in positive biopsy cores. By standardizing grading, this AI approach can also serve as a second opinion in cancer assessment.53
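The mean pairwise kappa used to compare the AI system with the pathologist panel averages Cohen's kappa over every pair of raters. A minimal sketch, with invented grade labels from three hypothetical raters:

```python
import numpy as np
from itertools import combinations

def cohen_kappa(a, b):
    """Cohen's kappa between two raters' label sequences:
    observed agreement corrected for chance agreement."""
    a, b = np.asarray(a), np.asarray(b)
    labels = np.unique(np.concatenate([a, b]))
    po = np.mean(a == b)  # observed agreement
    pe = sum(np.mean(a == l) * np.mean(b == l) for l in labels)  # chance
    return (po - pe) / (1 - pe)

def mean_pairwise_kappa(ratings):
    """Average kappa over all rater pairs."""
    ks = [cohen_kappa(r1, r2) for r1, r2 in combinations(ratings, 2)]
    return float(np.mean(ks))

# Hypothetical grade labels from three raters on six biopsies
raters = [
    [1, 2, 3, 3, 4, 5],
    [1, 2, 3, 4, 4, 5],
    [1, 1, 3, 3, 4, 5],
]
k = mean_pairwise_kappa(raters)  # about 0.72 for this toy panel
```

Treating the AI as one more "rater" and averaging its kappa against each pathologist is what lets the study place the AI's 0.62 inside the experts' 0.60–0.73 range.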

AI in stomach and colon cancer: Iizuka et al. trained CNNs and RNNs on biopsy histopathology WSIs of the stomach and colon to classify them into adenocarcinoma, adenoma, and non-neoplastic. They gathered stomach and colon datasets of 4,128 and 4,036 WSIs, respectively, which were manually annotated by pathologists. Using millions of tiles extracted from the WSIs, the authors trained a CNN based on the Inception-V3 architecture for each organ to categorize each tile into one of the three classes. They then aggregated the predictions from all tiles in a WSI into a final classification using two approaches: an RNN and max pooling. The models were evaluated on three independent test sets each, achieving Areas Under the Curve (AUCs) of 0.97 and 0.99 for gastric adenocarcinoma and adenoma, respectively, and 0.96 and 0.99 for colonic adenocarcinoma and adenoma, respectively. They further evaluated the stomach model against pathology experts and medical students who had not taken part in labeling the training set, using a test set of 45 images (15 WSIs each of adenoma, adenocarcinoma, and non-neoplastic lesions). Classification time per WSI with the trained model ranged from 5 to 30 seconds. The average diagnostic accuracy was 85.9% for pathologists and 41.2% for medical students, while the stomach model achieved an accuracy of 95.6% in a 30-second assessment.54
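The max-pooling aggregation step can be sketched in a few lines (the RNN variant is omitted; the class order, probabilities, and decision rule below are assumptions for illustration, not the authors' exact implementation):

```python
import numpy as np

CLASSES = ("adenocarcinoma", "adenoma", "non-neoplastic")

def slide_label_max_pool(tile_probs):
    """Aggregate per-tile softmax outputs into one slide-level call.

    tile_probs: (n_tiles, 3) array of class probabilities per tile.
    Max pooling keeps, for each class, the highest probability any tile
    received, so a single strongly malignant tile can drive the
    slide-level label even if most of the slide looks benign.
    """
    pooled = tile_probs.max(axis=0)          # per-class maximum over tiles
    return CLASSES[int(np.argmax(pooled))], pooled

tiles = np.array([
    [0.05, 0.10, 0.85],   # mostly benign-looking tiles...
    [0.02, 0.08, 0.90],
    [0.92, 0.05, 0.03],   # ...but one tile is confidently adenocarcinoma
])
label, pooled = slide_label_max_pool(tiles)  # label: "adenocarcinoma"
```

This asymmetry is the point of max pooling for cancer screening: averaging the tiles would dilute a small focus of carcinoma, while the maximum preserves it.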

AI in lung cancer: Kriegsmann et al. evaluated CNNs for classifying the most common lung carcinoma subtypes: pulmonary adenocarcinoma (ADC), pulmonary squamous cell carcinoma (SqCC), and small-cell lung cancer (SCLC). To validate the plausibility of the outcomes, skeletal muscle was also included in the investigation, as the histological difference between skeletal muscle and the three tumor entities is unambiguous. They assembled a cohort of 80 ADC, 80 SqCC, 80 SCLC, and 30 skeletal muscle specimens. The InceptionV3, VGG16, and InceptionResNetV2 architectures were trained to classify the four entities of interest. The InceptionV3-based CNN model produced the highest classification accuracy and was therefore used for classification of the test set. The final model achieved an image-patch classification accuracy of 88% in both the training and validation sets; on the test set, image-patch and patient-based classification accuracies were 95% and 100%, respectively.55 To predict carcinoma in WSIs, Kanavati et al. trained a deep CNN based on the EfficientNet-B3 architecture, using transfer learning and weakly supervised learning, on a training dataset of 3,554 WSIs from a single medical institution. The model was then applied to four independent test sets from distinct hospitals to validate its generalization to unseen data. The model differentiated lung carcinoma from non-neoplastic tissue with high ROC AUCs of 0.98, 0.97, 0.99, and 0.98 on the four independent test sets, respectively. Of the two training methodologies, fully supervised and weakly supervised learning, the latter consistently performed better, with an improvement of 0.05 in AUC on the test sets.56
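The ROC AUCs quoted throughout these studies have a simple rank interpretation: the probability that a randomly chosen positive slide scores higher than a randomly chosen negative one. A minimal sketch with invented slide-level scores:

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC AUC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs where the positive slide scores higher
    (ties count half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum() \
         + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

# Hypothetical carcinoma scores for six slides (1 = carcinoma, 0 = benign)
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)  # 8 of 9 pairs ranked correctly: ~0.889
```

An AUC of 0.98 therefore means that in about 98% of carcinoma/benign slide pairs the model ranks the carcinoma slide higher, independent of any particular decision threshold.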

 

5. AI – regulation

The FDA’s vision is that, with suitable regulatory oversight, Software as a Medical Device (SaMD) based on AI/ML will deliver safe and effective software functionality that can improve the quality of patient care. The FDA’s guidance for software modifications focuses on the risk to patients resulting from the software change. For a traditional application, three classes of software alterations that might necessitate a premarket submission are: a) a change that introduces a novel risk, or changes an existing risk, in a way that could produce significant harm; b) an alteration to risk controls intended to avoid substantial harm; and c) a modification that considerably affects the clinical functionality of the device. For SaMD, a premarket submission to the FDA is required when the AI/ML software changes significantly, the alteration is to the device’s intended use, or the alteration introduces a key change to its algorithm. To date, the FDA has approved several AI/ML-based SaMD algorithms that are locked before marketing; algorithm modifications beyond the initial approval will likely require an FDA premarket assessment. However, SaMD has the capability to learn continuously: a modification to the algorithm made after the SaMD has learned from real-world experience might produce a significantly different output from the one originally approved for a given set of inputs. Therefore, AI/ML tools require a new, Total Product Life Cycle (TPLC) regulatory approach.57

 

References

  1. Stålhammar G., Fuentes M., Lippert M., et al. “Digital image analysis outperforms manual biomarker assessment in breast cancer.” Mod Pathol. 2016 Apr;29(4):318-29. doi: 10.1038/modpathol.2016.34. Epub 2016 Feb 26. PMID: 26916072.
  2. Cruz-Roa A., Gilmore H., Basavanhally A., et al. “Accurate and reproducible invasive breast cancer detection in whole-slide images: A Deep Learning approach for quantifying tumor extent.” Sci Rep. 2017;7:46450. Published 2017 Apr 18. doi:10.1038/srep46450.
  3. Charte D., Charte F., García S., et al. “A practical tutorial on autoencoders for nonlinear feature fusion: Taxonomy, models, software and guidelines.” Information Fusion, Volume 44, 2018, Pages 78-96. ISSN 1566-2535. https://doi.org/10.1016/j.inffus.2017.12.007.
  4. Macías-García L., Martínez-Ballesteros M., Luna-Romera J., et al. “Autoencoded DNA methylation data to predict breast cancer recurrence: Machine learning models and gene-weight significance.” Artificial Intelligence in Medicine, Volume 110, 2020,101976. ISSN 0933-3657. https://doi.org/10.1016/j.artmed.2020.101976.
  5. Hu L., Bell D., Antani S., et al. “An Observational Study of Deep Learning and Automated Evaluation of Cervical Images for Cancer Screening.” J Natl Cancer Inst. 2019;111(9):923-932. doi:10.1093/jnci/djy225.
  6. Ström P., Kartasalo K., Olsson H., et al. “Artificial intelligence for diagnosis and grading of prostate cancer in biopsies: a population-based, diagnostic study.” Lancet Oncol. 2020 Feb;21(2):222-232. doi: 10.1016/S1470-2045(19)30738-7. Epub 2020 Jan 8. Erratum in: Lancet Oncol. 2020 Feb;21(2):e70. PMID: 31926806.
  7. Iizuka O., Kanavati F., Kato K., et al. “Deep Learning Models for Histopathological Classification of Gastric and Colonic Epithelial Tumours.” Sci Rep 10, 1504 (2020). https://doi.org/10.1038/s41598-020-58467-9.
  8. Kriegsmann M., Haag C., Weis C., et al. “Deep Learning for the Classification of Small-Cell and Non-Small-Cell Lung Cancer.” Cancers 2020, 12(6), 1604. https://doi.org/10.3390/cancers12061604.
  9. Kanavati F., Toyokawa G., Momosaki S., et al. “Weakly-supervised learning for lung carcinoma classification using deep learning.” Sci Rep 10, 9297 (2020). https://doi.org/10.1038/s41598-020-66333-x.
  10. US Food and Drug Administration. “Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)‐Based Software as a Medical Device (SaMD) – Discussion Paper and Request for Feedback.” 2019. https://www.fda.gov/media/122535/download.


Summary for: Artificial Intelligence (AI) in Pathology – A Summary and Challenges Part 3
First publisher:

Global Journal of Medical Research

Date published: 02/27/2021

Author

Designation: MBBS
