TY - JOUR
T1 - Factors determining generalization in deep learning models for scoring COVID-CT images
AU - Horry, Michael James
AU - Chakraborty, Subrata
AU - Pradhan, Biswajeet
AU - Fallahpoor, Maryam
AU - Chegeni, Hossein
AU - Paul, Manoranjan
N1 - Publisher Copyright: © 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
PY - 2021/10/27
Y1 - 2021/10/27
AB - The COVID-19 pandemic has inspired unprecedented data collection and computer vision modelling efforts worldwide, focused on the diagnosis of COVID-19 from medical images. However, these models have found limited, if any, clinical application due in part to unproven generalization to datasets beyond their source training corpus. This study investigates the generalizability of deep learning models using publicly available COVID-19 Computed Tomography data through cross-dataset validation. The predictive ability of these models for COVID-19 severity is assessed using an independent dataset that is stratified for COVID-19 lung involvement. Each inter-dataset study is performed using histogram equalization, and contrast-limited adaptive histogram equalization with and without a learning Gabor filter. We show that under certain conditions, deep learning models can generalize well to an external dataset with F1 scores up to 86%. The best-performing model shows predictive accuracy of between 75% and 96% for lung involvement scoring against an external, expertly stratified dataset. From these results we identify key factors promoting deep learning generalization: primarily the uniform acquisition of training images, and secondarily diversity in CT slice position.
KW - Computed tomography
KW - COVID-19 scoring
KW - Deep learning
KW - External validation
KW - Image pre-processing
KW - Model generalization
UR - http://www.scopus.com/inward/record.url?scp=85118479515&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85118479515&partnerID=8YFLogxK
U2 - 10.3934/mbe.2021456
DO - 10.3934/mbe.2021456
M3 - Article
C2 - 34814345
AN - SCOPUS:85118479515
SN - 1551-0018
VL - 18
SP - 9264
EP - 9293
JO - Mathematical Biosciences and Engineering
JF - Mathematical Biosciences and Engineering
IS - 6
ER -