Evaluating Generalizability of Deep Learning Models Using Indian-COVID-19 CT Dataset

Suba S; Nita Parekh; Ramesh Loganathan; Vikram Pudi; Chinnababu Sunkavalli

Computed tomography (CT) has been routinely used for the diagnosis of lung diseases and, more recently during the pandemic, for detecting the infectivity and severity of COVID-19. One of the major concerns in using machine learning (ML) approaches for automatic processing of CT scan images in clinical settings is that these methods are trained on limited and biased subsets of publicly available COVID-19 data. This has raised concerns about how well these models generalize to external datasets not seen during training. To address some of these issues, in this work CT scan images from confirmed COVID-19 data, obtained from one of the largest public repositories, COVIDx CT 2A, were used for training and internal validation of machine learning models. For external validation we generated the Indian-COVID-19 CT dataset, an open-source repository containing 3D CT volumes and 12096 chest CT images from 288 COVID-19 patients from India. A comparative performance evaluation of four state-of-the-art machine learning models, viz., a lightweight convolutional neural network (CNN) and three other CNN-based deep learning (DL) models, VGG-16, ResNet-50 and Inception-v3, in classifying CT images into three classes, viz., normal, non-COVID pneumonia, and COVID-19, is carried out on these two datasets. Our analysis showed that the performance of all the models is comparable on the hold-out COVIDx CT 2A test set, with accuracies of 90% to 99% (96% for the CNN), while on the external Indian-COVID-19 CT dataset a drop in performance of 8% to 19% is observed for all the models. The traditional machine learning model, the CNN, performed best on the external dataset (accuracy 88%) in comparison to the deep learning models, indicating that a lightweight CNN generalizes better to unseen data. The data and code are made available at https://github.com/aleesuss/c19.
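The comparison between internal and external validation can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the integer label encoding, and the toy labels are hypothetical, and it assumes the reported drop is measured in percentage points of accuracy, which is consistent with the CNN figures quoted in the abstract (96% internal, 88% external).

```python
from typing import Sequence

# Three-class setup named in the abstract; the integer encoding is an assumption.
CLASSES = ("normal", "non-COVID pneumonia", "COVID-19")


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of images whose predicted class matches the true class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def generalization_drop(internal_acc: float, external_acc: float) -> float:
    """Accuracy lost on the external set, in percentage points (assumed metric)."""
    return (internal_acc - external_acc) * 100.0


# Toy labels for illustration only; these are not results from the paper.
toy_true = [0, 1, 2, 2, 1, 0]
toy_pred = [0, 1, 2, 1, 1, 0]
toy_acc = accuracy(toy_true, toy_pred)  # 5 of 6 predictions are correct

# CNN accuracies as stated in the abstract: 96% internal, 88% external.
cnn_drop = generalization_drop(0.96, 0.88)
print(f"toy accuracy: {toy_acc:.3f}, CNN drop: {cnn_drop:.1f} points")
```

Under this reading, the CNN's 8-point drop sits at the lower end of the 8% to 19% range reported for all models.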

updated: Wed Dec 28 2022 16:23:18 GMT+0000 (UTC)

published: Wed Dec 28 2022 16:23:18 GMT+0000 (UTC)

arXiv
