A Hybrid VDV Model for Automatic Diagnosis of Pneumothorax using Class-Imbalanced Chest X-rays Dataset

Tahira Iqbal; Arslan Shaukat; Usman Akram; Zartasha Mustansar; Yung-Cheol Byun

Pneumothorax, a life-threatening condition, needs to be diagnosed immediately and efficiently. Manual diagnosis is not only time-consuming but also prone to human error, so an automatic and accurate method of diagnosis from chest X-rays is needed. To date, most available medical image datasets suffer from class imbalance. This study addresses that problem while proposing an automated method for detecting pneumothorax. We first compare existing approaches to the class-imbalance problem and find that a data-level ensemble (i.e., an ensemble of subsets of the dataset) outperforms the other approaches. We therefore propose a novel framework, named the VDV model, which is a complex model-level ensemble of data-level ensembles and uses three convolutional neural networks (CNNs), VGG-16, VGG-19 and DenseNet-121, as fixed feature extractors. In each data-level ensemble, features extracted by one of these CNNs are fed to a support vector machine (SVM) classifier, and the output of each data-level ensemble is computed by voting. Once the outputs of the three data-level ensembles, one per CNN architecture, are obtained, voting is applied again to compute the final prediction. The proposed framework is tested on the SIIM ACR Pneumothorax dataset and a random sample of the NIH Chest X-ray dataset (RS-NIH). On the first dataset, it attains 85.17% recall with 86.0% area under the receiver operating characteristic curve (AUC). On the second dataset, it achieves 90.9% recall with 95.0% AUC under a random split of the data, and 85.45% recall with 77.06% AUC under a patient-wise split. For RS-NIH, the results are higher than those previously reported in the literature. For the first dataset, however, no direct comparison can be made, since it has not previously been used for pneumothorax classification.
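
The two-level voting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the subsets are built or how ties are broken, so the chunking scheme in `balanced_subsets`, the `tie_label` rule, and all function names here are assumptions made for illustration. The per-subset SVM outputs are stand-in values rather than real predictions.

```python
from collections import Counter
import random


def balanced_subsets(majority, minority, seed=0):
    """Build roughly balanced training subsets (assumed scheme).

    The majority class is shuffled and split into chunks of about the
    minority-class size; each chunk is paired with all minority samples.
    """
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    n_subsets = max(1, len(shuffled) // max(1, len(minority)))
    # Strided slicing yields n_subsets chunks of near-equal size.
    chunks = [shuffled[i::n_subsets] for i in range(n_subsets)]
    return [(chunk, list(minority)) for chunk in chunks]


def majority_vote(labels, tie_label=1):
    """Return the most common label; ties go to tie_label (an assumption)."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_label
    return counts[0][0]


def vdv_predict(votes_per_backbone):
    """Vote within each data-level ensemble, then across the three backbones."""
    backbone_votes = [majority_vote(v) for v in votes_per_backbone.values()]
    return majority_vote(backbone_votes)


# Hypothetical per-subset SVM outputs for one X-ray (1 = pneumothorax).
example_votes = {
    "VGG-16": [1, 1, 0],
    "VGG-19": [0, 0, 1],
    "DenseNet-121": [1, 0, 1],
}
final_label = vdv_predict(example_votes)

# 90 majority-class sample ids and 30 minority-class sample ids.
subsets = balanced_subsets(majority=range(90), minority=range(100, 130))
```

In this sketch each subset would train one SVM on features from a single backbone, so the number of first-level voters per backbone equals the number of subsets.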

updated: Tue Dec 22 2020 10:20:04 GMT+0000 (UTC)

published: Tue Dec 22 2020 10:20:04 GMT+0000 (UTC)

arXiv
