A Multitask Deep Learning Model for Parsing Bridge Elements and Segmenting Defect in Bridge Inspection Images

Chenyu Zhang; Muhammad Monjurul Karim; Ruwen Qin

The vast network of bridges in the United States creates a high demand for maintenance and rehabilitation. The massive cost of manual visual inspection to assess bridge conditions is a burden, and advanced robots have been leveraged to automate inspection data collection. Automating the segmentation of multiclass elements, and of surface defects on those elements, in the large volume of inspection image data would facilitate an efficient and effective assessment of bridge condition. Training separate single-task networks for element parsing (i.e., semantic segmentation of multiclass elements) and defect segmentation fails to incorporate the close connection between these two tasks, although both recognizable structural elements and apparent surface defects are present in the inspection images. This paper develops a multitask deep learning model that fully utilizes this interdependence between bridge elements and defects to boost the model's task performance and generalization. The study also investigates the effectiveness of the proposed model designs for improving task performance, including feature decomposition, cross-talk sharing, and a multi-objective loss function. A dataset with pixel-level labels of bridge elements and corrosion was developed for model training and testing. Quantitative and qualitative results from evaluating the developed multitask deep model demonstrate its advantages over the single-task-based model not only in performance (2.59% higher mIoU on bridge parsing and 1.65% higher on corrosion segmentation) but also in computational time and implementation capability.
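The abstract reports its gains in mIoU (mean intersection-over-union), the standard metric for semantic segmentation. As a minimal sketch of how that metric is typically computed, the snippet below scores a toy prediction against a toy ground truth. The function name `mean_iou`, the class labels, and the choice to skip classes absent from both maps are assumptions for illustration, not details taken from the paper.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Mean intersection-over-union across classes for two 2D label maps.

    Classes that appear in neither map are skipped. This is one common
    convention and is an assumption here, not the paper's stated protocol.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: it lies in both the intersection and the union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: it enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class present in prediction or target")
    return sum(ious) / len(ious)


# Toy 2x3 label maps with hypothetical labels:
# 0 = background, 1 = bridge element, 2 = corrosion.
pred = [[0, 1, 1], [2, 2, 0]]
target = [[0, 1, 2], [2, 2, 0]]
score = mean_iou(pred, target, num_classes=3)
```

In a multitask setting such as the one described, a score like this would be computed separately for each task (element parsing and corrosion segmentation), each with its own set of classes.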

updated: Fri Oct 28 2022 15:12:44 GMT+0000 (UTC)

published: Tue Sep 06 2022 02:48:15 GMT+0000 (UTC)

arXiv
