A hierarchical semantic segmentation framework for computer vision-based bridge damage detection

Jingxiao Liu; Yujie Wei; Bingqing Chen

Computer vision-based damage detection using remote cameras and unmanned aerial vehicles (UAVs) enables efficient, low-cost bridge health monitoring that reduces labor costs and the need for sensor installation and maintenance. By leveraging recent semantic image segmentation approaches, we can locate critical structural components and recognize damage at the pixel level using images as the only input. However, existing methods perform poorly when detecting small damage (e.g., cracks and exposed rebar) and thin objects for which few image samples are available, especially when the components of interest are highly imbalanced. To this end, this paper introduces a semantic segmentation framework that imposes the hierarchical semantic relationship between component categories and damage types. For example, certain concrete cracks are present only on bridge columns, so the non-column region is masked out when detecting such damage. In this way, the damage detection model can focus on learning features from potentially damaged regions only and avoid the influence of irrelevant regions. We also use multi-scale augmentation, which provides views at different scales and thereby preserves the contextual information of each image without losing the ability to handle small and thin objects. Furthermore, the proposed framework employs importance sampling, which repeatedly samples images containing rare components (e.g., railway sleepers and exposed rebar) to provide more data samples and thus addresses the imbalanced data challenge.
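
The masking and resampling ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the label ids, the function names `mask_damage_by_component` and `sampling_weights`, the `boost` factor, and the choice to mask a predicted damage map (rather than the network input or the loss) are all assumptions made for illustration.

```python
import random

# Hypothetical label ids; the paper's actual label set is not given in the abstract.
COLUMN = 2                 # component class "column"
NO_DAMAGE, CRACK = 0, 1    # damage classes

def mask_damage_by_component(damage_map, component_map, allowed_component):
    """Keep damage labels only where the component map equals allowed_component."""
    return [
        [d if c == allowed_component else NO_DAMAGE
         for d, c in zip(d_row, c_row)]
        for d_row, c_row in zip(damage_map, component_map)
    ]

def sampling_weights(image_labels, rare_labels, boost=5.0):
    """Give images that contain any rare label a larger sampling weight."""
    return [boost if labels & rare_labels else 1.0 for labels in image_labels]

# Toy 2x3 maps: cracks outside the column region are suppressed.
component_map = [[2, 2, 0],
                 [2, 0, 0]]
damage_map = [[1, 0, 1],
              [0, 1, 0]]
masked = mask_damage_by_component(damage_map, component_map, COLUMN)

# Toy dataset: the second image contains a rare component, so it is drawn more often.
image_labels = [{"column"}, {"column", "sleeper"}, {"slab"}]
weights = sampling_weights(image_labels, {"sleeper", "exposed_rebar"})
rng = random.Random(0)
batch = rng.choices(range(len(image_labels)), weights=weights, k=4)
```

Here `masked` keeps only the crack pixel that lies on a column, and `weights` makes the image with a railway sleeper five times as likely to be drawn as the others. A real pipeline would apply the same idea to tensors inside the training loop.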

updated: Mon Jul 18 2022 18:42:54 GMT+0000 (UTC)

published: Mon Jul 18 2022 18:42:54 GMT+0000 (UTC)

arXiv
