No-Service Rail Surface Defect Segmentation via Normalized Attention and Dual-scale Interaction

Gongyang Li; Chengjun Han; Zhi Liu

No-service rail surface defect (NRSD) segmentation is an essential means of assessing the quality of no-service rails. However, due to the complex and diverse outlines and low-contrast textures of no-service rails, existing natural image segmentation methods cannot achieve promising performance on NRSD images, especially in some unique and challenging NRSD scenes. To this end, we propose a novel segmentation network for NRSDs based on Normalized Attention and Dual-scale Interaction, named NaDiNet. Specifically, NaDiNet follows the enhancement-interaction paradigm. The Normalized Channel-wise Self-Attention Module (NAM) and the Dual-scale Interaction Block (DIB) are the two key components of NaDiNet. NAM is a specific extension of the channel-wise self-attention mechanism (CAM) that enhances features extracted from low-contrast NRSD images. The softmax layer in CAM produces very small correlation coefficients, which are not conducive to low-contrast feature enhancement. Instead, in NAM, we directly calculate the normalized correlation coefficient between channels to enlarge feature differentiation. DIB is specifically designed for the interaction of the enhanced features. It has two interaction branches at two scales, one for fine-grained clues and the other for coarse-grained clues. With both branches working together, DIB can perceive defect regions of different granularities. With these modules working together, NaDiNet can generate accurate segmentation maps. Extensive experiments on the public NRSD-MN dataset, which contains man-made and natural NRSDs, demonstrate that the proposed NaDiNet with various backbones (i.e., VGG, ResNet, and DenseNet) consistently outperforms 10 state-of-the-art methods. The code and results of our method are available at https://github.com/monxxcn/NaDiNet.
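
The contrast the abstract draws between CAM and NAM can be illustrated with a small pure-Python sketch. It is not the authors' implementation (see the repository above for that). It assumes that the "normalized correlation coefficient" can be approximated by the cosine similarity between flattened channel vectors and that the enhanced features are formed with a plain residual addition. The function names and the toy feature values are hypothetical.

```python
import math


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    # Plain nested-list matrix product: (r x k) times (k x c).
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def softmax_rows(m):
    # Numerically stable row-wise softmax.
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def softmax_channel_weights(feats):
    # CAM-style weights: softmax over the raw channel affinities X X^T.
    # Each row sums to 1, so individual weights shrink as channels are added.
    return softmax_rows(matmul(feats, transpose(feats)))


def normalized_channel_weights(feats, eps=1e-12):
    # Assumed NAM-style weights: cosine similarity between channel vectors.
    # Values stay in [-1, 1] regardless of the number of channels.
    unit = []
    for channel in feats:
        norm = math.sqrt(sum(v * v for v in channel)) + eps
        unit.append([v / norm for v in channel])
    return matmul(unit, transpose(unit))


def enhance(feats, weights):
    # Mix channels with the weights and add the input back (assumed residual).
    mixed = matmul(weights, feats)
    return [[m + f for m, f in zip(m_row, f_row)]
            for m_row, f_row in zip(mixed, feats)]


# Toy low-contrast features: 4 channels, 6 flattened spatial positions.
feats = [
    [0.50, 0.52, 0.51, 0.49, 0.50, 0.51],
    [0.48, 0.50, 0.47, 0.52, 0.49, 0.50],
    [0.10, 0.12, 0.11, 0.09, 0.10, 0.11],
    [0.51, 0.10, 0.50, 0.11, 0.52, 0.09],
]

soft = softmax_channel_weights(feats)
norm = normalized_channel_weights(feats)
enhanced = enhance(feats, norm)
```

Because each softmax row sums to 1, the average weight is 1/C for C channels, which becomes very small for backbone features with hundreds of channels. The cosine-normalized weights do not depend on the channel count in this way.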

updated: Tue Jun 27 2023 12:58:16 GMT+0000 (UTC)

published: Tue Jun 27 2023 12:58:16 GMT+0000 (UTC)

arXiv
