M-FasterSeg: An Efficient Semantic Segmentation Network Based on Neural Architecture Search

Huiyu Kuang

Image semantic segmentation is one of the key technologies that allow intelligent systems to understand natural scenes. As an important research direction in visual intelligence, it has broad applications in mobile robots, drones, smart driving, and smart security. In practical mobile-robot applications, however, problems such as inaccurate semantic label prediction and loss of edge information between segmented objects and the background may occur. This paper proposes an improved semantic segmentation network structure based on deep learning that combines a self-attention neural network with neural architecture search. First, Neural Architecture Search (NAS) is used to find a semantic segmentation network with multiple resolution branches. During the search, a self-attention module is used to adjust the searched network structure; the networks found for the different branches are then combined to form a fast semantic segmentation network, and the image is fed into this network to obtain the final prediction. Experimental results on the Cityscapes dataset show that the algorithm reaches an accuracy of 69.8% at a segmentation speed of 48 images per second. It achieves a good balance between real-time performance and accuracy, improves edge segmentation, performs better in complex scenes, and is robust enough for practical applications.
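The abstract does not say which form of self-attention the module uses. As a rough illustration only, the sketch below applies generic scaled dot-product self-attention to a feature map that has been flattened into a list of per-position vectors. The function names `softmax` and `self_attention` are hypothetical, and the use of identity query/key/value projections is a simplifying assumption, not the paper's design.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Generic scaled dot-product self-attention (illustrative assumption).

    features: list of N vectors, one per spatial position, each of length C.
    Queries, keys, and values are the features themselves (identity
    projections), which a learned module would replace with trained weights.
    Returns a list of N vectors of length C.
    """
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    output = []
    for query in features:
        # Similarity of this position to every position in the map.
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in features]
        weights = softmax(scores)
        # Each output vector is a weighted average of all position vectors.
        attended = [sum(w * value[c] for w, value in zip(weights, features))
                    for c in range(dim)]
        output.append(attended)
    return output
```

Because each output vector is a convex combination of all input vectors, every position can draw on context from the whole image, which is one plausible reason such a module might help with object boundaries.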

updated: Wed Dec 15 2021 06:46:55 GMT+0000 (UTC)

published: Wed Dec 15 2021 06:46:55 GMT+0000 (UTC)

arXiv
