LiDAR-based Panoptic Segmentation via Dynamic Shifting Network

Fangzhou Hong; Hui Zhou; Xinge Zhu; Hongsheng Li; Ziwei Liu

With the rapid advances in autonomous driving, it is becoming critical to equip sensing systems with more holistic 3D perception. However, existing work focuses on parsing either objects (e.g., cars and pedestrians) or scenes (e.g., trees and buildings) from LiDAR data. In this work, we address the task of LiDAR-based panoptic segmentation, which aims to parse both objects and scenes in a unified manner. As one of the first endeavors towards this new and challenging task, we propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm. In particular, DS-Net has three appealing properties: 1) Strong backbone design. DS-Net adopts the cylinder convolution that is specifically designed for LiDAR point clouds. The extracted features are shared by the semantic branch and the instance branch, which operates in a bottom-up clustering style. 2) Dynamic shifting for complex point distributions. We observe that commonly used clustering algorithms such as BFS or DBSCAN are incapable of handling complex autonomous driving scenes with non-uniform point cloud distributions and varying instance sizes. Thus, we present an efficient learnable clustering module, dynamic shifting, which adapts kernel functions on the fly for different instances. 3) Consensus-driven fusion. Finally, consensus-driven fusion is used to resolve disagreements between semantic and instance predictions. To comprehensively evaluate the performance of LiDAR-based panoptic segmentation, we construct and curate benchmarks from two large-scale autonomous driving LiDAR datasets, SemanticKITTI and nuScenes. Extensive experiments demonstrate that our proposed DS-Net achieves higher accuracy than current state-of-the-art methods. Notably, we achieve 1st place on the public leaderboard of SemanticKITTI, outperforming 2nd place by 2.6% in terms of the PQ metric.
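To make the third property more concrete, here is a minimal sketch of one way disagreement between the two branches could be resolved: each predicted instance takes the semantic label that most of its points received. The abstract does not spell out the fusion rule, so the majority vote, the function name `consensus_fuse`, the `stuff_id` convention, and the toy labels are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter, defaultdict


def consensus_fuse(semantic_labels, instance_ids, stuff_id=0):
    """Relabel each instance with the majority semantic label of its points.

    semantic_labels: per-point class labels from the semantic branch.
    instance_ids: per-point instance ids from the instance branch;
        points with id == stuff_id belong to no instance (assumed convention).
    Returns a new list of per-point semantic labels.
    """
    # Count the semantic votes cast inside every predicted instance.
    votes = defaultdict(Counter)
    for label, inst in zip(semantic_labels, instance_ids):
        if inst != stuff_id:
            votes[inst][label] += 1

    # Pick the most frequent label for each instance.
    winners = {inst: counts.most_common(1)[0][0] for inst, counts in votes.items()}

    # Points inside an instance take the winning label; other points keep theirs.
    return [winners.get(inst, label)
            for label, inst in zip(semantic_labels, instance_ids)]


# Toy example: one point of instance 1 was labelled "truck" by the semantic branch.
semantic = ["car", "car", "truck", "road", "person", "person"]
instance = [1, 1, 1, 0, 2, 2]
fused = consensus_fuse(semantic, instance)
print(fused)
```

In this toy input, the stray "truck" point inside instance 1 is overruled by the two "car" votes, while the "road" point, which belongs to no instance, keeps its semantic label.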

updated: Tue Dec 01 2020 05:49:08 GMT+0000 (UTC)

published: Tue Nov 24 2020 08:44:46 GMT+0000 (UTC)

arXiv
