PDFNet: Pointwise Dense Flow Network for Urban-Scene Segmentation

Venkata Satya Sai Ajay Daliparthi

In recent years, using a deep convolutional neural network (CNN) as a feature encoder (or backbone) has become the most commonly observed architectural pattern in several computer vision methods, and semantic segmentation is no exception. This pattern has two major drawbacks: (i) the networks often fail to capture small classes such as wall, fence, pole, traffic light, traffic sign, and bicycle, which are crucial for autonomous vehicles to make accurate decisions; and (ii) because of the arbitrarily increasing depth, the networks require massive amounts of labeled data to converge and additional regularization techniques to reduce the risk of over-fitting. While regularization techniques come at minimal cost, collecting labeled data is expensive and laborious. In this work, we address both drawbacks by proposing a novel lightweight architecture named point-wise dense flow network (PDFNet). In PDFNet, we employ dense, residual, and multiple shortcut connections to allow smooth gradient flow to all parts of the network. Extensive experiments on the Cityscapes and CamVid benchmarks demonstrate that our method significantly outperforms baselines in capturing small classes and in few-data regimes. Moreover, our method achieves considerable performance in classifying samples from outside the training distribution, evaluated by transferring from Cityscapes to the KITTI dataset.
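The abstract names three kinds of connections (dense, residual, and shortcut) built around point-wise operations, but it does not specify the PDFNet layer layout. The following is only a minimal sketch of how dense and residual connections can combine around point-wise (1x1) convolutions, written in plain Python so that it runs without any framework. The function `dense_residual_block`, its weight arguments, and the two-layer structure are hypothetical choices made for illustration and are not taken from the paper.

```python
# Illustrative only: this is NOT the PDFNet architecture, whose details are
# not given in the abstract. A point-wise (1x1) convolution mixes channels
# independently at each pixel, so a feature map is modeled here as a list of
# per-pixel channel vectors.

def pointwise_conv(pixels, weights):
    """Apply a 1x1 convolution; weights has shape [out_channels][in_channels]."""
    return [
        [sum(w * x for w, x in zip(row, px)) for row in weights]
        for px in pixels
    ]

def relu(pixels):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in px] for px in pixels]

def concat(a, b):
    """Channel-wise concatenation (the dense connection)."""
    return [pa + pb for pa, pb in zip(a, b)]

def add(a, b):
    """Element-wise sum (the residual connection)."""
    return [[va + vb for va, vb in zip(pa, pb)] for pa, pb in zip(a, b)]

def dense_residual_block(x, w1, w2, w_out):
    """Hypothetical block: two densely connected 1x1 layers plus a residual skip."""
    f1 = relu(pointwise_conv(x, w1))               # layer 1 sees the input x
    f2 = relu(pointwise_conv(concat(x, f1), w2))   # layer 2 sees x and f1
    # Fuse all earlier features and project back to the input channel count.
    fused = pointwise_conv(concat(concat(x, f1), f2), w_out)
    return add(x, fused)                           # skip path carries x through unchanged
```

Because the input is added back at the end, the block reduces to the identity when the fusion weights are zero. This is one way to see why such skip paths give gradients a direct route to earlier layers.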

updated: Tue Sep 21 2021 10:39:46 GMT+0000 (UTC)

published: Tue Sep 21 2021 10:39:46 GMT+0000 (UTC)

arXiv
