DiCENet: Dimension-wise Convolutions for Efficient Networks

Sachin Mehta; Hannaneh Hajishirzi; Mohammad Rastegari

We introduce a novel and generic convolutional unit, the DiCE unit, built using dimension-wise convolutions and dimension-wise fusion. The dimension-wise convolutions apply light-weight convolutional filtering across each dimension of the input tensor, while dimension-wise fusion efficiently combines these dimension-wise representations, allowing the DiCE unit to efficiently encode the spatial and channel-wise information contained in the input tensor. The DiCE unit is simple and can be seamlessly integrated with any architecture to improve its efficiency and performance. Compared to depth-wise separable convolutions, the DiCE unit shows significant improvements across different architectures. When DiCE units are stacked to build the DiCENet model, we observe significant improvements over state-of-the-art models across various computer vision tasks, including image classification, object detection, and semantic segmentation. On the ImageNet dataset, DiCENet delivers 2-4% higher accuracy than state-of-the-art manually designed models (e.g., MobileNetv2 and ShuffleNetv2). DiCENet also generalizes better to tasks that are often run on resource-constrained devices (e.g., object detection) than state-of-the-art separable convolution-based efficient networks, including neural search-based methods (e.g., MobileNetv3 and MixNet). Our source code in PyTorch is open-source and is available at https://github.com/sacmehta/EdgeNets/
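The idea of filtering "across each dimension of the input tensor" can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation (that is the PyTorch code in the repository above). The function names, the `x[d][h][w]` nested-list layout, the zero padding, the use of one small square kernel per slice, and the stacking of the three branch outputs along the channel axis are all assumptions made for illustration. The fusion step is not sketched.

```python
def conv2d_same(plane, kernel):
    """Zero-padded 2D cross-correlation of one plane with one odd-sized square kernel."""
    rows, cols = len(plane), len(plane[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - pad, c + j - pad
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += kernel[i][j] * plane[rr][cc]
            out[r][c] = acc
    return out


def permute(x, order):
    """Copy a 3D nested list so that output axis m is input axis order[m]."""
    shape = (len(x), len(x[0]), len(x[0][0]))
    n0, n1, n2 = (shape[a] for a in order)
    out = [[[0.0] * n2 for _ in range(n1)] for _ in range(n0)]
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                idx = (i, j, k)
                out[idx[order[0]]][idx[order[1]]][idx[order[2]]] = x[i][j][k]
    return out


def conv_along(x, kernels, axis):
    """Filter every 2D slice taken along `axis` with its own kernel (hypothetical helper)."""
    order = [axis] + [a for a in range(3) if a != axis]
    inverse = [order.index(a) for a in range(3)]
    moved = permute(x, order)  # slices along `axis` become the leading index
    filtered = [conv2d_same(plane, ker) for plane, ker in zip(moved, kernels)]
    return permute(filtered, inverse)  # restore the original d, h, w layout


def dimension_wise_conv(x, k_depth, k_height, k_width):
    """Run one branch per dimension of x[d][h][w]; stacking the results is an assumption."""
    depth_branch = conv_along(x, k_depth, 0)    # one kernel per channel, over H x W
    height_branch = conv_along(x, k_height, 1)  # one kernel per row, over D x W
    width_branch = conv_along(x, k_width, 2)    # one kernel per column, over D x H
    return depth_branch + height_branch + width_branch  # shape 3D x H x W
```

For a D x H x W input, this sketch needs D, H, and W kernels for the three branches respectively. The depth branch alone corresponds to an ordinary depth-wise convolution; the other two branches are what let each filter see across channels.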

updated: Mon Nov 30 2020 06:27:08 GMT+0000 (UTC)

published: Sat Jun 08 2019 20:17:06 GMT+0000 (UTC)

arXiv
