Improving the Learning of Multi-column Convolutional Neural Network for Crowd Counting

Zhi-Qi Cheng; Jun-Xiu Li; Qi Dai; Xiao Wu; Jun-Yan He; Alexander Hauptmann

Large variation in the scale of people and head sizes is a critical problem for crowd counting. To improve the scale invariance of feature representations, recent works extensively employ Convolutional Neural Networks with multi-column structures to handle different scales and resolutions. However, because the columns contain substantial redundant parameters, existing multi-column networks invariably exhibit almost the same scale features in different columns, which severely affects counting accuracy and leads to overfitting. In this paper, we address this problem by proposing a novel Multi-column Mutual Learning (McML) strategy. It has two main innovations: 1) A statistical network is incorporated into the multi-column framework to estimate the mutual information between columns, which approximately indicates the scale correlation between features from different columns. By minimizing the mutual information, each column is guided to learn features at a different image scale. 2) We devise a mutual learning scheme that alternately optimizes each column while keeping the other columns fixed on each mini-batch of training data. With such an asynchronous parameter update process, each column is inclined to learn a feature representation different from the others, which efficiently reduces parameter redundancy and improves generalization ability. More remarkably, McML can be applied to all existing multi-column networks and is end-to-end trainable. Extensive experiments on four challenging benchmarks show that McML significantly improves the original multi-column networks and outperforms other state-of-the-art approaches.
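The alternating update in point 2 can be illustrated with a toy sketch. This is not the authors' implementation: each "column" is reduced to a single scalar weight on its own input feature, the loss is a plain mean squared error, and the mutual information term from point 1 is omitted entirely. All function names, the learning rate, and the synthetic data are assumptions made for illustration only.

```python
import random

def predict(weights, features):
    # Toy stand-in: column k is one scalar weight applied to feature k.
    return sum(w * x for w, x in zip(weights, features))

def batch_loss(weights, batch):
    # Mean squared error over one mini-batch (hypothetical counting loss).
    return sum((predict(weights, x) - y) ** 2 for x, y in batch) / len(batch)

def column_gradient(weights, batch, k):
    # Gradient of the batch loss with respect to column k only.
    return sum(2.0 * (predict(weights, x) - y) * x[k] for x, y in batch) / len(batch)

def alternating_step(weights, batch, lr):
    # Update one column at a time on the same mini-batch; all other columns
    # are held fixed during that update, and each later column sees the
    # already-updated earlier ones.
    for k in range(len(weights)):
        weights[k] -= lr * column_gradient(weights, batch, k)
    return weights

def make_data(n, true_weights, rng):
    # Synthetic noise-free regression data (assumption, not crowd images).
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in true_weights]
        y = sum(w * v for w, v in zip(true_weights, x))
        data.append((x, y))
    return data

def train(data, n_columns, epochs=50, batch_size=8, lr=0.1, seed=0):
    rng = random.Random(seed)
    weights = [0.0] * n_columns
    for _ in range(epochs):
        rng.shuffle(data)  # shuffles the caller's list in place
        for i in range(0, len(data), batch_size):
            alternating_step(weights, data[i:i + batch_size], lr)
    return weights

rng = random.Random(42)
data = make_data(64, [1.0, -2.0, 0.5], rng)
initial = batch_loss([0.0, 0.0, 0.0], data)
weights = train(list(data), 3)
final = batch_loss(weights, data)
print(initial, final, weights)
```

In this simplified setting the scheme is block coordinate descent on a shared loss. In the method described above, each column's update would additionally include a penalty on the estimated mutual information with the other columns, which this sketch leaves out.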

updated: Tue Sep 17 2019 06:34:47 GMT+0000 (UTC)

published: Tue Sep 17 2019 06:34:47 GMT+0000 (UTC)

arXiv
