Auxiliary Learning for Deep Multi-task Learning

Yifan Liu; Bohan Zhuang; Chunhua Shen; Hao Chen; Wei Yin

Multi-task learning (MTL) solves multiple tasks simultaneously, aiming for better speed and performance than handling each task in turn. Most current methods fall into one of two categories: (i) hard parameter sharing, where a subset of the parameters is shared among tasks while the remaining parameters are task-specific; or (ii) soft parameter sharing, where all parameters are task-specific but jointly regularized. Both categories have limitations: the shared hidden layers of the former are difficult to optimize because of competing objectives, while the complexity of the latter grows linearly with the number of tasks. To mitigate these drawbacks, this paper proposes an alternative: we explicitly construct an auxiliary module that mimics soft parameter sharing to assist the optimization of the hard parameter sharing layers during training. In particular, the auxiliary module takes the outputs of the shared hidden layers as inputs and is supervised by an auxiliary task loss. During training, the auxiliary module is jointly optimized with the MTL network and acts as a regularizer by introducing an inductive bias into the shared layers. In the testing phase, only the original MTL network is kept, so our method avoids the limitations of both categories. We evaluate the proposed auxiliary module on pixel-wise prediction tasks, including semantic segmentation, depth estimation, and surface normal prediction, with different network structures. Extensive experiments over various settings verify the effectiveness of our method.
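The training and testing asymmetry described in the abstract can be illustrated with a small forward-only sketch. It is not the authors' implementation. The abstract does not specify the auxiliary module's architecture, the choice of auxiliary task, or how the losses are weighted, so the single-layer modules, the mean squared error loss, and every name here (`ToyMTLNet`, `ToyAuxModule`, `training_loss`, `aux_weight`) are hypothetical choices made only for illustration.

```python
import random

random.seed(0)


def linear(weights, x):
    """Dense layer without bias: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(values):
    return [max(0.0, v) for v in values]


def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def init(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ToyMTLNet:
    """Hard parameter sharing: one shared layer, one head per task."""

    def __init__(self, in_dim=4, hidden=8, task_dims=(3, 1)):
        self.shared = init(hidden, in_dim)
        self.heads = [init(d, hidden) for d in task_dims]

    def features(self, x):
        return relu(linear(self.shared, x))

    def forward(self, x):
        h = self.features(x)
        return [linear(head, h) for head in self.heads]


class ToyAuxModule:
    """Training-only branch fed by the shared features (assumed design)."""

    def __init__(self, hidden=8, aux_dim=1):
        self.head = init(aux_dim, hidden)

    def forward(self, h):
        return linear(self.head, h)


def training_loss(net, aux, x, task_targets, aux_target, aux_weight=0.5):
    """Sum of task losses plus a weighted auxiliary loss (weighting is assumed)."""
    h = net.features(x)  # shared features feed both the heads and the aux module
    task_losses = [mse(linear(head, h), t)
                   for head, t in zip(net.heads, task_targets)]
    aux_loss = mse(aux.forward(h), aux_target)
    return sum(task_losses) + aux_weight * aux_loss, task_losses, aux_loss


net, aux = ToyMTLNet(), ToyAuxModule()
x = [0.2, -0.1, 0.4, 0.3]

# Training phase: the auxiliary loss contributes to the objective, so its
# gradient would also reach the shared layer during joint optimization.
total, task_losses, aux_loss = training_loss(
    net, aux, x, task_targets=[[1.0, 0.0, 0.0], [0.5]], aux_target=[0.5])

# Testing phase: the auxiliary module is discarded; only `net` is evaluated.
predictions = net.forward(x)
```

Because `ToyAuxModule` is never called in `net.forward`, dropping it after training leaves the inference cost of the hard parameter sharing network unchanged, which is the property the abstract emphasizes.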

updated: Thu Nov 28 2019 01:46:55 GMT+0000 (UTC)

published: Thu Sep 05 2019 05:29:15 GMT+0000 (UTC)

arXiv
