Universal Representations: A Unified Look at Multiple Task and Domain Learning

Wei-Hong Li; Xialei Liu; Hakan Bilen


We propose a unified look at jointly learning multiple vision tasks and visual domains through universal representations, a single deep neural network. Learning multiple problems simultaneously involves minimizing a weighted sum of multiple loss functions with different magnitudes and characteristics, which results in an unbalanced state where one loss dominates the optimization and yields poor results compared to learning a separate model for each problem. To address this, we propose distilling the knowledge of multiple task/domain-specific networks into a single deep neural network after aligning its representations with the task/domain-specific ones through small-capacity adapters. We rigorously show that universal representations achieve state-of-the-art performance in learning multiple dense prediction problems on NYU-v2 and Cityscapes, multiple image classification problems from diverse domains on the Visual Decathlon Dataset, and cross-domain few-shot learning on MetaDataset. Finally, we also conduct multiple analyses through ablation and qualitative studies.
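The two ideas in the abstract, a weighted sum of losses with mismatched magnitudes and distillation through per-task adapters, can be illustrated with a toy calculation. The following is a minimal sketch and not the authors' implementation: the function names, the task names, the numeric values, the linear form of the adapter, and the squared-error distance are all assumptions chosen for illustration.

```python
# Illustrative sketch only: every name, value, and the squared-error
# distance below is an assumption, not taken from the paper.

def weighted_sum(losses, weights):
    """Combine per-task losses into one objective: sum_t w_t * L_t."""
    return sum(w * l for w, l in zip(weights, losses))

def apply_adapter(matrix, features):
    """Small linear adapter: maps shared features into one teacher's space."""
    return [sum(m * f for m, f in zip(row, features)) for row in matrix]

def distillation_loss(shared, teacher, adapter):
    """Mean squared distance between adapted shared features and teacher features."""
    aligned = apply_adapter(adapter, shared)
    return sum((a - t) ** 2 for a, t in zip(aligned, teacher)) / len(teacher)

# Two hypothetical tasks whose losses differ in magnitude by 100x.
task_losses = [50.0, 0.5]
total = weighted_sum(task_losses, [1.0, 1.0])
share_of_first = task_losses[0] / total  # fraction of the objective from task 1

# One shared feature vector, two hypothetical frozen teachers,
# and one small adapter per teacher.
shared = [1.0, 2.0]
teachers = {"segmentation": [2.0, 4.0], "depth": [1.0, 0.0]}
adapters = {
    "segmentation": [[2.0, 0.0], [0.0, 2.0]],
    "depth": [[1.0, 0.0], [0.0, 1.0]],
}

distill = {
    name: distillation_loss(shared, teachers[name], adapters[name])
    for name in teachers
}
```

In this toy setting the first task contributes almost all of the summed objective, which is the imbalance the abstract describes. The distillation terms instead compare features after each adapter has mapped the shared representation into the corresponding teacher's space, so each term measures only how well that teacher can be matched.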

updated: Tue Aug 30 2022 12:02:09 GMT+0000 (UTC)

published: Wed Apr 06 2022 11:40:01 GMT+0000 (UTC)

arXiv

