Learning to Minimize the Remainder in Supervised Learning

Yan Luo; Yongkang Wong; Mohan Kankanhalli; Qi Zhao

Deep learning methods usually update a model's parameters over multiple iterations. Each iteration can be viewed as a first-order approximation of a Taylor series expansion. The remainder, which consists of the higher-order terms, is usually ignored during learning for simplicity. This learning scheme powers various multimedia-based applications, such as image retrieval, recommendation systems, and video search. Multimedia data (e.g., images) are generally semantics-rich and high-dimensional, so the remainders of these approximations may be non-zero. In this work, we treat the remainder as informative and study how it affects the learning process. To this end, we propose a new learning approach, gradient adjustment learning (GAL), which leverages knowledge learned from past training iterations to adjust vanilla gradients so that the remainders are minimized and the approximations are improved. The proposed GAL is model- and optimizer-agnostic and is easy to adapt to the standard learning framework. It is evaluated on three tasks (image classification, object detection, and regression) with state-of-the-art models and optimizers. The experiments show that GAL consistently improves the evaluated models, and the ablation studies validate various aspects of the method. The code is available at https://github.com/luoyan407/gradient_adjustment.git.
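To make the notion of a "remainder" concrete, the following minimal sketch measures the gap between the true loss after a vanilla gradient step and its first-order Taylor estimate. It does not reproduce GAL itself: the one-dimensional loss, the function names, and the learning rates are all hypothetical and chosen only for illustration.

```python
def loss(w):
    # Hypothetical 1-D loss (an assumption for illustration, not from the paper).
    return (w - 3.0) ** 4


def grad(w):
    # Analytic derivative of the hypothetical loss above.
    return 4.0 * (w - 3.0) ** 3


def first_order_remainder(w, step):
    """Return loss(w + step) minus its first-order Taylor estimate around w."""
    linear_estimate = loss(w) + grad(w) * step
    return loss(w + step) - linear_estimate


w = 0.0

# Vanilla gradient-descent steps with two illustrative learning rates.
large_step = -0.01 * grad(w)
small_step = -0.001 * grad(w)

remainder_large = first_order_remainder(w, large_step)
remainder_small = first_order_remainder(w, small_step)

print("remainder with larger step: ", remainder_large)
print("remainder with smaller step:", remainder_small)
```

In this toy setting the remainder is non-zero for both steps and shrinks as the step gets smaller, which is the quantity the abstract describes as usually being ignored.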

updated: Sun Jan 23 2022 06:31:23 GMT+0000 (UTC)

published: Sun Jan 23 2022 06:31:23 GMT+0000 (UTC)

arXiv
