Auto-Encoding Score Distribution Regression for Action Quality Assessment

Boyu Zhang; Jiayuan Chen; Yinfei Xu; Hui Zhang; Xu Yang; Xin Geng



Action quality assessment (AQA) from videos is a challenging vision task because the relation between videos and action scores is difficult to model, and it has been widely studied in the literature. Traditionally, AQA has been treated as a regression problem: learning the underlying mapping between videos and action scores. More recently, uncertainty score distribution learning (USDL) has achieved success by introducing label distribution learning (LDL). However, USDL does not apply to datasets with continuous labels, and it requires a fixed variance during training. To address these problems, this paper develops the Distribution Auto-Encoder (DAE), which combines the advantages of regression algorithms and LDL. Specifically, DAE encodes videos into distributions and uses the reparameterization trick from variational auto-encoders (VAE) to sample scores, which establishes a more accurate mapping between videos and scores. A combined loss is constructed to accelerate the training of DAE, and DAE-MT is further proposed to handle AQA on multi-task datasets. We evaluate the DAE approach on the MTL-AQA and JIGSAWS datasets. Experimental results on these public datasets show that our method achieves state-of-the-art results under Spearman's rank correlation: 0.9449 on MTL-AQA and 0.73 on JIGSAWS.
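The two technical ingredients named in the abstract, sampling a score with the reparameterization trick and evaluating with Spearman's rank correlation, can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian form, the log-variance parameterization, and the names `sample_score`, `average_ranks`, and `spearman_rho` are all assumptions made for illustration, and the encoder that would produce the mean and log-variance from a video is left out.

```python
import math
import random


def sample_score(mu, log_var, rng):
    """Sample a score as mu + sigma * eps with eps ~ N(0, 1).

    Assumption: the encoder outputs a Gaussian mean and log-variance.
    Writing the sample this way keeps it a deterministic function of
    (mu, log_var) given eps, which is the point of the trick.
    """
    sigma = math.exp(0.5 * log_var)
    eps = rng.gauss(0.0, 1.0)
    return mu + sigma * eps


def average_ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical encoder outputs (mean, log-variance) and ground-truth scores.
rng = random.Random(0)
encoded = [(62.0, 0.5), (85.5, 0.2), (40.0, 1.0), (73.0, 0.1)]
truth = [60.0, 88.0, 45.0, 70.0]
predicted = [sample_score(mu, lv, rng) for mu, lv in encoded]
print(spearman_rho(predicted, truth))
```

Because the variance here is an output of the encoder rather than a constant, a sketch like this also shows how a learned, per-video variance could replace the fixed variance that the abstract attributes to USDL.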

updated: Mon Nov 22 2021 07:30:04 GMT+0000 (UTC)

published: Mon Nov 22 2021 07:30:04 GMT+0000 (UTC)

arXiv

