A Challenging Benchmark of Anime Style Recognition

Haotang Li; Shengtao Guo; Kailin Lyu; Xiao Yang; Tianchen Chen; Jianqing Zhu; Huanqiang Zeng

Given two images of different anime roles, anime style recognition (ASR) aims to learn abstract painting style in order to determine whether the two images come from the same work, which is an interesting but challenging problem. Unlike biometric recognition tasks such as face recognition, iris recognition, and person re-identification, ASR suffers from a much larger semantic gap yet receives less attention. In this paper, we propose a challenging ASR benchmark. First, we collect a large-scale ASR dataset (LSASRD), which contains 20,937 images from 190 anime works, each of which has at least ten different roles. In addition to its large scale, LSASRD contains a range of challenging factors, such as complex illumination, various poses, theatrical colors, and exaggerated compositions. Second, we design a cross-role protocol to evaluate ASR performance, in which query and gallery images must come from different roles, to verify that an ASR model learns abstract painting style rather than discriminative features of individual roles. Finally, we apply two powerful person re-identification methods, namely AGW and TransReID, to establish baseline performance on LSASRD. Surprisingly, the recent transformer model (i.e., TransReID) achieves only 42.24% mAP on LSASRD. We therefore believe that the ASR task, with its huge semantic gap, deserves deep and long-term research. We will release our dataset and code at https://github.com/nkjcqvcpi/ASR.
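As a rough illustration of how a cross-role evaluation might be scored, the sketch below computes mean average precision (mAP) over a toy gallery. It is not the authors' implementation. It assumes that each image carries a feature vector, a work label, and a globally unique role label, that gallery images sharing the query's role are dropped before ranking, and that ranking uses Euclidean distance. The function names `average_precision` and `cross_role_map` and all toy data are hypothetical.

```python
from math import dist


def average_precision(ranked_hits):
    """AP for one ranked list of booleans (True = same work as the query).

    Returns None when the list contains no positives.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else None


def cross_role_map(queries, gallery):
    """Mean AP over queries; items are (feature, work_id, role_id) tuples.

    Assumption: role_id is unique across all works, and "cross-role" means
    gallery images of the query's own role are excluded before ranking.
    """
    aps = []
    for q_feat, q_work, q_role in queries:
        # Keep only gallery images drawn from a different role than the query.
        candidates = [g for g in gallery if g[2] != q_role]
        # Rank by Euclidean distance to the query feature (closest first).
        candidates.sort(key=lambda g: dist(q_feat, g[0]))
        ap = average_precision([g[1] == q_work for g in candidates])
        if ap is not None:  # skip queries with no cross-role positives
            aps.append(ap)
    return sum(aps) / len(aps) if aps else 0.0


# Toy data (hypothetical): 2-D features, two works, two roles per work.
queries = [
    ((0.0, 0.0), "work_a", "a1"),
    ((1.0, 1.0), "work_b", "b1"),
]
gallery = [
    ((0.1, 0.0), "work_a", "a1"),  # same role as the first query: excluded for it
    ((0.2, 0.0), "work_b", "b2"),
    ((0.3, 0.0), "work_a", "a2"),
    ((1.0, 0.9), "work_b", "b2"),
]

print(f"cross-role mAP: {cross_role_map(queries, gallery):.4f}")
```

Under this filter, a model that only memorizes role identity gets no credit for retrieving the same character, so a correct match has to come from style cues shared across roles of the same work.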

updated: Fri Apr 29 2022 12:09:42 GMT+0000 (UTC)

published: Fri Apr 29 2022 12:09:42 GMT+0000 (UTC)

arXiv
