Understanding Aesthetics with Language: A Photo Critique Dataset for Aesthetic Assessment

Daniel Vera Nieto; Luigi Celona; Clara Fernandez-Labrador

Computational inference of aesthetics is an ill-defined task due to its subjective nature. Many datasets have been proposed to tackle the problem by providing pairs of images and aesthetic scores based on human ratings. However, humans are better at expressing their opinions, tastes, and emotions through language than at summarizing them in a single number. Photo critiques in fact provide much richer information, as they reveal how and why users rate the aesthetics of visual stimuli. In this regard, we propose the Reddit Photo Critique Dataset (RPCD), which contains tuples of images and photo critiques. RPCD consists of 74K images and 220K comments and is collected from a Reddit community used by hobbyist and professional photographers to improve their photography skills by leveraging constructive community feedback. The proposed dataset differs from previous aesthetics datasets in three main aspects: (i) the large scale of the dataset and the extent of its comments, which critique different aspects of the image, (ii) it contains mostly UltraHD images, and (iii) it can easily be extended with new data, as it is collected through an automatic pipeline. To the best of our knowledge, this work is the first attempt to estimate the aesthetic quality of visual stimuli from critiques. To this end, we exploit the sentiment polarity of the critiques as an indicator of aesthetic judgment. We demonstrate that sentiment polarity correlates positively with the aesthetic judgments available for two aesthetic assessment benchmarks. Finally, we experiment with several models using the sentiment scores as a target for ranking images. The dataset and baselines are available (https://github.com/mediatechnologycenter/aestheval).
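The idea of using critique sentiment as a proxy for aesthetic judgment can be illustrated with a small, dependency-free Python sketch. It is not the authors' pipeline: the mean aggregation of per-comment polarities, the choice of Spearman rank correlation, the image identifiers, and all numeric values are assumptions made for illustration only, and the polarities are taken as already computed by some sentiment model.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def image_sentiment_score(comment_polarities):
    """Assumed aggregation: mean of per-comment polarities in [-1, 1]."""
    return mean(comment_polarities)


# Hypothetical toy data: polarity of each critique, grouped by image.
critique_polarities = {
    "img_a": [0.8, 0.6, 0.7],
    "img_b": [-0.2, 0.1],
    "img_c": [-0.6, -0.4, -0.5],
    "img_d": [0.3, 0.5],
}
# Hypothetical human aesthetic ratings for the same images.
human_scores = {"img_a": 7.9, "img_b": 5.1, "img_c": 3.2, "img_d": 6.0}

ids = sorted(critique_polarities)
sentiment = [image_sentiment_score(critique_polarities[i]) for i in ids]
reference = [human_scores[i] for i in ids]

# Agreement between sentiment-based and human orderings of the images.
rho = spearman(sentiment, reference)

# Rank images from most to least positively critiqued.
ranking = sorted(
    ids,
    key=lambda i: image_sentiment_score(critique_polarities[i]),
    reverse=True,
)
print(rho, ranking)
```

A rank correlation is used here because the abstract frames the sentiment score as a target for ranking images; on real data the agreement would be far from perfect, and other aggregation or correlation choices are equally plausible.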

updated: Fri Jun 17 2022 08:16:20 GMT+0000 (UTC)

published: Fri Jun 17 2022 08:16:20 GMT+0000 (UTC)

arXiv
