Multimodal Learning for Hateful Memes Detection

Yi Zhou; Zhenhao Chen

Memes are multimedia documents containing images and phrases that usually build a humorous meaning when combined. However, hateful memes also spread hatred within social networks. Automatically detecting hateful memes would help decrease their harmful societal impact. Unlike conventional multimodal tasks, where the visual and textual information is semantically aligned, the challenge of hateful memes detection lies in its unique multimodal information. The multimodal information in memes is weakly aligned or even irrelevant, so a model must not only understand the content of a meme but also reason over its multiple modalities. In this paper, we focus on hateful memes detection for multimodal memes and propose a novel method that incorporates the image captioning process into the memes detection process. We conducted extensive experiments on multimodal meme datasets and illustrated the effectiveness of our approach. Our model also achieves promising results on the Hateful Memes detection challenge.
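
The abstract does not say how the generated caption is combined with the image and the meme text. As a rough illustration only, the sketch below assumes the simplest option: each modality has already been encoded as a feature vector, the three vectors are concatenated, and a linear classifier scores the result. The function names, the concatenation strategy, and the weights are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def fuse_features(
    image_vec: Sequence[float],
    text_vec: Sequence[float],
    caption_vec: Sequence[float],
) -> List[float]:
    """Concatenate image, meme-text, and caption features (assumed fusion)."""
    return list(image_vec) + list(text_vec) + list(caption_vec)


def hateful_probability(
    fused: Sequence[float],
    weights: Sequence[float],
    bias: float = 0.0,
) -> float:
    """Score a fused vector with a linear layer followed by a sigmoid.

    The weights are placeholders; in practice they would be learned.
    """
    if len(fused) != len(weights):
        raise ValueError("fused vector and weights must have the same length")
    logit = sum(f * w for f, w in zip(fused, weights)) + bias
    # Numerically stable sigmoid for both signs of the logit.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


# Toy usage with made-up feature values.
fused = fuse_features([0.2, 0.1], [0.7], [0.4, 0.3])
score = hateful_probability(fused, weights=[0.5, -0.2, 1.0, 0.3, 0.1])
```

Feeding the caption in as a third input gives the classifier a textual description of the image to compare against the overlaid meme text, which is one plausible way to address the weak alignment between modalities that the abstract describes.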

updated: Thu Nov 26 2020 03:57:32 GMT+0000 (UTC)

published: Wed Nov 25 2020 16:49:15 GMT+0000 (UTC)

arXiv
