Video-based Cross-modal Auxiliary Network for Multimodal Sentiment Analysis

Rongfei Chen; Wenju Zhou; Yang Li; Huiyu Zhou

Multimodal sentiment analysis has a wide range of applications because the modalities carry complementary information. Previous works focus on finding efficient joint representations, but they rarely address insufficient unimodal feature extraction or the data redundancy introduced by multimodal fusion. This paper proposes a Video-based Cross-modal Auxiliary Network (VCAN), which comprises an audio features map module and a cross-modal selection module. The first module is designed to substantially increase feature diversity in audio feature extraction, aiming to improve classification accuracy by providing more comprehensive acoustic representations. The second module is designed to handle redundant visual features by efficiently filtering out redundant visual frames while audiovisual data are integrated. In addition, a classifier group consisting of several image classification networks is introduced to predict sentiment polarities and emotion categories. Extensive experiments on the RAVDESS, CMU-MOSI, and CMU-MOSEI benchmarks indicate that VCAN significantly outperforms state-of-the-art methods in classification accuracy for multimodal sentiment analysis.

updated: Tue Aug 30 2022 02:08:06 GMT+0000 (UTC)

published: Tue Aug 30 2022 02:08:06 GMT+0000 (UTC)

arXiv
