Relational Future Captioning Model for Explaining Likely Collisions in Daily Tasks

Motonari Kambara; Komei Sugiura

Domestic service robots that support daily tasks are a promising solution for elderly or disabled people. It is crucial for domestic service robots to explain the collision risk before they perform actions. In this paper, our aim is to generate a caption about a future event. We propose the Relational Future Captioning Model (RFCM), a crossmodal language generation model for the future captioning task. The RFCM has the Relational Self-Attention Encoder to extract the relationships between events more effectively than the conventional self-attention in transformers. We conducted comparison experiments, and the results show the RFCM outperforms a baseline method on two datasets.

updated: Tue Jul 19 2022 05:42:14 GMT+0000 (UTC)

published: Tue Jul 19 2022 05:42:14 GMT+0000 (UTC)

arXiv
