Globetrotter: Unsupervised Multilingual Translation from Visual Alignment

Dídac Surís; Dave Epstein; Carl Vondrick

Multi-language machine translation without parallel corpora is challenging because there is no explicit supervision between languages. Existing unsupervised methods typically rely on topological properties of the language representations. We introduce a framework that instead uses the visual modality to align multiple languages, using images as the bridge between them. We estimate the cross-modal alignment between language and images, and use this estimate to guide the learning of cross-lingual representations. Our language representations are trained jointly in one model with a single stage. Experiments with fifty-two languages show that our method outperforms baselines on unsupervised word-level and sentence-level translation using retrieval.
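The abstract evaluates translation "using retrieval". The sketch below illustrates that general idea only, assuming sentences from different languages have already been embedded into one shared vector space: the translation of a query is the candidate in the target language whose embedding is closest by cosine similarity. The function names (`cosine`, `translate_by_retrieval`) and the three-dimensional toy embeddings are hypothetical and made up for illustration; they are not the authors' model, code, or data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def translate_by_retrieval(query_vec, candidates):
    """Return the candidate sentence whose embedding is most similar to the query.

    candidates: dict mapping a target-language sentence to its embedding.
    """
    return max(candidates, key=lambda s: cosine(query_vec, candidates[s]))


# Toy, hand-made embeddings (assumption: a shared cross-lingual space exists).
spanish_candidates = {
    "un perro en la playa": [0.9, 0.1, 0.0],
    "una taza de café": [0.0, 0.2, 0.95],
}
english_query = [0.85, 0.15, 0.05]  # stands in for "a dog on the beach"

best = translate_by_retrieval(english_query, spanish_candidates)
print(best)
```

In this toy setup the query vector was chosen to lie near the first candidate, so retrieval selects it. In a real system the embeddings would come from the trained language encoder rather than being written by hand.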

updated: Tue Dec 08 2020 18:50:40 GMT+0000 (UTC)

published: Tue Dec 08 2020 18:50:40 GMT+0000 (UTC)

arXiv
