Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

Raphaël Barman; Maud Ehrmann; Simon Clematide; Sofia Ares Oliveira; Frédéric Kaplan

The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration. Research seeking to automatically process facsimiles and extract information from them is multiplying, with document layout analysis as an essential first step. While the identification and categorization of segments of interest in document images have seen significant progress in recent years thanks to deep learning techniques, many challenges remain, including the use of finer-grained segmentation typologies and the handling of complex, heterogeneous documents such as historical newspapers. Moreover, most approaches consider visual features only, ignoring the textual signal. In this context, we introduce a multimodal approach for the semantic segmentation of historical newspapers that combines visual and textual features. Based on a series of experiments on diachronic Swiss and Luxembourgish newspapers, we investigate, among other things, the predictive power of visual and textual features and their capacity to generalize across time and sources. Results show consistent improvement of multimodal models over a strong visual baseline, as well as better robustness to high material variance.

updated: Mon Dec 14 2020 16:56:29 GMT+0000 (UTC)

published: Fri Feb 14 2020 17:56:18 GMT+0000 (UTC)

arXiv
