Word separation in continuous sign language using isolated signs and post-processing

Razieh Rastgoo; Kourosh Kiani; Sergio Escalera

Continuous Sign Language Recognition (CSLR) is a long-standing challenge in computer vision because the explicit boundaries between the words in a sign sentence are difficult to detect. To address this, we propose a two-stage model. In the first stage, a predictor model that combines a CNN, SVD, and an LSTM is trained on isolated signs. In the second stage, a post-processing algorithm is applied to the Softmax outputs of the first stage to separate the isolated signs within continuous sign videos. Because no large dataset includes both sign sequences and the corresponding isolated signs, two public Isolated Sign Language Recognition (ISLR) datasets, RKS-PERSIANSIGN and ASLVID, are used for evaluation. Results on continuous sign videos confirm the efficiency of the proposed model in detecting isolated sign boundaries.
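
The abstract does not spell out the post-processing algorithm, so the following is only a minimal sketch of one plausible way to turn per-step Softmax outputs into sign segments, not the authors' method. It assumes the isolated-sign predictor has already produced one probability vector per time step (for example, per sliding window), then keeps confident steps and merges consecutive steps that share the same predicted class. The function name `segment_signs` and the parameters `threshold` and `min_length` are hypothetical and chosen for illustration.

```python
from typing import List, Optional, Tuple


def segment_signs(
    probs: List[List[float]],
    threshold: float = 0.6,  # hypothetical confidence cutoff, not from the paper
    min_length: int = 2,     # hypothetical minimum run length, in time steps
) -> List[Tuple[int, int, int]]:
    """Group confident, same-class steps into (class_index, start, end) runs.

    `end` is inclusive. Steps whose top probability is below `threshold`
    are treated as transitions between signs and break the current run.
    """
    segments: List[Tuple[int, int, int]] = []
    current_label: Optional[int] = None
    start = 0
    # A trailing None acts as a sentinel so the final run is flushed.
    rows: List[Optional[List[float]]] = list(probs) + [None]
    for t, row in enumerate(rows):
        if row is None:
            label = None
        else:
            best = max(range(len(row)), key=row.__getitem__)
            label = best if row[best] >= threshold else None
        if label != current_label:
            # Close the previous run if it was a real sign and long enough.
            if current_label is not None and t - start >= min_length:
                segments.append((current_label, start, t - 1))
            current_label = label
            start = t
    return segments


# Toy input: two classes, six time steps, one low-confidence step in between.
example = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.5, 0.5],
    [0.1, 0.9],
    [0.2, 0.8],
    [0.3, 0.7],
]
print(segment_signs(example))  # [(0, 0, 1), (1, 3, 5)]
```

In this toy input, the low-confidence third step acts as the boundary between the two detected signs. A real system would likely need smoothing or a learned boundary criterion rather than a fixed threshold.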

updated: Mon Apr 11 2022 18:46:37 GMT+0000 (UTC)

published: Sat Apr 02 2022 18:34:33 GMT+0000 (UTC)

arXiv
