Smart Inference for Multidigit Convolutional Neural Network based Barcode Decoding

Thao Do; Yalew Tolcha; Tae Joon Jun; Daeyoung Kim

Barcodes are ubiquitous and have been used in most critical daily activities for decades. However, most traditional decoders require well-formed barcodes captured under relatively standard conditions. Barcodes captured in wilder conditions (underexposed, occluded, blurry, wrinkled, or rotated) are common in practice, and traditional decoders recognize them poorly. Several works have attempted to decode such challenging barcodes, but many limitations remain. This work aims to solve the decoding problem using a deep convolutional neural network that can potentially run on portable devices. First, we propose a modification of the inference procedure, named Smart Inference (SI), that is applied in the prediction phase of a trained model and exploits the barcode's checksum together with test-time augmentation. SI considerably boosts accuracy and reduces false predictions for trained models. Second, we have created a large, practical evaluation dataset of real captured 1D barcodes under various challenging conditions to test our methods rigorously; the dataset is publicly available to other researchers. The experimental results demonstrate the effectiveness of SI, with a highest accuracy of 95.85%, outperforming many existing decoders on the evaluation set. Finally, we compressed the best model by knowledge distillation into a shallow model that retains high accuracy (90.85%) with an inference speed of 34.2 ms per image on a real edge device.
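
The abstract names the two ingredients of Smart Inference, a checksum and test-time augmentation, but not the procedure itself. The following is a minimal sketch of how the two could be combined, not the authors' algorithm. It assumes an EAN-13 symbology (the abstract only says "1D barcode"), a hypothetical `predict` callable that maps an image to 13 digits, and a hypothetical list of `augmentations`. Predictions that fail the check digit are discarded, and the remaining ones are decided by majority vote.

```python
from collections import Counter
from typing import Callable, Iterable, Optional, Sequence, Tuple, TypeVar

Image = TypeVar("Image")  # any image representation; an assumption of this sketch


def ean13_checksum_ok(digits: Sequence[int]) -> bool:
    """Return True if the 13 digits satisfy the EAN-13 check-digit rule."""
    if len(digits) != 13 or any(not 0 <= d <= 9 for d in digits):
        return False
    # Weights alternate 1, 3, 1, 3, ... over the first 12 digits.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


def smart_inference(
    predict: Callable[[Image], Sequence[int]],        # hypothetical model wrapper
    image: Image,
    augmentations: Iterable[Callable[[Image], Image]],  # hypothetical transforms
) -> Optional[Tuple[int, ...]]:
    """Vote over checksum-valid predictions from augmented copies of the image.

    Returns the most frequent valid digit sequence, or None when no
    augmented prediction passes the checksum (the decoder abstains).
    """
    votes: Counter = Counter()
    for augment in augmentations:
        digits = tuple(predict(augment(image)))
        if ean13_checksum_ok(digits):
            votes[digits] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

Returning `None` instead of an unverified guess is one way to obtain the reduction in false predictions that the abstract reports. Whether the original method abstains, votes, or stops at the first valid prediction is not stated here.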

updated: Sun Jun 27 2021 08:42:23 GMT+0000 (UTC)

published: Tue Apr 14 2020 04:30:34 GMT+0000 (UTC)

arXiv
