Plant Species Recognition with Optimized 3D Polynomial Neural Networks and Variably Overlapping Time-Coherent Sliding Window

Habib Ben Abdallah; Christopher J. Henry; Sheela Ramanna


The EAGL-I system was recently developed to rapidly create massive labeled datasets of plants, intended for use by farmers and researchers building AI-driven solutions in agriculture. To demonstrate its capabilities, a publicly available plant species recognition dataset of 40,000 images of varying sizes, covering 8 plant species, was created with the system. This paper proposes a novel method, called Variably Overlapping Time-Coherent Sliding Window (VOTCSW), that transforms a dataset of variable-size images into a fixed-size 3D representation suitable for convolutional neural networks, and demonstrates that this representation is more informative than resizing the images of the dataset to a given size. We theoretically formalized the use cases of the method as well as its inherent properties, and we proved that it has an oversampling and a regularization effect on the data. By combining the VOTCSW method with the 3D extension of a recently proposed machine learning model called 1-Dimensional Polynomial Neural Networks, we created a model that achieved a state-of-the-art accuracy of 99.9% on the dataset created by the EAGL-I system, surpassing well-known architectures such as ResNet and Inception. In addition, we created a heuristic algorithm that reduces the degree of any pre-trained N-Dimensional Polynomial Neural Network and compresses it without altering its performance, making the model faster and lighter. Furthermore, we established that the currently available dataset cannot be used for machine learning in its present form because of a substantial class imbalance between the training set and the test set. We therefore created a specific preprocessing and model development framework that improved the accuracy from 49.23% to 99.9%.
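To make the idea of a fixed-size 3D representation concrete, the following is a minimal sketch of a sliding window whose overlap adapts to the image size. It is not the paper's VOTCSW formulation, which the abstract does not specify. The function names (`window_starts`, `sliding_window_stack`), the window size, the grid size, and the serpentine scan order used to keep consecutive windows spatially adjacent are all assumptions made for illustration.

```python
import numpy as np


def window_starts(length, window, count):
    """Return `count` evenly spaced start offsets for windows of size `window`.

    The spacing depends on `length`, so the overlap between neighbouring
    windows varies from image to image (assumed behaviour, for illustration).
    """
    if length < window:
        raise ValueError("image dimension is smaller than the window")
    return np.round(np.linspace(0, length - window, count)).astype(int)


def sliding_window_stack(image, window=(32, 32), grid=(4, 4)):
    """Convert an H x W x C image into a (rows*cols, h, w, C) stack.

    Every image, whatever its size, yields a stack of the same shape.
    """
    h, w = window
    rows, cols = grid
    ys = window_starts(image.shape[0], h, rows)
    xs = window_starts(image.shape[1], w, cols)

    frames = []
    for r, y in enumerate(ys):
        # Serpentine scan: reverse every other row so that consecutive
        # frames in the stack are always neighbouring windows.
        row_xs = xs if r % 2 == 0 else xs[::-1]
        for x in row_xs:
            frames.append(image[y:y + h, x:x + w])
    return np.stack(frames)
```

With these defaults, a 100 x 80 x 3 image and a 57 x 213 x 3 image both map to a stack of shape (16, 32, 32, 3), which a 3D convolutional model can consume without resizing the original images.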

updated: Mon Aug 29 2022 16:36:08 GMT+0000 (UTC)

published: Fri Mar 04 2022 23:37:12 GMT+0000 (UTC)

arXiv

