Model Compression Methods for YOLOv5: A Review

Mohammad Jani; Jamil Fayyad; Younes Al-Younes; Homayoun Najjaran

Over the past few years, extensive research has been devoted to enhancing YOLO object detectors. Since its introduction, eight major versions of YOLO have been released with the purpose of improving accuracy and efficiency. While the evident merits of YOLO have led to its extensive use in many areas, deploying it on resource-limited devices poses challenges. To address this issue, various neural network compression methods have been developed, which fall under three main categories: network pruning, quantization, and knowledge distillation. The benefits of model compression, such as lower memory usage and shorter inference time, make these methods favorable, if not necessary, for deploying large neural networks on hardware-constrained edge devices. In this review paper, we focus on pruning and quantization because of their comparative modularity. We categorize these methods and analyze the practical results of applying them to YOLOv5. By doing so, we identify gaps in adapting pruning and quantization for compressing YOLOv5 and provide future directions for further exploration in this area. Among the several versions of YOLO, we specifically choose YOLOv5 for its excellent trade-off between recency and popularity in the literature. This is the first review paper dedicated to surveying pruning and quantization methods from an implementation point of view on YOLOv5. Our study also extends to newer versions of YOLO, since deploying them on resource-limited devices poses the same challenges, which persist today. This paper targets those interested in the practical deployment of model compression methods on YOLOv5, and in exploring different compression techniques that can be used for subsequent versions of YOLO.

updated: Fri Jul 21 2023 21:07:56 GMT+0000 (UTC)

published: Fri Jul 21 2023 21:07:56 GMT+0000 (UTC)

arXiv
