Problem-dependent attention and effort in neural networks with applications to image resolution and model selection

Chris Rohlfs

This paper introduces two new ensemble-based methods to reduce the data and computation costs of image classification. They can be used with any set of classifiers and require no additional training. In the first approach, data usage is reduced by analyzing a full-sized image only if the model has low confidence in classifying a low-resolution, pixelated version. When applied to the best-performing classifiers considered here, data usage is reduced by 61.2% on MNIST, 69.6% on KMNIST, 56.3% on FashionMNIST, 84.6% on SVHN, 40.6% on ImageNet, and 27.6% on ImageNet-V2, all with a less than 5% reduction in accuracy. For CIFAR-10, however, the pixelated data are not particularly informative, and the ensemble approach increases data usage while reducing accuracy. In the second approach, compute costs are reduced by using a complex model only if a simpler model has low confidence in its classification. Computation cost is reduced by 82.1% on MNIST, 47.6% on KMNIST, 72.3% on FashionMNIST, 86.9% on SVHN, 89.2% on ImageNet, and 81.5% on ImageNet-V2, all with a less than 5% reduction in accuracy; for CIFAR-10 the corresponding improvement is smaller, at 13.5%. When cost is no object, choosing the projection from the most confident model for each observation increases validation accuracy from 79.3% to 81.0% for ImageNet and from 67.5% to 69.4% for ImageNet-V2.
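Both approaches in the abstract share one decision rule: accept the cheap prediction when it is confident, otherwise escalate to the expensive one. The sketch below illustrates that rule and the "most confident model" selection. It is not the paper's implementation. The abstract does not say how confidence is measured, so the use of the top-class probability as the confidence score is an assumption, as are the function names and the `threshold` parameter. For simplicity the expensive probabilities are precomputed for every observation; a real pipeline would compute them only for the escalated cases.

```python
import numpy as np


def cascade_predict(cheap_probs, costly_probs, threshold):
    """Keep the cheap prediction when its top-class probability is at
    least `threshold`; otherwise use the costly prediction.

    `cheap_probs` could come from a low-resolution image or a simple
    model, `costly_probs` from the full-sized image or a complex model.
    Both are arrays of shape (n_observations, n_classes).
    Confidence = max class probability is an assumption of this sketch.
    """
    cheap_probs = np.asarray(cheap_probs, dtype=float)
    costly_probs = np.asarray(costly_probs, dtype=float)
    confident = cheap_probs.max(axis=1) >= threshold
    labels = np.where(
        confident,
        cheap_probs.argmax(axis=1),
        costly_probs.argmax(axis=1),
    )
    # Share of observations that needed the costly path.
    escalated_fraction = 1.0 - confident.mean()
    return labels, escalated_fraction


def most_confident_predict(probs_by_model):
    """For each observation, return the label from whichever model has
    the highest top-class probability.

    `probs_by_model` has shape (n_models, n_observations, n_classes).
    """
    stacked = np.asarray(probs_by_model, dtype=float)
    top_prob = stacked.max(axis=2)        # (n_models, n_observations)
    top_label = stacked.argmax(axis=2)    # (n_models, n_observations)
    best_model = top_prob.argmax(axis=0)  # (n_observations,)
    return top_label[best_model, np.arange(stacked.shape[1])]


# Toy probabilities for three observations and two classes (made up).
cheap = [[0.90, 0.10], [0.55, 0.45], [0.20, 0.80]]
costly = [[0.60, 0.40], [0.10, 0.90], [0.30, 0.70]]

labels, escalated = cascade_predict(cheap, costly, threshold=0.75)
best = most_confident_predict([cheap, costly])
```

In the toy data only the second observation falls below the threshold, so it alone is passed to the costly path. Raising `threshold` escalates more observations, trading cost savings for accuracy.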

updated: Fri Apr 14 2023 00:08:08 GMT+0000 (UTC)

published: Wed Jan 05 2022 02:27:35 GMT+0000 (UTC)

arXiv
