Imbalanced Malware Images Classification: a CNN based Approach

Songqing Yue; Tianyang Wang

Deep convolutional neural networks (CNNs) can be applied to malware binary detection via image classification. However, performance degrades when malware families (classes) are imbalanced. To mitigate this issue, we propose a simple yet effective weighted softmax loss that can serve as the final layer of a deep CNN. The original softmax loss is weighted, with the weight value determined by class size and a scaling parameter. We study how to select this parameter and suggest an empirical choice. The weighted loss aims to alleviate the impact of data imbalance in an end-to-end learning fashion. To validate its efficacy, we deploy the proposed weighted loss in a pre-trained deep CNN model and fine-tune it, achieving promising results on malware image classification. Extensive experiments also demonstrate that the new loss function fits other typical CNNs well and improves their classification performance.
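The abstract does not state the weighting formula, so the following is only a minimal sketch of the general idea: a softmax (cross-entropy) loss whose per-sample value is scaled by a class-dependent weight. The weight rule `w_c = 1 + beta * (1 - n_c / n_max)`, the function names, and the parameter name `beta` are all assumptions made for illustration and are not taken from the paper.

```python
import math

def class_weights(class_sizes, beta=1.0):
    # Hypothetical rule (an assumption, not the paper's formula):
    # the largest class gets weight 1, smaller classes get up to 1 + beta.
    n_max = max(class_sizes)
    return [1.0 + beta * (1.0 - n / n_max) for n in class_sizes]

def weighted_softmax_loss(logits, label, weights):
    # Numerically stable log-softmax via the log-sum-exp trick.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_prob = logits[label] - log_sum_exp
    # Scale the usual cross-entropy term by the weight of the true class.
    return -weights[label] * log_prob

# Illustrative class sizes (made up): one large family and two smaller ones.
sizes = [3000, 300, 30]
w = class_weights(sizes, beta=1.0)

# With identical logits, a mistake on the rare class costs more.
loss_major = weighted_softmax_loss([0.0, 0.0, 0.0], 0, w)
loss_minor = weighted_softmax_loss([0.0, 0.0, 0.0], 2, w)
```

Under this assumed rule, setting `beta` to 0 recovers the unweighted softmax loss, and larger values penalize errors on small classes more heavily, which is the role the abstract attributes to the scaling parameter.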

updated: Sun Feb 20 2022 03:51:03 GMT+0000 (UTC)

published: Sun Aug 27 2017 02:27:59 GMT+0000 (UTC)

arXiv
