TransCL: Transformer Makes Strong and Flexible Compressive Learning

Chong Mou; Jian Zhang

Compressive learning (CL) is an emerging framework that integrates signal acquisition via compressed sensing (CS) with machine learning, so that inference tasks run directly on a small number of measurements. It is a promising alternative to classical image-domain methods and offers substantial savings in memory and computation. However, previous attempts at CL are limited to a fixed CS ratio, which lacks flexibility, and to MNIST/CIFAR-like datasets; they do not scale to complex real-world high-resolution (HR) data or vision tasks. In this paper, we propose TransCL, a novel transformer-based compressive learning framework for large-scale images with arbitrary CS ratios. Specifically, TransCL first adopts learnable block-based compressed sensing and introduces a flexible linear projection strategy, which enables CL to be performed on large-scale images efficiently, block by block, at arbitrary CS ratios. Then, treating the CS measurements from all blocks as a sequence, it deploys a pure transformer-based backbone to perform vision tasks with various task-oriented heads. Our analysis shows that TransCL exhibits strong resistance to interference and robust adaptability to arbitrary CS ratios. Extensive experiments on complex HR data demonstrate that TransCL achieves state-of-the-art performance in image classification and semantic segmentation. In particular, TransCL with a CS ratio of 10% obtains almost the same performance as operating directly on the original data, and it still obtains satisfactory performance at an extremely low CS ratio of 1%. The source code of TransCL is available at https://github.com/MC-E/TransCL/.
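
To make the block-by-block measurement step concrete, here is a minimal NumPy sketch. It is not taken from the TransCL repository: the function name `block_measurements`, the block size of 16, the random Gaussian matrix standing in for the learned sampling matrix, and the choice to realize an arbitrary CS ratio by keeping the first m rows of one shared matrix are all assumptions made for illustration.

```python
import numpy as np


def block_measurements(image, block=16, ratio=0.1, phi=None, seed=0):
    """Return per-block CS measurements of a 2-D image, shape (num_blocks, m).

    Hypothetical helper: phi is a random stand-in for a learned sampling
    matrix, and the CS ratio is applied by truncating phi to its first m rows
    (an assumption, not necessarily the paper's projection strategy).
    """
    h, w = image.shape
    if h % block or w % block:
        raise ValueError("image height and width must be multiples of block")
    n = block * block  # number of pixels per block
    if phi is None:
        rng = np.random.default_rng(seed)
        phi = rng.standard_normal((n, n)) / np.sqrt(n)
    m = max(1, int(round(ratio * n)))  # measurements kept per block
    # Split into non-overlapping blocks and flatten each one: (num_blocks, n).
    blocks = (
        image.reshape(h // block, block, w // block, block)
        .transpose(0, 2, 1, 3)
        .reshape(-1, n)
    )
    # One linear projection per block; the rows form the measurement sequence.
    return blocks @ phi[:m].T


image = np.arange(64 * 64, dtype=float).reshape(64, 64)
y_10 = block_measurements(image, ratio=0.10)
y_01 = block_measurements(image, ratio=0.01)
print(y_10.shape, y_01.shape)
```

Under these assumptions, each row of the result is the measurement vector of one block, and the rows together form the sequence that a transformer backbone would consume. Because every ratio reuses the same matrix, the measurements at a lower ratio are a prefix of those at a higher ratio.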

updated: Mon Jul 25 2022 08:21:48 GMT+0000 (UTC)

published: Mon Jul 25 2022 08:21:48 GMT+0000 (UTC)

arXiv
