Δ-Patching: A Framework for Rapid Adaptation of Pre-trained Convolutional Networks without Base Performance Loss

Chaitanya Devaguptapu; Samarth Sinha; K J Joseph; Vineeth N Balasubramanian; Animesh Garg

Models pre-trained on large-scale datasets are often fine-tuned to support newer tasks and datasets that arrive over time. This process requires storing a separate copy of the model for each task that the pre-trained model is fine-tuned on. Building on recent model patching work, we propose Δ-Patching for fine-tuning neural network models efficiently, without the need to store model copies. We propose a simple and lightweight method called Δ-Networks to achieve this objective. Our comprehensive experiments across settings and architecture variants show that Δ-Networks outperform earlier model patching work while requiring only a fraction of the parameters to be trained. We also show that this approach can be used for other problem settings such as transfer learning and zero-shot domain adaptation, as well as other tasks such as detection and segmentation.

updated: Thu Sep 21 2023 08:56:46 GMT+0000 (UTC)

published: Sun Mar 26 2023 16:39:44 GMT+0000 (UTC)

arXiv
