

Poster

Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts

Shengzhuang Chen · Jihoon Tack · Yunqiao Yang · Yee-Whye Teh · Jonathan Richard Schwarz · Ying Wei


Abstract:

Conventional wisdom suggests parameter-efficient fine-tuning of foundation models as the state-of-the-art method for transfer learning in vision, replacing the rich literature of alternatives such as meta-learning. In trying to harness the best of both worlds, meta-tuning introduces a subsequent optimization stage of foundation models but has so far only shown limited success and crucially tends to underperform on out-of-domain (OOD) tasks. In this paper, we introduce Sparse MetA-Tuning (SMAT), a method inspired by sparse mixture-of-experts approaches and trained to automatically isolate subsets of pre-trained parameters for meta-tuning on each task. SMAT successfully overcomes OOD sensitivity and delivers on the promise of enhancing the transfer abilities of vision foundation models beyond parameter-efficient fine-tuning. We establish new state-of-the-art results on a challenging combination of Meta-Dataset augmented with additional OOD tasks in both zero-shot and gradient-based adaptation settings. In addition, we provide a thorough analysis of the superiority of learned over hand-designed sparsity patterns for sparse expert methods and the pivotal importance of the sparsity level in balancing between in-domain and out-of-domain generalization.
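The core mechanism described above — experts that modify only a sparse, learned subset of pre-trained parameters and are mixed per task — can be sketched in a few lines. This is a minimal toy illustration, not the authors' implementation: the dimensions, the top-k stand-in for a learned mask, and the softmax gating are all assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all shapes/names are illustrative assumptions, not SMAT's actual ones).
d = 8            # parameter dimension
num_experts = 3
sparsity = 0.25  # fraction of parameters each expert may modify

theta_0 = rng.normal(size=d)                 # pre-trained parameters
deltas = rng.normal(size=(num_experts, d))   # per-expert meta-tuned updates

# Sparse binary masks: here a top-k selection per expert stands in for the
# learned sparsity patterns the abstract describes.
k = max(1, int(sparsity * d))
scores = rng.normal(size=(num_experts, d))
masks = np.zeros_like(scores)
top = np.argsort(-scores, axis=1)[:, :k]
np.put_along_axis(masks, top, 1.0, axis=1)

def task_parameters(gate_logits):
    """Interpolate experts for one task: theta_0 + sum_e g_e * (m_e * delta_e)."""
    g = np.exp(gate_logits - gate_logits.max())
    g = g / g.sum()                          # softmax gating weights
    sparse_update = (g[:, None] * masks * deltas).sum(axis=0)
    return theta_0 + sparse_update

theta_task = task_parameters(rng.normal(size=num_experts))

# Parameters outside every expert's mask remain exactly at their pre-trained
# values, which is the property that isolates meta-tuning to sparse subsets.
untouched = masks.sum(axis=0) == 0
assert np.allclose(theta_task[untouched], theta_0[untouched])
```

In this sketch the interpolation is a convex combination of masked expert deltas added to the frozen backbone; the paper's zero-shot setting would correspond to using the merged `theta_task` directly, while gradient-based adaptation would fine-tune further from it.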
