

Poster

MMPareto: Innocent Uni-modal Assistance for Enhanced Multi-modal Learning

Yake Wei · Di Hu


Abstract:

Multi-modal learning methods with targeted uni-modal learning objectives have proven effective at alleviating the imbalanced multi-modal learning problem. However, in this paper we identify a previously ignored gradient conflict between the multi-modal and uni-modal learning objectives, which can mislead the optimization of the uni-modal encoders. To diminish these conflicts, we examine the discrepancy between the multi-modal loss and the uni-modal loss: both the gradient magnitude and the covariance of the easier-to-learn multi-modal loss are smaller than those of the uni-modal loss. Building on this property, we analyze Pareto integration in our multi-modal scenario and propose the MMPareto algorithm, which can ensure a final gradient whose direction is common to all learning objectives and whose magnitude is enhanced to improve generalization, providing innocent uni-modal assistance. Finally, experiments across multiple types of modalities and on frameworks with dense cross-modal interaction demonstrate the superior performance and extensibility of our method. Our method is also expected to benefit multi-task cases with a clear discrepancy in task difficulty, indicating its scalability.
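The idea of a gradient "with direction that is common to all learning objectives" can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the standard closed-form two-objective min-norm (Pareto) weighting, and the function names, the flattened-gradient inputs `g_m` (multi-modal) and `g_u` (uni-modal), and the choice of rescaling target are all assumptions made for illustration.

```python
import numpy as np


def min_norm_weight(g_m, g_u):
    """Weight alpha in [0, 1] minimising ||alpha * g_m + (1 - alpha) * g_u||.

    Standard closed form for two objectives; hypothetical helper name.
    """
    diff = g_m - g_u
    denom = float(diff @ diff)
    if denom == 0.0:
        # Identical gradients: any weighting gives the same vector.
        return 0.5
    alpha = float((g_u - g_m) @ g_u) / denom
    return float(np.clip(alpha, 0.0, 1.0))


def integrate_gradients(g_m, g_u):
    """Combine a multi-modal and a uni-modal gradient for one encoder.

    Assumed behaviour for this sketch:
    - no conflict (non-negative inner product): plain sum of both gradients;
    - conflict: min-norm combination, which has a non-negative inner
      product with both inputs, rescaled to the norm of the plain sum
      (the rescaling target is an assumption, not taken from the abstract).
    """
    if float(g_m @ g_u) >= 0.0:
        return g_m + g_u

    alpha = min_norm_weight(g_m, g_u)
    direction = alpha * g_m + (1.0 - alpha) * g_u
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        # Exactly opposing gradients: no common direction exists.
        return direction
    return direction / norm * np.linalg.norm(g_m + g_u)


if __name__ == "__main__":
    g_multi = np.array([1.0, 0.0])   # toy multi-modal gradient
    g_uni = np.array([-1.0, 1.0])    # toy uni-modal gradient (conflicting)
    g = integrate_gradients(g_multi, g_uni)
    print(g, float(g @ g_multi), float(g @ g_uni))
```

In the conflicting case the combined vector does not oppose either objective, which is the property the abstract calls a common direction; the rescaling step stands in for the "enhanced magnitude" and would need to follow the paper's actual rule in practice.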
