

Poster

Domain-wise Data Acquisition to Improve Performance under Distribution Shift

Yue He · Dongbai Li · Pengfei Tian · Han Yu · Jiashuo Liu · Hao Zou · Peng Cui


Abstract:

Despite notable progress in enhancing machine learning models' robustness to distribution shifts, training data quality remains a bottleneck for cross-distribution generalization. Recently, from a data-centric perspective, there have been considerable efforts to improve model performance by refining the preparation of training data. Inspired by realistic scenarios, this paper addresses the practical requirement of acquiring training samples from various domains on a limited budget to facilitate model generalization to a target test domain under distribution shift. Our empirical evidence indicates that advances in data acquisition can significantly benefit model performance on shifted data. Additionally, by leveraging unlabeled test domain data, we introduce a Domain-wise Active Acquiring framework. This framework iteratively optimizes the data acquisition strategy as training samples are accumulated, theoretically ensuring an effective approximation of the test distribution. Extensive real-world experiments demonstrate our proposal's advantages in machine learning applications.
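
To make the budgeted, iterative setting concrete, the sketch below shows one simple way such an acquisition loop could look. It is not the Domain-wise Active Acquiring framework itself, whose details the abstract does not give. It substitutes a plain greedy mean-matching heuristic: after a probe purchase from every domain, each round buys a batch from the domain whose estimated feature mean would pull the acquired pool closest to the mean of the unlabeled test features. The function `acquire`, its parameters, and the toy domains `A` and `B` are all hypothetical names introduced for illustration.

```python
import random


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def acquire(domains, test_unlabeled, budget, batch=10, seed=0):
    """Greedy mean-matching acquisition (illustrative heuristic, NOT the paper's method).

    domains:        dict mapping a domain name to a list of feature vectors on offer
    test_unlabeled: list of unlabeled feature vectors from the test domain
    budget:         total number of samples that may be acquired
    Returns a dict with the number of samples acquired from each domain.
    """
    rng = random.Random(seed)
    target = mean(test_unlabeled)
    # Shuffled copies, so buying "the next n" samples is a random draw.
    pools = {k: rng.sample(v, len(v)) for k, v in domains.items()}
    bought = {k: [] for k in domains}

    def buy(name, n):
        nonlocal budget
        n = min(n, budget, len(pools[name]))
        bought[name] += pools[name][:n]
        pools[name] = pools[name][n:]
        budget -= n

    # Probe round: one batch per domain, to get a first estimate of each domain.
    for name in domains:
        buy(name, batch)

    # Re-optimize the choice of domain every round as samples accumulate.
    while budget > 0:
        pool = [x for xs in bought.values() for x in xs]
        if not pool:
            break
        pool_mean, n_all = mean(pool), len(pool)
        best, best_gap = None, float("inf")
        for name in domains:
            if not pools[name] or not bought[name]:
                continue
            k = min(batch, budget, len(pools[name]))
            dom_mean = mean(bought[name])  # estimated only from samples already bought
            predicted = [(n_all * p + k * d) / (n_all + k)
                         for p, d in zip(pool_mean, dom_mean)]
            gap = sq_dist(predicted, target)
            if gap < best_gap:
                best, best_gap = name, gap
        if best is None:
            break
        buy(best, batch)

    return {k: len(v) for k, v in bought.items()}


# Toy usage: domain "B" matches the test domain, domain "A" does not.
toy_domains = {"A": [[0.0]] * 200, "B": [[5.0]] * 200}
toy_test = [[5.0]] * 50
print(acquire(toy_domains, toy_test, budget=100))  # {'A': 10, 'B': 90}
```

Matching only the first moment is a deliberately crude stand-in for approximating the test distribution; a practical strategy would need a richer discrepancy measure and a learned feature representation.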
