Poster

Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations

Jiaqi Zhai · Yunxing Liao · Xing Liu · Yueming Wang · Rui Li · Yazhi Gao · Zhaojie Gong · Xuan Cao · Fangda Gu · Michael He · Yinghai Lu · Yu Shi


Abstract:

Large-scale recommendation systems are characterized by their reliance on high-cardinality, heterogeneous features and the need to handle tens of billions of user actions on a daily basis. Despite being trained on huge volumes of data with thousands of features, we observe that most Deep Learning Recommendation Models (DLRMs) in industry fail to scale with compute. Inspired by the success of Transformers in the language and vision domains, we revisit fundamental design choices in recommendation systems. We reformulate recommendation problems as sequential transduction tasks within a generative modeling framework ("Generative Recommenders"), and propose a new architecture, HSTU, designed for high-cardinality, non-stationary streaming recommendation data. HSTU outperforms baselines on synthetic and public datasets by up to 65.8% in NDCG, and is up to 6.9x faster than state-of-the-art FlashAttention2-based Transformers. HSTU-based Generative Recommenders, with 1.5 trillion parameters, improve metrics in online A/B tests by 12.4% and have been deployed on multiple surfaces of a large internet platform with billions of users. More importantly, we discover that the model quality of Generative Recommenders empirically scales as a power law of training compute, up to GPT-3 scale, opening up new research frontiers through the application of scaling laws.
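The core reformulation above — treating a user's action history as a token sequence and predicting the next action autoregressively — can be illustrated with a minimal sketch. This is not HSTU itself; it is a toy causal self-attention layer over item-ID embeddings, with all names, sizes, and the single-layer design chosen for illustration only.

```python
import numpy as np

# Toy sketch of the "Generative Recommenders" framing: a user's action
# history is a sequence of item IDs, and a causal model predicts the next
# action. Sizes and variable names are illustrative, not from the paper.
rng = np.random.default_rng(0)

VOCAB = 50                 # number of distinct items/actions (toy scale)
DIM = 16                   # embedding dimension
SEQ = [3, 17, 8, 42, 17]   # one user's action history (item IDs)
T = len(SEQ)

# Item embedding table and a single self-attention layer (random weights;
# a real model would learn these by next-action prediction).
E = rng.normal(scale=0.1, size=(VOCAB, DIM))
Wq, Wk, Wv = (rng.normal(scale=0.1, size=(DIM, DIM)) for _ in range(3))

X = E[SEQ]                           # (T, DIM) embedded action sequence
Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(DIM)      # (T, T) attention logits

# Causal mask: position t may only attend to positions <= t, so each
# prediction depends only on past actions.
mask = np.triu(np.ones((T, T), dtype=bool), k=1)
scores[mask] = -np.inf
A = np.exp(scores - scores.max(axis=-1, keepdims=True))
A /= A.sum(axis=-1, keepdims=True)   # row-wise softmax

H = A @ V                            # contextualized states, (T, DIM)
logits = H @ E.T                     # next-action logits over the vocabulary
next_item = int(np.argmax(logits[-1]))  # greedy prediction of the next action
print(next_item, logits.shape)
```

Training such a model reduces recommendation to the same objective as language modeling (cross-entropy on the next token), which is what makes compute scaling-law analysis applicable.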
