

Poster

Generating, Reconstructing, and Representing Discrete and Continuous Data: Generalized Diffusion with Learnable Encoding-Decoding

Guangyi Liu · Yu Wang · Zeyu Feng · Qiyu Wu · Liping Tang · Yuan Gao · Zhen Li · Shuguang Cui · Julian McAuley · Eric Xing · Zichao Yang · Zhiting Hu


Abstract:

The vast applications of deep generative models are anchored in three core capabilities (generating new instances, reconstructing inputs, and learning compact representations) across various data types, such as discrete text/protein sequences and continuous images. Existing model families, like Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), autoregressive models, and diffusion models, generally excel in specific capabilities and data types but fall short in others. We introduce generalized diffusion with learnable encoder-decoder (DILED), which seamlessly integrates the core capabilities for broad applicability and enhanced performance. DILED generalizes the Gaussian noising-denoising in standard diffusion by introducing parameterized encoding-decoding. Crucially, DILED is compatible with the well-established diffusion model objective and training recipes, allowing effective learning of the encoder-decoder parameters jointly with diffusion. With an appropriate choice of encoder and decoder (e.g., large language models), DILED naturally applies to different data types. Extensive experiments on text, proteins, and images demonstrate DILED’s flexibility in handling diverse data and tasks and its strong improvement over various existing models.
