ICML
2024
Papers
Understanding the Training Speedup from Sampling with Approximate Losses
Position Paper: Leveraging Foundational Models for Black-Box Optimization: Benefits, Challenges, and Future Directions
ReLUs Are Sufficient for Learning Implicit Neural Representations
Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning
Equivariant Diffusion for Crystal Structure Prediction
Criterion collapse and loss distribution control
Non-Asymptotic Analysis for Single-Loop (Natural) Actor-Critic with Compatible Function Approximation
Constrained Reinforcement Learning Under Model Mismatch
PASOA- PArticle baSed Bayesian Optimal Adaptive design
SIN: Selective and Interpretable Normalization for Long-Term Time Series Forecasting
Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers
PIPER: Primitive-Informed Preference-based Hierarchical Reinforcement Learning via Hindsight Relabeling
Optimal Acceleration for Minimax and Fixed-Point Problems is Not Unique
LoRA Training in the NTK Regime has No Spurious Local Minima
DiffUCO: A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization
UniAudio: Towards Universal Audio Generation with Large Language Models
Prompt-tuning Latent Diffusion Models for Inverse Problems
Quality-Diversity with Limited Resources
Graph Distillation with Eigenbasis Matching
Scale-Free Image Keypoints Using Differentiable Persistent Homology
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
Self-Correcting Self-Consuming Loops for Generative Model Training
Long-tail Learning with Foundation Model: Heavy Fine-tuning Hurts
Self-Rewarding Language Models
Towards Generalization beyond Pointwise Learning: A Unified Information-theoretic Perspective
Learning Label Shift Correction for Test-Agnostic Long-Tailed Recognition
Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning
Neural Collapse meets Differential Privacy: Curious behaviors of NoisyGD with Near-Perfect Representation Learning
Deep Equilibrium Models are Almost Equivalent to Not-so-deep Explicit Models for High-dimensional Gaussian Mixtures
Training-Free Long-Context Scaling of Large Language Models
LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models
Improving Adversarial Energy-Based Model via Diffusion Process
S$\Omega$I: Score-based O-INFORMATION Estimation
Expand-and-Cluster: Parameter Recovery of Neural Networks
Learning Associative Memories with Gradient Descent: An Interacting Particle Study
Implicit Representations for Constrained Image Segmentation
Hyperbolic Active Learning for Semantic Segmentation under Domain Shift
Trust the Model Where It Trusts Itself - Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption
Fast Timing-Conditioned Latent Audio Diffusion
Feature Contamination: On the Feasibility of Learning Representations that Generalize Out-of-Distribution
Individual Fairness in Graph Decomposition
Accelerating Federated Learning with Quick Distributed Mean Estimation
MALIBO: Meta-learning for Likelihood-free Bayesian Optimization
Out-of-Domain Generalization in Dynamical Systems Reconstruction
Optimal Network Topologies for Dynamical Systems Reconstruction
The Expressive Power of Path based Graph Neural Networks
Restoring balance: principled under/oversampling for optimal data classification
Safe and Robust Subgame Exploitation in Imperfect Information Games
Generalization Error of Graph Neural Networks in the Mean-field Regime
Defining Neural Network Architecture through Polytope Structure of Dataset
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models
Auditing Private Prediction
Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations
Disparate Impact on Group Accuracy of Linearization for Private Inference
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics
Surprisingly Strong Performance Prediction with Neural Graph Features
On the Universality of Coupling-Based Normalizing Flows
Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution
Interpretable Distribution-Invariant Fairness Measures for Continuous Scores
Scaling Down Deep Learning with MNIST-1D
Can Machines Learn the True Probability?
VNNs: Verification-Friendly Neural Networks with Hard Robustness Guarantees
HexGen: Generative Inference of Large-Scale Foundation Model over Heterogeneous Decentralized Environment
The Selected-completely-at-random Complementary Label is a Practical Weak Supervision for Multi-class Classification
Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training
Generating Chain-of-Thoughts with a Direct Pairwise-Comparison Approach to Find the Most Promising Intermediate Thought
A Federated Stochastic Multi-level Compositional Minimax Algorithm for Deep AUC Maximization
On the Convergence of Projected Bures-Wasserstein Gradient Descent under Euclidean Convexity
Deconstructing the Goldilocks Zone of Neural Network Initialization
Beyond the Federation: Topology-aware Federated Learning for Generalization to Unseen Clients
Tackling Byzantine Clients in Federated Learning
Federated Combinatorial Optimization with Multi-Agent Multi-Armed Bandits
Stochastic Q-learning for Large Discrete Action Spaces
Position paper: A call for embodied AI
CHAI: Clustered Head Attention for Efficient LLM Inference
Enabling Few-Shot Learning with PID Control: A Layer Adaptive Optimizer
Accelerating Heterogeneous Federated Learning with Closed-form Classifiers
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
On the Feasibility of Single-Pass Full-Capacity Learning in Linear Threshold Neurons with Binary Input Vectors
Characterizing ResNet's Universal Approximation Capabilities
Interacting Diffusion Processes for Event Sequence Forecasting
Prompt-guided Precise Audio Editing with Diffusion Models
Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
SEMIQ: Semi-Supervised Learning of Quantum Data with Application to Quantum System Certification
Trained Random Forests Completely Reveal your Dataset
TERD: A Unified Framework for Backdoor Defense on Diffusion Model
Variational Schrödinger Diffusion Models
GeoMFormer: A General Architecture for Geometric Molecular Representation Learning
Temporal Spiking Neural Networks with Synaptic Delay for Graph Reasoning
WISER: Weak supervISion and supErvised Representation learning to improve drug response prediction in cancer
Bayesian Program Learning by Decompiling Amortized Knowledge
PID: Prompt-Independent Data Protection Against Latent Diffusion Models
Foundation Policies with Hilbert Representations
Quality-Weighted Vendi Scores For Diverse Experimental Design
Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer
Density Ratio Estimation with Doubly Strong Robustness
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention
Flexible Residual Binarization for Image Super-Resolution
Stereographic Spherical Sliced Wasserstein Distances
SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Image Fusion via Vision-Language Model
Compressing Large Language Models by Joint Sparsification and Quantization
CKGConv: General Graph Convolution with Continuous Kernels
Memorization Through the Lens of Curvature of Loss Function Around Samples
Zero-Shot Reinforcement Learning via Function Encoders
Test-Time Model Adaptation with Only Forward Passes
Adaptive Online Experimental Design for Causal Discovery
Using AI Uncertainty Quantification to Improve Human Decision-Making
Multi-Agent Reinforcement Learning Meets Leaf Sequencing in Radiotherapy
Keypoint-based Progressive Chain-of-Thought Distillation for LLMs
Compositional Image Decomposition with Diffusion Models
IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency
Neurodegenerative Brain Network Classification via Adaptive Diffusion with Temporal Regularization
Inferring Change Points in High-Dimensional Linear Regression via Approximate Message Passing
MoMo: Momentum Models for Adaptive Learning Rates
Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning
Randomized Confidence Bounds for Stochastic Partial Monitoring
Listening to the noise: Blind Denoising with Gibbs Diffusion
PARDEN, Can You Repeat That? Defending against Jail-Breaks via Repetition
Subhomogeneous Deep Equilibrium Models
By Tying Embeddings You Are Assuming the Distributional Hypothesis
Building Socially-Equitable Public Models
Model-based Reinforcement Learning for Parameterized Action Spaces
Premise Order Matters in Reasoning with Large Language Models
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Defense against Backdoor Attack on Pre-trained Language Models via Head Pruning and Attention Normalization
Grokking Group Multiplication with Cosets
eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data
Language Models as Semantic Indexers
Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion
TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning
Do Large Language Models Generalize the Way People Expect? A Benchmark for Evaluation
Probabilistic Constrained Reinforcement Learning with Formal Interpretability
Thermometer: Towards Universal Calibration for Large Language Models
Position Paper: Categorical Deep Learning: An Algebraic Theory of Architectures
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models
CHEMREASONER: Heuristic Search over a Large Language Model’s Knowledge Space using Quantum-Chemical Feedback
Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation for Efficient Synthesis and Verification
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens
Smoothing Proximal Gradient Methods for Nonsmooth Sparsity Constrained Optimization: Optimality Conditions and Global Convergence
Efficient Reinforcement Learning from Partial Observability
SparseTSF: Modeling Long-term Time Series Forecasting with *1k* Parameters
E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation
PANDA: Expanded Width-Aware Message Passing Beyond Rewiring
FESSNC: Fast Exponentially Stable and Safe Neural Controller
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized BatchNorm
Object Scale Net: Representing Dynamic 3D Scenes in Billion Ways from Monocular Videos
A Unified Framework for Learning with Nonlinear Model Classes from Arbitrary Linear Samples
Planning, Fast and Slow: Online Reinforcement Learning with Action-Free Offline Data via Multiscale Planners
Provably Scalable Black-Box Variational Inference with Structured Variational Families
Demystifying Doubly Stochastic Gradient Descent
Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition
A Global Geometric Analysis of Maximal Coding Rate Reduction
Generalized Neural Collapse for a Large Number of Classes
Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching
Offline Multi-Objective Optimization
Online Resource Allocation with Non-Stationary Customers
GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting
Factorized Diffusion Models are Natural and Zero-shot Speech Synthesizers
Information Flow in Self-Supervised Learning
Semantic-Aware Distribution Matching for Semi-Supervised Learning
Matrix Information Theory for Self-Supervised Learning
NADOv2: Improved Training and Low-Rank Adaptation of Neurally-Decomposed Oracles for Controlling Language Models
Unveiling the Dynamics of Information Interplay in Supervised Learning
Provable Contrastive Continual Learning
Test-Time Regret Minimization in Meta Reinforcement Learning
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
An amortized approach to non-linear mixed-effects modeling based on neural posterior estimation
Learning Linear Block Error Correction Codes
Mol-AE: Auto-Encoder Based Molecular Representation Learning With 3D Cloze Test Objective
Federated Continual Learning via Prompt-based Dual Knowledge Transfer
Multi-Scale Protein Language Model for Unified Molecular Modeling
PinNet: Pinpoint Instructive Information for Retrieval Augmented Code-to-Text Generation
Learning Causal Dynamics Models in Object-Oriented Environments
FedLMT: Tackling System Heterogeneity of Federated Learning via Low-Rank Model Training with Theoretical Guarantees
How do Transformers Perform In-Context Autoregressive Learning?
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics
Enabling Uncertainty Estimation in Iterative Neural Networks
MD tree: a model-diagnostic tree grown on loss landscape
convSeq: Fast and Scalable Method for Detecting Patterns in Spike Data
Towards Scalable and Versatile Hyper-Representation Learning
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
Self-Consistency Training for Hamiltonian Prediction
Reinforcement Learning from Reachability Specifications: PAC Guarantees with Expected Conditional Distance
Sharp Rates in Dependent Learning Theory: Avoiding Sample Size Deflation for the Square Loss
Towards Causal Foundation Model: on Duality between Optimal Balancing and Attention
Latent Space Symmetry Discovery
Position Paper: Will we run out of data? Limits of LLM scaling based on human-generated data
Towards Realistic Model Selection for Semi-supervised Learning
Accelerating Convergence of Score-Based Diffusion Models, Provably
Conformal Prediction for Deep Classifier via Label Ranking
Causal Customer Churn Analysis with Low-rank Tensor Block Hazard Model
Pose and Interaction Aware Human Object Interaction Image Generation
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
PairNet: Training with Observed Pairs to Estimate Individual Treatment Effect
Efficient World Models with Time-Aware and Context-Augmented Tokenization
DynSyn: Dynamical Synergistic Representation for Efficient Learning and Control in Overactuated Embodied Systems
Improve Multimodal Context Understanding via Multimodal Composition Learning
Disguised Copyright Infringement of Latent Diffusion Models
Vague Prototype-Oriented Diffusion Model for Multi-Class Anomaly Detection
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
Gradient-based Visual Explanation for CLIP
Interpretable Deep Clustering for Tabular Data
PPFLOW: Target-Aware Peptide Design with Torsional Flow Matching
Active Adaptive Experimental Design for Treatment Effect Estimation with Covariate Choice
Knowledge Storage and Extraction in Language Models
On the Calibration of Human Pose Estimation
Rationality Report Cards: Assessing the Economic Rationality of Large Language Models
State-Free Inference of State-Space Models: The *Transfer Function* Approach
A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks
Efficient Non-stationary Online Learning by Wavelets with Applications to Online Distribution Shift Adaptation
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
On the Nonlinearity of Layer Normalization
Interaction-based Retrieval-augmented Diffusion Models for Protein-specific 3D Molecule Generation
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Generating In-Distribution Proxy Graphs for Explainable Graph Neural Networks
Image Clustering with External Guidance
Improving Flow Field Prediction of Complex Geometries Using Simple Geometries: A Case Study with Tandem Airfoils
Calibration Bottleneck: Over-compressed Representations are Less Calibratable
$\texttt{MoE-RBench}$: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
On the Maximal Local Disparity of Fairness-Aware Classifiers
How Universal Polynomial Bases Enhance Spectral Graph Neural Networks: Heterophily, Over-smoothing, and Over-squashing
Superpoint Gaussian Splatting for Real-Time High-Fidelity Monocular Dynamic Scene Reconstruction
Is Kernel Prediction More Powerful than Gating in Convolutional Neural Networks?
Provably Better Explanations with Optimized Aggregation of Feature Attributions
LLark: A Multimodal Instruction-Following Language Model for Music
Imitation Learning from Purified Demonstrations
Positive concave deep equilibrium models
MEMORYLLM: Toward Self-Updating Large Language Models
Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels
Domain-wise Data Acquisition to Improve Performance under Distribution Shift
Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input
Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs
Learning to Intervene on Concept Bottlenecks
Feature Distribution on Graph Topology Mediates the Effect of Graph Convolution: Homophily Perspective
Tackling Complex Conditions in Unsupervised Combinatorial Optimization: Cardinality, Minimum, Covering, and More
OAK: Enriching Document Representations using Auxiliary Knowledge for Extreme Classification
A Tale of Tails: Model Collapse as a Change of Scaling Laws
Consistent Adversarially Robust Linear Classification: Non-Parametric Setting
Masked Face Recognition with Generative-to-Discriminative Representations
Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains
Counterfactual Metarules for Local and Global Recourse
Variational Linearized Laplace Approximation for Bayesian Deep Learning
Log Neural Controlled Differential Equations: The Lie Brackets Make A Difference
Copyright Traps for Large Language Models
Candidate Pseudolabel Learning: Enhancing Vision-Language Models by Prompt Tuning with Unlabeled Data
Regularized Q-learning through Robust Averaging
Are Large Language Models Bayesian? A Martingale Perspective on In-Context Learning
Data Engineering for Scaling Language Models to 128K Context
ICED: Zero-Shot Transfer in Reinforcement Learning via In-Context Environment Design
Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
AND: Audio Network Dissection for Interpreting Deep Acoustic Models
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation
Attention Meets Post-hoc Interpretability: A Mathematical Perspective
Neural Jump-Diffusion Temporal Point Processes
Jacobian Regularizer-based Neural Granger Causality
Run-Time Task Composition with Safety Semantics
Graph Neural Stochastic Diffusion for Estimating Uncertainty in Node Classification
Position Paper: Building Guardrails for Large Language Models
BiE: Bi-Exponent Block Floating-Point for Large Language Models Quantization
Can AI Assistants Know What They Don't Know?
MOKD: Cross-domain Few-shot Classification via Maximizing Optimized Kernel Dependence
HyperFields: Towards Zero-Shot Generation of NeRFs from Text
How Graph Neural Networks Learn: Lessons from Training Dynamics
Degeneration-free Policy Optimization: RL Fine-Tuning for Language Models without Degeneration
PARCv2: Physics-aware Recurrent Convolutional Neural Networks for Spatiotemporal Dynamics Modeling
Training Nonlinear Transformers for Efficient In-Context Learning: A Theoretical Learning and Generalization Analysis
What Improves the Generalization of Graph Transformer? A Theoretical Dive into Self-attention and Positional Encoding
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning
FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models
SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation
Understanding Self-Attention through Prompt-Conditioned Markov Chains
Editing Partially Observable Networks via Graph Diffusion Models
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
Understanding Inter-Concept Relationships in Concept-Based Models
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
Hieros: Hierarchical Imagination on Structured State Space Sequence World Models
From Biased Selective Labels to Pseudo-Labels: An Expectation-Maximization Framework for Learning from Biased Decisions
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
AI Control: Improving Safety Despite Intentional Subversion
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Online Speculative Decoding
Learning High-Order Relationships of Brain Regions
Language Agents as Optimizable Graphs
Learning to Compile Programs to Neural Networks
Improving and Accelerating Retrieval-Augmented Generation with Superposition Prompting
On Mechanistic Knowledge Localization in Text-to-Image Generative Models
Learning to Route Among Specialized Experts for Zero-Shot Generalization
Secure and Fast Federated Few-Shot Learning
Neuroexplicit Diffusion Models for Inpainting of Optical Flow Fields
Two Heads are Actually Better than One: Towards Better Adversarial Robustness via Transduction and Rejection
Exploration and Anti-Exploration with Distributional Random Network Distillation
Graph Automorphism Group Equivariant Neural Networks
DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency
Adapt and Diffuse: Sample-adaptive Reconstruction via Latent Diffusion Models
Online Learning under Budget and ROI Constraints via Weak Adaptivity
Prior Mismatch and Adaptation in PnP-ADMM with a Nonconvex Convergence Analysis
LaMAGIC: Language-Model-based Topology Generation for Analog Integrated Circuits
Agent Instructs Large Language Models to be General Zero-Shot Reasoners
Layerwise Proximal Replay: A Proximal Point Method for Online Continual Learning
Online Learning with Bounded Recall
Nearest Neighbour Score Estimators for Diffusion Generative Models
Pruned Pivot: Correlation Clustering Algorithm for Dynamic, Parallel, and Local Computation Models
Generative Marginalization Models
SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code
QORA: Zero-Shot Transfer via Interpretable Object-Relational Model Learning
MMT-Bench: A Multimodal MultiTask Benchmark for Comprehensive Evaluation of Large Vision-Language Models
DiffFPR: Diffusion Prior for Oversampled Fourier Phase Retrieval
Private and Federated Stochastic Convex Optimization: Efficient Strategies for Centralized Systems
MH-pFLID: Model Heterogeneous personalized Federated Learning via Injection and Distillation for Medical Data Analysis
Nash Incentive-compatible Online Mechanism Learning via Weakly Differentially Private Online Learning
Wukong: Towards a Scaling Law for Large-Scale Recommendation
Sample Complexity Bounds for Estimating Probability Divergences under Invariances
Large Language Models are Geographically Biased
Differentially Private Post-Processing for Fair Regression
Multi-Region Markovian Gaussian Process: An Efficient Method to Discover Directional Communications Across Multiple Brain Regions
A Differentiable Partially Observable Generalized Linear Model with Forward-Backward Message Passing
A Contextual Combinatorial Bandits Approach to Negotiation
LASER: Linear Compression in Wireless Distributed Optimization
Performative Prediction with Bandit Feedback: Learning through Reparameterization
A Dual-module Framework for Counterfactual Estimation over Time
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text
Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning
CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection
Diagnosing Correlated Instability Directions in the Reinforcement Learning Manifold
Exploring the Benefit of Activation Sparsity in Pre-training
Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments
FedMBridge: Bridgeable Multimodal Federated Learning
3D Geometric Shape Assembly via Efficient Point Cloud Matching
Don't be so Negative! Score-based Generative Modeling with Oracle-assisted Guidance
Complexity Matters: Feature Learning in the Presence of Spurious Correlations
When Do Skills Help Reinforcement Learning? A Theoretical Analysis of Temporal Abstractions
Sparse is Enough in Fine-tuning Pre-trained Large Language Models
An Infinite-Width Analysis on the Jacobian-Regularised Training of a Neural Network
From Coarse to Fine: Enable Comprehensive Graph Self-supervised Learning with Multi-granular Semantic Ensemble
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms
Geometric Active Exploration in Markov Decision Processes: the Benefit of Abstraction
Don't trust your eyes: on the (un)reliability of feature visualizations
Winner-takes-all learners are geometry-aware conditional density estimators
Rethinking Optimization and Architecture for Tiny Language Models
Sparse and Structured Hopfield Networks
Learning Cognitive Maps from Transformers Representations for Efficient Planning in Partially Observed Environments
How Smooth Is Attention?
Principled Preferential Bayesian Optimization
Sampling in Unit Time with Kernel Fisher-Rao Flow
Designing Decision Support Systems using Counterfactual Prediction Sets
Transferable Facial Privacy Protection against Blind Face Restoration via Domain-Consistent Adversarial Obfuscation
Is Epistemic Uncertainty Faithfully Represented by Evidential Deep Learning Methods?
Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming
Easing Concept Bleeding in Diffusion via Entity Localization and Anchoring
Causal Discovery using Bayesian Model Selection
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer
Interpreting and Improving Large Language Models in Arithmetic Calculation
Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
Data-efficient Large Vision Models through Sequential Autoregression
Optimal Transport for Structure Learning Under Missing Data
Parameter Estimation in DAGs from Incomplete Data via Optimal Transport
Magicoder: Empowering Code Generation with OSS-Instruct
Learning Decision Trees and Forests with Algorithmic Recourse
AttNS: Attention-Inspired Numerical Solving For Limited Data Scenarios
Fundamental Benefit of Alternating Updates in Minimax Optimization
SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP
A New Theoretical Perspective on Data Heterogeneity in Federated Averaging
Arrows of Time for Large Language Models
Position Paper: Is machine learning good or bad for the natural sciences?
Effective Federated Graph Matching
Learning Causal Relations from Subsampled Time Series with Two Time-Slices
Two-Stage Shadow Inclusion Estimation: An IV Approach for Causal Inference under Latent Confounding and Collider Bias
Symmetry Leads to Structure and Constraint of Learning
Learning Shadow Variable Representation for Treatment Effect Estimation under Collider Bias
SelMatch: Effectively Scaling Up Dataset Distillation via Selection-Based Initialization and Partial Updates by Trajectory Matching
Self-attention Networks Localize When QK-eigenspectrum Concentrates
Policy Learning for Balancing Short-Term and Long-Term Rewards
Floating Anchor Diffusion Model for Multi-motif Scaffolding
Dynamic Byzantine-Robust Learning: Adapting to Switching Byzantine Workers
KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning
One for All: A Universal Generator for Concept Unlearnability via Multi-Modal Alignment
Knowledge-aware Reinforced Language Models for Protein Directed Evolution
Exploration by Optimization with Hybrid Regularizers: Logarithmic Regret with Adversarial Robustness in Partial Monitoring
WARM: On the Benefits of Weight Averaged Reward Models
AttnLRP: Attention-Aware Layer-wise Relevance Propagation for Transformers
Second-Order Uncertainty Quantification: A Distance-Based Approach
Pursuing Overall Welfare in Federated Learning through Sequential Decision Making
What is Dataset Distillation Learning?
Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
Extreme Compression of Large Language Models via Additive Quantization
Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning
Hybrid$^2$ Neural ODE Causal Modeling
Non-stationary Online Convex Optimization with Arbitrary Delays
Robust Inverse Constrained Reinforcement Learning under Model Misspecification
Learning Pseudo-Contractive Denoisers for Inverse Problems
HyperAgent: A Simple, Scalable, Efficient and Provable Reinforcement Learning Framework for Complex Environments
InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models
Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Graph Mixup on Approximate Gromov–Wasserstein Geodesics
Online Isolation Forest
Reweighted Solutions for Weighted Low Rank Approximation
Doubly Robust Causal Effect Estimation under Networked Interference via Targeted Learning
Reducing Balancing Error for Causal Inference via Optimal Transport
Generalization Bounds for Causal Regression: Insights, Guarantees and Sensitivity Analysis
InstructSpeech: Following Speech Editing Instructions via Large Language Models
Delving into Differentially Private Transformer
Federated Neuro-Symbolic Learning
Stay on Topic with Classifier-Free Guidance
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution
Better & Faster Large Language Models via Multi-token Prediction
Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization
Modelling Microbial Communities with Graph Neural Networks
Impact of Decentralized Learning on Agent Utilities in Stackelberg Games
On the Embedding Collapse when Scaling up Recommendation Models
CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding
Re-Dock: Towards Flexible and Realistic Molecular Docking with Diffusion Bridge
To the Max: Reinventing Reward in Reinforcement Learning
Langevin Policy for Safe Reinforcement Learning
Quantum Algorithms and Lower Bounds for Finite-Sum Optimization
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
Simplicity Bias via Global Convergence of Sharpness Minimization
Implicit Regularization in Feedback Alignment Learning Mechanisms for Neural Networks
Rethinking DP-SGD in Discrete Domain: Exploring Logistic Distribution in the Realm of signSGD
Stochastic positional embeddings improve masked image modeling
Memory Consolidation Enables Long-Context Video Understanding
Triadic-OCD: Asynchronous Online Change Detection with Provable Robustness, Optimality, and Convergence
Bridging the gap between mini-batch and asymptotic analysis in contrastive learning: From InfoNCE to Kernel-based losses
PriorBoost: An Adaptive Algorithm for Learning from Aggregate Responses
EvTexture: Event-driven Texture Enhancement for Video Super-Resolution
Online Variational Sequential Monte Carlo
ALERT-Transformer: Bridging Asynchronous and Synchronous Machine Learning for Real-Time Event-based Spatio-Temporal Data
DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation
Towards Compositionality in Concept Learning
Self-Composing Policies for Scalable Continual Reinforcement Learning
A2Q+: Improving Accumulator-Aware Weight Quantization
Reflected Flow Matching
Class-Imbalanced Graph Learning without Class Rebalancing
Extracting Training Data From Document-Based VQA Models
Modeling Caption Diversity in Contrastive Visual Language Pretraining
Equivariant Graph Neural Operator for Modeling 3D Dynamics
SiBBlInGS: Similarity-driven Building-Block Inference using Graphs across States
Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem
Learning from Streaming Data when Users Choose
Unsupervised Domain Adaptation for Anatomical Structure Detection in Ultrasound Images
A Multimodal Automated Interpretability Agent
Learning Constraints from Offline Demonstrations via Superior Distribution Correction Estimation
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model
Neuro-Visualizer: A Novel Auto-Encoder-Based Loss Landscape Visualization Method With an Application in Knowledge-Guided Machine Learning
Amortizing Pragmatic Program Synthesis with Rankings
DITTO: Diffusion Inference-Time T-Optimization for Music Generation
Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift
Stochastic Weakly Convex Optimization beyond Lipschitz Continuity
OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls
Optimal Eye Surgeon: Finding image priors through sparse generators at initialization
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?
Operator SVD with Neural Networks via Nested Low-Rank Approximation
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization
Quantum Theory and Application of Contextual Optimal Transport
KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation
A Circuit Domain Generalization Framework for Efficient Logic Synthesis in Chip Design
APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
A Hierarchical Adaptive Multi-Task Reinforcement Learning Framework for Multiplier Circuit Design
Latent Noise Segmentation: How Neural Noise Leads to the Emergence of Segmentation and Grouping
High-Probability Bound for Non-Smooth Non-Convex Stochastic Optimization with Heavy Tails
Testing the Feasibility of Linear Programs with Bandit Feedback
Potential Based Diffusion Motion Planning
Position Paper: On the Standardization of Behavioral Use Clauses and Their Adoption for Responsible Licensing of AI
Adaptive Text Watermark for Large Language Models
Gaussian Processes on Cellular Complexes
AdsorbDiff: Adsorbate Placement via Conditional Denoising Diffusion
Robustness of Nonlinear Representation Learning
Neural Collapse in Multi-label Learning with Pick-all-label Loss
Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences
LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views
Peeking with PEAK: Sequential, Nonparametric Composite Hypothesis Tests for Means of Multiple Data Streams
Switching the Loss Reduces the Cost in Batch Reinforcement Learning
Latent Logic Tree Extraction for Event Sequence Explanation from LLMs
Connecting the Dots: Is Mode-Connectedness the Key to Feasible Sample-Based Inference in Bayesian Neural Networks?
Position Paper: Near to Mid-term Risks and Opportunities of Open Source Generative AI
Generalization Analysis for Multi-Label Learning
Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits
Position Paper: The Amazing Things That Come From Having Many Good Models
Private Truly-Everlasting Robust-Prediction
Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
Cross-Task Linearity Emerges in the Pretraining-Finetuning Paradigm
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents
Translation-Equivariant Transformer Neural Processes
Variational Conceptual Explainers: Towards Trustworthy Conceptual Explanations for Vision Transformers
MagicPose: Realistic Human Pose and Facial Expression Retargeting with Identity-aware Diffusion
The Illusion of State in State-Space Models
Fair Off-Policy Learning from Observational Data
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
MaSS: Multi-attribute Selective Suppression for Utility-preserving Data Transformation from an Information-theoretic Perspective
Agent-Specific Effects: A Causal Effect Propagation Analysis in Multi-Agent MDPs
Partial Multi-View Multi-Label Classification via Semantic Invariance Learning and Prototype Modeling
NExT: Teaching Large Language Models to Reason about Code Execution
StackSight: Unveiling WebAssembly through Large Language Models and Neurosymbolic Chain-of-Thought Decompilation
Reinformer: Max-Return Sequence Modeling for offline RL
Subskill Predictive Control
Emergent Equivariance in Deep Ensembles
Quantum Positional Encodings for Graph Neural Networks
MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data
Deep Functional Factor Models: Forecasting High-Dimensional Functional Time Series via Bayesian Nonparametric Factorization
ED-Copilot: Reduce Emergency Department Wait Time with Language Model Diagnostic Assistance
StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
Coarse-To-Fine Tensor Trains for Compact and Robust Visual Representations
A General Online Algorithm for Optimizing Complex Performance Metrics
Autaptic Synaptic Circuit Enhances Spatio-temporal Predictive Learning of Spiking Neural Networks
Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation
Mean Field Langevin Actor-Critic: Faster Convergence and Global Optimality beyond Lazy Learning
$S^2$IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting
Subgoal-based Demonstration Learning for Formal Theorem Proving
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Two-sided Competing Matching Recommendation Markets With Quota and Complementary Preferences Constraints
How Language Model Hallucinations Can Snowball
SCoRe: Submodular Combinatorial Representation Learning
Inferring the Long-Term Causal Effects of Long-Term Treatments from Short-Term Experiments
Breaking the Barrier: Enhanced Utility and Robustness in Smoothed DRL Agents
VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model
Understanding Preference Fine-Tuning for Large Language Models
Stop Regressing: The Unreasonable Effectiveness of Classification in Deep Reinforcement Learning
Scaling Laws for the Value of Individual Data Points in Machine Learning
Learning with 3D rotations, a hitchhiker's guide to SO(3)
Conditionally-Conjugate Gaussian Process Factor Analysis for Spike Count Data via Data Augmentation
🤳SelfIE: Self-Interpretation of Large Language Model Embeddings
Deep Neural Room Acoustics Primitive
Do Topological Characteristics Help in Knowledge Distillation?
SILVER: Single-loop variance reduction and application to federated learning
Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies
A Unified Linear Programming Framework for Reward Learning with Offline Human Behavior and Feedback Data
Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
Refining Minimax Regret for Unsupervised Environment Design
Probabilistic Routing for Graph-Based Approximate Nearest Neighbor Search
Network Tight Community Detection
On the sample complexity of conditional independence testing with Von Mises estimator with application to causal discovery
On the Consistency of Kernel Methods with Dependent Observations
Guidance with Spherical Gaussian Constraint for Conditional Diffusion
On Statistical Learning Theory for Distributional Inputs
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT
BAGEL: Bootstrapping Agents by Guiding Exploration with Language
Conditional language learning with context
Efficient Online Set-valued Classification with Bandit Feedback
A Stealthy, Accessible, and Provably Resilient Watermark for Language Models
Defense against Model Extraction Attack by Bayesian Active Watermarking
Transport of Algebraic Structure to Latent Embeddings
Forget Sharpness: Perturbed Forgetting of Model Biases Within SAM Dynamics
Embarrassingly Parallel GFlowNets
GFlowNet Training by Policy Gradients
MGit: A Model Versioning and Management System
Prospective Side Information for Latent MDPs
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Towards General Neural Surrogate Solvers with Specialized Neural Accelerators
Multi-Factor Adaptive Vision Selection for Egocentric Video Question Answering
Tilt and Average: Geometric Adjustment of the Last Layer for Recalibration
Discovering Multiple Solutions in Offline Reinforcement Learning
An Online Optimization Perspective on First-Order and Zero-Order Decentralized Nonsmooth Nonconvex Stochastic Optimization
On the Origins of Linear Representations in Large Language Models
Position Paper: The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
A Simple and Universal Prompt-Tuning Framework for Spatio-Temporal Prediction
InterpreTabNet: Distilling Predictive Signals From Tabular Data
Joint Composite Latent Space Bayesian Optimization
Position Paper: Explain to Question not to Justify
Robust Learning-Augmented Dictionaries
A Theory of Fault-Tolerant Learning
HumanTOMATO: Text-aligned Whole-body Motion Generation
A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models
Contamination-Resilient Anomaly Detection via Adversarial Learning on Partially-Observed Normal and Anomalous Data
Bounded and Uniform Energy-based Out-of-distribution Detection for Graphs
Cluster-Aware Similarity Diffusion for Instance Retrieval
Scaling Beyond the GPU Memory Limit for Large Mixture-of-Experts Model Training
Sampling is as easy as keeping the consistency: convergence guarantee for Consistency Models
Position Paper: A Roadmap to Pluralistic Alignment
Generative Active Learning for Long-tailed Instance Segmentation
Uniformly Stable Algorithms for Adversarial Training and Beyond
R2E: Turning any Github Repository into Programming Agent Test Environment
The Relative Value of Prediction in Algorithmic Decision Making
Ambiguity-Aware Abductive Learning
Sparse Dimensionality Reduction Revisited
Understanding the Impact of Introducing Constraints at Inference Time on Generalization Error
Diffusion Models Demand Contrastive Guidance for Adversarial Purification to Advance
Closing the Gap: Achieving Global Convergence (Last Iterate) of Actor-Critic under Markovian Sampling with Neural Network Parametrization
An Intrinsic Vector Heat Network
Foundations of Testing for Finite-Sample Causal Discovery
Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models
Towards a Self-contained Data-driven Global Weather Forecasting System
Efficient Stochastic Approximation of Minimax Excess Risk Optimization
CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling
The Linear Representation Hypothesis and the Geometry of Large Language Models
Policy Representation Can be Utilized for More Generalizable Offline Dynamics Model Learning
Exploring the LLM Journey from Cognition to Expression with Linear Representations
Learning to Explore in POMDPs with Informational Rewards
High-Order Contrastive Learning with Fine-grained Comparative Levels for Sparse Ordinal Tensor Completion
The Max-Min Formulation of Multi-Objective Reinforcement Learning: From Theory to a Model-Free Algorithm
Analyzing $D^\alpha$ seeding for $k$-means
Token-level Direct Preference Optimization
Empowering Graph Invariance Learning with Deep Spurious Infomax
LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging
INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition
Controllable Molecule Synthesis with Residual Energy-based Model
High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise
A New Robust Partial p-Wasserstein-Based Metric for Comparing Distributions
GPT-4V(ision) is a Generalist Web Agent, if Grounded
Improving Neural Logic Machines via Failure Reflection
ByMI: Byzantine Machine Identification with False Discovery Rate Control
Distributional Bellman Operators over Mean Embeddings
Why Larger Language Models Do In-context Learning Differently?
Path-Guided Particle-based Sampling
Self-Supervised Coarsening of Unstructured Grid with Automatic Differentiation
Ensemble Pruning under Distribution Shifts
Learning to Model the World With Language
Quantum Implicit Neural Representations
SPADE: Sparsity-Guided Debugging for Deep Neural Networks
Discovering Mixtures of Structural Causal Models from Time Series Data
A Geometric Decomposition of Games: Convergence vs. Recurrence under No-Regret Learning
The Computational Complexity of Finding Second-Order Stationary Points
Parameter-Dependent Competitive Analysis for Online Capacitated Coverage Maximization through Boostings and Attenuations
Promoting External and Internal Equities Under Ex-Ante/Ex-Post Metrics in Online Resource Allocation
What is the Long-Run Distribution of SGD? A Large Deviations Analysis
Autoformalizing Euclidean Geometry
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Robust Inverse Graphics via Probabilistic Inference
Risk-sensitive Policy Optimization via Predictive CVaR Policy Gradient
Multiply-Robust Causal Change Attribution
Predictive Linear Online Tracking for Unknown Targets
Generalization Analysis of Deep Non-linear Matrix Completion
SiT: Symmetry-invariant Transformers for Generalisation in Reinforcement Learning
A Universal Transfer Theorem for Convex Optimization Algorithms Using Inexact First-order Oracles
Physics and Lie symmetry informed Gaussian processes
Combinatorial Approximations for Cluster Deletion: Simpler, Faster, and Better
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
Detecting Influence Structures in Multi-Agent Reinforcement Learning
How Spurious Features are Memorized: Precise Analysis for Random and NTK Features
Collaborative Learning with Different Labeling Functions
No Free Prune: Information-Theoretic Barriers to Pruning at Initialization
Multiply Robust Estimation for Local Distribution Shifts with Multiple Domains
Pausing Model Learning in Real-world Reinforcement Learning
Extending Test-Time Augmentation with Metamorphic Relations for Combinatorial Problems
A Sparsity Principle for Partially Observable Causal Representation Learning
Privacy-Preserving Embedding via Look-up Table Evaluation with Fully Homomorphic Encryption
PointMC: Multi-instance Point Cloud Registration based on Maximal Cliques
LLM Maybe LongLM: SelfExtend LLM Context Window Without Tuning
Attribute Based Interpretable Evaluation Metrics for Generative Models
Position Paper: Data-driven Discovery with Large Generative Models
Bridging Data Gaps in Diffusion Models with Adversarial Noise-Based Transfer Learning
Multi-Patch Prediction: Adapting LLMs for Time Series Representation Learning
Detecting Any instruction-to-answer interaction relationship: Universal Instruction-to-Answer Navigator for Med-VQA
A Neural-Preconditioned Poisson Solver for Mixed Dirichlet and Neumann Boundary Conditions
Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Deep Brain Stimulation
$H$-Consistency Guarantees for Regression
LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models
Adaptive Group Personalization for Federated Mutual Transfer Learning
Evaluation of Trajectory Distribution Predictions with Energy Score
Improving Neural Additive Models with Bayesian Principles
Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI
Differentially Private Representation Learning via Image Captioning
UGrid: An Efficient-And-Rigorous Neural Multigrid Solver for Linear PDEs
Shared Attractor Dynamics in Spatial Navigation and Language Parsing
Temporal Distances in Stochastic Settings: Theoretical Properties and Application to Reinforcement Learning
Fast Algorithms for Hypergraph PageRank with Applications to Semi-Supervised Learning
Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models
GaussianPro: 3D Gaussian Splatting with Progressive Propagation
Nesting Particle Filters for Experimental Design in Dynamical Systems
When Representations Align: Universality in Representation Learning Dynamics
CLIF: Complementary Leaky Integrate-and-Fire Neuron for Spiking Neural Networks
Decoding-time Realignment of Language Models
Boosting Offline Optimizers with Surrogate Sensitivity
Provably Efficient Partially Observable Risk-sensitive Reinforcement Learning with Hindsight Observation
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
Improving Gradient-guided Nested Sampling for Posterior Inference
Iterative Search Attribution for Deep Neural Networks
Position Paper: Social Environment Design
3D-VLA: A 3D Vision-Language-Action Generative World Model
Ditto: Quantization-aware Secure Inference of Transformers upon MPC
ContPhy: Continuum Physical Concept Learning and Reasoning from Videos
Sobolev Space Regularised Pre Density Models
Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance
A Persuasive Approach to Combating Misinformation
Equilibrium of Data Markets with Externality
Multi-Sender Persuasion: A Computational Perspective
Bootstrap AutoEncoders With Contrastive Paradigm for Self-supervised Gaze Estimation
Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting
Incorporating Information into Shapley Values: Reweighting via a Maximum Entropy Approach
Position Paper: Considerations for Differentially Private Learning with Large-Scale Public Pretraining
Robust Data-driven Prescriptiveness Optimization
Conformalized Adaptive Forecasting of Heterogeneous Trajectories
Towards Theoretical Understanding of Learning Large-scale Dependent Data via Random Features
MMPareto: Innocent Uni-modal Assistance for Enhanced Multi-modal Learning
A Dynamical Model of Neural Scaling Laws
Unsupervised Episode Generation for Graph Meta-learning
Leveraging (Biased) Information: Multi-armed Bandits with Offline Data
Optimal Kernel Quantile Learning with Random Features
STELLA: Continual Audio-Video Pre-training with SpatioTemporal Localized Alignment
Improving Open-Ended Text Generation via Adaptive Decoding
Irregular Multivariate Time Series Forecasting: A Transformable Patching Graph Neural Networks Approach
Learning Mixtures of Gaussian Processes through Random Projection
Faster Maximum Inner Product Search in High Dimensions
Fool Your (Vision and) Language Model with Embarrassingly Simple Permutations
Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot
Multigroup Robustness
StrWAEs to Invariant Representations
MF-CLR: Multi-Frequency Contrastive Learning Representation for Time Series
CF-OPT: Counterfactual Explanations for Structured Prediction
Remembering to Be Fair: Non-Markovian Fairness in Sequential Decision Making
FiT: Flexible Vision Transformer for Diffusion Model
Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning
Combining Experimental and Historical Data for Policy Evaluation
Logistic Variational Bayes Revisited
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Conformal Prediction Sets Improve Human Decision Making
Unsupervised Representation Learning of Brain Activity via Bridging Voxel Activity and Functional Connectivity
Graph Structure Extrapolation for Out-of-Distribution Generalization
Proteus: Pioneering Protein Structure Generation for Enhanced Designability and Efficiency
EvoluNet: Advancing Dynamic Non-IID Transfer Learning on Graphs
Viewing Transformers Through the Lens of Long Convolutions Layers
Universal Gradient Methods for Stochastic Convex Optimization
Online Algorithms with Uncertainty-Quantified Predictions
Chasing Convex Functions with Long-term Constraints
Generalizing Knowledge Graph Embedding with Universal Orthogonal Parameterization
Minimizing $f$-Divergences by Interpolating Velocity Fields
Reference Neural Operators: Learning the Smooth Dependence of Solutions of PDEs on Geometric Deformations
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Deciphering RNA Secondary Structure Prediction: A Probabilistic K-Rook Matching Perspective
SHINE: Shielding Backdoors in Deep Reinforcement Learning
Learning Low-dimensional Latent Dynamics from High-dimensional Observations: Non-asymptotics and Lower Bounds
Adaptive Accompaniment with ReaLchords
How Flawed is ECE? An Analysis via Logit Smoothing
ESNet: Evolution and Succession Network for High-Resolution Salient Object Detection
Efficient Contrastive Learning for Fast and Accurate Inference on Graphs
Position Paper: Social Choice for AI Ethics and Safety
Universal Consistency of Wide and Deep ReLU Neural Networks and Minimax Optimal Convergence Rates for Kolmogorov-Donoho Optimal Function Classes
Tuning-Free Stochastic Optimization
Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection
Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation
Non-parametric Online Change Point Detection on Riemannian Manifolds
FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning
Improving Prototypical Visual Explanations With Reward Reweighing, Reselection, and Retraining
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
In deep reinforcement learning, a pruned network is a good network
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Local Causal Structure Learning in the Presence of Latent Variables
Mixtures of Experts Unlock Parameter Scaling for Deep RL
Studying K-FAC Heuristics by Viewing Adam through a Second-Order Lens
Automating the Selection of Proxy Variables of Unmeasured Confounders
How to Make the Gradients Small Privately: Improved Rates for Differentially Private Non-Convex Optimization
Community-Invariant Graph Contrastive Learning
Model Assessment and Selection under Temporal Distribution Shift
Optimal Differentially Private Model Training with Public Data
Private Heterogeneous Federated Learning Without a Trusted Server Revisited: Error-Optimal and Communication-Efficient Algorithms for Convex Losses
Equivariant Deep Weight Space Alignment
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
Fundamental Limitations of Alignment in Large Language Models
Efficient Precision and Recall Metrics for Assessing Generative Models using Hubness-aware Sampling
Challenges in Training PINNs: A Loss Landscape Perspective
Debating with More Persuasive LLMs Leads to More Truthful Answers
Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution
Diffusive Gibbs Sampling
Momentum Particle Maximum Likelihood
A fast algorithm to simulate nonlinear resistive networks
Faithfulness Measurable Masked Language Models
UPOCR: Towards Unified Pixel-Level OCR Interface
Interpreting and Improving Diffusion Models from an Optimization Perspective
Autoencoding Conditional Neural Processes for Representation Learning
Beyond Regular Grids: Fourier-Based Neural Operators on Arbitrary Domains
Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning
ODIN: Disentangled Reward Mitigates Hacking in RLHF
Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model
Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD
Smooth Tchebycheff Scalarization for Multi-Objective Optimization
Federated Learning: Lessons from Generalization Error Analysis
Parameterized Physics-informed Neural Networks for Parameterized PDEs
DeepPolar: Inventing Nonlinear Large-Kernel Polar Codes via Deep Learning
Theoretical Analysis of Learned Database Operations under Distribution Shift through Distribution Learnability
Constrained Ensemble Exploration for Unsupervised Skill Discovery
Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions
Classification Under Strategic Self-Selection
Contrasting Multiple Representations with the Multi-Marginal Matching Gap
Non-convex Stochastic Composite Optimization with Polyak Momentum
Completing Visual Objects via Bridging Generation and Segmentation
Enhancing Value Function Estimation through First-Order State-Action Dynamics in Offline Reinforcement Learning
An Effective Dynamic Gradient Calibration Method for Continual Learning
Structured Inverse-Free Natural Gradient Descent: Memory-Efficient & Numerically-Stable KFAC
Bayesian Regret Minimization in Offline Bandits
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Successor Features for Efficient Multi-Subject Controlled Text Generation
Differentially Private Worst-group Risk Minimization
Riemannian coordinate descent algorithms on matrix manifolds
BeigeMaps: Behavioral Eigenmaps for Reinforcement Learning from Images
Cross-Domain Policy Adaptation by Capturing Representation Mismatch
Deep Stochastic Mechanics
Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints
Predictive Coding beyond Correlations
Copula-Nested Spectral Kernel Network
Dealing with unbounded gradients in stochastic saddle-point optimization
Active Label Correction for Semantic Segmentation with Foundation Models
Improving Robustness to Multiple Spurious Correlations by Multi-Objective Optimization
Trust Regions for Explanations via Black-Box Probabilistic Certification
A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
A Large Touch, Vision, and Language Dataset for Multimodal Perception
Balanced Data, Imbalanced Spectra: Unveiling Class Disparities with Spectral Imbalance
MLAgentBench: Evaluating Language Models for ML Experimentation
Isometric Representation Learning for Disentangled Latent Space of Diffusion Models
Bayesian Optimization of Function Networks with Partial Evaluations
A Geometric Explanation of the Likelihood OOD Detection Paradox
Off-policy Evaluation Beyond Overlap: Sharp Partial Identification Under Smoothness
Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks
MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization
Beyond the Norms: Detecting Prediction Errors in Regression Models
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Don’t Label Twice: Quantity Beats Quality for Comparing Binary Classifiers on a Budget
Contextualized Policy Recovery: Modeling and Interpreting Medical Decisions with Adaptive Imitation Learning
Beyond the ROC Curve: Classification Trees Using Cost-Optimal Curves, with Application to Imbalanced Datasets
Contrastive Predict-and-Search for Mixed Integer Linear Programs
PGODE: Towards High-quality System Dynamics Modeling
Hypergraph-enhanced Dual Semi-supervised Graph Classification
Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples
In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
Improving Factuality and Reasoning in Language Models through Multiagent Debate
BAT: Learning to Reason about Spatial Sounds with Large Language Models
Mitigating Label Noise on Graphs via Topological Sample Selection
Assessing the Impact of ChatGPT in AI Conference Peer Reviews
Unraveling the Impact of Heterophilic Structures on Graph Positive-Unlabeled Learning
Using Left and Right Brains Together: Towards Vision and Language Planning
Individualized Privacy Accounting via Subsampling with Applications in Combinatorial Optimization
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Enhancing Trajectory Prediction through Self-Supervised Waypoint Distortion Prediction
Sign Rank Limitations for Attention-Based Graph Decoders
In-Context Reinforcement Learning with Hierarchical Chain of Experience
ILILT: Implicit Learning of Inverse Lithography Technologies
AutoOS: Make Your OS More Powerful by Exploiting Large Language Models
Prompt-based Visual Alignment for Zero-shot Policy Transfer
Straight-Through meets Sparse Recovery: the Support Exploration Algorithm
RAUCA: A Novel Physical Adversarial Attack on Vehicle Detectors via Robust and Accurate Camouflage Generation
Realistic Evaluation of Test Time Adaptation
Synergistic Integration of Coordinate Network and Tensorial Feature for Improving Neural Radiance Fields from Sparse Inputs
A Synchronized Layer-by-layer Growing Approach for Plausible Neuronal Morphology Generation
Evolving Subnetwork Training for Large Language Models
Reducing Item Discrepancy via Differentially Private Robust Embedding Alignment for Privacy-Preserving Cross Domain Recommendation
Gambling-Based Confidence Sequences for Bounded Random Vectors
Low-Cost High-Power Membership Inference Attacks by Boosting Relativity
GALS: Generalizable Alternating Least Squares for Recommender System
Noise-Aware Algorithm for Heterogeneous Differentially Private Federated Learning
Grokking Happens All the Time and Here is Why
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations
Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset
Mind the Boundary: Coreset Selection via Reconstructing the Decision Boundary
Early Time Classification with Accumulated Accuracy Gap Control
Revisiting Context Aggregation for Image Matting
MathScale: Scaling Instruction Tuning for Mathematical Reasoning
On Universally Optimal Algorithms for A/B Testing
Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption
Adaptively Perturbed Mirror Descent for Learning in Games
Matroid Semi-Bandits in Sublinear Time
Model-Based Minimum Bayes Risk Decoding for Text Generation
Learning in Feature Spaces via Coupled Covariances: Asymmetric Kernel SVD and Nyström method
REST: Efficient and Accelerated EEG Seizure Analysis through Residual State Updates
Learning to Remove Cuts in Integer Linear Programming
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
Do Efficient Transformers Really Save Computation?
Graph Neural PDE Solvers with Conservation and Similarity-Equivariance
Watermark Stealing in Large Language Models
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
On a Neural Implementation of Brenier's Polar Factorization
Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks
Not all distributional shifts are equal: Fine-grained robust conformal inference
Predictive Dynamic Fusion
Overcoming the Optimizer's Curse: Obtaining Realistic Prescriptions from ReLU Neural Networks
Exponential Spectral Pursuit: An Effective Initialization Method for Sparse Phase Retrieval
Deep Regression Representation Learning with Topology
Node Out-of-Distribution Detection Goes Neighborhood Shaping
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models
Taylor Videos for Action Recognition
Effects of Exponential Gaussian Distribution on (Double Sampling) Randomized Smoothing
An Embodied Generalist Agent in 3D World
Revealing Vision-Language Integration in the Brain with Multimodal Networks
RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework
Dynamic Metric Embedding into lp Space
Variational Learning is Effective for Large Deep Networks
EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
Learning Adaptive and View-Invariant Vision Transformer for Real-Time UAV Tracking
Quality-Diversity Actor-Critic: Learning High-Performing and Diverse Behaviors via Value and Successor Features Critics
Non-linear Triple Changes Estimator for Targeted Policies
Positional Knowledge is All You Need: Position-induced Transformer (PiT) for Operator Learning
How Far Can Fairness Constraints Help Recover From Biased Data?
Cross-view Masked Diffusion Transformers for Person Image Synthesis
Stabilizing Policy Gradients for Stochastic Differential Equations via Consistency with Perturbation Process
Generating, Reconstructing, and Representing Discrete and Continuous Data: Generalized Diffusion with Learnable Encoding-Decoding
Differentiable Distributionally Robust Optimization Layers
Generalization in Kernel Regression Under Realistic Assumptions
On the Hardness of Probabilistic Neurosymbolic Learning
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process
HelmFluid: Learning Helmholtz Dynamics for Interpretable Fluid Prediction
Scaling Laws for Fine-Grained Mixture of Experts
Balanced Pedestrian Attribute Recognition
Towards a Better Theoretical Understanding of Independent Subnetwork Training
Causal Representation Learning Made Identifiable by Grouping of Observational Variables
Open Ad Hoc Teamwork with Cooperative Game Theory
Position Paper: Understanding the Role of Social Media Influencers in AI Research Visibility
Learning Exceptional Subgroups by End-to-End Maximizing KL-Divergence
CLLMs: Consistency Large Language Models
Improved Operator Learning by Orthogonal Attention
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation
Bottleneck-Minimal Indexing for Generative Document Retrieval
Subgraphormer: Unifying Subgraph GNNs and Graph Transformers via Graph Products
Interpreting Equivariant Representations
Rethinking Guidance Information to Utilize Unlabeled Samples: A Label Encoding Perspective
Is DPO Superior to PPO? A Comprehensive Investigation.
Partially Stochastic Infinitely Deep Bayesian Neural Networks
Rényi Pufferfish Privacy: General Additive Noise Mechanisms and Privacy Amplification by Iteration via Shift Reduction Lemmas
Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game
Adaptive proximal gradient methods are universal without approximation
Confidence Aware Inverse Constrained Reinforcement Learning
Sample Average Approximation for Conditional Stochastic Optimization with Dependent Data
Knowledge Distillation with Auxiliary Variable
DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation
PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning
RoboGen: Automated Robotic Skill Learning at Scale via Generative Simulation
Deletion-Anticipative Data Acquisition
Use Your INSTINCT: INSTruction optimization usIng Neural bandits Coupled with Transformers
Accelerating Legacy Numerical Solvers by Non-intrusive Gradient-based Meta-solving
Understanding Heterophily for Graph Neural Networks
Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind
Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion
Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery
Stochastic Localization via Iterative Posterior Sampling
Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks
DataFreeShield: Defending Adversarial Attacks without Training Data
TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks
Scalable Safe Policy Improvement for Factored Multi-Agent MDPs
Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints
DFlow: A Generative Model Combining Denoising AutoEncoder and Normalizing Flow for High Fidelity Waveform Generation
Self-Driven Entropy Aggregation for Byzantine-Robust Heterogeneous Federated Learning
Finite Volume Features, Global Geometry Representations, and Residual Training for Deep Learning-based CFD Simulation
S3GCL: Spectral, Swift, Spatial Graph Contrastive Learning
Stability-Informed Initialization of Neural Ordinary Differential Equations
LLM-Empowered State Representation for Reinforcement Learning
A New Branch-and-Bound Pruning Framework for $\ell_0$-Regularized Problems
Getting the most out of your tokenizer for pre-training and domain adaptation
Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates
The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks
In-Context Reinforcement Learning for Variable Action Spaces
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
PAPM: A Physics-aware Proxy Model for Process Systems
Emergence of In-Context Reinforcement Learning from Noise Distillation
Multi-View Stochastic Block Models
Memoria: Resolving Fateful Forgetting Problem through Human-Inspired Memory Architecture
Socialized Learning: Making Each Other Better Through Multi-Agent Collaboration
Performance Bounds for Active Binary Testing with Information Maximization
Repeat After Me: Transformers are Better than State Space Models at Copying
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
D-Flow: Differentiating through Flows for Controlled Generation
${\rm E}(3)$-Equivariant Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning
A connection between Tempering and Entropic Mirror Descent
Parameter-Efficient Fine-Tuning with Discrete Fourier Transform
High-Dimensional Geometric Streaming for Nearly Low Rank Data
Rate-Optimal Policy Optimization for Linear Markov Decision Processes
Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-loop and Hessian-free Solution Strategy
Kernel-Based Evaluation of Conditional Biological Sequence Models
Projecting Molecules into Synthesizable Chemical Spaces
Balancing Feature Similarity and Label Variability for Optimal Size-Aware Subset Selection
Full-Atom Peptide Design based on Multi-modal Flow Matching
GeoAB: Towards Realistic Antibody Design and Reliable Affinity Maturation
Differentially private exact recovery for stochastic block models
RODEO: Robust Outlier Detection via Exposing Adaptive Outliers
Finite Smoothing Algorithm for High-Dimensional Support Vector Machines and Quantile Regression
Position Paper: The Science of Data Collection: Insights from Surveys can Improve Machine Learning Model
Two-timescale Derivative Free Optimization for Performative Prediction with Markovian Data
Provable Benefits of Local Steps in Heterogeneous Federated Learning for Neural Networks: A Feature Learning Perspective
On The Complexity of First-Order Methods in Stochastic Bilevel Optimization
USTAD: Unified Single-model Training Achieving Diverse Scores for Information Retrieval
Conformal Predictions under Markovian Data
Domain Generalisation via Imprecise Learning
On the Effectiveness of Supervision in Non-Contrastive Representation Learning
Contrastive Learning for Clinical Outcome Prediction with Partial Data Sources
A Single-Loop Robust Policy Gradient Method for Robust Markov Decision Processes
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
Decoupling Learning and Decision-Making: Breaking the $\mathcal{O}(\sqrt{T})$ Barrier in Online Resource Allocation with First-Order Methods
Language Models Represent Beliefs of Self and Others
Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution
Exploring the Enigma of Neural Dynamics Through A Scattering-Transform Mixer Landscape for Riemannian Manifold
Discrete Flow Models: A Discrete Generative Framework with Applications to Protein Structure Sequence Co-Generation
GATE: How to Keep Out Intrusive Neighbors
Evolution-Inspired Loss Functions for Protein Representation Learning
A Simple Convolution Injector for Vision Transformer Towards Effective Adaptation in Visuo-Motor Control
Investigating Pre-Training Objectives for Generalization in Visual Reinforcement Learning
Neural Diffusion Models
Bayesian Exploration Networks
Structure-based drug design by denoising voxel grids
Convergence Guarantees for the DeepWalk Embedding on Block Models
CoLoRA: Continuous low-rank adaptation for reduced implicit neural modeling of parameterized partial differential equations
Learning Graph Representation via Graph Entropy Maximization
Learning Surrogates for Offline Black-Box Optimization via Gradient Matching
Towards Neural Architecture Search through Hierarchical Generative Modeling
Bounding the Excess Risk for Linear Models Trained on Marginal-Preserving, Differentially-Private, Synthetic Data
Graph Neural Network Explanations are Fragile
Drug Discovery with Dynamic Goal-aware Fragments
CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes
Compositional Curvature Bounds for Deep Neural Networks
Memory Efficient Neural Processes via Constant Memory Attention Block
Guarantees for Nonlinear Representation Learning: Non-identical Covariates, Dependent Data, Fewer Samples
Random features models: a way to study the success of naive imputation
Graph2Tac: Online Representation Learning of Formal Math Concepts
On the Duality Between Sharpness-Aware Minimization and Adversarial Training
Tell, Don't Show: Language Guidance Eases Transfer Across Domains in Images and Videos
Membership Inference Attacks on Diffusion Models via Quantile Regression
Less is More: on the Over-Globalizing Problem in Graph Transformers
Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling
Quasi-Monte Carlo Random Features for Kernel Approximation
Discrete Latent Perspective Learning
Uncertainty for Active Learning on Graphs
Learning Modality Knowledge Alignment for Cross-Modality Transfer
Enhancing Cross-Modal Fine-Tuning with Gradually Intermediate Modality Generation
Going beyond compositional generalization, DDPM can produce zero-shot interpolation
An Empirical Examination of Balancing Strategy for Counterfactual Estimation on Time Series
Inferring dynamic networks from marginals with iterative proportional fitting
GNNs Also Deserve Editing, and They Need It More Than Once
Estimating Canopy Height at Scale
Dynamic Facility Location in High Dimensional Euclidean Spaces
Discovering More Effective Tensor Network Structure Search Algorithms via Large Language Models (LLMs)
HAMLET: Graph Transformer Neural Operator for Partial Differential Equations
Position Paper: Rethinking Empirical Research in Machine Learning: Addressing Epistemic and Methodological Challenges of Experimentation
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Prompt-Driven LLM Safeguarding via Directed Representation Optimization
Faster Streaming and Scalable Algorithms for Finding Directed Dense Subgraphs in Large Graphs
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning
Sample-Efficient Multiagent Reinforcement Learning with Reset Replay
Quantum Algorithm for Online Exp-concave Optimization
Convex Relaxations of ReLU Neural Networks Approximate Global Optima in Polynomial Time
Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?
Locally Differentially Private Decentralized Stochastic Bilevel Optimization with Guaranteed Convergence Accuracy
The Privacy Power of Correlated Noise in Decentralized Learning
Diversified Batch Selection for Training Acceleration
HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning
Causal-IQA: Towards the Generalization of Image Quality Assessment Based on Causal Inference
Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization
Q-learning Transformer for Offline Reinforcement Learning
Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters
One Size Fits All for Semantic Shifts: Adaptive Prompt Tuning for Continual Learning
MOMENT: A Family of Open Time-series Foundation Models
Out-of-Distribution Detection via Deep Multi-Comprehension Ensemble
HarmonyDream: Task Harmonization Inside World Models
Dr. Strategy: Model-Based Generalist Agents with Strategic Dreaming
Sparser, Better, Deeper, Stronger: Improving Sparse Training with Exact Orthogonal Initialization
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Interpretability Illusions in the Generalization of Simplified Models
An Analysis of Linear Time Series Forecasting Models
Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences
Distinguishing the Knowable from the Unknowable with Language Models
Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning
BLO-SAM: Bi-level Optimization Based Finetuning of the Segment Anything Model for Overfitting-Preventing Semantic Segmentation
Language Generation with Strictly Proper Scoring Rules
Improved Differentially Private and Lazy Online Convex Optimization
Recurrent Early Exits for Federated Learning with Heterogeneous Clients
Estimating Distributional Treatment Effects in Randomized Experiments: Machine Learning for Variance Reduction
Integrated Hardware Architecture and Device Placement Search
Sequential Asynchronous Action Coordination in Multi-Agent Systems: A Stackelberg Decision Transformer Approach
Perfect Alignment May be Poisonous to Graph Contrastive Learning
Keep the Momentum: Conservation Laws beyond Euclidean Gradient Flows
AegisFL: Efficient and Flexible Privacy-Preserving Byzantine-Robust Cross-silo Federated Learning
A Neural-Guided Dynamic Symbolic Network for Exploring Mathematical Expressions from Data
Distribution Alignment Optimization through Neural Collapse for Long-tailed Classification
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Accelerated Algorithms for Constrained Nonconvex-Nonconcave Min-Max Optimization and Comonotone Inclusion
Tandem Transformers for Inference Efficient LLMs
Positive and Unlabeled Learning with Controlled Probability Boundary Fence
Position Paper: Quo Vadis, Unsupervised Time Series Anomaly Detection?
Hyperbolic Geometric Latent Diffusion Model for Graph Generation
Transitional Uncertainty with Layered Intermediate Predictions
Robust Classification via a Single Diffusion Model
Switchable Decision: Dynamic Neural Generation Networks
Prediction-powered Generalization of Causal Inferences
Fundamental Limits of Distributed Covariance Matrix Estimation Under Communication Constraints
Bayesian Adaptation of Network Depth and Width for Continual Learning
Position Paper: Tensor Networks are a Valuable Asset for Green AI
Leverage Class-Specific Accuracy to Guide Data Generation for Improving Image Classification
On Positivity Condition for Causal Inference
Position Paper: An Inner Interpretability Framework for AI Inspired by Lessons from Cognitive Neuroscience
Provably Efficient Exploration in Constrained Reinforcement Learning: Posterior Sampling Is All You Need
How Private is DP-SGD?
Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models
IOI: Invisible One-Iteration Adversarial Attack on No-Reference Image- and Video-Quality Metrics
Position Paper: Evolving AI Collectives to Enhance Human Diversity and Enable Self-Regulation
Semantically-correlated memories in a dense associative model
Libra: Building Decoupled Vision System on Large Language Models
Roping in Uncertainty: Robustness and Regularization in Markov Games
Minimally Modifying a Markov Game to Achieve Any Nash Equilibrium and Value
Translating Subgraphs to Nodes Makes Simple GNNs Strong and Efficient for Subgraph Representation Learning
Lie Neurons: Adjoint-Equivariant Neural Networks for Semisimple Lie Algebras
Accelerating Iterative Retrieval-augmented Language Model Serving with Speculation
Flora: Low-Rank Adapters Are Secretly Gradient Compressors
Supervised Constrained Matrix Factorization: Local Landscape Analysis and Applications
Position Paper: Understanding LLMs Requires More Than Statistical Generalization
Probabilistic Modeling of Interpersonal Coordination Processes
Adaptive Advantage-guided Policy Regularization for Offline Reinforcement Learning
AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training
Risk-Sensitive Reward-Free Reinforcement Learning with CVaR
Learning Latent Space Hierarchical EBM Diffusion Models
Decomposable Submodular Maximization in Federated Setting
Best of Both Worlds Guarantees for Smoothed Online Quadratic Optimization
Bring Your Own (Non-Robust) Algorithm to Solve Robust MDPs by Estimating The Worst Kernel
Accelerating Look-ahead in Bayesian Optimization: Multilevel Monte Carlo is All you Need
Decomposing and Editing Predictions by Modeling the Computation Graph
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformer
FrameQuant: Flexible Low-Bit Quantization for Transformers
Local Feature Selection without Label or Feature Leakage for Interpretable Machine Learning Predictions
Differentiable Combinatorial Scheduling at Scale
Improving Token-Based World Models with Parallel Observation Prediction
Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills
MC-GTA: Metric-Constrained Model-Based Clustering using Goodness-of-fit Tests with Autocorrelations
Conformal Prediction with Learned Features
How to Explore with Blindness: State Entropy Maximization in POMDPs
MS-TIP: Imputation Aware Pedestrian Trajectory Prediction
Robust Multi-Task Learning with Excess Risks
Sub-token ViT Embedding via Stochastic Resonance Transformers
Does Label Smoothing Help Deep Partial Label Learning?
Hierarchical Novelty Detection via Fine-Grained Evidence Allocation
Stochastic Conditional Diffusion Models for Robust Semantic Image Synthesis
Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning
Language Models as Science Tutors
Revisiting character-level adversarial attacks
Dynamic Survival Analysis with Controlled Latent States
Causal Discovery via Conditional Independence Testing with Proxy Variables
Statistically Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution
Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment
The Connection Between R-Learning and Inverse-Variance Weighting for Estimation of Heterogeneous Treatment Effects
Adaptive Feature Selection for No-Reference Image Quality Assessment using Contrastive Mitigating Semantic Noise Sensitivity
AlphaFold Meets Flow Matching for Generating Protein Ensembles
MLI Formula: A Nearly Scale-Invariant Solution with Noise Perturbation
Ai-sampler: Adversarial Learning of Markov kernels with involutive maps
Harmonic Self-Conditioned Flow Matching for joint Multi-Ligand Docking and Binding Site Design
A Rate-Distortion View of Distance Awareness
Dirichlet Flow Matching with Applications to DNA Sequence Design
Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Low-Rank MDPs
Amortized Variational Inference with Coverage Guarantees
In-Context Learning on Function Classes Unveiled for Transformers
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Reshape and Adapt for Output Quantization (RAOQ): Quantization-aware Training for In-memory Computing Systems
Position Paper: Beyond Personhood: Agency, Accountability, and the Limits of Anthropomorphic Ethical Analysis
Simultaneous identification of models and parameters of scientific simulators
DAG-Based Column Generation for Adversarial Team Games
Scribble-Supervised Semantic Segmentation with Prototype-based Feature Augmentation
Improving Sharpness-Aware Minimization by Lookahead
Optimization without retraction on the random generalized Stiefel manifold
Geometry-Calibrated DRO: Combating Over-Pessimism with Free Energy Implications
Sharpness-Aware Data Generation for Zero-shot Quantization
ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models
OODRobustBench: a benchmark and large-scale analysis of adversarial robustness under distribution shift
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
Low-Rank Similarity Mining for Multimodal Dataset Distillation
Improved Communication-Privacy Trade-offs in $L_2$ Mean Estimation under Streaming Differential Privacy
Learning Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition
Matrix Completion with ReLU Sampling
Exploring Correlations of Self-supervised Tasks for Graphs
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks
Collage: Light-Weight Low-Precision Strategy for LLM Training
FlexSM: Flexible Spatial-Temporal Multiplexing for LLM Serving
Improved Bounds for Pure Private Agnostic Learning: Item-Level and User-Level Privacy
Conformal prediction for multi-dimensional time-series
TSLANet: Rethinking Transformers for Time Series Representation Learning
FedSC: Provable Federated Self-supervised Learning with Spectral Contrastive Objective over Non-i.i.d. Data
Beyond Helpfulness and Harmlessness: Eliciting Diverse Behaviors from Large Language Models with Persona In-Context Learning
A Computational Framework for Solving Wasserstein Lagrangian Flows
CarbonNovo: Joint Design of Protein Structure and Sequence Using a Unified Energy-based Model
Antibody Design Using a Score-based Diffusion Model Guided by Evolutionary, Physical and Geometric Constraints
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo
Online Matching with Stochastic Rewards: Provable Better Bound via Adversarial Reinforcement Learning
Harmonizing Generalization and Personalization in Federated Prompt Learning
Knowledge Graphs Can be Learned with Just Intersection Features
Inexact Newton-type Methods for Optimisation with Nonnegativity Constraints
Soft Prompt Recovers Compressed LLMs, Transferably
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
TVE: Learning Meta-attribution for Transferable Vision Explainer
Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes
ERQ: Error Reduction for Post-Training Quantization of Vision Transformers
CaM: Cache Merging for Memory-efficient LLMs Inference
Exploring the Low-Pass Filtering Behavior in Image Super-Resolution
Revisiting the Role of Language Priors in Vision-Language Models
Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness
Density-Softmax: Efficient Test-time Model for Uncertainty Estimation and Robustness under Distribution Shifts
Learning to Play Atari in a World of Tokens
Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge
Self-Infilling Code Generation
Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits
Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations
When is Transfer Learning Possible?
Adaptive Hierarchical Certification for Segmentation using Randomized Smoothing
Hyperbolic Optimizer as a Dynamical System
Inherently Efficient and Noise-Robust Local Point Cloud Geometry Encoder via Vectorized Kernel Mixture
Learning Temporal Action Abstractions as a Sequence Compression Problem
Better Locally Private Sparse Estimation Given Multiple Samples Per User
VideoPrism: A Foundational Visual Encoder for Video Understanding
dPOD: On Discrete Prompt Optimization for Diffusion Models
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
Adapting Static Fairness to Sequential Decision-Making: Bias Mitigation Strategies towards Equal Long-term Benefit Rate
ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization
Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss
How Deep Do We Need: Accelerating Training and Inference of Neural ODEs via Control Perspective
A Bayesian Approach to Online Planning
LCA-on-the-Line: Benchmarking Out of Distribution Generalization with Class Taxonomies
Provably Robust DPO: Aligning Language Models with Noisy Feedback
AMPA: Adaptive Mixed Precision Allocation For Low-Bit Integer Training
Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity
Position Paper: Foundation Agents: Formulation, Progress and Opportunities
Graph Neural Networks with a Distribution of Parametrized Graphs
Improving Transformers with Dynamically Composable Multi-Head Attention
Disentangled Graph Self-supervised Learning under Distribution Shifts
Projection-Free Online Convex Optimization with Time-Varying Constraints
Disentangled Continual Graph Neural Architecture Search with Invariant Modularization
Discovering Features with Synergistic Interactions in Multiple Views
Indirectly Parameterized Concrete Autoencoders
Incremental Topological Ordering and Cycle Detection with Predictions
CurBench: Curriculum Learning Benchmark
Outlier-aware Slicing for Post-Training Quantization in Vision Transformer
Energy-based Backdoor Defense without Task-Specific Samples and Model Retraining
High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization
On the Independence Assumption in Neurosymbolic Learning
MILP-FBGen: LP/MILP Instance Generation with Feasibility/Boundedness
Boundary Intersection Sensitive Fingerprinting for Tampering Detection of DNN Models
Improving Equivariant Graph Neural Networks on Large Geometric Graphs via Virtual Nodes Learning
OxyGenerator: Reconstructing Global Ocean Deoxygenation Over a Century with Deep Learning
Unbiased Multi-Label Learning from Crowdsourced Annotations
Balanced Resonate-and-Fire Neurons
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation
A Theoretical Analysis of Backdoor Poisoning Attacks in Convolutional Neural Networks
A General Framework for Learning from Weak Supervision
Denoising Score Matching For All
Dynamic Evaluation of Large Language Models by Meta Probing Agents
CompeteAI: Understanding the Competition Dynamics of Large Language Model-based Agents
The Good, The Bad, and Why: Unveiling Emotions in Generative AI
Selective Mixup Helps with Distribution Shifts, But Not (Only) because of Mixup
Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View
Open-Vocabulary Calibration for Vision-Language Models
Position Paper: What Can Large Language Models Tell Us about Time Series Analysis
Position Paper: TrustLLM: Trustworthiness in Large Language Models
Kepler codebook
Online Learning in CMDPs: Handling Stochastic and Adversarial Constraints
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context
Finding NEM-U: Explaining unsupervised representation learning through neural network generated explanation masks
Swallowing the Bitter Pill: Simplified Scalable Conformer Generation
Stability and Generalization for Stochastic Recursive Momentum-based Algorithms for (Strongly-)Convex One to $K$-Level Stochastic Optimizations
On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis
On Transferring Expert Knowledge from Tabular Data to Images
Multi-layer Rehearsal Feature Augmentation for Class-Incremental Learning
Position Paper: A Call to Action for a Human-Centered AutoML Paradigm
GroupCover: A Secure, Efficient and Scalable Inference Framework for On-device Model Protection based on TEEs
Case-Based or Rule-Based: How Do Transformers Do the Math?
Differential Model Scaling using Differential Topk
Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts
Meta Reinforcement Learning Robust to Distributional Shift Via Performing Lifelong In-context Learning
A Fixed-Point Approach for Causal Generative Modeling
Graph Geometry-Preserving Autoencoders
Precise Accuracy / Robustness Tradeoffs in Regression: Case of General Norms
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
Identification and Estimation for Nonignorable Missing Data: A Data Fusion Approach
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
Unveiling Privacy, Memorization, and Input Curvature Links
Unveiling the Cycloid Trajectory of EM Iterations in Mixed Linear Regression
Understanding Diffusion Models by Feynman's Path Integral
Applying language models to algebraic topology: generating simplicial cycles using multi-labeling in Wu's formula
What needs to go right for an induction head?
Position Paper: Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them?
Position Paper: $C^*$-Algebraic Machine Learning: Moving in a New Direction
Total Variation Distance Meets Probabilistic Inference
Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling
Position Paper: Do not explain (vision models) without context
DRCT: Diffusion Reconstruction Contrastive Training towards Universal Detection of Diffusion Generated Images
DSD-DA: Distillation-based Source Debiasing for Domain Adaptive Object Detection
Towards Efficient Training and Evaluation of Robust Models against $l_0$ Bounded Adversarial Perturbations
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning
COALA: A Practical and Vision-Centric Federated Learning Platform
Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Invariant Risk Minimization Is A Total Variation Model
Timer: Transformers for Time Series at Scale
Sequential Kernel Goodness-of-fit Testing
Diffusion models encode the intrinsic dimension of data manifolds
Failures Are Fated, But Can Be Faded
Learning Latent Dynamic Robust Representations for World Models
Sign Gradient Descent-based Neuronal Dynamics: ANN-to-SNN Conversion Beyond ReLU Network
CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables
Distributional Values for XAI
Exploring the Complexity of Deep Neural Networks through Functional Equivalence
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving
Disentanglement Learning via Topology
Reparameterized Importance Sampling for Robust Variational Bayesian Neural Networks
Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms
Directly Denoising Diffusion Models
Open-Domain Text Evaluation via Contrastive Distribution Methods
Overcoming Data and Model heterogeneities in Decentralized Federated Learning via Synthetic Anchors
Neural-Kernel Conditional Mean Embeddings
DiffAug: Enhance Unsupervised Contrastive Learning with Domain-Knowledge-Free Diffusion-based Data Augmentation
ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback
Infinite Horizon Distributionally Robust Regret Optimal Control
Learning the Uncertainty Sets of Linear Control Systems via Set Membership: A Non-asymptotic Analysis
Attack-free Evaluating and Enhancing Adversarial Robustness on Categorical Data
A sampling theory perspective on activations for implicit neural representations
Risk Aware Benchmarking of Large Language Models
Asymmetry in Low-Rank Adapters of Foundation Models
Transformers are SSMs: Generalized Models and Efficient Algorithms with Structured State Space Duality
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Kernel Debiased Plug-in Estimation: Simultaneous, Automated Debiasing without Influence Functions for Many Target Parameters
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Federated Graph Rationalization with Anti-shortcut Augmentations
Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning
ACM-MILP: Adaptive Constraint Modification via Grouping and Selection for Hardness-Preserving MILP Instance Generation
Non-confusing Generation of Customized Concepts in Diffusion Models
Do Transformer World Models Give Better Policy Gradients?
DiffDA: a Diffusion model for weather-scale Data Assimilation
MS$^3$D: A RG Flow-Based Regularization for GAN Training with Limited Data
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
Comparing Graph Transformers via Positional Encodings
Partial Optimality in the Linear Ordering Problem
Boundary Exploration for Bayesian Optimization With Unknown Physical Constraints
Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization
Faster Adaptive Decentralized Learning Algorithms
Data-Efficient Molecular Generation with Hierarchical Textual Inversion
UP2ME: Univariate Pre-training to Multivariate Fine-tuning as a General-purpose Framework for Multivariate Time Series Analysis
Achieving Lossless Gradient Sparsification via Mapping to Alternative Space in Federated Learning
Cooperative Graph Neural Networks
High-Performance Temporal Reversible Spiking Neural Networks with $\mathcal{O}(L)$ Training Memory and $\mathcal{O}(1)$ Inference Cost
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
A3S: A General Active Clustering Method with Pairwise Constraints
Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion
Novel Spectral Algorithms for the Partial Credit Model
Absolute Policy Optimization: Enhancing Lower Probability Bound of Performance with High Confidence
Parameter-Efficient Visual Foundation Model with Sequence Imitation for Embodied Manipulation
Position Paper: Exploring the Robustness of Pipeline-Parallelism-Based Decentralized Training
Lookbehind-SAM: k steps back, 1 step forward
The Balanced-Pairwise-Affinities Feature Transform
Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases
Making old things new: a unified algorithm for differentially private clustering
Sensitivity Sampling for Coreset-Based Data Selection
Vocabulary for Universal Approximation: A Linguistic Perspective of Mapping Compositions
Balancing Similarity and Complementarity for Unimodal and Multimodal Federated Learning
Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning
An Efficient Self-Learning Framework For Interactive Spoken Dialog Systems
Nonparametric Teaching of Implicit Neural Representations
Learning with Adaptive Resource Allocation
Handling Heterogeneous Curvatures in Bandit LQR Control
Parsimonious Learning-Augmented Approximations for Dense Instances of $\mathcal{NP}$-hard Problems
Probabilistic Subgoal Representations for Hierarchical Reinforcement learning
Mitigating Catastrophic Forgetting in Online Continual Learning by Modeling Previous Task Interrelations
Reinforcement learning and regret bounds for admission control
Privacy Attacks in Decentralized Learning
Switched Flow Matching: Eliminating Singularities via Switching ODEs
Adaptive Robust Learning using Latent Bernoulli Variables
Enhancing Vision Transformer: Amplifying Non-Linearity in Feedforward Network Module
Distinguishing Neighborhood Representations Through Reverse Process of GNNs for Heterophilic Graphs
Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation
BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback
Eureka-Moments in Transformers: Multi-Step Tasks Reveal Softmax Induced Optimization Problems
Online Learning and Information Exponents: The Importance of Batch size & Time/Complexity Tradeoffs
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
MFTN: A Multi-scale Feature Transfer Network Based on IMatchFormer for Hyperspectral Image Super-Resolution
Amortized Equation Discovery in Hybrid Dynamical Systems
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay
Denoising Autoregressive Representation Learning
Recurrent Distance Filtering for Graph Representation Learning
A Unified View of FANOVA: A Comprehensive and Flexible Bayesian Framework for Component Selection and Estimation
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
SLOG: An Inductive Spectral Graph Neural Network Beyond Polynomial Filter
Optimal bounds for $\ell_p$ sensitivity sampling via $\ell_2$ augmentation
Turnstile $\ell_p$ leverage score sampling with applications
Two Tales of Single-Phase Contrastive Hebbian Learning
Batch Singular Value Polarization and Weighted Semantic Augmentation for Universal Domain Adaptation
Towards Unified Multi-granularity Text Detection with Interactive Attention
Improved Modelling of Federated Datasets using Mixtures-of-Dirichlet-Multinomials
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
Binary Decomposition: A Problem Transformation Perspective for Open-Set Semi-Supervised Learning
VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception
Sample as you Infer: Predictive Coding with Langevin Dynamics
SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN
Learning from Memory: A Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features
Position Paper: On The Importance of Technical Research and Talent for AI Governance
ReconBoost: Boosting Can Achieve Modality Reconcilement
Preventing Model Collapse in Gaussian Process Latent Variable Models
An Empirical Study of Realized GNN Expressiveness
Graph As Point Set
Finite-Time Convergence and Sample Complexity of Actor-Critic Multi-Objective Reinforcement Learning
Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning
Learning Scale-Aware Spatio-temporal Implicit Representation for Event-based Motion Deblurring
C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models
Assessing Large Language Models on Climate Information
Why Do You Grok? A Theoretical Analysis on Grokking Modular Addition
Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation
Fast White-Box Adversarial Streaming Without a Random Oracle
Language Models with Conformal Factuality Guarantees
Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
Differentiable Annealed Importance Sampling Minimizes The Jensen-Shannon Divergence Between Initial and Target Distribution
Slicing Mutual Information Generalization Bounds for Neural Networks
Proactive Detection of Voice Cloning with Localized Watermarking
On The Fairness Impacts of Hardware Selection in Machine Learning
$f$-Divergence Based Classification: Beyond the Use of Cross-Entropy
A Dynamic Algorithm for Weighted Submodular Cover Problem
X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation
Localizing Task Information for Improved Model Merging and Compression
Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning
Stealing part of a production language model
Sliding down the stairs: how correlated latent variables accelerate learning with neural networks
Delaunay Graph: Addressing Over-Squashing and Over-Smoothing Using Delaunay Triangulation
Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers
MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space
CauDiTS: Causal Disentangled Domain Adaptation of Multivariate Time Series
Auctionformer: A Unified Deep Learning Algorithm for Solving Equilibrium Strategies in Auction Games
Multi-View Clustering by Inter-cluster Connectivity-Guided Rewarding
KernelWarehouse: Rethinking the Design of Dynamic Convolution
FAFE: Immune Complex Modeling with Geodesic Distance Loss on Noisy Group Frames
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs
In-Context Freeze-Thaw Bayesian Optimization
Differentially Private Sum-Product Networks
Enhancing Class-Imbalanced Learning with Pre-trained Guidance through Class-Conditional Knowledge Distillation
Solving Hierarchical Information-Sharing Dec-POMDPs: An Extensive-Form Game Approach
Robustness of Deep Learning for Accelerated MRI: Benefits of Diverse Training Data
More Flexible PAC-Bayesian Meta-Learning by Learning Learning Algorithms
Breaking through the learning plateaus of in-context learning in Transformer
Revitalizing Multivariate Time Series Forecasting: Learnable Decomposition with Inter-Series Dependencies and Intra-Series Variations Modeling
Recovering the Pre-Fine-Tuning Weights of Generative Models
Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs
Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models
Beyond Individual Input for Deep Anomaly Detection on Tabular Data
Exploiting Negative Samples: A Catalyst for Cohort Discovery in Healthcare Analytics
Approximate Nearest Neighbor Search with Window Filters
Agnostic Interactive Imitation Learning: New Theory and Practical Algorithms
Random matrix theory improved Frechet mean of symmetric positive definite matrices
DNCs Require More Planning Steps
Incorporating probabilistic domain knowledge into deep multiple instance learning
Pairwise Alignment Improves Graph Domain Adaptation
Learning to Infer Generative Template Programs for Visual Concepts
RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation
How to escape sharp minima with random perturbations
Diffusion Posterior Sampling is Computationally Intractable
Offline Actor-Critic Reinforcement Learning Scales to Large Models
RL-CFR: Improving Action Abstraction for Imperfect Information Extensive-Form Games with Reinforcement Learning
Bringing Motion Taxonomies to Continuous Domains via GPLVM on Hyperbolic manifolds
Sparse-to-dense Multimodal Image Registration via Multi-Task Learning
On the Minimal Degree Bias in OOD Generalization for non-Boolean Functions
Piecewise Constant and Linear Regression Trees: An Optimal Dynamic Programming Approach
Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization
Controlling Behavioral Diversity in Multi-Agent Reinforcement Learning
StrokeNUWA—Tokenizing Strokes for Vector Graphic Synthesis
Mitigating Privacy Risk in Membership Inference by Convex-Concave Loss
$\bf{\Phi}_\textrm{Flow}$: Differentiable Simulations for PyTorch, TensorFlow and Jax
Autonomous Sparse Mean-CVaR Portfolio Optimization
Learning to Scale Logits for Temperature-Conditional GFlowNets
APIServe: Efficient API Support for Large-Language Model Inferencing
Bifurcated Attention for Single-Context Large-Batch Sampling
Symmetric Replay Training: Enhancing Sample Efficiency in Deep Reinforcement Learning for Combinatorial Optimization
StableMask: Refining Causal Masking in Decoder-only Transformer
Hierarchical Integral Probability Metrics: A distance on random probability measures with low sample complexity
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Incentivized Learning in Principal-Agent Bandit Games
On the Implicit Bias of Adam
An Information-Theoretic Analysis of In-Context Learning
Weakly Convex Regularisers for Inverse Problems: Convergence of Critical Points and Primal-Dual Optimisation
REMEDI: Corrective Transformations for Improved Neural Entropy Estimation
Probabilistic Generating Circuits - Demystified!
Mean Estimation in the Add-Remove Model of Differential Privacy
Random Exploration in Bayesian Optimization: Order-Optimal Regret and Computational Efficiency
Fast Sampling-Based Sketches for Tensors
Position Paper: Future Directions in Foundations of Graph Machine Learning
Weisfeiler-Leman at the margin: When more expressivity matters
Verifying message-passing neural networks via topology-based bounds tightening
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Stochastic Bandits with ReLU Neural Networks
Pi-DUAL: Using privileged information to distinguish clean from noisy labels
Position Paper: On the Societal Impact of Open Foundation Models
On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization
Exact Soft Analytical Side-Channel Attacks using Tractable Circuits
Ecologically rational meta-learned inference explains human category learning
In-context learning agents are asymmetric belief updaters
Intersectional Unfairness Discovery
Error Feedback Can Accurately Compress Preconditioners
TinyTrain: Resource-Aware Task-Adaptive Sparse Training of DNNs at the Data-Scarce Edge
Convergence and Trade-Offs in Riemannian Gradient Descent and Riemannian Proximal Point
Smooth Min-Max Monotonic Networks
DPZero: Private Fine-Tuning of Language Models without Backpropagation
MADA: Meta-Adaptive Optimizers through hyper-gradient Descent
Position Paper: Mission Critical – Satellite Data is a Distinct Modality in Machine Learning
Task-aware Orthogonal Sparse Network for Exploring Shared Knowledge in Continual Learning
Position Paper: Application-Driven Innovation in Machine Learning
Linear Explanations for Individual Neurons
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
GistScore: Learning Better Representations for In-Context Example Selection with Gist Bottlenecks
Optimally Improving Cooperative Learning in a Social Setting
Practical Hamiltonian Monte Carlo on Riemannian Manifolds via Relativity Theory
Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding
Enhancing Size Generalization in Graph Neural Networks through Disentangled Representation Learning
Cell2Sentence: Teaching Large Language Models the Language of Biology
Estimating Unknown Population Sizes Using the Hypergeometric Distribution
The Effect of Weight Precision in Deep Neural Networks
State-Constrained Zero-Sum Differential Games with One-Sided Information
Monotone, Bi-Lipschitz, and Polyak-Łojasiewicz Networks
Predictive Performance Comparison of Decision Policies Under Confounding
A Probabilistic Approach to Learning the Degree of Equivariance in Steerable CNNs
A Tensor Decomposition Perspective on Second-order RNNs
Position Paper: Compositional Generative Modeling: A Single Model is Not All You Need
Enhancing Storage and Computational Efficiency in Federated Multimodal Learning for Large-Scale Models
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
Two Fists, One Heart: Multi-Objective Optimization Based Strategy Fusion for Long-tailed Learning
Enhancing Adversarial Robustness in SNNs with Sparse Gradients
Robust Stable Spiking Neural Networks
On the Second-Order Convergence of Biased Policy Gradient Algorithms
Clifford-Steerable Convolutional Neural Networks
Networked Inequality: Preferential Attachment Bias in Graph Neural Network Link Prediction
Plug-and-Play image restoration with Stochastic deNOising REgularization
Rethinking Transformers in Solving POMDPs
A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
Safe Exploration in Dose Finding Clinical Trials with Heterogeneous Participants
Nash Learning from Human Feedback
MusicRL: Aligning Music Generation to Human Preferences
Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space
Collapse-Aware Triplet Decoupling for Adversarially Robust Image Retrieval
Differentiability and Convergence of Filtration Learning with Multiparameter Persistence
Pre-Training Protein bi-level Representation through Span Mask strategy on 3D Protein Chains
Vector-quantized Masked Auto-encoders on Molecular Surfaces
Amortized Variational Deep Kernel Learning
Bridging Model Heterogeneity in Federated Learning via Uncertainty-based Asymmetrical Reciprocity Learning
Rethinking Adversarial Robustness in the Context of the Right to be Forgotten
Data Poisoning Attacks against Conformal Prediction
Robust CLIP: Unsupervised Adversarial Fine-tuning of Vision Embeddings for Robust Large Vision-Language Models
FedBAT: Communication-efficient Federated Learning via Learnable Binarization
Scalable Multiple Kernel Clustering: Learning Clustering Structure from Expectation
Self-cognitive Denoising in the Presence of Multiple Noisy Label Sources
Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?
Position Paper: Mind your Language (Model): Fact-Checking LLMs and their Role in ML Research and Practice
Residual-Conditioned Optimal Transport: Towards Structure-Preserving Unpaired and Paired Image Restoration
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
A Near-Linear Time Approximation Algorithm for Beyond-Worst-Case Graph Clustering
SignSGD with Federated Defense: Harnessing Adversarial Attacks through Gradient Sign Decoding
Layerwise Change of Knowledge in Neural Networks
QBMK: Quantum-based Matching Kernels for Un-attributed Graphs
Decouple then Classify: A Dynamic Multi-view Labeling Strategy with Shared and Specific Information
Towards Resource-friendly, Extensible and Stable Incomplete Multi-view Clustering
RMIB: Representation Matching Information Bottleneck for Matching Text Representations
Accelerated Speculative Sampling Based on Tree Monte Carlo
EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand Binding Site Prediction
Non-clairvoyant Scheduling with Partial Predictions
Post-hoc Part-Prototype Networks
Generalizing Orthogonalization for Models with Non-linearities
Characteristic Guidance: Non-linear Correction for Diffusion Model at Large Guidance Scale
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Configurable Mirror Descent: Towards a Unification of Decision Making
Time Weaver: A Conditional Time Series Generation Model
Major-Minor Mean Field Multi-Agent Reinforcement Learning
Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning
Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective
Best Arm Identification for Stochastic Rising Bandits
On Online Experimentation without Device Identifiers
EvIL: Evolution Strategies for Generalisable Imitation Learning
Distributionally Robust Data Valuation
Privacy-Preserving Instructions for Aligning Large Language Models
PruNeRF: Segment-Centric Dataset Pruning via 3D Spatial Consistency
Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits
Hierarchical Neural Operator Transformer with Learnable Frequency-aware Loss Prior for Arbitrary-scale Super-resolution
Efficient Value Iteration for s-rectangular Robust Markov Decision Processes
Adaptive Conformal Inference by Betting
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts
DistiLLM: Skew KL Divergence and Adaptive Off-policy Approach for Efficient Distillation of Auto-regressive Language Models
Time-Series Forecasting for Out-of-Distribution Generalization Using Invariant Learning
SMaRt: Improving GANs with Score Matching Regularity
Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization
Extending the Reach of First-Order Algorithms for Nonconvex Min-Max Problems with Cohypomonotonicity
Generalization to New Sequential Decision Making Tasks with In-Context Learning
Unsupervised Concept Discovery Mitigates Spurious Correlations
Which Frequencies do CNNs Need? Emergent Bottleneck Structure in Feature Learning
Feel-Good Thompson Sampling for Contextual Dueling Bandits
Understanding Stochastic Natural Gradient Variational Inference
Improving Monte Carlo Evaluation with Offline Data
When and How Does In-Distribution Label Help Out-of-Distribution Detection?
Bayesian Knowledge Distillation: A Bayesian Perspective of Distillation with Uncertainty Quantification
Gaussian Plane-Wave Neural Operator for Electron Density Estimation
On the Expressive Power of Spectral Invariant Graph Neural Networks
Differentially Private Domain Adaptation with Theoretical Guarantees
FedRC: Tackling Diverse Distribution Shifts Challenge in Federated Learning by Robust Clustering
Position Paper: LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
Towards the Theory of Unsupervised Federated Learning: Non-asymptotic Analysis of Federated EM Algorithms
Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices
A Generative Approach for Treatment Effect Estimation under Collider Bias: From an Out-of-Distribution Perspective
Accelerating Transformer Pre-Training with 2:4 Sparsity
Best-Fit Data Packing: Fewer Truncations Improve Language Modeling
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation
Learning Latent Structures in Network Games via Data-Dependent Gated-Prior Graph Variational Autoencoders
ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories
RoboDreamer: Learning Compositional World Models for Robot Imagination
Acquisition Conditioned Oracle for Nongreedy Active Feature Acquisition
Improving SAM Requires Rethinking its Optimization Formulation
Differentially Private Decentralized Learning with Random Walks
RNAFlow: RNA Structure & Sequence Co-Design via Inverse Folding-Based Flow Matching
Efficient Algorithms for Sum-Of-Minimum Optimization
Position Paper: Measuring Diversity in Datasets
A decoder-only foundation model for time-series forecasting
Prompt Sketching for Large Language Models
BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models
Simple Ingredients for Offline Reinforcement Learning
Connecting the Dots: Collaborative Fine-tuning for Black-Box Vision-Language Models
The Perception-Robustness Tradeoff in Deterministic Image Restoration
Learning Divergence Fields for Generalization with Data Geometries
Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization
Learning the Target Network in Function Space
How Transformers Learn Causal Structure with Gradient Descent
Compound Returns Reduce Variance in Reinforcement Learning
Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
Tight Partial Identification of Causal Effects with Marginal Distribution of Unmeasured Confounders
Hybrid Neural Representations for Spherical Data
Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models
Position Paper: Rethinking LLM Censorship as a Security Problem
FADAS: Towards Federated Adaptive Asynchronous Optimization
On Multi-Armed Bandit with Impatient Arms
Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach
AI Alignment with Changing and Influenceable Reward Functions
Learning from Integral Losses in Physics Informed Neural Networks
EvGGS: A Collaborative Learning Framework for Event-based Generalizable Gaussian Splatting
Private Gradient Descent for Linear Regression: Tighter Error Bounds and Instance-Specific Uncertainty Estimation
On the Unexpected Effectiveness of Reinforcement Learning for Sequential Recommendation
Evaluating model bias requires characterizing its mistakes
Diffuse, Sample, Project: Plug-And-Play Controllable Graph Generation
Interdependent Multi-Agent MDP: Theoretical Framework for Decentralized Agents with Dynamic Local Dependencies
Learning Decision Policies with Instrumental Variables through Double Machine Learning
Parallelized Spatiotemporal Binding
Block Acceleration Without Momentum: On Optimal Stepsizes of Block Gradient Descent for Least-Squares
Sampling-based Multi-dimensional Recalibration
Gibbs Sampling of Continuous Potentials on a Quantum Computer
Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
HGAP: Boosting Permutation Invariant and Permutation Equivariant in Multi-Agent Reinforcement Learning via Graph Attention Network
Rapid Learning without Catastrophic Forgetting in the Morris Water Maze
Auto-Regressive Next-Token Predictors are Universal Learners
LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery
Truly No-Regret Learning in Constrained MDPs
Cross-domain Open-world Discovery
Theoretical Guarantees for Variational Inference with Fixed-Variance Mixture of Gaussians
ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking
Privacy Profiles for Private Selection
Position Paper: Quantifying Policy Impacts on Online Harms – A Call for Machine Learning-powered Assessment of the EU Digital Services Act
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Online Adaptive Anomaly Thresholding with Confidence Sequences
Hard Tasks First: Multi-Task Reinforcement Learning Through Task Scheduling
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Conditional Common Entropy for Instrumental Variable Testing and Partial Identification
Subequivariant Reinforcement Learning in 3D Multi-Object Physical Environments
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis
Optimizing watermarks for large language models
Learning Useful Representations of Recurrent Neural Network Weight Matrices
Unlocking Exact Recovery in Semi-Supervised Learning: Analysis of Spectral Method and Graph Convolution Network
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
CogBench: a large language model walks into a psychology lab
Learning to Compress Long Contexts by Dropping-In Convolutions
Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains
Implicit Representations via Operator Learning
Hidden Traveling Waves bind Working Memory Variables in Recurrent Neural Networks
Near-Linear Time Approximation Algorithms for k-means with Outliers
WAVES: Benchmarking the Robustness of Image Watermarks
Stationary Latent Weight Inference for Unreliable Observations from Online Test-Time Adaptation
Trustless Audits without Revealing Data or Models
Adaptive Sampling of k-Space in Magnetic Resonance for Fast Pathology Prediction
Expert Proximity as Surrogate Rewards for Single Demonstration Imitation Learning
Equivariance via Minimal Frame Averaging for More Symmetries and Efficiency
diff History for Neural Language Agents
Generalized Preference Optimization: A Unified Approach to Offline Alignment
Fast, Scalable, Warm-Start Semidefinite Programming with Spectral Bundling and Sketching
PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels
First-Order Manifold Data Augmentation for Regression Learning
Long Range Propagation on Continuous-Time Dynamic Graphs
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
Subsampling is not Magic: Why Large Batch Sizes Work for Differentially Private Stochastic Optimisation
Discovering environments with XRM
New and Improved Bounds on the Approximation of Complete-Link
Small-loss Adaptive Regret for Online Convex Optimization
Can Gaussian Sketching Converge Faster on a Preconditioned Landscape?
Learning to Predict Mutational Effects of Protein-Protein Interactions by Microenvironment-aware Hierarchical Prompt Learning
Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments
Feature Importance Disparities for Data Bias Investigations
Differentially Private Bias-Term Fine-tuning of Foundation Models
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Toward Accurate Fast Convolution under Low-precision Arithmetic
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models
Prompting for Robustness: Extracting Robust Classifiers from Foundation Models
Graph Neural Networks Use Graphs When They Shouldn't
The Statistical Complexity of Offline Decision-Making
Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning
Dense Reward for Free in Reinforcement Learning from Human Feedback
Agnostic Learning of Mixed Linear Regressions with EM and AM Algorithms
NDOT: Neuronal Dynamics-based Online Training for Spiking Neural Networks
One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts
One Meta-tuned Transformer is What You Need for Few-shot Learning
Speech Self-Supervised Learning Using Diffusion Model Synthetic Data
Learning-Rate-Free Stochastic Optimization over Riemannian Manifolds
Fast Decision Boundary based Out-of-Distribution Detector
Few-shot Adaption to Distribution Shifts By Mixing Source and Target Embeddings
Advancing DRL Agents in Commercial Fighting Games: Training, Integration, and Agent-Human Alignment
Simplicity Bias of Two-Layer Networks beyond Linearly-Separable Data
Generalization Bounds for Heavy-Tailed SDEs through the Fractional Fokker-Planck Equation
Saliency strikes back: How filtering out high frequencies improves white-box explanations
Simple linear attention language models balance the recall-throughput tradeoff
Discovering Bias in Latent Space: An Unsupervised Debiasing Approach
Position Paper: On the Possibilities of AI-Generated Text Detection
Slot Abstractors: Toward Scalable Abstract Visual Reasoning
Feedback Efficient Online Fine-Tuning of Diffusion Models
Position Paper: Automatic Environment Shaping is the Next Frontier in RL
Implicit meta-learning may lead language models to trust more reliable sources
Consistent Long-Term Forecasting of Ergodic Dynamical Systems
On Which Nodes Does GCN Fail? Enhancing GCN From the Node Perspective
Human vs. Generative AI in Content Creation Competition: Symbiosis or Conflict?
Solving Poisson Equations using Neural Walk-on-Spheres
Scalable Wasserstein Gradient Flow for Generative Modeling through Unbalanced Optimal Transport
Compositional Text-to-Image Generation with Dense Blob Representations
Position Paper: A Safe Harbor for AI Evaluation and Red Teaming
Self-Supervised Interpretable Sensorimotor Learning via Latent Functional Modularity
Position Paper: The Causal Revolution Needs Scientific Pragmatism
Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees
Hidden Harmonies in Chaos: An Unsupervised Approach for Periodic Source Detection
KernelSHAP-IQ: Weighted Least Square Optimization for Shapley Interactions
GRATH: Gradual Self-Truthifying for Large Language Models
Multiplicative Weights Update, Area Convexity and Random Coordinate Descent for Densest Subgraph Problems
Weakly-Supervised Residual Evidential Learning for Multi-Instance Uncertainty Estimation
Dynamic Spectral Clustering with Provable Approximation Guarantee
BOtied: Multi-objective Bayesian optimization with tied multivariate ranks
Robust Yet Efficient Conformal Prediction Sets
Graph Generation with Diffusion Mixture
Continuous Treatment Effects with Surrogate Outcomes
What Would Gauss Say About Representations? Probing Pretrained Image Models using Synthetic Gaussian Benchmarks
Understanding Finetuning for Factual Knowledge Extraction
Position Paper: Rethinking Post-Hoc Search-Based Neural Approaches for Solving Large-Scale Traveling Salesman Problems
AST-T5: Structure-Aware Pretraining for Code Generation and Understanding
Transforming and Combining Rewards for Aligning Large Language Models
Scalable Pre-training of Large Autoregressive Image Models
Hybrid Reinforcement Learning from Offline Observation Alone
Gibbs Sampling from Human Feedback: A Provable KL-constrained Framework for RLHF
A Language Model’s Guide Through Latent Space
What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling
Optimal Ridge Regularization for Out-of-Distribution Prediction
Position Paper: The Reasonable Person Standard for AI
Challenges and Considerations in the Evaluation of Bayesian Causal Discovery
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
Batch and match: black-box variational inference with a score-based divergence
Nonlinear Filtering with Brenier Optimal Transport Maps
Submodular framework for structured-sparse optimal transport
High-Dimensional Bayesian Optimization via Semi-Supervised Learning with Optimized Unlabeled Data Sampling
Split-and-Denoise: Protect large language model inference with local differential privacy
Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation
EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence
Efficient Exploration for LLMs
Neural SPH: Improved Neural Modeling of Lagrangian Fluid Dynamics
Controlled Decoding from Language Models
Graph-Triggered Rising Bandits
No-Regret Reinforcement Learning in Smooth MDPs
Modular Learning of Deep Causal Generative Models for High-dimensional Causal Inference
Genie: Generative Interactive Environments
Policy Evaluation for Variance in Average Reward RL
Fine-grained Classes and How to Find Them
Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers
Integrating Multimodal Data for Joint Generative Modeling of Complex Dynamics
Language-guided Skill Learning with Temporal Variational Inference
Towards Theoretical Understandings of Self-Consuming Generative Models
Position Paper: Opportunities for Machine Learning in Magnetic Fusion Energy
Position Paper: Why Tabular Foundation Models Should Be a Research Priority
Exploiting Human-AI Dependency for Learning to Defer
Measuring Stochastic Data Complexity with Boltzmann Influence Functions
Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context
Predicting Lagrangian Multipliers for Mixed Integer Linear Programs
From generalization analysis to optimization designs for state space models
Learning to Explore for Stochastic Gradient MCMC
Protein Conformation Generation via Force-Guided SE(3) Diffusion Models
Simulation of Graph Algorithms with Looped Transformers
Rolling Diffusion Models
A new computationally efficient algorithm to solve Feature Selection for Functional Data Classification in high-dimensional spaces
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings
Position Paper: What makes an image *realistic*?
Efficient Pareto Manifold Learning with Low-Rank Structure
PAC-Bayesian Error Bound, via Rényi Divergence, for a Class of Linear Time-Invariant State-Space Models
$\mathtt{VITS}$ : Variational Inference Thompson Sampling for contextual bandits
AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors
Minimum-Norm Interpolation Under Covariate Shift
Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement
The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline
SeMOPO: Learning High-quality Model and Policy from Low-quality Offline Visual Datasets
Bridging Environments and Language with Rendering Functions and Vision-Language Models
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
CaPS: Collaborative and Private Synthetic Data Generation from Distributed Sources
Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games
Stochastic Interpolants with Data-Dependent Couplings
On Convergence of Incremental Gradient for Non-convex Smooth Functions
The Fundamental Limits of Least-Privilege Learning
NExT-Chat: An LMM for Chat, Detection and Segmentation
Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption
Neural Tangent Kernels for Axis-Aligned Tree Ensembles
Robust $\phi$-Divergence Reinforcement Learning Using Offline and Online Data
Measures of diversity and space-filling designs for categorical data
Unsupervised Evaluation of Code LLMs with Round-Trip Correctness
Stability and Multigroup Fairness in Ranking with Uncertain Predictions
Improving Interpretation Faithfulness for Vision Transformers
Unified Training of Universal Time Series Forecasting Transformers
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Features Model
Random Latent Exploration for Deep Reinforcement Learning
Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs
CuTS: Customizable Tabular Synthetic Data Generation
Prediction Accuracy of Learning in Games: Follow-the-Regularized-Leader meets Heisenberg
Sliced Wasserstein with Random-Path Projecting Directions
Realistic Unsupervised CLIP Fine-tuning with Universal Entropy Optimization
Generalization Analysis of Stochastic Weight Averaging with General Sampling
Probability Distribution of Hypervolume Improvement in Bi-objective Bayesian Optimization
Transferring Knowledge From Large Foundation Models to Small Downstream Models
Learning by Reconstruction Produces Uninformative Features For Perception
Value-Evolutionary-Based Reinforcement Learning
Pseudo-Calibration: Improving Predictive Uncertainty Estimation in Unsupervised Domain Adaptation
MAGNOLIA: Matching Algorithms via GNNs for Online Value-to-go Approximation
A Closer Look at the Limitations of Instruction Tuning
LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions
Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models
Unifying Image Processing as Visual Prompting Question Answering
FedCal: Achieving Local and Global Calibration in Federated Learning via Aggregated Parameterized Scaler
Position Paper: Limitations of and Alternatives to Benchmarking in Reinforcement Learning Research
On the Last-Iterate Convergence of Shuffling Gradient Methods
A Penalty-based Gradient Method for Bilevel Reinforcement Learning
CW Complex Hypothesis for Image Data
Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Implicit Bias of Policy Gradient in Linear Quadratic Control: Extrapolation to Unseen Initial States
Fair Classification with Partial Feedback: An Exploration-Based Data-Collection Approach
Stability Evaluation through Distributional Perturbation Analysis
Mimicking Better by Matching the Approximate Action Distribution
Enforcing Constraints in RNA Secondary Structure Predictions: A Post-Processing Framework Based on the Assignment Problem
Position Paper: Cracking the Code of Cascading Disparity Towards Marginalized Communities
OT-CLIP: Understanding and Generalizing CLIP via Optimal Transport
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training
Rethinking Decision Transformer via Hierarchical Reinforcement Learning
Robust and Fine-tuning-free Instance Attribution for Interpretable NLP
A Subquadratic Time Algorithm for Robust Sparse Mean Estimation
Tackling Non-Stationarity in Reinforcement Learning via Causal-Origin Representation
Time Series Diffusion in the Frequency Domain
Spectral Phase Transition and Optimal PCA in Block-Structured Spiked Models
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Connect Later: Improving Fine-tuning for Robustness with Targeted Augmentations
On dimensionality of feature vectors in MPNNs
Zero-Sum Positional Differential Games as a Framework for Robust Reinforcement Learning: Deep Q-Learning Approach
Initial Guessing Bias: How Untrained Networks Favor Some Classes
Active Ranking and Matchmaking, with Perfect Matchings
Rethinking Momentum Knowledge Distillation in Online Continual Learning
Probabilistic Forecasting with Stochastic Interpolants
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
LPGD: A General Framework for Backpropagation through Embedded Optimization Layers
A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
Mastering Zero-Shot Interactions in Cooperative and Competitive Simultaneous Games
Efficient and Effective Time-Series Forecasting with Spiking Neural Networks
PIDformer: Transformer Meets Control Theory
Mapping the Multiverse of Latent Representations
Benchmarking Deletion Metrics with the Principled Explanations
Parallel Affine Transformation Tuning of Markov Chain Monte Carlo
Practical Performance Guarantees for Pipelined DNN Inference
Ameliorate Spurious Correlations in Dataset Condensation
Let Go of Your Labels with Unsupervised Transfer Learning
ODIM: Outlier Detection via Likelihood of Under-Fitted Generative Models
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
Asymptotics of Learning with Deep Structured (Random) Features
A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models
Polynomial-based Self-Attention for Table Representation Learning
Differentiable Mapper for Topological Optimization of Data Representation
Reward-Free Kernel-Based Reinforcement Learning
Bayesian Power Steering: An Effective Approach for Domain Adaptation of Diffusion Models
Lightweight Image Super-Resolution via Flexible Meta Pruning
The Merit of River Network Topology for Neural Flood Forecasting
Causal Representation Learning from Multiple Distributions: A General Setting
Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data
Learning 1-Bit Tiny Object Detector with Discriminative Feature Refinement
Robust Universal Adversarial Perturbations
Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages
Federated Representation Learning in the Under-Parameterized Regime
Think Before You Act: Decision Transformers with Internal Memory
LESS: Selecting Influential Data for Targeted Instruction Tuning
Bootstrapping Fisher Market Equilibrium and First-Price Pacing Equilibrium
In-Context Principle Learning from Mistakes
Limited Preference Aided Imitation Learning from Imperfect Demonstrations
Diffusion Model-Guided Behavioral Cloning
A Study of First-Order Methods with a Deterministic Relative-Error Gradient Oracle
Vanilla Bayesian Optimization Performs Great in High Dimensions
Compact Optimality Verification for Optimization Proxies
Controllable Prompt Tuning For Balancing Group Distributional Robustness
Counterfactual Image Editing
One-Shot Strategic Classification Under Unknown Costs
SparQ Attention: Bandwidth-Efficient LLM Inference
Liouville Flow Importance Sampler
Exploiting Code Symmetries for Learning Program Semantics
Automated Loss function Search for Class-imbalanced Node Classification
Spectral Preconditioning for Gradient Methods on Graded Non-convex Functions
Resisting Stochastic Risk in Diffusion Planners with the Trajectory Aggregation Tree
Augmenting Decision with Hypothesis in Reinforcement Learning
Privately Learning Smooth Distributions on the Hypercube by Projections
SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning
Revisit the Essence of Distilling Knowledge through Calibration
Diffusion Protein Language Model for Protein Generation and Representation Learning
Random Scaling and Momentum for Non-smooth Non-convex Optimization
LoRA+: Efficient Low Rank Adaptation of Large Models
Conditional Normalizing Flows for Active Learning of Coarse-Grained Molecular Representations
Online Learning in Betting Markets: Profit versus Prediction
A Provable Decision Rule for Out-of-Distribution Detection
The Complexity of Attention, or How Optimal is FlashAttention?
Stereo Risk: A Continuous Modeling Approach to Stereo Matching
Optimistic Multi-Agent Policy Gradient
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
Position Paper: Optimization in SciML -- A Function Space Perspective
The Non-linear $F$-Design and Applications to Interactive Learning
ArtWhisperer: A Dataset for Characterizing Human-AI Interactions in Artistic Creations
Predicting Dose-Response Curves with Deep Neural Networks
No Double Descent in Principal Component Regression: A High-Dimensional Analysis
Embodied CoT Distillation From LLM To Off-the-shelf Agents
Theoretical insights for diffusion guidance: A case study for Gaussian mixture models
Chatbot Meets Pipeline: Augment Large Language Model with Definite Finite Automaton
Stacking Deep Set Networks and Pooling by Quantiles
Disentangled 3D Scene Generation with Layout Learning
On Hypothesis Transfer Learning of Functional Linear Models
On the Diminishing Returns of Width for Continual Learning
Differentially Private Synthetic Data via Foundation Model APIs 2: Text
Exact Conversion of In-Context Learning to Model Weights
DE-COP: Detecting Copyrighted Content in Language Models Training Data
Collaborative Heterogeneous Causal Inference Beyond Meta-analysis
Relaxed Quantile Regression: Efficient Prediction Intervals for Asymmetric Noise
Removing Spurious Concepts from Neural Network Representations via Joint Subspace Estimation
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Debiased Distribution Compression
QUEST: Query-Aware Sparsity for Efficient Long-Context LLM Inference
Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference
Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training
Understanding the Effects of Iterative Prompting on Truthfulness
Fair Resource Allocation in Multi-Task Learning
Shifted Interpolation for Differential Privacy
Neighboring Perturbations of Knowledge Editing on Large Language Models
Uncertainty Estimation by Density Aware Evidential Deep Learning
Position Paper: Scaling Simulation is Neither Necessary Nor Sufficient for Generalizable and Compliant Real-World Robot Manipulation
FRAPPÉ: A Group Fairness Framework for Post-Processing Everything
Trainable Transformer in Transformer
Stochastic Optimization with Arbitrary Recurrent Data Sampling
Estimating the Permanent by Nesting Importance Sampling
Feasibility Consistent Representation Learning for Safe Reinforcement Learning
Federated Optimization with Doubly Regularized Drift Correction
Improved Stability and Generalization Guarantees of the Decentralized SGD Algorithm
Prodigy: An Expeditiously Adaptive Parameter-Free Learner
Cut Facets and Cube Facets of Lifted Multicut Polytopes
Concentration Inequalities for General Functions of Heavy-Tailed Random Variables
Feature Attribution with Necessity and Sufficiency via Dual-stage Perturbation Test for Causal Explanation
Clustered Federated Learning via Gradient Partitioning
Neural operators meet conjugate gradients: The FCG-NO method for efficient PDE solving
Total Variation Floodgate for Variable Importance Inference in Classification
Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models
Correlation-Induced Label Prior for Semi-Supervised Multi-Label Learning
Bagged Deep Image Prior for Recovering Images in the Presence of Speckle Noise
Careful with that Scalpel: Improving Gradient Surgery with an EMA
A General Framework for Sequential Decision-Making under Adaptivity Constraints
Neural Networks Learn Statistics of Increasing Complexity
Graph Adversarial Diffusion Convolution via Laplacian Distance
MultiMax: Sparse and Multi-Modal Attention Learning
Activation-Descent Regularization for Input Optimization of ReLU Networks
Understanding the Learning Dynamics of Direct Preference Optimization
Fault Tolerant ML: Efficient Meta-Aggregation and Synchronous Training
The Emergence of Reproducibility and Consistency in Diffusion Models
Single-Model Attribution of Generative Models Through Final-Layer Inversion
Causal Discovery with Fewer Independence Tests
Causal Inference from Competing Treatments
A Field Guide for Pacing Budget and ROS Constraints
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Relational DNN Verification With Cross Executional Bound Refinement
EiG-Search: Generating Edge-Induced Subgraphs for GNN Explanation in Linear Time
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Generalization Analysis of Robust Adversarial Transferring from Auxiliary Hypotheses
Position Paper: Towards Implicit Prompt For Text-To-Image Models
Stealthy Imitation: Reward-guided Environment-free Policy Stealing
Efficient Mixture Learning in Black-Box Variational Inference
Fast Co-Training under Weak Dependence via Stream-Based Active Learning
Rich-Observation Reinforcement Learning with Continuous Latent Dynamics
Bayesian Uncertainty for Gradient Aggregation in Multi-Task Learning
LangCell: Language-Cell Pre-training for Cell Identity Understanding
A Doubly-Recursive Stochastic Compositional Gradient Descent Method for Federated Multi-Level Compositional Optimization
Can Mamba Learn How To Learn? A Comparative Study on In-Context Learning Tasks
Can Implicit Bias Imply Adversarial Robustness?
In-Context Unlearning: Language Models as Few-Shot Unlearners
From Inverse Optimization to Feasibility to ERM
Mechanistic Design and Scaling of Hybrid Architectures
DsDm: Dataset Selection with Datamodels
Connections between Minimum Norm Interpolation and The Local Theory of Banach Spaces
Recovering Labels from Local Updates in Federated Learning
ReGAL: Refactoring Programs to Discover Generalizable Abstractions
Accelerating PDE Data Generation via Differential Operator Action in Solution Space
Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models
How Interpretable Are Interpretable Graph Neural Networks?
Conformal Prediction for AI Agents
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Decoupling Feature Extraction and Classification Layers for Calibrated Neural Networks
Feedback Loops With Language Models Drive In-Context Reward Hacking
The Role of Learning Algorithms in Collective Action
Diffusion Rejection Sampling
Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data
Dual Operating Modes of In-Context Learning
Asymptotics of feature learning in two-layer networks after one gradient-step
On the Identifiability of Switching Dynamical Systems
Position Paper: Graph Foundation Models
Reducing sequential change detection to sequential estimation
Position Paper: Video as the New Language for Real-World Decision Making
Light and Optimal Schrödinger Bridge Matching
Observable Propagation: Uncovering Feature Vectors in Transformers
Topological Neural Networks go Persistent, Equivariant and Continuous
FairProof: Confidential and Certifiable Fairness for Neural Networks
Fine-grained Local Sensitivity Analysis of Standard Dot-Product Self-Attention
Bipartite Matching in Massive Graphs: A Tight Analysis of EDCS
Robust Sparse Estimation for Gaussians with Optimal Error under Huber Contamination
Benign Overfitting in Adversarially Trained Neural Networks
Position Paper: Towards Unified Alignment Between Agents, Humans, and Environment
GenCO: Generating Diverse Solutions to Design Problems with Combinatorial Nature
Optimal Kernel Choice for Score Function-based Causal Discovery
Sparsest Models Elude Pruning: An Exposé of Pruning’s Current Capabilities
Optimal Batched Linear Bandits
All-in-one simulation-based inference
Imitation Learning in Discounted Linear MDPs without exploration assumptions
Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling
Improved Dimensionality Dependence for Zeroth-Order Optimisation over Cross-Polytopes
Information Complexity of Stochastic Convex Optimization: Applications to Generalization and Memorization
Imagine Big from Small: Unlock the Cognitive Generalization of Deep Reinforcement Learning from Simple Scenarios
Role of data structure in learning: compositionality vs stability to diffeomorphism
Reflective Policy Optimization
Meta Evidential Transformer for Few-Shot Open-Set Recognition
Implicit Bias of AdamW: $\ell_\infty$-Norm Constrained Optimization
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis
Statistical Inference Under Constrained Selection Bias
Global Optimality without Mixing Time Oracles in Average-reward RL Multi-level Actor-Critic
Accelerating Parallel Sampling of Diffusion Models
Provable Interactive Learning with Hindsight Instruction Feedback
Fast Peer Adaptation with Context-aware Exploration
Learning Iterative Reasoning through Energy Diffusion
CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks
Understanding Unimodal Bias in Multimodal Deep Linear Networks
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
An Independence-promoting Loss for Music Generation with Language Models
On Least Square Estimation in Softmax Gating Mixture of Experts
Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approximation
Factored-Reward Bandits with Intermediate Observations
Scalable Online Exploration via Coverability
Scaling Tractable Probabilistic Circuits: A Systems Perspective
Uncertainty-Aware Reward-Free Exploration with General Function Approximation
Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization
Weighted distance nearest neighbor condensing
Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
Position Paper: Artificial Superhuman Intelligence via Open-Ended Foundation Models
Causal Inference out of Control: Estimating Performativity without Treatment Randomization
Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines
A Symmetry Informed Equivariant Network for Crystal Tensor Prediction
Position Paper: Relational Deep Learning: Graph Representation Learning on Relational Databases
Position Paper: The Platonic Representation Hypothesis
LLaGA: Large Language and Graph Assistant
Automated Statistical Model Discovery with Language Models
CCM: Real-Time Controllable Visual Content Creation Using Text-to-Image Consistency Models
Learning Universal Predictors
On the Weight Dynamics of Deep Normalized Networks
Principled Gradient-based Markov Chain Monte Carlo for Text Generation
Environment Design for Inverse Reinforcement Learning
Profile Reconstruction from Private Sketches
Sequential Disentanglement by Extracting Static Information From A Single Sequence Element
Why do Variational Autoencoders Really Promote Disentanglement?
Consistent Submodular Maximization
PcLast: Discovering Plannable Continuous Latent States
Robustly Learning Single-Index Models via Alignment Sharpness
Analysis for Abductive Learning and Neural-Symbolic Reasoning Shortcuts
Interplay of ROC and Precision-Recall AUCs: Theoretical Limits and Practical Implications in Binary Classification
Position Paper: Against Spurious Sparks - Dovelating Inflated AI Claims
Do Large Code Models Understand Programming Concepts? A Black-box Approach
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
A Statistical Framework for Data-dependent Retrieval-Augmented Models
A Statistical Theory of Regularization-Based Continual Learning
Flextron: Many-in-One Flexible Large Language Model
Regularizing with Pseudo-Negatives for Continual Self-Supervised Learning
Online Linear Regression in Dynamic Environments via Discounting
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes
A Universal Class of Sharpness-Aware Minimization Algorithms
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks
Transformers with Loss Shaping Constraints for Long-Term Time Series Forecasting
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
How Does Goal Relabeling Improve Sample Efficiency?
Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score Based Diffusion Models
MaxMin-RLHF: Alignment with Diverse Human Preferences
HGCN2SP: Hierarchical Graph Convolutional Network for Two-Stage Stochastic Programming
Optimal Coresets for Low-Dimensional Geometric Median
Image Hijacks: Adversarial Images can Control Generative Models at Runtime
On the Error-Propagation of Inexact Deflation for Principal Component Analysis
On Stronger Computational Separations Between Multimodal and Unimodal Machine Learning
Rethinking Specificity in SBDD: Leveraging Delta Score and Energy-Guided Diffusion
Emerging Representations of Formal Semantics in Language Models Trained on Programs
Domain-Aware Guidance for Out-of-Distribution Molecular Design
Bridging discrete and continuous state spaces: Exploring the Ehrenfest process in time-continuous diffusion models
Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training
Boximator: Generating Rich and Controllable Motions for Video Synthesis
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models
Risk Estimation in a Markov Cost Process: Lower and Upper Bounds
DOGE: Domain Reweighting with Generalization Estimation
Predicting and Interpreting Energy Barriers of Metallic Glasses with Graph Neural Networks
Faster Sampling via Stochastic Gradient Proximal Sampler
Adaptive Stabilization Based on Machine Learning for Column Generation
Online Matrix Completion: A Collaborative Approach with Hott Items
Graphon Mean Field Games with A Representative Player: Analysis and Learning Algorithm
The Pitfalls of Next-Token Prediction
Leveraging Self-Consistency for Data-Efficient Amortized Bayesian Inference
EvoRainbow: Combining Improvements in Evolutionary Reinforcement Learning for Policy Search
Agnostic Sample Compression Schemes for Regression
Algorithmic Stability Unleashed: Generalization Bounds with Unbounded Losses
Encodings for Prediction-based Neural Architecture Search
Scalable AI Safety via Doubly-Efficient Debate
Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective
ViP: A Differentially Private Foundation Model for Computer Vision
Causal Action Influence Aware Counterfactual Data Augmentation
Fair Federated Learning via the Proportional Veto Core
Multicalibration for Confidence Scoring in LLMs
Learning in Deep Factor Graphs with Gaussian Belief Propagation
Rethinking the Flat Minima Searching in Federated Learning
Covert Malicious Finetuning: Subverting LLM Safety Training Without Detection
Code as Reward: Empowering Reinforcement Learning with VLMs
Convex and Bilevel Optimization for Neural-Symbolic Inference and Learning
How Free is Parameter-Free Stochastic Optimization?
Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks
Non-Vacuous Generalization Bounds for Large Language Models
Geometry-Aware Instrumental Variable Regression
Establishing Foundations for Training and Evaluating Visually-Conditioned Language Models
Monotone Individual Fairness
Improper Gaussian process regression and improper kernels
NeRF Compression via Transform Coding
An Iterative Min-Min Optimization Method for Sparse Bayesian Learning
Towards Efficient and Exact Optimization of Language Model Alignment
Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces
Online bipartite matching with imperfect advice
Neural Tangent Kernels Motivate Cross-Covariance Graphs in Neural Networks
When Will Gradient Regularization Be Harmful?
Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes
Efficient Error Certification for Physics-Informed Neural Networks
Compress Clean Signal from Noisy Raw Image: A Self-Supervised Approach
Creative Text-to-Audio Generation via Synthesizer Programming
Improving fine-grained understanding in image-text pre-training
Unveiling the Potential of AI for Nanomaterial Morphology Prediction
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Explorations of Self-Repair in Language Models
Robust and Conjugate Gaussian Process Regression
IW-GAE: Importance weighted group accuracy estimation for improved calibration and model selection in unsupervised domain adaptation
Detecting and Identifying Selection Structure in Sequential Data
Regression with Multi-Expert Deferral
Scalable and Flexible Causal Discovery with an Efficient Test for Adjacency
Learning Optimal Projection for Forecast Reconciliation of Hierarchical Time Series
Vision Transformers as Probabilistic Expansion from Learngene
Breadth-First Exploration in Adaptive Grid-based Reinforcement Learning
PAGER: Accurate Failure Characterization in Deep Regression Models
Pricing with Contextual Elasticity and Heteroscedastic Valuation
Privacy-preserving data release leveraging optimal transport and particle gradient descent
High-dimensional Linear Bandits with Knapsacks
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
Generalized Smooth Variational Inequalities: Methods with Adaptive Stepsizes
SAPG: Split and Aggregate Policy Gradients
Rob-FCP: Certifiably Byzantine-Robust Federated Conformal Prediction
Offline Imitation from Observation via Primal Wasserstein State Occupancy Matching
Dimension-Free Coresets for Multiple $\ell_p$ Regression
Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes
Aligning Transformers with Weisfeiler-Leman
From Geometry to Causality- Ricci Curvature and the Reliability of Causal Inference on Networks
PerceptAnon: Exploring the Human Perception of Image Anonymization Beyond Pseudonymization
Eluder-based Regret for Stochastic Contextual MDPs
Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues
Perturb-and-Project: Differentially Private Similarities and Marginals
Position Paper: Scarce Resource Allocations That Rely On Machine Learning Should Be Randomized
Diffusion Tempering Improves Parameter Estimation with Probabilistic Integrators for Ordinary Differential Equations
Neural Estimation of Mutual Information without Test-Time Optimization
ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages
Improved Generalization of Weight Space Networks via Augmentations
Prompting a Pretrained Transformer Can Be a Universal Approximator
DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems
Auto-Linear Phenomenon in Subsurface Imaging
In-Context Language Learning: Architectures and Algorithms
Dynamic Correlation Clustering in Sublinear Update Time
Molecule-Space: Free Lunch in Unified Multimodal Space via Knowledge Fusion
PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs
Borda Regret Minimization for Generalized Linear Dueling Bandits
Position Paper: Embracing Negative Results in Machine Learning
Multimodal Prototyping for cancer survival prediction
A Nearly Optimal Single Loop Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
Learning to Reach Goals via Diffusion
Single-Trajectory Distributionally Robust Reinforcement Learning
Projection-Free Variance Reduction Methods for Stochastic Constrained Multi-Level Compositional Optimization
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
Hybrid Inverse Reinforcement Learning
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models
Training Language Model Agents without Modifying Language Models
The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks
Look Ahead or Look Around? A Theoretical Comparison Between Autoregressive and Masked Pretraining
The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling
Online conformal prediction with decaying step sizes
Explaining Graph Neural Networks via Structure-aware Interaction Index
Human Alignment of Large Language Models through Online Preference Optimisation
Efficient PAC Learnability of Dynamical Systems Over Multilayer Networks
UniCorn: A Unified Contrastive Learning Approach for Multi-view Molecular Representation Learning
Toward Adaptive Reasoning in Large Language Models with Thought Rollback
Retrieval-Augmented Score Distillation for Text-to-3D Generation
Statistical Properties of Robust Satisficing
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning
Multi-Track Message Passing: Tackling Oversmoothing and Oversquashing in Graph Learning via Preventing Heterophily Mixing
Proactive DP: A Multiple Target Optimization Framework for DP-SGD
Towards Certified Unlearning for Deep Neural Networks
Understanding MLP-Mixer as a wide and sparse MLP
Towards Interpretable Local Learning with Successive Gradient Reconciliation
SPABA: A Single-Loop and Probabilistic Stochastic Bilevel Algorithm Achieving Optimal Sample Complexity
Correcting Diffusion-Based Perceptual Image Compression with Privileged End-to-End Decoder
Privacy Preserving Adaptive Experiment Design
Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth
Gradual Divergence for Seamless Adaptation: A Novel Domain Incremental Learning Method
NeWRF: A Deep Learning Framework for Wireless Radiation Field Reconstruction and Channel Prediction
Orchestrating Hierarchical Planning via D-Conductor and Q-Performer
Tuning-free Estimation and Inference of Cumulative Distribution Function under Local Differential Privacy
Efficient Contextual Bandits with Uninformed Feedback Graphs
DiJiang: Efficient Large Language Models through Compact Kernelization
Adversarial Attacks on Combinatorial Multi-Armed Bandits
Scaling exponents across parameterizations and optimizers: A large-scale empirical study
SqueezeLLM: Dense-and-Sparse Quantization
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
The Entropy Enigma: Success and Failure of Entropy Minimization
Towards Modular LMs by Building and Reusing a Library of LoRA Adapters
On the Role of Edge Dependency in Graph Generative Models
Global Reinforcement Learning: Beyond Linear and Convex Rewards via Submodular Semi-gradient Methods
Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders
See More Details: Efficient Image Super-Resolution by Experts Mining
Generalized Sobolev Transport for Probability Measures on a Graph
Large Scale Dataset Distillation with Domain Shift
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
ProtoGate: Prototype-based Neural Networks with Global-to-local Feature Selection for Tabular Biomedical Data
PIVOT: Iterative Visual Prompting for VLMs with Applications to Zero-Shot Robotic Control
Regression Learning with Limited Observations of Multivariate Responses and Features
S3O: A Dual-Phase Approach for Reconstructing Dynamic Shape and Skeleton of Articulated Objects from Single Monocular Video
Posterior Sampling-Based Bayesian Optimization with Tighter Bayesian Regret Bounds
Theory of Consistency Diffusion Models: Distribution Estimation Meets Fast Sampling
KISA: A Unified Keyframe Identifier and Skill Annotator for Long-Horizon Robotics Demonstrations
From Vision to Audio and Beyond: A Unified Model for Audio-Visual Representation and Generation
BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation
Characterizing Large Language Model Geometry Solves Toxicity Detection and Generation
Executable Code Actions Elicit Better LLM Agents
InterLUDE: Interactions between Labeled and Unlabeled Data to Enhance Semi-Supervised Learning
Data-free Distillation of Diffusion Models with Bootstrapping
Gated Linear Attention Transformers with Hardware-Efficient Training
Stochastic Quantum Sampling for Non-Logconcave Distributions and Estimating Partition Functions
FlowMM: Generating Materials with Riemannian Flow Matching
Conformalized Survival Distributions: A Generic Post-Process to Increase Calibration
Category-Aware Active Domain Adaptation
On the Recoverability of Causal Relations from Temporally Aggregated I.I.D. Data
Enhancing Implicit Shape Generators Using Topological Regularizations
Residual Quantization with Implicit Neural Codebooks
Relaxing the Accurate Imputation Assumption in Doubly Robust Learning for Debiased Recommendation
Spider: A Unified Framework for Context-dependent Concept Understanding
Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents
Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing
A Unified Adaptive Testing System Enabled by Hierarchical Structure Search
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
Causality Based Front-door Defence Against Backdoor Attack on Language Model
Information-Directed Pessimism for Offline Reinforcement Learning
Verification of Machine Unlearning is Fragile
Entropy-Reinforced Planning with Large Language Models for Drug Discovery
Multi-group Learning for Hierarchical Groups
Transformers, parallel computation, and logarithmic depth
Unsupervised Parameter-free Simplicial Representation Learning with Scattering Transforms
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Representation Surgery for Multi-Task Model Merging
Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions
Critical windows: a theoretical lens on feature emergence in diffusion models
Layer-Aware Analysis of Catastrophic Overfitting: Revealing the Pseudo-Robust Shortcut Dependency
Trustworthy Actionable Perturbations
OLLIE: Imitation Learning from Offline Pretraining to Online Finetuning
How to Leverage Diverse Demonstrations in Offline Imitation Learning
DMTG: One-Shot Differentiable Multi-Task Grouping
Efficient Denoising Diffusion via Probabilistic Masking
Position Paper: Revisiting the hypothesis: Do pretrained Transformers Learn In-Context by Gradient Descent?
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Behavior Generation with Latent Actions
Smoothness Adaptive Hypothesis Transfer Learning
Bayesian Online Multivariate Time Series Imputation with Functional Decomposition
Leveraging VLM-Based Pipelines to Annotate 3D Objects
Receptive Fields As Experts in Vision Architectures
BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Hopfield Model
Prototypical Transformer As Unified Motion Learners
Learning to Continually Learn with the Bayesian Principle
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Reward Shaping for Reinforcement Learning with An Assistant Reward Agent
tinyBenchmarks: evaluating LLMs with fewer examples
Active Statistical Inference
Plug-in Performative Optimization
No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths
Replicable Learning of Large-Margin Halfspaces
Understanding Forgetting in Continual Learning with Linear Regression
Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization
Learning with Partial-Label and Unlabeled Data: A Uniform Treatment for Supervision Redundancy and Insufficiency
Kernel Semi-Implicit Variational Inference
Orthogonal Bootstrap: Efficient Simulation of Input Uncertainty
Pluvial Flood Emulation with Hydraulics-informed Message Passing
Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models
New Sample Complexity Bounds for Sample Average Approximation in Heavy-Tailed Stochastic Programming
Iterative Regularized Policy Optimization with Imperfect Demonstrations
Model-based Reinforcement Learning for Confounded POMDPs
Visual Representation Learning with Stochastic Frame Prediction
ReLU Network with Width $d+\mathcal{O}(1)$ Can Achieve Optimal Approximation Rate
An LLM Compiler for Parallel Function Calling
Equivariant Frames and the Impossibility of Continuous Canonicalization
Distributed Bilevel Optimization with Communication Compression
TagLog: Test-Time Adaptation for Tabular Data Using Logic Rules
Structured Chemistry Reasoning with Large Language Models
Representing Molecules as Random Walks Over Interpretable Grammars
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Multi-Source Conformal Inference Under Distribution Shift
A Distributional Analogue to the Successor Representation
Beyond Point Prediction: Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process
Position Paper: AI-Powered Autonomous Weapons Risk Geopolitical Instability and Threaten AI Research
Reinforcement Learning within Tree Search for Fast Macro Placement
Towards efficient deep spiking neural networks construction with spiking activity based pruning
Retrieval Across Any Domains via Large-scale Pre-trained Model
Seesaw: Compensating for Nonlinear Reduction with Linear Computations for Private Inference
Et Tu Certifications: Robustness Certificates Yield Better Adversarial Examples
Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective
Allocation Requires Prediction Only if Inequality Is Low
Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning
Probabilistic time series modeling with decomposable denoising diffusion model
In-context Convergence of Transformers
Asymptotically Optimal and Computationally Efficient Average Treatment Effect Estimation in A/B testing
Position Paper: Enforced Amnesia as a Way to Mitigate the Potential Risk of Silent Suffering in the Conscious AI
DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation
Position Paper: Intent-aligned AI systems optimize for Agency Loss
An Explicit Frame Construction for Normalizing 3D Point Clouds
SurfPro: Functional Protein Design Based on Continuous Surface
Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
A Vector Quantization Pretraining Method for EEG Time Series with Random Projection and Phase Alignment
Towards General Algorithm Discovery for Combinatorial Optimization: Learning Symbolic Branching Policy from Bipartite Graph
PDHG-Unrolled Learning-to-Optimize Method for Large-Scale Linear Programming
An Information Theoretic Approach to Interaction-Grounded Learning
Feasible Reachable Policy Iteration
FedREDefense: Defending against Model Poisoning Attacks for Federated Learning using Model Update Reconstruction Error
Deep Demonstration Tracing: Learning Generalizable Imitator for Runtime One-Shot Imitation
Model Alignment as Prospect Theoretic Optimization
Graph External Attention Enhanced Transformer
HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding
Be Your Own Neighborhood: Detecting Adversarial Examples by the Neighborhood Relations Built on Self-Supervised Learning
Centralized Selection with Preferences in the Presence of Biases
Ranking-based Client Selection with Imitation Learning for Efficient Federated Learning
Planning with Theory of Mind for Few-Shot Adaptation in Mixed-motive Environments
Decentralized Convex Finite-Sum Optimization with Better Dependence on Condition Numbers
On the Complexity of Finite-Sum Smooth Optimization under the Polyak–Łojasiewicz Condition
CARTE: pretraining and transfer for tabular learning
Position Paper: Levels of AGI -- Operationalizing Progress on the Path to AGI
Confidence-aware Contrastive Learning for Selective Classification
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
Variational Partial Group Convolutions for Input-Aware Partial Equivariance of Rotations and Color-Shifts
Discounted Adaptive Online Prediction
Mean-field Chaos Diffusion Models
Deeper or Wider: A Perspective from Optimal Generalization Error with Sobolev Loss
NExT-GPT: Any-to-Any Multimodal LLM
COPAL: Continual Pruning in Large Language Generative Models
Diffusion-based Missing-view Generation for Incomplete Multi-view Clustering
Accelerated Policy Gradient for s-rectangular Robust MDPs with Large State Spaces
Delving into the Convergence of Minimax Optimization
Improving Computational Complexity in Statistical Models with Local Curvature Information
Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?
Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation
DFD: Distilling the Feature Disparity Differently for Detectors
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
Score-Based Causal Discovery in the Presence of Causally-Related Latent Variables
Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition
Improving Generalization in Offline Reinforcement Learning via Adversarial Data Splitting
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Robust Graph Matching when Nodes are Corrupt
Statistical Test for Attention Maps in Vision Transformers
BadPart: Unified Black-box Adversarial Patch Attacks against Pixel-wise Regression Tasks
Learning and Forgetting Unsafe Examples in Large Language Models
Enhancing Sufficient Dimension Reduction via Hellinger Correlation
Simulation-Based Inference with Quantile Regression
EDISON: Enhanced Dictionary-Induced Tensorized Incomplete Multi-View Clustering with Gaussian Error Rank Minimization
Generative Conditional Distributions by Neural (Entropic) Optimal Transport
RLVF: Learning from Verbal Feedback without Overgeneralization
Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation
Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency
An Empirical Study Into What Matters for Calibrating Vision-Language Models
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Finite Time Regret Bounds for Self-Tuning Regulation
An Interpretable Evaluation of Entropy-based Novelty of Generative Models
Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
Learning Reward for Robot Skills Using Large Language Models via Self-Alignment
UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers
Mechanistic Neural Networks for Scientific Machine Learning
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning
Weighted Visual-Text Cross Alignment via Localized Visual Prompting
TENG: Time-Evolving Natural Gradient for Solving PDEs with Deep Neural Net
DoRA: Weight-Decomposed Low-Rank Adaptation
Online Cascade Learning for Efficient Inference over Streams
Fast Adversarial Attacks on Language Models In One GPU Minute
Sparse Inducing Points in Deep Gaussian Processes: Enhancing Modeling with Denoising Diffusion Variational Inference
Make-A-Shape: a Ten-Million-scale 3D Shape Model
Toward Availability Attacks in 3D Point Clouds
Graph Positional and Structural Encoder
Sample-Specific Multi-Channel Masks for Visual Reprogramming
Rejuvenating image-GPT as Strong Visual Representation Learners
Data-free Neural Representation Compression with Riemannian Neural Dynamics
Modeling Language Tokens as Functionals of Semantic Fields
Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting
Harnessing the Power of Neural Operators with Automatically Encoded Conservation Laws
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
Repoformer: Selective Retrieval for Repository-Level Code Completion
On the Asymptotic Distribution of the Minimum Empirical Risk
A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems
Stable Differentiable Causal Discovery
TimeX++: Learning Time-Series Explanations with Information Bottleneck
O$n$ Learning Deep O($n$)-Equivariant Hyperspheres
Unmasking Vulnerabilities: Cardinality Sketches under Adaptive Inputs
AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA
Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior
Generalization Bound and New Algorithm for Clean-Label Backdoor Attack
From Neurons to Neutrons: A Case Study in Interpretability
Physics-Informed Neural Network Policy Iteration: Algorithms, Convergence, and Verification
QuRating: Selecting High-Quality Data for Training Language Models
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Explain Temporal Black-Box Models via Functional Decomposition
Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation
Bridging Local and Global Perspectives in Interpretation Complexity
Reservoir Computing for Short High-Dimensional Time Series: an Application to SARS-CoV-2 Hospitalization Forecast
Beyond the Calibration Point: Mechanism Comparison in Differential Privacy
Prospector Heads: Generalized Feature Attribution for Large Models & Data
Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum
Causally Motivated Personalized Federated Invariant Learning with Shortcut-Averse Information-Theoretic Regularization
Amend to Alignment: Decoupled Prompt Tuning for Mitigating Spurious Correlation in Vision-Language Models
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
Double-Step Alternating Extragradient with Increasing Timescale Separation for Finding Local Minimax Points: Provable Improvements
BetterV: Controlled Verilog Generation with Discriminative Guidance
BWS: Best Window Selection Based on Sample Scores for Data Pruning across Broad Ranges
Collective Certified Robustness against Graph Injection Attacks
Overcoming Saturation in Density Ratio Estimation by Iterated Regularization
A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models
Barrier Algorithms for Constrained Non-Convex Optimization
Naive Bayes Classifiers over Missing Data: Decision and Poisoning
GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model
Distance function for spike prediction
Locality-Sensitive Hashing-Based Efficient Point Transformer with Applications in High-Energy Physics
From Fourier to Neural ODEs: Flow Matching for Modeling Complex Systems
Convergence and Complexity Guarantee for Inexact First-order Riemannian Optimization Algorithms
Larimar: LLMs with External Episodic Memory Control
Envisioning Outlier Exposure for Out-of-distribution Detection
On the Generalization of Equivariant Graph Neural Networks
Deep Fusion: Efficient Network Training via Pre-trained Initializations
Adaptively Learning to Select-Rank in Online Platforms
Riemannian Accelerated Zeroth-order Algorithm: Improved Robustness and Lower Query Complexity
Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness
Differentiable Weightless Neural Networks
Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models
DetKDS: Knowledge Distillation Search for Object Detectors
Auto-Encoding Morph-Tokens for Multimodal LLM
Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling
Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once
Pruning Small Pre-Trained Weights $\textit{Irreversibly}$ and $\textit{Monotonically}$ Impairs ``Difficult'' Downstream Tasks in LLMs
Advancing Dynamic Sparse Training by Exploring Optimization Opportunities
Zeroth-Order Methods for Constrained Nonconvex Nonsmooth Stochastic Optimization
Compositional Few-Shot Class-Incremental Learning
Estimating Barycenters of Distributions with Neural Optimal Transport
Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning
Evaluating Quantized Large Language Models
INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations
Parameter-Efficient Fine-Tuning with Controls
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Neural operators with localized integral and differential kernels
LAGMA: LAtent Goal-guided Multi-Agent Reinforcement Learning
To Cool or not to Cool? Temperature Network Meets Large Foundation Models via DRO
Exploring Intrinsic Dimension for Vision-Language Model Pruning
Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Listenable Maps for Audio Classifiers
Can a Few Decide for Many? The Metric Distortion of Sortition
Biharmonic Distance of Graphs and its Higher-Order Variants: Theoretical Properties with Applications to Centrality and Clustering
Offline Transition Modeling via Contrastive Energy Learning
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
Position Paper: Challenges and Opportunities in Topological Deep Learning
MiMiC: Minimally Modified Counterfactuals in the Representation Space
On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-box
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers
Active Preference Learning for Large Language Models
Adaptive-Gradient Policy Optimization: Enhancing Policy Learning in Non-Smooth Differentiable Simulations
Bidirectional Reciprocative Information Communication for Few-Shot Semantic Segmentation
Aligned Objective for Soft-Pseudo-Label Generation in Supervised Learning
ULAREF: A Unified Label Refinement Framework for Learning with Inaccurate Supervision
Latent variable model for high-dimensional point process with structured missingness
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
On a Combinatorial Problem Arising in Machine Teaching
TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling
Improving Antibody Humanness Prediction using Patent Data
Adaptive Observation Cost Control for Variational Quantum Eigensolvers
Efficient Algorithms for Empirical Group Distributional Robust Optimization and Beyond
Convergence of Online Learning Algorithm for a Mixture of Multiple Linear Regressions
On the Sampling Structure of Diffusion Models
Accelerating Convergence in Bayesian Few-Shot Classification
FRAG: Frequency Adapting Group for Diffusion Video Editing
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts
Outlier-robust Kalman Filtering through Generalised Bayes
Quality Diversity through Human Feedback: an Open-Ended Backend for Diversity-Driven Optimization
ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections
Energy-Efficient Gaussian Processes using Low-Precision Arithmetic
Weisfeiler Leman for Euclidean Equivariant Machine Learning
Neuro-Symbolic Temporal Point Processes
No Dimensional Sampling Coresets for Classification
SuDA: Support-based Domain Adaptation for Sim2Real Motion Capture with Flexible Sensors
Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations
Just Cluster It: An Approach for Exploration in High-Dimensions using Clustering and Pre-Trained Representations
Particle Denoising Diffusion Sampler
Test-Time Degradation Adaption for Open-Set Image Restoration
NeuralIndicator: Implicit Surface Reconstruction from Neural Indicator Priors
LLM Arena: An Open Platform for Evaluating LLMs by Human Preference
Homomorphism Counts for Graph Neural Networks: All About That Basis
Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error
On Interpolating Experts and Multi-Armed Bandits
Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models
Linguistic Calibration of Language Models
Learning Multiple Secrets in Mastermind
Interpreting Natural Language Generation via Optimal Transport
Adversarially Robust Deep Multi-View Clustering: A Novel Attack and Defense Framework
Offline Inverse RL: New Solution Concepts and Provably Efficient Algorithms
How Uniform Random Weights Induce Non-uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers
Highway Value Iteration Networks
Provable Privacy with Non-Private Pre-Processing
ReDiffuser: Reliable Decision-Making Using a Diffuser with Confidence Estimation
Revisiting the Power of Prompt for Visual Tuning
xT: Nested Tokenization for Larger Context in Large Images
Sliced-Wasserstein Estimation with Spherical Harmonics as Control Variates
Transolver: A Fast Transformer Solver for PDEs on General Geometries
Instruction Tuning for Secure Code Generation
Identifiability Matters: Revealing the Hidden Recoverable Condition in Unbiased Learning to Rank
An Efficient Maximal Ancestral Graph Listing Algorithm
Graph-enhanced Large Language Models in Asynchronous Plan Reasoning
Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Expressivity and Generalization: Fragment-Biases for Molecular GNNs
Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation
Mollification Effects of Policy Gradient Methods
Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation
Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Rethinking Independent Cross-Entropy Loss For Graph-Structured Data
Position Paper: A Critical Evaluation of Reinforcement Learning in Dynamic Treatment Regimes
LSEnet: Lorentz Structural Entropy Neural Network for Deep Graph Clustering
Critical feature learning in deep neural networks
Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation
ELTA: An Enhancer against Long-Tail for Aesthetics-oriented Models
Towards Efficient Spiking Transformer: a Token Sparsification Framework for Training and Inference Acceleration
Causal Effect Identification in LiNGAM Models with Latent Confounders
TIC-TAC: A Framework For Improved Covariance Estimation In Deep Heteroscedastic Regression
DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems
DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection
Classification under Nuisance Parameters and Generalized Label Shift in Likelihood-Free Inference
Individual Contributions as Intrinsic Exploration Scaffolds for Multi-agent Reinforcement Learning
Prometheus: Out-of-distribution Fluid Dynamics Modeling with Disentangled Graph ODE
RVI-SAC: Average Reward Off-Policy Deep Reinforcement Learning
Convergence of Some Convex Message Passing Algorithms to a Fixed Point
Structure Your Data: Towards Semantic Graph Counterfactuals
DUPLEX: Dual GAT for Complex Embedding of Directed Graphs
Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large Language Models
Few-Shot Unsupervised Implicit Neural Shape Representation Learning with Spatial Adversaries
Optimizing Complex Machine Learning Systems with Black-box and Differentiable Components
Mean-field Underdamped Langevin Dynamics and its Spacetime Discretization
Contextual Feature Selection with Conditional Stochastic Gates
On the tractability of SHAP explanations under Markovian distributions
Double Momentum Method for Lower-Level Constrained Bilevel Optimization
GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding
av-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
Double Stochasticity Gazes Faster: Snap-Shot Decentralized Stochastic Gradient Tracking Methods
Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient
Generative Enzyme Design Guided by Functionally Important Sites and Small-Molecule Substrates
An Image is Worth Multiple Words: Discovering Object Level Concepts using Multi-Concept Prompt Learning