gistml | Daily ML Paper Summaries

annbatch unlocks terabyte-scale training of biological data in anndata

Ilan Gold, Felix Fischer, Lucas Arnoldt, F. Alexander Wolf, Fabian J. Theis

Efficient ML

Annbatch significantly reduces data loading times for large biological datasets.
The framework integrates fully with the anndata ecosystem, ensuring compatibility with existing tools.
Implements efficient data retrieval techniques such as pseudo-random access and pre-shuffling.
Achieves a throughput of ~35,000 samples per second, outperforming existing solutions.

Batched Contextual Reinforcement: A Task-Scaling Law for Efficient Reasoning

Bangji Yang, Hongbo Ma, Jiajun Fan, Ge Liu

Large Language Models Reinforcement Learning Efficient ML

Introduction of Batched Contextual Reinforcement (BCR) for efficient reasoning in LLMs.
Discovery of a task-scaling law indicating that increasing concurrent problems reduces token usage while maintaining accuracy.
BCR achieves significant token reductions (15.8% to 62.6%) without degrading accuracy across multiple benchmarks.
Emergent self-regulated efficiency allows models to optimize reasoning autonomously, reducing unnecessary verbosity.

CuTeGen: An LLM-Based Agentic Framework for Generation and Optimization of High-Performance GPU Kernels using CuTe

Tara Saba, Anne Ouyang, Xujie Si, Fan Long

Optimization Large Language Models Efficient ML

CuTeGen is an iterative framework for GPU kernel synthesis that emphasizes progressive refinement.
The framework utilizes the CuTe abstraction layer to enhance kernel generation stability and performance.
Delayed profiling integration prevents premature convergence to suboptimal solutions during kernel optimization.
CuTeGen achieves significant performance improvements over existing implementations, particularly in matrix multiplication and activation workloads.

Feature Weighting Improves Pool-Based Sequential Active Learning for Regression

Dongrui Wu

Theory Optimization Efficient ML

Introduces feature weighting in distance computation for active learning in regression.
Proposes five new active learning approaches that incorporate feature weights.
Demonstrates consistent performance improvements over existing methods.
Validates effectiveness across both single-task and multi-task regression problems.

Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing

Gengsheng Li, Tianyu Yang, Junfeng Fang, Mingyang Song, Mao Zheng, Haiyun Guo, Dan Zhang, Jinqiao Wang, Tat-Seng Chua

Reinforcement Learning Large Language Models Optimization

SRPO unifies GRPO and SDPO to enhance reinforcement learning efficiency.
The framework routes samples based on correctness, improving credit assignment.
An entropy-aware mechanism stabilizes training by focusing on reliable signals.
SRPO outperforms both GRPO and SDPO in terms of peak performance and efficiency.

Improving Latent Generalization Using Test-time Compute

Arslan Chaudhry, Sridhar Thiagarajan, Andrew Lampinen

NLP Large Language Models Reinforcement Learning

In-weights learning in LLMs often struggles with latent generalization, particularly in deductive reasoning tasks.
Test-time compute, or 'thinking', can significantly improve latent generalization compared to traditional train-time data augmentation methods.
Models trained to generate long chains-of-thought through RL can generalize effectively to both in-distribution and out-of-distribution knowledge.
Despite improvements, thinking models still face challenges with pure reversal tasks, indicating a gap compared to in-context learning performance.

Care-Conditioned Neuromodulation for Autonomy-Preserving Supportive Dialogue Agents

Shalima Binta Manir, Tim Oates

NLP Large Language Models

Introduces Care-Conditioned Neuromodulation (CCN) for supportive dialogue agents.
Formulates supportive dialogue as a multi-objective alignment problem focusing on autonomy support.
Constructs a benchmark for relational failure modes in multi-turn dialogues.
Demonstrates significant improvements in autonomy-preserving utility over existing methods.

PAC-Bayesian Reward-Certified Outcome Weighted Learning

Yuya Ishikawa, Shu Tamano

Theory

PROWL incorporates reward uncertainty into the learning framework for individualized treatment rules.
The method provides a conservative reward estimate and a lower bound on expected value, improving robustness.
A nonasymptotic PAC-Bayes lower bound is established for randomized ITRs, characterized by a general Bayes update.
An automated calibration procedure for learning rates is introduced, enhancing optimization efficiency.

Variational LSTM with Augmented Inputs: Nonlinear Response History Metamodeling with Aleatoric and Epistemic Uncertainty

Manisha Sapkota, Min Li, Bowei Li

Time Series

Introduces a Variational LSTM model for nonlinear structural metamodeling.
Augmented inputs effectively capture record-to-record variability and system uncertainty.
Monte Carlo dropout is used to quantify epistemic uncertainty in predictions.
Validated on nonlinear systems subjected to stochastic seismic and wind loads.

Application of parametric Shallow Recurrent Decoder Network to magnetohydrodynamic flows in liquid metal blankets of fusion reactors

M. Lo Verso, C. Introini, E. Cervi, L. Savoldi, J. N. Kutz, A. Cammi

Time Series Efficient ML Theory

Introduction of SHRED as a data-driven approach for MHD state reconstruction.
Integration of SVD for dimensionality reduction enhances computational efficiency.
High reconstruction accuracy across various magnetic field configurations.
Ability to infer magnetic field dynamics from limited sensor data.

Training In-Context and In-Weights Mixtures Via Contrastive Context Sampling

Deeptanshu Malu, Deevyanshu Malu, Aditya Nemiwal, Sunita Sarawagi

NLP Large Language Models Theory

Inter-example similarity is crucial for the emergence of ICL during fine-tuning.
Contrastive-Context effectively balances ICL and IWL by sampling across similarity levels.
The method outperforms traditional fine-tuning approaches in various tasks and models.
Theoretical insights from a minimal model support the empirical findings.

Efficient and Principled Scientific Discovery through Bayesian Optimization: A Tutorial

Zhongwei Yu, Rasul Tutunov, Alexandre Max Maraval, Zikai Xie, Zhenzhi Tan, Jiankang Wang, Zijing Li, Liangliang Xu, Qi Yang, Jun Jiang, Sanzhong Luo, Zhenxiao Guo, Haitham Bou-Ammar, Jun Wang

Optimization

Bayesian Optimization formalizes the scientific discovery process, reducing reliance on trial-and-error.
The tutorial provides practical coding examples and theoretical foundations tailored for various audiences.
Real-world case studies validate the effectiveness of BO in optimizing experimental design in scientific research.
Key components of BO, such as surrogate models and acquisition functions, are essential for balancing exploration and exploitation.

Auction-Based Online Policy Adaptation for Evolving Objectives

Guruprerana Shabadi, Kaushik Mallik

Reinforcement Learning Robotics Optimization

Introduces a modular framework for adaptive policies in multi-objective reinforcement learning.
Utilizes an auction-based mechanism for dynamic coordination among competing objectives.
Achieves better performance than monolithic policies through concurrent training and environment-aware bidding.
Facilitates interpretability by allowing clear identification of the active policy and objective.

When Reward Hacking Rebounds: Understanding and Mitigating It with Representation-Level Signals

Rui Wu, Ruixiang Tang

Reinforcement Learning Large Language Models Optimization

Identification of a three-phase rebound pattern in reward hacking during RL training.
Demonstration that the shortcut concept direction is a strong indicator of hacking behavior.
Introduction of Advantage Modification, which integrates concept-level signals into training to mitigate hacking.
Use of a controlled environment-manipulation testbed to study reward hacking dynamics.

Pseudo-Quantized Actor-Critic Algorithm for Robustness to Noisy Temporal Difference Error

Taisuke Kobayashi

Reinforcement Learning Robotics Theory

Introduces the Pseudo-Quantized Actor-Critic (PQAC) algorithm for robust learning in RL.
Addresses the instability caused by noisy temporal difference errors in traditional RL methods.
Utilizes a sigmoid function to model optimality and achieve gradient vanishing for noise exclusion.
Demonstrates improved stability and efficiency in learning compared to baseline methods.

DDCL: Deep Dual Competitive Learning: A Differentiable End-to-End Framework for Unsupervised Prototype-Based Representation Learning

Giansalvo Cirrincione

Theory

Introduction of DDCL as the first fully differentiable end-to-end framework for unsupervised representation learning.
Replacement of external k-means clustering with an internal Dual Competitive Layer for direct optimization.
Theoretical analysis includes loss decomposition, collapse analysis, and global Lyapunov stability.
Empirical validation shows DDCL outperforms traditional methods by significant margins in clustering accuracy.

CANDI: Curated Test-Time Adaptation for Multivariate Time-Series Anomaly Detection Under Distribution Shift

HyunGi Kim, Jisoo Mok, Hyungyu Lee, Juhyeon Shin, Sungroh Yoon

Time Series

CANDI addresses the critical issue of distribution shift in MTSAD, which leads to increased false positives.
The framework employs False Positive Mining to curate informative samples for adaptation.
CANDI incorporates a lightweight Spatiotemporally-Aware Normality Adaptation module to update the model without compromising pre-trained knowledge.
The proposed method shows significant performance improvements over existing baselines, with a notable AUROC gain.

Smoothing the Landscape: Causal Structure Learning via Diffusion Denoising Objectives

Hao Zhu, Di Zhou, Donna Slonim

Graph Learning Theory Efficient ML

Introduction of Denoising Diffusion Causal Discovery (DDCD) for causal structure learning.
Utilization of denoising score matching to achieve smoother gradients and faster convergence.
Adaptive k-hop acyclicity constraint improves runtime efficiency.
DDCD-Smooth addresses the 'varsortability' problem, enhancing robustness to heterogeneous feature scales.

On the Role of Depth in the Expressivity of RNNs

Maude Lizaire, Michael Rizvi-Martel, Éric Dupuis, Guillaume Rabusseau

Theory Time Series NLP

Depth increases the expressivity of RNNs, enhancing memory capacity and input transformation capabilities.
2RNNs can compute higher-order polynomials as depth increases, unlike standard RNNs.
Multiplicative interactions in 2RNNs provide unique expressive capabilities that cannot be replicated by deep RNNs with only nonlinear activations.
Empirical results confirm theoretical insights, showing depth's impact on performance across various tasks.

Enhancing the Reliability of Medical AI through Expert-guided Uncertainty Modeling

Aleksei Khalin, Ekaterina Zaychenkova, Aleksandr Yugay, Andrey Goncharov, Sergey Korchagin, Alexey Zaytsev, Egor Ershov

Computer Vision Interpretability Efficient ML

Integration of expert knowledge improves uncertainty estimation in medical AI.
The proposed method effectively separates epistemic and aleatoric uncertainty.
A two-ensemble approach outperforms state-of-the-art uncertainty estimation methods.
Significant performance improvements were observed across multiple medical tasks.

Graph Neural Operator Towards Edge Deployability and Portability for Sparse-to-Dense, Real-Time Virtual Sensing on Irregular Grids

William Howes, Jason Yoo, Kazuma Kobayashi, Subhankar Sarkar, Farid Ahmed, Souvik Chakraborty, Syed Bahauddin Alam

Graph Learning Efficient ML

VIRSO provides accurate sparse-to-dense reconstruction for irregular geometries.
The framework is designed with edge deployability and power efficiency in mind.
Achieves mean relative L2 errors below 1% across various benchmarks.
Significantly reduces energy-delay product compared to traditional methods.

Universal Hypernetworks for Arbitrary Models

Xuanfeng Zhou

Computer Vision Graph Learning NLP

UHN is a fixed-architecture generator that can produce weights for various models without redesigning the generator.
It supports multi-model generalization and multi-task learning across different architectures.
UHN allows for recursive generation of hypernetworks, enhancing its flexibility and scalability.
Empirical results show UHN's competitive performance against direct training across diverse benchmarks.

Model Merging via Data-Free Covariance Estimation

Marawan Gamal Abdel Hameed, Derek Tam, Pascal Jr Tikeng Notsawo, Colin Raffel, Guillaume Rabusseau

Theory Efficient ML Optimization

Introduces ACTMat, a data-free method for estimating covariance matrices for model merging.
Revisits the interference minimization framework to enhance model merging without requiring training data.
Demonstrates superior performance of ACTMat over existing data-free merging methods across multiple benchmarks.
Addresses the limitations of traditional merging methods that rely on heuristics and lack theoretical justification.

Residuals-based Offline Reinforcement Learning

Qing Zhu, Xian Yu

Reinforcement Learning Optimization Theory

Introduces a residuals-based Bellman optimality operator for offline RL.
Addresses limitations of offline RL by generating unseen states through empirical residuals.
Develops a residuals-based offline DQN algorithm.
Demonstrates effectiveness in a stochastic CartPole environment.

Today's ML research,without the noise.

Today's ML research,
without the noise.