AI-generated summaries
Today's ML research, without the noise.
Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.
24 papers today · updated every 8 hours · 7 days of history
Auction-Based Online Policy Adaptation for Evolving Objectives
Reinforcement Learning
Robotics
Optimization
- Introduces a modular framework for multi-objective reinforcement learning using auction-based policy adaptation.
- Local policies compete through bids reflecting urgency, allowing for dynamic prioritization of objectives.
- Demonstrates superior performance compared to monolithic policies in dynamic environments.
- Enhances interpretability by allowing clear identification of active policies and objectives.
Summary
This paper addresses the challenge of multi-objective reinforcement learning (MORL) in dynamic environments where objectives can appear or disappear at runtime. The authors propose a modular framework that utilizes an auction-based mechanism for policy adaptation. Each objective is managed by a selfish local policy that bids for the right to execute actions based on the urgency of its corresponding state. This auction system allows for a dynamic trade-off among competing objectives, enabling the system to adapt quickly when objectives change. The framework is designed to be modular, allowing for easy addition or removal of policies as objectives evolve. The authors demonstrate that this approach outperforms traditional monolithic policies trained with proximal policy optimization (PPO) in complex environments, such as Atari Assault and a gridworld path-planning task. The modular design not only enhances performance but also improves interpretability, as it allows for clear identification of the active policy at any moment.
Methodology
The authors implemented a compositional reinforcement learning framework where each objective is managed by a local policy. These policies engage in a general-sum game, competing for action execution rights through an auction mechanism. Policies are trained concurrently using proximal policy optimization (PPO), with penalties imposed for dishonest bidding to ensure truthful urgency estimation.
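The bidding loop can be sketched as follows. This is a minimal illustration of the auction mechanism only: the `urgency` function and list-valued state are hypothetical stand-ins for the learned, PPO-trained bids described above.

```python
def urgency(state, objective):
    """Toy urgency signal: distance of this objective's tracked quantity
    from its setpoint (a hypothetical stand-in for a learned bid)."""
    return abs(state[objective])

def auction_step(state, objectives):
    """One auction round: each local policy bids its urgency and the
    highest bidder wins the right to execute its action this step."""
    bids = {obj: urgency(state, obj) for obj in objectives}
    winner = max(bids, key=bids.get)
    return winner, bids

# Objective 1 is currently far from its setpoint, so its policy outbids
# objective 0 and takes control for this step.
state = [0.2, -0.9]
winner, bids = auction_step(state, objectives=[0, 1])
```

Because objectives enter and leave only through the `objectives` list, adding or removing a policy at runtime requires no retraining of the others, which is the modularity claim above.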
Results
The proposed auction-based framework significantly outperformed monolithic policies in both Atari Assault and a gridworld path-planning task, achieving higher payoffs and demonstrating effective adaptation to changing objectives. The modular approach also facilitated faster adaptation and clearer interpretability of policy actions.
Implications
This work has potential applications in robotics, particularly in environments where tasks and objectives are dynamic and uncertain, such as autonomous navigation and resource allocation. The framework can be adapted to various multi-objective scenarios, enhancing decision-making in real-time systems.
Malliavin Calculus for Counterfactual Gradient Estimation in Adaptive Inverse Reinforcement Learning
Reinforcement Learning
Theory
Optimization
- Introduces a novel passive Langevin-based algorithm for adaptive inverse reinforcement learning.
- Utilizes Malliavin calculus to efficiently estimate counterfactual gradients conditioned on measure-zero events.
- Achieves optimal convergence rates without requiring trajectory resampling or kernel smoothing.
- Provides a comprehensive algorithmic framework for counterfactual gradient estimation.
Summary
This paper addresses the challenge of adaptive inverse reinforcement learning (IRL), which aims to reconstruct the loss function of a forward learner by passively observing its gradient dynamics during reinforcement learning (RL). The authors propose a novel Langevin-based algorithm that utilizes Malliavin calculus to efficiently estimate counterfactual gradients, which are essential for adaptive IRL but are conditioned on events of probability zero under the forward learner's trajectory. Traditional Monte Carlo methods are inefficient for this purpose, and kernel smoothing techniques suffer from slow convergence. By reformulating the counterfactual conditioning as a ratio of unconditioned expectations involving Malliavin derivatives, the authors achieve standard estimation rates. The paper details the derivation of necessary Malliavin derivatives and their adjoint Skorohod integral formulations, leading to a concrete algorithmic approach for counterfactual gradient estimation. The proposed method overcomes limitations of existing kernel-based Langevin algorithms and demonstrates improved convergence rates without the need for resampling or kernel smoothing. Numerical implementations validate the effectiveness of the proposed algorithm in recovering the forward learner's loss function in real time.
Methodology
The authors employ Malliavin calculus to reformulate counterfactual gradient estimation as a ratio of unconditioned expectations. They derive necessary derivatives and integral formulations to create an efficient algorithm for adaptive IRL, which replaces traditional kernel-based methods.
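The kernel-smoothing baseline that the Malliavin approach replaces can itself be written as a ratio of two unconditioned Monte Carlo expectations; a minimal sketch of that baseline is below (the paper swaps the Gaussian kernel weight for a Malliavin-derivative weight, removing the bandwidth bottleneck that slows this estimator down).

```python
import numpy as np

rng = np.random.default_rng(1)

def kernel_conditional_mean(g, samples, x0, h):
    """Kernel-smoothed estimate of E[g(X) | X = x0], written as a ratio
    of two unconditioned Monte Carlo expectations; accuracy degrades as
    the bandwidth h shrinks, which is the bottleneck the paper removes."""
    w = np.exp(-0.5 * ((samples - x0) / h) ** 2)  # Gaussian kernel weights
    return float(np.sum(w * g(samples)) / np.sum(w))

# X ~ N(0, 1); conditioning on the measure-zero event {X = 1} gives
# E[X^2 | X = 1] = 1 exactly.
samples = rng.normal(size=200_000)
est = kernel_conditional_mean(lambda x: x ** 2, samples, x0=1.0, h=0.05)
```

Only a vanishing fraction of the 200,000 draws lands near x0, which is why kernel smoothing converges slowly and why reweighting by Malliavin derivatives, which uses every sample, recovers the standard Monte Carlo rate.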
Results
The proposed Malliavin-based gradient estimator yields unbiased Monte Carlo estimators for counterfactual conditional expectations, achieving optimal convergence rates. Numerical experiments demonstrate effective recovery of the forward learner's loss function.
Implications
This work has significant implications for real-time adaptive IRL applications, particularly in scenarios where observing the complete trajectory of the forward learner is impractical. The methodology could enhance the efficiency and accuracy of learning algorithms in various domains, including robotics and automated decision-making systems.
Towards Intrinsically Calibrated Uncertainty Quantification in Industrial Data-Driven Models via Diffusion Sampler
Theory
Optimization
- Introduces a diffusion-based framework for uncertainty quantification in industrial models.
- Eliminates the need for post-hoc calibration by providing intrinsically calibrated predictive uncertainty.
- Demonstrates significant improvements in uncertainty calibration and predictive accuracy over existing methods.
- Evaluated on synthetic datasets and real-world industrial case studies.
Summary
This paper addresses the critical challenge of uncertainty quantification (UQ) in industrial data-driven models, which are essential for real-time monitoring of performance indicators that are difficult to measure directly. The authors propose a novel diffusion-based posterior sampling framework that inherently generates well-calibrated predictive uncertainty, thus eliminating the need for post-hoc calibration. The method is evaluated extensively on synthetic distributions, a Raman-based phenylacetic acid soft sensor benchmark, and a real ammonia synthesis case study. The results demonstrate significant improvements in both uncertainty calibration and predictive accuracy compared to existing UQ techniques. This work highlights the potential of diffusion samplers as a principled and scalable approach for enhancing uncertainty-aware modeling in industrial applications, ultimately fostering greater trust and reliability in data-driven decision-making processes.
Methodology
The authors developed a diffusion-based posterior sampling framework that utilizes Bayesian inference principles to produce calibrated predictive distributions. This approach focuses on faithful posterior sampling to accurately represent uncertainty without requiring additional calibration steps.
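Calibration of any posterior sampler can be checked by comparing nominal and empirical interval coverage; a minimal sketch follows, using a toy sampler that is calibrated by construction rather than the paper's diffusion model.

```python
import numpy as np

rng = np.random.default_rng(2)

def empirical_coverage(posterior_samples, y_true, level):
    """Fraction of test points whose true value falls inside the central
    `level` predictive interval built from posterior samples. For a
    well-calibrated sampler this should match `level` itself."""
    alpha = (1.0 - level) / 2.0
    lo = np.quantile(posterior_samples, alpha, axis=1)
    hi = np.quantile(posterior_samples, 1.0 - alpha, axis=1)
    return float(np.mean((y_true >= lo) & (y_true <= hi)))

# Toy sampler that is calibrated by construction: posterior draws come
# from the same N(0, 1) distribution that generated the observations.
n_points, n_draws = 2_000, 500
samples = rng.normal(size=(n_points, n_draws))
y_true = rng.normal(size=n_points)
cov90 = empirical_coverage(samples, y_true, level=0.90)
```

An intrinsically calibrated method is one whose raw samples pass this check across all levels, with no post-hoc recalibration step in between.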
Results
The proposed method achieved practical improvements in uncertainty calibration and predictive accuracy across various evaluations, including synthetic distributions and real-world industrial applications. The results indicate that the diffusion sampler effectively captures the true posterior distribution, leading to more reliable uncertainty estimates.
Implications
The findings suggest that the diffusion-based UQ framework can enhance the deployment of data-driven models in safety-critical industrial settings, enabling better decision-making and risk management. This approach may lead to broader acceptance and trust in data-driven technologies within process industries.
Feature Weighting Improves Pool-Based Sequential Active Learning for Regression
Optimization
Theory
Efficient ML
- Introduces feature weighting in distance computation for active learning in regression.
- Proposes five new active learning approaches that incorporate feature weighting.
- Demonstrates improved performance of feature-weighted methods over traditional unweighted methods.
- Extends the applicability of feature weighting to both single-task and multi-task regression problems.
Summary
This paper addresses the challenge of pool-based sequential active learning for regression (ALR), which aims to select a small number of unlabeled samples to label in order to build a more accurate regression model within a limited labeling budget. The author identifies that existing ALR methods fail to account for the varying importance of different features when calculating distances between samples, leading to sub-optimal sample selection. To remedy this, the paper proposes three feature-weighted single-task ALR approaches (FW-RD, FW-GSx, FW-iGS) and two multi-task approaches (FW-MT-GSx, FW-MT-iGS) that utilize ridge regression coefficients from previously labeled samples to weight features in distance computations. Extensive experiments demonstrate that these feature-weighted approaches consistently outperform their unweighted counterparts across both single-task and multi-task regression scenarios, indicating that feature weighting can enhance the performance of various regression models.
Methodology
The paper develops feature-weighted versions of existing active learning approaches by integrating ridge regression coefficients to adjust the importance of features in distance calculations. The proposed methods include FW-RD, FW-GSx, FW-iGS for single-task learning, and FW-MT-GSx, FW-MT-iGS for multi-task learning. The performance of these methods is evaluated through extensive experiments comparing them against their unweighted versions.
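The core idea, ridge coefficients as feature weights inside a greedy GSx-style distance criterion, can be sketched as below; the exact FW-GSx/FW-iGS update rules in the paper differ in detail, and the toy data here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def ridge_coefs(X, y, lam=1.0):
    """Closed-form ridge coefficients fitted on the labeled samples."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def fw_greedy_select(pool, labeled, w):
    """GSx-style diversity selection with feature-weighted distances:
    pick the pool sample farthest from its nearest labeled sample,
    where each feature is scaled by |ridge coefficient|."""
    scale = np.abs(w)
    dists = np.linalg.norm((pool[:, None, :] - labeled[None, :, :]) * scale,
                           axis=2)          # (n_pool, n_labeled)
    return int(np.argmax(dists.min(axis=1)))

# Toy problem: only feature 0 drives the target.
X_lab = rng.normal(size=(20, 2))
y_lab = 3.0 * X_lab[:, 0] + 0.1 * rng.normal(size=20)
w = ridge_coefs(X_lab, y_lab)
pool = np.array([[5.0, 0.0],    # far along the informative feature
                 [0.0, 5.0]])   # far along the irrelevant feature
pick = fw_greedy_select(pool, X_lab, w)
```

An unweighted criterion would treat the two pool points as equally novel; the ridge weights make the selector favor diversity along the feature that actually matters for the regression target.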
Results
The experimental results show that all five proposed feature-weighted ALR approaches significantly outperform their corresponding unweighted versions. This improvement is consistent across both linear and nonlinear regression models, indicating the robustness and effectiveness of the feature weighting strategy.
Implications
The findings suggest that incorporating feature weighting can lead to more efficient sample selection in active learning scenarios, potentially reducing labeling costs and improving model accuracy. The proposed methods can be easily adapted for use in other domains such as stream-based active learning and classification tasks.
Neural network methods for two-dimensional finite-source reflector design
Optimization
- Introduces a neural network parameterization for reflector design that addresses finite-source light distribution.
- Develops two differentiable objective functions for optimizing reflector height.
- Demonstrates superior performance of the neural network approach over traditional deconvolution methods.
- Provides a comprehensive evaluation across multiple benchmarks, including height constraints.
Summary
This paper addresses the inverse design problem of creating two-dimensional reflectors that can transform light from a finite, extended source into a desired far-field distribution. The authors propose a novel approach using neural network parameterization to model the reflector height, coupled with two differentiable objective functions. The first function is a direct change-of-variables loss that facilitates the mapping of the source distribution through the learned inverse function. The second is a mesh-based loss that allows for continuous mapping back to the source, even in cases of discontinuous sources. The gradients for optimization are computed using automatic differentiation and a robust quasi-Newton method. The authors also establish a baseline comparison with a deconvolution method based on a simplified finite-source approximation. Through four benchmark tests, including scenarios with continuous and discontinuous sources, the neural network approach demonstrates faster convergence and lower normalized mean absolute error (NMAE) compared to the deconvolution method, while naturally accommodating height constraints. The paper concludes with a discussion on extending the method to three-dimensional designs using iterative correction schemes.
Methodology
The authors utilize a neural network to parameterize the reflector height and develop two differentiable objective functions: a direct change-of-variables loss and a mesh-based loss. They employ automatic differentiation for gradient computation and optimize using a quasi-Newton method. A baseline deconvolution method is also formulated for comparison, based on a simplified finite-source approximation.
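A one-dimensional analogue of the change-of-variables loss looks roughly like the sketch below, with a monotone discretized map standing in for the paper's neural network parameterization and a quasi-Newton optimizer (L-BFGS-B) in place of their setup; the uniform source on [0, 1] and the target density are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Source: uniform light on [0, 1]. Target far-field density on [0, 1],
# chosen so that it integrates to 1:
q = lambda u: 1.0 + 0.5 * np.cos(2.0 * np.pi * u)

x = np.linspace(0.0, 1.0, 201)

def transport_map(params):
    """Monotone map T: [0, 1] -> [0, 1], built from unconstrained
    parameters via positive (softplus) increments."""
    inc = np.log1p(np.exp(params))
    T = np.concatenate([[0.0], np.cumsum(inc)])
    return T / T[-1]

def cov_loss(params):
    """Change-of-variables residual: a correct map for a uniform source
    satisfies q(T(x)) * T'(x) = 1 everywhere on [0, 1]."""
    T = transport_map(params)
    return np.mean((q(T) * np.gradient(T, x) - 1.0) ** 2)

p0 = np.zeros(len(x) - 1)                    # initial map is the identity
res = minimize(cov_loss, p0, method="L-BFGS-B", options={"maxiter": 100})
initial_loss, final_loss = cov_loss(p0), cov_loss(res.x)
```

The softplus increments enforce monotonicity by construction, which is the same role height constraints play in the reflector setting; in the paper the map is implied by the learned reflector height and gradients come from automatic differentiation rather than finite differences.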
Results
The neural network approach converges more rapidly and achieves consistently lower NMAE across all benchmarks compared to the deconvolution method. It effectively handles height constraints and demonstrates robustness in both continuous and discontinuous source scenarios.
Implications
The proposed method has significant implications for optical design, particularly in applications requiring precise control of light propagation, such as advanced illumination systems, solar concentrators, and optical communications. The ability to extend the method to three-dimensional designs opens up further possibilities in complex beam shaping and freeform optics.
MATA-Former & SIICU: Semantic Aware Temporal Alignment for High-Fidelity ICU Risk Prediction
Time Series
Multimodal
- Introduction of MATA-Former, a transformer architecture that aligns clinical semantics with temporal dynamics.
- Development of Plateau-Gaussian Soft Labeling (PSL) for continuous risk modeling instead of binary classification.
- Creation of the SIICU dataset with over 506,000 expert-annotated clinical events to enhance evaluation of ICU risk prediction models.
- Demonstration of superior performance in risk prediction from text-intensive, irregular clinical time series.
Summary
This paper addresses the challenge of predicting clinical risks in Intensive Care Units (ICUs) by proposing a novel framework called the Medical-semantics Aware Time-ALiBi Transformer (MATA-Former). The authors argue that traditional methods fail to capture the complex relationships between clinical events due to their reliance on chronological proximity rather than intrinsic pathological dependencies. MATA-Former utilizes event semantics to dynamically adjust attention weights, allowing the model to prioritize causal relevance over mere time lags. Additionally, the authors introduce Plateau-Gaussian Soft Labeling (PSL), which reformulates binary classification into a continuous multi-horizon regression framework, enabling a more nuanced understanding of risk evolution over time. The framework is evaluated on a newly constructed dataset, the Semantic-Integrated Intensive Care Unit (SIICU), which includes over 506,000 expert-annotated clinical events. The results demonstrate that MATA-Former outperforms existing methods in capturing risks from both structured and unstructured clinical data, showcasing robust generalization capabilities across different datasets.
Methodology
The authors propose MATA-Former, which integrates unified clinical embeddings with a semantic-guided temporal attention mechanism to dynamically generate query-specific focus windows. This allows the model to prioritize historical events based on their pathological relevance rather than their physical proximity. PSL is introduced to transform binary classification into a continuous regression framework, enabling the capture of dynamic risk trajectories throughout the ICU stay.
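One plausible form of Plateau-Gaussian soft labels — risk pinned to 1 on a plateau just before the event and decaying as a Gaussian further back — is sketched below; the `plateau`/`sigma` parameters and the exact functional form are assumptions, not the paper's specification.

```python
import numpy as np

def psl_labels(times, event_time, plateau=4.0, sigma=6.0):
    """Illustrative Plateau-Gaussian soft labels: risk equals 1 on a
    plateau of length `plateau` just before the event, decays as a
    Gaussian of width `sigma` further back, and is zero afterwards."""
    gap = np.maximum(event_time - times - plateau, 0.0)
    return np.exp(-0.5 * (gap / sigma) ** 2) * (times <= event_time)

times = np.arange(0, 25, 1.0)                 # hours into the ICU stay
labels = psl_labels(times, event_time=20.0)   # adverse event at hour 20
```

Regressing against such a curve gives the model a graded target for every timestep, rather than a single binary flag at the event, which is what lets it represent how risk builds over the stay.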
Results
The evaluation of MATA-Former on the SIICU dataset and the MIMIC-IV dataset shows that it significantly outperforms existing methods in terms of predictive accuracy and generalization. The framework effectively captures the complexities of clinical risk evolution, demonstrating its capability to utilize both structured and unstructured data.
Implications
The proposed framework has the potential to improve Clinical Decision Support Systems (CDSS) in ICUs by providing more accurate risk predictions, ultimately leading to better patient outcomes. The SIICU dataset can serve as a valuable resource for future research in clinical risk modeling.
Crystalite: A Lightweight Transformer for Efficient Crystal Modeling
Generative Models
Graph Learning
Efficient ML
- Introduction of the Geometry Enhancement Module (GEM) for direct geometric biasing in Transformers.
- Replacement of one-hot atom representations with a compact chemically informed tokenization.
- Crystalite achieves state-of-the-art results in crystal structure prediction and generation.
- Significantly faster sampling compared to traditional geometry-heavy models.
Summary
The paper introduces Crystalite, a lightweight diffusion Transformer designed for efficient modeling of crystalline materials. Traditional generative models for crystals often utilize equivariant graph neural networks (GNNs), which, while effective, are computationally expensive and slow. Crystalite addresses these challenges by incorporating two novel components: Subatomic Tokenization, which replaces high-dimensional one-hot atom representations with a more compact and chemically structured format, and the Geometry Enhancement Module (GEM), which integrates periodic geometric information directly into the attention mechanism of the Transformer. This approach maintains the simplicity and efficiency of standard Transformers while enhancing their capability to model crystal structures. The authors demonstrate that Crystalite achieves state-of-the-art performance on crystal structure prediction benchmarks and excels in de novo generation tasks, outperforming existing geometry-heavy alternatives in terms of sampling speed.
Methodology
Crystalite employs a lightweight diffusion Transformer architecture that integrates the GEM to inject periodic geometric information into the attention mechanism. The model uses Subatomic Tokenization for atom representation, enhancing the efficiency of the diffusion process. The architecture preserves the standard multi-head attention framework while incorporating additive geometric biases to improve performance on crystal modeling tasks.
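The additive-bias idea can be sketched as standard attention whose logits are shifted by a term built from periodic pairwise distances; the minimum-image convention on a diagonal lattice and the simple `-gamma * dist` bias below are illustrative assumptions, since GEM's exact form is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(7)

def periodic_distances(frac_coords, lattice):
    """Pairwise atom distances under periodic boundary conditions via
    the minimum-image convention (exact for a diagonal lattice)."""
    diff = frac_coords[:, None, :] - frac_coords[None, :, :]
    diff -= np.round(diff)                     # wrap into [-0.5, 0.5)
    return np.linalg.norm(diff @ lattice, axis=-1)

def biased_attention(Q, K, V, dist, gamma=1.0):
    """Scaled dot-product attention with an additive geometric bias:
    atoms at small periodic distance receive larger logits."""
    logits = Q @ K.T / np.sqrt(Q.shape[-1]) - gamma * dist
    logits -= logits.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(logits)
    return (w / w.sum(axis=-1, keepdims=True)) @ V

n_atoms, d = 4, 8
frac = rng.random((n_atoms, 3))                # fractional coordinates
lattice = np.diag([5.0, 5.0, 5.0])             # cubic cell
dist = periodic_distances(frac, lattice)
out = biased_attention(rng.normal(size=(n_atoms, d)),
                       rng.normal(size=(n_atoms, d)),
                       rng.normal(size=(n_atoms, d)), dist)
```

Because the bias is additive inside otherwise standard multi-head attention, the architecture keeps the speed of a plain Transformer instead of paying for equivariant message passing.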
Results
Crystalite demonstrates superior performance on crystal structure prediction benchmarks, achieving the best S.U.N. discovery score among evaluated models. It also shows enhanced de novo generation capabilities while significantly reducing sampling time compared to more complex, geometry-heavy alternatives.
Implications
The development of Crystalite has significant implications for materials science, particularly in the discovery and design of novel crystalline materials with desired properties. Its efficiency and performance could facilitate faster exploration of the vast compositional space in materials research, potentially accelerating advancements in various applications such as electronics, photonics, and catalysis.
Batched Contextual Reinforcement: A Task-Scaling Law for Efficient Reasoning
NLP
Large Language Models
Reinforcement Learning
Efficient ML
- Introduction of Batched Contextual Reinforcement (BCR) for efficient reasoning in LLMs.
- Discovery of a task-scaling law where increasing concurrent problems reduces token usage while maintaining accuracy.
- Demonstration of a 'free lunch' phenomenon where accuracy improves despite reduced verbosity.
- Emergence of self-regulated efficiency in models, eliminating redundant reasoning loops.
Summary
This paper introduces Batched Contextual Reinforcement (BCR), a novel training paradigm aimed at enhancing the efficiency of reasoning in Large Language Models (LLMs) while maintaining or improving accuracy. Traditional methods for improving efficiency often lead to degraded reasoning quality or require complex training processes. BCR simplifies this by allowing models to solve multiple problems simultaneously within a shared context window, rewarding them based solely on per-instance accuracy. The authors identify a new task-scaling law, showing that as the number of concurrent problems increases, per-problem token usage decreases while accuracy remains relatively stable. This challenges the conventional accuracy-efficiency trade-off, revealing a 'free lunch' phenomenon where models can achieve better accuracy with reduced verbosity. The study demonstrates that BCR can reduce token usage by 15.8% to 62.6% across different model sizes while improving performance on major mathematical benchmarks. Furthermore, qualitative analyses indicate that models trained with BCR develop self-regulated efficiency, autonomously eliminating redundant reasoning processes. The findings suggest that BCR provides a stable, constraint-based alternative for length control in LLMs, unlocking latent high-density reasoning capabilities without explicit supervision.
Methodology
The authors propose BCR, which involves training models to solve N problems simultaneously within a shared context window, rewarded by per-instance accuracy. This method creates an implicit token budget that encourages efficient reasoning without the need for explicit length penalties or complex training structures.
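The reward computation reduces to parsing N answers out of one shared completion and scoring each problem independently; a sketch follows, with a hypothetical `Answer i:` output format standing in for whatever format the trained model actually uses.

```python
def bcr_reward(completion, answers, tag="Answer {i}:"):
    """Per-instance rewards for a batched prompt: the model answers N
    problems in one completion; each is scored independently on
    exact-match accuracy (the answer format here is illustrative)."""
    rewards = []
    for i, gold in enumerate(answers, start=1):
        marker = tag.format(i=i)
        pred = None
        for line in completion.splitlines():
            if line.startswith(marker):
                pred = line[len(marker):].strip()
        rewards.append(1.0 if pred == gold else 0.0)
    return rewards

completion = "Answer 1: 42\nAnswer 2: 17\nAnswer 3: 9"
rewards = bcr_reward(completion, answers=["42", "18", "9"])
```

Since all N solutions must fit in one context window, the shared budget itself penalizes verbosity, with no explicit length term in the reward.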
Results
BCR achieves a reduction in token usage by 15.8% to 62.6% across model sizes (1.5B and 4B) while consistently maintaining or improving accuracy on five major mathematical benchmarks. The method reveals a task-scaling law that allows for controllable throughput and accuracy trade-offs.
Implications
The findings suggest that BCR can significantly enhance the efficiency of reasoning in LLMs, making it a valuable framework for practical applications in areas requiring complex reasoning, such as mathematical problem-solving and other cognitive tasks. This could lead to more efficient deployment of LLMs in real-world applications, reducing computational costs while improving performance.
Robust Graph Representation Learning via Adaptive Spectral Contrast
Graph Learning
Theory
- Identifies a spectral dilemma in graph contrastive learning regarding the trade-off between high-frequency signal utility and noise sensitivity.
- Introduces ASPECT, a framework that utilizes a reliability-aware spectral gating mechanism to improve robustness in graph representation learning.
- Demonstrates that existing global spectral fusion strategies are suboptimal for mixed graphs with varying node-wise frequency preferences.
- Achieves state-of-the-art performance on 8 out of 9 benchmarks, particularly on heterophilic graphs.
Summary
This paper addresses the challenges of spectral graph contrastive learning, particularly the vulnerability of high-frequency signals to noise, which is critical for encoding heterophilic structures. The authors identify a spectral dilemma where high-frequency components, while essential for capturing heterophily, exhibit higher variance under perturbations. They propose ASPECT, a novel framework that employs a reliability-aware spectral gating mechanism to dynamically adjust the reliance on frequency channels based on their stability against adversarial perturbations. This approach is formulated as a minimax game, optimizing a node-wise gate against a spectral adversary targeting energy distributions. Empirical evaluations demonstrate that ASPECT achieves state-of-the-art performance on 8 out of 9 benchmarks, effectively distinguishing meaningful structural heterophily from incidental noise, thereby enhancing robustness in graph representation learning.
Methodology
The authors develop ASPECT, which formulates a minimax game to optimize a node-wise gate that adjusts the reliance on frequency channels based on their stability against perturbations. This is achieved through a Rayleigh quotient penalty targeting spectral energy distributions, allowing the encoder to learn robust representations while filtering out unreliable high-frequency noise.
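The two ingredients — a Rayleigh-quotient measure of spectral smoothness and a node-wise gate mixing frequency channels — can be sketched on a toy path graph; the fixed gate value below is illustrative, whereas ASPECT learns it adversarially per node.

```python
import numpy as np

def rayleigh_quotient(L, x):
    """Spectral smoothness of signal x on a graph with Laplacian L:
    small for low-frequency (smooth) signals, large for high-frequency."""
    return float(x @ L @ x) / float(x @ x)

def gated_mix(h_low, h_high, gate):
    """Node-wise convex combination of low- and high-frequency channels;
    the gate leans on whichever channel is more reliable at each node."""
    g = gate[:, None]
    return g * h_low + (1.0 - g) * h_high

# Path graph on 4 nodes.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0],
              [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
smooth = np.ones(4)                        # constant signal: frequency 0
rough = np.array([1.0, -1.0, 1.0, -1.0])   # alternating: high frequency
rq_smooth = rayleigh_quotient(L, smooth)
rq_rough = rayleigh_quotient(L, rough)
mixed = gated_mix(np.ones((4, 2)), -np.ones((4, 2)), gate=np.full(4, 0.75))
```

The spectral adversary in the paper perturbs exactly this kind of energy distribution, and the gate learns to down-weight the high-frequency channel wherever its Rayleigh quotient is unstable under perturbation.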
Results
ASPECT outperforms existing methods on 8 out of 9 benchmarks, particularly excelling in scenarios involving heterophilic graphs. The analysis of the learned gate values indicates a strong correlation with local homophily, confirming the framework's effectiveness in disentangling structural signals from noise.
Implications
The findings suggest that enhancing robustness in spectral graph learning is crucial for developing models that generalize well under mixed structural conditions. This work could inform future research in graph representation learning, particularly in applications involving complex graph structures with varying node characteristics.
Coupled Query-Key Dynamics for Attention
NLP
Large Language Models
Efficient ML
- Introduces Coupled QK Dynamics, enhancing attention mechanisms by evolving queries and keys jointly.
- Achieves significant improvements in language modeling perplexity with minimal additional parameters.
- Structural ablation studies confirm that coupling is the key factor for performance gains.
- Effectiveness varies by corpus, with benefits observed in domain-coherent texts but not in heterogeneous datasets.
Summary
This paper introduces a novel framework for attention mechanisms in neural networks, termed Coupled Query-Key (QK) Dynamics. Unlike standard attention, which computes scores from static and independent projections of the input, the proposed method evolves queries and keys jointly through shared learned dynamics prior to scoring. This coupling enhances language modeling performance and training stability, as evidenced by significant reductions in perplexity on the WikiText-103 dataset. The authors demonstrate that coupled dynamics achieves a perplexity of 22.55–22.62 at 60M parameters, outperforming standard attention's 24.22 with only a marginal increase in parameters. Through structural ablation studies, they isolate the benefits of coupling from other factors, revealing that the coupling itself, rather than the specific integrator used (Hamiltonian or Euler), is crucial for performance improvements. The paper also characterizes the conditions under which coupling is beneficial, noting its effectiveness on domain-coherent text while showing degradation on heterogeneous datasets. The findings suggest that coupled dynamics can serve as a sample-efficiency mechanism, requiring fewer tokens for similar performance compared to standard attention when trained for longer durations.
Methodology
The authors propose a framework for evolving queries and keys through shared learned dynamics before scoring, utilizing both Hamiltonian and Euler integrators. They conduct structural ablation studies to isolate the effects of coupling and evaluate performance across various datasets and model sizes.
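A minimal Euler-integrator version of the coupling might look like the sketch below; the `tanh`-of-a-shared-matrix dynamics is an assumption (a random matrix stands in for the learned dynamics), and the paper's Hamiltonian variant is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(10)

def couple_qk(Q, K, W, steps=3, eps=0.1):
    """Evolve queries and keys jointly before scoring: each stream's
    Euler update depends on the *other* stream through the shared
    dynamics W, which is what couples them."""
    for _ in range(steps):
        Q, K = Q + eps * np.tanh(K @ W), K + eps * np.tanh(Q @ W)
    return Q, K

def attention_scores(Q, K):
    s = Q @ K.T / np.sqrt(Q.shape[-1])
    s -= s.max(axis=-1, keepdims=True)     # numerical stability
    w = np.exp(s)
    return w / w.sum(axis=-1, keepdims=True)

n, d = 5, 16
Q0 = rng.normal(size=(n, d))
K0 = rng.normal(size=(n, d))
W = rng.normal(scale=0.1, size=(d, d))     # stand-in for learned dynamics
Qc, Kc = couple_qk(Q0, K0, W)
scores = attention_scores(Qc, Kc)
```

Only W is added relative to standard attention, which is consistent with the marginal parameter overhead reported above; the ablations suggest the coupling itself, not the choice of integrator, drives the gains.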
Results
Coupled QK Dynamics achieves a perplexity of 22.55–22.62 on WikiText-103 at 60M parameters, a 6.6-6.9% improvement over standard attention. The method shows consistent benefits on domain-coherent datasets like WikiText-103 and PubMed, while performance degrades on heterogeneous web text. At larger model sizes (350M), the advantage narrows, with Differential Attention surpassing coupled dynamics.
Implications
The findings suggest that incorporating coupled dynamics into attention mechanisms can lead to more stable training and improved performance in language modeling tasks. This approach may also inform future developments in transformer architectures and other applications requiring efficient attention mechanisms.
Koopman-Based Nonlinear Identification and Adaptive Control of a Turbofan Engine
Optimization
Theory
Robotics
- Development of a physics-based component-level model for turbofan engine control.
- Introduction of a meta-heuristic extended dynamic mode decomposition for accurate dynamic modeling.
- Creation of two controllers: AKMPC and K-FBLC, with AKMPC showing superior robustness.
- Demonstration of the Koopman model's flexibility across different control objectives.
Summary
This paper explores the application of Koopman operator-based methods for the multivariable control of a two-spool turbofan engine. A physics-based component-level model is developed to generate training data and validate the controllers. The author introduces a meta-heuristic extended dynamic mode decomposition, which utilizes a cost function to effectively capture spool-speed dynamics and engine pressure ratio (EPR). This allows for the creation of a single Koopman model that can be adapted for various control objectives. Two controllers are developed based on the identified time-varying Koopman model: an adaptive Koopman-based model predictive controller (AKMPC) with a disturbance observer and a Koopman-based feedback linearization controller (K-FBLC) as a benchmark. The performance of these controllers is evaluated across two control strategies—spool speeds and EPR—under both sea-level and varying flight conditions. The findings indicate that the identification approach provides accurate predictions for spool speeds and EPR, facilitating the flexible reuse of the Koopman model across different control formulations. While both control strategies yield similar performance in steady conditions, the AKMPC demonstrates enhanced robustness compared to the K-FBLC under varying flight conditions, effectively compensating for model mismatches. Additionally, the EPR control strategy is shown to improve thrust response, underscoring the potential of the Koopman-based control framework for robust turbofan engine management.
Methodology
The study employs a physics-based component-level model to generate training data and validate control strategies. A meta-heuristic extended dynamic mode decomposition is developed to create a single Koopman model. Two control strategies are implemented: an adaptive Koopman-based model predictive controller (AKMPC) and a feedback linearization controller (K-FBLC). The performance of these controllers is assessed under different flight conditions.
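Plain extended dynamic mode decomposition — without the paper's meta-heuristic dictionary search or turbofan-specific cost function — fits a linear Koopman matrix on lifted snapshot pairs by least squares; a toy scalar example:

```python
import numpy as np

rng = np.random.default_rng(11)

def edmd(X, Y, psi):
    """Extended dynamic mode decomposition: lift snapshot pairs (X, Y)
    with dictionary psi and solve the least-squares Koopman matrix K
    such that psi(Y) ~= psi(X) @ K."""
    PX, PY = psi(X), psi(Y)
    K, *_ = np.linalg.lstsq(PX, PY, rcond=None)
    return K

# Toy nonlinear system x_{t+1} = 0.9 x_t + 0.1 x_t^2, which is exactly
# linear in the first component of the dictionary [x, x^2, x^3, x^4].
psi = lambda x: np.column_stack([x, x ** 2, x ** 3, x ** 4])
x = rng.uniform(-0.5, 0.5, size=400)
y = 0.9 * x + 0.1 * x ** 2
K = edmd(x, y, psi)

# One-step prediction through the lifted linear model.
pred = (psi(np.array([0.3])) @ K)[0, 0]
true = 0.9 * 0.3 + 0.1 * 0.3 ** 2
```

Once K is identified, linear tools such as MPC or feedback linearization apply directly in the lifted space, which is what the AKMPC and K-FBLC controllers above exploit for the engine model.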
Results
The proposed identification approach successfully predicts spool speeds and EPR, allowing for flexible application of the Koopman model. The AKMPC outperforms the K-FBLC in terms of robustness under varying flight conditions, while both controllers achieve comparable performance in steady conditions. The EPR control strategy is found to improve thrust response.
Implications
The findings suggest that Koopman-based control methodologies can significantly enhance the robustness and adaptability of turbofan engine management systems, potentially leading to improved fuel efficiency and operational flexibility in aviation.
Residuals-based Offline Reinforcement Learning
Reinforcement Learning
Optimization
Theory
- Introduces a residuals-based framework for offline reinforcement learning that addresses data coverage limitations.
- Defines a residuals-based Bellman optimality operator that incorporates estimation errors into policy optimization.
- Develops a residuals-based offline deep Q-learning algorithm and demonstrates its effectiveness in a stochastic environment.
- Provides finite-sample guarantees and conditions for asymptotic optimality of the proposed methods.
Read more
Residuals-based Offline Reinforcement Learning
Summary
This paper addresses the challenges of offline reinforcement learning (RL), which relies on previously collected data without real-time interaction with the environment. The authors propose a novel residuals-based offline RL framework that incorporates estimation errors in transition dynamics into policy optimization. By defining a residuals-based Bellman optimality operator, the framework allows for learning policies without the stringent requirement of data coverage across all state-action pairs. The authors develop a residuals-based offline deep Q-learning (DQN) algorithm and demonstrate its effectiveness in a stochastic CartPole environment. The proposed method not only mitigates issues related to distribution shift but also enables the generation of unseen states through empirical residuals, thereby enhancing the learning process in high-stakes applications where traditional online RL methods are impractical.
Methodology
The authors construct an estimated transition model from static offline data using supervised learning. They compute empirical residuals to capture discrepancies between the learned model and true dynamics. By sampling these residuals, they generate trajectories for training policies, allowing for on-policy training and addressing distribution shift.
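The residual-replay idea can be sketched in a few lines; everything below (the 1-D dynamics, the hand-fixed model, the linear policy) is illustrative only, not the paper's implementation:

```python
import random

random.seed(0)

def true_step(s, a):
    # Unknown stochastic dynamics the offline data came from (toy example).
    return 0.9 * s + a + random.gauss(0.0, 0.1)

# Offline dataset of (state, action, next_state) transitions.
data = [(s, a, true_step(s, a)) for s in [0.0, 0.5, 1.0] for a in [-1.0, 0.0, 1.0]]

def model_step(s, a):
    # "Learned" transition model; here just the deterministic part, assumed
    # known for simplicity. In practice it is fit by supervised learning.
    return 0.9 * s + a

# Empirical residuals capture the discrepancy between the learned model
# and the true dynamics.
residuals = [s_next - model_step(s, a) for (s, a, s_next) in data]

def rollout(s0, policy, horizon):
    # Generate an unseen trajectory by replaying sampled residuals on-policy.
    traj, s = [s0], s0
    for _ in range(horizon):
        s = model_step(s, policy(s)) + random.choice(residuals)
        traj.append(s)
    return traj

traj = rollout(0.2, lambda s: -0.5 * s, horizon=5)
```

Sampling residuals rather than assuming a noise model is what lets the framework generate plausible unseen states from purely static data.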
Results
The proposed residuals-based offline DQN algorithm was tested in a stochastic CartPole environment, demonstrating improved performance over traditional offline RL methods. The framework showed that it could effectively generate unseen states and mitigate the impact of distribution shift, leading to more reliable policy evaluations.
Implications
This work has significant implications for high-stakes applications in fields such as healthcare, transportation, and energy, where offline RL can be safely deployed without the risks associated with online trial-and-error learning. The framework can enhance decision-making processes in environments where real-time interaction is not feasible.
UQ-SHRED: uncertainty quantification of shallow recurrent decoder networks for sparse sensing via engression
Time Series
Theory
Efficient ML
- UQ-SHRED provides a distributional learning framework for valid uncertainty quantification in sparse sensing.
- The method combines noise injection with energy score minimization, maintaining computational efficiency.
- Theoretical guarantees are established for the learned conditional distribution, supporting its use in uncertainty-aware applications.
- UQ-SHRED is validated across multiple scientific datasets, showcasing its effectiveness in various domains.
Read more
UQ-SHRED: uncertainty quantification of shallow recurrent decoder networks for sparse sensing via engression
Summary
The paper introduces UQ-SHRED, a novel framework for uncertainty quantification in the context of reconstructing high-dimensional spatiotemporal fields from sparse sensor measurements. Building on the SHallow REcurrent Decoder (SHRED) architecture, UQ-SHRED addresses the critical limitation of uncertainty estimation in complex and data-scarce environments. The framework employs a distributional learning approach through a method called engression, which allows for the modeling of predictive distributions conditioned on sensor history. By injecting stochastic noise into sensor inputs and utilizing an energy score loss for training, UQ-SHRED efficiently generates well-calibrated predictive distributions without the need for extensive computational resources or multiple network architectures. The authors validate UQ-SHRED on various real-world datasets, including turbulent flow and atmospheric dynamics, demonstrating its robustness and effectiveness across diverse scientific applications. The paper also includes ablation studies to analyze the impact of different model settings on performance, confirming the framework's capability for valid uncertainty quantification in sparse sensing scenarios.
Methodology
UQ-SHRED utilizes a distributional learning framework that incorporates noise injection into the input of the SHRED architecture. The model is trained using an energy score loss to optimize the predictive distribution of spatial states based on sensor measurements. This approach allows for uncertainty to be modeled throughout the network without requiring additional architectural modifications. At inference, the model generates samples from the conditional predictive distribution by propagating input noise through the trained network.
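The energy score that drives engression-style training is a standard proper scoring rule; a minimal 1-D version (UQ-SHRED applies it to full spatial fields, not scalars) looks like this:

```python
import random

def energy_score(samples, y):
    # ES = E|X - y| - 0.5 * E|X - X'| for samples X, X' from the
    # predictive distribution; lower is better.
    m = len(samples)
    term1 = sum(abs(x - y) for x in samples) / m
    term2 = sum(abs(a - b) for a in samples for b in samples) / (2 * m * m)
    return term1 - term2

random.seed(1)
obs = 0.0
good = [random.gauss(0.0, 1.0) for _ in range(200)]  # matches the observation
bad = [random.gauss(3.0, 1.0) for _ in range(200)]   # shifted predictive dist.

# A sample cloud centred on the observation scores lower (better).
print(energy_score(good, obs) < energy_score(bad, obs))  # prints True
```

Minimizing this score rewards both accuracy (first term) and sample diversity (second term), which is why noise injection alone suffices to produce calibrated predictive distributions.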
Results
The UQ-SHRED framework demonstrated effective uncertainty quantification across five complex real-world datasets, including sea-surface temperature, turbulent flows, neural activity, solar activity, and propulsion physics. The results indicated that UQ-SHRED produced well-calibrated confidence intervals and maintained robustness across diverse applications. The ablation studies provided insights into how various hyperparameters affected the quality of uncertainty estimates.
Implications
The development of UQ-SHRED has significant implications for scientific applications that require reliable uncertainty quantification, such as risk assessment, anomaly detection, and decision-making under uncertainty. The framework's ability to provide valid uncertainty estimates can enhance the safety and reliability of systems in fields like fluid dynamics, neuroscience, and atmospheric sciences.
FourierMoE: Fourier Mixture-of-Experts Adaptation of Large Language Models
NLP
Large Language Models
Efficient ML
- FourierMoE integrates MoE architecture with inverse discrete Fourier transform (IDFT) for frequency-aware adaptation.
- The method addresses task interference and representation deficiency in multi-task fine-tuning settings.
- FourierMoE employs a frequency-adaptive router and learns complex coefficients to capture both phase and amplitude information.
- Extensive evaluations show superior performance across various benchmarks with fewer trainable parameters compared to existing methods.
Read more
FourierMoE: Fourier Mixture-of-Experts Adaptation of Large Language Models
Summary
The paper introduces FourierMoE, a novel adaptation method for large language models (LLMs) that leverages the mixture-of-experts (MoE) architecture in the spectral domain. Traditional parameter-efficient fine-tuning (PEFT) methods face challenges in multi-task settings due to task interference and representational limitations. FourierMoE addresses these issues by reformulating adaptation through spectral analysis, revealing that different tasks exhibit unique frequency energy distributions and that LLM layers have varying frequency sensitivities. The proposed method employs a frequency-adaptive router to allocate tokens to experts that specialize in distinct frequency bands, allowing for more effective adaptation. Each expert learns conjugate-symmetric complex coefficients, ensuring lossless reconstruction into real-valued spatial weights. Extensive evaluations across 28 benchmarks demonstrate that FourierMoE consistently outperforms existing methods in both single-task and multi-task scenarios while utilizing significantly fewer trainable parameters, showcasing the potential of spectral-domain adaptation for efficient LLM fine-tuning.
Methodology
FourierMoE reformulates the adaptation of LLMs in the spectral domain, utilizing a frequency-adaptive router to direct tokens to specialized experts based on distinct frequency bands. Each expert learns conjugate-symmetric complex coefficients, allowing for a comprehensive representation of spectral information while ensuring lossless reconstruction into real-valued weights.
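The conjugate-symmetry property the experts rely on is standard Fourier analysis: a spectrum with c[n-k] = conj(c[k]) inverts to purely real values. A toy-sized check (not the paper's code):

```python
import cmath

def idft(coeffs):
    # Plain inverse discrete Fourier transform.
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

n = 8
# Build coefficients satisfying c[n-k] = conj(c[k]), with c[0] real.
c = [0j] * n
c[0] = 4.0 + 0j
c[1] = 1.0 + 2.0j
c[n - 1] = c[1].conjugate()
c[3] = -0.5 + 0.25j
c[n - 3] = c[3].conjugate()

weights = idft(c)
# Imaginary parts vanish up to floating-point error: real spatial weights.
```

This is what makes the reconstruction into real-valued spatial weights lossless: no information is discarded when the imaginary parts are dropped.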
Results
The results indicate that FourierMoE outperforms competitive baselines across 28 benchmarks, demonstrating enhanced performance in both single-task and multi-task settings while significantly reducing the number of trainable parameters required for adaptation.
Implications
The findings suggest that spectral-domain expert adaptation can serve as an effective and parameter-efficient approach for fine-tuning large language models, potentially leading to advancements in multi-task learning and applications in natural language processing.
Universal Hypernetworks for Arbitrary Models
Computer Vision
Graph Learning
NLP
- UHN is a fixed-architecture generator that can produce weights for various models without redesigning the hypernetwork.
- It supports multi-model generalization and multi-task learning across different architectures.
- UHN allows for stable recursive generation of hypernetworks, enhancing flexibility in model creation.
- Empirical results show UHN's competitive performance across diverse benchmarks.
Read more
Universal Hypernetworks for Arbitrary Models
Summary
The paper introduces the Universal Hypernetwork (UHN), a novel approach that addresses the limitations of conventional hypernetworks, which are typically tied to specific model architectures. UHN is a fixed-architecture generator that predicts neural network weights based on deterministic descriptors related to parameters, architecture, and tasks. This decoupling allows UHN to generate diverse models across various architectures and tasks without the need for redesign or retraining. The authors present three main empirical claims: (1) UHN maintains competitive performance with direct training across multiple benchmarks in vision, graph, text, and formula-regression; (2) it supports both multi-model generalization within a family and multi-task learning across heterogeneous models; and (3) UHN enables stable recursive generation, allowing for the creation of intermediate hypernetworks before producing the final model. The findings suggest that UHN can effectively scale to larger and more diverse target networks while remaining efficient and versatile.
Methodology
The UHN predicts each scalar parameter from deterministic descriptors that encode parameter indices, architecture information, and task details. This method utilizes Gaussian Fourier features to model complex weight fields, allowing a single hypernetwork to generate parameters for various target models efficiently.
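Gaussian Fourier features themselves are a standard embedding; a minimal scalar version (UHN's actual descriptors and dimensions are richer) is:

```python
import math
import random

random.seed(0)
# Fixed random frequencies drawn from a Gaussian, shared across all inputs.
B = [random.gauss(0.0, 1.0) for _ in range(16)]

def fourier_features(x):
    # Maps a scalar descriptor x to a 32-dim embedding
    # [cos(2*pi*b*x) for b in B] + [sin(2*pi*b*x) for b in B].
    return ([math.cos(2 * math.pi * b * x) for b in B] +
            [math.sin(2 * math.pi * b * x) for b in B])

phi = fourier_features(0.37)
```

The embedding lets a small network represent high-frequency variation in the weight field that raw scalar indices could not express.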
Results
The empirical evaluations demonstrate that UHN is competitive with direct training methods across multiple benchmarks, including CIFAR-10, Cora, and AG News. It effectively generalizes across model families and tasks while maintaining performance stability during recursive generation.
Implications
The UHN framework has significant implications for model design in machine learning, particularly in scenarios requiring flexibility across different architectures and tasks. It can streamline the process of model adaptation and deployment, making it easier to leverage hypernetworks in diverse applications.
Physics Informed Reinforcement Learning with Gibbs Priors for Topology Control in Power Grids
Reinforcement Learning
Graph Learning
Optimization
- Introduces a physics-informed RL methodology for topology control in power grids.
- Utilizes a Gibbs prior to select a small, state-dependent set of feasible actions.
- Employs a graph neural network to predict overload risks for action evaluation.
- Achieves significant improvements in reward and decision time compared to existing methods.
Read more
Physics Informed Reinforcement Learning with Gibbs Priors for Topology Control in Power Grids
Summary
This paper addresses the complex problem of topology control in power grids, which involves sequential decision-making with a combinatorial action space that grows with grid size. The authors propose a physics-informed Reinforcement Learning (RL) framework that integrates semi-Markov control with a Gibbs prior to encode the system's physical dynamics. The decision-making process is triggered only when the grid enters hazardous conditions, while a graph neural network (GNN) surrogate predicts the overload risk of feasible topology actions. This approach reduces exploration difficulties and online simulation costs, maintaining the flexibility of learned policies. The method is evaluated across three benchmark environments, demonstrating strong performance: achieving oracle-level results while being significantly faster and more efficient than existing baselines. The proposed framework effectively balances control quality and computational efficiency, making it a promising solution for real-world power grid operations.
Methodology
The proposed method formulates the topology control problem as a semi-Markov decision process, intervening only during hazardous conditions. It constructs a time-dependent candidate action set using a graph-based policy and a physics-informed prior that ranks actions based on predicted overload risks. The prior is learned from simulator rollouts and is used to reweight action scores before selection.
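A Gibbs-style reweighting of action scores can be sketched as follows; the scores, risks, and temperature here are illustrative, not the paper's values:

```python
import math

def gibbs_reweight(scores, risks, beta=2.0):
    # Multiply each policy score by exp(-beta * predicted_risk), then
    # renormalize, so low-overload-risk actions dominate selection.
    weighted = [s * math.exp(-beta * r) for s, r in zip(scores, risks)]
    z = sum(weighted)
    return [w / z for w in weighted]

scores = [0.5, 0.3, 0.2]   # raw policy scores per candidate topology action
risks = [0.9, 0.1, 0.5]    # GNN-predicted overload risk per action
probs = gibbs_reweight(scores, risks)
best = max(range(len(probs)), key=probs.__getitem__)  # lowest-risk trade-off wins
```

The exponential weighting is what encodes the physics prior: an action the GNN flags as risky is suppressed even if the learned policy scores it highly.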
Results
The method achieves oracle-level performance while being approximately 6× faster on the first benchmark, reaches 94.6% of oracle reward with about 200× lower decision time on the second benchmark, and improves over a PPO baseline by up to 255% in reward and 284% in survived steps on the most challenging benchmark, while remaining about 2.5× faster than a specialized engineering baseline.
Implications
The findings suggest that the proposed physics-informed RL framework can significantly enhance decision-making processes in power grid operations, potentially leading to safer and more efficient management of electrical networks under varying operational conditions.
Application of parametric Shallow Recurrent Decoder Network to magnetohydrodynamic flows in liquid metal blankets of fusion reactors
Time Series
- SHRED effectively reconstructs MHD states from sparse measurements.
- The integration of SVD with SHRED enhances computational efficiency.
- The framework generalizes well to unseen magnetic field configurations.
- SHRED can infer magnetic field dynamics from temperature data alone.
Read more
Application of parametric Shallow Recurrent Decoder Network to magnetohydrodynamic flows in liquid metal blankets of fusion reactors
Summary
This paper presents a novel data-driven framework for reconstructing magnetohydrodynamic (MHD) states in liquid metal blankets of fusion reactors using a parametric Shallow Recurrent Decoder Network (SHRED). MHD phenomena are critical in nuclear fusion systems where electrically conducting fluids interact with magnetic fields, influencing flow dynamics. Traditional numerical solutions for MHD models are computationally intensive, especially in real-time or multi-query contexts. The authors propose integrating dimensionality reduction via Singular Value Decomposition (SVD) with SHRED to reconstruct full spatio-temporal states from sparse measurements. The methodology is applied to a three-dimensional model of a water-cooled tube surrounded by lead-lithium flows, examining various magnetic field configurations. Results demonstrate that SHRED achieves high accuracy and robustness in reconstructing MHD states, even under previously unseen conditions, including time-varying magnetic fields. Notably, the framework can infer the evolution of the magnetic field using only temperature measurements. The findings highlight SHRED's potential as a computationally efficient tool for real-time monitoring and control in fusion reactor blanket systems.
Methodology
The study employs a combination of Singular Value Decomposition (SVD) for dimensionality reduction and the SHallow REcurrent Decoder (SHRED) neural network architecture to reconstruct MHD states from sparse time-series measurements. The methodology is tested on a three-dimensional model representing a portion of a water-cooled blanket cell.
Results
SHRED demonstrated high reconstruction accuracy and robustness across various magnetic field configurations, including constant and time-dependent fields. The model effectively generalized to conditions not encountered during training, accurately inferring the temporal evolution of magnetic fields using temperature measurements.
Implications
The findings suggest that SHRED can serve as a powerful tool for real-time monitoring, diagnostics, and control in fusion reactor blanket systems, potentially improving the design and operation of nuclear fusion reactors.
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization
Reinforcement Learning
Large Language Models
Robotics
- SKILL0 is the first RL framework explicitly designed for skill internalization, enabling zero-shot autonomous behavior.
- In-context reinforcement learning (ICRL) is introduced to transition from context-dependent execution to intrinsic competence.
- Dynamic Curriculum adaptively withdraws skills based on their on-policy helpfulness, optimizing the learning process.
- SKILL0 achieves substantial performance improvements over traditional RL baselines while maintaining a low token context size.
Read more
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization
Summary
The paper introduces SKILL0, a novel framework for skill internalization in reinforcement learning (RL) that allows agents to autonomously perform tasks without relying on inference-time skill retrieval. Traditional methods of skill augmentation involve injecting skills into the model's context during inference, which can lead to retrieval noise, token overhead, and a lack of true knowledge acquisition. SKILL0 addresses these limitations by implementing an in-context reinforcement learning (ICRL) approach, where skills are initially provided as guidance during training but are completely removed during inference. This transition is facilitated through a Dynamic Curriculum that evaluates the helpfulness of each skill, retaining only those that contribute to the agent's performance. The framework demonstrates significant improvements over standard RL baselines, achieving better performance while maintaining a compact context size, thus reducing inference overhead. The results indicate that SKILL0 effectively enables zero-shot autonomous behavior, marking a significant advancement in the field of agent-based learning.
Methodology
The methodology involves a training regime that starts with full skill context and progressively removes it, utilizing in-context reinforcement learning (ICRL) to optimize the transition from context-dependent execution to autonomous behavior. Skills are grouped and rendered with interaction history into a compact visual context, and a Dynamic Curriculum evaluates the on-policy helpfulness of skills to determine their retention during training.
Results
SKILL0 shows substantial improvements over standard RL baselines, achieving a +9.7% increase for ALFWorld and a +6.6% increase for Search-QA. The framework maintains an efficient context of fewer than 0.5k tokens per step, significantly reducing inference overhead while enhancing task performance.
Implications
The implications of this research suggest that skill internalization can lead to more efficient and capable autonomous agents, reducing reliance on external skill retrieval and enhancing the scalability of agent-based systems in complex environments.
DISCO-TAB: A Hierarchical Reinforcement Learning Framework for Privacy-Preserving Synthesis of Complex Clinical Data
Reinforcement Learning
Generative Models
Large Language Models
- DISCO-TAB synthesizes clinical data while preserving privacy and ensuring clinical validity.
- The framework uses a hierarchical reinforcement learning approach to evaluate data quality at multiple granularities.
- It incorporates techniques to preserve medical logic and address class imbalances in synthetic data.
- DISCO-TAB shows significant improvements in clinical classifier utility and statistical fidelity compared to existing methods.
Read more
DISCO-TAB: A Hierarchical Reinforcement Learning Framework for Privacy-Preserving Synthesis of Complex Clinical Data
Summary
The paper presents DISCO-TAB, a novel framework designed to synthesize complex clinical data while preserving privacy and ensuring clinical validity. Traditional methods for generating synthetic data from Electronic Health Records (EHR) often fail to capture the intricate dependencies and class imbalances present in biomedical datasets. DISCO-TAB addresses these challenges by integrating a fine-tuned Large Language Model (LLM) with a multi-objective discriminator system, optimized through a hierarchical reinforcement learning approach. This framework evaluates the quality of synthetic data at multiple levels—token, sentence, feature, and row—allowing for a more nuanced assessment of data validity. The authors introduce techniques such as Automated Constraint Discovery and Inverse-Frequency Reward Shaping to maintain medical logic and mitigate issues related to minority class representation. The framework is validated on various benchmarks, including datasets related to heart failure and Parkinson's disease, demonstrating significant improvements in downstream clinical classifier utility and statistical fidelity. The results indicate that DISCO-TAB outperforms existing methods, achieving up to a 38.2% enhancement in utility while maintaining robust defenses against membership inference attacks. This work sets a new benchmark for generating trustworthy synthetic tabular data in healthcare applications.
Methodology
DISCO-TAB combines a fine-tuned Large Language Model with a hierarchical reinforcement learning optimization strategy. It evaluates synthetic data quality at four levels: token, sentence, feature, and row, using multi-objective feedback to ensure compliance with clinical constraints. The framework employs Automated Constraint Discovery and Inverse-Frequency Reward Shaping to maintain medical logic and address minority class collapse.
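Inverse-frequency reward shaping can be illustrated with a toy label distribution; the weighting scheme below is a generic sketch of the idea, not the paper's exact formula:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    # Weight each class inversely to its frequency, normalized so a
    # balanced dataset gives every class weight 1.0.
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(counts) * k) for c, k in counts.items()}

labels = ["healthy"] * 90 + ["heart_failure"] * 10
w = inverse_frequency_weights(labels)

def shaped_reward(base_reward, label):
    # Rewards for generating a rare class are scaled up, countering
    # minority-class collapse in the synthetic data.
    return base_reward * w[label]
```

Here the 10%-prevalence class receives ten times the weight of the 90% class, so the generator is not rewarded for only ever producing majority-class rows.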
Results
The framework achieved up to a 38.2% improvement in downstream clinical classifier utility compared to baseline methods such as GANs and diffusion models. It also demonstrated exceptional statistical fidelity with Jensen-Shannon Divergence (JSD) values below 0.01 and strong resistance to membership inference attacks.
Implications
DISCO-TAB has significant implications for the development of reliable clinical decision support systems, enabling the generation of synthetic data that is both useful for training AI models and compliant with privacy regulations. This could facilitate advancements in precision medicine and improve patient care by providing high-quality, explainable data for AI applications.
Graph Neural Operator Towards Edge Deployability and Portability for Sparse-to-Dense, Real-Time Virtual Sensing on Irregular Grids
Graph Learning
Efficient ML
- VIRSO provides accurate sparse-to-dense reconstruction for irregular geometries.
- The method integrates spectral and spatial analysis for improved performance.
- Achieves mean relative L2 errors below 1% while reducing energy-delay product significantly.
- Demonstrates edge-deployability with low power consumption and latency.
Read more
Graph Neural Operator Towards Edge Deployability and Portability for Sparse-to-Dense, Real-Time Virtual Sensing on Irregular Grids
Summary
The paper presents VIRSO (Virtual Irregular Real-Time Sparse Operator), a novel graph-based neural operator designed for sparse-to-dense reconstruction on irregular geometries, addressing the challenges of real-time virtual sensing in resource-constrained environments. Traditional physics-based solvers are often too slow and power-intensive for real-time applications, particularly in fields like nuclear thermal-hydraulics where accurate sensing is critical but instrumentation is limited. VIRSO integrates both spectral and spatial analysis to enhance reconstruction accuracy while minimizing latency and power consumption. The authors introduce a variable-connectivity algorithm, Variable KNN (V-KNN), for efficient graph construction tailored to mesh geometries. Evaluations on three nuclear thermal-hydraulic benchmarks demonstrate that VIRSO achieves mean relative L2 errors below 1% across various reconstruction ratios, outperforming existing operators with fewer parameters. The implementation on an NVIDIA Jetson Orin Nano shows sub-10 W power consumption and sub-second latency, highlighting its suitability for edge deployment. This work establishes a new paradigm for compute-aware operator learning, emphasizing the importance of hardware constraints in the design of virtual sensing instruments.
Methodology
The authors developed VIRSO, a graph-based neural operator that utilizes a variable-connectivity algorithm (V-KNN) for mesh-informed graph construction. The approach combines spectral and spatial analysis to enhance reconstruction accuracy from sparse boundary measurements, focusing on hardware constraints for edge deployment.
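One plausible reading of variable-connectivity KNN is that each node's neighbour count adapts to local point spacing; the rule below is a hedged sketch only, and the paper's actual V-KNN criterion may differ:

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def variable_knn(points, k_min=2, k_max=4):
    # Build an undirected graph whose per-node degree depends on how
    # tightly packed that node's neighbourhood is (illustrative rule).
    edges = set()
    for i, p in enumerate(points):
        d = sorted((dist(p, q), j) for j, q in enumerate(points) if j != i)
        # Dense regions (small nearest-neighbour distance) get fewer edges.
        k = k_min if d[0][0] < 0.5 else k_max
        for _, j in d[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

pts = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (2.0, 2.0), (2.5, 2.0), (3.0, 2.5)]
graph = variable_knn(pts)
```

Adapting connectivity to the mesh avoids both over-connecting refined regions and under-connecting coarse ones, which matters for message passing on irregular grids.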
Results
VIRSO was evaluated on three nuclear thermal-hydraulic benchmarks, achieving mean relative L2 errors below 1% and demonstrating significant improvements in energy-delay product (EDP), reducing it from approximately 206 J·ms to 10.1 J·ms on an NVIDIA H200. The implementation on an NVIDIA Jetson Orin Nano maintained sub-10 W power consumption and sub-second latency across all configurations.
Implications
The findings suggest that VIRSO can serve as a viable solution for real-time virtual sensing in environments where traditional instrumentation is impractical, such as in advanced nuclear energy systems. This work paves the way for more efficient and deployable sensing technologies in various fields requiring real-time monitoring and control.
Sven: Singular Value Descent as a Computationally Efficient Natural Gradient Method
Optimization
Efficient ML
Theory
- Sven optimizes neural networks by treating each data point's residual as a separate condition.
- The algorithm approximates the Moore-Penrose pseudoinverse using truncated SVD, leading to lower computational costs.
- Sven significantly outperforms Adam and other first-order methods in regression tasks.
- The method is particularly suited for over-parameterized models and can be applied in scientific computing.
Read more
Sven: Singular Value Descent as a Computationally Efficient Natural Gradient Method
Summary
This paper introduces Sven, a novel optimization algorithm for neural networks that leverages the natural decomposition of loss functions into individual data point contributions. Unlike traditional methods that reduce the entire loss to a single scalar, Sven treats each data point's residual as a separate condition to be satisfied simultaneously. The algorithm employs the Moore-Penrose pseudoinverse of the loss Jacobian to compute a minimum-norm parameter update that addresses all conditions at once. To enhance computational efficiency, Sven approximates this pseudoinverse using a truncated singular value decomposition (SVD), retaining only the k most significant directions, which results in a computational overhead proportional to k, significantly lower than the square of the number of parameters typical in natural gradient methods. The authors demonstrate that Sven outperforms standard first-order optimization methods like Adam in terms of convergence speed and final loss on regression tasks, while also being competitive with LBFGS at a reduced computational cost. The paper discusses challenges related to memory overhead and proposes strategies for mitigation, highlighting Sven's potential applications in scientific computing where custom loss functions can be decomposed into multiple conditions.
Methodology
Sven employs a linear algebra approach to optimization by using the Moore-Penrose pseudoinverse of the loss Jacobian, approximated through truncated singular value decomposition (SVD). This allows for simultaneous updates to model parameters based on individual data point conditions, rather than aggregating them into a single loss value.
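The minimum-norm update Sven approximates reduces, for a single residual condition, to a closed form: the Moore-Penrose solution of J @ dtheta = -r is dtheta = -Jᵀ r / ||J||². A toy one-condition case (Sven handles many conditions at once via a truncated SVD of the full Jacobian):

```python
def min_norm_update(J_row, r):
    # Minimum-norm dtheta satisfying J_row . dtheta = -r
    # (the pseudoinverse of a single Jacobian row).
    norm_sq = sum(j * j for j in J_row)
    return [-j * r / norm_sq for j in J_row]

J = [3.0, 4.0]   # gradient of one data point's residual w.r.t. parameters
r = 5.0          # that data point's residual
dtheta = min_norm_update(J, r)

# The update exactly zeros this condition: J . dtheta = -r.
resid_change = sum(j * d for j, d in zip(J, dtheta))
```

With many conditions the rows of J conflict, which is where the truncated SVD comes in: keeping only the k dominant singular directions resolves the conditions approximately at O(k) overhead.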
Results
The experimental results show that Sven converges faster and achieves lower final loss compared to standard optimization methods like Adam on various regression tasks, while also being competitive with LBFGS at a fraction of the computational cost.
Implications
Sven's methodology has significant implications for optimizing neural networks, particularly in scenarios where loss functions can be decomposed into multiple conditions. Its efficiency and performance suggest potential applications in scientific computing and other fields requiring complex loss structures.
Enhancing the Reliability of Medical AI through Expert-guided Uncertainty Modeling
Computer Vision
Interpretability
Theory
- Expert evaluations significantly enhance the quality of uncertainty estimates in medical AI.
- The proposed two-ensemble method effectively separates epistemic and aleatoric uncertainty.
- The framework shows substantial improvements in various medical tasks, outperforming state-of-the-art methods.
- A simplified one-ensemble method offers comparable performance with greater efficiency.
Read more
Enhancing the Reliability of Medical AI through Expert-guided Uncertainty Modeling
Summary
This paper addresses the critical challenge of AI reliability in healthcare by proposing a novel framework that integrates expert knowledge into uncertainty estimation. The authors highlight the importance of uncertainty estimation in medical AI systems, particularly in quantifying aleatoric uncertainty, which is often overlooked. They introduce a two-ensemble approach that utilizes expert disagreement to generate soft labels for training machine learning models, allowing for separate estimation of epistemic and aleatoric uncertainties. The method is validated across various medical tasks, including binary image classification and multiple-choice question answering, demonstrating significant improvements in uncertainty estimation quality. The authors also present a simplified one-ensemble variant that maintains performance while enhancing efficiency. Overall, the study emphasizes the value of expert input in developing risk-aware AI systems for healthcare applications.
Methodology
The authors propose a two-ensemble approach where one ensemble predicts hard labels for epistemic uncertainty, while a second ensemble, trained on expert-generated soft labels, estimates aleatoric uncertainty. This method leverages the law of total variance to decompose uncertainty into its components. A simplified one-ensemble alternative is also introduced for improved efficiency.
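The law-of-total-variance decomposition underlying the two-ensemble split can be checked on toy numbers (the predictions below are illustrative, not from the paper):

```python
import statistics

# Each inner list: one ensemble member's sampled predictions.
members = [[0.1, 0.3, 0.2], [0.4, 0.6, 0.5], [0.2, 0.2, 0.2]]

means = [statistics.mean(m) for m in members]
variances = [statistics.pvariance(m) for m in members]

aleatoric = statistics.mean(variances)   # E[Var(Y | member)]: data noise
epistemic = statistics.pvariance(means)  # Var(E[Y | member]): model disagreement
total = aleatoric + epistemic

# With equal per-member sample counts, `total` equals the population
# variance of all predictions pooled together.
```

Disagreement between members drives the epistemic term, while spread within each member's predictions drives the aleatoric term, which is exactly the separation the hard-label and soft-label ensembles are designed to expose.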
Results
The proposed method achieved a 9% improvement in multiple-choice question answering, a 50% improvement in image classification, a 7% improvement in binary image segmentation, and a 49% improvement in multiclass image segmentation compared to the second-best solution across various datasets.
Implications
The findings suggest that integrating expert knowledge into AI systems can significantly enhance their reliability and effectiveness in medical applications, potentially leading to better patient outcomes and more efficient healthcare workflows.
Bridging Deep Learning and Integer Linear Programming: A Predictive-to-Prescriptive Framework for Supply Chain Analytics
Time Series
Optimization
- The study systematically compares deep learning models with traditional statistical methods for demand forecasting.
- N-BEATS outperforms MSTL in forecasting accuracy, making it the best-performing model for this dataset.
- The proposed framework integrates forecasting with operational decision-making through integer linear programming.
- The research demonstrates the practical application of improved forecasting in logistics planning.
Read more
Bridging Deep Learning and Integer Linear Programming: A Predictive-to-Prescriptive Framework for Supply Chain Analytics
Summary
This paper addresses the challenges of demand forecasting in supply chain management, particularly the difficulties posed by seasonality, irregular spikes, and noise in retail data. The authors propose a three-step analytical framework that integrates forecasting with operational analytics. The first step involves exploratory data analysis of 180,519 transactions to identify trends and seasonal patterns. The second step compares the forecasting performance of the N-BEATS and N-HiTS deep learning models against the MSTL statistical model. Results indicate that both deep learning models significantly outperform MSTL, with N-BEATS achieving the lowest forecasting error. In the final step, the forecasts are utilized in an integer linear programming (ILP) model to optimize delivery plans, minimizing total delivery time while adhering to budget and capacity constraints. The study highlights the practical impact of accurate forecasting and interpretable model optimization in logistics, demonstrating a cohesive workflow from predictive analytics to prescriptive decision-making.
Methodology
The methodology consists of three stages: (1) exploratory data analysis to identify trends and seasonal components in the dataset, (2) comparative analysis of forecasting models (N-BEATS, N-HiTS, and MSTL) to determine the most accurate model, and (3) application of the selected forecasting model in an integer linear programming framework to optimize delivery plans.
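The prescriptive third stage can be illustrated with a toy version of the delivery problem. The paper formulates it as an integer linear program for a real solver; the brute-force search, the `plan_delivery` name, the `(time, cost, capacity)` mode tuples, and all numbers below are hypothetical stand-ins that show the same objective and constraints.

```python
from itertools import product

def plan_delivery(demand, modes, budget):
    """Toy exhaustive solve of the delivery plan: choose integer unit counts
    per shipping mode to minimize total delivery time, subject to meeting
    forecast demand, a cost budget, and per-mode capacity limits.

    modes: list of (time_per_unit, cost_per_unit, capacity) tuples.
    """
    best = None
    ranges = [range(cap + 1) for _, _, cap in modes]  # capacity constraints
    for x in product(*ranges):
        if sum(x) < demand:                            # demand constraint
            continue
        cost = sum(c * xi for (_, c, _), xi in zip(modes, x))
        if cost > budget:                              # budget constraint
            continue
        time = sum(t * xi for (t, _, _), xi in zip(modes, x))
        if best is None or time < best[0]:
            best = (time, x)
    return best  # (total_time, units_per_mode), or None if infeasible
```

For example, with a fast expensive mode `(1, 5, 4)` and a slow cheap mode `(3, 1, 10)`, demand 6, and budget 22, the search fills the fast mode's capacity first and tops up with the cheap mode. A production formulation would pass the same objective and constraints to an ILP solver instead of enumerating.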
Results
The results show that both N-BEATS and N-HiTS significantly outperform the MSTL model in forecasting accuracy, with N-BEATS being the most effective. The optimized delivery plan generated through the ILP model resulted in a feasible and cost-effective shipping strategy, minimizing delivery time under budget and capacity constraints.
Implications
The findings suggest that integrating advanced forecasting techniques with optimization models can enhance decision-making in supply chain management. This approach can lead to more efficient logistics operations and reduced costs, making it valuable for businesses facing complex demand patterns.
Test-Time Scaling Makes Overtraining Compute-Optimal
Large Language Models
Optimization
Theory
- Introduces Train-to-Test (T2) scaling laws that optimize pretraining and test-time decisions jointly.
- Demonstrates that optimal pretraining strategies shift towards overtraining when factoring in inference costs.
- Validates the T2 scaling approach by showing improved performance of overtrained models across various tasks.
- Findings remain relevant even after post-training, suggesting practical implications for model deployment.
Test-Time Scaling Makes Overtraining Compute-Optimal
Summary
This paper addresses the gap between pretraining scaling laws and test-time scaling strategies for large language models (LLMs). The authors introduce Train-to-Test (T2) scaling laws that jointly optimize model size, training tokens, and inference samples under a fixed compute budget, extending existing pretraining scaling laws to account for inference. The study reveals that optimal pretraining decisions shift towards overtraining when inference costs are considered, a departure from traditional scaling recommendations such as Chinchilla's. Through extensive evaluations across eight downstream tasks, the authors demonstrate that heavily overtrained models, when pre-trained according to T2 scaling forecasts, significantly outperform those trained under standard pretraining scaling laws. Furthermore, the findings persist even after post-training, indicating the robustness of T2 scaling in practical deployments. The paper emphasizes the need for a unified approach to pretraining and inference scaling, highlighting the nonlinear relationship between model size, training duration, and inference quality.
Methodology
The authors propose a joint optimization framework that incorporates model size, dataset size, and inference compute under a total budget. They evaluate two approaches: one based on loss and another on accuracy (pass@k). The methodology includes extensive experiments with over 100 models across different compute levels to validate the T2 scaling laws.
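The joint optimization can be sketched as a grid search over the compute split. This is only an illustration of the idea, not the paper's fitted law: it assumes a Chinchilla-style loss surface with illustrative coefficients, the standard 6ND approximation for training FLOPs, roughly 2N FLOPs per generated token at inference, and a made-up mapping from loss to per-sample success probability.

```python
import math

def chinchilla_loss(n, d):
    # Chinchilla-style loss surface; coefficients are illustrative,
    # not the paper's fitted values.
    return 1.69 + 406.4 / n**0.34 + 410.7 / d**0.28

def pass_at_k(n, d, k):
    # Assumed proxy: per-sample success probability decays with loss;
    # pass@k then follows from k independent samples.
    p = math.exp(-chinchilla_loss(n, d))
    return 1.0 - (1.0 - p) ** k

def t2_allocate(total_flops, tokens_per_query=1000, queries=1e6):
    """Grid-search the split of a fixed compute budget between model size,
    training tokens, and inference samples, in the spirit of T2 scaling."""
    best = None
    for n in [1e8 * 2**i for i in range(10)]:      # candidate model sizes
        for k in (1, 2, 4, 8, 16, 32):             # samples per query
            # Inference cost: ~2N FLOPs per token, per sample, per query.
            inference = 2 * n * tokens_per_query * k * queries
            train_budget = total_flops - inference
            if train_budget <= 0:
                continue
            d = train_budget / (6 * n)             # tokens via the 6ND rule
            score = pass_at_k(n, d, k)
            if best is None or score > best[0]:
                best = (score, n, d, k)
    return best  # (pass@k, model size, training tokens, samples) or None
```

The qualitative behavior this sketch reproduces is the paper's central trade-off: reserving compute for repeated sampling at test time pushes the optimum toward smaller models trained on more tokens than training-only scaling laws would prescribe.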
Results
The results indicate that when test-time compute is accounted for, optimal pretraining decisions favor smaller, more heavily overtrained models than traditional scaling laws prescribe. The T2 scaling laws consistently predict improved performance across eight tasks, confirming the advantages of overtraining in the context of inference costs. Additionally, the benefits of T2 scaling persist after post-training adjustments.
Implications
The findings suggest that practitioners should reconsider their pretraining strategies based on expected test-time usage, potentially leading to more efficient and effective model deployments. The T2 scaling laws could guide future research in optimizing LLMs for various applications, particularly in scenarios requiring repeated sampling.