AI-generated summaries

Today's ML research, without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

61 papers today, updated every 8 hours, 7 days of history
A Boundary-Layer Mechanism for One-Third Scaling in Online Softmax Classification
Marcel Kühn, Yoon Thelge, Bernd Rosenow
Theory, Optimization
  • Isolates a boundary-layer mechanism for understanding learning dynamics in online softmax classification.
  • Demonstrates that the generalization error scales as α^(-1/3) in late training phases.
  • Shows that learning-rate schedules can improve generalization error scaling to α^(-1/2).
  • Validates theoretical predictions through simulations and controlled experiments.
The Illusion of Reasoning: Exposing Evasive Data Contamination in LLMs via Zero-CoT Truncation
Yifan Lan, Yuanpu Cao, Hanyu Wang, Lu Lin, Jinghui Chen
Large Language Models, NLP, Theory
  • Data contamination in LLMs can create a false impression of reasoning capabilities.
  • The Zero-CoT Probe (ZCP) method effectively detects evasive data contamination by truncating the CoT process.
  • Contamination Confidence is introduced as a new metric to quantify the severity of data contamination.
  • Extensive evaluations reveal significant levels of data contamination in various LLMs.
Survive or Collapse: The Asymmetric Roles of Data Gating and Reward Grounding in Self-Play RL
Sophia Xiao Pu, Zhaotian Weng, Chengzhi Liu, Jayanth Srinivasa, Gaowen Liu, William Yang Wang, Xin Eric Wang
Reinforcement Learning, Large Language Models, Theory
  • Self-play stability is governed more by data gating than by reward design.
  • A strict data gate ensures stability across various reward configurations.
  • The Grounded Proposer Paradox indicates that access to ground truth can worsen stability.
  • A continuous strictness parameter for gating reveals a two-stage phase transition in training dynamics.
A Posterior-Predictive Variance Decomposition for Epistemic and Aleatoric Uncertainty in Wind Power Forecasting
Yinsong Chen, Samson S. Yu, Kashem M. Muttaqi
Time Series, Theory
  • Introduces a posterior-predictive variance decomposition framework for wind power forecasting.
  • Successfully separates epistemic and aleatoric uncertainties, improving forecasting accuracy.
  • Develops an evaluation framework that does not rely on ground-truth uncertainty labels.
  • Demonstrates theoretical consistency and operational utility through synthetic and real-world experiments.
The Double Dilemma in Multi-Task Radiology Report Generation: A Gradient Dynamics Analysis and Solution
Erjian Zhang, Yatong Hao, Liejun Wang, Zhiqing Guo
Optimization, Multimodal, Theory
  • Identifies the limitations of linear scalarization in multi-task radiology report generation (RRG) optimization.
  • Introduces the concept of 'Double Dilemma' in gradient dynamics affecting RRG.
  • Proposes CAME-Grad, a novel optimizer that enhances multi-task learning performance.
  • Demonstrates significant improvements in clinical efficacy across multiple RRG methods.
Toward Understanding Adversarial Distillation: Why Robust Teachers Fail
Hongsin Lee, Hye Won Chung
Theory
  • Identifies the 'Robustly Unlearnable Set' as a key factor in the failure of Adversarial Distillation.
  • Develops a theoretical framework explaining how teacher-student dynamics affect robust generalization.
  • Demonstrates that a teacher's predictive confidence on unlearnable samples is crucial for student robustness.
  • Empirical validation confirms the theoretical predictions across various datasets.
Provable Joint Decontamination for Benchmarking Multiple Large Language Models
Zhenlong Liu, Hao Zeng, Hongxin Wei
NLP, Large Language Models, Theory
  • Introduces a joint selection framework for benchmark decontamination across multiple LLMs.
  • Proposes Joint Envelope Conformal Selection (JECS) to control the global contamination rate (GCR).
  • Establishes theoretical guarantees for GCR control under specified conditions.
  • Demonstrates superior performance of JECS in maintaining GCR while improving power over existing methods.
BioFormer: Rethinking Cross-Subject Generalization via Spectral Structural Alignment in Biomedical Time-Series
Guikang Du, Haoran Li, Xinyu Liu, Zhibo Zhang, Xiaoli Gong, Jin Zhang
Time Series
  • Introduces 'spectral drift' as a new perspective to understand subject-specific variability in biomedical time-series.
  • Proposes the Frequency-Band Alignment Module (FBAM) to align spectral structures and mitigate variability.
  • Implements Sample Conditional Layer Normalization (SCLN) for stabilizing cross-subject representations.
  • Demonstrates a 6% absolute improvement in F1-score over 12 baseline models across multiple datasets.
What are the Right Symmetries for Formal Theorem Proving?
Krzysztof Olejniczak, Radoslav Dimitrov, Xingyue Huang, Bernardo Cuenca Grau, Jinwoo Kim, İsmail İlkan Ceylan
Theory, Large Language Models
  • Introduction of rewriting categories as a framework for modeling theorem statement transformations.
  • Formalization of proof equivariance and success invariance as essential symmetry properties for theorem provers.
  • Empirical demonstration of LLM-based provers' failure to maintain success invariance across equivalent formulations.
  • Proposed test-time aggregation method improves robustness and performance of theorem proving.
TONIC: Token-Centric Semantic Communication for Task-Oriented Wireless Systems
Sige Liu, Kezhi Wang
Computer Vision, Multimodal, Generative Models
  • TONIC framework aligns communication design with token-level task relevance.
  • Introduces unequal error protection based on token utility under fixed channel budgets.
  • Utilizes confidence gating to manage unreliable token decisions effectively.
  • Combines transmitter-side semantic protection with receiver-side completion models.
ConTact: Contact-First Antibody CDR Design via Explicit Interface Reasoning
Mansoor Ahmed, Spencer VonBank, Nadeem Taj, Sujin Lee, Naila Jan, Murray Patterson
Graph Learning
  • ConTact separates contact identification and sequence prediction into distinct stages, improving learning efficiency.
  • The architecture includes a contact-gated injection mechanism that selectively routes antigen information to relevant CDR positions.
  • ConTact outperforms existing CDR design methods, including a 7% improvement in RMSD and a 10% increase in F1 score.
  • The methodology addresses the architectural limitations of current models by focusing on the sparsity of CDR-antigen interactions.
Position: The Time for Sampling Is Now! Charting a New Course for Bayesian Deep Learning
Emanuel Sommer, David Rügamer
Theory, Optimization, Efficient ML
  • Sampling-based inference (SAI) is as computationally efficient as optimization-based methods for Bayesian neural networks.
  • SAI can improve prediction performance and provide better uncertainty quantification.
  • Addressing misconceptions about SAI is crucial for its broader acceptance in the community.
  • Research should focus on effective exploration of the posterior landscape and management of posterior samples.
Winner-Take-All bottlenecks enforce disentangled symbolic representations in multi-task learning
Julian Gutheil, Simon Hitzginger, Robert Legenstein
Theory
  • WTA bottlenecks can enforce the extraction of categorical latent factors in multi-task learning.
  • The representation at the WTA bottleneck is a structured permutation of the original latent factors.
  • Symbolic representations allow individual neurons to encode distinct abstract features.
  • Empirical validation shows generalization benefits from the symbolic representations.
When to Switch, Not Just What: Transition Quality Prediction in Clash Royale
Heeyun Heo, Huy Kang Kim
Reinforcement Learning
  • Frequent strategy switching is inversely related to win rates in Clash Royale.
  • Existing recommendation systems often ignore the behavioral costs of switching strategies.
  • The Transition Quality Predictor (TQP) reformulates strategy recommendations as a transition-level decision problem.
  • The TQP pipeline includes components for identifying suitable players and timing for strategy switches.
Clipping Bottleneck: Stabilizing RLVR via Stochastic Recovery of Near-Boundary Signals
Shuo Yang, Jinda Lu, Chiyu Ma, Kexin Huang, Haoming Meng, Qihui Zhang, Yuyang Liu, Bolin Ding, Guoyin Wang, Li Yuan, Jingren Zhou
Reinforcement Learning, Large Language Models, Optimization
  • Identifies hard clipping as a key source of instability in reinforcement learning with verifiable rewards (RLVR).
  • Proposes Near-boundary Stochastic Rescue (NSR) to recover near-boundary signals.
  • NSR improves training stability and convergence over traditional methods.
  • Demonstrates effectiveness across various model sizes and architectures.
Chebyshev Policies and the Mountain Car Problem: Reinforcement Learning for Low-Dimensional Control Tasks
Stefan Huber, Hannes Unger, Georg Schäfer, Jakob Rehrl
Reinforcement Learning, Theory, Efficient ML
  • Analytical solution to the Mountain Car problem reveals a simple optimal control strategy.
  • Chebyshev policies significantly reduce the number of parameters and improve sample efficiency.
  • Chebyshev policies outperform traditional neural network approaches in various RL tasks.
  • The study highlights a substantial gap in performance between current RL agents and the optimal solution.
Relational Linear Properties in Language Models: An Empirical Investigation
Giovanni Valer, Luigi Gresele, Marco Bronzini, Emanuele Marconato
NLP, Large Language Models, Interpretability
  • Introduces a novel probing method based on Kullback-Leibler divergence to evaluate relational linearity in language models.
  • Demonstrates that relational linearity varies across different models and layers, with specific relations exhibiting stronger linearity.
  • Finds that the phrasing of queries significantly affects the linear probing results, highlighting the complexity of relational representations.
The Neural Compiler: Program-to-Network Translation for Hybrid Scientific Machine Learning
Lucas Sheneman
Theory
  • The Neural Compiler translates symbolic physics expressions into exact, differentiable PyTorch modules.
  • Compiled modules achieve zero approximation error in their safe domain, unlike traditional neural networks.
  • The system supports 51 primitive operations, enabling complex physics computations and PDE discretizations.
  • Experimental results show significant improvements in parameter recovery and extrapolation compared to PINNs and other baselines.
The Matching Principle: A Geometric Theory of Loss Functions for Nuisance-Robust Representation Learning
Vishal Rajput
Theory, Optimization, Computer Vision
  • Introduces the Matching Principle, unifying various robustness challenges in machine learning.
  • Establishes that existing methods are different estimators of the same statistical object, Σ_task.
  • Presents theoretical results proving the necessity of covering the range of Σ_task for effective regularization.
  • Introduces the Trajectory Deviation Index (TDI) as a new metric for assessing embedding sensitivity.
Correcting Class Imbalance in Prior-Data Fitted Networks for Tabular Classification
Samuel McDowell, Nathan Stromberg, Lalitha Sankar
Theory, Efficient ML, Optimization
  • PFNs excel in tabular classification but suffer from class imbalance issues.
  • Thresholding is identified as the most effective method for improving minority class performance.
  • Downsampling provides a balance between performance and computational efficiency.
  • Classical imbalance correction techniques can be adapted for PFNs despite their unique learning dynamics.
From Snapshots to Trajectories: Learning Single-Cell Gene Expression Dynamics via Conditional Flow Matching
Siyu Pu, Qingqing Long, Xiaohan Huang, Haotian Chen, Jiajia Wang, Meng Xiao, Xiao Luo, Hengshu Zhu, Yuanchun Zhou, Xuezhi Wang
Generative Models, Time Series
  • scFM addresses the challenges of unpaired snapshots in scRNA-seq by integrating optimal transport and flow matching.
  • The framework improves temporal coherence and reduces distribution drift in long-horizon predictions.
  • Experimental results show enhanced performance in trajectory reconstruction and gene expression dynamics recovery.
Decomposing Ensemble Spread in Lorenz '96 With Learned Stochastic Parameterizations
Birgit Kühbacher, Daan Crommelin, Niki Kilbertus
Time Series, Theory
  • The paper rigorously defines and decomposes sources of uncertainty in ensemble forecasting.
  • It systematically compares various parameterization strategies, including novel machine learning approaches.
  • Stochastic parameterizations with temporally persistent structures significantly improve spread growth and error consistency.
  • The study enhances understanding of how different uncertainties interact in chaotic systems.
Disentanglement Beyond Generative Models with Riemannian ICA
Edmond Cunningham
Theory, Interpretability
  • Introduces Riemannian ICA (RICA) as a local geometric approach to disentanglement.
  • Develops the disentanglement tensor to quantify pointwise disentanglement.
  • Demonstrates that RICA can recover sources effectively across different manifolds.
  • Challenges the reliance on global generative models in traditional disentanglement methods.
PeakFocus: Bridging Peak Localization and Intensity Regression via a Unified Multi-Scale Framework for Electricity Load Forecasting
Wangzhi Yu, Peng Zhu, Qing Zhao, Yiwen Jiang, Dawei Cheng
Time Series
  • PeakFocus unifies peak localization and intensity regression into a single framework.
  • The framework employs a triple hybrid loss for joint supervision of peak timing and intensity.
  • Multi-Scale Mixing Peak Locator resolves misjudgment and timing misalignment using coarse and fine-grained features.
  • Location-Aware Decoder enhances intensity estimation by incorporating peak timing context.
On the Sample Complexity of Discounted Reinforcement Learning with Optimized Certainty Equivalents
Oliver Mortensen, Mohammad Sadegh Talebi
Reinforcement Learning, Theory, Optimization
  • Introduces a model-based algorithm, Model-Based OCE Value Iteration (MB-OCE-VI), for risk-sensitive RL.
  • Establishes PAC-type bounds on sample complexity for both value and policy learning under recursive OCE.
  • Proves that OCEs defined by utility functions outside a specific class are not PAC learnable.
  • Provides worst-case lower bounds on sample complexity, improving existing results for CVaR.
Learning Causal Orderings for In-Context Tabular Prediction
Sascha Xu, Sarah Mameche, Jilles Vreeken
Theory
  • Introduces TABORDER, a model that incorporates causal orderings into tabular prediction.
  • Uses causal order-constrained attention to ensure predictions are based on causal relationships.
  • Learns optimal variable orderings in an unsupervised manner through a likelihood-based objective.
  • Addresses the challenge of missing data in tabular datasets while identifying causal directions.
Dropout Universality: Scaling Laws and Optimal Scheduling at the Edge-of-Chaos
Lucas Fernandez Sarmiento
Theory, Optimization
  • Dropout acts as a relevant perturbation that shifts the critical fixed point in deep networks.
  • Smooth and kinked activations lead to different universality classes with distinct critical scaling behaviors.
  • A two-parameter scaling collapse is established for dropout strength and distance to criticality.
  • Optimal dropout scheduling can significantly reduce test loss without increasing computational costs.
Posterior Collapse as Automatic Spectral Pruning
Johannes Hirn
Generative Models, Theory, Interpretability
  • Posterior collapse in beta-VAEs is shown to act as automatic spectral pruning.
  • A latent-rescaling-invariant order parameter is introduced to rank active latent modes.
  • The collapse spectrum and utility spectrum coincide in the linear Gaussian case.
  • The findings suggest that posterior collapse can be beneficial for feature learning and interpretability.
Beyond Single Slot: Joint Optimization for Multi-Slot Guaranteed Display Advertising
Zhaoqi Zhang, Jiaming Deng, Miao Xie, Linyou Cai, Qianlong Xie, Xingxing Wang, Siqiang Luo, Gao Cong
Optimization
  • Introduces a joint optimization framework for multi-slot guaranteed display advertising.
  • Addresses key challenges such as slot-level redundancy and contract imbalance.
  • Utilizes an offline bipartite matching approach for coordinated ad allocation.
  • Implements Page View constraints and a Contract Roulette mechanism to enhance user experience.
A note on convergence of Wasserstein policy optimization
David Šiška, Yufei Zhang
Reinforcement Learning, Theory, Optimization
  • Establishes linear convergence of Wasserstein Policy Optimization (WPO) in entropy-regularized MDPs.
  • Utilizes mean-field analysis and log-Sobolev inequalities to prove convergence properties.
  • Demonstrates monotonic energy dissipation along the gradient flow.
  • Concludes that the value function converges exponentially fast to the global optimum.
Beyond Scalar Objectives: Expert-Feedback-Driven Autonomous Experimentation for Scientific Discovery at the Nanoscale
Ralph Bulanadi, Jefferey Baxter, Arpan Biswas, Hiroshi Funakubo, Dennis Meier, Jan Schultheiß, Rama Vasudevan, Yongtao Liu
Robotics, Optimization, Theory
  • Introduction of deep-kernel pairwise learning (DKPL) for autonomous experimentation.
  • DKPL incorporates expert feedback to evaluate experimental outputs beyond scalar metrics.
  • Demonstrated effectiveness in learning nanoscale structures and analyzing ferroelectric domain walls.
  • Addresses limitations of traditional Bayesian optimization in capturing complex scientific phenomena.
Reading Task Failure Off the Activations: A Sparse-Feature Audit of GPT-2 Small on Indirect Object Identification
Mahdi Nasermoghadasi, Faezeh Ghaderi
NLP, Large Language Models, Interpretability
  • The audit pipeline developed is a model-agnostic tool for analyzing language model failures.
  • Feature 17,491 correlates strongly with task failure but does not serve as a sufficient cause.
  • The study highlights the importance of conducting controls to distinguish between robust behavioral effects and incidental feature correlations.
  • The findings reveal a significant lexical confound affecting the model's performance on the IOI task.
Bandit Convex Optimization with Gradient Prediction Adaptivity
Shuche Wang, Adarsh Barik, Vincent Y. F. Tan
Optimization, Theory
  • Introduces Two-Point Variance-Reduced Optimistic Gradient Descent (TP-VR-OPT) for bandit convex optimization (BCO).
  • Establishes a negative result indicating fundamental limitations of single-point feedback in BCO.
  • Achieves improved regret bounds that scale with cumulative prediction error.
  • Develops adaptive variants of the algorithm that do not require prior knowledge of prediction error or time horizon.
OPPO: Bayesian Value Recursion for Token-Level Credit Assignment in LLM Reasoning
Yu Li, Rui Miao, Tian Lan, Zhengling Qi
NLP, Large Language Models, Reinforcement Learning
  • OPPO improves token-level credit assignment in LLM reasoning by using Bayesian updates.
  • The method accumulates oracle signals to provide a running estimate of success probability for each token.
  • OPPO eliminates the need for a learned value network and additional rollouts.
  • The framework includes two estimators: self-oracle and teacher-oracle.
Three Costs of Amortizing Gaussian Process Inference with Neural Processes
Robin Young
Theory, Efficient ML, Generative Models
  • Decomposes the KL divergence between the Gaussian process (GP) and latent neural process (LNP) predictives into three interpretable components.
  • Establishes bounds on approximation errors related to representation dimension and kernel smoothness.
  • Identifies label contamination as a persistent cost in neural process predictions.
  • Provides architectural recommendations to enhance predictive variance estimation.
Aerodynamic force reconstruction using physics-informed Gaussian processes
Gledson Rodrigo Tondo, Igor Kavrakov, Guido Morgenthal
Theory, Optimization, Time Series
  • Introduces a physics-informed machine learning approach for aerodynamic force reconstruction.
  • Utilizes Gaussian processes to avoid overfitting and eliminate the need for regularization.
  • Demonstrated effectiveness through a case study on the Great Belt East Bridge.
  • Achieves strong agreement between true and predicted aerodynamic loads.
Can Transformers Learn to Verify During Backtracking Search?
Yin Jun Phua, Tony Ribeiro, Tuan Nguyen, Katsumi Inoue
Theory, Large Language Models, Optimization
  • Transformers struggle with verification during backtracking due to scattered retrieval and history entanglement.
  • Selective State Attention (SSA) is introduced as a structural fix to enforce state-based decision-making.
  • SSA allows transformers to produce consistent outputs for same-state pairs, improving their reliability in search tasks.
  • The study highlights the importance of structural adjustments in transformer models for effective reasoning.
SDPM: Survival Diffusion Probabilistic Model for Continuous-Time Survival Analysis
Stanislav R. Kirpichenko, Andrei V. Konstantinov, Lev V. Utkin
Generative Models, Time Series, Theory
  • SDPM offers a generative approach to continuous-time survival analysis without fixed discretization.
  • The model effectively estimates survival functions using a denoising diffusion probabilistic framework.
  • SDPM demonstrates superior performance in survival function estimation, particularly in integrated Brier score.
  • The approach allows for controllable accuracy in survival estimates through sample generation.
Riemannian geometry meets fMRI: the advantages of modeling correlation manifolds and eigenvector subspaces
Mario Severino, Manuela Moretto, Robert A. McCutcheon, Mattia Veronese
Theory, Time Series, Graph Learning
  • Introduces a scalable geometric framework for analyzing correlation matrices in fMRI data.
  • Develops the Off–log metric for closed-form statistical modeling of correlation matrices.
  • Utilizes Grassmannian subspace discrimination to resolve ambiguities in eigenvector comparisons.
  • Demonstrates improved sensitivity and predictive performance in clinical and aging datasets.
PEARL: Unbiased Percentile Estimation via Contrastive Learning for Industrial-Scale Livestream Recommendation
Blake Gella, Wei Wu, Yuhao Yin, Zexi Huang, Zikai Wang, Emily Liu, Junlin Zhang, Wentao Guo, Qinglei Wang
Theory, Optimization, Reinforcement Learning
  • PEARL addresses behavioral intensity imbalance in recommender systems.
  • The framework uses nonparametric contrastive learning to estimate relative preference signals.
  • It eliminates the need for auxiliary models for distribution estimation.
  • Theoretical justification supports the unbiased nature of the proposed method.
Remember to be Curious: Episodic Context and Persistent Worlds for 3D Exploration
Lily Goli, Justin Kerr, Daniele Reda, Alec Jacobson, Andrea Tagliasacchi, Angjoo Kanazawa
Reinforcement Learning, Computer Vision, Robotics
  • Curiosity-driven exploration can be enhanced by integrating a persistent world model with episodic context.
  • The proposed method utilizes online 3D reconstruction to maintain spatial persistence.
  • The agent's policy is based on a transformer model that processes RGB observations to retain episodic history.
  • The approach outperforms traditional active-mapping methods and generalizes to unseen environments.
When Stronger Triggers Backfire: A High-Dimensional Theory of Backdoor Attacks
Donald Flynn, Hadas Yaron Goldhirsh, Jonathan P. Keating, Inbar Seroussi
Theory
  • Stronger training triggers can enhance clean test accuracy in high-dimensional models.
  • Attack success rates peak at a finite trigger strength before declining.
  • The most damaging trigger direction aligns with the eigenvector of the data covariance that has the smallest eigenvalue.
  • The study provides a rigorous theoretical framework for analyzing backdoor poisoning attacks.
From Reasoning Chains to Verifiable Subproblems: Curriculum Reinforcement Learning Enables Credit Assignment for LLM Reasoning
Xitai Jiang, Zihan Tang, Wenze Lin, Yang Yue, Shenzhi Wang, Gao Huang
Reinforcement Learning, Large Language Models, Theory
  • SCRL effectively decomposes hard problems into verifiable subproblems, enhancing learning signals.
  • The framework allows for finer-grained credit assignment through subproblem-level normalization.
  • SCRL outperforms traditional RLVR methods and strong curriculum-learning baselines on multiple benchmarks.
  • The approach leads to improved exploration in challenging reasoning tasks.
Manifold-Guided Attention Steering
Ian Li, Kapilesh Guruprasad, Raunak Sengupta, Ninad Satish, Loris D'Antoni, Rose Yu
NLP, Large Language Models, Interpretability
  • MAGS introduces a dynamic, trajectory-aware correction mechanism for attention heads in LLMs.
  • The method is grounded in the observation that reasoning errors manifest as deviations from a low-dimensional correctness manifold.
  • MAGS outperforms static steering approaches by up to 10.8% across multiple reasoning and generation benchmarks.
  • The approach is validated through diagnostic experiments confirming the separability of correct and incorrect reasoning trajectories.
No Epoch Like the Present: Robust Climate Emulation Requires Out-of-Distribution Generalisation
Bradley Stanley-Clamp, Anson Lei, Hannah M. Christensen, Ingmar Posner
Time Series
  • Climate emulation is fundamentally an out-of-distribution prediction task.
  • Seasonal variations can serve as effective proxies for long-term climate shifts.
  • Current state-of-the-art hybrid-ML emulators show significant performance degradation under realistic distribution shifts.
  • Compositional generalization is crucial for improving the robustness of climate emulators.
Understanding Multimodal Failure in Action-Chunking Behavioral Cloning
Lorenzo Mazza, Massimiliano Datres, Ariel Rodriguez, Sebastian Bodenstedt, Gitta Kutyniok, Stefanie Speidel
Robotics, Generative Models, Theory
  • Multimodal action distributions pose significant challenges in behavioral cloning.
  • Posterior-prior regularization can enhance reliability but may lead to loss of multimodal information.
  • The Lipschitz constant of the base-to-action mapping affects the ability to capture multiple modes.
  • The paper provides a formal definition of multimodality and identifies key factors for its preservation.
Same Architecture, Different Capacity: Optimizer-Induced Spectral Scaling Laws
Nandan Kumar Jha, Brandon Reagen
NLP, Large Language Models, Optimization
  • Optimizers significantly influence the spectral scaling laws of Transformer architectures, affecting how model capacity is utilized.
  • Different optimizers can yield markedly different scaling behaviors, particularly in rare-token representation scenarios.
  • Matched validation loss does not imply similar representation structures across different optimizers.
  • Optimizer-induced spectral shifts can surpass the effects of architectural changes, emphasizing the importance of optimizer choice in model design.
IKNO: Infinite-order Kernel Neural Operators
Pengyuan Zhu, Ivor W. Tsang, Yueming Lyu
Theory, Efficient ML
  • Introduction of Infinite-order Kernel Neural Operator (IKNO) for enhanced expressivity in neural operators.
  • Development of two constructions: IKNO-Vanilla and IKNO-TP, both optimized for computational efficiency.
  • Empirical results show IKNO consistently achieves state-of-the-art accuracy across multiple PDE benchmarks.
  • Significant improvements in scalability to large point clouds compared to existing methods.
Physics-Informed Generative Solver: Bridging Data-Driven Priors and Conservation Laws for Stable Spatiotemporal Field Reconstruction
Ziyuan Zhu, Keyu Hu, Zhifei Chen, Yuhao Shi, Ming Bao, Jing Zhao, Gang Wang, Haitan Xu, Jiadong Li, Qijun Zhao, Xiaodong Li, Minghui Lu, Yanfeng Chen
Generative Models, Theory, Time Series
  • Introduces a physics-informed generative framework for spatiotemporal field reconstruction.
  • Decouples training and inference processes to enhance stability and physical consistency.
  • Demonstrates effectiveness in acoustic systems and generalizes to chaotic flows and meteorological fields.
  • Addresses the limitations of traditional data-driven methods in the context of sparse measurements.
Value-Gradient Hypothesis of RL for LLMs
Arip Asadulaev, Daniil Ognev, Karim Salta, Martin Takac
Reinforcement Learning, Large Language Models, Theory
  • Critic-free RL methods like PPO and GRPO can effectively improve LLMs despite traditional RL concerns about credit assignment.
  • The actor update in critic-free RL is shown to be value-gradient-like in expectation.
  • Empirical costates in discrete transformers approximate the value gradient, with controlled error margins.
  • A predictive decomposition of RL impact into value-gradient signals and reward headroom is developed.
The Value of Covariance Matching in Gaussian DDPMs and the Lanczos Sampler
Md Sahil Akhtar, Aymane El Gadarri, Vivek F. Farias, Adam D. Jozefiak
Generative Models, Theory, Efficient ML
  • Full covariance matching reduces path-KL error from Ω(1/T) to O(1/T²).
  • The Lanczos Gaussian Sampler (LGS) enables practical sampling from optimal covariance without dense storage.
  • LGS achieves improved sample quality over strong diagonal-covariance baselines with minimal computational overhead.
  • The method leverages Jacobian-vector products to compute covariance-vector products efficiently.
Hierarchical Variational Policies for Reward-Guided Diffusion
Kushagra Pandey, Farrin Marouf Sofian, Jan Niklas Groeneveld, Felix Draxler, Stephan Mandt
Generative Models, Computer Vision, Efficient ML
  • Introduces a unified framework for test-time guidance in diffusion models using hierarchical variational policies.
  • Develops Amortized HVP (AHVP) for efficient generation of high-quality reward-aligned samples.
  • Presents Semi-Amortized HVP (SHVP) that combines amortized proposals with test-time refinement for improved quality.
  • Achieves over 5× faster inference with better perceptual quality compared to leading methods on inverse problems.
Two is better than one: A Collapse-free Multi-Reward RLIF Training Framework
Shourov Joarder, Diganta Sikdar, Ahsan Habib Akash, Binod Bhattarai, Prashnna Gyawali
Reinforcement Learning, Large Language Models
  • Introduction of a multi-reward RLIF framework that combines answer-level and completion-level rewards.
  • Implementation of GDPO normalization to mitigate reward-scale imbalance.
  • Use of KL-Cov regularization to prevent entropy collapse and maintain exploration.
  • Demonstrated improved performance and stability over single-reward RLIF methods.
TBP-mHC: full expressivity for manifold-constrained hyper connections through transportation polytopes
Anton Lyubinin
Theory, Optimization, Efficient ML
  • Introduces TBP and RTBP for exact doubly stochastic mixing matrices.
  • Achieves minimal parameterization and full expressivity without iterative normalization.
  • Demonstrates improved stability and scalability in empirical evaluations.
  • Addresses trade-offs between exactness, expressivity, memory efficiency, and speed.
Cyber-Physical Anomaly Detection in IoT-Enabled Smart Grids Using Machine Learning and Metaheuristic Feature Optimization
Adis Alihodžić, Eva Tuba, Milan Tuba
Optimization Efficient ML Interpretability
  • Proposes a machine learning framework for anomaly detection in smart grids using PMU/IED measurements.
  • Implements a genetic algorithm for feature selection, significantly reducing the feature space.
  • Demonstrates that tree-based ensemble models, especially Extra Trees, outperform other baseline models.
  • Achieves improved detection metrics while maintaining a reduced set of informative features.
Read more
Evolutionary Multi-Task Optimization for LLM-Guided Program Discovery
Halil Alperen Gozeten, Xuechen Zhang, Emrullah Ildiz, Ege Onur Taga, Tara Javidi, Samet Oymak
Optimization Large Language Models Generative Models
  • Introduces the EMO-STA framework for efficient multi-task program discovery.
  • Demonstrates improvement over single-task evolutionary methods in various settings.
  • Adaptation strategies enhance performance for both seen and unseen tasks.
  • Shared evolution reduces overfitting by promoting generalizable solutions.
Read more
Objective-Induced Bias and Search Dynamics in Multiobjective Unsupervised Feature Selection
Mathieu Cherpitel, Thomas Bäck, Martijn R. Tannemaat, Anna V. Kononova
Optimization Theory
  • Objective design critically influences the performance of multiobjective unsupervised feature selection.
  • Silhouette-based formulations often lead to low-cardinality, less informative solutions.
  • The PCA reconstruction loss objective provides a better balance between subset compactness and predictive performance.
  • Subset-size regularization and initial population strategies significantly shape the Pareto front structure.
Read more
Teaching Language Models to Forecast Research Success Through Comparative Idea Evaluation
Srujan P Mule, Aniketh Garikaparthi, Manasi Patwardhan
NLP Large Language Models Reinforcement Learning
  • Introduces a large-scale dataset of 11,488 research idea pairs for comparative forecasting.
  • Raises accuracy from 30% to 77.1% with supervised fine-tuning.
  • Outperforms GPT-5 by over 10 percentage points while being more compute-efficient.
  • Demonstrates robustness against superficial heuristics and effective reasoning capabilities.
Read more
Can Breath Biomarkers Causally Influence Blood Glucose? Investigating VOC-Mediated Modulation in Diabetes
Varsha Sharma, Prasanta K. Guha, Avik Ghose
Theory Interpretability
  • Establishes a causal framework linking breath VOCs to blood glucose levels.
  • Develops a classifier to differentiate between diabetic and non-diabetic individuals.
  • Introduces a risk-ranking system for individuals at risk of diabetes.
  • Utilizes Gaussian Mixture Models for population clustering.
Read more
Vector Policy Optimization: Training for Diversity Improves Test-Time Search
Ryan Bahlous-Boldi, Isha Puri, Idan Shenfeld, Akarsh Kumar, Mehul Damani, Sebastian Risi, Omar Khattab, Zhang-Wei Hong, Pulkit Agrawal
Reinforcement Learning Large Language Models Optimization
  • VPO focuses on generating diverse, competent solutions rather than converging on a single optimal response.
  • The method leverages vector-valued rewards to encourage exploration of the Pareto frontier of multiple objectives.
  • VPO consistently outperforms scalar RL baselines in test-time search scenarios, especially with larger candidate budgets.
  • The approach solves complex problems that traditional methods fail to solve.
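The Pareto frontier mentioned above can be made concrete with a small sketch. This is not the VPO training procedure. It only shows, for hypothetical two-objective reward vectors where higher is better, which candidates are non-dominated. The names `dominates` and `pareto_front` and the sample rewards are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Reward = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(rewards: Sequence[Reward]) -> List[Reward]:
    """Keep the candidates that no other candidate dominates."""
    return [r for r in rewards if not any(dominates(o, r) for o in rewards)]


# Hypothetical reward vectors for five candidate responses.
candidates = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4), (0.9, 0.05)]
front = pareto_front(candidates)
# (0.4, 0.4) is dominated by (0.5, 0.5); (0.9, 0.05) is dominated by (0.9, 0.1).
```

A scalar reward would rank these candidates along a single axis, whereas the frontier keeps several distinct trade-offs available for test-time search.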
Read more
Discovering Entity-Conditioned Lag Heterogeneity: A Lag-Gated Neural Audit Framework for Panel Time Series
Andi Xu
Time Series
  • Formulates entity-conditioned heterogeneous lag discovery as a testable panel time-series mining task.
  • Introduces AC-GATE, which generates entity-level effective lags through an explicit lag gating structure.
  • Proposes a layered audit protocol for evaluating forecast calibration and lag discovery.
  • Demonstrates the ability of AC-GATE to recover true heterogeneous lag structures in synthetic data.
Read more