AI-generated summaries

Today's ML research, without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

Papers today: 55
Update frequency: 8h
History: 7 days
Finding Stationary Points by Comparisons
Helin Wang, Chenyi Zhang, Xiwen Tao, Yexin Zhang, Tongyang Li
Optimization, Theory
  • Developed an algorithm for finding ϵ-stationary points using a comparison oracle with improved query complexity.
  • Introduced a quantum algorithm for the same problem, showcasing the potential of quantum methods in optimization.
  • Demonstrated that the algorithm's ϵ-dependence matches the optimal rates of second-order methods, albeit with a worse dependence on the dimension.
  • Identified the challenge of accessing gradient norms in the comparison oracle model, limiting the ability to confirm stationary points directly.
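To make the comparison-oracle setting concrete, here is a minimal Python sketch. It assumes a toy oracle that only reports which of two points has the smaller function value, and it uses a simple coordinate pattern search. This is an illustrative stand-in, not the paper's algorithm, and the names `make_comparison_oracle` and `comparison_descent` are hypothetical.

```python
from typing import Callable, List


def make_comparison_oracle(f: Callable[[List[float]], float]):
    """Wrap f so callers only learn whether f(x) < f(y), never the values."""
    def oracle(x: List[float], y: List[float]) -> bool:
        return f(x) < f(y)
    return oracle


def comparison_descent(oracle, x0, step=1.0, min_step=1e-6, max_iters=10_000):
    """Coordinate pattern search that uses comparisons only (illustrative)."""
    x = list(x0)
    iters = 0
    while step > min_step and iters < max_iters:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                candidate = list(x)
                candidate[i] += delta
                if oracle(candidate, x):  # candidate is strictly better
                    x = candidate
                    improved = True
                    break
        if not improved:
            step /= 2  # no better neighbour at this scale: refine the step
        iters += 1
    return x


if __name__ == "__main__":
    oracle = make_comparison_oracle(lambda v: (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2)
    print(comparison_descent(oracle, [0.0, 0.0]))
```

The search never sees a function value or a gradient norm, which is why it cannot certify stationarity directly.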
GEOALIGN: Geometric Rollout Curation for Robust LLM Reinforcement Learning
Ting Zhou, Zhenqing Ling, Yiyang Zhao, Ying Shen, Daoyuan Chen
Reinforcement Learning, Large Language Models
  • Identification of directional inconsistency as a failure mode in online RL for LLMs.
  • Development of GEOALIGN, a lightweight module for rollout curation that enhances training stability.
  • GEOALIGN operates on-the-fly, requiring only forward passes and minimal overhead.
  • Demonstrated improvements in performance and stability over strong baselines in various tasks.
Does Aurora Encode Atmospheric Structure? Latent Regime Analysis and Attribution
Emma Kasteleyn, Ana Lucic
Time Series, Interpretability, Efficient ML
  • Aurora's latent space is organized by seasonal cycles rather than distinct storm events.
  • Layer-wise relevance propagation (LRP) reveals that the model captures the 3D vertical structure of significant weather events.
  • Perturbation tests indicate that masking the relevant regions severely degrades forecast accuracy.
  • The study demonstrates that Aurora learns meteorological coherence without explicit guidance.
Fast LeWorldModel
Yuntian Gao, Xiangyu Xu
Robotics, Reinforcement Learning, Efficient ML
  • Fast-LeWM replaces autoregressive rollout with action-prefix prediction, enhancing planning efficiency.
  • The model allows for parallel prediction of future latents, reducing accumulated errors.
  • Fast-LeWM achieves a 3.9× acceleration in dynamics module time and improves success rates from 85.8% to 90.5%.
  • The method lowers full CEM solve time by 48.0%, demonstrating significant improvements in planning tasks.
Error-Conditioned Neural Solvers
Haina Jiang, Liam Wang, Peng-Chen Chen, Min Seop Kwak, Seungryong Kim, Brian Bell, Jeong Joon Park
Optimization, Theory, Efficient ML
  • ENS uses the PDE residual as an input to improve prediction accuracy rather than as an optimization target.
  • The framework demonstrates significant improvements in reconstruction accuracy across diverse PDE families.
  • ENS achieves up to a 10x improvement in accuracy for turbulent Kolmogorov flow problems.
  • The method exhibits robustness to initialization and generalizes well under distribution shifts.
Necessary but Not Sufficient: Temperature Control and Reproducibility in LLM-as-Judge Safety Evaluations
Hiroki Tamba
Large Language Models, NLP, Theory
  • Grader configurations in LLM-as-judge evaluations often default to a temperature of 1.0, leading to non-deterministic outcomes.
  • Setting temperature to zero reduces variability but does not eliminate it, with persistent non-reproducibility observed in several test cases.
  • Grader disagreement should be considered a critical metric in evaluating the reliability of LLM judgments.
  • The study reveals that some major LLM providers are moving away from temperature control, complicating reproducibility efforts.
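One simple way to track grader disagreement is sketched below. It assumes each test case has been judged several times with the same configuration and defines disagreement as the fraction of verdicts that differ from the majority verdict. The metric definition, the function names, and the sample verdicts are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from typing import Dict, List


def disagreement_rate(verdicts: List[str]) -> float:
    """Fraction of repeated verdicts that differ from the majority verdict."""
    if not verdicts:
        raise ValueError("need at least one verdict")
    majority_count = Counter(verdicts).most_common(1)[0][1]
    return 1.0 - majority_count / len(verdicts)


def summarize(runs: Dict[str, List[str]]) -> Dict[str, float]:
    """Per-case disagreement for a mapping of case id -> repeated verdicts."""
    return {case: disagreement_rate(v) for case, v in runs.items()}


if __name__ == "__main__":
    # Hypothetical repeated judgments for two test cases.
    runs = {
        "case-1": ["safe", "safe", "safe", "safe"],
        "case-2": ["safe", "unsafe", "safe", "safe"],
    }
    print(summarize(runs))
```

A case with a nonzero rate at temperature zero would be an instance of the residual non-reproducibility the summary describes.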
AIGP: An LLM-Based Framework for Long-Term Value Alignment in E-Commerce Pricing
Chennan Ma, Yanning Zhang, Siqi Hong, Xiuchong Wang, Fei Xiao, Keping Yang
NLP, Large Language Models, Reinforcement Learning
  • AIGP integrates LLM-based reasoning with long-term business value alignment for dynamic pricing.
  • The Long-Term Value Estimator (LTVE) automates preference pair selection for Direct Preference Optimization (DPO).
  • AIGP achieves significant improvements in GMV (+13.21%), ROI (+7.59%), and milestone achievement rate (+8.20%) over traditional methods.
  • The framework provides interpretable pricing rationales, enhancing decision transparency.
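The interplay between a value estimator and DPO can be sketched as follows. The loss is the standard per-pair DPO objective; `select_preference_pair` and its `value_estimate` argument are hypothetical stand-ins for the LTVE, whose actual design is not described in the summary.

```python
import math
from typing import Callable, Sequence, Tuple


def select_preference_pair(
    candidates: Sequence[str],
    value_estimate: Callable[[str], float],
) -> Tuple[str, str]:
    """Pick (chosen, rejected) as the highest- and lowest-valued candidates."""
    if len(candidates) < 2:
        raise ValueError("need at least two candidates")
    ranked = sorted(candidates, key=value_estimate)
    return ranked[-1], ranked[0]


def dpo_loss(
    logp_chosen: float,
    logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin))."""
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    values = {"price A": 1.0, "price B": 5.0, "price C": -2.0}  # hypothetical
    print(select_preference_pair(list(values), values.get))
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```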
Topology-Informed Neural Networks for Flood Detection in Optical and Synthetic Aperture Radar Imagery
Sophia Li, Max Zhao, Raghu G. Raj, Tianyu Chen
Computer Vision, Time Series, Interpretability
  • Introduces topology-informed neural networks for enhanced flood detection.
  • Utilizes the SEN12-FLOOD dataset for comprehensive evaluation.
  • Demonstrates the effectiveness of combining topological features with CNNs.
  • Achieves a detection accuracy of 98.9%, significantly higher than previous baselines.
EVOM: Agentic Meta-Evolution of Actor-Critic Architectures for Reinforcement Learning
Boyun Zhang, Chao Wang, Kai Wu
Reinforcement Learning, Large Language Models, Optimization
  • EVOM automates the architecture design process for actor-critic reinforcement learning models.
  • The framework utilizes a bi-level optimization approach with an inner loop for training and an outer loop for architecture evolution.
  • An LLM-based design agent generates and refines architectures, reducing reliance on predefined search spaces.
  • Experimental results show significant performance improvements over traditional methods and other automated search techniques.
DualEval: Joint Model-Item Calibration for Unified LLM Evaluation
Aaron J. Li, Hao Huang, Youngmin Park, Yitong Ma, Wei-Lin Chiang, Li Chen, Cho-Jui Hsieh, Bin Yu, Ion Stoica
NLP, Large Language Models, Interpretability
  • DUALEVAL unifies static benchmark correctness and open-ended preference signals for LLM evaluation.
  • The framework jointly estimates model abilities and item properties, enhancing evaluation stability and interpretability.
  • Empirical results show DUALEVAL achieves high accuracy in reconstructing evaluation signals and produces balanced model rankings.
  • The framework supports diagnostic applications like benchmark compression and anomaly detection.
BetXplain: An Explanation-Annotated Dataset for Detecting Manipulative Betting Advertisements on Social Media
MSVPJ Sathvik, Parmitha Vangapadu, Nishit Rane, Sathwik Narkedimilli, Mark Lee, Akrati Saxena
NLP, Interpretability
  • Introduction of BetXplain, the first dataset for detecting manipulative betting advertisements with explanations.
  • Manual annotation of advertisements for manipulative and deceptive practices, enhancing the dataset's utility for research.
  • Analysis of persuasive strategies in betting ads and their implications for mental health.
  • Potential applications for user protection and regulatory monitoring in the context of online betting.
KG-TRACE: A Neuro-Symbolic Framework for Mechanistic Grounding in Antimicrobial Resistance Prediction
Naman Garg, Sarika Jain, Sourav Yadav, Bharat K. Bhargava, Ghanapriya Singh, Abhishek Srivastava, Parimal Kar
Graph Learning, Interpretability
  • KG-TRACE integrates genomic data with a WHO knowledge graph to improve AMR prediction.
  • The framework uses a learned epistemic trust gate to balance neural and symbolic evidence.
  • Achieved an AUROC of 0.9760 for isoniazid resistance with 92.5% symbolic coverage.
  • Introduced the Biological Grounding Ratio (BGR) to measure alignment with biological knowledge.
Discovering Millions of Interpretable Features with Sparse Autoencoders
XinYang He, Wei Wang, Bing Zhao, Xuan Ren, WenBo Li, WeiXu Qiao, Hu Wei, Lin Qu
NLP, Large Language Models, Interpretability
  • Introduction of Qwen3-Instruct SAE, a comprehensive suite of Sparse Autoencoders for Qwen3 models.
  • Layer-wise SAEs are provided for key activation sites, enhancing interpretability of model features.
  • Evaluation reveals distinct sparsity-fidelity trade-offs, contributing to understanding feature representations.
  • Demonstration of SAE utility through a refusal-steering case study, influencing model behavior.
Federated Hash Projected Latent Factor Learning
Jialan He
Federated Learning, Efficient ML, Optimization
  • FHPLF reduces communication and computation costs by using binary gradient-like matrices instead of real-valued gradients.
  • The model incorporates Projected Hamming Distance to enhance the representation capability of binary codes.
  • The SBG-PEU strategy minimizes the risk of privacy leakage during data transmission.
  • FHPLF consistently outperforms state-of-the-art methods in terms of accuracy and efficiency across multiple datasets.
A Multi-Fidelity Convolutional Autoencoder-Transfer Learning Framework for Guided-Wave-Based Damage Diagnosis Using Large Simulated and Limited Experimental Datasets
Santosh Kapuria, Abhishek
Efficient ML
  • Introduces a multi-fidelity transfer learning framework for guided-wave structural health monitoring (GWSHM).
  • Utilizes a convolutional autoencoder for deep feature learning from limited experimental data.
  • Achieves high accuracy in damage localization and sizing with R² scores above 0.93 and 0.99, respectively.
  • Demonstrates strong generalization capabilities on unseen data.
A Generalization Theory for JEPA-Based World Models
Jingyi Cui, Qi Zhang, Hongwei Wen, Yisen Wang
Theory, Graph Learning, Robotics
  • Establishment of a spectral graph-based theoretical framework for JEPA-based world models.
  • Demonstration of the equivalence between JEPA risk and matrix factorization of the co-occurrence matrix.
  • Derivation of a generalization error bound linking JEPA pretraining risk to downstream planning performance.
  • Identification of a trade-off between approximation and sample errors concerning latent dimensions.
Mesh-RL: Coupled subgrid reinforcement learning
Behnam Gheshlaghi, Bahador Rashidi, Shahin Atakishiyev
Reinforcement Learning
  • Mesh-RL introduces a spatial domain-decomposition framework for reinforcement learning.
  • It enforces boundary-consistent TD updates to enhance value propagation.
  • The framework improves convergence speed, cumulative reward, and learning stability.
  • Higher mesh resolutions lead to better exploration and prevent premature convergence.
SOLAR: AI-Powered Speed-of-Light Performance Analysis
Qijing Huang, Sana Damani, Zhifan Ye, Athinagoras Skiadopoulos, Siva Kumar Sastry Hari, Jason Clemons, Sahil Modi, Jingquan Wang, Aditya Kane, Edward C Lin, Humphrey Shi, Christos Kozyrakis
Optimization, Efficient ML, Robotics
  • SOLAR is the first tool to automatically derive validated speed-of-light (SOL) bounds from PyTorch and JAX source code.
  • The framework utilizes a three-stage pipeline that separates generative translation from deterministic analysis.
  • SOLAR achieves 100% operator and language coverage on KernelBench, outperforming existing tools.
  • The analysis reveals substantial optimization opportunities, with headroom improvements of up to 7.8x.
Neural Architecture Search for Generative Adversarial Networks: A Comprehensive Review and Critical Analysis
Abrar Alotaibi, Moataz Ahmed
Generative Models, Optimization, Computer Vision
  • NAS significantly optimizes GAN architecture design, improving performance and stability.
  • Evolutionary algorithms and gradient-based methods are particularly effective in certain scenarios.
  • Robust evaluation metrics are crucial for accurately assessing GAN performance.
  • Diverse datasets are essential for comprehensive evaluation of GANs.
Recovering Governing Equations from Solution Data: Identifiability Bounds for Linear and Nonlinear ODEs
Yang Pan, Helmut Bölcskei
Theory
  • Introduces Hausdorff distance as a metric for comparing differential equations.
  • Establishes identifiability bounds for a wide range of ODE classes.
  • Quantifies sample complexity needed for reliable recovery of governing equations.
  • Addresses theoretical gaps in the uniqueness and stability of ODE identification.
EMA-FS: Accelerating GBDT Training via Gain-Informed Feature Screening
Yan Song
Efficient ML
  • EMA-FS optimizes GBDT training by focusing histogram construction on high-gain features.
  • The method achieves significant speedups (up to 2.61x) while maintaining model accuracy.
  • S-EMA-FS offers a flexible framework that combines deterministic and stochastic feature selection.
  • EMA-FS is implemented in a compact manner, ensuring compatibility with existing LightGBM functionalities.
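The gain-informed screening idea can be sketched roughly as follows. This is an assumption-laden illustration: it keeps an exponential moving average (EMA) of each feature's split gain and restricts the next round to the top-scoring features. The decay value, the top-`keep` rule, and both function names are hypothetical, and the paper's actual screening rule may differ.

```python
from typing import List


def update_ema(ema: List[float], gains: List[float], decay: float = 0.9) -> List[float]:
    """Blend this round's per-feature split gains into the running average."""
    return [decay * e + (1.0 - decay) * g for e, g in zip(ema, gains)]


def screen_features(ema: List[float], keep: int) -> List[int]:
    """Indices of the `keep` features with the highest EMA gain, in index order."""
    order = sorted(range(len(ema)), key=lambda i: ema[i], reverse=True)
    return sorted(order[:keep])


if __name__ == "__main__":
    ema = [0.0, 0.0, 0.0]
    # Hypothetical per-feature gains observed in two boosting rounds.
    for round_gains in ([1.0, 0.0, 3.0], [0.5, 0.1, 2.0]):
        ema = update_ema(ema, round_gains)
    # Only these features would get histograms built in the next round.
    print(screen_features(ema, keep=2))
```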
SharQ: Bridging Activation Sparsity and FP4 Quantization for LLM Inference
Haoqian Meng, Yilun Luo, Yafei Zhao, Wenyuan Liu, Huaqing Zheng, Xindian Ma, Peng Zhang
Large Language Models, Efficient ML, Multimodal
  • SharQ combines activation sparsity and FP4 quantization without requiring training or calibration data.
  • The method uses an online N:M mask to create an outlier-dominated sparse backbone for quantization.
  • It defines a dense residual relative to the quantized sparse backbone, improving accuracy and efficiency.
  • SharQ achieves significant reductions in latency and improvements in throughput for LLM inference.
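For readers unfamiliar with N:M sparsity, the sketch below builds a generic magnitude-based N:M mask: in every group of M consecutive values, only the N largest-magnitude entries are kept. That SharQ selects entries by magnitude is an assumption here, suggested by the phrase "outlier-dominated". Quantization is omitted, and `nm_mask` and `split_sparse_and_rest` are hypothetical names.

```python
from typing import List, Tuple


def nm_mask(values: List[float], n: int = 2, m: int = 4) -> List[bool]:
    """True where an entry is among the n largest magnitudes of its group of m."""
    if len(values) % m != 0:
        raise ValueError("length must be a multiple of m")
    mask: List[bool] = []
    for start in range(0, len(values), m):
        group = values[start:start + m]
        ranked = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(ranked[:n])
        mask.extend(i in keep for i in range(m))
    return mask


def split_sparse_and_rest(values: List[float], mask: List[bool]) -> Tuple[List[float], List[float]]:
    """Split values into the masked sparse part and the remaining entries."""
    sparse = [v if k else 0.0 for v, k in zip(values, mask)]
    rest = [v - s for v, s in zip(values, sparse)]
    return sparse, rest


if __name__ == "__main__":
    activations = [0.1, -3.0, 0.2, 2.0, 1.0, 0.0, -0.5, 0.05]  # hypothetical
    mask = nm_mask(activations, n=2, m=4)
    print(mask)
    print(split_sparse_and_rest(activations, mask))
```

In the method as summarized, the residual is defined relative to the quantized sparse backbone, so the second function only illustrates the unquantized split.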
The Geometry of Updates: Fisher Alignment at Vocabulary Scale
John Sweeney
Large Language Models, Theory, Optimization
  • Fisher alignment is shown to be non-identifiable using representation-only metrics without assumptions on error geometry.
  • FisherSketch provides a practical method for estimating head Fisher alignment at vocabulary scale with minimal computational overhead.
  • The method allows for effective source selection and diagnostic analysis of task similarity in LLMs.
  • FisherSketch outperforms traditional activation-only metrics in various experimental settings, demonstrating its utility in transfer learning.
Cross-Head Attention Uplift Network with Inverse Propensity Score under Unobserved Confounding
Haoran Zhang, Chuanpu Li, Yuxin Fu, Bin Tong, Guan Wang, Bo Zheng, Feng Zhou
Theory
  • Introduction of CHAUN, which leverages cross-head attention for better inter-group correlation modeling.
  • Theoretical proof that true propensity scores ensure identifiability of the individual treatment effect (ITE) despite unobserved confounders.
  • The RA-IPS method optimizes propensity weights to mitigate bias from unobserved variables.
  • CHAUN shows significant performance improvements over state-of-the-art uplift models.
Learning Probabilistic Filters with Strictly Proper Scoring Rules
Eviatar Bach, Ricardo Baptista, Jochen Bröcker, Bohan Chen, Andrew Stuart
Theory, Time Series, Optimization
  • Introduction of the Proper Scoring Ensemble Filter (PSEF) for Bayesian filtering.
  • Training based on strictly proper scoring rules to enhance probabilistic accuracy.
  • Theoretical foundation linking the population objective to the true Bayesian filtering distribution.
  • Numerical experiments demonstrate superior performance in challenging filtering scenarios.
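As a concrete example of scoring an ensemble forecast, the sketch below computes the standard ensemble estimator of the continuous ranked probability score (CRPS), a widely used strictly proper scoring rule for scalar forecasts. The summary does not say which scoring rule PSEF is trained with, so the choice of CRPS is an assumption.

```python
from typing import List


def crps_ensemble(ensemble: List[float], observation: float) -> float:
    """Ensemble CRPS estimate: mean|x_i - y| - 0.5 * mean|x_i - x_j|."""
    n = len(ensemble)
    if n == 0:
        raise ValueError("ensemble must not be empty")
    accuracy = sum(abs(x - observation) for x in ensemble) / n
    spread = sum(abs(xi - xj) for xi in ensemble for xj in ensemble) / (n * n)
    return accuracy - 0.5 * spread


if __name__ == "__main__":
    # Two hypothetical ensembles scored against the same observation.
    print(crps_ensemble([0.0, 2.0], 1.0))
    print(crps_ensemble([0.9, 1.1], 1.0))
```

Lower is better: the tighter ensemble centred on the observation receives the smaller score.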
Symplectic Neural Networks for learning Generalized Hamiltonians
Harsh Choudhary, Vyacheslav Kungurtsev, Chandan Gupta, Melvin Leok, Georgios Korpas
Theory, Efficient ML, Robotics
  • Development of a neural framework for learning generalized Hamiltonians from noisy trajectory observations without structural bias.
  • Demonstration of the Hamiltonian neural network's (HNN's) ability to generalize to out-of-distribution data and learn governing Hamiltonians for chaotic systems.
  • Introduction of an efficient gradient computation method using adjoint sensitivity equations derived from symplectic discretizations.
  • Application of backward error analysis to improve the accuracy of the learned Hamiltonian.
Epiphany-Aware KV Cache Eviction Without the Attention Matrix
Steven Kolawole, Virginia Smith
Large Language Models, Efficient ML, NLP
  • EPIKV scores tokens based on internal representation changes rather than attention weights, leading to better eviction decisions.
  • The method scales to contexts 16 times longer than traditional attention-based methods without exhausting GPU memory.
  • EPIKV matches or exceeds the performance of existing attention-based eviction methods on benchmark datasets.
  • The approach runs up to 2.8 times faster than traditional methods while maintaining comparable eviction quality.
Can Large Language Models Reliably Code Qualitative Humanitarian Data? A Benchmark Study Against Human Expert Adjudication
Jerome Marston, Tino Kreutzer, Salomé Garnier, Ella Boone, Phuong N Pham, Patrick Vinck
NLP, Large Language Models
  • First direct evaluation of LLM coding reliability on humanitarian-context data.
  • Multi-stage evaluation framework extending conventional inter-coder reliability testing.
  • Performance variation among LLMs based on humanitarian themes and reasoning effort.
  • LLMs can enhance humanitarian analytical capacity but should not replace human judgment.
Asymptotically Optimal Learning for Parametric Prophet Inequalities
Jung-hun Kim, Anna Grebennikova, Vianney Perchet
Theory, Optimization
  • Characterization of optimal full-information asymptotic competitive ratios for exponential-type parametric families.
  • Development of a confidence-based dynamic programming policy for online learning that does not require offline samples.
  • Derivation of distribution-specific convergence guarantees for various reward distributions.
  • Numerical experiments demonstrate the effectiveness of the proposed algorithm in practical scenarios.
Deterministic Pareto-Optimal Policy Synthesis for Multi-Objective Reinforcement Learning
Aniruddha Joshi, Niklas Lauffer, Sanjit Seshia
Reinforcement Learning, Optimization, Theory
  • Introduction of a novel preference-conditioned Bellman operator for MORL.
  • Proof of convergence to the Pareto-optimal values for deterministic policies.
  • Extraction of deterministic policies from converged Q-estimates covering the Pareto frontier.
  • Empirical validation showing the algorithm's ability to recover complex trade-offs.
Heavy-Ball Q-Learning with Residual Weighting Correction
Donghwan Lee
Reinforcement Learning, Theory, Optimization
  • Introduces a corrected heavy-ball Q-learning method for faster convergence.
  • Establishes theoretical conditions for acceleration compared to standard Q-learning.
  • Utilizes switched linear system (SLS) representation and joint spectral radius (JSR) for analysis.
  • Extends findings to Q-learning with linear function approximation.
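For orientation, a plain heavy-ball (momentum) variant of tabular Q-learning can be sketched as below: the usual temporal-difference step plus a momentum term proportional to the previous change in the Q-value. This sketch does not include the paper's residual weighting correction, which the summary does not specify. The function name and the default step sizes are illustrative assumptions.

```python
from typing import List


def heavy_ball_q_update(
    q: List[List[float]],
    q_prev: List[List[float]],
    s: int,
    a: int,
    r: float,
    s_next: int,
    alpha: float = 0.1,
    beta: float = 0.5,
    gamma: float = 0.9,
) -> float:
    """One in-place update: Q += alpha * TD error + beta * (Q - Q_prev)."""
    td_error = r + gamma * max(q[s_next]) - q[s][a]
    momentum = q[s][a] - q_prev[s][a]
    new_value = q[s][a] + alpha * td_error + beta * momentum
    q_prev[s][a] = q[s][a]  # remember the value before this step
    q[s][a] = new_value
    return new_value


if __name__ == "__main__":
    # Two states, two actions, all values initialised to zero.
    q = [[0.0, 0.0], [0.0, 0.0]]
    q_prev = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(2):
        print(heavy_ball_q_update(q, q_prev, s=0, a=0, r=1.0, s_next=1))
```

Setting `beta=0` recovers the standard Q-learning update.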
CascadeFormer: Depth-Tapered Transformers Motivated by Gradient Fan-in Asymmetry
Huzama Ahmad, Cao Viet Hai Nam, Se-Young Yun
NLP, Large Language Models, Efficient ML
  • CascadeFormer architecture tapers width with depth to optimize information flow.
  • Gradient Fan-in Asymmetry (GFA) is proposed as a structural explanation for layer redundancy in deep transformers.
  • CascadeFlow Pruning (CFP) effectively prunes layers based on training gradients, outperforming traditional heuristics.
  • Empirical tests validate the GFA hypothesis, showing that structural factors, not just gradient magnitude, affect layer importance.
A Causal Foundation Model for Structure and Outcome Prediction
Max Zhu, Martino Mansoldo, Ching-Hao Wang, Stefan Groha
Graph Learning, Theory, Interpretability
  • TabPFN-CFM predicts causal structures and outcomes from observational data.
  • Supports all three levels of Pearl's Causal Hierarchy.
  • Utilizes known graph structures to improve prediction accuracy.
  • Employs a refined training procedure that enhances efficiency.
How Good Can Linear Models Be for Time-Series Forecasting?
Lang Huang, Jinglue Xu, Luke Darlow
Time Series
  • Ridge regression, when carefully tuned, outperforms prior linear forecasting models and matches or exceeds transformer and MLP architectures on several benchmarks.
  • The optimal lookback period for forecasting is highly specific to the dataset and often non-monotonic with respect to the forecast horizon.
  • Normalizing over a learned trailing fraction of the context improves forecasting accuracy compared to using the entire context.
  • Different time series within the same dataset may require distinct hyperparameters, indicating the importance of series heterogeneity.
PersistentKV: Page-Aware Decode Scheduling for Long-Context LLM Serving on Commodity GPUs
Muhammad Ahmed
Large Language Models, Efficient ML, Optimization
  • PersistentKV enhances long-context LLM serving by optimizing KV-cache management and decode scheduling.
  • The system employs a calibrated adaptive policy to select the most efficient decoding strategy based on active batch sizes.
  • A native block-table decode engine is introduced, which improves memory utilization and reduces overhead.
  • The methodology includes rigorous comparisons with existing systems, showcasing the advantages of the proposed approach.
Kolmogorov Arnold networks (KAN) for aerodynamic prediction: a comparison with MLPs and GNNs
Miguel Jaraiz, Fermin Gutierrez, Pablo Yeste, Miguel Sánchez-Domínguez, Eusebio Valero, Gonzalo Rubio, Lucas Lacasa
Theory, Efficient ML, Graph Learning
  • KANs offer a novel approach to neural network architecture by adapting activation functions.
  • In aerodynamic prediction tasks, KANs perform comparably to, though slightly worse than, MLPs and GNNs.
  • KANs converge faster during training due to their lower complexity compared to MLPs and GNNs.
  • Training instabilities and hyperparameter sensitivity are significant challenges for KANs.
Theory-Scale Auto-Formalization of Logics for Computer Science
Yuming Feng, Frederick Pu, One An, Osbert Bastani, Li Zhang, Jiani Huang, Xujie Si, Ziyang Li
Theory
  • Introduction of LCS-Bench, a theory-scale benchmark for auto-formalization.
  • Development of a semi-automated pipeline for constructing the benchmark.
  • Creation of five evaluation tracks with 1,271 benchmark instances.
  • Demonstration of the benchmark's challenging nature with state-of-the-art models achieving only 20.1% success.
Reasoning Quality Emerges Early: Data Curation for Reasoning Models
Hongyi Henry Jin, Wenhan Yang, Meysam Ghaffari, Carlos Morato, Baharan Mirzasoleiman
NLP, Large Language Models, Efficient ML
  • Introduces a new method for data curation in reasoning models that relies on initial reasoning tokens.
  • Demonstrates that the first 100 tokens can effectively indicate problem difficulty.
  • Establishes that similar loss patterns in initial tokens lead to similar gradients during training.
  • Achieves up to a 1.7% performance improvement while being 91% more token-efficient than existing methods.
Multipath Adaptive Gated Bottleneck Latent ODE with Raman Data Fusion for Cell Culture Process Forecasting
Johnny Peng, Thanh Tung Khuat, Ellen Otte, Katarzyna Musial, Bogdan Gabrys
Time Series, Multimodal
  • Introduction of a novel adaptive framework for bioprocess forecasting.
  • Development of the Gated Bottleneck Latent ODE to improve learning from sparse data.
  • Implementation of Multi-Path Just-In-Time Fine-Tuning for generating multiple plausible forecasts.
  • Fusion of Raman spectroscopy data to enhance the observability of bioprocess runs.
Decision-Aligned Evaluation of Uncertainty Quantification
Annika Schneider, Tommy Rochussen, Joshua Stiller, Vincent Fortuin
Theory
  • Introduces decision-alignment as a formal criterion for evaluating UQ metrics.
  • Identifies misalignment in widely used UQ metrics with practical decision-making.
  • Proposes prior-weighted utility metrics that better capture decision utility.
  • Demonstrates the effectiveness of the new metrics through experiments and case studies.
Equivariance and Augmentation for Bayesian Neural Networks
Miaowen Dong, Axel Flinth, Jan E. Gerken
Theory
  • The paper establishes a theoretical framework for understanding how data augmentation induces equivariance in BNNs.
  • Three novel symmetrization techniques are introduced to enhance the equivariance of BNNs trained on augmented data.
  • The orbit expansion method outperforms baseline models in terms of both equivariance and overall performance.
  • The study provides bounds on the equivariance error and conditions for maintaining equivariance during training.
Implementation of reinforcement learning in chemical reaction networks: application to phototaxis as curiosity-driven exploration
Ruyi Tang, Grégoire Sergeant-Perthuis, David Colliaux
Reinforcement Learning, Robotics, Theory
  • Integration of reinforcement learning with biochemical reaction networks for modeling phototaxis.
  • Formulation of phototaxis as a subjective POMDP, highlighting the role of sensory ambiguity.
  • Use of Inverse Reinforcement Learning to derive a data-driven phototactic policy from experimental data.
  • Demonstration of how tumbling behavior aids in resolving sensory ambiguity and supports adaptive navigation.
Transformer-Based Classification of Bacterial Raman Spectra with LOOCV
Jamile Mohammad Jafari, Thomas Bocklitz
Theory
  • Transformer models outperform conventional machine learning methods in classifying bacterial Raman spectra.
  • The study employs a nested leave-one-replicate-out cross-validation framework for robust evaluation.
  • Transformers demonstrate superior class separation and maintain performance on raw spectra without preprocessing.
  • The findings highlight the significance of replicate-aware validation in model evaluation.
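Replicate-aware splitting can be sketched in a few lines. The code below assumes each spectrum carries a replicate label, holds out one replicate at a time for testing, and repeats the same scheme on the remaining replicates for inner-loop model selection. The function names and labels are hypothetical, and the paper's exact protocol may differ.

```python
from typing import Iterator, List, Tuple


def leave_one_replicate_out(replicates: List[str]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx), holding out all samples of one replicate."""
    for held_out in sorted(set(replicates)):
        train = [i for i, r in enumerate(replicates) if r != held_out]
        test = [i for i, r in enumerate(replicates) if r == held_out]
        yield train, test


def nested_splits(replicates: List[str]):
    """Yield (outer_train, outer_test, inner_splits) with replicate-aware inner folds."""
    for outer_train, outer_test in leave_one_replicate_out(replicates):
        inner_labels = [replicates[i] for i in outer_train]
        inner = [
            ([outer_train[j] for j in tr], [outer_train[j] for j in va])
            for tr, va in leave_one_replicate_out(inner_labels)
        ]
        yield outer_train, outer_test, inner


if __name__ == "__main__":
    labels = ["r1", "r1", "r2", "r3"]  # hypothetical replicate label per spectrum
    for split in nested_splits(labels):
        print(split)
```

Because whole replicates are held out together, no replicate contributes samples to both sides of any split.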
Beyond the Hard Budget: Sparsity Regularizers for More Interpretable Top-k Sparse Autoencoders
Nathanaël Jacquier, Maria Vakalopoulou, Mahdi S. Hosseini
Computer Vision, Interpretability
  • Introduction of two sparsity regularizers for Top-k Sparse Autoencoders.
  • Regularizers improve monosemanticity and interpretability of latent representations.
  • The ℓ1/ℓ2 penalty enhances robustness to variations in the sparsity budget.
  • Evaluation across multiple datasets and vision foundation models shows consistent improvements.
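The two ingredients named above can be illustrated with a short sketch: a Top-k activation that keeps the k largest positive latents, and the ℓ1/ℓ2 ratio, which is small for sparse vectors and grows as activation mass spreads out. How the paper combines them into a training loss is not stated in the summary, so this shows only the generic quantities, and the function names are hypothetical.

```python
import math
from typing import List


def top_k(latents: List[float], k: int) -> List[float]:
    """Keep the k largest latents (if positive) and zero out the rest."""
    ranked = sorted(range(len(latents)), key=lambda i: latents[i], reverse=True)
    keep = set(ranked[:k])
    return [v if i in keep and v > 0 else 0.0 for i, v in enumerate(latents)]


def l1_over_l2(latents: List[float], eps: float = 1e-12) -> float:
    """Ratio of the l1 norm to the l2 norm; ranges from 1 to sqrt(len) for nonzero input."""
    l1 = sum(abs(v) for v in latents)
    l2 = math.sqrt(sum(v * v for v in latents))
    return l1 / (l2 + eps)


if __name__ == "__main__":
    latents = [0.5, 2.0, -1.0, 1.0]  # hypothetical pre-activation values
    active = top_k(latents, k=2)
    print(active, l1_over_l2(active))
```

Unlike the hard budget k, the ratio is scale-invariant, so it penalizes how spread out the surviving activations are rather than how large they are.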
Revisiting Action Factorization for Complex Action Spaces
Timothy Flavin, Sandip Sen
Reinforcement Learning, Robotics, Optimization
  • Introduces a cross-sectional study of action factorization methods across multiple RL algorithms and action spaces.
  • Presents two new environments to isolate challenges in hybrid action spaces.
  • VDN-PPO and PPO-MIX outperform other tested PPO factorizations by effectively assigning credit to action heads.
  • Shared encoder architectures offer the best compute-performance trade-off in most scenarios.
High-Probability PL-SGD with Markovian Noise: Optimal Mixing and Tail Dependence
Dhruv Sarkar, Aprameyo Chakrabartty, Vaneet Aggarwal
Optimization, Theory
  • Establishes linear dependence on mixing time for high-probability PL-SGD under Markovian noise.
  • Closes the gap between expectation and high-probability bounds in previous analyses.
  • Introduces a clipped block method for heavy-tailed Markovian gradients.
  • Provides matching lower bounds for both light-tailed and heavy-tailed scenarios.
At the Edge of Understanding: Sparse Autoencoders Trace The Limits of Transformer Generalization
Praneet Suresh, Jack Stanley, Sonia Joseph, Luca Scimeca, Danilo Bzdok
NLP, Large Language Models, Interpretability
  • LLMs exhibit increased reliance on spurious concepts when encountering out-of-distribution (OOD) inputs.
  • Minor distribution shifts in input prompts can lead to significant performance drops.
  • SAE-derived indicators can effectively identify per-sample distribution shifts.
  • The approach allows for targeted fine-tuning to enhance LLM robustness against adversarial inputs.
Blackwell Approachability and Gradient Equilibrium are Equivalent
Brian W. Lee, Nika Haghtalab, Michael I. Jordan, Ryan J. Tibshirani
Theory, Optimization
  • GEQ is algorithmically equivalent to Blackwell Approachability, allowing for the use of BA algorithms to solve GEQ problems.
  • The paper provides efficient reductions that facilitate the transfer of guarantees from regret minimization to GEQ.
  • Necessary and sufficient conditions for achieving GEQ are identified, enhancing the theoretical understanding of the framework.
  • The equivalence between GEQ and other frameworks like regret minimization and calibration clarifies their interconnections.
Automating Potential-based Reward Shaping with Vision Language Model Guidance
Henrik Müller, Daniel Kudenko
Reinforcement Learning, Multimodal, Robotics
  • Introduction of VLM-PBRS framework for automating potential-based reward shaping.
  • Utilization of smaller, cost-effective vision language models to generate potential functions.
  • Empirical validation showing improved sample efficiency and robustness to reward hacking.
  • Demonstration of the connection between VLM preference label accuracy and learning efficiency.
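Potential-based reward shaping itself has a compact standard form: the shaping term is γΦ(s') − Φ(s), added to the environment reward. The sketch below uses a hand-written distance-to-goal potential purely as a placeholder for the VLM-generated potential function, which is not described in the summary. The function name and the toy state space are assumptions.

```python
from typing import Callable


def shaped_reward(
    reward: float,
    state: int,
    next_state: int,
    potential: Callable[[int], float],
    gamma: float = 0.99,
) -> float:
    """Environment reward plus the shaping term gamma * phi(s') - phi(s)."""
    return reward + gamma * potential(next_state) - potential(state)


if __name__ == "__main__":
    goal = 5

    def potential(s: int) -> float:
        # Placeholder potential: negative distance to the goal on a 1-D line.
        return -abs(goal - s)

    # Moving toward the goal earns a positive bonus; moving away is penalised.
    print(shaped_reward(0.0, 2, 3, potential, gamma=1.0))
    print(shaped_reward(0.0, 3, 2, potential, gamma=1.0))
```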
Dataset Usage Inference without Shadow Models or Held-out Data
Wojciech Łapacz, Stanisław Pawlak, Jan Dubiński, Franziska Boenisch, Adam Dziedzic
Generative Models, Computer Vision, Efficient ML
  • Introduces NU-DUI, a framework for Dataset Usage Inference that does not require shadow models or held-out data.
  • Generates synthetic non-member samples to enhance the accuracy of dataset usage estimates.
  • Recasts DUI as a mixture proportion estimation problem, making it computationally efficient.
  • Empirical results show NU-DUI provides accurate member-ratio estimates across multiple large-scale generative models.
Optimizing CUDA like a Human: Micro-Profiling Tools as Expert Surrogates for LLM-Based GPU Kernel Optimization
Jiading Gai, Shuai Zhang, Kaj Bostrom, Jin Huang, Vihang Patil, Haoyang Fang, Bernie Wang, Huzefa Rangwala, George Karypis
Optimization, Large Language Models, Efficient ML
  • KernelPro integrates LLM code generation with expert heuristics for GPU kernel optimization.
  • The system achieves state-of-the-art performance with significant speedups on the KernelBench benchmark.
  • KernelPro incorporates energy efficiency as a secondary objective, reducing energy consumption while maintaining speed.
  • Ablation studies confirm the effectiveness of each design component in improving optimization quality.
Statistical and Structural Approaches to Algorithmic Fairness
Antonio Ferrara
Theory
  • Identifies limitations in current algorithmic fairness paradigms, particularly deterministic auditing metrics.
  • Proposes statistical hypothesis testing as a more robust method for assessing fairness.
  • Emphasizes the importance of structural context in understanding algorithmic fairness.
  • Advocates for reshaping network structures to promote fairness in opportunities.
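To illustrate the difference between reporting a raw fairness gap and testing it statistically, the sketch below runs a permutation test on the demographic-parity gap between two groups. The summary does not say which tests the author proposes, so the choice of a permutation test, the function names, and the sample data are all assumptions.

```python
import random
from typing import List, Sequence, Tuple


def positive_rate(outcomes: Sequence[int]) -> float:
    """Share of positive (1) decisions in a group."""
    return sum(outcomes) / len(outcomes)


def parity_gap_permutation_test(
    group_a: List[int],
    group_b: List[int],
    n_permutations: int = 1000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (observed gap, p-value) under random reassignment of group labels."""
    rng = random.Random(seed)
    observed = abs(positive_rate(group_a) - positive_rate(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        gap = abs(positive_rate(pooled[:n_a]) - positive_rate(pooled[n_a:]))
        if gap >= observed - 1e-12:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Hypothetical binary decisions for two demographic groups.
    print(parity_gap_permutation_test([1] * 12 + [0] * 8, [1] * 9 + [0] * 11))
```

A deterministic audit would report only the gap; the p-value indicates whether a gap of that size is plausible under chance alone at the given sample size.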
Sketched Linear Contrastive Learning: Approximation, Optimization, and Statistical Scaling
Ziyan Chen, Zhongzhu Zhou, Ding-Xuan Zhou
Theory, Optimization, Multimodal
  • Introduces a theoretical framework for scaling laws in contrastive learning.
  • Derives a risk decomposition that highlights the effects of approximation, optimization, and sampling.
  • Demonstrates that contrastive learning requires learning interactions between two views, affecting scaling behavior.
  • Provides an explicit scaling law that connects sketch dimension, sample size, and optimization horizon.
fTNN: a tensor neural network for fractional PDEs
Qingkui Ma, Hehu Xie, Xiaobo Yin
Theory, Optimization, Efficient ML
  • Introduction of fTNN, a tensor neural network for fractional PDEs.
  • Development of a deterministic integration framework for the fractional Laplacian.
  • Use of boundary-singularity-aware trial functions to improve solution accuracy.
  • Design of a spatiotemporally separable neural network for time-dependent PDEs.
Effective Covariance Dynamics in Solvable High-Dimensional GANs
Andrew Bond, Zafer Doğan
Generative Models, Theory, Optimization
  • Introduces effective covariance dynamics for multi-feature GANs with complex latent structures.
  • Derives a high-dimensional ODE that captures the training dynamics of GANs with correlated latent variables.
  • Identifies a spectral solvable region that governs learning stability and recovery in GAN training.
  • Demonstrates a signal-boosting mechanism where weak latent directions can be enhanced through low-rank correlations.