AI-generated summaries

Today's ML research,
without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

58 Papers today
8h Update frequency
7 Days of history
CaliDist: Calibrating Large Language Models via Behavioral Robustness to Distraction
Mohammad Anas Jawad, Cornelia Caragea
NLP Large Language Models
  • CALIDIST introduces a behavior-centric approach to calibrating LLMs by measuring their stability against distractions.
  • The method quantifies prediction instability and confidence stability to adaptively scale confidence scores.
  • Extensive experiments show that CALIDIST outperforms traditional calibration methods, achieving a significant reduction in ECE.
  • The findings suggest that a model's susceptibility to distractions is a strong predictor of its accuracy.
Read more
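The CaliDist summary above reports gains in ECE (expected calibration error). As a reference point, here is a minimal sketch of the standard equal-width binned ECE. It is the generic metric, not the CALIDIST method, and the function name and `n_bins` default are illustrative assumptions.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - confidence| over equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Map each confidence to a bin index; a confidence of 1.0 lands in the last bin.
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1], right=True), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece
```

A lower value means stated confidence tracks empirical accuracy more closely.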
Non-Negative Matrix Factorization for Event Data
Raphaël Romero
Time Series
  • EventNMF operates directly on continuous-time event data without preprocessing, preserving fine-grained temporal features.
  • The model utilizes a Poisson process framework with non-negative B-spline basis for intensity factorization.
  • Efficient parameter estimation is achieved through multiplicative updates.
  • EventNMF is validated on synthetic and real-world datasets, demonstrating its effectiveness in various applications.
Read more
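The EventNMF summary above mentions multiplicative updates under a Poisson model. The sketch below shows the classical multiplicative updates for NMF under the generalized KL (Poisson) objective on a pre-binned count matrix. This is the textbook discrete analogue, not EventNMF itself, which per the summary works on continuous-time events with a B-spline basis; all names here are illustrative.

```python
import numpy as np

def generalized_kl(V, R, eps=1e-10):
    """Generalized KL divergence between data V and reconstruction R."""
    return float(np.sum(V * np.log((V + eps) / (R + eps)) - V + R))

def poisson_nmf(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Factor a non-negative matrix V ~ W @ H with multiplicative KL updates."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        ratio = V / (W @ H + eps)
        H *= (W.T @ ratio) / (W.sum(axis=0)[:, None] + eps)
        ratio = V / (W @ H + eps)
        W *= (ratio @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H
```

Because every update multiplies by a non-negative ratio, the factors stay non-negative without any projection step.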
Mitigating the Curse of Dimensionality in Uniform Convergence of Deep Neural Networks via Smooth Activations
Yizhe Ding, Runze Li, Jia Liu, Lingzhou Xue
Theory
  • Smoothly activated DNNs provide stronger uniform convergence guarantees than ReLU networks.
  • The paper establishes the first theoretical lower bound for the uniform convergence of ReLU FNNs, demonstrating their limitations.
  • A comprehensive theoretical framework for smooth DNNs is developed, including pseudo-dimension bounds and approximation guarantees.
  • Uniform convergence rates are derived for smooth DNN estimators across various statistical contexts.
Read more
Tangram: Unlocking Non-Uniform KV Cache for Efficient Multi-turn LLM Serving
Hyungmin Kim, Minsoo Kim, Hongseok Kim, Jungwook Choi
Large Language Models Efficient ML NLP
  • Tangram addresses inefficiencies in multi-turn LLM serving caused by non-uniform KV caches.
  • The system employs three core techniques to optimize memory management and scheduling.
  • Experimental results show a throughput improvement of up to 2.6× without sacrificing model accuracy.
  • Tangram's implementation is publicly available, promoting further research and application.
Read more
Two-Way Is Better Than One: Bidirectional Alignment with Cycle Consistency for Exemplar-Free Class-Incremental Learning
Hongye Xu, Bartosz Krawczyk
Theory Efficient ML Computer Vision
  • Introduces BiCyc, a bidirectional cycle consistency approach for EFCIL.
  • Addresses systematic bias in existing one-directional projection methods.
  • Proves that cycle loss minimizes classification perturbations and stabilizes old-class decisions.
  • Demonstrates substantial improvements in accuracy and reduction of forgetting in EFCIL benchmarks.
Read more
Statistically Reliable LLM-Based Ranking Evaluation via Prediction-Powered Inference
Abhishek Divekar
NLP Large Language Models Theory
  • Introduces PRECISE, a framework for bias-corrected ranking evaluation using LLMs.
  • Achieves unbiased estimates for hierarchical metrics like Precision@K.
  • Demonstrates a 21% reduction in standard error when augmenting human annotations with LLM judgments.
  • Successfully identifies the best system variant in a production setting, leading to increased sales.
Read more
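The PRECISE summary above builds on prediction-powered inference, where a small set of human labels corrects the bias of many LLM judgments. A minimal sketch of the generic prediction-powered mean estimator follows. It is not the PRECISE framework; the function name and the idea of passing per-query scores (for example, per-query Precision@K) are assumptions for illustration.

```python
import numpy as np

def ppi_mean(y_labeled, f_labeled, f_unlabeled):
    """Prediction-powered estimate of E[Y] with a plug-in standard error.

    y_labeled:   human scores on the small labeled set
    f_labeled:   model (LLM-judge) scores on the same labeled items
    f_unlabeled: model scores on the large unlabeled set
    """
    y_labeled = np.asarray(y_labeled, dtype=float)
    f_labeled = np.asarray(f_labeled, dtype=float)
    f_unlabeled = np.asarray(f_unlabeled, dtype=float)
    residual = y_labeled - f_labeled          # measures the judge's bias
    estimate = f_unlabeled.mean() + residual.mean()
    se = np.sqrt(f_unlabeled.var(ddof=1) / len(f_unlabeled)
                 + residual.var(ddof=1) / len(residual))
    return estimate, se
```

When the judge agrees closely with humans, the residual variance is small and the standard error is driven mostly by the large unlabeled set.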
Diffusion Models for Adaptive Sequential Data Generation
Haoyang Cao, Minshuo Chen, Yinbin Han, Renyuan Xu
Generative Models Time Series Theory
  • Introduction of a new diffusion model framework (AD-Seq) for adapted sequential data generation.
  • Ensures that generated data respects temporal information flow and causal structure.
  • Develops a novel score-matching objective for scalable parallel training.
  • Provides statistical learning theory guarantees for the proposed framework.
Read more
Selective-Advantage Entropy-Adaptive Horizon GRPO: Asymmetric Token-Level Discounting for Efficient Reinforcement Learning of Language Models
Chirag Chawla, Rohan Charudatt Salvi, Madhav S. Baidya
NLP Large Language Models Reinforcement Learning
  • Introduction of AH-GRPO, which adapts token-level discounting based on entropy to improve training efficiency.
  • Development of SA-AH-GRPO, which selectively applies discounting to negative-advantage rollouts, enhancing learning stability.
  • SA-AH-GRPO achieves a 3.6× reduction in training variance on the 3B model while maintaining peak accuracy.
  • Demonstrated improvements over zero-shot baselines, indicating the effectiveness of the proposed methods.
Read more
Steering Vectors are an Adversarial Attack Surface
Abzal Aidakhmetov, Donato Crisostomi, Tommaso Mencattini, Adrian Robert Minut, Iacopo Masi, Emanuele Rodolà
NLP Large Language Models Optimization
  • Identification of contrastive steering datasets as a novel attack surface.
  • Demonstration of a stealthy data poisoning attack that alters steering vectors.
  • Validation of the attack on multiple model families and attributes, achieving significant ASR improvements.
  • Proposal of a defense mechanism that mitigates the attack's effectiveness.
Read more
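The steering-vector summary above concerns vectors derived from contrastive datasets. One common construction is the difference of mean activations between positive and negative examples, sketched below. This is a generic illustration, not the paper's attack or defense, and the function names are hypothetical.

```python
import numpy as np

def steering_vector(pos_acts, neg_acts):
    """Difference-of-means steering vector from contrastive activations."""
    pos_acts = np.asarray(pos_acts, dtype=float)
    neg_acts = np.asarray(neg_acts, dtype=float)
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def steer(hidden, vector, alpha=1.0):
    """Shift a hidden state along the steering vector with strength alpha."""
    return np.asarray(hidden, dtype=float) + alpha * vector
```

Since the vector is just an average over the dataset, changing a few contrastive examples shifts it directly, which is one intuition for why the dataset can act as an attack surface.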
Adaptive state-action abstractions via rate-distortion
Fernando E. Rosas
Reinforcement Learning Robotics Theory
  • Introduces soft state-action abstractions that allow for dynamic granularity adjustment.
  • Develops a learning-abstraction decomposition that separates value error into learning and abstraction errors.
  • Proposes an adaptive abstraction principle that refines abstractions based on learning progress.
  • Validates the framework on tabular control benchmarks, achieving near-optimal performance with lossy compression.
Read more
Short paper: Models in the dark -- Rectification and erasure under GDPR in ML supply chains
Henrik Graßhoff, Malte Hansen, Meiko Jensen, Sara Ramezanian
Theory
  • The paper identifies significant challenges in enforcing GDPR rights to rectification and erasure in ML systems.
  • It introduces the concept of 'models in the dark,' highlighting issues of transparency and traceability in ML supply chains.
  • The authors provide a taxonomy of challenges that impede the effective enforcement of these rights.
  • The study emphasizes the need for interdisciplinary approaches to bridge legal and technical aspects of GDPR compliance in ML.
Read more
Design a Reliable LLM-Integrated Interface for Mortality Forecasting
Thi Kim Ngan Nguyen
NLP Large Language Models Time Series
  • Development of a user-friendly LLM-integrated interface for mortality forecasting.
  • Implementation of a three-phase methodology to ensure accuracy, usability, and transparency.
  • Demonstration of the effectiveness of LLMs in translating natural language into structured forecasting requests.
  • Focus on maintaining statistical rigor while enhancing accessibility for non-technical users.
Read more
From Prediction to Self: Developmental Conditions for Agency in Minimal Neural Systems
Evan Ye
Theory Robotics
  • Identifies four critical developmental conditions for agency in neural systems.
  • Introduces 'agency gain' as a measurable metric for self-awareness in predictive systems.
  • Demonstrates that self-aware predictors outperform self-blind predictors in various environments.
  • Falsifies 12 hypotheses regarding the development of self-representation.
Read more
Q-GNN: Query-Conditioned Graph Neural Networks with Type Awareness for Knowledge Graph Completion
Dongxiao He, Ruqiong Zhang, Zhizhi Yu, Ling Ding, Di Jin, Guangquan Xu, Zhiyong Feng
Graph Learning
  • Q-GNN incorporates both query entity and query relation information for enhanced reasoning in KGC.
  • The approach utilizes structural context and semantic type to guide message passing and scoring.
  • Experiments show that Q-GNN outperforms traditional GNN methods in knowledge graph completion tasks.
  • The integration of large language models for entity type inference is a novel aspect of the methodology.
Read more
Towards Unified and Data-Efficient Prognostics and Health Management with Tabular Foundation Models
Raffael Theiler, Lev Telyatnikov, Leandro Von Krannichfeldt, Olga Fink
Time Series
  • Tabular Foundation Models can effectively handle fragmented and irregular industrial time series data.
  • The proposed framework allows for in-context learning, reducing the need for extensive retraining.
  • Tabular models outperform traditional sequence models and gradient-boosted trees in various PHM tasks.
  • Performance is enhanced by constructing representative contexts during data subsampling.
Read more
High-Dimensional Theory of LoRA Fine-Tuning in a Solvable Attention Model
O. Duranthon, F. Boncoraglio, L. Zdeborová
Theory Efficient ML Large Language Models
  • Introduction of a solvable high-dimensional model for LoRA fine-tuning in attention.
  • Derivation of a sharp asymptotic characterization of test error and reconstruction overlap using finite-dimensional order parameters.
  • Identification of an effective noise mechanism that quantifies the impact of pre-training quality on fine-tuning performance.
  • Discovery of regimes with mismatches between test error and reconstruction overlap due to memorization of pre-training data.
Read more
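For readers unfamiliar with the object studied in the LoRA entry above, a minimal sketch of the standard low-rank update is shown here. It illustrates the generic LoRA parameterization only, not the paper's solvable attention model; the function name and `alpha` default are illustrative.

```python
import numpy as np

def lora_merge(W, A, B, alpha=1.0):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank.

    W: (d_out, d_in) frozen pre-trained weight
    A: (r, d_in) and B: (d_out, r) trainable low-rank factors
    """
    r = A.shape[0]
    return W + (alpha / r) * (B @ A)
```

With `B` initialized to zeros, the merged weight equals the pre-trained weight, so fine-tuning starts from the pre-trained model.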
Less is MoE: Trimming Experts in Domain-Specialist Language Models
Haoze He, Xinkai Zou, Xuan Jiang, Xingyuan Ding, Ao Qu, Juncheng Billy Li, Heather Miller
NLP Large Language Models Efficient ML
  • Fisher importance is a more effective metric than existing heuristics for identifying critical dimensions in MoE models.
  • Fisher-MoE allows for fine-grained compression at the intermediate dimension level, preserving model performance while reducing size.
  • The proposed method significantly improves inference throughput and reduces memory requirements.
  • The study highlights the importance of evaluating MoE compression on challenging general-purpose benchmarks rather than solely on commonsense reasoning tasks.
Read more
Trust, but Don't Verify: Epistemic Blind Spots in LLM Source Evaluation
Rohan N. Pradhan, Steve Goley
NLP Large Language Models
  • LLMs can detect fabricated statistics in isolation but fail to do so during multi-source synthesis.
  • Source influence is governed by methodology presentation rather than numeric validity.
  • The study identifies a methodology-register gate that affects how models evaluate evidence.
  • Prompting-based mitigations do not effectively enhance models' ability to discern valid from fabricated statistics.
Read more
Deciphering Two Training Clocks in Grokking via Deep Linear Network Theory with Conditional ReLU Reduction
Hu Tan, Kuo Gai, Shihua Zhang
Theory
  • Introduces the concept of 'two training clocks' to separate fitting from representation simplification.
  • Demonstrates that classification loss decreases exponentially while representation simplification occurs on a polynomial time scale.
  • Extends findings to ReLU networks, showing a two-stage learning mechanism.
  • Provides a rigorous mathematical framework using deep linear networks.
Read more
Plug-and-Play Guidance for Discrete Diffusion Models via Gradient-Informed Logit Correction
Hongkun Dou, Zike Chen, Fengji Li, Hongjue Li, Yue Deng
Generative Models
  • Introduction of GILC as a training-free guidance framework for discrete diffusion models.
  • Utilization of a Jacobian-free mechanism for stable logit correction, addressing gradient instability.
  • Formal connection to policy gradients, enabling handling of non-differentiable objectives.
  • Demonstration of state-of-the-art performance in various scientific applications without additional training.
Read more
Flash-WAM: Modality-Aware Distillation for World Action Models
Arman Akbari, Ci Zhang, Arash Akbari, Lin Zhao, Yixiao Chen, Weiwei Chen, Xuan Zhang, Geng Yuan, Yanzhi Wang
Generative Models Robotics Multimodal
  • Flash-WAM introduces a modality-aware step-distillation framework for World Action Models.
  • The framework adapts consistency functions to match the noise characteristics of video and action modalities.
  • Flash-WAM achieves a 23× speedup in inference time, enabling real-time control.
  • It preserves high task success rates in simulation benchmarks compared to naive distillation methods.
Read more
Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation
Maxime Griot, Paul Steven Scotti, Tanishq Mathew Abraham
NLP Large Language Models Efficient ML
  • Introduces Compress-Distill, a method for compressing reasoning traces before knowledge distillation.
  • Compressed traces reduce training tokens to 12-30% of raw traces and speed up training by 2.0-7.6 times.
  • While compressed traces improve efficiency, they do not surpass raw traces in downstream accuracy.
  • The study includes a detailed analysis of the trade-offs between accuracy and efficiency in knowledge distillation.
Read more
Learned Subspace Compression for Communication-Efficient Pipeline Parallelism
Paul Janson, Edouard Oyallon, Eugene Belilovsky
Large Language Models Efficient ML Optimization
  • Introduces MAPL, a method for learnable orthogonal projections in pipeline parallelism.
  • Maintains orthogonality during training using Stiefel manifold constraints.
  • Allows each pipeline stage to adapt its own compression subspace, enhancing performance.
  • Integrates factorized anchor embeddings for efficient activation reconstruction.
Read more
Revisiting Prototype Rehearsal for Exemplar-Free Continual Learning: Manifold-Aware Boundary Sampling with Adaptive Class-Balanced Loss
Hongye Xu, Bartosz Krawczyk
Computer Vision
  • Introduces a manifold-aware approach to prototype rehearsal for EFCIL.
  • Proposes Constrained Expansive Over-Sampling (CEOS) to generate boundary-aware synthetic samples.
  • Develops an Adaptive Class-Balanced (ACB) loss to address class imbalance during training.
  • Demonstrates that the proposed methods outperform traditional prototype rehearsal and are competitive with drift-compensation techniques.
Read more
The Evaluation Blind Spot: A Stereological Theory of Benchmark Coverage for Large Language Models
Jason Z Wang
NLP Large Language Models Theory
  • Introduces a stereological framework for understanding benchmark coverage in LLMs.
  • Identifies a significant structural blind spot in LLM evaluations that dominates statistical noise.
  • Develops a submodular greedy algorithm for optimal benchmark selection, achieving high coverage with fewer benchmarks.
  • Empirical analysis estimates the effective dimensionality of benchmark suites and examines its implications for model evaluation.
Read more
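The benchmark-coverage summary above mentions a submodular greedy algorithm for benchmark selection. The sketch below shows the generic greedy rule for maximum coverage: repeatedly pick the benchmark that adds the most uncovered items. The set-based coverage model and all names are assumptions for illustration; the paper's objective may differ.

```python
def greedy_coverage(benchmarks: dict[str, set], budget: int) -> list[str]:
    """Pick up to `budget` benchmarks, each maximizing newly covered items."""
    covered: set = set()
    chosen: list[str] = []
    for _ in range(budget):
        remaining = [name for name in benchmarks if name not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda name: len(benchmarks[name] - covered))
        if not benchmarks[best] - covered:
            break  # no benchmark adds anything new
        chosen.append(best)
        covered |= benchmarks[best]
    return chosen

suite = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}
print(greedy_coverage(suite, budget=2))  # ['c', 'a']
```

For monotone submodular objectives such as coverage, this greedy rule is known to reach at least a (1 - 1/e) fraction of the optimal value.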
A Machine Learning-Based Framework for Discovering Huntington's Disease Stages: Integrating Graph Representation Learning and clustering to Uncover Progression Dynamics in Longitudinal Enroll-HD Dataset
Lubna M. Abu Zohair, Marta Vallejo, MD Azher Uddin, John R. Woodward, Hind Zantout
Graph Learning Time Series Multimodal
  • Developed an unsupervised machine learning framework for identifying Huntington's disease stages.
  • Utilized graph representation learning to capture temporal relationships in longitudinal clinical data.
  • Achieved robust clustering performance with significant distinctions between identified disease stages.
  • Demonstrated the potential for a more objective, data-driven approach to HD staging.
Read more
Your GFlowNet Secretly Learns an Optimal Transport Plan
Ian Maksimov, Nikita Morozov, Denis Belomestny, Sergey Samsonov
Optimization Generative Models Graph Learning
  • Establishes a theoretical link between GFlowNets and optimal transport problems.
  • Demonstrates that minimum-flow GFlowNets can be formulated as linear programming problems.
  • Shows that GFlowNets can effectively approximate solutions to graph optimal transport problems.
  • Confirms the framework's ability to recover exact OT solutions and its scalability for larger problems.
Read more
Subspace-Aware Sparse Autoencoders for Effective Mechanistic Interpretability
Seyed Arshan Dalili, Mehrdad Mahdavi
Large Language Models Interpretability Efficient ML
  • Identifies a geometric mismatch between multi-dimensional feature structures and standard SAE assumptions.
  • Introduces Subspace-Aware Sparse Autoencoders (SASA) to address feature splitting and redundancy.
  • Proves that SASA can uniquely represent entire feature slices with a single group, improving interpretability.
  • Demonstrates empirical advantages of SASA over traditional SAEs on large language models.
Read more
Learning to model pediatric asthma exacerbation from multiple risk factors: a case study in coastal Virginia
Jonathan Colen, Eric Werner, Maryam Golbazi, Heather Richter, Diana McSpadden, Amy Quinn, Jocel Santos, Mary Jane Darling, Mary Margaret Gleason
Interpretability
  • The study highlights the importance of integrating multiple risk factors in modeling pediatric asthma exacerbation.
  • Three modeling techniques were compared, emphasizing the trade-off between interpretability and predictive power.
  • The novel framework developed allows for the identification of nonlinear interactions among risk factors.
  • Consensus across different modeling approaches provides robust insights into the relative risks of asthma exacerbation.
Read more
LEVANTE-bench: Multi-Scale Comparison of VLMs to Children Using Cognitive Tasks (or, "Is Your VLM Smarter Than a 5th Grader?")
Alvin Wei Ming Tan, David Cardinal, Tania Lorido-Botran, Laura Bravo-Sanchez, Sunny Yu, Michael C. Frank
Multimodal
  • LEVANTE-bench provides a comprehensive dataset for comparing VLMs to children's cognitive performance.
  • The benchmark evaluates VLMs across multiple scales, including task difficulty, item difficulty, and trial-level error distributions.
  • Larger VLMs show better alignment with human performance at the task level, but struggle with finer details of children's cognitive errors.
  • Smaller models may better reflect the cognitive errors of younger children in certain tasks.
Read more
TS-ICL: A Flexible Time-Indexed Foundation Model for Time Series via In-Context Learning
Etienne Le Naour, Tahar Nabil, Adrien Petralia
Time Series
  • TS-ICL unifies forecasting and imputation in time series modeling.
  • It incorporates covariates using a structured synthetic prior based on DAGs.
  • Achieves state-of-the-art performance in zero-shot imputation benchmarks.
  • Maintains competitive forecasting capabilities, especially with missing observations.
Read more
Zero-Copy Semantic Contagion: An In-Memory Streaming Architecture for Evolving Attention Graphs
Kabir Murjani
Time Series Graph Learning
  • Introduces a streaming architecture that captures cross-company information propagation in financial markets.
  • Achieves rapid ingestion and inference times, making it suitable for real-time applications.
  • Demonstrates significant improvements in prediction precision over traditional methods.
  • Highlights the critical role of dynamic graph structures in detecting financial contagion.
Read more
Sharp First-Order Lower Bounds for Higher-Order Smooth Nonconvex Optimization
Dongruo Zhou
Optimization Theory
  • Establishes new dimension-free lower bounds for higher-order smooth nonconvex optimization.
  • Achieves matching lower bounds of Ω(ϵ^{-7/4}) and Ω(ϵ^{-5/3}) for Hessian-Lipschitz and third-order smooth functions, respectively.
  • Introduces a novel block-chain mechanism for constructing hard instances that preserve smoothness.
  • Closes the gap in existing literature regarding lower bounds for first-order oracle complexity.
Read more
End-to-End Subgraph Detection with GraphDETR
Dexiong Chen, Till Hendrik Schulz, Karsten Borgwardt
Graph Learning
  • GraphDETR reformulates subgraph detection as a set prediction problem, enhancing efficiency and scalability.
  • The model employs a GNN for graph encoding and a transformer decoder for joint prediction of subgraph occurrences.
  • GraphDETR supports both exact and approximate matching, broadening the scope of detectable patterns.
  • Empirical results show high accuracy in detecting molecular functional groups, achieving an AP100 score of 91.2.
Read more
When Denser Credit Is Not Enough: Evidence-Calibrated Policy Optimization for Long-Horizon LLM Agent Training
Yuanfan Li, Qi Zhou, Wenjing Duan, Lu Chen
Reinforcement Learning Large Language Models Optimization
  • Identifies the issue of divergent anchor bias in existing reinforcement learning methods for LLMs.
  • Proposes a new algorithm, Evidence-Calibrated Policy Optimization (ECPO), to improve credit assignment.
  • ECPO combines techniques to reduce the impact of statistical noise and improve training stability.
  • Demonstrates superior performance of ECPO over existing methods on benchmark tasks.
Read more
SALT: When More Rollouts Don't Help in Group-Based Policy Optimization and How to Make Them Matter
Powei Chang, Jinpeng Zhang, Chaoqun Sun, MiniWell Tsao, Lianrui Li, Jianxiang Xiang, Chenyu Wang, Yukang Gao, Dongying Kong
Reinforcement Learning Large Language Models Optimization
  • Identifies structural inefficiencies in GRPO-style updates leading to diminishing returns from additional rollouts.
  • Introduces SALT, a method that reweights group-relative updates to reduce redundancy and improve learning signals.
  • Demonstrates that effective updates can be enhanced without changing the underlying reward model or sampling methods.
  • Validates the approach through comprehensive experiments showing consistent performance gains.
Read more
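The SALT summary above refers to group-relative updates in GRPO-style training. A minimal sketch of the usual group-relative advantage follows: each rollout's reward is standardized against the other rollouts for the same prompt. This shows the baseline computation only, not SALT's reweighting, and the function name is illustrative.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one prompt's group of rollouts."""
    rewards = np.asarray(rewards, dtype=float)
    return (rewards - rewards.mean()) / (rewards.std() + eps)
```

If every rollout in a group receives the same reward, all advantages are zero and the group contributes no gradient signal, which is one simple way extra rollouts can fail to help.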
Proper Scoring Rules for Right-Censored Survival Data
Jef Jonkers, Glenn Van Wallendael, Luc Duchateau, Sofie Van Hoecke
Theory Time Series
  • Introduces a unified framework for proper scoring rules under right censoring.
  • Derives right-censored versions of several established scoring rules.
  • Demonstrates that the marginalized score is proper under specific conditions.
  • Presents censored engression as a new training method for multivariate survival data.
Read more
The Post-GCN Decade Revisited: Curvature-Stratified Evaluation of Relational Learning
Shuo Wang, Xiangyu Wang, Quanxin Wang, Bailin Wu, Bokui Wang, Shunyang Huang, Boyan Deng, Haonan Liu, Ruiyi Fang, Zhenxiang Xu, Boyu Wang, Zhao Kang
Graph Learning
  • Current evaluation methods in relational learning are biased due to reliance on flat leaderboards.
  • Intrinsic geometry significantly impacts model performance and should be considered in evaluations.
  • The proposed CURVBENCH framework stratifies datasets based on curvature, revealing critical performance insights.
  • Model rankings can vary significantly across different curvature regimes, challenging the universality of model effectiveness.
Read more
OPRD: On-Policy Representation Distillation
Shenzhi Yang, Guangcheng Zhu, Bowen Song, Haobo Wang, Mingxuan Xia, Xing Zheng, Yingfan Ma, Zhongqi Chen, Weiqiang Wang, Gang Chen
Large Language Models NLP Optimization
  • OPRD is the first method to perform on-policy distillation in the hidden-state space rather than the output space.
  • It eliminates sampling variance in gradient estimation, providing a more stable training signal.
  • OPRD exposes rich structural information from the teacher's intermediate hidden states, enhancing the supervision signal.
  • Empirical results show OPRD outperforms traditional methods on mathematics benchmarks and is more efficient in terms of training speed and memory usage.
Read more
Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents
Dae Yon Hwang, Raunaq Suri, Valentin Villecroze, Anthony L. Caterini, Jesse C. Cresswell, Noël Vouitsis, Brendan Leigh Ross
Reinforcement Learning Large Language Models Optimization
  • AMC formulates RL for black-box agents as a Bayesian inference problem, allowing optimization without parameter access.
  • The method uses Sequential Monte Carlo to sample from the optimal policy, guided by a learned value function.
  • Empirical results show AMC outperforms traditional prompting methods and GRPO in various environments.
  • The approach highlights the feasibility of applying RL concepts to closed-source agents, expanding their usability.
Read more
Generative Criticality in Large Language Model Temperature Scaling
Huajian Ruan, Jinyang Li, Xingyu Guo, Lingxiao Wang
NLP Large Language Models Theory
  • Introduction of a statistical-field framework for LLM outputs with defined physical observables.
  • Evidence of critical behavior in LLM text generation driven by temperature scaling.
  • Independent geometric validation of criticality through the TwoNN intrinsic dimension method.
  • Findings are robust across different model scales and prompt categories.
Read more
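The generative-criticality summary above uses the TwoNN intrinsic dimension method. The sketch below is a maximum-likelihood form of the TwoNN estimator, based on the ratio of each point's second to first nearest-neighbor distance. It assumes no duplicate points and a small enough sample for a dense distance matrix; it is not the paper's pipeline.

```python
import numpy as np

def twonn_dimension(X):
    """Estimate intrinsic dimension from nearest-neighbor distance ratios."""
    X = np.asarray(X, dtype=float)
    # Dense pairwise distances; fine for a few thousand points at most.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)            # exclude self-distances
    nearest = np.partition(dists, 1, axis=1)[:, :2]
    mu = nearest[:, 1] / nearest[:, 0]         # r2 / r1 for every point
    # Under the TwoNN model, log(mu) is exponential with rate d.
    return len(mu) / np.log(mu).sum()
```

For points sampled from a 2-D surface embedded in a higher-dimensional space, the estimate should come out close to 2 regardless of the ambient dimension.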
DP-MacAdam: Differentially Private Mechanism with Adaptive Clipping and Adaptive Momentum
Naima Tasnim, Lalitha Sankar, Oliver Kosut
Optimization Theory Efficient ML
  • DP-MacAdam is the first algorithm to combine adaptive clipping and adaptive momentum under differential privacy.
  • The algorithm uses a novel bias correction factor for unbiased gradient variance estimation.
  • Empirical results show improved performance over DP-SGD, AdaClip, and DP-Adam across various privacy budgets.
  • DP-MacAdam does not require manual tuning of clipping thresholds, simplifying its application.
Read more
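The DP-MacAdam summary above compares against DP-SGD. For context, here is a minimal sketch of one DP-SGD step with a fixed clipping threshold: clip each per-example gradient, sum, add Gaussian noise, then average. It shows the baseline, not DP-MacAdam's adaptive clipping or momentum, and the parameter names are illustrative.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update with per-example clipping and Gaussian noise."""
    grads = np.asarray(per_example_grads, dtype=float)
    norms = np.linalg.norm(grads, axis=1)
    # Scale down any gradient whose L2 norm exceeds clip_norm.
    scale = np.minimum(1.0, clip_norm / (norms + 1e-12))
    clipped = grads * scale[:, None]
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(grads)
    return params - lr * noisy_mean
```

The fixed `clip_norm` here is the hyperparameter that, per the summary, DP-MacAdam avoids tuning by hand.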
Evidence-Guided Neural Architecture Selection under Uncertainty for Subject-Specific Blood Glucose Forecasting
Md Azharul Islam, Dwyer Deighan, Tarunraj Singha, Danial Faghihi
Time Series
  • EVIDENT integrates Bayesian training and evidence-based ranking for neural architecture selection.
  • The framework identifies the lowest-capacity model that meets validation criteria, improving generalization.
  • EVIDENT demonstrates effectiveness in blood glucose forecasting for type 1 diabetes patients.
  • The approach outperforms random-search methods by selecting smaller, more consistent architectures.
Read more
Performance Evaluation of GraphCast for Medium-Range Weather Forecasting over Brazil
Wolfgang R. Rowell Jr., Lucas S. Kupssinskü
Time Series
  • First regime-stratified benchmark of an MLWP model over Brazil.
  • Utilization of a cloud-native evaluation pipeline for harmonizing data.
  • Identification of conditions under which GraphCast gains or loses skill relative to traditional models.
  • Establishment of operational boundaries for future optimization of AI weather models.
Read more
Cross-Epoch Adaptive Rollout Optimization for RL Post-Training
Yiming Zong, Yige Wang, Jiashuo Jiang
Reinforcement Learning Large Language Models Optimization
  • CERO is the first rollout-allocation framework for LLM post-training that optimizes a global rollout budget across epochs.
  • The method uses Bayesian estimates of prompt-level rollout value to guide adaptive budgeting.
  • CERO demonstrates improved sample efficiency compared to traditional fixed allocation methods.
  • Theoretical guarantees provide a strong foundation for the proposed approach.
Read more
PAC-Bayesian Adversarially Robust Generalization for Message Passing Graph Neural Networks: A Sensitivity Analysis
Ziling Liang, Xinping Yi, Qingsong Wen, Shi Jin
Graph Learning Theory
  • Introduces a sensitivity-aware PAC-Bayesian framework for MPGNNs.
  • Derives tighter robust generalization bounds by analyzing output Jacobians.
  • Utilizes anisotropic Gaussian posteriors to improve KL divergence bounds.
  • Reduces complexity terms from hidden-width-dependent to class-dependent.
Read more
Event Detection for Parameter-to-KPI Dependency Learning for AI-RAN
Christie Djidjev, Nicholas Kaminski
Time Series Interpretability Optimization
  • Introduces a method for event detection to support dependency learning in AI-RAN.
  • Develops a synthetic traffic generator to simulate parameter-KPI relationships.
  • Demonstrates the effectiveness of a machine-learning pipeline for recovering dependency structures.
  • Identifies threshold calibration as a key factor in event detection quality.
Read more
Gradient Descent with Large Step Size Restores Symmetry in Deep Linear Networks with Multi-Pathway
Hee-Sung Kim, Sungyoon Lee
Theory Optimization
  • Discrete Gradient Descent with large step sizes induces pathway re-balancing rather than symmetry breaking.
  • Single-path solutions are sharp minima, while balanced solutions across pathways are flatter.
  • Training dynamics exhibit two phases: initial symmetry breaking followed by a re-balancing phase due to oscillations.
  • The relationship between the number of pathways, depth, and sharpness of minima is theoretically derived.
Read more
Domain-Adapted Small Language Models with Hybrid Post-Processing: Achieving Cost-Efficient, Low-Latency Multi-Label Structured Prediction via LoRA Fine-Tuning on Scarce Data
Srinivasan Manoharan, Dilipkumar Nallusamy, Sachin Kumar, Haifeng Wu
NLP Large Language Models Efficient ML
  • Introduces a hybrid framework combining LoRA fine-tuning and rule-based post-processing for structured evaluation tasks.
  • Achieves 100% JSON structural validity and high accuracy on compliance evaluations with minimal training data.
  • Demonstrates significant cost savings and reduced latency compared to frontier large language models.
  • Utilizes targeted hard-negative augmentation to improve model performance on critical decision boundaries.
Read more
Regret Minimization with Adaptive Opponents in Repeated Games
Mingyang Liu, Asuman Ozdaglar, Tiancheng Yu, Kaiqing Zhang
Theory Reinforcement Learning Optimization
  • Introduction of Repeated Policy Regret (RP-Regret) as a new metric for regret in repeated games.
  • Establishment of necessary conditions for achieving sublinear RP-Regret.
  • Development of three algorithms for minimizing RP-Regret in non-convex strategy spaces.
  • Demonstration of improved cooperative solutions and higher utility through RP-Regret minimization.
Read more
Quantifying the Privacy of Counterfactuals by Leveraging Membership Inference Attacks Against Synthetic Data
Maryam Babaei, Yingke Wang, Hadrien Lautraite, Heber H. Arcolezi, Ulrich Aivodji, Sebastien Gambs
Theory Interpretability
  • Counterfactuals can be exploited for privacy attacks, similar to synthetic data.
  • Membership inference attacks can be conducted on counterfactuals without model access.
  • The study bridges the gap between synthetic data privacy research and counterfactual analysis.
  • An ensemble MIA is proposed and compared with existing counterfactual distance attacks.
Read more
Representation Learning Enables Scalable Multitask Deep Reinforcement Learning
Johan Obando-Ceron, Lu Li, Scott Fujimoto, Pierre-Luc Bacon, Aaron Courville, Pablo Samuel Castro
Reinforcement Learning Robotics Efficient ML
  • Representation learning is more critical than model-based control for scalable multitask RL.
  • MR.Q, a model-free algorithm, integrates predictive objectives and achieves superior performance.
  • The approach significantly reduces computational overhead while improving sample efficiency.
  • Predictive representation learning is essential for performance, as shown through ablation studies.
Read more
Scaling Laws for Behavioral Foundation Models over User Event Sequences
Rickard Brüel Gabrielsson
Optimization Theory Efficient ML
  • The compute-optimal event embedder size is approximately 2% of total parameters across various compute budgets.
  • Behavioral scaling initially favors data-heavy training but approaches the Chinchilla heuristic at higher compute levels.
  • The evaluation metric is integral to the scaling law, affecting optimal configurations for batch size and negative sampling.
  • Negative sampling becomes a memory constraint at higher compute budgets rather than a compute constraint.
Read more
Reactive Flux Matching: Mechanism Discovery and Adaptive Sampling of Rare Events
Rishal Aggarwal, David Ryan Koes, Nicholas M. Boffi, Eric Vanden-Eijnden
Theory Optimization
  • Flux Matching provides a variational characterization of reactive path ensembles through current velocity and scalar potential.
  • The framework is robust against non-Markovian projections, unlike traditional committor-based methods.
  • It offers data-driven reaction coordinates that enhance adaptive sampling methods.
  • Numerical validation shows its applicability across different molecular systems.
Read more
Generalized TV–ℓp Structured Priors for Bayesian T1 Mapping
Disi Lin, Martin Berggren, Tommy Löfstedt
Theory Optimization Computer Vision
  • Introduction of a generalized TV–ℓp prior for Bayesian T1 mapping.
  • Demonstrated properness of the prior and its effectiveness in uncertainty quantification.
  • Evaluation against multiple existing methods shows superior performance in terms of reliability and accuracy.
  • Results indicate reduced uncertainty and improved spatial coherence in T1 maps.
Read more
Dominant-Layer ZO: A Single Layer Dominates Zeroth-Order Fine-Tuning of LLMs
Wanhao Yu, Ziyan Wang, Zheng Wang, Abeer Matar Almalky, Yihang Zuo, Shuteng Niu, Sen Lin, Adnan Siraj Rakin, Deliang Fan, Li Yang
Large Language Models Optimization Efficient ML
  • Discovery of a dominant-layer phenomenon in ZO fine-tuning, where tuning a single layer can recover or exceed full-model performance.
  • The dominant layer is task-agnostic but model-specific, identified efficiently through activation outlier analysis.
  • Perturbation effects propagate effectively through the dominant layer, enhancing optimization signals under ZO updates.
  • Dominant-layer ZO fine-tuning shows improved performance and training speed compared to existing methods.
Read more
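The Dominant-Layer ZO summary above concerns zeroth-order fine-tuning restricted to a single layer. The sketch below shows a generic two-point zeroth-order gradient estimate with an optional mask that confines the perturbation to selected parameters. The mask stands in for "one layer" as an assumption; this is not the paper's layer-selection procedure.

```python
import numpy as np

def zo_gradient(loss_fn, theta, rng, eps=1e-3, mask=None):
    """Two-point zeroth-order gradient estimate using only loss evaluations."""
    z = rng.standard_normal(theta.shape)
    if mask is not None:
        z = z * mask  # perturb only the selected parameters
    slope = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2.0 * eps)
    return slope * z
```

Each estimate needs two forward passes and no backpropagation; parameters outside the mask receive an exactly zero update.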
Learning to Route LLMs from Implicit Cost-Performance Preferences via Meta-Learning
Jiahao Zeng, Ming Tang, Ningning Ding
Large Language Models Optimization NLP
  • Introduction of a perceptive LLM routing paradigm that learns user preferences through interaction.
  • Development of MetaRouter, a meta-learning framework for preference-aware LLM routing.
  • Demonstrated superior performance of MetaRouter over existing routing methods.
  • High efficiency in learning user preferences and adaptability to different LLMs.
Read more
What Objects Enable, Not What They Are: Functional Latent Spaces for Affordance Reasoning
Rohan Siva, Neel P. Bhatt, Yunhao Yang, Seoyoung Lee, Nishant Gadde, Christian Ellis, Alvaro Velasquez, Zhangyang Wang, Ufuk Topcu
Robotics
  • Introduction of A4D, a framework for affordance-based reasoning in robot planning.
  • Mapping of visual observations into a functional latent space to enhance generalizability.
  • Significant improvements in inference accuracy for both existing and new affordances.
  • Incorporation of an uncertainty-aware affordance discovery mechanism.
Read more