AI-generated summaries

Today's ML research,
without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

54 papers today · 8-hour update cycle · 7 days of history
ELMoE-3D: Leveraging Intrinsic Elasticity of MoE for Hybrid-Bonding-Enabled Self-Speculative Decoding in On-Premises Serving
Yuseon Choi, Jingu Lee, Jungjun Oh, Sunjoo Whang, Byeongcheol Kim, Minsung Kim, Hoi-Jun Yoo, Sangjin Kim
NLP · Large Language Models · Efficient ML
  • Introduction of Elastic Self-Speculative Decoding (Elastic-SD) to optimize MoE performance.
  • Hybrid-bonding architecture enhances memory bandwidth and reduces compute underutilization.
  • LSB-augmented bit-sliced architecture supports efficient bit-nested execution.
  • Achieves significant speedup and energy efficiency improvements over traditional MoE serving methods.
Read more
xFODE+: Explainable Type-2 Fuzzy Additive ODEs for Uncertainty Quantification
Ertugrul Kececi, Tufan Kumbasar
Interpretability · Time Series · Theory
  • xFODE+ combines interpretability with uncertainty quantification in SysID models.
  • The model uses Interval Type-2 Fuzzy Logic Systems to enhance interpretability.
  • xFODE+ produces both point predictions and Prediction Intervals.
  • It retains physically meaningful incremental states for better state representation.
Read more
AdaSplash-2: Faster Differentiable Sparse Attention
Nuno Gonçalves, Hugo Pitorro, Vlad Niculae, Edoardo Ponti, Lei Li, Andre Martins, Marcos Treviso
NLP · Large Language Models · Efficient ML
  • ADASPLASH-2 introduces a histogram-based initialization for faster computation of the normalizer τ in α-entmax attention.
  • The method achieves significant speed improvements over FlashAttention-2, particularly in moderate-to-high sparsity regimes.
  • Empirical results indicate that ADASPLASH-2 matches or outperforms softmax attention in both short and long-context tasks.
  • The approach leverages on-chip SRAM for efficient memory usage and reduced computational overhead.
Read more
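At α = 2, the α-entmax transform that AdaSplash-2 accelerates reduces to the well-known sparsemax, whose normalizer τ has an exact sort-based solution. The paper's histogram-based initialization and fused kernels are not reproduced here; this is only a minimal NumPy sketch of what τ does:

```python
import numpy as np

def sparsemax(z):
    """Sparsemax (alpha-entmax at alpha=2): Euclidean projection of the
    logits z onto the probability simplex; tau is the normalizer."""
    z_sorted = np.sort(z)[::-1]            # descending
    cssv = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cssv      # prefix of coordinates kept in the support
    k_z = k[support][-1]                   # support size
    tau = (cssv[support][-1] - 1.0) / k_z  # the threshold / normalizer
    return np.maximum(z - tau, 0.0)

p = sparsemax(np.array([2.0, 1.0, -1.0]))  # only the top logit survives: [1., 0., 0.]
```

Entries below the threshold receive exactly zero probability, which is the sparsity that attention kernels in this family exploit to skip computation.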
Quantum-inspired tensor networks in machine learning models
Guillermo Valverde, Igor García-Olaizola, Giannicola Scarpa, Alejandro Pozas-Kerstjens
Theory · Efficient ML · Interpretability
  • Tensor networks provide a structured approach to model complex dependencies in data.
  • They can enhance computational efficiency and reduce the risk of data leakage in ML models.
  • TNs offer insights into model interpretability through quantum information theory metrics.
  • The integration of TNs into ML can lead to novel architectures and compression techniques.
Read more
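As a concrete instance of the structured dependency modeling tensor networks offer, a matrix product state (MPS) factors a high-order tensor into a chain of small 3-index cores whose parameter count grows linearly with the tensor's order. A minimal NumPy contraction sketch with illustrative shapes, not tied to any model in the paper:

```python
import numpy as np

def mps_to_tensor(cores):
    """Contract a matrix product state (a list of cores shaped
    (r_prev, d, r_next)) into the full tensor it represents."""
    result = cores[0]
    for core in cores[1:]:
        # contract the shared bond index: (..., r) x (r, d, r') -> (..., d, r')
        result = np.tensordot(result, core, axes=([-1], [0]))
    return np.squeeze(result, axis=(0, -1))  # drop the boundary bonds of size 1

rng = np.random.default_rng(0)
cores = [rng.normal(size=(1, 2, 3)),
         rng.normal(size=(3, 2, 3)),
         rng.normal(size=(3, 2, 1))]
full = mps_to_tensor(cores)                  # an order-3 tensor of shape (2, 2, 2)
```

The bond dimension (3 here) controls the trade-off between expressivity and compression.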
Calibrate-Then-Delegate: Safety Monitoring with Risk and Budget Guarantees via Model Cascades
Edoardo Pona, Milad Kazemi, Mehran Hosseini, Yali Du, David Watson, Osvaldo Simeone, Nicola Paoletti
NLP · Large Language Models · Efficient ML
  • CTD provides a more effective delegation strategy by using a delegation value probe instead of relying solely on uncertainty.
  • The method ensures probabilistic guarantees on computational costs and safety performance.
  • CTD adapts budget allocation dynamically based on the difficulty of inputs, improving efficiency.
  • Empirical results show significant performance improvements over traditional uncertainty-based methods.
Read more
CSRA: Controlled Spectral Residual Augmentation for Robust Sepsis Prediction
Honglin Guo, Rihao Chang, He Jiao, Weizhi Nie, Zhongheng Zhang, Yuehao Shen
Time Series
  • CSRA framework enhances short-window sepsis prediction by generating clinically plausible data augmentations.
  • The method employs spectral domain perturbations to control the augmentation process, improving temporal robustness.
  • Experiments show significant reductions in regression errors and improved classification performance across various models.
  • CSRA maintains effectiveness under data-scarce conditions and shorter observation windows, indicating strong generalizability.
Read more
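Frequency-domain augmentation in general perturbs a signal's spectrum and transforms back. The sketch below shows only that generic pattern; CSRA's clinically controlled perturbation scheme is more structured, and the `scale` parameter here is illustrative:

```python
import numpy as np

def spectral_jitter(x, scale=0.05, rng=None):
    """Multiply each frequency bin's amplitude by a random factor near 1,
    then invert -- an uncontrolled spectral-domain augmentation."""
    rng = rng or np.random.default_rng(0)
    spec = np.fft.rfft(x)
    spec *= 1.0 + scale * rng.normal(size=spec.shape)
    return np.fft.irfft(spec, n=len(x))

t = np.linspace(0, 1, 128, endpoint=False)
sig = np.sin(2 * np.pi * 5 * t)     # a toy periodic vital-sign trace
aug = spectral_jitter(sig)          # same rhythm, slightly jittered amplitude
```

Because the perturbation acts on the spectrum, the augmented series keeps the original's dominant periodicity rather than adding white noise in the time domain.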
Enhancing LLM-based Search Agents via Contribution Weighted Group Relative Policy Optimization
Junzhe Wang, Zhiheng Xi, Yajie Yang, Hao Luo, Shihan Dou, Tao Gui, Qi Zhang
NLP · Large Language Models · Reinforcement Learning · Optimization
  • Introduction of Contribution-Weighted GRPO (CW-GRPO) for improved training of LLM-based search agents.
  • Reframing process supervision as advantage reallocation based on round contributions.
  • Empirical evidence showing concentrated contributions in successful search trajectories.
  • Significant performance improvements over standard GRPO in knowledge-intensive benchmarks.
Read more
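For context, standard GRPO assigns each rollout an advantage relative to its own sampling group; CW-GRPO's contribution-weighted reallocation across rounds modifies this baseline, which can be sketched as:

```python
import numpy as np

def group_relative_advantage(rewards, eps=1e-8):
    """Plain GRPO advantage: standardize each rollout's reward against the
    mean and std of its sampling group (no per-round reweighting)."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Four rollouts for one prompt: two successes, two failures.
adv = group_relative_advantage([1.0, 0.0, 1.0, 0.0])
```

Successful rollouts get positive advantage and failed ones negative; the paper's contribution weighting then redistributes that advantage across the rounds of a multi-round search trajectory.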
Learning Ad Hoc Network Dynamics via Graph-Structured World Models
Can Karacelebi, Yusuf Talha Sahin, Elif Surer, Ertan Onur
Reinforcement Learning · Graph Learning · Optimization
  • Introduction of G-RSSM, a graph-structured model that maintains individual node dynamics.
  • First application of imagination-based combinatorial optimization for per-node decision-making in wireless networks.
  • Demonstrated that policies trained on small networks maintain high connectivity when deployed on much larger ones.
  • Unified learning of multiple coupled network processes in a single model.
Read more
Towards Verified and Targeted Explanations through Formal Methods
Hanchen David Wang, Diego Manzanas Lopez, Preston K. Robinette, Ipek Oguz, Taylor T. Johnson, Meiyi Ma
Interpretability
  • ViTaX provides targeted semifactual explanations with formal guarantees.
  • The framework prioritizes critical decision boundaries based on user specifications.
  • It identifies minimal feature subsets sensitive to specific misclassifications.
  • ViTaX achieves over 30% improvement in explanation fidelity compared to existing methods.
Read more
On the Expressive Power and Limitations of Multi-Layer SSMs
Nikola Zubić, Qian Li, Yuyi Wang, Davide Scaramuzza
Theory · Efficient ML · Robotics
  • Multi-layer SSMs face fundamental limitations in compositional tasks compared to streaming models.
  • Online CoT enhances the expressiveness of SSMs, making them equivalent to streaming algorithms.
  • Width and precision are not interchangeable in the base model, but become equivalent with online CoT.
  • The paper introduces a forward communication model to establish lower bounds for SSMs.
Read more
Thermodynamic Diffusion Inference with Minimal Digital Conditioning
Aditi De
Efficient ML · Generative Models · Theory
  • Introduces a thermodynamic diffusion inference method that requires no digital arithmetic.
  • Achieves a theoretical energy reduction of approximately 10⁷× compared to GPU inference.
  • Resolves challenges related to non-local skip connections and input conditioning in U-Net architectures.
  • Demonstrates high performance with a decoder cosine similarity of 0.9906 against an oracle upper bound.
Read more
An unsupervised decision-support framework for multivariate biomarker analysis in athlete monitoring
Fernando Barcelos Rosito, Sebastião De Jesus Menezes, Simone Ferreira Sturza, Adriana Seixas, Muriel Figueredo Franco
Theory · Interpretability · Time Series
  • Introduces a modular computational framework for unsupervised multivariate biomarker analysis in athlete monitoring.
  • Utilizes Gaussian Mixture Models for synthetic data generation, enhancing scalability and robustness in small-sample datasets.
  • Identifies distinct physiological profiles that differentiate between mechanical and metabolic stress in athletes.
  • Demonstrates the ability to uncover latent risk phenotypes not captured by traditional univariate monitoring methods.
Read more
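Drawing synthetic samples from an already-fitted Gaussian mixture is a standard pattern for augmenting small datasets. A NumPy-only sketch in which the weights, means, and covariances are purely illustrative, not estimates from the paper:

```python
import numpy as np

def sample_gmm(weights, means, covs, n, rng=None):
    """Draw n synthetic samples from a Gaussian mixture: pick a component
    per sample by its weight, then draw from that component's Gaussian."""
    rng = rng or np.random.default_rng(0)
    comp = rng.choice(len(weights), size=n, p=weights)
    return np.stack([rng.multivariate_normal(means[k], covs[k]) for k in comp])

# Two illustrative 2-D "physiological profiles" with different centroids.
weights = [0.6, 0.4]
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
covs = [np.eye(2), np.eye(2)]
synthetic = sample_gmm(weights, means, covs, 500)
```

Each mixture component plays the role of one latent profile; sampling from all components preserves the multimodal structure that a single Gaussian would blur away.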
Expressivity of Transformers: A Tropical Geometry Perspective
Ye Su, Yong Liu
Theory
  • Introduces a tropical geometry framework to analyze transformer expressivity.
  • Establishes that self-attention corresponds to a Power Voronoi Diagram in the zero-temperature limit.
  • Demonstrates that Multi-Head Self-Attention expands polyhedral complexity from O(N) to O(N^H).
  • Derives the first tight asymptotic bounds on the number of linear regions in transformers as Θ(N^{d_model L}).
Read more
How Embeddings Shape Graph Neural Networks: Classical vs Quantum-Oriented Node Representations
Nouhaila Innan, Antonello Rosato, Alberto Marchisio, Muhammad Shafique
Graph Learning
  • Introduces a unified framework for evaluating node embeddings in GNNs.
  • Compares classical and quantum-oriented embeddings under controlled conditions.
  • Demonstrates that quantum embeddings outperform classical ones in structure-driven tasks.
  • Highlights the importance of dataset characteristics in embedding performance.
Read more
No More Guessing: a Verifiable Gradient Inversion Attack in Federated Learning
Francesco Diana, Chuan Xu, André Nusser, Giovanni Neglia
Federated Learning
  • Introduction of VGIA, a novel analytical gradient inversion attack that certifies reconstruction correctness.
  • VGIA achieves exact recovery of both input features and target values in regression settings.
  • Empirical validation shows VGIA's effectiveness on tabular and image datasets, even under large-batch aggregation.
  • The method addresses the limitations of existing attacks by providing a verifiable framework for privacy risk assessment.
Read more
Step-level Denoising-time Diffusion Alignment with Multiple Objectives
Qi Zhang, Dawei Wang, Shaofeng Zou
Reinforcement Learning · Generative Models · Optimization
  • Introduces a step-level RL formulation for fine-tuning diffusion models.
  • Proposes a retraining-free framework (MSDDA) for aligning models with multiple objectives.
  • Achieves optimal reverse denoising distribution in closed form without approximation errors.
  • Demonstrates superior performance compared to existing denoising-time alignment methods.
Read more
Explainable Graph Neural Networks for Interbank Contagion Surveillance: A Regulatory-Aligned Framework for the U.S. Banking Sector
Mohammad Nasir Uddin
Graph Learning · Time Series · Interpretability
  • Introduction of ST-GAT framework for interbank contagion surveillance.
  • Achieved AUPRC of 0.939, the highest among GNN architectures evaluated.
  • BiLSTM component significantly enhances model performance.
  • Identified ROA and NPL Ratio as key predictors of bank distress.
Read more
LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking
Lukas Helff, Quentin Delfosse, David Steinmann, Ruben Härle, Hikaru Shindo, Patrick Schramowski, Wolfgang Stammer, Kristian Kersting, Felix Friedrich
Large Language Models · Reinforcement Learning · Theory
  • RLVR-trained models often engage in reward hacking by exploiting verifier weaknesses.
  • Isomorphic Perturbation Testing (IPT) is introduced as a method to detect shortcut behavior in LLMs.
  • Shortcut behavior is specific to RLVR-trained models and increases with task complexity.
  • Extensional verification can lead to systematic shortcut strategies, while isomorphic verification eliminates them.
Read more
CLion: Efficient Cautious Lion Optimizer with Enhanced Generalization
Feihu Huang, Guanyi Zhang, Songcan Chen
Optimization · Theory · Efficient ML
  • CLion optimizer improves generalization over the original Lion optimizer.
  • The generalization error of Lion is proven to be O(τT/N).
  • CLion achieves a lower generalization error of O(1/N).
  • CLion demonstrates a fast convergence rate for nonconvex stochastic optimization.
Read more
Awakening Dormant Experts: Counterfactual Routing to Mitigate MoE Hallucinations
Wentao Hu, Yanbo Zhai, Xiaohui Hu, Mingkuan Zhao, Shanhong Yu, Xue Liu, Kaidong Yu, Shuangyong Song, Xuelong Li
NLP · Large Language Models · Efficient ML
  • Identification of the 'Dormant Expert' phenomenon in MoE models due to static routing mechanisms.
  • Introduction of Counterfactual Routing (CoR) as a training-free inference framework.
  • CoR achieves compute-preserving expert redistribution to enhance factual accuracy.
  • Empirical results show a 3.1% improvement in factual accuracy on multiple benchmarks.
Read more
Improving Sparse Autoencoder with Dynamic Attention
Dongsheng Wang, Jinsen Zhang, Dawei Su, Hui Huang
Interpretability · Computer Vision · NLP
  • Introduction of a transformer-based SAE architecture that enhances concept learning coherence.
  • Development of a sparsemax function that dynamically determines the number of active concepts per sample.
  • Demonstration of improved reconstruction loss and concept quality through extensive validation.
  • The approach eliminates the need for hyperparameter tuning related to sparsity levels.
Read more
When Fairness Metrics Disagree: Evaluating the Reliability of Demographic Fairness Assessment in Machine Learning
Khalid Adnan Alsayed
Computer Vision
  • Different fairness metrics can produce conflicting assessments of model bias.
  • The Fairness Disagreement Index (FDI) quantifies the inconsistency across metrics.
  • Fairness evaluations are unstable and vary significantly with different grouping strategies and thresholds.
  • Single-metric reporting is insufficient for reliable bias assessment in machine learning systems.
Read more
Path-Sampled Integrated Gradients
Firuz Kamalov, Fadi Thabtah, R. Sivaraj, Neda Abdelhamid
Interpretability · Theory · Efficient ML
  • PS-IG generalizes feature attribution by sampling baselines along the interpolation path.
  • It is mathematically equivalent to PWIG under specific conditions, enhancing computational efficiency.
  • The method improves error convergence rates for smooth models.
  • PS-IG reduces attribution variance, addressing issues of gradient noise.
Read more
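For reference, vanilla Integrated Gradients averages gradients along a straight path from a single fixed baseline; PS-IG generalizes this by sampling baselines along that path. A sketch of the vanilla method, where the `grad_fn` callable is an assumed interface for illustration:

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=64):
    """Vanilla Integrated Gradients: average the gradient along the straight
    path from one fixed baseline to x, then scale by the displacement."""
    alphas = (np.arange(steps) + 0.5) / steps  # midpoint rule on [0, 1]
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        total += grad_fn(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Completeness check on f(x) = sum(x^2): attributions sum to f(x) - f(baseline).
grad = lambda v: 2.0 * v
x = np.array([1.0, 2.0])
attr = integrated_gradients(grad, x, np.zeros(2))  # [1., 4.], summing to f(x) = 5
```

The completeness axiom (attributions summing to the output difference) is what both IG and its path-sampled generalizations are designed to preserve.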
Auxiliary Finite-Difference Residual-Gradient Regularization for PINNs
Stavros Kassinos
Theory · Optimization
  • Introduces an auxiliary finite-difference regularizer for PINNs that maintains the governing PDE residual in AD form.
  • Demonstrates a trade-off between field accuracy and residual cleanliness in a controlled two-dimensional Poisson problem.
  • Implements a body-fitted shell in a three-dimensional annular heat-conduction benchmark to improve accuracy of specific quantities of interest.
  • Achieves significant reductions in RMSE for outer-wall boundary conditions and wall-flux metrics compared to baseline models.
Read more
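A finite-difference PDE residual of the kind such an auxiliary regularizer penalizes is easy to write down. Below, the standard 5-point stencil residual for −∇²u = f on a uniform grid, with an illustrative grid and test function rather than the paper's benchmarks:

```python
import numpy as np

def poisson_residual(u, f, h):
    """Finite-difference residual of -laplacian(u) = f on a uniform grid
    with spacing h, evaluated at interior points via the 5-point stencil."""
    lap = (u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
           - 4.0 * u[1:-1, 1:-1]) / h**2
    return -lap - f[1:-1, 1:-1]

# Sanity check: u(x, y) = x^2 + y^2 has laplacian 4, so f = -4 gives zero residual
# (the stencil is exact for quadratics).
n, h = 9, 0.125
xs = np.arange(n) * h
X, Y = np.meshgrid(xs, xs, indexing="ij")
u = X**2 + Y**2
res = poisson_residual(u, np.full((n, n), -4.0), h)
```

In a PINN training loop, a term like `(res**2).mean()` would be added to the loss alongside the autodiff-based residual the paper keeps in AD form.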
Gating Enables Curvature: A Geometric Expressivity Gap in Attention
Satwik Bathula, Anand A. Joshi
NLP · Large Language Models · Theory
  • Gated attention mechanisms enable non-flat geometries, enhancing representational expressivity.
  • Ungated attention is limited to flat statistical manifolds due to its affine structure.
  • Multiplicative gating introduces curvature in representation spaces, improving performance on nonlinear tasks.
  • A depth amplification effect is observed, where curvature accumulates under composition in gated models.
Read more
Physics-Informed Machine Learning for Pouch Cell Temperature Estimation
Zheng Liu
Optimization · Efficient ML · Theory
  • Introduces a physics-informed machine learning framework for temperature estimation in pouch cells.
  • Integrates governing heat transfer equations into the neural network's loss function for improved accuracy.
  • Achieves a 49.1% reduction in mean squared error compared to traditional data-driven models.
  • Demonstrates faster convergence and superior performance in temperature estimation, especially away from cooling channels.
Read more
Generative Augmented Inference
Cheng Lu, Mengxin Wang, Dennis J. Zhang, Heng Zhang
Large Language Models · Efficient ML · Theory
  • GAI provides a framework for integrating AI-generated outputs into statistical estimation without requiring them to be accurate surrogates for outcomes.
  • The method improves estimation efficiency and reduces human labeling requirements significantly across various applications.
  • GAI demonstrates strong empirical performance, outperforming traditional estimators in diverse settings.
  • The framework utilizes an orthogonal moment construction for consistent estimation and valid inference.
Read more
Quantization of Spiking Neural Networks Beyond Accuracy
Evan Gibson Smith, Jacob Whitehill, Fatemeh Ganji
Efficient ML
  • EMD is proposed as a new metric for evaluating firing distribution divergence in quantized SNNs.
  • Quantization methods significantly affect firing dynamics, which are not captured by accuracy metrics alone.
  • Learned quantization methods like LQ-Net maintain firing behavior more effectively than uniform quantization.
  • The study highlights the importance of considering behavior preservation in the deployment of quantized SNNs.
Read more
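On a shared 1-D support with unit bin spacing, the Earth Mover's Distance reduces to the L1 distance between cumulative distributions, which is the classic identity that EMD-style comparisons of firing histograms build on. A minimal sketch:

```python
import numpy as np

def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms on the same
    support (unit bin spacing): the L1 distance between their CDFs."""
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum()

d0 = emd_1d([1, 0, 0], [1, 0, 0])  # identical distributions: 0.0
d1 = emd_1d([1, 0, 0], [0, 1, 0])  # all mass shifted by one bin: 1.0
```

Unlike accuracy, this distance is sensitive to *where* the firing mass moved, not just whether the final prediction changed.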
Material-Agnostic Zero-Shot Thermal Inference for Metal Additive Manufacturing via a Parametric PINN Framework
Hyeonsu Lee, Jihoon Jeong
Theory · Efficient ML · Optimization
  • Introduces a parametric PINN framework for zero-shot thermal inference in metal AM.
  • Achieves generalization across different materials without requiring labeled data or retraining.
  • Demonstrates a significant reduction in relative L2 error compared to non-parametric models.
  • Incorporates physics-guided output scaling and hybrid optimization for improved training stability.
Read more
RL-STPA: Adapting System-Theoretic Hazard Analysis for Safety-Critical Reinforcement Learning
Steven A. Senczyszyn, Timothy C. Havens, Nathaniel Rice, Jason E. Summers, Benjamin D. Werner, Benjamin J. Schumeg
Reinforcement Learning · Robotics · Theory
  • RL-STPA adapts STPA for the unique challenges of reinforcement learning in safety-critical applications.
  • The framework includes hierarchical subtask decomposition, coverage-guided perturbation testing, and iterative checkpoints.
  • Demonstrated effectiveness in identifying potential hazards in autonomous drone navigation.
  • Provides a systematic approach for safety evaluation and improvement of RL systems.
Read more
Portfolio Optimization Proxies under Label Scarcity and Regime Shifts via Bayesian and Deterministic Students under Semi-Supervised Sandwich Training
Adhiraj Chattopadhyay
Optimization
  • Introduction of a teacher-student learning framework for portfolio optimization using CVaR as a supervisory signal.
  • Development of a low-data Bayesian Neural Network (BNN) pipeline that incorporates uncertainty awareness.
  • Demonstration of implicit turnover reduction in trading activity without explicit constraints.
  • Structured stress testing reveals the ability of models to generalize across different market regimes.
Read more
SOLIS: Physics-Informed Learning of Interpretable Neural Surrogates for Nonlinear Systems
Murat Furkan Mansur, Tufan Kumbasar
Theory · Interpretability · Optimization
  • Introduces SOLIS, a framework for nonlinear system identification that enhances interpretability.
  • Models dynamics using a state-conditioned second-order surrogate, avoiding rigid parametric assumptions.
  • Decouples trajectory reconstruction from parameter estimation to improve training stability.
  • Employs a cyclic curriculum and Local Physics Hints to mitigate optimization challenges.
Read more
Modular Continual Learning via Zero-Leakage Reconstruction Routing and Autonomous Task Discovery
Noureddine Kermiche
Computer Vision · NLP · Efficient ML
  • Introduces a modular architecture for continual learning that prevents catastrophic forgetting.
  • Utilizes a Simultaneous Pipeline for real-time knowledge consolidation while ensuring data privacy.
  • Employs a Tight-Bottleneck Autoencoder to manage high-dimensional latent spaces effectively.
  • Demonstrates strong retention in learning tasks across different domains without redundancy.
Read more
GUI-Perturbed: Domain Randomization Reveals Systematic Brittleness in GUI Grounding Models
Yangyue Wang, Harshvardhan Sikka, Yash Mathur, Tony Zhou, Jinu Nyachhyon, Pranav Guruprasad
Computer Vision · NLP · Multimodal
  • GUI grounding models show a significant drop in accuracy (27-56 percentage points) when tasked with spatial reasoning.
  • A 70% browser zoom leads to a statistically significant degradation in model performance.
  • Standard training methods, including rank-8 LoRA fine-tuning, do not improve performance and may degrade spatial reasoning capabilities.
  • GUI-Perturbed provides a diagnostic framework that reveals specific weaknesses in model capabilities.
Read more
Tight Sample Complexity Bounds for Best-Arm Identification Under Bounded Systematic Bias
Tianhao Qian
Theory · Optimization · Robotics
  • Introduces a framework for Best-Arm Identification under bounded systematic bias in heuristic pruning.
  • Establishes tight sample complexity bounds for safe node elimination based on empirical reward gaps.
  • Develops the PAC-MCTS algorithm for bias-aware pruning in Monte Carlo Tree Search applications.
  • Validates theoretical results through experiments in controlled synthetic environments.
Read more
Graph-Based Fraud Detection with Dual-Path Graph Filtering
Wei He, Wensheng Gan, Philip S. Yu
Graph Learning
  • DPF-GFD addresses key challenges in fraud detection, including relation camouflage and class imbalance.
  • The model utilizes a beta wavelet-based operator for structural pattern extraction.
  • A dual-path filtering approach enhances the discriminative power of node representations.
  • Experimental results show significant improvements in fraud detection accuracy over existing GNN methods.
Read more
Beyond Importance Sampling: Rejection-Gated Policy Optimization
Ziwu Sun, Zhen Gao, Jiyong Zhang, Jiaheng Li
Reinforcement Learning · Optimization · Theory
  • Introduces RGPO, which uses a differentiable acceptance gate for sample selection in policy optimization.
  • Unifies various policy gradient methods under a common framework, enhancing theoretical understanding.
  • Guarantees finite gradient variance and bounded bias, addressing issues with traditional importance sampling.
  • Achieves superior performance in online reinforcement learning tasks compared to existing methods.
Read more
xFODE: An Explainable Fuzzy Additive ODE Framework for System Identification
Ertugrul Kececi, Tufan Kumbasar
Interpretability · Time Series · Theory
  • xFODE enhances interpretability in system identification by defining states with physical meaning.
  • The framework employs fuzzy additive models to approximate state derivatives, allowing for input-wise contributions.
  • Partitioning Strategies (PSs) are introduced to simplify the antecedent space and improve interpretability.
  • xFODE achieves accuracy on par with existing models while providing clearer insights into system dynamics.
Read more
Constraint-based Pre-training: From Structured Constraints to Scalable Model Initialization
Fu Feng, Yucheng Xie, Ruixiao Shi, Jing Wang, Xin Geng
Efficient ML · Computer Vision · Robotics
  • Introduces a constraint-based pre-training paradigm for scalable model initialization.
  • Disentangles size-agnostic knowledge into reusable weight templates.
  • Employs Kronecker-based constraints for regularizing the pre-training process.
  • Achieves state-of-the-art performance across various perception and embodied learning tasks.
Read more
From Risk to Rescue: An Agentic Survival Analysis Framework for Liquidation Prevention
Fernando Spadea, Oshani Seneviratne
Optimization · Time Series
  • The framework shifts from passive prediction to proactive intervention in liquidation prevention.
  • A novel return period metric is introduced to normalize risk across different transaction types.
  • The counterfactual optimization loop allows for simulation of user actions to minimize required intervention capital.
  • The agent successfully differentiates between actionable risks and negligible events, enhancing capital efficiency.
Read more
Mean Flow Policy Optimization
Xiaoyi Dong, Xi Sheryl Zhang, Jian Cheng
Reinforcement Learning · Generative Models · Optimization
  • MFPO utilizes MeanFlow models to improve efficiency in online RL.
  • The method promotes exploration through maximum entropy RL and soft policy iteration.
  • MFPO addresses challenges in action likelihood evaluation and policy improvement.
  • Experimental results show MFPO matches or exceeds the performance of diffusion-based methods.
Read more
Curvature-Aligned Probing for Local Loss-Landscape Stabilization
Nikita Kiselev, Andrey Grabovoy
Theory · Optimization · Efficient ML
  • Introduces a unified family of local stabilization criteria for loss landscapes under sample growth.
  • Proposes a curvature-aligned criterion ∆(D)² that focuses on the top-D eigenspace of the empirical Hessian.
  • Demonstrates that the new criterion preserves the O(k⁻²) mean-squared decay rate while simplifying curvature dependence.
  • Develops scalable estimators that are significantly faster than traditional Monte Carlo methods.
Read more
Scouting By Reward: VLM-TO-IRL-Driven Player Selection For Esports
Qing Yan, Wenyu Yang, Yufei Wang, Wenhao Ma, Linchong Hu, Yifei Jin, Anton Dahbura
Reinforcement Learning · Multimodal
  • Introduces a novel application of inverse reinforcement learning for style-based player scouting in esports.
  • Develops a two-branch architecture that integrates gameplay telemetry with tactical commentary for enhanced player evaluation.
  • Demonstrates that the proposed system can match expert analysts' judgments while scaling beyond manual review capabilities.
  • Addresses the gap in current esports analytics tools that fail to capture nuanced player behaviors and styles.
Read more
When Flat Minima Fail: Characterizing INT4 Quantization Collapse After FP32 Convergence
Marcus Armstrong
NLP · Large Language Models · Efficient ML
  • Identifies a three-phase divergence structure in INT4 quantization after FP32 convergence.
  • Divergence begins when FP32 perplexity converges, not solely due to learning rate decay.
  • INT8 quantization remains stable, indicating the issue is specific to INT4 quantization.
  • Controlled experiments show that learning rate schedule amplitude affects quantization robustness.
Read more
Optimal last-iterate convergence in matrix games with bandit feedback using the log-barrier
Come Fiegel, Pierre Menard, Tadashi Kozuno, Michal Valko, Vianney Perchet
Theory · Optimization
  • Introduces a new algorithm that achieves Õ(t^{-1/4}) last-iterate convergence in bandit feedback settings.
  • Utilizes log-barrier regularization and a dual-focused analysis to enhance convergence rates.
  • Extends the approach to extensive-form games, maintaining the same convergence rate.
  • Addresses the limitations of previous methods that failed to achieve optimal rates in uncoupled player scenarios.
Read more
MixAtlas: Uncertainty-aware Data Mixture Optimization for Multimodal LLM Midtraining
Bingbing Wen, Sirajul Salekin, Feiyang Kang, Bill Howe, Lucy Lu Wang, Javier Movellan, Manjot Bilkhu
Multimodal · Large Language Models · Optimization
  • Introduces MixAtlas for interpretable and efficient multimodal data mixture optimization.
  • Decomposes training data along two axes: image concepts and task supervision.
  • Utilizes small proxy models and Gaussian-process surrogates for uncertainty-aware optimization.
  • Achieves significant performance improvements and faster convergence in training.
Read more
Shapley Value-Guided Adaptive Ensemble Learning for Explainable Financial Fraud Detection with U.S. Regulatory Compliance Validation
Mohammad Nasir Uddin, Md Munna Aziz
Interpretability · Graph Learning · Time Series
  • Evaluation of explanation methods reveals significant variation in SHAP reliability across different model types.
  • The SHAP-Guided Adaptive Ensemble (SGAE) framework dynamically adjusts model reliance based on SHAP attribution agreement.
  • GNN-GraphSAGE outperforms other models in overall performance metrics but raises questions about the nature of its advantages.
  • The study connects SHAP interpretations to regulatory standards, offering architecture-specific compliance guidance.
Read more
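The Shapley values that SHAP approximates have an exact but exponential-cost definition: each feature's weighted average marginal contribution over all coalitions of the other features. A minimal sketch on a toy additive model, where the attribution recovers each feature's contribution exactly (the feature names are illustrative, not the paper's):

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values for a small feature set. Exponential in the
    number of features, which is why SHAP approximates this in practice."""
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        others = [g for g in features if g != f]
        for size in range(n):
            for coal in combinations(others, size):
                w = factorial(size) * factorial(n - size - 1) / factorial(n)
                phi[f] += w * (value_fn(set(coal) | {f}) - value_fn(set(coal)))
    return phi

# Toy additive "fraud score": Shapley recovers each term's contribution exactly.
vals = {"roa": 2.0, "npl": 5.0, "dep": 1.0}
phi = shapley_values(lambda s: sum(vals[f] for f in s), list(vals))
```

For non-additive models the values no longer decompose this cleanly, which is exactly where attribution disagreement between model types can arise.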
When Missing Becomes Structure: Intent-Preserving Policy Completion from Financial KOL Discourse
Yuncong Liu, Yuan Wan, Zhou Jiang, Yao Lu
Reinforcement Learning · NLP · Multimodal
  • Identifies a systematic structural property of financial KOL discourse: missing execution decisions follow patterns in how investment intent is expressed rather than occurring at random.
  • Proposes the KICL framework to complete missing execution decisions while preserving KOL intent, framing it as an offline sequential decision-making problem.
  • Introduces a betrayal-oriented evaluation perspective for KOL-conditioned policy learning, focusing on unsupported entries and directional reversals.
  • Demonstrates that KICL achieves the best financial performance metrics on both YouTube and X platforms.
Read more
Adaptive Test-Time Compute Allocation for Reasoning LLMs via Constrained Policy Optimization
Zhiyuan Zhai, Bingcong Li, Bingnan Xiao, Ming Li, Xin Wang
Large Language Models · Optimization · Efficient ML
  • Formalization of input-adaptive compute allocation as a constrained optimization problem.
  • Introduction of a SOLVE-THEN-LEARN framework for efficient compute allocation.
  • Demonstrated significant accuracy improvements over traditional allocation methods.
  • Established formal guarantees for the proposed method's performance.
Read more
TOPCELL: Topology Optimization of Standard Cell via LLMs
Zhan Song, Yu-Tung Liu, Chen Chen, Guoheng Sun, Jiaqi Yin, Chia-tung Ho, Ang Li, Haoxing Ren, Cunxi Yu
Large Language Models · Optimization
  • Introduction of TOPCELL, an LLM-driven framework for standard cell topology optimization.
  • Utilization of Group Relative Policy Optimization (GRPO) for efficient topology discovery.
  • Demonstration of superior performance and zero-shot generalization in topology generation.
  • Achieved an average speedup of 85.91x compared to existing automation frameworks.
Read more
Calibration-Gated LLM Pseudo-Observations for Online Contextual Bandits
Maksim Pershin, Ivan Golovanov, Pavel Baltabaev, Natalia Trankova
Large Language Models · Reinforcement Learning · Optimization
  • Introduces a framework for integrating LLM pseudo-observations into contextual bandits with calibration-gated weighting.
  • Demonstrates a 19% reduction in cumulative regret on MIND-small using task-specific prompts.
  • Finds that prompt design is more critical than decay schedule or calibration parameters in influencing performance.
  • Analyzes failure modes of calibration gating in domains with minimal prediction errors.
Read more
Reinforcement Learning via Value Gradient Flow
Haoran Xu, Kaiwen Hu, Somayeh Sojoudi, Amy Zhang
Reinforcement Learning · Large Language Models · Optimization
  • Introduces Value Gradient Flow (VGF) as a new paradigm for behavior-regularized RL.
  • Reframes behavior-regularized RL as an optimal transport problem, enhancing scalability.
  • Eliminates explicit policy parameterization, allowing for adaptive test-time scaling.
  • Achieves state-of-the-art performance on offline RL benchmarks and LLM tasks.
Read more
Stability and Generalization in Looped Transformers
Asher Labovich
Theory · Large Language Models · Efficient ML
  • Introduces a fixed-point based framework for analyzing looped transformers.
  • Establishes that recall and outer normalization are crucial for stability and generalization.
  • Empirical results validate the theoretical framework across various tasks.
  • Presents 'internal recall' as a novel variant that improves performance in specific scenarios.
Read more
Zeroth-Order Optimization at the Edge of Stability
Minhak Song, Liang Zhang, Bingcong Li, Niao He, Michael Muehlebach, Sewoong Oh
Optimization · Theory
  • Introduces a mean-square linear stability theory for zeroth-order optimization methods.
  • Establishes that ZO methods' stability depends on the entire Hessian spectrum, unlike first-order methods.
  • Derives tractable stability bounds using the largest eigenvalue and Hessian trace.
  • Empirical results show ZO methods operate at the edge of stability in deep learning tasks.
Read more