AI-generated summaries

Today's ML research, without the noise.

Summaries of the latest machine learning papers from arXiv, refreshed every 8 hours.

66 papers today · 8-hour update frequency · 7 days of history
NANOZK: Layerwise Zero-Knowledge Proofs for Verifiable Large Language Model Inference
Zhaohui Geoffrey Wang
Large Language Models Theory Efficient ML
  • NANOZK provides a cryptographic mechanism for verifying LLM inference, addressing trust issues in LLM APIs.
  • The layerwise proof framework allows for independent layer computations, significantly improving scalability and efficiency.
  • Lookup table approximations for non-arithmetic operations maintain model accuracy without compromising verification (the lookup idea is sketched below).
  • NANOZK achieves a 52× speedup over existing ZKP methods while ensuring soundness guarantees.
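To make the lookup-table bullet concrete, here is a minimal sketch of approximating a non-arithmetic operation (GELU) with a precomputed table over a quantized input grid, the general trick such proof systems use so each nonlinearity becomes a table lookup rather than an arithmetic circuit. The range, bit width, and accuracy check are illustrative assumptions, not NANOZK's parameters.

```python
import math

def gelu(x: float) -> float:
    # Exact GELU via the error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Quantize inputs to 8 bits over [-4, 4]; outside this range GELU is
# nearly linear/zero, so the table covers the interesting region.
LO, HI, BITS = -4.0, 4.0, 8
STEP = (HI - LO) / (2**BITS - 1)

# One-time table build. A prover commits to this table once, and each
# nonlinearity evaluation becomes a lookup argument instead of an
# arithmetic sub-circuit for erf/exp.
TABLE = [gelu(LO + i * STEP) for i in range(2**BITS)]

def gelu_lookup(x: float) -> float:
    # Clamp, snap to the nearest grid point, read the table.
    i = round((min(max(x, LO), HI) - LO) / STEP)
    return TABLE[i]

assert abs(gelu_lookup(1.3) - gelu(1.3)) < STEP  # coarse accuracy check
```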
Read more
Do Post-Training Algorithms Actually Differ? A Controlled Study Across Model Scales Uncovers Scale-Dependent Ranking Inversions
Xiaoyi Li
Large Language Models Reinforcement Learning Optimization
  • Algorithm rankings are scale-dependent, with significant inversions observed between different model sizes.
  • Most modifications to DPO variants do not significantly outperform the vanilla DPO algorithm.
  • Algorithm effectiveness is highly task-specific, with performance varying greatly across different benchmarks.
  • A hierarchy of performance factors is established: model scale > training paradigm > online vs. offline methods > loss function modifications.
Read more
GoAgent: Group-of-Agents Communication Topology Generation for LLM-based Multi-Agent Systems
Hongjiang Chen, Xin Zheng, Yixin Liu, Pengfei Jiao, Shiyuan Li, Huan Liu, Zhidong Zhao, Ziqi Xu, Ibrahim Khalil, Shirui Pan
NLP Large Language Models Graph Learning
  • GoAgent shifts the communication topology generation paradigm from node-centric to group-centric.
  • The method utilizes LLMs to identify task-relevant collaborative groups for efficient problem-solving.
  • Incorporation of a Conditional Information Bottleneck (CIB) reduces communication redundancy.
  • GoAgent achieves 93.84% average accuracy and reduces token consumption by about 17% across benchmarks.
Read more
Trojan horse hunt in deep forecasting models: Insights from the European Space Agency competition
Krzysztof Kotowski, Ramez Shendy, Jakub Nalepa, Agata Kaczmarek, Dawid Płudowski, Piotr Wilczyński, Artur Janicki, Przemysław Biecek, Ambros Marzetta, Atul Pande, Lalit Chandra Routhu, Swapnil Srivastava, Evridiki Ntagiou
Time Series
  • Trojan horse attacks pose significant security risks to deep forecasting models used in spacecraft telemetry.
  • The Trojan Horse Hunt competition engaged over 200 teams to identify hidden triggers in forecasting models.
  • The competition highlighted the lack of effective methods for detecting and characterizing trojan triggers in time series data.
  • The results emphasize the necessity for robust security measures in AI applications, particularly in high-stakes environments like space operations.
Read more
SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GPU Kernels Against Hardware Limits
Edward Lin, Sahil Modi, Siva Kumar Sastry Hari, Qijing Huang, Zhifan Ye, Nestor Qin, Fengzhe Zhou, Yuan Zhang, Jingquan Wang, Sana Damani, Dheeraj Peri, Ouye Xie, Aditya Kane, Moshe Maor, Michael Behar, Triston Cao, Rishabh Mehta, Vartika Singh, Vikram Sharma Mailthody, Terry Chen, Zihao Ye, Hanfeng Chen, Tianqi Chen, Vinod Grover, Wei Chen, Wei Liu, Eric Chung, Luis Ceze, Roger Bringmann, Cyril Zeller, Michael Lightstone, Christos Kozyrakis, Humphrey Shi
Optimization Efficient ML Generative Models
  • SOL-ExecBench benchmarks GPU kernels against hardware limits rather than software baselines.
  • The benchmark includes 235 optimization problems from diverse AI models, targeting NVIDIA Blackwell GPUs.
  • Performance is measured using Speed-of-Light (SOL) bounds derived from hardware specifications.
  • A scoring system quantifies the optimization potential of kernels relative to hardware capabilities.
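A minimal sketch of how a Speed-of-Light bound can be derived from hardware specifications, assuming the standard roofline model (a kernel can run no faster than its compute time or its memory time, whichever dominates). The peak numbers and the example kernel are placeholders, not Blackwell figures or SOL-ExecBench's exact methodology.

```python
# Hypothetical peak specs; real ones come from the GPU datasheet.
PEAK_TFLOPS = 1000.0          # peak compute, TFLOP/s
PEAK_BW_TBS = 8.0             # peak HBM bandwidth, TB/s

def sol_time_us(flops: float, bytes_moved: float) -> float:
    compute_us = flops / (PEAK_TFLOPS * 1e12) * 1e6
    memory_us = bytes_moved / (PEAK_BW_TBS * 1e12) * 1e6
    return max(compute_us, memory_us)     # hardware lower bound on runtime

def sol_score(flops: float, bytes_moved: float, measured_us: float) -> float:
    # 1.0 means the kernel runs at the hardware limit; lower means headroom.
    return sol_time_us(flops, bytes_moved) / measured_us

# e.g. a square FP16 GEMM doing 2*M*N*K FLOPs over its three tensors:
print(sol_score(flops=2 * 4096**3,
                bytes_moved=3 * 4096**2 * 2,
                measured_us=250.0))
```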
Read more
Scalable Learning of Multivariate Distributions via Coresets
Zeyu Ding, Katja Ickstadt, Nadja Klein, Alexander Munteanu, Simon Omlor
Efficient ML Theory Optimization
  • Introduction of the first coresets for semi-parametric distributional models.
  • Significant data reduction is achieved through importance sampling (sketched below).
  • High probability bounds on log-likelihood accuracy maintained.
  • Enhanced adaptability for complex distributions and non-linear relationships.
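A generic importance-sampling coreset in miniature: the sensitivity proxy below is a deliberately crude stand-in for the paper's bounds, but the reweighting rule (weight 1/(m·p_i)) is the standard construction that keeps weighted sums, such as a log-likelihood, unbiased.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100_000, 5))

# Crude sensitivity proxy: distance from the mean, so influential points
# are sampled more often. Real constructions derive much tighter scores.
s = 1.0 + np.linalg.norm(X - X.mean(axis=0), axis=1)
p = s / s.sum()

m = 2_000                                   # coreset size
idx = rng.choice(len(X), size=m, p=p)
weights = 1.0 / (m * p[idx])                # unbiasedness correction

# Any sum over the full data is now approximated on the coreset:
full_sum = (X**2).sum()
coreset_sum = (weights[:, None] * X[idx] ** 2).sum()
print(full_sum, coreset_sum)                # should be close
```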
Read more
FedRG: Unleashing the Representation Geometry for Federated Learning with Noisy Clients
Tian Wen, Zhiqin Yang, Yonggang Zhang, Xuefeng Jiang, Hao Peng, Yuwei Wang, Bo Han
Federated Learning
  • FedRG redefines noise identification in federated learning by focusing on representation geometry rather than scalar loss values.
  • The framework utilizes self-supervised learning to create robust, label-agnostic representations.
  • A spherical vMF mixture model is employed to capture semantic clusters and identify noisy samples effectively.
  • Extensive experiments validate the superior performance of FedRG over state-of-the-art methods in heterogeneous data environments.
Read more
TTQ: Activation-Aware Test-Time Quantization to Accelerate LLM Inference On The Fly
Toshiaki Koike-Akino, Jing Liu, Ye Wang
Large Language Models Efficient ML NLP
  • TTQ enables on-the-fly quantization of large models during inference, addressing domain shift issues.
  • The framework incorporates low-complexity activation-aware quantization with negligible overhead (an activation-aware scaling rule is sketched below).
  • TTQ integrates low-rank decomposition to further enhance model compression.
  • Experiments show that TTQ outperforms existing quantization methods on several LLM benchmarks.
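A hedged sketch of activation-aware weight quantization in the spirit of AWQ-style scaling; TTQ's exact rule may differ. The idea: fold a per-channel scale derived from activation statistics into the weights before rounding, so quantization error is pushed onto channels with small typical activations.

```python
import torch

def quantize_activation_aware(W: torch.Tensor, act_norm: torch.Tensor,
                              bits: int = 4, alpha: float = 0.5):
    # Per-input-channel scale from activation statistics seen at test time.
    s = act_norm.clamp(min=1e-6) ** alpha            # (in_features,)
    Ws = W * s                                       # fold scale into weights
    qmax = 2 ** (bits - 1) - 1
    step = Ws.abs().amax(dim=1, keepdim=True) / qmax # per-row quant step
    Wq = torch.clamp(torch.round(Ws / step), -qmax, qmax)
    # Dequantized weights; at inference, inputs are divided by s instead.
    return Wq * step / s

W = torch.randn(16, 64)                  # (out_features, in_features)
act_norm = torch.rand(64) + 0.5          # e.g. running mean of |activation|
W_hat = quantize_activation_aware(W, act_norm)
print((W - W_hat).abs().mean())          # quantization error
```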
Read more
MLOW: Interpretable Low-Rank Frequency Magnitude Decomposition of Multiple Effects for Time Series Forecasting
Runze Yang, Longbing Cao, Xiaoming Wu, Xin You, Kun Fang, Jianxun Li, Jie Yang
Time Series
  • MLOW provides an interpretable frequency-based decomposition for time series forecasting.
  • Introduces Hyperplane-NMF, a new low-rank method that enhances interpretability and efficiency.
  • Addresses challenges of spectral leakage and input horizon limitations in time series analysis.
  • Demonstrates robustness to noise and effective disentanglement of multiple effects.
Read more
SpecForge: A Flexible and Efficient Open-Source Training Framework for Speculative Decoding
Shenggui Li, Chao Wang, Yikai Zhu, Yubo Wang, Fan Yin, Shuai Shi, Yefei Chen, Xiaomin Dong, Qiaoling Chen, Jin Pan, Ji Li, Laixin Xie, Yineng Zhang, Lei Yu, Yonggang Wen, Ivor Tsang, Tianwei Zhang
Large Language Models Efficient ML NLP
  • SpecForge provides a scalable and efficient framework for training speculative decoding models (the underlying draft-verify loop is sketched below).
  • The framework supports EAGLE-3 and incorporates advanced techniques like target-draft decoupling and hybrid parallelism.
  • SpecBundle offers a suite of high-quality draft models that enhance inference speed and quality.
  • The proposed methods lead to significant reductions in inference latency for large language models.
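For context, the draft-verify loop that speculative decoding frameworks like SpecForge train draft models for, sketched with greedy acceptance for brevity. Production systems use a rejection rule that preserves the target distribution; `target` and `draft` are stand-in callables returning logits of shape (batch, seq_len, vocab).

```python
import torch

@torch.no_grad()
def speculative_step(target, draft, ids: torch.Tensor, k: int = 4):
    T = ids.shape[1]
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposal = ids
    for _ in range(k):
        nxt = draft(proposal)[:, -1].argmax(-1, keepdim=True)
        proposal = torch.cat([proposal, nxt], dim=1)
    # 2. The target model scores all k proposals in ONE forward pass.
    preds = target(proposal).argmax(-1)        # preds[:, i] predicts token i+1
    drafted, checks = proposal[:, T:], preds[:, T - 1:]
    # 3. Keep the longest agreeing prefix, plus one free target token.
    n_ok = 0
    while n_ok < k and bool(drafted[0, n_ok] == checks[0, n_ok]):
        n_ok += 1
    return torch.cat([proposal[:, : T + n_ok],
                      checks[:, n_ok : n_ok + 1]], dim=1)
```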
Read more
Beyond Weighted Summation: Learnable Nonlinear Aggregation Functions for Robust Artificial Neurons
Berke Deniz Bozyigit
Theory Optimization Computer Vision
  • Introduction of two learnable nonlinear aggregation functions: F-Mean and Gaussian Support neurons (a related power-mean aggregator is sketched below).
  • Development of hybrid neurons that combine linear and nonlinear aggregation for improved robustness.
  • Evaluation on CIFAR-10 demonstrates significant improvements in noise robustness and slight gains in clean data performance.
  • Learned parameters converge to sub-linear aggregation strategies, indicating effective noise handling.
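One plausible form of a learnable nonlinear aggregator is a power mean with a trainable exponent, sketched below; this illustrates the family of functions involved, not the paper's exact F-Mean or Gaussian Support parameterization.

```python
import torch
import torch.nn as nn

class PowerMeanNeuron(nn.Module):
    """Weighted power mean with a learnable exponent p.
    p = 1 recovers an ordinary weighted mean; p < 1 damps large inputs,
    one route to the sub-linear aggregation the paper reports."""
    def __init__(self, in_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.rand(in_features))
        self.p = nn.Parameter(torch.tensor(1.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.weight, dim=0)        # convex mixture weights
        z = x.abs().clamp(min=1e-6) ** self.p        # |x| keeps powers real
        return (z * w).sum(dim=-1) ** (1.0 / self.p)

neuron = PowerMeanNeuron(8)
print(neuron(torch.randn(4, 8)))   # one robust activation per input row
```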
Read more
Automatic Configuration of LLM Post-Training Pipelines
Channe Chwa, Xinle Wu, Yao Lu
Large Language Models Reinforcement Learning Optimization
  • AutoPipe is a novel framework for budget-aware configuration selection in LLM post-training pipelines.
  • It employs a dataset-conditioned ranking surrogate to provide transferable guidance across datasets.
  • The framework adapts online using Bayesian optimization and a Gaussian-process residual model (a generic BO loop is sketched below).
  • An early-stop predictor is introduced to minimize evaluation costs by leveraging early training signals.
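A generic Bayesian-optimization loop of the kind AutoPipe adapts online, sketched with a scikit-learn GP and expected improvement. The one-dimensional objective is a stand-in for "post-training score at a given configuration"; none of this reflects AutoPipe's actual surrogate.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):                    # hypothetical pipeline score
    return -(x - 0.3) ** 2 + 0.1 * np.sin(8 * x)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (3, 1))        # a few warm-start configurations
y = objective(X).ravel()

for _ in range(10):
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    cand = rng.uniform(0, 1, (256, 1))
    mu, sd = gp.predict(cand, return_std=True)
    # Expected improvement over the best score seen so far.
    gap = mu - y.max()
    ei = gap * norm.cdf(gap / (sd + 1e-9)) + sd * norm.pdf(gap / (sd + 1e-9))
    x_next = cand[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next)[0])

print(X[np.argmax(y)], y.max())      # best configuration found
```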
Read more
A Dynamic Bayesian and Machine Learning Framework for Quantitative Evaluation and Prediction of Operator Situation Awareness in Nuclear Power Plants
Shuai Chen, Huiqiao Jia, Tao Qing, Li Zhang, Xingyu Xiao
Theory Interpretability Time Series
  • Introduces a dynamic Bayesian-machine learning framework for real-time evaluation of operator situation awareness.
  • Identifies and quantifies interdependencies among 11 performance shaping factors affecting situation awareness.
  • Reports a predictive error of 13.8% when estimating situation awareness scores from performance shaping factors.
  • Demonstrates the importance of training quality and stress dynamics in maintaining operator situation awareness.
Read more
Parameter-Efficient Token Embedding Editing for Clinical Class-Level Unlearning
Iyad Ait Hou, Shrenik Borad, Harsh Sharma, Pooja Srinivasan, Rebecca Hwa, Aya Zirikly
NLP Large Language Models Efficient ML
  • Introduction of Sparse Token Embedding Unlearning (STEU) for parameter-efficient unlearning.
  • STEU modifies only a small fraction of model parameters, making it suitable for deployment-constrained environments (see the sketch below).
  • Demonstrated effectiveness across multiple clinical datasets and transformer architectures.
  • Achieves near-complete forgetting of targeted information while preserving model utility.
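A sketch of the general mechanism behind token-embedding unlearning: mask gradients so that only the embedding rows of targeted tokens are updated while every other parameter stays frozen. The ascent objective below is an illustrative stand-in, not STEU's loss.

```python
import torch
import torch.nn as nn

emb = nn.Embedding(30_000, 256)
target_token_ids = torch.tensor([1042, 5871, 22903])   # tokens to unlearn

# Zero out gradients for every embedding row except the targeted ones.
mask = torch.zeros_like(emb.weight)
mask[target_token_ids] = 1.0
emb.weight.register_hook(lambda g: g * mask)

opt = torch.optim.SGD(emb.parameters(), lr=0.1)
batch = torch.randint(0, 30_000, (8, 16))
loss = -emb(batch).pow(2).mean()      # stand-in "forget" objective (ascent)
opt.zero_grad()
loss.backward()
opt.step()
# Only the three targeted rows moved; the rest of the table is untouched.
```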
Read more
Enhancing Multi-Corpus Training in SSL-Based Anti-Spoofing Models: Domain-Invariant Feature Extraction
Anh-Tuan Dao, Driss Matrouf, Mickael Rouvier, Nicholas Evans
Audio & Speech
  • Multi-corpus training can lead to performance degradation in spoofing detection due to dataset-specific biases.
  • The proposed IDFE framework effectively reduces corpus-specific information in embeddings, improving generalization (a classic gradient-reversal approach to the same goal is sketched below).
  • The IDFE framework achieves a 20% reduction in average EER compared to baseline models across multiple datasets.
  • The study emphasizes the need for robust training methodologies to enhance the reliability of anti-spoofing systems.
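A classic way to strip dataset-specific information from embeddings is a gradient-reversal domain classifier, as in DANN; the sketch below shows that general mechanism, which is not necessarily IDFE's exact formulation.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None   # flip gradients into the encoder

encoder = nn.Linear(128, 64)       # stand-in for an SSL embedding extractor
corpus_head = nn.Linear(64, 5)     # predicts which corpus a clip came from

x, corpus_label = torch.randn(32, 128), torch.randint(0, 5, (32,))
z = encoder(x)
logits = corpus_head(GradReverse.apply(z, 1.0))
loss = nn.functional.cross_entropy(logits, corpus_label)
loss.backward()   # the encoder is pushed to *remove* corpus information
```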
Read more
Scalable Cross-Facility Federated Learning for Scientific Foundation Models on Multiple Supercomputers
Yijiang Li, Zilinghan Li, Kyle Chard, Ian Foster, Todd Munson, Ravi Madduri, Kibaek Kim
Federated Learning
  • Development of a cross-facility FL framework tailored for heterogeneous HPC environments.
  • Systematic characterization of performance variations due to computational throughput and communication costs.
  • Evaluation of existing FL algorithms under realistic HPC scheduling conditions.
  • Validation of the framework's applicability through fine-tuning a large language model on scientific data.
Read more
A Visualization for Comparative Analysis of Regression Models
Nassime Mountasir, Baptiste Lafabregue, Bruno Albert, Nicolas Lachiche
Theory Interpretability
  • Traditional regression metrics like MAE and RMSE can mask important differences in model performance.
  • The proposed 2D Error Space visualization allows for a more nuanced understanding of regression model errors.
  • The methodology includes a colormap for error distribution visualization and uses Mahalanobis distance for better comparison.
  • The approach is validated on three real datasets, showcasing its practical relevance.
Read more
Tula: Optimizing Time, Cost, and Generalization in Distributed Large-Batch Training
Sahil Tyagi, Feiyi Wang
Optimization Efficient ML Computer Vision
  • Tula optimizes large-batch training by balancing time, cost, and model quality.
  • The service predicts training time and cost with an error margin of 7.5-14%.
  • Achieves up to 20× speedup and approximately 9% improvement in test accuracy over standard methods.
  • Introduces a gradient-scaling technique to mitigate the generalization gap associated with large-batch training.
Read more
A Mathematical Theory of Understanding
Bahar Taşkesen
Theory
  • The ability to decode information is dependent on the learner's prerequisite knowledge structure.
  • Teaching is modeled as sequential communication, where the effectiveness of signals varies based on the learner's current knowledge state.
  • Two limits on learning speed are identified: structural (prerequisite reachability) and epistemic (uncertainty about the target).
  • Threshold effects in learning imply that resource allocation strategies should focus on depth rather than uniform distribution.
Read more
Heavy-Tailed and Long-Range Dependent Noise in Stochastic Approximation: A Finite-Time Analysis
Siddharth Chandak, Anuj Yadav, Ayfer Ozgur, Nicholas Bambos
Optimization Reinforcement Learning Theory
  • Establishes finite-time convergence bounds for stochastic approximation under heavy-tailed and LRD noise.
  • Demonstrates that convergence rates degrade with the presence of heavy-tailed and LRD noise compared to classical models.
  • Introduces a noise-averaging technique that improves moment bounds without modifying the iteration process.
  • Provides the first finite-time guarantees for SGD under LRD noise and for gradient play under both heavy-tailed and LRD noise.
Read more
GeoLAN: Geometric Learning of Latent Explanatory Directions in Large Language Models
Tianyu Bell Pan, Damon L. Woodard
NLP Large Language Models Interpretability
  • GeoLAN treats token representations as geometric trajectories to improve interpretability in LLMs.
  • Two differentiable regularizers are introduced to promote isotropy and diverse attention (an isotropy-style penalty is sketched below).
  • Experiments show that GeoLAN maintains task performance while enhancing geometric metrics and reducing biases.
  • The approach reveals scale-dependent trade-offs, particularly beneficial for mid-sized models.
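One simple differentiable isotropy penalty, for intuition: push the normalized covariance of token representations toward the identity. This illustrates the kind of geometric regularizer involved, not GeoLAN's exact form.

```python
import torch

def isotropy_penalty(h: torch.Tensor) -> torch.Tensor:
    # h: (num_tokens, dim) hidden states from some layer.
    h = h - h.mean(dim=0, keepdim=True)
    cov = (h.T @ h) / h.shape[0]
    cov = cov / cov.diagonal().mean().clamp(min=1e-8)   # scale-invariant
    eye = torch.eye(cov.shape[0], device=h.device)
    return (cov - eye).pow(2).mean()

h = torch.randn(512, 64, requires_grad=True)
loss_geo = isotropy_penalty(h)
loss_geo.backward()       # added to the task loss with a small weight
print(loss_geo.item())
```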
Read more
From ex(p) to poly: Gaussian Splatting with Polynomial Kernels
Joerg H. Mueller, Martin Winter, Markus Steinberger
Computer Vision Efficient ML
  • Introduction of an N-th-order polynomial kernel for Gaussian Splatting that is computationally efficient and compatible with existing datasets.
  • Significant performance improvements (4%-15%) with negligible degradation in image quality.
  • Formal mathematical derivation proving invariance of anti-aliasing normalization factors for arbitrary kernel functions.
  • Methodology for fitting polynomial coefficients using L1 loss tailored to practical quadric distributions in 3DGS.
Read more
GO-GenZip: Goal-Oriented Generative Sampling and Hybrid Compression
Pietro Talli, Qi Liao, Alessandro Lieto, Parijat Bhattacharjee, Federico Chiariotti, Andrea Zanella
Generative Models Efficient ML Optimization
  • Introduces a goal-oriented approach to data sampling and compression in network telemetry.
  • Combines adaptive sampling with generative AI for efficient data acquisition.
  • Utilizes a hybrid compression scheme to balance fidelity and efficiency.
  • Demonstrates significant cost reductions in data transfer while preserving analytical performance.
Read more
Demonstrations, CoT, and Prompting: A Theoretical Analysis of ICL
Xuhan Tong, Yuchen Zeng, Jiawei Zhang
NLP Large Language Models Theory
  • Demonstration effectiveness is quantified using Lipschitz constants, linking quality to ICL performance.
  • CoT prompting benefits ICL by decomposing tasks into manageable subtasks, contingent on well-selected demonstrations.
  • The influence of prompt templates on ICL performance varies with the number of demonstrations, exhibiting diminishing returns.
  • Theoretical results are supported by empirical experiments, confirming the model's ability to generalize beyond pretraining.
Read more
MemReward: Graph-Based Experience Memory for LLM Reward Prediction with Limited Labels
Tianyang Luo, Tao Feng, Zhigang Hua, Yan Xie, Shuang Yang, Ge Liu, Jiaxuan You
Large Language Models Reinforcement Learning Graph Learning
  • MemReward utilizes a graph-based structure to enhance reward prediction in LLMs with limited labeled data.
  • The framework achieves 97.3% of Oracle performance with only 20% of the required labels.
  • MemReward outperforms fully-supervised models on out-of-domain tasks, showcasing its generalization capabilities.
  • The performance of MemReward scales positively with the increase in label budget, reaching 99.4% of Oracle at 70% labels.
Read more
MeanFlow Meets Control: Scaling Sampled-Data Control for Swarms
Anqi Dong, Yongxin Chen, Karl H. Johansson, Johan Karlsson
Robotics Optimization Generative Models
  • Introduces a control-space learning framework for swarm steering under sampled-data control.
  • Focuses on learning a coefficient for finite-horizon minimum-energy control rather than instantaneous velocity fields.
  • Demonstrates a scalable approach to few-step swarm steering consistent with real control systems.
  • Establishes integral and differential representations for the learned control coefficient.
Read more
SLEA-RL: Step-Level Experience Augmented Reinforcement Learning for Multi-Turn Agentic Training
Prince Zizhuang Wang, Shuli Jiang
Reinforcement Learning Large Language Models NLP
  • SLEA-RL retrieves experiences at each decision step, improving relevance and adaptability.
  • The framework includes a self-evolving experience library that maintains quality under continuous updates.
  • Empirical results show superior performance on multi-turn benchmarks compared to standard RL methods.
Read more
Warm-Start Flow Matching for Guaranteed Fast Text/Image Generation
Minyoung Kim
Generative Models Efficient ML Multimodal
  • Introduces Warm-Start Flow Matching (WS-FM) to enhance sample generation speed in flow matching algorithms.
  • Utilizes lightweight generative models to create initial draft samples that are of decent quality.
  • Reduces the number of time steps required for sample generation, ensuring a guaranteed speed-up (see the warm-start sampler sketched below).
  • Demonstrates effectiveness on both synthetic and real-world datasets.
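A sketch of the warm-start idea under a simple assumption: a flow-matching sampler integrates a learned velocity field with Euler steps, and a cheap draft sample lets integration begin partway along the path instead of from pure noise at t = 0. `velocity` and the draft are toy stand-ins.

```python
import torch

@torch.no_grad()
def sample(velocity, x0: torch.Tensor, t0: float = 0.0, steps: int = 20):
    x, t = x0, t0
    dt = (1.0 - t0) / steps
    for _ in range(steps):
        x = x + dt * velocity(x, torch.full((x.shape[0],), t))
        t += dt
    return x

velocity = lambda x, t: -x                      # toy field for illustration
noise = torch.randn(8, 2)

# Cold start: integrate the full path from noise.
cold = sample(velocity, noise, t0=0.0, steps=20)

# Warm start: a lightweight generator provides a decent draft, treated as a
# point already partway along the path, so integration covers only [t0, 1].
draft = 0.5 * noise                             # pretend draft-model output
warm = sample(velocity, draft, t0=0.6, steps=8) # fewer steps from the draft
```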
Read more
Optimizing Resource-Constrained Non-Pharmaceutical Interventions for Multi-Cluster Outbreak Control Using Hierarchical Reinforcement Learning
Xueqiao Peng, Andrew Perrault
Reinforcement Learning Optimization
  • Formulates resource allocation as a constrained RMAB process addressing asynchronous cluster arrivals.
  • Proposes a hierarchical reinforcement learning framework that separates global coordination from local decision-making.
  • Implements a generalized local DQN that adapts to varying resource constraints without retraining.
  • Achieves a 20%-30% improvement in outbreak control effectiveness compared to heuristic strategies.
Read more
The Residual Stream Is All You Need: On the Redundancy of the KV Cache in Transformer Inference
Kaleem Ullah Qasim, Jiashu Zhang, Muhammad Kafeel Shaheen, Razan Alharith, Heying Zhang
NLP Large Language Models Efficient ML
  • KV cache entries can be exactly reconstructed from the residual stream, proving their redundancy (sketched below).
  • Removing the KV cache yields token-identical outputs across various transformer models.
  • KV-Direct reduces peak memory usage by 2.5× and improves latency compared to traditional caching methods.
  • Cross-task residual patching shows that the residual stream satisfies a Markov property.
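A sketch of the underlying linear-algebra observation: K and V at a layer are fixed linear maps of the normalized residual stream, so caching one d_model vector per token suffices and K/V can be rebuilt exactly on demand. Shapes and names here are illustrative.

```python
import torch

d_model, n_tokens = 1024, 512
W_k = torch.randn(d_model, d_model)        # per-layer key projection
W_v = torch.randn(d_model, d_model)        # per-layer value projection
ln = torch.nn.LayerNorm(d_model)           # pre-attention normalization

# What residual-stream caching keeps: one d_model vector per token per
# layer, instead of K and V (2 * d_model floats per token per layer).
residual_cache = torch.randn(n_tokens, d_model)

def kv_from_residual(res: torch.Tensor):
    h = ln(res)                            # same norm the layer applies
    return h @ W_k.T, h @ W_v.T            # K, V reconstructed on demand

K, V = kv_from_residual(residual_cache)
print(K.shape, V.shape)                    # (512, 1024) each
```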
Read more
InfoMamba: An Attention-Free Hybrid Mamba-Transformer Model
Youjin Wang, Jiaqiao Zhao, Rong Fu, Run Zhou, Ruizhe Zhang, Jiani Liang, Suisuai Cao, Feng Zhou
Efficient ML Computer Vision NLP
  • Introduces InfoMamba, an attention-free hybrid model combining SSM and global filtering.
  • Develops a consistency boundary analysis to identify limitations in existing models.
  • Implements a concept-bottleneck linear filtering layer to reduce interaction complexity.
  • Demonstrates superior performance over existing Transformer and SSM models across multiple tasks.
Read more
CausalRM: Causal-Theoretic Reward Modeling for RLHF from Observational User Feedbacks
Hao Wang, Licheng Pan, Zhichao Chen, Chunyuan Zheng, Zhixuan Chu, Xiaoxi Li, Yuan Lu, Xinggao Liu, Haoxuan Li, Zhouchen Lin
Reinforcement Learning Large Language Models Theory
  • CausalRM addresses the challenges of noisy and biased observational feedback in reward modeling for RLHF.
  • The framework introduces a noise-aware surrogate loss to correct for user annotation errors.
  • Propensity scores are used to reweight training samples, counteracting user preference bias (inverse-propensity weighting is sketched below).
  • Extensive experiments show substantial performance improvements in RLHF tasks using CausalRM.
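A minimal sketch of inverse-propensity weighting on preference pairs, assuming propensities are given rather than estimated: pairs that were unlikely to be annotated are upweighted so the reward model trains on an effectively debiased sample. The Bradley-Terry loss is standard; the clipping constant is an arbitrary variance-control choice.

```python
import torch

def ipw_preference_loss(r_chosen, r_rejected, propensity):
    # Bradley-Terry log-loss per pair, reweighted by 1/propensity.
    per_pair = -torch.nn.functional.logsigmoid(r_chosen - r_rejected)
    w = 1.0 / propensity.clamp(min=1e-3)   # clip to control variance
    return (w * per_pair).sum() / w.sum()

r_c = torch.randn(16, requires_grad=True)  # reward-model scores (stand-ins)
r_r = torch.randn(16)
prop = torch.rand(16) * 0.9 + 0.1          # P(this pair was annotated)
ipw_preference_loss(r_c, r_r, prop).backward()
```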
Read more
Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States
Yurun Yuan, Tengyang Xie
Large Language Models Reinforcement Learning Theory
  • Reintroducing Markov states can break the performance ceiling of RL in LLM post-training.
  • Markov models demonstrate superior out-of-distribution generalization compared to history-dependent models.
  • Theoretical guarantees indicate that Markovian learning achieves lower sample complexity.
  • Empirical results show significant improvements in solving complex logic puzzles.
Read more
Authority-Level Priors: An Under-Specified Constraint in Hierarchical Predictive Processing
Marcela Palejova
Theory
  • Introduction of Authority-Level Priors (ALPs) as constraints on identity-level hypotheses in predictive processing.
  • ALPs explain the persistence of maladaptive predictions despite belief updating and evidence accumulation.
  • The model provides a formal mechanism for understanding regulatory dominance among competing identity-level hypotheses.
  • Falsifiable predictions regarding stress-reactivity and behavioral change dynamics are generated from the proposed framework.
Read more
Deep Hilbert–Galerkin Methods for Infinite-Dimensional PDEs and Optimal Control
Samuel N. Cohen, Filippo de Feo, Jackson Hebner, Justin Sirignano
Optimization Theory Reinforcement Learning
  • Introduction of Hilbert–Galerkin Neural Operators (HGNOs) for approximating solutions to infinite-dimensional PDEs.
  • Establishment of Universal Approximation Theorems (UATs) for functions on Hilbert spaces and their derivatives.
  • Development of numerical methods that minimize PDE residuals across the entire Hilbert space.
  • Successful application of the proposed methods to Kolmogorov and HJB PDEs in optimal control scenarios.
Read more
Kolmogorov-Arnold causal generative models
Alejandro Almodóvar, Mar Elizo, Patricia A. Apellániz, Santiago Zazo, Juan Parras
Generative Models Interpretability Theory
  • Introduction of KaCGM, a causal generative model that enhances interpretability in causal inference.
  • Utilization of Kolmogorov-Arnold Networks (KANs) for parameterizing structural equations.
  • Development of a validation pipeline for assessing model performance using observational data.
  • Demonstration of competitive performance on synthetic and real-world datasets.
Read more
What If Consensus Lies? Selective-Complementary Reinforcement Learning at Test Time
Dong Yan, Jian Liang, Yanbo Wang, Shuo Lu, Ran He, Tieniu Tan
Reinforcement Learning Large Language Models NLP
  • SCRL mitigates label noise amplification in TTRL by enforcing strict consensus criteria.
  • Introduces negative supervision for the first time in TTRL to prune incorrect trajectories.
  • Demonstrates substantial performance improvements over baseline methods in challenging scenarios.
  • Maintains robust generalization and training stability under limited rollout budgets.
Read more
HISR: Hindsight Information Modulated Segmental Process Rewards For Multi-turn Agentic Reinforcement Learning
Zhicong Lu, Zichuan Lin, Wei Jia, Changyuan Tian, Deheng Ye, Peiguang Li, Li Jin, Nayu Liu, Guangluan Xu, Wei Feng
Reinforcement Learning Large Language Models
  • HISR enhances credit assignment in multi-turn RL by aligning rewards with sub-goals.
  • The segment-level process reward model avoids overly fine-grained reward allocation.
  • A hindsight model captures action importance based on trajectory outcomes.
  • Extensive experiments show HISR achieves state-of-the-art performance on benchmark tasks.
Read more
CLaRE-ty Amid Chaos: Quantifying Representational Entanglement to Predict Ripple Effects in LLM Editing
Manit Baser, Alperen Yildiz, Dinil Mon Divakaran, Mohan Gurusamy
Large Language Models NLP Interpretability
  • Introduction of CLARE, a lightweight technique for predicting ripple effects in LLM editing.
  • Achieves 62.2% improvement in predictive accuracy over gradient-based methods.
  • Utilizes a curated corpus of 11,427 facts for systematic analysis of model edits.
  • Significantly faster and more memory-efficient than existing techniques.
Read more
The Y-Combinator for LLMs: Solving Long-Context Rot with λ-Calculus
Amartya Roy, Rasul Tutunov, Xiaotong Ji, Matthieu Zimmer, Haitham Bou-Ammar
Large Language Models Theory Efficient ML
  • Introduction of λ-RLM, a structured framework for long-context reasoning in LLMs.
  • Replacement of arbitrary code generation with a typed functional runtime based on λ-calculus (the fixed-point combinator from the title is sketched below).
  • Formal guarantees of termination, predictable computation, and improved reliability.
  • Empirical results show significant improvements in accuracy and latency over standard RLMs.
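For readers unfamiliar with the combinator in the title, here it is in Python: Y (strictly, its call-by-value variant Z) gives an anonymous function a handle to itself, which is how λ-calculus encodes recursion without named bindings. λ-RLM's typed runtime is far richer; this just unpacks the title's central construct.

```python
# Call-by-value fixed-point combinator (the Z form, needed under Python's
# strict evaluation).
Y = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# A factorial with no self-reference: `rec` is supplied by Y, and the
# function body never names itself.
fact = Y(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
print(fact(10))   # 3628800
```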
Read more
Global Convergence of Multiplicative Updates for the Matrix Mechanism: A Collaborative Proof with Gemini 3
Keith Rush
Optimization Theory
  • Proves global convergence of a multiplicative update iteration for a nuclear norm optimization problem.
  • Demonstrates the utility of AI (Gemini 3) in assisting mathematical proofs.
  • Closes a previously open problem regarding the convergence of fixed-point iterations in private machine learning contexts.
  • Includes a narrative on the collaborative process of using AI in mathematics.
Read more
Integrating Meta-Features with Knowledge Graph Embeddings for Meta-Learning
Antonis Klironomos, Ioannis Dasoulas, Francesco Periti, Mohamed Gad-Elrab, Heiko Paulheim, Anastasia Dimou, Evgeny Kharlamov
Graph Learning
  • KGmetaSP utilizes knowledge graph embeddings to enhance meta-learning tasks.
  • The approach captures dataset-pipeline interactions by integrating past experiment metadata.
  • A large-scale benchmark of 144,177 experiments was created to validate the method.
  • KGmetaSP shows significant improvements in both pipeline performance estimation and dataset similarity estimation.
Read more
Sharpness-Aware Minimization in Logit Space Efficiently Enhances Direct Preference Optimization
Haocheng Luo, Zehang Deng, Thanh-Toan Do, Mehrtash Harandi, Dinh Phung, Trung Le
NLP Large Language Models Optimization
  • Developed a theoretical framework connecting parameter and logit spaces to analyze learning dynamics.
  • Identified the squeezing effect as a result of rapid expansion of residuals along high-curvature directions.
  • Introduced logits-SAM, a computationally efficient variant of SAM that improves DPO performance (the vanilla SAM step it builds on is sketched below).
  • Demonstrated consistent performance gains across multiple datasets and benchmarks.
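For reference, the standard parameter-space SAM step that logits-SAM makes cheaper: climb to the worst-case nearby weights, take the gradient there, then update from the original point. logits-SAM applies the analogous perturbation in logit space; this sketch shows only the vanilla baseline.

```python
import torch

def sam_step(model, loss_fn, batch, opt, rho=0.05):
    # First pass: gradient at the current weights.
    loss_fn(model, batch).backward()
    params = [p for p in model.parameters() if p.grad is not None]
    norm = torch.norm(torch.stack([p.grad.norm() for p in params])) + 1e-12
    with torch.no_grad():
        eps = [rho * p.grad / norm for p in params]
        for p, e in zip(params, eps):
            p.add_(e)                      # climb to the worst-case point
    opt.zero_grad()
    # Second pass: gradient at the perturbed weights drives the real update.
    loss_fn(model, batch).backward()
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)                      # undo the perturbation
    opt.step()
    opt.zero_grad()

model = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
batch = (torch.randn(8, 4), torch.randn(8, 1))
loss_fn = lambda m, b: torch.nn.functional.mse_loss(m(b[0]), b[1])
sam_step(model, loss_fn, batch, opt)
```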
Read more
MolRGen: A Training and Evaluation Setting for De Novo Molecular Generation with Reasoning Models
Philippe Formont, Maxime Darrin, Ismail Ben Ayed, Pablo Piantanida
Large Language Models Reinforcement Learning Generative Models
  • Introduction of MOLRGEN, a large-scale benchmark for de novo molecular generation.
  • Development of a diversity-aware top-k scoring system for evaluating generated molecules.
  • Successful training of a 24B LLM using reinforcement learning for molecular generation.
  • Emphasis on the challenges of exploring chemical space in drug discovery.
Read more
BoundAD: Boundary-Aware Negative Generation for Time Series Anomaly Detection
Xiancheng Wang, Lin Wang, Zhibo Zhang, Rui Wang, Minghang Zhao
Time Series Reinforcement Learning Optimization
  • Introduces a reconstruction-driven framework for generating hard negatives in TSAD.
  • Utilizes reinforcement learning to adaptively control the negative sample generation process.
  • Improves temporal semantic consistency and decision-boundary supervision in anomaly detection.
  • Achieves competitive performance compared to existing TSAD methods.
Read more
Dual Path Attribution: Efficient Attribution for SwiGLU-Transformers through Layer-Wise Target Propagation
Lasse Marten Jantsch, Dong-Jae Koh, Seonghyeon Lee, Young-Kyoon Suh
NLP Large Language Models Interpretability
  • Introduction of Dual Path Attribution (DPA) for efficient model attribution.
  • DPA operates with O(1) time complexity, making it scalable for long sequences.
  • The method decomposes the SwiGLU Transformer into control and content pathways.
  • Extensive experiments show DPA achieves state-of-the-art faithfulness and efficiency.
Read more
RiboSphere: Learning Unified and Efficient Representations of RNA Structures
Zhou Zhang, Hanqun Cao, Cheng Tan, Fang Wu, Pheng Ann Heng, Tianfan Fu
Generative Models Graph Learning Interpretability
  • RiboSphere combines vector quantization and flow matching to learn discrete representations of RNA structures.
  • The framework captures biologically meaningful motifs, enhancing interpretability and generalization.
  • RiboSphere achieves state-of-the-art performance in structure reconstruction and inverse folding tasks.
  • The model demonstrates effective transferability to RNA-ligand binding predictions, even in data-scarce conditions.
Read more
AgenticRS-EnsNAS: Ensemble-Decoupled Self-Evolving Architecture Search
Yun Chen, Moyu Zhang, Jinxin Hu, Yu Zhang, Xiaoyi Zeng
Theory Efficient ML Optimization
  • Introduces Ensemble-Decoupled Architecture Search to reduce validation costs in NAS.
  • Establishes a theoretical condition for ensemble error improvement based on architecture properties.
  • Decouples architecture search from full ensemble training, enabling faster iterations.
  • Categorizes solution strategies for different types of architecture searches.
Read more
Discounted Beta–Bernoulli Reward Estimation for Sample-Efficient Reinforcement Learning with Verifiable Rewards
Haechan Kim, Soohyun Ryu, Gyouk Chu, Doohyuk Jang, Eunho Yang
Reinforcement Learning Large Language Models Efficient ML
  • Introduces Discounted Beta–Bernoulli (DBB) reward estimation to improve sample efficiency in RLVR.
  • DBB leverages historical reward statistics to reduce variance and avoid variance collapse (a discounted Beta–Bernoulli update is sketched below).
  • Empirical results show significant accuracy improvements over naive GRPO methods.
  • DBB achieves lower mean squared error in low-sample scenarios compared to traditional point estimation.
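A minimal sketch of an exponentially discounted Beta-Bernoulli estimator, assuming binary verifiable rewards: the Beta pseudo-counts decay by a factor γ at each update, so the posterior mean tracks a drifting success rate with less variance than a raw per-batch average. The discount and prior values are illustrative, not the paper's settings.

```python
class DiscountedBetaBernoulli:
    def __init__(self, gamma: float = 0.95, prior: float = 1.0):
        self.gamma, self.a, self.b = gamma, prior, prior

    def update(self, reward: int):            # reward in {0, 1}
        self.a = self.gamma * self.a + reward
        self.b = self.gamma * self.b + (1 - reward)

    @property
    def mean(self) -> float:                  # posterior success estimate
        return self.a / (self.a + self.b)

est = DiscountedBetaBernoulli()
for r in [1, 1, 0, 1, 0, 0, 0]:               # recent failures dominate
    est.update(r)
print(round(est.mean, 3))
```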
Read more
Two-Time-Scale Learning Dynamics: A Population View of Neural Network Training
Giacomo Borghi, Hyesung Im, Lorenzo Pareschi
Optimization Theory
  • Introduces a mathematical framework for population-based neural network training dynamics.
  • Establishes connections between population-based learning, bilevel optimization, and replicator-mutator models.
  • Demonstrates the role of noise and diversity in optimizing hyperparameters and model parameters.
  • Validates theoretical results through numerical experiments, highlighting the benefits of effective fitness measures.
Read more
ODySSeI: An Open-Source End-to-End Framework for Automated Detection, Segmentation, and Severity Estimation of Lesions in Invasive Coronary Angiography Images
Anand Choudhary, Xiaowu Sun, Thabo Mahendiran, Ortal Senouf, Denise Auberson, Bernard De Bruyne, Stephane Fournier, Olivier Muller, Emmanuel Abbé, Pascal Frossard, Dorina Thanou
Computer Vision
  • ODySSeI provides an automated solution for lesion detection and severity estimation in ICA images.
  • The Pyramidal Augmentation Scheme (PAS) significantly enhances model performance, especially in complex tasks.
  • The framework achieves high accuracy in estimating lesion severity, with minimal deviation from ground truth.
  • ODySSeI processes ICA images rapidly, making it suitable for real-time clinical applications.
Read more
Target Concept Tuning Improves Extreme Weather Forecasting
Shijie Ren, Xinyue Gu, Ziheng Peng, Haifan Zhang, Peisong Niu, Bo Wu, Xiting Wang, Liang Sun, Jirong Wen
Time Series Interpretability
  • Introduces Target Concept Tuning (TaCT) for fine-tuning deep learning models in extreme weather forecasting.
  • Utilizes Sparse Autoencoders to identify failure-related concepts for targeted model adaptation.
  • Achieves improved forecasting accuracy for typhoons while maintaining performance on other meteorological variables.
  • Reveals model biases through interpretable concepts corresponding to meteorological structures.
Read more
DPxFin: Adaptive Differential Privacy for Anti-Money Laundering Detection via Reputation-Weighted Federated Learning
Renuga Kanagavelu, Manjil Nepal, Ning Peiyan, Cai Kangning, Xu Jiming, Fei Gao, Yong Liu, Goh Siow Mong Rick, Qingsong Wei
Federated Learning
  • Introduction of DPxFin, a reputation-driven differential privacy framework for federated learning in finance.
  • Dynamic adjustment of differential privacy noise based on client reputation enhances model utility and privacy (a reputation-scaled noise rule is sketched below).
  • Extensive experiments show improved performance in fraud detection on AML datasets, particularly under non-IID conditions.
  • DPxFin effectively mitigates risks of data leakage, proving its robustness in financial applications.
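A sketch of reputation-scaled differential-privacy noise in federated averaging, under assumed rules (clip-then-noise with the noise scale inversely tied to reputation, plus reputation-weighted aggregation); DPxFin's actual calibration will differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def private_update(update, reputation, clip=1.0, base_sigma=1.0):
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip / (norm + 1e-12))   # bound sensitivity
    sigma = base_sigma / max(reputation, 0.1)            # trusted => less noise
    return clipped + rng.normal(0.0, sigma * clip, update.shape)

updates = [rng.standard_normal(10) for _ in range(5)]    # client deltas
reputations = np.array([0.9, 0.8, 0.4, 0.95, 0.2])

noisy = [private_update(u, r) for u, r in zip(updates, reputations)]
w = reputations / reputations.sum()                      # reputation weights
global_update = sum(wi * ui for wi, ui in zip(w, noisy))
print(global_update)
```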
Read more
Self-Tuning Sparse Attention: Multi-Fidelity Hyperparameter Optimization for Transformer Acceleration
Arundhathi Dev, Justin Zhan
NLP Large Language Models Efficient ML
  • AFBS-BO automates hyperparameter tuning for sparse attention, eliminating the need for manual grid search.
  • The framework achieves 3.4× faster hyperparameter discovery with 8.8× fewer evaluations than traditional methods.
  • Configurations discovered by AFBS-BO outperform existing sparse attention baselines while closely matching dense attention quality.
  • The method leverages multi-fidelity evaluation to efficiently explore hyperparameter spaces.
Read more
FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment
Kewen Zhu, Liping Yi, Zhiming Zhao, Zhuang Qi, Han Yu, Qinghua Hu
Large Language Models Federated Learning Optimization
  • FedPDPO is the first framework for aligning LLMs with human preferences in federated learning while preserving privacy.
  • The framework utilizes a frozen LLM backbone with a shared LoRA adapter and personalized client-specific heads to address non-IID data challenges.
  • A personalized DPO training strategy is introduced to enhance generalization and mitigate the limitations of implicit rewards.
  • The proposed bottleneck adapter effectively bridges global and local knowledge, improving model performance.
Read more
DeepStock: Reinforcement Learning with Policy Regularizations for Inventory Management
Yaqi Xie, Xinru Hao, Jiaxi Liu, Will Ma, Linwei Xin, Lei Cao, Yidong Zhang
Reinforcement Learning Optimization
  • DeepStock integrates classical inventory management concepts into DRL to enhance performance.
  • Policy regularizations significantly reduce hyperparameter tuning time and improve training outcomes.
  • The approach has been successfully deployed in a real-world setting, managing inventory for Alibaba's Tmall.
  • Synthetic experiments prompt a re-evaluation of which DRL methods perform best for inventory management.
Read more
Fine-tuning Timeseries Predictors Using Reinforcement Learning
Hugo Cazaux, Ralph Rudd, Hlynur Stefánsson, Sverrir Ólafsson, Eyjólfur Ingi Ásgeirsson
Reinforcement Learning Time Series
  • Reinforcement learning can enhance the performance of pre-trained time series predictors.
  • The proposed fine-tuning methodology eliminates the need for human feedback, making it cost-effective.
  • The study demonstrates the transfer learning properties of fine-tuned models.
  • A systematic implementation plan for RL fine-tuning is provided for practitioners.
Read more
Online Learning and Equilibrium Computation with Ranking Feedback
Mingyang Liu, Yongshan Chen, Zhiyuan Fan, Gabriele Farina, Asuman Ozdaglar, Kaiqing Zhang
Theory Optimization
  • Sublinear regret is unattainable with instantaneous utility ranking feedback.
  • Sublinear regret can be achieved under time-average utility ranking feedback with certain assumptions.
  • The proposed algorithms yield approximate coarse correlated equilibria in normal-form games.
  • The study highlights the relevance of ranking feedback in real-world applications, such as recommendation systems.
Read more
Attack by Unlearning: Unlearning-Induced Adversarial Attacks on Graph Neural Networks
Jiahao Zhang, Yilong Wang, Suhang Wang
Graph Learning Optimization Theory
  • Introduction of unlearning corruption attacks that exploit graph unlearning processes.
  • Formulation of the attack as a bi-level optimization problem to address technical challenges.
  • Demonstration of significant accuracy degradation in GNNs due to carefully crafted unlearning requests.
  • Highlighting the stealthy nature of these attacks, which can evade detection during training.
Read more
Off-Policy Learning with Limited Supply
Koichi Tanaka, Ren Kishimoto, Bushun Kawagishi, Yusuke Narita, Yasuo Yamamoto, Nobuyuki Shimizu, Yuta Saito
Reinforcement Learning Theory Optimization
  • Conventional greedy OPL methods are suboptimal in limited supply scenarios.
  • Theoretical proof exists that superior policies can be developed under supply constraints.
  • OPLS focuses on relative expected rewards to improve item allocation efficiency.
  • Empirical results demonstrate OPLS's superiority over traditional OPL methods.
Read more
MSNet and LS-Net: Scalable Multi-Scale Multi-Representation Networks for Time Series Classification
Celal Alagöz, Mehmet Kurnaz, Farhan Aadil
Time Series
  • Introduction of MSNet and LS-Net for scalable time series classification.
  • Demonstrated the importance of structured multi-representation inputs for improved performance.
  • MSNet achieves superior calibration, while LiteMV has the highest accuracy.
  • LS-Net offers a favorable efficiency-accuracy trade-off, suitable for resource-constrained environments.
Read more
SHAPCA: Consistent and Interpretable Explanations for Machine Learning Models on Spectroscopy Data
Mingxing Zhang, Nicola Rossberg, Simone Innocente, Katarzyna Komolibus, Rekha Gautam, Barry O'Sullivan, Luca Longo, Andrea Visentin
Interpretability
  • SHAPCA combines PCA and SHAP to enhance interpretability of machine learning models on spectroscopy data.
  • The method provides explanations in the original input space, facilitating better understanding for practitioners.
  • Numerical analysis shows improved consistency of feature importance across repeated model training.
  • The framework allows for both global and local analysis of model predictions.
Read more
From Inference Efficiency to Embodied Efficiency: Revisiting Efficiency Metrics for Vision-Language-Action Models
Zhuofan Li, Hongkun Yang, Zhenyang Chen, Yangxuan Chen, Yingyan (Celine) Lin, Chaojian Li
Robotics Multimodal Efficient ML
  • Conventional efficiency metrics for VLA models do not capture real-world performance on robotic platforms.
  • Embodied efficiency metrics provide a more accurate assessment of robotic execution behaviors.
  • Reducing computational costs can lead to increased end-to-end execution time and degraded motion quality.
  • Common adaptation techniques show limited improvements in embodied efficiency and may involve trade-offs.
Read more
Engineering Verifiable Modularity in Transformers via Per-Layer Supervision
J. Clayton Kerce
NLP Large Language Models Interpretability
  • Introduces per-layer supervision to enhance modularity in transformer models.
  • Demonstrates that per-layer supervision leads to significantly larger ablation effects compared to standard training.
  • Establishes a methodology for capturing computational dynamics independent of vocabulary structure.
  • Validates the approach through causal experiments showing functional reorganization in attention heads.
Read more
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization
Chiyu Ma, Shuo Yang, Kexin Huang, Jinda Lu, Haoming Meng, Shangshang Wang, Bolin Ding, Soroush Vosoughi, Guoyin Wang, Jingren Zhou
NLP Large Language Models Reinforcement Learning
  • FIPO enhances reasoning in LLMs by addressing limitations of uniform reward systems.
  • The algorithm incorporates future-KL divergence for more granular credit assignment.
  • FIPO significantly increases reasoning chain lengths and accuracy on benchmarks.
  • The approach outperforms existing models, demonstrating its effectiveness.
Read more
Ternary Gamma Semirings: From Neural Implementation to Categorical Foundations
Ruoqi Sun
Theory
  • Standard neural networks fail at compositional generalization tasks, achieving 0% accuracy.
  • Introducing Ternary Gamma Semirings allows neural networks to achieve 100% accuracy on novel combinations.
  • The learned feature space corresponds to a unique algebraic structure classified in mathematics.
  • Neural networks' generalization capabilities stem from their internalization of algebraic axioms.
Read more