AI-generated summaries

Today's ML research,
without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

48 Papers today
8h Update frequency
7 Days of history
MGDA-Decoupled: Geometry-Aware Multi-Objective Optimisation for DPO-based LLM Alignment
Andor Vári-Kakas, Ji Won Park, Natasa Tagasovska
NLP Large Language Models Optimization
  • MGDA-DECOUPLED offers a geometry-based approach to multi-objective optimization for LLM alignment.
  • The method addresses procedural unfairness by considering the convergence dynamics of each objective.
  • It operates within the lightweight DPO paradigm, avoiding the complexities of reinforcement learning.
  • Experiments show that MGDA-DECOUPLED achieves superior performance in aligning LLMs with human values.
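To make the underlying optimisation idea concrete, here is a minimal sketch of the classical two-objective MGDA min-norm update, the textbook building block this line of work starts from; the paper's decoupled, geometry-aware weighting of per-objective gradients is not reproduced here, and all variable names are illustrative.

```python
import numpy as np

def mgda_two_objective_direction(g1, g2):
    """Min-norm convex combination of two objective gradients (classical MGDA).

    Solves min_{gamma in [0,1]} || gamma*g1 + (1-gamma)*g2 ||^2 in closed form,
    giving a direction that is a descent direction for both objectives.
    """
    diff = g1 - g2
    denom = np.dot(diff, diff)
    if denom < 1e-12:                     # gradients (nearly) identical
        return g1
    gamma = np.clip(np.dot(g2 - g1, g2) / denom, 0.0, 1.0)
    return gamma * g1 + (1.0 - gamma) * g2

# Toy usage: two conflicting objectives on a 2-D parameter vector.
g_helpful = np.array([1.0, 0.2])    # hypothetical "helpfulness" gradient
g_harmless = np.array([-0.5, 1.0])  # hypothetical "harmlessness" gradient
print(mgda_two_objective_direction(g_helpful, g_harmless))
```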
Read more
mcdok at SemEval-2026 Task 13: Finetuning LLMs for Detection of Machine-Generated Code
Adam Skurla, Dominik Macko, Jakub Simko
Large Language Models NLP Generative Models
  • The mcdok system was developed for detecting machine-generated code in a multi-domain and multi-language context.
  • The system was adapted from the existing mdok approach, focusing on code understanding through appropriate model selection.
  • Three subtasks were addressed: binary detection, authorship attribution, and hybrid code detection.
  • Results indicate competitive performance, but significant margins remain compared to leading systems, suggesting potential for further enhancements.
Read more
Differentially Private Model Merging
Qichuan Yin, Manzil Zaheer, Tian Li
Theory Efficient ML Federated Learning
  • Introduces two data-independent algorithms for merging private models: random selection and linear combination.
  • Provides tailored privacy accounting using Rényi differential privacy and privacy loss distributions.
  • Demonstrates the superiority of linear combination over random selection in a case study on mean estimation.
  • Empirical results validate the effectiveness of the proposed methods on various datasets.
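As a rough illustration of the second strategy (and not the paper's privacy-accounted algorithm), merging already-trained models by a fixed linear combination of their parameters can look like the sketch below; the parameter names and coefficients are hypothetical.

```python
import numpy as np

def merge_models_linear(weight_dicts, coeffs):
    """Merge several models' parameters via a fixed linear combination.

    weight_dicts: list of {param_name: np.ndarray}, one per (privately trained) model.
    coeffs: list of scalars, typically summing to 1.
    """
    assert len(weight_dicts) == len(coeffs)
    merged = {}
    for name in weight_dicts[0]:
        merged[name] = sum(c * w[name] for c, w in zip(coeffs, weight_dicts))
    return merged

# Toy usage with two 'models' sharing the same parameter shapes.
m1 = {"layer.weight": np.ones((2, 2)), "layer.bias": np.zeros(2)}
m2 = {"layer.weight": 3 * np.ones((2, 2)), "layer.bias": np.ones(2)}
print(merge_models_linear([m1, m2], [0.5, 0.5])["layer.weight"])
```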
Read more
Task-specific Subnetwork Discovery in Reinforcement Learning for Autonomous Underwater Navigation
Yi-Ling Liu, Melvin Laux, Mariela De Lucas Alvarez, Frank Kirchner, Rebecca Adam
Reinforcement Learning Robotics Interpretability
  • Multi-task reinforcement learning (MTRL) successfully reuses knowledge shared across the navigation tasks.
  • Only a small fraction of network weights are task-specific, suggesting minimal specialization is needed.
  • Context variables play a crucial role in differentiating tasks within MTRL.
Read more
Transferable SCF-Acceleration through Solver-Aligned Initialization Learning
Eike S. Eberhard, Viktor Kotsev, Timm Güthle, Stephan Günnemann
Optimization Efficient ML Theory
  • SAIL addresses the supervision problem in ML models for SCF initialization, improving convergence rates.
  • The Effective Relative Iteration Count (ERIC) is introduced as a more accurate performance metric for SCF calculations.
  • SAIL achieves significant reductions in ERIC across various molecular sizes, outperforming previous methods.
  • The method extends the applicability of ML acceleration techniques to larger molecules, enhancing computational efficiency.
Read more
Unsupervised Learning of Inter-Object Relationships via Group Homomorphism
Kyotaro Ushida, Takayuki Komatsu, Yoshiyuki Ohmura, Yasuo Kuniyoshi
Computer Vision Theory Robotics
  • Introduces an unsupervised learning method based on group homomorphism to model inter-object relationships.
  • Demonstrates the ability to segment multiple objects and extract motion laws without ground-truth labels.
  • Establishes a one-dimensional additive latent space for mapping relative movements between objects.
  • Highlights the importance of algebraic geometric constraints in achieving disentangled representations.
Read more
Improving Performance in Classification Tasks with LCEN and the Weighted Focal Differentiable MCC Loss
Pedro Seber, Richard D. Braatz
Interpretability
  • The modified LCEN algorithm is effective for classification tasks, maintaining interpretability and sparsity.
  • LCEN consistently outperforms ten other models in terms of macro F1 score and MCC across multiple datasets.
  • The diffMCC loss function leads to better performance compared to traditional weighted cross-entropy loss.
  • LCEN eliminates 56% of input features on average, enhancing model interpretability.
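For intuition, one common way to make MCC differentiable for binary classification is to build soft confusion-matrix counts from predicted probabilities; the sketch below shows that generic construction, which may differ from the paper's weighted focal diffMCC loss.

```python
import torch

def soft_mcc_loss(probs, targets, eps=1e-8):
    """Differentiable 1 - MCC for binary classification.

    probs:   predicted probabilities of the positive class, shape (N,)
    targets: ground-truth labels in {0, 1}, shape (N,)
    Confusion-matrix counts are computed 'softly' from probabilities,
    so the loss is differentiable with respect to the model outputs.
    """
    targets = targets.float()
    tp = (probs * targets).sum()
    fp = (probs * (1 - targets)).sum()
    fn = ((1 - probs) * targets).sum()
    tn = ((1 - probs) * (1 - targets)).sum()
    numerator = tp * tn - fp * fn
    denominator = torch.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn) + eps)
    return 1.0 - numerator / denominator

# Toy usage: logits from some model, labels for a small batch.
logits = torch.tensor([2.0, -1.0, 0.5, -0.3], requires_grad=True)
labels = torch.tensor([1, 0, 1, 0])
loss = soft_mcc_loss(torch.sigmoid(logits), labels)
loss.backward()
print(float(loss), logits.grad)
```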
Read more
The Origin of Edge of Stability
Elon Litman
Optimization Theory
  • Introduces the concept of edge coupling to explain the Edge of Stability in gradient descent.
  • Derives a recurrence relation and loss-change formula that forces Hessian eigenvalues towards 2/η.
  • Classifies fixed points and period-two orbits, providing insights into the dynamics of convergence.
  • Extends findings to mini-batch SGD and continuous time, indicating broader implications.
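A quick way to see why 2/η is the critical sharpness: for a quadratic with curvature λ, gradient descent with step size η contracts when λ < 2/η and oscillates or diverges beyond it. The numeric check below is independent of the paper's edge-coupling analysis and uses made-up values.

```python
import numpy as np

def gd_on_quadratic(curvature, lr, x0=1.0, steps=20):
    """Run gradient descent on f(x) = 0.5 * curvature * x^2 and return final |x|."""
    x = x0
    for _ in range(steps):
        x = x - lr * curvature * x        # x_{t+1} = (1 - lr * curvature) * x_t
    return abs(x)

lr = 0.1                                  # critical sharpness is 2 / lr = 20
for curvature in [10.0, 19.5, 20.5]:      # below, near, and above 2 / lr
    print(curvature, gd_on_quadratic(curvature, lr))
```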
Read more
Even More Guarantees for Variational Inference in the Presence of Symmetries
Lena Zellinger, Antonio Vergari
Theory Optimization
  • The paper extends previous results on variational inference under symmetries to include FKL and α-divergences.
  • New sufficient conditions are derived for the exact recovery of the mean using FKL and α-divergences.
  • The authors provide insights into the practical implications of their theoretical findings for choosing variational families.
  • The study discusses potential optimization failures when sufficient conditions are not satisfied.
Read more
Fairness under uncertainty in sequential decisions
Michelle Seng Ah Lee, Kirtan Padh, David Watson, Niki Kilbertus, Jatinder Singh
Reinforcement Learning Theory Optimization
  • Introduces a taxonomy of uncertainty in sequential decision-making: model, feedback, and prediction uncertainty.
  • Formalizes uncertainties using counterfactual logic and reinforcement learning techniques.
  • Demonstrates the potential harms of naive decision-making policies that ignore unobserved outcomes.
  • Shows that uncertainty-aware exploration can improve fairness metrics in sequential decision systems.
Read more
Causal-Transformer with Adaptive Mutation-Locking for Early Prediction of Acute Kidney Injury
Weizhi Nie, Haolin Chen, Huifang Hao, Yuting Su, Keliang Xie, Bo Yang
Time Series Interpretability
  • CT-Former effectively models irregular clinical intervals without biased data imputation.
  • The Causal-Attention module provides transparent causal pathways linking historical physiological events to current predictions.
  • CT-Former significantly outperforms existing models in predicting AKI, as validated by extensive experiments.
  • The model enhances clinical interpretability, addressing the black-box nature of traditional deep learning approaches.
Read more
A Hybridizable Neural Time Integrator for Stable Autoregressive Forecasting
Brooks Kinch, Xiaozhe Hu, Yilong Huang, Martine Dyring Hansen, Sunniva Meltzer, Nathaniel Donald Hamlin, David Sirajuddin, Eric C. Cyr, Nathaniel Trask
Time Series Theory Efficient ML
  • Introduces a hybrid autoregressive model combining transformers with mixed finite element methods for stability.
  • Proves preservation of discrete energies and uniform gradient bounds, addressing the exploding gradient problem.
  • Achieves a 65× reduction in model parameters while outperforming state-of-the-art models in chaotic system forecasting.
  • Demonstrates a significant speedup in real-time simulations, enabling efficient design iterations.
Read more
Maximum Entropy Semi-Supervised Inverse Reinforcement Learning
Julien Audiffren, Michal Valko, Alessandro Lazaric, Mohammad Ghavamzadeh
Reinforcement Learning Robotics Theory
  • Introduction of MESSI, a new algorithm combining MaxEnt-IRL with semi-supervised learning principles.
  • Effective integration of unsupervised trajectories into the MaxEnt-IRL framework to resolve policy ambiguity.
  • Empirical results indicate significant performance improvements over traditional MaxEnt-IRL.
  • Addresses limitations of previous semi-supervised apprenticeship learning methods.
Read more
HARBOR: Automated Harness Optimization
Biswa Sengupta, Jinhua Wang
Large Language Models Optimization
  • Harness design is a significant factor in the performance of long-horizon language-model agents.
  • Automated configuration search is more effective than manual tuning as the flag space increases.
  • HARBOR formalizes harness optimization as constrained noisy Bayesian optimization.
  • The case study demonstrates the limitations of manual tuning, with only one successful tuning round out of four.
Read more
Physics-Guided Dimension Reduction for Simulation-Free Operator Learning of Stiff Differential–Algebraic Systems
Huy Hoang Le, Haoguang Wang, Christian Moya, Marcos Netto, Guang Lin
Theory Optimization Efficient ML
  • Introduces an extended Newton implicit layer for enforcing algebraic constraints and quasi-steady-state conditions.
  • Achieves significant dimension reduction by focusing only on slow states, improving computational efficiency.
  • Demonstrates superior performance on stiff DAE problems compared to traditional soft and hard constraint methods.
  • Extends the methodology to multi-component systems with provable convergence.
Read more
Transparent Screening for LLM Inference and Training Impacts
Arnault Pachot, Thierry Petit
Large Language Models
  • Introduces a transparent screening framework for estimating LLM impacts.
  • Develops a bounded multi-factor proxy methodology for inference and training estimates.
  • Provides an operational implementation through the ImpactLLM Observatory covering 41 models.
  • Emphasizes the importance of transparency and reproducibility in environmental impact assessments.
Read more
Sub-Token Routing in LoRA for Adaptation and Query-Aware KV Compression
Wei Jiang, Wei Wang
NLP Large Language Models Efficient ML
  • Introduces sub-token routing for finer control in transformer efficiency.
  • Presents a query-independent design that improves language modeling quality.
  • Develops a query-aware design that preserves downstream performance with reduced KV budgets.
  • Demonstrates the complementary nature of token-level and sub-token-level routing.
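For context, the vanilla LoRA update that this work refines to sub-token granularity adds a trainable low-rank correction B·A to a frozen weight. The sketch below shows only that standard building block; the paper's sub-token router and KV-compression design are not reproduced.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank (LoRA) correction."""

    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)   # pretrained weight stays frozen
        self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # y = base(x) + scale * x A^T B^T  (low-rank update applied per token)
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

# Toy usage: a batch of 4 tokens with hidden size 32.
layer = LoRALinear(32, 32, rank=4)
print(layer(torch.randn(4, 32)).shape)
```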
Read more
Domain-Aware Hierarchical Contrastive Learning for Semi-Supervised Generalization Fault Diagnosis
Junyu Ren, Wensheng Gan, Philip S Yu
Time Series
  • Introduces DAHCL framework to improve fault diagnosis under unseen conditions.
  • Addresses cross-domain pseudo-label bias by incorporating domain-specific geometric characteristics.
  • Utilizes uncertain samples effectively through fuzzy contrastive supervision.
  • Evaluated under realistic noisy conditions to reflect practical industrial scenarios.
Read more
Early Detection of Latent Microstructure Regimes in Limit Order Books
Prakul Sunil Hiremath, Vruksha Arun Hiremath
Time Series Theory
  • Introduces a causal regime model for limit order books with identifiable latent build-up phases.
  • Derives theoretical guarantees for detection lead-time and probability of early detection.
  • Proposes a novel trigger-based detection method that outperforms traditional reactive signals.
  • Demonstrates empirical effectiveness through extensive simulations and preliminary real-data applications.
Read more
Robustness of Spatio-temporal Graph Neural Networks for Fault Location in Partially Observable Distribution Grids
Burak Karabulut, Carlo Manna, Chris Develder
Graph Learning Time Series
  • Introduces a new graph-forming strategy for GNNs that utilizes only measured buses.
  • Develops STGNN models based on GraphSAGE and improved GATv2 for fault location.
  • Demonstrates significant performance improvements over traditional RNN baselines.
  • Shows that the measured-only topology reduces training time and enhances model robustness.
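As background on the GNN layer involved, a GraphSAGE mean-aggregation step combines each node's own features with the mean of its neighbours' features. The sketch below illustrates only that textbook step on a tiny made-up bus graph; the paper's measured-only topology construction and GATv2 variant are not shown.

```python
import numpy as np

def graphsage_mean_layer(H, adj, W_self, W_neigh):
    """One GraphSAGE layer with mean aggregation:
    h_v' = ReLU(W_self h_v + W_neigh * mean_{u in N(v)} h_u)."""
    deg = adj.sum(axis=1, keepdims=True)
    neigh_mean = (adj @ H) / np.maximum(deg, 1)   # safe for isolated nodes
    out = H @ W_self.T + neigh_mean @ W_neigh.T
    return np.maximum(out, 0.0)                   # ReLU

# Toy usage: 4 'measured buses' with 3 features each on a small line graph.
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
W_self, W_neigh = rng.normal(size=(8, 3)), rng.normal(size=(8, 3))
print(graphsage_mean_layer(H, adj, W_self, W_neigh).shape)   # (4, 8)
```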
Read more
Clinically Interpretable Sepsis Early Warning via LLM-Guided Simulation of Temporal Physiological Dynamics
Weizhi Nie, Zhen Qu, Weijie Wang, Chunpei Li, Ke Lu, Bingyang Zhou, Hongzhi Yu
Large Language Models Time Series Interpretability
  • Introduces a novel LLM-guided framework for sepsis early warning.
  • Combines spatiotemporal feature extraction with clinical reasoning prompts.
  • Achieves superior predictive performance compared to traditional models.
  • Provides interpretable predictions that align with clinical judgment.
Read more
FedSIR: Spectral Client Identification and Relabeling for Federated Learning with Noisy Labels
Sina Gholami, Abdulmoneam Ali, Tania Haghighi, Ahmed Arafa, Minhaj Nur Alam
Federated Learning
  • FedSIR addresses noisy labels in federated learning through spectral analysis of client feature representations.
  • The framework includes a mechanism for identifying clean and noisy clients with minimal communication overhead.
  • It employs a relabeling scheme that allows noisy clients to correct their labels based on spectral references from clean clients.
  • The integration of noise-aware optimization techniques enhances the stability of training under label noise.
Read more
ACT: Anti-Crosstalk Learning for Cross-Sectional Stock Ranking via Temporal Disentanglement and Structural Purification
Juntao Li, Liang Zhang
Graph Learning Time Series
  • Identification of crosstalk as a critical issue in graph-based stock ranking.
  • Introduction of the ACT framework to systematically address temporal-scale and structural crosstalk.
  • Utilization of Temporal Component Decomposition (TCD) for disentangling stock sequences.
  • Implementation of a Progressive Structural Purification Encoder for structural crosstalk mitigation.
Read more
Generative Augmentation of Imbalanced Flight Records for Flight Diversion Prediction: A Multi-objective Optimisation Framework
Karim Aly, Alexei Sharpanskykh, Jacco Hoekstra
Generative Models Optimization
  • Introduces a multi-objective optimisation framework for hyperparameter tuning of generative models in the context of rare flight diversion events.
  • Demonstrates the necessity of a comprehensive evaluation framework for assessing synthetic data quality beyond single metrics.
  • Shows that models trained on a combination of real and synthetic data significantly outperform those trained only on real data.
  • Explores the impact of different augmentation sizes on the predictive quality of rare event predictions.
Read more
Dilated CNNs for Periodic Signal Processing: A Low-Complexity Approach
Eli Gildish, Michael Grebshtein, Igor Makienko
Time Series Efficient ML Audio & Speech
  • R-DCNN allows for denoising of periodic signals with varying frequencies using a single training observation.
  • The method significantly reduces computational complexity compared to traditional deep learning and classical autoregressive methods.
  • R-DCNN is optimized for low-power applications, making it suitable for IoT and edge devices.
  • The approach maintains high accuracy in signal denoising without the need for retraining on new observations.
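As background on the core building block, a dilated 1-D convolution widens the receptive field without extra parameters by spacing the kernel taps dilation samples apart. The sketch below shows only that generic layer; the R-DCNN architecture and training recipe are not reproduced, and the layer sizes are arbitrary.

```python
import math
import torch
import torch.nn as nn

kernel_size, dilation = 3, 4
# 'Same' padding for stride 1: dilation * (kernel_size - 1) // 2
conv = nn.Conv1d(in_channels=1, out_channels=8,
                 kernel_size=kernel_size, dilation=dilation,
                 padding=dilation * (kernel_size - 1) // 2)

# Each output sample now sees taps 4 samples apart, i.e. a receptive
# field of 1 + dilation * (kernel_size - 1) = 9 input samples.
signal = torch.sin(torch.linspace(0, 8 * math.pi, 256)).reshape(1, 1, -1)
print(conv(signal).shape)   # torch.Size([1, 8, 256])
```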
Read more
Efficient Test-Time Inference via Deterministic Exploration of Truncated Decoding Trees
Xueyan Li, Johannes Zenn, Ekaterina Fadeeva, Guinan Su, Mrinmaya Sachan, Jonas Geiping
NLP Large Language Models Efficient ML
  • DLE is a deterministic method that replaces stochastic sampling in inference tasks.
  • It systematically explores previously unvisited high-probability branches, improving coverage of the search space.
  • DLE reduces redundant token generation, leading to more efficient use of computational resources.
  • Empirical results show that DLE achieves better performance on math, coding, and general reasoning tasks compared to self-consistency.
Read more
An effective variant of the Hartigan $k$-means algorithm
François Clément, Stefan Steinerberger
Optimization Theory Efficient ML
  • Smartigan refines Hartigan's algorithm, yielding an additional 2-5% improvement in clustering performance.
  • The algorithm encourages exploration in the initial stages of clustering, leading to better convergence.
  • Smartigan maintains theoretical stability guarantees similar to Hartigan's method.
  • Empirical results show that Smartigan consistently outperforms both Hartigan and Lloyd's algorithms.
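For reference, the Hartigan-style step that Smartigan builds on reassigns one point at a time whenever the move strictly reduces the k-means objective, using the exact change in both clusters' costs. The sketch below shows one plain Hartigan sweep under that standard formulation; Smartigan's exploration schedule is not shown, and the toy data are made up.

```python
import numpy as np

def hartigan_pass(X, labels, k):
    """One Hartigan-style sweep: move a point to another cluster whenever the
    move strictly decreases the k-means objective (means recomputed each step
    for clarity; real implementations update them incrementally)."""
    for i, x in enumerate(X):
        counts = np.bincount(labels, minlength=k)
        means = np.vstack([X[labels == j].mean(axis=0) for j in range(k)])
        a = labels[i]
        if counts[a] <= 1:
            continue
        # Exact objective change of moving x from cluster a to cluster b:
        #   n_b/(n_b+1) * ||x - m_b||^2  -  n_a/(n_a-1) * ||x - m_a||^2
        removal_gain = counts[a] / (counts[a] - 1) * np.sum((x - means[a]) ** 2)
        insertion_cost = counts / (counts + 1) * np.sum((x - means) ** 2, axis=1)
        insertion_cost[a] = np.inf          # only consider genuine moves
        b = int(np.argmin(insertion_cost))
        if insertion_cost[b] < removal_gain:
            labels[i] = b
    return labels

# Toy usage: two well-separated 2-D blobs with deliberately mixed initial labels.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
labels = np.arange(40) % 2
print(hartigan_pass(X, labels.copy(), k=2))
```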
Read more
Data-Driven Open-Loop Simulation for Digital-Twin Operator Decision Support in Wastewater Treatment
Gary Simethy, Daniel Ortiz Arroyo, Petar Durdevic
Time Series
  • CCSS-RS effectively simulates WWTP responses under various control scenarios while managing irregular and missing data.
  • The model achieves a 40-46% reduction in RMSE compared to Neural CDE baselines, showcasing its superior predictive capabilities.
  • Operational case studies highlight the model's practical utility in real-world decision-making for wastewater treatment operators.
  • The architecture of CCSS-RS is tailored to the specific data conditions of full-scale WWTP operations, avoiding the need for recalibration of mechanistic models.
Read more
JEPAMatch: Geometric Representation Shaping for Semi-Supervised Learning
Ali Aghababaei-Harandi, Aude Sportisse, Massih-Reza Amini
Computer Vision Theory Efficient ML
  • Introduces JEPAMatch, a new semi-supervised learning framework that enhances representation learning.
  • Addresses class imbalance and convergence speed issues prevalent in existing methods like FixMatch.
  • Utilizes a latent-space regularization term to promote isotropic Gaussian structures in the representation space.
  • Demonstrates superior performance on benchmark datasets, achieving faster convergence and reduced computational costs.
Read more
Tokenised Flow Matching for Hierarchical Simulation Based Inference
Giovanni Charles, Cosmo Santoni, Seth Flaxman, Elizaveta Semenova
Efficient ML Theory
  • Introduces Tokenised Flow Matching for Posterior Estimation (TFMPE) to enhance simulation efficiency in hierarchical SBI.
  • Utilizes likelihood factorisation to train from single-site simulations, reducing the need for multiple simulator evaluations.
  • Validates the proposed method on a new benchmark and real-world models, showing improved calibration and reduced computational costs.
  • Addresses the practical bottleneck of simulator evaluations in hierarchical settings with shared parameters.
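For orientation, the standard (untokenised) conditional flow-matching objective regresses a velocity network onto straight-line paths between noise and data. The sketch below shows that textbook loss only; TFMPE's tokenisation and hierarchical factorisation are not reproduced, and the network and data are toy placeholders.

```python
import torch
import torch.nn as nn

velocity_net = nn.Sequential(            # v_theta(x_t, t): maps (x_t, t) to a velocity
    nn.Linear(2 + 1, 64), nn.ReLU(), nn.Linear(64, 2))

def flow_matching_loss(x1, velocity_net):
    """Conditional flow matching with linear paths x_t = (1-t)*x0 + t*x1.

    x0 is Gaussian noise, x1 is data; the regression target is the constant
    path velocity x1 - x0.
    """
    x0 = torch.randn_like(x1)
    t = torch.rand(x1.shape[0], 1)
    xt = (1 - t) * x0 + t * x1
    target_velocity = x1 - x0
    pred = velocity_net(torch.cat([xt, t], dim=-1))
    return ((pred - target_velocity) ** 2).mean()

# Toy usage: a batch of 2-D points standing in for posterior samples.
x1 = torch.randn(128, 2) * 0.5 + 1.0
loss = flow_matching_loss(x1, velocity_net)
loss.backward()
print(float(loss))
```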
Read more
Variance Is Not Importance: Structural Analysis of Transformer Compressibility Across Model Scales
Samuel Salfati
NLP Large Language Models Efficient ML
  • High-variance activation directions are not indicative of importance for model predictions.
  • Block linearity is conditional and can be disrupted by changes in earlier blocks.
  • Direct quantization is more effective than weight factorization for reducing errors.
  • Linearity in transformer blocks increases with depth, indicating a shift from nonlinear to linear processing.
Read more
Drug Synergy Prediction via Residual Graph Isomorphism Networks and Attention Mechanisms
Jiyan Song, Wenyang Wang, Chengcheng Yan, Zhiquan Han, Feifei Zhao
Graph Learning
  • Introduces ResGIN-Att, a novel model for drug synergy prediction.
  • Integrates molecular features and genomic profiles with drug-drug interactions.
  • Utilizes residual connections to mitigate over-smoothing in deep layers.
  • Employs a cross-attention mechanism for improved interpretability.
Read more
DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data
Venus Team, Sunhao Dai, Yong Deng, Jinzhen Lin, Yusheng Song, Guoqing Wang, Xiaofeng Wu, Yuqi Zhou, Shuo Yang, Zhenzhe Ying, Zhanwei Zhang, Changhua Meng, Weiqiang Wang
NLP Reinforcement Learning Efficient ML
  • DR-Venus is a 4B parameter deep research agent trained entirely on 10K open data.
  • The training methodology includes a two-stage process: supervised fine-tuning followed by reinforcement learning.
  • DR-Venus outperforms existing models with 9B parameters and narrows the performance gap with larger 30B-class systems.
  • The study highlights the importance of data quality and effective utilization in training small models.
Read more
A Delta-Aware Orchestration Framework for Scalable Multi-Agent Edge Computing
Samaresh Kumar Singh, Joyjit Roy
Reinforcement Learning Optimization Efficient ML
  • DAOEF addresses the performance degradation in multi-agent systems when scaling beyond 100 agents.
  • The framework integrates three mechanisms that work synergistically to improve efficiency.
  • Controlled experiments validate the interdependence of the mechanisms, showing that removing any one increases latency significantly.
  • DAOEF achieves a 62% reduction in latency and 62% energy savings in a 200-agent cloud deployment.
Read more
Closing the Domain Gap in Biomedical Imaging by In-Context Control Samples
Ana Sanchez-Fernandez, Thomas Pinetz, Werner Zellinger, Günter Klambauer
Computer Vision
  • Batch effects significantly degrade model performance in biomedical imaging.
  • CS-ARM-BN is a novel meta-learning method that utilizes negative control samples for adaptation.
  • The proposed method achieves a high accuracy of 0.935 ± 0.018 on new experimental batches.
  • Traditional deep learning models fail to generalize across different batches, highlighting the need for effective domain adaptation.
Read more
A Hierarchical MARL-Based Approach for Coordinated Retail P2P Trading and Wholesale Market Participation of DERs
Patrick Wilk, Ethan Cantor, Yikui Liu, Jie Li
Reinforcement Learning Optimization
  • Proposes a hierarchical MARL framework for DER participation in electricity markets.
  • Facilitates P2P trading among prosumers to enhance market efficiency.
  • Utilizes a Stackelberg game model for coordination of market participation.
  • Addresses challenges of integrating DERs into existing electricity market structures.
Read more
Replicable Bandits with UCB based Exploration
Rohan Deb, Udaya Ghai, Karan Singh, Arindam Banerjee
Theory
  • Introduction of replicable algorithms for stochastic multi-armed and linear bandits.
  • Development of RepUCB and RepLinUCB algorithms that achieve low regret while ensuring replicability.
  • Introduction of RepRidge, a replicable ridge regression estimator with confidence guarantees.
  • Significant improvement in regret bounds compared to prior methods, particularly in linear bandits.
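For reference, the non-replicable baseline these algorithms build on is classical UCB: play the arm with the highest empirical mean plus an exploration bonus. The sketch below shows plain UCB1 on Bernoulli arms; the paper's replicability machinery is not shown, and the arm means are made up.

```python
import numpy as np

def ucb1(arm_means, horizon=5000, seed=0):
    """Classical UCB1: play argmax of empirical mean + sqrt(2 ln t / n)."""
    rng = np.random.default_rng(seed)
    k = len(arm_means)
    counts = np.zeros(k)
    sums = np.zeros(k)
    for t in range(1, horizon + 1):
        if t <= k:                        # play each arm once to initialise
            arm = t - 1
        else:
            bonus = np.sqrt(2 * np.log(t) / counts)
            arm = int(np.argmax(sums / counts + bonus))
        sums[arm] += rng.random() < arm_means[arm]   # Bernoulli reward
        counts[arm] += 1
    regret = horizon * max(arm_means) - sums.sum()
    return counts, regret

counts, regret = ucb1([0.3, 0.5, 0.7])
print(counts, round(float(regret), 1))
```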
Read more
Graph-Theoretic Models for the Prediction of Molecular Measurements
Anna Niane, Prudence Djagba
Graph Learning
  • The Mukwembi-Nyabadza model was evaluated on five benchmark datasets, confirming its limited transferability.
  • A systematic enhancement framework significantly improved the model's predictive performance.
  • Enhanced classical models matched or outperformed deep learning approaches in molecular property prediction.
  • The proposed framework is computationally efficient and accessible for resource-limited researchers.
Read more
Fast Bayesian equipment condition monitoring via simulation based inference: applications to heat exchanger health
Peter Collett, Alexander Johannes Stasik, Simone Casolo, Signe Riemer-Sørensen
Efficient ML Theory Time Series
  • Introduces a fast Bayesian framework for equipment condition monitoring using Simulation-Based Inference.
  • Achieves 82x faster inference times compared to traditional MCMC methods while maintaining diagnostic accuracy.
  • Demonstrates applicability to complex failure modes in heat exchangers, particularly fouling and leakage.
  • Establishes a scalable workflow for real-time monitoring of industrial systems.
Read more
Understanding and Mitigating Spurious Signal Amplification in Test-Time Reinforcement Learning for Math Reasoning
Yongcan Yu, Lingxiao He, Jian Liang, Kuangpu Guo, Meng Wang, Qianlong Xie, Xingxing Wang, Ran He
Reinforcement Learning Large Language Models NLP
  • Medium-frequency responses are a major source of spurious reward signals in TTRL.
  • Group-relative advantage normalization amplifies these spurious signals.
  • DDRL framework effectively mitigates spurious signals through sampling, debiasing, and consensus refinement.
  • Extensive experiments show DDRL outperforms existing TTRL baselines significantly.
Read more
PrismaDV: Automated Task-Aware Data Unit Test Generation
Hao Chen, Arnab Phani, Sebastian Schelter
Theory Efficient ML
  • PrismaDV generates task-aware data unit tests by analyzing downstream task code and dataset profiles.
  • The SIFTA framework optimizes prompt generation for improved task adaptation.
  • PrismaDV outperforms existing task-agnostic and task-aware data validation frameworks.
  • The system addresses common shortcomings in current data unit testing approaches.
Read more
Rethinking Intrinsic Dimension Estimation in Neural Representations
Rickmer Schulte, David Rügamer
Theory
  • Commonly used ID estimators are biased and do not track true IDs in neural representations.
  • The discrepancy between theory and practice in ID estimation is significant, particularly in high dimensions.
  • The paper characterizes manifolds of LLM embeddings and hidden layer representations.
  • Layer-wise ID patterns are influenced by various underlying factors, challenging previous interpretations.
Read more
Measure Twice, Click Once: Co-evolving Proposer and Visual Critic via Reinforcement Learning for GUI Grounding
Wenkai Wang, Xiyun Li, Hongcan Guo, Wenhao Yu, Tianqing Fang, Haitao Mi, Dong Yu, Shengyu Zhang
Computer Vision Reinforcement Learning Multimodal
  • Introduction of a Propose-then-Critic framework for GUI grounding.
  • Utilization of a co-evolutionary reinforcement learning strategy to balance prediction accuracy and diversity.
  • Significant improvements in grounding accuracy and critic capability, with gains of up to 17.2%.
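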
  • Dynamic maturity mechanism to adaptively guide the learning process.
Read more
Relative Entropy Estimation in Function Space: Theory and Applications to Trajectory Inference
Chao Wang, Luca Nepote, Giulio Franzese, Pietro Michiardi
Theory Generative Models Time Series
  • Introduces a framework for estimating KL divergence in function space, addressing limitations of traditional snapshot-based evaluations.
  • Validates the functional KL estimator against known analytic KL divergences, demonstrating robustness and accuracy.
  • Shows that existing snapshot-based metrics can yield inconsistent rankings in trajectory inference methods.
  • Establishes functional KL as a coherent criterion for evaluating trajectory inference, particularly under sparse data conditions.
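To ground the quantity being estimated, a KL divergence with tractable densities can always be sanity-checked against a Monte-Carlo estimate. The sketch below does this for 1-D Gaussians as a finite-dimensional warm-up; the paper tackles the much harder function-space case over whole trajectories.

```python
import numpy as np

def kl_gaussians_analytic(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(p || q) for two 1-D Gaussians."""
    return (np.log(sigma_q / sigma_p)
            + (sigma_p**2 + (mu_p - mu_q)**2) / (2 * sigma_q**2) - 0.5)

def kl_gaussians_monte_carlo(mu_p, sigma_p, mu_q, sigma_q, n=200_000, seed=0):
    """Estimate KL(p || q) = E_p[log p(x) - log q(x)] from samples of p."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu_p, sigma_p, size=n)
    log_p = -0.5 * ((x - mu_p) / sigma_p)**2 - np.log(sigma_p)   # shared constants cancel
    log_q = -0.5 * ((x - mu_q) / sigma_q)**2 - np.log(sigma_q)
    return float(np.mean(log_p - log_q))

print(kl_gaussians_analytic(0.0, 1.0, 1.0, 2.0))      # ~0.443
print(kl_gaussians_monte_carlo(0.0, 1.0, 1.0, 2.0))   # should match closely
```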
Read more
Geometric Monomial (GEM): a family of rational 2N-differentiable activation functions
Eylon E. Krause
Optimization Computer Vision NLP
  • GEM activation functions achieve ReLU-like performance with improved smoothness.
  • Three variants of GEM are introduced: GEM, E-GEM, and SE-GEM, each with unique properties.
  • N = 1 is optimal for standard-depth networks, while N = 2 is preferred for transformers.
  • GEM surpasses GELU in specific benchmarks, marking a significant advancement in activation function design.
Read more
The Path Not Taken: Duality in Reasoning about Program Execution
Eshgin Hasanov, Md Mahadi Hassan Sibat, Santu Karmaker, Aashish Yadavally
Large Language Models
  • Current benchmarks for LLMs focus too narrowly on single execution paths, limiting their evaluation of program understanding.
  • The proposed duality framework includes both forward and backward reasoning tasks to better assess LLMs' causal understanding of program execution.
  • DEXBENCH, the newly introduced benchmark, comprises 445 paired instances for comprehensive evaluation of LLMs.
  • Results indicate that dual-path reasoning is a more reliable measure of LLMs' reasoning capabilities compared to traditional single-path evaluations.
Read more
R2IF: Aligning Reasoning with Decisions via Composite Rewards for Interpretable LLM Function Calling
Aijia Cheng, Kailong Wang, Ling Shi, Yongxin Zhao
NLP Large Language Models Reinforcement Learning
  • R2IF introduces a hybrid reward design that optimizes both reasoning quality and function-call correctness.
  • The Chain-of-Thought Effectiveness Reward (CER) enhances tool-call stability without relying on subjective evaluations.
  • The Specification-Modification-Value (SMV) reward explicitly supervises parameter constraints and transformations.
  • R2IF shows significant performance improvements across multiple benchmarks, indicating its robustness and scalability.
Read more
Differentially Private Clustered Federated Learning with Privacy-Preserving Initialization and Normality-Driven Aggregation
Jie Xu, Haaris Mehmood, Rogier Van Dalen, Karthikeyan Saravanan, Mete Ozay
Federated Learning
  • PINA addresses both data heterogeneity and privacy in federated learning without requiring privileged server data or random restarts.
  • The framework utilizes privacy-preserving sketches of client updates for accurate cluster prototype initialization.
  • A normality-driven aggregation mechanism is introduced to improve robustness against imbalanced client participation.
  • PINA consistently outperforms existing differential privacy federated learning methods in terms of accuracy.
Read more