AI-generated summaries

Today's ML research,
without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

24 Papers today
8h Update frequency
7 Days of history
When Answers Stray from Questions: Hallucination Detection via Question-Answer Orthogonal Decomposition
Siyang Yao, Erhu Feng, Yubin Xia
NLP Large Language Models Efficient ML
  • QAOD introduces a geometric decomposition method for hallucination detection in LLMs.
  • The framework efficiently separates question-aligned components from answer representations.
  • Fisher-based selection identifies the most informative layers and neurons for probing.
  • QAOD achieves superior in-domain and out-of-domain performance compared to existing methods.
Read more
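Background for the summary above: the paper's exact decomposition is not reproduced here, but the core geometric idea — splitting an answer representation into a question-aligned component and an orthogonal residual — is plain vector projection. A minimal sketch with illustrative names (not the paper's API):

```python
def project_split(answer, question):
    """Split `answer` into its projection onto `question` and the
    orthogonal residual, so that answer = aligned + residual."""
    dot = sum(a * q for a, q in zip(answer, question))
    q_norm_sq = sum(q * q for q in question)
    scale = dot / q_norm_sq
    aligned = [scale * q for q in question]
    residual = [a - p for a, p in zip(answer, aligned)]
    return aligned, residual

aligned, residual = project_split([3.0, 4.0], [1.0, 0.0])
# aligned = [3.0, 0.0], residual = [0.0, 4.0]; the residual is orthogonal
# to the question direction, so its magnitude could feed a hallucination probe.
```

The residual's size relative to the aligned part is one natural "strays from the question" signal a probe could be trained on.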
Neural Fields for NV-Center Inverse Sensing
Zhixuan Zhao, Tao Zhong, Yixun Hu, Nathalie P. de Leon, Christine Allen-Blanchette
Optimization Theory
  • NeTMY is a neural-field solver for NV relaxometry inversion that addresses limitations of traditional methods.
  • The proposed tensor power-summed forward operator improves fidelity by avoiding nonphysical cross terms found in scalar solvers.
  • NeTMY achieves superior performance in reconstructing sparse density and spectral fields without the need for paired density labels.
  • The method effectively mitigates center-collapse issues and enhances optimization stability through its parameterization.
Read more
Generalized Priority-Aware Shapley Value
Kiljae Lee, Ziqi Liu, Weijing Tang, Yuan Zhang
Theory Interpretability Graph Learning
  • GPASV handles cyclic and weighted priority graphs, overcoming limitations of existing Shapley value methods.
  • The method incorporates individual soft priorities, enhancing the valuation framework.
  • GPASV is validated through simulations and practical applications, confirming its accuracy and scalability.
  • The priority sweeping diagnostic reveals larger effects of soft priorities compared to previous methods.
Read more
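As background for the entry above: GPASV generalizes the classical Shapley value, which averages each player's marginal contribution over all join orders. A brute-force reference implementation of the unconstrained case (exponential cost, tiny player sets only; the priority-graph extension is not shown):

```python
from itertools import permutations

def shapley_values(players, value_fn):
    """Exact Shapley values: average marginal contribution of each
    player over all n! orderings of the player set."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            before = value_fn(coalition)
            coalition = coalition | {p}
            totals[p] += value_fn(coalition) - before
    return {p: t / len(orders) for p, t in totals.items()}

# Toy game: coalition value = squared coalition size. By symmetry each
# of the three players receives v(N)/3 = 9/3 = 3.0.
vals = shapley_values(["a", "b", "c"], lambda s: len(s) ** 2)
```

The efficiency axiom (values sum to the grand-coalition value) is a quick sanity check on any such implementation.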
Exploitation of Hidden Context in Dynamic Movement Forecasting: A Neural Network Journey from Recurrent to Graph Neural Networks and General Purpose Transformers
Lukas Schelenz, Shobha Rajanna, Denis Gosalci, Lucas Heublein, Jonas Pirkl, Jonathan Ott, Felix Ott, Christopher Mutschler, Tobias Feigl
Time Series Graph Learning Optimization
  • Traditional forecasting methods are inadequate for capturing nonlinear dynamics in sports.
  • Hybrid LSTM models that incorporate contextual information significantly improve forecasting accuracy.
  • Machine learning methods outperform traditional models in predicting short-term movements.
  • No single architecture is optimal for all metrics in trajectory prediction; task-specific considerations are crucial.
Read more
Resolving Action Bottleneck: Agentic Reinforcement Learning Informed by Token-Level Energy
Langzhou He, Junyou Zhu, Yue Zhou, Zhengyao Gu, Junhua Liu, Wei-Chieh Huang, Henry Peng Zou, David Wipf, Philip S. Yu, Qitian Wu
Reinforcement Learning Large Language Models NLP
  • Introduces the concept of Action Bottleneck in agentic RL, highlighting the disparity in informative training signals between action and reasoning tokens.
  • Proposes ACTFOCUS, a token reweighting approach that downweights reasoning tokens and prioritizes action tokens based on their predictive uncertainty.
  • Demonstrates that ACTFOCUS consistently outperforms traditional methods like PPO and GRPO in various environments.
  • Finds a strong correlation between action token uncertainty and reward variance, emphasizing the importance of targeted training signal allocation.
Read more
RxEval: A Prescription-Level Benchmark for Evaluating LLM Medication Recommendation
Shuhao Chen, Weisen Jiang, Changmiao Wang, Xiaoqing Wu, Xuanren Shi, Yu Zhang, James T. Kwok
Large Language Models NLP
  • RxEval shifts the evaluation of medication recommendation from admission-level to prescription-level, capturing the dynamic nature of patient care.
  • The benchmark includes a reasoning-chain perturbation method to create patient-specific distractors for multiple-choice questions.
  • Evaluation of 16 LLMs demonstrates that current models have significant limitations in accurately recommending medications.
  • The results highlight systematic errors in LLMs, including oversight of patient information and reasoning failures.
Read more
Woodelf++: A Fast and Unified Partial Dependence Plot Algorithm for Decision Tree Ensembles
Ron Wettenstein, Alexander Nadel, Udi Boker
Efficient ML Interpretability
  • WOODELF++ provides a unified approach for computing PDPs, Joint-PDPs, and PDIVs.
  • The algorithm achieves significant computational efficiency, particularly for Any-Order-PDIVs.
  • Implementation in pure Python with GPU support enhances accessibility for large datasets.
  • The method demonstrates up to a 6× speedup over existing algorithms, and for Any-Order-PDIVs it is up to 1,000,000 years faster.
Read more
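The quantity WOODELF++ accelerates — a partial dependence plot — is just the average model prediction as one feature sweeps a grid while the others keep their observed values. A brute-force reference version for contrast (model and data are placeholders, not the paper's tree-ensemble algorithm):

```python
def partial_dependence(model, X, feature_idx, grid):
    """Brute-force PDP: for each grid value, substitute it into
    feature `feature_idx` of every row and average the predictions."""
    pdp = []
    for g in grid:
        preds = []
        for row in X:
            row_mod = list(row)
            row_mod[feature_idx] = g
            preds.append(model(row_mod))
        pdp.append(sum(preds) / len(preds))
    return pdp

# Toy model f(x) = 2*x0 + x1: the PDP over feature 0 has slope 2.
model = lambda r: 2 * r[0] + r[1]
X = [[0.0, 1.0], [0.0, 3.0]]
print(partial_dependence(model, X, 0, [0.0, 1.0, 2.0]))  # → [2.0, 4.0, 6.0]
```

This costs one model call per (grid point, row) pair, which is exactly the expense that specialized tree-traversal algorithms like WOODELF++ avoid.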
DeepTokenEEG: Enhancing Mild Cognitive Impairment and Alzheimer's Classification via Tokenized EEG Features
Thinh Nguyen-Quang, Minh Long Ngo, Ngoc-Son Nguyen, Nguyen Thanh Vinh, Huy-Dung Han, Bui Thanh Tung, Nguyen Quang Linh, Khuong Vo, Manoj Vishwanath, Hung Cao
Time Series Efficient ML
  • Introduction of DeepTokenEEG, a lightweight model for AD classification using EEG signals.
  • Utilization of spatial and temporal tokenization to capture relevant biomarkers.
  • Achieved 100% accuracy on specific frequency bands, surpassing existing methods.
  • Constructed a large-scale dataset for comprehensive benchmarking.
Read more
Paraphrasing Attack Resilience of Various AI-Generated Text Detection Methods
Andrii Shportko, Inessa Verbitsky
NLP Large Language Models
  • The rise of LLMs has increased the demand for effective AI-generated text detection methods.
  • Binoculars-inclusive ensembles demonstrate the highest detection performance but are most susceptible to paraphrasing attacks.
  • Human detection of AI-generated text is often unreliable, underscoring the need for automated solutions.
  • The study categorizes detection methods into training-based and training-free paradigms, with Binoculars showing significant advantages.
Read more
Mini-JEPA Foundation Model Fleet Enables Agentic Hydrologic Intelligence
Mashrekur Rahman
Computer Vision Multimodal Efficient ML
  • Mini-JEPAs outperform generalist models on specialized hydrologic tasks.
  • Each Mini-JEPA is specialized for a specific satellite sensor, enhancing predictive accuracy.
  • The embedding manifolds of the Mini-JEPAs exhibit distinct geometric structures.
  • A routing LLM effectively selects the appropriate Mini-JEPA for hydrologic queries.
Read more
Architecture-Aware Explanation Auditing for Industrial Visual Inspection
Sibo Jia, Zihang Zhao, Kunrong Li
Computer Vision Interpretability
  • The native-readout hypothesis suggests that explanation methods must align structurally with the model's decision mechanism to ensure faithfulness.
  • Experiments on the WM-811K dataset reveal significant differences in explanation quality among various models and methods.
  • Swin-Tiny's architecture allows for better explanation compatibility compared to traditional models like ResNet and DenseNet.
  • Model-agnostic methods like RISE underperform compared to native methods, highlighting the importance of architectural alignment.
Read more
GenAI for Energy-Efficient and Interference-Aware Compressed Sensing of GNSS Signals on a Google Edge TPU
Thorben Wegner, Lucas Heublein, Tobias Feigl, Felix Ott, Christopher Mutschler, Alexander Rügamer
Generative Models Efficient ML Interpretability
  • Introduces a GenAI-based approach for real-time GNSS jamming signal classification and compression.
  • Achieves significant data compression (>42×) while maintaining high classification accuracy.
  • Utilizes 8-bit quantization for energy-efficient deployment on Google Edge TPUs.
  • Explores various autoencoder architectures and their impact on reconstruction and classification performance.
Read more
Slower Generalization, Faster Memorization: A Sweet Spot in Algorithmic Learning
Shin So, Kyelim Lee, Albert No
Theory Optimization
  • Validation convergence is fastest at an intermediate dataset size in NW matrix generation, not at the largest size.
  • Partial rule learning can reduce the number of updates needed to achieve high training accuracy in the weak-validation regime.
  • The study introduces a two-pressure account of learning, where initial rule discovery aids fitting, but further data can increase fitting costs.
  • The findings refine the classical understanding of grokking by presenting a broader timing diagram of training and validation convergence.
Read more
Peng's Q(λ) for Conservative Value Estimation in Offline Reinforcement Learning
Byeongchan Kim, Min-hwan Oh
Reinforcement Learning Theory Optimization
  • CPQL is the first multi-step Q-learning algorithm for model-free offline RL.
  • The algorithm effectively utilizes offline trajectories without requiring additional model estimations.
  • CPQL mitigates over-pessimistic value estimation while ensuring performance is at least equal to the behavior policy.
  • Theoretical analyses confirm CPQL's ability to reduce sub-optimality compared to existing methods.
Read more
Test-Time Learning with an Evolving Library
Weijia Xu, Alessandro Sordoni, Chandan Singh, Zelalem Gero, Michel Galley, Xingdi Yuan, Jianfeng Gao
NLP Large Language Models Efficient ML
  • EVOLIB allows LLMs to learn and adapt during inference without parameter updates or external supervision.
  • The framework maintains a structured library of knowledge abstractions, promoting efficient knowledge transfer across tasks.
  • A novel credit assignment mechanism based on Information Gain (IG) and Future Information Gain (Future IG) supports continual learning.
  • EVOLIB shows consistent performance improvements across diverse benchmarks compared to traditional test-time learning methods.
Read more
Beyond Binary: Reframing GUI Critique as Continuous Semantic Alignment
Yuchen Sun, Pei Fu, Shaojie Zhang, Anan Du, Xiuwen Xi, Ruoceng Zhang, Zhenbo Luo, Jian Luan, Chongyang Zhang
Theory Optimization Robotics
  • Introduces BBCritic, a paradigm shift in GUI critique from binary classification to continuous semantic alignment.
  • Identifies and addresses structural defects in existing GUI critic models: Affordance Collapse and Noise Sensitivity.
  • Presents BBBench, the first GUI critic benchmark with a dense action space and a hierarchical four-level taxonomy.
  • Demonstrates that BBCritic outperforms larger binary models with fewer parameters and no additional annotations.
Read more
A Mutual Information Lower Bound for Multimodal Regression Active Learning
Leonardo Ferreira Guilhoto, Akshat Kaushal, Paris Perdikaris
Multimodal Theory Generative Models
  • Introduction of the Two-Index framework to separate epistemic and aleatoric uncertainties.
  • Development of the Mutual Information Lower Bound (MI-LB) acquisition function for active learning in multimodal regression.
  • MI-LB consistently outperforms existing acquisition functions across various multimodal benchmarks.
  • The framework provides a unified approach to uncertainty quantification applicable to a wide range of model families.
Read more
A Novel Schur-Decomposition-Based Weight Projection Method for Stable State-Space Neural-Network Architectures
Sergio Vanegas, Lasse Lensu, Fredy Ruiz
Theory Efficient ML Time Series
  • Introduction of a Schur-stable weight projection method for state-space neural networks.
  • Dynamic projection ensures stable dynamics with reduced overparameterization.
  • Experimental validation shows comparable performance to state-of-the-art methods.
  • Lower weight count enhances training convergence without sacrificing accuracy.
Read more
TopoPrimer: The Missing Topological Context in Forecasting Models
Zara Zetlin, Kayhan Moharreri, Maria Safi
Time Series
  • TopoPrimer encodes the global topological structure of time series data, improving forecasting accuracy.
  • The framework uses persistent homology to analyze cross-series correlation, producing a shared persistence landscape vector.
  • Spectral sheaf coordinates provide a per-series relational context without requiring extensive training.
  • TopoPrimer shows significant performance gains, especially under peak seasonal demand and cold-start conditions.
Read more
Second-Order Actor-Critic Methods for Discounted MDPs via Policy Hessian Decomposition
Sanjeev Manivannan, Shuban V
Reinforcement Learning Optimization Robotics
  • Introduction of a second-order actor-critic method for discounted MDPs.
  • Utilization of a two-timescale framework to stabilize second-order updates.
  • Efficient computation of Hessian-vector products to enhance convergence.
  • Demonstrated effectiveness across multiple benchmark environments.
Read more
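The standard trick behind efficient Hessian-vector products — not necessarily this paper's exact estimator — is that Hv can be formed from two gradient evaluations without ever materializing the Hessian. A finite-difference sketch:

```python
def hvp_fd(grad_fn, theta, v, eps=1e-5):
    """Finite-difference Hessian-vector product:
    H(theta) @ v ≈ (grad(theta + eps*v) - grad(theta - eps*v)) / (2*eps)."""
    plus = grad_fn([t + eps * vi for t, vi in zip(theta, v)])
    minus = grad_fn([t - eps * vi for t, vi in zip(theta, v)])
    return [(p - m) / (2 * eps) for p, m in zip(plus, minus)]

# Quadratic f(x) = x0^2 + 3*x1^2 has gradient [2*x0, 6*x1] and constant
# Hessian diag(2, 6), so H @ [1, 1] = [2, 6] at any point.
grad = lambda x: [2 * x[0], 6 * x[1]]
print(hvp_fd(grad, [0.5, -0.2], [1.0, 1.0]))  # ≈ [2.0, 6.0]
```

In autodiff frameworks the same product is usually computed exactly via a gradient-of-gradient pass; either way the cost stays linear in the parameter count.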
Enjoy Your Layer Normalization with the Computational Efficiency of RMSNorm
Yuxin Guo, Yihao Yue, Yunhao Ni, Yizhou Ruan, Jie Luo, Wenjun Wu, Lei Huang
Efficient ML Theory Optimization
  • Introduces a framework to replace Layer Normalization with RMSNorm in DNNs without affecting model predictions.
  • Defines 'foldable LNs' and develops a graph-based algorithm for their detection.
  • Demonstrates that many LNs in widely used architectures can be replaced, leading to significant inference-time acceleration.
  • Shows that the proposed method maintains competitive performance compared to traditional LN, especially in long-sequence tasks.
Read more
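The identity such folding exploits is elementary: LayerNorm of x equals RMSNorm applied to the mean-centered x, so wherever the centering can be absorbed into the preceding layer, the cheaper RMSNorm suffices. A numerical check of the identity (no learned affine terms; function names are illustrative):

```python
import math

def layer_norm(x):
    """LayerNorm without affine parameters: center, then divide by std."""
    mu = sum(x) / len(x)
    var = sum((xi - mu) ** 2 for xi in x) / len(x)
    return [(xi - mu) / math.sqrt(var) for xi in x]

def rms_norm(x):
    """RMSNorm: divide by root-mean-square, no centering."""
    rms = math.sqrt(sum(xi * xi for xi in x) / len(x))
    return [xi / rms for xi in x]

x = [1.0, 2.0, 4.0, 9.0]
mu = sum(x) / len(x)
centered = [xi - mu for xi in x]
# For centered inputs, std and RMS coincide, so the two norms agree.
assert all(abs(a - b) < 1e-12
           for a, b in zip(layer_norm(x), rms_norm(centered)))
```

RMSNorm skips the mean reduction and subtraction, which is where the inference-time savings come from.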
Uncovering Trajectory and Topological Signatures in Multimodal Pediatric Sleep Embeddings
Scott Ye, Harlin Lee
Multimodal Time Series Generative Models
  • The study explores the latent trajectory and topological signatures in pediatric sleep embeddings derived from PSG data.
  • Augmenting embeddings with geometric features and EHR improves model performance and generalization across apnea-hypopnea index (AHI) strata.
  • The research demonstrates that geometric and topological features provide complementary information for sleep event detection.
  • The use of persistent homology allows for stable and compact signatures of sleep continuity and fragmentation.
Read more
Policy Optimization in Hybrid Discrete-Continuous Action Spaces via Mixed Gradients
Matias Alvo, Daniel Russo, Yash Kanoria
Reinforcement Learning Robotics Optimization
  • Introduction of Hybrid Policy Optimization (HPO) framework for hybrid discrete-continuous action spaces.
  • HPO utilizes a mixed gradient estimator combining pathwise (PW) and score-function (SF) gradients for improved credit assignment.
  • Empirical results show HPO outperforms PPO, especially in high-dimensional continuous action settings.
  • The mixed gradient structure allows for decentralized updates, enhancing efficiency in learning.
Read more
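The two estimator families such a mixed gradient combines are standard: the score-function (REINFORCE) gradient, which also works for discrete actions, and the pathwise (reparameterization) gradient, which differentiates through continuous samples. A toy Gaussian comparison for d/dμ E[f(a)], a ~ N(μ, 1), f(a) = a², where the true gradient is 2μ (illustrative only, not the paper's estimator):

```python
import random

def sf_and_pw_grads(mu, f, df, n=200_000, seed=0):
    """Monte-Carlo estimates of d/dmu E[f(a)], a ~ N(mu, 1).
    Score function: E[f(a) * (a - mu)]   (since d log p / d mu = a - mu)
    Pathwise:       E[df(mu + eps)]      (a = mu + eps, eps ~ N(0, 1))"""
    rng = random.Random(seed)
    sf = pw = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        a = mu + eps
        sf += f(a) * (a - mu)
        pw += df(a)
    return sf / n, pw / n

sf, pw = sf_and_pw_grads(1.5, lambda a: a * a, lambda a: 2 * a)
# Both estimates approach the true gradient 2*mu = 3.0; the pathwise
# estimator typically shows much lower variance at the same sample count.
```

This variance gap is the usual motivation for preferring pathwise gradients on continuous actions while reserving score-function gradients for the discrete ones.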
Eradicating Negative Transfer in Multi-Physics Foundation Models via Sparse Mixture-of-Experts Routing
Ellwil Sharma, Arastu Sharma
Theory Optimization Efficient ML
  • Introduction of Shodh-MoE architecture to address negative transfer in multi-physics models.
  • Utilization of a physics-informed autoencoder for generating compressed physical latents.
  • Implementation of a Top-1 soft-semantic router for dynamic expert assignment based on latent semantics.
  • Demonstration of significant improvements in model convergence and performance across distinct physical regimes.
Read more