AI-generated summaries

Today's ML research,
without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

67 Papers today
8h Update frequency
7 Days of history
EEG-Based Multimodal Learning via Hyperbolic Mixture-of-Curvature Experts
Runhe Zhou, Shanglin Li, Guanxiang Huang, Xinliang Zhou, Qibin Zhao, Motoaki Kawanabe, Yi Ding, Cuntai Guan
Multimodal
  • Introduction of EEG-MoCE, a hyperbolic mixture-of-curvature framework for EEG-based multimodal learning.
  • Utilization of learnable curvatures for modality-specific experts to adapt to intrinsic differences.
  • Implementation of curvature-guided fusion to emphasize modalities with richer hierarchical structures.
  • Demonstration of state-of-the-art performance on multiple EEG-based multimodal datasets.
Read more
Is Sliding Window All You Need? An Open Framework for Long-Sequence Recommendation
Sayak Chakrabarty, Souradip Pal
Efficient ML Time Series Optimization
  • Introduction of an open-source framework for long-sequence recommendation training.
  • Development of a runtime-aware ablation study to analyze accuracy-compute trade-offs.
  • Novel k-shift embedding layer enabling large vocabularies on commodity GPUs.
  • Demonstration of competitive retrieval quality with modest training time overhead.
Read more
ASTER: Latent Pseudo-Anomaly Generation for Unsupervised Time-Series Anomaly Detection
Romain Hermary, Samet Hicsonmez, Dan Pineau, Abd El Rahman Shabayek, Djamila Aouada
Time Series Large Language Models Generative Models
  • ASTER generates pseudo-anomalies directly in latent space, improving generalization and eliminating the need for domain-specific augmentations.
  • The framework utilizes a VAE-based perturbator for synthesizing pseudo-anomalies and a Transformer-based classifier for anomaly detection.
  • Pre-trained LLMs are effectively leveraged as contextual feature extractors for time-series anomaly detection.
  • The method is validated using the TAB benchmark, ensuring fair and reproducible comparisons across TSAD methods.
Read more
Socrates Loss: Unifying Confidence Calibration and Classification by Leveraging the Unknown
Sandra Gómez-Gálvez, Tobias Olenyi, Gillian Dobbie, Katerina Taškova
Theory Optimization
  • Socrates Loss unifies classification and confidence calibration objectives through explicit uncertainty modeling.
  • The method incorporates an auxiliary unknown class to enhance training stability and performance.
  • Theoretical guarantees show that Socrates Loss regularizes model weights, preventing miscalibration.
  • Empirical results indicate improved accuracy-calibration trade-offs across multiple datasets and architectures.
Read more
Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges
Xiaohua Wang, Muzhao Tian, Yuqi Zeng, Zisu Huang, Jiakang Yuan, Bowen Chen, Jingwen Xu, Mingbo Zhou, Wenhao Liu, Muling Wu, Zhengkang Guo, Qi Qian, Yifei Wang, Feiran Zhang, Ruicheng Yin, Shihan Dou, Changze Lv, Tao Chen, Kaitao Song, Xu Tan, Tao Gui, Xiaoqing Zheng, Xuanjing Huang
NLP Large Language Models Reinforcement Learning
  • Reward hacking is a systemic vulnerability in large models due to reliance on imperfect proxy signals.
  • The Proxy Compression Hypothesis (PCH) provides a framework for understanding reward hacking mechanisms.
  • Local shortcut learning can lead to broader misalignment issues, including deception and strategic manipulation.
  • Detection and mitigation strategies should focus on the dynamics of compression, amplification, and co-adaptation.
Read more
ResBM: Residual Bottleneck Models for Low-Bandwidth Pipeline Parallelism
Alan Aboudib, Rodrigo Lopez Portillo A., Kalei Brady, Steffen Cruz
Large Language Models Efficient ML Optimization
  • Introduction of ResBM, achieving state-of-the-art 128× activation compression.
  • End-to-end trainability without degradation in convergence rates.
  • Empirical analysis shows optimizer choice affects activation compressibility.
  • Negligible memory and compute overhead compared to traditional methods.
Read more
When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation
Sandro Andric
Large Language Models NLP Theory
  • Stronger reasoning in models can hinder their ability to simulate boundedly rational behavior.
  • The study introduces the concept of 'solver-sampler mismatch' in multi-agent negotiation contexts.
  • Bounded reflection significantly improves simulation fidelity compared to native reasoning.
  • The paper proposes a framework for evaluating behavioral sampler fidelity in simulations.
Read more
PRiMeFlow: Capturing Complex Expression Heterogeneity in Perturbation Response Modelling
Zichao Yan, Yan Wu, Mica Xu Ji, Chaitra Agrahar, Esther Wershof, Marcel Nassar, Mehrshad Sadria, Ridvan Eksi, Vladimir Trifonov, Ignacio Ibarra, Telmo Felgueira, Błażej Osiński, Rory Stark
Generative Models
  • PRiMeFlow is an innovative flow matching approach for perturbation response modeling.
  • The model operates directly in gene expression space, enhancing biological signal retention.
  • Extensive benchmarking shows PRiMeFlow outperforms existing models in distribution-fitting metrics.
  • Key design choices, including the use of U-Net and independent coupling, significantly improve performance.
Read more
Beyond Weather Correlation: A Comparative Study of Static and Temporal Neural Architectures for Fine-Grained Residential Energy Consumption Forecasting in Melbourne, Australia
Prasad Nimantha Madusanka Ukwatta Hewage, Hao Wu
Time Series
  • LSTM outperforms MLP in short-term energy forecasting, emphasizing the importance of temporal autocorrelation.
  • The study provides empirical evidence that past consumption patterns are more informative than current weather conditions for fine-grained forecasting.
  • Solar photovoltaic integration introduces asymmetries in forecasting performance, particularly for households with solar systems.
  • The research highlights the need for accurate residential load forecasting in the context of Australia's National Electricity Market.
Read more
An Optimal Sauer Lemma Over k-ary Alphabets
Steve Hanneke, Qinglin Meng, Shay Moran, Amirreza Shaeiri
Theory
  • Establishes a sharp Sauer inequality for multiclass and list prediction based on the DS dimension (the classical binary statement is recalled below for context).
  • Improves upon the Natarajan dimension bounds, which are suboptimal for k > 2.
  • Provides tight bounds for all alphabet sizes and list sizes, enhancing learning guarantees.
  • Utilizes the polynomial method for proof, indicating a lack of combinatorial proofs in the DS setting.
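For context, the classical binary Sauer–Shelah lemma that the DS-dimension result generalizes: a hypothesis class $\mathcal{H}$ of VC dimension $d$, restricted to any $n$ points, satisfies

$$\bigl|\mathcal{H}|_{x_1,\dots,x_n}\bigr| \;\le\; \sum_{i=0}^{d} \binom{n}{i} = O(n^d).$$

The paper's contribution is a sharp analogue of this bound for k-ary alphabets and list prediction under the DS dimension.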
Read more
Adaptive Memory Crystallization for Autonomous AI Agent Learning in Dynamic Environments
Rajat Khanda, Mohammad Baqar, Sambuddha Chakrabarti, Satyasaran Changdar
Reinforcement Learning Robotics Theory
  • Introduction of Adaptive Memory Crystallization (AMC) for continual reinforcement learning.
  • Development of a three-phase memory hierarchy (Liquid, Glass, Crystal) to manage memory stability and plasticity.
  • Rigorous theoretical proofs regarding the SDE formulation and convergence properties.
  • Empirical results showing significant improvements in learning performance and memory efficiency.
Read more
Computational framework for multistep metabolic pathway design
Peter Zhiping Zhang, Jeffrey D. Varner
Optimization
  • Integration of deep learning with traditional retrobiosynthesis enhances metabolic pathway design.
  • Development of a data augmentation procedure to enrich reaction datasets.
  • Two neural network models were trained to classify and rank metabolic pathways.
  • Successful validation of the framework through the reproduction of various metabolic pathways.
Read more
Offline-Online Reinforcement Learning for Linear Mixture MDPs
Zhongjun Zhang, Sean R. Sinclair
Reinforcement Learning Theory Optimization
  • Introduction of O-O UCRL-VTR algorithm for offline-online learning in linear mixture MDPs.
  • Establishment of regret bounds that characterize the informativeness of offline data.
  • Demonstration of the algorithm's ability to safely leverage offline data while avoiding bias from environment shift.
  • Numerical experiments corroborate theoretical results, highlighting practical applicability.
Read more
Binomial Gradient-Based Meta-Learning for Enhanced Meta-Gradient Estimation
Yilang Zhang, Abraham Jaeger Mountain, Bingcong Li, Georgios B. Giannakis
Optimization Theory Efficient ML
  • Introduction of Binomial Gradient-Based Meta-Learning (BinomGBML) for improved meta-gradient estimation.
  • BinomGBML reduces estimation errors through efficient parallel computation and a truncated binomial expansion.
  • BinomMAML, a model-agnostic adaptation, shows provable improvements in error bounds over existing methods.
  • Theoretical results are validated through extensive numerical experiments on synthetic and real datasets.
Read more
Analog Optical Inference on Million-Record Mortgage Data
Sofia Berloff, Pavel Koptev, Konstantin Malkov
Efficient ML
  • The analog optical computer (AOC) achieves 94.6% balanced accuracy on mortgage classification, compared to 97.9% for XGBoost.
  • Increasing optical channels from 16 to 48 only improves accuracy by 0.5 percentage points, indicating architectural limitations.
  • Binarising features leads to a significant drop in accuracy for all models, highlighting the impact of encoding strategies.
  • Seven calibrated hardware non-idealities do not impose measurable penalties on model performance.
Read more
Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization
Aadyot Bhatnagar, Peter Mørch Groth, Ali Madani
Reinforcement Learning Optimization Large Language Models
  • Introduces STOMP, a novel offline RL algorithm for multi-objective optimization.
  • Overcomes the limitations of linear scalarization by using smooth Tchebysheff scalarization (sketched below).
  • Dynamically standardizes rewards based on observed distributions to improve optimization.
  • Empirical validation shows STOMP achieves superior performance in protein engineering tasks.
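A minimal sketch of smooth Tchebysheff scalarization, assuming the standard log-sum-exp smoothing of the weighted max; the function and parameter names are illustrative, not STOMP's actual API:

```python
import numpy as np
from scipy.special import logsumexp

def smooth_tchebysheff(objectives, weights, ref_point, mu=0.1):
    """Smooth Tchebysheff scalarization via log-sum-exp.

    The classical Tchebysheff scalarization is max_i w_i * (f_i - z_i);
    replacing the max with mu * logsumexp(./mu) gives a smooth,
    differentiable upper bound that tends to the max as mu -> 0,
    so a single objective can trace points on the Pareto front.
    """
    terms = np.asarray(weights) * (np.asarray(objectives) - np.asarray(ref_point))
    return mu * logsumexp(terms / mu)
```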
Read more
Sample Complexity of Autoregressive Reasoning: Chain-of-Thought vs. End-to-End
Steve Hanneke, Idan Mehalel, Shay Moran
NLP Large Language Models Theory
  • The sample complexity in the End-to-End regime can vary widely, potentially growing linearly with T.
  • Chain-of-Thought supervision eliminates the dependence of sample complexity on the generation length T.
  • The paper introduces new combinatorial tools for analyzing sample complexity in autoregressive models.
  • The findings resolve several open questions from previous work regarding learnability and supervision types.
Read more
Unsupervised domain transfer: Overcoming signal degradation in sleep monitoring by increasing scoring realism
Mohammad Ahangarkiasari, Andreas Tind Damgaard, Casper Haurum, Kaare B. Mikkelsen
Time Series
  • Introduces an unsupervised domain transfer method for sleep monitoring.
  • Combines a pretrained model with a discriminator network to adapt to signal degradation.
  • Demonstrates performance improvements in sleep scoring without requiring ground truth labels.
  • Highlights the importance of hypnogram realism in enhancing model accuracy.
Read more
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety
Junxiao Yang, Haoran Liu, Jinzhe Tu, Jiale Cheng, Zhexin Zhang, Shiyao Cui, Jiaqi Weng, Jialing Tao, Hui Xue, Hongning Wang, Han Qiu, Minlie Huang
NLP Large Language Models
  • Identification of the Semantic Bottleneck in LLMs where representations are organized by semantics rather than language.
  • Introduction of the LASA framework to align safety understanding with language-agnostic semantic structures.
  • Significant improvement in safety performance across all languages, especially for low-resource languages.
  • Empirical results showing a drastic reduction in attack success rates for various LLMs.
Read more
(How) Learning Rates Regulate Catastrophic Overtraining
Mark Rofin, Aditya Varre, Nicolas Flammarion
Large Language Models Optimization Theory
  • Learning rates significantly influence the optimization trajectory and model performance during finetuning.
  • Low learning rates act as implicit regularization, helping preserve the capabilities of pretrained models.
  • Increased sharpness of models due to learning rate decay during pretraining contributes to catastrophic forgetting.
  • The study connects optimization dynamics to the phenomenon of progressive sharpening in neural networks.
Read more
MyoVision: A Mobile Research Tool and NEATBoost-Attention Ensemble Framework for Real Time Chicken Breast Myopathy Detection
Chaitanya Pallerla, Siavash Mahmoudi, Dongyi Wang
Computer Vision Multimodal Optimization
  • MyoVision provides a low-cost, smartphone-based solution for detecting chicken breast myopathies.
  • The NEATBoost-Attention Ensemble model optimizes classification performance without manual hyperparameter tuning.
  • The framework captures and analyzes internal structural variations in poultry meat using transillumination imaging.
  • Achieved 82.4% accuracy in classifying myopathy types, comparable to expensive imaging systems.
Read more
The Linear Centroids Hypothesis: How Deep Network Features Represent Data
Thomas Walker, Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
Interpretability
  • Introduction of the Linear Centroids Hypothesis (LCH) for feature identification in deep networks.
  • LCH addresses limitations of the Linear Representation Hypothesis (LRH) by focusing on local centroids.
  • Demonstrates improved interpretability and performance in DINO vision transformers.
  • Facilitates identification of circuits in models like GPT2-Large.
Read more
Text-Attributed Knowledge Graph Enrichment with Large Language Models for Medical Concept Representation
Mohsen Nayebi Kerdabadi, Arya Hadizadeh Moghaddam, Chen Chen, Dongjie Wang, Zijun Yao
Large Language Models Graph Learning
  • COMED integrates LLMs with KGs to enhance medical concept representation.
  • The framework constructs a clinically interpretable and empirically supported KG.
  • LLM-generated semantics enrich the KG into a text-attributed graph.
  • Joint training of a text encoder and GNN allows for effective learning of concept embeddings.
Read more
Generalization Guarantees on Data-Driven Tuning of Gradient Descent with Langevin Updates
Saumya Goyal, Rohith Rongali, Ritabrata Ray, Barnabás Póczos
Optimization Theory Efficient ML
  • Introduction of the Langevin Gradient Descent Algorithm (LGD) for hyperparameter tuning in regression tasks (a single update step is sketched below).
  • Establishment of generalization guarantees with a pseudo-dimension bound of O(dh) for meta-learning optimal hyperparameters.
  • Demonstration of LGD's Bayes' optimality for squared loss and robustness to distribution shifts.
  • Empirical evidence showing LGD's effectiveness in few-shot learning with reduced computational requirements.
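A minimal sketch of the Langevin-style update that gradient descent with Langevin updates builds on, assuming the standard gradient-plus-Gaussian-noise form; the names and the temperature parameterization are illustrative, not the paper's notation:

```python
import numpy as np

def langevin_gd_step(w, grad, lr, temperature, rng):
    """One Langevin update: a gradient step plus injected Gaussian noise.

    The noise scale sqrt(2 * lr * temperature) is the standard
    Langevin-dynamics choice; lr and temperature are exactly the
    kind of hyperparameters tuned in a data-driven way.
    """
    noise = rng.normal(size=w.shape) * np.sqrt(2.0 * lr * temperature)
    return w - lr * grad + noise

# Usage: w = langevin_gd_step(w, grad_fn(w), lr=1e-2, temperature=0.1,
#                             rng=np.random.default_rng(0))
```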
Read more
LongCoT: Benchmarking Long-Horizon Chain-of-Thought Reasoning
Sumeet Ramesh Motwani, Daniel Nichols, Charles London, Peggy Li, Fabio Pizzati, Acer Blake, Hasan Hammoud, Tavish McDonald, Akshat Naik, Alesia Ivanova, Vignesh Baskaran, Ivan Laptev, Ruben Glatt, Tal Ben-Nun, Philip Torr, Natasha Jaques, Ameya Prabhu, Brian Bartoldson, Bhavya Kailkhura, Christian Schroeder de Witt
Large Language Models NLP Theory
  • LongCoT is a new benchmark for evaluating long-horizon reasoning in language models.
  • The benchmark includes 2,500 expert-designed problems across multiple domains.
  • Current leading models achieve less than 10% accuracy on LongCoT, indicating significant reasoning limitations.
  • The problems require navigating complex interdependencies, emphasizing the need for planning and context management.
Read more
Asymmetric-Loss-Guided Hybrid CNN-BiLSTM-Attention Model for Industrial RUL Prediction with Interpretable Failure Heatmaps
Mohammed Ezzaldin Babiker Abdullah
Time Series
  • Introduces a hybrid architecture combining CNN, BiLSTM, and attention mechanisms for RUL prediction.
  • Utilizes an asymmetric loss function to prioritize safety by penalizing over-estimation more heavily than under-estimation (see the sketch below).
  • Achieves competitive performance metrics (RMSE of 17.52 cycles and NASA S-Score of 922.06) on the C-MAPSS dataset.
  • Provides interpretable attention heatmaps that enhance model transparency and support maintenance decisions.
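A minimal sketch of an asymmetric squared-error loss of the kind described, assuming over-prediction of remaining useful life is the unsafe direction; the weighting factor is an illustrative hyperparameter, not the paper's value:

```python
import torch

def asymmetric_rul_loss(pred, target, over_weight=2.0):
    """Asymmetric squared error for RUL prediction.

    Over-estimating remaining useful life (pred > target) risks
    missed maintenance, so those residuals are weighted more
    heavily than under-estimates.
    """
    err = pred - target
    weight = torch.where(err > 0,
                         torch.full_like(err, over_weight),
                         torch.ones_like(err))
    return torch.mean(weight * err ** 2)
```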
Read more
Classification of Epileptic iEEG using Topological Machine Learning
Sunia Tanweer, Narayan Puthanmadam Subramaniyam, Firas A. Khasawneh
Time Series
  • Topological data analysis (TDA) improves classification of epileptic states from iEEG signals.
  • The study uses a larger dataset of 55 patients, enhancing the robustness of the findings.
  • Dimension-reduced topological features achieve up to 80% balanced accuracy, comparable to deep learning models.
  • Classical machine learning methods can effectively classify iEEG data with reduced complexity.
Read more
Beyond State Consistency: Behavior Consistency in Text-Based World Models
Youling Huang, Guanqiao Chen, Junchi Yao, Lu Wang, Fangkai Yang, Chao Du, ChenZhuo Zhao, Pu Zhao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang
NLP Large Language Models Reinforcement Learning
  • Introduction of Behavior Consistency Training paradigm for text-based world models.
  • Development of Behavior Consistency Reward (BehR) as a new metric for evaluating model performance.
  • Demonstrated improvements in long-term predictive fidelity and decision preservation in agent behavior.
  • BehR-based training leads to lower false positives in offline evaluations.
Read more
UI-Copilot: Advancing Long-Horizon GUI Automation via Tool-Integrated Policy Optimization
Zhengxi Lu, Fei Tang, Guangyi Liu, Kaitao Song, Xu Tan, Jin Ma, Wenqi Zhang, Weiming Lu, Jun Xiao, Yueting Zhuang, Yongliang Shen
Reinforcement Learning Large Language Models Optimization
  • Introduction of UI-Copilot framework for long-horizon GUI automation.
  • Implementation of memory decoupling to mitigate context overload.
  • Development of Tool-Integrated Policy Optimization (TIPO) for effective tool invocation.
  • UI-Copilot-7B achieves state-of-the-art performance on MemGUI-Bench.
Read more
SOAR: Self-Correction for Optimal Alignment and Refinement in Diffusion Models
You Qin, Linqing Wang, Hao Fei, Roger Zimmermann, Liefeng Bo, Qinglin Lu, Chunyu Wang
Generative Models Optimization Computer Vision
  • SOAR addresses exposure bias in diffusion models by correcting errors during the denoising process.
  • The method provides dense, on-policy supervision without the need for external reward models.
  • SOAR improves performance metrics significantly compared to traditional supervised fine-tuning methods.
  • The approach is compatible with subsequent reinforcement learning alignment, enhancing overall model performance.
Read more
AutoSurrogate: An LLM-Driven Multi-Agent Framework for Autonomous Construction of Deep Learning Surrogate Models in Subsurface Flow
Jiale Liu, Nanzhe Wang
NLP Large Language Models Efficient ML
  • AutoSurrogate enables non-experts to build deep learning surrogate models using natural language instructions.
  • The framework employs a multi-agent system to automate various stages of model construction and evaluation.
  • It autonomously addresses common failure modes, enhancing robustness and user experience.
  • Demonstrated superior performance compared to expert-designed models and traditional AutoML methods.
Read more
LEGO-MOF: Equivariant Latent Manipulation for Editable, Generative, and Optimizable MOF Design
Chaoran Zhang, Guangyao Li, Dongxu Ji
Generative Models Graph Learning Optimization
  • Introduction of LinkerVAE for continuous structural manipulation of MOFs.
  • Development of a test-time optimization strategy to enhance carbon capture performance.
  • Achieved a 147.5% average relative boost in CO2 uptake while preserving structural integrity.
  • Establishment of a fully differentiable framework for automated materials discovery.
Read more
Multi-Task LLM with LoRA Fine-Tuning for Automated Cancer Staging and Biomarker Extraction
Jiahao Shao, Anam Nawaz Khan, Christopher Brett, Tom Berg, Xueping Li, Bing Yao
NLP Large Language Models Efficient ML
  • Introduces a parameter-efficient multi-task framework for cancer staging and biomarker extraction.
  • Utilizes LoRA fine-tuning on a large dataset of pathology reports.
  • Achieves a high Macro F1 score of 0.976, outperforming traditional NLP methods.
  • Employs parallel classification heads for consistent schema adherence.
Read more
MOONSHOT: A Framework for Multi-Objective Pruning of Vision and Large Language Models
Gabriel Afriat, Xiang Meng, Shibal Ibrahim, Hussein Hazimeh, Rahul Mazumder
Computer Vision Large Language Models Efficient ML
  • MOONSHOT extends single-objective pruning methods to a multi-objective framework.
  • Joint optimization of layer-wise reconstruction error and second-order Taylor approximation improves pruning outcomes.
  • The framework is scalable for large models, maintaining efficiency in computation.
  • Significant performance improvements were observed across various models and tasks.
Read more
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
Yaxuan Li, Yuxin Zuo, Bingxiang He, Jinqian Zhang, Chaojun Xiao, Cheng Qian, Tianyu Yu, Huan-ang Gao, Wenkai Yang, Zhiyuan Liu, Ning Ding
NLP Large Language Models Reinforcement Learning
  • Effective OPD requires compatible thinking patterns between student and teacher models.
  • Higher benchmark scores do not guarantee new knowledge transfer in OPD.
  • Successful OPD is marked by progressive alignment on high-probability tokens.
  • Two strategies can recover failing OPD: off-policy cold start and teacher-aligned prompt selection.
Read more
Learning Inference Concurrency in DynamicGate MLP: Structural and Mathematical Justification
Yongil Choi
Theory Efficient ML Optimization
  • DynamicGate-MLP allows for concurrent learning and inference without compromising output stability.
  • The architecture separates gating parameters from prediction parameters, enabling selective updates.
  • Mathematical formalization provides sufficient conditions for maintaining inference validity during updates.
  • The approach is particularly relevant for real-time applications requiring adaptive learning.
Read more
Online learning with noisy side observations
Tomáš Kocák, Gergely Neu, Michal Valko
Theory Optimization Graph Learning
  • Introduces a partial-observability model for online learning with noisy side observations.
  • Develops an efficient, parameter-free algorithm with a regret bound of Õ(√(α*T)).
  • Defines the effective independence number α* to characterize the learning complexity.
  • Generalizes existing models and addresses the challenges of noisy feedback.
Read more
FAST: A Synergistic Framework of Attention and State-space Models for Spatiotemporal Traffic Prediction
Xinjin Li, Jinghan Cao, Mengyue Wang, Yue Wu, Longxiang Yan, Yeyang Zhou, Ziqi Sha, Yu Ma
Time Series Graph Learning Efficient ML
  • FAST addresses the limitations of existing traffic forecasting methods by combining attention and state-space modeling.
  • The Temporal-Spatial-Temporal architecture allows for effective modeling of both temporal and spatial dependencies.
  • Incorporation of a learnable multi-source spatiotemporal embedding enhances the model's ability to capture heterogeneous traffic contexts.
  • FAST achieves superior performance on benchmark datasets, significantly reducing RMSE and MAE compared to strong baselines.
Read more
Black-Box Optimization From Small Offline Datasets via Meta Learning with Synthetic Tasks
Azza Fadhel, The Hung Tran, Trong Nghia Hoang, Jana Doppa
Optimization
  • Introduces OptBias, a meta-learning framework for offline black-box optimization.
  • Addresses data scarcity by generating synthetic tasks to enhance model training.
  • Demonstrates improved performance over existing optimization algorithms in small data settings.
  • Emphasizes the importance of capturing optimization bias through gradient matching.
Read more
Some Theoretical Limitations of t-SNE
Rupert Li, Elchanan Mossel
Theory
  • t-SNE can lose important features of data during dimensionality reduction.
  • In high-dimensional spaces, t-SNE may map many points to the same location, leading to uninformative visualizations.
  • Theoretical results show that for certain datasets, the t-SNE objective (recalled below) can yield poor embeddings.
  • The paper provides a mathematical framework to understand the limitations of t-SNE.
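For reference, the objective whose behavior these results analyze is the standard t-SNE cost: the KL divergence between the high-dimensional affinities $P$ and the heavy-tailed low-dimensional affinities $Q$,

$$C = \mathrm{KL}(P\,\|\,Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}, \qquad q_{ij} \propto \bigl(1 + \lVert y_i - y_j \rVert^2\bigr)^{-1}.$$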
Read more
Spectral Entropy Collapse as an Empirical Signature of Delayed Generalisation in Grokking
Truong Xuan Khanh, Truong Quynh Hoa, Luu Duc Trung, Phan Thanh Duc
Theory
  • Grokking involves a two-phase process: norm expansion followed by spectral entropy collapse.
  • A stable threshold for normalized spectral entropy (H̃* ≈ 0.61) is identified, below which grokking consistently occurs (one common definition of this quantity is sketched below).
  • Causal interventions demonstrate that preventing entropy collapse delays the grokking process significantly.
  • A predictive model based on spectral entropy allows for accurate forecasting of generalization timing.
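A sketch of one common definition of normalized spectral entropy, assuming squared singular values are normalized into a probability distribution; the paper's exact normalization may differ:

```python
import numpy as np

def normalized_spectral_entropy(W: np.ndarray) -> float:
    """Normalized spectral entropy of a weight matrix.

    Converts squared singular values into a probability distribution
    and divides the Shannon entropy by its maximum, log of the number
    of singular values, so the result lies in [0, 1]. Low values mean
    the spectrum is concentrated in a few dominant modes.
    """
    s = np.linalg.svd(W, compute_uv=False)
    p = s ** 2 / np.sum(s ** 2)
    p = p[p > 0]  # drop zero modes to avoid log(0)
    return float(-np.sum(p * np.log(p)) / np.log(len(s)))
```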
Read more
Automated co-design of high-performance thermodynamic cycles via graph-based hierarchical reinforcement learning
Wenqing Li, Xu Feng, Peixue Jiang, Yinhai Zhu
Reinforcement Learning Graph Learning Optimization
  • Introduces a graph-based hierarchical reinforcement learning approach for thermodynamic cycle design.
  • Encodes cycles as graphs to facilitate structural and parameter optimization.
  • Demonstrates the ability to discover novel cycle configurations that outperform traditional designs.
  • Establishes a fully automated pipeline for thermodynamic cycle co-design.
Read more
Parameter Importance is Not Static: Evolving Parameter Isolation for Supervised Fine-Tuning
Zekai Lin, Chao Xue, Di Liang, Xingsheng Han, Peiyang Liu, Xianjie Wu, Lei Jiang, Yu Lu, Haibo Shi, Shuang Liang, Minlong Peng
NLP Large Language Models Efficient ML
  • Parameter importance in supervised fine-tuning is dynamic and subject to change during training.
  • Evolving Parameter Isolation (EPI) adapts isolation strategies based on real-time gradient information.
  • EPI significantly reduces task interference and catastrophic forgetting compared to static isolation methods.
  • The framework maintains a balance between stability (retaining knowledge) and plasticity (learning new tasks).
Read more
When Less Latent Leads to Better Relay: Information-Preserving Compression for Latent Multi-Agent LLM Collaboration
Yiping Li, Zhiyu An, Wan Du
Large Language Models NLP Efficient ML
  • Introduces Orthogonal Backfill (OBF) for efficient KV cache compression in multi-agent systems.
  • Demonstrates that compressed relay can match or outperform full KV relay while reducing communication costs significantly.
  • Establishes a new perspective on KV compression as a relay-specific communication problem.
  • Shows that preserving useful information is more critical than the volume of transmitted data.
Read more
Does Dimensionality Reduction via Random Projections Preserve Landscape Features?
Iván Olarte Rodríguez, Anja Jankovic, Thomas Bäck, Elena Raponi
Optimization
  • Dimensionality reduction via RGEs can significantly alter ELA feature values.
  • Most ELA features are sensitive to the embedding process, impacting their reliability.
  • Robust features under projection may not reflect intrinsic landscape properties.
  • The study highlights the limitations of using dimensionality reduction for ELA in high-dimensional optimization problems.
Read more
Golden Handcuffs make safer AI agents
Aram Ebtekar, Michael K. Cohen
Reinforcement Learning Theory
  • Introduces the 'Golden Handcuffs' mechanism to enhance safety in RL agents.
  • Proposes a Bayesian mitigation strategy that incorporates a large negative reward to discourage risky exploration.
  • Demonstrates that the agent can achieve sublinear regret against the best mentor while maintaining safety.
  • Establishes that the agent avoids unsafe actions by deferring to mentor policies.
Read more
Spectral Thompson sampling
Tomas Kocak, Michal Valko, Remi Munos, Shipra Agrawal
Theory Graph Learning Efficient ML
  • SpectralTS provides a computationally efficient alternative to traditional Thompson Sampling algorithms.
  • The regret of SpectralTS scales as d√(T ln N), which is favorable compared to existing methods.
  • The algorithm is applicable in contexts where payoffs are smooth over a graph, such as recommender systems.
  • Empirical evaluations indicate that SpectralTS performs competitively on both synthetic and real-world data.
Read more
Enhancing Reinforcement Learning for Radiology Report Generation with Evidence-aware Rewards and Self-correcting Preference Learning
Qin Zhou, Guoyan Liang, Qianyi Yang, Jingyuan Chen, Sai Wu, Chang Yao, Zhe Wang
Reinforcement Learning NLP Generative Models
  • Introduction of ESC-RL framework to enhance RRG with evidence-aware rewards and self-correcting mechanisms.
  • GEAR module provides group-wise, evidence-aware feedback for improved alignment of generated reports with clinical findings.
  • SPL strategy constructs a disease-specific preference dataset to refine report generation autonomously.
  • Extensive experiments show superior performance compared to existing RRG methods.
Read more
Enhancing Confidence Estimation in Telco LLMs via Twin-Pass CoT-Ensembling
Anton Saenko, Pranshav Gajjar, Abiodun Ganiyu, Vijay K. Shah
NLP Large Language Models
  • Current LLM confidence scores in telecommunications are often biased and unreliable.
  • The proposed Twin-Pass CoT-Ensembling method improves confidence estimation by aggregating multiple evaluations.
  • The methodology achieves up to an 88% reduction in Expected Calibration Error (ECE, computed as sketched below) across various benchmarks.
  • Empirical validation provides concrete confidence thresholds for operational use in telecom.
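For reference, a standard binned computation of ECE, the metric the 88% reduction refers to (this is the evaluation metric, not the proposed method):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned Expected Calibration Error.

    confidences: per-prediction max softmax probabilities (NumPy array).
    correct: 0/1 indicators of whether each prediction was right.
    Averages the |accuracy - confidence| gap per confidence bin,
    weighted by the fraction of samples in that bin.
    """
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece
```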
Read more
Safety Training Modulates Harmful Misalignment Under On-Policy RL, But Direction Depends on Environment Design
Leon Eshuijs, Shihan Wang, Aanske Fokkens
Reinforcement Learning Large Language Models NLP
  • Model size can reduce harmful misalignment in some environments but increase it in others, depending on environmental design.
  • Environmental features such as role framing and gameability cues significantly influence the direction of harmful exploitation.
  • Existing safety benchmarks are poor predictors of RL-induced misalignment, with exceptions for specific metrics like Sycophancy scores.
  • On-policy RL preserves a safety buffer that is lost in off-policy training settings.
Read more
C-voting: Confidence-Based Test-Time Voting without Explicit Energy Functions
Kenji Kubo, Shunsuke Kamiya, Masanori Koyama, Kohei Hayashi, Yusuke Iwasawa, Yutaka Matsuo
Theory Efficient ML
  • C-voting enhances test-time performance of recurrent models without requiring explicit energy functions (one plausible reading is sketched below).
  • The method shows a 4.9% accuracy improvement over energy-based voting strategies on Sudoku-hard tasks.
  • ItrSA++, a new recurrent model, outperforms existing models like HRM and AKOrN in various reasoning tasks.
  • C-voting is applicable to a wide range of recurrent architectures, making it a flexible solution for improving reasoning capabilities.
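A hedged sketch of what confidence-based test-time voting could look like: several forward passes are aggregated, each weighted by its own softmax confidence rather than by an explicit energy. The model interface and vote count are assumptions, not the paper's design:

```python
import torch

@torch.no_grad()
def c_vote(model, x, num_votes=8):
    """Confidence-weighted test-time voting (illustrative sketch).

    Runs the recurrent model several times and aggregates per-run
    class probabilities, weighting each run by its per-sample
    softmax confidence instead of an energy function.
    """
    probs, confs = [], []
    for _ in range(num_votes):
        p = model(x).softmax(-1)        # one stochastic forward pass
        probs.append(p)
        confs.append(p.max(-1).values)  # per-sample confidence
    probs = torch.stack(probs)            # [votes, batch, classes]
    w = torch.stack(confs).unsqueeze(-1)  # [votes, batch, 1]
    return (w * probs).sum(0).argmax(-1)  # confidence-weighted vote
```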
Read more
SOLARIS: Speculative Offloading of Latent-bAsed Representation for Inference Scaling
Zikun Liu, Liang Luo, Qianru Li, Zhengyu Zhang, Wei Ling, Jingyi Shen, Zeliang Chen, Yaning Huang, Jingxian Huang, Abdallah Aboelela, Chonglin Sun, Feifan Gu, Fenggang Wu, Hang Qu, Huayu Li, Jill Pan, Kaidi Pei, Laming Chen, Longhao Jin, Qin Huang, Tongyi Tang, Varna Puvvada, Wenlin Chen, Xiaohan Wei, Xu Cao, Yantao Yao, Yuan Jin, Yunchen Pu, Yuxin Chen, Zijian Shen, Zhengkai Zhang, Dong Liang, Ellie Wen
Efficient ML
  • SOLARIS enables real-time knowledge transfer from complex foundation models to smaller models.
  • The framework utilizes speculative precomputation of user-item embeddings to enhance efficiency.
  • Direct embedding-based transfer improves the knowledge transfer ratio significantly compared to traditional methods.
  • Hierarchical feature enrichment maximizes coverage without incurring additional computational costs.
Read more
Robust Ultra Low-Bit Post-Training Quantization via Stable Diagonal Curvature Estimate
Jaemin Kim, Sungkyun Kim, Junyeol Lee, Jiwon Seo
Large Language Models Efficient ML Optimization
  • Introduction of DASH-Q, a robust PTQ framework for LLMs.
  • Utilizes diagonal Hessian approximation to mitigate noise in quantization.
  • Achieves significant accuracy improvements in ultra low-bit quantization.
  • Demonstrates effectiveness with minimal calibration data.
Read more
Self-Organizing Maps with Optimized Latent Positions
Seiki Ubukata, Akira Notsu, Katsuhiro Honda
Optimization Theory Efficient ML
  • Introduction of continuous latent positions for data points in SOM.
  • Development of an entropy-regularized objective that retains computational efficiency.
  • Demonstration of strong neighborhood preservation and quantization performance.
  • Effective scalability for large datasets and numerous latent nodes.
Read more
Depth-Resolved Coral Reef Thermal Fields from Satellite SST and Sparse In-Situ Loggers Using Physics-Informed Neural Networks
Alzayat Saleh, Mostafa Rahimi Azghadi
Theory Time Series Optimization
  • Introduces a physics-informed neural network (PINN) for reconstructing depth-resolved thermal fields in coral reefs.
  • Demonstrates significant improvements in accuracy over traditional methods, particularly under sparse data conditions.
  • Reveals that thermal stress on corals decreases with depth, challenging existing satellite-based assessments.
  • Provides a framework that can be applied to existing observational infrastructures for better coral management.
Read more
From Order to Distribution: A Spectral Characterization of Forgetting in Continual Learning
Zonghuan Xu, Xingjun Ma
Theory
  • The paper reformulates forgetting in continual learning as a function of task distribution rather than task order.
  • An exact spectral characterization of forgetting is derived, leading to an unconditional exponential upper bound.
  • The convergence rate of forgetting is linked to the geometric properties of the task distribution.
  • A fundamental obstruction to establishing uniform positive lower bounds for forgetting is identified.
Read more
Momentum Further Constrains Sharpness at the Edge of Stochastic Stability
Arseniy Andreyev, Advikar Ananthkumar, Marc Walden, Tomaso Poggio, Pierfrancesco Beneventano
Optimization Theory
  • SGD with momentum exhibits Edge of Stochastic Stability-like behavior that varies with batch size.
  • In small-batch regimes, momentum biases training towards flatter regions, tightening curvature constraints.
  • Large-batch momentum recovers classical stability effects, allowing for sharper curvature.
  • Checkpoint interventions reveal that destabilizing changes can trigger significant shifts in training dynamics.
Read more
LLM-Enhanced Log Anomaly Detection: A Comprehensive Benchmark of Large Language Models for Automated System Diagnostics
Disha Patel
Large Language Models NLP
  • Comprehensive evaluation of LLM-based methods against traditional log anomaly detection techniques.
  • Fine-tuned transformers achieve the highest F1-scores (0.96–0.99) across datasets.
  • Prompt-based LLMs show strong zero-shot performance (F1: 0.82–0.91) without labeled training data.
  • Introduces structured log context prompting (SLCP) to improve LLM performance by 8–12%.
Read more
A KL Lens on Quantization: Fast, Forward-Only Sensitivity for Mixed-Precision SSM-Transformer Models
Jason Kong, Nilesh Prasad Pandey, Flavio Ponzina, Tajana Rosing
NLP Large Language Models Efficient ML
  • Introduces a lightweight, backpropagation-free sensitivity analysis framework for hybrid SSM-Transformer models.
  • Demonstrates that KL divergence is a superior metric for quantization sensitivity in language modeling tasks (see the sketch below).
  • Achieves significant model compression with minimal accuracy degradation through a novel mixed-precision quantization strategy.
  • Validates the approach with real-world profiling on Intel Lunar Lake hardware, achieving competitive performance.
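A hedged sketch of a forward-only KL sensitivity probe in the spirit described, assuming a Hugging Face-style causal LM interface; `quantized_model` stands in for the model with one module quantized at a trial bit-width:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def kl_sensitivity(model, quantized_model, batch):
    """Forward-only quantization sensitivity via KL divergence.

    Runs the full-precision and candidate-quantized models on the
    same batch and measures KL(p || q) between their next-token
    distributions; no backward pass is required, so sensitivity can
    be profiled cheaply per module and per bit-width.
    """
    p = F.log_softmax(model(batch).logits, dim=-1)            # reference
    q = F.log_softmax(quantized_model(batch).logits, dim=-1)  # candidate
    return F.kl_div(q, p, log_target=True, reduction="batchmean").item()
```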
Read more
Sparse Goodness: How Selective Measurement Transforms Forward-Forward Learning
Kamer Ali Yuksel, Hassan Sawaf
Theory Optimization Efficient ML
  • Introduction of a top-k goodness function that significantly outperforms the traditional sum-of-squares method (sketched below).
  • Development of entmax-weighted energy goodness, which utilizes adaptive sparse weights for improved accuracy.
  • Implementation of separate label–feature forwarding (FFCL) to enhance the learning process.
  • Identification of a unifying principle where sparsity in the goodness function is the most impactful design choice.
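A minimal sketch of a top-k goodness function for Forward-Forward training, replacing the classical sum of all squared activations with the sum over only the k largest; k is an illustrative choice, not the paper's setting:

```python
import torch

def topk_goodness(activations, k=10):
    """Selective goodness for Forward-Forward training.

    Classical FF goodness sums squared activations over every unit;
    here only the k largest squared activations contribute, so the
    measurement is sparse and driven by the most active units.
    """
    sq = activations ** 2                          # [batch, units]
    topk = torch.topk(sq, k, dim=-1).values        # k largest per sample
    return topk.sum(dim=-1)                        # [batch]
```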
Read more
BID-LoRA: A Parameter-Efficient Framework for Continual Learning and Unlearning
Jagadeesh Rachapudi, Ritali Vatsi, Praful Hambarde, Amit Shukla
Efficient ML Computer Vision
  • Introduces BID-LoRA, a novel framework for combining Continual Learning and Machine Unlearning.
  • Addresses knowledge leakage and degradation of foundational knowledge in existing CL and MU methods.
  • Achieves parameter efficiency by updating only approximately 5% of model parameters.
  • Demonstrates effectiveness through experiments on CIFAR-100 and CASIA-Face100 datasets.
Read more
RPS: Information Elicitation with Reinforcement Prompt Selection
Tao Wang, Jingyao Lu, Xibo Wang, Haonan Huang, Su Yao, Zhiqiang Hu, Xingyan Chen, Enmao Diao
NLP Large Language Models Reinforcement Learning
  • RPS is a novel framework for adaptive prompt selection in information elicitation tasks.
  • The IELegal dataset provides a realistic benchmark for evaluating dialogue-based information elicitation in legal contexts.
  • RPS significantly outperforms traditional static prompt methods, enhancing the ability of LLMs to gather concealed information.
  • The approach reduces reliance on handcrafted rules and promotes prompt diversity.
Read more
Counterfactual Peptide Editing for Causal TCR–pMHC Binding Inference
Sanjar Khudoyberdiev, Arman Bekov
Theory
  • Introduces Counterfactual Invariant Prediction (CIP) to mitigate shortcut learning in TCR-pMHC binding prediction.
  • CIP employs biologically constrained counterfactual peptide edits to enhance model robustness.
  • Achieves significant improvements in out-of-distribution evaluation metrics compared to baseline models.
  • Introduces new metrics for assessing causal fidelity in predictive models.
Read more
GCA Framework: A Gulf-Grounded Dataset and Agentic Pipeline for Climate Decision Support
Muhammad Umer Sheikh, Khawar Shehzad, Salman Khan, Fahad Shahbaz Khan, Muhammad Haris Khan
NLP Large Language Models Multimodal
  • Introduction of GCA-DS, a comprehensive Gulf-focused multimodal dataset with 200k question-answer pairs.
  • Development of the Gulf Climate Agent (GCA), which integrates LLM reasoning with specialized climate tools.
  • Demonstration of improved performance of fine-tuned LLMs on Gulf climate tasks compared to general-purpose models.
  • Emphasis on the necessity of region-specific datasets and tools for effective climate decision-making.
Read more
Thermodynamic Liquid Manifold Networks: Physics-Bounded Deep Learning for Solar Forecasting in Autonomous Off-Grid Microgrids
Mohammed Ezzaldin Babiker Abdullah
Time Series
  • Introduction of Thermodynamic Liquid Manifold Networks (TLMN) for solar forecasting.
  • Utilizes a Koopman-linearized Riemannian manifold for accurate modeling of atmospheric dynamics.
  • Achieves zero nocturnal error and minimal phase lag during rapid weather changes.
  • Demonstrates high accuracy with a Root Mean Square Error of 18.31 Wh/m².
Read more
Loop Corrections to the Training and Generalization Errors of Random Feature Models
Taeyoung Kim
Theory
  • Development of a perturbative framework for random feature models that includes higher-order fluctuation statistics.
  • Derivation of explicit loop expansions for training error, test error, and generalization gap, revealing mixed fluctuation effects.
  • Exploration of scaling laws for correction terms, identifying regimes where mean-kernel approximation holds.
  • Experimental verification of theoretical predictions, confirming the effectiveness of the loop-based description.
Read more
From Imitation to Discrimination: Progressive Curriculum Learning for Robust Web Navigation
Chuang Peng, Wei Zhang, Renshuai Tao, Xinhao Zhang, Jian Yang
NLP Large Language Models Optimization
  • Introduction of the Triton dataset with 590k instances for enhanced web navigation training.
  • Development of a progressive training curriculum that improves model discrimination and consistency.
  • Triton-GRPO-32B achieves a 58.7% Step Success Rate, surpassing leading models by over 16%.
  • Demonstration that specialized data and training strategies can outperform larger models with more parameters.
Read more