AI-generated summaries

Today's ML research, without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

68 papers today · updated every 8 hours · 7 days of history
Scaling Atomistic Protein Binder Design with Generative Pretraining and Test-Time Compute
Kieran Didi, Zuobai Zhang, Guoqing Zhou, Danny Reidenbach, Zhonglin Cao, Sooyoung Cha, Tomas Geffner, Christian Dallago, Jian Tang, Michael M. Bronstein, Martin Steinegger, Emine Kucukbenli, Arash Vahdat, Karsten Kreis
Generative Models Optimization
  • Introduces Proteína-Complexa, a unified framework for protein binder design.
  • Constructs a large-scale dataset, Teddymer, for effective pretraining.
  • Achieves state-of-the-art performance in binder design benchmarks.
  • Utilizes advanced test-time optimization techniques for improved efficiency.
Read more
A Perturbation Approach to Unconstrained Linear Bandits
Andrew Jacobsen, Dorian Baudry, Shinji Ito, Nicolò Cesa-Bianchi
Optimization Theory
  • The perturbation approach reduces uBLO to a standard OLO problem.
  • Expected-regret guarantees are derived for comparator-adaptive OLO algorithms.
  • Dynamic regret analysis achieves the optimal √(P_T·T) dependence without prior knowledge of the path length P_T.
  • First high-probability guarantees for static and dynamic regret in uBLO are established.
Read more
FedDES: Graph-Based Dynamic Ensemble Selection for Personalized Federated Learning
Brianna Mueller, W. Nick Street
Federated Learning Graph Learning
  • FedDES provides a decentralized approach to personalized federated learning, allowing for model heterogeneity.
  • The use of a Graph Neural Network enables dynamic ensemble selection tailored to individual test samples.
  • FedDES effectively suppresses contributions from non-beneficial peer models, enhancing performance and reducing negative transfer.
  • The framework supports asynchronous peer-to-peer communication, avoiding bottlenecks associated with centralized coordination.
Read more
Liquid Networks with Mixture Density Heads for Efficient Imitation Learning
Nikolaus Correll
Robotics Efficient ML Generative Models
  • Liquid neural networks with mixture density heads outperform diffusion policies in imitation learning tasks.
  • Liquid policies require fewer parameters while achieving significantly lower prediction errors and faster inference times.
  • The proposed shared-backbone comparison protocol ensures a fair evaluation of policy head performance.
  • Liquid models show increased robustness, particularly in scenarios with limited training data.
Read more
Physics-Embedded Feature Learning for AI in Medical Imaging
Pulock Das, Al Amin, Kamrul Hasan, Rohan Thompson, Azubike D. Okpalaeze, Liang Hong
Interpretability
  • Introduction of PhysNet, a physics-embedded deep learning framework for medical imaging.
  • Integration of tumor growth dynamics into the feature learning process of CNNs.
  • Dual branch architecture enables simultaneous tumor classification and learning of tumor behavior.
  • PhysNet outperforms state-of-the-art models in classification tasks on brain MRI datasets.
Read more
Automating Early Disease Prediction Via Structured and Unstructured Clinical Data
Ane G Domingo-Aldama, Marcos Merino Prado, Alain García Olea, Josu Goikoetxea, Koldo Gojenola, Aitziber Atutxa
NLP
  • Introduces an automated methodology for early disease prediction using structured and unstructured data.
  • Utilizes natural language processing to extract relevant information from discharge reports.
  • Demonstrates improved predictive accuracy for atrial fibrillation progression compared to traditional methods.
  • Addresses challenges of missing or incomplete data in electronic health records.
Read more
Symbolic Density Estimation: A Decompositional Approach
Angelo Rajendram, Xieting Chu, Vijay Ganesh, Max Fieg, Aishik Ghosh
Theory Interpretability
  • Introduction of AI-Kolmogorov for Symbolic Density Estimation (SymDE).
  • Multi-stage pipeline includes clustering, nonparametric density estimation, and symbolic regression.
  • Demonstrated efficacy on synthetic and high-energy physics-related datasets.
  • Addresses challenges of validity constraints, dimensionality, and complex expression discovery.
Read more
KMM-CP: Practical Conformal Prediction under Covariate Shift via Selective Kernel Mean Matching
Siddhartha Laghuvarapu, Rohan Deb, Jimeng Sun
Theory Efficient ML
  • KMM-CP framework utilizes Kernel Mean Matching for conformal prediction under covariate shift.
  • Introduces a selective extension to improve stability in low-overlap regions.
  • Establishes a connection between moment-matching quality and effective sample size for coverage guarantees.
  • Demonstrates significant performance improvements in molecular property prediction tasks.
Read more
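Kernel Mean Matching itself is a standard reweighting scheme: choose nonnegative source weights so that the weighted source mean embedding matches the target mean embedding. A minimal NNLS-based sketch under an RBF kernel (a generic illustration; the paper's selective extension, and the function names here, are not from the source):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular
from scipy.optimize import nnls

def rbf(A, B, sigma=1.0):
    # Pairwise RBF (Gaussian) kernel between rows of A and rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kmm_weights(X_src, X_tgt, sigma=1.0, jitter=1e-6):
    """Weights beta >= 0 minimizing (1/2) beta^T K beta - kappa^T beta,
    with kappa_i = (n/m) sum_j k(x_i, z_j) -- i.e. matching the weighted
    source mean embedding to the target mean embedding."""
    n, m = len(X_src), len(X_tgt)
    K = rbf(X_src, X_src, sigma) + jitter * np.eye(n)
    kappa = rbf(X_src, X_tgt, sigma).sum(axis=1) * (n / m)
    # With K = L L^T the objective equals (1/2)||L^T beta - L^{-1} kappa||^2
    # up to a constant, so NNLS solves it under beta >= 0.
    L = cholesky(K, lower=True)
    beta, _ = nnls(L.T, solve_triangular(L, kappa, lower=True))
    return beta
```

On a covariate-shifted toy problem (source N(0,1), target N(2,0.25)), the weights concentrate on source points that fall where the target density is high.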
GIFT: Bootstrapping Image-to-CAD Program Synthesis via Geometric Feedback
Giorgio Giannone, Anna Clare Doris, Amin Heyrani Nobari, Kai Xu, Akash Srivastava, Faez Ahmed
Generative Models Computer Vision Optimization
  • GIFT leverages geometric feedback to enhance training data diversity for CAD program synthesis.
  • The framework reduces inference compute by 80% while improving performance metrics.
  • GIFT outperforms traditional supervised fine-tuning methods and remains competitive with complex models.
  • The approach addresses the critical bottleneck of limited training data in generative CAD design.
Read more
FeDMRA: Federated Incremental Learning with Dynamic Memory Replay Allocation
Tiantian Wang, Xiang Xiang, Simon S. Du
Federated Learning
  • FeDMRA addresses the limitations of traditional fixed memory allocation in federated learning.
  • The framework incorporates dynamic memory allocation based on client data distribution and contribution.
  • It effectively mitigates catastrophic forgetting through optimized exemplar storage.
  • Extensive experiments show significant performance improvements on medical image datasets.
Read more
Personalizing Mathematical Game-based Learning for Children: A Preliminary Study
Jie Gao, Adam K. Dubé
Theory
  • The study proposes a framework for personalizing game-based learning using AI techniques.
  • A dataset of 206 player-generated game levels was analyzed to develop a classifier.
  • The Random Forest model was identified as the most effective classifier for predicting valid game levels.
  • The research emphasizes the importance of adaptive learning in enhancing student engagement and learning outcomes.
Read more
A Comparative Investigation of Thermodynamic Structure-Informed Neural Networks
Guojie Li, Liu Hong
Theory
  • Comparison of different thermodynamic formulations in PINNs.
  • Newtonian-residual-based PINNs struggle with physical consistency.
  • Structure-preserving formulations improve parameter identification and robustness.
  • Numerical experiments demonstrate the effectiveness of various thermodynamic models.
Read more
PEANUT: Perturbations by Eigenvalue Alignment for Attacking GNNs Under Topology-Driven Message Passing
Bhavya Kohli, Biplab Sikdar
Graph Learning
  • PEANUT is a gradient-free, black-box attack that injects virtual nodes into GNNs.
  • The attack is applicable during the inference phase, making it practical for real-world scenarios.
  • No features are required for the injected nodes, showcasing the significance of connectivity in GNNs.
  • The method demonstrates effectiveness across various graph tasks, including graph-level regression.
Read more
Match or Replay: Self Imitating Proximal Policy Optimization
Gaurav Chaudhary, Laxmidhar Behera, Washim Uddin Mondal
Reinforcement Learning Robotics Optimization
  • Introduction of Self-Imitating Proximal Policy Optimization (SIPP) for improved exploration and sample efficiency.
  • Development of the MATCH strategy utilizing optimal transport to enhance learning in dense reward environments.
  • Implementation of the REPLAY strategy to reinforce learning from successful trajectories in sparse reward scenarios.
  • Empirical validation across various environments demonstrating significant improvements in learning efficiency.
Read more
Benchmarking Tabular Foundation Models for Conditional Density Estimation in Regression
Rafael Izbicki, Pedro L. C. Rodrigues
Theory
  • Tabular foundation models like TabPFN and TabICL are effective for conditional density estimation.
  • These models outperform traditional CDE methods in terms of loss, log-likelihood, and CRPS across various datasets.
  • Calibration performance is competitive but may require post-hoc adjustments for larger datasets.
  • A case study in photometric redshift estimation highlights the practical advantages of using foundation models.
Read more
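The CRPS metric used in this benchmark has a simple empirical form for sample-based predictive distributions: CRPS = E|X − y| − ½·E|X − X′| for i.i.d. draws X, X′ from the forecast. A minimal sketch (generic, not tied to TabPFN/TabICL):

```python
import numpy as np

def crps_ensemble(samples, y):
    """Empirical CRPS for forecast samples and a scalar observation y.
    CRPS = E|X - y| - 0.5 * E|X - X'| over i.i.d. forecast draws X, X'."""
    samples = np.asarray(samples, dtype=float)
    term1 = np.abs(samples - y).mean()
    term2 = np.abs(samples[:, None] - samples[None, :]).mean()
    return term1 - 0.5 * term2
```

For a degenerate (point) forecast the spread term vanishes and CRPS reduces to absolute error, which is why it is a natural common scale for comparing density estimators against point predictors.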
Scalable Maximum Entropy Population Synthesis via Persistent Contrastive Divergence
Mirko Degli Esposti
Generative Models Optimization Theory
  • Introduction of GibbsPCDSolver, a scalable method for MaxEnt population synthesis.
  • Utilizes Persistent Contrastive Divergence to approximate expectations without full enumeration.
  • Demonstrates superior performance in terms of mean relative error and effective sample size compared to traditional methods.
  • Validated on a new demographic benchmark, Syn-ISTAT, with significant implications for urban modeling.
Read more
Stepwise Credit Assignment for GRPO on Flow-Matching Models
Yash Savani, Branislav Kveton, Yuchen Liu, Yilin Wang, Jing Shi, Subhojyoti Mukherjee, Nikos Vlassis, Krishna Kumar Singh
Reinforcement Learning Generative Models Computer Vision
  • Introduction of Stepwise-Flow-GRPO for improved credit assignment in reinforcement learning.
  • Utilization of Tweedie's formula for intermediate reward estimation to enhance learning efficiency.
  • Development of a new SDE inspired by DDIM for better image quality in generated outputs.
  • Demonstrated superior sample efficiency and faster convergence compared to traditional Flow-GRPO.
Read more
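Tweedie's formula, which underlies the intermediate reward estimation here, states that for Gaussian corruption x_t = x_0 + σ·ε the posterior mean is E[x_0 | x_t] = x_t + σ²·∇ log p(x_t). A numerical check in the fully Gaussian case, where the posterior mean is also known in closed form (a generic illustration, not the paper's estimator):

```python
import numpy as np

# Prior x0 ~ N(mu0, s0^2); corruption xt = x0 + sigma * eps.
mu0, s0, sigma = 1.0, 2.0, 0.5
xt = 3.0

# Marginal: xt ~ N(mu0, s0^2 + sigma^2), so the score is analytic.
var_t = s0 ** 2 + sigma ** 2
score = -(xt - mu0) / var_t

# Tweedie: E[x0 | xt] = xt + sigma^2 * score(xt).
tweedie = xt + sigma ** 2 * score

# Closed-form Gaussian posterior mean for comparison.
posterior_mean = (s0 ** 2 * xt + sigma ** 2 * mu0) / var_t
print(tweedie, posterior_mean)
```

The two expressions agree exactly, which is the point: given only a score estimate at an intermediate noise level, Tweedie's formula yields a denoised estimate without running the sampler to completion.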
PruneFuse: Efficient Data Selection via Weight Pruning and Network Fusion
Humaira Kousar, Hasnain Irshad Bhatti, Jaekyun Moon
Efficient ML
  • PruneFuse introduces a two-stage approach for efficient data selection using pruned networks.
  • The method significantly reduces computational costs associated with traditional active learning techniques.
  • Fusing the pruned network with the original model enhances training efficiency and generalization.
  • Extensive experiments show PruneFuse outperforms state-of-the-art methods across multiple datasets.
Read more
Q-BIOLAT: Binary Latent Protein Fitness Landscapes for QUBO-Based Optimization
Truong-Son Hy
Optimization
  • Q-BioLat provides a framework for modeling protein fitness landscapes in binary latent spaces.
  • The approach emphasizes the significance of representation in optimization landscapes, showing that different representations can yield different optimization outcomes.
  • PCA-based binary representations consistently outperform learned representations in terms of optimization effectiveness.
  • Classical combinatorial optimization methods are effective in structured binary latent spaces, enabling efficient exploration of protein fitness landscapes.
Read more
AMIGO: Agentic Multi-Image Grounding Oracle Benchmark
Min Wang, Ata Mahjoubfar
Multimodal
  • AMIGO introduces a long-horizon benchmark for hidden-target identification in multi-image settings.
  • The benchmark employs a constrained questioning protocol with explicit penalties for invalid actions.
  • It allows for controlled oracle imperfections to assess model robustness and verification behavior.
  • The evaluation metrics cover identification success, interaction quality, and protocol compliance.
Read more
Stability and Sensitivity Analysis of Relative Temporal-Difference Learning: Extended Version
Masoud S. Sakha, Rushikesh Kamalapurkar, Sean Meyn
Reinforcement Learning Theory Optimization
  • Establishes stability conditions for relative TD learning with linear function approximation.
  • Demonstrates that the choice of baseline distribution is crucial for algorithm stability.
  • Shows that asymptotic bias and covariance remain bounded as the discount factor approaches one.
  • Provides empirical validation through simulations on finite-state MDPs.
Read more
PiCSRL: Physics-Informed Contextual Spectral Reinforcement Learning
Mitra Nasr Azadani, Syed Usama Imtiaz, Nasrin Alamdari
Reinforcement Learning Efficient ML Optimization
  • PiCSRL effectively addresses HDLSS constraints in environmental monitoring through improved representation mechanisms.
  • The framework is the first to apply reinforcement learning to hyperspectral sensing under HDLSS conditions for sample-efficient policy learning.
  • Demonstrates significant improvements in predictive modeling for cyanobacterial gene concentrations using hyperspectral imagery.
  • Achieves superior performance compared to traditional sampling methods, enhancing detection efficiency.
Read more
D-GATNet: Interpretable Temporal Graph Attention Learning for ADHD Identification Using Dynamic Functional Connectivity
Qurat Ul Ain, Alptekin Temizel, Soyiba Jawed
Graph Learning Time Series Interpretability
  • D-GATNet leverages dynamic functional connectivity for ADHD classification, addressing limitations of static approaches.
  • The framework incorporates a Graph Attention Network for spatial learning and temporal convolution for dynamic modeling.
  • Interpretability is achieved through attention mechanisms that highlight significant ROI interactions and temporal windows.
  • The model outperforms existing methods, achieving 85.18% balanced accuracy and 0.881 AUC on the ADHD-200 dataset.
Read more
EcoFair: Trustworthy and Energy-Aware Routing for Privacy-Preserving Vertically Partitioned Medical Inference
Mostafa Anoosha, Dhavalkumar Thakker, Kuniko Paxton, Koorosh Aslansefat, Bhupesh Kumar Mishra, Baseer Ahmad, Rameez Raja Kureshi
Efficient ML Federated Learning Multimodal
  • EcoFair maintains data privacy by transmitting only embeddings instead of raw data.
  • The framework employs a dynamic routing mechanism that activates heavier processing based on clinical risk and uncertainty.
  • Experimental results show significant energy savings in edge-side inference without compromising classification accuracy.
  • Selective routing improves performance for subgroup-sensitive malignant cases.
Read more
Near-Optimal Primal-Dual Algorithm for Learning Linear Mixture CMDPs with Adversarial Rewards
Kihyun Yu, Seoungbin Bae, Dabeen Lee
Reinforcement Learning Optimization Theory
  • Introduces a primal-dual policy optimization algorithm for linear mixture CMDPs with adversarial rewards.
  • Achieves near-optimal regret and constraint violation bounds, matching minimax lower bounds up to logarithmic factors.
  • Utilizes a regularized dual update and weighted ridge regression for tighter confidence intervals.
  • Addresses limitations of existing algorithms that either assume fixed rewards or do not scale well.
Read more
Shapley meets Rawls: an integrated framework for measuring and explaining unfairness
Fadoua Amri-Jouidel, Emmanuel Kemel, Stéphane Mussard
Theory Interpretability
  • Introduces an integrated framework combining Shapley values with fairness measurement.
  • Demonstrates the application of the framework on the Census Income dataset.
  • Identifies key features contributing to gender unfairness in classifiers.
  • Offers a computationally efficient alternative to traditional methods for measuring unfairness.
Read more
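The Shapley value being combined with fairness measurement here is the standard cooperative-game quantity: each player's marginal contribution averaged over all arrival orders. An exact (exponential-time) generic sketch, for intuition only:

```python
from itertools import permutations

def shapley(players, value):
    """Exact Shapley values: average each player's marginal contribution
    to value(.) over all orderings of the players."""
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            phi[p] += value(frozenset(coalition)) - before
    return {p: v / len(perms) for p, v in phi.items()}
```

For an additive game the Shapley value recovers each player's own weight, and the values always sum to the grand-coalition value (efficiency), which is what makes the decomposition of a global unfairness measure into per-feature contributions well defined.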
High dimensional theory of two-phase optimizers
Atish Agarwala
Optimization Theory
  • Two-phase optimizers like LA and LA-DiLoCo provide a different noise structure compared to SGD, which can be beneficial in high-dimensional optimization tasks.
  • The one-worker variant of LA shows a favorable trade-off between signal and noise, outperforming SGD under optimal learning rates.
  • LA-DiLoCo's multi-worker implementation generates more noise, but this can be controlled with appropriate hyperparameter choices.
  • The introduction of momentum in the Super Lookahead variant enhances optimization performance by non-linearly transforming the Hessian spectrum.
Read more
Robust Batch-Level Query Routing for Large Language Models under Cost and Capacity Constraints
Jelena Markovic-Voronov, Kayhan Behdin, Yuanda Xu, Zhengze Zhou, Zhipeng Wang, Rahul Mazumder
NLP Large Language Models Optimization
  • Identifies limitations of existing per-query LLM routing methods under batch inference and strict constraints.
  • Introduces a robust batch-level routing framework that optimizes model assignments while considering performance uncertainty.
  • Explores optimal allocation of computational resources prior to inference to enhance efficiency.
  • Demonstrates significant improvements in routing accuracy and resource management through extensive experiments.
Read more
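Batch-level routing under a cost budget can be phrased as a small linear program over query-to-model assignments. A toy scipy sketch (illustrative only; the two-model setup and all numbers are invented, not from the paper):

```python
import numpy as np
from scipy.optimize import linprog

# 3 queries, 2 models: quality[i, j] = expected quality of model j on query i.
quality = np.array([[0.5, 0.9],
                    [0.5, 0.9],
                    [0.5, 0.9]])
cost = np.array([1.0, 3.0])   # per-query cost of each model
budget = 5.0

n_q, n_m = quality.shape
c = -quality.ravel()          # maximize quality => minimize its negation
# Each query routed exactly once: sum_j x[i, j] = 1.
A_eq = np.zeros((n_q, n_q * n_m))
for i in range(n_q):
    A_eq[i, i * n_m:(i + 1) * n_m] = 1.0
b_eq = np.ones(n_q)
# Batch budget: sum_{i,j} cost[j] * x[i, j] <= budget.
A_ub = np.tile(cost, n_q)[None, :]
b_ub = [budget]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
print(-res.fun)  # best total expected quality under the budget
```

Here the optimum is 1.9: the budget covers exactly one upgrade to the strong model, with the other two queries routed cheaply. The batch-level view matters because this budget couples the queries; greedy per-query routing cannot see the coupling.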
Mixture-Model Preference Learning for Many-Objective Bayesian Optimization
Manisha Dubey, Sebastiaan De Peuter, Wanrong Wang, Samuel Kaski
Optimization
  • Introduces a mixture of preference archetypes for many-objective optimization, moving beyond a single utility function.
  • Develops information-theoretic methods for active query selection that focus on both mode identity and trade-off shapes.
  • Provides diagnostics for mixture-aware evaluation that go beyond simple regret measures.
  • Demonstrates superior performance on synthetic and real-world datasets compared to existing methods.
Read more
Online Learning for Dynamic Constellation Topologies
João Norberto, Ricardo Ferreira, Cláudia Soares
Optimization Theory
  • Introduces a novel convex optimization framework for dynamic satellite network topology management.
  • Does not assume fixed orbital structures, allowing for flexibility in satellite maneuvers.
  • Demonstrates a trade-off between computational complexity and convergence in online learning.
  • Empirical results show performance matching that of established offline methods.
Read more
Are LLM-Enhanced Graph Neural Networks Robust against Poisoning Attacks?
Yuhang Ma, Jie Wang, Zheng Yan
Graph Learning Large Language Models
  • Introduces a robustness assessment framework for LLM-enhanced GNNs against poisoning attacks.
  • Evaluates 24 victim models using diverse LLM/LM feature enhancers and GNN architectures.
  • Demonstrates that LLM-enhanced GNNs show superior performance and robustness compared to shallow embedding baselines.
  • Identifies critical factors contributing to robustness, such as effective node representation encoding.
Read more
Physics-Guided Transformer (PGT): Physics-Aware Attention Mechanism for PINNs
Ehsan Zeraatkar, Rodion Podorozhny, Jelena Tešić
Theory Efficient ML Optimization
  • Introduction of Physics-Guided Transformer (PGT) for improved reconstruction of physical fields.
  • Embedding of physical structure into self-attention mechanisms to enhance model performance.
  • Demonstrated significant error reduction in sparse data scenarios compared to existing methods.
  • Unified training framework combining PDE residuals and data fidelity for robust learning.
Read more
Hybrid Deep Learning with Temporal Data Augmentation for Accurate Remaining Useful Life Prediction of Lithium-Ion Batteries
Yun Tian, Guili Wang, Jian Bi, Kaixin Han, Chenglu Wu, Zhiyi Lu, Chenhao Li, Liangwang Sun, Minyu Zhou, Chenchen Xu
Time Series
  • Introduction of CDFormer, a hybrid deep learning model for RUL prediction of lithium-ion batteries.
  • Integration of CNNs, DRSNs, and Transformers for improved feature extraction and modeling of degradation dynamics.
  • Implementation of novel temporal data augmentation techniques to enhance model robustness.
  • Demonstrated superior performance over existing RUL prediction methods with significant error reductions.
Read more
TinyML for Acoustic Anomaly Detection in IoT Sensor Networks
Amar Almaini, Jakob Folz, Ghadeer Ashour
Audio & Speech Efficient ML Time Series
  • Introduction of a compact TinyML pipeline for acoustic anomaly detection.
  • Utilization of Mel Frequency Cepstral Coefficients (MFCCs) for sound feature extraction.
  • Achieved 91% test accuracy and balanced F1-scores of 0.91 on the UrbanSound8K dataset.
  • Demonstrates the effectiveness of on-device processing for real-time anomaly detection.
Read more
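MFCC extraction is a textbook pipeline: frame, window, power spectrum, mel filterbank, log, DCT. A compact numpy/scipy sketch with assumed default parameters (a generic front end, not the paper's exact configuration):

```python
import numpy as np
from scipy.fftpack import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    """Minimal MFCCs: frame -> Hann window -> power spectrum
    -> triangular mel filterbank -> log -> DCT-II."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular filters spaced uniformly on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    logmel = np.log(power @ fbank.T + 1e-10)
    return dct(logmel, type=2, axis=1, norm="ortho")[:, :n_mfcc]
```

The appeal for TinyML is that every stage is fixed arithmetic with small buffers, so the whole front end fits comfortably on a microcontroller ahead of a small classifier.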
Optimistic Actor-Critic with Parametric Policies for Linear Markov Decision Processes
Max Qiushi Lin, Reza Asad, Kevin Tan, Haque Ishfaq, Csaba Szepesvari, Sharan Vaswani
Reinforcement Learning Theory Optimization
  • Introduces an optimistic actor-critic framework for linear MDPs using parametric log-linear policies.
  • Utilizes a logit-matching regression objective for the actor and Langevin Monte Carlo for the critic.
  • Achieves state-of-the-art sample complexity in both on-policy and off-policy settings.
  • Demonstrates practical effectiveness through experiments in linear MDPs and Atari environments.
Read more
The Unreasonable Effectiveness of Scaling Laws in AI
Chien-Ping Lu
Theory Efficient ML Interpretability
  • Classical scaling laws effectively predict AI progress despite diminishing returns.
  • The compute variable should be interpreted as logical compute, abstracting from implementation details.
  • Diminishing returns indicate rising operational burdens rather than merely a flatter performance curve.
  • Efficiency improvements in hardware and algorithms are crucial for continued AI progress.
Read more
Bit-Identical Medical Deep Learning via Structured Orthogonal Initialization
Yakov P. Shkolnikov
Time Series Theory Efficient ML
  • Introduces a framework for verified bit-identical training in deep learning.
  • Eliminates randomness from weight initialization, batch ordering, and GPU operations.
  • Structured orthogonal initialization outperforms traditional Kaiming initialization.
  • Demonstrates significant reductions in variance for rare clinical classes in ECG classification.
Read more
Kernel Dynamics under Path Entropy Maximization
Jnaneshwar Das
Theory
  • The kernel function is treated as a dynamical variable, allowing for a new perspective on kernel evolution.
  • The optimization landscape is endogenous, meaning that the geometry of the probability space changes as kernels evolve.
  • Fixed points of the dynamics correspond to self-consistent kernels that reinforce their own distinction structures.
  • The thermodynamic cost of kernel change is quantitatively linked to the mutual information gained.
Read more
Evaluating Interactive 2D Visualization as a Sample Selection Strategy for Biomedical Time-Series Data Annotation
Einari Vaaras, Manu Airaksinen, Okko Räsänen
Time Series Audio & Speech
  • The study compares three sample selection methods for annotating biomedical time-series data.
  • Interactive 2D visualizations (2DVs) significantly enhance the annotation process, particularly in capturing rare classes.
  • Variability in label distribution from 2DV can decrease classification performance when using individual annotator labels.
  • Farthest-first traversal (FAFT) excels in scenarios with limited annotation budgets.
Read more
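Farthest-first traversal is a simple greedy rule: seed with one point, then repeatedly add the point farthest from everything selected so far. A minimal generic sketch (hypothetical function name):

```python
import numpy as np

def farthest_first(X, k, start=0):
    """Greedy farthest-first traversal: select `start`, then repeatedly
    add the point with the largest distance to its nearest selected point."""
    X = np.asarray(X, dtype=float)
    chosen = [start]
    d = np.linalg.norm(X - X[start], axis=1)  # distance to nearest chosen
    for _ in range(k - 1):
        nxt = int(np.argmax(d))
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return chosen
```

Because it maximizes coverage of the embedding space rather than density, it tends to hit outlying regions early, which is consistent with it performing well when the annotation budget is small.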
Geometric Evolution Graph Convolutional Networks: Enhancing Graph Representation Learning via Ricci Flow
Jicheng Ma, Yunyan Yang, Juan Zhao, Liang Zhao
Graph Learning
  • Introduction of discrete Ricci flow into deep graph representation learning.
  • Integration of LSTM and GCN for enhanced node representation learning.
  • Demonstrated state-of-the-art performance on multiple benchmark datasets.
  • Particularly effective in capturing structural heterophily and long-range interactions.
Read more
Machine Learning-Assisted High-Dimensional Matrix Estimation
Wan Tian, Hui Yang, Zhouhui Lian, Lingyue Zhang, Yijie Peng
Optimization Theory Efficient ML
  • Introduces a machine learning-assisted approach to high-dimensional matrix estimation.
  • Enhances LADMM with learnable parameters for improved accuracy and convergence speed.
  • Proves theoretical convergence and faster convergence rates for the reparameterized LADMM.
  • Validates the proposed method against classical optimization techniques.
Read more
Rethinking Language Model Scaling under Transferable Hypersphere Optimization
Liliang Ren, Yang Liu, Yelong Shen, Weizhu Chen
NLP Large Language Models Optimization
  • Introduction of HyperP framework for optimal learning rate transfer across various model configurations.
  • Demonstration of transferable stability in training dynamics under hypersphere optimization.
  • Development of SqrtGate mechanism for improved MoE performance and load balancing.
  • Achieved 1.58× compute efficiency over a strong baseline at large scales.
Read more
SIMR-NO: A Spectrally-Informed Multi-Resolution Neural Operator for Turbulent Flow Super-Resolution
Muhammad Abid, Omer San
Theory
  • Introduces SIMR-NO, a novel framework for turbulent flow super-resolution.
  • Combines deterministic interpolation with spectral corrections for improved accuracy.
  • Achieves significant error reduction compared to existing methods like FNO and EDSR.
  • Successfully reproduces energy and enstrophy spectra, ensuring physical fidelity.
Read more
TED: Training-Free Experience Distillation for Multimodal Reasoning
Shuozhi Yuan, Jinqing Wang, Zihao Liu, Miaomiao Yuan, Haoran Peng, Jin Zhao, Bingwen Wang, Haoyi Wang
Multimodal Efficient ML Large Language Models
  • TED enables knowledge distillation without parameter updates, making it suitable for resource-constrained environments.
  • The framework utilizes a teacher-guided experience generation and compression mechanism to distill reusable reasoning principles.
  • Experiments show substantial performance improvements on multimodal reasoning tasks with only 100 training samples.
  • TED reduces training costs by over 20 times compared to conventional parameter-based distillation methods.
Read more
Hierarchy-Guided Topology Latent Flow for Molecular Graph Generation
Urvi Awasthi, Alexander Arjun Lobo, Leonid Zhukov
Generative Models Graph Learning
  • HLTF explicitly generates bond topology alongside 3D coordinates to improve molecular validity.
  • The model employs a planner-executor framework that integrates a latent hierarchy for global context.
  • HLTF achieves high stability and validity rates on benchmark datasets, outperforming existing methods.
  • The approach reduces false-valid samples that pass basic validation but fail stricter checks.
Read more
From Inference Routing to Agent Orchestration: Declarative Policy Compilation with Cross-Layer Verification
Huamin Chen, Xunzhuo Liu, Bowei He, Xue Liu
Large Language Models Theory Efficient ML
  • Extension of the Semantic Router DSL to multi-step agent workflows.
  • Introduction of a multi-target compilation framework for generating artifacts across different layers.
  • Establishment of a four-pillar analysis framework for evaluating the proposed approach.
  • Guarantees for auditability, cost efficiency, verifiability, and tunability are maintained across all targets.
Read more
ITQ3_S: High-Fidelity 3-bit LLM Inference via Interleaved Ternary Quantization with Rotation-Domain Smoothing
Edward J. Yoon
Large Language Models Efficient ML NLP
  • ITQ3_S utilizes FWHT for rotation-domain adaptive quantization, improving weight distribution for better quantization fidelity.
  • The method achieves zero-error round-trip fidelity between quantization and inference, outperforming traditional 3-bit quantization methods.
  • Empirical results show ITQ3_S achieves competitive perplexity with FP16 models while enhancing throughput significantly.
  • The approach is specifically designed for consumer-grade GPUs, addressing the challenges of deploying large language models efficiently.
Read more
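The fast Walsh-Hadamard transform at the heart of rotation-domain methods is a standard O(n log n) butterfly; a minimal sketch (generic, not the paper's kernel):

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform; len(x) must be a power of two.
    Applying it twice and dividing by n recovers the input, since H @ H = n * I."""
    x = np.asarray(x, dtype=float).copy()
    n, h = len(x), 1
    while h < n:
        for i in range(0, n, h * 2):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b          # butterfly: sums
            x[i + h:i + 2 * h] = a - b  # butterfly: differences
        h *= 2
    return x
```

The transform mixes every weight into every output coordinate, which is what flattens heavy-tailed weight distributions before quantization; being self-inverse up to a scale, the rotation costs no extra metadata to undo at inference.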
Taming the Instability: A Robust Second-Order Optimizer for Federated Learning over Non-IID Data
Yuanqiao Zhang, Tiantian He, Yuan Gao, Yixin Wang, Yew-Soon Ong, Maoguo Gong, A.K. Qin, Hui Li
Optimization Federated Learning
  • FedRCO is the first comprehensive framework addressing second-order optimization instability in federated learning.
  • It incorporates mechanisms to monitor gradient anomalies and reset states during numerical instability.
  • The proposed aggregation strategy preserves local curvature while integrating global knowledge.
  • FedRCO shows superior performance in terms of convergence speed and accuracy compared to existing methods.
Read more
Data-Driven Plasticity Modeling via Acoustic Profiling
Khalid El-Awady
Audio & Speech Time Series Theory
  • Introduces a data-driven approach to model plasticity in crystalline metals using acoustic emissions.
  • Utilizes wavelet transforms for improved detection of AE events compared to traditional methods.
  • Identifies 266 unique AE events, revealing insights into the mechanics of dislocation dynamics.
  • Establishes a correlation between AE events and stress drops, validating the detection methodology.
Read more
QuitoBench: A High-Quality Open Time Series Forecasting Benchmark
Siqiao Xue, Zhaoyang Zhu, Wei Zhang, Rongyao Cai, Rui Wang, Yixiang Mu, Fan Zhou, Jianguo Li, Peng Di, Hang Yu
Time Series
  • QuitoBench addresses the scarcity of high-quality benchmarks in time series forecasting.
  • The benchmark categorizes time series based on intrinsic properties rather than application domains.
  • Deep learning models outperform foundation models at short context lengths, while the reverse is true at longer lengths.
  • Forecastability is the dominant factor affecting model performance, leading to significant MAE differences.
Read more
Automatic feature identification in least-squares policy iteration using the Koopman operator framework
Christian Mugisho Zagabe, Sebastian Peitz
Reinforcement Learning
  • Introduction of KAE-LSPI algorithm for automatic feature identification in RL.
  • Reformulation of classical LSPI using the Koopman operator framework.
  • Comparison with existing LSPI and KLSPI methods shows competitive performance.
  • Elimination of the need for manual feature/kernel selection.
Read more
Neuro-Symbolic Process Anomaly Detection
Devashish Gaikwad, Wil M. P. van der Aalst, Gyunam Park
Theory Interpretability
  • Proposes a neuro-symbolic approach for process anomaly detection that integrates domain knowledge.
  • Utilizes Logic Tensor Networks to enhance neural network models with symbolic reasoning.
  • Demonstrates improved anomaly detection performance with as few as 10 conformant traces.
  • Highlights the importance of Declare constraints in refining the detection process.
Read more
Spectral Signatures of Data Quality: Eigenvalue Tail Index as a Diagnostic for Label Noise in Neural Networks
Matthew Loftus
Theory
  • The tail index α of the bottleneck layer predicts test accuracy with high precision under label noise conditions.
  • Under hyperparameter variation, spectral and conventional measures are weak predictors of test accuracy.
  • The spectral signature is concentrated at the information-processing bottleneck layer.
  • The study provides a comprehensive comparison of spectral measures against conventional metrics.
Read more
Temporal Credit Is Free
Aur Shalev Merin
Time Series Optimization Efficient ML
  • Jacobian propagation is not necessary for online adaptation in RNNs; immediate derivatives are sufficient.
  • Eligibility traces fail due to miscalibrated decay rates and lack of normalization, not because of the absence of Jacobian information.
  • The proposed method scales to larger networks (n = 1024) with 1000× less memory than RTRL.
  • An architectural rule is established to determine when normalization is required based on the presence of nonlinear state updates.
Read more
Critic-Free Deep Reinforcement Learning for Maritime Coverage Path Planning on Irregular Hexagonal Grids
Carlos S. Sepúlveda, Gonzalo A. Ruz
Reinforcement Learning Optimization Robotics
  • Introduces a DRL framework for maritime CPP on irregular hexagonal grids.
  • Utilizes a Transformer-based pointer policy for constructing coverage tours.
  • Implements a critic-free GRPO scheme for stable training in long-horizon tasks.
  • Achieves a 99.0% success rate in unseen environments, outperforming classical heuristics.
Read more
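The critic-free GRPO scheme mentioned above replaces a learned value network with group-relative advantages: each rollout's reward is standardized against the other rollouts in its group. A minimal sketch of that advantage computation (the paper's exact variant and hyperparameters are not reproduced here):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Critic-free, group-relative advantages in the GRPO style:
    standardize each rollout's reward against its group's mean and
    std, so no value network is needed as a baseline."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Four coverage tours sampled for the same grid, scored by coverage reward.
adv = grpo_advantages([1.0, 0.5, 0.0, 2.0])
```

The advantages are zero-mean by construction, which is what stabilizes policy-gradient updates over long-horizon coverage tours without a critic.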
AcTTA: Rethinking Test-Time Adaptation via Dynamic Activation
Hyeongyu Kim, Geonhui Han, Dosik Hwang
Computer Vision
  • AcTTA introduces an activation-aware approach to Test-Time Adaptation, focusing on dynamic modulation of activation functions.
  • The framework allows for adaptive adjustments of activation behavior without modifying network weights or requiring source data.
  • Extensive experiments show that AcTTA outperforms traditional normalization-based TTA methods across multiple datasets.
  • The study highlights the importance of activation functions in representation dynamics and their potential for improving adaptation to domain shifts.
Read more
ORACAL: A Robust and Explainable Multimodal Framework for Smart Contract Vulnerability Detection with Causal Graph Enrichment
Tran Duong Minh Dai, Triet Huynh Minh Le, M. Ali Babar, Van-Hau Pham, Phan The Duy
Graph Learning Multimodal
  • ORACAL integrates heterogeneous graph models with LLMs for enhanced vulnerability detection.
  • The framework employs a causal attention mechanism to improve robustness against adversarial attacks.
  • PGExplainer is used for generating explainable outputs, aiding in understanding vulnerability paths.
  • ORACAL achieves state-of-the-art performance, significantly outperforming existing models.
Read more
Interpretable long-term traffic modelling on national road networks using theory-informed deep learning
Yue Li, Shujuan Chen, Akihiro Shimoda, Ying Jin
Interpretability Theory Time Series
  • DeepDemand integrates travel demand theory with deep learning for improved traffic volume predictions.
  • The model outperforms traditional and machine learning baselines in predictive accuracy.
  • It demonstrates good geographic transferability, making it suitable for long-term planning.
  • Interpretability analysis provides insights into travel-time deterrence and socioeconomic factors.
Read more
Generative Modeling in Protein Design: Neural Representations, Conditional Generation, and Evaluation Standards
Senura Hansaja Wanasekara, Minh-Duong Nguyen, Xiaochen Liu, Nguyen H. Tran, Ken-Tye Yong
Generative Models Multimodal
  • Generative modeling is transforming protein design beyond traditional structure prediction.
  • The survey categorizes methods into representations, architectures, and task settings.
  • Best practices for evaluation emphasize the importance of physical validity and leakage-aware splits.
  • Identifies key challenges in the field, including biosecurity risks and modeling complexities.
Read more
Corruption-robust Offline Multi-agent Reinforcement Learning From Human Feedback
Andi Nika, Debmalya Mandal, Parameswaran Kamalaruban, Adish Singla, Goran Radanović
Reinforcement Learning Theory Optimization
  • Introduces a robust estimator for offline MARLHF against data corruption.
  • Achieves O(ϵ^(1−o(1))) and O(√ϵ) bounds on Nash-equilibrium gaps under different coverage assumptions.
  • Develops a quasi-polynomial-time algorithm for coarse correlated equilibria to address computational challenges.
  • First systematic approach to handle adversarial data corruption in multi-agent settings.
Read more
DPD-Cancer: Explainable Graph-based Deep Learning for Small Molecule Anti-Cancer Activity Prediction
Magnus H. Strømme, Alex G. C. de Sá, David B. Ascher
Graph Learning
  • DPD-Cancer utilizes a Graph Attention Transformer for improved prediction of small molecule anti-cancer activity.
  • The model outperforms existing methods, achieving high AUC scores and correlation coefficients for pGI50 predictions.
  • Attention mechanisms in DPD-Cancer enhance explainability by identifying and visualizing important molecular features.
  • The framework incorporates a multi-stage, chemistry-aware data partitioning strategy for robust performance evaluation.
Read more
Reducing Oracle Feedback with Vision-Language Embeddings for Preference-Based RL
Udita Ghosh, Dripta S. Raychaudhuri, Jiachen Li, Konstantinos Karydis, Amit Roy-Chowdhury
Reinforcement Learning Robotics Multimodal
  • ROVED combines vision-language embeddings with selective oracle feedback for efficient PbRL.
  • The framework reduces the need for high-quality oracle feedback by leveraging noisy VLE outputs.
  • A parameter-efficient fine-tuning method enhances the VLE's performance using sparse oracle feedback.
  • ROVED achieves oracle-level performance while cutting annotation costs by 50-80%.
Read more
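Preference-based RL pipelines like the one summarized above typically fit a reward model with a Bradley–Terry objective over trajectory pairs, whether the labels come from an oracle or from noisy vision-language embeddings. The sketch below shows that standard objective; ROVED's actual selection and fine-tuning machinery is not reproduced.

```python
import numpy as np

def bt_preference_loss(r1, r2, pref, eps=1e-12):
    """Bradley-Terry preference loss, the standard PbRL objective
    (assumed here; the paper's exact loss may differ). pref=1 means
    the first trajectory is preferred."""
    p = 1.0 / (1.0 + np.exp(-(r1 - r2)))   # P(traj 1 preferred)
    return -(pref * np.log(p + eps) + (1 - pref) * np.log(1 - p + eps))

# Correctly ranked pair incurs low loss; a flipped ranking incurs high loss.
loss = bt_preference_loss(r1=2.0, r2=0.5, pref=1)
```

The framework's contribution is then about *which* pairs get expensive oracle labels versus cheap VLE labels, not about this loss itself.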
DRiffusion: Draft-and-Refine Process Parallelizes Diffusion Models with Ease
Runsheng Bai, Chengyu Zhang, Yangdong Deng
Generative Models Efficient ML
  • DRiffusion introduces a draft-and-refine process for parallelizing diffusion models.
  • The method employs skip transitions to generate multiple draft states for parallel noise computation.
  • Theoretical acceleration rates of 1/n or 2/(n+1) are achieved depending on the operational mode.
  • Empirical results show speedups of 1.4× to 3.7× with minimal quality degradation.
Read more
Skillful Kilometer-Scale Regional Weather Forecasting via Global and Regional Coupling
Weiqi Chen, Wenwei Wang, Qilong Yuan, Lefei Shen, Bingqing Peng, Jiawei Chen, Bo Wu, Liang Sun
Time Series
  • Introduction of a global-regional coupling framework for high-resolution weather forecasting.
  • Development of the ScaleMixer module for dynamic identification of cross-scale interactions.
  • Significant performance improvement over operational NWP and AI baselines in forecasting accuracy.
  • Ability to capture complex weather phenomena in challenging terrains.
Read more
From Independent to Correlated Diffusion: Generalized Generative Modeling with Probabilistic Computers
Nihal Sanjay Singh, Mazdak Mohseni-Rajaee, Shaila Niazi, Kerem Y. Camsari
Generative Models Optimization Efficient ML
  • Introduction of correlated diffusion that incorporates Ising couplings into the sampling process.
  • Demonstration of improved sampling efficiency and accuracy using probabilistic computers (p-computers).
  • Validation of the framework on benchmark systems, showing closer alignment with MCMC distributions.
  • Establishment of a hybrid architecture combining p-computers for sampling and GPUs for neural network evaluation.
Read more
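The "Ising couplings in the sampling process" above refers to the kind of primitive a probabilistic computer accelerates in hardware. A toy CPU version of one Gibbs sweep over coupled binary spins (couplings, fields, and temperature here are illustrative, not the paper's setup):

```python
import numpy as np

def ising_gibbs_sweep(s, J, h, beta=1.0, rng=None):
    """One Gibbs sweep over spins s in {-1, +1} of an Ising model with
    couplings J and fields h -- the sampling primitive a p-computer
    implements natively (toy software version for illustration)."""
    if rng is None:
        rng = np.random.default_rng()
    for i in range(len(s)):
        local_field = J[i] @ s + h[i]
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * local_field))
        s[i] = 1 if rng.random() < p_up else -1
    return s

rng = np.random.default_rng(2)
n = 16
J = 0.1 * (np.ones((n, n)) - np.eye(n))  # weak ferromagnetic couplings (toy choice)
h = np.zeros(n)
s = rng.choice([-1, 1], size=n)
for _ in range(50):
    s = ising_gibbs_sweep(s, J, h, beta=1.0, rng=rng)
```

In the hybrid architecture described above, sweeps like this run on the p-computer while a GPU evaluates the neural network that sets the couplings.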
Improving Risk Stratification in Hypertrophic Cardiomyopathy: A Novel Score Combining Echocardiography, Clinical, and Medication Data
Marion Taconné, Valentina D.A. Corino, Annamaria Del Franco, Sara Giovani, Iacopo Olivotto, Adrien Al Wazzan, Erwan Donal, Pietro Cerveri, Luca Mainardi
Multimodal Interpretability
  • Development of a novel ML risk score for HCM using echocardiographic, clinical, and medication data.
  • The Random Forest model significantly outperformed the ESC score in predicting 5-year cardiovascular outcomes.
  • The model provides high interpretability through SHAP analysis, identifying both established and novel predictors.
  • Longitudinal analysis shows the model's stability over time, allowing for dynamic risk monitoring in clinical settings.
Read more
Knowledge Distillation for Efficient Transformer-Based Reinforcement Learning in Hardware-Constrained Energy Management Systems
Pascal Henrich, Jonas Sievers, Maximilian Beichter, Thomas Blank, Ralf Mikut, Veit Hagenmeyer
Reinforcement Learning Efficient ML
  • Knowledge Distillation effectively reduces the size and computational requirements of Transformer-based reinforcement learning models.
  • The distilled student models can outperform teacher models in terms of electricity cost efficiency.
  • Significant reductions in model parameters, memory usage, and inference time were achieved without sacrificing performance.
  • The approach enhances the applicability of reinforcement learning in resource-constrained environments, such as residential energy management systems.
Read more
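The distillation step described above is usually the classic Hinton-style soft-label objective: a KL term between temperature-softened teacher and student distributions. A minimal sketch (the paper's exact recipe, temperature, and loss weighting are not reproduced here):

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0, eps=1e-12):
    """Hinton-style soft-label distillation term (an assumed standard
    choice): KL(teacher || student) on temperature-softened
    distributions, scaled by T^2 to keep gradient magnitudes stable."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return T**2 * float(np.sum(p_t * (np.log(p_t + eps) - np.log(p_s + eps))))

loss = distillation_loss(np.array([1.0, 0.0]), np.array([2.0, -1.0]), T=2.0)
```

Minimizing this term lets a small student policy mimic the large Transformer teacher's action distribution, which is what enables the reported parameter and inference-time reductions on embedded energy-management hardware.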
Information-Theoretic Limits of Safety Verification for Self-Improving Systems
Arsenios Scrivens
Theory
  • Establishes dual conditions for safety in self-improving systems: bounded risk and unbounded utility.
  • Proves that classifiers under power-law risk schedules cannot achieve both safety and utility simultaneously.
  • Introduces a verification escape mechanism that allows for zero risk with positive true positive rates.
  • Demonstrates a universal finite-horizon ceiling for classifier utility, which is subpolynomial compared to verifiers.
Read more