AI-generated summaries

Today's ML research, without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

65 papers today · updated every 8 hours · 7 days of history
Predicting Trajectories of Long COVID in Adult Women: The Critical Role of Causal Disentanglement
Jing Wang, Jie Shen, Yiming Luo, Amar Sra, Qiaomin Xie, Jeremy C. Weiss
NLP Large Language Models Time Series
  • Developed a causal network model that uses a large language model (LLM) to predict the severity of post-acute sequelae of SARS-CoV-2 (PASC, i.e., Long COVID) in women.
  • Achieved 86.7% precision in clinical severity prediction.
  • Successfully differentiated between active pathology symptoms and confounding factors like menopause.
  • Utilized wearable data to enhance prediction accuracy and reduce diagnostic ambiguity.
Read more
FoMo X: Modular Explainability Signals for Outlier Detection Foundation Models
Simon Klüttermann, Tim Katzke, Phuong Huong Nguyen, Emmanuel Müller
Interpretability
  • FoMo-X enhances the explainability of outlier detection models by integrating modular diagnostic heads.
  • The framework leverages frozen embeddings from pretrained PFNs to provide efficient, context-aware diagnostics.
  • Two diagnostic heads are introduced: one for severity assessment and another for uncertainty estimation.
  • Extensive evaluations show high fidelity in recovering diagnostic signals with negligible inference cost.
Read more
CLeAN: Continual Learning Adaptive Normalization in Dynamic Environments
Isabella Marasco, Davide Evangelista, Elena Loli Piccolomini, Michele Colajanni
Theory Optimization Efficient ML
  • CLeAN addresses the limitations of traditional normalization methods in continual learning contexts.
  • The technique employs learnable parameters updated via an Exponential Moving Average (EMA) for adaptive normalization (see the sketch after this list).
  • CLeAN improves model performance on new data while reducing catastrophic forgetting.
  • The study emphasizes the critical role of adaptive normalization in dynamic environments.
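The EMA-updated normalization mentioned in the second bullet can be illustrated with a minimal, generic sketch in PyTorch. This is not CLeAN's exact update rule; the layer name, momentum value, and the choice to keep adapting at test time are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn

class EMANorm(nn.Module):
    """Generic adaptive normalization: running statistics are updated by an
    exponential moving average on every forward pass, so the layer keeps
    adapting as the data distribution drifts (illustrative, not CLeAN itself)."""

    def __init__(self, num_features: int, momentum: float = 0.01, eps: float = 1e-5):
        super().__init__()
        self.momentum, self.eps = momentum, eps
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))
        # Learnable affine parameters, as in standard normalization layers.
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, num_features)
        with torch.no_grad():
            batch_mean = x.mean(dim=0)
            batch_var = x.var(dim=0, unbiased=False)
            # EMA update of the running statistics (also applied at test time,
            # which is what makes the layer "adaptive" in a continual setting).
            self.running_mean.lerp_(batch_mean, self.momentum)
            self.running_var.lerp_(batch_var, self.momentum)
        x_hat = (x - self.running_mean) / torch.sqrt(self.running_var + self.eps)
        return self.weight * x_hat + self.bias

# Usage: stream a few drifting batches through the layer and watch the statistics track them.
layer = EMANorm(num_features=16)
for _ in range(5):
    _ = layer(torch.randn(32, 16) * 3.0 + 1.0)
print(layer.running_mean[:4], layer.running_var[:4])
```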
Read more
Collaborative Temporal Feature Generation via Critic-Free Reinforcement Learning for Cross-User Sensor-Based Activity Recognition
Xiaozhou Ye, Feng Jiang, Zihan Wang, Xiulai Wang, Yutao Zhang, Kevin I-Kai Wang
Reinforcement Learning Time Series Generative Models
  • Introduces CTFG, a novel framework for feature extraction in human activity recognition (HAR) that addresses cross-user variability.
  • Utilizes a Transformer-based autoregressive generator for sequential feature token generation.
  • Employs Group-Relative Policy Optimization to optimize feature generation without a critic.
  • Achieves state-of-the-art accuracy on benchmark datasets while reducing training variance.
Read more
MHPO: Modulated Hazard-aware Policy Optimization for Stable Reinforcement Learning
Hongjun Wang, Wei Liu, Weibo Gu, Xing Sun, Kai Han
Reinforcement Learning Optimization Multimodal
  • Introduction of the Log-Fidelity Modulator (LFM) for stable gradient optimization.
  • Implementation of Decoupled Hazard Penalty (DHP) for independent regulation of policy shifts.
  • Demonstrated superior performance and stability in RL training across diverse benchmarks.
  • Mitigation of risks associated with extreme policy shifts and high-variance outlier tokens.
Read more
Benchmarking Reinforcement Learning via Stochastic Converse Optimality: Generating Systems with Known Optimal Policies
Sinan Ibrahim, Grégoire Ouerdane, Hadi Salloum, Henni Ouerdane, Stefan Streif, Pavel Osinenko
Reinforcement Learning Theory Optimization
  • Introduction of a benchmarking framework for RL based on stochastic converse optimality.
  • Systematic generation of environments with known optimal policies for rigorous evaluation.
  • Validation through diverse environments and assessment of standard RL methods against ground-truth optima.
  • Provision of absolute metrics for performance evaluation, enhancing reproducibility in RL research.
Read more
Classifier Pooling for Modern Ordinal Classification
Noam H. Rotenberg, Andreia V. Faria, Brian Caffo
Theory Efficient ML
  • Introduces a model-agnostic approach for ordinal classification using any non-ordinal classifier.
  • Develops two algorithms: DifferenceOrdinalClassifier for cumulative classification and TreeOrdinalClassifier for hierarchical classification (a cumulative-style reduction is sketched after this list).
  • Provides an open-source Python package 'statlab' for easy implementation of the proposed methods.
  • Demonstrates superior performance of the proposed methods over traditional non-ordinal classifiers in various datasets.
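A cumulative ("difference") reduction of ordinal classification to ordinary binary classifiers, in the spirit of what the second bullet describes, can be sketched as follows. The helper class below is an illustration of the general Frank-and-Hall-style construction, not the statlab DifferenceOrdinalClassifier.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

class CumulativeOrdinal:
    """Fit K-1 binary classifiers for P(y > k); recover class probabilities by
    differencing. Illustrative sketch only, not the statlab implementation."""

    def __init__(self, base_estimator):
        self.base_estimator = base_estimator

    def fit(self, X, y):
        self.classes_ = np.sort(np.unique(y))
        self.models_ = []
        for k in self.classes_[:-1]:
            clf = clone(self.base_estimator)
            clf.fit(X, (y > k).astype(int))   # binary target: "label is above threshold k"
            self.models_.append(clf)
        return self

    def predict_proba(self, X):
        # P(y > k) for each threshold, clipped to be monotonically non-increasing.
        gt = np.column_stack([m.predict_proba(X)[:, 1] for m in self.models_])
        gt = np.minimum.accumulate(gt, axis=1)
        ones = np.ones((X.shape[0], 1))
        zeros = np.zeros((X.shape[0], 1))
        cum = np.hstack([ones, gt, zeros])      # P(y > -inf), ..., P(y > max class)
        return cum[:, :-1] - cum[:, 1:]         # P(y = k) obtained by differencing

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]

# Toy ordinal data: the class index increases with the sum of the features.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = np.digitize(X.sum(axis=1), bins=[-2.0, 0.0, 2.0])   # 4 ordered classes
model = CumulativeOrdinal(LogisticRegression(max_iter=1000)).fit(X, y)
print("train accuracy:", (model.predict(X) == y).mean())
```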
Read more
RaDAR: Relation-aware Diffusion-Asymmetric Graph Contrastive Learning for Recommendation
Yixuan Huang, Jiawei Chen, Shengfan Zhang, Zongsheng Cao
Graph Learning
  • RaDAR addresses structural semantics degradation and limited relational expressiveness in recommendation systems.
  • The framework employs a dual-view generation architecture combining graph generative and denoising models.
  • Innovations include asymmetric contrastive learning and diffusion-guided augmentation for enhanced robustness.
  • RaDAR outperforms existing methods on multiple benchmarks, especially under high noise and sparsity.
Read more
The Finetuner's Fallacy: When to Pretrain with Your Finetuning Data
Christina Baek, Ricardo Pio Monti, David Schwab, Amro Abbas, Rishabh Adiga, Cody Blakeney, Maximilian Böther, Paul Burstein, Aldo Gael Carranza, Alvin Deng, Parth Doshi, Vineeth Dorna, Alex Fang, Tony Jiang, Siddharth Joshi, Brett W. Larsen, Jason Chan Lee, Katherine L. Mentzer, Luke Merrick, Haakon Mongstad, Fan Pan, Anshuman Suri, Darren Teh, Jason Telanoff, Jack Urbanek, Zhengping Wang, Josh Wills, Haoli Yin, Aditi Raghunathan, J. Zico Kolter, Bogdan Gaza, Ari Morcos, Matthew Leavitt, Pratyush Maini
Large Language Models Theory Efficient ML
  • Specialized pretraining (SPT) improves domain performance while preserving general capabilities.
  • SPT reduces the pretraining tokens needed to achieve a given domain performance by up to 1.75×.
  • Incorporating domain data early in training is more effective than reserving it for finetuning.
  • SPT outperforms traditional finetuning approaches, especially in underrepresented domains.
Read more
RangeAD: Fast On-Model Anomaly Detection
Luca Hinkamp, Simon Klüttermann, Emmanuel Müller
Efficient ML Theory
  • Introduction of the On-Model AD framework for anomaly detection.
  • Development of RangeAD, which uses internal neural activation ranges for real-time anomaly detection (the range-checking idea is sketched below).
  • Demonstration of superior performance in high-dimensional tasks with lower inference costs.
  • Comprehensive ablation study validating the efficacy of the proposed method.
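The range-checking idea in the second bullet can be sketched very simply: record, per hidden unit, the activation range observed on normal data, then score a new sample by how many units it pushes out of range. The untrained random feature extractor and the counting score below are illustrative assumptions, not the RangeAD model.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny fixed (untrained) ReLU feature extractor, standing in for the internal
# layer of whatever model would be instrumented in practice.
W = rng.normal(size=(32, 8))
def activations(x):                    # x: (n, 8) -> (n, 32)
    return np.maximum(x @ W.T, 0.0)

# 1) Calibrate: per-unit min/max activation ranges on normal data only.
normal = rng.normal(size=(2000, 8))
acts = activations(normal)
lo, hi = acts.min(axis=0), acts.max(axis=0)

# 2) Score: count how many units fall outside the recorded range.
def anomaly_score(x):
    a = activations(x)
    return ((a < lo) | (a > hi)).sum(axis=1)

inliers = rng.normal(size=(5, 8))
outliers = rng.normal(size=(5, 8)) * 6.0          # far outside the calibration range
print("inlier scores :", anomaly_score(inliers))
print("outlier scores:", anomaly_score(outliers))
```

Because scoring only requires a forward pass plus a per-unit comparison, this kind of check adds essentially no inference cost, which is consistent with the "on-model", real-time framing above.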
Read more
TimeAPN: Adaptive Amplitude-Phase Non-Stationarity Normalization for Time Series Forecasting
Yue Hu, Jialiang Tang, Siwei Yu, Baosheng Yu, Jing Zhang, Dacheng Tao
Time Series
  • TimeAPN addresses non-stationarity in time series forecasting by modeling amplitude and phase changes.
  • The framework utilizes discrete wavelet transform for frequency domain analysis.
  • It incorporates adaptive normalization mechanisms to handle abrupt fluctuations in signal energy.
  • TimeAPN is model-agnostic, allowing integration with various forecasting backbones.
Read more
Transition Flow Matching
Chenrui Ma
Generative Models
  • Introduction of Transition Flow Matching for efficient few-step generative modeling.
  • Derivation of the Transition Flow Identity and a new training objective for generative models (the standard flow-matching objective is recalled below for context).
  • Establishment of a unified theoretical perspective connecting Transition Flow Matching with Mean Velocity models.
  • Demonstration of competitive performance in image generation benchmarks.
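For context, the standard conditional flow-matching objective with straight interpolation paths reads as below; this is the established baseline objective, not the paper's new Transition Flow Matching loss.

```latex
\mathcal{L}_{\mathrm{CFM}}(\theta)
  = \mathbb{E}_{t \sim \mathcal{U}[0,1],\; x_0 \sim p_0,\; x_1 \sim p_1}
    \big\| v_\theta(x_t, t) - (x_1 - x_0) \big\|^2,
\qquad x_t = (1 - t)\, x_0 + t\, x_1 .
```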
Read more
Efficient Soft Actor-Critic with LLM-Based Action-Level Guidance for Continuous Control
Hao Ma, Zhiqiang Pu, Xiaolin Ai, Huimu Wang
Reinforcement Learning Large Language Models Robotics
  • GuidedSAC leverages LLMs for action-level guidance in reinforcement learning.
  • The algorithm maintains convergence guarantees of the original SAC while enhancing speed.
  • GuidedSAC outperforms standard SAC and advanced exploration methods in various tasks.
  • The approach addresses the inefficiencies of exploration in vast state-action spaces.
Read more
Efficient Reasoning on the Edge
Yelysei Bondarenko, Thomas Hehn, Rob Hesselink, Romain Lepert, Fabio Valerio Massoli, Evgeny Mironov, Leyla Mirvakhabova, Tribhuvanesh Orekondy, Spyridon Stasis, Andrey Kuzmin, Anna Kuzina, Markus Nagel, Ankita Nayak, Corrado Rainone, Ork de Rooij, Paul N Whatmough, Arash Behboodi, Babak Ehteshami Bejnordi
NLP Large Language Models Efficient ML
  • Introduces a lightweight approach for enabling reasoning in small LLMs using LoRA adapters (a minimal LoRA layer is sketched after this list).
  • Implements budget forcing via reinforcement learning to minimize verbosity in reasoning outputs.
  • Utilizes parallel test-time scaling to improve accuracy without significantly increasing latency.
  • Presents a dynamic adapter-switching mechanism to optimize resource usage during inference.
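The adapter-based recipe in the first bullet rests on LoRA's low-rank update to frozen weights; a minimal PyTorch sketch follows. The rank, scaling, and choice of which layers to wrap are illustrative defaults, not the paper's configuration.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update B @ A (standard LoRA)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():        # the pretrained weights stay frozen
            p.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init => no-op at start
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.A.t() @ self.B.t())

# Only A and B are trained, so several task-specific "reasoning" adapters can be
# stored and swapped cheaply on device - the motivation behind adapter switching.
layer = LoRALinear(nn.Linear(512, 512), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")     # 2 * 8 * 512 = 8192
```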
Read more
Federated Multi Agent Deep Learning and Neural Networks for Advanced Distributed Sensing in Wireless Networks
Nadine Muller, Stefano DeRosa, Su Zhang, Chun Lee Huan
Reinforcement Learning Federated Learning Graph Learning
  • Presents a comprehensive taxonomy of multi-agent deep learning in wireless networks.
  • Emphasizes the integration of federated learning with multi-agent systems for privacy-aware intelligence.
  • Highlights various application domains including mobile edge computing (MEC), UAV networks, and intrusion detection.
  • Identifies key challenges such as scalability, security, and real-time constraints in 6G deployments.
Read more
Minimum-Action Learning: Energy-Constrained Symbolic Model Selection for Physical Law Identification from Noisy Data
Martin G. Frasch
Optimization Interpretability Theory
  • MAL effectively identifies physical laws from noisy data by minimizing a Triple-Action functional.
  • The wide-stencil acceleration-matching technique reduces noise variance significantly, enabling learnability (see the sketch after this list).
  • MAL achieved 100% identification accuracy for the true force law in all tested cases.
  • The framework combines symbolic model selection with energy-constrained optimization, enhancing interpretability.
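The noise-variance claim in the second bullet can be made concrete with a standard wide-stencil second difference: widening the stencil from spacing h to kh divides the noise variance of the acceleration estimate by roughly k^4, at the cost of extra truncation bias. The code below is a generic estimator of this kind, not necessarily MAL's exact stencil.

```python
import numpy as np

rng = np.random.default_rng(0)
h, sigma = 1e-3, 1e-4                       # sample spacing and measurement noise
t = np.arange(0.0, 2.0, h)
x_clean = np.sin(2 * np.pi * t)             # true trajectory
x = x_clean + rng.normal(scale=sigma, size=t.size)
a_true = -(2 * np.pi) ** 2 * np.sin(2 * np.pi * t)   # true acceleration

def accel(x, h, k):
    """Central second difference with half-width k: noise variance ~ 6*sigma^2/(k*h)^4."""
    a = (x[2 * k:] - 2 * x[k:-k] + x[:-2 * k]) / (k * h) ** 2
    return a, slice(k, x.size - k)          # indices where the estimate is defined

for k in (1, 20):
    a_hat, sl = accel(x, h, k)
    rmse = np.sqrt(np.mean((a_hat - a_true[sl]) ** 2))
    print(f"half-width k={k:2d}  RMSE vs true acceleration: {rmse:.3f}")
```

With these settings the narrow stencil is swamped by amplified noise while the wide stencil recovers the acceleration to within a fraction of a unit, which is the effect that makes downstream force-law identification learnable.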
Read more
Sample-Efficient Adaptation of Drug-Response Models to Patient Tumors under Strong Biological Domain Shift
Camille Jimenez Cortes, Philippe Lalanda, German Vega
Efficient ML
  • Proposes a novel staged transfer-learning framework for drug-response prediction.
  • Demonstrates that unsupervised pretraining improves few-shot adaptation to patient tumors.
  • Highlights the importance of separating representation learning from task supervision.
  • Provides insights into the latent-space geometry affecting adaptation efficiency.
Read more
Manifold-Matching Autoencoders
Laurent Cheret, Vincent Létourneau, Isar Nejadgholi, Chris Drummond, Hussein Al Osman, Maia Fraser
Theory Generative Models Efficient ML
  • Introduction of Manifold-Matching Autoencoders (MMAE) for improved dimensionality reduction.
  • Focus on aligning pairwise distances in latent space with input data distances.
  • MMAE shows superior performance in preserving geometric and topological structures.
  • Scalable approximation of Multidimensional Scaling (MDS) is achieved.
Read more
Abstraction as a Memory-Efficient Inductive Bias for Continual Learning
Elnaz Rahmati, Nona Ghazizadeh, Zhivar Sourati, Nina Rouhani, Morteza Dehghani
Theory Efficient ML Graph Learning
  • AAT introduces a lightweight, loss-level abstraction mechanism for online continual learning.
  • The method stabilizes learning by optimizing over both concrete instances and their abstract representations.
  • AAT outperforms standard instance-only learning and matches or exceeds experience replay baselines.
  • The paper introduces two new benchmarks for evaluating continual learning methods.
Read more
Unsupervised Symbolic Anomaly Detection
Md Maruf Hossain, Tim Katzke, Simon Klüttermann, Emmanuel Müller
Interpretability
  • SYRAN provides a transparent and interpretable approach to anomaly detection using symbolic regression.
  • The method generates human-readable equations that describe normal data patterns, allowing for direct inspection and validation.
  • SYRAN achieves competitive anomaly detection performance compared to existing state-of-the-art methods.
  • The approach is applicable across various domains without the need for labeled anomaly data.
Read more
Integrating Inductive Biases in Transformers via Distillation for Financial Time Series Forecasting
Yu-Chen Den, Kuan-Yu Chen, Kendro Vincent, Darby Tien-Hao Chang
Time Series
  • TIPS integrates multiple inductive biases into a unified Transformer model for financial forecasting.
  • The framework utilizes knowledge distillation to synthesize the strengths of bias-specialized teacher models.
  • TIPS outperforms existing state-of-the-art models in financial time series forecasting across multiple metrics.
  • The model demonstrates significant computational efficiency, requiring only 38% of the inference-time computation compared to alternatives.
Read more
The Importance of Being Smoothly Calibrated
Parikshit Gopalan, Konstantinos Stavropoulos, Kunal Talwar, Pranay Tankala
Theory
  • Introduces a new omniprediction guarantee for smoothly calibrated predictors.
  • Characterizes smooth calibration using the earth mover's distance to the nearest perfectly calibrated distribution (standard definitions are recalled below).
  • Demonstrates that estimating the upper distance to calibration is fundamentally limited.
  • Unifies and extends prior results on omniprediction from smooth calibration.
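As background for the second bullet, the smooth calibration error of a predictor f is usually defined by testing it against Lipschitz weight functions, and the lower distance to calibration is the smallest average perturbation needed to reach a perfectly calibrated predictor. Standard forms from the calibration literature (which may differ from the paper's exact normalizations) are:

```latex
\mathrm{smCE}(f) \;=\; \sup_{\substack{w:[0,1]\to[-1,1] \\ w \ \text{1-Lipschitz}}}
  \mathbb{E}\big[\, w\!\big(f(x)\big)\,\big(y - f(x)\big) \,\big],
\qquad
\underline{\mathrm{dCE}}(f) \;=\; \inf_{\substack{g \ \text{perfectly calibrated:} \\ \mathbb{E}[\,y \mid g(x)\,] \,=\, g(x)}}
  \mathbb{E}\,\big\lvert f(x) - g(x) \big\rvert .
```

The latter is an earth mover's (Wasserstein-1) style distance to the set of calibrated predictors, and smCE is known to approximate it up to constant factors, which is the relationship the bullet refers to.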
Read more
Cohomological Obstructions to Global Counterfactuals: A Sheaf-Theoretic Foundation for Generative Causal Models
Rui Wu, Hong Xie, Yongjun Li
Generative Models Theory Graph Learning
  • Identifies fundamental flaws in the assumption that local causal mechanisms yield global counterfactual coherence.
  • Introduces a sheaf-theoretic framework to model structural causal models over Wasserstein spaces.
  • Develops the Entropic Wasserstein Causal Sheaf Laplacian to resolve topological conflicts without singularities.
  • Demonstrates the effectiveness of the proposed framework in high-dimensional scRNA-seq counterfactuals.
Read more
Only relative ranks matter in weight-clustered large language models
Borja Aizpurua, Sukhbinder Singh, Román Orús
Large Language Models Efficient ML Theory
  • Relative ranks of weights are more important than their exact values in LLMs.
  • Weight clustering can significantly compress LLMs without retraining while preserving accuracy (see the sketch after this list).
  • Fine-tuning the cluster means can recover a portion of the accuracy loss at low cost.
  • Rank distortion leads to substantial performance degradation, while rank preservation maintains model quality.
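The claim that clustering weights onto a handful of shared values preserves their relative order can be checked on a toy weight matrix. The k-means-over-scalars sketch below is a generic quantizer for illustration, not the paper's clustering pipeline.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(256, 256)).ravel()      # a stand-in weight matrix

# Cluster the scalar weights into 16 shared values (roughly 4 bits per weight).
km = KMeans(n_clusters=16, n_init=10, random_state=0).fit(w.reshape(-1, 1))
w_clustered = km.cluster_centers_[km.labels_].ravel()

# In 1-D, k-means assigns each weight to its nearest center, so the mapping is
# monotone: relative ranks survive, up to ties within a cluster.
rho, _ = spearmanr(w, w_clustered)
print(f"distinct values after clustering: {np.unique(w_clustered).size}")
print(f"Spearman rank correlation original vs clustered: {rho:.4f}")
print(f"mean absolute weight change: {np.abs(w - w_clustered).mean():.5f}")
```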
Read more
Evidential Domain Adaptation for Remaining Useful Life Prediction with Incomplete Degradation
Yubo Hou, Mohamed Ragab, Yucheng Wang, Min Wu, Abdulla Alseiari, Chee-Keong Kwoh, Xiaoli Li, Zhenghua Chen
Time Series
  • EviAdapt addresses the limitations of existing domain adaptation methods in RUL prediction with incomplete degradation data.
  • The method segments data into distinct degradation stages for accurate stage-wise alignment.
  • Evidential uncertainty alignment is introduced to manage varying degradation patterns across domains.
  • Extensive experiments show that EviAdapt significantly outperforms current state-of-the-art techniques.
Read more
Federated Learning with Multi-Partner OneFlorida+ Consortium Data for Predicting Major Postoperative Complications
Yuanfang Ren, Varun Sai Vemuri, Zhenhong Hu, Benjamin Shickel, Ziyuan Guan, Tyler J. Loftus, Parisa Rashidi, Tezcan Ozrazgat-Baslanti, Azra Bihorac
Federated Learning
  • Federated learning models were developed to predict major postoperative complications using multicenter data.
  • The study included a large cohort of 358,644 patients and 494,163 surgical procedures.
  • Federated learning models showed superior or comparable predictive performance compared to local and central models.
  • The approach preserves patient data privacy while enhancing model generalizability.
Read more
DSS-GAN: Directional State Space GAN with Mamba backbone for Class-Conditional Image Synthesis
Aleksander Ogonowski, Konrad Klimaszewski, Przemysław Rokita
Generative Models Computer Vision
  • Introduction of DSS-GAN, the first GAN to use Mamba as a generator backbone for noise-to-image synthesis.
  • Development of the Directional Latent Routing (DLR) mechanism for improved class conditioning.
  • Achieves better performance metrics (FID, KID, precision-recall) than StyleGAN2-ADA with significantly fewer parameters.
  • Demonstrates that directional subvectors in the latent space allow for structured changes in synthesized images.
Read more
Capability-Guided Compression: Toward Interpretability-Aware Budget Allocation for Large Language Models
Rishaank Gupta
NLP Large Language Models Interpretability
  • Introduction of Capability-Guided Compression (CGC) framework for LLMs.
  • Capability density maps derived from Sparse Autoencoders provide a new signal for compression budget allocation.
  • Theoretical foundation linking capability density to component-level phase transitions.
  • Experimental validation shows independence of capability density from existing importance metrics.
Read more
QuantFL: Sustainable Federated Learning for Edge IoT via Pre-Trained Model Quantisation
Charuka Herath, Yogachandran Rahulamathavan, Varuna De Silva, Sangarapillai Lambotharan
Federated Learning Efficient ML
  • QUANTFL combines pre-trained model initialisation with structured quantisation to reduce communication costs in federated learning.
  • The framework achieves a 40% reduction in total communication while maintaining or exceeding accuracy compared to uncompressed baselines.
  • QUANTFL employs bucket-based quantisation schemes that adapt to the distribution of model updates, enhancing efficiency (a quantile-bucket sketch follows this list).
  • The method demonstrates robustness under non-IID data conditions, making it suitable for diverse IoT applications.
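The distribution-adaptive bucket quantisation in the third bullet can be illustrated with a simple quantile-bucket scheme: sort the update values, split them into equal-count buckets, and transmit one mean per bucket plus a small index per value. The bucket count and index width below are illustrative assumptions, not QUANTFL's actual scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
update = rng.standard_t(df=3, size=100_000).astype(np.float32)   # heavy-tailed model update

def bucket_quantise(values, n_buckets=256):
    """Quantile buckets: equal-count partition of the sorted values, one mean each."""
    order = np.argsort(values)
    bucket_of = np.empty(values.size, dtype=np.uint8)             # 8-bit index per value
    bucket_of[order] = np.arange(values.size) * n_buckets // values.size
    means = np.array([values[bucket_of == b].mean() for b in range(n_buckets)],
                     dtype=np.float32)
    return bucket_of, means

idx, means = bucket_quantise(update)
reconstructed = means[idx]

raw_bits = update.size * 32
sent_bits = update.size * 8 + means.size * 32                     # indices + bucket means
print(f"compression ratio: {raw_bits / sent_bits:.1f}x")
print(f"relative L2 error: {np.linalg.norm(update - reconstructed) / np.linalg.norm(update):.4f}")
```

Because the bucket boundaries follow the quantiles of the update itself, the scheme automatically spends its codebook where the values actually are, which is what "adapting to the distribution of model updates" means in practice.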
Read more
Topology-Preserving Deep Joint Source-Channel Coding for Semantic Communication
Omar Erak, Omar Alhussein, Fang Fang, Sami Muhaidat
Computer Vision Theory Optimization
  • Introduction of TopoJSCC, a topology-aware DeepJSCC framework.
  • Integration of persistent-homology regularizers for topology preservation.
  • Improved performance in topology preservation and PSNR under low SNR conditions.
  • End-to-end learning without the need for side information.
Read more
Translation Invariance of Neural Operators for the FitzHugh-Nagumo Model
Luca Pellegrini
Theory Efficient ML Time Series
  • Introduces a novel training strategy that exploits translation invariance in the FHN model (one simple instance is sketched after this list).
  • Benchmarks seven different Neural Operator architectures for modeling excitable cell dynamics.
  • CNOs excel in translated dynamics but require higher training costs.
  • FNOs achieve low training error but have high inference times and less accuracy on translated data.
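One simple way to exploit translation invariance, assuming a periodic domain as is common for spatially homogeneous FitzHugh-Nagumo simulations, is to augment each training pair with a shared random circular shift. Whether this matches the paper's actual strategy is an assumption; the sketch only illustrates the general idea.

```python
import numpy as np

rng = np.random.default_rng(0)

def translate_pair(u0, u1):
    """Apply the same random circular shift to an input snapshot and its target.
    If the dynamics are translation-invariant on a periodic domain, the shifted
    pair is an equally valid training example for a neural operator."""
    shift = rng.integers(0, u0.shape[-1])
    return np.roll(u0, shift, axis=-1), np.roll(u1, shift, axis=-1)

# Toy example: a travelling pulse and its state a short time later.
x = np.linspace(0, 1, 128, endpoint=False)
u0 = np.exp(-200 * (x - 0.30) ** 2)
u1 = np.exp(-200 * (x - 0.35) ** 2)          # pulse moved slightly to the right
u0_aug, u1_aug = translate_pair(u0, u1)
print("peak moved from", x[u0.argmax()], "to", x[u0_aug.argmax()])
```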
Read more
Binary Latent Protein Fitness Landscapes for Quantum Annealing Optimization
Truong-Son Hy
Optimization
  • Q-BioLat models protein fitness landscapes in binary latent spaces for efficient optimization.
  • The framework utilizes pretrained protein language models to create continuous embeddings that are binarized for optimization.
  • Empirical results show that Q-BioLat effectively identifies high-fitness protein variants.
  • Different optimization strategies exhibit distinct behaviors based on latent space dimensionality.
Read more
FEAT: A Linear-Complexity Foundation Model for Extremely Large Structured Data
Zhenghang Song, Tang Qian, Lu Chen, Yushuai Li, Zhengke Hu, Bingbing Fang, Yumeng Song, Junbo Zhao, Sheng Zhang, Tianyi Li
Efficient ML
  • FEAT addresses the O(N^2) complexity issue of traditional LDMs by utilizing linear-complexity encoding methods.
  • The model combines local and global attention mechanisms to preserve expressive representations in structured data.
  • FEAT incorporates a hybrid structural causal model for improved robustness in pre-training.
  • Empirical evaluations show significant performance improvements over existing models on real-world datasets.
Read more
Federated Distributional Reinforcement Learning with Distributional Critic Regularization
David Millard, Cecilia Alm, Rashid Ali, Pengcheng Shi, Ali Baheri
Reinforcement Learning Federated Learning Robotics
  • Introduction of FedDistRL, which federates distributional critics while keeping policies local.
  • Development of TR-FedDistRL, a barycentric regularization method that biases critic updates towards a risk-aware reference distribution.
  • Empirical demonstration of reduced mean-smearing and improved safety metrics compared to mean-oriented and non-federated baselines.
  • Theoretical stability results for the constrained critic update under a Wasserstein metric.
Read more
Personalized Fall Detection by Balancing Data with Selective Feedback Using Contrastive Learning
Awatif Yasmin, Tarek Mahmud, Sana Alamgeer, Anne H. H. Ngu
Time Series
  • The proposed framework effectively balances fall and non-fall activity data using semi-supervised contrastive learning.
  • Personalized models show improved recall and precision compared to traditional models trained on imbalanced datasets.
  • The Training from Scratch approach outperforms other retraining strategies, highlighting the importance of tailored data in model training.
  • The method simplifies the personalization process by automating sample selection, reducing the need for manual labeling.
Read more
PhasorFlow: A Python Library for Unit Circle Based Computing
Dibakar Sigdel, Namuna Panday
Theory Optimization Time Series
  • Introduction of the Phasor Circuit model with a comprehensive gate library.
  • Development of Variational Phasor Circuits for classical machine learning optimization.
  • Implementation of a Phasor Transformer that enhances token mixing without parameter overhead.
  • Validation of PhasorFlow on diverse tasks, showcasing its versatility and efficiency.
Read more
Objective Mispricing Detection for Shortlisting Undervalued Football Players via Market Dynamics and News Signals
Chinenye Omejieke, Shuyao Chen, Xia Cui
NLP
  • Introduces a reproducible framework for detecting undervalued football players based on objective mispricing.
  • Combines structured market data with NLP-derived signals from news articles to improve player valuation.
  • Demonstrates that market dynamics are the primary indicators of undervaluation, with NLP features providing additional insights.
  • Utilizes SHAP analyses for interpretability, enhancing trust in the model's recommendations.
Read more
OMNIFLOW: A Physics-Grounded Multimodal Agent for Generalized Scientific Reasoning
Hao Wu, Yongheng Zhang, Yuan Gao, Fan Xu, Fan Zhang, Ruobing Xie, Ruijian Gou, Yuxuan Liang, Xiaomeng Huang, Xian Wu
Multimodal Large Language Models Interpretability
  • OMNIFLOW is the first training-free framework for generalized fluid physical reasoning using LLMs.
  • The architecture enables zero-shot generalization to different governing equations with high prediction accuracy.
  • OMNIFLOW generates interpretable structured analysis reports, enhancing scientific discovery and decision-making.
Read more
Causal Representation Learning on High-Dimensional Data: Benchmarks, Reproducibility, and Evaluation Metrics
Alireza Sadeghi, Wael AbdAlmageed
Theory
  • Causal representation learning (CRL) models are essential for understanding causal relationships in high-dimensional data.
  • The paper critiques existing datasets and proposes characteristics for ideal datasets in CRL development.
  • An integrated evaluation framework is introduced to consolidate multiple performance metrics into a single score.
  • Reproducibility is highlighted as a critical issue, with recommendations for best practices in sharing code and results.
Read more
Auto-Unrolled Proximal Gradient Descent: An AutoML Approach to Interpretable Waveform Optimization
Ahmet Kaplan
Optimization Interpretability
  • Integration of AutoML with deep unfolding for waveform optimization.
  • Achieves high spectral efficiency with significantly fewer training samples.
  • Introduces a hybrid layer for learnable gradient transformation.
  • Addresses gradient normalization for improved training consistency.
Read more
What on Earth is AlphaEarth? Hierarchical structure and functional interpretability for global land cover
Ivan Felipe Benavides-Martinez, Justin Guthrie, Jhon Edwin Arias, Yeison Alberto Garces-Gomez, Angela Ines Guzman-Alvis, Cristiam Victoriano Portilla-Cabrera, Somnath Mondal, Andrew J. Allyn, Auroop R. Ganguly
Interpretability Multimodal Efficient ML
  • Introduces a functional interpretability framework for GAEF embeddings.
  • Identifies a hierarchical organization of embedding dimensions based on their roles.
  • Demonstrates that high classification accuracy can be achieved with only a few dimensions.
  • Highlights the redundancy in the embedding space, suggesting potential for computational efficiency.
Read more
ARES: Scalable and Practical Gradient Inversion Attack in Federated Learning through Activation Recovery
Zirui Gong, Leo Yu Zhang, Yanjun Zhang, Viet Vo, Tianqing Zhu, Shirui Pan, Cong Wang
Federated Learning
  • ARES enables high-fidelity reconstruction of training samples from large batches without architectural modifications.
  • The attack formulates the recovery problem as a noisy sparse recovery task solved with Lasso (the core formulation is sketched below).
  • The incorporation of the imprint method allows for scalable reconstruction of individual samples.
  • Theoretical guarantees are established for the recovery rate and reconstruction error.
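The "noisy sparse recovery via Lasso" formulation in the second bullet is, at its core, standard compressed sensing: recover a sparse vector from a few noisy linear measurements by l1-regularized regression. The sketch shows only that ingredient on synthetic data, not the gradient-inversion pipeline itself.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, m, s = 400, 120, 8                          # signal dim, measurements, nonzeros

x_true = np.zeros(n)
support = rng.choice(n, size=s, replace=False)
x_true[support] = rng.normal(size=s)

A = rng.normal(size=(m, n)) / np.sqrt(m)       # measurement (mixing) matrix
y = A @ x_true + 0.01 * rng.normal(size=m)     # noisy observations

lasso = Lasso(alpha=0.005, max_iter=50_000).fit(A, y)
x_hat = lasso.coef_
recovered = sorted(np.flatnonzero(np.abs(x_hat) > 0.05).tolist())
print("true support      :", sorted(support.tolist()))
print("recovered (large) :", recovered)
print("relative error    :", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```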
Read more
Beyond Reward Suppression: Reshaping Steganographic Communication Protocols in MARL via Dynamic Representational Circuit Breaking
Liu Hung Ming
Reinforcement Learning Theory Optimization
  • Introduces DRCB as a novel defense mechanism against steganographic collusion in MARL.
  • Demonstrates that existing static monitoring techniques are ineffective in reducing collusion.
  • Shows significant improvements in observer accuracy and reduced volatility under DRCB governance.
  • Highlights the Transparency Paradox, where agents achieve predictability while retaining covert communication capabilities.
Read more
DISCOVER: A Solver for Distributional Counterfactual Explanations
Yikai Gu, Lele Cao, Bo Zhao, Lei Lei, Lei You
Optimization Interpretability
  • DISCOVER is a model-agnostic solver that preserves the DCE objective while avoiding gradient-based optimization.
  • The method utilizes a sparse propose-and-select search to focus on the most influential samples for counterfactual generation.
  • An OT-guided cone sampling technique enhances the efficiency of candidate generation without relying on predictor gradients.
  • The approach successfully extends distributional counterfactual reasoning to non-differentiable models, making it applicable to a wider range of real-world scenarios.
Read more
Symmetry-Reduced Physics-Informed Learning of Tensegrity Dynamics
Jing Qin, Muhao Chen
Theory Efficient ML Robotics
  • Introduces SymPINN, a framework that incorporates geometric symmetries into tensegrity dynamics modeling.
  • Reduces computational complexity by using a symmetry basis for nodal coordinates.
  • Ensures predicted configurations satisfy symmetry constraints through symmetry transformations.
  • Demonstrates improved prediction accuracy and efficiency in numerical experiments.
Read more
SENSE: Efficient EEG-to-Text via Privacy-Preserving Semantic Retrieval
Akshaj Murhekar, Christina Liu, Abhijit Mishra, Shounak Roychowdhury, Jacek Gwizdka
NLP Large Language Models Multimodal
  • Introduces a lightweight EEG-to-text framework that avoids LLM fine-tuning.
  • Utilizes a CLIP-aligned EEG representation for semantic grounding and keyword inference.
  • Ensures privacy by keeping raw EEG data on-premises and only sharing extracted keywords.
  • Achieves comparable or improved performance over fine-tuned LLMs in generating text from EEG signals.
Read more
Variational Rectification Inference for Learning with Noisy Labels
Haoliang Sun, Qi Wei, Lei Feng, Yupeng Hu, Fan Liu, Hehe Fan, Yilong Yin
Theory Optimization
  • Introduces Variational Rectification Inference (VRI) for robust learning with noisy labels.
  • Formulates loss rectification as an amortized variational inference problem.
  • Utilizes a hierarchical Bayesian model to treat the rectifying vector as a latent variable.
  • Demonstrates improved generalization performance and avoids model collapse.
Read more
Discovering the Hidden Role of Gini Index In Prompt-based Classification
Ruixi Lin
NLP Large Language Models Optimization
  • The Gini Index serves as a valuable tool for detecting and optimizing class accuracy disparities in prompt-based classification (see the sketch after this list).
  • Significant relative accuracy imbalances exist in both text and image classification tasks, regardless of dimensionality.
  • A post-hoc model-agnostic bias mitigation method based on the Gini Index can effectively reduce accuracy imbalances.
  • The proposed method enhances the performance of minority classes while limiting the dominance of frequently seen head classes.
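Reading "Gini Index" here as the inequality coefficient computed over per-class accuracies (an assumption on our part; the paper may use a different variant), the disparity signal in the first bullet reduces to a few lines:

```python
import numpy as np

def gini(values):
    """Gini coefficient of non-negative values: 0 = perfectly equal, toward 1 = maximally unequal."""
    v = np.asarray(values, dtype=float)
    mean_abs_diff = np.abs(v[:, None] - v[None, :]).mean()
    return mean_abs_diff / (2 * v.mean())

balanced   = [0.82, 0.80, 0.78, 0.81]          # per-class accuracies, roughly equal
imbalanced = [0.95, 0.90, 0.35, 0.20]          # head classes dominate tail classes
print(f"Gini (balanced)  : {gini(balanced):.3f}")
print(f"Gini (imbalanced): {gini(imbalanced):.3f}")
```

A large Gini value flags exactly the kind of relative accuracy imbalance the bullets describe, and gives a scalar target that post-hoc mitigation can try to drive down.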
Read more
Evaluating Causal Discovery Algorithms for Path-Specific Fairness and Utility in Healthcare
Nitish Nagesh, Elahe Khatibi, Thomas Hughes, Mahdi Bagheri, Pratik Gajane, Amir M. Rahmani
Graph Learning
  • Establishment of causal graph benchmarks for synthetic and real-world clinical datasets.
  • Evaluation of causal discovery algorithms on structural recovery and path-specific fairness.
  • Identification of significant variations in fairness-utility ratios across different algorithms.
  • Highlighting the necessity for graph-aware fairness evaluations in clinical applications.
Read more
Determinism in the Undetermined: Deterministic Output in Charge-Conserving Continuous-Time Neuromorphic Systems with Temporal Stochasticity
Jing Yan, Kang You, Zhezhi He, Yaoyu Zhang
Theory Efficient ML
  • Development of a unified continuous-time framework for charge-conserving SNNs.
  • Establishment of deterministic output under temporal stochasticity through rigorous proof.
  • Exact representational correspondence between charge-conserving SNNs and QANNs.
  • Demonstration of unique terminal states that are invariant to spike timing.
Read more
On the Cone Effect and Modality Gap in Medical Vision-Language Embeddings
David Restrepo, Miguel L Martins, Chenwei Wu, Luis Filipe Nakayama, Diego M Lopez, Stergios Christodoulidis, Maria Vakalopoulou, Enzo Ferrante
Multimodal
  • Introduces a post-hoc mechanism to adjust modality gap in VLMs without retraining.
  • Demonstrates that the modality gap significantly affects performance in medical datasets.
  • Finds that optimal separation is task-dependent, challenging the notion of universally minimizing the modality gap.
  • Highlights the pronounced cone effect in medical domains due to lower diversity in data.
Read more
Formal verification of tree-based machine learning models for lateral spreading
Krishna Kumar
Theory Interpretability
  • Introduces formal verification via SMT solvers for tree-based geotechnical ML models.
  • Formalizes four key geotechnical specifications for model compliance.
  • Demonstrates the limitations of post-hoc explainability methods in ensuring model consistency.
  • Establishes a verify-fix-verify engineering loop for improving model reliability.
Read more
FederatedFactory: Generative One-Shot Learning for Extremely Non-IID Distributed Scenarios
Andrea Moleri, Christian Internò, Ali Raza, Markus Olhofer, David Klindt, Fabio Stella, Barbara Hammer
Federated Learning Generative Models Computer Vision
  • FederatedFactory achieves centralized performance in extreme single-class silo scenarios, significantly improving accuracy from 11.36% to 90.57% on CIFAR-10.
  • The framework operates with zero dependency on external pre-trained models, relying solely on localized generative priors.
  • It utilizes a one-shot communication strategy, enhancing efficiency by avoiding multiple rounds of data transmission.
  • The architecture supports exact modular unlearning, allowing for the removal of specific client contributions without data leakage.
Read more
Learning Permutation Distributions via Reflected Diffusion on Ranks
Sizhuang He, Yangtian Zhang, Shiyang Zhang, David van Dijk
Generative Models Optimization
  • Introduction of Soft-Rank Diffusion for learning permutation distributions.
  • Utilization of a continuous soft-rank representation to enable smoother diffusion processes.
  • Development of contextualized generalized Plackett–Luce (cGPL) denoisers for enhanced expressivity.
  • Demonstrated superior performance on permutation generation tasks compared to existing methods.
Read more
The Phasor Transformer: Resolving Attention Bottlenecks on the Unit Circle
Dibakar Sigdel
Time Series Efficient ML Theory
  • Introduction of the Phasor Transformer block as a phase-native alternative to dense attention layers.
  • Achieves global token mixing with O(N log N) complexity using DFT token coupling (an FFT-mixing stand-in is sketched after this list).
  • Demonstrates competitive performance in time-series forecasting with fewer parameters than traditional Transformers.
  • Establishes a new efficiency-performance frontier for long-context temporal modeling.
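The O(N log N) DFT token coupling in the second bullet is reminiscent of FFT-based token mixing as popularized by FNet; a minimal stand-in is sketched below. The actual Phasor Transformer works phase-natively on the unit circle, so treat this only as an illustration of how a transform can replace quadratic attention.

```python
import torch
import torch.nn as nn

class FFTTokenMixer(nn.Module):
    """Replace O(N^2) self-attention with an FFT along the token dimension:
    global mixing in O(N log N), no attention matrix to materialize.
    FNet-style stand-in, not the Phasor Transformer's phase-native coupling."""

    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (batch, seq, d_model)
        mixed = torch.fft.fft(x, dim=1).real               # couple all tokens via the DFT
        x = self.norm(x + mixed)
        return x + self.mlp(x)

block = FFTTokenMixer(d_model=64)
out = block(torch.randn(2, 1024, 64))
print(out.shape)          # torch.Size([2, 1024, 64])
```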
Read more
Multi-Agent Reinforcement Learning for Dynamic Pricing: Balancing Profitability, Stability and Fairness
Krishna Kumar Neelakanta Pillai, Santha Kumari Amma
Reinforcement Learning Optimization
  • MAPPO outperforms other algorithms in terms of profit and stability.
  • MADDPG achieves fairer profit distribution among agents despite lower overall profit.
  • The study highlights the importance of stability and reproducibility in MARL for dynamic pricing.
  • Insights on trade-offs between exploration and reliability are provided, particularly regarding MASAC.
Read more
SpecMoE: Spectral Mixture-of-Experts Foundation Model for Cross-Species EEG Decoding
D. Darankoum, C. Habermacher, J. Volle, S. Grudinin
Time Series
  • Introduction of a Gaussian-smoothed masking strategy for EEG signal pretraining.
  • Development of SpecHi-Net, a hierarchical architecture for multi-scale feature extraction.
  • Implementation of a spectral gating mechanism in a mixture of experts framework.
  • Demonstration of state-of-the-art performance in diverse EEG decoding tasks.
Read more
A foundation model for electrodermal activity data
Leonardo Alchieri, Matteo Garzon, Lidia Alecci, Francesco Bombassei De Bona, Martin Gjoreski, Giovanni De Felice, Silvia Santini
Time Series
  • Introduction of UME, the first foundation model specifically for EDA data.
  • Compilation of EDAMAME, a large-scale EDA dataset from 24 public sources.
  • UME outperforms baseline models and matches generalist models with significantly lower computational costs.
  • Challenges in EDA modeling are acknowledged, indicating the need for further research.
Read more
Game-Theory-Assisted Reinforcement Learning for Border Defense: Early Termination based on Analytical Solutions
Goutam Das, Michael Dorothy, Kyle Volle, Daigo Shishika
Reinforcement Learning Theory Optimization
  • Introduces a hybrid framework combining game theory with multi-agent reinforcement learning (MARL).
  • Achieves significant improvements in training efficiency, with higher rewards and faster convergence.
  • Utilizes the Apollonius Circle for Nash equilibrium computation, allowing for early termination of RL episodes (the circle's closed form is recalled below).
  • Demonstrates effectiveness across different team sizes in border defense scenarios.
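For reference, the Apollonius circle invoked in the third bullet has a simple closed form: for points A and B and a ratio k ≠ 1, the locus of points whose distance to A is k times the distance to B is the circle

```latex
\{\,X : \lVert X - A\rVert = k\,\lVert X - B\rVert \,\}
\;=\;
\Big\{\,X : \Big\lVert X - \tfrac{A - k^{2} B}{1 - k^{2}}\Big\rVert
      = \tfrac{k\,\lVert A - B\rVert}{\lvert 1 - k^{2}\rvert}\,\Big\}.
```

With A the evader, B the pursuer, and k the evader-to-pursuer speed ratio, the interior of this circle is the region the evader can reach before the pursuer; our reading of the bullet is that this closed-form region is what lets episodes terminate early once the outcome is geometrically decided.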
Read more
CircuitBuilder: From Polynomials to Circuits via Reinforcement Learning
Weikun K. Zhang, Rohan Pandey, Bhaumik Mehta, Kaijie Jin, Naomi Morato, Archit Ganapule, Michael Ruofan Zeng, Jarod Alper
Reinforcement Learning Theory Efficient ML
  • The paper formulates the problem of arithmetic circuit synthesis as a single-player game for RL agents.
  • Two RL methods are compared: PPO+MCTS and SAC, with SAC showing better performance on simpler tasks.
  • PPO+MCTS demonstrates scalability to more complex polynomial instances.
  • The study suggests that RL can effectively navigate the vast search space of arithmetic circuits.
Read more
WINFlowNets: Warm-up Integrated Networks Training of Generative Flow Networks for Robotics and Machine Fault Adaptation
Zahin Sufiyan, Shadan Golestan, Yoshihiro Mitsuka, Shotaro Miwa, Osmar Zaiane
Reinforcement Learning Generative Models Robotics
  • WINFlowNets introduces a co-training framework for flow and retrieval networks, enhancing adaptability in dynamic environments.
  • The two-phase training strategy (Warm-Up and Dual-Training) eliminates the need for pre-training the retrieval network.
  • Experimental results show significant improvements in performance and stability over standard CFlowNets and leading RL algorithms.
  • WINFlowNets demonstrates strong adaptability in fault environments, making it suitable for real-world robotic applications.
Read more
Baguan-TS: A Sequence-Native In-Context Learning Model for Time Series Forecasting with Covariates
Linxiao Yang, Xue Jiang, Gezheng Xu, Tian Zhou, Min Yang, ZhaoYang Zhu, Linyuan Geng, Zhipeng Zeng, Qiming Chen, Xinyue Gu, Rong Jin, Liang Sun
Time Series
  • Baguan-TS unifies end-to-end representation learning with in-context learning for time series forecasting.
  • The model employs a 3D Transformer architecture that attends to temporal, variable, and context dimensions.
  • A Y-space retrieval-based calibration module improves model stability and forecasting accuracy.
  • The context-overfitting strategy enhances robustness by balancing denoising and sample selection.
Read more
The Agentic Researcher: A Practical Guide to AI-Assisted Research in Mathematics and Machine Learning
Max Zimmer, Nico Pelleriti, Christophe Roux, Sebastian Pokutta
Theory Optimization Large Language Models
  • Introduces a five-level taxonomy of AI integration in research.
  • Presents an open-source framework for using CLI coding agents as autonomous research assistants.
  • Demonstrates the framework's application through case studies in mathematics and machine learning.
  • Emphasizes the importance of human oversight and augmentation in AI-assisted research.
Read more
Conditional Inverse Learning of Time-Varying Reproduction Numbers Inference
Lanlan Yu, Quan-Hui Liu, Haoyue Zheng, Xinfu Yang
Time Series
  • CIRL addresses the ill-posed inverse problem of estimating time-varying reproduction numbers from epidemic data.
  • The framework combines epidemiological constraints with data-driven modeling to enhance adaptability to changing dynamics.
  • CIRL employs a Conditional Inverse Mapping Network and a Statistical Observation and Consistency Module to improve estimation accuracy.
  • Experiments validate the robustness of CIRL against observation noise and its responsiveness to abrupt transmission changes.
Read more
FlashSampling: Fast and Memory-Efficient Exact Sampling
Tomas Ruiz, Zhen Qin, Yifan Zhang, Xuyang Shen, Yiran Zhong, Mengdi Wang
NLP Large Language Models Efficient ML
  • FlashSampling fuses exact sampling into the LM-head matmul, eliminating the need to materialize the full logits tensor.
  • The method computes logits tile by tile and retains only the essential candidates, reducing memory traffic and improving efficiency (a streaming-sampling sketch follows this list).
  • FlashSampling achieves exact sampling without approximations, maintaining accuracy while enhancing performance.
  • The approach demonstrates significant speedups in end-to-end vLLM experiments across multiple GPU architectures.
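One way to see why exact sampling never needs the full logits tensor is the Gumbel-max trick: adding independent Gumbel noise to each logit and taking the argmax yields an exact sample from the softmax, and that argmax can be tracked tile by tile. The sketch below illustrates this principle only; FlashSampling's fused-kernel mechanics are different and more involved.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, tile = 50_000, 4_096
hidden = rng.normal(size=256)
lm_head = rng.normal(size=(vocab, 256)) / 16.0     # stand-in LM-head weight matrix

def sample_streaming(hidden, lm_head, tile):
    """Exact softmax sample without materializing all logits: Gumbel-max, tile by tile."""
    best_val, best_idx = -np.inf, -1
    for start in range(0, lm_head.shape[0], tile):
        logits = lm_head[start:start + tile] @ hidden           # only this tile's logits
        gumbel = -np.log(-np.log(rng.random(logits.size)))      # Gumbel(0, 1) noise
        k = int(np.argmax(logits + gumbel))
        if logits[k] + gumbel[k] > best_val:                    # running winner across tiles
            best_val, best_idx = logits[k] + gumbel[k], start + k
    return best_idx

token = sample_streaming(hidden, lm_head, tile)
print("sampled token id:", token)
```

Only one tile of logits is ever held in memory, yet the returned token follows exactly the same softmax distribution as sampling from the full logits vector.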
Read more