AI-generated summaries

Today's ML research,
without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

48 Papers today
8h Update frequency
7 Days of history
Wireless communication empowers online scheduling of partially-observable transportation multi-robot systems in a smart factory
Yaxin Liao, Qimei Cui, Kwang-Cheng Chen, Xiong Li, Jinlian Chen, Xiyu Zhao, Xiaofeng Tao, Ping Zhang
Robotics Optimization
  • Introduces a communication-enabled online scheduling framework for transportation multi-robot systems (T-MRS) in smart factories.
  • Integrates wireless M2M networking with route scheduling to enhance AGV coordination.
  • Demonstrates significant improvements in scheduling efficiency compared to traditional methods.
  • Highlights the differences between M2M and human-to-human communication in the context of scheduling.
Read more
Language-Assisted Image Clustering Guided by Discriminative Relational Signals and Adaptive Semantic Centers
Jun Ma, Xu Zhang, Zhengxing Jiao, Yaxin Hou, Hui Liu, Junhui Hou, Yuheng Jia
Computer Vision NLP Multimodal
  • Proposes a new framework for Language-Assisted Image Clustering (LAIC) addressing key limitations of existing methods.
  • Enhances inter-class discriminability by utilizing cross-modal relations for self-supervision signals.
  • Implements prompt learning to create adaptive semantic centers for improved clustering assignments.
  • Achieves an average performance improvement of 2.6% over state-of-the-art methods across multiple datasets.
Read more
Rethinking Multimodal Fusion for Time Series: Auxiliary Modalities Need Constrained Fusion
Seunghan Lee, Jun Seo, Jaehoon Lee, Sungdong Yoo, Minjae Kim, Tae Yoon Lim, Dongwan Kang, Hwanil Choi, SoonYoung Lee, Wonbin Ahn
Time Series Multimodal
  • Naive multimodal fusion strategies often underperform compared to unimodal time series (TS) models.
  • Constrained fusion methods, including the proposed Controlled Fusion Adapter (CFA), significantly improve performance.
  • CFA allows for controlled integration of auxiliary textual information without modifying the TS backbone.
  • The study involved over 20,000 experiments across diverse datasets and models, validating the effectiveness of constrained fusion.
Read more
Manifold Generalization Provably Precedes Memorization in Diffusion Models
Zebang Shen, Ya-Ping Hsieh, Niao He
Generative Models Theory
  • Diffusion models can generate novel samples with coarse scores by capturing the geometry of the data.
  • The manifold hypothesis provides a framework for understanding generalization in diffusion models.
  • Generalization occurs at a faster statistical rate than full density estimation, especially for smooth manifolds.
  • Coarse score accuracy can still yield fine on-manifold coverage, enabling high-quality sample generation.
Read more
DeepDTF: Dual-Branch Transformer Fusion for Multi-Omics Anticancer Drug Response Prediction
Yuhan Zhao, Jacob Tennant, James Yang, Zhishan Guo, Young Whang, Ning Sui
Multimodal Graph Learning Interpretability
  • DeepDTF integrates multi-omics data and drug structures using a dual-branch Transformer architecture.
  • The model achieves superior performance on drug response prediction tasks compared to existing baselines.
  • It includes an interpretability module that connects predictions to biological pathways and gene attributions.
  • DeepDTF addresses challenges of cross-modal misalignment and high-dimensional data in cancer drug response modeling.
Read more
Probabilistic Geometric Alignment via Bayesian Latent Transport for Domain-Adaptive Foundation Models
Kuepon Aueawatthanaphisut
Theory
  • Introduction of a novel uncertainty-aware probabilistic latent transport framework for foundation model adaptation.
  • Development of a Bayesian transport operator for geometry-preserving feature transfer under distributional shifts.
  • Integration of optimal transport dynamics with PAC-Bayesian generalization control, providing theoretical guarantees.
  • Empirical results demonstrate superior performance in latent manifold alignment and uncertainty calibration.
Read more
Safe Reinforcement Learning with Preference-based Constraint Inference
Chenglin Li, Guangchun Ruan, Hua Geng
Reinforcement Learning Robotics Optimization
  • Introduces PbCRL, a novel method for inferring safety constraints from human preferences.
  • Addresses limitations of traditional Bradley-Terry models in capturing heavy-tailed cost distributions.
  • Incorporates a dead zone mechanism and SNR loss to improve exploration and constraint alignment.
  • Demonstrates superior performance in safety and reward compared to existing methods.
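The Bradley-Terry model whose limitations the paper addresses is a standard logistic preference model; a minimal sketch with hypothetical trajectory scores (not the paper's cost formulation):

```python
import math

def bradley_terry_prob(score_a: float, score_b: float) -> float:
    """P(A preferred over B) under the Bradley-Terry model: a logistic
    function of the score difference between the two options."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))

# A trajectory with a higher score (e.g. lower safety cost) should be preferred.
p = bradley_terry_prob(score_a=2.0, score_b=0.5)
print(round(p, 3))  # 0.818
```

Heavy-tailed cost distributions break the symmetric logistic link assumed here, which is what motivates the paper's alternative.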
Read more
Generalizing Dynamics Modeling More Easily from Representation Perspective
Yiming Wang, Zhengnan Zhang, Genghe Zhang, Jiawen Dan, Changchun Li, Chenlong Hu, Chris Nugent, Jun Liu, Ximing Li, Bo Yang
Time Series
  • Introduction of a generalized Pre-trained Dynamics EncoDER (PDEDER) for improved dynamics modeling.
  • Utilization of the Lyapunov exponent to minimize chaotic behavior in the latent space.
  • Incorporation of reconstruction and forecasting objectives to enhance model performance.
  • Evaluation on 12 dynamic systems shows significant improvements in forecasting accuracy.
Read more
Central Dogma Transformer III: Interpretable AI Across DNA, RNA, and Protein
Nobuyuki Ota
Interpretability Multimodal
  • CDT-III aligns its architecture with the central dogma, enhancing interpretability and biological relevance.
  • The two-stage architecture effectively separates transcription and translation processes, improving prediction accuracy.
  • Joint prediction of RNA and protein changes leads to better performance and interpretability.
  • The model can predict clinical side effects and generate hypotheses without clinical data, showcasing its practical applications.
Read more
Instruction-Tuned, but Not More Verifiable Instruction-Following: A Cross-Task Diagnosis for LoRA Adapters
Junyi Zou
NLP Large Language Models
  • Nominal training objectives do not consistently predict actual performance improvements across tasks.
  • The concept of 'capability drift' describes the mismatch between nominal labels and realized capabilities.
  • Routine cross-task evaluations are essential before deploying models to avoid unintended performance shifts.
  • Different benchmarks operationalize instruction following differently, leading to mixed evidence across evaluations.
Read more
From Arithmetic to Logic: The Resilience of Logic and Lookup-Based Neural Networks Under Parameter Bit-Flips
Alan T. L. Bacellar, Sathvik Chemudupati, Shashank Nag, Allison Seigler, Priscila M. V. Lima, Felipe M. G. França, Lizy K. John
Theory Efficient ML
  • Resilience against bit-flip errors is a structural property of neural architectures.
  • Lower precision, higher sparsity, bounded activations, and shallow depth improve resilience.
  • Logic and Lookup-Based Neural Networks (LUT-NNs) demonstrate superior stability under corruption.
  • A novel Even-Layer Recovery effect is observed in logic-based architectures.
Read more
TuneShift-KD: Knowledge Distillation and Transfer for Fine-tuned Models
Yushi Guan, Jeanine Ohene-Agyei, Daniel Kwan, Jean Sebastien Dandurand, Yifei Zhang, Nandita Vijaykumar
NLP Large Language Models Efficient ML
  • TuneShift-KD automates the distillation of specialized knowledge from fine-tuned models to target models.
  • The method relies on identifying perplexity differences to create a synthetic training dataset.
  • It does not require access to original training data or additional training of discriminators.
  • Models fine-tuned with TuneShift-KD show improved accuracy over previous knowledge transfer methods.
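Perplexity itself is straightforward to compute from per-token log-probabilities; a hedged sketch with made-up values, not TuneShift-KD's actual data-selection pipeline:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the average negative log-probability per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical per-token log-probs from a fine-tuned vs. a base model
# on the same in-domain text.
finetuned = [-0.5, -0.3, -0.4]
base      = [-1.2, -1.5, -1.0]
gap = perplexity(base) - perplexity(finetuned)
print(gap > 0)  # True: the fine-tuned model is less "surprised" by this text
```

Texts where this gap is large are exactly the ones that carry specialized knowledge, which is the signal such a method can exploit when building a synthetic dataset.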
Read more
On the Use of Bagging for Local Intrinsic Dimensionality Estimation
Kristóf Péter, Ricardo J. G. B. Campello, James Bailey, Michael E. Houle
Theory
  • Introduces bagging as a variance-reduction technique for LID estimation.
  • Analyzes the complex interplay between sampling rate, neighborhood size, and ensemble size.
  • Demonstrates significant improvements in estimation accuracy through empirical results.
  • Proposes methods for combining bagging with neighborhood smoothing for enhanced performance.
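Bagging in its generic form averages an estimator over bootstrap subsamples to reduce variance; a minimal sketch using the sample mean as a stand-in estimator (the paper's LID estimators are not reproduced here):

```python
import random
import statistics

def bagged_estimate(data, estimator, n_bags=50, sample_frac=0.8, seed=0):
    """Average an estimator over bootstrap subsamples (sampling with
    replacement) to reduce its variance."""
    rng = random.Random(seed)
    k = max(1, int(sample_frac * len(data)))
    estimates = [estimator(rng.choices(data, k=k)) for _ in range(n_bags)]
    return statistics.mean(estimates)

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(bagged_estimate(data, statistics.mean))  # close to the true mean of 3.0
```

The sampling rate (`sample_frac`), the subsample size, and the ensemble size (`n_bags`) are exactly the knobs whose interplay the paper analyzes for LID estimation.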
Read more
Towards Effective Experiential Learning: Dual Guidance for Utilization and Internalization
Fei Bai, Zhipeng Chen, Chuan Hao, Ming Yang, Ran Tao, Bryan Dai, Wayne Xin Zhao, Jian Yang, Hongteng Xu
NLP Large Language Models Reinforcement Learning
  • DGO introduces a unified framework that combines external and internal experience for improved training effectiveness.
  • The framework operates through a closed-loop system of experience utilization and internalization.
  • DGO consistently outperforms baseline methods, demonstrating enhanced reasoning capabilities in LLMs.
  • The method achieves an average score of 32.41% on six benchmarks, improving to 39.38% with test-time scaling.
Read more
Forecasting with Guidance: Representation-Level Supervision for Time Series Forecasting
Jiacheng Wang, Liang Fan, Baihua Li, Luyan Zhang
Time Series
  • Identifies limitations of error-only supervision in deep learning-based time series forecasting.
  • Introduces ReGuider, a plug-in method for representation-level supervision using pretrained time series foundation models.
  • Demonstrates that ReGuider enhances the expressiveness of temporal representations in forecasting models.
  • Shows consistent improvements in forecasting accuracy across various datasets and architectures.
Read more
Permutation-Symmetrized Diffusion for Unconditional Molecular Generation
Gyeonghoon Ko, Juho Lee
Generative Models
  • Introduces a direct modeling approach for diffusion on the quotient manifold to achieve permutation invariance.
  • Derives an explicit expression for the heat kernel on the quotient manifold, enhancing understanding of diffusion dynamics.
  • Utilizes MCMC to approximate the permutation-symmetrized score for training.
  • Demonstrates competitive performance in unconditional molecular generation tasks on the QM9 dataset.
Read more
A Direct Classification Approach for Reliable Wind Ramp Event Forecasting under Severe Class Imbalance
Alejandro Morales-Hernández, Fabrizio De Caro, Gian Marco Paldino, Pascal Tribel, Alfredo Vaccaro, Gianluca Bontempi
Time Series
  • Introduces a direct classification approach for forecasting wind power ramp events (WPREs), addressing severe class imbalance.
  • Develops a data preprocessing strategy that enhances feature extraction from power observations.
  • Combines majority-class undersampling with ensemble learning to improve model performance.
  • Achieves over 85% accuracy and 88% weighted F1 score in numerical simulations.
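Majority-class undersampling is a standard rebalancing step; a minimal sketch with hypothetical labels (1 = rare ramp event, 0 = normal operation), not the paper's preprocessing pipeline:

```python
import random

def undersample_majority(samples, labels, seed=0):
    """Randomly drop majority-class samples until classes are balanced."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    kept = minority + rng.sample(majority, len(minority))
    return [samples[i] for i in kept], [labels[i] for i in kept]

X = list(range(10))                 # hypothetical feature windows
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # 2 ramp events among 10 windows
Xb, yb = undersample_majority(X, y)
print(sum(yb), len(yb))  # 2 4  (balanced: 2 positives, 2 negatives)
```

Because each undersampled subset discards information, combining several such subsets in an ensemble (as the paper does) recovers coverage of the majority class.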
Read more
Precision-Varying Prediction (PVP): Robustifying ASR systems against adversarial attacks
Matías Pizarro, Raghavan Narasimhan, Asja Fischer
Audio & Speech
  • PVP enhances ASR robustness by varying numerical precision during inference.
  • The method does not require retraining or access to model internals.
  • A lightweight detection strategy is proposed based on transcription consistency across precision modes.
  • Experiments show significant improvements in robustness and detection performance across multiple ASR models.
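The core idea, checking whether outputs stay consistent across numerical precisions, can be sketched on a toy linear layer (an illustration of the principle, not the paper's ASR pipeline):

```python
import numpy as np

def precision_consistency(x, weights):
    """Run the same linear layer at float32 and float16 and return the
    largest output disagreement. Adversarial perturbations tuned at one
    precision often fail to transfer, so a large cross-precision gap
    can flag an attacked input."""
    y32 = x.astype(np.float32) @ weights.astype(np.float32)
    y16 = (x.astype(np.float16) @ weights.astype(np.float16)).astype(np.float32)
    return float(np.max(np.abs(y32 - y16)))

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8))
w = rng.standard_normal((8, 4))
print(precision_consistency(x, w) < 0.1)  # benign input: outputs agree closely
```

A detection rule in this spirit would threshold the disagreement (here on logits; the paper compares transcriptions), requiring no retraining or model internals.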
Read more
Conditionally Identifiable Latent Representation for Multivariate Time Series with Structural Dynamics
Minkey Chang, Jae-Young Kim
Time Series
  • Introduction of the Identifiable Variational Dynamic Factor Model (iVDFM) for multivariate time series.
  • Achieves identifiability by conditioning on the innovation process rather than latent states.
  • Utilizes linear diagonal dynamics to preserve identifiability and ensure computational efficiency.
  • Demonstrates improved factor recovery and intervention accuracy on synthetic and real-world data.
Read more
Steering Code LLMs with Activation Directions for Language and Library Control
Md Mahbubur Rahman, Arjun Guha, Harshitha Menon
Large Language Models NLP
  • Code LLMs exhibit strong implicit preferences for specific programming languages and libraries.
  • Layer-wise activation directions can be estimated to steer model outputs effectively.
  • Interventions can influence code generation even under neutral or conflicting prompts.
  • Steering strength varies by model and target, with risks of quality degradation from strong interventions.
Read more
Beyond Accuracy: Introducing a Symbolic-Mechanistic Approach to Interpretable Evaluation
Reza Habibi, Darian Lee, Magy Seif El-Nasr
NLP Interpretability
  • Traditional accuracy metrics fail to reliably distinguish between generalization and memorization in machine learning models.
  • The proposed symbolic-mechanistic evaluation framework combines symbolic rules with mechanistic interpretability to provide deeper insights into model behavior.
  • A case study on NL-to-SQL tasks illustrates the limitations of standard evaluation metrics, revealing hidden failures in models that appear competent based on accuracy alone.
  • The authors emphasize the need for mechanism-aware evaluation, particularly for tasks with clear algorithmic requirements.
Read more
Research on Individual Trait Clustering and Development Pathway Adaptation Based on the K-means Algorithm
Qianru Wei, Jihaoyu Yang, Cheng Zhang, Jinming Yang
Theory
  • Utilizes K-means clustering to categorize students based on individual traits.
  • Focuses on students' suitability for specific career paths rather than merely predicting career outcomes.
  • Provides targeted career guidance based on clustering results, enhancing personalized support.
  • Demonstrates the effectiveness of data-driven approaches in improving employment success rates for students.
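K-means itself is standard; a minimal Lloyd's-algorithm sketch on hypothetical trait vectors (not the study's student data or feature design):

```python
import numpy as np

def kmeans(X, k=2, iters=10):
    """Plain Lloyd's algorithm: alternate nearest-centroid assignment
    and centroid update (initialized from the first k points)."""
    centers = X[:k].copy()
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each centroid as the mean of its cluster.
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels

# Two well-separated hypothetical trait profiles (e.g. skill scores).
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.2, 5.1]])
labels = kmeans(X)
print(labels[0] == labels[1], labels[2] == labels[3])  # True True
```

Each resulting cluster would then be mapped to the career-path guidance found most effective for similar profiles.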
Read more
Multimodal Training to Unimodal Deployment: Leveraging Unstructured Data During Training to Optimize Structured Data Only Deployment
Zigui Wang, Minghui Sun, Jiang Shu, Matthew M. Engelhard, Lauren Franz, Benjamin A. Goldstein
Multimodal
  • Introduces a multimodal learning framework that leverages unstructured EHR data for training while deploying a structured-only model.
  • Utilizes contrastive learning and knowledge distillation to transfer knowledge from a teacher model to a student model.
  • Achieves an AUROC of 0.705, outperforming the structured-only baseline of 0.656.
  • Highlights the importance of unstructured data in enhancing model performance in clinical settings.
Read more
Unveiling Hidden Convexity in Deep Learning: a Sparse Signal Processing Perspective
Emi Zeger, Mert Pilanci
Theory Optimization Interpretability
  • Convex equivalences of ReLU neural networks can simplify optimization and enhance theoretical understanding.
  • Reframing neural network training as a convex optimization task allows for efficient global optimization.
  • The paper presents an equivalence theorem connecting two-layer ReLU networks to convex group Lasso problems.
  • Experimental results indicate performance benefits when applying convex optimization frameworks to neural network training.
Read more
Linear-Nonlinear Fusion Neural Operator for Partial Differential Equations
Heng Wu, Junjie Wang, Benzhuo Lu
Efficient ML Theory Interpretability
  • Introduction of a linear-nonlinear multiplicative fusion mechanism for improved training efficiency.
  • LNF-NO architecture effectively decouples linear and nonlinear effects for better representation.
  • Demonstrated significant training speed improvements (up to 2.7x faster) compared to existing models.
  • Achieves comparable or better accuracy across various PDE benchmarks.
Read more
Lightweight Fairness for LLM-Based Recommendations via Kernelized Projection and Gated Adapters
Nan Cui, Wendy Hui Wang, Yue Ning
NLP Large Language Models Efficient ML
  • Proposes a lightweight bias mitigation method for LLM-based recommendations.
  • Combines kernelized INLP for bias removal with a gated MoE adapter for utility restoration.
  • Achieves fairness improvements without sacrificing recommendation accuracy.
  • No additional trainable parameters are required, making it computationally efficient.
Read more
A Learning Method with Gap-Aware Generation for Heterogeneous DAG Scheduling
Ruisong Zhou, Haijun Zou, Li Zhou, Chumin Sun, Zaiwen Wen
Reinforcement Learning Optimization Theory
  • WeCAN framework effectively addresses scheduling of heterogeneous DAGs using reinforcement learning.
  • Introduces a two-stage single-pass design for efficient schedule generation.
  • Develops an order-space analysis to identify and eliminate generation-induced optimality gaps.
  • Demonstrates superior performance in makespan compared to existing scheduling methods.
Read more
Robustness Quantification for Discriminative Models: a New Robustness Metric and its Application to Dynamic Classifier Selection
Rodrigo F. L. Lassance, Jasper De Bock
Theory
  • Introduction of a new robustness metric applicable to any probabilistic discriminative classifier.
  • The metric is based on Constant Odds Ratio (COR) perturbation, allowing for broader applicability.
  • Demonstrated correlation with accuracy through experiments using Accuracy Rejection Curves.
  • Application of the metric in dynamic classifier selection to improve prediction reliability.
Read more
Efficient Controller Learning from Human Preferences and Numerical Data Via Multi-Modal Surrogate Models
Lukas Theiner, Maik Pfefferkorn, Yongpeng Zhao, Sebastian Hirt, Rolf Findeisen
Optimization Robotics Multimodal
  • Introduces a multi-fidelity, multi-modal Bayesian optimization framework.
  • Integrates low-fidelity numerical data with high-fidelity human preferences.
  • Utilizes Gaussian process surrogate models for efficient learning.
  • Demonstrates application in tuning an autonomous vehicle's trajectory planner.
Read more
CN-Buzz2Portfolio: A Chinese-Market Dataset and Benchmark for LLM-Based Macro and Sector Asset Allocation from Daily Trending Financial News
Liyuan Chen, Shilong Li, Jiangpeng Yan, Shuoling Liu, Qiang Yang, Xiu Li
NLP Large Language Models
  • Introduction of CN-Buzz2Portfolio as a benchmark for evaluating LLMs in financial asset allocation.
  • Focus on macro and sector-level asset allocation rather than individual stock picking.
  • Implementation of a Tri-Stage CPA Agent Workflow to assess LLM performance.
  • Significant disparities observed among LLMs in translating financial narratives into portfolio strategies.
Read more
Diet Your LLM: Dimension-wise Global Pruning of LLMs via Merging Task-specific Importance Score
Jimyung Hong, Jaehyung Kim
Large Language Models Efficient ML
  • DIET is a dimension-wise global pruning framework that generates a single global mask for LLMs.
  • The method requires no additional training, relying solely on activation profiling from a small number of task-specific samples.
  • DIET consistently outperforms state-of-the-art structured pruning methods across various sparsity levels and model sizes.
  • The framework demonstrates significant accuracy gains, particularly in zero-shot commonsense reasoning tasks.
Read more
Kronecker-Structured Nonparametric Spatiotemporal Point Processes
Zhitong Xu, Qiwei Yuan, Yinghao Chen, Yan Sun, Bin Shen, Shandian Zhe
Time Series Theory Interpretability
  • KSTPP enables explicit discovery of event relationships while maintaining modeling flexibility.
  • The model captures complex interaction patterns, including excitation, inhibition, and time-varying effects.
  • Kronecker algebra is leveraged to reduce computational complexity and enhance scalability.
  • The framework outperforms existing neural point process models in predictive tasks.
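The Kronecker trick in its textbook form: the identity (A ⊗ B) vec(X) = vec(B X Aᵀ) evaluates a Kronecker-product multiply without ever materializing the large matrix. A generic sketch (not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((4, 4))
X = rng.standard_normal((4, 3))

def vec(M):
    return M.flatten(order="F")  # stack columns (column-major vec)

# Naive: build the 12x12 Kronecker product explicitly -> quadratic blow-up.
naive = np.kron(A, B) @ vec(X)

# Kronecker identity: (A kron B) @ vec(X) == vec(B @ X @ A.T), no big matrix.
fast = vec(B @ X @ A.T)

print(np.allclose(naive, fast))  # True
```

For m x m and n x n factors this replaces an O(m²n²) matrix-vector product with two small matrix multiplies, which is the kind of saving that makes Kronecker-structured models scale.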
Read more
Full waveform inversion method based on diffusion model
Caiyun Liu, Siyang Pei, Qingfeng Yu, Jie Xiong
Generative Models Optimization Theory
  • Introduction of a conditional diffusion model for full waveform inversion.
  • Utilization of two-dimensional density information to improve inversion accuracy.
  • Demonstrated enhanced resolution and structural fidelity in inversion results.
  • Increased stability and robustness in complex geological scenarios.
Read more
MsFormer: Enabling Robust Predictive Maintenance Services for Industrial Devices
Jiahui Zhou, Dan Li, Ruibing Jin, Jian Lou, Yanran Zhao, Zhenghua Chen, Zigui Jiang, See-Kiong Ng
Time Series
  • Introduction of MsFormer, a lightweight Multi-scale Transformer for predictive maintenance.
  • Incorporation of a Multi-scale Sampling module to capture multi-scale temporal correlations.
  • Use of a lightweight attention mechanism tailored for data-scarce environments.
  • Extensive validation on real-world datasets showing significant performance improvements.
Read more
End-to-End Efficient RL for Linear Bellman Complete MDPs with Deterministic Transitions
Zakaria Mhammedi, Alexander Rakhlin, Nneka Okolo
Reinforcement Learning Theory Efficient ML
  • Introduces a computationally efficient algorithm for linear Bellman complete MDPs with deterministic transitions.
  • Algorithm is end-to-end efficient for finite action spaces and requires only an argmax oracle for larger action spaces.
  • Achieves an ε-optimal policy with polynomial sample and computational complexity.
  • Addresses a significant gap in existing literature regarding exploration in linear Bellman complete MDPs.
Read more
Cost-Sensitive Neighborhood Aggregation for Heterophilous Graphs: When Does Per-Edge Routing Help?
Eyal Weiss
Graph Learning
  • Introduces Cost-Sensitive Neighborhood Aggregation (CSNA) for GNNs to handle heterophilous graphs.
  • Distinguishes between adversarial and informative heterophily regimes and their implications for message routing.
  • Demonstrates that CSNA can preserve class-discriminative signals where mean aggregation fails.
  • Finds that per-edge routing is beneficial in adversarial contexts but not in informative ones.
Read more
Transformers Trained via Gradient Descent Can Provably Learn a Class of Teacher Models
Chenyang Zhang, Qingyue Zhao, Quanquan Gu, Yuan Cao
Theory Optimization
  • One-layer transformers can effectively learn from a general class of teacher models.
  • The paper establishes a tight convergence guarantee for population loss with a rate of Θ(1/T).
  • Transformers demonstrate robust out-of-distribution generalization capabilities.
  • The study identifies a bilinear structure that underpins various learning tasks, enabling unified theoretical guarantees.
Read more
Robustness Quantification and Uncertainty Quantification: Comparing Two Methods for Assessing the Reliability of Classifier Predictions
AdriΓ‘n Detavernier, Jasper De Bock
Theory
  • RQ outperforms UQ in assessing classifier prediction reliability, particularly under distribution shifts.
  • Both RQ and UQ can be combined for enhanced reliability assessments.
  • The study emphasizes the significance of reliability in high-stakes AI applications.
  • A comprehensive comparison is conducted using real datasets, expanding beyond previous studies focused on artificial data.
Read more
Bridging the Gap Between Climate Science and Machine Learning in Climate Model Emulation
Luca Schmidt, Nina Effenberger
Efficient ML
  • ML emulators can significantly reduce the computational costs associated with traditional climate models.
  • There is a disconnect between the climate science and machine learning communities regarding the use of emulators.
  • A framework integrating both fields can enhance the design and reliability of climate model emulators.
  • Closer collaboration can create feedback loops that improve both emulators and physical simulations.
Read more
MetaKube: An Experience-Aware LLM Framework for Kubernetes Failure Diagnosis
Wei Sun, Ting Wang, Xinran Tian, Wanshun Lan, Xuhan Feng, Haoyue Li, Fangxin Wang
Large Language Models
  • MetaKube integrates episodic memory networks, specialized language models, and causal knowledge graphs for enhanced Kubernetes diagnostics.
  • The framework allows for dynamic reasoning pathways, optimizing diagnostic speed and depth based on problem familiarity.
  • MetaKube's locally-deployable model ensures data privacy while achieving high diagnostic performance.
  • Experiential learning through EPMN significantly improves diagnostic accuracy over time.
Read more
Asymptotic Learning Curves for Diffusion Models with Random Features Score and Manifold Data
Anand Jerry George, Nicolas Macris
Generative Models Theory
  • Asymptotic expressions for errors in diffusion models are derived, highlighting the impact of manifold structure on sample complexity.
  • For linear manifolds, sample complexity scales linearly with intrinsic dimension, while this advantage diminishes for non-linear manifolds.
  • The study uses random feature neural networks to parameterize the score function, providing insights into the learning process of diffusion models.
  • The findings suggest that the geometric structure of data significantly influences the performance of generative models.
Read more
LineMVGNN: Anti-Money Laundering with Line-Graph-Assisted Multi-View Graph Neural Networks
Chung-Hoo Poon, James Kwok, Calvin Chow, Jang-Hyeon Choi
Graph Learning
  • Introduction of LineMVGNN, a new GNN model for AML detection.
  • Utilizes line graphs to enhance transaction information propagation.
  • Demonstrates superior performance compared to existing state-of-the-art methods.
  • Addresses scalability and interpretability issues in traditional AML systems.
Read more
The Coordinate System Problem in Persistent Structural Memory for Neural Architectures
Abhinaba Basu
Theory
  • Introduction of the Dual-View Pheromone Pathway Network (DPPN) for persistent structural memory.
  • Identification of coordinate stability and graceful transfer mechanisms as independent requirements for effective memory.
  • Demonstration that learned coordinate systems are unstable and hinder memory persistence.
  • Fixed random Fourier features provide stable coordinates but do not ensure effective transfer.
Read more
CoordLight: Learning Decentralized Coordination for Network-Wide Traffic Signal Control
Yifeng Zhang, Harsh Goel, Peizhuo Li, Mehul Damani, Sandeep Chinchali, Guillaume Sartoretti
Reinforcement Learning Optimization
  • Introduces Queue Dynamic State Encoding (QDSE) for enhanced traffic state representation.
  • Develops Neighbor-aware Policy Optimization (NAPO) to improve agent coordination.
  • Demonstrates superior performance over existing traffic signal control methods.
  • Addresses challenges of partial observability and decentralized decision-making.
Read more
PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning
Tao Liu, Jiguang Lv, Dapeng Man, Weiye Xi, Yaole Li, Feiyu Zhao, Kuiming Wang, Yingchao Bian, Chen Xu, Wu Yang
Federated Learning Computer Vision Generative Models
  • PoiCGAN introduces a targeted poisoning attack framework that enhances stealthiness while maintaining model performance.
  • The method leverages dual-feature collaborative perturbations to minimize the impact on the main task's accuracy.
  • Experiments show a significant increase in attack success rates compared to existing methods.
  • The approach highlights new vulnerabilities in Federated Learning systems, necessitating stronger defenses.
Read more
Causal Discovery in Action: Learning Chain-Reaction Mechanisms from Interventions
Panayiotis Panayiotou, Özgür Şimşek
Theory Graph Learning
  • Causal discovery in chain-reaction systems can be achieved through blocking interventions.
  • The proposed method provides a unique identification of causal structures with finite-sample guarantees.
  • Experiments show that the method outperforms observational heuristics in complex causal scenarios.
  • The approach is applicable to various real-world systems exhibiting cascade-like structures.
Read more
GEM: Guided Expectation-Maximization for Behavior-Normalized Candidate Action Selection in Offline RL
Haoyu Wang, Jingcheng Wang, Shunyu Wu, Xinwei Xiao
Reinforcement Learning
  • GEM provides a multimodal and controllable action selection framework for offline RL.
  • The method preserves distinct action hypotheses while focusing on high-value regions through GMMs.
  • Candidate-based selection allows for a flexible compute-quality trade-off at inference time.
  • GEM mitigates the risk of out-of-distribution errors associated with naive candidate maximization.
Read more
Learning Response-Statistic Shifts and Parametric Roll Episodes from Wave–Vessel Time Series via LSTM Functional Models
Jose del Aguila Ferrandis, Kevin T. Crofton
Time Series
  • Development of a data-driven surrogate model using LSTM networks for predicting parametric roll in vessels.
  • The model is trained on wave-motion time series generated from both experiments and simulations, making it versatile.
  • Focus on capturing not just the dynamics of parametric roll but also the statistical shifts in response distributions.
  • Evaluation of various loss functions to improve the model's accuracy in tail risk prediction.
Read more