AI-generated summaries

Today's ML research, without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

48 Papers today
8h Update frequency
7 Days of history
Local Motion Matters: A Deconstruct-Recompose Paradigm for Reinforcement Learning Pre-training from Videos
Jinwen Wang, Youfang Lin, Xiaobo Hu, Shuo Wang, Kai Lv
Reinforcement Learning, Robotics
  • Introduction of the Deconstruct-Recompose Paradigm (DRP) for RL pre-training.
  • Focus on local motion patterns rather than global motions for better transferability.
  • Development of a Dual-Attention Encoder (DAE) to learn local motion representations.
  • Significant improvements in sample efficiency and performance across various tasks.
Spectroscopy Analysis with Machine Learning Regression for the Quantification of Carbon and Nitrogen Contents in Inceptisol and Oxisol Soil Types: Comparing Different Preprocessing and Validation methods as well as Feature Importance
Vinicius Herique Kieling, Guilherme Macedo Baggio, Felipe Augusto Bueno Rossi, Marco Antonio de Castro Barbosa, Dalcimar Casanova, Larissa Macedo dos Santos Tonial, Jefferson Tales Oliva
Efficient ML
  • NIR spectroscopy offers a rapid, cost-effective alternative to traditional soil analysis methods.
  • Savitzky-Golay filter and NIPALS-based outlier removal were the most effective preprocessing techniques.
  • Oxisols showed better predictive performance for C and N content compared to Inceptisols.
  • The study achieved low overfitting with RPD values greater than 2.0, indicating reliable model performance.
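The RPD figure quoted above is conventionally the ratio of performance to deviation: the standard deviation of the reference measurements divided by the RMSE of the predictions. The sketch below assumes that conventional definition with the sample standard deviation; the summary does not say which variant the authors used, and the soil values are invented for illustration.

```python
import math
import statistics


def rpd(reference, predicted):
    """Ratio of performance to deviation: SD(reference) / RMSE(predictions).

    Assumes the sample standard deviation; some studies use other variants.
    """
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    if rmse == 0:
        return math.inf  # perfect predictions
    return statistics.stdev(reference) / rmse


# Hypothetical carbon contents, made up purely for illustration.
measured = [10.0, 12.0, 14.0, 16.0, 18.0]
estimated = [11.0, 11.0, 15.0, 15.0, 19.0]
print(round(rpd(measured, estimated), 2))  # 3.16
```

Under this definition, a higher RPD means the prediction error is small relative to the natural spread of the measured values.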
Distributed Online Bandit Submodular Maximization with Bounded Sampling Violations
Bin Du, Chang Liu, Dingqi Zhu, Lintao Ye, Dengfeng Sun
Optimization, Theory
  • Developed a unified algorithmic framework for distributed online submodular maximization under partition matroid constraints.
  • Achieved sublinear (1 - 1/e)-regret guarantees for both full-information and bandit feedback models.
  • Introduced a bounded stochastic pipage rounding scheme to address sampling violations.
  • Demonstrated that cumulative sampling violations remain sublinear in T.
How Early Is Early Enough? Design-Dependent Observation-Window Sufficiency in Subscription Churn Prediction
Xiao Han, Yao Xiao, Chenyu Wu, Tongchen Zhang
Time Series
  • The sufficiency of early observation windows for churn prediction varies significantly across different experimental designs.
  • A nine-window sufficiency curve indicates diminishing returns in predictive performance within a 45-90 day range.
  • Contract-driven factors dominate churn prediction, but behavioral data adds predictive value in high-churn segments.
  • Early predictability is robust and not solely due to survivorship bias in the dataset.
GRPO, Dr. GRPO, and DAPO Are Three Operations on One Number: The Group-Standard-Deviation Identity
Yong Yi Bay, Kathleen A. Yearick
Reinforcement Learning, Large Language Models, Theory
  • GRPO, Dr. GRPO, and DAPO each apply a different operation to one quantity: the standard deviation within a group of sampled answers.
  • The group-standard-deviation identity shows that disagreement among answers directly influences training updates.
  • A split group maximizes learning potential, while unanimous groups provide minimal training benefit.
  • The paper provides closed-form expressions for group size and difficulty bias, aiding practitioners in model training.
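A rough sketch of the idea, using the commonly described forms of the three methods rather than the paper's own derivation: GRPO divides the centered reward by the group standard deviation, Dr. GRPO skips that division, and DAPO's dynamic sampling drops groups whose standard deviation is zero. Binary 0/1 rewards, the population standard deviation, and the `eps` constant are assumptions made here for illustration.

```python
import statistics


def group_std(rewards):
    """Population standard deviation of the rewards in one sampled group."""
    return statistics.pstdev(rewards)


def grpo_advantages(rewards, eps=1e-6):
    """GRPO-style: center by the group mean, then divide by the group std."""
    mean, std = statistics.fmean(rewards), group_std(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def dr_grpo_advantages(rewards):
    """Dr. GRPO-style: center by the group mean, with no division by the std."""
    mean = statistics.fmean(rewards)
    return [r - mean for r in rewards]


def dapo_keep_group(rewards):
    """DAPO-style dynamic sampling: discard groups whose rewards are all equal."""
    return group_std(rewards) > 0


unanimous = [1, 1, 1, 1]  # every answer correct: std is 0
split = [1, 1, 0, 0]      # half correct: std is 0.5, the maximum for 0/1 rewards
lopsided = [1, 0, 0, 0]   # one correct answer out of four

for name, group in [("unanimous", unanimous), ("split", split), ("lopsided", lopsided)]:
    print(name, round(group_std(group), 3), dapo_keep_group(group))
```

With 0/1 rewards and a fraction p of correct answers, the group standard deviation is sqrt(p(1 - p)), which is why the split group sits at the maximum and the unanimous group contributes nothing.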
GAIA: Geometry-Adaptive Operator Learning for Forward and Inverse Problems
Meenakshi Krishnan, Pranav Pulijala, Ke Chen, Haizhao Yang, Ramani Duraiswami
Optimization, Theory, Efficient ML
  • GAIA provides a unified model for both forward and inverse problems on arbitrary geometries.
  • Introduces new benchmarks for varying-geometry inverse problems and BVPs.
  • Achieves state-of-the-art results on all evaluated tasks, significantly reducing error rates.
  • Maintains stable accuracy across point resolutions, outperforming transformer-based baselines.
Representation as a Bottleneck for Mechanistic Interpretability: The Manifestation Unit Protocol
Hussein Chouman, Wataru Sasaki, Tomokazu Matsui, Hirohiko Suwa, Keiichi Yasumoto
Interpretability
  • Introduction of Manifestation Units, a structured protocol for organizing neural network component analyses.
  • Demonstrated that structured representation improves retrieval performance significantly over unstructured methods.
  • Established causal relationships and minimal-optimal core components in CNN and GPT-2 architectures.
  • Provided empirical evidence supporting the effectiveness of the proposed schema in mechanistic interpretability.
A Stationary-Distribution Theory for Triplet-Based Plateau Search in Random Forest Ensemble-Size Selection
Andrey A. Dukhovny, Andrey M. Lange
Theory, Optimization
  • Introduces a stationary-distribution theory for ensemble-size selection in Random Forests.
  • Models the ensemble size as a birth-death Markov chain to analyze its behavior.
  • Demonstrates that the central ensemble size fluctuates around a stationary regime.
  • Provides equilibrium equations that characterize the stationary center and spread.
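For readers unfamiliar with birth-death chains, the stationary distribution of a finite chain follows directly from detailed balance. The sketch below is a generic textbook construction, not the paper's equilibrium equations; the transition probabilities are hypothetical and chosen only so that the chain is pulled toward a central state.

```python
def stationary_distribution(up, down):
    """Stationary distribution of a finite birth-death chain on states 0..n.

    up[k]   is the probability of moving from state k to k + 1.
    down[k] is the probability of moving from state k + 1 to k.
    Detailed balance gives pi[k + 1] = pi[k] * up[k] / down[k].
    """
    if len(up) != len(down):
        raise ValueError("up and down must have the same length")
    weights = [1.0]
    for u, d in zip(up, down):
        weights.append(weights[-1] * u / d)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical probabilities: growth is likely from small states and
# shrinkage is likely from large states, so mass concentrates in the middle.
up = [0.6, 0.5, 0.3, 0.2]
down = [0.2, 0.3, 0.5, 0.6]
pi = stationary_distribution(up, down)
print([round(p, 3) for p in pi])
```

The mode of `pi` plays the role of a stationary center, and the width of the distribution around it plays the role of the spread.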
Joint discovery of governing partial differential equations from multi-source datasets by competitive optimization
Hao Xu, Siyu Lou, Yuntian Chen, Dongxiao Zhang
Optimization, Interpretability, Theory
  • Introduction of MCO-PDE framework for discovering PDEs from multi-source datasets.
  • Utilizes independent neural surrogates and a soft-competitive weighting mechanism.
  • Achieves high accuracy in recovering canonical equations with limited observations.
  • Handles complex domains and extracts meaningful laws from real-world experiments.
Generative Model Proposal based Particle Filtering for Data Assimilation
Chandni Nagda, Mayank Shrivastavam, Gudrun Thorkelsdottir, Gan Zhang, Morteza Mardani, Arindam Banerjee
Generative Models, Time Series, Theory
  • Introduction of Flow Proposal Particle Filters (FPPF) for improved data assimilation.
  • FPPF utilizes a learned conditional generative model to propose particle distributions.
  • The method effectively reduces weight variance and delays degeneracy in high-dimensional spaces.
  • FPPF and its localized variant (L-FPPF) show superior performance in chaotic dynamical systems.
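Weight degeneracy in a particle filter is usually monitored with the effective sample size (ESS), which falls toward 1 as a single particle absorbs most of the weight. The sketch below shows that standard diagnostic only; it is not the FPPF method, and the log-weights are invented.

```python
import math


def normalize_log_weights(log_weights):
    """Turn unnormalized log-weights into weights that sum to 1 (stable softmax)."""
    peak = max(log_weights)
    unnormalized = [math.exp(lw - peak) for lw in log_weights]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) for normalized weights; ranges from 1 to len(weights)."""
    return 1.0 / sum(w * w for w in weights)


# Hypothetical log-weights for four particles.
balanced = normalize_log_weights([0.0, 0.0, 0.0, 0.0])
degenerate = normalize_log_weights([0.0, -50.0, -50.0, -50.0])

print(effective_sample_size(balanced))    # all particles contribute equally
print(effective_sample_size(degenerate))  # one particle dominates
```

A proposal that lowers the variance of the weights keeps the ESS closer to the particle count, which is what delaying degeneracy means in practice.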
Generative Modeling of Quantum Distribution with Functional Flow Matching
Jaehoon Hahm, Tak Hur, Joonseok Lee, Daniel K. Park
Generative Models, Theory
  • Introduction of Quantum Flow Matching (QFM) for generative modeling of quantum distributions.
  • Utilization of spin Wigner functions to bypass direct density matrix learning.
  • Application of Functional Flow Matching (FFM) for effective learning in function space.
  • Demonstration of accurate reconstruction of quantum states and physical properties.
Policy Optimization Achieves Data-Dependent Regret Bounds in MDPs with Unknown Transitions
Mingyi Li, Taira Tsuchiya, Kenji Yamanishi
Reinforcement Learning, Optimization, Theory
  • Introduces a new algorithm for policy optimization in MDPs with unknown transitions.
  • Achieves data-dependent regret bounds, adapting to the complexity of the loss sequence.
  • Combines first-order, second-order, and path-length bounds with best-of-both-worlds guarantees.
  • Identifies a transition-dependent complexity term that impacts regret bounds.
Review Residuals: Update-Conditioned Residual Gating for Transformers
Kyle Kramer
Large Language Models, NLP, Theory
  • Introduction of Review Residuals, a gated residual update mechanism for transformers.
  • Additive gating form preserves identity and avoids vanishing gradients, ensuring stable training at depth.
  • Significant performance improvements at larger model sizes (590M and 1B parameters) compared to standard residuals and Highway gates.
  • No advantage was observed at smaller scales, suggesting the benefit emerges only at scale.
Gauging, Measuring, and Controlling Critic Complexity in Actor-Critic Reinforcement Learning
Konstantin Garbers
Reinforcement Learning
  • Introduces critic complexity as a measurable and controllable dimension in actor-critic RL.
  • Utilizes spectral effective-rank entropy to quantify critic complexity.
  • Demonstrates a systematic relationship between critic complexity, performance, and bias, varying by task and algorithm.
  • Implements a spectral-entropy regularizer that effectively reduces critic complexity and improves performance in certain scenarios.
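One common way to turn a spectrum into a complexity score is the entropy-based effective rank: normalize the singular values into a probability distribution, take its Shannon entropy, and exponentiate. The sketch below assumes that definition; the summary does not spell out the paper's exact measure or regularizer, and the singular values are hypothetical.

```python
import math


def effective_rank(singular_values):
    """Entropy-based effective rank: exp of the Shannon entropy of the
    singular values after normalizing them to sum to 1."""
    total = sum(singular_values)
    if total <= 0:
        raise ValueError("need at least one positive singular value")
    probs = [s / total for s in singular_values if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


# Hypothetical spectra of a critic's weight matrix.
flat = [1.0, 1.0, 1.0, 1.0]      # energy spread evenly over four directions
peaked = [1.0, 0.0, 0.0, 0.0]    # all energy in one direction
mixed = [4.0, 2.0, 1.0, 1.0]     # somewhere in between

for name, spectrum in [("flat", flat), ("peaked", peaked), ("mixed", mixed)]:
    print(name, round(effective_rank(spectrum), 3))
```

A regularizer built on this quantity would penalize the entropy term during training; how the paper weights or schedules such a penalty is not stated in the summary.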
Distributionally Robust Linear Regression With Block Lewis Weights
Naren Sarayu Manoj, Kumar Kshitij Patel
Optimization, Theory, Efficient ML
  • Introduces a novel algorithm for group distributionally robust least squares regression.
  • Achieves optimal solutions with improved computational efficiency compared to existing methods.
  • Utilizes block Lewis weights to connect GDR problems to least squares frameworks.
  • Offers algorithms that interpolate between different loss minimization objectives.
Robustness of neural networks to random noise perturbations of their inputs
Mark Levene, Martyn Harris
Theory
  • Introduces a new robustness measure for neural networks against input perturbations.
  • Proposes robustness curves to visualize the impact of noise on mean squared error.
  • Demonstrates the method's applicability to various machine learning models beyond neural networks.
  • Validates the approach with experimental results on real-world datasets.
Knowledge Distillation from Large Reasoning Models to Compact Student Models: A Case Study on the John O Bryan Mathematics Competition
Gaurab Baral, Aaditya Khanal, Yangyang Tao, Junxiu Zhou
Large Language Models, NLP, Efficient ML
  • Constructed a CoT training corpus from 15 years of mathematics competition problems.
  • Demonstrated that LoRA fine-tuning improved the student model's accuracy from 64.67% to 69.43%.
  • Identified a practical lower bound of 50-100 words for response length in multi-step problems.
  • Provided an error-type analysis indicating that 40% of failures were due to formatting errors.
Bridging the Gap Between Latent and Explicit Reasoning with Looped Transformers
Ying Fan, Anej Svete, Kangwook Lee
NLP, Large Language Models, Efficient ML
  • LOTUS is the first latent-CoT method to bridge the accuracy gap with explicit CoT at the 3B scale.
  • The architecture employs looped Transformers to enhance computation depth without increasing parameters.
  • LOTUS reduces thought-phase latency by 2.5 to 6.9 times compared to explicit CoT methods.
  • The latent space of LOTUS is interpretable, recovering gold reasoning steps and revealing alternative valid steps.
Quality-Aware Modulation for Diffusion Transformers
Luke Budny, Yuhong Guo, Kevin Cheung
Generative Models, Computer Vision
  • Introduction of the Quality Representation Module (QRM) for quality-aware modulation in diffusion transformers.
  • QRM enhances the denoising process by incorporating latent image quality signals.
  • No significant changes to the diffusion backbone or sampling schedule are required.
  • Extensive evaluations show improved image quality and prompt fidelity over baseline models.
Human-Machine Collaboration on Generative Meta-Learning: Model and Algorithm
Midhun Parakkal Unni, Samuel Kaski
Generative Models, Reinforcement Learning, Time Series
  • Introduction of the GMHF framework that combines generative modeling, reinforcement learning, and human feedback.
  • Theoretical bounds derived to demonstrate the potential of human feedback in improving generalization under distribution shifts.
  • Empirical validation shows significant reduction in deployment loss with increased expert reliability.
  • The framework extends beyond ODE-governed systems and applies to non-dynamical probabilistic models.
Deep Reinforcement Learning for Spacecraft Attitude Control During Atmospheric Re-Entry
Alexander Fabisch, Melvin Laux, Mariela De Lucas Álvarez, Edoardo Caroselli, Julian Theis
Reinforcement Learning, Robotics
  • Application of deep reinforcement learning for spacecraft attitude control during re-entry.
  • Comparison of RL performance against traditional PID controllers and hybrid control architectures.
  • Use of dynamics randomization to improve out-of-distribution generalization.
  • Hybrid controllers show superior performance in tracking and robustness compared to traditional methods.
FRAME: Learning the Adaptation Domain with a Mixture of Fractional-Fourier Experts
Tom Saliencro, Maya Lindqvist, Rohan Desai, Priya Nair, Daniel Whitmore
NLP, Large Language Models, Efficient ML
  • FRAME allows for a learnable adaptation domain, improving flexibility in PEFT methods.
  • The method utilizes a mixture of experts with fractional-Fourier orders, enhancing expressivity and reducing interference.
  • FRAME outperforms existing MoE-LoRA and spectral baselines while maintaining a low active parameter count.
  • The learned orders provide interpretable specialization across different tasks and layers.
Measuring Dead Directions: Decomposing and Classifying Singular Structure off Canonical Alignment
Tejas Pradeep Shirodkar
Theory, Optimization, Interpretability
  • Introduces a descent-free and alignment-free method for measuring singular structures in neural networks.
  • Develops a detect-then-read pipeline that adapts to different neural network architectures.
  • Successfully classifies dead directions into genuine singularities and gauge symmetries.
  • Demonstrates the ability to recover architecture-predicted orders in various trained networks.
Personalizing Marketplace Policies with Competing Objectives and Constrained Experiments: Evidence from a Job Marketplace
Yufei Wu, Zhen Yan
Optimization
  • Developed a framework for personalizing marketplace policies that balances competing objectives.
  • Introduced an ensemble-based hybrid ranking model that reduces guardrail risks while optimizing target metrics.
  • Addressed the challenges of cross-side externalities and marketplace interference in experimental design.
  • Validated the methodology through empirical testing and production deployment.
Surrogate Fidelity: When Can Open LLMs Explain Closed Ones?
Philippe Chlenski, Zachariah Carmichael, Ayush Warikoo, Chia-Tse Shao, Yingxiao Ye, Aobo Yang, Vivek Miglani, Nehal Bandi
NLP, Large Language Models, Interpretability
  • Introduces surrogate fidelity as a framework for evaluating mechanistic interpretability across open and closed models.
  • Establishes a hierarchy of surrogate fidelity metrics: prediction, attribution, representation, and cross-level.
  • Finds that prediction fidelity often overstates attribution fidelity, indicating a disconnect between model outputs and causal reasoning.
  • Identifies an access-validity inversion where stable white-box signals do not predict causal attributions effectively.
Gradient Smoothing: Coupling Layer-wise Updates for Improved Optimization
Haoming Meng, Anton Sugolov, Vardan Papyan
Optimization
  • Introduction of Depth-wise Gradient Augmentation as a new optimization paradigm.
  • Development of Gradient Smoothing, specifically the Window Smoothing operator, to enhance layer-wise updates.
  • Demonstrated improvements in optimization and generalization across various architectures and tasks.
  • Empirical and theoretical evidence supporting structured representation evolution across depth.
Safe Online Learning via Smooth Safety-Structured Policy Composition
Hongpeng Cao, Liqun Zhao, Yuliang Gu, Naira Hovakimyan, Lui Sha, Marco Caccamo
Reinforcement Learning, Robotics, Theory
  • AutoSafe integrates safety monitoring and intervention directly into the action generation process.
  • The architecture allows for smooth transitions between performance and safety behaviors.
  • Empirical results show strong safety enforcement and stable learning dynamics.
  • AutoSafe outperforms existing safety filter-based approaches in both safety assurance and task performance.
ComplianceGate: Classifier-Gated Multi-Tier LLM Routing for Inference in Regulated Industries
Abhishek Dey
NLP, Large Language Models, Efficient ML
  • Introduces a classifier-gated routing architecture for LLMs in regulated industries.
  • Ensures compliance by routing PII-containing queries to local endpoints before inference.
  • Achieves a 39% median latency reduction and 33-52% cost savings, depending on query complexity.
  • The encoder classifier demonstrates 99.2% accuracy with minimal inference overhead.
Estimating Supply Incrementality in Two-sided Marketplaces: A Causal Machine Learning Approach
Yufei Wu, Daniel Schmierer, Dan Zylberglejd
Theory
  • Introduces a causal machine learning approach to estimate supply incrementality in two-sided marketplaces.
  • Combines double/debiased machine learning with a hierarchical Bayesian framework to isolate the impact of supply on bookings.
  • Utilizes geospatial measures of product segment similarity to improve model accuracy and reduce variance in treatment effect estimates.
  • Demonstrates strong out-of-sample performance and provides plausible estimates of marketplace returns to additional supply.
Generative Refinement for Low-Budget Black-Box Optimization
Edouard R. Dufour, Pascal Fua
Optimization, Generative Models, Theory
  • SPARROW decouples generative modeling from optimization, enhancing efficiency in low-budget settings.
  • The algorithm utilizes a fixed sampler as a proposal operator, requiring only knowledge of its corruption process.
  • Rank-based guidance over an archive of evaluated candidates improves robustness against unreliable feedback.
  • Asymptotic convergence guarantees are provided, demonstrating theoretical soundness.
Watermarking for Proprietary Dataset Protection
John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Tom Goldstein
NLP, Large Language Models, Generative Models
  • Watermarking is proposed as a solution to improve membership inference for generative models.
  • The study introduces a new randomization-based watermark detection test.
  • Watermarking can achieve comparable performance to traditional loss-based methods under specific conditions.
  • The authors provide a unified experimental framework for evaluating different membership inference techniques.
Why Do Few-Step Text Latents Fail When Image Latents Work? Non-Commitment at Sharp Categorical Readouts
Zhongyao Wang
NLP, Generative Models, Theory
  • Deterministic few-step generation fails for text latents due to sharp categorical readouts.
  • The failure is governed by decoder sharpness rather than transport accuracy.
  • DABI and CCI diagnostics reveal significant differences in performance between text and image decoders.
  • Two escape mechanisms (categorical commitment and stochastic re-injection) allow some models to succeed despite sharp readouts.
When Context Compensates for Sparse Event History: AlphaEarth for Spatio-Temporal Point-Process Forecasting
Yahya Aalaila, Mouad Elhamdi, Gerrit Großmann, Daniel Jenson, Elizaveta Semenova, Sebastian Vollmer
Time Series
  • AlphaEarth embeddings significantly enhance predictive performance in spatio-temporal point-process models.
  • The benefits of incorporating spatial context are most pronounced when local event histories are sparse.
  • The study provides controlled evidence on the effectiveness of external spatial context in improving spatial transfer in forecasting.
  • Predictive gains from AE embeddings taper off as the amount of historical data increases.
QuasiMoTTo: Quasi-Monte Carlo Test-Time Scaling
Michael Y. Li, Anthony Zhan, Kanishk Gandhi, Noah D. Goodman, Emily B. Fox
NLP, Large Language Models, Reinforcement Learning, Efficient ML
  • QuasiMoTTo replaces i.i.d. sampling with correlated samples to reduce redundancy in inference compute and RL.
  • Utilizes quasi-Monte Carlo methods for generating more evenly distributed uniform samples.
  • Requires 25-47% fewer samples than i.i.d. sampling to reach equivalent pass@k accuracy.
  • Matches i.i.d. performance in policy-gradient RL with 50% fewer training steps.
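To illustrate what "more evenly distributed uniform samples" means, the sketch below generates the one-dimensional van der Corput sequence, a classic quasi-Monte Carlo construction. It is a generic example and not QuasiMoTTo's actual sampler, which the summary does not describe; the `largest_gap` helper is introduced here only for the comparison.

```python
import random


def van_der_corput(index, base=2):
    """Return the index-th point of the van der Corput sequence in [0, 1).

    The digits of `index` in the given base are mirrored around the radix point.
    """
    value, denominator = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denominator *= base
        value += digit / denominator
    return value


def largest_gap(points):
    """Largest interval of [0, 1] left uncovered by the points."""
    edges = [0.0] + sorted(points) + [1.0]
    return max(b - a for a, b in zip(edges, edges[1:]))


quasi = [van_der_corput(i) for i in range(1, 8)]
rng = random.Random(0)  # fixed seed so the comparison is repeatable
iid = [rng.random() for _ in range(7)]

print(quasi)               # [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875]
print(largest_gap(quasi))  # 0.125
print(largest_gap(iid))    # never smaller than 0.125 for seven points
```

Seven quasi-random points tile the unit interval in steps of 1/8, whereas seven independent draws typically cluster and leave wider holes, which is the redundancy that correlated sampling aims to remove.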
Active-GRPO: Adaptive Imitation and Self-Improving Reasoning for Molecular Optimization
Xuefeng Liu, Mingxuan Cao, Qinan Huang, Thomas Brettin, Rick Stevens, Le Cong
NLP, Reinforcement Learning, Optimization
  • Identifies the static-reference ceiling in reference-guided policy optimization, highlighting the risks of weak or misaligned references.
  • Introduces active reasoning as a new paradigm for training, enhancing the adaptability of reference-guided methods.
  • Implements Active-GRPO, which combines active imitation and self-improvement mechanisms to improve training robustness.
  • Demonstrates significant performance improvements in molecular optimization tasks compared to existing methods.
Visualizing High-Dimensional Graph Embeddings via Informed Multi-View Projections
Ya Ji, Xuefeng Li, Timo Brand, Jacob Miller, Peng Zhang, Stephen Kobourov, Yifan Hu
Graph Learning, Optimization
  • Optimal 2D projections from high-dimensional graph embeddings yield better readability metrics than traditional 2D layouts.
  • The differentiable loss function SigmoidX significantly reduces edge crossings compared to existing methods.
  • DataFly provides an interactive platform for exploring high-dimensional graph layouts, enhancing user engagement and understanding.
  • The proposed method reveals structural patterns that are often hidden in static 2D visualizations.
OTCache: Optimal Transport for Geometry-Aware Caching in Diffusion Models
Huanlin Gao, Fang Zhao, Qiang Hui, Fuyuan Shi, Shaoan Zhao, Yantao Li, Chao Tan, Ting Lu, Yuren You, Kai Wang, Shiguo Lian
Generative Models, Optimization, Efficient ML
  • OTCache provides a training-free approach to accelerate diffusion models through optimal transport-inspired schedule modeling.
  • The framework overcomes limitations of existing caching methods by addressing additive independence assumptions and modeling schedule evolution.
  • Experiments show substantial acceleration in sampling times while maintaining high fidelity in generated outputs.
LeNEPA: No-Augmentation Next-Latent Prediction for Time-Series Representation Learning
Alexander Chemeris, Ming Jin, Randall Balestriero
Time Series
  • LeNEPA is a no-augmentation architecture for time-series representation learning.
  • It utilizes a next-latent-token prediction objective with a causal transformer backbone.
  • The method shows improved performance stability across different datasets without requiring specific augmentation strategies.
  • LeNEPA achieves faster representation acquisition compared to traditional methods.
Revocable Learned State via Process Sidecars
John Sweeney
NLP, Large Language Models, Theory
  • Introduces process sidecars for effective memory revocation in language models.
  • Proves that naive task arithmetic is first-order incomplete when safety training alters memory directions.
  • Demonstrates that process sidecars achieve second-order accuracy in memory edits.
  • Empirical results show significant improvements in refusal closure across multiple trials.
Scaling Up Thermodynamic AI Models
Andrew G. Moore
Efficient ML, Theory, Optimization
  • Development of a scalable backpropagation-based algorithm for training deep convolutional networks on Ising machines.
  • Achieved high classification accuracies of 94.9% on CIFAR-10 and 76.0% on CIFAR-100 using thermodynamic inference.
  • Established a mathematical theory linking inference cost to accuracy and controlling autocorrelation times.
  • Demonstrated that over 99.99% of FLOPs can be off-loaded to thermodynamic inference in larger models.
Probabilistic Inversion with Flow Matching
Baldur Paulwitz, Stefan Buske
Generative Models, Optimization, Theory
  • Flow Matching is adapted for probabilistic inversion in geophysics, enhancing the analysis of seismic data.
  • Probabilistic inversion allows for uncertainty quantification without the need for initial guesses or regularization.
  • The method is evaluated through case studies, demonstrating its applicability to both simple and complex models.
  • Flow Matching bridges the gap between traditional probabilistic methods and modern deep learning techniques.
LLM-Guided ODE Discovery and Parameter Inference from Small-Cohort Aggregate Data
Hanning Yang, Meropi Karakioulaki, Lennart Purucker, Tim Litwin, Cristina Has, Moritz Hess
Large Language Models, Time Series, Interpretability
  • AgentODE is an end-to-end framework for ODE discovery and parameter inference using population-level summary statistics.
  • The framework utilizes a large language model to propose ODE structures and refine parameter distributions iteratively.
  • AgentODE demonstrates superior performance in structure discovery compared to traditional methods that rely on individual-level data.
  • The approach is particularly valuable for modeling rare diseases where data scarcity and privacy constraints are significant challenges.
TDGT: A Tabular Data Generation Toolkit supporting adaptive GPU-accelerated Bayesian mixture models, diffusion-based models, and latent-space generative modeling
Vasileios C. Pezoulas, Nikolaos S. Tachos, Eleni Georga, Kostas Marias, Manolis Tsiknakis, Dimitrios I. Fotiadis
Generative Models
  • TDGT provides an integrated web-based toolkit for synthetic tabular data generation.
  • The Adaptive Bayesian Mixture Synthesizer (ABMS) autonomously optimizes mixture components, reducing manual configuration.
  • VAE-ABMS combines latent space learning with adaptive synthesis for high-fidelity data generation.
  • The toolkit includes GPU acceleration for efficient processing in large-scale scenarios.
TiRex-2: Generalizing TiRex to Multivariate Data and Streaming
Patrick Podest, Marco Pichler, Elias Bürger, Levente Zólyomi, Bernhard Voggenberger, Wilhelm Berghammer, Daniel Klotz, Sebastian Böck, Günter Klambauer, Sepp Hochreiter
Time Series
  • TiRex-2 is the first time series foundation model to effectively integrate both past and future covariates while ensuring strict causality.
  • The model operates at constant computational cost per time step, making it suitable for real-time streaming applications.
  • A novel synthetic coupling pipeline allows for scalable multivariate pretraining from univariate data, enhancing model generalization.
  • TiRex-2 achieves state-of-the-art performance on GIFT-Eval and fev-bench benchmarks.
Physics-informed Conditional Normalizing Flows for Angles-only Cislunar Orbit Determination
Walther Litteri, Massimiliano Vasile
Generative Models
  • Introduction of a generative modeling approach for orbit determination in cislunar space.
  • Utilization of normalizing flows for conditional density estimation based on angles-only measurements.
  • Incorporation of a physics-informed loss term to enhance the accuracy of state estimates.
  • Demonstration of improved performance over traditional orbit determination methods.
Expected Gain-based Escalation in Vertical Federated Learning
Mohamad Mestoukirdi, Vincent Corlay
Federated Learning, Efficient ML, Theory
  • Introduces a two-round inference protocol for selective escalation in VFL.
  • Develops an analytical routing rule based on expected gain without requiring a separate routing model.
  • Empirically shows improved communication-accuracy trade-off over existing methods.
  • Utilizes held-out calibration data for reliable score estimation.
TRIE: An Evaluation Framework for Stochastic PDE Surrogates
Bharat Srikishan, Javier E. Santos, Nikhil Muralidhar, Charles D. Young
Generative Models, Theory, Efficient ML
  • Introduction of TRIE as a novel evaluation framework for stochastic PDE surrogates.
  • Demonstration that traditional deterministic models fail to capture long-term statistical structures.
  • Generative models outperform other methods in capturing invariant measures and providing reliable uncertainty estimates.
  • Latent generative models with automatic dimension discovery reduce inference time significantly.
Resolving superposition in AI for interpretability and cross-modal alignment in patient-neuronal images
Jisung Park, Seohyeon Kang, Daeun Yoo, Eunsu Lee, Seoin Cho, Wooyeop Choi, Ian Choi, James R. Evan, Daesoo Kim, Sonia Gandhi, Minee L. Choi
Interpretability, Computer Vision, Multimodal
  • Superposition in neural networks can corrupt the geometry of latent spaces, impacting interpretability.
  • Sparse Autoencoders (SAEs) effectively disentangle superposed concepts, restoring geometric fidelity.
  • The authors adapt scRNA-seq analysis methods to image data, enhancing biological hypothesis evaluation.
  • GW-map framework aligns image representations with scRNA-seq data, reconstructing neuronal pathology pathways.