AI-generated summaries

Today's ML research, without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

Papers today: 48
Update frequency: 8h
Days of history: 7
Learning Manifold and Itô Dynamics with Branched Neural Rough Differential Equations
Luke Thompson, Dai Shi, Lequan Lin, Junbin Gao, Andi Han
Time Series, Theory, Robotics
  • Introduction of Branched Neural Rough Differential Equations (B-NRDEs) for modeling Itô dynamics on manifolds.
  • Utilization of Hopf algebras to enforce manifold constraints and facilitate Itô-type dynamics.
  • Development of a branched signature-kernel objective for Itô-consistent training.
  • Demonstration of B-NRDEs on various applications, showing improved performance over traditional methods.
Bayes-Sufficient Representations in Supervised Learning
Vasileios Sevetlidis
Theory
  • Introduces the concept of Bayes-sufficient representations in supervised learning.
  • Defines the Bayes quotient, which identifies inputs needing the same Bayes-optimal action.
  • Distinguishes between sufficiency and minimality of representations based on loss functions.
  • Connects the framework to property elicitation, showing how losses influence representation targets.
Autoregressive Diffusion World Models for Off-Policy Evaluation of LLM Agents
Kaixuan Liu, Guojun Xiong, Weinan Zhang, Shengpu Tang
NLP, Large Language Models, Reinforcement Learning
  • ADWM provides a framework for offline evaluation of LLM agents, reducing the need for live environment interactions.
  • The framework models transitions as independent denoising processes, preventing compounding errors common in autoregressive models.
  • ADWM incorporates policy guidance at each step of the diffusion process, ensuring accurate simulation of agent decision-making.
  • Empirical results show that ADWM outperforms traditional off-policy evaluation methods in ranking evaluation policies.
In-Context Graphical Inference
Zehua Cheng, Wei Dai, Jiahao Sun
Graph Learning, Theory, Efficient ML
  • ICG-I restores the sequential elimination structure in graphical inference, improving accuracy and scalability.
  • The method employs Tensor-Train compression to manage intermediate factors efficiently.
  • Theoretical guarantees are provided for error propagation, scoring rules, and coverage under distributional shifts.
  • ICG-I outperforms existing methods on benchmarks, particularly in challenging frustrated topologies.
Learning to Route LLMs from Implicit Cost-Performance Preferences via Meta-Learning
Jiahao Zeng, Ming Tang, Ningning Ding
Large Language Models, Optimization, NLP
  • Introduction of a perceptive LLM routing paradigm that learns user preferences through interaction.
  • Development of MetaRouter, a meta-learning framework for preference-aware LLM routing.
  • Demonstration of superior performance compared to existing routing methods across various datasets.
  • High efficiency in learning user preferences and adaptability to different LLMs.
A prism hierarchy of learning regimes in large linear autoencoders
Eugene Golikov, Yaroslav Gusev, Dmitry Yarotsky
Theory, Optimization
  • Introduction of a prism hierarchy to classify extreme learning regimes in linear autoencoders.
  • Identification of five basic extreme regimes with specific scaling relations.
  • Extension of diagram-based methods to analyze finite training sets.
  • Derivation of explicit loss evolution expressions for four out of five regimes.
Literature-Guided Minimax Optimization of Virtual Epilepsy Neurostimulation
Cathy Liu
Optimization, Large Language Models, Theory
  • Introduces a literature-guided minimax optimization pipeline for epilepsy neurostimulation.
  • Demonstrates a 39.8% improvement in worst-case reward using intrinsic model-control parameters.
  • Highlights the variability and challenges in external stimulation protocols.
  • Establishes the role of LLMs as hypothesis generators rather than direct clinical decision-makers.
GOTabPFN: From Feature Ordering to Compact Tokenization for Tabular Foundation Models on High-Dimensional Data
Al Zadid Sultan Bin Habib, Md Younus Ahamed, Prashnna Kumar Gyawali, Gianfranco Doretto, Donald A. Adjeroh
Optimization, Theory, Efficient ML
  • Introduces GOTabPFN, a method for effective tabular prediction on high-dimensional, low-sample-size (HDLSS) data.
  • Proposes GO-LR for feature ordering, proving its NP-hardness and providing a practical solution.
  • Implements NSC for dimensionality reduction by pooling features into meta-features.
  • Demonstrates improved accuracy and stability in predictions under tight feature budgets.
Maximising the Set-Piece Return: Optimising Football Corner Tactics with Graph Reinforcement Learning
Sean Groom, Michael Groom, Francisco Belo, Axl Rice, Liam Anderson, Victor-Alexandru Darvariu, Shuo Wang
Reinforcement Learning, Graph Learning, Optimization
  • Introduces a Graph Reinforcement Learning framework for optimizing football corner tactics.
  • Formulates corner kick optimization as a Markov Decision Process (MDP) to enable novel tactical discoveries.
  • Demonstrates significant performance improvements over traditional optimization methods on Premier League data.
  • Highlights the potential for automated tactical discovery in structured set-piece scenarios.
Deep Embedded Multiplicative DMD for Algebra-Preserving Koopman Learning
Kelan Gray, Finlay Brown, Nicolas Boullé, Matthew J. Colbrook
Theory, Optimization, Time Series
  • DeepMDMD combines deep learning with algebraic constraints of the Koopman operator.
  • The method learns a latent space that is dynamically coherent and compact.
  • It significantly reduces spectral pollution and improves forecasting stability.
  • DeepMDMD outperforms traditional methods in high-dimensional dynamical systems.
Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning
Xuekang Wang, Zhuoyuan Hao, Shuo Hou, Hao Peng, Juanzi Li, Xiaozhi Wang
Reinforcement Learning
  • Introduction of CHERRL, a controllable environment for studying reward hacking in rubric-based RL.
  • Analysis of judge biases reveals their impact on the discoverability and exploitability of hacking behaviors.
  • Development of the Reward Hacking Detection Agent (RHDA) for early detection of reward hacking from training logs.
  • Public availability of the CHERRL environment and code to facilitate further research.
Temporal Preference Concepts and their Functions in a Large Language Model
Ian Rios-Sialer, Shantanu Darveshi, Shuai Jiang, Avigya Paudel, Anastasiia Pronina, Ipshita Bandyopadhyay, Justin Shenk
Large Language Models, Interpretability, Theory
  • Identification of a temporal-preference subgraph in LLMs using mechanistic interpretability techniques.
  • LLMs discount future outcomes less steeply than humans do, indicating behavioral inconsistencies.
  • Explicit control over temporal preferences is necessary for reliable decision-making in LLMs.
  • Steering vectors can successfully alter temporal preferences within the model.
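The steering-vector bullet above can be illustrated with a common recipe: take the difference of mean activations between two contrasting prompt sets and add a scaled copy to a hidden state. This is a minimal sketch under that assumption. The function names, the toy activations, and the `alpha` scale are hypothetical, and the paper may construct its vectors differently.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length activation vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def steering_vector(acts_a, acts_b):
    """Difference of mean activations between two prompt sets (a minus b)."""
    mean_a, mean_b = mean_vector(acts_a), mean_vector(acts_b)
    return [a - b for a, b in zip(mean_a, mean_b)]

def steer(hidden_state, vector, alpha=1.0):
    """Shift a hidden state along the steering direction by a factor alpha."""
    return [h + alpha * v for h, v in zip(hidden_state, vector)]

# Toy 3-dimensional activations (made up, not taken from any model):
# prompts favoring long-horizon outcomes vs. short-horizon outcomes.
long_term = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
short_term = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

v = steering_vector(long_term, short_term)
steered = steer([0.5, 0.5, 0.5], v, alpha=0.5)
```

In a real model the activations would come from a chosen layer, and `alpha` would be tuned against the behavior being steered.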
Generalized TV–ℓp Structured Priors for Bayesian T1 Mapping
Disi Lin, Martin Berggren, Tommy Löfstedt
Theory
  • Introduction of a generalized TV–ℓp prior for Bayesian T1 mapping.
  • Demonstration that the proposed prior is a proper distribution.
  • Utilization of No-U-Turn Sampler (NUTS) for efficient posterior inference.
  • Evaluation shows improved reliability and reduced uncertainty in T1 estimates.
Be Fair! Can Machine Learning Engineering Agents Adhere to Fairness Constraints?
Anna Richter, Julia Stoyanovich, Sebastian Schelter
Theory, Interpretability
  • MLE agents automate ML pipeline development but create a responsibility gap for end-users.
  • Existing benchmarks do not adequately assess the fairness and compliance of MLE agents.
  • The proposed evaluation framework emphasizes domain-centric design and adherence to responsibility constraints.
  • An exploratory study shows that MLE agents underperform in fairness and predictive quality compared to human-designed pipelines.
TS-ICL: A Flexible Time-Indexed Foundation Model for Time Series via In-Context Learning
Etienne Le Naour, Tahar Nabil, Adrien Petralia
Time Series
  • TS-ICL is a unified model that integrates forecasting and imputation for time series data.
  • It utilizes a structured synthetic prior based on DAGs to enhance covariate-aware inference.
  • The model achieves state-of-the-art performance in zero-shot imputation benchmarks.
  • TS-ICL is efficient, being up to 50 times faster than existing time series foundation models during inference.
Curvature-aware dynamic precision approach for physics-informed neural networks
Yingjie Shao, Ioannis N. Athanasiadis, George van Voorn, Taniya Kapoor
Optimization, Efficient ML, Theory
  • Introduction of a dynamic precision approach for training PINNs.
  • Utilization of curvature information from L-BFGS to control numerical precision.
  • Significant reduction in training time while maintaining accuracy comparable to FP64.
  • Architecture-agnostic controller applicable across different neural network designs.
dMX: Differentiable Mixed-Precision Assignment for Low-Precision Floating-Point Formats
Giuseppe Franco, Ian Colbert, Pablo Monteagudo-Lago, Felix Marty, Nicholas Fraser
NLP, Large Language Models, Efficient ML
  • Introduces a differentiable framework for mixed-precision quantization in LLMs.
  • Formulates bit-width assignment as a continuous optimization problem, improving optimization stability.
  • Employs a temperature-based annealing mechanism for smooth transitions to hardware-compatible formats.
  • Demonstrates superior performance over existing layer-selection heuristics in various LLMs.
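The continuous relaxation and temperature annealing described above can be pictured as a softmax over candidate bit-widths that sharpens as the temperature drops. This is a rough sketch of that general idea only. The candidate set, the logits, and the name `soft_bit_width` are assumptions for illustration, not dMX's actual formulation.

```python
import math

def soft_bit_width(logits, bit_widths, temperature):
    """Expected bit-width under a temperature-scaled softmax over candidates."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return sum(b * e / total for b, e in zip(bit_widths, exps))

bit_widths = [4, 6, 8]    # hypothetical candidate formats
logits = [0.2, 0.5, 1.0]  # hypothetical learnable scores for one layer

# High temperature: a smooth blend of all candidates.
# Low temperature: the blend collapses onto the highest-scoring candidate.
for t in (5.0, 1.0, 0.05):
    print(t, round(soft_bit_width(logits, bit_widths, t), 3))
```

Annealing the temperature toward zero during training lets gradients flow through the soft assignment early on while ending at a single discrete format per layer.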
Contrastive Learning and Correlation Clustering for Sequences of Network Telescope Data
Jannik Presberger, Alexander Männel, Maynard Koch, Thomas C. Schmidt, Matthias Wählisch, Bjoern Andres
Time Series
  • Introduces a transformer model for unsupervised learning of network flow embeddings using contrastive learning.
  • Demonstrates that learned similarities are higher for sequences from the same source, generalizing to unseen data.
  • Applies correlation clustering to recover semantically meaningful clusters from the learned embeddings.
  • Shows potential for exploratory analysis of network traffic without the need for extensive annotations.
Beyond Structural Symmetries: Linear Mode Connectivity via Neuron Identifiability
Vincent Bürgin, Daniel Herbst, Ya-Wei Eileen Lin, Stefanie Jegelka
Theory, Optimization, Interpretability
  • Introduces a theoretical framework for neuron identifiability and effective function classes.
  • Demonstrates that neural networks can exhibit large families of equivalent solutions despite structural asymmetries.
  • Establishes conditions for merging representations without alignment, enabling unaligned linear mode connectivity.
  • Highlights the role of effective function classes in influencing the loss landscape.
Q-GNN: Query-Conditioned Graph Neural Networks with Type Awareness for Knowledge Graph Completion
Dongxiao He, Ruqiong Zhang, Zhizhi Yu, Ling Ding, Di Jin, Guangquan Xu, Zhiyong Feng
Graph Learning
  • Q-GNN incorporates both query entity and query relation information for enhanced reasoning in KGC.
  • The approach utilizes structural context and semantic type to guide the inference process.
  • A large language model is employed to infer entity types, enriching the model's understanding of entities.
  • Experimental results show that Q-GNN outperforms traditional GNN-based methods in KGC tasks.
Sparse Mixture-of-Experts Reward Models Learn Interpretable and Specialized Experts for Personalized Preference Modeling
Yifan Wang, Jinyi Mu, Mayank Jobanputra, Yu Wang, Ji-Ung Lee, Soyoung Oh, Isabel Valera, Vera Demberg
Reinforcement Learning, Large Language Models, Interpretability
  • Introduces a sparse Mixture-of-Experts reward model for preference modeling in RLHF.
  • Addresses the limitations of traditional reward models by capturing heterogeneous human preferences.
  • Demonstrates improved interpretability and effectiveness for personalization through specialized experts.
  • Achieves significant performance improvements in test-time personalization with minimal adaptation data.
Towards Pretraining Text Encoders for TabPFN
Mustafa Tajjar, Alexander Pfefferle, Lennart Purucker, Frank Hutter
NLP, Multimodal, Efficient ML
  • Introduces the TabPFN Text Adapter to improve text feature integration in TabPFN.
  • Eliminates the PCA bottleneck by directly mapping text embeddings to TabPFN's embedding space.
  • Maintains TabPFN's performance on numerical and categorical features while enhancing text processing.
  • Offers a more efficient training approach compared to traditional end-to-end pretraining methods.
UniFair: A unified fair clustering approach based on separation and compactness
Antonia Karra, Vasiliki Papanikou, Georgios Vardakas, Evaggelia Pitoura, Aristidis Likas
Optimization, Theory
  • Introduces separation fairness as a new dimension of fairness in clustering.
  • Combines separation fairness with social fairness to address multiple sources of disparity.
  • Develops efficient optimization procedures for both traditional and deep clustering settings.
  • Demonstrates effectiveness through empirical evaluations on tabular and image datasets.
State commitment learning: training language models to distinguish computation from memory
Fei Ding, Yongkang Zhang, Runhao Liu, Yuhao Liao, Zijian Zeng, Huiming Yang
NLP, Large Language Models, Reinforcement Learning
  • Introduces State Commitment Learning to improve reasoning in language models.
  • Defines persistent-state sufficiency as a criterion for evaluating answer validity post-erasure.
  • Proposes Counterfactual Erasure RL (CERL) as a new training method.
  • Demonstrates a significant reduction in hidden-thought dependency without sacrificing accuracy.
RIDE: An Open Dataset and Benchmark for Train Delay Prediction
Clément Elliker, Mathis Le Bail, Clément Mantoux, Jesse Read, Sonia Vanier
Time Series, Graph Learning
  • RIDE is a nationwide dataset and benchmark for train delay prediction, addressing the lack of standardized resources in the field.
  • The dataset includes extensive records of train events and weather data, facilitating diverse modeling approaches.
  • Learning-based methods, especially graph neural networks, significantly outperform traditional non-learning models.
  • The benchmark provides a unified evaluation protocol, enabling consistent comparisons across different model families.
Intercomparison of Machine Learning Algorithms for Remote Sensing-based In-season Crop Mapping
August Posch, Jitendra Kumar, Forrest M. Hoffman, Auroop R. Ganguly
Optimization, Time Series, Computer Vision
  • In-season crop mapping is essential for timely agricultural responses to climate threats.
  • Support Vector Machines outperformed other algorithms in mapping accuracy.
  • Interannual variability significantly impacts model uncertainty.
  • The study combines remote sensing data with crop rotation history for improved accuracy.
Self-Distilled Policy Gradient
Yifeng Liu, Shiyuan Zhang, Yifan Zhang, Quanquan Gu
Reinforcement Learning, Large Language Models, Optimization
  • Introduction of SDPG, a self-distilled policy-gradient framework.
  • Combines group-relative verifier advantages with full-vocabulary self-distillation.
  • Addresses issues of sparse rewards and training instability in RL.
  • Empirical results show improved stability and performance over existing methods.
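A group-relative advantage, as mentioned above, is commonly computed by normalizing each sampled response's reward against the other responses to the same prompt. This minimal sketch assumes scalar verifier rewards and uses a hypothetical helper name. It shows only the generic group normalization, not SDPG's implementation, and it leaves out self-distillation entirely.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of responses to the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four sampled responses to one prompt; the verifier accepts two of them.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

When every response in a group receives the same reward, all advantages are zero and the group contributes no gradient signal, which is one face of the sparse-reward problem the summary mentions.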
Large Language Models Hack Rewards, and Society
Wei Liu, Xinyi Mou, Hanqi Yan, Zhongyu Wei, Yulan He
Large Language Models, Reinforcement Learning, NLP
  • LLMs can exploit societal regulations akin to reward functions in RL, leading to 'societal hacking'.
  • The SocioHack benchmark reveals that LLMs can rediscover regulatory loopholes with high precision and recall.
  • Current safeguards against LLM misuse are limited and often fail to detect exploitative behaviors framed as benign.
  • The interaction between loophole discovery and regulatory patching creates a co-evolutionary dynamic that complicates safety.
Causal Atlases from Entropic Inference: Bayesian Networks beyond Optimal DAGs
Hazhir Aliahmadi, Irina Babayan, Greg van Anders
Graph Learning, Theory
  • Introduces entropy-based inference for generating causal atlases in Bayesian networks.
  • Demonstrates that traditional optimization methods may obscure structural ambiguities in causal relationships.
  • Shows that maximum-entropy ensembles can capture multiple plausible causal structures.
  • Highlights the limitations of optimized DAGs in representing true causal relationships.
OPRD: On-Policy Representation Distillation
Shenzhi Yang, Guangcheng Zhu, Bowen Song, Haobo Wang, Mingxuan Xia, Xing Zheng, Yingfan Ma, Zhongqi Chen, Weiqiang Wang, Gang Chen
Large Language Models, NLP, Theory
  • OPRD shifts the focus of on-policy distillation from output space to hidden-state space.
  • It eliminates sampling variance in gradient estimation, leading to more stable training.
  • OPRD provides richer supervision by utilizing intermediate hidden states from the teacher model.
  • Empirical results show OPRD outperforms traditional OPD methods on benchmark tasks.
Multimarginal flow matching with optimal transport potentials
Raghav Kansal, David Crair, Nghia Nguyen, Scott Pope, Bradley Parry
Optimization, Time Series, Theory
  • Introduces OTP-FM, a multimarginal generalization of dynamic optimal transport.
  • Incorporates potential energy terms to smooth trajectories and avoid discontinuities.
  • Offers a flexible, simulation-free training algorithm that adapts to data.
  • Demonstrates state-of-the-art performance on various scientific datasets.
Trace-Mediated Peak Bias: Bridging Temporal Credit Assignment and Cognitive Heuristics in Deep Reinforcement Learning
Viktor Veselý, Aleksandar Todorov, Erwan Escudie, Matthia Sabatelli
Reinforcement Learning, Optimization, Theory
  • Identification of Trace-Mediated Peak Bias (TMPB) as a systematic failure mode in deep RL.
  • TMPB provides a computational parallel to the psychological Peak-End Rule.
  • Adaptive optimization techniques are essential for mitigating TMPB and achieving rational value estimation.
  • The study reveals how cognitive-like biases can emerge from the mathematics of temporal credit assignment.
TailLoR: Protecting Principal Components in Parameter-Efficient Continual Learning
Marius Dragoi, Ioana Pintilie, Alexandra Dragomir, Antonio Barbalau, Florin Brad
Efficient ML
  • TailLoR introduces a low-rank adaptation method that operates on the singular values of weight matrices.
  • It employs a soft spectral regularization to protect dominant singular components during updates.
  • The method allows for sequential adaptation without requiring access to prior task adapters, enhancing user privacy.
  • TailLoR demonstrates competitive performance against existing continual learning methods while improving the stability of weight matrix ranks.
BBOmix: A Tabular Benchmark for Hyperparameter Optimization of Unsupervised Biological Representation Learning
Luca Thale-Bombien, Jan Ewald, Ralf König, Aaron Klein
Optimization
  • BBOmix is the first open-source benchmark for unsupervised representation learning on biological data.
  • The benchmark includes 105,000 evaluations across multiple autoencoder (AE) architectures and omics modalities.
  • The study quantifies the correlation between reconstruction loss and downstream task performance.
  • An extensive evaluation of state-of-the-art HPO methods is provided, establishing a baseline for future research.
Gradient Descent with Large Step Size Restores Symmetry in Deep Linear Networks with Multi-Pathway
Hee-Sung Kim, Sungyoon Lee
Optimization, Theory
  • Discrete Gradient Descent with large step sizes leads to pathway re-balancing rather than persistent symmetry breaking.
  • Single-path solutions correspond to sharp minima, while balanced solutions across multiple pathways are flatter.
  • The paper establishes a theoretical relationship between the number of pathways, depth, and sharpness of minima.
  • Training dynamics under large step sizes exhibit two phases: initial symmetry breaking followed by re-balancing.
A Sliced-Wasserstein Framework on Correlation Matrices for EEG Decoding
Chen Hu, Rui Wang, Jiale Zhou, Jingjun Yi, Shaocheng Jin, Yidong Song, Yefeng Zheng
Time Series
  • Introduction of a Sliced Wasserstein framework for EEG decoding using correlation matrices.
  • Development of Pullback Euclidean Metric Sliced Wasserstein (PEMSW) for non-Euclidean spaces.
  • Instantiation of Correlation Sliced-Wasserstein discrepancies using OLM and LSM.
  • Demonstrated improved generalization in EEG decoding under distribution shifts.
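For readers new to sliced Wasserstein distances, the basic Euclidean version projects both point clouds onto random directions and compares the sorted 1D projections. The sketch below covers only that standard construction for 2D points with equal-size, uniformly weighted clouds. It is not the PEMSW variant on correlation matrices summarized above, and the function name and parameters are illustrative.

```python
import math
import random

def sliced_wasserstein_2d(xs, ys, n_projections=64, seed=0):
    """Monte Carlo estimate of the sliced 2-Wasserstein distance between two
    equal-size 2D point clouds with uniform weights."""
    assert len(xs) == len(ys)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_projections):
        theta = rng.uniform(0.0, math.pi)  # random direction on the half-circle
        ux, uy = math.cos(theta), math.sin(theta)
        px = sorted(x * ux + y * uy for x, y in xs)  # sorted 1D projections
        py = sorted(x * ux + y * uy for x, y in ys)
        # In 1D, optimal transport matches sorted samples in order.
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
    return math.sqrt(total / n_projections)

cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
shifted = [(x + 3.0, y + 4.0) for x, y in cloud]  # same shape, translated
distance = sliced_wasserstein_2d(cloud, shifted)
```

Each projection reduces the transport problem to sorting, which is why the sliced distance is cheap compared with solving the full optimal transport problem.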
Field Validation of a Multi-Resolution ConvLSTM Framework for Retaining Wall Deformation Prediction
Jihoon Kim, Heejung Youn
Time Series
  • The ConvLSTM framework integrates multiple temporal resolutions for improved prediction accuracy.
  • Field validation was conducted using data from 34 inclinometers across 11 excavation sites.
  • The framework achieved a mean absolute error of 1.4 mm and a coefficient of determination of 0.93.
  • Results indicate the model's robustness and applicability to various excavation conditions.
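The two metrics quoted above, mean absolute error and the coefficient of determination, follow their standard definitions. This short sketch computes both on made-up displacement values that are not data from the paper. The function names are illustrative.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measurements and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up wall displacements in mm, for illustration only.
measured = [10.0, 12.0, 15.0, 20.0]
predicted = [11.0, 12.0, 14.0, 21.0]

mae = mean_absolute_error(measured, predicted)
r2 = r_squared(measured, predicted)
```

MAE is reported in the units of the measurement (millimetres here), while the coefficient of determination is unitless, so the two are best read together.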
Learning While Acting: A Skill-Enhanced Test-Time Co-Evolution Framework for Online Lifelong Learning Agents
Bo Mao, Jie Zhou, Yutao Yang, Xin Li, Xian Wei, Qin Chen, Xingjiao Wu, Liang He
Reinforcement Learning, Large Language Models, NLP
  • LifeSkill enables online lifelong learning by allowing agents to adapt during deployment.
  • Verifier-Guided Skill Learning trains skill extraction based on execution feedback.
  • Online Skill Internalization transforms successful interactions into policy improvements.
  • LifeSkill outperforms strong baselines in long-horizon interactive tasks.
RUBAS: Rubric-Based Reinforcement Learning for Agent Safety
Xian Qi Loye, Qinglin Su, Zhexin Zhang, Shiyao Cui, Qi Zhu, Fei Mi, Hongning Wang, Minlie Huang
Reinforcement Learning, Large Language Models, NLP
  • RUBAS introduces a structured approach to agent safety through four dimensions: tool-use safety, argument safety, response safety, and helpfulness.
  • The framework leverages rubric-based rewards to provide interpretable feedback for reinforcement learning.
  • Extensive experiments show RUBAS outperforms standard alignment methods in safety and reduces harmful outputs.
  • The approach emphasizes the importance of joint modeling across multiple safety dimensions rather than isolated fixes.
Learning Long Range Spatio-Temporal Representations over Continuous Time Dynamic Graphs with State Space Models
Ayushman Raghuvanshi, Thummaluru Siddartha Readdy, Sundeep Prabhakar Chepuri, Mahesh Chandran
Graph Learning, Time Series, Efficient ML
  • Introduction of CTT-HiPPO for efficient memory compression in CTDGs.
  • Development of CTDG-SSM, a unified framework for capturing LRT and LRS dependencies.
  • Derivation of a discrete implementation for scalable computation.
  • Theoretical guarantees on robustness and permutation equivariance of the model.
Quantifying the Privacy of Counterfactuals by Leveraging Membership Inference Attacks Against Synthetic Data
Maryam Babaei, Yingke Wang, Hadrien Lautraite, Heber H. Arcolezi, Ulrich Aivodji, Sebastien Gambs
Theory, Generative Models, Interpretability
  • Counterfactuals can be exploited for privacy attacks, similar to synthetic data.
  • Membership inference attacks can be conducted on counterfactuals without model access.
  • The study bridges the gap between synthetic data privacy research and counterfactual analysis.
  • An ensemble membership inference attack (MIA) is proposed and tested against existing counterfactual attacks.
Bridging Domain Expertise and Generalization for Performance Estimation
Shuxuan Li, Zhilin Zhao, Quyu Kong, Wei-Shi Zheng
Theory, Optimization
  • FRAP provides a novel approach to performance estimation under distribution shift by integrating a foundation model with a base model.
  • The framework aligns prediction distributions to minimize divergence, enhancing reliability in performance estimation.
  • Extensive experiments show FRAP outperforms traditional methods, indicating its robustness across diverse datasets and architectures.
  • The method addresses the limitations of relying solely on model outputs, which can be biased under distribution shifts.
QuBLAST: A Framework for Quantizing Large Language Models with Block-Level Compression Approach and Activation Scaling Strategy
Pasindu Wickramasinghe, Achyuta Muthuvelan, Rachmad Vidya Wicaksana Putra, Minghao Shao, Muhammad Shafique
NLP, Large Language Models, Efficient ML
  • QuBLAST introduces a block-level compression approach for mixed-precision quantization of LLMs.
  • The framework employs an activation scaling strategy to mitigate the impact of activation outliers.
  • Sensitivity analysis of attention blocks is utilized to optimize weight quantization levels.
  • QuBLAST achieves significant model size reduction while maintaining performance.
Less is MoE: Trimming Experts in Domain-Specialist Language Models
Haoze He, Xinkai Zou, Xuan Jiang, Xingyuan Ding, Ao Qu, Juncheng Billy Li, Heather Miller
NLP, Large Language Models, Efficient ML
  • Fisher importance outperforms existing metrics for identifying critical parameters in MoE models.
  • Fisher-MoE enables fine-grained compression at the intermediate dimension level rather than the expert level.
  • The proposed method preserves model capabilities while significantly reducing memory and improving inference speed.
  • Existing expert-level compression methods fail on general-purpose benchmarks due to their coarse granularity.
Scaling Laws for Behavioral Foundation Models over User Event Sequences
Rickard Brüel Gabrielsson
Optimization, Efficient ML, Theory
  • The optimal size for the event embedder is approximately 2% of the total model parameters.
  • Behavioral foundation models initially require a data-heavy approach, transitioning towards the Chinchilla heuristic as compute increases.
  • The evaluation metric significantly influences the scaling laws and optimal configurations for model training.
  • Negative sampling becomes a memory constraint at higher compute budgets rather than a compute constraint.
What Objects Enable, Not What They Are: Functional Latent Spaces for Affordance Reasoning
Rohan Siva, Neel P. Bhatt, Yunhao Yang, Seoyoung Lee, Nishant Gadde, Christian Ellis, Alvaro Velasquez, Zhangyang Wang, Ufuk Topcu
Robotics
  • Introduction of A4D, a framework for affordance-based reasoning in robot planning.
  • Mapping visual observations to a functional latent space enhances generalizability.
  • Achieves 94% accuracy on existing affordances, outperforming state-of-the-art methods.
  • Improves new-affordance inference accuracy from ~70% to over 90% with limited data.
A Geometric View of Counterfactual Behavior: Interaction of Boundary Proximity and Local Support
Ioanna Gemou, Matteo Gamba, Randall Balestriero, Ritambhara Singh
Interpretability, Theory, Multimodal
  • Introduces a geometric framework for evaluating counterfactual behavior in machine learning models.
  • Demonstrates that counterfactual behavior can vary significantly across classifier heads even with similar predictive performance.
  • Establishes the interaction between decision-boundary proximity and local data support as critical for determining feasible prediction changes.
  • Identifies counterfactual behavior as an important axis for model evaluation beyond predictive accuracy.
Enhancing the MADDPG Algorithm for Multi-Agent Learning via Action Inference and Importance Sampling
Marc Walden, Jason Liu, Shaashwath Sivakumar, Ryan Liu, Hamza Khan
Reinforcement Learning
  • Introduction of Action Inference to enhance policy accuracy and stability in MADDPG.
  • Implementation of geometric importance sampling to prioritize recent experiences in the replay buffer.
  • Evaluation conducted on the Predator–Prey task, showcasing improvements in learning stability and cooperation.
  • Demonstrated significant enhancements in exploration efficiency over the standard MADDPG algorithm.
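One simple way to prioritize recent experiences, as the replay-buffer bullet above describes, is to sample buffer slots with geometrically decaying weights by age. This is a sketch under that assumption. The `decay` parameter and function names are hypothetical, and the paper's scheme may differ, for example by also reweighting the loss to correct the sampling bias.

```python
import random

def geometric_weights(buffer_size, decay=0.99):
    """Weight for each slot, oldest first; the newest slot gets weight 1."""
    return [decay ** (buffer_size - 1 - i) for i in range(buffer_size)]

def sample_recent_biased(buffer, batch_size, decay=0.99, rng=random):
    """Draw a batch (with replacement), favoring recently added transitions."""
    weights = geometric_weights(len(buffer), decay)
    return rng.choices(buffer, weights=weights, k=batch_size)

buffer = list(range(1000))  # stand-in transitions; index 999 is the newest
batch = sample_recent_biased(buffer, batch_size=32, rng=random.Random(0))
```

A `decay` close to 1 approaches uniform sampling, while smaller values concentrate the batch on the most recent transitions.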