AI-generated summaries

Today's ML research,
without the noise.

Daily summaries of the latest machine learning papers from arXiv, processed every 8 hours.

44 papers today · 8-hour update frequency · 7 days of history
Reaching Beyond the Mode: RL for Distributional Reasoning in Language Models
Isha Puri, Mehul Damani, Idan Shenfeld, Marzyeh Ghassemi, Jacob Andreas, Yoon Kim
NLP Large Language Models Reinforcement Learning
  • Multi-Answer RL allows language models to generate multiple plausible answers with confidence estimates in a single pass.
  • The approach addresses the issue of entropy collapse seen in traditional RL training for LMs.
  • Empirical results show substantial improvements in answer diversity, coverage, and calibration scores.
  • Models trained with Multi-Answer RL are more token-efficient and accurate, particularly in coding tasks.
Anchored-Branched Steady-state WInd Flow Transformer (AB-SWIFT): a metamodel for 3D atmospheric flow in urban environments
Armand de Villeroché, Rem-Sophia Mouradi, Vincent Le Guen, Sibo Cheng, Marc Bocquet, Alban Farchi, Patrick Armand, Patrick Massin
Theory Efficient ML Graph Learning
  • AB-SWIFT is the first transformer-based neural operator specifically designed for local-scale atmospheric flow modeling.
  • The model is trained on a new dataset that includes various urban geometries and atmospheric stratifications.
  • AB-SWIFT achieves superior accuracy compared to existing transformer and graph neural network models.
  • The model's architecture allows for flexible representation of terrain topology and atmospheric conditions.
Neural Network Conversion of Machine Learning Pipelines
Man-Ling Sung, Jan Silovsky, Man-Hung Siu, Herbert Gish, Chinnu Pittapally
Theory Efficient ML Optimization
  • Explores the conversion of traditional ML pipelines into neural networks using a student-teacher learning approach.
  • Focuses on transferring knowledge from random forest classifiers to neural networks.
  • Demonstrates that student NNs can match the performance of teacher models with proper hyper-parameter tuning.
  • Investigates the use of random forests for hyper-parameter selection in neural networks.
A Practical Guide Towards Interpreting Time-Series Deep Clinical Predictive Models: A Reproducibility Study
Yongda Fan, John Wu, Andrea Fitzpatrick, Naveen Baskaran, Jimeng Sun, Adam Cross
Time Series Interpretability
  • Attention mechanisms can effectively enhance interpretability in clinical predictive models.
  • Black-box interpreters like KernelSHAP and LIME are not suitable for time-series clinical prediction tasks.
  • Many interpretability methods lack reliability and trustworthiness.
  • The study provides guidelines for improving interpretability in clinical settings.
CVA: Context-aware Video-text Alignment for Video Temporal Grounding
Sungho Moon, Seunghun Lee, Jiwan Seo, Sunghoon Im
Computer Vision Multimodal
  • Introduction of Query-aware Context Diversification (QCD) to enhance data augmentation.
  • Development of Context-invariant Boundary Discrimination (CBD) loss for improved semantic consistency.
  • Design of Context-enhanced Transformer Encoder (CTE) for effective multi-scale temporal context modeling.
  • Achieved state-of-the-art performance on VTG benchmarks, notably improving Recall@1 scores.
Spatiotemporal System Forecasting with Irregular Time Steps via Masked Autoencoder
Kewei Zhu, Yanze Xin, Jinwei Hu, Xiaoyuan Cheng, Yiming Yang, Sibo Cheng
Time Series
  • Introduction of the Physics-Spatiotemporal Masked Autoencoder (P-STMAE) for forecasting irregular time series.
  • Elimination of data imputation while preserving the physical integrity of dynamical systems.
  • Significant improvements in prediction accuracy and computational efficiency over traditional methods.
  • Demonstrated applicability in real-world scenarios, including ocean temperature forecasting.
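To make the no-imputation idea concrete, below is a minimal masked-autoencoder sketch over (value, timestamp) tokens in PyTorch. It is a toy under assumed shapes; the joint value/time embedding and the tiny transformer are illustrative stand-ins, not the P-STMAE architecture or its physics constraints.

    import torch
    import torch.nn as nn

    class IrregularMAE(nn.Module):
        """Toy masked autoencoder over irregular (value, timestamp) tokens."""
        def __init__(self, d=64):
            super().__init__()
            self.val_embed = nn.Linear(1, d)
            self.time_embed = nn.Linear(1, d)   # raw timestamps, no resampling
            layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Linear(d, 1)
            self.mask_token = nn.Parameter(torch.zeros(1, 1, d))

        def forward(self, values, times, mask):
            # values, times: (B, L); mask: (B, L), True where a value is hidden
            v = self.val_embed(values.unsqueeze(-1))
            v = torch.where(mask.unsqueeze(-1), self.mask_token, v)
            tokens = v + self.time_embed(times.unsqueeze(-1))  # times stay visible
            return self.head(self.encoder(tokens)).squeeze(-1)

    B, L = 8, 32
    times = torch.sort(torch.rand(B, L), dim=1).values         # irregular steps
    values = torch.sin(6.28 * times) + 0.05 * torch.randn(B, L)
    mask = torch.rand(B, L) < 0.5

    model = IrregularMAE()
    pred = model(values * (~mask), times, mask)                # hide masked values
    loss = ((pred - values)[mask] ** 2).mean()                 # reconstruct them
    loss.backward()
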
Interpretable PM2.5 Forecasting for Urban Air Quality: A Comparative Study of Operational Time-Series Models
Moazzam Umer Gondal, Hamad ul Qudous, Asma Ahmad Farhan, Sultan Alamri
Time Series Interpretability
  • Developed a transparent, leakage-aware forecasting workflow for PM2.5 prediction.
  • Compared three operational time-series models: SARIMAX, Facebook Prophet, and NeuralProphet.
  • Demonstrated that lightweight models can achieve competitive accuracy and efficiency.
  • Found that online residual correction significantly improves model robustness.
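The "online residual correction" finding is easy to illustrate: add a rolling mean of recent forecast residuals to each new forecast, using only past information so nothing leaks. A generic sketch, assuming an arbitrary base forecaster rather than the paper's SARIMAX/Prophet/NeuralProphet pipelines:

    import numpy as np

    def residual_corrected_forecast(y, base_forecast, window=24):
        """One-step-ahead forecasts with online residual correction.

        y: observed series; base_forecast: raw model forecasts aligned
        with y. The correction is the rolling mean of past residuals,
        so no future information leaks into any forecast.
        """
        corrected = np.empty_like(base_forecast)
        residuals = []
        for t in range(len(y)):
            bias = np.mean(residuals[-window:]) if residuals else 0.0
            corrected[t] = base_forecast[t] + bias
            residuals.append(y[t] - base_forecast[t])  # recorded after forecasting
        return corrected

    # Toy check: a persistence forecaster with a constant bias gets corrected.
    rng = np.random.default_rng(0)
    y = np.cumsum(rng.normal(size=200)) + 50.0
    base = np.roll(y, 1) - 2.0                # biased naive forecast
    base[0] = y[0]
    print(np.abs(y - base).mean(),
          np.abs(y - residual_corrected_forecast(y, base)).mean())
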
Light Cones For Vision: Simple Causal Priors For Visual Hierarchy
Manglam Kartik, Neel Tushar Shah
Computer Vision
  • Introduction of Worldline Slot Attention for modeling visual hierarchies.
  • Demonstration that Euclidean geometry fails to capture hierarchical relationships.
  • Lorentzian geometry significantly outperforms hyperbolic embeddings in hierarchical object discovery.
  • Worldline binding allows for multi-scale information aggregation across hierarchy levels.
Experiential Reflective Learning for Self-Improving LLM Agents
Marc-Antoine Allard, Arnaud Teinturier, Victor Xing, Gautier Viaud
Large Language Models NLP Reinforcement Learning
  • ERL enables LLM agents to adapt to new environments by reflecting on past experiences.
  • The framework generates reusable heuristics that improve task execution without requiring parameter updates.
  • ERL outperforms existing experiential learning methods, achieving a 7.8% increase in success rate on the Gaia2 benchmark.
  • Heuristic retrieval is critical for enhancing performance and reliability in task completion.
Flow matching on homogeneous spaces
Francesco Ruscelli
Generative Models Theory Efficient ML
  • Introduces a framework for flow matching on homogeneous spaces by lifting to Lie groups.
  • Eliminates the need for complex geometric computations like geodesics.
  • Reformulates flow matching as a Euclidean task on Lie algebras, enhancing computational efficiency.
  • Demonstrates the framework's applicability through case studies on specific homogeneous spaces.
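The key simplification, flow matching as a Euclidean regression on the Lie algebra, can be sketched directly: in algebra coordinates the interpolant is a straight line and the training target is a constant velocity. The R^d stand-in and MLP below are assumptions for illustration; the lifting to the group and the exponential map back to the homogeneous space are omitted.

    import torch
    import torch.nn as nn

    d = 8  # dimension of the Lie algebra, treated as plain R^d
    net = nn.Sequential(nn.Linear(d + 1, 128), nn.SiLU(), nn.Linear(128, d))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    def cfm_loss(x1):
        """Conditional flow matching: regress the straight-line velocity."""
        x0 = torch.randn_like(x1)            # prior sample in algebra coords
        t = torch.rand(x1.size(0), 1)
        xt = (1 - t) * x0 + t * x1           # Euclidean interpolation, no geodesics
        v_target = x1 - x0                   # constant target velocity
        v_pred = net(torch.cat([xt, t], dim=-1))
        return ((v_pred - v_target) ** 2).mean()

    for step in range(200):
        x1 = torch.randn(256, d) * 0.5 + 2.0  # stand-in "data" in algebra coords
        loss = cfm_loss(x1)
        opt.zero_grad(); loss.backward(); opt.step()

    # Sampling would integrate dx/dt = net(x, t) from t = 0 to 1 and then
    # map the result back to the homogeneous space via the group exponential.
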
Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML
Yassien Shaalan
Efficient ML Time Series Audio & Speech
  • HYPERTINYPW compresses neural networks by generating most pointwise (PW) weights at load time, significantly reducing memory usage.
  • The method retains the first PW layer in INT8 format to stabilize performance for morphology-sensitive tasks.
  • Achieves a 6.31× reduction in model size while maintaining over 95% accuracy on benchmark ECG datasets.
  • Compatible with standard integer operations, ensuring easy integration into existing TinyML frameworks.
Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes
Yuqian Fu, Haohuan Huang, Kaiwen Jiang, Yuanheng Zhu, Dongbin Zhao
NLP Large Language Models Optimization
  • Analysis of estimator tradeoffs in OPD reveals biases and variance characteristics.
  • Identification of three failure modes in sampled-token OPD: imbalanced signals, unreliable guidance, and tokenizer mismatches.
  • Introduction of teacher top-K local support matching as a solution to improve OPD.
  • Empirical results show enhanced stability and performance in math reasoning tasks.
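"Teacher top-K local support matching" can be pictured as a KL divergence computed only over the teacher's top-K tokens, with both distributions renormalized on that support. The renormalization below is one plausible reading, offered as a sketch rather than the paper's exact loss:

    import torch
    import torch.nn.functional as F

    def topk_support_kl(student_logits, teacher_logits, k=20):
        """KL(teacher || student) restricted to the teacher's top-k support.

        Both tensors: (batch, seq, vocab). Restricting to the local
        support avoids noisy signals from the long tail of the vocab.
        """
        top_vals, top_idx = teacher_logits.topk(k, dim=-1)
        s_vals = student_logits.gather(-1, top_idx)
        p = F.softmax(top_vals, dim=-1)          # renormalized teacher
        log_q = F.log_softmax(s_vals, dim=-1)    # renormalized student
        return (p * (p.clamp_min(1e-9).log() - log_q)).sum(-1).mean()

    student = torch.randn(2, 16, 32000, requires_grad=True)
    teacher = torch.randn(2, 16, 32000)
    topk_support_kl(student, teacher).backward()
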
A Systematic Empirical Study of Grokking: Depth, Architecture, Activation, and Regularization
Shalima Binta Manir, Anamika Paul Rupa
Optimization Theory
  • Deeper models require training stabilization for grokking to occur.
  • Architectural differences between models are largely confounded by optimization and regularization.
  • Activation function performance is dependent on the regularization regime.
  • Weight decay enables grokking only within a narrow range of values, making it a critical hyperparameter.
Layer-Specific Lipschitz Modulation for Fault-Tolerant Multimodal Representation Learning
Diyar Altinses, Andreas Schwung
Multimodal Theory Efficient ML
  • Introduces a unified framework for fault-tolerant multimodal representation learning.
  • Develops a dual-regularization mechanism to balance sensitivity for anomaly detection and correction.
  • Demonstrates improved performance on multimodal datasets compared to existing methods.
  • Provides a theoretical analysis of perturbation effects on neural network sensitivity.
Vision Hopfield Memory Networks
Jianfeng Wang, Amine M'Charrak, Luk Koska, Xiangtao Wang, Daniel Petriceanu, Mykyta Smyrnov, Ruizhi Wang, Michael Bumbar, Luca Pinchetti, Thomas Lukasiewicz
Computer Vision Multimodal Interpretability
  • V-HMN integrates hierarchical memory mechanisms for improved data efficiency and interpretability.
  • The model employs local and global Hopfield modules for associative memory dynamics.
  • Iterative refinement updates enhance error correction and representation learning.
  • V-HMN demonstrates competitive performance on computer vision benchmarks.
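The associative-memory dynamics behind Hopfield modules fit in a few lines. This is the standard modern-Hopfield retrieval update, shown as a generic illustration of the iterative refinement idea, not V-HMN's local/global module design:

    import numpy as np

    def hopfield_retrieve(memory, query, beta=4.0, steps=3):
        """Modern Hopfield update: q <- M^T softmax(beta * M q)."""
        q = query
        for _ in range(steps):                    # iterative refinement
            scores = beta * memory @ q            # similarity to stored patterns
            attn = np.exp(scores - scores.max())
            attn /= attn.sum()
            q = memory.T @ attn                   # convex combo of patterns
        return q

    rng = np.random.default_rng(0)
    patterns = rng.normal(size=(16, 64))              # 16 stored patterns
    noisy = patterns[3] + 0.4 * rng.normal(size=64)   # corrupted probe
    recovered = hopfield_retrieve(patterns, noisy)
    print(np.argmax(patterns @ recovered))            # expected: 3 (error corrected)
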
Learning to Staff: Offline Reinforcement Learning and Fine-Tuned LLMs for Warehouse Staffing Optimization
Kalle Kujanpää, Yuying Zhu, Kristina Klinkner, Shervin Malmasi
Reinforcement Learning Large Language Models Optimization
  • Development of a Transformer-GNN architecture for offline RL that improves throughput by 2.4%.
  • LLMs require significant task-specific adaptation, with fine-tuning necessary for performance matching historical baselines.
  • Iterative preference optimization simulates manager feedback, enabling LLMs to learn and adapt effectively.
  • The framework allows for future integration of real manager feedback, enhancing human-AI collaboration.
Grokking as a Falsifiable Finite-Size Transition
Yuda Bi, Chenyu Zhang, Qiheng Wang, Vince D Calhoun
Theory
  • Introduces a structured finite-size scaling approach to analyze grokking in neural networks.
  • Defines the group order p of Z_p as an extensive variable and spectral head–tail contrast as an order parameter.
  • Demonstrates that grokking exhibits transition-like finite-size organization, challenging smooth-crossover interpretations.
  • Establishes a diagnostic chain that allows for falsifiable claims regarding grokking.
A Unified Memory Perspective for Probabilistic Trustworthy AI
Xueji Zhao, Likai Pei, Jianbo Liu, Kai Ni, Ningyuan Cao
Theory Efficient ML
  • Introduces a unified probabilistic memory abstraction for analyzing deterministic and stochastic operations.
  • Identifies a scaling mismatch between compute throughput, memory bandwidth, and entropy generation.
  • Examines architectural trade-offs between conventional von Neumann systems and emerging probabilistic compute-in-memory approaches.
  • Outlines evaluation criteria for memory systems to support probabilistic computation effectively.
Hessian-informed machine learning interatomic potential towards bridging theory and experiments
Bangchen Yin, Jian Ouyang, Zhen Fan, Kailai Lin, Hanshi Hu, Dingshun Lv, Weiluo Ren, Hai Xiao, Ji Chen, Changsu Cao
Theory Efficient ML Optimization
  • Introduction of Hi-MLIP for capturing local curvature of the potential energy surface (PES).
  • Development of HINT protocol to reduce Hessian label requirements.
  • Significant improvements in transition-state search and Gibbs free energy predictions.
  • Accurate treatment of anharmonic hydrides, matching experimental results.
Can an Actor-Critic Optimization Framework Improve Analog Design Optimization?
Sounak Dutta, Fin Amin, Sushil Panda, Jonathan Rabe, Yuejiang Wen, Paul Franzon
Optimization
  • Introduces ACOF, integrating actor-critic methodology into analog design optimization.
  • Enhances search efficiency by combining proposal and evaluation roles, improving interpretability.
  • Achieves significant performance improvements over existing optimization techniques.
  • Maintains compatibility with standard simulation-based design flows.
Knowledge-Guided Retrieval-Augmented Generation for Zero-Shot Psychiatric Data: Privacy Preserving Synthetic Data Generation
Adam Jakobsen, Sushant Gautam, Hugo Lewi Hammer, Susanne Olofsdotter, Miriam S Johanson, Pål Halvorsen, Vajira Thambawita
Generative Models Large Language Models NLP
  • Introduces a zero-shot, knowledge-guided framework for synthetic psychiatric data generation.
  • Utilizes Retrieval-Augmented Generation to ground LLM responses in clinical knowledge.
  • Demonstrates competitive performance against state-of-the-art generative models while preserving privacy.
  • Shows that clinical retrieval enhances the fidelity of generated data.
Can LLMs Beat Classical Hyperparameter Optimization Algorithms? A Study on autoresearch
Fabio Ferreira, Lucca Wobbe, Arjun Krishnakumar, Frank Hutter, Arber Zela
Optimization Large Language Models
  • Classical HPO methods outperform LLM-based agents in fixed search spaces.
  • LLM agents that edit training code can significantly improve optimization outcomes.
  • The hybrid method 'Centaur' combines classical optimization with LLM capabilities, achieving superior results.
  • Reliability in optimization methods is more critical than exploration breadth.
How unconstrained machine-learning models learn physical symmetries
Michelangelo Domina, Joseph William Abbott, Paolo Pegolo, Filippo Bigi, Michele Ceriotti
Theory Graph Learning Efficient ML
  • Unconstrained ML models can learn physical symmetries effectively through data augmentation.
  • The paper introduces metrics to measure symmetry content and equivariance in model outputs.
  • Analysis of symmetry information flow provides insights into model architecture and training.
  • Strategically injecting inductive biases can improve model stability and accuracy.
Uncertainty-Guided Label Rebalancing for CPS Safety Monitoring
John Ayotunde, Qinghua Xu, Guancheng Wang, Lionel C. Briand
Time Series
  • Introduces U-Balance, a novel approach for rebalancing imbalanced datasets in CPS safety monitoring.
  • Utilizes behavioral uncertainty to enhance label rebalancing without generating synthetic samples.
  • Demonstrates a significant correlation between behavioral uncertainty and safety outcomes.
  • Achieves a notable improvement in F1 score compared to existing methods.
GraphER: An Efficient Graph-Based Enrichment and Reranking Method for Retrieval-Augmented Generation
Ruizhong Miao, Yuying Wang, Rongguang Wang, Chenyang Li, Tao Sheng, Sujith Ravi, Dan Roth
NLP Graph Learning Efficient ML
  • GraphER enhances retrieval-augmented generation by capturing multiple forms of proximity beyond semantic similarity.
  • The method operates independently of knowledge graphs, allowing for seamless integration with existing vector stores.
  • GraphER is retriever-agnostic and introduces negligible latency, making it suitable for production environments.
  • Experiments show that GraphER significantly improves retrieval performance on complex queries compared to traditional methods.
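One way to picture knowledge-graph-free enrichment: build a kNN graph over the chunk embeddings already in the vector store and propagate retriever scores one hop, so chunks near many high-scoring neighbors rise. This is a generic score-propagation sketch under assumed shapes, not GraphER's actual method:

    import numpy as np

    def graph_rerank(emb, scores, k=5, alpha=0.5):
        """Rerank by mixing each chunk's score with its kNN neighbors' scores.

        emb: (N, d) chunk embeddings; scores: (N,) retriever scores.
        """
        emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
        sim = emb @ emb.T
        np.fill_diagonal(sim, -np.inf)                  # no self-neighbors
        neighbors = np.argsort(sim, axis=1)[:, -k:]     # k nearest per chunk
        propagated = scores[neighbors].mean(axis=1)     # one-hop aggregation
        return alpha * scores + (1 - alpha) * propagated

    rng = np.random.default_rng(0)
    emb = rng.normal(size=(100, 384))
    scores = rng.uniform(size=100)
    print(np.argsort(-graph_rerank(emb, scores))[:10])  # reranked top-10 ids
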
On Neural Scaling Laws for Weather Emulation through Continual Training
Shashank Subramanian, Alexander Kiefer, Arnur Nigmetov, Amir Gholami, Dmitriy Morozov, Michael W. Mahoney
Time Series Theory Efficient ML
  • Adoption of a minimalist Swin Transformer architecture for weather forecasting.
  • Continual training with cooldown phases improves model performance and scaling behavior.
  • Identification of compute-optimal training regimes through IsoFLOP curves.
  • Demonstration that neural scaling laws can guide efficient resource allocation in scientific machine learning.
Missing-Aware Multimodal Fusion for Unified Microservice Incident Management
Wenzhuo Qian, Hailiang Zhao, Ziqi Wang, Zhipeng Gao, Jiayi Chen, Zhiwei Ling, Shuiguang Deng
Multimodal
  • ARMOR is designed to handle missing modalities in multimodal data for microservice incident management.
  • The framework utilizes a modality-specific asymmetric encoder to isolate distribution disparities among different data types.
  • A missing-aware gated fusion mechanism is employed to prevent cross-modal interference from incomplete inputs.
  • Self-supervised learning is leveraged to optimize anomaly detection, failure triage, and root cause localization without requiring extensive fault labels.
Transformers in the Dark: Navigating Unknown Search Spaces via Bandit Feedback
Jungtaek Kim, Thomas Zeng, Ziqian Lin, Minjae Lee, Chungpa Lee, Jy-yong Sohn, Hyung Il Koo, Kangwook Lee
Large Language Models Reinforcement Learning Theory
  • Introduction of a new framework for evaluating LLMs' search capabilities.
  • Transformers can theoretically represent and approximate distinct search strategies.
  • Current LLMs show limited search capabilities compared to traditional algorithms.
  • Targeted training can significantly improve LLM performance in search tasks.
Epistemic Compression: The Case for Deliberate Ignorance in High-Stakes AI
Steffen Lukas
Theory Efficient ML
  • High-capacity models often fail in high-stakes environments due to overfitting noise rather than capturing relevant signals.
  • Epistemic Compression promotes model robustness by aligning complexity with data stability, rather than simply increasing parameters.
  • The Regime Index effectively distinguishes between environments where simplicity or complexity is advantageous.
  • The study found a strong correlation between the Regime Index and the most effective modeling strategies in high-stakes domains.
Train at Moving Edge: Online-Verified Prompt Selection for Efficient RL Training of Large Reasoning Model
Jiahao Wu, Ning Lu, Shengcai Liu, Kun Wang, Yanting Yang, Li Qing, Ke Tang
Reinforcement Learning Large Language Models Efficient ML
  • The HIVE framework improves the efficiency of RL training for LLMs by selecting high-utility prompts.
  • The concept of 'learning edge' is introduced, highlighting the dynamic nature of prompt utility during training.
  • HIVE achieves up to 9.2 million fewer rollouts while maintaining or exceeding accuracy compared to existing methods.
  • The methodology combines historical data with real-time entropy measures to optimize prompt selection.
Optimal High-Probability Regret for Online Convex Optimization with Two-Point Bandit Feedback
Haishan Ye
Optimization Theory
  • Introduces the first high-probability regret bound for two-point feedback in OCO with strongly convex losses.
  • Achieves a regret bound of O(d(log T + log(1/δ))/µ), improving upon previous O(d²) dependencies.
  • Utilizes a novel analytical framework that departs from traditional reduction-based methods.
  • Matches the minimax optimal bounds for both time horizon and dimension.
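The two-point feedback model is standard and worth making concrete: each round the learner queries the loss at two symmetric perturbations and forms a gradient estimate from the difference. The estimator below is the classical construction; the paper's contribution is the high-probability regret analysis on top of it, not this sketch:

    import numpy as np

    def two_point_gradient(f, x, delta, rng):
        """Classical estimator: g = d/(2*delta) * (f(x+d*u) - f(x-d*u)) * u."""
        d = x.size
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)                  # uniform direction on the sphere
        return d * (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u

    # Toy loop on a fixed 1-strongly-convex loss.
    rng = np.random.default_rng(0)
    f = lambda x: 0.5 * np.sum((x - 1.0) ** 2)
    x = np.zeros(10)
    for t in range(1, 2001):
        g = two_point_gradient(f, x, delta=1e-3, rng=rng)
        x -= g / (t + 10)                       # damped 1/(mu*t)-style step
    print(f(x))                                 # near 0: x converged to the optimum
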
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale
Yicheng Zou, Dongsheng Zhu, Lin Zhu, Tong Zhu, Yunhua Zhou, Peiheng Zhou, Xinyu Zhou, Dongzhan Zhou, Zhiwang Zhou, Yuhao Zhou, Bowen Zhou, Zhanping Zhong, Zhijie Zhong, Haiteng Zhao, Penghao Zhao, Xiaomeng Zhao, Zhiyuan Zhao, Yechen Zhang, Jin Zhang, Wenwei Zhang, Hongjie Zhang, Zhuo Zhang, Wenlong Zhang, Bo Zhang, Chao Zhang, Chen Zhang, Yuhang Zang, Fei Yuan, Jiakang Yuan, Jiashuo Yu, Jinhui Yin, Haochen Ye, Qian Yao, Bowen Yang, Danni Yang, Kaichen Yang, Ziang Yan, Jun Xu, Yicheng Xu, Wanghan Xu, Xuenan Xu, Chao Xu, Ruiliang Xu, Shuhao Xing, Long Xing, Xinchen Xie, Ling-I Wu, Zijian Wu, Zhenyu Wu, Lijun Wu, Yue Wu, Jianyu Wu, Wen Wu, Fan Wu, Xilin Wei, Qi Wei, Bingli Wang, Rui Wang, Ziyi Wang, Zun Wang, Yi Wang, Haomin Wang, Yizhou Wang, Lintao Wang, Yiheng Wang, Longjiang Wang, Bin Wang, Jian Tong, Zhongbo Tian, Huanze Tang, Chen Tang, Shixiang Tang, Yu Sun, Qiushi Sun, Xuerui Su, Qisheng Su, Chenlin Su, Demin Song, Jin Shi, Fukai Shang, Yuchen Ren, Pengli Ren, Xiaoye Qu, Yuan Qu, Jiantao Qiu, Yu Qiao, Runyu Peng, Tianshuo Peng, Jiahui Peng, Qizhi Pei, Zhuoshi Pan, Linke Ouyang, Wenchang Ning, Yichuan Ma, Zerun Ma, Ningsheng Ma, Runyuan Ma, Chengqi Lyu, Haijun Lv, Han Lv
Multimodal Large Language Models Reinforcement Learning
  • Intern-S1-Pro is the first one-trillion-parameter scientific multimodal foundation model.
  • The model integrates advanced agent capabilities for autonomous scientific workflows.
  • It has been trained on over 100 specialized tasks across critical scientific fields.
  • A group routing mechanism is introduced to enhance training stability and efficiency.
Maximum Entropy Behavior Exploration for Sim2Real Zero-Shot Reinforcement Learning
Jiajun Hu, Nuria Armengol Urpi, Jin Cheng, Stelian Coros
Reinforcement Learning Robotics
  • Introduces FB-MEBE, an online zero-shot RL algorithm for quadrupedal robots.
  • Maximizes entropy of behavior distribution to enhance exploration diversity.
  • Integrates a regularization critic to ensure policies are physically plausible.
  • Demonstrates improved performance in simulated tasks compared to other strategies.
Offline Decision Transformers for Neural Combinatorial Optimization: Surpassing Heuristics on the Traveling Salesman Problem
Hironori Ohigashi, Shinichiro Hamada
Reinforcement Learning Optimization
  • Introduces a novel application of Decision Transformers for the Traveling Salesman Problem.
  • Integrates a Pointer Network to effectively handle variable action spaces in node selection.
  • Employs expectile regression for improved Return-to-Go predictions, enhancing solution quality.
  • Demonstrates that offline RL can surpass traditional heuristics in solution quality.
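Expectile regression for Return-to-Go prediction has a simple closed form: an asymmetric squared loss whose weight depends on the sign of the error. The loss below is the standard one (as used in offline RL methods like IQL); with tau > 0.5 it biases predictions toward the high end of observed returns:

    import torch

    def expectile_loss(pred, target, tau=0.9):
        """Asymmetric squared loss: |tau - 1{u < 0}| * u^2, with u = target - pred."""
        u = target - pred
        weight = torch.abs(tau - (u < 0).float())   # 0.9 above, 0.1 below
        return (weight * u ** 2).mean()

    pred = torch.zeros(5, requires_grad=True)
    target = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])
    expectile_loss(pred, target).backward()
    # With tau = 0.9 the minimizer sits above the mean of the targets,
    # nudging Return-to-Go predictions toward achievable high returns.
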
SIGMA: Structure-Invariant Generative Molecular Alignment for Chemical Language Models via Autoregressive Contrastive Learning
Xinyu Wang, Fei Dou, Jinbo Bi, Minghu Song
Generative Models Graph Learning Optimization
  • SIGMA addresses trajectory divergence in ChemLMs by enforcing latent isotropy through dense trajectory alignment.
  • The Structure-Invariant Contrastive Loss maximizes mutual information between equivalent generation paths, decoupling chemical semantics from syntactic variations.
  • IsoBeam dynamically prunes redundant search paths during inference, reallocating resources to explore structurally distinct molecular scaffolds.
  • Empirical results show that SIGMA outperforms strong baselines in terms of sample efficiency and structural diversity.
Amplified Patch-Level Differential Privacy for Free via Random Cropping
Kaan Durmaz, Jan Schuchardt, Sebastian Schmidt, Stephan Günnemann
Computer Vision Theory Efficient ML
  • Random cropping can amplify differential privacy in machine learning models without additional computational cost.
  • A new patch-level neighboring relation is introduced, allowing for a more tailored approach to privacy in vision data.
  • The study provides a theoretical framework for understanding the privacy amplification effects of random cropping.
  • Empirical results demonstrate improved privacy-utility trade-offs in segmentation tasks using standard architectures.
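The amplification intuition: a random crop includes any fixed patch only with probability q < 1, which is exactly the subsampling structure that privacy-amplification arguments exploit. A Monte Carlo check of that inclusion probability, under assumed image/crop/patch sizes rather than the paper's analysis:

    import numpy as np

    def patch_inclusion_prob(img=224, crop=112, patch=16, trials=100_000, seed=0):
        """Estimate P(a fixed patch lies fully inside a uniform random crop)."""
        rng = np.random.default_rng(seed)
        px, py = 100, 100                       # top-left corner of the patch
        x = rng.integers(0, img - crop + 1, size=trials)   # crop offsets
        y = rng.integers(0, img - crop + 1, size=trials)
        inside = ((x <= px) & (px + patch <= x + crop) &
                  (y <= py) & (py + patch <= y + crop))
        return inside.mean()

    q = patch_inclusion_prob()
    print(q)   # roughly 0.74 here: the patch is sometimes absent entirely,
               # which is the hook for subsampling-style privacy amplification
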
Social Hippocampus Memory Learning
Liping Yi, Zhiming Zhao, Qinghua Hu
Federated Learning
  • SoHip introduces a memory-centric approach to social machine learning, focusing on memory sharing for collaboration.
  • The framework preserves privacy by keeping raw data and local model parameters on-device.
  • Theoretical guarantees on convergence and privacy are provided, enhancing the framework's reliability.
  • Experimental results show SoHip outperforms existing heterogeneous federated learning methods by up to 8.78% in accuracy.
Causal-INSIGHT: Probing Temporal Models to Extract Causal Structure
Benjamin Redden, Hui Wang, Shuyan Li
Time Series Interpretability Graph Learning
  • Causal-INSIGHT provides a model-agnostic approach to interpret temporal predictors by analyzing their responses to input clamping.
  • The framework constructs directed temporal influence signals to reveal dependencies used by predictors for predictions.
  • Qbic, a new graph selection criterion, balances predictive accuracy and structural complexity without needing ground-truth labels.
  • Causal-INSIGHT shows competitive structural accuracy and improves temporal delay localization across diverse models.
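Input clamping as a probe is easy to demonstrate: hold one input series at a constant, re-run the predictor, and read the output shift as a directed influence signal. A generic model-agnostic sketch with a stand-in linear predictor; the Qbic graph-selection step is not shown:

    import numpy as np

    def clamp_influence(predict, X, clamp_value=0.0):
        """Influence of each input series on the prediction via clamping.

        predict: maps a (T, n_series) window to a scalar prediction.
        Returns |prediction change| when each series is clamped in turn.
        """
        base = predict(X)
        effects = []
        for j in range(X.shape[1]):
            Xc = X.copy()
            Xc[:, j] = clamp_value              # hold series j fixed
            effects.append(abs(predict(Xc) - base))
        return np.array(effects)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 3))
    predict = lambda W: 2.0 * W[-5:, 0].mean() + 0.1 * W[-5:, 2].mean()
    print(clamp_influence(predict, X))  # series 0 should dominate; series 1 is inert
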
Local learning for stable backpropagation-free neural network training towards physical learning
Yaqi Guo, Fabian Braun, Bastiaan Ketelaar, Stephanie Tan, Richard Norte, Siddhant Kumar
Optimization Efficient ML Theory
  • FFzero enables stable neural network training without backpropagation.
  • The framework combines local learning and directional-derivative optimization.
  • Demonstrated effectiveness across multilayer perceptrons and convolutional networks.
  • Provides a viable approach for in-situ physical learning using simulated photonic networks.
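Backpropagation-free training via directional derivatives can be sketched with the forward-gradient trick: sample a random direction v, estimate the directional derivative with one extra forward pass, and use that scalar times v as the update. This is the generic recipe, hedged as an illustration; FFzero's local, layer-wise objectives are not reproduced here:

    import numpy as np

    def forward_gradient_step(loss, theta, lr=0.01, eps=1e-5, rng=None):
        """One BP-free update: g_hat = ((f(theta + eps*v) - f(theta)) / eps) * v."""
        v = rng.normal(size=theta.shape)
        dirderiv = (loss(theta + eps * v) - loss(theta)) / eps
        return theta - lr * dirderiv * v        # unbiased estimate of -lr * grad

    rng = np.random.default_rng(0)
    loss = lambda th: np.sum((th - 3.0) ** 2)   # stand-in local objective
    theta = np.zeros(20)
    for _ in range(3000):
        theta = forward_gradient_step(loss, theta, rng=rng)
    print(loss(theta))                          # small: trained without backprop
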
Physics-Informed Neural Network Digital Twin for Dynamic Tray-Wise Modeling of Distillation Columns under Transient Operating Conditions
Debadutta Patra, Ayush Bardhan Tripathy, Soumya Ranjan Sahu, Sucheta Panda
Optimization Theory Time Series
  • Introduction of a PINN framework for modeling distillation columns under transient conditions.
  • Integration of thermodynamic constraints into the neural network's loss function for physical consistency.
  • Demonstrated superior performance compared to traditional data-driven models.
  • Development of a comprehensive transient dataset for training and evaluation.
Training LLMs for Multi-Step Tool Orchestration with Constrained Data Synthesis and Graduated Rewards
Cheng Jiayang, Xin Liu, Zhihan Zhang, Haoyang Wen, Zixuan Zhang, Qingyu Yin, Shiyang Li, Priyanka Nigam, Bing Yin, Chao Zhang, Yangqiu Song
Large Language Models Reinforcement Learning
  • Introduces a framework for training LLMs on multi-step tool orchestration using real API responses.
  • Develops a graduated reward system that enhances learning by providing feedback on partial correctness.
  • Demonstrates substantial improvements in model performance on ComplexFuncBench.
  • Confirms the necessity of both atomic validity and orchestration rewards through ablation studies.
GlowQ: Group-Shared LOw-Rank Approximation for Quantized LLMs
Selim An, Il hong Suh, Yeseong Kim
Large Language Models Efficient ML Optimization
  • GlowQ uses group-shared low-rank approximation to enhance quantized LLM efficiency.
  • The method reduces computational and memory overhead by caching shared factors for input-sharing groups.
  • GlowQ-S variant further optimizes performance by selectively applying corrections.
  • Empirical results show significant improvements in latency, throughput, and accuracy over strong baselines.
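The generic recipe behind low-rank quantization correction: quantize W, then absorb the residual W - Q(W) into a rank-r factorization via truncated SVD, so inference computes Q(W) @ x + U @ (V @ x) at little extra cost. GlowQ's group-shared caching of those factors is the paper's contribution and is not shown; this sketch only illustrates the base decomposition:

    import numpy as np

    def quantize_sym(W, bits=4):
        """Symmetric uniform quantization to a given bit width."""
        qmax = 2 ** (bits - 1) - 1
        scale = np.abs(W).max() / qmax
        return np.round(W / scale).clip(-qmax, qmax) * scale

    def lowrank_correction(W, Wq, rank=8):
        """Truncated SVD of the quantization residual: W - Wq ~= U @ V."""
        U, s, Vt = np.linalg.svd(W - Wq, full_matrices=False)
        return U[:, :rank] * s[:rank], Vt[:rank]    # (m, r) and (r, n)

    rng = np.random.default_rng(0)
    W = rng.normal(size=(256, 256)) / 16
    Wq = quantize_sym(W)
    U, V = lowrank_correction(W, Wq)
    print(np.linalg.norm(W - Wq), np.linalg.norm(W - Wq - U @ V))
    # The corrected weight Wq + U @ V is strictly closer to W (Eckart-Young);
    # applying it as Wq @ x + U @ (V @ x) adds only O((m + n) * r) work.
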
SEVerA: Verified Synthesis of Self-Evolving Agents
Debangshu Banerjee, Changming Xu, Gagandeep Singh
Large Language Models Generative Models Theory
  • Introduces a formal framework for synthesizing self-evolving agents with safety guarantees.
  • Combines hard formal specifications with soft performance objectives in agent synthesis.
  • Utilizes Formally Guarded Generative Models (FGGM) to ensure outputs meet specified contracts.
  • Achieves zero constraint violations across multiple evaluation tasks.
How Class Ontology and Data Scale Affect Audio Transfer Learning
Manuel Milling, Andreas Triantafyllopoulos, Alexander Gebhard, Simon Rampp, Björn W. Schuller
Audio & Speech
  • Transfer learning in audio tasks is significantly influenced by the similarity between pre-training and downstream tasks.
  • Increasing the number of samples and classes in pre-training data generally improves performance but is not as impactful as task similarity.
  • The study provides a set of pre-trained model states on various AudioSet subsets for further research.
  • Findings challenge the assumption that larger, more diverse datasets are always optimal for pre-training in audio tasks.