Abstract
This paper investigates the role of architectural inductive bias in enabling structural disentanglement within neural networks, focusing on bilinear multilayer perceptrons (MLPs). The authors argue that selective unlearning and long-horizon extrapolation failures in modern neural networks are rooted in how models structure their internal representations during training, rather than in the optimization algorithm alone. By leveraging bilinear parameterizations, which explicitly model multiplicative interactions, the authors show that these architectures possess a 'non-mixing' property under gradient flow, allowing functional components to evolve orthogonally and align with the underlying algebraic structure of tasks. Through analytical results and controlled experiments on modular arithmetic, cyclic reasoning, Lie group dynamics, and targeted unlearning benchmarks, the paper shows that bilinear MLPs outperform standard ReLU-based architectures in recovering true operators, improving model editability, and generalizing to compositional tasks. The findings suggest that architectural inductive bias plays a critical role in enabling reliable unlearning and extrapolation by fostering representational alignment with task-specific structure.
Methodology
The authors use bilinear MLPs, which compute elementwise products of learned linear projections of their inputs, to study structural disentanglement. Analytical proofs establish the 'non-mixing' property under gradient flow, while controlled experiments validate the hypothesis across tasks with compositional and algebraic structure, including modular arithmetic, cyclic reasoning, Lie group dynamics, and unlearning benchmarks. Comparisons are made against standard ReLU-based architectures to assess performance in unlearning and generalization.
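The paper does not give implementation details, but the bilinear layer it describes can be sketched as follows. This is a minimal illustration, assuming a single hidden layer with a linear readout; all weight matrices here are random stand-ins for trained parameters, and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def bilinear_mlp(x, W1, W2, W_out):
    """One bilinear MLP layer: the hidden activation is the elementwise
    product of two learned linear projections of the input (a multiplicative
    interaction), with no pointwise nonlinearity, followed by a linear readout."""
    h = (W1 @ x) * (W2 @ x)
    return W_out @ h

d_in, d_hidden, d_out = 8, 16, 4
W1 = rng.standard_normal((d_hidden, d_in))
W2 = rng.standard_normal((d_hidden, d_in))
W_out = rng.standard_normal((d_out, d_hidden))

x = rng.standard_normal(d_in)
y = bilinear_mlp(x, W1, W2, W_out)
```

Because the hidden layer is a product of two linear maps, the network is an exact quadratic form in its input (e.g. scaling the input by 2 scales the output by 4), which is what makes its interaction modes amenable to the kind of algebraic analysis the paper performs.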
Results
Bilinear MLPs outperform standard ReLU-based architectures in recovering true operators aligned with task-specific algebraic structures. They exhibit significantly higher selectivity in unlearning and better generalization to long-horizon compositional tasks. Analytical findings confirm that bilinear parameterizations preserve the independence of interaction modes, enabling precise model editing.
Implications
The findings highlight the importance of architectural inductive bias in neural network design, particularly for tasks governed by compositional or algebraic structures. Bilinear MLPs could be applied to improve model editability, selective unlearning, and generalization in domains such as physics (Lie group dynamics), modular arithmetic, and feature interaction modeling. This work also provides a foundation for developing architectures tailored to specific problem regimes, emphasizing representational alignment over post-hoc algorithmic solutions.