Gradient Routing

Can we control where learning happens in neural networks? Gradient routing addresses this by applying masks to limit the flow of gradients during backpropagation. By supplying different masks for different data points, the user can induce specialized subcomponents within a model. As part of SPAR, I worked on applying this method to AI safety research, focussing on addressing the problem of emergent misalignment during continual learning.

Image credit: Adapted from Cloud et al., 2024. CC-BY 4.0

Matthew A. Clarke
Matthew A. Clarke
Research Scientist, Associate Staff Visitor

Research Scientist at the AI Security Institute and Associate Staff Visitor at UCL.