Can we control where learning happens in neural networks? Gradient routing addresses this question by applying masks that limit the flow of gradients during backpropagation. By supplying different masks for different data points, the user can induce specialized subcomponents within a model. I worked on applying this method to AI safety research, focusing on the problem of emergent misalignment during continual learning.
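
To make the mechanism concrete, here is a minimal sketch of gradient routing as I understand it, in PyTorch. The forward pass is unchanged; a per-example mask blocks gradients from flowing through the masked-out units via `detach()`. The layer name, dimensions, and the way the per-example mask is chosen are all illustrative assumptions, not the actual implementation from my project.

```python
import torch
import torch.nn as nn

class GradientRoutedLinear(nn.Module):
    """Linear layer whose output units receive gradients only where the
    per-example mask is 1. Forward values are identical to a plain linear
    layer; backprop through masked-out units is blocked with detach()."""

    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)

    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        h = self.linear(x)
        # Same values either way; gradients flow only where mask == 1.
        return mask * h + (1.0 - mask) * h.detach()


# Illustrative usage: route flagged examples to the first half of the
# hidden units and all other examples to the second half.
d_in, d_hidden, batch = 16, 8, 4
layer = GradientRoutedLinear(d_in, d_hidden)
x = torch.randn(batch, d_in)
is_flagged = torch.tensor([1, 0, 0, 1])  # hypothetical per-example labels

first_half = torch.zeros(batch, d_hidden)
first_half[:, : d_hidden // 2] = 1.0
mask = torch.where(is_flagged.unsqueeze(1).bool(), first_half, 1.0 - first_half)

h = layer(x, mask)
loss = h.pow(2).mean()
loss.backward()  # each example only updates its routed subcomponent
```

Because the routing happens entirely in the backward pass, the model's predictions are unaffected at inference time; what changes is which parameters each kind of data is allowed to update during training.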