Narrow fine-tuning can cause language models to become broadly misaligned. Is this preventable? A satisfactory solution should not require comprehensive alignment data, because such data may be difficult or impossible to obtain. We formulate this problem in terms of spillover: the effect that training on one context has on a model’s predictions across a wide range of other contexts. We demonstrate spillover and show that it cannot be prevented simply by fine-tuning on limited alignment data. To address this challenge, we explore a variety of ways to control spillover using limited training data, including regularization, continual learning methods, and previously proposed methods for controlling misgeneralization. We find multiple strategies that improve generalization from limited alignment training data, including a novel method called Steering Weights. Our work is a step towards fine-grained control over how models generalize from their supervision, which may enable safer development of superintelligent AI systems.
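To make the notion of spillover concrete, the following is a minimal sketch, not the paper's own measurement, of one way it might be quantified: compare a base model's and a narrowly fine-tuned model's next-token distributions on probe contexts unrelated to the fine-tuning data. The checkpoint names and probe prompts are illustrative placeholders.

```python
# Illustrative sketch (assumed setup, not the paper's method): spillover as the
# KL divergence between a fine-tuned model's and its base model's next-token
# distributions on contexts far from the fine-tuning distribution.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

base_name = "gpt2"              # placeholder base checkpoint
tuned_name = "gpt2-finetuned"   # placeholder narrowly fine-tuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(base_name)
base = AutoModelForCausalLM.from_pretrained(base_name).eval()
tuned = AutoModelForCausalLM.from_pretrained(tuned_name).eval()

# Probe contexts unrelated to the narrow fine-tuning task
probe_contexts = [
    "The capital of France is",
    "To stay safe online, you should",
]

@torch.no_grad()
def spillover(context: str) -> float:
    """KL(tuned || base) over the next-token distribution for one context."""
    ids = tokenizer(context, return_tensors="pt").input_ids
    log_p_tuned = F.log_softmax(tuned(ids).logits[0, -1], dim=-1)
    log_p_base = F.log_softmax(base(ids).logits[0, -1], dim=-1)
    # F.kl_div(input, target, log_target=True) computes KL(target || input)
    return F.kl_div(log_p_base, log_p_tuned, reduction="sum",
                    log_target=True).item()

# Average divergence across probe contexts as a crude spillover score
print(sum(spillover(c) for c in probe_contexts) / len(probe_contexts))
```

Larger divergence on contexts unrelated to the fine-tuning data would indicate more spillover; the methods studied in the paper aim to keep this effect small without requiring comprehensive alignment data.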