✨ TL;DR
LPSR is an inference-time error correction method that monitors internal model activations to detect reasoning mistakes, then rolls back generation and steers the model using cached corrections—no training required. It improves an 8B model's math accuracy from 28.8% to 44.0%, outperforming prompted self-correction and even larger 70B models.
Large language models frequently make unrecoverable reasoning errors during text generation. Once a model takes a wrong reasoning step, subsequent tokens tend to compound the mistake rather than self-correct, leading to cascading failures in multi-step tasks like mathematical reasoning. Existing inference-time correction methods like prompted self-correction often perform worse than standard generation, and scaling approaches like best-of-N sampling require prohibitively high token budgets. The fundamental challenge is detecting when an error occurs mid-generation and intervening effectively without requiring model retraining or expensive additional forward passes.
Latent Phase-Shift Rollback (LPSR) operates at inference time by monitoring the residual stream at a critical layer during each generation step. A dual-gate mechanism combines a cosine-similarity metric with an entropy metric to detect abrupt directional reversals (phase shifts) in activation space that signal reasoning errors. When a phase shift is detected, LPSR rolls back the key-value cache to the step immediately preceding the error and injects a pre-computed steering vector to guide the model back toward correct reasoning. Steering vectors are computed offline and cached, so inference requires no fine-tuning, gradient computation, or additional forward passes. The method also includes a layer-sweep analysis to identify the optimal monitoring depth.
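The loop above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the thresholds (`cos_thresh`, `ent_thresh`), the steering scale `alpha`, and the function names are all assumptions chosen for clarity, and the KV cache is stood in for by a plain Python list.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two residual-stream vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def entropy(probs):
    # Shannon entropy of the next-token distribution (nats).
    p = probs[probs > 0]
    return float(-(p * np.log(p)).sum())

def phase_shift_detected(h_prev, h_curr, token_probs,
                         cos_thresh=0.0, ent_thresh=2.0):
    """Dual gate (illustrative thresholds): fire only when the residual-stream
    direction reverses (cosine below threshold) AND the model is also
    uncertain about the next token (entropy above threshold)."""
    return (cosine(h_prev, h_curr) < cos_thresh
            and entropy(token_probs) > ent_thresh)

def rollback_and_steer(kv_cache, hidden_states, error_step,
                       steering_vec, alpha=1.0):
    """Truncate the caches to the step before the error, then add the
    pre-computed (offline-cached) steering vector to the last surviving
    hidden state -- no extra forward pass is needed for the correction."""
    kv_cache = kv_cache[:error_step]
    hidden_states = hidden_states[:error_step]
    hidden_states[-1] = hidden_states[-1] + alpha * steering_vec
    return kv_cache, hidden_states
```

Requiring both gates to fire is what keeps false positives down: a direction change alone can be a benign topic transition, and high entropy alone can just be a hard token; the conjunction targets steps where the activation trajectory reverses while the model is simultaneously uncertain.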