Closed-Form Logit Steering

COLLINS WESTNEDGE
OCT 8, 2025


Goal

The goal is to derive, in closed form, the smallest (minimum Euclidean norm) perturbation to an input x that moves a logistic regression model's predicted probability to any target p ∈ (0,1).


Identities & Intuition

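Model components

The model scores an input with \(z = w^T x + b\) and outputs the probability \(\sigma(z) = 1/(1+e^{-z})\). The logit is the inverse of the sigmoid, \(\operatorname{logit}(p) = \log\frac{p}{1-p}\), so \(\sigma(z) = p\) exactly when \(z = \operatorname{logit}(p)\).

Geometry

All inputs with the same predicted probability \(p\) lie on the hyperplane \(w^T x + b = \operatorname{logit}(p)\). The weight vector \(w\) is normal to every such hyperplane, so the shortest path from \(x\) to the target hyperplane runs along \(w\).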

Approach

Move \(x\) along \(w\) by a scalar step \(\lambda\):

\[ x' = x + \lambda w, \]

where \(\lambda\) is chosen so that the new point lands exactly on the target logit:

\[ \operatorname{logit}(p) = w^T x' + b. \]
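The restriction to the direction \(w\) is what makes the perturbation minimal. The short argument below is a sketch added for completeness; the symbols \(\delta\) (an arbitrary perturbation) and \(c\) (the required change in score) are introduced here and do not appear in the original. Any \(x' = x + \delta\) that reaches the target must satisfy \(w^T \delta = c\), and the Cauchy-Schwarz inequality then bounds its length from below:

\[ \begin{aligned} c &:= \operatorname{logit}(p) - (w^T x + b), \\ |c| &= |w^T \delta| \le \|w\|\,\|\delta\| \quad\Longrightarrow\quad \|\delta\| \ge \frac{|c|}{\|w\|}. \end{aligned} \]

Equality holds only when \(\delta\) is a scalar multiple of \(w\), so among all perturbations that reach the target, the shortest has the form \(\delta = \lambda w\) (assuming \(w \neq 0\)).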


Derivation

  1. Set up the constraint: Since the model’s score is \(z = w^T x' + b\) and we want probability \(p\), we require \(w^T x' + b = \operatorname{logit}(p)\).

  2. Plug in \(x'\) (use \(w^T w=\|w\|^2\)):

\[ \begin{aligned} \operatorname{logit}(p) &= w^T(x+\lambda w)+b \\ &= w^T x + \lambda\, w^T w + b \\ &= w^T x + \lambda\|w\|^{2} + b. \end{aligned} \]

  3. Solve for \(\lambda\):

\[ \lambda = \frac{\operatorname{logit}(p) - (w^T x + b)}{\|w\|^{2}}. \]

  4. Substitute \(\lambda\) into \(x' = x + \lambda w\):

\[ x' = x + \frac{\operatorname{logit}(p) - (w^T x + b)}{\|w\|^{2}}\,w. \]
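As a quick sanity check, the steps above can be traced with small numbers. The values here are illustrative assumptions, not taken from the original: \(w = (3, 4)^T\), \(b = -1\), \(x = (1, 1)^T\), and target \(p = 0.5\), so \(\operatorname{logit}(p) = 0\).

\[ \begin{aligned} w^T x + b &= 3 + 4 - 1 = 6, \qquad \|w\|^{2} = 9 + 16 = 25, \\ \lambda &= \frac{0 - 6}{25} = -0.24, \\ x' &= (1, 1)^T - 0.24\,(3, 4)^T = (0.28,\ 0.04)^T. \end{aligned} \]

Checking the result: \(w^T x' + b = 0.84 + 0.16 - 1 = 0 = \operatorname{logit}(0.5)\), and the perturbation has length \(|\lambda|\,\|w\| = 0.24 \cdot 5 = 1.2\).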


Final Formula

\[ \boxed{ x' = x + \frac{\operatorname{logit}(p) - (w^T x + b)}{\|w\|^{2}}\,w } \]

With this step, the model assigns \(x'\) exactly the target probability \(p\). The formula requires \(w \neq 0\).
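The formula translates almost directly into code. The following is a minimal NumPy sketch, not the author's implementation: the function names `logit`, `sigmoid`, and `steer_to_probability` are hypothetical, and the sample values of `w`, `b`, and `x` are made up for illustration.

```python
import numpy as np


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return np.log(p / (1.0 - p))


def sigmoid(z):
    """Logistic function, the inverse of logit."""
    return 1.0 / (1.0 + np.exp(-z))


def steer_to_probability(x, w, b, p):
    """Return x' = x + lambda * w such that sigmoid(w @ x' + b) == p.

    Hypothetical helper: x and w are 1-D arrays of equal length,
    b is a scalar bias, and p is the target probability in (0, 1).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    norm_sq = w @ w  # ||w||^2
    if norm_sq == 0.0:
        raise ValueError("w must be nonzero")
    # lambda = (logit(p) - (w^T x + b)) / ||w||^2
    lam = (logit(p) - (w @ x + b)) / norm_sq
    return x + lam * w


# Illustrative values (assumptions, not from the original article).
w = np.array([3.0, 4.0])
b = -1.0
x = np.array([1.0, 1.0])

x_new = steer_to_probability(x, w, b, p=0.9)
print(sigmoid(w @ x_new + b))  # approximately 0.9, up to floating-point error
```

The function validates `p` and `w` up front because the formula is undefined at \(p \in \{0, 1\}\) (infinite logit) and at \(w = 0\) (division by zero).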


Interactive Demo

Interactive 3D Visualization

Open interactive visualization →


Applications