The Stagflation Trap: Optimal Monetary Policy as an HJB Problem
Setting the Scene
Stagflation — high inflation coexisting with weak growth — confronts a central bank with a choice that has no clean resolution. Raise rates to fight inflation and you depress output further. Ease policy to support the economy and inflation runs hotter. This is not a failure of will or competence. It is a structural feature of the underlying control problem.
This post builds a minimal mathematical model of that trap. The framework is the **Hamilton–Jacobi–Bellman (HJB) equation** — the same tool used to price options, manage inventory, and solve rocket guidance problems. The economy here is a two-dimensional stochastic system; the central bank is an optimal controller; and the trap emerges cleanly from the geometry of the solution.
The Economy in Two Variables
We track two numbers at each moment in time: **inflation** $\pi$ and the **output gap** $y$ (how far GDP sits above its potential, so $y < 0$ means slack). Their dynamics follow two coupled stochastic differential equations.
The first equation is a **Phillips curve** — inflation rises when the economy runs hot, and is pushed higher by any persistent supply shock $s \geq 0$:
$$d\pi = (\kappa\,y + s)\,dt + \sigma\,dW_1 \tag{1}$$
The second equation is an **IS curve** — the output gap is compressed by tighter policy (a higher rate $u$) and is buffeted by aggregate demand shocks:
$$dy = -\phi\,u\,dt + \sigma\,dW_2 \tag{2}$$
Here $\kappa = 0.5$ is the Phillips slope, $\phi = 0.8$ is the policy transmission coefficient, and $\sigma = 0.02$ is the noise level on both equations. The terms $dW_1$ and $dW_2$ are independent Brownian increments — the model’s representation of random shocks.
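To make the dynamics concrete, here is a minimal Euler-Maruyama sketch in Python. It assumes the two equations take the form $d\pi = (\kappa y + s)\,dt + \sigma\,dW_1$ and $dy = -\phi u\,dt + \sigma\,dW_2$, which is what the prose describes. The function name `simulate_open_loop`, the step size, the horizon, and the constant-rate policy are illustrative choices, not part of the model.

```python
import numpy as np

KAPPA, PHI, SIGMA = 0.5, 0.8, 0.02  # parameter values quoted in the text


def simulate_open_loop(s, u=0.0, x0=(0.0, 0.0), dt=0.01, n_steps=1000,
                       sigma=SIGMA, seed=0):
    """Euler-Maruyama path of (pi, y) under a constant policy rate u.

    Assumed dynamics (a reading of equations (1)-(2)):
        d pi = (kappa * y + s) dt + sigma dW1
        d y  = -phi * u dt        + sigma dW2
    """
    rng = np.random.default_rng(seed)
    path = np.empty((n_steps + 1, 2))
    path[0] = x0
    for k in range(n_steps):
        pi, y = path[k]
        drift = np.array([KAPPA * y + s, -PHI * u])
        # Two independent Brownian increments, each with variance dt.
        shock = sigma * np.sqrt(dt) * rng.standard_normal(2)
        path[k + 1] = path[k] + drift * dt + shock
    return path


if __name__ == "__main__":
    calm = simulate_open_loop(s=0.0)
    shocked = simulate_open_loop(s=0.04)
    print("final (pi, y) with s = 0.00:", calm[-1])
    print("final (pi, y) with s = 0.04:", shocked[-1])
```

With the rate held fixed, the supply shock simply adds a steady upward drift to inflation, which is the behaviour the controller later has to lean against.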
The central bank minimises the expected discounted quadratic loss:
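$$J = \mathbb{E}\left[\int_0^\infty e^{-\rho t}\left(2\pi^2 + y^2 + r\,u^2\right)dt\right] \tag{3}$$
with discount rate $\rho = 0.05$ and a weight $r > 0$ on rate moves. The three terms penalise inflation, output deviations, and costly rate moves respectively. Inflation carries twice the weight of output, a rough approximation of a price-stability mandate.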
The Value Function and the HJB Equation
Rather than optimise over the full path of rates, the HJB approach collapses the problem to a single function of the current state. Define $V(\pi, y)$ as the **minimum achievable loss** starting from the state $(\pi, y)$, under the best possible policy from that point forward. This is the value function.
Geometrically, $V$ is a surface over the $(\pi, y)$ plane. Points where $V$ is low are good starting positions: the economy is already close to where the bank wants it, and the remaining cost is small. Points where $V$ is high are bad: inflation is elevated, or output is deeply depressed, and a costly correction lies ahead.
The value function satisfies the HJB equation, which encodes the trade-off between current loss and future cost:
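$$\rho V = \min_{u}\Big\{2\pi^2 + y^2 + r\,u^2 + (\kappa\,y + s)\,V_\pi - \phi\,u\,V_y + \tfrac{\sigma^2}{2}\left(V_{\pi\pi} + V_{yy}\right)\Big\} \tag{4}$$
Subscripts on $V$ denote partial derivatives, and $r > 0$ is the weight on $u^2$ in the loss (3).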
Minimising over $u$ is a pointwise calculus problem. Setting the derivative with respect to $u$ to zero gives the optimal rate in feedback form:
$$u^* = \frac{\phi}{2r}\,V_y \tag{5}$$
The optimal rate at any moment is proportional to how steeply the value function slopes in the $y$ direction. Intuitively: if tightening policy today reduces the output gap and that reduction saves future cost, do it.
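Spelled out, and assuming the loss charges rate moves through a term $r u^2$ with $r > 0$ while the IS curve contributes the drift $-\phi u$ (the symbol $r$ is a label used here for illustration), only two terms inside the minimisation depend on $u$:

$$\frac{\partial}{\partial u}\left[\,r\,u^2 - \phi\,u\,V_y\,\right] = 2\,r\,u - \phi\,V_y = 0 .$$

Solving for $u$ gives the feedback rule. Substituting it back, the two terms collapse to

$$r\,(u^*)^2 - \phi\,u^*\,V_y = -\frac{\phi^2}{4r}\,V_y^2 ,$$

a quadratic penalty in the slope $V_y$ that makes the minimised equation nonlinear in $V$.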
Solving with a Quadratic Ansatz
Because the dynamics (1)–(2) are **linear** and the loss (3) is **quadratic**, the value function is exactly quadratic in the state — a paraboloid. We write:
$$V(x) = x^\top P\,x + 2\,p^\top x + v_0, \qquad x = (\pi, y)^\top \tag{6}$$
where $P$ is a $2 \times 2$ positive-definite matrix (the curvature of the bowl), $p \in \mathbb{R}^2$ is a vector that tilts and shifts the bowl in response to the supply shock, and $v_0$ absorbs the noise contribution $\sigma^2 \operatorname{tr}(P)/\rho$.
Substituting into equation (4) and matching terms order by order:
**Quadratic terms** yield the **Algebraic Riccati Equation (ARE)**:
$$A^\top P + PA - PBR^{-1}B^\top P + Q = \rho P$$
with matrices $A$, $B$, $Q$, $R$ read off from the model. This is a standard equation solved numerically.
**Linear terms** yield an affine equation for $p$:
$$\left(\rho I - A_{\mathrm{cl}}^\top\right) p = P\,c, \qquad c = (s, 0)^\top \tag{7}$$
where $A_{\mathrm{cl}} = A - BR^{-1}B^\top P$ is the closed-loop dynamics matrix. When $s = 0$ this gives $p = 0$, so there is no tilt. When $s > 0$, the vector $p$ tilts the bowl away from the origin.
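Both matching conditions can be solved in a few lines of NumPy. The sketch below is one possible route, not necessarily the one used for the results in this post. It assumes the state is $x = (\pi, y)^\top$ with $A$, $B$, $Q$ read off as shown in the comments, an affine equation of the form $(\rho I - A_{\mathrm{cl}}^\top)\,p = Pc$ with $c = (s, 0)^\top$, and a hypothetical rate-move weight `R_WEIGHT = 1.0`, since the text does not state one. The discount term is absorbed by shifting $A$ to $A - \tfrac{\rho}{2} I$, after which the Riccati equation is solved through the stable invariant subspace of the Hamiltonian matrix.

```python
import numpy as np

KAPPA, PHI, RHO = 0.5, 0.8, 0.05   # values quoted in the text
R_WEIGHT = 1.0                     # ASSUMPTION: weight on u^2, not given in the text

A = np.array([[0.0, KAPPA], [0.0, 0.0]])   # inflation drifts with kappa * y
B = np.array([[0.0], [-PHI]])              # a higher rate u lowers y
Q = np.diag([2.0, 1.0])                    # inflation weighted twice as much as output
R = np.array([[R_WEIGHT]])


def solve_discounted_are(A, B, Q, R, rho):
    """Solve A'P + PA - P B R^-1 B' P + Q = rho P via the Hamiltonian matrix."""
    n = A.shape[0]
    A_shift = A - 0.5 * rho * np.eye(n)      # folds the rho * P term into the drift
    S = B @ np.linalg.solve(R, B.T)
    H = np.block([[A_shift, -S], [-Q, -A_shift.T]])
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0]    # eigenvectors with Re(lambda) < 0
    X1, X2 = stable[:n], stable[n:]
    P = np.real(X2 @ np.linalg.inv(X1))
    return 0.5 * (P + P.T)                   # symmetrise away round-off


def solve_affine_term(P, s, rho=RHO):
    """Solve (rho * I - A_cl') p = P c with c = (s, 0)'."""
    c = np.array([s, 0.0])
    A_cl = A - B @ np.linalg.solve(R, B.T @ P)
    return np.linalg.solve(rho * np.eye(2) - A_cl.T, P @ c)


P = solve_discounted_are(A, B, Q, R, RHO)
K = np.linalg.solve(R, B.T @ P)              # feedback gain, shape (1, 2)
print("P =\n", P)
print("K =", K)
print("p(s = 0.04) =", solve_affine_term(P, 0.04))
```

If SciPy is available, `scipy.linalg.solve_continuous_are` applied to the same shifted matrix should serve as a cross-check on `P`.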
The Optimal Policy and the Trap
With $P$ and $p$ in hand, the optimal policy decomposes into two parts:
$$u^*(x) = -Kx - R^{-1}B^\top p, \qquad K = R^{-1}B^\top P \tag{8}$$
The **feedback** term $-Kx$ drives the economy back toward the origin — this part is the same whether or not a supply shock is present. The **feedforward** term $-R^{-1}B^\top p$ is a permanent premium the bank pays to counteract the shock. It does not vanish as time passes; as long as $s > 0$, the bank must sustain a tighter stance just to resist inflationary drift.
The trap is now visible. The optimal steady state $x^*$, where the controlled system settles on average, satisfies $A_{\mathrm{cl}}\,x^* + Bu_{\mathrm{ff}} + c = 0$, where $u_{\mathrm{ff}} = -R^{-1}B^\top p$ is the feedforward term and $c = (s, 0)^\top$ is the constant drift from the supply shock. When $s > 0$, this pushes $x^*$ to **positive inflation and negative output simultaneously**. No choice of feedback strength $K$ can move $x^*$ back to the origin. The best the bank can do is minimise the weighted distance from it.
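To see why feedback cannot help, it is enough to look at the inflation row of the steady-state condition. Assuming the Phillips curve has drift $\kappa y + s$ and the policy rate enters only the output equation (so the first entry of $B$ is zero), that row reads

$$\kappa\,y^* + s = 0 \quad\Longrightarrow\quad y^* = -\frac{s}{\kappa},$$

which contains neither $K$ nor the feedforward term. With $\kappa = 0.5$ and $s = 0.04$ this gives $y^* = -0.08$: inflation can only stop drifting if output sits below potential by that amount, whatever rule the bank follows.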

The Trap in Numbers
With the parameters above, numerical solution of the ARE gives the feedback gain $K$. The affine correction $p(s)$ is linear in $s$, so the feedforward premium and the displacement of $x^*$ both scale with the shock magnitude. For $s = 0.04$, the optimal equilibrium sits at strictly positive inflation and strictly negative output — the classic stagflation configuration.

The phase portrait makes one thing plain: the bank is not failing. Both sets of trajectories converge; the optimal policy is working in both cases. The problem is that under a persistent supply shock, the best achievable outcome is not the ideal one.
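The convergence claim can be checked with a short script. This is a sketch under stated assumptions rather than the code behind the figures: it takes the dynamics as $d\pi = (\kappa y + s)\,dt$ and $dy = -\phi u\,dt$ with the noise switched off, loss weights of 2 and 1 on inflation and output, and a hypothetical rate-move weight `R_WEIGHT = 1.0` that the text does not specify. The helper names `lq_solution` and `mean_path` are illustrative.

```python
import numpy as np

KAPPA, PHI, RHO = 0.5, 0.8, 0.05   # values quoted in the text
R_WEIGHT = 1.0                     # ASSUMPTION: weight on u^2, not given in the text

A = np.array([[0.0, KAPPA], [0.0, 0.0]])
B = np.array([[0.0], [-PHI]])
Q = np.diag([2.0, 1.0])
R = np.array([[R_WEIGHT]])


def lq_solution(s):
    """Return (K, u_ff, x_star) for a constant supply shock s."""
    n = 2
    A_shift = A - 0.5 * RHO * np.eye(n)          # absorbs the discount term
    S = B @ np.linalg.solve(R, B.T)
    H = np.block([[A_shift, -S], [-Q, -A_shift.T]])
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0]
    P = np.real(stable[n:] @ np.linalg.inv(stable[:n]))
    P = 0.5 * (P + P.T)
    K = np.linalg.solve(R, B.T @ P)              # feedback gain, shape (1, 2)
    A_cl = A - B @ K
    c = np.array([s, 0.0])
    p = np.linalg.solve(RHO * np.eye(n) - A_cl.T, P @ c)
    u_ff = -np.linalg.solve(R, B.T @ p)          # feedforward term, shape (1,)
    x_star = np.linalg.solve(A_cl, -(B @ u_ff + c))
    return K, u_ff, x_star


def mean_path(s, x0, dt=0.01, t_end=40.0):
    """Forward-Euler endpoint of the noise-free closed loop from x0."""
    K, u_ff, _ = lq_solution(s)
    c = np.array([s, 0.0])
    x = np.array(x0, dtype=float)
    for _ in range(int(t_end / dt)):
        u = -K @ x + u_ff                        # feedback plus feedforward
        x = x + dt * (A @ x + B @ u + c)
    return x


for s in (0.0, 0.04):
    _, _, x_star = lq_solution(s)
    x_end = mean_path(s, x0=(0.03, 0.02))
    print(f"s = {s:.2f}: x* = {x_star}, path endpoint = {x_end}")
```

In both cases the path ends at the computed $x^*$; only the location of $x^*$ differs between the two runs.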
Takeaway
The stagflation trap is not a puzzle about preferences, credibility, or central bank competence. It is a mathematical consequence of a linear economy driven by a persistent cost-push shock: the optimal policy cannot simultaneously zero inflation and the output gap. The HJB framework makes this exact — the shock enters through the affine correction $p(s)$, shifts the cost bowl, and displaces the optimal steady state in a way that no feedback rule can undo. The permanent welfare cost scales with $s^2$. Whether it can be reduced through alternative loss targets, commitment devices, or coordinated fiscal policy is the question that follows naturally from this model.