The State, Not the Score
Why output format is a design decision and why the wrong one costs you the window.
More precision isn’t always better. At a certain resolution, a political risk score stops producing decisions and starts producing discussions.
The refresh cycle problem
Most political risk scores are refreshed quarterly. Some annually. A few monthly, if the provider is unusually active.
That means the number on your dashboard today was built on inputs collected 60, 90, or sometimes 120 days ago. The analysts who produced it were working from a political environment that predates the events now moving your portfolio. The model was already closed when the escalation started.
This is not a criticism of the analysts or the methodology. It is a description of the temporal mismatch between how conventional political risk scores are produced and how political escalation now moves. Quarterly refresh cycles were designed for a political environment in which risk shifted gradually, with the time between a political development and a market consequence measured in weeks. That environment assumed escalation was slow enough that a periodic snapshot remained useful between updates.
It isn’t anymore.
Liberation Day didn’t wait for a quarterly refresh. The 45-day buildup in Federal Register activity, executive-order velocity, and congressional signaling occurred within a single quarter. SVB’s escalation sequence compressed into weeks. January 6th compressed into 90 minutes. These events didn’t slow down to fit the update schedule of the tools designed to measure them.
A quarterly political risk score and a real-time escalation detection system are not competing approaches to the same problem. They are answering different questions entirely.
The score answers: What does our modeled political risk environment look like as of last quarter?
The detection system answers: What is escalation doing right now, and what state is it in?
Comparing them as alternative approaches to political risk measurement is like comparing a quarterly earnings report to a live trading feed and asking which gives you better intraday positioning. One is a periodic snapshot. The other is a continuous detection system. The snapshot has its uses. Early warning is not one of them.
The interpretation problem
Even if a score were current, it would still face a second structural limitation.
A continuous score requires a threshold decision before it produces action. Someone in the organization has to decide what the number means. Is 67 out of 100 a number you act on? What about 71? What changed between Tuesday, when it was 64, and Thursday, when it's 71, and does that change warrant a call to the risk desk or a slide in next week's deck?
That interpretation step lives downstream. It lives in the organization, under pressure, in real time, with incomplete context and competing priorities. In a slow-moving political environment, that’s fine. You have time to convene, discuss, and decide. The interpretation latency is a feature. It introduces human judgment before consequential action.
In a compressed escalation environment, that same latency is a liability. It’s the window closing while the meeting is being scheduled.
A state doesn’t ask the interpretation question. It answers it. The threshold decision has already been made upstream, in the architecture, before the event arrives. The output isn’t a number to interpret. It’s a characterization of where an escalation event is in its lifecycle and of the class of decision now in front of you. The PM, the risk committee, and the CEO receive a decision context rather than a data point.
Liberation Day
Consider what forty-five days of measurable political buildup looked like through a score-based lens.
Federal Register activity was increasing. Executive order velocity was accelerating. Congressional signaling was intensifying. The trajectory was visible in real time. The political risk scores in place on April 2nd were based on inputs from Q1 at best and Q4 2024 at worst. The tariff escalation had been building throughout the quarter, and the score was supposed to capture it. The model was closed before the signal appeared.
At what score do you act? If your political risk model showed US trade escalation risk at 58 on March 15th, what did you do? What about 64 on March 28th? 71 on April 1st? Each of those numbers required a human to decide what it meant, whether it had crossed their internal threshold, and whether that threshold warranted action today or could wait for more clarity.
More than five trillion dollars in S&P 500 market cap didn’t disappear because the information wasn’t there. It disappeared because the tools most institutions rely on weren’t designed to detect it as it formed. The scores that did capture some of the buildup delivered numbers that required interpretation rather than states that required decisions. By the time the interpretation converged on “this is serious,” the event had priced.
A Critical state entry doesn’t produce that sequence. It tells you the escalation is in motion, the compression is building, and the decision class in front of you is position adjustment, not continued monitoring. The interpretation happened upstream. You received a state.
The objection
The natural pushback: isn’t five states just a coarser score? Isn’t this argument just a preference for lower resolution?
No.
A score tells you the magnitude. It shows how elevated the risk is along a continuous dimension. As we examined when looking at the consensus gap, continuous scoring captures the level of escalation but not its trajectory. It says nothing about the character of what’s happening: whether the escalation is accelerating or plateauing, whether it’s in early formation or approaching a peak, or whether the next likely move is further compression or resolution.
A state indicates where an escalation event is in its lifecycle. Elevated means early convergence is forming, and readiness is warranted. Critical means the dual trigger has fired, compression is active, and the decision window is open. Resolving means the peak has passed, and the volatility premium is decaying. Anticipatory means a calendared event is approaching, with convergence building, and pre-positioning is the relevant decision class.
A score of 71 accelerating toward Critical and a score of 71 decelerating toward Resolving are entirely different situations. The number doesn’t tell you which. The state does.
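The distinction can be made concrete with a minimal sketch. The state names, thresholds, and the (level, trajectory) rule below are illustrative assumptions, not the actual parameters of any production system:

```python
from enum import Enum

class State(Enum):
    MONITORING = "Monitoring"
    ELEVATED = "Elevated"
    CRITICAL = "Critical"
    RESOLVING = "Resolving"

def classify(score: float, velocity: float) -> State:
    """Map a (level, trajectory) pair to a lifecycle state.

    The thresholds (65, +/-1.5 points per day) are hypothetical
    placeholders; the point is that they are set upstream, in the
    architecture, before the event arrives.
    """
    if score >= 65 and velocity > 1.5:    # high and accelerating
        return State.CRITICAL
    if score >= 65 and velocity < -1.5:   # high but decaying
        return State.RESOLVING
    if velocity > 1.5:                    # early convergence forming
        return State.ELEVATED
    return State.MONITORING

# The same score of 71 yields entirely different states
# depending on trajectory:
assert classify(71, +2.3) is State.CRITICAL   # accelerating: act
assert classify(71, -2.3) is State.RESOLVING  # decaying: premium fading
```

The number alone is ambiguous; the pair is not. That is the whole argument in four branches.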
SVB
SVB makes a related point from a different angle.
In the months leading up to the March 2023 collapse, interest rate risk exposure was evident in public filings. Congressional oversight activity was intensifying in the wake of Silvergate’s failure days earlier. The regulatory environment was under pressure. These signals were present across independent domains. What was missing wasn’t the information. What was missing was a system that aggregates those domains into a coherent escalation picture and characterizes the situation's lifecycle stage.
A score running against any one of those signals displayed a number, one that meant different things to different analysts until the collapse made the interpretation obvious to everyone at once.
That simultaneous recognition is the final rung of the escalation ladder. The moment of institutional consensus. It is also, by definition, the moment after the window closed.
A state machine aggregating those signals would have characterized the cross-domain convergence as it built, entering Elevated as the independent pressure signals aligned, and Critical as velocity crossed the dual trigger in the days before the collapse. The desk that received those state transitions had a decision context at each stage. The desk reading a score had a number.
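One way to picture the aggregation described above is a sketch that counts independent domains under pressure and fires on convergence plus velocity. The domain names, the thresholds, and the specific trigger rule here are assumptions for illustration, not the actual dual-trigger logic:

```python
from dataclasses import dataclass

@dataclass
class DomainSignal:
    name: str        # e.g. "filings", "oversight", "regulatory"
    pressure: float  # normalized 0..1 pressure level in the domain
    velocity: float  # change per day in that level

def escalation_state(signals: list[DomainSignal]) -> str:
    """Characterize cross-domain convergence as it builds.

    Hypothetical rule: Elevated when at least two independent
    domains show pressure; Critical when convergence spans three
    or more domains AND aggregate velocity crosses a threshold.
    """
    aligned = [s for s in signals if s.pressure > 0.6]
    total_velocity = sum(s.velocity for s in aligned)
    if len(aligned) >= 3 and total_velocity > 0.3:
        return "Critical"    # compression active, window open
    if len(aligned) >= 2:
        return "Elevated"    # early convergence, readiness warranted
    return "Monitoring"
```

In March 2023 terms, filings exposure, oversight intensity, and regulatory pressure aligning with rising velocity would walk a function like this from Monitoring through Elevated into Critical. A score running over any single signal never sees the convergence at all.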
The RAG problem
This isn’t only a trading desk issue.
Every organization that briefs leadership on political risk delivers a score or a color. Red. Amber. Green. The RAG system is ubiquitous in enterprise risk management, from boardrooms and risk committees to legal counsel and CISO briefings. It shares the same two structural flaws as a continuous numeric score: it requires human interpretation before producing action, and it is typically updated on the same quarterly or annual cycle as the scores it’s derived from.
What does amber mean this week versus last week? Is this a new amber or a deepening amber? Has the situation deteriorated or just failed to improve? Is the right response to increase monitoring, brief leadership, or act? The color doesn’t answer those questions. It opens them. And if the color was set three months ago, it may not be describing the current situation at all.
The output format is a design decision. The refresh cycle is a design decision. Most organizations haven’t treated either one as such.
Why this matters beyond trading desks
The PM who receives a Critical state alert at 11 PM on a Friday is getting the same structural upgrade as the CEO who receives a briefing that says, “We have entered a Critical escalation state and here is what that means for the decisions in front of us this week.” Interpretation done upstream. Decision context delivered downstream. Time is preserved for the decision itself rather than the debate about what the information means.
A risk committee briefed on a state transition is equipped to make a decision. A risk committee briefed on a quarterly score must first determine whether the score still reflects reality, then decide what it means, and then decide what to do about it. In a compressed political environment, those extra steps are often the window.
The state doesn’t eliminate judgment. It preserves it for the decision that matters, rather than consuming it on the interpretation that preceded it.