Mind the Shift: Decoding Monetary Policy Stance from FOMC Statements with Large Language Models

LLM-native, label-free

Frozen LLM representations + no human annotations. Self-supervised from the temporal order of FOMC meetings.

Mind the shift

Markets react to stance changes, not absolutes. DCS models inter-meeting shifts, which prior methods miss.

Beats every baseline

Tops supervised and LLM-judge methods. Spearman 0.62 with CPI. Significant β across 2Y, 10Y, 20Y yields.

Why Relative Shift Matters

A moderately hawkish statement still counts as a dovish shift if the previous statement was more hawkish. Markets price the change between meetings, not the level. DCS is the first method to enforce this, with no labels needed.
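As a minimal sketch of the level-versus-shift distinction, the snippet below takes a chronological list of per-meeting stance scores and labels each inter-meeting change. The scores, the `shift_labels` helper, and the `tol` tolerance are hypothetical and chosen only for illustration; this is not the DCS scoring procedure itself.

```python
def shift_labels(scores, tol=1e-9):
    """Label each inter-meeting change in a chronological list of stance scores.

    Higher score = more hawkish. Returns one label per consecutive pair,
    so a list of n scores yields n - 1 labels.
    """
    labels = []
    for prev, curr in zip(scores, scores[1:]):
        delta = curr - prev
        if delta > tol:
            labels.append("hawkish shift")
        elif delta < -tol:
            labels.append("dovish shift")
        else:
            labels.append("no change")
    return labels


# Hypothetical stance levels for three consecutive meetings.
# All three are positive (hawkish in level terms), yet the first
# transition is a move in the dovish direction.
levels = [0.8, 0.5, 0.6]
print(shift_labels(levels))  # ['dovish shift', 'hawkish shift']
```

The second meeting scores 0.5, which is still hawkish as a level, but relative to 0.8 it is a dovish shift.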

Empirical Evidence Leaderboard

DCS leads on inflation correlation and yield regression across models from 1B to 14B parameters. Rows are ranked by the average of the metrics shown.

Inflation correlation

Correlation between meeting-level stance scores and year-over-year (YoY) changes in CPI and PPI.
#	Model	Method	CPI YoY (Pearson)	CPI YoY (Spearman)	PPI YoY (Pearson)	PPI YoY (Spearman)
1	DeepSeek-R1-14B	DCS (Ours)	0.502	0.624	0.480	0.553
2	Qwen3-4B	DCS (Ours)	0.459	0.606	0.434	0.524
3	Llama-3.1-8B	DCS (Ours)	0.481	0.538	0.424	0.455
4	Llama-3.1-8B	Logit-Based Judge	0.503	0.417	0.330	0.292
5	Llama-3.2-1B	DCS (Ours)	0.345	0.540	0.388	0.509
6	FOMC-RoBERTa	Supervised	0.388	0.446	0.288	0.285
7	—	Dictionary	0.314	0.384	0.289	0.299
8	DeepSeek-R1-14B	LLM Judge	0.389	0.340	0.257	0.242
9	Qwen3-4B	Logit-Based Judge	0.380	0.341	0.253	0.211
10	Llama-3.2-1B	Linear Probe	0.196	0.404	0.155	0.376
11	Llama-3.1-8B	LLM Judge	0.354	0.280	0.217	0.196
12	Llama-3.2-1B	LLM Judge	0.238	0.319	0.152	0.180
13	DeepSeek-R1-14B	Logit-Based Judge	0.227	0.179	0.078	0.069
14	Llama-3.2-1B	Logit-Based Judge	0.065	0.227	0.060	0.181
15	Qwen3-4B	LLM Judge	0.170	0.141	0.080	0.048
16	Qwen3-4B	Linear Probe	0.166	0.206	0.032	0.009
17	Llama-3.1-8B	Linear Probe	0.034	0.162	0.000	0.187
18	DeepSeek-R1-14B	Linear Probe	0.055	0.072	0.061	0.039
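The two correlation columns can be computed as in the sketch below, which implements Pearson and Spearman correlation in plain Python. The `stance` and `cpi_yoy` series are hypothetical placeholder values, not data from the leaderboard, and the page does not say which software produced the reported numbers.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical meeting-level stance scores and CPI YoY changes (percent).
stance = [-0.4, -0.1, 0.2, 0.3, 0.9]
cpi_yoy = [1.2, 1.5, 1.4, 2.8, 6.0]

print(round(pearson(stance, cpi_yoy), 3))
print(round(spearman(stance, cpi_yoy), 3))
```

Spearman depends only on rank order, so it is less sensitive than Pearson to a few extreme inflation readings. That may be one reason the two columns diverge for some rows.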

Treasury yield regression

Regression of Treasury yield levels on stance scores. β is the coefficient on the stance score; p is its p-value.
#	Model	Method	2Y Yield (β)	2Y Yield (p)	10Y Yield (β)	10Y Yield (p)	20Y Yield (β)	20Y Yield (p)
1	DeepSeek-R1-14B	DCS (Ours)	2.121	<.01	0.782	<.01	0.771	<.01
2	Llama-3.1-8B	DCS (Ours)	0.842	<.01	0.796	<.01	0.852	<.01
3	FOMC-RoBERTa	Supervised	1.058	<.01	0.536	<.01	0.402	<.01
4	Llama-3.1-8B	LLM Judge	0.854	<.01	0.328	<.01	0.202	.096
5	Llama-3.1-8B	Logit-Based Judge	0.901	<.01	0.299	.012	0.158	.185
6	Llama-3.1-8B	Linear Probe	0.055	.754	0.227	.067	0.326	<.01
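A regression of this kind can be sketched as a univariate ordinary least squares fit of yield on stance score. The page does not state the exact specification (controls, standard-error type), so the plain OLS form, the `ols_slope` helper, and the data below are assumptions for illustration only. The sketch reports a t-statistic rather than a p-value, since converting it requires a Student-t distribution that the Python standard library does not provide.

```python
import math


def ols_slope(x, y):
    """Fit y = alpha + beta * x by ordinary least squares.

    Returns (alpha, beta, t_stat), where t_stat is beta divided by its
    standard error under classical homoskedastic assumptions.
    Requires at least three observations.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    residuals = [b - (alpha + beta * a) for a, b in zip(x, y)]
    sigma2 = sum(r ** 2 for r in residuals) / (n - 2)  # residual variance
    se_beta = math.sqrt(sigma2 / sxx)
    # A perfect fit has zero standard error; report a signed infinity.
    t_stat = beta / se_beta if se_beta > 0 else math.copysign(math.inf, beta)
    return alpha, beta, t_stat


# Hypothetical stance scores and 2Y yields (percent) for five meetings.
stance = [-0.5, -0.2, 0.0, 0.3, 0.6]
yield_2y = [0.4, 1.1, 1.3, 2.4, 2.9]

alpha, beta, t_stat = ols_slope(stance, yield_2y)
print(f"beta={beta:.3f}, t={t_stat:.2f}")
```

A positive β means that more hawkish scores coincide with higher yields. Comparing β across maturities requires the stance scores to be on a common scale, which is assumed here.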

Label-free: self-supervised from meeting order
Up to 71.1% accuracy: beats supervised and LLM-judge methods
Market-reflected: significant across CPI, PPI, and yields

Stance Over Time

DCS scores track Fed rate cycles and lead them.

[Interactive chart: stance over time. Higher score = more hawkish. Panels show the level (DCS and FFR, the federal funds rate) and the relative stance (delta).]

Which Sentences Drive the Score?

[Interactive view: sentences are highlighted red for hawkish and blue for dovish, each with its contribution (Δ) to the final score.]

How Teams Use DCS

Macro research

Detect policy regime shifts; use stance as a leading-indicator input for CPI/PPI views.

Rates strategy

A structured language factor for yield-curve forecasting and duration positioning.

Risk monitoring

Transparent central-bank communication signal for scenario analysis and risk reporting.

Resources

Download DCS Scores (CSV)
Code
Paper (arXiv)