Agentic AI · Equity Research

A research agent that thinks like a buy-side analyst.

How much of a financial analyst's workflow can an AI agent actually replace? I built the answer — not with a chatbot that summarizes headlines, but with a structured agent that enforces a mandatory 7-step research sequence before forming any view.

Claude Sonnet 4 · MCP · FastMCP · fintools-mcp · SEC EDGAR API · yfinance · Custom Python indicators · RAG knowledge layer
System configuration
model: claude-sonnet-4-20250514 · Anthropic /v1/messages
orchestration: Model Context Protocol (MCP) · FastMCP · stdio transport
servers: fintools-mcp (11 tools) + sec-filings (8 tools) = 19 total
market data: yfinance · SEC EDGAR API (direct) · NewsAPI
indicators: custom Python · zero TA-Lib dependency
knowledge layer: RAG-ready · swappable via system prompt files
distribution: pyproject.toml · hatchling · pip / uv installable
7 mandatory steps before any output is written
20+ live signals synthesized per research report
5 research modes across stock, macro & options
The problem

The same query, run twice, can produce a different view. That's the problem.

LLM financial analysis is only as reliable as the data pipeline enforcing it. Without a mandatory retrieval sequence, models fill data gaps with priors — outputs that look structured but aren't traceable. The fix isn't a better model. It's discipline built into the system before the model is ever invoked.

Without structural enforcement
Directional view formed before estimate revisions are checked
Options chain skipped when data is hard to retrieve
Insider selling missed because Form 4s aren't queried
Contradictory signals averaged into a tidy "neutral"
Outputs vary run-to-run — not traceable back to sources
With the agent
7 data steps gated sequentially — no step can be skipped
Every signal labelled by source, basis, and time horizon
Form 4 clusters, 13F flows, and tone deltas pulled per run
Contradictions get their own section — never smoothed over
Same query, same structure, every time — fully traceable

The pipeline

Seven steps. Structurally enforced. No shortcuts.

The model cannot write output before step 7. This is enforcement via MCP tool gating — not prompt-level instruction. The same query, run twice, produces the same structure.

1. Live quote + trend score: price, market cap, 52w range, trend score /100
2. Technical indicators: RSI (Wilder's), MACD, ATR, EMA 9/21/50/200, Fibonacci, support/resistance
3. Options chain analysis: IV/HV ratio, put/call skew, GEX, max pain, per expiry
4. Fundamental metrics: P/E, EV/EBITDA, margins, FCF, each labelled trailing/forward and GAAP/adjusted
5. SEC filings: 10-K/10-Q via EDGAR API · Form 4 cluster detection · 13F flow
6. Estimate revision direction: 30-day consensus revision trend · up/down ratio · management tone delta
7. Cross-signal synthesis: all signals tagged tactical (0–3mo) or fundamental (6–18mo) · contradictions surfaced · conviction scored
⟶ Output unlocked only here
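
A minimal sketch of how sequential gating like this could be modelled, independent of any MCP library. The `ResearchGate` class, the `GateError` exception, and the step names are hypothetical, chosen here to mirror the seven steps above; the actual tool names and enforcement code are not shown in this write-up.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical step identifiers, in the order listed above.
STEPS = [
    "quote_trend",
    "technicals",
    "options_chain",
    "fundamentals",
    "sec_filings",
    "estimate_revisions",
    "synthesis",
]


class GateError(RuntimeError):
    """Raised when a step or the final report is requested out of order."""


@dataclass
class ResearchGate:
    """Records completed steps and refuses anything out of sequence."""

    results: dict = field(default_factory=dict)

    def next_step(self) -> Optional[str]:
        # First step that has no recorded result, or None when all are done.
        for name in STEPS:
            if name not in self.results:
                return name
        return None

    def record(self, step: str, payload: dict) -> None:
        expected = self.next_step()
        if step != expected:
            raise GateError(f"expected step {expected!r}, got {step!r}")
        self.results[step] = payload

    def write_report(self) -> dict:
        pending = self.next_step()
        if pending is not None:
            raise GateError(f"output locked: step {pending!r} not complete")
        return dict(self.results)
```

Calling `write_report()` on a fresh gate raises `GateError`, as does recording `"technicals"` before `"quote_trend"`. Because the check lives in code rather than in the prompt, the model cannot talk its way past it.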

Key design decisions

Four choices that define the system's integrity.

⏱️
Dual time-horizon tagging
Every signal is explicitly labelled tactical (0–3mo) or fundamental (6–18mo). A bullish technical setup against deteriorating fundamentals isn't averaged into a "neutral" — it's stated as a genuine dual-horizon divergence. Disagreement is the signal.
🎙️
Management tone scoring
Five dimensions scored 1–5 and delta'd against the prior period. A beat with a falling tone score is a yellow flag. A miss with an improving tone may signal a turning point. Numbers and language are interrogated separately.
⚖️
Options as a cross-check
IV/HV ratio, put/call skew, and GEX are pulled for every directional call. If they contradict the price or fundamental picture, that contradiction is surfaced explicitly — not smoothed over into a cleaner-looking output.
🕐
Data staleness warnings
Every report explicitly states whether any data source is stale. If options IV is 110 days old, the report says so and labels the analysis a regime estimate — not live pricing. Research integrity over clean outputs.
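
A staleness check of this kind can be sketched in a few lines. The source names and the day limits in `MAX_AGE_DAYS` are assumptions made for illustration; the write-up does not state the thresholds the agent uses.

```python
from datetime import date, timedelta

# Hypothetical freshness limits in days; not taken from the agent itself.
MAX_AGE_DAYS = {"quote": 1, "options_iv": 5, "fundamentals": 100}


def staleness_warnings(as_of: dict, today: date) -> list:
    """Return one warning string per data source older than its limit."""
    warnings = []
    for source, stamp in sorted(as_of.items()):
        limit = MAX_AGE_DAYS.get(source)
        if limit is None:
            continue  # sources without a configured limit are not checked
        age = (today - stamp).days
        if age > limit:
            warnings.append(
                f"{source}: {age} days old (limit {limit}); label as regime estimate"
            )
    return warnings


today = date(2026, 4, 29)
stamps = {"quote": today, "options_iv": today - timedelta(days=110)}
for line in staleness_warnings(stamps, today):
    print(line)  # options_iv: 110 days old (limit 5); label as regime estimate
```

The point of returning strings rather than raising is that stale data does not block the report; it changes how the report describes itself.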

Sample outputs

Real reports. Real data. April 29, 2026.

Four reports generated in a single session — each following the full 7-step pipeline. Below is what the agent actually found, not a description of what it does.

Equity Analysis · NVDA
NVIDIA Corporation · $213.17 · −2.55%
The agent confirmed the Blackwell cycle thesis — but also surfaced what consensus hasn't priced: $0 China H200 revenue in management's own Q1 guide despite 400,000+ units of approved orders pending clearance. That unmodeled upside sits alongside 18 months of uninterrupted insider selling at a 15:0 sell/buy ratio — a genuine dual-horizon divergence the agent flags rather than resolves.
revenue growth YoY: +73.2%
PEG ratio: 0.74x
estimate revisions: 32 up / 1 down
operating margin: 65.0%
RSI (14-day): 70.6 (overbought)
insider sell/buy ratio: 15:0 (18 months)

Contradiction flagged: Tactical medium / Fundamental high — RSI overbought near 52w high against strongest estimate revision ratio in mega-cap tech
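
One way to picture how a dual-horizon divergence like this could be detected is sketched below. The `Signal` type, the +1/-1 direction encoding, and the rule of netting signals within a horizon are all assumptions made for illustration; the agent's actual scoring is not described at that level of detail. Netting happens only inside a horizon, never across the two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    name: str
    horizon: str    # "tactical" (0-3mo) or "fundamental" (6-18mo)
    direction: int  # +1 bullish, -1 bearish, 0 neutral (assumed encoding)
    source: str


def horizon_view(signals, horizon):
    """Net direction for one horizon: the sign of the summed directions."""
    total = sum(s.direction for s in signals if s.horizon == horizon)
    return (total > 0) - (total < 0)


def find_divergence(signals):
    """Return both views when the horizons point opposite ways, else None."""
    tactical = horizon_view(signals, "tactical")
    fundamental = horizon_view(signals, "fundamental")
    if tactical * fundamental < 0:
        return {"tactical": tactical, "fundamental": fundamental}
    return None


signals = [
    Signal("RSI 70.6 overbought", "tactical", -1, "indicators"),
    Signal("estimate revisions 32 up / 1 down", "fundamental", +1, "estimates"),
]
print(find_divergence(signals))
```

When the two views disagree, the function returns both rather than a blended number, which is the behaviour the report format depends on.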
Peer Comparison · NVDA vs AMD vs TSM
Semiconductor cohort ranked across 4 dimensions
All three names in strong uptrends with clean EMA stacks — but the differentiation the agent surfaces is stark. AMD has the strongest technical momentum (ADX 45.3) and simultaneously the weakest fundamentals: 12.5% net margin vs NVDA's 55.6%, 123x trailing GAAP P/E. TSM is the only name not technically overbought and the only one capturing economics regardless of which AI chip architecture wins.
NVDA trailing P/E: 43.5x · PEG 0.74
AMD trailing P/E: 123.4x · ADX 45.3
TSM trailing P/E: 33.6x · RSI 61.4
NVDA FCF (FY2026): $58.1B
AMD net margin: 12.5% vs NVDA 55.6%

Contradiction flagged: AMD — strongest ADX (45.3) against weakest margins and highest P/E. Technical conviction without fundamental support.
Options Chain Analysis · TSM
116 contracts across 3 expirations · Live MCP data
The agent found something the fundamental numbers can't show: ATM IV at ~49% sits below estimated 30-day realized HV of ~55% — options are cheap vs realized vol in a name with a geopolitical overlay. Meanwhile the $220 May put (44% below spot, 17 DTE, 103.9% IV, 6,272 OI) is the agent's clearest non-fundamental signal — the only rational buyer is hedging a Taiwan Strait discontinuity, not a thesis on earnings.
ATM IV (May/Jun/Sep): ~49% / ~48% / ~49%
HV 30-day (est.): ~55% (IV cheap vs realized)
P/C ratio May / Sep: 0.64 bullish / 1.75 put-heavy
$220P May OI: 6,272 · IV 103.9% · $0.11/share
$340P May OI: 15,589 (largest in chain)

Contradiction flagged: Near-term call-heavy (bullish) while Sep P/C 1.75 accumulates deep OTM puts targeting Nov 2026 policy cliff.
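
The arithmetic behind "IV cheap vs realized" is a simple ratio. The sketch below assumes the ratio is ATM implied vol divided by 30-day realized vol, and that the put/call ratio is computed on open interest; the 5% tolerance band in `label_iv` is a hypothetical choice, not the agent's threshold.

```python
def iv_hv_ratio(atm_iv: float, realized_hv: float) -> float:
    """Implied vol divided by realized vol, both as decimals (0.49 = 49%)."""
    if realized_hv <= 0:
        raise ValueError("realized volatility must be positive")
    return atm_iv / realized_hv


def label_iv(ratio: float, band: float = 0.05) -> str:
    """Classify options as cheap, rich, or in line relative to realized vol."""
    if ratio < 1 - band:
        return "cheap"
    if ratio > 1 + band:
        return "rich"
    return "in line"


def put_call_ratio(put_oi: int, call_oi: int) -> float:
    """Put open interest divided by call open interest (assumed OI-based)."""
    if call_oi <= 0:
        raise ValueError("call open interest must be positive")
    return put_oi / call_oi


ratio = iv_hv_ratio(0.49, 0.55)        # the TSM May figures above
print(round(ratio, 2), label_iv(ratio))  # 0.89 cheap
```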
Macro Analysis · US–China Tech Trade
Export control regime mapped to NVDA · AMD · TSM
The agent mapped a five-layer policy stack — from the Jan 2026 BIS case-by-case rule to the April 2026 TSMC Arizona framework — and separated what's priced from what isn't. Key finding: NVDA's Q1 guide assumes $0 China H200 revenue despite 2M+ units of demand. The agent also identifies the November 2026 rare earth suspension expiry as the most important unpriced binary event in the next 12 months for all three names.
SOXX recovery off lows: +148% ($176 → $439)
USD proxy (UUP): near 52w low (tailwind)
HY credit (HYG): near 52w high (no stress)
NVDA China H200 in guide: $0 assumed (unmodeled)
Nov 2026 rare earth cliff: not priced · binary risk

Key unpriced event: Rare earth suspension + Affiliates Rule waiver expire Nov 2026 — most severe re-escalation risk in US-China tech trade history if not renewed.

The answer

So — how much of the workflow can it actually replace?

That was the opening question. Here's what the build revealed.

What the agent handles end-to-end
Live data retrieval across 19 MCP tools — every run
Technical indicator computation with custom Python (no wrappers)
Options chain interpretation across multiple expirations
SEC filing retrieval and Form 4 cluster detection
Estimate revision direction and management tone delta
Structured output with labelled signals and contradiction sections
What stays with the analyst
Deciding which question to ask and which mode to run
Weighting contradictory signals against portfolio context
Acting on the conviction score: position sizing and timing
Knowing when a signal is noise vs edge
Final verdict
More than I expected — if you build the discipline in first. The agent doesn't replace the analyst's judgment. It eliminates everything that stands between the analyst and the moment of judgment. That handoff is the point.

Personal reflection
Behind the build

The core design problem was reliability. LLM financial analysis is only as good as the data pipeline enforcing it — the same query run twice can produce materially different outputs if the model fills data gaps with priors. I designed against that from the start: every indicator custom-coded in Python (no TA-Lib), SEC EDGAR queried directly via the data API, and the output structure typed via prompt schemas so every metric carries its own label and basis.
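
To give a sense of what a custom-coded indicator with no TA-Lib dependency looks like, here is a minimal sketch of RSI with Wilder's smoothing in pure Python. It is an illustration of the standard method, not the agent's own code; the function name `wilder_rsi` and the choice to return 50.0 for a completely flat series are assumptions.

```python
def wilder_rsi(closes, period=14):
    """Latest RSI using Wilder's smoothing, or None if there is too little data."""
    if len(closes) < period + 1:
        return None
    changes = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]

    # Seed both averages with a simple mean over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period

    # Wilder's smoothing: new_avg = (prev_avg * (period - 1) + current) / period
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period

    if avg_loss == 0:
        # No losses in the window: 100 if there were gains, 50 if flat (assumed).
        return 100.0 if avg_gain > 0 else 50.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)
```

Owning the smoothing step matters because libraries differ on seeding and on simple versus Wilder averaging, and those differences move the reading by several points near the 70 threshold.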

The RAG knowledge layer was the piece that made the system genuinely extensible. Macro transmission maps, management tone scoring rubrics, filing interpretation guides — all ingested via system prompt files. Swap the files, update the domain knowledge, without touching the pipeline or the model. That decoupling is what makes this architecture reusable far beyond equities.
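
The swap-the-files idea can be sketched as a small loader that assembles the system prompt from whatever knowledge files are present. The `build_system_prompt` function, the `.md` extension, and the one-file-per-topic layout are assumptions for illustration; the actual file format is not specified here.

```python
from pathlib import Path


def build_system_prompt(knowledge_dir, base_prompt):
    """Join the base prompt with every .md file in knowledge_dir, sorted by name."""
    sections = [base_prompt.strip()]
    for path in sorted(Path(knowledge_dir).glob("*.md")):
        # Use the file name as a section title, e.g. tone_rubric.md -> "tone rubric".
        title = path.stem.replace("_", " ")
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {title}\n{body}")
    return "\n\n".join(sections)
```

Pointing `knowledge_dir` at a different folder changes the domain knowledge the model sees while the pipeline and the tool layer stay untouched.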
