fillin / context engineering for agents
live · checking…
cutoff 2026-01-01 → today

Your agent stopped reading on 2026-01-01.

That's days of arXiv, HN, releases, and vendor announcements your model can't see. Fillin is the embeddings layer that gives your agent every post-cutoff document — per query, autonomously, citable.

Live demo below — the query already ran against the production endpoint when this page loaded. No install needed to see it work.

cutoff 2026-01-01 · day gap · corpus · latest
live response · fillin_query via streamable HTTP
running…
query →new MCP servers and agent infrastructure
Each result is a real, dated, citable doc from past your cutoff. This is what your agent gets back. Wire it up ↓

1What model is your agent running? // determines your blind spot

Pick the model your agent calls. We use its training cutoff to scope every retrieval — only delta data ever lands in your agent's context window. If your agent multiplexes models, pick the one with the latest cutoff (it's the upper bound of what it might already know).

your agent's blind spot
days
cutoff → today
days past your cutoff
docs in fillin · live
latest indexed · UTC
Without Fillin, your agent answers all of these questions either by refusing ("my training ended…") or by hallucinating specifics that sound plausible. Both are bad. Both are avoidable.

2How does your agent talk to its model?

Fillin's MCP server speaks Streamable HTTP at one URL. Pick your harness — we'll show the exact JSON snippet to drop in. If you build your own loop, pick curl / custom for the raw HTTP shape.

3Drop in the config

Add the snippet below to your harness's MCP config. Restart the harness so it picks up the new server.

mcp.json

  

4Verify the connection

Click below — this page hits the live fillin_health tool against the deployed MCP server. Sub-second response means your agent will see Fillin in its toolset on next start.

5Your agent now reads past its cutoff.

Three tools live in your agent's toolset, called autonomously when relevant. Fillin handles the embedding, the filtering, the reranking, the freshness — your agent just asks and gets clean post-cutoff context back.

model wired
cutoff scoped
tools added to your agent
fillin_query · fillin_stats · fillin_health
corpus you're now reading
checking…

What good looks like next: watch your agent autonomously call fillin_query when a question is post-cutoff. The model decides; you watch the trace; the answer is grounded in real, dated, citable sources.