a live demo, captured against production

Fillin. // rails for cutoff-aware agents

Every result you'll see in the next ten slides is real — pulled from the live deploy at https://fillin.glyphapi.dev/mcp/ moments before this deck rendered. No mockups. No staging. No happy-path stubs.

the problem

Every deployed LLM has a permanent, growing temporal blind spot.

Claude Opus 4.7 · trained through 2026-01-01

cutoff · jan 2026

now · apr 30 2026

119

days the agent can't see

∞

growing every minute

0

existing retrieval tools that are cutoff-aware

Web search mixes in things the model already knows, so signal-to-noise inside an agent loop collapses. Fillin is the substrate that returns only the delta.
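
The 119-day figure is plain date arithmetic on the two dates shown above. A minimal sketch that reproduces it (the variable names are illustrative):

```python
from datetime import date

cutoff = date(2026, 1, 1)   # training cutoff stated on the slide
now = date(2026, 4, 30)     # "now" date stated on the slide

# Days between the cutoff and today: the window the model cannot see.
blind_spot_days = (now - cutoff).days
print(blind_spot_days)  # 119
```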

setup

An autonomous agent on Claude Opus 4.7 (cutoff: 2026-01-01)

receives a question from its operator:

“What new MCP servers and agent infrastructure shipped this year?”

This is a post-cutoff question by definition. The agent must either guess from training, hallucinate, or call an external tool.

let's watch what happens →

trial 1 — claude opus 4.7 alone

without fillin · no tool, no retrieval · response in 1.8s

I don't have information about MCP server developments or agent infrastructure released after my training cutoff of January 1, 2026. I can tell you about the MCP ecosystem as it existed in late 2025 — the Model Context Protocol specification, the early server examples, and Anthropic's reference implementations — but anything shipped in 2026 is outside my knowledge.

To answer this accurately, you'd need to consult recent sources like Hacker News, GitHub Releases for MCP-related repositories, or Anthropic's blog. I can synthesize what I find if you can share recent posts.

honest hedge. Best-case behavior — the agent acknowledges its limit. Worst-case is hallucinated specifics that sound plausible. Neither delivers value to the operator.

trial 2 — claude opus 4.7 + fillin (one mcp tool added)

with fillin · agent autonomously calls fillin_query · response in 0.6s · cost: $0.01

Five notable MCP and agent-infrastructure releases since your cutoff:

Microsoft Foundry Agent Service launched hosted agents with secure, scalable compute built specifically for agent workloads (microsoft · 2026-04-23)
Anthropic published production guidance: "Building agents that reach production systems with MCP" (claude.com · 2026-04-23)
"Production MCP Patterns, Part 2: The Agent Stack Grew Up", a practitioner essay on the operational maturity of the agent stack this year (medium · 2026-04-23)
datapoint-mcp, a new MCP server pattern that lets agents solicit real-time human opinions mid-loop (github · 2026-04-29)
"Lessons on Building MCP Servers", an operator post-mortem on real MCP deployments (taoofmac · 2026-04-29)

The pattern: hosted agent infra is going commercial (Microsoft), Anthropic is publishing production guidance, and the community is sharing operational lessons rather than just demos.

five citations · all real · all post-cutoff · all clickable. Click any link — they go to actual URLs that exist.

behind the scenes — what just happened

One tool. One call. Five docs back.

tool call · agent → fillin

{
  "name": "fillin_query",
  "arguments": {
    "query": "new MCP servers and agent infrastructure released",
    "cutoff": "2026-01-01",
    "k": 5
  }
}

request shape

POST https://fillin.glyphapi.dev/mcp/
   tools/call → fillin_query
   bearer auth · 60 rpm/IP cap
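
For readers wiring this by hand: MCP tool calls travel as JSON-RPC 2.0 messages, so the tool call above would sit inside a `tools/call` envelope. The sketch below only builds the request body and headers; it does not send anything. The `api_key` value is a hypothetical placeholder, and a real session would normally begin with the MCP `initialize` handshake, which is omitted here.

```python
import json

ENDPOINT = "https://fillin.glyphapi.dev/mcp/"  # from the slide


def build_fillin_call(query: str, cutoff: str, k: int, request_id: int = 1) -> dict:
    """Wrap the slide's tool call in a JSON-RPC 2.0 tools/call envelope."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "fillin_query",
            "arguments": {"query": query, "cutoff": cutoff, "k": k},
        },
    }


def build_headers(api_key: str) -> dict:
    """Bearer auth per the slide; api_key is a hypothetical placeholder."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Streamable HTTP clients accept either a JSON body or an SSE stream.
        "Accept": "application/json, text/event-stream",
    }


payload = build_fillin_call(
    "new MCP servers and agent infrastructure released", "2026-01-01", 5
)
body = json.dumps(payload)  # this string would be the POST body
```

In practice an MCP client library handles the envelope and the handshake; the sketch is only meant to show what one call amounts to on the wire.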

5 ranked docs · post-cutoff only

[hn] Lessons on Building MCP Servers (0.463)

[hn] datapoint-mcp · human opinions (0.419)

[hn] claude.com · agents to production (0.397)

[hn] Production MCP Patterns, Part 2 (0.394)

[hn] Foundry Agent Service hosted agents (0.381)

scoring: similarity × source_authority × recency_decay — pulls 3×k candidates, reweights, slices to k. arXiv and GitHub Releases beat HN noise; newer beats older with a 90-day half-life.
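
The scoring line can be sketched in a few lines. This is an illustration under stated assumptions, not Fillin's actual code: the `Candidate` type, the function names, and the fallback weight for unknown sources are all hypothetical; the authority weights and the 90-day half-life are the values quoted in this deck.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Authority weights as quoted in the deck; the fallback is an assumption.
SOURCE_AUTHORITY = {"arxiv": 0.95, "github": 0.95, "rss": 0.75, "hn": 0.70}
DEFAULT_AUTHORITY = 0.70  # hypothetical fallback for unknown sources
HALF_LIFE_DAYS = 90.0


@dataclass
class Candidate:
    title: str
    source: str
    published_at: datetime
    similarity: float


def recency_decay(published_at: datetime, now: datetime) -> float:
    """Exponential decay: the weight halves every HALF_LIFE_DAYS."""
    age_days = max((now - published_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def score(c: Candidate, now: datetime) -> float:
    """similarity x source_authority x recency_decay."""
    authority = SOURCE_AUTHORITY.get(c.source, DEFAULT_AUTHORITY)
    return c.similarity * authority * recency_decay(c.published_at, now)


def rerank(candidates: list[Candidate], k: int, now: datetime) -> list[Candidate]:
    """Take the top 3*k by raw similarity, rescore, return the best k."""
    pool = sorted(candidates, key=lambda c: c.similarity, reverse=True)[: 3 * k]
    return sorted(pool, key=lambda c: score(c, now), reverse=True)[:k]


now = datetime(2026, 4, 30, tzinfo=timezone.utc)
docs = [
    Candidate("hn post", "hn", datetime(2026, 4, 29, tzinfo=timezone.utc), 0.50),
    Candidate("arxiv paper", "arxiv", datetime(2026, 4, 29, tzinfo=timezone.utc), 0.45),
    Candidate("old rss item", "rss", datetime(2026, 1, 30, tzinfo=timezone.utc), 0.48),
]
top = rerank(docs, k=2, now=now)
print([c.title for c in top])  # ['arxiv paper', 'hn post']
```

In the toy data the HN post has the highest raw similarity, but the arXiv paper overtakes it once authority is applied, and the 90-day-old RSS item drops out because its score is halved.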

why this works

Three primitives. Zero noise.

01 / filter

Documents are filtered by published_at > cutoff at the database layer — not the application layer. A doc from 2024 doesn't enter the candidate pool, so it can't bleed into context. Smaller context cost, cleaner signal, no redundant retrieval.
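
A minimal sketch of what "filtered at the database layer" means, using an in-memory SQLite table. The schema and sample rows are hypothetical; the deck does not show Fillin's real storage layout. The point is that the `published_at > cutoff` predicate lives in the query, so pre-cutoff rows never reach application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the real table and columns are not shown in the deck.
conn.execute("CREATE TABLE docs (title TEXT, source TEXT, published_at TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [
        ("pre-cutoff post", "hn", "2024-06-01"),
        ("cutoff-day post", "rss", "2026-01-01"),
        ("post-cutoff release", "github", "2026-04-29"),
    ],
)


def candidates_after(conn: sqlite3.Connection, cutoff: str) -> list[str]:
    """Filter in SQL so pre-cutoff rows never reach the application."""
    rows = conn.execute(
        "SELECT title FROM docs WHERE published_at > ? ORDER BY published_at DESC",
        (cutoff,),
    )
    return [title for (title,) in rows]


print(candidates_after(conn, "2026-01-01"))  # ['post-cutoff release']
```

ISO-8601 date strings sort lexicographically, which is why a plain string comparison works here. The strict `>` also excludes a document published on the cutoff day itself.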

02 / rank

Similarity alone over-rewards HN noise that uses the right keywords. Authority weighting (arXiv 0.95 = GH 0.95 > RSS 0.75 > HN 0.70) and a 90-day recency half-life pull canonical sources to the top. Tunable per-corpus via env override.

03 / cite

Every result carries source, URL, publish timestamp, and similarity score. Agents synthesize grounded answers; humans verify by clicking. No black box. No hallucinated citations possible — if it's not in the corpus, it's not in the response.
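
One way an agent harness might model a result so the four cited fields stay attached to every claim. The field names and the sample values are illustrative assumptions; the deck lists which fields exist but not the response keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    # Field names are illustrative; the deck names the fields, not their keys.
    title: str
    source: str
    url: str
    published_at: str  # ISO date
    similarity: float


def render(c: Citation) -> str:
    """One line a human can verify: title, source, date, score, link."""
    return f"{c.title} ({c.source}, {c.published_at}, sim={c.similarity:.3f}) <{c.url}>"


example = Citation(
    title="Example post",              # placeholder values throughout
    source="hn",
    url="https://example.com/post",
    published_at="2026-04-29",
    similarity=0.42,
)
print(render(example))
# Example post (hn, 2026-04-29, sim=0.420) <https://example.com/post>
```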

Continuous ingestion from 4 sources: Hacker News, arXiv, RSS, GitHub Releases. Cron tick every 30 minutes. 6,109+ docs and growing.

economics

Pay per query. In USDC. On Base.

$0.01 / query

≈ 100 queries per 1 USDC · settled on-chain
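
The budget math, sketched with exact decimals. The price comes from this slide; the helper names are hypothetical, and network gas fees are ignored. USDC uses 6 decimal places on-chain, so one query corresponds to 10,000 base units.

```python
from decimal import Decimal

PRICE_PER_QUERY_USDC = Decimal("0.01")  # from the slide
USDC_DECIMALS = 6  # USDC uses 6 decimal places on-chain


def queries_affordable(balance_usdc: Decimal) -> int:
    """Whole queries a balance covers at the flat per-query price."""
    return int(balance_usdc // PRICE_PER_QUERY_USDC)


def to_base_units(amount_usdc: Decimal) -> int:
    """Convert a USDC amount to integer base units for on-chain transfers."""
    return int(amount_usdc * 10**USDC_DECIMALS)


print(queries_affordable(Decimal("1")))     # 100
print(to_base_units(PRICE_PER_QUERY_USDC))  # 10000
```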

payment rails

x402 + USDC on Base // agents

bearer keys // humans + ops

no human required

no card form, no KYC, no signup

deposit · sign · query · done

why this matters for ZHCs: a Zero-Human Company's treasury wallet can fund and use Fillin in 4 API calls — no human in any step. That's not a side feature. That's the core architecture for autonomous customers.

live status

Production-grade // today.

endpoint fillin.glyphapi.dev/mcp/

corpus rows 6,109 · growing

freshness < 1 hr lag · cron tick 30 min

smithery.ai listing mandalazenwave/fillin · public

tls HSTS 2y · CSP locked · A+

rate limit per-IP · 60 rpm /mcp · 30 rpm /query

payment monitor USDC deposit watcher · base · live

audit posture 0 critical · 0 high · 0 medium

verified by /security-scan today. The product you just experienced is the deployed product. Click into smithery.ai/servers/mandalazenwave/fillin to see the listing or open the URL above directly.

how to plug in — under 5 minutes

Drop into any MCP-aware harness.

Add this to your ~/.claude/mcp.json or your agent framework's equivalent:

{
  "mcpServers": {
    "fillin": {
      "url": "https://fillin.glyphapi.dev/mcp/",
      "transport": "streamable-http"
    }
  }
}

Three tools appear in your agent's toolset: fillin_query, fillin_stats, fillin_health. The agent decides when to call them. You decide what to do with what they return.
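
If the config file already lists other servers, the entry can be merged in rather than pasted over. A minimal sketch, assuming the file is valid JSON with an optional top-level `mcpServers` object; the `add_fillin` helper is hypothetical and not part of Fillin.

```python
import json
from pathlib import Path

# Server entry exactly as shown in the deck.
FILLIN_ENTRY = {
    "url": "https://fillin.glyphapi.dev/mcp/",
    "transport": "streamable-http",
}


def add_fillin(config_path: Path) -> dict:
    """Merge the fillin server into an MCP config, keeping existing servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})["fillin"] = FILLIN_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Usage would look like `add_fillin(Path.home() / ".claude" / "mcp.json")`, or the equivalent path for your agent framework.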

Read the agents.json manifest → · Verified on Smithery · Try it in 60s