Generated from git history. Most recent first.
2026-05-21 · 6cb1652
fix(deploy/setup-stripe): poll /v1/health before smoke-testing checkout
The 3s sleep before the smoke test was shorter than fillin-api's cold-start
(embedder + corpus cache warmup is ~60s), causing setup-stripe.sh to print a
false-positive 502 even when Stripe was wired correctly. Poll /v1/health
with a 180s deadline so the script reports based on the real terminal state.
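A minimal sketch of the poll-with-deadline idea, written in Python rather than the shell the script itself uses. The function names, the 2s poll interval, and the injectable clock are assumptions for illustration; only the 180s deadline and the /v1/health path come from the commit message.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout_s: float = 5.0) -> bool:
    """True when GET url answers 200. urllib raises an OSError subclass on
    refused connections and on non-2xx statuses such as 502."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.status == 200


def wait_until_healthy(
    probe: Callable[[], bool],
    deadline_s: float = 180.0,      # deadline from the commit message
    interval_s: float = 2.0,        # hypothetical poll interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as probe() succeeds, False once the deadline passes."""
    give_up_at = clock() + deadline_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            pass  # service still warming up; keep polling
        if clock() >= give_up_at:
            return False
        sleep(interval_s)
```

A caller would wrap the probe, for example `wait_until_healthy(lambda: http_ok(base_url + "/v1/health"))`, and only run the checkout smoke test when it returns True.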
2026-05-20 · 88c2fbc
feat: payments receipts + push subscriptions + distribution surfaces
Three concurrent threads toward the "Fillin Magnificent" plan items:
#2 Payment Rails — close the receipts gap and make Stripe go-live a one-command op:
- new `/v1/payment/transactions/{address}` returns on-chain USDC deposit history
- new `/v1/billing/transactions` returns the bearer's per-Stripe-event credit log
- `bearer_credit_events` table + `record_bearer_credit_event()` write hook
- `deploy/setup-stripe.sh` interactive: prompts for secrets (hidden input),
ships them through SSH stdin (never argv), rewrites STRIPE_* lines in
/opt/fillin/.env, restarts fillin-api, smoke-tests /v1/billing/checkout
- `deploy/install.sh` now installs the templated `[email protected]`
+ `fillin-prune` units, and reconciles enabled instances against
FILLIN_NETWORKS — fixes the long-standing bug where install.sh
re-enabled the legacy single-chain monitor on every run
- .env.example documents the multi-chain + Stripe block
#3 Push Subscriptions — turn Fillin from query endpoint into event bus:
- new `pubsub.py`: subscriptions table, topic-filter validator
(source/min_severity/keywords/affected_ecosystem), in-process SSE
registry, HMAC-signed webhook delivery
- LanceDB polling worker started under FastAPI lifespan; uses
`ingested_at` cursor (the column was added in PR #2 specifically for this)
- `POST /v1/subscriptions` (auth: bearer with positive balance),
`GET /v1/subscriptions`, `DELETE /v1/subscriptions/{id}`,
`GET /v1/subscribe/{id}` (SSE with 15s heartbeat)
- webhook deliveries carry X-Fillin-Timestamp + X-Fillin-Signature (HMAC-SHA256
over `ts.body`, Stripe-style replay-proof)
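A receiver-side sketch of the signing scheme described above. Only the HMAC-SHA256-over-`ts.body` construction and the two header names come from the commit message. The hex encoding, the 300s tolerance, and the function names are assumptions.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(secret: bytes, body: bytes, ts: int) -> str:
    """HMAC-SHA256 over "<ts>.<body>", hex-encoded (encoding is an assumption)."""
    payload = str(ts).encode() + b"." + body
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify(
    secret: bytes,
    body: bytes,
    ts_header: str,          # value of X-Fillin-Timestamp
    sig_header: str,         # value of X-Fillin-Signature
    tolerance_s: int = 300,  # hypothetical replay window
    now: Optional[float] = None,
) -> bool:
    try:
        ts = int(ts_header)
    except ValueError:
        return False
    now = time.time() if now is None else now
    if abs(now - ts) > tolerance_s:
        return False  # too old or too far in the future: treat as a replay
    expected = sign(secret, body, ts)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), sig_header.encode())
```

Because the timestamp is part of the signed payload, an attacker cannot refresh it on a captured delivery without invalidating the signature.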
#1 Distribution surfaces — make Fillin discoverable + installable everywhere:
- `/llms.txt` route serving web/llms.txt (agent-crawler discovery)
- `Dockerfile` for the FastAPI+MCP HTTP server
- `docs/integrations/cursor.md` — MCP config + Stripe funding walkthrough
- `docs/integrations/claude-desktop.md` — claude_desktop_config.json shim
- `docs/integrations/browse-sh.md` — ready-to-submit skill spec (browse.sh
agents make external HTTP requests, so x402 settles correctly; design verified)
- `docs/integrations/mcp-registry-pr.md` — draft PR text for
modelcontextprotocol/servers community list
Tests: 259 passing. New coverage in test_pubsub.py (15), test_api_subscriptions.py
(9), test_api_discovery.py (3), additions in test_api_billing.py (4).
2026-05-17 · 07bde4a
markets: new /query/markets slice — Polymarket + Kalshi + Manifold + Metaculus (#4)
* Phase R: schema migration foundation + CVE answer-engine columns
Ships R.1 audit, R.2 typed-columns decision, R.3 ingested_at, and R.next
severity_score + affected[] against the cves source. The CVE answer-engine
SKU is now shippable — a buyer can act on a returned row (pin to
patched_range) without a second hop to NVD or GHSA.
SCHEMA_AUDIT.md — per-source field map for all 7 ingest paths; identifies
the 4 highest-WTP discards (ingested_at, CVE severity, quality signals,
GHSA/OSV patched_range).
SCHEMA_DECISION.md — chose typed columns (B) over single JSON blob (A).
Deciding field: GHSA/OSV affected[] tuple. JSON-blob path breaks LanceDB
filter pushdown, the one-call MCP wedge, and the training-data SKU. Each
new typed column names the SKU it unlocks.
db.py — schema grows from 7 to 10 columns (+ingested_at, +severity_score,
+affected). connect() auto-migrates pre-existing tables via add_columns —
metadata-only op, existing 25k rows survive. upsert() server-stamps
ingested_at and normalizes typed-CVE defaults so non-CVE sources land as
null/empty. query_delta_in_sources gains min_severity filter for the CVE
severity tier — pushed into the LanceDB WHERE clause, drops null-severity
rows silently rather than treating them as 0.
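The null-handling choice can be sketched as follows. `build_where` and `keeps` are hypothetical helpers, not the functions in db.py. They only illustrate that a SQL-style `>=` comparison against a null score is not true, so the row is excluded rather than counted as 0.

```python
from typing import Optional, Sequence


def build_where(sources: Sequence[str], min_severity: Optional[float] = None) -> str:
    """Compose a SQL-style filter string; the severity clause is added only when asked for."""
    quoted = ", ".join("'" + s.replace("'", "''") + "'" for s in sources)
    where = f"source IN ({quoted})"
    if min_severity is not None:
        where += f" AND severity_score >= {float(min_severity)}"
    return where


def keeps(row: dict, min_severity: Optional[float]) -> bool:
    """Pure-Python model of the same predicate, including the null case."""
    if min_severity is None:
        return True
    score = row.get("severity_score")
    # A null score never satisfies the threshold; it is not coerced to 0.
    return score is not None and score >= min_severity
```

With this shape, a caller who passes no threshold still sees unscored rows, while a caller who asks for `min_severity=7.0` gets only rows with a known score at or above 7.0.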
sources/cves.py — NVD pulls numeric CVSS baseScore (v3.1 → v3.0 → v2
fallback); GHSA reads cvss.score and pivots vulnerabilities[] into the
uniform affected struct with patched_range; OSV flattens
affected[].ranges[].events[] into the same shape. Title strings no longer
carry the parenthetical severity tag — that's redundant once typed.
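A sketch of the v3.1 → v3.0 → v2 fallback, assuming the `metrics` object of the NVD CVE API 2.0 (`cvssMetricV31`, `cvssMetricV30`, `cvssMetricV2`, each a list of entries with `cvssData.baseScore`). The helper name is hypothetical and this is not the code in sources/cves.py.

```python
from typing import Optional

# Preference order: newest CVSS version first.
_METRIC_KEYS = ("cvssMetricV31", "cvssMetricV30", "cvssMetricV2")


def nvd_base_score(metrics: dict) -> Optional[float]:
    """Return the first numeric baseScore found, or None when NVD has not scored the CVE."""
    for key in _METRIC_KEYS:
        for entry in metrics.get(key) or []:
            score = (entry.get("cvssData") or {}).get("baseScore")
            if isinstance(score, (int, float)):
                return float(score)
    return None
```

Returning None rather than 0.0 for unscored CVEs keeps them distinguishable from genuinely low-severity rows.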
mcp_server.py — query_cves accepts optional min_severity: float, threaded
through both inproc and HTTP paths.
Tests: 214 → 222 (+8). New: schema columns present, CVE row round-trip
preserves severity + affected, non-CVE rows default to null/empty, NVD
numeric baseScore + fallback + null, GHSA patched_range pivot, OSV range
flattening.
* papers: fix vapor ingest — use submittedOnDailyAt + date-granularity match
HF daily papers source was returning empty results. Root cause: comparing
the API's date-only field against a datetime cutoff with sub-day precision
silently dropped every row. Switched to paper.submittedOnDailyAt with
date-granularity comparison; 80 fresh papers backfilled on first run.
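The failure mode can be reproduced in a few lines. This sketch assumes the date-only field is promoted to midnight UTC before comparison and that the window is shorter than a day. Both function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone


def naive_is_fresh(published: date, now: datetime, hours: int) -> bool:
    """Buggy shape: a date-only value becomes midnight, then loses to a sub-day cutoff."""
    as_midnight = datetime.combine(published, time.min, tzinfo=timezone.utc)
    return as_midnight >= now - timedelta(hours=hours)


def is_fresh(published: date, now: datetime, hours: int) -> bool:
    """Fixed shape: compare at date granularity on both sides."""
    cutoff = now - timedelta(hours=hours)
    return published >= cutoff.date()
```

For a paper dated today and a 2h window evaluated at 19:00 UTC, the naive check compares 00:00 against a 17:00 cutoff and drops the row, while the date-granularity check keeps it.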
* payments: Stripe top-up + multichain wallet + bearer ledger + prune
Stripe Checkout → bearer ledger top-up flow:
- payments/stripe_billing.py mints a raw bearer + hash, sends Stripe
only the hash, stashes the raw key with a TTL, atomically claims it
on /billing/success after Stripe API confirms paid + livemode match.
- Webhook deduped by event id; requires payment_status==paid and
livemode match before crediting.
- scripts/check-no-stripe-keys.sh — pre-commit guard refuses to commit
Stripe secrets (sk_live, rk_live, whsec_) per Stripe's #1 leak vector.
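The mint / stash / claim-once flow in the first bullet can be sketched in memory. Everything here is illustrative: the class name, the SHA-256 hash, the 15-minute TTL, and the dict store are assumptions, and a real ledger would do the claim as a single database statement rather than a dict `pop`.

```python
import hashlib
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class PendingReveal:
    """Holds raw keys briefly so the success page can reveal each one exactly once."""

    def __init__(self, ttl_s: float = 900.0, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._rows: Dict[str, Tuple[str, float]] = {}  # key_hash -> (raw_key, expires_at)

    def mint(self) -> Tuple[str, str]:
        """Create a raw bearer and its hash; only the hash should leave the server."""
        raw = secrets.token_urlsafe(32)
        key_hash = hashlib.sha256(raw.encode()).hexdigest()
        self._rows[key_hash] = (raw, self._clock() + self._ttl_s)
        return raw, key_hash

    def claim(self, key_hash: str) -> Optional[str]:
        """Remove and return the raw key; a second claim, or a late one, gets None."""
        row = self._rows.pop(key_hash, None)
        if row is None:
            return None
        raw, expires_at = row
        return raw if self._clock() < expires_at else None
```

The payment provider only ever sees `key_hash`, so a leak on that side does not expose a usable bearer.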
api.py — /v1/billing/checkout + /v1/billing/webhook + /billing/success
routes; multichain x402 (Base + Optimism + Arbitrum + Polygon) selected
via payer header; Hit model + SliceQueryIn carry severity_score +
affected + min_severity for the cves slice (R.next surface).
payments/wallet.py — get_web3() now per-chain; cached singleton per
USDC contract; never reads BASE_RPC_URL.
payments/credits.py — bearer ledger (key_hash + balance) + nonce
challenge table + pending_reveal table; spend/refund + probe quota +
pruning primitives.
payments/prune.py — periodic hygiene over the three ephemeral tables,
idempotent DELETE WHERE stale, driven by fillin-prune.timer.
Tests: +940 lines across test_api_billing, test_api_multichain,
test_bearer_ledger, test_stripe_billing, test_pre_commit_hook,
test_wallet — covers Stripe webhook signing, multichain payer routing,
bearer claim-once semantics, pre-commit hook block-list.
* deploy: systemd units for monitor + prune timer + nginx CSP + README
deploy/[email protected] — per-chain templated unit; runs the
multichain USDC deposit monitor as [email protected] etc.
deploy/fillin-prune.service + .timer — daily ledger hygiene at 04:00 MT,
runs payments/prune.py to sweep expired nonces, old probes, and
unclaimed pending_reveal rows.
deploy/nginx-fillin.conf — relax CSP for the rich landing pages
(inline-eval + data: images required by the Pretext-style assets).
deploy/README.md — install + rotate runbook for the new units.
* web: hero corpus_match honesty + Stripe top-up signup + changelog
web/index.html — hero lede now surfaces the corpus_match: strong|weak|none
honesty signal so a visiting agent buyer sees that fillin returns the
quality of its own retrieval, not just docs.
web/signup.html — adds the Stripe-checkout top-up path alongside the
existing trial-key mailto flow; key delivery is one-shot on the
/billing/success page.
web/changelog.html — catches up to Phase R (ingested_at, severity_score,
affected[]), the daily snapshot slices, and the multichain x402 surface.
* markets: new /query/markets slice — Polymarket + Kalshi + Manifold + Metaculus
Adds the fourth daily-snapshot slice. One MCP call surfaces every active
market touching a topic across the four major prediction venues an agent
would otherwise check independently. Priced at $0.05 against the
four-venue rediscovery cost.
sources/markets.py — keyless public APIs only. Per-venue fetcher pulls
the active-market head, folds the load-bearing fields (question, current
implied probability / yes-price, close date, volume, venue) into `text`
so vector match catches "is X likely" queries. Defensive on Polymarket's
outcomePrices shape (Gamma has shipped both JSON-encoded string and
pre-decoded list). Manifold filters resolved markets out. Metaculus uses
community_prediction.full.q2 as the canonical median.
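The dual-shape tolerance for `outcomePrices` might look like the sketch below. The helper name and the choice to return an empty list on unparseable input are assumptions, not the behaviour of sources/markets.py.

```python
import json
from typing import Any, List


def parse_outcome_prices(raw: Any) -> List[float]:
    """Accept either a JSON-encoded string or an already-decoded list of prices."""
    if isinstance(raw, str):
        try:
            raw = json.loads(raw)
        except json.JSONDecodeError:
            return []
    if not isinstance(raw, (list, tuple)):
        return []
    prices: List[float] = []
    for item in raw:
        try:
            prices.append(float(item))
        except (TypeError, ValueError):
            return []  # one bad element makes the whole snapshot untrustworthy
    return prices
```

Both `'["0.62", "0.38"]'` and `["0.62", "0.38"]` normalize to the same list of floats, so the yes-price fold downstream does not need to know which shape the venue sent.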
ingest.py — registered in SOURCES so the standard `ingest(hours, sources)`
orchestrator picks it up.
api.py — /query/markets route, FILLIN_PRICE_MARKETS_USDC env (default
$0.05), require_paid_or_key_markets dep, /.well-known/mcp/server-card
entry.
mcp_server.py — query_markets MCP tool mirrors the other slice tools'
shape. Docstring is explicit that the price snapshots in `text` are
first-sight only; for live pre-trade pricing, follow the venue url.
scripts/ingest_markets.py + run_daily_snapshots.sh — fires markets last
in the daily 1pm-MT cron, 30s after frontier.
agents.json + smithery.yaml — declare the tool, pricing, example
invocation. Corpora list now includes "markets".
Tests: 222 → 228 (+6). Per-venue parsing tests for Polymarket (yes-price
fold, dual price-shape tolerance), Kalshi (cents → display + close date
+ contract volume), Manifold (resolved-filter + probability render),
Metaculus (community-median path). Plus a cross-venue dedup test.
Known limitation (documented in script + tool docstring): prices in the
corpus are point-in-time at first ingestion — db.upsert dedupes by id so
existing market rows aren't refreshed. The slice answers *discovery*
("is there a market about X across the four venues") not live price.
Follow-up PR can add db.replace_source('markets', ...) so daily cron
refreshes price snapshots in place.
---------
Co-authored-by: Christopher Harris <[email protected]>
Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
2026-05-16 · b8387c2
Phase R: schema migration foundation + CVE answer-engine + Stripe top-up + multichain (#2)
* Phase R: schema migration foundation + CVE answer-engine columns
Ships R.1 audit, R.2 typed-columns decision, R.3 ingested_at, and R.next
severity_score + affected[] against the cves source. The CVE answer-engine
SKU is now shippable — a buyer can act on a returned row (pin to
patched_range) without a second hop to NVD or GHSA.
SCHEMA_AUDIT.md — per-source field map for all 7 ingest paths; identifies
the 4 highest-WTP discards (ingested_at, CVE severity, quality signals,
GHSA/OSV patched_range).
SCHEMA_DECISION.md — chose typed columns (B) over single JSON blob (A).
Deciding field: GHSA/OSV affected[] tuple. JSON-blob path breaks LanceDB
filter pushdown, the one-call MCP wedge, and the training-data SKU. Each
new typed column names the SKU it unlocks.
db.py — schema grows from 7 to 10 columns (+ingested_at, +severity_score,
+affected). connect() auto-migrates pre-existing tables via add_columns —
metadata-only op, existing 25k rows survive. upsert() server-stamps
ingested_at and normalizes typed-CVE defaults so non-CVE sources land as
null/empty. query_delta_in_sources gains min_severity filter for the CVE
severity tier — pushed into the LanceDB WHERE clause, drops null-severity
rows silently rather than treating them as 0.
sources/cves.py — NVD pulls numeric CVSS baseScore (v3.1 → v3.0 → v2
fallback); GHSA reads cvss.score and pivots vulnerabilities[] into the
uniform affected struct with patched_range; OSV flattens
affected[].ranges[].events[] into the same shape. Title strings no longer
carry the parenthetical severity tag — that's redundant once typed.
mcp_server.py — query_cves accepts optional min_severity: float, threaded
through both inproc and HTTP paths.
Tests: 214 → 222 (+8). New: schema columns present, CVE row round-trip
preserves severity + affected, non-CVE rows default to null/empty, NVD
numeric baseScore + fallback + null, GHSA patched_range pivot, OSV range
flattening.
* papers: fix vapor ingest — use submittedOnDailyAt + date-granularity match
HF daily papers source was returning empty results. Root cause: comparing
the API's date-only field against a datetime cutoff with sub-day precision
silently dropped every row. Switched to paper.submittedOnDailyAt with
date-granularity comparison; 80 fresh papers backfilled on first run.
* payments: Stripe top-up + multichain wallet + bearer ledger + prune
Stripe Checkout → bearer ledger top-up flow:
- payments/stripe_billing.py mints a raw bearer + hash, sends Stripe
only the hash, stashes the raw key with a TTL, atomically claims it
on /billing/success after Stripe API confirms paid + livemode match.
- Webhook deduped by event id; requires payment_status==paid and
livemode match before crediting.
- scripts/check-no-stripe-keys.sh — pre-commit guard refuses to commit
Stripe secrets (sk_live, rk_live, whsec_) per Stripe's #1 leak vector.
api.py — /v1/billing/checkout + /v1/billing/webhook + /billing/success
routes; multichain x402 (Base + Optimism + Arbitrum + Polygon) selected
via payer header; Hit model + SliceQueryIn carry severity_score +
affected + min_severity for the cves slice (R.next surface).
payments/wallet.py — get_web3() now per-chain; cached singleton per
USDC contract; never reads BASE_RPC_URL.
payments/credits.py — bearer ledger (key_hash + balance) + nonce
challenge table + pending_reveal table; spend/refund + probe quota +
pruning primitives.
payments/prune.py — periodic hygiene over the three ephemeral tables,
idempotent DELETE WHERE stale, driven by fillin-prune.timer.
Tests: +940 lines across test_api_billing, test_api_multichain,
test_bearer_ledger, test_stripe_billing, test_pre_commit_hook,
test_wallet — covers Stripe webhook signing, multichain payer routing,
bearer claim-once semantics, pre-commit hook block-list.
* deploy: systemd units for monitor + prune timer + nginx CSP + README
deploy/[email protected] — per-chain templated unit; runs the
multichain USDC deposit monitor as [email protected] etc.
deploy/fillin-prune.service + .timer — daily ledger hygiene at 04:00 MT,
runs payments/prune.py to sweep expired nonces, old probes, and
unclaimed pending_reveal rows.
deploy/nginx-fillin.conf — relax CSP for the rich landing pages
(inline-eval + data: images required by the Pretext-style assets).
deploy/README.md — install + rotate runbook for the new units.
* web: hero corpus_match honesty + Stripe top-up signup + changelog
web/index.html — hero lede now surfaces the corpus_match: strong|weak|none
honesty signal so a visiting agent buyer sees that fillin returns the
quality of its own retrieval, not just docs.
web/signup.html — adds the Stripe-checkout top-up path alongside the
existing trial-key mailto flow; key delivery is one-shot on the
/billing/success page.
web/changelog.html — catches up to Phase R (ingested_at, severity_score,
affected[]), the daily snapshot slices, and the multichain x402 surface.
---------
Co-authored-by: Christopher Harris <[email protected]>
Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
2026-05-14 · b528f96
landing + agents.json: hero → /onboard.html, kill private-repo links — Phase R prep
Phase R conversion-friction pass before Monday outbound.
F2 (highest-leverage): landing hero primary CTA now points at
/onboard.html ("Try it in 60 seconds") instead of an anchor to #tools;
pricing-sidebar copy promotes /onboard.html and demotes /signup to
"production bearer key". The interactive wizard already worked end-to-end
(model → harness → MCP config → fillin_health verify → live fillin_query
in browser) but was buried behind a nav link.
F4: agents.json free_paths now lists interactive_onboard alongside
trial_key, and tags trial_key as the human-onboard path so agent-native
self-bootstrap routes to /onboard.html or x402 rather than the mailto.
F1 (belt-and-suspenders until artchristech/fillin is flipped public):
removed every github.com/artchristech/fillin link from web/index.html
footer, web/onboard.html final CTA, web/pitch.html cta-row,
web/pricing.html self-host tier, web/changelog.html nav-cta,
agents.json repository field, smithery.yaml repository field. Replaced
with Smithery (verified third-party signal), /onboard.html, or
self-host mailto where appropriate. Re-add when repo visibility flips.
.gitignore: PHASE_R_PLAN.md + PHASE_R_ACCELERATE.md (internal strategy
artifacts, never publish).
web/signup.html intentionally left out of this commit — has my F1 edit
overlapping with Stripe/multi-chain in-flight work; will ship together.
2026-05-14 · 21b2976
landing: promote teal v2 — agent-first hero, 7-tool spec, x402 callout
- Recolor accent to logo teal (#3fb8ad), retire yellow-green
- Trim from 1384 → 877 lines; cut animated demo, slim sections
- Surface machine-readable endpoints (agents.json, openapi.json, /v1/corpus, MCP) as hero pills
- Add 7-tool spec table + decision-tree for "when to call what"
- Live status strip polling /v1/corpus
- Logo plate that matches JPG bg for invisible edges (3 placements)
2026-05-14 · a0a3f7c
ts client: publish v0.1.0 as @artchristech/fillin-client
The @fillin scope doesn't exist on npm yet. Shipping under the user's
existing @artchristech scope so the package is installable today.
README notes the planned migration to @fillin/client once the org
exists. Description updated to the search-engine-for-agents framing.
Live: https://www.npmjs.com/package/@artchristech/fillin-client
2026-05-14 · 1403208
docs: TS client README + HN post catch up to 7-tool reality
clients/fillin-ts/README.md:
- Hero reframed as search-engine-for-agents + names the 3 daily slices
- Quickstart comments the generic /query as $0.01 explicitly
- Methods section calls out that slice routes (cves/papers/frontier) +
/answer ship in the API but aren't yet wrapped — direct fetch or PR
- Surfaces /signup (20 free queries) alongside /pricing
HN_POST.md (drafts for Show HN — not on the live site):
- Title swapped to "Show HN: Fillin – search engine for AI agents (...)"
- Body reframes around search engine + 6 corpora + 7 tools + per-slice
pricing rationale
- Adds rediscovery-cost framing for differentiated pricing
- Updates corpus claim ("thin / HN+arXiv") to current reality (22.8k docs,
6 sources, two-tier ingest)
- Adds eval evidence (n=23 Opus, ~3.4× fewer tokens, ~2.25× cheaper)
- Adds two more anticipated questions (per-slice pricing, MCP location)
- Repo link points at github.com/artchristech/fillin (was placeholder)
Defer: docs/ZHC_STRATEGY.md, STRIPE_PLAN.md, OUTREACH.md, SUBMISSION.md
still reference flat $0.01 — internal-only, doesn't affect any user-facing
surface.
2026-05-14 · 3c7c1de
docs + /pricing: catch up to the 7-tool, search-engine-for-agents reality
The flat-$0.01 framing was actively misleading now that /query/cves,
/query/papers, /query/frontier each carry their own price. README still
described 4 sources from before the daily-snapshot work.
/pricing (web/pricing.html):
- h1: "Per-call pricing. In USDC. On Base." (was "Flat $0.01 per query")
- New per-tool table (7 rows) with prices and intended use
- "Try before you pay" surfaces both /v1/probe and /signup
- Three rails kept (x402, bearer/Stripe, self-hosted) but reframed —
same prices, different settlement
- "Why $0.01" replaced with "Why these prices" — rediscovery cost
framing for differentiated pricing
- FAQ "Is there a free tier?" answer corrected (yes, two paths)
README.md:
- Hero: search-engine-for-agents framing + 7-tool table front-and-center
- Stack section: 6 corpora named (HN/arXiv/RSS/GHReleases/CVEs/papers/frontier),
two-tier ingest (30min scheduler + 1pm MT daily snapshots) called out
- Try-before-you-pay: surfaces /v1/probe AND /signup trial bearer key
- Quickstart: pip install -r requirements.txt instead of stale per-package
list; mentions the slice routes alongside /query
Tests: 146/146.
2026-05-14 · fddbc02
deploy/nginx: relax CSP for rich landing pages
Original policy was default-src 'none' — fine when the API only ever
served JSON, broken now that we ship a styled landing with inline
<style>, Google Fonts, inline <script>, and /v1/corpus + /v1/probe
fetches from the page. Browsers correctly refused everything,
rendering the live site as plain serif HTML.
New policy keeps the lockdown shape (no third-party scripts, no
iframes, no foreign image hotlinks, no foreign connect targets) and
opens only what the page actually needs:
- img-src 'self' data:
- style-src 'self' 'unsafe-inline' https://fonts.googleapis.com
- font-src 'self' https://fonts.gstatic.com data:
- script-src 'self' 'unsafe-inline'
- connect-src 'self'
Mirrors the shape used by sister site /etc/nginx/sites-available/glyph.
Live config patched in-place + nginx -t green + reload (zero downtime).
Backup at /root/fillin.bak.20260514 on snapback.
2026-05-14 · 2dc75fa
landing: integrate hourglass logo (replaces accent-dot brand mark)
- web/logo.jpg: 1024x1024 teal hourglass mark (the new brand)
- web/index.html: nav brand swaps the .dot accent for an <img class="logo-mark">
at 26px with 5px radius. Adds <link rel="apple-touch-icon">.
- api.py: new GET /logo.jpg route via _serve_static
- Versioned URL (?v=1) in HTML to dodge Cloudflare's 4h cache of the
pre-route 404. Future logo swaps bump the version.
2026-05-14 · bb64ca6
ingest: arxiv min window 24h → 72h
Diagnostic dig: 24h was insufficient. arXiv publishes once per weekday
~17:30 UTC. Querying at 19:00 UTC, the latest batch is from yesterday's
17:30 → only 25.5h ago, just outside a 24h window. Result: scheduler
pulled 50 entries every tick, all dropped as out-of-window, arxiv corpus
went 15 days stale.
72h covers the worst case (Monday morning needing Friday's batch over
the weekend gap). Verified live: scheduler now pulls +1055 arxiv docs
on its first tick after the change.
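The window arithmetic above can be checked directly. The 17:30 UTC batch time and the 19:00 UTC query time come from the commit message. The specific dates and the Monday 17:00 query time are illustrative assumptions.

```python
from datetime import datetime, timezone


def age_hours(published: datetime, now: datetime) -> float:
    """Age of a document at query time, in hours."""
    return (now - published).total_seconds() / 3600.0


def in_window(published: datetime, now: datetime, window_hours: float) -> bool:
    return age_hours(published, now) <= window_hours


UTC = timezone.utc
# Weekday case: yesterday's 17:30 batch, queried at 19:00 today.
wed_batch = datetime(2026, 5, 13, 17, 30, tzinfo=UTC)
thu_query = datetime(2026, 5, 14, 19, 0, tzinfo=UTC)
# Weekend case: Friday's batch, queried Monday shortly before the next batch lands.
fri_batch = datetime(2026, 5, 15, 17, 30, tzinfo=UTC)
mon_query = datetime(2026, 5, 18, 17, 0, tzinfo=UTC)

print(age_hours(wed_batch, thu_query))      # 25.5
print(in_window(wed_batch, thu_query, 24))  # False
print(in_window(fri_batch, mon_query, 72))  # True
```

The weekend case comes to 71.5h, which is why 72h rather than 48h is the smallest round window that still catches Friday's batch on Monday.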
146/146 tests pass.
2026-05-14 · 39958bd
ingest: per-source min window, arxiv floored at 24h
The scheduler runs every 30min with INGEST_WINDOW_HOURS=2, but arxiv
publishes in weekday batches and the API page-0 newest doc can be
24-72h old. With a 2h window every tick reads 50 entries and drops
all 50, leaving the arxiv corpus stale (15+ days at last check).
Fix: SOURCE_MIN_HOURS = {"arxiv": 24} in ingest.py applies a per-source
floor: max(hours, SOURCE_MIN_HOURS.get(name, 0)). Other sources keep
the caller-supplied window. Diagnostic log line now includes the
effective window per source.
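The floor expression from the fix, wrapped in a function for illustration. `SOURCE_MIN_HOURS` and the `max(...)` expression are quoted from the commit message, while the function name `effective_window` is an assumption.

```python
SOURCE_MIN_HOURS = {"arxiv": 24}


def effective_window(name: str, hours: int) -> int:
    """Caller-supplied window, raised to the per-source floor when one exists."""
    return max(hours, SOURCE_MIN_HOURS.get(name, 0))


print(effective_window("arxiv", 2))   # 24
print(effective_window("hn", 2))      # 2
print(effective_window("arxiv", 48))  # 48
```

The floor only ever widens a window, so a manual backfill with a larger `hours` value is unaffected.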
146/146 tests pass.
2026-05-14 · d25d360
data refresh + cron to 1pm MT + smithery.yaml comprehensive sync
- Fired daily snapshot manually; corpus now at 21,677 rows (was 21,621);
cves +55 new, frontier +1, papers +0 (HF daily + bioRxiv both empty
for the last 24h — upstream sparseness, not a regression). The
read-ids + chunk-size=25 add strategy held: 300 docs ingested with
zero OOM.
- Cron rescheduled on snapback VPS: CRON_TZ=America/Denver, 0 13 * * *
(was 15 6 UTC). DST-aware via vixie cron (Debian 3.0pl1-162).
CRON_TZ scoped to the fillin entry only — other crontab jobs
(snapback, glyph, milkncookies) keep their original UTC schedule.
- smithery.yaml: comprehensive update — corpus surface now names all
6 active source families (HN, arXiv, RSS, GH releases, cves, papers,
frontier) with row count; /v1/probe free-tier path surfaced; /signup
trial-key (20 free queries) surfaced; corpus_match signal documented
in fillin_query example; daily 1pm MT cadence noted.
- agents.json: synced — added query_cves / query_papers / query_frontier
to mcp.tools, expanded corpora list with the 3 new sources, replaced
flat $0.01 pricing with per-tool table + free_paths + snapshot_cadence.
Diagnosis (no fix shipped per scope): arXiv staleness (newest doc
2026-04-29 in DB, while live API has docs through 2026-05-13) is
caused by scheduler INGEST_WINDOW_HOURS=2 being shorter than arXiv's
typical submission cadence — every 30-min tick reads page 0, sees the
50 newest entries are all >2h old, and writes 0. Fix would be either
a wider window for arxiv specifically, or graduating arxiv into the
daily-slice runner with --hours 24. Out of scope for this commit.
2026-05-14 · 0f0e739
landing: restore rich page + reframe as search engine for AI agents
The previous commit (0f73475) accidentally overwrote the rich 1381-line
landing page with a stale 519-line older version. This restores the
rich page (hero animation, demo, how-it-works, onboard, api, pricing,
proof, stack sections) AND applies the search-engine-for-agents
reframing on top:
- Title + meta + og + twitter: "search engine for AI agents"
- Eyebrow + lede: name the 3 daily slices and prices inline
- API section: "Five endpoints" with each route + price listed
- Pricing card: per-slice price list ($0.01-$0.05 range), removed
the misleading "100 queries per 1 USDC" line that only held for
the flat $0.01 tier
2026-05-14 · 0f73475
landing: reframe as search engine for AI agents + add 3 daily slices
- Title + meta + og: "search engine for AI agents" framing
- Tagline: name the 3 slices + their prices (CVEs $0.02, papers $0.03,
frontier $0.05)
- Stats strip: price tile shows the $0.01-$0.05 range, links to /pricing
- Endpoints table (section 03): added /query/cves, /query/papers,
/query/frontier, /answer rows with prices inline
- MCP section (section 05): "Seven tools" with prices listed for each
Section 07 corpus table auto-populates from /v1/corpus and will pick up
the new sources without code change.
2026-05-14 · 67ac88d
launch readiness: onboard XSS fix, CSO-cleared diff, internal docs gitignored
CSO daily-mode audit before pushing fillin_daily to public main:
* Fixed: web/onboard.html href XSS (MEDIUM) — added safeHref() URL
scheme allowlist (http/https only); applied escapeHtml to source +
published_at for defense in depth.
* Verified-fixed in code: XFF rate-limit bypass HIGH (CF-Connecting-IP
enforcement at api.py:67-81 + ufw lockdown via deploy/lockdown_origin.sh).
* Gitignored internal artifacts: CONTEXT.md, RECONCILE.md,
research-report.md, .gstack/ — never publish (working specs +
research notes that don't belong in the public repo).
* Updated security-report.md with the 2026-05-11 audit (consistent
with prior public-audit policy).
Bundles other launch-ready work that was already staged: README +
landing-page polish, /signup flow, agents.json mainnet update, new
tests/test_corpus_and_probe.py (13 tests), thesis v1+v2 pages, eval
artifacts, deploy hardening (lockdown_origin.sh).
Tests: 146 passing.
2026-05-14 · 3a69321
fillin_daily: 3 daily snapshot slices + differentiated pricing
Adds CVE / papers / frontier daily ingestors and paid MCP routes to turn
fillin into a search engine for AI agents. Each slice is its own paid lane
with prices set by elasticity:
- query_cves ($0.02) — NVD + GitHub Security Advisories + OSV
- query_papers ($0.03) — HuggingFace daily papers + bioRxiv (unioned with arxiv)
- query_frontier ($0.05) — OpenAI/Anthropic/DeepMind/Meta/Mistral feeds + HF trending
Backend:
- sources/{cves,papers,frontier}.py + scripts/ingest_*.py + run_daily_snapshots.sh
- QueryIn.sources whitelist filter; rejects unknown values 400
- Three paid routes /query/{slice}; single auth path via factory
- db.upsert chunked via FILLIN_UPSERT_CHUNK_SIZE (default 25); switched
merge_insert -> read-existing-ids + add to fix VPS OOM at 21k-row scale
- server-card.json advertises the 3 new tools
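The read-existing-ids + chunked add strategy from the db.upsert bullet can be sketched against an in-memory stand-in. `ListTable` is a hypothetical substitute for the real table, and the function is not the code in db.py. Only the default chunk size of 25 comes from the commit message.

```python
from typing import Dict, Iterable, List


class ListTable:
    """Hypothetical in-memory table exposing just what the sketch needs."""

    def __init__(self) -> None:
        self.rows: List[Dict] = []
        self.add_calls = 0

    def ids(self) -> List[str]:
        return [r["id"] for r in self.rows]

    def add(self, batch: List[Dict]) -> None:
        self.rows.extend(batch)
        self.add_calls += 1


def upsert(table: ListTable, docs: Iterable[Dict], chunk_size: int = 25) -> int:
    """Insert only unseen ids, in bounded batches; returns the number of rows added."""
    seen = set(table.ids())  # one pass over the id column only
    fresh: List[Dict] = []
    for doc in docs:
        if doc["id"] not in seen:
            seen.add(doc["id"])  # also dedupes repeats within this call
            fresh.append(doc)
    for start in range(0, len(fresh), chunk_size):
        table.add(fresh[start:start + chunk_size])
    return len(fresh)
```

Each write then touches at most `chunk_size` rows, which bounds the memory needed per write. The trade-off is that existing rows are never updated, only skipped.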
Smithery: yaml refreshed with new tools, prices, tags, and 3 example
invocations matching the search-engine-for-agents framing.
Tests: 122 -> 146 (+15 slice routes/auth/pricing, +1 chunked upsert dedup).
2026-05-13 · 54557a3
fillin_answer: synthesized-mode tool for weaker LLM callers
The eval showed fillin's value depends on the calling model's tool-synthesis
skill — Opus 4.7 extracts citations cleanly (4.29/answer); Nemotron-120B free
barely does (2.3, mostly hallucinated from training). To be a true enhancement
on *any* model, fillin needs to do the synthesis itself for callers that can't.
New tool: fillin_answer(query, cutoff, k)
Returns a 150-250 word answer with inline [title](url) citations, grounded
in post-cutoff retrieved docs. Server runs Haiku 4.5 over the top-k results
with a constrained system prompt (no outside knowledge, echo dates, refuse
if docs don't address the query). Raw docs are returned alongside for
verification.
Pricing: $0.02 USDC / call (vs $0.01 for fillin_query). Bearer keys unmetered.
x402 path refunds on synthesizer error or no-relevant-docs case so payers
never lose USDC for a non-synthesis.
Cold-start safe: falls back to {answer: null, reason: "synthesizer_not_configured"}
when ANTHROPIC_API_KEY is unset. x402 callers are auto-refunded the $0.02.
Cheap-signal addition: fillin_query and the in-proc MCP variant now return
top_score + corpus_match ("strong" | "weak" | "none") so agents can skip
the paid call when retrieval found nothing.
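One way an agent might use the cheap signal, as a sketch. The three labels come from the commit message. The numeric thresholds and both function names are hypothetical, since the real cut-offs live in synthesize.py and are not stated here.

```python
from typing import Optional

# Hypothetical thresholds on a similarity-style top_score.
STRONG_AT = 0.60
WEAK_AT = 0.30


def corpus_match(top_score: Optional[float]) -> str:
    """Bucket the best retrieval score into "strong" | "weak" | "none"."""
    if top_score is None or top_score < WEAK_AT:
        return "none"
    return "strong" if top_score >= STRONG_AT else "weak"


def worth_paying_for_answer(match: str) -> bool:
    """Agent-side gate: skip the paid synthesis call when retrieval found nothing."""
    return match != "none"
```

An agent would read `corpus_match` from the fillin_query response and call fillin_answer only when the gate returns True.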
- synthesize.py: corpus_match thresholds + Haiku-backed synthesizer
- api.py: /answer endpoint, ANSWER_PRICE_USDC, AnswerOut model,
require_paid_or_key_answer dependency, refund-on-no-synthesis logic
- mcp_server.py: fillin_answer tool with both in-proc and remote paths
- agents.json + smithery.yaml + /.well-known/mcp/server-card.json: register
the new tool and pricing
- tests/test_synthesize.py: 10 new unit tests for corpus_match,
extract_citations, and the no-key fallback (no live API calls)
Test status: 122/122 passing (10 new).
2026-05-13 · cdde415
launch polish: embedder pre-warm + smithery.yaml example invocations
- api.py: lifespan now warms the embedder singleton with a dummy encode at
startup, so the first real fillin_query doesn't pay the ~2-3s
sentence-transformers cold-load cost (showed up as a p99 latency spike
on the Smithery performance dashboard)
- smithery.yaml: add four `examples:` entries (release-notes query, research
query, free health check, free corpus stats) so the listing surfaces
realistic call shapes to reviewers and prospective agent developers
Verified live: '[fillin] embedder warm' logs on snapback after restart.
2026-05-12 · e1885c9
launch prep: /signup trial-key flow, Proof section, mainnet-correct smithery.yaml
- web/signup.html + GET /signup route: 20-free-query trial via mailto prefilled
with name/use-case/harness/cutoff, so the "keys issued at fillin.glyphapi.dev"
promise has a real destination
- web/index.html: new #proof section with the n=8 head-to-head numbers
(Fillin = 2.25× cheaper than web search, 11× more inline citations);
/signup link added to nav, CTA, and pricing card
- smithery.yaml (new file, will commit to surface on registry): removes the
"free public tier" claim that didn't exist, points at /signup
- api.py: bundles already-deployed CF-Connecting-IP rate-limit hardening so
origin matches what's running on snapback (rsync had been the source of truth)
2026-05-06 · 70f9ec7
/freshness: cache 60s — public endpoint, full column scan was 8s cold
The freshness endpoint runs db.stats() which scans the published_at +
source columns over the full ~15k-row table; first call after a worker
restart took 8+ seconds and would time out the landing-page widget. On a
public no-auth endpoint, that is also a trivial accidental DoS vector.
Module-level dict cache with 60s TTL. The 'now' field is still computed
per request so consumers can detect a stale cache if needed.
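A sketch of the module-level cache, assuming a `compute_stats` callable stands in for db.stats(). The 60s TTL and the per-request `now` field come from the commit message, and the remaining names are assumptions.

```python
import time
from typing import Callable, Dict

TTL_S = 60.0
_CACHE: Dict[str, object] = {"stats": None, "expires_at": 0.0}


def freshness(
    compute_stats: Callable[[], dict],
    clock: Callable[[], float] = time.monotonic,
    wall: Callable[[], float] = time.time,
) -> dict:
    """Serve cached stats for up to TTL_S seconds; stamp "now" on every response."""
    t = clock()
    if _CACHE["stats"] is None or t >= _CACHE["expires_at"]:
        _CACHE["stats"] = compute_stats()  # the expensive full-column scan
        _CACHE["expires_at"] = t + TTL_S
    # "now" is deliberately outside the cache so clients can spot stale stats.
    return {**_CACHE["stats"], "now": wall()}
```

Using a monotonic clock for expiry and the wall clock for the `now` field keeps the TTL immune to system clock adjustments.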
2026-05-06 · fdf9608
db.stats(): include per-source row counts
The /freshness endpoint and the /status board both surface a "Sources"
widget driven by stats().by_source — but stats() never populated it,
so the dashboards rendered "0 sources". Adds a pylist sweep over the
source column (already loaded for the min/max scan) and groups counts.
No additional DB scan: reuses the same fetch.
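The grouping step amounts to a counter over the already-fetched source column. This sketch assumes the column arrives as a plain Python list, and the helper name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def count_by_source(source_column: Iterable[Optional[str]]) -> Dict[str, int]:
    """Group row counts by source, ignoring null entries."""
    return dict(Counter(s for s in source_column if s is not None))


print(count_by_source(["hn", "arxiv", "hn", None, "rss"]))
# {'hn': 2, 'arxiv': 1, 'rss': 1}
```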
2026-05-06 · dcd7b88
CI: pytest + tsc + npm test + changelog build on push and PR
Three jobs in .github/workflows/test.yml:
- python: pytest with eval/ excluded (live-network suite, not for CI).
- typescript: cd clients/fillin-ts; tsc --noEmit; npm test; npm run build.
- changelog: runs tools/build_changelog.py and asserts the output is
>5KB, so a regression in the generator can't ship green.
Caches pip and npm by lockfile. Triggers on push to main, PR to main,
and workflow_dispatch.
2026-05-06 · a2f7fc5
Add @fillin/client TypeScript SDK
Most agent code in 2026 is TypeScript (Mastra, LangGraph TS, Vercel AI
SDK, Cloudflare Agents). Shipping only a Python client excluded that half
of the developer population; this SDK opens it up.
- src/index.ts: FillinClient with .query / .stats / .freshness /
.health / .paymentInfo. Bearer auth via apiKey option, with an
injectable fetch for testing or custom transports. AbortController
honors timeoutMs (default 30s). FillinError carries .status + .body.
- test/client.test.ts: three smoke tests — bearer header gets sent,
freshness works without a key, non-2xx throws FillinError. Uses
node:test, no external test framework.
- README.md: install, quickstart, options table, methods, errors.
- tsconfig.json + package.json wired for `npm run build` (tsc → dist/).
- .gitignore so node_modules + dist stay out.
Tests pass (3/3), tsc --noEmit clean. Not yet published to npm.
2026-05-06 · 850fa42
Site infrastructure: pricing, status, changelog, freshness, SEO
Builds the surface that was missing for a real product:
- /pricing — flat $0.01 page covering both x402 and bearer rails,
plus FAQ and a Stripe placeholder cell for when the checkout lands.
- /status — live in-browser health board over /healthz, /freshness,
/v1/payment/info, /agents.json. No server-side history yet (deferred
until usage justifies a time-series store).
- /changelog — generated from git log via tools/build_changelog.py;
baked into deploy/install.sh so every deploy refreshes it.
- /freshness — public corpus stats endpoint (no auth) backing both the
status board and a new freshness strip on the landing page.
- /robots.txt + /sitemap.xml + og.svg + favicon.svg — basic SEO and
social-share surface (was zero before).
- _shared.css — shared design tokens for sub-pages so they match index
without each duplicating ~500 lines of CSS.
api.py exposes all of the above plus _serve_static for asset routes
(robots/sitemap/favicon/og/css). _serve_html keeps a JSON fallback so
tests pass without staging the web/ directory.
Landing-page nav now actually links to onboard.html, pitch.html,
pricing, status, and changelog — they had been dangling html files
with no entry points.
2026-05-06 · 78eeb1b
agents.json: declare HTTP transport as primary, stdio as alt
The card claimed transport=stdio but the production deploy at
fillin.glyphapi.dev mounts FastMCP at /mcp via streamable-http.
Agents that fetched the card and tried to spawn the stdio process
would have looked at the wrong rail. Now lists both transports with
streamable-http as primary and exposes the http_endpoint inline.
2026-05-06 · cf3c5fa
Fix test isolation: reload mcp_server before reloading api
Each api fixture reloaded api.py inside a fresh TestClient. The lifespan
ran _mcp_instance.session_manager.run(), which raises after the first call
on a given FastMCP instance. Because mcp_server wasn't reloaded, the
singleton was reused and the second test in either test_api_limits.py or
test_api_x402.py exploded — 14 errors total on every CI/local run.
Reloading mcp_server first gives api.py a fresh FastMCP, so the session
manager starts cleanly for each test.
Before: 95 passed, 14 errors. After: 109 passed.
2026-05-06 · 47407d6
Serve landing/onboard/pitch HTML at /, /onboard.html, /pitch.html
Before this, web/index.html (1238 lines) was rsynced to the VPS but no
route served it — GET / returned the JSON stub. Visitors to the canonical
domain saw {"product":"Fillin","tagline":"..."} instead of the landing page.
Adds three FileResponse routes that fall back to a small JSON payload
when the file is missing (so tests run without staging the web/ dir).
Also tracks web/pitch.html which had been left untracked since Apr 30.
2026-05-06 · 8cb6838
Swap fillin.dev → fillin.glyphapi.dev across docs and config
User does not own fillin.dev — the domain's NS records point at Vercel
and currently return DEPLOYMENT_NOT_FOUND. Anyone who registers it
later could harvest bearer tokens and x402 payment headers from agents
that wired up via the README. Canonical domain is fillin.glyphapi.dev
(snapback VPS, server_name in deploy/nginx-fillin.conf:22).
24 string references across README, agents.json, mcp_server.py,
embeddings.py, web/index.html, tools/load_test.py, and docs/.
2026-04-30 · 7d9e68b
Eval economics: real partial baseline (n=7-8 / 25 queries)
Three-arm head-to-head: claude-alone vs claude+websearch vs claude+fillin.
Same model (Opus 4.7), same system prompt, same questions; only tools differ.
Run halted at run 28/75 due to Anthropic credits exhaustion. Numbers below
are from the 23 successful runs only — explicitly partial, explicitly noted.
Per-arm averages on 7-8 queries each:
claude alone 147 in / 800 out / 0 tools / 1.88 cite / $0.021/q
claude+websearch 36k in / 2010 out / 1.5 tools / 0.38 cite / $0.247/q
claude+fillin 11k in / 1392 out / 2.1 tools / 4.29 cite / $0.110/q
Headline: at this query mix, Fillin used 3.4× fewer input tokens, cost 2.25×
less per query, and produced 11× more inline citations than Anthropic's
hosted web_search. Latency was modestly faster too. Total real spend across
the 23 runs: $2.91.
Caveats are documented openly in eval/baseline.md — sample is small,
citation metric is "URLs in answer text" (websearch uses footnote-style
refs that don't render as URLs), and the query mix is biased toward
release notes + research papers where Fillin's corpus is well-aligned.
To complete: top up credits + python tools/eval_economics.py. ~$6 to finish.
2026-04-30 · eb270af
Reframe onboarding: model is the car, cutoff is the odometer
Old onboarding asked "pick your harness" first — but the harness is
just plumbing. The thing Fillin actually solves for is the model's
cutoff. New flow puts that first.
Hero: "Your agent is driving with outdated maps."
Sublede: "Fillin is the embeddings & context-engineering service that
closes the gap, on every query, autonomously."
Step 1 is now MODEL — pick from 12 cards (Claude Opus 4.7, Sonnet 4.6,
GPT-5.5, Gemini 2.5, Llama 4, Grok 4, Mistral L3, etc.) each showing
its training cutoff. Custom date row underneath. The moment a model
is selected, an inline gap-readout panel reveals:
- days of blind spot
- estimated # arxiv papers, HN posts, GH releases missed
- sharp pitch: "without Fillin your agent refuses or hallucinates"
Step 2 is harness selection (was step 1). Step 3 config snippet
auto-populates from the picked harness. Steps 4-5 verify + first
query (cutoff pre-filled from step 1). Step 6 done state shows the
model + cutoff that's now wired.
URL hash captures m=, c=, h=, s= so any state is shareable as a link.
Reorder reflects the strategic claim: the user is wiring up *context
engineering for an agent that runs an X with cutoff Y*, not just
"installing a tool".
2026-04-30 · e82890b
Agent onboarding wizard + CORS for browser-based MCP calls
web/onboard.html — 5-step interactive onboarding flow:
1. Pick your harness (Claude Code, Cursor, Continue, Zed, Eliza, curl)
2. Drop in tailored MCP config (copy-button, harness-aware)
3. Verify connection — live fetch to /mcp/ shows reachability + corpus
stats inline
4. Run a real query — input cutoff + topic, see ranked results with
clickable source URLs, plus copy-able curl version
5. Done state — what tools are now available, links to next steps
URL-hash state so steps are linkable. Live status indicator. Same
aesthetic as web/index.html and web/pitch.html.
Bootstrap nginx now ships CORS headers so pages from any origin
(including file://) can hit /mcp/ for verification + queries.
Access-Control-Allow-Origin: * is fine because every authenticated path
is still gated by bearer or
x402 — CORS just controls whether the *browser* lets the calling JS
read the response. Live nginx already patched; this keeps the
bootstrap in sync.
2026-04-30 · cd79e14
arxiv: bump max_pages 4→30 so backfills can paginate past ~48h
Each fetch() call starts at start=0 and pages forward in submittedDate-desc.
With max_pages=4 × PAGE_SIZE=50 = 200 papers per call, we never see beyond
the most recent ~2 days at current ~100 papers/day in our cats. Caught when
chunked backfill produced 0 net new docs after the 48h window — every
subsequent chunk re-fetched the same 200 papers and dedup-filtered them.
30 pages × 50 = 1500 paper ceiling, ~15 days at current rate, plenty for
both incremental and historical backfills. Per-page throttle still 3s, so
worst-case wall clock is ~90s.
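
The arithmetic behind the old and new ceilings, as a small check. The figures (PAGE_SIZE=50, ~100 papers/day, 3s throttle) come from the message above; the function name `coverage` is an assumption.

```python
PAGE_SIZE = 50


def coverage(max_pages, papers_per_day=100, throttle_s=3):
    """Return (paper ceiling, days of history reachable, worst-case throttle seconds)."""
    ceiling = max_pages * PAGE_SIZE
    return ceiling, ceiling / papers_per_day, max_pages * throttle_s


old = coverage(4)    # ceiling before the bump
new = coverage(30)   # ceiling after the bump
```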
2026-04-30 · 8b22564
arxiv: bump request timeout 30→60s; submittedDate-desc is slow for large windows
2026-04-30 · d25e39d
Fix arXiv ingest: space-as-OR encoding (was returning 0 silently)
_search_query joined categories with literal "+OR+" assuming the URL
would carry it through. requests, via its params= argument, double-encoded the '+' to
'%2B', which arXiv interprets as a literal plus character (not the
URL-form '+' = space) — silently returns an empty feed.
Switch to " OR " as the connector. requests then form-encodes the
spaces as '+', which is exactly the wire format arXiv wants.
Verified live: 88 papers in the last 24h across cs.{AI,LG,CL,CR,DC}.
This fills the second-largest gap in our corpus (rss came back with
51 docs, gh_releases with 12, arXiv was 0 since the multi-source
refactor landed).
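
The encoding difference can be reproduced with the standard library alone. This sketch uses `urllib.parse.urlencode` as a stand-in for the form-encoding the message describes; the `cat:` prefix and the `search_query` parameter name are assumptions about the query shape, not taken from the commit.

```python
from urllib.parse import urlencode

cats = ["cat:cs.AI", "cat:cs.LG"]

# Pre-encoded connector: the literal '+' is percent-encoded to %2B on the wire.
broken = urlencode({"search_query": "+OR+".join(cats)})

# Plain-space connector: form-encoding turns each space into '+'.
fixed = urlencode({"search_query": " OR ".join(cats)})

print(broken)
print(fixed)
```

The second form is the one that puts `+OR+` on the wire, which matches the behavior the fix relies on.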
2026-04-29 · 1eddf0d
Fix scheduler: drop max_pages kwarg that ingest() no longer accepts
scheduler.py was minted in v0 calling ingest(hours, max_pages). The
multi-source refactor (commit 27f56cf) dropped max_pages from ingest()
in favor of per-source defaults; scheduler.py was never updated. Every
tick since has raised a TypeError that surfaced only in the journal. Corpus
went stale by ~4 days; only caught when checking MCP status today.
Tests don't exercise the scheduler entrypoint — that's the blind spot.
Adding an integration smoke test for it is queued for next pass.
After deploy, run a one-time backfill: ingest --hours 96 to recover
the gap.
2026-04-29 · 82be046
Audit log: M1 + M2 closed and verified live
Both Medium findings resolved. Live HTTPS responses now carry
Strict-Transport-Security and Content-Security-Policy headers; live
config patched in-place. Bootstrap config refuses non-ACME HTTP traffic
with 503 on a fresh deploy. Global exception handler registered as
defense-in-depth against future debug=True slippage.
2026-04-29 · bdab800
Fix MEDIUM: bootstrap nginx exposes API in cleartext, no global exc handler
M1: deploy/nginx-fillin.conf bootstrap was proxying API traffic on
plain HTTP between install.sh and certbot. Bearer tokens and signed
payment headers leaked. Fix: bootstrap now refuses with 503 except for
ACME challenge path. certbot --nginx --redirect overwrites the catch-
all with a 301 to HTTPS once the cert is in. Hardening headers (HSTS,
CSP) baked into the bootstrap so the post-cert config inherits them.
HSTS pinned at max-age=63072000 with includeSubDomains. preload is
deliberately omitted — it submits to browser preload lists and is
effectively irreversible. Promote later when the domain is months-stable.
CSP: default-src 'none'; frame-ancestors 'none'; (API serves JSON only).
Live config patched in-place with the same headers; verified live —
HSTS + CSP appear on every response.
M2: api.py had no global Exception handler. A future debug=True
regression would leak full Python tracebacks on /query. Added a
@app.exception_handler(Exception) that logs the traceback server-side
and returns a generic {"error":"internal_error"} 500. FastAPI's
HTTPException + RequestValidationError handlers fire first, so this
is purely the catch-all for unexpected errors.
2026-04-28 · 7211fc6
Audit log: H1 (rate-limit-bucket-collapse) closed and verified
Verified live on snapback with bucket=2: IP A (1.1.1.1) and IP B
(2.2.2.2) each got 200/200/429 from separate buckets. Pre-fix, both IPs
would have shared one bucket (A: 200/200/429, B: 429/429/429).
2026-04-28 · 224a90d
Fix HIGH: per-IP rate limits collapse to one bucket behind nginx
/security-scan caught it. uvicorn was launched without --proxy-headers,
so scope["client"] was always 127.0.0.1 (nginx loopback). All three
limits collapsed to a single global bucket: slowapi 30/min on /query,
10/min on /v1/payment/*, and the /mcp middleware's 60/min.
Fix: add --proxy-headers --forwarded-allow-ips=127.0.0.1 to the systemd
unit. uvicorn's ProxyHeadersMiddleware then rewrites scope["client"]
from X-Forwarded-For *only when the connecting IP is 127.0.0.1*, so
public clients cannot spoof. nginx is the sole trusted proxy.
Both slowapi (via get_remote_address) and the /mcp middleware
(via scope["client"]) now key on the real client IP automatically.
Updated the comment that lied about --proxy-headers being set.
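
The trust rule can be illustrated in plain Python. This is a simplified sketch of the idea, not uvicorn's actual middleware code; `effective_client_ip` and `TRUSTED_PROXIES` are hypothetical names.

```python
TRUSTED_PROXIES = {"127.0.0.1"}  # nginx on loopback is the only trusted hop


def effective_client_ip(peer_ip, x_forwarded_for=None):
    """Pick the address to rate-limit on.

    The X-Forwarded-For header is honored only when the TCP peer is a
    trusted proxy; otherwise the header is ignored and the peer is used.
    """
    if peer_ip not in TRUSTED_PROXIES or not x_forwarded_for:
        return peer_ip
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Walk right to left: the rightmost untrusted hop is what the proxy saw.
    for hop in reversed(hops):
        if hop not in TRUSTED_PROXIES:
            return hop
    return peer_ip
```

A client that connects directly and sends its own X-Forwarded-For gets keyed on its real peer address, and a spoofed entry prepended by a client behind nginx sits to the left of the address nginx appended, so it is never selected.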
2026-04-25 · ac66c47
Security audit doc: 4 critical fixes + 2 P0/P1 opens
Captures the full audit run today: findings, what's fixed, what remains
open, what was confirmed safe. Includes pre-deploy .env hygiene checklist
to prevent the wallet-PK-in-env mistake from happening again.
Open critical: rotate the testnet wallet that surfaced during the audit
(plaintext FILLIN_WALLET_PRIVATE_KEY in /opt/fillin/.env). Move private
keys to KMS/systemd-creds/age-encrypted file before mainnet flip.
Verified safe: constant-time bearer compare, atomic spend-with-nonce,
EIP-191 verification, micro-USDC accounting, refund-on-failure, /mcp
rate limit (5×200 → 7×429 confirmed live), nginx SSE buffering off.
2026-04-25 · de964fd
nginx: disable buffering for SSE / streamable-HTTP MCP
Default proxy_buffering=on holds the entire text/event-stream response
until the upstream closes. For MCP's streamable-HTTP transport, that
means tool-call responses appear to hang from outside even though the
server returned in <1s. Disable buffering on the proxy_pass so chunks
flush to the client as they arrive.
Live config patched in-place; this update keeps the bootstrap config in
sync for fresh deploys.
Caught while running the production-grade audit; before the fix, live
fillin_query through HTTPS was indistinguishable from a 30s timeout.
After: 0.57s.
2026-04-25 · e212e8a
Production-grade MCP fixes: deadlock, refund, rate limit
Audit found three real production issues. All fixed.
C1 (CRITICAL — production-breaking): MCP tool *calls* deadlocked through
the deployed HTTPS endpoint. fillin_query/stats/health each made HTTP
loopback calls to localhost:8766 from inside the same FastAPI worker
that was holding an SSE stream open for the MCP request. Classic event-
loop self-deadlock; resolved only at the 30s upstream timeout. Smithery's
probe passed because tools/list returns immediately, but no agent could
actually run a tool. Fix: detect co-mounted mode (loopback host) and
call db.query_delta / db.stats directly in-process. HTTP path preserved
for stdio mode where someone points at a remote Fillin.
C2: /mcp had no rate limit. slowapi only decorates FastAPI routes, not
mounted Starlette apps, so the public MCP surface was wide-open while
/query was capped at 30/min. Added a per-IP token-bucket as ASGI
middleware; default 60/min, override via FILLIN_MCP_RPM.
I1: x402 payers were charged before the query ran. A LanceDB lock or
embedding failure left them out-of-pocket. Added credits.refund() and
wrapped query_delta in api.py so server-side errors roll back the
deduction. Bearer mode unchanged (pays nothing).
Verified locally: fillin_health 0.8s, fillin_query 7s (cold) returning
real ranked results with proper scores in (0, 1].
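
A per-IP token bucket of the kind C2 describes might look like the sketch below. It covers only the bucket logic, not the ASGI middleware wrapper; the class name, constructor parameters, and injectable clock are assumptions. With the 60/min default mentioned above, capacity would be 60 and the refill rate 1 token per second.

```python
import time


class PerIPTokenBucket:
    """One bucket per client IP; each allowed request costs one token."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_per_sec = float(refill_per_sec)
        self.clock = clock
        self._state = {}  # ip -> (tokens, last_seen_timestamp)

    def allow(self, ip):
        now = self.clock()
        tokens, last = self._state.get(ip, (self.capacity, now))
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._state[ip] = (tokens, now)
        return allowed  # caller maps False to HTTP 429
```

Keying the state dict on the client IP is what keeps one noisy caller from exhausting the allowance of every other caller.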
2026-04-25 · 60f207d
deploy/install.sh: don't clobber certbot's nginx mods on re-deploy
Each subsequent deploy overwrote /etc/nginx/sites-available/fillin with
the bootstrap (HTTP-only) config, wiping certbot's HTTPS server block.
Caused HTTPS to break after every redeploy until 'certbot --nginx
--reinstall' was re-run.
Fix: only copy the bootstrap config on first deploy (when the file
doesn't exist). After that, certbot owns the file. nginx -t still
runs to validate.
Caught while wiring fillin.glyphapi.dev for smithery.
2026-04-25 · c18de4b
Allow public hosts through FastMCP's DNS-rebinding protection
FastMCP defaults allowed_hosts to localhost variants. Production hits
return 421 'Invalid Host header'. Override via FILLIN_MCP_HOSTS env
(comma-separated) and seed with fillin.glyphapi.dev so the live deploy
serves /mcp/ to remote clients.
Caught by smithery's probe failing.
2026-04-25 · 26af976
HTTP MCP transport: mount FastMCP at /mcp inside the FastAPI app
Smithery and most agent harnesses talk to MCP over Streamable HTTP, not
stdio. Same FastMCP instance now serves both — stdio when run as
'python mcp_server.py', HTTP at https://<host>/mcp when api.py boots.
- mcp_server.py: stateless_http=True, path "/" so mount at /mcp lands cleanly
- api.py: lifespan now wraps mcp.session_manager.run(); mount /mcp →
streamable_http_app() exposes the same fillin_query / fillin_stats /
fillin_health tools to remote clients
Verified locally — POST /mcp/ initialize returns server info, tools/list
returns all 3 tool schemas. Co-located in one process so loopback overhead
is negligible.
2026-04-25 · d88e896
Submission packet + agents.json manifest
User approved MCP catalog submission. Prepping the artifacts:
- agents.json (root): full manifest covering offer, endpoints, pricing,
payment rails, MCP tool schemas, the 4-call agent-bootstrap recipe,
rate limits, guarantees. The single URL an agent should be able to
read to bootstrap an integration without a human.
- api.py: GET /agents.json + GET /.well-known/agents.json serve it
(both standard discovery paths covered).
- docs/SUBMISSION.md: copy-paste packets for smithery.ai, mcp.so, the
punkpeye/awesome-mcp-servers PR (with diff line + body), parallel
awesome-list targets, Twitter draft, HN cadence, and crypto-native
registry plan (Olas, Virtuals, Eliza). Pre-submission checklist
included.
The actual catalog clicks + GitHub OAuth need a human — that's the
user's part. Everything else is ready.
2026-04-25 · 502be0c
Fix two bugs found via dogfood: score field + similarity formula
Ran agent.py against a fresh local Fillin (1693 docs, 4 sources). Two
quality bugs surfaced — neither would have shown up in unit tests because
both depend on real LanceDB distance distributions.
1. api.py:271 was returning raw L2 _distance as the API "score" field,
ignoring the reranked score entirely. Reranking was running but the
client never saw it.
2. ranking.py:effective_score used `1 - distance`, which clamps to zero
for any distance > 1.0. LanceDB returns L2 distance on unit-normalized
MiniLM vectors — typical "decent match" distances are 1.0-1.4, so
~80% of results collapsed to score=0 and the original LanceDB order
leaked through (defeating the rerank).
Fix: similarity = 1 / (1 + distance). Smooth monotonic decay in (0, 1].
No collapse, no orthogonal-equals-opposite degeneracy.
Also discovered (not yet fixed):
- RSS SSL cert verify fails on macOS Python; 4/5 default feeds bounce
- arXiv 5-day window returned 0 docs (probably a search-shape mismatch)
Tests: 19/19 passing. Re-ran the agent against the fix; top results for
"new LangChain releases" are now actual github.com/langchain-ai release
notes, not random HN posts.
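
The difference between the two formulas is easy to see on the distance range quoted above. A small sketch; the function names are illustrative and the `max(0, ...)` clamp is an assumption about how the old code reached zero.

```python
def clamped_similarity(distance):
    """Old behavior: 1 - distance, floored at zero."""
    return max(0.0, 1.0 - distance)


def smooth_similarity(distance):
    """New behavior: 1 / (1 + distance), strictly positive for distance >= 0."""
    return 1.0 / (1.0 + distance)


for d in (0.0, 0.6, 1.0, 1.2, 1.4):
    print(d, clamped_similarity(d), round(smooth_similarity(d), 3))
```

Under the clamp, distances of 1.2 and 1.4 both score zero and cannot be ordered; under the smooth form the closer match still ranks higher.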
2026-04-25 · 5abf820
ZHC strategy: rails for agents that buy services themselves
The Stripe path was last decade's framing. Real strategic claim: Fillin's
x402+USDC+Base+MCP stack is uniquely fitted to autonomous agents and Zero
Human Companies — wallet-native auth, no KYC, MCP-catalog-discoverable,
priced atomically per query.
- 30-day rails plan: MCP catalog submission, agents.json manifest,
fillin_account/fillin_estimate_cost tools, register on Olas + Virtuals
+ Eliza, hunt 5 ZHC customers (autonomous trading bots, DAO digesters,
Eliza-deployed character agents, etc).
- Reprioritization: defer Stripe, lean into crypto-native rails. Stripe
becomes the secondary path once humans-evaluating-for-ZHCs ask for it.
- ZHC profile + spend math: monitoring agents at 50 queries/min hit $20k/mo
per customer — one such customer beats 1k hobbyists.
2026-04-25 · 1a108a6
Outreach playbook + Stripe wiring plan
Two artifacts to drive Phase 5 of the launch (commercial traction):
- docs/OUTREACH.md: 3 cold-email templates (agent frameworks, daily-brief
products, research/dev-tools shops), tier-1/2/3 target list (~30 names),
cadence + personalization rules.
- docs/STRIPE_PLAN.md: end-to-end wiring plan to add card payments
alongside the existing x402 handshake. Stripe Checkout (not Elements),
one-time SKUs (not subs), reuse credits ledger via "fk_" key prefix,
webhook signature verification non-negotiable. Implementation in 4-6h.
No code yet — these are the artifacts the commercial push runs against.
2026-04-25 · 67bb951
Landing page parity + fillin_health MCP tool
Landing page was claiming a 2-source corpus + single-embedder stack;
neither is true anymore. Pulled it back into reality.
- web/index.html: stack chips reflect 4 sources, pluggable embeddings,
authority×recency rerank, MCP server. "How it works" copy now names
the rerank step honestly.
- mcp_server.py: fillin_health() — reachable, host, rows, earliest,
latest. Lets an agent decide whether to call fillin_query at all.
2026-04-25 · e352816
Authority + recency reranking: arxiv & GH releases beat HN noise
v2 roadmap item — once corpus has 4 sources of mixed quality, pure
cosine similarity is the wrong final order. Pull 3× candidates from
LanceDB, rerank by similarity × authority × recency, slice to k.
- ranking.py: SOURCE_AUTHORITY (arxiv/GH = 0.95, RSS = 0.75, HN = 0.70),
exp recency decay with 90-day half-life. Both tunable via env
(FILLIN_AUTHORITY JSON, FILLIN_RECENCY_HALF_LIFE).
- db.py: query_delta now pulls k * RERANK_FACTOR, reranks, slices.
Each result carries a `score` field so callers can show grounding
confidence.
- 12 new tests, 74 passing.
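
A rough sketch of the rerank step, assuming each candidate already carries a similarity in [0, 1]. The authority values and the 90-day half-life are from the entry above; the dictionary keys, the default authority for unknown sources, and the candidate field names are assumptions.

```python
SOURCE_AUTHORITY = {"arxiv": 0.95, "gh_releases": 0.95, "rss": 0.75, "hackernews": 0.70}
HALF_LIFE_DAYS = 90.0
DEFAULT_AUTHORITY = 0.5  # assumed fallback for sources not in the table


def recency_weight(age_days):
    """Exponential decay: weight halves every HALF_LIFE_DAYS."""
    return 0.5 ** (max(age_days, 0.0) / HALF_LIFE_DAYS)


def effective_score(candidate):
    authority = SOURCE_AUTHORITY.get(candidate["source"], DEFAULT_AUTHORITY)
    return candidate["similarity"] * authority * recency_weight(candidate["age_days"])


def rerank(candidates, k):
    """Score an over-fetched candidate list, sort descending, slice to k."""
    scored = [{**c, "score": effective_score(c)} for c in candidates]
    scored.sort(key=lambda c: c["score"], reverse=True)
    return scored[:k]
```

Over-fetching (3x candidates per the entry) matters because a document outside the top k by raw similarity can move into it once authority and recency are applied.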
2026-04-25 · 822b0fb
Corpus expansion: RSS + GitHub Releases sources land
Phase 4 of the launch plan — depth compounds silently while traffic
arrives. Two new timestamped corpora wired into the existing
fetch(hours) → (docs, dropped) contract.
- sources/rss.py: feedparser-backed reader with HTML stripping, retry,
cutoff filter. Default feeds curated for the LLM/dev/agent space;
override with FILLIN_RSS_FEEDS.
- sources/github_releases.py: REST API client for /repos/{slug}/releases
with token-aware rate limiting (60/h → 5000/h with GITHUB_TOKEN).
Drafts and prereleases are skipped. Default repos seed the agent-
framework ecosystem; override with FILLIN_GITHUB_REPOS.
- ingest.py: registers both under SOURCES so they're picked up by the
default ingest pass and the scheduler.
- db.py: lazy-import lancedb so source modules import cheaply
(testable without the heavy DB dep).
- 22 new tests, 62 passing total.
2026-04-24 · 5bc3bb6
Pluggable embeddings: MiniLM default, Perplexity via OpenRouter ready
Prep work so swapping the embedding backend is a one-env-var change, not a
refactor. Default behavior unchanged — MiniLM stays on. Perplexity backend
is wired and tested but dormant until FILLIN_EMBED_MODEL=perplexity is set
and the corpus is re-embedded.
- embeddings.py: Embedder protocol + MiniLMEmbedder + PerplexityEmbedder
(OpenRouter, 32k context, $0.004/1M tok) + get_embedder singleton
- db.py: schema dim pulled from active embedder at table-create time
- tests/test_embeddings.py: 6 tests covering defaults, dims, auth gate
28 passing.
2026-04-24 · 71b71f5
Commercialization push: landing page, agent, MCP server, gstack
- web/index.html: single-file landing page with live SVG gap chart,
looping agent-↔-Fillin demo, interactive onboarding panel
(model/volume → USDC quote + copy-paste snippet)
- examples/agent.py: Claude Opus 4.7 + tool-runner demo that calls
fillin_query and synthesizes grounded answers with inline citations
- mcp_server.py: MCP server exposing fillin_query / fillin_stats over
stdio for Claude Code, Cursor, Continue, Zed, etc.
- tools/load_test.py: concurrent hammer with p50/p95/p99 reporting
- HN_POST.md: Show HN draft + operator notes
- CLAUDE.md: project manifest + gstack skill routing rules
- requirements.txt: pin anthropic + mcp
2026-04-22 · cc126b4
Add ONBOARDING.md + runnable Python client
Documents the full x402 signed-challenge flow for first-time paying
customers: discovery → deposit → credit check → challenge → sign → query,
with error-table, bearer-key shortcut, and TypeScript + Python snippets.
examples/client.py is a dependency-light reference client. Round-tripped
locally: the signatures it produces validate against payments/auth.py's
verify_signature.
2026-04-22 · 27f56cf
Infra toward first paying customer: rate limits, query cap, arXiv source
Knocks out the two remaining High items from security-report.md and adds
the second ingest source called out in FINDINGS.md's v1 roadmap.
Rate limiting (slowapi, per-IP):
- /query 30/min
- /v1/payment/challenge 10/min
- /v1/payment/account 10/min
Limits can be disabled for tests with FILLIN_DISABLE_RATE_LIMIT=1;
exceeding returns 429 with a structured JSON body.
Input validation on /query:
- query: max_length=512, min_length=1 (closes the embedder-DoS lever)
- cutoff: datetime (was str with manual ISO parsing) — invalid input now
fails at Pydantic boundary with 422
arXiv ingest source:
- sources/hackernews.py — existing HN logic moved out of ingest.py
- sources/arxiv.py — Atom feed for cs.AI/cs.LG/cs.CL/cs.CR/cs.DC,
sorted desc, paged until out of window
- ingest.py — orchestrator: iterates SOURCES, isolates
per-source failures, embeds once, upserts once
- scheduler.py unchanged; picks up both sources automatically
Tests: 51 → 67. Adds test_api_limits.py (rate limits + validation),
test_source_hn.py, test_source_arxiv.py, and rewrites test_ingest.py to
cover the orchestrator (source merging, failure isolation, subset flag).
2026-04-22 · 1f6cc99
Harden fillin: tests, efficiency, and signed x402 handshake
Lifts the seven "fine" components from /sitrep to "works great" and fixes
the CRITICAL finding from /security-scan.
Hardening:
- db.upsert: LanceDB merge_insert replaces O(n) in-Python id dedup
- db.stats: pyarrow column-scan min/max, no row-dict materialization
- ingest.fetch_page: 3-attempt exponential backoff on RequestException
- ingest.ingest: counts and logs dropped malformed hits
- wallet: explicit module cache with reset_cache(); is_configured() now
validates checksum, not just env presence; api lifespan hook calls init()
- credits.spend / credit_deposit: reject non-positive amounts; add
deposits_address_idx
Critical fix — EIP-191 signed-challenge handshake on /query:
Previously X-Payer-Address was trusted as payer identity; anyone could
drain any funded account after address enumeration via /v1/payment/account.
Now every paid /query requires X-Payer-Address, X-Payer-Nonce, and
X-Payer-Signature. POST /v1/payment/challenge mints a one-shot nonce +
canonical message; the server recovers the signer and rejects 401 on
mismatch. Nonce consumption + credit deduction are a single SQLite
BEGIN IMMEDIATE transaction — single-use, replay-proof, and a failed
spend rolls back nonce consumption.
Tests: 0 → 51, all green. Covers db, ingest, wallet, credits, EIP-191
recovery, nonce lifecycle, and end-to-end attacker scenarios (spoofed
address, wrong signer, replay, unfunded-but-signed) via FastAPI TestClient.
security-report.md documents the remaining High/Medium/Low items for a
follow-up pass (rate limiting, query max_length, balance enumeration,
dep pinning).
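
The nonce-plus-spend transaction can be sketched with the standard sqlite3 module. This is an illustration under assumed table and column names (`credits`, `nonces`, `micro_usdc`, `used`), not the schema in payments/; signature verification is out of scope here.

```python
import sqlite3


def open_db(path=":memory:"):
    # isolation_level=None gives manual transaction control (explicit BEGIN/COMMIT).
    conn = sqlite3.connect(path, isolation_level=None)
    conn.executescript("""
        CREATE TABLE credits (address TEXT PRIMARY KEY, micro_usdc INTEGER NOT NULL);
        CREATE TABLE nonces (nonce TEXT PRIMARY KEY, address TEXT NOT NULL,
                             used INTEGER NOT NULL DEFAULT 0);
    """)
    return conn


def spend_with_nonce(conn, address, nonce, price_micro):
    """Consume the nonce and deduct credits atomically; False if either fails."""
    if price_micro <= 0:
        raise ValueError("price must be positive")
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        used = conn.execute(
            "UPDATE nonces SET used = 1 WHERE nonce = ? AND address = ? AND used = 0",
            (nonce, address),
        ).rowcount == 1
        paid = used and conn.execute(
            "UPDATE credits SET micro_usdc = micro_usdc - ? "
            "WHERE address = ? AND micro_usdc >= ?",
            (price_micro, address, price_micro),
        ).rowcount == 1
        # A failed spend rolls back the nonce consumption as well.
        conn.execute("COMMIT" if paid else "ROLLBACK")
        return bool(paid)
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Putting both conditional UPDATEs inside one write transaction is what makes replay and unfunded-but-signed attempts fail without leaving partial state behind.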
2026-04-21 · 4738a8d
x402 payments: Base USDC wallet + monitor + credits ledger
- payments/wallet.py: web3.py wrapper mirroring GLYPH's wallet.ts —
Base sepolia/mainnet USDC support, self-custodied key, balance checks.
- payments/credits.py: SQLite-backed per-payer credit ledger, micro-USDC
accounting (no float drift), idempotent deposits by tx_hash, atomic
spend with conditional UPDATE.
- payments/monitor.py: polls Base for USDC Transfer events into our
wallet and credits sender addresses. Confirmations=2, cold-starts at
head-1 so we don't replay history. Runs as its own systemd unit.
- api.py: /query now accepts EITHER a valid FILLIN_API_KEY bearer OR
an X-Payer-Address header whose credits cover the price. Unpaid
requests return 402 with pay_to, network, usdc_contract, and
next-step instructions. New GET /v1/payment/account/{addr} returns
balance. Bearer compare stays constant-time.
- deploy/nginx-fillin.conf: HTTP bootstrap site for fillin.glyphapi.dev
(TLS to be added by certbot --nginx after DNS propagates).
- deploy/fillin-monitor.service: hardened systemd unit for the monitor.
- deploy/install.sh: now also installs monitor unit + nginx site.
- All three units: PYTHONUNBUFFERED=1 for readable journal logs.
- requirements.txt: web3>=7, eth-account>=0.11.
Price locked at $0.01/USDC per query (Starfleet order — commodity
pricing was leaving money on the table for a product with a time-in-
market moat).
Verified live on snapback: unauthed POST /query → 402 with payment
instructions; /v1/payment/info returns wallet 0x763C… on base-sepolia;
monitor watching block 40503061+.
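
Integer micro-USDC accounting and tx_hash idempotency can be sketched as below. Table names, column names, and function signatures are assumptions for illustration; only the ideas (integer units, one credit per tx_hash) come from the entry above.

```python
import sqlite3
from decimal import Decimal

MICRO_PER_USDC = 1_000_000  # USDC uses 6 decimal places


def to_micro(usdc):
    """Convert a decimal string such as '0.01' to integer micro-USDC."""
    return int(Decimal(usdc) * MICRO_PER_USDC)


def open_ledger(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE deposits (tx_hash TEXT PRIMARY KEY, address TEXT NOT NULL,
                               micro_usdc INTEGER NOT NULL);
        CREATE TABLE credits (address TEXT PRIMARY KEY, micro_usdc INTEGER NOT NULL);
    """)
    return conn


def credit_deposit(conn, tx_hash, address, amount_micro):
    """Credit a deposit once per tx_hash; return False on a repeat."""
    if amount_micro <= 0:
        raise ValueError("deposit must be positive")
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "INSERT OR IGNORE INTO deposits (tx_hash, address, micro_usdc) VALUES (?, ?, ?)",
            (tx_hash, address, amount_micro),
        )
        if cur.rowcount == 0:
            return False  # tx_hash already recorded, nothing credited
        conn.execute(
            "INSERT INTO credits (address, micro_usdc) VALUES (?, ?) "
            "ON CONFLICT(address) DO UPDATE SET micro_usdc = micro_usdc + excluded.micro_usdc",
            (address, amount_micro),
        )
    return True


def balance(conn, address):
    row = conn.execute("SELECT micro_usdc FROM credits WHERE address = ?", (address,)).fetchone()
    return row[0] if row else 0
```

With this shape, a monitor that re-reads the same Transfer event after a restart cannot credit it twice.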
2026-04-20 · 420d378
Deploy Fillin to snapback: bearer auth + hardened systemd units
- api.py: bearer-token auth on /stats and /query via FILLIN_API_KEY env
var; public /healthz for unauthed liveness. Uses secrets.compare_digest
to avoid timing leaks. Returns 503 if key is not configured.
- deploy/fillin-api.service + fillin-scheduler.service: systemd units
with ProtectSystem=strict, ProtectHome, NoNewPrivileges, limited
ReadWritePaths. Cache dirs redirected into /opt/fillin/.cache so the
HF model download works under ProtectHome.
- deploy/install.sh: idempotent rsync + venv + systemctl deploy. First
run generates a cryptographically strong FILLIN_API_KEY (32 random
bytes hex) into /opt/fillin/.env with chmod 600.
Verified live on snapback:8766 — unauth→401, /healthz→200,
authed /query returns today's news with gap_days=109.58.
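
The bearer check described above, reduced to a framework-free sketch. `check_bearer` and its status-code return values are illustrative, not the api.py implementation; `secrets.compare_digest` and `secrets.token_hex` are the standard-library calls.

```python
import secrets


def generate_api_key():
    """32 random bytes, hex-encoded (64 characters)."""
    return secrets.token_hex(32)


def check_bearer(authorization_header, configured_key):
    """Return the HTTP status the request should get: 200, 401, or 503."""
    if not configured_key:
        return 503  # server has no key configured
    prefix = "Bearer "
    if not authorization_header or not authorization_header.startswith(prefix):
        return 401
    presented = authorization_header[len(prefix):]
    # Constant-time comparison avoids leaking how many leading bytes matched.
    ok = secrets.compare_digest(presented.encode(), configured_key.encode())
    return 200 if ok else 401
```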
2026-04-19 · d015361
Fillin v0 — time-series vector DB of the internet
Core components:
- db.py: LanceDB schema + temporal delta query (cutoff + semantic similarity)
- ingest.py: Hacker News Algolia ingestion with dedup
- api.py: FastAPI POST /query, GET /stats
- demo.py: CLI simulating an LLM with a cutoff asking the delta
- scheduler.py: daemon loop for always-on ingestion (the moat clock)
- eval.py: wedge proof vs HN Algolia keyword baseline
- FINDINGS.md: 2026-04-19 eval — recall wins, relevance needs better embedder
Stack: LanceDB + sentence-transformers MiniLM-L6-v2 + FastAPI + requests.
300 HN docs indexed at first ingestion, covering 2026-04-19 window.