Changelog

What shipped.

Generated from git history, most recent first.

2026-07-06 · 791d76c

refocus: one identity — Glyph, a language for agents

Archive every dead-era artifact (nothing deleted) to archive/2026-07-06-refocus/: Phase R/S strategy docs, ZHC/Stripe/outreach plans, the four 301'd funnels (onboard, onboard_v2, index, pricing), theme mockups, prompt-out artifacts. Add FOCUS.md (single source of identity: script/grammar/corpus framing, honest constraints, the one rule — no supply-side work without demand-side evidence), docs/RELAUNCH.md (one build step: /v1/encode, then distribution only), and web/language.html (redesigned front-door draft, not routed). Rewrite README and CLAUDE.md to match; commit AGENT_CONTEXT.md as the durable code reference.

2026-06-30 · d1a1c29

funnel: Phase 1 consolidation — one front door, one path, trial-first

Collapse the page sprawl (5 competing front doors) into a single funnel: film (/) -> /glyph (product) -> /signup (fund).

- api.py: 301 the duplicate/legacy onboarding funnels (/onboard, /v2, /onboard_v2, /old, /index.html) -> /, and the stale multi-tier pricing page (/pricing, which pushed the $9/mo subscription and $0.01/0.02/0.03/0.05 tiers contradicting the one-number story) -> /glyph. Pages remain on disk; flip the handlers back to restore. Subscription endpoint stays, just unsurfaced.
- web/glyph.html: trial-first. "Four ways in" -> "Start free, then one flat rate." Leads with POST /v1/signup (20 free queries, no card/wallet), then $0.01/query via x402 or a card-funded bearer at /signup.
- web/signup.html: fix dead /onboard.html link -> /glyph + /llms.txt.

Deployed + verified live: all 5 legacy routes 301 correctly, funnel serves 200, trial CTA live on /glyph.

Claude-Session: https://claude.ai/code/session_01H32KfNXkHaWPScwhiES4ts

2026-06-26 · 218fd5f

landing: responsive citation fan-out (mobile clip fix) + design-loop harness

- web/glyph-cinematic.html: the piece fan-out spread is now min(182, (innerWidth/2-102)/1.5) instead of a hardcoded 182px, so the 4 citation cards stay fully on-screen at ~390px while staying byte-identical at desktop (min returns 182 for innerWidth >= 750px). Read per-frame in ui(), so resize/rotation is picked up automatically, with no listener needed.
- docs/design-loop.md: "the taste compiler", the automated capture -> critique -> patch -> verify loop spec + token economics across loop styles.
- docs/design-findings-round1.json + design-ledger.md: round-1 output of the loop run against the live ?frame= harness (this fix was its top P1).
- prompts/: the prompt-out artifacts that drove the loop + the GRPO-selected fix prompt.

Verified: before/after screenshots at 390px (clip gone) and 1440px (unchanged); token-math "1.06-2.33x" grep count unchanged; deployed live to glyphapi.dev.

Claude-Session: https://claude.ai/code/session_01H32KfNXkHaWPScwhiES4ts
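The breakpoint claim in this entry can be checked with a few lines of arithmetic. The sketch below is a Python transcription of the quoted formula only; the function name `fan_out_spread` is made up for illustration and is not the name used in web/glyph-cinematic.html.

```python
def fan_out_spread(inner_width: float) -> float:
    """Fan-out spread in px, using the formula quoted in the entry above."""
    return min(182.0, (inner_width / 2 - 102) / 1.5)


if __name__ == "__main__":
    # 390 is the mobile width from the entry; 750 is where the cap starts to apply.
    for width in (390, 749, 750, 1440):
        print(width, round(fan_out_spread(width), 2))
```

At 390px the spread works out to (195 - 102) / 1.5 = 62px, and from 750px upward the `min` returns the old constant of 182px, which is consistent with the "unchanged at desktop" claim.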

2026-06-25 · 6d2674f

landing: fix hero descender collision + outro ghost legibility

- Hero wordmark "Glyph" reserved descender space (padding-bottom:.16em) and the tagline gap now scales with viewport (margin-top:clamp(32px,4vw,64px)), so the y/p descenders no longer crash into "A megapixel is worth…".
- Outro .ghost watermark dropped to opacity .05 + blur(2.5px) with a smaller mask core, so no legible monospace bleeds behind the "…your model can read." headline (was reading like accidental fine print / an asterisk caveat).

Claude-Session: https://claude.ai/code/session_01H32KfNXkHaWPScwhiES4ts

2026-06-24 · b3b802e

domain: make glyphapi.dev (apex) the canonical Glyph front door

Consolidate the two-products-one-name mess. glyphapi.dev becomes the single canonical URL for the Fillin search engine (branded Glyph); the old apex product (glyph-api document-compression API on Fly.io, zero on-chain revenue ever) is archived.

- Canonical public URL -> https://glyphapi.dev across manifests (server.json, smithery.yaml, agents.json), the cinematic landing (+ rel=canonical/OG), web/llms.txt, api.py discovery block, Stripe redirect defaults, deploy scripts.
- mcp_server.py host allowlist now defaults to include glyphapi.dev + www (without it /mcp/ would 421 on the apex after cutover).
- fillin.glyphapi.dev kept as a working alias (nginx server_name + allowlist) so registered MCP installs / agents.json / Smithery never break.
- Machine identifiers UNCHANGED: registry id io.github.artchristech/fillin, fillin_* tool names, FILLIN_API_KEY, host units.
- Apex snapshot + zero-revenue justification: archive/glyphapi-apex-2026-06-24/.
- Cutover runbook: docs/cutover-glyphapi-apex.md.

Focused suite green: test_substrate, test_glyph, test_corpus_and_probe, test_mcp_gate (78 passed).

Claude-Session: https://claude.ai/code/session_01H32KfNXkHaWPScwhiES4ts

2026-06-23 · 64e740e

Make the Glyph cinematic the domain front door (/)

The bare domain now serves the cinematic scroll film instead of the onboarding funnel — it's the most compelling first impression and leads into the product via its CTAs (/glyph, /v1/glyph/sample.png). The onboarding funnel is preserved at /onboard.html and the brochure at /old; reverting is a one-line repoint of root(). Test: / now serves the cinematic; the onboard SSR assertion moved to /onboard.html. Claude-Session: https://claude.ai/code/session_01H32KfNXkHaWPScwhiES4ts

2026-06-23 · 5c088ac

Add /film — cinematic scroll landing for Glyph

A scroll-driven canvas film that performs the product pipeline: post-cutoff word-cloud → compression into one dense glyph (legibly-encoded surface, real /v1/glyph render textured in) → vision-model scanline read with live token HUD (3,538→2,580, honest 1.06–2.33×) → citation pieces fan out → transfer across Gemini/Qwen/Claude with color-coded read-back. Self-contained, rAF + eased scroll, depth-of-field, grain/vignette, reduced-motion aware, mobile-gated. Served at /film and /glyph/film via _serve_static — additive, /glyph landing untouched. Includes an inert ?frame= capture harness for headless screenshots. Design reviewed iteratively against real rendered screenshots (falsify-first); graded 10/10 on presentation by a pmarca-voiced judge (4→8→9→10). Claude-Session: https://claude.ai/code/session_01H32KfNXkHaWPScwhiES4ts

2026-06-23 · f9d682f

P4 (lean): /v1/freshness — one agent-facing freshness-SLA gate-point

Per-source ingest lag + verifiable staleness already existed scattered across /freshness, /v1/corpus, and /healthz. Consolidate into a single documented, free, no-auth GET /v1/freshness an agent can poll to watch a topic instead of paying to discover staleness. Returns as_of, rows, latest, ingest_lag_seconds, an `all_fresh` boolean (the single gate field), stale_sources, and the per-source {source, latest, lag_hours, slo_hours, stale} array. Reuses the existing _corpus_cache aggregate and _source_freshness() helper — no DB cost, no new deps, no new secret. Deliberately UNSIGNED: per-source SLO staleness is already verifiable; Ed25519 provenance signing is a one-commit add the day a customer needs to forward attestations, not speculative surface to carry at $0 revenue. (Scotty's call.) Test: months-old seed corpus → every source past SLO → all_fresh False, stale flags fire (a constant 'fresh' would be a lie). Advertised in web/llms.txt. Claude-Session: https://claude.ai/code/session_01H32KfNXkHaWPScwhiES4ts
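A minimal sketch of how the `all_fresh` gate field could be derived from the per-source array. The field names (`source`, `lag_hours`, `slo_hours`, `stale`, `stale_sources`, `all_fresh`) come from the entry; the `SourceFreshness` class, the `summarize` helper, the "lag greater than SLO" rule, and the handling of an empty corpus are assumptions for illustration, not the actual `_source_freshness()` logic.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SourceFreshness:
    source: str
    lag_hours: float
    slo_hours: float

    @property
    def stale(self) -> bool:
        # Assumption: a source is stale once its ingest lag exceeds its SLO.
        return self.lag_hours > self.slo_hours


def summarize(sources: List[SourceFreshness]) -> Dict[str, object]:
    """Build the gate fields an agent would poll (hypothetical helper)."""
    stale_sources = [s.source for s in sources if s.stale]
    return {
        # Assumption: an empty corpus is reported as not fresh, never vacuously fresh.
        "all_fresh": bool(sources) and not stale_sources,
        "stale_sources": stale_sources,
        "sources": [
            {
                "source": s.source,
                "lag_hours": s.lag_hours,
                "slo_hours": s.slo_hours,
                "stale": s.stale,
            }
            for s in sources
        ],
    }
```

Deriving `all_fresh` from the same per-source flags that are returned keeps the single gate field and the detailed array from disagreeing with each other.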

2026-06-23 · c7ef68e

P2: reader auto-detection for substrate='auto'

Most callers won't pass `reader`, so today every unidentified caller gets the safe-but-suboptimal pixel→text substrate. Add substrate.infer_reader(): when /v1/retrieve omits `reader`, infer it from an explicit `X-Reader-Model` header (declare once at the transport layer) or the User-Agent. Honest-safe by construction: infer_reader uses a conservative signal map (separate from reader_class's allowlists, whose short tokens like o1/o3 would false-positive on UA strings), checks pixel/weak signals before tiled, and returns None for anything ambiguous → reader_class maps None to pixel → text. So auto-detection can never upgrade an unidentified reader into a glyph overclaim. The picker echoes the effective reader in selection.reader. Tests: 9 new (5 unit on infer_reader incl. ambiguous-prefers-safe + unknown→ None; 4 e2e proving header/UA drive the pick, body reader overrides header, unknown caller stays text). test_substrate.py 34 passed. Claude-Session: https://claude.ai/code/session_01H32KfNXkHaWPScwhiES4ts
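The safe-by-default inference described here might look roughly like the sketch below. Everything in it is an assumption made for illustration: the signal tuples, the header lookup order, and the `substrate_for` helper are hypothetical stand-ins, and the real `substrate.infer_reader` signal map and weak-reader handling are not shown in this changelog.

```python
from typing import Mapping, Optional

# Hypothetical conservative signal map; the real one is not shown in the changelog.
PIXEL_SIGNALS = ("claude", "gpt")    # pixel-billing readers, served text
TILED_SIGNALS = ("gemini", "qwen")   # flat-tile readers, served glyphs


def infer_reader(headers: Mapping[str, str]) -> Optional[str]:
    """Return a reader family, or None when nothing is recognized."""
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    # The explicit header is consulted before the User-Agent.
    for header in ("x-reader-model", "user-agent"):
        value = lowered.get(header, "")
        # Pixel signals are checked before tiled ones, so a string that
        # mentions both resolves to the safe (text) side.
        for signal in PIXEL_SIGNALS + TILED_SIGNALS:
            if signal in value:
                return signal
    return None


def substrate_for(reader: Optional[str]) -> str:
    """Only a positively identified flat-tile reader gets a glyph."""
    return "glyph" if reader in TILED_SIGNALS else "text"
```

With this shape, `substrate_for(infer_reader({"User-Agent": "curl/8.5.0"}))` falls through to `"text"`, which is the "unknown caller stays text" behaviour the tests in the entry describe.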

2026-06-23 · 3cbfd50

P1: proof-in-paywall — substrate_estimate in /v1/probe + 402 bodies

Surface a per-reader text-vs-glyph token table (from substrate.pick_substrate / glyph._token_manifest) plus a /v1/glyph/sample.png link in both the free /v1/probe response (QueryOut.substrate_estimate, computed over the actual probe results) and the 402 paywall body (over the sample set). Turns the paywall into a value calculator: an agent sees the glyph-vs-text tradeoff for its query before paying. Honest by construction — reuses existing token math, pixel-billing readers are shown text (never a fake glyph win). Tests: test_corpus_and_probe.py + test_api_x402.py assert substrate_estimate present, carries the sample link, surfaces both billing classes, and pixel→text honesty guard. Focused run 23 passed. docs/launch-next.md: dated "Path forward (Smithery deferred)" section — DO NOW (P1 done → P2 → P4) vs OWNER-MUST-DO (review → deploy → push → Show HN). Claude-Session: https://claude.ai/code/session_01H32KfNXkHaWPScwhiES4ts

2026-06-23 · bca4d87

G4: finish Fillin→Glyph public rebrand across all distribution surfaces

Rebrand every public-facing surface to Glyph while keeping all machine identifiers (registry name io.github.artchristech/fillin, tool names fillin_*, FILLIN_API_KEY, X-Fillin-Key, host, systemd units) unchanged: zero break for existing installers.

- server.json: title Fillin→Glyph, Glyph-framed description, version 0.3.0, websiteUrl → /glyph (identifier "name" left as fillin).
- smithery.yaml: displayName Glyph, substrate-led + honest description, corpus ~21.6k→~351k, add retrieve_auto + glyph_search tools, glyph tags, config titles → Glyph, homepage → /glyph.
- README, llms.txt, docs/integrations/* (cursor, claude-desktop, browse-sh, mcp-registry-pr, awesome-mcp-prs, posts/*): capital "Fillin" product name → Glyph (engine attributions intentionally kept); tool counts 7/8 → 10; surface retrieve_auto + glyph_search in tables.
- Honest token math uniform everywhere: 1.06–2.33× flat-tile, parity-or-worse on pixel billing, 15× legibility-vs-screenshot. No 1.5–2.5× overclaim.
- docs/launch-next.md: ranked product + distribution upgrade plan.

Claude-Session: https://claude.ai/code/session_01H32KfNXkHaWPScwhiES4ts

2026-06-22 · 37ee717

G2: retrieve_auto MCP tool + /v1/retrieve bearer-refund fix

Tenet-3 polish (the MCP twin of HTTP /v1/retrieve, plus the pricing fix):

- retrieve_auto MCP tool: one retrieval -> substrate.pick_substrate(results, reader) -> reader-appropriate substrate + the selection object. Returns Image+JSON blocks when the pick is glyph, JSON otherwise. Gated via _charge_or_402 (answer rate up front, refund_bearer delta on text/glyph). Reuses glyph render/manifest (no recomputed pixel math); inproc + remote paths. Factored glyph_search's packer into shared _pack_glyph; added _pack_retrieve for the remote path.
- /v1/retrieve pricing: keep answer-rate-up-front + refund-the-delta (the only x402-compatible model: a nonce authorizes one spend). Fixed a real gap -- _refund credited only the x402 lane, so Stripe-funded bearers resolving to text/glyph were overcharged 2x. Added credits.refund_bearer and wired the bearer lane into retrieve_route._refund.
- requirements: pillow>=10.3.0 (was >=10.0.0).
- tests: MCP-level reader-conditional (gemini->glyph, claude->text) + bearer-refund regression. Focused suite green.
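The charge-up-front, refund-the-delta model can be sketched as below. The prices are placeholders chosen only so that the answer rate is twice the text/glyph rate (matching the "overcharged 2x" symptom); the `Ledger` class and `charge_for_retrieve` function are hypothetical and are not the real `credits` module API.

```python
from decimal import Decimal

# Placeholder prices in USD; the real per-substrate rates are not stated here.
PRICES = {
    "answer": Decimal("0.02"),
    "glyph": Decimal("0.01"),
    "text": Decimal("0.01"),
}


class Ledger:
    """In-memory stand-in for a bearer balance."""

    def __init__(self, balance: str) -> None:
        self.balance = Decimal(balance)

    def spend(self, amount: Decimal) -> None:
        if amount > self.balance:
            raise ValueError("insufficient balance")
        self.balance -= amount

    def refund(self, amount: Decimal) -> None:
        self.balance += amount


def charge_for_retrieve(ledger: Ledger, picked: str) -> Decimal:
    """Spend the highest rate once, then refund down to the picked substrate."""
    upfront = PRICES["answer"]
    ledger.spend(upfront)              # the single authorized spend
    delta = upfront - PRICES[picked]
    if delta > 0:
        ledger.refund(delta)           # the step the bearer lane was missing
    return PRICES[picked]
```

Skipping the refund step for one funding lane reproduces the bug in the entry: a caller resolved to text would end up paying the full answer rate.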

2026-06-22 · 474836c

feat(glyph): photo-glyph result surface + tenet-3 auto-substrate

Glyph rebrand — search results render as dense VLM-readable photo glyphs via a vendored learned bitmap font (glyph.py; data/glyphs.npy is gitignored and carried by the rsync deploy). New POST /v1/glyph (+ free /v1/glyph/sample.png decoder self-test and landing hero), opt-in glyph:true on /query, MCP glyph_search, and the /glyph landing (web/glyph.html). Tenet 3 — one retrieval, auto-pick the cheapest *legible* substrate for the caller's reader (substrate.py; POST /v1/retrieve): text on Claude/GPT pixel billing, dense glyph on Gemini/Qwen flat-tile billing, answer for weak tool-callers. Reader-conditional; built via /falsify-first -> SURVIVED confidence 2 (docs/substrate_falsify.md). 36 new tests (test_glyph 15 + test_substrate 21), reader-conditional proven (gemini->glyph, claude->text). Token claims reconciled to the measured number: ~parity on Claude pixel billing; 1.06-2.33x on flat-tile readers (~1.06x on small/typical sets, up to 2.33x on dense sets). The optimistic 1.5-2.5x is gone from the landing and llms.txt — matches the falsify-first verdict. api.py/mcp_server.py here also carry the prior-session routes (SSR, /v1/gap, /v1/signup, /v1/billing/subscribe): same blended files, committed whole.

2026-06-22 · 82d183f

chore: capture deployed-but-uncommitted prior work

Snapshot the 06-08/06-12 work that was rsync-deployed to the VPS but never committed, so the branch reflects production before the Glyph thread lands on top:

- db.stats() Arrow vectorization + corpus cache (db.py)
- rss User-Agent feed fix + frontier source (sources/)
- subscription + trial-credit billing backends (payments/)
- onboarding rewrite, live-stat SSR landing pages, /v2 (web/)
- table-compaction, ANN-index, daily-snapshot scripts (scripts/)
- service reboot-resilience unit (deploy/fillin-api.service)
- trial-signup + billing + corpus tests, warmup-disable fixture (tests/)

The api.py/mcp_server.py routes that drive these (SSR, /v1/gap, /v1/signup, /v1/billing/subscribe) ship in the next commit alongside the Glyph thread; they share blended diffs in those two files and are committed whole there.

2026-05-27 · f4ee419

fix(corpus): stale-while-revalidate + lock — stop 504s on storefront

/v1/corpus was the load-bearing source of fillin-api's 76% CPU pin and 80% 504-rate over the last 4 hours. db.stats() is O(N) in Python (to_pylist() on 122k rows + zip loop), the cache had a thundering-herd race (concurrent misses each ran their own scan, GIL-serialized into a queue that never drained), and uvicorn is single-worker so one slow corpus call blocked every other route. Landing-page visitors hitting the live-stats counter got 504s and bounced.

Three changes:

1. SERIALIZED RECOMPUTE: module-level threading.Lock makes concurrent cold-cache callers dedupe to ONE stats() call.
2. STALE-WHILE-REVALIDATE: once any value is cached, stale callers get the stale value immediately and a daemon thread refreshes in the background. Sentinel ensures only one refresh thread is ever alive.
3. TTL 5min → 30min: corpus is a counter widget; the data only grows ~1k docs per 30-min ingest tick. 30min is still far fresher than the data being summarized.

Verified live: 16-way parallel internal calls all return 200 within 1s (was 30s timeouts). External through Cloudflare: warm calls ~90ms. Tier-2 fix (rewrite db.stats() to use Arrow GROUP BY + pre-compute via scheduler) deferred; the lock + SWR is the bleed-stopper.
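The lock plus stale-while-revalidate pattern from this entry can be sketched as a small class. This is an illustration under stated assumptions, not the code in api.py: the `SWRCache` name, the injectable `clock`, and the `last_refresh` attribute (kept so a caller or test can join the thread) are all hypothetical.

```python
import threading
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class SWRCache(Generic[T]):
    def __init__(self, compute: Callable[[], T], ttl_seconds: float = 1800.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._compute = compute
        self._ttl = ttl_seconds
        self._clock = clock
        self._cold_lock = threading.Lock()      # serializes the first computation
        self._refresh_guard = threading.Lock()  # sentinel: at most one refresh thread
        self._value: Optional[T] = None
        self._stamp: Optional[float] = None
        self.last_refresh: Optional[threading.Thread] = None

    def _store(self, value: T) -> None:
        self._value = value
        self._stamp = self._clock()

    def _refresh(self) -> None:
        try:
            self._store(self._compute())
        finally:
            self._refresh_guard.release()

    def get(self) -> T:
        if self._stamp is None:
            # Cold cache: concurrent callers queue here and only one computes.
            with self._cold_lock:
                if self._stamp is None:  # re-check: another caller may have filled it
                    self._store(self._compute())
            return self._value
        value = self._value  # capture before a refresh can replace it
        is_stale = self._clock() - self._stamp > self._ttl
        if is_stale and self._refresh_guard.acquire(blocking=False):
            # Stale: serve the old value now, refresh in the background.
            self.last_refresh = threading.Thread(target=self._refresh, daemon=True)
            self.last_refresh.start()
        return value
```

The non-blocking `acquire` is what plays the role of the sentinel: a second stale caller fails to take the guard and simply returns the stale value instead of starting another scan.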

2026-05-27 · c8d531e

chore: regenerate web/changelog.html (auto)

2026-05-26 · 7e85637

web: merge home → onboard, make onboard the front door

The brochure home was competing with the actual conversion funnel. Per the user's call (option B from /Users discussion): take the strongest piece of the home (the corpus-over-time chart and the "permanent, growing blind spot" framing) and inject it as a hero banner above step 1 of the funnel. Then point / at onboard.html so visitors land inside the product instead of in front of a price table.

- / → onboard.html (was index.html)
- /onboard.html → onboard.html (unchanged)
- /old, /index.html → index.html (preserved for inbound links)

Onboard now opens with: tagline → "permanent growing blind spot" lede → corpus chart with the gap visualized → stepper → step 1 (pick model → see YOUR personalized gap). The chart uses onboard's accent palette (#c6ff3c) rather than the home's teal.

2026-05-26 · e0980e7

mcp: broaden gate auth aliases + declare smithery x-to header forward

Per Smithery's session-config spec, configSchema fields ship to upstream MCP servers via x-to metadata; without it Smithery falls back to forwarding as a query parameter named after the property. Two changes:

- smithery.yaml: declare FILLIN_API_KEY → header X-Fillin-Key via x-to
- mcp_server._extract_bearer: also accept FILLIN_API_KEY / fillin-api-key in headers and as ?FILLIN_API_KEY= / ?fillin_api_key= query params, so users on pre-update Smithery installs are charged correctly the moment they paste a key into the config field.

[fillin-gate] telemetry now distinguishes header:authorization, header:x-fillin-key, header:FILLIN_API_KEY, qs:fillin_key, qs:FILLIN_API_KEY (smithery-default), and qs:api_key (smithery-session) so we can read the first organic Smithery hit and confirm the path used.
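A sketch of alias-tolerant bearer extraction that also reports which path carried the key. The alias names and telemetry labels are taken from the entry; the lookup order, the `extract_bearer` name, and the return shape are assumptions, and the real `mcp_server._extract_bearer` may differ.

```python
from typing import Mapping, Optional, Tuple

# (lowercased header name, telemetry label)
HEADER_ALIASES = (
    ("x-fillin-key", "header:x-fillin-key"),
    ("fillin_api_key", "header:FILLIN_API_KEY"),
    ("fillin-api-key", "header:FILLIN_API_KEY"),
)
# (query parameter name, telemetry label)
QUERY_ALIASES = (
    ("fillin_key", "qs:fillin_key"),
    ("FILLIN_API_KEY", "qs:FILLIN_API_KEY"),
    ("fillin_api_key", "qs:FILLIN_API_KEY"),
    ("api_key", "qs:api_key"),
)


def extract_bearer(headers: Mapping[str, str],
                   query: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (token, telemetry_label), or None when no key is presented."""
    lowered = {name.lower(): value for name, value in headers.items()}
    auth = lowered.get("authorization", "")
    if auth.lower().startswith("bearer "):
        token = auth[len("bearer "):].strip()
        if token:
            return token, "header:authorization"
    for name, label in HEADER_ALIASES:
        token = lowered.get(name, "").strip()
        if token:
            return token, label
    for name, label in QUERY_ALIASES:
        token = query.get(name, "").strip()
        if token:
            return token, label
    return None
```

Returning the label alongside the token is what makes the one-line `[fillin-gate]` telemetry cheap: the caller logs the label without re-deriving where the key came from.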

2026-05-26 · 7d48b88

fix(mcp): close in-proc payment bypass — gate paid tools via fillin_credits

The MCP tools (fillin_query, fillin_answer, query_cves/papers/frontier/markets) were bypassing auth entirely in the in-proc path used by Smithery and every other co-mounted MCP gateway. Result: 1800+ MCP requests in the last 3 days returned full results for $0. Each paid tool now charges via fillin_credits.spend_bearer using a bearer extracted from Authorization, X-Fillin-Key, or ?fillin_key=… (Smithery fallback). Missing/insufficient/unknown keys return a payment_required dict with the live Stripe topup URL and the /v1/probe free-taste endpoint. Also adds [fillin-gate] one-line telemetry per call so we can see how agents actually present their auth (header vs query string vs none). 10 new tests in tests/test_mcp_gate.py covering each path.

2026-05-23 · 9a45238

chore: regenerate web/changelog.html (auto)

2026-05-23 · e143592

feat(api): X-Fillin-Key header alias for Smithery gateway compat

ASGI middleware that copies an X-Fillin-Key request header into the canonical Authorization: Bearer position before any downstream auth runs. Required because Smithery's gateway blocks "Authorization" (and cookie / cf-* / smithery-* / x-smithery-*) as user-configurable parameter names, so a Smithery-routed connection can't pass a bearer token through the standard header.

Behaviour:

- X-Fillin-Key present, Authorization absent → synthesize "Authorization: Bearer <X-Fillin-Key>" at the ASGI scope level and strip the X-Fillin-Key header to avoid duplication.
- Both present → Authorization wins; X-Fillin-Key is ignored (sending both is a misconfiguration, so we don't silently override the explicit Authorization).
- Neither present → no-op.

Smithery listing parameter config that pairs with this:

- Name: apiKey
- Type: string
- Location: header
- Required: false
- Output header: X-Fillin-Key

2 new tests; full suite 332 green. Production verified: garbage X-Fillin-Key reaches the auth lane and 402s (proving the alias fires), garbage Authorization still 402s (proving the canonical path is intact), both-headers case 402s on the Authorization side (proving precedence).
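The three behaviours listed in this entry map onto a small pure-ASGI middleware. The sketch below is an illustration, not the shipped code: the class name `FillinKeyAliasMiddleware` is hypothetical, and it relies only on the ASGI convention that `scope["headers"]` is a list of `(name, value)` byte pairs.

```python
import asyncio


class FillinKeyAliasMiddleware:
    """Copy X-Fillin-Key into Authorization: Bearer when Authorization is absent."""

    def __init__(self, app) -> None:
        self.app = app

    async def __call__(self, scope, receive, send) -> None:
        if scope["type"] in ("http", "websocket"):
            headers = list(scope.get("headers", []))
            names = {name.lower() for name, _ in headers}
            alias = next((v for n, v in headers if n.lower() == b"x-fillin-key"), None)
            if alias is not None and b"authorization" not in names:
                # Strip the alias and synthesize the canonical header.
                headers = [(n, v) for n, v in headers if n.lower() != b"x-fillin-key"]
                headers.append((b"authorization", b"Bearer " + alias))
                scope = dict(scope, headers=headers)
            # Both present: Authorization wins, so the scope is left untouched.
        await self.app(scope, receive, send)


async def _demo() -> None:
    async def downstream(scope, receive, send):
        print(dict(scope["headers"]))

    scope = {"type": "http", "headers": [(b"x-fillin-key", b"fk_example")]}
    await FillinKeyAliasMiddleware(downstream)(scope, None, None)


if __name__ == "__main__":
    asyncio.run(_demo())
```

Rewriting a copy of the scope, rather than mutating the one passed in, keeps the change invisible to anything upstream of the middleware.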

2026-05-23 · 76b8f23

feat: billing-success subscription cards + markets in HN post + healthcheck

Carry-over from the parent feat/payments-push-distribution branch that had been sitting in the working tree across multiple commits, bundled here so the marketplace branch's working tree is clean and prod matches HEAD.

- api.py: subscription-card UI on the /billing/success page. Right after a Stripe checkout, the success page now exposes one-click "subscribe to CVEs / papers / frontier / markets" cards that POST to /v1/subscriptions and reveal a webhook secret inline (matching the bearer-key reveal flow on the same page).
- HN_POST.md: add /query/markets line to the endpoints block (was inadvertently dropped when the markets slice shipped).
- web/changelog.html: auto-regenerated by deploy from current git log.
- deploy/healthcheck.sh: untracked helper script for one-command composite verification across /v1/health, /v1/payment/info, and /v1/billing/checkout. Already referenced in CONTEXT.md.

2026-05-23 · c737518

feat(mcp+landing): smithery capability quality — annotations + live call-counter badge

Two related improvements driven by the Smithery dashboard rubric:

1. Tool annotations (+5.93 Capability Quality pts). All 11 MCP tools now declare ToolAnnotations with a human-readable title plus readOnlyHint / destructiveHint / idempotentHint / openWorldHint. Smithery's "Annotations 0/7" check should flip to 7/7 on the next Publish/scan. Tool naming stays as-is for now (rename to a single prefix is breaking).
2. Live Smithery badge on the hero machine-row. The dynamic SVG at https://smithery.ai/badge/mandalazenwave/fillin sits next to the existing "MCP server" pill. Visitors see the current call counter trending; counters above 0 stop the listing from looking abandoned and feed back into Smithery's discovery ranking.

2026-05-23 · 6ccdc42

fix(mcp-url): advertise /mcp/ everywhere — stop Smithery uptime probe failing

Root cause: POST https://fillin.glyphapi.dev/mcp (no trailing slash) returns HTTP 307 redirect to /mcp/. Most modern HTTP clients follow it, but Smithery's automated uptime probe does not: it sees an empty body, scores the server "down," and the rating compounds downward over time. The redirect-target /mcp/ works perfectly (200 with full MCP initialize response in ~4s).

Fixed by advertising /mcp/ canonical URL everywhere:

- server.json bumped to v0.2.1, re-published to registry.modelcontextprotocol.io (verify: registry now returns /mcp/ as the canonical URL for our entry)
- agents.json + smithery.yaml + web/llms.txt updated for the next deploy
- 7 docs/integrations files updated (cursor, claude-desktop, awesome-mcp PR drafts, posts)
- The two open awesome-mcp PR branches (jaw9c#340, punkpeye#6780) pushed with the same one-line correction

Out of scope for this commit (deferred to follow-up):

- User actions: ./deploy/install.sh to ship agents.json + llms.txt to VPS, and update the Smithery listing UI directly (their dashboard caches the server URL independent of registry/manifest).
- Server-side defense in depth: replace Starlette's auto-307 on /mcp with an explicit 308 (POST-safe) redirect or mount the MCP app at both paths.

2026-05-23 · ba12e13

feat(marketplace): buy flow + MCP tools — end-to-end revenue loop

The marketplace is now operational. Agents can mint, search, AND buy info-assets, with atomic bearer-to-bearer settlement and Fillin taking 30% rake (env-overridable).

What ships:

- credits.transfer_between_bearers: atomic 2-leg debit+credit in a single SQLite BEGIN IMMEDIATE. Rolls back both legs on any failure. Self-transfer and credit > debit are rejected as defensive guards.
- marketplace.buy_mint: orchestrates the buy: look up mint, verify listed, derive seller key_hash from agent record, split rake, transfer, record tx, return full mint payload. Bearer-to-bearer only; wallet-only sellers are explicitly rejected (need Stripe Connect or x402-out, MVP-deferred).
- POST /v1/mints/{mint_id}/buy: bearer auth, 402 on insufficient balance, 409 if not listed, 404 if mint unknown, 400 on self-buy.
- 3 new MCP tools: fillin_market_search (public discovery), fillin_mint and fillin_buy_mint (require FILLIN_API_KEY). All route over HTTP, with no in-process bypass, because marketplace endpoints need real auth.

23 new tests, full suite 330 green.

Still deferred to follow-up: KMeans daily clustering job (fingerprint cluster_id still '*'), demand-oracle read endpoint, signup-form augmentation, wallet-funded buyers / sellers.
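The two-leg settlement described above can be sketched with the standard `sqlite3` module. This is an illustration under assumptions: the `bearers` table, its `key_hash` and `balance` columns, and the integer balance units are hypothetical, and the connection is assumed to be opened with `isolation_level=None` so that the explicit `BEGIN IMMEDIATE` is the only transaction in play.

```python
import sqlite3


def transfer_between_bearers(conn: sqlite3.Connection, debit_key: str,
                             credit_key: str, debit: int, credit: int) -> None:
    """Debit one bearer and credit another atomically (integer balance units)."""
    # Defensive guards from the entry, checked before any transaction opens.
    if debit_key == credit_key:
        raise ValueError("self-transfer rejected")
    if credit > debit:
        raise ValueError("credit may not exceed debit")

    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading balances
    try:
        cur = conn.execute(
            "UPDATE bearers SET balance = balance - ? "
            "WHERE key_hash = ? AND balance >= ?",
            (debit, debit_key, debit),
        )
        if cur.rowcount != 1:
            raise ValueError("unknown payer or insufficient balance")
        cur = conn.execute(
            "UPDATE bearers SET balance = balance + ? WHERE key_hash = ?",
            (credit, credit_key),
        )
        if cur.rowcount != 1:
            raise ValueError("unknown payee")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # undo the debit leg if the credit leg failed
        raise
```

With a 30% rake, a buyer debit of 100 units pairs with a seller credit of 70; where the remaining 30 is booked is outside this sketch.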

2026-05-23 · 111d740

feat(marketplace): MVP foundation — typed-reasoning info-asset mints

Adds the second product surface: agents pay Fillin once for retrieval, reason on top of it, then mint a tradeable (data + typed-reasoning) asset that Fillin attests with HMAC. Other agents buy the mint instead of re-doing the work; Fillin takes 30% rake (env-overridable) on every resale.

Trust model: **Fillin attests, agents reason.** The mint pipeline verifies every claimed evidence chunk_id resolves to a real LanceDB row (anti-poisoning gate), validates a typed reasoning graph (no free-text slop), and HMAC-signs the canonical payload. Buyers verify Fillin's signature, not the underlying chunks; that's what makes the asset transferable.

Shipped this session:

- db.chunk_ids_exist(): evidence verification primitive
- marketplace.py (415 LOC): schema for agents / mints / marketplace_transactions / onboarding_events, attestation compute+verify, rake split, demand-oracle aggregator
- POST /v1/mints: sign + persist with full validation
- GET /v1/mints/{id}: public read (auditable)
- GET /v1/mints?model_family=&cutoff_quarter=: fingerprint search
- 48 new tests (34 unit + 14 endpoint), full suite 307 green

Deferred to follow-up: POST /v1/mints/:id/buy with rake routing, daily KMeans clustering job (fingerprint cluster_id starts at '*'), fillin_mint + fillin_market_search MCP tools, demand-oracle endpoint, signup form augmentation.
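Signing "the canonical payload" requires a byte representation that does not depend on key order or whitespace. The sketch below shows one common way to get that; the sorted-keys JSON canonicalization, the SHA-256 digest, the hex encoding, and the function names are assumptions for illustration, since the entry states only that the payload is HMAC-signed.

```python
import hashlib
import hmac
import json
from typing import Any, Mapping


def canonical_bytes(payload: Mapping[str, Any]) -> bytes:
    """Serialize so that logically equal payloads yield identical bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def attest(payload: Mapping[str, Any], secret: bytes) -> str:
    """Compute the attestation tag over the canonical form."""
    return hmac.new(secret, canonical_bytes(payload), hashlib.sha256).hexdigest()


def verify(payload: Mapping[str, Any], signature: str, secret: bytes) -> bool:
    """Constant-time comparison against a freshly computed tag."""
    expected = attest(payload, secret)
    return hmac.compare_digest(expected.encode("ascii"), signature.encode("utf-8"))
```

Because HMAC is symmetric, `verify` in this sketch can only run where the secret is held.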

2026-05-23 · fd5790a

docs(distribution): D.1 — MCP Registry manifest + awesome-mcp PR drafts + post drafts

Closes Phase D.1 of "Distribution Everywhere":

- server.json: published to registry.modelcontextprotocol.io as io.github.artchristech/fillin v0.2.0 (verify: https://registry.modelcontextprotocol.io/v0/servers?search=fillin). Replaces the obsolete plan to PR modelcontextprotocol/servers' README; that repo now redirects all listings to the registry via mcp-publisher.
- docs/integrations/awesome-mcp-prs.md: drafts for 4 awesome-mcp catalogs. Opened: jaw9c/awesome-remote-mcp-servers#340 (1.1k stars) and punkpeye/awesome-mcp-servers#6780 (87.4k stars). Blocked (PRs disabled/collaborators-only): wong2, appcypher.
- docs/integrations/posts/: ready-to-paste copy for Cursor Discord #mcp, tweet, Anthropic Discord, r/ClaudeAI. Posting is a manual user step.

2026-05-21 · 6cb1652

fix(deploy/setup-stripe): poll /v1/health before smoke-testing checkout

The 3s sleep before the smoke test was shorter than fillin-api's cold-start (embedder + corpus cache warmup is ~60s), causing setup-stripe.sh to print a false-positive 502 even when Stripe was wired correctly. Poll /v1/health with a 180s deadline so the script reports based on the real terminal state.
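The fix replaces a fixed sleep with a poll-until-deadline loop. The real change lives in a shell script; the Python below only illustrates the control flow. The `wait_until_healthy` name, the 2s interval, and the injectable `probe`, `clock`, and `sleep` parameters are assumptions, with `probe` standing in for a request to /v1/health.

```python
import time
from typing import Callable


def wait_until_healthy(probe: Callable[[], bool],
                       deadline_seconds: float = 180.0,
                       interval_seconds: float = 2.0,
                       clock: Callable[[], float] = time.monotonic,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `probe` until it succeeds or the deadline passes."""
    give_up_at = clock() + deadline_seconds
    while True:
        if probe():
            return True   # service is up: safe to run the smoke test
        if clock() >= give_up_at:
            return False  # report the real terminal state, not a guess
        sleep(interval_seconds)
```

The smoke test then runs only when this returns `True`, so a slow cold start no longer shows up as a false 502.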

2026-05-20 · 88c2fbc

feat: payments receipts + push subscriptions + distribution surfaces

Three concurrent threads toward the "Fillin Magnificent" plan items:

#2 Payment Rails: close the receipts gap and make Stripe go-live a one-command op:

- new `/v1/payment/transactions/{address}` returns on-chain USDC deposit history
- new `/v1/billing/transactions` returns the bearer's per-Stripe-event credit log
- `bearer_credit_events` table + `record_bearer_credit_event()` write hook
- `deploy/setup-stripe.sh` interactive: prompts for secrets (hidden input), ships them through SSH stdin (never argv), rewrites STRIPE_* lines in /opt/fillin/.env, restarts fillin-api, smoke-tests /v1/billing/checkout
- `deploy/install.sh` now installs the templated `[email protected]` + `fillin-prune` units, and reconciles enabled instances against FILLIN_NETWORKS; fixes the long-standing bug where install.sh re-enabled the legacy single-chain monitor on every run
- .env.example documents the multi-chain + Stripe block

#3 Push Subscriptions: turn Fillin from query endpoint into event bus:

- new `pubsub.py`: subscriptions table, topic-filter validator (source/min_severity/keywords/affected_ecosystem), in-process SSE registry, HMAC-signed webhook delivery
- LanceDB polling worker started under FastAPI lifespan; uses `ingested_at` cursor (the column was added in PR #2 specifically for this)
- `POST /v1/subscriptions` (auth: bearer with positive balance), `GET /v1/subscriptions`, `DELETE /v1/subscriptions/{id}`, `GET /v1/subscribe/{id}` (SSE with 15s heartbeat)
- webhook deliveries carry X-Fillin-Timestamp + X-Fillin-Signature (HMAC-SHA256 over `ts.body`, Stripe-style replay-proof)

#1 Distribution surfaces: make Fillin discoverable + installable everywhere:

- `/llms.txt` route serving web/llms.txt (agent-crawler discovery)
- `Dockerfile` for the FastAPI+MCP HTTP server
- `docs/integrations/cursor.md`: MCP config + Stripe funding walkthrough
- `docs/integrations/claude-desktop.md`: claude_desktop_config.json shim
- `docs/integrations/browse-sh.md`: ready-to-submit skill spec (browse.sh agents make external HTTP, so x402 settles correctly; verified design)
- `docs/integrations/mcp-registry-pr.md`: draft PR text for modelcontextprotocol/servers community list

Tests: 259 passing. New coverage in test_pubsub.py (15), test_api_subscriptions.py (9), test_api_discovery.py (3), additions in test_api_billing.py (4).
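On the receiving side, a subscriber would recompute the HMAC over `ts.body` and reject old timestamps. The sketch below assumes details the entry does not state: a hex-encoded signature, a Unix-seconds timestamp, and a 300-second tolerance window. The function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(secret: bytes, timestamp: str, body: bytes) -> str:
    """HMAC-SHA256 over `ts.body` (timestamp, a dot, then the raw body)."""
    message = timestamp.encode("utf-8") + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: str, body: bytes, signature: str,
           now: Optional[float] = None, tolerance_seconds: int = 300) -> bool:
    """Check freshness first, then compare signatures in constant time."""
    current = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    if abs(current - sent_at) > tolerance_seconds:
        return False  # too old or too far in the future: treat as a replay
    expected = sign(secret, timestamp, body)
    return hmac.compare_digest(expected.encode("ascii"), signature.encode("utf-8"))
```

Binding the timestamp into the signed message is what makes the scheme replay-resistant: an attacker cannot refresh the timestamp on a captured delivery without invalidating the signature.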

2026-05-17 · 07bde4a

markets: new /query/markets slice — Polymarket + Kalshi + Manifold + Metaculus (#4)

* Phase R: schema migration foundation + CVE answer-engine columns Ships R.1 audit, R.2 typed-columns decision, R.3 ingested_at, and R.next severity_score + affected[] against the cves source. The CVE answer-engine SKU is now shippable — a buyer can act on a returned row (pin to patched_range) without a second hop to NVD or GHSA. SCHEMA_AUDIT.md — per-source field map for all 7 ingest paths; identifies the 4 highest-WTP discards (ingested_at, CVE severity, quality signals, GHSA/OSV patched_range). SCHEMA_DECISION.md — chose typed columns (B) over single JSON blob (A). Deciding field: GHSA/OSV affected[] tuple. JSON-blob path breaks LanceDB filter pushdown, the one-call MCP wedge, and the training-data SKU. Each new typed column names the SKU it unlocks. db.py — schema grows from 7 to 10 columns (+ingested_at, +severity_score, +affected). connect() auto-migrates pre-existing tables via add_columns — metadata-only op, existing 25k rows survive. upsert() server-stamps ingested_at and normalizes typed-CVE defaults so non-CVE sources land as null/empty. query_delta_in_sources gains min_severity filter for the CVE severity tier — pushed into the LanceDB WHERE clause, drops null-severity rows silently rather than treating them as 0. sources/cves.py — NVD pulls numeric CVSS baseScore (v3.1 → v3.0 → v2 fallback); GHSA reads cvss.score and pivots vulnerabilities[] into the uniform affected struct with patched_range; OSV flattens affected[].ranges[].events[] into the same shape. Title strings no longer carry the parenthetical severity tag — that's redundant once typed. mcp_server.py — query_cves accepts optional min_severity: float, threaded through both inproc and HTTP paths. Tests: 214 → 222 (+8). New: schema columns present, CVE row round-trip preserves severity + affected, non-CVE rows default to null/empty, NVD numeric baseScore + fallback + null, GHSA patched_range pivot, OSV range flattening. 
* papers: fix vapor ingest — use submittedOnDailyAt + date-granularity match HF daily papers source was returning empty results. Root cause: comparing the API's date-only field against a datetime cutoff with sub-day precision silently dropped every row. Switched to paper.submittedOnDailyAt with date-granularity comparison; 80 fresh papers backfilled on first run. * payments: Stripe top-up + multichain wallet + bearer ledger + prune Stripe Checkout → bearer ledger top-up flow: - payments/stripe_billing.py mints a raw bearer + hash, sends Stripe only the hash, stashes the raw key with a TTL, atomically claims it on /billing/success after Stripe API confirms paid + livemode match. - Webhook deduped by event id; requires payment_status==paid and livemode match before crediting. - scripts/check-no-stripe-keys.sh — pre-commit guard refuses to commit Stripe secrets (sk_live, rk_live, whsec_) per Stripe's #1 leak vector. api.py — /v1/billing/checkout + /v1/billing/webhook + /billing/success routes; multichain x402 (Base + Optimism + Arbitrum + Polygon) selected via payer header; Hit model + SliceQueryIn carry severity_score + affected + min_severity for the cves slice (R.next surface). payments/wallet.py — get_web3() now per-chain; cached singleton per USDC contract; never reads BASE_RPC_URL. payments/credits.py — bearer ledger (key_hash + balance) + nonce challenge table + pending_reveal table; spend/refund + probe quota + pruning primitives. payments/prune.py — periodic hygiene over the three ephemeral tables, idempotent DELETE WHERE stale, driven by fillin-prune.timer. Tests: +940 lines across test_api_billing, test_api_multichain, test_bearer_ledger, test_stripe_billing, test_pre_commit_hook, test_wallet — covers Stripe webhook signing, multichain payer routing, bearer claim-once semantics, pre-commit hook block-list. 
* deploy: systemd units for monitor + prune timer + nginx CSP + README deploy/[email protected] — per-chain templated unit; runs the multichain USDC deposit monitor as [email protected] etc. deploy/fillin-prune.service + .timer — daily ledger hygiene at 04:00 MT, runs payments/prune.py to sweep expired nonces, old probes, and unclaimed pending_reveal rows. deploy/nginx-fillin.conf — relax CSP for the rich landing pages (inline-eval + data: images required by the Pretext-style assets). deploy/README.md — install + rotate runbook for the new units. * web: hero corpus_match honesty + Stripe top-up signup + changelog web/index.html — hero lede now surfaces the corpus_match: strong|weak|none honesty signal so a visiting agent buyer sees that fillin returns the quality of its own retrieval, not just docs. web/signup.html — adds the Stripe-checkout top-up path alongside the existing trial-key mailto flow; key delivery is one-shot on the /billing/success page. web/changelog.html — catches up to Phase R (ingested_at, severity_score, affected[]), the daily snapshot slices, and the multichain x402 surface. * markets: new /query/markets slice — Polymarket + Kalshi + Manifold + Metaculus Adds the fourth daily-snapshot slice. One MCP call surfaces every active market touching a topic across the four major prediction venues an agent would otherwise check independently. Priced at $0.05 against the four-venue rediscovery cost. sources/markets.py — keyless public APIs only. Per-venue fetcher pulls the active-market head, folds the load-bearing fields (question, current implied probability / yes-price, close date, volume, venue) into `text` so vector match catches "is X likely" queries. Defensive on Polymarket's outcomePrices shape (Gamma has shipped both JSON-encoded string and pre-decoded list). Manifold filters resolved markets out. Metaculus uses community_prediction.full.q2 as the canonical median. 
ingest.py — registered in SOURCES so the standard `ingest(hours, sources)` orchestrator picks it up. api.py — /query/markets route, FILLIN_PRICE_MARKETS_USDC env (default $0.05), require_paid_or_key_markets dep, /.well-known/mcp/server-card entry. mcp_server.py — query_markets MCP tool mirrors the other slice tools' shape. Docstring is explicit that the price snapshots in `text` are first-sight only; for live pre-trade pricing, follow the venue url. scripts/ingest_markets.py + run_daily_snapshots.sh — fires markets last in the daily 1pm-MT cron, 30s after frontier. agents.json + smithery.yaml — declare the tool, pricing, example invocation. Corpora list now includes "markets". Tests: 222 → 228 (+6). Per-venue parsing tests for Polymarket (yes-price fold, dual price-shape tolerance), Kalshi (cents → display + close date + contract volume), Manifold (resolved-filter + probability render), Metaculus (community-median path). Plus a cross-venue dedup test. Known limitation (documented in script + tool docstring): prices in the corpus are point-in-time at first ingestion — db.upsert dedupes by id so existing market rows aren't refreshed. The slice answers *discovery* ("is there a market about X across the four venues") not live price. Follow-up PR can add db.replace_source('markets', ...) so daily cron refreshes price snapshots in place. --------- Co-authored-by: Christopher Harris <[email protected]> Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
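The outcomePrices tolerance mentioned in the markets notes above could look roughly like the following. This is a minimal sketch, not the code in sources/markets.py: the helper name is hypothetical, and it assumes the yes-price is the first element of the list.

```python
import json
from typing import Any, Optional


def parse_yes_price(outcome_prices: Any) -> Optional[float]:
    """Return the first (assumed "yes") price, or None if the shape is unusable.

    Hypothetical helper: accepts either a JSON-encoded string or an
    already-decoded list, the two shapes the changelog says have shipped.
    """
    if isinstance(outcome_prices, str):
        try:
            outcome_prices = json.loads(outcome_prices)
        except json.JSONDecodeError:
            return None
    if not isinstance(outcome_prices, (list, tuple)) or not outcome_prices:
        return None
    try:
        return float(outcome_prices[0])
    except (TypeError, ValueError):
        return None


print(parse_yes_price('["0.62", "0.38"]'))  # JSON-encoded string shape
print(parse_yes_price([0.62, 0.38]))        # pre-decoded list shape
```

Returning None instead of raising keeps one malformed market from failing the whole venue fetch, which seems in keeping with the "defensive" wording above.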

2026-05-16 · b8387c2

Phase R: schema migration foundation + CVE answer-engine + Stripe top-up + multichain (#2)

2026-05-14 · b528f96

landing + agents.json: hero → /onboard.html, kill private-repo links — Phase R prep

Phase R conversion-friction pass before Monday outbound. F2 (highest-leverage): landing hero primary CTA now points at /onboard.html ("Try it in 60 seconds") instead of an anchor to #tools; pricing-sidebar copy promotes /onboard.html and demotes /signup to "production bearer key". The interactive wizard already worked end-to-end (model → harness → MCP config → fillin_health verify → live fillin_query in browser) but was buried behind a nav link. F4: agents.json free_paths now lists interactive_onboard alongside trial_key, and tags trial_key as the human-onboard path so agent-native self-bootstrap routes to /onboard.html or x402 rather than the mailto. F1 (belt-and-suspenders until artchristech/fillin is flipped public): removed every github.com/artchristech/fillin link from web/index.html footer, web/onboard.html final CTA, web/pitch.html cta-row, web/pricing.html self-host tier, web/changelog.html nav-cta, agents.json repository field, smithery.yaml repository field. Replaced with Smithery (verified third-party signal), /onboard.html, or self-host mailto where appropriate. Re-add when repo visibility flips. .gitignore: PHASE_R_PLAN.md + PHASE_R_ACCELERATE.md (internal strategy artifacts, never publish). web/signup.html intentionally left out of this commit — has my F1 edit overlapping with Stripe/multi-chain in-flight work; will ship together.

2026-05-14 · 21b2976

landing: promote teal v2 — agent-first hero, 7-tool spec, x402 callout

- Recolor accent to logo teal (#3fb8ad), retire yellow-green
- Trim from 1384 → 877 lines; cut animated demo, slim sections
- Surface machine-readable endpoints (agents.json, openapi.json, /v1/corpus, MCP) as hero pills
- Add 7-tool spec table + decision-tree for "when to call what"
- Live status strip polling /v1/corpus
- Logo plate that matches JPG bg for invisible edges (3 placements)

2026-05-14 · a0a3f7c

ts client: publish v0.1.0 as @artchristech/fillin-client

The @fillin scope doesn't exist on npm yet. Shipping under the user's existing @artchristech scope so the package is installable today. README notes the planned migration to @fillin/client once the org exists. Description updated to the search-engine-for-agents framing. Live: https://www.npmjs.com/package/@artchristech/fillin-client

2026-05-14 · 1403208

docs: TS client README + HN post catch up to 7-tool reality

clients/fillin-ts/README.md: - Hero reframed as search-engine-for-agents + names the 3 daily slices - Quickstart comments the generic /query as $0.01 explicitly - Methods section calls out that slice routes (cves/papers/frontier) + /answer ship in the API but aren't yet wrapped — direct fetch or PR - Surfaces /signup (20 free queries) alongside /pricing HN_POST.md (drafts for Show HN — not on the live site): - Title swapped to "Show HN: Fillin – search engine for AI agents (...)" - Body reframes around search engine + 6 corpora + 7 tools + per-slice pricing rationale - Adds rediscovery-cost framing for differentiated pricing - Updates corpus claim ("thin / HN+arXiv") to current reality (22.8k docs, 6 sources, two-tier ingest) - Adds eval evidence (n=23 Opus, ~3.4× fewer tokens, ~2.25× cheaper) - Adds two more anticipated questions (per-slice pricing, MCP location) - Repo link points at github.com/artchristech/fillin (was placeholder) Defer: docs/ZHC_STRATEGY.md, STRIPE_PLAN.md, OUTREACH.md, SUBMISSION.md still reference flat $0.01 — internal-only, doesn't affect any user-facing surface.

2026-05-14 · 3c7c1de

docs + /pricing: catch up to the 7-tool, search-engine-for-agents reality

The flat-$0.01 framing was actively misleading now that /query/cves, /query/papers, /query/frontier each carry their own price. README still described 4 sources from before the daily-snapshot work. /pricing (web/pricing.html): - h1: "Per-call pricing. In USDC. On Base." (was "Flat $0.01 per query") - New per-tool table (7 rows) with prices and intended use - "Try before you pay" surfaces both /v1/probe and /signup - Three rails kept (x402, bearer/Stripe, self-hosted) but reframed — same prices, different settlement - "Why $0.01" replaced with "Why these prices" — rediscovery cost framing for differentiated pricing - FAQ "Is there a free tier?" answer corrected (yes, two paths) README.md: - Hero: search-engine-for-agents framing + 7-tool table front-and-center - Stack section: 6 corpora named (HN/arXiv/RSS/GHReleases/CVEs/papers/frontier), two-tier ingest (30min scheduler + 1pm MT daily snapshots) called out - Try-before-you-pay: surfaces /v1/probe AND /signup trial bearer key - Quickstart: pip install -r requirements.txt instead of stale per-package list; mentions the slice routes alongside /query Tests: 146/146.

2026-05-14 · fddbc02

deploy/nginx: relax CSP for rich landing pages

Original policy was default-src 'none', which was fine when the API only ever served JSON but is broken now that we ship a styled landing with inline <style>, Google Fonts, inline <script>, and /v1/corpus + /v1/probe fetches from the page. Browsers correctly refused everything, rendering the live site as plain serif HTML. New policy keeps the lockdown shape (no third-party scripts, no iframes, no foreign image hotlinks, no foreign connect targets) and opens only what the page actually needs:
- img-src 'self' data:
- style-src 'self' 'unsafe-inline' https://fonts.googleapis.com
- font-src 'self' https://fonts.gstatic.com data:
- script-src 'self' 'unsafe-inline'
- connect-src 'self'
Mirrors the shape used by sister site /etc/nginx/sites-available/glyph. Live config patched in-place + nginx -t green + reload (zero downtime). Backup at /root/fillin.bak.20260514 on snapback.

2026-05-14 · 2dc75fa

landing: integrate hourglass logo (replaces accent-dot brand mark)

- web/logo.jpg: 1024x1024 teal hourglass mark (the new brand) - web/index.html: nav brand swaps the .dot accent for an <img class="logo-mark"> at 26px with 5px radius. Adds <link rel="apple-touch-icon">. - api.py: new GET /logo.jpg route via _serve_static - Versioned URL (?v=1) in HTML to dodge Cloudflare's 4h cache of the pre-route 404. Future logo swaps bump the version.

2026-05-14 · bb64ca6

ingest: arxiv min window 24h → 72h

Diagnostic dig: 24h was insufficient. arXiv publishes once per weekday ~17:30 UTC. Querying at 19:00 UTC, the latest batch is from yesterday's 17:30 → only 25.5h ago, just outside a 24h window. Result: scheduler pulled 50 entries every tick, all dropped as out-of-window, arxiv corpus went 15 days stale. 72h covers the worst case (Monday morning needing Friday's batch over the weekend gap). Verified live: scheduler now pulls +1055 arxiv docs on its first tick after the change. 146/146 tests pass.
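The window arithmetic above can be checked with a few lines. This is only a sketch: the calendar dates are made up for illustration, and only the 17:30 and 19:00 UTC clock times come from the entry.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical dates; only the clock times come from the entry above.
latest_batch = datetime(2026, 5, 12, 17, 30, tzinfo=timezone.utc)
query_time = datetime(2026, 5, 13, 19, 0, tzinfo=timezone.utc)

age = query_time - latest_batch
age_hours = age.total_seconds() / 3600


def in_window(doc_age: timedelta, window_hours: int) -> bool:
    """True if a document of this age falls inside the ingest window."""
    return doc_age <= timedelta(hours=window_hours)


print(f"batch age: {age_hours}h")
print("inside 24h window:", in_window(age, 24))
print("inside 72h window:", in_window(age, 72))
```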

2026-05-14 · 39958bd

ingest: per-source min window, arxiv floored at 24h

The scheduler runs every 30min with INGEST_WINDOW_HOURS=2, but arxiv publishes in weekday batches and the API page-0 newest doc can be 24-72h old. With a 2h window every tick reads 50 entries and drops all 50, leaving the arxiv corpus stale (15+ days at last check). Fix: SOURCE_MIN_HOURS = {"arxiv": 24} in ingest.py applies a per-source floor: max(hours, SOURCE_MIN_HOURS.get(name, 0)). Other sources keep the caller-supplied window. Diagnostic log line now includes the effective window per source. 146/146 tests pass.
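The per-source floor described above amounts to roughly the following. SOURCE_MIN_HOURS and the max(...) expression are taken from the entry; the wrapper function name and the "hn" source key are assumptions made for illustration.

```python
# Per-source floor on the ingest window, in hours (value from the entry above).
SOURCE_MIN_HOURS = {"arxiv": 24}


def effective_window(name: str, hours: float) -> float:
    """Return the window actually used: the caller's value or the source floor."""
    return max(hours, SOURCE_MIN_HOURS.get(name, 0))


# "hn" is an assumed source key with no floor configured.
for source in ("arxiv", "hn"):
    print(f"{source}: effective window {effective_window(source, 2)}h")
```

Because the floor is applied with max(), a caller asking for a wider window (a 96h backfill, say) still gets what it asked for.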

2026-05-14 · d25d360

data refresh + cron to 1pm MT + smithery.yaml comprehensive sync

- Fired daily snapshot manually; corpus now at 21,677 rows (was 21,621); cves +55 new, frontier +1, papers +0 (HF daily + bioRxiv both empty for the last 24h — upstream sparseness, not a regression). The read-ids + chunk-size=25 add strategy held: 300 docs ingested with zero OOM. - Cron rescheduled on snapback VPS: CRON_TZ=America/Denver, 0 13 * * * (was 15 6 UTC). DST-aware via vixie cron (Debian 3.0pl1-162). CRON_TZ scoped to the fillin entry only — other crontab jobs (snapback, glyph, milkncookies) keep their original UTC schedule. - smithery.yaml: comprehensive update — corpus surface now names all 6 active source families (HN, arXiv, RSS, GH releases, cves, papers, frontier) with row count; /v1/probe free-tier path surfaced; /signup trial-key (20 free queries) surfaced; corpus_match signal documented in fillin_query example; daily 1pm MT cadence noted. - agents.json: synced — added query_cves / query_papers / query_frontier to mcp.tools, expanded corpora list with the 3 new sources, replaced flat $0.01 pricing with per-tool table + free_paths + snapshot_cadence. Diagnosis (no fix shipped per scope): arXiv staleness (newest doc 2026-04-29 in DB, while live API has docs through 2026-05-13) is caused by scheduler INGEST_WINDOW_HOURS=2 being shorter than arXiv's typical submission cadence — every 30-min tick reads page 0, sees the 50 newest entries are all >2h old, and writes 0. Fix would be either a wider window for arxiv specifically, or graduating arxiv into the daily-slice runner with --hours 24. Out of scope for this commit.

2026-05-14 · 0f0e739

landing: restore rich page + reframe as search engine for AI agents

The previous commit (0f73475) accidentally overwrote the rich 1381-line landing page with a stale 519-line older version. This restores the rich page (hero animation, demo, how-it-works, onboard, api, pricing, proof, stack sections) AND applies the search-engine-for-agents reframing on top: - Title + meta + og + twitter: "search engine for AI agents" - Eyebrow + lede: name the 3 daily slices and prices inline - API section: "Five endpoints" with each route + price listed - Pricing card: per-slice price list ($0.01-$0.05 range), removed the misleading "100 queries per 1 USDC" line that only held for the flat $0.01 tier

2026-05-14 · 0f73475

landing: reframe as search engine for AI agents + add 3 daily slices

- Title + meta + og: "search engine for AI agents" framing - Tagline: name the 3 slices + their prices (CVEs $0.02, papers $0.03, frontier $0.05) - Stats strip: price tile shows the $0.01-$0.05 range, links to /pricing - Endpoints table (section 03): added /query/cves, /query/papers, /query/frontier, /answer rows with prices inline - MCP section (section 05): "Seven tools" with prices listed for each Section 07 corpus table auto-populates from /v1/corpus and will pick up the new sources without code change.

2026-05-14 · 67ac88d

launch readiness: onboard XSS fix, CSO-cleared diff, internal docs gitignored

CSO daily-mode audit before pushing fillin_daily to public main: * Fixed: web/onboard.html href XSS (MEDIUM) — added safeHref() URL scheme allowlist (http/https only); applied escapeHtml to source + published_at for defense in depth. * Verified-fixed in code: XFF rate-limit bypass HIGH (CF-Connecting-IP enforcement at api.py:67-81 + ufw lockdown via deploy/lockdown_origin.sh). * Gitignored internal artifacts: CONTEXT.md, RECONCILE.md, research-report.md, .gstack/ — never publish (working specs + research notes that don't belong in the public repo). * Updated security-report.md with the 2026-05-11 audit (consistent with prior public-audit policy). Bundles other launch-ready work that was already staged: README + landing-page polish, /signup flow, agents.json mainnet update, new tests/test_corpus_and_probe.py (13 tests), thesis v1+v2 pages, eval artifacts, deploy hardening (lockdown_origin.sh). Tests: 146 passing.
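The actual safeHref() fix lives in browser JavaScript in web/onboard.html. To keep examples in one language, here is the same allowlist idea sketched in Python. The function name, the "#" fallback, and the use of html.escape are assumptions made for illustration, not the shipped code.

```python
import html
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def safe_href(url: str, fallback: str = "#") -> str:
    """Return the URL only if its scheme is on the allowlist, else a fallback."""
    candidate = url.strip()
    try:
        scheme = urlsplit(candidate).scheme.lower()
    except ValueError:
        return fallback
    return candidate if scheme in ALLOWED_SCHEMES else fallback


def render_link(url: str, label: str) -> str:
    """Escape both the href and the visible text before building the tag."""
    href = html.escape(safe_href(url), quote=True)
    return f'<a href="{href}">{html.escape(label)}</a>'


print(render_link("https://example.com/post", "A <b>post</b>"))
print(render_link("javascript:alert(1)", "click me"))
```

An allowlist is used instead of a blocklist so that schemes nobody thought of (data:, vbscript:, and so on) are rejected by default.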

2026-05-14 · 3a69321

fillin_daily: 3 daily snapshot slices + differentiated pricing

Adds CVE / papers / frontier daily ingestors and paid MCP routes to turn fillin into a search engine for AI agents. Each slice is its own paid lane with prices set by elasticity: - query_cves ($0.02) — NVD + GitHub Security Advisories + OSV - query_papers ($0.03) — HuggingFace daily papers + bioRxiv (unioned with arxiv) - query_frontier ($0.05) — OpenAI/Anthropic/DeepMind/Meta/Mistral feeds + HF trending Backend: - sources/{cves,papers,frontier}.py + scripts/ingest_*.py + run_daily_snapshots.sh - QueryIn.sources whitelist filter; rejects unknown values 400 - Three paid routes /query/{slice}; single auth path via factory - db.upsert chunked via FILLIN_UPSERT_CHUNK_SIZE (default 25); switched merge_insert -> read-existing-ids + add to fix VPS OOM at 21k-row scale - server-card.json advertises the 3 new tools Smithery: yaml refreshed with new tools, prices, tags, and 3 example invocations matching the search-engine-for-agents framing. Tests: 122 -> 146 (+15 slice routes/auth/pricing, +1 chunked upsert dedup).
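The "read-existing-ids + add" strategy with chunked writes might look something like this. It is a sketch under stated assumptions: FakeTable is an in-memory stand-in for the real table, and the helper names are hypothetical. Only the FILLIN_UPSERT_CHUNK_SIZE variable and its default of 25 come from the entry.

```python
import os
from typing import Iterator, List

CHUNK_SIZE = int(os.environ.get("FILLIN_UPSERT_CHUNK_SIZE", "25"))


class FakeTable:
    """In-memory stand-in for the real table (hypothetical)."""

    def __init__(self) -> None:
        self.rows: List[dict] = []

    def existing_ids(self) -> set:
        return {row["id"] for row in self.rows}

    def add(self, rows: List[dict]) -> None:
        self.rows.extend(rows)


def chunked(rows: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def upsert(table: FakeTable, docs: List[dict], chunk_size: int = CHUNK_SIZE) -> int:
    """Add only docs whose id is not already present, one small chunk at a time."""
    seen = table.existing_ids()
    fresh = []
    for doc in docs:
        if doc["id"] not in seen:
            seen.add(doc["id"])  # also dedupes repeats within this batch
            fresh.append(doc)
    for chunk in chunked(fresh, chunk_size):
        table.add(chunk)
    return len(fresh)
```

Note that this shape never refreshes an existing row: a doc whose id is already present is skipped, not updated.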

2026-05-13 · 54557a3

fillin_answer: synthesized-mode tool for weaker LLM callers

The eval showed fillin's value depends on the calling model's tool-synthesis skill — Opus 4.7 extracts citations cleanly (4.29/answer); Nemotron-120B free barely does (2.3, mostly hallucinated from training). To be a true enhancement on *any* model, fillin needs to do the synthesis itself for callers that can't. New tool: fillin_answer(query, cutoff, k) Returns a 150-250 word answer with inline [title](url) citations, grounded in post-cutoff retrieved docs. Server runs Haiku 4.5 over the top-k results with a constrained system prompt (no outside knowledge, echo dates, refuse if docs don't address the query). Raw docs are returned alongside for verification. Pricing: $0.02 USDC / call (vs $0.01 for fillin_query). Bearer keys unmetered. x402 path refunds on synthesizer error or no-relevant-docs case so payers never lose USDC for a non-synthesis. Cold-start safe: falls back to {answer: null, reason: "synthesizer_not_configured"} when ANTHROPIC_API_KEY is unset. x402 callers are auto-refunded the $0.02. Cheap-signal addition: fillin_query and the in-proc MCP variant now return top_score + corpus_match ("strong" | "weak" | "none") so agents can skip the paid call when retrieval found nothing. - synthesize.py: corpus_match thresholds + Haiku-backed synthesizer - api.py: /answer endpoint, ANSWER_PRICE_USDC, AnswerOut model, require_paid_or_key_answer dependency, refund-on-no-synthesis logic - mcp_server.py: fillin_answer tool with both in-proc and remote paths - agents.json + smithery.yaml + /.well-known/mcp/server-card.json: register the new tool and pricing - tests/test_synthesize.py: 10 new unit tests for corpus_match, extract_citations, and the no-key fallback (no live API calls) Test status: 122/122 passing (10 new).

2026-05-13 · cdde415

launch polish: embedder pre-warm + smithery.yaml example invocations

- api.py: lifespan now warms the embedder singleton with a dummy encode at startup, so the first real fillin_query doesn't pay the ~2-3s sentence-transformers cold-load cost (showed up as a p99 latency spike on the Smithery performance dashboard) - smithery.yaml: add four `examples:` entries (release-notes query, research query, free health check, free corpus stats) so the listing surfaces realistic call shapes to reviewers and prospective agent developers Verified live: '[fillin] embedder warm' logs on snapback after restart.

2026-05-12 · e1885c9

launch prep: /signup trial-key flow, Proof section, mainnet-correct smithery.yaml

- web/signup.html + GET /signup route: 20-free-query trial via mailto prefilled with name/use-case/harness/cutoff, so the "keys issued at fillin.glyphapi.dev" promise has a real destination - web/index.html: new #proof section with the n=8 head-to-head numbers (Fillin = 2.25× cheaper than web search, 11× more inline citations); /signup link added to nav, CTA, and pricing card - smithery.yaml (new file, will commit to surface on registry): removes the "free public tier" claim that didn't exist, points at /signup - api.py: bundles already-deployed CF-Connecting-IP rate-limit hardening so origin matches what's running on snapback (rsync had been the source of truth)

2026-05-06 · 70f9ec7

/freshness: cache 60s — public endpoint, full column scan was 8s cold

The freshness endpoint runs db.stats(), which scans the published_at + source columns over the full ~15k-row table; the first call after a worker restart took 8+ seconds and would time out the landing-page widget. As a public no-auth endpoint, that's also a trivial accidental DoS vector. Fix: a module-level dict cache with a 60s TTL. The 'now' field is still computed per request so consumers can detect a stale cache if needed.
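A module-level TTL cache of this kind could be sketched as follows. The function and variable names are hypothetical, and the injectable clock exists only to make the sketch testable; the 60s TTL and the per-request 'now' field come from the entry.

```python
import time
from typing import Any, Callable, Dict

TTL_SECONDS = 60.0
_CACHE: Dict[str, Any] = {"at": 0.0, "stats": None}


def freshness(compute_stats: Callable[[], dict],
              clock: Callable[[], float] = time.monotonic) -> dict:
    """Return cached stats, recomputing at most once per TTL window."""
    ts = clock()
    if _CACHE["stats"] is None or ts - _CACHE["at"] >= TTL_SECONDS:
        _CACHE["stats"] = compute_stats()  # the expensive full-column scan
        _CACHE["at"] = ts
    # 'now' is stamped per request, outside the cache, so a consumer can
    # compare it against the data's own timestamps and spot staleness.
    return {**_CACHE["stats"], "now": time.time()}
```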

2026-05-06 · fdf9608

db.stats(): include per-source row counts

The /freshness endpoint and the /status board both surface a "Sources" widget driven by stats().by_source — but stats() never populated it, so the dashboards rendered "0 sources". Adds a pylist sweep over the source column (already loaded for the min/max scan) and groups counts. No additional DB scan: reuses the same fetch.
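Grouping counts from a column that is already in memory can be as small as the sketch below. The function signature and the returned keys other than by_source are assumptions, and it presumes published_at values are ISO-8601 strings in one uniform format so that string comparison orders them correctly.

```python
from collections import Counter
from typing import Dict, List


def stats(published_at: List[str], sources: List[str]) -> Dict[str, object]:
    """Summarise two already-fetched columns without a second scan."""
    return {
        "rows": len(sources),
        "oldest": min(published_at) if published_at else None,
        "newest": max(published_at) if published_at else None,
        "by_source": dict(Counter(sources)),
    }


print(stats(
    ["2026-05-01T00:00:00Z", "2026-05-06T12:00:00Z", "2026-04-30T08:00:00Z"],
    ["hn", "arxiv", "hn"],
))
```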

2026-05-06 · dcd7b88

CI: pytest + tsc + npm test + changelog build on push and PR

Three jobs in .github/workflows/test.yml:
- python: pytest with eval/ excluded (live-network suite, not for CI).
- typescript: cd clients/fillin-ts; tsc --noEmit; npm test; npm run build.
- changelog: runs tools/build_changelog.py and asserts the output is >5KB, so a regression in the generator can't ship green.
Caches pip and npm by lockfile. Triggers on push to main, PR to main, and workflow_dispatch.

2026-05-06 · a2f7fc5

Add @fillin/client TypeScript SDK

Most agent code in 2026 is TypeScript (Mastra, LangGraph TS, Vercel AI SDK, Cloudflare Agents). Python-only excluded that half of the dev population; this opens it. - src/index.ts: FillinClient with .query / .stats / .freshness / .health / .paymentInfo. Bearer auth via apiKey option, with an injectable fetch for testing or custom transports. AbortController honors timeoutMs (default 30s). FillinError carries .status + .body. - test/client.test.ts: three smoke tests — bearer header gets sent, freshness works without a key, non-2xx throws FillinError. Uses node:test, no external test framework. - README.md: install, quickstart, options table, methods, errors. - tsconfig.json + package.json wired for `npm run build` (tsc → dist/). - .gitignore so node_modules + dist stay out. Tests pass (3/3), tsc --noEmit clean. Not yet published to npm.

2026-05-06 · 850fa42

Site infrastructure: pricing, status, changelog, freshness, SEO

Builds the surface that was missing for a real product: - /pricing — flat $0.01 page covering both x402 and bearer rails, plus FAQ and a Stripe placeholder cell for when the checkout lands. - /status — live in-browser health board over /healthz, /freshness, /v1/payment/info, /agents.json. No server-side history yet (deferred until usage justifies a time-series store). - /changelog — generated from git log via tools/build_changelog.py; baked into deploy/install.sh so every deploy refreshes it. - /freshness — public corpus stats endpoint (no auth) backing both the status board and a new freshness strip on the landing page. - /robots.txt + /sitemap.xml + og.svg + favicon.svg — basic SEO and social-share surface (was zero before). - _shared.css — shared design tokens for sub-pages so they match index without each duplicating ~500 lines of CSS. api.py exposes all of the above plus _serve_static for asset routes (robots/sitemap/favicon/og/css). _serve_html keeps a JSON fallback so tests pass without staging the web/ directory. Landing-page nav now actually links to onboard.html, pitch.html, pricing, status, and changelog — they had been dangling html files with no entry points.

2026-05-06 · 78eeb1b

agents.json: declare HTTP transport as primary, stdio as alt

The card claimed transport=stdio but the production deploy at fillin.glyphapi.dev mounts FastMCP at /mcp via streamable-http. Agents that fetched the card and tried to spawn the stdio process would have looked at the wrong rail. Now lists both transports with streamable-http as primary and exposes the http_endpoint inline.

2026-05-06 · cf3c5fa

Fix test isolation: reload mcp_server before reloading api

Each api fixture reloaded api.py inside a fresh TestClient. The lifespan ran _mcp_instance.session_manager.run(), which raises after the first call on a given FastMCP instance. Because mcp_server wasn't reloaded, the singleton was reused and the second test in either test_api_limits.py or test_api_x402.py exploded — 14 errors total on every CI/local run. Reloading mcp_server first gives api.py a fresh FastMCP, so the session manager starts cleanly for each test. Before: 95 passed, 14 errors. After: 109 passed.

2026-05-06 · 47407d6

Serve landing/onboard/pitch HTML at /, /onboard.html, /pitch.html

Before this, web/index.html (1238 lines) was rsynced to the VPS but no route served it — GET / returned the JSON stub. Visitors to the canonical domain saw {"product":"Fillin","tagline":"..."} instead of the landing page. Adds three FileResponse routes that fall back to a small JSON payload when the file is missing (so tests run without staging the web/ dir). Also tracks web/pitch.html which had been left untracked since Apr 30.

2026-05-06 · 8cb6838

Swap fillin.dev → fillin.glyphapi.dev across docs and config

User does not own fillin.dev — the domain's NS records point at Vercel and currently return DEPLOYMENT_NOT_FOUND. Anyone who registers it later could harvest bearer tokens and x402 payment headers from agents that wired up via the README. Canonical domain is fillin.glyphapi.dev (snapback VPS, server_name in deploy/nginx-fillin.conf:22). 24 string references across README, agents.json, mcp_server.py, embeddings.py, web/index.html, tools/load_test.py, and docs/.

2026-04-30 · 7d9e68b

Eval economics: real partial baseline (n=7-8 / 25 queries)

Three-arm head-to-head: claude-alone vs claude+websearch vs claude+fillin. Same model (Opus 4.7), same system prompt, same questions; only tools differ. Run halted at run 28/75 due to Anthropic credits exhaustion. Numbers below are from the 23 successful runs only: explicitly partial, explicitly noted.
Per-arm averages on 7-8 queries each:
claude alone: 147 in / 800 out / 0 tools / 1.88 cite / $0.021/q
claude+websearch: 36k in / 2010 out / 1.5 tools / 0.38 cite / $0.247/q
claude+fillin: 11k in / 1392 out / 2.1 tools / 4.29 cite / $0.110/q
Headline: at this query mix, Fillin used 3.4× fewer input tokens, cost 2.25× less per query, and produced 11× more inline citations than Anthropic's hosted web_search. Latency was modestly faster too. Total real spend across the 23 runs: $2.91.
Caveats are documented openly in eval/baseline.md: sample is small, citation metric is "URLs in answer text" (websearch uses footnote-style refs that don't render as URLs), and the query mix is biased toward release notes + research papers where Fillin's corpus is well-aligned.
To complete: top up credits + python tools/eval_economics.py. ~$6 to finish.
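The headline ratios can be re-derived from the per-arm averages above. This is a quick check, not part of tools/eval_economics.py; the dictionary layout is an assumption. From the rounded token averages (36k and 11k) the input-token ratio comes out near 3.3×, so the 3.4× headline presumably uses unrounded counts.

```python
# Per-arm averages as quoted in the entry (token counts are rounded there).
arms = {
    "claude alone":     {"in_tokens": 147,    "cite": 1.88, "cost": 0.021},
    "claude+websearch": {"in_tokens": 36_000, "cite": 0.38, "cost": 0.247},
    "claude+fillin":    {"in_tokens": 11_000, "cite": 4.29, "cost": 0.110},
}

web = arms["claude+websearch"]
fillin = arms["claude+fillin"]

cost_ratio = web["cost"] / fillin["cost"]             # how much cheaper fillin is
cite_ratio = fillin["cite"] / web["cite"]             # how many more citations
token_ratio = web["in_tokens"] / fillin["in_tokens"]  # how many fewer input tokens

print(f"cost:      {cost_ratio:.2f}x less per query")
print(f"citations: {cite_ratio:.1f}x more")
print(f"tokens:    {token_ratio:.2f}x fewer (from rounded averages)")
```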

2026-04-30 · eb270af

Reframe onboarding: model is the car, cutoff is the odometer

Old onboarding asked "pick your harness" first — but the harness is just plumbing. The thing Fillin actually solves for is the model's cutoff. New flow puts that first. Hero: "Your agent is driving with outdated maps." Sublede: "Fillin is the embeddings & context-engineering service that closes the gap, on every query, autonomously." Step 1 is now MODEL — pick from 12 cards (Claude Opus 4.7, Sonnet 4.6, GPT-5.5, Gemini 2.5, Llama 4, Grok 4, Mistral L3, etc.) each showing its training cutoff. Custom date row underneath. The moment a model is selected, an inline gap-readout panel reveals: - days of blind spot - estimated # arxiv papers, HN posts, GH releases missed - sharp pitch: "without Fillin your agent refuses or hallucinates" Step 2 is harness selection (was step 1). Step 3 config snippet auto-populates from the picked harness. Steps 4-5 verify + first query (cutoff pre-filled from step 1). Step 6 done state shows the model + cutoff that's now wired. URL hash captures m=, c=, h=, s= so any state is shareable as a link. Reorder reflects the strategic claim: the user is wiring up *context engineering for an agent that runs an X with cutoff Y*, not just "installing a tool".

2026-04-30 · e82890b

Agent onboarding wizard + CORS for browser-based MCP calls

web/onboard.html is a 5-step interactive onboarding flow:
1. Pick your harness (Claude Code, Cursor, Continue, Zed, Eliza, curl)
2. Drop in tailored MCP config (copy-button, harness-aware)
3. Verify connection: live fetch to /mcp/ shows reachability + corpus stats inline
4. Run a real query: input cutoff + topic, see ranked results with clickable source URLs, plus copy-able curl version
5. Done state: what tools are now available, links to next steps
URL-hash state so steps are linkable. Live status indicator. Same aesthetic as web/index.html and web/pitch.html. Bootstrap nginx now ships CORS headers so pages from any origin (including file://) can hit /mcp/ for verification + queries. ACAO=* is fine because every authenticated path is still gated by bearer or x402; CORS just controls whether the *browser* lets the calling JS read the response. Live nginx already patched; this keeps the bootstrap in sync.

2026-04-30 · cd79e14

arxiv: bump max_pages 4→30 so backfills can paginate past ~48h

Each fetch() call starts at start=0 and pages forward in submittedDate-desc. With max_pages=4 × PAGE_SIZE=50 = 200 papers per call, we never see beyond the most recent ~2 days at current ~100 papers/day in our cats. Caught when chunked backfill produced 0 net new docs after the 48h window — every subsequent chunk re-fetched the same 200 papers and dedup-filtered them. 30 pages × 50 = 1500 paper ceiling, ~15 days at current rate, plenty for both incremental and historical backfills. Per-page throttle still 3s, so worst-case wall clock is ~90s.
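The pagination arithmetic above, spelled out. The constants are the figures quoted in the entry; the function names are only for illustration.

```python
PAGE_SIZE = 50          # papers per page
PAPERS_PER_DAY = 100    # approximate rate quoted in the entry
THROTTLE_SECONDS = 3    # per-page delay


def paper_ceiling(max_pages: int) -> int:
    """Most papers a single fetch() can see before it stops paging."""
    return max_pages * PAGE_SIZE


def days_covered(max_pages: int) -> float:
    """How far back that ceiling reaches at the quoted daily rate."""
    return paper_ceiling(max_pages) / PAPERS_PER_DAY


for pages in (4, 30):
    print(
        f"max_pages={pages}: {paper_ceiling(pages)} papers, "
        f"~{days_covered(pages):g} days, "
        f"throttle wait up to {pages * THROTTLE_SECONDS}s"
    )
```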

2026-04-30 · 8b22564

arxiv: bump request timeout 30→60s; submittedDate-desc is slow for large windows

2026-04-30 · d25e39d

Fix arXiv ingest: space-as-OR encoding (was returning 0 silently)

_search_query joined categories with a literal "+OR+", assuming the URL would carry it through unchanged. requests percent-encoded each '+' in params to '%2B', which arXiv interprets as a literal plus character (not the URL-form '+' = space), so the API silently returned an empty feed. Switch to " OR " as the connector. requests then form-encodes the spaces as '+', which is exactly the wire format arXiv wants. Verified live: 88 papers in the last 24h across cs.{AI,LG,CL,CR,DC}. This fills the second-largest gap in our corpus (rss came back with 51 docs, gh_releases with 12, arXiv was 0 since the multi-source refactor landed).
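The encoding difference is easy to reproduce with the standard library's urlencode, which form-encodes query parameters the same way for these characters. The two-category query below is illustrative and not the exact string _search_query builds.

```python
from urllib.parse import urlencode

# Illustrative categories; the real query covers more of them.
cats = ["cat:cs.AI", "cat:cs.LG"]

broken = urlencode({"search_query": "+OR+".join(cats)})
fixed = urlencode({"search_query": " OR ".join(cats)})

print(broken)  # each literal '+' becomes %2B, i.e. a real plus sign
print(fixed)   # each space becomes '+', the form-encoding of a space
```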

2026-04-29 · 1eddf0d

Fix scheduler: drop max_pages kwarg that ingest() no longer accepts

scheduler.py was written in v0 to call ingest(hours, max_pages). The multi-source refactor (commit 27f56cf) dropped max_pages from ingest() in favor of per-source defaults; scheduler.py was never updated. Every tick since has raised a TypeError that was logged to the journal and surfaced nowhere else. Corpus went stale by ~4 days; only caught when checking MCP status today. Tests don't exercise the scheduler entrypoint, and that's the blind spot. Adding an integration smoke test for it is queued for next pass. After deploy, run a one-time backfill: ingest --hours 96 to recover the gap.

2026-04-29 · 82be046

Audit log: M1 + M2 closed and verified live

Both Medium findings resolved. Live HTTPS responses now carry Strict-Transport-Security and Content-Security-Policy headers; live config patched in-place. Bootstrap config refuses non-ACME HTTP traffic with 503 on a fresh deploy. Global exception handler registered as defense-in-depth against future debug=True slippage.

2026-04-29 · bdab800

Fix MEDIUM: bootstrap nginx exposes API in cleartext, no global exc handler

M1: deploy/nginx-fillin.conf bootstrap was proxying API traffic on plain HTTP between install.sh and certbot. Bearer tokens and signed payment headers leaked. Fix: bootstrap now refuses with 503 except for ACME challenge path. certbot --nginx --redirect overwrites the catch-all with a 301 to HTTPS once the cert is in. Hardening headers (HSTS, CSP) baked into the bootstrap so the post-cert config inherits them. HSTS pinned at max-age=63072000 with includeSubDomains. preload is deliberately omitted: it submits to browser preload lists and is effectively irreversible. Promote later when the domain is months-stable. CSP: default-src 'none'; frame-ancestors 'none'; (API serves JSON only). Live config patched in-place with the same headers; verified live, with HSTS + CSP appearing on every response.
M2: api.py had no global Exception handler. A future debug=True regression would leak full Python tracebacks on /query. Added a @app.exception_handler(Exception) that logs the traceback server-side and returns a generic {"error":"internal_error"} 500. FastAPI's HTTPException + RequestValidationError handlers fire first, so this is purely the catch-all for unexpected errors.

2026-04-28 · 7211fc6

Audit log: H1 (rate-limit-bucket-collapse) closed and verified

Verified live on snapback with bucket=2: IP A (1.1.1.1) and IP B (2.2.2.2) each got 200/200/429 from separate buckets. Pre-fix would have shared one bucket (A: 200/200/429, B: 429/429/429).
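The two status sequences above can be reproduced with a toy counter. This is a sketch, not slowapi: `run` and the key functions are hypothetical, and the limiter is a plain per-key count with limit 2.

```python
from collections import defaultdict


def run(requests, key_for, limit=2):
    """Apply a per-key request cap and return (real_ip, status) pairs."""
    counts = defaultdict(int)
    statuses = []
    for peer_ip, real_ip in requests:
        key = key_for(peer_ip, real_ip)
        counts[key] += 1
        statuses.append((real_ip, 200 if counts[key] <= limit else 429))
    return statuses


# Behind nginx every connection arrives from the loopback peer.
requests = [("127.0.0.1", "1.1.1.1")] * 3 + [("127.0.0.1", "2.2.2.2")] * 3

# Pre-fix: keyed on the peer address, so every client shares one bucket.
pre_fix = run(requests, key_for=lambda peer, real: peer)
# Post-fix: keyed on the real client address.
post_fix = run(requests, key_for=lambda peer, real: real)

pre_b = [status for ip, status in pre_fix if ip == "2.2.2.2"]
post_a = [status for ip, status in post_fix if ip == "1.1.1.1"]
post_b = [status for ip, status in post_fix if ip == "2.2.2.2"]
```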

2026-04-28 · 224a90d

Fix HIGH: per-IP rate limits collapse to one bucket behind nginx

/security-scan caught it. uvicorn was launched without --proxy-headers, so scope["client"] was always 127.0.0.1 (nginx loopback). All three limits collapsed to a single global bucket: slowapi 30/min on /query, 10/min on /v1/payment/*, and the /mcp middleware's 60/min. Fix: add --proxy-headers --forwarded-allow-ips=127.0.0.1 to the systemd unit. uvicorn's ProxyHeadersMiddleware then rewrites scope["client"] from X-Forwarded-For *only when the connecting IP is 127.0.0.1*, so public clients cannot spoof. nginx is the sole trusted proxy. Both slowapi (via get_remote_address) and the /mcp middleware (via scope["client"]) now key on the real client IP automatically. Updated the comment that lied about --proxy-headers being set.
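The trust rule described above can be sketched as a small function. This is a simplification and not uvicorn's ProxyHeadersMiddleware: `resolve_client_ip` is a hypothetical name, and taking the rightmost X-Forwarded-For entry assumes nginx appends the connecting address to that header.

```python
TRUSTED_PROXIES = {"127.0.0.1"}  # nginx on loopback is the only trusted hop


def resolve_client_ip(peer_ip, headers, trusted=TRUSTED_PROXIES):
    """Honor X-Forwarded-For only when the connecting peer is a trusted proxy."""
    if peer_ip not in trusted:
        # A public client connecting directly cannot spoof its address.
        return peer_ip
    forwarded = headers.get("x-forwarded-for", "")
    hops = [hop.strip() for hop in forwarded.split(",") if hop.strip()]
    # The rightmost hop is the one appended by the trusted proxy itself.
    return hops[-1] if hops else peer_ip


via_nginx = resolve_client_ip("127.0.0.1", {"x-forwarded-for": "1.1.1.1"})
direct_spoof = resolve_client_ip("8.8.8.8", {"x-forwarded-for": "1.1.1.1"})
client_spoof = resolve_client_ip(
    "127.0.0.1", {"x-forwarded-for": "9.9.9.9, 2.2.2.2"}
)
no_header = resolve_client_ip("127.0.0.1", {})
```

Keying rate limits on the resolved address gives each public client its own bucket while ignoring forwarded headers from anyone who is not the proxy.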

2026-04-25 · ac66c47

Security audit doc: 4 critical fixes + 2 P0/P1 opens

Captures the full audit run today: findings, what's fixed, what remains open, what was confirmed safe. Includes pre-deploy .env hygiene checklist to prevent the wallet-PK-in-env mistake from happening again. Open critical: rotate the testnet wallet that surfaced during the audit (plaintext FILLIN_WALLET_PRIVATE_KEY in /opt/fillin/.env). Move private keys to KMS/systemd-creds/age-encrypted file before mainnet flip. Verified safe: constant-time bearer compare, atomic spend-with-nonce, EIP-191 verification, micro-USDC accounting, refund-on-failure, /mcp rate limit (5×200 → 7×429 confirmed live), nginx SSE buffering off.

2026-04-25 · de964fd

nginx: disable buffering for SSE / streamable-HTTP MCP

Default proxy_buffering=on holds the entire text/event-stream response until the upstream closes. For MCP's streamable-HTTP transport, that means tool-call responses appear to hang from outside even though the server returned in <1s. Disable buffering on the proxy_pass so chunks flush to the client as they arrive. Live config patched in-place; this update keeps the bootstrap config in sync for fresh deploys. Caught while running the production-grade audit; before the fix, live fillin_query through HTTPS was indistinguishable from a 30s timeout. After: 0.57s.

2026-04-25 · e212e8a

Production-grade MCP fixes: deadlock, refund, rate limit

Audit found three real production issues. All fixed. C1 (CRITICAL, production-breaking): MCP tool *calls* deadlocked through the deployed HTTPS endpoint. fillin_query/stats/health each made HTTP loopback calls to localhost:8766 from inside the same FastAPI worker that was holding an SSE stream open for the MCP request. Classic event-loop self-deadlock; resolved only at the 30s upstream timeout. Smithery's probe passed because tools/list returns immediately, but no agent could actually run a tool. Fix: detect co-mounted mode (loopback host) and call db.query_delta / db.stats directly in-process. HTTP path preserved for stdio mode where someone points at a remote Fillin. C2: /mcp had no rate limit. slowapi only decorates FastAPI routes, not mounted Starlette apps, so the public MCP surface was wide-open while /query was capped at 30/min. Added a per-IP token-bucket as ASGI middleware; default 60/min, override via FILLIN_MCP_RPM. I1: x402 payers were charged before the query ran. A LanceDB lock or embedding failure left them out-of-pocket. Added credits.refund() and wrapped query_delta in api.py so server-side errors roll back the deduction. Bearer mode unchanged (pays nothing). Verified locally: fillin_health 0.8s, fillin_query 7s (cold) returning real ranked results with proper scores in (0, 1].
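A per-IP token bucket of the kind described in C2 might look like the sketch below. It is not the shipped middleware: the class name, the injectable clock, and the in-memory dict are assumptions; only the 60/min default and the FILLIN_MCP_RPM override come from the entry above.

```python
import os
import time


class PerIPTokenBucket:
    """Token bucket per client IP: capacity N, refilled at N tokens per minute."""

    def __init__(self, rate_per_min=None, clock=time.monotonic):
        if rate_per_min is None:
            rate_per_min = int(os.environ.get("FILLIN_MCP_RPM", "60"))
        self.capacity = float(rate_per_min)
        self.refill_per_sec = rate_per_min / 60.0
        self.clock = clock
        self.state = {}  # ip -> (tokens, last_timestamp)

    def allow(self, ip):
        now = self.clock()
        tokens, last = self.state.get(ip, (self.capacity, now))
        # Refill for the elapsed time, never above capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self.state[ip] = (tokens, now)
        return allowed


# Deterministic demo with a fake clock.
fake_now = [0.0]
bucket = PerIPTokenBucket(rate_per_min=60, clock=lambda: fake_now[0])
burst = [bucket.allow("1.1.1.1") for _ in range(61)]
other_client = bucket.allow("2.2.2.2")
fake_now[0] += 1.0  # one second later, one token has refilled
after_refill = [bucket.allow("1.1.1.1") for _ in range(2)]
```

An ASGI middleware would call `allow()` with the client address and answer 429 when it returns False.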

2026-04-25 · 60f207d

deploy/install.sh: don't clobber certbot's nginx mods on re-deploy

Each subsequent deploy overwrote /etc/nginx/sites-available/fillin with the bootstrap (HTTP-only) config, wiping certbot's HTTPS server block. Caused HTTPS to break after every redeploy until 'certbot --nginx --reinstall' was re-run. Fix: only copy the bootstrap config on first deploy (when the file doesn't exist). After that, certbot owns the file. nginx -t still runs to validate. Caught while wiring fillin.glyphapi.dev for smithery.

2026-04-25 · c18de4b

Allow public hosts through FastMCP's DNS-rebinding protection

FastMCP defaults allowed_hosts to localhost variants. Production hits return 421 'Invalid Host header'. Override via FILLIN_MCP_HOSTS env (comma-separated) and seed with fillin.glyphapi.dev so the live deploy serves /mcp/ to remote clients. Caught by smithery's probe failing.

2026-04-25 · 26af976

HTTP MCP transport: mount FastMCP at /mcp inside the FastAPI app

Smithery and most agent harnesses talk to MCP over Streamable HTTP, not stdio. Same FastMCP instance now serves both — stdio when run as 'python mcp_server.py', HTTP at https://<host>/mcp when api.py boots. - mcp_server.py: stateless_http=True, path "/" so mount at /mcp lands cleanly - api.py: lifespan now wraps mcp.session_manager.run(); mount /mcp → streamable_http_app() exposes the same fillin_query / fillin_stats / fillin_health tools to remote clients Verified locally — POST /mcp/ initialize returns server info, tools/list returns all 3 tool schemas. Co-located in one process so loopback overhead is negligible.

2026-04-25 · d88e896

Submission packet + agents.json manifest

User approved MCP catalog submission. Prepping the artifacts: - agents.json (root): full manifest covering offer, endpoints, pricing, payment rails, MCP tool schemas, the 4-call agent-bootstrap recipe, rate limits, guarantees. The single URL an agent should be able to read to bootstrap an integration without a human. - api.py: GET /agents.json + GET /.well-known/agents.json serve it (both standard discovery paths covered). - docs/SUBMISSION.md: copy-paste packets for smithery.ai, mcp.so, the punkpeye/awesome-mcp-servers PR (with diff line + body), parallel awesome-list targets, Twitter draft, HN cadence, and crypto-native registry plan (Olas, Virtuals, Eliza). Pre-submission checklist included. The actual catalog clicks + GitHub OAuth need a human — that's the user's part. Everything else is ready.

2026-04-25 · 502be0c

Fix two bugs found via dogfood: score field + similarity formula

Ran agent.py against a fresh local Fillin (1693 docs, 4 sources). Two quality bugs surfaced — neither would have shown up in unit tests because both depend on real LanceDB distance distributions. 1. api.py:271 was returning raw L2 _distance as the API "score" field, ignoring the reranked score entirely. Reranking was running but the client never saw it. 2. ranking.py:effective_score used `1 - distance`, which clamps to zero for any distance > 1.0. LanceDB returns L2 distance on unit-normalized MiniLM vectors — typical "decent match" distances are 1.0-1.4, so ~80% of results collapsed to score=0 and the original LanceDB order leaked through (defeating the rerank). Fix: similarity = 1 / (1 + distance). Smooth monotonic decay in (0, 1]. No collapse, no orthogonal-equals-opposite degeneracy. Also discovered (not yet fixed): - RSS SSL cert verify fails on macOS Python; 4/5 default feeds bounce - arXiv 5-day window returned 0 docs (probably a search-shape mismatch) Tests: 19/19 passing. Re-ran the agent against the fix; top results for "new LangChain releases" are now actual github.com/langchain-ai release notes, not random HN posts.
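The difference between the two formulas is easy to see on a few distances. A minimal sketch, assuming the old code clamped `1 - distance` at zero as described; the function names and sample distances are illustrative, not taken from ranking.py.

```python
def old_similarity(distance):
    """Pre-fix formula: collapses to zero for any distance above 1.0."""
    return max(0.0, 1.0 - distance)


def new_similarity(distance):
    """Post-fix formula: smooth, monotonically decreasing, always in (0, 1]."""
    return 1.0 / (1.0 + distance)


# Illustrative L2 distances, including the 1.0-1.4 "decent match" range.
distances = [0.6, 1.0, 1.2, 1.4]
old_scores = [old_similarity(d) for d in distances]
new_scores = [new_similarity(d) for d in distances]

# With the old formula the last three results tie at zero, so their
# relative order is decided by whatever order they arrived in.
old_ties = old_scores.count(0.0)
```

With the new formula every distance maps to a distinct positive score, so the rerank factors actually change the final order.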

2026-04-25 · 5abf820

ZHC strategy: rails for agents that buy services themselves

The Stripe path was last decade's framing. Real strategic claim: Fillin's x402+USDC+Base+MCP stack is uniquely fitted to autonomous agents and Zero Human Companies — wallet-native auth, no KYC, MCP-catalog-discoverable, priced atomically per query. - 30-day rails plan: MCP catalog submission, agents.json manifest, fillin_account/fillin_estimate_cost tools, register on Olas + Virtuals + Eliza, hunt 5 ZHC customers (autonomous trading bots, DAO digesters, Eliza-deployed character agents, etc). - Reprioritization: defer Stripe, lean into crypto-native rails. Stripe becomes the secondary path once humans-evaluating-for-ZHCs ask for it. - ZHC profile + spend math: monitoring agents at 50 queries/min hit $20k/mo per customer — one such customer beats 1k hobbyists.
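The spend figure can be checked with integer arithmetic. This sketch assumes the $0.01 per-query price, a 30-day month, and an agent that queries continuously at 50 queries/min; under those assumptions the total comes out slightly above the rounded $20k.

```python
PRICE_MICRO_USDC = 10_000          # $0.01 per query, in micro-USDC
QUERIES_PER_MIN = 50
MINUTES_PER_MONTH = 60 * 24 * 30   # assumes a 30-day month, always on

monthly_queries = QUERIES_PER_MIN * MINUTES_PER_MONTH
monthly_micro_usdc = monthly_queries * PRICE_MICRO_USDC
monthly_usd = monthly_micro_usdc / 1_000_000
```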

2026-04-25 · 1a108a6

Outreach playbook + Stripe wiring plan

Two artifacts to drive Phase 5 of the launch (commercial traction): - docs/OUTREACH.md: 3 cold-email templates (agent frameworks, daily-brief products, research/dev-tools shops), tier-1/2/3 target list (~30 names), cadence + personalization rules. - docs/STRIPE_PLAN.md: end-to-end wiring plan to add card payments alongside the existing x402 handshake. Stripe Checkout (not Elements), one-time SKUs (not subs), reuse credits ledger via "fk_" key prefix, webhook signature verification non-negotiable. Implementation in 4-6h. No code yet — these are the artifacts the commercial push runs against.

2026-04-25 · 67bb951

Landing page parity + fillin_health MCP tool

Landing page was claiming a 2-source corpus + single-embedder stack; neither is true anymore. Pulled it back into reality. - web/index.html: stack chips reflect 4 sources, pluggable embeddings, authority×recency rerank, MCP server. "How it works" copy now names the rerank step honestly. - mcp_server.py: fillin_health() — reachable, host, rows, earliest, latest. Lets an agent decide whether to call fillin_query at all.

2026-04-25 · e352816

Authority + recency reranking: arxiv & GH releases beat HN noise

v2 roadmap item — once corpus has 4 sources of mixed quality, pure cosine similarity is the wrong final order. Pull 3× candidates from LanceDB, rerank by similarity × authority × recency, slice to k. - ranking.py: SOURCE_AUTHORITY (arxiv/GH = 0.95, RSS = 0.75, HN = 0.70), exp recency decay with 90-day half-life. Both tunable via env (FILLIN_AUTHORITY JSON, FILLIN_RECENCY_HALF_LIFE). - db.py: query_delta now pulls k * RERANK_FACTOR, reranks, slices. Each result carries a `score` field so callers can show grounding confidence. - 12 new tests, 74 passing.
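The rerank step can be sketched as follows. The authority values and the 90-day half-life come from the entry above; the dictionary keys, the candidate shape, and the 0.5 fallback for unknown sources are assumptions rather than the contents of ranking.py.

```python
SOURCE_AUTHORITY = {  # key names are assumed
    "arxiv": 0.95,
    "github_releases": 0.95,
    "rss": 0.75,
    "hackernews": 0.70,
}
HALF_LIFE_DAYS = 90.0
RERANK_FACTOR = 3  # fetch k * RERANK_FACTOR candidates before reranking


def recency(age_days, half_life=HALF_LIFE_DAYS):
    """Exponential decay: 1.0 for a fresh doc, 0.5 after one half-life."""
    return 0.5 ** (max(age_days, 0.0) / half_life)


def effective_score(similarity, source, age_days):
    authority = SOURCE_AUTHORITY.get(source, 0.5)  # fallback is an assumption
    return similarity * authority * recency(age_days)


def rerank(candidates, k):
    """Sort candidates by similarity x authority x recency, keep the top k."""
    scored = [
        dict(c, score=effective_score(c["similarity"], c["source"], c["age_days"]))
        for c in candidates
    ]
    scored.sort(key=lambda c: c["score"], reverse=True)
    return scored[:k]


candidates = [
    {"id": "hn", "source": "hackernews", "similarity": 0.80, "age_days": 1},
    {"id": "arxiv", "source": "arxiv", "similarity": 0.70, "age_days": 1},
    {"id": "old", "source": "arxiv", "similarity": 0.70, "age_days": 180},
]
top_ids = [c["id"] for c in rerank(candidates, k=2)]
```

Here the arXiv paper overtakes a slightly more similar HN post, while the same paper at 180 days old drops out of the top two.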

2026-04-25 · 822b0fb

Corpus expansion: RSS + GitHub Releases sources land

Phase 4 of the launch plan: depth compounds silently while traffic arrives. Two new timestamped corpora wired into the existing fetch(hours) → (docs, dropped) contract. - sources/rss.py: feedparser-backed reader with HTML stripping, retry, cutoff filter. Default feeds curated for the LLM/dev/agent space; override with FILLIN_RSS_FEEDS. - sources/github_releases.py: REST API client for /repos/{slug}/releases with token-aware rate limiting (60/h → 5000/h with GITHUB_TOKEN). Drafts and prereleases are skipped. Default repos seed the agent-framework ecosystem; override with FILLIN_GITHUB_REPOS. - ingest.py: registers both under SOURCES so they're picked up by the default ingest pass and the scheduler. - db.py: lazy-import lancedb so source modules import cheaply (testable without the heavy DB dep). - 22 new tests, 62 passing total.
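The release filter and the token-aware budget could be sketched like this. The `draft`, `prerelease`, and `published_at` fields are part of GitHub's releases payload; the helper names and the sample data are hypothetical, and no network call is made.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value):
    """Parse an ISO-8601 timestamp such as 2026-04-25T12:00:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def keep_release(release, cutoff):
    """Skip drafts and prereleases, and anything published before the cutoff."""
    if release.get("draft") or release.get("prerelease"):
        return False
    published = release.get("published_at")
    if not published:
        return False
    return parse_ts(published) >= cutoff


def hourly_budget(token):
    """Requests per hour: 60 unauthenticated, 5000 with a token."""
    return 5000 if token else 60


now = datetime(2026, 4, 25, 18, 0, tzinfo=timezone.utc)
cutoff = now - timedelta(hours=24)
releases = [
    {"tag_name": "v1.0.0", "draft": False, "prerelease": False,
     "published_at": "2026-04-25T12:00:00Z"},
    {"tag_name": "v1.1.0-rc1", "draft": False, "prerelease": True,
     "published_at": "2026-04-25T13:00:00Z"},
    {"tag_name": "v0.9.0", "draft": False, "prerelease": False,
     "published_at": "2026-04-20T09:00:00Z"},
    {"tag_name": "wip", "draft": True, "prerelease": False,
     "published_at": None},
]
kept_tags = [r["tag_name"] for r in releases if keep_release(r, cutoff)]
```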

2026-04-24 · 5bc3bb6

Pluggable embeddings: MiniLM default, Perplexity via OpenRouter ready

Prep work so swapping the embedding backend is a one-env-var change, not a refactor. Default behavior unchanged — MiniLM stays on. Perplexity backend is wired and tested but dormant until FILLIN_EMBED_MODEL=perplexity is set and the corpus is re-embedded. - embeddings.py: Embedder protocol + MiniLMEmbedder + PerplexityEmbedder (OpenRouter, 32k context, $0.004/1M tok) + get_embedder singleton - db.py: schema dim pulled from active embedder at table-create time - tests/test_embeddings.py: 6 tests covering defaults, dims, auth gate 28 passing.

2026-04-24 · 71b71f5

Commercialization push: landing page, agent, MCP server, gstack

- web/index.html: single-file landing page with live SVG gap chart, looping agent-↔-Fillin demo, interactive onboarding panel (model/volume → USDC quote + copy-paste snippet) - examples/agent.py: Claude Opus 4.7 + tool-runner demo that calls fillin_query and synthesizes grounded answers with inline citations - mcp_server.py: MCP server exposing fillin_query / fillin_stats over stdio for Claude Code, Cursor, Continue, Zed, etc. - tools/load_test.py: concurrent hammer with p50/p95/p99 reporting - HN_POST.md: Show HN draft + operator notes - CLAUDE.md: project manifest + gstack skill routing rules - requirements.txt: pin anthropic + mcp

2026-04-22 · cc126b4

Add ONBOARDING.md + runnable Python client

Documents the full x402 signed-challenge flow for first-time paying customers: discovery → deposit → credit check → challenge → sign → query, with error-table, bearer-key shortcut, and TypeScript + Python snippets. examples/client.py is a dependency-light reference client. Round-tripped locally: the signatures it produces validate against payments/auth.py's verify_signature.

2026-04-22 · 27f56cf

Infra toward first paying customer: rate limits, query cap, arXiv source

Knocks out the two remaining High items from security-report.md and adds the second ingest source called out in FINDINGS.md's v1 roadmap. Rate limiting (slowapi, per-IP): - /query 30/min - /v1/payment/challenge 10/min - /v1/payment/account 10/min Limits can be disabled for tests with FILLIN_DISABLE_RATE_LIMIT=1; exceeding returns 429 with a structured JSON body. Input validation on /query: - query: max_length=512, min_length=1 (closes the embedder-DoS lever) - cutoff: datetime (was str with manual ISO parsing) — invalid input now fails at Pydantic boundary with 422 arXiv ingest source: - sources/hackernews.py — existing HN logic moved out of ingest.py - sources/arxiv.py — Atom feed for cs.AI/cs.LG/cs.CL/cs.CR/cs.DC, sorted desc, paged until out of window - ingest.py — orchestrator: iterates SOURCES, isolates per-source failures, embeds once, upserts once - scheduler.py unchanged; picks up both sources automatically Tests: 51 → 67. Adds test_api_limits.py (rate limits + validation), test_source_hn.py, test_source_arxiv.py, and rewrites test_ingest.py to cover the orchestrator (source merging, failure isolation, subset flag).

2026-04-22 · 1f6cc99

Harden fillin: tests, efficiency, and signed x402 handshake

Lifts the seven "fine" components from /sitrep to "works great" and fixes the CRITICAL finding from /security-scan. Hardening: - db.upsert: LanceDB merge_insert replaces O(n) in-Python id dedup - db.stats: pyarrow column-scan min/max, no row-dict materialization - ingest.fetch_page: 3-attempt exponential backoff on RequestException - ingest.ingest: counts and logs dropped malformed hits - wallet: explicit module cache with reset_cache(); is_configured() now validates checksum, not just env presence; api lifespan hook calls init() - credits.spend / credit_deposit: reject non-positive amounts; add deposits_address_idx Critical fix — EIP-191 signed-challenge handshake on /query: Previously X-Payer-Address was trusted as payer identity; anyone could drain any funded account after address enumeration via /v1/payment/account. Now every paid /query requires X-Payer-Address, X-Payer-Nonce, and X-Payer-Signature. POST /v1/payment/challenge mints a one-shot nonce + canonical message; the server recovers the signer and rejects 401 on mismatch. Nonce consumption + credit deduction are a single SQLite BEGIN IMMEDIATE transaction — single-use, replay-proof, and a failed spend rolls back nonce consumption. Tests: 0 → 51, all green. Covers db, ingest, wallet, credits, EIP-191 recovery, nonce lifecycle, and end-to-end attacker scenarios (spoofed address, wrong signer, replay, unfunded-but-signed) via FastAPI TestClient. security-report.md documents the remaining High/Medium/Low items for a follow-up pass (rate limiting, query max_length, balance enumeration, dep pinning).
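The single-transaction nonce and spend step can be sketched with the standard sqlite3 module. Signature recovery is omitted, and the table layout, column names, and function names are assumptions rather than the actual schema in payments/credits.py.

```python
import sqlite3


def open_ledger():
    # isolation_level=None: autocommit mode, transactions are managed explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE nonces (nonce TEXT PRIMARY KEY, address TEXT NOT NULL,"
        " used INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE credits (address TEXT PRIMARY KEY,"
        " balance_micro INTEGER NOT NULL)"
    )
    return conn


def seed(conn, address, balance_micro, nonces):
    conn.execute("INSERT INTO credits VALUES (?, ?)", (address, balance_micro))
    for nonce in nonces:
        conn.execute(
            "INSERT INTO nonces (nonce, address) VALUES (?, ?)", (nonce, address)
        )


def consume_and_spend(conn, address, nonce, price_micro):
    """Consume a one-shot nonce and deduct credits atomically.

    Either both changes commit or neither does, so a failed spend
    does not burn the nonce and a used nonce cannot be replayed.
    """
    conn.execute("BEGIN IMMEDIATE")
    try:
        cur = conn.execute(
            "UPDATE nonces SET used = 1"
            " WHERE nonce = ? AND address = ? AND used = 0",
            (nonce, address),
        )
        if cur.rowcount != 1:
            raise ValueError("nonce unknown, already used, or wrong address")
        cur = conn.execute(
            "UPDATE credits SET balance_micro = balance_micro - ?"
            " WHERE address = ? AND balance_micro >= ?",
            (price_micro, address, price_micro),
        )
        if cur.rowcount != 1:
            raise ValueError("insufficient credits")
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")
        return False


def nonce_used(conn, nonce):
    return conn.execute(
        "SELECT used FROM nonces WHERE nonce = ?", (nonce,)
    ).fetchone()[0]
```

In this sketch a replayed nonce fails at the first UPDATE, and an underfunded request rolls back the nonce consumption along with the failed deduction.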

2026-04-21 · 4738a8d

x402 payments: Base USDC wallet + monitor + credits ledger

- payments/wallet.py: web3.py wrapper mirroring GLYPH's wallet.ts, with Base sepolia/mainnet USDC support, self-custodied key, balance checks. - payments/credits.py: SQLite-backed per-payer credit ledger, micro-USDC accounting (no float drift), idempotent deposits by tx_hash, atomic spend with conditional UPDATE. - payments/monitor.py: polls Base for USDC Transfer events into our wallet and credits sender addresses. Confirmations=2, cold-starts at head-1 so we don't replay history. Runs as its own systemd unit. - api.py: /query now accepts EITHER a valid FILLIN_API_KEY bearer OR an X-Payer-Address header whose credits cover the price. Unpaid requests return 402 with pay_to, network, usdc_contract, and next-step instructions. New GET /v1/payment/account/{addr} returns balance. Bearer compare stays constant-time. - deploy/nginx-fillin.conf: HTTP bootstrap site for fillin.glyphapi.dev (TLS to be added by certbot --nginx after DNS propagates). - deploy/fillin-monitor.service: hardened systemd unit for the monitor. - deploy/install.sh: now also installs monitor unit + nginx site. - All three units: PYTHONUNBUFFERED=1 for readable journal logs. - requirements.txt: web3>=7, eth-account>=0.11. Price locked at $0.01 in USDC per query (Starfleet order: commodity pricing was leaving money on the table for a product with a time-in-market moat). Verified live on snapback: unauthed POST /query → 402 with payment instructions; /v1/payment/info returns wallet 0x763C… on base-sepolia; monitor watching block 40503061+.
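Two of the ledger properties above, integer micro-USDC accounting and deposits that are idempotent by tx_hash, can be sketched with sqlite3. The schema and function names are assumptions for illustration, not the contents of payments/credits.py.

```python
import sqlite3
from decimal import Decimal

MICRO = 1_000_000  # 1 USDC = 1,000,000 micro-USDC


def to_micro(usdc):
    """Convert a decimal string such as '0.01' to integer micro-USDC."""
    return int(Decimal(usdc) * MICRO)


def open_ledger():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE deposits (tx_hash TEXT PRIMARY KEY, address TEXT NOT NULL,"
        " amount_micro INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE credits (address TEXT PRIMARY KEY,"
        " balance_micro INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def credit_deposit(conn, address, tx_hash, amount_micro):
    """Credit a deposit once; replays of the same tx_hash are no-ops."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO deposits VALUES (?, ?, ?)",
            (tx_hash, address, amount_micro),
        )
        if cur.rowcount == 0:
            return False  # this tx_hash was already credited
        conn.execute(
            "INSERT OR IGNORE INTO credits (address) VALUES (?)", (address,)
        )
        conn.execute(
            "UPDATE credits SET balance_micro = balance_micro + ? WHERE address = ?",
            (amount_micro, address),
        )
    return True


def balance(conn, address):
    row = conn.execute(
        "SELECT balance_micro FROM credits WHERE address = ?", (address,)
    ).fetchone()
    return row[0] if row else 0
```

Because the monitor may see the same Transfer event more than once, keying deposits on tx_hash keeps a repeated poll from crediting the payer twice.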

2026-04-20 · 420d378

Deploy Fillin to snapback: bearer auth + hardened systemd units

- api.py: bearer-token auth on /stats and /query via FILLIN_API_KEY env var; public /healthz for unauthed liveness. Uses secrets.compare_digest to avoid timing leaks. Returns 503 if key is not configured. - deploy/fillin-api.service + fillin-scheduler.service: systemd units with ProtectSystem=strict, ProtectHome, NoNewPrivileges, limited ReadWritePaths. Cache dirs redirected into /opt/fillin/.cache so the HF model download works under ProtectHome. - deploy/install.sh: idempotent rsync + venv + systemctl deploy. First run generates a cryptographically strong FILLIN_API_KEY (32 random bytes hex) into /opt/fillin/.env with chmod 600. Verified live on snapback:8766 — unauth→401, /healthz→200, authed /query returns today's news with gap_days=109.58.
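The bearer check described above can be sketched in a few lines. `secrets.compare_digest` and `secrets.token_hex` are standard-library calls; the function names and the bare status-code return are simplifications, not the code in api.py.

```python
import secrets


def generate_api_key():
    """32 random bytes rendered as 64 hex characters."""
    return secrets.token_hex(32)


def check_bearer(authorization_header, configured_key):
    """Return the HTTP status a request would get from the auth gate."""
    if not configured_key:
        return 503  # key not configured: refuse rather than run open
    prefix = "Bearer "
    if not authorization_header or not authorization_header.startswith(prefix):
        return 401
    presented = authorization_header[len(prefix):]
    # Constant-time comparison avoids leaking key prefixes through timing.
    if secrets.compare_digest(presented.encode(), configured_key.encode()):
        return 200
    return 401


key = generate_api_key()
authed = check_bearer(f"Bearer {key}", key)
wrong_key = check_bearer("Bearer not-the-key", key)
missing_header = check_bearer(None, key)
unconfigured = check_bearer(f"Bearer {key}", "")
```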

2026-04-19 · d015361

Fillin v0 — time-series vector DB of the internet

Core components: - db.py: LanceDB schema + temporal delta query (cutoff + semantic similarity) - ingest.py: Hacker News Algolia ingestion with dedup - api.py: FastAPI POST /query, GET /stats - demo.py: CLI simulating an LLM with a cutoff asking the delta - scheduler.py: daemon loop for always-on ingestion (the moat clock) - eval.py: wedge proof vs HN Algolia keyword baseline - FINDINGS.md: 2026-04-19 eval — recall wins, relevance needs better embedder Stack: LanceDB + sentence-transformers MiniLM-L6-v2 + FastAPI + requests. 300 HN docs indexed at first ingestion, covering 2026-04-19 window.