Changelog

What shipped.

Generated from git history, most recent first.

2026-05-21 · 6cb1652
fix(deploy/setup-stripe): poll /v1/health before smoke-testing checkout
The 3s sleep before the smoke test was shorter than fillin-api's cold-start (embedder + corpus cache warmup is ~60s), causing setup-stripe.sh to print a false-positive 502 even when Stripe was wired correctly. Poll /v1/health with a 180s deadline so the script reports based on the real terminal state.
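A minimal sketch of the poll-with-deadline pattern described above, written in Python for illustration (the real script is shell). The function names, the 2s poll interval, and the injectable `probe` argument are assumptions; only the `/v1/health` path and the 180s deadline come from the entry.

```python
import time
import urllib.error
import urllib.request


def _http_ok(url):
    # Any connection error, timeout, or non-200 counts as "not ready yet".
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_for_health(url, deadline_s=180.0, interval_s=2.0, probe=None):
    """Poll `url` until the probe succeeds or the deadline passes.

    Returns True if the service became healthy, False on timeout.
    `probe` is injectable (hypothetical) so the loop can run without a network.
    """
    probe = probe or _http_ok
    stop_at = time.monotonic() + deadline_s
    while True:
        if probe(url):
            return True
        if time.monotonic() + interval_s > stop_at:
            return False  # next attempt would land past the deadline
        time.sleep(interval_s)
```

The smoke test would then run only after `wait_for_health(...)` returns True, so a 502 reflects the terminal state rather than a cold start.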
2026-05-20 · 88c2fbc
feat: payments receipts + push subscriptions + distribution surfaces
Three concurrent threads toward the "Fillin Magnificent" plan items:
#2 Payment Rails: close the receipts gap and make Stripe go-live a one-command op.
- new `/v1/payment/transactions/{address}` returns on-chain USDC deposit history
- new `/v1/billing/transactions` returns the bearer's per-Stripe-event credit log
- `bearer_credit_events` table + `record_bearer_credit_event()` write hook
- `deploy/setup-stripe.sh` interactive: prompts for secrets (hidden input), ships them through SSH stdin (never argv), rewrites STRIPE_* lines in /opt/fillin/.env, restarts fillin-api, smoke-tests /v1/billing/checkout
- `deploy/install.sh` now installs the templated `[email protected]` + `fillin-prune` units, and reconciles enabled instances against FILLIN_NETWORKS; this fixes the long-standing bug where install.sh re-enabled the legacy single-chain monitor on every run
- .env.example documents the multi-chain + Stripe block
#3 Push Subscriptions: turn Fillin from query endpoint into event bus.
- new `pubsub.py`: subscriptions table, topic-filter validator (source/min_severity/keywords/affected_ecosystem), in-process SSE registry, HMAC-signed webhook delivery
- LanceDB polling worker started under FastAPI lifespan; uses `ingested_at` cursor (the column was added in PR #2 specifically for this)
- `POST /v1/subscriptions` (auth: bearer with positive balance), `GET /v1/subscriptions`, `DELETE /v1/subscriptions/{id}`, `GET /v1/subscribe/{id}` (SSE with 15s heartbeat)
- webhook deliveries carry X-Fillin-Timestamp + X-Fillin-Signature (HMAC-SHA256 over `ts.body`, Stripe-style replay-proof)
#1 Distribution surfaces: make Fillin discoverable + installable everywhere.
- `/llms.txt` route serving web/llms.txt (agent-crawler discovery)
- `Dockerfile` for the FastAPI+MCP HTTP server
- `docs/integrations/cursor.md`: MCP config + Stripe funding walkthrough
- `docs/integrations/claude-desktop.md`: claude_desktop_config.json shim
- `docs/integrations/browse-sh.md`: ready-to-submit skill spec (browse.sh agents make external HTTP requests, so x402 settles correctly; verified design)
- `docs/integrations/mcp-registry-pr.md`: draft PR text for modelcontextprotocol/servers community list
Tests: 259 passing. New coverage in test_pubsub.py (15), test_api_subscriptions.py (9), test_api_discovery.py (3), additions in test_api_billing.py (4).
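A subscriber receiving these webhooks would need to check the signature and the timestamp. The sketch below shows one way to do that, assuming the signed payload is the timestamp, a literal dot, and the raw body, that the digest is hex-encoded, and that a 300s tolerance is acceptable. Those three details and the function names are assumptions; the entry only states HMAC-SHA256 over `ts.body` with the two headers.

```python
import hashlib
import hmac
import time


def sign(secret: bytes, ts: str, body: bytes) -> str:
    # Assumed layout: "<timestamp>.<raw body>", hex-encoded SHA-256 HMAC.
    return hmac.new(secret, ts.encode() + b"." + body, hashlib.sha256).hexdigest()


def verify(secret, ts, body, signature, tolerance_s=300, now=None):
    """Return True only for a fresh, correctly signed delivery.

    `ts` is the X-Fillin-Timestamp header value, `signature` the
    X-Fillin-Signature header value, `body` the raw request bytes.
    """
    now = time.time() if now is None else now
    try:
        age = abs(now - int(ts))
    except ValueError:
        return False  # timestamp header is not an integer
    if age > tolerance_s:
        return False  # stale or far-future timestamp: treat as a replay
    expected = sign(secret, ts, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

Because the timestamp is part of the signed payload, an attacker cannot refresh it on a captured delivery without invalidating the signature.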
2026-05-17 · 07bde4a
markets: new /query/markets slice — Polymarket + Kalshi + Manifold + Metaculus (#4)
Adds the fourth daily-snapshot slice. One MCP call surfaces every active market touching a topic across the four major prediction venues an agent would otherwise check independently. Priced at $0.05 against the four-venue rediscovery cost. sources/markets.py: keyless public APIs only. Per-venue fetcher pulls the active-market head, folds the load-bearing fields (question, current implied probability / yes-price, close date, volume, venue) into `text` so vector match catches "is X likely" queries. Defensive on Polymarket's outcomePrices shape (Gamma has shipped both JSON-encoded string and pre-decoded list). Manifold filters resolved markets out. Metaculus uses community_prediction.full.q2 as the canonical median.
ingest.py — registered in SOURCES so the standard `ingest(hours, sources)` orchestrator picks it up. api.py — /query/markets route, FILLIN_PRICE_MARKETS_USDC env (default $0.05), require_paid_or_key_markets dep, /.well-known/mcp/server-card entry. mcp_server.py — query_markets MCP tool mirrors the other slice tools' shape. Docstring is explicit that the price snapshots in `text` are first-sight only; for live pre-trade pricing, follow the venue url. scripts/ingest_markets.py + run_daily_snapshots.sh — fires markets last in the daily 1pm-MT cron, 30s after frontier. agents.json + smithery.yaml — declare the tool, pricing, example invocation. Corpora list now includes "markets". Tests: 222 → 228 (+6). Per-venue parsing tests for Polymarket (yes-price fold, dual price-shape tolerance), Kalshi (cents → display + close date + contract volume), Manifold (resolved-filter + probability render), Metaculus (community-median path). Plus a cross-venue dedup test. Known limitation (documented in script + tool docstring): prices in the corpus are point-in-time at first ingestion — db.upsert dedupes by id so existing market rows aren't refreshed. The slice answers *discovery* ("is there a market about X across the four venues") not live price. Follow-up PR can add db.replace_source('markets', ...) so daily cron refreshes price snapshots in place. --------- Co-authored-by: Christopher Harris <[email protected]> Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
2026-05-16 · b8387c2
Phase R: schema migration foundation + CVE answer-engine + Stripe top-up + multichain (#2)
* Phase R: schema migration foundation + CVE answer-engine columns Ships R.1 audit, R.2 typed-columns decision, R.3 ingested_at, and R.next severity_score + affected[] against the cves source. The CVE answer-engine SKU is now shippable — a buyer can act on a returned row (pin to patched_range) without a second hop to NVD or GHSA. SCHEMA_AUDIT.md — per-source field map for all 7 ingest paths; identifies the 4 highest-WTP discards (ingested_at, CVE severity, quality signals, GHSA/OSV patched_range). SCHEMA_DECISION.md — chose typed columns (B) over single JSON blob (A). Deciding field: GHSA/OSV affected[] tuple. JSON-blob path breaks LanceDB filter pushdown, the one-call MCP wedge, and the training-data SKU. Each new typed column names the SKU it unlocks. db.py — schema grows from 7 to 10 columns (+ingested_at, +severity_score, +affected). connect() auto-migrates pre-existing tables via add_columns — metadata-only op, existing 25k rows survive. upsert() server-stamps ingested_at and normalizes typed-CVE defaults so non-CVE sources land as null/empty. query_delta_in_sources gains min_severity filter for the CVE severity tier — pushed into the LanceDB WHERE clause, drops null-severity rows silently rather than treating them as 0. sources/cves.py — NVD pulls numeric CVSS baseScore (v3.1 → v3.0 → v2 fallback); GHSA reads cvss.score and pivots vulnerabilities[] into the uniform affected struct with patched_range; OSV flattens affected[].ranges[].events[] into the same shape. Title strings no longer carry the parenthetical severity tag — that's redundant once typed. mcp_server.py — query_cves accepts optional min_severity: float, threaded through both inproc and HTTP paths. Tests: 214 → 222 (+8). New: schema columns present, CVE row round-trip preserves severity + affected, non-CVE rows default to null/empty, NVD numeric baseScore + fallback + null, GHSA patched_range pivot, OSV range flattening. 
* papers: fix vapor ingest — use submittedOnDailyAt + date-granularity match HF daily papers source was returning empty results. Root cause: comparing the API's date-only field against a datetime cutoff with sub-day precision silently dropped every row. Switched to paper.submittedOnDailyAt with date-granularity comparison; 80 fresh papers backfilled on first run. * payments: Stripe top-up + multichain wallet + bearer ledger + prune Stripe Checkout → bearer ledger top-up flow: - payments/stripe_billing.py mints a raw bearer + hash, sends Stripe only the hash, stashes the raw key with a TTL, atomically claims it on /billing/success after Stripe API confirms paid + livemode match. - Webhook deduped by event id; requires payment_status==paid and livemode match before crediting. - scripts/check-no-stripe-keys.sh — pre-commit guard refuses to commit Stripe secrets (sk_live, rk_live, whsec_) per Stripe's #1 leak vector. api.py — /v1/billing/checkout + /v1/billing/webhook + /billing/success routes; multichain x402 (Base + Optimism + Arbitrum + Polygon) selected via payer header; Hit model + SliceQueryIn carry severity_score + affected + min_severity for the cves slice (R.next surface). payments/wallet.py — get_web3() now per-chain; cached singleton per USDC contract; never reads BASE_RPC_URL. payments/credits.py — bearer ledger (key_hash + balance) + nonce challenge table + pending_reveal table; spend/refund + probe quota + pruning primitives. payments/prune.py — periodic hygiene over the three ephemeral tables, idempotent DELETE WHERE stale, driven by fillin-prune.timer. Tests: +940 lines across test_api_billing, test_api_multichain, test_bearer_ledger, test_stripe_billing, test_pre_commit_hook, test_wallet — covers Stripe webhook signing, multichain payer routing, bearer claim-once semantics, pre-commit hook block-list. 
* deploy: systemd units for monitor + prune timer + nginx CSP + README deploy/[email protected] — per-chain templated unit; runs the multichain USDC deposit monitor as [email protected] etc. deploy/fillin-prune.service + .timer — daily ledger hygiene at 04:00 MT, runs payments/prune.py to sweep expired nonces, old probes, and unclaimed pending_reveal rows. deploy/nginx-fillin.conf — relax CSP for the rich landing pages (inline-eval + data: images required by the Pretext-style assets). deploy/README.md — install + rotate runbook for the new units. * web: hero corpus_match honesty + Stripe top-up signup + changelog web/index.html — hero lede now surfaces the corpus_match: strong|weak|none honesty signal so a visiting agent buyer sees that fillin returns the quality of its own retrieval, not just docs. web/signup.html — adds the Stripe-checkout top-up path alongside the existing trial-key mailto flow; key delivery is one-shot on the /billing/success page. web/changelog.html — catches up to Phase R (ingested_at, severity_score, affected[]), the daily snapshot slices, and the multichain x402 surface. --------- Co-authored-by: Christopher Harris <[email protected]> Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
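The papers fix in this entry turns on comparing a date-only field against a sub-day cutoff. The sketch below shows one way that class of bug can arise and how a date-granularity comparison avoids it. The function names, the example dates, and the 2-hour window are assumptions for illustration, not the code in the ingest path.

```python
from datetime import date, datetime, timedelta, timezone


def in_window_buggy(published: date, cutoff: datetime) -> bool:
    # A date-only value promoted to a datetime lands on midnight UTC,
    # so it loses to any cutoff later in the same day.
    as_midnight = datetime(published.year, published.month, published.day,
                           tzinfo=timezone.utc)
    return as_midnight >= cutoff


def in_window_fixed(published: date, cutoff: datetime) -> bool:
    # Compare at the granularity the upstream field actually has.
    return published >= cutoff.date()


# Hypothetical run: 19:00 UTC with a 2-hour window.
now = datetime(2026, 5, 16, 19, 0, tzinfo=timezone.utc)
cutoff = now - timedelta(hours=2)
today = date(2026, 5, 16)

buggy_keeps_today = in_window_buggy(today, cutoff)
fixed_keeps_today = in_window_fixed(today, cutoff)
```

Under these assumed values the buggy comparison drops a paper dated today, while the date-granularity comparison keeps it and still rejects older dates.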
2026-05-14 · b528f96
landing + agents.json: hero → /onboard.html, kill private-repo links — Phase R prep
Phase R conversion-friction pass before Monday outbound. F2 (highest-leverage): landing hero primary CTA now points at /onboard.html ("Try it in 60 seconds") instead of an anchor to #tools; pricing-sidebar copy promotes /onboard.html and demotes /signup to "production bearer key". The interactive wizard already worked end-to-end (model → harness → MCP config → fillin_health verify → live fillin_query in browser) but was buried behind a nav link. F4: agents.json free_paths now lists interactive_onboard alongside trial_key, and tags trial_key as the human-onboard path so agent-native self-bootstrap routes to /onboard.html or x402 rather than the mailto. F1 (belt-and-suspenders until artchristech/fillin is flipped public): removed every github.com/artchristech/fillin link from web/index.html footer, web/onboard.html final CTA, web/pitch.html cta-row, web/pricing.html self-host tier, web/changelog.html nav-cta, agents.json repository field, smithery.yaml repository field. Replaced with Smithery (verified third-party signal), /onboard.html, or self-host mailto where appropriate. Re-add when repo visibility flips. .gitignore: PHASE_R_PLAN.md + PHASE_R_ACCELERATE.md (internal strategy artifacts, never publish). web/signup.html intentionally left out of this commit — has my F1 edit overlapping with Stripe/multi-chain in-flight work; will ship together.
2026-05-14 · 21b2976
landing: promote teal v2 — agent-first hero, 7-tool spec, x402 callout
- Recolor accent to logo teal (#3fb8ad), retire yellow-green
- Trim from 1384 → 877 lines; cut animated demo, slim sections
- Surface machine-readable endpoints (agents.json, openapi.json, /v1/corpus, MCP) as hero pills
- Add 7-tool spec table + decision-tree for "when to call what"
- Live status strip polling /v1/corpus
- Logo plate that matches JPG bg for invisible edges (3 placements)
2026-05-14 · a0a3f7c
ts client: publish v0.1.0 as @artchristech/fillin-client
The @fillin scope doesn't exist on npm yet. Shipping under the user's existing @artchristech scope so the package is installable today. README notes the planned migration to @fillin/client once the org exists. Description updated to the search-engine-for-agents framing. Live: https://www.npmjs.com/package/@artchristech/fillin-client
2026-05-14 · 1403208
docs: TS client README + HN post catch up to 7-tool reality
clients/fillin-ts/README.md: - Hero reframed as search-engine-for-agents + names the 3 daily slices - Quickstart comments the generic /query as $0.01 explicitly - Methods section calls out that slice routes (cves/papers/frontier) + /answer ship in the API but aren't yet wrapped — direct fetch or PR - Surfaces /signup (20 free queries) alongside /pricing HN_POST.md (drafts for Show HN — not on the live site): - Title swapped to "Show HN: Fillin – search engine for AI agents (...)" - Body reframes around search engine + 6 corpora + 7 tools + per-slice pricing rationale - Adds rediscovery-cost framing for differentiated pricing - Updates corpus claim ("thin / HN+arXiv") to current reality (22.8k docs, 6 sources, two-tier ingest) - Adds eval evidence (n=23 Opus, ~3.4× fewer tokens, ~2.25× cheaper) - Adds two more anticipated questions (per-slice pricing, MCP location) - Repo link points at github.com/artchristech/fillin (was placeholder) Defer: docs/ZHC_STRATEGY.md, STRIPE_PLAN.md, OUTREACH.md, SUBMISSION.md still reference flat $0.01 — internal-only, doesn't affect any user-facing surface.
2026-05-14 · 3c7c1de
docs + /pricing: catch up to the 7-tool, search-engine-for-agents reality
The flat-$0.01 framing was actively misleading now that /query/cves, /query/papers, /query/frontier each carry their own price. README still described 4 sources from before the daily-snapshot work. /pricing (web/pricing.html): - h1: "Per-call pricing. In USDC. On Base." (was "Flat $0.01 per query") - New per-tool table (7 rows) with prices and intended use - "Try before you pay" surfaces both /v1/probe and /signup - Three rails kept (x402, bearer/Stripe, self-hosted) but reframed — same prices, different settlement - "Why $0.01" replaced with "Why these prices" — rediscovery cost framing for differentiated pricing - FAQ "Is there a free tier?" answer corrected (yes, two paths) README.md: - Hero: search-engine-for-agents framing + 7-tool table front-and-center - Stack section: 6 corpora named (HN/arXiv/RSS/GHReleases/CVEs/papers/frontier), two-tier ingest (30min scheduler + 1pm MT daily snapshots) called out - Try-before-you-pay: surfaces /v1/probe AND /signup trial bearer key - Quickstart: pip install -r requirements.txt instead of stale per-package list; mentions the slice routes alongside /query Tests: 146/146.
2026-05-14 · fddbc02
deploy/nginx: relax CSP for rich landing pages
Original policy was default-src 'none', which was fine when the API only ever served JSON but is broken now that we ship a styled landing with inline <style>, Google Fonts, inline <script>, and /v1/corpus + /v1/probe fetches from the page. Browsers correctly refused everything, rendering the live site as plain serif HTML. New policy keeps the lockdown shape (no third-party scripts, no iframes, no foreign image hotlinks, no foreign connect targets) and opens only what the page actually needs:
- img-src 'self' data:
- style-src 'self' 'unsafe-inline' https://fonts.googleapis.com
- font-src 'self' https://fonts.gstatic.com data:
- script-src 'self' 'unsafe-inline'
- connect-src 'self'
Mirrors the shape used by sister site /etc/nginx/sites-available/glyph. Live config patched in-place + nginx -t green + reload (zero downtime). Backup at /root/fillin.bak.20260514 on snapback.
2026-05-14 · 2dc75fa
landing: integrate hourglass logo (replaces accent-dot brand mark)
- web/logo.jpg: 1024x1024 teal hourglass mark (the new brand)
- web/index.html: nav brand swaps the .dot accent for an <img class="logo-mark"> at 26px with 5px radius. Adds <link rel="apple-touch-icon">.
- api.py: new GET /logo.jpg route via _serve_static
- Versioned URL (?v=1) in HTML to dodge Cloudflare's 4h cache of the pre-route 404. Future logo swaps bump the version.
2026-05-14 · bb64ca6
ingest: arxiv min window 24h → 72h
Diagnostic dig: 24h was insufficient. arXiv publishes once per weekday ~17:30 UTC. Querying at 19:00 UTC, the latest batch is from yesterday's 17:30 → only 25.5h ago, just outside a 24h window. Result: scheduler pulled 50 entries every tick, all dropped as out-of-window, arxiv corpus went 15 days stale. 72h covers the worst case (Monday morning needing Friday's batch over the weekend gap). Verified live: scheduler now pulls +1055 arxiv docs on its first tick after the change. 146/146 tests pass.
2026-05-14 · 39958bd
ingest: per-source min window, arxiv floored at 24h
The scheduler runs every 30min with INGEST_WINDOW_HOURS=2, but arxiv publishes in weekday batches and the API page-0 newest doc can be 24-72h old. With a 2h window every tick reads 50 entries and drops all 50, leaving the arxiv corpus stale (15+ days at last check). Fix: SOURCE_MIN_HOURS = {"arxiv": 24} in ingest.py applies a per-source floor: max(hours, SOURCE_MIN_HOURS.get(name, 0)). Other sources keep the caller-supplied window. Diagnostic log line now includes the effective window per source. 146/146 tests pass.
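The floor described above is small enough to sketch directly. `SOURCE_MIN_HOURS` and the `max(...)` expression are taken from the entry; the `effective_window` wrapper name is an assumption.

```python
# Per-source minimum lookback, in hours. Sources not listed have no floor.
SOURCE_MIN_HOURS = {"arxiv": 24}


def effective_window(name: str, hours: float) -> float:
    """Return the lookback window actually used for one source.

    The caller-supplied window is kept unless the source has a larger floor.
    """
    return max(hours, SOURCE_MIN_HOURS.get(name, 0))


# With the scheduler's 2-hour window, only arxiv is widened.
windows = {name: effective_window(name, 2) for name in ("arxiv", "hn", "rss")}
# windows == {"arxiv": 24, "hn": 2, "rss": 2}
```

A caller that asks for more than the floor (for example 48 hours) still gets its own value, since the floor only ever widens the window.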
2026-05-14 · d25d360
data refresh + cron to 1pm MT + smithery.yaml comprehensive sync
- Fired daily snapshot manually; corpus now at 21,677 rows (was 21,621); cves +55 new, frontier +1, papers +0 (HF daily + bioRxiv both empty for the last 24h — upstream sparseness, not a regression). The read-ids + chunk-size=25 add strategy held: 300 docs ingested with zero OOM. - Cron rescheduled on snapback VPS: CRON_TZ=America/Denver, 0 13 * * * (was 15 6 UTC). DST-aware via vixie cron (Debian 3.0pl1-162). CRON_TZ scoped to the fillin entry only — other crontab jobs (snapback, glyph, milkncookies) keep their original UTC schedule. - smithery.yaml: comprehensive update — corpus surface now names all 6 active source families (HN, arXiv, RSS, GH releases, cves, papers, frontier) with row count; /v1/probe free-tier path surfaced; /signup trial-key (20 free queries) surfaced; corpus_match signal documented in fillin_query example; daily 1pm MT cadence noted. - agents.json: synced — added query_cves / query_papers / query_frontier to mcp.tools, expanded corpora list with the 3 new sources, replaced flat $0.01 pricing with per-tool table + free_paths + snapshot_cadence. Diagnosis (no fix shipped per scope): arXiv staleness (newest doc 2026-04-29 in DB, while live API has docs through 2026-05-13) is caused by scheduler INGEST_WINDOW_HOURS=2 being shorter than arXiv's typical submission cadence — every 30-min tick reads page 0, sees the 50 newest entries are all >2h old, and writes 0. Fix would be either a wider window for arxiv specifically, or graduating arxiv into the daily-slice runner with --hours 24. Out of scope for this commit.
2026-05-14 · 0f0e739
landing: restore rich page + reframe as search engine for AI agents
The previous commit (0f73475) accidentally overwrote the rich 1381-line landing page with a stale 519-line older version. This restores the rich page (hero animation, demo, how-it-works, onboard, api, pricing, proof, stack sections) AND applies the search-engine-for-agents reframing on top:
- Title + meta + og + twitter: "search engine for AI agents"
- Eyebrow + lede: name the 3 daily slices and prices inline
- API section: "Five endpoints" with each route + price listed
- Pricing card: per-slice price list ($0.01-$0.05 range), removed the misleading "100 queries per 1 USDC" line that only held for the flat $0.01 tier
2026-05-14 · 0f73475
landing: reframe as search engine for AI agents + add 3 daily slices
- Title + meta + og: "search engine for AI agents" framing
- Tagline: name the 3 slices + their prices (CVEs $0.02, papers $0.03, frontier $0.05)
- Stats strip: price tile shows the $0.01-$0.05 range, links to /pricing
- Endpoints table (section 03): added /query/cves, /query/papers, /query/frontier, /answer rows with prices inline
- MCP section (section 05): "Seven tools" with prices listed for each
Section 07 corpus table auto-populates from /v1/corpus and will pick up the new sources without code change.
2026-05-14 · 67ac88d
launch readiness: onboard XSS fix, CSO-cleared diff, internal docs gitignored
CSO daily-mode audit before pushing fillin_daily to public main: * Fixed: web/onboard.html href XSS (MEDIUM) — added safeHref() URL scheme allowlist (http/https only); applied escapeHtml to source + published_at for defense in depth. * Verified-fixed in code: XFF rate-limit bypass HIGH (CF-Connecting-IP enforcement at api.py:67-81 + ufw lockdown via deploy/lockdown_origin.sh). * Gitignored internal artifacts: CONTEXT.md, RECONCILE.md, research-report.md, .gstack/ — never publish (working specs + research notes that don't belong in the public repo). * Updated security-report.md with the 2026-05-11 audit (consistent with prior public-audit policy). Bundles other launch-ready work that was already staged: README + landing-page polish, /signup flow, agents.json mainnet update, new tests/test_corpus_and_probe.py (13 tests), thesis v1+v2 pages, eval artifacts, deploy hardening (lockdown_origin.sh). Tests: 146 passing.
2026-05-14 · 3a69321
fillin_daily: 3 daily snapshot slices + differentiated pricing
Adds CVE / papers / frontier daily ingestors and paid MCP routes to turn fillin into a search engine for AI agents. Each slice is its own paid lane with prices set by elasticity: - query_cves ($0.02) — NVD + GitHub Security Advisories + OSV - query_papers ($0.03) — HuggingFace daily papers + bioRxiv (unioned with arxiv) - query_frontier ($0.05) — OpenAI/Anthropic/DeepMind/Meta/Mistral feeds + HF trending Backend: - sources/{cves,papers,frontier}.py + scripts/ingest_*.py + run_daily_snapshots.sh - QueryIn.sources whitelist filter; rejects unknown values 400 - Three paid routes /query/{slice}; single auth path via factory - db.upsert chunked via FILLIN_UPSERT_CHUNK_SIZE (default 25); switched merge_insert -> read-existing-ids + add to fix VPS OOM at 21k-row scale - server-card.json advertises the 3 new tools Smithery: yaml refreshed with new tools, prices, tags, and 3 example invocations matching the search-engine-for-agents framing. Tests: 122 -> 146 (+15 slice routes/auth/pricing, +1 chunked upsert dedup).
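The switch from merge_insert to "read existing ids, then add" can be sketched as below. This is an illustration only: a plain Python list stands in for the table, and the function names are hypothetical. The `FILLIN_UPSERT_CHUNK_SIZE` variable and its default of 25 come from the entry.

```python
import os


def chunked(rows, size):
    # Yield consecutive slices of at most `size` rows.
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


def upsert(table, rows, chunk_size=None):
    """Append rows whose id is not already present, one chunk at a time.

    `table` is a list of dicts standing in for the real table.
    Returns the number of rows added.
    """
    if chunk_size is None:
        chunk_size = int(os.environ.get("FILLIN_UPSERT_CHUNK_SIZE", "25"))
    seen = {row["id"] for row in table}  # read existing ids once, up front
    added = 0
    for chunk in chunked(rows, chunk_size):
        fresh = []
        for row in chunk:
            if row["id"] not in seen:
                seen.add(row["id"])  # also dedupes within the incoming batch
                fresh.append(row)
        table.extend(fresh)  # stand-in for the table's add call
        added += len(fresh)
    return added
```

Writing in small chunks bounds how many rows are held for a single add. The trade-off of this shape is that rows with an existing id are skipped rather than refreshed.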
2026-05-13 · 54557a3
fillin_answer: synthesized-mode tool for weaker LLM callers
The eval showed fillin's value depends on the calling model's tool-synthesis skill — Opus 4.7 extracts citations cleanly (4.29/answer); Nemotron-120B free barely does (2.3, mostly hallucinated from training). To be a true enhancement on *any* model, fillin needs to do the synthesis itself for callers that can't. New tool: fillin_answer(query, cutoff, k) Returns a 150-250 word answer with inline [title](url) citations, grounded in post-cutoff retrieved docs. Server runs Haiku 4.5 over the top-k results with a constrained system prompt (no outside knowledge, echo dates, refuse if docs don't address the query). Raw docs are returned alongside for verification. Pricing: $0.02 USDC / call (vs $0.01 for fillin_query). Bearer keys unmetered. x402 path refunds on synthesizer error or no-relevant-docs case so payers never lose USDC for a non-synthesis. Cold-start safe: falls back to {answer: null, reason: "synthesizer_not_configured"} when ANTHROPIC_API_KEY is unset. x402 callers are auto-refunded the $0.02. Cheap-signal addition: fillin_query and the in-proc MCP variant now return top_score + corpus_match ("strong" | "weak" | "none") so agents can skip the paid call when retrieval found nothing. - synthesize.py: corpus_match thresholds + Haiku-backed synthesizer - api.py: /answer endpoint, ANSWER_PRICE_USDC, AnswerOut model, require_paid_or_key_answer dependency, refund-on-no-synthesis logic - mcp_server.py: fillin_answer tool with both in-proc and remote paths - agents.json + smithery.yaml + /.well-known/mcp/server-card.json: register the new tool and pricing - tests/test_synthesize.py: 10 new unit tests for corpus_match, extract_citations, and the no-key fallback (no live API calls) Test status: 122/122 passing (10 new).
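The cheap-signal idea above (skip the paid call when retrieval found nothing) might look like this on the caller's side. The threshold values, the assumption that a higher `top_score` means a closer match, and both function names are hypothetical; the entry names the `top_score` and `corpus_match` fields but does not state the thresholds.

```python
# Hypothetical thresholds: the real values live in synthesize.py and are not
# given in this changelog.
STRONG_AT = 0.60
WEAK_AT = 0.35


def corpus_match(top_score):
    """Map a retrieval score to "strong" | "weak" | "none".

    Assumes higher scores mean closer matches.
    """
    if top_score is None or top_score < WEAK_AT:
        return "none"
    if top_score < STRONG_AT:
        return "weak"
    return "strong"


def should_pay_for_answer(query_result: dict) -> bool:
    # An agent can skip the paid synthesis call when retrieval found nothing.
    return query_result.get("corpus_match") != "none"
```

An agent would inspect the `fillin_query` response first and call `fillin_answer` only when `should_pay_for_answer` returns True.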
2026-05-13 · cdde415
launch polish: embedder pre-warm + smithery.yaml example invocations
- api.py: lifespan now warms the embedder singleton with a dummy encode at startup, so the first real fillin_query doesn't pay the ~2-3s sentence-transformers cold-load cost (showed up as a p99 latency spike on the Smithery performance dashboard)
- smithery.yaml: add four `examples:` entries (release-notes query, research query, free health check, free corpus stats) so the listing surfaces realistic call shapes to reviewers and prospective agent developers
Verified live: '[fillin] embedder warm' logs on snapback after restart.
2026-05-12 · e1885c9
launch prep: /signup trial-key flow, Proof section, mainnet-correct smithery.yaml
- web/signup.html + GET /signup route: 20-free-query trial via mailto prefilled with name/use-case/harness/cutoff, so the "keys issued at fillin.glyphapi.dev" promise has a real destination
- web/index.html: new #proof section with the n=8 head-to-head numbers (Fillin = 2.25× cheaper than web search, 11× more inline citations); /signup link added to nav, CTA, and pricing card
- smithery.yaml (new file, will commit to surface on registry): removes the "free public tier" claim that didn't exist, points at /signup
- api.py: bundles already-deployed CF-Connecting-IP rate-limit hardening so origin matches what's running on snapback (rsync had been the source of truth)
2026-05-06 · 70f9ec7
/freshness: cache 60s — public endpoint, full column scan was 8s cold
The freshness endpoint runs db.stats(), which scans the published_at + source columns over the full ~15k-row table; the first call after a worker restart took 8+ seconds and would time out the landing-page widget. As a public no-auth endpoint, that's also a trivial accidental DoS. Fix: a module-level dict cache with a 60s TTL. The 'now' field is still computed per request so consumers can detect a stale cache if needed.
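A minimal sketch of the caching shape described above: a module-level dict with a 60s TTL, with `now` stamped outside the cache. The names `cached_stats`, `freshness`, and the injectable `clock` are assumptions made so the example is testable; they are not the code in api.py.

```python
import time
from datetime import datetime, timezone

TTL_S = 60.0
_CACHE = {"at": 0.0, "value": None}  # module-level, shared across requests


def cached_stats(compute, now):
    """Return cached stats, recomputing when empty or older than TTL_S."""
    if _CACHE["value"] is None or now - _CACHE["at"] >= TTL_S:
        _CACHE["value"] = compute()  # the expensive full-column scan
        _CACHE["at"] = now
    return _CACHE["value"]


def freshness(compute_stats, clock=time.monotonic):
    stats = cached_stats(compute_stats, clock())
    # 'now' is computed per request, so a consumer can compare it with the
    # newest timestamp in the stats and notice a stale cache.
    return {**stats, "now": datetime.now(timezone.utc).isoformat()}
```

With this shape, at most one scan runs per 60s window per worker regardless of how often the public endpoint is hit.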
2026-05-06 · fdf9608
db.stats(): include per-source row counts
The /freshness endpoint and the /status board both surface a "Sources" widget driven by stats().by_source — but stats() never populated it, so the dashboards rendered "0 sources". Adds a pylist sweep over the source column (already loaded for the min/max scan) and groups counts. No additional DB scan: reuses the same fetch.
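The grouping step can be sketched with the standard library, assuming the source column has already been materialized as a Python list for the min/max scan. The `by_source` function name mirrors the field in the entry but is otherwise an assumption.

```python
from collections import Counter


def by_source(source_column):
    """Count rows per source from an already-loaded column (a list of str)."""
    return dict(Counter(source_column))


counts = by_source(["hn", "arxiv", "hn", "rss"])
# counts == {"hn": 2, "arxiv": 1, "rss": 1}
```

Since the list is already in memory, the count adds one pass over it and no extra database scan.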
2026-05-06 · dcd7b88
CI: pytest + tsc + npm test + changelog build on push and PR
Three jobs in .github/workflows/test.yml:
- python: pytest with eval/ excluded (live-network suite, not for CI).
- typescript: cd clients/fillin-ts; tsc --noEmit; npm test; npm run build.
- changelog: runs tools/build_changelog.py and asserts the output is >5KB, so a regression in the generator can't ship green.
Caches pip and npm by lockfile. Triggers on push to main, PR to main, and workflow_dispatch.
2026-05-06 · a2f7fc5
Add @fillin/client TypeScript SDK
Most agent code in 2026 is TypeScript (Mastra, LangGraph TS, Vercel AI SDK, Cloudflare Agents). Python-only excluded that half of the dev population; this opens it. - src/index.ts: FillinClient with .query / .stats / .freshness / .health / .paymentInfo. Bearer auth via apiKey option, with an injectable fetch for testing or custom transports. AbortController honors timeoutMs (default 30s). FillinError carries .status + .body. - test/client.test.ts: three smoke tests — bearer header gets sent, freshness works without a key, non-2xx throws FillinError. Uses node:test, no external test framework. - README.md: install, quickstart, options table, methods, errors. - tsconfig.json + package.json wired for `npm run build` (tsc → dist/). - .gitignore so node_modules + dist stay out. Tests pass (3/3), tsc --noEmit clean. Not yet published to npm.
2026-05-06 · 850fa42
Site infrastructure: pricing, status, changelog, freshness, SEO
Builds the surface that was missing for a real product: - /pricing — flat $0.01 page covering both x402 and bearer rails, plus FAQ and a Stripe placeholder cell for when the checkout lands. - /status — live in-browser health board over /healthz, /freshness, /v1/payment/info, /agents.json. No server-side history yet (deferred until usage justifies a time-series store). - /changelog — generated from git log via tools/build_changelog.py; baked into deploy/install.sh so every deploy refreshes it. - /freshness — public corpus stats endpoint (no auth) backing both the status board and a new freshness strip on the landing page. - /robots.txt + /sitemap.xml + og.svg + favicon.svg — basic SEO and social-share surface (was zero before). - _shared.css — shared design tokens for sub-pages so they match index without each duplicating ~500 lines of CSS. api.py exposes all of the above plus _serve_static for asset routes (robots/sitemap/favicon/og/css). _serve_html keeps a JSON fallback so tests pass without staging the web/ directory. Landing-page nav now actually links to onboard.html, pitch.html, pricing, status, and changelog — they had been dangling html files with no entry points.
2026-05-06 · 78eeb1b
agents.json: declare HTTP transport as primary, stdio as alt
The card claimed transport=stdio but the production deploy at fillin.glyphapi.dev mounts FastMCP at /mcp via streamable-http. Agents that fetched the card and tried to spawn the stdio process would have looked at the wrong rail. Now lists both transports with streamable-http as primary and exposes the http_endpoint inline.
2026-05-06 · cf3c5fa
Fix test isolation: reload mcp_server before reloading api
Each api fixture reloaded api.py inside a fresh TestClient. The lifespan ran _mcp_instance.session_manager.run(), which raises after the first call on a given FastMCP instance. Because mcp_server wasn't reloaded, the singleton was reused and the second test in either test_api_limits.py or test_api_x402.py exploded — 14 errors total on every CI/local run. Reloading mcp_server first gives api.py a fresh FastMCP, so the session manager starts cleanly for each test. Before: 95 passed, 14 errors. After: 109 passed.
2026-05-06 · 47407d6
Serve landing/onboard/pitch HTML at /, /onboard.html, /pitch.html
Before this, web/index.html (1238 lines) was rsynced to the VPS but no route served it — GET / returned the JSON stub. Visitors to the canonical domain saw {"product":"Fillin","tagline":"..."} instead of the landing page. Adds three FileResponse routes that fall back to a small JSON payload when the file is missing (so tests run without staging the web/ dir). Also tracks web/pitch.html which had been left untracked since Apr 30.
2026-05-06 · 8cb6838
Swap fillin.dev → fillin.glyphapi.dev across docs and config
User does not own fillin.dev — the domain's NS records point at Vercel and currently return DEPLOYMENT_NOT_FOUND. Anyone who registers it later could harvest bearer tokens and x402 payment headers from agents that wired up via the README. Canonical domain is fillin.glyphapi.dev (snapback VPS, server_name in deploy/nginx-fillin.conf:22). 24 string references across README, agents.json, mcp_server.py, embeddings.py, web/index.html, tools/load_test.py, and docs/.
2026-04-30 · 7d9e68b
Eval economics: real partial baseline (n=7-8 / 25 queries)
Three-arm head-to-head: claude-alone vs claude+websearch vs claude+fillin. Same model (Opus 4.7), same system prompt, same questions; only tools differ. Run halted at run 28/75 due to Anthropic credits exhaustion. Numbers below are from the 23 successful runs only: explicitly partial, explicitly noted.
Per-arm averages on 7-8 queries each:
claude alone: 147 in / 800 out / 0 tools / 1.88 cite / $0.021/q
claude+websearch: 36k in / 2010 out / 1.5 tools / 0.38 cite / $0.247/q
claude+fillin: 11k in / 1392 out / 2.1 tools / 4.29 cite / $0.110/q
Headline: at this query mix, Fillin used 3.4× fewer input tokens, cost 2.25× less per query, and produced 11× more inline citations than Anthropic's hosted web_search. Latency was modestly faster too. Total real spend across the 23 runs: $2.91. Caveats are documented openly in eval/baseline.md: sample is small, citation metric is "URLs in answer text" (websearch uses footnote-style refs that don't render as URLs), and the query mix is biased toward release notes + research papers where Fillin's corpus is well-aligned. To complete: top up credits + python tools/eval_economics.py. ~$6 to finish.
2026-04-30 · eb270af
Reframe onboarding: model is the car, cutoff is the odometer
Old onboarding asked "pick your harness" first — but the harness is just plumbing. The thing Fillin actually solves for is the model's cutoff. New flow puts that first. Hero: "Your agent is driving with outdated maps." Sublede: "Fillin is the embeddings & context-engineering service that closes the gap, on every query, autonomously." Step 1 is now MODEL — pick from 12 cards (Claude Opus 4.7, Sonnet 4.6, GPT-5.5, Gemini 2.5, Llama 4, Grok 4, Mistral L3, etc.) each showing its training cutoff. Custom date row underneath. The moment a model is selected, an inline gap-readout panel reveals: - days of blind spot - estimated # arxiv papers, HN posts, GH releases missed - sharp pitch: "without Fillin your agent refuses or hallucinates" Step 2 is harness selection (was step 1). Step 3 config snippet auto-populates from the picked harness. Steps 4-5 verify + first query (cutoff pre-filled from step 1). Step 6 done state shows the model + cutoff that's now wired. URL hash captures m=, c=, h=, s= so any state is shareable as a link. Reorder reflects the strategic claim: the user is wiring up *context engineering for an agent that runs an X with cutoff Y*, not just "installing a tool".
2026-04-30 · e82890b
Agent onboarding wizard + CORS for browser-based MCP calls
web/onboard.html — 5-step interactive onboarding flow: 1. Pick your harness (Claude Code, Cursor, Continue, Zed, Eliza, curl) 2. Drop in tailored MCP config (copy-button, harness-aware) 3. Verify connection — live fetch to /mcp/ shows reachability + corpus stats inline 4. Run a real query — input cutoff + topic, see ranked results with clickable source URLs, plus copy-able curl version 5. Done state — what tools are now available, links to next steps URL-hash state so steps are linkable. Live status indicator. Same aesthetic as web/index.html and web/pitch.html. Bootstrap nginx now ships CORS headers so pages from any origin (including file://) can hit /mcp/ for verification + queries. ACAO=* is fine because every authenticated path is still gated by bearer or x402 — CORS just controls whether the *browser* lets the calling JS read the response. Live nginx already patched; this keeps the bootstrap in sync.
2026-04-30 · cd79e14
arxiv: bump max_pages 4→30 so backfills can paginate past ~48h
Each fetch() call starts at start=0 and pages forward in submittedDate-desc. With max_pages=4 × PAGE_SIZE=50 = 200 papers per call, we never see beyond the most recent ~2 days at current ~100 papers/day in our cats. Caught when chunked backfill produced 0 net new docs after the 48h window — every subsequent chunk re-fetched the same 200 papers and dedup-filtered them. 30 pages × 50 = 1500 paper ceiling, ~15 days at current rate, plenty for both incremental and historical backfills. Per-page throttle still 3s, so worst-case wall clock is ~90s.
2026-04-30 · 8b22564
arxiv: bump request timeout 30→60s; submittedDate-desc is slow for large windows
2026-04-30 · d25e39d
Fix arXiv ingest: space-as-OR encoding (was returning 0 silently)
_search_query joined categories with literal "+OR+" assuming the URL would carry it through. requests.params double-encoded the '+' to '%2B', which arXiv interprets as a literal plus character (not the URL-form '+' = space) — silently returns an empty feed. Switch to " OR " as the connector. requests then form-encodes the spaces as '+', which is exactly the wire format arXiv wants. Verified live: 88 papers in the last 24h across cs.{AI,LG,CL,CR,DC}. This fills the second-largest gap in our corpus (rss came back with 51 docs, gh_releases with 12, arXiv was 0 since the multi-source refactor landed).
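The encoding difference can be reproduced with the standard library alone. This sketch uses `urllib.parse.urlencode`, which form-encodes a params dict the same way described above ('+' becomes '%2B', space becomes '+'); the helper names and the two-category list are illustrative, not the real _search_query.

```python
from urllib.parse import urlencode

CATS = ["cs.AI", "cs.LG"]  # shortened category list, for illustration only


def broken_query(cats):
    # Literal '+' characters get percent-encoded to %2B on the wire.
    return urlencode({"search_query": "+OR+".join(f"cat:{c}" for c in cats)})


def fixed_query(cats):
    # Spaces get form-encoded to '+', which is the wire format arXiv expects.
    return urlencode({"search_query": " OR ".join(f"cat:{c}" for c in cats)})


print(broken_query(CATS))  # search_query=cat%3Acs.AI%2BOR%2Bcat%3Acs.LG
print(fixed_query(CATS))   # search_query=cat%3Acs.AI+OR+cat%3Acs.LG
```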
2026-04-29 · 1eddf0d
Fix scheduler: drop max_pages kwarg that ingest() no longer accepts
scheduler.py was minted in v0 calling ingest(hours, max_pages). The multi-source refactor (commit 27f56cf) dropped max_pages from ingest() in favor of per-source defaults; scheduler.py was never updated. Every tick since has raised TypeError, silently into the journal. Corpus went stale by ~4 days; only caught when checking MCP status today. Tests don't exercise the scheduler entrypoint — that's the blind spot. Adding an integration smoke test for it is queued for next pass. After deploy, run a one-time backfill: ingest --hours 96 to recover the gap.
2026-04-29 · 82be046
Audit log: M1 + M2 closed and verified live
Both Medium findings resolved. Live HTTPS responses now carry Strict-Transport-Security and Content-Security-Policy headers; live config patched in-place. Bootstrap config refuses non-ACME HTTP traffic with 503 on a fresh deploy. Global exception handler registered as defense-in-depth against future debug=True slippage.
2026-04-29 · bdab800
Fix MEDIUM: bootstrap nginx exposes API in cleartext, no global exc handler
M1: deploy/nginx-fillin.conf bootstrap was proxying API traffic on plain HTTP between install.sh and certbot. Bearer tokens and signed payment headers leaked. Fix: bootstrap now refuses with 503 except for ACME challenge path. certbot --nginx --redirect overwrites the catch-all with a 301 to HTTPS once the cert is in. Hardening headers (HSTS, CSP) baked into the bootstrap so the post-cert config inherits them. HSTS pinned at max-age=63072000 with includeSubDomains. preload is deliberately omitted: it submits to browser preload lists and is effectively irreversible. Promote later when the domain is months-stable. CSP: default-src 'none'; frame-ancestors 'none'; (API serves JSON only). Live config patched in-place with the same headers; verified live, HSTS + CSP appear on every response. M2: api.py had no global Exception handler. A future debug=True regression would leak full Python tracebacks on /query. Added a @app.exception_handler(Exception) that logs the traceback server-side and returns a generic {"error":"internal_error"} 500. FastAPI's HTTPException + RequestValidationError handlers fire first, so this is purely the catch-all for unexpected errors.
2026-04-28 · 7211fc6
Audit log: H1 (rate-limit-bucket-collapse) closed and verified
Verified live on snapback with bucket=2: IP A (1.1.1.1) and IP B (2.2.2.2) each got 200/200/429 from separate buckets. Pre-fix would have shared one bucket (A: 200/200/429, B: 429/429/429).
2026-04-28 · 224a90d
Fix HIGH: per-IP rate limits collapse to one bucket behind nginx
/security-scan caught it. uvicorn was launched without --proxy-headers, so scope["client"] was always 127.0.0.1 (nginx loopback). All three limits collapsed to a single global bucket: slowapi 30/min on /query, 10/min on /v1/payment/*, and the /mcp middleware's 60/min. Fix: add --proxy-headers --forwarded-allow-ips=127.0.0.1 to the systemd unit. uvicorn's ProxyHeadersMiddleware then rewrites scope["client"] from X-Forwarded-For *only when the connecting IP is 127.0.0.1*, so public clients cannot spoof. nginx is the sole trusted proxy. Both slowapi (via get_remote_address) and the /mcp middleware (via scope["client"]) now key on the real client IP automatically. Updated the comment that lied about --proxy-headers being set.
2026-04-25 · ac66c47
Security audit doc: 4 critical fixes + 2 P0/P1 opens
Captures the full audit run today: findings, what's fixed, what remains open, what was confirmed safe. Includes pre-deploy .env hygiene checklist to prevent the wallet-PK-in-env mistake from happening again. Open critical: rotate the testnet wallet that surfaced during the audit (plaintext FILLIN_WALLET_PRIVATE_KEY in /opt/fillin/.env). Move private keys to KMS/systemd-creds/age-encrypted file before mainnet flip. Verified safe: constant-time bearer compare, atomic spend-with-nonce, EIP-191 verification, micro-USDC accounting, refund-on-failure, /mcp rate limit (5×200 → 7×429 confirmed live), nginx SSE buffering off.
2026-04-25 · de964fd
nginx: disable buffering for SSE / streamable-HTTP MCP
Default proxy_buffering=on holds the entire text/event-stream response until the upstream closes. For MCP's streamable-HTTP transport, that means tool-call responses appear to hang from outside even though the server returned in <1s. Disable buffering on the proxy_pass so chunks flush to the client as they arrive. Live config patched in-place; this update keeps the bootstrap config in sync for fresh deploys. Caught while running the production-grade audit; before the fix, live fillin_query through HTTPS was indistinguishable from a 30s timeout. After: 0.57s.
2026-04-25 · e212e8a
Production-grade MCP fixes: deadlock, refund, rate limit
Audit found three real production issues. All fixed. C1 (CRITICAL, production-breaking): MCP tool *calls* deadlocked through the deployed HTTPS endpoint. fillin_query/stats/health each made HTTP loopback calls to localhost:8766 from inside the same FastAPI worker that was holding an SSE stream open for the MCP request. Classic event-loop self-deadlock; resolved only at the 30s upstream timeout. Smithery's probe passed because tools/list returns immediately, but no agent could actually run a tool. Fix: detect co-mounted mode (loopback host) and call db.query_delta / db.stats directly in-process. HTTP path preserved for stdio mode where someone points at a remote Fillin. C2: /mcp had no rate limit. slowapi only decorates FastAPI routes, not mounted Starlette apps, so the public MCP surface was wide-open while /query was capped at 30/min. Added a per-IP token-bucket as ASGI middleware; default 60/min, override via FILLIN_MCP_RPM. I1: x402 payers were charged before the query ran. A LanceDB lock or embedding failure left them out-of-pocket. Added credits.refund() and wrapped query_delta in api.py so server-side errors roll back the deduction. Bearer mode unchanged (pays nothing). Verified locally: fillin_health 0.8s, fillin_query 7s (cold) returning real ranked results with proper scores in (0, 1].
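A minimal sketch of the per-IP token bucket from C2, without the ASGI wiring. The class name `TokenBucketLimiter` and its fields are hypothetical; a real middleware would call `allow()` with the client IP from `scope["client"]` and return 429 on False.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """One bucket per client IP: capacity = requests/min, refilled continuously."""

    def __init__(self, rate_per_minute: int = 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = float(rate_per_minute)
        self.refill_per_sec = rate_per_minute / 60.0
        self.clock = clock
        # ip -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, ip: str) -> bool:
        now = self.clock()
        tokens, last = self._buckets.get(ip, (self.capacity, now))
        # Refill for the elapsed time, capped at the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        if tokens < 1.0:
            self._buckets[ip] = (tokens, now)
            return False
        self._buckets[ip] = (tokens - 1.0, now)
        return True
```

With a bucket of 2, two different IPs should each see allow/allow/deny from their own bucket. This sketch never evicts idle IPs, so a long-running process would need a cleanup pass.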
2026-04-25 · 60f207d
deploy/install.sh: don't clobber certbot's nginx mods on re-deploy
Each subsequent deploy overwrote /etc/nginx/sites-available/fillin with the bootstrap (HTTP-only) config, wiping certbot's HTTPS server block. Caused HTTPS to break after every redeploy until 'certbot --nginx --reinstall' was re-run. Fix: only copy the bootstrap config on first deploy (when the file doesn't exist). After that, certbot owns the file. nginx -t still runs to validate. Caught while wiring fillin.glyphapi.dev for smithery.
2026-04-25 · c18de4b
Allow public hosts through FastMCP's DNS-rebinding protection
FastMCP defaults allowed_hosts to localhost variants. Production hits return 421 'Invalid Host header'. Override via FILLIN_MCP_HOSTS env (comma-separated) and seed with fillin.glyphapi.dev so the live deploy serves /mcp/ to remote clients. Caught by smithery's probe failing.
2026-04-25 · 26af976
HTTP MCP transport: mount FastMCP at /mcp inside the FastAPI app
Smithery and most agent harnesses talk to MCP over Streamable HTTP, not stdio. Same FastMCP instance now serves both — stdio when run as 'python mcp_server.py', HTTP at https://<host>/mcp when api.py boots. - mcp_server.py: stateless_http=True, path "/" so mount at /mcp lands cleanly - api.py: lifespan now wraps mcp.session_manager.run(); mount /mcp → streamable_http_app() exposes the same fillin_query / fillin_stats / fillin_health tools to remote clients Verified locally — POST /mcp/ initialize returns server info, tools/list returns all 3 tool schemas. Co-located in one process so loopback overhead is negligible.
2026-04-25 · d88e896
Submission packet + agents.json manifest
User approved MCP catalog submission. Prepping the artifacts: - agents.json (root): full manifest covering offer, endpoints, pricing, payment rails, MCP tool schemas, the 4-call agent-bootstrap recipe, rate limits, guarantees. The single URL an agent should be able to read to bootstrap an integration without a human. - api.py: GET /agents.json + GET /.well-known/agents.json serve it (both standard discovery paths covered). - docs/SUBMISSION.md: copy-paste packets for smithery.ai, mcp.so, the punkpeye/awesome-mcp-servers PR (with diff line + body), parallel awesome-list targets, Twitter draft, HN cadence, and crypto-native registry plan (Olas, Virtuals, Eliza). Pre-submission checklist included. The actual catalog clicks + GitHub OAuth need a human — that's the user's part. Everything else is ready.
2026-04-25 · 502be0c
Fix two bugs found via dogfood: score field + similarity formula
Ran agent.py against a fresh local Fillin (1693 docs, 4 sources). Two quality bugs surfaced — neither would have shown up in unit tests because both depend on real LanceDB distance distributions. 1. api.py:271 was returning raw L2 _distance as the API "score" field, ignoring the reranked score entirely. Reranking was running but the client never saw it. 2. ranking.py:effective_score used `1 - distance`, which clamps to zero for any distance > 1.0. LanceDB returns L2 distance on unit-normalized MiniLM vectors — typical "decent match" distances are 1.0-1.4, so ~80% of results collapsed to score=0 and the original LanceDB order leaked through (defeating the rerank). Fix: similarity = 1 / (1 + distance). Smooth monotonic decay in (0, 1]. No collapse, no orthogonal-equals-opposite degeneracy. Also discovered (not yet fixed): - RSS SSL cert verify fails on macOS Python; 4/5 default feeds bounce - arXiv 5-day window returned 0 docs (probably a search-shape mismatch) Tests: 19/19 passing. Re-ran the agent against the fix; top results for "new LangChain releases" are now actual github.com/langchain-ai release notes, not random HN posts.
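The collapse in bug 2 is easy to see numerically. In this sketch `old_similarity` and `new_similarity` are illustrative stand-ins for the before and after formulas, not the actual ranking.py code.

```python
def old_similarity(distance: float) -> float:
    # Pre-fix: 1 - distance, clamped at zero, so any distance >= 1.0 scores 0.
    return max(0.0, 1.0 - distance)


def new_similarity(distance: float) -> float:
    # Post-fix: smooth monotonic decay, in (0, 1] for any distance >= 0.
    return 1.0 / (1.0 + distance)


for d in (0.0, 1.0, 1.2, 1.4):
    print(d, old_similarity(d), round(new_similarity(d), 3))
```

For the "decent match" range of 1.0 to 1.4, the old formula returns 0 for every candidate, so reranking cannot distinguish them; the new one keeps them strictly ordered.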
2026-04-25 · 5abf820
ZHC strategy: rails for agents that buy services themselves
The Stripe path was last decade's framing. Real strategic claim: Fillin's x402+USDC+Base+MCP stack is uniquely fitted to autonomous agents and Zero Human Companies — wallet-native auth, no KYC, MCP-catalog-discoverable, priced atomically per query. - 30-day rails plan: MCP catalog submission, agents.json manifest, fillin_account/fillin_estimate_cost tools, register on Olas + Virtuals + Eliza, hunt 5 ZHC customers (autonomous trading bots, DAO digesters, Eliza-deployed character agents, etc). - Reprioritization: defer Stripe, lean into crypto-native rails. Stripe becomes the secondary path once humans-evaluating-for-ZHCs ask for it. - ZHC profile + spend math: monitoring agents at 50 queries/min hit $20k/mo per customer — one such customer beats 1k hobbyists.
2026-04-25 · 1a108a6
Outreach playbook + Stripe wiring plan
Two artifacts to drive Phase 5 of the launch (commercial traction): - docs/OUTREACH.md: 3 cold-email templates (agent frameworks, daily-brief products, research/dev-tools shops), tier-1/2/3 target list (~30 names), cadence + personalization rules. - docs/STRIPE_PLAN.md: end-to-end wiring plan to add card payments alongside the existing x402 handshake. Stripe Checkout (not Elements), one-time SKUs (not subs), reuse credits ledger via "fk_" key prefix, webhook signature verification non-negotiable. Implementation in 4-6h. No code yet — these are the artifacts the commercial push runs against.
2026-04-25 · 67bb951
Landing page parity + fillin_health MCP tool
Landing page was claiming a 2-source corpus + single-embedder stack; neither is true anymore. Pulled it back into reality. - web/index.html: stack chips reflect 4 sources, pluggable embeddings, authority×recency rerank, MCP server. "How it works" copy now names the rerank step honestly. - mcp_server.py: fillin_health() — reachable, host, rows, earliest, latest. Lets an agent decide whether to call fillin_query at all.
2026-04-25 · e352816
Authority + recency reranking: arxiv & GH releases beat HN noise
v2 roadmap item — once corpus has 4 sources of mixed quality, pure cosine similarity is the wrong final order. Pull 3× candidates from LanceDB, rerank by similarity × authority × recency, slice to k. - ranking.py: SOURCE_AUTHORITY (arxiv/GH = 0.95, RSS = 0.75, HN = 0.70), exp recency decay with 90-day half-life. Both tunable via env (FILLIN_AUTHORITY JSON, FILLIN_RECENCY_HALF_LIFE). - db.py: query_delta now pulls k * RERANK_FACTOR, reranks, slices. Each result carries a `score` field so callers can show grounding confidence. - 12 new tests, 74 passing.
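A sketch of the similarity × authority × recency rerank under stated assumptions: the authority values and the 90-day half-life come from the entry above, while the dict keys ("gh_releases", "hackernews"), the default for unknown sources, and the candidate dict shape are guesses made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

# Values from the changelog; the key names themselves are assumptions.
SOURCE_AUTHORITY: Dict[str, float] = {
    "arxiv": 0.95, "gh_releases": 0.95, "rss": 0.75, "hackernews": 0.70,
}
DEFAULT_AUTHORITY = 0.5  # assumption: fallback for sources not in the table


def effective_score(similarity: float, source: str, published_at: datetime,
                    now: datetime, half_life_days: float = 90.0) -> float:
    """similarity x authority x recency, with exponential half-life decay."""
    age_days = max(0.0, (now - published_at).total_seconds() / 86400.0)
    recency = 0.5 ** (age_days / half_life_days)
    authority = SOURCE_AUTHORITY.get(source, DEFAULT_AUTHORITY)
    return similarity * authority * recency


def rerank(candidates: List[dict], k: int, now: datetime) -> List[dict]:
    """Score an over-fetched candidate list, sort descending, slice to k."""
    scored = [
        dict(c, score=effective_score(c["similarity"], c["source"],
                                      c["published_at"], now))
        for c in candidates
    ]
    scored.sort(key=lambda c: c["score"], reverse=True)
    return scored[:k]
```

In use, the caller would fetch k * RERANK_FACTOR candidates and pass them to `rerank`, so a slightly less similar arxiv paper can outrank a fresher-looking HN post.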
2026-04-25 · 822b0fb
Corpus expansion: RSS + GitHub Releases sources land
Phase 4 of the launch plan: depth compounds silently while traffic arrives. Two new timestamped corpora wired into the existing fetch(hours) → (docs, dropped) contract. - sources/rss.py: feedparser-backed reader with HTML stripping, retry, cutoff filter. Default feeds curated for the LLM/dev/agent space; override with FILLIN_RSS_FEEDS. - sources/github_releases.py: REST API client for /repos/{slug}/releases with token-aware rate limiting (60/h → 5000/h with GITHUB_TOKEN). Drafts and prereleases are skipped. Default repos seed the agent-framework ecosystem; override with FILLIN_GITHUB_REPOS. - ingest.py: registers both under SOURCES so they're picked up by the default ingest pass and the scheduler. - db.py: lazy-import lancedb so source modules import cheaply (testable without the heavy DB dep). - 22 new tests, 62 passing total.
2026-04-24 · 5bc3bb6
Pluggable embeddings: MiniLM default, Perplexity via OpenRouter ready
Prep work so swapping the embedding backend is a one-env-var change, not a refactor. Default behavior unchanged — MiniLM stays on. Perplexity backend is wired and tested but dormant until FILLIN_EMBED_MODEL=perplexity is set and the corpus is re-embedded. - embeddings.py: Embedder protocol + MiniLMEmbedder + PerplexityEmbedder (OpenRouter, 32k context, $0.004/1M tok) + get_embedder singleton - db.py: schema dim pulled from active embedder at table-create time - tests/test_embeddings.py: 6 tests covering defaults, dims, auth gate 28 passing.
2026-04-24 · 71b71f5
Commercialization push: landing page, agent, MCP server, gstack
- web/index.html: single-file landing page with live SVG gap chart, looping agent-↔-Fillin demo, interactive onboarding panel (model/volume → USDC quote + copy-paste snippet) - examples/agent.py: Claude Opus 4.7 + tool-runner demo that calls fillin_query and synthesizes grounded answers with inline citations - mcp_server.py: MCP server exposing fillin_query / fillin_stats over stdio for Claude Code, Cursor, Continue, Zed, etc. - tools/load_test.py: concurrent hammer with p50/p95/p99 reporting - HN_POST.md: Show HN draft + operator notes - CLAUDE.md: project manifest + gstack skill routing rules - requirements.txt: pin anthropic + mcp
2026-04-22 · cc126b4
Add ONBOARDING.md + runnable Python client
Documents the full x402 signed-challenge flow for first-time paying customers: discovery → deposit → credit check → challenge → sign → query, with error-table, bearer-key shortcut, and TypeScript + Python snippets. examples/client.py is a dependency-light reference client. Round-tripped locally: the signatures it produces validate against payments/auth.py's verify_signature.
2026-04-22 · 27f56cf
Infra toward first paying customer: rate limits, query cap, arXiv source
Knocks out the two remaining High items from security-report.md and adds the second ingest source called out in FINDINGS.md's v1 roadmap. Rate limiting (slowapi, per-IP): - /query 30/min - /v1/payment/challenge 10/min - /v1/payment/account 10/min Limits can be disabled for tests with FILLIN_DISABLE_RATE_LIMIT=1; exceeding returns 429 with a structured JSON body. Input validation on /query: - query: max_length=512, min_length=1 (closes the embedder-DoS lever) - cutoff: datetime (was str with manual ISO parsing) — invalid input now fails at Pydantic boundary with 422 arXiv ingest source: - sources/hackernews.py — existing HN logic moved out of ingest.py - sources/arxiv.py — Atom feed for cs.AI/cs.LG/cs.CL/cs.CR/cs.DC, sorted desc, paged until out of window - ingest.py — orchestrator: iterates SOURCES, isolates per-source failures, embeds once, upserts once - scheduler.py unchanged; picks up both sources automatically Tests: 51 → 67. Adds test_api_limits.py (rate limits + validation), test_source_hn.py, test_source_arxiv.py, and rewrites test_ingest.py to cover the orchestrator (source merging, failure isolation, subset flag).
2026-04-22 · 1f6cc99
Harden fillin: tests, efficiency, and signed x402 handshake
Lifts the seven "fine" components from /sitrep to "works great" and fixes the CRITICAL finding from /security-scan. Hardening: - db.upsert: LanceDB merge_insert replaces O(n) in-Python id dedup - db.stats: pyarrow column-scan min/max, no row-dict materialization - ingest.fetch_page: 3-attempt exponential backoff on RequestException - ingest.ingest: counts and logs dropped malformed hits - wallet: explicit module cache with reset_cache(); is_configured() now validates checksum, not just env presence; api lifespan hook calls init() - credits.spend / credit_deposit: reject non-positive amounts; add deposits_address_idx Critical fix — EIP-191 signed-challenge handshake on /query: Previously X-Payer-Address was trusted as payer identity; anyone could drain any funded account after address enumeration via /v1/payment/account. Now every paid /query requires X-Payer-Address, X-Payer-Nonce, and X-Payer-Signature. POST /v1/payment/challenge mints a one-shot nonce + canonical message; the server recovers the signer and rejects 401 on mismatch. Nonce consumption + credit deduction are a single SQLite BEGIN IMMEDIATE transaction — single-use, replay-proof, and a failed spend rolls back nonce consumption. Tests: 0 → 51, all green. Covers db, ingest, wallet, credits, EIP-191 recovery, nonce lifecycle, and end-to-end attacker scenarios (spoofed address, wrong signer, replay, unfunded-but-signed) via FastAPI TestClient. security-report.md documents the remaining High/Medium/Low items for a follow-up pass (rate limiting, query max_length, balance enumeration, dep pinning).
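The single-transaction nonce consumption plus credit deduction can be sketched with the standard sqlite3 module. The table and column names here are hypothetical and signature recovery is omitted; only the BEGIN IMMEDIATE pattern is taken from the entry above.

```python
import sqlite3


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    # isolation_level=None: autocommit mode, so BEGIN/COMMIT are issued by hand.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE credits (address TEXT PRIMARY KEY, "
                 "balance_micro INTEGER NOT NULL)")
    conn.execute("CREATE TABLE nonces (nonce TEXT PRIMARY KEY, "
                 "address TEXT NOT NULL, used INTEGER NOT NULL DEFAULT 0)")
    return conn


def spend_with_nonce(conn: sqlite3.Connection, address: str, nonce: str,
                     price_micro: int) -> bool:
    """Consume the nonce and deduct credits atomically; False means rejected."""
    if price_micro <= 0:
        raise ValueError("price must be positive")
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading state
    try:
        used = conn.execute(
            "UPDATE nonces SET used = 1 "
            "WHERE nonce = ? AND address = ? AND used = 0",
            (nonce, address)).rowcount
        spent = 0
        if used == 1:
            spent = conn.execute(
                "UPDATE credits SET balance_micro = balance_micro - ? "
                "WHERE address = ? AND balance_micro >= ?",
                (price_micro, address, price_micro)).rowcount
        if used == 1 and spent == 1:
            conn.execute("COMMIT")
            return True
        conn.execute("ROLLBACK")  # failed spend also un-consumes the nonce
        return False
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Both UPDATEs are conditional, so a replayed nonce or an underfunded account changes zero rows and the whole transaction rolls back, leaving the nonce reusable after a failed spend.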
2026-04-21 · 4738a8d
x402 payments: Base USDC wallet + monitor + credits ledger
- payments/wallet.py: web3.py wrapper mirroring GLYPH's wallet.ts, with Base sepolia/mainnet USDC support, self-custodied key, balance checks. - payments/credits.py: SQLite-backed per-payer credit ledger, micro-USDC accounting (no float drift), idempotent deposits by tx_hash, atomic spend with conditional UPDATE. - payments/monitor.py: polls Base for USDC Transfer events into our wallet and credits sender addresses. Confirmations=2, cold-starts at head-1 so we don't replay history. Runs as its own systemd unit. - api.py: /query now accepts EITHER a valid FILLIN_API_KEY bearer OR an X-Payer-Address header whose credits cover the price. Unpaid requests return 402 with pay_to, network, usdc_contract, and next-step instructions. New GET /v1/payment/account/{addr} returns balance. Bearer compare stays constant-time. - deploy/nginx-fillin.conf: HTTP bootstrap site for fillin.glyphapi.dev (TLS to be added by certbot --nginx after DNS propagates). - deploy/fillin-monitor.service: hardened systemd unit for the monitor. - deploy/install.sh: now also installs monitor unit + nginx site. - All three units: PYTHONUNBUFFERED=1 for readable journal logs. - requirements.txt: web3>=7, eth-account>=0.11. Price locked at $0.01/USDC per query (Starfleet order: commodity pricing was leaving money on the table for a product with a time-in-market moat). Verified live on snapback: unauthed POST /query → 402 with payment instructions; /v1/payment/info returns wallet 0x763C… on base-sepolia; monitor watching block 40503061+.
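A sketch of the two ledger properties named above, integer micro-USDC accounting and idempotent deposits keyed by tx_hash. The schema and the names `to_micro`, `open_ledger`, and `credit_deposit` are assumptions made for illustration, not the actual payments/credits.py.

```python
import sqlite3
from decimal import Decimal

MICRO_PER_USDC = 1_000_000  # USDC has 6 decimals


def to_micro(usdc: str) -> int:
    """Parse a decimal string into integer micro-USDC (no float involved)."""
    return int(Decimal(usdc) * MICRO_PER_USDC)


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE deposits (tx_hash TEXT PRIMARY KEY, "
                 "address TEXT NOT NULL, amount_micro INTEGER NOT NULL)")
    conn.execute("CREATE TABLE credits (address TEXT PRIMARY KEY, "
                 "balance_micro INTEGER NOT NULL)")
    conn.commit()
    return conn


def credit_deposit(conn: sqlite3.Connection, tx_hash: str, address: str,
                   amount_micro: int) -> bool:
    """Credit a deposit once; a repeated tx_hash is a no-op returning False."""
    if amount_micro <= 0:
        raise ValueError("deposit must be positive")
    try:
        with conn:  # commits on success, rolls back if anything raises
            # The PRIMARY KEY on tx_hash rejects a second insert of the same tx.
            conn.execute("INSERT INTO deposits VALUES (?, ?, ?)",
                         (tx_hash, address, amount_micro))
            conn.execute("INSERT OR IGNORE INTO credits VALUES (?, 0)",
                         (address,))
            conn.execute("UPDATE credits SET balance_micro = balance_micro + ? "
                         "WHERE address = ?", (amount_micro, address))
    except sqlite3.IntegrityError:
        return False
    return True
```

Keeping balances as integers means 0.1 + 0.2 USDC adds up exactly, and a monitor that sees the same Transfer event twice cannot double-credit the sender.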
2026-04-20 · 420d378
Deploy Fillin to snapback: bearer auth + hardened systemd units
- api.py: bearer-token auth on /stats and /query via FILLIN_API_KEY env var; public /healthz for unauthed liveness. Uses secrets.compare_digest to avoid timing leaks. Returns 503 if key is not configured. - deploy/fillin-api.service + fillin-scheduler.service: systemd units with ProtectSystem=strict, ProtectHome, NoNewPrivileges, limited ReadWritePaths. Cache dirs redirected into /opt/fillin/.cache so the HF model download works under ProtectHome. - deploy/install.sh: idempotent rsync + venv + systemctl deploy. First run generates a cryptographically strong FILLIN_API_KEY (32 random bytes hex) into /opt/fillin/.env with chmod 600. Verified live on snapback:8766 — unauth→401, /healthz→200, authed /query returns today's news with gap_days=109.58.
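A minimal sketch of the bearer check and key generation described above, using only the standard `secrets` module. The function names are hypothetical, and the status codes are returned as plain integers rather than through FastAPI.

```python
import secrets
from typing import Optional


def generate_api_key() -> str:
    """32 random bytes, hex-encoded (64 characters)."""
    return secrets.token_hex(32)


def check_bearer(authorization: Optional[str], configured_key: Optional[str]) -> int:
    """Return the HTTP status the auth gate would produce."""
    if not configured_key:
        return 503  # server has no key configured
    prefix = "Bearer "
    if not authorization or not authorization.startswith(prefix):
        return 401
    presented = authorization[len(prefix):]
    # compare_digest takes time independent of where the strings first differ.
    if secrets.compare_digest(presented.encode(), configured_key.encode()):
        return 200
    return 401
```

Encoding both sides to bytes keeps `compare_digest` safe even if a client sends non-ASCII characters in the header.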
2026-04-19 · d015361
Fillin v0 — time-series vector DB of the internet
Core components: - db.py: LanceDB schema + temporal delta query (cutoff + semantic similarity) - ingest.py: Hacker News Algolia ingestion with dedup - api.py: FastAPI POST /query, GET /stats - demo.py: CLI simulating an LLM with a cutoff asking the delta - scheduler.py: daemon loop for always-on ingestion (the moat clock) - eval.py: wedge proof vs HN Algolia keyword baseline - FINDINGS.md: 2026-04-19 eval — recall wins, relevance needs better embedder Stack: LanceDB + sentence-transformers MiniLM-L6-v2 + FastAPI + requests. 300 HN docs indexed at first ingestion, covering 2026-04-19 window.