No QA team, no staging army: just one developer shipping a 100-city edge app who cannot afford to break it. The confidence comes from a 3-environment pipeline, real-D1 integration tests (no mocks), and Lighthouse CI as a hard deploy gate. This post is the DevOps story.
This is post 7 of the series.
Table of contents
- The 3-environment pipeline
- 4 dependencies, 6 kinds of tests
- Real D1, no mocks
- Lighthouse CI as a deploy gate
- Keeping test output out of the way
- FAQ
The 3-environment pipeline
Every change walks the same three stages, and each stage must be green before the change advances:
flowchart LR
L[Local<br/>pnpm dev · dev D1] -->|green| D[Dev worker<br/>real Workers runtime · dev D1]
D -->|golden path green| P[Prod<br/>real users · prod D1]
P -.->|incident| R[wrangler rollback]

| Stage | What it proves |
|---|---|
| Local (pnpm dev) | 90% of iteration: UI and logic |
| Dev worker | The real Workers runtime: OpenNext, edge bindings, D1 |
| Prod | Real users, real data |
The rule: don't skip dev worker straight to prod. The local server can't catch edge-runtime or binding issues; only the dev worker can. An emergency hotfix can skip dev, but only on an explicit decision — never as routine. And D1 migrations always run on dev first, then prod — never an unproven migration against production.
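The promotion rule can be read as a small decision function. The following is a minimal sketch, not the actual tooling behind this pipeline: the `PipelineState` shape, the `canPromote` function, and the `explicitHotfix` flag are hypothetical names chosen for illustration, and requiring local green for a hotfix is an assumption.

```typescript
// Hypothetical model of the local -> dev worker -> prod promotion rule.
type Stage = "local" | "dev" | "prod";

interface PipelineState {
  localGreen: boolean;        // local checks pass
  devGreen: boolean;          // golden path verified on the dev worker
  migrationRunOnDev: boolean; // pending D1 migration already applied on dev (true if none is pending)
  explicitHotfix: boolean;    // someone deliberately decided to bypass the dev worker
}

function canPromote(target: Stage, s: PipelineState): boolean {
  if (target === "local") return true;
  if (target === "dev") return s.localGreen;

  // Prod: an unproven migration never runs against production, hotfix or not.
  if (!s.migrationRunOnDev) return false;

  // Normal path needs a green dev worker; an explicit hotfix may bypass it.
  // Assumption: even a hotfix must still be green locally.
  return s.localGreen && (s.devGreen || s.explicitHotfix);
}
```

Encoding the hotfix as an explicit flag keeps the bypass visible: skipping the dev worker has to be a recorded decision rather than a default.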
4 dependencies, 6 kinds of tests
A deliberately small test stack covers a lot of ground:
flowchart TB
V[Vitest] --> U[Unit tests]
V --> I[D1 integration via miniflare]
PW[Playwright] --> E[E2E]
PW --> S[i18n smoke]
PW --> A11Y[a11y via axe-core]
LH[Lighthouse CI] --> WV[Web Vitals]

Four dependencies (Vitest, Playwright, axe-core, Lighthouse CI) cover six kinds of tests: unit, D1 integration, e2e, i18n smoke, accessibility, and Web Vitals. Small surface, broad coverage, low maintenance. New test types are added only when the four genuinely can't cover a need, which is discipline against tool sprawl.
Real D1, no mocks
Integration tests run against a real local D1 via miniflare — actual SQL against an actual SQLite engine. Mocked D1 is not accepted. A mock that returns what you expect proves your mock, not your query. Real D1 catches the schema mismatch, the bad bind, the SQL that's valid in your head but not in SQLite. The cost — spinning up miniflare — is worth a test suite that tests reality.
One honest constraint worth documenting: some pages depend on D1 in ways the plain e2e environment (which runs without D1) can't serve, so those tests are CI-skipped and verified instead via the D1 integration suite and the dev worker. Naming that constraint beats a test that's "green" only because it never really ran.
Lighthouse CI as a deploy gate
Performance isn't a vibe — it's a gate. Lighthouse CI runs tiered budgets (e.g. Total Blocking Time), and a regression blocks the deploy. That has caught real problems:
- Adding Google Analytics pushed homepage TBT over budget. The fix wasn't to loosen the budget; it was to load GA with a lazyOnload strategy so analytics stops blocking the main thread. The gate forced the right fix instead of letting a slow page ship.
- Budgets are tiered per page type and tuned to absorb real CI-runner noise without hiding true regressions: tight enough to catch a problem, loose enough to not cry wolf.
A performance gate turns "the site feels slow lately" into "this PR exceeded the TBT budget, here's the number" — actionable, not vibes.
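To make the gate logic concrete, here is a minimal sketch of a tiered budget check. It does not use Lighthouse CI's own configuration or API; the page types, the budget numbers, and the `checkBudgets` helper are all hypothetical. Taking the median of several runs is one assumed way to absorb CI-runner noise, not necessarily how this project's budgets are tuned.

```typescript
// Hypothetical page tiers and TBT budgets in milliseconds (illustrative numbers only).
type PageType = "home" | "city" | "article";
const TBT_BUDGET_MS: Record<PageType, number> = { home: 200, city: 300, article: 300 };

interface PageRuns {
  url: string;
  pageType: PageType;
  tbtSamplesMs: number[]; // TBT from repeated Lighthouse runs of the same URL
}

interface Violation {
  url: string;
  tbtMs: number;
  budgetMs: number;
}

// Median of the samples, so one noisy CI run cannot fail (or pass) the gate alone.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median needs at least one sample");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Returns every page whose median TBT exceeds the budget for its tier.
function checkBudgets(pages: PageRuns[]): Violation[] {
  const violations: Violation[] = [];
  for (const page of pages) {
    const tbtMs = median(page.tbtSamplesMs);
    const budgetMs = TBT_BUDGET_MS[page.pageType];
    if (tbtMs > budgetMs) violations.push({ url: page.url, tbtMs, budgetMs });
  }
  return violations;
}
```

A CI step built on something like this would fail the build whenever `checkBudgets` returns a non-empty list, and the violation carries the number to report on the PR.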
Keeping test output out of the way
A full e2e run, a 268-route build, a wrangler upload manifest — each produces a firehose of
output. When iterating with an AI coding agent, that firehose would flood the working context.
So those heavy commands are delegated to a subagent that reads the full log in its own context
and returns only {passed, failed, failures:[...]}. The main thread sees a clean summary, not
ten thousand lines. (This is the harness/context discipline from
post 5, applied to DevOps.)
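The summary shape can be illustrated with a small reducer. This is a sketch under assumptions: the log line format (`PASS <name>` and `FAIL <name>: <reason>`) is invented for the example and is not the real output of Playwright, Vitest, or wrangler, and `summarizeLog` is a hypothetical helper rather than part of any agent framework.

```typescript
// The compact result handed back to the main thread.
interface RunSummary {
  passed: number;
  failed: number;
  failures: string[];
}

// Collapses a verbose log into counts plus the failing entries.
// Assumes a made-up format: lines starting with "PASS " or "FAIL "; everything else is noise.
function summarizeLog(log: string): RunSummary {
  const summary: RunSummary = { passed: 0, failed: 0, failures: [] };
  for (const rawLine of log.split("\n")) {
    const line = rawLine.trim();
    if (line.startsWith("PASS ")) {
      summary.passed += 1;
    } else if (line.startsWith("FAIL ")) {
      summary.failed += 1;
      summary.failures.push(line.slice("FAIL ".length));
    }
  }
  return summary;
}
```

Whatever the real parsing looks like, the point is the same: the full log stays in the subagent's context, and only this small object crosses back.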
Key takeaways
- A 3-environment pipeline (local → dev worker → prod), green before advancing, catches edge-runtime issues a local server can't.
- Four test dependencies can cover six kinds of tests — keep the stack small and disciplined.
- Test against real D1 via miniflare; a mocked database only tests the mock.
- Make performance a hard gate with Lighthouse CI: it forces real fixes (like GA lazyOnload) instead of slow pages shipping.
- Offload noisy test/build output to a subagent so it returns a summary, not a firehose.
FAQ
How do you test a Cloudflare D1 app without mocks? Run a real local D1 with miniflare and execute actual SQL in your integration tests. This catches schema and query bugs a mock would hide.
What is a Lighthouse CI deploy gate? A CI step that runs Lighthouse against defined performance budgets (like Total Blocking Time) and fails the build on regression, blocking the deploy until performance is fixed.
How can a solo developer ship safely without QA? A staged pipeline (local → dev → prod) with green-before-advance, a small but broad automated test stack, real-D1 integration tests, performance gates, and fast rollback.
Why deploy to a dev worker before production? A local dev server can't reproduce the real Workers runtime, edge bindings, or D1 behavior. The dev worker validates those before real users see them.
Next → Multilingual SEO & GEO: sitemaps, llms.txt, and AI-citation across 4 languages