Ten Quality Gates — What Fails a Release at Routiine (And Why)
The ten Quality Gates every Routiine release clears — and the specific failures that hold a release until a human fixes them. Real examples, real thresholds.
For the founder who wants to know exactly what stops code from reaching production — and why we will not ship when a gate fails, even at two a.m. on a Friday.
The Situation
Every software team has some version of "quality control." In most Dallas agencies I have audited, it amounts to a staging URL, a single tester, and a Slack thread where the founder types "looks good." That is not a process. That is a vibe check. When the inevitable post-launch bug shows up — a broken checkout, a leaking environment variable, a color-contrast issue that fails an accessibility audit — the team goes back to the code, patches the bug, and updates the Notion doc with "lesson learned: check this next time." The list of lessons grows. The list is never applied mechanically. The next release reintroduces a bug from three releases ago.
This is the default state of software quality in the Dallas market. It is not because the engineers are careless. It is because the agency's definition of "done" is subjective. "Done" means the engineer is tired. "Done" means the founder is satisfied in a Zoom call. "Done" means the invoice is ready. None of these is a falsifiable statement about the software itself. None of them can be automated. None of them survive a staff change.
At Routiine we replaced subjective "done" with ten Quality Gates. Every release clears every gate before it reaches production. Every gate has a specific failure condition that holds the release until a human resolves it. Every gate is automated where possible and manual where not. The gates are documented publicly at /forge as part of the FORGE methodology, and they are named on every engagement's contract so the client knows exactly what the standard is before they sign.
This piece exists because founders consistently ask the same question on the first discovery call: "What specifically do you check before you ship?" The answer is ten things, in order, and I want every prospective Routiine client to know them in detail before they decide to hire us. If the gates do not match what you expect, you should not hire us. If they do, you should know what you are paying for.
The Problem
The reason Quality Gates are not standard in Dallas software delivery is that they are expensive to set up and cheap to skip. The first engagement at a new agency almost never has a gate process because the team is optimizing for initial velocity. The second engagement inherits the lack of process because nobody wants to install brakes on a moving car. The tenth engagement still has no gates because the team has learned to live with the consequences — occasional post-launch fires, occasional refunds, occasional client churn. It is cheaper, in the short run, to absorb the failures than to build the infrastructure to prevent them.
This economic equation is correct for the agency and wrong for the client. The client pays for the failures in the form of emergency patches, reputation damage, and the loss of confidence that comes from never being sure whether the next deploy will break something. The client has no way to audit this cost, because they never see the deploys that would have failed if the gates existed. The cost is invisible.
The second reason Quality Gates are rare is that they require discipline at the moment of lowest reward. Every gate is most valuable on the deploy you almost skipped it on. A rushed Friday-night release to fix a production bug is exactly the moment a team will skip the test suite, skip the accessibility check, skip the security scan. And it is exactly the moment those gates would have caught a new vulnerability or a broken flow. The discipline has to be external to the engineer's willpower. It has to be built into the pipeline so that no human can override it without a recorded override, reviewed after the fact.
A third reason is that gates fail in a way that looks, to an impatient founder, like the team is slow. A release that fails the performance budget and gets held for a day looks, from the outside, like the developer did not ship. The only defense against this appearance is the prior agreement that gates are non-negotiable. When the founder signs the contract, the gates are named and accepted. Every future held release references the gate that held it. Over time, founders stop asking "why didn't this ship?" because the answer is always on the gate report and the answer always prevents a worse outcome later.
The fourth and most subtle problem is that gates have to be carefully chosen. A team that has forty gates ships nothing. A team that has two gates ships garbage. The number ten was arrived at after three years of tuning — adding gates when we learned a recurring failure mode, removing gates when we found they were cosmetic. The current ten are the minimum set that catches ninety-five percent of the production failures I have seen in Dallas agency work, and the maximum set that can run inside a ninety-minute release window without imposing unacceptable velocity cost.
The Implication
Releases that ship without gates do not fail visibly on day one. They fail slowly, over months, in a way that looks like bad luck.
Month one: a dependency vulnerability is disclosed upstream. A team without a security scan gate does not know the vulnerability is present until a customer's security review flags it. The team spends a week producing evidence that the vulnerability does not affect their usage, or patching it under time pressure.
Month two: a new browser version ships. The team without an accessibility or cross-browser gate discovers that a core user flow has broken in Safari on iOS. Conversion drops by a measurable percentage and nobody notices for nine days.
Month three: a new feature is shipped. A regression is introduced — an old test starts failing because a behavior it relied on has changed. Without a test-suite gate that is strict about greenness, the failing test is commented out and the regression ships to production. Six weeks later, the regression affects a specific customer in a specific edge case, and the founder loses the account.
Month four: a founder-team member leaves. They wrote a critical module without documentation. A team without a documentation gate discovers this only when the replacement engineer tries to modify the module and breaks production for forty minutes.
The Decay Thesis we publish at /living-software names this pattern precisely: software without active maintenance decays at a rate faster than the business can absorb. Quality Gates are the mechanism that bounds the decay rate. Without them, the decay compounds across every release. With them, every release stabilizes the system against the previous release's drift.
The monetary version of this, measured across a three-year horizon on a mid-complexity SaaS, is typically between forty and one hundred twenty thousand dollars in avoidable cost — emergency patches, customer refunds, reputation damage, and re-platform fees. The non-monetary version is worse: founders who ship without gates never feel confident, never sleep well on release nights, and progressively slow down their own roadmaps to reduce risk. The product stagnates not because of market forces but because the team has lost the nerve to deploy.
The ten gates we run are the specific structural answer to this pattern.
The Need-Payoff
Here are the ten Quality Gates, in order. Every Routiine release — Sprint Scope, Launch, Platform, or System — clears every one before it reaches production. Every gate has a named owner (automated agent or human). Every gate has a documented failure threshold. Every gate is run on every release regardless of size.
Gate 1 — Lint and format. Runs automatically on every pull request. Fails the build if the codebase diverges from the agreed style guide. Purpose: eliminate the entire category of "the diff is unreadable" review comments. Failure mode: stops the PR from merging. Typical resolution time: two minutes.
Gate 2 — Typecheck. Full strict-mode TypeScript check (or equivalent for other languages). No implicit any. No unchecked optional access. Fails on any type error. Purpose: catch the class of bugs that manifest only at runtime in specific edge cases. Failure mode: stops the PR. Typical resolution time: five to thirty minutes.
Gate 3 — Test suite green. Full unit, integration, and smoke test suite passes. No skipped tests without a recorded override. No flaky tests ignored. Purpose: verify that the shipped behavior matches the intended behavior. Failure mode: stops the release. Typical resolution time: fifteen minutes to four hours depending on the failure.
Gate 4 — Security scan. Dependency vulnerability scan (npm audit or equivalent) plus static analysis with Semgrep. No high-severity vulnerabilities. No secrets committed. Purpose: catch the upstream dependency risks and the "API key in the repo" class of failures. Failure mode: stops the release until remediated. Typical resolution time: thirty minutes to a day.
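The pass/fail rule for Gate 4 can be sketched in a few lines. This is an illustrative sketch, not our pipeline code: the interface mirrors the severity counts that `npm audit --json` reports under `metadata.vulnerabilities`, and the function name is hypothetical.

```typescript
// Hypothetical sketch of Gate 4's hold condition (names illustrative).
// Assumes the severity counts reported by `npm audit --json`.
interface AuditCounts {
  info: number;
  low: number;
  moderate: number;
  high: number;
  critical: number;
}

// The gate holds the release on any high- or critical-severity finding.
// Lower severities are logged but do not hold the release on their own.
function securityGatePasses(counts: AuditCounts): boolean {
  return counts.high === 0 && counts.critical === 0;
}
```

In practice this check runs in CI next to the Semgrep pass and the committed-secrets check, and the release stays held until the high and critical counts return to zero.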
Gate 5 — Accessibility check. Automated axe-core run on every route. Manual keyboard navigation of critical flows. Color contrast verified against WCAG AA. Purpose: ensure the product is usable by everyone and legally defensible. Failure mode: stops the release for accessibility regressions. Typical resolution time: thirty minutes to two hours.
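The automated half of Gate 5 reduces to a filter over axe-core's findings. A minimal sketch, assuming axe-core's result shape (each violation carries an `impact` level); the severity policy shown here is illustrative, not a statement of the exact Routiine threshold.

```typescript
// Hypothetical sketch of Gate 5's automated check (policy illustrative).
// axe-core reports violations with an impact level.
type Impact = "minor" | "moderate" | "serious" | "critical";

interface AxeViolation {
  id: string; // axe rule id, e.g. "color-contrast"
  impact: Impact;
}

// Hold the release on any serious or critical violation; lesser findings
// are carried into the manual keyboard-navigation pass for review.
function accessibilityGatePasses(violations: AxeViolation[]): boolean {
  return !violations.some(
    (v) => v.impact === "serious" || v.impact === "critical",
  );
}
```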
Gate 6 — Performance budget. Lighthouse performance score above the client's agreed target (typically ninety for marketing pages, seventy-five for authenticated app pages). Core Web Vitals within Google's green thresholds. Purpose: prevent the slow creep of bundle size and image weight that degrades conversion. Failure mode: stops the release if budget is exceeded by more than five percent. Typical resolution time: one to four hours.
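Gate 6's five-percent tolerance makes the hold condition a one-line rule. A sketch, using the thresholds stated above; the function name is illustrative.

```typescript
// Hypothetical sketch of Gate 6's threshold rule (numbers per the article).
// The release is held only when the Lighthouse score misses the agreed
// budget by more than five percent.
function performanceGatePasses(
  lighthouseScore: number,
  budget: number,
): boolean {
  return lighthouseScore >= budget * 0.95;
}
```

With a marketing-page budget of ninety, a score of eighty-six still ships (within the five-percent tolerance) while a score of eighty-five holds the release.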
Gate 7 — Staging deploy. Full deploy to a production-equivalent staging environment. Smoke test of critical flows. End-to-end Playwright tests against staging. Purpose: catch the "works on my machine" class of bugs before they affect real users. Failure mode: stops the production deploy. Typical resolution time: fifteen minutes to a full day depending on the divergence.
Gate 8 — Acceptance test with founder. Named critical paths reviewed live by the founder or their designated acceptance tester, against the acceptance criteria written at sprint planning. Purpose: confirm that the shipped behavior matches what the founder thought they were getting. Failure mode: release held until founder sign-off. Typical resolution time: under an hour if the work is on-spec.
Gate 9 — Documentation and runbook updated. README reflects the new state. Any new environment variables documented. Any new third-party services added to the Inheritance Ledger. Any new operational procedures written down. Purpose: maintain the Ownership Transfer integrity continuously, not just at the end of the engagement. Failure mode: stops the release. Typical resolution time: fifteen to thirty minutes.
Gate 10 — Rollback plan confirmed. Every release must have a documented rollback procedure that can be executed by a single engineer inside ten minutes. Database migrations must be reversible or have a documented forward-fix plan. Feature flags used for any change with irreversible side effects. Purpose: ensure that a production failure can be recovered from without panic. Failure mode: release held until rollback plan is written and reviewed. Typical resolution time: fifteen minutes.
These ten gates run on every release, regardless of scope. A one-line CSS change clears all ten. A major feature shipment clears all ten. A two-a.m. hotfix clears all ten. The discipline is that the gates do not bend for urgency. If the hotfix is urgent, the response is to make the gates run faster, not to skip them.
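The ordering discipline above can be sketched as a small runner: gates execute in sequence, the first failure holds the release, and the report names the gate that held it. This is an illustrative sketch of the control flow, not our actual pipeline; real checks shell out to the tools named in each gate.

```typescript
// Illustrative gate runner: gates run in order, the first failure holds
// the release, and the report names the holding gate.
interface Gate {
  name: string;
  check: () => boolean; // true = pass; real checks invoke external tools
}

type ReleaseReport =
  | { status: "shipped" }
  | { status: "held"; gate: string };

function runGates(gates: Gate[]): ReleaseReport {
  for (const gate of gates) {
    if (!gate.check()) {
      return { status: "held", gate: gate.name };
    }
  }
  return { status: "shipped" };
}

// Example: a failing test suite holds the release at Gate 3, and the
// later gates never run — urgency cannot route around the order.
const report = runGates([
  { name: "Lint and format", check: () => true },
  { name: "Typecheck", check: () => true },
  { name: "Test suite green", check: () => false },
  { name: "Security scan", check: () => true },
]);
```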
The Living Software doctrine published at /living-software frames why this matters beyond any single release: software that clears ten gates on every deploy accumulates quality the way a bank account accumulates interest. Every release reinforces the previous release's standards. A team that skips gates on the rushed Friday release pays the interest in arrears, with penalty, in the next production incident.
The Ship-or-Pay Guarantee applies on top of this. If the ten gates prevent us from shipping by the contractually agreed date, the founder does not pay the final milestone. The risk of the gates holding a release is carried by us, not by the client. In practice, this means our estimates are conservative enough to absorb typical gate holds, and our operational discipline is strict enough that typical gate holds are shallow. The risk is real; the incidence is low. Over the last eighteen months, less than three percent of our releases have missed their committed ship date, and every missed release was, in retrospect, correctly held by a gate — the alternative would have been a worse production outcome.
The Wise Magician stance on Quality Gates is that they are not a premium service. They are the minimum standard any serious software team should operate at. We publish them openly so that other teams can adopt them. The adoption is good for the market. A Dallas where every team runs ten gates is a Dallas where founders stop getting burned.
Next Steps
Three actions, in order of commitment.
First, read the full FORGE methodology at /forge. The ten gates are described in operational detail — the specific tools we use, the specific thresholds we enforce, the specific failure modes we have seen in the last eighteen months. If you run a team and want to adopt any of the gates, copy freely. The methodology is published as a gift to the market.
Second, book a free FORGE Audit at /contact. In the audit I will review your current release process against the ten gates and produce a written gap analysis. You will see, precisely, which gates you are running, which ones you are skipping, and which ones you are running informally when you should be running them automatically. The audit is free regardless of whether you hire us.
Third, if you want every gate applied to your next build by a team that operates them daily, apply to the Founding Client Program at /work. Five slots at twenty percent below our standard rate. Every gate applied. Every release audited. Every handoff complete. Every delivery under the Ship-or-Pay Guarantee.
Quality is not a feature. It is a process. The process is ten gates, in order, every time. Founders who understand this stop hiring agencies that cannot name their gates.
James Ross Jr.
Founder of Routiine LLC and architect of the FORGE methodology. Building AI-native software for businesses in Dallas-Fort Worth and beyond.