Ch. 32

Backpressure & Automated Feedback

Part 10 / Sustainability & Deployment

The core problem

AI made code generation fast. It didn’t make code verification fast. The human became the bottleneck.

An engineering lead at an automotive SaaS company shared this: one engineer reviewed 130 pull requests in 15 days. About 20 file changes each. 20-30% of the AI-generated code required manual fixes. That’s one person acting as the quality gate for an entire team’s AI output.

An enterprise with 1,500 engineers had been using Copilot, Cursor, and Claude Code for five months. More PRs were being generated. Merge times hadn’t changed. The bottleneck moved from production to review.

This is what happens when there’s no automated feedback between the AI and the human. Every mistake flows to you. Every hallucination is your problem. The fix isn’t a better model. It’s backpressure.

What backpressure means for AI agents

In distributed systems, backpressure prevents a fast producer from overwhelming a slow consumer. TCP uses window-based flow control. Reactive Streams use demand signaling. Kafka uses consumer group lag monitoring. The principle is the same in every case: the consumer tells the producer to slow down when it can’t keep up.

The same principle applies to AI-assisted development: the AI is the fast producer, you are the slow consumer, and most teams have zero backpressure between the two. The agent generates code at machine speed. The human reviews it at human speed. Without backpressure, the review queue grows without bound, review quality degrades, and the team ends up in one of the two failure modes described in Chapter 2 - rubber-stamping or bottlenecking.

Backpressure for agents means automated feedback loops that catch errors before they reach a human. Every error caught by a type checker, a test suite, or a linter is one fewer error in your review queue. The more errors the automated pipeline catches, the less work the human reviewer has to do, and the higher the quality of the work that does reach human review.
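The mechanics are easy to see in miniature. A bounded queue gives a fast producer no choice but to wait when the slow consumer falls behind - a toy illustration of demand signaling, not tied to any particular agent framework:

```python
import queue
import threading

def producer(q: queue.Queue, items) -> None:
    for item in items:
        # put() blocks when the queue is full: the producer is forced
        # to slow to the consumer's pace. That blocking IS backpressure.
        q.put(item)
    q.put(None)  # sentinel: no more work

def consumer(q: queue.Queue, results: list) -> None:
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item * 2)  # stand-in for slow review work

q = queue.Queue(maxsize=3)  # bounded buffer: at most 3 items in flight
results: list = []
t = threading.Thread(target=consumer, args=(q, results))
t.start()
producer(q, range(10))
t.join()
```

Remove the `maxsize` bound and the queue grows without limit - the unbounded review queue most teams have today.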

The backpressure hierarchy

Ordered from strongest to weakest signal. Implement top-down:

Layer 1: Type Systems

The most effective form of backpressure. When an agent generates TypeScript with strict mode, Rust, or Go, the compiler catches entire categories of errors instantly. The agent gets immediate feedback and self-corrects.

The quality of error messages matters. Rust’s compiler explains what went wrong, suggests a fix, and points to the exact location. That explanation feeds directly back into the LLM. The better the error message, the more likely the agent self-corrects on the first try.

If you’re choosing a language for a new project with heavy AI-assisted development, the strength of the type system should be a primary factor. Not because you need types for yourself. Because the agent needs types to catch its own mistakes.

Layer 2: Test Suites

A failing test tells the agent: “what you just did broke something.” A passing test tells the agent: “you’re on the right track.”

The key word is fast. If your test suite takes 20 minutes, the agent sits idle for 20 minutes between attempts. Teams that cut test suite runtime from 15 minutes to 90 seconds saw immediate improvement in agent output quality - more iterations, better final results.

Layer 3: Linters and Pre-commit Hooks

Before AI, pre-commit hooks were annoying. They slowed down your commit flow. Now that agents do the committing, it doesn’t matter if hooks add 30 seconds. The agent doesn’t care. Every issue caught by a linter is one fewer issue in your review queue.

Turn on every strict rule. The agent will comply without complaint.
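As a sketch, a hook configuration for a Python project using the pre-commit tool might look like this - the hook repos and ids below follow pre-commit's published conventions, but treat the pinned versions as placeholders for whatever your project actually uses:

```yaml
# .pre-commit-config.yaml (sketch; pin real versions for your repo)
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0
    hooks:
      - id: ruff          # lint; --fix auto-corrects the agent's output
        args: [--fix]
      - id: ruff-format   # formatting
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.11.0
    hooks:
      - id: mypy          # strict type checking before every commit
```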

Layer 4: Architecture Enforcement

Architecture enforcement tools (ArchUnit for Java, dependency-cruiser for JavaScript, custom rules for other languages) prevent agents from violating architectural boundaries. “The API layer cannot import from the database layer directly - it must go through the service layer.” “No circular dependencies between packages.” “The auth module cannot depend on the billing module.”

These rules catch a category of errors that type systems and tests miss: the code compiles, the tests pass, but the architecture is wrong. An agent that adds a direct database call from an API handler produces working code that violates your architecture. Without enforcement, this violation reaches human review. With enforcement, the build fails with a clear message, and the agent self-corrects.

Architecture enforcement is particularly valuable for agents because agents don’t have an intuitive understanding of your architecture. A human developer who has worked on the codebase for months knows that “we don’t call the database from the API layer.” An agent doesn’t know this unless it’s told - either through AGENTS.md (which it might ignore) or through enforcement (which it can’t ignore).
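For Python, a minimal version of such a rule can be a custom script (tools like import-linter offer this declaratively; the layer and module names below are hypothetical examples). It walks a module's AST and reports imports that cross a forbidden boundary:

```python
import ast

# Hypothetical rule: modules in the "api" layer may not import
# the "db" layer directly - they must go through the service layer.
FORBIDDEN = {"api": {"db"}}

def violations(source: str, layer: str) -> list[str]:
    """Return forbidden imports found in one module's source."""
    banned = FORBIDDEN.get(layer, set())
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if name.split(".")[0] in banned:
                found.append(
                    f"line {node.lineno}: import of '{name}' "
                    f"is banned in layer '{layer}'"
                )
    return found
```

Wired into CI, a non-empty result fails the build with the rule name in the message - exactly the structured feedback an agent can act on.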

Layer 5: Visual Verification

For frontend work, agents can screenshot what they’ve rendered and compare against a stored baseline. A sketch, assuming Playwright and Pillow are available (the 1% mismatch threshold is a tunable assumption):

async def visual_check(page_url: str, baseline_path: str) -> dict:
    from playwright.async_api import async_playwright
    from PIL import Image, ImageChops

    # Render the page and capture the current screenshot
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto(page_url)
        await page.screenshot(path="current.png")
        await browser.close()

    # Compare pixel-by-pixel against the stored baseline
    baseline = Image.open(baseline_path).convert("RGB")
    current = Image.open("current.png").convert("RGB")
    diff = ImageChops.difference(baseline, current)
    mismatch = sum(px != (0, 0, 0) for px in diff.getdata()) / (diff.width * diff.height)
    return {"match": mismatch < 0.01, "mismatch_ratio": mismatch}

The complete backpressure pipeline

A production backpressure pipeline has five stages, each feeding results back to the agent before human review.

Stage 1: Static Analysis. The agent submits code. Within seconds, linters (ESLint, Ruff, golangci-lint), type checkers (TypeScript, mypy, go vet), and formatters (Prettier, Black, gofmt) run against the diff. Failures are returned to the agent as structured feedback - file, line number, rule violated, suggested fix. The agent self-corrects and resubmits. This stage catches 40-60% of issues and completes in under 10 seconds.

Stage 2: Test Execution. The corrected code runs against the project’s test suite - but only the affected tests, not the full suite. Test impact analysis (using tools like Jest’s --changedSince or pytest-testmon) identifies which tests are relevant to the changed files. Failures are returned with the test name, expected output, actual output, and stack trace. The agent analyzes the failure and fixes the code. This stage catches logic errors and regressions, completing in 30-90 seconds.

Stage 3: Security Scanning. Static application security testing (SAST) tools like Semgrep, CodeQL, or Snyk Code scan for vulnerabilities - SQL injection, path traversal, hardcoded secrets, insecure dependencies. Findings are returned with severity, CWE classification, and remediation guidance. The agent patches the vulnerability and resubmits. This stage is non-negotiable for production agents.

Stage 4: Architecture Enforcement. Custom rules verify that the change follows the project’s architectural conventions. Does the new file go in the right directory? Does it follow the naming convention? Does it import from allowed modules only? Does it respect the dependency graph? Tools like ArchUnit (Java), Dependency Cruiser (JavaScript), or custom scripts enforce these rules. Violations are returned with the rule name and expected behavior.

Stage 5: Human Review Gate. Only code that passes all four automated stages reaches a human reviewer. The reviewer sees a clean diff with a summary of what automated checks were run and passed. Their job is to evaluate what automation can’t - architectural fit, business logic correctness, and whether the approach makes sense. This is where human judgment adds the most value, unburdened by mechanical issues that the pipeline already caught.

The pipeline is iterative. If the agent fails at any stage, it receives feedback and retries from that stage - not from the beginning. A well-tuned pipeline allows 1-3 iterations before the code passes all automated checks. If the agent can’t self-correct after 3 attempts, the task is escalated to a human with the full trace of what was tried and why it failed.
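That control flow can be sketched in a few lines - the stage functions and the `fix` callback below are stand-ins for your real checks and your agent call:

```python
from typing import Callable, Optional

# Each stage returns None on success, or a feedback string on failure.
Stage = Callable[[str], Optional[str]]

def run_pipeline(code: str, stages: list[Stage],
                 fix: Callable[[str, str], str],
                 max_attempts: int = 3) -> tuple[str, bool]:
    """Run code through the stages in order. On failure, feed the error
    back to the agent (`fix`) and retry from the failing stage - not
    from the beginning. Escalate to a human after max_attempts fixes."""
    i = 0
    attempts = 0
    while i < len(stages):
        feedback = stages[i](code)
        if feedback is None:
            i += 1                      # stage passed, move on
            continue
        attempts += 1
        if attempts > max_attempts:
            return code, False          # escalate with the full trace
        code = fix(code, feedback)      # agent self-corrects
    return code, True

# Demo with stubs: a "linter" that passes once the fix marker is present.
lint = lambda c: None if "fixed" in c else "lint: unused import"
agent_fix = lambda c, fb: c + "  # fixed"
final, passed = run_pipeline("draft", [lint], agent_fix)
```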

Tuning backpressure

Backpressure isn’t binary. It’s a dial.

Setting    | Symptom                                                                                                | Fix
Too little | Hallucinations pass through. Agent output looks clean but is subtly wrong. Human catches logic errors. | Add more automated checks. Strengthen test coverage.
Too much   | Feedback loop is too slow. Agent waits 20 min for tests. Speed advantage lost.                         | Parallelize checks. Run only affected tests. Cache build artifacts.
Just right | Agent self-corrects in 1-3 iterations. Human reviews architecture and judgment calls only.             | Target: full cycle in < 2 minutes.

Measuring backpressure effectiveness

How do you know if your backpressure is working? Track three metrics.

Self-correction rate: The percentage of agent errors that are caught and fixed by the agent itself (through backpressure) versus caught by human review. A healthy backpressure system has a self-correction rate above 80% - meaning 4 out of 5 errors never reach a human reviewer. If your self-correction rate is below 60%, your backpressure is too weak.

Iteration count: The average number of iterations (edit → check → fix cycles) before the agent produces output that passes all automated checks. A healthy iteration count is 1-3. If agents consistently need 5+ iterations, either the backpressure is too strict (catching things that don’t matter) or the agent’s initial output quality is too low (which might indicate a context engineering problem).

Human review time: The average time a human spends reviewing agent-generated PRs. This should decrease as backpressure improves - if automated checks catch more issues, humans have fewer issues to find. Track this metric weekly and investigate if it trends upward.

What this means for your stack

Your investment in engineering infrastructure is now directly correlated with how effectively you can use AI.

Infrastructure            | Without It                        | With It
Strong type system        | Human catches type errors         | Agent self-corrects in seconds
Fast test suite (< 2 min) | Human verifies behavior           | Agent iterates until tests pass
Strict linting            | Human reviews style               | Agent auto-fixes on commit
Architecture rules        | Human catches boundary violations | Build fails with clear message
Pre-commit hooks          | Human reviews everything          | Trivial issues never reach PR

Teams with strong types, fast tests, and strict linting adopt AI agents faster and with less fatigue. Teams without these things experience humans acting as the quality gate, burning out under the review burden.

The best AI tool in the world, pointed at a codebase with no tests and no types, will produce output that a human has to manually verify line by line. And that human will burn out.

Backpressure for different tech stacks

The effectiveness of backpressure varies dramatically by tech stack. This isn’t a minor consideration - it should influence your technology choices for new projects where AI-assisted development is a priority.

TypeScript with strict mode is the current sweet spot for agent-assisted development. The type system catches a wide range of errors at compile time, the ecosystem has excellent linting tools (ESLint, Prettier), the test frameworks are fast (Vitest runs in under a second for most test suites), and the error messages are clear enough for agents to self-correct. Teams using TypeScript strict mode report 90-95% agent self-correction rates on type errors.

Rust has the strongest type system and the best compiler error messages of any mainstream language. The borrow checker catches entire categories of memory safety bugs that no test suite would find. Rust’s compiler errors are famously helpful - they explain what went wrong, suggest a fix, and point to the exact location. This makes Rust an excellent language for agent-assisted development, despite its learning curve. The downside is that Rust’s compilation times are longer, which slows the feedback loop.

Python is the weakest mainstream language for backpressure. Without type annotations, the type checker catches nothing. With type annotations (mypy, pyright), it catches some errors, but Python’s type system is less expressive than TypeScript’s or Rust’s. Python’s test frameworks are fast, but the lack of compile-time checking means more errors reach the test suite - and some errors reach production. Teams using Python for agent-assisted development should invest heavily in type annotations, strict mypy configuration, and comprehensive test coverage.

Go falls in the middle. Its type system is simpler than TypeScript’s or Rust’s, but it catches the most common errors. Go’s compilation is fast (sub-second for most projects), its error messages are clear, and its testing framework is built into the language. The main weakness is that Go’s generics are less expressive than TypeScript’s type system, which means some errors that TypeScript catches at compile time require runtime checks in Go.

Backpressure and CI/CD integration

Backpressure is most effective when it’s integrated into your CI/CD pipeline, not just your local development environment. When an agent creates a pull request, the CI pipeline should run the full backpressure stack - type checking, tests, linting, security scanning, architecture enforcement - and report the results back to the agent. If the pipeline fails, the agent should be able to read the failure output, understand what went wrong, and push a fix automatically.

This creates a closed loop: agent generates code, CI runs checks, failures are reported back to the agent, agent fixes the issues, CI runs again. The loop continues until all checks pass or the agent hits its iteration limit. The human reviewer only sees the final result - code that has already passed all automated checks.

The key metric for CI-integrated backpressure is the cycle time - how long it takes for the agent to get feedback from CI. If CI takes 20 minutes, the agent waits 20 minutes between iterations. If CI takes 90 seconds, the agent can iterate 13 times in the same period. Teams that invest in fast CI see dramatically better agent output quality because the agent can iterate more times within its cost and time budgets.
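The iteration arithmetic from the paragraph above, made explicit (a trivial calculation, included only to make the budget trade-off concrete):

```python
def iterations_within_budget(budget_s: int, ci_cycle_s: int) -> int:
    """How many agent iterations fit in a fixed time budget."""
    return budget_s // ci_cycle_s

# In a 20-minute budget: a 20-minute CI allows 1 iteration,
# a 90-second CI allows 13.
```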

The ROI of engineering infrastructure in the agent era

The return on investment for engineering infrastructure has changed dramatically with agent adoption. Before agents, a strict type system saved time by catching bugs at compile time instead of runtime. With agents, a strict type system saves time by enabling agent self-correction - a multiplier on the original benefit.

Consider the math. A strict TypeScript configuration catches roughly 20 type errors per day in a typical codebase. Before agents, each error would have been caught by a developer during testing - maybe 5 minutes per error, or 100 minutes per day. With agents, each error is caught by the compiler and fed back to the agent, which self-corrects in seconds. The developer never sees the error. The time savings is the same 100 minutes per day, but now it’s saving the developer from reviewing agent output rather than from debugging their own code.

The same multiplier applies to every layer of backpressure. Fast tests save developer time by catching regressions before they reach review. Strict linting saves developer time by catching style issues before they reach review. Architecture enforcement saves developer time by catching structural violations before they reach review. Each layer reduces the human review burden, which is the bottleneck in agent-assisted development.

This means that the ROI calculation for engineering infrastructure investments has changed. Before agents, investing a week in improving your test suite saved maybe 30 minutes per week in debugging time - a 6-month payback period. With agents, the same investment saves 30 minutes per week in debugging time plus 2 hours per week in review time - a 2-month payback period. The agent era makes engineering infrastructure investments pay off faster.

Related Concepts: AI Fatigue (Chapter 20), The Conductor Model (Chapter 21), Agent Evaluation (Chapter 26)
Related Practices: Your First Agent in Production (Chapter 23), Security Checklist (Chapter 24)