The ADR Comeback: Anchoring Agentic Engineering Teams With Architecture Decision Records
Agentic coding tools are refactoring services that humans spent years getting right, and most of them have no memory of why the original design existed. Architecture Decision Records were a tidy idea when humans wrote all the code; they are now load-bearing infrastructure for any team where an AI agent touches production. A practical playbook for the ADR discipline, the new metrics that matter, and the three failure modes to avoid.

Last quarter I watched an agentic coding tool replace a perfectly good idempotency layer in a payments service. The agent did not know that the original implementation had been written that way after a Black Friday incident two years prior, when the previous design had double-charged 4,200 customers in 90 minutes. The agent saw a verbose middleware, decided it was redundant, and refactored it into something cleaner. The PR was green. The reviewer was junior. Production held for nine days before the same failure mode re-emerged at lower volume.
Nothing about that story is unusual in May 2026. What is unusual is how few teams have done anything about it.
The defense is not better guardrails. It is not a stricter code reviewer. It is a discipline most teams quietly abandoned in the late 2010s: the Architecture Decision Record. ADRs were a tidy idea when humans wrote all the code. They are now load-bearing infrastructure for any team where an AI agent touches production.
This is the playbook I have used to bring ADRs back into multi-team programs, the new metrics that matter when half the authors of your codebase are not human, and the three failure modes that kill the discipline before it gets traction.
The original case for ADRs, briefly
The ADR pattern was popularized by Michael Nygard in 2011 as a way to write down architectural decisions at the moment they were made, in version control, alongside the code they constrained. The template was deliberately spartan: title, status, context, decision, consequences. The whole point was to make the decision cheap enough to record that teams would actually do it.
For about five years, this caught on inside engineering organizations that were maturing past the "one architect in a basement" model. ADR repositories became a way to onboard new engineers, to settle recurring debates ("why are we on Postgres and not Mongo"), and to keep architects honest about the constraints they were imposing.
Then most teams quietly stopped writing them.
Why we let them lapse
There were three reasons, and all of them were defensible at the time.
The first was that ADRs became performative. In the worst cases they were written after the decision had already shipped, as a kind of architectural compliance theater. A document that ratifies a decision rather than records one has no real institutional value, and engineers correctly noticed that the effort produced no information.
The second was that engineering leaders started leaning on architects rather than archives. If you have a strong principal engineer in the room, you can usually get a verbal answer to "why did we do it this way" in under thirty seconds. That works fine until the principal engineer leaves, or the room gets too big, or the question is being asked at 2:14am by someone who is not in the room at all.
The third was that nobody was measuring the absence. ADR coverage is invisible until you need it. The cost of not having one shows up months later, in the form of a refactor that re-litigates a settled question or, more painfully, re-introduces a bug the original design was deliberately avoiding.
In a world where the cost of architectural drift was bounded by the speed of human authorship, this was a survivable level of decay. Drift was slow because typing was slow.
That world is over.
What agents changed
Agentic coding tools do not type slowly. A senior-level agent can produce, in a single afternoon, more architecturally significant code than a four-person team could ship in a week. The work is good enough to merge. It is also generated by a process that has no memory of why the existing design exists, because the existing design is mostly comments, structure, and naming, not a written record of the constraints it is satisfying.
This produces a specific class of failure I have started calling agentic drift: a slow, plausible erosion of architectural intent, committed in small PRs, each of which looks like a reasonable refactor on its own merits. The bug from the 2024 incident comes back because nobody told the agent the bug had ever happened. The compliance pattern from the 2023 audit gets simplified because nobody told the agent the pattern was load-bearing. The database constraint that was deliberately kept loose gets tightened because the agent inferred, correctly from the data shape and incorrectly from the business reality, that the field should be unique.

The defensive question is the same one ADRs were originally invented to answer: why was the original design this way? The difference is that the answer used to live in the head of a principal engineer who could be paged, and now it needs to live somewhere an agent can read.
That somewhere is an ADR archive.
ADRs as a constraint in the agent's planning loop
The mechanical insight is simple. If you put your ADRs in version control alongside your code, modern agentic frameworks will index them, retrieve the relevant ones at planning time, and use them as constraints in the plan they generate. The agent stops being a stateless transformer of intent into code and starts being a constrained planner that respects prior commitments.
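A minimal sketch of that retrieval step, assuming a docs/adr/ directory of markdown files. Keyword overlap stands in for the embedding-based retrieval your tooling actually uses; the function name and layout are illustrative, not any particular framework's API:

```python
# Sketch: score ADRs by keyword overlap with the task description and return
# the top matches, to be prepended to the agent's planning prompt as
# constraints. Real frameworks use embeddings; this shows the shape of the loop.
from pathlib import Path

def retrieve_adrs(task: str, adr_dir: str = "docs/adr", top_n: int = 3) -> list[str]:
    task_words = set(task.lower().split())
    scored = []
    for path in Path(adr_dir).glob("*.md"):
        text = path.read_text(encoding="utf-8")
        overlap = len(task_words & set(text.lower().split()))
        if overlap:
            scored.append((overlap, path.name, text))
    scored.sort(reverse=True)  # highest-overlap ADRs first
    return [f"## {name}\n{text}" for _, name, text in scored[:top_n]]
```

The point is not the scoring function; it is that the archive lives where the planner can reach it.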

The retrieval step is the cheap part. Most coding agents already do something like this for context, and the engineering effort is mostly about file layout and naming. The discipline is keeping the archive accurate, keeping it close to the code, and closing the loop when the agent itself makes a new architectural decision.
The last point is the one most teams miss. If you let an agent introduce a new pattern without writing the corresponding ADR, you have built a system that produces drift faster than it produces memory. The fix is to treat the ADR as a deliverable of the agent's work, not a separate activity humans do later. In practice, this means the same PR that introduces the new pattern includes the ADR file. The reviewer rejects the PR if the ADR is missing. No exceptions for "small" changes; small changes are how drift accumulates.
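The same-PR rule is mechanical enough to automate. A sketch of a CI gate, with hypothetical path prefixes standing in for your actual critical paths:

```python
# Sketch of a CI gate: fail the build when a change touches architecturally
# significant code but includes no ADR file. Path prefixes are assumptions;
# substitute your own critical paths.

SIGNIFICANT_PREFIXES = ("services/payments/", "db/migrations/", "auth/")

def check_pr(changed_files: list[str]) -> bool:
    """True if the change set is acceptable: either it does not touch
    architecturally significant paths, or it ships an ADR alongside."""
    touches_architecture = any(f.startswith(SIGNIFICANT_PREFIXES) for f in changed_files)
    includes_adr = any(f.startswith("docs/adr/") for f in changed_files)
    return (not touches_architecture) or includes_adr
```

Fed the output of `git diff --name-only origin/main...HEAD`, this turns "no exceptions" from a norm into a build failure.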
This is the same instinct, applied at the architecture layer, that underlies the AI agent governance gap I have written about elsewhere. The agent's work is not done when the code is merged. It is done when the institutional record reflects what the agent decided.
The lifecycle, kept in version control
An ADR has a lifecycle that is deliberately boring. It is proposed, then accepted, then eventually either deprecated or superseded by a newer ADR. Sometimes it is rejected and kept as a record of the road not taken, which turns out to be one of the highest-value categories of all, because the next time someone proposes the rejected idea, the conversation can start from "here is what we learned last time" rather than from zero.

The non-negotiable property is that an ADR is never deleted. Status changes. The record stays. This matters for humans, but it matters more for agents, because the agent needs to be able to reason about not just what the current decision is, but how the team got here. A superseded ADR with a pointer to its replacement is more useful to an agent than a tidy archive that only shows current state.
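The "status changes, the record stays" rule can be linted. A sketch, assuming each ADR file carries a Status: line and an ADR-NNN filename prefix; both are conventions, not requirements of any tool:

```python
# Sketch: lint the archive so that statuses are valid and every
# "Superseded by ADR-NNN" pointer resolves to an ADR that still exists.
import re
from pathlib import Path

VALID = {"Proposed", "Accepted", "Deprecated", "Rejected"}

def lint_archive(adr_dir: str = "docs/adr") -> list[str]:
    numbers = set()
    for p in Path(adr_dir).glob("*.md"):
        m = re.match(r"ADR-(\d+)", p.name)
        if m:
            numbers.add(m.group(1))
    errors = []
    for p in Path(adr_dir).glob("*.md"):
        m = re.search(r"^Status:\s*(.+)$", p.read_text(encoding="utf-8"), re.MULTILINE)
        if not m:
            errors.append(f"{p.name}: no Status line")
            continue
        status = m.group(1).strip()
        sup = re.match(r"Superseded by ADR-(\d+)", status)
        if sup:
            if sup.group(1) not in numbers:
                errors.append(f"{p.name}: superseded-by pointer to missing ADR-{sup.group(1)}")
        elif status not in VALID:
            errors.append(f"{p.name}: unknown status {status!r}")
    return errors
```

A dangling supersession pointer is exactly the tidy-but-amnesiac archive the paragraph above warns against.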
A template you can adopt this afternoon
The whole point of ADRs is that they are cheap enough to write that the discipline can survive contact with a real engineering schedule. The template I use across programs is roughly this, kept in a docs/adr/ directory in the repo:
# ADR-NNN: Short title in the form of a decision
Status: Proposed | Accepted | Deprecated | Superseded by ADR-NNN | Rejected
Date: YYYY-MM-DD
Authors: Names or handles
## Context
What is the situation that demanded a decision? Two or three short paragraphs.
Include the constraints that were in play, including non-obvious ones (an
incident from 18 months ago, a compliance gate, a vendor SLA).
## Decision
The decision in one or two sentences. Active voice. "We will use X for Y."
## Consequences
What gets easier, what gets harder, what we are accepting. Three to seven
bullets is usually right. Include the second-order consequences, not just the
direct ones.
## Alternatives considered
Two or three alternatives, with a sentence each on why they were not chosen.
This is the section that pays off most when an agent later proposes one of
the alternatives.
## Related
Pointers to other ADRs, incidents, vendor contracts, RFC threads.
You can compress this further. You can extend it. The exact shape matters less than the fact that it exists, that it is in the repo, and that the team treats it as the canonical place to look. If you are looking for a starter template, the adr-tools project on GitHub gives you a CLI that scaffolds new ADRs with the right numbering and structure.
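In the same spirit as adr-tools, a scaffolder is a few lines of Python. This sketch assumes ADR-NNN-slug.md filenames and is illustrative, not the adr-tools implementation:

```python
# Sketch of an `adr new`-style scaffolder: find the next number in the
# archive and write a file pre-filled with the template headings.
import datetime
import re
from pathlib import Path

def new_adr(title: str, adr_dir: str = "docs/adr") -> Path:
    d = Path(adr_dir)
    d.mkdir(parents=True, exist_ok=True)
    nums = [int(m.group(1)) for p in d.glob("*.md")
            if (m := re.match(r"ADR-(\d+)", p.name))]
    n = max(nums, default=0) + 1
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    path = d / f"ADR-{n:03d}-{slug}.md"
    path.write_text(
        f"# ADR-{n:03d}: {title}\n"
        f"Status: Proposed\n"
        f"Date: {datetime.date.today():%Y-%m-%d}\n"
        "Authors: \n\n## Context\n\n## Decision\n\n## Consequences\n\n"
        "## Alternatives considered\n\n## Related\n",
        encoding="utf-8",
    )
    return path
```

Lowering the cost of starting an ADR to one command is most of what keeps the discipline alive on a real schedule.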
The new metrics that matter
The traditional delivery KPIs I have written about elsewhere still apply, but the agentic era introduces three new measurements that engineering leaders should be tracking explicitly.
ADR coverage is the percentage of architecturally significant decisions in the codebase that have a corresponding ADR. You will not get to 100 percent. You should not try. A reasonable target is 70 to 80 percent of decisions that touch the critical path: data model, service boundaries, authentication, billing, anything in the regulated zone, anything where an incident has happened before. Coverage below 40 percent is the danger zone, and the data I have seen across programs suggests it is where agentic drift accelerates fastest.

Supersession rate is how often existing ADRs get superseded by newer ones. Track it quarterly. A rate of zero is a problem: it means either the team is not making new architectural decisions, which is unlikely, or it is making them and not recording them, which is the actual failure. A healthy program supersedes 5 to 15 percent of its ADRs per year as the system evolves.
Agent retrieval rate is the new one. If your agentic tooling supports it, instrument how often the agent actually retrieves and references an ADR during planning. This is a leading indicator. The agent that is retrieving zero ADRs is the agent that is about to refactor your idempotency layer. The agent that is retrieving four ADRs and citing the relevant one in its PR description is the agent that has been brought into the team's institutional memory.
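Instrumenting the retrieval rate can start crudely: if the agent is prompted to cite the ADRs it consulted in its PR descriptions, the metric is a regex away. A sketch, assuming PR bodies fetched from your forge's API and handed over as plain strings:

```python
# Sketch: fraction of PRs whose description cites at least one ADR,
# assuming the agent cites them in the form "ADR-017".
import re

def retrieval_rate(pr_bodies: list[str]) -> float:
    """Fraction of PR descriptions citing at least one ADR."""
    if not pr_bodies:
        return 0.0
    cited = sum(1 for body in pr_bodies if re.search(r"ADR-\d+", body))
    return cited / len(pr_bodies)
```

A crude proxy, but a number on a dashboard beats a silent archive the agent never opens.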
Three failure modes that kill the discipline
I have watched ADR programs fail in the same three ways often enough that they deserve names.
Ratification theater. The ADR is written after the decision has shipped, by an engineer who was told to write it for compliance reasons. The document records what happened, not why it was decided that way, and there is no signal in the consequences section because nobody is willing to write down what is now being accepted as a trade-off. The fix is to require the ADR in the same PR as the change, not as a follow-up. If the ADR is hard to write before the change ships, the change is not ready to ship.
Archive sprawl. The team writes hundreds of ADRs, most of them about minor implementation choices, and the archive becomes too noisy for either humans or agents to retrieve usefully. The fix is to be ruthless about scope. An ADR is for a decision that constrains future work or that will be expensive to revisit. A choice of npm package version is not an ADR. A choice to standardize on a database is. A useful test: if the decision could be made by a single engineer in a single PR without anyone else needing to know, it is not an ADR.
Detachment from the agent loop. The team writes great ADRs, keeps them in a tidy wiki, and the agent never sees them. This is the most common failure I have seen in the last six months, and it is entirely cultural. The fix is to physically move the archive into the code repository, name the directory predictably, and verify that your agentic tooling is actually indexing it. If you cannot show that the agent retrieved an ADR on a recent significant PR, the discipline is not yet working.
What this means for delivery leaders
The temptation, if you are a program manager or a product leader reading this, is to declare ADRs an engineering problem and move on. That is the wrong move. The reason ADRs lapsed the first time was that nobody outside engineering measured them, and the reason they will lapse again is the same.
The leadership job is small but specific. Add ADR coverage to your engineering KPI dashboard. Ask, in the same monthly cadence in which you ask about DORA metrics or velocity, what new ADRs were written and what existing ones were superseded. Treat a quarter with zero supersessions the way you would treat a quarter with zero retros: as a sign that the program has stopped learning out loud.
The cost of doing this is roughly 20 minutes per month of leadership attention. The cost of not doing it is the next idempotency layer that gets refactored away, in a small clean PR, on a Tuesday, by an agent that did not know the story.
ADRs are how you tell the story to the next reader. In May 2026, that reader is increasingly not a person.
References
- Michael Nygard. Documenting Architecture Decisions. cognitect.com
- Nat Pryce. adr-tools: command line tools for working with Architecture Decision Records. github.com
- ThoughtWorks Technology Radar. Lightweight Architecture Decision Records (Adopt, 2017 and renewed in subsequent volumes). thoughtworks.com
- Rick Pollick. The AI Agent Governance Gap. rickpollick.com
- Rick Pollick. From Gantt to Graph: Why Modern Project Management Demands Technical Fluency. rickpollick.com
