The foundations on which the assumptions of the Agile Manifesto rest are disappearing with agentic coding. The following basic assumptions from which agile practices were derived are falling away: the economic assumption that implementation is the bottleneck; the methodological assumption expressed by the Agile Manifesto in its four value pairs; the personnel assumption about teams, learning paths and roles; and the metric assumption that generation speed approximates delivery quality. At several of these points, agile logic is not being supplemented or shifted. It is being reversed.
These claims are supported by the current state of research on agentic software development, insofar as it appeared between October 2025 and May 2026 as arXiv preprints or peer-reviewed contributions.
A different scarcity
Agile emerged in a time when the most expensive and slowest resource in software development was human labour. Developers cost money, onboarding took months, context switching was expensive and good architects were scarce. Several core principles followed from this scarcity.
Self-organisation emerged because central planning was too slow and too poorly informed to allocate scarce developer time sensibly. Timeboxing emerged because tasks could neither be estimated reliably nor tracked dependably over months, and smaller time windows limited the cost of error. Velocity emerged because a team knew its own average delivery speed better than any external capacity planning function. The principle of “working software over comprehensive documentation” emerged because a complete specification before the code was more expensive than the code itself and quickly became obsolete anyway.
Each of these answers makes sense as long as implementation is the bottleneck. That assumption is changing. Bhati calls the new bottleneck verification debt. Apostolou, Bosch and Holmström Olsson speak of a capability-deployment verification gap. In their 16 interviews with twelve companies, they describe the same picture from two directions: four companies demonstrated higher agentic capabilities but could not use them productively because reliable verification was missing. Bandara et al. propose Agentsway, a process model in which verification and human orchestration stand alongside code generation as equal concerns.
This changes the optimisation problem. If the scarce resource is no longer how quickly a human can write code, but how quickly an organisation can review, classify and take responsibility for generated code, then all agile answers to the old scarcity point at the wrong problem. Self-organisation is an answer to scarce implementation capacity. It is not an answer to scarce verification capacity. Velocity measures generation speed. It does not measure review capacity. Timeboxing structures human work in iterations. It does not structure a gate sequence consisting of specification, code, test, evidence bundle, review and approval.
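The gate sequence named above can be made concrete in a minimal Python sketch. Only the order of the six gates is taken from the text. The names `GATES`, `WorkItem`, `next_gate` and `pass_gate` are illustrative assumptions and do not come from any of the cited works.

```python
from dataclasses import dataclass, field
from typing import Optional

# Gate order as listed in the text. Everything else is a hypothetical illustration.
GATES = ["specification", "code", "test", "evidence_bundle", "review", "approval"]


@dataclass
class WorkItem:
    """A unit of work that has to pass every gate in order."""
    name: str
    passed: list = field(default_factory=list)

    def next_gate(self) -> Optional[str]:
        """Return the first gate not yet passed, or None when all are passed."""
        for gate in GATES:
            if gate not in self.passed:
                return gate
        return None

    def pass_gate(self, gate: str) -> None:
        """Record a passed gate and refuse to skip ahead in the sequence."""
        expected = self.next_gate()
        if gate != expected:
            raise ValueError(f"cannot pass {gate!r} before {expected!r}")
        self.passed.append(gate)

    @property
    def deliverable(self) -> bool:
        """True only when every gate has been passed."""
        return self.next_gate() is None
```

In this sketch, progress is expressed as a position in the gate sequence and not as a share of a timebox, which is the structural difference the paragraph describes.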
A methodology built for a different scarcity is not modernised by adaptation. It loses its justification together with the optimisation problem it was designed to solve. That is the economic break.
Reversal instead of extension
The Agile Manifesto formulates four value pairs. In agentic work, three of them are not extended but pushed into reversal. The fourth remains formally intact, but its addressee changes.
Working software over comprehensive documentation. Spec-driven development reverses this order. Piskala describes the spectrum from spec-first to spec-as-source. Feng and Chen show in a pilot study that structured signatures can increase the test pass rate in repository-level generation. Lulla et al. measure a 28.64 percent runtime reduction and a 16.58 percent reduction in token consumption for an AGENTS.md across ten repositories and 124 pull requests. Galster et al. document eight configuration mechanisms in 2,853 repositories, including context files, skills, subagents, commands, rules and hooks. These artefacts do not come after the code as documentation. They stand before the code as a condition of possibility. Without them, output becomes less specific, slower and more expensive. The Manifesto order is not being extended. It is being reversed.
Individuals and interactions over processes and tools. When the tool is the main producer, this sentence loses its meaning. In Agentic Agile-V, Koch argues that long chat histories are not robust engineering contracts. Interaction becomes a risk that must be contained by processes and tools. The research turns the axis accordingly: conversation-to-contract gates, acceptance gates, TDD governance (Hasanli et al.), policy-as-prompt (Kholkar and Ahuja) and plane separation in CI/CD (Barnes and Ghaleb). Processes and tools become the carriers of commitment because interactions and individuals can no longer guarantee it.
Customer collaboration over contract negotiation. The agentic turn forces requirements to become verifiable. A specification that serves as input for an agent is a contract in the technical sense: acceptance criteria, constraints, test cases. This is not contract negotiation in the legal sense, but it is not the oral, collaborative requirements clarification of classic Agile either. The spec becomes a contract again, only in machine-readable form.
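What a specification as a machine-readable contract could look like is sketched below in Python. The three ingredients (acceptance criteria, constraints, test cases) are taken from the paragraph above. The class names, the example endpoint and the identifiers `AC-1` and `AC-2` are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AcceptanceCriterion:
    """One verifiable criterion with an executable check (the test case)."""
    identifier: str
    description: str
    check: Callable[[dict], bool]


@dataclass(frozen=True)
class Spec:
    """A specification that an agent receives as input (hypothetical shape)."""
    title: str
    constraints: tuple
    criteria: tuple

    def failed_criteria(self, result: dict) -> list:
        """Return the identifiers of all criteria the result does not meet."""
        return [c.identifier for c in self.criteria if not c.check(result)]


# Hypothetical example: a spec for a CSV export endpoint.
spec = Spec(
    title="CSV export endpoint",
    constraints=("no new third-party dependencies",),
    criteria=(
        AcceptanceCriterion(
            "AC-1", "responds with HTTP 200",
            lambda r: r.get("status_code") == 200,
        ),
        AcceptanceCriterion(
            "AC-2", "returns at least one row",
            lambda r: len(r.get("rows", [])) > 0,
        ),
    ),
)
```

A result such as `{"status_code": 200, "rows": []}` would fail `AC-2` in this sketch. The point is that the agreement is checked by execution, not by conversation.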
Responding to change over following a plan. This sentence formally still stands, but its addressee moves. The plan is now changed with gate discipline: which change may the agent execute, which change requires human approval, which change must never take effect automatically? The research on plane separation by Barnes and Ghaleb and on governance layers by Koch, by Xu et al. and by Kholkar and Ahuja reads in places closer to V-model discipline than to Scrum.
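The three questions of gate discipline can be read as a classification rule. The following Python sketch is one possible reading under stated assumptions: the path patterns in `RULES` are invented for illustration, and the fail-closed default (unknown paths require human approval) is a design choice of this sketch, not a finding from the cited research.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    AGENT_MAY_EXECUTE = "agent may execute"
    HUMAN_APPROVAL = "human approval required"
    NEVER_AUTOMATIC = "must never take effect automatically"


# Hypothetical path rules, checked top to bottom. The first match wins.
RULES = [
    ("infra/prod/*", Decision.NEVER_AUTOMATIC),
    (".github/workflows/*", Decision.HUMAN_APPROVAL),
    ("src/*", Decision.AGENT_MAY_EXECUTE),
    ("tests/*", Decision.AGENT_MAY_EXECUTE),
]

# Strictness order used to combine decisions across several touched paths.
STRICTNESS = [
    Decision.AGENT_MAY_EXECUTE,
    Decision.HUMAN_APPROVAL,
    Decision.NEVER_AUTOMATIC,
]


def classify_path(path: str) -> Decision:
    """Classify one touched path. Unknown paths fall back to human approval."""
    for pattern, decision in RULES:
        if fnmatchcase(path, pattern):
            return decision
    return Decision.HUMAN_APPROVAL


def classify_change(paths: list) -> Decision:
    """A change is as restricted as its most restricted touched path."""
    return max(
        (classify_path(p) for p in paths),
        key=STRICTNESS.index,
        default=Decision.HUMAN_APPROVAL,
    )
```

A change touching `src/app.py` and `.github/workflows/ci.yml` would require human approval here, because the strictest rule applies to the whole change.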
A methodology whose load-bearing values are reversed in three of four pairs cannot be continued without breaking the term. It must justify itself anew, or it loses its claim.
Who becomes senior?
The discussion about agentic tools speaks of the human as an orchestrator. That is the friendly formulation. The sharper reading is about personnel structure, and it is more uncomfortable.
Horikawa et al. examine 15,451 refactoring instances in 12,256 agentic pull requests. Refactoring occurs frequently, aims mostly at maintainability and readability, and often remains local. High-level design work is less strongly represented. This is precisely the distribution that used to serve as a learning field for junior developers: small tickets, local improvements, writing tests, improving readability. Pinna et al. analyse 7,156 agentic pull requests and show clear strengths in certain task types, especially in the class of bounded, local changes. Robbes et al. document in their adoption study the rapid spread of exactly this type of activity.
What agents are good at is therefore not an arbitrary slice of development work. It is essentially the area through which junior developers build seniority. If this learning path is structurally shortened or skipped altogether, a question arises that is rarely asked in methodology debates: where, in five years, will the people come from who can reliably review, classify and take responsibility for agentic outputs? Apostolou et al. identify reliable verification as the missing element that prevents several companies from productively using higher agentic levels. Verification at this depth requires experience that, in the classic model, was built through exactly the work that agents are now taking over. The scientific material does not state this observation directly, but it follows from the combination of the cited findings with the common personnel structure of software teams.
The Scrum roles also come under pressure. Scrum Masters and Agile Coaches depend almost entirely on human team dynamics. They are the answer to conflict, communication deficits, missing self-organisation and unspoken expectations. In a workflow in which one human works with an agent, these issues become less central. What becomes central is technical workflow architecture: gate design, spec structures, tool rights, hook and policy configuration, MCP setup, evidence-bundle schemas. Galster et al. document eight such mechanisms, and most of them are not coaching topics. They are engineering topics. A role whose field of work is so tightly bound to a shrinking component of the methodology is not reformed. It is resized.
The Product Owner is replaced by business architects. The PO as an interface between business and development team was an answer to expensive development time and to business departments that had no direct translation capability towards implementation. When translation moves into machine-readable specifications and implementation capacity becomes cheaper, part of this mediation work disappears. Responsibility for acceptance criteria, spec maintenance and gate decisions can move closer to the business than the Scrum model envisaged. In its agile definition, the Product Owner is no longer necessary in this form.
The metric reversal
A metric is useful as long as its sign is stable. More is good, or less is good. A metric whose sign can reverse is no longer a metric. It is a signal that must be evaluated according to context.
Velocity stands exactly at this point. Agarwal, He and Vasilescu examine 401 agent-first repositories and 117 IDE-first repositories with matched control groups. They find short-term velocity gains that are front-loaded in the agentic groups. At the same time, static-analysis warnings and cognitive complexity rise significantly. Ehsani et al. show in 33,596 agentic pull requests that reviewer abandonment is the most frequent pattern in the manually categorised subset of rejected PRs. High generation speed does not correlate with high delivery quality. In the agentic groups, it correlates with growing complexity and review load that is not being handled.
Velocity therefore flips in an agentic environment from a virtue to a risk indicator. High agentic velocity with unchanged review capacity is suspicious, not good. A methodology that has to reinterpret its central control metric as a risk metric should not gloss this over. Velocity remains measurable. It loses its meaning as a control variable. Burndown shares this fate because its meaning rests on the same assumption: that generation speed approximates delivery speed.
No replacement exists yet. Gate cycle time, evidence completeness, verification density, policy conformity, accepted acceptance criteria per unit of time: all are plausible candidates, none is mature practice. This gap deserves attention. The agile movement is left without a stable control metric, while the old metric not only becomes unreliable but flips into its opposite.
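To show what two of these candidates might measure, here is a minimal Python sketch. The definitions are assumptions made for illustration, since no mature practice exists to draw on: gate cycle time is taken as the mean time from opening a change to its approval, and evidence completeness as the share of required evidence items present. The set `REQUIRED_EVIDENCE` and the `Change` record are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical list of evidence items a change has to carry.
REQUIRED_EVIDENCE = {"spec", "tests", "static_analysis", "review_note"}


@dataclass
class Change:
    """One generated change on its way through the gates (hypothetical record)."""
    opened: datetime
    approved: Optional[datetime]  # None while the change is still waiting
    evidence: set


def gate_cycle_time(changes: list) -> Optional[timedelta]:
    """Mean time from opening to approval, over approved changes only."""
    durations = [c.approved - c.opened for c in changes if c.approved is not None]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


def evidence_completeness(change: Change) -> float:
    """Share of required evidence items present, between 0.0 and 1.0."""
    return len(change.evidence & REQUIRED_EVIDENCE) / len(REQUIRED_EVIDENCE)
```

Unlike velocity, both quantities in this sketch describe the verification side of the workflow. Whether they hold up as control variables is exactly the open question.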
A business model disappears
One layer appears only rarely in the scientific material, but it matters for the fate of agile methodology. Since the mid-2000s, Agile has also been a consulting and certification market. Scaled Agile Inc. with SAFe, Scrum.org, LeSS and a large number of independent coaches live from training courses, certificates and transformation projects. Their product consists of team coaching, rituals, role models and scaling frameworks.
The research points to other priorities. Bandara et al. propose Agentsway, a process model in which roles, learning loops, fine-tuning, orchestration and verification are central. Xu et al. call for a governance-first approach to agent engineering. Kholkar and Ahuja develop policy-as-prompt as a translation of governance rules into runtime guardrails. Koch works on a multi-layer translation of governance norms into runtime guardrails and on a V-model variant for agentic work. Galster et al. document the configuration layer of existing tools. All of this is engineering work, not coaching work.
This creates a structural transition problem. An industry whose business model is built on teaching rituals, roles and self-organisation practices cannot move into a market for gate architecture, spec engineering and policy configuration without losing part of its identity. That requires different methods, different curricula, different tools and different buyers inside companies. Here, too, it is not inevitable that the old industry disappears. It is inevitable, however, that it cannot keep its former claim to methodological market leadership without reinventing itself. Renewal on this scale is historically rather rare and usually possible only across a generational change.
What might remain
Which agile contributions continue to hold because they do not depend on the disappearing scarcity?
Small delivery packages remain useful because they reduce risk regardless of who writes the code. Close contact with business domains remains useful because specifications quickly become wrong without domain feedback. Reflection at regular intervals remains useful if its results land in the agent’s working environment, not only on a whiteboard. Accountability for delivery remains useful because agents cannot and should not carry responsibility. Transparency about ongoing work remains useful, but in the form of verifiable artefacts rather than regular status conversations.
This is a much shorter list than what the agile movement has exported as methodology. It consists essentially of practical wisdom older than the Agile Manifesto and able to stand without it. What it does not contain is the specifically agile apparatus of velocity, sprint estimation, self-organisation as a control principle, Scrum roles and ritual cadence. Exactly this apparatus distinguished Agile from older practice traditions. What remains when this apparatus disappears is not Agile with agents. It is closer to disciplined, verification-conscious software development with short feedback cycles, a practice that can be justified without the term Agile.
Conclusion
Agile methodology is not collapsing because its supporters abandon it. It is collapsing because the conditions it optimised against are disappearing in several layers at once. Scarcity moves from implementation to verification. Three of the four value pairs reverse. The personnel structure that carried its idea of team and learning path loses its basis of reproduction. The control metric reverses its sign. The consulting economy that organised the methodology is left without a compatible product.
Four of the shifts discussed here, the economic, the methodological, the personnel and the metric one, are directly supported by the 22 works evaluated here. The fifth, the business-model question, is derived. Anyone looking at the documented shifts one by one sees adaptation. Reading them together gives a different picture. This is not the end of software-engineering practices. It is the end of the specific synthesis that was called Agile in the form in which it was taught and sold for twenty years.
The more open question is speed. Apostolou et al. show that industry is still early. Seven of the twelve companies interviewed operate at the lowest maturity level. Four can demonstrate higher levels but cannot use them productively. That gives the existing methodology time. It does little to change the direction. The gates, artefacts, plane separations, spec structures and governance layers documented in the research will sooner or later arrive in practice. When they do, velocity charts, daily stand-up notes and sprint boards will not be abolished. They will simply show the wrong thing more and more visibly.
Note on evidence and method
This essay is based on my own evaluation of 22 works on agentic software development from the period between October 2025 and May 2026. It is not a formal systematic literature review. The selection and interpretation are mine.
The evidence for the economic, methodological, personnel and metric break points comes directly from the evaluated works. The section on the consulting economy goes beyond the scientific material and is marked as such. The question of learning paths from junior developer work towards seniority is also an inference from the activity distributions described in the studies, not a thesis directly established by the research.
One research gap is clear. There are hardly any robust longitudinal studies of teams working agentically for more than a year. Cross-team coordination is barely studied. The translation of agile metrics into agentic workflows remains open. Responsibility after incidents, performance evaluation and skill development are also underexamined.
The commercial planning and assessment of the value of produced results remain entirely open.
Literature
The following list names the locally evaluated versions. For arXiv papers, the link points to the respective abstract page.
- Shyam Agarwal, Hao He, Bogdan Vasilescu: AI IDEs or Autonomous Agents? Measuring the Impact of Coding Agents on Software Development. arXiv:2601.13597. https://arxiv.org/abs/2601.13597
- Spyridon Alvanakis Apostolou, Jan Bosch, Helena Holmström Olsson: Agentic AI in Industry: Adoption Level and Deployment Barriers. arXiv:2605.14675. https://arxiv.org/abs/2605.14675
- Eranga Bandara et al.: AGENTSWAY – Software Development Methodology for AI Agents-Based Teams. arXiv:2510.23664. https://arxiv.org/abs/2510.23664
- Marcus Emmanuel Barnes, Taher A. Ghaleb: From Assistance to Agency: Rethinking Autonomy and Control in CI/CD Pipelines. arXiv:2605.07062. https://arxiv.org/abs/2605.07062
- Happy Bhati: Agentic AI in the Software Development Lifecycle: Architecture, Empirical Evidence, and the Reshaping of Software Engineering. arXiv:2604.26275. https://arxiv.org/abs/2604.26275
- Ramtin Ehsani et al.: Where Do AI Coding Agents Fail? An Empirical Study of Failed Agentic Pull Requests in GitHub. arXiv:2601.15195. https://arxiv.org/abs/2601.15195
- Shuzhao Feng, Boqi Chen: LLM-Assisted Repository-Level Generation with Structured Spec-Driven Engineering. arXiv:2605.02455. https://arxiv.org/abs/2605.02455
- Matthias Galster et al.: Configuring Agentic AI Coding Tools: An Exploratory Study. arXiv:2602.14690. https://arxiv.org/abs/2602.14690
- Taher A. Ghaleb: When AI Agents Touch CI/CD Configurations: Frequency and Success. arXiv:2601.17413. https://arxiv.org/abs/2601.17413
- Tarlan Hasanli et al.: TDD Governance for Multi-Agent Code Generation via Prompt Engineering. arXiv:2604.26615. https://arxiv.org/abs/2604.26615
- Rashina Hoda: Toward Agentic Software Engineering Beyond Code: Framing Vision, Values, and Vocabulary. arXiv:2510.19692. https://arxiv.org/abs/2510.19692
- Kosei Horikawa et al.: Agentic Refactoring: An Empirical Study of AI Coding Agents. arXiv:2511.04824. https://arxiv.org/abs/2511.04824
- Gauri Kholkar, Ratinder Ahuja: Policy-as-Prompt: Turning AI Governance Rules into Guardrails for AI Agents. arXiv:2509.23994. https://arxiv.org/abs/2509.23994
- Christopher Koch: From Governance Norms to Enforceable Controls: A Layered Translation Method for Runtime Guardrails in Agentic AI. arXiv:2604.05229. https://arxiv.org/abs/2604.05229
- Christopher Koch: Agentic Agile-V: From Vibe Coding to Verified Engineering in Software and Hardware Development. arXiv:2605.20456. https://arxiv.org/abs/2605.20456
- Jai Lal Lulla et al.: On the Impact of AGENTS.md Files on the Efficiency of AI Coding Agents. arXiv:2601.20404. https://arxiv.org/abs/2601.20404
- Giovanni Pinna et al.: Comparing AI Coding Agents: A Task-Stratified Analysis of Pull Request Acceptance. arXiv:2602.08915. https://arxiv.org/abs/2602.08915
- Deepak Babu Piskala: Spec-Driven Development: From Code to Contract in the Age of AI Coding Assistants. arXiv:2602.00180. https://arxiv.org/abs/2602.00180
- Romain Robbes et al.: Agentic Much? Adoption of Coding Agents on GitHub. arXiv:2601.18341. https://arxiv.org/abs/2601.18341
- Giovanni Rosa et al.: Understanding Specification-Driven Code Generation with LLMs: An Empirical Study Design. arXiv:2601.03878. https://arxiv.org/abs/2601.03878
- Yuheng Tang et al.: DevOps-Gym: Benchmarking AI Agents in Software DevOps Cycle. arXiv:2601.20882. https://arxiv.org/abs/2601.20882
- Qiang Xu et al.: From Craft to Constitution: A Governance-First Paradigm for Principled Agent Engineering. arXiv:2510.13857. https://arxiv.org/abs/2510.13857