
Large Language Models and Scrum: How AI Changes Software Development

By XNM Technologies · April 24, 2023 · 5 min read

The pace at which large language models have entered the software development workflow has surprised even those who expected the technology to be significant. In the space of roughly two years, AI code generation tools went from novelty to near-ubiquity in many development teams. Copilot, ChatGPT, Claude, and a growing list of specialised coding assistants can now generate working code from natural language descriptions, explain existing code, write unit tests, identify bugs, and produce documentation — tasks that previously required significant developer time. For Scrum teams, this creates both an opportunity and a challenge: the opportunity to deliver more value per sprint, and the challenge of integrating a powerful new capability into a methodology built around human collaboration and continuous improvement.

What LLMs can do for a development team today

  1. Code generation. LLMs can generate boilerplate code, implement algorithms from descriptions, scaffold components, and suggest completions in real time. For routine tasks — CRUD endpoints, data transformation functions, standard design patterns — this can reduce implementation time substantially. The generated code requires review and is not always correct, but it shifts the developer's role from writing to evaluating and refining, which is often faster.

  2. Test generation. Writing unit tests is important and time-consuming. LLMs can generate test cases from function signatures and descriptions, including edge cases that a developer under time pressure might overlook. The tests should still be reviewed — LLM-generated tests can be superficially correct but miss the actual behaviour being tested — but they provide a useful starting point.

  3. Documentation. Code documentation is consistently under-prioritised in sprint planning and consistently under-delivered by the end of the sprint. LLMs can generate inline comments, docstrings, and README sections from code, making it practical to maintain documentation in a state that actually helps future maintainers.

  4. Code review assistance. LLMs can flag issues that a human reviewer might miss under time pressure: unused variables, possible null pointer dereferences, common security vulnerabilities, and inconsistencies with the surrounding code style. They are not a replacement for human code review but serve as a useful first pass.

  5. Debugging. Giving an LLM an error message together with the surrounding code and asking it for likely causes is increasingly productive. LLMs have seen enough code and enough error messages to recognise common failure patterns, which can accelerate the diagnosis step of the debugging process.
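The caveat in item 2, that a generated test can look correct while missing the behaviour under test, can be made concrete with a small Python sketch. Everything here is hypothetical and chosen only for illustration: `apply_discount`, the deliberately wrong `buggy_discount`, and both test functions are assumptions, not code from any real project.

```python
from typing import Callable

DiscountFn = Callable[[float, float], float]


def apply_discount(price: float, percent: float) -> float:
    """Return the price reduced by `percent`, which must lie in [0, 100]."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def buggy_discount(price: float, percent: float) -> float:
    """Deliberately wrong: returns the discount amount, not the new price."""
    return round(price * percent / 100, 2)


def superficial_test(fn: DiscountFn) -> None:
    # Looks like a test, but only checks the return type.
    # It passes for the buggy implementation as well.
    assert isinstance(fn(100.0, 10.0), float)


def behavioural_test(fn: DiscountFn) -> None:
    # Pins down the expected values, including both boundaries.
    assert fn(100.0, 10.0) == 90.0
    assert fn(100.0, 0.0) == 100.0
    assert fn(100.0, 100.0) == 0.0
    # Out-of-range input must be rejected, not silently accepted.
    try:
        fn(100.0, 150.0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for percent > 100")
```

Both tests pass for `apply_discount`, but only `behavioural_test` fails for `buggy_discount`. A reviewer who checks that a generated test would actually fail against a wrong implementation catches the superficial kind quickly.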

How Scrum teams should integrate these tools

The most important principle is that LLM-generated code is no different from any other code when it comes to quality gates. It must go through the same code review process, pass the same automated tests, and meet the same Definition of Done criteria as human-written code. A team that reviews its own code carefully but rubber-stamps LLM output is not maintaining the quality standards it thinks it is maintaining. If anything, LLM-generated code benefits from more scrutiny, not less, because its typical failure modes differ from those of human-written code: confident-sounding but subtly incorrect code, code that passes tests but has security vulnerabilities, and code that works for the happy path but fails on edge cases.

  • Update the Definition of Done to explicitly include review checks for LLM-assisted code where appropriate; for example, a specific security review step for any code that handles authentication or data access

  • Add prompt engineering to the team's shared skill set: the ability to write clear, well-scoped prompts that produce useful output is learnable and worth investing in

  • Use retrospectives to track whether LLM tool adoption is improving throughput and quality, or merely creating a different class of defects that appear later in the cycle

  • Redirect time savings consciously: if LLMs save each developer two hours per sprint, use that time for architecture work, user research, or technical debt reduction — not for cramming more low-quality features into the same timebox
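One way a team might encode the Definition of Done point above is as an automated check on each change. The following is a minimal Python sketch under stated assumptions: the `Change` record, the `SENSITIVE_DIRS` set, and the function names are all hypothetical, and a real team would substitute its own directory layout and review tooling. The gates apply to every change regardless of whether a human or an LLM wrote it.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical directory names that hold authentication or data-access code.
SENSITIVE_DIRS = {"auth", "db"}


@dataclass
class Change:
    """A proposed change and the quality gates it has passed so far."""
    changed_paths: list[str]
    human_reviewed: bool = False
    tests_passed: bool = False
    security_reviewed: bool = False


def touches_sensitive_code(change: Change) -> bool:
    # Compare whole path components so that "feedback/" does not match "db".
    return any(
        part in SENSITIVE_DIRS
        for path in change.changed_paths
        for part in PurePosixPath(path).parts
    )


def unmet_done_criteria(change: Change) -> list[str]:
    """Return the Definition of Done checks this change has not yet met."""
    unmet = []
    if not change.human_reviewed:
        unmet.append("human code review")
    if not change.tests_passed:
        unmet.append("automated tests")
    if touches_sensitive_code(change) and not change.security_reviewed:
        unmet.append("security review")
    return unmet
```

For example, a reviewed and tested change to `src/auth/session.py` would still report `["security review"]` until that step is recorded, while the same change to `docs/readme.md` would report nothing outstanding.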

The risks to manage

  1. Hallucinated code. LLMs generate plausible-looking code that is wrong. This is the most significant risk in production use. A junior developer who accepts generated code without understanding it — because the code looks correct and the deadline is near — may introduce bugs that are subtle and hard to trace. The defence is code review combined with a team culture that expects understanding, not just acceptance, of all code that enters the codebase.

  2. Security vulnerabilities in generated code. LLMs trained on public code repositories have seen a great deal of insecure code, and they can reproduce insecure patterns — SQL injection vulnerabilities, weak cryptographic choices, improper input validation — in generated output. Security-focused code review, static analysis tools, and automated security testing are essential complements to LLM-assisted development.

  3. Over-reliance eroding team capability. The long-term risk that receives the least attention is the erosion of fundamental skills in a team that never writes code from scratch. A team that cannot understand or debug code it did not write — or that has lost the habit of thinking carefully about algorithm choice because the LLM always suggests something — is fragile in ways that will not become apparent until the LLM produces something that requires deep understanding to fix. Maintaining deliberate practice in core development skills is not nostalgia; it is risk management.
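The SQL injection pattern named in the second risk is worth seeing side by side with its fix. This is a minimal sketch using Python's standard `sqlite3` module; the `users` table, the helper names, and the sample data are hypothetical and exist only for illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Insecure: user input is pasted into the SQL text, so a crafted
    # value can change the meaning of the query.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Safer: the value is passed as a bound parameter and is never
    # interpreted as SQL.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()
```

With the input `x' OR '1'='1`, the unsafe version returns every row in the table, while the parameterised version returns no rows because no user has that literal name. Both functions look equally plausible at a glance, which is why static analysis and a security-focused review step matter for generated code.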

How XNM Consulting supports Scrum teams navigating AI adoption

Integrating AI tools into a Scrum team's workflow is an organisational change problem as much as a technical one. It requires updating working agreements, calibrating the Definition of Done, training the team on effective prompt engineering, and building the review habits that keep quality high as the source of code shifts. XNM Consulting's programme and project delivery practice supports technology teams through exactly these kinds of practice changes — helping teams adopt new capabilities without sacrificing the quality discipline that makes Agile delivery reliable.

To learn how XNM Consulting can support your team's adoption of AI tools in an Agile context, visit our programme and project delivery services page.