AI-Assisted Development: How We Cut Delivery Times and What We Won't Automate

Ekfix Team

AI coding tools have materially changed delivery speed for specific categories of work. They have also created a new class of technical debt: confident, plausible, wrong code that looks correct until it fails under production conditions.

The question we get from clients now is not "are you using AI tools?" — it is "how much faster does it make you?" The expectation is that AI coding tools compress development timelines, and clients want to know whether their engagement pricing reflects that compression.

The honest answer is more nuanced than a percentage. AI coding tools produce substantial leverage on specific categories of work and minimal leverage — sometimes negative leverage — on others. Understanding which is which is the difference between teams that use AI well and teams building technical debt quickly.


What We Use

Our current production AI tool stack:

Cursor (AI-native IDE): The primary coding environment. Cursor has deep codebase awareness — it indexes the full repository and provides context-aware completions and edits that understand the patterns, naming conventions, and architectural decisions in the specific project. This is the primary workhorse for day-to-day development.

Claude (Anthropic): Used for extended-context tasks — refactoring large files, designing system architecture, reviewing complex logic, debugging multi-component issues that require understanding a large amount of code simultaneously. Claude's context window and reasoning quality make it better than Cursor's embedded model for tasks requiring sustained analytical attention.

GitHub Copilot: Secondary to Cursor but used in VS Code for specific workflows where Copilot's integration with GitHub-specific context (pull request descriptions, issue references) is useful.

ChatGPT / GPT-4o: Used for research, draft writing (documentation, spec documents, client communications), and as a second opinion on architectural decisions where we want to check Claude's reasoning against a different model.


Where AI Provides Genuine Leverage

1. Boilerplate and scaffolding generation

Software projects contain significant volumes of repetitive structural code: CRUD endpoint definitions, database schema migrations, type definitions, test file scaffolding, configuration files, environment setup scripts. This code is not complex, but it takes time and is error-prone when written by hand repeatedly.

AI tools generate boilerplate accurately and at near-zero time cost. A REST API with five resources — each needing a controller, service layer, data access object, input validation schema, and unit test skeleton — takes a competent engineer two to three days to scaffold correctly. With AI assistance, the same scaffolding takes three to four hours, and the remaining time goes into business logic.

Measured impact: Approximately 40–60% reduction in time spent on new-project scaffolding and feature scaffolding within established projects.
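To make "boilerplate" concrete, here is a minimal sketch of the structural code a single resource typically needs: a record type, input validation, and a data access layer. The `Customer` resource, its fields, and the in-memory store are hypothetical and chosen only for illustration; a real project would put a database and a web framework behind the same shape.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional


@dataclass
class Customer:
    """Hypothetical resource; the fields are illustrative only."""
    id: int
    name: str
    email: str


def validate_customer_input(payload: dict) -> dict:
    """Input validation: reject missing or malformed fields."""
    name = str(payload.get("name", "")).strip()
    email = str(payload.get("email", "")).strip()
    if not name:
        raise ValueError("name is required")
    if "@" not in email:
        raise ValueError("email must contain '@'")
    return {"name": name, "email": email}


class CustomerRepository:
    """Data access layer; an in-memory dict stands in for a database."""

    def __init__(self) -> None:
        self._rows: Dict[int, Customer] = {}
        self._ids = count(1)

    def create(self, payload: dict) -> Customer:
        fields = validate_customer_input(payload)
        customer = Customer(id=next(self._ids), **fields)
        self._rows[customer.id] = customer
        return customer

    def get(self, customer_id: int) -> Optional[Customer]:
        return self._rows.get(customer_id)

    def update(self, customer_id: int, payload: dict) -> Customer:
        if customer_id not in self._rows:
            raise KeyError(customer_id)
        fields = validate_customer_input(payload)
        updated = Customer(id=customer_id, **fields)
        self._rows[customer_id] = updated
        return updated

    def delete(self, customer_id: int) -> bool:
        return self._rows.pop(customer_id, None) is not None
```

None of this is difficult, but multiplied across five resources it is the kind of volume where generation saves time and review stays cheap.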

2. Test generation

Writing test cases — particularly the boundary cases and error conditions that matter for production reliability — is work that developers frequently deprioritise because it is time-consuming and intellectually repetitive. AI tools generate test cases from an existing implementation far faster than manual test authoring.

A function with ten distinct execution paths (happy path plus error conditions, edge cases, boundary values) that would take two hours to test manually takes twenty minutes with AI-assisted test generation. The AI identifies test cases from reading the implementation; the developer reviews and augments.

Caveat: AI-generated tests verify the implementation as written, not necessarily the implementation as intended. A bug in the logic will produce tests that pass while the bug persists. Human review of AI-generated tests must include checking that the test is testing the right behaviour, not just that the test passes.
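A small sketch of this caveat, under an assumed business rule: orders of 50 or more ship free. The function, the threshold, and the fee are all hypothetical. The implementation contains a deliberate off-by-one; a check derived from reading the code passes and locks the bug in, while a check derived from the rule exposes it.

```python
FREE_SHIPPING_THRESHOLD = 50  # hypothetical rule: orders of 50 or more ship free
STANDARD_FEE = 5              # hypothetical flat fee


def shipping_fee(order_total: int) -> int:
    """Buggy on purpose: uses '>' where the rule says 'or more' ('>=')."""
    if order_total > FREE_SHIPPING_THRESHOLD:
        return 0
    return STANDARD_FEE


def check_from_implementation() -> bool:
    """What a generator reading only the code tends to assert: current behaviour."""
    return shipping_fee(50) == STANDARD_FEE  # holds, so the bug is now "tested"


def check_from_specification() -> bool:
    """What a reviewer who knows the rule asserts: intended behaviour."""
    return shipping_fee(50) == 0  # does not hold, exposing the off-by-one


if __name__ == "__main__":
    print("implementation-derived check holds:", check_from_implementation())
    print("specification-derived check holds:", check_from_specification())
```

The review question for each generated test is therefore "where did the expected value come from?" If the answer is "from the code under test", the test documents behaviour but does not validate it.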

3. Code explanation and comprehension

Codebases accumulate complexity. Working out what an unfamiliar six-hundred-line file does, what a complex SQL query returns, or why a subtle bug occurs used to cost hours of reading, and AI tools have cut that time sharply. A model can explain an unfamiliar function in seconds; tracing its mutation side effects and its interactions with other components takes a short conversation rather than hours of manual tracing.

For onboarding new engineers to existing codebases and for diagnosing production incidents involving unfamiliar code paths, this is among the highest-value capabilities AI tools provide.

4. Repetitive transformations

Data transformations, format conversions, migration script generation, cross-referencing API documentation to generate client types — work that is mechanical but requires accuracy. AI handles this well and eliminates the transcription errors that occur when humans do it manually.
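As an illustration of this category, the sketch below renames the keys of a snake_case API payload to camelCase, recursing through nested objects and lists. The function names and the sample payload shape are assumptions made for the example, not part of any specific project.

```python
import re
from typing import Any


def snake_to_camel(name: str) -> str:
    """Convert 'created_at' to 'createdAt'; names without underscores pass through."""
    return re.sub(r"_([a-z0-9])", lambda match: match.group(1).upper(), name)


def camelize_keys(value: Any) -> Any:
    """Recursively rename dict keys, leaving values and list order untouched."""
    if isinstance(value, dict):
        return {snake_to_camel(key): camelize_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [camelize_keys(item) for item in value]
    return value


if __name__ == "__main__":
    payload = {"user_id": 7, "line_items": [{"unit_price": 3, "sku_code": "A1"}]}
    print(camelize_keys(payload))
```

The work is mechanical, which is why it suits generation, but the reviewer still owns the edge cases: keys with leading underscores or embedded acronyms, for instance, need an explicit decision rather than whatever the regular expression happens to do.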

5. Documentation generation

JSDoc, TypeDoc, README files, API documentation from OpenAPI specs — AI generates these at high quality from existing code. Documentation is perpetually underprioritised; AI removes most of the writing burden, leaving only review.


Where AI Does Not Help (And Sometimes Hurts)

1. Novel architecture decisions

When the design question is "how should this system be structured to meet requirements X, Y, and Z while remaining evolvable as the business grows," AI tools provide suggestions but not answers. The suggestions are drawn from training data patterns; they may not fit the specific constraints of the business domain, the team's capabilities, the existing codebase structure, or the long-term growth trajectory.

Architecture decisions require conversations with engineers who understand the business context, the technology constraints, and the tradeoffs. AI can inform those conversations but does not substitute for the judgement that makes a system good rather than just functional.

2. Security-critical components

AI models produce plausible security code that sometimes contains subtle vulnerabilities. Authentication logic, authorisation checks, cryptographic operations, and input validation deserve adversarial review that assumes the AI-generated code is wrong until proven otherwise by a security-aware engineer. The models are not adversarially minded; they produce code that looks correct rather than code hardened against attack.

This is not a reason to avoid AI assistance on security code entirely — but it is a reason for mandatory security review by an engineer who understands attack patterns, not just code review by someone checking syntax.
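One example of the "looks correct" problem, sketched with Python's standard library. The scenario (verifying an HMAC signature on a request body) and the function names are assumptions for illustration. Both verifiers return the same answers in functional tests; only a reviewer thinking about attacks will notice that the first compares secrets with `==`, which may leak timing information.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_plausible(secret: bytes, body: bytes, signature: str) -> bool:
    """Passes every functional test, but '==' can stop at the first differing
    character, so response time may reveal how much of a guess was right."""
    return sign(secret, body) == signature


def verify_hardened(secret: bytes, body: bytes, signature: str) -> bool:
    """Same result, using the standard library's constant-time comparison."""
    expected = sign(secret, body).encode("utf-8")
    return hmac.compare_digest(expected, signature.encode("utf-8"))
```

A test suite cannot tell these two functions apart, which is why the review has to be done by someone who knows what to look for.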

3. Business logic that encodes domain knowledge

Pricing engines, risk models, compliance rule interpreters, and workflow orchestrators encode business-domain knowledge that the AI model does not have. The code structure may be correct; the business logic embedded in the code requires knowledge of the domain to verify. AI assistance is useful for the structural scaffolding of these components; the business rules themselves must be specified, reviewed, and validated by people who understand the domain.

4. Debugging complex multi-system failures

Production incidents involving interactions between multiple systems (for example, a payment gateway webhook, a background job, a database transaction, and a downstream notification service all caught in one race condition) require systematic observation, hypothesis testing, and reasoning about state. AI models can suggest lines of investigation, but the investigation itself requires an engineer reading real logs, querying real databases, and testing real hypotheses against the production environment.

5. Code that must be understood and maintained

This is the most important limit. AI-generated code can be correct and unreadable. A ten-line function that an AI produces in seconds may be difficult for a human to reason about, debug six months later, or modify safely without introducing regressions. Code that will be modified frequently belongs in the minds of the engineers who maintain it, not only in a file generated by an AI.

The practice we enforce: every AI-generated code block must be read and understood by the engineer accepting it. Not just confirmed to pass tests — understood. An engineer who accepts AI output they do not understand is an engineer who cannot debug it when it fails.
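A contrived sketch of "correct and unreadable". Both functions below are hypothetical and return the same result: the top customers by total order value. The first is the kind of dense output that passes tests; the second is what an engineer who has understood it might commit instead.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Order = Tuple[str, int]  # (customer, amount); a hypothetical shape


def top_dense(o: List[Order], n: int) -> List[Order]:
    """Correct, but hard to review, and it rescans the whole list per order."""
    return sorted({c: sum(a for k, a in o if k == c) for c, _ in o}.items(),
                  key=lambda x: (-x[1], x[0]))[:n]


def top_customers(orders: List[Order], limit: int) -> List[Order]:
    """Same behaviour, written so each step can be checked on its own."""
    totals: Dict[str, int] = defaultdict(int)
    for customer, amount in orders:
        totals[customer] += amount
    # Highest total first; ties broken alphabetically by customer name.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]
```

Reading the dense version closely enough to rewrite it is also what surfaces its quadratic rescan, a cost that tests on small inputs would never reveal.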


The Net Delivery Time Impact

Based on two years of using AI coding tools in client project delivery, our measured impact by task category:

| Task Category | AI Leverage |
|---|---|
| Project scaffolding and setup | 50–65% time reduction |
| Standard CRUD features | 35–50% time reduction |
| Test coverage | 40–60% time reduction |
| Documentation | 60–75% time reduction |
| Code review and refactoring | 20–35% time reduction |
| Architecture design | 0–15% time reduction |
| Security-critical implementation | 0% (add review overhead) |
| Novel business logic | 0–20% time reduction |
| Debugging production incidents | 10–20% time reduction |

Across a typical project, we estimate AI tooling reduces calendar delivery time by 25–35%. The reduction is not evenly distributed: projects dominated by scaffolding and standard features see the high end; projects dominated by novel algorithms, complex business logic, and security-critical components see the low end.
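The dependence on project mix can be shown with a short calculation. The sketch below takes the ranges from the table in this section, uses the midpoint of each, and weights them by hours per category. The two project mixes are invented for illustration and are not measured data.

```python
# (low, high) time reduction per task category, from the table in this section.
REDUCTION_RANGE = {
    "scaffolding": (0.50, 0.65),
    "crud": (0.35, 0.50),
    "tests": (0.40, 0.60),
    "docs": (0.60, 0.75),
    "review_refactor": (0.20, 0.35),
    "architecture": (0.00, 0.15),
    "security": (0.00, 0.00),
    "business_logic": (0.00, 0.20),
    "debugging": (0.10, 0.20),
}

# Hypothetical effort mixes, in hours per 100 hours of unassisted work.
STANDARD_HEAVY = {
    "scaffolding": 15, "crud": 30, "tests": 15, "docs": 5, "review_refactor": 10,
    "architecture": 5, "security": 5, "business_logic": 10, "debugging": 5,
}
NOVEL_HEAVY = {
    "scaffolding": 5, "crud": 10, "tests": 15, "docs": 5, "review_refactor": 10,
    "architecture": 15, "security": 15, "business_logic": 20, "debugging": 5,
}


def blended_reduction(hours_by_category: dict) -> float:
    """Effort-weighted average reduction, using the midpoint of each range."""
    total_hours = sum(hours_by_category.values())
    saved_hours = sum(
        hours * sum(REDUCTION_RANGE[category]) / 2
        for category, hours in hours_by_category.items()
    )
    return saved_hours / total_hours


if __name__ == "__main__":
    for label, mix in (("standard-heavy", STANDARD_HEAVY), ("novel-heavy", NOVEL_HEAVY)):
        print(f"{label}: {blended_reduction(mix):.1%} effort reduction")
```

This weights engineering effort, which is not the same thing as calendar time, so treat it as an illustration of why the mix drives the outcome rather than a reproduction of the 25–35% estimate.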


What Clients Should Know

The delivery time compression from AI tooling does not mean lower-quality output if the team's practice is sound. It means experienced engineers spend more time on the decisions that require experience — architecture, business logic, security review — and less time on mechanical work that was never a good use of their capability.

The risk is the inverse: a team that uses AI to move faster without disciplined review will ship sooner, but will accumulate more subtle defects and security vulnerabilities than a team working manually with proper review.

When evaluating a development partner's AI tool use, the useful question is not "do you use AI?" but "what is your review process for AI-generated code?" A team that can answer that question in specific detail, with reference to their actual workflow, is a team using AI responsibly. A team that cannot is a team whose speed gain becomes someone else's future maintenance burden.

