Technical debt is growing 8x faster. Code reviews are taking 50% longer. Your best engineers are becoming validators instead of builders.
At least, that’s what the data suggests might be coming.
We’re early in our AI rollout, starting to layer Cursor, ChatGPT, Claude, and some internal n8n agents into our workflow. And already, we’re asking tough questions: Are these tools truly making us faster, or just relocating the effort to different, harder-to-spot parts of the pipeline?
This isn’t a hit piece on AI tooling. We’re excited about it. But it’s easy to confuse movement with progress. And in talking with peers and digging into the research, I’m starting to think the AI productivity story is more complicated than it looks at first glance.
AI isn’t removing work. It’s relocating it.
And if you’re leading a team, understanding that shift early might be the difference between scaling well and spending next year paying down invisible debt.
The Hydraulic Truth of Engineering Work
Imagine your development lifecycle as a hydraulic system. Push down on one end, make coding faster, and the pressure doesn’t vanish. It just moves. Somewhere else in the pipeline, things get tighter. Code reviews start taking longer. QA starts catching issues that weren’t there before. And your senior engineers get pulled into validation mode instead of building systems.
This is a pattern I’ve heard again and again from other teams ahead of us in adoption. A two-day feature estimate turns into four because the AI-generated logic looks fine… until it hits a subtle integration edge case. Then someone has to trace it back, figure out what the AI was trying to do, and rewrite it from scratch.
It doesn’t always happen. But when it does, it can quietly derail a sprint.
What We’re Watching For
As we start our rollout, we’ve been thinking about which signals would tell us whether the tools are helping or quietly causing friction. None of these have shown up for us yet, but they’re high on our radar:
CTOs might start asking: Is this actually saving us money, or just shifting it downstream?
VPs of Engineering may notice: Velocity looks great, but is delivery becoming more chaotic?
Engineering Managers could feel it first: Review times creeping up. Devs asking for help interpreting AI code. Morale dipping as creativity gives way to validation.
These patterns may or may not show up for us, but if they do, we want to be ready to respond quickly.
What the Data Confirms
If you're wondering whether these concerns are just anecdotal, industry research offers some sobering context. (Links included for deeper exploration.)
GitClear’s study of 211 million lines of code found:
- Technical debt is growing 8x faster with AI-assisted commits
- Code duplication is 8x more likely
- Code churn is expected to double, with more code discarded within two weeks of being written
Google’s 2024 DORA report showed that every 25% increase in AI usage delivered a 2.1% productivity gain, but came with a 7.2% drop in delivery stability.
Uplevel’s study found Copilot users had 41% higher bug rates, and Stanford research suggests nearly 40% of AI-generated suggestions contain potential security vulnerabilities.
These aren’t just teething issues. They’re signs of a shift in where the effort lands.
Where We Expect the Work to Move
Code Review: Multiple teams have reported longer PR review cycles, especially when senior developers are reviewing opaque or contextless AI-generated logic.
QA: Test suites that once gave us confidence are starting to miss subtle regressions. Bugs tied to misunderstood prompts or incorrect assumptions surface later in the lifecycle.
Senior Dev Time: Principal engineers are becoming backstops: explaining, rewriting, or validating AI-written logic that junior devs can’t fully defend.
We’re watching these signals closely. We haven’t seen all of them yet. But we’re preparing for them.
When It All Shows Up
We’re not seeing these issues firsthand, and we’re not claiming to know how our rollout will play out. But based on industry research and public commentary, there’s a rough model out there of what teams might encounter as adoption matures:
- Months 1-2: Velocity increases. Teams feel unblocked. Excitement grows.
- Months 3-4: Review friction grows. Integration bugs become harder to track. QA surfaces edge cases that weren’t anticipated.
- Months 5-6: Debt accumulates. Delivery consistency may suffer. Senior devs spend more time in validation than design.
We’re not treating this as a forecast, but it’s a scenario we’re planning against. If we get ahead of it, maybe we avoid the worst of it.
A Concrete Example
One of the first signs that made us pause was when Cursor generated not one, but seven separate versions of an article card component for our hub pages, one for each hub. Instead of creating a reusable component, it cloned and slightly varied the logic for each page. After we realized the duplication, we tried nudging Cursor to consolidate the cards, but it responded by wrapping all seven in a parent component instead of refactoring them properly.
One of our developers ended up rewriting the whole thing manually.
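For reference, the manual rewrite converged on something like the sketch below. This is a simplified illustration rather than our production code; the prop names and hub values are placeholders.

```tsx
// One reusable article card instead of seven near-identical copies.
// Prop names and the Hub union are illustrative placeholders, not our real code.
import React from "react";

type Hub = "engineering" | "design" | "product"; // one entry per hub page

interface ArticleCardProps {
  title: string;
  excerpt: string;
  href: string;
  hub: Hub; // the only thing that actually varied across the seven copies
  publishedAt?: string;
}

export function ArticleCard({ title, excerpt, href, hub, publishedAt }: ArticleCardProps) {
  return (
    <a className={`article-card article-card--${hub}`} href={href}>
      <h3>{title}</h3>
      <p>{excerpt}</p>
      {publishedAt && <time dateTime={publishedAt}>{publishedAt}</time>}
    </a>
  );
}
```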
It wasn’t a catastrophic bug, but it was telling. The AI was confidently wrong, and the cost wasn’t clear until a human took the time to unwind it.
It’s these invisible inefficiencies, multiplied across dozens of features, that we’re trying to head off early.
What We’re Doing Now
Planning for the Shift
We’re updating our estimates. Even if coding is faster, we’re padding review estimates by 50%, budgeting 25% more QA time, and blocking out senior dev time for validation. That overhead is real. We’d rather plan for it than be surprised by it.
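To make the math concrete, here’s a toy before-and-after estimate. The baseline numbers, and the assumption that AI halves hands-on coding time, are invented for illustration, not pulled from our tracker.

```ts
// Hypothetical estimate for a mid-sized feature, in person-days.
// Multipliers mirror the buffers above; every number here is illustrative.
interface Estimate {
  coding: number;
  review: number;
  qa: number;
  seniorValidation: number;
}

const before: Estimate = { coding: 4, review: 2, qa: 2, seniorValidation: 0 };

const withAI: Estimate = {
  coding: before.coding * 0.5, // assume AI halves hands-on coding time
  review: before.review * 1.5, // +50% review buffer
  qa: before.qa * 1.25, // +25% QA buffer
  seniorValidation: 1, // explicit block of senior time for validation
};

const total = (e: Estimate) =>
  e.coding + e.review + e.qa + e.seniorValidation;

console.log(total(before)); // 8 days
console.log(total(withAI)); // 2 + 3 + 2.5 + 1 = 8.5 days: faster coding, not faster delivery
```

Even with generous assumptions about coding speed, the end-to-end number barely moves. That’s the relocation effect in miniature.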
Gating the AI
We’re experimenting with Cursor’s prompt visibility and diff tools to get a clearer picture of where the AI is being overly assertive. We’re also exploring custom workflows using n8n to log and flag any PRs where generated code exceeds a certain threshold. Anything suspicious triggers an automatic tag for dual review.
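Here’s the rough shape of the check we’re sketching, something a CI step or a small service called from an n8n workflow could run. The “AI-assisted” signal (a commit trailer our devs would add) and the 50% threshold are our own conventions, not anything built into Cursor or GitHub.

```ts
// Flags a PR for dual review when AI-assisted additions cross a threshold.
// "AI-assisted" relies on a hypothetical convention: devs tag commits with an
// "AI-Assisted: yes" trailer, and we compare those additions to total additions.
const THRESHOLD = 0.5; // flag PRs where >50% of added lines came from AI-tagged commits

interface PrStats {
  totalAdditions: number;
  aiAssistedAdditions: number; // gathered upstream from the tagged commits
}

export async function flagForDualReview(
  repo: string, // "owner/name"
  prNumber: number,
  stats: PrStats,
  githubToken: string
): Promise<boolean> {
  const ratio =
    stats.totalAdditions === 0 ? 0 : stats.aiAssistedAdditions / stats.totalAdditions;

  if (ratio <= THRESHOLD) return false;

  // PRs count as issues for labeling purposes, so the issues/labels endpoint applies.
  await fetch(`https://api.github.com/repos/${repo}/issues/${prNumber}/labels`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${githubToken}`,
      Accept: "application/vnd.github+json",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ labels: ["ai-heavy", "needs-dual-review"] }),
  });

  return true;
}
```

We’d treat the label as a prompt for humans, not a hard gate.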
Training for Oversight
We’re building a checklist of common AI mistakes, and training devs not just how to use the tools, but how to catch what they miss. Prompting is easy. Auditing is the hard part.
Tracking New Metrics
We’ve started tracking metrics like:
- % of AI-assisted PRs per sprint
- Average review time delta for AI-heavy vs human-written PRs
- Bug rework rate segmented by origin
- Senior dev time spent validating
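As an example of how lightweight this can be, the review-time delta is just a grouped average. The PullRequest shape and the aiHeavy flag below come from our own tracking conventions, not from any vendor API.

```ts
// Average review time (hours from opened to approved) for AI-heavy vs. human-written PRs.
interface PullRequest {
  openedAt: Date;
  approvedAt: Date;
  aiHeavy: boolean; // e.g. >50% of additions from AI-tagged commits
}

const hours = (pr: PullRequest) =>
  (pr.approvedAt.getTime() - pr.openedAt.getTime()) / 36e5;

const average = (xs: number[]) =>
  xs.length === 0 ? 0 : xs.reduce((a, b) => a + b, 0) / xs.length;

export function reviewTimeDelta(prs: PullRequest[]): number {
  const aiHeavy = prs.filter((pr) => pr.aiHeavy).map(hours);
  const human = prs.filter((pr) => !pr.aiHeavy).map(hours);
  return average(aiHeavy) - average(human); // positive means AI-heavy PRs take longer
}
```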
Building Governance Early
We’re defining where AI can’t be used (e.g. auth logic, data privacy flows). We haven’t formalized prompt review yet, but we’re considering ways to surface AI-related issues during retros, especially cases where misunderstood logic caused extra rework. We’re assigning a governance lead who rotates quarterly to keep oversight embedded without adding red tape.
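One way the no-go zones could be enforced is a small CI check along these lines. This is a sketch; the restricted paths and the AI-tagging convention are placeholders for whatever we formalize.

```ts
// Fails a build when AI-tagged changes touch paths we've declared off-limits for AI.
// The path list and the "AI-Assisted" commit trailer are placeholder conventions.
const RESTRICTED_PATHS = ["src/auth/", "src/privacy/"]; // one entry per no-AI zone

interface ChangedFile {
  path: string;
  fromAiTaggedCommit: boolean; // derived upstream from the commit trailer
}

export function checkAiGovernance(files: ChangedFile[]): string[] {
  return files
    .filter((f) => f.fromAiTaggedCommit)
    .filter((f) => RESTRICTED_PATHS.some((p) => f.path.startsWith(p)))
    .map((f) => `AI-assisted change touches restricted path: ${f.path}`);
}

// In CI:
// const violations = checkAiGovernance(changedFiles);
// if (violations.length > 0) { console.error(violations.join("\n")); process.exit(1); }
```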
The AI Productivity Reality Check
If you're just getting started, ask your team these:
- How many of our recent PRs had >50% AI content?
- Are review times changing? Why?
- What do our senior devs say about their week: more building or more debugging?
- Have we seen bugs that felt like “phantom errors”, surfacing far from the code that introduced them?
These are the questions we’re starting with.
The Bottom Line
We’re not skeptical of AI. We’re excited, and we’re cautious.
The tools are impressive. But the long-term wins won’t go to the teams who adopt the fastest. They’ll go to the ones who learn the fastest. Who notice when things feel off. Who stay reflective.
If you’re starting your own rollout, ask the questions no one’s measuring yet. Revisit the parts of your process you assume are fine. And track what’s easy to overlook.
We’re doing the same. And we’re still learning too.
Looking Ahead: Beyond Defensive AI
Everything above focuses on managing the transition. But that’s just the starting point.
We’ve started asking: what if we’re thinking about this backwards?
Instead of asking, "How do we fit AI into our development process?" maybe we should ask: "What would development look like if AI were a first-class team member from the start?"
Here’s what we’re beginning to imagine:
Rethinking the Development Loop
Real-time architecture pairing
AI acts as a design partner, understanding your full codebase, simulating trade-offs, and suggesting non-obvious patterns as you plan.
Integrated quality synthesis
Instead of QA catching issues later, AI flags integration risks, simulates edge cases, and generates tests while you build.
Accelerated developer growth
AI helps junior devs reason through architecture, performance, and security, not just syntax, speeding up learning by orders of magnitude.
Exploratory iteration
Rapid prototyping becomes the default. Instead of planning in static documents, teams test ideas in code and let better ones win.
Rethinking the Team Structure
From gatekeeping to guidance
Senior devs evolve into prompt architects and design mentors. Their job is to shape AI behavior and embed system-level wisdom.
From reviews to working sessions
PRs shift from quality gates to collaborative design spaces, where AI, humans, and systems explore trade-offs together.
From rigid process to shared principles
Instead of enforcing steps, teams define architectural philosophies that guide both people and AI. Principles replace checklists.
What We’re Watching For Next
This isn’t where we are yet. But we’re keeping an eye out for moments like:
- AI surfacing ideas we hadn’t considered
- Exploratory builds turning into stronger architecture
- Junior devs shipping work that feels senior
- Feedback loops that shrink dramatically
The Real Question
This first wave of AI adoption is just the warm-up.
The real shift comes when we stop trying to retrofit AI into legacy workflows, and instead design workflows that make the most of what AI does well.
Are we building guardrails for today’s risks, or scaffolding for tomorrow’s potential?
Ideally, both. Because the teams that navigate the transition thoughtfully will be best positioned for what comes next.
