Measuring engineering productivity is one of the most consequential decisions a technology leader makes. Done well, metrics drive improvement, align teams, and build trust with the business. Done poorly, metrics destroy morale, incentivise gaming, and drive away your best engineers.
The challenge: engineering productivity is multidimensional, context-dependent, and resistant to simple measurement. Here's how to navigate it.
DORA Metrics: The Foundation
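The DORA (DevOps Research and Assessment) metrics are the best-validated engineering performance indicators available, backed by years of research across thousands of organisations. Treat the performance bands below as indicative: the exact thresholds shift between annual State of DevOps reports.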
The Four Key Metrics
| Metric | Definition | Elite | High | Medium | Low |
|---|---|---|---|---|---|
| Deployment Frequency | How often you deploy to production | On-demand (multiple/day) | Weekly to monthly | Monthly to 6 monthly | < once per 6 months |
| Lead Time for Changes | Time from commit to production | < 1 hour | 1 day - 1 week | 1 week - 1 month | 1-6 months |
| Change Failure Rate | % of deployments causing failure | 0-5% | 5-10% | 10-15% | 16-30% |
| Mean Time to Recovery | Time to restore service after failure | < 1 hour | < 1 day | 1 day - 1 week | > 1 week |
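To make the bands concrete, here is a minimal sketch that maps a team's typical recovery time onto the Mean Time to Recovery row above. The function name `mttr_band` is hypothetical, and the thresholds are simply copied from this table, so adjust them if you benchmark against a different report year.

```python
from datetime import timedelta


def mttr_band(recovery_time: timedelta) -> str:
    """Classify a recovery time using the MTTR row of the table above.

    Thresholds are taken from this article's table and are indicative only.
    """
    if recovery_time < timedelta(hours=1):
        return "Elite"
    if recovery_time < timedelta(days=1):
        return "High"
    if recovery_time <= timedelta(weeks=1):
        return "Medium"
    return "Low"


if __name__ == "__main__":
    # A team that typically restores service in about five hours.
    print(mttr_band(timedelta(hours=5)))
```

Classify the team's typical value over a period (for example the median across incidents), not a single incident.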
Why DORA Works
- Validated: Correlates with organisational performance (revenue, customer satisfaction), not just engineering metrics
- Balanced: Speed metrics (deployment frequency, lead time) are balanced by quality metrics (change failure rate, MTTR)
- Team-level: Measures team performance, not individual performance (critical distinction)
- Hard to game: Speed and stability metrics counterbalance each other, so improving one at the other's expense shows up quickly (deploying more often without adequate testing tends to raise your change failure rate)
How to Collect DORA Metrics
| Metric | Data Source |
|---|---|
| Deployment Frequency | CI/CD pipeline (count production deployments) |
| Lead Time for Changes | Git + CI/CD (time from first commit to production deployment) |
| Change Failure Rate | CI/CD pipeline + incident management system (deployments that caused an incident / total deployments) |
| Mean Time to Recovery | Incident management system (time from incident detection to resolution) |
Tools: LinearB, Sleuth, Jellyfish, Faros AI, or custom dashboards from CI/CD and incident data.
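If you build a custom dashboard, the calculations themselves are small. The sketch below assumes you have already exported deployment and incident records from your own tooling. The `Deployment` and `Incident` record shapes and the `dora_summary` function are hypothetical names for illustration, not the schema of any tool listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    first_commit_at: datetime      # earliest commit included in the release
    deployed_at: datetime          # when the release reached production
    caused_incident: bool = False  # linked to an incident after release


@dataclass
class Incident:
    detected_at: datetime
    resolved_at: datetime


def dora_summary(deployments: list[Deployment],
                 incidents: list[Incident],
                 period_days: int) -> dict:
    """Compute the four DORA metrics for one team over one period."""
    if not deployments:
        raise ValueError("need at least one deployment in the period")

    lead_times = [(d.deployed_at - d.first_commit_at).total_seconds()
                  for d in deployments]
    failed = sum(1 for d in deployments if d.caused_incident)
    recoveries = [(i.resolved_at - i.detected_at).total_seconds()
                  for i in incidents]

    return {
        "deployments_per_week": len(deployments) / (period_days / 7),
        "median_lead_time": timedelta(seconds=median(lead_times)),
        "change_failure_rate": failed / len(deployments),
        # None when the team had no incidents in the period.
        "mean_time_to_recovery": (
            timedelta(seconds=sum(recoveries) / len(recoveries))
            if recoveries else None
        ),
    }
```

The median is used for lead time here because a single long-running branch can distort a mean. That is a design choice in this sketch, not a requirement.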
The SPACE Framework
SPACE (developed by researchers at GitHub, Microsoft Research, and the University of Victoria) provides a more comprehensive view of developer productivity:
S — Satisfaction and Well-Being
Developer satisfaction predicts retention and sustained performance. Measure through:
- Quarterly developer surveys (NPS, satisfaction questions)
- Exit interview analysis
- Burnout indicators (working hours, on-call burden)
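For the NPS-style survey question, the standard calculation is the percentage of promoters (scores of 9 or 10) minus the percentage of detractors (scores of 0 to 6). A minimal sketch, assuming responses arrive as a list of integers on a 0 to 10 scale; the function name is hypothetical:

```python
def net_promoter_score(scores: list[int]) -> float:
    """Return NPS on a -100..100 scale from 0-10 survey responses."""
    if not scores:
        raise ValueError("need at least one survey response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


if __name__ == "__main__":
    responses = [10, 9, 8, 7, 6, 3, 10, 9, 5, 8]
    # 4 promoters and 3 detractors out of 10 responses.
    print(net_promoter_score(responses))  # 10.0
```

With small teams a single response can swing the score considerably, so read it alongside the free-text answers.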
P — Performance
The outcomes of development work:
- Change failure rate (DORA)
- Code review quality (meaningful feedback, not rubber stamps)
- Customer-reported defects
- Feature adoption rates
A — Activity
Observable development outputs:
- Pull requests merged
- Code reviews completed
- Deployments
- Documentation contributions
Critical warning: Activity metrics are the most dangerous to optimise for. If you reward developers for PR count, you'll get more PRs, each smaller and lower in value. Use activity as context, never as a target.
C — Communication and Collaboration
How effectively teams work together:
- Code review turnaround time
- Cross-team contributions
- Knowledge sharing (docs, presentations, mentoring)
- Meeting effectiveness
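Code review turnaround is the easiest of these to quantify. The sketch below assumes you can export, for each pull request, the time a review was requested and the time of the first review (or `None` if it is still waiting). The function name and input shape are hypothetical.

```python
from datetime import datetime
from statistics import median
from typing import Iterable, Optional, Tuple

ReviewTimes = Tuple[datetime, Optional[datetime]]


def median_review_turnaround_hours(
    pull_requests: Iterable[ReviewTimes],
) -> Optional[float]:
    """Median hours from review request to first review, per team.

    Pull requests that have not been reviewed yet are skipped.
    Returns None when nothing has been reviewed.
    """
    waits = [
        (first_review_at - requested_at).total_seconds() / 3600
        for requested_at, first_review_at in pull_requests
        if first_review_at is not None
    ]
    return median(waits) if waits else None
```

Report this per team and as a trend. Per-reviewer leaderboards bring back the individual-ranking problem.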
E — Efficiency and Flow
The ability to do work without friction:
- CI/CD pipeline speed
- Uninterrupted focus time per day
- Time spent waiting (on reviews, infrastructure, approvals)
- Developer environment setup time
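Uninterrupted focus time can be approximated from calendar data. The sketch below assumes meetings are available as `(start, end)` pairs and counts only gaps of at least `min_block`. The one-hour default and the function name are assumptions for illustration; choose a threshold that matches how your teams actually work.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def focus_time(day_start: datetime,
               day_end: datetime,
               meetings: Iterable[Tuple[datetime, datetime]],
               min_block: timedelta = timedelta(hours=1)) -> timedelta:
    """Sum the meeting-free gaps in a working day that last >= min_block."""
    total = timedelta()
    cursor = day_start  # end of the latest meeting seen so far
    for start, end in sorted(meetings):
        # Clip each meeting to the working day.
        start = min(max(start, day_start), day_end)
        end = min(max(end, day_start), day_end)
        if start - cursor >= min_block:
            total += start - cursor
        cursor = max(cursor, end)
    # Gap between the last meeting and the end of the day.
    if day_end - cursor >= min_block:
        total += day_end - cursor
    return total
```

Aggregate the result across a team, and use it to spot meeting-heavy schedules, not to monitor individuals.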
What NOT to Measure
Lines of Code
A developer who deletes 500 lines of unnecessary code while adding 20 lines of clean implementation has done more valuable work than one who wrote 2,000 lines of boilerplate.
Story Points Per Developer
Story points are a planning tool, not a performance metric. The moment you measure individual story point velocity, engineers will inflate estimates.
Commit Count
Measuring commits incentivises small, meaningless commits rather than thoughtful, atomic changes.
Hours Worked
More hours ≠ more productivity. Measuring hours incentivises presence over outcomes and drives burnout.
Individual Rankings / Stack Rankings
Ranking developers against each other destroys collaboration. Engineering is a team sport — individual ranking creates competition where you need cooperation.
Using Metrics Correctly
For Coaching, Not Punishment
Metrics should inform coaching conversations:
- "Our lead time increased from 2 days to 5 days last quarter. Let's understand why and what we can do about it."
- NOT: "Your team's lead time is the worst in the organisation. Fix it."
For Teams, Not Individuals
DORA metrics measure team performance. Deployments, incidents, and recoveries are produced by the whole team and its pipeline, so attributing them to individuals is statistically invalid and culturally destructive.
For Trends, Not Snapshots
A single measurement says little on its own. Track trends over time:
- Is deployment frequency increasing or decreasing?
- Is change failure rate stable or deteriorating?
- Is developer satisfaction improving or declining?
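One simple way to answer these questions is to fit a straight line through the periodic values and look at its slope. The sketch below uses an ordinary least-squares slope. The function names and the `tolerance` parameter are hypothetical, and a straight-line fit is only one of several reasonable choices (a rolling average is another).

```python
def trend_slope(values: list[float]) -> float:
    """Least-squares slope of values against their index.

    The result is the average change per period (week, month, quarter).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def describe_trend(values: list[float], tolerance: float = 0.0) -> str:
    """Label a series; slopes within +/- tolerance count as stable."""
    slope = trend_slope(values)
    if slope > tolerance:
        return "increasing"
    if slope < -tolerance:
        return "decreasing"
    return "stable"


if __name__ == "__main__":
    # Monthly change failure rate for one team (illustrative numbers).
    print(describe_trend([0.10, 0.12, 0.11, 0.15, 0.18], tolerance=0.005))
```

Set `tolerance` to the amount of period-to-period noise you consider normal, so small fluctuations are not reported as a change.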
With Context
Metrics without context are dangerous. A team with a high change failure rate might be:
- Working on a risky, high-value initiative (acceptable)
- Cutting corners due to deadline pressure (concerning)
- Lacking test automation (fixable)
- Dealing with a poorly designed system (systemic)
The number tells you where to look. The conversation tells you what to do.
Implementation Roadmap
Month 1: DORA Metrics
- Configure deployment tracking in your CI/CD platform
- Connect incident management to deployment data
- Build a basic dashboard with the four DORA metrics
- Establish baselines
Month 2: Developer Survey
- Design a quarterly developer satisfaction survey
- Run the first survey
- Analyse results and identify top pain points
- Create action items from survey findings
Month 3: Efficiency Metrics
- Measure CI/CD pipeline duration
- Track code review turnaround time
- Measure environment provisioning time
- Identify the biggest efficiency bottlenecks
Ongoing: Review and Act
- Review DORA metrics monthly with engineering leadership
- Run the developer survey every quarter
- Track action items through to completion
- Celebrate improvements, investigate regressions
Engineering productivity metrics are powerful when used responsibly. If you need help establishing a metrics programme that drives improvement without destroying culture, let's talk.