A CI/CD pipeline is the factory floor of software engineering. Every feature, bug fix, and improvement flows through it. A fast, reliable pipeline means faster delivery and higher quality. A slow, flaky pipeline means frustrated engineers, delayed releases, and a growing list of "we'll fix it later" shortcuts.
Here's how to design a CI/CD pipeline that scales from a small team to a large engineering organisation.
Reference Architecture
Source → Build → Test → Security → Staging → Production
  │        │      │        │          │          │
  │        │      │        │          │     ┌────┴────┐
  │        │      │        │          │     │ Canary  │
  │        │      │        │          │     │ → Full  │
  │        │      │        │          │     └─────────┘
  │        │      │        │          │
  │        │      │        │     Smoke Tests
  │        │      │        │     Integration Tests
  │        │      │   SAST, SCA, Container Scan
  │        │      │
  │        │   Unit Tests, Integration Tests
  │        │
  │   Compile, Build Container Image
  │
Lint, Format Check, Commit Validation
Stage 1: Source
Trigger: Pull request created or updated, push to main branch.
Activities:
- Lint and format checks (fast feedback on style issues)
- Commit message validation (conventional commits)
- PR size check (flag PRs that are too large for effective review)
Target time: Under 30 seconds.
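As a rough illustration of the commit message and PR size checks, here is a minimal Python sketch. The list of allowed types, the header pattern, and the 400-line threshold are assumptions chosen for the example, not values the pipeline prescribes; a real setup would more likely use an existing commit linter.

```python
import re

# Assumed subset of Conventional Commits types; adjust to your team's convention.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore", "revert",
)

# Header shape: type(optional-scope)!: description
HEADER_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)

# Hypothetical threshold for "too large to review effectively".
MAX_CHANGED_LINES = 400


def is_valid_commit_header(header: str) -> bool:
    """Return True if the first line of a commit message matches the pattern."""
    return HEADER_PATTERN.match(header) is not None


def pr_too_large(lines_added: int, lines_deleted: int,
                 limit: int = MAX_CHANGED_LINES) -> bool:
    """Flag a PR whose total changed lines exceed the limit."""
    return lines_added + lines_deleted > limit
```

Both checks are pure functions over text and counts, which is what keeps this stage inside a 30-second budget.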
Stage 2: Build
Activities:
- Compile/transpile code
- Build container image
- Generate build artifacts
- Tag with version (git SHA + build number)
Target time: Under 2 minutes (with caching).
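One way to build the version tag is sketched below. The `<short-sha>-<build-number>` layout and the 12-character SHA prefix are assumptions for illustration; any scheme works as long as a tag maps back to exactly one commit and one build.

```python
import re


def image_tag(git_sha: str, build_number: int, short_len: int = 12) -> str:
    """Build a tag such as '3f9a1c2b7d4e-128' from a commit SHA and build number.

    The '<short-sha>-<build-number>' layout is an assumed convention.
    """
    if not re.fullmatch(r"[0-9a-f]{7,40}", git_sha):
        raise ValueError(f"not a hex git SHA: {git_sha!r}")
    if build_number < 0:
        raise ValueError("build number must be non-negative")
    return f"{git_sha[:short_len]}-{build_number}"
```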
Stage 3: Test
Activities:
- Unit tests (fast, isolated, no external dependencies)
- Integration tests (with test databases, message queues)
- Contract tests (API contract validation between services)
Target time: Under 5 minutes. Parallelise test suites across multiple runners.
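Parallelising pays off most when the shards finish at roughly the same time. The sketch below is one simple way to do that, assuming you have historical durations per test file: a greedy "longest first" assignment that always hands the next test to the least-loaded runner. The function name and the input shape are illustrative, not tied to any CI platform.

```python
import heapq


def shard_tests(durations: dict[str, float], runners: int) -> list[list[str]]:
    """Split tests across runners, balancing total duration per runner.

    durations maps a test name to its historical run time in seconds.
    """
    if runners < 1:
        raise ValueError("need at least one runner")
    # Min-heap of (total seconds assigned so far, runner index).
    heap = [(0.0, index) for index in range(runners)]
    shards: list[list[str]] = [[] for _ in range(runners)]
    # Place the slowest tests first; ties are broken by name for determinism.
    for name, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        total, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (total + seconds, index))
    return shards
```

The slowest shard sets the wall-clock time of the stage, so balancing by duration matters more than balancing by test count.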
Stage 4: Security
Activities:
- SAST (static code analysis for vulnerabilities)
- SCA (dependency vulnerability scanning)
- Container image scanning
- IaC scanning (if infrastructure changes)
Target time: Under 3 minutes. Run in parallel with tests.
Stage 5: Staging
Activities:
- Deploy to staging environment
- Run smoke tests (critical user journeys)
- Run integration tests against staging
- Performance test (optional, for critical paths)
Target time: Under 5 minutes for deployment + smoke tests.
Stage 6: Production
Activities:
- Deploy using chosen strategy (canary, blue-green, rolling)
- Run production smoke tests
- Monitor error rates and latency
- Automatic rollback if metrics degrade
Target time: Under 10 minutes for full rollout.
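The "automatic rollback if metrics degrade" step comes down to comparing the new version against a baseline. A minimal sketch follows, assuming error rate and p95 latency are the signals being watched; the thresholds (one percentage point of extra errors, 20% extra latency) are placeholder values, not recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float   # 95th percentile latency in milliseconds


def should_roll_back(baseline: Metrics, current: Metrics,
                     max_error_rate_increase: float = 0.01,
                     max_latency_ratio: float = 1.2) -> bool:
    """Return True if the new version is measurably worse than the baseline.

    Both thresholds are assumed values for illustration.
    """
    error_regressed = (current.error_rate - baseline.error_rate
                       > max_error_rate_increase)
    latency_regressed = (current.p95_latency_ms
                         > baseline.p95_latency_ms * max_latency_ratio)
    return error_regressed or latency_regressed
```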
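Total pipeline time target: Under 15 minutes from commit to production, with the fastest teams aiming for under 10. Treat the per-stage targets above as ceilings rather than typical times: taken at their limits, with security scanning running in parallel with tests, they add up to about 22.5 minutes, so a 15-minute pipeline needs most stages to finish well inside their budgets.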
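It can help to add the stage targets up along the critical path. The sketch below uses the per-stage ceilings from the stages above and assumes that security runs in parallel with tests while every other stage runs in sequence.

```python
# Per-stage ceilings from the stage descriptions, in seconds.
STAGE_TARGETS_SECONDS = {
    "source": 30,
    "build": 120,
    "test": 300,
    "security": 180,
    "staging": 300,
    "production": 600,
}

TOTAL_TARGET_SECONDS = 15 * 60


def worst_case_seconds(targets: dict[str, int]) -> int:
    """Critical-path time if every stage uses its full budget.

    Assumes test and security run in parallel; the rest run in sequence.
    """
    parallel_block = max(targets["test"], targets["security"])
    return (targets["source"] + targets["build"] + parallel_block
            + targets["staging"] + targets["production"])


worst_case = worst_case_seconds(STAGE_TARGETS_SECONDS)
overshoot = worst_case - TOTAL_TARGET_SECONDS
print(f"worst case: {worst_case / 60:.1f} min, over target by {overshoot / 60:.1f} min")
```

The 7.5-minute gap between the sum of the ceilings and the 15-minute total is the headroom the stages have to find between them.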
Deployment Strategies
Rolling Update
New version replaces old version incrementally. Simple, no extra infrastructure.
Risk: Mixed versions serving traffic simultaneously. If the new version has a bug, some users are affected before rollback completes.
Best for: Stateless services where mixed-version traffic is acceptable.
Blue-Green
Two identical environments (blue and green). Deploy new version to the inactive environment, switch traffic, keep old environment as instant rollback.
Risk: Double infrastructure cost during deployment. Database schema changes need backward compatibility.
Best for: Services where zero-downtime deployment is critical and instant rollback is required.
Canary
Route a small percentage of traffic (1-5%) to the new version and monitor its metrics. Gradually increase traffic if it stays healthy; roll back instantly if not.
Risk: Requires traffic routing capability and sophisticated monitoring.
Best for: High-traffic services where you want production validation before full rollout.
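The canary progression can be sketched as a loop over traffic steps with a health gate at each one. In the example below, the first two steps follow the 1-5% starting range mentioned above; the later steps and the `is_healthy` callback are assumptions, and the actual traffic shifting (a load balancer or service mesh call) is left out.

```python
from typing import Callable, Sequence

# Assumed traffic schedule, in percent of requests sent to the new version.
DEFAULT_STEPS = (1, 5, 25, 50, 100)


def run_canary(is_healthy: Callable[[int], bool],
               steps: Sequence[int] = DEFAULT_STEPS) -> int:
    """Walk through the traffic steps; return the final traffic percentage.

    Returns 100 if every step passed its health check, 0 if any step failed
    and the canary was rolled back. is_healthy receives the current step.
    """
    for percent in steps:
        # A real pipeline would shift traffic to `percent` here and wait
        # for enough requests before judging health.
        if not is_healthy(percent):
            return 0
    return 100
```

In practice `is_healthy` would wrap a metrics query and a soak period; passing it in as a callback keeps the rollout logic testable without a live environment.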
Feature Flags
Deploy new code to production but control activation through feature flags. Decouple deployment from release.
Best for: Gradual rollouts, A/B testing, and the ability to quickly disable features without deployment.
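A gradual rollout behind a flag usually relies on stable bucketing, so a given user gets the same answer on every request. Here is a minimal sketch, assuming users are identified by a string ID and the rollout is expressed as a whole-number percentage; the function and flag names are hypothetical and do not come from any particular feature flag product.

```python
import hashlib


def flag_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Decide deterministically whether a flag is on for a user.

    The user is hashed into one of 100 buckets; the flag is on for buckets
    below rollout_percent. Including the flag name in the hash means
    different flags select different groups of users.
    """
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent
```

Because the bucket is fixed per user, raising the percentage only ever adds users; nobody who already has the feature loses it as the rollout widens.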
Caching Strategies
Pipeline speed depends heavily on caching:
| Cache Type | What It Caches | Indicative Impact |
|---|---|---|
| Dependency cache | npm, pip, Maven packages | 50-80% faster install |
| Build cache | Docker layers, compiled artifacts | 40-70% faster builds |
| Test cache | Test results for unchanged code | Skip unchanged test suites |
| Container layer cache | Base image layers | 60-80% faster image builds |
Implementation: Most CI/CD platforms (GitHub Actions, GitLab CI, Azure DevOps) support caching natively. Key each cache on its content, for example by using a hash of the lock file as the dependency cache key, so the cache is invalidated exactly when the dependencies change.
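The idea behind a lock-file-based cache key can be sketched in a few lines. The key layout (`prefix-os-hash`) and the 16-character hash prefix are assumptions for illustration; CI platforms typically offer a built-in way to compute the same thing.

```python
import hashlib


def dependency_cache_key(lock_file_bytes: bytes, os_name: str,
                         prefix: str = "deps") -> str:
    """Derive a cache key from the lock file contents.

    The same lock file always yields the same key (cache hit); any change
    to the lock file yields a new key (cache miss, fresh install).
    """
    digest = hashlib.sha256(lock_file_bytes).hexdigest()
    return f"{prefix}-{os_name}-{digest[:16]}"
```

Including the operating system in the key avoids restoring packages with native binaries that were built for a different runner image.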
Mono-Repo vs Multi-Repo
| Aspect | Mono-Repo | Multi-Repo |
|---|---|---|
| Pipeline complexity | Higher (selective builds needed) | Lower (one pipeline per repo) |
| Cross-service changes | Single PR, atomic | Multiple PRs, coordinated |
| Build speed | Slower without optimisation | Naturally scoped |
| Dependency management | Unified | Per-repo |
| Tooling | Usually needs a build tool such as Nx, Turborepo, or Bazel | Standard CI/CD |
Recommendation: Multi-repo for teams with clear service boundaries and independent release cycles. Mono-repo for teams with high cross-service coupling or shared libraries. Don't choose based on trend — choose based on your team's actual coordination patterns.
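The "selective builds" that a mono-repo needs boil down to mapping changed files to projects and then following the dependency graph. The sketch below shows the idea with a hypothetical three-project layout; tools such as Nx, Turborepo, and Bazel do this with far more precision, so treat it as an illustration of the concept rather than a replacement.

```python
from pathlib import PurePosixPath

# Hypothetical repository layout and dependency graph.
PROJECT_ROOTS = {
    "api": "services/api",
    "web": "services/web",
    "shared": "libs/shared",
}
DEPENDS_ON = {"api": {"shared"}, "web": {"shared"}, "shared": set()}


def affected_projects(changed_files: list[str],
                      roots: dict[str, str] = PROJECT_ROOTS,
                      depends_on: dict[str, set[str]] = DEPENDS_ON) -> set[str]:
    """Return the projects that need building for a set of changed files."""
    affected: set[str] = set()
    for changed in changed_files:
        parts = PurePosixPath(changed).parts
        for name, root in roots.items():
            root_parts = PurePosixPath(root).parts
            # A file belongs to a project if its path starts with the root.
            if parts[:len(root_parts)] == root_parts:
                affected.add(name)
    # Propagate: anything depending on an affected project is affected too.
    grew = True
    while grew:
        grew = False
        for name, deps in depends_on.items():
            if name not in affected and deps & affected:
                affected.add(name)
                grew = True
    return affected
```

A change to the shared library rebuilds everything that depends on it, while a change inside one service rebuilds only that service.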
GitOps vs Push-Based Deployment
Push-based (traditional): CI/CD pipeline pushes changes to the target environment. The pipeline has credentials and access to deploy.
GitOps: The desired state is declared in Git. A controller (ArgoCD, Flux) running in the cluster continuously reconciles actual state with desired state. The pipeline pushes to Git, not to the cluster.
| Aspect | Push-Based | GitOps |
|---|---|---|
| Audit trail | Pipeline logs | Git history (complete, and tamper-resistant when the branch is protected) |
| Drift detection | None (fire and forget) | Continuous reconciliation |
| Rollback | Re-run old pipeline | Git revert |
| Security | Pipeline needs cluster credentials | Only the in-cluster controller needs cluster credentials; the pipeline needs Git write access |
| Complexity | Simpler to start | More components to manage |
Recommendation: GitOps for Kubernetes workloads (ArgoCD is excellent). Push-based for serverless, PaaS, and non-Kubernetes deployments.
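The reconciliation at the heart of GitOps can be pictured as a diff between two maps: what Git declares and what the cluster is running. The sketch below is a conceptual illustration only, assuming each workload is reduced to a name and an image tag; it is not how ArgoCD or Flux are implemented.

```python
def plan_reconcile(desired: dict[str, str],
                   actual: dict[str, str]) -> dict[str, list[str]]:
    """Compute the actions needed to make actual state match desired state.

    Both arguments map a workload name to its image tag (a simplification).
    """
    create = sorted(name for name in desired if name not in actual)
    update = sorted(name for name in desired
                    if name in actual and desired[name] != actual[name])
    delete = sorted(name for name in actual if name not in desired)
    return {"create": create, "update": update, "delete": delete}


# Example: Git declares api:v2 and worker:v1; the cluster runs api:v1 and
# a manually created debug pod that Git knows nothing about (drift).
plan = plan_reconcile(
    desired={"api": "v2", "worker": "v1"},
    actual={"api": "v1", "debug": "v0"},
)
print(plan)
```

A controller runs a comparison like this continuously, which is why drift (the `debug` workload in the example) is caught without anyone re-running a pipeline.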
Pipeline Observability
Monitor your pipeline as you monitor your production systems:
| Metric | Target |
|---|---|
| Pipeline duration (p50/p95) | Under 15 min / Under 25 min |
| Pipeline success rate | Above 95% |
| Flaky test rate | Below 2% |
| Time waiting for runner | Under 1 minute |
| Deployment frequency | Daily or better |
| Rollback rate | Below 5% |
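Several of these metrics can be computed from nothing more than a list of pipeline runs. Here is a minimal sketch, assuming each run is recorded with a duration and a pass/fail flag; the `PipelineRun` record is hypothetical, and the percentile uses the nearest-rank method, which is one of several common definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineRun:
    duration_minutes: float
    succeeded: bool


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile for a whole-number pct between 1 and 100."""
    if not values:
        raise ValueError("no values")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct% of n
    return ordered[rank - 1]


def summarize(runs: list[PipelineRun]) -> dict[str, float]:
    """Compute p50/p95 duration and success rate for a batch of runs."""
    durations = [run.duration_minutes for run in runs]
    return {
        "p50_minutes": percentile(durations, 50),
        "p95_minutes": percentile(durations, 95),
        "success_rate": sum(run.succeeded for run in runs) / len(runs),
    }
```

Tracking p95 alongside p50 matters because the slow tail is what engineers remember; a pipeline with a 10-minute median and a 40-minute p95 still feels unreliable.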
A well-designed CI/CD pipeline is the foundation of engineering velocity. If you're optimising your deployment pipeline or building one from scratch, let's talk.