Most scaleups encounter penetration testing for the first time when an enterprise customer's procurement process demands it, or when a compliance framework (ISO 27001, SOC 2, NIS2) requires it. The typical result: a pentest report gets produced, the critical findings get fixed, and the document goes into a compliance folder.
This is the minimum viable approach. It satisfies the checkbox requirement but delivers a fraction of the potential security value.
Penetration testing, done correctly, is one of the most valuable security investments a growing company can make — not because it produces a certificate, but because it surfaces real vulnerabilities in real systems, tested by skilled attackers who think like the adversaries you'll eventually face.
Here's how to get genuine value from the exercise.
Penetration Testing vs. Vulnerability Scanning
These are different things. A vulnerability scanner (Tenable, Qualys, Nessus) automates the detection of known CVEs and misconfigurations. A penetration test is a human-conducted adversarial simulation where a skilled tester attempts to achieve a specific objective — extracting data, escalating privileges, accessing restricted systems — by chaining multiple vulnerabilities and techniques. Both are valuable; neither replaces the other.
Types of Penetration Tests
Understanding the test types helps you specify the right scope for your situation.
External network:
- Internet-facing systems
- Web applications
- APIs
- VPN endpoints
- Email security (SPF/DKIM)

Internal network:
- Post-breach simulation
- Lateral movement
- Active Directory attacks
- Internal services
- Privilege escalation

Web application:
- OWASP Top 10
- Authentication bypass
- Authorisation flaws
- Business logic
- API security

Cloud:
- IAM/RBAC misconfigurations
- Public storage buckets
- Network security groups
- Secrets in environment variables
- Lateral movement in cloud
Choose scope based on your primary attack surface and compliance requirements. Most scaleups should start with Web Application + External Network.
Black Box vs. Grey Box vs. White Box
The knowledge level provided to testers affects the depth and focus of the engagement:
Black box: Testers receive only the target scope (URL, IP range) — no credentials, no architecture details. Simulates an external attacker with no insider knowledge. Good for testing your external exposure, but less efficient for finding deep application logic flaws.
Grey box: Testers receive some information — typically test credentials and a high-level architecture overview. The most common choice for web application testing; it allows testers to explore application logic thoroughly rather than spending half the engagement on reconnaissance.
White box: Testers receive full access to source code, architecture documentation, and credentials. The most thorough approach, as testers can identify vulnerabilities that wouldn't be externally visible. Required for security-critical applications (banking, healthcare).
For most scaleups, grey box web application testing combined with black box external network testing provides the best coverage-to-cost ratio.
Before the Test: Scoping Properly
Poor scoping is the most common reason pentests produce generic, low-value reports. Invest time upfront.
Define your objective: What is the highest-value asset an attacker could compromise? Customer PII? Payment data? Intellectual property? Source code? Define the "crown jewels" and design the scope around what would genuinely damage your business.
Define the scope clearly:
- In-scope systems and URLs
- In-scope user roles (standard user, admin, API consumer)
- Explicitly out-of-scope systems (production databases you don't want disrupted, third-party SaaS)
- Rules of engagement (is social engineering in scope? Physical access?)
Specify the test format:
- Grey box: provide test accounts for each role your application supports
- Architecture: a brief overview of your stack helps testers focus on the interesting areas
- Code review (if white box): specify which repositories are in scope
Timing:
- Avoid running tests during peak business periods
- Coordinate with your operations team — monitoring should be active during the test (you want to know if your defences would detect the attacker)
- Agree on communication channels for critical findings (a pentest team that finds a critical SQLi at 11pm should have a way to reach you)
During the Test: What Good Pentesters Look For
A skilled penetration tester isn't running a scanner and presenting the results. They're thinking like an attacker — chaining small issues into impactful attack paths.
The OWASP Top 10 for web applications is the baseline, but good testers go further:
- Business logic vulnerabilities: Can you purchase an item at a negative price? Can you access another user's data by changing an ID in the URL? Can you skip the payment step? These require understanding your application, not just running tools.
- Broken access control: The #1 OWASP vulnerability. Can a regular user access admin functionality? Can User A access User B's data? Can an API caller access resources they shouldn't?
- Insecure direct object references (IDOR): Changing `GET /orders/1234` to `GET /orders/1235` and getting another user's order data.
- Authentication flaws: Account takeover via password reset, token predictability, session fixation, JWT algorithm confusion attacks.
- Server-side request forgery (SSRF): Using your application to make requests to internal services — including cloud metadata endpoints that expose AWS/Azure credentials.
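The broken access control and IDOR findings above come down to one server-side question: does the authenticated user actually own the resource named in the request? A minimal sketch of that check, using a hypothetical in-memory order store (the field names and data model are illustrative only):

```python
# Ownership check against IDOR: the server must verify that the
# authenticated user owns the requested resource, never trusting the
# ID in the URL alone. Hypothetical data model for illustration.

class NotFound(Exception):
    """Raised for both missing and foreign resources, so the response
    does not reveal which order IDs exist (no enumeration oracle)."""

def get_order(order_id: int, current_user: str, orders: dict) -> dict:
    order = orders.get(order_id)
    if order is None or order["owner"] != current_user:
        raise NotFound(f"order {order_id}")
    return order

orders = {
    1234: {"owner": "alice", "total": 42},
    1235: {"owner": "bob", "total": 17},
}

print(get_order(1234, "alice", orders)["total"])  # alice's own order: 42
try:
    get_order(1235, "alice", orders)              # bob's order: denied
except NotFound:
    print("not found")
```

Returning the same "not found" response for missing and foreign IDs is deliberate: a distinct "forbidden" response would confirm to an attacker that the ID exists.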
Cloud SSRF is critical
Server-Side Request Forgery against cloud metadata endpoints is one of the most impactful vulnerabilities in cloud-hosted applications. A successful SSRF to 169.254.169.254 (the link-local metadata address used by AWS, Azure, and GCP alike) can expose instance credentials, granting the attacker access to your cloud environment. This should be in scope for any web application pentest.
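One common (partial) defence is to resolve and validate any user-supplied URL before the server fetches it, rejecting private, loopback, and link-local destinations. A standard-library sketch; note that on its own this does not stop DNS rebinding or redirect-based bypasses:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_outbound_url(url: str) -> bool:
    """Reject URLs resolving to private, loopback, or link-local
    addresses (e.g. the 169.254.169.254 cloud metadata endpoint).
    A sketch only: it does not handle redirects or DNS rebinding,
    where the answer changes between this check and the real fetch."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(socket.gethostbyname(host))
    except (socket.gaierror, ValueError):
        return False
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)

print(is_safe_outbound_url("http://169.254.169.254/latest/meta-data/"))  # False
print(is_safe_outbound_url("http://10.0.0.5/internal"))                  # False
```

On AWS specifically, enforcing IMDSv2 (session-oriented metadata requests) blunts most SSRF-to-credential attacks even if a check like this is bypassed.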
After the Test: Making the Most of the Report
A penetration test report is not the end of the process — it's the beginning of the remediation work. Here's how to handle it effectively.
Understanding the Severity Ratings
Pentest reports typically rate findings Critical, High, Medium, Low, and Informational, with severity usually derived from CVSS (Common Vulnerability Scoring System) scores. Don't treat these ratings as the only input to prioritisation — a "High" finding in a non-customer-facing internal tool may be less urgent than a "Medium" finding in your payment API.
Prioritise by business risk, not just CVSS score. Ask for each finding: "What could an attacker do with this? How hard is it to exploit? What's the impact on our customers and business?"
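One way to make those questions concrete is to rank findings on business impact first and CVSS second. The 1–3 asset-impact scale below is invented for illustration; in practice it would come from your own asset register:

```python
# Rank findings by business risk first, raw CVSS score second.
# The asset_impact scale (1 = internal tool, 3 = crown jewels) is a
# made-up illustration, not a standard.

findings = [
    {"id": "F-01", "title": "Stored XSS in internal admin tool", "cvss": 7.5, "asset_impact": 1},
    {"id": "F-02", "title": "IDOR in payment API",               "cvss": 6.5, "asset_impact": 3},
    {"id": "F-03", "title": "Verbose error pages",               "cvss": 3.1, "asset_impact": 1},
]

ranked = sorted(findings, key=lambda f: (f["asset_impact"], f["cvss"]), reverse=True)
for f in ranked:
    print(f["id"], f["title"])
```

The Medium-severity IDOR in the payment API outranks the nominally High finding in the internal tool, which is exactly the reordering the business-risk argument above calls for.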
The Remediation Process
Review all findings, categorise by severity and business risk, assign ownership to specific engineers. Every finding must have a named owner and a target remediation date.
Critical and High findings represent exploitable vulnerabilities with significant business impact. Treat Critical findings as production incidents; address High findings within two weeks.
- Fix or mitigate immediately
- Deploy a WAF rule as a temporary control if the fix will take time
- Verify the fix with a retest
Medium findings should enter the engineering sprint backlog and be addressed within the next one to two sprints. These are real vulnerabilities that need fixing, even if the immediate risk is lower.
Low and informational findings represent security improvements rather than urgent vulnerabilities. Add them to a security backlog and work through them systematically.
Request a retest for Critical and High findings. Most pentest providers include this in the engagement cost. A fix you have not verified is a fix you have not made.
Tracking Progress
Create a remediation tracking spreadsheet (or use your ticket tracker) with: finding ID, description, severity, owner, target date, and status. Review in your weekly engineering meeting until all Critical and High findings are closed.
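A tracker with those columns can be as simple as a list of records and one query for the weekly review. The field names and statuses below are illustrative, not a standard schema:

```python
# Minimal remediation tracker matching the columns suggested above.
# Field names and status values are illustrative only.
from dataclasses import dataclass

@dataclass
class Finding:
    finding_id: str
    description: str
    severity: str      # Critical / High / Medium / Low / Informational
    owner: str
    target_date: str   # ISO date
    status: str        # open / in-progress / fixed / retested

register = [
    Finding("F-01", "SQLi in /search", "Critical", "dana", "2025-07-01", "retested"),
    Finding("F-02", "IDOR in /orders", "High",     "mike", "2025-07-10", "open"),
]

# The weekly review question: which Critical/High findings are not yet
# verified closed by a retest?
still_open = [
    f for f in register
    if f.severity in ("Critical", "High") and f.status != "retested"
]
for f in still_open:
    print(f.finding_id, f.owner, f.target_date)
```

Whether this lives in a spreadsheet, a ticket tracker, or code matters less than the discipline: every finding has an owner, a date, and a status that only reaches "closed" after a retest.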
Building Continuous Security Testing
Annual pentests are a compliance cadence, not a security practice. Production systems change weekly; a pentest from twelve months ago says little about the system you run today.
Complement annual pentests with:
- SAST in CI/CD: Static analysis tools (Semgrep, SonarQube, Checkmarx) that run on every pull request
- DAST in staging: Dynamic scanning (OWASP ZAP, Burp Suite Enterprise) against staging before production deployments
- Dependency scanning: Snyk or Dependabot for known CVEs in your libraries
- Bug bounty programme: A managed bug bounty (HackerOne, Bugcrowd) provides continuous adversarial testing from a global researcher community — and you only pay for valid findings
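As one example of the SAST step, a Semgrep scan can run on every pull request with a short CI job. This is a hypothetical GitHub Actions sketch; adapt it to your CI system (`--config auto` uses Semgrep's public rulesets without an account, and `--error` fails the build when findings are reported):

```yaml
# Hypothetical GitHub Actions job: run Semgrep SAST on every pull request.
name: sast
on: [pull_request]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    container: semgrep/semgrep
    steps:
      - uses: actions/checkout@v4
      - run: semgrep scan --config auto --error
```

Failing the build on findings is the point: a scanner that only files reports gets ignored, while one that blocks the merge forces the conversation at the pull request.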
The goal is a continuous security testing programme where the annual pentest validates the overall posture, but daily automated scanning catches new vulnerabilities before they reach production.
Security architecture and penetration testing programme design are areas I work on with clients regularly. If you need help specifying a pentest engagement, managing the remediation process, or building a continuous testing programme, let's talk.