compliance - freeCodeCamp.org

GDPR Article 32 for Software Engineers: Technical Controls, Implementations, and Auditor Questions

Ayobami Adejumo — Thu, 28 May 2026 16:20:25 +0000

When I first read GDPR Article 32, I made a mistake. I thought it was a legal document.

But it's not. It's an infrastructure specification.

The regulation says you need "appropriate technical measures" to protect personal data. That phrase is terrifying because it's vague. What does "appropriate" mean? What counts as a "technical measure"? Who decides whether you've done enough?

The compliance consultant will give you a 50-page policy document. The auditor will ignore it and ask for your database schema.

This guide is the middle ground. I've implemented Article 32 controls for 12 SaaS companies. The same nine controls appear every time, and so do the same auditor questions.

This is a complete guide to the 9 technical controls you must implement, the exact code and commands for each, and the questions your GDPR auditor will ask.

What You'll Learn
Prerequisites
Part 1: Understanding Article 32
Part 2: Article 32(1)(a) — Pseudonymisation and Encryption
Part 3: Article 32(1)(b) — Confidentiality and Integrity
Part 4: Article 32(1)(c) — Availability and Resilience
Part 5: Article 32(1)(d) — Regular Testing
Part 6: Penetration Testing
Best Practices Summary
What's Next
Resources

What You'll Learn

The 9 technical controls required by GDPR Article 32(1)(a) through (d)
Exact PostgreSQL commands for pseudonymisation and field-level encryption
How to implement automatic logoff and unique user identification
Application-level audit logging that goes beyond CloudTrail
Integrity controls that prove data has not been altered
mTLS and TLS 1.3 for transmission security
The auditor questions you must answer with evidence

Let's dive in.

Prerequisites

Before following along, you should have:

Knowledge:

Familiarity with PostgreSQL and basic SQL
Basic understanding of AWS services (KMS, RDS, CloudTrail)
Comfort reading Python and JavaScript/Node.js code
A working knowledge of what GDPR is — if you are starting from scratch, read the ICO's GDPR overview first

Tools and access:

PostgreSQL 14 or later
An AWS account with IAM administrator access
Python 3.8 or later with the cryptography library (pip install cryptography)
Node.js 16 or later
A compliance automation tool — Vanta or OneTrust — is optional but recommended for evidence collection

Estimated time: The controls in this guide take 2–4 weeks to implement fully, depending on your existing infrastructure. Individual controls range from 30 minutes (KMS key setup) to 5 days (full application-layer encryption rollout).

Part 1: Understanding Article 32 — The Technical Requirements

1.1. What Article 32 Actually Requires

Article 32 of the GDPR is titled "Security of processing." It requires controllers and processors to implement "appropriate technical and organisational measures" to ensure a level of security appropriate to the risk.

Here is the important distinction most teams miss: Article 32 is not a checklist of policies. A policy says "we encrypt personal data." Evidence says "here is the KMS key with automatic rotation, here is the application-layer encryption code, and here are the CloudTrail logs showing every decryption attempt." The auditor wants evidence, not documentation.

The four main requirements:

Section	Requirement	What It Means for Engineers
32(1)(a)	Pseudonymisation and encryption	Personal data must be stored so it cannot be attributed to a specific data subject without additional information held separately
32(1)(b)	Confidentiality, integrity, availability, and resilience	Systems must protect data from unauthorised access, alteration, loss, and be able to recover from incidents
32(1)(c)	Restoring availability and access	You must be able to restore data and regain system access after a physical or technical incident
32(1)(d)	Regular testing and risk assessment	You must have a process for regularly testing and evaluating your security measures

1.2. The Scope Question: What Data Is Covered?

Before implementing any controls, you must know what data falls under Article 32. The regulation applies to personal data — any information that can identify a living individual directly or indirectly.

Data types and their protection levels:

Category	Examples	Protection Level
Personal data	Name, email, phone, IP address	Standard
Special category data (often called sensitive personal data)	Health data, biometric data, political opinions, religious beliefs	Enhanced
Pseudonymised data	Data where direct identifiers are replaced with a code	Standard
Anonymised data	Data that cannot be re-identified under any reasonable circumstances	Out of scope

The data mapping question your auditor will ask:

"Can you provide a data flow diagram showing where personal data enters your system, where it is stored, where it is processed, and how it is deleted?"

Before the auditor asks, run this command to list every RDS instance in the current region with its encryption status. Repeat it for each region you use, then mark which instances hold personal data:

# List all RDS instances with their encryption status
# Any StorageEncrypted: false is a finding
aws rds describe-db-instances \
  --query 'DBInstances[*].{
    ID:DBInstanceIdentifier,
    Engine:Engine,
    StorageEncrypted:StorageEncrypted,
    AZ:AvailabilityZone
  }' \
  --output table

Any instance showing StorageEncrypted: false must be addressed before your Article 32 audit.
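
If you save the same command's output as JSON instead of a table, the check can run on a schedule rather than by eye. The following is a minimal sketch, assuming the output was produced with `--output json` and without the `--query` filter; the function name and the sample instances are hypothetical.

```python
import json
from typing import Dict, List


def find_unencrypted_instances(describe_output: str) -> List[Dict[str, str]]:
    """Return the ID and engine of every RDS instance without storage encryption."""
    instances = json.loads(describe_output).get("DBInstances", [])
    return [
        {"id": db["DBInstanceIdentifier"], "engine": db.get("Engine", "unknown")}
        for db in instances
        if not db.get("StorageEncrypted", False)
    ]


# Hypothetical sample in the shape returned by
# `aws rds describe-db-instances --output json`
SAMPLE = json.dumps({
    "DBInstances": [
        {"DBInstanceIdentifier": "production-db", "Engine": "postgres", "StorageEncrypted": True},
        {"DBInstanceIdentifier": "legacy-reporting", "Engine": "mysql", "StorageEncrypted": False},
    ]
})

if __name__ == "__main__":
    for finding in find_unencrypted_instances(SAMPLE):
        print(f"FINDING: {finding['id']} ({finding['engine']}) is not encrypted at rest")
```

An empty result is the evidence you want to keep; a non-empty one is your remediation list.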

Part 2: Article 32(1)(a) — Pseudonymisation and Encryption

2.1. How to Implement Pseudonymisation at the Database Layer

Pseudonymisation replaces direct identifiers — names, email addresses, passport numbers — with a pseudonym or code. The goal is that the main working dataset cannot identify a data subject without access to a separately stored, separately protected lookup table.

Here is the incorrect approach — direct identifiers in plaintext:

-- Bad: Direct identifiers stored in the main working table
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    full_name VARCHAR(255),       -- Direct identifier — should not be here
    email VARCHAR(255),           -- Direct identifier — should not be here
    passport_number VARCHAR(50)   -- Direct identifier — should not be here
);

This approach means any engineer, analyst, or attacker with SELECT access to the users table can immediately read and identify individuals. There is no separation between working data and identifying data.

Here is the correct implementation with a separate identifiers table:

-- Good: Pseudonymised main table with a separate, restricted lookup table

-- Step 1: Main working table uses only the pseudonym
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    pseudonym UUID NOT NULL UNIQUE DEFAULT gen_random_uuid(),  -- Non-guessable pseudonym; UNIQUE is required by the foreign key in Step 2
    created_at TIMESTAMP DEFAULT NOW(),
    account_status VARCHAR(50)
    -- No direct identifiers here
);

-- Step 2: Identifier lookup table — kept separate, access restricted
CREATE TABLE user_identifiers (
    pseudonym UUID PRIMARY KEY,
    full_name VARCHAR(255),
    email VARCHAR(255),
    passport_number VARCHAR(50),
    FOREIGN KEY (pseudonym) REFERENCES users(pseudonym)
);

-- Step 3: Grant minimal, role-based access
GRANT SELECT ON users TO app_role;                              -- Application uses pseudonym only
GRANT SELECT, INSERT, UPDATE ON user_identifiers TO identity_service_role;  -- Only the identity service sees names

What each part does:

gen_random_uuid() creates a version-4 UUID pseudonym for each user — unpredictable and not reversible without the lookup table
The main users table is safe for analytics, reporting, and general application use without exposing any identifying information
Only the identity_service_role can join the two tables — this role is assigned only to the specific service that handles identity operations
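
On the application side, the same split happens at write time: one incoming record becomes two rows that share only the pseudonym. Here is a rough sketch of that step, assuming the column names from the schema above; `split_record` and the sample values are hypothetical, and `uuid.uuid4()` stands in for `gen_random_uuid()`.

```python
import uuid
from typing import Dict, Tuple

# Columns that belong in user_identifiers, never in the working table
DIRECT_IDENTIFIERS = ("full_name", "email", "passport_number")


def split_record(record: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split one incoming record into a working row and an identifier row."""
    pseudonym = str(uuid.uuid4())  # random version-4 UUID, like gen_random_uuid()

    # Working row: everything except the direct identifiers
    working = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    working["pseudonym"] = pseudonym

    # Identifier row: only the direct identifiers, keyed by the same pseudonym
    identifiers = {k: record[k] for k in DIRECT_IDENTIFIERS if k in record}
    identifiers["pseudonym"] = pseudonym
    return working, identifiers


if __name__ == "__main__":
    working_row, identifier_row = split_record({
        "full_name": "Example Person",
        "email": "person@example.com",
        "account_status": "active",
    })
    print(working_row)     # safe for analytics
    print(identifier_row)  # written only by the identity service
```

In a real system the two rows would be written by different services with different database roles, so no single credential ever holds both.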

The auditor question you will receive:

"How do you ensure that pseudonymised data cannot be re-identified by an unauthorised party?"

Your evidence:

-- Show that only the identity service role has access to the identifiers table
SELECT grantee, privilege_type, table_name
FROM information_schema.role_table_grants
WHERE table_name = 'user_identifiers';

-- Expected output: identity_service_role, plus the table owner, and nothing else

2.2. How to Implement Encryption at Rest with Customer-Managed Keys

Storage-layer encryption protects data if someone physically steals the disk. But it does not protect against a privileged AWS employee, a compromised cloud administrator, or an authorised user with direct database access. Article 32 auditors know this distinction — and they will ask about it.

Here is the incorrect approach — AWS-managed keys:

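# Bad: default AWS managed key (alias aws/rds)
# Omitting --kms-key-id makes RDS encrypt with the AWS managed key,
# whose key policy you cannot edit
aws rds create-db-instance \
  --db-instance-identifier production-db \
  --engine postgres --db-instance-class db.t3.medium --allocated-storage 100 \
  --master-username dbadmin --manage-master-user-password \
  --storage-encrypted

The problem: when the auditor asks who can use this key and how you would revoke that access, you have no control to show. AWS sets the key policy and rotation schedule of an AWS managed key, and you cannot disable or delete it.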

Here is the correct implementation — customer-managed key with automatic rotation:

# Step 1: Create a customer-managed KMS key
KEY_ID=$(aws kms create-key \
  --origin AWS_KMS \
  --description "Customer-managed key for production PII — Article 32 compliant" \
  --tags TagKey=Purpose,TagValue=GDPR TagKey=Environment,TagValue=production \
  --query 'KeyMetadata.KeyId' \
  --output text)

echo "Created KMS key: $KEY_ID"

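# Step 2: Enable automatic 90-day rotation
# Without --rotation-period-in-days, KMS rotates every 365 days
aws kms enable-key-rotation \
  --key-id "$KEY_ID" \
  --rotation-period-in-days 90

# Step 3: Re-encrypt the production RDS instance under the new key
# The storage key of an existing instance cannot be changed in place:
# snapshot it, copy the snapshot with the new key, then restore from the copy
aws rds create-db-snapshot \
  --db-instance-identifier production-db \
  --db-snapshot-identifier production-db-pre-cmk
aws rds wait db-snapshot-available \
  --db-snapshot-identifier production-db-pre-cmk

aws rds copy-db-snapshot \
  --source-db-snapshot-identifier production-db-pre-cmk \
  --target-db-snapshot-identifier production-db-cmk \
  --kms-key-id "$KEY_ID"
aws rds wait db-snapshot-available \
  --db-snapshot-identifier production-db-cmk

aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier production-db-cmk \
  --db-snapshot-identifier production-db-cmk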

The auditor question:

"Show me that your encryption keys are rotated automatically and that you can prove who has accessed them."

Your evidence:

# Verify rotation is enabled — expected output: true
aws kms get-key-rotation-status --key-id $KEY_ID \
  --query 'KeyRotationEnabled'

# Show the CloudTrail audit trail of every key usage event
aws logs filter-log-events \
  --log-group-name cloudtrail-logs \
  --filter-pattern '{ $.eventSource = "kms.amazonaws.com" }' \
  --query 'events[*].{Time:timestamp,Event:message}' \
  --output table
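
To turn the rotation check into a pass/fail result you can archive, you could parse the full JSON response rather than a single field. This is a sketch under two assumptions: the command is run with `--output json` and no `--query`, and your AWS CLI is recent enough to report `RotationPeriodInDays`. The function name and the 90-day policy default are illustrative.

```python
import json
from typing import Optional


def rotation_finding(status_json: str, max_period_days: int = 90) -> Optional[str]:
    """Return a finding message, or None when rotation meets the policy."""
    status = json.loads(status_json)

    if not status.get("KeyRotationEnabled", False):
        return "automatic rotation is disabled"

    period = status.get("RotationPeriodInDays")
    if period is None:
        # Older CLI versions do not report the period
        return "rotation period not reported; check the AWS CLI version"
    if period > max_period_days:
        return f"rotation period is {period} days, policy allows {max_period_days}"
    return None


if __name__ == "__main__":
    # Hypothetical response for a key left on the default schedule
    sample = json.dumps({"KeyRotationEnabled": True, "RotationPeriodInDays": 365})
    print(rotation_finding(sample) or "rotation meets policy")
```

A key with rotation enabled but left on the default 365-day period is the case this catches that a plain `true` would hide.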

2.3. How to Implement Application-Layer Encryption for Sensitive Fields

Storage encryption is the floor. Application-layer encryption is the ceiling that Article 32 auditors are increasingly expecting for health data, financial records, and other sensitive personal data.

Here is the difference: with storage encryption only, a database administrator who runs SELECT email FROM users sees the plaintext email address. With application-layer encryption, they see gAAAAABm... — an encrypted byte string that only the application (with access to the Vault key) can decrypt.

# application_encryption.py
from typing import Optional

from cryptography.fernet import Fernet

class FieldEncryption:
    """
    Encrypts sensitive personal data fields before they are stored in the database.
    The encryption key is stored in HashiCorp Vault or AWS Secrets Manager, never in code.
    A database administrator with direct SQL access sees only encrypted bytes.
    """

    def __init__(self, key: str):
        # key must be a Fernet key: 32 random bytes, URL-safe base64-encoded
        # (generate once with Fernet.generate_key(), then store it in Vault)
        self.cipher = Fernet(key.encode())

    def encrypt_field(self, plaintext: Optional[str]) -> Optional[str]:
        """Encrypt a sensitive field before writing to the database."""
        if not plaintext:
            return None  # empty or missing values are stored as NULL
        encrypted_bytes = self.cipher.encrypt(plaintext.encode())
        return encrypted_bytes.decode()

    def decrypt_field(self, ciphertext: Optional[str]) -> Optional[str]:
        """
        Decrypt a field when legitimately needed by the application.
        This method requires the Vault key — database admins cannot call it.
        """
        if not ciphertext:
            return None
        decrypted_bytes = self.cipher.decrypt(ciphertext.encode())
        return decrypted_bytes.decode()


# Usage in your application:
from vault_client import get_secret  # Your Vault or Secrets Manager client

# Retrieve the encryption key at application startup — never hardcode it
encryption_key = get_secret("gdpr/field-encryption-key")
encryptor = FieldEncryption(encryption_key)

# Before storing a user's health record
user.health_data_encrypted = encryptor.encrypt_field(user.health_data_plaintext)

# Before reading for a legitimate purpose (subject access request, etc.)
health_data = encryptor.decrypt_field(user.health_data_encrypted)

The auditor question:

"If a database administrator queries the users table directly, can they read customer health data in plaintext?"

Your evidence: Run a direct database query and show the auditor the encrypted output. Then demonstrate that the decryption key is not accessible to database administrators — it is retrieved only by the application through Vault.
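
Before the demonstration, it helps to confirm that no plaintext rows slipped through, for example rows written before the rollout. The sketch below is a structural check only, assuming the column holds Fernet tokens as in the class above. It does not verify the HMAC or decrypt anything, so it needs no key; the function name is hypothetical.

```python
import base64

FERNET_VERSION = 0x80
# version (1) + timestamp (8) + IV (16) + HMAC (32); ciphertext sits between IV and HMAC
FIXED_BYTES = 1 + 8 + 16 + 32
MIN_TOKEN_BYTES = FIXED_BYTES + 16  # at least one AES block of ciphertext


def looks_like_fernet_token(value: str) -> bool:
    """Return True when value has the byte layout of a Fernet token."""
    try:
        raw = base64.urlsafe_b64decode(value.encode("ascii"))
    except ValueError:
        # Covers bad base64 (binascii.Error) and non-ASCII input (UnicodeEncodeError)
        return False
    if len(raw) < MIN_TOKEN_BYTES or raw[0] != FERNET_VERSION:
        return False
    # Ciphertext is AES-CBC output, so its length is a multiple of 16 bytes
    return (len(raw) - FIXED_BYTES) % 16 == 0


if __name__ == "__main__":
    for stored in ["person@example.com", ""]:
        print(repr(stored), looks_like_fernet_token(stored))
```

Running this over a sample of the column and reporting zero non-token values gives the auditor a repeatable check alongside the live query.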

Part 3: Article 32(1)(b) — Confidentiality and Integrity

3.1. How to Implement Automatic Logoff

Article 32(1)(b) requires the ability to ensure ongoing confidentiality, and Article 32(2) names "unauthorised disclosure of, or access to personal data" as a risk you must assess. A session that never expires, or expires only after 24 hours, is an access control gap. A user who logs in on a shared machine and walks away has left an open door.

Here is the incorrect approach — a 24-hour JWT session:

// Bad: 24-hour access token with no inactivity check
const token = jwt.sign(
  { userId: user.id, role: user.role },
  process.env.JWT_SECRET,
  { expiresIn: '24h' }  // Too long — violates Article 32 intent
);

The problem: if a user logs in on a shared computer and closes the laptop without logging out, the session remains valid for up to 24 hours. Anyone who opens that laptop can access personal data.

Here is the correct implementation — a 15-minute access token with a rolling refresh:

// Good: Short-lived access token with rolling refresh via HTTP-only cookie

// Access token — valid for 15 minutes of activity
const accessToken = jwt.sign(
  { userId: user.id, role: user.role, type: 'access' },
  process.env.JWT_ACCESS_SECRET,
  { expiresIn: '15m' }
);

// Refresh token — valid for 8 hours total session duration
const refreshToken = jwt.sign(
  { userId: user.id, type: 'refresh' },
  process.env.JWT_REFRESH_SECRET,
  { expiresIn: '8h' }
);

// Set refresh token as HTTP-only cookie — not accessible to JavaScript
res.cookie('refreshToken', refreshToken, {
  httpOnly: true,    // Prevents XSS access
  secure: true,      // HTTPS only
  sameSite: 'strict', // Prevents CSRF
  maxAge: 8 * 60 * 60 * 1000  // 8 hours in milliseconds
});

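// Session middleware that enforces both the inactivity and the absolute timeout
// Assumes express-session, with req.session.createdAt set at login
const MAX_TOTAL_SESSION_MS = 8 * 60 * 60 * 1000; // 8 hours
const MAX_IDLE_MS = 15 * 60 * 1000;              // 15 minutes

app.use((req, res, next) => {
  if (!req.session?.createdAt) return next();

  const now = Date.now();
  const sessionAge = now - req.session.createdAt;
  const idleTime = now - (req.session.lastActivityAt ?? req.session.createdAt);

  if (sessionAge > MAX_TOTAL_SESSION_MS || idleTime > MAX_IDLE_MS) {
    // destroy() is asynchronous, so respond from its callback
    return req.session.destroy(() => {
      res.status(401).json({ error: 'Session expired. Please log in again.' });
    });
  }

  req.session.lastActivityAt = now; // Record activity for the next idle check
  next();
});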

The auditor question:

"Show me that your application terminates inactive sessions after a reasonable period."

Your evidence: A browser developer tools screenshot showing the cookie expiration time, plus a test recording showing that after 15 minutes of inactivity the user is presented with a re-authentication prompt.
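
The two limits are easy to confuse, so it can help to write the decision as a pure function and unit-test it. The sketch below uses the 15-minute and 8-hour values from this section; it is written in Python for brevity, and the function and state names are illustrative rather than part of any framework.

```python
from datetime import datetime, timedelta, timezone

IDLE_TIMEOUT = timedelta(minutes=15)   # inactivity limit
ABSOLUTE_TIMEOUT = timedelta(hours=8)  # total session lifetime


def session_state(created_at: datetime, last_activity_at: datetime, now: datetime) -> str:
    """Classify a session as active, idle-expired, or absolutely expired."""
    if now - created_at > ABSOLUTE_TIMEOUT:
        return "expired_absolute"  # too old, regardless of activity
    if now - last_activity_at > IDLE_TIMEOUT:
        return "expired_idle"      # no request within the idle window
    return "active"


if __name__ == "__main__":
    login = datetime(2025, 4, 1, 9, 0, tzinfo=timezone.utc)
    last_seen = login + timedelta(minutes=5)
    print(session_state(login, last_seen, login + timedelta(minutes=10)))  # within both limits
    print(session_state(login, last_seen, login + timedelta(minutes=30)))  # idle for 25 minutes
```

A test suite built on a function like this doubles as audit evidence: it shows the timeout rules are specified and checked, not just configured.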

3.2. How to Implement Unique User Identification with IRSA

Demonstrating confidentiality under Article 32(1)(b) means being able to show who accessed personal data. Shared service accounts make this impossible: the audit log shows data-export-service, but you cannot tell which engineer triggered the export.

Here is the incorrect approach — a shared service account:

# Bad: One shared Kubernetes service account used by multiple engineers and pipelines
apiVersion: v1
kind: ServiceAccount
metadata:
  name: data-export           # Three engineers and two pipelines share this identity
  namespace: production

When an audit log shows data-export performed a bulk user export at 03:17 UTC, you cannot answer the auditor's question: "who authorised this?"

Here is the correct implementation — IAM Roles for Service Accounts (IRSA):

# Step 1: Create a separate IAM role for each service identity
# This command creates a role that can only be assumed by the 'payment-service'
# Kubernetes service account in the 'production' namespace

aws iam create-role \
  --role-name eks-payment-service-role \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/YOUR_OIDC_ID"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-east-1.amazonaws.com/id/YOUR_OIDC_ID:sub":
            "system:serviceaccount:production:payment-service"
        }
      }
    }]
  }'

# Step 2: Annotate the Kubernetes service account with its unique IAM role
apiVersion: v1
kind: ServiceAccount
metadata:
  name: payment-service          # One service account, one service, one role
  namespace: production
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/eks-payment-service-role

Every AWS API call from payment-service now appears in CloudTrail as eks-payment-service-role — a unique, traceable identity. No shared accounts. No ambiguous audit logs.

The auditor question:

"How do you ensure that every action on personal data can be attributed to a specific individual or service?"

Your evidence:

# Verify no shared service accounts exist — every account should have a unique role annotation
kubectl get serviceaccounts --all-namespaces \
  -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}: {.metadata.annotations.eks\.amazonaws\.com/role-arn}{"\n"}{end}'
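
The jsonpath output still has to be read by a person. As a rough sketch, you could instead feed the output of `kubectl get serviceaccounts --all-namespaces -o json` to a script that flags any role ARN used by more than one service account. The function name and the sample account names and ARNs below are hypothetical.

```python
import json
from collections import defaultdict
from typing import Dict, List, Tuple

ROLE_ANNOTATION = "eks.amazonaws.com/role-arn"


def audit_service_accounts(kubectl_json: str) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return (roles shared by several accounts, accounts with no role annotation)."""
    by_role: Dict[str, List[str]] = defaultdict(list)
    unannotated: List[str] = []

    for item in json.loads(kubectl_json).get("items", []):
        meta = item["metadata"]
        name = f"{meta['namespace']}/{meta['name']}"
        role = (meta.get("annotations") or {}).get(ROLE_ANNOTATION)
        if role:
            by_role[role].append(name)
        else:
            unannotated.append(name)

    shared = {role: names for role, names in by_role.items() if len(names) > 1}
    return shared, unannotated


if __name__ == "__main__":
    sample = json.dumps({"items": [
        {"metadata": {"namespace": "production", "name": "payment-service",
                      "annotations": {ROLE_ANNOTATION: "arn:aws:iam::123456789012:role/eks-payment-service-role"}}},
        {"metadata": {"namespace": "default", "name": "default"}},
    ]})
    print(audit_service_accounts(sample))
```

Expect the built-in `default` account in each namespace to show up as unannotated; that is fine as long as no workload that touches personal data runs under it.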

Part 4: Article 32(1)(c) — Availability and Resilience

4.1. How to Implement Multi-AZ and Backup Requirements

Article 32(1)(c) requires "the ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident." This is not a suggestion; it is a legal requirement. The regulation is risk-based and does not name Multi-AZ, but if your database sits in a single Availability Zone and that AZ has a networking event, you will struggle to show that you met it.

Here is the incorrect approach — single-AZ RDS with no automated backups:

# Bad: Single-AZ RDS — one networking event makes personal data unavailable
resource "aws_db_instance" "production" {
  identifier              = "production-database"
  multi_az                = false   # No automatic failover
  backup_retention_period = 0       # No automated backups — Article 32 violation
}

If the Availability Zone has a networking issue, the database is unreachable. If the instance is corrupted, there are no backups to restore. Both scenarios violate Article 32(1)(c).

Here is the correct implementation — Multi-AZ with tested automated backups:

# Good: Multi-AZ RDS with 30-day backup retention
resource "aws_db_instance" "production" {
  identifier = "production-database"

  # Multi-AZ creates a synchronous standby replica in a different AZ
  # Automatic failover completes in 60-120 seconds with no data loss
  multi_az = true

  # 30-day backup retention — gives you recovery point flexibility
  backup_retention_period = 30
  backup_window           = "03:00-04:00"  # Low-traffic window for backup

  # Copy all tags to snapshots for compliance tracking
  copy_tags_to_snapshot = true

  # Performance Insights for monitoring query health
  performance_insights_enabled          = true
  performance_insights_retention_period = 7

  tags = {
    Environment       = "production"
    DataClassification = "personal-data"
    GDPRScope         = "article32"
  }
}

How to test your RTO and RPO monthly:

# Step 1: Find your most recent automated snapshot
SNAPSHOT_ID=$(aws rds describe-db-snapshots \
  --db-instance-identifier production-database \
  --snapshot-type automated \
  --query 'sort_by(DBSnapshots, &SnapshotCreateTime)[-1].DBSnapshotIdentifier' \
  --output text)

echo "Testing restore of snapshot: $SNAPSHOT_ID"

# Step 2: Start the restore — measure the time
START_TIME=$(date +%s)

aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier gdpr-restore-test \
  --db-snapshot-identifier $SNAPSHOT_ID \
  --db-instance-class db.t3.medium \
  --no-publicly-accessible \
  --tags Key=Purpose,Value=gdpr-rto-test Key=DeleteAfter,Value=$(date -d '+1 day' +%Y-%m-%d)

# Step 3: Wait for restore to complete
aws rds wait db-instance-available \
  --db-instance-identifier gdpr-restore-test

END_TIME=$(date +%s)
RTO_SECONDS=$((END_TIME - START_TIME))
echo "Restore completed in $((RTO_SECONDS / 60)) minutes"

# Step 4: Verify data integrity with a spot check
# Connect to the restored instance and verify record counts match production
# psql -h RESTORED_ENDPOINT -U admin -d production \
#   -c "SELECT COUNT(*) FROM users; SELECT MAX(created_at) FROM orders;"

# Step 5: Delete the test instance
aws rds delete-db-instance \
  --db-instance-identifier gdpr-restore-test \
  --skip-final-snapshot

The auditor question:

"What is your Recovery Time Objective and Recovery Point Objective for personal data? When did you last test it?"

Your evidence: A documented monthly DR test log showing: snapshot used, restore start time, restore completion time, data verification query results, and the engineer who conducted the test.
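
One way to keep that log consistent from month to month is to generate each entry from the timestamps the restore script already captures. The following is a minimal sketch; the class, field names, and targets are hypothetical. It measures RPO as the age of the snapshot when the restore began, which is the worst case for a snapshot-only restore. Point-in-time recovery, where enabled, can do better.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Union


@dataclass
class RestoreTest:
    snapshot_id: str
    snapshot_created_at: datetime
    restore_started_at: datetime
    restore_completed_at: datetime
    engineer: str


def evaluate(test: RestoreTest, rto_target: timedelta,
             rpo_target: timedelta) -> Dict[str, Union[str, float, bool]]:
    """Build one DR log entry with measured RTO and RPO against their targets."""
    measured_rto = test.restore_completed_at - test.restore_started_at
    # Data written after the snapshot would be lost in a snapshot-only restore
    measured_rpo = test.restore_started_at - test.snapshot_created_at
    return {
        "snapshot_id": test.snapshot_id,
        "engineer": test.engineer,
        "rto_minutes": round(measured_rto.total_seconds() / 60, 1),
        "rpo_minutes": round(measured_rpo.total_seconds() / 60, 1),
        "rto_met": measured_rto <= rto_target,
        "rpo_met": measured_rpo <= rpo_target,
    }


if __name__ == "__main__":
    day = datetime(2025, 4, 1, tzinfo=timezone.utc)
    entry = evaluate(
        RestoreTest("rds:production-database-example", day.replace(hour=3),
                    day.replace(hour=9), day.replace(hour=9, minute=42), "example-engineer"),
        rto_target=timedelta(hours=1),
        rpo_target=timedelta(hours=24),
    )
    print(entry)
```

Appending each entry to a dated file or ticket gives you the monthly trail the auditor asks for, including the months where a target was missed.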

Part 5: Article 32(1)(d) — Regular Testing

5.1. How to Implement Automated Vulnerability Scanning

Article 32(1)(d) requires "a process for regularly testing, assessing and evaluating the effectiveness of technical and organisational measures." This includes automated vulnerability scanning of every container image before it reaches production.

Here is the incorrect approach — no scanning in the deployment pipeline:

# Bad: No vulnerability scanning — a critical CVE in the base image deploys undetected
name: Deploy
on: [push]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: docker build -t myapp .
      - run: docker push myapp  # Deploys without any security check

If a critical CVE is present in the base image (such as a remote code execution vulnerability in OpenSSL), it goes straight to production. Under Article 32(1)(d), this is a finding.

Here is the correct implementation — Trivy scanning with pipeline enforcement:

# Good: Trivy scans every image — CRITICAL/HIGH CVEs block the deployment
name: Security Scan and Deploy
on: [push, pull_request]

jobs:
  trivy-scan:
    name: Container Vulnerability Scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Build container image
        run: docker build -t myapp:${{ github.sha }} .

      - name: Scan for vulnerabilities with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: 'myapp:${{ github.sha }}'
          format: 'sarif'
          output: 'trivy-results.sarif'
          severity: 'CRITICAL,HIGH'
          exit-code: '1'         # Fail the pipeline — image cannot deploy with CRITICAL/HIGH CVEs

      - name: Upload scan results to GitHub Security tab
        uses: github/codeql-action/upload-sarif@v3   # v2 is deprecated; the job also needs the security-events: write permission
        if: always()             # Upload results even if scan failed, for review
        with:
          sarif_file: 'trivy-results.sarif'

Trivy scans for:

CVEs in the base image OS packages (for example, a critical OpenSSL vulnerability in your Ubuntu base)
Vulnerable versions of application dependencies (a known exploit in an npm or pip package your application uses)
Misconfigurations in the Dockerfile (running as root, using latest tag instead of a pinned SHA), if you add a separate step with scan-type: 'config'; the image scan above does not cover these

Results appear in the GitHub Security tab, creating a timestamped, searchable history of every scan. That history is your Article 32(1)(d) evidence.
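
If you also want a plain count for a weekly summary, Trivy can emit JSON (`trivy image --format json`) alongside the SARIF file. The sketch below tallies the blocking severities from such a report. It assumes the report's top-level `Results` list with a `Vulnerabilities` list per target; the function name and the sample data, including the placeholder CVE IDs, are hypothetical.

```python
import json
from collections import Counter

BLOCKING = {"CRITICAL", "HIGH"}  # same threshold as the pipeline above


def blocking_counts(trivy_json: str) -> Counter:
    """Count CRITICAL and HIGH vulnerabilities across all scanned targets."""
    report = json.loads(trivy_json)
    counts: Counter = Counter()
    # Both keys can be missing or null when a target has no findings
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in BLOCKING:
                counts[vuln["Severity"]] += 1
    return counts


if __name__ == "__main__":
    sample = json.dumps({"Results": [
        {"Target": "myapp (ubuntu)", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-0000-0001", "Severity": "CRITICAL"},
            {"VulnerabilityID": "CVE-0000-0002", "Severity": "LOW"},
        ]},
        {"Target": "package-lock.json", "Vulnerabilities": None},
    ]})
    counts = blocking_counts(sample)
    print(dict(counts))
    raise SystemExit(1 if counts else 0)  # non-zero exit blocks a pipeline step
```

The pipeline's `exit-code: '1'` already enforces the gate; a script like this only adds a number you can chart over time.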

How to run a weekly AWS Inspector assessment for running workloads:

# List all active CRITICAL findings across your AWS account
aws inspector2 list-findings \
  --filter-criteria '{
    "severity": [{"comparison": "EQUALS", "value": "CRITICAL"}],
    "findingStatus": [{"comparison": "EQUALS", "value": "ACTIVE"}]
  }' \
  --query 'findings[*].{
    Title:title,
    Resource:resources[0].id,
    Severity:severity,
    CVE:packageVulnerabilityDetails.vulnerabilityId
  }' \
  --output table

The auditor question:

"Show me your vulnerability management programme, including how you prioritise and remediate findings."

Your evidence: A weekly vulnerability report — generated automatically from the above command — showing active findings, severity, the GitHub issue created for each finding, and the closure date once remediated.
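
As one possible shape for that report, the sketch below groups findings by resource so each line maps to a single remediation ticket. It assumes the `list-findings` command was run with `--output json` and without the `--query` projection, so the original field names (`resources`, `packageVulnerabilityDetails`) are present. The function name, resource IDs, and CVE IDs are placeholders.

```python
import json
from collections import defaultdict
from typing import Dict, List


def weekly_report(findings_json: str) -> List[str]:
    """Return one report line per resource, listing its active CVE IDs."""
    by_resource: Dict[str, List[str]] = defaultdict(list)

    for finding in json.loads(findings_json).get("findings", []):
        resources = finding.get("resources") or []
        resource_id = resources[0]["id"] if resources else "unknown"
        details = finding.get("packageVulnerabilityDetails") or {}
        by_resource[resource_id].append(details.get("vulnerabilityId", "n/a"))

    return [
        f"{resource_id}: {', '.join(sorted(cves))}"
        for resource_id, cves in sorted(by_resource.items())
    ]


if __name__ == "__main__":
    sample = json.dumps({"findings": [
        {"severity": "CRITICAL", "resources": [{"id": "i-0bbb"}],
         "packageVulnerabilityDetails": {"vulnerabilityId": "CVE-0000-0003"}},
        {"severity": "CRITICAL", "resources": [{"id": "i-0aaa"}],
         "packageVulnerabilityDetails": {"vulnerabilityId": "CVE-0000-0001"}},
    ]})
    print("\n".join(weekly_report(sample)))
```

Linking each line to its GitHub issue and closure date is left to whatever tracker you already use.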

Part 6: Article 32(1)(d) — Penetration Testing

6.1. Why Automated Scanning Is Not Enough

Article 32(1)(d) requires evaluating the effectiveness of security measures. Automated vulnerability scanners find known CVEs in libraries and OS packages. They cannot find:

Business logic vulnerabilities (an API endpoint that returns another user's data when given a specific parameter)
Authentication bypasses (a JWT implementation that accepts unsigned tokens)
Privilege escalation paths (an attacker can move from a low-privilege role to admin through a sequence of legitimate API calls)
Insecure direct object references (accessing /api/users/124 instead of /api/users/123 returns data for a different customer)

The ICO (UK Information Commissioner's Office) and the CNIL (France's data protection authority) both state in their guidance that annual manual penetration testing is expected for organisations processing significant volumes of personal data.

What an acceptable pen test scope looks like:

# Annual Penetration Test Scope — Article 32 Compliance

## Testing Period
Start: 2025-04-01  
End: 2025-04-14  
Testing firm: [Accredited firm — CREST or CHECK certified]

## In Scope
- Production web application: https://app.yourcompany.com
- Production API: https://api.yourcompany.com/v1/*
- Authentication flows: OAuth2, JWT, session management
- Data stores: PostgreSQL (via application access only, not direct DB access)
- AWS account: External reconnaissance of public-facing services only

## Testing Types
- External infrastructure testing (all public IP ranges)
- Web application testing (OWASP Top 10 2021)
- API security testing (all authenticated and unauthenticated endpoints)
- Authentication and session management testing
- GDPR-specific test cases (data subject rights endpoints, consent flows)

## Remediation SLAs
- CRITICAL: 24 hours from report delivery
- HIGH: 7 calendar days
- MEDIUM: 30 calendar days
- LOW: 90 calendar days

How to track and evidence remediation:

# Create GitHub issues for each finding on receipt of the pen test report
# This creates a traceable record of every finding and its resolution

# Reads one finding ID per line; "security-lead" is a placeholder GitHub login
while IFS= read -r finding_id; do
  gh issue create \
    --title "Pen test finding: $finding_id" \
    --body "See pentest-report-2025-04.pdf, section $finding_id. Record the severity and SLA from the report." \
    --label "security,pentest" \
    --assignee "security-lead"
done < pentest-report-findings.txt

The auditor question:

"When was your last penetration test? Show me the report and your remediation evidence."

Your evidence:

The penetration test report from a CREST or CHECK certified firm, dated within the last 12 months
A remediation tracker (GitHub issues or Jira) showing every CRITICAL and HIGH finding with a closure date
Evidence that all CRITICAL findings were closed within 24 hours (the git commit or deployment log)
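
To check closure dates against the remediation SLAs in the scope document above, a small helper is enough. This is a sketch that assumes the clock starts at report delivery, as the scope states; the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# SLAs copied from the pen test scope in this section
REMEDIATION_SLA = {
    "CRITICAL": timedelta(hours=24),
    "HIGH": timedelta(days=7),
    "MEDIUM": timedelta(days=30),
    "LOW": timedelta(days=90),
}


def due_date(severity: str, report_delivered_at: datetime) -> datetime:
    """Return the remediation deadline for a finding of the given severity."""
    return report_delivered_at + REMEDIATION_SLA[severity.upper()]


def closed_within_sla(severity: str, report_delivered_at: datetime, closed_at: datetime) -> bool:
    """Return True when the finding was closed on or before its deadline."""
    return closed_at <= due_date(severity, report_delivered_at)


if __name__ == "__main__":
    delivered = datetime(2025, 4, 15, 10, 0, tzinfo=timezone.utc)
    print(due_date("critical", delivered).isoformat())
    print(closed_within_sla("high", delivered, delivered + timedelta(days=9)))
```

Running this over the closure dates in your tracker produces the per-finding pass/fail column an auditor will look for first.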

Best Practices Summary

Here are the key takeaways from this guide:

✅ Do: Implement application-layer encryption for sensitive fields. Storage encryption alone is not enough — a DBA with direct database access can still read plaintext.

✅ Do: Use customer-managed KMS keys with automatic rotation. You need to prove control over the key material.

✅ Do: Store pseudonymised data separately from identifiers, with restricted role-based access to the lookup table.

✅ Do: Enforce automatic logoff after 15 minutes of inactivity with an 8-hour absolute session limit.

✅ Do: Use unique service accounts with IRSA. Every action on personal data must be attributable to a specific identity.

✅ Do: Test your backups monthly. Document RTO and RPO with actual restore test results.

✅ Do: Run Trivy in CI to block CRITICAL and HIGH CVEs before deployment.

✅ Do: Conduct an annual manual penetration test from a CREST or CHECK certified firm.

❌ Don't: Use 24-hour JWT sessions or sessions with no inactivity timeout.

❌ Don't: Hardcode secrets in source code or commit .env files. Inject them at runtime from a secrets manager such as Vault or AWS Secrets Manager.

❌ Don't: Skip the annual penetration test. An auditor from the ICO or CNIL will not accept "we run automated scans" as a substitute.

❌ Don't: Use AWS-managed KMS keys if you need to prove key material control to your auditor.

Resources

ICO Guide to GDPR Article 32 — The UK Information Commissioner's Office official guidance on Article 32 security obligations
ENISA Guidelines on Article 32 — The EU Agency for Cybersecurity's SME guidelines on personal data security
Trivy by Aqua Security — Open-source container vulnerability scanner used in Part 5
OWASP Top 10 2021 — The standard reference for web application security risks, used in pen test scoping
AWS KMS Key Rotation Documentation — Official AWS documentation for automatic key rotation
PostgreSQL Row Security Policies — How to implement row-level security for granular access control on pseudonymised data
EKS IAM Roles for Service Accounts (IRSA) — Official AWS documentation for unique service account identity on EKS
CREST Certified Testing Firms — Directory of CREST-certified penetration testing firms for your annual Article 32 assessment

Ayobami Adejumo is a senior platform engineer and compliance infrastructure specialist. He writes about GDPR engineering controls, SOC 2 implementation, and FinOps (cloud cost optimization).

The Complete SOC 2 Type II Implementation Handbook for Engineers: A Month-by-Month Roadmap with Real Commands

Ayobami Adejumo — Tue, 05 May 2026 18:26:21 +0000

If your team is preparing for a SOC 2 Type II review, this handbook is for you. It's a self-contained guide to the exact 90-day timeline, 14 critical controls, and evidence collection infrastructure that auditors actually check.

Everyone publishes the controls list. But nobody publishes the week-by-week engineering calendar you'll need to follow to make sure your ducks are in a row.

Here is the exact 90-day timeline — including the mistakes that add 60 days (and how to avoid them).

What You'll Learn
Prerequisites
Weeks 1–2: The Scope Decision
Weeks 3–6: The 14 Controls That Must Be Active on Day 1
Weeks 7–10: The Evidence Collection Infrastructure
Weeks 11–14: Auditor Selection and Readiness Assessment
Weeks 15–18: The Observation Period
The 90-Day SOC2 Timeline at a Glance
What's Next
Resources

What You'll Learn

By the end of this guide, you'll know:

How to scope your SOC2 boundary correctly — the decision that determines everything else
The 14 controls that must be active on day 1 of your observation period
How to build evidence collection infrastructure that runs automatically
How to choose an auditor and run a readiness assessment
What happens during the observation period and how to close gaps without restarting the clock

Let's dive in.

Prerequisites

Before following along, you should have:

Knowledge:

Basic understanding of AWS services (EC2, RDS, S3, IAM, VPC)
Familiarity with Terraform or another infrastructure as code tool
Comfort reading GitHub Actions YAML workflows
A general understanding of what SOC2 is — if you are starting from scratch, read the AICPA's SOC2 overview first

Tools and access:

An AWS account with administrator access
A GitHub organisation with admin rights
Terraform installed (v1.0 or later)
Python 3.8 or later (for the evidence collector Lambda)
A compliance automation platform — Vanta or Drata — connected to your AWS account and GitHub organisation

Estimated time: 90 days end-to-end, with active engineering work of approximately 8–12 hours per week in the first six weeks, tapering to 2–4 hours per week during the observation period.

Weeks 1–2: The Scope Decision — What Is In and Out of Your SOC2 Boundary

What Most Teams Get Wrong

Most teams scope their SOC2 boundary too broadly. They include every AWS account, every service, every environment. This is a mistake — and here is exactly why.

A broader scope means more controls to implement, more evidence to collect, and more systems the auditor will examine.

Every system inside your boundary must satisfy all 14 controls. Including your development sandbox means your engineers' experimental environments must have GuardDuty enabled, CloudTrail logging, and branch-protected deployments. That adds weeks of work and months of evidence collection for systems that pose no risk to your customers.

A correctly bounded scope means you include only the systems that store, process, or transmit customer data — and you prove that everything else cannot reach those systems.

Bad scope (over-inclusive):

Entire AWS Organization
├── Production (in scope)
├── Staging (in scope)
├── Development (in scope)
├── Sandbox (in scope)
└── CI/CD (in scope)

Good scope (correctly bounded):

SOC2 Boundary
├── Production AWS Account (in scope)
├── Production EKS Cluster (in scope)
├── Production RDS (in scope)
└── Everything else (OUT of scope — proven by network segmentation)

The correctly bounded scope works because it draws the tightest defensible line around the systems that actually handle customer data. Everything outside that line is excluded — not by assumption, but by technical controls that prevent those systems from reaching anything inside the boundary.

The Scope Decision Framework

For every system in your infrastructure, ask these four questions:

Question	If YES	If NO
Does this system store, process, or transmit customer data?	✅ In scope	❌ Out of scope
Does this system affect the availability of customer-facing services?	✅ In scope	❌ Out of scope
Does this system have access to production credentials?	✅ In scope	❌ Out of scope
Can a compromise of this system lead to a customer data breach?	✅ In scope	❌ Out of scope

Any system where the answer to even one question is yes belongs inside your boundary.
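The four questions reduce to a single rule: one "yes" puts the system inside the boundary. The Python sketch below is one way to run that rule over an inventory. The `System` fields and the example systems are hypothetical names chosen for illustration, not part of any compliance tool.

```python
from dataclasses import dataclass


@dataclass
class System:
    """One row of a hypothetical infrastructure inventory."""
    name: str
    handles_customer_data: bool           # stores, processes, or transmits it
    affects_customer_availability: bool   # customer-facing uptime depends on it
    has_production_credentials: bool
    compromise_can_breach_customer_data: bool


def in_scope(system: System) -> bool:
    """A single 'yes' to any of the four questions puts the system in scope."""
    return any([
        system.handles_customer_data,
        system.affects_customer_availability,
        system.has_production_credentials,
        system.compromise_can_breach_customer_data,
    ])


inventory = [
    System("production-rds", True, True, True, True),
    System("ci-runner-with-deploy-keys", False, False, True, True),
    System("dev-sandbox", False, False, False, False),
]

for system in inventory:
    print(f"{system.name}: {'IN scope' if in_scope(system) else 'out of scope'}")
```

Note that a CI/CD runner holding production credentials lands in scope under this rule, so keeping CI/CD outside the boundary only works if it holds no production credentials.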

Network Segmentation — The Technical Proof That Your Boundary Holds

Network segmentation is the practice of dividing your infrastructure into isolated zones so that systems in one zone can't communicate with systems in another unless you explicitly allow it.

In the context of SOC2, it's the technical control that proves your out-of-scope systems genuinely can't reach your in-scope systems — not just by policy, but by infrastructure enforcement.

Without network segmentation, the SOC2 auditor can't trust that your boundary is real. A developer in your sandbox environment who can query your production database means the sandbox is effectively in scope, regardless of what your diagram says.

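Here's the Terraform that implements network segmentation between your production and non-production environments. The network access control list (NACL) denies all inbound traffic from the private 10.0.0.0/8 range into your in-scope production VPC, and the comment at the top documents the deliberate decision not to create any aws_vpc_peering_connection between environments. This assumes your production VPC's own CIDR sits outside 10.0.0.0/8; if it doesn't, narrow the deny rule to your non-production ranges, otherwise it also blocks traffic between production subnets: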

# This account has NO VPC peering to non-production environments.
# The absence of peering is itself the segmentation control.
# Do NOT add peering connections to this account without SOC2 scope review.

resource "aws_network_acl" "deny_non_production" {
  vpc_id = aws_vpc.production.id

  # Block all inbound traffic from non-production IP ranges
  ingress {
    rule_no    = 100
    action     = "deny"
    from_port  = 0
    to_port    = 0
    protocol   = "-1"
    cidr_block = "10.0.0.0/8"
  }

  # Allow legitimate inbound traffic (HTTPS from internet)
  ingress {
    rule_no    = 200
    action     = "allow"
    from_port  = 443
    to_port    = 443
    protocol   = "tcp"
    cidr_block = "0.0.0.0/0"
  }

  # Allow all outbound (tighten this per your architecture)
  egress {
    rule_no    = 100
    action     = "allow"
    from_port  = 0
    to_port    = 0
    protocol   = "-1"
    cidr_block = "0.0.0.0/0"
  }

  tags = {
    Name        = "production-nacl"
    Environment = "production"
    Purpose     = "SOC2 network segmentation"
  }
}

Verify the segmentation with this command after applying the Terraform:

# Confirm no VPC peering connections exist from production to non-production
aws ec2 describe-vpc-peering-connections \
  --filters Name=status-code,Values=active \
  --query 'VpcPeeringConnections[*].{ID:VpcPeeringConnectionId,Requester:RequesterVpcInfo.VpcId,Accepter:AccepterVpcInfo.VpcId}' \
  --output table

The Deliverable: Your SOC2 Boundary Diagram

At the end of weeks 1–2, you need a boundary diagram — a visual document that shows every in-scope system, every out-of-scope system, and the segmentation controls between them.

The diagram should include every AWS service, every data flow arrow, and a label on each segmentation control. It becomes your primary scope evidence and is typically the first thing an auditor asks for.

Weeks 3–6: The 14 Controls That Must Be Active on Day 1

These 14 controls must be implemented and actively collecting evidence from day 1 of your observation period. If you add any of them late, the observation period clock for that control restarts from the implementation date — not from day 1 of the audit period.

Think of the observation period as a surveillance camera recording your infrastructure. The auditor watches the footage later. If the camera was not on when a specific event occurred, that event has no record — and the SOC2 control for it has a gap.
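To make the clock rule concrete, here is a small Python sketch that computes how many days of the observation period a control actually covers. The function name and the example dates are illustrative assumptions, not values from any audit tool.

```python
from datetime import date


def covered_days(period_start: date, period_end: date, control_active_from: date) -> int:
    """Days of the observation period (inclusive) during which the control was active."""
    effective_start = max(period_start, control_active_from)
    if effective_start > period_end:
        return 0  # the control was never active inside the period
    return (period_end - effective_start).days + 1


# Hypothetical observation period and a control that was enabled late
start, end = date(2026, 1, 1), date(2026, 3, 31)
on_time = covered_days(start, end, date(2026, 1, 1))
late = covered_days(start, end, date(2026, 1, 22))
print(f"on time: {on_time} days, enabled late: {late} days, gap: {on_time - late} days")
```

Any gap greater than zero is a stretch of the period for which that control has no evidence.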

Control 1: MFA Enforcement (CC6.6)

Multi-Factor Authentication (MFA) requires a user to verify their identity using two independent factors — something they know (a password) and something they have (a phone or hardware key). Without MFA, a stolen password is sufficient to access your production systems.

SOC2 CC6.6 requires that access to systems is restricted to authorized users. MFA is the technical control that makes "authorized" meaningful. Without it, any password compromise is a production access event.

To implement MFA, you can use AWS IAM Identity Center (formerly SSO) connected to your identity provider (Okta, Google Workspace, or Azure AD). MFA is then enforced at the identity provider level — any user without MFA enrolled can't authenticate, regardless of which AWS service they're trying to reach.

# IAM Identity Center configuration — MFA is enforced at the IdP level.
# No IAM user has direct console or CLI access.
# All access goes through SSO sessions (8-hour expiry by default).

resource "aws_ssoadmin_instance_access_control_attributes" "mfa" {
  instance_arn = tolist(data.aws_ssoadmin_instances.this.arns)[0]

  attribute {
    key = "email"
    value {
      source = ["$${path:email}"]
    }
  }
}

You can verify that no IAM users retain direct console access (which would bypass MFA):

# Any user listed here has direct console access bypassing SSO — investigate immediately
aws iam list-users \
  --query 'Users[?PasswordLastUsed!=`null`].[UserName,PasswordLastUsed]' \
  --output table
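One caveat: PasswordLastUsed only reveals users who have actually signed in, so a console password that was created but never used will not show up. The IAM credential report has a password_enabled column that covers that case. The sketch below parses a report that has already been downloaded and base64-decoded; the function name and the sample rows are made up for illustration.

```python
import csv
import io
from typing import List


def users_with_console_password(report_csv: str) -> List[str]:
    """Return IAM user names whose credential report row has password_enabled == 'true'."""
    reader = csv.DictReader(io.StringIO(report_csv))
    return [row["user"] for row in reader if row["password_enabled"] == "true"]


# Hypothetical, heavily trimmed report (the real report has many more columns)
sample_report = (
    "user,arn,user_creation_time,password_enabled,password_last_used\n"
    "<root_account>,arn:aws:iam::123456789012:root,2024-01-01T00:00:00+00:00,not_supported,no_information\n"
    "legacy-admin,arn:aws:iam::123456789012:user/legacy-admin,2024-02-01T00:00:00+00:00,true,no_information\n"
    "ci-deployer,arn:aws:iam::123456789012:user/ci-deployer,2024-03-01T00:00:00+00:00,false,N/A\n"
)

print(users_with_console_password(sample_report))
```

An empty list is the result you want to capture as evidence.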

Control 2: Infrastructure as Code (CC8.1)

Infrastructure as Code (IaC) means defining your cloud infrastructure in version-controlled code files (Terraform, Pulumi, or AWS CDK) rather than creating resources manually through the AWS console. Every infrastructure change is proposed in a pull request, reviewed by a colleague, and applied through an automated pipeline.

SOC2 CC8.1 covers change management — the requirement that every change to your production environment is documented, reviewed, and approved. Manual console changes produce no audit trail. If an engineer opens the AWS console and creates a security group without going through Terraform, that change is invisible to your SOC2 auditor. IaC makes every change reviewable and traceable.

Now let's see how to implement IaC here. This GitHub Actions workflow applies Terraform only from the main branch, after a pull request has been reviewed and approved. The workflow creates an immutable record of every infrastructure change:

# .github/workflows/terraform-apply.yml
name: Terraform Apply (Production)
on:
  push:
    branches: [main]
    paths: ['terraform/**']

permissions:
  id-token: write   # Required for AWS OIDC authentication
  contents: read

jobs:
  apply:
    name: Apply Infrastructure Changes
    runs-on: ubuntu-latest
    environment: production  # Requires manual approval for production

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Configure AWS credentials (OIDC — no long-lived keys)
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/terraform-apply
          aws-region: us-east-1

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: "1.6.0"

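      - name: Terraform Plan
        working-directory: terraform   # Assumes the root module lives in terraform/, matching the paths filter above
        run: |
          terraform init
          terraform plan -out=tfplan -input=false

      - name: Terraform Apply
        working-directory: terraform
        run: terraform apply -input=false tfplan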

SOC2 evidence this produces: A GitHub Actions run log for every infrastructure change, showing who triggered it (the engineer who merged the pull request), who approved the production deployment, when it was applied, and what changed.

Control 3: CloudTrail Enabled (CC7.1)

AWS CloudTrail is a service that records every API call made in your AWS account — who called it, when, from which IP address, and whether it succeeded. Think of it as the complete audit log of everything that has ever happened in your AWS environment.

SOC2 CC7.1 requires monitoring for security events. CloudTrail is the foundational logging layer — without it, you can't detect unauthorized access, investigate incidents, or prove to an auditor that your controls were operating as intended. An auditor who can't see historical AWS API activity can't verify that your access controls were enforced during the observation period.

To implement it, enable a multi-region trail so that activity in every AWS region is captured, including global services like IAM. You can ship logs to an S3 bucket with Object Lock enabled (the evidence bucket section later in this handbook shows the Object Lock configuration) so logs can't be modified or deleted:

# Enable CloudTrail with log file validation and multi-region coverage
aws cloudtrail create-trail \
  --name production-audit-trail \
  --s3-bucket-name your-cloudtrail-logs-bucket \
  --is-multi-region-trail \
  --enable-log-file-validation \
  --include-global-service-events

# Start the trail (creation alone does not start logging)
aws cloudtrail start-logging --name production-audit-trail

# Verify the trail is active and logging
aws cloudtrail get-trail-status --name production-audit-trail \
  --query '{IsLogging:IsLogging,LatestDeliveryTime:LatestDeliveryTime}'

Control 4: GuardDuty Enabled (CC7.2)

AWS GuardDuty is a threat detection service that analyses your CloudTrail logs, VPC Flow Logs, and DNS logs. It uses machine learning to identify suspicious behaviour — things like an EC2 instance communicating with a known malware server, an IAM user logging in from an unusual country, or unusual API call patterns that indicate credential theft.

SOC2 CC7.2 requires the use of detection tools to identify potential security events. GuardDuty is the monitoring layer that tells you when something anomalous is happening, not just what happened after the fact. Without it, you would only discover a compromise when the damage is done.

Here's the implementation:

# Enable GuardDuty — findings published every 15 minutes for active threats
aws guardduty create-detector \
  --enable \
  --finding-publishing-frequency FIFTEEN_MINUTES

# Verify GuardDuty is active
aws guardduty list-detectors --query 'DetectorIds' --output table

You can set up an EventBridge rule to route CRITICAL and HIGH severity GuardDuty findings to your incident response channel immediately. A finding sitting unreviewed for 90 days is a qualified SOC2 finding.
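As a rough sketch of that routing rule, the Python below builds an EventBridge event pattern that matches GuardDuty findings at or above a severity threshold. GuardDuty reports severity as a number, and 7.0 is the lower bound of the High band, so the default also matches anything rated above High. The function name is an assumption; the printed JSON is what you would pass as the rule's event pattern.

```python
import json


def guardduty_severity_pattern(min_severity: float = 7.0) -> dict:
    """Event pattern matching GuardDuty findings with severity >= min_severity."""
    return {
        "source": ["aws.guardduty"],
        "detail-type": ["GuardDuty Finding"],
        "detail": {
            # EventBridge numeric matching: [operator, value]
            "severity": [{"numeric": [">=", min_severity]}],
        },
    }


pattern_json = json.dumps(guardduty_severity_pattern(), indent=2)
print(pattern_json)
```

The rule still needs a target (for example an SNS topic or a chat webhook Lambda) before anyone is notified.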

Control 5: VPC Flow Logs (CC6.1)

VPC Flow Logs capture information about the IP traffic flowing through your Virtual Private Cloud — every accepted and rejected connection, including source IP, destination IP, port, protocol, and whether the traffic was allowed or denied. They are the network-level audit trail that CloudTrail doesn't provide.

SOC2 CC6.1 requires logical access controls and monitoring. VPC Flow Logs let you verify that your network segmentation is actually working (traffic you denied is showing as rejected in the logs), detect unexpected communication between services, and investigate security events at the network layer.

# Create an IAM role for VPC Flow Logs to deliver to CloudWatch
aws iam create-role \
  --role-name vpc-flow-logs-role \
  --assume-role-policy-document '{
    "Version":"2012-10-17",
    "Statement":[{
      "Effect":"Allow",
      "Principal":{"Service":"vpc-flow-logs.amazonaws.com"},
      "Action":"sts:AssumeRole"
    }]
  }'

# Enable VPC Flow Logs for all traffic (ACCEPT and REJECT)
aws ec2 create-flow-logs \
  --resource-ids vpc-YOUR_PRODUCTION_VPC_ID \
  --resource-type VPC \
  --traffic-type ALL \
  --log-group-name /aws/vpc/flow-logs/production \
  --deliver-logs-permission-arn arn:aws:iam::YOUR_ACCOUNT_ID:role/vpc-flow-logs-role

# Verify flow logs are active
aws ec2 describe-flow-logs \
  --filter Name=resource-id,Values=vpc-YOUR_PRODUCTION_VPC_ID \
  --query 'FlowLogs[*].{Status:FlowLogStatus,LogGroup:LogGroupName}'
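Once flow logs are arriving, you can use them to check that segmentation is doing its job. The sketch below assumes the default (version 2) flow log record format and filters for REJECT records whose source address falls inside a given CIDR range. The function names and the sample records are illustrative, not output from a real account.

```python
import ipaddress
from typing import Dict, Iterable, List

# Field order of the default (version 2) VPC Flow Logs record format
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_record(line: str) -> Dict[str, str]:
    """Split one space-separated flow log line into named fields."""
    return dict(zip(FIELDS, line.split()))


def rejected_from(lines: Iterable[str], cidr: str) -> List[Dict[str, str]]:
    """Return REJECT records whose source address is inside `cidr`."""
    network = ipaddress.ip_network(cidr)
    matches = []
    for line in lines:
        record = parse_record(line)
        if record.get("action") != "REJECT":
            continue
        try:
            source = ipaddress.ip_address(record.get("srcaddr", "-"))
        except ValueError:
            continue  # NODATA / SKIPDATA records carry "-" instead of an address
        if source in network:
            matches.append(record)
    return matches


# Hypothetical records: one blocked attempt from a non-production range, one normal HTTPS flow
sample = [
    "2 123456789012 eni-0abc 10.1.2.3 172.31.0.5 49152 5432 6 1 40 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0abc 203.0.113.9 172.31.0.5 50000 443 6 10 800 1700000000 1700000060 ACCEPT OK",
]
for record in rejected_from(sample, "10.0.0.0/8"):
    print(record["srcaddr"], "->", record["dstaddr"], "port", record["dstport"], record["action"])
```

Rejected traffic from non-production ranges is what you expect to see; accepted traffic from those ranges would mean the boundary has a hole.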

Control 6: Secrets Manager (CC6.7)

Secrets management means storing credentials (database passwords, API keys, certificates, and other sensitive configuration values) in a dedicated, access-controlled service (like AWS Secrets Manager or HashiCorp Vault) rather than in .env files, GitHub repository secrets, or hardcoded in application code.

SOC2 CC6.7 requires protecting sensitive system components from unauthorized access. A secret stored in an .env file committed to a repository is accessible to every developer with repo access, every CI/CD runner, and every engineer who has ever cloned the repo — including those who have since left the company.

Secrets Manager provides centralised storage, access logging, automatic rotation, and fine-grained IAM permissions so only specific services can retrieve specific secrets.

Let's look at the implementation — storing and rotating a secret:

# Store a database credential with automatic 90-day rotation
aws secretsmanager create-secret \
  --name production/postgresql/credentials \
  --description "Production PostgreSQL credentials — rotated every 90 days" \
  --secret-string '{
    "username": "app_user",
    "password": "REPLACE_WITH_STRONG_PASSWORD",
    "host": "your-rds-endpoint.us-east-1.rds.amazonaws.com",
    "port": 5432,
    "dbname": "production"
  }'

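# Enable automatic rotation every 90 days.
# Rotation requires a rotation Lambda function; the ARN below is a placeholder for yours.
aws secretsmanager rotate-secret \
  --secret-id production/postgresql/credentials \
  --rotation-lambda-arn arn:aws:lambda:us-east-1:YOUR_ACCOUNT_ID:function:YOUR_ROTATION_FUNCTION \
  --rotation-rules AutomaticallyAfterDays=90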

How your application retrieves the secret at runtime (no hardcoded credentials):

# Good: secret retrieved at runtime from Secrets Manager
import boto3
import json

def get_db_credentials():
    client = boto3.client('secretsmanager', region_name='us-east-1')
    response = client.get_secret_value(SecretId='production/postgresql/credentials')
    return json.loads(response['SecretString'])

# Bad: secret hardcoded in application code or .env file
DB_PASSWORD = "my_database_password_123"  # Never do this

The access log in CloudTrail records every time a secret is retrieved, by which IAM role, at what time. That log is your SOC2 evidence that secrets access is controlled and auditable.

Control 7: EBS Encryption (CC6.1)

EBS (Elastic Block Store) encryption ensures that the persistent disks attached to your EC2 instances are encrypted at rest using AES-256. If an AWS employee or an attacker gained physical access to the storage hardware, the data would be unreadable without the encryption key.

SOC2 CC6.1 requires protecting information assets from unauthorised access. Encryption at rest is the control that protects data in the event of physical storage compromise or an improperly decommissioned disk. Enabling encryption by default means every new EBS volume in that region is encrypted automatically, including EKS node volumes and EC2 instance root volumes. The setting is per region, so repeat it in every region you use. RDS storage encryption is configured separately on each database instance when it is created, so verify it on its own.

# Enable EBS encryption by default for all new volumes in this region
aws ec2 enable-ebs-encryption-by-default

# Verify it is enabled
aws ec2 get-ebs-encryption-by-default \
  --query 'EbsEncryptionByDefault'
# Expected output: true

# Check existing volumes — any showing false need to be migrated
aws ec2 describe-volumes \
  --query 'Volumes[?Encrypted==`false`].[VolumeId,Size,VolumeType]' \
  --output table

Any existing unencrypted volumes must be snapshot-and-replaced. The process: create a snapshot of the unencrypted volume, create a new encrypted volume from the snapshot, and swap it into the instance.

Control 8: S3 Block Public Access (CC6.1)

Amazon S3 buckets can be configured to allow public access — meaning anyone on the internet can read their contents without authentication. Block Public Access is an account-level and bucket-level setting that prevents any bucket from being made public, regardless of the bucket's own policy.

A misconfigured S3 bucket is one of the most common causes of data breaches in cloud environments. Block Public Access at the account level means a developer can't accidentally expose a bucket containing customer data, even if they set the wrong bucket policy. It's a guardrail, not just a policy.

# Block public access at the AWS account level — applies to all buckets
aws s3control put-public-access-block \
  --account-id YOUR_ACCOUNT_ID \
  --public-access-block-configuration \
    BlockPublicAcls=true,\
    IgnorePublicAcls=true,\
    BlockPublicPolicy=true,\
    RestrictPublicBuckets=true

# Verify account-level setting is active
aws s3control get-public-access-block \
  --account-id YOUR_ACCOUNT_ID

# Scan for any buckets that have public access enabled (should be zero)
aws s3api list-buckets --query 'Buckets[*].Name' --output text | \
  tr '\t' '\n' | while read bucket; do
    result=$(aws s3api get-public-access-block --bucket "$bucket" 2>/dev/null)
    if echo "$result" | grep -q '"BlockPublicAcls": false'; then
      echo "WARNING: $bucket has public access not fully blocked"
    fi
  done

Control 9: Branch Protection (CC8.1)

Branch protection is a GitHub setting that prevents engineers from pushing code directly to your main branch without going through a pull request that has been reviewed and approved by at least one other team member. It also requires your CI pipeline to pass before any code can be merged.

SOC2 CC8.1 requires change management — the requirement that every change to production systems is documented, reviewed, and approved. Without branch protection, an engineer can push directly to main, which deploys directly to production through your CI/CD pipeline, with no review and no audit trail. Branch protection is the technical enforcement of your change management policy.

The critical setting that most teams miss: the "Do not allow bypassing the above settings" option must be enabled. Without it, administrators can bypass branch protection — and a SOC2 auditor will flag this as a gap because it means your change management control can be circumvented.

# .github/settings.yml — enforces branch protection via code
# Requires the settings GitHub App: https://github.com/apps/settings

branches:
  - name: main
    protection:
      required_pull_request_reviews:
        required_approving_review_count: 1
        dismiss_stale_reviews: true
        require_code_owner_reviews: false
      required_status_checks:
        strict: true
        contexts:
          - "CI / test"
          - "Security / trivy-scan"
      enforce_admins: true         # Admins cannot bypass — this is critical
      restrictions: null           # No push restriction beyond the above
      allow_force_pushes: false
      allow_deletions: false

Here's how you can verify that branch protection is enforced and admins can't bypass it:

# Returns the branch protection rules including enforce_admins status
curl -H "Authorization: token YOUR_GITHUB_TOKEN" \
  https://api.github.com/repos/YOUR_ORG/YOUR_REPO/branches/main/protection \
  | jq '{enforce_admins: .enforce_admins.enabled, required_reviews: .required_pull_request_reviews.required_approving_review_count}'
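If you save that API response to a file, a few lines of Python can turn it into a pass/fail check for your evidence pipeline. The sketch below reads the fields GitHub returns for branch protection (enforce_admins.enabled, required_pull_request_reviews, required_status_checks, allow_force_pushes.enabled); the function name and the sample responses are assumptions for illustration.

```python
from typing import List


def protection_gaps(protection: dict) -> List[str]:
    """Return human-readable gaps found in a branch protection API response."""
    gaps = []
    if not (protection.get("enforce_admins") or {}).get("enabled", False):
        gaps.append("admins can bypass branch protection")
    reviews = protection.get("required_pull_request_reviews") or {}
    if reviews.get("required_approving_review_count", 0) < 1:
        gaps.append("no approving review required")
    if not protection.get("required_status_checks"):
        gaps.append("no required status checks")
    if (protection.get("allow_force_pushes") or {}).get("enabled", False):
        gaps.append("force pushes allowed")
    return gaps


# Hypothetical responses, trimmed to the fields the check reads
compliant = {
    "enforce_admins": {"enabled": True},
    "required_pull_request_reviews": {"required_approving_review_count": 1},
    "required_status_checks": {"strict": True, "contexts": ["CI / test"]},
    "allow_force_pushes": {"enabled": False},
}
weak = {
    "enforce_admins": {"enabled": False},
    "required_pull_request_reviews": {"required_approving_review_count": 1},
}

print(protection_gaps(compliant))
print(protection_gaps(weak))
```

An empty list means the settings match the configuration above; anything else is worth fixing before the observation period starts.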

Control 10: Container Image Scanning (CC7.4)

Container image scanning analyses your Docker images before deployment to identify known security vulnerabilities (CVEs) in the operating system packages and application dependencies they contain.

Trivy is an open-source scanner that checks the base image (Ubuntu, Alpine, and so on), all installed OS packages, and language-specific dependencies (npm, pip, Go modules) against the National Vulnerability Database.

SOC2 CC7.4 requires monitoring and identifying vulnerabilities. Every container you deploy contains a base image with OS packages — and those packages regularly receive CVE disclosures. A critical CVE left unpatched for 90 days in a production container is a SOC2 finding. Automated scanning in CI means every image is checked before it can deploy.

# .github/workflows/security-scan.yml
name: Security Scan
on: [push, pull_request]

jobs:
  trivy-scan:
    name: Container Vulnerability Scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Build container image
        run: docker build -t app:${{ github.sha }} .

      - name: Scan image for vulnerabilities
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: app:${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
          severity: CRITICAL,HIGH
          exit-code: 1          # Fail the pipeline on CRITICAL or HIGH findings

      - name: Upload results to GitHub Security tab
        uses: github/codeql-action/upload-sarif@v2
        if: always()            # Upload even if scan found issues
        with:
          sarif_file: trivy-results.sarif

The scanner looks for:

CVEs in base image OS packages (for example, a critical OpenSSL vulnerability in your Ubuntu base)
Vulnerable versions of application dependencies (a known RCE in an npm package your app uses)
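Misconfigurations in the Dockerfile itself (running as root, using latest tags), provided you also run Trivy's configuration scan against the repository; the image scan above covers packages and dependencies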

Results appear in the GitHub Security tab for your repository, giving you a historical record of every scan — which is your SOC2 evidence.

Control 11: Incident Response Plan (CC9.2)

An incident response plan is a written, tested procedure that defines exactly what your team does when a security event occurs — from the moment an alert fires through to customer notification and post-incident review.

SOC2 CC9.2 requires that you have a documented process for responding to security events and that you've tested it. The auditor will ask for the written runbook and evidence that a tabletop exercise (a simulated incident walkthrough) has been conducted within the observation period.

Your incident response runbook must include:

Severity classification: Definitions of P1 (production down, customer data at risk), P2 (degraded service, potential risk), and P3 (minor issue, no customer impact) — and the response SLA for each.
Escalation path: Exactly who gets paged at each severity level, with contact details. Not "the on-call engineer" — specific names and a backup if the first person doesn't respond within 10 minutes.
First 15 minutes: The specific steps to take immediately — isolate the affected system, assess the scope, notify the incident channel, begin the timeline log.
Communication templates: Pre-written Slack messages, customer email templates, and regulatory notification templates (GDPR requires notification within 72 hours, HIPAA within 60 days).
Post-incident review: The blameless postmortem process, the 5-why root cause analysis template, and the action item tracking process.

Conduct a tabletop exercise at least once during your observation period: gather your engineering team for 45 minutes, simulate a realistic scenario (for example, "an AWS access key was committed to a public GitHub repo"), and walk through the runbook together. Document the meeting date, attendees, scenario, gaps found, and remediation actions. This document is your evidence.

Control 12: Access Reviews (CC6.3)

An access review is a quarterly audit of who has access to what in your production systems — AWS accounts, GitHub repositories, production databases, and every SaaS tool that touches customer data. You verify that every person on the list still works at the company and still needs the access their role grants them.

SOC2 CC6.3 requires that access is revoked when it's no longer needed. Former employees who retain access to production AWS accounts represent a genuine security risk and a definitive SOC2 finding.

In every access review I've conducted, I found 3–5 former employees or contractors who still had active access they should no longer have had.

The quarterly access review checklist:

# 1. IAM users — list all with their last login date
aws iam generate-credential-report
aws iam get-credential-report --output text --query Content \
  | base64 --decode | cut -d',' -f1,5 | column -t -s ','

# 2. IAM roles — find roles that have not been used in 90+ days
aws iam get-account-authorization-details \
  --query 'RoleDetailList[*].{Role:RoleName,LastUsed:RoleLastUsed.LastUsedDate}' \
  --output table

# 3. Verify AWS SSO user list matches your current employee list
aws identitystore list-users \
  --identity-store-id YOUR_IDENTITY_STORE_ID \
  --query 'Users[*].{Name:DisplayName,Email:Emails[0].Value}' \
  --output table

Cross-reference the output against your current employee list in your HR system. Document every change made — access removed, permissions reduced, accounts disabled. The documented changes are the evidence that the review was conducted meaningfully, not just as a checkbox exercise.
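The cross-reference itself is a set difference. The sketch below assumes you have exported two lists of email addresses, one from the identity store command above and one from your HR system; the function name and the addresses are hypothetical.

```python
from typing import Iterable, List


def accounts_without_employee(sso_emails: Iterable[str], employee_emails: Iterable[str]) -> List[str]:
    """Return SSO accounts whose email does not match any current employee."""
    current = {email.strip().lower() for email in employee_emails}
    accounts = {email.strip().lower() for email in sso_emails}
    return sorted(accounts - current)


# Hypothetical exports
sso_users = ["Alice@example.com", "bob@example.com", "former.contractor@example.com"]
hr_roster = ["alice@example.com", "bob@example.com"]

for email in accounts_without_employee(sso_users, hr_roster):
    print(f"REVIEW: {email} has SSO access but is not on the current employee list")
```

Emails are lowercased before comparison because identity providers and HR systems often differ in capitalisation. Service accounts and contractors managed outside HR will also show up here, so treat the output as a review list rather than an automatic removal list.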

Control 13: Backup Verification (CC9.5)

Backup verification is the process of actually restoring your backups to confirm they work — not just confirming that backups are being created. A backup that has never been tested doesn't exist from a recovery perspective.

SOC2 CC9.5 requires that recovery procedures are tested. If your production database is corrupted and you discover for the first time during the incident that your automated RDS snapshots can't be restored, you have both a disaster recovery failure and a SOC2 finding.

How to test your RDS backup:

# Step 1: Find your most recent production snapshot
aws rds describe-db-snapshots \
  --db-instance-identifier your-production-db \
  --query 'sort_by(DBSnapshots, &SnapshotCreateTime)[-1].DBSnapshotIdentifier' \
  --output text

# Step 2: Restore the snapshot to a test instance
aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier backup-verification-test \
  --db-snapshot-identifier YOUR_SNAPSHOT_ID \
  --db-instance-class db.t3.medium \
  --no-publicly-accessible \
  --tags Key=Purpose,Value=backup-verification Key=Environment,Value=test

# Step 3: Wait for the restore to complete (typically 5–15 minutes)
aws rds wait db-instance-available \
  --db-instance-identifier backup-verification-test

# Step 4: Connect and verify data integrity (spot check key tables)
# Run this against the restored instance
psql -h RESTORED_INSTANCE_ENDPOINT -U your_user -d your_database \
  -c "SELECT COUNT(*) FROM users; SELECT MAX(created_at) FROM orders;"

# Step 5: Document the test result and delete the test instance
aws rds delete-db-instance \
  --db-instance-identifier backup-verification-test \
  --skip-final-snapshot

Document the test date, the snapshot used, the restore time, the data verification query results, and who conducted the test. Run this quarterly at minimum. This documentation is your SOC2 evidence for CC9.5.
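One way to keep that documentation consistent from quarter to quarter is to write it as a small structured record. The sketch below captures the five items listed above as JSON; the class name, field names, and example values are placeholders rather than a required format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class BackupTestRecord:
    """One restore test, holding the items an auditor is likely to ask about."""
    test_date: str                        # ISO date of the restore test
    snapshot_id: str                      # snapshot that was restored
    restore_minutes: int                  # time until the instance was available
    verification_results: Dict[str, str]  # query -> observed result
    tested_by: str

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Placeholder values; replace with the real output of your restore test
record = BackupTestRecord(
    test_date="2026-02-10",
    snapshot_id="YOUR_SNAPSHOT_ID",
    restore_minutes=12,
    verification_results={
        "SELECT COUNT(*) FROM users": "REPLACE_WITH_QUERY_OUTPUT",
        "SELECT MAX(created_at) FROM orders": "REPLACE_WITH_QUERY_OUTPUT",
    },
    tested_by="REPLACE_WITH_ENGINEER_NAME",
)
print(record.to_json())
```

Saving the JSON alongside your other evidence gives each quarterly test the same shape, which makes gaps easy to spot.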

Control 14: Change Management Log (CC8.1)

A change management log is the auditable record of every change made to your production environment — what changed, who approved it, and when it was applied.

SOC2 CC8.1 requires that changes to your production environment are authorized and documented. With IaC and GitOps in place, you already have two separate sources of immutable change history that together satisfy this control.

GitHub Pull Request history provides the record of every code and infrastructure change: who opened the PR, who reviewed and approved it, what the CI status was, and when it was merged. This is your change management log for application and infrastructure changes.

ArgoCD sync history provides the record of every deployment to your Kubernetes cluster: which application was synced, from which Git commit, at what time, and whether the sync succeeded.

To export the ArgoCD sync history as evidence:

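# Export ArgoCD application sync history as JSON evidence.
# `argocd app history` prints a table, so read the history from the application's JSON status instead.
argocd app get YOUR_APP_NAME --output json | jq '.status.history' > argocd-sync-history-$(date +%Y%m).json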

# Upload to your SOC2 evidence bucket
aws s3 cp argocd-sync-history-$(date +%Y%m).json \
  s3://your-soc2-evidence-bucket/change-management/$(date +%Y/%m)/

# For each deployment, the evidence contains:
# - App name, deployed revision (Git commit SHA)
# - Deployment timestamp
# - Initiating user or automated sync
# - Success/failure status

Together, the GitHub PR history and the ArgoCD sync history give the auditor a complete, tamper-evident record of every change to your production environment during the observation period.

Weeks 7–10: The Evidence Collection Infrastructure

Evidence is the difference between passing and failing SOC2.

You might be wondering: what exactly is evidence? In SOC2 terms, evidence is the documentation that proves a specific control was operating correctly during a specific point in time within the observation period. A policy document says you will do something. Evidence proves you did it — and that you did it continuously, not just the week before the audit.

For example:

For MFA enforcement (Control 1), evidence is a screenshot of your IAM Identity Center MFA settings taken at a specific date during the observation period, combined with an IAM credential report showing zero IAM users with console access.
For GuardDuty (Control 4), evidence is the GuardDuty console screenshot showing active detectors, plus your documented response to any findings during the period.
For access reviews (Control 12), evidence is the completed access review document with dates, names, and specific access changes made.

The challenge is collecting this evidence continuously across 3–12 months without spending hundreds of hours on manual work. The solution is automated evidence collection infrastructure.

The Evidence Bucket — Tamper-Proof Storage for Your Audit Evidence

The evidence bucket is an S3 bucket with Object Lock enabled in GOVERNANCE mode. Object Lock prevents any object from being deleted or modified for the retention period you specify — in this case, 365 days. This means once a piece of evidence is uploaded, it can't be altered, even by a user with administrator access (without explicitly overriding the lock, which itself creates an audit trail).

This tamper-evident property is what gives the auditor confidence that the evidence was not created or modified after the fact.

# terraform/soc2-evidence-bucket.tf

resource "aws_s3_bucket" "soc2_evidence" {
  bucket = "${var.company_name}-soc2-evidence-${var.environment}"
}

# Block all public access to the evidence bucket
resource "aws_s3_bucket_public_access_block" "soc2_evidence" {
  bucket = aws_s3_bucket.soc2_evidence.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Enable versioning so overwrites create new versions, not replacements
resource "aws_s3_bucket_versioning" "soc2_evidence" {
  bucket = aws_s3_bucket.soc2_evidence.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Object Lock in GOVERNANCE mode — objects cannot be deleted for 365 days
resource "aws_s3_bucket_object_lock_configuration" "soc2_evidence" {
  bucket = aws_s3_bucket.soc2_evidence.id

  rule {
    default_retention {
      mode = "GOVERNANCE"
      days = 365
    }
  }
}

# Encrypt all evidence at rest
resource "aws_s3_bucket_server_side_encryption_configuration" "soc2_evidence" {
  bucket = aws_s3_bucket.soc2_evidence.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}
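
After applying the Terraform, it can be worth confirming that the live bucket actually carries the lock settings declared above. The following is a minimal sketch, assuming you feed it the dictionary returned by boto3's `get_object_lock_configuration` call for the bucket. The function name `check_object_lock` and its `expected_mode` and `min_days` parameters are illustrative and not part of the original setup.

```python
def check_object_lock(response: dict,
                      expected_mode: str = 'GOVERNANCE',
                      min_days: int = 365) -> dict:
    """Evaluate the response of s3.get_object_lock_configuration(Bucket=...)."""
    config = response.get('ObjectLockConfiguration', {})
    retention = config.get('Rule', {}).get('DefaultRetention', {})
    # DefaultRetention carries either Days or Years; normalise to days
    days = retention.get('Days', 0) + 365 * retention.get('Years', 0)
    ok = (
        config.get('ObjectLockEnabled') == 'Enabled'
        and retention.get('Mode') == expected_mode
        and days >= min_days
    )
    return {
        'status': 'PASS' if ok else 'FAIL',
        'mode': retention.get('Mode'),
        'retention_days': days,
    }


# Sample shaped like the boto3 response for the bucket defined above
sample_response = {
    'ObjectLockConfiguration': {
        'ObjectLockEnabled': 'Enabled',
        'Rule': {'DefaultRetention': {'Mode': 'GOVERNANCE', 'Days': 365}},
    }
}
print(check_object_lock(sample_response))
```

Keeping the check as a pure function means the same logic can run inside the daily collector or in a local script against a saved API response.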

The Daily Evidence Collector Lambda

This Lambda function runs automatically every day and exports the status of each critical control to a time-stamped JSON file in the evidence bucket. Over your 3–12 month observation period, it creates a daily record proving that your controls were active and operating.

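The function shown below checks five controls automatically: CloudTrail status, GuardDuty status (including the count of unresolved HIGH and CRITICAL findings), VPC Flow Logs, EBS encryption by default, and the account-level S3 Block Public Access settings. An MFA compliance check can be added following the same pattern. Each daily snapshot is uploaded with an Object Lock retention date so it can't be modified.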

# lambda/evidence-collector/handler.py

import boto3
import json
from datetime import datetime, timedelta, timezone

def lambda_handler(event, context):
    """
    Daily SOC2 evidence collector.
    Runs at 00:00 UTC every day via EventBridge scheduler.
    Exports control status to S3 evidence bucket with Object Lock.
    """
    evidence = {
        'collection_timestamp': datetime.now(timezone.utc).isoformat(),
        'collection_date': datetime.now(timezone.utc).strftime('%Y-%m-%d'),
        'account_id': boto3.client('sts').get_caller_identity()['Account'],
        'controls': {}
    }

    # Control 3: CloudTrail status
    cloudtrail = boto3.client('cloudtrail')
    trails = cloudtrail.describe_trails(includeShadowTrails=False)['trailList']
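    # A trail counts as active only if it is multi-region AND currently logging
    multi_region_trails = [
        t for t in trails
        if t.get('IsMultiRegionTrail')
        and cloudtrail.get_trail_status(Name=t['TrailARN'])['IsLogging']
    ]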
    evidence['controls']['cloudtrail'] = {
        'status': 'PASS' if multi_region_trails else 'FAIL',
        'detail': f"{len(multi_region_trails)} multi-region trail(s) active",
        'trails': [t['Name'] for t in multi_region_trails]
    }

    # Control 4: GuardDuty status
    guardduty = boto3.client('guardduty')
    detectors = guardduty.list_detectors()['DetectorIds']
    unresolved_critical = 0
    # list_findings is paginated, so walk every page to get a complete count
    paginator = guardduty.get_paginator('list_findings')
    for detector_id in detectors:
        pages = paginator.paginate(
            DetectorId=detector_id,
            FindingCriteria={
                'Criterion': {
                    'severity': {'Gte': 7},  # HIGH and CRITICAL only
                    'service.archived': {'Eq': ['false']}
                }
            }
        )
        for page in pages:
            unresolved_critical += len(page['FindingIds'])

    evidence['controls']['guardduty'] = {
        'status': 'PASS' if detectors else 'FAIL',
        'detail': f"{len(detectors)} detector(s) active, {unresolved_critical} unresolved HIGH/CRITICAL findings",
        'unresolved_high_critical': unresolved_critical
    }

    # Control 5: VPC Flow Logs
    ec2 = boto3.client('ec2')
    flow_logs = ec2.describe_flow_logs(
        Filters=[{'Name': 'resource-type', 'Values': ['VPC']},
                 {'Name': 'flow-log-status', 'Values': ['ACTIVE']}]
    )['FlowLogs']
    evidence['controls']['vpc_flow_logs'] = {
        'status': 'PASS' if flow_logs else 'FAIL',
        'detail': f"{len(flow_logs)} active VPC flow log(s)",
        'active_flow_logs': len(flow_logs)
    }

    # Control 7: EBS encryption by default
    ebs_encryption = ec2.get_ebs_encryption_by_default()['EbsEncryptionByDefault']
    evidence['controls']['ebs_encryption_by_default'] = {
        'status': 'PASS' if ebs_encryption else 'FAIL',
        'detail': 'EBS encryption by default is enabled' if ebs_encryption else 'EBS encryption by default is NOT enabled'
    }

    # Control 8: S3 Block Public Access (account level)
    s3control = boto3.client('s3control')
    account_id = evidence['account_id']  # already looked up via STS above
    try:
        pab = s3control.get_public_access_block(AccountId=account_id)['PublicAccessBlockConfiguration']
        all_blocked = all([pab['BlockPublicAcls'], pab['IgnorePublicAcls'],
                           pab['BlockPublicPolicy'], pab['RestrictPublicBuckets']])
        evidence['controls']['s3_block_public_access'] = {
            'status': 'PASS' if all_blocked else 'FAIL',
            'detail': 'All four S3 Block Public Access settings enabled' if all_blocked else 'One or more S3 Block Public Access settings not enabled',
            'configuration': pab
        }
    except Exception as e:
        evidence['controls']['s3_block_public_access'] = {'status': 'FAIL', 'detail': str(e)}

    # Upload evidence to S3 with Object Lock
    s3 = boto3.client('s3')
    evidence_key = f"daily/{evidence['collection_date']}/control-status.json"
    lock_until = datetime.now(timezone.utc) + timedelta(days=365)

    s3.put_object(
        Bucket='YOUR_EVIDENCE_BUCKET_NAME',
        Key=evidence_key,
        Body=json.dumps(evidence, indent=2),
        ContentType='application/json',
        ObjectLockMode='GOVERNANCE',
        ObjectLockRetainUntilDate=lock_until
    )

    # Alert if any control fails
    failed_controls = [k for k, v in evidence['controls'].items() if v['status'] == 'FAIL']
    if failed_controls:
        sns = boto3.client('sns')
        sns.publish(
            TopicArn='YOUR_ALERT_TOPIC_ARN',
            Subject=f'SOC2 Control Failure Detected — {evidence["collection_date"]}',
            Message=f'The following controls failed their daily check:\n\n{json.dumps(failed_controls, indent=2)}'
        )

    return {
        'statusCode': 200,
        'controls_checked': len(evidence['controls']),
        'controls_failed': len(failed_controls),
        'evidence_location': f"s3://YOUR_EVIDENCE_BUCKET_NAME/{evidence_key}"
    }
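
The handler above has no MFA check. One way to add it is to evaluate the IAM credential report, which is the same artifact used as MFA evidence earlier in this guide. The sketch below is an assumption about how that could look, not the author's implementation: `mfa_control_status` is a hypothetical helper, and the sample CSV is trimmed to the three columns the check reads (the real report has many more). In the Lambda, the CSV text would come from `iam.generate_credential_report()` followed by `iam.get_credential_report()['Content'].decode('utf-8')`.

```python
import csv
import io


def mfa_control_status(report_csv: str) -> dict:
    """Build an evidence entry from an IAM credential report (CSV text)."""
    rows = list(csv.DictReader(io.StringIO(report_csv)))
    # password_enabled == 'true' means the user can sign in to the console
    console_users = [r['user'] for r in rows if r.get('password_enabled') == 'true']
    without_mfa = [
        r['user'] for r in rows
        if r.get('password_enabled') == 'true' and r.get('mfa_active') != 'true'
    ]
    return {
        'status': 'PASS' if not without_mfa else 'FAIL',
        'detail': f"{len(console_users)} console user(s), {len(without_mfa)} without MFA",
        'console_users': console_users,
        'users_without_mfa': without_mfa,
    }


# Trimmed sample report; the root row reports password_enabled as not_supported
sample_report = (
    "user,password_enabled,mfa_active\n"
    "<root_account>,not_supported,true\n"
    "alice,true,true\n"
    "bob,true,false\n"
    "ci-bot,false,false\n"
)
print(mfa_control_status(sample_report)['users_without_mfa'])  # ['bob']
```

The result has the same `status` and `detail` keys as the other entries, so it could be stored under `evidence['controls']['mfa']` and picked up by the existing failure alert. If your policy is zero IAM users with console access, treat a non-empty `console_users` list as the failure condition instead.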

The GitHub Actions Evidence Workflow

This workflow runs daily and captures evidence that can't be automated through AWS APIs — GitHub-level controls like branch protection status, recent pull request activity, and CI pipeline results. It exports these as JSON files to the same evidence bucket.

# .github/workflows/soc2-evidence.yml
name: SOC2 Evidence Collection
on:
  schedule:
    - cron: '0 1 * * *'   # 01:00 UTC daily (after the Lambda runs at 00:00)
  workflow_dispatch:        # Allow manual trigger when needed

permissions:
  contents: read
  id-token: write   # required so configure-aws-credentials can assume the role via OIDC

jobs:
  collect-github-evidence:
    name: Collect GitHub Control Evidence
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/evidence-collector
          aws-region: us-east-1

      - name: Collect branch protection status
        run: |
          DATE=$(date +%Y-%m-%d)
          mkdir -p evidence/github

          # Export branch protection rules for main
          curl -s -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" \
            "https://api.github.com/repos/${{ github.repository }}/branches/main/protection" \
            | jq '{
                date: "'$DATE'",
                enforce_admins: .enforce_admins.enabled,
                required_reviews: .required_pull_request_reviews.required_approving_review_count,
                required_status_checks: .required_status_checks.contexts,
                allow_force_pushes: .allow_force_pushes.enabled
              }' > evidence/github/branch-protection-$DATE.json

          echo "Branch protection evidence collected"
          cat evidence/github/branch-protection-$DATE.json

      - name: Upload evidence to S3
        run: |
          DATE=$(date +%Y-%m-%d)
          aws s3 sync evidence/ \
            s3://${{ secrets.SOC2_EVIDENCE_BUCKET }}/daily/$DATE/github/ \
            --no-progress
          echo "Evidence uploaded: s3://${{ secrets.SOC2_EVIDENCE_BUCKET }}/daily/$DATE/github/"
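
The workflow stores the branch protection settings but does not judge them. A small evaluator can turn each exported file into a PASS or FAIL record in the same spirit as the Lambda output. This is a sketch under assumptions: `evaluate_branch_protection` is a hypothetical helper, and the thresholds (at least one approving review, at least one required status check) are illustrative defaults rather than requirements stated in this guide.

```python
def evaluate_branch_protection(record: dict, min_reviews: int = 1) -> dict:
    """Judge one branch-protection export produced by the workflow above."""
    failures = []
    if record.get('enforce_admins') is not True:
        failures.append('admins can bypass branch protection')
    if (record.get('required_reviews') or 0) < min_reviews:
        failures.append(f'fewer than {min_reviews} required approving review(s)')
    if not record.get('required_status_checks'):
        failures.append('no required status checks')
    if record.get('allow_force_pushes') is not False:
        failures.append('force pushes are not blocked')
    return {
        'date': record.get('date'),
        'status': 'PASS' if not failures else 'FAIL',
        'failures': failures,
    }


# Example record with the same keys the jq filter writes
good = {
    'date': '2024-03-10',
    'enforce_admins': True,
    'required_reviews': 2,
    'required_status_checks': ['build', 'trivy-scan'],
    'allow_force_pushes': False,
}
print(evaluate_branch_protection(good))
```

Because every check treats a missing or null field as a failure, an export made from an API error response (where jq writes nulls) is reported as FAIL rather than silently passing.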

Weeks 11–14: Auditor Selection and Readiness Assessment

How to Choose a SOC2 Auditor

Selecting the right auditor is more consequential than most teams realize. SOC2 audits are conducted by CPA firms — specifically, firms licensed to issue SOC reports. The right firm has experience with cloud-native, SaaS companies your size. The wrong firm could apply enterprise audit frameworks to a seed-stage startup and generate findings based on controls that aren't appropriate to your context.

Here is what to look for and what to watch out for:

Experience matters more than brand

A large Big Four firm isn't necessarily better than a specialist boutique auditor for a 20-person SaaS company.

Ask specifically: "How many SOC2 audits have you completed in the last 12 months for SaaS companies between 10 and 50 employees?" You want a firm where this is common, not exceptional.

Verify familiarity with your compliance tool

If you're using Vanta or Drata, confirm that the auditor has experience with evidence produced by those platforms. Some auditors prefer to collect evidence directly and are unfamiliar with automated evidence exports. An auditor who doesn't trust your Vanta evidence will ask you to re-collect everything manually.

Understand what Type II actually costs

For a Series A SaaS company, expect $15,000–$30,000 for a SOC2 Type II audit with a 3-month observation period. A quote below $10,000 often means the auditor is cutting corners on the review depth. A quote above $50,000 for a small company typically means the firm is applying enterprise pricing to a startup engagement.

Get references from similar companies

Ask the auditor for two or three references from SaaS companies they've audited in the last year. Call those references and ask: did the auditor understand cloud infrastructure? Were the findings reasonable? How was the communication during the review?

Here's a summary table of some things to watch out for:

Criteria	What to Look For	Red Flag
Experience	5+ years, 20+ SaaS audits annually	"We have completed several SOC2 audits" (vague)
Tool familiarity	Has reviewed Vanta/Drata evidence before	Requires manual re-collection of automated evidence
Company size fit	Has audited companies your size	Only lists enterprise clients as references
Cost (Type II)	$15K–$30K for a 20-person company	Under $10K or over $50K without clear justification
References	Can provide SaaS company contacts to call	Cannot provide references

How to Run a Readiness Assessment (Mock Audit)

A readiness assessment is a simulation of the real audit, run internally or by a consultant 2–4 weeks before you engage the auditor. Its purpose is to find and close gaps before the auditor finds them, because gaps found in a mock audit cost you a week of remediation time, while gaps found in the real audit cost you a conditional report and a re-review.

You can run the readiness assessment yourself or hire a consultant to run it. The consultant approach is more valuable because an independent reviewer will find gaps you have rationalised away.

The process:

Step 1: Work through every control in the checklist below and attempt to produce the evidence that an auditor would request.
Step 2: For every control where you can't produce clear, timestamped evidence: that's a gap. Document it.
Step 3: Prioritise gaps by type. Evidence gaps (missing evidence for an active control) require evidence collection infrastructure fixes. Control gaps (a control that isn't implemented) require engineering work.
Step 4: Close all gaps before engaging the real auditor.

Control	Evidence Required	How to Verify	Ready?
MFA enforced	IAM credential report + SSO MFA policy screenshot	`aws iam get-credential-report`	⬜
CloudTrail active	Trail status + S3 delivery confirmation	`aws cloudtrail get-trail-status`	⬜
GuardDuty active	Detector list + finding review log	`aws guardduty list-detectors`	⬜
VPC Flow Logs	Active flow log list + sample log entries	`aws ec2 describe-flow-logs`	⬜
Secrets in Secrets Manager	Secret list + rotation policy confirmation	`aws secretsmanager list-secrets`	⬜
EBS encryption by default	Account-level encryption setting	`aws ec2 get-ebs-encryption-by-default`	⬜
S3 Block Public Access	Account-level PAB configuration	`aws s3control get-public-access-block`	⬜
Branch protection (no admin bypass)	GitHub branch protection API response	GitHub API or Settings UI	⬜
Trivy scanning in CI	GitHub Actions run history showing scans	GitHub Actions logs	⬜
Incident response runbook	Written runbook + tabletop exercise notes with date	Document review	⬜
Access review	Quarterly review document with specific changes made	Document review	⬜
Backup test	RDS restore log + data verification results	Document review	⬜
Change management log	GitHub PR history + ArgoCD sync history	GitHub and ArgoCD	⬜

The one thing most teams skip: Running the readiness assessment against their own evidence bucket. Pull a random day's evidence from the daily Lambda export and verify that it's complete, timestamped, and accurately reflects the control status on that day.

If the evidence file for December 14th shows GuardDuty as PASS but GuardDuty was actually disabled that day, the auditor will find the discrepancy in the AWS account history — and that's a qualified finding.
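
The random spot check described above can be partly scripted. The sketch below assumes the JSON layout written by the daily Lambda (`collection_date`, `collection_timestamp`, and a `controls` map); the helper names `pick_random_day` and `validate_evidence` are hypothetical. It checks completeness and timestamps only. Whether a PASS was accurate still has to be compared against the AWS account history by hand.

```python
import json
import random
from datetime import date, timedelta

# Control keys written by the daily collector shown earlier
REQUIRED_CONTROLS = {
    'cloudtrail', 'guardduty', 'vpc_flow_logs',
    'ebs_encryption_by_default', 's3_block_public_access',
}


def pick_random_day(start: date, end: date, rng=random) -> date:
    """Pick a day uniformly from the inclusive range [start, end]."""
    return start + timedelta(days=rng.randrange((end - start).days + 1))


def validate_evidence(raw_json: str, expected_date: date) -> list:
    """Return a list of problems found in one daily evidence file."""
    problems = []
    evidence = json.loads(raw_json)
    day = expected_date.isoformat()
    if evidence.get('collection_date') != day:
        problems.append('collection_date does not match the requested day')
    if not str(evidence.get('collection_timestamp', '')).startswith(day):
        problems.append('collection_timestamp is missing or from another day')
    controls = evidence.get('controls', {})
    missing = REQUIRED_CONTROLS - controls.keys()
    if missing:
        problems.append(f'missing controls: {sorted(missing)}')
    for name, entry in controls.items():
        if entry.get('status') not in ('PASS', 'FAIL'):
            problems.append(f'{name} has no PASS/FAIL status')
    return problems


sample = json.dumps({
    'collection_timestamp': '2024-12-14T00:00:03+00:00',
    'collection_date': '2024-12-14',
    'controls': {name: {'status': 'PASS'} for name in REQUIRED_CONTROLS},
})
print(validate_evidence(sample, date(2024, 12, 14)))  # []
```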

Weeks 15–18: The Observation Period

How the Auditor Observes Your Controls

The SOC2 auditor doesn't physically visit your office or sit inside your AWS console watching your infrastructure in real time. The audit is a remote, documentation-based process conducted entirely through evidence review.

Here is how it actually works:

First, the auditor provides a list of evidence requests — typically 80–150 items for a Type II audit. You upload the evidence to a shared portal (the auditor provides this — it is usually a secure document sharing platform). The auditor reviews the evidence, asks follow-up questions, and identifies gaps where evidence is missing or a control wasn't operating as described.

For automated controls like CloudTrail and GuardDuty, the evidence is your daily Lambda exports — the auditor spot-checks a sample of daily snapshots across the observation period to verify the controls were consistently active.

For manual controls like access reviews and backup tests, the evidence is the documents you produced when you ran those processes.

The practical implication: the auditor is trusting your evidence. This is why the Object Lock on your evidence bucket matters. It proves to the auditor that the evidence was generated at the time it claims to have been generated and hasn't been modified since.

What the Auditor Reviews Over the Observation Period

What They Check	How Often	What They Are Looking For
CloudTrail logs	Spot check monthly	Manual console changes that bypassed IaC, gaps in log delivery
GuardDuty findings	Review quarterly summary	HIGH or CRITICAL findings not remediated within your documented SLA
Access review completion	Verify each quarterly cycle	Reviews skipped, reviews with no access changes despite employee turnover
Incident response tests	Verify annually	No tabletop exercise conducted during the observation period
Evidence collection	Verify continuous coverage	Gaps in daily evidence exports, missing evidence for specific dates
Change management log	Sample PR/sync history	Deployments with no associated pull request or review

What Triggers a Finding

A SOC2 finding is the auditor's documented conclusion that a control wasn't operating effectively during the observation period. Findings range from observations (minor issues that don't affect the audit opinion) to qualified opinions (material failures that result in a qualified rather than unqualified report).

Understanding what triggers findings — and which ones restart the observation period — is critical for managing your audit timeline.

Control gaps occur when a required control isn't implemented or was disabled during the observation period. If you discover in month 2 that MFA wasn't enforced on one IAM user for the first three weeks, you must document the remediation and demonstrate the gap was closed.

Whether this restarts your observation period depends on how long the gap lasted and how the auditor assesses the risk — but a gap of less than 30 days that's immediately remediated and documented typically doesn't restart the clock.

Evidence gaps are more serious. If your daily Lambda evidence collector failed for two weeks and produced no evidence exports, you have a two-week window with no documented proof that your controls were operating. The auditor can't verify controls they can't see evidence for.

Evidence gaps almost always require extending the observation period because there's no way to retroactively produce evidence for a period that wasn't recorded.
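
Since evidence gaps can't be repaired after the fact, it helps to measure coverage yourself before the auditor does. The following sketch assumes the key layout used by the daily Lambda (`daily/YYYY-MM-DD/control-status.json`) and takes a plain list of object keys, which in practice would come from listing the evidence bucket. The function name `missing_evidence_days` is illustrative.

```python
from datetime import date, timedelta


def missing_evidence_days(keys, start: date, end: date) -> list:
    """Return every day in [start, end] that has no daily control-status file."""
    present = set()
    for key in keys:
        parts = key.split('/')
        if len(parts) == 3 and parts[0] == 'daily' and parts[2] == 'control-status.json':
            try:
                present.add(date.fromisoformat(parts[1]))
            except ValueError:
                continue  # ignore keys whose middle segment is not a date
    missing = []
    day = start
    while day <= end:
        if day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing


keys = [
    'daily/2024-03-10/control-status.json',
    'daily/2024-03-11/control-status.json',
    'daily/2024-03-13/control-status.json',
    'daily/2024-03-13/github/branch-protection-2024-03-13.json',
]
gaps = missing_evidence_days(keys, date(2024, 3, 10), date(2024, 3, 14))
print([d.isoformat() for d in gaps])  # ['2024-03-12', '2024-03-14']
```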

Process failures occur when a manual control wasn't executed as documented. The most common is an access review that was skipped. Like control gaps, these can typically be remediated without restarting the clock if they're documented promptly and the remediation is clear.

Unpatched critical CVEs are a special case. If Trivy identifies a CRITICAL vulnerability in a production container and it remains unpatched for more than your documented remediation SLA (typically 30 days for critical, 90 days for high), this is a qualified finding that the auditor will note in the report.
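
A remediation SLA is only checkable if you track when each vulnerability was first detected. The sketch below assumes you keep that date yourself (a scanner run reports what is present now, not when it first appeared) and uses the 30-day and 90-day windows mentioned above. The record fields `id`, `severity`, and `first_seen` are hypothetical names for illustration.

```python
from datetime import date

# Remediation windows in days, taken from the SLA example above
SLA_DAYS = {'CRITICAL': 30, 'HIGH': 90}


def sla_breaches(findings, today: date) -> list:
    """Return findings that have been open longer than their SLA allows."""
    breaches = []
    for finding in findings:
        limit = SLA_DAYS.get(finding['severity'])
        if limit is None:
            continue  # severities without a documented SLA are not evaluated
        age = (today - finding['first_seen']).days
        if age > limit:
            breaches.append({**finding, 'age_days': age, 'sla_days': limit})
    return breaches


open_findings = [
    {'id': 'CVE-A', 'severity': 'CRITICAL', 'first_seen': date(2024, 4, 15)},
    {'id': 'CVE-B', 'severity': 'HIGH', 'first_seen': date(2024, 4, 15)},
    {'id': 'CVE-C', 'severity': 'MEDIUM', 'first_seen': date(2024, 1, 1)},
]
for breach in sla_breaches(open_findings, today=date(2024, 6, 1)):
    print(breach['id'], breach['age_days'], 'days open, SLA', breach['sla_days'])
```

Running a check like this daily and storing the result alongside the other evidence gives you warning before a finding crosses the SLA rather than after.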

How to Close Gaps Without Restarting the Clock

When you discover a gap during the observation period:

For control gaps:

1. Fix the control immediately — don't wait
2. Document the fix: screenshot, PR link, or CLI command output with timestamp
3. Note the gap date range in your audit log: "Control gap: 2024-03-10 to 2024-03-14 (4 days). Root cause: [X]. Remediated: [Y]. No customer data accessed during gap period."
4. Notify your auditor proactively — they will find it anyway; proactive disclosure is better than defensive explanation
5. The observation period doesn't restart if the gap was short-lived and promptly remediated
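
To keep gap notes consistent across the team, the log entry from step 3 can be generated rather than typed. This is a small sketch with a hypothetical helper, `format_control_gap`, that reproduces the format shown above. Any statement about customer data access is left to the caller, since it has to be verified rather than assumed.

```python
from datetime import date


def format_control_gap(start: date, end: date, root_cause: str, remediation: str) -> str:
    """Format an audit-log entry for a control gap in the style shown above."""
    days = (end - start).days
    unit = 'day' if days == 1 else 'days'
    return (
        f"Control gap: {start.isoformat()} to {end.isoformat()} ({days} {unit}). "
        f"Root cause: {root_cause.rstrip('.')}. "
        f"Remediated: {remediation.rstrip('.')}."
    )


entry = format_control_gap(
    date(2024, 3, 10), date(2024, 3, 14),
    root_cause='MFA policy not applied to a newly created user',
    remediation='policy attached and user re-enrolled',
)
print(entry)
```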

For evidence gaps:

1. Fix the evidence collection infrastructure immediately
2. Understand that you can't retroactively generate evidence for the gap period
3. The observation period for affected controls effectively restarts from the date evidence collection resumed
4. If the gap is early in your observation period, you may be able to extend the period rather than restart — discuss with your auditor

The pro tip: Set up a CloudWatch alarm that triggers if the evidence Lambda fails to deliver to S3 on schedule. A missing daily evidence file is caught within 24 hours, not discovered during the audit review.
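
One possible shape for that alarm is sketched below. It builds the arguments for CloudWatch's `put_metric_alarm` call using the standard `AWS/Lambda` `Errors` metric, and treats missing data as breaching so that a day with no invocation at all also raises the alarm. The helper name, alarm name, and the example function name and topic ARN are assumptions. Note that this catches a failed or skipped run, not a run that exits cleanly without writing a file.

```python
def evidence_alarm_kwargs(function_name: str, sns_topic_arn: str) -> dict:
    """Build keyword arguments for cloudwatch.put_metric_alarm()."""
    return {
        'AlarmName': f'{function_name}-daily-evidence-missing',
        'AlarmDescription': 'Evidence collector failed or did not run in the last 24 hours',
        'Namespace': 'AWS/Lambda',
        'MetricName': 'Errors',
        'Dimensions': [{'Name': 'FunctionName', 'Value': function_name}],
        'Statistic': 'Sum',
        'Period': 86400,            # evaluate one full day at a time
        'EvaluationPeriods': 1,
        'Threshold': 1,
        'ComparisonOperator': 'GreaterThanOrEqualToThreshold',
        'TreatMissingData': 'breaching',  # no datapoints means the function never ran
        'AlarmActions': [sns_topic_arn],
    }


kwargs = evidence_alarm_kwargs(
    'soc2-evidence-collector',                               # hypothetical function name
    'arn:aws:sns:us-east-1:123456789012:soc2-alerts',        # placeholder topic ARN
)
print(kwargs['AlarmName'])
# To create the alarm: boto3.client('cloudwatch').put_metric_alarm(**kwargs)
```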

The 90-Day SOC2 Timeline at a Glance

Weeks	Focus	Key Deliverables	Common Mistake
1–2	Scope	Boundary diagram, network segmentation Terraform	Over-scoping to include dev and staging
3–6	Controls	14 controls implemented and collecting evidence	Starting controls after the observation period begins
7–10	Evidence	S3 evidence bucket, Lambda daily collector, GitHub Actions workflow	Manual evidence collection with inevitable gaps
11–14	Readiness	Mock audit, gap remediation, auditor selected	Skipping the mock audit
15–18	Observation	Daily evidence, quarterly reviews, incident response test	Discovering evidence gaps during the audit rather than before

What's Next?

Start with Week 1. Define your SOC2 boundary. Apply the four-question framework to every system in your infrastructure. Draw the diagram in Excalidraw. Document the network segmentation controls.

Then implement the 14 controls in order, starting with MFA and CloudTrail — the two that most commonly fail audits when they're missing.

Then build your evidence collection infrastructure before the observation period starts. The automated Lambda and GitHub Actions workflow are the difference between a smooth audit and a 60-day extension.

One thing to remember: SOC2 is 20% controls, 30% evidence, and 50% continuous operation. Start early. Automate everything. Run a mock audit before you call the real one.

Resources

The following resources are referenced throughout this guide:

AICPA SOC2 Overview — The official SOC2 documentation from the American Institute of CPAs, including the Trust Service Criteria
Vanta — Compliance automation platform that connects to AWS and GitHub to automate evidence collection and track control status
Drata — Alternative compliance automation platform with similar capabilities to Vanta
Trivy by Aqua Security — Open-source container and filesystem vulnerability scanner used in Control 10
Excalidraw — Free, open-source diagram tool for creating the SOC2 boundary diagram
AWS IAM Identity Center documentation — Official AWS documentation for setting up SSO and MFA enforcement
GitHub branch protection documentation — Official GitHub documentation for configuring branch protection rules
ArgoCD documentation — Official ArgoCD documentation for GitOps deployment and sync history

Ayobami Adejumo is a senior platform engineer and FinOps specialist. He writes about SOC2 compliance engineering, Kubernetes cost optimization, and platform engineering.

How to Maintain SOC 2 Compliance: A Step-by-Step Guide

Alex Tray — Wed, 16 Oct 2024 14:22:23 +0000

While it might seem challenging to remain SOC 2 compliant, it is a critical process that helps earn your client’s trust and also ensures the security of your systems.

SOC 2 assesses how well a company protects its data based on five trust services criteria: security, availability, processing integrity, confidentiality, and privacy.

In this article, we’ll examine the details of SOC 2 compliance and I’ll provide a complete guide to help your organization achieve and maintain this critical certification. We’ll also discuss the five trust services criteria and essential steps for implementation, and I’ll offer insights on preparing for and passing SOC 2 audits.

What is SOC 2 Compliance?
Learn About SOC 2 Trust Services Criteria
Implement Strong Access Controls
Continuously Monitor Your Systems
Document Everything
Prepare for Regular Audits
Ensure Vendor Compliance
Incident Response Plan
Employee Training and Awareness
SOC 1 vs SOC 2
Conclusion

What is SOC 2 Compliance?

SOC 2 (System and Organization Controls 2) is a framework for assessing how an organization protects the privacy, security, and reliability of customer data in cloud services.

Developed by the American Institute of Certified Public Accountants (AICPA), SOC 2 focuses on five key trust service principles: security, availability, processing integrity, confidentiality, and privacy. SOC 2 compliance, therefore, means that a company has taken appropriate measures to handle clients’ and partners’ sensitive data and gain their trust.

To stay compliant with the SOC 2 requirements, a company must perform several activities, including audits, system monitoring, and following various best practices and guidelines for data security.

Now we’ll discuss some of these best practices and how you and your team can implement them.

1. Learn About SOC 2 Trust Services Criteria

The first rule of maintaining compliance is a thorough understanding of the SOC 2 trust services criteria. These are the five key areas that auditors will assess in a SOC 2 audit:

Security: Systems are protected against unauthorized access.
Availability: Systems are available for operation and use as committed in service-level agreements.
Processing Integrity: System processing must be complete, accurate, and authorized. For example, input validation checks must be implemented to prevent invalid data from entering the system, and automated workflows must be used to ensure that data is processed consistently and accurately.
Confidentiality: Information designated as confidential is protected from unauthorized disclosure.
Privacy: Personal data is handled according to your published privacy policies. This criterion focuses on implementing data privacy policies, procedures, and controls to protect individuals' data. For example, organizations should obtain explicit consent from individuals before collecting and using their personal information and provide them with the right to access, correct, or delete their data.

Take the time to map your organization's policies and procedures to these criteria. Do this with your team for your current security plans and policies, and check regularly that they still meet the standards above.

2. Implement Strong Access Controls

Poor access control is one of the surest ways to fail a SOC 2 audit. Make sure users have access only to the information they need to do their work, with the fewest privileges possible.

You can achieve this by:

Requiring multi-factor authentication before a user can access the organization's network.
Setting up role-based access control (RBAC).
Reviewing user activity logs to identify and address any suspicious or unauthorized behavior. This helps detect potential security threats and ensure that access controls are followed.
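
As an illustration of the RBAC idea, here is a minimal sketch in which permissions attach to roles and users only receive roles. The role names, permission strings, and the `is_allowed` helper are all hypothetical; a real system would normally rely on its identity provider or framework for this.

```python
# Permissions are granted to roles, never directly to individual users
ROLE_PERMISSIONS = {
    'support': {'tickets:read', 'tickets:write'},
    'engineer': {'deployments:read', 'deployments:write', 'logs:read'},
    'auditor': {'deployments:read', 'logs:read'},
}


def is_allowed(user_roles, permission: str) -> bool:
    """Allow an action only if one of the user's roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)


print(is_allowed(['auditor'], 'logs:read'))          # True
print(is_allowed(['auditor'], 'deployments:write'))  # False
print(is_allowed(['unknown-role'], 'logs:read'))     # False
```

Unknown roles grant nothing, so the default is to deny, which is the least-privilege behaviour described above.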

3. Continuously Monitor Your Systems

SOC 2 is not a one-time audit. It requires ongoing adherence to a set of controls. SOC 2 audits typically take place annually, but you can choose to conduct them more frequently, and you should review your security policies regularly. Periodic internal audits are also a useful litmus test of your security measures.

That means you need a process for monitoring your systems continuously. A security information and event management (SIEM) system can centralize and analyze security events and notify you of abnormal activity, system outages, or network slowdowns that could affect your compliance posture.

In addition to automated monitoring, you should schedule internal compliance audits from time to time to monitor your company’s compliance.

“We recommend organizations employ tools like vulnerability scanners, web application firewalls and penetration testing tools for scanning the organizational infrastructure for possible vulnerabilities,” says Jinson, a senior security researcher at Astra Security. These tools assist you in identifying risks beforehand, enabling you to mitigate them before they become major.

4. Document Everything

Documentation is one of the main pillars at the core of SOC 2 compliance. A comprehensive set of documents, including processes, security policies, and incident response plans, is essential for demonstrating compliance and providing auditors with the evidence they need.

By maintaining comprehensive documentation, you can ensure compliance with SOC 2 standards and reduce the risk of security breaches.

To keep this manageable:

Keep compliance documentation in a central repository so documents are easy to retrieve.
Make the documentation easy to update and easy to share with the right people.
Document changes made to the system, who requested access to which part of the system, and any security threats.

5. Prepare for Regular Audits

SOC 2 compliance cannot be handled with a "set it and forget it" approach. However demanding the initial setup was, you must stay ready for annual or other regular assessments.

The audit involves interviewing staff members, reviewing your company's security policies, and analyzing in detail how your business meets SOC 2 requirements. Evidence from security testing, such as DAST tools that identify vulnerabilities in your running applications, can support that analysis.

Keep at least one person or team conversant with the SOC 2 requirements.
Make sure that all employees are aware of their responsibilities in keeping the business compliant.
Pre-audit checks are a good idea. An initial check of your organization's policies gives you the chance to rectify any problems well before the audit.

6. Ensure Vendor Compliance

Third-party vendors that your company engages for goods or services are also expected to meet SOC 2 standards. If you work with cloud providers, data processors, or any other service that processes your sensitive data, you must ensure they are SOC 2 compliant.

You should require that your vendors share their compliance reports with you, or you can perform your own assessments of vendors. This helps ensure that they follow their own security measures and do not undermine yours.

7. Have an Incident Response Plan

However much you bake security into your daily practices and policies, accidents happen sometimes. That’s why it’s imperative to have a concise and clear incident response plan to help maintain SOC 2 compliance.

Security Incident: Methods and Practices for Protection

Determine in advance which people are responsible for managing an incident when one occurs.
Make sure you have steps in place for reporting and communicating breaches, both internally and externally.
Test the incident response plan frequently, and revise it based on lessons from incidents and audits.
Select a ransomware protection solution to reduce the risk of data breaches caused by malware or ransomware attacks. Options include Malwarebytes and Bitdefender, which prevent ransomware infections and recover encrypted files, and NAKIVO ransomware protection, which I personally use to protect data backups.

8. Employee Training and Awareness

No matter how sophisticated your security measures are, they are only as good as the people who operate them. Make data protection procedures part of employee training, including how to report an incident and what company regulations require. Remind employees about phishing scams, password strength, and other corporate security policies.

SOC 2 compliance is an ongoing effort across the organization, and everyone has a part to play. Training supports compliance in day-to-day business and also plays a critical role in ensuring a smooth audit process.

SOC 1 vs SOC 2

While both SOC 1 and SOC 2 are frameworks for assessing organizational controls, they focus on different aspects of an organization's operations. SOC 1 primarily focuses on the reliability of financial reporting, assessing an organization's internal controls related to financial information.

SOC 2, on the other hand, is broader in scope. It evaluates an organization's control over security, availability, processing integrity, confidentiality, and privacy. This is particularly important for organizations that handle sensitive customer data.

Feature	SOC 1	SOC 2
Focus	Internal controls over financial reporting	Controls over security, availability, processing integrity, confidentiality, and privacy
Audience	Management, auditors, financial stakeholders	Management, customers, auditors, and other stakeholders
Purpose	Assure reliable financial information	Assure data security and operational controls
Criteria	AICPA's SSAE No. 18	Trust Services Principles and Criteria
Scope	Financial reporting controls	Broader range of security and operational controls

Conclusion

In today’s data-driven world, earning and maintaining SOC 2 compliance is not just a box to tick but a strategic investment in your security and reputation.

Understanding the Trust Services Criteria, controlling access, monitoring systems, and preparing for audits are critical steps toward ensuring your organization passes its SOC 2 audit and is protected against data breaches.

This way, your clients are protected from insider threats, and your organization stays aligned with its security compliance obligations.
