<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ #gdpr - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ #gdpr - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Thu, 28 May 2026 20:56:29 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/gdpr/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ GDPR Article 32 for Software Engineers: Technical Controls, Implementations, and Auditor Questions ]]>
                </title>
                <description>
                    <![CDATA[ When I first read GDPR Article 32, I made a mistake. I thought it was a legal document. But it's not. It's an infrastructure specification. The regulation says you need "appropriate technical measures ]]>
                </description>
                <link>https://www.freecodecamp.org/news/gdpr-article-32-for-software-engineers-technical-controls-implementations-and-auditor-questions/</link>
                <guid isPermaLink="false">6a186b4960295e5547e0936d</guid>
                
                    <category>
                        <![CDATA[ #gdpr ]]>
                    </category>
                
                    <category>
                        <![CDATA[ compliance  ]]>
                    </category>
                
                    <category>
                        <![CDATA[ infrastructure ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Ayobami Adejumo ]]>
                </dc:creator>
                <pubDate>Thu, 28 May 2026 16:20:25 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/c73c68e8-7485-4993-a21f-84653ba29a10.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>When I first read GDPR Article 32, I made a mistake. I thought it was a legal document.</p>
<p>But it's not. It's an infrastructure specification.</p>
<p>The regulation says you need "appropriate technical measures" to protect personal data. That phrase is terrifying because it's vague. What does "appropriate" mean? What counts as a "technical measure"? Who decides whether you've done enough?</p>
<p>The compliance consultant will give you a 50-page policy document. The auditor will ignore it and ask for your database schema.</p>
<p>This guide is the middle ground. I've implemented Article 32 controls for 12 SaaS companies. The same nine controls appear every time. The same three auditor questions appear every time.</p>
<p>This is a complete guide to the 9 technical controls you must implement, the exact code and commands for each, and the questions your GDPR auditor will ask.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a href="#heading-what-youll-learn">What You'll Learn</a></p>
</li>
<li><p><a href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a href="#heading-part-1-understanding-article-32-the-technical-requirements">Part 1: Understanding Article 32</a></p>
</li>
<li><p><a href="#heading-part-2-article-321a-pseudonymisation-and-encryption">Part 2: Article 32(1)(a) — Pseudonymisation and Encryption</a></p>
</li>
<li><p><a href="#heading-part-3-article-321b-confidentiality-and-integrity">Part 3: Article 32(1)(b) — Confidentiality and Integrity</a></p>
</li>
<li><p><a href="#heading-part-4-article-321c-availability-and-resilience">Part 4: Article 32(1)(c) — Availability and Resilience</a></p>
</li>
<li><p><a href="#heading-part-5-article-321d-regular-testing">Part 5: Article 32(1)(d) — Regular Testing</a></p>
</li>
<li><p><a href="#heading-part-6-article-321d-penetration-testing">Part 6: Penetration Testing</a></p>
</li>
<li><p><a href="#heading-best-practices-for-gdpr-article-32-compliance">Best Practices Summary</a></p>
</li>
<li><p><a href="#heading-whats-next">What's Next</a></p>
</li>
<li><p><a href="#heading-resources">Resources</a></p>
</li>
</ul>
<h2 id="heading-what-youll-learn">What You'll Learn</h2>
<ul>
<li><p>The 9 technical controls required by GDPR Article 32(1)(a) through (d)</p>
</li>
<li><p>Exact PostgreSQL commands for pseudonymisation and field-level encryption</p>
</li>
<li><p>How to implement automatic logoff and unique user identification</p>
</li>
<li><p>Application-level audit logging that goes beyond CloudTrail</p>
</li>
<li><p>Integrity controls that prove data has not been altered</p>
</li>
<li><p>mTLS and TLS 1.3 for transmission security</p>
</li>
<li><p>The 5 auditor questions you must answer with evidence</p>
</li>
</ul>
<p>Let's dive in.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>Before following along, you should have:</p>
<p><strong>Knowledge:</strong></p>
<ul>
<li><p>Familiarity with PostgreSQL and basic SQL</p>
</li>
<li><p>Basic understanding of AWS services (KMS, RDS, CloudTrail)</p>
</li>
<li><p>Comfort reading Python and JavaScript/Node.js code</p>
</li>
<li><p>A working knowledge of what GDPR is — if you are starting from scratch, read the <a href="https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/">ICO's GDPR overview</a> first</p>
</li>
</ul>
<p><strong>Tools and access:</strong></p>
<ul>
<li><p>PostgreSQL 14 or later</p>
</li>
<li><p>An AWS account with IAM administrator access</p>
</li>
<li><p>Python 3.8 or later with <code>cryptography</code> library (<code>pip install cryptography</code>)</p>
</li>
<li><p>Node.js 16 or later</p>
</li>
<li><p>A compliance automation tool — <a href="https://vanta.com">Vanta</a> or <a href="https://onetrust.com">OneTrust</a> — is optional but recommended for evidence collection</p>
</li>
</ul>
<p><strong>Estimated time:</strong> The controls in this guide take 2–4 weeks to implement fully, depending on your existing infrastructure. Individual controls range from 30 minutes (KMS key setup) to 5 days (full application-layer encryption rollout).</p>
<h2 id="heading-part-1-understanding-article-32-the-technical-requirements">Part 1: Understanding Article 32 — The Technical Requirements</h2>
<h3 id="heading-11-what-article-32-actually-requires">1.1. What Article 32 Actually Requires</h3>
<p>Article 32 of the GDPR is titled "Security of processing." It requires controllers and processors to implement "appropriate technical and organisational measures" to ensure a level of security appropriate to the risk.</p>
<p>Here is the important distinction most teams miss: Article 32 is not a checklist of policies. A policy says "we encrypt personal data." Evidence says "here is the KMS key with automatic rotation, here is the application-layer encryption code, and here are the CloudTrail logs showing every decryption attempt." The auditor wants evidence, not documentation.</p>
<p><strong>The four main requirements:</strong></p>
<table>
<thead>
<tr>
<th>Section</th>
<th>Requirement</th>
<th>What It Means for Engineers</th>
</tr>
</thead>
<tbody><tr>
<td>32(1)(a)</td>
<td>Pseudonymisation and encryption</td>
<td>Personal data must be stored so it cannot be attributed to a specific data subject without additional information held separately</td>
</tr>
<tr>
<td>32(1)(b)</td>
<td>Confidentiality, integrity, availability, and resilience</td>
<td>Systems must protect data from unauthorised access, alteration, loss, and be able to recover from incidents</td>
</tr>
<tr>
<td>32(1)(c)</td>
<td>Restoring availability and access</td>
<td>You must be able to restore data and regain system access after a physical or technical incident</td>
</tr>
<tr>
<td>32(1)(d)</td>
<td>Regular testing and risk assessment</td>
<td>You must have a process for regularly testing and evaluating your security measures</td>
</tr>
</tbody></table>
<h3 id="heading-12-the-scope-question-what-data-is-covered">1.2. The Scope Question: What Data Is Covered?</h3>
<p>Before implementing any controls, you must know what data falls under Article 32. The regulation applies to personal data — any information that can identify a living individual directly or indirectly.</p>
<p><strong>Data types and their protection levels:</strong></p>
<table>
<thead>
<tr>
<th>Category</th>
<th>Examples</th>
<th>Protection Level</th>
</tr>
</thead>
<tbody><tr>
<td>Personal data</td>
<td>Name, email, phone, IP address</td>
<td>Standard</td>
</tr>
<tr>
<td>Special category data (Article 9)</td>
<td>Health data, biometric data, political opinions, religious beliefs</td>
<td>Enhanced</td>
</tr>
<tr>
<td>Pseudonymised data</td>
<td>Data where direct identifiers are replaced with a code</td>
<td>Standard</td>
</tr>
<tr>
<td>Anonymised data</td>
<td>Data that cannot be re-identified under any reasonable circumstances</td>
<td>Out of scope</td>
</tr>
</tbody></table>
<p><strong>The data mapping question your auditor will ask:</strong></p>
<blockquote>
<p>"Can you provide a data flow diagram showing where personal data enters your system, where it is stored, where it is processed, and how it is deleted?"</p>
</blockquote>
<p>Before the auditor asks, run this command to list every RDS instance in your current AWS region along with its encryption status. The call is region-scoped, so repeat it for each region you use:</p>
<pre><code class="language-bash"># List all RDS instances with their encryption status
# Any StorageEncrypted: false is a finding
aws rds describe-db-instances \
  --query 'DBInstances[*].{
    ID:DBInstanceIdentifier,
    Engine:Engine,
    StorageEncrypted:StorageEncrypted,
    AvailabilityZone:AvailabilityZone
  }' \
  --output table
</code></pre>
<p>Any instance showing <code>StorageEncrypted: false</code> must be addressed before your Article 32 audit.</p>
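<p>The same check can be scripted so the result goes straight into your audit evidence. This is a minimal sketch, assuming you have saved the full JSON output of <code>aws rds describe-db-instances</code>. The function name <code>unencrypted_instances</code> and the sample identifiers are illustrative and not part of any AWS tooling:</p>
<pre><code class="language-python">import json


def unencrypted_instances(describe_output):
    """Return identifiers of RDS instances whose storage is not encrypted.

    describe_output is the parsed JSON from `aws rds describe-db-instances`.
    """
    return [
        inst["DBInstanceIdentifier"]
        for inst in describe_output.get("DBInstances", [])
        if not inst.get("StorageEncrypted", False)
    ]


# Hypothetical sample shaped like the CLI's JSON output
sample = json.loads("""
{"DBInstances": [
  {"DBInstanceIdentifier": "production-db", "StorageEncrypted": true},
  {"DBInstanceIdentifier": "legacy-reporting", "StorageEncrypted": false}
]}
""")

print(unencrypted_instances(sample))  # ['legacy-reporting']
</code></pre>
<p>An empty list is the result you want to show the auditor. Anything else is your remediation backlog.</p>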
<h2 id="heading-part-2-article-321a-pseudonymisation-and-encryption">Part 2: Article 32(1)(a) — Pseudonymisation and Encryption</h2>
<h3 id="heading-21-how-to-implement-pseudonymisation-at-the-database-layer">2.1. How to Implement Pseudonymisation at the Database Layer</h3>
<p>Pseudonymisation replaces direct identifiers — names, email addresses, passport numbers — with a pseudonym or code. The goal is that the main working dataset cannot identify a data subject without access to a separately stored, separately protected lookup table.</p>
<p><strong>Here is the incorrect approach — direct identifiers in plaintext:</strong></p>
<pre><code class="language-sql">-- Bad: Direct identifiers stored in the main working table
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    full_name VARCHAR(255),       -- Direct identifier — should not be here
    email VARCHAR(255),           -- Direct identifier — should not be here
    passport_number VARCHAR(50)   -- Direct identifier — should not be here
);
</code></pre>
<p>This approach means any engineer, analyst, or attacker with SELECT access to the <code>users</code> table can immediately read and identify individuals. There is no separation between working data and identifying data.</p>
<p><strong>Here is the correct implementation with a separate identifiers table:</strong></p>
<pre><code class="language-sql">-- Good: Pseudonymised main table with a separate, restricted lookup table

-- Step 1: Main working table uses only the pseudonym
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    pseudonym UUID NOT NULL UNIQUE DEFAULT gen_random_uuid(),  -- Non-guessable pseudonym; UNIQUE is required by the foreign key in Step 2
    created_at TIMESTAMP DEFAULT NOW(),
    account_status VARCHAR(50)
    -- No direct identifiers here
);

-- Step 2: Identifier lookup table — kept separate, access restricted
CREATE TABLE user_identifiers (
    pseudonym UUID PRIMARY KEY,
    full_name VARCHAR(255),
    email VARCHAR(255),
    passport_number VARCHAR(50),
    FOREIGN KEY (pseudonym) REFERENCES users(pseudonym)
);

-- Step 3: Grant minimal, role-based access
GRANT SELECT ON users TO app_role;                              -- Application uses pseudonym only
GRANT SELECT, INSERT, UPDATE ON user_identifiers TO identity_service_role;  -- Only the identity service sees names
</code></pre>
<p><strong>What each part does:</strong></p>
<ul>
<li><p><code>gen_random_uuid()</code> creates a version-4 UUID pseudonym for each user — unpredictable and not reversible without the lookup table</p>
</li>
<li><p>The main <code>users</code> table is safe for analytics, reporting, and general application use without exposing any identifying information</p>
</li>
<li><p>Only the <code>identity_service_role</code> can join the two tables — this role is assigned only to the specific service that handles identity operations</p>
</li>
</ul>
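<p>The same separation can be applied at write time in the application, before anything reaches the database. This is a rough sketch in Python, not the schema's required companion code. The helper <code>split_record</code>, the <code>DIRECT_IDENTIFIERS</code> tuple, and the sample record are assumptions for illustration:</p>
<pre><code class="language-python">import uuid

# Columns that belong only in the restricted identifiers table
DIRECT_IDENTIFIERS = ("full_name", "email", "passport_number")


def split_record(record):
    """Split one incoming user record into a working row and an identifier row.

    Both rows share a random version-4 UUID pseudonym, mirroring
    gen_random_uuid() in the schema above.
    """
    pseudonym = str(uuid.uuid4())
    working_row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    identifier_row = {k: record[k] for k in DIRECT_IDENTIFIERS if k in record}
    working_row["pseudonym"] = pseudonym
    identifier_row["pseudonym"] = pseudonym
    return working_row, identifier_row


working, identifiers = split_record({
    "full_name": "Ada Example",
    "email": "ada@example.com",
    "passport_number": "X0000000",
    "account_status": "active",
})

print(sorted(working))       # column names only, no direct identifiers
print(sorted(identifiers))   # identifiers plus the shared pseudonym
</code></pre>
<p>The working row goes to <code>users</code> and the identifier row to <code>user_identifiers</code>, each written by the role that is allowed to touch that table.</p>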
<p><strong>The auditor question you will receive:</strong></p>
<blockquote>
<p>"How do you ensure that pseudonymised data cannot be re-identified by an unauthorised party?"</p>
</blockquote>
<p><strong>Your evidence:</strong></p>
<pre><code class="language-sql">-- Show that only the identity service role has access to the identifiers table
SELECT grantee, privilege_type, table_name
FROM information_schema.role_table_grants
WHERE table_name = 'user_identifiers';

-- Expected output: identity_service_role and the table owner, with no other grantees
</code></pre>
<h3 id="heading-22-how-to-implement-encryption-at-rest-with-customer-managed-keys">2.2. How to Implement Encryption at Rest with Customer-Managed Keys</h3>
<p>Storage-layer encryption protects data if someone physically steals the disk. But it does not protect against a privileged AWS employee, a compromised cloud administrator, or an authorised user with direct database access. Article 32 auditors know this distinction — and they will ask about it.</p>
<p><strong>Here is the incorrect approach — AWS-managed keys:</strong></p>
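<pre><code class="language-bash"># Bad: storage encryption enabled without --kms-key-id
# RDS falls back to the AWS managed key (alias aws/rds), whose policy and rotation you cannot change
aws rds create-db-instance \
  --db-instance-identifier production-db \
  --engine postgres \
  --db-instance-class db.t3.medium \
  --allocated-storage 100 \
  --master-username dbadmin \
  --manage-master-user-password \
  --storage-encrypted
</code></pre>
<p>The problem: when the auditor asks "who can use this key, and how do you control that?", you have little to show. You cannot edit the key policy of an AWS managed key, disable it, or change its rotation schedule.</p>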
<p><strong>Here is the correct implementation — customer-managed key with automatic rotation:</strong></p>
<pre><code class="language-bash"># Step 1: Create a customer-managed KMS key
KEY_ID=$(aws kms create-key \
  --origin AWS_KMS \
  --description "Customer-managed key for production PII — Article 32 compliant" \
  --tags TagKey=Purpose,TagValue=GDPR TagKey=Environment,TagValue=production \
  --query 'KeyMetadata.KeyId' \
  --output text)

echo "Created KMS key: $KEY_ID"

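# Step 2: Enable automatic rotation every 90 days
# (without --rotation-period-in-days, KMS rotates once a year)
aws kms enable-key-rotation --key-id "$KEY_ID" --rotation-period-in-days 90

# Step 3: Re-encrypt the production database under the new key
# RDS cannot change the storage key of an existing instance in place, so copy a
# snapshot under the new key and restore a replacement instance from the copy.
# The snapshot and instance names below are placeholders.
aws rds copy-db-snapshot \
  --source-db-snapshot-identifier production-db-snapshot \
  --target-db-snapshot-identifier production-db-snapshot-cmk \
  --kms-key-id "$KEY_ID"

aws rds wait db-snapshot-available \
  --db-snapshot-identifier production-db-snapshot-cmk

aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier production-db-cmk \
  --db-snapshot-identifier production-db-snapshot-cmk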
</code></pre>
<p><strong>The auditor question:</strong></p>
<blockquote>
<p>"Show me that your encryption keys are rotated automatically and that you can prove who has accessed them."</p>
</blockquote>
<p><strong>Your evidence:</strong></p>
<pre><code class="language-bash"># Verify rotation is enabled — expected output: true
aws kms get-key-rotation-status --key-id $KEY_ID \
  --query 'KeyRotationEnabled'

# Show the CloudTrail audit trail of every key usage event
aws logs filter-log-events \
  --log-group-name cloudtrail-logs \
  --filter-pattern '{ $.eventSource = "kms.amazonaws.com" }' \
  --query 'events[*].{Time:timestamp,Event:message}' \
  --output table
</code></pre>
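<p>Raw CloudTrail output is hard to read in an audit meeting. A short summary of who used the key, and for what, is easier to defend. This is a minimal sketch, assuming you have already parsed the CloudTrail records into Python dicts. The function name <code>kms_usage_by_principal</code> and the sample records are hypothetical, while <code>eventSource</code>, <code>eventName</code>, and <code>userIdentity.arn</code> are standard CloudTrail record fields:</p>
<pre><code class="language-python">from collections import Counter


def kms_usage_by_principal(records):
    """Count KMS API calls per (principal ARN, event name).

    records is a list of CloudTrail event records (dicts).
    Records from other services are ignored.
    """
    counts = Counter()
    for rec in records:
        if rec.get("eventSource") != "kms.amazonaws.com":
            continue
        principal = rec.get("userIdentity", {}).get("arn", "unknown")
        counts[(principal, rec.get("eventName", "unknown"))] += 1
    return counts


# Hypothetical records; real ones come from the CloudTrail log messages
APP_ROLE = "arn:aws:sts::123456789012:assumed-role/app-role/session-1"
records = [
    {"eventSource": "kms.amazonaws.com", "eventName": "Decrypt",
     "userIdentity": {"arn": APP_ROLE}},
    {"eventSource": "kms.amazonaws.com", "eventName": "Decrypt",
     "userIdentity": {"arn": APP_ROLE}},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "userIdentity": {"arn": APP_ROLE}},
]

for (principal, event), n in kms_usage_by_principal(records).items():
    print(principal, event, n)
</code></pre>
<p>Any principal in the summary that you cannot map to a known service or person is worth investigating before the auditor finds it.</p>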
<h3 id="heading-23-how-to-implement-application-layer-encryption-for-sensitive-fields">2.3. How to Implement Application-Layer Encryption for Sensitive Fields</h3>
<p>Storage encryption is the floor. Application-layer encryption is the ceiling that Article 32 auditors are increasingly expecting for health data, financial records, and other sensitive personal data.</p>
<p>Here is the difference: with storage encryption only, a database administrator who runs <code>SELECT email FROM users</code> sees the plaintext email address. With application-layer encryption, they see <code>gAAAAABm...</code> — an encrypted byte string that only the application (with access to the Vault key) can decrypt.</p>
<pre><code class="language-python"># application_encryption.py
from cryptography.fernet import Fernet

class FieldEncryption:
    """
    Encrypts sensitive personal data fields before they are stored in the database.
    The encryption key is stored in HashiCorp Vault or AWS Secrets Manager — never in code.
    A database administrator with direct SQL access sees only encrypted bytes.
    """

    def __init__(self, key: str):
        # key must be a 32-byte base64-encoded string — retrieve from Vault
        self.cipher = Fernet(key.encode())

    def encrypt_field(self, plaintext: str) -&gt; str:
        """Encrypt a sensitive field before writing to the database."""
        if not plaintext:
            return None
        encrypted_bytes = self.cipher.encrypt(plaintext.encode())
        return encrypted_bytes.decode()

    def decrypt_field(self, ciphertext: str) -&gt; str:
        """
        Decrypt a field when legitimately needed by the application.
        This method requires the Vault key — database admins cannot call it.
        """
        if not ciphertext:
            return None
        decrypted_bytes = self.cipher.decrypt(ciphertext.encode())
        return decrypted_bytes.decode()


# Usage in your application:
from vault_client import get_secret  # Your Vault or Secrets Manager client

# Retrieve the encryption key at application startup — never hardcode it
encryption_key = get_secret("gdpr/field-encryption-key")
encryptor = FieldEncryption(encryption_key)

# Before storing a user's health record
user.health_data_encrypted = encryptor.encrypt_field(user.health_data_plaintext)

# Before reading for a legitimate purpose (subject access request, etc.)
health_data = encryptor.decrypt_field(user.health_data_encrypted)
</code></pre>
<p><strong>The auditor question:</strong></p>
<blockquote>
<p>"If a database administrator queries the users table directly, can they read customer health data in plaintext?"</p>
</blockquote>
<p><strong>Your evidence:</strong> Run a direct database query and show the auditor the encrypted output. Then demonstrate that the decryption key is not accessible to database administrators — it is retrieved only by the application through Vault.</p>
<h2 id="heading-part-3-article-321b-confidentiality-and-integrity">Part 3: Article 32(1)(b) — Confidentiality and Integrity</h2>
<h3 id="heading-31-how-to-implement-automatic-logoff">3.1. How to Implement Automatic Logoff</h3>
<p>Article 32(1)(b) requires ongoing confidentiality, and Article 32(2) names "unauthorised disclosure of, or access to personal data" among the risks you must account for. A session that never expires, or expires only after 24 hours, is an access control gap. A user who logs in on a shared machine and walks away has left an open door.</p>
<p><strong>Here is the incorrect approach — a 24-hour JWT session:</strong></p>
<pre><code class="language-javascript">// Bad: 24-hour access token with no inactivity check
const token = jwt.sign(
  { userId: user.id, role: user.role },
  process.env.JWT_SECRET,
  { expiresIn: '24h' }  // Too long — violates Article 32 intent
);
</code></pre>
<p>The problem: if a user logs in on a shared laptop and closes the lid without logging out, the session remains valid for up to 24 hours. Anyone who opens that laptop can access personal data.</p>
<p><strong>Here is the correct implementation — a 15-minute access token with a rolling refresh:</strong></p>
<pre><code class="language-javascript">// Good: Short-lived access token with rolling refresh via HTTP-only cookie

// Access token — valid for 15 minutes of activity
const accessToken = jwt.sign(
  { userId: user.id, role: user.role, type: 'access' },
  process.env.JWT_ACCESS_SECRET,
  { expiresIn: '15m' }
);

// Refresh token — valid for 8 hours total session duration
const refreshToken = jwt.sign(
  { userId: user.id, type: 'refresh' },
  process.env.JWT_REFRESH_SECRET,
  { expiresIn: '8h' }
);

// Set refresh token as HTTP-only cookie — not accessible to JavaScript
res.cookie('refreshToken', refreshToken, {
  httpOnly: true,    // Prevents XSS access
  secure: true,      // HTTPS only
  sameSite: 'strict', // Prevents CSRF
  maxAge: 8 * 60 * 60 * 1000  // 8 hours in milliseconds
});

// Session middleware that enforces absolute timeout
const MAX_TOTAL_SESSION_MS = 8 * 60 * 60 * 1000; // 8 hours

app.use((req, res, next) =&gt; {
  if (!req.session?.createdAt) return next();

  const sessionAge = Date.now() - req.session.createdAt;
  if (sessionAge &gt; MAX_TOTAL_SESSION_MS) {
    req.session.destroy();
    return res.status(401).json({
      error: 'Session expired after 8 hours. Please log in again.'
    });
  }
  next();
});
</code></pre>
<p><strong>The auditor question:</strong></p>
<blockquote>
<p>"Show me that your application terminates inactive sessions after a reasonable period."</p>
</blockquote>
<p><strong>Your evidence:</strong> A browser developer tools screenshot showing the cookie expiration time, plus a test recording showing that after 15 minutes of inactivity the user is presented with a re-authentication prompt.</p>
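<p>The middleware above enforces only the 8-hour absolute limit. The 15-minute inactivity behaviour in the evidence also depends on the server tracking each session's last activity and refusing to refresh an idle one. Here is a minimal sketch of that decision, written in Python for brevity. The function <code>session_status</code>, its constants, and the timestamps are assumptions for illustration:</p>
<pre><code class="language-python">IDLE_TIMEOUT_S = 15 * 60          # 15 minutes without activity
ABSOLUTE_TIMEOUT_S = 8 * 60 * 60  # 8 hours since login


def session_status(created_at, last_activity_at, now):
    """Classify a session given three Unix timestamps in seconds."""
    # The absolute limit wins: no amount of activity extends a session past it
    if now - created_at &gt; ABSOLUTE_TIMEOUT_S:
        return "expired_absolute"
    # Otherwise the session dies once the user has been idle too long
    if now - last_activity_at &gt; IDLE_TIMEOUT_S:
        return "expired_idle"
    return "active"


login = 1_700_000_000
print(session_status(login, login + 60, login + 120))              # active
print(session_status(login, login + 60, login + 60 + 16 * 60))     # expired_idle
print(session_status(login, login + 8 * 3600, login + 8 * 3600 + 1))  # expired_absolute
</code></pre>
<p>The refresh endpoint would call a check like this before issuing a new access token, and update the last-activity timestamp only on real user requests.</p>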
<h3 id="heading-32-how-to-implement-unique-user-identification-with-irsa">3.2. How to Implement Unique User Identification with IRSA</h3>
<p>Article 32(1)(b) requires that you can identify who accessed personal data. Shared service accounts make this impossible — the audit log shows <code>data-export-service</code> but you cannot tell which engineer triggered the export.</p>
<p><strong>Here is the incorrect approach — a shared service account:</strong></p>
<pre><code class="language-yaml"># Bad: One shared Kubernetes service account used by multiple engineers and pipelines
apiVersion: v1
kind: ServiceAccount
metadata:
  name: data-export           # Three engineers and two pipelines share this identity
  namespace: production
</code></pre>
<p>When an audit log shows <code>data-export performed a bulk user export at 03:17 UTC</code>, you cannot answer the auditor's question: "who authorised this?"</p>
<p><strong>Here is the correct implementation — IAM Roles for Service Accounts (IRSA):</strong></p>
<pre><code class="language-bash"># Step 1: Create a separate IAM role for each service identity
# This command creates a role that can only be assumed by the 'payment-service'
# Kubernetes service account in the 'production' namespace

aws iam create-role \
  --role-name eks-payment-service-role \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/YOUR_OIDC_ID"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-east-1.amazonaws.com/id/YOUR_OIDC_ID:sub":
            "system:serviceaccount:production:payment-service"
        }
      }
    }]
  }'
</code></pre>
<pre><code class="language-yaml"># Step 2: Annotate the Kubernetes service account with its unique IAM role
apiVersion: v1
kind: ServiceAccount
metadata:
  name: payment-service          # One service account, one service, one role
  namespace: production
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/eks-payment-service-role
</code></pre>
<p>Every AWS API call from <code>payment-service</code> now appears in CloudTrail as <code>eks-payment-service-role</code> — a unique, traceable identity. No shared accounts. No ambiguous audit logs.</p>
<p><strong>The auditor question:</strong></p>
<blockquote>
<p>"How do you ensure that every action on personal data can be attributed to a specific individual or service?"</p>
</blockquote>
<p><strong>Your evidence:</strong></p>
<pre><code class="language-bash"># Verify no shared service accounts exist — every account should have a unique role annotation
kubectl get serviceaccounts --all-namespaces \
  -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}: {.metadata.annotations.eks\.amazonaws\.com/role-arn}{"\n"}{end}'
</code></pre>
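<p>The raw listing still has to be read by a human. A small script can flag the two patterns an auditor cares about: service accounts with no IAM role, and one role shared by several accounts. This is a sketch, assuming you have the parsed output of <code>kubectl get serviceaccounts --all-namespaces -o json</code>. The function name <code>audit_service_accounts</code> and the sample data are hypothetical. Note that an account with no role is only a finding if that workload actually calls AWS:</p>
<pre><code class="language-python">from collections import defaultdict

ROLE_ANNOTATION = "eks.amazonaws.com/role-arn"


def audit_service_accounts(sa_list):
    """Return (accounts without an IAM role, role ARNs shared by several accounts)."""
    missing = []
    by_role = defaultdict(list)
    for item in sa_list.get("items", []):
        meta = item["metadata"]
        name = f'{meta["namespace"]}/{meta["name"]}'
        role = (meta.get("annotations") or {}).get(ROLE_ANNOTATION)
        if role is None:
            missing.append(name)
        else:
            by_role[role].append(name)
    # Every list holds at least one name, so "not exactly one" means shared
    shared = {role: names for role, names in by_role.items() if len(names) != 1}
    return missing, shared


# Hypothetical sample shaped like the kubectl JSON output
sample = {"items": [
    {"metadata": {"namespace": "production", "name": "payment-service",
                  "annotations": {ROLE_ANNOTATION:
                                  "arn:aws:iam::123456789012:role/eks-payment-service-role"}}},
    {"metadata": {"namespace": "production", "name": "data-export"}},
]}

missing, shared = audit_service_accounts(sample)
print("No IAM role:", missing)
print("Shared roles:", shared)
</code></pre>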
<h2 id="heading-part-4-article-321c-availability-and-resilience">Part 4: Article 32(1)(c) — Availability and Resilience</h2>
<h3 id="heading-41-how-to-implement-multi-az-and-backup-requirements">4.1. How to Implement Multi-AZ and Backup Requirements</h3>
<p>Article 32(1)(c) requires "the ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident." This is not a suggestion. It is a legal requirement. If your database runs in a single Availability Zone with no tested backups, one AZ networking event can leave you unable to meet it.</p>
<p><strong>Here is the incorrect approach — single-AZ RDS with no automated backups:</strong></p>
<pre><code class="language-hcl"># Bad: Single-AZ RDS — one networking event makes personal data unavailable
resource "aws_db_instance" "production" {
  identifier              = "production-database"
  multi_az                = false   # No automatic failover
  backup_retention_period = 0       # No automated backups — Article 32 violation
}
</code></pre>
<p>If the Availability Zone has a networking issue, the database is unreachable. If the instance is corrupted, there are no backups to restore. Both scenarios violate Article 32(1)(c).</p>
<p><strong>Here is the correct implementation — Multi-AZ with tested automated backups:</strong></p>
<pre><code class="language-hcl"># Good: Multi-AZ RDS with 30-day backup retention
resource "aws_db_instance" "production" {
  identifier = "production-database"

  # Multi-AZ creates a synchronous standby replica in a different AZ
  # Automatic failover typically completes in 60-120 seconds with no data loss
  multi_az = true

  # 30-day backup retention — gives you recovery point flexibility
  backup_retention_period = 30
  backup_window           = "03:00-04:00"  # Low-traffic window for backup

  # Copy all tags to snapshots for compliance tracking
  copy_tags_to_snapshot = true

  # Performance Insights for monitoring query health
  performance_insights_enabled          = true
  performance_insights_retention_period = 7

  tags = {
    Environment        = "production"
    DataClassification = "personal-data"
    GDPRScope          = "article32"
  }
}
</code></pre>
<p><strong>How to test your RTO and RPO monthly:</strong></p>
<pre><code class="language-bash"># Step 1: Find your most recent automated snapshot
SNAPSHOT_ID=$(aws rds describe-db-snapshots \
  --db-instance-identifier production-database \
  --snapshot-type automated \
  --query 'sort_by(DBSnapshots, &amp;SnapshotCreateTime)[-1].DBSnapshotIdentifier' \
  --output text)

echo "Testing restore of snapshot: $SNAPSHOT_ID"

# Step 2: Start the restore — measure the time
START_TIME=$(date +%s)

aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier gdpr-restore-test \
  --db-snapshot-identifier $SNAPSHOT_ID \
  --db-instance-class db.t3.medium \
  --no-publicly-accessible \
  --tags Key=Purpose,Value=gdpr-rto-test Key=DeleteAfter,Value=$(date -d '+1 day' +%Y-%m-%d)

# Step 3: Wait for restore to complete
aws rds wait db-instance-available \
  --db-instance-identifier gdpr-restore-test

END_TIME=$(date +%s)
RTO_SECONDS=$((END_TIME - START_TIME))
echo "Restore completed in $((RTO_SECONDS / 60)) minutes"

# Step 4: Verify data integrity with a spot check
# Connect to the restored instance and verify record counts match production
# psql -h RESTORED_ENDPOINT -U admin -d production \
#   -c "SELECT COUNT(*) FROM users; SELECT MAX(created_at) FROM orders;"

# Step 5: Delete the test instance
aws rds delete-db-instance \
  --db-instance-identifier gdpr-restore-test \
  --skip-final-snapshot
</code></pre>
<p><strong>The auditor question:</strong></p>
<blockquote>
<p>"What is your Recovery Time Objective and Recovery Point Objective for personal data? When did you last test it?"</p>
</blockquote>
<p><strong>Your evidence:</strong> A documented monthly DR test log showing: snapshot used, restore start time, restore completion time, data verification query results, and the engineer who conducted the test.</p>
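<p>One way to keep that log consistent is to generate each entry from the timestamps the restore script already captures. This is a minimal sketch. The function <code>dr_test_record</code>, its field names, the 60-minute target, and the sample values are assumptions for illustration, not a prescribed log format:</p>
<pre><code class="language-python">from datetime import datetime, timezone


def dr_test_record(snapshot_id, snapshot_time, restore_start, restore_end,
                   rto_target_minutes, engineer):
    """Build one DR test log entry from timezone-aware datetimes."""
    rto_minutes = (restore_end - restore_start).total_seconds() / 60
    # Data written after the snapshot would be lost if this snapshot were the
    # only recovery point, so this gap is the worst-case data loss window.
    data_loss_window_minutes = (restore_start - snapshot_time).total_seconds() / 60
    return {
        "snapshot_id": snapshot_id,
        "restore_start": restore_start.isoformat(),
        "restore_end": restore_end.isoformat(),
        "rto_minutes": round(rto_minutes, 1),
        "data_loss_window_minutes": round(data_loss_window_minutes, 1),
        "rto_met": rto_minutes &lt;= rto_target_minutes,
        "engineer": engineer,
    }


# Hypothetical test run
record = dr_test_record(
    snapshot_id="rds:production-database-2026-05-01-03-30",
    snapshot_time=datetime(2026, 5, 1, 3, 30, tzinfo=timezone.utc),
    restore_start=datetime(2026, 5, 1, 9, 0, tzinfo=timezone.utc),
    restore_end=datetime(2026, 5, 1, 9, 42, tzinfo=timezone.utc),
    rto_target_minutes=60,
    engineer="on-call engineer",
)
print(record)
</code></pre>
<p>Append each record to a write-once location so the history itself cannot be edited after the fact.</p>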
<h2 id="heading-part-5-article-321d-regular-testing">Part 5: Article 32(1)(d) — Regular Testing</h2>
<h3 id="heading-51-how-to-implement-automated-vulnerability-scanning">5.1. How to Implement Automated Vulnerability Scanning</h3>
<p>Article 32(1)(d) requires "a process for regularly testing, assessing and evaluating the effectiveness of technical and organisational measures." This includes automated vulnerability scanning of every container image before it reaches production.</p>
<p><strong>Here is the incorrect approach — no scanning in the deployment pipeline:</strong></p>
<pre><code class="language-yaml"># Bad: No vulnerability scanning — a critical CVE in the base image deploys undetected
name: Deploy
on: [push]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: docker build -t myapp .
      - run: docker push myapp  # Deploys without any security check
</code></pre>
<p>If a critical CVE is present in the base image (such as a remote code execution vulnerability in OpenSSL), it goes straight to production. Under Article 32(1)(d), this is a finding.</p>
<p><strong>Here is the correct implementation — Trivy scanning with pipeline enforcement:</strong></p>
<pre><code class="language-yaml"># Good: Trivy scans every image — CRITICAL/HIGH CVEs block the deployment
name: Security Scan and Deploy
on: [push, pull_request]

jobs:
  trivy-scan:
    name: Container Vulnerability Scan
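    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write   # Required to upload SARIF results to the Security tab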
    steps:
      - uses: actions/checkout@v3

      - name: Build container image
        run: docker build -t myapp:${{ github.sha }} .

      - name: Scan for vulnerabilities with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: 'myapp:${{ github.sha }}'
          format: 'sarif'
          output: 'trivy-results.sarif'
          severity: 'CRITICAL,HIGH'
          exit-code: '1'         # Fail the pipeline — image cannot deploy with CRITICAL/HIGH CVEs

      - name: Upload scan results to GitHub Security tab
        uses: github/codeql-action/upload-sarif@v3
        if: always()             # Upload results even if scan failed, for review
        with:
          sarif_file: 'trivy-results.sarif'
</code></pre>
<p>Trivy scans for:</p>
<ul>
<li><p>CVEs in the base image OS packages (for example, a critical OpenSSL vulnerability in your Ubuntu base)</p>
</li>
<li><p>Vulnerable versions of application dependencies (a known exploit in an npm or pip package your application uses)</p>
</li>
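<li><p>Misconfigurations in the Dockerfile (running as root, using <code>latest</code> tag instead of a pinned SHA), if you add a separate step with <code>scan-type: 'config'</code>; the image scan above does not read the Dockerfile</p>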
</li>
</ul>
<p>Results appear in the GitHub Security tab, creating a timestamped, searchable history of every scan. That history is your Article 32(1)(d) evidence.</p>
<p><strong>How to run a weekly Amazon Inspector assessment for running workloads:</strong></p>
<pre><code class="language-bash"># List all active CRITICAL findings across your AWS account
aws inspector2 list-findings \
  --filter-criteria '{
    "severity": [{"comparison": "EQUALS", "value": "CRITICAL"}],
    "findingStatus": [{"comparison": "EQUALS", "value": "ACTIVE"}]
  }' \
  --query 'findings[*].{
    Title:title,
    Resource:resources[0].id,
    Severity:severity,
    CVE:packageVulnerabilityDetails.vulnerabilityId
  }' \
  --output table
</code></pre>
<p><strong>The auditor question:</strong></p>
<blockquote>
<p>"Show me your vulnerability management programme, including how you prioritise and remediate findings."</p>
</blockquote>
<p><strong>Your evidence:</strong> A weekly vulnerability report — generated automatically from the above command — showing active findings, severity, the GitHub issue created for each finding, and the closure date once remediated.</p>
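<p>As a rough illustration of that weekly report, the sketch below turns exported findings into report rows with placeholders for the tracking issue and closure date. It assumes the command above was run with <code>--output json</code>, so each finding has the <code>Title</code>, <code>Resource</code>, <code>Severity</code>, and <code>CVE</code> keys from the <code>--query</code> projection. The sample data, the <code>buildWeeklyReport</code> function, and the row field names are hypothetical.</p>
<pre><code class="language-javascript">// Hypothetical sample, shaped like the --query projection above
// (Title, Resource, Severity, CVE) when run with --output json.
const sampleFindings = [
  { Title: "Example finding A", Resource: "example-resource-1", Severity: "CRITICAL", CVE: "EXAMPLE-CVE-1" },
  { Title: "Example finding B", Resource: "example-resource-2", Severity: "CRITICAL", CVE: "EXAMPLE-CVE-2" }
];

// Build one report object: a count per severity plus one row per finding.
function buildWeeklyReport(findings, generatedOn) {
  const bySeverity = {};
  const rows = [];
  for (const finding of findings) {
    bySeverity[finding.Severity] = (bySeverity[finding.Severity] || 0) + 1;
    rows.push({
      cve: finding.CVE,
      title: finding.Title,
      resource: finding.Resource,
      severity: finding.Severity,
      issueUrl: null, // fill in once the tracking issue exists
      closedOn: null  // fill in once the finding is remediated
    });
  }
  return { generatedOn: generatedOn, total: findings.length, bySeverity: bySeverity, rows: rows };
}

const report = buildWeeklyReport(sampleFindings, "2025-01-06");
console.log(JSON.stringify(report, null, 2));
</code></pre>
<p>Storing each week's output alongside the previous ones gives you the dated history an auditor will ask for.</p>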
<h2 id="heading-part-6-article-321d-penetration-testing">Part 6: Article 32(1)(d) — Penetration Testing</h2>
<h3 id="heading-61-why-automated-scanning-is-not-enough">6.1. Why Automated Scanning Is Not Enough</h3>
<p>Article 32(1)(d) requires evaluating the effectiveness of security measures. Automated vulnerability scanners find known CVEs in libraries and OS packages. They cannot find:</p>
<ul>
<li><p>Business logic vulnerabilities (an API endpoint that returns another user's data when given a specific parameter)</p>
</li>
<li><p>Authentication bypasses (a JWT implementation that accepts unsigned tokens)</p>
</li>
<li><p>Privilege escalation paths (an attacker can move from a low-privilege role to admin through a sequence of legitimate API calls)</p>
</li>
<li><p>Insecure direct object references (accessing <code>/api/users/124</code> instead of <code>/api/users/123</code> returns data for a different customer)</p>
</li>
</ul>
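<p>To make the last item concrete, here is a minimal sketch of the ownership check whose absence creates an insecure direct object reference. A scanner has no way to know this check is missing, because the code contains no vulnerable library, only a missing rule. The <code>users</code> map, the <code>getUser</code> function, and the status-code return shape are hypothetical and not tied to any framework.</p>
<pre><code class="language-javascript">// Hypothetical in-memory store standing in for a real database.
const users = new Map([
  ["123", { id: "123", email: "customer-a@example.com" }],
  ["124", { id: "124", email: "customer-b@example.com" }]
]);

// sessionUserId comes from the authenticated session,
// requestedUserId comes from the URL (/api/users/:id).
function getUser(sessionUserId, requestedUserId) {
  // The ownership check: without it, any logged-in user can read any record.
  if (sessionUserId !== requestedUserId) {
    return { status: 403, body: null };
  }
  const user = users.get(requestedUserId);
  if (user === undefined) {
    return { status: 404, body: null };
  }
  return { status: 200, body: user };
}

console.log(getUser("123", "123").status); // own record
console.log(getUser("123", "124").status); // another customer's record
</code></pre>
<p>A manual tester probes exactly this path: log in as user 123, request <code>/api/users/124</code>, and see whether the response is refused.</p>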
<p>The ICO (UK Information Commissioner's Office) and the CNIL (France's data protection authority) both state in their guidance that annual manual penetration testing is expected for organisations processing significant volumes of personal data.</p>
<p><strong>What an acceptable pen test scope looks like:</strong></p>
<pre><code class="language-markdown"># Annual Penetration Test Scope — Article 32 Compliance

## Testing Period
Start: 2025-04-01  
End: 2025-04-14  
Testing firm: [Accredited firm — CREST or CHECK certified]

## In Scope
- Production web application: https://app.yourcompany.com
- Production API: https://api.yourcompany.com/v1/*
- Authentication flows: OAuth2, JWT, session management
- Data stores: PostgreSQL (via application access only, not direct DB access)
- AWS account: External reconnaissance of public-facing services only

## Testing Types
- External infrastructure testing (all public IP ranges)
- Web application testing (OWASP Top 10 2021)
- API security testing (all authenticated and unauthenticated endpoints)
- Authentication and session management testing
- GDPR-specific test cases (data subject rights endpoints, consent flows)

## Remediation SLAs
- CRITICAL: 24 hours from report delivery
- HIGH: 7 calendar days
- MEDIUM: 30 calendar days
- LOW: 90 calendar days
</code></pre>
<p><strong>How to track and evidence remediation:</strong></p>
<pre><code class="language-bash"># Create a GitHub issue for each finding on receipt of the pen test report.
# This creates a traceable record of every finding and its resolution.
# Assumes each line of pentest-report-findings.txt is "finding_id,severity,sla",
# for example "F-01,HIGH,7 days".

while IFS=, read -r finding_id severity sla; do
  gh issue create \
    --title "Pen test finding: $finding_id" \
    --body "See pentest-report-2025-04.pdf, section $finding_id. Severity: $severity. SLA: $sla." \
    --label "security,pentest" \
    --assignee "security-lead"   # placeholder GitHub login
done &lt; pentest-report-findings.txt
</code></pre>
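<p>The remediation SLAs from the scope document can also be turned into concrete due dates for each issue. The following is a minimal sketch, assuming the clock starts at report delivery and that a calendar day is counted as 24 hours. The <code>SLA_HOURS</code> table and <code>remediationDueDate</code> function are illustrative names, and the delivery timestamp is a made-up example.</p>
<pre><code class="language-javascript">// SLA per severity in hours, taken from the Remediation SLAs section above.
const SLA_HOURS = { CRITICAL: 24, HIGH: 7 * 24, MEDIUM: 30 * 24, LOW: 90 * 24 };

// Returns the date by which a finding of the given severity must be closed.
function remediationDueDate(severity, reportDeliveredAt) {
  const hours = SLA_HOURS[severity];
  if (hours === undefined) {
    throw new Error("Unknown severity: " + severity);
  }
  return new Date(reportDeliveredAt.getTime() + hours * 60 * 60 * 1000);
}

// Hypothetical delivery time for the pen test report.
const delivered = new Date("2025-04-15T09:00:00Z");
const criticalDue = remediationDueDate("CRITICAL", delivered).toISOString();
const highDue = remediationDueDate("HIGH", delivered).toISOString();
console.log(criticalDue); // 2025-04-16T09:00:00.000Z
console.log(highDue);     // 2025-04-22T09:00:00.000Z
</code></pre>
<p>Writing the due date into the issue body makes it easy to show later that each finding was closed inside its SLA.</p>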
<p><strong>The auditor question:</strong></p>
<blockquote>
<p>"When was your last penetration test? Show me the report and your remediation evidence."</p>
</blockquote>
<p><strong>Your evidence:</strong></p>
<ol>
<li><p>The penetration test report from a CREST or CHECK certified firm, dated within the last 12 months</p>
</li>
<li><p>A remediation tracker (GitHub issues or Jira) showing every CRITICAL and HIGH finding with a closure date</p>
</li>
<li><p>Evidence that all CRITICAL findings were closed within 24 hours (the git commit or deployment log)</p>
</li>
</ol>
<h2 id="heading-best-practices-for-gdpr-article-32-compliance">Best Practices for GDPR Article 32 Compliance</h2>
<p>Here are the key takeaways from this guide:</p>
<p>✅ <strong>Do:</strong> Implement application-layer encryption for sensitive fields. Storage encryption alone is not enough — a DBA with direct database access can still read plaintext.</p>
<p>✅ <strong>Do:</strong> Use customer-managed KMS keys with automatic rotation. You need to prove control over the key material.</p>
<p>✅ <strong>Do:</strong> Store pseudonymised data separately from identifiers, with restricted role-based access to the lookup table.</p>
<p>✅ <strong>Do:</strong> Enforce automatic logoff after 15 minutes of inactivity with an 8-hour absolute session limit.</p>
<p>✅ <strong>Do:</strong> Use unique service accounts with IRSA. Every action on personal data must be attributable to a specific identity.</p>
<p>✅ <strong>Do:</strong> Test your backups monthly. Document RTO and RPO with actual restore test results.</p>
<p>✅ <strong>Do:</strong> Run Trivy in CI to block CRITICAL and HIGH CVEs before deployment.</p>
<p>✅ <strong>Do:</strong> Conduct an annual manual penetration test from a CREST or CHECK certified firm.</p>
<p>❌ <strong>Don't:</strong> Use 24-hour JWT sessions or sessions with no inactivity timeout.</p>
<p>❌ <strong>Don't:</strong> Store secrets in environment variables, .env files, or hardcoded in source code.</p>
<p>❌ <strong>Don't:</strong> Skip the annual penetration test. An auditor from the ICO or CNIL will not accept "we run automated scans" as a substitute.</p>
<p>❌ <strong>Don't:</strong> Use AWS-managed KMS keys if you need to prove key material control to your auditor.</p>
<h2 id="heading-resources">Resources</h2>
<ul>
<li><p><a href="https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/security/"><strong>ICO Guide to GDPR Article 32</strong></a> — The UK Information Commissioner's Office official guidance on Article 32 security obligations</p>
</li>
<li><p><a href="https://www.enisa.europa.eu/publications/guidelines-for-smes-on-the-security-of-personal-data-processing"><strong>ENISA Guidelines on Article 32</strong></a> — The EU Agency for Cybersecurity's SME guidelines on personal data security</p>
</li>
<li><p><a href="https://github.com/aquasecurity/trivy"><strong>Trivy by Aqua Security</strong></a> — Open-source container vulnerability scanner used in Part 5</p>
</li>
<li><p><a href="https://owasp.org/Top10/"><strong>OWASP Top 10 2021</strong></a> — The standard reference for web application security risks, used in pen test scoping</p>
</li>
<li><p><a href="https://docs.aws.amazon.com/kms/latest/developerguide/rotate-keys.html"><strong>AWS KMS Key Rotation Documentation</strong></a> — Official AWS documentation for automatic key rotation</p>
</li>
<li><p><a href="https://www.postgresql.org/docs/current/ddl-rowsecurity.html"><strong>PostgreSQL Row Security Policies</strong></a> — How to implement row-level security for granular access control on pseudonymised data</p>
</li>
<li><p><a href="https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html"><strong>EKS IAM Roles for Service Accounts (IRSA)</strong></a> — Official AWS documentation for unique service account identity on EKS</p>
</li>
<li><p><a href="https://www.crest-approved.org/members/certified-companies/"><strong>CREST Certified Testing Firms</strong></a> — Directory of CREST-certified penetration testing firms for your annual Article 32 assessment</p>
</li>
</ul>
<p><a href="https://github.com/aayostem">Ayobami Adejumo</a> is a senior platform engineer and compliance infrastructure specialist. He writes about GDPR engineering controls, SOC2 implementation, and FinOps (cloud cost optimization).</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Stay GDPR Compliant with Access Logs ]]>
                </title>
                <description>
                    <![CDATA[ By Yuli Stremovsky Privacy is a complicated topic. A well-known method used to save application logs turned out to be tricky with the new privacy regulations. In fact, new regulations define an IP address as a personal identifier. Like other user ide... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-stay-gdpr-compliant-with-access-logs/</link>
                <guid isPermaLink="false">66d46171a326133d12440a92</guid>
                
                    <category>
                        <![CDATA[ cybersecurity ]]>
                    </category>
                
                    <category>
                        <![CDATA[ #gdpr ]]>
                    </category>
                
                    <category>
                        <![CDATA[ privacy ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Fri, 08 Jan 2021 16:24:38 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2021/01/privacy.jpg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Yuli Stremovsky</p>
<p>Privacy is a complicated topic. A well-known method used to save application logs turned out to be tricky with the new privacy regulations. In fact, new regulations define an IP address as a personal identifier. Like other user identifiers, it should be treated with caution.</p>
<p>In this article, I will cover a few methods to make your logging privacy-friendly. </p>
<p>First, I will teach you basic <strong>GDPR terms</strong>: <strong>PII</strong> and <strong>forget-me user right</strong>. After that, we will cover methods to make web or application server logs GDPR ready. </p>
<p>Then I will talk about an open-source product I am developing called <strong><a target="_blank" href="http://databunker.org/">Databunker</a></strong> and how it helps. <strong>Databunker</strong> is a Swiss army knife tool for storing personal records.</p>
<h2 id="heading-some-gdpr-related-terms">Some GDPR-related terms</h2>
<h3 id="heading-what-is-personal-identifiable-information">What is Personally Identifiable Information?</h3>
<p>GDPR itself uses the term <strong>personal data</strong>, which is often called <strong>PII</strong> or <strong>Personally Identifiable Information</strong>. This can be any information that helps to identify a person. </p>
<p>For example, it can be a user name, address, telephone number, email address, or SSN. It can also be a weak identity, such as browser information, an IP address, or a session cookie name. </p>
<p>Like in triangulation, a combination of weak identities can lead us to a user. Strong and weak user identities are all considered <strong>PII</strong>.</p>
<p>The <strong>GDPR</strong> introduces the right for individuals to have their personal data erased. Your user or customer can send you an email asking you to remove their records. You have one month to respond to this request.</p>
<h3 id="heading-what-does-a-forget-me-request-mean-for-log-files">What does a forget-me request mean for log files?</h3>
<p>Deleting user data from the database is easy. You have SQL for that. Deleting user PII from the log file is the tricky part. </p>
<p>You might have different servers generating logs and you might feed logs to different cloud services. This might complicate how you perform record deletion. </p>
<p>In this article I will cover smarter methods to make your logging privacy-compliant.</p>
<h3 id="heading-introduction-to-databunker">Introduction to Databunker</h3>
<p>But first, let me give you a bit more information about what <strong>Databunker</strong> is and how it works since we'll be discussing it in some of these methods below.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2021/01/databunker-solution.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><strong>Databunker</strong> is a GDPR compliant user store service for Web and mobile apps. It works as a backend application service. This product is a combination of several software concepts merged together. It provides secure PII storage and privacy by design out of the box:</p>
<ul>
<li>A Personal Identifiable Information (PII) storage and vault</li>
<li>Secure session storage for web applications</li>
<li>Privacy portal for customers</li>
<li>Application backend server</li>
<li>DPO management tool</li>
<li>Tokenization service</li>
<li>Secret sauce</li>
</ul>
<p>Project website: <a target="_blank" href="https://databunker.org/">https://databunker.org/</a></p>
<p>Full working Node.js example with Passport.js is available here: <a target="_blank" href="https://github.com/securitybunker/databunker-nodejs-example">https://github.com/securitybunker/databunker-nodejs-example</a></p>
<h2 id="heading-method-1-use-an-automatic-log-retention-period">Method 1: Use an automatic log retention period</h2>
<p>You have <strong>one month</strong> to respond to a <strong>user forget-me request</strong>. In practice, this means you have one month to scrub all user-related records from your log files, for example by filtering out the user's IP addresses. </p>
<p>Or you can limit the log retention period just to one month. All older log entries will get removed. This way you do not need to do anything besides a one-time configuration of the log retention period.</p>
<h2 id="heading-method-2-use-pseudonymization-to-resolve-any-log-compliance-issues">Method 2: Use pseudonymization to resolve any log compliance issues</h2>
<p>GDPR discusses the concept of <strong>pseudonymization</strong>, and this method builds on it. From the <a target="_blank" href="https://gdpr-info.eu/art-4-gdpr/">GDPR Article 4(5)</a>:</p>
<blockquote>
<p><em>‘pseudonymisation’ means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person...</em></p>
</blockquote>
<p>You can keep personal data in a separate database, for example in <strong>Databunker</strong>. When you receive a user's <strong>forget-me request</strong>, you will delete the user's personal data from <strong>Databunker</strong>, <strong>leaving the log files unchanged</strong>.</p>
<p>To make our life even easier, we can print a session token and a user token in each log line.</p>
<p>You can take a look at <a target="_blank" href="https://github.com/securitybunker/databunker-nodejs-example">this example</a> for reference:</p>
<blockquote>
<p>::ffff:141.226.198.55 - - [02/Jan/2021:18:42:54 +0000] "GET /user/me HTTP/1.1" 304 - "http://my-dev-site/user/login" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36" <strong>"b994fdbf-694e-4289-b8db-04d8049da2e8" "1f587eb7-eaaa-1629-c108-b707d99798da"</strong></p>
</blockquote>
<p>This differs from a regular web server log line only in the two custom variables appended at the end.</p>
<p><strong>"b994fdbf-694e-4289-b8db-04d8049da2e8"</strong> is the session token generated by the Databunker session library.</p>
<p><strong>"1f587eb7-eaaa-1629-c108-b707d99798da"</strong> is a user token of the logged-in user. It is the user token generated upon user creation in Databunker.</p>
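<p>As a rough sketch of the idea, the function below builds a log line that refers to the user only through the two tokens. It is not the code from the linked example. The <code>formatLogLine</code> function and the field names on <code>entry</code> are hypothetical, and the token values are the ones from the sample line above.</p>
<pre><code class="language-javascript">// Build an access log line that identifies the user only by tokens.
// A missing token is logged as "-", as web servers do for empty fields.
function formatLogLine(entry) {
  const fields = [
    entry.method,
    entry.path,
    String(entry.status),
    JSON.stringify(entry.sessionToken || "-"),
    JSON.stringify(entry.userToken || "-")
  ];
  return fields.join(" ");
}

const line = formatLogLine({
  method: "GET",
  path: "/user/me",
  status: 304,
  sessionToken: "b994fdbf-694e-4289-b8db-04d8049da2e8",
  userToken: "1f587eb7-eaaa-1629-c108-b707d99798da"
});
console.log(line);
</code></pre>
<p>Once the user's record is deleted from the separate store, the tokens left in the log no longer lead back to a person.</p>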
<h2 id="heading-method-3-solution-for-high-security-environments">Method 3: Solution for high-security environments</h2>
<p>This method involves partial encryption of the log events. PII found in the log events is grouped together and encrypted. The initial setup includes a one-time generation of a log-entry password for each user. This password can, for example, be saved in the user profile stored in <strong>Databunker</strong>.</p>
<p>As we need to know who the record owner is (to decrypt the record), we need to save the user id together with the encrypted PII. So, another level of encryption is applied with a generic password.</p>
<p>For log events tied to an identified user, PII is encrypted twice. The first time, the data is encrypted using the user's log-entry password. The second time, it is encrypted with the generic password to hide the identified user id.</p>
<p>For identified users:</p>
<pre><code><span class="hljs-keyword">const</span> piiPayload = <span class="hljs-built_in">JSON</span>.stringify({ClientIP, BrowserUserAgent, SessionID});
<span class="hljs-keyword">const</span> piiEncrypted = Encrypt(UserPassword, piiPayload);
<span class="hljs-keyword">const</span> linePayload = <span class="hljs-built_in">JSON</span>.stringify({UserToken, <span class="hljs-attr">data</span>: btoa(piiEncrypted)});
<span class="hljs-keyword">const</span> encrypted = Encrypt(GenericPassword, linePayload);
</code></pre><p>If the user is unknown, only one level of encryption can be used:</p>
<pre><code><span class="hljs-keyword">const</span> piiPayload = <span class="hljs-built_in">JSON</span>.stringify({ClientIP, BrowserUserAgent, SessionID});
<span class="hljs-keyword">const</span> encrypted = Encrypt(GenericPassword, piiPayload);
</code></pre><p>When you get a user's forget-me request, you can remove the user's log-entry password and their profile stored in <strong><a target="_blank" href="http://databunker.org/">Databunker</a></strong>. This makes the user's log entries unrecoverable, which is what the erasure request calls for, so no extra steps are needed to remove anything from the log files.</p>
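<p>The snippets above leave <code>Encrypt</code> undefined. Below is one possible runnable sketch of the two-layer scheme using Node's built-in <code>crypto</code> module. The choice of AES-256-GCM with scrypt key derivation, the function names, the in-memory <code>passwords</code> map, and the sample values are all assumptions made for illustration. None of it is Databunker's API.</p>
<pre><code class="language-javascript">const crypto = require("node:crypto");

// Encrypt with AES-256-GCM using a key derived from the password.
// Output (base64): salt(16) | iv(12) | authTag(16) | ciphertext
function encrypt(password, plaintext) {
  const salt = crypto.randomBytes(16);
  const iv = crypto.randomBytes(12);
  const key = crypto.scryptSync(password, salt, 32);
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([salt, iv, cipher.getAuthTag(), ciphertext]).toString("base64");
}

function decrypt(password, encoded) {
  const raw = Buffer.from(encoded, "base64");
  const key = crypto.scryptSync(password, raw.subarray(0, 16), 32);
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, raw.subarray(16, 28));
  decipher.setAuthTag(raw.subarray(28, 44));
  return Buffer.concat([decipher.update(raw.subarray(44)), decipher.final()]).toString("utf8");
}

// Identified user: inner layer uses the user's log-entry password,
// outer layer uses the generic password and hides the user token.
function encryptLogPii(pii, userToken, userPassword, genericPassword) {
  const inner = encrypt(userPassword, JSON.stringify(pii));
  return encrypt(genericPassword, JSON.stringify({ userToken: userToken, data: inner }));
}

// Returns the PII, or null when the user's password no longer exists.
function decryptLogPii(encrypted, genericPassword, lookupUserPassword) {
  const outer = JSON.parse(decrypt(genericPassword, encrypted));
  const userPassword = lookupUserPassword(outer.userToken);
  if (userPassword === undefined) {
    return null; // password removed after a forget-me request
  }
  return JSON.parse(decrypt(userPassword, outer.data));
}

// Hypothetical password store standing in for the user profile.
const passwords = new Map([["user-token-1", "per-user-secret"]]);
function lookup(userToken) {
  return passwords.get(userToken);
}

const pii = { ClientIP: "203.0.113.7", BrowserUserAgent: "ExampleAgent/1.0", SessionID: "session-1" };
const logField = encryptLogPii(pii, "user-token-1", "per-user-secret", "generic-secret");

const beforeErasure = decryptLogPii(logField, "generic-secret", lookup);
passwords.delete("user-token-1"); // the forget-me request
const afterErasure = decryptLogPii(logField, "generic-secret", lookup);
console.log(beforeErasure.ClientIP, afterErasure);
</code></pre>
<p>Deriving a key with scrypt for every log line is slow. A real system would more likely store a random per-user key and use it directly, but the erasure logic stays the same.</p>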
<h2 id="heading-summary">Summary</h2>
<p>With the right architecture, you can make your logging privacy compliant. It is not complicated. You can use <strong>Databunker</strong> or roll your own solution. </p>
<p>Whatever you choose is much better than completely ignoring this issue and manually removing user records from log files.</p>
<h3 id="heading-free-takeaway">Free takeaway</h3>
<p>I run a privacy training for startup founders and architects. It is <a target="_blank" href="https://basebunker.com/">available completely for FREE here</a>.</p>
<h3 id="heading-about-the-author">About the author</h3>
<p>Yuli Stremovsky is a world-class software and security architect. Founder of <a target="_blank" href="https://privacybunker.io/">PrivacyBunker.io</a> and <a target="_blank" href="http://databunker.org/">DataBunker.org</a> privacy products. Former Check Point and RSA Security employee. An expert in marrying technological solutions with privacy.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ GDPR terminology in plain English ]]>
                </title>
                <description>
                    <![CDATA[ By Alex Ewerlöf My team builds the technologies for some of the highest traffic newsrooms in Sweden and Norway. Part of the revenue comes from selling ads. Ads sell best when personalised, and for personalization you need data. Internet’s default bus... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/gdpr-terminology-in-plain-english-6087535e6adf/</link>
                <guid isPermaLink="false">66d45d5955db48792eed3ef5</guid>
                
                    <category>
                        <![CDATA[ data ]]>
                    </category>
                
                    <category>
                        <![CDATA[ #gdpr ]]>
                    </category>
                
                    <category>
                        <![CDATA[ privacy ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Security ]]>
                    </category>
                
                    <category>
                        <![CDATA[ tech  ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Wed, 23 May 2018 07:57:35 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*6hCZUnZq19I_UvHv4sIp9w.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Alex Ewerlöf</p>
<p>My team builds the technologies for some of the highest traffic newsrooms in Sweden and Norway. Part of the revenue comes from selling ads. Ads sell best when personalised, and for personalization you need data. Internet’s default business model is based on ads. GDPR has big implications for online businesses like newsrooms.</p>
<p>But here’s the interesting part — the General Data Protection Regulation (GDPR) puts restrictions on what data can be gathered, how it can be used, and for how long it can be stored.</p>
<p>This post is about demystifying the core GDPR terms so everyone can understand this interesting topic. If you are European or have European users, you need to understand GDPR.</p>
<blockquote>
<p>TL;DR; this is a huge shift in how personal data is gathered from “by default” to “opt-in”. Plus some other perks.</p>
</blockquote>
<p>Before we start, a quick disclaimer: I don’t represent my current/previous employers on my personal blog. The information provided here is purely based on my own research, and doesn’t necessarily reflect my company’s policies, strategy or implementation of GDPR.</p>
<h3 id="heading-a-bit-of-background">A bit of background</h3>
<p>GDPR came into effect on May 25, 2018. Despite making developers’ and marketers’ lives harder, it’s actually a very sweet deal for the end users. GDPR prevents companies from gathering information they don’t strictly need.</p>
<p>Despite starting with the word ‘General’, GDPR is actually a European Union (EU) law that <a target="_blank" href="https://www.lexology.com/library/detail.aspx?g=70046340-607b-4620-a680-6b6a0cefaf47">applies to</a>:</p>
<ol>
<li>Companies that are based in the EU</li>
<li>Companies that gather personal data from people in the EU, wherever the company itself is based.</li>
</ol>
<p>Maybe that ‘General’ is good, because a huge part of the internet is European!</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*jS0wmtrioU6pGFsVO6sS7g.gif" alt="Image" width="600" height="337" loading="lazy">
<em>Global internet usage during 24 hours (<a target="_blank" href="https://en.wikipedia.org/wiki/Global_Internet_usage">Wikipedia</a>)</em></p>
<p>The word ‘Regulation’ in GDPR means that it must be applied in its entirety across the EU.</p>
<p>In the long run, this leads to <strong>privacy by design</strong>. This is a principle that calls for the inclusion of data protection from the start of designing the systems, rather than as an afterthought.</p>
<h3 id="heading-common-terminology">Common terminology</h3>
<p>Here’s a list of the most common GDPR terms:</p>
<ul>
<li>A <strong>Data Subject</strong> is a person (such as you and me) whose personal data is processed by a data controller (such as a company or service we use).</li>
<li>A <strong>Data Controller</strong> is an organisation that collects data from EU residents. It determines the purposes, conditions and means of processing the personal data.</li>
<li>The entity that does the actual data processing is called a <strong>Data Processor</strong> — an example might be a cloud service provider.</li>
<li><strong>Processing</strong> involves any operation performed on personal data, whether or not by automated means. This includes collection, use, recording, feeding it to machine learning algorithms (read <a target="_blank" href="https://www.oreilly.com/ideas/how-will-the-gdpr-impact-machine-learning">how ML is affected by GDPR</a>), and so on.</li>
</ul>
<h3 id="heading-gdpr-for-the-users">GDPR for the users</h3>
<p>Your <strong>personal data</strong> is any information that can be used to directly or indirectly identify you<em>.</em> For example: your name, home address, photo, email address, bank details, posts on social networking websites, medical information, or a computer or mobile IP address.</p>
<p>This data is usually used for <strong>profiling</strong>, in which automated processes evaluate, analyse, or predict your behaviour. As an example, knowing your age means you’ll be exposed to ads that are targeted to your age group. This is also true about data that you’re not explicitly giving to a company, like your IP address, which will be used to guess your location.</p>
<p>Now that GDPR is in effect, companies have limitations on what personal data they can gather and how long they can store it. They should justify why they need it.</p>
<h4 id="heading-when-the-companies-need-user-consent">When the companies NEED user consent</h4>
<p>The data controller (company) cannot just go and gather user data. They have to first ask for your permission or consent.</p>
<p>The consent must be specific and explicit about what data is gathered and the purposes it is used for. It must also be freely given: if you say ‘no’, the company should still serve you as well as possible without your data. Consent is not regarded as freely given if the data subject has no genuine or free choice, or is unable to refuse or withdraw consent without detriment. Users have the right <strong>to withdraw their consent at any time</strong> but more importantly <strong>it shall be as easy to withdraw as to give consent.</strong></p>
<p>Companies can no longer force you to tick a checkbox that says “I accept all terms and conditions and privacy policies”. That is why you were getting those emails from many websites informing you about their policies before the May 25th deadline.</p>
<p>The area of GDPR consent has a number of implications for businesses who record calls as a matter of practice. The typical “calls are recorded for training and security purposes” warnings will no longer be sufficient to gain assumed consent to record calls.</p>
<h4 id="heading-when-the-companies-dont-need-user-consent">When the companies DON’T need user consent</h4>
<p>There must be a reasonable legal basis for gathering an exact piece of data. According to the <a target="_blank" href="https://www.gdpreu.org/the-regulation/key-concepts/legitimate-interest/">GDPR’s site</a>, these can be when:</p>
<ul>
<li>Processing is necessary for the fulfillment of a contract to which the data subject is party or to take steps at the request of the data subject prior to entering into a contract.</li>
<li>Processing is necessary for compliance with a legal obligation to which the controller is subject.</li>
<li>Processing is necessary to protect the vital interests of the data subject or of another natural person.</li>
<li>Processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.</li>
<li>Processing is necessary for the purposes of the legitimate interests pursued by the controller or by a third party unless such interests are overridden by the interests or <a target="_blank" href="https://en.wikipedia.org/wiki/Charter_of_Fundamental_Rights_of_the_European_Union">fundamental rights</a> and freedoms of the data subject, which require protection of personal data, in particular if the data subject is a child.</li>
</ul>
<p>The most important benefit of GDPR is that it <a target="_blank" href="https://gdpr-info.eu/chapter-3/">gives controls</a> to the users to:</p>
<ol>
<li>Erase their data whenever they like (also known as the <a target="_blank" href="https://en.wikipedia.org/wiki/Right_to_be_forgotten">Right to be Forgotten</a>). <strong>Data Erasure</strong> requests don’t stop at the data controller. If third party data processors are involved, they too have to stop processing the data and erase it. I’m guessing there’ll be a de facto standard API for that, but so far it’s more ad-hoc and depends on how services talk to each other. I’m sure in the future there’ll be services where you give them your personal info and they’ll check thousands of online services to give you an aggregated report of which sites have your information. The companies should provide a way to query if they have data for a particular user (without requiring registration)<em>.</em> Trivia: this is essentially in contradiction with how Blockchain works! Read more about <a target="_blank" href="https://medium.com/@alexewerlof/gdpr-for-blockchain-f73744b9be34">the implications of GDPR for Blockchain here</a>.</li>
<li>Own their data! The data subjects (users) can download and see their data and how it is processed. Furthermore, the data controller has to inform the data subject on details about the processing, such as the purposes of the processing, with whom the data is shared, and how it acquired the data. This is called <strong>right of access</strong> or <strong>subject access right</strong>. Personal data <a target="_blank" href="http://ec.europa.eu/justice/data-protection/international-transfers/index_en.htm">cannot be transferred to countries outside the European Union</a> unless they guarantee the same level of data protection.</li>
<li>Move their data to competitors. This is good for competition and eventually the users win. The data must be provided by the controller in a structured and commonly used standard electronic format. No more lock-in! This is known as <strong>data portability</strong>. This will probably open up a whole new business segment for converting data formats from one controller to another controller.</li>
<li>Update/correct their data. The data subjects have the right to ask the data controllers to immediately correct (public or private) data that is invalid.</li>
</ol>
<p>I personally find the <strong>data breach</strong> announcement amazing.</p>
<p>The data controller is under a legal obligation to notify the relevant supervisory authority of any personal data breach without undue delay and, where feasible, within 72 hours of becoming aware of it, unless the breach is unlikely to result in a risk to the rights and freedoms of the individuals affected.</p>
<p>The affected individuals have to be notified as well when the breach is likely to result in a high risk to their rights and freedoms. In addition, the data processor will have to notify the data controller without undue delay after becoming aware of a personal data breach.</p>
<p>Do you remember <a target="_blank" href="https://www.washingtonpost.com/news/the-switch/wp/2016/11/10/yahoo-discovered-hack-leading-to-major-data-breach-two-years-before-it-was-disclosed/?noredirect=on&amp;utm_term=.4782fea5e3e5">when Yahoo kept its breach secret for two years</a>? Well, not anymore!</p>
<h3 id="heading-gdpr-for-the-governments">GDPR for the governments</h3>
<p>Since GDPR is quite a big thing, governments are involved to protect their citizens and enforce the regulations. There are two terms to understand:</p>
<ul>
<li>National <strong>Data Protection Authorities</strong> (<a target="_blank" href="https://www.whitecase.com/publications/article/chapter-14-data-protection-authorities-unlocking-eu-general-data-protection">DPA</a>) are appointed by each EU country to implement and enforce data protection law, and to offer guidance. <strong>Supervisory Authority</strong> (SA) <a target="_blank" href="https://www.i-scoop.eu/supervisory-authorities-consistency-and-data-protection-authorities-dpas/#What_is_a_Data_Protection_Authority_or_DPA_in_the_scope_of_the_GDPR">is another name</a> for DPA. As set out in Chapter 16, DPAs have significant enforcement powers, including the ability to issue substantial fines. They are also the place to go to in case of a violation of data protection legislation (in the scope of the GDPR for EU citizens) and for advice and specific questions and/or assistance from the perspective of organisations.</li>
<li>A <strong>Data Protection Officer</strong> (<a target="_blank" href="https://www.whitecase.com/publications/article/chapter-12-impact-assessments-dpos-and-codes-conduct-unlocking-eu-general-data">DPO</a>) is an employee of the data controller (company) who is formally tasked with ensuring that an organisation is aware of, and complies with, its data protection responsibilities. More about this in the next section.</li>
</ul>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*Q6HO-CQ7tHOUAd0s9tG6xA.png" alt="Image" width="800" height="426" loading="lazy">
<em>DPA &amp; DPO</em></p>
<p>Each company that operates in the EU has a main establishment: the place where its key decisions about data processing are made.</p>
<h3 id="heading-gdpr-for-the-companies">GDPR for the companies</h3>
<p>The upper fine limit for contravening GDPR is pretty steep: up to €20 million, or up to 4% of the annual worldwide turnover of the preceding financial year… whichever is <em>higher</em>!</p>
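<p>To see what "whichever is higher" means in practice, here is a small sketch of the ceiling described above. The function name is an assumption of mine, and the result is only the upper limit, not the fine an authority would actually impose.</p>
<pre><code class="language-python">def max_gdpr_fine(annual_worldwide_turnover_eur):
    """Upper fine limit in euros: EUR 20 million or 4% of turnover, whichever is higher."""
    fixed_cap = 20_000_000
    turnover_cap = annual_worldwide_turnover_eur * 4 // 100
    return max(fixed_cap, turnover_cap)


# A company with EUR 100 million turnover: 4% is only EUR 4 million,
# so the EUR 20 million cap applies.
print(max_gdpr_fine(100_000_000))  # 20000000

# A company with EUR 1 billion turnover: 4% is EUR 40 million.
print(max_gdpr_fine(1_000_000_000))  # 40000000
</code></pre>
<p>The fixed amount dominates until turnover passes €500 million. Above that, the percentage takes over.</p>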
<p>Companies that gather data have a responsibility and the liability to implement and demonstrate that they comply with GDPR. This is called <strong>compliance</strong>.</p>
<p>Companies are supposed to keep a log of who accessed what information, ready for when the authorities ask for an audit. They must also maintain records of processing activities, which include the purposes of the processing, the categories of data involved and the envisaged time limits for erasure.</p>
<p>The records must be made available to the supervisory authority on request. The interesting part is that even if the actual processing happens by another company (a data processor on behalf of the data controller), it is still the company that gathers the data that bears the main responsibility.</p>
<p>This whole new range of requirements is complicated enough to create a new job title: data protection officer (DPO)! This is an enterprise security leadership role responsible for overseeing data protection strategy and implementation to ensure compliance.</p>
<p>They also:</p>
<ul>
<li>Educate the company and employees on important compliance requirements</li>
<li>Are the point of contact between the company and supervisory authorities</li>
<li>Monitor and provide advice on data protection efforts across the company</li>
<li>Keep tabs on all data processing activities at the company, including the purpose of all processing activities, which must be made public on request</li>
<li>Answer inquiries from users about how their data is being used, their right to data erasure, and the measures the company has put in place to protect their personal information</li>
<li>Identify and reduce privacy risks by analysing the personal data that is processed and the policies in place to protect it, which is called a <strong>Data Protection Impact Assessment</strong> (DPIA). The GDPR <a target="_blank" href="https://www.itgovernance.co.uk/privacy-impact-assessment-pia">mandates a DPIA</a> be conducted where data processing is likely to result in a high risk to the rights and freedoms of natural persons.</li>
</ul>
<p>The DPO must have a support team and is responsible for their own continuing professional development. They must also be independent of the organization that employs them, effectively acting as a “mini-regulator.”</p>
<p>If a business has multiple establishments in the EU, it will have a single supervisory authority as its lead authority, based on where the main data processing activities take place.</p>
<h3 id="heading-gdpr-for-the-developers">GDPR for the developers</h3>
<p>Since GDPR enforces <strong>privacy by design</strong>, it affects software architecture and its implementation. For example, we can no longer keep logs of sensitive information (as mentioned before, IP addresses are considered personal information). This makes tracing bugs a bit harder.</p>
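<p>One way to keep logs useful without storing IP addresses is to redact them before a record is written. The sketch below uses Python's standard <code>logging.Filter</code>. The class name, the logger name and the placeholder text are assumptions of mine, and the pattern is deliberately simple: it only catches IPv4-looking strings.</p>
<pre><code class="language-python">import logging
import re

# Simplified IPv4 pattern for illustration; it does not cover IPv6.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact_ips(text):
    """Replace anything that looks like an IPv4 address with a placeholder."""
    return IPV4_PATTERN.sub("[redacted-ip]", text)


class RedactIPFilter(logging.Filter):
    """Logging filter that scrubs IPv4 addresses from the final message."""

    def filter(self, record):
        # Render the message with its arguments first, then redact the result.
        record.msg = redact_ips(record.getMessage())
        record.args = ()
        return True  # keep the record, just without the address


logger = logging.getLogger("app")
logger.addFilter(RedactIPFilter())
logger.warning("Login failed for client %s", "203.0.113.42")
</code></pre>
<p>The logged line reads "Login failed for client [redacted-ip]", so we still see that something failed and when, just not for whom.</p>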
<p>Privacy settings must be set at a high level by default, so we have to make sure checkboxes that expose personal data are not ticked by default.</p>
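<p>In code, "private by default" usually means that every opt-in flag starts as <code>False</code> and only changes when the user acts. A minimal sketch, where the setting names are hypothetical examples and not a list taken from the regulation:</p>
<pre><code class="language-python">from dataclasses import dataclass


@dataclass
class PrivacySettings:
    # Every option that exposes personal data starts switched off.
    profile_public: bool = False
    share_with_partners: bool = False
    marketing_emails: bool = False
    analytics_tracking: bool = False


def settings_for_new_user():
    """New accounts get the most private configuration; users opt in later."""
    return PrivacySettings()


print(settings_for_new_user())
</code></pre>
<p>Keeping the defaults in one place like this also makes them easy to show to an auditor.</p>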
<p>If the Cloud is used for data storage, only the data owner, not the cloud service, should hold the decryption keys.</p>
<p>We cannot store data for longer than necessary. Database columns should have a <strong>data retention deadline</strong> which specifies when the data should be deleted.</p>
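<p>A retention deadline can be as simple as one extra column plus a scheduled clean-up job. The sketch below uses an in-memory SQLite database from Python's standard library. The table, the column names and the 30-day period are assumptions for illustration, and it stores the deadline per row, which is one of several ways to do this.</p>
<pre><code class="language-python">import sqlite3
from datetime import datetime, timedelta, timezone


def create_schema(conn):
    # delete_after holds the retention deadline as a Unix timestamp.
    conn.execute(
        "CREATE TABLE signup (email TEXT NOT NULL, delete_after INTEGER NOT NULL)"
    )


def add_signup(conn, email, created_at, retention):
    """Store the row together with the moment it must be deleted."""
    delete_after = int((created_at + retention).timestamp())
    conn.execute("INSERT INTO signup VALUES (?, ?)", (email, delete_after))


def purge_expired(conn, now):
    """Delete every row whose deadline has passed; return how many were removed."""
    cursor = conn.execute(
        "DELETE FROM signup WHERE ? >= delete_after", (int(now.timestamp()),)
    )
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
create_schema(conn)
start = datetime(2018, 1, 1, tzinfo=timezone.utc)
add_signup(conn, "old@example.com", start, timedelta(days=30))
add_signup(conn, "new@example.com", start + timedelta(days=120), timedelta(days=30))
print(purge_expired(conn, start + timedelta(days=60)))  # 1
</code></pre>
<p>In a real system the purge would run on a schedule, and backups need the same treatment, otherwise the "deleted" data lives on there.</p>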
<p>Personally identifiable information should be <strong>pseudonymised</strong> in a way that it can no longer be linked (or ‘attributed’) to a single data subject without the use of additional data.</p>
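<p>One common pseudonymisation technique is keyed hashing: we replace the identifier with an HMAC of it, and the secret key is the "additional data" that has to be stored separately. This is just one option that I picked for illustration, sketched with Python's standard <code>hmac</code> module. The function name and the key handling are assumptions.</p>
<pre><code class="language-python">import hashlib
import hmac


def pseudonymise(identifier, secret_key):
    """Return a stable pseudonym for an identifier using HMAC-SHA256.

    Without secret_key the pseudonym cannot be recomputed from a guessed
    identifier, so the key must live apart from the pseudonymised data.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key for the example; a real one would come from a secrets store.
key = b"example-key-kept-outside-the-database"
token = pseudonymise("jane.doe@example.com", key)
print(token)
</code></pre>
<p>The same input always gives the same pseudonym, so records can still be joined and counted. Note that pseudonymised data is still personal data under GDPR, because whoever holds the key can link it back to a person.</p>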
<p>Read more about <a target="_blank" href="https://medium.com/@alexewerlof/gdpr-pseudonymization-techniques-62f7b3b46a56">pseudonymization techniques in my newer post</a>.</p>
<h3 id="heading-exceptions-to-gdpr">Exceptions to GDPR</h3>
<p>What good is a law if it is not meant to be broken? Don’t get too excited about your rights because the following cases are not covered by the regulation:</p>
<ul>
<li>Lawful interception, national security, the army, the police, justice</li>
<li>Statistical and scientific analysis for research</li>
<li>Deceased persons are subject to national legislation</li>
<li>There is a dedicated law on employer-employee relationships. The GDPR was developed with a focus on social networks and cloud providers, but did not consider enough requirements for handling employee data.</li>
<li>Processing of personal data by a natural person in the course of a purely personal or household activity</li>
</ul>
<h3 id="heading-acknowledgement">Acknowledgement</h3>
<p>Thanks to my colleague <a target="_blank" href="https://www.linkedin.com/in/ioanadodu/">Ioana Norgen</a> for proof-reading this post before publishing. Any possible errors are still mine.</p>
<h3 id="heading-sources">Sources</h3>
<ul>
<li>A <a target="_blank" href="https://www.eugdpr.org/glossary-of-terms.html">glossary</a> of GDPR terms</li>
<li><a target="_blank" href="https://gdpr.report/news/2017/11/07/data-masking-anonymisation-pseudonymisation/">Pseudonymization techniques</a></li>
<li><a target="_blank" href="https://en.wikipedia.org/wiki/General_Data_Protection_Regulation">Wikipedia</a></li>
<li><a target="_blank" href="https://digitalguardian.com/blog/what-data-protection-officer-dpo-learn-about-new-role-required-gdpr-compliance">Data Protection Officer</a></li>
</ul>
<h3 id="heading-interesting-reading">Interesting reading</h3>
<ul>
<li><a target="_blank" href="https://en.wikipedia.org/wiki/EPrivacy_Regulation_(European_Union)">ePrivacy</a>, a set of related regulations that are also enforced at the same time as GDPR. It targets any business that provides any form of online communication service, uses online tracking technologies, or engages in electronic direct marketing (e.g. telecom operators and online communication services like Skype and WhatsApp). Its most important aspect is protection against spam SMS/email and marketing calls.</li>
<li>An <a target="_blank" href="https://techblog.bozho.net/gdpr-practical-guide-developers/">excellent guide to GDPR for developers</a> and some <a target="_blank" href="https://techblog.bozho.net/gdpr-developers-presentation/">nice slides</a></li>
<li>Belitsoft has made <a target="_blank" href="https://belitsoft.com/gdpr-compliance-checklist">a great checklist for businesses about GDPR</a>, although not all items in the checklist are required by GDPR, and some, like two-factor authentication, are more of a best practice.</li>
<li><a target="_blank" href="https://techblog.bozho.net/tracking-cookies-gdpr/">How GDPR affects cookies used for tracking</a></li>
<li>The data protection reform package also includes <a target="_blank" href="http://eur-lex.europa.eu/eli/dir/2016/680/oj/eng">a separate Data Protection Directive</a> for the police and criminal justice sector that provides rules on personal data exchanges at national, European, and international levels.</li>
<li><a target="_blank" href="https://www.theverge.com/2018/5/25/17393766/facebook-google-gdpr-lawsuit-max-schrems-europe">Facebook and Google hit with $8.8 billion in lawsuits on day one of GDPR</a></li>
<li><a target="_blank" href="https://en.wikipedia.org/wiki/Privacy_by_design">Privacy by design</a></li>
</ul>
<blockquote>
<p>The bottom line is: GDPR is an obvious right. Europe pioneered its establishment but this should be a global right. Talk about it with your friends, colleagues and law makers if you want to enjoy the same protection and choice as Europeans.</p>
</blockquote>
<p>If you liked this, you may enjoy: <a target="_blank" href="https://medium.com/@alexewerlof/what-s-cool-about-being-a-programmer-5a1e58efeee6">programming is the best job ever</a> and <a target="_blank" href="https://medium.com/@alexewerlof/how-i-learn-new-tech-cb79db19c818">how do I keep up with technology</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
