<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ Shubham Katara - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ Shubham Katara - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Fri, 29 May 2026 10:33:51 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/author/katara/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How to Build a Hybrid Cloud Platform with Google Cloud Services and On-Premise Kubernetes Infrastructure ]]>
                </title>
                <description>
                    <![CDATA[ In this article, you'll learn how to design and build a secure, scalable hybrid cloud platform that connects your on‑premises Kubernetes infrastructure to Google Cloud Platform. This allows on‑prem ap ]]>
                </description>
                <link>https://www.freecodecamp.org/news/build-a-hybrid-cloud-platform-with-google-cloud-services-and-on-premise-k8s-infra/</link>
                <guid isPermaLink="false">6a18c124782587548340fa90</guid>
                
                    <category>
                        <![CDATA[ google cloud ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Kubernetes ]]>
                    </category>
                
                    <category>
                        <![CDATA[ cloud native ]]>
                    </category>
                
                    <category>
                        <![CDATA[ CNCF ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Hybrid Cloud ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Shubham Katara ]]>
                </dc:creator>
                <pubDate>Thu, 28 May 2026 22:26:44 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/a86db163-f513-48bd-8194-18c6cb894615.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In this article, you'll learn how to design and build a secure, scalable hybrid cloud platform that connects your on‑premises Kubernetes infrastructure to Google Cloud Platform. This lets on‑prem apps consume cloud services (notably GPUs) without brittle long‑lived keys, manual credential management, or risky network patterns.</p>
<p>Who this is for:</p>
<ul>
<li><p>Platform engineers, SREs, and security-focused cloud architects who operate mixed on‑prem and cloud Kubernetes estates.</p>
</li>
<li><p>Teams that need scalable, auditable access from on‑prem workloads to GCP resources (especially GPU instances) while minimizing operational overhead and blast radius.</p>
</li>
</ul>
<p>What you’ll get from this guide:</p>
<ul>
<li><p>The motivation and economics behind a hybrid approach (why GPUs often push workloads to the cloud).</p>
</li>
<li><p>Common pitfalls with service account keys and how “accidental air gaps” occur in real environments.</p>
</li>
<li><p>A practical, end‑to‑end pattern that uses Workload Identity Federation to give on‑prem pods short‑lived, auditable access to GCP without embedding keys.</p>
</li>
</ul>
<p>What’s included:</p>
<ul>
<li><p>Conceptual explanations, security tradeoffs, and operational best practices.</p>
</li>
<li><p>Concrete examples and Kubernetes/Terraform artifacts (linked in the GitHub repo at the end of this article) so you can reproduce the setup in your environment.</p>
</li>
</ul>
<p>Read on for the theory, then follow the hands‑on sections to provision GCP resources, configure federation, enforce policies with CEL and Kyverno, and validate secure, scalable GPU access from your on‑prem Kubernetes clusters.</p>
<p><strong>Note:</strong> Kubernetes and Terraform artifacts are linked in the GitHub repo at the end of this article.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a href="#heading-why-hybrid-cloud-matters">Why Hybrid Cloud Matters</a></p>
</li>
<li><p><a href="#heading-the-economics-of-hybrid-gpus-changed-everything">The Economics of Hybrid: GPUs Changed Everything</a></p>
</li>
<li><p><a href="#heading-why-service-account-keys-fail-at-scale">Why Service Account Keys Fail at Scale</a></p>
</li>
<li><p><a href="#heading-how-the-accidental-air-gap-happens">How the Accidental Air Gap Happens</a></p>
</li>
<li><p><a href="#heading-how-workload-identity-federation-bridges-the-gap">How Workload Identity Federation Bridges the Gap</a></p>
</li>
<li><p><a href="#heading-how-kubernetes-identity-works">How Kubernetes Identity Works</a></p>
</li>
<li><p><a href="#heading-how-to-prepare-google-cloud-platform-resources">How to Prepare Google Cloud Platform Resources</a></p>
</li>
<li><p><a href="#heading-how-to-use-cel-for-fine-grained-access-control">How to Use CEL for Fine-Grained Access Control</a></p>
</li>
<li><p><a href="#heading-how-to-inject-credentials-automatically-with-kyverno">How to Inject Credentials Automatically with Kyverno</a></p>
</li>
<li><p><a href="#heading-how-to-grant-iam-permissions-to-federated-identities">How to Grant IAM Permissions to Federated Identities</a></p>
</li>
<li><p><a href="#heading-how-to-verify-the-setup">How to Verify the Setup</a></p>
</li>
<li><p><a href="#heading-how-to-connect-on-prem-apps-to-cloud-gpus">How to Connect On-Prem Apps to Cloud GPUs</a></p>
</li>
<li><p><a href="#heading-how-to-scale-gpu-access-with-cel-conditions">How to Scale GPU Access with CEL Conditions</a></p>
</li>
<li><p><a href="#heading-the-security-properties-compared">The Security Properties Compared</a></p>
</li>
<li><p><a href="#heading-the-complete-infrastructure-as-code-layout">The Complete Infrastructure as Code Layout</a></p>
</li>
<li><p><a href="#heading-how-to-run-a-proof-of-concept-with-vcluster">How to Run a Proof of Concept with vCluster</a></p>
</li>
<li><p><a href="#heading-common-issues-and-how-to-solve-them">Common Issues and How to Solve Them</a></p>
</li>
<li><p><a href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>Before following along, you'll need:</p>
<ul>
<li><p>A Kubernetes cluster that is <strong>not</strong> GKE (on-premises, bare-metal, or a virtual cluster)</p>
</li>
<li><p>A Google Cloud project with the following APIs enabled: IAM, Security Token Service (STS), and Workload Identity</p>
</li>
<li><p><a href="https://developer.hashicorp.com/terraform/install">Terraform</a> installed and configured</p>
</li>
<li><p><a href="https://kyverno.io/docs/installation/">Kyverno</a> installed in your cluster</p>
</li>
<li><p>Python 3 with the <code>google-cloud-secret-manager</code> and <code>google-cloud-aiplatform</code> libraries (for the verification steps; the code is available in the GitHub repository)</p>
</li>
<li><p><code>kubectl</code> access to your cluster</p>
</li>
</ul>
<h2 id="heading-why-hybrid-cloud-matters">Why Hybrid Cloud Matters</h2>
<p>If everything goes right, a hybrid cloud platform lets your on-premises and cloud workloads talk to each other as if they were part of the same network.</p>
<p>There are many practical reasons to run a hybrid cloud setup:</p>
<ul>
<li><p><strong>Offloading analytics to BigQuery:</strong> You keep your analytics apps on-prem for data sovereignty, but pipe large datasets into BigQuery for world-class processing power — without buying extra servers.</p>
</li>
<li><p><strong>Creating a unified network with Cloud Interconnect:</strong> Using Cloud Interconnect or Cloud VPN, your on-premises datacenter becomes an extension of the Google Cloud Platform (GCP) Virtual Private Cloud (VPC). Your on-prem invoice apps can talk to cloud-based user services with low latency and no public internet exposure.</p>
</li>
<li><p><strong>Cost-effective scalability via Cloud Storage:</strong> You can use cloud storage as a backend for local apps, storing logs, backups, and historical data while paying only for what you use.</p>
</li>
<li><p><strong>Event-driven syncing with Pub/Sub:</strong> When something happens on-prem, a message through Cloud Pub/Sub lets cloud services react instantly — no manual polling required.</p>
</li>
</ul>
<h2 id="heading-the-economics-of-hybrid-gpus-changed-everything">The Economics of Hybrid: GPUs Changed Everything</h2>
<p>Before diving into the technical problem, it's worth understanding why hybrid clouds matter more than ever.</p>
<p>Your organization, like most enterprises, has made significant investments in on-premises datacenters. Servers are bought. Racks are filled. Network infrastructure is paid for. The marginal cost of running one more workload is essentially zero.</p>
<p>Then came the AI wave.</p>
<p>Suddenly every team needs Graphics Processing Units (GPUs). Not one or two — dozens of A100s for training, fleets of inference endpoints, vector databases that need to sit close to the models. GPUs are scarce. Lead times for on-prem GPU hardware stretch into months. Cloud providers have them available in minutes.</p>
<p>The architecture that actually makes economic sense looks like this:</p>
<ul>
<li><p><strong>The on-prem datacenter handles the bulk of compute</strong> — web servers, business logic, databases, batch processing. This is commodity compute you've already paid for.</p>
</li>
<li><p><strong>The cloud handles what's scarce</strong> — GPU-accelerated inference, model training, AI/ML endpoints. You pay per request, scale on demand, and don't wait six months for hardware.</p>
</li>
</ul>
<p>The cloud isn't a full migration destination — it's an extension for capabilities you can't easily build on-prem.</p>
<p>But those on-prem workloads need to authenticate to cloud services. Every API call from the datacenter to a Vertex AI endpoint, every request to a GPU-powered inference service, every write to Cloud Storage for model artifacts — all of it needs credentials. That's the problem this article solves.</p>
<h2 id="heading-why-service-account-keys-fail-at-scale">Why Service Account Keys Fail at Scale</h2>
<p>Here's a scenario that plays out in thousands of enterprises daily.</p>
<p>A development team needs their on-prem application to write to Google Cloud Storage. The "obvious" solution? Generate a GCP service account key, base64 encode it, store it in a Kubernetes Secret, and mount it in the pod:</p>
<pre><code class="language-yaml">apiVersion: v1
kind: Secret
metadata:
  name: gcp-credentials
type: Opaque
data:
  key.json: eyJ0eXBlIjoic2VydmljZV9hY2NvdW50IiwicHJvamVjdF9pZCI6…
</code></pre>
<p>This works. It also introduces serious problems:</p>
<ul>
<li><p><strong>Never expires.</strong> That key is valid until someone remembers to rotate it (they won't) or it gets compromised (it will).</p>
</li>
<li><p><strong>Can be exfiltrated trivially.</strong> Anyone with read access to that namespace can run <code>kubectl get secret -o yaml</code> and walk away with permanent GCP access.</p>
</li>
<li><p><strong>Has no audit trail for the actual workload.</strong> GCP sees "service-account-xyz accessed this bucket" — not "pod frontend-abc-123 in namespace production."</p>
</li>
<li><p><strong>Scales terribly.</strong> 50 teams × 3 environments × 4 GCP projects = 600 keys to track, rotate, and hope haven't been committed to git.</p>
</li>
</ul>
<p>Security teams know this. That's why many organizations have done the only sensible thing: they have disabled service account key generation entirely.</p>
<h2 id="heading-how-the-accidental-air-gap-happens">How the Accidental Air Gap Happens</h2>
<p>When you disable key generation, you haven't solved the hybrid cloud platform problem — you've just made it someone else's problem. That someone is usually a platform team staring at a Jira ticket that says "cannot access GCP from on-prem, P1, blocking release."</p>
<p>The result? Your "hybrid cloud platform" isn't hybrid at all. It's two disconnected systems.</p>
<p>Teams resort to building intermediary services, API gateways that proxy requests, or finding creative ways to get keys anyway. None of this is a platform. It's duct tape.</p>
<h2 id="heading-how-workload-identity-federation-bridges-the-gap">How Workload Identity Federation Bridges the Gap</h2>
<p>Every Kubernetes cluster already issues cryptographically signed identity tokens to every pod. And Google Cloud has a service specifically designed to trust those tokens.</p>
<p>This is <strong>Workload Identity Federation</strong> — and combined with OpenID Connect (OIDC), it's the missing piece that makes hybrid platforms actually work.</p>
<p>The service is well named: <em>federation</em> means GCP doesn't store your identity. Instead, it agrees to trust identities issued by another system, as long as they can be cryptographically verified. The exchange happens in four steps:</p>
<ol>
<li><p>Pod presents its Kubernetes-issued JWT to GCP's STS endpoint.</p>
</li>
<li><p>STS verifies the signature against your cluster's public JWKS.</p>
</li>
<li><p>STS checks the JWT's claims against the Workload Identity Pool's rules (audience, issuer, CEL conditions).</p>
</li>
<li><p>STS returns a short-lived Google access token (typically 1 hour) that the pod uses for API calls.</p>
</li>
</ol>
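<p>To make step 1 concrete, here is a minimal sketch of the form fields a client would send to the STS token endpoint. The field names follow the OAuth 2.0 token exchange format; the helper name <code>build_sts_exchange_request</code> and the placeholder token and audience values are assumptions for illustration. In practice Google's client libraries build and send this request for you.</p>
<pre><code class="language-python">import json

STS_TOKEN_URL = "https://sts.googleapis.com/v1/token"


def build_sts_exchange_request(subject_token, audience):
    """Return the form fields for an OAuth 2.0 token exchange request.

    Hypothetical helper: it only builds the payload and sends nothing.
    """
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        # The Kubernetes-issued ServiceAccount JWT goes here.
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own pool and provider.
    audience = (
        "//iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/"
        "workloadIdentityPools/POOL_ID/providers/PROVIDER_ID"
    )
    body = build_sts_exchange_request("PLACEHOLDER.K8S.JWT", audience)
    print("POST", STS_TOKEN_URL)
    print(json.dumps(body, indent=2))
</code></pre>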
<p>It is also worth mentioning that Workload Identity Federation is not Kubernetes specific. It works with AWS IAM, Azure AD, GitHub Actions OIDC, and any OIDC-compliant identity provider.</p>
<h2 id="heading-how-kubernetes-identity-works">How Kubernetes Identity Works</h2>
<p>Every pod with a ServiceAccount gets a JSON Web Token (JWT) automatically mounted at <code>/run/secrets/kubernetes.io/serviceaccount/token</code>. This isn't just an opaque blob — it's a signed assertion of identity:</p>
<pre><code class="language-json">{
  "iss": "https://kubernetes.default.svc.cluster.local",
  "sub": "system:serviceaccount:production:backend-api",
  "aud": ["https://iam.googleapis.com/..."],
  "kubernetes.io": {
    "namespace": "production",
    "serviceaccount": {
      "name": "backend-api"
    }
  },
  "exp": 1735689600
}
</code></pre>
<p>In a JWT, claims are just the key-value pairs inside the token's payload — each one is a claim the issuer is making about the subject. Think of them as facts the token is asserting, signed cryptographically so the verifier can trust them.</p>
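<p>If you want to look at the claims in a token yourself, the payload is just base64url-encoded JSON. The sketch below decodes it using only the standard library. Note that it does <strong>not</strong> verify the signature, so treat it as an inspection aid only. The function names and the demo token are made up for this example.</p>
<pre><code class="language-python">import base64
import json


def decode_jwt_claims(token):
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    payload_segment = token.split(".")[1]
    # base64url segments omit padding; restore it before decoding.
    padding = "=" * (-len(payload_segment) % 4)
    payload_bytes = base64.urlsafe_b64decode(payload_segment + padding)
    return json.loads(payload_bytes)


def make_demo_token(claims):
    """Build an unsigned, JWT-shaped string for demonstration only."""
    def encode(obj):
        raw = json.dumps(obj).encode("utf-8")
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

    return ".".join([encode({"alg": "none"}), encode(claims), "signature"])


if __name__ == "__main__":
    demo = make_demo_token({
        "sub": "system:serviceaccount:production:backend-api",
        "kubernetes.io": {"namespace": "production"},
    })
    claims = decode_jwt_claims(demo)
    print(claims["sub"])
    print(claims["kubernetes.io"]["namespace"])
</code></pre>
<p>Inside a pod, you could pass the contents of the mounted token file to <code>decode_jwt_claims</code> to see what your cluster actually asserts.</p>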
<p>The critical insight: this token is signed with the cluster's private signing key and is verifiable by anyone who has your cluster's public keys, which are exposed via the JSON Web Key Set (JWKS) endpoint:</p>
<pre><code class="language-bash">kubectl get --raw /openid/v1/jwks
</code></pre>
<p>Google Cloud's Security Token Service (STS) can validate these tokens. No keys are exchanged. No secrets are stored. Just cryptographic proof of identity.</p>
<h2 id="heading-how-to-prepare-google-cloud-platform-resources">How to Prepare Google Cloud Platform Resources</h2>
<p>The Workload Identity Pool is a trust boundary — a declaration that says "I accept identities from external sources." The OIDC Provider configures how to validate those identities.</p>
<pre><code class="language-hcl">resource "google_iam_workload_identity_pool" "pool" {
  workload_identity_pool_id = "hybrid-platform-pool"
  project                   = "my-project"
}

resource "google_iam_workload_identity_pool_provider" "k8s_provider" {
  project                            = "my-project"
  workload_identity_pool_id          = google_iam_workload_identity_pool.pool.workload_identity_pool_id
  workload_identity_pool_provider_id = "on-prem-cluster"

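  attribute_mapping = {
    "google.subject"            = "assertion.sub"
    "attribute.namespace"       = "assertion['kubernetes.io']['namespace']"
    # Needed by any condition that references attribute.service_account
    "attribute.service_account" = "assertion['kubernetes.io']['serviceaccount']['name']"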
  }

  attribute_condition = "attribute.namespace in [\"production\", \"staging\"]"

  oidc {
    issuer_uri = "https://kubernetes.default.svc.cluster.local"
    jwks_json  = file("jwks.json")  # Your cluster's public keys
  }
}
</code></pre>
<p>Two things to note here:</p>
<ol>
<li><p><code>attribute_mapping</code> extracts claims from the Kubernetes JWT and makes them available as GCP attributes. The expression <code>assertion['kubernetes.io']['namespace']</code> pulls out the namespace so you can use it for access control.</p>
</li>
<li><p><code>attribute_condition</code> is where security policy lives. More on this in the next section.</p>
</li>
</ol>
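<p>As a mental model, attribute mapping is a lookup from JWT claims to named attributes. The following Python sketch mirrors the subject and namespace mappings on a decoded claims dictionary. It is a simulation for illustration, not how Google evaluates the mapping, and <code>map_attributes</code> is a hypothetical name.</p>
<pre><code class="language-python">def map_attributes(assertion):
    """Mimic the provider's attribute_mapping on decoded JWT claims."""
    return {
        # "google.subject" = "assertion.sub"
        "google.subject": assertion["sub"],
        # "attribute.namespace" = "assertion['kubernetes.io']['namespace']"
        "attribute.namespace": assertion["kubernetes.io"]["namespace"],
    }


if __name__ == "__main__":
    claims = {
        "sub": "system:serviceaccount:production:backend-api",
        "kubernetes.io": {
            "namespace": "production",
            "serviceaccount": {"name": "backend-api"},
        },
    }
    for name, value in map_attributes(claims).items():
        print(name, "=", value)
</code></pre>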
<h2 id="heading-how-to-use-cel-for-fine-grained-access-control">How to Use CEL for Fine-Grained Access Control</h2>
<p>The <code>attribute_condition</code> field uses Common Expression Language (CEL). This single line of policy can replace dozens of Identity and Access Management (IAM) bindings:</p>
<pre><code class="language-plaintext">attribute.namespace in ["production", "staging"]
</code></pre>
<p>With this condition, a pod in the <code>kube-system</code> namespace cannot authenticate to GCP at all — the token exchange is rejected before IAM is even consulted.</p>
<p>You can get more sophisticated:</p>
<pre><code class="language-plaintext">// Only production namespace, and only specific service accounts
attribute.namespace == "production" &amp;&amp;
  attribute.service_account in ["payment-processor", "order-service"]

// Allow staging, but only during business hours
attribute.namespace == "staging" &amp;&amp;
  request.time.getHours("America/New_York") &gt;= 9 &amp;&amp;
  request.time.getHours("America/New_York") &lt; 17
</code></pre>
<p>This is defense in depth. Even if someone creates a rogue ServiceAccount or has <code>kubectl</code> access, they cannot authenticate to GCP unless the CEL condition passes. The security boundary is enforced by Google's infrastructure, not by hoping developers follow policy.</p>
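<p>To reason about which identities a condition admits before you apply it, it can help to restate the condition as a plain predicate and run a few sample namespaces through it. The sketch below is a Python stand-in for the first condition in this section. It is not a CEL evaluator, and the names are assumptions for illustration.</p>
<pre><code class="language-python">ALLOWED_NAMESPACES = ("production", "staging")


def condition_passes(attributes):
    """Python stand-in for: attribute.namespace in ["production", "staging"]."""
    return attributes.get("attribute.namespace") in ALLOWED_NAMESPACES


if __name__ == "__main__":
    for namespace in ("production", "staging", "kube-system"):
        attributes = {"attribute.namespace": namespace}
        verdict = "accepted" if condition_passes(attributes) else "rejected"
        print(namespace, verdict)
</code></pre>
<p>A table of sample inputs like this also makes a handy regression check when someone proposes widening the allowed list.</p>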
<h2 id="heading-how-to-inject-credentials-automatically-with-kyverno">How to Inject Credentials Automatically with Kyverno</h2>
<p>Having a working identity federation is only half the battle. Your customers and developers shouldn't need to understand OIDC, STS, or credential configuration files. They should deploy their app and have it work.</p>
<p>Before we get to the automation, it's worth pausing on what a <em>credential configuration file</em> actually is — because the name is a little misleading.</p>
<p>A credential configuration file (sometimes called an "external account config" or "ADC config") is a small JSON document that tells Google's client libraries <strong>how to obtain</strong> a credential at runtime. It is <strong>not</strong> itself a credential. You'll see the actual file later in this article — it contains no secrets. Just metadata: the Workload Identity Pool audience, the STS token-exchange endpoint, the source token type, and the path on the pod's filesystem where the real (short-lived) Kubernetes ServiceAccount token lives.</p>
<p>Compare that to a traditional service account key:</p>
<table>
<thead>
<tr>
<th></th>
<th>Service Account Key (<code>key.json</code>)</th>
<th>Credential Config (<code>credential-configuration.json</code>)</th>
</tr>
</thead>
<tbody><tr>
<td>What's inside the file</td>
<td>An RSA private key that <em>is</em> the credential</td>
<td>Instructions for exchanging an external token</td>
</tr>
<tr>
<td>Lifetime of the secret material</td>
<td>Forever, until manually rotated</td>
<td>Source token rotates automatically (~1h TTL)</td>
</tr>
<tr>
<td>If the file leaks</td>
<td>Long-lived access to a GCP service account</td>
<td>Useless on its own — points to a token only the pod can read</td>
</tr>
<tr>
<td>Identity model</td>
<td>Impersonates a GCP service account directly</td>
<td>Federates an external identity into GCP via STS</td>
</tr>
<tr>
<td>Who handles rotation</td>
<td>A human (or no one)</td>
<td>The Kubernetes API server, transparently</td>
</tr>
</tbody></table>
<p>Both files end up referenced by <code>GOOGLE_APPLICATION_CREDENTIALS</code> and look interchangeable from the application's point of view — but only one of them is dangerous to lose. The credential config file is safe to ship in a ConfigMap precisely because there's nothing to steal.</p>
<p>Putting this file in a ConfigMap is only half the solution: it also has to reach the workload pods that need access to GCP services. This is where Kyverno comes in. A single ClusterPolicy automatically injects everything a pod needs:</p>
<pre><code class="language-yaml">apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: workload-identity-federation
spec:
  rules:
    - name: inject-gcp-credentials
      match:
        any:
          - resources:
              kinds:
                - Deployment
              selector:
                matchLabels:
                  workload-identity-federation: "enabled"
      mutate:
        patchStrategicMerge:
          spec:
            template:
              spec:
                volumes:
                  - name: workload-identity-credential-configuration
                    configMap:
                      name: workload-identity-federation-config
                containers:
                  - (name): "*"
                    volumeMounts:
                      - name: workload-identity-credential-configuration
                        mountPath: /etc/workload-identity
                        readOnly: true
                    env:
                      - name: GOOGLE_APPLICATION_CREDENTIALS
                        value: "/etc/workload-identity/credential-configuration.json"
</code></pre>
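<p>The above cluster policy does three things:</p>
<ol>
<li><p>Adds a volume backed by the <code>workload-identity-federation-config</code> ConfigMap to the pod template.</p>
</li>
<li><p>Mounts that volume into every container in the deployment at <code>/etc/workload-identity</code>.</p>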
</li>
<li><p>Injects an environment variable called <code>GOOGLE_APPLICATION_CREDENTIALS</code> that points to the absolute path of the credential config file.</p>
</li>
</ol>
<p>From a developer's perspective, this is their entire integration:</p>
<pre><code class="language-yaml">apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  labels:
    workload-identity-federation: "enabled" # That's it.
spec:
  # ... normal deployment spec
</code></pre>
<p>The credential configuration file (created by Terraform as a ConfigMap) tells Google's client libraries how to exchange tokens:</p>
<pre><code class="language-json">{
  "type": "external_account",
  "audience": "//iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/providers/PROVIDER_ID",
  "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
  "token_url": "https://sts.googleapis.com/v1/token",
  "credential_source": {
    "file": "/run/secrets/kubernetes.io/serviceaccount/token"
  }
}
</code></pre>
<p>This JSON file is a credential configuration for Google's Workload Identity Federation. It instructs Google Cloud client libraries to obtain cloud access tokens by exchanging a Kubernetes ServiceAccount token (located at <code>/run/secrets/kubernetes.io/serviceaccount/token</code>) for a Google Cloud access token, using an external identity provider configured via a Workload Identity Pool. This allows workloads running outside of GCP, such as on-premises Kubernetes clusters, to authenticate to Google Cloud services without needing to manage long-lived service account keys.</p>
<p>Every Google Cloud SDK and client library understands this format. Python, Go, Java, and Node.js all just work.</p>
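<p>Because the file is pure metadata, it is easy to generate. As a rough sketch, the function below builds the same structure from a project number, pool ID, and provider ID. The function name and the sample values are assumptions for illustration; the article's repository creates this file with Terraform instead.</p>
<pre><code class="language-python">import json

DEFAULT_TOKEN_PATH = "/run/secrets/kubernetes.io/serviceaccount/token"


def build_credential_config(project_number, pool_id, provider_id,
                            token_path=DEFAULT_TOKEN_PATH):
    """Build an external_account credential configuration as a dict."""
    audience = (
        "//iam.googleapis.com/projects/{}/locations/global/"
        "workloadIdentityPools/{}/providers/{}"
    ).format(project_number, pool_id, provider_id)
    return {
        "type": "external_account",
        "audience": audience,
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "token_url": "https://sts.googleapis.com/v1/token",
        # Only a path is stored here; the token itself stays in the pod.
        "credential_source": {"file": token_path},
    }


if __name__ == "__main__":
    # Sample values; replace with your own project number and IDs.
    config = build_credential_config(
        "123456789012", "hybrid-platform-pool", "on-prem-cluster"
    )
    print(json.dumps(config, indent=2))
</code></pre>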
<h2 id="heading-how-to-grant-iam-permissions-to-federated-identities">How to Grant IAM Permissions to Federated Identities</h2>
<p>Once STS has accepted a ServiceAccount token, the resulting federated identity still needs permissions to access resources. You bind IAM roles to the identity pool attributes:</p>
<pre><code class="language-hcl">resource "google_project_iam_member" "secret_access" {
  for_each = toset(["production", "staging"])
  project  = "my-project"
  role     = "roles/secretmanager.secretAccessor"
  member   = "principalSet://iam.googleapis.com/projects/${PROJECT_NUMBER}/locations/global/workloadIdentityPools/${POOL_ID}/attribute.namespace/${each.value}"
}
</code></pre>
<p>This grants Secret Manager access to all pods authenticated from the <code>production</code> or <code>staging</code> namespaces. The <code>principalSet</code> syntax allows matching on attributes. You can also restrict to specific service accounts:</p>
<pre><code class="language-plaintext">member = "principal://iam.googleapis.com/.../subject/system:serviceaccount:production:payment-processor"
</code></pre>
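<p>The two member formats differ only in their suffix, which is easy to get wrong when typing them by hand. The sketch below assembles both strings from their parts. The helper names and sample values are hypothetical; the formats follow the <code>principalSet</code> and <code>principal</code> examples in this section.</p>
<pre><code class="language-python">POOL_TEMPLATE = (
    "iam.googleapis.com/projects/{project_number}/locations/global/"
    "workloadIdentityPools/{pool_id}"
)


def namespace_principal_set(project_number, pool_id, namespace):
    """Member string matching every identity mapped to one namespace."""
    pool = POOL_TEMPLATE.format(project_number=project_number, pool_id=pool_id)
    return "principalSet://{}/attribute.namespace/{}".format(pool, namespace)


def service_account_principal(project_number, pool_id, namespace, name):
    """Member string matching a single Kubernetes ServiceAccount."""
    pool = POOL_TEMPLATE.format(project_number=project_number, pool_id=pool_id)
    subject = "system:serviceaccount:{}:{}".format(namespace, name)
    return "principal://{}/subject/{}".format(pool, subject)


if __name__ == "__main__":
    # Sample values; replace with your own project number and pool ID.
    print(namespace_principal_set("123456789012", "hybrid-platform-pool", "staging"))
    print(service_account_principal(
        "123456789012", "hybrid-platform-pool", "production", "payment-processor"
    ))
</code></pre>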
<h2 id="heading-how-to-verify-the-setup">How to Verify the Setup</h2>
<p>You can verify the setup with a simple Python script that lists secrets from Secret Manager. This runs inside a pod on your on-premises cluster:</p>
<pre><code class="language-python"># list_secrets.py - running on-prem, accessing GCP Secret Manager
from google.cloud import secretmanager

def list_secrets(project_id: str):
    """
    List all secrets in a GCP project.

    No credentials are passed explicitly. The google-cloud-secret-manager
    library automatically:
    1. Reads GOOGLE_APPLICATION_CREDENTIALS env var (set by Kyverno)
    2. Loads the credential configuration JSON
    3. Reads the K8s ServiceAccount token from /run/secrets/...
    4. Exchanges it for a GCP access token via STS
    5. Uses that token to call the Secret Manager API
    """
    client = secretmanager.SecretManagerServiceClient()
    parent = f"projects/{project_id}"

    print(f"Secrets in {project_id}:")
    print("-" * 40)

    for secret in client.list_secrets(request={"parent": parent}):
        secret_name = secret.name.split("/")[-1]
        print(f"  - {secret_name}")

    print("-" * 40)
    print("Authentication: Workload Identity Federation")
    print("Credentials: None stored, token exchanged at runtime")

if __name__ == "__main__":
    list_secrets("my-project-id")
</code></pre>
<p>Run this inside your labeled pod:</p>
<pre><code class="language-bash">$ kubectl exec -it my-app-xyz -- python list_secrets.py

Secrets in my-project-id:
----------------------------------------
  - database-password
  - api-key-stripe
  - oauth-client-secret
  - ml-model-api-key
----------------------------------------
Authentication: Workload Identity Federation
Credentials: None stored, token exchanged at runtime
</code></pre>
<p>No service account key. No secret mounted. Just a Kubernetes ServiceAccount token exchanged for GCP credentials at runtime.</p>
<p>This same pattern works for any GCP service — Secret Manager, Cloud Storage, BigQuery, Pub/Sub, and Vertex AI.</p>
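<p>If a call fails, a quick first check is whether the pod is really using the federated configuration. The sketch below inspects <code>GOOGLE_APPLICATION_CREDENTIALS</code> and reports what kind of file it points to, without contacting GCP. The function name and messages are assumptions for illustration.</p>
<pre><code class="language-python">import json
import os


def describe_credential_source(env=None):
    """Report which kind of credential file the environment points to."""
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return "file not found: " + path
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    if config.get("type") == "external_account":
        # Federated config: it only names where the source token lives.
        token_file = config.get("credential_source", {}).get("file", "unknown")
        return "federated: token read from " + token_file
    return "not federated: credential type is " + str(config.get("type"))


if __name__ == "__main__":
    print(describe_credential_source())
</code></pre>
<p>A result starting with <code>not federated</code> usually means a key file is still mounted somewhere and is taking precedence over the injected configuration.</p>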
<h2 id="heading-how-to-connect-on-prem-apps-to-cloud-gpus">How to Connect On-Prem Apps to Cloud GPUs</h2>
<p>Consider a typical flow: an on-prem order processing service needs to call a Vertex AI endpoint for fraud detection. The model runs on GPUs in Google Cloud (you can spin up A100s in minutes, not months). The application logic stays on-prem (you've already paid for that compute).</p>
<p>With the IAM bindings in place, any pod in the allowed namespaces can call Vertex AI:</p>
<pre><code class="language-python"># fraud_detector.py - running on-prem, calling cloud GPUs
from google.cloud import aiplatform

def check_fraud(transaction: dict) -&gt; float:
    """
    Call a Vertex AI endpoint for fraud detection.

    The model runs on A100 GPUs in Google Cloud.
    This code runs on-prem in the datacenter.

    Authentication is automatic:
    1. Kyverno injected GOOGLE_APPLICATION_CREDENTIALS
    2. The aiplatform SDK reads the credential config
    3. K8s SA token is exchanged for GCP token via STS
    4. Request is authenticated to Vertex AI
    """
    endpoint = aiplatform.Endpoint(
        endpoint_name="projects/my-project/locations/us-central1/endpoints/fraud-model"
    )
    prediction = endpoint.predict(instances=[transaction])
    return prediction.predictions[0]["fraud_score"]


def generate_embeddings(texts: list[str]) -&gt; list[list[float]]:
    """
    Generate text embeddings using a cloud-hosted model.

    Embedding models are GPU-intensive. Running them on-prem
    would require dedicated hardware. In the cloud, you pay per request.
    """
    from vertexai.language_models import TextEmbeddingModel

    model = TextEmbeddingModel.from_pretrained("text-embedding-004")
    embeddings = model.get_embeddings(texts)
    return [e.values for e in embeddings]
</code></pre>
<p>The developer doesn't think about authentication at all. They add the label to their deployment, and their on-prem pod can call:</p>
<ul>
<li><p><strong>Vertex AI endpoints</strong> for ML inference on cloud GPUs</p>
</li>
<li><p><strong>Cloud Storage</strong> for model artifacts and training data</p>
</li>
<li><p><strong>BigQuery</strong> for feature stores and analytics</p>
</li>
<li><p><strong>Pub/Sub</strong> for event streaming between environments</p>
</li>
<li><p><strong>Secret Manager</strong> for API keys and configuration</p>
</li>
</ul>
<p>This is the hybrid platform working as intended.</p>
<h2 id="heading-how-to-scale-gpu-access-with-cel-conditions">How to Scale GPU Access with CEL Conditions</h2>
<p>CEL conditions become especially powerful when you want to restrict GPU access to specific namespaces. For example, to allow only ML-related namespaces to access Vertex AI:</p>
<pre><code class="language-plaintext">attribute.namespace in ["ml-inference", "ml-training", "data-science"] &amp;&amp;
  attribute.service_account.startsWith("ml-")
</code></pre>
<p>You can also grant different access levels per namespace:</p>
<pre><code class="language-hcl"># ML inference namespace gets prediction access
resource "google_project_iam_member" "ml_inference" {
  project = "my-project"
  role    = "roles/aiplatform.user"
  member  = "principalSet://iam.googleapis.com/.../attribute.namespace/ml-inference"
}

# Data science namespace gets full Vertex AI access (for experimentation)
resource "google_project_iam_member" "data_science" {
  project = "my-project"
  role    = "roles/aiplatform.admin"
  member  = "principalSet://iam.googleapis.com/.../attribute.namespace/data-science"
}
</code></pre>
<p>The on-prem application teams don't need to know or care about GCP IAM. They deploy to the right namespace, add a label, and the platform handles the rest.</p>
<h2 id="heading-the-security-properties-compared">The Security Properties Compared</h2>
<p>Here's a side-by-side comparison of the two authentication approaches:</p>
<table>
<thead>
<tr>
<th>Property</th>
<th>Service Account Keys</th>
<th>Workload Identity Federation</th>
</tr>
</thead>
<tbody><tr>
<td>Credential lifetime</td>
<td>Until manually rotated (often years)</td>
<td>Short-lived (1 hour for GCP tokens)</td>
</tr>
<tr>
<td>Exfiltration risk</td>
<td>High — static key can be copied anywhere</td>
<td>Low — token expires quickly</td>
</tr>
<tr>
<td>Audit trail</td>
<td>Service account name only</td>
<td>Namespace + service account name</td>
</tr>
<tr>
<td>Key management overhead</td>
<td>600+ keys at scale</td>
<td>Zero keys to manage</td>
</tr>
<tr>
<td>Security policy enforcement</td>
<td>Manual / trust-based</td>
<td>Enforced by GCP infrastructure via CEL</td>
</tr>
<tr>
<td>Developer experience</td>
<td>Copy key, create secret, mount volume</td>
<td>Add one label to the deployment</td>
</tr>
</tbody></table>
<p>The short-lived nature of tokens deserves emphasis. Even in a worst-case scenario where a token is exfiltrated, it soon expires: Kubernetes ServiceAccount tokens have a configurable lifetime, and the GCP access tokens issued by STS are valid for one hour. A service account key, by contrast, remains valid until someone explicitly rotates it, which often takes years.</p>
<h2 id="heading-the-complete-infrastructure-as-code-layout">The Complete Infrastructure as Code Layout</h2>
<p>The entire solution is codified in Terraform, managing both GCP and Kubernetes resources:</p>
<pre><code class="language-plaintext">workload-identity-federation/
├── providers.tf      # Google + Kubernetes providers
├── locals.tf         # Configuration (namespaces, project ID, etc.)
├── gcp.tf            # Identity pool, provider, IAM bindings
└── kubernetes.tf     # ConfigMap with credential configuration
</code></pre>
<p>A single <code>terraform apply</code>:</p>
<ol>
<li><p>Creates the Workload Identity Pool in GCP</p>
</li>
<li><p>Configures the OIDC provider with your cluster's JWKS</p>
</li>
<li><p>Sets up IAM bindings for allowed namespaces</p>
</li>
<li><p>Creates ConfigMaps in each namespace with the credential configuration</p>
</li>
</ol>
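<p>The credential configuration stored in each ConfigMap (step 4) is a small JSON document in Google's <code>external_account</code> format. Here's a hedged sketch of its shape, assuming the token is read from a file and that no service account impersonation is configured. The project number, pool ID, provider ID, and token path are placeholders, and the Terraform module may generate additional fields.</p>
<pre><code class="language-python">import json


def credential_config(project_number: str, pool_id: str, provider_id: str, token_path: str):
    """Build an external_account credential configuration that reads
    the Kubernetes ServiceAccount token from a file."""
    audience = (
        "//iam.googleapis.com/"
        f"projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{pool_id}/providers/{provider_id}"
    )
    return {
        "type": "external_account",
        "audience": audience,
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "token_url": "https://sts.googleapis.com/v1/token",
        "credential_source": {"file": token_path},
    }


# Placeholder identifiers and token path: none of these come from the article.
config = credential_config(
    "123456789012", "onprem-pool", "onprem-oidc", "/var/run/secrets/tokens/gcp-token"
)
print(json.dumps(config, indent=2))
</code></pre>
<p>The file contains no secret material. It only tells the Google client libraries where to find the pod's token and which provider to exchange it with, which is why it can live in a ConfigMap rather than a Secret.</p>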
<p>Combined with the Kyverno policy, you get a fully automated pipeline:</p>
<pre><code class="language-plaintext">New namespace added to allowed list
        │
        ▼
Terraform creates ConfigMap in that namespace
        │
        ▼
Developer deploys with label
        │
        ▼
Kyverno injects credentials automatically
        │
        ▼
Pod authenticates to GCP via OIDC
        │
        ▼
Application accesses GCP services
</code></pre>
<p>No tickets. No key requests. No secrets to manage.</p>
<h2 id="heading-how-to-run-a-proof-of-concept-with-vcluster">How to Run a Proof of Concept with vCluster</h2>
<p>To validate that this works outside GKE, you can set up a demonstration using <a href="https://www.vcluster.com/">vCluster</a>, a virtual Kubernetes cluster that runs inside another Kubernetes cluster. This demonstrates that the solution isn't specific to GKE. You can set up vCluster in Docker using <a href="https://github.com/loft-sh/vind/blob/main/docs/getting-started.md">vind</a>:</p>
<pre><code class="language-yaml"># vcluster.yaml
experimental:
  docker:
    nodes:
      - name: worker-1
      - name: worker-2
deploy:
  cni:
    flannel:
      enabled: true
controlPlane:
  distro:
    k8s:
      version: "v1.35.0"
</code></pre>
<pre><code class="language-shell">[root@localhost #] vcluster create hybrid --driver docker -f vcluster.yaml
[root@localhost #] kubectl get nodes -o wide
hybrid-control-plane   Ready    control-plane   14d   v1.34.0   192.168.107.2   &lt;none&gt;        Debian GNU/Linux 12 (bookworm)   7.0.5-orbstack-00330-ge3df4e19b0a0-dirty   containerd://2.1.3
hybrid-worker          Ready    &lt;none&gt;          14d   v1.34.0   192.168.107.3   &lt;none&gt;        Debian GNU/Linux 12 (bookworm)   7.0.5-orbstack-00330-ge3df4e19b0a0-dirty   containerd://2.1.3
hybrid-worker2         Ready    &lt;none&gt;          14d   v1.34.0   192.168.107.4   &lt;none&gt;        Debian GNU/Linux 12 (bookworm)   7.0.5-orbstack-00330-ge3df4e19b0a0-dirty   containerd://2.1.3
</code></pre>
<p>Inside the vCluster, deploy a simple test deployment:</p>
<pre><code class="language-yaml">apiVersion: apps/v1
kind: Deployment
metadata:
  name: gcp-test
  labels:
    workload-identity-federation: "enabled"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gcp-test
  template:
    metadata:
      labels:
        app: gcp-test
    spec:
      containers:
        - name: test
          image: google/cloud-sdk:slim
          command: ["sleep", "infinity"]
</code></pre>
<p>Exec into the pod and verify:</p>
<pre><code class="language-bash">$ kubectl exec -it gcp-test-xxx -- bash

# Inside the pod:
$ gcloud auth login --cred-file=$GOOGLE_APPLICATION_CREDENTIALS
Authenticated with external account credentials for: [principal://iam.googleapis.com/...]

$ gcloud secrets list --project=my-project
NAME                 CREATED
database-password    2024-01-15T10:30:00Z
api-key              2024-01-14T09:15:00Z
</code></pre>
<p>No keys. No secrets mounted. Just identity federation working as designed.</p>
<h2 id="heading-common-issues-and-how-to-solve-them">Common Issues and How to Solve Them</h2>
<h3 id="heading-how-to-handle-jwks-retrieval-for-air-gapped-clusters">How to Handle JWKS Retrieval for Air-Gapped Clusters</h3>
<p>If your cluster's OIDC discovery endpoint isn't publicly reachable (most on-prem clusters aren't), you need to manually export the JWKS and upload it to GCP:</p>
<pre><code class="language-bash">kubectl get --raw /openid/v1/jwks &gt; jwks.json
</code></pre>
<p>This file must be updated if the cluster's signing keys rotate. Set up a periodic job that checks for key changes and updates the Terraform configuration.</p>
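<p>One way such a periodic job could detect rotation is by comparing the key IDs (<code>kid</code>) in the JWKS you uploaded with those the cluster currently serves. The sketch below assumes both documents are available as JSON strings. The function names and sample keys are hypothetical.</p>
<pre><code class="language-python">import json


def key_ids(jwks_text: str):
    """Return the set of key IDs (kid) in a JWKS document."""
    return {key["kid"] for key in json.loads(jwks_text)["keys"]}


def rotation_diff(uploaded: str, live: str):
    """Compare the JWKS uploaded to GCP with the cluster's current JWKS."""
    old, new = key_ids(uploaded), key_ids(live)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


# Hypothetical documents. Real ones come from
# `kubectl get --raw /openid/v1/jwks` and from your Terraform configuration.
uploaded = '{"keys": [{"kid": "key-a", "kty": "RSA"}]}'
live = '{"keys": [{"kid": "key-a", "kty": "RSA"}, {"kid": "key-b", "kty": "RSA"}]}'

diff = rotation_diff(uploaded, live)
print(diff)
if diff["added"] or diff["removed"]:
    print("signing keys changed: re-export jwks.json and re-apply Terraform")
</code></pre>
<p>Comparing key IDs rather than raw file contents avoids false alarms from whitespace or key-ordering differences between exports.</p>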
<h3 id="heading-how-to-fix-issuer-url-mismatches">How to Fix Issuer URL Mismatches</h3>
<p>The <code>iss</code> claim in the Kubernetes token must exactly match the issuer URL configured in the OIDC provider. For clusters using internal DNS:</p>
<pre><code class="language-plaintext">issuer_uri = "https://kubernetes.default.svc.cluster.local"
</code></pre>
<p>This URL doesn't need to be reachable from GCP — the JWKS file provides the validation keys. But it must match what's in the token exactly.</p>
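<p>To check for a mismatch, you can decode the token's payload and compare its <code>iss</code> claim with the configured value. The following is a minimal sketch for debugging only: it doesn't verify the signature, and the token is a hypothetical unsigned one built locally so the example runs offline. In a pod, you would read the projected ServiceAccount token file instead.</p>
<pre><code class="language-python">import base64
import json


def b64url(data: bytes):
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def decode_claims(token: str):
    """Decode the JWT payload WITHOUT verifying the signature (debugging only)."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def issuer_matches(token: str, issuer_uri: str):
    """Exact string comparison: even a trailing slash counts as a mismatch."""
    return decode_claims(token)["iss"] == issuer_uri


# Hypothetical unsigned token standing in for a real ServiceAccount token.
claims = {"iss": "https://kubernetes.default.svc.cluster.local"}
token = ".".join([b64url(b'{"alg":"none"}'), b64url(json.dumps(claims).encode()), ""])

print(issuer_matches(token, "https://kubernetes.default.svc.cluster.local"))
print(issuer_matches(token, "https://kubernetes.default.svc.cluster.local/"))
</code></pre>
<p>The second call returns <code>False</code> because of the trailing slash, which is a common cause of this error.</p>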
<h3 id="heading-how-to-debug-token-exchange-failures">How to Debug Token Exchange Failures</h3>
<p>When authentication fails, the error messages can be cryptic. Common causes and fixes:</p>
<table>
<thead>
<tr>
<th>Error</th>
<th>Likely Cause</th>
<th>Fix</th>
</tr>
</thead>
<tbody><tr>
<td><code>invalid_grant</code></td>
<td>Issuer URL mismatch</td>
<td>Check <code>iss</code> claim in JWT against configured <code>issuer_uri</code></td>
</tr>
<tr>
<td><code>audience mismatch</code></td>
<td>Wrong <code>audience</code> in credential config</td>
<td>Regenerate the credential configuration JSON via Terraform</td>
</tr>
<tr>
<td><code>CEL condition failed</code></td>
<td>Namespace not in allowed list</td>
<td>Add namespace to <code>attribute_condition</code> and re-apply</td>
</tr>
<tr>
<td><code>JWKS validation failed</code></td>
<td>Signing keys have rotated</td>
<td>Re-export JWKS and update Terraform config</td>
</tr>
</tbody></table>
<h2 id="heading-conclusion">Conclusion</h2>
<p>After implementing this setup, on-premises workloads authenticate to Google Cloud exactly like GKE workloads do — without a single long-lived credential. The security team is happy (no keys to audit), developers are happy (just add a label), and the platform team is happy (no more credential management tickets).</p>
<p>Here's what you accomplished in this tutorial:</p>
<ol>
<li><p>Understood why service account keys fail at scale and the security risks they introduce</p>
</li>
<li><p>Created a Workload Identity Pool and OIDC provider in GCP to trust your cluster's token issuer</p>
</li>
<li><p>Used CEL conditions to enforce fine-grained, namespace-level access policies</p>
</li>
<li><p>Automated credential injection into pods using a Kyverno ClusterPolicy</p>
</li>
<li><p>Bound IAM roles to federated identity attributes — no long-lived keys anywhere</p>
</li>
<li><p>Verified the setup by calling GCP APIs (Secret Manager, Vertex AI) from an on-prem pod</p>
</li>
<li><p>Proved the solution works on any Kubernetes cluster using vCluster</p>
</li>
</ol>
<p>The technologies used here aren't new. Kubernetes has exposed OIDC discovery for ServiceAccount tokens by default since version 1.20. Workload Identity Federation has been in GCP for years. Kyverno and Terraform are mature tools. What this tutorial puts together is an end-to-end solution that developers can adopt with minimal effort.</p>
<p>If your organization has disabled service account keys (or should), this is the path forward. Your on-prem and cloud clusters can finally be what they were always meant to be: secure extensions of each other.</p>
<p><em>The complete implementation is available as a Terraform module with Kyverno policies:</em> <a href="https://github.com/shkatara/hybrid-platform-gcp-workload-identity-federation"><em>github.com/shkatara/hybrid-platform-gcp-workload-identity-federation</em></a></p>
<p>If this helps, you can follow me on <a href="https://www.linkedin.com/in/shubhamkatara/">LinkedIn</a>, and follow Kubesimplify on <a href="https://www.youtube.com/@kubesimplify">YouTube</a> and <a href="https://www.linkedin.com/company/kubesimplify/">LinkedIn</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
