<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ containers - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ containers - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Thu, 21 May 2026 10:20:55 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/containers/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How to Encrypt Kubernetes Traffic with cert-manager, Let's Encrypt, and Internal TLS ]]>
                </title>
                <description>
                    <![CDATA[ Most engineers assume their Kubernetes cluster encrypts all of its traffic. It doesn't. The commands you run with kubectl are encrypted — your client and the API server speak TLS. The API server talki ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-encrypt-kubernetes-traffic/</link>
                <guid isPermaLink="false">6a0df3b68b034602219e482c</guid>
                
                    <category>
                        <![CDATA[ Kubernetes ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ distributed system ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Destiny Erhabor ]]>
                </dc:creator>
                <pubDate>Wed, 20 May 2026 17:47:34 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5fc16e412cae9c5b190b6cdd/c1cf9847-fa0f-49f3-93f4-3c5c1e8ac4c0.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Most engineers assume their Kubernetes cluster encrypts all of its traffic. It doesn't. The commands you run with <code>kubectl</code> are encrypted — your client and the API server speak TLS. The API server talking to etcd is usually encrypted too, depending on how the cluster was provisioned.</p>
<p>But traffic between your pods? Plaintext by default. Ingress traffic from the internet to your services? Only encrypted if you explicitly configure TLS. And certificates for internal services? You have to provision those yourself.</p>
<p>This is not a Kubernetes oversight. It's a deliberate design choice — Kubernetes provides the primitives and leaves the implementation to you. The problem is that certificate management is notoriously painful. Certificates expire. Provisioning them manually doesn't scale. Forgetting to rotate them causes outages.</p>
<p>cert-manager solves this. It runs as a controller inside your cluster, watches for <code>Certificate</code> resources, requests certificates from configured issuers, stores them in Kubernetes Secrets, and rotates them automatically before they expire. You declare what you want, cert-manager makes it happen and keeps it that way.</p>
<p>In this article you'll work through how cert-manager's core model works, automate public Ingress TLS using Let's Encrypt, set up an internal Certificate Authority for service-to-service encryption, and understand how certificate rotation works so outages caused by expired certificates become a thing of the past.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<ul>
<li><p>A kind cluster with the nginx Ingress controller installed</p>
</li>
<li><p>Helm 3 installed</p>
</li>
<li><p>A domain name with DNS you control — needed for the Let's Encrypt demo</p>
</li>
<li><p>Basic understanding of TLS: you know what a certificate, a private key, and a CA are</p>
</li>
</ul>
<p>All demo files are in the <a href="https://github.com/Caesarsage/DevOps-Cloud-Projects/tree/main/intermediate/k8/security/cert-manager">DevOps-Cloud-Projects GitHub repository</a>.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a href="#heading-what-is-and-isnt-encrypted-in-kubernetes">What Is and Isn't Encrypted in Kubernetes</a></p>
</li>
<li><p><a href="#heading-how-cert-manager-works">How cert-manager Works</a></p>
<ul>
<li><p><a href="#heading-the-four-core-resources">The Four Core Resources</a></p>
</li>
<li><p><a href="#heading-issuers-and-clusterissuers">Issuers and ClusterIssuers</a></p>
</li>
<li><p><a href="#heading-the-certificate-lifecycle">The Certificate Lifecycle</a></p>
</li>
<li><p><a href="#heading-acme-challenges-http-01-vs-dns-01">ACME Challenges: HTTP-01 vs DNS-01</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-demo-1-install-cert-manager-and-issue-a-certificate-using-pebble-and-lets-encrypt">Demo 1 — Install cert-manager and Issue a Certificate Using Pebble and Let's Encrypt</a></p>
</li>
<li><p><a href="#heading-how-to-get-a-wildcard-certificate-with-dns-01">How to Get a Wildcard Certificate with DNS-01</a></p>
</li>
<li><p><a href="#heading-demo-2-set-up-an-internal-ca-for-service-to-service-tls">Demo 2 — Set Up an Internal CA for Service-to-Service TLS</a></p>
</li>
<li><p><a href="#heading-how-certificate-rotation-works">How Certificate Rotation Works</a></p>
</li>
<li><p><a href="#heading-cleanup">Cleanup</a></p>
</li>
<li><p><a href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-what-is-and-isnt-encrypted-in-kubernetes">What Is and Isn't Encrypted in Kubernetes?</h2>
<p>Before installing anything, it's worth being precise about what the cluster already protects and what it leaves open.</p>
<table>
<thead>
<tr>
<th>Traffic path</th>
<th>Encrypted by default?</th>
<th>Notes</th>
</tr>
</thead>
<tbody><tr>
<td><code>kubectl</code> → API server</td>
<td>Yes</td>
<td>TLS with the cluster CA</td>
</tr>
<tr>
<td>API server → etcd</td>
<td>Usually</td>
<td>Depends on cluster provisioner — verify with your setup</td>
</tr>
<tr>
<td>API server → kubelet</td>
<td>Yes</td>
<td>TLS, but kubelet cert verification depends on configuration</td>
</tr>
<tr>
<td>Pod → Pod (same cluster)</td>
<td><strong>No</strong></td>
<td>Plaintext unless you add a service mesh or mTLS</td>
</tr>
<tr>
<td>Internet → Ingress</td>
<td><strong>No</strong></td>
<td>Opt-in — requires TLS configuration on the Ingress resource</td>
</tr>
<tr>
<td>Pod → Kubernetes API</td>
<td>Yes</td>
<td>Via the service account token and cluster CA</td>
</tr>
</tbody></table>
<p>The two gaps that matter most in practice are pod-to-pod traffic and Ingress TLS. This article covers both Ingress TLS with Let's Encrypt and internal service-to-service encryption using a private CA.</p>
<h2 id="heading-how-cert-manager-works">How cert-manager Works</h2>
<p>cert-manager is a Kubernetes operator. It extends the Kubernetes API with custom resources that represent certificate requests and their configuration. When you create a <code>Certificate</code> resource, cert-manager's controller picks it up, requests a certificate from the configured issuer, and stores the resulting certificate and private key in a Kubernetes Secret. When the certificate approaches its expiry, cert-manager renews it automatically.</p>
<p>This model means your application doesn't know or care about certificate management. It reads a Secret. cert-manager keeps that Secret fresh.</p>
<h3 id="heading-the-four-core-resources">The Four Core Resources</h3>
<p>cert-manager introduces four custom resources that you'll use regularly:</p>
<table>
<thead>
<tr>
<th>Resource</th>
<th>What it represents</th>
</tr>
</thead>
<tbody><tr>
<td><code>Issuer</code></td>
<td>A certificate authority or ACME account — namespace-scoped</td>
</tr>
<tr>
<td><code>ClusterIssuer</code></td>
<td>Same as Issuer, but available cluster-wide</td>
</tr>
<tr>
<td><code>Certificate</code></td>
<td>A request for a certificate — describes what you want</td>
</tr>
<tr>
<td><code>CertificateRequest</code></td>
<td>An individual signing request — created automatically by cert-manager, rarely touched directly</td>
</tr>
</tbody></table>
<p>In practice you'll mostly deal with <code>ClusterIssuer</code> and <code>Certificate</code>. The <code>ClusterIssuer</code> defines where certificates come from. The <code>Certificate</code> defines what certificate you want and where to store it.</p>
<h3 id="heading-issuers-and-clusterissuers">Issuers and ClusterIssuers</h3>
<p>An <code>Issuer</code> can only issue certificates within its own namespace. A <code>ClusterIssuer</code> can issue certificates in any namespace. For shared infrastructure like Let's Encrypt, you almost always want a <code>ClusterIssuer</code>. For application-specific internal CAs, an <code>Issuer</code> scoped to that application's namespace is the safer choice.</p>
<p>cert-manager supports several issuer types. The three you'll encounter most often are:</p>
<p><strong>ACME</strong> — for public certificates from Let's Encrypt or any ACME-compatible CA. Ownership of the domain is proven via an HTTP-01 or DNS-01 challenge.</p>
<p><strong>CA</strong> — for internal certificates signed by a CA whose private key is stored in a Kubernetes Secret. Used for service-to-service TLS within the cluster.</p>
<p><strong>Self-signed</strong> — generates self-signed certificates. Rarely useful on its own, but essential as the bootstrap step when creating an internal CA.</p>
<h3 id="heading-the-certificate-lifecycle">The Certificate Lifecycle</h3>
<p>When you create a <code>Certificate</code> resource, cert-manager follows this sequence:</p>
<ol>
<li><p>Creates a <code>CertificateRequest</code> with a CSR (Certificate Signing Request)</p>
</li>
<li><p>Passes the CSR to the configured issuer</p>
</li>
<li><p>For ACME issuers: creates a <code>Challenge</code> resource and fulfils it (more on this below)</p>
</li>
<li><p>Receives the signed certificate from the issuer</p>
</li>
<li><p>Stores the certificate and private key in the Kubernetes Secret named in <code>spec.secretName</code></p>
</li>
<li><p>Monitors the certificate's expiry — by default, renews when 2/3 of the validity period has elapsed</p>
</li>
</ol>
<p>Your application mounts the Secret. cert-manager updates it silently. Most applications that watch for file changes will pick up the new certificate without a restart.</p>
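<p>To make the default renewal point concrete, here is a small shell sketch of the arithmetic. The function names are hypothetical and not part of cert-manager; the only assumption is the two-thirds default described in the list above.</p>
<pre><code class="language-bash"># Hours after issuance at which renewal starts, given the certificate's
# total validity in hours (assumes the default: 2/3 of the lifetime elapsed).
renewal_start_hours() {
  echo $(( $1 * 2 / 3 ))
}

# The same point expressed in whole days.
renewal_start_days() {
  echo $(( $(renewal_start_hours "$1") / 24 ))
}

renewal_start_hours 2160   # 90-day certificate, prints 1440
renewal_start_days 2160    # prints 60
</code></pre>
<p>For a 90-day certificate this puts renewal at day 60, which leaves 30 days of slack if an issuer is temporarily unavailable.</p>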
<h3 id="heading-acme-challenges-http-01-vs-dns-01">ACME Challenges: HTTP-01 vs DNS-01</h3>
<p>Let's Encrypt needs proof that you control the domain before it issues a certificate. ACME defines two challenge types for this.</p>
<p><strong>HTTP-01</strong> works by having cert-manager create a temporary HTTP endpoint at <code>http://&lt;your-domain&gt;/.well-known/acme-challenge/&lt;token&gt;</code>. Let's Encrypt sends a request to that URL. If the response matches the expected token, the challenge passes. This requires your cluster to be reachable from the internet on port 80.</p>
<p><strong>DNS-01</strong> works by having cert-manager create a temporary DNS TXT record at <code>_acme-challenge.&lt;your-domain&gt;</code>. Let's Encrypt checks for that record. This doesn't require inbound HTTP access, which makes it the right choice for private clusters, and it's the only way to get wildcard certificates (<code>*.example.com</code>).</p>
<p>The trade-off: HTTP-01 is simpler to set up, but it cannot issue wildcard certificates and requires internet-accessible infrastructure. DNS-01 requires API access to your DNS provider but works for internal clusters and wildcards.</p>
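<p>As a rough illustration of what each challenge type asks the ACME server to look up, the sketch below builds the HTTP-01 URL and the DNS-01 record name for a given domain. The helper names and the <code>abc123</code> token are made up for this example; cert-manager does all of this for you.</p>
<pre><code class="language-bash"># URL the ACME server fetches for an HTTP-01 challenge.
# $1 = domain, $2 = challenge token
http01_url() {
  echo "http://$1/.well-known/acme-challenge/$2"
}

# TXT record name checked for a DNS-01 challenge.
# A leading "*." is stripped: a wildcard is validated at its base domain.
dns01_record() {
  echo "_acme-challenge.${1#\*.}"
}

# Which challenge type a name needs: wildcards can only use DNS-01.
challenge_for() {
  case "$1" in
    \*.*) echo "dns01" ;;
    *)    echo "http01" ;;
  esac
}

http01_url example.com abc123
dns01_record "*.example.com"
challenge_for "*.example.com"
</code></pre>
<p>Note that <code>*.example.com</code> and <code>example.com</code> are both validated at the same TXT record name, which is why a single DNS-01 solver can cover the wildcard and the apex together.</p>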
<h2 id="heading-demo-1-install-cert-manager-and-issue-a-certificate-using-pebble-and-lets-encrypt">Demo 1 — Install cert-manager and Issue a Certificate Using Pebble and Let's Encrypt</h2>
<p>Pebble is Let's Encrypt's local ACME test server. It runs inside your cluster, issues certificates using the same ACME protocol as Let's Encrypt, and requires no public domain or internet access. Using Pebble lets you test the full cert-manager flow — challenge, issuance, renewal — on a plain kind cluster.</p>
<p>Once you understand the flow locally, switching to real Let's Encrypt is a one-line change: replace the ClusterIssuer server URL and point a DNS record at a publicly reachable cluster. The rest of the configuration is identical.</p>
<p>You'll install cert-manager and Pebble, create a <code>ClusterIssuer</code> that points at Pebble, deploy a sample application with an Ingress, and watch a certificate be issued and stored automatically. The last two steps switch the same setup to Let's Encrypt staging and then production.</p>
<h3 id="heading-step-1-install-cert-manager">Step 1: Install cert-manager</h3>
<p>cert-manager is now distributed via OCI Helm charts from <code>quay.io/jetstack</code>. The <code>--set crds.enabled=true</code> flag installs the Custom Resource Definitions as part of the chart:</p>
<pre><code class="language-bash">helm upgrade cert-manager oci://quay.io/jetstack/charts/cert-manager \
  --install \
  --create-namespace \
  --namespace cert-manager \
  --set crds.enabled=true \
  --version v1.17.0 \
  --wait
</code></pre>
<p>You also need the nginx Ingress controller, because cert-manager routes HTTP-01 challenges through it. The <code>controller.service.type=ClusterIP</code> override is for kind specifically: the default <code>LoadBalancer</code> Service never gets an <code>EXTERNAL-IP</code> on kind (there's no cloud LB), so <code>--wait</code> blocks until Helm's timeout expires and the install is reported as failed. On a real cluster, drop the override and keep <code>LoadBalancer</code>.</p>
<pre><code class="language-bash">helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.service.type=ClusterIP \
  --wait
</code></pre>
<p>Confirm all four components are running:</p>
<pre><code class="language-bash">kubectl get pods -n cert-manager
kubectl get pods -n ingress-nginx
</code></pre>
<pre><code class="language-plaintext">NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-76f84784c8-r4fx4              1/1     Running   0          6m45s
cert-manager-cainjector-66fbf49587-gv25n   1/1     Running   0          6m45s
cert-manager-webhook-577fddf86-l5wj4       1/1     Running   0          6m45s

NAME                                        READY   STATUS    RESTARTS   AGE
ingress-nginx-controller-6c7cd85885-h7zgx   1/1     Running   0          3m34s
</code></pre>
<blockquote>
<p><strong>kind-specific gotcha: remove the nginx admission webhook now.</strong> On kind, the nginx admission webhook serves with a self-signed certificate that the Kubernetes API server cannot verify. The first time you try to create <em>any</em> Ingress resource you'll see <code>failed calling webhook "validate.nginx.ingress.kubernetes.io": ... x509: certificate signed by unknown authority</code>. Delete the webhook up front so the rest of the demo doesn't trip over it:</p>
</blockquote>
<pre><code class="language-bash">kubectl delete validatingwebhookconfiguration ingress-nginx-admission
</code></pre>
<h3 id="heading-step-2-install-pebble">Step 2: Install Pebble</h3>
<p>Pebble is Let's Encrypt's local ACME test server; the Helm chart used here is published by the JupyterHub project. The chart ships with a companion CoreDNS deployment (<code>pebble-coredns</code>) that Pebble uses to resolve names during ACME validation.</p>
<pre><code class="language-bash">helm install pebble pebble \
  --repo https://jupyterhub.github.io/helm-chart/ \
  --namespace pebble \
  --create-namespace \
  --wait
</code></pre>
<p>Confirm both pods are running:</p>
<pre><code class="language-bash">kubectl get pods -n pebble
</code></pre>
<pre><code class="language-plaintext">NAME                              READY   STATUS    RESTARTS   AGE
pebble-8d8d49d64-lz8ck            1/1     Running   0          36s
pebble-coredns-7fb5c7cbf4-4jw9h   1/1     Running   0          36s
</code></pre>
<h3 id="heading-step-3-wire-up-dns-for-the-fake-hostname">Step 3: Wire up DNS for the fake hostname</h3>
<p>We're going to issue a cert for <code>echo.pebble.local</code>. That hostname is fake — it doesn't exist in any real DNS — so we have to teach <strong>two</strong> independent resolvers about it before issuance will work:</p>
<table>
<thead>
<tr>
<th>Resolver</th>
<th>Used by</th>
<th>What we need it to do</th>
</tr>
</thead>
<tbody><tr>
<td><code>pebble-coredns</code> (in the <code>pebble</code> namespace)</td>
<td>Pebble itself, when it makes the HTTP-01 validation request</td>
<td>Resolve <code>echo.pebble.local</code> → ingress-nginx ClusterIP</td>
</tr>
<tr>
<td>Cluster CoreDNS (<code>kube-system</code>)</td>
<td>cert-manager's HTTP-01 <strong>self-check</strong> before reporting the challenge ready</td>
<td>Forward <code>pebble.local</code> lookups to <code>pebble-coredns</code></td>
</tr>
</tbody></table>
<p>If you skip either layer, the Order will go to <code>invalid</code> state with a DNS lookup failure.</p>
<p>First grab the two IPs you'll need:</p>
<pre><code class="language-bash">NGINX_IP=$(kubectl get svc -n ingress-nginx ingress-nginx-controller \
  -o jsonpath='{.spec.clusterIP}')
PEBBLE_DNS_IP=$(kubectl get svc pebble-coredns -n pebble \
  -o jsonpath='{.spec.clusterIP}')
echo "NGINX_IP=$NGINX_IP  PEBBLE_DNS_IP=$PEBBLE_DNS_IP"
</code></pre>
<p><strong>Patch</strong> <code>pebble-coredns</code> to answer for <code>*.pebble.local</code> with the ingress controller's IP. The CoreDNS <code>template</code> plugin parses unreliably when the whole block is collapsed onto one line, so apply a real multi-line ConfigMap:</p>
<pre><code class="language-bash">cat &lt;&lt;EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: pebble-coredns
  namespace: pebble
data:
  Corefile: |
    .:8053 {
      errors
      health
      ready
      template ANY ANY pebble.local {
        answer "{{ .Name }} 60 IN A ${NGINX_IP}"
      }
      forward . /etc/resolv.conf
      cache 2
      reload
    }
EOF

kubectl rollout restart deploy/pebble-coredns -n pebble
kubectl rollout status deploy/pebble-coredns -n pebble
</code></pre>
<p>Verify it answers correctly:</p>
<pre><code class="language-bash">kubectl run dnstest --rm -it --restart=Never --image=busybox -- \
  nslookup echo.pebble.local ${PEBBLE_DNS_IP}
</code></pre>
<p>You should see <code>Address: &lt;NGINX_IP&gt;</code> in the response. If you get <code>SERVFAIL</code>, check <code>kubectl logs -n pebble deploy/pebble-coredns</code> — a parser error like <code>not a TTL: "}"</code> means the template block collapsed onto one line again.</p>
<p><strong>Patch the cluster CoreDNS</strong> so cert-manager's self-check can resolve the same name. Add a stub zone that forwards <code>pebble.local</code> to <code>pebble-coredns</code>:</p>
<pre><code class="language-bash">cat &lt;&lt;EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        forward . /etc/resolv.conf {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
    pebble.local:53 {
        forward . ${PEBBLE_DNS_IP}
    }
EOF

kubectl rollout restart deploy/coredns -n kube-system
kubectl rollout status deploy/coredns -n kube-system
</code></pre>
<p>Verify the cluster resolver now answers for <code>echo.pebble.local</code> (without specifying a server — it'll use the default kube-dns):</p>
<pre><code class="language-bash">kubectl run dnstest --rm -it --restart=Never --image=busybox -- \
  nslookup echo.pebble.local
</code></pre>
<p>Both <code>Server: 10.96.0.10</code> and <code>Address: &lt;NGINX_IP&gt;</code> should appear.</p>
<h3 id="heading-step-4-fetch-the-pebble-ca-and-create-the-clusterissuer">Step 4: Fetch the Pebble CA and create the ClusterIssuer</h3>
<p>Pebble signs its certificates with a self-signed root that lives in the <code>pebble</code> ConfigMap under <code>root-cert.pem</code>. cert-manager needs to trust this CA to talk to Pebble's ACME directory, so we pass it as a base64-encoded <code>caBundle</code> in the ClusterIssuer:</p>
<pre><code class="language-bash">kubectl get configmap pebble -n pebble \
  -o jsonpath='{.data.root-cert\.pem}' &gt; pebble-ca.crt

head -1 pebble-ca.crt   # should print -----BEGIN CERTIFICATE-----

CA_BUNDLE=$(base64 -i pebble-ca.crt | tr -d '\n')
echo "CA_BUNDLE length: ${#CA_BUNDLE}"   # ~1600 chars, one continuous line
</code></pre>
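<p>Because a malformed bundle is a common reason for the issuer never becoming ready, it can help to validate the value before using it. The following is a minimal sketch with a hypothetical helper name, assuming a <code>base64</code> that accepts <code>-d</code> for decoding (GNU coreutils and recent macOS do):</p>
<pre><code class="language-bash"># Succeeds only if the argument is one unbroken base64 line that
# decodes to text starting with a PEM certificate header.
check_ca_bundle() {
  lines=$(printf '%s' "$1" | wc -l | tr -d ' ')
  first=$(printf '%s' "$1" | base64 -d | head -n 1)
  [ "$lines" = "0" ] || return 1
  [ "$first" = "-----BEGIN CERTIFICATE-----" ]
}

# Usage with the variable built above:
# if check_ca_bundle "$CA_BUNDLE"; then echo "bundle looks sane"; fi
</code></pre>
<p>This only checks the shape of the value. It does not verify that the certificate is the one Pebble is actually serving.</p>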
<p>Create the ClusterIssuer using the heredoc — the <code>${CA_BUNDLE}</code> shell variable gets substituted into the YAML before kubectl reads it:</p>
<pre><code class="language-bash">kubectl apply -f - &lt;&lt;EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: pebble
spec:
  acme:
    server: https://pebble.pebble.svc.cluster.local/dir
    email: test@example.com
    privateKeySecretRef:
      name: pebble-account-key
    caBundle: ${CA_BUNDLE}
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
EOF
</code></pre>
<p>Check the issuer is ready:</p>
<pre><code class="language-bash">kubectl get clusterissuer pebble
</code></pre>
<pre><code class="language-plaintext">NAME     READY   AGE
pebble   True    5s
</code></pre>
<p>If <code>READY</code> stays <code>False</code>, the two most common causes are a malformed caBundle (verify it's a single unbroken base64 line with no newlines) or Pebble being unreachable from the <code>cert-manager</code> namespace. To check reachability:</p>
<pre><code class="language-bash">kubectl run test-curl --rm -it --restart=Never \
  --image=curlimages/curl:latest \
  --namespace cert-manager -- \
  curl -k https://pebble.pebble.svc.cluster.local/dir
</code></pre>
<p>If that returns JSON, Pebble is reachable.</p>
<h3 id="heading-step-5-deploy-a-sample-application">Step 5: Deploy a sample application</h3>
<pre><code class="language-yaml"># echo-app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
        - name: echo
          image: ealen/echo-server:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: echo
  namespace: default
spec:
  selector:
    app: echo
  ports:
    - port: 80
      targetPort: 80
</code></pre>
<pre><code class="language-bash">kubectl apply -f echo-app.yaml
</code></pre>
<p>Verify the resources came up:</p>
<pre><code class="language-bash">kubectl get deploy,pod,svc -n default
</code></pre>
<pre><code class="language-plaintext">NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/echo   1/1     1            1           32s

NAME                        READY   STATUS    RESTARTS   AGE
pod/echo-5665fbcfdd-mbgxj   1/1     Running   0          36s

NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/echo         ClusterIP   10.96.103.114   &lt;none&gt;        80/TCP    40s
service/kubernetes   ClusterIP   10.96.0.1       &lt;none&gt;        443/TCP   32m
</code></pre>
<h3 id="heading-step-6-create-an-ingress-with-tls">Step 6: Create an Ingress with TLS</h3>
<p>The <code>cert-manager.io/cluster-issuer: pebble</code> annotation tells cert-manager to automatically create a <code>Certificate</code> resource for this Ingress, using the issuer we just created. The hostname <code>echo.pebble.local</code> doesn't need to resolve externally — we taught both DNS resolvers about it in Step 3.</p>
<pre><code class="language-yaml"># echo-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo
  namespace: default
  annotations:
    cert-manager.io/cluster-issuer: pebble
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - echo.pebble.local
      secretName: echo-tls     # cert-manager will create this Secret
  rules:
    - host: echo.pebble.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: echo
                port:
                  number: 80
</code></pre>
<pre><code class="language-bash">kubectl apply -f echo-ingress.yaml
</code></pre>
<h3 id="heading-step-7-watch-the-certificate-being-issued">Step 7: Watch the certificate being issued</h3>
<pre><code class="language-bash"># Watch the Certificate resource (Ctrl-C once Ready=True)
kubectl get certificate echo-tls -n default -w
</code></pre>
<pre><code class="language-plaintext">NAME       READY   SECRET     AGE
echo-tls   False   echo-tls   5s
echo-tls   True    echo-tls   28s
</code></pre>
<p>When <code>READY</code> becomes <code>True</code>, the certificate has been issued and stored in the <code>echo-tls</code> Secret. The full chain — CertificateRequest → Order → Challenge → solver pod → Secret — happens in well under a minute on a healthy cluster:</p>
<pre><code class="language-bash">kubectl get certificate,certificaterequest,order,challenge -n default
</code></pre>
<pre><code class="language-plaintext">NAME                                   READY   SECRET     AGE
certificate.cert-manager.io/echo-tls   True    echo-tls   81s

NAME                                            APPROVED   DENIED   READY   ISSUER   AGE
certificaterequest.cert-manager.io/echo-tls-1   True                True    pebble   81s

NAME                                               STATE   AGE
order.acme.cert-manager.io/echo-tls-1-1824732543   valid   81s
</code></pre>
<p>(Challenges are deleted automatically once an Order completes, so <code>kubectl get challenge -n default</code> typically shows nothing at this point — that's success, not failure.)</p>
<p>If <code>READY</code> stays <code>False</code> for more than a minute, see the troubleshooting tips at the end of this step.</p>
<p>Inspect the issued certificate to confirm Pebble signed it:</p>
<pre><code class="language-bash">kubectl get secret echo-tls -n default -o jsonpath='{.data.tls\.crt}' | \
  base64 -d | openssl x509 -noout -issuer -subject -dates
</code></pre>
<pre><code class="language-plaintext">issuer=CN=Pebble Intermediate CA 05478c
subject=
notBefore=May 17 19:09:22 2026 GMT
notAfter=Aug 15 19:09:21 2026 GMT
</code></pre>
<p>Issuer is Pebble's intermediate CA — proof the full ACME flow worked end-to-end. The cert is valid for 90 days, and cert-manager will renew it automatically at day 60.</p>
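<p>If you'd rather script the expiry check than read dates by eye, <code>openssl x509 -checkend</code> can do the comparison. The wrapper below is a sketch with a hypothetical function name; it reads a PEM certificate on stdin and assumes <code>openssl</code> is on your PATH.</p>
<pre><code class="language-bash"># Reads a PEM certificate on stdin.
# Exit status 0 if it expires within $1 days, 1 if it stays valid longer.
# openssl's own exit status is the other way round, hence the inversion.
expires_within_days() {
  if openssl x509 -noout -checkend $(( $1 * 86400 )); then
    return 1
  fi
  return 0
}

# Usage against the demo Secret (needs cluster access):
# kubectl get secret echo-tls -n default -o jsonpath='{.data.tls\.crt}' | \
#   base64 -d | expires_within_days 30
</code></pre>
<p>With the default renewal policy a healthy 90-day certificate should never get within 30 days of expiry, so a success status from a 30-day check is a reasonable signal that renewal is stuck.</p>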
<p>Hit the ingress over HTTPS from inside the cluster to confirm everything is wired together:</p>
<pre><code class="language-bash">kubectl run curltest --rm -it --restart=Never --image=curlimages/curl -- \
  curl -sk https://echo.pebble.local/
</code></pre>
<p>The echo server should return a JSON blob — note the <code>"x-forwarded-proto":"https"</code> field, which proves the request came through nginx over TLS.</p>
<p><strong>Troubleshooting if the cert never goes Ready:</strong></p>
<ul>
<li><p><code>kubectl describe order -n default</code> — look for "DNS problem" or "Connection refused" in the events.</p>
</li>
<li><p><code>kubectl logs -n pebble deploy/pebble --tail=50</code> — Pebble logs the exact URL it tried to fetch during validation and any errors.</p>
</li>
<li><p>If the Order is stuck pending with no events: cert-manager hasn't reconciled yet. Wait 30s.</p>
</li>
<li><p>If the Order is <code>invalid</code>: one of the two DNS layers (Step 3) is misconfigured. Re-run both <code>nslookup</code> checks.</p>
</li>
<li><p>If the Ingress apply itself failed with an x509 webhook error: you skipped the <code>kubectl delete validatingwebhookconfiguration ingress-nginx-admission</code> step in Step 1.</p>
</li>
</ul>
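<p>The checklist above can be condensed into a small lookup keyed on the Order state. This is only a sketch of the triage described here, with a hypothetical function name, and it covers just the states this demo is likely to show.</p>
<pre><code class="language-bash"># Map an ACME Order state to the next check suggested in the list above.
order_hint() {
  case "$1" in
    valid)   echo "issued: inspect the echo-tls Secret" ;;
    pending) echo "wait 30s, then describe the Order for events" ;;
    invalid) echo "re-run both nslookup checks from Step 3" ;;
    *)       echo "unknown state: check the Pebble logs" ;;
  esac
}

# Usage (needs cluster access):
# order_hint "$(kubectl get order -n default -o jsonpath='{.items[0].status.state}')"
</code></pre>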
<h3 id="heading-step-8-switch-to-lets-encrypt-staging-real-public-domain">Step 8: Switch to Let's Encrypt staging (real public domain)</h3>
<p>Pebble proved the flow works locally. Now move to a publicly-reachable domain pointed at a publicly-reachable cluster. The DNS gymnastics from Step 3 go away — the domain is real, so both resolvers find it without intervention.</p>
<p>Use Let's Encrypt <strong>staging</strong> first. It speaks the same ACME protocol as production but with generous rate limits, so failed attempts during testing won't lock you out:</p>
<pre><code class="language-yaml"># clusterissuer-staging.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: your-email@example.com
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
</code></pre>
<pre><code class="language-bash">kubectl apply -f clusterissuer-staging.yaml

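# Point the Ingress at staging, then delete the Secret to force re-issuance.
# Also change the hostname in echo-ingress.yaml (spec.tls[].hosts and
# spec.rules[].host) to your real domain and re-apply it first.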
kubectl annotate ingress echo \
  cert-manager.io/cluster-issuer=letsencrypt-staging --overwrite -n default
kubectl delete secret echo-tls -n default
</code></pre>
<p>The new cert's issuer will look something like <code>(STAGING) Let's Encrypt</code>.</p>
<h3 id="heading-step-9-switch-to-lets-encrypt-production">Step 9: Switch to Let's Encrypt production</h3>
<p>Once staging works, repeat with the production ClusterIssuer. The only difference is the <code>server</code> URL:</p>
<pre><code class="language-yaml"># clusterissuer-prod.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your-email@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
</code></pre>
<pre><code class="language-bash">kubectl apply -f clusterissuer-prod.yaml
kubectl annotate ingress echo \
  cert-manager.io/cluster-issuer=letsencrypt-prod --overwrite -n default
kubectl delete secret echo-tls -n default
</code></pre>
<p>cert-manager detects the missing Secret and immediately requests a new, browser-trusted certificate from Let's Encrypt production using the <code>letsencrypt-prod</code> issuer.</p>
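<p>Since the staging and production issuers differ only in the server URL and the resource names, one manifest can be derived from the other instead of being maintained by hand. Here is a sketch using <code>sed</code>; the function name is hypothetical, and it assumes the naming used in the two manifests above.</p>
<pre><code class="language-bash"># Reads a staging ClusterIssuer manifest on stdin and writes the
# production equivalent to stdout: swaps the ACME directory URL and
# renames letsencrypt-staging* to letsencrypt-prod*.
to_production() {
  sed -e 's#https://acme-staging-v02.api.letsencrypt.org/directory#https://acme-v02.api.letsencrypt.org/directory#' \
      -e 's#letsencrypt-staging#letsencrypt-prod#g'
}

# Usage (needs cluster access):
# cat clusterissuer-staging.yaml | to_production | kubectl apply -f -
</code></pre>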
<h2 id="heading-how-to-get-a-wildcard-certificate-with-dns-01">How to Get a Wildcard Certificate with DNS-01</h2>
<p>HTTP-01 challenges work well for single domains with public ingress. But there are two situations where you need DNS-01 instead: when your cluster is not reachable from the internet (internal clusters, environments behind a VPN), and when you want a wildcard certificate that covers all subdomains of your domain. In both cases cert-manager still needs outbound access to the ACME server and to your DNS provider's API.</p>
<p>DNS-01 requires cert-manager to be able to create and delete TXT records in your DNS provider. cert-manager has built-in support for Route53, Cloud DNS, Cloudflare, Azure DNS, and many others.</p>
<p>Here is a <code>ClusterIssuer</code> for DNS-01 using AWS Route53:</p>
<pre><code class="language-yaml"># clusterissuer-dns01.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns01
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your-email@example.com
    privateKeySecretRef:
      name: letsencrypt-dns01-account-key
    solvers:
      - dns01:
          route53:
            region: us-east-1
            # Use IRSA (IAM Roles for Service Accounts) in production
            # rather than static credentials
            hostedZoneID: YOUR_HOSTED_ZONE_ID
</code></pre>
<p>A wildcard <code>Certificate</code> using that issuer:</p>
<pre><code class="language-yaml"># wildcard-cert.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-example-com
  namespace: default
spec:
  secretName: wildcard-example-com-tls
  issuerRef:
    name: letsencrypt-dns01
    kind: ClusterIssuer
  commonName: "*.example.com"
  dnsNames:
    - "*.example.com"
    - "example.com"        # Also cover the apex domain
  duration: 2160h           # 90 days
  renewBefore: 720h         # Renew 30 days before expiry
</code></pre>
<p>The resulting Secret <code>wildcard-example-com-tls</code> can be referenced by any Ingress in the <code>default</code> namespace. Every first-level subdomain (<code>api.example.com</code>, <code>dashboard.example.com</code>, <code>staging.example.com</code>) is covered by a single certificate that rotates automatically. A wildcard matches exactly one DNS label, so a deeper name such as <code>v2.api.example.com</code> needs its own <code>dnsNames</code> entry.</p>
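<p>A wildcard entry stands for exactly one DNS label, which is why the manifest lists the apex domain separately. The following sketch models that matching rule in plain bash. It is an approximation for illustration rather than the logic any TLS library uses, and <code>wildcard_covers</code> is a hypothetical name:</p>
<pre><code class="language-bash"># Hypothetical helper: succeed if a hostname is covered by a wildcard
# dnsName such as *.example.com (the wildcard stands for one label).
wildcard_covers() {
  local wildcard="$1" host="$2"
  local base="${wildcard#\*.}"
  local label="${host%".$base"}"
  # The host must end in .base, preceded by a single non-empty label.
  [ "$label" != "$host" ] || return 1
  [ -n "$label" ] || return 1
  case "$label" in
    *.*) return 1 ;;
    *)   return 0 ;;
  esac
}

for host in api.example.com example.com v2.api.example.com; do
  if wildcard_covers "*.example.com" "$host"; then
    echo "$host: covered"
  else
    echo "$host: not covered"
  fi
done
</code></pre>
<p>Only <code>api.example.com</code> is reported as covered. The apex and the two-level name both fall outside the wildcard.</p>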
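<p>For Cloudflare instead of Route53, the solver section looks like this. Because a <code>ClusterIssuer</code> reads its Secrets from the <code>cert-manager</code> namespace, the <code>cloudflare-api-token</code> Secret must be created there:</p>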
<pre><code class="language-yaml">    solvers:
      - dns01:
          cloudflare:
            email: your-email@example.com
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token
</code></pre>
<h2 id="heading-demo-2-set-up-an-internal-ca-for-service-to-service-tls">Demo 2 — Set Up an Internal CA for Service-to-Service TLS</h2>
<p>Let's Encrypt certificates are great for public-facing services. But for internal services — a gRPC microservice calling another, a web application talking to its database — you don't need public trust. You need a CA that the cluster trusts, and you need it to issue certificates for service names that don't exist as public DNS records.</p>
<p>cert-manager's CA issuer handles this. You create a root CA, tell cert-manager about it, and then issue certificates for internal services using that CA. Every service that trusts the root CA trusts every certificate it issues.</p>
<h3 id="heading-step-1-create-a-self-signed-clusterissuer">Step 1: Create a self-signed ClusterIssuer</h3>
<p>A self-signed issuer produces certificates that are signed with their own private key rather than by a separate CA. You use it as a bootstrap step to create the root CA certificate:</p>
<pre><code class="language-yaml"># selfsigned-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned
spec:
  selfSigned: {}
</code></pre>
<pre><code class="language-bash">kubectl apply -f selfsigned-issuer.yaml
</code></pre>
<h3 id="heading-step-2-create-the-root-ca-certificate">Step 2: Create the root CA certificate</h3>
<p>Use the self-signed issuer to create a CA certificate. The <code>isCA: true</code> field tells cert-manager this certificate can sign other certificates:</p>
<pre><code class="language-yaml"># internal-ca.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: internal-ca
  namespace: cert-manager    # Store in cert-manager namespace
spec:
  isCA: true
  commonName: internal-ca
  secretName: internal-ca-secret
  duration: 87600h           # 10 years — this is a root CA
  renewBefore: 720h
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: selfsigned
    kind: ClusterIssuer
</code></pre>
<pre><code class="language-bash">kubectl apply -f internal-ca.yaml
kubectl get certificate internal-ca -n cert-manager
</code></pre>
<pre><code class="language-plaintext">NAME          READY   SECRET               AGE
internal-ca   True    internal-ca-secret   8s
</code></pre>
<h3 id="heading-step-3-create-a-ca-clusterissuer-backed-by-the-root-ca">Step 3: Create a CA ClusterIssuer backed by the root CA</h3>
<p>Now create a <code>ClusterIssuer</code> that uses the root CA Secret you just created. This is the issuer that will sign certificates for your internal services:</p>
<pre><code class="language-yaml"># internal-ca-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca
spec:
  ca:
    secretName: internal-ca-secret   # References the Secret in cert-manager namespace
</code></pre>
<pre><code class="language-bash">kubectl apply -f internal-ca-issuer.yaml
kubectl get clusterissuer internal-ca
</code></pre>
<pre><code class="language-plaintext">NAME          READY   AGE
internal-ca   True    5s
</code></pre>
<h3 id="heading-step-4-issue-a-certificate-for-an-internal-service">Step 4: Issue a certificate for an internal service</h3>
<p>Now issue a certificate for an internal gRPC service. The <code>dnsNames</code> use Kubernetes internal DNS names — <code>&lt;service&gt;.&lt;namespace&gt;.svc.cluster.local</code>:</p>
<pre><code class="language-yaml"># payments-cert.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: payments-tls
  namespace: production
spec:
  secretName: payments-tls-secret
  issuerRef:
    name: internal-ca
    kind: ClusterIssuer
  commonName: payments.production.svc.cluster.local
  dnsNames:
    - payments.production.svc.cluster.local
    - payments.production.svc
    - payments
  duration: 2160h     # 90 days
  renewBefore: 360h   # Renew 15 days before expiry
</code></pre>
<pre><code class="language-bash">kubectl create namespace production
kubectl apply -f payments-cert.yaml
kubectl get certificate payments-tls -n production
</code></pre>
<pre><code class="language-plaintext">NAME           READY   SECRET                AGE
payments-tls   True    payments-tls-secret   6s
</code></pre>
<p>The Secret <code>payments-tls-secret</code> now contains <code>tls.crt</code>, <code>tls.key</code>, and <code>ca.crt</code>. Mount this into your application pod:</p>
<pre><code class="language-yaml"># In your Deployment spec
volumes:
  - name: tls
    secret:
      secretName: payments-tls-secret
containers:
  - name: payments
    volumeMounts:
      - name: tls
        mountPath: /etc/tls
        readOnly: true
</code></pre>
<p>Your application reads <code>/etc/tls/tls.crt</code> and <code>/etc/tls/tls.key</code> to configure TLS. Clients that need to verify it must trust the issuing CA, whose certificate is stored in the same Secret and mounted at <code>/etc/tls/ca.crt</code>.</p>
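<p>The <code>dnsNames</code> in the Certificate from this step follow a fixed pattern, so they can be generated rather than typed by hand. This is a small sketch that assumes the default cluster domain <code>cluster.local</code>. The helper name <code>service_dns_names</code> is hypothetical:</p>
<pre><code class="language-bash"># Hypothetical helper: print the in-cluster DNS names for a Service.
# Assumes the default cluster domain, cluster.local, unless a third
# argument overrides it.
service_dns_names() {
  local service="$1" namespace="$2" domain="${3:-cluster.local}"
  printf '%s\n' \
    "$service.$namespace.svc.$domain" \
    "$service.$namespace.svc" \
    "$service"
}

service_dns_names payments production
</code></pre>
<p>The bare service name only resolves for clients in the same namespace, so callers elsewhere in the cluster should use one of the longer forms.</p>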
<h3 id="heading-step-5-distribute-the-ca-bundle-with-trust-manager">Step 5: Distribute the CA bundle with trust-manager</h3>
<p>The problem with a custom CA is that every service needs to know about it. cert-manager's companion tool, trust-manager, handles this by distributing the CA bundle as a <code>ConfigMap</code> to every namespace you select:</p>
<pre><code class="language-bash">helm upgrade trust-manager oci://quay.io/jetstack/charts/trust-manager \
  --install \
  --namespace cert-manager \
  --wait
</code></pre>
<p>Create a <code>Bundle</code> resource that takes the CA certificate from the <code>internal-ca-secret</code> and distributes it to every namespace matched by its <code>namespaceSelector</code>:</p>
<pre><code class="language-yaml"># ca-bundle.yaml
apiVersion: trust.cert-manager.io/v1alpha1
kind: Bundle
metadata:
  name: internal-ca-bundle
spec:
  sources:
    - secret:
        name: internal-ca-secret
        key: ca.crt
  target:
    configMap:
      key: ca-bundle.crt
    namespaceSelector:
      matchLabels:
        # Only namespaces carrying this label receive the bundle
        kubernetes.io/metadata.name: production
</code></pre>
<pre><code class="language-bash">kubectl apply -f ca-bundle.yaml
</code></pre>
<p>After a few seconds, every matching namespace has a ConfigMap named <code>internal-ca-bundle</code> containing the CA certificate. Applications mount this ConfigMap to trust internally-issued certificates without any per-service configuration.</p>
<h3 id="heading-step-6-verify-the-certificate-chain">Step 6: Verify the certificate chain</h3>
<pre><code class="language-bash"># Extract the CA cert and service cert
kubectl get secret payments-tls-secret -n production \
  -o jsonpath='{.data.ca\.crt}' | base64 -d &gt; ca.crt

kubectl get secret payments-tls-secret -n production \
  -o jsonpath='{.data.tls\.crt}' | base64 -d &gt; payments.crt

# Verify the cert was signed by the CA
openssl verify -CAfile ca.crt payments.crt
</code></pre>
<pre><code class="language-plaintext">payments.crt: OK
</code></pre>
<h2 id="heading-how-certificate-rotation-works">How Certificate Rotation Works</h2>
<p>Certificate rotation is the part of certificate management that breaks production clusters most often. cert-manager handles it automatically, but understanding the mechanism helps you tune it and debug it when things go wrong.</p>
<p>cert-manager watches every <code>Certificate</code> resource it manages and checks the expiry of the underlying certificate in the Secret. When the remaining validity drops below the <code>renewBefore</code> threshold, cert-manager triggers a renewal. The default <code>renewBefore</code> is 1/3 of the certificate's total validity period — so a 90-day certificate starts renewing at day 60.</p>
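<p>The renewal point is simple arithmetic: the certificate's duration minus <code>renewBefore</code>. The sketch below works through the numbers used in this article, with all values in hours. The function name <code>renewal_start_hours</code> is hypothetical, and the one-third default simply mirrors the description above:</p>
<pre><code class="language-bash"># Hypothetical helper: hours after issuance at which renewal starts.
# Arguments: duration in hours, optional renewBefore in hours.
renewal_start_hours() {
  local duration="$1"
  # Default described above: one third of the total validity.
  local renew_before="${2:-$(( duration / 3 ))}"
  echo $(( duration - renew_before ))
}

renewal_start_hours 2160        # 90-day cert, default: prints 1440 (day 60)
renewal_start_hours 2160 360    # renewBefore of 15 days: prints 1800 (day 75)
renewal_start_hours 87600 720   # 10-year root CA: prints 86880
</code></pre>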
<p>The renewal creates a new <code>CertificateRequest</code>, goes through the full issuance flow, and updates the Secret in place. The new certificate replaces the old one atomically. Applications that mount the Secret as a volume and watch the files for changes (many modern web servers and gRPC frameworks can) pick up the new certificate without restarting. Applications that load the certificate only at startup need a reload or restart after each rotation.</p>
<pre><code class="language-bash"># See the current rotation status
kubectl describe certificate echo-tls -n default
</code></pre>
<p>Look for these fields in the output:</p>
<pre><code class="language-plaintext">Status:
  Not After:   2024-06-18T10:00:00Z
  Not Before:  2024-03-20T10:00:00Z
  Renewal Time: 2024-05-19T10:00:00Z   # When cert-manager will start renewing
  Conditions:
    Type:    Ready
    Status:  True
    Message: Certificate is up to date and has not expired
</code></pre>
<p>If a renewal fails — for example, because the HTTP-01 challenge can't be completed — cert-manager retries with exponential backoff. The existing certificate continues to serve until it actually expires, giving you a window to debug the issue.</p>
<p>To list recent renewal events (add <code>--watch</code> to either command to stream them in real time):</p>
<pre><code class="language-bash">kubectl get events -n default --field-selector reason=Issued
kubectl get events -n default --field-selector reason=Failed
</code></pre>
<p><strong>Setting</strong> <code>renewBefore</code> <strong>correctly:</strong> For public-facing services, 30 days before a 90-day certificate expires is a sensible buffer. For internal short-lived certificates (24-hour validity), set <code>renewBefore</code> to 8 hours so rotation happens well before expiry even if the first attempt fails. Keep <code>renewBefore</code> well below the certificate's validity: renewal starts at <code>duration</code> minus <code>renewBefore</code> after issuance, so a value close to the full duration makes cert-manager renew a certificate almost as soon as it has issued it.</p>
<h2 id="heading-cleanup">Cleanup</h2>
<pre><code class="language-bash"># Remove demo resources
kubectl delete ingress echo -n default
kubectl delete service echo -n default
kubectl delete deployment echo -n default
kubectl delete secret echo-tls -n default
kubectl delete certificate payments-tls -n production
kubectl delete namespace production

# Remove ClusterIssuers while the cert-manager CRDs are still installed
kubectl delete clusterissuer letsencrypt-staging letsencrypt-prod \
  letsencrypt-dns01 internal-ca selfsigned --ignore-not-found

# Uninstall cert-manager and trust-manager
helm uninstall trust-manager -n cert-manager
helm uninstall cert-manager -n cert-manager
kubectl delete namespace cert-manager
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Kubernetes leaves TLS configuration entirely to you. In this article you worked through both the public and internal sides of that responsibility.</p>
<p>On the public side, you installed cert-manager using the current OCI Helm chart, created a <code>ClusterIssuer</code> backed by Let's Encrypt, and watched cert-manager go through the full ACME HTTP-01 challenge flow — from creating a temporary solver pod to storing a valid certificate in a Kubernetes Secret. You saw how switching from staging to production is a one-line annotation change, and how cert-manager renews certificates automatically before they expire.</p>
<p>On the internal side, you bootstrapped a private CA using cert-manager's self-signed issuer, created a <code>ClusterIssuer</code> backed by that CA, and issued certificates for internal service names that only exist inside the cluster. You used trust-manager to distribute the CA bundle cluster-wide so services can trust each other's certificates without per-service configuration. And you saw how to verify the certificate chain with <code>openssl</code> so you can confirm it's working before deploying to production.</p>
<p>Understanding certificate rotation is what separates teams that manage TLS confidently from teams that get woken up at 3am by an expired certificate. cert-manager automates the renewal, but the <code>renewBefore</code> field is your safety margin — set it correctly and know how to read the renewal status.</p>
<p>All YAML manifests and Helm values from this article are available in the <a href="https://github.com/Caesarsage/DevOps-Cloud-Projects/tree/main/intermediate/k8/security/cert-manager">DevOps-Cloud-Projects GitHub repository</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Secure a Kubernetes Cluster: RBAC, Pod Hardening, and Runtime Protection ]]>
                </title>
                <description>
                    <![CDATA[ In 2018, RedLock's cloud security research team discovered that Tesla's Kubernetes dashboard was exposed to the public internet with no password on it. An attacker had found it, deployed pods inside T ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-secure-a-kubernetes-cluster-handbook/</link>
                <guid isPermaLink="false">69c4112310e664c5dac43f41</guid>
                
                    <category>
                        <![CDATA[ Kubernetes ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Security ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Destiny Erhabor ]]>
                </dc:creator>
                <pubDate>Wed, 25 Mar 2026 16:45:23 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/4039b7a4-bb45-4df5-b13b-7414985c1a7e.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In 2018, RedLock's cloud security research team discovered that Tesla's Kubernetes dashboard was exposed to the public internet with no password on it.</p>
<p>An attacker had found it, deployed pods inside Tesla's cluster, and was using them to mine cryptocurrency – all on Tesla's AWS bill. The cluster had no authentication on the dashboard, no network restrictions on egress, and nothing monitoring for intrusion. Any one of those controls would have stopped the attack. None of them were in place.</p>
<p>This wasn't a sophisticated zero-day exploit. It was a misconfigured default.</p>
<p>Kubernetes ships with powerful security primitives. The problem is that almost none of them are enabled by default. A fresh cluster is deliberately permissive so it's easy to get started. That permissiveness is a feature in development. In production, it's a liability.</p>
<p>In this handbook, we'll work through the three most impactful security layers in Kubernetes. We'll start with Role-Based Access Control, which governs who can do what to which resources in the API. From there we'll move to pod runtime security, which locks down what containers can actually do once they're running on a node. Finally we'll deploy Falco, a syscall-level detection engine that watches for attacks in progress and alerts in real time.</p>
<p>By the end, you'll have a hardened cluster with working RBAC policies, enforced pod security standards, and live detection rules that fire when something suspicious happens.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<ul>
<li><p><code>kubectl</code> installed and configured</p>
</li>
<li><p>Docker Desktop or a Linux machine (to run kind)</p>
</li>
<li><p>Basic Kubernetes familiarity – you know what a Pod, Deployment, and Namespace are</p>
</li>
<li><p>No prior security experience needed</p>
</li>
</ul>
<p>All demos run on a local kind cluster. Full YAML and setup scripts are in the <a href="https://github.com/Caesarsage/DevOps-Cloud-Projects/tree/main/intermediate/security">companion GitHub repository</a>.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a href="#heading-the-kubernetes-threat-landscape">The Kubernetes Threat Landscape</a></p>
</li>
<li><p><a href="#heading-what-youll-build">What You'll Build</a></p>
</li>
<li><p><a href="#heading-demo-1-run-a-cluster-security-baseline-with-kube-bench">Demo 1: Run a Cluster Security Baseline with kube-bench</a></p>
</li>
<li><p><a href="#heading-how-to-configure-rbac">How to Configure RBAC</a></p>
<ul>
<li><p><a href="#heading-the-four-rbac-objects">The Four RBAC Objects</a></p>
</li>
<li><p><a href="#heading-how-to-discover-resources-verbs-and-api-groups">How to Discover Resources, Verbs, and API Groups</a></p>
</li>
<li><p><a href="#heading-roles-and-clusterroles">Roles and ClusterRoles</a></p>
</li>
<li><p><a href="#heading-rolebindings-and-clusterrolebindings">RoleBindings and ClusterRoleBindings</a></p>
</li>
<li><p><a href="#heading-how-to-use-service-accounts-safely">How to Use Service Accounts Safely</a></p>
</li>
<li><p><a href="#heading-how-to-audit-your-rbac-configuration">How to Audit Your RBAC Configuration</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-demo-2--build-a-least-privilege-rbac-policy-for-a-ci-pipeline">Demo 2 — Build a Least-Privilege RBAC Policy for a CI Pipeline</a></p>
</li>
<li><p><a href="#heading-demo-3--audit-rbac-with-rakkess-and-rbac-lookup">Demo 3 — Audit RBAC with rakkess and rbac-lookup</a></p>
</li>
<li><p><a href="#heading-how-to-harden-pod-runtime-security">How to Harden Pod Runtime Security</a></p>
<ul>
<li><p><a href="#heading-pod-security-admission">Pod Security Admission</a></p>
</li>
<li><p><a href="#heading-how-to-configure-securitycontext">How to Configure securityContext</a></p>
</li>
<li><p><a href="#heading-opagatekeeper-vs-kyverno">OPA/Gatekeeper vs Kyverno</a></p>
</li>
<li><p><a href="#heading-how-to-detect-runtime-threats-with-falco">How to Detect Runtime Threats with Falco</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-demo-4--harden-a-pod-with-securitycontext">Demo 4 — Harden a Pod with securityContext</a></p>
</li>
<li><p><a href="#heading-demo-5--deploy-falco-and-write-a-custom-detection-rule">Demo 5 — Deploy Falco and Write a Custom Detection Rule</a></p>
</li>
<li><p><a href="#heading-cleanup">Cleanup</a></p>
</li>
<li><p><a href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-the-kubernetes-threat-landscape">The Kubernetes Threat Landscape</h2>
<p>To understand what you're defending against, you need to understand where Kubernetes exposes attack surface. There are six main areas, and most production incidents trace back to at least one of them.</p>
<p>The <strong>API server</strong> is the front door to your cluster. Every <code>kubectl</code> command, every CI deploy, and every controller reconciliation loop sends requests here. Unauthenticated or over-privileged access to the API server is effectively game over: an attacker who can talk to it can create pods, read secrets, and modify workloads freely.</p>
<p><strong>etcd</strong> is the key-value store where all cluster state lives, including your Secrets. Kubernetes Secrets are base64-encoded by default, not encrypted. Anyone with direct access to etcd can read every password, token, and certificate in the cluster without going through the API server at all.</p>
<p>The <strong>kubelet</strong> runs on each node and manages the pods assigned to it. If its API is reachable without authentication – which is the default on older clusters – an attacker can exec into any pod on that node and read its memory without ever touching the API server.</p>
<p>The <strong>container runtime</strong> is the layer that actually runs your containers. A container that escapes its isolation boundary lands directly in the host OS. A privileged container with <code>hostPID: true</code> can read the memory of every other process on the node, including other containers.</p>
<p>Your <strong>supply chain</strong> (base images, third-party dependencies, Helm charts, operators) is a potential entry point at every step. The XZ Utils backdoor discovered in 2024 showed how close a well-positioned supply chain attack can come to widespread infrastructure compromise.</p>
<p>Finally, the <strong>network</strong>: by default, every pod in a Kubernetes cluster can reach every other pod on any port. There are no internal firewalls between workloads unless you explicitly create them with NetworkPolicy.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5f2a6b76d7d55f162b5da2ee/2e49d975-4f69-4d14-9646-76c6ec377115.png" alt="Kubernetes threat landscape" style="display:block;margin:0 auto" width="4079" height="980" loading="lazy">

<h3 id="heading-real-world-breaches">Real-World Breaches</h3>
<p>These three incidents are worth understanding before you write a single line of YAML. They're not theoretical – they're documented post-mortems from real production clusters.</p>
<table>
<thead>
<tr>
<th>Incident</th>
<th>Year</th>
<th>Root cause</th>
<th>What was missing</th>
</tr>
</thead>
<tbody><tr>
<td><strong>Tesla cryptomining</strong></td>
<td>2018</td>
<td>Kubernetes dashboard exposed with no authentication; unrestricted egress</td>
<td>RBAC on the dashboard endpoint + default-deny NetworkPolicy</td>
</tr>
<tr>
<td><strong>Capital One data breach</strong></td>
<td>2019</td>
<td>SSRF vulnerability in a WAF let an attacker reach the EC2 metadata API, which returned credentials for an over-privileged IAM role</td>
<td>Pod-level IAM restrictions (IRSA) + blocking metadata API egress</td>
</tr>
<tr>
<td><strong>Shopify bug bounty (Kubernetes)</strong></td>
<td>2021</td>
<td>A researcher accessed internal Kubernetes metadata through a misconfigured internal service, exposing pod environment variables containing secrets</td>
<td>Secret management outside environment variables + network segmentation</td>
</tr>
</tbody></table>
<p>The pattern across all three: not zero-day exploits, but misconfigured defaults and missing controls that should have been standard practice.</p>
<p>This article addresses the RBAC and pod security gaps directly.</p>
<h2 id="heading-what-youll-build">What You'll Build</h2>
<p>Before the first command, here is the security posture you'll have by the end of this article:</p>
<p>You'll start by running kube-bench to get a CIS Benchmark baseline – a concrete score showing where a default cluster stands before any hardening. From there you'll build a least-privilege RBAC policy for a CI pipeline service account and verify its permission boundaries, then audit the full cluster to confirm no over-privileged accounts exist.</p>
<p>On the pod security side, you'll enforce the <code>restricted</code> Pod Security Admission profile on your workload namespace and apply a hardened <code>securityContext</code> to a deployment: non-root user, read-only root filesystem, dropped capabilities, and seccomp profile. To close out, you'll deploy Falco in eBPF mode with a custom detection rule that fires when suspicious tools are run inside a container.</p>
<p>Start to finish, with a kind cluster already running, the demos take about 45–60 minutes.</p>
<h2 id="heading-demo-1-run-a-cluster-security-baseline-with-kube-bench">Demo 1: Run a Cluster Security Baseline with kube-bench</h2>
<p>Before hardening anything, it's a good idea to measure where you are. <a href="https://github.com/aquasecurity/kube-bench">kube-bench</a> runs the CIS Kubernetes Benchmark against your cluster and reports which checks pass and which fail. A baseline run gives you a concrete picture of your cluster's default security posture – and a reference point you can re-run after applying any hardening changes.</p>
<h3 id="heading-step-1-create-a-kind-cluster">Step 1: Create a kind cluster</h3>
<p>Save the following as <code>kind-config.yaml</code>:</p>
<pre><code class="language-yaml"># kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
</code></pre>
<pre><code class="language-bash">kind create cluster --name k8s-security --config kind-config.yaml
</code></pre>
<p>Expected output:</p>
<pre><code class="language-plaintext">Creating cluster "k8s-security" ...
 ✓ Ensuring node image (kindest/node:v1.29.0) 🖼
 ✓ Preparing nodes 📦 📦 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
 ✓ Joining worker nodes 🚜
Set kubectl context to "kind-k8s-security"
</code></pre>
<h3 id="heading-step-2-run-kube-bench">Step 2: Run kube-bench</h3>
<p>kube-bench runs as a Job inside the cluster, mounting the host filesystem to inspect Kubernetes configuration files and processes:</p>
<pre><code class="language-bash">kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
kubectl wait --for=condition=complete job/kube-bench --timeout=120s
kubectl logs job/kube-bench
</code></pre>
<p>The output is long. Scroll to the summary at the bottom:</p>
<pre><code class="language-plaintext">== Summary master ==
0 checks PASS
11 checks FAIL
 9 checks WARN
 0 checks INFO

== Summary node ==
17 checks PASS
 2 checks FAIL
40 checks WARN
 0 checks INFO
</code></pre>
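<p>Because the full log is long, it can help to pull out only the failure counts. The sketch below is one way to do that with <code>awk</code>, assuming the summary lines keep the format shown above. The function name <code>fails_by_section</code> is hypothetical:</p>
<pre><code class="language-bash"># Hypothetical helper: print the FAIL count for each kube-bench summary
# section read on standard input.
fails_by_section() {
  awk '
    /^== Summary/  { section = $3 }
    /checks FAIL/  { print section ": " $1 }
  '
}

# Sample summary lines copied from the output above.
printf '%s\n' \
  "== Summary master ==" \
  "0 checks PASS" \
  "11 checks FAIL" \
  " 9 checks WARN" \
  "== Summary node ==" \
  "17 checks PASS" \
  " 2 checks FAIL" \
  "40 checks WARN" | fails_by_section
</code></pre>
<p>For the sample lines this prints <code>master: 11</code> and <code>node: 2</code>. Against a live cluster you would pipe <code>kubectl logs job/kube-bench</code> into the function before deleting the Job.</p>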
<p>A fresh kind cluster typically fails around 13 checks, as in the run above (11 on the control plane and 2 on the nodes). Three of the most important failures explain why defaults are a problem:</p>
<table>
<thead>
<tr>
<th>Check ID</th>
<th>Description</th>
<th>Why it matters</th>
</tr>
</thead>
<tbody><tr>
<td><strong>1.2.1</strong></td>
<td><code>--anonymous-auth</code> is not set to false on the API server</td>
<td>Anonymous requests can reach the API server without authentication – exactly how the Tesla dashboard was accessed</td>
</tr>
<tr>
<td><strong>1.2.6</strong></td>
<td><code>--kubelet-certificate-authority</code> is not set</td>
<td>The API server cannot verify kubelet identity, enabling man-in-the-middle attacks between the control plane and nodes</td>
</tr>
<tr>
<td><strong>4.2.6</strong></td>
<td><code>--protect-kernel-defaults</code> is not set on the kubelet</td>
<td>Kernel parameters can be modified from within a container, which is one step toward a container escape</td>
</tr>
</tbody></table>
<p><strong>Note:</strong> Some kube-bench findings are expected on kind because kind is a development tool, not a production-hardened environment. The important thing is to understand what each finding means and whether it applies to your target production setup.</p>
<p>Delete the Job when you're done:</p>
<pre><code class="language-bash">kubectl delete job kube-bench
</code></pre>
<p>Now that you have a baseline, you know what you're starting from. The next step is to work through the most impactful control on that list: access control. RBAC governs every interaction with the Kubernetes API, and getting it right is the foundation everything else builds on.</p>
<h2 id="heading-how-to-configure-rbac">How to Configure RBAC</h2>
<p>Role-Based Access Control is the authorisation layer in Kubernetes. Every request that reaches the API server – from <code>kubectl</code>, from a pod, from a controller – is checked against RBAC rules after authentication succeeds. If there is no rule that explicitly allows the action, Kubernetes denies it.</p>
<p>The key word is "explicitly". RBAC in Kubernetes is additive only. There is no <code>deny</code> rule. You grant access by creating rules, and you remove access by deleting them. This makes the mental model clean: if a subject can do something, you gave it permission to do that thing.</p>
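<p>The additive model can be illustrated with a few lines of bash. This is a toy model rather than the real Kubernetes authorizer: rules and requests are hypothetical <code>verb:resource</code> strings, and <code>is_allowed</code> is an illustrative name. The point it makes is that a request succeeds only when some grant matches, and nothing in the rule set can express a denial:</p>
<pre><code class="language-bash"># Toy model of additive RBAC, not the real Kubernetes authorizer.
# A request is allowed only if one of the granted rules matches it.
is_allowed() {
  local request="$1" rule
  shift
  for rule in "$@"; do
    if [ "$rule" = "$request" ]; then
      return 0   # an explicit grant exists
    fi
  done
  return 1       # no rule matched, so the request is denied
}

# Hypothetical grants; left unquoted below so each rule is one argument.
granted="get:pods list:pods"
for request in get:pods delete:pods; do
  if is_allowed "$request" $granted; then
    echo "$request: allowed"
  else
    echo "$request: denied"
  fi
done
</code></pre>
<p>The loop reports <code>get:pods</code> as allowed and <code>delete:pods</code> as denied. On a real cluster, <code>kubectl auth can-i delete pods</code> answers the same question for your current identity.</p>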
<h3 id="heading-a-brief-case-study-the-shopify-kubernetes-misconfiguration">A Brief Case Study: The Shopify Kubernetes Misconfiguration</h3>
<p>In 2021, security researcher Silas Cutler discovered that a Shopify internal service exposed Kubernetes metadata through an SSRF vulnerability. The metadata included pod environment variables that contained secrets. The root cause was partly RBAC: the service's service account had broader cluster access than it needed, and there was no least-privilege review process.</p>
<p>Shopify paid a $25,000 bug bounty and fixed the issue. The lesson is straightforward: a service account should only have the permissions it needs to do its specific job. Nothing more.</p>
<p>This is the principle you'll apply in Demo 2.</p>
<h3 id="heading-the-four-rbac-objects">The Four RBAC Objects</h3>
<p>RBAC in Kubernetes is built from four API objects. Two define permissions, two bind those permissions to subjects:</p>
<table>
<thead>
<tr>
<th>Object</th>
<th>Scope</th>
<th>What it does</th>
</tr>
</thead>
<tbody><tr>
<td><code>Role</code></td>
<td>Namespace</td>
<td>Defines a set of permissions within one namespace</td>
</tr>
<tr>
<td><code>ClusterRole</code></td>
<td>Cluster-wide</td>
<td>Defines permissions across all namespaces, or for cluster-scoped resources like Nodes</td>
</tr>
<tr>
<td><code>RoleBinding</code></td>
<td>Namespace</td>
<td>Grants the permissions of a Role or ClusterRole to a subject, within one namespace</td>
</tr>
<tr>
<td><code>ClusterRoleBinding</code></td>
<td>Cluster-wide</td>
<td>Grants the permissions of a ClusterRole to a subject across the entire cluster</td>
</tr>
</tbody></table>
<p>A <strong>subject</strong> is a user, a group, or a service account. Users and groups come from your authentication layer – client certificates, OIDC tokens, or cloud provider identity. Service accounts are Kubernetes-native identities created for pods.</p>
<h3 id="heading-how-to-discover-resources-verbs-and-api-groups">How to Discover Resources, Verbs, and API Groups</h3>
<p>Before you can write a <code>Role</code>, you need to know three things: the resource name, the API group it belongs to, and the verbs it supports. You shouldn't have to guess any of them – <code>kubectl</code> can tell you everything.</p>
<h4 id="heading-list-all-available-resources-and-their-api-groups">List all available resources and their API groups</h4>
<pre><code class="language-bash">kubectl api-resources
</code></pre>
<p>Partial output:</p>
<pre><code class="language-plaintext">NAME                    SHORTNAMES  APIVERSION                     NAMESPACED  KIND
bindings                            v1                             true        Binding
configmaps              cm          v1                             true        ConfigMap
endpoints               ep          v1                             true        Endpoints
events                  ev          v1                             true        Event
namespaces              ns          v1                             false       Namespace
nodes                   no          v1                             false       Node
pods                    po          v1                             true        Pod
secrets                             v1                             true        Secret
serviceaccounts         sa          v1                             true        ServiceAccount
services                svc         v1                             true        Service
deployments             deploy      apps/v1                        true        Deployment
replicasets             rs          apps/v1                        true        ReplicaSet
statefulsets            sts         apps/v1                        true        StatefulSet
cronjobs                cj          batch/v1                       true        CronJob
jobs                                batch/v1                       true        Job
ingresses               ing         networking.k8s.io/v1           true        Ingress
networkpolicies         netpol      networking.k8s.io/v1           true        NetworkPolicy
clusterroles                        rbac.authorization.k8s.io/v1   false       ClusterRole
roles                               rbac.authorization.k8s.io/v1   true        Role
</code></pre>
<p>The <code>APIVERSION</code> column tells you what to put in <code>apiGroups</code>. Strip the version suffix and use only the group part:</p>
<table>
<thead>
<tr>
<th>APIVERSION in output</th>
<th>apiGroups value in Role</th>
</tr>
</thead>
<tbody><tr>
<td><code>v1</code></td>
<td><code>""</code> (empty string – the core group)</td>
</tr>
<tr>
<td><code>apps/v1</code></td>
<td><code>"apps"</code></td>
</tr>
<tr>
<td><code>batch/v1</code></td>
<td><code>"batch"</code></td>
</tr>
<tr>
<td><code>networking.k8s.io/v1</code></td>
<td><code>"networking.k8s.io"</code></td>
</tr>
<tr>
<td><code>rbac.authorization.k8s.io/v1</code></td>
<td><code>"rbac.authorization.k8s.io"</code></td>
</tr>
</tbody></table>
<p>The <code>NAMESPACED</code> column tells you whether to use a <code>Role</code> (namespaced resources) or a <code>ClusterRole</code> (non-namespaced resources like <code>nodes</code>).</p>
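<p>As a rough sketch of that mapping, the Role below combines rules from three different groups. The Role name <code>batch-and-ingress-reader</code> and the <code>staging</code> namespace are assumptions for illustration; the resources and groups come from the <code>kubectl api-resources</code> output above:</p>
<pre><code class="language-yaml"># role-multi-group.yaml (illustrative file name)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: batch-and-ingress-reader       # hypothetical name
  namespace: staging                   # assumed namespace
rules:
  - apiGroups: [""]                    # APIVERSION v1: the core group
    resources: ["services"]
    verbs: ["get", "list"]
  - apiGroups: ["batch"]               # APIVERSION batch/v1
    resources: ["jobs", "cronjobs"]
    verbs: ["get", "list"]
  - apiGroups: ["networking.k8s.io"]   # APIVERSION networking.k8s.io/v1
    resources: ["ingresses"]
    verbs: ["get", "list"]
</code></pre>
<p>Each rule pairs one group with the resources that live in it. All of these resources are namespaced, so a <code>Role</code> is enough.</p>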
<h4 id="heading-filter-by-api-group">Filter by API group</h4>
<p>If you want to see only resources in a specific group, for example, everything in <code>apps</code>:</p>
<pre><code class="language-bash">kubectl api-resources --api-group=apps
</code></pre>
<pre><code class="language-plaintext">NAME                  SHORTNAMES  APIVERSION  NAMESPACED  KIND
controllerrevisions               apps/v1     true        ControllerRevision
daemonsets            ds          apps/v1     true        DaemonSet
deployments           deploy      apps/v1     true        Deployment
replicasets           rs          apps/v1     true        ReplicaSet
statefulsets          sts         apps/v1     true        StatefulSet
</code></pre>
<h4 id="heading-list-all-verbs-for-a-specific-resource">List all verbs for a specific resource</h4>
<p>Each resource supports a different set of verbs. To see exactly which verbs a resource supports, use <code>kubectl api-resources</code> with <code>-o wide</code> and look at the <code>VERBS</code> column:</p>
<pre><code class="language-bash">kubectl api-resources -o wide | grep -E "^NAME|^pods "
</code></pre>
<pre><code class="language-plaintext">NAME  SHORTNAMES  APIVERSION  NAMESPACED  KIND  VERBS
pods  po          v1          true        Pod   create,delete,deletecollection,get,list,patch,update,watch
</code></pre>
<p><code>kubectl explain</code> doesn't list verbs, but it is a quick way to confirm the resource's kind, version, and fields:</p>
<pre><code class="language-bash">kubectl explain pod --api-version=v1 | head -10
</code></pre>
<p>The verbs you will use most often in RBAC rules are listed below. Operations such as <code>exec</code>, <code>port-forward</code>, and <code>logs</code> are not verbs of their own: they are subresources of <code>pods</code>, and you grant them with a standard verb on that subresource:</p>
<table>
<thead>
<tr>
<th>Verb (and subresource)</th>
<th>What it allows</th>
</tr>
</thead>
<tbody><tr>
<td><code>get</code></td>
<td>Read a single named resource: <code>kubectl get pod my-pod</code></td>
</tr>
<tr>
<td><code>list</code></td>
<td>Read all resources of a type: <code>kubectl get pods</code></td>
</tr>
<tr>
<td><code>watch</code></td>
<td>Stream changes to resources: used by controllers and informers</td>
</tr>
<tr>
<td><code>create</code></td>
<td>Create a new resource</td>
</tr>
<tr>
<td><code>update</code></td>
<td>Replace an existing resource (<code>kubectl replace</code>)</td>
</tr>
<tr>
<td><code>patch</code></td>
<td>Partially modify a resource (<code>kubectl patch</code>, and <code>kubectl apply</code> on an existing object)</td>
</tr>
<tr>
<td><code>delete</code></td>
<td>Delete a single resource</td>
</tr>
<tr>
<td><code>deletecollection</code></td>
<td>Delete all resources of a type in a namespace</td>
</tr>
<tr>
<td><code>create</code> on <code>pods/exec</code></td>
<td>Run a command inside a pod (<code>kubectl exec</code>). Some client and server versions authorize this as <code>get</code>, so verify with <code>kubectl auth can-i</code></td>
</tr>
<tr>
<td><code>create</code> on <code>pods/portforward</code></td>
<td>Forward a port from a pod (<code>kubectl port-forward</code>)</td>
</tr>
<tr>
<td><code>get</code> on <code>pods/proxy</code></td>
<td>Proxy HTTP requests to a pod through the API server (the verb follows the HTTP method)</td>
</tr>
<tr>
<td><code>get</code> on <code>pods/log</code></td>
<td>Read pod logs (<code>kubectl logs</code>)</td>
</tr>
</tbody></table>
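<p>In RBAC rules, pod logs are served by the <code>pods/log</code> subresource, so access to logs is granted separately from access to the pods themselves. A minimal sketch, assuming a hypothetical Role named <code>pod-log-reader</code> in a <code>staging</code> namespace:</p>
<pre><code class="language-yaml"># role-pod-log-reader.yaml (illustrative file name)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-log-reader       # hypothetical name
  namespace: staging         # assumed namespace
rules:
  - apiGroups: [""]
    resources: ["pods"]      # needed to find the pods
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["pods/log"]  # subresource used by kubectl logs
    verbs: ["get"]
</code></pre>
<p>A subject bound to this Role can list pods and read their logs, but cannot open a shell in them, because nothing is granted on <code>pods/exec</code>.</p>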
<p><strong>Important:</strong> <code>get</code> and <code>list</code> are separate verbs. Granting <code>list</code> on <code>secrets</code> lets a subject enumerate every secret name and value in a namespace, even if you didn't also grant <code>get</code>. Always think about both when working with sensitive resources like <code>secrets</code>, <code>serviceaccounts</code>, and <code>configmaps</code>.</p>
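<p>One way to limit exposure is to grant <code>get</code> on a single named secret and leave <code>list</code> out entirely. A minimal sketch using the <code>resourceNames</code> field; the Role name and the secret name <code>app-db-credentials</code> are hypothetical:</p>
<pre><code class="language-yaml"># role-single-secret.yaml (illustrative file name)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-secret-reader                   # hypothetical name
  namespace: staging                        # assumed namespace
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["app-db-credentials"]   # hypothetical secret; only this object is readable
    verbs: ["get"]                          # no list, so other secrets cannot be enumerated
</code></pre>
<p>With this Role, <code>kubectl get secret app-db-credentials</code> succeeds for the bound subject, while <code>kubectl get secrets</code> is denied.</p>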
<h4 id="heading-look-up-a-resources-group-with-kubectl-explain">Look up a resource's group with kubectl explain</h4>
<p>If you already know the resource name but aren't sure of its group, <code>kubectl explain</code> tells you:</p>
<pre><code class="language-bash">kubectl explain deployment
</code></pre>
<pre><code class="language-plaintext">GROUP:      apps
KIND:       Deployment
VERSION:    v1
...
</code></pre>
<pre><code class="language-bash">kubectl explain ingress
</code></pre>
<pre><code class="language-plaintext">GROUP:      networking.k8s.io
KIND:       Ingress
VERSION:    v1
...
</code></pre>
<p>This is the fastest way to look up the <code>apiGroups</code> value for any resource when writing a Role.</p>
<h4 id="heading-a-complete-lookup-workflow">A complete lookup workflow</h4>
<p>Here is the practical workflow when writing a new Role from scratch:</p>
<pre><code class="language-bash"># 1. Find the resource name and API group
kubectl api-resources | grep deployment

# Output:
# deployments   deploy   apps/v1   true   Deployment

# 2. Find the verbs it supports
kubectl api-resources -o wide | grep deployment

# Output:
# deployments   deploy   apps/v1   true   Deployment   create,delete,...,get,list,patch,update,watch

# 3. Write the Role using the group (strip the version) and the verbs you need
</code></pre>
<pre><code class="language-yaml">apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployment-reader
  namespace: staging
rules:
  - apiGroups: ["apps"]       # from: apps/v1 → strip /v1
    resources: ["deployments"]
    verbs: ["get", "list", "watch"]
</code></pre>
<p>With this workflow, you never have to guess an API group or verb. You look it up, then write the minimal rule you need.</p>
<h3 id="heading-roles-and-clusterroles">Roles and ClusterRoles</h3>
<p>A <code>Role</code> defines which verbs are allowed on which resources. Here is a Role that grants read-only access to Pods and ConfigMaps inside the <code>staging</code> namespace:</p>
<pre><code class="language-yaml"># role-ci-reader.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ci-reader
  namespace: staging
rules:
  - apiGroups: [""]          # "" = the core API group (Pods, Services, Secrets, ConfigMaps)
    resources: ["pods", "configmaps"]
    verbs: ["get", "list", "watch"]
</code></pre>
<p>The <code>apiGroups</code> field tells Kubernetes which API group owns the resource. The core group uses an empty string <code>""</code>. Apps-level resources like Deployments use <code>"apps"</code>. Other built-in groups have longer names, such as <code>"networking.k8s.io"</code>, and custom resources use the group declared in their CRD, such as <code>"cert-manager.io"</code>.</p>
<p>A <code>ClusterRole</code> is structurally identical but omits the namespace and can reference cluster-scoped resources like Nodes and PersistentVolumes:</p>
<pre><code class="language-yaml"># clusterrole-node-reader.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader    # no namespace field
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
</code></pre>
<h4 id="heading-when-to-use-which">When to use which:</h4>
<p>Use a <code>Role</code> when the permission is specific to one namespace. A compromised service account can only affect that namespace: the blast radius is contained. Use a <code>ClusterRole</code> when you need access to cluster-scoped resources, or when you want a reusable permission template that multiple namespaces can share.</p>
<p>A common mistake is reaching for a <code>ClusterRole</code> "just to be safe" because it's easier to configure. Namespace-scoped <code>Roles</code> are almost always the right default.</p>
<h3 id="heading-rolebindings-and-clusterrolebindings">RoleBindings and ClusterRoleBindings</h3>
<p>A Role by itself does nothing. You need a binding to attach it to a subject. Here is a <code>RoleBinding</code> that grants the <code>ci-reader</code> Role to the <code>ci-pipeline</code> service account:</p>
<pre><code class="language-yaml"># rolebinding-ci.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-reader-binding
  namespace: staging
subjects:
  - kind: ServiceAccount
    name: ci-pipeline       # the service account name
    namespace: staging      # the namespace the SA lives in
roleRef:
  kind: Role
  name: ci-reader           # must match the Role name exactly
  apiGroup: rbac.authorization.k8s.io
</code></pre>
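<p>The <code>subjects</code> list is not limited to service accounts. As a sketch, the fragment below shows all three subject kinds side by side. The user and group names are hypothetical and would come from your authentication layer:</p>
<pre><code class="language-yaml"># subjects fragment of a RoleBinding or ClusterRoleBinding
subjects:
  - kind: User
    name: jane@example.com                 # hypothetical user name
    apiGroup: rbac.authorization.k8s.io
  - kind: Group
    name: platform-team                    # hypothetical group name
    apiGroup: rbac.authorization.k8s.io
  - kind: ServiceAccount
    name: ci-pipeline
    namespace: staging                     # service accounts are namespaced
</code></pre>
<p>Binding to a group is usually easier to maintain than binding to individual users, because membership is then managed in the identity provider rather than in the cluster.</p>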
<p>There is a useful pattern worth knowing: you can bind a <code>ClusterRole</code> using a <code>RoleBinding</code>. This creates namespace-scoped access using a reusable permission template. The <code>ClusterRole</code> defines the rules, while the <code>RoleBinding</code> constrains those rules to a single namespace.</p>
<pre><code class="language-yaml"># RoleBinding referencing a ClusterRole — scoped to one namespace only
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: view-binding
  namespace: staging
subjects:
  - kind: ServiceAccount
    name: ci-pipeline
    namespace: staging
roleRef:
  kind: ClusterRole          # ClusterRole, but bound to one namespace via RoleBinding
  name: view                 # Kubernetes built-in ClusterRole: read-only access to most resources
  apiGroup: rbac.authorization.k8s.io
</code></pre>
<p>Kubernetes ships with several useful built-in ClusterRoles: <code>view</code> (read-only access to most resources), <code>edit</code> (read/write to most resources), <code>admin</code> (full namespace admin), and <code>cluster-admin</code> (full cluster admin). Use them rather than reinventing them.</p>
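<p>For comparison, a <code>ClusterRoleBinding</code> grants a <code>ClusterRole</code> across the whole cluster. A minimal sketch, assuming a hypothetical <code>node-exporter</code> service account in a <code>monitoring</code> namespace and the <code>node-reader</code> ClusterRole defined earlier:</p>
<pre><code class="language-yaml"># clusterrolebinding-node-reader.yaml (illustrative file name)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: node-reader-binding    # no namespace: the binding is cluster-scoped
subjects:
  - kind: ServiceAccount
    name: node-exporter        # hypothetical service account
    namespace: monitoring      # hypothetical namespace
roleRef:
  kind: ClusterRole
  name: node-reader
  apiGroup: rbac.authorization.k8s.io
</code></pre>
<p>Reserve this form for permissions that are cluster-scoped by nature, such as reading Nodes. For anything namespaced, a <code>RoleBinding</code> keeps the blast radius smaller.</p>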
<h3 id="heading-how-to-use-service-accounts-safely">How to Use Service Accounts Safely</h3>
<p>Every pod in Kubernetes runs as a service account. If you don't specify one, Kubernetes uses the <code>default</code> service account in that namespace.</p>
<p>The default service account starts with no permissions – but it still has a token automatically mounted into every pod at <code>/var/run/secrets/kubernetes.io/serviceaccount/token</code>. This means every container in your cluster can authenticate to the API server by default, even if it has nothing useful to do there.</p>
<p>The single most impactful change you can make is to disable this automatic token mounting on service accounts that don't need API access:</p>
<pre><code class="language-yaml"># serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
automountServiceAccountToken: false   # no token mounted into pods by default
</code></pre>
<p>You can also control it at the pod level:</p>
<pre><code class="language-yaml">spec:
  automountServiceAccountToken: false   # override at pod level
  serviceAccountName: my-app
  containers:
    - name: app
      image: my-app:1.0
</code></pre>
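<p>The pod-level field also works in the other direction. If the service account disables automounting but one workload genuinely needs API access, that pod can opt back in. A sketch, assuming a hypothetical pod named <code>api-client</code> that uses the <code>my-app</code> service account from above:</p>
<pre><code class="language-yaml"># pod-api-client.yaml (illustrative file name)
apiVersion: v1
kind: Pod
metadata:
  name: api-client                      # hypothetical pod
  namespace: production
spec:
  serviceAccountName: my-app
  automountServiceAccountToken: true    # pod-level value takes precedence over the ServiceAccount setting
  containers:
    - name: app
      image: my-app:1.0
</code></pre>
<p>This keeps "no token" as the default for the service account and makes every exception visible in the pod spec.</p>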
<h4 id="heading-the-cluster-admin-anti-pattern">The cluster-admin anti-pattern:</h4>
<p>Never bind <code>cluster-admin</code> to a service account that runs in a pod. <code>cluster-admin</code> grants full read/write access to every resource in the cluster. An attacker who compromises a pod running as <code>cluster-admin</code> owns your cluster completely.</p>
<p>You will see this in Helm charts and tutorials because it "makes things work". It works because it disables the entire authorisation layer. That is not a solution – it's a ticking clock.</p>
<p>The Capital One breach is a direct example of this pattern at the cloud layer: an EC2 instance role had permissions far beyond what the application needed. The SSRF vulnerability was the initial foothold. The over-privileged role was what turned a minor bug into an $80 million fine.</p>
<h3 id="heading-how-to-audit-your-rbac-configuration">How to Audit Your RBAC Configuration</h3>
<p>The <code>kubectl auth can-i</code> command lets you check permissions for any subject. Use <code>--as</code> to impersonate a service account:</p>
<pre><code class="language-bash">SA="system:serviceaccount:staging:ci-pipeline"

# These should return 'yes'
kubectl auth can-i list pods        --namespace staging --as $SA
kubectl auth can-i get  configmaps  --namespace staging --as $SA

# These should return 'no'
kubectl auth can-i delete pods      --namespace staging --as $SA
kubectl auth can-i get  secrets     --namespace staging --as $SA
kubectl auth can-i list pods        --namespace production --as $SA
</code></pre>
<p>To list every permission a subject has in a namespace:</p>
<pre><code class="language-bash">kubectl auth can-i --list \
  --namespace staging \
  --as system:serviceaccount:staging:ci-pipeline
</code></pre>
<p>For a visual matrix across the whole cluster, install <a href="https://github.com/corneliusweig/rakkess">rakkess</a>, which is distributed through krew as the <code>access-matrix</code> plugin:</p>
<pre><code class="language-bash">kubectl krew install access-matrix

# Permission matrix for all service accounts in staging
kubectl access-matrix --namespace staging
</code></pre>
<p>Example output:</p>
<pre><code class="language-plaintext">NAME          GET  LIST  WATCH  CREATE  UPDATE  PATCH  DELETE
ci-pipeline    ✓    ✓     ✓      ✗       ✗       ✗      ✗
default        ✗    ✗     ✗      ✗       ✗       ✗      ✗
monitoring     ✓    ✓     ✓      ✗       ✗       ✗      ✗
</code></pre>
<p>If you see <code>✓</code> in the CREATE, UPDATE, PATCH, or DELETE columns for a service account that should only read, that's a finding that needs remediation.</p>
<p>⚠️ <strong>The wildcard danger:</strong> The most dangerous RBAC configuration is a wildcard on all three dimensions:</p>
<pre><code class="language-yaml">apiGroups: ["*"]
resources: ["*"]
verbs: ["*"]
</code></pre>
<p>This is functionally identical to <code>cluster-admin</code>. You will find it in Helm charts for controllers installed with "convenience" permissions. Always audit third-party RBAC before installing operators into a production cluster.</p>
<h2 id="heading-demo-2-build-a-least-privilege-rbac-policy-for-a-ci-pipeline">Demo 2 – Build a Least-Privilege RBAC Policy for a CI Pipeline</h2>
<p>In this demo, you'll create a service account for a CI pipeline that can list pods and read configmaps in the <code>staging</code> namespace – and nothing else.</p>
<h3 id="heading-step-1-create-the-namespace-and-service-account">Step 1: Create the namespace and service account</h3>
<pre><code class="language-bash">kubectl create namespace staging
</code></pre>
<pre><code class="language-yaml"># ci-serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ci-pipeline
  namespace: staging
automountServiceAccountToken: false
</code></pre>
<pre><code class="language-bash">kubectl apply -f ci-serviceaccount.yaml
</code></pre>
<h3 id="heading-step-2-create-the-role">Step 2: Create the Role</h3>
<pre><code class="language-yaml"># ci-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ci-reader
  namespace: staging
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list"]
</code></pre>
<pre><code class="language-bash">kubectl apply -f ci-role.yaml
</code></pre>
<h3 id="heading-step-3-bind-the-role-to-the-service-account">Step 3: Bind the Role to the service account</h3>
<pre><code class="language-yaml"># ci-rolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-reader-binding
  namespace: staging
subjects:
  - kind: ServiceAccount
    name: ci-pipeline
    namespace: staging
roleRef:
  kind: Role
  name: ci-reader
  apiGroup: rbac.authorization.k8s.io
</code></pre>
<pre><code class="language-bash">kubectl apply -f ci-rolebinding.yaml
</code></pre>
<h3 id="heading-step-4-test-allowed-operations">Step 4: Test allowed operations</h3>
<pre><code class="language-bash">SA="system:serviceaccount:staging:ci-pipeline"

kubectl auth can-i list pods       --namespace staging     --as $SA   # yes
kubectl auth can-i get  pods       --namespace staging     --as $SA   # yes
kubectl auth can-i list configmaps --namespace staging     --as $SA   # yes
</code></pre>
<h3 id="heading-step-5-test-denied-operations">Step 5: Test denied operations</h3>
<pre><code class="language-bash">kubectl auth can-i delete pods       --namespace staging     --as $SA   # no
kubectl auth can-i get  secrets      --namespace staging     --as $SA   # no
kubectl auth can-i list pods         --namespace production  --as $SA   # no
kubectl auth can-i create deployments --namespace staging    --as $SA   # no
</code></pre>
<p>All four should return <code>no</code>. Notice the third test: even though a matching Role exists in the <code>staging</code> namespace, the service account cannot access <code>production</code>. A <code>RoleBinding</code> only grants access within its own namespace. This is by design.</p>
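<p>Cross-namespace access has to be granted deliberately, by creating a binding inside the target namespace. As an illustration only (it is not part of this demo, and applying it would change the Step 5 results), a binding like the following in <code>production</code> would give the staging service account read access there. The binding name is hypothetical:</p>
<pre><code class="language-yaml"># Illustration only: do not apply during the demo
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-view-production     # hypothetical name
  namespace: production        # the namespace where access is granted
subjects:
  - kind: ServiceAccount
    name: ci-pipeline
    namespace: staging         # the subject may live in a different namespace
roleRef:
  kind: ClusterRole
  name: view                   # built-in read-only ClusterRole
  apiGroup: rbac.authorization.k8s.io
</code></pre>
<p>When auditing, this is why you check the bindings in every namespace, not only the one where the service account lives.</p>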
<p>Writing a least-privilege policy for a service account you control is the easy part. The harder part is auditing what already exists in a cluster. That's what Demo 3 covers.</p>
<h2 id="heading-demo-3-audit-rbac-with-rakkess-and-rbac-lookup">Demo 3 – Audit RBAC with rakkess and rbac-lookup</h2>
<p>Now you'll scan the full cluster to surface any accounts with more permissions than they need.</p>
<h3 id="heading-step-1-install-the-tools">Step 1: Install the tools</h3>
<pre><code class="language-bash">kubectl krew install access-matrix
kubectl krew install rbac-lookup
</code></pre>
<h3 id="heading-step-2-run-rakkess-across-the-cluster">Step 2: Run rakkess across the cluster</h3>
<pre><code class="language-bash"># All service accounts in kube-system
kubectl access-matrix --namespace kube-system

# All ServiceAccounts cluster-wide
kubectl access-matrix
</code></pre>
<h3 id="heading-step-3-find-all-cluster-admin-bindings">Step 3: Find all cluster-admin bindings</h3>
<p>There are two ways subjects get cluster-admin access: via a <code>ClusterRoleBinding</code> (cluster-wide), or via a <code>RoleBinding</code> that references the <code>cluster-admin</code> ClusterRole (namespace-scoped, still dangerous). Check both. A natural first attempt is to ask <code>rbac-lookup</code> about the role by name:</p>
<pre><code class="language-bash"># Looks up a subject named "cluster-admin" (rbac-lookup searches subjects, not roles)
kubectl rbac-lookup cluster-admin --kind ClusterRole --output wide
</code></pre>
<p>On a fresh kind cluster this returns:</p>
<pre><code class="language-plaintext">No RBAC Bindings found
</code></pre>
<p>Don't read that as "nothing is bound to <code>cluster-admin</code>". <code>rbac-lookup</code> matches subject names (users, groups, and service accounts), and its <code>--kind</code> flag filters by subject kind, so this query finds nothing because no subject is called <code>cluster-admin</code>. The role itself is bound, as the next command shows.</p>
<p>To find who actually has cluster-level admin access, query the bindings directly:</p>
<pre><code class="language-bash"># Find all ClusterRoleBindings and the subjects they grant
kubectl get clusterrolebindings -o wide
</code></pre>
<pre><code class="language-plaintext">NAME                                                   ROLE                                                                       AGE   USERS                         GROUPS                         SERVICEACCOUNTS
cluster-admin                                          ClusterRole/cluster-admin                                                  10d   system:masters
system:kube-controller-manager                         ClusterRole/system:kube-controller-manager                                 10d
system:kube-scheduler                                  ClusterRole/system:kube-scheduler                                          10d
system:node                                            ClusterRole/system:node                                                    10d
...
</code></pre>
<p>The <code>cluster-admin</code> ClusterRoleBinding grants access to the <code>system:masters</code> group – the group your kubeconfig certificate belongs to. This is expected. Every other binding in this list is worth reviewing to understand what it grants and why.</p>
<p><strong>What to look for:</strong> Any binding where the SERVICEACCOUNTS column is populated with an application service account (not a <code>system:</code> prefixed one) is a potential over-privilege finding. Application pods should never need cluster-admin.</p>
<h3 id="heading-step-4-verify-the-ci-pipeline-service-account">Step 4: Verify the ci-pipeline service account</h3>
<pre><code class="language-bash">kubectl rbac-lookup ci-pipeline --kind ServiceAccount --output wide
</code></pre>
<p>Expected output:</p>
<pre><code class="language-plaintext">SUBJECT                               SCOPE     ROLE             SOURCE
ServiceAccount/staging:ci-pipeline    staging   Role/ci-reader   RoleBinding/ci-reader-binding
</code></pre>
<p>The <code>ROLE</code> and <code>SOURCE</code> columns use the formats <code>&lt;role-kind&gt;/&lt;role-name&gt;</code> and <code>&lt;binding-kind&gt;/&lt;binding-name&gt;</code>. This tells you:</p>
<ul>
<li><p>The service account is bound to the <code>ci-reader</code> Role</p>
</li>
<li><p>The binding is a <code>RoleBinding</code> named <code>ci-reader-binding</code></p>
</li>
<li><p>The <code>SCOPE</code> column shows <code>staging</code>, so the permissions apply to that namespace only, and the role is a namespaced <code>Role</code>, not a <code>ClusterRole</code></p>
</li>
</ul>
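<p>If the <code>SCOPE</code> column showed <code>cluster-wide</code> with a <code>ClusterRoleBinding</code> as the source, that would be a finding: the service account would have cluster-wide permissions, not namespace-scoped ones. A <code>ClusterRole</code> bound through a <code>RoleBinding</code> is still limited to the namespace shown in <code>SCOPE</code>.</p>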
<p><strong>rbac-lookup vs kubectl get:</strong> <code>rbac-lookup</code> gives you a subject-centric view: "what does this account have access to?" <code>kubectl get rolebindings,clusterrolebindings -A</code> gives you a binding-centric view: "what bindings exist in the cluster?" Use both. rbac-lookup is faster for auditing a specific service account, while the <code>kubectl get</code> approach is better for a full cluster inventory.</p>
<p>With RBAC locked down, the API server is protected. But RBAC says nothing about what a container can do once it's running. That's a separate layer entirely.</p>
<h2 id="heading-how-to-harden-pod-runtime-security">How to Harden Pod Runtime Security</h2>
<p>RBAC controls who can talk to the Kubernetes API. Pod security controls what containers can do once they're running on a node. These are different threat vectors: RBAC protects the control plane, while pod security protects the data plane.</p>
<p>A container that runs as root with no capability restrictions can, if compromised, write backdoors to the host filesystem, load kernel modules, read the memory of other processes if <code>hostPID: true</code> is set, and in some configurations escape the container entirely. Pod security closes these doors before an attacker can open them.</p>
<h3 id="heading-a-case-study-the-hildegard-malware-campaign">A Case Study: The Hildegard Malware Campaign</h3>
<p>In early 2021, Palo Alto's Unit 42 research team documented a cryptomining malware campaign called Hildegard that specifically targeted Kubernetes clusters. The attack chain was:</p>
<ol>
<li><p>Find a cluster with the kubelet API exposed without authentication</p>
</li>
<li><p>Deploy a privileged pod with <code>hostPID: true</code></p>
</li>
<li><p>Use the privileged pod to read credentials from other containers' memory</p>
</li>
<li><p>Establish persistence by writing to the host filesystem</p>
</li>
</ol>
<p>Steps 3 and 4 would have been impossible if the pods in the cluster had been running with <code>readOnlyRootFilesystem: true</code>, dropped capabilities, and no <code>hostPID</code>. The attacker had the initial foothold. Pod security would have contained the blast radius.</p>
<h3 id="heading-pod-security-admission">Pod Security Admission</h3>
<p>Pod Security Admission (PSA) is the built-in admission controller that enforces pod security standards at the namespace level. It replaced PodSecurityPolicy in Kubernetes 1.25.</p>
<p><strong>Migrating from PSP?</strong> If you're on Kubernetes &lt; 1.25, you may still be using PodSecurityPolicy, which was removed in 1.25. The migration path is: enable PSA in <code>audit</code> mode first to identify violations, fix them workload by workload, then switch to <code>enforce</code>. For policies PSA cannot express, add Kyverno alongside it.</p>
<p>PSA defines three profiles:</p>
<table>
<thead>
<tr>
<th>Profile</th>
<th>Who it's for</th>
<th>What it restricts</th>
</tr>
</thead>
<tbody><tr>
<td><code>privileged</code></td>
<td>System components (CNI plugins, monitoring agents)</td>
<td>Nothing – no restrictions</td>
</tr>
<tr>
<td><code>baseline</code></td>
<td>Most workloads</td>
<td>Blocks known privilege escalations: no <code>hostNetwork</code>, no <code>hostPID</code>, no privileged containers</td>
</tr>
<tr>
<td><code>restricted</code></td>
<td>Security-sensitive workloads</td>
<td>Everything in baseline, plus: must run as non-root, must drop capabilities, must set a seccomp profile</td>
</tr>
</tbody></table>
<p>And three enforcement modes:</p>
<table>
<thead>
<tr>
<th>Mode</th>
<th>Effect</th>
<th>When to use</th>
</tr>
</thead>
<tbody><tr>
<td><code>enforce</code></td>
<td>Rejects pods that violate the profile at admission</td>
<td>Production – once you've fixed violations</td>
</tr>
<tr>
<td><code>audit</code></td>
<td>Allows pods but records violations in the audit log</td>
<td>Migration – see what would break without breaking anything</td>
</tr>
<tr>
<td><code>warn</code></td>
<td>Allows pods but sends a warning to the client</td>
<td>Development – fast feedback in your terminal</td>
</tr>
</tbody></table>
<p>The migration path: start with <code>audit</code> and <code>warn</code> to identify violations, fix them, then switch to <code>enforce</code>. The two modes can run simultaneously.</p>
<p>Apply them as namespace labels:</p>
<pre><code class="language-yaml"># namespace-staging.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: staging
  labels:
    # Start here: audit and warn simultaneously
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
</code></pre>
<p>Once violations are resolved, add enforce:</p>
<pre><code class="language-bash">kubectl label namespace staging \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/enforce-version=latest
</code></pre>
<p>Note: <code>--overwrite</code> is deliberately left out. Without it, if <code>enforce</code> is already set to a different value the command will error, which is exactly what you want. You should see:</p>
<pre><code class="language-plaintext">namespace/staging labeled
</code></pre>
<p>If you see <code>namespace/staging not labeled</code>, it means <code>enforce=restricted</code> and <code>enforce-version=latest</code> were already set to those exact values. Confirm enforcement is active:</p>
<pre><code class="language-bash">kubectl get namespace staging --show-labels
</code></pre>
<p>Look for <code>pod-security.kubernetes.io/enforce=restricted</code> in the output. If it's there, enforcement is active.</p>
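<p>If you manage namespaces declaratively, the same end state can be written as a manifest. This sketch also pins the policy version instead of using <code>latest</code>. The value <code>v1.29</code> is only an example and should match your cluster's minor version:</p>
<pre><code class="language-yaml"># namespace-staging-enforced.yaml (illustrative file name)
apiVersion: v1
kind: Namespace
metadata:
  name: staging
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: v1.29   # example value, not a recommendation
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: v1.29
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: v1.29
</code></pre>
<p>Pinning the version means a cluster upgrade does not silently change what the profile rejects. You raise the version on purpose, after checking the <code>audit</code> and <code>warn</code> output.</p>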
<h3 id="heading-how-to-configure-securitycontext">How to Configure securityContext</h3>
<p>A <code>securityContext</code> defines the privilege and access control settings for a pod or container. These are the seven fields you should configure on every production workload:</p>
<table>
<thead>
<tr>
<th>Field</th>
<th>Set at</th>
<th>What it controls</th>
</tr>
</thead>
<tbody><tr>
<td><code>runAsNonRoot</code></td>
<td>Pod</td>
<td>Rejects containers that run as UID 0 (root)</td>
</tr>
<tr>
<td><code>runAsUser</code> / <code>runAsGroup</code></td>
<td>Pod</td>
<td>Sets a specific UID/GID – don't rely on the image default</td>
</tr>
<tr>
<td><code>fsGroup</code></td>
<td>Pod</td>
<td>All mounted volumes are owned by this GID</td>
</tr>
<tr>
<td><code>seccompProfile</code></td>
<td>Pod</td>
<td>Filters syscalls using a seccomp profile</td>
</tr>
<tr>
<td><code>allowPrivilegeEscalation</code></td>
<td>Container</td>
<td>Blocks <code>setuid</code> binaries and <code>sudo</code></td>
</tr>
<tr>
<td><code>readOnlyRootFilesystem</code></td>
<td>Container</td>
<td>Makes the container filesystem read-only</td>
</tr>
<tr>
<td><code>capabilities.drop</code></td>
<td>Container</td>
<td>Removes Linux capabilities (drop <code>ALL</code>, add back only what is needed)</td>
</tr>
</tbody></table>
<p>The annotated YAML below shows all seven in context:</p>
<pre><code class="language-yaml"># secure-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
  namespace: staging
spec:
  replicas: 2
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      securityContext:
        runAsNonRoot: true         # container must run as a non-root user
        runAsUser: 10001           # explicit UID — don't rely on the image's default
        runAsGroup: 10001          # explicit GID
        fsGroup: 10001             # volumes are owned by this group
        seccompProfile:
          type: RuntimeDefault     # use the container runtime's default seccomp profile
      automountServiceAccountToken: false
      containers:
        - name: app
          image: nginx:1.25-alpine
          securityContext:
            allowPrivilegeEscalation: false   # block setuid and sudo inside the container
            readOnlyRootFilesystem: true      # the single highest-impact setting
            capabilities:
              drop:
                - ALL                         # drop every Linux capability
              add: []                         # add back only what is explicitly needed
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: nginx-cache
              mountPath: /var/cache/nginx
            - name: nginx-run
              mountPath: /var/run
      volumes:
        # nginx needs writable directories — provide them as emptyDir volumes
        - name: tmp
          emptyDir: {}
        - name: nginx-cache
          emptyDir: {}
        - name: nginx-run
          emptyDir: {}
</code></pre>
<h4 id="heading-why-readonlyrootfilesystem-true-is-the-most-important-setting">Why <code>readOnlyRootFilesystem: true</code> is the most important setting:</h4>
<p>Most post-exploitation techniques require writing to the filesystem. Dropping a backdoor, modifying a binary, writing a cron job, or installing a keylogger all require a writable filesystem. Set <code>readOnlyRootFilesystem: true</code> and each of these techniques becomes much harder, because the only writable paths left are the volumes you mount explicitly.</p>
<p>The downside is that many applications write to directories like <code>/tmp</code> or <code>/var/cache</code>. The fix is to mount <code>emptyDir</code> volumes at those specific paths, as shown above. The rest of the filesystem stays read-only.</p>
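<p>Writable <code>emptyDir</code> volumes are still places a compromised process could write to, so it can help to cap their size. A sketch of the <code>volumes</code> section from the Deployment above with size limits added; the sizes are arbitrary examples, not tuned values:</p>
<pre><code class="language-yaml"># volumes fragment of the pod spec
volumes:
  - name: tmp
    emptyDir:
      sizeLimit: 64Mi      # example size; the kubelet evicts the pod if usage exceeds it
  - name: nginx-cache
    emptyDir:
      sizeLimit: 128Mi     # example size
  - name: nginx-run
    emptyDir:
      sizeLimit: 1Mi       # example size; only small runtime files are expected here
</code></pre>
<p>Choose limits from what the application actually writes. A limit that is too low causes evictions under normal load.</p>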
<p><strong>What each field prevents:</strong></p>
<table>
<thead>
<tr>
<th>Field</th>
<th>What it prevents</th>
</tr>
</thead>
<tbody><tr>
<td><code>runAsNonRoot: true</code></td>
<td>Blocks containers that would run as root (UID 0): the kubelet refuses to start them</td>
</tr>
<tr>
<td><code>runAsUser: 10001</code></td>
<td>Ensures a known, non-privileged UID even if the image doesn't set one</td>
</tr>
<tr>
<td><code>allowPrivilegeEscalation: false</code></td>
<td>Blocks <code>setuid</code> binaries and <code>sudo</code> – the most common privilege escalation path</td>
</tr>
<tr>
<td><code>readOnlyRootFilesystem: true</code></td>
<td>Prevents writing backdoors, modifying binaries, or creating persistence</td>
</tr>
<tr>
<td><code>capabilities: drop: ALL</code></td>
<td>Removes Linux capabilities like <code>NET_RAW</code> (raw socket access) and <code>SYS_ADMIN</code> (kernel operations)</td>
</tr>
<tr>
<td><code>seccompProfile: RuntimeDefault</code></td>
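<td>Filters syscalls through the container runtime's default seccomp profile, which blocks a few dozen rarely needed, high-risk syscalls (Docker's default profile disables around 44 of 300+)</td>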
</tr>
</tbody></table>
<h3 id="heading-opagatekeeper-vs-kyverno">OPA/Gatekeeper vs Kyverno</h3>
<p>PSA covers the fundamentals. But you'll eventually need policies that PSA cannot express: all images must come from your private registry, all pods must have resource limits, no container may use the <code>latest</code> tag. For these, you need a policy engine.</p>
<p>Two mature options exist:</p>
<table>
<thead>
<tr>
<th></th>
<th>OPA/Gatekeeper</th>
<th>Kyverno</th>
</tr>
</thead>
<tbody><tr>
<td><strong>Policy language</strong></td>
<td>Rego (a custom logic language)</td>
<td>YAML, same format as Kubernetes resources</td>
</tr>
<tr>
<td><strong>Learning curve</strong></td>
<td>Steep: Rego takes real time to learn</td>
<td>Gentle: if you write YAML, you can write policies</td>
</tr>
<tr>
<td><strong>Mutation</strong></td>
<td>Yes, via <code>Assign</code>/<code>AssignMetadata</code></td>
<td>Yes: first-class, well-documented feature</td>
</tr>
<tr>
<td><strong>Audit mode</strong></td>
<td>Yes: reports existing violations</td>
<td>Yes: policy audit mode</td>
</tr>
<tr>
<td><strong>Ecosystem</strong></td>
<td>Integrates with OPA in non-K8s contexts</td>
<td>Kubernetes-native only</td>
</tr>
<tr>
<td><strong>Best for</strong></td>
<td>Complex cross-resource logic and teams already using OPA</td>
<td>Teams who want K8s-native syntax and fast setup</td>
</tr>
</tbody></table>
<p>If you're starting fresh, Kyverno gets you to working policies faster. Here is a Kyverno policy that blocks images from outside your trusted registry:</p>
<pre><code class="language-yaml"># kyverno-registry-policy.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: validate-registries
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Images must come from registry.corp.internal/"
        pattern:
          spec:
            containers:
              - image: "registry.corp.internal/*"
</code></pre>
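<p>The <code>image</code> value in the policy is a wildcard match on the image string. As a rough illustration only, the sketch below uses Python's <code>fnmatch</code> module (not Kyverno's own matcher) and made-up image names to show which references a pattern like <code>registry.corp.internal/*</code> would accept.</p>
<pre><code class="language-python">from fnmatch import fnmatchcase

PATTERN = "registry.corp.internal/*"  # same string as in the policy above

# Hypothetical image references, for illustration only
IMAGES = [
    "registry.corp.internal/team/api:1.4.2",
    "nginx:1.25-alpine",
    "docker.io/library/nginx:latest",
    "registry.corp.internal.evil.example/api:1.0",
]


def allowed(image, pattern=PATTERN):
    """Return True if the image reference matches the wildcard pattern."""
    return fnmatchcase(image, pattern)


for image in IMAGES:
    print("ALLOW" if allowed(image) else "DENY ", image)
</code></pre>
<p>The trailing slash in the pattern matters: without it, a look-alike hostname such as the last entry would match. Also note that the policy's pattern only lists <code>spec.containers</code>, so init containers and ephemeral containers would need their own entries to be covered.</p>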
<h3 id="heading-how-to-detect-runtime-threats-with-falco">How to Detect Runtime Threats with Falco</h3>
<p>PSA and <code>securityContext</code> are preventive controls: they block known-bad configurations before pods start. Falco is a detective control. It watches what containers do while they're running and alerts when something looks wrong.</p>
<p>Falco operates at the syscall level using eBPF. It attaches to the Linux kernel and intercepts every system call made by every container on the node – file opens, network connections, process spawns, privilege escalations. It does this without modifying containers, without injecting sidecars, and with minimal overhead.</p>
<h4 id="heading-what-falco-detects-out-of-the-box">What Falco detects out of the box:</h4>
<p>Falco's default ruleset covers the most common attack patterns. It fires when a shell is opened inside a running container, whether that's a <code>kubectl exec</code> session or a reverse shell from an exploit.</p>
<p>It watches for reads on sensitive files like <code>/etc/shadow</code>, <code>/etc/kubernetes/admin.conf</code>, and <code>/root/.ssh/</code>. It catches the dropper pattern: a binary written to disk and immediately executed. It detects outbound connections to known malicious IPs, writes to <code>/proc</code> or <code>/sys</code> that suggest kernel manipulation, and package managers like <code>apt</code>, <code>yum</code>, or <code>pip</code> being run inside containers that have no business installing software.</p>
<p>Each of these is a rule in Falco's default ruleset. You can extend it with custom rules for your specific workloads – which is exactly what you'll do in Demo 5. But first, let's harden the Pod.</p>
<h2 id="heading-demo-4-harden-a-pod-with-securitycontext">Demo 4 – Harden a Pod with securityContext</h2>
<p>In this demo, you'll start with a default nginx deployment, observe the PSA violations it triggers, harden it step by step, and confirm it passes under the <code>restricted</code> profile.</p>
<h3 id="heading-step-1-apply-psa-labels-in-audit-mode">Step 1: Apply PSA labels in audit mode</h3>
<pre><code class="language-bash">kubectl label namespace staging \
  pod-security.kubernetes.io/audit=restricted \
  pod-security.kubernetes.io/warn=restricted
</code></pre>
<h3 id="heading-step-2-deploy-insecure-nginx-and-observe-the-warnings">Step 2: Deploy insecure nginx and observe the warnings</h3>
<pre><code class="language-yaml"># insecure-nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-insecure
  namespace: staging
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-insecure
  template:
    metadata:
      labels:
        app: nginx-insecure
    spec:
      containers:
        - name: nginx
          image: nginx:1.25-alpine
</code></pre>
<pre><code class="language-bash">kubectl apply -f insecure-nginx.yaml
</code></pre>
<p>Expected output (PSA warns but still creates the deployment in <code>warn</code> mode):</p>
<pre><code class="language-plaintext">Warning: would violate PodSecurity "restricted:latest":
  allowPrivilegeEscalation != false (container "nginx" must set
    securityContext.allowPrivilegeEscalation=false)
  unrestricted capabilities (container "nginx" must set
    securityContext.capabilities.drop=["ALL"])
  runAsNonRoot != true (pod or container "nginx" must set
    securityContext.runAsNonRoot=true)
  seccompProfile not set (pod or container "nginx" must set
    securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
deployment.apps/nginx-insecure created
</code></pre>
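<p>Four violations, and every one of them is a real security gap. The Deployment was still created (note the final line, <code>deployment.apps/nginx-insecure created</code>) because <code>warn</code> and <code>audit</code> modes report violations without blocking them.</p>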
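<p>The four checks in that warning are simple to express. The sketch below is an illustration in Python, not the admission controller's code: <code>restricted_violations</code> is a hypothetical helper that inspects a single container's <code>securityContext</code> (pod-level settings are ignored for brevity) and returns the same four findings.</p>
<pre><code class="language-python">def restricted_violations(security_context):
    """Return which of the four findings from the warning apply to one container."""
    sc = security_context or {}
    findings = []
    if sc.get("allowPrivilegeEscalation") is not False:
        findings.append("allowPrivilegeEscalation != false")
    if "ALL" not in sc.get("capabilities", {}).get("drop", []):
        findings.append("unrestricted capabilities")
    if sc.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot != true")
    if sc.get("seccompProfile", {}).get("type") not in ("RuntimeDefault", "Localhost"):
        findings.append("seccompProfile not set")
    return findings


# nginx-insecure sets no securityContext at all
insecure = {}

# The settings the warning asks for
hardened = {
    "runAsNonRoot": True,
    "allowPrivilegeEscalation": False,
    "capabilities": {"drop": ["ALL"]},
    "seccompProfile": {"type": "RuntimeDefault"},
}

print(restricted_violations(insecure))
print(restricted_violations(hardened))
</code></pre>
<p>The real <code>restricted</code> profile checks more than these four fields (volume types and host namespaces, for example), so treat this as a reading aid for the warning, not a substitute for PSA.</p>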
<h3 id="heading-step-3-deploy-the-hardened-version">Step 3: Deploy the hardened version</h3>
<pre><code class="language-bash">kubectl apply -f secure-deployment.yaml   # the YAML from the securityContext section above
</code></pre>
<p>No warnings this time.</p>
<h3 id="heading-step-4-switch-the-namespace-to-enforce">Step 4: Switch the namespace to enforce</h3>
<pre><code class="language-bash">kubectl label namespace staging \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/enforce-version=latest
</code></pre>
<p>Expected output:</p>
<pre><code class="language-plaintext">namespace/staging labeled
</code></pre>
<p>This is the moment enforcement becomes active. Any new pod that violates the <code>restricted</code> profile will be rejected from this point on.</p>
<h3 id="heading-step-5-confirm-insecure-deployments-are-now-rejected">Step 5: Confirm insecure deployments are now rejected</h3>
<pre><code class="language-bash">kubectl delete deployment nginx-insecure -n staging
kubectl apply -f insecure-nginx.yaml
</code></pre>
<p>Expected output:</p>
<pre><code class="language-shell">Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false ...
deployment.apps/nginx-insecure created
</code></pre>
<p>The Deployment object is created. PSA enforces at the <strong>pod</strong> level, not the Deployment level. The Deployment and its ReplicaSet exist, but every attempt to create a pod is rejected. Check the ReplicaSet:</p>
<pre><code class="language-bash">kubectl get replicaset -n staging -l app=nginx-insecure
</code></pre>
<pre><code class="language-plaintext">NAME                       DESIRED   CURRENT   READY   AGE
nginx-insecure-b668d867b   1         0         0       30s
</code></pre>
<p><code>DESIRED=1</code> but <code>CURRENT=0</code>. The ReplicaSet cannot create any pods because they're rejected at admission. Describe the ReplicaSet to see the rejection events:</p>
<pre><code class="language-bash">kubectl describe replicaset -n staging -l app=nginx-insecure
</code></pre>
<pre><code class="language-plaintext">Warning  FailedCreate  ReplicaSet "nginx-insecure-b668d867b" create Pod
  "nginx-insecure-xxx" failed: pods is forbidden: violates PodSecurity
  "restricted:latest": allowPrivilegeEscalation != false, unrestricted
  capabilities, runAsNonRoot != true, seccompProfile not set
</code></pre>
<p>The hardened deployment continues running with its pods intact. The insecure one has zero pods and never will. This is exactly how PSA is supposed to work.</p>
<h3 id="heading-step-6-score-the-hardened-pod-with-kube-score">Step 6: Score the hardened pod with kube-score</h3>
<p><a href="https://github.com/zegl/kube-score">kube-score</a> is a static analysis tool that scores Kubernetes manifests against security and reliability best practices:</p>
<pre><code class="language-bash"># macOS
brew install kube-score
# Linux: https://github.com/zegl/kube-score/releases

kube-score score secure-deployment.yaml -v
</code></pre>
<p>Expected output (abridged):</p>
<pre><code class="language-plaintext">apps/v1/Deployment secure-app in staging 
  path=secure-deployment.yaml
    [OK] Stable version
    [OK] Label values
    [CRITICAL] Container Resources
        · app -&gt; CPU limit is not set
            Resource limits are recommended to avoid resource DDOS. Set resources.limits.cpu
        · app -&gt; Memory limit is not set
            Resource limits are recommended to avoid resource DDOS. Set resources.limits.memory
        · app -&gt; CPU request is not set
            Resource requests are recommended to make sure that the application can start and run without crashing. Set resources.requests.cpu
        · app -&gt; Memory request is not set
            Resource requests are recommended to make sure that the application can start and run without crashing. Set resources.requests.memory
    [CRITICAL] Container Image Pull Policy
        · app -&gt; ImagePullPolicy is not set to Always
            It's recommended to always set the ImagePullPolicy to Always, to make sure that the imagePullSecrets are always correct, and to always get the image you want.
    [OK] Pod Probes Identical
    [CRITICAL] Container Ephemeral Storage Request and Limit
        · app -&gt; Ephemeral Storage limit is not set
            Resource limits are recommended to avoid resource DDOS. Set resources.limits.ephemeral-storage
        · app -&gt; Ephemeral Storage request is not set
            Resource requests are recommended to make sure the application can start and run without crashing. Set resource.requests.ephemeral-storage
    [OK] Environment Variable Key Duplication
    [OK] Container Security Context Privileged
    [OK] Pod Topology Spread Constraints
        · Pod Topology Spread Constraints
            No Pod Topology Spread Constraints set, kube-scheduler defaults assumed
    [OK] Container Image Tag
    [CRITICAL] Pod NetworkPolicy
        · The pod does not have a matching NetworkPolicy
            Create a NetworkPolicy that targets this pod to control who/what can communicate with this pod. Note, this feature needs to be supported by the CNI implementation used in the Kubernetes cluster to have an effect.
    [OK] Container Security Context User Group ID
    [OK] Container Security Context ReadOnlyRootFilesystem
    [CRITICAL] Deployment has PodDisruptionBudget
        · No matching PodDisruptionBudget was found
            It's recommended to define a PodDisruptionBudget to avoid unexpected downtime during Kubernetes maintenance operations, such as when draining a node.
    [WARNING] Deployment has host PodAntiAffinity
        · Deployment does not have a host podAntiAffinity set
            It's recommended to set a podAntiAffinity that stops multiple pods from a deployment from being scheduled on the same node. This increases availability in case the node becomes unavailable.
    [OK] Deployment Pod Selector labels match template metadata labels
</code></pre>
<p>Notice there are no security context violations: <code>securityContext</code>, <code>readOnlyRootFilesystem</code>, <code>seccompProfile</code>, and <code>runAsNonRoot</code> all pass. The remaining findings are about <strong>resource management</strong> (CPU/memory limits, ephemeral storage), <strong>availability</strong> (PodDisruptionBudget, anti-affinity), and <strong>network policy</strong> – not security context hardening. Those are important for production readiness, but they're a separate concern from the pod security hardening we did here.</p>
<p>You now have a pod that PSA accepts and kube-score validates. The next step is to add a detection layer – something that watches what the pod does at runtime, not just how it was configured at admission.</p>
<h2 id="heading-demo-5-deploy-falco-and-write-a-custom-detection-rule">Demo 5 – Deploy Falco and Write a Custom Detection Rule</h2>
<p>Now, you'll deploy Falco in eBPF mode, trigger a default alert, then extend Falco with a custom rule that catches <code>curl</code> and <code>wget</code> being run inside containers.</p>
<h3 id="heading-step-1-install-falco-via-helm">Step 1: Install Falco via Helm</h3>
<pre><code class="language-bash">helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update

helm install falco falcosecurity/falco \
  --namespace falco \
  --create-namespace \
  --set driver.kind=modern_ebpf \
  --set tty=true \
  --wait
</code></pre>
<p>Confirm Falco is running on every node:</p>
<pre><code class="language-shell">kubectl get pods -n falco
</code></pre>
<pre><code class="language-shell">NAME           READY   STATUS    RESTARTS   AGE
falco-x8k2p    1/1     Running   0          45s
falco-m9nqr    1/1     Running   0          45s
falco-j4tpw    1/1     Running   0          45s
</code></pre>
<p>One pod per node. Falco runs as a DaemonSet because it needs to monitor syscalls on every node independently.</p>
<h3 id="heading-step-2-trigger-a-default-alert">Step 2: Trigger a default alert</h3>
<p>Open a second terminal and stream the Falco logs:</p>
<pre><code class="language-shell"># Terminal 2 — watch for alerts
kubectl logs -n falco -l app.kubernetes.io/name=falco -f --max-log-requests 3
</code></pre>
<p>In your first terminal, exec into the secure-app pod:</p>
<pre><code class="language-bash"># Terminal 1 — trigger the shell detection
POD=$(kubectl get pod -n staging -l app=secure-app \
  -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it $POD -n staging -- sh
</code></pre>
<p>Within a second, Terminal 2 shows:</p>
<pre><code class="language-plaintext">2024-03-15T14:23:41.456Z: Notice A shell was spawned in a container with an attached terminal
  (user=root user_loginuid=-1 k8s.ns=staging k8s.pod=secure-app-7d9f8b-xxx
   container=app shell=sh parent=runc cmdline=sh terminal=34816)
  rule=Terminal shell in container  priority=NOTICE
  tags=[container, shell, mitre_execution]
</code></pre>
<p>This is Falco's built-in <code>Terminal shell in container</code> rule firing. It detected the <code>kubectl exec</code> session the moment you ran it.</p>
<h3 id="heading-step-3-write-a-custom-rule">Step 3: Write a custom rule</h3>
<p>The built-in rules are comprehensive, but every production environment has workloads with unique behaviour. Here is a custom rule that alerts when <code>curl</code> or <code>wget</code> is executed inside any container:</p>
<pre><code class="language-yaml"># custom-rules.yaml
customRules:
  custom-rules.yaml: |-
    - rule: Suspicious network tool in container
      desc: &gt;
        Detects execution of curl or wget inside a running container.
        These tools are commonly used for data exfiltration, downloading
        attacker payloads, or reaching command-and-control servers.
        Production containers should not be making ad-hoc HTTP requests.
      condition: &gt;
        spawned_process
        and container
        and proc.name in (curl, wget)
      output: &gt;
        Network tool executed in container
        (user=%user.name tool=%proc.name cmd=%proc.cmdline
         pod=%k8s.pod.name ns=%k8s.ns.name image=%container.image)
      priority: WARNING
      tags: [network, exfiltration, custom]
</code></pre>
<p>Apply it by upgrading the Helm release:</p>
<pre><code class="language-bash">helm upgrade falco falcosecurity/falco \
  --namespace falco \
  --set driver.kind=modern_ebpf \
  --set tty=true \
  -f custom-rules.yaml
</code></pre>
<p>Once the upgrade completes, wait for the Falco pods to become ready, then test your custom rule.</p>
<h3 id="heading-step-4-test-the-custom-rule">Step 4: Test the custom rule</h3>
<pre><code class="language-bash"># Terminal 1 — run curl inside the container
kubectl exec -it $POD -n staging -- sh -c 'curl https://example.com'
</code></pre>
<p>Terminal 2 immediately shows:</p>
<pre><code class="language-plaintext">2024-03-15T14:31:07.812Z: Warning Network tool executed in container
  (user=root tool=curl cmd=curl https://example.com
   pod=secure-app-7d9f8b-xxx ns=staging image=nginx:1.25-alpine)
  rule=Suspicious network tool in container  priority=WARNING
  tags=[network, exfiltration, custom]
</code></pre>
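<p>The custom rule's condition is a conjunction of three checks. As a mental model only, here is the same logic in Python. Falco evaluates conditions in its own engine against kernel events, so the event dictionaries and key names below are invented for illustration.</p>
<pre><code class="language-python">NETWORK_TOOLS = {"curl", "wget"}


def matches_rule(event):
    """Mimic: spawned_process and container and proc.name in (curl, wget)."""
    spawned_process = event.get("type") in ("execve", "execveat")  # a new program was executed
    in_container = event.get("container_id", "host") != "host"     # not a host process
    return spawned_process and in_container and event.get("proc_name") in NETWORK_TOOLS


# Hypothetical events, shaped for this sketch only
events = [
    {"type": "execve", "container_id": "3f2a9c1b7d10", "proc_name": "curl"},
    {"type": "execve", "container_id": "host", "proc_name": "curl"},
    {"type": "execve", "container_id": "3f2a9c1b7d10", "proc_name": "nginx"},
    {"type": "open", "container_id": "3f2a9c1b7d10", "proc_name": "curl"},
]

for event in events:
    print(matches_rule(event), event)
</code></pre>
<p>Only the first event satisfies all three checks. Because the match is on the process name, a renamed or copied binary would slip past it, so treat this rule as a tripwire rather than a guarantee.</p>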
<h3 id="heading-step-5-route-alerts-to-slack-with-falcosidekick">Step 5: Route alerts to Slack with Falcosidekick</h3>
<p>Streaming logs is useful during development. In production, you need alerts routed to your alerting pipeline. Falcosidekick handles this with support for Slack, PagerDuty, Datadog, Elasticsearch, and over 50 other outputs:</p>
<pre><code class="language-yaml"># falcosidekick-values.yaml
config:
  slack:
    webhookurl: "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
    minimumpriority: "warning"
    messageformat: &gt;
      [{{ .Priority }}] {{ .Rule }} |
      pod: {{ index .OutputFields "k8s.pod.name" }} |
      ns: {{ index .OutputFields "k8s.ns.name" }} |
      image: {{ index .OutputFields "container.image" }}
</code></pre>
<pre><code class="language-bash">helm install falcosidekick falcosecurity/falcosidekick \
  --namespace falco \
  -f falcosidekick-values.yaml
</code></pre>
<p><strong>Tuning Falco for production:</strong> A fresh Falco deployment will generate false positives, especially in the first week. Your job is to tune rules to match your workloads' normal behaviour, not to respond to every alert.</p>
<p>Here's the workflow: deploy in staging → identify false positives → add <code>exceptions</code> to the affected rules → validate the false positive rate is low → enable in production with alerting.</p>
<h2 id="heading-cleanup">Cleanup</h2>
<p>To remove everything created in this article:</p>
<pre><code class="language-bash"># Delete the staging namespace and everything in it
kubectl delete namespace staging
 
# Delete Falco and Falcosidekick
helm uninstall falco -n falco
helm uninstall falcosidekick -n falco
kubectl delete namespace falco
 
# Delete the kind cluster entirely
kind delete cluster --name k8s-security
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this handbook, you secured a Kubernetes cluster across three layers: RBAC, pod runtime security, and runtime threat detection.</p>
<p>You built a least-privilege service account, enforced the restricted Pod Security Admission profile, hardened pods with securityContext, deployed Falco for syscall-level detection, and wrote a custom rule to catch suspicious tools inside containers.</p>
<p>Each layer maps to a real-world breach – Tesla, Capital One, Hildegard – showing how these controls would have contained the damage. Run kube-bench again to measure the improvement.</p>
<p>All YAML manifests, Helm values, and setup scripts from this article are available in the <a href="https://github.com/Caesarsage/DevOps-Cloud-Projects/tree/main/intermediate/security">companion GitHub repository</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Use MLflow to Manage Your Machine Learning Lifecycle ]]>
                </title>
                <description>
                    <![CDATA[ Training machine learning models usually starts out being organized and ends up in absolute chaos. We’ve all been there: dozens of experiments scattered across random notebooks, and model files saved  ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-use-mlflow-to-manage-your-machine-learning-lifecycle/</link>
                <guid isPermaLink="false">69c18bfc30a9b81e3a92bbbd</guid>
                
                    <category>
                        <![CDATA[ mlops ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Machine Learning ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Temitope Oyedele ]]>
                </dc:creator>
                <pubDate>Mon, 23 Mar 2026 18:52:44 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/f829ab55-926d-43cd-b027-16c754445b09.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Training machine learning models usually starts out being organized and ends up in absolute chaos.</p>
<p>We’ve all been there: dozens of experiments scattered across random notebooks, and model files saved as <code>model_v2_final_FINAL.pkl</code> because no one is quite sure which version actually worked.</p>
<p>Once you move from a solo project to a team, or try to push something to production, that "organized chaos" quickly becomes a serious bottleneck.</p>
<p>Solving this mess requires more than just better naming conventions: it requires a way to standardize how we track and hand off our work. This is the specific gap MLflow was built to fill.</p>
<p>Originally released by the team at Databricks in 2018, it has become a standard open-source platform for managing the entire machine learning lifecycle. It acts as a central hub where your experiments, code, and models live together, rather than being tucked away in forgotten folders.</p>
<p>In this tutorial, we'll cover the core philosophy behind MLflow and how its modular architecture solves the 'dependency hell' of machine learning. We'll break down the four primary pillars of Tracking, Projects, Models, and the Model Registry, and walk through a practical implementation of each so you can move your projects from local notebooks to a production-ready lifecycle.</p>
<h3 id="heading-table-of-contents">Table of Contents:</h3>
<ul>
<li><p><a href="#heading-prerequisites">Prerequisites:</a></p>
</li>
<li><p><a href="#heading-mlflow-architecture-the-big-picture">MLflow Architecture: The Big Picture</a></p>
</li>
<li><p><a href="#heading-understanding-mlflow-tracking">Understanding MLflow Tracking</a></p>
<ul>
<li><p><a href="#heading-a-tracking-example">A Tracking Example</a></p>
</li>
<li><p><a href="#heading-where-does-the-data-actually-go">Where Does the Data Actually Go?</a></p>
</li>
<li><p><a href="#heading-why-bother-with-this-setup">Why Bother with This Setup?</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-understanding-mlflow-projects">Understanding MLflow Projects</a></p>
<ul>
<li><p><a href="#heading-the-mlproject-file">The MLproject File</a></p>
</li>
<li><p><a href="#heading-why-this-actually-matters">Why this Actually Matters</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-understanding-the-mlflow-model-registry">Understanding the MLflow Model Registry</a></p>
</li>
<li><p><a href="#heading-moving-a-model-through-the-pipeline">Moving a Model through the Pipeline</a></p>
<ul>
<li><a href="#heading-why-does-this-matter">Why Does This Matter?</a></li>
</ul>
</li>
<li><p><a href="#heading-how-the-components-fit-together">How the Components Fit Together</a></p>
</li>
<li><p><a href="#heading-wrapping-up">Wrapping Up</a></p>
</li>
</ul>
<h3 id="heading-prerequisites">Prerequisites:</h3>
<p>To get the most out of this tutorial, you should have:</p>
<ul>
<li><p><strong>Basic Python proficiency:</strong> Comfort with context managers (<code>with</code> statements) and decorators.</p>
</li>
<li><p><strong>Machine Learning fundamentals:</strong> A general understanding of training/testing splits and model evaluation metrics (like accuracy or loss).</p>
</li>
<li><p><strong>Local Environment:</strong> Python 3.8+ installed. Familiarity with <code>pip</code> or <code>conda</code> for installing packages is helpful.</p>
</li>
</ul>
<h2 id="heading-mlflow-architecture-the-big-picture">MLflow Architecture: The Big Picture</h2>
<p>To understand why MLflow is so effective, you have to look at how it's actually put together. MLflow isn't one giant or rigid tool. It’s a modular system designed around four loosely coupled components that are its core pillars.</p>
<p>This is a big deal because it means you don’t have to commit to the entire ecosystem at once. If you only need to track experiments and don't care about the other features, you can just use that part and ignore the rest.</p>
<p>To make this a bit more concrete, here is how those pieces map to things you probably already use:</p>
<ul>
<li><p><strong>MLflow Tracking:</strong> Logs experiments, metrics, and parameters. (Think: <strong>Git commits for ML runs</strong>)</p>
</li>
<li><p><strong>MLflow Projects:</strong> Packages code for reproducibility. (Think: <strong>A Docker image for ML code</strong>)</p>
</li>
<li><p><strong>MLflow Models:</strong> A standard format for multiple frameworks. (Think: <strong>A universal adapter</strong>)</p>
</li>
<li><p><strong>Model Registry:</strong> Handles versioning and governing models. (Think: <strong>A CI/CD pipeline for models</strong>)</p>
</li>
</ul>
<p>Architecturally, you can think of MLflow in two layers: the Client and the Server.</p>
<p>The Client is where you spend most of your time. It’s your training script or your Jupyter notebook where you log metrics or register a model.</p>
<p>The Server is the brain in the background that handles the storage. It consists of a Tracking Server, a Backend Store (usually a database like PostgreSQL), and an Artifact Store, such as S3 or GCS, where big files like model weights live.</p>
<p>This separation is why MLflow is so flexible. You can start with everything running locally on your laptop using just your file system. When you're ready to scale up to a larger team, you can swap that out for a centralized server and cloud storage with almost no changes to your actual code. It grows with your project instead of forcing you to start over once things get serious.</p>
<p>Now, let's look at each of these four pillars of MLflow so you understand how they work.</p>
<h2 id="heading-understanding-mlflow-tracking">Understanding MLflow Tracking</h2>
<p>For most teams, the <strong>Tracking</strong> component is the front door to MLflow. Its job is simple: it acts as a digital lab notebook that records everything happening during a training run.</p>
<p>Instead of you frantically trying to remember what your learning rate was or where you saved that accuracy plot, MLflow just sits in the background and logs it for you.</p>
<p>The core unit here is the <strong>run</strong>. Think of a run as a single execution of your training code. During that run, the architecture captures four specific types of information:</p>
<ul>
<li><p><strong>Parameters:</strong> Your inputs, like batch size or the number of trees in a forest.</p>
</li>
<li><p><strong>Metrics:</strong> Your outputs, like accuracy or loss, which can be tracked over time.</p>
</li>
<li><p><strong>Artifacts:</strong> The "heavy" stuff, such as model weights, confusion matrices, or images.</p>
</li>
<li><p><strong>Tags and Metadata:</strong> Context like which developer ran the code and which Git commit was used.</p>
</li>
</ul>
<h3 id="heading-a-tracking-example">A Tracking Example</h3>
<p>Seeing this in practice is the best way to understand how the architecture actually works. You don't need to rebuild your entire pipeline – you just wrap your training logic in a context manager.</p>
<p>Here is what a basic integration looks like in Python:</p>
<pre><code class="language-python">import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Example data so the snippet runs on its own; swap in your own dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# This block opens the run and keeps things organized
with mlflow.start_run():
    # Log parameters
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 5)

    # Train the model
    model = RandomForestClassifier(n_estimators=100, max_depth=5)
    model.fit(X_train, y_train)

    # Log metrics
    accuracy = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_metric("accuracy", accuracy)

    # Log the model as an artifact
    mlflow.sklearn.log_model(model, "random_forest_model")
</code></pre>
<img src="https://cdn.hashnode.com/uploads/covers/627d043a4903bec29b5871be/0c63f9c4-3f16-4591-be58-51a0acca5f80.png" alt="A comparison table in the MLflow UI showing three training runs side-by-side, highlighting differences in parameters and metrics." style="display:block;margin:0 auto" width="2862" height="1384" loading="lazy">

<p>The <code>mlflow.start_run()</code> context manager creates a new run and automatically closes it when the block exits. Everything logged inside that block is associated with that run: parameters and metrics are stored in the Backend Store, and the model files go to the Artifact Store.</p>
<h3 id="heading-where-does-the-data-actually-go">Where Does the Data Actually Go?</h3>
<p>When you’re just starting out on your laptop, MLflow keeps things simple by creating a local <code>./mlruns</code> directory. The real power shows up when you move to a team environment and point everyone to a centralized Tracking Server.</p>
<p>The system splits the data based on how "heavy" it is. Your structured data (parameters and metrics) is small and needs to be searchable, so it goes into a SQL database like PostgreSQL. Your unstructured data (the actual model files or large plots) is too bulky for a database. The architecture ships that off to an Artifact Store like Amazon S3 or Google Cloud Storage.</p>
<img src="https://cdn.hashnode.com/uploads/covers/627d043a4903bec29b5871be/e8aa2e4e-09a8-4767-a1f3-b07810680615.png" alt="The MLflow Artifact Store view showing the directory structure for a logged model, including the MLmodel metadata and model.pkl file." style="display:block;margin:0 auto" width="2862" height="1384" loading="lazy">
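<p>To make the split concrete, here is a toy sketch using only the Python standard library. It is not how MLflow is implemented: the table layout, the <code>log_run</code> helper, the run IDs, and the file contents are all invented for illustration. Small, searchable values go into a SQL table, while the bulky file is written to a directory that stands in for S3 or GCS.</p>
<pre><code class="language-python">import sqlite3
import tempfile
from pathlib import Path

# Stand-ins: an in-memory SQLite database for the Backend Store,
# and a temporary directory for the Artifact Store
backend = sqlite3.connect(":memory:")
backend.execute("CREATE TABLE metrics (run_id TEXT, key TEXT, value REAL)")
artifact_root = Path(tempfile.mkdtemp())


def log_run(run_id, metrics, model_bytes):
    """Store metrics as rows and the model as a file under the run's folder."""
    backend.executemany(
        "INSERT INTO metrics VALUES (?, ?, ?)",
        [(run_id, key, value) for key, value in metrics.items()],
    )
    run_dir = artifact_root / run_id
    run_dir.mkdir()
    (run_dir / "model.pkl").write_bytes(model_bytes)


log_run("run-a", {"accuracy": 0.91}, b"fake model weights")
log_run("run-b", {"accuracy": 0.94}, b"other fake weights")

# Structured data is queryable: find the best run by accuracy
best = backend.execute(
    "SELECT run_id, value FROM metrics WHERE key = 'accuracy' ORDER BY value DESC LIMIT 1"
).fetchone()
print(best)

# The matching artifact is looked up by run ID, not stored in the database
print((artifact_root / best[0] / "model.pkl").exists())
</code></pre>
<p>The database answers "which run was best?" quickly, and the run ID it returns is the key for fetching the heavy file from the other store.</p>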

<h3 id="heading-why-bother-with-this-setup">Why Bother with This Setup?</h3>
<p>Relying on "vibes" and messy naming conventions is a recipe for disaster once your project grows. It might work for a day or two, but it falls apart the moment you need to compare twenty different versions of a model.</p>
<p>By separating the tracking into its own architectural pillar, MLflow gives you a queryable history. Instead of digging through old notebooks, you can just hop into the UI, filter for the best results, and see exactly which configuration got you there. It takes the guesswork out of the "science" part of data science.</p>
<img src="https://cdn.hashnode.com/uploads/covers/627d043a4903bec29b5871be/cd83e4b7-38b7-4644-8166-e48ba00d581a.png" alt="An MLflow Parallel Coordinates plot visualizing the relationship between the number of estimators and model accuracy across multiple runs." style="display:block;margin:0 auto" width="2862" height="1384" loading="lazy">

<img src="https://cdn.hashnode.com/uploads/covers/627d043a4903bec29b5871be/6d1383f5-7ace-4b9d-a566-64a3807cdcd7.png" alt="An MLflow scatter plot illustrating the positive correlation between the n_estimators parameter and the resulting model accuracy." style="display:block;margin:0 auto" width="2862" height="1384" loading="lazy">

<h2 id="heading-understanding-mlflow-projects">Understanding MLflow Projects</h2>
<p>You can train the most accurate model in the world, but if your colleague can’t reproduce your results on their machine, that model isn't worth much.</p>
<p>This is where MLflow Projects come in. They solve the reproducibility headache by providing a standard way to package your code, your dependencies, and your entry points into one neat bundle.</p>
<p>Think of an MLflow Project as a directory (or a Git repo) with a special "instruction manual" at its root called an <code>MLproject</code> file. This file tells anyone (or any server) exactly what environment is needed and how to kick off the execution.</p>
<h3 id="heading-the-mlproject-file">The MLproject File</h3>
<p>Instead of sending someone a long README with installation steps, you just give them this file. Here is what a typical MLproject setup looks like for a training pipeline:</p>
<pre><code class="language-yaml">name: my_ml_project
conda_env: conda.yaml

entry_points:
  train:
    parameters:
      learning_rate: {type: float, default: 0.01}
      epochs: {type: int, default: 50}
      data_path: {type: str}
    command: "python train.py --lr {learning_rate} --epochs {epochs} --data {data_path}"
  
  evaluate:
    parameters:
      model_path: {type: str}
    command: "python evaluate.py --model {model_path}"
</code></pre>
<p>The <code>conda_env</code> line points to a <code>conda.yaml</code> file that lists the exact Python packages and versions your code needs. If you want even more isolation, MLflow supports Docker environments too.</p>
<p>The beauty of this setup is the simplicity. Anyone with MLflow installed can run your entire project with a single command:</p>
<pre><code class="language-bash">mlflow run . -P learning_rate=0.001 -P epochs=100 -P data_path=./data/train.csv
</code></pre>
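<p>To make the parameter handling concrete, here is a rough sketch of how declared defaults and <code>-P</code> overrides could combine into the final command. It only mirrors the idea and is not how MLflow implements it. <code>resolve_command</code>, <code>TRAIN_PARAMS</code>, and <code>TRAIN_COMMAND</code> are hypothetical names invented for this illustration.</p>

```python
import shlex

# Parameter spec mirroring the `train` entry point in the MLproject file above.
TRAIN_PARAMS = {
    "learning_rate": {"type": float, "default": 0.01},
    "epochs": {"type": int, "default": 50},
    "data_path": {"type": str},  # no default, so the caller must supply it
}
TRAIN_COMMAND = "python train.py --lr {learning_rate} --epochs {epochs} --data {data_path}"


def resolve_command(template, spec, overrides):
    """Hypothetical helper: merge defaults with overrides and fill the template."""
    values = {}
    for name, info in spec.items():
        if name in overrides:
            raw = overrides[name]
        elif "default" in info:
            raw = info["default"]
        else:
            raise ValueError(f"missing required parameter: {name}")
        # Coerce to the declared type, then quote the result for safe shell use.
        values[name] = shlex.quote(str(info["type"](raw)))
    return template.format(**values)


# Same overrides as the `mlflow run` command above.
cmd = resolve_command(
    TRAIN_COMMAND,
    TRAIN_PARAMS,
    {"learning_rate": "0.001", "epochs": "100", "data_path": "./data/train.csv"},
)
print(cmd)
```

<p>Leaving out <code>data_path</code> raises an error in this sketch because the entry point declares no default for it, which is the behavior you would want from any runner: fail early instead of training on nothing.</p>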
<h3 id="heading-why-this-actually-matters">Why This Actually Matters</h3>
<p>MLflow Projects really shine in two specific scenarios. The first is onboarding. A new team member can clone your repo and be up and running in minutes, rather than spending their entire first day debugging library version conflicts.</p>
<p>The second is CI/CD. Because these projects are triggered programmatically, they fit perfectly into automated retraining pipelines. When reproducibility is non-negotiable, having a "single source of truth" for how to run your code makes life a lot easier for everyone involved.</p>
<h2 id="heading-understanding-the-mlflow-model-registry">Understanding the MLflow Model Registry</h2>
<p>Tracking experiments tells you which model is the "winner," but the Model Registry is where you actually manage that winner’s journey from your notebook to a live production environment.</p>
<p>Think of it as the governance layer. It handles versioning, stage management, and creates a clear audit trail so you never have to guess which model is currently running in the wild.</p>
<p>The Registry uses a few simple concepts to keep things organized:</p>
<ul>
<li><p><strong>Registered Model:</strong> This is the overall name for your project, like CustomerChurnPredictor.</p>
</li>
<li><p><strong>Model Version:</strong> Every time you push a new iteration, MLflow auto-increments the version (v1, v2, and so on).</p>
</li>
<li><p><strong>Stage:</strong> These are labels like <strong>Staging</strong>, <strong>Production</strong>, or <strong>Archived</strong>. They tell your team exactly where a model stands in its lifecycle.</p>
</li>
<li><p><strong>Annotations:</strong> These are just notes and tags. They’re great for documenting why a specific version was promoted or what its quirks are.</p>
</li>
</ul>
<img src="https://cdn.hashnode.com/uploads/covers/627d043a4903bec29b5871be/bcd77d8f-a37c-4b0f-a112-9e2ad36d8cc2.png" alt="The MLflow Model Registry interface showing Version 1 of the IrisClassifier model officially transitioned to the Production stage." style="display:block;margin:0 auto" width="2862" height="1384" loading="lazy">

<h2 id="heading-moving-a-model-through-the-pipeline">Moving a Model through the Pipeline</h2>
<p>In a real-world workflow, you don't just "deploy" a file. You transition it through stages. Here's how that looks using the MLflow Client:</p>
<pre><code class="language-python">import mlflow
from mlflow.tracking import MlflowClient

client = MlflowClient()

# ID of the run that produced the model (placeholder; copy the real one from the UI)
run_id = "YOUR_RUN_ID"

# First, we register the model from a run that went well
result = mlflow.register_model(
    model_uri=f"runs:/{run_id}/random_forest_model",
    name="CustomerChurnPredictor"
)

# Then, we move the newly registered version to Staging so the QA team can look at it
client.transition_model_version_stage(
    name="CustomerChurnPredictor",
    version=result.version,
    stage="Staging"
)

# Once everything checks out, we promote it to Production
client.transition_model_version_stage(
    name="CustomerChurnPredictor",
    version=result.version,
    stage="Production"
)
</code></pre>
<h3 id="heading-why-does-this-matter">Why Does This Matter?</h3>
<p>The Model Registry solves a problem that usually gets messy the moment a team grows: knowing exactly which version is live, who approved it, and what it was compared against. Without this, that information usually ends up buried in Slack threads or outdated spreadsheets.</p>
<p>It also makes rollbacks incredibly painless. If Version 3 starts acting up in production, you don't need to redeploy your entire stack. You can just transition Version 2 back to the "Production" stage in the registry. Since your serving infrastructure is built to always pull the "Production" tag, it will automatically swap back to the stable version.</p>
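<p>The rollback idea is easier to see with a toy model. The sketch below is an in-memory stand-in, not MLflow's API: <code>ToyRegistry</code> and its methods are hypothetical, and it assumes that promoting a version archives whichever version held Production before it.</p>

```python
class ToyRegistry:
    """In-memory stand-in for a model registry (illustration only)."""

    def __init__(self):
        self.stages = {}  # version number -> stage label

    def register(self):
        version = len(self.stages) + 1  # auto-increment: 1, 2, 3, ...
        self.stages[version] = "None"
        return version

    def transition(self, version, stage):
        if version not in self.stages:
            raise KeyError(f"unknown version: {version}")
        if stage == "Production":
            # Assumption: only one version holds Production at a time.
            for other, current in self.stages.items():
                if current == "Production":
                    self.stages[other] = "Archived"
        self.stages[version] = stage

    def production_version(self):
        """What a serving layer would ask for: the current Production version."""
        for version, stage in self.stages.items():
            if stage == "Production":
                return version
        return None


registry = ToyRegistry()
for _ in range(3):
    registry.register()

registry.transition(2, "Production")  # version 2 goes live
registry.transition(3, "Production")  # version 3 replaces it
registry.transition(2, "Production")  # version 3 misbehaves, so roll back

print(registry.production_version())
```

<p>The serving side only ever calls <code>production_version()</code>, so a rollback is a label change in the registry and nothing more.</p>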
<h2 id="heading-how-the-components-fit-together">How the Components Fit Together</h2>
<p>To see how all of this actually works in the real world, it helps to walk through a typical workflow from start to finish. It's essentially a relay race where each component hands off the baton to the next one.</p>
<p>It starts with a data scientist running a handful of experiments. Every time they hit run, MLflow Tracking is in the background taking notes. It logs parameters and metrics to the Backend Store and saves model artifacts to the Artifact Store automatically. At this stage, everything is about exploration and finding that one winner.</p>
<p>Once that best run is identified, the model gets officially registered in the Model Registry. This is where the team takes over. They can hop into the UI to check the annotations, review the evaluation results, and move the model into Staging. After it passes a few more validation tests, it gets the green light and is promoted to Production.</p>
<p>When it is time to actually serve the model, the deployment system simply asks the Registry for the current Production version. This happens whether you are using Kubernetes, a cloud endpoint, or MLflow’s built-in server.</p>
<p>Because the MLproject file handled the dependencies and the MLflow Models format handled the framework details, the serving infrastructure does not have to care if the model was built with Scikit-learn or PyTorch. The hand-off is smooth because all the necessary info is already there.</p>
<p>This flow is what turns MLflow from a collection of useful utilities into a full MLOps platform. It connects the messy experimental phase of data science to the rigid world of production software.</p>
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>At the end of the day, MLflow architecture is built to stay out of your way. It doesn't force you to change how you write your code or which libraries you use. Instead, it just provides the structure needed to make your machine learning projects reproducible and easier to manage as a team.</p>
<p>Whether you're just trying to get away from naming files <code>model_final_v2.pkl</code> or you are building a complex CI/CD pipeline for your models, understanding these four pillars is the best place to start. The best way to learn is to just fire up a local tracking server and start logging. You will probably find that once you have that "source of truth" for your experiments, you will never want to go back to the old way of doing things.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Use Docker Compose for Production Workloads — with Profiles, Watch Mode, and GPU Support ]]>
                </title>
                <description>
                    <![CDATA[ There's a perception problem with Docker Compose. Ask a room full of platform engineers what they think of it, and you'll hear some version of: "It's great for local dev, but we use Kubernetes for rea ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-use-docker-compose-for-production-workloads/</link>
                <guid isPermaLink="false">69aadee178c5adcd0e18ddd3</guid>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Docker compose ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Devops ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Balajee Asish Brahmandam ]]>
                </dc:creator>
                <pubDate>Fri, 06 Mar 2026 14:04:17 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5fc16e412cae9c5b190b6cdd/73c5f43a-321c-4ce1-8eb4-872b532cc8dd.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>There's a perception problem with Docker Compose. Ask a room full of platform engineers what they think of it, and you'll hear some version of: "It's great for local dev, but we use Kubernetes for real work."</p>
<p>I get it. I held that same opinion for years. Compose was the thing I used to spin up a Postgres database on my laptop, not something I'd trust with a staging environment, let alone a workload that needed GPU access.</p>
<p>Then 2024 and 2025 happened. Docker shipped a set of features that quietly transformed Compose from a developer convenience tool into something that can handle complex deployment scenarios. Profiles let you manage multiple environments from a single file. Watch mode killed the painful rebuild cycle that made container-based development feel sluggish. GPU support opened the door to ML inference workloads. And a bunch of smaller improvements (better health checks, Bake integration, structured logging) filled in the gaps that used to make Compose feel like a toy.</p>
<p>Here's what I'll cover: using Docker Compose profiles to manage multiple environments from one file, setting up watch mode for instant code syncing during development, configuring GPU passthrough for machine learning workloads, implementing proper health checks and startup ordering so your services stop crashing on cold starts, and using Bake to bridge the gap between your local Compose workflow and production image builds. I'll also tell you where Compose still falls short and where you should reach for something else.</p>
<h2 id="heading-prerequisites"><strong>Prerequisites</strong></h2>
<p>You should be comfortable with Docker basics and have written a <code>compose.yaml</code> file before. You'll need Docker Compose v2 installed. The minimum version depends on which features you want: <code>service_healthy</code> dependency conditions require v2.20.0+, watch mode requires v2.22.0+, and the <code>gpus:</code> shorthand requires v2.30.0+. Run <code>docker compose version</code> to check what you have.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a href="#heading-the-modern-compose-file-whats-changed">The Modern Compose File: What's Changed</a></p>
</li>
<li><p><a href="#heading-how-to-use-profiles-to-manage-multiple-environments">How to Use Profiles to Manage Multiple Environments</a></p>
<ul>
<li><a href="#heading-real-world-profile-patterns-ive-used">Real-World Profile Patterns I've Used</a></li>
</ul>
</li>
<li><p><a href="#heading-how-to-use-watch-mode-to-end-the-rebuild-cycle">How to Use Watch Mode to End the Rebuild Cycle</a></p>
<ul>
<li><a href="#heading-watch-mode-vs-bind-mounts">Watch Mode vs. Bind Mounts</a></li>
</ul>
</li>
<li><p><a href="#heading-how-to-set-up-gpu-support-for-machine-learning-workloads">How to Set Up GPU Support for Machine Learning Workloads</a></p>
<ul>
<li><a href="#heading-how-to-combine-multi-gpu-workloads-with-profiles">How to Combine Multi-GPU Workloads with Profiles</a></li>
</ul>
</li>
<li><p><a href="#heading-how-to-configure-health-checks-dependencies-and-startup-ordering">How to Configure Health Checks, Dependencies, and Startup Ordering</a></p>
</li>
<li><p><a href="#heading-how-to-use-bake-for-production-image-builds">How to Use Bake for Production Image Builds</a></p>
</li>
<li><p><a href="#heading-what-compose-is-not-an-honest-assessment">What Compose Is Not (An Honest Assessment)</a></p>
</li>
<li><p><a href="#heading-a-practical-adoption-path">A Practical Adoption Path</a></p>
</li>
<li><p><a href="#heading-wrapping-up">Wrapping Up</a></p>
</li>
</ul>
<h2 id="heading-the-modern-compose-file-whats-changed">The Modern Compose File: What's Changed</h2>
<p>If you haven't looked at a Compose file recently, the first thing you'll notice is that the <code>version</code> field is gone. Docker Compose v2 ignores it entirely, and including it actually triggers a deprecation warning. A modern <code>compose.yaml</code> starts cleanly with your services, no preamble needed.</p>
<p>But the structural changes go deeper than that. Here's what a modern, production-aware Compose file looks like for a typical web application stack:</p>
<pre><code class="language-yaml">services:
  api:
    image: ghcr.io/myorg/api:${TAG:-latest}
    env_file: [configs/common.env]
    environment:
      - NODE_ENV=${NODE_ENV:-production}
    ports:
      - "8080:8080"
    depends_on:
      db:
        condition: service_healthy
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: "1.0"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 10s
      timeout: 5s
      retries: 3

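  db:
    image: postgres:16-alpine
    environment:
      # The official image won't initialize a new database without a password
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:?set POSTGRES_PASSWORD}
    volumes:
      - db-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 5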

volumes:
  db-data:
</code></pre>
<p>Look at what's in there: resource limits, health checks with dependency conditions, proper volume management. These aren't nice-to-haves. They're the features that make Compose viable beyond your laptop.</p>
<p>Health checks in particular solve one of Compose's oldest and most annoying pain points: the race condition where your web server starts before the database is actually ready to accept connections. If you've ever added <code>sleep 10</code> to a startup script and crossed your fingers, you know what I'm talking about.</p>
<h2 id="heading-how-to-use-profiles-to-manage-multiple-environments">How to Use Profiles to Manage Multiple Environments</h2>
<p>This is the feature that changed my relationship with Compose. Before profiles, managing different environments meant choosing between two painful approaches. Either you maintained multiple Compose files (<code>docker-compose.yml</code>, <code>docker-compose.dev.yml</code>, <code>docker-compose.test.yml</code>, <code>docker-compose.prod.yml</code>) and dealt with the inevitable drift between them. Or you used one big bloated file where you commented out services depending on the context. Both approaches were fragile, and both led to those fun "works on my machine" conversations.</p>
<p>Profiles give you a much cleaner path. You assign services to named groups. Services without a profile always start. Services with a profile only start when you explicitly activate that profile. You can also activate profiles with the <code>COMPOSE_PROFILES</code> environment variable instead of the CLI flag, which is handy for CI (see the <a href="https://docs.docker.com/compose/how-tos/profiles/">official profiles docs</a> for the full syntax).</p>
<p>Here's what that looks like:</p>
<pre><code class="language-yaml">services:
  api:
    image: myapp:latest
    # No profiles = always starts

  db:
    image: postgres:16
    # No profiles = always starts

  debug-tools:
    image: busybox
    profiles: [debug]
    # Only starts with --profile debug

  prometheus:
    image: prom/prometheus
    profiles: [monitoring]
    # Only starts with --profile monitoring

  grafana:
    image: grafana/grafana
    profiles: [monitoring]
    depends_on: [prometheus]
</code></pre>
<p>Now your team operates with simple, memorable commands:</p>
<pre><code class="language-bash"># Development: just the core stack
docker compose up -d

# Development with observability
docker compose --profile monitoring up -d

# CI: core stack only (no monitoring overhead)
docker compose up -d

# Full stack with debugging
docker compose --profile debug --profile monitoring up
</code></pre>
<p>One Compose file. No drift. No guesswork about which override file to pass.</p>
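<p>If the selection rule feels abstract, this small sketch spells it out for the services in the example above. It is a simplified model of the rule described here, not Compose's implementation: <code>services_to_start</code> is a hypothetical helper, and it ignores extras such as naming a service explicitly on the command line.</p>

```python
# Service name -> profiles it belongs to (an empty list means "no profile").
SERVICES = {
    "api": [],
    "db": [],
    "debug-tools": ["debug"],
    "prometheus": ["monitoring"],
    "grafana": ["monitoring"],
}


def services_to_start(services, active_profiles=()):
    """Return the services that start for the given set of active profiles."""
    active = set(active_profiles)
    return [
        name
        for name, profiles in services.items()
        # No profile: always starts. Otherwise at least one profile must be active.
        if not profiles or active.intersection(profiles)
    ]


core_only = services_to_start(SERVICES)
with_monitoring = services_to_start(SERVICES, ["monitoring"])
full_stack = services_to_start(SERVICES, ["debug", "monitoring"])

print(core_only)
print(with_monitoring)
print(full_stack)
```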
<h3 id="heading-real-world-profile-patterns-ive-used">Real-World Profile Patterns I've Used</h3>
<p>Four patterns I keep coming back to:</p>
<p><strong>The "infra-only" pattern.</strong> This is for developers who run application code natively on their host machine but need infrastructure services like databases, message queues, and caches in containers. You leave infrastructure services without a profile and put application services behind one. Your backend developer runs <code>docker compose up</code> to get Postgres and Redis, then starts the API directly on their host with their favorite debugger attached.</p>
<p><strong>The "mock vs. real" pattern.</strong> You put a <code>payments-mock</code> service in the <code>dev</code> profile and a real payments gateway service in the <code>prod</code> profile. Same Compose file, totally different behavior depending on context. This one saved my team from accidentally hitting a live payment API during development more than once.</p>
<p><strong>The "CI optimization" pattern.</strong> Heavy services like Selenium browsers and monitoring stacks go behind profiles so your CI pipeline skips them. Your test suite runs faster without that overhead, and you only pull those services in when you actually need end-to-end integration tests.</p>
<p><strong>The "AI/ML workloads" pattern.</strong> GPU-dependent services (inference servers, model training containers) go into a <code>gpu</code> profile. Developers without GPUs can still work on the rest of the stack without anything breaking.</p>
<p>One practical tip that's saved me a lot of headaches: document your profiles in the project's README. It sounds obvious, but when a new team member runs <code>docker compose up</code> and wonders why the monitoring dashboard isn't starting, they need a single place to find the answer. A quick table listing each profile and what it includes will save you from answering the same Slack question every onboarding cycle.</p>
<h2 id="heading-how-to-use-watch-mode-to-end-the-rebuild-cycle">How to Use Watch Mode to End the Rebuild Cycle</h2>
<p>If profiles solved the environment management problem, watch mode solved the developer experience problem.</p>
<p>You probably know the old workflow for container-based development. It went like this: edit code, run <code>docker compose build</code>, run <code>docker compose up</code>, test your change, find a bug, edit again, rebuild, restart, test. Each iteration costs you thirty seconds to a minute of waiting. Over a full day of active development, you're losing an hour or more just sitting there watching build logs scroll by.</p>
<p>Watch mode (introduced in Compose v2.22.0 and significantly improved in later releases) monitors your local files and automatically takes action when something changes. It supports three synchronization strategies, and picking the right one for each situation is the key to making it work well. The <a href="https://docs.docker.com/compose/how-tos/file-watch/">official watch mode docs</a> cover the full spec if you want to dig deeper.</p>
<p><code>sync</code> copies changed files directly into the running container. This works best for interpreted languages like Python, JavaScript, and Ruby, and for frameworks with hot module reloading like React, Vue, or Next.js. The file lands in the container, the framework picks up the change, and your browser updates. No rebuild, no restart. If you're working with a compiled language like Go, Rust, or Java, <code>sync</code> won't help you since the code needs to be recompiled. Use <code>rebuild</code> for those instead.</p>
<p><code>rebuild</code> triggers a full image rebuild and container replacement. You want this for dependency changes, like when you update <code>package.json</code> or <code>requirements.txt</code>, or when you modify the Dockerfile itself. In those cases, syncing files isn't enough. You need a fresh image.</p>
<p><code>sync+restart</code> syncs files into the container, then restarts the container. This is ideal for configuration file changes like <code>nginx.conf</code> or database configs, where the application needs to reload to pick up the new settings but the image itself is fine.</p>
<p>Here's what a real-world watch configuration looks like for a Node.js application:</p>
<pre><code class="language-yaml">services:
  api:
    build: .
    ports: ["3000:3000"]
    command: npx nodemon server.js
    develop:
      watch:
        - action: sync
          path: ./src
          target: /app/src
          ignore:
            - node_modules/
        - action: rebuild
          path: package.json
        - action: sync+restart
          path: ./config
          target: /app/config
</code></pre>
<p>You start it with <code>docker compose up --watch</code>, or you can run <code>docker compose watch</code> as a standalone command if you'd rather keep the file sync events separate from your application logs.</p>
<p>A few things to know before you set this up. Watch mode only works with services that have a local <code>build:</code> context. If you're pulling a prebuilt image from a registry, there's nothing for Compose to sync or rebuild, so watch will ignore that service. Your container also needs basic file utilities (<code>stat</code>, <code>mkdir</code>) installed, and the container <code>USER</code> must have write access to the target path. If you're using a minimal base image like <code>scratch</code> or <code>distroless</code>, the <code>sync</code> action won't work. And if you're on an older Compose version, check which actions are supported: <code>sync+restart</code> and <code>sync+exec</code> were added in later minor releases after the initial v2.22.0 launch.</p>
<p>It's a massive improvement. Edit a source file, save it, and the change is live in under a second for frameworks with hot reload. No context switching to run build commands. No waiting. Just code.</p>
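<p>To see how the watch rules in the Node.js example divide up the work, here is a rough sketch that maps a changed file to the action it would trigger. This is only a mental model under simplifying assumptions (first matching rule wins, ignore patterns are plain directory names), not Compose's actual matching logic. <code>plan_for_change</code> and <code>WATCH_RULES</code> are hypothetical names.</p>

```python
from pathlib import PurePosixPath

# The three rules from the Node.js watch configuration above.
WATCH_RULES = [
    {"action": "sync", "path": "src", "target": "/app/src", "ignore": ["node_modules"]},
    {"action": "rebuild", "path": "package.json"},
    {"action": "sync+restart", "path": "config", "target": "/app/config"},
]


def plan_for_change(changed, rules=WATCH_RULES):
    """Return (action, destination in container) for a changed file, or None."""
    changed_path = PurePosixPath(changed)
    for rule in rules:
        root = PurePosixPath(rule["path"])
        # The rule applies to the watched path itself and anything beneath it.
        if changed_path != root and root not in changed_path.parents:
            continue
        relative = changed_path.relative_to(root)
        if any(part in rule.get("ignore", []) for part in relative.parts):
            return None  # ignored paths trigger nothing
        target = rule.get("target")
        destination = str(PurePosixPath(target) / relative) if target else None
        return rule["action"], destination
    return None  # file is outside every watched path


print(plan_for_change("src/routes/users.js"))
print(plan_for_change("package.json"))
print(plan_for_change("config/app.json"))
```

<p>Source edits map to a cheap file copy, dependency changes map to a rebuild, and anything outside the watched paths is left alone.</p>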
<h3 id="heading-watch-mode-vs-bind-mounts">Watch Mode vs. Bind Mounts</h3>
<p>A fair question you might be asking: bind mounts have provided a form of live-reload for years. Why does watch mode need to exist?</p>
<p>Bind mounts work, but they come with platform-specific issues that have plagued Docker Desktop for a long time. On macOS and Windows, bind mounts go through a filesystem sharing layer between the host OS and the Linux VM running Docker. This introduces permission quirks, performance problems on large directories (ever watched a <code>node_modules</code> folder choke a bind mount on macOS?), and inconsistent file notification behavior that makes hot reload unreliable.</p>
<p>Watch mode sidesteps these issues by explicitly syncing files at the application level. It's more predictable, works consistently across platforms, and gives you more control over what happens when a file changes.</p>
<p>That said, bind mounts still work well for many use cases, especially if you're on native Linux where the performance overhead doesn't exist. Watch mode is the better choice for teams that have run into cross-platform issues, or for anyone who wants the automatic rebuild and restart triggers that bind mounts can't provide.</p>
<h2 id="heading-how-to-set-up-gpu-support-for-machine-learning-workloads">How to Set Up GPU Support for Machine Learning Workloads</h2>
<p>This is the feature that made me rethink what Compose can do.</p>
<p>Docker has supported GPU passthrough for individual containers for years through the NVIDIA Container Toolkit and the <code>--gpus</code> flag. But configuring GPU access in Compose files used to require clunky runtime declarations that were poorly documented and changed between Compose versions. It was the kind of thing where you'd find a Stack Overflow answer from 2021, try it, and discover it didn't work anymore.</p>
<p>The modern Compose spec handles it cleanly through the <code>deploy.resources.reservations.devices</code> block:</p>
<pre><code class="language-yaml">services:
  inference:
    image: myorg/model-server:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
</code></pre>
<p>If you're on Compose v2.30.0 or later, there's also a shorter syntax using the <code>gpus:</code> field:</p>
<pre><code class="language-yaml">services:
  inference:
    image: myorg/model-server:latest
    gpus:
      - driver: nvidia
        count: 1
</code></pre>
<p>Both approaches do the same thing. The <code>deploy.resources</code> syntax works on older Compose versions and gives you more control (like setting <code>device_ids</code> to pin specific GPUs). The <code>gpus:</code> shorthand is cleaner when you just need basic access.</p>
<p><strong>One thing that will trip you up if you skip it:</strong> your host machine needs the right GPU drivers and <code>nvidia-container-toolkit</code> installed before any of this works. Run <code>nvidia-smi</code> on the host first. If that command doesn't show your GPUs, Compose won't see them either. For CUDA workloads, use official GPU base images like <code>nvidia/cuda</code> or the PyTorch/TensorFlow GPU images. The <a href="https://docs.docker.com/compose/how-tos/gpu-support/">Compose GPU access docs</a> walk through the full setup.</p>
<p>That's the whole thing. When you run <code>docker compose up</code>, the inference service gets access to one NVIDIA GPU. You can set <code>count</code> to <code>"all"</code> if you want every available GPU, or use <code>device_ids</code> to assign specific GPUs to specific services.</p>
<h3 id="heading-how-to-combine-multi-gpu-workloads-with-profiles">How to Combine Multi-GPU Workloads with Profiles</h3>
<p>Here's where profiles and GPU support work really well together. Consider an ML workload where you need an LLM for text generation, an embedding model for vector search, and a vector database:</p>
<pre><code class="language-yaml">services:
  vectordb:
    image: milvus/milvus:latest
    # Runs on CPU, no profile needed

  llm-server:
    image: ollama/ollama:latest
    profiles: [gpu]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["1"]
              capabilities: [gpu]
    volumes:
      - model-cache:/root/.ollama

  embedding-server:
    image: myorg/embeddings:latest
    profiles: [gpu]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]

volumes:
  model-cache:
</code></pre>
<p>Developers without GPUs work on the application logic with just <code>docker compose up</code>. The vector database starts, they can write code against its API, and everything runs fine. When it's time to test the full ML pipeline, someone with a multi-GPU workstation runs <code>docker compose --profile gpu up</code> and gets the complete stack with specific GPU assignments.</p>
<p>This pattern has become central to our AIOps platform development. The team building alerting logic doesn't need GPUs. The team training anomaly detection models does. One Compose file serves both teams.</p>
<h2 id="heading-how-to-configure-health-checks-dependencies-and-startup-ordering">How to Configure Health Checks, Dependencies, and Startup Ordering</h2>
<p>One of Compose's most underappreciated improvements is how it handles service dependencies. The <code>depends_on</code> directive now supports conditions that actually mean something (this requires Compose v2.20.0+, see the <a href="https://docs.docker.com/compose/how-tos/startup-order/">startup ordering docs</a> for the full picture):</p>
<pre><code class="language-yaml">depends_on:
  db:
    condition: service_healthy
  redis:
    condition: service_started
</code></pre>
<p>When you combine this with proper health checks, you eliminate the "sleep 10 and hope" pattern that plagues so many Compose setups. Your API service actually waits until PostgreSQL is accepting connections before it tries to start. Not just until the container is running, but until the database process inside it has passed its health check.</p>
<p>One detail that catches people: tune your <code>start_period</code>. Databases like PostgreSQL need time to initialize on first boot, especially if they're running migrations. Without a <code>start_period</code>, the health check starts counting retries immediately and can declare the service unhealthy before it even had a chance to finish starting up. A config like this works well for most database services:</p>
<pre><code class="language-yaml">healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres"]
  interval: 5s
  timeout: 2s
  retries: 10
  start_period: 30s
</code></pre>
<p>The <code>start_period</code> gives the container 30 seconds of grace time where failed health checks don't count against the retry limit.</p>
<p>This might seem like a small detail, but if you've ever worked on a stack with eight or ten interconnected services, you know how much time you can waste debugging cascading failures during cold starts. Proper startup ordering prevents all of that and makes your local environment behave much more like production.</p>
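<p>Here is a small simulation of why <code>start_period</code> matters. It is a simplified model of the startup phase only, assuming one probe per interval and treating the first passing probe as "healthy". <code>startup_health</code> is a hypothetical function, not part of Docker or Compose.</p>

```python
def startup_health(probe_results, interval, retries, start_period=0):
    """Simplified model of a container health check during startup.

    probe_results[i] is the outcome of the probe that runs
    (i + 1) * interval seconds after the container starts.
    """
    failures = 0
    for index, passed in enumerate(probe_results):
        elapsed = (index + 1) * interval
        if passed:
            return "healthy"  # first passing probe ends the startup phase
        if elapsed > start_period:
            failures += 1  # only failures after the grace window count
        if failures >= retries:
            return "unhealthy"
    return "starting"


# A database that needs about 40 seconds before it accepts connections.
slow_boot = [False] * 7 + [True]

without_grace = startup_health(slow_boot, interval=5, retries=3)
with_grace = startup_health(slow_boot, interval=5, retries=3, start_period=30)

print(without_grace, with_grace)
```

<p>With no grace window, the three failed probes in the first 15 seconds are enough to mark the database unhealthy, and anything waiting on <code>service_healthy</code> never starts. With a 30-second <code>start_period</code>, those early failures are forgiven and the probe at 40 seconds succeeds.</p>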
<h2 id="heading-how-to-use-bake-for-production-image-builds">How to Use Bake for Production Image Builds</h2>
<p>I mentioned Bake integration earlier, and it's worth its own section because it solves a problem you'll hit as soon as you start using Compose for anything beyond local dev: your development Compose file and your production build process have different needs.</p>
<p>During development, you want fast builds, local caches, and single-platform images. For production, you want tagged images pushed to a registry, multi-platform builds, and build attestations. Trying to cram both into your <code>compose.yaml</code> gets messy fast.</p>
<p>Docker Bake (<code>docker buildx bake</code>) can read your <code>compose.yaml</code> and generate build targets from it, but you can override and extend those targets with a separate <code>docker-bake.hcl</code> file. This keeps your development workflow clean while giving CI the knobs it needs. The <a href="https://docs.docker.com/build/bake/">Bake documentation</a> covers the full HCL syntax and Compose integration.</p>
<p>Here's a minimal <code>docker-bake.hcl</code>:</p>
<pre><code class="language-hcl">group "default" {
  targets = ["api", "worker"]
}

target "api" {
  context    = "api"
  dockerfile = "Dockerfile"
  tags       = ["registry.example.com/team/api:release"]
  platforms  = ["linux/amd64"]
}

target "worker" {
  context    = "worker"
  dockerfile = "Dockerfile"
  tags       = ["registry.example.com/team/worker:release"]
}
</code></pre>
<p>Then your CI pipeline runs <code>docker buildx bake</code> to produce release images (add <code>--push</code> to send the tagged images straight to your registry), while developers keep using <code>docker compose up --build</code> locally. The two workflows share the same Dockerfiles but have separate build configurations where they need them.</p>
<p>The pattern I've landed on: use Compose for local development and CI test environments, use Bake in CI to produce the release images, and push those images into whatever deployment target your team uses (staging server, Kubernetes cluster, edge node). Compose gets you from code to running containers fast. Bake gets you from code to production-ready images with proper tags and attestations.</p>
<h2 id="heading-what-compose-is-not-an-honest-assessment">What Compose Is Not (An Honest Assessment)</h2>
<p>I've spent this entire article making the case that Compose has grown up. But I should also tell you where it falls short. I'd rather you hear it from me now than discover it the hard way in production.</p>
<p><strong>Compose is not a container orchestrator.</strong> It doesn't schedule work across multiple hosts. It doesn't do automatic failover. It won't give you rolling updates with zero downtime, and it has no concept of service mesh networking. If you need any of those things, you need Kubernetes, Nomad, or Docker Swarm (if you're still using it).</p>
<p><strong>Compose doesn't replace Helm or Kustomize.</strong> If you're deploying to Kubernetes, Compose files don't translate directly. Docker offers Compose Bridge to convert Compose files into Kubernetes manifests, but it's still experimental and won't handle complex Kubernetes-specific configurations like custom resource definitions or ingress rules.</p>
<p><strong>Compose doesn't handle secrets well in production.</strong> The secrets support exists, but it's limited compared to HashiCorp Vault, AWS Secrets Manager, or Kubernetes secrets. For anything beyond a staging environment, you'll want an external secrets management solution.</p>
<p>The sweet spot for modern Compose is clear: local development, CI/CD testing environments, single-node staging environments, and workloads where a single powerful machine (particularly for GPU work) is the right deployment target. Within that scope, Compose is excellent. Outside of it, you'll hit walls fast.</p>
<p>If you do run Compose in a staging or single-node production setup, a few more things are worth adding that I haven't covered here: <code>restart: unless-stopped</code> on every service so containers come back after a host reboot, a logging driver config so your logs go somewhere searchable instead of disappearing into <code>docker logs</code>, and a backup strategy for your named volumes. These aren't Compose-specific problems, but Compose won't solve them for you either.</p>
<h2 id="heading-a-practical-adoption-path">A Practical Adoption Path</h2>
<p>If you're currently working with a basic Compose setup and want to start using these features, here's the order I'd recommend. Each step is incremental, backward-compatible, and valuable on its own. You don't have to do all of this at once.</p>
<p><strong>Week 1: Add health checks and proper</strong> <code>depends_on</code> <strong>conditions.</strong> This alone will eliminate the most common frustration: services crashing on startup because their dependencies aren't ready yet. Start with your database and your main application service. Once those two are wired up with <code>condition: service_healthy</code>, you'll notice the difference immediately.</p>
<pre><code class="language-yaml">healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres"]
  interval: 5s
  timeout: 2s
  retries: 10
  start_period: 30s
</code></pre>
<p><strong>Week 2: Introduce profiles.</strong> Start by putting your monitoring stack behind a <code>monitoring</code> profile and your debug tools behind a <code>debug</code> profile. Then delete whatever extra Compose files you've been maintaining. Having one source of truth instead of four files that are almost-but-not-quite the same makes everything simpler.</p>
<p><strong>Week 3: Set up watch mode for your most-edited service.</strong> Pick the service where your developers spend the most time iterating. Get watch mode working there first. Once the team sees the difference (saving a file and seeing the change reflected in under a second) they'll ask for it on everything else.</p>
<p><strong>Week 4: Add resource limits.</strong> Define memory and CPU limits for every service. This prevents one runaway container from starving the rest and gives you a realistic preview of how your services behave under production constraints. It's also useful for catching memory leaks early.</p>
<pre><code class="language-yaml">deploy:
  resources:
    limits:
      memory: 512M
      cpus: "1.0"
</code></pre>
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>Docker Compose in 2026 is not the same tool it was a few years ago. Profiles, watch mode, GPU support, proper dependency management, and Bake integration have turned it into something that can handle real, complex workloads, as long as those workloads fit on a single node.</p>
<p>It's not Kubernetes, and it shouldn't try to be. But for local development, CI pipelines, staging environments, and single-machine GPU workloads, it's become hard to argue against. If you've been dismissing Compose because of what it used to be, the current version deserves a second look.</p>
<p>If you found this useful, you can find me writing about DevOps, containers, and AIOps best practices on my blog.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Use Different Container Runtimes: Docker, Podman, and Containerd Explained ]]>
                </title>
                <description>
                    <![CDATA[ If you’re a developer working with containers, chances are Docker is your go-to tool. But did you know that there's a whole ecosystem of container runtimes out there? Some are lighter, some are more secure, and some are specifically built for Kuberne... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-use-different-container-runtimes-docker-podman-and-containerd-explained/</link>
                <guid isPermaLink="false">6994e01b44a48dd86fdf0816</guid>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Kubernetes ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Destiny Erhabor ]]>
                </dc:creator>
                <pubDate>Tue, 17 Feb 2026 21:39:39 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1771357533601/1cba7a91-19f0-4038-93e6-504b121a9a03.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>If you’re a developer working with containers, chances are Docker is your go-to tool. But did you know that there's a whole ecosystem of container runtimes out there? Some are lighter, some are more secure, and some are specifically built for Kubernetes.</p>
<p>Understanding different container runtimes gives you more options. You can choose the right tool for your specific needs, whether that's better security, lower resource usage, or easier integration with Kubernetes.</p>
<p>In this tutorial, you'll learn about three major container runtimes and how to use them on your system. We’ll dive into practical examples with complete code you can run right now. By the end, you’ll understand when to use each runtime and how to move containers between them.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><p><a class="post-section-overview" href="#heading-what-are-container-runtimes">What Are Container Runtimes?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-understand-high-level-vs-low-level-runtimes">How to Understand High-Level vs Low-Level Runtimes</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-use-docker-as-your-baseline">How to Use Docker as Your Baseline</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-use-podman-the-daemonless-alternative">How to Use Podman – The Daemonless Alternative</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-work-with-containerd">How to Work with Containerd</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-move-containers-between-runtimes">How to Move Containers Between Runtimes</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-real-world-use-cases">Real-World Use Cases</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-quick-reference-guide">Quick Reference Guide</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ol>
<h2 id="heading-what-are-container-runtimes">What Are Container Runtimes?</h2>
<p>A container runtime is the software that actually runs your containers. When you type <code>docker run nginx</code>, for example, several things happen behind the scenes. The Docker CLI talks to the Docker daemon, which then uses a container runtime (usually containerd) to actually create and run the container.</p>
<p>Think of it like this: if containers are apps on your phone, the container runtime is the operating system that makes those apps work. Just like you can install the same app on different phones (iPhone vs Android), you can run the same container on different runtimes.</p>
<h3 id="heading-why-does-this-matter">Why Does This Matter?</h3>
<p>You might wonder why you should care about what's running your containers. Docker works fine, right? Here are a few reasons:</p>
<ol>
<li><p><strong>Security:</strong> Some runtimes like Podman can run containers without root privileges. This means if someone breaks out of your container, they don't have full system access.</p>
</li>
<li><p><strong>Resource usage:</strong> Different runtimes use different amounts of memory and CPU. On a resource-constrained server or edge device, this matters a lot.</p>
</li>
<li><p><strong>Integration:</strong> If you're deploying to Kubernetes, understanding containerd or CRI-O helps you troubleshoot production issues.</p>
</li>
<li><p><strong>Licensing:</strong> Docker Desktop has licensing requirements for large companies. Alternatives like Podman are completely free.</p>
</li>
</ol>
<p>Here’s a chart that summarizes these key points:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770901945553/8ef53746-02d1-4936-8930-fc7255aaa2bc.jpeg" alt="Container runtime comparison chart" class="image--center mx-auto" width="2647" height="1510" loading="lazy"></p>
<h2 id="heading-how-to-understand-high-level-vs-low-level-runtimes">How to Understand High-Level vs Low-Level Runtimes</h2>
<p>Container runtimes are split into two categories, and understanding this distinction helps you see how everything fits together.</p>
<h3 id="heading-low-level-runtimes">Low-Level Runtimes</h3>
<p>Low-level runtimes like <code>runc</code> and <code>crun</code> do the actual work of creating containers. They interact directly with the Linux kernel to create isolated environments using features like namespaces and cgroups.</p>
<p><strong>Namespaces</strong> isolate what a process can see. For example, a process namespace means the container can't see other processes running on your system. A network namespace means it has its own network stack.</p>
<p><strong>Cgroups</strong> (control groups) limit what a process can use. You can limit a container to 512MB of RAM or 50% of one CPU core. This prevents one container from hogging all your resources.</p>
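<p>If you have a Linux shell handy, you can see both mechanisms without any container tooling. This is a minimal sketch that assumes a Linux system with <code>/proc</code> mounted (on macOS or Windows, run it inside the Linux VM your container tool provides); the exact entries depend on your kernel:</p>
<pre><code class="lang-bash"># Namespaces this process belongs to (pid, net, mnt, and so on)
ls -l /proc/self/ns

# The cgroup this process has been placed in
cat /proc/self/cgroup
</code></pre>
<p>Run the same commands inside a container and most of the namespace IDs differ, because the runtime gave that process its own set. Limits such as <code>docker run --memory 512m --cpus 0.5</code> are applied through the container's cgroup.</p>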
<p>These low-level runtimes implement the OCI (Open Container Initiative) Runtime Specification. This is a standard that defines exactly how to run a container. Because of this standard, you can swap out runtimes and your containers still work.</p>
<h3 id="heading-high-level-runtimes">High-Level Runtimes</h3>
<p>High-level runtimes like Docker, Podman, and containerd manage images, networking, volumes, and provide user-friendly interfaces. They handle pulling images from registries, setting up networks between containers, and managing container lifecycles.</p>
<p>These high-level runtimes use low-level runtimes under the hood. When you run <code>docker run</code>, Docker ultimately calls <code>runc</code> to create the container. This layering means you get a nice interface while still benefiting from the standard, battle-tested low-level runtime.</p>
<h4 id="heading-why-this-layering-matters">Why This Layering Matters:</h4>
<p>This separation of concerns is powerful. High-level runtimes can focus on user experience and features while low-level runtimes focus on reliably creating containers. You can swap low-level runtimes without changing your workflow. Some people use <code>crun</code> instead of <code>runc</code> because it's written in C and starts faster.</p>
<h2 id="heading-how-to-use-docker-as-your-baseline">How to Use Docker as Your Baseline</h2>
<p>Let's start with Docker since you're probably already familiar with it. This will give us a baseline to compare other runtimes against. We'll build a simple web application and then run the same application in different runtimes to see how they compare.</p>
<h3 id="heading-how-to-install-docker">How to Install Docker</h3>
<p>You can find installation guides for your operating system:</p>
<ul>
<li><p><a target="_blank" href="https://docs.docker.com/desktop/install/mac-install/">Docker Desktop for Mac</a></p>
</li>
<li><p><a target="_blank" href="https://docs.docker.com/desktop/install/windows-install/">Docker Desktop for Windows</a></p>
</li>
<li><p><a target="_blank" href="https://docs.docker.com/engine/install/">Docker Engine for Linux</a></p>
</li>
</ul>
<h3 id="heading-how-to-run-a-test-container">How to Run a Test Container</h3>
<p>Let's verify that Docker works by running a simple container:</p>
<pre><code class="lang-bash">docker run hello-world
</code></pre>
<p>You should see a message that says:</p>
<pre><code class="lang-bash">Hello from Docker!
This message shows that your installation appears to be working correctly.
</code></pre>
<h4 id="heading-what-just-happened">What Just Happened?</h4>
<p>When you ran that command, Docker checked if the <code>hello-world</code> image exists locally. It didn't find it, so it pulled the image from Docker Hub (a public registry). Then it created a container from that image, started the container, and the container printed its message and exited.</p>
<p>All of this happened in a few seconds. Now let's build something more useful.</p>
<h3 id="heading-how-to-create-a-web-server">How to Create a Web Server</h3>
<p>Create a new directory for your project:</p>
<pre><code class="lang-bash">mkdir ~/container-demo
<span class="hljs-built_in">cd</span> ~/container-demo
</code></pre>
<p>The <code>~</code> symbol means your home directory. On macOS, this is <code>/Users/yourname</code>. On Linux, it's <code>/home/yourname</code>.</p>
<p>Create a simple HTML file:</p>
<pre><code class="lang-bash">cat &gt; index.html &lt;&lt; <span class="hljs-string">'EOF'</span>
&lt;!DOCTYPE html&gt;
&lt;html&gt;
&lt;head&gt;&lt;title&gt;Container Demo&lt;/title&gt;&lt;/head&gt;
&lt;body&gt;
  &lt;h1&gt;Hello from Docker!&lt;/h1&gt;
  &lt;p&gt;This is running <span class="hljs-keyword">in</span> a container.&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;
EOF
</code></pre>
<p>This creates a basic HTML file. The <code>cat &gt;</code> command writes to a file, and <code>&lt;&lt; 'EOF'</code> means "read until you see EOF" (End Of File). This is a handy way to create files from the command line.</p>
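<p>The quotes around <code>'EOF'</code> matter: they tell the shell to copy the text literally, without expanding variables or commands. Here's a small sketch of the difference. The file names <code>quoted.txt</code> and <code>unquoted.txt</code> and the <code>GREETING</code> variable are made up for this illustration:</p>
<pre><code class="lang-bash">GREETING="Hello"

# Quoted delimiter: the body is written exactly as typed
cat &gt; quoted.txt &lt;&lt; 'EOF'
$GREETING from a container
EOF

# Unquoted delimiter: the shell expands $GREETING before writing
cat &gt; unquoted.txt &lt;&lt; EOF
$GREETING from a container
EOF

cat quoted.txt     # $GREETING from a container
cat unquoted.txt   # Hello from a container
</code></pre>
<p>For files like HTML or a Dockerfile, the quoted form is usually the safer choice, since a stray <code>$</code> in the content stays untouched.</p>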
<h3 id="heading-how-to-create-a-dockerfile">How to Create a Dockerfile</h3>
<p>You can create a Dockerfile like this:</p>
<pre><code class="lang-bash">cat &gt; Dockerfile &lt;&lt; <span class="hljs-string">'EOF'</span>
FROM nginx:alpine
COPY index.html /usr/share/nginx/html/
EOF
</code></pre>
<h4 id="heading-understanding-the-dockerfile">Understanding the Dockerfile:</h4>
<p>The Dockerfile has two instructions:</p>
<ol>
<li><p><strong>FROM nginx:alpine</strong>: This starts with the official Nginx image. The <code>:alpine</code> tag means we're using the Alpine Linux version, which is much smaller (about 20MB instead of 130MB). Alpine is a minimal Linux distribution popular in containers because of its small size.</p>
</li>
<li><p><strong>COPY index.html /usr/share/nginx/html/</strong>: This copies your HTML file into the location where Nginx serves files. Inside the container, Nginx is configured to serve files from <code>/usr/share/nginx/html/</code>.</p>
</li>
</ol>
<h3 id="heading-how-to-build-a-docker-image">How to Build a Docker Image</h3>
<pre><code class="lang-bash">docker build -t my-web-app .
</code></pre>
<p>The <code>-t</code> flag means "tag" – we're naming the image <code>my-web-app</code>. The <code>.</code> at the end means "use the current directory as the build context". Docker will look for a Dockerfile in the current directory and send all files here to the Docker daemon for building.</p>
<p>You'll see output like:</p>
<pre><code class="lang-bash">[+] Building 2.3s (7/7) FINISHED
=&gt; [internal] load build definition from Dockerfile
=&gt; =&gt; transferring dockerfile: 98B
=&gt; [internal] load .dockerignore
...
=&gt; =&gt; naming to docker.io/library/my-web-app
</code></pre>
<p>This shows Docker building your image layer by layer. Each instruction in the Dockerfile creates a new layer. These layers are cached, so if you rebuild without changes, it's instant.</p>
<h3 id="heading-how-to-run-a-docker-container">How to Run a Docker Container</h3>
<pre><code class="lang-bash">docker run -d -p 8080:80 my-web-app
</code></pre>
<h4 id="heading-understanding-the-flags">Understanding the Flags:</h4>
<ul>
<li><p><strong>-d</strong> means "detached mode" – run in the background. Without this, the container runs in the foreground and you'll see Nginx's log output. With <code>-d</code>, it returns immediately and runs in the background.</p>
</li>
<li><p><strong>-p 8080:80</strong> maps port 8080 on your host machine to port 80 inside the container. Nginx listens on port 80 inside the container. To access it from your browser, you need to map it to a port on your machine. We chose 8080, but you could use any available port.</p>
</li>
</ul>
<p>Open your browser and visit <code>http://localhost:8080</code>. You should see your HTML page!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770902754636/b6641413-7bd6-4548-aa75-dbc487630c1d.png" alt="Localhost running docker container" class="image--center mx-auto" width="884" height="594" loading="lazy"></p>
<h4 id="heading-how-to-check-running-containers">How to Check Running Containers:</h4>
<pre><code class="lang-bash">docker ps
</code></pre>
<p>This shows all running containers. You'll see something like:</p>
<pre><code class="lang-bash">CONTAINER ID   IMAGE        COMMAND                  PORTS                  NAMES
a1b2c3d4e5f6   my-web-app   <span class="hljs-string">"/docker-entrypoint.…"</span>   0.0.0.0:8080-&gt;80/tcp   peaceful_curie
</code></pre>
<p>Docker automatically generated a random name (<code>peaceful_curie</code> in this example). You can specify a name with <code>--name</code> if you prefer.</p>
<h4 id="heading-how-to-view-container-logs">How to View Container Logs:</h4>
<pre><code class="lang-bash">docker logs &lt;container-id&gt;
</code></pre>
<p>Replace <code>&lt;container-id&gt;</code> with the ID from <code>docker ps</code> (just the first few characters work). This shows what's happening inside the container. For Nginx, you'll see access logs showing requests to your web server.</p>
<h4 id="heading-how-to-stop-the-container">How to Stop the Container:</h4>
<pre><code class="lang-bash">docker stop &lt;container-id&gt;
</code></pre>
<p>This gracefully stops the container. Nginx receives a signal to shut down cleanly.</p>
<p>Now that you understand how to use Docker, let’s check out how Podman works next.</p>
<h2 id="heading-how-to-use-podman-the-daemonless-alternative">How to Use Podman – The Daemonless Alternative</h2>
<p>Now let's try Podman. It's designed to be a drop-in replacement for Docker, but with some key differences that make it interesting for specific use cases.</p>
<h3 id="heading-why-podman-exists">Why Podman Exists</h3>
<p>By default, Docker runs as a daemon (a background service) with root privileges. This daemon always runs, listening for commands. This architecture has some downsides:</p>
<ol>
<li><p><strong>Security:</strong> The Docker daemon runs as root. If someone compromises the daemon, they have root access to your entire system.</p>
</li>
<li><p><strong>Resource Usage:</strong> The daemon consumes resources even when you're not running containers.</p>
</li>
<li><p><strong>Single Point of Failure:</strong> If the daemon crashes, your containers go down with it unless you've enabled Docker's live-restore option.</p>
</li>
</ol>
<p>Podman solves these problems by not using a daemon at all. Each <code>podman</code> command runs independently. This is called a "daemonless" architecture.</p>
<h3 id="heading-key-podman-features">Key Podman Features</h3>
<p>To summarize, here are the key Podman features that might make it a good fit for your projects:</p>
<ol>
<li><p><strong>No daemon required:</strong> Each command runs independently. No background service needed.</p>
</li>
<li><p><strong>Rootless by default:</strong> Containers run as your regular user, not as root. This dramatically improves security.</p>
</li>
<li><p><strong>Drop-in Docker replacement:</strong> Most Docker commands work exactly the same. You can even alias <code>docker=podman</code> and many applications won't notice the difference.</p>
</li>
<li><p><strong>Pod support:</strong> Podman has a concept of "pods" like Kubernetes. This is unique among container tools.</p>
</li>
</ol>
<p>Now that you understand the benefits of Podman, let’s see how you can use it.</p>
<h3 id="heading-how-to-install-podman">How to Install Podman</h3>
<p>Podman installation varies by operating system. Here are the official guides:</p>
<ul>
<li><p><a target="_blank" href="https://podman.io/docs/installation#macos">Podman for macOS</a></p>
</li>
<li><p><a target="_blank" href="https://podman.io/docs/installation#windows">Podman for Windows</a></p>
</li>
<li><p><a target="_blank" href="https://podman.io/docs/installation#linux">Podman for Linux</a></p>
</li>
</ul>
<p><strong>For macOS users</strong> (what we'll use in this tutorial), you can install Podman using Homebrew:</p>
<pre><code class="lang-bash">brew install podman
</code></pre>
<h3 id="heading-how-to-initialize-and-start-podman-machine">How to Initialize and Start Podman Machine</h3>
<p>On macOS, Podman needs a Linux VM to run containers (since containers use Linux kernel features). Podman Machine handles this for you:</p>
<pre><code class="lang-bash">podman machine init
</code></pre>
<p>This creates a small Linux VM. You’ll only need to do this once. The VM is about 1GB and uses minimal resources when running.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770903100891/671690cc-8073-4748-b2df-c4308585d411.png" alt="Initialize podman machine" class="image--center mx-auto" width="1028" height="344" loading="lazy"></p>
<p>Start the machine:</p>
<pre><code class="lang-bash">podman machine start
</code></pre>
<p>Verify it's working:</p>
<pre><code class="lang-bash">podman --version
</code></pre>
<p>You should see something like:</p>
<pre><code class="lang-bash">podman version 4.5.0
</code></pre>
<h3 id="heading-how-to-run-containers-with-podman">How to Run Containers with Podman</h3>
<p>Here's where it gets interesting. You can use nearly identical commands to Docker. Let's build and run the same web server you created earlier:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Build the image (same command as Docker)</span>
podman build -t my-web-app .

<span class="hljs-comment"># Run the container</span>
podman run -d -p 8081:80 my-web-app

<span class="hljs-comment"># See running container</span>
podman ps
</code></pre>
<p>Notice that we used port 8081 this time so it doesn't conflict with the Docker container if it's still running. Visit <code>http://localhost:8081</code> and you'll see the same page, but this time it's running in Podman!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770903417925/4717dd8f-bda5-4aaa-ad16-2a24726ee820.png" alt="Localhost running podman container" class="image--center mx-auto" width="856" height="458" loading="lazy"></p>
<p>If you run into an issue when running the <code>podman build</code> command, you can delete the Docker image using <code>docker image rm my-web-app:latest</code>.</p>
<h4 id="heading-whats-different-under-the-hood">What's Different Under the Hood?</h4>
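<p>Even though the commands look the same, what's happening is different. First, no daemon was involved: the <code>podman</code> command directly created and started the container. Second, the container's processes run under a regular user account on the host, not as root.</p>
<p>You can verify this by comparing the user inside the container with the user that owns the process on the host:</p>
<pre><code class="lang-bash">podman top &lt;container-id&gt; user huser
</code></pre>
<p>The <code>user</code> column shows the user inside the container (often <code>root</code>), while the <code>huser</code> column shows the unprivileged host user the process is mapped to. On macOS, the "host" here is the Podman machine VM, so you'll see the VM's user rather than your macOS username.</p>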
<h3 id="heading-podman-pods-a-unique-feature">Podman Pods – A Unique Feature</h3>
<p>Podman has a unique feature that Docker doesn't have: pods. A pod is a group of containers that share networking and storage. This is the same concept Kubernetes uses, which makes Podman excellent for local Kubernetes development.</p>
<h4 id="heading-why-pods-matter">Why Pods Matter:</h4>
<p>In real applications, you often have multiple containers that need to work together. For example, a web application typically needs a database to store data, a cache layer for frequently accessed data, and a logging container that records requests, responses, and other non-sensitive application metadata.</p>
<p>These four containers (web, database, cache, logger) need to communicate with each other. In Docker, you'd create a custom network and connect each container to it. In Podman, you can create a pod that automatically handles this networking.</p>
<h3 id="heading-how-to-create-a-podman-pod">How to Create a Podman Pod</h3>
<pre><code class="lang-bash">podman pod create --name my-app-pod -p 8082:80
</code></pre>
<p>This creates a pod named <code>my-app-pod</code> and exposes port 8082 on your host to port 80 inside the pod. Notice that you don't expose ports on individual containers – you expose them on the pod.</p>
<p>Add a web server to the pod:</p>
<pre><code class="lang-bash">podman run -d --pod my-app-pod --name web nginx:alpine
</code></pre>
<p>The <code>--pod</code> flag tells Podman to run this container inside the pod. The container doesn't need its own port mapping because the pod handles that.</p>
<p>Add Redis (an in-memory database) to the pod:</p>
<pre><code class="lang-bash">podman run -d --pod my-app-pod --name cache redis:alpine
</code></pre>
<p>Now you have two containers running in the same pod. Here's the powerful part: they share the same network namespace.</p>
<p>To check your pod:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># List all pods</span>
podman pod ps -a

<span class="hljs-comment"># Show details for one pod</span>
podman pod inspect &lt;pod-name-or-id&gt;

<span class="hljs-comment"># Check processes running in the pod</span>
podman pod top &lt;pod-name-or-id&gt;

<span class="hljs-comment"># See logs from containers in that pod</span>
podman logs &lt;container-name-or-id&gt;
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770903859712/3cabe09b-d693-4adf-85bf-74115122203a.png" alt="Podman pod inspection showing container running" class="image--center mx-auto" width="1128" height="744" loading="lazy"></p>
<h4 id="heading-understanding-shared-networking">Understanding Shared Networking:</h4>
<p>Both containers can reach each other using <code>localhost</code>. The web container can connect to Redis using <code>localhost:6379</code> (Redis's default port). It's as if they're running on the same machine.</p>
<p>This is exactly how Kubernetes pods work. If you learn Podman pods, you're learning Kubernetes networking too.</p>
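<p>One way to check this for yourself is to start a throwaway container in the same pod and talk to Redis over the loopback address. This is a sketch that assumes the <code>my-app-pod</code> pod and the <code>cache</code> container from above are still running:</p>
<pre><code class="lang-bash"># Join the pod's network namespace and ping Redis on the loopback address
podman run --rm --pod my-app-pod redis:alpine redis-cli -h 127.0.0.1 ping
</code></pre>
<p>If the containers share a network namespace as described, Redis should answer with <code>PONG</code>, even though the throwaway container never published or linked any ports of its own.</p>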
<h3 id="heading-how-to-generate-kubernetes-yaml-from-pods">How to Generate Kubernetes YAML from Pods</h3>
<p>Here's where Podman really shines. You can generate Kubernetes-compatible YAML from your pod:</p>
<pre><code class="lang-bash">podman generate kube my-app-pod &gt; my-app-pod.yaml
</code></pre>
<p>Open <code>my-app-pod.yaml</code> and you'll see proper Kubernetes configuration:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># Save the output of this file and use kubectl create -f to import</span>
<span class="hljs-comment"># it into Kubernetes.</span>
<span class="hljs-comment">#</span>
<span class="hljs-comment"># Created with podman-5.7.1</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Pod</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">annotations:</span>
    <span class="hljs-attr">io.kubernetes.cri-o.SandboxID/cache:</span> <span class="hljs-string">5e56bd9eab1a02a88654e3614312302d0f3f8d3652480498e6d1eef7d4824019</span>
    <span class="hljs-attr">io.kubernetes.cri-o.SandboxID/web:</span> <span class="hljs-string">5e56bd9eab1a02a88654e3614312302d0f3f8d3652480498e6d1eef7d4824019</span>
  <span class="hljs-attr">creationTimestamp:</span> <span class="hljs-string">"2026-02-12T13:44:55Z"</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">my-app-pod</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">my-app-pod</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">containers:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">args:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">nginx</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">-g</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">daemon</span> <span class="hljs-string">off;</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">docker.io/library/nginx:alpine</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">web</span>
    <span class="hljs-attr">ports:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">containerPort:</span> <span class="hljs-number">80</span>
      <span class="hljs-attr">hostPort:</span> <span class="hljs-number">8082</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">args:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">redis-server</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">docker.io/library/redis:alpine</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">cache</span>
</code></pre>
<p>This file can be deployed directly to any Kubernetes cluster:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># using minikube cluster</span>
kubectl apply -f my-app-pod.yaml
</code></pre>
<p>This is incredibly useful for local development. You can prototype your application using Podman pods, generate the YAML, and deploy to Kubernetes without rewriting anything.</p>
<h3 id="heading-how-to-manage-podman-machines">How to Manage Podman Machines</h3>
<p>When working with Podman on macOS or Windows, you're using a Linux VM. Here's how to manage it.</p>
<h4 id="heading-list-all-podman-machines">List all Podman machines:</h4>
<pre><code class="lang-bash">podman machine list
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771074980607/84d2692e-11ce-4943-9187-a6a993d43c1d.png" alt="podman machine list" class="image--center mx-auto" width="1296" height="210" loading="lazy"></p>
<p>This shows all your Podman VMs, their status (running or stopped), and their names. The default machine is usually called <code>podman-machine-default</code>.</p>
<h4 id="heading-check-machine-status-and-info">Check machine status and info:</h4>
<pre><code class="lang-bash">podman machine info
</code></pre>
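<p>This displays information about the machine host, such as the VM provider, the default machine, and how many machines exist. To see the CPU, memory, and disk settings of a specific machine, run <code>podman machine inspect</code>.</p>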
<h4 id="heading-stop-the-podman-machine">Stop the Podman machine:</h4>
<pre><code class="lang-bash">podman machine stop
</code></pre>
<p>If you have multiple machines, specify the name:</p>
<pre><code class="lang-bash">podman machine stop podman-machine-default
</code></pre>
<p>This stops the VM but preserves it. All your images and containers remain intact. When you stop the machine, all running containers inside it are stopped.</p>
<h4 id="heading-start-a-stopped-machine">Start a stopped machine:</h4>
<pre><code class="lang-bash">podman machine start
</code></pre>
<p>Or with a specific name:</p>
<pre><code class="lang-bash">podman machine start podman-machine-default
</code></pre>
<p>This restarts the VM. Your images are still there, but containers remain stopped unless you started them with a restart policy.</p>
<h4 id="heading-delete-a-podman-machine">Delete a Podman machine:</h4>
<pre><code class="lang-bash">podman machine rm podman-machine-default
</code></pre>
<p>This completely destroys the VM and all its contents (images, containers, volumes). Use this when you want to start fresh or free up disk space.</p>
<p>With this basic understanding of how Podman works, we can move on and learn about how to use Containerd.</p>
<h2 id="heading-how-to-work-with-containerd">How to Work with Containerd</h2>
<p>Containerd is the runtime that Docker itself uses under the hood. It's also the default runtime for most Kubernetes installations. When you run Docker, you're actually using containerd without knowing it.</p>
<h3 id="heading-why-use-containerd-directly">Why Use containerd Directly?</h3>
<p>You might wonder why you'd use containerd directly if Docker already uses it. Here are a few reasons:</p>
<ol>
<li><p><strong>Kubernetes:</strong> Most Kubernetes clusters use containerd as their container runtime. Understanding it helps you troubleshoot production issues.</p>
</li>
<li><p><strong>Minimal footprint:</strong> containerd has no UI and a minimal feature set. It uses far less memory than Docker Desktop (roughly 50MB versus about 2GB).</p>
</li>
<li><p><strong>Building tools:</strong> If you're building container orchestration tools, working directly with containerd gives you fine-grained control.</p>
</li>
</ol>
<h3 id="heading-understanding-the-architecture">Understanding the Architecture</h3>
<p>The containerd architecture looks like this:</p>
<pre><code class="lang-bash">Your Command → nerdctl → containerd → runc → Container
</code></pre>
<p>In this chain, nerdctl provides a Docker-like CLI, containerd manages images and container lifecycle, and runc actually creates the container using kernel features.</p>
<h3 id="heading-how-to-install-containerd-with-nerdctl">How to Install containerd with nerdctl</h3>
<p>containerd is designed for systems (like Kubernetes) rather than direct developer use. The installation approach differs by operating system:</p>
<ul>
<li><p><a target="_blank" href="https://lima-vm.io/docs/installation/">Lima for macOS</a> (includes nerdctl)</p>
</li>
<li><p><a target="_blank" href="https://github.com/containerd/containerd/blob/main/docs/getting-started.md">containerd for Linux</a> (native installation)</p>
</li>
<li><p><a target="_blank" href="https://github.com/containerd/nerdctl/releases">nerdctl releases</a> (for all platforms)</p>
</li>
</ul>
<p><strong>For macOS users</strong> (the setup used in this tutorial), we’ll use Lima, which provides a Linux VM with containerd and nerdctl already installed.</p>
<pre><code class="lang-bash">brew install lima
</code></pre>
<p>Lima comes with nerdctl built-in, so you don't need to install it separately.</p>
<p><strong>For Linux users</strong>, you can install containerd directly from your package manager and download nerdctl from the GitHub releases page. Containerd runs natively on Linux without needing a VM.</p>
<h3 id="heading-how-to-start-a-lima-instance">How to Start a Lima Instance</h3>
<pre><code class="lang-bash">limactl start
</code></pre>
<p>This creates a default Linux VM running containerd with nerdctl available. The VM starts with Lima's default resources (typically up to 4 CPUs, 4GiB of RAM, and a 100GiB disk). You can customize these settings if needed.</p>
<p>Lima mounts your home directory inside the VM (read-only by default), so you can access your files. This makes working with Lima feel transparent – you don't need to copy files into the VM.</p>
<p>Verify it's working:</p>
<pre><code class="lang-bash">lima nerdctl run hello-world
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771074008992/aa76dcd5-8eb5-4baf-9d72-47e1f4aa3ae3.png" alt="Containerd Lima instance and Hello-world container" class="image--center mx-auto" width="1686" height="834" loading="lazy"></p>
<h3 id="heading-how-to-run-your-app-with-nerdctl">How to Run Your App with nerdctl</h3>
<p>The commands are nearly identical to Docker. This is intentional – nerdctl aims for Docker compatibility. Since we're running through Lima, we’ll prefix commands with <code>lima</code>.</p>
<p>Navigate to your project directory:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> ~/container-demo
</code></pre>
<p>Build the image:</p>
<pre><code class="lang-bash">lima nerdctl build -t my-web-app .
</code></pre>
<p>Run the container:</p>
<pre><code class="lang-bash">lima nerdctl run -d -p 8083:80 my-web-app
</code></pre>
<p>Visit <code>http://localhost:8083</code> to see your app running on containerd!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771074352767/8ee7339d-8145-494e-9bac-41bfc8f620e1.png" alt="Localhost running containerd container" class="image--center mx-auto" width="1066" height="598" loading="lazy"></p>
<h3 id="heading-whats-different-from-docker">What's Different from Docker?</h3>
<p>Under the hood, a lot is different. Containerd is managing your image and container directly. There's no Docker daemon involved: containerd is itself a daemon, but nerdctl talks to it directly instead of going through dockerd. Images are stored differently (though they're OCI-compliant, so they're compatible).</p>
<p>But from your perspective as a developer, the commands feel the same. This is the power of standards like OCI.</p>
<h4 id="heading-how-to-check-running-containers-1">How to Check Running Containers:</h4>
<pre><code class="lang-bash">lima nerdctl ps
</code></pre>
<p>This shows all running containers.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771074426408/3b5da24c-0dad-4c8e-9ce8-fd8cb319f9f9.png" alt="Running containers" class="image--center mx-auto" width="1878" height="304" loading="lazy"></p>
<h3 id="heading-how-to-manage-lima-vms">How to Manage Lima VMs</h3>
<p>When working with containerd through Lima, you're using a Linux VM. Here's how to manage it.</p>
<h4 id="heading-list-all-lima-vms">List all Lima VMs:</h4>
<pre><code class="lang-bash">limactl list
</code></pre>
<p>This shows all your Lima VMs, their status (running or stopped), and their names. The default VM is usually called <code>default</code>.</p>
<h4 id="heading-check-vm-status-and-info">Check VM status and info:</h4>
<pre><code class="lang-bash">limactl info default
</code></pre>
<p>This displays detailed information about the specified VM including its configuration and resource usage.</p>
<h4 id="heading-stop-the-lima-vm">Stop the Lima VM:</h4>
<pre><code class="lang-bash">limactl stop default
</code></pre>
<p>This stops the VM but preserves it. All your images and containers remain intact. When you stop the VM, all running containers inside it are stopped. The next time you start it, your images will still be there but containers remain stopped.</p>
<h4 id="heading-start-a-stopped-vm">Start a stopped VM:</h4>
<pre><code class="lang-bash">limactl start default
</code></pre>
<p>This restarts the VM. Your images persist across restarts, so you don't need to rebuild them.</p>
<h4 id="heading-delete-a-lima-vm">Delete a Lima VM:</h4>
<pre><code class="lang-bash">limactl delete default
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771074893694/071702bb-8a35-4681-98ec-f1375a52c5d7.png" alt="Containerd VM list and deletion" class="image--center mx-auto" width="1272" height="324" loading="lazy"></p>
<p>This completely destroys the VM and all its contents (images, containers, volumes). Use this when you want to start fresh or free up disk space. You'll need to run <code>limactl start</code> again to create a new VM.</p>
<h4 id="heading-create-a-new-vm-with-custom-settings">Create a new VM with custom settings:</h4>
<pre><code class="lang-bash">limactl start --name my-custom-vm --cpus 4 --memory 8
</code></pre>
<p>This creates a new VM with 4 CPUs and 8GB of memory. You can have multiple Lima VMs for different projects.</p>
<h2 id="heading-how-to-move-containers-between-runtimes">How to Move Containers Between Runtimes</h2>
<p>Thanks to the OCI (Open Container Initiative) standard, you can move container images between different runtimes. This is incredibly powerful – you can build with one tool and deploy with another.</p>
<h3 id="heading-why-standards-matter">Why Standards Matter</h3>
<p>Before OCI, each container runtime used its own image format. Moving images between runtimes was difficult or impossible.</p>
<p>OCI created standards for the Runtime Specification (how to run a container), the Image Specification (how to package a container image), and the Distribution Specification (how to transfer images between systems).</p>
<p>Now all major runtimes follow these standards, making images portable.</p>
<h3 id="heading-method-1-using-container-registries">Method 1 – Using Container Registries</h3>
<p>The easiest way to share images is through a container registry like Docker Hub, GitHub Container Registry, or your own private registry. Any runtime can push and pull from registries.</p>
<p>First, build with Docker:</p>
<pre><code class="lang-bash">docker build -t my-username/my-app:v1 .
</code></pre>
<p>The image name has three parts: <code>my-username</code> (your registry username), <code>my-app</code> (the application name), and <code>v1</code> (a version tag).</p>
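<p>To make the three parts concrete, here is a small Python sketch that splits a reference of this shape. The <code>parse_image_ref</code> helper and the <code>ImageRef</code> type are hypothetical names used only for illustration, and the parsing is deliberately simplified: it doesn't handle registry hostnames, ports, or digests.</p>

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    namespace: str   # registry username or organization
    repository: str  # application name
    tag: str         # version tag


def parse_image_ref(ref: str) -> ImageRef:
    """Split a 'namespace/repository:tag' reference into its parts.

    Simplified on purpose: registry hostnames, ports, and digests
    are not handled.
    """
    name, _, tag = ref.partition(":")
    namespace, _, repository = name.partition("/")
    if not repository:
        raise ValueError(f"expected 'namespace/repository[:tag]', got {ref!r}")
    # Docker-style CLIs fall back to the 'latest' tag when none is given.
    return ImageRef(namespace, repository, tag or "latest")


print(parse_image_ref("my-username/my-app:v1"))
```

<p>For <code>my-username/my-app:v1</code>, this yields the namespace <code>my-username</code>, the repository <code>my-app</code>, and the tag <code>v1</code>.</p>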
<p>Push to Docker Hub:</p>
<pre><code class="lang-bash">docker login
docker push my-username/my-app:v1
</code></pre>
<p>You'll need to create a free Docker Hub account if you don't have one. The <code>docker login</code> command prompts for your credentials.</p>
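<p>Now pull with Podman:</p>
<pre><code class="lang-bash">podman pull docker.io/my-username/my-app:v1
</code></pre>
<p>Podman downloads the image from Docker Hub. The <code>docker.io/</code> prefix names the registry explicitly, because Podman, unlike Docker, doesn't assume Docker Hub for short image names on every system. Even though the image was built with Docker, Podman can use it because both follow OCI standards.</p>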
<p>Or pull with nerdctl:</p>
<pre><code class="lang-bash">lima nerdctl pull my-username/my-app:v1
</code></pre>
<p>Same image, three different runtimes. This is the power of standards.</p>
<h3 id="heading-method-2-export-and-import">Method 2 – Export and Import</h3>
<p>If you don't want to use a public registry (maybe your image contains proprietary code), you can export images as tar files. This is perfect for air-gapped environments or simply moving images between machines.</p>
<p>Export from Docker:</p>
<pre><code class="lang-bash">docker save my-web-app -o my-web-app.tar
</code></pre>
<p>This creates a file called <code>my-web-app.tar</code> containing the image and all its layers. The file might be large (tens or hundreds of megabytes) depending on your image.</p>
<p>Import to Podman:</p>
<pre><code class="lang-bash">podman load -i my-web-app.tar
</code></pre>
<p>Import to nerdctl:</p>
<pre><code class="lang-bash">lima nerdctl load -i my-web-app.tar
</code></pre>
<p>Now you have the same image available in all three runtimes! You can verify:</p>
<pre><code class="lang-bash">docker images
podman images  
lima nerdctl images
</code></pre>
<p>All three commands will show <code>my-web-app</code> in their image lists, though Podman may display it with a registry prefix such as <code>docker.io/library/</code>.</p>
<h4 id="heading-understanding-image-layers">Understanding Image Layers:</h4>
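<p>When you export an image, you're exporting all its layers. Each Dockerfile instruction that changes the filesystem (such as <code>RUN</code>, <code>COPY</code>, and <code>ADD</code>) creates a layer, while instructions like <code>CMD</code> or <code>EXPOSE</code> only add metadata. Identical layers are shared between images, which saves disk space.</p>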
<p>For example, if you have 10 images all based on <code>nginx:alpine</code>, they all share the nginx layers. Only the layers unique to each image take up additional space.</p>
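<p>The saving can be sketched with a few lines of Python. The layer digests and sizes below are made up for illustration, and <code>disk_usage_mb</code> and <code>naive_usage_mb</code> are hypothetical helpers, not part of any container tool. The idea is that a layer is stored once per unique digest, no matter how many images reference it.</p>

```python
# Hypothetical layer digests and sizes in MB; the numbers are made up.
BASE_LAYERS = {"sha256:base-1": 8, "sha256:base-2": 35}


def disk_usage_mb(images: dict) -> int:
    """Total size on disk when each unique layer is stored once."""
    unique_layers = {}
    for layers in images.values():
        unique_layers.update(layers)
    return sum(unique_layers.values())


def naive_usage_mb(images: dict) -> int:
    """Total size if every image kept its own copy of every layer."""
    return sum(sum(layers.values()) for layers in images.values())


# Ten images: each has the shared base layers plus one 5 MB app layer.
images = {
    f"app-{i}": {**BASE_LAYERS, f"sha256:app-{i}": 5}
    for i in range(10)
}

print(naive_usage_mb(images))  # 480
print(disk_usage_mb(images))   # 93
```

<p>With these example numbers, ten images that would take 480MB as separate copies need only 93MB, because the 43MB of base layers is stored once.</p>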
<h2 id="heading-real-world-use-cases">Real-World Use Cases</h2>
<p>Let's look at some real scenarios where choosing the right runtime matters. These examples show how technical decisions have practical impacts.</p>
<h3 id="heading-use-case-1-security-first-development">Use Case 1 – Security-First Development</h3>
<p>If you're working on security-sensitive applications (financial services, healthcare, government), Podman's rootless containers are a huge advantage.</p>
<h4 id="heading-the-security-problem">The Security Problem:</h4>
<p>By default, the Docker daemon runs as root (a rootless mode exists, but it's opt-in). If someone exploits a vulnerability in your container and escapes to the host system, they can end up with root access. This is called a "container escape".</p>
<p>Podman's rootless mode solves this:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># All Podman commands run as your user by default</span>
podman run --rm -it alpine whoami
</code></pre>
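<p>This prints <code>root</code>, but that is root only inside the container. In rootless mode, Podman uses a user namespace to map the container's root user to your own unprivileged user on the host, so the process never holds real root privileges. The command uses <code>--rm</code> to remove the container when it exits (cleanup), <code>-it</code> to make it interactive with a terminal, <code>alpine</code> as a minimal Linux distribution, and <code>whoami</code> as a command that prints the current user's name.</p>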
<p>Even if someone breaks out of the container, they only have your user's permissions. They can't install system-wide malware, access other users' data, modify system configuration, or install kernel modules.</p>
<p>This dramatically reduces the impact of a container escape.</p>
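<p>Rootless mode relies on Linux user namespaces: user IDs inside the container are translated to unprivileged IDs on the host. The Python sketch below mimics that translation using the layout of <code>/proc/PID/uid_map</code>. The <code>map_to_host_uid</code> helper and the example mapping are assumptions for illustration. On a real system, the ranges come from <code>/etc/subuid</code>.</p>

```python
def map_to_host_uid(container_uid: int, uid_map: str) -> int:
    """Translate a UID inside a user namespace to the host UID.

    uid_map uses the /proc/PID/uid_map layout: each line is
    'start_inside start_on_host length'.
    """
    for line in uid_map.strip().splitlines():
        inside, host, length = (int(field) for field in line.split())
        offset = container_uid - inside
        if offset in range(length):
            return host + offset
    raise LookupError(f"UID {container_uid} is not mapped")


# Example mapping for a host user with UID 1000 (values are assumptions).
EXAMPLE_UID_MAP = """
0 1000 1
1 100000 65536
"""

print(map_to_host_uid(0, EXAMPLE_UID_MAP))   # 1000: container root is your user
print(map_to_host_uid(33, EXAMPLE_UID_MAP))  # 100032: an unprivileged sub-UID
```

<p>Under this mapping, a process that is UID 0 inside the container is just UID 1000 on the host, which is why an escape doesn't hand the attacker root.</p>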
<h4 id="heading-example-security-scenario">Example Security Scenario:</h4>
<p>Imagine you're running a web application that processes user uploads. A vulnerability lets an attacker execute code in your container. With Docker running as root, they could escape the container, install a rootkit, steal all data from your server, and persist even after you patch the vulnerability.</p>
<p>With Podman rootless, they might escape the container, but they can only access what your user can access. They can't modify system files, load kernel modules, or read other users' data, which makes it much harder to persist or spread.</p>
<p>The difference is dramatic.</p>
<h3 id="heading-use-case-2-testing-kubernetes-locally">Use Case 2 – Testing Kubernetes Locally</h3>
<p>Podman can generate Kubernetes YAML from running containers. This is perfect for prototyping before you commit to a Kubernetes configuration.</p>
<h4 id="heading-the-development-workflow">The Development Workflow:</h4>
<ol>
<li><p>Run your application locally with Podman</p>
</li>
<li><p>Test and iterate quickly</p>
</li>
<li><p>Generate Kubernetes YAML when it works</p>
</li>
<li><p>Deploy to a real cluster</p>
</li>
</ol>
<p>Here's a practical example. Let's say you're building a web application with a database:</p>
<p>Run your containers:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Create a pod (like a Kubernetes pod)</span>
podman pod create --name myapp -p 8080:80

<span class="hljs-comment"># Add web server</span>
podman run -d --pod myapp --name web nginx:alpine

<span class="hljs-comment"># Add PostgreSQL</span>
podman run -d --pod myapp --name db \
  -e POSTGRES_PASSWORD=secret \
  postgres:alpine
</code></pre>
<p>Test your application at <code>http://localhost:8080</code>. When it works, generate Kubernetes YAML:</p>
<pre><code class="lang-bash">podman generate kube myapp &gt; myapp.yaml
</code></pre>
<p>Now you can deploy <code>myapp.yaml</code> to any Kubernetes cluster:</p>
<pre><code class="lang-bash">kubectl apply -f myapp.yaml
</code></pre>
<p>This is much faster than writing Kubernetes YAML by hand and debugging in a cluster. You iterate locally, then deploy when ready.</p>
<h4 id="heading-why-this-matters">Why This Matters:</h4>
<p>Kubernetes has a steep learning curve. The YAML configuration is verbose and error-prone. By starting with simple Podman commands and generating YAML, you can focus on your application first, learn Kubernetes gradually, catch configuration errors early, and iterate quickly without cloud costs.</p>
<h3 id="heading-use-case-3-resource-constrained-environments">Use Case 3 – Resource-Constrained Environments</h3>
<p>containerd has the smallest footprint. If you're running containers on edge devices, Raspberry Pi, or resource-constrained servers, this matters a lot.</p>
<h4 id="heading-comparing-memory-usage">Comparing Memory Usage:</h4>
<p>Here are rough memory footprints for each setup (actual usage varies with version and workload):</p>
<ul>
<li><p>Docker Desktop uses approximately 2GB RAM (includes the VM, daemon, UI, and Kubernetes).</p>
</li>
<li><p>Podman uses approximately 500MB RAM (includes the VM on macOS).</p>
</li>
<li><p>Containerd uses approximately 50MB RAM (just the runtime, no extras).</p>
</li>
</ul>
<p>On a developer laptop with 16GB RAM, this difference doesn't matter much. But consider these scenarios:</p>
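<p><strong>1. Edge Computing:</strong></p>
<p>You're running containers on edge devices with 1GB RAM total. A 2GB Docker Desktop setup simply won't fit, while containerd leaves most of the memory for your application.</p>
<p><strong>2. IoT Devices:</strong></p>
<p>A Raspberry Pi with 2GB RAM has little headroom to begin with. Docker Desktop isn't designed for devices like this, and even the standalone Docker Engine runs its own daemon on top of containerd. Running containerd directly keeps the runtime's share of memory small.</p>
<p><strong>3. High-Density Servers:</strong></p>
<p>You're running 100 containers per server, so every MB counts. If a leaner runtime saved the full 2GB on each server, a fleet of 100 servers would free up 2GB × 100 = 200GB. Real savings on servers are smaller, because servers run Docker Engine rather than Docker Desktop, but the overhead still adds up across a fleet.</p>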
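<p>To put these scenarios into numbers, the sketch below subtracts each runtime's approximate overhead from a device's total RAM. The figures are the rough ones listed above, not measurements, and <code>headroom_mb</code> is a hypothetical helper.</p>

```python
# Approximate overheads in MB, taken from the rough figures listed above.
RUNTIME_OVERHEAD_MB = {"docker-desktop": 2048, "podman": 500, "containerd": 50}


def headroom_mb(total_ram_mb: int, runtime: str) -> int:
    """RAM left for applications after the runtime's own overhead."""
    return total_ram_mb - RUNTIME_OVERHEAD_MB[runtime]


# A device with 2GB of RAM, such as the Raspberry Pi in the example.
for runtime in RUNTIME_OVERHEAD_MB:
    print(runtime, headroom_mb(2048, runtime))
```

<p>On a 2GB device, these figures leave nothing for your application under Docker Desktop, about 1.5GB under Podman, and nearly the full 2GB under containerd.</p>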
<p><strong>Example Setup for Edge Device:</strong></p>
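<pre><code class="lang-bash"><span class="hljs-comment"># On a Raspberry Pi or similar Debian-based device</span>
sudo apt-get update
sudo apt-get install -y containerd

<span class="hljs-comment"># nerdctl isn't packaged for every distribution. If apt can't find it,</span>
<span class="hljs-comment"># download a release archive from the nerdctl GitHub releases page instead.</span>
sudo apt-get install -y nerdctl

<span class="hljs-comment"># Now you can run containers with minimal overhead</span>
<span class="hljs-comment"># (sudo is needed unless you've set up rootless containerd)</span>
sudo nerdctl run -d my-lightweight-app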
</code></pre>
<p>Your application gets to use most of the available RAM instead of competing with a heavy runtime.</p>
<h2 id="heading-quick-reference-guide">Quick Reference Guide</h2>
<p>Here's a handy comparison of common commands across runtimes:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Task</td><td>Docker</td><td>Podman</td><td>nerdctl (via Lima)</td></tr>
</thead>
<tbody>
<tr>
<td>Build image</td><td><code>docker build -t app .</code></td><td><code>podman build -t app .</code></td><td><code>lima nerdctl build -t app .</code></td></tr>
<tr>
<td>Run container</td><td><code>docker run -d app</code></td><td><code>podman run -d app</code></td><td><code>lima nerdctl run -d app</code></td></tr>
<tr>
<td>List containers</td><td><code>docker ps</code></td><td><code>podman ps</code></td><td><code>lima nerdctl ps</code></td></tr>
<tr>
<td>View logs</td><td><code>docker logs &lt;id&gt;</code></td><td><code>podman logs &lt;id&gt;</code></td><td><code>lima nerdctl logs &lt;id&gt;</code></td></tr>
<tr>
<td>Stop container</td><td><code>docker stop &lt;id&gt;</code></td><td><code>podman stop &lt;id&gt;</code></td><td><code>lima nerdctl stop &lt;id&gt;</code></td></tr>
<tr>
<td>Remove container</td><td><code>docker rm &lt;id&gt;</code></td><td><code>podman rm &lt;id&gt;</code></td><td><code>lima nerdctl rm &lt;id&gt;</code></td></tr>
<tr>
<td>List images</td><td><code>docker images</code></td><td><code>podman images</code></td><td><code>lima nerdctl images</code></td></tr>
<tr>
<td>Pull image</td><td><code>docker pull nginx</code></td><td><code>podman pull nginx</code></td><td><code>lima nerdctl pull nginx</code></td></tr>
<tr>
<td>Push to registry</td><td><code>docker push &lt;user&gt;/app</code></td><td><code>podman push &lt;user&gt;/app</code></td><td><code>lima nerdctl push &lt;user&gt;/app</code></td></tr>
<tr>
<td>Execute in container</td><td><code>docker exec -it &lt;id&gt; sh</code></td><td><code>podman exec -it &lt;id&gt; sh</code></td><td><code>lima nerdctl exec -it &lt;id&gt; sh</code></td></tr>
</tbody>
</table>
</div><h2 id="heading-conclusion">Conclusion</h2>
<p>In this guide, we’ve explored three major container runtimes and learned how to use Docker, Podman, and containerd. The container ecosystem is much bigger than just Docker, and knowing alternatives gives you more options for security, performance, and specialized use cases.</p>
<p>Use Docker when you're learning or need the best documentation. Use Podman when you need rootless security or are building CI/CD pipelines. Use containerd when you need minimal resource usage or are deploying to Kubernetes clusters.</p>
<p>Thanks to OCI standards, your containers are portable. Build with Docker, test with Podman, deploy with containerd – it all works together! You're not locked into one vendor or tool.</p>
<p>As always, I hope you enjoyed this guide and learned something. If you want to stay connected or see more hands-on DevOps content, you can follow me on <a target="_blank" href="https://www.linkedin.com/in/destiny-erhabor">LinkedIn</a> and check out my <a target="_blank" href="https://github.com/Caesarsage/DevOps-Cloud-Projects">DevOps Cloud Projects</a> repository on GitHub.</p>
<p>Happy containerizing!</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Dockerize Your Application and Deploy It ]]>
                </title>
                <description>
                    <![CDATA[ Modern applications rarely live in isolation. They move between laptops, staging servers, and production environments. Each environment has its own quirks, missing libraries, or slightly different configurations. This is where many “works on my machi... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-dockerize-your-application-and-deploy-it/</link>
                <guid isPermaLink="false">69851e61087459735f840552</guid>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ deployment ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Manish Shivanandhan ]]>
                </dc:creator>
                <pubDate>Thu, 05 Feb 2026 22:49:05 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1770331734345/d53fdc31-231b-4194-96e2-efcec036cfb2.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Modern applications rarely live in isolation. They move between laptops, staging servers, and production environments.</p>
<p>Each environment has its own quirks, missing libraries, or slightly different configurations. This is where many “works on my machine” problems begin.</p>
<p><a target="_blank" href="https://www.freecodecamp.org/news/what-is-docker-used-for-a-docker-container-tutorial-for-beginners/">Docker</a> was created to solve this exact issue, and it has become a core skill for anyone building and deploying software today.</p>
<p>In this article, you’ll learn how to Dockerize a <a target="_blank" href="https://www.freecodecamp.org/news/how-to-build-and-deploy-a-loganalyzer-agent-using-langchain/">LogAnalyzer Agent project</a> and prepare it for deployment.</p>
<p>We’ll first understand what Docker is and why it matters. Then we’ll walk through converting this FastAPI-based project into a Dockerized application. Finally, we’ll cover how to build and upload the Docker image so it can be deployed to a cloud platform like Sevalla.</p>
<p>You only need a basic understanding of Python for this project. If you want to learn Docker in detail, go through this <a target="_blank" href="https://www.freecodecamp.org/news/how-docker-containers-work/">detailed tutorial</a>.</p>
<h2 id="heading-what-well-cover">What We’ll Cover</h2>
<ul>
<li><p><a class="post-section-overview" href="#heading-what-is-docker">What is Docker?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-why-docker-matters">Why Docker Matters</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-understanding-the-project">Understanding the Project</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-writing-the-dockerfile">Writing the Dockerfile</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-handling-environment-variables-in-docker">Handling Environment Variables in Docker</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-building-the-docker-image">Building the Docker Image</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-testing-the-container-locally">Testing the Container Locally</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-preparing-the-image-for-deployment">Preparing the Image for Deployment</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-adding-the-docker-image-to-sevalla">Adding the Docker Image to Sevalla</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-final-thoughts">Final Thoughts</a></p>
</li>
</ul>
<h2 id="heading-what-is-docker">What is Docker?</h2>
<p><a target="_blank" href="https://www.docker.com/">Docker</a> is a tool that packages your application together with everything it needs to run. This includes the operating system libraries, system dependencies, Python version, and Python packages. The result is called a Docker image. When this image runs, it becomes a container.</p>
<p>A container behaves the same way everywhere. If it runs on your laptop, it will run the same way on a cloud server. This consistency is the main reason Docker is so widely used.</p>
<p>For the LogAnalyzer Agent, this means that FastAPI, LangChain, and all Python dependencies will always be available, regardless of where the app is deployed.</p>
<h2 id="heading-why-docker-matters">Why Docker Matters</h2>
<p>Without Docker, deployment usually involves manually installing dependencies on a server. This process is slow and error-prone. A missing system package or a wrong Python version can break the app.</p>
<p>Docker removes this uncertainty. You define the environment once, using a Dockerfile, and reuse it everywhere. This makes onboarding new developers easier, simplifies CI pipelines, and reduces production bugs.</p>
<p>For AI-powered services like the LogAnalyzer Agent, Docker is even more important. These services often rely on specific library versions and environment variables, such as API keys. Docker ensures that these details are controlled and repeatable.</p>
<h2 id="heading-understanding-the-project">Understanding the Project</h2>
<p>Before containerizing the application, it’s important to understand its structure. The LogAnalyzer Agent consists of a FastAPI backend that serves an HTML frontend and exposes an API endpoint for log analysis.</p>
<p>The backend depends on Python packages like FastAPI, LangChain, and the OpenAI client. It also relies on an environment variable for the OpenAI API key.</p>
<p>From Docker’s point of view, this is a typical Python web service. That makes it an ideal candidate for containerization.</p>
<p>At this stage, you should clone the <a target="_blank" href="https://github.com/manishmshiva/loganalyzer">project repository</a> to your local machine. You can run the app locally with the command <code>python app.py</code>.</p>
<h2 id="heading-writing-the-dockerfile">Writing the Dockerfile</h2>
<p>The <a target="_blank" href="https://docs.docker.com/reference/dockerfile/">Dockerfile</a> is the recipe that tells Docker how to build your image. It starts with a base image, installs dependencies, copies your code, and defines how the application should start.</p>
<p>For this project, a lightweight Python image is a good choice. The Dockerfile might look like this:</p>
<pre><code class="lang-dockerfile">FROM python:<span class="hljs-number">3.11</span>-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE <span class="hljs-number">8000</span>
CMD [<span class="hljs-string">"uvicorn"</span>, <span class="hljs-string">"main:app"</span>, <span class="hljs-string">"--host"</span>, <span class="hljs-string">"0.0.0.0"</span>, <span class="hljs-string">"--port"</span>, <span class="hljs-string">"8000"</span>]
</code></pre>
<p>Each line has a purpose: the base image provides Python and the working directory keeps files organized.</p>
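<p>Dependencies are installed before copying the full code to improve build caching. The <code>EXPOSE</code> instruction documents the port used by the app. The <code>CMD</code> instruction starts the FastAPI server with Uvicorn. Here, <code>main:app</code> means the <code>app</code> object in <code>main.py</code>, so adjust it if your entry point lives in another module (for example, <code>app:app</code> for <code>app.py</code>).</p>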
<p>This file alone turns your project into something Docker understands.</p>
<h2 id="heading-handling-environment-variables-in-docker">Handling Environment Variables in Docker</h2>
<p>The LogAnalyzer Agent relies on an OpenAI API key. This key should never be hardcoded into the image. Instead, Docker allows environment variables to be passed at runtime.</p>
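<p>During local testing, you can still use a <code>.env</code> file, but add it to <code>.dockerignore</code> so that <code>COPY . .</code> doesn't bake it into the image. When running the container, you can pass the variable using Docker’s <code>-e</code> or <code>--env-file</code> flags, or your deployment platform’s settings.</p>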
<p>This separation keeps secrets secure and allows the same image to be used in multiple environments.</p>
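<p>On the application side, reading the key at startup might look like the sketch below. The <code>load_api_key</code> helper is a hypothetical name, not code from the LogAnalyzer project. It only assumes that the key arrives in the <code>OPENAI_API_KEY</code> environment variable, and it fails early with a clear message when the variable is missing.</p>

```python
import os


def load_api_key(env=os.environ) -> str:
    """Return the OpenAI API key from the environment, or fail loudly."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; pass it with 'docker run -e' "
            "or your platform's settings"
        )
    return key


# Demonstration with an explicit mapping instead of the real environment.
print(load_api_key({"OPENAI_API_KEY": "sk-example"}))
```

<p>In the real app, you would call <code>load_api_key()</code> with no arguments so it reads from the container's environment.</p>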
<h2 id="heading-building-the-docker-image">Building the Docker Image</h2>
<p>Once the Dockerfile is ready, building the image is straightforward. From the root of the project, you run a Docker build command:</p>
<pre><code class="lang-bash">docker build -t loganalyzer:latest .
</code></pre>
<p>Docker reads the Dockerfile, executes each step, and produces an image.</p>
<p>This image contains your FastAPI app, the HTML UI, and all dependencies. At this point, you can run it locally to verify that everything works exactly as before.</p>
<p>Running the container locally is an important validation step. If the app works inside Docker on your machine, it’s very likely to work in production as well.</p>
<h2 id="heading-testing-the-container-locally">Testing the Container Locally</h2>
<p>After building the image, you can start a container and map its port to your local machine. When the container starts, Uvicorn runs inside it, just like it did outside Docker.</p>
<pre><code class="lang-bash">docker run -d -p <span class="hljs-number">8000</span>:<span class="hljs-number">8000</span> -e OPENAI_API_KEY=your_api_key_here loganalyzer:latest
</code></pre>
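<p>You should be able to open <code>http://localhost:8000</code> in a browser, upload a log file, and receive analysis results. If something fails, the container logs (available through <code>docker logs</code> and the container ID) will usually point you to missing files or incorrect paths.</p>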
<p>This feedback loop is fast and helps you fix issues before deployment.</p>
<h2 id="heading-preparing-the-image-for-deployment">Preparing the Image for Deployment</h2>
<p>At this stage, the Docker image is ready to be uploaded to a container registry. A registry is a place where Docker images are stored and shared. Your deployment platform will later pull the image from this registry.</p>
<p>We’ll use <a target="_blank" href="https://hub.docker.com/">Docker Hub</a> to host our image. Create an account, then run the <code>docker login</code> command to authenticate your terminal.</p>
<p>Now let’s tag and push your image to the repository:</p>
<pre><code class="lang-bash">docker tag loganalyzer:latest your-dockerhub-username/loganalyzer:latest
docker push your-dockerhub-username/loganalyzer:latest
</code></pre>
<h2 id="heading-adding-the-docker-image-to-sevalla">Adding the Docker Image to Sevalla</h2>
<p>The final step is to deploy the Docker image you just pushed.</p>
<p>You can choose any cloud provider, like AWS, DigitalOcean, or others, to run your application. I’ll be using Sevalla for this example.</p>
<p><a target="_blank" href="https://sevalla.com/">Sevalla</a> is a developer-friendly PaaS provider. It offers application hosting, database, object storage, and static site hosting for your projects.</p>
<p>Every platform will charge you for creating a cloud resource. Sevalla comes with a $20 credit for us to use, so we won’t incur any costs for this example.</p>
<p><a target="_blank" href="https://app.sevalla.com/login">Log in</a> to Sevalla and click on Applications -&gt; Create new application:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770296138206/6b3399a2-ed25-498b-80d0-e1ec05fabc35.png" alt="Sevalla Home Page" class="image--center mx-auto" width="1000" height="434" loading="lazy"></p>
<p>You can see the option to link your <a target="_blank" href="https://hub.docker.com/r/manishmshiva/loganalyzer">container repository</a>. Use the default settings. Click “Create application”.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770296158854/680b220a-2afc-4521-86c5-0436b8a6c408.png" alt="Create New Application" class="image--center mx-auto" width="1000" height="634" loading="lazy"></p>
<p>Now we have to add our OpenAI API key to the environment variables. Click on the “Environment variables” section once the application is created, and save the <code>OPENAI_API_KEY</code> value as an environment variable.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770296194532/b0a77d83-bfea-4774-9b42-6cf1bc7d63dd.png" alt="Add environment variables" class="image--center mx-auto" width="1000" height="428" loading="lazy"></p>
<p>We’re now ready to deploy our application. Click on “Deployments” and click “Deploy now”. It will take 2–3 minutes for the deployment to complete.</p>
<p>Once done, click on “Visit app”. You will see the application served via a URL ending with <code>sevalla.app</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770296212110/f9b75458-cb08-461c-9a2d-e1f785be8161.png" alt="Live application" class="image--center mx-auto" width="1000" height="478" loading="lazy"></p>
<p>Congrats! Your log analyzer service is now Dockerized and live.</p>
<p>From this point on, deployment becomes simple. A new version of the app is just a new Docker image. You can push an image to the repository and Sevalla will pull it automatically.</p>
<h2 id="heading-final-thoughts">Final Thoughts</h2>
<p>Docker turns your application into a portable, predictable unit. For the LogAnalyzer Agent, this means the AI logic, the FastAPI server, and the frontend all move together as one artifact.</p>
<p>By cloning the project, adding a Dockerfile, and building an image, you convert a local prototype into a deployable service. Uploading that image to Sevalla completes the journey from code to production.</p>
<p>Once you’re comfortable with this workflow, you’ll find that Docker isn’t just a deployment tool. It becomes a core part of how you design, test, and ship applications with confidence.</p>
<p><em>Hope you enjoyed this article. Learn more about me by</em> <a target="_blank" href="https://manishshivanandhan.com/"><strong><em>visiting my website</em></strong></a><em>.</em></p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Containerize and Deploy Your Node.js Applications ]]>
                </title>
                <description>
                    <![CDATA[ When you build a Node.js application, running it locally is simple. You type npm start, and it works. But when you need to run it on the cloud, things get complicated. You need to think about servers, environments, dependencies, and deployment pipeli... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-containerize-and-deploy-your-nodejs-applications/</link>
                <guid isPermaLink="false">68e840ed25ca8a99242df116</guid>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Node.js ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Manish Shivanandhan ]]>
                </dc:creator>
                <pubDate>Thu, 09 Oct 2025 23:10:37 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1760051426715/fd0f14cf-95dc-4191-b0fc-e5c916520097.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>When you build a Node.js application, running it locally is simple. You type <code>npm start</code>, and it works.</p>
<p>But when you need to run it on the cloud, things get complicated. You need to think about servers, environments, dependencies, and deployment pipelines. That’s where containerization comes in.</p>
<p>Containers make your application portable and predictable. You can run the same code with the same setup anywhere, from your laptop to the cloud.</p>
<p>In this guide, we will walk through how to containerize a simple Node.js API and deploy it to the cloud. By the end, you will know how to set up Docker for your app, push it to a registry, and see your application running on the cloud.</p>
<h2 id="heading-table-of-contents"><strong>Table of Contents</strong></h2>
<ul>
<li><p><a class="post-section-overview" href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-is-containerization">What is Containerization?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-setting-up-a-nodejs-app">Setting Up a Node.js App</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-writing-the-dockerfile">Writing the Dockerfile</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-building-and-testing-the-container">Building and Testing the Container</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-preparing-for-deployment">Preparing for Deployment</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-deploying-to-the-cloud">Deploying to the Cloud</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-scaling-your-app">Scaling Your App</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-updating-your-app">Updating Your App</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-benefits-of-sing-containers">Benefits of Using Containers</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>Before we dive into containerizing and deploying your Node.js application, make sure you have the following set up on your system. These basics will help you follow along without running into errors.</p>
<p><strong>Node.js and npm</strong><br>You should have <a target="_blank" href="https://nodejs.org/en">Node.js</a> (v18 or higher) and npm installed on your local machine. This ensures you can run your app locally before containerizing it.<br>To check your versions, run:</p>
<pre><code class="lang-bash">node -v
npm -v
</code></pre>
<p><strong>Docker installed and running</strong><br><a target="_blank" href="https://www.docker.com/">Docker</a> is the core tool we’ll use to containerize the app. Install Docker Desktop or Docker Engine depending on your system. Once installed, confirm that the CLI is available by typing the command below. To check that the Docker daemon itself is running, you can also run <code>docker info</code>.</p>
<pre><code class="lang-bash">docker --version
</code></pre>
<p><strong>Docker Hub account (or any container registry)</strong><br>You’ll need a Docker Hub account to push your container image to the cloud. This allows your deployment platform to pull and run the image. You can create one for free at <a target="_blank" href="https://hub.docker.com/">hub.docker.com</a>.</p>
<p>Once you have these prerequisites ready, you’ll be set to build your first containerized Node.js app and deploy it to the cloud.</p>
<h2 id="heading-what-is-containerization"><strong>What is Containerization?</strong></h2>
<p>Containerization is a way to package an application along with everything it needs to run. That includes the code, libraries, system tools, and settings. The package is called a container image.</p>
<p>When you run that image, you get a container that behaves exactly the same on any system that supports <a target="_blank" href="https://www.freecodecamp.org/news/the-docker-handbook/">Docker</a>.</p>
<p>Without containers, deployment can be messy. Your app might work on your machine but fail in production due to missing libraries or version mismatches.</p>
<p>Containers solve this by locking in the environment. They behave a bit like lightweight virtual machines, but they share the host’s kernel and contain only what your app needs.</p>
<h2 id="heading-setting-up-a-nodejs-app"><strong>Setting Up a Node.js App</strong></h2>
<p>Let’s start by building a simple Node.js API. We will keep it minimal so we can focus on the containerization and deployment steps.</p>
<p>Create a new folder and add a file called <code>server.js</code>:</p>
<pre><code class="lang-plaintext">const express = require('express');
const app = express();
const PORT = process.env.PORT || 3000;

app.get('/', (req, res) =&gt; {
  res.json({ message: 'Hello from Container!' });
});
app.listen(PORT, () =&gt; {
  console.log(`Server running on port ${PORT}`);
});
</code></pre>
<p>Next, create a <code>package.json</code> file with the following content:</p>
<pre><code class="lang-plaintext">{
  "name": "container-node-app",
  "version": "1.0.0",
  "main": "server.js",
  "scripts": {
    "start": "node server.js"
  },
  "dependencies": {
    "express": "^5.1.0"
  }
}
</code></pre>
<p>Run <code>npm install</code> to install the Express dependency. You now have a simple Node.js API that runs locally. You can test it with <code>npm start</code> and open <code>http://localhost:3000</code> in your browser.</p>
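<p>The server reads its port from <code>process.env.PORT</code>, which lets <code>docker run -e</code> or a hosting platform choose the port at runtime. Here is a minimal sketch of validating that value before listening. The <code>resolvePort</code> helper is a hypothetical name used for illustration, not part of Express or Node.js:</p>
<pre><code class="lang-javascript">// Sketch: pick the listening port from an environment object, with validation.
// resolvePort is a hypothetical helper, not an Express or Node.js API.
function resolvePort(env, fallback) {
  const raw = env.PORT;
  if (raw === undefined || raw === '') {
    return fallback; // Nothing set, so use the default.
  }
  const port = Number(raw);
  // Valid TCP ports are integers from 1 to 65535.
  const clamped = Math.min(Math.max(port, 1), 65535);
  if (!Number.isInteger(port) || clamped !== port) {
    throw new Error(`Invalid PORT value: ${raw}`);
  }
  return port;
}

// Demo with literal objects; in server.js you would pass process.env instead.
console.log(resolvePort({}, 3000));
console.log(resolvePort({ PORT: '8080' }, 3000));
</code></pre>
<p>Failing fast on a bad value makes a misconfigured container stop at startup with a clear message instead of listening on an unexpected port.</p>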
<h2 id="heading-writing-the-dockerfile"><strong>Writing the Dockerfile</strong></h2>
<p>To run this app in a container, we need to write a <code>Dockerfile</code>. This file defines how to build the container image. Create a new file called <code>Dockerfile</code> and add this:</p>
<pre><code class="lang-plaintext">FROM node:24

WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["npm", "start"]
</code></pre>
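<p>Let’s break this down. We start with the official Node.js 24 image and set a working directory inside the container. We copy the package files and install dependencies before copying the rest of the code, so Docker can reuse the cached dependency layer when only your source files change.</p>
<p>Then we copy the rest of the code. The <code>EXPOSE 3000</code> instruction documents the port the app listens on. The port is only published to your machine when you pass <code>-p</code> to <code>docker run</code>. Finally, we run <code>npm start</code> as the default command.</p>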
<h2 id="heading-building-and-testing-the-container"><strong>Building and Testing the Container</strong></h2>
<p>Now that we have the <code>Dockerfile</code>, we can build the image. Run the following command:</p>
<pre><code class="lang-plaintext">docker build -t container-node-app .
</code></pre>
<p>This builds an image named <code>container-node-app</code>. To test it locally, run:</p>
<pre><code class="lang-plaintext">docker run -p 3000:3000 container-node-app
</code></pre>
<p>Open <code>http://localhost:3000</code> in your browser, and you should see the JSON message <code>{"message":"Hello from Container!"}</code>. At this point, we know our app works in a container.</p>
<h2 id="heading-preparing-for-deployment"><strong>Preparing for Deployment</strong></h2>
<p>To deploy on any cloud platform, you need to push your image to a container registry. A registry is a place where container images are stored and shared. Your cloud provider can pull images from <a target="_blank" href="https://hub.docker.com/">Docker Hub</a> or other registries.</p>
<p>Tag your image with a registry path. For Docker Hub, it looks like this:</p>
<pre><code class="lang-plaintext">docker tag container-node-app your-dockerhub-username/container-node-app:latest
</code></pre>
<p>Then log in and push it:</p>
<pre><code class="lang-plaintext">docker login
docker push your-dockerhub-username/container-node-app:latest
</code></pre>
<p>Your image should now be available in the cloud registry and ready for deployment.</p>
<p>Here’s mine:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1759747825354/e217d7f1-6131-41a2-a8b1-76e8ad84399a.webp" alt="Docker Registry" class="image--center mx-auto" width="1100" height="501" loading="lazy"></p>
<h2 id="heading-deploying-to-the-cloud"><strong>Deploying to the Cloud</strong></h2>
<p>In this tutorial, I’ll be using Sevalla since it offers a free tier, so there are no costs involved to deploy this container to the cloud. You can use other providers like <a target="_blank" href="https://aws.amazon.com/">AWS</a> or <a target="_blank" href="https://www.heroku.com/">Heroku</a>, but just note that you will incur costs for creating resources.</p>
<p><a target="_blank" href="https://sevalla.com/">Sevalla</a> is a modern, usage-based Platform-as-a-service provider. It offers application hosting, database, object storage, and static site hosting for your projects.</p>
<p>Once you have your account set up, you can create a new application and tell it which container image to use. Sevalla will pull the image from the registry, create a container, and handle the networking, scaling, and updates for you.</p>
<p>To get started, <a target="_blank" href="https://app.sevalla.com/login">log in</a> to Sevalla. In the dashboard, choose to create a new application. Give it a name like <code>node-api</code>. Provide the registry path of your image.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1759747861994/4ad344d6-d8a5-4593-a85e-eb679bc600f5.webp" alt="Create application" class="image--center mx-auto" width="1100" height="663" loading="lazy"></p>
<p>Choose a location and use the “Hobby” plan. Sevalla comes with a $50 free credit, so you won’t be charged for deploying this image.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1759747920267/cf23401d-131e-4c51-a248-411d8624542c.webp" alt="Application Resources" class="image--center mx-auto" width="1100" height="677" loading="lazy"></p>
<p>Click “Create and Deploy”. Sevalla will handle the rest. You can watch it configure the application and run the deployment.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1759747953591/79db7997-88a3-48f7-ae09-65703ec2abab.webp" alt="Sevalla Deployment" class="image--center mx-auto" width="1100" height="472" loading="lazy"></p>
<p>Once the deployment is complete, click on “Visit app” to get your app’s live URL. You can see the response from the API.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1759747987239/b3a1de3a-3f3a-48d6-86e1-27137f6b41fd.webp" alt="Sevalla deployment success" class="image--center mx-auto" width="822" height="196" loading="lazy"></p>
<h2 id="heading-scaling-your-app"><strong>Scaling Your App</strong></h2>
<p>One of the main benefits of Sevalla is easy scaling. If you start getting more traffic, you can increase the number of containers running your app with just a few clicks. Sevalla will load balance traffic between them. This means your app can handle more requests without downtime.</p>
<p>Scaling with containers is efficient because each container runs the exact same code. There is no need to configure extra servers manually. Sevalla takes care of orchestration, so your focus stays on writing code instead of managing infrastructure.</p>
<h2 id="heading-updating-your-app"><strong>Updating Your App</strong></h2>
<p>When you make changes to your Node.js app, updating is straightforward. You rebuild the Docker image, push it to the registry, and tell Sevalla to redeploy.</p>
<p>Since container images are immutable, every new build produces a fresh image, and every redeploy starts new containers from it. This ensures your updates are clean, consistent, and free of old dependencies.</p>
<p>For example, if you change the message in <code>server.js</code> and want to deploy it, you would run:</p>
<pre><code class="lang-plaintext">docker build -t your-dockerhub-username/container-node-app:latest .
docker push your-dockerhub-username/container-node-app:latest
</code></pre>
<p>Then trigger a redeploy in the Sevalla dashboard. Within minutes, your users will see the updated response.</p>
<h2 id="heading-benefits-of-sing-containers"><strong>Benefits of Using Containers</strong></h2>
<p><a target="_blank" href="https://techcrunch.com/2016/10/16/wtf-is-a-container/">Containers</a> bring many advantages when deploying Node.js applications. They make your app portable because the container holds both the code and its dependencies, ensuring it runs the same way everywhere.</p>
<p>They improve consistency, since every build creates an isolated environment without leftover files or mismatched versions. Scaling becomes simple because you can spin up more containers as traffic grows, and each one behaves identically. Updates are cleaner too, as you replace old containers with fresh ones built from the latest code.</p>
<p>For developers, this means fewer surprises and less time fixing environment issues. Containers provide a reliable foundation, so you can focus on building features rather than troubleshooting deployments.</p>
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Containerization is one of the most important shifts in modern software development. By learning how to put your Node.js app into a Docker container, you unlock the ability to run it anywhere.</p>
<p>In this guide, we built a small Node.js API, created a Dockerfile, tested the container locally, pushed it to a registry, and deployed it to the cloud. The steps you followed here apply to much larger and more complex applications as well. Once you get the basics, you can scale up your workflows to production-level projects.</p>
<p>Hope you enjoyed this article. Connect with me <a target="_blank" href="https://www.linkedin.com/in/manishmshiva/?originalSubdomain=in">on LinkedIn</a> or <a target="_blank" href="https://manishshivanandhan.com/">visit my website</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Debug Kubernetes Pods with Traceloop: A Complete Beginner's Guide ]]>
                </title>
                <description>
                    <![CDATA[ Debugging Kubernetes pods can feel like detective work. Your app crashes, and you're left wondering what happened in those critical moments leading up to failure. Traditional kubectl commands show you logs and statuses, but they can't tell you exactl... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-debug-kubernetes-pods-with-traceloop-a-complete-beginners-guide/</link>
                <guid isPermaLink="false">68b1d0b4c2405fa2535ed0c8</guid>
                
                    <category>
                        <![CDATA[ Traceloop ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Devops ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Kubernetes ]]>
                    </category>
                
                    <category>
                        <![CDATA[ debugging ]]>
                    </category>
                
                    <category>
                        <![CDATA[ inspektor gadget ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ observability ]]>
                    </category>
                
                    <category>
                        <![CDATA[ SRE ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Opaluwa Emidowojo ]]>
                </dc:creator>
                <pubDate>Fri, 29 Aug 2025 16:09:24 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1756483063551/4179b718-7883-4a89-a9c2-1c678185469a.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Debugging Kubernetes pods can feel like detective work. Your app crashes, and you're left wondering what happened in those critical moments leading up to failure. Traditional <code>kubectl</code> commands show you logs and statuses, but they can't tell you exactly what your application was doing at the system level when things went wrong.</p>
<p>What if you had a flight recorder for your applications, something that captures every system call in real-time, so you can "rewind" and see the exact sequence of events that led to a crash? That's what Traceloop does. It continuously traces system calls in your pods, giving you a detailed replay of what happened before, during, and after issues occur.</p>
<p>In this guide, you’ll learn how to use Traceloop's system call tracing to debug pod issues that would otherwise be nearly impossible to diagnose.</p>
<h2 id="heading-prerequisites"><strong>Prerequisites</strong></h2>
<p>Before we begin, here are some prerequisites – things you’ll need to know and have:</p>
<ul>
<li><p><strong>Basic Kubernetes concepts</strong>: Understanding of pods, deployments, services, and namespaces</p>
</li>
<li><p><strong>kubectl fundamentals</strong>: Comfortable with commands like <code>kubectl get</code>, <code>kubectl describe</code>, <code>kubectl logs</code>, and <code>kubectl exec</code></p>
</li>
<li><p><strong>Container basics</strong>: Understanding how containerized applications work</p>
</li>
<li><p><strong>Basic Linux concepts</strong>: Understanding of processes and system calls (helpful, but we'll explain as we go)</p>
</li>
</ul>
<p><strong>Technical Requirements</strong></p>
<ul>
<li><p><strong>Kubernetes cluster access</strong>: Local (minikube, kind, Docker Desktop) or cloud-based cluster</p>
</li>
<li><p><code>kubectl</code> installed and configured to connect to your cluster</p>
</li>
<li><p>Sufficient permissions (cluster admin or equivalent RBAC) to:</p>
<ul>
<li><p>Install and run eBPF-based tools (Traceloop uses eBPF)</p>
</li>
<li><p>Create/modify pods and deployments</p>
</li>
<li><p>Access pod logs and system-level data</p>
</li>
</ul>
</li>
<li><p><strong>Linux-based Kubernetes nodes</strong>: Most clusters already run on Linux.</p>
</li>
</ul>
<p><strong>System Requirements</strong></p>
<ul>
<li><p><strong>Extended Berkeley Packet Filter (eBPF) support</strong>: Used for tracing and monitoring at the kernel level. Kernel version 5.10+ recommended.</p>
</li>
<li><p><strong>Sufficient cluster resources</strong>: Traceloop runs alongside your applications</p>
</li>
</ul>
<h3 id="heading-table-of-contents">Table of Contents</h3>
<ol>
<li><p><a class="post-section-overview" href="#heading-what-is-traceloop">What is Traceloop?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-traceloop-works">How Traceloop Works</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-set-up-traceloop">How to Set Up Traceloop</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-your-first-trace-hands-on-tutorial">Your First Trace: Hands-On Tutorial</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-by-step-debugging-walkthrough">Step-by-Step Debugging Walkthrough</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-real-world-debugging-scenarios">Real-World Debugging Scenarios</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-best-practices">Best Practices</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ol>
<h2 id="heading-what-is-traceloop">What is Traceloop?</h2>
<p><a target="_blank" href="https://inspektor-gadget.io/docs/main/gadgets/traceloop/">Traceloop</a> is a system call tracing and observability tool that works across containerized environments, from Docker containers running locally to pods in production Kubernetes clusters. But before we discuss what that means, let's talk about why system calls matter for debugging.</p>
<p>Every time your application does anything (like opening a file, making a network request, allocating memory, or crashing), it has to interact with the operating system through system calls. These are the fundamental building blocks of how any program interacts with the world around it.</p>
<p>Here's where traditional debugging falls short: when your container crashes, the logs might tell you "segmentation fault" or "out of memory," but they don't tell you the sequence of events that led there. Did the application try to access a file that didn't exist? Was it making network calls that failed? Did it run out of file descriptors?</p>
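<p>To make the idea concrete, here is a small Node.js sketch of one such interaction: trying to open a file that does not exist. The path is a made-up example and <code>describeFailedOpen</code> is a hypothetical helper. Node.js reports which call failed and the operating system's error code on the thrown error:</p>
<pre><code class="lang-javascript">const fs = require('fs');

// Try to open a file for reading and report what the OS said if it fails.
// describeFailedOpen is a hypothetical helper for illustration only.
function describeFailedOpen(path) {
  try {
    const fd = fs.openSync(path, 'r');
    fs.closeSync(fd);
    return null; // The file existed, so there is nothing to report.
  } catch (err) {
    // Node.js attaches the failing call name and the OS error code.
    return { syscall: err.syscall, code: err.code, path: err.path };
  }
}

// The path below is an assumption: any path that does not exist will do.
console.log(describeFailedOpen('/no/such/dir/config.yaml'));
</code></pre>
<p>An application that swallows this error would log nothing, yet the failed call still happened at the kernel boundary. A system call tracer records it there, although the kernel-level name it shows may differ from the one Node.js reports (for example <code>openat</code> rather than <code>open</code>).</p>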
<p>Traceloop captures this missing piece. It sits at the kernel level using eBPF technology, recording every system call your application makes in real-time. Think of it as installing a dashcam in your application. It's always recording with minimal overhead, and when something goes wrong, you have the footage.</p>
<p><code>strace</code> is another popular debugging tool, but you have to attach it while the problem is happening, which means you need to know there's a problem first. Traceloop, by contrast, can run continuously in the background with minimal overhead. If your container crashes at 3am, you can immediately "rewind the tape" and see exactly what system calls happened leading up to the crash.</p>
<p>This helps debug intermittent issues that happen randomly in production but never when you are watching. Because Traceloop is always recording, you finally have visibility into what your application was doing when these mysterious failures occur.</p>
<h2 id="heading-how-traceloop-works">How Traceloop Works</h2>
<p>Now that you understand what Traceloop does, let's look under the hood at how it captures and processes system calls in your containerized environments.</p>
<h3 id="heading-the-technical-foundation">The Technical Foundation</h3>
<p>Traceloop is built on eBPF, a technology that allows programs to run safely in the Linux kernel without changing kernel code. Think of eBPF as a way to install "hooks" directly into the kernel that can observe everything happening on your system with minimal performance impact.</p>
<p>Unlike traditional monitoring tools that work from userspace, eBPF programs run in kernel space, giving them access to system calls as they happen, without relying on the application logging appropriate error messages. This is why Traceloop can capture events that never make it to application logs, like failed system calls or crashes that happen before the application can write anything.</p>
<h3 id="heading-the-flight-recorder-architecture">The Flight Recorder Architecture</h3>
<p>Traceloop uses eBPF maps as an overwriteable ring buffer. Imagine a tape recorder that continuously records over itself. It's always capturing system calls, but it only keeps the most recent data in memory. When something goes wrong, the recording automatically preserves what happened leading up to the incident, just like an airplane's flight recorder after a crash.</p>
<p>This approach solves the production debugging problem: you don't need to predict when issues will happen or attach debuggers after the fact. The recording is always running, waiting for you to need it.</p>
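<p>The overwriting behavior can be sketched in a few lines of JavaScript. This is only a conceptual model with made-up names (<code>RingBuffer</code>, <code>push</code>, <code>snapshot</code>). Traceloop keeps its buffer in eBPF maps inside the kernel, not in application code like this:</p>
<pre><code class="lang-javascript">// Conceptual sketch of an overwriting ring buffer (flight recorder).
// Not Traceloop's implementation: names and structure are illustrative.
class RingBuffer {
  constructor(capacity) {
    this.capacity = capacity;
    this.slots = new Array(capacity);
    this.count = 0; // Total number of events ever written.
  }

  // Write an event, overwriting the oldest one once the buffer is full.
  push(event) {
    this.slots[this.count % this.capacity] = event;
    this.count += 1;
  }

  // Return the retained events from oldest to newest.
  snapshot() {
    const kept = Math.min(this.count, this.capacity);
    const start = this.count - kept;
    const out = [];
    for (let i = 0; i !== kept; i += 1) {
      out.push(this.slots[(start + i) % this.capacity]);
    }
    return out;
  }
}

// Record five events into a buffer that only holds three.
const recorder = new RingBuffer(3);
['openat', 'read', 'connect', 'write', 'exit_group'].forEach(function (name) {
  recorder.push(name);
});
console.log(recorder.snapshot()); // [ 'connect', 'write', 'exit_group' ]
</code></pre>
<p>Memory use stays fixed no matter how long the recorder runs, and the most recent events are always available. The trade-off is that anything older than the buffer's capacity is lost.</p>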
<h3 id="heading-system-call-capture-flow">System Call Capture Flow</h3>
<p>Here's how Traceloop captures and processes system calls across your Kubernetes environment:</p>
<ol>
<li><p><strong>Application pods</strong> generate system calls through normal operation – opening files, making network connections, allocating memory.</p>
</li>
<li><p><strong>eBPF probes (also called hooks)</strong> intercept these system calls at the kernel level before they're processed.</p>
</li>
<li><p><strong>Traceloop recorder</strong> captures the events, buffers them, and adds container context using Inspektor Gadget enrichment (pod name, namespace, container ID).</p>
</li>
<li><p><strong>Output stream</strong> formats the data and makes it available for analysis in real-time or after an incident.</p>
</li>
<li><p><strong>Traceloop user</strong> views and analyzes the captured trace to diagnose the root cause of issues.</p>
</li>
</ol>
<p>Below is a visual representation of the flow. The key advantage is that Traceloop sees everything your application does, even actions that fail silently or happen too quickly for traditional logging to catch. This gives you complete visibility into your application's interaction with the operating system.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755043403339/c5047de7-afc4-48aa-a28e-ee3a1dfbe47f.jpeg" alt="Flow diagram showing how Traceloop works. Application Pods generate system calls, which undergo kernel-level interception via eBPF probes. The probes capture events and pass them to the Traceloop Recorder, which buffers and formats the data. The Output Stream then displays the results to the Traceloop User. The process highlights steps from generating syscalls to capturing, recording, formatting, and presenting the results." class="image--center mx-auto" width="600" height="400" loading="lazy"></p>
<h3 id="heading-container-isolation-and-context">Container Isolation and Context</h3>
<p>One of Traceloop's strengths is understanding containerized environments. It doesn't just capture raw system calls – it adds context about which pod, container, and namespace generated each call. This means you can trace specific applications without getting overwhelmed by system calls from other containers running on the same node.</p>
<p>This container awareness makes Traceloop particularly powerful in Kubernetes environments where you might have dozens of pods running on a single node, but you only care about debugging one specific application.</p>
<h2 id="heading-how-to-set-up-traceloop">How to Set Up Traceloop</h2>
<p>Before we can start tracing system calls, we need to set up Traceloop in your Kubernetes environment. Traceloop is part of the <a target="_blank" href="https://inspektor-gadget.io/">Inspektor Gadget</a> ecosystem, which provides flexibility in how you use it.</p>
<h3 id="heading-installation-overview">Installation Overview</h3>
<p>This setup:</p>
<ul>
<li><p>Deploys Inspektor Gadget components to all worker nodes</p>
</li>
<li><p>Eliminates the download and initialization overhead on each use, as components are pre-loaded and ready </p>
</li>
<li><p>Eliminates the need to reinstall or reconfigure for each debugging session – just run your traces immediately</p>
</li>
<li><p>Requires cluster admin permissions</p>
</li>
<li><p>Works best for teams doing regular debugging</p>
</li>
</ul>
<h4 id="heading-installation-requirements">Installation Requirements</h4>
<p>First, ensure your cluster meets the requirements:</p>
<ul>
<li><p>Kubernetes cluster with Linux nodes</p>
</li>
<li><p>eBPF support</p>
</li>
<li><p>kubectl installed and configured</p>
</li>
<li><p>Cluster admin permissions</p>
</li>
</ul>
<h4 id="heading-install-kubectl-gadget">Install kubectl gadget</h4>
<p>The recommended way is using krew (kubectl plugin manager):</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Install krew if you don't have it</span>
curl -fsSLO <span class="hljs-string">"https://github.com/kubernetes-sigs/krew/releases/latest/download/krew-linux_amd64.tar.gz"</span>
tar zxvf krew-linux_amd64.tar.gz
./krew-linux_amd64 install krew
<span class="hljs-built_in">export</span> PATH=<span class="hljs-string">"<span class="hljs-variable">${KREW_ROOT:-<span class="hljs-variable">$HOME</span>/.krew}</span>/bin:<span class="hljs-variable">$PATH</span>"</span>

<span class="hljs-comment"># Install kubectl gadget</span>
kubectl krew install gadget
</code></pre>
<p>Alternatively, you can install directly:</p>
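<pre><code class="lang-bash"><span class="hljs-comment"># For Linux on amd64: release archives carry the version in their name, so look up the latest tag first (requires jq)</span>
IG_VERSION=$(curl -s https://api.github.com/repos/inspektor-gadget/inspektor-gadget/releases/latest | jq -r .tag_name)
curl -sL "https://github.com/inspektor-gadget/inspektor-gadget/releases/download/${IG_VERSION}/kubectl-gadget-linux-amd64-${IG_VERSION}.tar.gz" | sudo tar -C /usr/<span class="hljs-built_in">local</span>/bin -xzf - kubectl-gadget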

<span class="hljs-comment"># Verify installation</span>
kubectl gadget version
</code></pre>
<h4 id="heading-deploy-inspektor-gadget-to-your-cluster">Deploy Inspektor Gadget to Your Cluster</h4>
<p>Deploy the Inspektor Gadget components to your cluster:</p>
<pre><code class="lang-bash">kubectl gadget deploy
</code></pre>
<p>This installs the necessary DaemonSets and RBAC configurations that allow gadgets like Traceloop to run on your cluster nodes.</p>
<p>You can also deploy Inspektor Gadget with <a target="_blank" href="https://inspektor-gadget.io/docs/v0.43.0/reference/install-kubernetes/#installation-with-the-helm-chart">Helm</a>.</p>
<h4 id="heading-verify-installation">Verify Installation</h4>
<p>Check that the gadget pods are running:</p>
<pre><code class="lang-bash">kubectl get pods -n gadget
</code></pre>
<p>You should see gadget pods running on each node in your cluster.</p>
<h2 id="heading-your-first-trace-hands-on-tutorial">Your First Trace: Hands-On Tutorial</h2>
<p>Now let's capture our first system call trace. We'll create a simple scenario and watch what happens at the system level.</p>
<h3 id="heading-setting-up-the-test-environment">Setting Up the Test Environment</h3>
<p>First, create a dedicated namespace for our tracing experiments:</p>
<pre><code class="lang-bash">kubectl create ns test-traceloop-ns
</code></pre>
<p><strong>Expected output:</strong></p>
<pre><code class="lang-bash">namespace/test-traceloop-ns created
</code></pre>
<p>Next, create a simple pod that we can interact with:</p>
<pre><code class="lang-bash">kubectl run -n test-traceloop-ns --image busybox test-traceloop-pod --<span class="hljs-built_in">command</span> -- sleep inf
</code></pre>
<p><strong>Expected output:</strong></p>
<pre><code class="lang-bash">pod/test-traceloop-pod created
</code></pre>
<p>This creates a BusyBox container that sleeps indefinitely, giving us a stable target for tracing.</p>
<h3 id="heading-starting-your-first-trace">Starting Your First Trace</h3>
<p>Next, start tracing system calls for our test pod:</p>
<pre><code class="lang-bash">kubectl gadget run traceloop:latest --namespace test-traceloop-ns
</code></pre>
<p>This command starts the flight recorder. You'll see column headers showing what information Traceloop captures:</p>
<pre><code class="lang-bash">K8S.NODE    K8S.NAMESPACE    K8S.PODNAME    K8S.CONTAINERNAME    CPU    PID    COMM    SYSCALL    PARAMETERS    RET
</code></pre>
<p>The trace is now running in the foreground of this terminal, continuously recording system calls from our pod. Leave it running and open a second terminal for the next step.</p>
<h3 id="heading-generating-system-calls">Generating System Calls</h3>
<p>With the trace running, let's generate some activity. In a new terminal window, run a command inside your test pod:</p>
<pre><code class="lang-bash">kubectl <span class="hljs-built_in">exec</span> -ti -n test-traceloop-ns test-traceloop-pod -- /bin/sh
</code></pre>
<p>Once inside the container, run some basic commands:</p>
<pre><code class="lang-bash">ls /
<span class="hljs-built_in">echo</span> <span class="hljs-string">"Hello World"</span> &gt; /tmp/test.txt
cat /tmp/test.txt
</code></pre>
<h3 id="heading-collecting-the-trace">Collecting the Trace</h3>
<p>Back in your original terminal where Traceloop is running, press <strong>Ctrl+C</strong> to stop the recording and see the captured system calls.</p>
<p>You'll see output similar to this:</p>
<pre><code class="lang-bash">K8S.NODE            K8S.NAMESPACE        K8S.PODNAME          K8S.CONTAINERNAME    CPU  PID    COMM  SYSCALL      PARAMETERS                   RET
minikube-docker     test-traceloop-ns    test-traceloop-pod   test-traceloop-pod   2    95419  ls    openat       dfd=-100, filename=<span class="hljs-string">"/lib"</span>    3
minikube-docker     test-traceloop-ns    test-traceloop-pod   test-traceloop-pod   2    95419  ls    getdents64   fd=3, dirent=0x...          201
minikube-docker     test-traceloop-ns    test-traceloop-pod   test-traceloop-pod   2    95419  ls    write        fd=1, buf=<span class="hljs-string">"bin dev etc..."</span>   201
minikube-docker     test-traceloop-ns    test-traceloop-pod   test-traceloop-pod   2    95419  ls    exit_group   error_code=0                 0
</code></pre>
<h3 id="heading-understanding-your-first-trace">Understanding Your First Trace</h3>
<p>Let's break down what we're seeing:</p>
<ul>
<li><p><strong>K8S.PODNAME</strong>: Which pod generated these system calls</p>
</li>
<li><p><strong>PID</strong>: Process ID of the command that ran</p>
</li>
<li><p><strong>COMM</strong>: The command name (ls, echo, cat)</p>
</li>
<li><p><strong>SYSCALL</strong>: The actual system call made (openat, write, exit_group)</p>
</li>
<li><p><strong>PARAMETERS</strong>: Arguments passed to the system call</p>
</li>
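<li><p><strong>RET</strong>: Return value. Zero or a positive number means success (for example, a file descriptor or a byte count), while a negative number is a negated error code such as <code>-13</code> for <code>EACCES</code></p>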
</li>
</ul>
<p>This trace shows the <code>ls</code> command opening the <code>/lib</code> directory, reading directory entries, writing the output to stdout, and exiting successfully.</p>
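<p>If you redirect the trace output to a file, a quick way to find problems is to keep only the rows whose last column (<code>RET</code>) is negative. Here's a minimal sketch, assuming the column layout shown above. The <code>sample_trace</code> and <code>failed_syscalls</code> helpers and the sample rows are made up for illustration and are not part of Inspektor Gadget:</p>
<pre><code class="lang-bash"># Hypothetical sample rows in the layout shown above (RET is the last column)
sample_trace() {
  printf '%s\n' \
    'K8S.NODE K8S.NAMESPACE K8S.PODNAME K8S.CONTAINERNAME CPU PID COMM SYSCALL PARAMETERS RET' \
    'minikube-docker test-traceloop-ns test-traceloop-pod test-traceloop-pod 2 95419 ls openat dfd=-100, filename="/lib" 3' \
    'minikube-docker test-traceloop-ns test-traceloop-pod test-traceloop-pod 2 95420 cat openat dfd=-100, filename="/tmp/missing.txt" -2'
}

# Keep the header plus every row whose RET column is a negative number
failed_syscalls() {
  awk 'NR == 1 || $NF ~ /^-[0-9]+$/'
}

sample_trace | failed_syscalls
</code></pre>
<p>In practice, you would pipe your saved trace file into <code>failed_syscalls</code> instead of the sample rows.</p>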
<h3 id="heading-clean-up">Clean Up</h3>
<p>Remove the test resources:</p>
<pre><code class="lang-bash">kubectl delete pod test-traceloop-pod -n test-traceloop-ns
kubectl delete ns test-traceloop-ns
</code></pre>
<p>You can now see exactly what your applications are doing at the kernel level, something that traditional logs and kubectl commands can't show you.</p>
<p>Let's try this with an application that crashes.</p>
<h2 id="heading-step-by-step-debugging-walkthrough">Step-by-Step Debugging Walkthrough</h2>
<p>Now that you know how to capture traces, let's take a look at a real debugging scenario. We'll create an application that crashes and use Traceloop to uncover the root cause, something that is hard to do with <code>kubectl</code> alone.</p>
<h3 id="heading-the-scenario-a-mysterious-crash">The Scenario: A Mysterious Crash</h3>
<p>Let's create a Python application that has a subtle bug. It tries to write to a file it doesn't have permission to access, then crashes. This mimics real-world scenarios where applications fail due to permission issues, missing files, or resource constraints.</p>
<h3 id="heading-setting-up-the-problematic-application">Setting Up the Problematic Application</h3>
<p>First, we’ll create a new namespace for our debugging exercise:</p>
<pre><code class="lang-bash">kubectl create ns debug-traceloop-ns
</code></pre>
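<p>Now, let's create a pod with an application that will crash. The pod runs as a non-root user (UID 1000, set through <code>--overrides</code>), because the image's default root user would be allowed to overwrite the file:</p>
<pre><code class="lang-bash">kubectl run -n debug-traceloop-ns crash-app --image=python:3.9-slim --restart=Never --overrides='{"apiVersion":"v1","spec":{"securityContext":{"runAsUser":1000}}}' -- python3 -c <span class="hljs-string">"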
import time
import os
print('App starting...')
time.sleep(5)
print('Trying to write to restricted file...')
try:
    with open('/etc/passwd', 'w') as f:
        f.write('malicious content')
except Exception as e:
    print(f'Error: {e}')
    exit(1)
"</span>
</code></pre>
<p>This creates a pod that will:</p>
<ol>
<li><p>Start successfully</p>
</li>
<li><p>Try to write to <code>/etc/passwd</code> (a restricted system file)</p>
</li>
<li><p>Fail and crash with exit code 1</p>
</li>
</ol>
<h3 id="heading-starting-the-trace-before-the-crash">Starting the Trace Before the Crash</h3>
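<p>Here's the key difference from traditional debugging: we start tracing before we know there's a problem. In a real scenario, you'd have Traceloop running continuously. In this walkthrough the app exits about five seconds after it starts, so run the trace command in a second terminal before you create the pod, or delete and recreate the pod once the trace is running.</p>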
<pre><code class="lang-bash">kubectl gadget run traceloop:latest --namespace debug-traceloop-ns
</code></pre>
<p>The trace starts recording immediately. You'll see the column headers, and the flight recorder is now capturing every system call.</p>
<h3 id="heading-observing-the-application-behavior">Observing the Application Behavior</h3>
<p>In another terminal, check the pod status:</p>
<pre><code class="lang-bash">kubectl get pods -n debug-traceloop-ns -w
</code></pre>
<p>You'll see the pod go through these states:</p>
<ul>
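<li><code>Pending</code> → <code>Running</code> → <code>Error</code> (the pod stays in <code>Error</code> because it was created with <code>--restart=Never</code>; a pod that gets restarted, such as one managed by a Deployment, would go on to <code>CrashLoopBackOff</code>)</li>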
</ul>
<p>Traditional debugging would show you:</p>
<pre><code class="lang-bash">kubectl logs -n debug-traceloop-ns crash-app
</code></pre>
<p>Output:</p>
<pre><code class="lang-bash">App starting...
Trying to write to restricted file...
Error: [Errno 13] Permission denied: <span class="hljs-string">'/etc/passwd'</span>
</code></pre>
<p>But this doesn't tell you exactly what the application tried to do at the system level.</p>
<h3 id="heading-collecting-and-analyzing-the-trace">Collecting and Analyzing the Trace</h3>
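<p>Back in your Traceloop terminal, press <strong>Ctrl+C</strong> to stop the recording. You'll see system calls similar to this (abridged to the relevant rows; a real trace also includes the Python interpreter's startup calls):</p>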
<pre><code class="lang-bash">K8S.NODE        K8S.NAMESPACE      K8S.PODNAME  COMM    SYSCALL    PARAMETERS                           RET
minikube-docker debug-traceloop-ns crash-app    python3 openat     dfd=-100, filename=<span class="hljs-string">"/etc/passwd"</span>    -13
minikube-docker debug-traceloop-ns crash-app    python3 write      fd=3, buf=<span class="hljs-string">"App starting..."</span>         16
minikube-docker debug-traceloop-ns crash-app    python3 openat     dfd=-100, filename=<span class="hljs-string">"/etc/passwd"</span>    -13
minikube-docker debug-traceloop-ns crash-app    python3 exit_group error_code=1                        0
</code></pre>
<h3 id="heading-reading-the-system-call-story">Reading the System Call Story</h3>
<p>The trace reveals the exact sequence of events:</p>
<ol>
<li><p><code>openat filename="/etc/passwd" RET=-13</code>: The application tried to open <code>/etc/passwd</code> for writing</p>
<ul>
<li>Return code <code>-13</code> = <code>EACCES</code> (Permission denied)</li>
</ul>
</li>
<li><p><code>write buf="App starting..."</code>: Normal logging output (successful)</p>
</li>
<li><p><code>openat filename="/etc/passwd" RET=-13</code>: Second attempt to open the restricted file (still denied)</p>
</li>
<li><p><code>exit_group error_code=1</code>: Application exits with error code 1</p>
</li>
</ol>
<h3 id="heading-what-traceloop-revealed">What Traceloop Revealed</h3>
<p>Traditional debugging told us "Permission denied" but Traceloop shows us:</p>
<ul>
<li><p><strong>Exactly which file</strong> the application tried to access</p>
</li>
<li><p><strong>When</strong> the permission denial happened in the execution flow</p>
</li>
<li><p><strong>How many times</strong> it tried (twice in this case)</p>
</li>
<li><p><strong>The exact system call</strong> that failed (<code>openat</code>)</p>
</li>
</ul>
<h3 id="heading-real-world-applications">Real-World Applications</h3>
<p>This same approach works for debugging:</p>
<ul>
<li><p><strong>File not found errors</strong>: See exactly which files your app is looking for</p>
</li>
<li><p><strong>Network connection failures</strong>: Observe failed <code>connect()</code> system calls with specific addresses</p>
</li>
<li><p><strong>Memory issues</strong>: Watch <code>mmap()</code> and <code>brk()</code> calls that fail</p>
</li>
<li><p><strong>Container startup problems</strong>: See which system calls fail during initialization</p>
</li>
</ul>
<h3 id="heading-clean-up-1">Clean Up</h3>
<p>Remove the test resources:</p>
<pre><code class="lang-bash">kubectl delete pod crash-app -n debug-traceloop-ns
kubectl delete ns debug-traceloop-ns
</code></pre>
<h3 id="heading-key-takeaway">Key Takeaway</h3>
<p>Traditional Kubernetes debugging shows you what went wrong after it happened. Traceloop's continuous recording shows you exactly how it went wrong at the system level. This level of detail is invaluable for debugging complex production issues where the logs don't tell the full story.</p>
<h2 id="heading-real-world-debugging-scenarios">Real-World Debugging Scenarios</h2>
<p>Now that you understand the fundamentals, let's explore common production issues and how Traceloop helps diagnose them. These scenarios mirror real problems you'll encounter in Kubernetes environments.</p>
<h3 id="heading-scenario-1-container-startup-failures">Scenario 1: Container Startup Failures</h3>
<p><strong>The problem</strong>: Your pod gets stuck in <code>CrashLoopBackOff</code> with unhelpful logs.</p>
<p>Traditional <code>kubectl</code> commands show limited information:</p>
<pre><code class="lang-bash">kubectl describe pod failing-app
<span class="hljs-comment"># Events: Back-off restarting failed container</span>

kubectl logs failing-app
<span class="hljs-comment"># (Empty or minimal output)</span>
</code></pre>
<p>System calls show the application tried to:</p>
<ol>
<li><p>Access configuration files that don't exist</p>
</li>
<li><p>Connect to services that aren't available</p>
</li>
<li><p>Write to directories without proper permissions</p>
</li>
</ol>
<p>Key system calls to watch:</p>
<ol>
<li><p><code>openat</code> with <code>-2</code> return (file not found)</p>
</li>
<li><p><code>connect</code> with <code>-111</code> return (connection refused)</p>
</li>
<li><p><code>access</code> with <code>-13</code> return (permission denied)</p>
</li>
</ol>
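<p>These negative return values are negated Linux errno numbers, and a small lookup helper can make saved traces easier to read. This is only a sketch: <code>errno_name</code> is a hypothetical function, and it covers just the codes mentioned in this article, using the numbers found on x86-64 and arm64 Linux:</p>
<pre><code class="lang-bash"># Translate a negated errno from the RET column into its symbolic name.
# Only the codes used in this article are listed (x86-64 and arm64 Linux values).
errno_name() {
  case "${1#-}" in
    2)   echo ENOENT ;;        # file not found
    12)  echo ENOMEM ;;        # out of memory
    13)  echo EACCES ;;        # permission denied
    24)  echo EMFILE ;;        # too many open files
    110) echo ETIMEDOUT ;;     # connection timed out
    111) echo ECONNREFUSED ;;  # connection refused
    *)   echo "UNKNOWN($1)" ;;
  esac
}

errno_name -13    # prints EACCES
errno_name -111   # prints ECONNREFUSED
</code></pre>
<p>For codes that aren't listed here, <code>man 3 errno</code> on a Linux machine has the full list.</p>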
<h3 id="heading-scenario-2-memory-and-resource-issues">Scenario 2: Memory and Resource Issues</h3>
<p><strong>The problem</strong>: Application performance degrades or gets OOMKilled.</p>
<p>What Traceloop shows:</p>
<ol>
<li><p><code>mmap</code> calls failing (memory allocation issues)</p>
</li>
<li><p><code>brk</code> system calls indicating heap growth</p>
</li>
<li><p>File descriptor exhaustion through failed <code>openat</code> calls</p>
</li>
<li><p>Excessive <code>write</code> calls indicating memory pressure</p>
</li>
</ol>
<p><strong>Example pattern</strong>:</p>
<pre><code class="lang-bash">SYSCALL    PARAMETERS           RET
mmap       length=1048576       -12  <span class="hljs-comment"># ENOMEM - out of memory</span>
brk        brk=0x55555557d000   0    <span class="hljs-comment"># Heap expansion</span>
openat     filename=<span class="hljs-string">"/tmp/..."</span>   -24  <span class="hljs-comment"># EMFILE - too many open files</span>
</code></pre>
<h3 id="heading-scenario-3-network-connectivity-problems">Scenario 3: Network Connectivity Problems</h3>
<p><strong>The problem</strong>: Service-to-service communication fails intermittently.</p>
<p>Traditional debugging limitations:</p>
<ol>
<li><p>Application logs show "connection timeout"</p>
</li>
<li><p>Network policies seem correct</p>
</li>
<li><p>DNS resolution appears to work</p>
</li>
</ol>
<p>What Traceloop reveals:</p>
<ol>
<li><p>Exact IP addresses and ports being attempted</p>
</li>
<li><p>DNS resolution patterns through <code>openat</code> on <code>/etc/resolv.conf</code></p>
</li>
<li><p>Failed <code>connect</code> calls with specific error codes</p>
</li>
<li><p>Socket creation and binding issues</p>
</li>
</ol>
<p><strong>Key indicators</strong>:</p>
<pre><code class="lang-bash">SYSCALL    PARAMETERS                    RET
socket     family=AF_INET, <span class="hljs-built_in">type</span>=SOCK     3
connect    fd=3, addr=10.96.0.1:443     -110  <span class="hljs-comment"># ETIMEDOUT</span>
close      fd=3                         0
</code></pre>
<h3 id="heading-scenario-4-configuration-and-secret-issues">Scenario 4: Configuration and Secret Issues</h3>
<p><strong>The problem</strong>: Application can't access mounted secrets or config maps.</p>
<p>What system calls reveal:</p>
<ol>
<li><p>File access patterns for mounted volumes</p>
</li>
<li><p>Permission checks on secret files</p>
</li>
<li><p>Configuration file parsing attempts</p>
</li>
</ol>
<p>Common patterns:</p>
<ol>
<li><p>Multiple <code>openat</code> attempts on different config file paths</p>
</li>
<li><p><code>access</code> calls checking file permissions before opening</p>
</li>
<li><p>Failed reads from mounted secret volumes</p>
</li>
</ol>
<h3 id="heading-scenario-5-performance-bottlenecks">Scenario 5: Performance Bottlenecks</h3>
<p><strong>The problem</strong>: Application response times are slow without obvious cause.</p>
<p>Traceloop analysis:</p>
<ol>
<li><p>Excessive <code>fsync</code> calls (disk I/O bottlenecks)</p>
</li>
<li><p>Many <code>futex</code> calls (lock contention)</p>
</li>
<li><p>Frequent <code>recvfrom</code> timeouts (network issues)</p>
</li>
<li><p>Repeated file system operations</p>
</li>
</ol>
<p><strong>Performance indicators</strong>:</p>
<pre><code class="lang-bash">SYSCALL     FREQUENCY    ISSUE
fsync       High         Disk I/O bottleneck
futex       Excessive    Lock contention
poll        Many         Waiting <span class="hljs-keyword">for</span> I/O
recvfrom    Timeouts     Network delays
</code></pre>
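<p>To turn a saved trace into a rough frequency table like the one above, you can count rows per system call name. Here's a minimal sketch, assuming the full column layout from the first tutorial, where <code>SYSCALL</code> is the eighth whitespace-separated field. The <code>syscall_counts</code> helper and the sample rows are hypothetical:</p>
<pre><code class="lang-bash"># Count how often each syscall appears in saved trace output, most frequent first.
# Assumes the full column layout, where SYSCALL is the 8th field, and skips the header row.
syscall_counts() {
  awk 'NR != 1 { count[$8]++ } END { for (name in count) print count[name], name }' | sort -rn
}

# Hypothetical sample rows for illustration
printf '%s\n' \
  'K8S.NODE K8S.NAMESPACE K8S.PODNAME K8S.CONTAINERNAME CPU PID COMM SYSCALL PARAMETERS RET' \
  'node-1 prod api api 0 101 app futex uaddr=0x1 0' \
  'node-1 prod api api 0 101 app futex uaddr=0x1 0' \
  'node-1 prod api api 1 101 app fsync fd=7 0' \
  | syscall_counts
</code></pre>
<p>A high count on its own isn't proof of a bottleneck, so compare the numbers against a trace taken while the application is healthy.</p>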
<h2 id="heading-best-practices"><strong>Best Practices</strong></h2>
<h3 id="heading-when-to-use-traceloop"><strong>When to Use Traceloop</strong></h3>
<p>Traceloop is most useful when you’re dealing with the kinds of problems that are notoriously difficult to pin down. If you’ve ever struggled with debugging intermittent crashes that don’t happen on demand, or run into confusing permission and access issues, this is where it works best.  </p>
<p>It also helps uncover performance bottlenecks at the system level and provides visibility into application behavior during tricky startup failures. Another common use case is diagnosing network connectivity problems between pods, where other tools usually can't help.</p>
<p>Of course, not every problem requires system call tracing. For application-level issues, logs and APM tools are more effective. Cluster-level concerns are often better handled with <code>kubectl describe</code> or by looking at events, and if you’re primarily monitoring resources, standard metrics and dashboards show you what's happening.</p>
<h3 id="heading-performance-considerations"><strong>Performance Considerations</strong></h3>
<p>Like any tracing tool, Traceloop adds some overhead, but you can keep it low by narrowing the scope of your traces. For example, filter by namespace with <code>--namespace specific-ns</code>, or target specific pods with <code>--podname target-pod</code>. In high-traffic environments, it’s best to run traces for shorter periods, and node-specific tracing can further isolate debugging when you don’t want to instrument the entire cluster.</p>
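<p>One way to make scoping a habit is to wrap the command in a small helper that always requires a namespace. The sketch below only prints the command so you can review it before running it. <code>trace_cmd</code> is a hypothetical helper, and the flags are the ones mentioned above:</p>
<pre><code class="lang-bash"># Print (rather than run) a scoped traceloop command so it can be reviewed first.
# Usage: trace_cmd NAMESPACE [PODNAME]
trace_cmd() {
  namespace="$1"
  podname="${2:-}"
  cmd="kubectl gadget run traceloop:latest --namespace $namespace"
  # Narrow the trace to a single pod when a pod name is given
  if [ -n "$podname" ]; then
    cmd="$cmd --podname $podname"
  fi
  echo "$cmd"
}

trace_cmd specific-ns target-pod
</code></pre>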
<p>In most cases, Traceloop uses very little CPU and memory, thanks to its eBPF-based approach. This makes it lighter than traditional tools like strace. The actual cost depends on the volume of system calls being recorded, so it’s a good practice to monitor resource usage in your own environment to confirm it’s operating within acceptable limits.</p>
<h3 id="heading-integration-with-your-workflow"><strong>Integration with Your Workflow</strong></h3>
<p>Traceloop works well in dev and production workflows. In development, it’s a powerful way to understand how your application interacts with the system. You can use it to confirm that your app handles edge cases correctly, or to validate permission and resource configurations before promoting workloads into production.</p>
<p>In production environments, you can deploy it in different ways. Depending on how much overhead you're okay with, some teams run it continuously on a small subset of nodes, while others use it only when traditional debugging methods don’t provide enough insight. Pairing Traceloop with your existing monitoring and logging stack can give you a much more complete picture of system behavior.</p>
<p>It also helps with teamwork. Sharing trace outputs makes it easier for teams to reason about complex issues together. The data it provides can guide improvements in error handling and logging, and documenting common system call patterns can help onboard new developers more quickly.</p>
<h3 id="heading-security-considerations"><strong>Security Considerations</strong></h3>
<p>Because Traceloop records low-level system activity, you need to be mindful of what it captures.</p>
<p><strong>What Traceloop Can See:</strong></p>
<ul>
<li><p>System call parameters (such as filenames and network addresses)</p>
</li>
<li><p>Process information and command arguments</p>
</li>
<li><p>File access patterns and permissions</p>
</li>
</ul>
<p><strong>Privacy Measures:</strong></p>
<ul>
<li><p>Limit trace duration to minimize data collection</p>
</li>
<li><p>Use namespace isolation to avoid capturing unrelated workloads</p>
</li>
<li><p>Apply data retention policies for trace outputs</p>
</li>
<li><p>Watch for sensitive information in file paths or system call parameters</p>
</li>
</ul>
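<p>If you share trace output outside your team, it can help to mask parameter values first. Here's a minimal sketch that assumes parameters are printed as <code>filename="..."</code> and <code>buf="..."</code>, as in the examples in this article. <code>redact_trace</code> is a hypothetical helper, and the sample row is made up:</p>
<pre><code class="lang-bash"># Mask filename and buffer values in saved trace output before sharing it.
redact_trace() {
  sed -E -e 's/filename="[^"]*"/filename="[REDACTED]"/g' \
         -e 's/buf="[^"]*"/buf="[REDACTED]"/g'
}

# Hypothetical sample row for illustration
printf '%s\n' 'node-1 prod api api 0 101 app openat dfd=-100, filename="/var/run/secrets/token" 3' | redact_trace
</code></pre>
<p>This only covers two parameter names, so review the output for other sensitive values (such as network addresses) before sharing it.</p>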
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Traceloop doesn’t just tell you something went wrong – it shows you how. By recording every system call in real time, it turns mysterious Kubernetes failures into solvable problems. Whether the issue happened seconds ago or in the middle of the night, the tool gives you the ability to rewind, inspect, and respond with confidence.</p>
<h3 id="heading-when-to-use-it">When to Use It</h3>
<p>Keep in mind that Traceloop complements your existing debugging toolkit rather than replacing it. Reach for it when logs don’t tell the whole story, when intermittent problems are hiding in the shadows, when <code>kubectl</code> commands leave you guessing, or when you need to see how your application is really interacting with the system.</p>
<p>Once you’re comfortable with Traceloop, you can add more tools. <a target="_blank" href="https://inspektor-gadget.io/">Inspektor Gadget</a> offers other tools for network, security, and performance debugging that pair well with Traceloop. Integrating it into your incident response workflow, sharing insights across your team, and even considering continuous tracing for critical workloads are good things to try next.</p>
<p>The next time you run into a stubborn Kubernetes pod failure, you won’t be stuck speculating. With Traceloop, you can “rewind the tape” and see exactly what happened. System call tracing may sound complex at first, but in practice, it’s one of the most powerful ways to truly understand how applications behave in containerized environments.</p>
<p><strong>PS:</strong> Have any questions about Traceloop or want to share your debugging challenges? The Inspektor Gadget team and community hang out in the <a target="_blank" href="https://kubernetes.slack.com/archives/CSYL75LF6">#inspektor-gadget</a> channel on Kubernetes Slack. It's a great place to get help from the engineers who built these tools, share experiences, and maybe even contribute to making the ecosystem even better.  </p>
<p>You can also connect with me on <a target="_blank" href="https://www.linkedin.com/in/emidowojo/">LinkedIn</a> if you’d like to stay in touch. If you made it to the end of this tutorial, thanks for reading!</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Learn Kubernetes – Full Handbook for Developers, Startups, and Businesses ]]>
                </title>
                <description>
                    <![CDATA[ You’ve probably heard the word Kubernetes floating around, or its cooler nickname k8s (pronounced “kates”). Maybe in a job post, a tech podcast, or from that one DevOps friend who always brings it up like it’s the secret sauce to everything 😅. It s... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/learn-kubernetes-handbook-devs-startups-businesses/</link>
                <guid isPermaLink="false">68150214fd424d0874293171</guid>
                
                    <category>
                        <![CDATA[ Kubernetes ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Devops ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Cloud ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Cloud Computing ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Prince Onukwili ]]>
                </dc:creator>
                <pubDate>Fri, 02 May 2025 17:34:12 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1746205417767/d9d6b0d3-f2a5-44eb-83b5-d1a614bead9f.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>You’ve probably heard the word Kubernetes floating around, or its cooler nickname k8s (pronounced “kates”). Maybe in a job post, a tech podcast, or from that one DevOps friend who always brings it up like it’s the secret sauce to everything 😅. It sounds important, but also... kinda mysterious.</p>
<p>So what is Kubernetes, really? Why is it everywhere? And should you care?</p>
<p>In this handbook, we’ll unpack Kubernetes in a way that actually makes sense. No buzzwords. No overwhelming tech-speak. Just straight talk. You’ll learn what Kubernetes is, how it came about, and why it became such a big deal – especially for teams building and running huge apps with millions of users.</p>
<p>We’ll rewind a bit to see how things were done before Kubernetes showed up (spoiler: it wasn’t pretty), and walk through the real problems it was designed to solve.</p>
<p>By the end, you’ll not only understand the purpose of Kubernetes, but you’ll also know how to deploy a simple app on a Kubernetes cluster – even if you’re just getting started.</p>
<p>Yep, by the time we’re done, you’ll go from <em>“I keep hearing about Kubernetes”</em> to <em>“Hey, I kinda get it now!”</em> 😄</p>
<h2 id="heading-table-of-contents">📚 Table of Contents</h2>
<ol>
<li><p><a class="post-section-overview" href="#heading-what-is-kubernetes">What is Kubernetes?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-applications-were-deployed-before-kubernetes">How Applications Were Deployed Before Kubernetes</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-the-problem-kubernetes-solves">The Problem Kubernetes Solves 🧠</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-kubernetes-works-components-of-a-kubernetes-environment">How Kubernetes Works – Components of a Kubernetes Environment 🧑‍🔧</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-kubernetes-workloads-pods-deployments-services-amp-more">Kubernetes Workloads 🛠️ – Pods, Deployments, Services, &amp; More</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-create-a-kubernetes-cluster-in-a-demo-environment-with-play-with-k8s">How to Create a Kubernetes Cluster in a Demo Environment with play-with-k8s</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-sign-in-to-play-with-kubernetes">Sign in to Play with Kubernetes</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-create-your-kubernetes-cluster">Create Your Kubernetes Cluster</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-deploy-your-application-on-a-kubernetes-cluster">How to Deploy an Application on Your Kubernetes Cluster</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-advantages-of-using-kubernetes-in-business">✅ Advantages of Using Kubernetes in Business</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-disadvantages-of-using-kubernetes">😬 Disadvantages of Using Kubernetes</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-use-cases-when-and-when-not-to-use-kubernetes">Use Cases: When (and When Not) to Use Kubernetes</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-study-further">Study Further 📚</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-about-the-author">About the Author 👨‍💻</a></p>
</li>
</ol>
<h2 id="heading-what-is-kubernetes"><strong>What is Kubernetes?</strong></h2>
<p>Imagine you're building a huge software platform, like a banking app. This app needs many features, like user onboarding, depositing money, withdrawals, payments, and so on. These features are so big and complex that it’s easier to split them into separate applications. These individual applications are called microservices.</p>
<p><strong>So what are microservices</strong>? Think of them like little building blocks that work together to create a bigger platform. So, you might have:</p>
<ul>
<li><p>One microservice for user onboarding</p>
</li>
<li><p>Another for processing deposits</p>
</li>
<li><p>Another for handling payments</p>
</li>
<li><p>And many, many more!</p>
</li>
</ul>
<p>To the user, it still looks like they’re using one smooth, unified banking app. But behind the scenes, it’s like a bunch of little apps working together to make everything run.</p>
<h3 id="heading-but-heres-where-things-get-tricky">But here’s where things get tricky...</h3>
<p>When you have dozens (or even hundreds) of these microservices, managing them becomes a nightmare. You might need to:</p>
<ul>
<li><p><strong>Deploy</strong> each one separately</p>
</li>
<li><p><strong>Monitor</strong> them individually (to ensure they don’t crash/become slow due to too much load)</p>
</li>
<li><p><strong>Scale</strong> them (make them bigger to handle more users) as traffic surges, one by one</p>
</li>
</ul>
<p>So, if your banking app suddenly gets millions of users, you'd have to manually tweak and update each microservice to keep it running smoothly. 😖 It’s a lot of work, and if something goes wrong, you’re in deep trouble.</p>
<h3 id="heading-this-is-where-kubernetes-comes-to-the-rescue">This is where Kubernetes comes to the rescue! 🚀</h3>
<p>Kubernetes is like a super-efficient manager for all these microservices. It’s a platform that helps you:</p>
<ul>
<li><p><strong>Automate</strong> the deployment (getting the apps up and running)</p>
</li>
<li><p><strong>Scale</strong> the microservices (making them bigger or smaller as needed based on the inflow of traffic – your customers)</p>
</li>
<li><p><strong>Monitor</strong> them (keeping an eye on their health)</p>
</li>
<li><p><strong>Ensure reliability</strong> (so if one microservice fails, Kubernetes replaces it automatically)</p>
</li>
</ul>
<p>In simple terms, Kubernetes takes all your little microservices and organizes them, ensuring they run smoothly together, no matter how much traffic your app gets. It handles everything behind the scenes, like a conductor leading an orchestra, so your microservices work together without chaos.</p>
<h2 id="heading-how-applications-were-deployed-before-kubernetes"><strong>How Applications Were Deployed Before Kubernetes</strong></h2>
<p>Before Kubernetes came into the picture, software teams had quite the juggling act when it came to deploying applications – especially when they were made up of lots of microservices.</p>
<p>One popular method was using a <strong>distributed system</strong> setup. Here’s what that looked like:</p>
<p>Imagine each microservice (like your user onboarding, payments, deposits, and so on) being installed on separate servers (physical computers or virtual machines). Each of these servers had to be carefully prepared:</p>
<ul>
<li><p>The microservice itself needed to be installed.</p>
</li>
<li><p>The software dependencies it needed (like programming languages, libraries, tools) also had to be installed.</p>
</li>
<li><p>Everything had to be configured manually ON EACH server.</p>
</li>
</ul>
<p>And all of these servers had to talk to each other – sometimes over the public internet, or via private networks like VPNs.</p>
<p>Sounds like a lot of work, right? 😮 It was! Managing updates, fixing bugs, scaling up during traffic spikes, and keeping things from crashing could turn into a full-time headache for developers and system admins. 😖</p>
<h3 id="heading-then-came-containers">Then Came Containers 🚢</h3>
<p>A more modern solution that eased the pain (a little) was using containers.</p>
<p><strong>So, what are containers?</strong></p>
<p>Think of a container like a lunchbox for your microservice. Instead of installing the microservice and its supporting tools directly on a server, you pack everything it needs – code, settings, software libraries – into this single, neat container. Wherever the container goes, the microservice runs exactly the same way. No surprises!</p>
<p>Tools like <a target="_blank" href="https://www.docker.com/">Docker</a> made this super easy. Once your microservice was packed into a container, you could deploy it on:</p>
<ul>
<li><p>A single server</p>
</li>
<li><p>Multiple servers</p>
</li>
<li><p>Or cloud platforms like AWS Elastic Beanstalk, Azure App Service, or Google Cloud Run.</p>
</li>
</ul>
<h2 id="heading-the-problem-kubernetes-solves"><strong>The Problem Kubernetes Solves</strong> 🧠</h2>
<p>At first, when containers arrived on the scene, it felt like developers had struck gold.</p>
<p>You could package a microservice into a neat little container and run it anywhere – no more installing the same software on every server again and again. Tools like Docker and Docker Compose made this smooth for small projects.</p>
<p>But the real world? That’s where it got messy.</p>
<h3 id="heading-the-growing-headache-of-managing-containers">The Growing Headache of Managing Containers 💡</h3>
<p>When you have just a few microservices, you can manually deploy and manage their containers without much stress. But when your app grows – and you suddenly have dozens or even hundreds of microservices – managing them becomes an uphill battle:</p>
<ul>
<li><p>You have to deploy each container manually.</p>
</li>
<li><p>You have to restart them whenever one crashes.</p>
</li>
<li><p>You have to scale them one by one when more users start flooding in.</p>
</li>
</ul>
<p>Docker and Docker Compose were great for a small playground or startups, but not for an enterprise application with high traffic inflow.</p>
<h3 id="heading-cloud-managed-services-helped-but-only-up-to-a-point">Cloud-Managed Services Helped... But Only Up To a Point 🧑‍💻</h3>
<p>Cloud services like AWS Elastic Beanstalk, Azure App Service, and Google Cloud Run offered a shortcut. They let you deploy containers without worrying about setting up servers.</p>
<p>You could:</p>
<ul>
<li><p>Deploy each container on its own managed cloud instance.</p>
</li>
<li><p>Scale them automatically based on traffic.</p>
</li>
</ul>
<p>BUT there were still some big headaches:</p>
<h4 id="heading-grouping-microservices-was-awkward-and-expensive">📦 Grouping microservices was awkward and expensive</h4>
<p>Sure, you could organize containers by environment (like “testing” or “production”) or even by team (like “Finance” or “HR”). But each new microservice usually needed its own cloud instance – for example, a separate Azure App Service or Elastic Beanstalk environment FOR EVERY SINGLE CONTAINER.</p>
<p>Imagine this:</p>
<ul>
<li><p>Each App Service instance costs ~$50 per month.</p>
</li>
<li><p>You’ve got 10 microservices.</p>
</li>
<li><p>That’s $500/month... even if they’re barely used. 💸 Yikes!</p>
</li>
</ul>
<h3 id="heading-kubernetes-smarter-leaner-and-more-flexible">Kubernetes: Smarter, Leaner, and More Flexible 💪</h3>
<p>With Kubernetes, you don’t need to spin up a separate server for each microservice. You can start with just one or two servers (VMs) – and Kubernetes will automatically decide which container goes where based on available space and resources.</p>
<p>No stress, no waste! 💡</p>
<h3 id="heading-kubernetes-lets-you-customize-everything">🧑‍🍳 <strong>Kubernetes Lets You Customize Everything</strong></h3>
<ol>
<li><p>You can assign resources to each microservice container.<br> 👉 Example: If you have a "Payment" microservice that’s lightweight, you might give it 0.5 vCPUs and 512MB of memory. If you have a "Data Analytics" microservice that’s resource-hungry, you could give it 2 vCPUs and 4GB of memory.</p>
</li>
<li><p>You can set a minimum number of instances for each microservice.<br> 👉 Example: If you want at least 2 copies of your "Login" service always running (so your app doesn’t break if one fails), Kubernetes makes sure that 2 copies are live at all times.</p>
</li>
<li><p>You can group your containers however you like:<br> 👉 By teams (Finance, HR, DevOps) or by environments (Testing, Staging, Production). Kubernetes makes this grouping super clean and logical.</p>
</li>
<li><p>You can automatically scale individual containers.<br> 👉 When more users flood your app, Kubernetes can create extra copies (called “replicas”) of only the containers that are under pressure. No more wasting resources on containers that don’t need it.</p>
</li>
<li><p>You can even scale your servers!<br> 👉 With a component such as the Cluster Autoscaler, Kubernetes can automatically increase the number of servers (VMs) in your environment – called a <strong>Cluster</strong> – when traffic grows. So you could start with 2 VMs at $30 each ($60/month) and let Kubernetes add more servers only when necessary, rather than locking yourself into high fixed costs like $500/month for cloud-managed services.</p>
</li>
</ol>
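<p>As a rough sketch of points 1 to 4 above, the manifest below groups a workload into a namespace, assigns it CPU and memory, keeps a minimum number of copies running, and scales it on CPU load. The names <code>finance</code> and <code>payment</code>, the <code>nginx</code> stand-in image, the replica bounds, and the 70% CPU target are illustrative assumptions, not values from a real system. The autoscaler part also assumes a metrics source such as metrics-server is installed in the cluster.</p>
<pre><code class="lang-yaml"># Hypothetical example: names, image, and numbers are placeholders.
apiVersion: v1
kind: Namespace
metadata:
  name: finance              # groups one team's workloads together
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment
  namespace: finance
spec:
  replicas: 2                # keep at least two copies running
  selector:
    matchLabels:
      app: payment
  template:
    metadata:
      labels:
        app: payment
    spec:
      containers:
      - name: payment
        image: nginx         # stand-in image for illustration only
        resources:
          requests:
            cpu: 500m        # 0.5 vCPU
            memory: 512Mi
          limits:
            cpu: 500m
            memory: 512Mi
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: payment
  namespace: finance
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment
  minReplicas: 2
  maxReplicas: 6             # assumed upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add replicas when average CPU passes 70%
</code></pre>
<p>You could save this as a file (say, a hypothetical <code>payment.yaml</code>) and submit it with <code>kubectl apply -f payment.yaml</code>. Adding or removing servers (point 5) is not expressed in a manifest like this one; it depends on how your cluster is hosted.</p>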
<p>Also, Kubernetes works <strong>the same way everywhere</strong>. Whether you deploy your containers on AWS, Google Cloud, Azure, or even your own laptop – Kubernetes doesn’t care. Your setup stays the same.</p>
<p>Compare that to managed services like Elastic Beanstalk or Azure App Service – which tie you to their platform, making it super hard to switch later.</p>
<p>✅ <strong>In short:</strong> Kubernetes saves you money, time, and a whole lot of headaches. It lets you run, scale, and organize your microservices without being chained to a single cloud provider — and without drowning in manual work.</p>
<h2 id="heading-how-kubernetes-works-components-of-a-kubernetes-environment"><strong>How Kubernetes Works — Components of a Kubernetes Environment</strong> 🧑‍🔧</h2>
<p>So by now you’ve seen the problem: running dozens (or hundreds!) of microservices manually is like juggling too many balls – you’re bound to drop some.</p>
<p>That’s why Kubernetes was created. But... how does it actually do all this magic? Let’s first break it down with the technical definition (simple but sharp – perfect for interviews) and then the layperson’s analogy (so it sticks in your head!).</p>
<h3 id="heading-1-cluster">1️⃣ <strong>Cluster 🏰</strong></h3>
<p>A Kubernetes Cluster is the entire setup of machines (physical or cloud-based) where Kubernetes runs. It’s made of one or more Master Nodes and Worker Nodes, working together to deploy and manage containerized applications.</p>
<p>Think of a Kubernetes Cluster as your entire playground. This is the environment where all your microservices live, grow, and play together.</p>
<p>A cluster is made up of two types of computers (called nodes):</p>
<ul>
<li><p>Master Node (nowadays often called the Control Plane)</p>
</li>
<li><p>Worker Nodes</p>
</li>
</ul>
<h3 id="heading-2-master-node-control-plane">2️⃣ <strong>Master Node (Control Plane) 👑</strong></h3>
<p>The Master Node is like the brain of Kubernetes. It manages and coordinates the whole cluster – deciding which applications run where, monitoring health, and scaling things up or down as needed.</p>
<p>It’s like the boss of the entire cluster. It doesn’t run your applications directly. Instead, it:</p>
<ul>
<li><p>Watches over the worker nodes</p>
</li>
<li><p>Decides which microservice (container) goes where</p>
</li>
<li><p>Makes sure everything runs smoothly and fairly</p>
</li>
</ul>
<p>Think of it like a factory manager who tells machines what to do, when to start, when to stop, and where to send the next package.</p>
<p>Inside the Master Node are a few clever mini-components that handle the real work.</p>
<h3 id="heading-3-api-server">3️⃣ <strong>API Server 💌</strong></h3>
<p>The API Server is the front door to Kubernetes. It handles communication between users and the system, taking commands and feeding them into the cluster.</p>
<p>This is where you (or your team) give Kubernetes instructions. Whether you're deploying a new app or scaling an existing one, you "talk" to the API Server first. It's like submitting a request at the front desk – the API server passes it on to the right people (or machines).</p>
<h3 id="heading-4-scheduler">4️⃣ <strong>Scheduler 📅</strong></h3>
<p>The Scheduler assigns Pods (applications) to Worker Nodes based on available resources and needs.</p>
<p>Imagine you’ve asked Kubernetes to launch a new microservice. The Scheduler checks:</p>
<ul>
<li><p>Which worker node has enough space?</p>
</li>
<li><p>Which node has enough memory and CPU?</p>
</li>
<li><p>Where would this service run best?</p>
</li>
</ul>
<p>It makes the decision and assigns the microservice to the perfect spot. Smart, huh?</p>
<h3 id="heading-5-controller-manager">5️⃣ <strong>Controller Manager 🎛️</strong></h3>
<p>The Controller Manager runs controllers that watch over the cluster and ensure that the system’s actual state matches the desired state.</p>
<p>This component watches over the system like a hawk. Let’s say you told Kubernetes:<br><em>"Hey, I want 3 copies of my payment microservice running at all times."</em></p>
<p>If one of them crashes, the Controller Manager sees that and spins up a new one to replace it automatically. It makes sure the reality always matches the plan.</p>
<h3 id="heading-6-etcd">6️⃣ <strong>etcd 📚</strong></h3>
<p>etcd is Kubernetes' memory – a distributed key-value store where cluster data is saved: config files, state, and metadata.</p>
<p>Imagine a notebook where all rules, records, and plans are written down. Without etcd, Kubernetes would forget everything.</p>
<h3 id="heading-7-worker-nodes">7️⃣ <strong>Worker Nodes 💪</strong></h3>
<p>Worker Nodes are the servers that run the actual application containers, doing the heavy lifting in the cluster.</p>
<p>These are the machines where your microservices actually live and run. The Master Node gives orders, but the Worker Nodes do the heavy lifting – they run your containers!</p>
<p>Each worker node has a few helpers to manage its microservices:</p>
<ul>
<li><p>The Kubelet</p>
</li>
<li><p>The Kube Proxy</p>
</li>
</ul>
<h3 id="heading-8-kubelet">8️⃣ <strong>Kubelet 📢</strong></h3>
<p>The Kubelet is the agent that lives on each Worker Node and makes sure containers are healthy and running as expected.</p>
<p>It listens to the Master Node’s instructions. If the Master Node says, <em>"Hey, run this container!"</em>, the Kubelet makes it happen and keeps it running. If something goes wrong, the Kubelet reports back to the Master Node.</p>
<h3 id="heading-9-kube-proxy">9️⃣ <strong>Kube Proxy 🚦</strong></h3>
<p>Kube Proxy handles network traffic, ensuring that Pods can talk to each other and to the outside world.</p>
<p>Imagine your banking app’s login service needs to talk to the payments service. The Kube Proxy handles the routing so the request reaches the right place. It also handles load balancing, so no single microservice gets overwhelmed.</p>
<p>So, to summarize:</p>
<ul>
<li><p>The Master Node is the boss – it plans, watches, and assigns tasks.</p>
</li>
<li><p>The Worker Nodes do the actual work – running your microservices.</p>
</li>
<li><p>Components like etcd, Kubelet, Scheduler, Controller Manager, and Kube Proxy all work together like parts of a well-oiled machine.</p>
</li>
</ul>
<p>Kubernetes is designed to handle your microservices automatically – keeping them alive, scaling them up, moving them around, and restarting them if they crash – so you don’t have to babysit them yourself.</p>
<h2 id="heading-kubernetes-workloads-pods-deployments-services-amp-more">Kubernetes Workloads 🛠️ — Pods, Deployments, Services, &amp; More</h2>
<p>Kubernetes workloads are the objects you use to manage and run your applications. Think of them as blueprints 📐 that tell Kubernetes <strong>what</strong> to run and <strong>how</strong> to run it – whether it’s a single app container, a group of containers, a database, or a batch job. Here are some of the workloads in Kubernetes:</p>
<h3 id="heading-1-pods">1️⃣ <strong>Pods</strong></h3>
<p>A <strong>Pod</strong> is the smallest and simplest unit in the Kubernetes object model. It represents a single instance of a running process in your cluster and can contain one or more containers that share storage and network resources.</p>
<p>Think of a Pod as a wrapper around one or more containers that need to work together. They share the same network IP and storage, allowing them to communicate easily and share data. Pods are ephemeral: they are short-lived and can be replaced very easily. If a Pod dies, Kubernetes can create a new one to replace it almost instantly.</p>
<p>Say you have an application which is split into 2 parts: a frontend and a backend. The frontend will run in a container in Pod A, while the backend will run in a container in another Pod, Pod B.</p>
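<p>The frontend/backend split described above could be sketched as two Pod manifests like these. The image names under <code>example.com</code>, the labels, and the port numbers are hypothetical placeholders chosen only for illustration.</p>
<pre><code class="lang-yaml"># Hypothetical example: image names and ports are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: frontend             # "Pod A"
  labels:
    app: frontend
spec:
  containers:
  - name: frontend
    image: example.com/frontend:1.0
    ports:
    - containerPort: 80
---
apiVersion: v1
kind: Pod
metadata:
  name: backend              # "Pod B"
  labels:
    app: backend
spec:
  containers:
  - name: backend
    image: example.com/backend:1.0
    ports:
    - containerPort: 8080
</code></pre>
<p>In practice you rarely create bare Pods like this, because nothing recreates them if they are deleted. A higher-level workload such as a Deployment usually creates them for you.</p>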
<h3 id="heading-2-deployments">2️⃣ <strong>Deployments</strong></h3>
<p>A <strong>Deployment</strong> provides declarative updates for Pods and ReplicaSets. You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate.</p>
<p>Deployments manage the lifecycle of your application Pods. They ensure that the specified number of Pods are running and can handle updates, rollbacks, and scaling. If a Pod fails, the Deployment automatically replaces it to maintain the desired state.</p>
<p>Imagine you're managing a store. A Deployment is like the store manager – you tell it how many workers (Pods) you want, and it makes sure they’re always present. If one doesn't show up for work, the manager finds a replacement automatically. You can also tell it to hire more workers or fire some when needed.</p>
<h3 id="heading-3-services">3️⃣ <strong>Services</strong></h3>
<p>A <strong>Service</strong> in Kubernetes defines a way to access/communicate with Pods. Services enable communication between different Pods (for example, your frontend Pod A can communicate with your backend Pod B via a Service) and can expose your application to external traffic (for example, the public internet).</p>
<p>Services act as a stable endpoint to access a set of Pods. Even if the underlying Pods change, the Service's IP and DNS name remain constant, ensuring reliable communication between the Pods within the cluster or with the internet.</p>
<p>A Service is like the front door to your app. No matter which worker (Pod) is behind it, people always use the same entrance to access it. It hides the messy stuff happening behind the scenes and gives users a simple way to connect to your app.</p>
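<p>As a minimal sketch of that "front door", the Service below gives a set of backend Pods one stable name inside the cluster. The name <code>backend</code>, the <code>app: backend</code> label, and port <code>8080</code> are assumptions made for this example.</p>
<pre><code class="lang-yaml"># Hypothetical example: name, label, and ports are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  type: ClusterIP            # reachable only from inside the cluster
  selector:
    app: backend             # traffic goes to Pods carrying this label
  ports:
  - port: 80                 # port the Service listens on
    targetPort: 8080         # port the container listens on
</code></pre>
<p>With cluster DNS in place, other Pods in the same namespace could then reach the backend at <code>http://backend</code>, regardless of which Pods are currently behind it.</p>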
<h3 id="heading-4-replicasets">4️⃣ <strong>ReplicaSets</strong></h3>
<p>A <strong>ReplicaSet</strong> ensures that a specified number of identical Pods are running at any given time. It is often used to guarantee the availability of a specified number of Pods (horizontal scaling).</p>
<p>ReplicaSets maintain a stable set of running Pods. If a Pod crashes or is deleted, the ReplicaSet automatically creates a new one to replace it, ensuring your application remains available.</p>
<p>Think of a ReplicaSet like a robot that counts how many copies of your app are running. If one goes missing, it automatically makes a new one. It keeps the number steady, just like you told it to.</p>
<h3 id="heading-5-daemonsets">5️⃣ <strong>DaemonSets</strong></h3>
<p>A <strong>DaemonSet</strong> ensures that all (or some) Nodes run an instance (a copy) of a specific Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are also removed.</p>
<p>DaemonSets are used to deploy a Pod on every node in the cluster. This is useful for running background tasks like log collection or monitoring agents on all nodes (for example, to get the CPU, memory, and disk usage of each node).</p>
<p>A DaemonSet is like saying, “I want this helper app to run on <strong>every single computer</strong> we have.” As mentioned earlier, it’s great for things like log collectors or security checkers – small helpers that every machine should have.</p>
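<p>A log collector that runs once per node might be declared roughly as follows. The <code>log-collector</code> name and the image are hypothetical, and reading logs from the node's <code>/var/log</code> directory is an assumption about how such an agent would work.</p>
<pre><code class="lang-yaml"># Hypothetical example: name and image are placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
      - name: log-collector
        image: example.com/log-collector:1.0
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log     # the node's own log directory
</code></pre>
<p>Notice there is no <code>replicas</code> field: the number of Pods follows the number of eligible nodes.</p>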
<h3 id="heading-6-statefulsets">6️⃣ <strong>StatefulSets</strong></h3>
<p>A <strong>StatefulSet</strong> is the workload API object used to manage stateful applications (applications that store data, such as databases that write to their filesystem). It manages the deployment and scaling of a set of Pods and provides guarantees about the ordering and uniqueness of these Pods.</p>
<p>StatefulSets are designed for applications that require persistent storage and stable network identities, like databases.</p>
<p>Let’s say you’re running a database or anything that needs to save info. A StatefulSet is like giving each app a name tag and a personal drawer to store their stuff. Even if you restart them, they come back with the same name and same drawer.</p>
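<p>The "name tag and personal drawer" idea could look something like the sketch below. The <code>db</code> name, the image, the port, the <code>/data</code> mount path, and the 1Gi size are all hypothetical, and the example assumes the cluster can provision storage for the volume claims.</p>
<pre><code class="lang-yaml"># Hypothetical example: name, image, port, and sizes are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  clusterIP: None            # headless Service: gives each Pod a stable DNS name
  selector:
    app: db
  ports:
  - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db            # must match the headless Service above
  replicas: 2                # Pods are named db-0 and db-1
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: db
        image: example.com/database:1.0
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:      # one persistent volume claim per Pod
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
</code></pre>
<p>If <code>db-0</code> is restarted, it comes back as <code>db-0</code> and reattaches to the same volume claim, which is what distinguishes it from a Pod managed by a Deployment.</p>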
<h3 id="heading-7-jobs">7️⃣ <strong>Jobs</strong></h3>
<p>A <strong>Job</strong> creates one or more Pods and ensures that a specified number of them successfully terminate. As Pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the Job is complete.</p>
<p>A Job is like a one-time task. Imagine sending out a batch of emails or processing a report. You want the task to run, finish, and then stop. That’s exactly what a Job does.</p>
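<p>A one-off task of that kind might be written as the Job below. The <code>send-report</code> name and the shell command are made up for illustration; the command only prints messages and stands in for real work.</p>
<pre><code class="lang-yaml"># Hypothetical example: name and command are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: send-report
spec:
  completions: 1             # the Job is done after one successful run
  backoffLimit: 3            # retry up to 3 times on failure
  template:
    spec:
      restartPolicy: Never   # Jobs require Never or OnFailure
      containers:
      - name: send-report
        image: busybox
        command: ["sh", "-c", "echo generating report; sleep 5; echo done"]
</code></pre>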
<h3 id="heading-8-cronjobs">8️⃣ <strong>CronJobs</strong></h3>
<p>A <strong>CronJob</strong> creates Jobs on a time-based schedule. It runs a Job periodically on a given schedule, written in Cron format.</p>
<p>A CronJob is like setting a reminder or alarm. It tells your app (in this case the Job) to do something every night at 2 AM, every Monday morning, or once a month – whatever schedule you give it.</p>
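<p>The "every night at 2 AM" case could be sketched like this. The <code>nightly-report</code> name and the command are hypothetical, and the <code>batch/v1</code> API version assumes a reasonably recent cluster (older clusters may only offer a beta version of CronJob).</p>
<pre><code class="lang-yaml"># Hypothetical example: name and command are placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"      # minute hour day-of-month month day-of-week
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: nightly-report
            image: busybox
            command: ["sh", "-c", "echo running nightly report"]
</code></pre>
<p>Each time the schedule fires, Kubernetes creates a new Job from <code>jobTemplate</code>, and that Job in turn creates the Pod.</p>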
<h2 id="heading-how-to-create-a-kubernetes-cluster-in-a-demo-environment-with-play-with-k8s">🛠️ How to Create a Kubernetes Cluster in a Demo Environment with <code>play-with-k8s</code></h2>
<p>As we've discussed earlier, a Kubernetes cluster is a set of machines (called nodes) that run containerized applications.</p>
<p>Setting up a Kubernetes cluster locally or in the cloud can be complex and expensive. To simplify the learning process, Docker provides a free, browser-based platform called <a target="_blank" href="https://labs.play-with-k8s.com/">Play with Kubernetes</a>. This environment allows you to create and interact with a Kubernetes cluster without installing anything on your local machine. It's an excellent tool for beginners to get hands-on experience with Kubernetes.​</p>
<h3 id="heading-sign-in-to-play-with-kubernetes">🔐 Sign in to Play with Kubernetes</h3>
<ol>
<li><p><strong>Visit the platform</strong> at <a target="_blank" href="https://labs.play-with-k8s.com/">https://labs.play-with-k8s.com/</a>.​</p>
</li>
<li><p><strong>Authenticate:</strong></p>
<ul>
<li><p>Click on the "Login" button.</p>
</li>
<li><p>You can sign in using your Docker Hub or GitHub account.</p>
</li>
<li><p>If you don't have an account, you can create one for free on <a target="_blank" href="https://hub.docker.com/">Docker Hub</a> or <a target="_blank" href="https://github.com/">GitHub</a>.​</p>
</li>
</ul>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746083007442/a038ee6c-b471-4880-ba17-2e8927678780.png" alt="Sign in to Play with k8s" class="image--center mx-auto" width="770" height="848" loading="lazy"></p>
<h3 id="heading-create-your-kubernetes-cluster">🚀 Create Your Kubernetes Cluster</h3>
<p>Once signed in, follow these steps to set up your cluster:</p>
<h4 id="heading-step-1-start-a-new-session">Step 1: Start a New Session</h4>
<p>Click on the <strong>"Start"</strong> button to initiate a new session. This will create a new session giving you about 4 hours of play time, after which the cluster and its resources will be automatically terminated.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746083204331/8410e18b-4ed4-4374-8d4f-44f0fefa1623.png" alt="Play with k8s timed session" class="image--center mx-auto" width="1590" height="254" loading="lazy"></p>
<h4 id="heading-step-2-add-instances">Step 2: Add Instances</h4>
<p>Then click on <strong>"+ Add New Instance"</strong> to create a new node (Virtual Machine).  </p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746083280594/740d963a-c70f-43c6-8354-e6ea0c3d7f41.png" alt="Create new master node (VM)" class="image--center mx-auto" width="332" height="254" loading="lazy"></p>
<p>This will open a terminal window where you can run commands.​  </p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746083304493/ffd34d73-e5cd-41d0-908a-2240924e7ad0.png" alt="Terminal of newly created node" class="image--center mx-auto" width="1912" height="966" loading="lazy"></p>
<h4 id="heading-step-3-initialize-the-master-node">Step 3: Initialize the Master Node</h4>
<p>In the terminal, run the following command to initialize the master node:</p>
<pre><code class="lang-bash">kubeadm init --apiserver-advertise-address $(hostname -i) --pod-network-cidr &lt;SPECIFIED_IP_ADDRESS&gt;
</code></pre>
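<p>You can find the exact command in the terminal's welcome message. The value passed to <code>--pod-network-cidr</code> is an IP address range in CIDR notation rather than a single address. In my case, it is <code>10.5.0.0/16</code>. Replace the <code>&lt;SPECIFIED_IP_ADDRESS&gt;</code> placeholder with the range specified in your terminal.</p>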
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746083865451/fdf18710-c987-4221-bc02-369cd709a849.png" alt="Initialize the master node and the control plane" class="image--center mx-auto" width="1124" height="389" loading="lazy"></p>
<p>This process will set up the control plane of your Kubernetes cluster.​</p>
<h4 id="heading-step-4-add-worker-nodes">Step 4: Add Worker Nodes</h4>
<p>To add worker nodes, go back to the master node's terminal. The output of the <code>kubeadm init</code> command you ran in Step 3 includes a <code>kubeadm join...</code> command. Copy it.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746084559142/6e539ef6-0219-40da-95e7-42abc9f1af8c.png" alt="Command to add worker node to control plane" class="image--center mx-auto" width="1571" height="627" loading="lazy"></p>
<p>Click on <strong>"+ Add New Instance"</strong> to create another node just as you did earlier.</p>
<p>Run the <code>kubeadm join...</code> command from the master node's output in the new node's terminal to join it to the cluster:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746084666411/78f07ba1-7f1f-402e-9ed8-c4d6054bdcab.png" alt="Add worker node to control plane" class="image--center mx-auto" width="1912" height="966" loading="lazy"></p>
<h4 id="heading-step-5-configure-the-clusters-networking">Step 5: Configure the Cluster’s Networking</h4>
<p>Navigate to the master node, and run the command below to configure the cluster’s networking.</p>
<pre><code class="lang-bash">kubectl apply -f https://raw.githubusercontent.com/cloudnativelabs/kube-router/master/daemonset/kubeadm-kuberouter.yaml
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746085296963/ba35966c-5dd1-4e17-b4b5-85639cb3a80d.png" alt="Configure networking in the cluster" class="image--center mx-auto" width="1532" height="467" loading="lazy"></p>
<h4 id="heading-step-6-verify-the-cluster">Step 6: Verify the Cluster</h4>
<p>In the master node terminal (the first node with the highlighted user profile), run:</p>
<pre><code class="lang-bash">kubectl get nodes
</code></pre>
<p>You should see a list of nodes in your cluster, including the master and any worker nodes you've added.​</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746085583418/45e55418-4b0f-461f-98d8-3b0c8f19b839.png" alt="Nodes in the cluster" class="image--center mx-auto" width="466" height="138" loading="lazy"></p>
<p>Congratulations! You just created your very own Kubernetes cluster with 2 VMs: the master node (where the control plane resides) and a worker node (where the Kubernetes workloads, for example Pods, will be deployed).</p>
<h2 id="heading-how-to-deploy-an-application-on-your-kubernetes-cluster">🚀 How to Deploy an Application on Your Kubernetes Cluster</h2>
<p>Now that we've set up our Kubernetes cluster using Play with Kubernetes, it's time to deploy an application and make it reachable over the network.</p>
<h3 id="heading-understanding-imperative-vs-declarative-approaches-in-kubernetes">🧠 Understanding Imperative vs. Declarative Approaches in Kubernetes</h3>
<p>Before we proceed, it's essential to grasp the two primary methods for managing resources in Kubernetes: <strong>Imperative</strong> and <strong>Declarative</strong>.</p>
<h3 id="heading-imperative-approach">🖋️ Imperative Approach</h3>
<p>In the imperative approach, you directly issue commands to the Kubernetes API to create or modify resources. Each command specifies the desired action, and Kubernetes executes it immediately.​</p>
<p>Imagine telling someone, "Turn on the light." You're giving a direct command, and the action happens right away. Similarly, with imperative commands, you instruct Kubernetes step-by-step on what to do.</p>
<p><strong>Example:</strong><br>To create a Pod running an NGINX container, run the command below in the terminal of the master node:</p>
<pre><code class="lang-bash">kubectl run nginx-pod --image=nginx
</code></pre>
<p>Now wait a few seconds and run the command below to check the status of the pod:</p>
<pre><code class="lang-bash">kubectl get pods
</code></pre>
<p>You should get a response similar to this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746087463204/52ef26e5-96df-4d91-8a2d-7527a38786d2.png" alt="Get pods running in the cluster" class="image--center mx-auto" width="465" height="118" loading="lazy"></p>
<p>Now let’s make our Pod reachable over the network by creating a <strong>Service</strong>. Run the command below to expose the Pod:</p>
<pre><code class="lang-bash">kubectl expose pod nginx-pod --<span class="hljs-built_in">type</span>=NodePort --port=80
</code></pre>
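<p>For comparison, the <code>kubectl expose</code> command above is roughly equivalent to applying a Service manifest like the sketch below. It assumes the Pod carries the <code>run: nginx-pod</code> label, which <code>kubectl run</code> adds by default; you can confirm that on your own cluster with <code>kubectl get pods --show-labels</code>.</p>
<pre><code class="lang-yaml"># Sketch of the Service that the expose command generates.
apiVersion: v1
kind: Service
metadata:
  name: nginx-pod            # expose reuses the Pod's name by default
spec:
  type: NodePort             # also opens a port on every node
  selector:
    run: nginx-pod           # assumed label set by kubectl run
  ports:
  - port: 80
    targetPort: 80
</code></pre>
<p>Because no <code>nodePort</code> value is set, Kubernetes picks one for you from its NodePort range.</p>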
<p>To get the IP address of the Service so we can access our Pod, run the command below:</p>
<pre><code class="lang-bash">kubectl get svc
</code></pre>
<p>The command displays the IP address from which we can access our service. You should get an output similar to this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746088678881/a4f3bdbc-c7eb-4696-ba6e-587637be5792.png" alt="Get service IP address" class="image--center mx-auto" width="712" height="140" loading="lazy"></p>
<p>Now, copy the IP address for the <code>nginx-pod</code> service and run the command below to make a request to your Pod:</p>
<pre><code class="lang-bash">curl &lt;YOUR-SERVICE-IP-ADDRESS&gt;
</code></pre>
<p>Replace the <code>&lt;YOUR-SERVICE-IP-ADDRESS&gt;</code> placeholder with the IP address of your <code>nginx-pod</code> service. In my case, it’s <code>10.98.108.173</code>.</p>
<p>You should get a response from your <code>nginx-pod</code> Pod:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746088937046/8b86cd63-21f0-45d3-9ab5-59bd630fb37c.png" alt="Make a request to the Nginx Pod running in the Cluster" class="image--center mx-auto" width="730" height="465" loading="lazy"></p>
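<p>Note that we can’t access the Pod from the internet (that is, from our browser), because our Cluster isn’t connected to a cloud service like AWS or Google Cloud, which could provide us with an external load balancer. The <code>curl</code> request works because it is made from a node inside the cluster.</p>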
<p>Now let’s try doing the same thing but using the Declarative method.</p>
<h3 id="heading-declarative-approach">🚀 Declarative Approach</h3>
<p>So far, we used the imperative approach, where we typed commands like <code>kubectl run</code> or <code>kubectl expose</code> directly into the terminal to make Kubernetes do something immediately.</p>
<p>But Kubernetes has another (and often better) way to do things: the declarative approach.</p>
<h4 id="heading-what-is-the-declarative-approach">🧾 What Is the Declarative Approach?</h4>
<p>Instead of giving Kubernetes instructions step-by-step like a chef in a kitchen, you give it a full recipe – a file that describes exactly what you want (for example, what app to run, how many copies of it, how to expose it, and so on).</p>
<p>This recipe is written in a file called a <strong>manifest</strong>.</p>
<h4 id="heading-whats-a-manifest">📘 What’s a Manifest?</h4>
<p>A manifest is a file (usually written in YAML format) that describes a Kubernetes object – like a Pod, a Deployment, or a Service.</p>
<p>It’s like writing down what you want, handing it over to Kubernetes, and saying: “Hey, please make sure this exists exactly how I described it.”</p>
<p>We’ll use two manifests:</p>
<ol>
<li><p>One to deploy our application</p>
</li>
<li><p>Another to expose it to the internet</p>
</li>
</ol>
<p>Let’s walk through it!</p>
<h4 id="heading-step-1-clone-the-github-repo">📁 Step 1: Clone the GitHub Repo</h4>
<p>We already have a GitHub repo that contains the two manifest files we need. Let’s clone it into our Kubernetes environment.</p>
<p>Run this in the terminal (on your master node):</p>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/onukwilip/simple-kubernetes-app
</code></pre>
<p>Now, let’s go into the folder:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> simple-kubernetes-app
</code></pre>
<p>You should see two files:</p>
<ul>
<li><p><code>deployment.yaml</code></p>
</li>
<li><p><code>service.yaml</code></p>
</li>
</ul>
<h4 id="heading-step-2-understanding-the-deployment-manifest-deploymentyaml">📦 Step 2: Understanding the Deployment Manifest (<code>deployment.yaml</code>)</h4>
<p>This manifest will tell Kubernetes to deploy our app and ensure it’s always running.</p>
<p>Here’s what’s inside:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">nginx-deployment</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">3</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">nginx</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">nginx</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">containers:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">nginx</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">nginx</span>
</code></pre>
<p>Now, let’s break this down:</p>
<ul>
<li><p><code>apiVersion: apps/v1</code>: This tells Kubernetes which version of the API we’re using to define this object.</p>
</li>
<li><p><code>kind: Deployment</code>: This means we’re creating a Deployment (a controller that manages Pods).</p>
</li>
<li><p><code>metadata.name</code>: We’re giving our Deployment a name: <code>nginx-deployment</code>.</p>
</li>
<li><p><code>spec.replicas: 3</code>: We’re telling Kubernetes: “Please run 3 copies (replicas) of this app.”</p>
</li>
<li><p><code>spec.selector.matchLabels</code>: Kubernetes uses this label to find the Pods this Deployment manages. It must match the labels in the Pod template below.</p>
</li>
<li><p><code>spec.template.metadata.labels</code> &amp; <code>spec.template.spec.containers</code>: This section describes the Pods that the Deployment should create. Each Pod gets the label <code>app: nginx</code> and runs a container using the official <code>nginx</code> image.</p>
</li>
</ul>
<p>✅ In plain terms: We're asking Kubernetes to create and maintain 3 copies of an app that runs NGINX, and automatically restart them if any fails.</p>
<h4 id="heading-step-3-understanding-the-service-manifest-serviceyaml">🌐 Step 3: Understanding the Service Manifest (<code>service.yaml</code>)</h4>
<p>This file tells Kubernetes to expose our NGINX app to the outside world using a Service.</p>
<p>Here’s the file – let’s break this down, too:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">nginx-service</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">NodePort</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">nginx</span>
  <span class="hljs-attr">ports:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">protocol:</span> <span class="hljs-string">TCP</span>
    <span class="hljs-attr">port:</span> <span class="hljs-number">80</span>
    <span class="hljs-attr">targetPort:</span> <span class="hljs-number">80</span>
</code></pre>
<ul>
<li><p><code>apiVersion: v1</code>: We’re using version 1 of the Kubernetes API.</p>
</li>
<li><p><code>kind: Service</code>: We’re creating a Service object.</p>
</li>
<li><p><code>metadata.name: nginx-service</code>: Giving it a name.</p>
</li>
<li><p><code>spec.type: NodePort</code>: We’re exposing it through a port on the node (so we can access it via the node's IP address).</p>
</li>
<li><p><code>selector.app: nginx</code>: This tells Kubernetes to connect this Service to Pods with the label <code>app: nginx</code>.</p>
</li>
<li><p><code>ports.port</code> and <code>targetPort</code>: The Service will listen on port 80 and forward traffic to port 80 on the Pod.</p>
</li>
</ul>
<p>✅ In plain terms: This file says, “Expose our NGINX app through the cluster’s network so we can access it from the outside world.”</p>
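<p>One detail worth knowing: the manifest above doesn’t set a <code>nodePort</code>, so Kubernetes picks a free one for us (by default from the 30000-32767 range). If you’d rather choose the port yourself, a variation like the sketch below should work. The value <code>30001</code> is only an assumed example and isn’t part of the repo’s file.</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80         # port the Service listens on inside the cluster
    targetPort: 80   # port the NGINX container listens on
    nodePort: 30001  # assumed example value; must be unused and within the allowed range
</code></pre>
<p>Pinning the port keeps the URL predictable between re-deployments, but leaving it out avoids clashes with ports other Services already use.</p>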
<h4 id="heading-step-4-clean-up-previous-resources">🧹 Step 4: Clean Up Previous Resources</h4>
<p>If you’re still running the Pod and Service we created using the imperative approach, let’s delete them to avoid conflicts:</p>
<pre><code class="lang-bash">kubectl delete pod nginx-pod
kubectl delete service nginx-pod
</code></pre>
<h4 id="heading-step-5-apply-the-manifests">📥 Step 5: Apply the Manifests</h4>
<p>Now let’s deploy the NGINX app and expose it – this time using the <strong>declarative</strong> way.</p>
<p>From inside the <code>simple-kubernetes-app</code> folder, run:</p>
<pre><code class="lang-bash">kubectl apply -f deployment.yaml
</code></pre>
<p>Then:</p>
<pre><code class="lang-bash">kubectl apply -f service.yaml
</code></pre>
<p>This will create the Deployment and the Service described in the files. 🎉</p>
<h4 id="heading-step-6-check-that-its-running">🔍 Step 6: Check That It’s Running</h4>
<p>Let’s see if the Pods were created:</p>
<pre><code class="lang-bash">kubectl get pods
</code></pre>
<p>You should see 3 Pods running!</p>
<p>And let’s check the service:</p>
<pre><code class="lang-bash">kubectl get svc
</code></pre>
<p>Look for the <code>nginx-service</code>. You’ll see something like:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746092825896/617084f1-3a71-4cfd-a287-9f7a9ac08810.png" alt="Access service NodePort" class="image--center mx-auto" width="736" height="181" loading="lazy"></p>
<p>Note the <strong>NodePort</strong> (for example, <code>30001</code>) as we’ll use it to access the app.</p>
<h4 id="heading-step-7-access-the-app">🌍 Step 7: Access the App</h4>
<p>You can now send a request to your app like this:</p>
<pre><code class="lang-bash">curl http://&lt;YOUR-NODE-IP&gt;:&lt;NODE-PORT&gt;
</code></pre>
<blockquote>
<p>Replace <code>&lt;YOUR-NODE-IP&gt;</code> with the IP of your master node (you’ll usually find this in Play With Kubernetes at the top of your terminal), and <code>&lt;NODE-PORT&gt;</code> with the NodePort shown in the <code>kubectl get svc</code> command.</p>
</blockquote>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746092570586/b33cabc0-ea1e-4a70-ab55-9f3a0761bec0.png" alt="Get master node IP address" class="image--center mx-auto" width="897" height="501" loading="lazy"></p>
<p>You should see the HTML content of the NGINX welcome page printed out.</p>
<p>Now terminate the cluster environment by clicking the <strong>CLOSE SESSION</strong> button:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746093081895/79139f75-5e6b-4991-be74-38ecbbf2ef66.png" alt="The CLOSE SESSION button in Play with Kubernetes" class="image--center mx-auto" width="336" height="407" loading="lazy"></p>
<h3 id="heading-why-declarative-is-better-in-most-cases">🆚 Why Declarative Is Better (In Most Cases)</h3>
<ul>
<li><p>🔁 <strong>Reusable</strong>: You can use the same files again and again.</p>
</li>
<li><p>📦 <strong>Version-controlled</strong>: You can push these files to GitHub and track changes over time.</p>
</li>
<li><p>🛠️ <strong>Fixes mistakes easily</strong>: Want to change 3 replicas to 5? Just update the file and re-apply!</p>
</li>
<li><p>🧠 <strong>Easier to maintain</strong>: Especially when you have many resources to manage.</p>
</li>
</ul>
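<p>To make the “3 replicas to 5” example concrete, here’s a sketch of the only part of <code>deployment.yaml</code> that would need to change. Everything else in the file stays as it was.</p>
<pre><code class="lang-yaml">apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 5   # previously 3; this is the only edited line
  # selector and template stay exactly as in the original file
</code></pre>
<p>After saving the full file with this change, running <code>kubectl apply -f deployment.yaml</code> again asks Kubernetes to bring the running state in line with the file, which here means starting two more Pods.</p>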
<h2 id="heading-advantages-of-using-kubernetes-in-business">💼 Advantages of Using Kubernetes in Business</h2>
<p>Kubernetes isn’t just a developer tool—it’s a business enabler as well. It helps companies deliver products faster, more reliably, and with reduced operational overhead.</p>
<p>Let’s break down how Kubernetes translates to real-world business benefits:</p>
<h3 id="heading-1-better-use-of-cloud-resources-cost-savings">1️⃣ <strong>Better Use of Cloud Resources = Cost Savings</strong></h3>
<p>Before Kubernetes, deploying many microservices for a single application often meant creating separate cloud resources (like one Azure App Service per microservice), which could rack up huge costs quickly. Imagine $50/month per service × 10 services = $500/month 😬.</p>
<p><strong>With Kubernetes:</strong><br>You can run multiple microservices on fewer virtual machines (VMs) while Kubernetes automatically decides the most efficient way to use the available servers. That means you pay for fewer servers and get more out of them 💸.</p>
<h3 id="heading-2-high-availability-and-uptime-happy-customers">2️⃣ <strong>High Availability and Uptime = Happy Customers</strong></h3>
<p>Kubernetes watches your apps like a hawk 👀. If one of them crashes or fails, Kubernetes restarts or replaces it <em>immediately</em> – automatically.</p>
<p><strong>For your business:</strong><br>This means less downtime, fewer support tickets, and happier customers who don’t even notice when things go wrong in the background.</p>
<h3 id="heading-3-easy-scaling-during-high-demand">3️⃣ <strong>Easy Scaling During High Demand</strong></h3>
<p>Manually scaling apps during high traffic (like Black Friday) can be a nightmare 😰. And if you don't act fast, customers experience slowness or crashes.</p>
<p><strong>With Kubernetes:</strong><br>You can configure each microservice to automatically scale — meaning it adds more instances of that service <em>only when needed</em> (too many users on your site trying to purchase different products) and scales back down when traffic drops. This ensures your app is always responsive and you only pay for what you use.</p>
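<p>As a rough illustration of what that configuration can look like, here’s a minimal HorizontalPodAutoscaler sketch. It assumes a Deployment named <code>nginx-deployment</code> like the one from the hands-on section, a metrics server running in the cluster, and CPU requests set on the containers. The name <code>nginx-hpa</code> and the numbers are made-up example values.</p>
<pre><code class="lang-yaml">apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa              # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment     # the workload to scale
  minReplicas: 3               # never go below 3 Pods
  maxReplicas: 10              # never go above 10 Pods
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70 # add Pods when average CPU use passes about 70%
</code></pre>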
<h3 id="heading-4-faster-deployment-faster-time-to-market">4️⃣ <strong>Faster Deployment = Faster Time to Market</strong></h3>
<p>Kubernetes supports automation and repeatability. Teams can deploy new features or microservices faster without worrying about infrastructure setup every time.</p>
<p><strong>For business:</strong><br>This means faster product updates, quicker response to market demands, and competitive advantage 🚀.</p>
<h3 id="heading-5-consistent-environments-fewer-bugs">5️⃣ <strong>Consistent Environments = Fewer Bugs</strong></h3>
<p>Each microservice in Kubernetes is containerized, meaning it runs with all its dependencies in a self-contained package. You can run the exact same app setup in:</p>
<ul>
<li><p>Development</p>
</li>
<li><p>Testing</p>
</li>
<li><p>Production</p>
</li>
</ul>
<p>This reduces bugs caused by "it works on my machine" issues 🤦‍♂️ and helps teams build with confidence.</p>
<h3 id="heading-6-vendor-independence-bye-bye-to-vendor-lock-in">6️⃣ <strong>Vendor Independence (Bye-bye to Vendor lock-in)</strong></h3>
<p>When you use cloud-managed services (like AWS Elastic Beanstalk or Azure App Service), it’s often hard to move to another provider because everything is tailored to that specific platform.</p>
<p><strong>With Kubernetes:</strong><br>It works the same way on AWS, Azure, GCP, or even your own data center. This means you can switch cloud providers easily and avoid being locked into one vendor – aka cloud freedom! ☁️🕊️</p>
<h3 id="heading-7-organizational-clarity">7️⃣ <strong>Organizational Clarity</strong></h3>
<p>Kubernetes lets you organize your apps clearly. You can group workloads by:</p>
<ul>
<li><p>Team (for example, Finance, HR)</p>
</li>
<li><p>Environment (for example, testing, staging, production)</p>
</li>
</ul>
<p>This structure helps large teams collaborate better, stay organized, and manage resources efficiently.</p>
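<p>One common way to do this grouping is with namespaces and labels. The sketch below is an assumption about how a team might set it up, and the names <code>finance-staging</code>, <code>team</code>, and <code>environment</code> are hypothetical.</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Namespace
metadata:
  name: finance-staging      # hypothetical: one namespace per team and environment
  labels:
    team: finance            # lets you filter resources by team
    environment: staging     # lets you filter resources by environment
</code></pre>
<p>Workloads created in that namespace stay separate from other teams’ workloads, and the labels make it easy to list them together, for example with <code>kubectl get namespaces -l team=finance</code>.</p>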
<h2 id="heading-disadvantages-of-using-kubernetes">😬 Disadvantages of Using Kubernetes</h2>
<p>Like everything in tech, Kubernetes isn’t all rainbows and rockets 🚀. Just like any other tool, it has its pros and its cons. And it's super important for startup founders, product managers, or even CEOs to know when Kubernetes is the right fit – and when it’s just overkill.</p>
<p>Let’s break down the main disadvantages in a simple, honest way:</p>
<h3 id="heading-1-youll-likely-need-a-devops-engineer-or-team">👨‍🔧 1. You’ll Likely Need a DevOps Engineer or Team</h3>
<p>Kubernetes is powerful, yes. But that power comes with great responsibility 😅.</p>
<p>In simple terms:</p>
<ul>
<li><p>You don't just "click a button" and your app is magically running.</p>
</li>
<li><p>Kubernetes needs someone who understands how to set it up, keep it running, and fix issues when they pop up. This person (or team) is usually called a DevOps Engineer, Site Reliability Engineer, or Cloud Engineer.</p>
</li>
</ul>
<p>Here’s what they’ll typically handle:</p>
<ul>
<li><p>Creating the cluster (the environment where your apps will run)</p>
</li>
<li><p>Defining how your app containers should behave (how many should run, how much memory they need, when they should restart, and so on)</p>
</li>
<li><p>Monitoring the apps and making sure they’re healthy</p>
</li>
<li><p>Ensuring security rules are followed</p>
</li>
<li><p>Handling automated scaling, deployment rollouts, backups, and so on.</p>
</li>
</ul>
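<p>To give a feel for the “how containers should behave” part, here’s a sketch of a Deployment that spells out replica count, memory and CPU needs, and a health check that triggers restarts. It reuses the NGINX example from earlier. The resource numbers and probe timings are assumed values, not recommendations.</p>
<pre><code class="lang-yaml">apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3                    # how many copies should run
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            memory: "64Mi"       # assumed: memory reserved for each container
            cpu: "100m"          # assumed: a tenth of one CPU core
          limits:
            memory: "128Mi"      # assumed: container is killed if it exceeds this
        livenessProbe:           # restart the container when this check keeps failing
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
</code></pre>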
<p>💡 <strong>In short:</strong> You’ll need someone skilled to manage this tool. If you’re a solo founder or a small team with no DevOps experience, Kubernetes might be too much upfront.</p>
<h3 id="heading-2-kubernetes-can-be-expensive-if-used-prematurely">💰 2. Kubernetes Can Be Expensive (If Used Prematurely)</h3>
<p>Kubernetes saves money at scale – but can cost more if you adopt it too early or for the wrong use case.</p>
<p>Here's why:</p>
<ul>
<li><p>Kubernetes is meant for managing multiple applications or microservices. If your business only has one small app, you’re using a rocket to deliver a pizza 🍕 – it’s just not necessary.</p>
</li>
<li><p>Kubernetes is also best when you have high or unpredictable traffic. It can automatically scale up your services when traffic spikes...but if your traffic is steady and small, you won’t benefit much from that power.</p>
</li>
</ul>
<p>Let’s say:</p>
<ul>
<li><p>You have one app with moderate traffic.</p>
</li>
<li><p>You deploy it on Kubernetes (which requires at least 1–2 VMs + setup).</p>
</li>
<li><p>You hire a DevOps engineer to manage it.</p>
</li>
<li><p>You pay for cloud compute + storage + monitoring.</p>
</li>
</ul>
<p>You could end up spending $300–$800/month or more... for something that could’ve been hosted on a simple service like <a target="_blank" href="https://render.com">Render</a>, <a target="_blank" href="https://www.heroku.com">Heroku</a>, or a basic VM for a fraction of the cost.</p>
<p>So when <strong>should</strong> you consider Kubernetes?</p>
<ul>
<li><p>When your platform is made up of multiple services (for example, separate services for user auth, payments, analytics, notifications, and so on)</p>
</li>
<li><p>When you’re expecting traffic spikes (for example, launching in new countries, going viral, seasonal demand like Black Friday)</p>
</li>
<li><p>When you want flexibility in managing your infrastructure across cloud providers (AWS, GCP, Azure) or even on-premises</p>
</li>
</ul>
<h2 id="heading-use-cases-when-and-when-not-to-use-kubernetes">🧭 Use Cases: When (and When Not) to Use Kubernetes</h2>
<p>Kubernetes is an incredibly powerful tool – but it’s not always the right solution from day one.</p>
<p>Let’s break down when it makes sense to use Kubernetes and when it might be overkill 👇</p>
<h3 id="heading-when-you-should-use-kubernetes">✅ When You Should Use Kubernetes</h3>
<p>Kubernetes becomes essential in these scenarios:</p>
<h4 id="heading-1-your-application-is-made-of-many-microservices">1. Your Application Is Made of Many Microservices</h4>
<p>If your app is broken down into multiple microservices – like user authentication, payments, orders, notifications, and more – it’s a good sign that Kubernetes might eventually help.</p>
<p>Kubernetes can:</p>
<ul>
<li><p>Help manage each microservice independently</p>
</li>
<li><p>Automatically scale each one based on demand</p>
</li>
<li><p>Restart failed services automatically</p>
</li>
<li><p>Make it easier to roll out updates to specific parts of the application</p>
</li>
</ul>
<h4 id="heading-2-youre-getting-steady-and-high-traffic">2. You’re Getting <em>Steady and High</em> Traffic</h4>
<p>It’s not just about complexity – it’s about demand.</p>
<p>If your app receives a consistent, high volume of users (like hundreds or thousands every day), and you start seeing signs that your servers are getting overloaded, Kubernetes shines here. It can:</p>
<ul>
<li><p>Automatically increase resources when traffic surges</p>
</li>
<li><p>Balance the load across multiple servers</p>
</li>
<li><p>Prevent downtime due to traffic spikes</p>
</li>
</ul>
<h4 id="heading-3-you-want-portability-and-cloud-independence">3. You Want Portability and Cloud Independence</h4>
<p>If your business doesn’t want to be locked into just one cloud provider (for example, only AWS), Kubernetes gives you flexibility. You can move your application between AWS, GCP, Azure – or even to your own data center – with fewer changes.</p>
<h4 id="heading-4-your-devops-team-is-growing">4. Your DevOps Team Is Growing</h4>
<p>When you have multiple developers or teams working on different parts of the app, Kubernetes helps:</p>
<ul>
<li><p>Organize and isolate workloads per team</p>
</li>
<li><p>Improve collaboration and consistency</p>
</li>
<li><p>Provide easy access control and monitoring</p>
</li>
</ul>
<h3 id="heading-when-you-should-not-use-kubernetes">❌ When You Should Not Use Kubernetes</h3>
<p>Let’s be honest: Kubernetes is not for everyone, especially not at the beginning.</p>
<h4 id="heading-1-you-just-launched-your-app">1. You Just Launched Your App</h4>
<p>In the early days of your product, when you’ve just launched and traffic is still low, Kubernetes is <em>overkill</em>. You don’t need its complexity (yet).</p>
<p>👉 Instead, deploy your app or each microservice on a simple virtual machine (VM). It’s cheaper and faster to get started.</p>
<h4 id="heading-2-you-dont-need-auto-scaling-yet">2. You Don’t Need Auto-scaling (Yet)</h4>
<p>If traffic to your app is still small and manageable, a single server (or a few of them) can easily handle the load. In that case, it’s better to:</p>
<ul>
<li><p>Deploy your microservices manually or with Docker Compose</p>
</li>
<li><p>Monitor and scale manually when needed</p>
</li>
<li><p>Keep things simple until the need for automation becomes obvious</p>
</li>
</ul>
<h4 id="heading-3-you-dont-have-a-devops-team">3. You Don’t Have a DevOps Team</h4>
<p>Kubernetes is powerful – but it needs expertise to set up and maintain. If you don’t have a DevOps engineer or someone who understands Kubernetes, it may cause more problems than it solves.</p>
<p>Hiring a DevOps team can be expensive, and setting up Kubernetes incorrectly can lead to outages, security risks, or wasted resources 💸</p>
<h3 id="heading-when-to-move-to-kubernetes">📈 When to Move to Kubernetes</h3>
<p>So, what’s the best path forward?</p>
<p>Here’s a simple roadmap:</p>
<ol>
<li><p><strong>Start small</strong>: Deploy your app (or microservices) on one or a few VMs</p>
</li>
<li><p><strong>Watch traffic</strong>: As user demand grows, increase VM size or replicate the app manually</p>
</li>
<li><p><strong>Track pain points</strong>: If scaling becomes too manual, or if services crash under load...</p>
</li>
<li><p><strong>Then adopt Kubernetes</strong> 🧠</p>
</li>
</ol>
<p>It’s not about how complex your app is – it’s about when the traffic and growth demand an upgrade in how you manage things.</p>
<h3 id="heading-tldr-for-founders-and-devops-teams">🎯 TL;DR for Founders and DevOps Teams</h3>
<ul>
<li><p>Don’t jump to Kubernetes just because it’s trendy</p>
</li>
<li><p>Use it only when traffic grows steadily and auto-scaling becomes necessary</p>
</li>
<li><p>Kubernetes is most valuable when you want to scale reliably and efficiently</p>
</li>
<li><p>Before that point, stick to simple deployments – it’ll save you time, money, and stress</p>
</li>
</ul>
<h2 id="heading-conclusion">🎉 Conclusion</h2>
<p>Wow! What a journey we’ve been on 😄</p>
<p>We started by answering the big question — <strong>What is Kubernetes?</strong> We discovered that it’s not some mythical beast, but a powerful orchestration tool that helps us manage, deploy, scale, and maintain containerized applications in a smarter way.</p>
<p>Then, we took a step back in time to see how applications were deployed before Kubernetes — the headaches of manually installing software on servers, spinning up separate cloud instances for every microservice, and racking up huge cloud bills just to stay afloat. We also saw how containers simplified things, but even they had their own limitations when managed at scale.</p>
<p>That’s where Kubernetes came to the rescue.</p>
<p>We explored:</p>
<ul>
<li><p><strong>The problems Kubernetes solves</strong> – like auto-scaling, efficient resource management, cost savings, and seamless container grouping.</p>
</li>
<li><p><strong>Kubernetes architecture and components</strong> – breaking down complex terms like the cluster, master node, worker nodes, Pods, Services, Kubelet, and more, into simple, easy-to-digest ideas.</p>
</li>
<li><p><strong>Kubernetes workloads</strong> like Deployments, Pods, Services, DaemonSets, and StatefulSets, and what they do behind the scenes to keep our apps running reliably.</p>
</li>
</ul>
<p>From theory to practice, we even got our hands dirty:</p>
<ul>
<li><p>We created a free Kubernetes cluster using Play with Kubernetes 🧪</p>
</li>
<li><p>Deployed a real application using both imperative (direct command) and declarative (manifest file) approaches</p>
</li>
<li><p>Understood why the declarative method makes our infrastructure easier to manage, especially when our systems grow.</p>
</li>
</ul>
<p>Then we took a business lens 🔍 and looked at:</p>
<ul>
<li><p>The advantages of Kubernetes – from auto-scaling during traffic surges, to cost efficiency, and cloud-agnostic deployment.</p>
</li>
<li><p>And also the disadvantages – like needing experienced DevOps engineers and not being ideal for every stage of a product's lifecycle.</p>
</li>
</ul>
<p>Finally, we wrapped up with real-life use cases, highlighting when Kubernetes is a must-have, and when it’s better to wait – especially for early-stage startups still trying to find their audience.</p>
<p>So, whether you're a DevOps newbie, a startup founder, or just someone curious about how modern tech keeps your favorite apps online – you now have a strong foundational understanding of Kubernetes 🙌</p>
<p>Kubernetes is powerful, but it doesn't have to be overwhelming. With a solid grasp of the basics (which you now have 💪), you're well on your way to managing scalable applications like a pro.</p>
<p>Start simple. Grow smart. And when the time is right – Kubernetes will be your best friend.</p>
<h2 id="heading-study-further"><strong>Study Further 📚</strong></h2>
<p>If you would like to learn more about Kubernetes, you can check out the courses below:</p>
<ul>
<li><p><a target="_blank" href="https://www.udemy.com/course/docker-kubernetes-the-practical-guide/">Docker &amp; Kubernetes: The Practical Guide (Academind - Udemy)</a></p>
</li>
<li><p><a target="_blank" href="https://www.coursera.org/specializations/certified-kubernetes-application-developer-ckad-course">Certified Kubernetes Application Developer (CKAD) Specialization (Coursera)</a></p>
</li>
</ul>
<h2 id="heading-about-the-author"><strong>About the Author 👨‍💻</strong></h2>
<p>Hi, I’m Prince! I’m a DevOps engineer and Cloud architect passionate about building, deploying, and managing scalable applications and sharing knowledge with the tech community.</p>
<p>If you enjoyed this article, you can learn more about me by exploring more of my blogs and projects on my <a target="_blank" href="https://www.linkedin.com/in/prince-onukwili-a82143233/">LinkedIn profile.</a> You can find my <a target="_blank" href="https://www.linkedin.com/in/prince-onukwili-a82143233/details/publications/">LinkedIn articles here</a>. You can also <a target="_blank" href="https://prince-onuk.vercel.app/achievements#articles">visit my website</a> to read more of my articles. Let’s connect and grow together! 😊</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Set Up a Kubernetes Network Policy and Secure Your Cluster ]]>
                </title>
                <description>
                    <![CDATA[ In a Kubernetes environment, proper networking allows for seamless communication between various components within the cluster and the external environment. As your applications grow, networking becomes more and more important and helps ensure that t... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/set-up-kubernetes-network-policy-and-secure-your-cluster/</link>
                <guid isPermaLink="false">67b49d275027eb87a4a1ba21</guid>
                
                    <category>
                        <![CDATA[ Kubernetes ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ clusters ]]>
                    </category>
                
                    <category>
                        <![CDATA[ AWS ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Eti Ijeoma ]]>
                </dc:creator>
                <pubDate>Tue, 18 Feb 2025 14:45:59 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1739889943803/796d97e8-a1c9-41e4-a678-61477514c020.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In a Kubernetes environment, proper networking allows for seamless communication between various components within the cluster and the external environment. As your applications grow, networking becomes more and more important and helps ensure that the application is scalable and secure enough to meet your users’ demands.</p>
<p>Kubernetes networking helps you manage how pods, services, and other external entities interact in this environment to ensure proper connectivity, isolation, and load distribution where necessary. It offers a flexible yet sophisticated networking system that implements fine-grained security controls through Network Policies.</p>
<p>One unique feature of Kubernetes is that it lets you deploy and manage multiple applications at scale within a single cluster. This helps you manage resources efficiently and optimizes costs as your applications run. But this also introduces challenges related to resource isolation and security. This is where proper Kubernetes networking becomes essential.  </p>
<p>In this article, we will discuss the fundamentals of Kubernetes networking and how it facilitates secure connections within a cluster. We will also explore Network Policies as a mechanism for defining rules that regulate pod-to-pod and pod-to-external communication, ensuring fine-grained control over traffic flow within the cluster.</p>
<h3 id="heading-heres-what-well-cover"><strong>Here’s what we’ll cover:</strong></h3>
<ol>
<li><p><a class="post-section-overview" href="#heading-breakdown-of-network-connectivity-types">Breakdown of Network Connectivity Types</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-are-kubernetes-network-policies">What Are Kubernetes Network Policies?</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-how-kubernetes-network-policies-work">How Kubernetes Network Policies Work</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-implement-networking-policies">How to Implement Networking Policies</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-set-up-a-simple-kubernetes-network-policy-on-eks">How to Set Up a Simple Kubernetes Network Policy on EKS</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-when-and-why-to-use-kubernetes-network-policies">When and Why to Use Kubernetes Network Policies</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-best-practices-for-implementing-kubernetes-network-policies-on-your-cluster">Best practices for Implementing Kubernetes Network Policies on your cluster</a></p>
</li>
</ol>
<h2 id="heading-breakdown-of-network-connectivity-types"><strong>Breakdown of Network Connectivity Types</strong></h2>
<p>Kubernetes networking is designed to support four types of connectivity within a cluster. Together, they ensure that containers, pods, and external entities can communicate properly and work together effectively within the Kubernetes infrastructure.</p>
<h3 id="heading-container-to-container-communication"><strong>Container-to-Container Communication</strong></h3>
<p>One of the goals of implementing proper Kubernetes networking is to allow containers within the same pod to communicate directly with each other. Sharing the same networking namespace allows these containers to interact with each other using localhost, resulting in low-latency communication that helps multi-container applications function properly.</p>
<p>Container-to-container communication is useful for workloads with tightly coupled processes that need to communicate with minimal latency within a single pod.</p>
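<p>As a minimal sketch of this idea, the Pod below runs two containers that share one network namespace, so the second container can reach the first over <code>localhost</code>. The names <code>localhost-demo</code>, <code>web</code>, and <code>probe</code> are hypothetical and chosen only for illustration.</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Pod
metadata:
  name: localhost-demo   # hypothetical name
spec:
  containers:
  - name: web            # serves HTTP on port 80
    image: nginx
  - name: probe          # calls the web container through the shared localhost
    image: busybox
    command: ["sh", "-c", "while true; do wget -q -O /dev/null http://localhost:80; sleep 5; done"]
</code></pre>
<p>Because both containers share the pod’s IP address, they also share its port space, so two containers in one pod cannot listen on the same port.</p>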
<h3 id="heading-pod-to-pod-connectivity"><strong>Pod-to-Pod connectivity</strong></h3>
<p>Within a Kubernetes environment, each pod is assigned a unique IP address, which makes pod-to-pod communication simple and straightforward. Unlike traditional networking between servers, Kubernetes removes the complexity of Network Address Translation (NAT), so pods can reach one another directly.</p>
<p>Pod-to-pod communication is the backbone of microservices architecture, allowing each pod to operate independently while remaining connected to others.</p>
<h3 id="heading-pod-to-service-interaction"><strong>Pod-to-Service Interaction</strong></h3>
<p>Services in Kubernetes provide stable endpoints through which pods can reach each other. They ensure that traffic is routed to the right pod, regardless of the complexity of the pod setup. This keeps pod-to-service communication reliable, especially in environments where traffic and pod configurations are constantly changing.</p>
<h3 id="heading-external-to-internal-access"><strong>External-to-Internal Access</strong></h3>
<p>One of the goals of Kubernetes networking is also to manage traffic that comes from outside the cluster. There are several tools, like Ingress Controllers and LoadBalancers, that help handle external-to-internal communication. These tools help ensure that the right application is exposed to end-users, ensuring the proper delivery of services.</p>
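<p>For illustration, an Ingress resource along the following lines could route outside HTTP traffic to a Service inside the cluster. This is a sketch under several assumptions: an ingress controller is already installed, its class is named <code>nginx</code>, and <code>web-ingress</code>, <code>app.example.com</code>, and <code>web-service</code> are hypothetical names.</p>
<pre><code class="lang-yaml">apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress            # hypothetical name
spec:
  ingressClassName: nginx      # assumed: class of the installed ingress controller
  rules:
  - host: app.example.com      # hypothetical hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-service  # hypothetical Service that fronts the application pods
            port:
              number: 80
</code></pre>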
<p>While Kubernetes networking meets the requirements of these goals, communication is usually open-ended by default. This means that pods within a cluster can freely communicate with each other without any restriction. This is not ideal, especially in a production environment where isolation and security are important. This is where Kubernetes Network Policies come into play.</p>
<h2 id="heading-what-are-kubernetes-network-policies"><strong>What Are Kubernetes Network Policies?</strong></h2>
<p>Kubernetes Network Policies give you a way to enforce fine-grained control over the flow of traffic to and from your pods. These policies allow you to define which pods can communicate with each other or with other network endpoints, so they act as a security layer with rules that restrict or allow specific types of traffic.</p>
<p>For example, if certain pods handle sensitive data or information, Network Policies can ensure that only authorized pods or external systems can gain access to it.</p>
<p>Implementing Network Policies also helps your Kubernetes clusters maintain security and compliance by restricting unnecessary communication and reducing traffic flow that could cause a security breach.</p>
<h3 id="heading-how-kubernetes-network-policies-work">How Kubernetes Network Policies Work</h3>
<p>Kubernetes Network Policies provide fine-grained access control within the Kubernetes cluster to manage network traffic at the pod level. Here, you can define separate rules for ingress and egress and restrict traffic to a particular port range.</p>
<p>Multiple Network Policies can target the same pod. Policies are additive: each one contributes "allow" rules, and traffic is permitted if any policy that selects the pod allows it. Once a pod is selected by at least one policy for a given direction, any traffic in that direction that doesn’t match an "allow" rule is blocked.</p>
<p>Network Policies regulate traffic at the IP address and port level, selecting peers by pod label, namespace label, or IP block. This provides control over network flows so that you can meet specific security requirements.</p>
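<p>As a rough illustration of these rules, the sketch below restricts outbound traffic from a set of pods to a single CIDR block and TCP port range. The policy name, the <code>app: application-demo</code> label, the CIDR, and the port numbers are placeholder assumptions rather than values from a real cluster, and the <code>endPort</code> field needs a reasonably recent Kubernetes version and a CNI plugin that supports it:</p>
<pre><code class="lang-bash"># Apply an example policy that limits egress to one CIDR and a TCP port range.
# All names, labels, addresses, and ports here are illustrative placeholders.
kubectl apply -f - &lt;&lt;'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-port-range-example
spec:
  podSelector:
    matchLabels:
      app: application-demo
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 10.0.0.0/24
      ports:
        - protocol: TCP
          port: 32000
          endPort: 32768
EOF
</code></pre>
<p>Because only Egress is listed under <code>policyTypes</code>, inbound traffic to the selected pods is left untouched by this particular policy.</p>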
<p>On the other hand, Network Policies aren’t a complete solution within an environment due to certain limitations. They cannot log blocked traffic events, meaning you cannot observe or debug why and when the Kubernetes Network Policy is blocking specific traffic. To achieve this, you need to use external tools supported by your CNI plugin.</p>
<p>CNI stands for Container Network Interface, a standard interface used by Kubernetes to manage network resources in containers. The CNI plugin is essential for providing container networking capabilities such as IP address allocation, routing, and enforcement of network policies. The plugin also enables the cluster to handle pod networking, including assigning network policies to pods and managing traffic flow between them.</p>
<p>Some popular network plugins include Calico, Cilium, Flannel, and Weave Net. Each offers different features, and they vary in their support for Network Policies.</p>
<h3 id="heading-how-to-implement-networking-policies"><strong>How to Implement Networking Policies</strong></h3>
<p>Properly implementing Network Policies relies on the CNI plugin you’re using in the Kubernetes Cluster. For Network Policies to take effect, the CNI plugin configured on your cluster must support them.</p>
<p>Managed Kubernetes services such as Amazon EKS, Microsoft Azure AKS, and Google Cloud GKE offer Network Policy support, although you may need to enable it or install a compatible plugin. If you manage your own cluster, you need to ensure that your CNI plugin is compatible. For example, the popular CNI plugin Flannel doesn’t support Network Policies, whereas Calico does.</p>
<h2 id="heading-how-to-set-up-a-simple-kubernetes-network-policy-on-eks"><strong>How to Set Up a Simple Kubernetes Network Policy on EKS</strong></h2>
<h3 id="heading-prerequisites"><strong>Prerequisites</strong></h3>
<p>Ensure you have the following installed on your Ubuntu Server:</p>
<ul>
<li><h4 id="heading-aws-cli-for-authentication-and-interactions-with-aws-resources"><code>AWS CLI</code>: For authentication and interactions with AWS resources</h4>
</li>
<li><h4 id="heading-kubectl-kubernetes-cli"><code>kubectl</code>: Kubernetes CLI</h4>
</li>
<li><h4 id="heading-eksctl-this-is-a-cli-for-managing-eks-clusters"><code>eksctl</code>: This is a CLI for managing EKS clusters</h4>
</li>
</ul>
<h3 id="heading-steps"><strong>Steps</strong></h3>
<p>First, create your AWS EKS cluster using the following CLI commands:</p>
<pre><code class="lang-bash">eksctl create cluster \
  --name my-eks-cluster \
  --region us-east-1 \
  --nodegroup-name ng-eks \
  --node-type t3.medium \
  --nodes 3 \
  --nodes-min 2 \
  --nodes-max 4 \
  --with-oidc \
  --version 1.31
</code></pre>
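<p>Once the cluster has been created, a quick sanity check might look like the sketch below. It reuses the cluster name and region from the command above; <code>eksctl</code> normally updates your kubeconfig on its own, so the first command is only a fallback:</p>
<pre><code class="lang-bash"># Point kubectl at the new cluster (usually already done by eksctl)
aws eks update-kubeconfig --region us-east-1 --name my-eks-cluster

# List the worker nodes; they should eventually report a Ready status
kubectl get nodes
</code></pre>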
<p>Next, make sure that the Amazon VPC CNI plugin, which manages pod networking on EKS, is installed and running in your cluster.</p>
<p>Check the status like this:</p>
<pre><code class="lang-bash">kubectl get pods -n kube-system | grep aws-node
</code></pre>
<p>If it’s not running, deploy or update it on the cluster:</p>
<pre><code class="lang-bash">kubectl apply -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/master/config/v1.12/aws-k8s-cni.yaml
</code></pre>
<p>By default, the Amazon VPC CNI does not enforce Network Policies. So we’ll install Calico, which can run alongside the VPC CNI to enforce them.</p>
<pre><code class="lang-bash">kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml
</code></pre>
<p>Confirm that Calico is installed and running like this:</p>
<pre><code class="lang-bash">kubectl get pods -n kube-system | grep calico
</code></pre>
<p>Now that we have set up Calico on our AWS EKS Cluster, let's examine various Kubernetes Network Policies that we can apply to it.</p>
<h3 id="heading-allow-all-traffic-to-a-specific-pod-in-the-cluster">Allow all traffic to a specific pod in the cluster:</h3>
<pre><code class="lang-bash">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: pod-network-policy
spec:
  podSelector:
    matchLabels:
      app: application-demo
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - {}
  egress:
    - {}
</code></pre>
<p>This configuration defines a NetworkPolicy named <code>pod-network-policy</code> that applies to all pods with the label <code>app: application-demo</code>. The <code>podSelector</code> ensures that only the pods with this label are targeted.</p>
<ul>
<li><p>The <code>policyTypes</code> field indicates that this policy controls both <strong>Ingress</strong> (incoming traffic) and <strong>Egress</strong> (outgoing traffic).</p>
</li>
<li><p>The <code>ingress</code> and <code>egress</code> rules are defined with empty braces <code>{}</code>, meaning no restrictions are applied—all traffic is allowed, both inbound and outbound.</p>
</li>
</ul>
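<p>To try this out, one possible workflow is to save the manifest to a file, apply it, and then inspect the result. The file name <code>allow-all.yaml</code> is a hypothetical choice, and the commands assume the policy is created in your current namespace:</p>
<pre><code class="lang-bash"># Apply the policy from a local file (hypothetical file name)
kubectl apply -f allow-all.yaml

# List the Network Policies in the current namespace
kubectl get networkpolicy

# Show the pod selector and rules of the policy defined above
kubectl describe networkpolicy pod-network-policy
</code></pre>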
<h3 id="heading-deny-all-traffic-to-a-pod-in-the-cluster">Deny all traffic to a pod in the cluster:</h3>
<pre><code class="lang-bash">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: pod-network-policy
spec:
  podSelector:
    matchLabels:
      app: application-demo
  policyTypes:
    - Ingress
    - Egress
</code></pre>
<p>This configuration also selects pods labeled <code>app: application-demo</code> and applies the policy to both Ingress and Egress traffic.</p>
<p>Because the policy lists both policy types but defines no rules for either, all traffic to and from the selected pods is denied. This pattern is known as a "default deny" policy. It enforces strict isolation, preventing pods from communicating with others unless additional policies explicitly allow it.</p>
<h3 id="heading-deny-all-ingress-traffic-to-the-pods-in-the-cluster">Deny all ingress traffic to the pods in the cluster</h3>
<pre><code class="lang-bash">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: pod-network-policy
spec:
  podSelector: {}
  policyTypes:
    - Ingress
</code></pre>
<p>This configuration applies a NetworkPolicy to all pods in the namespace.</p>
<ul>
<li><p>The <code>podSelector</code> is empty (<code>{}</code>), meaning the policy applies to all pods in the namespace, regardless of their labels.</p>
</li>
<li><p>The <code>policyTypes</code> field specifies that the policy only applies to Ingress traffic.</p>
</li>
<li><p>Since no explicit Ingress rules are defined, all incoming traffic to the selected pods is blocked.</p>
</li>
</ul>
<h3 id="heading-deny-all-egress-traffic-to-the-pods-in-the-cluster">Deny all egress traffic from the pods in the cluster</h3>
<pre><code class="lang-bash">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: pod-network-policy
spec:
  podSelector: {}
  policyTypes:
    - Egress
</code></pre>
<p>In the configuration above:</p>
<ul>
<li><p>The <code>podSelector</code> is empty (<code>{}</code>), meaning the policy applies to all pods in the namespace.</p>
</li>
<li><p>The <code>policyTypes</code> field specifies that this policy only applies to Egress traffic.</p>
</li>
<li><p>Since no explicit Egress rules are defined, Kubernetes blocks all outgoing traffic for the target pods by default.</p>
</li>
</ul>
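<p>One practical side effect worth keeping in mind is that a blanket egress deny also blocks DNS lookups, so pods can no longer resolve Service names. A common companion policy re-allows DNS traffic only. The sketch below assumes that cluster DNS runs in the <code>kube-system</code> namespace with the label <code>k8s-app: kube-dns</code>, which is true for many clusters but worth checking on yours; the policy name is a placeholder:</p>
<pre><code class="lang-bash"># Allow every pod in the current namespace to reach cluster DNS on port 53.
# The namespace and pod labels are assumptions about where DNS runs.
kubectl apply -f - &lt;&lt;'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
EOF
</code></pre>
<p>Since policies are additive, this one can sit alongside the egress deny policy: everything except DNS stays blocked.</p>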
<h2 id="heading-when-and-why-to-use-kubernetes-network-policies">When and Why to Use Kubernetes Network Policies</h2>
<p>There are various use cases for implementing Kubernetes Network Policies to improve cluster security.</p>
<p>For example, perhaps you want to restrict who/what can access the database. If you have a database deployed within the cluster, Kubernetes Network Policies ensure that only authorized pods can communicate with it, blocking access from unauthorized applications within the cluster.</p>
<p>Or perhaps you want to isolate sensitive pods. Properly implementing Network Policies helps isolate sensitive pods that do not need to accept inbound traffic from other pods, strengthening security within the infrastructure.</p>
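<p>The database scenario could be sketched roughly as follows. The labels <code>app: database</code> and <code>app: backend</code>, the policy name, and port 5432 (the PostgreSQL default) are all hypothetical and would need to match your own workloads:</p>
<pre><code class="lang-bash"># Allow only pods labeled app=backend to reach the database pods on TCP 5432.
# Labels, policy name, and port are illustrative assumptions.
kubectl apply -f - &lt;&lt;'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-backend
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: backend
      ports:
        - protocol: TCP
          port: 5432
EOF
</code></pre>
<p>Any other pod in the namespace that tries to connect to the database pods would then be blocked, because the database pods are now selected by an Ingress policy that doesn’t list it.</p>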
<h2 id="heading-best-practices-for-implementing-kubernetes-network-policies-on-your-cluster"><strong>Best Practices for Implementing Kubernetes Network Policies on Your Cluster</strong></h2>
<p>To maximize the effectiveness and security benefits of your Kubernetes Network Policies, keep these best practices in mind:</p>
<ul>
<li><p><strong>Ensure all pods are covered by a Network Policy:</strong> In an ideal production environment, all pods within the cluster should be covered by a Network Policy that limits their traffic to only the Ingress and Egress targets set in the configuration. Without such a policy in place, pods can communicate freely, posing a significant security risk.</p>
</li>
<li><p><strong>Complement Network Policies with other security measures</strong>: While Network Policies are essential for Network isolation, they should be part of a wider security and networking strategy for your cluster. Additional safeguards should include Role-Based Access Control, which restricts unauthorized users from accessing or modifying the pod configurations, and advanced security contexts, which limit container capabilities.</p>
</li>
<li><p><strong>Always test Network Policies before deploying to production.</strong> Kubernetes Network Policies can be tricky to validate, especially in a production environment, because a mistake can cut off traffic that running workloads depend on. Always test new policies to ensure that they work as intended within the cluster. For example, after applying a new policy in a test environment, use tools like curl or ping to verify that the connections you expect to be blocked really are blocked.</p>
</li>
<li><p><strong>Always review your Network Policies as the cluster grows.</strong> As your cluster grows with the increasing user base and engineering needs, your Network Policies must always reflect new workloads, such as Pods and Namespaces. It is always best to review and update your Network Policies to stay relevant and ensure your environment is secure. </p>
</li>
<li><p><strong>Use precise target selectors for the configurations:</strong> Be specific when defining pod selectors, namespace selectors, and ipBlock ranges within your Network Policies. For example, if you are working with namespace selectors, ensure that every pod within the selected namespace conforms to the policy’s security goals. Avoid namespace selectors when only some pods in a namespace should be allowed to communicate, and combine them with pod selectors instead. Vague namespace or pod selectors can match more pods than you intended, leading to unintended access.</p>
</li>
</ul>
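<p>The testing advice above can be sketched with a throwaway pod. Here <code>np-test</code> is an arbitrary pod name and <code>application-demo</code> is a hypothetical Service name, so substitute a Service that actually exists in your namespace:</p>
<pre><code class="lang-bash"># Start a temporary pod and try to reach a Service the policy should block.
# "application-demo" is a hypothetical Service name; -T 2 gives up after 2 seconds.
kubectl run np-test --rm -it --restart=Never --image=busybox:1.36 -- \
  wget -qO- -T 2 http://application-demo

# A timeout suggests the policy is blocking the traffic;
# an HTTP response means the connection is still allowed.
</code></pre>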
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>In this article, you learned about Kubernetes Network Policies as a way to manage and restrict communication between pods. Since pods don’t have network isolation by default, setting up the right policies is important for security.</p>
<p>While Network Policies play an important role, it is also important to protect your Cloud environment by ensuring your infrastructure is hardened – so make sure you also implement RBAC and regular vulnerability scans. You should also allocate only needed pod resources, build minimal base images for the pods, and follow Kubernetes security best practices in general.</p>
<p>By doing this, you can achieve end-to-end protection for your workloads.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Containerize a Node.js Application Using Docker – A Beginner's Guide ]]>
                </title>
                <description>
                    <![CDATA[ Over the years, applications and tools have become more complex to keep up with people’s changing requirements and expectations. But this can create issues of code compatibility and remote access. For example, a codebase that functions properly on Wi... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/containerize-a-nodejs-application-using-docker/</link>
                <guid isPermaLink="false">6793a775498f1e108e0ba05d</guid>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Node.js ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Dockerfile ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Oluwatobi ]]>
                </dc:creator>
                <pubDate>Fri, 24 Jan 2025 14:45:09 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1737681497302/0540f730-f1c3-496c-bd47-912fdc95d468.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Over the years, applications and tools have become more complex to keep up with people’s changing requirements and expectations. But this can create issues of code compatibility and remote access. For example, a codebase that functions properly on Windows may develop compatibility errors when installed on Linux.</p>
<p>Fortunately, Docker comes to the rescue. But you might be wondering – what is Docker, and how does it help? You’ll learn all this and more in this tutorial.</p>
<p>But before we start, here are some prerequisites:</p>
<ul>
<li><p>Knowledge of Linux commands</p>
</li>
<li><p>Knowledge of terminal usage</p>
</li>
<li><p>Knowledge of Node.js and Express.js</p>
</li>
</ul>
<h2 id="heading-table-of-contents">Table Of Contents</h2>
<ol>
<li><p><a class="post-section-overview" href="#heading-what-is-docker">What is Docker?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-install-docker">How to Install Docker</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-demo-project-how-to-containerize-a-nodejs-application">Demo Project: How to Containerize a Node.js Application</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-wrapping-up">Wrapping Up</a></p>
</li>
</ol>
<h2 id="heading-what-is-docker">What is Docker?</h2>
<p>Docker is an open-source tool that makes it easy to run software in a consistent way, no matter where you are. It does this by putting your application and everything it needs (like libraries and settings) into a container (which I’ll discuss more in a moment).</p>
<p>Think of a container like a box: it holds your app and all its parts, so it works exactly the same on your laptop, a server, or in the cloud. Docker helps developers avoid the "it works on my machine" problem by ensuring everything is packaged together in a reliable and portable way.</p>
<p>Docker was created by Solomon Hykes in 2013. Over the years, it has evolved to cover a wide range of tools. It’s become a go-to tool for improving the application deployment and networking processes.</p>
<p>Before we proceed, here are some key terms you will come across as we go through this tutorial:</p>
<h3 id="heading-docker-engine"><strong>Docker Engine</strong></h3>
<p>The Docker engine, as its name implies, is the powerhouse for Docker applications. It has a client and a server component. The Docker client, in our case, is the command-line interface tool or Docker terminal we’ll be using to send relevant commands for project execution. The Docker server, popularly known as the daemon, is the server that handles running the various Docker images and containers.</p>
<h3 id="heading-docker-image"><strong>Docker Image</strong></h3>
<p>Docker images are premade templates of executable software and systems. Docker offers a wide range of images, from operating system templates to server templates, software templates, and so on. You can find all of these on Docker Hub, the registry where these images are stored.</p>
<p>You can also build your own image and host it either publicly on Docker Hub or in a private registry.</p>
<h3 id="heading-docker-containers"><strong>Docker Containers</strong></h3>
<p>Docker containers are running instances created from a Docker image, which serves as their template. They’re lightweight, portable packages that include everything needed to run a piece of software: code, runtime, libraries, and system tools. A container ensures the application runs consistently regardless of the environment.</p>
<h3 id="heading-benefits-of-using-docker">Benefits of Using Docker</h3>
<p>Here are some of the benefits of using Docker as a backend developer:</p>
<ul>
<li><p>Docker is a great tool for creating a solid DevOps culture for application development, as it clarifies the functions of the development and operations teams.</p>
</li>
<li><p>It’s also quite flexible, allowing for easy deployment of microservices and distributed monolithic backend applications.</p>
</li>
<li><p>It also minimizes errors from dependency misconfigurations during installations as it ports the app with its necessary dependencies all at once.</p>
</li>
</ul>
<p>Moving on, we will dive into how to Dockerize a Node.js Express application. But before that, you’ll need to install Docker on your computer. You can skip this if you already have it installed.</p>
<h2 id="heading-how-to-install-docker">How to Install Docker</h2>
<p>Docker is a cross-platform tool that can be installed on all popular operating systems (Windows, macOS, and Linux distros). For this tutorial, I’ll only be highlighting how to set up Docker on Windows.</p>
<p>If you’re currently using any OS other than Windows, you can easily set Docker up by following the steps in the Docker documentation <a target="_blank" href="https://docs.docker.com/engine/install/">here</a>.</p>
<p>For Windows users, it is essential that your PC meets the minimum specifications, otherwise the installation won't be successful. The minimum requirements are the following:</p>
<ul>
<li><p>Windows 10 Home or a later version of Windows</p>
</li>
<li><p>A PC with WSL 2 installed or Hyper-V enabled</p>
</li>
</ul>
<p>With that, let's move on to downloading the Docker installer executable. You can download the latest Docker installer from <a target="_blank" href="https://www.docker.com/products/docker-desktop/">here</a>. After you do that, run the software and accept the terms and conditions. On successful completion, launch the application. This is what you should see:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737154696376/dcbf3b23-10cc-452a-b206-46973163e8d6.png" alt="Docker desktop GUI" class="image--center mx-auto" width="1872" height="616" loading="lazy"></p>
<p>To confirm that you’ve successfully installed the application, open the command prompt and run <code>docker --version</code>. If the installation was successful, you should see the exact version of Docker you’ve installed.</p>
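<p>If you’d like a slightly stronger check than the version number, one option is to run Docker’s small <code>hello-world</code> test image, which confirms that the daemon can pull and start containers. This assumes Docker Desktop is running and has internet access:</p>
<pre><code class="lang-bash"># Print the installed Docker version
docker --version

# Pull and run Docker's test image; it prints a short greeting and exits
docker run hello-world
</code></pre>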
<p>We’ll now move on to the project proper.</p>
<h2 id="heading-demo-project-how-to-containerize-a-nodejs-application">Demo Project: How to Containerize a Node.js Application</h2>
<p>In this section, we will containerize a simple Node.js-based backend service with minimal dependencies. This will show you how to containerize and port an application using a <strong>Dockerfile</strong>. Keep in mind that if you have a more complex application, it may be better to use the <a target="_blank" href="https://www.freecodecamp.org/news/what-is-docker-compose-how-to-use-it/"><strong>Docker compose YAML tool</strong></a>.</p>
<p>To begin with, we will set up the sample Node.js application. I’ll provide the entire code setup in this article, below. But first, let’s understand what a <strong>Dockerfile</strong> is.</p>
<h3 id="heading-what-is-a-dockerfile">What is a Dockerfile?</h3>
<p>Basically, a Dockerfile is a template system which allows the user to input commands which, when executed, can produce a functional image of the application. This image can then be converted into a container.</p>
<p>Here are some commands included in the basic structure of a Dockerfile:</p>
<ul>
<li><p><code>CMD</code><strong>:</strong> sets the default command to run if no command is specified when the container starts. It can be overridden by providing a command when running the container (<code>docker run ...</code>).</p>
</li>
<li><p><code>ENTRYPOINT</code><strong>:</strong> Specifies the main command that always runs when the container starts. It’s not easily overridden, but arguments can be appended.<br>  <strong>Note</strong> that <code>CMD</code> and <code>ENTRYPOINT</code> both specify what command or process the container should run when it starts. But they’re used differently and have distinct purposes. Use <code>CMD</code> for default behavior that can be overridden. Use <code>ENTRYPOINT</code> for a fixed command that defines the container's primary purpose.</p>
</li>
<li><p><code>FROM</code><strong>:</strong> This is usually the opening statement in a Dockerfile. This instruction fetches a base image which forms the foundation for building the image of the application in question. For instance, the base image for our Node.js application is one that already has the Node.js runtime installed.</p>
</li>
<li><p><code>WORKDIR</code><strong>:</strong> This syntax defines the active working directory where the application files will live within the defined container. An automatic folder will be created if it’s not already available.</p>
</li>
<li><p><code>COPY</code><strong>:</strong> This instruction copies files from your project folder into the image being built. You specify both the source path on your machine and the destination path inside the image.</p>
</li>
<li><p><code>RUN</code><strong>:</strong> This instruction executes a command while the image is being built, for example to install dependencies.</p>
</li>
<li><p><code>ENV</code><strong>:</strong> This syntax is used to highlight environmental variables and secrets which will be invoked during the process of running the application.</p>
</li>
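<li><p><code>EXPOSE</code><strong>:</strong> This instruction documents the port on which the application inside the container listens. For example, <code>EXPOSE 3000</code> declares that the application listens on port 3000. It does not publish the port by itself: to reach the application at <code>localhost:3000</code>, you also need to publish the port when starting the container, for example with <code>docker run -p 3000:3000</code>.</p>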
</li>
</ul>
<p>Diving deeper into Docker, let’s quickly go over some key Docker commands we’ll be using throughout this tutorial:</p>
<ul>
<li><p><code>docker ps</code><strong>:</strong> This command lists all the containers currently running on your machine.</p>
</li>
<li><p><code>docker run</code><strong>:</strong> This command starts a new container from a Docker image.</p>
</li>
<li><p><code>docker build</code><strong>:</strong> This command uses the Dockerfile to generate an image of a service or application.</p>
</li>
<li><p><code>docker rm</code><strong>:</strong> This command deletes a container, identified by its name or ID. To delete an image, use <code>docker rmi</code> instead.</p>
</li>
</ul>
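<p>Taken together, these commands form a typical build, run, and clean-up cycle, sketched below. The image name <code>nodeapp</code> matches the one used later in this tutorial, while the container name <code>nodeapp-demo</code> is a hypothetical choice. The sketch also uses <code>docker stop</code> and the <code>-d</code> and <code>--name</code> options of <code>docker run</code>:</p>
<pre><code class="lang-bash"># Build an image from the Dockerfile in the current directory
docker build -t nodeapp .

# Start a container from that image in the background
docker run -d --name nodeapp-demo nodeapp

# List the running containers
docker ps

# Stop and remove the container, then remove the image
docker stop nodeapp-demo
docker rm nodeapp-demo
docker rmi nodeapp
</code></pre>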
<h3 id="heading-how-to-containerize-the-app">How to Containerize the App</h3>
<p>Now we can start containerizing our simple Node/Express application. To follow along with the tutorial, you can get the base code from <a target="_blank" href="https://github.com/oluwatobi2001/Typescript_test">here</a>.</p>
<p>On testing it locally, it returns a CRUD API where you can create, fetch, update, and delete products when executed. We’ll package the application for easy deployment on the cloud using our Docker engine. We’ll be able to do this using the Dockerfile tool we discussed above.</p>
<h4 id="heading-step-1-create-the-dockerfile">Step 1: Create the Dockerfile</h4>
<p>In your project folder, create a file named <code>Dockerfile</code>. Make sure the name is <strong>exactly</strong> "Dockerfile" (no extension, and case-sensitive in some systems – so make sure it’s capitalized).</p>
<p>If you're using a code editor, simply create a new file named <code>Dockerfile</code>. If you're using a basic text editor, save the file with the name <code>Dockerfile</code> and ensure it doesn’t accidentally save with an extension like <code>.txt</code>.</p>
<p>Then enter the first line:</p>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">FROM</span> node:<span class="hljs-number">18</span>-alpine
</code></pre>
<p>This command fetches the base image we’ll use to power our Express application which is the Node engine itself.</p>
<p>You might be wondering what the <code>alpine</code> is for. The <code>alpine</code> variant of the image is built on Alpine Linux, a minimal Linux distribution, so it is much smaller than the default image. It leaves out packages that aren’t essential to the base operating system. Using lightweight base images is generally considered good practice because they download faster and take up less space.</p>
<h4 id="heading-step-2-set-the-working-directory">Step 2: Set the working directory</h4>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">WORKDIR</span><span class="bash"> /app</span>
</code></pre>
<p>This sets the working directory of the image to the <code>/app</code> folder of the container. It makes sure that all file actions occur here and all files are copied into this directory.</p>
<h4 id="heading-step-3-copy-the-necessary-files">Step 3: Copy the necessary files</h4>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">COPY</span><span class="bash"> package.json ./</span>
</code></pre>
<p>This command copies the <code>package.json</code> file, which lists the dependencies and packages our application needs, into the working directory of the image.</p>
<h4 id="heading-step-4-execute-a-setup-script">Step 4: Execute a setup script</h4>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">RUN</span><span class="bash"> npm install</span>
</code></pre>
<p>This command ensures that all the necessary dependencies to power our Node.js applications are installed on the container.</p>
<h4 id="heading-step-5-copy-the-code-files">Step 5: Copy the code files</h4>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">COPY</span><span class="bash"> . .</span>
</code></pre>
<p>This command ensures that all the files within the local directory get copied into the container file system within the established working directory.</p>
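<p>Since <code>COPY . .</code> copies everything in the project folder, including any local <code>node_modules</code> directory, many projects add a <code>.dockerignore</code> file next to the Dockerfile. The sketch below creates one from a Linux-style shell. The listed entries are common choices rather than something the sample project requires, and excluding <code>.env</code> assumes you pass configuration at run time instead:</p>
<pre><code class="lang-bash"># Create a .dockerignore so that COPY . . skips these paths
cat &gt; .dockerignore &lt;&lt;'EOF'
node_modules
npm-debug.log
.git
.env
EOF
</code></pre>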
<h4 id="heading-step-6-expose-the-server-port">Step 6: Expose the server port</h4>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">EXPOSE</span> <span class="hljs-number">3000</span>
</code></pre>
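<p>This instruction declares the port the server listens on inside the container, in this case port 3000. On its own it does not publish the port to your host machine; that is done with the <code>-p</code> flag of <code>docker run</code>.</p>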
<h4 id="heading-step-7-include-the-command-to-bring-the-container-to-life">Step 7: Include the command to bring the container to life</h4>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">CMD</span><span class="bash"> [<span class="hljs-string">"npm"</span>, <span class="hljs-string">"run"</span>, <span class="hljs-string">"dev"</span>]</span>
</code></pre>
<p>This command is executed at the end, when the container starts, to launch the Node.js application. It simply runs the <code>npm run dev</code> command, which is what you’d use for a development environment. To run it in a production environment, you’d use the <code>npm start</code> command instead.</p>
<p>Having completed this process, here is how the final Dockerfile structure should look:</p>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">FROM</span> node:<span class="hljs-number">18</span>-alpine
<span class="hljs-keyword">WORKDIR</span><span class="bash"> /app</span>

<span class="hljs-keyword">COPY</span><span class="bash"> package.json ./</span>

<span class="hljs-keyword">RUN</span><span class="bash"> npm install</span>

<span class="hljs-keyword">COPY</span><span class="bash"> . .</span>

<span class="hljs-keyword">EXPOSE</span> <span class="hljs-number">3000</span>

<span class="hljs-keyword">CMD</span><span class="bash"> [<span class="hljs-string">"npm"</span>, <span class="hljs-string">"run"</span>, <span class="hljs-string">"dev"</span>]</span>
</code></pre>
<h3 id="heading-testing-the-docker-container">Testing the Docker container</h3>
<p>To round it up, we will be creating a Docker image of our Node.js application. To do this, execute the command <code>docker build -t nodeapp .</code> . The <code>docker build</code> command builds the image, while the <code>-t</code> allows for specifying the image tag’s details.</p>
<p>In our case, we’re assigning the name <code>nodeapp</code> to the image. The trailing <code>.</code> tells Docker to use the current directory as the build context, which is where it looks for the Dockerfile and the files to copy.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737154702142/98e05981-bb05-41c6-919f-02b3261f3caa.png" alt="This image runs the docker build command" class="image--center mx-auto" width="1517" height="681" loading="lazy"></p>
<p>Congratulations! You have successfully built your first Docker image. To see all the images on your local repo, execute the command <code>docker images</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737154714828/71f50b4f-8df5-4885-a5fc-6365dd903645.png" alt="A image showing the docker images command being executed and the list of all the images available locally" class="image--center mx-auto" width="865" height="103" loading="lazy"></p>
<p>In order to create a working instance of your image for testing, execute the command <code>docker run nodeapp</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737154708130/bb6968f2-829d-4107-be82-4bdd9c167d53.png" alt="Executing a running instance of our docker image" class="image--center mx-auto" width="1122" height="457" loading="lazy"></p>
<p>We’re using MongoDB as our database for this tutorial, so we’ll need to pass the MongoDB URL as an environment variable to the Docker container. Environment variables let you keep sensitive values out of the source code and the image, where they shouldn’t be exposed to the public. Other values commonly passed as environment variables include API keys and encryption keys.</p>
<p>To pass the MongoDB URL to the Docker container, we use the <code>-e</code> flag, which tells Docker to set the value that follows as an environment variable inside the container.</p>
<p><code>docker run -e JWT_SECRETS={enter the value of your choice} -e MONGO_URL={The mongo url of your choice} nodeapp</code>.</p>
<p>To also use the container in the background, just attach the <code>-d</code> tag which represents the detach option. This option allows the container to run in the background despite exiting the command line terminal.</p>
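<p>If there are no errors, navigating to <code>localhost:5000</code> should produce something similar to the image below. This assumes the app listens on port 5000 inside the container and that the port is published to the host, for example by adding <code>-p 5000:5000</code> to the <code>docker run</code> command.</p>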
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737506281699/54bb1d9b-0be7-42e3-b212-bb4bd27e019d.png" alt="Postman testing the localhost:5000 " class="image--center mx-auto" width="956" height="476" loading="lazy"></p>
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>In this article, you learned about what Docker is and how it works, along with its common commands and how to use it to containerize a backend application. Moving on from the basics, you can also explore other uses of Docker in continuous integration and development. To learn more about Docker, you can check out its documentation <a target="_blank" href="https://docs.docker.com/">here</a>.</p>
<p>I would also recommend using your new knowledge to deploy projects with real-life use cases, as well as exploring networking in Docker applications. To make your app live, you can easily deploy the Docker image you created to any of the popular cloud service providers like AWS, GCP, Azure, and so on.</p>
<p>Feel free to ask me any questions! You can also check out my other articles <a target="_blank" href="http://portfolio-121.netlify.app">here</a>. Till next time, keep on coding!</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Run Integration Tests with GitHub Service Containers ]]>
                </title>
                <description>
                    <![CDATA[ Recently, I published an article about using Testcontainers to emulate external dependencies like a database and cache for backend integration tests. That article also explained the different ways of running the integration tests, environment scaffol... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-run-integration-tests-with-github-service-containers/</link>
                <guid isPermaLink="false">677d8125f9b13835118c7958</guid>
                
                    <category>
                        <![CDATA[ GitHub ]]>
                    </category>
                
                    <category>
                        <![CDATA[ github-actions ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Devops ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Testing ]]>
                    </category>
                
                    <category>
                        <![CDATA[ CI/CD ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Alex Pliutau ]]>
                </dc:creator>
                <pubDate>Tue, 07 Jan 2025 19:31:49 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1735305764768/8e3d8980-456b-4828-abb7-dff749bbf1fd.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Recently, I published an <a target="_blank" href="https://www.freecodecamp.org/news/integration-tests-using-testcontainers/"><strong>article</strong></a> about using <a target="_blank" href="https://testcontainers.com/"><strong>Testcontainers</strong></a> to emulate external dependencies like a database and cache for backend integration tests. That article also explained the different ways of running the integration tests, environment scaffolding, and their pros and cons.</p>
<p>In this article, I want to show another alternative in case you use GitHub Actions as your CI platform (one of the most popular CI/CD solutions at the moment). This alternative is called <a target="_blank" href="https://docs.github.com/en/actions/use-cases-and-examples/using-containerized-services/about-service-containers"><strong>Service Containers</strong></a>, and I’ve realized that not many developers seem to know about it.</p>
<p>In this hands-on tutorial, I’ll demonstrate how to create a GitHub Actions workflow for integration tests with external dependencies (MongoDB and Redis) using the <a target="_blank" href="https://github.com/plutov/packagemain/tree/master/testcontainers-demo">demo Go application</a> we created in that previous tutorial. We’ll also review the pros and cons of GitHub Service Containers.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<ul>
<li><p>A basic understanding of GitHub Actions workflows.</p>
</li>
<li><p>Familiarity with Docker containers.</p>
</li>
<li><p>Basic knowledge of the Go toolchain.</p>
</li>
</ul>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a class="post-section-overview" href="#heading-what-are-service-containers">What are Service Containers?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-why-not-docker-compose">Why not Docker Compose?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-job-runtime">Job Runtime</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-readiness-healthcheck">Readiness Healthcheck</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-private-container-registries">Private Container Registries</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-sharing-data-between-services">Sharing Data Between Services</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-project-setup">Project Setup</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-golang-integration-tests">Golang Integration Tests</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-personal-experience-amp-limitations">Personal Experience &amp; Limitations</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-resources">Resources</a></p>
</li>
</ul>
<h2 id="heading-what-are-service-containers">What are Service Containers?</h2>
<p>Service Containers are Docker containers that offer a simple and portable way to host dependencies like databases (MongoDB in our example), web services, or caching systems (Redis in our example) that your application needs within a workflow.</p>
<p>This article focuses on integration tests, but there are many other possible applications for service containers. For example, you can also use them to run supporting tools required by your workflow, such as code analysis tools, linters, or security scanners.</p>
<h2 id="heading-why-not-docker-compose">Why Not Docker Compose?</h2>
<p>Sounds similar to <strong>services</strong> in Docker Compose, right? Well, that’s because it is.</p>
<p>But while you could technically <a target="_blank" href="https://github.com/marketplace/actions/docker-compose-action">use Docker Compose</a> within a GitHub Actions workflow by installing Docker Compose and running <strong>docker-compose up</strong>, service containers provide a more integrated and streamlined approach that’s specifically designed for the GitHub Actions environment.</p>
<p>Also, while they are similar, they solve different problems and have different general purposes:</p>
<ul>
<li><p>Docker Compose is good when you need to manage a multi-container application on your local machine or a single server. It’s best suited for long-living environments.</p>
</li>
<li><p>Service Containers are ephemeral and exist only for the duration of a workflow run, and they’re defined directly within your GitHub Actions workflow file.</p>
</li>
</ul>
<p>Just keep in mind that the feature set of service containers (at least as of now) is more limited compared to Docker Compose, so be ready to discover some potential bottlenecks. We will cover some of them at the end of this article.</p>
<h2 id="heading-job-runtime">Job Runtime</h2>
<p>You can run GitHub jobs directly on a runner machine or in a Docker container (by specifying the <strong>container</strong> property). The second option simplifies access to your services, because you can use the labels you define in the <strong>services</strong> section as hostnames.</p>
<p>To run directly on a runner machine:</p>
<p><strong>.github/workflows/test.yaml</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">jobs:</span>
  <span class="hljs-attr">integration-tests:</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-24.04</span>

    <span class="hljs-attr">services:</span>
      <span class="hljs-attr">mongo:</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">mongodb/mongodb-community-server:7.0-ubi8</span>
        <span class="hljs-attr">ports:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-number">27017</span><span class="hljs-string">:27017</span>

    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">run:</span> <span class="hljs-string">|</span>
          <span class="hljs-string">echo</span> <span class="hljs-string">"addr 127.0.0.1:27017"</span>
</code></pre>
<p>Or you can run it in a container (<a target="_blank" href="https://images.chainguard.dev/directory/image/go/overview">Chainguard Go Image</a> in our case):</p>
<pre><code class="lang-yaml"><span class="hljs-attr">jobs:</span>
  <span class="hljs-attr">integration-tests:</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-24.04</span>
    <span class="hljs-attr">container:</span> <span class="hljs-string">cgr.dev/chainguard/go:latest</span>

    <span class="hljs-attr">services:</span>
      <span class="hljs-attr">mongo:</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">mongodb/mongodb-community-server:7.0-ubi8</span>
        <span class="hljs-attr">ports:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-number">27017</span><span class="hljs-string">:27017</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">run:</span> <span class="hljs-string">|</span>
          <span class="hljs-string">echo</span> <span class="hljs-string">"addr mongo:27017"</span>
</code></pre>
<p>You can also omit the host port, so the container port will be mapped to a randomly assigned free port on the host. When the job runs directly on the runner machine, you can then read the assigned port from the <code>job.services.&lt;service_id&gt;.ports</code> context. (A job that runs in a container doesn’t need this, since it reaches the service on its container port, as in the previous example.)</p>
<p>Benefits of omitting the host port:</p>
<ul>
<li><p>Avoids port conflicts – for example when you run many services on the same host.</p>
</li>
<li><p>Enhances portability – your configurations become less dependent on the specific host environment.</p>
</li>
</ul>
<pre><code class="lang-yaml"><span class="hljs-attr">jobs:</span>
  <span class="hljs-attr">integration-tests:</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-24.04</span>

    <span class="hljs-attr">services:</span>
      <span class="hljs-attr">mongo:</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">mongodb/mongodb-community-server:7.0-ubi8</span>
        <span class="hljs-attr">ports:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-number">27017</span><span class="hljs-string">/tcp</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">run:</span> <span class="hljs-string">|</span>
          <span class="hljs-string">echo</span> <span class="hljs-string">"addr 127.0.0.1:$<span class="hljs-template-variable">{{ job.services.mongo.ports['27017'] }}</span>"</span>
</code></pre>
<p>Of course, there are pros and cons to each approach.</p>
<p>Running in a container:</p>
<ul>
<li><p><strong>Pros</strong>: Simplified network access (use labels as hostnames), and automatic port exposure within the container network. You also get better isolation/security as the job runs in an isolated environment.</p>
</li>
<li><p><strong>Cons</strong>: Implied overhead of containerization.</p>
</li>
</ul>
<p>Running on the runner machine:</p>
<ul>
<li><p><strong>Pros</strong>: Potentially less overhead than running the job inside a container.</p>
</li>
<li><p><strong>Cons</strong>: Requires manual port mapping for service container access (using <code>localhost:&lt;port&gt;</code>). There’s also less isolation/security, as the job runs directly on the runner machine, which can affect other jobs or the runner itself if something goes wrong.</p>
</li>
</ul>
<h2 id="heading-readiness-healthcheck">Readiness Healthcheck</h2>
<p>Prior to running the integration tests that connect to your provisioned containers, you’ll often need to make sure that the services are ready. You can do this by specifying <a target="_blank" href="https://docs.docker.com/reference/cli/docker/container/create/#options">docker create options</a> such as <strong>health-cmd</strong>.</p>
<p>This is very important – otherwise the services may not be ready when you start accessing them.</p>
<p>In the case of MongoDB and Redis, these will be the following:</p>
<pre><code class="lang-yaml">    <span class="hljs-attr">services:</span>
      <span class="hljs-attr">mongo:</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">mongodb/mongodb-community-server:7.0-ubi8</span>
        <span class="hljs-attr">ports:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-number">27017</span><span class="hljs-string">:27017</span>
        <span class="hljs-attr">options:</span> <span class="hljs-string">&gt;-
          --health-cmd "echo 'db.runCommand("ping").ok' | mongosh mongodb://localhost:27017/test --quiet"
          --health-interval 5s
          --health-timeout 10s
          --health-retries 10
</span>
      <span class="hljs-attr">redis:</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">redis:7</span>
        <span class="hljs-attr">ports:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-number">6379</span><span class="hljs-string">:6379</span>
        <span class="hljs-attr">options:</span> <span class="hljs-string">&gt;-
          --health-cmd "redis-cli ping"
          --health-interval 5s
          --health-timeout 10s
          --health-retries 10</span>
</code></pre>
<p>In the Action logs, you can see the readiness status:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736245987630/0b0bf229-b8d3-4e4e-8e0b-e3bbe5f9a6d8.png" alt="GitHub Actions Logs" class="image--center mx-auto" width="512" height="210" loading="lazy"></p>
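<p>Health checks tell GitHub when a container reports itself healthy, but a test suite can also guard itself with a simple connection retry, for example when you run the same tests locally without health checks. The following is a minimal Go sketch of that idea, not code from the demo repository: <code>waitForTCP</code> is a hypothetical helper, and the local listener in <code>main</code> only stands in for a real service so that the example runs anywhere.</p>
<pre><code class="lang-go">package main

import (
	"fmt"
	"net"
	"time"
)

// waitForTCP retries a TCP dial until addr accepts a connection or the
// attempts run out. Hypothetical helper, not part of the demo repository.
func waitForTCP(addr string, attempts int, delay time.Duration) error {
	lastErr := fmt.Errorf("no attempts were made")
	for remaining := attempts; remaining > 0; remaining-- {
		conn, err := net.DialTimeout("tcp", addr, time.Second)
		if err == nil {
			// The port is open, so the connection is no longer needed.
			return conn.Close()
		}
		lastErr = err
		time.Sleep(delay)
	}
	return fmt.Errorf("%s not reachable after %d attempts: %w", addr, attempts, lastErr)
}

func main() {
	// A local listener stands in for a service container here.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()

	if err := waitForTCP(ln.Addr().String(), 3, 100*time.Millisecond); err != nil {
		fmt.Println("service not ready:", err)
		return
	}
	fmt.Println("service is accepting connections")
}
</code></pre>
<p>In the workflow, you would pass an address such as <code>mongo:27017</code> or <code>redis:6379</code> instead of the local listener. Keep in mind that an open TCP port only shows that the process accepts connections, so this complements the <strong>health-cmd</strong> checks rather than replacing them.</p>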
<h2 id="heading-private-container-registries">Private Container Registries</h2>
<p>In our example, we’re using public images from Docker Hub, but it’s possible to use private images from your private registries as well, such as Amazon Elastic Container Registry (ECR), Google Artifact Registry, and so on.</p>
<p>Make sure to store the credentials in <a target="_blank" href="https://docs.github.com/en/actions/security-for-github-actions/security-guides/using-secrets-in-github-actions">Secrets</a> and then reference them in the <strong>credentials</strong> section.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">services:</span>
  <span class="hljs-attr">private_service:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">ghcr.io/org/service_repo</span>
    <span class="hljs-attr">credentials:</span>
      <span class="hljs-attr">username:</span> <span class="hljs-string">${{</span> <span class="hljs-string">secrets.registry_username</span> <span class="hljs-string">}}</span>
      <span class="hljs-attr">password:</span> <span class="hljs-string">${{</span> <span class="hljs-string">secrets.registry_token</span> <span class="hljs-string">}}</span>
</code></pre>
<h2 id="heading-sharing-data-between-services">Sharing Data Between Services</h2>
<p>You can use volumes to share data between services or other steps in a job. You can specify named Docker volumes, anonymous Docker volumes, or bind mounts on the host. But it’s not directly possible to mount the source code as a container volume. You can refer to this <a target="_blank" href="https://github.com/orgs/community/discussions/42127">open discussion</a> for more context.</p>
<p>To specify a volume, provide the source and the destination path: <code>&lt;source&gt;:&lt;destinationPath&gt;</code></p>
<p>The <code>&lt;source&gt;</code> is a volume name or an absolute path on the host machine, and <code>&lt;destinationPath&gt;</code> is an absolute path in the container.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">volumes:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-string">/src/dir:/dst/dir</span>
</code></pre>
<p>Volumes in Docker (and GitHub Actions using Docker) provide persistent data storage and sharing between containers or job steps, decoupling data from container images.</p>
<h2 id="heading-project-setup">Project Setup</h2>
<p>Before diving into the full source code, let's set up our project for running integration tests with GitHub Service Containers.</p>
<ol>
<li><p>Create a new GitHub repository.</p>
</li>
<li><p>Initialize a Go module using <code>go mod init</code>.</p>
</li>
<li><p>Create a simple Go application.</p>
</li>
<li><p>Add integration tests in <code>integration_test.go</code>.</p>
</li>
<li><p>Create a <code>.github/workflows</code> directory.</p>
</li>
<li><p>Create a file named <code>integration-tests.yaml</code> inside the <code>.github/workflows</code> directory.</p>
</li>
</ol>
<h2 id="heading-golang-integration-tests">Golang Integration Tests</h2>
<p>Now that we can provision our external dependencies, let’s have a look at how to run our integration tests in Go. We will do it in the <strong>steps</strong> section of our workflow file.</p>
<p>We will run our tests in a container which uses the <a target="_blank" href="https://images.chainguard.dev/directory/image/go/overview">Chainguard Go image</a>. This means we don’t have to install or set up Go. If you want to run your tests directly on a runner machine, you need to use the <a target="_blank" href="https://github.com/actions/setup-go">setup-go</a> Action.</p>
<p>You can find the full source code with tests and this workflow <a target="_blank" href="https://github.com/plutov/service-containers">here</a>.</p>
<p><strong>.github/workflows/integration-tests.yaml</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">name:</span> <span class="hljs-string">"integration-tests"</span>

<span class="hljs-attr">on:</span>
  <span class="hljs-attr">workflow_dispatch:</span>
  <span class="hljs-attr">push:</span>
    <span class="hljs-attr">branches:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">main</span>

<span class="hljs-attr">jobs:</span>
  <span class="hljs-attr">integration-tests:</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-24.04</span>
    <span class="hljs-attr">container:</span> <span class="hljs-string">cgr.dev/chainguard/go:latest</span>

    <span class="hljs-attr">env:</span>
      <span class="hljs-attr">MONGO_URI:</span> <span class="hljs-string">mongodb://mongo:27017</span>
      <span class="hljs-attr">REDIS_URI:</span> <span class="hljs-string">redis://redis:6379</span>

    <span class="hljs-attr">services:</span>
      <span class="hljs-attr">mongo:</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">mongodb/mongodb-community-server:7.0-ubi8</span>
        <span class="hljs-attr">ports:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-number">27017</span><span class="hljs-string">:27017</span>
        <span class="hljs-attr">options:</span> <span class="hljs-string">&gt;-
          --health-cmd "echo 'db.runCommand("ping").ok' | mongosh mongodb://localhost:27017/test --quiet"
          --health-interval 5s
          --health-timeout 10s
          --health-retries 10
</span>
      <span class="hljs-attr">redis:</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">redis:7</span>
        <span class="hljs-attr">ports:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-number">6379</span><span class="hljs-string">:6379</span>
        <span class="hljs-attr">options:</span> <span class="hljs-string">&gt;-
          --health-cmd "redis-cli ping"
          --health-interval 5s
          --health-timeout 10s
          --health-retries 10
</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Check</span> <span class="hljs-string">out</span> <span class="hljs-string">repository</span> <span class="hljs-string">code</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/checkout@v4</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Download</span> <span class="hljs-string">dependencies</span>
        <span class="hljs-attr">run:</span> <span class="hljs-string">go</span> <span class="hljs-string">mod</span> <span class="hljs-string">download</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Run</span> <span class="hljs-string">Integration</span> <span class="hljs-string">Tests</span>
        <span class="hljs-attr">run:</span> <span class="hljs-string">go</span> <span class="hljs-string">test</span> <span class="hljs-string">-tags=integration</span> <span class="hljs-string">-timeout=120s</span> <span class="hljs-string">-v</span> <span class="hljs-string">./...</span>
</code></pre>
<p>To summarize what’s going on here:</p>
<ol>
<li><p>We run our job in a container with Go (<strong>container</strong>)</p>
</li>
<li><p>We spin up two services: MongoDB and Redis (<strong>services</strong>)</p>
</li>
<li><p>We configure healthchecks to make sure our services are “Healthy” when we run the tests (<strong>options</strong>)</p>
</li>
<li><p>We perform a standard code checkout</p>
</li>
<li><p>Then we run the Go tests</p>
</li>
</ol>
<p>Once the Action is completed (it took <strong>~1 min</strong> for this example), all the services will be stopped and cleaned up automatically, so we don’t need to worry about that.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:480/0*QLl4vjotU6o1osy-.png" alt="GitHub Actions Logs: full run" width="480" height="409" loading="lazy"></p>
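<p>On the Go side, the tests need to pick up the <code>MONGO_URI</code> and <code>REDIS_URI</code> values that the workflow sets in its <strong>env</strong> block. As a rough sketch of that step (the <code>serviceAddr</code> helper and the localhost fallbacks are assumptions for illustration, not code from the linked repository), you could read each URI from the environment and extract its <code>host:port</code> part using only the standard library:</p>
<pre><code class="lang-go">package main

import (
	"fmt"
	"net/url"
	"os"
)

// serviceAddr reads a connection URI from the named environment variable,
// falls back to a default when it is unset, and returns the host:port part.
// Hypothetical helper for illustration only.
func serviceAddr(envName, fallback string) (string, error) {
	raw := os.Getenv(envName)
	if raw == "" {
		raw = fallback
	}
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse %s: %w", envName, err)
	}
	if u.Host == "" {
		return "", fmt.Errorf("%s has no host: %q", envName, raw)
	}
	return u.Host, nil
}

func main() {
	// The fallbacks assume services running on your own machine.
	// In CI, the values from the workflow's env block take precedence.
	fallbacks := map[string]string{
		"MONGO_URI": "mongodb://localhost:27017",
		"REDIS_URI": "redis://localhost:6379",
	}
	for _, name := range []string{"MONGO_URI", "REDIS_URI"} {
		addr, err := serviceAddr(name, fallbacks[name])
		if err != nil {
			fmt.Println("config error:", err)
			continue
		}
		fmt.Printf("%s resolves to %s\n", name, addr)
	}
}
</code></pre>
<p>Because the workflow passes <code>-tags=integration</code>, any test file that starts with a <code>//go:build integration</code> constraint is compiled in this job and skipped by a plain <code>go test ./...</code>, which keeps the slower tests out of your regular unit test runs.</p>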
<h2 id="heading-personal-experience-amp-limitations">Personal Experience &amp; Limitations</h2>
<p>We’ve been using service containers for running backend integration tests at <a target="_blank" href="https://www.binarly.io/">BINARLY</a> for some time, and they work great. But the initial workflow creation took some time and we encountered the following bottlenecks:</p>
<ul>
<li><p>It’s not possible to override or run custom commands in an action service container (as you would do in Docker Compose using the <strong>command</strong> property). <a target="_blank" href="https://github.com/actions/runner/pull/1152">Open pull request</a></p>
<ul>
<li>Workaround: we had to find a solution that doesn’t require that. In our case, we were lucky and could do the same with environment variables.</li>
</ul>
</li>
<li><p>It’s not directly possible to mount the source code as a container volume. <a target="_blank" href="https://github.com/orgs/community/discussions/42127">Open discussion</a></p>
<ul>
<li>While this is indeed a big limitation, you can copy the code from your repository into your mounted directory after the service container has started.</li>
</ul>
</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>GitHub service containers are a great option to scaffold an ephemeral testing environment by configuring it directly in your GitHub workflow. With configuration being somewhat similar to Docker Compose, it’s easy to run any containerised application and communicate with it in your pipeline. GitHub runners also take care of shutting everything down upon completion.</p>
<p>If you use GitHub Actions, this approach works extremely well as it is specifically designed for the GitHub Actions environment.</p>
<h3 id="heading-resources">Resources</h3>
<ul>
<li><p><a target="_blank" href="https://github.com/plutov/service-containers">Source Code</a></p>
</li>
<li><p><a target="_blank" href="https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#jobsjob_idservices">GitHub Documentation</a></p>
</li>
<li><p>Discover more articles on <a target="_blank" href="https://packagemain.tech/">packagemain.tech</a></p>
</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ An Introduction to Docker and Containers for Beginners ]]>
                </title>
                <description>
                    <![CDATA[ In the world of modern software development, efficiency and consistency are key. Developers and operations teams need solutions that help them manage, deploy, and run applications seamlessly across different environments. Containers and Docker are te... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/an-introduction-to-docker-and-containers-for-beginners/</link>
                <guid isPermaLink="false">6745acca09ace2d743c17be9</guid>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Kedar Makode ]]>
                </dc:creator>
                <pubDate>Tue, 26 Nov 2024 11:11:06 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1731093934598/6f2fa740-63e6-48e9-8e17-364544d1fcc6.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In the world of modern software development, efficiency and consistency are key. Developers and operations teams need solutions that help them manage, deploy, and run applications seamlessly across different environments.</p>
<p>Containers and Docker are technologies that have revolutionized how software is built, tested, and deployed.</p>
<p>Whether you're new to the world of tech or just looking to understand the basics of Docker, this article will guide you through the essentials.</p>
<h2 id="heading-table-of-content">Table of Contents</h2>
<ul>
<li><p><a class="post-section-overview" href="#heading-what-are-containers">What Are Containers?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-is-docker">What is Docker?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-why-docker">Why Docker?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-docker-architecture">Docker Architecture</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-dockers-container-runtime-containerd">Docker’s Container Runtime: containerd</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-create-a-simple-container-using-docker">How to Create a Simple Container Using Docker</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-wrapping-up">Wrapping Up</a></p>
</li>
</ul>
<h2 id="heading-what-are-containers">What Are Containers?</h2>
<p>Before diving into Docker, let’s first understand containers. Imagine that you’re working on a project, and your application works perfectly on your laptop. But when you try to run the same application on a different machine, it fails. This is often due to differences in environments: different operating systems, installed software versions, or configurations.</p>
<p>Containers solve this problem by packaging an application and all its dependencies, such as libraries, frameworks, and configuration files, into a single, standardized unit. This ensures that the application runs the same way no matter where it's deployed, whether on your laptop, a server, or in the cloud.</p>
<p>Key features of containers:</p>
<ul>
<li><p><strong>Lightweight</strong>: Containers share the host system's kernel, unlike virtual machines (VMs) that require separate OS instances, making them faster and more efficient.</p>
</li>
<li><p><strong>Portable</strong>: Once built, a container can run consistently across various environments.</p>
</li>
<li><p><strong>Isolated</strong>: Containers run in isolated processes, meaning that they don’t interfere with other applications running on the same system.</p>
</li>
</ul>
<h2 id="heading-what-is-docker">What is Docker?</h2>
<p>Now that we understand containers, let’s talk about Docker, the platform that has made containers mainstream.</p>
<p>Docker is an open-source tool designed to simplify the process of creating, managing, and deploying containers. Launched in 2013, Docker has rapidly become the go-to solution for containerization due to its ease of use, community support, and powerful ecosystem of tools.</p>
<h3 id="heading-key-concepts-in-docker">Key Concepts in Docker</h3>
<ol>
<li><p><strong>Docker Images</strong>: Think of a Docker image as a blueprint for your container. It contains everything needed to run the application, including code, libraries, and system dependencies. Images are built from a set of instructions written in a Dockerfile.</p>
</li>
<li><p><strong>Docker Containers</strong>: A container is a running instance of a Docker image. When you create and start a container, Docker launches the image into an isolated environment where your application can run.</p>
</li>
<li><p><strong>Dockerfile</strong>: This is a text file that contains the steps needed to create a Docker image. It’s where you define what your container will look like, including the base image, application code, and any additional dependencies.</p>
</li>
<li><p><strong>Docker Hub</strong>: Docker Hub is a public registry where developers can share and access pre-built images. If you're working on a common application or technology stack, chances are that there’s already an image available on Docker Hub, saving you time.</p>
</li>
<li><p><strong>Docker Compose</strong>: For applications that require multiple containers (for example, a web server and a database), Docker Compose allows you to define and manage multi-container environments using a simple YAML file.</p>
</li>
</ol>
<h2 id="heading-why-docker">Why Docker?</h2>
<p>Docker's popularity stems from its ability to solve a variety of challenges developers face today:</p>
<ul>
<li><p><strong>Consistency Across Environments</strong>: Developers can "build once, run anywhere," ensuring the same application works the same way in different environments, from local development to production.</p>
</li>
<li><p><strong>Speed</strong>: Docker containers are fast to start and stop, making them ideal for testing and deployment pipelines.</p>
</li>
<li><p><strong>Efficient Use of Resources</strong>: Since containers share the host system's resources more effectively than virtual machines, they reduce overhead and allow for greater density in deployments.</p>
</li>
<li><p><strong>Version Control for Your Applications</strong>: Docker allows you to version control not only your code but also the environment in which your code runs. This is particularly useful for rolling back to previous versions or debugging issues in production.</p>
</li>
</ul>
<h2 id="heading-docker-architecture">Docker Architecture</h2>
<p>When you first start using Docker, you may treat it as a black box that "just works." While that’s fine for getting started, a deeper understanding of Docker’s architecture will help you troubleshoot issues, optimize performance, and make informed decisions about your containerization strategy.</p>
<p>Docker's architecture is designed to ensure efficiency, flexibility, and scalability. It’s composed of several components that work together to create, manage, and run containers. Let’s take a closer look at each of these components.</p>
<h3 id="heading-docker-architecture-key-components">Docker Architecture: Key Components</h3>
<p>Docker’s architecture is built around a client-server model that includes the following components:</p>
<ul>
<li><p><strong>Docker Client</strong></p>
</li>
<li><p><strong>Docker Daemon (dockerd)</strong></p>
</li>
<li><p><strong>Docker Engine</strong></p>
</li>
<li><p><strong>Docker Images</strong></p>
</li>
<li><p><strong>Docker Containers</strong></p>
</li>
<li><p><strong>Docker Registries</strong></p>
</li>
</ul>
<p><img src="https://docs.docker.com/get-started/images/docker-architecture.webp" alt="Docker Architecture" width="1233" height="651" loading="lazy"></p>
<h4 id="heading-1-docker-client">1. Docker Client</h4>
<p>The Docker Client is the primary way users interact with Docker. It’s a command-line tool that sends instructions to the Docker Daemon (which we’ll cover next) using REST APIs. Commands like <code>docker build</code>, <code>docker pull</code>, and <code>docker run</code> are executed from the Docker Client.</p>
<p>When you type a command like <code>docker run nginx</code>, the Docker Client translates that into a request that the Docker Daemon can understand and act upon. Essentially, the Docker Client acts as a front-end for interacting with Docker’s more complex backend components.</p>
<h4 id="heading-2-docker-daemon-dockerd">2. Docker Daemon (dockerd)</h4>
<p>The Docker Daemon, also known as <strong>dockerd</strong>, is the brain of the entire Docker operation. It’s a background process that listens for requests from the Docker Client and manages Docker objects like containers, images, networks, and volumes.</p>
<p>Here’s what the Docker Daemon is responsible for:</p>
<ul>
<li><p><strong>Building and running containers</strong>: When the client sends a command to run a container, the daemon pulls the image if it isn’t already available locally, creates the container, and starts it.</p>
</li>
<li><p><strong>Managing Docker resources</strong>: The daemon handles tasks like network configurations and volume management.</p>
</li>
</ul>
<p>The Docker Daemon runs on the host machine and exposes a REST API that the Docker Client reaches over a Unix socket or a network interface. It’s also responsible for interacting with container runtimes, which handle the actual execution of containers.</p>
<h4 id="heading-3-docker-engine">3. Docker Engine</h4>
<p>The Docker Engine is the core part of Docker. It’s what makes the entire platform work, combining the client, daemon, and container runtime. Docker Engine runs natively on Linux. On Windows and macOS, it is typically used through Docker Desktop, which runs the engine inside a lightweight virtual machine.</p>
<p>Docker Engine has historically been offered in two editions:</p>
<ul>
<li><p><strong>Docker CE (Community Edition)</strong>: This is the free, open-source version of Docker that’s widely used for personal and smaller-scale projects.</p>
</li>
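<li><p><strong>Docker EE (Enterprise Edition)</strong>: The paid, enterprise-level version of Docker added features like enhanced security, support, and certification. Docker’s enterprise business was acquired by Mirantis in 2019, and this edition is now offered as Mirantis Container Runtime.</p>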
</li>
</ul>
<p>The Docker Engine simplifies working with containers by integrating the components required to build, run, and manage them.</p>
<h4 id="heading-4-docker-images">4. Docker Images</h4>
<p>A Docker Image is a read-only template that contains everything your application needs to run—code, libraries, dependencies, and configurations. Images are the building blocks of containers. When you run a container, you are essentially creating a writable layer on top of a Docker image.</p>
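<p>The idea of a writable layer on top of a read-only image can be sketched in a few lines of Java. This is a toy model, not Docker's storage driver: the class name <code>LayeredFileSystem</code>, the file paths, and the use of plain strings as file contents are all assumptions made for illustration.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of an image (read-only layers) and a container (writable layer on top).
// Hypothetical class for illustration; this is not how Docker stores files on disk.
class LayeredFileSystem {
    private final List<Map<String, String>> imageLayers; // bottom layer first, never modified
    private final Map<String, String> writableLayer = new HashMap<>();

    LayeredFileSystem(List<Map<String, String>> imageLayers) {
        this.imageLayers = imageLayers;
    }

    // Writes always land in the container's own layer.
    void write(String path, String content) {
        writableLayer.put(path, content);
    }

    // Reads check the writable layer first, then the image layers from top to bottom.
    String read(String path) {
        if (writableLayer.containsKey(path)) {
            return writableLayer.get(path);
        }
        for (int i = imageLayers.size() - 1; i >= 0; i--) {
            Map<String, String> layer = imageLayers.get(i);
            if (layer.containsKey(path)) {
                return layer.get(path);
            }
        }
        return null; // the file does not exist in any layer
    }

    public static void main(String[] args) {
        Map<String, String> appLayer = new HashMap<>();
        appLayer.put("/app/index.html", "v1");
        List<Map<String, String>> image = new ArrayList<>();
        image.add(appLayer);

        // Two containers started from the same image.
        LayeredFileSystem containerA = new LayeredFileSystem(image);
        LayeredFileSystem containerB = new LayeredFileSystem(image);
        containerA.write("/app/index.html", "edited in A");

        System.out.println(containerA.read("/app/index.html")); // edited in A
        System.out.println(containerB.read("/app/index.html")); // v1
    }
}
```

<p>Both containers share the same image layers, but a change made in one container stays in that container's writable layer, which is why the second container still sees the original file.</p>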
<p>Docker Images are typically built from Dockerfiles, which are text files that contain instructions on how to build the image. For example, a basic Dockerfile might start with a base image like <code>nginx</code> or <code>ubuntu</code> and include commands to copy files, install dependencies, or set environment variables.</p>
<p>Here’s a simple example of a Dockerfile:</p>
<pre><code class="lang-bash">FROM nginx:latest
COPY ./html /usr/share/nginx/html
EXPOSE 80
</code></pre>
<p>In this example, we’re using the official Nginx image as the base and copying our local HTML files into the container’s web directory.</p>
<p>Once the image is built, it can be stored in a Docker Registry and shared with others.</p>
<h4 id="heading-5-docker-containers">5. Docker Containers</h4>
<p>A Docker Container is a running instance of a Docker Image. It’s lightweight and isolated from other containers, yet it shares the kernel of the host operating system. Each container has its own file system, memory, CPU allocation, and network settings, which makes it portable and reproducible.</p>
<p>Containers can be created, started, stopped, and destroyed, and they can even be persisted between reboots. Because containers are based on images, they ensure that applications will behave the same way no matter where they’re run.</p>
<p>A few key characteristics of Docker containers:</p>
<ul>
<li><p><strong>Isolation</strong>: Containers are isolated from each other and the host, but they still share the same OS kernel.</p>
</li>
<li><p><strong>Portability</strong>: Containers can run anywhere, whether on your local machine, a virtual machine, or a cloud provider.</p>
</li>
</ul>
<h4 id="heading-6-docker-registries">6. Docker Registries</h4>
<p>A Docker Registry is a centralized place where Docker Images are stored and distributed. The most popular registry is Docker Hub, which hosts millions of publicly available images. Organizations can also set up private registries to store and distribute their own images securely.</p>
<p>Docker Registries provide several key features:</p>
<ul>
<li><p><strong>Image Versioning</strong>: Images are versioned using tags, making it easy to manage different versions of an application.</p>
</li>
<li><p><strong>Access Control</strong>: Registries can be public or private, with role-based access control to manage who can pull or push images.</p>
</li>
<li><p><strong>Distribution</strong>: Images can be pulled from a registry and deployed anywhere, making it easy to share and reuse containerized applications.</p>
</li>
</ul>
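<p>To make tag-based versioning concrete, here is a minimal Java sketch of how an image reference can be split into a repository and a tag. When no tag is given, Docker falls back to <code>latest</code>, which is why <code>docker pull alpine</code> fetches <code>alpine:latest</code>. The class <code>ImageReference</code> and the registry address <code>localhost:5000</code> are hypothetical, and digest references (<code>@sha256:...</code>) are deliberately left out.</p>

```java
// Simplified parser for image references such as "nginx", "nginx:1.27",
// or "localhost:5000/team/app:2.0". Hypothetical helper, not part of any Docker library.
class ImageReference {
    final String repository;
    final String tag;

    private ImageReference(String repository, String tag) {
        this.repository = repository;
        this.tag = tag;
    }

    static ImageReference parse(String reference) {
        int lastSlash = reference.lastIndexOf('/');
        int lastColon = reference.lastIndexOf(':');
        // A colon that appears before the last slash belongs to a registry port, not a tag.
        if (lastColon > lastSlash) {
            return new ImageReference(
                    reference.substring(0, lastColon),
                    reference.substring(lastColon + 1));
        }
        // No tag given: fall back to "latest", as the Docker CLI does.
        return new ImageReference(reference, "latest");
    }

    public static void main(String[] args) {
        ImageReference untagged = ImageReference.parse("alpine");
        ImageReference tagged = ImageReference.parse("localhost:5000/team/app:2.0");
        System.out.println(untagged.repository + " -> " + untagged.tag); // alpine -> latest
        System.out.println(tagged.repository + " -> " + tagged.tag);     // localhost:5000/team/app -> 2.0
    }
}
```

<p>Because <code>latest</code> is only a default tag name and can point to different image versions over time, pinning an explicit tag is usually the safer choice for deployments.</p>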
<h2 id="heading-dockers-container-runtime-containerd">Docker’s Container Runtime: containerd</h2>
<p>One important development in Docker’s architecture is the use of containerd. Docker used to have its own built-in container runtime, but it now relies on containerd, a container runtime that follows industry standards and is also used by other platforms like Kubernetes.</p>
<p>containerd is responsible for:</p>
<ul>
<li><p>Starting and stopping containers</p>
</li>
<li><p>Managing storage and networking for containers</p>
</li>
<li><p>Pulling container images from registries</p>
</li>
</ul>
<p>By separating the container runtime from Docker’s higher-level functionality, Docker has become more modular, allowing other tools to use containerd while Docker focuses on user-facing features.</p>
<h2 id="heading-how-to-create-a-simple-container-using-docker">How to Create a Simple Container Using Docker</h2>
<p><strong>Pull the Linux Image</strong></p>
<p>Start by pulling the <code>alpine</code> image from Docker Hub. The <code>alpine</code> image is a minimal Linux distribution, designed to be lightweight and fast.</p>
<p>Run the following command:</p>
<pre><code class="lang-bash">docker pull alpine
</code></pre>
<p>This will download the <code>alpine</code> image to your local system.</p>
<p><strong>Run the Container</strong></p>
<p>Create and start a container using the <code>alpine</code> image. We’ll also launch a terminal session inside the container.</p>
<pre><code class="lang-bash">docker run -it alpine /bin/sh
</code></pre>
<p>Here’s what each option means:</p>
<ul>
<li><p><code>docker run</code>: Creates and starts a new container.</p>
</li>
<li><p><code>-it</code>: Allows you to interact with the container (interactive mode + terminal).</p>
</li>
<li><p><code>alpine</code>: Specifies the image to use.</p>
</li>
<li><p><code>/bin/sh</code>: Specifies the command to run inside the container (a shell session in this case).</p>
</li>
</ul>
<p><strong>Explore the Container</strong></p>
<p>Once the container is running, you’ll see a shell prompt that looks something like this:</p>
<pre><code class="lang-bash">/ <span class="hljs-comment">#</span>
</code></pre>
<p>This indicates you are inside the Alpine Linux container. You can now run Linux commands. For example:</p>
<p>Check the current directory:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">pwd</span>
</code></pre>
<p>List files in the directory:</p>
<pre><code class="lang-bash">ls
</code></pre>
<p>The output shows a minimal directory structure, since Alpine is a lightweight image.</p>
<p>You can also install a package (Alpine uses <code>apk</code> as its package manager):</p>
<pre><code class="lang-bash">apk add curl
</code></pre>
<p><strong>Exit the Container</strong></p>
<p>When you're done exploring, type <code>exit</code> to close the session and stop the container:</p>
<pre><code class="lang-bash">exit
</code></pre>
<p><strong>Access the Container After It’s Stopped</strong></p>
<p>If you want to access the container again after stopping it, you can use this command to list all containers (including stopped ones):</p>
<pre><code class="lang-bash">docker ps -a
</code></pre>
<p>You’ll see a list of containers with their IDs and statuses. Use the ID to start the stopped container:</p>
<pre><code class="lang-bash">docker start &lt;container-id&gt;
</code></pre>
<p>You can then open a new shell inside the running container using this command:</p>
<pre><code class="lang-bash">docker <span class="hljs-built_in">exec</span> -it &lt;container-id&gt; /bin/sh
</code></pre>
<p>If you no longer need the container, you can remove it:</p>
<ol>
<li><p>Stop the container (if it’s still running):</p>
<pre><code class="lang-bash"> docker stop &lt;container-id&gt;
</code></pre>
</li>
<li><p>Remove the container:</p>
<pre><code class="lang-bash"> docker rm &lt;container-id&gt;
</code></pre>
</li>
</ol>
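<p>The stop-then-remove order reflects the container lifecycle: a running container has to be stopped before a plain <code>docker rm</code> will remove it. The Java sketch below models that lifecycle as a small state machine. The class <code>ContainerLifecycle</code> is hypothetical, it only covers four states (Docker reports more, such as paused or restarting), and it rejects every transition it does not list, whereas the real CLI may be more lenient.</p>

```java
// Simplified container lifecycle. Hypothetical model for illustration only.
class ContainerLifecycle {
    enum State { CREATED, RUNNING, EXITED, REMOVED }

    private State state = State.CREATED;

    State state() {
        return state;
    }

    // Corresponds to `docker start`: valid here for a created or stopped container.
    void start() {
        require(state == State.CREATED || state == State.EXITED, "start");
        state = State.RUNNING;
    }

    // Corresponds to `docker stop`: valid here only for a running container.
    void stop() {
        require(state == State.RUNNING, "stop");
        state = State.EXITED;
    }

    // Corresponds to `docker rm` without the force flag: the container must not be running.
    void remove() {
        require(state == State.CREATED || state == State.EXITED, "remove");
        state = State.REMOVED;
    }

    private void require(boolean allowed, String action) {
        if (!allowed) {
            throw new IllegalStateException("cannot " + action + " a container in state " + state);
        }
    }

    public static void main(String[] args) {
        ContainerLifecycle container = new ContainerLifecycle();
        container.start();
        container.stop();
        container.start();  // a stopped container can be started again
        container.stop();
        container.remove();
        System.out.println(container.state()); // REMOVED
    }
}
```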
<p><strong>Key Docker Commands Recap</strong></p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Command</strong></td><td><strong>Description</strong></td></tr>
</thead>
<tbody>
<tr>
<td><code>docker pull alpine</code></td><td>Downloads the Alpine Linux image.</td></tr>
<tr>
<td><code>docker run -it alpine /bin/sh</code></td><td>Creates and starts an interactive container.</td></tr>
<tr>
<td><code>docker ps -a</code></td><td>Lists all containers (running and stopped).</td></tr>
<tr>
<td><code>docker start &lt;container-id&gt;</code></td><td>Starts a stopped container.</td></tr>
<tr>
<td><code>docker exec -it &lt;container-id&gt; /bin/sh</code></td><td>Opens a shell in a running container.</td></tr>
<tr>
<td><code>docker stop &lt;container-id&gt;</code></td><td>Stops a running container.</td></tr>
<tr>
<td><code>docker rm &lt;container-id&gt;</code></td><td>Removes a stopped container.</td></tr>
</tbody>
</table>
</div><h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>Now that you've got a foundational understanding, it's time to put your knowledge to use. Start experimenting with Docker, build your first container, and explore its vast ecosystem.</p>
<p>You'll soon see why Docker has become a cornerstone of modern DevOps and software engineering.</p>
<p>You can follow me on:</p>
<ul>
<li><p><a target="_blank" href="https://twitter.com/Kedar__98">Twitter</a></p>
</li>
<li><p><a target="_blank" href="https://www.linkedin.com/in/kedar-makode-9833321ab/?originalSubdomain=in">LinkedIn</a></p>
</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Implement Event-Driven Data Processing with Traefik, Kafka, and Docker ]]>
                </title>
                <description>
                    <![CDATA[ In modern system design, Event-Driven Architecture (EDA) focuses on creating, detecting, using, and responding to events within a system. Events are significant occurrences that can affect a system’s hardware or software, such as user actions, state ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-implement-event-driven-data-processing/</link>
                <guid isPermaLink="false">673c7ac360ba8e6675690350</guid>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Microservices ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Abraham Dahunsi ]]>
                </dc:creator>
                <pubDate>Tue, 19 Nov 2024 11:47:15 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1731772751529/58ee1304-a5d9-4be4-a709-1026de99ab3e.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In modern system design, <a target="_blank" href="https://en.wikipedia.org/wiki/Event-driven_programming">Event-Driven Architecture</a> (EDA) focuses on creating, detecting, using, and responding to events within a system. Events are significant occurrences that can affect a system’s hardware or software, such as user actions, state changes, or data updates.</p>
<p>EDA enables different parts of an application to interact in a decoupled way, allowing them to communicate through events instead of direct calls. This setup lets components work independently, respond to events asynchronously, and adjust to changing business needs without major system reconfiguration, promoting agility.</p>
<p><a target="_blank" href="https://en.wikipedia.org/wiki/Event-driven_architecture">Modern applications rely heavily on real-time data processing and responsiveness</a>, and EDA provides the framework that supports those requirements. By using asynchronous communication and event-driven interactions, systems can efficiently handle high volumes of transactions and maintain performance under fluctuating loads. These features are particularly valuable in environments where conditions change quickly and unpredictably, such as e-commerce platforms or IoT applications.</p>
<p>Some key components of EDA include:</p>
<ul>
<li><p><strong>Event Sources</strong>: These are the producers that generate events when significant actions occur within the system. Examples include user interactions or data changes.</p>
</li>
<li><p><strong>Listeners</strong>: These are entities that subscribe to specific events and respond when those events occur. Listeners enable the system to react dynamically to changes.</p>
</li>
<li><p><strong>Handlers</strong>: These are responsible for processing the events once they are detected by listeners, executing the necessary business logic or workflows triggered by the event.</p>
</li>
</ul>
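<p>The relationship between these three components can be sketched with a small in-memory event bus in Java. This is only an illustration under simplifying assumptions: the class <code>EventBus</code> and the event names are hypothetical, and handlers run synchronously in the publisher's thread, whereas a real event-driven system would hand events to a broker such as Kafka and process them asynchronously.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Minimal in-memory event bus. Hypothetical class, not part of Kafka or any framework.
class EventBus {
    private final Map<String, List<Consumer<String>>> handlersByType = new HashMap<>();

    // A listener subscribes to one event type and supplies the handler to run.
    void subscribe(String eventType, Consumer<String> handler) {
        handlersByType.computeIfAbsent(eventType, key -> new ArrayList<>()).add(handler);
    }

    // An event source publishes an event; every handler registered for that type runs.
    // Returns the number of handlers that were invoked.
    int publish(String eventType, String payload) {
        List<Consumer<String>> handlers =
                handlersByType.getOrDefault(eventType, new ArrayList<>());
        for (Consumer<String> handler : handlers) {
            handler.accept(payload);
        }
        return handlers.size();
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();
        bus.subscribe("order.created", id -> System.out.println("Send confirmation for " + id));
        bus.subscribe("order.created", id -> System.out.println("Reserve stock for " + id));

        bus.publish("order.created", "order-42");  // both handlers run
        bus.publish("order.cancelled", "order-7"); // no listener subscribed, nothing happens
    }
}
```

<p>The source never calls the handlers directly. It only knows the event type, which is what keeps the components decoupled.</p>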
<p>In this article, you will learn how to implement event-driven data processing using Traefik, Kafka, and Docker.</p>
<p>Here is a <a target="_blank" href="https://github.com/Abraham12611/EventMesh">simple application hosted on GitHub</a> that you can quickly run to get an overview of what you will be building today.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<p>Here is what we'll cover:</p>
<ul>
<li><p><a class="post-section-overview" href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-understanding-the-technologies">Understanding the Technologies</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-set-up-the-environment">How to Set Up the Environment</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-build-the-event-driven-system">How to Build the Event-Driven System</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-integrate-traefik-with-kafka">How to Integrate Traefik with Kafka</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-testing-the-setup">Testing the Setup</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<p>Let's get started!</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>Before you begin:</p>
<ul>
<li><p>Deploy an Ubuntu 24.04 instance with at least 4 GB of RAM and a minimum of 20 GB of free disk space to accommodate Docker images, containers, and Kafka data.</p>
</li>
<li><p>Access the instance with a non-root user with sudo privileges.</p>
</li>
<li><p>Update the package index.</p>
</li>
</ul>
<pre><code class="lang-bash">sudo apt update
</code></pre>
<h2 id="heading-understanding-the-technologies">Understanding the Technologies</h2>
<h3 id="heading-apache-kafka">Apache Kafka</h3>
<p>Apache Kafka is a distributed event streaming platform built for high-throughput data pipelines and real-time streaming applications. It acts as the backbone for implementing EDA by efficiently managing large volumes of events. Kafka uses a publish-subscribe model where producers send events to topics, and consumers subscribe to these topics to receive the events.</p>
<p>Some of the key features of Kafka include:</p>
<ul>
<li><p><strong>High Throughput</strong>: Kafka is capable of handling millions of events per second with low latency, making it suitable for high-volume applications.</p>
</li>
<li><p><strong>Fault Tolerance</strong>: Kafka's distributed architecture ensures data durability and availability even in the face of server failures. It replicates data across multiple brokers within a cluster.</p>
</li>
<li><p><strong>Scalability</strong>: Kafka can easily scale horizontally by adding more brokers to the cluster or partitions to topics, accommodating growing data needs without significant reconfiguration.</p>
</li>
</ul>
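<p>Topics, partitions, and offsets are easier to picture with a toy model. The Java sketch below is not the Kafka client API: <code>ToyTopic</code> is a hypothetical in-memory class that only mimics two behaviors, records with the same key landing in the same partition and each partition assigning increasing offsets. Kafka's default partitioner hashes the serialized key with murmur2, while this sketch uses <code>String.hashCode()</code> to stay dependency-free.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory topic for illustration. Hypothetical class, not the Kafka client API.
class ToyTopic {
    private final List<List<String>> partitions = new ArrayList<>();

    ToyTopic(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    // Simplified partitioner: the same key always maps to the same partition.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Appends a record and returns its offset within the chosen partition.
    long send(String key, String value) {
        List<String> log = partitions.get(partitionFor(key));
        log.add(value);
        return log.size() - 1;
    }

    // Returns every record in a partition starting at the given offset.
    List<String> read(int partition, int fromOffset) {
        List<String> log = partitions.get(partition);
        return new ArrayList<>(log.subList(fromOffset, log.size()));
    }

    public static void main(String[] args) {
        ToyTopic topic = new ToyTopic(3);
        System.out.println(topic.send("user-1", "logged in"));   // 0
        System.out.println(topic.send("user-1", "added item"));  // 1

        // A consumer that already processed offset 0 resumes from offset 1.
        int partition = topic.partitionFor("user-1");
        System.out.println(topic.read(partition, 1)); // [added item]
    }
}
```

<p>Keeping all records for one key in a single partition is what lets consumers see them in order, and adding partitions is how a topic spreads load across more brokers and consumers.</p>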
<h3 id="heading-traefik">Traefik</h3>
<p>Traefik is a modern HTTP reverse proxy and load balancer designed specifically for microservices architectures. It automatically discovers services running in your infrastructure and routes traffic accordingly. Traefik simplifies the management of microservices by providing dynamic routing capabilities based on service metadata.</p>
<p>Some of the key features of Traefik include:</p>
<ul>
<li><p><strong>Dynamic Configuration</strong>: Traefik automatically updates its routing configuration as services are added or removed, eliminating manual intervention.</p>
</li>
<li><p><strong>Load Balancing</strong>: It efficiently distributes incoming requests across multiple service instances, improving performance and reliability.</p>
</li>
<li><p><strong>Integrated Dashboard</strong>: Traefik provides a user-friendly dashboard for monitoring traffic and service health in real-time.</p>
</li>
</ul>
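<p>Round robin is one common way to spread requests across service instances, and it is the basic idea behind the load balancing described in the list. The Java sketch below shows the rotation only. The class <code>RoundRobinBalancer</code> and the backend addresses are hypothetical, and the sketch leaves out health checks, weights, and everything else a real proxy such as Traefik handles.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin picker. Hypothetical class, not Traefik's implementation.
class RoundRobinBalancer {
    private final List<String> backends;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinBalancer(List<String> backends) {
        this.backends = new ArrayList<>(backends);
    }

    // Returns the next backend in rotation; safe to call from multiple threads.
    String next() {
        int index = Math.floorMod(counter.getAndIncrement(), backends.size());
        return backends.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical addresses of two instances of the same service.
        RoundRobinBalancer balancer =
                new RoundRobinBalancer(Arrays.asList("10.0.0.1:8080", "10.0.0.2:8080"));
        System.out.println(balancer.next()); // 10.0.0.1:8080
        System.out.println(balancer.next()); // 10.0.0.2:8080
        System.out.println(balancer.next()); // 10.0.0.1:8080
    }
}
```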
<p>By using Kafka and Traefik in an event-driven architecture, you can build responsive systems that efficiently handle real-time data processing while maintaining high availability and scalability.</p>
<h2 id="heading-how-to-set-up-the-environment">How to Set Up the Environment</h2>
<h3 id="heading-how-to-install-docker-on-ubuntu-2404">How to Install Docker on Ubuntu 24.04</h3>
<ol>
<li>Install the required packages.</li>
</ol>
<pre><code class="lang-bash">sudo apt install ca-certificates curl gnupg lsb-release
</code></pre>
<ol start="2">
<li>Add Docker’s official GPG Key.</li>
</ol>
<pre><code class="lang-bash">curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
</code></pre>
<ol start="3">
<li>Add the Docker repository to your APT sources.</li>
</ol>
<pre><code class="lang-bash"><span class="hljs-built_in">echo</span> <span class="hljs-string">"deb [arch=<span class="hljs-subst">$(dpkg --print-architecture)</span> signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu <span class="hljs-subst">$(lsb_release -cs)</span> stable"</span> | sudo tee /etc/apt/sources.list.d/docker.list &gt; /dev/null
</code></pre>
<ol start="4">
<li>Update the package index again and install Docker Engine with the Docker Compose plugin.</li>
</ol>
<pre><code class="lang-bash">sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io docker-compose-plugin
</code></pre>
<ol start="5">
<li>Verify the installation.</li>
</ol>
<pre><code class="lang-bash">sudo docker run hello-world
</code></pre>
<p>Expected Output:</p>
<pre><code class="lang-bash">Unable to find image <span class="hljs-string">'hello-world:latest'</span> locally
latest: Pulling from library/hello-world
c1ec31eb5944: Pull complete
Digest: sha256:305243c734571da2d100c8c8b3c3167a098cab6049c9a5b066b6021a60fcb966
Status: Downloaded newer image <span class="hljs-keyword">for</span> hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.
</code></pre>
<h3 id="heading-how-to-configure-docker-compose">How to Configure Docker Compose</h3>
<p>Docker Compose simplifies the management of multi-container applications, allowing you to define your services in a single file and start them with one command.</p>
<ol>
<li>Create a project directory.</li>
</ol>
<pre><code class="lang-bash">mkdir ~/kafka-traefik-setup &amp;&amp; <span class="hljs-built_in">cd</span> ~/kafka-traefik-setup
</code></pre>
<ol start="2">
<li>Create a <code>docker-compose.yml</code> file.</li>
</ol>
<pre><code class="lang-bash">nano docker-compose.yml
</code></pre>
<ol start="3">
<li>Add the following configuration to the file to define your services.</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-attr">version:</span> <span class="hljs-string">'3.8'</span>

<span class="hljs-attr">services:</span>
  <span class="hljs-attr">kafka:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">wurstmeister/kafka:latest</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"9092:9092"</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-comment"># Each listener needs its own port: 9093 inside the Docker network, 9092 from the host</span>
      <span class="hljs-attr">KAFKA_ADVERTISED_LISTENERS:</span> <span class="hljs-string">INSIDE://kafka:9093,OUTSIDE://localhost:9092</span>
      <span class="hljs-attr">KAFKA_LISTENER_SECURITY_PROTOCOL_MAP:</span> <span class="hljs-string">INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT</span>
      <span class="hljs-attr">KAFKA_LISTENERS:</span> <span class="hljs-string">INSIDE://0.0.0.0:9093,OUTSIDE://0.0.0.0:9092</span>
      <span class="hljs-attr">KAFKA_INTER_BROKER_LISTENER_NAME:</span> <span class="hljs-string">INSIDE</span>
      <span class="hljs-attr">KAFKA_ZOOKEEPER_CONNECT:</span> <span class="hljs-string">zookeeper:2181</span>

  <span class="hljs-attr">zookeeper:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">wurstmeister/zookeeper:latest</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"2181:2181"</span>

  <span class="hljs-attr">traefik:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">traefik:v2.9</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"80:80"</span>       <span class="hljs-comment"># HTTP traffic</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"8080:8080"</span>   <span class="hljs-comment"># Traefik dashboard (insecure)</span>
    <span class="hljs-attr">command:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--api.insecure=true"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--providers.docker=true"</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"/var/run/docker.sock:/var/run/docker.sock"</span>
</code></pre>
<p>Save your changes with <code>ctrl + o</code>, then exit with <code>ctrl + x</code>.</p>
<ol start="4">
<li>Start your services.</li>
</ol>
<pre><code class="lang-bash">sudo docker compose up -d
</code></pre>
<p>Expected Output:</p>
<pre><code class="lang-bash">[+] Running 4/4
 ✔ Network kafka-traefik-setup_default        Created                  0.2s
 ✔ Container kafka-traefik-setup-zookeeper-1  Started                  1.9s
 ✔ Container kafka-traefik-setup-traefik-1    Started                  1.9s
 ✔ Container kafka-traefik-setup-kafka-1      Started                  1.9s
</code></pre>
<h2 id="heading-how-to-build-the-event-driven-system">How to Build the Event-Driven System</h2>
<h3 id="heading-how-to-create-event-producers">How to Create Event Producers</h3>
<p>To produce events in Kafka, you will need to implement a Kafka producer. Below is an example using Java.</p>
<ol>
<li>Create a file <code>kafka-producer.java</code>.</li>
</ol>
<pre><code class="lang-bash">nano kafka-producer.java
</code></pre>
<ol start="2">
<li>Add the following code for a Kafka producer.</li>
</ol>
<pre><code class="lang-java"><span class="hljs-keyword">import</span> org.apache.kafka.clients.producer.KafkaProducer;
<span class="hljs-keyword">import</span> org.apache.kafka.clients.producer.ProducerRecord;
<span class="hljs-keyword">import</span> org.apache.kafka.clients.producer.RecordMetadata;

<span class="hljs-keyword">import</span> java.util.Properties;

<span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">SimpleProducer</span> </span>{
    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> <span class="hljs-keyword">void</span> <span class="hljs-title">main</span><span class="hljs-params">(String[] args)</span> </span>{
        <span class="hljs-comment">// Set up the producer properties</span>
        Properties props = <span class="hljs-keyword">new</span> Properties();
        props.put(<span class="hljs-string">"bootstrap.servers"</span>, <span class="hljs-string">"localhost:9092"</span>);
        props.put(<span class="hljs-string">"key.serializer"</span>, <span class="hljs-string">"org.apache.kafka.common.serialization.StringSerializer"</span>);
        props.put(<span class="hljs-string">"value.serializer"</span>, <span class="hljs-string">"org.apache.kafka.common.serialization.StringSerializer"</span>);

        <span class="hljs-comment">// Create the producer</span>
        KafkaProducer&lt;String, String&gt; producer = <span class="hljs-keyword">new</span> KafkaProducer&lt;&gt;(props);

        <span class="hljs-keyword">try</span> {
            <span class="hljs-comment">// Send a message to the topic "my-topic"</span>
            ProducerRecord&lt;String, String&gt; record = <span class="hljs-keyword">new</span> ProducerRecord&lt;&gt;(<span class="hljs-string">"my-topic"</span>, <span class="hljs-string">"key1"</span>, <span class="hljs-string">"Hello, Kafka!"</span>);
            RecordMetadata metadata = producer.send(record).get(); <span class="hljs-comment">// Synchronous send</span>
            System.out.printf(<span class="hljs-string">"Sent message with key %s to partition %d with offset %d%n"</span>, 
                              record.key(), metadata.partition(), metadata.offset());
        } <span class="hljs-keyword">catch</span> (Exception e) {
            e.printStackTrace();
        } <span class="hljs-keyword">finally</span> {
            <span class="hljs-comment">// Close the producer</span>
            producer.close();
        }
    }
}
</code></pre>
<p>Save your changes with <code>ctrl + o</code>, then exit with <code>ctrl + x</code>.</p>
<p>In the code above, the producer sends a message with the key "key1" and the value "Hello, Kafka!" to the topic "my-topic".</p>
<h3 id="heading-how-to-set-up-kafka-topics">How to Set Up Kafka Topics</h3>
<p>Before producing or consuming messages, you need to create topics in Kafka.</p>
<ol>
<li>Use the <code>kafka-topics.sh</code> script included with your Kafka installation to create a topic.</li>
</ol>
<pre><code class="lang-bash">kafka-topics.sh --bootstrap-server localhost:9092 --create --topic &lt;TopicName&gt; --partitions &lt;NumberOfPartitions&gt; --replication-factor &lt;ReplicationFactor&gt;
</code></pre>
<p>For example, if you want to create a topic named <code>my-topic</code> with 3 partitions and a replication factor of 1, run:</p>
<pre><code class="lang-bash">sudo docker <span class="hljs-built_in">exec</span> &lt;Kafka Container ID&gt; /opt/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic my-topic --partitions 3 --replication-factor 1
</code></pre>
<p>Expected Output:</p>
<pre><code class="lang-bash">Created topic my-topic.
</code></pre>
<ol start="2">
<li>Confirm that the topic was created successfully.</li>
</ol>
<pre><code class="lang-bash">sudo docker <span class="hljs-built_in">exec</span> -it kafka-traefik-setup-kafka-1 /opt/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --list
</code></pre>
<p>Expected Output:</p>
<pre><code class="lang-bash">my-topic
</code></pre>
<h3 id="heading-how-to-create-event-consumers">How to Create Event Consumers</h3>
<p>After you have created your producers and topics, you can create consumers to read messages from those topics.</p>
<ol>
<li>Create a file <code>kafka-consumer.java</code>.</li>
</ol>
<pre><code class="lang-bash">nano kafka-consumer.java
</code></pre>
<ol start="2">
<li>Add the following code for a Kafka consumer.</li>
</ol>
<pre><code class="lang-java"><span class="hljs-keyword">import</span> org.apache.kafka.clients.consumer.ConsumerConfig;
<span class="hljs-keyword">import</span> org.apache.kafka.clients.consumer.ConsumerRecords;
<span class="hljs-keyword">import</span> org.apache.kafka.clients.consumer.KafkaConsumer;
<span class="hljs-keyword">import</span> org.apache.kafka.clients.consumer.ConsumerRecord;

<span class="hljs-keyword">import</span> java.time.Duration;
<span class="hljs-keyword">import</span> java.util.Collections;
<span class="hljs-keyword">import</span> java.util.Properties;

<span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">SimpleConsumer</span> </span>{
    <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> <span class="hljs-keyword">void</span> <span class="hljs-title">main</span><span class="hljs-params">(String[] args)</span> </span>{
        <span class="hljs-comment">// Set up the consumer properties</span>
        Properties props = <span class="hljs-keyword">new</span> Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, <span class="hljs-string">"localhost:9092"</span>);
        props.put(ConsumerConfig.GROUP_ID_CONFIG, <span class="hljs-string">"my-group"</span>);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <span class="hljs-string">"org.apache.kafka.common.serialization.StringDeserializer"</span>);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <span class="hljs-string">"org.apache.kafka.common.serialization.StringDeserializer"</span>);

        <span class="hljs-comment">// Create the consumer</span>
        KafkaConsumer&lt;String, String&gt; consumer = <span class="hljs-keyword">new</span> KafkaConsumer&lt;&gt;(props);

        <span class="hljs-comment">// Subscribe to the topic</span>
        consumer.subscribe(Collections.singletonList(<span class="hljs-string">"my-topic"</span>));

        <span class="hljs-keyword">try</span> {
            <span class="hljs-keyword">while</span> (<span class="hljs-keyword">true</span>) {
                <span class="hljs-comment">// Poll for new records</span>
                ConsumerRecords&lt;String, String&gt; records = consumer.poll(Duration.ofMillis(<span class="hljs-number">100</span>));
                <span class="hljs-keyword">for</span> (ConsumerRecord&lt;String, String&gt; record : records) {
                    System.out.printf(<span class="hljs-string">"Consumed message with key %s and value %s from partition %d at offset %d%n"</span>,
                                      record.key(), record.value(), record.partition(), record.offset());
                }
            }
        } <span class="hljs-keyword">finally</span> {
            <span class="hljs-comment">// Close the consumer</span>
            consumer.close();
        }
    }
}
</code></pre>
<p>Save your changes with <code>ctrl + o</code>, then exit with <code>ctrl + x</code>.</p>
<p>In the above configuration, the consumer subscribes to <code>my-topic</code> and continuously polls for new messages. When messages are received, it prints out their keys and values along with partition and offset information.</p>
<h2 id="heading-how-to-integrate-traefik-with-kafka">How to Integrate Traefik with Kafka</h2>
<h3 id="heading-configure-traefik-as-a-reverse-proxy">Configure Traefik as a Reverse Proxy</h3>
<p>Integrating Traefik as a reverse proxy for Kafka allows you to manage incoming traffic efficiently while providing features such as dynamic routing and SSL termination.</p>
<ol>
<li>Update the <code>docker-compose.yml</code> file.</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-attr">version:</span> <span class="hljs-string">'3.8'</span>

<span class="hljs-attr">services:</span>
  <span class="hljs-attr">kafka:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">wurstmeister/kafka:latest</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"9092:9092"</span>
    <span class="hljs-attr">environment:</span>
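      <span class="hljs-comment"># Each listener needs its own port: INSIDE for broker traffic, OUTSIDE for clients</span>
      <span class="hljs-attr">KAFKA_ADVERTISED_LISTENERS:</span> <span class="hljs-string">INSIDE://kafka:9093,OUTSIDE://localhost:9092</span>
      <span class="hljs-attr">KAFKA_LISTENER_SECURITY_PROTOCOL_MAP:</span> <span class="hljs-string">INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT</span>
      <span class="hljs-attr">KAFKA_LISTENERS:</span> <span class="hljs-string">INSIDE://0.0.0.0:9093,OUTSIDE://0.0.0.0:9092</span>
      <span class="hljs-attr">KAFKA_INTER_BROKER_LISTENER_NAME:</span> <span class="hljs-string">INSIDE</span>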
      <span class="hljs-attr">KAFKA_ZOOKEEPER_CONNECT:</span> <span class="hljs-string">zookeeper:2181</span>
    <span class="hljs-attr">labels:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"traefik.enable=true"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"traefik.http.routers.kafka.rule=Host(`kafka.example.com`)"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"traefik.http.services.kafka.loadbalancer.server.port=9092"</span>

  <span class="hljs-attr">zookeeper:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">wurstmeister/zookeeper:latest</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"2181:2181"</span>

  <span class="hljs-attr">traefik:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">traefik:v2.9</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"80:80"</span>        <span class="hljs-comment"># HTTP traffic</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"8080:8080"</span>    <span class="hljs-comment"># Traefik dashboard (insecure)</span>
    <span class="hljs-attr">command:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--api.insecure=true"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--providers.docker=true"</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"/var/run/docker.sock:/var/run/docker.sock"</span>
</code></pre>
<p>In this configuration, replace <a target="_blank" href="http://kafka.example.com"><code>kafka.example.com</code></a> with your actual domain name. The labels define the routing rules that Traefik will use to direct traffic to the Kafka service.</p>
<ol start="2">
<li>Restart your services.</li>
</ol>
<pre><code class="lang-bash">docker compose up -d
</code></pre>
<ol start="3">
<li><p>Open the Traefik dashboard at <a target="_blank" href="http://localhost:8080"><code>http://localhost:8080</code></a> in your web browser.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731753126986/fc124c80-1da2-43eb-9385-426bf6a12756.png" alt="Traefik dashboard on http://localhost:8080" class="image--center mx-auto" width="1412" height="613" loading="lazy"></p>
</li>
</ol>
<h3 id="heading-load-balancing-with-traefik">Load Balancing with Traefik</h3>
<p>Traefik provides built-in load balancing capabilities that can help distribute requests across multiple instances of your Kafka producers and consumers.</p>
<h3 id="heading-strategies-for-load-balancing-event-driven-microservices">Strategies for Load Balancing Event-Driven Microservices</h3>
<ol>
<li><strong>Round Robin</strong>:</li>
</ol>
<p>    By default, Traefik uses a round-robin strategy to distribute incoming requests evenly across all available instances of a service. This is effective for balancing load when multiple instances of Kafka producers or consumers are running.</p>
<ol start="2">
<li><strong>Sticky Sessions</strong>:</li>
</ol>
<p>    If requests from a specific client must always go to the same instance (for example, to maintain session state), you can configure sticky sessions in Traefik. Traefik implements them with a cookie that pins each client to one instance.</p>
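<p>    As a minimal sketch, sticky sessions could be enabled with two extra labels on the <code>kafka</code> service. The cookie name <code>kafka_sticky</code> is an assumption chosen for illustration, and the setting only affects HTTP traffic that passes through Traefik:</p>
<pre><code class="lang-yaml">    labels:
      # Pin each client to one instance using a cookie issued by Traefik
      - "traefik.http.services.kafka.loadbalancer.sticky.cookie=true"
      # Hypothetical cookie name; any name that suits your setup works
      - "traefik.http.services.kafka.loadbalancer.sticky.cookie.name=kafka_sticky"
</code></pre>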
<ol start="3">
<li><strong>Health Checks</strong>:</li>
</ol>
<p>    Configure health checks in Traefik to ensure that traffic is only routed to healthy instances of your Kafka services. You can do this by adding health check parameters in the service definitions within your <code>docker-compose.yml</code> file:</p>
<pre><code class="lang-yaml">    <span class="hljs-attr">labels:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"traefik.http.services.kafka.loadbalancer.healthcheck.path=/health"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"traefik.http.services.kafka.loadbalancer.healthcheck.interval=10s"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"traefik.http.services.kafka.loadbalancer.healthcheck.timeout=3s"</span>
</code></pre>
<h2 id="heading-testing-the-setup">Testing the Setup</h2>
<h3 id="heading-verifying-event-production-and-consumption">Verifying Event Production and Consumption</h3>
<ol>
<li>Kafka provides built-in command-line tools for testing. Start a Console producer.</li>
</ol>
<pre><code class="lang-bash">    docker <span class="hljs-built_in">exec</span> -it kafka-traefik-setup-kafka-1 /opt/kafka/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic my-topic
</code></pre>
<p>    After running this command, you can type messages into the terminal, which will be sent to the specified Kafka topic.</p>
<ol start="2">
<li>Start another terminal session and start a console consumer.</li>
</ol>
<pre><code class="lang-bash">    docker <span class="hljs-built_in">exec</span> -it kafka-traefik-setup-kafka-1 /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my-topic --from-beginning
</code></pre>
<p>    This command will display all messages in <code>my-topic</code>, including those produced before the consumer started.</p>
<ol start="3">
<li>To see how well your consumers are keeping up with producers, you can run the following command to check the lag for a specific consumer group.</li>
</ol>
<pre><code class="lang-bash">    docker <span class="hljs-built_in">exec</span> -it kafka-traefik-setup-kafka-1 /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group &lt;your-consumer-group&gt;
</code></pre>
<h3 id="heading-monitoring-and-logging">Monitoring and Logging</h3>
<ol>
<li><strong>Kafka Metrics</strong>:</li>
</ol>
<p>    Kafka exposes numerous metrics through JMX (Java Management Extensions). You can export these metrics to a monitoring system like Prometheus and visualize them in Grafana. Key metrics to monitor include:</p>
<ul>
<li><p><strong>Message Throughput</strong>: The rate of messages produced and consumed.</p>
</li>
<li><p><strong>Consumer Lag</strong>: The difference between the last produced message offset and the last consumed message offset.</p>
</li>
<li><p><strong>Broker Health</strong>: Metrics related to broker performance, such as request rates and error rates.</p>
</li>
</ul>
<ol start="2">
<li><strong>Prometheus and Grafana Integration</strong>:</li>
</ol>
<p>    To visualize Kafka metrics, you can set up Prometheus to scrape metrics from your Kafka brokers. Follow these steps:</p>
<ul>
<li><p>Enable JMX Exporter on your Kafka brokers by adding it as a Java agent in your broker configuration.</p>
</li>
<li><p>Configure Prometheus by adding a scrape job in its configuration file (<code>prometheus.yml</code>) that points to your JMX Exporter endpoint.</p>
</li>
<li><p>Use Grafana to create dashboards that visualize these metrics in real-time.</p>
</li>
</ul>
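<p>    The scrape job from the second step might look like the sketch below. It assumes that the JMX Exporter agent listens on port <code>7071</code> of the <code>kafka</code> container and that Prometheus runs on the same Docker network. The tutorial does not specify either, so adjust the target to match your agent configuration:</p>
<pre><code class="lang-yaml"># prometheus.yml
scrape_configs:
  - job_name: "kafka"
    scrape_interval: 15s
    static_configs:
      # Assumed JMX Exporter endpoint: service name "kafka", port 7071
      - targets: ["kafka:7071"]
</code></pre>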
<h3 id="heading-how-to-implement-monitoring-for-traefik">How to Implement Monitoring for Traefik</h3>
<ol>
<li><strong>Traefik Metrics Endpoint.</strong></li>
</ol>
<p>    Traefik provides built-in support for exporting metrics via Prometheus. To enable this feature, add the following configuration in your Traefik service definition within <code>docker-compose.yml</code>:</p>
<pre><code class="lang-yaml">    <span class="hljs-attr">command:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--metrics.prometheus=true"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--metrics.prometheus.addServicesLabels=true"</span>
</code></pre>
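<p>    Prometheus then needs a scrape job that points at Traefik. The sketch below assumes Prometheus runs on the same Docker network and reaches Traefik under the service name <code>traefik</code>. By default Traefik serves metrics at <code>/metrics</code> on its internal <code>traefik</code> entry point, which is port <code>8080</code> in this setup:</p>
<pre><code class="lang-yaml"># prometheus.yml
scrape_configs:
  - job_name: "traefik"
    metrics_path: /metrics
    static_configs:
      # Assumed target: the "traefik" compose service on its dashboard/metrics port
      - targets: ["traefik:8080"]
</code></pre>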
<ol start="2">
<li><strong>Visualizing Traefik Metrics with Grafana</strong>.</li>
</ol>
<p>    Once Prometheus is scraping Traefik metrics, you can visualize them using Grafana:</p>
<ul>
<li><p>Create a new dashboard in Grafana and add panels that display key Traefik metrics such as:</p>
</li>
<li><p><strong>traefik_entrypoint_requests_total</strong>: Total number of requests received.</p>
</li>
<li><p><strong>traefik_service_request_duration_seconds</strong>: Response times of backend services.</p>
</li>
<li><p><strong>traefik_service_requests_total</strong>: Total requests forwarded to backend services.</p>
</li>
</ul>
<ol start="3">
<li><strong>Setting Up Alerts</strong>.</li>
</ol>
<p>    Configure alerts in Prometheus or Grafana based on specific thresholds (e.g., high consumer lag or increased error rates).</p>
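<p>    As one possible starting point, a Prometheus alerting rule for an elevated error rate could look like the sketch below. The group name, alert name, 5% threshold, and 5-minute window are assumptions chosen for illustration, and the file has to be referenced from <code>rule_files</code> in <code>prometheus.yml</code>:</p>
<pre><code class="lang-yaml"># alerts.yml (hypothetical file name)
groups:
  - name: traefik-alerts
    rules:
      - alert: TraefikHighErrorRate
        # Share of requests answered with a 5xx status over the last 5 minutes
        expr: |
          sum(rate(traefik_service_requests_total{code=~"5.."}[5m]))
            / sum(rate(traefik_service_requests_total[5m])) &gt; 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "More than 5% of requests through Traefik are failing"
</code></pre>
<p>    A consumer lag alert follows the same pattern, but the metric name depends on which exporter you use to expose lag.</p>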
<h2 id="heading-conclusion">Conclusion</h2>
<p>    In this guide, you implemented an Event-Driven Architecture (EDA) using Kafka and Traefik on Ubuntu 24.04.</p>
<h3 id="heading-additional-resources">Additional Resources</h3>
<p>    To learn more you can visit:</p>
<ul>
<li><p>The <a target="_blank" href="https://kafka.apache.org/documentation/">Apache Kafka Official Documentation</a></p>
</li>
<li><p>The <a target="_blank" href="https://doc.traefik.io/traefik/">Traefik Official Documentation</a></p>
</li>
<li><p>The <a target="_blank" href="https://docs.docker.com/">Docker Official Documentation</a></p>
</li>
<li><p>The Vultr guide for <a target="_blank" href="https://docs.vultr.com/set-up-traefik-proxy-as-a-reverse-proxy-for-docker-containers-on-ubuntu-24-04">setting up Traefik Proxy on Ubuntu 24.04</a></p>
</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Self-host a Container Registry ]]>
                </title>
                <description>
                    <![CDATA[ A container registry is a storage catalog from where you can push and pull container images. There are many public and private registries available to developers such as Docker Hub, Amazon ECR, and Google Cloud Artifact Registry. But sometimes, inste... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-self-host-a-container-registry/</link>
                <guid isPermaLink="false">670ea63e203bba3017cc96ff</guid>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Linux ]]>
                    </category>
                
                    <category>
                        <![CDATA[ nginx ]]>
                    </category>
                
                    <category>
                        <![CDATA[ SSL ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Alex Pliutau ]]>
                </dc:creator>
                <pubDate>Tue, 15 Oct 2024 17:28:30 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1728918386211/cf6fd053-453e-4257-abcd-16942c345845.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>A container registry is a storage catalog that you can push container images to and pull them from.</p>
<p>There are many public and private registries available to developers such as <a target="_blank" href="https://hub.docker.com/">Docker Hub</a>, <a target="_blank" href="https://aws.amazon.com/ecr/">Amazon ECR</a>, and <a target="_blank" href="https://cloud.google.com/artifact-registry/docs">Google Cloud Artifact Registry</a>. But sometimes, instead of relying on an external vendor, you might want to host your images yourself. This gives you more control over how the registry is configured and where the container images are hosted.</p>
<p>This article is a hands-on tutorial that’ll teach you how to self-host a Container Registry.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a class="post-section-overview" href="#heading-what-is-a-container-image">What is a Container Image?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-is-a-container-registry">What is a Container Registry?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-why-you-might-want-to-self-host-a-container-registry">Why you might want to self-host a Container Registry</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-self-host-a-container-registry">How to self-host a Container Registry</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-1-install-docker-and-docker-compose-on-the-server">Step 1: Install Docker and Docker Compose on the server</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-2-configure-and-run-the-registry-container">Step 2: Configure and run the registry container</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-3-run-nginx-for-handling-tls">Step 3: Run NGINX for handling TLS</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-ready-to-go">Ready to go!</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-other-options">Other options</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<p>You will get the most out of this article if you’re already familiar with tools like Docker and NGINX, and have a general understanding of what a container is.</p>
<h2 id="heading-what-is-a-container-image">What is a Container Image?</h2>
<p>Before we talk about container registries, let's first understand what a container image is. In a nutshell, a container image is a package that includes all of the files, libraries, and configurations needed to run a container. Images are composed of <a target="_blank" href="https://docs.docker.com/get-started/docker-concepts/building-images/understanding-image-layers/">layers</a>, where each layer represents a set of file system changes that add, remove, or modify files.</p>
<p>The most common way to create a container image is to use a <strong>Dockerfile</strong>.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># build an image</span>
docker build -t pliutau/hello-world:v0 .

<span class="hljs-comment"># check the images locally</span>
docker images
<span class="hljs-comment"># REPOSITORY            TAG       IMAGE ID       CREATED          SIZE</span>
<span class="hljs-comment"># pliutau/hello-world   v0        9facd12bbcdd   22 seconds ago   11MB</span>
</code></pre>
<p>This creates a container image that is stored on your local machine. But what if you want to share this image with others or use it on a different machine? This is where container registries come in.</p>
<h2 id="heading-what-is-a-container-registry">What is a Container Registry?</h2>
<p>A container registry is a storage catalog that you can push container images to and pull them from. The images are grouped into repositories, which are collections of related images with the same name. For example, on the Docker Hub registry, <a target="_blank" href="https://hub.docker.com/_/nginx">nginx</a> is the name of the repository that contains different versions of the NGINX image.</p>
<p>Some registries are public, meaning that the images hosted on them are accessible to anyone on the Internet. Public registries such as <a target="_blank" href="https://hub.docker.com/">Docker Hub</a> are a good option to host open-source projects.</p>
<p>On the other hand, private registries provide a way to incorporate security and privacy into enterprise container image storage, whether hosted in the cloud or on-premises. These private registries often come with advanced security features and technical support.</p>
<p>There is a growing list of private registries available, such as <a target="_blank" href="https://aws.amazon.com/ecr/">Amazon ECR</a>, <a target="_blank" href="https://cloud.google.com/artifact-registry/docs">GCP Artifact Registry</a>, and <a target="_blank" href="https://github.com/features/packages">GitHub Container Registry</a>. Docker Hub also offers a private repository feature.</p>
<p>As a developer, you interact with a container registry when using the <code>docker push</code> and <code>docker pull</code> commands.</p>
<pre><code class="lang-bash">docker push docker.io/pliutau/hello-world:v0

<span class="hljs-comment"># In case of Docker Hub we could also skip the registry part</span>
docker push pliutau/hello-world:v0
</code></pre>
<p>Let's look at the anatomy of a container image URL:</p>
<pre><code class="lang-bash">docker pull docker.io/pliutau/hello-world:v0@sha256:dc11b2...
                |            |            |          |
                ↓            ↓            ↓          ↓
             registry    repository      tag       digest
</code></pre>
<h2 id="heading-why-you-might-want-to-self-host-a-container-registry">Why You Might Want to Self-host a Container Registry</h2>
<p>Sometimes, instead of relying on a provider like AWS or GCP, you might want to host your images yourself. This keeps your infrastructure internal and makes you less reliant on external vendors. In some heavily regulated industries, this is even a requirement.</p>
<p>A self-hosted registry runs on your own servers, giving you more control over how the registry is configured and where the container images are hosted. At the same time, you take on the cost of maintaining and securing the registry.</p>
<h2 id="heading-how-to-self-host-a-container-registry">How to Self-host a Container Registry</h2>
<p>There are several open-source container registry solutions available. The most popular one, called <a target="_blank" href="https://hub.docker.com/_/registry">registry</a>, is officially supported by Docker and provides an implementation for storing and distributing container images and artifacts. It ships as a container image, which means that you can run your own registry inside a container.</p>
<p>Here are the main steps to run a registry on a server:</p>
<ul>
<li><p>Install Docker and Docker Compose on the server.</p>
</li>
<li><p>Configure and run the <strong>registry</strong> container.</p>
</li>
<li><p>Run <strong>NGINX</strong> for handling TLS and forwarding requests to the registry container.</p>
</li>
<li><p>Set up SSL certificates and configure a domain.</p>
</li>
</ul>
<h3 id="heading-step-1-install-docker-and-docker-compose-on-the-server">Step 1: Install Docker and Docker Compose on the server</h3>
<p>You can use any server that supports Docker. For example, you can use a DigitalOcean Droplet with Ubuntu. For this demo I used Google Cloud Compute to create a VM with Ubuntu.</p>
<pre><code class="lang-bash">neofetch

<span class="hljs-comment"># OS: Ubuntu 20.04.6 LTS x86_64</span>
<span class="hljs-comment"># CPU: Intel Xeon (2) @ 2.200GHz</span>
<span class="hljs-comment"># Memory: 3908MiB</span>
</code></pre>
<p>Once we're inside our VM, we should install Docker and Docker Compose. Docker Compose is optional, but it makes it easier to manage multi-container applications.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># install docker engine and docker-compose</span>
sudo snap install docker

<span class="hljs-comment"># verify the installation</span>
docker --version
docker-compose --version
</code></pre>
<h3 id="heading-step-2-configure-and-run-the-registry-container">Step 2: Configure and run the registry container</h3>
<p>Next we need to configure our registry container. The following <strong>compose.yaml</strong> file will create a registry container with a volume for storing the images and a volume for storing the password file.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">services:</span>
  <span class="hljs-attr">registry:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">registry:latest</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">REGISTRY_AUTH:</span> <span class="hljs-string">htpasswd</span>
      <span class="hljs-attr">REGISTRY_AUTH_HTPASSWD_REALM:</span> <span class="hljs-string">Registry</span> <span class="hljs-string">Realm</span>
      <span class="hljs-attr">REGISTRY_AUTH_HTPASSWD_PATH:</span> <span class="hljs-string">/auth/registry.password</span>
      <span class="hljs-attr">REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY:</span> <span class="hljs-string">/data</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-comment"># Mount the password file</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./registry/registry.password:/auth/registry.password</span>
      <span class="hljs-comment"># Mount the data directory</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./registry/data:/data</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-number">5000</span>
</code></pre>
<p>The password file defined in <strong>REGISTRY_AUTH_HTPASSWD_PATH</strong> is used to authenticate users when they push or pull images from the registry. We should create a password file using the <strong>htpasswd</strong> command. We should also create a folder for storing the images.</p>
<pre><code class="lang-bash">mkdir -p ./registry/data

<span class="hljs-comment"># install htpasswd</span>
sudo apt install apache2-utils

<span class="hljs-comment"># create a password file. username: busy, password: bee</span>
htpasswd -Bbn busy bee &gt; ./registry/registry.password
</code></pre>
<p>Now we can start the registry container. If you see a message like the one below, then everything is working as it should:</p>
<pre><code class="lang-bash">docker-compose up

<span class="hljs-comment"># a successful run should output something like this:</span>
<span class="hljs-comment"># registry | level=info msg="listening on [::]:5000"</span>
</code></pre>
<h3 id="heading-step-3-run-nginx-for-handling-tls">Step 3: Run NGINX for handling TLS</h3>
<p>As mentioned earlier, we can use NGINX to handle TLS and forward requests to the registry container.</p>
<p>The Docker Registry requires a valid, trusted SSL certificate to work. You can use something like <a target="_blank" href="https://letsencrypt.org/">Let's Encrypt</a> or obtain one manually. Make sure you have a domain name pointing to your server (<strong>registry.pliutau.com</strong> in my case). For this demo I already obtained the certificates using <a target="_blank" href="https://certbot.eff.org/">certbot</a> and put them in the <strong>./nginx/certs</strong> directory.</p>
<p>Since we're running our Docker Registry in a container, we can run NGINX in a container as well by adding the following service to the <strong>compose.yaml</strong> file:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">services:</span>
  <span class="hljs-attr">registry:</span>
    <span class="hljs-comment"># ...</span>
  <span class="hljs-attr">nginx:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">nginx:latest</span>
    <span class="hljs-attr">depends_on:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">registry</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-comment"># mount the nginx configuration</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./nginx/nginx.conf:/etc/nginx/nginx.conf</span>
      <span class="hljs-comment"># mount the certificates obtained from Let's Encrypt</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./nginx/certs:/etc/nginx/certs</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"443:443"</span>
</code></pre>
<p>Our <strong>nginx.conf</strong> file could look like this:</p>
<pre><code class="lang-nginx">worker_processes auto;

events {
    worker_connections 1024;
}

http {
    upstream registry {
        server registry:5000;
    }

    server {
        server_name registry.pliutau.com;
        listen 443 ssl;

        ssl_certificate /etc/nginx/certs/fullchain.pem;
        ssl_certificate_key /etc/nginx/certs/privkey.pem;

        location / {
            <span class="hljs-comment"># important setting for large images</span>
            client_max_body_size                1000m;

            proxy_pass                          http://registry;
            proxy_set_header  Host              $http_host;
            proxy_set_header  X-Real-IP         $remote_addr;
            proxy_set_header  X-Forwarded-For   $proxy_add_x_forwarded_for;
            proxy_set_header  X-Forwarded-Proto $scheme;
            proxy_read_timeout                  900;
        }
    }
}
</code></pre>
<h3 id="heading-ready-to-go">Ready to go!</h3>
<p>After these steps we can run our registry and NGINX containers.</p>
<pre><code class="lang-bash">docker-compose up
</code></pre>
<p>Now, on the client side, you can push and pull images to and from your registry. But first we need to log in to the registry.</p>
<pre><code class="lang-bash">docker login registry.pliutau.com

<span class="hljs-comment"># Username: busy</span>
<span class="hljs-comment"># Password: bee</span>
<span class="hljs-comment"># Login Succeeded</span>
</code></pre>
<p>Time to build and push our image to our self-hosted registry:</p>
<pre><code class="lang-bash">docker build -t registry.pliutau.com/pliutau/hello-world:v0 .

docker push registry.pliutau.com/pliutau/hello-world:v0
<span class="hljs-comment"># v0: digest: sha256:a56ea4... size: 738</span>
</code></pre>
<p>On your server you can check the uploaded images in the data folder:</p>
<pre><code class="lang-bash">ls -la ./registry/data/docker/registry/v2/repositories/
</code></pre>
<h3 id="heading-other-options">Other options</h3>
<p>Following the example above, you can also run the registry on Kubernetes. Or you could use a more full-featured open-source registry like <a target="_blank" href="https://goharbor.io/">Harbor</a>, which provides advanced security features and is compatible with Docker and Kubernetes.</p>
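<p>As a rough sketch of the Kubernetes route, the registry service from the <strong>compose.yaml</strong> file could be expressed as a Deployment and a Service. The Secret <code>registry-auth</code> (holding a <code>registry.password</code> key) and the PersistentVolumeClaim <code>registry-data</code> are hypothetical names that you would need to create yourself, and TLS would still have to be handled in front of the Service, for example by an Ingress:</p>
<pre><code class="lang-yaml">apiVersion: apps/v1
kind: Deployment
metadata:
  name: registry
spec:
  replicas: 1
  selector:
    matchLabels:
      app: registry
  template:
    metadata:
      labels:
        app: registry
    spec:
      containers:
        - name: registry
          image: registry:latest
          ports:
            - containerPort: 5000
          env:
            # Same settings as in compose.yaml
            - name: REGISTRY_AUTH
              value: htpasswd
            - name: REGISTRY_AUTH_HTPASSWD_REALM
              value: Registry Realm
            - name: REGISTRY_AUTH_HTPASSWD_PATH
              value: /auth/registry.password
            - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
              value: /data
          volumeMounts:
            - name: auth
              mountPath: /auth
              readOnly: true
            - name: data
              mountPath: /data
      volumes:
        - name: auth
          secret:
            secretName: registry-auth   # hypothetical Secret with the htpasswd file
        - name: data
          persistentVolumeClaim:
            claimName: registry-data    # hypothetical PVC for image storage
---
apiVersion: v1
kind: Service
metadata:
  name: registry
spec:
  selector:
    app: registry
  ports:
    - port: 5000
      targetPort: 5000
</code></pre>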
<p>Also, if you want to have a UI for your self-hosted registry, you could use a project like <a target="_blank" href="https://github.com/Joxit/docker-registry-ui">joxit/docker-registry-ui</a> and run it in a separate container.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Self-hosted container registries allow you to have complete control over your registry and the way it's deployed. At the same time, they come with the cost of maintaining and securing the registry.</p>
<p>Whatever your reasons for running a self-hosted registry, you now know how it's done. From here you can compare the different options and choose the one that best fits your needs.</p>
<p>You can find the full source code for this demo on <a target="_blank" href="https://github.com/plutov/packagemain/tree/master/26-self-hosted-container-registry">GitHub</a>. Also, you can watch it as a video on <a target="_blank" href="https://www.youtube.com/watch?v=TGLfQZ9qRaI">our YouTube channel</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
