This tells Docker Compose to build the Dockerfile in the “sambaonly” directory, push/pull the built image to/from my newly set-up private registry, and expose port 445 from the container.
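For reference, a compose file matching that description might look roughly like this (a sketch: the service name and compose version are my assumptions, while the build directory, registry address, and port come from the surrounding output):

version: "3"
services:
  samba:
    build: ./sambaonly                  # build the Dockerfile in this directory
    image: 127.0.0.1:5000/samba:latest  # tag pushed to/pulled from the private registry
    ports:
      - "445:445"                       # expose Samba's TCP port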
To deploy this manifest, I followed Docker Swarm’s tutorial. I first used Docker Compose to build and upload the container to the private registry:
docker-compose build
docker-compose push
After the container is built, the app can be deployed with the docker stack deploy command, specifying the service name:
$ docker stack deploy --compose-file docker-compose.yml samba-swarm
Ignoring unsupported options: build
Creating network samba-swarm_default
Creating service samba-swarm_samba
zhuowei@dora:~/Documents/docker$ docker stack services samba-swarm
ID             NAME                MODE         REPLICAS   IMAGE   PORTS
yg8x8yfytq5d   samba-swarm_samba   replicated   1/1
And now the app is running under Docker Swarm. I tested that it still works with smbclient:
zhuowei@dora:~$ smbclient \\\\localhost\\workdir -U %
WARNING: The "syslog" option is deprecated
Try "help" to get a list of possible commands.
smb: \> ls
  .                                   D        0  Fri Oct  5 12:14:43 2018
  ..                                  D        0  Sun Oct  7 22:09:49 2018
  hello.txt                           N       13  Fri Oct  5 11:17:34 2018

                102685624 blocks of size 1024. 72252576 blocks available
smb: \>
Adding another node
Once again, Docker Swarm’s simplicity shines through here. To set up a second node, I first installed Docker, then ran the command that Docker gave me when I set up the swarm:
ralph:~# docker swarm join --token SWMTKN-1-abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwx-abcdefghijklmnopqrstuvwxy 10.133.7.100:2377
This node joined a swarm as a worker.
To run my application on both nodes, I ran Docker Swarm’s scale command on the manager node:
zhuowei@dora:~/Documents/docker$ docker service scale samba-swarm_samba=2
samba-swarm_samba scaled to 2
overall progress: 2 out of 2 tasks
1/2: running   [==================================================>]
2/2: running   [==================================================>]
verify: Service converged
On the new worker node, the new container showed up:
ralph:~# docker container ls
CONTAINER ID   IMAGE                         COMMAND                  CREATED          STATUS          PORTS     NAMES
7539549283bd   127.0.0.1:5000/samba:latest   "/usr/sbin/smbd -FS …"   20 seconds ago   Up 18 seconds   445/tcp   samba-swarm_samba.1.abcdefghijklmnopqrstuvwxy
Testing load balancing
Docker Swarm includes a built-in load balancer called the routing mesh: requests made to any node’s IP address are automatically distributed across the Swarm.
To test this, I made 1000 connections to the manager node’s IP address with this Python script:
print("#!/bin/bash") for i in range(1000): print("nc -v 10.133.7.100 445 &") print("wait")
Samba spawns one new process for each connection, so if the load balancing works, I would expect about 500 Samba processes on each node in the Swarm. This is indeed what happens.
After I ran the script to make 1000 connections, I checked the number of Samba processes on the manager (10.133.7.100):
zhuowei@dora:~$ ps -ef|grep smbd|wc
    506    5567   42504
and on the worker node (10.133.7.50):
ralph:~# ps -ef|grep smbd|wc
    506    3545   28862
So exactly half of the requests made to the manager node were magically redirected to the first worker node, showing that the Swarm cluster is working properly.
I found Docker Swarm very easy to set up, and it performed well under (a light) load.
Kubernetes is becoming the industry standard for container orchestration. It’s significantly more flexible than Docker Swarm, but this also makes it harder to set up. I found that it’s not too hard, though.
For this experiment, instead of using a pre-built Kubernetes dev environment such as minikube, I decided to set up my own cluster using kubeadm, WeaveNet, and MetalLB.
Setting up Kubernetes
Kubernetes has a reputation for being difficult to set up, but that reputation is no longer accurate: the Kubernetes developers have automated almost every step into an easy-to-use setup tool called kubeadm.
Here’s what I ended up running.
First, I had to disable swap on each node:
root@dora:~# swapoff -a
root@dora:~# systemctl restart kubelet.service
Next, I set up the master node (10.133.7.100) with the following command:
sudo kubeadm init --pod-network-cidr=10.134.0.0/16 --apiserver-advertise-address=10.133.7.100 --apiserver-cert-extra-sans=10.0.2.15
The --pod-network-cidr option assigns an internal address range used for pod-to-pod communication within Kubernetes. The --apiserver-advertise-address and --apiserver-cert-extra-sans options were added because of a quirk in my VirtualBox setup: the main virtual network card on the VMs (which has IP 10.0.2.15) can only access the Internet, so I had to specify that other nodes must reach the master at the 10.133.7.100 address.
After running this command, Kubeadm printed some instructions:
Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node as root:

  kubeadm join 10.133.7.100:6443 --token abcdefghijklmnopqrstuvw --discovery-token-ca-cert-hash sha256:abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijkl
I missed these instructions the first time, so I didn’t actually finish setup. I then spent a whole week wondering why none of my containers worked!
The Kubernetes developers must’ve been like:
After I finally read the instructions, I had to do three more things:
- First, I had to run the commands given by kubeadm to set up a configuration file.
- By default, Kubernetes won’t schedule containers on the master node, only on worker nodes. Since I only had one node at this point, the tutorial showed me this command to allow running containers on it:
kubectl taint nodes --all node-role.kubernetes.io/master-
- Finally, I had to choose a network for my cluster.
Unlike Docker Swarm, which must use its own mesh-routing layer for both networking and load balancing, Kubernetes offers multiple choices for networking and load-balancing.
The networking component allows containers to talk to each other internally. I did some research, and this comparison article suggested Flannel or WeaveNet, as they are easy to set up. Thus, I decided to try WeaveNet. I followed the instructions from the kubeadm tutorial to apply WeaveNet’s configuration:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
Next, to allow containers to talk to the outside world, I needed a load balancer. From my research, I had the impression that most Kubernetes load balancer implementations focus on HTTP services only, not raw TCP. Thankfully, I found MetalLB, a recent (one-year-old) project that’s filling this gap.
To install MetalLB, I followed its Getting Started tutorial, and first deployed MetalLB:
kubectl apply -f https://raw.githubusercontent.com/google/metallb/v0.7.3/manifests/metallb.yaml
Next, I allocated the IP range 10.133.7.200–10.133.7.230 to MetalLB by creating and applying a configuration file.
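The file’s contents aren’t reproduced above; for MetalLB v0.7, a minimal layer2 address pool covering that range would look roughly like this (saved as metallb-config.yaml):

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2          # announce IPs with ARP; no BGP router needed
      addresses:
      - 10.133.7.200-10.133.7.230

I then applied it with: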
kubectl apply -f metallb-config.yaml
Deploying the app
Kubernetes’ service configuration files are more verbose than Docker Swarm’s, due to Kubernetes’ flexibility. In addition to specifying the container to run, as with Docker Swarm, I have to specify how each port should be treated.
After reading Kubernetes’ tutorial, I came up with this Kubernetes configuration, made of one Service and one Deployment.
# https://kubernetes.io/docs/tasks/run-application/run-single-instance-stateful-application/
kind: Service
apiVersion: v1
metadata:
  name: samba
  labels:
    app: samba
spec:
  ports:
    - port: 445
      protocol: TCP
      targetPort: 445
  selector:
    app: samba
  type: LoadBalancer
---
This Service tells Kubernetes to export TCP port 445 from our Samba containers to the load balancer.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: samba
  labels:
    app: samba
spec:
  selector:
    matchLabels:
      app: samba
  replicas: 1
  template:
    metadata:
      labels:
        app: samba
    spec:
      containers:
        - image: 127.0.0.1:5000/samba:latest
          name: samba
          ports:
            - containerPort: 445
          stdin: true
This Deployment object tells Kubernetes to run my container and export a port for the Service to handle.
The only setting worth noting is replicas: 1, which is how many instances of the container I want to run.
I can deploy this service to Kubernetes using kubectl apply:
zhuowei@dora:~/Documents/docker$ kubectl apply -f kubernetes-samba.yaml
service/samba configured
deployment.apps/samba configured
and, after rebooting my virtual machine a few times, the Deployment finally started working:
zhuowei@dora:~/Documents/docker$ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
samba-57945b8895-dfzgl   1/1     Running   0          52m
zhuowei@dora:~/Documents/docker$ kubectl get service samba
NAME    TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)         AGE
samba   LoadBalancer   10.108.157.165   10.133.7.200   445:30246/TCP   91m
My service is now available at the External-IP assigned by MetalLB:
zhuowei@dora:~$ smbclient \\\\10.133.7.200\\workdir -U %
WARNING: The "syslog" option is deprecated
Try "help" to get a list of possible commands.
smb: \> ls
  .                                   D        0  Fri Oct  5 12:14:43 2018
  ..                                  D        0  Sun Oct  7 22:09:49 2018
  hello.txt                           N       13  Fri Oct  5 11:17:34 2018

                102685624 blocks of size 1024. 72252576 blocks available
smb: \>
Adding another node
Adding another node to a Kubernetes cluster is much easier: I just had to run the command given by kubeadm on the new machine:
zhuowei@davy:~$ sudo kubeadm join 10.133.7.100:6443 --token abcdefghijklmnopqrstuvw --discovery-token-ca-cert-hash sha256:abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijkl
(snip...)

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.
Odd quirks of my setup
I had to make two changes due to my VirtualBox setup:
First, since my virtual machine has two network cards, I had to manually tell Kubernetes my machine’s IP address. According to this issue, the fix is to change one line in the kubelet’s configuration so it advertises the right IP, then restart Kubernetes.
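On a kubeadm install, that usually means adding a --node-ip flag to the kubelet’s systemd drop-in; the exact path and the IP placeholder below are my assumptions, not from the original post:

# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (path may vary)
# add --node-ip so the kubelet advertises this machine's cluster-facing address
Environment="KUBELET_EXTRA_ARGS=--node-ip=<this node's 10.133.7.x address>"

After editing, restart the kubelet: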
root@davy:~# systemctl daemon-reload
root@davy:~# systemctl restart kubelet.service
The other tweak is for the Docker registry: since the new node can’t access my private registry on the master node, I did a terrible hack and shared the registry from my master node to the new machine using SSH port forwarding:
zhuowei@davy:~$ ssh dora.local -L 5000:localhost:5000
This forwards port 5000 from the master node, dora (which runs my Docker registry), to localhost on the new machine, where Kubernetes can find it.
In actual production, one would probably host the Docker registry on a separate machine, so all nodes can access it.
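For example, a dedicated registry host could run Docker’s official registry image (a sketch, not part of my setup):

docker run -d -p 5000:5000 --restart=always --name registry registry:2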
Scaling up the application
With the second machine set up, I modified my original Deployment to add another instance of the app.
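The modified file isn’t reproduced above; presumably the only change is the replica count, re-applied with kubectl:

# in kubernetes-samba.yaml, under the Deployment spec
  replicas: 2    # was 1: run a second instance of the app

kubectl apply -f kubernetes-samba.yaml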
After rebooting both the master and the worker a few times, the new instance of my app finally left the ContainerCreating status and started to run:
zhuowei@dora:~/Documents/docker$ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
samba-57945b8895-dfzgl   1/1     Running   0          62m
samba-57945b8895-qhrtl   1/1     Running   0          12m
Testing Load Balancing
I used the same procedure to open 1000 connections to Samba running on Kubernetes. The results were interesting:
zhuowei@dora$ ps -ef|grep smbd|wc
    492    5411   41315
zhuowei@davy:~$ ps -ef|grep smbd|wc
    518    5697   43499
Kubernetes/MetalLB also balanced the load across the two machines, but the master machine got slightly fewer connections than the worker machine. I wonder why.
Anyway, this shows that I finally managed to set up Kubernetes after a bunch of detours. When I saw the containers working, I felt like this guy.
Comparison and conclusion
Features common to both: both can manage containers and intelligently load balance TCP requests to the same application across two different virtual machines, and both have good documentation for initial setup.
Docker Swarm’s strengths: simple setup with no configuration needed, tight integration with Docker.
Kubernetes’ strengths: flexible components, many available resources and add-ons.
Kubernetes vs Docker Swarm is a tradeoff between simplicity and flexibility.
I found Docker Swarm easier to set up, but I can’t, for example, swap out the load balancer for another component; there’s no way to configure the built-in one, so I would have to disable it altogether.
On Kubernetes, finding the right setup took me a while because of the daunting number of choices, but in exchange, I can swap out parts of my cluster as needed, and I can easily install add-ons, such as a fancy dashboard.
If you just want to try Kubernetes without all this setup, I suggest using minikube, which offers a prebuilt Kubernetes cluster in a virtual machine, no setup needed.
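Once minikube is installed, getting a local cluster is typically a single command:

minikube start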
Finally, I’m impressed that both engines support raw TCP services: platform-as-a-service providers such as Heroku or Glitch only support HTTP(S) website hosting. The availability of TCP services means that one can deploy database servers, cache servers, and even Minecraft servers with the same tools used to deploy web applications, making container orchestration a very useful skill.
In conclusion, if I were building a cluster, I would use Docker Swarm. If I were paying someone else to build a cluster for me, I would ask for Kubernetes.
What I learned
- How to work with Docker containers
- How to set up a two-node Docker Swarm cluster
- How to set up a two-node Kubernetes cluster, and which choices work for a TCP-based app
- How to deploy an app to Docker Swarm and Kubernetes
- How to fix anything by rebooting a computer enough times, like I’m still using Windows 98
- Kubernetes and Docker Swarm aren’t as intimidating as they sound