Getting Started with Kubernetes
Kubernetes is a highly popular open-source container management system. The goal of the Kubernetes project is to make management of containers across multiple nodes as simple as managing containers on a single system. To accomplish this, it offers quite a few unique features such as Traffic Load Balancing, Self-Healing (automatic restarts), Scheduling, Scaling, and Rolling Updates.
In today’s article, we’ll learn about Kubernetes by deploying a simple web application across a multinode Kubernetes cluster. Before we can start deploying containers, however, we first need to set up a cluster.
Standing Up a Kubernetes Cluster
The official Getting Started guide walks you through deploying a Kubernetes cluster on Google’s Container Engine platform. While this is a quick and easy method to get up and running, for this article, we’ll be deploying Kubernetes with an alternative provider, specifically via Vagrant. We’re using Vagrant for a few reasons, but primarily because it shows how to deploy Kubernetes on a generic platform rather than GCE.
Kubernetes comes with several install scripts for various platforms. The majority of these scripts use Bash environmental variables to change what and how Kubernetes is installed.
For this installation, we’ll define two variables:
- NUM_NODES: controls the number of nodes to deploy
- KUBERNETES_PROVIDER: specifies the platform on which to install Kubernetes
Let’s tell the installation scripts to use the vagrant platform and to provision a two-node cluster.
$ export NUM_NODES=2
$ export KUBERNETES_PROVIDER=vagrant
If we wanted to deploy Kubernetes on AWS, for example, we could do so by changing the KUBERNETES_PROVIDER environmental variable to aws.
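As a small safeguard, the two variables can be checked before the install script runs. The following is only a sketch: check_settings is a hypothetical helper written for this article, not part of the Kubernetes install scripts, and the defaults simply mirror the values used above.

```shell
# Hypothetical pre-flight check; not part of the Kubernetes install scripts.
# Fall back to the values used in this article when nothing is set.
export NUM_NODES="${NUM_NODES:-2}"
export KUBERNETES_PROVIDER="${KUBERNETES_PROVIDER:-vagrant}"

check_settings() {
  # Reject empty or non-numeric node counts.
  case "$NUM_NODES" in
    ''|*[!0-9]*)
      echo "NUM_NODES must be a positive integer" >&2
      return 1
      ;;
  esac
  if [ "$NUM_NODES" -lt 1 ]; then
    echo "NUM_NODES must be at least 1" >&2
    return 1
  fi
  echo "Deploying $NUM_NODES node(s) on $KUBERNETES_PROVIDER"
}

check_settings
```

Running the check first means a typo in NUM_NODES is caught before any virtual machines are created.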
Using the Kubernetes install script
While there are many walk-throughs on how to install Kubernetes, I have found that the easiest method is to use the Kubernetes install script available at (https://get.k8s.io).
This script is essentially a wrapper to the installation scripts distributed with Kubernetes, which makes the process quite a bit easier. One of my favorite things about this script is that it will also download Kubernetes for you.
To start using this script, we’ll need to download it; we can do this with a quick curl command. Once we’ve downloaded the script, we can execute it by running the bash command followed by the script name.
$ curl https://get.k8s.io > kubernetes_install.sh
$ bash kubernetes_install.sh
Downloading kubernetes release v1.2.2 to /var/tmp/kubernetes.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  426M  100  426M    0     0  12.2M      0  0:00:34  0:00:34 --:--:-- 6007k
Unpacking kubernetes release v1.2.2
Creating a kubernetes on vagrant...
<output truncated>
Kubernetes master is running at https://10.245.1.2
Heapster is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/heapster
KubeDNS is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
Grafana is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
InfluxDB is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb
Installation successful!
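If the install output is saved, the address of the Kubernetes master can be pulled out of it for later use. This is a minimal sketch that works on a few lines copied from the output above; it assumes the text is plain (no terminal color codes), and the helper name master_url is made up for illustration.

```shell
# Lines copied from the install output above. In practice the text would
# come from a saved log of the install run.
install_output='Kubernetes master is running at https://10.245.1.2
Heapster is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/heapster
Installation successful!'

# Hypothetical helper: print the last field of the "Kubernetes master"
# line, which is the URL of the API server.
master_url() {
  awk '/^Kubernetes master is running at/ { print $NF }'
}

MASTER_URL="$(printf '%s\n' "$install_output" | master_url)"
echo "API server: $MASTER_URL"
```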
After the script completes execution, we have a running Kubernetes cluster. However, we still have one more step before we can start to interact with this Kubernetes cluster; we need the kubectl command to be installed.
Setting up kubectl
The kubectl command exists for both Linux and Mac OS X. Since I’m running this installation from my MacBook, I’ll be installing the Mac OS X version of kubectl. This means I’ll be running the cluster via Vagrant but interacting with that cluster from my MacBook.
To download the kubectl command, we will once again use curl.
$ curl https://storage.googleapis.com/kubernetes-release/release/v1.2.0/bin/darwin/amd64/kubectl > kubectl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 38.7M  100 38.7M    0     0  10.4M      0  0:00:03  0:00:03 --:--:-- 10.4M
$ chmod 750 kubectl
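For readers on Linux, the download URL differs only in the platform directory. The sketch below builds the URL from a version and platform name; the darwin path is the one used above, while the linux path is an assumption that it follows the same layout. Both helper names are hypothetical.

```shell
# Hypothetical helper: build the kubectl download URL for a release and
# platform. The darwin form matches the command above; the linux form is
# assumed to follow the same directory layout.
kubectl_url() {
  local version="$1" os="$2"
  echo "https://storage.googleapis.com/kubernetes-release/release/${version}/bin/${os}/amd64/kubectl"
}

# Map the output of uname -s to the directory name used in the URL.
detect_os() {
  case "$(uname -s)" in
    Darwin) echo darwin ;;
    Linux)  echo linux ;;
    *)      echo "unsupported platform" >&2; return 1 ;;
  esac
}

# Show the URL for the Mac OS X build without downloading anything.
kubectl_url v1.2.0 darwin

# Usage (commented out so the sketch does not download anything):
# curl -fL "$(kubectl_url v1.2.0 "$(detect_os)")" -o kubectl && chmod 750 kubectl
```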
After the kubectl binary is downloaded and permissions are changed to allow execution, the kubectl command is almost ready. One more step is required before we can start interacting with our Kubernetes cluster. That step is to configure the kubectl command.
$ export KUBECONFIG=~/.kube/config
As with most Kubernetes scripts, the kubectl command’s configuration is driven by environmental variables. When we executed the cluster installation script above, that script created a .kube configuration directory in my user’s home directory. Within that directory, it also created a file named config. This file is used to store information about the Kubernetes cluster that was created.
By setting the KUBECONFIG environmental variable to ~/.kube/config, we are defining that the kubectl command should reference this configuration file. Let’s take a quick look at that file to get a better understanding of what is being set.
$ cat ~/.kube/config
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <SSLKEYDATA>
    server: https://10.245.1.2
  name: vagrant
contexts:
- context:
    cluster: vagrant
    user: vagrant
  name: vagrant
current-context: vagrant
kind: Config
preferences: {}
users:
- name: vagrant
  user:
    client-certificate-data: <SSLKEYDATA>
    client-key-data: <SSLKEYDATA>
    token: sometoken
- name: vagrant-basic-auth
  user:
    password: somepassword
    username: admin
The .kube/config file sets two main pieces of information:
- Location of the cluster
- Authentication data for communicating with that cluster
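To see the first of these items without reading the whole file, the relevant values can be extracted with a little text processing. The sketch below works on a trimmed copy of the file shown above, written to a temporary location so that no real configuration is touched. The config_value helper is hypothetical and does a naive key match; it is not a YAML parser.

```shell
# Write a trimmed copy of the config shown above to a temporary file so the
# sketch does not touch a real ~/.kube/config.
sample_config="$(mktemp)"
cat > "$sample_config" <<'EOF'
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <SSLKEYDATA>
    server: https://10.245.1.2
  name: vagrant
current-context: vagrant
kind: Config
EOF

# Hypothetical helper: print the value that follows the first line whose
# first field is "<key>:". A naive text match, not a YAML parser.
config_value() {
  awk -v key="$1:" '$1 == key { print $2; exit }' "$2"
}

SERVER="$(config_value server "$sample_config")"
CONTEXT="$(config_value current-context "$sample_config")"
echo "Cluster at $SERVER, using context $CONTEXT"
rm -f "$sample_config"
```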
With the .kube/config file defined, let’s attempt to execute a kubectl command against our Kubernetes cluster to verify everything is working.
$ ./kubectl get nodes
NAME                STATUS    AGE
kubernetes-node-1   Ready     31m
kubernetes-node-2   Ready     24m
The output of the ./kubectl get nodes command shows us that we were able to connect to our Kubernetes cluster and display the status of our two nodes, kubernetes-node-1 and kubernetes-node-2. With this, we can move on as our installation is complete.
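The same check can be scripted by counting the nodes that report Ready. This is a sketch that runs against the output copied from above rather than a live cluster; count_ready is a hypothetical helper, and it assumes the three-column layout shown in this article.

```shell
# Output copied from the get nodes command above.
nodes_output='NAME                STATUS    AGE
kubernetes-node-1   Ready     31m
kubernetes-node-2   Ready     24m'

# Hypothetical helper: count rows (skipping the header) whose STATUS
# column is exactly "Ready".
count_ready() {
  awk 'NR > 1 && $2 == "Ready" { n++ } END { print n + 0 }'
}

READY="$(printf '%s\n' "$nodes_output" | count_ready)"
if [ "$READY" -eq 2 ]; then
  echo "both nodes are Ready"
else
  echo "only $READY node(s) Ready"
fi
```

Against a real cluster, the input would come from piping ./kubectl get nodes into count_ready instead of using the copied text.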
More About Kubernetes Nodes
In the command above, we used kubectl to show the status of the available Kubernetes Nodes on this cluster. However, we really didn’t explore what a node is or what role it plays within a cluster.
A Kubernetes Node is a physical or virtual (in our case, virtual) machine used to host application containers. In a traditional container-based environment, you would typically define that specific containers run on specified physical or virtual hosts. In a Kubernetes cluster, however, you simply define what application containers you wish to run. The Kubernetes master determines which node the application container will run on.
This methodology also enables the Kubernetes cluster to perform tasks such as automated restarts when containers or nodes die.
Deploying Our Application
With our Kubernetes cluster ready, we can now start deploying application containers. The application container we will be deploying today will be an instance of Ghost. Ghost is a popular JavaScript-based blogging platform, and with its official Docker image, it’s pretty simple to deploy.
Since we’ll be using a prebuilt Docker container, we won’t need to first build a Docker image. However, it is important to call out that in order to use custom-built containers on a Kubernetes cluster, you must first build the container image and push it to a Docker repository such as Docker Hub.
To start our Ghost container, we will use the ./kubectl command with the run option.
$ ./kubectl run ghost --image=ghost --port=2368
deployment "ghost" created
In the command above, we created a deployment named ghost, using the image ghost, and specified that the ghost container requires port 2368. Before going too far, let’s first verify that the container is running. We can verify this by executing the kubectl command with the get pods option.
$ ./kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
ghost-3550706830-4c2n5   1/1       Running   0          20m
The get pods option will tell the kubectl command to list all of the Kubernetes Pods currently deployed to the cluster.
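When this check is part of a script, it helps to turn the listing into a simple pass or fail. The sketch below does that for the output copied from above; all_pods_ready is a hypothetical helper that assumes the five-column layout shown in this article.

```shell
# Output copied from the get pods command above.
pods_output='NAME                     READY     STATUS    RESTARTS   AGE
ghost-3550706830-4c2n5   1/1       Running   0          20m'

# Hypothetical helper: succeed only if every listed Pod has STATUS Running
# and a READY column of the form n/n (all containers ready).
all_pods_ready() {
  awk 'NR > 1 {
         split($2, r, "/")
         if ($3 != "Running" || r[1] != r[2]) bad = 1
       }
       END { exit bad + 0 }'
}

if printf '%s\n' "$pods_output" | all_pods_ready; then
  echo "all pods ready"
else
  echo "some pods are not ready yet"
fi
```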
What Are Pods and Deployments?
A Pod is a group of containers that can communicate with each other as though they are running within the same system. For those familiar with Docker, this may sound like linking containers, but there’s actually a bit more to it than that. Containers within Pods are not only able to connect to each other through a localhost connection; the processes running within the containers are also able to share memory segments with other containers.
The goal of a Pod is to allow applications running within the Pod to interact in the same way they would as though they were not running in containers but simply running on the same physical host. This ability makes it easy to deploy applications that are not specifically designed to run within containers.
A Deployment, or Deployment Object, is similar to the concept of a Desired State. Essentially the Deployment is a high-level configuration around a desired function. For example, earlier when we started the Ghost container, we didn’t just launch a Ghost container. We actually configured Kubernetes to ensure that at least one copy of a Ghost container is running.
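The idea of a desired state can be illustrated with a toy comparison between what is wanted and what is running. To be clear, this is only a conceptual sketch: the reconcile function is invented for this article and is not how Kubernetes is implemented.

```shell
# Toy model of desired-state reconciliation. This only illustrates the
# concept; it is not how Kubernetes is implemented.
reconcile() {
  local desired="$1" running="$2"
  if [ "$running" -lt "$desired" ]; then
    echo "start $((desired - running))"
  elif [ "$running" -gt "$desired" ]; then
    echo "stop $((running - desired))"
  else
    echo "nothing to do"
  fi
}

reconcile 1 0   # the only Ghost container died
reconcile 1 1   # steady state
```

The point of the model is that we declare the number of copies we want and leave the corrective action to the system.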
Creating a service for Ghost
While containers within Pods can connect to systems external to the cluster, external systems cannot connect to them, and other Pods have no stable address at which to reach them. This is because, by default, the port defined for the Ghost container is not exposed beyond this cluster. This is where Services come into play.
In order to make our Ghost application accessible outside the cluster, the deployment we just created needs to be exposed as a Kubernetes Service. To set our Ghost deployment as a service, we will use the kubectl command once again, this time using the expose option.
$ ./kubectl expose deployment ghost --type="NodePort"
service "ghost" exposed
In the above command, we used the flag --type with the argument of NodePort. This flag defines the service type to expose for this service, in this case a NodePort service type. The NodePort service type will set all nodes to listen on the specified port. We can see our change take effect if we use the kubectl command again, but this time with the get services option.
$ ./kubectl get services
NAME         CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
ghost        10.247.63.140   nodes         2368/TCP   55s
kubernetes   10.247.0.1      <none>        443/TCP    19m
Service types
At the moment, Kubernetes supports three service types:
- ClusterIP
- NodePort
- LoadBalancer
If we wanted to expose this service only to other Pods within this cluster, we could use the ClusterIP service type, which is the default. This gives the service an internal IP address that is reachable only from within the cluster.
The LoadBalancer service type is designed to provision an external IP to act as a Load Balancer for the service. Since our deployment is leveraging Vagrant on a local laptop, this option does not work in our environment. It does work with Kubernetes clusters deployed in cloud environments like GCE or AWS.
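The reasoning in this section can be captured in a small helper that picks a service type from the provider name. This is a sketch only: service_type_for is a hypothetical function, the provider names gce and aws are taken from the text above, and the command is printed rather than executed.

```shell
# Hypothetical helper: pick a --type value for kubectl expose based on the
# provider, following the reasoning in this section.
service_type_for() {
  case "$1" in
    gce|aws) echo LoadBalancer ;;  # cloud platforms can provision an external IP
    *)       echo NodePort ;;      # for example, vagrant on a laptop
  esac
}

TYPE="$(service_type_for "${KUBERNETES_PROVIDER:-vagrant}")"
# Print the command instead of running it, so the sketch needs no cluster.
echo "./kubectl expose deployment ghost --type=\"$TYPE\""
```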
Testing our Ghost instance
Since we did not specify a port to use when defining our NodePort service, Kubernetes randomly assigned a port. To see what port it assigned, we can use the kubectl command, with the describe service option.
$ ./kubectl describe service ghost
Name:              ghost
Namespace:         default
Labels:            run=ghost
Selector:          run=ghost
Type:              NodePort
IP:                10.247.63.140
Port:              <unset>  2368/TCP
NodePort:          <unset>  32738/TCP
Endpoints:         10.246.3.3:2368
Session Affinity:  None
No events.
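Rather than reading the port off the screen, a script can extract it from the describe output. The sketch below does this on lines copied from the output above; node_port is a hypothetical helper, and the node address 10.245.1.3 is the one used for the test request in this section.

```shell
# Lines copied from the describe service output above.
describe_output='Type:              NodePort
IP:                10.247.63.140
Port:              <unset>  2368/TCP
NodePort:          <unset>  32738/TCP
Endpoints:         10.246.3.3:2368'

# Hypothetical helper: find the line whose first field is "NodePort:" and
# print the number in front of the "/TCP" suffix of its last field.
node_port() {
  awk '$1 == "NodePort:" { split($NF, p, "/"); print p[1] }'
}

NODE_PORT="$(printf '%s\n' "$describe_output" | node_port)"
echo "http://10.245.1.3:${NODE_PORT}"
```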
We can see that the port assigned is 32738. With this port, we can use the curl command to make an HTTP call to any of our Kubernetes Nodes and have it forwarded to port 2368 within our application’s container.
$ curl -vk http://10.245.1.3:32738
* Rebuilt URL to: http://10.245.1.3:32738/
*   Trying 10.245.1.3...
* Connected to 10.245.1.3 (10.245.1.3) port 32738 (#0)
> GET / HTTP/1.1
> Host: 10.245.1.3:32738
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 200 OK
< X-Powered-By: Express
From the output of the curl command, we can see that the connection was successful with a 200 OK response. What is interesting about this is that the request was to a node that wasn’t running the Ghost container. We can see this if we use kubectl to describe the Pod.
$ ./kubectl describe pod ghost-3550706830-ss4se
Name:           ghost-3550706830-ss4se
Namespace:      default
Node:           kubernetes-node-2/10.245.1.4
Start Time:     Sat, 16 Apr 2016 21:13:20 -0700
Labels:         pod-template-hash=3550706830,run=ghost
Status:         Running
IP:             10.246.3.3
Controllers:    ReplicaSet/ghost-3550706830
Containers:
  ghost:
    Container ID:       docker://55ea497a166ff13a733d4ad3be3abe42a6d7f3d2c259f2653102fedda485e25d
    Image:              ghost
    Image ID:           docker://09849b7a78d3882afcd46f2310c8b972352bc36aaec9f7fe7771bbc86e5222b9
    Port:               2368/TCP
    QoS Tier:
      memory:   BestEffort
      cpu:      Burstable
    Requests:
      cpu:      100m
    State:      Running
      Started:  Sat, 16 Apr 2016 21:14:33 -0700
    Ready:      True
    Restart Count:      0
    Environment Variables:
Conditions:
  Type          Status
  Ready         True
Volumes:
  default-token-imnyi:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-imnyi
No events.
In the description above, we can see that the Ghost Pod is running on kubernetes-node-2. However, the HTTP request we just made was to kubernetes-node-1. This is made possible by a Kubernetes component called kube-proxy. With kube-proxy, whenever traffic arrives on a service’s port, the Kubernetes node will check if the service is running local to that node. If not, it will redirect the traffic to a node that is running that service.
In the case above, this means that even though the HTTP request was made to kubernetes-node-1, kube-proxy redirected that traffic to kubernetes-node-2, where the container is running.
This feature allows users to run services without having to worry about where the service is and whether or not it has moved from node to node. It is a very useful feature that reduces quite a bit of maintenance and headache.
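Because every node listens on the assigned port, the same request should succeed against any node address. The sketch below only builds the URLs for the two node addresses that appear in the output above (10.245.1.3 and 10.245.1.4); node_urls is a hypothetical helper, and the actual requests are left commented out so nothing is sent.

```shell
# Hypothetical helper: print one URL per node address for a given port.
node_urls() {
  local port="$1"; shift
  local ip
  for ip in "$@"; do
    echo "http://${ip}:${port}/"
  done
}

# Node addresses and port as they appear in the output above.
node_urls 32738 10.245.1.3 10.245.1.4

# Usage against a live cluster (commented out so no requests are made):
# node_urls 32738 10.245.1.3 10.245.1.4 | xargs -n1 curl -s -o /dev/null -w '%{http_code}\n'
```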
Scaling a Deployment
Now that our Ghost service is running and accessible to the outside world, we need to perform our last task: scaling out our Ghost application across multiple instances. To do this, we can simply call the kubectl command again, this time with the scale option.
$ ./kubectl scale deployment ghost --replicas=4
deployment "ghost" scaled
We’ve specified that there should be 4 replicas of the ghost deployment. If we execute kubectl get pods again, we should see 3 additional pods for the ghost deployment.
$ ./kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
ghost-3550706830-49r81   1/1       Running   0          7h
ghost-3550706830-4c2n5   1/1       Running   0          8h
ghost-3550706830-7o2fs   1/1       Running   0          7h
ghost-3550706830-f3dae   1/1       Running   0          7h
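To confirm the scale-out from a script, the running Pods for the deployment can be counted and compared with the requested number of replicas. This sketch runs on the output copied from above; count_running is a hypothetical helper that matches Pods by name prefix.

```shell
# Output copied from the get pods command above.
scaled_output='NAME                     READY     STATUS    RESTARTS   AGE
ghost-3550706830-49r81   1/1       Running   0          7h
ghost-3550706830-4c2n5   1/1       Running   0          8h
ghost-3550706830-7o2fs   1/1       Running   0          7h
ghost-3550706830-f3dae   1/1       Running   0          7h'

# Hypothetical helper: count Running Pods whose name starts with
# "<deployment>-".
count_running() {
  awk -v prefix="$1-" \
    'NR > 1 && index($1, prefix) == 1 && $3 == "Running" { n++ }
     END { print n + 0 }'
}

RUNNING="$(printf '%s\n' "$scaled_output" | count_running ghost)"
echo "$RUNNING of 4 ghost replicas running"
```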
With this last step, we now have our Ghost service running across multiple nodes and multiple pods. As requests are made to our Ghost service, those requests will be load balanced to our various Ghost instances.
Summary
In this article, we learned how to deploy a multinode Kubernetes cluster across several Vagrant-managed virtual machines. We also learned how to deploy an application to the cluster, make that application accessible to external systems, and finally scale out the application to four instances. Even though we performed all of these tasks, we have only scratched the surface of Kubernetes.
Want to explore more of Kubernetes features? Let us know in the comments.
Reference: Getting Started with Kubernetes from our WCG partner Benjamin Cane at the Codeship Blog.