
Using Swarm with Calico on Docker Machine

In 2015, Docker announced their plugin system and revealed a list of early network and storage integrators. During Docker Global Hackday #3, we started playing with Swarm and Calico — however the tooling and integration at that stage made it difficult to implement.

Thanks to the hard work of contributors to both projects, the barrier to entry for implementing Swarm and Calico has been greatly reduced. In this article, we'll look at setting up a Docker Swarm that uses Calico as its network plugin, all driven through Docker Machine.

Swarm allows a set of Docker hosts to be clustered, presenting a container API which abstracts scheduling of containers. By using a swarm instead of a set of hosts, much of the complexity around managing application availability and distributing resources is taken care of.

Project Calico provides a layer 3 network implementation, aimed at scalable datacenter deployments. Compared to traditional network overlays, Calico provides a more efficient implementation with minimal packet encapsulation. This allows better usage of node resources and a simple yet powerful network stack for your infrastructure.

While Calico is a great SDN solution in many scenarios, it does have drawbacks in some advanced cases. You can read more about why you should use Calico in the Calico documentation.

To prototype using Swarm with Calico, I'll be using Docker Machine to create VMs on VirtualBox. There may be some minor changes when using a different driver, but the core process should be the same.

Many of the steps taken here are not production safe — keep an eye out for warnings around implementation specifics in this document, as well as in the relevant provider docs.

Implementing Docker Swarm without Calico

Let’s start by creating a non-Calico Docker swarm using the standard Docker Machine interface. This is purely an exercise for comparison. We’ll be following the documentation listed in the Swarm docs.

You’ll need an existing Docker instance to get started, purely to create a Swarm discovery token, so if need be, create a Docker Machine instance just for this. Let’s create a Swarm discovery token:

$ docker run --rm swarm create
c25aa882df76a92ae962f4b4fc26168d

Next, we can create the Swarm master and agent machines; Docker Machine uses this token to launch the Swarm containers that coordinate the cluster.

$ docker-machine create -d virtualbox --swarm --swarm-master --swarm-discovery token://c25aa882df76a92ae962f4b4fc26168d swarm-master
Running pre-create checks...
Creating machine...
Waiting for machine to be running, this may take a few minutes...
Machine is running, waiting for SSH to be available...
Detecting operating system of created instance...
Provisioning created instance...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Configuring swarm...
To see how to connect Docker to this machine, run: docker-machine env swarm-master

$ docker-machine create -d virtualbox --swarm --swarm-discovery token://c25aa882df76a92ae962f4b4fc26168d swarm-agent-00
Running pre-create checks...
Creating machine...
Waiting for machine to be running, this may take a few minutes...
Machine is running, waiting for SSH to be available...
Detecting operating system of created instance...
Provisioning created instance...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Configuring swarm...
To see how to connect Docker to this machine, run: docker-machine env swarm-agent-00
...
$ docker-machine create -d virtualbox --swarm --swarm-discovery token://c25aa882df76a92ae962f4b4fc26168d swarm-agent-NN
...

At this point, you should have a running Swarm cluster. Let’s take a look at it.

$ docker-machine ls
NAME             ACTIVE   DRIVER       STATE     URL                         SWARM
swarm-agent-00   -        virtualbox   Running   tcp://192.168.99.101:2376   swarm-master
swarm-agent-01   -        virtualbox   Running   tcp://192.168.99.102:2376   swarm-master
swarm-master     *        virtualbox   Running   tcp://192.168.99.100:2376   swarm-master (master)
$ eval $(docker-machine env --swarm swarm-master)
$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                                     NAMES
8f4cef9a02b3        swarm:latest        "/swarm join --advert"   9 minutes ago       Up 9 minutes        2375/tcp                                  swarm-agent-01/swarm-agent
01a3712457c7        swarm:latest        "/swarm join --advert"   10 minutes ago      Up 10 minutes       2375/tcp                                  swarm-agent-00/swarm-agent
5220dca465d2        swarm:latest        "/swarm join --advert"   12 minutes ago      Up 12 minutes       2375/tcp                                  swarm-master/swarm-agent
1205877156ee        swarm:latest        "/swarm manage --tlsv"   12 minutes ago      Up 12 minutes       2375/tcp, 192.168.99.100:3376->3376/tcp   swarm-master/swarm-agent-master
$ docker info
Containers: 4
Images: 3
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 3
 swarm-agent-00: 192.168.99.101:2376
  └ Status: Healthy
  └ Containers: 1
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 1.021 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=4.1.13-boot2docker, operatingsystem=Boot2Docker 1.9.1 (TCL 6.4.1); master : cef800b - Fri Nov 20 19:33:59 UTC 2015, provider=virtualbox, storagedriver=aufs
 swarm-agent-01: 192.168.99.102:2376
  └ Status: Healthy
  └ Containers: 1
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 1.021 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=4.1.13-boot2docker, operatingsystem=Boot2Docker 1.9.1 (TCL 6.4.1); master : cef800b - Fri Nov 20 19:33:59 UTC 2015, provider=virtualbox, storagedriver=aufs
 swarm-master: 192.168.99.100:2376
  └ Status: Healthy
  └ Containers: 2
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 1.021 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=4.1.13-boot2docker, operatingsystem=Boot2Docker 1.9.1 (TCL 6.4.1); master : cef800b - Fri Nov 20 19:33:59 UTC 2015, provider=virtualbox, storagedriver=aufs
CPUs: 3
Total Memory: 3.064 GiB
Name: 1205877156ee

docker ps now shows containers running on all nodes. You can interact with containers on the swarm as a whole using the Docker CLI.
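
For example, with the Swarm environment from the eval above still active, a container started through the normal Docker CLI is scheduled onto one of the nodes for you. A quick sketch (the hello-web name and nginx image are purely illustrative):

$ docker run -d --name hello-web nginx
$ docker ps
# the new container's name is prefixed with the node Swarm scheduled it onto,
# e.g. swarm-agent-00/hello-web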

Integrating Calico

Using Swarm with Calico requires a few changes from the standard swarm creation process.

Each node within the Swarm cluster must be configured to use an external cluster store that is also accessible to Calico. Each host also needs to run a Calico node via the calicoctl binary, which configures the host and starts a set of Docker containers that maintain the cluster.

Because of these changes, we can’t create a swarm using the Docker Machine helpers. We will need to create individual machine hosts and then manually create Swarm and Calico clusters. We’ll be following the guide from the Calico docs for implementing Calico as a Docker network plugin. We hope to see more direct plugin support around the docker-machine tool in the future.

Create a machine cluster

To implement a Calico cluster, we’ll need a set of unclustered Docker hosts.

$ docker-machine create -d virtualbox node-00
Running pre-create checks...
Creating machine...
Waiting for machine to be running, this may take a few minutes...
Machine is running, waiting for SSH to be available...
Detecting operating system of created instance...
Provisioning created instance...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
To see how to connect Docker to this machine, run: docker-machine env node-00

$ docker-machine create -d virtualbox node-01
Running pre-create checks...
Creating machine...
Waiting for machine to be running, this may take a few minutes...
Machine is running, waiting for SSH to be available...
Detecting operating system of created instance...
Provisioning created instance...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
To see how to connect Docker to this machine, run: docker-machine env node-01
...
$ docker-machine create -d virtualbox node-NN
…
$ docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL                         SWARM
node-00   *        virtualbox   Running   tcp://192.168.99.100:2376   
node-01   -        virtualbox   Running   tcp://192.168.99.101:2376   
node-02   -        virtualbox   Running   tcp://192.168.99.102:2376

Set up a cluster store

We need to use an external cluster store, etcd, in order to synchronize the Calico and Swarm clusters.

We can run this etcd store on our cluster. In a production environment, this should be a scaled cluster supporting HA. However in this example, we’ll just run a single instance. We’ll run this single instance of etcd on node-00, which is on IP 192.168.99.100.

$ eval $(docker-machine env node-00)
$ docker run -d -p 2379:2379 quay.io/coreos/etcd -advertise-client-urls http://192.168.99.100:2379 -listen-client-urls http://0.0.0.0:2379
...
$ curl 192.168.99.100:2379/v2/keys
{"action":"get","node":{"dir":true}}
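
Since everything else in this setup will depend on that single etcd instance, it may be worth adding a restart policy so the store comes back up whenever the Docker daemon on node-00 restarts. A minimal variation on the command above, using Docker's standard --restart flag:

$ docker run -d --restart=always -p 2379:2379 quay.io/coreos/etcd -advertise-client-urls http://192.168.99.100:2379 -listen-client-urls http://0.0.0.0:2379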

Set up a Calico cluster

Thanks to tooling and the fact that we’re using boot2docker as a base OS in Docker Machine, Calico requires very little setup.

We simply need to download calicoctl and use it to set up the cluster on each host. The ETCD_AUTHORITY variable stays the same on every host, while the --ip argument (shown as $NODE_IP below) should be set to that host's own IP address.

$ docker-machine ssh node-00
...
docker $ wget http://www.projectcalico.org/latest/calicoctl
Connecting to www.projectcalico.org (64.91.234.195:80)
Connecting to www.projectcalico.org (64.91.234.195:80)
Connecting to github.com (192.30.252.128:443)
Connecting to github-cloud.s3.amazonaws.com (54.231.114.122:443)
calicoctl            100% |*******************************|  5428k  0:00:00 ETA
docker $ chmod +x calicoctl
docker $ sudo ETCD_AUTHORITY=192.168.99.100:2379 ./calicoctl node --libnetwork --ip=$NODE_IP
Pulling Docker image calico/node:v0.14.0

Calico node is running with id: 3cb0b50060d2bf423bd22f51ec30b9408bf5d199f631a70da7ca340902e7e134
Pulling Docker image calico/node-libnetwork:v0.7.0

Calico libnetwork driver is running with id: a8a9e8bd63e2540de76fb89fc40e31652b2df08392366046d3304626387e5b01
docker $ sudo  ETCD_AUTHORITY=192.168.99.100:2379  ./calicoctl status
calico-node container is running. Status: Up 7 minutes
Running felix version 1.3.0rc6

IPv4 BGP status
IP: 192.168.99.100    AS Number: 64511 (inherited)
+--------------+-----------+-------+-------+------+
| Peer address | Peer type | State | Since | Info |
+--------------+-----------+-------+-------+------+
+--------------+-----------+-------+-------+------+

IPv6 BGP status
No IPv6 address configured.

This needs to be run on every host, using that host's own IP. Once this is done, calicoctl status should list all the other nodes as established BGP peers.

docker $ sudo ETCD_AUTHORITY=192.168.99.100:2379 ./calicoctl status
calico-node container is running. Status: Up 19 seconds
Running felix version 1.3.0rc6

IPv4 BGP status
IP: 192.168.99.102    AS Number: 64511 (inherited)
+----------------+-------------------+-------+----------+-------------+
|  Peer address  |     Peer type     | State |  Since   |     Info    |
+----------------+-------------------+-------+----------+-------------+
| 192.168.99.100 | node-to-node mesh |   up  | 23:38:29 | Established |
| 192.168.99.101 | node-to-node mesh |   up  | 23:38:30 | Established |
+----------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 address configured.
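
If you'd rather not open an ssh session on every machine by hand, the same per-host setup can be scripted from your workstation. A rough sketch, assuming the three nodes created earlier and etcd on 192.168.99.100:

for node in node-00 node-01 node-02; do
  ip=$(docker-machine ip $node)
  # fetch calicoctl on the host (needs re-running after a VM restart)
  docker-machine ssh $node "wget -q http://www.projectcalico.org/latest/calicoctl && chmod +x calicoctl"
  # start the Calico node and libnetwork containers against the shared etcd store
  docker-machine ssh $node "sudo ETCD_AUTHORITY=192.168.99.100:2379 ./calicoctl node --libnetwork --ip=$ip"
done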

Configure your Docker hosts

Before we can set up a Swarm, we need to reconfigure each host to use the same cluster store as Calico.

This is complicated slightly by the fact that we're using boot2docker. However, we can still point the Docker daemon at the external cluster store by adding a --cluster-store argument to the EXTRA_ARGS in /var/lib/boot2docker/profile. This needs to be done on each host, for example from a docker-machine ssh session.

$ cat <<EOF > profile.new

EXTRA_ARGS='
--label provider=virtualbox
--cluster-store=etcd://192.168.99.100:2379
'
CACERT=/var/lib/boot2docker/ca.pem
DOCKER_HOST='-H tcp://0.0.0.0:2376'
DOCKER_STORAGE=aufs
DOCKER_TLS=auto
SERVERKEY=/var/lib/boot2docker/server-key.pem
SERVERCERT=/var/lib/boot2docker/server.pem

EOF
$ sudo chown root:root profile.new && sudo mv profile.new /var/lib/boot2docker/profile && sudo /etc/init.d/docker restart

Now your Docker hosts are configured to use the same cluster store as Calico.
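
A quick way to verify the change took effect is to check that docker info on each host now reports the cluster store (the exact wording can vary between Docker versions):

$ docker-machine ssh node-00 "docker info | grep -i 'cluster store'"
# expect something like: Cluster store: etcd://192.168.99.100:2379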

Setting up a Swarm

Just like with the simple Swarm example, we first need to create a cluster token and then Swarm nodes and a manager.

If you plan on deploying Swarm in a production environment, be sure to read the Swarm docs on discovery methods and use something other than the standard Swarm token service.

Starting the Swarm nodes is fairly standard. However, the manager needs TLS enabled, so we mount the generated boot2docker certificates when starting its container.

Keep in mind that in this configuration the Swarm manager is a single point of failure. For production environments, be sure to follow the Swarm docs on HA deployments.

$ docker run --rm swarm create
938891a526e627d6ab11dd2e92cb8694
$ docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL                         SWARM
node-00   -        virtualbox   Running   tcp://192.168.99.100:2376   
node-01   -        virtualbox   Running   tcp://192.168.99.101:2376   
node-02   *        virtualbox   Running   tcp://192.168.99.102:2376  
$ eval $(docker-machine env node-00)
$ docker run -d swarm join --addr=192.168.99.100:2376 token://938891a526e627d6ab11dd2e92cb8694
$ eval $(docker-machine env node-01)
$ docker run -d swarm join --addr=192.168.99.101:2376 token://938891a526e627d6ab11dd2e92cb8694
$ eval $(docker-machine env node-02)
$ docker run -d swarm join --addr=192.168.99.102:2376 token://938891a526e627d6ab11dd2e92cb8694
$ docker run -dp 2377:2375 -v /var/lib/boot2docker:/var/lib/boot2docker swarm manage --tlsverify --tlscert /var/lib/boot2docker/server.pem --tlscacert /var/lib/boot2docker/ca.pem --tlskey /var/lib/boot2docker/server-key.pem token://938891a526e627d6ab11dd2e92cb8694
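
Since we already run etcd as the cluster store, it can also double as the Swarm discovery backend in place of the hosted token service, which is closer to what you'd want in production. A sketch of the equivalent commands (the /swarm key prefix is an arbitrary choice):

$ eval $(docker-machine env node-00)
$ docker run -d swarm join --addr=192.168.99.100:2376 etcd://192.168.99.100:2379/swarm
# repeat the join on node-01 and node-02 with their own addresses, then start the manager:
$ docker run -dp 2377:2375 -v /var/lib/boot2docker:/var/lib/boot2docker swarm manage --tlsverify --tlscert /var/lib/boot2docker/server.pem --tlscacert /var/lib/boot2docker/ca.pem --tlskey /var/lib/boot2docker/server-key.pem etcd://192.168.99.100:2379/swarm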

At this point, we have a Swarm manager with control over all nodes in the cluster, available via 192.168.99.102:2377. From here, we can follow the standard steps to test Docker Swarm and Calico listed in the Calico tutorial.

Rather than using the -H flag, you can simply redefine the DOCKER_HOST variable to point at the Swarm endpoint. Also check that your etcd cluster store is still running; restarting the Docker service may not have brought it back up automatically.

$ eval $(docker-machine env node-02)
$ export DOCKER_HOST=tcp://192.168.99.102:2377
$ docker ps
CONTAINER ID        IMAGE                           COMMAND                  CREATED             STATUS              PORTS                                                         NAMES
fe3fcff4e0d2        calico/node-libnetwork:v0.7.0   "./start.sh"             About an hour ago   Up 56 minutes                                                                     node-02/calico-libnetwork
83a472b3ba65        calico/node:v0.14.0             "/sbin/start_runit"      About an hour ago   Up 56 minutes                                                                     node-02/calico-node
dbe7968c4707        calico/node-libnetwork:v0.7.0   "./start.sh"             About an hour ago   Up 53 minutes                                                                     node-01/calico-libnetwork
4bddb4e3d6b4        calico/node:v0.14.0             "/sbin/start_runit"      About an hour ago   Up 53 minutes                                                                     node-01/calico-node
a8a9e8bd63e2        calico/node-libnetwork:v0.7.0   "./start.sh"             About an hour ago   Up 54 minutes                                                                     node-00/calico-libnetwork
3cb0b50060d2        calico/node:v0.14.0             "/sbin/start_runit"      About an hour ago   Up 54 minutes                                                                     node-00/calico-node
3c9092601f5a        quay.io/coreos/etcd             "/etcd -advertise-cli"   2 hours ago         Up About a minute   2380/tcp, 4001/tcp, 192.168.99.100:2379->2379/tcp, 7001/tcp   node-00/boring_panini

It is also simpler to manage your Calico cluster by executing commands inline via docker-machine ssh, which can be aliased in Bash or scripted (an example is sketched after the output below). Keep in mind that any time one of your Docker Machine VMs is restarted, you'll need to re-download the calicoctl binary.

$ docker-machine ssh node-00 sudo ETCD_AUTHORITY=192.168.99.100:2379 ./calicoctl pool show

+----------------+---------+
|   IPv4 CIDR    | Options |
+----------------+---------+
| 192.168.0.0/16 |         |
+----------------+---------+
+--------------------------+---------+
|        IPv6 CIDR         | Options |
+--------------------------+---------+
| fd80:24e2:f998:72d6::/64 |         |
+--------------------------+---------+
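
As an example of the aliasing mentioned above, a small Bash function keeps these commands manageable (the function name and argument handling are just a suggestion):

$ calicoctl_on() {
>   local node=$1; shift
>   docker-machine ssh $node sudo ETCD_AUTHORITY=192.168.99.100:2379 ./calicoctl "$@"
> }
$ calicoctl_on node-00 status
$ calicoctl_on node-01 pool show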

Using Calico

Now that Calico is set up, what does that mean for your infrastructure and your applications?

First of all, Calico is not an entirely self-managing service. Be sure to read the documentation thoroughly to ensure you are applying the correct base iptables rules and configuring your nodes in a sensible manner. It’s important to understand what restrictions Calico brings to your infrastructure, as well as what it does and does not protect against. The Calico docs provide information on these concerns.

With the Calico libnetwork driver in place, you can manage networks and container IPs through the standard docker network interface. This means that after the initial Calico node configuration, most day-to-day operational changes can be made through the libnetwork driver, in much the same way they would be with an overlay network.
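
For example, creating a Calico-backed network and attaching containers to it looks much like it would with any other libnetwork driver. A rough sketch (the network and container names are illustrative, and the exact flags expected can differ between Calico releases, so check the tutorial referenced earlier):

$ docker network create --driver calico demo-net
$ docker run -d --name web --net demo-net nginx
$ docker network inspect demo-net
# containers started on any node in the swarm and attached to demo-net
# get Calico-routed connectivity to each other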

Conclusion

Calico can greatly enhance your Docker infrastructure by facilitating scale and providing a more full-featured SDN than a standard overlay.

With recent updates to Docker and Calico tooling, setting up and maintaining a cluster is far simpler, and running one in a highly available manner in production is well documented. As further tooling and integrations develop, we expect this process to become even simpler and more configurable.

Calico provides a container network that is simple to implement and a highly scalable alternative to the standard overlay. Because it integrates via libnetwork and is driven through a standard control interface, it is also easy to swap Calico out for another network layer or to connect different geographic regions as a single homogeneous network.

Before settling on Calico, be sure to read up on its benefits and those of alternatives such as Weave and the standard overlay.

Reference: Using Swarm with Calico on Docker Machine from our WCG partner Florian Motlik at the Codeship Blog.

Brendan Fosberry

Brendan Fosberry is a software engineer at @codeship. He has a background in Datacenter Automation and Docker services and can usually be found fiddling in Go, Ruby or C#.