Story about Docker Swarm with GlusterFS and Gitlab CI – Episode 1
This article is the first in a series that explains how to set up a complete, production-ready hosting, continuous integration and build environment with Docker Swarm, GlusterFS and Gitlab Community Edition. Because each topic is complex, the series is divided into multiple parts that explain the details.
This introductory article gives you the big picture of what we’re doing at our company and of the infrastructure serving our more than 120 Docker containers (in 51 Swarm stacks), most of which run production customer applications we have built.
We are a small team of web developers who spend most of the day working on customer and internal projects, improving existing ones or fixing and maintaining legacy applications. Most of the time we jump between multiple projects during a week, because while we develop new applications, the old ones also need attention (fixing, improving, upgrading, …). If you are constantly jumping between projects, manual deployments (FTP, git, or something else from the last century) can be confusing, error-prone and frustrating (for you, your team members and, not least, your customers).
About 6 months ago we decided to get rid of the old stuff like manual testing, manual deployment and manual maintenance of our hosting servers. Our final goal was not clear at that point, but we had clear ideas about the direction we should take and the problems we wanted to eliminate. We also knew that Docker and Gitlab would be our best friends on this journey to a reliable continuous integration architecture.
While playing around with Docker, Docker Swarm and Gitlab (and its great CI component) we learned a lot about these systems. And through this everyday process of learning by doing, the goal, or rather the direction we should take, became clearer and clearer.
Back then I never thought that we would build the architecture we have today. Compared to mid-2017, it’s like rocket science!
The Big Picture
Today, we push new code to our Gitlab repositories, almost always in separate topic branches (like feature/… or fix/…). Most of these commits get reviewed by one or more of our colleagues before they are merged into the master or release branch. Once the code arrives in the release branch, Gitlab’s CI system starts a new build pipeline. Each pipeline executes multiple CI jobs (e.g. testing, code quality checks, and so on). One of the most important (and interesting) jobs is the build job itself. At this stage, we build a Docker image that contains everything needed to serve our application. It’s a complete build-once, run-everywhere solution. After the Docker image has been built successfully (and has passed some tests, such as smoke tests, of course!), we push it to Gitlab’s built-in Docker Registry. Now this Docker image must be deployed to our production servers.
Our production environment consists of four dedicated servers: one build server (for Gitlab’s CI build runner) and three dockerhosts that host all our containers. To give you an idea of the server dimensions, here are some key specs:
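To make the order of events above concrete, here is a minimal Python sketch. It is not our actual pipeline configuration (that lives in .gitlab-ci.yml); the stage names and the run_pipeline helper are hypothetical and only illustrate that the jobs run one after another and that the image is pushed only if everything before it, including the smoke tests, succeeded.

```python
from typing import Callable, List, Tuple

# Hypothetical stage names for illustration only; the real jobs are
# defined in .gitlab-ci.yml and are not shown in this article.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failing job.

    Returns the names of the stages that completed successfully.
    """
    completed: List[str] = []
    for name, job in stages:
        if not job():
            break  # a failed job stops the pipeline; later stages never run
        completed.append(name)
    return completed


def always_ok() -> bool:
    return True


def always_fail() -> bool:
    return False


healthy = [
    ("test", always_ok),
    ("code_quality", always_ok),
    ("build_image", always_ok),
    ("smoke_test", always_ok),
    ("push_to_registry", always_ok),
]

# Same pipeline, but the smoke test fails: the image must not be pushed.
broken = [
    ("test", always_ok),
    ("code_quality", always_ok),
    ("build_image", always_ok),
    ("smoke_test", always_fail),
    ("push_to_registry", always_ok),
]

print(run_pipeline(healthy))
print(run_pipeline(broken))
```

In the second run the list of completed stages ends before the push, which is exactly the guarantee we want: nothing untested reaches the registry.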
– our three dockerhosts: Intel Xeon E5; 256 GB ECC RAM; 900 GB SSD RAID 0; 900 GB SSD
– our build machine: Intel Xeon E3; 64 GB ECC RAM; 500 GB NVMe SSD RAID 0
Every server is a dedicated machine, connected through a 1 Gbit/s switch in our provider’s data center.
All three dockerhosts are powerful machines whose only task is to run the Docker daemon and serve our containers. The only other packages installed are OpenVPN, GlusterFS, and Borg (our backup solution). All machines are joined into one big Docker Swarm cluster. This means that no matter on which host you start a (Swarm-controlled) Docker container, it may be spawned on another host inside the Swarm cluster. It does not matter on which node the Swarm started a container. It does not matter which node you are logged in to. It does not matter which public IP you enter in your browser. It does not matter if one or even two servers crash… Your application will be available and will survive most host-related crashes!
Even after 6 months of playing with it and using it in production… sometimes it’s still like magic!
Maybe you are interested in more in-depth details about the systems, components, workflows and tools, or the connections between them. If that’s the case, you can jump to one of the next articles or (recommended) read them one by one.
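How many crashes the Swarm itself can absorb depends on how many nodes are managers, which is not stated here. Swarm managers agree on the cluster state via Raft, so a majority of them must stay reachable for scheduling to keep working. The following Python sketch only does that arithmetic; the function names are made up for illustration, and treating all three dockerhosts as managers is an assumption.

```python
def raft_quorum(managers: int) -> int:
    """Smallest majority of manager nodes needed to keep a Raft quorum."""
    if managers < 1:
        raise ValueError("a swarm needs at least one manager")
    return managers // 2 + 1


def tolerated_manager_failures(managers: int) -> int:
    """How many managers may fail while a majority is still reachable."""
    return managers - raft_quorum(managers)


# Assumption: all three dockerhosts are Swarm managers.
for count in (1, 3, 5):
    print(
        f"{count} manager(s): quorum {raft_quorum(count)}, "
        f"tolerates {tolerated_manager_failures(count)} failure(s)"
    )
```

Under that assumption, one crashed host leaves the Swarm fully operational. If two of three managers are down, containers already running on the surviving node keep running, but the Swarm cannot schedule or move tasks until the quorum is restored.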
To be written….
Gitlab CI and our own (huge) build tool ecosystem
In this article we’ll give an overview and a detailed explanation of how we build our own Docker images with Gitlab. It will cover subjects like which base images we use, what a pipeline looks like, how our .gitlab-ci.yml is constructed and which tools we needed to achieve our vision of a continuous integration architecture.
Gitlab CI and our npm based build tools – Episode 2
Hostsystem OS Configuration
All in all, this will be a complete and quite secure setup guide showing how our dockerhosts are configured. It will cover subjects like the first minutes on a fresh OS, our reasons for choosing RAID 0, the configuration of OpenVPN and Docker, and securing them. A small side topic will be an explanation of GlusterFS and why we need it for our persistent data.
This article doesn’t claim to be a (complete) introduction to Docker Swarm. It’s more like a story of how we’re using it at production scale, what our docker-compose.yaml files look like and why we made certain decisions.