Cyrill Burgener is a backend engineer on the Developer Enablement team at GetYourGuide. In this article, he explains how his team implemented reliable, continuous deployment for tooling running on developers' laptops.
The Developer Enablement team owns a major suite of internal command-line interface (CLI) tooling used by our developers. Part of that ownership involves making sure the tooling is deployed successfully to developers’ laptops.
This is not a simple task because, while the deployment of web services is a well-trodden path in DevOps with containers and a range of continuous delivery (CD) tools, the same is not true for the deployment of CLI tools. Here, containers are not generally applicable and there are no widely used CD tools.
In the past, we deployed with a basic script that ran a self-update. It had to be invoked manually once a month, whenever a new version of the tooling was released, and it led to broken and outdated installations and frustrated developers. In this article, I’ll share what we learned and the practices we used to design a new deployment process that is faster and more reliable.
{{divider}}
Distributing our tooling suite comes with additional difficulties that stem from our requirements. The suite is written in Python, which brings some unique challenges to dependency management. On top of that, our developers can choose between macOS and Ubuntu for development, so all the tooling and dependencies need to be compatible with both platforms.
For most of the development environment, the tooling depends heavily on Docker and Docker Compose. In total, the dependencies are the Python interpreter, Python libraries, Docker, and Docker Compose. Each is installed differently, through a combination of DMG files, brew, and pip on macOS, and apt and pip on Ubuntu.
Finally, installing the tooling suite is one of the first steps new joiners need to take, so it has to be simple enough that a new associate engineer can do it. After all this, the installation process needs to be repeated regularly to apply updates.
We came up with a new solution to deliver our tooling suite that improves both speed and reliability. We would like to share some of the things we learned from that experience that might be helpful to others who are trying similar things:
Deciding on the technologies to use is one of the most important choices you’ll make when optimizing for the ease of tooling deployment. Different programming languages offer different ways of managing dependencies, and some are better for distributing software than others. If you can choose the language you’re going to use to write your tooling, we recommend picking one that can be shipped as a standalone file or directory with all dependencies included. Popular choices nowadays are Rust or Go.
Since our tooling suite already exists and is too large to be rewritten in a different language or with a different technology, we have to stick with Python. Unfortunately for us, Python offers no established, cross-platform method for shipping applications as a standalone package to the target system.
The alternative method we chose is installing each application plus its dependencies into a separate virtual environment, which we do using pipx. This isolates the dependencies of each application and avoids dependency version conflicts. The major drawback of using virtualenvs is that they’re prone to breaking on macOS. They depend on the location of the Python interpreter in the file system, which changes on every upgrade. Luckily, Python upgrades don’t happen very often and the broken installation can be fixed with minimal effort, which I’ll explain below.
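To make the failure mode concrete, here is a minimal Python sketch of how a broken virtualenv could be detected. It is an illustration, not the code of our installer. It assumes the common macOS and Linux layout in which `bin/python` inside the virtualenv is a symlink to the interpreter that created it; the function name `venv_is_broken` is hypothetical.

```python
from pathlib import Path


def venv_is_broken(venv_dir: Path) -> bool:
    """Return True if the virtualenv's interpreter can no longer be resolved."""
    # Assumption: bin/python is a symlink to the base interpreter.
    # Path.exists() follows symlinks, so it returns False when the link
    # target has moved or been deleted, for example after a Python upgrade.
    return not (venv_dir / "bin" / "python").exists()
```

A check like this only detects the problem. The repair is to recreate the environment, for which pipx provides commands such as `pipx reinstall-all`.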
Originally, upgrades would be executed via an upgrade command that was integrated into the tooling, i.e., a self-upgrade. This turns upgrading into a dangerous operation, as any error happening during the upgrade has the potential to leave the installation in a half-upgraded, broken state.
To make it safer, we separated the installer from the rest of the tooling. This makes upgrading simpler, as the whole installation can be removed and reinstalled cleanly from scratch. If it fails, the installer can easily be restarted. An added benefit is that the installer doesn’t have to be written in the same language as the tooling. We chose Go, which makes it very easy to distribute.
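The remove-and-reinstall idea can be sketched in a few lines. The Python below is an illustration under stated assumptions, not our Go installer: `clean_install`, `install_with_retry`, and the `install` callback are hypothetical names, and the callback stands in for whatever actually puts the tooling into the target directory.

```python
import shutil
from pathlib import Path
from typing import Callable


def clean_install(install_dir: Path, install: Callable[[Path], None]) -> None:
    """Remove any previous installation, then install from scratch."""
    # Deleting first means a half-finished earlier attempt cannot leak into
    # this one; ignore_errors covers the case where nothing is installed yet.
    shutil.rmtree(install_dir, ignore_errors=True)
    install_dir.mkdir(parents=True)
    install(install_dir)


def install_with_retry(
    install_dir: Path,
    install: Callable[[Path], None],
    attempts: int = 3,
) -> bool:
    """Run clean_install up to `attempts` times; return True on success."""
    for _ in range(attempts):
        try:
            clean_install(install_dir, install)
            return True
        except Exception:
            # Every attempt starts from an empty directory, so retrying is safe.
            continue
    return False
```

Because every attempt starts from an empty directory, a failed run never has to be repaired in place. It is simply run again.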
Dependencies that aren’t distributed with the software need to be validated to ensure they’re available and updated to the correct versions. Most language-specific package managers can do this for library dependencies if you pass them a list of all dependencies and valid versions. Custom code is most likely needed to install, or at least validate, dependencies that aren’t installable through the package manager.
For our library dependencies, we use a requirements.txt file to list and pin them to specific versions. For Python, there are also more advanced options available, like Poetry, which makes it easier to manage indirect dependencies. Our other dependencies are Docker, Docker Compose, and of course the Python interpreter. For those, we implemented custom checks in the installer to validate that the right version is available.
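A custom version check of this kind can be quite small. The Python sketch below is an illustration rather than our installer's code: it runs a tool with `--version`, extracts the first version number from the output, and compares it with a minimum. The helper names and any minimum versions passed to them are assumptions made for the example.

```python
import re
import subprocess
from typing import List, Optional, Tuple

Version = Tuple[int, int, int]


def parse_version(output: str) -> Optional[Version]:
    """Extract the first MAJOR.MINOR[.PATCH] number found in `output`."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        return None
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def meets_minimum(output: str, minimum: Version) -> bool:
    """Return True if the version found in `output` is at least `minimum`."""
    found = parse_version(output)
    return found is not None and found >= minimum


def check_command(command: List[str], minimum: Version) -> bool:
    """Run e.g. ["docker", "--version"] and validate the reported version."""
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        # The tool is not installed or not on PATH.
        return False
    return meets_minimum(result.stdout + result.stderr, minimum)
```

For example, `check_command(["docker", "--version"], (20, 10, 0))` would accept an output such as `Docker version 20.10.7, build f0df350`. The minimum shown here is a made-up value.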
Continuous deployment seems obvious in the context of modern software development, but there’s no commonly used installer known to us that offers this feature. An easy way to get continuous deployment is with frequent automatic upgrades.
In our case, the installer can run in the background as a monitor that repeatedly checks if a new upgrade is available and installs it. To get the monitor running, we spawn the installer as a daemon process and add it to autostart with the go-autostart module.
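The core of such a monitor is a simple loop. The Python sketch below illustrates the idea and is not our Go implementation: it leaves out daemonizing and autostart, and the callbacks `installed`, `latest`, and `install` are hypothetical stand-ins for reading the local version, querying the newest release, and running the installer.

```python
import time
from typing import Callable, Optional


def upgrade_if_needed(
    installed: Callable[[], Optional[str]],
    latest: Callable[[], Optional[str]],
    install: Callable[[str], None],
) -> bool:
    """Install the latest release if it differs from the installed one."""
    current = installed()
    newest = latest()
    # Simplification: any differing version string counts as an upgrade.
    if newest is None or newest == current:
        return False
    install(newest)
    return True


def run_monitor(
    installed: Callable[[], Optional[str]],
    latest: Callable[[], Optional[str]],
    install: Callable[[str], None],
    interval_seconds: float = 3600.0,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Check for upgrades repeatedly; return how many were installed."""
    upgrades = 0
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        try:
            if upgrade_if_needed(installed, latest, install):
                upgrades += 1
        except Exception:
            # A failed check or install is simply retried on the next cycle.
            pass
        cycle += 1
        sleep(interval_seconds)
    return upgrades
```

With `max_cycles=None` the loop runs until the process is stopped, which is the behavior a background monitor would want. The `max_cycles` and `sleep` parameters exist mainly to make the loop testable.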
Using what we learned, we came up with a new approach that performs the following high-level steps:
Our tooling distribution is much more successful with this new installation flow. Most developers were previously using versions that were more than a month old, some even several months old. Now the vast majority of developers use a version that’s not more than a couple of days old.
The rate at which our team is contacted for support also went down considerably. We used to receive weekly support cases from developers about broken installs, and nowadays those are very rare. Overall, the investment to improve the tooling distribution process was well worth it given the reduction in support efforts for our team.
The code for our installer isn’t currently publicly available. If you’re in a similar situation where you think you could make use of it, please reach out to us via den@getyourguide.com and we can discuss the possibility of open sourcing our code.
If you’re interested in joining our Engineering team, check out our open roles.