As a small team we focus on automating as much as possible and rely heavily on Ansible for our deployments, but we lacked a proper testing infrastructure satisfying our needs (the ability to run services with systemd, as well as more complex container and network setups). In a previous article we deployed our Ansible role testing infrastructure, but we did not explain in detail how we set up our continuous integration (CI). Let's rewind a little and look at what we're doing with it and why we chose to manage our own workers.
My team is very small, so we try to automate as much as possible. As sysadmins we use Ansible and follow the "Infrastructure as Code" principle. Over time we have built many Ansible roles, and we want to be sure we spot problems when we make changes, but also when new versions of the tooling are released. We've tried to make this as easy to use as possible.
In this article we’ll explore the different pieces of this system and explain some of the steps and difficulties we encountered.
My team, as sysadmins, builds solutions for our open source communities. This includes various kinds of tasks but regularly we are asked to help do more “simple” things like website deployment.
In fact the web can be deceptively complex, and as we're not web designers we try to find ways to simplify our work. Many of the projects we work with use Jekyll for their websites, so we needed something that could be set up quickly without deep Jekyll or Ruby knowledge.
We’ll be exploring the latest work we did on this topic.
At OSCI we're looking at the container world to help us host services for our communities in an easier way. We've been using VMs via libvirt for most of our workload, and that works well, but there are specific features embedded in the container workflow that are really interesting. I especially like that updates are made from scratch, with no leftovers from a previous deployment, and also the readiness/liveness checks, which are more proactive than traditional monitoring.
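As an illustration, readiness and liveness checks in Kubernetes/OpenShift are declared directly on the pod, so the platform itself restarts an unhealthy container and withholds traffic until the application is ready. This is a minimal sketch, not one of our actual manifests; the pod name, image, port, and endpoint paths are hypothetical placeholders:

```yaml
# Hypothetical pod definition; names, image and paths are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: example-web
spec:
  containers:
    - name: web
      image: example.org/web:latest   # placeholder image
      ports:
        - containerPort: 8080
      # livenessProbe: the kubelet restarts the container if this fails
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 15
      # readinessProbe: traffic is only routed once this succeeds
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
```

Compared to traditional monitoring, which typically alerts a human after the fact, these probes let the platform react on its own.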
Containers introduce new ways to run applications, and the current trend is to run one process per container. Unfortunately that's simply not possible for most workloads, because existing software is not architected to work that way. Even running a single binary often results in forks to drop capabilities, or multiple forks to spawn workers.
Moreover, not everyone agrees with this model, some seeing a service manager running as PID 1 as now part of the UNIX API. It's true that systemd nowadays does more than just reap zombie processes. The way a service needs to be spawned is clearly defined in service files, and there's no need to reinvent the wheel.
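For reference, a systemd service file already encodes how a process should be started, supervised, and sandboxed, which is much of what a container runtime would otherwise have to reimplement. This is a generic sketch under assumed names, not one of our actual units; the service and binary names are made up:

```ini
# Hypothetical unit file; the service name, binary and paths are illustrative.
[Unit]
Description=Example web application
After=network-online.target
Wants=network-online.target

[Service]
Type=notify                 # the service signals systemd when it is ready
ExecStart=/usr/bin/example-app --config /etc/example-app.conf
Restart=on-failure          # supervision without extra tooling
DynamicUser=yes             # run as a transient unprivileged user
ProtectSystem=strict        # sandboxing the container would otherwise provide

[Install]
WantedBy=multi-user.target
```

Readiness (`Type=notify`), restarts, privilege dropping, and filesystem sandboxing are all declared here rather than reimplemented per application.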
Anyway, Jon Trossbach, our former intern, worked on the containerization of Postfix and concluded that a major overhaul of the software's design would be needed to adapt to this new model. Even if you cannot enjoy all the benefits of the container model, you may still wish to use its workflow and some of its features; that's why I've been experimenting with our OpenShift Dedicated account to make this use case functional.
My team in Red Hat’s Open Source Program Office provides hosting and sysadmin care to various open source communities, mainly using CentOS Linux, so when the CentOS Project announced that CentOS Linux would be discontinued in favor of CentOS Stream, the move was a real concern for us.
We had many uncertainties after the announcement, which was a bit abrupt. Rich Bowen, the CentOS Community Manager at the time, clarified the positions of Red Hat and of the CentOS Project regarding the changes. For one thing, we learned that new, “CentOS Classic-style” community distributions were expected (and welcomed), so we could wait for one of those to emerge if CentOS Stream didn’t meet our needs.
We had to decide whether to remain with CentOS, by shifting to CentOS Stream, to choose a new, forked distribution, or to switch to Red Hat Enterprise Linux.
We’ve used CentOS Linux for our services because it’s a slower-moving distribution than something like Fedora Linux. Since CentOS Stream is the upstream for the next minor release of Red Hat Enterprise Linux, we had reason to expect the changes to remain slow-moving. In addition, the fact that CentOS is now positioned upstream of RHEL gives us the opportunity to contribute bug fixes or feature requests to the project.
After more research and discussion, it became clear that we would not know for sure whether CentOS Stream would work for us without testing. We started experimenting first with test VMs, then new projects, followed by migration of non-critical machines, to prove out whether we should make CentOS Stream 8 our default.
Here on the Open Source Community Infrastructure (OSCI) team at Red Hat, we run most of our workloads on CentOS virtual machines, but we’ve been aiming to containerize some of these workloads and run them on Kubernetes/OpenShift. We reason that if we have more of our applications running on OpenShift, then that should translate to more efficient use of both our hardware resources and our people resources, leaving us able to support more upstream communities.
As part of this effort I have been investigating how the Postfix MTA might be properly containerized and what sort of changes would best suit Postfix if it were to be refactored into a Kubernetes-native application. If a FLOSS MTA like this were driven to completion, it would have the potential to streamline the management of email services for many infrastructure teams, including ours.