Kubernetes
Terraform
Go
Ansible
Python
Developer Experience
REST API
GitHub Actions
Hetzner Cloud is a prominent European cloud provider offering virtual machines and other cloud infrastructure. As a member of the Integrations team, I am responsible for maintaining and enhancing our suite of Open Source Integrations, including the Go SDK, Python SDK, CLI, Terraform Provider, Ansible Collection, Kubernetes Cloud Controller Manager, and csi-driver.
Collaborating with other teams at Hetzner Cloud, I contribute to the development and integration of new features into our Open Source Integrations. This includes giving my co-workers early feedback on new APIs to make sure that they are consistent and easy to understand for our customers. In addition to supporting new features, I address customer issues, providing timely resolutions and implementing fixes and enhancements. I also work with our community to review their contributions and get them merged.
Overall, my role at Hetzner Cloud allows me to combine my passion for Open Source technologies with my expertise in cloud infrastructure. By continuously enhancing our Open Source Integrations, I help customers make full use of Hetzner Cloud’s infrastructure.
- Driving initiatives to keep our API easy to understand
Kubernetes
Cluster API
OpenStack
GitLab CI
GitOps
Prometheus & Alertmanager
Go
kubebuilder
Ansible
teuto.net provides public and private clouds based on OpenStack, managed Kubernetes on top of these clouds, and additional services like consulting or training. Within teuto.net, I worked on the Kubernetes team, where I was responsible for the next iteration of our managed Kubernetes offering.
At the beginning of 2021, teuto.net decided to replace the existing custom tooling for managing Kubernetes clusters with Cluster API and Cluster API Provider OpenStack. I authored a proposal for the new implementation and implemented it over the following months. This included GitLab CI pipelines for building VM images, templates for clusters, an Ansible playbook to bootstrap new Cluster API management clusters, extensive internal documentation and runbooks, Prometheus alerting, an E2E test pipeline, and a custom Kubernetes operator to tie it all together into one coherent API.
As I have always liked to work upstream, and a stable foundation is necessary for this offering, I started contributing to Cluster API Provider OpenStack and became a reviewer for the project, participating in the office hours and helping out wherever else it was needed.
When not working on new features, I packaged newly released Kubernetes versions for our platform and rolled out the changes to our managed clusters while communicating with our customers about upcoming changes and maintenance windows.
For 3 months in 2021, I led an initiative to reduce alert fatigue for our on-call engineers by analyzing the pains they were experiencing, applying a consistent labeling strategy to the alerts, and filtering unwanted alerts in Alertmanager and Prometheus.
- Tech Lead for managed Kubernetes offering
- Upstream contributions to Cluster API Provider OpenStack
AWS
Node.js
GitLab CI
Terraform
Nest.js
TypeScript
narando provides a crowd-working platform where texts such as magazine articles and blog posts are recorded by professional narrators and subsequently published to websites and podcast platforms such as Spotify. It is a small company, so I wore many hats, often simultaneously, in the 5 years I worked there.
As the Lead Developer, I was responsible for the 3-person dev team. In that role, I planned the implementation of new features and oversaw our bi-weekly planning sessions. I introduced modern development practices such as merge requests and mentored the other engineers to help them excel at their tasks.
In my role as a Backend Engineer, I planned and implemented a new version of the production and publishing platforms that power narando, as the previous proof of concept was hitting its limits. This new platform was built using multiple Node.js services for different use-cases such as core production, file handling, post-production, and publishing feeds.
As the Cloud Infrastructure Engineer, I planned and implemented our infrastructure in AWS. Most of our services were deployed as containers to ECS and were backed by RDS MySQL. I also built an event queue for our services on top of SNS+SQS. Everything was configured with Terraform, and by using common templates, new services could be launched in less than a day.
To improve our development velocity and reduce toil, I built automated pipelines for our services to run tests and linting for each commit and merge request and to automatically release and deploy the services once changes were merged. During the first year we used Jenkins, but then switched to GitLab CI because Jenkins needed too much maintenance and we liked the GitLab CI syntax better.
- Built out the infrastructure on AWS using mostly container technology (ECS)
- Planned and implemented a Node.js microservice architecture
- Introduced code reviews to improve the code quality and learn from each other
Kubernetes
Backend
CI/CD
Node.js
Jenkins
TrackCode is a Transport Management System for last-mile delivery logistics companies, importing shipment data directly from the customers' systems and exporting scan events and signatures back to the logistics networks.
When I joined TrackCode, I initially focused on the Node.js backend software, which talked to the Web and Android Apps through a JSON API and persisted data in MySQL and MongoDB.
Later on, my role at TrackCode shifted and I started working more on infrastructure and DevOps topics. I planned and implemented a new development platform for our growing team on top of Kubernetes with Rancher. To sustainably learn from past issues and avoid being woken up by alerts, I established a postmortem culture, which helped us fully explore incidents and solve the root causes.
To improve our confidence and velocity during development and deployment, I implemented the full CI/CD lifecycle for all of our services, using Jenkins, unit tests, and finally an automated release to Kubernetes.
To help us debug incidents and investigate customer support tickets, we used Splunk, but this proved to be too expensive in the long run. I developed our alternative monitoring and observability stack using Graylog, Grafana, and Prometheus. Using these, I also established SLOs for one of our core metrics, “time to export events”.
- Planned and implemented a new development platform on top of Kubernetes
- Established a structured process to analyze incidents (postmortems)
- Implemented a full CI/CD lifecycle for all services (Jenkins, Tests, Release to Kubernetes)
Ruby on Rails
Android