Welcome! I’m Zsolt from the internal platform team at Bitrise. In this post, I’ll present our team’s mission and try to paint a broad picture of the daily challenges we face and the tools we’re using to achieve our goals. Let’s start at the beginning, with why our team has come together.
What we want
As a rapidly growing company, we have some pretty ambitious goals ahead of us. To reach them, we need to bring out our best and provide developer teams with everything they need to perform at a high level.
One key contributor to that is the ability to iterate quickly and scale confidently, without getting bogged down in the internals of the platform the team is developing for. Developers just want to push changes as fast as possible, confirm that what they’re building works brilliantly and performs well, and be able to scale it arbitrarily as (hopefully) adoption drives higher traffic. All this without needing to work out the details of observability, security, load balancing, traffic management, and resource optimization.
What we need
After assessing our technological and business requirements, we’ve found that Kubernetes is a highly flexible platform that can support all our use cases with excellent scaling characteristics, and with it we can do all the magic above without asking too much from our ever-so-busy developers.
So, we’ve formed a virtual team of volunteers and put together a prototype: a GitOps-based CD flow to deploy to our Istio service mesh running on GKE (and other GCP resources like Cloud SQL), using tools like Terragrunt, Helm, and Berglas, with Datadog for observability. OK, that’s a lot of nouns in a single sentence :) Don’t worry, we’ll break them down in the following posts. But the point is, it was a successful kickoff and it gained traction: some of our production services have been developed on top of it. After successfully validating the idea, it was time to step up the game.
In the last OKR session there was already buy-in from upper management to focus on scalability and productivity, and the adoption of the platform was a great candidate. And so, with the help of our new funding and headcount, we’ve formed a real team that is committed to bringing services to the platform and to making adoption as smooth as possible by building tools that help developers manage their apps.
But that’s just the first step. Our ultimate goal is to reach a point where provisioning all the resources needed to run a service (infra components, secrets, configuration, even documentation) is fully automated, so developers can create everything they need with a couple of clicks and not worry about the intricacies of the platform they’re running on.
To achieve that, we will build out an internal developer portal where developers can set up, manage, and observe their systems through a UI. Actually, it’s not just for developers: everyone should be able to find all the necessary information about our systems here, including documentation, important links, and even system architecture diagrams. Our dream is to make this portal the central control panel of Bitrise.
Meet the Internal Platform team
And so we’re here: four engineers, under the guidance of an engineering manager and a product manager, working hard to serve the needs of our customers, the developers of Bitrise. The job is not easy: we have to build a platform that conforms to all our company standards (like SOC 2) and is flexible enough to cover all our use cases, while still being so easy to use that developers feel an actual improvement after adopting it.
The team consists of people with diverse backgrounds, which really helps us tackle the inherently complex nature of the problem we’re aiming to solve. It’s a fine mix of disciplines like distributed system design, infrastructure management, automation tooling, and application software development, with security sprinkled on top. What we do is basically what the industry calls (incorrectly) DevOps, and SRE is another fancy name that might come to mind. But really, we need to be developers ourselves to understand developers’ needs and to be able to build robust tools, and we also need deep knowledge of the platform and tools we use (Kubernetes, Helm, Istio, Terraform).
What we do
Having just formed, we jumped right into planning our roadmap for the next period and figuring out how we can achieve the most impact. First, we’re focusing on creating a scalable project structure: a frame in which systems and teams are well isolated, which helps keep resources clean and tidy and ensures the best security possible.
Once that long-term, scalable project structure is in place, we’ll focus on the most important resource management flows of services, building software tools that automate them so developers get a smooth experience when working with our platform.
We’ll cover these flows one by one, creating the internal developer portal’s building blocks. Luckily, we don’t have to build everything from scratch, as the pioneers at Spotify have already laid the groundwork for such a portal: they have created Backstage, an open source portal for exactly this purpose.
Backstage is actually a pretty cool app with quite a rich feature set and a ton of integrations. Honestly, we are so fortunate to have it; we wouldn’t be able to get where we want to go without it.
Backstage is written in TypeScript, using React and Node.js, and has quite a versatile plugin system, so we’ve started building our own plugins for our use cases. These are not just fancy UI applications: we also have to build out the infrastructure beneath, run automations, and figure out how all the components can be glued together to cover resource management, CI, CD, and observability.
And of course, we are also constantly improving the underlying Kubernetes-based platform, managing projects and resources on GCP using Terraform / Terragrunt, and helping out the brave teams who have already taken the step to migrate. It’s a fairly complex job we have, but that makes sure we never get bored and we learn something every day, which is one of the top reasons why I personally love working here.