The 3 phases of Prezi’s development environment evolution
Published on Oct 13, 2020
Unlike production, development environments usually evolve ad-hoc. Teams are so busy deploying new features that they don’t have time to invest in “sideways work” that doesn’t result in immediate value for customers.
I’ve noticed that one tactic teams use to prioritize developer productivity is to let some inefficiencies build up, so that they can understand the underlying causes and invest in fixing them.
In this post, I chatted with Laszlo from Prezi to hear how they’ve increased developer happiness and feature velocity by iterating on their development workflow.
How is the Prezi application architected?
We have around one hundred microservices: most are Python, and some are Scala or Go. They are built on a unified pattern that defines:
- How to structure code (e.g. what kind of frameworks we use, and a set of internal libraries)
- The management details. For example, how to build the services, run unit, integration and end-to-end tests, generate reports, collect build metrics, handle static assets, etc.
- Extension points for special use cases.
Each service has a YAML descriptor file at the root of its repository that:
- Provides information about the service
- Tells us which features it needs
- And how it wants to use them
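To make this concrete, here is a sketch of what such a descriptor might look like. The field names and values are illustrative assumptions, not Prezi’s actual schema:

```yaml
# Hypothetical service descriptor -- the fields below are illustrative,
# not Prezi's actual schema.
name: presentation-api          # information about the service
language: python
dependencies:                   # other services this one needs
  - user-service
  - asset-store
features:                       # which features it needs, and how
  postgres:
    version: "12"
  redis: {}
build:
  static_assets: true
```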
How have Prezi development environments evolved over time?
We have a long history of development environments in Prezi. There were three main iterations.
Phase One: Custom Hooks
Originally we had a tool that set up everything on the developer’s machine by providing hooks for services that told the tool how to (among other things):
- Install dependencies
- Test the service
- And run the service
The tool also provided a layer on top of them to orchestrate communication between the services.
The problem with this setup was that each developer’s machine was different, and these hook scripts quite often failed – for example, Python libraries depending on different versions of global binaries or header files would fail to compile. We sometimes spent a long time fixing these individual problems only to run into another problem with another service – and the cycle would repeat a few months later when you wanted to work on that service again.
Phase Two: Docker Compose
Next, we improved this tool to use containerized versions of dependencies, orchestrated with Docker Compose. This was more stable, so developers ended up running ever more complex setups and larger sets of services. Under these workloads, laptop fans spun up to annoying levels. Plus, the services being actively developed still had to be run from the host machine rather than from within Docker Compose.
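A phase-two setup of this shape might look roughly like the following sketch (service names, images, and ports are hypothetical): dependencies run as containers with published ports, while the service under active development runs on the host and connects to them.

```yaml
# Illustrative docker-compose.yml for phase two -- names, images, and
# ports are assumptions for illustration.
version: "3.8"
services:
  postgres:
    image: postgres:12
    ports:
      - "5432:5432"          # published so the host-run service can reach it
  user-service:
    image: registry.internal/user-service:latest
    ports:
      - "8001:8000"
# The service being actively developed runs on the host, e.g.:
#   DATABASE_URL=postgres://localhost:5432/app python manage.py runserver
```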
It was at this point when the Developer Experience (DX) Team was formed. We talked a lot with teams about their workflows, needs, and pains with the current development ecosystem.
Phase Three: Remote environment on Kubernetes
We learned from developers that they wanted a more stable, robust, and scalable solution. Also, the DX Team wanted more control over the environment to provide better support and maintainability.
We decided to build a remote development environment on top of Kubernetes. The developer’s code and dependencies would all run in the cluster so that we could manage it. Only a light CLI, and some libraries, would run on the developer’s laptop.
After extensive research and prototyping with existing tools and open source libraries, we gradually started releasing it to one team after another.
How does local development work now?
The development environment has a small command-line application running on the developer’s machine. It is a thin wrapper around open source libraries and installs the same versions of these to every machine. Each release of the tool is a working ecosystem. When you upgrade, it will install the newer versions of all dependencies if necessary.
Developers have their own sandbox - Kubernetes namespace - and can run services in two different modes:
- Dependency mode: In this scenario, the tool is told that a given set of services are needed (and based on the descriptors mentioned above it also figures out transitive dependencies). It then simply deploys - with Helm and Helmfile - pre-built and pre-optimized containers of the dependencies. (These images are automatically assembled on each master build as part of our unified pipeline)
- Development mode: This is for the services that are being actively developed. We use Skaffold to build the images and sync changes so developers quickly see the result of their modifications and get feedback.
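The transitive-dependency step in dependency mode can be sketched as a simple graph traversal over the descriptors. This is a minimal illustration, assuming descriptors expose a "dependencies" list; it is not Prezi’s actual implementation:

```python
# Sketch of transitive-dependency resolution from service descriptors.
# The descriptor shape (a "dependencies" list per service) is an
# assumption for illustration, not Prezi's actual schema.

def resolve(requested, descriptors):
    """Return the requested services plus all transitive dependencies."""
    seen = set()
    stack = list(requested)
    while stack:
        svc = stack.pop()
        if svc in seen:
            continue
        seen.add(svc)
        stack.extend(descriptors.get(svc, {}).get("dependencies", []))
    return seen

descriptors = {
    "web": {"dependencies": ["auth", "assets"]},
    "auth": {"dependencies": ["db"]},
    "assets": {"dependencies": []},
    "db": {"dependencies": []},
}
print(sorted(resolve(["web"], descriptors)))
# -> ['assets', 'auth', 'db', 'web']
```

The resolved set is what then gets handed to Helm and Helmfile for deployment of the pre-built images.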
The CLI pretty much only sets environment variables and generates files that are fed to the tools that run the services for you in the development cluster.
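A thin wrapper of that kind might look like the sketch below: it only sets environment variables and writes a generated values file, then hands off to the underlying tools. Paths, keys, and the handoff command are illustrative assumptions:

```python
# Minimal sketch of a CLI step that only prepares inputs for other tools.
# File names, keys, and the handoff command are illustrative assumptions.
import os
import json

def prepare_session(namespace, services, workdir="."):
    """Set env vars and generate a values file for the wrapped tools."""
    os.environ["DEV_NAMESPACE"] = namespace  # read by the wrapped tools
    values_path = os.path.join(workdir, "values.generated.json")
    with open(values_path, "w") as f:
        json.dump({"namespace": namespace, "services": services}, f, indent=2)
    return values_path

values = prepare_session("alice-sandbox", ["web", "auth"], workdir="/tmp")
# The real CLI would then invoke the underlying tools with this file, e.g.:
# subprocess.run(["helmfile", "--state-values-file", values, "apply"])
```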
Besides the application services, there are of course many other containers - platform dependencies - that are also managed and deployed by the tool.
Data is persisted between development sessions and support components can also be deployed separately to make it easier to, for example, work with data or correlate logs.
In your opinion, what are the must-have attributes in a dev env?
I think the development environment should reflect the production environment as closely as possible given your use cases and constraints. It greatly lessens the mental burden and makes it easier to coordinate between teams.
It should be easy and straightforward to get set up to the point where you can start getting things done with short feedback loops to enable fast iteration. However, it’s explicitly not our goal to hide the complexity and the tools running under the hood. The development environment should not feel like “magic”. We expect developers to debug issues themselves, and have a basic understanding of components like Kubernetes so that they can directly check on the pods.
We did not want to add proxy commands that simply call kubectl describe pod or kubectl logs. If you need to learn that, you might as well learn how to do it directly with kubectl. Knowledge of and familiarity with these open source tools is something developers greatly benefit from and make use of in staging and production.
There are of course internal tools and rules and constraints everywhere, but we firmly aim to minimize them. Extensive documentation, knowledge shares, and workshops are part of providing the environment.
It is imperative to instrument the environment and collect metrics at each sensible point. For example, we collect which commands are used, which services are brought up in what modes, and the dependency graph of services. Deployment time - per service and groups of services - is one of our most important metrics. Since the environment is disposable - and will actually auto clean up after a certain period of inactivity to save resources - it is very important to be able to quickly set things up. It matters a lot whether you need to wait 5 or 15 minutes just to get started. There is always room for improvement here.
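Instrumenting command usage can be as simple as wrapping each CLI command with a timing decorator. This is a hedged sketch with a list standing in for the metrics backend; a real setup would ship these events to a metrics pipeline:

```python
# Sketch of CLI instrumentation: record which command ran and how long it
# took. The "events" list is a stand-in for a real metrics backend.
import time
import functools

events = []  # stand-in for a metrics backend

def instrumented(command_name):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                return fn(*args, **kwargs)
            finally:
                events.append({
                    "command": command_name,
                    "duration_s": time.monotonic() - start,
                })
        return wrapper
    return decorator

@instrumented("up")
def up(services):
    # hypothetical command that deploys the requested services
    return f"deploying {len(services)} services"

up(["web", "auth"])
```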
How has your custom-built development environment improved your workflow?
I can think of at least two separate aspects of the development process that have been greatly improved by this change.
- The developer’s daily workflow is better:
- It takes a lot less time and effort to get to a working environment. Before, it took days to set up an environment that would break from time to time. Now, it takes minutes.
- Each developer’s environment is disposable and encapsulated.
- A unique URL can be shared with others to show progress and open discussions.
- Things are easier for the Developer Experience Team:
- Everyone has the same set of tools with the exact same version installed which makes it a lot easier for us to provide support.
- The development cluster, docker build hosts, registries, etc. are all centralized. We only need to upgrade once and everyone is using the latest and greatest. It is a lot more maintainable for us this way.
- It’s easier to provide up-to-date, well-written documentation for the developers because we can piggyback off the existing documentation for the open source tools we built on top of. This results in a way more efficient onboarding process.
Kelda and Prezi
Kelda and Prezi have collaborated for a long time. We first met when we were building the predecessor to Blimp, which moves your Docker Compose development environment into the cloud. Prezi had already built the equivalent of Blimp internally, while we were trying to make a general solution, so we’ve been trading ideas ever since.
Get started with Blimp to get the benefits of Prezi’s developer experience tool without having to build it yourself!
See Blimp commands and usage in the Docs