Infrastructure Platform Teams

Most Infrastructure as Code management tools today are designed for workflows where infrastructure engineers write low-level infrastructure code, and are also responsible for deploying that code to create and update environments. Infrastructure users, like application teams who need environments to develop, test, and run applications, have two choices. They can write and manage low-level infrastructure code for themselves, or have someone else do it for them.

Team Topologies diagram showing a single full stack team owning software delivery and infrastructure development

Application teams coding their own low-level infrastructure can work OK for smaller systems, where there is one team that does everything. But it doesn’t scale to larger systems where multiple teams build, deploy, and test applications on the same infrastructure.

So you end up with an “infrastructure management team”, where a separate team builds and runs environments for the application teams. An infrastructure management team is not a platform team (although it’s often called exactly that) because it doesn’t provide environment management as a service. Instead, it handles most aspects of managing environments on behalf of its users. Need the database to be resized? Ask the infrastructure management team. Need to add a message queue? Ask the infrastructure management team. Need to develop and deploy a new service? Ask the infrastructure management team. Need a new environment for testing? Ask the infrastructure management team.

Team Topologies diagram showing a separate software delivery team and infrastructure management team

This model, although common, is a drag for everyone involved. Infrastructure engineers spend their time handling routine tasks for application teams, and arguing with them about whose fault problems are when the software doesn’t deploy or work correctly. Developers spend their time waiting for infrastructure teams to make minor changes, and arguing with them about whose fault problems are when the software doesn’t deploy or work correctly. Leaders wonder why it takes so long to get simple features and improvements delivered to customers.

Team Topologies diagram combined with a value stream map showing many handovers needed with a separate infrastructure management team

So the two most common ways of organizing teams and workflows for infrastructure are to have the development team do everything for themselves, which doesn’t scale, or have a team manage infrastructure on their behalf, which is inefficient. What’s the alternative?

An infrastructure platform team provides environments as a service, or as components that application teams can deploy for themselves. Application teams can resize their application’s database without asking someone to do it for them. Application teams can add a message queue to their system without asking someone to do it for them. Application teams can develop and deploy new services without asking someone to add infrastructure for them. Application teams can spin up a new environment for testing without asking someone to do it for them.

Team Topologies diagram showing an infrastructure platform team and several software delivery teams

Making an infrastructure platform requires a few things that aren’t all that common. Infrastructure engineers need to develop deployable infrastructure components for application teams to use. Application teams need to be able to select, configure, and integrate infrastructure that is presented at a level that is meaningful to them. Most IaC tools present a wrapper over fine-grained IaaS APIs, which is not generally meaningful at the application level.
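To make the gap between those two levels concrete, here is a minimal Python sketch. It is not taken from any real tool: `QueueRequest`, `expand_queue`, and the resource types and naming scheme are all hypothetical, chosen only to show one application-level request fanning out into the fine-grained resources a platform team might bundle behind it.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class QueueRequest:
    """What an application team asks for, in terms meaningful to them.

    Hypothetical type for illustration only.
    """
    service: str
    environment: str
    max_message_kb: int = 64


def expand_queue(request: QueueRequest) -> list[dict]:
    """Expand one application-level request into low-level resources.

    The resource types and names are assumptions; a real platform team
    would decide what a "queue" component actually contains.
    """
    name = f"{request.service}-{request.environment}"
    return [
        {"type": "queue", "name": f"{name}-events",
         "max_message_kb": request.max_message_kb},
        {"type": "dead_letter_queue", "name": f"{name}-events-dlq"},
        {"type": "access_policy", "name": f"{name}-events-rw",
         "grants": [request.service]},
        {"type": "alarm", "name": f"{name}-events-backlog"},
    ]


# The application team states what it needs; the component supplies the rest.
resources = expand_queue(QueueRequest(service="checkout", environment="test"))
for resource in resources:
    print(resource["type"], resource["name"])
```

The application team only supplies the service name and environment. Everything below that level is decided once by whoever builds the component, rather than rewritten by each team that needs a queue.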

Note that, although I describe this model as an “infrastructure platform team”, these teams may provide components that other teams can deploy. Larger organizations will find it useful to split the responsibility for infrastructure across multiple platform teams. It’s best to organize these teams around domains, which in many cases means these teams are not purely focused on infrastructure. An “availability platform team”, for example, may provide monitoring tools, recovery scripts, and infrastructure components that application teams can use to ensure the reliability of their services.

Tools and platforms need to support working this way. I described several categories of tools in my interesting infrastructure tools post, including infrastructure composition and deployment tools.

Now that I’m wrapping up the third edition of the Infrastructure as Code book, I’m turning my attention to the question of how to improve infrastructure workflows. I believe the focus should be on the needs of the teams and organizations that use infrastructure, rather than optimizing siloed workflows for building snowflake infrastructure as code.

I’d like to learn about how effective organizations find their use of Infrastructure as Code practices and tools. Please consider filling in my infrastructure effectiveness survey. I’ll use what I learn for future content, including posts and talks.

These diagrams use the Team Topologies pattern language for showing and describing team interactions. See the Team Topologies team shape templates.
