
DevOps Research and Assessment (DORA) is an organization that researches the benefits organizations gain from DevOps principles and practices like the ones discussed here.
From this research, DORA has identified four metrics on which high performers score well in delivery and operational performance, and it suggests using these metrics as goals for organizations that aspire to improve. These metrics are:
Lead time for changes
The amount of time it takes a commit to get into production; that is, how long it takes to go through the inner and outer loops. This includes both the development time (how long it takes to code the feature) and the wait time in various stages of the CI/CD pipeline.
Deployment frequency
How often an organization successfully releases to production. This could be multiple deployments per day for high-performing organizations.
Change failure rate
The percentage of deployments causing a failure in production. This could be defined as a service impairment or service outage. The key is to track when a failure requires remediation (like a hotfix, rollback, fix forward, or patch).
Time to restore service
How long it takes an organization to recover from a failure in production. This could be the time it takes to correct a defect or to restore service during a catastrophic event (like a site outage).
These four metrics provide a balanced view of not only how quickly and frequently an organization delivers software but also how effectively it responds to problems in production. They are based on years of research and have been shown to be predictive of both software delivery performance and organizational performance. The type of automation provided by the factory is the enabler of these metrics. The message is that having automation in place does not by itself mean an organization will perform well, but organizations that do perform well are likely to have a factory in place.
For more information on DORA metrics, you can refer to the book Accelerate: The Science of Lean Software and DevOps (IT Revolution Press) by Nicole Forsgren, Jez Humble, and Gene Kim, which discusses these topics in detail.
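To make these definitions concrete, here is a minimal sketch, in Python, of how the four metrics could be computed from a record of production deployments. The Deployment record, its fields, and the dora_metrics function are hypothetical; in practice the data would come from your version control system, CI/CD pipeline, and incident tracking.

from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional

@dataclass
class Deployment:
    commit_time: datetime                      # when the change was committed
    deploy_time: datetime                      # when the change reached production
    caused_failure: bool = False               # did this deployment require remediation?
    restored_time: Optional[datetime] = None   # when service was restored, if it failed

def dora_metrics(deployments: list[Deployment], window_days: int) -> dict:
    """Compute the four DORA metrics over a window of production deployments."""
    lead_times = [d.deploy_time - d.commit_time for d in deployments]
    failures = [d for d in deployments if d.caused_failure]
    restore_times = [d.restored_time - d.deploy_time
                     for d in failures if d.restored_time is not None]
    return {
        "lead_time_for_changes": median(lead_times),
        "deployments_per_day": len(deployments) / window_days,
        "change_failure_rate": len(failures) / len(deployments),
        "time_to_restore_service": median(restore_times) if restore_times else None,
    }

# Example: two deployments in a 30-day window, one of which needed remediation.
now = datetime(2024, 1, 31)
history = [
    Deployment(now - timedelta(days=3, hours=6), now - timedelta(days=3)),
    Deployment(now - timedelta(days=1, hours=2), now - timedelta(days=1),
               caused_failure=True, restored_time=now - timedelta(hours=23)),
]
print(dora_metrics(history, window_days=30))

In practice you would likely compute these over a rolling window and watch how they trend over time, rather than looking at a single snapshot.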
Canary Releases
When deploying to production, you may want to do a canary release, where you deploy the new version of the application alongside the old version and gradually shift traffic to the new version. This allows you to test the new version in production and roll back if there are any issues.
For example, you could choose to deploy the new version of a service to only 5% of users initially and then monitor the error rate and performance. If everything is working as expected, you can gradually increase the percentage of users until all users are using the new version. In Chapter 13 on observability and monitoring, you will look at how you can monitor the performance of your application and use this to make decisions about when to roll out new versions.
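As a rough sketch of the control loop behind a canary release, the following Python shifts traffic to the new version in steps, watches the error rate at each step, and rolls back if it exceeds a threshold. The set_traffic_split, error_rate, and rollback functions are placeholders for whatever your load balancer or service mesh and your monitoring stack actually provide.

import random
import time

# Placeholder hooks: in a real system these would call your load balancer or
# service mesh and query your monitoring stack (see Chapter 13).
def set_traffic_split(canary_percent: int) -> None:
    print(f"Routing {canary_percent}% of traffic to the new version")

def error_rate(version: str) -> float:
    return random.uniform(0.0, 0.02)   # stand-in for a real metrics query

def rollback() -> None:
    print("Error threshold exceeded; routing all traffic back to the old version")

CANARY_STEPS = [5, 25, 50, 100]   # percentage of users on the new version
ERROR_THRESHOLD = 0.01            # abort if more than 1% of requests fail
SOAK_SECONDS = 1                  # kept short for the sketch; minutes or hours in practice

def canary_release() -> bool:
    """Shift traffic to the new version in steps, rolling back on problems."""
    for percent in CANARY_STEPS:
        set_traffic_split(percent)
        time.sleep(SOAK_SECONDS)               # let metrics accumulate at this step
        if error_rate("new") > ERROR_THRESHOLD:
            rollback()
            return False
    return True                                # every user is now on the new version

print("Release succeeded" if canary_release() else "Release rolled back")

The percentages, threshold, and soak time here are illustrative; the important design point is that the decision to continue or roll back is driven by observed metrics rather than by a fixed schedule.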