By: Suresh GP
DevOps Institute, which is dedicated to advancing the human elements of DevOps, has been bringing together collective knowledge on DevOps from across the globe since 2015. In January 2020 it introduced the SRE Foundation course, based on the service management approach introduced by Google. If you are new to DevOps and SRE, this blog highlights the goal of each course and will help you decide which course is right for you.
DevOps emerged when organizations were adopting Agile for developing new features but struggled to operationalize them. DevOps is quite broad, covering the entire value stream of software development and support. To be successful in DevOps, one should equally value Culture, Automation, Lean practices, Measurement and Sharing (abbreviated as CALMS).
The DevOps Foundation course gives you a breadth of knowledge in a structured way. It starts with why DevOps is relevant to business stakeholders, supported by case studies from companies that have successfully implemented DevOps.
Principles & Practices
DevOps Foundation stresses the three key principles “Flow, Feedback, and Continuous Experimentation and Learning” from the book ‘The Phoenix Project’. Applying and reflecting on these principles helps you start your DevOps journey on the right track.
Building on the principles, it provides a strong foundation in key practices such as Continuous Integration, Continuous Delivery, DevSecOps and Continuous Testing. This will refine your approach.
Value From This Course
You can take advantage of this course to:
- Impart the importance of culture in the DevOps journey
- Get tool-agnostic guidance on how to automate
- Learn key patterns and anti-patterns in implementing CI/CD
- Use Lean practices like Value Stream Mapping to visualize and optimize your app lifecycle
Google defines Site Reliability Engineering as “treating operations as if it’s a software problem”. The operations team traditionally focuses on handling incidents, responding to events and updating or patching the system as and when the need arises. As the system grows, this approach becomes very difficult to sustain. SRE addresses the problem both proactively (through automation) and reactively (by responding to incidents).
The SRE Foundation course gives you broad knowledge on balancing reliability against the velocity of deploying features to production. It stresses defining a service level that would satisfy your customers rather than fixating on 99.99% uptime, which is challenging to achieve in practice.
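To see why a 99.99% target is so demanding, it helps to convert an availability percentage into the downtime it permits. The snippet below is a minimal sketch of that arithmetic. The function name `allowed_downtime_minutes` and the 30-day window are assumptions made for illustration and do not come from the course material.

```python
def allowed_downtime_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of downtime a service may accumulate in the window and still meet the target.

    slo_percent is the availability target, e.g. 99.9 (hypothetical parameter name).
    """
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - slo_percent / 100)


if __name__ == "__main__":
    # Compare a few common availability targets over an assumed 30-day window.
    for target in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% availability allows about {minutes:.2f} minutes of downtime")
```

Under these assumptions, a 99.9% target leaves about 43 minutes of downtime per 30 days, while 99.99% leaves a little over 4 minutes, which is why the target should follow from what customers actually need.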
Principles and Practices
Key principles that SRE Foundation focuses on:
- Approaching operations as a software problem
- Defining service level objectives from the end-user perspective
- Understanding and eliminating toil
- Automating processes to remove toil
- Building a generative culture for shared responsibility, and
- Reducing the cost of failure
Value From This Course
You can take advantage of this course to:
- Get started with Site Reliability Engineering
- Define error budgets for your services to deploy features and eliminate toil
- Approach monitoring for a quick response
- Understand how SRE blends with other frameworks
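As an illustration of how an error budget can inform the decision to deploy features, here is a minimal sketch based on request counts. The `ErrorBudget` class, its field names, and the rule "release only while budget remains" are hypothetical simplifications chosen for this example. Real policies vary by team.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Request-based error budget for one service (illustrative, not a standard API)."""

    slo: float             # target success ratio, e.g. 0.999
    total_requests: int    # requests served in the measurement window
    failed_requests: int   # requests that violated the objective

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the objective permits to fail.
        return self.total_requests * (1 - self.slo)

    @property
    def remaining_fraction(self) -> float:
        # Fraction of the budget still unspent; negative once it is overspent.
        if self.allowed_failures == 0:
            return 0.0
        return 1 - self.failed_requests / self.allowed_failures

    def can_release(self) -> bool:
        # Assumed policy: keep shipping features only while budget remains.
        return self.remaining_fraction > 0


if __name__ == "__main__":
    budget = ErrorBudget(slo=0.999, total_requests=1_000_000, failed_requests=400)
    print(f"Budget remaining: {budget.remaining_fraction:.0%}")
    print("Release allowed" if budget.can_release() else "Focus on reliability work")
```

The point of the sketch is that the budget turns reliability into a shared number: while it lasts, teams can ship; once it is spent, effort shifts to reliability work.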