DevOps Institute

Why You Should Bring Chaos Engineering to Your Legacy CI/CD Pipeline

Updated January 18, 2023
By: Anurag Sharma

In an increasingly distributed world, we often ask ourselves whether continuous delivery can accommodate legacy software systems, and whether we can use chaos engineering to improve reliability in these environments. There is an assumption that legacy can’t be agile, or an attitude that its future is uncertain but sure to be short. But nobody knows how short, and it could be years or even decades before this type of technical debt is worked away. Legacy is heritage. It is cherished and often represents an organization’s core business systems: cash cows that the business depends on, that have grown highly complex over their lifetime, and that as such are incredibly difficult to replace and cannot simply be abandoned.

Can Chaos Engineering Help with Legacy Systems?

Not every organization is a unicorn born on the web, like the FANG companies. Most organizations carry some legacy. Legacy applications bring with them legacy databases built on older technologies, and there’s a good chance most organizations have to deal with a massive old-style RDBMS. While the vast majority of global IT teams are moving from monoliths to microservices, there is always a transition period in which legacy applications and components need to operate close to the speed of the new world. These legacy monoliths typically have a tightly coupled architecture that needs to be loosened to allow incremental, small-batch change, test, and release in order to enhance velocity.

On the software development side, there is a proven approach for building consistency into delivery: Continuous Delivery, which is here to stay. On the IT operations side, we have chaos engineering, which can swiftly uncover failure modes that teams aren’t aware of but that have the potential to ruin a business. Chaos engineering carries a clear message: it is preferable to practice small failures constantly than to increase the risk of a catastrophic public failure, which can seriously damage a business’s reputation.

The idea is to consider the overall ecosystem, which includes all your critical legacy systems, and to choose your battles appropriately when it comes to legacy.

Let’s clarify one thing before we go further: resilience assessment approaches differ, and a Disaster Recovery exercise is not the same as a Game Day.

What doesn’t matter for Chaos Engineering:

Size of your organization and team
Languages, technologies, and development methodology (Waterfall/Agile)

Why You Should Bring Chaos to Your Legacy CI/CD Pipelines

The ultimate goal of your CI/CD pipeline is to automate the software build process to enhance velocity. Once it is set up, it makes sense to integrate chaos engineering with it. As part of the deployment pipeline, you can push your chaos files to start a disruption in a specific environment. Here are a few scenarios:

Make legacy dependencies unavailable when you push a deployment
Introduce a failure in key code paths and orchestrate a canary deployment
Reduce capacity and run a load test just after deployment
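
The first scenario can be sketched as a pipeline step that switches a dependency off, checks that the newly deployed code still responds, and always restores the dependency afterwards. This is a minimal illustration under stated assumptions, not a real chaos tool: `LegacyDependency`, `outage`, `get_balance`, and `chaos_gate` are hypothetical names, and the dependency is an in-memory stand-in rather than an actual legacy system.

```python
from contextlib import contextmanager
from dataclasses import dataclass


class DependencyUnavailable(Exception):
    """Raised when a call reaches a dependency that is switched off."""


@dataclass
class LegacyDependency:
    """Hypothetical in-memory stand-in for a legacy system."""
    name: str
    available: bool = True

    def fetch_balance(self, account_id: str) -> int:
        if not self.available:
            raise DependencyUnavailable(self.name)
        return 100  # fixed value, for illustration only


@contextmanager
def outage(dependency: LegacyDependency):
    """Make the dependency unavailable inside the block, then restore it."""
    previous = dependency.available
    dependency.available = False
    try:
        yield dependency
    finally:
        dependency.available = previous  # the fault is always rolled back


def get_balance(dependency, account_id, cache):
    """Code under test: fall back to the last known value on failure."""
    try:
        value = dependency.fetch_balance(account_id)
        cache[account_id] = value
        return value, "live"
    except DependencyUnavailable:
        if account_id in cache:
            return cache[account_id], "cached"
        raise


def chaos_gate(dependency) -> bool:
    """Pipeline step: True if the deployment tolerates the outage."""
    cache = {}
    get_balance(dependency, "acct-1", cache)  # warm up in steady state
    with outage(dependency):
        try:
            _, source = get_balance(dependency, "acct-1", cache)
        except DependencyUnavailable:
            return False
    return source == "cached"
```

A deployment job could call `chaos_gate(...)` right after the push and exit with a non-zero status when it returns `False`, so that a release that cannot survive the outage never gets promoted.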

Useful Experiments on Your Build Cycle

Here are some examples:

For legacy pipelines, let’s take the example of the mainframe. The pipeline covers the following stages:

Version control: tools like ISPW, ChangeMAN, etc.
Build, Release, Deploy: e.g. Topaz, IBM tools, etc.
Operate: manual/automated
Monitoring: BMC, Splunk, etc.

Here are two chaos experiments that help to assess your pipeline:

Application-specific experiment: A specific idea or test design applied to check reliability, which can be a one-off experiment. It can be used while developing, building, testing, deploying, operating, and monitoring.
GameDays: These are closer to real time, with responsibilities shared across the team and a specific focus.

The idea here is to speed up deployment and find issues before they hit production.

What are the best monkeys for Game Days?

Latency Monkey: Remember that when you move away from legacy, you mostly remain in a transition state, so it’s crucial to ensure that integration keeps working. You need something to challenge your latency.
Big Iron King Kong (Legacy Monkey): This monkey should let you experience the following:
Starting and stopping a key database automatically
Introducing a highly localized failure in a legacy system, or making the system slow
Terminating the entire database or disconnecting it from the datacenter
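
To make the Latency Monkey idea concrete, here is a small sketch of what "challenging your latency" could look like in a test. It assumes a hypothetical `legacy_lookup` call and wraps it so that every invocation is delayed, then checks that the calling code gives up after a timeout and returns a fallback instead of hanging. The function names and timings are illustrative assumptions, not part of any real monkey implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def legacy_lookup(customer_id: str) -> str:
    """Hypothetical call into a legacy system; returns immediately here."""
    return f"record-for-{customer_id}"


def with_latency(func, delay_seconds: float):
    """Latency monkey: wrap a call so every invocation is delayed."""
    def delayed(*args, **kwargs):
        time.sleep(delay_seconds)
        return func(*args, **kwargs)
    return delayed


def call_with_timeout(func, timeout_seconds: float, fallback, *args):
    """Integration code under test: stop waiting on slow calls."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        return fallback
    finally:
        # Do not block on the slow worker; it finishes in the background.
        pool.shutdown(wait=False)


# Experiment: the legacy call becomes slow, the caller must not wait for it.
slow_lookup = with_latency(legacy_lookup, delay_seconds=0.5)
degraded = call_with_timeout(slow_lookup, 0.05, "unavailable", "c-42")
healthy = call_with_timeout(legacy_lookup, 0.5, "unavailable", "c-42")
```

In this sketch `degraded` ends up as the fallback value and `healthy` as the real record. In a Game Day the delay would be injected into a real integration point rather than an in-process wrapper.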

For build pipelines, the sweet spot remains in the middle because, usually, the software itself plays a role in responding to the failure. For example, the software might include an automated restart, throttling, failover, etc. If those are software functions, then they can either work or not work, and the build should be able to uncover that.
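
As an illustration of a build uncovering whether such a function works, the sketch below runs a tiny failover experiment that could sit alongside ordinary build-time tests. `Node`, `FailoverClient`, and `run_failover_experiment` are hypothetical names invented for this example, and the nodes are in-memory objects rather than real servers.

```python
class NodeDown(Exception):
    """Raised when a request reaches a node that has been taken down."""


class Node:
    """Hypothetical in-memory stand-in for a server instance."""

    def __init__(self, name: str):
        self.name = name
        self.up = True

    def handle(self, request: str) -> str:
        if not self.up:
            raise NodeDown(self.name)
        return f"{self.name}:{request}"


class FailoverClient:
    """Software function under test: try each node in order."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def send(self, request: str) -> str:
        last_error = None
        for node in self.nodes:
            try:
                return node.handle(request)
            except NodeDown as error:
                last_error = error  # remember the failure, try the next node
        raise RuntimeError("all nodes are down") from last_error


def run_failover_experiment() -> dict:
    """Inject failures step by step and record what the client does."""
    primary, standby = Node("primary"), Node("standby")
    client = FailoverClient([primary, standby])
    results = {"steady_state": client.send("ping")}

    primary.up = False  # fault 1: lose the primary
    results["primary_down"] = client.send("ping")

    standby.up = False  # fault 2: lose everything
    try:
        client.send("ping")
        results["all_down"] = "unexpected success"
    except RuntimeError:
        results["all_down"] = "failed cleanly"
    return results
```

If the failover logic regresses, the `primary_down` step raises instead of returning the standby's answer, and the build fails before the change reaches production.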

What truly differentiates the best from the rest is a growing focus on the reliability of the entire ecosystem, and on how effectively you test the resilience of your system from build all the way through production. Chaos engineering and continuous delivery, the future of releases, are two of the best practices for doing that effectively and getting the maximum value from it.
