DevOps Unit 5 Notes
CALMS
CALMS is a framework for assessing DevOps and all its allies, including business, IT operations, QA, InfoSec, and
development teams, and how they collectively deliver, deploy, and integrate automated business
processes. CALMS is an acronym for Culture, Automation, Lean, Measurement, and
Sharing. It is structured around metrics that help an organization analyze its structure and assess
the feasibility of DevOps within it.
The CALMS model provides a reference framework for comparing the maturity of teams and
gauging their readiness for the transformational change that comes with DevOps. As business
demands grow, organizations need to lean towards faster yet reliable ways of developing products.
Culture
Inter- and intra-team communication, the pace of deployment, the handling of outages, and the
development-to-production cycle together shape the culture of an organization. DevOps has led to
a culture change that transforms the traditional development process. It can be seen as an
evolution of agile teams, the difference being that the operations team is included. Earlier,
developers mainly focused on building and developing products, while the operations team
handled concerns like availability, performance, security, and reliability. DevOps provides a
culture where the development and operations teams can collaborate on any incident report that
may cause business problems. For modern business purposes, issues need to be resolved quickly,
which is possible when the teams have one source of data. This leads to a collaborative,
shared-responsibility environment.
DevOps culture promotes increased transparency, communication, and alliance between teams.
The inclusion of automated processes accelerates the SDLC, thus promoting organizational
success and enhancing team performance. It is an agile approach or culture in which the
development and operations teams continuously work together while building a quality product.
Automation
Systems can be made reliable by eliminating repetitive manual work, which is done through
automation. Companies that are new to automation usually start with continuous delivery: code is
passed through a battery of automated tests, then packaged into builds and promoted to successive
environments through automated deployments.
Automation applies technology to tasks with reduced human assistance, allowing iterative
updates to be implemented faster and more efficiently. By integrating API-centric design with
automation, teams can deliver resources swiftly, with support for proof of concept, development,
testing, and deployment. This enables the development and operations teams to deliver the
product to customers faster and more efficiently.
Tests executed by computers are more trustworthy and transparent than those executed by
humans. These tests catch security vulnerabilities and bugs and inform IT/Ops, thus helping to
reduce failures at release time.
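As a minimal illustration of such an automated gate (the deploy script name is hypothetical; it assumes pytest is installed), a build is promoted only when the automated tests pass:

```python
# ci_gate.py - a minimal sketch of an automated test gate.
# Assumes pytest is installed; "deploy.sh" stands in for a real deploy step.
import subprocess
import sys

def main() -> None:
    # Run the automated test suite; a non-zero exit code means failures.
    tests = subprocess.run(["pytest", "-q"])
    if tests.returncode != 0:
        print("Tests failed - blocking the release.")
        sys.exit(tests.returncode)

    # Only a green test run reaches the deployment step.
    print("Tests passed - promoting the build.")
    subprocess.run(["./deploy.sh", "--env", "staging"], check=True)

if __name__ == "__main__":
    main()
```

In a real pipeline, a CI server such as Jenkins or GitLab CI performs this gating automatically on every commit.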
Lean
In DevOps, the lean process allows development teams to deploy quickly and imperfectly. The
present software development culture finds it better to get the product into the customer's hands
as soon as possible than to wait for the product to be perfect. In this context, teams assume that
failure is inevitable and can happen at any time.
DevOps can fully utilize lean methodologies by delivering valuable products to customers and
staying aligned with the business.
Measurement
Data metrics play an important role in implementing best practices in the DevOps environment.
A successful DevOps implementation should measure people-centric metrics such as feature
usage, service-level agreements (SLAs), and user feedback.
Data helps analyze performance, business workflows, customer feedback, and more. The data
that teams use to make decisions becomes even more useful when shared with other teams. IT
performance is commonly summarized by four measures:
Deployment frequency.
Lead time for changes.
Mean time to restore.
Change failure rate.
Organizations must look into two types of key measures: inputs and outcomes.
For example, if the development team wants to add new features to the product, but you are
seeing high customer churn driven by other aspects of the product, the data you share could help
redirect the effort to where it matters.
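To make the four measures concrete, here is a minimal sketch that computes them from a hypothetical list of deployment records; the record format and numbers are invented for illustration:

```python
# dora_metrics.py - a minimal sketch computing the four IT performance
# measures from hypothetical deployment records.
from datetime import datetime

# Each record: (commit time, deploy time, deployment failed?, minutes to restore)
deployments = [
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 15), False, 0),
    (datetime(2024, 1, 2, 10), datetime(2024, 1, 3, 11), True, 45),
    (datetime(2024, 1, 4, 8), datetime(2024, 1, 4, 12), False, 0),
]

days_observed = 7
failures = [d for d in deployments if d[2]]

deployment_frequency = len(deployments) / days_observed  # deployments per day
lead_time = sum((dep - commit).total_seconds() / 3600
                for commit, dep, _, _ in deployments) / len(deployments)  # hours
change_failure_rate = len(failures) / len(deployments)
mean_time_to_restore = (sum(d[3] for d in failures) / len(failures)) if failures else 0.0

print(f"Deployment frequency: {deployment_frequency:.2f}/day")
print(f"Lead time for changes: {lead_time:.1f} h")
print(f"Change failure rate: {change_failure_rate:.0%}")
print(f"Mean time to restore: {mean_time_to_restore:.0f} min")
```

With these sample records, the sketch reports roughly 0.43 deployments per day, an average lead time of about 11.7 hours, a 33% change failure rate, and a 45-minute mean time to restore.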
Sharing
Sharing is the final piece of the CALMS model: teams share knowledge, tooling, and lessons
learned across the organization so that both successes and failures become reusable experience.
Shared responsibility between development and operations, visible metrics, and open
post-incident reviews keep friction between teams low and spread good practices
organization-wide.
Test-Driven Development (TDD)
Test-Driven Development (TDD) is a practice in which tests are written before the code they are
meant to verify. In DevOps, the TDD cycle proceeds through the following steps:
Write a Test: Developers write a test case that defines the expected behavior of a specific part
of the software. This test is often written in a testing framework like JUnit for Java, pytest for
Python, or similar tools.
Run the Test (It Fails): Since there is no code implemented yet, the initial test will fail. This
failure is expected because the code to fulfill the test's requirements hasn't been written.
Write the Code: Developers then write the code necessary to make the test pass. This involves
implementing the required functionality.
Run the Test (It Passes): After writing the code, developers run the test again. This time, it
should pass, indicating that the code now meets the specified requirements.
Refactor (if needed): Once the test passes, developers may refactor the code to improve its
structure, performance, or other aspects. Importantly, they can make these changes with
confidence, knowing that as long as the tests continue to pass, the software's functionality
remains intact.
Repeat: Steps 1-5 are repeated for each piece of functionality or code change, incrementally
building and enhancing the software.
By following this iterative cycle of writing tests, implementing code, and verifying
functionality, TDD helps ensure that software is more reliable, maintainable, and adheres to the
desired specifications. In DevOps, this approach aligns with the principles of automation,
continuous integration, and continuous delivery (CI/CD) to deliver high-quality software
efficiently and consistently.
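The following is a minimal sketch of one red-green cycle using pytest; the module, file, and function names (cart.py, apply_discount) are hypothetical:

```python
# test_cart.py - step 1: write the test first. Running `pytest -q` now fails
# (red) because cart.apply_discount does not exist yet.
import pytest
from cart import apply_discount

def test_ten_percent_discount():
    assert apply_discount(price=100.0, percent=10) == pytest.approx(90.0)

def test_zero_discount_leaves_price_unchanged():
    assert apply_discount(price=50.0, percent=0) == pytest.approx(50.0)
```

```python
# cart.py - steps 3-4: write just enough code to make the tests pass (green).
# With the suite green, the function can be refactored with confidence.
def apply_discount(price: float, percent: float) -> float:
    """Return the price after applying a percentage discount."""
    return price * (1 - percent / 100)
```

Running `pytest -q` again now passes, completing one red-green-refactor iteration.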
Benefits:
Here are the key advantages of using TDD in DevOps:
1. Improved Code Quality: TDD encourages developers to write clean, modular, and
well-structured code. Since tests are written before code is implemented, they serve as a
blueprint for the desired functionality. This leads to more reliable and maintainable code.
2. Early Detection of Bugs: TDD ensures that tests are run frequently, often automatically as
part of the continuous integration (CI) pipeline. Any code changes that introduce defects or
regressions are identified immediately, allowing for rapid correction.
3. Rapid Feedback: TDD provides rapid feedback to developers. They can quickly determine
whether their code meets the desired specifications, resulting in faster development cycles.
4. Reduced Debugging Efforts: By catching and addressing issues early in the development
process, TDD reduces the time and effort spent on debugging and troubleshooting later on.
5. Automated Testing: TDD promotes the creation of automated tests. These tests can be
integrated into the CI/CD pipeline, ensuring that code changes are automatically validated and
preventing broken code from reaching production.
6. Regression Testing: TDD tests serve as regression tests, ensuring that existing functionality
remains intact as new features are added or code is modified. This reduces the risk of
introducing unintended side effects.
7. Documentation: Test cases effectively serve as documentation for how a component or
feature should behave. This documentation is always up-to-date and can be easily referenced
by developers, QA teams, and other stakeholders.
8. Enhanced Collaboration: TDD fosters collaboration between developers and QA teams. QA
teams can actively participate in defining test cases, which helps ensure that the software meets
both functional and non-functional requirements.
9. Supports Continuous Integration and Continuous Delivery (CI/CD): TDD aligns well with
the principles of CI/CD. Automated tests can be executed as part of the CI pipeline, allowing
for the rapid delivery of tested and validated code to production.
10. Reduces Technical Debt: TDD encourages regular code refactoring to improve code
quality. This helps prevent the accumulation of technical debt, making it easier to maintain and
extend the codebase over time.
11. Increased Confidence: Developers and stakeholders have greater confidence in the
software's reliability and correctness due to comprehensive test coverage.
12. Cost Savings: Catching and fixing defects early in the development process is more
cost-effective than addressing them later in the lifecycle or in a production environment.
Limitations of TDD in DevOps:
1. Learning Curve: TDD can be challenging for developers who are new to the practice. It
requires a shift in mindset and may slow down development initially as developers learn to
write effective tests.
2. Initial Investment: Writing tests before code can seem time-consuming initially. However,
this investment pays off in terms of reduced debugging and maintenance efforts later in the
development cycle.
3. Incomplete Testing: TDD primarily focuses on unit testing, which verifies individual
components in isolation. While essential, it doesn't replace other forms of testing like
integration testing, system testing, or user acceptance testing. These need to be incorporated
into the overall testing strategy.
4. Overemphasis on Testing: Overzealous adherence to TDD can lead to excessive testing,
resulting in brittle test suites and increased maintenance efforts for test code.
5. Not Suitable for All Situations: TDD may not be the best approach for all projects. In some
cases, especially when requirements are unclear or when working with emerging technologies,
it can be challenging to write tests upfront.
6. False Sense of Security: Passing tests don't guarantee that the software is completely
bug-free or that it meets all user requirements. It's possible to have well-tested code that still
fails to deliver value to users.
Configuration management
Here is a list of nine of the best and most popular (in no particular order) configuration
management and related DevOps tools.
Ansible
Currently one of the most widely used tools of its kind, Ansible lets developers get free of
repetition and focus more on strategy, so that everyday tasks stop interfering with complex
processes. The framework employs YAML configuration files (playbooks) to specify system
configuration steps, and the defined sequence of actions is then run by the appropriate
Python-based executables. The framework is simple to learn and doesn't require separate agents
to manage nodes (it uses the Paramiko module and standard SSH for that).
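As a minimal sketch of how Ansible is typically driven (assuming Ansible is installed and an inventory.ini file exists; the host group and package are hypothetical), the script below renders a small YAML playbook and runs it with the standard ansible-playbook CLI:

```python
# run_playbook.py - minimal sketch: render a YAML playbook and execute it
# with the ansible-playbook CLI. Assumes Ansible is installed and an
# inventory file exists; the "webservers" group and nginx are hypothetical.
import subprocess

PLAYBOOK = """\
- name: Ensure nginx is present and running
  hosts: webservers
  become: true
  tasks:
    - name: Install nginx
      ansible.builtin.package:
        name: nginx
        state: present
    - name: Start and enable nginx
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
"""

with open("site.yml", "w") as f:
    f.write(PLAYBOOK)

# Ansible connects to the nodes over SSH; no agent is needed on them.
subprocess.run(["ansible-playbook", "-i", "inventory.ini", "site.yml"], check=True)
```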
Terraform
An open-source infrastructure-as-code (IaC) platform for conveniently managing clusters,
services, and cloud infrastructure. The platform integrates easily with Azure, AWS, and a
number of other cloud providers. Databases, servers, and other essential objects have individual
interfaces and representations. You can set up repeatable deployments of cloud infrastructure,
with the platform provisioning resources such as AWS services from text files and handling the
configured deployment tasks autonomously.
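A minimal sketch of driving the standard Terraform CLI from Python (assuming Terraform is installed and the working directory contains .tf files describing the desired resources):

```python
# tf_apply.py - minimal sketch: drive the standard Terraform CLI from Python.
# Assumes terraform is installed and *.tf files exist in the working directory.
import subprocess

def run(*args: str) -> None:
    subprocess.run(["terraform", *args], check=True)

run("init")                  # download providers and initialize state
run("plan", "-out=tf.plan")  # compute the repeatable deployment plan
run("apply", "tf.plan")      # apply the saved plan (no interactive prompt)
```

Because the plan is saved to a file before applying, the same reviewed change set is what actually gets deployed.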
Chef Infra
Focused on DevOps, the infrastructure tools by Chef help achieve new levels of IT management
flexibility, efficiency, and convenience. Ultimately, they help speed up software delivery by
providing fast and simple means of building, testing, and patching new environments; deploying
new software versions properly; boosting system resiliency and risk management through
dedicated metrics; and delivering any type of infrastructure in any environment seamlessly and
continuously.
Vagrant
Focused on building and maintaining virtual machine environments, Vagrant helps reduce the
time needed to set up a development environment and boosts production parity. You can also
use it to conveniently share virtual environment configurations and setup assets between team
members. A notable advantage is the way it handles provisioning: data files are provisioned
locally before the changes are applied to other related environments.
TeamCity
TeamCity is an efficient CI and build management solution from the renowned JetBrains. The
platform allows taking source code from different version control systems for use in one build,
reusing parent project settings in subprojects in a multitude of ways, efficiently detecting hung
builds, and highlighting builds that you need to return to later on. It is also a great CI/CD
solution for checking builds via the convenient Project Overview and making workflows in
various environments more flexible overall.
Puppet Enterprise
There are two versions of this tool: the open-source Puppet, which is free, and Puppet
Enterprise, which is free for up to ten nodes. Puppet is a highly organized tool that uses modules
to keep everything in place and make quick adjustments. Thus, you can orchestrate remediation,
monitor ongoing changes, and plan and implement deployments quickly. You can also manage a
number of servers in one place, define infrastructure as code, and enforce system configurations.
Octopus Deploy
With Octopus, complex deployments can be easily managed both on-premises and in the cloud.
The solution has all the capabilities to eliminate many common deployment errors, efficiently
distribute software deployment tasks in your team, painlessly deploy into new, unfamiliar
environments, and ultimately multiply your usual number of releases within a given time period.
SaltStack
This Python-based configuration tool delivers SSH and push methods for effective master-client
communication. Compared to running ad-hoc scripts, the platform provides a much more refined
and well-structured workflow, with heavy doses of automation for smoothing out your usual
continuous integration and continuous delivery processes.
AWS Config
With AWS Config, you can efficiently audit, assess, and further inspect configurations of AWS
resources. The real treat, however, is the change-tracking capability AWS Config provides: it
allows you to track detailed histories of resource configurations, review changes in AWS
resource configurations and their inter-relationships, and determine overall compliance with the
configurations specified by internal guidelines.
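Below is a minimal sketch using boto3 (assumes AWS credentials and a region are configured, and that at least one Config rule exists; the EC2 resource ID is a hypothetical placeholder):

```python
# config_audit.py - minimal sketch: query AWS Config with boto3.
# Assumes AWS credentials/region are configured; the resource ID is hypothetical.
import boto3

config = boto3.client("config")

# Review compliance of every active Config rule.
for rule in config.describe_config_rules()["ConfigRules"]:
    name = rule["ConfigRuleName"]
    compliance = config.describe_compliance_by_config_rule(ConfigRuleNames=[name])
    for result in compliance["ComplianceByConfigRules"]:
        print(name, "->", result["Compliance"]["ComplianceType"])

# Track the detailed configuration history of a single resource.
history = config.get_resource_config_history(
    resourceType="AWS::EC2::Instance",
    resourceId="i-0123456789abcdef0",  # hypothetical instance ID
)
for item in history["configurationItems"]:
    print(item["configurationItemCaptureTime"], item["configurationItemStatus"])
```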
Benefits of Configuration Management:
It facilitates the ability to communicate the status of documents and code, as well as the changes
that have been made to them. High-quality software that has been tested and used becomes a
reusable asset, saving development costs.
Increased efficiency, stability, and control by improving visibility and tracking.
The ability to define and enforce formal policies and procedures that govern asset
identification, status monitoring, and auditing.
All components and sub-components are carefully itemized, giving a clear understanding of a
product, its component elements, and how they relate to each other.
Maintains project team morale: a change to a product's specification can have a detrimental
effect when the team has to redo all of its work.
It helps eliminate confusion, chaos, double maintenance, and the shared-data problem.
Infrastructure automation
Infrastructure automation is a fundamental practice in DevOps that involves using code and
automation tools to provision, configure, and manage infrastructure resources. This approach
streamlines the deployment and management of applications and services, reduces manual errors,
and enables faster and more consistent infrastructure changes. Here are key concepts and tools
related to infrastructure automation in DevOps:
Infrastructure as Code (IaC):
IaC is the practice of managing and provisioning infrastructure using code or scripts rather than
manual processes.
Popular IaC tools include Terraform, AWS CloudFormation, Ansible, and Puppet.
Configuration Management:
Configuration management tools, such as Ansible, Puppet, and Chef, automate the configuration
of servers and ensure they are consistent across the infrastructure.
They help maintain the desired state of servers and applications.
Orchestration:
Orchestration tools like Kubernetes or Docker Swarm are used to manage and automate the
deployment, scaling, and networking of containers.
They simplify container orchestration and ensure high availability and scalability.
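As a small illustration of working with an orchestrator programmatically, here is a minimal sketch using the official Kubernetes Python client (assumes the `kubernetes` package is installed and a kubeconfig is available):

```python
# list_pods.py - minimal sketch: query a cluster with the official
# Kubernetes Python client (pip install kubernetes; assumes a kubeconfig).
from kubernetes import client, config

config.load_kube_config()  # read credentials from ~/.kube/config
v1 = client.CoreV1Api()

# List every pod the orchestrator is currently managing, across namespaces.
for pod in v1.list_pod_for_all_namespaces().items:
    print(pod.metadata.namespace, pod.metadata.name, pod.status.phase)
```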
Continuous Integration/Continuous Deployment (CI/CD):
CI/CD pipelines automate the building, testing, and deployment of code and infrastructure
changes.
Jenkins, Travis CI, CircleCI, and GitLab CI/CD are common CI/CD tools.
Version Control:
Using version control systems like Git, you can track changes to your infrastructure code and
collaborate with team members effectively.
Immutable Infrastructure:
Immutable infrastructure involves treating infrastructure as disposable and recreating it from
scratch with every change.
Tools like Packer and Docker facilitate the creation of immutable images.
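A minimal sketch of baking an immutable, versioned image with the Docker SDK for Python (assumes the `docker` package is installed, a Docker daemon is running, and a Dockerfile exists in the current directory; the tag is hypothetical):

```python
# bake_image.py - minimal sketch: build an immutable, versioned image with
# the Docker SDK for Python (assumes a local Dockerfile and Docker daemon).
import docker

client = docker.from_env()

# Every change produces a brand-new tagged image; running containers are
# replaced with new ones, never patched in place.
image, _logs = client.images.build(path=".", tag="myapp:1.0.1")
print("Built immutable image:", image.tags)
```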
Monitoring and Logging:
Integrating monitoring and logging tools like Prometheus, Grafana, ELK Stack, or Datadog helps
automate the collection and analysis of infrastructure and application metrics.
Compliance and Security as Code:
Implementing security and compliance policies as code (Security as Code) helps ensure that
infrastructure meets required security standards.
Tools like HashiCorp Sentinel and AWS Config can be used for this purpose.
Self-Service Portals:
Some organizations develop self-service portals or catalogs that enable teams to request and
provision resources through automation scripts or predefined templates.
Cloud Services:
Cloud providers offer various services for infrastructure automation, including AWS
CloudFormation, Azure Resource Manager, and Google Cloud Deployment Manager.
Benefits of Infrastructure Automation in DevOps:
Speed and Efficiency: Automation reduces the time required to provision and manage
infrastructure, enabling faster development and deployment cycles.
Consistency: Automation ensures that infrastructure is provisioned and configured consistently,
reducing the risk of configuration drift and errors.
Scalability: Infrastructure can be scaled up or down automatically in response to changing
workloads.
Reduced Manual Errors: Automation reduces the likelihood of human errors in configuration
and provisioning.
Version Control: Infrastructure code can be versioned, providing a history of changes and
enabling collaboration among team members.
Cost Optimization: Automation can help manage and optimize cloud resource costs by shutting
down unused resources and rightsizing instances.
In summary, infrastructure automation in DevOps is a key practice that enables organizations to
efficiently manage and scale their infrastructure while ensuring consistency and reliability. It
plays a crucial role in achieving the goals of agility, speed, and reliability in modern software
development and operations.
Root Cause Analysis
Root Cause Analysis (RCA) is a method of analyzing major problems before attempting to fix
them. It involves isolating and identifying the problem's fundamental root cause. A root cause is
defined as a factor that, if removed, would prevent the occurrence of the bad event. Other
elements that affect the outcome should not be regarded as root causes.
Root cause analysis is important for solving an issue because preventing an event from occurring
is preferable to dealing with its negative consequences. For large organizations, short-term fixes
are not economical; RCA helps to permanently eliminate the source of the defect.
Root cause analysis can be done with a variety of tools and approaches, but in general, it entails
digging deep into a process to determine what, when, and why an event occurs. However, root
cause analysis is a reactive approach, which means that an error or bad event must occur before
RCA can be applied.
Root cause analysis is a team-based practice, not a choice made by a single person. RCA should
begin by precisely identifying the issue, which is frequently an undesirable event that should not
occur again.
To keep track of all important details, RCA should be used soon after an undesirable event.
Process owners are the fundamental skeleton for a proper RCA, but they may not be comfortable
with such meetings and conversations. As a result, managers will play a key role in conveying
the value of RCA and maintaining the organization's non-blame culture.
The goal of RCA is to find all of the components that contribute to a problem or event. An
analysis method is the most effective way to accomplish this. The following are some of the
RCA methods:
The “5-Whys” Analysis
A basic problem-solving strategy that allows people to quickly get to the root of an issue.
The Toyota Production System popularized it in the 1970s. The strategy consists of looking
at a problem and asking "why" and "what caused this problem." The answer to the first
"why" frequently prompts a second "why," and so on, forming the foundation of the
"5-why" examination. For example: the website went down; why? The server ran out of
memory; why? A recent deployment leaked connections; why? The change was never
load-tested; and so on, until the root cause surfaces.
Pareto Analysis
A statistical decision-making technique for identifying the small number of causes that
have the most significant overall effect. The premise is that roughly 80% of problems are
created by only a few essential causes (see the sketch after this list).
Barrier Analysis
An investigation or design method that entails tracing the routes through which a hazard
harms a target, as well as identifying any failed or absent countermeasures that could or
should have prevented the unintended outcome.
Change Analysis
Methodically looks for prospective risk consequences and appropriate risk management
techniques in circumstances where change is occurring. This can include situations where
system configurations are modified, operating practices or policies are revised, or new or
different activities are undertaken, among other things.
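As flagged in the Pareto item above, here is a minimal sketch of a Pareto analysis over hypothetical defect counts: it sorts causes by frequency and keeps the "vital few" that account for roughly 80% of all problems.

```python
# pareto.py - minimal Pareto analysis over hypothetical defect counts:
# find the "vital few" causes responsible for ~80% of all problems.
defect_counts = {
    "misconfigured deploys": 120,
    "flaky tests": 45,
    "dependency conflicts": 30,
    "disk space exhaustion": 15,
    "expired certificates": 8,
    "dns misconfiguration": 2,
}

total = sum(defect_counts.values())
cumulative = 0
vital_few = []

# Walk causes from most to least frequent until ~80% of defects are covered.
for cause, count in sorted(defect_counts.items(), key=lambda kv: kv[1], reverse=True):
    cumulative += count
    vital_few.append(cause)
    if cumulative / total >= 0.8:
        break

print(f"{len(vital_few)} of {len(defect_counts)} causes explain "
      f"{cumulative / total:.0%} of defects: {vital_few}")
```

With these invented numbers, three of the six causes explain about 89% of the defects, which is where corrective effort should be focused first.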
Root cause analysis can be applied to a range of situations in a variety of industries. Each
industry may undertake the analysis in a somewhat different way, but when it comes to
investigating issues with heavy machinery, most follow the same general five-step method.
Step 1: Data Collection
Collecting data is the most critical phase in the root cause analysis process, similar to how police
maintain a crime scene and methodically collect evidence for evaluation. It's best to collect data
as soon as possible after a failure or, if possible, while it's still happening.
Make a note of any physical proof of the failure in addition to the data. Conditions before,
during, and after the incident; employee involvement; and any environmental elements are
examples of data you should collect.
Step 2: Assessment
Analyze all obtained data throughout the assessment phase to uncover possible causal factors
until one (or more) root causes are identified. The assessment phase, according to the DOE's
procedure, consists of four steps:
1. Identify the problem.
2. Determine the significance of the problem.
3. Identify the causes (conditions or actions) immediately preceding and surrounding the
problem.
4. Working backward, identify the reasons why the causes in the preceding step exist; the
root cause is the reason that, when fixed, will prevent this and similar failures around the
facility from occurring. The assessment phase comes to a halt once the root cause has been
identified.
Step 3: Corrective Actions
Once a root cause has been identified, corrective action can be taken to improve and strengthen
your process. Determine the corrective action for each cause first.
Then, to ensure that your corrective actions are practicable, check them against these five
criteria given out by the DOE:
1. Will it prevent recurrence?
2. Is it feasible to implement?
3. Does it still allow you to meet your production objectives?
4. Is it safe?
5. Will it be effective?
Before taking corrective action, your entire firm should debate and consider the benefits and
drawbacks of doing so. Consider how much it will cost to make these modifications. Training,
engineering, risk-based, and operational expenses are all possible costs. Weigh the benefits of
removing the failures against the likelihood that the remedial actions will work.
Step 4: Communication
Communication is essential. Make sure that everyone who is affected is aware of the planned
change or implementation. Supervisors, managers, engineers, and operations and maintenance
staff are examples of these parties in the manufacturing setting.
Step 5: Follow-up
In the follow-up step, you'll see whether your corrective action was successful in resolving the
issues.
Follow up on remedial actions to ensure that they were properly implemented and are
operating as intended.
Review the new corrective action tracking system regularly to ensure that it is working
properly.
Analyze any further recurrence of the same event to identify why the corrective actions
failed. Make a note of any new occurrences and analyze the symptoms.
Regular follow-up allows you to assess how well your corrective actions are working and helps
in the detection of new issues that could lead to future failures.
Why is Root Cause Analysis Important?
In the industry, repeat problems are a source of waste. Website downtime, product rework,
increased scrap, and the time and resources spent "solving" the problem are all examples of
waste. We may assume that the problem has been fixed when, in fact, we have just addressed a
symptom of the problem rather than the fundamental cause.
When done correctly, a Root Cause Analysis can reveal weaknesses in your processes or systems
that contributed to the non-conformance and help you figure out how to avoid it in the future. An
RCA is used to figure out what went wrong, why it went wrong, and what improvements or
modifications are needed. Repeat problems can be avoided with the right implementation of
RCA.
The use of RCA methodologies and tools is not restricted to manufacturing process issues. Many
industries employ the RCA methodology in a wide variety of scenarios, and this organized
approach to problem-solving is widely used across domains.
The point is that RCA can be used to solve practically any problem that businesses confront
daily. Consider, for example, a company with a high rate of erroneous customer orders and
shipments. The process can be mapped and examined, and the problem's underlying causes
identified and resolved. As a result, the company gains a happier, more loyal client base and
reduced total costs.
Organizational Learning in DevOps
Culture of Continuous Improvement: DevOps promotes a culture where teams are encouraged to
regularly assess their processes, tools, and practices to identify areas for improvement. This
continuous improvement mindset is central to DevOps and is a key aspect of organizational
learning.
Feedback Loops: DevOps emphasizes the importance of feedback loops at every stage of the
software delivery pipeline. This includes automated testing, monitoring, and user feedback.
These loops provide valuable information for learning and making informed decisions.
Post-Incident Reviews (Post-Mortems): When incidents or failures occur, DevOps encourages
teams to conduct post-incident reviews to understand the root causes and identify preventive
measures. These reviews are a critical aspect of organizational learning and help prevent similar
incidents in the future.
Automation and Monitoring: DevOps practices heavily rely on automation and monitoring tools.
These tools provide data that can be analyzed to identify bottlenecks, inefficiencies, and areas
where improvements can be made. Teams use this data to make informed decisions and iterate
on their processes.
Cross-Functional Collaboration: DevOps encourages collaboration between development,
operations, and other relevant teams, such as security and quality assurance. This cross-
functional collaboration fosters knowledge sharing and accelerates learning across different
domains.
Knowledge Sharing: DevOps teams often use centralized knowledge repositories and
documentation to share best practices, learnings from past experiences, and guidelines for
processes and tools. This knowledge sharing ensures that lessons learned are not lost and can
benefit the entire organization.
Experimentation and Innovation: DevOps encourages teams to experiment with new tools and
practices. Experimentation allows teams to learn what works best for their specific context and
continuously refine their approach.
Training and Skill Development: Organizations investing in DevOps often provide training and
skill development opportunities for their teams. This ensures that team members have the
knowledge and skills needed to effectively implement DevOps practices.
Leadership Support: Effective organizational learning in DevOps requires support from
leadership. Leaders should promote a culture of learning, allocate resources for improvement
initiatives, and demonstrate a commitment to DevOps principles.
Measuring Success: Key performance indicators (KPIs) are used to measure the success of
DevOps initiatives. Regularly assessing these KPIs helps teams understand the impact of their
improvements and adjust their strategies accordingly.
In summary, organizational learning is deeply embedded in DevOps, with a focus on continuous
improvement, feedback, collaboration, and a culture that values learning from both successes and
failures. By fostering a culture of learning and adaptation, organizations can achieve greater
efficiency, agility, and reliability in their software delivery and IT operations.