11/12/20
TRACK: CI/CD CONTINUOUS EVERYTHING
NOVEMBER 12, 2020
Hasan Yasar
Six Categories of Monitoring in the DevOps Pipeline
[DISTRIBUTION STATEMENT A] Approved for public release and unlimited distribution.
Copyright 2020 Carnegie Mellon University.
This material is based upon work funded and supported by the Department of Defense under Contract No. FA8702-
15-D-0002 with Carnegie Mellon University for the operation of the Software Engineering Institute, a federally funded
research and development center.
NO WARRANTY. THIS CARNEGIE MELLON UNIVERSITY AND SOFTWARE ENGINEERING INSTITUTE MATERIAL IS
FURNISHED ON AN "AS-IS" BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY KIND, EITHER
EXPRESSED OR IMPLIED, AS TO ANY MATTER INCLUDING, BUT NOT LIMITED TO, WARRANTY OF FITNESS FOR
PURPOSE OR MERCHANTABILITY, EXCLUSIVITY, OR RESULTS OBTAINED FROM USE OF THE MATERIAL. CARNEGIE
MELLON UNIVERSITY DOES NOT MAKE ANY WARRANTY OF ANY KIND WITH RESPECT TO FREEDOM FROM PATENT,
TRADEMARK, OR COPYRIGHT INFRINGEMENT.
[DISTRIBUTION STATEMENT A] This material has been approved for public release and unlimited distribution. Please
see Copyright notice for non-US Government use and distribution.
This material may be reproduced in its entirety, without modification, and freely distributed in written or electronic
form without requesting formal permission. Permission is required for any other use. Requests for permission should
be directed to the Software Engineering Institute at permission@sei.cmu.edu.
DM20-1053
Metrics about me!
• 25+ years of software development experience
• Certified Scrum Practitioner
• Certified Ethical Hacker
• Various roles throughout the SDLC: Manager,
Architect, Tester, Developer, QA, IT Manager,
Project Manager, VP…
• Started with waterfall in 1990
• Started with agile in 2003
• Started with DevOps in 2010
• Instructor delivering the DevOps course at
CMU, SEI since 2015
• DevOps, DevSecOps community organizer,
frequent speaker
• PC member in various research conferences
• Editorial board member, IJSS, AJSE
• Vice Chair of IEEE 2675 DevOps study group
Agenda
• Metrics, Logs, Reports → Data
• Why? Who?
• DevOps Metrics
• Monitoring
• Architecture
• Metrics to Dashboard
• Actions?
• Takeaway
Metrics, Logs, Reports → Data
What is…
• Logs: a document used to record and describe selected items
identified during execution of a process or activity (PMBOK)
• Metrics:
• (1) quantitative measure of the degree to which a system, component, or
process possesses a given attribute (IEEE 24765)
• (2) defined measurement method and the measurement scale (ISO 14102)
• Reports: information item that describes the results of activities
such as investigations, observations, assessments, or tests (IEEE
15289)
Logs vs metrics
While logs are about a specific event, metrics are a
measurement at a point in time for the system.
Figure: Capability Completion (work remaining per sprint)

                Sprint 1  Sprint 2  Sprint 3  Sprint 4  Sprint 5
Capability 1         500       400       480       450       375
Capability 2         200       100        25         0         0
Capability 3        2000      1950      1950      1900      1900
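As a rough illustration of turning the Capability Completion numbers into a metric, the sketch below computes how much of each capability's Sprint 1 workload has been burned down by Sprint 5, and flags sprints where remaining work grew. Treating the Sprint 1 value as the baseline is an assumption made for this example, and the function names are hypothetical.

```python
# Work remaining per sprint, copied from the Capability Completion figure.
work_remaining = {
    "Capability 1": [500, 400, 480, 450, 375],
    "Capability 2": [200, 100, 25, 0, 0],
    "Capability 3": [2000, 1950, 1950, 1900, 1900],
}


def percent_complete(series):
    """Share of the Sprint 1 workload burned down by the last sprint (assumed baseline)."""
    start, latest = series[0], series[-1]
    return 100.0 * (start - latest) / start


def sprints_with_growth(series):
    """1-based sprint numbers in which remaining work went up instead of down."""
    return [i + 1 for i in range(1, len(series)) if series[i] > series[i - 1]]


summary = {
    name: (percent_complete(series), sprints_with_growth(series))
    for name, series in work_remaining.items()
}

for name, (done, growth) in summary.items():
    print(f"{name}: {done:.0f}% complete, growth in sprints {growth}")
```

A single point-in-time number hides the Sprint 3 increase for Capability 1; keeping the whole series makes that visible.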
All are in….
• # cases, estimates, task status, etc.
• Status report, # of messages
• Build, integration, test results, status report
• Version, logs, audit
• Aggregated data visualization
• Secure coding, logs, audit
• SLOC, SAST
DevOps metrics
• Without metrics there is no way to know whether your processes are
improving, or to answer:
• Is the service delivering value to the users?
• Is the service operating properly?
• Are we achieving business goals?
• Is the service secure?
• Is the infrastructure adequate?
• Is the service being attacked?
• Can future needs be supported?
• Are we able to plan a new product? If so, how much?
• Are we compliant?
• Inability to break down the cost of delivering software (fast or delayed)
• Inability to trace the cost of defects to business impact
• No clarity on the benefits of future investments
• Inability to track the cost of developing new features or fixing defects
*State of Software Delivery Management Report 2020
Why?
“when you can measure what you are speaking about, and
can express it in numbers, you know something about it: but
when you cannot measure it, when you cannot express it in
numbers, your knowledge is of meagre and unsatisfactory
kind: it may be the beginning of knowledge, but you have
scarcely in your thoughts advanced to the stage of science”
Lord Kelvin, a physicist
What kind of data?
The SEI’s Goal-Question-Metric-Indicator-Method provides a
structured approach for moving from objectives to practical measurement
[Figure: Goal-Question-Metric-Indicator-Method overview. The goal and its success criteria feed success indicators; the strategy to accomplish the goal feeds analysis indicators (impact); the tasks to accomplish the goal (Task 1 … Task n) feed progress indicators, such as test cases complete, planned versus actual, plotted over reporting periods. Indicators are reported to the project manager and rolled up for higher management.]
Characteristics of good metrics
• Relevant : It must be related to a business goal.
• Observable : A metric that can't be measured is useless.
• Actionable : It should suggest the need for corrective actions or improvements to
workflows, policies, incentives, tools, etc.
• Traceable : It should be possible to causally trace the metric to root causes.
• Reliable : It should produce similar results under similar conditions and resist
manipulation.
• Automatable : Metrics collection should be built into the system to avoid manual
work, errors, and delay.
• Auditable : It should be free from the influence of any team or person.
• Collectible : It should be rationalized from examining related metrics.
DevOps metrics pyramid
Guidelines to select..
• Avoid relying on a single measure.
• Look for trends, outliers, and level shifts rather than averages.
• Trends indicate an imbalance.
• Outliers indicate process misbehavior.
• Shifts indicate an underlying change.
• Outcome measures directly affect the customer or the business,
while process measures help you understand and improve.
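The guideline about trends, outliers, and level shifts can be sketched with a few standard-library helpers. This is a minimal illustration, not a recommended detector: the least-squares slope, the median/MAD modified z-score with a 3.5 cutoff, and the best-split comparison of means are techniques chosen for this example, and the sample values are made up.

```python
from statistics import mean, median


def trend_slope(values):
    """Least-squares slope per observation; a sustained non-zero slope suggests a trend."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(values))
    den = sum((x - x_bar) ** 2 for x in range(n))
    return num / den


def outliers(values, cutoff=3.5):
    """Indices whose modified z-score (based on median and MAD) exceeds the cutoff."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []
    return [i for i, v in enumerate(values) if 0.6745 * abs(v - med) / mad > cutoff]


def level_shift(values, min_segment=3):
    """Split index with the largest jump between the mean before and the mean after."""
    best_index, best_jump = None, 0.0
    for i in range(min_segment, len(values) - min_segment + 1):
        jump = abs(mean(values[i:]) - mean(values[:i]))
        if jump > best_jump:
            best_index, best_jump = i, jump
    return best_index, best_jump


# Hypothetical lead times in hours, one value per deployment.
with_spike = [10, 11, 10, 12, 11, 10, 48, 11, 10, 12]
with_shift = [10, 11, 10, 11, 10, 20, 21, 20, 21, 20]

spike_indices = outliers(with_spike)
shift_index, shift_size = level_shift(with_shift)
```

The average of `with_spike` looks only mildly elevated, while the outlier check points at the single deployment that misbehaved; the average of `with_shift` hides the fact that the process changed halfway through.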
When to measure?
• The pipeline includes key transitions and events
• Bug Report Submitted
• Change Request Submitted
• Code Commit
• Build progress
• Test results
• Deployment activities
• Operating Failure or Recovery
• Continuous monitoring for assurance
• Application usage and latency (processors, memory)
• Network traffic (volume and source)
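One way to picture measuring at these transitions is to record each one as a structured event (a log entry) and derive counts and rates from the log (metrics). The sketch below assumes a hypothetical `PipelineEvent` record with made-up field names and sample data; a real pipeline would take these events from its own tools.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PipelineEvent:
    """One logged transition in the pipeline (field names are hypothetical)."""
    kind: str        # e.g. "code_commit", "build", "test", "deployment"
    outcome: str     # "ok" or "failed"
    at: datetime     # when the event was recorded


def count_by_kind(events):
    """Metric: number of events of each kind in the window."""
    return Counter(event.kind for event in events)


def failure_rate(events, kind):
    """Metric: share of events of one kind that failed, or None if there were none."""
    relevant = [event for event in events if event.kind == kind]
    if not relevant:
        return None
    return sum(event.outcome == "failed" for event in relevant) / len(relevant)


# Made-up sample log for one day.
day = datetime(2020, 11, 12)
events = [
    PipelineEvent("code_commit", "ok", day.replace(hour=9)),
    PipelineEvent("build", "ok", day.replace(hour=9, minute=5)),
    PipelineEvent("code_commit", "ok", day.replace(hour=11)),
    PipelineEvent("build", "failed", day.replace(hour=11, minute=5)),
    PipelineEvent("code_commit", "ok", day.replace(hour=13)),
    PipelineEvent("build", "ok", day.replace(hour=13, minute=5)),
    PipelineEvent("build", "ok", day.replace(hour=15)),
    PipelineEvent("deployment", "ok", day.replace(hour=16)),
]

counts = count_by_kind(events)
build_failure_rate = failure_rate(events, "build")
```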
Typical measurement and metric categories
• Productivity (e.g., deployment frequency, lead times)
• Reliability (e.g., MTTR, MTTD, time to failure)
• Quality (e.g., failed deployments, # of tickets, rework)
• Security (e.g., change requests, time to approval, # of incidents)
• Operations (e.g., resource usage, attack frequency)
• The measurements are typically used to:
• Identify problems or change to a baseline
• Compare actual performance to a desired level
• Evaluate the effect of a change
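A few of the measures named in the categories above (deployment frequency, lead time, failed deployments, MTTR) can be computed from simple records. The sketch below is only one possible set of definitions: the record layout, the use of commit-to-deploy time as lead time, the median as the summary, and the sample data are all assumptions for illustration.

```python
from datetime import datetime
from statistics import mean, median


def hours_between(start, end):
    """Elapsed time between two datetimes, in hours."""
    return (end - start).total_seconds() / 3600


def deployment_frequency(deployments, period_days):
    """Productivity: deployments per day over the observed period."""
    return len(deployments) / period_days


def median_lead_time_hours(deployments):
    """Productivity: median hours from commit to deployment."""
    return median(hours_between(d["commit_at"], d["deployed_at"]) for d in deployments)


def failed_deployment_rate(deployments):
    """Quality: share of deployments marked as failed."""
    return sum(d["failed"] for d in deployments) / len(deployments)


def mttr_hours(incidents):
    """Reliability: mean hours from detection to restoration."""
    return mean(hours_between(detected, restored) for detected, restored in incidents)


# Made-up records for one week.
deployments = [
    {"commit_at": datetime(2020, 11, 2, 9), "deployed_at": datetime(2020, 11, 2, 15), "failed": False},
    {"commit_at": datetime(2020, 11, 3, 10), "deployed_at": datetime(2020, 11, 4, 10), "failed": True},
    {"commit_at": datetime(2020, 11, 5, 8), "deployed_at": datetime(2020, 11, 5, 20), "failed": False},
    {"commit_at": datetime(2020, 11, 6, 9), "deployed_at": datetime(2020, 11, 6, 11), "failed": False},
]
incidents = [
    (datetime(2020, 11, 4, 10, 30), datetime(2020, 11, 4, 12, 30)),
    (datetime(2020, 11, 6, 13, 0), datetime(2020, 11, 6, 14, 0)),
]

weekly = {
    "deployments_per_day": deployment_frequency(deployments, period_days=7),
    "median_lead_time_h": median_lead_time_hours(deployments),
    "failed_deployment_rate": failed_deployment_rate(deployments),
    "mttr_h": mttr_hours(incidents),
}
```

Comparing `weekly` against a stored baseline or a target value covers the three uses listed above: spotting a change, checking against a desired level, and evaluating the effect of a change.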
Monitoring
End to end visibility
• Delivering any capability is a data-driven activity
• All stakeholders should be able to access any
dashboard
Monitoring
• Monitoring: determining the status of a system, a process, or an activity (ISO/IEC)
• Monitoring is the collection, interpretation, and action on information gathered
from a system, from its design and implementation to running in production. It
translates metrics into a measurable user experience.
1. Infrastructure, pipeline, or system: information is generated
2. Monitoring system collects the generated information
3. Monitoring system displays information or alerts based on an observed threshold
4. Designated person (or possibly the monitoring system) takes action
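The flow on this slide (information is generated, collected, displayed or alerted on against a threshold, then acted on) can be sketched as one monitoring cycle. The metric names, threshold values, and the `take_action` callback below are hypothetical placeholders, not part of any particular monitoring product.

```python
def evaluate(readings, thresholds):
    """Return (metric, value, limit) for every reading above its threshold."""
    alerts = []
    for metric, value in readings.items():
        limit = thresholds.get(metric)
        if limit is not None and value > limit:
            alerts.append((metric, value, limit))
    return alerts


def run_cycle(generate, thresholds, take_action):
    """One pass through the flow: generate and collect, alert on thresholds, act."""
    readings = generate()                      # system generates, monitor collects
    alerts = evaluate(readings, thresholds)    # monitor compares with thresholds
    for alert in alerts:
        take_action(alert)                     # designated person or system acts
    return alerts


# Hypothetical stand-ins for a real system and a real responder.
def sample_system():
    return {"cpu_percent": 93.0, "memory_percent": 61.0}


handled = []
thresholds = {"cpu_percent": 85.0, "memory_percent": 90.0}
raised = run_cycle(sample_system, thresholds, handled.append)
```

Keeping `take_action` as a parameter reflects the last step of the flow: the responder may be a person who is paged or an automated remediation step.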
Types of Monitoring
Area                     Reasons
Development              • Improve DevOps processes
Usability (UX)           • Measure user reactions to various aspects of the system
Performance/Application  • Identify failures
                         • Identify performance problems
                         • Characterize workload for short- and long-term capacity
                           planning, as well as for charging purposes
Security                 • Detect intruders
                         • Identify data breaches
                         • Identify vulnerabilities
Business Process         • System management
                         • Lifecycle management
                         • KPI, business value
Functional Monitoring    • Monitoring of each use case or set of use cases
                         • Various deployment cases
Map metrics to monitor
Phase        Issues
Development  • Build failures
             • Testing failures
             • Issue monitoring
Operations   • Product monitoring / system expansion forecasting
             • Outage monitoring
             • Resource usage, attack frequency
Security     • Vulnerability monitoring
             • Predicting future anomalies
             • Change requests, time to approval, # of incidents
Monitoring Implementation
When implementing monitoring in a system, one must add:
• Instrumentation: Code or routine added to main application code, to retrieve relevant information
about a certain feature of the system.
• Telemetry: An automated communications process by which measurements and other data are
collected and transmitted to receiving equipment for monitoring.
• Storage: Any mechanism to store the captured information, and pass it on to displaying or
alerting capabilities.
• Displaying: A mechanism to display the collected information, usually through visual metaphors
that allow fast and intuitive understanding of what is happening to the system.
• Alerting: Any mechanism to indicate that something is not normal and needs attention.
Example tools:
COTS:   Elastic ELK Stack (storing and displaying)
Niche:  VirtualDevManager* (storing and displaying)
Custom: D3 Visualizations (displaying)
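The five pieces named on this slide can be illustrated in a few lines. This is a toy, in-process sketch under stated assumptions: the `instrumented` decorator, `emit`, `STORE`, `display`, and `check_alerts` are hypothetical names, storage is a plain dictionary, and telemetry is a function call rather than a network transfer to a tool such as those listed above.

```python
import time
from collections import defaultdict
from functools import wraps

STORE = defaultdict(list)   # storage: metric name -> list of recorded samples


def emit(name, value):
    """Telemetry: hand a measurement to storage (a real system would transmit it)."""
    STORE[name].append(value)


def instrumented(name):
    """Instrumentation: wrap a function so every call records its duration."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                emit(f"{name}.duration_s", time.perf_counter() - start)
        return wrapper
    return decorator


def display(store):
    """Displaying: one summary line per metric (a dashboard would chart this)."""
    return [f"{name}: n={len(samples)} max={max(samples):.6f}"
            for name, samples in sorted(store.items())]


def check_alerts(store, limits):
    """Alerting: names of metrics whose latest sample exceeds its limit."""
    return [name for name, limit in limits.items()
            if store.get(name) and store[name][-1] > limit]


@instrumented("handle_request")
def handle_request(payload):
    """Hypothetical application code being monitored."""
    return payload * 2


results = [handle_request(n) for n in range(3)]
emit("queue_depth", 120)    # a made-up gauge reported by some other component
summary_lines = display(STORE)
alerting = check_alerts(STORE, {"queue_depth": 100})
```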
Monitoring System Architecture
*Partially based on “DevOps: A Software Architect's Perspective” (SEI Series in Software Engineering), Len Bass, Ingo Weber, Liming Zhu
Design Dashboard
• Valid Information
What’s the source of the data?
Has the process for collecting or reporting changed?
What’s changing? What’s different?
• Actionable Choices
What’s the significance of the information?
Does it represent a risk or problem?
What decision is required?
What actions are appropriate?
• Follow-Through
What will you monitor after the action? How often?
Can you get the monitoring data you need?
Who is responsible?
Different roles need different information
Dashboards can hold a large amount of information
and are good at displaying outliers from expected
behavior.
Acquisition, product development, and programs make many
assumptions.
Takeaway
• Map metrics in each DevOps critical Success
• Monitor the Impact of Measurement Systems
• Create a Learning Environment for Process Improvement
• Avoid vanity metrics that promote quantity or speed over quality
• Avoid conflict metrics that promote individuals rather than teams
SEI team GitHub Projects
• One-Click DevOps deployment
https://github.com/DSOI-ALL/devops-microcosm
• Sample app with DevOps Process
https://github.com/DSOI-ALL/flask_api_sample
• Tagged checkpoints
• v0.1.0: base Flask project
• v0.2.0: Vagrant development configuration
• v0.3.0: Test environment and Fabric deployment
• v0.4.0: Upstart services, external configuration files
• v0.5.0: Production environment
• On YouTube:
https://www.youtube.com/watch?v=5nQlJ-FWA5A
For more information…
DevOps: https://www.sei.cmu.edu/go/devops
DevOps Blog: https://insights.sei.cmu.edu/devops
Webinar: https://www.sei.cmu.edu/publications/webinars/index.cfm
Podcast: https://www.sei.cmu.edu/publications/podcasts/index.cfm
Thank you
Hasan Yasar
Technical Director, Adjunct Faculty Member
Continuous Deployment of Capability,
Software Engineering Institute | Carnegie Mellon University
hyasar@cmu.edu
THANK YOU TO OUR SPONSORS
It is question-and-answer time!
Please join the Slack channel #2020-ci-cd