Observability, Distributed Tracing,
and Open Source
The Missing Primer
2
https://laprensasa.com/culture/art-music/mozart-festival-texas-returns-uiw/
3
4
5
• Daniel Khan
daniel.khan@dynatrace.com
@dkhan
• Dir. Technology Strategy @Dynatrace
• Everything Open Source Monitoring &
standards & our contributions to it
• Chair of W3C Trace Context
About me
6
Why I am doing this talk
Distributed
Tracing
Observability
W3C Trace
Context
OpenCensus
OpenTracing
OpenTelemetry
Metrics
Span
Trace
7
Application
In the Beginning there was the Monolith
Presentation
Business Logic
Data Access
Database
Services
Presentation
API Gateway
Auth Inventory Cart Account
Offers Shipping Checkout Status
Wire
8
Development in a Microservices World
Cart
Dev
Preproduction
Cart Auth Inventory Account
Offers Shopping Checkout Status
Push
Cart
• Latency
• Response Time
• Error Rate
• Number of queries
KPIs
9
Metrics
Source: https://techblog.commercetools.com/adding-consistency-and-automation-to-grafana-e99eb374fe40
… contain time-correlated data points
• Counter
Monotonically increasing values
Think: Odometer
• Gauge
Increasing and decreasing values
Think: Tachometer
• Histogram
Groups values into buckets
Think: Knock events 0-50mph, 51-100mph, …
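The three metric types above can be sketched in a few lines of Python. This is a minimal illustration only, not the API of any particular metrics library; the class names, the `requests_total`, `current_speed`, and `knock_events` variables, and the bucket bounds are assumptions chosen to match the slide's analogies.

```python
from bisect import bisect_left


class Counter:
    """Monotonically increasing value (the odometer)."""

    def __init__(self):
        self.value = 0

    def inc(self, amount=1):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Value that can go up and down over time."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


class Histogram:
    """Counts observations per bucket; bounds are inclusive upper limits."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        # One extra bucket catches everything above the last bound.
        self.counts = [0] * (len(self.upper_bounds) + 1)

    def observe(self, value):
        self.counts[bisect_left(self.upper_bounds, value)] += 1


requests_total = Counter()
current_speed = Gauge()
knock_events = Histogram([50, 100])  # buckets: 0-50, 51-100, above 100

for speed in (30, 50, 51, 120):
    requests_total.inc()
    current_speed.set(speed)
    knock_events.observe(speed)
```

After the loop, the counter holds the number of observations, the gauge holds only the latest speed, and the histogram holds one count per bucket.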
10
Collecting and Charting Metrics
11
Error
242
Success
1302
Cart Service
12
Complexity has moved to the Network Layer
Client API GW Service
Service
Service
Service
Service
Service Cart
Which requests lead to an error in our cart service?
Trace
a42b a42b
a42b
a42b
a42b
a42b
a42b
a42b
a42b = Trace Context
13
A Trace is a Tree of Spans
Trace
Span
Span
Span
Click GW API
A span represents a single operation and contains metadata such as the HTTP method, a database query, or an error code
JDBC
Span: callDB()
Span: JDBC call
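The "tree of spans" idea can be sketched as a small data structure. This is an illustration under stated assumptions, not the data model of any specific tracing system: the `Span` class, its field names, the attribute keys, and the `render` helper are all hypothetical.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    trace_id: str                      # shared by every span in the trace
    name: str                          # the operation this span represents
    parent_id: Optional[str] = None    # None marks the root span
    attributes: dict = field(default_factory=dict)
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))

    def child(self, name, attributes=None):
        """Create a span for a sub-operation within the same trace."""
        return Span(self.trace_id, name, self.span_id, attributes or {})


def render(spans):
    """Return the spans of one trace as indented lines, parents first."""
    children = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children.get(parent_id, []):
            lines.append("  " * depth + span.name)
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


root = Span(trace_id="a42b", name="Click")
gateway = root.child("GW", {"http.method": "GET"})
api = gateway.child("API")
call_db = api.child("callDB()")
jdbc = call_db.child("JDBC call", {"db.statement": "SELECT 1"})

tree = render([root, gateway, api, call_db, jdbc])
```

Every span carries the same trace ID and points to its parent, which is all a backend needs to rebuild the tree from spans that arrive independently.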
14
Trace Context Propagation
Carts a42b a42b
Extract
Inject
In process propagation
Auto Instrumentation
• Zipkin
• Sleuth
• OpenTelemetry
• Commercial
• …
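Extract, in-process propagation, and inject can be sketched with the Python standard library alone. This is a simplified illustration of what auto-instrumentation does around a request handler; the header name `x-trace-id` and all function names are assumptions made for this sketch, not the behaviour of Zipkin, Sleuth, or OpenTelemetry.

```python
import contextvars
import secrets

# Hypothetical header name, used only for this sketch.
TRACE_HEADER = "x-trace-id"

# In-process propagation: the current trace ID follows the request
# through function calls without being passed as an argument.
_current_trace_id = contextvars.ContextVar("trace_id", default=None)


def extract(incoming_headers):
    """Read the trace ID from an incoming request, or start a new trace."""
    trace_id = incoming_headers.get(TRACE_HEADER) or secrets.token_hex(2)
    _current_trace_id.set(trace_id)
    return trace_id


def inject(outgoing_headers):
    """Copy the current trace ID onto an outgoing request."""
    trace_id = _current_trace_id.get()
    if trace_id is not None:
        outgoing_headers[TRACE_HEADER] = trace_id
    return outgoing_headers


def handle_cart_request(incoming_headers):
    """Stands in for the wrapper that instrumentation puts around a handler."""
    extract(incoming_headers)
    # ... business logic runs here, unaware of the trace ID ...
    return inject({"accept": "application/json"})


downstream_headers = handle_cart_request({TRACE_HEADER: "a42b"})
```

The handler never touches the trace ID itself; the ID enters on the incoming headers and leaves on the outgoing ones.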
15
Trace Context Header Formats
• Proprietary
• B3 Header (Zipkin)
• W3C Trace Context
What’s the header name and what does it contain?
16
W3C Trace Context
Service A API GW Service B
Trace
Service C
OpenTelemetry AWS Zipkin OpenTelemetry
Goal: All monitoring systems and middlewares agree on one format for trace context propagation
Span
Span
Span
Span
17
W3C Trace Context Format
traceparent: 00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01
tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE
traceparent fields: Version - TraceID - ParentID - Flags
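The header layout above can be checked with a short parser. This is a minimal sketch that handles only version `00` of the `traceparent` format and splits `tracestate` naively on commas; the function and class names are assumptions for illustration and do not come from any library.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
TRACEPARENT_RE = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed fields, or None if the header is not valid."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None or match["version"] == "ff":
        return None
    # All-zero IDs are reserved as invalid values.
    if set(match["trace_id"]) == {"0"} or set(match["parent_id"]) == {"0"}:
        return None
    sampled = bool(int(match["flags"], 16) & 0x01)  # lowest bit: sampled
    return TraceParent(match["version"], match["trace_id"],
                       match["parent_id"], sampled)


def parse_tracestate(header: str) -> dict:
    """Split vendor entries into key/value pairs, keeping their order."""
    entries = (item.strip() for item in header.split(","))
    return dict(item.split("=", 1) for item in entries if item)


parent = parse_traceparent(
    "00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01"
)
state = parse_tracestate("rojo=00f067aa0ba902b7,congo=t61rcWkgMzE")
```

Returning `None` for a malformed header lets the receiving service start a fresh trace rather than propagate a broken one.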
18
Data collection
So far we have only instrumented the code to propagate context, but no data has been collected
Trace
Span
Span
Span
Agent Agent Agent
Click GW API
TraceContext TraceContext
Monitoring System Storage
19
Data Collection & Presentation Systems
Solution         Agents  Instrumentation  Storage  Presentation
Zipkin / Sleuth  +       +                +        +
Jaeger           -       -                +        +
OpenTelemetry    +       +                -        -
Commercial       +       +                +        +
20
Zipkin
21
Jaeger
22
Commercial
23
Entity Model Based Service Flow
24
Detecting Errors
25
Solving our Cart Problem
Client API GW Service
Service
Service
Service
Service
Service Cart
Trace
Client
Service
Currency
Cart
API GW
GET: Currency=EURO
26
What we did to Solve the Problem
1. We used metrics to learn about a problem
2. We used distributed tracing to pass along a unique ID per trace
• For that, we used auto-instrumentation to extract and inject the trace ID
3. We used a monitoring system and its agents to collect traces, and we could filter for transactions that produced an error
4. We looked into the metadata of such a transaction to identify how it differs from succeeding ones
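Steps 3 and 4 can be sketched as a small comparison over collected traces. The records below are invented sample data in a hypothetical shape, not output from any real monitoring system; the function name is likewise an assumption.

```python
from collections import Counter

# Invented sample data: one record per collected trace.
traces = [
    {"trace_id": "a42b", "error": True,
     "attributes": {"method": "GET", "currency": "EURO"}},
    {"trace_id": "c17d", "error": False,
     "attributes": {"method": "GET", "currency": "USD"}},
    {"trace_id": "e90f", "error": True,
     "attributes": {"method": "GET", "currency": "EURO"}},
    {"trace_id": "b33a", "error": False,
     "attributes": {"method": "GET", "currency": "USD"}},
]


def distinguishing_attributes(traces):
    """Return attribute pairs seen in failing traces but never in succeeding ones."""
    failed = Counter()
    succeeded = set()
    for trace in traces:
        pairs = trace["attributes"].items()
        if trace["error"]:
            failed.update(pairs)
        else:
            succeeded.update(pairs)
    return {pair: count for pair, count in failed.items()
            if pair not in succeeded}


suspects = distinguishing_attributes(traces)
```

With this sample data the only attribute unique to the failing traces is the currency, which mirrors the `GET: Currency=EURO` finding on the previous slide.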
27
You’ve mentioned OpenTelemetry …
+ =
In 2019, OpenCensus and OpenTracing merged into OpenTelemetry
Metrics, Traces, Logs
28
APIs SDKs Exporters Collector
29
30
31
32
33
OpenTelemetry – Developer use cases
• Cloud native microservices architectures are hard
to trace and debug during development
• In development, OpenTelemetry can be used to
either
• manually create spans to trace certain
execution paths
• or use the provided auto-instrumentation to make
a system observable
• As backend and UI, Jaeger is the most popular
tool. It’s open source and solely displays traces
34
OpenTelemetry – in Production
• Provides just a fraction of what modern tools provide
• Traces
• Metrics
• Logs
• Topology
• Behavior
• Code level visibility
• Metadata
• Manual instrumentation code needs to be kept up-to-date
• A backend needs to be maintained
• No support model if instrumentation breaks production code
• No enterprise features (access control, throttling, scaling, …)
35
Why do Vendors Care then?
36
OpenTelemetry Company Contribution Stats
Google
Microsoft
Dynatrace
37
38
What happens when we add support for a new framework?
• Today, our engineers reverse engineer frameworks to add
instrumentation support to them
• Every time an update is released, the instrumentation code is
tested.
• In case of issues, it goes back to the development team, which
needs to fix it and deploy an update.
• The whole process is automated and transparent to the customer ☺
• This is costly and time consuming
39
In-process tracing
Click GW API
MonitoringSystem
Trace
Span
Span
Span
HZQ
Span: doHZQ()
Span: HZQ call
OTEL HZQ
Wrapper
40
“We want every platform and library to
be pre-instrumented with
OpenTelemetry and we’re committed to
making this as easy as possible.”
Sergey Kanzhelev (Google)
41
What is Observability and how does it differ from Monitoring?
1. In control theory, observability is a measure of how well internal states of a system can be inferred from
knowledge of its external outputs.
Source: Wikipedia
2. In software development, observability is achieved by adding code (instrumentation) that emits telemetry
data.
3. Monitoring is the act of displaying and analyzing this telemetry data.
4. Monitoring alone can tell you that there is a problem.
E.g. ”We see that some users experience a 50% higher response time on check-out”
5. Observability helps to find the root cause (the why) by providing data that can be correlated and analysed
freely, even if the problem is completely new to you (unknown unknowns)
E.g. “The response time of the checkout grows quadratically with the number of items in the basket,
because a misplaced for loop executes the same database query once per item, for every
item in the basket”
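The misplaced loop in the example above issues one query per item for every item in the basket, so the number of queries grows with the square of the basket size. The sketch below only counts queries instead of running them; both function names are hypothetical.

```python
def checkout_queries_misplaced_loop(basket):
    """Count queries when the lookup sits inside a nested loop."""
    queries = 0
    for _item in basket:
        for _other in basket:   # misplaced loop: repeats the lookup per item
            queries += 1        # stands in for one database query
    return queries


def checkout_queries_fixed(basket):
    """Count queries when each item is looked up exactly once."""
    queries = 0
    for _item in basket:
        queries += 1            # stands in for one database query
    return queries


small_basket = list(range(10))
large_basket = list(range(100))

slow_small = checkout_queries_misplaced_loop(small_basket)
slow_large = checkout_queries_misplaced_loop(large_basket)
fast_large = checkout_queries_fixed(large_basket)
```

A basket ten times larger produces a hundred times as many queries in the nested version, a pattern that shows up in a trace as a long run of identical database spans.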
42
Putting it all Together
• Metrics can help you to learn that there is a problem
• Distributed tracing becomes increasingly important to understand multi-tier execution paths and root causes
of problems
• Developers now rely on metrics and traces to understand how their service functions in their microservice
architectures
• Pure Open Source solutions are viable for pre-prod environments
• Standardization is the only way to tackle today’s complexity and Open Source is the key driver
• Vendors are prepared to tap into data collected by Open Source standard tools to add enterprise features on
top to support web-scale workloads
43
dynatrace.com
@dkhan
daniel.khan@dynatrace.com
Thank you!

More Related Content

What's hot (20)

Observability
ObservabilityObservability
Observability
Martin Gross
 
Opentelemetry - From frontend to backend
Opentelemetry - From frontend to backendOpentelemetry - From frontend to backend
Opentelemetry - From frontend to backend
Sebastian Poxhofer
 
OpenTelemetry Introduction
OpenTelemetry Introduction OpenTelemetry Introduction
OpenTelemetry Introduction
DimitrisFinas1
 
Everything You wanted to Know About Distributed Tracing
Everything You wanted to Know About Distributed TracingEverything You wanted to Know About Distributed Tracing
Everything You wanted to Know About Distributed Tracing
Amuhinda Hungai
 
Distributed Tracing for Kafka with OpenTelemetry with Daniel Kim | Kafka Summ...
Distributed Tracing for Kafka with OpenTelemetry with Daniel Kim | Kafka Summ...Distributed Tracing for Kafka with OpenTelemetry with Daniel Kim | Kafka Summ...
Distributed Tracing for Kafka with OpenTelemetry with Daniel Kim | Kafka Summ...
HostedbyConfluent
 
Cloud-Native Observability
Cloud-Native ObservabilityCloud-Native Observability
Cloud-Native Observability
Tyler Treat
 
Observability
ObservabilityObservability
Observability
Diego Pacheco
 
Grafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for LogsGrafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for Logs
Marco Pracucci
 
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdfOSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
NETWAYS
 
Distributed tracing using open tracing & jaeger 2
Distributed tracing using open tracing & jaeger 2Distributed tracing using open tracing & jaeger 2
Distributed tracing using open tracing & jaeger 2
Chandresh Pancholi
 
Observability at Scale
Observability at Scale Observability at Scale
Observability at Scale
Knoldus Inc.
 
Kubernetes: A Short Introduction (2019)
Kubernetes: A Short Introduction (2019)Kubernetes: A Short Introduction (2019)
Kubernetes: A Short Introduction (2019)
Megan O'Keefe
 
Intro to open source observability with grafana, prometheus, loki, and tempo(...
Intro to open source observability with grafana, prometheus, loki, and tempo(...Intro to open source observability with grafana, prometheus, loki, and tempo(...
Intro to open source observability with grafana, prometheus, loki, and tempo(...
LibbySchulze
 
Distributed tracing 101
Distributed tracing 101Distributed tracing 101
Distributed tracing 101
Itiel Shwartz
 
Api observability
Api observability Api observability
Api observability
Red Hat
 
Observability & Datadog
Observability & DatadogObservability & Datadog
Observability & Datadog
JamesAnderson599331
 
Service Mesh - Observability
Service Mesh - ObservabilityService Mesh - Observability
Service Mesh - Observability
Araf Karsh Hamid
 
Improve monitoring and observability for kubernetes with oss tools
Improve monitoring and observability for kubernetes with oss toolsImprove monitoring and observability for kubernetes with oss tools
Improve monitoring and observability for kubernetes with oss tools
Nilesh Gule
 
Distributed Tracing in Practice
Distributed Tracing in PracticeDistributed Tracing in Practice
Distributed Tracing in Practice
DevOps.com
 
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
Splunk
 
Opentelemetry - From frontend to backend
Opentelemetry - From frontend to backendOpentelemetry - From frontend to backend
Opentelemetry - From frontend to backend
Sebastian Poxhofer
 
OpenTelemetry Introduction
OpenTelemetry Introduction OpenTelemetry Introduction
OpenTelemetry Introduction
DimitrisFinas1
 
Everything You wanted to Know About Distributed Tracing
Everything You wanted to Know About Distributed TracingEverything You wanted to Know About Distributed Tracing
Everything You wanted to Know About Distributed Tracing
Amuhinda Hungai
 
Distributed Tracing for Kafka with OpenTelemetry with Daniel Kim | Kafka Summ...
Distributed Tracing for Kafka with OpenTelemetry with Daniel Kim | Kafka Summ...Distributed Tracing for Kafka with OpenTelemetry with Daniel Kim | Kafka Summ...
Distributed Tracing for Kafka with OpenTelemetry with Daniel Kim | Kafka Summ...
HostedbyConfluent
 
Cloud-Native Observability
Cloud-Native ObservabilityCloud-Native Observability
Cloud-Native Observability
Tyler Treat
 
Grafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for LogsGrafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for Logs
Marco Pracucci
 
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdfOSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
NETWAYS
 
Distributed tracing using open tracing & jaeger 2
Distributed tracing using open tracing & jaeger 2Distributed tracing using open tracing & jaeger 2
Distributed tracing using open tracing & jaeger 2
Chandresh Pancholi
 
Observability at Scale
Observability at Scale Observability at Scale
Observability at Scale
Knoldus Inc.
 
Kubernetes: A Short Introduction (2019)
Kubernetes: A Short Introduction (2019)Kubernetes: A Short Introduction (2019)
Kubernetes: A Short Introduction (2019)
Megan O'Keefe
 
Intro to open source observability with grafana, prometheus, loki, and tempo(...
Intro to open source observability with grafana, prometheus, loki, and tempo(...Intro to open source observability with grafana, prometheus, loki, and tempo(...
Intro to open source observability with grafana, prometheus, loki, and tempo(...
LibbySchulze
 
Distributed tracing 101
Distributed tracing 101Distributed tracing 101
Distributed tracing 101
Itiel Shwartz
 
Api observability
Api observability Api observability
Api observability
Red Hat
 
Service Mesh - Observability
Service Mesh - ObservabilityService Mesh - Observability
Service Mesh - Observability
Araf Karsh Hamid
 
Improve monitoring and observability for kubernetes with oss tools
Improve monitoring and observability for kubernetes with oss toolsImprove monitoring and observability for kubernetes with oss tools
Improve monitoring and observability for kubernetes with oss tools
Nilesh Gule
 
Distributed Tracing in Practice
Distributed Tracing in PracticeDistributed Tracing in Practice
Distributed Tracing in Practice
DevOps.com
 
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou...
Splunk
 

Similar to Observability, Distributed Tracing, and Open Source: The Missing Primer (20)

"Distributed Tracing: New DevOps Foundation" by Jayesh Ahire
"Distributed Tracing: New DevOps Foundation" by Jayesh Ahire  "Distributed Tracing: New DevOps Foundation" by Jayesh Ahire
"Distributed Tracing: New DevOps Foundation" by Jayesh Ahire
CodeOps Technologies LLP
 
Distributed Tracing: New DevOps Foundation
Distributed Tracing: New DevOps FoundationDistributed Tracing: New DevOps Foundation
Distributed Tracing: New DevOps Foundation
CodeOps Technologies LLP
 
Observability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptxObservability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptx
OpsTree solutions
 
Manage Microservices Chaos and Complexity with Observability
Manage Microservices Chaos and Complexity with ObservabilityManage Microservices Chaos and Complexity with Observability
Manage Microservices Chaos and Complexity with Observability
NGINX, Inc.
 
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
AgileNetwork
 
DockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability WorkshopDockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability Workshop
Kevin Crawley
 
What Is OpenTelemetry? A Complete Introduction
What Is OpenTelemetry? A Complete IntroductionWhat Is OpenTelemetry? A Complete Introduction
What Is OpenTelemetry? A Complete Introduction
Ciente
 
stackconf 2023 | Practical introduction to OpenTelemetry tracing by Nicolas F...
stackconf 2023 | Practical introduction to OpenTelemetry tracing by Nicolas F...stackconf 2023 | Practical introduction to OpenTelemetry tracing by Nicolas F...
stackconf 2023 | Practical introduction to OpenTelemetry tracing by Nicolas F...
NETWAYS
 
Beginner's Guide to Observability@Devoxx PL 2024
Beginner's  Guide to Observability@Devoxx PL 2024Beginner's  Guide to Observability@Devoxx PL 2024
Beginner's Guide to Observability@Devoxx PL 2024
michniczscribd
 
stackconf 2022: Open Source for Better Observability
stackconf 2022: Open Source for Better Observabilitystackconf 2022: Open Source for Better Observability
stackconf 2022: Open Source for Better Observability
NETWAYS
 
Meetup OpenTelemetry Intro
Meetup OpenTelemetry IntroMeetup OpenTelemetry Intro
Meetup OpenTelemetry Intro
DimitrisFinas1
 
ThroughTheLookingGlass_EffectiveObservability.pptx
ThroughTheLookingGlass_EffectiveObservability.pptxThroughTheLookingGlass_EffectiveObservability.pptx
ThroughTheLookingGlass_EffectiveObservability.pptx
Grace Jansen
 
Introduction to Open Telemetry as Observability Library
Introduction to Open  Telemetry as Observability LibraryIntroduction to Open  Telemetry as Observability Library
Introduction to Open Telemetry as Observability Library
Tonny Adhi Sabastian
 
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...
Tonny Adhi Sabastian
 
"Introducing Distributed Tracing in a Large Software System", Kostiantyn Sha...
"Introducing Distributed Tracing in a Large Software System",  Kostiantyn Sha..."Introducing Distributed Tracing in a Large Software System",  Kostiantyn Sha...
"Introducing Distributed Tracing in a Large Software System", Kostiantyn Sha...
Fwdays
 
WJAX 2019 - Taking Distributed Tracing to the next level
WJAX 2019 - Taking Distributed Tracing to the next levelWJAX 2019 - Taking Distributed Tracing to the next level
WJAX 2019 - Taking Distributed Tracing to the next level
Frank Pfleger
 
2307 - DevBCN - Otel 101_compressed.pdf
2307 - DevBCN - Otel 101_compressed.pdf2307 - DevBCN - Otel 101_compressed.pdf
2307 - DevBCN - Otel 101_compressed.pdf
DimitrisFinas1
 
ADDO Open Source Observability Tools
ADDO Open Source Observability Tools ADDO Open Source Observability Tools
ADDO Open Source Observability Tools
Mickey Boxell
 
Java il spanning services 2019
Java il   spanning services 2019Java il   spanning services 2019
Java il spanning services 2019
Yair Galler
 
Migrating from OpenTracing to OpenTelemetry - Kubernetes Community Days Munic...
Migrating from OpenTracing to OpenTelemetry - Kubernetes Community Days Munic...Migrating from OpenTracing to OpenTelemetry - Kubernetes Community Days Munic...
Migrating from OpenTracing to OpenTelemetry - Kubernetes Community Days Munic...
SonjaChevre
 
"Distributed Tracing: New DevOps Foundation" by Jayesh Ahire
"Distributed Tracing: New DevOps Foundation" by Jayesh Ahire  "Distributed Tracing: New DevOps Foundation" by Jayesh Ahire
"Distributed Tracing: New DevOps Foundation" by Jayesh Ahire
CodeOps Technologies LLP
 
Distributed Tracing: New DevOps Foundation
Distributed Tracing: New DevOps FoundationDistributed Tracing: New DevOps Foundation
Distributed Tracing: New DevOps Foundation
CodeOps Technologies LLP
 
Observability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptxObservability for Application Developers (1)-1.pptx
Observability for Application Developers (1)-1.pptx
OpsTree solutions
 
Manage Microservices Chaos and Complexity with Observability
Manage Microservices Chaos and Complexity with ObservabilityManage Microservices Chaos and Complexity with Observability
Manage Microservices Chaos and Complexity with Observability
NGINX, Inc.
 
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
AgileNetwork
 
DockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability WorkshopDockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability Workshop
Kevin Crawley
 
What Is OpenTelemetry? A Complete Introduction
What Is OpenTelemetry? A Complete IntroductionWhat Is OpenTelemetry? A Complete Introduction
What Is OpenTelemetry? A Complete Introduction
Ciente
 
stackconf 2023 | Practical introduction to OpenTelemetry tracing by Nicolas F...
stackconf 2023 | Practical introduction to OpenTelemetry tracing by Nicolas F...stackconf 2023 | Practical introduction to OpenTelemetry tracing by Nicolas F...
stackconf 2023 | Practical introduction to OpenTelemetry tracing by Nicolas F...
NETWAYS
 
Beginner's Guide to Observability@Devoxx PL 2024
Beginner's  Guide to Observability@Devoxx PL 2024Beginner's  Guide to Observability@Devoxx PL 2024
Beginner's Guide to Observability@Devoxx PL 2024
michniczscribd
 
stackconf 2022: Open Source for Better Observability
stackconf 2022: Open Source for Better Observabilitystackconf 2022: Open Source for Better Observability
stackconf 2022: Open Source for Better Observability
NETWAYS
 
Meetup OpenTelemetry Intro
Meetup OpenTelemetry IntroMeetup OpenTelemetry Intro
Meetup OpenTelemetry Intro
DimitrisFinas1
 
ThroughTheLookingGlass_EffectiveObservability.pptx
ThroughTheLookingGlass_EffectiveObservability.pptxThroughTheLookingGlass_EffectiveObservability.pptx
ThroughTheLookingGlass_EffectiveObservability.pptx
Grace Jansen
 
Introduction to Open Telemetry as Observability Library
Introduction to Open  Telemetry as Observability LibraryIntroduction to Open  Telemetry as Observability Library
Introduction to Open Telemetry as Observability Library
Tonny Adhi Sabastian
 
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...
Tonny Adhi Sabastian
 
"Introducing Distributed Tracing in a Large Software System", Kostiantyn Sha...
"Introducing Distributed Tracing in a Large Software System",  Kostiantyn Sha..."Introducing Distributed Tracing in a Large Software System",  Kostiantyn Sha...
"Introducing Distributed Tracing in a Large Software System", Kostiantyn Sha...
Fwdays
 
WJAX 2019 - Taking Distributed Tracing to the next level
WJAX 2019 - Taking Distributed Tracing to the next levelWJAX 2019 - Taking Distributed Tracing to the next level
WJAX 2019 - Taking Distributed Tracing to the next level
Frank Pfleger
 
2307 - DevBCN - Otel 101_compressed.pdf
2307 - DevBCN - Otel 101_compressed.pdf2307 - DevBCN - Otel 101_compressed.pdf
2307 - DevBCN - Otel 101_compressed.pdf
DimitrisFinas1
 
ADDO Open Source Observability Tools
ADDO Open Source Observability Tools ADDO Open Source Observability Tools
ADDO Open Source Observability Tools
Mickey Boxell
 
Java il spanning services 2019
Java il   spanning services 2019Java il   spanning services 2019
Java il spanning services 2019
Yair Galler
 
Migrating from OpenTracing to OpenTelemetry - Kubernetes Community Days Munic...
Migrating from OpenTracing to OpenTelemetry - Kubernetes Community Days Munic...Migrating from OpenTracing to OpenTelemetry - Kubernetes Community Days Munic...
Migrating from OpenTracing to OpenTelemetry - Kubernetes Community Days Munic...
SonjaChevre
 
Ad

More from VMware Tanzu (20)

Spring into AI presented by Dan Vega 5/14
Spring into AI presented by Dan Vega 5/14Spring into AI presented by Dan Vega 5/14
Spring into AI presented by Dan Vega 5/14
VMware Tanzu
 
What AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About ItWhat AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About It
VMware Tanzu
 
Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023
VMware Tanzu
 
Enhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at ScaleEnhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at Scale
VMware Tanzu
 
Spring Update | July 2023
Spring Update | July 2023Spring Update | July 2023
Spring Update | July 2023
VMware Tanzu
 
Platforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a ProductPlatforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a Product
VMware Tanzu
 
Building Cloud Ready Apps
Building Cloud Ready AppsBuilding Cloud Ready Apps
Building Cloud Ready Apps
VMware Tanzu
 
Spring Boot 3 And Beyond
Spring Boot 3 And BeyondSpring Boot 3 And Beyond
Spring Boot 3 And Beyond
VMware Tanzu
 
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdfSpring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
VMware Tanzu
 
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
VMware Tanzu
 
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
VMware Tanzu
 
tanzu_developer_connect.pptx
tanzu_developer_connect.pptxtanzu_developer_connect.pptx
tanzu_developer_connect.pptx
VMware Tanzu
 
Tanzu Virtual Developer Connect Workshop - French
Tanzu Virtual Developer Connect Workshop - FrenchTanzu Virtual Developer Connect Workshop - French
Tanzu Virtual Developer Connect Workshop - French
VMware Tanzu
 
Tanzu Developer Connect Workshop - English
Tanzu Developer Connect Workshop - EnglishTanzu Developer Connect Workshop - English
Tanzu Developer Connect Workshop - English
VMware Tanzu
 
Virtual Developer Connect Workshop - English
Virtual Developer Connect Workshop - EnglishVirtual Developer Connect Workshop - English
Virtual Developer Connect Workshop - English
VMware Tanzu
 
Tanzu Developer Connect - French
Tanzu Developer Connect - FrenchTanzu Developer Connect - French
Tanzu Developer Connect - French
VMware Tanzu
 
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
VMware Tanzu
 
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring BootSpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
VMware Tanzu
 
SpringOne Tour: The Influential Software Engineer
SpringOne Tour: The Influential Software EngineerSpringOne Tour: The Influential Software Engineer
SpringOne Tour: The Influential Software Engineer
VMware Tanzu
 
SpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs PracticeSpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs Practice
VMware Tanzu
 
Spring into AI presented by Dan Vega 5/14
Spring into AI presented by Dan Vega 5/14Spring into AI presented by Dan Vega 5/14
Spring into AI presented by Dan Vega 5/14
VMware Tanzu
 
What AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About ItWhat AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About It
VMware Tanzu
 
Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023
VMware Tanzu
 
Enhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at ScaleEnhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at Scale
VMware Tanzu
 
Spring Update | July 2023
Spring Update | July 2023Spring Update | July 2023
Spring Update | July 2023
VMware Tanzu
 
Platforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a ProductPlatforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a Product
VMware Tanzu
 
Building Cloud Ready Apps
Building Cloud Ready AppsBuilding Cloud Ready Apps
Building Cloud Ready Apps
VMware Tanzu
 
Spring Boot 3 And Beyond
Spring Boot 3 And BeyondSpring Boot 3 And Beyond
Spring Boot 3 And Beyond
VMware Tanzu
 
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdfSpring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
VMware Tanzu
 
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
VMware Tanzu
 
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
Observability, Distributed Tracing, and Open Source: The Missing Primer

  • 1. Observability, Distributed Tracing, and Open Source: The Missing Primer
  • 5. About me: Daniel Khan (daniel.khan@dynatrace.com, @dkhan); Dir. Technology Strategy @Dynatrace; everything Open Source monitoring & standards & our contributions to it; Chair of W3C Trace Context
  • 6. Why I am doing this talk: Distributed Tracing, Observability, W3C Trace Context, OpenCensus, OpenTracing, OpenTelemetry, Metrics, Span, Trace
  • 7. In the Beginning there was the Monolith. Application: Presentation, Business Logic, Data Access, Database. Services: Presentation, API Gateway, Auth, Inventory, Cart, Account, Offers, Shipping, Checkout, Status. Wire
  • 8. Development in a Microservices World. Dev: Cart. Push. Preproduction: Cart, Auth, Inventory, Account, Offers, Shopping, Checkout, Status. KPIs for Cart: Latency, Response Time, Error Rate, Number of queries
  • 9. Metrics contain time-correlated data points. Counter: monotonically increasing values (think: odometer). Gauge: increasing and decreasing values (think: tachometer). Histogram: groups values into buckets (think: knock events at 0-50 mph, 51-100 mph, …). Source: https://techblog.commercetools.com/adding-consistency-and-automation-to-grafana-e99eb374fe40
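The three metric types on this slide can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the API of any particular metrics library; the class names, the bucket bounds, and the sample durations are assumptions made for the example.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Counter:
    """Monotonically increasing value (the odometer)."""
    value: float = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("a counter can only go up")
        self.value += amount


@dataclass
class Gauge:
    """Value that can rise and fall (the tachometer)."""
    value: float = 0.0

    def set(self, value: float) -> None:
        self.value = value


@dataclass
class Histogram:
    """Counts observations per bucket; `bounds` are inclusive upper limits."""
    bounds: list
    counts: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # One extra bucket catches everything above the last bound.
        self.counts = [0] * (len(self.bounds) + 1)

    def observe(self, value: float) -> None:
        self.counts[bisect_left(self.bounds, value)] += 1


# Hypothetical metrics for a cart service.
requests_total = Counter()
in_flight = Gauge()
latency_ms = Histogram(bounds=[50, 100, 250])

for duration in (12, 48, 75, 300, 100):  # made-up request durations in ms
    requests_total.inc()
    latency_ms.observe(duration)
in_flight.set(3)
```

A monitoring system would scrape or receive these values periodically and chart them over time; the histogram keeps only bucket counts, so individual request durations are not retained.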
  • 12. Complexity has moved to the Network Layer. Client, API GW, Services, Cart. Which requests lead to an error in our cart service? Trace: the ID a42b is passed along on every hop (a42b = Trace Context)
  • 13. A Trace is a Tree of Spans. Trace: Click, GW, API, JDBC (one span each). Spans represent a single operation and contain metadata such as the HTTP method, a database query, or an error code. Span: callDB(); Span: JDBC call
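As a rough sketch of the data structure this slide describes, a span can be modeled as a node that carries a name, the shared trace ID, some metadata, and its child spans. The `Span` class, its field names, and the attribute keys below are assumptions for illustration; real tracing systems define their own span model.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    name: str
    trace_id: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def child(self, name: str, attributes: Optional[dict] = None) -> Span:
        """Create a child span that shares this span's trace ID."""
        span = Span(name, self.trace_id, dict(attributes or {}))
        self.children.append(span)
        return span


def render(span: Span, depth: int = 0) -> list:
    """Return the tree as indented lines, one per span."""
    lines = ["  " * depth + span.name]
    for child_span in span.children:
        lines.extend(render(child_span, depth + 1))
    return lines


# The request from the slide: Click -> GW -> API -> callDB() -> JDBC call.
root = Span("Click", trace_id="a42b")
gw = root.child("GW", {"http.method": "GET"})  # hypothetical attribute key
api = gw.child("API")
call_db = api.child("callDB()")
jdbc = call_db.child("JDBC call")

print("\n".join(render(root)))
```

Every span in the tree carries the same trace ID, which is what lets a monitoring system reassemble spans reported by different services into one trace.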
  • 14. Trace Context Propagation. Carts: a42b in, a42b out (Extract, In-process propagation, Inject). Auto Instrumentation: Zipkin, Sleuth, OpenTelemetry, Commercial, …
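The extract and inject steps can be sketched as two small functions around a service handler. This is an illustration only: the header name `x-trace-id` and the `cart_service` function are hypothetical, and auto-instrumentation would normally do this work without any code in the service itself.

```python
import secrets

TRACE_HEADER = "x-trace-id"  # hypothetical header name, for illustration only


def extract(headers: dict) -> str:
    """Read the trace ID from incoming headers, or start a new trace."""
    return headers.get(TRACE_HEADER) or secrets.token_hex(16)


def inject(trace_id: str, headers: dict) -> dict:
    """Return a copy of the outgoing headers with the trace ID attached."""
    outgoing = dict(headers)
    outgoing[TRACE_HEADER] = trace_id
    return outgoing


def cart_service(incoming_headers: dict) -> dict:
    """Handle a request and build the headers for a downstream call."""
    trace_id = extract(incoming_headers)       # extract on the way in
    # In-process propagation: the ID travels with the request in local state.
    return inject(trace_id, {"accept": "application/json"})  # inject on the way out


downstream_headers = cart_service({TRACE_HEADER: "a42b"})
print(downstream_headers[TRACE_HEADER])
```

If no trace ID arrives with the request, the sketch starts a new trace, so the first service in a call chain becomes the root of the trace.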
  • 15. Trace Context Header Formats: Proprietary; B3 Header (Zipkin); W3C Trace Context. What's the header name and what does it contain?
  • 16. W3C Trace Context. Trace: API GW, Service A, Service B, Service C (one span each). Monitoring: OpenTelemetry, AWS, Zipkin, OpenTelemetry. Goal: all monitoring systems and middlewares agree on one format for trace context propagation
  • 17. W3C Trace Context Format. traceparent: 00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01 (fields, in order: Version, TraceID, ParentID, Flags). tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE
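The `traceparent` value on this slide splits into four dash-separated hex fields. The following is a minimal parsing sketch, assuming the version `00` layout shown above; the function and class names are made up for the example, and a production parser would also have to handle future versions and the `tracestate` header.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    flags: int

    @property
    def sampled(self) -> bool:
        # The lowest bit of the flags field marks the trace as sampled.
        return bool(self.flags & 0x01)


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed fields, or None if the header is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceParent(version, trace_id, parent_id, int(flags, 16))


tp = parse_traceparent("00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01")
print(tp)
```

Returning `None` for a malformed header lets the caller fall back to starting a new trace instead of propagating a broken one.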
  • 18. Data collection. So far we have only instrumented the code to propagate context; no data has been collected yet. Trace: Click, GW, API (one span each, linked by Trace Context). An Agent on each tier reports to the Monitoring System and its Storage
  • 19. Data Collection & Presentation Systems
      Solution          Agents  Instrumentation  Storage  Presentation
      Zipkin / Sleuth   +       +                +        +
      Jaeger            -       -                +        +
      OpenTelemetry     +       +                -        -
      Commercial        +       +                +        +
  • 23. Entity Model Based Service Flow
  • 25. Solving our Cart Problem. Client, API GW, Services, Cart. Trace: Client, API GW, Service, Currency, Cart. GET: Currency=EURO
  • 26. What we did to Solve the Problem
      1. We used metrics to learn about a problem
      2. We used distributed tracing to pass along a unique ID per trace (for that, we used auto instrumentation to extract and inject the trace ID)
      3. We used a monitoring system and its agents to collect traces, and we could filter for transactions that produced an error
      4. We looked into the metadata of such a transaction to identify how it differs from succeeding ones
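The last two steps, filtering for failed transactions and comparing their metadata with succeeding ones, can be sketched as follows. The trace records, status codes, and attribute names are invented for the example (only the `Currency=EURO` detail comes from the talk), and a real monitoring system would offer this as a query or UI filter instead.

```python
from collections import Counter

# Hypothetical collected traces; the record layout is an assumption.
traces = [
    {"trace_id": "a42b", "status": 500,
     "attributes": {"http.method": "GET", "currency": "EURO"}},
    {"trace_id": "b17c", "status": 200,
     "attributes": {"http.method": "GET", "currency": "USD"}},
    {"trace_id": "c93d", "status": 500,
     "attributes": {"http.method": "GET", "currency": "EURO"}},
    {"trace_id": "d04e", "status": 200,
     "attributes": {"http.method": "GET", "currency": "GBP"}},
]


def distinguishing_attributes(records: list) -> list:
    """Attribute pairs present in every failed trace and in no successful one."""
    failed = [t for t in records if t["status"] >= 500]
    succeeded = [t for t in records if t["status"] < 500]
    ok_pairs = {pair for t in succeeded for pair in t["attributes"].items()}
    counts = Counter(
        pair
        for t in failed
        for pair in t["attributes"].items()
        if pair not in ok_pairs
    )
    return [pair for pair, n in counts.items() if n == len(failed)]


print(distinguishing_attributes(traces))
```

With this sample data, the only attribute that separates failing from succeeding requests is the currency, which mirrors how the cart problem in the talk was narrowed down.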
  • 27. You've mentioned OpenTelemetry … OpenCensus + OpenTracing = OpenTelemetry: in early 2019, OpenCensus and OpenTracing merged into OpenTelemetry. Metrics, Traces, Logs
  • 33. OpenTelemetry – Developer use cases. Cloud native microservices architectures are hard to trace and debug during development. In development, OpenTelemetry can be used either to manually create spans that trace certain execution paths, or to make a system observable through the provided auto-instrumentation. As backend and UI, Jaeger is the most popular tool; it is open source and displays traces only
  • 34. OpenTelemetry – in Production. Provides just a fraction of what modern tools provide: Traces, Metrics, Logs, Topology, Behavior, Code level visibility, Metadata. Manual instrumentation code needs to be kept up to date. A backend needs to be maintained. No support model if instrumentation breaks production code. No enterprise features (access control, throttling, scaling, …)
  • 35. Why do Vendors Care then?
  • 36. OpenTelemetry Company Contribution Stats: Google, Microsoft, Dynatrace
  • 38. What happens when we add support for a new framework? Today, our engineers reverse engineer frameworks to add instrumentation support to them. Every time an update is released, the instrumentation code is tested. In case of issues, it goes back to the development team, which needs to fix it and deploy an update. The whole process is automated and transparent to the customer ☺ This is costly and time consuming
  • 39. In-process tracing. Trace: Click, GW, API, HZQ (one span each), reported to the Monitoring System. Span: doHZQ(); Span: HZQ call. OTEL HZQ Wrapper
  • 40. “We want every platform and library to be pre-instrumented with OpenTelemetry and we’re committed to making this as easy as possible.” Sergey Kanzhelev (Google)
  • 41. What is Observability and how does it differ from Monitoring?
      1. In control theory, observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs. Source: Wikipedia
      2. In software development, observability is achieved by adding code (instrumentation) that emits telemetry data.
      3. Monitoring is the act of displaying and analyzing this telemetry data.
      4. Monitoring alone can tell you that there is a problem. E.g. “We see that some users experience a 50% higher response time on check-out”
      5. Observability helps find the root cause (the why) by providing data that can be correlated and analysed freely, even if the problem is completely new to you (unknown unknowns). E.g. “The response time of the checkout grows quadratically with the number of items in the basket, because a misplaced for loop runs the same database query once per item, for every item in the basket”
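The misplaced loop from the last example can be reproduced in a few lines. This is a hypothetical reconstruction, not code from the talk: `checkout_buggy`, `checkout_fixed`, and the fake `run_query` callback are assumptions used to count how many queries each variant issues.

```python
def checkout_buggy(basket, run_query):
    """Misplaced loop: every item is re-queried for every item (n * n queries)."""
    prices = {}
    for _ in basket:            # the loop that should not be here
        for item in basket:
            prices[item] = run_query(item)
    return sum(prices.values())


def checkout_fixed(basket, run_query):
    """One query per item (n queries)."""
    return sum(run_query(item) for item in basket)


def count_queries(checkout, n):
    """Run a checkout over n distinct items and count the queries it issues."""
    calls = 0

    def run_query(item):
        nonlocal calls
        calls += 1
        return 1.0  # made-up price; only the call count matters here

    checkout([f"item-{i}" for i in range(n)], run_query)
    return calls


for n in (1, 10, 100):
    print(n, count_queries(checkout_buggy, n), count_queries(checkout_fixed, n))
```

Both variants return the same total, so a functional test would pass; only telemetry that records the number of queries per request, as span metadata or as a metric, makes the difference visible.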
  • 42. Putting it all Together. Metrics can help you to learn that there is a problem. Distributed tracing becomes increasingly important to understand multi-tier execution paths and root causes of problems. Developers now rely on metrics and traces to understand how their service functions in their microservice architectures. Pure Open Source solutions are viable for pre-prod environments. Standardization is the only way to tackle today's complexity, and Open Source is the key driver. Vendors are prepared to tap into data collected by Open Source standard tools to add enterprise features on top to support web-scale workloads