Bartek Plotka
Bwplotka Bplotka
Fabian Reinartz
fabxc
Global, durable Prometheus monitoring
Prometheus 2.0
● Reliable operational model
● Powerful query language
● Scraping capabilities beyond casual usage
● Local metric storage
Prometheus at Scale
A dream
n1
Improbable case
2 n+1
...
● Multiple isolated Kubernetes clusters
n1
Improbable case
2
Prometheus
n+1
Prometheus
...
● Multiple isolated Kubernetes clusters
● Single Prometheus server per cluster
1
Improbable case
2
Prometheus
n
n+1
Prometheus
...
● Multiple isolated Kubernetes clusters
● Single Prometheus server per cluster
● Dashboards & Alertmanager in separate cluster
Grafana
Alertmanager
Improbable case
● Multiple isolated Kubernetes clusters
● Single Prometheus server per cluster
● Dashboards & Alertmanager in separate cluster
Grafana
Alertmanager
What is missing? 1
2
Prometheus
n
n+1
Prometheus
...
Global View
See everything from a single
place!
1
Global View
2
Prometheus
n
n+1
Prometheus
...
Grafana
Alertmanager
“Alert when 66% of clusters in a region are down”
1
Global View
2
Prometheus
n
n+1
Prometheus
...
● How to aggregate data from different clusters?
Grafana
Alertmanager
1
Global View
2
Prometheus
n
n+1
Prometheus
...
● How to aggregate data from different clusters?
○ Use hierarchical federation?
Grafana
Alertmanager
Prometheus
1
Global View
2
Prometheus
n
n+1
Prometheus
...
● How to aggregate data from different clusters?
○ Use hierarchical federation?
Grafana
Alertmanager
Single point of failure
Maintenance
What data are federated?
Prometheus
Availability
Where is my sample?
1
Availability
2
Prometheus
n
n+1
Prometheus
...
Grafana
Alertmanager
Operator error
Hardware failure
Rollout
1
Availability
2
Prometheus
n
n+1
Prometheus
...
● Can we ensure no metric data is lost?
Grafana
Alertmanager
1
Availability
2
Prometheus
n
n+1
Prometheus
...
● Can we ensure no metric data is lost?
○ Add HA replicas?
Grafana
Alertmanager
Prometheus Prometheus
1
Availability
2
Prometheus
n
n+1
Prometheus
...
● Can we ensure no metric data is lost?
○ Add HA replicas?
Grafana
Alertmanager
Prometheus Prometheus
Which replica should we query?
Cost?
Where to put rules and alerts?
Historical Metrics
What exactly happened
X weeks ago?
Metric retention
T - 3 years    T - 2 years    T - 12 months    T
?
“Go back to what happened 6 months ago...”
Metric retention
T - 3 years    T - 2 years    T - 12 months    T
?
“Good progress! Memory allocs look better than 1.5 years ago!”
T
X
Metric retention
“Let’s see user traffic across years!”
T - 3 years    T - 2 years    T - 12 months
? ? ? ?
T
X
Metric retention
Infrastructure retention: 9 days
9 days
T - 3 years    T - 2 years    T - 12 months    T
Metric retention
● Can we have longer retention?
Metric retention
● Can we have longer retention?
○ Upgrade to Prometheus 2.0
Metric retention
● Can we have longer retention?
○ Upgrade to Prometheus 2.0
○ Scale SSD Vertically?
SSD
Prometheus
Metric retention
● Can we have longer retention?
○ Upgrade to Prometheus 2.0
○ Scale SSD Vertically?
SSD
Prometheus
SSD
SSD
SSD
SSD
Metric retention
● Can we have longer retention?
○ Upgrade to Prometheus 2.0
○ Scale SSD Vertically?
SSD
Prometheus
Backup?
Maintenance
Cost
Recap
1
2
Prometheus
n
n+1
Prometheus
...
Grafana
Alertmanager
Prometheus
It is just hard to…
● Have a global view
● Have HA in place
● Increase retention
Thanos
It is just hard to…
● Have a global view
● Have HA in place
● Increase retention
● Seamless integration with Prometheus
● Easy deployment model
● Minimal number of dependencies
● Minimal baseline cost
Additional Goals
Global View
See everything from a single
place!
SSD
Prometheus
Prometheus
Targets
SSD
Sidecar
Prometheus Sidecar
Targets
SSD
Sidecar
Prometheus Sidecar
Targets
gRPC (Store API)
Store API
service Store {
  rpc Series(SeriesRequest) returns (stream SeriesResponse);
  rpc LabelNames(LabelNamesRequest) returns (LabelNamesResponse);
  rpc LabelValues(LabelValuesRequest) returns (LabelValuesResponse);
}

message SeriesRequest {
  int64 min_time = 1;
  int64 max_time = 2;
  repeated LabelMatcher matchers = 3;
}
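To make the Store API contract concrete, here is a minimal Python sketch of something that answers the three calls from memory. It is an illustration only, not Thanos code: the names `InMemoryStore` and `LabelMatcher`, the millisecond timestamps and the equality-only matching are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple

Labels = Tuple[Tuple[str, str], ...]   # sorted (name, value) pairs
Sample = Tuple[int, float]             # (timestamp in ms, value) -- assumed unit


@dataclass(frozen=True)
class LabelMatcher:
    """Hypothetical equality-only matcher (Prometheus also has !=, =~ and !~)."""
    name: str
    value: str


class InMemoryStore:
    """Hypothetical stand-in for any component that serves the Store API."""

    def __init__(self, data: Dict[Labels, List[Sample]]):
        self._data = data

    def series(self, min_time: int, max_time: int,
               matchers: List[LabelMatcher]) -> Iterator[Tuple[Labels, List[Sample]]]:
        # Stream every series whose labels satisfy all matchers,
        # restricted to samples inside [min_time, max_time].
        for labels, samples in self._data.items():
            label_map = dict(labels)
            if all(label_map.get(m.name) == m.value for m in matchers):
                window = [(t, v) for t, v in samples if min_time <= t <= max_time]
                if window:
                    yield labels, window

    def label_names(self) -> List[str]:
        return sorted({name for labels in self._data for name, _ in labels})

    def label_values(self, name: str) -> List[str]:
        return sorted({v for labels in self._data for n, v in labels if n == name})


store = InMemoryStore({
    (("__name__", "up"), ("job", "api")): [(1000, 1.0), (2000, 1.0), (3000, 0.0)],
    (("__name__", "up"), ("job", "db")): [(1000, 1.0)],
})
result = list(store.series(1500, 3000, [LabelMatcher("job", "api")]))
```

Because every data source only has to answer these three calls, a querier can treat a sidecar next to Prometheus and a store backed by object storage the same way.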
Sidecar
Prometheus
remote read
Store API
SSD
Querier
Prometheus Sidecar
Querier
Store API
Targets
HTTP
Query API
SSD
Global View
Prometheus Sidecar
Querier
Targets
SSD
Sidecar
Targets
Prometheus
Merge
Store API
SSD
Global View + Availability
Prometheus Sidecar
Targets
SSD
Sidecar
Targets
Prometheus
SSD
Sidecar Prometheus
"replica":"1"
"replica":"2"
Querier
Merge
Deduplicate
Store API
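The merge and deduplicate step can be sketched as follows. This is a deliberately simplified illustration, not the algorithm Thanos uses: it assumes both replicas record samples at identical timestamps, and the function name `merge_and_deduplicate` is hypothetical. Only the `replica` label comes from the slide.

```python
from typing import Dict, List, Tuple

Labels = Dict[str, str]
Sample = Tuple[int, float]  # (timestamp in ms, value) -- assumed unit
SeriesKey = Tuple[Tuple[str, str], ...]


def merge_and_deduplicate(series: List[Tuple[Labels, List[Sample]]],
                          replica_label: str = "replica"
                          ) -> Dict[SeriesKey, List[Sample]]:
    merged: Dict[SeriesKey, Dict[int, float]] = {}
    for labels, samples in series:
        # Series that differ only in the replica label share one key.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != replica_label))
        bucket = merged.setdefault(key, {})
        for ts, value in samples:
            # Keep the first value seen per timestamp; later replicas only fill gaps.
            bucket.setdefault(ts, value)
    return {key: sorted(points.items()) for key, points in merged.items()}


# Replica 1 missed the scrape at t=3000, replica 2 missed the one at t=1000.
replica_1 = ({"__name__": "up", "job": "api", "replica": "1"},
             [(1000, 1.0), (2000, 1.0)])
replica_2 = ({"__name__": "up", "job": "api", "replica": "2"},
             [(2000, 1.0), (3000, 1.0)])
deduped = merge_and_deduplicate([replica_1, replica_2])
```

The merged result is a single gap-free series without the `replica` label, which is what a dashboard or alert querying an HA pair wants to see.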
Thanos
It is just hard to…
● Have a global view
● Have HA in place
● Increase retention
Historical Metrics
What exactly happened
X weeks ago?
TSDB Layout
Block 1    Block 2    Block 3    Block 4
T-300    T-200    T-100    T-50    T
TSDB Layout
Block 1    Block 3    Block 4
chunks chunks
chunks chunks
index
T-300    T-200    T-100    T-50    T
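Since each block covers a fixed time window, a query only needs the blocks whose window overlaps the requested range. A minimal sketch, assuming half-open block windows and the block boundaries shown on the timeline; the `Block` class and `blocks_for_range` helper are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    name: str
    min_time: int  # inclusive
    max_time: int  # exclusive


def blocks_for_range(blocks: List[Block], start: int, end: int) -> List[Block]:
    # A block is relevant if its window [min_time, max_time) overlaps [start, end).
    return [b for b in blocks if b.min_time < end and b.max_time > start]


T = 1000  # arbitrary "now" for the example
blocks = [
    Block("Block 1", T - 300, T - 200),
    Block("Block 2", T - 200, T - 100),
    Block("Block 3", T - 100, T - 50),
    Block("Block 4", T - 50, T),
]
selected = [b.name for b in blocks_for_range(blocks, T - 120, T - 60)]
```

Here only Block 2 and Block 3 are touched; the index and chunk files of the other blocks are never read.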
SSD
Data saving
Prometheus Sidecar
Targets
Object Storage
Blocks Blocks
Block
Store
Object Storage
Blocks
Cache
Store
Querier
Store API
Store
Object Storage
Blocks
Cache
Store
Querier
Block
Store API
Store
● A series is made up of one or more “chunks”
● Each chunk contains ~120 samples
● Chunks can be retrieved through HTTP byte range queries
Store
● A series is made up of one or more “chunks”
● Each chunk contains ~120 samples
● Chunks can be retrieved through HTTP byte range queries
Example:
● 1000 series @ 30s scrape interval
Store
● A series is made up of one or more “chunks”
● Each chunk contains ~120 samples
● Chunks can be retrieved through HTTP byte range queries
Example:
● 1000 series @ 30s scrape interval
● Query 1 year
8.7 million chunks/range queries
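The 8.7 million figure follows directly from the numbers on the slide. A quick worked check, assuming a 365-day year and exactly 120 samples per chunk:

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

series = 1000
scrape_interval_s = 30
samples_per_chunk = 120  # the slide says "~120"; treated as exact here

samples_per_series = SECONDS_PER_YEAR // scrape_interval_s   # 1,051,200
chunks_per_series = samples_per_series // samples_per_chunk  # 8,760
total_chunks = series * chunks_per_series                    # 8,760,000
```

With one byte range request per chunk, a naive one-year query would issue roughly 8.7 million requests against object storage.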
Store
Leverage Prometheus’ TSDB file layout
Store
Leverage Prometheus’ TSDB file layout
● Chunks of the same series are aligned
sequentially
Store
Leverage Prometheus’ TSDB file layout
● Chunks of the same series are aligned
● Similar series are aligned, e.g. due to same
metric name
Store
Leverage Prometheus’ TSDB file layout
● Chunks of the same series are aligned
● Similar series are aligned, e.g. due to same
metric name
Consolidating ranges in close proximity reduces
request count by 4-6 orders of magnitude.
8.7 million requests turned into O(20) requests.
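Consolidating nearby ranges can be sketched as a classic interval merge with a gap tolerance. This is an illustration of the idea, not the Thanos implementation; the function name `coalesce_ranges` and the `max_gap` value are assumptions for the example.

```python
from typing import List, Tuple

ByteRange = Tuple[int, int]  # (start offset, end offset), end exclusive


def coalesce_ranges(ranges: List[ByteRange], max_gap: int) -> List[ByteRange]:
    merged: List[ByteRange] = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous range: fetch both in one request,
            # accepting that the bytes in the gap are downloaded and discarded.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


chunk_ranges = [(0, 100), (110, 200), (205, 300), (5000, 5100), (5120, 5200)]
requests = coalesce_ranges(chunk_ranges, max_gap=64)
```

Five chunk reads collapse into two requests here. Because the TSDB layout places chunks of the same and of similar series next to each other, the same effect applied to a real block is what shrinks millions of requests to a handful.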
Store
Leverage Prometheus’ TSDB file layout
● Chunks of the same series are aligned
● Similar series are aligned, e.g. due to same
metric name
Index lookups profit from a similar approach.
Compaction
Density matters
Compaction
Object Storage
Blocks
Disk
Compactor
Compaction
Object Storage
Blocks
Disk
Compactor
Blocks
Compaction
Object Storage
Blocks
Disk
Compactor
Blocks
Block
Compaction
Object Storage
Blocks
Disk
Compactor
Block
Thanos
It is just hard to…
● Have a global view
● Have HA in place
● Increase retention
Downsampling
Let’s just step back a little
Downsampling
Raw: 16 bytes/sample
Compressed: 1.07 bytes/sample
Downsampling
BUT…
Downsampling
Decompressing one sample takes 10-40 nanoseconds
● Times 1000 series @ 30s scrape interval
● Times 1 year
Downsampling
Decompressing one sample takes 10-40 nanoseconds
● Times 1000 series @ 30s scrape interval
● Times 1 year
● Over 1 billion samples, i.e. 10-40s – for decoding alone
● Plus your actual computation over all those samples, e.g. rate()
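The decoding estimate can be reproduced from the slide's own numbers. A worked check, assuming a 365-day year:

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

series = 1000
scrape_interval_s = 30

total_samples = series * (SECONDS_PER_YEAR // scrape_interval_s)

# Decompression cost per sample from the slide: 10-40 nanoseconds.
min_decode_s = total_samples * 10e-9
max_decode_s = total_samples * 40e-9
```

That is about 1.05 billion samples and roughly 10.5 to 42 seconds spent on decompression alone, before any PromQL function runs over the data.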
Downsampling
Block
RAW
Block
@ 5m
Block
@ 1h
10x 12x
Downsampling
raw chunk
count sum min max counter
raw chunk...
Downsampling
count sum min max counter
...
Downsampling
count sum min max counter
count_over_time(requests_total[1h])
Downsampling
count sum min max counter
sum_over_time(requests_total[1h])
Downsampling
count sum min max counter
min(requests_total)
min_over_time(requests_total[1h])
Downsampling
count sum min max counter
max(requests_total)
max_over_time(requests_total[1h])
Downsampling
count sum min max counter
rate(requests_total[1h])
increase(requests_total[1h])
Downsampling
count sum min max counter
requests_total
avg(requests_total)
...
*
avg
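The five aggregates can be sketched as one pass over the raw samples, grouped into fixed windows. This is a simplified illustration, not the Thanos downsampling code: the `Aggregate` class and `downsample` function are hypothetical names, and the `counter` field here just keeps the last raw value per window without handling counter resets.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, value)


@dataclass
class Aggregate:
    count: int
    sum: float
    min: float
    max: float
    counter: float  # simplified: last raw value in the window, no reset handling


def downsample(samples: List[Sample], resolution_s: int) -> Dict[int, Aggregate]:
    windows: Dict[int, Aggregate] = {}
    for ts, value in sorted(samples):
        window_start = ts - ts % resolution_s
        agg = windows.get(window_start)
        if agg is None:
            windows[window_start] = Aggregate(1, value, value, value, value)
        else:
            agg.count += 1
            agg.sum += value
            agg.min = min(agg.min, value)
            agg.max = max(agg.max, value)
            agg.counter = value
    return windows


raw = [(0, 10.0), (30, 12.0), (60, 15.0), (300, 20.0), (330, 26.0)]
five_minute = downsample(raw, resolution_s=300)
first = five_minute[0]
average = first.sum / first.count  # avg is derived, not stored
```

Each query function then reads only the aggregate it needs: `count_over_time` reads count, `sum_over_time` reads sum, `min` and `max` read their own fields, `rate` and `increase` read counter, and an average is computed as sum divided by count.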
Full Architecture
Querier
SSD
Sidecar Prometheus
SSD
Sidecar Prometheus
Querier Querier
…
Compactor
Store
Bucket
Full Architecture
$ thanos sidecar …
$ thanos query …
$ thanos store …
$ thanos compact …
Deployment Models
Querier
S P
Querier Querier
…
Store
Bucket
S P
Querier
S P
Querier Querier
…
Store
Bucket
S P
Querier
S P
Querier Querier
…
Store
Bucket
S P
Cluster A
Cluster B
Cluster C
Deployment Models
Querier
S P
Querier Querier
…
Store
Bucket
S P
Querier
S P
Querier Querier
…
Store
Bucket
S P
Querier
S P
Querier Querier
…
Store
Bucket
S P
Cluster A
Cluster B
Cluster C
Federation (through Store API)
Deployment Models
Querier
S P
Querier Querier
…
Store
Bucket
S P
S P …
Store
Bucket
S P
S P …
Store
Bucket
S P
Cluster A
Cluster B
Cluster C
Global Scale Thanos Cluster
Cost
● Store + Query node ~ Savings on Prometheus side (+/- 0)
● Less SSD space on Prometheus side (savings)
● Basically: just your data stored in S3/GCS/HDFS + requests
Cost
Example:
● 20 Prometheus servers each ingesting 100k samples/sec, 500GB of local disk
● 20 x 250GB of new data per month + ~20% overhead for downsampling
● $1440/month for storage after 1 year (72TB of queryable data)
● $100/month for sustained 100 queries/sec against object storage
Thanos Cost: $1540
Cost
Example:
● 20 Prometheus servers each ingesting 100k samples/sec, 500GB of local disk
● 20 x 250GB of new data per month + ~20% overhead for downsampling
● $1440/month for storage after 1 year (72TB of queryable data)
● $100/month for sustained 100 queries/sec against object storage
● $1530/month savings in local SSDs
Thanos Cost: $1540 Prometheus Savings: $1530
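The storage figures in the example can be checked step by step. The slide does not state a price per GB, so the sketch below derives the implied price from the slide's own totals rather than assuming one; decimal units (1 TB = 1000 GB) are an assumption.

```python
servers = 20
new_data_gb_per_server_month = 250
downsampling_overhead = 0.20
months = 12

storage_cost_usd = 1440    # from the slide
request_cost_usd = 100     # from the slide
ssd_savings_usd = 1530     # from the slide

monthly_growth_gb = servers * new_data_gb_per_server_month * (1 + downsampling_overhead)
stored_tb_after_year = monthly_growth_gb * months / 1000

# Price per GB-month implied by the slide's totals (not stated on the slide).
implied_price_per_gb = storage_cost_usd / (stored_tb_after_year * 1000)

thanos_cost_usd = storage_cost_usd + request_cost_usd
net_cost_usd = thanos_cost_usd - ssd_savings_usd
```

This gives 72 TB after one year, an implied price of about $0.02 per GB per month, and a net difference of $10 per month between the Thanos cost and the local SSD savings.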
Demo - retention
Demo - deduplication
Any questions?
github.com/improbable-eng/thanos
Fabian Reinartz
fabxc
Bartek Plotka
bwplotka Bplotka

Thanos: Global, durable Prometheus monitoring