1
When it absolutely, positively,
has to be there
Reliability Guarantees in Apache Kafka
@jeffholoman @gwenshap
4
Apache Kafka
High Throughput
Low Latency
Scalable
Centralized
Real-time
5
Streaming Platform
Producer Consumer
Streaming Applications
Connectors Connectors
Apache
Kafka
6
Versions of Apache Kafka
• 0.7.0 <- Please don’t
• 0.8.0 <- Replication exists, it will continue evolving with every release
• 0.8.2 <- New producer, offset commits to Kafka
• 0.9.0 <- New consumer, Connect APIs
• 0.10.0 <- New consumer improvements, Streams APIs
• 0.11.0 <- Idempotent producer, transactional semantics, Exactly once.
7
Kafka Components
• Broker
• Java clients:
• Producer
• Consumers
• Kafka Streams
• Kafka Connect
• Non-Java:
• Librdkafka
• Librdkafka based – Python, Go, NodeJS, C#...
• Others
8
If Kafka is a critical piece of our pipeline
 Can we be 100% sure that our data will get there?
 Can we lose messages?
 How do we verify?
 Whose fault is it?
10
Distributed Systems
 Things Fail
 Systems are designed to
tolerate failure
 We must expect failures
and design our code and
configure our systems to
handle them
11
[Diagram: Data flow. On the client machine, the application thread hands data to the Kafka client, through the O/S socket buffer and NIC; across the network to the broker's NIC, O/S socket buffer, page cache, and finally disk, with replication to other brokers; an ack or exception returns to the client via callback. ✗ marks show the many points along this path where things can fail.]
12
[Diagram: The same client-to-broker data flow, now also showing consumer offsets flowing to ZooKeeper (old consumer) or to Kafka itself (new consumer); each hop is another place where failures (✗) can occur.]
13
Kafka is super reliable.
Stores data, on disk. Replicated.
… if you know how to configure it
that way.
14
Replication is your friend
 Kafka protects against failures by replicating data
 The unit of replication is the partition
 One replica is designated as the Leader
 Follower replicas fetch data from the leader
 The leader holds the list of “in-sync” replicas
15
Replication and ISRs
[Diagram: a producer writes to brokers 100, 101, and 102, each holding replicas of partitions 0-2.]
Topic: my_topic (Partitions: 3, Replicas: 3)
• Partition 0: Leader 100, ISR 101,102
• Partition 1: Leader 101, ISR 100,102
• Partition 2: Leader 102, ISR 101,100
16
ISR
2 things make a replica in-sync:
• Lag behind the leader
• replica.lag.time.max.ms – covers a replica that didn’t fetch or is behind
• replica.lag.max.messages – went away in 0.9
• Connection to Zookeeper
17
Terminology
Acked
• Producers will not retry sending.
• Depends on producer setting
Committed
• Only when message got to all ISR
(future leaders have it).
• Consumers can read.
• replica.lag.time.max.ms controls: how long can a
dead replica prevent consumers from reading?
Committed Offsets
• Consumer told Kafka the latest offsets it read. By
default the consumer will not see these events
again.
18
Replication
Acks = all
• Waits for all in-sync replicas to reply.
[Diagram: Replicas 1, 2, and 3 all hold message 100.]
19
Replication
Replica 3 stopped replicating for some reason.
[Diagram: Replica 1 (the leader) and Replica 2 hold messages 100-101; Replica 3 holds only 100.]
Message 100 is acked under acks=all and is “committed”. Message 101 is acked under acks=1 but is not “committed”.
20
Replication
[Diagram: Replica 1 and Replica 2 hold messages 100-101; Replica 3 still holds only 100.]
One replica drops out of ISR, or goes offline; all messages are now acked and committed.
21
Replication
The 2nd replica drops out, or is offline.
[Diagram: Replica 1 (the leader) continues alone with messages 100-104; Replica 2 stopped at 101, Replica 3 at 100.]
22
Replication
[Diagram: Replica 1, the only replica holding messages 100-104, fails (✗).]
Now we’re in trouble
23
Replication
If Replica 2 or 3 comes back online before the leader, you will lose data.
[Diagram: Replica 2 or 3 becomes the new leader knowing only messages 100-101, yet messages 102-104 were all “acked” and “committed”.]
24
So what to do
Disable Unclean Leader Election
• unclean.leader.election.enable = false
• Default from 0.11.0
Set replication factor
• default.replication.factor = 3
Set minimum ISRs
• min.insync.replicas = 2
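The three settings above are broker/topic configuration; a sketch of how they might appear in server.properties (names are from the slide; values are the slide's recommendations, to be adjusted per cluster):

```
# server.properties sketch: durability-oriented defaults
unclean.leader.election.enable=false
default.replication.factor=3
min.insync.replicas=2
```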
25
Warning
min.insync.replicas is applied at the topic level.
Topics created before the server-level change must have their configuration altered manually.
Before 0.9.0, the topic must always be altered manually (KAFKA-2114)
26
Replication
Replication = 3
Min ISR = 2
[Diagram: Replicas 1, 2, and 3 all hold message 100.]
27
Replication
[Diagram: Replica 1 (the leader) and Replica 2 hold messages 100-101; Replica 3 holds only 100.]
One replica drops out of ISR, or goes offline
28
Replication
[Diagram: Replica 1 is the only in-sync replica; Replica 2 stopped at 101, Replica 3 at 100.]
The 2nd replica fails, or falls out of sync. With min ISR = 2, messages 102-104 cannot be committed and buffer up in the producer.
29
30
Producer Internals
Producer sends batches of messages to a buffer
[Diagram: application threads call send(); messages M0-M3 accumulate into batches in a buffer; batches are drained and sent to the broker; the response either updates the Future and fires the callback with metadata, or, on failure, triggers a retry or surfaces an exception.]
31
Basics
Durability can be configured with the producer configuration request.required.acks
• 0 The message is written to the network (buffer)
• 1 The message is written to the leader
• all The producer gets an ack after all ISRs receive the data; the message is committed
Make sure the producer doesn’t just throw messages away!
• block.on.buffer.full = true < 0.9.0
• max.block.ms = Long.MAX_VALUE
• Or handle the BufferExhaustedException / TimeoutException yourself
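The producer-side knobs above can be collected into a Properties object. A minimal sketch using the new-producer config names; the broker address is a placeholder, and serializers and other required settings are omitted:

```java
import java.util.Properties;

public class DurableProducerConfig {
    // Durability-oriented producer settings from the slides.
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");        // placeholder address
        props.put("acks", "all");                              // ack only after all ISRs have the data
        props.put("retries", Integer.MAX_VALUE);               // retry transient failures indefinitely
        props.put("max.block.ms", Long.MAX_VALUE);             // block, rather than drop, when the buffer is full
        props.put("max.in.flight.requests.per.connection", 1); // avoid re-ordering on retry
        return props;
    }
}
```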
32
“New” Producer
All calls are non-blocking async
2 Options for checking for failures:
• Immediately block for response: send().get()
• Do followup work in Callback, close producer after error threshold
• Be careful about buffering these failures. Future work? KAFKA-1955
• Don’t forget to close the producer! producer.close() will block until in-flight requests complete
retries (producer config) defaults to 0
In flight requests could lead to message re-ordering
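The two options above can be modeled without a broker. A pure-JDK sketch in which CompletableFuture stands in for the Future a real send() returns; the class and method names here are illustrative, not Kafka APIs:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class SendPatterns {
    // Option 1: block immediately for the result, like send().get().
    public static String syncSend(CompletableFuture<String> pending) {
        return pending.join(); // throws CompletionException if the send failed
    }

    // Option 2: attach a callback and count failures toward an error threshold.
    public static void asyncSend(CompletableFuture<String> pending, AtomicInteger errorCount) {
        pending.whenComplete((metadata, error) -> {
            if (error != null) errorCount.incrementAndGet();
        });
    }
}
```

Blocking per message is simple but serializes sends; the callback style keeps throughput but requires deciding what to do once the error count crosses a threshold.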
33
34
Consumer
Three choices for Consumer API
• Simple Consumer
• High Level Consumer (ZookeeperConsumer)
• New KafkaConsumer
35
New Consumer – auto commit
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "10000");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        processAndUpdateDB(record);
    }
}
Offsets are committed automatically every 10 seconds. What if we crash after 8 seconds?
36
New Consumer – manual commit
props.put("enable.auto.commit", "false");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records)
        processAndUpdateDB(record);
    consumer.commitSync();
}
Commit the entire batch outside the loop!
43
Minimize Duplicates for At Least Once Consuming
1. Commit your own offsets - Set enable.auto.commit = false
2. Use Rebalance Listener to limit duplicates
3. Make sure you commit only what you are done processing
4. Note: New consumer is single threaded – one consumer per thread.
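Point 2 above can be sketched without the Kafka jars. This pure-JDK model shows the idea behind committing from a rebalance listener: commit exactly what has been processed before a partition is handed to another consumer. Class and method names here are illustrative; the real hook is org.apache.kafka.clients.consumer.ConsumerRebalanceListener:

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetTracker {
    private final Map<Integer, Long> processed = new HashMap<>(); // partition -> next offset to read
    private final Map<Integer, Long> committed = new HashMap<>();

    // Record that the message at this offset has been fully processed.
    public void markProcessed(int partition, long offset) {
        processed.put(partition, offset + 1); // Kafka commits the NEXT offset to consume
    }

    // What a ConsumerRebalanceListener would do in onPartitionsRevoked:
    // commit only fully-processed offsets before losing the partitions.
    public void onPartitionsRevoked() {
        committed.putAll(processed);
    }

    public Long committedOffset(int partition) {
        return committed.get(partition);
    }
}
```

Because only processed offsets are committed at revocation, the consumer that takes over the partition resumes without reprocessing, limiting duplicates to the window since the last commit.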
44
Exactly Once Semantics
At most once is easy
At least once is not bad either – commit only after you’re 100% sure the data is safe
Exactly once is tricky
• Commit data and offsets in one transaction
• Idempotent producer
Kafka Connect – many connectors (especially Confluent’s) are exactly once
by using an external database to write events and store offsets in one transaction
Kafka Streams – starting with 0.11.0, has easy-to-configure exactly-once
(processing.guarantee=exactly_once).
Other stream processing systems – have their own thing.
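The Streams exactly-once switch can be sketched as below; in released 0.11.0 the config key is processing.guarantee (also available as the constant StreamsConfig.PROCESSING_GUARANTEE_CONFIG). Plain string keys are used to stay dependency-free, and the application id and broker address are placeholders:

```java
import java.util.Properties;

public class StreamsEosConfig {
    // Kafka Streams (0.11.0+) exactly-once is enabled via processing.guarantee.
    public static Properties build() {
        Properties props = new Properties();
        props.put("application.id", "my-eos-app");      // placeholder
        props.put("bootstrap.servers", "broker1:9092"); // placeholder
        props.put("processing.guarantee", "exactly_once");
        return props;
    }
}
```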
47
How do we test Kafka?
"""Replication tests.
These tests verify that replication provides simple durability guarantees by checking that data acked by brokers is still available for consumption in the face of various failure scenarios.
Setup: 1 zk, 3 kafka nodes, 1 topic with partitions=3, replication-factor=3, and min.insync.replicas=2
- Produce messages in the background
- Consume messages in the background
- Drive broker failures (shutdown, or bounce repeatedly with kill -15 or kill -9)
- When done driving failures, stop producing, and finish consuming
- Validate that every acked message was consumed
"""
48
Monitoring for Data Loss
• Monitor for producer errors – watch the retry numbers
• Monitor consumer lag – MaxLag or via offsets
• Standard schema:
• Each message should contain timestamp and originating service and host
• Each producer can report message counts and offsets to a special topic
• “Monitoring consumer” reports message counts to another special topic
• “Important consumers” also report message counts
• Reconcile the results
49
Be Safe, Not Sorry
Acks = all
Max.block.ms = Long.MAX_VALUE
Retries = MAX_INT
( Max.inflight.requests.per.connection = 1 )
Producer.close()
Replication-factor >= 3
Min.insync.replicas = 2
Unclean.leader.election = false
Enable.auto.commit = false
Commit after processing
Monitor!
50
Thank You!
Kafka reliability velocity 17
Editor's Notes
  • #6: Apache Kafka is no longer just pub-sub messaging. Because of its persistence and reliability, it makes a great place to manage general streams of events and to drive streaming applications.
  • #8: We are going to start by discussing reliability guarantees as implemented by the broker’s replication protocol. We then discuss how to configure the clients for better reliability. We’ll use Java clients as an example. For non-Java clients: The C client (librdkafka) works pretty much the same way – the same configurations and guarantees will work. Same for clients in other languages based on librdkafka. For other clients… it’s hard to make generalizations. Some are very different, and the advice in this talk will not work for them.
  • #12: Low Level Diagram: Not talking about producer / consumer design yet…maybe this is too low-level though Show diagram of network send -> os socket -> NIC -> ---- NIC -> Os socket buffer -> socket -> internal message flow / socket server -> response back to client -> how writes get persisted to disk including os buffers, async write etc Then overlay places where things can fail.
  • #16: Highlight boxes with different color
  • #21: When Replica 3 is back, it will catch up
  • #31: Kafka exposes its binary TCP protocol via a Java API, which is what we’ll be discussing here. So everything in the box is what’s happening inside the producer. Generally speaking, you have an application thread, or threads, that take individual messages and “send” them to Kafka. What happens under the covers is that these messages are batched up where possible in order to amortize the overhead of the send, stored in a buffer, and communicated over to Kafka. After Kafka has completed its work, a response is returned back for each message. This happens asynchronously, using Java’s concurrent API. This response consists of either an exception or a metadata record. If the metadata is returned, which contains the offset, partition, and topic, then things are good and we continue processing. However, if an error has returned, the producer will automatically retry the failed message, up to a configurable number of retries or amount of time. When this exception occurs and we have retries enabled, these retries actually just go right back to the start of the batches being prepared to send back to Kafka.
  • #36: Commit every 10 seconds, but we don’t really have any control over what’s processed, and this can lead to duplicates
  • #39: If you are doing too much work, commits don’t count as heartbeat
  • #41: So let’s say we have auto-commit enabled, and we are chugging along, counting on the consumer to commit our offsets for us. This is great because we don’t have to code anything, and don’t have to think about the frequency of commits and the impact that might have on our throughput. Life is good. But now we’ve lost a thread or a process. And we don’t really know where we are in the processing, because the last auto-commit committed stuff that we hadn’t actually written to disk.
  • #42: So now we’re in a situation where we think we’ve read all of our data, but we will have gaps in data. Note the same risk applies if we lose a partition or broker and get a new leader.