Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream Processing

Surge’12
September 27, 2012
Erik Onnen
@eonnen
About Me

• Director of Architecture and Development at Urban Airship
• Formerly Jive Software, Liberty Mutual, Opsware, Progress
• Java, C++, Python
• Background in messaging systems
 • Contributor to ActiveMQ
 • Global Tibco deployments
 • ESB commercial products
About Urban Airship

• Engagement platform using location and push notifications
• Analytics for delivery, conversion and influence
• High-precision targeting capabilities
This Talk

• How UA uses Kafka
• Kafka architecture digest
• Data structures and stream processing w/ Kafka
• Operational considerations
Kafka at Urban Airship



“The use for activity stream processing makes Kafka comparable to Facebook's
Scribe or Apache Flume... though the architecture and primitives are very different
for these systems and make Kafka more comparable to a traditional messaging
system.”
- http://incubator.apache.org/kafka/ Sep 27, 2012




“Let’s use it for all the things”
- me, 2010
Kafka at Urban Airship

• On the critical path for many of our core capabilities
 • Device metadata
 • Message delivery analytics
 • Device connectivity state
 • Feeds our operational data warehouse
• Three Kafka clusters doing in aggregate > 7B msg/day
• Peak observed single-consumer rate: 750K msg/sec
• All bare-metal hardware hosted with an MSP
• Factoring prominently in our multi-facility architecture
Kafka Core Concepts - The Big Picture
Kafka Core Concepts

• Publish-subscribe system (not a queue)
• One producer, zero or more consumers
• Consumers aren’t contending with each other for messages
• Messages retained for a configured window of time
• Messages grouped by topics
• Consumers partition a topic as a group:
 • 1 consumer thread - all topic messages
 • 2 consumer threads - each gets 1/2 of the messages
 • 3 consumer threads - each gets 1/3 of the messages
Kafka Core Concepts - Producers

• Producers have no idea who will consume a message or when
• Deliver messages to one and only one topic
• Deliver messages to one and only one broker*
• Deliver a message to one and only one partition on a broker
• Messages are not ack’d in any way (not when received, not when on disk, not on a boat, not in a plane...)
• Messages largely opaque to producers
• Send messages at or below a configured size†
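
For orientation, a minimal producer sketch against the 0.7-era Java API; the ZooKeeper address, topic name and serializer here are illustrative assumptions, not values from this deck:

    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.javaapi.producer.ProducerData;
    import kafka.producer.ProducerConfig;

    public class SportsProducer {
      public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zk.connect", "zookeeper-0:2181"); // brokers discovered via ZooKeeper
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        // Fire and forget - no ack comes back: not when received, not when on disk
        producer.send(new ProducerData<String, String>("SPORTS", "goal!"));
        producer.close();
      }
    }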
Kafka Core Concepts - Brokers

• Dumb by design
 • No shared state
 • Publish small bits of metadata to ZooKeeper
 • Messages are pulled by consumers (no push state management)
• Manage sets of segment files, one per topic + partition combination
• All delivery done through sendfile calls on mmap’d files - very fast, avoids the kernel -> user -> kernel copy for every send
Kafka Core Concepts - Brokers

• Nearly invisible in the grand scheme of operations if they have enough disk and RAM
Kafka Core Concepts - Brokers

• Don’t fear the JVM (just put it in a corner)
 • Most of the heavy lifting is done in system calls
 • Minimal on-heap buffering keeps most garbage in ParNew
 • A 20-minute sample shows approximately 100 ParNew collections totaling 0.42 seconds in GC (0.0003247526 of wall time)
Kafka Core Concepts - Consumers

• Consumer configured for one and only one group
• Messages are consumed via KafkaMessageStream iterators that never stop but may block
• A message stream is a combination of:
 • Topic (SPORTS)
 • Group (SPORTS EVENT LOGGER | SCORE UPDATER)
 • Broker(s) - 1 or more brokers feed a logical stream
 • Partition(s) - 1 or more partitions from a broker + topic
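
A hedged sketch of obtaining those streams with the 0.7-era high-level consumer (exact class names shifted between 0.7 point releases; group name, topic and stream count here are illustrative):

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.message.Message;

    public class ScoreUpdater {
      public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zk.connect", "zookeeper-0:2181");
        props.put("groupid", "SCORE_UPDATER"); // one and only one group

        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // Ask for 2 streams of SPORTS; the group splits the topic's partitions
        Map<String, List<KafkaStream<Message>>> streams =
            connector.createMessageStreams(Collections.singletonMap("SPORTS", 2));
        List<KafkaStream<Message>> sportsStreams = streams.get("SPORTS");
        // hand each stream to its own thread - see the executor example later in this deck
      }
    }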
Kafka Is Excellent for...

• Small, expressive messages - BYOD
• Throughput
 • Decimates any JMS or AMQP server for pub-sub throughput
 • >70x better throughput than beanstalkd
 • Scales well with number of consumers, topics
 • Re-balances after consumer failures
• Rewind-in-time scenarios
• Allowing transient “taps” into streams of data for roughly the cost of transport
But, Kafka Makes Critical Concessions - Brokers

• Data not redundant - if a broker dies, you have to restore it to recover that data
 • Shore up hardware
 • Consume as fast as possible
 • Persist to shared storage or use BRDB
 • Upcoming replication
• Segment corruption can be fatal for that topic + partition
Kafka Critical Concessions - Consumers

• Messages can be delivered out of order
Kafka Critical Concessions - Consumers

• No once-and-only-once semantics
• Consumers must correctly handle the same message multiple times
 • Rebalance after failure can result in redelivery
 • Consumer failure or unclean shutdown can result in redelivery
• The possibility of out-of-order delivery and redelivery requires idempotent, commutative consumers when dealing with systems of record (see the sketch below)
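
As a toy illustration of that constraint, a last-writer-wins apply loop that tolerates both replay and reordering; the in-memory map and DeviceUpdate type are hypothetical stand-ins for a real store:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class LastWriterWins {
      static class DeviceUpdate {
        final String deviceId;
        final long timestamp;
        final String blob;
        DeviceUpdate(String deviceId, long timestamp, String blob) {
          this.deviceId = deviceId; this.timestamp = timestamp; this.blob = blob;
        }
      }

      private final ConcurrentMap<String, DeviceUpdate> store =
          new ConcurrentHashMap<String, DeviceUpdate>();

      // Idempotent: applying the same update twice leaves the same state.
      // Commutative: whatever the arrival order, the newest timestamp wins.
      void apply(DeviceUpdate u) {
        for (;;) {
          DeviceUpdate old = store.get(u.deviceId);
          if (old == null) {
            if (store.putIfAbsent(u.deviceId, u) == null) return;
          } else if (u.timestamp > old.timestamp) {
            if (store.replace(u.deviceId, old, u)) return;
          } else {
            return; // stale duplicate or reordered delivery - drop it
          }
        }
      }
    }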
Storage Patterns and Data Structures

• Urban Airship uses Kafka for
 • Analytics
  • Producers write device data to Kafka
  • Consumers create dimensional indexes in HBase
 • Operational data
  • Producers are services writing to Kafka
  • Consumers write to the ODW (HBase as JSON)
 • Presence data
  • Producers are connectivity nodes writing to Kafka
  • Consumers write to LevelDB
Storage Patterns - Device Metadata


{ deviceId: "PONIES", tags: ["BEYONCE"], timestamp: 1 }

{ deviceId: "PONIES", tags: ["BEYONCE", "JAY-Z", "NICKELBACK"], timestamp: 2 }

{ deviceId: "PONIES", tags: ["BEYONCE", "JAY-Z", "NICKELBACK"], timestamp: 3 }
Storage Patterns - Device Metadata

• Primitive incarnation - blast an update into a row, keyed on deviceId
 • RDBMS
  • INSERT OR UPDATE DEVICE_METADATA (ID, VALUE) VALUES (DEVICE_ID, BLOB) WHERE ID = deviceId;
  • Denormalize - forget joining to read tags, way too expensive
Storage Patterns - Device Metadata

• Column store
 • Write k=deviceId -> c=NULL -> v=BLOB
• Both
 • Idempotent
 • FAIL - mutations can arrive out of order, can be replayed
 • Commutative
Storage Patterns - Device Metadata

• Improved approach - leverage the timestamp of the mutation
 • RDBMS
  • INSERT OR UPDATE DEVICE_METADATA (KEY, VALUE, TS) VALUES (DEVICE_ID, BLOB, TS) WHERE ID = deviceId AND TS = TS;
  • Heavy-handed approach
  • Massive I/O on the TS index, or risk reading an entire block per version with no adjacent blocks
Storage Patterns - Device Metadata

• Column store
 • Write k=deviceId -> c=INV(ts) -> v=BLOB (see the sketch below)
 • Reads are simple slices of one column, easy for LSM (pop the top column in the row)
 • No transactions required, much smaller lock footprint
• Both
 • Idempotent
 • Commutative
 • Old versions not removed automatically
 • Secondary indexes very difficult
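
A minimal sketch of this layout, assuming an HBase-style client API of the era; the table handle and column family are hypothetical, and INV(ts) here is Long.MAX_VALUE - ts so the newest version sorts first:

    import java.util.NavigableMap;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class DeviceMetadataStore {
      private static final byte[] FAMILY = Bytes.toBytes("d"); // hypothetical family

      // Write k=deviceId -> c=INV(ts) -> v=BLOB. Replaying a mutation rewrites
      // the same cell (idempotent); any arrival order yields the same row
      // contents (commutative).
      void write(HTable table, byte[] deviceId, long ts, byte[] blob) throws Exception {
        Put put = new Put(deviceId);
        put.add(FAMILY, Bytes.toBytes(Long.MAX_VALUE - ts), blob); // inverted timestamp
        table.put(put);
      }

      // Pop the top column in the row - the smallest INV(ts) is the newest version
      byte[] readLatest(HTable table, byte[] deviceId) throws Exception {
        Result r = table.get(new Get(deviceId).addFamily(FAMILY));
        NavigableMap<byte[], byte[]> columns = r.getFamilyMap(FAMILY);
        return (columns == null || columns.isEmpty()) ? null : columns.firstEntry().getValue();
      }
    }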
Storage Patterns - Device Metadata

• Gangnam Style - tag per column, deletions tombstoned
 • RDBMS - select for update and/or big txns?
 • Column store
  • Addition: k=deviceId -> c=TAG -> v=TS
  • Deletion: k=deviceId -> c=TAG -> v=-(TS)
  • Cell timestamp set to event timestamp in both cases (old updates ignored)
 • Easy to (re)build secondary indexes, tag counts
 • Commutative, idempotent and fast (see the sketch below)
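
A sketch of the same pattern, again assuming an HBase-style API with hypothetical table and family names; writing the cell timestamp as the event timestamp lets the store's own versioning shadow older updates:

    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TagStore {
      private static final byte[] TAGS = Bytes.toBytes("t"); // hypothetical family

      // Addition: k=deviceId -> c=TAG -> v=TS
      void addTag(HTable table, byte[] deviceId, String tag, long eventTs) throws Exception {
        Put put = new Put(deviceId);
        put.add(TAGS, Bytes.toBytes(tag), eventTs, Bytes.toBytes(eventTs));
        table.put(put);
      }

      // Deletion: k=deviceId -> c=TAG -> v=-(TS). A tombstone value rather than a
      // real delete, so a replayed older addition cannot resurrect the tag.
      void removeTag(HTable table, byte[] deviceId, String tag, long eventTs) throws Exception {
        Put put = new Put(deviceId);
        put.add(TAGS, Bytes.toBytes(tag), eventTs, Bytes.toBytes(-eventTs));
        table.put(put);
      }
    }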
Operational Considerations - Buffering

• A message in a broker is not immediately visible to a consumer
• Kafka buffers data until one of two conditions is true
 • log.flush.interval reached
 • log.default.flush.interval.ms elapsed
• False latency for low-throughput workloads
• The smaller of the two represents the potential for message loss
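
For reference, the two settings as they would appear in a broker's server.properties; the values are illustrative, not recommendations:

    # flush a log to disk once this many messages have accumulated...
    log.flush.interval=600
    # ...or once this many milliseconds have elapsed, whichever comes first
    log.default.flush.interval.ms=1000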
Operational Considerations - The FetcherRunnable

• Consumer spawns a number of FetcherRunnable threads to read from brokers
• FetcherRunnable feeds messages into queues that back the KafkaMessageStream API
• FetcherRunnable must remain healthy for consumers to see messages

    // consume the messages in the threads
    for (final KafkaStream<Message> stream : streams) {
      executor.submit(new Runnable() {
        public void run() {
          for (MessageAndMetadata msgAndMetadata : stream) {
            // process message (msgAndMetadata.message())
          }
        }
      });
    }
Operational Considerations - The FetcherRunnable

• A given FetcherRunnable is the lone source of data for its streams
• When a FetcherRunnable dies, the streams block indefinitely

2012-06-15 00:31:39,422 - ERROR [FetchRunnable-0:kafka.consumer.FetcherRunnable] - error in FetcherRunnable
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
    at sun.nio.ch.IOUtil.read(IOUtil.java:175)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
    at kafka.utils.Utils$.read(Utils.scala:483)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:53)
    at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
    at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:28)
    at kafka.consumer.SimpleConsumer.getResponse(SimpleConsumer.scala:181)
    at kafka.consumer.SimpleConsumer.liftedTree2$1(SimpleConsumer.scala:129)
    at kafka.consumer.SimpleConsumer.multifetch(SimpleConsumer.scala:119)
    at kafka.consumer.FetcherRunnable.run(FetcherRunnable.scala:63)
Operational Considerations - Rate is King

• MONITOR YOUR CONSUMPTION RATES
 • Kafka JMX beans
 • Application metrics for specific consumption behaviors (use Yammer Timer metrics; see the sketch below)
• Understand what “normal” is; alert when you are out of that band by some tolerance
• Not overcommitting consumers helps - nobody is idle
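
A hedged sketch of per-message timing with the Yammer metrics 2.x API named above; the metric scope and name are hypothetical. The timer's moving rates feed the "normal band" alert, and its latency percentiles show a consumer falling behind:

    import java.util.concurrent.TimeUnit;
    import com.yammer.metrics.Metrics;
    import com.yammer.metrics.core.Timer;
    import com.yammer.metrics.core.TimerContext;

    public class TimedConsumer {
      private static final Timer CONSUME = Metrics.newTimer(
          TimedConsumer.class, "consume", TimeUnit.MILLISECONDS, TimeUnit.SECONDS);

      void handle(byte[] message) {
        TimerContext ctx = CONSUME.time();
        try {
          // ... process the message ...
        } finally {
          ctx.stop(); // records latency and bumps the 1m/5m/15m rates
        }
      }
    }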
Operational Considerations - The Retention Window

• Data written to a segment file on a broker (topic + partition)
• Every consumer group has a relative offset within a segment
• Individual consumers move the offset and store it to ZooKeeper on a regular interval
• Segments are retained for log.retention.hours
• Segments deleted when outside the retention window
Operational Considerations - The Retention Window

• Consumers update offsets in ZooKeeper
• Monitor them and make sure they’re progressing (see the sketch below)
• Look for skew in the rate of change between partition offsets
• Monitoring consumption rate can also help
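
A sketch of reading those offsets straight out of ZooKeeper for graphing; the 0.7-era layout is /consumers/<group>/offsets/<topic>/<brokerId>-<partition>, and the group and topic names here are illustrative:

    import org.apache.zookeeper.ZooKeeper;

    public class OffsetProbe {
      public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zookeeper-0:2181", 30000, null);
        String base = "/consumers/SCORE_UPDATER/offsets/SPORTS";
        for (String partition : zk.getChildren(base, false)) {
          byte[] data = zk.getData(base + "/" + partition, false, null);
          long offset = Long.parseLong(new String(data, "UTF-8"));
          // emit to your metrics system; alert on stalls and on skew between partitions
          System.out.println(partition + " offset=" + offset);
        }
        zk.close();
      }
    }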
Operational Considerations - Scala




“Reading that Scala stack trace sure was easy”

- Nobody Ever
Operational Considerations - Scala

2012-07-04 11:49:08,469 - WARN [ZkClient-EventThread-132-zookeeper-0:2181,zookeeper-1:2181,zookeeper-2:2181:org.I0Itec.zkclient.ZkEventThread] - Error handling event ZkEvent[Children of /brokers/topics/SEND_EVENTS changed sent to kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener@43d248b4]
java.lang.NullPointerException
    at scala.util.parsing.combinator.Parsers$NoSuccess.<init>(Parsers.scala:131)
    at scala.util.parsing.combinator.Parsers$Failure.<init>(Parsers.scala:158)
    at scala.util.parsing.combinator.Parsers$$anonfun$acceptIf$1.apply(Parsers.scala:489)
    at scala.util.parsing.combinator.Parsers$$anonfun$acceptIf$1.apply(Parsers.scala:487)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)

... (~50 lines elided)

    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
    at scala.util.parsing.combinator.Parsers$Success.flatMapWithNext(Parsers.scala:113)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:200)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:200)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:200)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:200)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:208)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:208)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
    at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:742)
    at scala.util.parsing.json.JSON$.parseRaw(JSON.scala:71)
    at scala.util.parsing.json.JSON$.parseFull(JSON.scala:85)
Operational Considerations - Brokers

• Monitor IOPS and IOUtil
• Under no circumstances allow a broker to run out of disk space (don’t even get close)
• fetch.size - amount of data a consumer will pull
• max.message.size - largest message a producer can submit to a broker
• Broker enforces neither of these prior to v0.8 :(
 • KAFKA-490
 • KAFKA-247
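
Both are client-side settings in this era; a hedged example of where they live (values illustrative):

    # consumer properties: the largest fetch a consumer will issue per request
    fetch.size=1048576
    # producer properties: the largest message the client will attempt to send
    max.message.size=1000000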
Operational Considerations - Brokers

2012-06-15 04:47:35,632 - ERROR [FetchRunnable-2:kafka.consumer.FetcherRunnable] - error in FetcherRunnable for RN-OL:3-22
kafka.common.InvalidMessageSizeException: invalid message size:152173251 only received bytes:307196
at 0 possible causes (1) a single message larger than the fetch size; (2) log corruption
    at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:75)
    at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:61)
    at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:58)
    at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:50)
    at kafka.message.ByteBufferMessageSet.validBytes(ByteBufferMessageSet.scala:49)
    at kafka.consumer.PartitionTopicInfo.enqueue(PartitionTopicInfo.scala:70)
    at kafka.consumer.FetcherRunnable$$anonfun$run$3.apply(FetcherRunnable.scala:80)
    at kafka.consumer.FetcherRunnable$$anonfun$run$3.apply(FetcherRunnable.scala:66)
    at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
    at scala.collection.immutable.List.foreach(List.scala:45)
    at kafka.consumer.FetcherRunnable.run(FetcherRunnable.scala:66)
Operational Considerations - Consumers

• Consumer tuning is an art
 • Overcommit - more threads than partitions
  • Idling (often entire consumer processes)
  • Excessive rebalancing
 • Undercommit - fewer threads than partitions
  • Serial fetchers won’t keep up, depending on workload
  • Big GCs can cause rebalancing
 • Just right - a 2:1 partition-to-consumer-thread ratio
 • Mostly pivots on consumer workload (i.e. latency)
Operational Considerations - Incubators Gonna Incubate

• Deployed in some large installations
• Largely learning in production
• Hasn’t lived through a long lineage of people being mean to it or using it in anger

2012-06-15 04:25:00,774 - ERROR [kafka-processor-3:Processor@215] - java.lang.RuntimeException: OOME with size 1195725856
java.lang.RuntimeException: OOME with size 1195725856
    at kafka.network.BoundedByteBufferReceive.byteBufferAllocate(BoundedByteBufferReceive.scala:81)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:60)
    at kafka.network.Processor.read(SocketServer.scala:283)
    at kafka.network.Processor.run(SocketServer.scala:202)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
    at kafka.network.BoundedByteBufferReceive.byteBufferAllocate(BoundedByteBufferReceive.scala:77)

(1195725856 is the ASCII bytes “GET ” read as a message length - exactly what happens when something speaks HTTP at the service port, per the next slide)
Operational Considerations - Incubators Gonna Incubate

• With any incubator project, assume it will be rough around the edges
• Assume that if you point your monitoring agent at the service port, things will break
• As a general practice, measure the intended outcome of production changes
Acknowledgements

The storage models proposed were inspired by and adapted from:

http://engineering.twitter.com/2010/05/introducing-flockdb.html

https://github.com/mochi/statebox
Q&A

We’re hiring!
• Infrastructure
• Django
• Operations

Contact:
erik@urbanairship.com (that I put my email in slides is not an invitation to sell me software, so don’t do that)
@eonnen - twitter

 
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
Lucas Jellema
 
Kafka.pptx (uploaded from MyFiles SomnathDeb_PC)
Kafka.pptx (uploaded from MyFiles SomnathDeb_PC)Kafka.pptx (uploaded from MyFiles SomnathDeb_PC)
Kafka.pptx (uploaded from MyFiles SomnathDeb_PC)
somnathdeb0212
 
kafka_session_updated.pptx
kafka_session_updated.pptxkafka_session_updated.pptx
kafka_session_updated.pptx
Koiuyt1
 
Apache Kafka - Messaging System Overview
Apache Kafka - Messaging System OverviewApache Kafka - Messaging System Overview
Apache Kafka - Messaging System Overview
Dmitry Tolpeko
 
Etl, esb, mq? no! es Apache Kafka®
Etl, esb, mq?  no! es Apache Kafka®Etl, esb, mq?  no! es Apache Kafka®
Etl, esb, mq? no! es Apache Kafka®
confluent
 
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...
DevOps_Fest
 
Reducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive StreamsReducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive Streams
jimriecken
 
Ad

Recently uploaded (20)

Domino IQ – What to Expect, First Steps and Use Cases
Domino IQ – What to Expect, First Steps and Use CasesDomino IQ – What to Expect, First Steps and Use Cases
Domino IQ – What to Expect, First Steps and Use Cases
panagenda
 
Oracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI ProfessionalOracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI Professional
VICTOR MAESTRE RAMIREZ
 
Create Your First AI Agent with UiPath Agent Builder
Create Your First AI Agent with UiPath Agent BuilderCreate Your First AI Agent with UiPath Agent Builder
Create Your First AI Agent with UiPath Agent Builder
DianaGray10
 
The case for on-premises AI
The case for on-premises AIThe case for on-premises AI
The case for on-premises AI
Principled Technologies
 
7 Salesforce Data Cloud Best Practices.pdf
7 Salesforce Data Cloud Best Practices.pdf7 Salesforce Data Cloud Best Practices.pdf
7 Salesforce Data Cloud Best Practices.pdf
Minuscule Technologies
 
Introduction to Typescript - GDG On Campus EUE
Introduction to Typescript - GDG On Campus EUEIntroduction to Typescript - GDG On Campus EUE
Introduction to Typescript - GDG On Campus EUE
Google Developer Group On Campus European Universities in Egypt
 
Evaluation Challenges in Using Generative AI for Science & Technical Content
Evaluation Challenges in Using Generative AI for Science & Technical ContentEvaluation Challenges in Using Generative AI for Science & Technical Content
Evaluation Challenges in Using Generative AI for Science & Technical Content
Paul Groth
 
Developing Schemas with FME and Excel - Peak of Data & AI 2025
Developing Schemas with FME and Excel - Peak of Data & AI 2025Developing Schemas with FME and Excel - Peak of Data & AI 2025
Developing Schemas with FME and Excel - Peak of Data & AI 2025
Safe Software
 
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
Edge AI and Vision Alliance
 
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and ImplementationAI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
Christine Shepherd
 
Oracle Cloud Infrastructure AI Foundations
Oracle Cloud Infrastructure AI FoundationsOracle Cloud Infrastructure AI Foundations
Oracle Cloud Infrastructure AI Foundations
VICTOR MAESTRE RAMIREZ
 
Your startup on AWS - How to architect and maintain a Lean and Mean account
Your startup on AWS - How to architect and maintain a Lean and Mean accountYour startup on AWS - How to architect and maintain a Lean and Mean account
Your startup on AWS - How to architect and maintain a Lean and Mean account
angelo60207
 
Data Virtualization: Bringing the Power of FME to Any Application
Data Virtualization: Bringing the Power of FME to Any ApplicationData Virtualization: Bringing the Power of FME to Any Application
Data Virtualization: Bringing the Power of FME to Any Application
Safe Software
 
Azure vs AWS Which Cloud Platform Is Best for Your Business in 2025
Azure vs AWS  Which Cloud Platform Is Best for Your Business in 2025Azure vs AWS  Which Cloud Platform Is Best for Your Business in 2025
Azure vs AWS Which Cloud Platform Is Best for Your Business in 2025
Infrassist Technologies Pvt. Ltd.
 
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
Jasper Oosterveld
 
Palo Alto Networks Cybersecurity Foundation
Palo Alto Networks Cybersecurity FoundationPalo Alto Networks Cybersecurity Foundation
Palo Alto Networks Cybersecurity Foundation
VICTOR MAESTRE RAMIREZ
 
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Anish Kumar
 
Establish Visibility and Manage Risk in the Supply Chain with Anchore SBOM
Establish Visibility and Manage Risk in the Supply Chain with Anchore SBOMEstablish Visibility and Manage Risk in the Supply Chain with Anchore SBOM
Establish Visibility and Manage Risk in the Supply Chain with Anchore SBOM
Anchore
 
Mark Zuckerberg teams up with frenemy Palmer Luckey to shape the future of XR...
Mark Zuckerberg teams up with frenemy Palmer Luckey to shape the future of XR...Mark Zuckerberg teams up with frenemy Palmer Luckey to shape the future of XR...
Mark Zuckerberg teams up with frenemy Palmer Luckey to shape the future of XR...
Scott M. Graffius
 
Co-Constructing Explanations for AI Systems using Provenance
Co-Constructing Explanations for AI Systems using ProvenanceCo-Constructing Explanations for AI Systems using Provenance
Co-Constructing Explanations for AI Systems using Provenance
Paul Groth
 
Domino IQ – What to Expect, First Steps and Use Cases
Domino IQ – What to Expect, First Steps and Use CasesDomino IQ – What to Expect, First Steps and Use Cases
Domino IQ – What to Expect, First Steps and Use Cases
panagenda
 
Oracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI ProfessionalOracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI Professional
VICTOR MAESTRE RAMIREZ
 
Create Your First AI Agent with UiPath Agent Builder
Create Your First AI Agent with UiPath Agent BuilderCreate Your First AI Agent with UiPath Agent Builder
Create Your First AI Agent with UiPath Agent Builder
DianaGray10
 
7 Salesforce Data Cloud Best Practices.pdf
7 Salesforce Data Cloud Best Practices.pdf7 Salesforce Data Cloud Best Practices.pdf
7 Salesforce Data Cloud Best Practices.pdf
Minuscule Technologies
 
Evaluation Challenges in Using Generative AI for Science & Technical Content
Evaluation Challenges in Using Generative AI for Science & Technical ContentEvaluation Challenges in Using Generative AI for Science & Technical Content
Evaluation Challenges in Using Generative AI for Science & Technical Content
Paul Groth
 
Developing Schemas with FME and Excel - Peak of Data & AI 2025
Developing Schemas with FME and Excel - Peak of Data & AI 2025Developing Schemas with FME and Excel - Peak of Data & AI 2025
Developing Schemas with FME and Excel - Peak of Data & AI 2025
Safe Software
 
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
“State-space Models vs. Transformers for Ultra-low-power Edge AI,” a Presenta...
Edge AI and Vision Alliance
 
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and ImplementationAI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
Christine Shepherd
 
Oracle Cloud Infrastructure AI Foundations
Oracle Cloud Infrastructure AI FoundationsOracle Cloud Infrastructure AI Foundations
Oracle Cloud Infrastructure AI Foundations
VICTOR MAESTRE RAMIREZ
 
Your startup on AWS - How to architect and maintain a Lean and Mean account
Your startup on AWS - How to architect and maintain a Lean and Mean accountYour startup on AWS - How to architect and maintain a Lean and Mean account
Your startup on AWS - How to architect and maintain a Lean and Mean account
angelo60207
 
Data Virtualization: Bringing the Power of FME to Any Application
Data Virtualization: Bringing the Power of FME to Any ApplicationData Virtualization: Bringing the Power of FME to Any Application
Data Virtualization: Bringing the Power of FME to Any Application
Safe Software
 
Azure vs AWS Which Cloud Platform Is Best for Your Business in 2025
Azure vs AWS  Which Cloud Platform Is Best for Your Business in 2025Azure vs AWS  Which Cloud Platform Is Best for Your Business in 2025
Azure vs AWS Which Cloud Platform Is Best for Your Business in 2025
Infrassist Technologies Pvt. Ltd.
 
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
Jasper Oosterveld
 
Palo Alto Networks Cybersecurity Foundation
Palo Alto Networks Cybersecurity FoundationPalo Alto Networks Cybersecurity Foundation
Palo Alto Networks Cybersecurity Foundation
VICTOR MAESTRE RAMIREZ
 
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Anish Kumar
 
Establish Visibility and Manage Risk in the Supply Chain with Anchore SBOM
Establish Visibility and Manage Risk in the Supply Chain with Anchore SBOMEstablish Visibility and Manage Risk in the Supply Chain with Anchore SBOM
Establish Visibility and Manage Risk in the Supply Chain with Anchore SBOM
Anchore
 
Mark Zuckerberg teams up with frenemy Palmer Luckey to shape the future of XR...
Mark Zuckerberg teams up with frenemy Palmer Luckey to shape the future of XR...Mark Zuckerberg teams up with frenemy Palmer Luckey to shape the future of XR...
Mark Zuckerberg teams up with frenemy Palmer Luckey to shape the future of XR...
Scott M. Graffius
 
Co-Constructing Explanations for AI Systems using Provenance
Co-Constructing Explanations for AI Systems using ProvenanceCo-Constructing Explanations for AI Systems using Provenance
Co-Constructing Explanations for AI Systems using Provenance
Paul Groth
 

Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream Processing

  • 1. Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream Processing Surge’12 September 27, 2012 Erik Onnen @eonnen
  • 2. About Me • Director of Architecture and Development at Urban Airship • Formerly Jive Software, Liberty Mutual, Opsware, Progress • Java, C++, Python • Background in messaging systems • Contributor to ActiveMQ • Global Tibco deployments • ESB Commercial Products
  • 3. About Urban Airship • Engagement platform using location and push notifications • Analytics for delivery, conversion and influence • High precision targeting capabilities
  • 4. This Talk • How UA uses Kafka • Kafka architecture digest • Data structures and stream processing w/ Kafka • Operational considerations
  • 7. Kafka at Urban Airship “The use for activity stream processing makes Kafka comparable to Facebook's Scribe or Apache Flume... though the architecture and primitives are very different for these systems and make Kafka more comparable to a traditional messaging system.” - http://incubator.apache.org/kafka/ Sep 27, 2012 “Let’s use it for all the things” - me, 2010
  • 17. Kafka at Urban Airship • On the critical path for many of our core capabilities • Device metadata • Message delivery analytics • Device connectivity state • Feeds our operational data warehouse • Three Kafka clusters doing in aggregate > 7B msg/day • Peak capacity observed single consumer 750K msg/sec • All bare metal hardware hosted with an MSP • Factoring prominently in our multi-facility architecture
  • 18. Kafka Core Concepts - The Big Picture
  • 21. Kafka Core Concepts • Publish subscribe system (not a queue) • One producer, zero or more consumers • Consumers aren’t contending with each other for messages • Messages retained for a configured window of time • Messages grouped by topics • Consumers partition a topic as a group (see the sketch below): • 1 consumer thread - all topic messages • 2 consumer threads - each gets 1/2 of the messages • 3 consumer threads - each gets 1/3 of the messages
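A minimal sketch of the consumer-group idiom above, assuming the 0.7-era high-level consumer API that the deck's later snippet uses; property names follow the 0.7 docs, and the hosts, group and topic names are illustrative:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.message.Message;

    Properties props = new Properties();
    props.put("zk.connect", "zookeeper-0:2181"); // consumers coordinate through ZooKeeper
    props.put("groupid", "SCORE_UPDATER");       // one and only one group per consumer (0.7-era name)
    ConsumerConnector connector =
        Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

    // Ask for 2 streams (threads) on the SPORTS topic; the group splits the
    // topic's partitions across its members, so while both streams run each
    // one sees roughly half of the messages.
    Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
    topicCountMap.put("SPORTS", 2);
    Map<String, List<KafkaStream<Message>>> streams =
        connector.createMessageStreams(topicCountMap);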
  • 23. Kafka Core Concepts - Producers • Producers have no idea who will consume a message or when • Deliver messages to one and only one topic • Deliver messages to one and only one broker* • Deliver a message to one and only one partition on a broker • Messages are not ack’d in any way (not when received, not when on disk, not on a boat, not in a plane...) • Messages largely opaque to producers • Send messages at or below a configured size†
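The fire-and-forget contract above can be made concrete with a hedged 0.7-era producer sketch (API per the 0.7 quickstart; the broker connection, topic name and payload are illustrative):

    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.javaapi.producer.ProducerData;
    import kafka.producer.ProducerConfig;

    Properties props = new Properties();
    props.put("zk.connect", "zookeeper-0:2181");
    props.put("serializer.class", "kafka.serializer.StringEncoder");
    Producer<String, String> producer =
        new Producer<String, String>(new ProducerConfig(props));

    // One topic, one partition choice, no ack coming back: if the broker
    // never sees this message, the producer will never know.
    producer.send(new ProducerData<String, String>("SPORTS", "{\"event\":\"GOAL\"}"));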
  • 25. Kafka Core Concepts - Brokers • Dumb by design • No shared state • Publish small bits of metadata to ZooKeeper • Messages are pulled by consumers (no push state management) • Manage sets of segment files, one per topic + partition combination • All delivery done through sendfile calls on mmap’d files - very fast, avoids system -> user -> system copy for every send
  • 26. Kafka Core Concepts - Brokers • Nearly invisible in the grand scheme of operations if they have enough disk and RAM
  • 27. Kafka Core Concepts - Brokers • Don’t fear the JVM (just put it in a corner) • Most of the heavy lifting is done in system calls • Minimal on-heap buffering keeps most garbage in ParNew • A 20 minute sample has approximately 100 ParNew collections for a total of 0.42 seconds in GC (0.0003247526 of wall time, ~0.03%)
  • 29. Kafka Core Concepts - Consumers • Consumer configured for one and only one group • Messages are consumed in KafkaMessageStream iterators that never stop but may block • A message stream is a combination of: • Topic (SPORTS) • Group (SPORTS EVENT LOGGER | SCORE UPDATER) • Broker(s) - 1 or more brokers feed a logical stream • Partition(s) - 1 or more partitions from a broker + topic
  • 38. Kafka Is Excellent for... • Small, expressive messages - BYOD • Throughput • Decimates any JMS or AMQP servers for PubSub throughput • >70x better throughput than beanstalkd • Scales well with number of consumers, topics • Re-balance after consumer failures • Rewind in time scenarios • Allowing transient “taps” into streams of data for roughly the cost of transport
  • 45. But, Kafka Makes Critical Concessions - Brokers • Data not redundant - if a broker dies, you have to restore it to recover that data • Shore up hardware • Consume as fast as possible • Persist to shared storage or use BRDB • Upcoming replication • Segment corruption can be fatal for that topic + partition
  • 47. Kafka Critical Concessions - Consumers • Messages can be delivered out of order
  • 54. Kafka Critical Concessions - Consumers • No once and only once semantics • Consumers must correctly handle the same message multiple times • Rebalance after fail can result in redelivery • Consumer failure or unclean shutdown can result in redelivery • Possibility of out of order delivery and redelivery require idempotent, commutative consumers when dealing with systems of record
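One hedged sketch of what "idempotent, commutative" means in practice: carry an event timestamp in every message and refuse to apply anything older than what has already been applied for that key. The class and in-memory store here are hypothetical, not part of Kafka:

    /** Last-write-wins guard: replays and reordering cannot regress state. */
    public final class LastWriteWins {
      private final java.util.Map<String, Long> lastApplied =
          new java.util.HashMap<String, Long>();

      /** Apply only if this event is newer than anything seen for the key. */
      public synchronized boolean apply(String deviceId, long eventTs, Runnable mutation) {
        Long seen = lastApplied.get(deviceId);
        if (seen != null && seen >= eventTs) {
          return false; // stale or duplicate redelivery - safe to drop
        }
        mutation.run();
        lastApplied.put(deviceId, eventTs);
        return true;
      }
    }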
  • 65. Storage Patterns and Data Structures • Urban Airship uses Kafka for • Analytics • Producers write device data to Kafka • Consumers create dimensional indexes in HBase • Operational Data • Producers are services writing to Kafka • Consumers write to ODW (HBase as JSON) • Presence Data • Producers are connectivity nodes writing to Kafka • Consumers write to LevelDB
  • 69. Storage Patterns - Device Metadata { deviceId:"PONIES", tags:["BEYONCE"], timestamp:1} { deviceId:"PONIES", tags:["BEYONCE", "JAY-Z", "NICKLEBACK"], timestamp:2} { deviceId:"PONIES", tags:["BEYONCE", "JAY-Z", "NICKLEBACK"], timestamp:3}
  • 74. Storage Patterns - Device Metadata • Primitive incarnation - blast an update into a row, keyed on deviceID • RDBMS • INSERT OR UPDATE DEVICE_METADATA (ID, VALUE) VALUES (DEVICE_ID, BLOB) WHERE ID = deviceID; • Denormalize - forget joining to read tags, way too expensive
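The slide's "INSERT OR UPDATE" is pseudo-SQL; one concrete rendering of the blast-a-row write, assuming MySQL and JDBC (conn is an open java.sql.Connection, and deviceId/blob are assumed to be in scope):

    import java.sql.PreparedStatement;

    String sql = "INSERT INTO DEVICE_METADATA (ID, VALUE) VALUES (?, ?) "
               + "ON DUPLICATE KEY UPDATE VALUE = VALUES(VALUE)";
    PreparedStatement ps = conn.prepareStatement(sql);
    ps.setString(1, deviceId);
    ps.setBytes(2, blob);  // the denormalized metadata blob, tags included
    ps.executeUpdate();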
  • 81. Storage Patterns - Device Metadata • Column Store • Write k=deviceId -> c=NULL -> v= BLOB • Both • Idempotent • FAIL - mutations can arrive out of order, can be replayed • Commutative
  • 87. Storage Patterns - Device Metadata • Improved approach - leverage the timestamp of the mutation • RDBMS • INSERT OR UPDATE DEVICE_METADATA (KEY, VALUE, TS) VALUES (DEVICE_ID, BLOB, TS) WHERE ID = deviceID AND TS = TS; • Heavy-handed approach • Massive I/O on TS index or risk reading an entire block per version with no adjacent blocks
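A hedged MySQL rendering of the timestamp guard above: the row only changes when the incoming event is at least as new as the stored one. Assignment order matters here, since VALUE is evaluated against the old TS before TS itself is reassigned:

    String sql = "INSERT INTO DEVICE_METADATA (ID, VALUE, TS) VALUES (?, ?, ?) "
               + "ON DUPLICATE KEY UPDATE "
               + "VALUE = IF(VALUES(TS) >= TS, VALUES(VALUE), VALUE), "
               + "TS = IF(VALUES(TS) >= TS, VALUES(TS), TS)";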
  • 89. Storage Patterns - Device Metadata • Column Store
  • 102. Storage Patterns - Device Metadata • Column Store • Write k=deviceId -> c=INV(ts) -> v=BLOB • Reads are simple slices of one column, easy for LSM (pop the top column in the row) • No transactions required, much smaller lock footprint • Both • Idempotent • Commutative • Old versions not removed automatically • Secondary indexes very difficult
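A sketch of the inverted-timestamp layout, assuming the HBase client API of the era (HTable/Put); the table handle, column family and variable names are illustrative. Inverting the timestamp makes the newest version sort first, so a read is a one-column slice off the top of the row:

    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    // k=deviceId -> c=INV(ts) -> v=BLOB
    byte[] invTs = Bytes.toBytes(Long.MAX_VALUE - eventTs); // newest sorts first
    Put put = new Put(Bytes.toBytes(deviceId));
    put.add(Bytes.toBytes("d"), invTs, blob);
    table.put(put);  // table: an HTable opened elsewhere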
  • 111. Storage Patterns - Device Metadata • Gangnam Style - tag per column, deletions tombstoned • RDBMS - select for update and/or big txns? • Column Store • Addition k=deviceId -> c=TAG -> v=TS • Deletion k=deviceId -> c=TAG -> v=-(TS) • Cell timestamp set to event timestamp in both cases (old updates ignored) • Easy to (re)build secondary indexes, tag counts • Commutative, Idempotent and Fast
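Same assumed HBase API, sketching the tag-per-column writes above. Setting the cell timestamp to the event timestamp means reads only ever see the newest cell per tag, so stale or replayed updates are effectively ignored; a negated timestamp in the value marks the tombstone:

    // Addition: k=deviceId -> c=TAG -> v=TS
    Put addTag = new Put(Bytes.toBytes(deviceId));
    addTag.add(Bytes.toBytes("t"), Bytes.toBytes("BEYONCE"), eventTs,
        Bytes.toBytes(eventTs));
    table.put(addTag);

    // Deletion: k=deviceId -> c=TAG -> v=-(TS), tombstoning the tag
    Put delTag = new Put(Bytes.toBytes(deviceId));
    delTag.add(Bytes.toBytes("t"), Bytes.toBytes("BEYONCE"), eventTs,
        Bytes.toBytes(-eventTs));
    table.put(delTag);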
  • 112. Storage Patterns - Device Metadata
  • 122. Operational Considerations - Buffering • A message in a broker is not immediately visible to a consumer • Kafka buffers data until one of two conditions is true • log.flush.interval reached • log.default.flush.interval.ms elapsed • False latency for low throughput workloads • The smaller of the two bounds potential message loss in a hard process failure
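A hedged broker-side sketch of the two knobs above in server.properties; the values are illustrative, not recommendations:

    log.flush.interval=10000             # flush after this many messages...
    log.default.flush.interval.ms=1000   # ...or after this many milliseconds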
  • 127. Operational Considerations - The FetcherRunnable • Consumer spawns a number of FetcherRunnable threads to read from brokers • FetcherRunnable feeds messages into queues that back the KafkaMessageStream API • FetcherRunnable must remain healthy for consumers to see messages

    // consume the messages in the threads
    for (final KafkaStream<Message> stream : streams) {
      executor.submit(new Runnable() {
        public void run() {
          for (MessageAndMetadata msgAndMetadata : stream) {
            // process message (msgAndMetadata.message())
          }
        }
      });
    }
  • 132. Operational Considerations - The FetcherRunnable • A given FetcherRunnable is the lone source of data for its streams • When a FetcherRunnable dies, the streams block indefinitely

    2012-06-15 00:31:39,422 - ERROR [FetchRunnable-0:kafka.consumer.FetcherRunnable] - error in FetcherRunnable
    java.io.IOException: Connection reset by peer
      at sun.nio.ch.FileDispatcher.read0(Native Method)
      at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
      at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
      at sun.nio.ch.IOUtil.read(IOUtil.java:175)
      at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
      at kafka.utils.Utils$.read(Utils.scala:483)
      at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:53)
      at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
      at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:28)
      at kafka.consumer.SimpleConsumer.getResponse(SimpleConsumer.scala:181)
      at kafka.consumer.SimpleConsumer.liftedTree2$1(SimpleConsumer.scala:129)
      at kafka.consumer.SimpleConsumer.multifetch(SimpleConsumer.scala:119)
      at kafka.consumer.FetcherRunnable.run(FetcherRunnable.scala:63)
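One hedged way to keep a dead FetcherRunnable from silently stalling a consumer: set consumer.timeout.ms so the stream iterator throws instead of blocking forever, and treat the timeout as a liveness probe (the timeout value and the handling are illustrative):

    props.put("consumer.timeout.ms", "30000"); // added to the consumer Properties

    try {
      for (MessageAndMetadata msgAndMetadata : stream) {
        // process msgAndMetadata.message()
      }
    } catch (kafka.consumer.ConsumerTimeoutException e) {
      // no message in 30s: check offsets/lag, alert, or restart the consumer
    }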
  • 138. Operational Considerations - Rate is King • MONITOR YOUR CONSUMPTION RATES • Kafka JMX Beans • Application metrics for specific consumption behaviors (use Yammer Timer metrics) • Understand what “normal” is, alert when you are out of that band by some tolerance • Not overcommitting consumers helps - nobody is idle
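A sketch of the application-side rate metric, assuming Yammer metrics-core 2.x as the slide suggests; the owning class and metric name are illustrative:

    import com.yammer.metrics.Metrics;
    import com.yammer.metrics.core.Timer;
    import com.yammer.metrics.core.TimerContext;
    import java.util.concurrent.TimeUnit;

    Timer handleTimer = Metrics.newTimer(EventConsumer.class, "message-handle",
        TimeUnit.MILLISECONDS, TimeUnit.SECONDS);  // EventConsumer is hypothetical

    TimerContext ctx = handleTimer.time();
    try {
      // process one message
    } finally {
      ctx.stop(); // yields rates and latency percentiles; alert outside the normal band
    }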
  • 144. Operational Considerations - The Retention Window • Data written to a segment file on a broker (topic + partition) • Every consumer group has a relative offset within a segment • Individual consumers move the offset and store to ZooKeeper on a regular interval • Segments are retained for log.retention.hours • Segments deleted when outside retention window
  • 153. Operational Considerations - The Retention Window • Consumers update offsets in ZooKeeper • Monitor them and make sure they’re progressing • Look for skew in rate of change between partition offsets • Monitoring consumption rate can also help
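A hedged monitoring sketch using zkclient (the same library the consumer itself uses, per the stack trace below): walk a group's offset nodes and check that they advance without skew. The /consumers/[group]/offsets/[topic]/[broker-partition] layout follows the 0.7 docs; the group and topic names are illustrative, and the ZkClient is assumed to be configured with a plain-string serializer, since Kafka stores offsets as strings:

    import org.I0Itec.zkclient.ZkClient;

    ZkClient zk = new ZkClient("zookeeper-0:2181"); // assume a String serializer
    String base = "/consumers/SCORE_UPDATER/offsets/SPORTS";
    for (String brokerPartition : zk.getChildren(base)) {
      String offset = zk.readData(base + "/" + brokerPartition);
      System.out.println(brokerPartition + " -> " + offset);
      // graph these over time; alert when an offset stalls or partitions skew
    }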
  • 154. Operational Considerations - Scala “Reading that Scala stack trace sure was easy” - Nobody Ever
  • 155. Operational Considerations - Scala

    2012-07-04 11:49:08,469 - WARN [ZkClient-EventThread-132-zookeeper-0:2181,zookeeper-1:2181,zookeeper-2:2181:org.I0Itec.zkclient.ZkEventThread] - Error handling event ZkEvent[Children of /brokers/topics/SEND_EVENTS changed sent to kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener@43d248b4]
    java.lang.NullPointerException
      at scala.util.parsing.combinator.Parsers$NoSuccess.<init>(Parsers.scala:131)
      at scala.util.parsing.combinator.Parsers$Failure.<init>(Parsers.scala:158)
      at scala.util.parsing.combinator.Parsers$$anonfun$acceptIf$1.apply(Parsers.scala:489)
      at scala.util.parsing.combinator.Parsers$$anonfun$acceptIf$1.apply(Parsers.scala:487)
      at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
      at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
      at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
    ... (~50 lines elided)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
      at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
      at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
      at scala.util.parsing.combinator.Parsers$Success.flatMapWithNext(Parsers.scala:113)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:200)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:200)
      at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:200)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:200)
      at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:203)
      at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:208)
      at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:208)
      at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:182)
      at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:742)
      at scala.util.parsing.json.JSON$.parseRaw(JSON.scala:71)
      at scala.util.parsing.json.JSON$.parseFull(JSON.scala:85)
  • 163. Operational Considerations - Brokers • Monitor IOPS and IOUtil • Under no circumstances allow a broker to run out of disk space (don’t even get close) • fetch.size - amount of data a consumer will pull • max.message.size - largest message a producer can submit to a broker • Broker enforces neither of these prior to v0.8 :( • KAFKA-490 • KAFKA-247
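A hedged client-side properties sketch for the two sizes above; keep fetch.size comfortably larger than max.message.size, since pre-0.8 brokers enforce neither (values illustrative):

    fetch.size=1048576         # consumer: bytes pulled per fetch request
    max.message.size=1000000   # producer: largest message it will attempt to send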
  • 165. Operational Considerations - Brokers

    2012-06-15 04:47:35,632 - ERROR [FetchRunnable-2:kafka.consumer.FetcherRunnable] - error in FetcherRunnable for RN-OL:3-22
    kafka.common.InvalidMessageSizeException: invalid message size:152173251 only received bytes:307196 at 0
    possible causes (1) a single message larger than the fetch size; (2) log corruption
      at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:75)
      at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:61)
      at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:58)
      at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:50)
      at kafka.message.ByteBufferMessageSet.validBytes(ByteBufferMessageSet.scala:49)
      at kafka.consumer.PartitionTopicInfo.enqueue(PartitionTopicInfo.scala:70)
      at kafka.consumer.FetcherRunnable$$anonfun$run$3.apply(FetcherRunnable.scala:80)
      at kafka.consumer.FetcherRunnable$$anonfun$run$3.apply(FetcherRunnable.scala:66)
      at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
      at scala.collection.immutable.List.foreach(List.scala:45)
      at kafka.consumer.FetcherRunnable.run(FetcherRunnable.scala:66)
  • 175. Operational Considerations - Consumers • Consumer tuning is an art • Overcommit - more threads than partitions • Idling (often entire consumer processes) • Excessive rebalancing • Undercommit - fewer threads than partitions • Serial fetchers won’t keep up depending on workload • Big GCs can cause rebalancing • Just right - 2 partitions / consumer thread ratio • Mostly pivots on consumer workload (i.e. latency)
  • 180. Operational Considerations - Incubators Gonna Incubate • Deployed in some large installations • Largely learning in production • Hasn’t lived through a long lineage of people being mean to it or using in anger

    2012-06-15 04:25:00,774 - ERROR [kafka-processor-3:Processor@215] - java.lang.RuntimeException: OOME with size 1195725856
    java.lang.RuntimeException: OOME with size 1195725856
      at kafka.network.BoundedByteBufferReceive.byteBufferAllocate(BoundedByteBufferReceive.scala:81)
      at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:60)
      at kafka.network.Processor.read(SocketServer.scala:283)
      at kafka.network.Processor.run(SocketServer.scala:202)
      at java.lang.Thread.run(Thread.java:662)
    Caused by: java.lang.OutOfMemoryError: Java heap space
      at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
      at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
      at kafka.network.BoundedByteBufferReceive.byteBufferAllocate(BoundedByteBufferReceive.scala:77)
  • 186. Operational Considerations - Incubators Gonna Incubate • With any incubator project, assume it will be rough around the edges • Assume that if you point your monitoring agent at the service port, things will break • As a general practice, measure the intended outcome of production changes
  • 187. Acknowledgements The storage models proposed were inspired by and adapted from: http://engineering.twitter.com/2010/05/introducing-flockdb.html https://github.com/mochi/statebox
  • 188. Q&A We’re hiring! • Infrastructure • Django • Operations Contact: [email protected] (that I put my email in slides is not an invitation to sell me software so don’t do that) @eonnen - twitter

Editor's Notes

  • #24: At any time, a stream may be reading from 4 partitions on 3 brokers
  • #25-32: Bring your own data
  • #78-86: could use ttl for removing old versions but set to what?
  • #98-103: Interval == number of messages; message loss potential in the event of hard process fail
  • #108: When the runnable dies, the consumers will idle
  • #139: 145MB vs. 300K
  • #140-148: Balancing feedback loop herd and consumer GC loses ZK lease
  • #153-157: 1195725856 was the beginning of a GET /all request to what should have been our monitoring port