
Best Practices for Stream Processing with GridGain and Apache Ignite and Kafka

Alexey Kukushkin, Professional Services
Rob Meyer, Outbound Product Management
Agenda
• Why we need Kafka/Confluent-Ignite/GridGain integration
• Ignite/GridGain Kafka/Confluent Connectors
• Deployment, monitoring and management
• Integration Examples
• Performance and scalability tuning
• Q&A
Why we need Kafka/Confluent-Ignite/GridGain integration
Apache Kafka and Confluent

A distributed streaming
platform:
• Publish/subscribe
• Scalable
• Fault-tolerant
• Real-time
• Persistent
• Written mostly in Scala
GridGain In-Memory Computing Platform
• Built on Apache Ignite
• Comprehensive platform that supports all projects
• No rip and replace
• In-memory speed, petabyte scale
• Enables HTAP, streaming analytics and continuous learning
[Diagram: existing applications and new applications, analytics, streaming and machine learning on top of the In-Memory Data Grid, In-Memory Database, Streaming Analytics and Continuous Learning components]
GridGain In-Memory Computing Platform
• What GridGain adds:
• Production-ready releases
• Enterprise-grade security, deployment and management
• Global support and services
• Proven for mission-critical apps
[Diagram: integration with RDBMS, NoSQL and Hadoop data sources]
Streaming Analytics, Machine and Deep Learning
[Diagram: stream ingestion (Kafka, Camel, Spark, Storm, JMS, MQTT, …) into memory-centric storage with native persistence and 3rd-party persistence (RDBMS, NoSQL, Hadoop); stream processing with SQL, transactions and compute; machine and deep learning with continuous learning; analytics and decision automation via ODBC/JDBC, Spark (DataFrame, RDD, HDFS), Java, .NET, R and Python (1); outbound delivery to Kafka, Camel, Spark, Storm, JMS, MQTT, …]
1. R and Python developers currently invoke Java classes. Direct R and Python support planned.
Ignite/GridGain & Kafka Integration
• Kafka is commonly used as a
messaging backbone in a
heterogeneous system
• Add Ignite/GridGain to a
Kafka-based system
https://www.imcsummit.org/2018/eu/session/embracing-service-consumption-shift-banking
Ignite and GridGain Kafka Connectors
Developing Kafka Consumers & Producers
You can develop a Kafka integration for any system using the Kafka Producer and Consumer APIs (see the sketch after this list), but you need to solve problems like:
• How to use each API of every producer and consumer
• How Kafka will understand your data
• How data will be converted between producers and consumers
• How to scale the producer-to-consumer flow
• How to recover from a failure
• … and many more
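As an illustration of what the connectors save you from, below is a minimal sketch of a hand-written producer using the plain Kafka Producer API. The topic name, key and ad-hoc JSON payload are assumptions for illustration only; every concern listed above (data format, schema, conversion, scaling, failure recovery) is left for you to solve by hand:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class HandWrittenQuoteProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // The data representation (ad-hoc JSON) and topic name are arbitrary choices
        // the developer has to make and keep consistent across all consumers.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("quotes", "1", "{\"id\":1,\"price\":1.0}"));
            producer.flush();
        }
    }
}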
GridGain-Kafka Connector: Out-of-the-box Integration

• Addresses all the integration challenges using best practices


• Does not need any coding even in the most complex integrations
• Developed by GridGain/Ignite Community with help from Confluent
to ensure both Ignite and Kafka best practices
• Based on Kafka Connect and Ignite APIs
• Kafka Connect API encourages design for scalability, failover and data schema
• GridGain Source Connector uses Ignite Continuous Queries
• GridGain Sink Connector uses Ignite Data Streamer
Kafka Source and Sink Connectors
Kafka Connect Server Types
In general, there are 4
separate clusters in Kafka
Connect infrastructure:
• Kafka cluster
• cluster nodes called Brokers
• Kafka Connect cluster
• cluster nodes called Workers
• Source and Sink
GridGain/Ignite clusters
• Server Nodes
GridGain Connector Features
Two connectors independent from each other:
• GridGain Source Connector
• streams data from GridGain into Kafka
• uses Ignite continuous queries
• GridGain Sink Connector
• streams data from Kafka into GridGain
• uses Ignite data streamers
GridGain Source Connector: Scalability
Kafka Source Connector Model:

Scales by assigning multiple source partitions to Kafka Connect tasks.


For GridGain Source Connector:
• Partition = Cache
• Record = Cache Entry
GridGain Source Connector: Rebalancing and Failover
Rebalancing: re-assignment of Kafka Connectors and Tasks to Workers when
• A Worker joins or leaves the cluster
• A cache is added or removed
Failover: resuming operation after a failure
• How to resume after a failure or rebalancing without losing cache updates that occurred while the Kafka Connect Worker was down?
Source Offset: position in the source stream. Kafka Connect:
• provides persistent and distributed source offset storage
• automatically saves the last committed offset
• allows resuming from the last offset without losing data
Problem: caches have no offsets!
GridGain Source Connector: Failover Policies
• None: no source offset saved, start listening to current data after restart
• Cons: updates that occurred during downtime are lost (the “at least once” delivery guarantee is violated)
• Pros: fastest
• Full Snapshot: no source offset saved, always pull all data from the cache upon startup
• Cons:
• Slow, not applicable to big caches
• Duplicate data (the “exactly once” delivery guarantee is violated)
• Pros: no data is lost
GridGain Source Connector: Failover Policies
• Backlog: resume from the last committed source offset
• Kafka Backlog cache in Ignite
• Key: incremental offset
• Value: cache name and serialized cache entries
• Kafka Backlog service in Ignite
• Runs continuous queries pulling data from source caches into the Backlog
• Source Connector reads data from the Backlog starting from the last committed offset
• Cons:
• Intrusive: impacts the GridGain cluster
• Complex configuration: need to estimate the amount of memory for the Backlog
GridGain Source Connector: Dynamic Reconfiguration
The Connector monitors the list of available caches and re-configures itself if a cache is added or removed.
Use the cacheWhitelist and cacheBlacklist properties to define which caches to pull data from.
GridGain Source Connector: Initial Data Load
Use the shallLoadInitialData configuration property to specify whether the Connector should load data that is already in the cache when the Connector starts (see the sketch below).
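To illustrate, a source connector configuration combining these properties might look like the following sketch. The cacheWhitelist and shallLoadInitialData names come from the slides above; the connector class name, igniteCfg path and topic prefix are assumptions, so check the connector documentation for the exact values:

# Minimal sketch of a GridGain source connector configuration.
# cacheWhitelist and shallLoadInitialData are the properties named above;
# connector.class, igniteCfg and topicPrefix are assumptions for illustration.
name=gridgain-marketdata-source
connector.class=org.gridgain.kafka.source.IgniteSourceConnector
tasks.max=10
igniteCfg=/tmp/gridgain-h2-connect/ignite-client-source.xml
topicPrefix=gg-
cacheWhitelist=QUOTES,TRADES
shallLoadInitialData=true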
GridGain Sink Connector
• Sink Connectors are inherently scalable since consuming data from a
Kafka topic is scalable
• Sink Connectors inherently support failover thanks to the Kafka Connect framework auto-committing offsets of the pushed data.
GridGain Connector Data Schema
Both Source and Sink GridGain Connectors support data schema.
• Allows GridGain Connectors to understand data with an attached schema from other Kafka producers and consumers
• Source Connector attaches Kafka schema built from Ignite Binary objects
• Sink Connector converts Kafka records to Ignite Binary objects using Kafka
schema
Limitations:
• Ignite Annotations are not supported
• Ignite CHAR converted to Kafka SHORT (same for arrays)
• Ignite UUID and CLASS converted to Kafka STRING (same for arrays)
Ignite Connector Features
• Ignite Source Connector
• pushes data from Ignite into Kafka
• uses Ignite Events
• must enable EVT_CACHE_OBJECT_PUT, which negatively impacts
cluster performance
• Ignite Sink Connector
• pulls data from Kafka into Ignite
• uses Ignite data streamer
Apache Ignite vs. GridGain Connectors
• Scalability
• Apache Ignite Connector: Limited (the source connector is not parallel; the sink connector is parallel)
• GridGain Connector: The source connector creates a task per cache; the sink connector is parallel
• Failover
• Apache Ignite Connector: NO (source data is lost during connector restart or rebalancing)
• GridGain Connector: YES (the source connector can be configured to resume from the last committed offset)
• Preserving source data schema
• Apache Ignite Connector: NO
• GridGain Connector: YES
• Handling multiple caches
• Apache Ignite Connector: NO
• GridGain Connector: YES (the connector can be configured to handle any number of caches)
• Dynamic Reconfiguration
• Apache Ignite Connector: NO
• GridGain Connector: YES (the source connector detects added or removed caches and re-configures itself)
Apache Ignite vs. GridGain Connectors
• Initial Data Load
• Apache Ignite Connector: NO
• GridGain Connector: YES
• Handling data removals
• Apache Ignite Connector: YES
• GridGain Connector: YES
• Serialization and deserialization of data
• Apache Ignite Connector: YES
• GridGain Connector: YES
• Filtering
• Apache Ignite Connector: Limited (only the source connector supports a filter)
• GridGain Connector: YES (both source and sink connectors support filters)
• Transformations
• Apache Ignite Connector: Kafka SMTs
• GridGain Connector: Kafka SMTs
Apache Ignite vs. GridGain Connectors
• DevOps
• Apache Ignite Connector: Some free-text error logging
• GridGain Connector: Health Model defined
• Support
• Apache Ignite Connector: Apache Ignite Community
• GridGain Connector: Supported by GridGain, certified by Confluent
• Packaging
• Apache Ignite Connector: Uber JAR
• GridGain Connector: Connector Package
• Deployment
• Apache Ignite Connector: Plugin path on all Kafka Connect workers
• GridGain Connector: Plugin path on all Kafka Connect workers; CLASSPATH on all GridGain nodes
• Kafka API Version
• Apache Ignite Connector: 0.10
• GridGain Connector: 2.0
• Source API
• Apache Ignite Connector: Ignite events
• GridGain Connector: Ignite continuous queries
• Sink API
• Apache Ignite Connector: Ignite data streamer
• GridGain Connector: Ignite data streamer
Deployment, monitoring and management
GridGain Connector Deployment
1. Prepare Connector Package
2. Register Connector with Kafka
3. Register Connector with GridGain
Prepare GridGain Connector Package
1. GridGain-Kafka Connector is part of GridGain Enterprise and Ultimate 8.5.3 (to be released at the end of October 2018)
2. The connector is in
$GRIDGAIN_HOME/integration/gridgain-kafka-connect
• (GRIDGAIN_HOME environment variable points to the root GridGain
installation directory)
3. Pull missing connector dependencies into the package:
cd $GRIDGAIN_HOME/integration/gridgain-kafka-connect
./copy-dependencies.sh
Register GridGain Connector with Kafka
For every Kafka Connect Worker:
1. Copy GridGain Connector package directory to where you want
Kafka Connectors to be located
for example, into /opt/kafka/connect directory
2. Edit Kafka Connect Worker configuration (kafka-connect-
standalone.properties or kafka-connect-distributed.properties) to
register the connector on the plugin path:
plugin.path=/opt/kafka/connect/gridgain-kafka-connect
Register GridGain Connector with GridGain
This assumes GridGain version is 8.5.3
On every GridGain server node copy the JARs listed below into the $GRIDGAIN_HOME/libs/user directory (see the example after the list). Get the Kafka JARs from the Kafka Connect workers:
• gridgain-kafka-connect-8.5.3.jar
• connect-api-2.0.0.jar
• kafka-clients-2.0.0.jar
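As an example, copying the JARs on a GridGain node might look like the sketch below; the source paths and the Kafka Connect worker host name are assumptions, so adjust them to your installation:

# Connector JAR shipped with GridGain (source path is illustrative)
cp $GRIDGAIN_HOME/integration/gridgain-kafka-connect/gridgain-kafka-connect-8.5.3.jar \
   $GRIDGAIN_HOME/libs/user/
# Kafka JARs taken from a Kafka Connect worker (host and path are illustrative)
scp kafka-worker-1:/opt/kafka/libs/connect-api-2.0.0.jar $GRIDGAIN_HOME/libs/user/
scp kafka-worker-1:/opt/kafka/libs/kafka-clients-2.0.0.jar $GRIDGAIN_HOME/libs/user/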
Ignite Connector Deployment
1. Prepare Connector Package
2. Register Connector with Kafka
Prepare Ignite Connector Package
This assumes Ignite version 2.6.
Create a directory containing the JARs listed below (find the JARs in the $IGNITE_HOME/libs sub-directories):
• ignite-kafka-connect-0.10.0.1.jar
• ignite-core-2.6.0.jar
• ignite-spring-2.6.0.jar
• cache-api-1.0.0.jar
• spring-aop-4.3.16.RELEASE.jar
• spring-beans-4.3.16.RELEASE.jar
• spring-context-4.3.16.RELEASE.jar
• spring-core-4.3.16.RELEASE.jar
• spring-expression-4.3.16.RELEASE.jar
• commons-logging-1.1.1.jar
Register Ignite Connector with Kafka
For every Kafka Connect Worker:
1. Copy Ignite Connector package directory to where you want Kafka
Connectors to be located
for example, into /opt/kafka/connect directory
2. Edit Kafka Connect Worker configuration (kafka-connect-
standalone.properties or kafka-connect-distributed.properties) to
register the connector on the plugin path:
plugin.path=/opt/kafka/connect/ignite-kafka-connect
Monitoring: GridGain Connector
Well-defined Health Model:
• A numeric Event ID uniquely identifies a specific problem
• Event severity
• Problem descriptions and recovery actions are available at
https://docs.gridgain.com/docs/certified-kafka-connector-monitoring

Configure your monitoring system to detect event IDs in the logs and, optionally, run automated recovery as defined in the Health Model (a minimal detection sketch follows the sample entry).
• Sample structured log entry (# is used as the delimiter):
09-10-2018 19:57:35 # ERROR # 15000 # Spring XML configuration path is invalid: /invalid/path/ignite.xml
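A minimal detection sketch, assuming the Kafka Connect worker logs go to /var/log/kafka-connect/connect.log (the log path and the alerting mechanism are assumptions; a real deployment would feed the same pattern into its monitoring system):

# Print the event ID and message of every ERROR-severity connector event.
# Fields in the '#'-delimited structured entries: timestamp # severity # event id # message
grep ' # ERROR # ' /var/log/kafka-connect/connect.log \
  | awk -F ' # ' '{ print "event id:", $3, "-", $4 }'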
Monitoring: Ignite Connector
No Health Model is defined.
1. Run negative tests
2. Check Kafka and Ignite log output
3. Configure your monitoring system to detect corresponding text
patterns in the logs
Integration Examples
Propagating RDBMS updates into GridGain
Propagating RDBMS updates into GridGain
Ignite/GridGain has a 3rd Party Persistence feature (Cache Store) that
allows:
• Propagating cache changes to external storage like RDBMS
• Automatically copying data from external storage to Ignite when data that is missing in Ignite is accessed
What if you want to propagate an external storage change to Ignite at the moment of the change? 3rd Party Persistence cannot do that!
Propagating RDBMS updates into GridGain
Use Kafka to achieve that without writing a single line of code!
Assumptions
• For simplicity we will run everything on the same host
• In distributed mode GridGain nodes, Kafka Connect workers and Kafka
brokers are running on different hosts
• GridGain 8.5.3 cluster with GRIDGAIN_HOME variable set on the
nodes
• Kafka 2.0 cluster with KAFKA_HOME variable set on all brokers
1. Run DB Server
We will use H2 Database in this demo.
We will use /tmp/gridgain-h2-connect as a work directory.
• Download H2 and set H2_HOME environment variable.
• Run H2 Server:
java -cp $H2_HOME/bin/h2*.jar org.h2.tools.Server -webPort 18082 -tcpPort 19092
TCP server running at tcp://172.25.4.74:19092 (only local connections)
PG server running at pg://172.25.4.74:5435 (only local connections)
Web Console server running at http://172.25.4.74:18082 (only local connections)

• In the H2 Web Console that opens, specify the JDBC URL:
jdbc:h2:/tmp/gridgain-h2-connect/marketdata
• Press Connect
2. Create DB Tables and Add Some Data
In H2 Web Console Execute:
CREATE TABLE IF NOT EXISTS QUOTES (id int, date_time timestamp, price double, PRIMARY KEY (id));
CREATE TABLE IF NOT EXISTS TRADES (id int, symbol varchar, PRIMARY KEY (id));
INSERT INTO TRADES (id, symbol) VALUES (1, 'IBM');
INSERT INTO QUOTES (id, date_time, price) VALUES (1, CURRENT_TIMESTAMP(), 1.0);
3. Start GridGain Cluster (Single-node)
Create the /tmp/gridgain-h2-connect/ignite-server.xml configuration file:
<bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
<property name="addresses">
<list>
<value>127.0.0.1:47500</value>
</list>
</property>
</bean>
</property>
</bean>
</property>
</bean>
$GRIDGAIN_HOME/bin/ignite.sh /tmp/gridgain-h2-connect/ignite-server.xml
[15:41:15] Ignite node started OK (id=b9963f9a)
[15:41:15] Topology snapshot [ver=1, servers=1, clients=0, CPUs=8, offheap=3.2GB, heap=1.0GB]
[15:41:15] ^-- Node [id=B9963F9A-8F1E-4177-9743-F129414EB133, clusterState=ACTIVE]
4. Deploy Source and Sink Connectors
• Download the Confluent JDBC Connector package from
https://www.confluent.io/connector/kafka-connect-jdbc/
• Unzip Confluent JDBC Connector package into /tmp/gridgain-h2-
connect/confluentinc-kafka-connect-jdbc
• Copy GridGain Connector package from $GRIDGAIN_HOME/integration/gridgain-
kafka-connect into /tmp/gridgain-h2-connect/gridgain-kafka-connect
• Copy kafka-connect-standalone.properties Kafka worker configuration file from
$KAFKA_HOME/config into /tmp/gridgain-h2-connect and set the plugin path
property:
plugin.path=/tmp/gridgain-h2-connect/confluentinc-kafka-connect-jdbc,/tmp/gridgain-h2-connect/gridgain-kafka-connect
5. Start Kafka Cluster (Single-broker)
• Configure Zookeeper with /tmp/gridgain-h2-connect/zookeeper.properties:
dataDir=/tmp/gridgain-h2-connect/zookeeper
clientPort=2181
• Start Zookeeper:
$KAFKA_HOME/bin/zookeeper-server-start.sh /tmp/gridgain-h2-
connect/zookeeper.properties
• Configure Kafka broker: copy the default $KAFKA_HOME/config/server.properties to
/tmp/gridgain-h2-connect/kafka-server.properties and customize it:
broker.id=0
listeners=PLAINTEXT://:9092
log.dirs=/tmp/gridgain-h2-connect/kafka-logs
zookeeper.connect=localhost:2181
• Start Kafka broker:
$KAFKA_HOME/bin/kafka-server-start.sh /tmp/gridgain-h2-
connect/kafka-server.properties
[2018-10-10 16:11:21,573] INFO Kafka version : 2.0.0
(org.apache.kafka.common.utils.AppInfoParser)
[2018-10-10 16:11:21,573] INFO Kafka commitId : 3402a8361b734732
(org.apache.kafka.common.utils.AppInfoParser)
[2018-10-10 16:11:21,574] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)
6. Configure Source JDBC Connector
/tmp/gridgain-h2-connect/kafka-connect-h2-source.properties:

name=h2-marketdata-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=10

connection.url=jdbc:h2:tcp://localhost:19092//tmp/gridgain-h2-connect/marketdata
table.whitelist=quotes,trades

mode=timestamp+incrementing
timestamp.column.name=date_time
incrementing.column.name=id

topic.prefix=h2-
7. Configure Sink GridGain Connector
/tmp/gridgain-h2-connect/kafka-connect-gridgain-sink.properties:

name=gridgain-marketdata-sink
topics=h2-QUOTES,h2-TRADES
tasks.max=10
connector.class=org.gridgain.kafka.sink.IgniteSinkConnector

igniteCfg=/tmp/gridgain-h2-connect/ignite-client-sink.xml
topicPrefix=h2-
8. Start Kafka-Connect Cluster (Single-worker)
$KAFKA_HOME/bin/connect-standalone.sh \
/tmp/gridgain-h2-connect/kafka-connect-standalone.properties \
/tmp/gridgain-h2-connect/kafka-connect-h2-source.properties \
/tmp/gridgain-h2-connect/kafka-connect-gridgain-sink.properties

[2018-10-10 16:52:21,618] INFO Created connector h2-marketdata-source
(org.apache.kafka.connect.cli.ConnectStandalone:104)
[2018-10-10 16:52:22,254] INFO Created connector gridgain-marketdata-sink
(org.apache.kafka.connect.cli.ConnectStandalone:104)
9. See Caches Created in GridGain
Open the GridGain Web Console Monitoring Dashboard at
https://console.gridgain.com/monitoring/dashboard and see that the GridGain
Sink Connector created the QUOTES and TRADES caches:
10. See Initial H2 Data in GridGain
Open GridGain Web Console Queries page and run Scan queries for
QUOTES and TRADES:
11. Update H2 Tables
In H2 Web Console Execute:
INSERT INTO TRADES (id, symbol) VALUES (2, 'INTL');
INSERT INTO QUOTES (id, date_time, price) VALUES (2, CURRENT_TIMESTAMP(), 2.0);
12. See Realtime H2 Data in GridGain
Open GridGain Web Console Queries page and run Scan queries for
QUOTES and TRADES:
Performance and scalability tuning
Disable Processing of Updates
For performance reasons, the Sink Connector does not update existing cache entries by default.
Set the shallProcessUpdates configuration setting to true to make the Sink Connector update existing entries, as shown in the sketch below.
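A minimal sketch of enabling update processing in the sink connector configuration; every property except shallProcessUpdates is taken from the demo configuration earlier in this deck:

name=gridgain-marketdata-sink
connector.class=org.gridgain.kafka.sink.IgniteSinkConnector
topics=h2-QUOTES,h2-TRADES
igniteCfg=/tmp/gridgain-h2-connect/ignite-client-sink.xml
topicPrefix=h2-
# Process updates to existing cache entries (disabled by default for performance)
shallProcessUpdates=true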
Disable Dynamic Schema
The Source Connector caches key and value schemas.
• The schemas are created when the first cache entry is pulled and are re-used for all subsequent entries.
This works only if the schemas never change.
• Set isSchemaDynamic to true to support schema changes.
Consider Disabling Schema
The Source Connector does not generate schemas if the isSchemaless configuration setting is true.
Disabling schemas significantly improves performance (see the sketch below).
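A sketch of the two schema-related tuning settings in a source connector configuration; the property names come from the slides above, and the rest of the file would look like the earlier source connector sketch:

# Leave dynamic schema support off for best performance; set it to true only
# if your cache entry schemas change over time.
isSchemaDynamic=false
# Skipping schema generation entirely gives the biggest performance gain,
# provided downstream consumers do not need a schema.
isSchemaless=true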
Carefully Choose Failover Policy
• Can you tolerate losing data? Use None.
• Are the caches small (e.g. reference data caches)? Use Full Snapshot.
• Otherwise, use Backlog (see the sketch after this list).
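For illustration only: the policy is selected in the source connector configuration, but these slides do not show the exact property name or values, so treat the line below as an assumption and verify it against the GridGain connector documentation:

# Assumed property name and value; verify against the connector documentation.
failoverPolicy=Backlog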
Plan Kafka Connect Backlog Capacity
Only the Backlog failover policy supports both “at least once” and “exactly once” delivery guarantees.
The GridGain Source Connector creates the Backlog in the “kafka-connect” memory region, which requires capacity planning to avoid losing data to eviction (unless persistence is enabled).
Consider the worst-case scenario:
• Maximum Kafka Connect worker downtime allowed in your system
• Peak traffic
Multiply peak traffic by the maximum downtime to estimate the “kafka-connect” data region size, as in the example below.
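A worked example with illustrative numbers: at a peak of 10 MB/s of cache updates and a maximum tolerated worker downtime of 10 minutes, the Backlog must hold roughly 10 MB/s × 600 s = 6 GB, plus headroom. A minimal sketch of sizing such a region on the GridGain server nodes follows; it assumes the “kafka-connect” region is configured like any other Ignite data region, so verify the region name and whether the connector creates it for you:

<!-- Goes inside the IgniteConfiguration bean on the server nodes -->
<property name="dataStorageConfiguration">
  <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
    <property name="dataRegionConfigurations">
      <list>
        <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
          <!-- Region name taken from the slide; verify in the connector docs -->
          <property name="name" value="kafka-connect"/>
          <!-- 8 GB: 6 GB estimated backlog plus headroom (illustrative) -->
          <property name="maxSize" value="#{8L * 1024 * 1024 * 1024}"/>
        </bean>
      </list>
    </property>
  </bean>
</property>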
Q&A
Thank you!
