Cassandra 2.2 and 3.0
new features
DuyHai DOAN
Apache Cassandra Technical Evangelist
#VoxxedBerlin @doanduyhai
DataStax
•  Founded in April 2010
•  We contribute a lot to Apache Cassandra™
•  400+ customers (25 of the Fortune 100), 450+ employees
•  Headquartered in the San Francisco Bay Area
•  EU headquarters in London, offices in France and Germany
•  DataStax Enterprise = OSS Cassandra + extra features
Materialized Views (MV)
•  Why?
•  Detailed implementation
•  Gotchas
Why Materialized Views?
•  Relieve the pain of manual denormalization
CREATE TABLE user(
id int PRIMARY KEY,
country text,
…
);
CREATE TABLE user_by_country(
country text,
id int,
…,
PRIMARY KEY(country, id)
);
Materialized View In Action
Manual denormalization:
CREATE TABLE user_by_country (
country text,
id int,
firstname text,
lastname text,
PRIMARY KEY(country, id));
With a materialized view:
CREATE MATERIALIZED VIEW user_by_country
AS SELECT country, id, firstname, lastname
FROM user
WHERE country IS NOT NULL AND id IS NOT NULL
PRIMARY KEY(country, id);
Materialized View Syntax
CREATE MATERIALIZED VIEW [IF NOT EXISTS]
keyspace_name.view_name
AS SELECT column1, column2, ...
FROM keyspace_name.table_name
WHERE column1 IS NOT NULL AND column2 IS NOT NULL ...
PRIMARY KEY(column1, column2, ...)
•  SELECT: must select all primary key columns of the base table
•  WHERE: only IS NOT NULL conditions for now, more complex conditions in the future
•  PRIMARY KEY: at least all primary key columns of the base table (ordering can be different)
•  PRIMARY KEY: at most 1 column that is NOT part of the base table primary key
Materialized Views Demo
Materialized View Impl
[Diagram: ring of 8 C* nodes receiving the update below]
UPDATE user
SET country='FR'
WHERE id=1
① Send mutation to all replicas, wait for ack(s) according to the CL
② Acquire local lock on base table partition
③ Local read to fetch current values
SELECT * FROM user WHERE id=1
④ Create BatchLog with
•  DELETE FROM user_by_country
WHERE country = 'old_value'
•  INSERT INTO
user_by_country(country, id, …)
VALUES('FR', 1, ...)
⑤ Execute async BatchLog to paired view replica with CL = ONE
⑥ Apply base table update locally
SET country='FR'
⑦ Release local lock
⑧ Return ack to coordinator
⑨ If CL ack(s) received, ack client
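The replica-side steps ② to ⑧ can be sketched with plain Python dictionaries. This is only an illustrative model, not Cassandra code: `base_table`, `view_table`, `partition_locks`, `view_mutations`, `apply_to_view` and `update_country` are hypothetical names, and the "batchlog" is reduced to a list of mutations applied in process.

```python
import threading

# Hypothetical in-memory stand-ins for one base replica and its paired view replica.
base_table = {1: {"country": "UK", "firstname": "John", "lastname": "Doe"}}
view_table = {("UK", 1): {"firstname": "John", "lastname": "Doe"}}
partition_locks = {}  # one lock per base partition key


def view_mutations(user_id, old_row, new_row):
    """Step 4: derive the DELETE + INSERT pair for user_by_country."""
    mutations = []
    if old_row is not None and old_row["country"] != new_row["country"]:
        mutations.append(("DELETE", (old_row["country"], user_id), None))
    payload = {k: v for k, v in new_row.items() if k != "country"}
    mutations.append(("INSERT", (new_row["country"], user_id), payload))
    return mutations


def apply_to_view(mutations):
    """Step 5: stand-in for shipping the batchlog to the paired view replica."""
    for op, key, payload in mutations:
        if op == "DELETE":
            view_table.pop(key, None)
        else:
            view_table[key] = payload


def update_country(user_id, country):
    lock = partition_locks.setdefault(user_id, threading.Lock())
    with lock:                                  # step 2 (acquire) and step 7 (release)
        old_row = base_table.get(user_id)       # step 3: local read-before-write
        new_row = dict(old_row or {}, country=country)
        apply_to_view(view_mutations(user_id, old_row, new_row))  # steps 4 and 5
        base_table[user_id] = new_row           # step 6: apply base update locally
    return "ack"                                # step 8: ack to the coordinator


update_country(1, "FR")
print(view_table)  # {('FR', 1): {'firstname': 'John', 'lastname': 'Doe'}}
```

The read in step ③ is what makes the DELETE possible: without the old country value the replica would not know which view partition to clean up.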
MV Failure Cases: concurrent updates
Without local lock (timeline t0, t1, t2, both updates interleaved)
1) UPDATE … SET country='US'
•  Read base row (country='UK')
•  DELETE FROM mv WHERE country='UK'
•  INSERT INTO mv …(country) VALUES('US')
•  Send async BatchLog
•  Apply update country='US'
2) UPDATE … SET country='FR'
•  Read base row (country='UK')
•  DELETE FROM mv WHERE country='UK'
•  INSERT INTO mv …(country) VALUES('FR')
•  Send async BatchLog
•  Apply update country='FR'
Result: both inserts reach the view and nobody deletes the 'US' row
•  INSERT INTO mv …(country) VALUES('US')
•  INSERT INTO mv …(country) VALUES('FR')
With local lock
1) 🔒 UPDATE … SET country='US'
•  Read base row (country='UK')
•  DELETE FROM mv WHERE country='UK'
•  INSERT INTO mv …(country) VALUES('US')
•  Send async BatchLog
•  Apply update country='US' 🔓
2) 🔒 UPDATE … SET country='FR'
•  Read base row (country='US')
•  DELETE FROM mv WHERE country='US'
•  INSERT INTO mv …(country) VALUES('FR')
•  Send async BatchLog
•  Apply update country='FR' 🔓
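The two timelines can be replayed deterministically. The sketch below is a simplified model, assuming a single base row and a view reduced to the set of countries under which that row appears; `run` and its `interleaved` flag are hypothetical names for this illustration.

```python
def run(updates, interleaved):
    """Replay view maintenance for successive updates of one base row (id=1).

    interleaved=True models the timeline without local lock: every update
    reads the base row before any of them applies its change.
    """
    base = {"country": "UK"}
    view = {"UK"}  # countries under which id=1 currently appears in the view

    def read():
        return base["country"]

    def write(old, new):
        view.discard(old)      # DELETE FROM mv WHERE country = old
        view.add(new)          # INSERT INTO mv ... VALUES(new)
        base["country"] = new  # apply the base table update

    if interleaved:
        old_values = [read() for _ in updates]  # t0: both updates read 'UK'
        for old, new in zip(old_values, updates):
            write(old, new)
    else:
        for new in updates:    # lock held from the read to the apply
            write(read(), new)
    return sorted(view)


print(run(["US", "FR"], interleaved=True))   # ['FR', 'US']: stale 'US' row left behind
print(run(["US", "FR"], interleaved=False))  # ['FR']
```

Holding the lock across read, view mutation and base update is what guarantees that the second update sees 'US' and deletes it.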
MV Failure Cases: failed updates to MV
UPDATE user
SET country='FR'
WHERE id=1
•  ⑤ Execute async BatchLog to paired view replica with CL = ONE ✘ MV replica down
•  BatchLog replay when MV replica is up
Materialized View Performance
•  Write performance
•  local lock
•  local read-before-write for MV → update contention on partition (most of the perf hit)
•  local batchlog for MV
•  ☞ you only pay this price once, whatever the number of MVs
•  for each base table update: mv_count x 2 (DELETE + INSERT) extra mutations
Materialized View Performance
•  Write performance vs manual denormalization
•  MV better because no client-server network traffic for read-before-write
•  MV better because less network traffic for multiple views (client-side BATCH)
•  Makes developer life easier → priceless
Materialized View Performance
•  Read performance vs secondary index
•  MV better because single node read (secondary index can hit many nodes)
•  MV better because single read path (secondary index = read index + read data)
Materialized Views Consistency
•  Consistency level
•  CL honoured for base table, ONE for MV + local batchlog
•  Weaker consistency guarantees for MV than for base table
•  Example: write at QUORUM
•  guarantee that QUORUM replicas of the base table have received the write
•  guarantee that QUORUM of MV replicas will eventually receive DELETE + INSERT
Materialized Views Gotchas
•  Beware of hot spots !!!
•  MV user_by_gender 😱
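A rough way to see the hot spot: count how many view rows land in each view partition. The data set below is made up for illustration (`users`, `partition_sizes` and the `C0`..`C49` country codes are hypothetical); the point is only that a low-cardinality partition key concentrates all rows in a handful of partitions.

```python
from collections import Counter

# Hypothetical users; in a view, each distinct partition key value is one partition.
users = [
    {"id": i, "gender": "F" if i % 2 else "M", "country": f"C{i % 50}"}
    for i in range(10_000)
]


def partition_sizes(rows, view_partition_key):
    """Number of view rows that end up in each view partition."""
    return Counter(row[view_partition_key] for row in rows)


by_gender = partition_sizes(users, "gender")
by_country = partition_sizes(users, "country")
print(len(by_gender), max(by_gender.values()))    # 2 5000
print(len(by_country), max(by_country.values()))  # 50 200
```

With user_by_gender, every write to the base table ends up in one of two view partitions, so the replicas owning them take all the view traffic.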
Q & A
User Defined Functions (UDF)
•  Why?
•  Detailed implementation
•  UDAs
•  Gotchas
Rationale
•  Push computation server-side
•  save network bandwidth (1000 nodes!)
•  simplify client-side code
•  provide standard & useful functions (sum, avg …)
•  accelerate analytics use-cases (pre-aggregation for Spark)
How to create a UDF?
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
[keyspace.]functionName (param1 type1, param2 type2, …)
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURNS returnType
LANGUAGE language
AS $$
// source code here
$$;
•  [keyspace.]: a UDF is keyspace-wide
•  param1, param2: param name to refer to in the code, type = CQL3 type
•  CALLED ON NULL INPUT: always called, null-check mandatory in code
•  RETURNS NULL ON NULL INPUT: if any input is null, the code block is skipped and null is returned
•  CQL types: primitives (boolean, int, …), collections (list, set, map), tuples, UDT
•  LANGUAGE: JVM supported languages
•  Java, Scala
•  Javascript (slow)
•  Groovy, Jython, JRuby
•  Clojure (JSR 223 impl issue)
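The two null-handling modes can be mimicked in a few lines of Python. This is an analogy only, not how Cassandra invokes UDFs: the decorators `called_on_null_input` and `returns_null_on_null_input` and the functions `max_of` and `strict_max_of` are hypothetical, and `None` stands in for a CQL null.

```python
def called_on_null_input(func):
    """CALLED ON NULL INPUT: the body always runs and must handle None itself."""
    return func


def returns_null_on_null_input(func):
    """RETURNS NULL ON NULL INPUT: skip the body if any argument is None."""
    def wrapper(*args):
        if any(arg is None for arg in args):
            return None
        return func(*args)
    return wrapper


@called_on_null_input
def max_of(a, b):
    if a is None:        # null-check is mandatory in this mode
        return b
    if b is None:
        return a
    return max(a, b)


@returns_null_on_null_input
def strict_max_of(a, b):
    return max(a, b)     # never sees None


print(max_of(None, 3), strict_max_of(None, 3))  # 3 None
```

The first mode lets the function decide what a null means, at the cost of explicit checks; the second keeps the body simple but always propagates null.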
UDF Demo
UDA (User Defined Aggregates)
•  Real use-case for UDF
•  Aggregation server-side → huge network bandwidth saving
•  Provide similar behavior for Group By, Sum, Avg etc …
How to create a UDA?
CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS]
[keyspace.]aggregateName(type1, type2, …)
SFUNC accumulatorFunction
STYPE stateType
[FINALFUNC finalFunction]
INITCOND initCond;
•  aggregateName(type1, type2, …): only type, no param name
•  SFUNC: accumulator function. Signature:
accumulatorFunction(stateType, type1, type2, …)
RETURNS stateType
•  STYPE: state type
•  FINALFUNC: optional final function. Signature:
finalFunction(stateType)
•  INITCOND: initial state value, of type stateType
•  UDA return type?
•  if finalFunction: return type of finalFunction
•  else: stateType
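The way SFUNC, STYPE, INITCOND and FINALFUNC fit together is a fold over the selected rows. A minimal sketch, assuming an average aggregate whose state is a (count, total) tuple; `run_aggregate`, `avg_state` and `avg_final` are hypothetical names, not Cassandra APIs.

```python
def run_aggregate(rows, sfunc, initcond, finalfunc=None):
    """Fold sfunc over the rows, starting from initcond, then apply finalfunc if any."""
    state = initcond
    for row in rows:
        state = sfunc(state, *row)  # accumulatorFunction(stateType, type1, ...)
    return finalfunc(state) if finalfunc else state


def avg_state(state, value):
    """Accumulator: the state is a (count, total) tuple."""
    count, total = state
    return (count + 1, total + value)


def avg_final(state):
    """Final function: turn the (count, total) state into the average."""
    count, total = state
    return total / count if count else None


rows = [(20,), (22,), (24,)]
print(run_aggregate(rows, avg_state, (0, 0)))             # (3, 66)
print(run_aggregate(rows, avg_state, (0, 0), avg_final))  # 22.0
```

Without a final function the caller gets the raw state back, which matches the return type rule: the type of finalFunction if present, stateType otherwise.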
UDA Demo
Gotchas
[Diagram: UDA request on a cluster of C* nodes, steps ① to ⑤, with steps ② & ③ shown on three nodes]
④
•  apply accumulatorFunction
•  apply finalFunction
Why not apply UDF/UDA on the replica nodes?
1.  Because of eventual consistency
2.  UDF/UDA applied AFTER last-write-win logic
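A small sketch of why the aggregate has to run after reconciliation. The replica responses below are made up: each entry is (row key, value, write timestamp), and `reconcile` and `sum_on_replicas` are hypothetical helpers for this illustration.

```python
# Each replica returns (row_key, value, write_timestamp) for the rows it holds.
replica_responses = [
    [("a", 10, 1), ("b", 5, 1)],  # replica 1 still has a stale value for "a"
    [("a", 30, 2), ("b", 5, 1)],  # replica 2 has the latest value for "a"
]


def reconcile(responses):
    """Last-write-win: keep, for each row, the value with the highest timestamp."""
    latest = {}
    for response in responses:
        for key, value, ts in response:
            if key not in latest or ts > latest[key][1]:
                latest[key] = (value, ts)
    return {key: value for key, (value, ts) in latest.items()}


def sum_on_replicas(responses):
    """Aggregating on each replica first: the partial results cannot be reconciled."""
    return [sum(value for _, value, _ in response) for response in responses]


print(sum_on_replicas(replica_responses))          # [15, 35]
print(sum(reconcile(replica_responses).values()))  # 35
```

Once a replica has collapsed its rows into a single number, the timestamps are gone and there is no way to tell which partial result contains stale data.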
Gotchas
•  UDA in Cassandra is not distributed!
•  Execute UDA on a large number of rows (10^6 for ex.)
•  single fat partition
•  multiple partitions
•  full table scan
•  → Increase client-side timeout
•  default Java driver timeout = 12 secs
•  JAVA-1033 JIRA for per-request timeout setting
Cassandra UDA or Apache Spark?
Consistency Level | Single/Multiple Partition(s) | Recommended Approach
ONE | Single partition | UDA with token-aware driver because node local
ONE | Multiple partitions | Apache Spark because distributed reads
> ONE | Single partition | UDA because data-locality lost with Spark
> ONE | Multiple partitions | Apache Spark definitely
Q & A
New Storage Engine
•  Data structure
•  Disk space usage
Pre 3.0 data structure
Map<byte[], SortedMap<byte[], Cell>>
CREATE TABLE sensor_data(
sensor_id uuid,
date timestamp,
sensor_type text,
sensor_value double,
PRIMARY KEY(sensor_id, date)
);
Pre 3.0 on disk layout
RowKey: de305d54-75b4-431b-adb2-eb6b9e546014
=> (column=2015-04-27 10:00:00+0100:, value=, timestamp=1430128800)
=> (column=2015-04-27 10:00:00+0100:sensor_type, value='Temperature', timestamp=1430128800)
=> (column=2015-04-27 10:00:00+0100:sensor_value, value=23.48, timestamp=1430128800)
=> (column=2015-04-27 10:01:00+0100:, value=, timestamp=1430128860)
=> (column=2015-04-27 10:01:00+0100:sensor_type, value='Temperature', timestamp=1430128860)
=> (column=2015-04-27 10:01:00+0100:sensor_value, value=24.08, timestamp=1430128860)
•  Clustering values are repeated for each normal column
•  Full timestamp storage
Cassandra 3.0 data structure
Map<byte[], SortedMap<ClusteringColumn, Row>>
CREATE TABLE sensor_data(
sensor_id uuid,
date timestamp,
sensor_type text,
sensor_value double,
PRIMARY KEY(sensor_id, date)
);
Cassandra 3.0 on disk layout
PartitionKey: de305d54-75b4-431b-adb2-eb6b9e546014
=> clusteringColumn:2015-04-27 10:00:00+0100
    => row_timestamp=1430128800
    => (column_value='Temperature', delta_encoded_timestamp=+0)
    => (column_value=23.48, delta_encoded_timestamp=+0)
=> clusteringColumn:2015-04-27 10:01:00+0100
    => row_timestamp=1430128860
    => (column_value='Temperature', delta_encoded_timestamp=+0)
    => (column_value=24.08, delta_encoded_timestamp=+0)
•  Cell timestamps are delta-encoded against the row timestamp
Gains
•  No clustering value repetition
•  Column labels are stored only once in metadata
•  Delta encoding of timestamp, 8 bytes saved each time
•  Less disk space used
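The difference between the two layouts can be modelled with ordinary Python containers. This is a conceptual sketch of the two map shapes shown in the slides, not the actual SSTable format; `pre_30_cells` and `v30_rows` are hypothetical helpers, and the sample rows reuse the sensor_data values from the layouts above.

```python
rows = [
    # (clustering value, {column: value}, write timestamp)
    ("2015-04-27 10:00:00+0100", {"sensor_type": "Temperature", "sensor_value": 23.48}, 1430128800),
    ("2015-04-27 10:01:00+0100", {"sensor_type": "Temperature", "sensor_value": 24.08}, 1430128860),
]


def pre_30_cells(rows):
    """Pre 3.0: one cell per column plus a row marker, each repeating the
    clustering value, the column name and the full timestamp."""
    cells = []
    for clustering, columns, ts in rows:
        cells.append(((clustering, ""), "", ts))  # row marker cell
        for name, value in columns.items():
            cells.append(((clustering, name), value, ts))
    return cells


def v30_rows(rows):
    """3.0: one entry per clustering value; each cell keeps only the delta
    between its own timestamp and the row timestamp."""
    result = {}
    for clustering, columns, ts in rows:
        cell_ts = ts  # in this sample every cell was written with the row
        result[clustering] = {
            "row_timestamp": ts,
            "cells": [(value, cell_ts - ts) for value in columns.values()],
        }
    return result


old, new = pre_30_cells(rows), v30_rows(rows)
print(len(old))  # 6 cells, each carrying clustering value and full timestamp
print(len(new))  # 2 rows, clustering value and full timestamp stored once each
```

In this model the clustering value and the full timestamp appear six times in the old layout and twice in the new one, which is where the disk space gain comes from.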
Benchmarks
CREATE TABLE events (
id uuid,
date timeuuid,
prop1 int,
prop2 text,
prop3 float,
PRIMARY KEY(id, date));
•  10^6 rows
•  Small string
CREATE TABLE largetext(
key int,
prop1 int,
prop2 text,
PRIMARY KEY(key));
•  10^6 rows
•  Large string (1000)
CREATE TABLE largeclustering(
key int,
clust text,
prop1 int,
prop2 set<float>,
PRIMARY KEY(key, clust));
•  10^6 rows
•  Medium string (100)
•  50 items
CREATE TABLE events (
id uuid,
date timeuuid,
prop1 int,
prop2 text,
prop3 float,
PRIMARY KEY(id, date))
WITH COMPACT STORAGE;
Q & A
@doanduyhai
duy_hai.doan@datastax.com
https://academy.datastax.com/
Thank You
