DataStax Academy: What? Why? How?
DuyHai DOAN
Cassandra as Infrastructure Technology
•  ING (Cassandra as a service)
•  Netflix
•  Sony PlayStation Network
•  Microsoft Office 365
•  eBay (40 TB in a single table …)
•  Etc.
© 2015 DataStax, All Rights Reserved.
Cassandra as Infrastructure Technology
•  The SMACK stack as an alternative to the Hadoop stack for streaming
•  Spark
•  Mesos
•  Akka
•  Cassandra
•  Kafka
•  Read @helenaedelson slides here: http://goo.gl/cCIE7F
Rich ecosystem around Cassandra
•  Apache Spark (C* connector)
•  Apache Zeppelin (C* interpreter)
•  Apache Mesos (https://github.com/mesosphere/cassandra-mesos)
•  Apache Kafka (KIP-30)
•  Apache Shiro (C* as cluster session store)
•  Hunk, JasperSoft, Pentaho, Tableau …
Increasing SQL-like features
•  CQL DML (SELECT, INSERT, UPDATE, DELETE …)
•  CQL DDL CREATE/ALTER/DROP (SCHEMA, TABLE, TYPE, FUNCTION …)
•  CQL Credentials
•  CREATE/ALTER/DROP (USER, ROLE)
•  GRANT <xxx> PERMISSION ON <resource> TO <user_name>
•  REVOKE <xxx> PERMISSION ON <resource> FROM <user_name>
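As an illustration of the credential statements above, here is a minimal sketch. It assumes Cassandra 2.2 or later (role-based access control) with authentication and authorization enabled; the role name `analyst`, the password and the keyspace `demo` are hypothetical.

```cql
-- Hypothetical role that is allowed to log in
CREATE ROLE IF NOT EXISTS analyst
  WITH PASSWORD = 'change_me' AND LOGIN = true;

-- Give the role read-only access to one keyspace
GRANT SELECT ON KEYSPACE demo TO analyst;

-- Take the permission back
REVOKE SELECT ON KEYSPACE demo FROM analyst;
```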
Increasing SQL-like features
•  User Defined Functions
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [keyspace.]function_name ( <arg-name> <arg-type> )
(CALLED | RETURNS NULL) ON NULL INPUT
RETURNS <type>
LANGUAGE <language>
AS <body>
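A concrete instance of the syntax above might look like the following sketch. The keyspace `demo`, the function `max_of` and the table `pairs` are hypothetical, and user-defined functions typically have to be enabled first (`enable_user_defined_functions: true` in cassandra.yaml on Cassandra 2.2 and 3.0).

```cql
-- Hypothetical Java UDF returning the larger of two ints
CREATE OR REPLACE FUNCTION demo.max_of (a int, b int)
  RETURNS NULL ON NULL INPUT   -- skip the call when any argument is null
  RETURNS int
  LANGUAGE java
  AS 'return Math.max(a, b);';

-- Hypothetical table used to call the function
CREATE TABLE IF NOT EXISTS demo.pairs (
  id int PRIMARY KEY,
  a int,
  b int
);

SELECT id, demo.max_of(a, b) FROM demo.pairs;
```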
Increasing SQL-like features
•  Materialized Views
CREATE MATERIALIZED VIEW [IF NOT EXISTS] keyspace_name.view_name AS
SELECT column1, column2, ...
FROM keyspace_name.table_name
WHERE column1 IS NOT NULL AND column2 IS NOT NULL ...
PRIMARY KEY(column1, column2, ...)
•  Real-time notifications (CDC): CASSANDRA-8844
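The materialized view syntax above can be sketched with a small example. It assumes Cassandra 3.0 or later; the keyspace `demo`, the table `users` and the view `users_by_country` are hypothetical. Every primary key column of the base table must appear in the view's primary key, and each view key column needs an IS NOT NULL restriction.

```cql
-- Hypothetical base table, keyed by user id
CREATE TABLE IF NOT EXISTS demo.users (
  id uuid PRIMARY KEY,
  country text,
  name text
);

-- View that lets us query the same data by country
CREATE MATERIALIZED VIEW IF NOT EXISTS demo.users_by_country AS
  SELECT country, id, name
  FROM demo.users
  WHERE country IS NOT NULL AND id IS NOT NULL
  PRIMARY KEY (country, id);

-- Cassandra keeps the view in sync with writes to demo.users
SELECT name FROM demo.users_by_country WHERE country = 'FR';
```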
More powerful search in the future
•  Apple open-sourced its secondary index implementation (SASI)
•  https://github.com/xedin/sasi
CREATE CUSTOM INDEX ON sasi (bio) USING
'org.apache.cassandra.db.index.SSTableAttachedSecondaryIndex'
WITH OPTIONS = {
'analyzer_class': 'org.apache.cassandra.db.index.sasi.analyzer.StandardAnalyzer',
'tokenization_enable_stemming': 'true',
'analyzed': 'true',
'tokenization_normalize_lowercase': 'true',
'tokenization_locale': 'en'
};
More powerful search in the future
•  Apple open-sourced its secondary index implementation (SASI)
•  https://github.com/xedin/sasi
SELECT *
FROM sasi
WHERE (created_at > 1442959315018 OR first_name = 'P')
AND age > 26
ALLOW FILTERING;
More powerful search in the future
•  Limited to the 2.0.x branch
•  Needs a special patch to the OSS code
•  Supports only COMPACT STORAGE tables
•  Only compatible with Murmur3Partitioner
•  CASSANDRA-10661: merge into Cassandra 3.0!
•  GitHub issue to support full CQL3 (https://github.com/xedin/sasi/issues/3)
DataStax Gartner reports (Operational DB): Oct 2013
DataStax Gartner reports (Operational DB): Oct 2014
DataStax Gartner reports (Operational DB): Oct 2015
Cassandra Job Trend
Cassandra Job Offers (I’ve received)
Problem?
•  Lack of Cassandra skills
•  Difficulty hiring Cassandra experts
Solution: https://academy.datastax.com
Self-Paced Courses: FREE
Instructor-Led Training: FREE
O’Reilly Certification
Technical Evangelists: FREE
•  On-site help, data modeling, cluster health check
•  duy_hai.doan@datastax.com, @doanduyhai
From the Devs & Ops perspective
Cassandra is mainstream + You are trained & certified = Career Boost
academy.datastax.com
