Node.js and Cassandra
For highly concurrent systems
Software as a service
Most common scenario

• I/O bound
  o db
  o other services
  o file system
• Low CPU usage
• Peaks and valleys
Why Node.js
Event-based

• Single threaded
• Minimum overhead per connection
• Predictable amount of memory under load
• Apache / IIS vs Nginx: process-based vs event loop

Everything runs in parallel except your code
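
A minimal sketch of that last point, using timers as stand-ins for I/O. The `simulateIo` helper and its delays are made up for illustration. The three waits overlap, while each callback runs on the single JavaScript thread, one at a time.

```javascript
// Hypothetical stand-in for an I/O call: completes after `ms` milliseconds.
function simulateIo(name, ms, log, callback) {
  log.push('start ' + name);
  setTimeout(function () {
    log.push('done ' + name);
    callback();
  }, ms);
}

// Starts three operations at once and reports the order of events.
function runAll(callback) {
  const log = [];
  let pending = 3;
  function finished() {
    pending -= 1;
    if (pending === 0) callback(log);
  }
  simulateIo('db', 60, log, finished);
  simulateIo('service', 20, log, finished);
  simulateIo('file', 40, log, finished);
  // Reached before any callback fires: nothing above blocked the thread.
  log.push('all started');
}

runAll(function (log) {
  console.log(log.join('\n'));
});
```

All three operations are started before any of them completes, and they finish in order of their delay rather than in the order they were started.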
Event-based: Apache vs Nginx

[Figure: Apache vs Nginx comparison. Source: webfaction.com]
Async I/O

• Uses OS network interfaces + fixed thread pool
• Time spent: db connections / files
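
As a small illustration of asynchronous file access, the sketch below reads a file with `fs.readFile` and records the order of events. In Node.js, file system calls such as this one are handed to the thread pool, so the calling code continues right away. The `readWithoutBlocking` helper and the temporary file name are assumptions made for this example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Reads a file asynchronously and records what happened in which order.
function readWithoutBlocking(filePath, callback) {
  const events = [];
  fs.readFile(filePath, 'utf8', function (err, text) {
    if (err) return callback(err);
    events.push('read finished');
    callback(null, events, text);
  });
  // Runs before the callback above: readFile only schedules the work.
  events.push('readFile returned');
}

// Usage with a throwaway file (the file name is arbitrary).
const demoFile = path.join(os.tmpdir(), 'async-io-demo-' + process.pid + '.txt');
fs.writeFileSync(demoFile, 'hello');
readWithoutBlocking(demoFile, function (err, events, text) {
  fs.unlinkSync(demoFile);
  if (err) throw err;
  console.log(events, text);
});
```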
JavaScript: Closures

• CPS: Continuation passing style
• The scope of the outer function -> inner function
• JavaScript: Functional / Dynamic / Object oriented
• ... package manager / open source community / V8 / ubiquitous ...
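
A short sketch of continuation passing style combined with a closure. The functions `fetchUserName` and `greet` are hypothetical and the lookup is simulated. The inner callback still sees `greeting` from the outer function's scope when it eventually runs.

```javascript
// Simulated asynchronous lookup: the result is passed to a continuation
// (the callback) instead of being returned.
function fetchUserName(id, callback) {
  setImmediate(function () {
    if (id !== 1) return callback(new Error('user not found'));
    callback(null, 'Ada');
  });
}

function greet(id, greeting, callback) {
  fetchUserName(id, function (err, name) {
    if (err) return callback(err);
    // `greeting` belongs to the outer function; the closure keeps it alive
    // until this callback is invoked.
    callback(null, greeting + ', ' + name + '!');
  });
}

greet(1, 'Hello', function (err, message) {
  if (err) throw err;
  console.log(message);
});
```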
Stream everything

• Avoid buffering
• HTTP: Chunked requests and responses
• TCP: Chunks readable in a stream
• Stream piping (UNIX like)
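
Piping can be sketched with the core `stream` module. The `upperCase` and `collect` helpers below are made up for this example, and `Readable.from` assumes a reasonably recent Node.js version. Chunks flow through the transform one at a time, much like a UNIX pipeline.

```javascript
const { Readable, Transform, Writable } = require('stream');

// A Transform that upper-cases each chunk as it passes through.
function upperCase() {
  return new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    }
  });
}

// A Writable that gathers the chunks and reports the text once the
// stream ends. Collecting is done here only to show the result.
function collect(onDone) {
  const parts = [];
  return new Writable({
    write(chunk, encoding, callback) {
      parts.push(chunk.toString());
      callback();
    },
    final(callback) {
      onDone(parts.join(''));
      callback();
    }
  });
}

Readable.from(['node', ' and ', 'cassandra'])
  .pipe(upperCase())
  .pipe(collect(console.log));
```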
The Driver for Cassandra
Features

• Connection pooling to multiple hosts
• Load balancing
• Automatic failover / retry
• Row and field streaming
• Queuing: Concurrent connecting / preparing
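
One way to picture the queuing of concurrent prepare requests is sketched below. This is not the driver's code or API. `createPrepareQueue` and `fakePrepare` are hypothetical names, and the sketch only assumes that preparing a query is an asynchronous call that should run once per query.

```javascript
// Hypothetical helper: runs `prepare(query, cb)` once per query and queues
// callers that arrive while that call is still in flight.
function createPrepareQueue(prepare) {
  const prepared = new Map(); // query -> prepared statement id
  const waiting = new Map();  // query -> callbacks waiting for the result

  return function getPrepared(query, callback) {
    if (prepared.has(query)) {
      return setImmediate(callback, null, prepared.get(query));
    }
    if (waiting.has(query)) {
      waiting.get(query).push(callback);
      return;
    }
    waiting.set(query, [callback]);
    prepare(query, function (err, id) {
      const callbacks = waiting.get(query);
      waiting.delete(query);
      if (!err) prepared.set(query, id);
      callbacks.forEach(function (cb) { cb(err, id); });
    });
  };
}

// Usage with a fake prepare call that counts how often it is invoked.
let prepareCalls = 0;
function fakePrepare(query, callback) {
  prepareCalls += 1;
  setTimeout(callback, 10, null, 'id-' + prepareCalls);
}
const getPrepared = createPrepareQueue(fakePrepare);
const userQuery = 'SELECT * FROM users WHERE id = ?';
getPrepared(userQuery, function (err, id) { console.log('first caller', id); });
getPrepared(userQuery, function (err, id) { console.log('second caller', id); });
```

Both callers receive the same id, and `fakePrepare` runs only once.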
[Diagram: the App and the Cassandra nodes A, B, C, D, E, F, G and H]
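
Load balancing with failover across several nodes could look roughly like the sketch below. It is an illustration under simple assumptions (round-robin selection, retry on the next host after any error), not the driver's implementation. `createBalancer` and `query` are hypothetical names.

```javascript
// Hypothetical round-robin balancer with failover.
function createBalancer(hosts) {
  let next = 0;
  // Calls `task(host, cb)`; on error, moves on to the following host
  // until every host has been tried once.
  return function execute(task, callback) {
    const start = next;
    next = (next + 1) % hosts.length;
    let attempt = 0;
    (function tryHost() {
      const host = hosts[(start + attempt) % hosts.length];
      task(host, function (err, result) {
        if (!err) return callback(null, result, host);
        attempt += 1;
        if (attempt === hosts.length) return callback(err);
        tryHost();
      });
    })();
  };
}

// Usage: node B is simulated as being down.
const execute = createBalancer(['A', 'B', 'C']);
function query(host, callback) {
  if (host === 'B') return callback(new Error('B is down'));
  callback(null, 'rows from ' + host);
}
execute(query, function (err, result) { console.log(result); }); // served by A
execute(query, function (err, result) { console.log(result); }); // B fails, C answers
execute(query, function (err, result) { console.log(result); }); // served by C
```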
Sample: JSON Web API

    app.get('/user/:id', function (req, res, next) {
      var query = 'SELECT * FROM users WHERE id = ?';
      cassandra.executeAsPrepared(query, [req.params.id], function (err, result) {
        if (err) return next(err);
        var row = result.rows[0];
        //Response: expose some properties of the user
        res.json({id: req.params.id, name: row.get('name')});
      });
    });
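
The sample reads `result.rows[0]` without checking whether a row came back. A possible variant that answers 404 for an unknown id is sketched below. It is written as a plain function so it can run without Express or a cluster. `createUserHandler`, `stubClient` and `fakeRes` are hypothetical stand-ins, and only the `executeAsPrepared(query, params, callback)` shape is taken from the sample.

```javascript
// Builds the route handler around any client that exposes
// executeAsPrepared(query, params, callback), as in the sample.
function createUserHandler(cassandra) {
  return function (req, res, next) {
    var query = 'SELECT * FROM users WHERE id = ?';
    cassandra.executeAsPrepared(query, [req.params.id], function (err, result) {
      if (err) return next(err);
      var row = result.rows[0];
      if (!row) {
        // No user with that id: answer 404 instead of reading from undefined.
        res.statusCode = 404;
        return res.json({error: 'user not found'});
      }
      res.json({id: req.params.id, name: row.get('name')});
    });
  };
}

// Hypothetical in-memory stand-in for the driver, for trying the handler out.
var stubClient = {
  executeAsPrepared: function (query, params, callback) {
    var users = new Map([['1', 'Ada']]);
    var name = users.get(params[0]);
    var rows = name === undefined ? [] : [{
      get: function (column) { return column === 'name' ? name : undefined; }
    }];
    setImmediate(callback, null, {rows: rows});
  }
};

// Minimal fake response object that prints what would be sent.
var fakeRes = {
  statusCode: 200,
  json: function (body) { console.log(this.statusCode, body); }
};
createUserHandler(stubClient)({params: {id: '1'}}, fakeRes, console.error);
```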
How row streaming works

[Diagram: Socket (Readable stream) -> Chunks -> Protocol (Transform stream) -> Header and body chunks -> Parser (Transform stream) -> Row -> Client]
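
The chain of a readable source followed by two transform streams can be sketched with a toy protocol. The framing used here (a 4-byte length followed by a JSON body) is invented for illustration and is not the Cassandra binary protocol. `encodeFrame`, `FrameReader` and `RowParser` are hypothetical names.

```javascript
const { Transform } = require('stream');

// Toy framing, for illustration only: 4-byte big-endian body length,
// followed by a JSON body.
function encodeFrame(row) {
  const body = Buffer.from(JSON.stringify(row), 'utf8');
  const header = Buffer.alloc(4);
  header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
}

// "Protocol" stage: turns arbitrary socket chunks into complete bodies.
class FrameReader extends Transform {
  constructor() {
    super({ readableObjectMode: true });
    this.leftover = Buffer.alloc(0);
  }
  _transform(chunk, encoding, callback) {
    this.leftover = Buffer.concat([this.leftover, chunk]);
    while (this.leftover.length >= 4) {
      const bodyLength = this.leftover.readUInt32BE(0);
      if (this.leftover.length < 4 + bodyLength) break; // body not complete yet
      this.push(this.leftover.subarray(4, 4 + bodyLength));
      this.leftover = this.leftover.subarray(4 + bodyLength);
    }
    callback();
  }
}

// "Parser" stage: turns each body into a row object.
class RowParser extends Transform {
  constructor() {
    super({ objectMode: true });
  }
  _transform(body, encoding, callback) {
    let row;
    try {
      row = JSON.parse(body.toString('utf8'));
    } catch (err) {
      return callback(err);
    }
    callback(null, row);
  }
}

// Usage: two frames arrive split at arbitrary byte boundaries.
const frameReader = new FrameReader();
frameReader.pipe(new RowParser()).on('data', function (row) {
  console.log('row', row);
});
const wire = Buffer.concat([encodeFrame({ id: 1 }), encodeFrame({ id: 2 })]);
frameReader.write(wire.subarray(0, 3));
frameReader.write(wire.subarray(3, 15));
frameReader.end(wire.subarray(15));
```

Rows are emitted as soon as their bytes are complete, so the client can start working before the whole result has arrived.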
Sample: Field streaming

    app.get('/user/:id/image', function (req, res, next) {
      var query = 'SELECT id, profile_image FROM users WHERE id = ?';
      cassandra.streamField(query, [req.params.id], function (err, row, image) {
        if (err) return next(err);
        //pipe the image stream to the response stream
        image.pipe(res);
      });
    });
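
Note that `pipe` does not forward `error` events from the source to the destination, so a failure while reading the field would go unnoticed by the response. A minimal sketch of one way to report it follows. `sendStream` is a hypothetical helper, and `PassThrough` streams stand in for the image stream and the HTTP response.

```javascript
const { PassThrough } = require('stream');

// Pipes `source` into `res` and reports a source failure through `next`.
function sendStream(source, res, next) {
  source.on('error', function (err) {
    source.unpipe(res);
    next(err);
  });
  source.pipe(res);
}

// Usage with stand-in streams.
const image = new PassThrough();
const response = new PassThrough();
response.on('data', function (chunk) {
  console.log('sent', chunk.length, 'bytes');
});
sendStream(image, response, console.error);
image.write(Buffer.from([1, 2, 3]));
image.end(Buffer.from([4, 5]));
```

If part of the response has already been sent when the error occurs, the handler can no longer change the status code, so a real application would also need to decide how to end the response in that case.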
Sample: Field streaming + image resizing

    app.get('/user/:id/image', function (req, res, next) {
      var query = 'SELECT id, profile_image FROM users WHERE id = ?';
      cassandra.streamField(query, [req.params.id], function (err, row, image) {
        if (err) return next(err);
        //pipe the image stream to a resizer stream
        image.pipe(resizer).pipe(res);
      });
    });
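
The sample leaves `resizer` undefined. Whatever it is, it has to be a stream that is both writable and readable, and because a stream ends together with its source, each request needs its own instance. The sketch below shows that shape with a placeholder that forwards bytes unchanged. `createResizer` and `handle` are hypothetical names, and real resizing would need an image library.

```javascript
const { PassThrough, Transform } = require('stream');

// Placeholder for a real resizer: forwards bytes unchanged and counts them.
function createResizer() {
  const resizer = new Transform({
    transform(chunk, encoding, callback) {
      resizer.bytesSeen += chunk.length;
      callback(null, chunk);
    }
  });
  resizer.bytesSeen = 0;
  return resizer;
}

// Each simulated request gets a fresh resizer instance.
function handle(image, res) {
  image.pipe(createResizer()).pipe(res);
}

// Usage with stand-in streams for the image field and the response.
const imageStream = new PassThrough();
const responseStream = new PassThrough();
responseStream.on('data', function (chunk) {
  console.log('response chunk of', chunk.length, 'bytes');
});
handle(imageStream, responseStream);
imageStream.end(Buffer.from('fake image bytes'));
```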
Moving forward
Next features

• Support for multiple data centers
• Cassandra query tracing

Contribute! :)
Thanks!

Jorge Bay Gondra
@jorgebg
jorgebaygondra@gmail.com
github.com/jorgebay/node-cassandra-cql
npm install node-cassandra-cql
References and further reading

• A Design Framework for Highly Concurrent Systems, by Matt Welsh, Steven D. Gribble, Eric A. Brewer, and David Culler (UC Berkeley)
• Concurrency is not Parallelism (it's better), by Rob Pike (Google, Go language)
• How the single threaded non blocking IO model works in Node.js
