Scaling Real-time Collaboration with MongoDB
MongoDB World, June 2019, NYC
Presenting today
Nicolae Gudumac
CTO & Co-Founder
Planable Inc.
@nick_gudumac
Real-time technology
is an essential part of modern apps
the collaboration platform for brand marketing teams
Product demo
Requirements & challenges
• Semi-structured and complex data models
• Frequent changes in requirements
• High performance & scalability
• Sync in real-time
Data structures
• DraftJS
• Canvas
• Workflow status
• Metadata
MongoDB at Planable
• Primary data store
• Mostly normalized data
• One to many relations with references & embedding
• Change streams
One to Many relationships
Collections: Users, Workspaces, Pages, Posts, Comments
Users
{
  _id: "123",
  workspaces: ["w1", "ws2", "ws3"]
  …
}
Workspaces
{
  _id: "1563",
  name: "Jusco"
  …
}
Pages
{
  _id: "6229",
  workspaceId: "1563"
  …
}
Posts
{
  _id: "1263",
  workspaceId,
  pageId,
  …
}
Comments
{
  _id: "1242",
  workspaceId,
  pageId,
  postId,
  userId
  …
}
How do we do real-time?
Pub / Sub
Pub/Sub, WebSockets, Node.js, Minimongo
Scaling real-time is hard
Poll-and-diff
[Diagram: servers A, B, C and clients C1 to C8]
Pros:
• Simple approach with easy-to-understand scaling characteristics
• Works with complex queries and sort/skip
Cons:
• Updates come with delay
• Does not scale well with many users and data
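The poll-and-diff idea can be sketched in a few lines. This is a minimal illustration, not Planable's or Meteor's actual implementation: `diffSnapshots`, `startPolling`, `runQuery` and `send` are hypothetical names, and the 10-second interval is an arbitrary assumption.

```javascript
// Compare two query result snapshots and report what changed.
// Documents are assumed to carry a unique `_id`, as in MongoDB.
function diffSnapshots(previous, current) {
  const prevById = new Map(previous.map(doc => [doc._id, doc]));
  const currById = new Map(current.map(doc => [doc._id, doc]));

  const added = current.filter(doc => !prevById.has(doc._id));
  const removed = previous.filter(doc => !currById.has(doc._id));
  // JSON comparison is a simplification: it is sensitive to key order.
  const changed = current.filter(doc =>
    prevById.has(doc._id) &&
    JSON.stringify(prevById.get(doc._id)) !== JSON.stringify(doc)
  );
  return { added, changed, removed };
}

// Hypothetical polling loop: `runQuery` re-executes the subscription's
// query and `send` pushes the diff to the client. Every subscription
// costs one query per interval, and updates arrive up to one interval late.
function startPolling(runQuery, send, intervalMs = 10000) {
  let previous = [];
  const timer = setInterval(async () => {
    const current = await runQuery();
    const diff = diffSnapshots(previous, current);
    if (diff.added.length || diff.changed.length || diff.removed.length) {
      send(diff);
    }
    previous = current;
  }, intervalMs);
  return () => clearInterval(timer); // call to stop polling
}
```

Because the diff works on whatever the query returns, sort and skip need no special handling, which matches the "works with complex queries" point above.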
Tailing the Oplog
[Diagram: servers A, B, C and clients C1 to C8, all servers tailing the oplog]
Pros:
• Works seamlessly on light loads
• Updates come in real-time
• Little additional load on the database
Cons:
• Batch updates, inserts and deletes overwhelm the servers
• Cannot scale servers horizontally
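A rough sketch of why oplog tailing struggles under batch writes: every application server has to inspect every oplog entry, whether or not any of its clients care about it. The router below is a hypothetical illustration (`createOplogRouter` is not a real API). It assumes the usual oplog entry shape with `ns`, `op`, `o` and `o2` fields.

```javascript
// Routes oplog entries to the subscriptions interested in their namespace.
function createOplogRouter() {
  const subscriptions = new Map(); // "database.collection" -> callbacks
  let seen = 0; // counts every entry this server had to inspect

  return {
    subscribe(ns, callback) {
      if (!subscriptions.has(ns)) subscriptions.set(ns, []);
      subscriptions.get(ns).push(callback);
    },
    // Called for EVERY oplog entry, relevant or not.
    handle(entry) {
      seen += 1;
      // Only inserts ('i'), updates ('u') and deletes ('d') are routed.
      if (!['i', 'u', 'd'].includes(entry.op)) return;
      const callbacks = subscriptions.get(entry.ns) || [];
      // For updates the target _id is in `o2`, otherwise in `o`.
      const id = entry.op === 'u' ? entry.o2._id : entry.o._id;
      callbacks.forEach(cb => cb({ op: entry.op, id }));
    },
    get seen() {
      return seen;
    }
  };
}
```

Adding more servers does not reduce `seen` per server: each one still processes the full write volume of the database, which is the horizontal scaling limit noted above.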
Redis Oplog
[Diagram: servers A, B, C and clients C1 to C8]
Pros:
• No additional load on MongoDB
• Servers receive fewer updates
• More control over reactivity
Cons:
• Increased latency
• Additional point of failure
• Additional system to scale, monitor, maintain
• No reactivity on external database operations
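The trade-offs above follow from the pattern itself: the application announces its own writes on a pub/sub channel, and each server listens only to the channels it needs. The sketch below uses an in-memory `EventEmitter` as a stand-in for Redis, so it is an assumption-laden illustration and not the API of any Redis client or of the redis-oplog package. `publishChange` and `subscribeServer` are hypothetical names.

```javascript
const { EventEmitter } = require('events');

// In-memory stand-in for a Redis pub/sub connection (assumption).
const bus = new EventEmitter();

// Write path: after the application writes to MongoDB, it announces the
// change on a channel named after the collection.
function publishChange(collection, change) {
  bus.emit(collection, change);
}

// Read path: a server listens only on the channels its clients care
// about, so it never sees changes to other collections.
function subscribeServer(name, collections, log) {
  collections.forEach(collection =>
    bus.on(collection, change =>
      log.push({ server: name, collection, change })
    )
  );
}
```

A write made outside the application (a shell session, a migration script) never calls `publishChange`, so no server hears about it. That is the "no reactivity on external database operations" drawback.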
Change Streams (MongoDB 3.6+)
[Diagram: servers A, B, C and clients C1 to C8]
// Pass an empty pipeline first; options go in the second argument.
const changeStream = db.Posts.watch([], {
  fullDocument: 'updateLookup'
});
changeStream.on('change', next => {
  // process next document
  // send update to client
});
Pros:
• Decreased load on servers
• Use powerful aggregation pipelines
• Single source of truth
• Reliable, durable, resumable
Cons:
• Limited number of opened streams
• Increased memory usage of MongoDB
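The "resumable" property can be sketched as follows. Each change event carries a resume token in its `_id` field, and a new stream can be opened with the `resumeAfter` option to continue after that event. The helper `watchWithResume` and the `tokenStore` object (with `get()` and `set(token)`) are hypothetical. `collection` is assumed to be any object exposing the driver's `watch(pipeline, options)` method.

```javascript
// Opens a change stream and remembers the resume token of the last
// event handled, so a restarted stream can continue where it left off.
function watchWithResume(collection, pipeline, tokenStore, onChange) {
  const options = { fullDocument: 'updateLookup' };
  const token = tokenStore.get();
  if (token) options.resumeAfter = token; // continue after the last seen event

  const stream = collection.watch(pipeline, options);
  stream.on('change', event => {
    onChange(event);
    tokenStore.set(event._id); // event._id is the resume token
  });
  return stream;
}
```

In a real deployment the token would presumably be persisted somewhere durable, so that a restarted process can pick it up. Resuming only works while the corresponding entry is still present in the oplog.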
Using Change streams in production
• Reduce the number of streams opened
• Watch only the “hot” collections
• Allocate more memory (RAM) to MongoDB
• Increase the poolSize when connecting to MongoDB
• Distribute the change streams to secondaries
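One way to reduce the number of opened streams is to open a single stream per collection on each server and fan its events out to client subscriptions in application code. The class below is a hypothetical sketch of that idea, not the speaker's implementation. It assumes events carry `fullDocument` and that documents have a `workspaceId` field, as in the data model shown earlier.

```javascript
// Wraps ONE change stream and fans its events out to many client
// subscriptions, instead of opening one stream per client.
class SharedStream {
  constructor(stream) {
    this.listeners = new Map(); // workspaceId -> Set of callbacks
    stream.on('change', event => this.dispatch(event));
  }

  // Registers a callback for one workspace; returns an unsubscribe function.
  subscribe(workspaceId, callback) {
    if (!this.listeners.has(workspaceId)) {
      this.listeners.set(workspaceId, new Set());
    }
    this.listeners.get(workspaceId).add(callback);
    return () => this.listeners.get(workspaceId).delete(callback);
  }

  dispatch(event) {
    const doc = event.fullDocument;
    // Delete events carry no fullDocument; this sketch simply skips them.
    if (!doc) return;
    const callbacks = this.listeners.get(doc.workspaceId);
    if (callbacks) callbacks.forEach(cb => cb(event));
  }
}
```

With this shape, the number of streams grows with the number of watched collections rather than with the number of connected clients.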
Monolith to Microservices
[Diagram: servers A, B, C and clients C1 to C8]
• Webhooks
• Versioning
• Notifications
• Analytics
Versioning content
• Watch updates on collection and specific fields
• Save snapshot of post and updates in another collection
// Only match updates that touched the `content` field.
// The change event field is `updatedFields` (not `updateFields`).
const changeStream = db.Posts.watch(
  [
    {
      $match: {
        operationType: 'update',
        'updateDescription.updatedFields.content': { $exists: true }
      }
    }
  ],
  { fullDocument: 'updateLookup' }
);
changeStream.on('change', event => {
  // Store the new content plus a full snapshot of the post.
  db.Revisions.insertOne({
    postId: event.documentKey._id,
    content: event.updateDescription.updatedFields.content,
    updatedAt: new Date(),
    snapshot: event.fullDocument
  });
});
Notifications
• Watch activity stream
• Send notification to email, browser or mobile
const changeStream = db.Activity.watch(
  [
    {
      $match: {
        operationType: 'insert'
      }
    }
  ],
  { maxAwaitTimeMS: 100000 }
);
changeStream.on('change', event => {
  // Insert events include the new document as fullDocument.
  sendNotification(event.fullDocument);
});
Sync with data warehouse (ETL)
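The slide gives no detail on the ETL flow, so the following is only a sketch of one plausible shape: translate each change event into a row-level operation for the warehouse loader. `toWarehouseOp` and the `{ action, id, row }` result format are hypothetical. The event fields used (`operationType`, `documentKey`, `fullDocument`) are the standard change event fields.

```javascript
// Translates a change event into a row-level warehouse operation.
function toWarehouseOp(event) {
  // Collection-level events (drop, rename, ...) have no documentKey.
  const id = event.documentKey && event.documentKey._id;
  switch (event.operationType) {
    case 'insert':
    case 'update':
    case 'replace':
      // Assumes the stream was opened with fullDocument: 'updateLookup',
      // so update events also carry the current document.
      return { action: 'upsert', id, row: event.fullDocument };
    case 'delete':
      return { action: 'delete', id };
    default:
      return null; // not a row-level change; ignored in this sketch
  }
}
```

Treating inserts, updates and replaces uniformly as upserts keeps the loader idempotent, which helps if events are replayed after resuming a stream.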
Key takeaways
• Change streams make it easier to scale real-time apps
• Change streams enable the transition from monolith to microservices
Q & A
