SlideShare a Scribd company logo
Lessons Learned:

Migrating Buffer's Production
Database to MongoDB Atlas
from Mongo 2.4 Dan Farrelly
Who am I? What is Buffer?
! Dan Farrelly - CTO
! buffer.com
! Buffer is a suite of products to help you build your brand and connect with
your customers online.
! Buffer Publish - Schedules and sends 800k+ social media posts per day
! MongoDB 2.4.8
! 4+ TB
! 8 Shards across 2 large instances
! Another cloud provider
! Spun up in January 2013
! Sharded in mid 2014
! MongoDB 3.4
! ? TB
! ? Shards
! MongoDB Atlas for ease of
management, future upgrades
! Migration September 2018
Before After
Why we couldn’t use out-of-box tools
! Live migration tools didn’t support migration of sharded databases from
! Database access was locked down from previous cloud provider
! Could not upgrade database in place
! We wanted to change the number of shards (from 8 to 3)
! We wanted to change some of our shard keys
! But we were/are a team of non-MongoDB-experts
Our migration plan
! Migrate large collections (> 10GB or > 5M documents)
! Build indexes (1-2 days)
! Run delta migrations on large collections
! Enter maintenance mode in the app
! Migrate small collections using mongodump + mongorestore
! Run final delta migrations on large collections
! Validate data & indexes
! Cutover application to target database
! Disable maintenance mode
The Basics
Get your driver versions right
! Check compatibility before you choose your target db version
! Wire Protocol version incompatibility can cause issues
! Be mindful of major driver changes across versions
👍 3.4 2.4
👎 3.6 2.6
Leveraging mongodump
! Use the mongodump and mongorestore versions for your target database -
don’t mix and match!
! You can pipe mongodump in to mongorestore to avoid having to write to
disk in between both steps
! Speed things up w/ --numInsertionWorkersPerCollection
! Skip indexes if you run into issues restoring
mongodump -db “my-app” -c “users” --archive 
| mongorestore --archive 
--numInsertionWorkersPerCollection=2 
--noIndexRestore
Finding bad documents in the source
! Bad documents in 2.4
○ Fields starting with “$”
○ Indexed fields with values greater than 1024 bytes
! Insert all documents into a capped collection in a temp database running
target version
Use Bulk + Delta for largest collections
! Largest collections will be too slow for a mongo dump + restore
! Use bulk and delta migration approach:
○ Bulk insert all data & record timestamp
○ Index a field like updated_at on source collection
○ Find all updated documents and upsert into the target
! Remember to log deleted records in between deltas
Number your scripts
! Feels simple, but makes your migration fool-proof
/scripts
010createIndexes.sh
020staticCollections.sh
100scaleDownWorkers.sh
110enterMaintenanceMode.sh
200tokens.sh
210permissions.sh
220payments.sh
300validation.sh
Double & triple check indexes have been built
! Validate your indexes have been built
! 1 or 2 indexes failing to build could mean hours of headaches for you
! Don’t end up with a collScan on a collection with 1B+ records!
function createAll(){
db.coll.createIndexes([…])
db.other.createIndex({…})
}
createAll()
Train your new database to use indexes (if needed)
! Your new database doesn’t always choose the index you want it to initially
db.<collection>.getPlanCache().clear()
db.<collection>.getPlanCache().clearPlansByQuery(…)
db.runCommand({ planCacheSetFilter: … })
The Advanced
Sharding: Stop the balancer
! Try to ensure your source and target cluster is not trying to balance your
data while you’re migrating
! You don’t want data moving between shards during migration!
! If you’re pre-splitting chunks and stop the balancer you won’t need to
balance during migration on the target anyway 😉
sh.getBalancerState();
sh.isBalancerRunning();
sh.stopBalancer();
sh.setBalancerState(false)
Sharding: Pre-split chunks for faster migration
! Estimate how much of your data should fit within a chunk (64MB)
! Write a script to iterate through your source data and run the split
command on the target collection
for (var i=2012; i<2019; i++) {
for (var j=1; j<=12; j++) {
for (var k=1; k<=31; k++) {
// Our collection had roughly 64MB of data for every 2 hours
// of “profiles” between 2012 and 2018:
for (var hour=2; hour<=24; hour=hour+2) {
db.adminCommand({
split: “buffer.updates”,
middle: { profile_id: ObjectId.fromDate(…), _id: MaxKey } })
Our results
! 4+ TB before to 1.2 TB w/ WiredTiger’s compression
! 8 Shards to 3 shards
! Averaged 40-100 QR/QW during peak time to 0-2
! Maintenance cutover was 4 hours longer than expected due to unforeseen
issues with index building
! Easy upgrade to 3.6 with zero downtime a few months later
And we did it all remotely!
! 11 hours of Zoom video calls during the day of migration
! 1 day of in-person planning with consulting engineer - everything else done
over Slack + Zoom
! If we can perform such a gnarly migration without being MongoDB
experts, so can you!
Questions?
dan@buffer.com
@djfarrelly

More Related Content

What's hot (20)

Mongo Web Apps: OSCON 2011
Mongo Web Apps: OSCON 2011Mongo Web Apps: OSCON 2011
Mongo Web Apps: OSCON 2011
rogerbodamer
 
MongoDB
MongoDBMongoDB
MongoDB
Tharun Srinivasa
 
MongoDB: An Introduction - june-2011
MongoDB:  An Introduction - june-2011MongoDB:  An Introduction - june-2011
MongoDB: An Introduction - june-2011
Chris Westin
 
MongoDB World 2016: Poster Sessions eBook
MongoDB World 2016: Poster Sessions eBookMongoDB World 2016: Poster Sessions eBook
MongoDB World 2016: Poster Sessions eBook
MongoDB
 
Mongodb tutorial at Easylearning Guru
Mongodb tutorial  at Easylearning GuruMongodb tutorial  at Easylearning Guru
Mongodb tutorial at Easylearning Guru
KCC Software Ltd. & Easylearning.guru
 
MongoDB Knowledge Shareing
MongoDB Knowledge ShareingMongoDB Knowledge Shareing
MongoDB Knowledge Shareing
Philip Zhong
 
Basics of MongoDB
Basics of MongoDB Basics of MongoDB
Basics of MongoDB
HabileLabs
 
Using MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherUsing MongoDB + Hadoop Together
Using MongoDB + Hadoop Together
MongoDB
 
Mongodb intro
Mongodb introMongodb intro
Mongodb intro
christkv
 
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...
MongoDB
 
Using Aggregation for analytics
Using Aggregation for analyticsUsing Aggregation for analytics
Using Aggregation for analytics
MongoDB
 
MongoDB presentation
MongoDB presentationMongoDB presentation
MongoDB presentation
Hyphen Call
 
Social Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDBSocial Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDB
Takahiro Inoue
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
NoSQLmatters
 
Mongo DB 102
Mongo DB 102Mongo DB 102
Mongo DB 102
Abhijeet Vaikar
 
Mongodb introduction and_internal(simple)
Mongodb introduction and_internal(simple)Mongodb introduction and_internal(simple)
Mongodb introduction and_internal(simple)
Kai Zhao
 
Mongo db report
Mongo db reportMongo db report
Mongo db report
Hyphen Call
 
MongoDB + Spring
MongoDB + SpringMongoDB + Spring
MongoDB + Spring
Norberto Leite
 
Updating materialized views and caches using kafka
Updating materialized views and caches using kafkaUpdating materialized views and caches using kafka
Updating materialized views and caches using kafka
Zach Cox
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDB
César Trigo
 
Mongo Web Apps: OSCON 2011
Mongo Web Apps: OSCON 2011Mongo Web Apps: OSCON 2011
Mongo Web Apps: OSCON 2011
rogerbodamer
 
MongoDB: An Introduction - june-2011
MongoDB:  An Introduction - june-2011MongoDB:  An Introduction - june-2011
MongoDB: An Introduction - june-2011
Chris Westin
 
MongoDB World 2016: Poster Sessions eBook
MongoDB World 2016: Poster Sessions eBookMongoDB World 2016: Poster Sessions eBook
MongoDB World 2016: Poster Sessions eBook
MongoDB
 
MongoDB Knowledge Shareing
MongoDB Knowledge ShareingMongoDB Knowledge Shareing
MongoDB Knowledge Shareing
Philip Zhong
 
Basics of MongoDB
Basics of MongoDB Basics of MongoDB
Basics of MongoDB
HabileLabs
 
Using MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherUsing MongoDB + Hadoop Together
Using MongoDB + Hadoop Together
MongoDB
 
Mongodb intro
Mongodb introMongodb intro
Mongodb intro
christkv
 
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...
MongoDB World 2019: Raiders of the Anti-patterns: A Journey Towards Fixing Sc...
MongoDB
 
Using Aggregation for analytics
Using Aggregation for analyticsUsing Aggregation for analytics
Using Aggregation for analytics
MongoDB
 
MongoDB presentation
MongoDB presentationMongoDB presentation
MongoDB presentation
Hyphen Call
 
Social Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDBSocial Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDB
Takahiro Inoue
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
NoSQLmatters
 
Mongodb introduction and_internal(simple)
Mongodb introduction and_internal(simple)Mongodb introduction and_internal(simple)
Mongodb introduction and_internal(simple)
Kai Zhao
 
Updating materialized views and caches using kafka
Updating materialized views and caches using kafkaUpdating materialized views and caches using kafka
Updating materialized views and caches using kafka
Zach Cox
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDB
César Trigo
 

Similar to MongoDB World 2019: Lessons Learned: Migrating Buffer's Production Database to MongoDB Atlas from MongoDB 2.4 (20)

Mongodb
MongodbMongodb
Mongodb
Thiago Veiga
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localytics
Benjamin Darfler
 
SQL vs NoSQL, an experiment with MongoDB
SQL vs NoSQL, an experiment with MongoDBSQL vs NoSQL, an experiment with MongoDB
SQL vs NoSQL, an experiment with MongoDB
Marco Segato
 
Mongo db and hadoop driving business insights - final
Mongo db and hadoop   driving business insights - finalMongo db and hadoop   driving business insights - final
Mongo db and hadoop driving business insights - final
MongoDB
 
Solving Your Backup Needs Using Ops Manager, Cloud Manager and Atlas
Solving Your Backup Needs Using Ops Manager, Cloud Manager and AtlasSolving Your Backup Needs Using Ops Manager, Cloud Manager and Atlas
Solving Your Backup Needs Using Ops Manager, Cloud Manager and Atlas
MongoDB
 
Architecting Wide-ranging Analytical Solutions with MongoDB
Architecting Wide-ranging Analytical Solutions with MongoDBArchitecting Wide-ranging Analytical Solutions with MongoDB
Architecting Wide-ranging Analytical Solutions with MongoDB
Matthew Kalan
 
Extend db
Extend dbExtend db
Extend db
Sridhar Valaguru
 
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Landon Robinson
 
Headaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsHeadaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous Applications
Databricks
 
MongoDB.local Atlanta: Modern Data Backup and Recovery from On-Premises to th...
MongoDB.local Atlanta: Modern Data Backup and Recovery from On-Premises to th...MongoDB.local Atlanta: Modern Data Backup and Recovery from On-Premises to th...
MongoDB.local Atlanta: Modern Data Backup and Recovery from On-Premises to th...
MongoDB
 
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
MongoDB
 
Mongo db pefrormance optimization strategies
Mongo db pefrormance optimization strategiesMongo db pefrormance optimization strategies
Mongo db pefrormance optimization strategies
ronwarshawsky
 
Mongo db
Mongo dbMongo db
Mongo db
Gyanendra Yadav
 
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
MongoDB
 
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB present...
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB  present...MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB  present...
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB present...
MongoDB
 
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBayStoring eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
MongoDB
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & Performance
Sasidhar Gogulapati
 
MongoDB Introduction, Installation & Execution
MongoDB Introduction, Installation & ExecutionMongoDB Introduction, Installation & Execution
MongoDB Introduction, Installation & Execution
iFour Technolab Pvt. Ltd.
 
Webinar: Managing Real Time Risk Analytics with MongoDB
Webinar: Managing Real Time Risk Analytics with MongoDB Webinar: Managing Real Time Risk Analytics with MongoDB
Webinar: Managing Real Time Risk Analytics with MongoDB
MongoDB
 
Compact, Compress, De-Duplicate (DAOS)
Compact, Compress, De-Duplicate (DAOS)Compact, Compress, De-Duplicate (DAOS)
Compact, Compress, De-Duplicate (DAOS)
Ulrich Krause
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localytics
Benjamin Darfler
 
SQL vs NoSQL, an experiment with MongoDB
SQL vs NoSQL, an experiment with MongoDBSQL vs NoSQL, an experiment with MongoDB
SQL vs NoSQL, an experiment with MongoDB
Marco Segato
 
Mongo db and hadoop driving business insights - final
Mongo db and hadoop   driving business insights - finalMongo db and hadoop   driving business insights - final
Mongo db and hadoop driving business insights - final
MongoDB
 
Solving Your Backup Needs Using Ops Manager, Cloud Manager and Atlas
Solving Your Backup Needs Using Ops Manager, Cloud Manager and AtlasSolving Your Backup Needs Using Ops Manager, Cloud Manager and Atlas
Solving Your Backup Needs Using Ops Manager, Cloud Manager and Atlas
MongoDB
 
Architecting Wide-ranging Analytical Solutions with MongoDB
Architecting Wide-ranging Analytical Solutions with MongoDBArchitecting Wide-ranging Analytical Solutions with MongoDB
Architecting Wide-ranging Analytical Solutions with MongoDB
Matthew Kalan
 
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Landon Robinson
 
Headaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsHeadaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous Applications
Databricks
 
MongoDB.local Atlanta: Modern Data Backup and Recovery from On-Premises to th...
MongoDB.local Atlanta: Modern Data Backup and Recovery from On-Premises to th...MongoDB.local Atlanta: Modern Data Backup and Recovery from On-Premises to th...
MongoDB.local Atlanta: Modern Data Backup and Recovery from On-Premises to th...
MongoDB
 
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
Lessons Learned from Building a Multi-Tenant Saas Content Management System o...
MongoDB
 
Mongo db pefrormance optimization strategies
Mongo db pefrormance optimization strategiesMongo db pefrormance optimization strategies
Mongo db pefrormance optimization strategies
ronwarshawsky
 
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
MongoDB
 
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB present...
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB  present...MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB  present...
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB present...
MongoDB
 
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBayStoring eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
MongoDB
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & Performance
Sasidhar Gogulapati
 
MongoDB Introduction, Installation & Execution
MongoDB Introduction, Installation & ExecutionMongoDB Introduction, Installation & Execution
MongoDB Introduction, Installation & Execution
iFour Technolab Pvt. Ltd.
 
Webinar: Managing Real Time Risk Analytics with MongoDB
Webinar: Managing Real Time Risk Analytics with MongoDB Webinar: Managing Real Time Risk Analytics with MongoDB
Webinar: Managing Real Time Risk Analytics with MongoDB
MongoDB
 
Compact, Compress, De-Duplicate (DAOS)
Compact, Compress, De-Duplicate (DAOS)Compact, Compress, De-Duplicate (DAOS)
Compact, Compress, De-Duplicate (DAOS)
Ulrich Krause
 
Ad

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 
Ad

Recently uploaded (20)

ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
Jasper Oosterveld
 
Domino IQ – Was Sie erwartet, erste Schritte und Anwendungsfälle
Domino IQ – Was Sie erwartet, erste Schritte und AnwendungsfälleDomino IQ – Was Sie erwartet, erste Schritte und Anwendungsfälle
Domino IQ – Was Sie erwartet, erste Schritte und Anwendungsfälle
panagenda
 
Cybersecurity Fundamentals: Apprentice - Palo Alto Certificate
Cybersecurity Fundamentals: Apprentice - Palo Alto CertificateCybersecurity Fundamentals: Apprentice - Palo Alto Certificate
Cybersecurity Fundamentals: Apprentice - Palo Alto Certificate
VICTOR MAESTRE RAMIREZ
 
TimeSeries Machine Learning - PyData London 2025
TimeSeries Machine Learning - PyData London 2025TimeSeries Machine Learning - PyData London 2025
TimeSeries Machine Learning - PyData London 2025
Suyash Joshi
 
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and ImplementationAI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
Christine Shepherd
 
Developing Schemas with FME and Excel - Peak of Data & AI 2025
Developing Schemas with FME and Excel - Peak of Data & AI 2025Developing Schemas with FME and Excel - Peak of Data & AI 2025
Developing Schemas with FME and Excel - Peak of Data & AI 2025
Safe Software
 
Compliance-as-a-Service document pdf text
Compliance-as-a-Service document pdf textCompliance-as-a-Service document pdf text
Compliance-as-a-Service document pdf text
Earthling security
 
Palo Alto Networks Cybersecurity Foundation
Palo Alto Networks Cybersecurity FoundationPalo Alto Networks Cybersecurity Foundation
Palo Alto Networks Cybersecurity Foundation
VICTOR MAESTRE RAMIREZ
 
Trends Report: Artificial Intelligence (AI)
Trends Report: Artificial Intelligence (AI)Trends Report: Artificial Intelligence (AI)
Trends Report: Artificial Intelligence (AI)
Brian Ahier
 
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Anish Kumar
 
Trends Artificial Intelligence - Mary Meeker
Trends Artificial Intelligence - Mary MeekerTrends Artificial Intelligence - Mary Meeker
Trends Artificial Intelligence - Mary Meeker
Clive Dickens
 
Your startup on AWS - How to architect and maintain a Lean and Mean account
Your startup on AWS - How to architect and maintain a Lean and Mean accountYour startup on AWS - How to architect and maintain a Lean and Mean account
Your startup on AWS - How to architect and maintain a Lean and Mean account
angelo60207
 
Oracle Cloud Infrastructure AI Foundations
Oracle Cloud Infrastructure AI FoundationsOracle Cloud Infrastructure AI Foundations
Oracle Cloud Infrastructure AI Foundations
VICTOR MAESTRE RAMIREZ
 
Introduction to Typescript - GDG On Campus EUE
Introduction to Typescript - GDG On Campus EUEIntroduction to Typescript - GDG On Campus EUE
Introduction to Typescript - GDG On Campus EUE
Google Developer Group On Campus European Universities in Egypt
 
MCP vs A2A vs ACP: Choosing the Right Protocol | Bluebash
MCP vs A2A vs ACP: Choosing the Right Protocol | BluebashMCP vs A2A vs ACP: Choosing the Right Protocol | Bluebash
MCP vs A2A vs ACP: Choosing the Right Protocol | Bluebash
Bluebash
 
Agentic AI: Beyond the Buzz- LangGraph Studio V2
Agentic AI: Beyond the Buzz- LangGraph Studio V2Agentic AI: Beyond the Buzz- LangGraph Studio V2
Agentic AI: Beyond the Buzz- LangGraph Studio V2
Shashikant Jagtap
 
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdfBoosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Alkin Tezuysal
 
Dancing with AI - A Developer's Journey.pptx
Dancing with AI - A Developer's Journey.pptxDancing with AI - A Developer's Journey.pptx
Dancing with AI - A Developer's Journey.pptx
Elliott Richmond
 
Evaluation Challenges in Using Generative AI for Science & Technical Content
Evaluation Challenges in Using Generative AI for Science & Technical ContentEvaluation Challenges in Using Generative AI for Science & Technical Content
Evaluation Challenges in Using Generative AI for Science & Technical Content
Paul Groth
 
LSNIF: Locally-Subdivided Neural Intersection Function
LSNIF: Locally-Subdivided Neural Intersection FunctionLSNIF: Locally-Subdivided Neural Intersection Function
LSNIF: Locally-Subdivided Neural Intersection Function
Takahiro Harada
 
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
Jasper Oosterveld
 
Domino IQ – Was Sie erwartet, erste Schritte und Anwendungsfälle
Domino IQ – Was Sie erwartet, erste Schritte und AnwendungsfälleDomino IQ – Was Sie erwartet, erste Schritte und Anwendungsfälle
Domino IQ – Was Sie erwartet, erste Schritte und Anwendungsfälle
panagenda
 
Cybersecurity Fundamentals: Apprentice - Palo Alto Certificate
Cybersecurity Fundamentals: Apprentice - Palo Alto CertificateCybersecurity Fundamentals: Apprentice - Palo Alto Certificate
Cybersecurity Fundamentals: Apprentice - Palo Alto Certificate
VICTOR MAESTRE RAMIREZ
 
TimeSeries Machine Learning - PyData London 2025
TimeSeries Machine Learning - PyData London 2025TimeSeries Machine Learning - PyData London 2025
TimeSeries Machine Learning - PyData London 2025
Suyash Joshi
 
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and ImplementationAI Agents in Logistics and Supply Chain Applications Benefits and Implementation
AI Agents in Logistics and Supply Chain Applications Benefits and Implementation
Christine Shepherd
 
Developing Schemas with FME and Excel - Peak of Data & AI 2025
Developing Schemas with FME and Excel - Peak of Data & AI 2025Developing Schemas with FME and Excel - Peak of Data & AI 2025
Developing Schemas with FME and Excel - Peak of Data & AI 2025
Safe Software
 
Compliance-as-a-Service document pdf text
Compliance-as-a-Service document pdf textCompliance-as-a-Service document pdf text
Compliance-as-a-Service document pdf text
Earthling security
 
Palo Alto Networks Cybersecurity Foundation
Palo Alto Networks Cybersecurity FoundationPalo Alto Networks Cybersecurity Foundation
Palo Alto Networks Cybersecurity Foundation
VICTOR MAESTRE RAMIREZ
 
Trends Report: Artificial Intelligence (AI)
Trends Report: Artificial Intelligence (AI)Trends Report: Artificial Intelligence (AI)
Trends Report: Artificial Intelligence (AI)
Brian Ahier
 
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in S...
Anish Kumar
 
Trends Artificial Intelligence - Mary Meeker
Trends Artificial Intelligence - Mary MeekerTrends Artificial Intelligence - Mary Meeker
Trends Artificial Intelligence - Mary Meeker
Clive Dickens
 
Your startup on AWS - How to architect and maintain a Lean and Mean account
Your startup on AWS - How to architect and maintain a Lean and Mean accountYour startup on AWS - How to architect and maintain a Lean and Mean account
Your startup on AWS - How to architect and maintain a Lean and Mean account
angelo60207
 
Oracle Cloud Infrastructure AI Foundations
Oracle Cloud Infrastructure AI FoundationsOracle Cloud Infrastructure AI Foundations
Oracle Cloud Infrastructure AI Foundations
VICTOR MAESTRE RAMIREZ
 
MCP vs A2A vs ACP: Choosing the Right Protocol | Bluebash
MCP vs A2A vs ACP: Choosing the Right Protocol | BluebashMCP vs A2A vs ACP: Choosing the Right Protocol | Bluebash
MCP vs A2A vs ACP: Choosing the Right Protocol | Bluebash
Bluebash
 
Agentic AI: Beyond the Buzz- LangGraph Studio V2
Agentic AI: Beyond the Buzz- LangGraph Studio V2Agentic AI: Beyond the Buzz- LangGraph Studio V2
Agentic AI: Beyond the Buzz- LangGraph Studio V2
Shashikant Jagtap
 
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdfBoosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Boosting MySQL with Vector Search -THE VECTOR SEARCH CONFERENCE 2025 .pdf
Alkin Tezuysal
 
Dancing with AI - A Developer's Journey.pptx
Dancing with AI - A Developer's Journey.pptxDancing with AI - A Developer's Journey.pptx
Dancing with AI - A Developer's Journey.pptx
Elliott Richmond
 
Evaluation Challenges in Using Generative AI for Science & Technical Content
Evaluation Challenges in Using Generative AI for Science & Technical ContentEvaluation Challenges in Using Generative AI for Science & Technical Content
Evaluation Challenges in Using Generative AI for Science & Technical Content
Paul Groth
 
LSNIF: Locally-Subdivided Neural Intersection Function
LSNIF: Locally-Subdivided Neural Intersection FunctionLSNIF: Locally-Subdivided Neural Intersection Function
LSNIF: Locally-Subdivided Neural Intersection Function
Takahiro Harada
 

MongoDB World 2019: Lessons Learned: Migrating Buffer's Production Database to MongoDB Atlas from MongoDB 2.4

  • 1. Lessons Learned:
 Migrating Buffer's Production Database to MongoDB Atlas from Mongo 2.4 Dan Farrelly
  • 2. Who am I? What is Buffer? ! Dan Farrelly - CTO ! buffer.com ! Buffer is a suite of products to help you build your brand and connect with your customers online. ! Buffer Publish - Schedules and sends 800k+ social media posts per day
  • 3. ! MongoDB 2.4.8 ! 4+ TB ! 8 Shards across 2 large instances ! Another cloud provider ! Spun up in January 2013 ! Sharded in mid 2014 ! MongoDB 3.4 ! ? TB ! ? Shards ! MongoDB Atlas for ease of management, future upgrades ! Migration September 2018 Before After
  • 4. Why we couldn’t use out-of-box tools ! Live migration tools didn’t support migration of sharded databases from ! Database access was locked down from previous cloud provider ! Could not upgrade database in place ! We wanted to change the number of shards (from 8 to 3) ! We wanted to change some of our shard keys ! But we were/are a team of non-MongoDB-experts
  • 5. Our migration plan ! Migrate large collections (> 10GB or > 5M documents) ! Build indexes (1-2 days) ! Run delta migrations on large collections ! Enter maintenance mode in the app ! Migrate small collections using mongodump + mongorestore ! Run final delta migrations on large collections ! Validate data & indexes ! Cutover application to target database ! Disable maintenance mode
  • 7. Get your driver versions right ! Check compatibility before you choose your target db version ! Wire Protocol version incompatibility can cause issues ! Be mindful of major driver changes across versions 👍 3.4 2.4 👎 3.6 2.6
  • 8. Leveraging mongodump ! Use the mongodump and mongorestore versions for your target database - don’t mix and match! ! You can pipe mongodump in to mongorestore to avoid having to write to disk in between both steps ! Speed things up w/ --numInsertionWorkersPerCollection ! Skip indexes if you run into issues restoring mongodump -db “my-app” -c “users” --archive | mongorestore --archive --numInsertionWorkersPerCollection=2 --noIndexRestore
  • 9. Finding bad documents in the source ! Bad documents in 2.4 ○ Fields starting with “$” ○ Indexed fields with values greater than 1024 bytes ! Insert all documents into a capped collection in a temp database running target version
  • 10. Use Bulk + Delta for largest collections ! Largest collections will be too slow for a mongo dump + restore ! Use bulk and delta migration approach: ○ Bulk insert all data & record timestamp ○ Index a field like updated_at on source collection ○ Find all updated documents and upsert into the target ! Remember to log deleted records in between deltas
  • 11. Number your scripts ! Feels simple, but makes your migration fool-proof /scripts 010createIndexes.sh 020staticCollections.sh 100scaleDownWorkers.sh 110enterMaintenanceMode.sh 200tokens.sh 210permissions.sh 220payments.sh 300validation.sh
  • 12. Double & triple check indexes have been built ! Validate your indexes have been built ! 1 or 2 indexes failing to build could mean hours of headaches for you ! Don’t end up with a collScan on a collection with 1B+ records! function createAll(){ db.coll.createIndexes([…]) db.other.createIndex({…}) } createAll()
  • 13. Train your new database to use indexes (if needed) ! Your new database doesn’t always choose the index you want it to initially db.<collection>.getPlanCache().clear() db.<collection>.getPlanCache().clearPlansByQuery(…) db.runCommand({ planCacheSetFilter: … })
  • 15. Sharding: Stop the balancer ! Try to ensure your source and target cluster is not trying to balance your data while you’re migrating ! You don’t want data moving between shards during migration! ! If you’re pre-splitting chunks and stop the balancer you won’t need to balance during migration on the target anyway 😉 sh.getBalancerState(); sh.isBalancerRunning(); sh.stopBalancer(); sh.setBalancerState(false)
  • 16. Sharding: Pre-split chunks for faster migration ! Estimate how much of your data should fit within a chunk (64MB) ! Write a script to iterate through your source data and run the split command on the target collection for (var i=2012; i<2019; i++) { for (var j=1; j<=12; j++) { for (var k=1; k<=31; k++) { // Our collection had roughly 64MB of data for every 2 hours // of “profiles” between 2012 and 2018: for (var hour=2; hour<=24; hour=hour+2) { db.adminCommand({ split: “buffer.updates”, middle: { profile_id: ObjectId.fromDate(…), _id: MaxKey } })
  • 17. Our results ! 4+ TB before to 1.2 TB w/ WiredTiger’s compression ! 8 Shards to 3 shards ! Averaged 40-100 QR/QW during peak time to 0-2 ! Maintenance cutover was 4 hours longer than expected due to unforeseen issues with index building ! Easy upgrade to 3.6 with zero downtime a few months later
  • 18. And we did it all remotely! ! 11 hours of Zoom video calls during the day of migration ! 1 day of in-person planning with consulting engineer - everything else done over Slack + Zoom ! If we can perform such a gnarly migration without being MongoDB experts, so can you!