MMUG18
MySQL	Failover	and	Orchestrator
Simon	Mudd simon.mudd@booking.com
17th May	2017
Thanks to Tuenti
• For allowing the use of their offices for this presentation
17/05/2017 Madrid MySQL Users Group - MMUG18
Content
• Handling	failover	with	MySQL
• Downtime	&	Requirements
• MySQL	Clustering	solutions
• Non-clustering solutions and considerations
• Orchestrator
• Questions
Is	Downtime	Acceptable?
• Do	you	have	a	system	that	needs	to	run	24	x	7	?
• Not	everyone	does
• If	you	have	a	website	then	generally	downtime	is	not	acceptable
Requirements
Goal:	Run	24	x	7	x	365	with	no downtime
• Is	this	really	necessary?
• If	you	ask	management	they’ll	always	say	yes…
• What	is	the	cost?
• Shorter downtime requirements mean more effort to meet them
• How	do	you	reliably detect	failure?		Hard	problem	to	solve
If	you	accept	downtime	how	much	can	you	really	tolerate?
• 1s,	5s,	30s,	1min	?
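To put numbers on the question above, here is a minimal sketch of the arithmetic linking an availability target to a yearly downtime budget. The targets and the 30-second outage length are illustrative assumptions, not figures from the talk.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignores leap years


def downtime_budget(availability: float) -> float:
    """Seconds of downtime per year allowed by an availability target (0..1)."""
    return (1.0 - availability) * SECONDS_PER_YEAR


def max_availability(outages_per_year: int, seconds_per_outage: float) -> float:
    """Availability reached if every outage lasts exactly seconds_per_outage."""
    downtime = outages_per_year * seconds_per_outage
    return 1.0 - downtime / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Illustrative targets only: "three nines" to "five nines".
    for target in (0.999, 0.9999, 0.99999):
        budget = downtime_budget(target)
        print(f"{target:.3%}: {budget / 60:.1f} min/year, "
              f"or about {budget / 30:.0f} failovers of 30s each")
```

The point of the exercise: the tighter the target, the fewer failovers per year fit in the budget, so detection and recovery time start to dominate.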
What	options	are	available?
• MySQL	Cluster
• carrier	grade
• very	high	uptime
• Not	InnoDB – specialised workloads	
• Galera
• Often	with	asynchronous	replication	between	datacentres
• InnoDB Cluster
• Very	new
• All	require	clients	to	take	action	on	failure	of	a	node
• If	you	use	a	proxy	that	can	fail	too…
What	options	are	available?
“Cluster	solutions”
• Do	not	work	well	cross-DC	due	to	latency
• If	you	accept	writes	into	multiple	masters	there’s	a	chance	of	conflict
• Slows	things	down
• InnoDB Cluster	now	does	not	recommend	this	behaviour – requires	care
• Only small setups fit in a single data centre, so anything larger also needs adapting
• Cluster	setups	do	not	scale	easily	to	10	or	more	servers
What	options	are	available?
• Standard	MySQL,	MariaDB,	Amazon	RDS,	Google	Cloud	SQL,	…
• Read	scale-out
• Asynchronous	replication
• Semi-sync helps ensure data is “somewhere else” when acknowledging a transaction, at the cost of some commit latency
• If you are outside the cloud: different setups
• SBR	or	RBR?
• No GTID, Oracle GTID or MariaDB GTID?
• Optional	semi-sync?
• If you are outside the cloud: do it yourself
• MHA
• MariaDB Replication	Manager
• Orchestrator
Orchestrator
• Handles	master	failover,	but	more…
• GUI	to	manage	and	visualise topology	– very	handy
• CLI	to	do	the	same	things	– good	for	scripting
• API	calls	to	run	at	a	distance	(more	generic	interface)
• Needs	a	DB	backend	to	store	state.
• Normally	MySQL	but	can	be	SQLite
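A minimal sketch of calling the HTTP API from a script. The base URL, port 3000 and the `discover` and `clusters` endpoint paths are assumptions to verify against the documentation of the orchestrator version in use; `call_api` is a hypothetical helper and is not invoked here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def api_url(base: str, *parts: object) -> str:
    """Build an API URL such as <base>/api/discover/<host>/<port>."""
    path = "/".join(quote(str(p), safe="") for p in parts)
    return f"{base.rstrip('/')}/api/{path}"


def call_api(base: str, *parts: object, timeout: float = 5.0):
    """GET the endpoint and decode the JSON reply (needs a reachable service)."""
    with urlopen(api_url(base, *parts), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    base = "http://orchestrator.example.com:3000"  # hypothetical address
    print(api_url(base, "discover", "db1.example.com", 3306))
    print(api_url(base, "clusters"))
```

Because it is plain HTTP and JSON, the same calls work from any language or from curl.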
Orchestrator
What	failures	does	it	handle?
• Master	failures	– needs	to	talk	to	external	systems
• Intermediate	master	failures	– can	handle	on	its	own
• Does	not care	about	slaves	or	applications
• Works	with	GTID:	Oracle	or	MariaDB
• Works	without	using	GTID:	Can	add	Pseudo-GTID (events	injected	on	
the	master	are	used	to	find	a	match)	so	no	need	to	migrate	to	GTID	if	
not	wanted
• Handles	multi-level	topologies
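The Pseudo-GTID idea can be illustrated with a toy model: each log is a list of statements, the master periodically injects a unique marker, and a replica is repositioned under another server by finding its last marker in that server's log. This is a simplified sketch, not orchestrator's actual matching code; the `pseudo-gtid:` prefix and the list-of-strings log format are assumptions made for the example.

```python
from typing import List, Optional

MARKER = "pseudo-gtid:"  # hypothetical prefix for the injected events


def last_marker_index(log: List[str]) -> Optional[int]:
    """Index of the most recent injected marker in a log, or None."""
    for idx in range(len(log) - 1, -1, -1):
        if log[idx].startswith(MARKER):
            return idx
    return None


def resume_index(replica_log: List[str], target_log: List[str]) -> Optional[int]:
    """Index in target_log of the first event the replica has not applied.

    Returns None when no common marker exists or the logs disagree.
    """
    r_idx = last_marker_index(replica_log)
    if r_idx is None:
        return None
    marker = replica_log[r_idx]
    applied_after = replica_log[r_idx + 1:]
    for t_idx, stmt in enumerate(target_log):
        if stmt == marker:
            start = t_idx + 1
            following = target_log[start:start + len(applied_after)]
            if following != applied_after:
                return None  # replica is ahead of the target, or logs diverge
            return start + len(applied_after)
    return None


if __name__ == "__main__":
    replica = ["pseudo-gtid:0002", "INSERT c"]
    target = ["INSERT b", "pseudo-gtid:0002", "INSERT c", "INSERT d"]
    print(resume_index(replica, target))
```

The two logs have different offsets, yet the shared marker is enough to line them up, which is why no migration to real GTID is needed.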
Orchestrator	GUI
Orchestrator	CLI
Over	100	commands	you	can	use
• E.g.
• relocate
• discover
• begin-downtime,	end-downtime
• topology
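A minimal sketch of driving the CLI from a script. It assumes the `orchestrator` binary is on the PATH and that commands take the `-c <command> -i <instance> [-d <destination>]` form; the host names are hypothetical, and `run` is defined but not called.

```python
import subprocess
from typing import List, Optional


def orchestrator_argv(command: str, instance: str,
                      destination: Optional[str] = None,
                      binary: str = "orchestrator") -> List[str]:
    """Build the argument list for one orchestrator CLI call."""
    argv = [binary, "-c", command, "-i", instance]
    if destination is not None:
        argv += ["-d", destination]  # e.g. the new master for 'relocate'
    return argv


def run(argv: List[str]) -> str:
    """Execute the command and return its stdout (raises on a non-zero exit)."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    print(orchestrator_argv("discover", "db1.example.com:3306"))
    print(orchestrator_argv("topology", "db1.example.com:3306"))
    print(orchestrator_argv("relocate", "db3.example.com:3306",
                            destination="db2.example.com:3306"))
```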
Failure	Notifications
• Using the hooks, orchestrator can talk to Jabber or email to advise of the actions taken
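A minimal sketch of an email hook, assuming orchestrator is configured to run a script with the failure type, cluster, failed host and promoted host as four arguments. The script name, addresses and the local SMTP relay are hypothetical.

```python
import smtplib
import sys
from email.message import EmailMessage


def build_message(failure_type: str, cluster: str, failed_host: str,
                  successor_host: str,
                  sender: str = "orchestrator@example.com",
                  recipient: str = "dba@example.com") -> EmailMessage:
    """Format a short notification describing what happened."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[orchestrator] {failure_type} on {cluster}"
    msg.set_content(
        f"Failed host: {failed_host}\n"
        f"Promoted host: {successor_host or '(none)'}\n"
    )
    return msg


def send(msg: EmailMessage, relay: str = "localhost") -> None:
    """Hand the message to an SMTP relay (assumed to listen locally)."""
    with smtplib.SMTP(relay) as smtp:
        smtp.send_message(msg)


if __name__ == "__main__":
    if len(sys.argv) == 5:
        send(build_message(*sys.argv[1:5]))
    else:
        print("usage: notify.py FAILURE_TYPE CLUSTER FAILED_HOST SUCCESSOR_HOST")
```

Keeping the formatting separate from the sending makes it easy to add a second channel, such as a chat message, with the same text.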
Failure	Auditing
Orchestrator	Setup
• Source	at	github.com/github/orchestrator
• Binaries written in Go
• Daemon	runs	web	service	and	discovery,	client	on	each	MySQL	server
• State	stored	in	MySQL	/	SQLite
• Single JSON configuration file: /etc/orchestrator.conf.json
• How	to	reach	backend	database	(stores	state)
• How to recognise replication delay
• Most	defaults	are	good	to	get	you	going
• Which	systems	you	want	to	trigger	recovery	on
• Hooks	to	handle	recovery	(talking	to	external	systems)
• If	you	need	help	please	ask
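A minimal sketch that renders a small configuration covering the items listed above. The key names follow orchestrator's sample configuration as far as known and should be checked against the version in use (older releases name some keys differently); hosts, credentials, the lag query and the hook path are hypothetical.

```python
import json
from typing import List


def render_config(backend_host: str, backend_port: int,
                  recover_clusters: List[str], hook: str) -> str:
    """Return a JSON document suitable as a starting point for the config file."""
    config = {
        # How to reach the backend database that stores state
        "MySQLOrchestratorHost": backend_host,
        "MySQLOrchestratorPort": backend_port,
        "MySQLOrchestratorDatabase": "orchestrator",
        "MySQLOrchestratorUser": "orchestrator_srv",   # hypothetical account
        "MySQLOrchestratorPassword": "change-me",
        # Account used to connect to the monitored MySQL servers
        "MySQLTopologyUser": "orchestrator",           # hypothetical account
        "MySQLTopologyPassword": "change-me",
        # How to recognise replication delay (hypothetical heartbeat table)
        "ReplicationLagQuery": "SELECT lag_seconds FROM meta.heartbeat",
        # Which clusters may trigger automatic recovery
        "RecoverMasterClusterFilters": recover_clusters,
        "RecoverIntermediateMasterClusterFilters": recover_clusters,
        # Hooks that talk to external systems after a failover
        "PostFailoverProcesses": [hook],
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render_config("127.0.0.1", 3306, ["main"],
                        "/usr/local/bin/notify.py {failureType} "
                        "{failureCluster} {failedHost} {successorHost}"))
```

Everything not listed falls back to the defaults, which matches the advice that most defaults are good enough to get going.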
Orchestrator	Characteristics
• Discover one	server	in	your	cluster	and	orchestrator	will	find	the	
others
• Detects	new	servers	in	the	cluster	automatically
• Notifies	you	of	problems	seen
• Recovery	is	optional	(per	cluster)
• Optional	selection	of	candidate	masters	or	servers	to	blacklist
• Global	ON /	OFF switch	– handy	if	several	failures	happen	at	once
• For	paranoid	DBAs,	so	far	orchestrator	has	always	done	the	right	thing
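The "discover one server and find the others" behaviour can be sketched as a breadth-first walk over replication links. This is a toy model, not orchestrator's code: `probe` is a hypothetical callback returning a server's master and its replicas, information a real tool might read from commands such as SHOW SLAVE STATUS and SHOW SLAVE HOSTS.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Set, Tuple

# probe(host) -> (master of host or None, list of replicas of host)
Probe = Callable[[str], Tuple[Optional[str], List[str]]]


def discover_cluster(seed: str, probe: Probe) -> Set[str]:
    """Walk replication links up and down from one seed server."""
    seen: Set[str] = {seed}
    queue = deque([seed])
    while queue:
        host = queue.popleft()
        master, replicas = probe(host)
        neighbours = list(replicas)
        if master is not None:
            neighbours.append(master)
        for neighbour in neighbours:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Hypothetical three-level topology: m -> im -> r1, and m -> r2
TOPOLOGY: Dict[str, Tuple[Optional[str], List[str]]] = {
    "m": (None, ["im", "r2"]),
    "im": ("m", ["r1"]),
    "r1": ("im", []),
    "r2": ("m", []),
}

if __name__ == "__main__":
    print(sorted(discover_cluster("r1", TOPOLOGY.__getitem__)))
```

Starting from a leaf replica still reaches every server, because the walk follows links towards the master as well as towards the replicas.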
Orchestrator HA?
Orchestrator	can	be	run	in	HA	mode
• Multiple	daemons	will	co-operate	so	if	one	fails	another	one	takes	
over	(they	share	the	database	backend)
• Use	a	load	balancer	to	provide	an	HA	GUI	service
• Use	nginx (or	similar)	for	authentication	and	TLS	if	needed
• Upgrades	are	easier
• Replicate	the	orchestrator	MySQL	backend	to	not	lose	data
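How several daemons can co-operate through a shared backend can be sketched with a lease row: whoever holds an unexpired lease is active, and another node takes over once the lease lapses. This is a simplified illustration, not orchestrator's actual election logic; SQLite stands in for the shared MySQL backend, and the table and column names are made up for the example.

```python
import sqlite3


def init(conn: sqlite3.Connection) -> None:
    """Create the single-row lease table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS leader ("
        " id INTEGER PRIMARY KEY CHECK (id = 1),"
        " node TEXT NOT NULL,"
        " expires REAL NOT NULL)"
    )


def try_acquire(conn: sqlite3.Connection, node: str, now: float,
                lease_seconds: float = 10.0) -> bool:
    """Try to become, or remain, the active node. Returns True on success."""
    with conn:  # one transaction: commit on success, roll back on error
        # Claim the lease if nobody has ever held it.
        conn.execute(
            "INSERT OR IGNORE INTO leader (id, node, expires) VALUES (1, ?, ?)",
            (node, now + lease_seconds),
        )
        # Renew our own lease, or take over one that has expired.
        conn.execute(
            "UPDATE leader SET node = ?, expires = ? "
            "WHERE id = 1 AND (node = ? OR expires <= ?)",
            (node, now + lease_seconds, node, now),
        )
        row = conn.execute("SELECT node FROM leader WHERE id = 1").fetchone()
    return row[0] == node


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    init(db)
    print(try_acquire(db, "daemon-a", now=0.0))
    print(try_acquire(db, "daemon-b", now=5.0))
    print(try_acquire(db, "daemon-b", now=11.0))
```

The backend is the single point of coordination in this model, which is why the last bullet about replicating it matters.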
Does	it	Scale?
Yes
• Booking.com has	a	large	installation	with	a	single	cluster	monitoring		
thousands	of	MySQL	servers
• Recommended	by	YouTube	for	managing	Vitess servers
• Quite	a	number	of	other	users	but	they	are	not	very	visible
Future	work
• Simplify	configuration	and	setup	so	more	people	can	use	it
• Improve	scalability
• Make	it	work	on	Amazon	RDS
• Spread	the	word…
Further	help	needed?
• github.com/github/orchestrator
• for	Issues	(Problems	/	Questions)	and	Pull	Requests	(patches)
• Google	Group:	Orchestrator	MySQL
• https://groups.google.com/forum/#!forum/orchestrator-mysql
• Feel	free	to	contact	me	and	I	will	try	to	help	provide	pointers
Oh,	and	Booking.com is	hiring!
• Almost	any	role:
• MySQL	Engineer	/	DBA
• System	Administrator
• System	Engineer
• Site	Reliability	Engineer
• Developer
• Designer
• Technical	Team	Lead
• Product	Owner
• Data	Scientist
• And	many	more…
• https://workingatbooking.com/
Questions
?
