Using and Benchmarking Galera
   in Different Architectures

                       Henrik Ingo

                   Percona Live UK
                  London, 2012-12-04



            Please share and reuse this presentation
     licensed under Creative Commons Attribution license
           http://creativecommons.org/licenses/by/3.0/
Agenda



MySQL Galera
* Synchronous multi-master clustering, what does it mean?
* Load balancing and other options
* WAN replication
* How network partitioning is handled
* How network partitioning is handled in WAN replication

How does it perform?
* In memory workload
* Scale-out for writes - how is it possible?
* Disk bound workload
* WAN replication
* Parallel slave threads
* Allowing slave to replicate (commit) out-of-order
* NDB shootout
Using and Benchmarking Galera in Different Architectures
2012-04-12                                                                                      2
About Codership Oy



●   Participated in 3 MySQL
    cluster developments
    since 2003
●   Started Galera work 2007
●   Galera is free, open
    source. Commercial
    support from Codership
    and partners.
●   Percona XtraDB Cluster
    and MariaDB Galera
    Cluster launched 2012
Synchronous Multi-Master Clustering
Galera in a nutshell



●   True multi-master:
    Read & write to any node
●   Synchronous replication
●   No slave lag or integrity
    issues
●   No master-slave failovers,
    no VIP needed
●   Multi-threaded slave
●   Automatic node
    provisioning

[Diagram: three DBMS nodes connected by Galera]

Load balancing



What do you mean no
failover???
●   Use a load balancer
●   Application sees just one IP
●   Write to any available
    node, round-robin
●   If node fails, just write to
    another one
●   What if load balancer fails?
    -> Turtles all the way down

[Diagram: one LB in front of three DBMS nodes connected by Galera]

Protip: JDBC, PHP come with built-in load balancing!




●   No Single Point of Failure
●   One less layer of network
    components
●   Is aware of MySQL
    transaction states and
    errors
●   Sysbench does this
    internally too (except it doesn't
    really fail over)

[Diagram: two LBs in front of three DBMS nodes connected by Galera]
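
The client-side balancing idea on this slide can be sketched in a few lines of Python. This is only an illustration of round-robin with failover, not how the JDBC or PHP drivers implement it: `make_balanced_connect`, the host names and `fake_connect` are all hypothetical, and `connect` stands in for whatever driver call the application really uses.

```python
import itertools


def make_balanced_connect(nodes, connect):
    """Return a function that opens a connection to the next reachable node.

    `nodes` is a list of host names; `connect` is any callable that takes a
    host name and either returns a connection or raises ConnectionError.
    Both are placeholders (assumptions) for the real driver in use.
    """
    ring = itertools.cycle(nodes)

    def balanced_connect():
        # Try each node at most once per call, continuing where we left off.
        for _ in range(len(nodes)):
            host = next(ring)
            try:
                return connect(host)
            except ConnectionError:
                continue  # node is down: just move on to the next one
        raise ConnectionError("no Galera node reachable")

    return balanced_connect


# Usage with a fake driver in which node "db2" is down.
def fake_connect(host):
    if host == "db2":
        raise ConnectionError(host)
    return f"connection to {host}"


get_conn = make_balanced_connect(["db1", "db2", "db3"], fake_connect)
first, second, third = get_conn(), get_conn(), get_conn()
print(first, second, third, sep="\n")
```

Because every Galera node accepts writes, skipping a dead node is all the "failover" the client needs in this sketch.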

Load balancer per each app node




●   Also no Single Point of
    Failure
●   LB is an additional layer,
    but localhost = pretty fast
●   Need to manage more
    load balancers
●   Good for languages other
    than Java, PHP

[Diagram: two LBs in front of three DBMS nodes connected by Galera]

Whole stack cluster (no load balancing)


●   One DB node per app server,
    usually same host
●   LB on HTTP or DNS level
●   Each app server connects to
    localhost
●   Simple
●   Usually app server cpu is
    bottleneck
    –   Bad: This is a bit wasteful
        architecture, especially if DB is
        large
    –   Good: Replication overhead w
        Galera is negligible

[Diagram: one LB in front of three DBMS nodes connected by Galera]
You can still do VIP based failovers. But why?




[Diagram: two panels, before and after failover; in each, a clustering framework manages a VIP in front of three DBMS nodes connected by Galera]

WAN replication




●   Works fine
●   Use higher timeouts
●   No impact on reads
●   No impact within a
    transaction
●   Adds 100-300 ms to
    commit latency (see
    benchmarks)

[Diagram: three DBMS nodes]


WAN with MySQL asynchronous replication



●   You can mix Galera replication and
    MySQL replication
●   Good option on slow WAN link
    (China, Australia)
●   You'll possibly need more nodes
    than in a pure Galera cluster
●   Remember to watch out for slave
    lag, etc...
●   Channel failover possible, better
    with MySQL 5.6
●   Mixed replication also useful when
    you want an asynchronous slave
    (such as time-delayed, or filtered).

[Diagram: nine DBMS nodes in three groups of three]


How network partitioning is handled
               aka
   How split brain is prevented
Preventing split brain



●   If part of the cluster can't be
    reached, it means either
    –   The node(s) have crashed
    –   Nodes are fine and it's a network
        connectivity issue
        = network partition
    –   Network partition may lead to split
        brain if both parts continue to commit
        transactions.
    –   A node cannot know which of the two
        has happened
●   Split brain leads to 2 diverging
    clusters, 2 diverged datasets
●   Clustering SW must ensure there
    is only 1 cluster partition active at
    all times

[Diagram: two LBs in front of three DBMS nodes connected by Galera]
Quorum



●   Galera uses quorum
    based failure handling:
    –   When cluster partitioning is
        detected, the majority
        partition "has quorum" and
        can continue
    –   A minority partition cannot
        commit transactions, but
        will attempt to re-connect to
        primary partition
●   A load balancer will notice
    the errors and remove
    failed node from its pool

[Diagram: two LBs in front of three DBMS nodes connected by Galera]
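
One way a load balancer could decide which nodes stay in its pool is to look at each node's wsrep status variables. The sketch below assumes the check is based on `wsrep_cluster_status` and `wsrep_ready` (as reported by `SHOW STATUS LIKE 'wsrep_%'`); which variables to check is an assumption of this example, the slide only says that the load balancer notices errors. The function names and host names are hypothetical.

```python
def node_is_usable(status):
    """Decide whether a node should stay in the load balancer pool.

    `status` is a dict of wsrep status variables for one node, for example
    built from the rows returned by SHOW STATUS LIKE 'wsrep_%'.
    A node in a minority partition reports a non-Primary cluster status.
    """
    return (
        status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_ready") == "ON"
    )


def healthy_pool(statuses):
    """Return the hosts that should keep receiving traffic."""
    return [host for host, status in statuses.items() if node_is_usable(status)]


# Example: db3 has been cut off from the other two nodes.
statuses = {
    "db1": {"wsrep_cluster_status": "Primary", "wsrep_ready": "ON"},
    "db2": {"wsrep_cluster_status": "Primary", "wsrep_ready": "ON"},
    "db3": {"wsrep_cluster_status": "non-Primary", "wsrep_ready": "OFF"},
}
pool = healthy_pool(statuses)
print(pool)
```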
What is majority?



●   50% is not majority
●   Any failure in 2 node cluster
    = both nodes must stop
●   4 node cluster split in half =
    both halves must stop
●   pc.ignore_sb exists but don't
    use it
●   You can
    manually/automatically
    enable one half by setting
    wsrep_cluster_address
●   Use 3 or more nodes

[Diagram: a 2 node cluster and a 4 node cluster]
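
The majority rule above comes down to one strict inequality. The following is a simplified sketch that treats every node as one equal vote; `has_quorum` is a name made up for this example and does not model everything Galera takes into account.

```python
def has_quorum(partition_size, cluster_size):
    """True if a partition holds a strict majority of the cluster.

    Exactly 50% is not a majority, so the comparison is strict.
    """
    return 2 * partition_size > cluster_size


# The cases from the slide:
two_node_survivor = has_quorum(1, 2)     # one node left of a 2 node cluster
four_node_half = has_quorum(2, 4)        # 4 node cluster split in half
three_node_majority = has_quorum(2, 3)   # 3 node cluster loses one node
three_node_minority = has_quorum(1, 3)   # the node that got cut off

print(two_node_survivor, four_node_half, three_node_majority, three_node_minority)
```

With 3 nodes, a single failure always leaves a partition of 2 that can continue, which is why 3 is the practical minimum.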
Failures in WAN replication
Multiple Data Centers



●   A single node can fail
●   A single node can have
    a network connectivity
    issue
●   The whole data center
    can have a connectivity
    issue
●   A whole data center can
    be destroyed

[Diagram: four failure scenarios, each drawn on six DBMS nodes]


Pop Quiz



●   Q: What does the 50% rule
    mean in each of these
    cases?

[Diagram: four failure scenarios, each drawn on six DBMS nodes]

●   A: Better have 3 data
    centers too.

WAN replication with uneven node distribution



●   Q: What does the 50% rule mean when you have an uneven
    number of nodes per data center?

[Diagram: five DBMS nodes in three groups of 3, 1 and 1]

●   A: Better distribute nodes evenly.
       (We will address this in a future release.)
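
To make the quiz answers concrete, the sketch below checks, for a given layout, whether the rest of the cluster keeps a strict majority when one whole data center becomes unreachable. It assumes every node counts as one equal vote; the function name and the data center labels are hypothetical, and the 3 + 1 + 1 layout is read off the diagram.

```python
def survives_loss_of(dc_sizes):
    """For each data center, can the remaining nodes keep quorum without it?

    `dc_sizes` maps a data center name to its number of nodes. The remaining
    nodes keep quorum only if they are strictly more than half of all nodes.
    """
    total = sum(dc_sizes.values())
    return {dc: 2 * (total - size) > total for dc, size in dc_sizes.items()}


two_dcs = survives_loss_of({"dc1": 3, "dc2": 3})
three_dcs = survives_loss_of({"dc1": 2, "dc2": 2, "dc3": 2})
uneven = survives_loss_of({"dc1": 3, "dc2": 1, "dc3": 1})

print(two_dcs)    # losing either of two equal data centers stops the cluster
print(three_dcs)  # with three equal data centers, any one can be lost
print(uneven)     # losing the big data center stops the cluster
```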
Benchmarks!
Baseline: Single node MySQL (sysbench oltp, in-memory)




●   Red, Blue: Constrained by InnoDB group commit bug
    –   Fixed in Percona Server 5.5, MariaDB 5.3 and MySQL 5.6
●   Brown: InnoDB syncs, binlog doesn't
●   Green: No InnoDB syncing either
●   Yellow: No InnoDB syncs, Galera wsrep module enabled
                                  http://openlife.cc/blogs/2011/august/running-sysbench-tests-against-galera-cluster
3 node Galera cluster (sysbench oltp, in memory)




                                  http://openlife.cc/blogs/2011/august/running-sysbench-tests-against-galera-cluster
Comments on 3 node cluster (sysbench oltp, in memory)




●   Yellow, Red are equal
    -> No overhead or bottleneck from Galera replication!
●   Green, Brown = writing to 2 and 3 masters
    -> scale-out for read-write workload!
     –   Top shows 700% CPU util (8 cores)
                                  http://openlife.cc/blogs/2011/august/running-sysbench-tests-against-galera-cluster
Sysbench disk bound (20GB data / 6GB buffer), tps



●   EC2 w local disk
    –   Note: pretty poor I/O here
●   Blue vs red: turning off
    innodb_flush_log_at_trx_commit
    gives 66% improvement
●   Scale-out factors:
    2N = 0.5 x 1N
    4N = 0.5 x 2N
●   5th node was EC2
    weakness. Later test
    scaled a little more up to
    8 nodes
                                             http://codership.com/content/scaling-out-oltp-load-amazon-ec2-revisited
Sysbench disk bound (20GB data / 6GB buffer), latency



●   As before
●   Not syncing InnoDB
    decreases latency
●   Scale-out decreases
    latency
●   Galera does not add
    latency overhead




                                             http://codership.com/content/scaling-out-oltp-load-amazon-ec2-revisited
Multi-threaded slave. Out-of-order slave commits.



Multi-threaded slave
    –   For memory-bound workload, multi-threaded slave provides no benefit
        (there's nothing to fix)
    –   For disk-bound workload, multi-threaded slave helps. 2x better or more.
Out-of-order commits
    –   By default slave applies transactions in parallel, but preserves commit order
    –   OOOC is possible: wsrep_provider_options="replicator.commit_order=1"
    –   Not safe for most workloads, ask your developers
    –   Seems to help a little, in some cases, but not if you're really I/O bound
    –   Default multi-threaded setting is so good, we can forget this option




Drupal on Galera: baseline w single server



●   Drupal, Apache, PHP,
    MySQL 5.1
●   JMeter
    –   3 types of users: poster,
        commenter, reader
    –   Gaussian (15, 7) think time
●   Large EC2 instance
●   Ideal scalability: linear until
    tipping point at 140-180 users
    –   Constrained by Apache/PHP
        CPU utilization
    –   Could scale out by adding more
        Apache in front of single MySQL

                                  http://codership.com/content/scaling-drupal-stack-galera-part-2-mystery-failed-login
Drupal on Galera: Scale-out with 1-4 Galera nodes (tps)



●   Drupal, Apache, PHP,
    MySQL 5.1 w Galera
●   1-4 identical nodes
    –   Whole stack cluster
    –   MySQL connection to
        localhost
●   Multiply nr of users
    –   180, 360, 540, 720
●   3 nodes = linear scalability,
    4 nodes still near-linear
●   Minimal latency overhead



                                  http://codership.com/content/scaling-drupal-stack-galera-part-2-mystery-failed-login
Drupal on Galera: Scale-out with 1-4 Galera nodes (latency)



●   Like before
●   Constant nr of users
    –   180, 180, 180, 180
●   Scaling from 1 to 2
    –   drastically reduces latency
    –   tps back to linear
        scalability
●   Scaling to 3 and 4
    –   No more tps as there was
        no bottleneck.
    –   Slightly better latency
    –   Note: No overhead from
        additional nodes!
                                  http://codership.com/content/scaling-drupal-stack-galera-part-2-mystery-failed-login
Benchmark Setup



1. Centralized Access:

[Diagram: a client in Virginia, US and a client in Ireland, EU both use a direct client connection to a single m1.large server in Virginia, US; the two sites are ~ 6000 km, ~ 90 ms RTT apart]
Benchmark Setup



2. Global Synchronization:

[Diagram: a client and an m1.large server in Virginia, US, and a client and an m1.large server in Ireland, EU; synchronous replication between the two servers over ~ 6000 km, ~ 90 ms RTT]
Moderate Load / Average Latencies (ms)


client     centralized   replicated   change
us-east    28.03         119.80       ~4.3x
eu-west    1953.89       122.92       ~0.06x




Moderate Load / 95% Latencies (ms)


client     centralized   replicated   change
us-east    59.32         150.66       ~2.5x
eu-west    2071.49       157.11       ~0.08x




Heavy Load / 95% Latencies (ms)


client     centralized   replicated   change
us-east    387.16        416.24       ~1.07x
eu-west    2209.19       421.76       ~0.19x
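
The "change" column in the three latency tables is the replicated latency divided by the centralized latency. The short sketch below recomputes it from the numbers shown; it approximately reproduces the slide values, with small differences that come from rounding. The variable and function names are made up for this example.

```python
# (centralized_ms, replicated_ms) per client, copied from the three latency tables.
LATENCIES_MS = {
    "moderate load, average": {"us-east": (28.03, 119.80), "eu-west": (1953.89, 122.92)},
    "moderate load, 95%": {"us-east": (59.32, 150.66), "eu-west": (2071.49, 157.11)},
    "heavy load, 95%": {"us-east": (387.16, 416.24), "eu-west": (2209.19, 421.76)},
}


def change(centralized_ms, replicated_ms):
    """Replicated latency as a multiple of centralized latency."""
    return replicated_ms / centralized_ms


def change_table(latencies):
    """Compute the change factor for every scenario and client."""
    return {
        scenario: {client: change(*pair) for client, pair in clients.items()}
        for scenario, clients in latencies.items()
    }


CHANGES = change_table(LATENCIES_MS)
for scenario, clients in CHANGES.items():
    for client, factor in clients.items():
        print(f"{scenario:24} {client:8} {factor:.2f}x")
```

A factor above 1 means the client got slower with replication (us-east), a factor below 1 means it got faster (eu-west).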




Heavy Load / Transaction Rate


client     centralized   replicated   overall gain
us-east    300.10        236.89
eu-west    31.07         241.49
                                      ~1.5x
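
As a rough cross-check of the overall gain, the sketch below assumes it is the combined transaction rate of both clients in the replicated setup divided by the combined rate in the centralized setup. That definition is an assumption of this example; with the numbers shown it gives roughly 1.44, close to the ~1.5x on the slide.

```python
# Transaction rates per client, copied from the table above.
centralized = {"us-east": 300.10, "eu-west": 31.07}
replicated = {"us-east": 236.89, "eu-west": 241.49}


def overall_gain(before, after):
    """Combined rate after the change divided by combined rate before it."""
    return sum(after.values()) / sum(before.values())


gain = overall_gain(centralized, replicated)
print(f"overall gain: {gain:.2f}x")
```

The us-east client loses some throughput, but the eu-west client gains far more, so the total goes up.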




Why is this possible?


Client↔Server synchronization is much heavier
than Master↔Slave
   → it pays to move server closer to client:

[Diagram: Client ↔ Master carries reads, writes, etc.; Master → Slave carries only writes]
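
A back-of-the-envelope model shows why this pays off. It assumes a transaction needs many client↔server round trips but only one replication round trip at commit, and it ignores server processing time. The 90 ms RTT is taken from the benchmark setup; the 20 round trips per transaction and the 1 ms local RTT are illustrative assumptions, not measurements.

```python
def centralized_latency_ms(round_trips, client_server_rtt_ms):
    """Every statement crosses the WAN between a remote client and the server."""
    return round_trips * client_server_rtt_ms


def replicated_latency_ms(round_trips, local_rtt_ms, replication_rtt_ms):
    """Statements stay local; only the commit waits for one WAN round trip."""
    return round_trips * local_rtt_ms + replication_rtt_ms


ROUND_TRIPS = 20   # assumption: statements per transaction
WAN_RTT_MS = 90    # from the benchmark setup (Virginia to Ireland)
LOCAL_RTT_MS = 1   # assumption: client and server in the same data center

remote_centralized = centralized_latency_ms(ROUND_TRIPS, WAN_RTT_MS)
remote_replicated = replicated_latency_ms(ROUND_TRIPS, LOCAL_RTT_MS, WAN_RTT_MS)
print(remote_centralized, remote_replicated)
```

Under these assumptions the remote client goes from 1800 ms to 110 ms per transaction, the same order of magnitude as the eu-west numbers in the latency tables.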
Galera and NDB shootout: sysbench "out of the box"



●   Galera is 4x better
Ok, so what does this
really mean?
●   That Galera is better...
    –   For this workload
    –   With default settings
        (Severalnines)
    –   Pretty user friendly and
        general purpose
●   NDB
    –   Excels at key-value and
        heavy-write workloads
    –   Would benefit here from
        PARTITION BY RANGE
                                                           http://codership.com/content/whats-difference-kenneth
Conclusions


Many MySQL replication idioms go away:
synchronous, multi-master, no slave lag, no
binlog positions, automatic node
provisioning.

Many LB and VIP architectures possible,
JDBC/PHP load balancer recommended.

Also for WAN replication. Adds 100-300 ms
to commit.

Quorum based: Majority partition wins.
Minimum 3 nodes. Minimum 3 data centers.

Negligible overhead compared to single node
case (when properly configured)

Better than single node:
- No InnoDB syncs needed
- Can read & write to all nodes

Similar results for both memory bound and disk
bound workloads.

Whole stack load balancing: no performance
penalty from gratuitously adding Galera nodes

For a global service, network latency will always
show up somewhere. Synchronous Galera
replication is often an excellent choice!

Galera is good where InnoDB is good: general
purpose, easy to use HA cluster




Questions?




Thank you for listening!
 Happy Clustering :-)

More Related Content

PDF
9 DevOps Tips for Going in Production with Galera Cluster for MySQL - Slides
PDF
MariaDB Galera Cluster - Simple, Transparent, Highly Available
PDF
Galera cluster for MySQL - Introduction Slides
PPTX
Maria DB Galera Cluster for High Availability
PDF
Introduction to Galera
PDF
Failover or not to failover
PDF
High Availability with Galera Cluster - SkySQL Road Show 2013 in Berlin
DOCX
Master master vs master-slave database
9 DevOps Tips for Going in Production with Galera Cluster for MySQL - Slides
MariaDB Galera Cluster - Simple, Transparent, Highly Available
Galera cluster for MySQL - Introduction Slides
Maria DB Galera Cluster for High Availability
Introduction to Galera
Failover or not to failover
High Availability with Galera Cluster - SkySQL Road Show 2013 in Berlin
Master master vs master-slave database

What's hot (20)

PDF
Webinar slides: 9 DevOps Tips for Going in Production with Galera Cluster for...
PDF
Highly Available MySQL/PHP Applications with mysqlnd
PPTX
MySQL Multi Master Replication
PDF
Webinar slides: Introducing Galera 3.0 - Now supporting MySQL 5.6
PDF
Choosing a MySQL High Availability solution - Percona Live UK 2011
PDF
Best practices for MySQL High Availability
PPT
Galera cluster - SkySQL Paris Meetup 17.12.2013
ODP
MySQL HA with PaceMaker
PDF
Percona XtraDB Cluster ( Ensure high Availability )
PDF
Plny12 galera-cluster-best-practices
PPTX
MariaDB Galera Cluster
PDF
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
PDF
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #7: ClusterControl
PDF
Scaling with sync_replication using Galera and EC2
PDF
Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters - Webin...
PDF
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
PPTX
Percona XtraDB Cluster SF Meetup
PDF
Percona XtraDB Cluster
PDF
MySQL Sandbox 3
PDF
Introduction to Galera Cluster
Webinar slides: 9 DevOps Tips for Going in Production with Galera Cluster for...
Highly Available MySQL/PHP Applications with mysqlnd
MySQL Multi Master Replication
Webinar slides: Introducing Galera 3.0 - Now supporting MySQL 5.6
Choosing a MySQL High Availability solution - Percona Live UK 2011
Best practices for MySQL High Availability
Galera cluster - SkySQL Paris Meetup 17.12.2013
MySQL HA with PaceMaker
Percona XtraDB Cluster ( Ensure high Availability )
Plny12 galera-cluster-best-practices
MariaDB Galera Cluster
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #7: ClusterControl
Scaling with sync_replication using Galera and EC2
Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters - Webin...
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster SF Meetup
Percona XtraDB Cluster
MySQL Sandbox 3
Introduction to Galera Cluster
Ad

Similar to Using and Benchmarking Galera in different architectures (PLUK 2012) (20)

PPTX
Sql server 2012 ha and dr sql saturday dc
PPTX
Sql server 2012 ha and dr sql saturday tampa
PPTX
Sql server 2012 ha and dr sql saturday boston
PPTX
Sql Server 2012 HA and DR -- SQL Saturday Richmond
PPTX
Scalability
PDF
Implementing the Future of PostgreSQL Clustering with Tungsten
PPTX
High-Availability of YARN (MRv2)
PDF
Drupal Con My Sql Ha 2008 08 29
KEY
Cloud Computing & Scaling Web Apps
PDF
Running your Java EE applications in the Cloud
PDF
NoSQL Database
PPTX
Sql Server High Availability & DR Technologies
PPTX
Sql 2012 always on
PDF
MySQL Cluster Schema management (2014)
PDF
Cloudcon East Presentation
PDF
Cloudcon East Presentation
PPTX
001 hbase introduction
PPTX
Handling Massive Writes
PDF
Percona Live 2014 - Scaling MySQL in AWS
PDF
MySQL High Availability Solutions
Sql server 2012 ha and dr sql saturday dc
Sql server 2012 ha and dr sql saturday tampa
Sql server 2012 ha and dr sql saturday boston
Sql Server 2012 HA and DR -- SQL Saturday Richmond
Scalability
Implementing the Future of PostgreSQL Clustering with Tungsten
High-Availability of YARN (MRv2)
Drupal Con My Sql Ha 2008 08 29
Cloud Computing & Scaling Web Apps
Running your Java EE applications in the Cloud
NoSQL Database
Sql Server High Availability & DR Technologies
Sql 2012 always on
MySQL Cluster Schema management (2014)
Cloudcon East Presentation
Cloudcon East Presentation
001 hbase introduction
Handling Massive Writes
Percona Live 2014 - Scaling MySQL in AWS
MySQL High Availability Solutions
Ad

More from Henrik Ingo (15)

PDF
ICPE25 Henrik Ingo Optimizing Hunter Nyrkiö slides (1).pdf
PDF
SPEC June 2025 - Using e-divisive means change detection in continuous benchm...
PDF
Introduction to new high performance storage engines in mongodb 3.0
PDF
Meteor - The next generation software stack
PDF
MongoDB for Oracle Experts - OUGF Harmony 2014
PDF
Building Your First MongoDB App
PDF
Analytics with MongoDB Aggregation Framework and Hadoop Connector
PDF
Whats new in mongoDB 2.4 at Copenhagen user group 2013-06-19
PDF
Spatial functions in MySQL 5.6, MariaDB 5.5, PostGIS 2.0 and others
PDF
Introducing Xtrabackup Manager
PDF
Froscon 2012 how big corporations play the open source game
PDF
Databases and the Cloud
PDF
Fixed in drizzle
PDF
Froscon2011: How i learned to use sql and then learned not to use it
PDF
How to grow your open source project 10x and revenues 5x OSCON2011
ICPE25 Henrik Ingo Optimizing Hunter Nyrkiö slides (1).pdf
SPEC June 2025 - Using e-divisive means change detection in continuous benchm...
Introduction to new high performance storage engines in mongodb 3.0
Meteor - The next generation software stack
MongoDB for Oracle Experts - OUGF Harmony 2014
Building Your First MongoDB App
Analytics with MongoDB Aggregation Framework and Hadoop Connector
Whats new in mongoDB 2.4 at Copenhagen user group 2013-06-19
Spatial functions in MySQL 5.6, MariaDB 5.5, PostGIS 2.0 and others
Introducing Xtrabackup Manager
Froscon 2012 how big corporations play the open source game
Databases and the Cloud
Fixed in drizzle
Froscon2011: How i learned to use sql and then learned not to use it
How to grow your open source project 10x and revenues 5x OSCON2011

Recently uploaded (20)

PDF
Mathura Sridharan's Appointment as Ohio Solicitor General Sparks Racist Backl...
PPTX
Richard Smith and First Zurich trust scam.pptx
PDF
Tran Quoc Bao led Top 3 Social Influencers Transforming Healthcare & Life Sci...
PDF
01082025_First India Newspaper Jaipur.pdf
PDF
ACFrOgB7qGIQ8bhzZH1Pzz4DLzOiKY24QMUch6D2DeHr9Wmm6Me1clS-AgTR6FhMOpbl2iwGlABTp...
DOCX
Meme Coins news - memecoinist website platform
PPTX
200 years old story of a paradise on earth
PPTX
OBG. ABNORAMLTIES OF THE PUERPERIUM, BSC
PDF
Supereme Court history functions and reach.pdf
DOCX
Breaking Now – Latest Live News Updates from GTV News HD
PDF
Executive an important link between the legislative and people
PDF
hbs_mckinsey_global_energy_perspective_2021.pdf
PDF
05082025_First India Newspaper Jaipur.pdf
PDF
JUDICIAL_ACTIVISM_CRITICAL_ANALYSIS in india.pdf
PPTX
opher bryers alert -How Opher Bryer’s Impro.ai Became the Center of Israel’s ...
PDF
Theories of federalism showcasing india .pdf
PDF
Role of federalism in the indian society
PPTX
PPT on SardarPatel and Popular Media.pptx
PDF
03082025_First India Newspaper Jaipur.pdf
PPTX
Final The-End-of-the-Cold-War-and-the-Emergence-of-a-Unipolar-World.pptx
Mathura Sridharan's Appointment as Ohio Solicitor General Sparks Racist Backl...
Richard Smith and First Zurich trust scam.pptx
Tran Quoc Bao led Top 3 Social Influencers Transforming Healthcare & Life Sci...
01082025_First India Newspaper Jaipur.pdf
ACFrOgB7qGIQ8bhzZH1Pzz4DLzOiKY24QMUch6D2DeHr9Wmm6Me1clS-AgTR6FhMOpbl2iwGlABTp...
Meme Coins news - memecoinist website platform
200 years old story of a paradise on earth
OBG. ABNORAMLTIES OF THE PUERPERIUM, BSC
Supereme Court history functions and reach.pdf
Breaking Now – Latest Live News Updates from GTV News HD
Executive an important link between the legislative and people
hbs_mckinsey_global_energy_perspective_2021.pdf
05082025_First India Newspaper Jaipur.pdf
JUDICIAL_ACTIVISM_CRITICAL_ANALYSIS in india.pdf
opher bryers alert -How Opher Bryer’s Impro.ai Became the Center of Israel’s ...
Theories of federalism showcasing india .pdf
Role of federalism in the indian society
PPT on SardarPatel and Popular Media.pptx
03082025_First India Newspaper Jaipur.pdf
Final The-End-of-the-Cold-War-and-the-Emergence-of-a-Unipolar-World.pptx

Using and Benchmarking Galera in different architectures (PLUK 2012)

  • 1. Using and Benchmarking Galera in Different Architectures Henrik Ingo Percona Live UK London, 2012-12-04 Please share and reuse this presentation licensed under Creative Commonse Attribution license http://creativecommons.org/licenses/by/3.0/
  • 2. Agenda MySQL Galera How does it perform? * Synchronous multi-master * In memory workload clustering, what does it mean? * Scale-out for writes - how is it * Load balancing and other possible? options * Disk bound workload * WAN replication * WAN replication * How network partitioning is * Parallel slave threads handled * Allowing slave to replicate * How network partitioning is (commit) out-of-order handled in WAN replication * NDB shootout . Using and Benchmarking Galera in Different Architectures 2012-04-12 2
  • 3. About Codership Oy ● Participated in 3 MySQL cluster developments since 2003 ● Started Galera work 2007 ● Galera is free, open source. Commercial support from Codership and partners. ● Percona XtraDB Cluster and MariaDB Galera Cluster launched 2012
  • 5. Galera in a nutshell ● True multi-master: Read & write to any node ● Synchronous replication ● No slave lag, integrity issues ● No master-slave failovers, no VIP needed ● Multi-threaded slave ● Automatic node provisioning
  • 6. Load balancing: What do you mean no failover??? ● Use a load balancer ● Application sees just one IP ● Write to any available node, round-robin ● If node fails, just write to another one ● What if load balancer fails? -> Turtles all the way down
  • 7. Protip: JDBC, PHP come with built-in load balancing! ● No Single Point of Failure ● One less layer of network components ● Is aware of MySQL transaction states and errors ● Sysbench does this internally too (except it doesn't really failover)
  • 8. Load balancer per each app node ● Also no Single Point of Failure ● LB is an additional layer, but localhost = pretty fast ● Need to manage more load balancers ● Good for languages other than Java, PHP
  • 9. Whole stack cluster (no load balancing) ● One DB node per app server, usually same host ● LB on HTTP or DNS level ● Each app server connects to localhost ● Simple ● Usually app server CPU is bottleneck – Bad: This is a somewhat wasteful architecture, especially if DB is large – Good: Replication overhead with Galera is negligible
  • 10. You can still do VIP based failovers. But why? [Diagram: VIP failover managed by a clustering framework in front of Galera DBMS nodes]
  • 12. WAN replication ● Works fine ● Use higher timeouts ● No impact on reads ● No impact within a transaction ● Adds 100-300 ms to commit latency (see benchmarks)
  • 13. WAN with MySQL asynchronous replication ● You can mix Galera replication and MySQL replication ● Good option on slow WAN link (China, Australia) ● You'll possibly need more nodes than in pure Galera cluster ● Remember to watch out for slave lag, etc... ● Channel failover possible, better with 5.6 ● Mixed replication also useful when you want an asynchronous slave (such as time-delayed, or filtered).
  • 14. How network partitioning is handled aka How split brain is prevented
  • 15. Preventing split brain ● If part of the cluster can't be reached, it means either – The node(s) has crashed – Nodes are fine and it's a network connectivity issue = network partition – Network partition may lead to split brain if both parts continue to commit transactions. – A node cannot know which of the two has happened ● Split brain leads to 2 diverging clusters, 2 diverged datasets ● Clustering SW must ensure there is only 1 cluster partition active at all times
  • 16. Quorum ● Galera uses quorum based failure handling: – When cluster partitioning is detected, the majority partition "has quorum" and can continue – A minority partition cannot commit transactions, but will attempt to re-connect to primary partition ● A load balancer will notice the errors and remove failed node from its pool
  • 17. What is majority? ● 50% is not majority ● Any failure in 2 node cluster = both nodes must stop ● 4 node cluster split in half = both halves must stop ● pc.ignore_sb exists but don't use it ● You can manually/automatically enable one half by setting wsrep_cluster_address ● Use 3 or more nodes
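The majority rule from the quorum slides can be sketched in a few lines of Python. This is an illustration only: `has_quorum` is a hypothetical helper, not a Galera API, and it assumes every node carries equal weight.

```python
def has_quorum(partition_size: int, cluster_size: int) -> bool:
    """Return True if a partition holds a strict majority of the cluster.

    Assumption: all nodes have equal weight. Exactly 50% is not a majority.
    """
    if not 0 <= partition_size <= cluster_size:
        raise ValueError("partition_size must be between 0 and cluster_size")
    return 2 * partition_size > cluster_size


if __name__ == "__main__":
    # (surviving partition size, cluster size) for the cases on the slide,
    # plus two clusters with an odd number of nodes.
    cases = [(1, 2), (2, 4), (2, 3), (3, 5)]
    for part, total in cases:
        print(f"{part} of {total} nodes: quorum={has_quorum(part, total)}")
```

Under this rule the 2-node and evenly split 4-node cases never have a majority side, which is why the slide recommends 3 or more nodes.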
  • 18. Failures in WAN replication
  • 19. Multiple Data Centers ● A single node can fail ● A single node can have network connectivity issue ● The whole data center can have connectivity issue ● A whole data center can be destroyed
  • 20. Pop Quiz ● Q: What does 50% rule mean in each of these cases? [Diagram: DBMS nodes distributed across data centers]
  • 21. Pop Quiz ● Q: What does 50% rule mean in each of these cases? [Diagram: DBMS nodes distributed across data centers] ● A: Better have 3 data centers too.
  • 22. WAN replication with uneven node distribution ● Q: What does 50% rule mean when you have uneven amount of nodes per data center? [Diagram: five DBMS nodes split unevenly between data centers]
  • 23. WAN replication with uneven node distribution ● Q: What does 50% rule mean when you have uneven amount of nodes per data center? [Diagram: five DBMS nodes split unevenly between data centers] ● A: Better distribute nodes evenly. (We will address this in future release.)
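The pop quiz answers can be checked with a small Python sketch that applies the 50% rule to whole data centers. The layouts below are hypothetical examples, not the ones drawn on the slides, and the helper names are made up for illustration. Equal node weights are assumed.

```python
def survives_loss_of(nodes_per_dc: dict[str, int], lost_dc: str) -> bool:
    """True if the remaining data centers still hold a strict majority of nodes."""
    total = sum(nodes_per_dc.values())
    remaining = total - nodes_per_dc[lost_dc]
    return 2 * remaining > total


def tolerates_any_dc_loss(nodes_per_dc: dict[str, int]) -> bool:
    """True if the cluster keeps quorum no matter which single data center is lost."""
    return all(survives_loss_of(nodes_per_dc, dc) for dc in nodes_per_dc)


if __name__ == "__main__":
    # Hypothetical layouts (data center name -> number of nodes).
    layouts = {
        "two DCs, 3+3": {"dc1": 3, "dc2": 3},
        "three DCs, 2+2+2": {"dc1": 2, "dc2": 2, "dc3": 2},
        "three DCs, uneven 3+1+1": {"dc1": 3, "dc2": 1, "dc3": 1},
    }
    for name, layout in layouts.items():
        print(f"{name}: tolerates any DC loss = {tolerates_any_dc_loss(layout)}")
```

In this sketch only the evenly distributed three data center layout keeps quorum after losing any one site, which matches the two answers given on the slides.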
  • 25. Baseline: Single node MySQL (sysbench oltp, in-memory) ● Red, Blue: Constrained by InnoDB group commit bug – Fixed in Percona Server 5.5, MariaDB 5.3 and MySQL 5.6 ● Brown: InnoDB syncs, binlog doesn't ● Green: No InnoDB syncing either ● Yellow: No InnoDB syncs, Galera wsrep module enabled http://openlife.cc/blogs/2011/august/running-sysbench-tests-against-galera-cluster
  • 26. 3 node Galera cluster (sysbench oltp, in memory) http://openlife.cc/blogs/2011/august/running-sysbench-tests-against-galera-cluster
  • 27. Comments on 3 node cluster (sysbench oltp, in memory) ● Yellow, Red are equal -> No overhead or bottleneck from Galera replication! ● Green, Brown = writing to 2 and 3 masters -> scale-out for read-write workload! – Top shows 700% CPU util (8 cores) http://openlife.cc/blogs/2011/august/running-sysbench-tests-against-galera-cluster
  • 28. Sysbench disk bound (20GB data / 6GB buffer), tps ● EC2 w local disk – Note: pretty poor I/O here ● Blue vs red: turning off innodb_flush_log_at_trx_commit gives 66% improvement ● Scale-out factors: 2N = 0.5 x 1N; 4N = 0.5 x 2N ● 5th node was EC2 weakness. Later test scaled a little more, up to 8 nodes http://codership.com/content/scaling-out-oltp-load-amazon-ec2-revisited
  • 29. Sysbench disk bound (20GB data / 6GB buffer), latency ● As before ● Not syncing InnoDB decreases latency ● Scale-out decreases latency ● Galera does not add latency overhead http://codership.com/content/scaling-out-oltp-load-amazon-ec2-revisited
  • 30. Multi-threaded slave. Out-of-order slave commits. ● Multi-threaded slave – For memory-bound workload, multi-threaded slave provides no benefit (there's nothing to fix) – For disk-bound, multi-threaded slave helps. 2x better or more. ● Out-of-order commits – By default slave applies transactions in parallel, but preserves commit order – OOOC is possible: wsrep_provider_options="replicator.commit_order=1" – Not safe for most workloads, ask your developers – Seems to help a little, in some cases, but if you're really I/O bound then not – Default multi-threaded setting is so good, we can forget this option
  • 31. Drupal on Galera: baseline w single server ● Drupal, Apache, PHP, MySQL 5.1 ● JMeter – 3 types of users: poster, commenter, reader – Gaussian (15, 7) think time ● Large EC2 instance ● Ideal scalability: linear until tipping point at 140-180 users – Constrained by Apache/PHP CPU utilization – Could scale out by adding more Apache in front of single MySQL http://codership.com/content/scaling-drupal-stack-galera-part-2-mystery-failed-login
  • 32. Drupal on Galera: Scale-out with 1-4 Galera nodes (tps) ● Drupal, Apache, PHP, MySQL 5.1 w Galera ● 1-4 identical nodes – Whole stack cluster – MySQL connection to localhost ● Multiply nr of users – 180, 360, 540, 720 ● 3 nodes = linear scalability, 4 nodes still near-linear ● Minimal latency overhead http://codership.com/content/scaling-drupal-stack-galera-part-2-mystery-failed-login
  • 33. Drupal on Galera: Scale-out with 1-4 Galera nodes (latency) ● Like before ● Constant nr of users – 180, 180, 180, 180 ● Scaling from 1 to 2 – drastically reduces latency – tps back to linear scalability ● Scaling to 3 and 4 – No more tps as there was no bottleneck. – Slightly better latency – Note: No overhead from additional nodes! http://codership.com/content/scaling-drupal-stack-galera-part-2-mystery-failed-login
  • 34. Benchmark Setup 1. Centralized Access: [Diagram: clients in Virginia, US and Ireland, EU; direct client connection to a single m1.large server; ~6000 km, ~90 ms RTT between the sites]
  • 35. Benchmark Setup 2. Global Synchronization: [Diagram: a client and an m1.large server in each of Virginia, US and Ireland, EU; synchronous replication between the two servers; ~6000 km, ~90 ms RTT between the sites]
  • 36. Moderate Load / Average Latencies (ms)
        client     centralized   replicated   change
        us-east          28.03       119.80   ~4.3x
        eu-west        1953.89       122.92   ~0.06x
  • 37. Moderate Load / 95% Latencies (ms)
        client     centralized   replicated   change
        us-east          59.32       150.66   ~2.5x
        eu-west        2071.49       157.11   ~0.08x
  • 38. Heavy Load / 95% Latencies (ms)
        client     centralized   replicated   change
        us-east         387.16       416.24   ~1.07x
        eu-west        2209.19       421.76   ~0.19x
  • 39. Heavy Load / Transaction Rate
        client     centralized   replicated
        us-east         300.10       236.89
        eu-west          31.07       241.49
        overall gain: ~1.5x
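The "change" and "overall gain" columns in the four benchmark tables can be recomputed from the raw figures. The Python sketch below does this. It assumes that "change" means replicated divided by centralized, and that "overall gain" compares the summed transaction rates of both regions. Neither definition is stated on the slides.

```python
# (centralized, replicated) values copied from the slides.
LATENCIES_MS = {
    "moderate load, average": {"us-east": (28.03, 119.80), "eu-west": (1953.89, 122.92)},
    "moderate load, 95%": {"us-east": (59.32, 150.66), "eu-west": (2071.49, 157.11)},
    "heavy load, 95%": {"us-east": (387.16, 416.24), "eu-west": (2209.19, 421.76)},
}
TRANSACTION_RATE = {"us-east": (300.10, 236.89), "eu-west": (31.07, 241.49)}


def change_ratio(centralized: float, replicated: float) -> float:
    """Assumed meaning of the 'change' column: replicated / centralized."""
    return replicated / centralized


def overall_gain(rates: dict[str, tuple[float, float]]) -> float:
    """Assumed meaning of 'overall gain': total replicated rate / total centralized rate."""
    total_centralized = sum(c for c, _ in rates.values())
    total_replicated = sum(r for _, r in rates.values())
    return total_replicated / total_centralized


if __name__ == "__main__":
    for table, rows in LATENCIES_MS.items():
        for client, (centralized, replicated) in rows.items():
            print(f"{table:24s} {client}: {change_ratio(centralized, replicated):.2f}x")
    print(f"heavy load, overall transaction rate gain: {overall_gain(TRANSACTION_RATE):.2f}x")
```

Under these assumptions the overall throughput ratio comes out at roughly 1.44, which the slide reports as ~1.5x.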
  • 40. Why is this possible? Client↔Server synchronization is much heavier than Master↔Slave. [Diagram: Client ↔ Master: reads, writes, etc.; Master → Slave: only writes]
  • 41. Why is this possible? Client↔Server synchronization is much heavier than Master↔Slave → it pays to move server closer to client. [Diagram: Client ↔ Master: reads, writes, etc.; Master → Slave: only writes]
  • 42. Galera and NDB shootout: sysbench "out of the box" ● Galera is 4x better. Ok, so what does this really mean? ● That Galera is better... – For this workload – With default settings (Severalnines) – Pretty user friendly and general purpose ● NDB – Excels at key-value and heavy-write workloads – Would benefit here from PARTITION BY RANGE http://codership.com/content/whats-difference-kenneth
  • 43. Conclusions ● Many MySQL replication idioms go away: synchronous, multi-master, no slave lag, no binlog positions, automatic node provisioning. ● Many LB and VIP architectures possible, JDBC/PHP load balancer recommended. ● Also for WAN replication. Adds 100-300 ms to commit. ● Quorum based: Majority partition wins. Minimum 3 nodes. Minimum 3 data centers. ● Negligible overhead compared to single node case (when properly configured). ● Better than single node: – No InnoDB syncs needed – Can read & write to all nodes ● Similar results for both memory bound and disk bound workloads. ● Whole stack load balancing: no performance penalty from gratuitously adding Galera nodes. ● For a global service, network latency will always show up somewhere. Synchronous Galera replication is often an excellent choice! ● Galera is good where InnoDB is good: general purpose, easy to use HA cluster
  • 44. Questions? Thank you for listening! Happy Clustering :-)