Webinar
Scaling MySQL: Scale Up versus Scale Out
August 16, 2012
Agenda


       1. Who We Are

       2. The Scalability Problem

       3. Scale Up vs. Scale Out

       4. Customer ROI/Case Studies

       5. Q & A
          (please type questions directly into the GoToWebinar side panel)




2
Who We Are

    Presenters:

    • Paul Campaniello, VP of Global Marketing
      25-year technology veteran with marketing experience at Mendix,
      Lumigent, Savantis and Precise.

    • Doron Levari, Founder
      A technologist and long-time veteran of the database industry.
      Prior to founding ScaleBase, Doron was CEO of Aluna.


3
Pain Points – The Scalability Problem

• Thousands of new online and mobile
  apps launch every day
• Demand for these apps climbs and
  databases can’t keep up
• Apps must provide uninterrupted
  access and availability
• Database performance and
  scalability are critical




4
Big Data = Big Scaling Needs

       Big Data = Transactions + Interactions + Observations

       [Figure: nested regions ERP, CRM, WEB and BIG DATA.
        Vertical axis: data volume (Megabytes, Gigabytes, Terabytes, Petabytes).
        Horizontal axis: Increasing Data Variety and Complexity.]

       • ERP: Purchase Detail, Purchase Record, Payment Record
       • CRM: Segmentation, Offer Details, Customer Touches, Support Contacts
       • WEB: Web Logs, Offer History, A/B Testing, Dynamic Pricing,
         Affiliate Networks, Search Marketing, Behavioral Targeting, Dynamic Funnels
       • BIG DATA: Sensors/RFID/Devices, Mobile Web, User Generated Content,
         Spatial & GPS Coordinates, User Click Stream, Sentiment,
         Social Interactions & Feeds, External Demographics, Business Data Feeds,
         HD Video, Audio, Images, Speech to Text, Product/Service Logs, SMS/MMS

       Source: The 451 Group & Teradata

5
Scalability Pain



[Figure: Infrastructure Cost ($) plotted against time.
 Curves: Predicted Demand, Actual Demand, Traditional Hardware, Dynamic Scaling.
 Annotations: "Large Capital Expenditure", "Opportunity Cost", "You just lost customers".]


    6
Scale Up

    • Instance Tuning
        – innodb_buffer_pool_size
        – query_cache_size
        – http://forge.mysql.com/wiki/Top10SQLPerformanceTips
    • SQL Tuning
        – EXPLAIN
        – Indexes
        – SELECT *
        – DISTINCT vs. GROUP BY
    • Hardware Upscale
        – SSD
    • Partitioning

7
Partitioning Performance

    • See the excellent 2010 presentation by Giuseppe Maxia
       – http://www.slideshare.net/datacharmer/partitions-performance-with-mysql-51-and-55

               Engine    No Partitions   Partitions
               InnoDB    4min 30s        13.19s
               MyISAM    25.03s          4.45s

    • Keeps data objects at their sweet spot
    • Helps fit indexes in RAM
    • Distributes session load across disks
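
The effect of partitioning on the amount of data a query has to touch can be sketched in a few lines of Python. This is only an illustration, not MySQL itself: the `year` column, the partition boundaries and all function names are hypothetical, and the in-memory lists stand in for table partitions.

```python
from bisect import bisect_right

# Hypothetical range partitioning on a "year" column, in the spirit of
# "VALUES LESS THAN (2010), (2011), (2012), MAXVALUE".
UPPER_BOUNDS = [2010, 2011, 2012]


def partition_for(year):
    """Index of the partition that holds rows for `year`."""
    return bisect_right(UPPER_BOUNDS, year)


def build_rows():
    """5000 illustrative rows of (id, year), 1000 rows per year 2008..2012."""
    return [(i, 2008 + i % 5) for i in range(5000)]


def build_partitions(rows):
    """Distribute rows into len(UPPER_BOUNDS) + 1 partitions."""
    partitions = [[] for _ in range(len(UPPER_BOUNDS) + 1)]
    for row in rows:
        partitions[partition_for(row[1])].append(row)
    return partitions


def scan_unpartitioned(rows, year):
    """Full scan: every row is examined. Returns (hits, rows_scanned)."""
    return [r for r in rows if r[1] == year], len(rows)


def scan_partitioned(partitions, year):
    """Pruned scan: only the one matching partition is examined."""
    part = partitions[partition_for(year)]
    return [r for r in part if r[1] == year], len(part)
```

Both scans return the same rows, but the pruned scan examines only the partition that can contain the requested year, which is the intuition behind keeping each object small enough for its indexes to fit in RAM.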




8
Scaling Up Hardware

    • The database usually gets the strongest servers
    • However, there is a limit to how much performance
      improvement can be gained by adding hardware
    • Some data:

        http://www.mysqlperformanceblog.com/2011/01/26/modeling-innodb-scalability-on-multi-core-servers/

9
Scale Up Pros & Cons

Pros                                          Cons
May result in major performance improvements  Tedious, never-ending
Mostly transparent to the application         SQL modifications are not always an option
Hardware upscale is easy                      Expensive
                                              Requires a unique skill set
                                              Requires downtime
                                              Limited: sooner or later the database engine
                                              itself becomes the bottleneck




10
The Database Engine is the Bottleneck...

 • Every write operation is at least 4 write operations inside the DB:
     – Data segment
     – Index segment
     – Undo segment
     – Transaction log
 • And multiple activities in the DB engine's memory:
     – Buffer management
     – Locking
     – Thread locks/semaphores
     – Recovery tasks




11
The Database Engine is the Bottleneck

 • Now multiply all of the above by 10 TB and 10,000 concurrent sessions
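
As a rough back-of-the-envelope sketch of that multiplication: the factor of 4 comes from the slide (data segment, index segment, undo segment, transaction log), while the per-session write rate is a purely hypothetical input chosen for illustration.

```python
# Lower bound from the slide: data segment, index segment,
# undo segment and transaction log.
MIN_INTERNAL_WRITES_PER_STATEMENT = 4


def min_internal_writes_per_second(concurrent_sessions,
                                   writes_per_session_per_second):
    """Lower bound on internal write operations per second.

    `writes_per_session_per_second` is an assumed workload figure,
    not something stated in the presentation.
    """
    return (concurrent_sessions
            * writes_per_session_per_second
            * MIN_INTERNAL_WRITES_PER_STATEMENT)
```

With 10,000 concurrent sessions each issuing one write per second (an assumption), `min_internal_writes_per_second(10_000, 1)` gives 40,000 internal writes per second, before counting buffer management, locking and recovery work.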




12
Scale Out (two methods)
    1. Read/Write Splitting
       [Diagram labels: Read, Write, Replication]

    2. Data Distribution (sharding)




    13
Read/Write Splitting

 • Write to the master, read from one or more slaves
 • Good for scaling reads
     – Although big data is still big data

 • Not good for scaling writes
 • Many issues:
     – Asynchronous replication lag: a read might not be up to date
     – A “query my own update” inside a transaction, if routed to a slave,
       will always be out of date
     – Can transaction isolation be preserved with stickiness?
     – Requires code changes
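
The routing decisions behind these issues can be sketched as follows. This is a minimal illustration under stated assumptions, not any product's implementation: the `ReadWriteRouter` class, its thresholds and the replica lag figures are all hypothetical, and classifying statements by their first keyword is a deliberate simplification (for example, `SELECT ... FOR UPDATE` would need to go to the master).

```python
import time


class ReadWriteRouter:
    """Hypothetical router: writes go to the master, reads to a fresh replica."""

    def __init__(self, master, replica_lag, max_lag_seconds=1.0,
                 sticky_seconds=2.0, clock=time.monotonic):
        self.master = master
        # replica name -> replication lag in seconds (assumed to come
        # from some monitoring source outside this sketch)
        self.replica_lag = replica_lag
        self.max_lag_seconds = max_lag_seconds
        self.sticky_seconds = sticky_seconds
        self.clock = clock
        self._last_write = {}  # session id -> time of last write

    def route(self, session_id, sql, in_transaction=False):
        now = self.clock()
        is_read = sql.lstrip().upper().startswith(("SELECT", "SHOW"))
        if not is_read:
            self._last_write[session_id] = now
            return self.master
        # Reads inside a transaction must see the session's own updates.
        if in_transaction:
            return self.master
        # Read stickiness after writes: stay on the master for a while.
        last = self._last_write.get(session_id)
        if last is not None and now - last < self.sticky_seconds:
            return self.master
        # Lag-based routing: use only replicas that are fresh enough.
        fresh = [name for name, lag in self.replica_lag.items()
                 if lag <= self.max_lag_seconds]
        if not fresh:
            return self.master
        return min(fresh, key=lambda name: self.replica_lag[name])
```

Doing this inside the application is what "requires code changes" refers to; every data-access path has to go through such a router.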




14
Data Distribution (sharding)

 • If done right and all the way:
        – The ultimate scaling machine
        – Provides significant performance improvements
        – The only way to really improve both reads and writes
 • However, if done in-house (and not done properly), it can cause:
        – Substantial development efforts
        – Silos of data with no merging
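
The core of data distribution is a function that maps a row's key to one database. A minimal hash-based sketch follows; the shard names and the choice of CRC32 with a modulo are assumptions made for illustration, not a description of how any particular product distributes data.

```python
import zlib

# Hypothetical shard names.
SHARDS = ["database_a", "database_b", "database_c", "database_d"]


def shard_for(key, shards=SHARDS):
    """Map a sharding key (e.g. a customer id) to one shard, deterministically."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return shards[digest % len(shards)]


def group_by_shard(keys, shards=SHARDS):
    """Group keys by target shard, e.g. to batch writes per database."""
    groups = {name: [] for name in shards}
    for key in keys:
        groups[shard_for(key, shards)].append(key)
    return groups
```

Even this small sketch hints at the in-house effort: with plain modulo hashing, changing the number of shards remaps most keys, so data must be redistributed, and any query that spans keys has to be sent to several shards and merged.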




     http://www.scalebase.com/don’t-ever-ever-write-your-own-sharding-code
15
Scale Out Features and Benefits

     Feature                                  Benefit
     Automatic data distribution (sharding)   Scale data-, read- and write-intensive applications
       Parallel query execution               Great performance of cross-db queries & maintenance commands
       Query result aggregation               Support of sophisticated cross-db queries, even with
                                              ORDER BY, GROUP BY, LIMIT, aggregate functions…
       Online data redistribution             Flexibility: no need to over-provision
                                              No downtime
     Read/Write splitting                     Easily scale read-intensive applications
       Replication lag-based routing          Improves data consistency and isolation
       Read stickiness after writes           Ensures consistent and isolated database operation
     100% compatible MySQL proxy              Applications unmodified
                                              Standard MySQL tools and interfaces
     MySQL databases untouched                Data is safe within MySQL InnoDB/MyISAM/any engine
     Data distribution review and analysis    Optimization of data distribution policy
     Data consistency verifier                Validate system-wide data consistency
     Real-time monitoring and alerts          Simplify management, reduce TCO
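
Parallel query execution and result aggregation from the table above can be illustrated with a small scatter-gather sketch. Everything here is hypothetical: the in-memory `SHARD_DATA` stands in for four MySQL databases, and `query_shard` stands in for running `ORDER BY amount DESC LIMIT n` on one of them.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

# Hypothetical shards, each holding (customer_id, amount) rows.
SHARD_DATA = {
    "database_a": [(1, 50), (5, 20)],
    "database_b": [(2, 70), (6, 10)],
    "database_c": [(3, 30)],
    "database_d": [(4, 90), (8, 40)],
}


def query_shard(name, limit):
    """Stand-in for 'SELECT ... ORDER BY amount DESC LIMIT n' on one shard."""
    rows = sorted(SHARD_DATA[name], key=lambda r: r[1], reverse=True)
    return rows[:limit]


def top_n_by_amount(limit):
    """Cross-shard ORDER BY amount DESC LIMIT n."""
    # Scatter: run the same query on every shard in parallel.
    with ThreadPoolExecutor(max_workers=len(SHARD_DATA)) as pool:
        partials = list(pool.map(lambda n: query_shard(n, limit), SHARD_DATA))
    # Gather: merge the already-sorted partial results, then apply LIMIT.
    merged = heapq.merge(*partials, key=lambda r: r[1], reverse=True)
    return list(islice(merged, limit))


def total_amount():
    """Cross-shard SUM(amount): sum the per-shard partial sums."""
    with ThreadPoolExecutor(max_workers=len(SHARD_DATA)) as pool:
        partial_sums = pool.map(
            lambda n: sum(amount for _, amount in SHARD_DATA[n]), SHARD_DATA)
        return sum(partial_sums)
```

Each shard only has to return its own top `limit` rows, because no row outside a shard's local top n can appear in the global top n; aggregates such as SUM and COUNT combine the same way from per-shard partial results.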
16
Scale Out Provides Immediate & Tangible Value



     [Diagram: two Application Servers, BI and Management connect to
      Database A, Database B, Database C and Database D; each database
      has its own standby (Standby A to Standby D).]

17
Typical Scale Out (ScaleBase) Deployment



     [Diagram: two Application Servers, BI and Management connect through the
      ScaleBase Data Traffic Manager, alongside ScaleBase Central Management,
      to Database A, Database B, Database C and Database D; each database
      has its own standby (Standby A to Standby D).]

18
Scaling Out Achieves Unlimited Scalability

     [Chart: vertical axis "Throughput" (0 to 160000); horizontal axis
      "Number of Databases". Legend: Throughput (TPM), Total DB Size (MB),
      # Connections.]

     Data labels shown on the chart:

     Number of Databases    1      2      4      6      8      10     14
     Upper labels           6000   12000  24000  36000  48000  60000  84000
     Lower labels           500    500    1000   1500   1500   2000   2500

     19
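
The data labels on the chart can be checked for linearity with a few lines of Python. The sketch assumes that the larger set of labels (6000 to 84000) is the Throughput (TPM) series; the extracted chart does not state which legend entry the labels belong to.

```python
# Data labels from the chart; assumed (not confirmed by the source)
# to be the Throughput (TPM) series.
DATABASES = [1, 2, 4, 6, 8, 10, 14]
THROUGHPUT_TPM = [6000, 12000, 24000, 36000, 48000, 60000, 84000]


def throughput_per_database(databases=DATABASES, throughput=THROUGHPUT_TPM):
    """Throughput divided by number of databases, for each data point."""
    return [t / n for n, t in zip(databases, throughput)]


def speedup_over_single(databases=DATABASES, throughput=THROUGHPUT_TPM):
    """Throughput relative to the single-database data point."""
    return [t / throughput[0] for t in throughput]
```

Under that assumption every data point works out to the same throughput per database, and the speedup over a single database equals the number of databases, which is what linear scaling means for the range shown (1 to 14 databases).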
Summary

     • Database scalability is a significant problem
         – Growing trends such as Big Data and mobile only compound it
     • Scale Up helps somewhat, but has limitations

     • Scale Out provides a longer-term and more cost-effective solution

     • ScaleBase has an effective scale-out solution with a proven ROI
         – ScaleBase improves performance and requires NO changes to your
           existing infrastructure




20
Questions (please enter directly into the GTW side panel)



617.630.2800

www.ScaleBase.com

doron.levari@scalebase.com

paul.campaniello@scalebase.com


21
Thank You
22

ScaleBase Webinar 8.16: ScaleUp vs. ScaleOut
