Web Scalability - Part - 2

Web servers can scale by adding machines; languages like Ruby, PHP, and Python are not the bottleneck. Serving media requires a "mini-cluster" per item for scalability and high availability. Thumbnails can become a bottleneck because many small file accesses stress the disks. Database replication spreads reads across replicas but does not scale writes, and replicas can lag behind the master. Solutions include splitting replicas into pools, tuning the RAID configuration, and sharding the database.

Uploaded by

fizo


Supercourse Scalability

Web Servers
• Linux.
• Can usually scale by adding machines.
• Ruby, Python, PHP, Groovy.
– Web code is not the bottleneck.
– Most time is spent waiting on RPCs.
– Development speed is critical.
The CPU is not the bottleneck. Any language should be fast enough; it just should be a dynamic one.
Serving Media
• Each piece of media should be hosted by a “mini-cluster”.
– Scalability.
– More than one HDD to serve the media.
– Online backup.
• Apache → lighttpd (Apache suffered from high load and context switching).
• Switch from single process to multi-process.
Serving Media
[Figure: the most popular content is served from CDNs; moderately played content is served over the Internet from SuperCourse servers Serv1 to Serv4.]
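The split between CDN-hosted and self-hosted media can be sketched as a small routing function. This is only an illustration: the view-count threshold, the function name `pick_host`, and the CRC-based server choice are assumptions, not details from the slides.

```python
import zlib

# Server names taken from the diagram; the threshold is a hypothetical value.
SERVERS = ("Serv1", "Serv2", "Serv3", "Serv4")


def pick_host(video_id, views, cdn_threshold=100_000):
    """Send the most popular content to the CDN, the rest to our own servers."""
    if views >= cdn_threshold:
        return "CDN"
    # Stable hash so the same video always maps to the same server.
    return SERVERS[zlib.crc32(video_id.encode("utf-8")) % len(SERVERS)]
```

For example, `pick_host("intro-lecture", 250_000)` returns `"CDN"`, while a rarely played video is always mapped to the same one of the four servers.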
Serving Thumbnails
• Thumbnails are scary: they may represent a bottleneck.
– Disk seeks.
– Many small objects.
– High number of requests/sec.
Serving Thumbnails
• Limit on the number of files in a directory (ext3).
• Squid as a reverse proxy; Varnish is a better choice.
• Apache may not be sufficient for disk reads (high load, too many disk reads).
Thumbnails: lighttpd/aio
• Lighttpd is single threaded…
[Figure: the main thread hands disk reads to worker threads; Worker Thread 1 performs two disk reads and Worker Thread 2 performs one.]
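The pattern in the figure, a main thread that delegates blocking disk reads to worker threads, can be sketched with a thread pool. This is an analogy in Python rather than lighttpd's actual implementation; the function names and the temporary files are made up for the demo.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    """Blocking disk read; runs on a worker thread."""
    with open(path, "rb") as f:
        return f.read()


def serve_thumbnails(paths, workers=2):
    """Main thread submits reads to the pool and collects results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(read_file, paths))


def demo():
    """Create three small hypothetical thumbnail files and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(3):
            path = os.path.join(tmp, f"thumb{i}.jpg")
            with open(path, "wb") as f:
                f.write(b"x" * (i + 1))
            paths.append(path)
        return [len(data) for data in serve_thumbnails(paths)]
```

The main thread never blocks on an individual read, so one slow disk seek does not stall the other requests.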
Thumbnails
• Accessing many small files may become a bottleneck (disk reads are the limiting factor).
Thumbnails: BT
• Google uses a system called BTFE in YouTube, Google Video, and image search.
– Based on Google Bigtable.
– Avoids the small-files problem.
– Various forms of caching (multiple cache layers based on location, etc.).
Databases
• Store metadata (users, bookmarks, comments, etc.).
• Database performance degrades with disk reads.
• Pay attention to swap in the Linux kernel, as the OS may swap the database engine in and out.
DB Optimizations
• Query Optimizations
• Batch Jobs
• memcached
• App server caching.
• Pre-calculation of common queries.
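As a rough sketch of the memcached and app-server caching items above, the cache-aside pattern looks like this. A plain dictionary stands in for a memcached client, and `fake_comment_count` is a hypothetical expensive query; neither comes from the slides.

```python
class CachedQueries:
    """Cache-aside wrapper: check the cache first, fall back to the database."""

    def __init__(self, run_query):
        self.run_query = run_query  # hypothetical function that hits the DB
        self.cache = {}             # stand-in for a memcached client
        self.db_hits = 0            # counts how often the DB was queried

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.db_hits += 1
        value = self.run_query(key)
        self.cache[key] = value
        return value

    def invalidate(self, key):
        """Call after a write so the next read refetches fresh data."""
        self.cache.pop(key, None)


def fake_comment_count(video_id):
    # Hypothetical expensive query, replaced by trivial arithmetic here.
    return len(video_id) * 10
```

Repeated reads of the same key reach the database only once until the key is invalidated.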
DB Replication
[Figure: all writes go to the master (mostly write). SQL replication copies them to three replicas, each of which applies the writes and serves reads.]

DB Replication: Too many writes
[Figure: same topology. The master (mostly write) replicates every write to three replicas, each of which must apply all the writes while also serving reads.]
Replication doesn’t help writes.

Replica Lag
• Replication is asynchronous.
• Replicas can fall behind the master database and serve old data.
• MySQL replication.
Replication: Master
[Figure: three client threads (Thread1, Thread2, Thread3) send update1, update2, and update3 to the master database concurrently.]
Multiple threads = concurrency on multi-disk, multi-CPU systems.
Replication: Replica
[Figure: a single replication thread applies update1, then a long update2, then update3 to the replica database. The long update2 blocks update3 until it finishes.]
Single thread = limited parallelism, higher likelihood of a slow query stalling updates.
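The difference between the master and the replica can be illustrated with a toy timing model. It assumes, as a simplification, that the master has enough threads and no lock contention, so each update finishes after its own duration; the durations themselves are made-up numbers.

```python
from itertools import accumulate


def master_finish_times(durations):
    """Each update runs on its own client thread, so it finishes
    after its own duration (idealized: no contention)."""
    return list(durations)


def replica_finish_times(durations):
    """One replication thread applies updates in order, so delays accumulate."""
    return list(accumulate(durations))


# update1 is quick, update2 is long, update3 is quick (hypothetical units).
DURATIONS = [1, 10, 1]
```

With these numbers, update3 finishes at time 1 on the master but only at time 12 on the replica, because it waits behind the long update2.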
Replication Thread Unhealthy
• Normal replication thread:
– Update row 100 (cache miss)
– Update row 2 (cache hit)
– Update row 8 (cache miss)
– Update row 40 (cache miss)
– Update row 2 (cache hit)
Cache misses require slow disk I/Os, causing a reduction in replication speed.
DB (Abstract view)
• DB updates involve two steps:
– Step 1: reading the affected DB pages.
– Step 2: applying the changes.
• Prefetch the pages needed by step 1 (a cache primer that reads the SQL buffer to find the affected rows).
• This is a difficult solution, and it will not solve all of the replicas' problems.
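The effect of a cache primer can be sketched with a toy model of the replication thread. The costs (10 time units for a miss, 1 for a hit) are hypothetical, each row is treated as its own page, and the update sequence is the one from the replication-thread slide, with row 2 already cached.

```python
def apply_updates(rows, cache, miss_cost=10, hit_cost=1):
    """Replay updates on the single replication thread; return total time units."""
    total = 0
    for row in rows:
        if row in cache:
            total += hit_cost
        else:
            total += miss_cost  # slow disk read to load the page
            cache.add(row)
    return total


def prime_cache(upcoming_rows, cache):
    """Cache primer: read ahead in the pending updates and load the pages
    they will touch, outside the replication thread."""
    cache.update(upcoming_rows)


ROWS = [100, 2, 8, 40, 2]

cold_time = apply_updates(ROWS, {2})  # three misses, two hits

primed_cache = {2}
prime_cache(ROWS, primed_cache)
primed_time = apply_updates(ROWS, primed_cache)  # all hits
```

In this model the disk reads still happen, but they are moved off the replication thread, which is why the replica catches up faster.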
Replicas: Summary
• Too many read replicas.
• Writes start crowding out reads.
• Replication lag.
• Extraordinary measures needed to stay alive…
Database Pools
• Split the replica databases into two pools:
– Media watch.
• Most visited pages, data displayed with the media, etc.
– General.
• Lower priority than media.
• Less efficient queries.
• Less popular.
• Replicas still lag, but less than before.
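A minimal sketch of routing reads to the two pools is shown below. The class name, the replica names, and the round-robin choice within a pool are assumptions for illustration.

```python
import itertools


class ReplicaPools:
    """Route reads to a dedicated 'watch' pool or to the 'general' pool."""

    def __init__(self, watch_replicas, general_replicas):
        # Round-robin within each pool (an assumed balancing policy).
        self._pools = {
            "watch": itertools.cycle(watch_replicas),
            "general": itertools.cycle(general_replicas),
        }

    def pick(self, page_type):
        pool = "watch" if page_type == "watch" else "general"
        return next(self._pools[pool])


pools = ReplicaPools(["w1", "w2"], ["g1"])
```

Slow, low-priority queries then only compete with each other on the general pool and cannot delay the watch pages.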
DB RAID Tweaking
• Monolithic RAID 10 volume (10 disks).
• Linux sees only 1 volume, so it does not schedule many parallel disk I/Os.
DB RAID
• Split the array into 5 volumes of 2 disks each.
• Linux then sees 5 volumes instead of 1 logical volume, allowing it to schedule disk I/O more aggressively.
DB Partitions
• Partition the monolithic DB into multiple shards.
• Try to balance the traffic across these shards.
• Do this by monitoring active users and moving them between shards.
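One way to sketch shard assignment with movable users is a default modulo mapping plus a table of overrides. The class, its method names, and the modulo default are assumptions, not the deck's design.

```python
class ShardDirectory:
    """Map users to shards, with overrides for users moved by rebalancing."""

    def __init__(self, shard_count):
        self.shard_count = shard_count
        self.overrides = {}  # user_id -> shard, for users that were moved

    def shard_for(self, user_id):
        if user_id in self.overrides:
            return self.overrides[user_id]
        return user_id % self.shard_count  # assumed default placement

    def move(self, user_id, shard):
        if not 0 <= shard < self.shard_count:
            raise ValueError("unknown shard")
        self.overrides[user_id] = shard

    def load(self, active_user_ids):
        """Count active users per shard, as a simple traffic estimate."""
        counts = [0] * self.shard_count
        for uid in active_user_ids:
            counts[self.shard_for(uid)] += 1
        return counts
```

With two shards and active users 0, 2, 4, and 1, the load starts at `[3, 1]`; moving user 4 to shard 1 evens it out to `[2, 2]`. Moving a user in practice also means copying that user's rows, which this sketch leaves out.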
Replace DB by MapReduce
• Consider replacing traditional DBs with MapReduce.
• MySQL does not run a single query in parallel.
• MapReduce spreads the computation across many machines.
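The map and reduce steps can be sketched in a single process. A real MapReduce framework would run each map task on a separate machine; the view-log data and the function names here are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_chunk(view_log_chunk):
    """Map step: count views per video within one chunk of the log."""
    return Counter(view_log_chunk)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts."""
    return left + right


def count_views(chunks):
    """Run the map step on every chunk, then fold the partial results."""
    partials = map(map_chunk, chunks)
    return reduce(reduce_counts, partials, Counter())
```

Because each chunk is counted independently, the map step can be spread over as many machines as there are chunks, which is the parallelism a single query cannot provide.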
