Web Application Performance Using Six Sigma

This document discusses using Six Sigma methodology to improve web application performance. It begins by defining quality and the importance of performance. It then explains how traditional approaches often fail to solve performance problems and that Six Sigma provides a structured way to define the problem, measure key metrics, analyze root causes, improve performance, and ensure issues do not return. The document outlines the Six Sigma DMAIC process and how it can be applied to a performance project, highlighting some common challenges. It dives deeper into the define phase, including creating user scenarios, setting performance goals, understanding the voice of the customer, and defining critical to quality metrics.

Copyright
© Attribution Non-Commercial (BY-NC)

Improving Web Application Performance Using Six Sigma

Mukesh Jain
Principal Quality Manager
Global Online Services
Microsoft Corporation

We will talk about…
• What is Quality?
• Types of Performance Measurement
• How to Measure Web App Performance?
• Traps we fall into and how Six Sigma showed us the right path
• Overview of Six Sigma
• Using Six Sigma to Improve Performance
• 5 Steps to Improve Web App Performance
• Ensuring problems do not come back
• Questions

What's the biggest problem today?

Today's market demands the best of everything:
• Quality
• Cost
• Schedule

Reliable, responsive and secure service 24x7 is the demand of the global customer.

Defects are inevitable
• Fact: No software can be guaranteed 100% defect-free
• Action: No action
• Result: We make it horribly true
• Ask: Why did the defect happen?
• Do: Analyze data and improve the process to prevent it
And sustain it → Six Sigma

What is Quality?
• Meets expectations
• Serves the Purpose / Needs
• Intuitive / Usability
• Reliable
• Responsive / Performance
• Security / Privacy
• High Quality / Low Defect
• Getting it right the first time, every time
• Think Global

Why is Performance important?
• More online activities
• Competitors are faster
• Users have more options
• Attracting and retaining users
• We cannot overcome the speed of light
• "Release it and then fix it" is no longer an option: we may lose mind-share and we may not get a second chance

Solving Performance Problems
Your app is having performance issues
• Traditional way
  • Put more people
  • Do more performance testing
  • Buy performance testing tools
• Outcome/Results
  • We often march on a journey to solve a problem without understanding the problem
  • If we don't know what we are improving, how would we know if we have improved?
• Structured way of solving the problem
  • Using Six Sigma, we can define the problem, measure the right thing, analyze the root cause and then solve the right problem – the right way…

What is Six Sigma?
• Structured problem-solving methodology
• Solving the right problem – the right way
• Focus on finding and fixing the root cause
• Ensuring the problem does not come back
• Drive Continuous Improvement
• DMAIC

Web Page Performance Improvement: DMAIC cycle

1. DEFINE: Identify major user scenarios & perf goals
2. MEASURE: Use the right set of tools to measure & baseline current performance. Measure the following:
  • Page Load Time (1 & 2)
  • Time to First Byte
  • # of round trips
  • Avg bytes downloaded
3. ANALYZE: Analyze data, find reasons / root cause for slow perf. Consult perf experts
4. IMPROVE: Fix root cause of perf problems in your code/architecture/deployment
5. CONTROL (SUSTAIN): Monitor to ensure perf does not degrade with future changes to the code

Six Sigma - DMAIC

• Define (D): Zero in on a specific problem with a defined return on effort
• Measure (M): Determine current performance of the process
• Analyze (A): Validate key drivers of performance (root cause of the problem)
• Improve (I): Improve performance and validate realized results
• Control (C): Implement controls to ensure continued performance

Project Phases and Deliverables

Define:
• Project selection
• Project charter
• Critical to Customer (CTQ) needs
• High level process map

Measure:
• Key output variables (metrics or Y's)
• Possible causes of defects (X's)
• Data collection and presentation plan
• Current performance
• Internal/external benchmarking

Analyze:
• Key causes (vital few) of defects (X's)

Improve:
• Improvement strategy
• Prioritize solutions
• Tested & measured solutions
• Final solutions
• Financial impact of the project

Control:
• Lock in the results (control plan)
  • Mistake proofing
  • Control points
  • Monitoring plan
  • Positive hand off of control plan

The Six Sigma Project - challenges

• Bottom-up approach
• Difficult to get buy-in
• Myth: Six Sigma cannot be used for software
• "We don't need a Six Sigma project to understand this problem and to fix it"
• "It's common sense – we can fix it"
• "Do it if you have time – on your personal time"
• Measuring the right things
• Avoiding the temptation of jumping to a solution

Six Sigma – Define Phase
• Project Selection
  • The right project (Strategic, ROI, etc.)
• Project Charter
  • Why are we doing this? Business Case
  • Goal, Scope, Time-line
  • Exec Sponsors, Team & Resources
  • Example: Web App XYZ (which drives 40% of revenue to the company) has performance problems (based on survey). This is impacting a significant % of users and we are seeing a decline in users to the site.
• Customer Focus
  • Voice of Customer (VOC) – what do customers need
  • Critical to quality (CTQ)
  • Example: Customers have repeatedly complained about the performance of the XYZ app. They are not able to complete transactions in a timely manner.
• Current Situation
  • Process Map
  • Current Performance

The Define Phase: Problem Statement

Testers did not do performance testing

Performance bugs were not fixed

Performance goals/expectations not clear

Performance of any product/feature should be managed in a better way

The Define Phase: Goal

Ship zero performance bugs

Find twice the # of performance bugs

Performance goals should be clear

Improve test's ability to find 95% of the performance bugs before Beta1

Define Perf User Scenarios
• Understand the User (Voice of the Customer)
  • First time visitor / Guest user / authenticated user
  • Returning users with cache/no cache
  • User on the same site/session
  • User from other MS sites/domains (w/ passport)
  • User Demographics (Geo, Home/Office, machine config, consumer/info worker/social, connection speed)
  • Typical user transactions, Back/Forward/Refresh usage
• Perf goal
  • Regional competitor performance
  • What is acceptable performance
  • Do not use “it should be fast”; try “JP broadband users should be able to get the page in 4 seconds (75th percentile) when they visit for the first time (PLT1), and in 2 seconds for PLT2”

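A percentile-based goal like the one above can be checked mechanically. The following is a minimal sketch, assuming page load times have already been collected per user segment; the helper names (`percentile`, `meetsPerfGoal`) and the sample values are hypothetical, and the nearest-rank definition of a percentile is one common choice among several.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical helper: true when the p-th percentile is within the goal.
function meetsPerfGoal(samplesSec, goalSec, p = 75) {
  return percentile(samplesSec, p) <= goalSec;
}

// Made-up PLT1 samples (seconds) for one user segment.
const plt1Samples = [2.1, 2.8, 3.0, 3.4, 3.9, 4.2, 5.5, 6.0];
console.log(percentile(plt1Samples, 75), meetsPerfGoal(plt1Samples, 4));
```

With these made-up samples the 75th percentile is 4.2 seconds, so a 4-second PLT1 goal would not be met even though most individual samples are under 4 seconds.
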
The Define Phase: Voice of Customer
• Performance at least as good as last release
• Comparable to the competitors
• UI Responsiveness
• Notification in case of delays
• If it appears slow – it is slow, irrespective of what the data says

The Define Phase: Critical To Quality (CTQ)

Examples:
• For common scenarios, improve performance by at least 10% (compared to last release)
• Any action taking more than 3 seconds should have a progress bar/notification
• Similar/predictable performance in any connection mode (speed, latency)
• Handle failures/major issues gracefully

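The first two CTQ examples are numeric, so they can be written as simple checks. This is only a sketch: the function names are hypothetical, and the thresholds are copied from the examples above rather than from any real product.

```javascript
const PROGRESS_THRESHOLD_MS = 3000; // from the CTQ: "more than 3 seconds"
const MIN_IMPROVEMENT = 0.10;       // from the CTQ: "at least 10%"

// True when an action is expected to run long enough to need a
// progress bar or notification.
function needsProgressIndicator(expectedDurationMs) {
  return expectedDurationMs > PROGRESS_THRESHOLD_MS;
}

// Fractional improvement of the current release over the previous one.
function improvement(previousMs, currentMs) {
  return (previousMs - currentMs) / previousMs;
}

// True when the current release meets the 10% improvement CTQ.
function meetsImprovementCtq(previousMs, currentMs) {
  return improvement(previousMs, currentMs) >= MIN_IMPROVEMENT;
}

console.log(needsProgressIndicator(3500), meetsImprovementCtq(5000, 4000));
```
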
International Internet Routes

Notes
• Minimum bandwidth to be seen on this map is 14 Gigs
• Does not report bandwidth within a country

Latency and the impact on page load times based on number of round trips

RT = Round Trips
• Red line is 34 RT
• Green line is 21 RT
• Dotted green line is 10 RT
• Blue line is 3 RT

Source data for timings is the 75th percentile for the country in question, from: http://msncore/performance/netsmart/Netstats.asp
Microsoft Confidential

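The effect of round trips on load time can be approximated with simple arithmetic: each sequential round trip costs one round-trip time. The sketch below applies that to the four round-trip counts plotted on the chart. The 300 ms round-trip time is an assumption borrowed from the Measure example later in the deck, and the function name is hypothetical.

```javascript
// Delay (seconds) spent purely waiting on the network for a given
// number of sequential round trips at a given round-trip time.
function roundTripDelaySec(roundTrips, rttMs) {
  return (roundTrips * rttMs) / 1000;
}

const roundTripCounts = [34, 21, 10, 3]; // the four lines on the chart
const assumedRttMs = 300;                // assumption, not from this slide

for (const rt of roundTripCounts) {
  console.log(`${rt} RT -> ${roundTripDelaySec(rt, assumedRttMs)} s`);
}
```

At 300 ms per round trip, 34 round trips cost 10.2 seconds of waiting while 3 round trips cost 0.9 seconds, before a single byte of payload is counted.
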
Network Round trip delays…

Performance/Load/Stress Testing
• Performance testing
  • User-scenario testing (typical case & best case)
  • Establish baseline & perform trend analysis
  • Detect performance issues
  • Tools: WebRunner, WANSim, HttpWatch, etc.
• Load testing (volume/longevity/endurance)
  • Expected MAX # of concurrent users
  • Volume of data
  • Very long active sessions
• Stress testing (negative testing)
  • What happens when the load significantly exceeds expectations OR the system goes through resource constraints/failures?
  • Does the system recover gracefully from failure? (e.g. Taiwan earthquake in Dec 2006)

Six Sigma - Measure
• Measurement Process – Data Collection Plan
  • Testing, monitoring, measurement system analysis, sampling
  • Clear, concise definition of variables, process involved
• Identify key measures/drivers of performance
  • Y = F(X1, X2, X3, X4, …)
  • Ishikawa (Fishbone) diagram – Cause & Effect diagram
  • Internal / External Benchmarking
• Baseline current performance & impact
  • Identify and measure current performance and its impact on the customer; collect more data if required
Example: On a 300 kbps connection with a 300 ms round-trip time, it takes 6 seconds to load the page for the PLT1 case and 2.5 seconds for PLT2. 20% of our users abandon the page before it loads. Web Page X has 2 HTML files, 3 .js files, 3 .css files and 5 images; the web app opens 2 parallel TCP ports to download them.

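The example above can be turned into a rough instance of Y = F(X1, X2, …). The sketch below estimates only the latency floor, under loudly simplified assumptions: one round trip per file, files spread evenly over the parallel connections, and no DNS lookup, TCP handshake or transfer time. The function names are hypothetical; the file counts, connection count and round-trip time come from the example.

```javascript
// File counts for Web Page X, as given in the example.
const pageX = { html: 2, js: 3, css: 3, images: 5 };

function totalFiles(page) {
  return Object.values(page).reduce((sum, n) => sum + n, 0);
}

// Number of sequential request rounds when files are spread evenly
// across the available parallel connections (simplifying assumption).
function sequentialRounds(files, connections) {
  return Math.ceil(files / connections);
}

// Lower bound on load time (seconds) from round-trip latency alone.
function latencyFloorSec(files, connections, rttMs) {
  return (sequentialRounds(files, connections) * rttMs) / 1000;
}

const files = totalFiles(pageX);
console.log(files, latencyFloorSec(files, 2, 300), latencyFloorSec(files, 6, 300));
```

Under these assumptions the 13 files need 7 request rounds on 2 connections, or 2.1 seconds of pure latency at 300 ms; the same page on 6 connections would need 3 rounds, or 0.9 seconds. The measured 6 seconds for PLT1 also includes transfer time on the 300 kbps link, which this sketch ignores.
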
Types of Performance Measurements
• Client UI Response Time
• Server Response Time
• Load/Stress
• Bytes over wire/Throughput
• Availability
• Latency (anywhere in the world)
• Browser Processing & Rendering time
• User Machine - Resource utilization
• Perceived Performance

If the user feels the product is slow, your product is slow – no matter what our data says…

• PLT1: The user visiting the site with NO CACHE
• PLT2: Returning user with CACHE

What is Performance

• PLT1: The user visiting the site with NO CACHE
• PLT2: Returning user with CACHE

Measure
• Business Measurements
  • # of unique users
  • # of page views
  • Click-Thru-Rate (CTR)
  • Revenue
  • Market Share
  • Net Promoter
• Other Measurements
  • % of people on PLT1 / PLT2
  • Errors
  • Abandon rate (Incomplete/Closed/Click-Away)

Measure: Key variables for Performance
• Factors contributing to Web Page Performance
  • # of Files, Static/Dynamic content
  • Page Load Time
  • Bytes downloaded
  • DNS Lookup time
  • Peak hours / Load & Stress
  • User spread / Global?
  • Data Centers / CDN / Redirects
  • Multiple versions of the app
  • Web Page Architecture (parallel/sequential download)
  • Compression
  • Expiry dates
  • Keep-Alive
  • PLT1 / PLT2 (Caching?)

Performance – Cause-Effect Diagram

Effect: End-User Page Load Time

Application / Web Page:
• # of Files
• JS/CSS processing
• # of TCP ports
• Caching/Keep-Alive
• Static/Dynamic content
• Page Architecture
• Total Bytes / Compression

End-User:
• Machine config
• Bandwidth
• Latency

Server/Infrastructure:
• DNS Setup
• GeoLocation
• DataCenter/CDN
• Load/Stress
• Peak Hours
• # of Servers
• App versioning

Best Practices @ http://MSNPerf

Six Sigma - Analyze
• Find root cause of the problems
  • Analyze the data from the Measure phase
  • Identify the vital few variables (x)
  • Perform correlation and regression analysis
  • Data stratification
  • Use the 5 Whys technique
  • Hypothesis testing
• Sources of variation
  • Use Cause-Effect diagram
  • Plot data on graphs (trend, releases)
  • Special Cause / Common Cause
Example: The # of files on the site is 10; some of these files can be combined, and 2 files are not compressed. The majority of the users who abandon the site are from the UK (latency) and dial-up users from the US (slow connection). The problem started on March 01 (when release 2.2 went live).

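Correlation analysis is one way to shortlist the vital few variables. The sketch below computes a Pearson correlation coefficient between one candidate X (bytes downloaded) and Y (page load time). The function name and the sample data are hypothetical and exist only to show the calculation; a strong correlation is a hint to investigate, not proof of a root cause.

```javascript
// Pearson correlation coefficient between two equally long samples.
// Returns a value in [-1, 1]; NaN if either sample has zero variance.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need two samples of equal length (>= 2)");
  }
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Made-up measurements: page weight (KB) and load time (seconds).
const pageKB = [120, 180, 240, 300, 420];
const loadSec = [2.0, 2.9, 3.5, 4.8, 6.1];
console.log(pearson(pageKB, loadSec).toFixed(3));
```
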
PLT1/PLT2 User Distribution

Analysis
• From the data it looks like 17% of the users are in PLT1 and 83% in PLT2
• There are several users in a middle category (16K to 21K)

Impact of slow performance on user

As it takes more time to display the page, users STOP the page.

Typically after 5 seconds of waiting, 15% of users stop the page from loading, and the % grows with time.

User count & Performance across states

Bytes & Page Load Time distribution

Hourly Users & PLT distribution

Six Sigma - Improve
• Improvement strategy & plan
• Improvement solution selection
  • Generate ideas – involve a diverse team
  • Identify and rank solution alternatives
  • Pilot solutions and select final solution
  • FMEA
  • Design of experiments
• Test and implement final solution
• Communication plan
• To-Be Process Map
• Track improvement, monitor trend
• Share success/failure stories → Best Practices

Six Sigma – Control (Sustain improvements)
• Process Monitoring/Control Plan
  • Mistake Proofing (Poka-Yoke)
  • Control Chart
  • Response Plan
• Document standard process/procedure
• Train resources
• Share Learning
• Project closure
  • Positive Handoff
  • Measure Benefits (Financial / Customer Sat)
  • Celebrate

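A control chart flags measurements that fall outside limits derived from a stable baseline. The sketch below is a simplified version: it uses the baseline mean plus or minus three sample standard deviations, whereas a textbook individuals chart usually estimates sigma from the moving range. The function names and data are hypothetical.

```javascript
// Control limits from a baseline period: mean +/- 3 sample standard
// deviations (simplified; not the moving-range estimate).
function controlLimits(baseline) {
  if (baseline.length < 2) throw new Error("need at least 2 baseline points");
  const n = baseline.length;
  const mean = baseline.reduce((a, b) => a + b, 0) / n;
  const variance =
    baseline.reduce((acc, x) => acc + (x - mean) ** 2, 0) / (n - 1);
  const sd = Math.sqrt(variance);
  return { mean, ucl: mean + 3 * sd, lcl: mean - 3 * sd };
}

// Measurements outside the limits; these would trigger the response plan.
function outOfControl(samples, limits) {
  return samples.filter((x) => x > limits.ucl || x < limits.lcl);
}

// Made-up daily 75th-percentile page load times (seconds).
const baselinePlt = [3.8, 4.1, 3.9, 4.0, 4.2, 3.7, 4.0];
const limits = controlLimits(baselinePlt);
console.log(limits, outOfControl([4.1, 3.9, 6.5], limits));
```
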
Sample Control Plan
• We will get user feedback on the Beta1 build and evaluate if we need any more improvements
• We will continue to have a meeting (every other week) to discuss any perf test related issues
• Person ABC will do the measurements and monitor the app performance until Beta1 and analyze/report user feedback
• A monthly status mail will be sent (on the 10th of each month) to the Project team, Product Manager and VP, reporting the stats related to the project (perf for various categories of users, improvement over time, survey/feedback from users, etc.)
• If we find that we are deviating from our goal, we will call a meeting (within 2 days), analyze the problem and develop a solution

Lessons Learned / Suggestions

Lessons Learned:
• Six Sigma process measurements helped us uncover problems in a structured way and come up with solutions to eliminate them or minimize their impact.
• Implementing improvements requires involvement from everyone at all levels and in all disciplines.

Suggestions:
• By getting all disciplines (dev/pm/test) involved, we can focus on preventing problems rather than relying on finding and fixing them.
• Integrate new practices into the development cycle once the improvements have been validated.

Results
Note: Specific % & # are purposely removed from this presentation

• Product benefits:
  • Page size drastically reduced
  • Site performance improved
  • User satisfaction improved
  • Increased click-thru-rate
• Overall impact:
  • Increased focus on performance
  • More investment in performance (expanding the performance group)

Visual Round-Trip Analyzer (VRTA)

Figure annotations:
• Client Port – Browser to Server Connection
• Servers
• Bandwidth Utilization
• File Transfer Duration
• Color coded by file type
• Time in seconds

http://msncore/performance/netsmart/VRTA_animation/sample1/main.html

Use more parallel TCP ports

[Figure: a bad example and a good example, side by side]

Unblock JavaScript

Standard JS downloads serially and creates bandwidth bottlenecks.

Use a binding methodology to get around this issue.

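Unblock JS
Solution I
This method has been successfully used by the Windows Live Hotmail Team
==================
// Writes one <script> tag per argument into the page while it is
// still being parsed, so the browser sees all the files at once
// instead of discovering them one by one.
function AsyncLoad()
{
    var l = arguments.length;
    for (var i = 0; i < l; i++)
    {
        // The closing tag is split so that it does not end the
        // enclosing inline script block early.
        document.write("<script src='" + arguments[i] + "'></" + "script>");
    }
}

AsyncLoad(
    "file1.js",
    "file2.js",
    "file3.js");
=====================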
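
For comparison, the same goal can be approached without `document.write` by inserting script elements through the DOM. This is a sketch of a commonly used alternative, not the method described on the slide. The function name and the optional `doc` parameter (which exists only so the function can be exercised outside a browser) are hypothetical.

```javascript
// Inserts one <script> element per URL into <head>. Script elements
// created this way are fetched without blocking the HTML parser;
// setting async = false asks the browser to still execute them in
// insertion order, which matters when later files depend on earlier ones.
function loadScriptsInOrder(urls, doc = document) {
  return urls.map((url) => {
    const script = doc.createElement("script");
    script.src = url;
    script.async = false;
    doc.head.appendChild(script);
    return script;
  });
}

// In a browser page:
// loadScriptsInOrder(["file1.js", "file2.js", "file3.js"]);
```

If the files are independent of each other, leaving `async` at its default lets each one run as soon as it arrives.
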

Expiration dates
From: WR-Client VRTA

304 = Bad!

Relative Time  URI  Content Len  Status Code
0.00  http://groups.msn.com/people/  27659  200 -- OK
0.70  http://c.msn.com/c.gif  42  200 -- OK
4.53  http://groups.msn.com/global/css.htm  0  304 -- Not Modified
4.88  http://groups.msn.com/spacer.gif  0  304 -- Not Modified
4.89  http://groups.msn.com/home_icons_chat_48x40.gif  0  304 -- Not Modified
4.90  http://www.match.com/msnprofile/profile.aspx  2481  200 -- OK
5.22  http://groups.msn.com/home_icons_IM_48x40.gif  0  304 -- Not Modified
5.55  http://groups.msn.com/msnmess_themes_65x60.gif  0  304 -- Not Modified
5.57  http://groups.msn.com/home_icons_heart_42x39.gif  0  304 -- Not Modified
5.81  http://www.match.com/lib.msnprofiles.master.style.css  383  200 -- OK
5.81  http://www.match.com/lib.msnprofiles.style1.css  214  200 -- OK
5.89  http://groups.msn.com/home_icons_MD_48x40.gif  0  304 -- Not Modified
5.91  http://groups.msn.com/home_icons_groups_48x40.gif  0  304 -- Not Modified
6.20  http://www.match.com/libraries/lib.template.globaljs.js  0  304 -- Not Modified
6.22  http://groups.msn.com/msn_phone3n_48x40.gif  0  304 -- Not Modified
6.24  http://groups.msn.com/whitepages/msnwhitepages.htm  0  304 -- Not Modified
6.57  http://groups.msn.com/home_icons_msnbutterfly_48x40_c.gif  0  304 -- Not Modified
6.61  http://xml.eshop.msn.com/xmlbuddy/eShopOffer.aspx  1227  200 -- OK
6.86  http://images.match.com/match/matchscene/articles/spotlight1617.jpg  8708  200 -- OK
7.35  http://groups.msn.com/match_com_header_blue_matte.gif  0  304 -- Not Modified
7.35  http://groups.msn.com/spacer.gif  0  304 -- Not Modified
7.36  http://view.atdmt.com/MSN/iview/msnnkhac001300x250xWBCK4000109msn/direct;wi.300;hi.250/01  320  200 -- OK
7.43  http://rad.msn.com/ADSAdClient31.dll  489  200 -- OK
7.69  http://groups.msn.com/1195B_goBtn.gif  0  304 -- Not Modified
7.71  http://groups.msn.com/gfol_180x150_survey_express_jan03_2.gif  0  304 -- Not Modified
8.91  http://att.atdmt.com/b/MSMSNMATCVON/Harmonics_WorstNightmare_2499_300x250.gif  14852  200 -- OK
10.58 (total)  56375 (total)

Max-Age: Supersedes Expiration

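Each 304 in the trace is still a full round trip spent asking the server whether a cached file has changed. Giving static files an explicit lifetime lets the browser reuse them without asking. The sketch below builds the two relevant response headers; the function name and the `public` directive are assumptions, and how the headers are attached depends on the web server in use. When both headers are present, HTTP/1.1 caches use `max-age` in preference to `Expires`.

```javascript
// Builds caching headers for a static file that may be reused for
// maxAgeSeconds without revalidation.
function cachingHeaders(maxAgeSeconds, now = new Date()) {
  const expires = new Date(now.getTime() + maxAgeSeconds * 1000);
  return {
    "Cache-Control": `public, max-age=${maxAgeSeconds}`,
    // Expires is kept for older HTTP/1.0 caches.
    Expires: expires.toUTCString(),
  };
}

// Example: let images be cached for 30 days.
console.log(cachingHeaders(30 * 24 * 60 * 60));
```
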
Use Compression

Types of compression: Static, Dynamic

Levels 0 through 10
26 KB / 5.5 ≈ 5 KB; roughly 20 KB bandwidth savings

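The arithmetic on this slide generalizes to any file size and compression ratio. A minimal sketch, with hypothetical function names; the 26 KB size and 5.5 ratio are the slide's own numbers.

```javascript
// Size after compression, given the original size and the ratio
// (original size divided by compressed size).
function compressedKB(originalKB, ratio) {
  return originalKB / ratio;
}

// Bandwidth saved per download of the file.
function savingsKB(originalKB, ratio) {
  return originalKB - compressedKB(originalKB, ratio);
}

console.log(compressedKB(26, 5.5).toFixed(1), savingsKB(26, 5.5).toFixed(1));
```

Unrounded, the compressed size is about 4.7 KB and the saving about 21.3 KB per download, which the slide rounds to 5 KB and 20 KB.
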
Keep-Alive TCP ports

Questions?

Mukesh Jain
[email protected]
