
Tuesday, March 10, 2009

Steve Souders: Life's Too Short, Write Fast Code (part 2)

By Steve Souders, Member of Technical Staff

I've been working on a follow-up book to High Performance Web Sites called Even Faster Web Sites. As I finish chapters, I talk about the findings at conferences and tech talks. The first three chapters are Split the Initial Payload, Load Scripts Without Blocking, and Don't Scatter Inline Scripts. You can hear about those best practices in my video from Google I/O.

This talk presents the next three chapters: Couple Asynchronous Scripts, Use Iframes Sparingly, and Flush the Document Early.

The adoption of JavaScript is growing, but the blocking behavior of external scripts is well known. That's why it's important to use one of the techniques for loading scripts without blocking (see the Google I/O talk). But loading scripts asynchronously means that any inlined code that uses symbols from the script must be coupled with it in some way; without this coupling, undefined symbol errors occur when the inlined code executes before the external script arrives.

There are five techniques for coupling asynchronous scripts: hardcoded callback, window onload, timer, script onload, and degrading script tags. All of the techniques work. Degrading script tags is the most elegant, but it isn't well known. Script onload is the most versatile technique and is the one I recommend people use. In the talk, I then go into detail, including many code examples, on how to load scripts asynchronously and use these coupling techniques to speed up your web page.
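To give a flavor of the script onload technique, here is a minimal sketch (menu.js and initMenu() are hypothetical stand-ins, not code from the book): the inlined code runs only after the external script arrives, so no undefined symbol errors occur.

var done = false;
function onScriptLoaded() {
  if (done) return;  // some browsers fire both handlers below
  done = true;
  initMenu();  // safe: menu.js has definitely arrived
}

var script = document.createElement('script');
script.onload = onScriptLoaded;  // most browsers
script.onreadystatechange = function() {  // older IE
  if (script.readyState === 'loaded' || script.readyState === 'complete') {
    onScriptLoaded();
  }
};
script.src = 'menu.js';
script.type = 'text/javascript';
document.getElementsByTagName('head')[0].appendChild(script);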

Iframes have a negative impact on web pages. They are the most expensive DOM element to create. They block the parent's onload event (although there's a workaround to this problem in Safari and Chrome). Also, the main page can block resources in the iframe. It's important to understand these interactions if you use iframes in your page.
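One workaround worth knowing (a sketch, assuming the iframe's content can wait until after onload): leave the iframe's src unset in the HTML and assign it from JavaScript once the parent's onload has fired, so a slow iframe cannot delay that event.

// The element id and URL are hypothetical.
window.onload = function() {
  document.getElementById('widget-frame').src = 'http://example.com/widget.html';
};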

Flushing the document early allows the browser to start rendering the page and downloading resources in the page, even before the entire HTML document has arrived. But getting flushing to work can feel like trying to get the stars to align. You need to understand PHP's output_buffering, HTTP/1.1's chunked encoding, Apache's DeflateBufferSize, the impact of proxies, minimum HTML size requirements in Safari and Chrome, and the need for domain sharding to avoid having the HTML document block other downloads.
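The chapter works through those server-side details; the core idea, though, fits in a few lines of any server that can emit chunked responses. Here is a minimal Node-style sketch (an illustration only, with made-up file names, not the PHP/Apache setup discussed above):

var http = require('http');

http.createServer(function(req, res) {
  res.writeHead(200, {'Content-Type': 'text/html'});
  // Flush the <head> right away (Node chunks the response),
  // so the browser can start downloading main.css immediately.
  res.write('<html><head><link rel="stylesheet" href="/https/googlecode.blogspot.com/main.css"></head>');
  // Pretend the rest of the page takes a while to generate.
  setTimeout(function() {
    res.end('<body>...the rest of the page...</body></html>');
  }, 500);
}).listen(8080);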

If your company wants a better user experience, increased revenues, and reduced operating costs, the key is to create even faster web sites. For more information on these best practices, watch the video below and read the slides.




Monday, February 09, 2009

John Resig: Drop-in JavaScript Performance



Although Mozilla is right across the street, their JavaScript evangelist, John Resig, hails from Boston. When he comes to town, it's a great opportunity to learn about his current explorations in the world of JavaScript. I was fortunate to be able to host him for a Google tech talk last week. The video and slides are now available. In addition to his evangelism role at Mozilla, John is the creator of jQuery and Dromaeo, the author of Pro JavaScript Techniques, and a member of the Firebug Working Group. He's currently working on Secrets of the JavaScript Ninja, due out sometime this year.

In this talk, John starts off highlighting why performance will improve in the next generation of browsers, thanks to advances in JavaScript engines and new features such as process-per-tab and parallel script loading. He digs deeper into JavaScript performance, touching on shaping, tracing, just-in-time compilation, and the major benchmarks (SunSpider, Dromaeo, and the V8 benchmark). John plugs my UA Profiler, with its tests for simultaneous connections, parallel script loading, and link prefetching. He wraps up with a collection of many other advanced features in the areas of communication, DOM, styling, data, and measurement.



Wow, a lot of material to cover in one hour. An added benefit of having this talk given at Google is the questions from the audience. At one point, a member of the Google Chrome team goes into detail about how parsing works in V8. Many thanks to John for sharing his work and insights with all of us.

Tuesday, December 16, 2008

App Engine's System Status Dashboard



We recently announced a System Status Dashboard for Google App Engine. As developers depend on App Engine for important applications, we wanted to provide more visibility into App Engine's availability and performance.

Application development today is pretty different from what it was just a few years ago. Most web apps now make use of hosted third-party services for features like search, maps, video, or translation (e.g., our AJAX APIs). These services mean developers don't have to invest in massive computing resources to build these features themselves, and can instead focus on what is exciting and new about their apps.

Building in dependencies on third-party services or moving to a new hosting infrastructure is not something developers take lightly. This new App Engine dashboard provides some of the same monitoring data that we use internally, so you can make informed decisions about your hosting infrastructure.

Learn more (about this and other recent announcements) on the App Engine blog, and please let us know what you think.

Wednesday, October 01, 2008

Measuring Speed The Slow Way



Let's say you figured out a wicked-cool way to speed up how quickly your website loads. You know with great certainty that the most popular web page will load much, much faster. Then, you remember that the page always loads much, much faster in your browser with the web server running on your development box. You need numbers that represent what your users will actually experience.

Depending on your development process, you may have several measuring opportunities such as after a live release, on a staging server, on a continuous build, or on your own development box.

Only a live release gives numbers that users experience--through different types of connections, different computers, different browsers. But it comes with some challenges:
  • You must instrument your site (one example tool is jiffy-web).
  • You must isolate changes.
    • The most straightforward way to do that is to release only one change at a time. That avoids having multiple changes altering performance numbers--for better or for worse--at the same time.
  • Consider releasing changes to a subset of users.
    • That helps safeguard against real-world events like holidays, big news days, exploding hard drives, and, perhaps the worst possible fate of all: a slashdotting.
Measuring real traffic is important, but performance should also be measured earlier in the development process. Millions of users will not hit your development web server--they won't, right?! You need a way to measure your pages without that.
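Before reaching for a tool, you can hand-roll a crude measurement (a sketch, assuming you can edit the page):

// At the very top of the <head>, before any other content:
var t_start = new Date().getTime();

// Later in the page:
window.onload = function() {
  var elapsed = new Date().getTime() - t_start;
  alert('Page loaded in ' + elapsed + ' ms');  // or report it somewhere useful
};

The catch: this misses everything before the first bytes of HTML arrive (DNS lookup, connection setup, server think time), which is part of why purpose-built tools are worth having.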

Today, Steve Souders released Hammerhead, a Firefox plug-in that is just the ticket for measuring web page load times. It has a sweet feature that repeatedly loads pages both with and without the browser cache, so you can understand different use cases. One thing Hammerhead will not do for you is slow down your web connection. The page load times that you measure on your development machine will likely be faster than your users' wildest dreams.

Measuring page load times over a real DSL or dial-up connection would be ideal, but if you cannot do that, all hope is not lost. You can try the following tools, which simulate slower connection speeds on a single box. (And please share your own favorite tools in the comments.) Each of these tools is super easy to install and offers options to simulate slower connections. However, each has some caveats that you need to keep in mind.

Firefox Throttle hooks into the WinSock API to limit bandwidth and avoids using proxy settings. (If you use it, be sure to disable "burst-mode".) Right now, Firefox Throttle only limits bandwidth. That means it controls how much data arrives in a given time period after the first bits arrive. It does not limit latency, which controls how long packets take to travel to and from the server. See Wikipedia's Relationship between latency and throughput for more details. For certain webpages, latency can make up a large part of the overall load time. The next Firefox Throttle release is expected to include latency delays and other webmaster-friendly features to simulate slower, less-reliable connections. With these enhancements, Firefox Throttle will be an easy recommendation.

Fiddler and Charles act as proxies, and, as a result, they make browsers behave rather differently. For instance, IE and Firefox drastically limit the maximum number of connections (IE8 drops from 60+ to 6, and FF3 from 30 to 8). If you happen to know that all your users go through a proxy anyway, then this will not matter to you. Otherwise, it can mean that web pages load substantially differently.

If you have more time and hardware with which to tinker, you may want to check out tools like dummynet (FreeBSD or Mac OS X), or netem (Linux). They have even more knobs and controls and can be put between the web browser hardware and the serving hardware.

Measurements at each stage of web development can guide performance improvements. Hammerhead combined with a connection simulator like Firefox Throttle can be a great addition to your web development tool chest.

Thursday, May 08, 2008

The Future of Web Performance at Google I/O: JavaScript



This post is one in a series that previews Google I/O, our biggest developer event in San Francisco, May 28-29. Over the next month, we'll be highlighting sessions and speakers to give Google Code Blog readers a better sense of what's in store for you at the event. - Ed.

In April I announced that I'm starting another book. The working title is High Performance Web Sites, Part 2. This book contains the next set of web performance best practices that go beyond my first book and YSlow. Here are the rules I have so far:
  1. Split the initial payload
  2. Load scripts without blocking
  3. Don't scatter scripts
  4. Split dominant content domains
  5. Make static content cookie-free
  6. Reduce cookie weight
  7. Minify CSS
  8. Optimize images
  9. Use iframes sparingly
  10. To www or not to www
I'm most excited about the best practices for improving JavaScript performance (rules 1-3). Web sites that are serious about performance are making progress on the first set of rules, but there's still a lot of room for improving JavaScript performance. Across the ten top U.S. sites, approximately 40% of the time to load the page is spent downloading and executing JavaScript, and only 26% of the JavaScript functionality downloaded is used before the onload event.

In my session at Google I/O I'll present the research behind rules 1-3, talk about how the ten top U.S. web sites perform, demonstrate Cuzillion, and give several takeaways that you can use to make your web site faster.

Friday, March 14, 2008

How we improved performance on Google Code



If you're a frequent visitor to code.google.com for product updates and reference materials for the Google APIs you're working with, you might have noticed that the page loading time (or page rendering time, depending on how you see it) has dropped to varying degrees over the past several weeks.

As you'll see below, we've made several changes to help reduce user-perceived latency. This is not an exhaustive list of all improvements we've made recently, but these are the major ones we've made.

As Steve Souders emphasizes with his "Performance Golden Rule" in High Performance Web Sites: "only 10-20% of the end user response time is spent downloading the HTML document. The other 80-90% is spent downloading all the components in the page" (p. 5).

We agree. That's why we focused our effort on reducing the number and size of downloads (HTTP requests) for the "components" throughout Google Code.
  • Combined and minified JavaScript and CSS files used throughout the site
Downloading JavaScript and CSS files blocks rendering of the rest of the page. Thus, to reduce the number of HTTP requests made on the initial page load, we combined frequently-used JavaScript and CSS files into one file each (see the sketch below). This technique brought 20 HTTP requests down to just 2. We also minified the files by stripping out unnecessary whitespace and shortening function and variable names wherever possible.
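Schematically (with made-up file names), pages that used to include many separate files:

<script type="text/javascript" src="/https/googlecode.blogspot.com/js/nav.js"></script>
<script type="text/javascript" src="/https/googlecode.blogspot.com/js/search.js"></script>
<link rel="stylesheet" href="/https/googlecode.blogspot.com/css/base.css">
<link rel="stylesheet" href="/https/googlecode.blogspot.com/css/gadgets.css">

now include just one of each:

<script type="text/javascript" src="/https/googlecode.blogspot.com/js/combined.js"></script>
<link rel="stylesheet" href="/https/googlecode.blogspot.com/css/combined.css">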
  • Implemented CSS sprites for frequently-used images
There are seven images used prominently throughout Google Code, including the Google Code logo, the googley balls at the bottom of every page, the plus and minus signs, and the subscribe icon inside each blog gadget.

Although browsers usually download several images in parallel, we concatenated these images into one image so only one HTTP request would be made. Of course, concatenating several images into one required us to make several changes in HTML/CSS. For example, instead of having:

<img src="/https/googlecode.blogspot.com/images/plus.gif" />


We had to change it to:

<div style="background-image:url(/https/googlecode.blogspot.com/images/sprites.gif); background-position:-28px -246px; width:9px; height:9px"></div>


where sprites.gif is the concatenated image, and background-position and width/height are carefully calculated to display just the desired portion of the sprite.
  • Implemented lazy loading of Google AJAX APIs loader module (google.load)
We like to eat our own dogfood. Among other APIs, we use our very own AJAX Feed API on product homepages inside the blog gadgets and the AJAX Search API on the search page. These Google AJAX APIs require the Google loader module (google.load) to be loaded first before any of the specific AJAX APIs (i.e. AJAX Feed API, AJAX Search API, Maps API) can be initialized and used. Traditionally, the Google AJAX APIs loader module would be loaded by including the following <script> tag in the <head> section:
    <script type="text/javascript" src="http://www.google.com/jsapi"></script>
This works well in most cases, but when optimizing for the display of static content, this blocks the browser from rendering the rest of the page until it's finished loading that script, thus impacting the user-perceived latency. So instead of loading the Google AJAX APIs loader module upfront, we are now loading it lazily only on the pages where it's required. This is made possible as follows (please note that this is a stripped-down version of what we have on Google Code):

First, in the <head> section, we load the Google AJAX APIs loader module via DOM scripting only on the pages where it's required:

if (needToLoadGoogleAjaxApisLoaderModule) {
  // Load the Google AJAX APIs loader module (google.load)
  var script = document.createElement('script');
  script.src = 'http://www.google.com/jsapi?callback=googleLoadCallback';
  script.type = 'text/javascript';
  document.getElementsByTagName('head')[0].appendChild(script);
}
It's important to add the 'callback' parameter to the src attribute: 'callback=googleLoadCallback'. This callback handler will be called once the Google loader module has finished loading.

Then, in the Google loader callback handler (googleLoadCallback()), we initialize the AJAX Feed API and provide the function name that utilizes the AJAX Feed API (startUsingAjaxFeedAPI):
function googleLoadCallback() {
  // Initialize the AJAX Feed API
  google.load('feeds', '1', {callback: startUsingAjaxFeedAPI});
}

function startUsingAjaxFeedAPI() {
  // Start using the AJAX Feed API
  var feed = new google.feeds.Feed(someFeedUrl);
  ...
}

In effect, we're loading the AJAX Feed API on demand through two consecutive callback handlers: the first loads the Google AJAX APIs loader module (google.load), and the second initializes the AJAX Feed API before it's used. A similar technique can be used for the Maps API and the AJAX Search API.
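For instance, here is what the same pattern might look like for the Maps API (a sketch: the element id and coordinates are made up, and it assumes the AJAX loader's google.maps namespace for Maps API v2 classes):

function googleLoadCallback() {
  // Same two-callback pattern, different API
  google.load('maps', '2', {callback: startUsingMapsAPI});
}

function startUsingMapsAPI() {
  // 'map-canvas' is a hypothetical placeholder <div>
  var map = new google.maps.Map2(document.getElementById('map-canvas'));
  map.setCenter(new google.maps.LatLng(37.422, -122.084), 13);
}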

By now you're probably wondering just how much of an impact these changes had on Google Code. According to our latency measurement stats, the user-perceived latency on Google Code dropped quite a bit, anywhere between 30% and 70% depending on the page. This is a huge return for the relatively small investments we've made along the way, and we hope you'll find these techniques useful for your own web development as well.