Optimizing Page Load Time

Karol Jarkovsky — Dec 5, 2011

In this blog post we are going to look at several best practices and recommendations that will help you achieve faster page loads, take some load off your website and ultimately give your website more room to breathe.

First of all, let’s cover the very basics of what communication between a client (browser) and a web server over TCP actually looks like. It will ensure we are all on the same page concerning the low-level details on which the rest of this post builds.

The following scheme displays two timeline bars – client and server:

The communication and data flow goes like this:

1. [Client] Open Connection (TCP SYN) – Based on the requested URL, a browser gets an IP address from a DNS server and enters the hand-shake phase by sending out a TCP SYN packet while expecting a confirmation packet sent back from the web server,

2. [Server] Accept Connection (SYN ACK) – When the server is ready to handle the client request it sends the SYN ACK (acknowledge) packet back – at this point the connection is considered opened.

NOTE: Establishing a new connection is a very time-consuming process and thereby our goal is to re-use already opened connections as much as possible,

3. [Client] HTTP GET – In the next step, the browser sends out the HTTP REQUEST header with the requested URL, cookies and other details to the web server,

4. [Server] ACK – After some time, the web server acknowledges receipt of the request packet and starts generating a response. During that time, the connection is idle (empty space on the timeline),

5. [Server] HTTP RESPONSE #1 – Once the response is ready, the server creates a response packet containing the HTTP RESPONSE header along with a portion of the HTML source.

NOTE: The amount of HTML included within the first packet depends on the maximum TCP packet size allowed,

6. [Client] ACK #1 – The Client receives the initial packet and confirms the delivery by sending out an ACK #1 packet,

7. [Server] HTTP RESPONSE #N – Typically the maximum packet size is smaller than the complete response from the server. Therefore, a series of HTTP RESPONSE packets is sent to the client until the complete response has been delivered,

8. [Client] ACK #N – The client acknowledges each successful packet delivery by replying with an ACK packet every time a response packet from the server is received.

That is what a simplified timeline of the HTTP over TCP packet exchange looks like. As you can see, there is a lot going on during the data exchange between the web server and the client browser.
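To make the exchange above more tangible, here is a minimal sketch that builds the same simplified event list for a response of a given size. The function names are hypothetical, and the 1460-byte payload is an assumption (a common value on Ethernet networks); the real figure depends on the network path. It also follows the simplified one-ACK-per-packet model used in this post.

```javascript
// Assumption: 1460 bytes of payload per TCP packet; the real value varies.
const ASSUMED_PAYLOAD_BYTES = 1460;

// Number of HTTP RESPONSE packets needed to carry the whole response.
function countResponsePackets(responseBytes, payloadBytes = ASSUMED_PAYLOAD_BYTES) {
  if (responseBytes <= 0) return 0;
  return Math.ceil(responseBytes / payloadBytes);
}

// Builds the simplified list of events described in steps 1 to 8.
function buildTimeline(responseBytes, payloadBytes = ASSUMED_PAYLOAD_BYTES) {
  const events = [
    "[Client] TCP SYN",
    "[Server] SYN ACK",
    "[Client] HTTP GET",
    "[Server] ACK",
  ];
  const packets = countResponsePackets(responseBytes, payloadBytes);
  for (let i = 1; i <= packets; i++) {
    events.push("[Server] HTTP RESPONSE #" + i);
    events.push("[Client] ACK #" + i);
  }
  return events;
}
```

For example, buildTimeline(3000) yields the four set-up events followed by three RESPONSE/ACK pairs, which shows how quickly the number of round trips grows with the size of the page.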

As I mentioned, the connection becomes idle between the time the web server receives the HTTP request and the time the initial HTTP response is sent to the client. Using multiple simultaneous connections, you can minimize the idle time and thereby minimize the total page load time. You can read about the best practices and recommendations on how to optimize the connection usage and eliminate idle time in the following section.

Keep the Connection Busy

The response coming from the web server is processed by the client sequentially. That means as soon as the first response packet is received from the server, the client starts parsing the response and attempts to download any additional resource referenced in the HTML code. That is why we call it on-the-fly response processing. For any external resource (image, JavaScript, flash file, etc.) found within the HTML being processed, the client opens a new connection and starts a parallel download (if possible).

Let’s consider the first example. Below you can see the load timeline for the page with an image in the middle of the HTML source. After the connection is opened and the initial response is received by the client (dark blue bar), the rest of the response is received (green bar). Partway through processing the rest of the response, the image reference is found and the client fires up a parallel image download.

Now let’s take a look at an example where the same page references the image at the beginning of the HTML source. As you can see below, the image reference placed at the beginning of the response code is retrieved with the initial response packet. The client therefore opens a new connection and starts downloading the image sooner than in the previous example which means the whole page finishes loading sooner.

TIP: As you can see from the examples we have presented, you should try to include at least one, ideally multiple object references (not just images) at the top of the HTML to make sure the downloading of external resources starts as soon as possible.
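As a rough way to check where a reference lands, the sketch below estimates which response packet carries a given piece of markup. The function name is hypothetical, and the calculation makes loud assumptions: 1460 bytes of payload per packet, one byte per character, and no HTTP response header taking up space in the first packet.

```javascript
// Returns the 1-based number of the response packet that contains the
// first occurrence of `reference`, or -1 if it is not in the HTML at all.
// Assumptions: fixed payload size, single-byte characters, headers ignored.
function packetOfReference(html, reference, payloadBytes = 1460) {
  const offset = html.indexOf(reference);
  if (offset === -1) return -1;
  return Math.floor(offset / payloadBytes) + 1;
}

// The same image reference at the top and at the bottom of a long page.
const imageTag = '<img src="someimage.jpg" width="50" height="50"/>';
const filler = "x".repeat(5000); // stands in for 5000 bytes of other HTML
const imageAtTop = packetOfReference(imageTag + filler, imageTag);    // 1
const imageAtBottom = packetOfReference(filler + imageTag, imageTag); // 4
```

With the image at the top, the browser can start the parallel download after the first packet; at the bottom, it has to wait for the fourth.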

Reorder External Resources

Another important fact that relates to the position of an external object reference in the response HTML is that any resources specified within the HEAD element are downloaded before anything else. It not only means that nothing else is downloaded at that time, it also means that nothing is rendered in the browser until all HEAD resources have finished downloading.

That may result in a period of inactivity, which users perceive as the page hanging with an empty white screen displayed in their browser.

TIP: The best practice is therefore to move as many resources as possible from the HEAD to the BODY, for example JavaScript files and CSS style sheets that are not applied to the page by default (such as printer or other media-dependent style sheets) and whose move to the BODY won't cause re-flow issues.

We have already mentioned that external resources are downloaded in the order in which they appear within the HTML source. If you think about that for a minute, you can perhaps see where this is going. If you place a reference to a really big banner image/flash file at the top of your page, the user experience will suffer because the page has to wait for this resource to download before it renders it, which makes page loading look slow.

TIP: If you want to optimize user experience, you can take advantage of a technique that allows out-of-order object loading. You can either delay object loading using the so-called ‘late loading’ technique or use ‘early loading’ in the case of an important object that should be downloaded before the parser actually gets to the position in the HTML where the object is placed.

Please refer to the sample code for both scenarios below:

Late loading

<img id="someImage" width="50" height="50"/>

… some more HTML …

<script type="text/javascript">

document.getElementById("someImage").src = "someimage.jpg";

</script>

Please note that we are specifying the ‘width’ and ‘height’ attributes for the IMG element, forcing the browser to reserve appropriate space on the page before the image even loads. That way, we prevent moving elements on the website during the rendering.
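A variation on late loading, along the lines suggested in the comments below, is to postpone the assignment until the page itself has finished loading. The sketch below is only an illustration under that assumption: lateLoadImage is a hypothetical helper name, and the window object is passed in as a parameter so the function can also be exercised outside a browser.

```javascript
// Hypothetical helper: assigns the image source only after the window's
// "load" event, so the image does not compete with more important content.
function lateLoadImage(win, imageId, src) {
  win.addEventListener("load", function () {
    const img = win.document.getElementById(imageId);
    if (img) {
      img.src = src; // the download starts here, after the page has loaded
    }
  });
}

// In a browser you would call it like this:
// lateLoadImage(window, "someImage", "someimage.jpg");
```

The IMG element still needs its ‘width’ and ‘height’ attributes so that the space is reserved before the image arrives.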

Early loading

<script type="text/javascript">

var someImage = new Image();
someImage.src = "someimage.jpg";

</script>

… some more HTML …

<img src="someimage.jpg" width="50" height="50"/>
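If several important images should start downloading early, the same idea can be wrapped in a small helper. This is a sketch only: preloadImages is a hypothetical name, and the optional second parameter exists purely so the function can be tried outside a browser, where the built-in Image constructor is not available.

```javascript
// Hypothetical helper: starts downloading each URL by creating an image
// object and assigning its src, as in the early loading sample.
function preloadImages(urls, ImageCtor) {
  // Fall back to the browser's Image constructor when none is supplied.
  const Ctor = ImageCtor || Image;
  return urls.map(function (url) {
    const img = new Ctor();
    img.src = url; // in a browser, this assignment triggers the request
    return img;
  });
}

// In a browser you would call it like this:
// preloadImages(["someimage.jpg", "anotherimage.jpg"]);
```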

Enable Simultaneous Download

The HTTP/1.1 specification (RFC 2616) recommends that a client open no more than 2 connections to the same domain (domain, not IP) at the same time, and browsers that follow this recommendation download at most 2 objects from the same domain in parallel. Some newer browsers allow more, but it is safer not to rely on that. In scenarios where you have 10 or 15 external objects from the same domain referenced on your page (which is too many, by the way, and you should try hard to keep the number to a minimum), this limit may prolong load time significantly. Let’s take a look at an example.

In the above example, you can see that at the same time the first image starts downloading (using an already opened connection used to download the page HTML), another image starts downloading (opening a new connection). No other image starts downloading unless the previous download is finished. Now let’s take a look at the load timeline for the same page if images were referenced through multiple domains.

In the example above, we have a page with 5 images placed on it (1 image from d1.domain.com, 2 images from d2.domain.com and 2 images from d3.domain.com). As long as there are no more than 2 images referenced from the same domain and the browser can open up to 2 parallel connections per domain, all 5 images are downloaded at the same time and the page loads much faster than in the previous example.
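The difference between the two timelines can be approximated with a tiny scheduling model. It is a sketch under simplifying assumptions, not a measurement: every image is assumed to take 1 second, and connection set-up, DNS look-ups and shared bandwidth are ignored. The function name is hypothetical.

```javascript
// Each resource goes to whichever connection of its host becomes free
// first; the result is the time at which the last download finishes.
function totalDownloadTime(resources, maxConnectionsPerHost = 2) {
  const freeAtByHost = new Map(); // host -> time each connection is free
  let finish = 0;
  for (const { host, seconds } of resources) {
    if (!freeAtByHost.has(host)) {
      freeAtByHost.set(host, new Array(maxConnectionsPerHost).fill(0));
    }
    const slots = freeAtByHost.get(host);
    let earliest = 0;
    for (let i = 1; i < slots.length; i++) {
      if (slots[i] < slots[earliest]) earliest = i;
    }
    slots[earliest] += seconds; // the download occupies this connection
    finish = Math.max(finish, slots[earliest]);
  }
  return finish;
}

// Five 1-second images from a single domain: 3 seconds in this model.
const singleDomain = [1, 2, 3, 4, 5].map(() => ({ host: "d1.domain.com", seconds: 1 }));

// The same five images split 1/2/2 across three domains: 1 second.
const threeDomains = [
  { host: "d1.domain.com", seconds: 1 },
  { host: "d2.domain.com", seconds: 1 },
  { host: "d2.domain.com", seconds: 1 },
  { host: "d3.domain.com", seconds: 1 },
  { host: "d3.domain.com", seconds: 1 },
];
```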

TIP: You should use different domains (even though all of them point to the same IP) for external resource links. It will allow you to download more objects from the same server in parallel. You can set up two or more domain aliases in Kentico CMS/EMS for the website to be able to link resources using different domains in the URL. Another rule of thumb is to provide a separate website for content shared by multiple websites that are likely to be visited by the same client. Resources requested from the same URL can be cached (the cache is case sensitive, so someImage.PNG is not someimage.png), and therefore subsequent requests for the same resource can be served locally instead.
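Because caching works per URL, each resource should always be linked through the same domain alias and with the same letter case. One possible way to achieve that is sketched below. The assetUrl name and the simple string hash are illustrative assumptions, the host names are the ones from the example above, and lowercasing the path assumes the server treats paths as case insensitive.

```javascript
// Assumed domain aliases, all pointing to the same website.
const ASSET_HOSTS = ["d1.domain.com", "d2.domain.com", "d3.domain.com"];

// Maps a resource path to a stable URL: the same path (in any letter
// case) always gets the same host, so the browser cache can be reused.
function assetUrl(path, hosts = ASSET_HOSTS) {
  const normalized = path.toLowerCase(); // assumption: server ignores case
  let hash = 0;
  for (let i = 0; i < normalized.length; i++) {
    // Simple deterministic string hash kept within 32 unsigned bits.
    hash = (hash * 31 + normalized.charCodeAt(i)) >>> 0;
  }
  return "http://" + hosts[hash % hosts.length] + normalized;
}
```

Calling assetUrl("/images/someImage.PNG") and assetUrl("/images/someimage.png") produces the same URL, so the image is downloaded and cached only once.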

Place JavaScript Wisely

There are 3 important facts about processing of any JavaScript referenced on your page:

1. Script resources are downloaded one after another,

• Once the JavaScript starts downloading no other JS object is downloaded at the same time even if referenced through a different domain,

2. If the downloading of a script starts, no other objects are retrieved in parallel,

• Images are queued for download while the JS object is downloaded,

3. The browser stops rendering the page while a script is downloaded,

• It may make the page look like it is freezing or like the server has stopped responding, which is not a welcome user experience at all.

In the following timeline, you can see when the first JS file referenced in the code starts downloading. The second JS is not being downloaded until the first JS is done downloading. Any images placed after the second JS reference are added into the queue and their download waits until the JS files are done. Then parallelism as explained takes place and images are downloaded at the same time (because they are still referenced through multiple domains as explained earlier).

The timeline below on the other hand displays how the page loads in an ideal scenario where the JS files are placed at the bottom of the page (after the images).

TIP: Place script references at the end of your HTML. If you cannot move a script to the end because there are elements on the page that require it, place at least one object before the script to allow a higher degree of parallelism.
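The effect of script placement can be illustrated with a small model of the three rules listed above. It is a deliberately simplified sketch: the function name and the durations are made up, connection limits are ignored, and downloads that have already started are assumed to continue while a script is being retrieved.

```javascript
// Items are processed in source order. A script blocks the parser until
// it is downloaded; an image only needs the parser to reach it and then
// downloads in parallel with whatever follows.
function simulateLoad(items) {
  let parserTime = 0; // when the parser reaches the next reference
  let finish = 0;     // when the last download completes
  for (const item of items) {
    if (item.type === "script") {
      parserTime += item.seconds; // nothing else starts in the meantime
      finish = Math.max(finish, parserTime);
    } else {
      finish = Math.max(finish, parserTime + item.seconds);
    }
  }
  return finish;
}

const scripts = [
  { type: "script", seconds: 2 },
  { type: "script", seconds: 2 },
];
const images = [
  { type: "image", seconds: 3 },
  { type: "image", seconds: 3 },
];

const scriptsFirst = simulateLoad(scripts.concat(images)); // 7
const scriptsLast = simulateLoad(images.concat(scripts));  // 4
```

In this model, moving the two scripts after the images cuts the total from 7 to 4 seconds, because the image downloads overlap with the script downloads instead of waiting for them.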

If you follow the recommendations from this post when developing your next website, I guarantee that the user experience and the perception of how fast your website feels will change dramatically. And the user experience is what it is all about these days, right?

Please use comments below if you have ideas or thoughts you want to share with others. Thank you for reading and see you next time.

K.J.

Karol Jarkovsky

Director of Product

Comments

Karol Jarkovsky commented on Dec 6, 2011

@Darren

I'm glad it helped! That's exactly the type of customers that inspired me to compile this article.

Just by following these simple rules I've been able to speed up the load time for one of our clients from a horrible 20 sec. to 4 sec. That's why I feel it is so important.

I'm also curious about the complications you had with your client (and things you mentioned).

Would you mind sharing what was going on exactly so we can eventually compile an article around it? You can contact me on my e-mail - karolj[-]kentico.com if you wish.

Thank you,

Karl

Karol Jarkovsky commented on Dec 6, 2011

@Mufasa

Thank you for additional comments! I like what you're saying and think that based on what audience we're talking to the article could include plenty of other information and resources or it could discuss the issue on another level of detail.

I still have to comment though:

1) It will definitely load faster. It's the same thing like if there would be two of us swimming for 200m and the total race time is the time of the last swimmer touching the other side of the pool. We’re going to swim 400m (total as 2x200m) no matter how fast we’re going to do that or who jumps to the water first. On the other hand, if I start swimming with 2 sec. delay after you jump into water, the final race time will be your time + my 2 seconds (if we both swim at the same speed).

And so right, the same amount of data will be downloaded, but if the external resource that would otherwise take too long to download, starts downloading earlier the page will eventually get rendered sooner.

Don’t take my word for it; you can test it yourself. Create a new blank page in Kentico, don’t inherit any content, place an editable region on the template, copy some long text into the editable region (long enough that the response will be split into multiple packets) and insert an image (again big enough, let’s say 400 KB) at the end of the text. Load the page, record the page load time and then do the same with the image at the top.

I did so, and just by moving a single image from the bottom to the top I can see the page load time shrink from 306ms to 292ms, which is roughly a 5% gain in total load time. And that’s just a simple page with a single image and some text.

Anyway, I hope we’ll have a chance to talk about these things in detail at some of our Kentico events. Please keep an eye on our blogs, as your comments are greatly appreciated and ‘ad rem’.

Thank you, Karl

Mufasa commented on Dec 6, 2011

@Karol:

1) RE: HTTP protocol. Your call; was just a suggestion for clarity.

2) The total page load time will still be the same; the same resources have to be loaded regardless of the order. However, the _perceived_ load time might change if you post-load (after DOM ready) the low performance resources (the banner image in this example), since the part the user cares about on the page will have loaded and they may start interacting with the page even if the banner isn't finished loading.

3) Yes, the HTTP spec does require all HEAD references to be loaded before processing the DOM. Of course, the HTML 4.01 spec requires all CSS <LINK> tags to be in the HEAD — though most browsers allow it anywhere. In fact, most browsers block from rendering anything until _all_ CSS files are downloaded across the _entire_ page, regardless of their position. The browsers that don't do that still have to perform extra reflows if CSS is not in the HEAD. That is the main reason why they should be put together in the HEAD. See http://code.google.com/speed/page-speed/docs/rendering.html#PutCSSInHead

5) In the specific scenarios you mentioned in your comment, changing the image load order might help the perceived load time only. It should be advised that this only applies on a case by case basis. A lot of junior developers read things like this and apply the guideline blindly across all resources, not realizing that doing it that way benefits them nothing if not used judiciously. You may want to advise them on when to use and when not to.

6) I was advising the avoidance of absolute statements like "No other image starts downloading unless the previous download is finished" that could be misleading unless the entire article is read, and I find that a lot of users skip the non-bullet points. I was suggesting rewording it to be more clear up front.

7a) Re-read my original comments. I address the HTTP spec 2 vs. browsers limit vs. how many domains you should use. The point was that you shouldn't rely on more than 2, nor assume that it is only 2 either.

7b) Sub-domains require a minimum of at least 1 DNS lookup for each one and sometimes more if there are CNAMES set to other DNS servers, exactly as main domains require. The usage of sub-domains is not free regarding DNS lookups. The developer must be aware of that when deciding how many different domains to use.

7c) Yes, having 'sticky' domains per resource is a good idea so the user doesn't have to cache the same resource multiple times. That's a guideline that automatically comes with using multiple asset domains; but it doesn't mean it's a _reason_ to have multiple asset domains. That was my point in my first comment. That the guidelines for how to make a decision to apply multiple domains and how many, was what needed a little more attention.

Your article was helpful. As web sites get bigger and more resource intensive, more developers need to be aware of the finer details of page speed. "Performance is a feature." Good job getting the word out.

Darren Gourley commented on Dec 6, 2011

We used the image parallelization technique on one of our client's websites as their pages are very image intensive. It brought the overall page load time down from 8-9 seconds to 3-4 seconds.
The idea came from this webinar - so thanks. The only problems we had with this were the display of the CAPTCHA, some async postbacks and a few other niggly things like images being called from JavaScript.

Darren

Karol Jarkovsky commented on Dec 5, 2011

@Steve

Thank you for your input and I definitely recommend everyone to read your blog post. I find it very useful!

@Mufasa

I really appreciate your valuable input. That's exactly feedback I was looking for.

Let me comment on this a bit more:

Ad 1) First the HTTP protocol flow - I don't think it's confusing at all. I actually believe that to understand how the packets are exchanged during the client-server communication is very important. If nothing else it gives you an idea how much overhead is involved. My point was to demonstrate that there are blind spots when the connection is idle and that we can utilize those spots by leveraging parallelism,

Ad 2) Referencing objects at the top of the HTML - Let me give you an example. Let's say you have a big banner image on your page. Let's say it usually takes 1.5 - 3 sec. to download it, because it's referenced from a slow source (I've actually worked with a client who was linking a banner like this, sad, but true). Now imagine you put that image at the end of the HTML, and because the page HTML source is also rather long, the browser receives 3 packets before it finally gets to the part where the image occurs. Then the download starts. So the question is, wouldn't the TOTAL load time be shorter with the image placed at the top instead of the bottom, when the download starts sooner (in parallel) while the rest of the page HTML is still being loaded?

Ad 3) Moving CSS files from HEAD - As I understand browsers block rendering a web page until ALL external style sheets have been downloaded. My advice is therefore based on the best practice where any CSS styles that won't cause re-flows (e.g. printer style sheet or other media dependent sheet that aren't applied unless explicitly requested) are referenced outside the HEAD element. Anyway, I see your point here that I'm not being clear enough and I'll update the article text accordingly. Thank you!

Ad 4) JavaScript at the end when needed sooner on the page - That's actually exactly what I'm talking about where I advise moving at least a couple of resources that don't depend on JavaScript before the script reference to benefit from the parallelism. Again, I may need to be more specific so I'll try to rephrase that information to make the message more clear. Thank you!

Ad 5) Early/late loading - Whether you specify JavaScript through the inline script or not a browser will most probably wait for the script to download anyway. It has to process it before moving forward with rendering. The real advantage of early/late loading as I outlined it is that with the early loading (as proposed) you can fire image download before parsing reaches some much bigger image or JavaScript code even though you can't move it in the HTML source (for whatever reason). The same thing is true about the late loading where the point is to reserve a space in the HTML for the image (placing the <IMG> tag at required position) and start downloading actual image after the most important content on the page is rendered (because the image is just banner that user really doesn't care about or whatever). The re-flows are avoided by using the WIDTH and HEIGHT attributes for the <IMG> tag as you can see in the sample code.

Ad 6) No other image starts downloading unless the previous download is finished - This example is not really about a Kentico website specifically. As with the rest of the article, it is all about the principles of client-server communication and the way a browser renders the content under certain circumstances. The message there is that once the maximum number of parallel connections is reached, no other resource is downloaded from the same domain until one of the connections is released,

Ad 7) Max allowed connections + asset only domain + extensive DNS look up - When I was gathering data for my research I read about increased number of parallel connections you're talking about. The fact is that the HTTP standard still defines limit of parallel connections as 2. Browsers are therefore just breaking the standard. I'm not saying it's wrong, it's actually great, but what I also observed was that besides IE8 any other browser I used for testing didn't open more than 3 simultaneous connections. Frankly speaking I don't want to rely on fact that website users will use browser that will open more connections unless it's defined in the standard. Also your concerns about excessive number of DNS look-ups are slightly irrelevant from my point of view as I'm advising using different sub-domains not different domains (unless I'm missing your point and you are talking about something else). For me the point of asset-only domain is not the fact whether it's cookie less or not (which really matters in case of extensive projects mostly), but the fact that as long as I reference resources from the same domain and user is likely to visit different websites that are displaying the resources from the same domain caching mechanism will pre-cache resources once they are downloaded no matter what website user is looking at.

The links you provided are great resource for further investigating on the topic. I'm pretty sure it's not my last article on subject and hopefully I'll get to some of the stuff you pointed out later too.

I'd like to thank you once again for your comments as they'll definitely keep users think more about these things which will eventually expand knowledge of Kentico community.

Karl

Mufasa commented on Dec 5, 2011

There are several dubious suggestions here.

The first section about the internal HTTP protocol flow seems to be barely applicable to the rest of the article. It could easily be removed without a loss in understanding from the user.

I don't see the need to suggest "try[ing] to include at least one, ideally multiple object references (not just images) at the top of the HTML" to improve speed. If this is just to get the DNS request to a non-current domain name or to start a parallel image load, it won't affect the overall or perceived page load time any over loading the resources after the <HEAD> tag, assuming proper asset loading order is adhered to in the first (see below).

It is recommended to keep all CSS files in the <HEAD> of the page for two reasons: 1. CSS files can be loaded in parallel as they do not block processing. 2. CSS files loaded after the first render (anywhere in the <BODY> tag) can cause a re-render/reflow, which can be visually jarring to the user and slow down the final render time because of the extra reflows.

True, JavaScript should be at the very end of the <BODY> tag when possible, since JavaScript loads do block loading of other <SCRIPT> tags, depending on the defer/async attributes, and the browser. (Some download in parallel even without defer/async, but wait to process them sequentially when possible.) Some scripts must be in the <HEAD> to work because they were not designed to be at the end of the page; users should be warned that moving <SCRIPT> tags around can cause problems with some scripts, so each should be tested.

Early and late loading of JavaScript resources is nearly useless if it is done with inline <SCRIPT> tags. If anything, it slows it down (very slightly) because the <SCRIPT> tags will pause DOM parsing while they run the <SCRIPT>. If the image.src is set during DOM parsing via an inline <SCRIPT>, no time will have been gained. Late binding is only advisable to speed combined load/processing time if it is done _after_ the DOM is ready, if the image sizes are set explicitly so the document will not be forced to reflow, and if the parallel connections are already saturated when the image is expected to be loaded.

"No other image starts downloading unless the previous download is finished." It isn't real clear in that paragraph that you are trying to say that the 2nd connection 'open' time is taking some time and that your example conveniently times them so the 2nd starts downloading when the 1st finishes. In practice, it won't be this exact. Also, the HTTP specification specifies 2 concurrent connections per domain, and any CSS files in the <HEAD> (of which Kentico almost always adds at least one), will have already opened the 2nd connection, so this example is more confusing than it is practical for a Kentico site.

Further, many browsers have increased this concurrent connection limit to between 4 and 16. So it's less of a problem than it used to be. Still, multiple domains can help to a point. You should have advised your users to measure and balance the cost of the number of DNS lookups (which are very expensive when not cached) for multiple domains versus the number of external resources that can load simultaneously versus the average consumer's available bandwidth and latency. There are a lot of factors that can determine if, and how many, asset-only domains to use, if any. It should not be applied blindly.

You missed the most important part of an asset-only domain: It allows for cookieless domains. Though that gain is usually only worth the cost to implement for very large sites or sites with excessively large cookies.

"Once the JavaScript starts downloading no other JS object is downloaded at the same time even if referenced through a different domain" isn't entirely technically accurate. Most browsers these days still download in parallel when possible, but run <SCRIPTS synchronously and block on most processing (but not all). (As mentioned in paragraph #4 above.)

Further reading:

http://developer.yahoo.com/performance/rules.html
http://programmers.stackexchange.com/a/46760/24464
http://code.google.com/speed/page-speed/docs/rules_intro.html has a few nuggets in the documentation there, with further reading suggestions at the end of the page
http://shop.oreilly.com/product/9780596529307.do
http://stackoverflow.com/q/1142318/2291

Appetere Web Solutions commented on Dec 5, 2011

Excellent article! Will definitely be making use of these tips.

Another way of speeding up image loading is to combine multiple images into one image, then use CSS relative positioning to choose which part of the image to display.

Also, if you're using shared hosting where you can't control when the application pool shuts down or gets recycled, you might find one of my blog posts useful: http://www.appetere.com/Blogs/SteveM/May-2011/Improve-Kenticos-first-page-load-time

Steve
