Optimization tip: Write better code, disable Output filter
The Output filter is a powerful feature that can fix your code to make it more valid, but it also comes with considerable overhead. Here is an overview of its capabilities and which of them you really need. Did you know you can get much better performance without it? Read on ...
Welcome to the new year!
My first post this year continues the Optimization tip series to help you further improve the performance of your web sites. I hope you find it useful.
What is the Output filter?
In general, an output filter is a kind of handler that is bound to the output stream and can change the output HTML to better suit its purpose. In Kentico CMS specifically, the Output filter can fix some human errors and make the output (X)HTML more valid.
How does it work and what can it do?
Basically, our Output filter consists of three parts:
Form action filter - This part is separate and implemented as an HTML adapter. It fixes the action parameter of the form when URL rewriting is in place, so the page posts back to the same virtual URL as the original request. It is usually required (unless you fix the form action with your own code in the ASPX template model).
Resolve filter - This filter resolves virtual paths (e.g. ~/App_Themes/Images/...) into valid URLs in the output code. It basically lets you write virtual paths directly in the HTML code, but at the cost of extra processing.
XHTML filter - This part of the Output filter is an HttpModule that uses the standard ASP.NET output filter mechanism to bind a handler that can fix many human errors in the XHTML validity of the page. It parses the complete page HTML, analyzes its consistency and fixes the parts that need to be fixed. Of course this is not free; it comes with a significant overhead (due to the parsing and analyzing).
The last two parts are actually processed together for better performance.
All settings are available in Site Manager -> Settings, under the Output filter category.
The settings are not checkboxes but textboxes, so you can exclude specific locations from filtering (you may have several sections that you do not want to filter). The settings are therefore negative: anything listed in them is excluded from filtering, and everything else is filtered.
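Here is a minimal sketch of how such an exclusion list could be evaluated. It assumes a semicolon-separated list of URL prefixes matched case-insensitively; the function name is_excluded is made up for illustration and is not part of the Kentico API.

```python
def is_excluded(request_path: str, excluded_urls: str) -> bool:
    """Return True when the path starts with any prefix listed in the setting."""
    # Split the semicolon-separated setting value and ignore empty entries.
    prefixes = [p.strip() for p in excluded_urls.split(";") if p.strip()]
    path = request_path.lower()
    # Assumption: prefixes are compared case-insensitively.
    return any(path.startswith(prefix.lower()) for prefix in prefixes)


print(is_excluded("/News/Article.aspx", ""))                 # False, nothing is excluded
print(is_excluded("/News/Article.aspx", "/"))                # True, every URL starts with "/"
print(is_excluded("/News/Article.aspx", "/News;/Products"))  # True
print(is_excluded("/Home.aspx", "/News;/Products"))          # False
```

An empty setting excludes nothing, so the filter runs everywhere, while "/" matches every URL and switches the filter off for the whole site.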
Some example values:
"" (empty) - All pages have the given filter enabled (no URLs are excluded).
"/" - All pages have the given filter disabled (all URLs start with this prefix).
"/News;/Products" - All URLs starting with /News or /Products have the given filter disabled; every other page has it enabled.
"/SpecialPages/ContactUs.aspx" - Only this special page has the filter disabled; all other pages are filtered.
Excluded output form filter URLs
This setting specifies the pages on which the filter should not attempt to fix the action parameter of the form. Here is an example of what it does:
Because URL rewriting to the template page URL is in place (for details see this webinar), the originally rendered URL in the form action parameter for the page /Home.aspx would look like this (for the portal engine): /CMSPages/PortalTemplate.aspx?aliaspath=/Home
In that case, when a user makes a postback, this not very friendly URL ends up in the address bar of the browser. That is obviously not wanted.
When the form filter is enabled, it fixes this and sets the action parameter to exactly the original URL, /Home.aspx, so the browser still displays the same friendly URL no matter what the user does with the page.
That is all, and it is quite efficient. My recommendation is never to disable this filter unless you have some specific system page where you do not want the URL to be replaced.
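The effect of the form filter can be illustrated with a few lines of code. This is only a sketch of the result, not of the real implementation (which, as mentioned above, is an HTML adapter): it assumes a double-quoted action value, and the name fix_form_action is hypothetical.

```python
import re

# Matches everything up to the action value of an opening <form> tag.
# Assumption for this sketch: the action value is double-quoted.
FORM_ACTION = re.compile(r'(<form\b[^>]*?\saction=")[^"]*(")', re.IGNORECASE)


def fix_form_action(html: str, original_url: str) -> str:
    """Replace the rewritten form action with the URL the visitor requested."""
    # A function is used as the replacement so the URL is inserted literally.
    return FORM_ACTION.sub(lambda m: m.group(1) + original_url + m.group(2), html)


rendered = '<form method="post" action="/CMSPages/PortalTemplate.aspx?aliaspath=/Home" id="form1">'
print(fix_form_action(rendered, "/Home.aspx"))
# <form method="post" action="/Home.aspx" id="form1">
```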
Excluded resolve filter URLs
Similar to the previous one, you can just list the pages where you do not want or need this filter. Needless to say, this filter only applies to locations where URLs need to be resolved, such as the src and href attributes of tags. It leaves regular text untouched.
So when the resolve filter is disabled, you cannot use virtual paths starting with ~/ anywhere in your HTML (except in some text web parts, which do the resolving locally on their own). The good thing about automatic resolving is that you can have the same code for a site in a virtual directory or in the root of the server, so you can easily move it from one location to another and it will still work.
It basically replaces the ~ with the application path of your application, so the path ~/Home.aspx for a web site in the virtual directory KenticoCMS produces the URL /KenticoCMS/Home.aspx in the output.
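The replacement itself is simple, as the following sketch shows. It assumes that only src and href attributes are handled (the real filter may cover more locations), and the name resolve_virtual_paths is made up for this example.

```python
import re

# Only quoted src and href values that start with "~/" are touched.
VIRTUAL_URL = re.compile(r'\b(src|href)=(["\'])~/', re.IGNORECASE)


def resolve_virtual_paths(html: str, application_path: str) -> str:
    """Replace the leading ~ of virtual paths with the application path."""
    # Normalise "/KenticoCMS" and "/" to a prefix ending with exactly one slash.
    prefix = application_path.rstrip("/") + "/"
    return VIRTUAL_URL.sub(lambda m: f"{m.group(1)}={m.group(2)}{prefix}", html)


html = '<a href="~/Home.aspx">Use ~/ in text</a>'
print(resolve_virtual_paths(html, "/KenticoCMS"))  # <a href="/KenticoCMS/Home.aspx">Use ~/ in text</a>
print(resolve_virtual_paths(html, "/"))            # <a href="/Home.aspx">Use ~/ in text</a>
```

The same markup works in a virtual directory and in the root of the server, and the ~/ in the regular text is left alone.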
On the other hand, if you write your code so that it does not need resolving (by writing paths with ASPX tags resolved by your own explicit code, or paths relative to the root of the server), you can save yourself some overhead.
Luckily, this filter is not that greedy (it runs either as a smaller part of the XHTML filter or separately in a different optimized method). That is why it is enabled by default and used in our sample web sites (if you disable it, you may see some wrong links or URLs). My recommendation for most installations is to leave it enabled and disable it only when you are dealing with a heavily loaded web site located in the root of the web server, where you are sure you will never need to move the web site to a virtual directory, or when all your web pages display correctly even with this filter disabled (in other words, when you are not using virtual URLs in any web parts that do not support them natively).
Excluded XHTML filter URLs
This is a global setting for all the XHTML filter settings that follow: if you exclude a page or section with this one, no XHTML filtering at all is applied to those locations. The XHTML filter needs to parse the entire output HTML, which leads to quite an overhead. That is why it is disabled by default in our sample web sites, which are made valid by good coding.
Good coding is the key phrase here. If you can make your developers, designers and editors write valid XHTML code (editors only if they work with the WYSIWYG editor source view or enter tags into textboxes that you do not encode in your transformations), you can live without this filter and get much better performance out of your web site. If not, you get their mistakes fixed automatically, but at a cost. That is the deal ...
My recommendation is to leave it disabled and teach your staff to write correct code. Or, if you cannot make them do that, disable it for the sections that go through a review or are not edited by those who cannot understand the HTML validity rules.
Let's see what the XHTML filter can fix. Its individual filters are all quite powerful and smart in what they can do, and each of them can be disabled separately for specific locations.
Excluded XHTML attributes filter URLs
This specific filter takes every single attribute of your tags and makes it XHTML valid, which means it adds quotes where they are needed and provides a value for attributes that are missing one.
For example, it changes <td class=TDClass nowrap> to <td class="TDClass" nowrap="nowrap">
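A rough sketch of this kind of fix, using regular expressions instead of a full parser, could look like the code below. It is not the Kentico implementation; the name fix_attributes is hypothetical, and the sketch assumes that attribute values do not contain the < or > characters.

```python
import re

# An opening tag: name, optional attribute text, optional self-closing slash.
OPEN_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)(\s[^<>]*?)?(\s*/?)>")
# One attribute: a name, optionally followed by a quoted or unquoted value.
ATTRIBUTE = re.compile(r"""([^\s=/]+)(?:\s*=\s*("[^"]*"|'[^']*'|[^\s"']+))?""")


def fix_attributes(html: str) -> str:
    """Quote unquoted attribute values and give valueless attributes a value."""
    def fix_tag(tag):
        name, attrs, close = tag.group(1), tag.group(2) or "", tag.group(3)
        fixed = []
        for attr in ATTRIBUTE.finditer(attrs):
            key, value = attr.group(1), attr.group(2)
            if value is None:
                value = f'"{key}"'        # nowrap -> nowrap="nowrap"
            elif value[0] not in "\"'":
                value = f'"{value}"'      # class=TDClass -> class="TDClass"
            fixed.append(f" {key}={value}")
        return f"<{name}{''.join(fixed)}{close}>"
    return OPEN_TAG.sub(fix_tag, html)


print(fix_attributes("<td class=TDClass nowrap>"))
# <td class="TDClass" nowrap="nowrap">
```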
Excluded XHTML lower case filter URLs
The lowercase filter makes sure that all your tags and their attribute names are lowercase.
For example, it automatically changes <SPAN Class="MySpan"> to <span class="MySpan">.
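The important detail is that only tag names and attribute names are lowercased, while attribute values and text stay as they are. A minimal sketch of that idea (hypothetical name lowercase_tags, and again assuming no < or > inside attribute values) might be:

```python
import re

# Any opening or closing tag: optional slash, tag name, rest of the tag.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^<>]*)>")
# An attribute name followed by an optional value; the value is consumed
# as part of the match so that words inside it are never lowercased.
ATTRIBUTE = re.compile(r"""([^\s=/]+)(\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"']+))?""")


def lowercase_tags(html: str) -> str:
    """Lowercase tag names and attribute names, leaving values and text alone."""
    def fix_tag(tag):
        slash, name, attrs = tag.groups()
        attrs = ATTRIBUTE.sub(lambda a: a.group(1).lower() + (a.group(2) or ""), attrs)
        return f"<{slash}{name.lower()}{attrs}>"
    return TAG.sub(fix_tag, html)


print(lowercase_tags('<SPAN Class="MySpan">Some TEXT</SPAN>'))
# <span class="MySpan">Some TEXT</span>
```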
Excluded XHTML self close filter URLs
The self close filter, as the name suggests, closes tags that are not closed. Not only that, it also opens unopened tags.
<img src="image.png"> gets fixed as <img src="image.png" />
<p><em>Something</em> gets fixed as <p><em>Something</em></p>
<em>Something</em></p> gets fixed as <em>Something</em><p></p>
In the last example you can see that the filter is not yet smart enough to know where the opening tag belongs, so it puts it in the nearest possible location, which is still a better option than broken and invalid HTML.
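The three examples above can be reproduced with a simple stack of open tags. The sketch below only illustrates the technique and is not the actual filter; the function name self_close and the short VOID_TAGS list are assumptions made for this example.

```python
import re

TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^<>]*?)(/?)>")
# Assumption: a small sample of tags that never have a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


def self_close(html: str) -> str:
    """Self-close void tags, close unclosed tags and open unopened ones."""
    out, open_tags, pos = [], [], 0   # open_tags is a stack of lowercase names
    for tag in TAG.finditer(html):
        out.append(html[pos:tag.start()])   # copy the text before the tag
        pos = tag.end()
        closing, name, attrs, slash = tag.groups()
        key = name.lower()
        if closing:
            if key in open_tags:
                # Close everything that was opened after the matching start tag.
                while open_tags:
                    top = open_tags.pop()
                    if top == key:
                        break
                    out.append(f"</{top}>")
                out.append(tag.group(0))
            else:
                # Unopened tag: put the missing start tag right in front of it.
                out.append(f"<{name}>" + tag.group(0))
        elif slash or key in VOID_TAGS:
            out.append(f"<{name}{attrs.rstrip()} />")
        else:
            open_tags.append(key)
            out.append(tag.group(0))
    out.append(html[pos:])
    # Close whatever is still open at the end of the document.
    while open_tags:
        out.append(f"</{open_tags.pop()}>")
    return "".join(out)


print(self_close('<img src="image.png">'))    # <img src="image.png" />
print(self_close("<p><em>Something</em>"))    # <p><em>Something</em></p>
print(self_close("<em>Something</em></p>"))   # <em>Something</em><p></p>
```

Keeping such a stack means walking through every tag on the page, which gives an idea of why this part depends on the number of tags and the depth of the HTML structure.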
Excluded XHTML tags filter URLs
This filter option replaces obsolete tags with their correct XHTML versions, namely:
<b> gets fixed as <strong>
<i> gets fixed as <em>
Filters and Full page caching
The good thing about full page caching is that it caches the output after filtering, so for fully cached pages the filter settings only affect the performance of the first load of the page (which may still be significant for someone).
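The interplay between caching and filtering can be sketched as follows. The class FullPageCache and the helper functions are made up for illustration (a simple <b> to <strong> replacement stands in for the real filter), but the principle is the one described above: the cache stores the already filtered HTML.

```python
class FullPageCache:
    """Caches the final HTML of each URL after the output filter has run."""

    def __init__(self, render, output_filter):
        self.render = render
        self.output_filter = output_filter
        self.pages = {}
        self.filter_runs = 0   # counts how often the filter actually ran

    def get(self, url):
        if url not in self.pages:
            # Cache miss: render the page and filter it once.
            self.filter_runs += 1
            self.pages[url] = self.output_filter(self.render(url))
        return self.pages[url]


def render(url):
    # Stand-in for the real page rendering.
    return f"<p><b>Page {url}</b></p>"


def tags_filter(html):
    # Stand-in for the output filter.
    return html.replace("<b>", "<strong>").replace("</b>", "</strong>")


cache = FullPageCache(render, tags_filter)
first = cache.get("/Home.aspx")
second = cache.get("/Home.aspx")
print(first)              # <p><strong>Page /Home.aspx</strong></p>
print(cache.filter_runs)  # 1, the second request was served from the cache
```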
Overhead of the filters
Here is a list of the overhead you can expect from the filters (these are not exact numbers but my educated guess based on our observations, and they may differ from project to project):
Form filter - 1%
Resolve filter - 2% (based on the number of URLs to resolve)
XHTML filter as global - 20% (based on the size of the page HTML)
Attributes filter - +1% (based on the number of attributes on the page)
Lowercase filter - +1% (based on the number of tags and attributes on the page)
Self close filter - +5% (based on the number of tags, depth of HTML structure and number of unpaired tags)
Tags filter - +1% (based on the number of tags)
So as you can see, leaving the XHTML filter disabled can save you about 25% of the performance of your web site. It is a good idea to keep it disabled and write better code. The other filters are quite efficient, since they do not do much.
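As a quick back-of-the-envelope check, the estimates from the list can simply be added up. This assumes the percentages are additive, which is a simplification, and the numbers are just the rough guesses quoted above.

```python
# Rough overhead estimates from the list above, in percent.
xhtml_filter = {
    "XHTML filter (global)": 20,
    "Attributes filter": 1,
    "Lowercase filter": 1,
    "Self close filter": 5,
    "Tags filter": 1,
}
other_filters = {"Form filter": 1, "Resolve filter": 2}

xhtml_total = sum(xhtml_filter.values())
other_total = sum(other_filters.values())
print(f"XHTML filter with all parts enabled: about {xhtml_total}%")  # about 28%
print(f"Form and resolve filters together: about {other_total}%")    # about 3%
```

With every XHTML sub-filter enabled, the guesses add up to 28%, which is in the same range as the roughly 25% mentioned above, while the form and resolve filters together stay at around 3%.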
And that is the conclusion. See you in my next post.