DIY: How to handle robots.txt and favicon.ico in a multisite environment

   —   

Very often question with very easy solution. See how to do it, there are no workarounds, it can be done with standard settings and approaches ...

Hi there,

I last few weeks, we have encountered several requests on how you can implement robots.txt or favicon.ico files specific to the web site in the multisite environment. Needless to say, such environment is not only resource friendly, since it doesn't need several instances of the application, but also very easy to maintain in comparison to mupliple setups. So the questions are really in place.

Not to mention our large clients that use our solution as a platform for their services where they implement and host hundreds of web sites of their clients in a single instance. If you didn't know something like that is possible with a standard CMS, believe that Kentico CMS handles that very well.

But let's go straight to the topic ...

Location of the robots.txt file

The file robots.txt is required in the root of the web site, that is why it seems hard to implement if you have just single file system with multiple files. But in reality, the oposite is true. Kentico CMS has several means of aliasing the content which gives you great power of customizing the content per-site.

Prepare the file

The best location for you robots.txt file is the same where you store the content: in the content tree. The reason for that is that the content tree is specific to specific site, and the content can be aliased. See where I am going?

Let's go ahead and try that. Create a new document of type Folder named "System files" under the root of your web site. It can be actually anything (any document type), anywhere, named anyhow, and you don't actually even need to do that as you will see later. But I like things organized and that is exactly why I prefer to go this way.

Now create a new document of type File, and upload your robots.txt file to it. This will result in having the file with URL like /System-files/robots.aspx which will return the text file.



The only other thing we need to do with the file is go to it's properties tabs, and on the URLs tab set the URL path to /robots. This step will make the main URL of the file /robots.aspx which is very close to what we wanted, right? Note that this URL can be set to this value no matter where the document is, that is why the step above is completely up to you.

Working with extensions

You may also notice that the URL extensions field was automatically populated with .txt extension, because you uploaded the text file.



This in the perfect world should mean that the document is also accesible through URL /robots.txt but in reality, when you attempt to display that, you will get something like this:



When you are playing with extensions that aren't standardly processed by ASP.NET and if you get an 404 error like this, it means that the request never reached the application code. Simply said, this is a result of some missing configuration.

What we need to achieve is that the requests go through. That is exactly the same scenario as for setting up different extension for pages or even extensionless setup, except for that in this case we want to keep current extension intact, and just let the requests go through to reach the file. That is why we only use the part that let's the requests go through.

The documentation on how to setup extensionless can be found here:
http://devnet.kentico.com/docs/devguide/index.html?configuration_of_extensionless_urls.htm

I am on Windows 7, so IIS 7 and higher is our case. The documentation says:

"For it to work, you need to modify the system.webServer section of your web.config file by adding the first highlighted line and changing the opening <modules> tag to the second highlighted line in the code example below."

<system.webServer>

   ...

<httpErrors existingResponse="PassThrough" />
<modules runAllManagedModulesForAllRequests="true">

  ...

</system.webServer>


So that is exactly what we need to do. Just update the web.config like that.

Finished

And that's it! Now if you go to /robots.txt URL, the request reaches the application, matches the URL path and extension and returns the file as expected.



It was simple, isn't it?

What with favicon.ico?

The favicon.ico file is exactly the same case but you need to do something in addition to that to get it on the page. What you do in content and configuration is exactly the same (of course with different data, URL path will be /favicon).

What you may need to encounter is that your settings do not allow you to upload .ico files, to fix that, just add the ico extension to the settings of allowed extensions:



But if you open the /favicon.ico URL, you still get the default one. That is because there is a physical file with that path in the root of the web site. If there are both physical file and virtual content under the same URL, the physical file has higher priority. That is a simple performance rule that we obey. So the next step is to get rid of that physical file.



Just delete the file favicon.ico from the root of the project. Now if we look at the favicon.ico (note that you may need to clear the cache of your browser, it may not handle it correctly with the previous icon cached), we see exactly what we want under that URL:

favicon.png

Not everything yet

You may thing that is the end. But if you look at some page code, you quickly find out that the tag for the favicon is missing. That is because by deleting the icon from the root of the project the system lost track of it and thinks there is no favicon to include.

To fix that, we will just add that tags manually. The simplest way is to use the web part Head HTML code, and put the following code in it:

<link rel="shortcut icon" href="/KenticoCMS55Blog/favicon.ico" />
<link rel="icon" href="/KenticoCMS55Blog/favicon.ico" />


Note that you need to use the correct virtual directory instead of my KenticoCMS55Blog or link directly to root if your web site is in the root of the web site.

And we are finished with the favicon. It will behave the same way like before, except that now it can be used for each of the sites separately.

That is all for today, see you next time ...
Share this article on   LinkedIn

Martin Hejtmanek

Hi, I am the CTO of Kentico and I will be constantly providing you the information about current development process and other interesting technical things you might want to know about Kentico.

Comments

Martin Hejtmanek commented on

Not sure, did you check the event log for errors or try other browser?

sbenson-vocus commented on

The txt file isn't uploading. Any reason why I can't upload the file? As I select the file, and choose to upload, the CMS spins as if it's uploading, and then nothing happens.

Antonio.Laccetti_eLogic commented on

Thanks for sharing ;)

Martin Hejtmanek commented on

Hi Jeroen,

It is not very common that people share .ico files on the internet. That is the only reason for it. In fact I think you should just allow it for the time of the favicon.ico upload and then remove it again unless you really want your visitors to share icons and count with it in your web site design.

Jeroen Fürst commented on

Great post Martin! I have a question though: why is .ico not a default allowed upload extension in the media library? Cheers!

Martin Hejtmanek commented on

Hi Ryan,

Yes, you are right. I just know that I once had an issue with some older browser (not sure which one, but it was definitely something archaic from times of IE 5.5) that it needed the favicon.ico from the root as well, otherwise it didn't work. That is why I didn't mention it.

Ryan Wheale commented on

Wow... never thought of doing the robots file like that.

I wanted to put this out there as another option for the favicon file:

1) Delete the favicon.ico from the root (as mentioned above)
2) Upload your favicon to the site's folder ([Site Root]/[Site Name])
3) Use this in the header of your document:
<link rel="shortcut icon" href="{%ApplicationPath%}/{%SiteName%}/favicon.ico" />