Security - Robots.txt publishing location of CMSPages

Kevin Gock asked on November 12, 2014 08:24

A security audit identified a risk that robots.txt discloses the locations of Kentico pages, such as the standard CMSSiteManager/ and CMSDesk/ directories. I checked the Kentico website and noticed that its robots.txt does NOT disclose these directories, although quite a few public Kentico sites do.

Could you please advise on best practice? My current thinking is to simply remove these two entries from robots.txt (copying what the Kentico website does), on the basis that the underlying pages require an authenticated login, so crawlers cannot index or follow them anyway.

Correct Answer

Brenden Kehren answered on November 16, 2014 20:21

If you're that concerned, then you shouldn't include any of the directories because they could all be accessed. What about the /CMSPages directory that has all the pages that you can post or get information from? IMHO that's a bigger security risk than leaving the CMSDesk and CMSSiteManager items in the file.

If you do #2, you'll get indexing errors, as the crawlers will attempt to crawl those directories. I'm not saying it's a bad option; it's probably what I'd do if the client was generally concerned with that, but know that you'll probably get errors when crawling.
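To illustrate the point about crawlers, here is a minimal sketch using Python's standard `urllib.robotparser`. Both robots.txt bodies and the `/CMSDesk/default.aspx` path are assumptions made up for the example, not copied from any real site. It only shows what a well-behaved crawler is told; whether errors actually appear depends on how the site responds to unauthenticated requests.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Return True if robots_txt permits `agent` to fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Hypothetical robots.txt that lists the admin directories.
WITH_ENTRIES = """User-agent: *
Disallow: /CMSDesk/
Disallow: /CMSSiteManager/
"""

# Hypothetical robots.txt after the two entries are removed (option 2).
WITHOUT_ENTRIES = """User-agent: *
Disallow:
"""

# Hypothetical admin page path, used only for illustration.
ADMIN_PAGE = "/CMSDesk/default.aspx"

print(allowed(WITH_ENTRIES, ADMIN_PAGE))     # False: crawlers are told to stay out
print(allowed(WITHOUT_ENTRIES, ADMIN_PAGE))  # True: crawlers may try to fetch it
```

With the entries removed, a compliant crawler is no longer told to skip the directory, so it may request the login-protected pages and report whatever response it gets.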


Recent Answers

Brenden Kehren answered on November 13, 2014 05:09

What security risk are you concerned about in listing those directories as exclusions in your robots file? Yes, it reveals some of the folder structure, but by default none of those folders can be accessed without authentication unless you've made other changes to allow it.


Kevin Gock answered on November 13, 2014 05:23

The security risk identified by the company that did the audit was: "The presence of the robots.txt does not in itself present any kind of security vulnerability, but it does disclose information about the restricted areas within the [client] corporate website. If [client] relies on robots.txt to protect the location of CMSDesk and CMSSiteManager portals from web crawlers, and does not enforce proper access control over them, then this could present a serious vulnerability."

I suppose the options are:
1. Leave it - access control is enforced, so there is no risk.
2. Remove the two items from robots.txt - access control keeps them safe AND protected folder locations are not disclosed.

They both seem good enough IMHO, but I guess (2) is slightly better practice?
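As a rough sketch of option 2, the snippet below drops the two entries from a robots.txt body while leaving every other rule in place. The helper name `strip_disallow`, the sample file contents, and the `/Search/` entry are all hypothetical; a real file should be reviewed by hand before publishing.

```python
from typing import Iterable


def strip_disallow(robots_txt: str, paths: Iterable[str]) -> str:
    """Drop Disallow lines whose path matches one of `paths`.

    Matching ignores case and leading/trailing slashes.
    """
    wanted = {p.strip("/").lower() for p in paths}
    kept = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        is_target = (
            field.strip().lower() == "disallow"
            and value.strip().strip("/").lower() in wanted
        )
        if not is_target:
            kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical robots.txt; "/Search/" stands in for any rule that should stay.
ORIGINAL = (
    "User-agent: *\n"
    "Disallow: /CMSDesk/\n"
    "Disallow: /CMSSiteManager/\n"
    "Disallow: /Search/\n"
)

CLEANED = strip_disallow(ORIGINAL, ["CMSDesk", "CMSSiteManager"])
print(CLEANED)
```

This only changes what is disclosed; it does nothing for access control, which still has to be enforced on the directories themselves.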


Kevin Gock answered on November 17, 2014 00:17

Thanks Brenden, appreciate your answer. CMSPages isn't in the robots.txt file which is good.

