Kentico 9 web farm problems

Tom Troughton asked on March 10, 2016 11:34

I've upgraded a Kentico 8.2 site to Kentico 9 and hotfixed it to 9.0.13. I've tested it locally in a single-server environment and have now deployed it to a web farm for the first time. The web farm consists of 2 servers.

I start the site on one server and navigate to the site running on that server (bypassing load balancer). In settings I change web farm to be automatic. I then move to the web farm module and see the server has registered itself and is 'transitioning'. After a while the server becomes 'healthy'.

I then start the site on the second server (WEB2DEV). I visit the site locally on that server (again bypassing load balancer) and go to the web farm module. Here I can see the new server has registered itself and is 'transitioning'. However, the first server is now listed as 'not responding'.

But if I move back to the site on the other server it now lists both servers as 'healthy'. Here is a screenshot from both servers:

[Screenshot: web farm module as shown on each server]

I've restarted both web servers a few times and every time they end up back in this situation.

I'm completely stuck. Both servers are definitely using the same database. I've checked event logs and there is nothing suspicious. I've re-signed macros. Both web.configs are using a shared machine key.

Does anyone have any idea how this can be resolved?

Update:

With some testing I've found that even though the servers are in this state, web farm synchronization is still working both ways. So that's something. But of course I still need to resolve the issue of one server reporting the other as not responding, otherwise clients will panic... Any advice?

Correct Answer

Tom Troughton answered on March 11, 2016 12:11

I have discovered the issue. The clocks on my two web servers were not in sync. So even though both servers were healthy, the internal check presumably uses server time to determine when the last successful ping occurred. Once the clocks were in sync, the web farm reported both servers as healthy.
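To illustrate why clock skew could produce this one-sided "not responding" report, here is a minimal Python sketch. It is not Kentico's actual implementation: it assumes each server stamps its ping with its own local clock and that the observing server compares that stamp against its own clock, using the "over 3 minutes" threshold mentioned elsewhere in this thread. The function names and the 5-minute skew are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: threshold taken from the "over 3 minutes" figure quoted in this thread.
NOT_RESPONDING_AFTER = timedelta(minutes=3)


def appears_healthy(last_ping_stamp, observer_now):
    """True if the observer considers the stamped ping recent enough."""
    return observer_now - last_ping_stamp <= NOT_RESPONDING_AFTER


def observe(true_time, writer_offset, observer_offset):
    """The writer stamps a ping with its own clock; the observer judges it with its own clock."""
    stamp = true_time + writer_offset
    return appears_healthy(stamp, true_time + observer_offset)


true_time = datetime(2016, 3, 11, 12, 0, 0)
first_offset = timedelta(0)            # hypothetical: first server's clock is accurate
web2dev_offset = timedelta(minutes=5)  # hypothetical: WEB2DEV's clock runs 5 minutes fast

# Both servers pinged "just now" in real time, yet the verdicts differ.
first_sees_web2dev = observe(true_time, web2dev_offset, first_offset)
web2dev_sees_first = observe(true_time, first_offset, web2dev_offset)

print("first server sees WEB2DEV as healthy:", first_sees_web2dev)
print("WEB2DEV sees first server as healthy:", web2dev_sees_first)
```

Under these assumptions the server with the fast clock judges the other server's fresh ping as already stale, while the server with the accurate clock sees nothing wrong. That matches the asymmetry described in the question.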


Recent Answers


Bryan Soltis answered on March 10, 2016 14:23

Hi Nat,

Kentico 9's web farm support works by determining whether each server is healthy and adding it to or removing it from the synchronization process accordingly. This is determined by an async process that "pings" each server to make sure the other servers can access it. If a server is not reachable for over 3 minutes, it is moved to a "Not Responding" state.
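The rule described above can be sketched roughly as follows. This is an illustrative Python model only, not Kentico code: the `Status` enum, `classify`, and `servers_in_sync` names are hypothetical, the "transitioning" state is left out, and the only figure taken from the description is the 3-minute threshold.

```python
from datetime import datetime, timedelta
from enum import Enum


class Status(Enum):
    HEALTHY = "Healthy"
    NOT_RESPONDING = "Not Responding"


# From the description above: unreachable for over 3 minutes means "Not Responding".
THRESHOLD = timedelta(minutes=3)


def classify(last_reachable, now):
    """Classify a server by how long ago it was last reachable."""
    if now - last_reachable > THRESHOLD:
        return Status.NOT_RESPONDING
    return Status.HEALTHY


def servers_in_sync(last_seen, now):
    """Return the names of servers that would stay in the synchronization process."""
    return [name for name, seen in last_seen.items()
            if classify(seen, now) is Status.HEALTHY]


now = datetime(2016, 3, 10, 14, 0, 0)
last_seen = {
    "SERVER_A": now - timedelta(seconds=30),  # hypothetical server names
    "SERVER_B": now - timedelta(minutes=4),
}
print(servers_in_sync(last_seen, now))
```

Note that a check like this depends entirely on `now` and `last_reachable` coming from comparable clocks.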

You can find out more about how the Web Farm process works in this webinar.

From what you provided, it looks like there may be an issue with the servers communicating with each other. To verify this, I would look into the internal network to make sure there is no latency or other issue when pinging one server from the other.

If you find there are no issues, I would recommend opening a support ticket, as they may need to look at your system further to understand what may be happening.

  • Bryan

Tom Troughton answered on March 11, 2016 09:06

Hi Bryan, thanks very much for this information. I will watch the video today.

In the meantime, would you mind briefly explaining the mechanism for this ping? On both servers I can load the other server's site directly via its internal URL (bypassing the load balancer), and each server can ping the other using both IP and server name (the latter being the name auto-registered by Kentico in the web farm module).

So I'm curious which specific aspect of our network I need to investigate.

