Viewstate error when saving a page in a web farm

Ji Pattison-Smith asked on August 28, 2019 12:12

I have a web farm in Azure with 2 app services running Kentico 12. Azure Traffic Manager is used, and this sits behind Cloudflare, but we aren’t using full output caching - it’s really for DDoS protection.

I’m getting an occasional error when saving a page after adding a new widget. I see the “500 server error” message, and I can see from the event logs the following:

Exception message: Failed to load viewstate. The control tree into which viewstate is being loaded must match the control tree that was used to save viewstate during the previous request. For example, when adding controls dynamically, the controls added during a post-back must match the type and position of the controls added during the initial request.

After investigating, it looks like what’s happening is that the request where the widget is dropped onto the page is to one of the servers, but then it switches to the other server on save and throws this error.

I found another forum post with a similar (but not identical) issue, but there aren’t any solutions there. (https://devnet.kentico.com/questions/web-parts-added-to-page-disappear-randomly-and-object-does-not-exist-error)

Kentico documentation mentions “sticky sessions” which sounded promising, but Azure Traffic Manager does not support these due to it being a DNS-level traffic manager.

As far as I can see, everything is set up correctly in terms of the machine key etc according to Kentico’s documentation (https://docs.kentico.com/k12/configuring-kentico/setting-up-web-farms/configuring-web-farm-servers).
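For reference, this is the general shape of the shared `machineKey` element that both farm servers need to have identically configured in web.config for viewstate MAC validation to succeed across servers (the key values below are placeholders, not real keys — generate your own):

```xml
<!-- web.config on EVERY server in the farm must contain the SAME keys.
     If the keys differ, a postback served by the other server cannot
     validate the viewstate and fails. Placeholder values shown. -->
<system.web>
  <machineKey
      validationKey="...generated-validation-key..."
      decryptionKey="...generated-decryption-key..."
      validation="HMACSHA256"
      decryption="AES" />
</system.web>
```

Worth double-checking that neither app service has `IsolateApps` appended to its keys or an auto-generated key, since either would make the effective keys differ per server.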

We are using Redis Cache for session state.
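In case it helps anyone comparing configurations, this is roughly what our session state setup looks like (a sketch based on the Microsoft.Web.RedisSessionStateProvider package; the connection string is a placeholder and attribute details may vary by package version):

```xml
<!-- web.config: shared Redis-backed session state so that session data
     is available to both servers in the farm. Placeholder connection
     string shown. -->
<system.web>
  <sessionState mode="Custom" customProvider="RedisSessionStateProvider">
    <providers>
      <add name="RedisSessionStateProvider"
           type="Microsoft.Web.Redis.RedisSessionStateProvider"
           connectionString="..." />
    </providers>
  </sessionState>
</system.web>
```

Note this only shares *session* state — viewstate is normally round-tripped through the page's `__VIEWSTATE` hidden field and validated with the machine key, so shared session alone doesn't cover it.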

Are there any other settings I need to check or configuration to do to get this working correctly?

Recent Answers


Brenden Kehren answered on August 28, 2019 15:26

In your load balancer you need to configure it to keep the session on one server until that server becomes unavailable or the user's session expires. This way users won't jump between servers mid-session. I forget exactly what that is called, but when we worked with an on-premise setup, the F5 load balancer just needed a simple checkbox checked to make sure this happened.


Ji Pattison-Smith answered on August 28, 2019 15:43 (last edited on August 28, 2019 15:46)

Thanks for the reply Brenden. I think you're referring to "sticky sessions" or "session affinity" which are not supported with Azure Traffic Manager, since it routes traffic at a DNS level so never sees any actual traffic.

This did seem to be working fine prior to the Cloudflare implementation, so it must be related to that, but I'm confused about the purpose of the shared session state provider if it can't handle users switching between different servers. Surely that's the whole point? What am I missing?


Brenden Kehren answered on August 28, 2019 16:01

Thanks for the reminder, yes, that's what I was talking about. Correct, we've found with that particular project that even though we used SQL Session, we still had issues with users jumping from one server to another in the middle of a request. Sticky session was the solution at the load balancer level.

As for your implementation, I'm not familiar with Azure Traffic Manager. We have several web farm implementations and none of them use it; we simply use Azure's built-in features for scaling out and directing traffic. We have one site with 3 servers in the farm averaging 1.1 million hits per month, and it all works without issue.


Peter Mogilnitski answered on August 28, 2019 17:58 (last edited on August 28, 2019 18:08)

We have Azure Traffic Manager and multiple Kentico instances per region. Traffic Manager redirects to the nearest data center based on the client's location, so if you are on the East Coast and try to access your web site, you will go to the East US Azure data center. We also have autoscaling enabled, so we can get more instances if required. Most likely this is a problem with your web farm configuration; I would not blame Traffic Manager.

