Entire Kentico system hangs, CPU maxed out at 100%

Marcel Guldemond asked on August 31, 2016 21:12

Both of our Dev and Stage environments exhibit the following behaviour:

  • All pages and requests become unresponsive, for all users
  • This happens periodically, and we haven't been able to trace it back to a common cause.
  • The server's CPU is maxed out at 100% until we kill the process and restart IIS.
  • Originally Dev and Stage were running as separate instances on the same server, so we moved Stage to a new server. Both instances still exhibit the same behaviour.
  • Web server logs show a mix of requests to admin, edit, MVC, media, and content pages around each of the crashes, so these look like normal usage scenarios. As I said earlier, we haven't been able to find a repeatable pattern.
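
One way to look for a pattern in the web server logs is to tally requests by top-level path segment in the minutes before each hang. The following is only a minimal sketch, assuming the logs are in the W3C extended format with a `#Fields:` line that includes `date`, `time`, and `cs-uri-stem`. The function name `requests_before` and the five-minute window are illustrative choices, not anything from Kentico or IIS.

```python
from collections import Counter
from datetime import datetime, timedelta


def requests_before(log_lines, crash_time, window_minutes=5):
    """Count requests per top-level path segment in the window before crash_time.

    Assumes W3C extended log lines: a '#Fields:' directive names the
    space-separated columns, which must include date, time and cs-uri-stem.
    """
    start = crash_time - timedelta(minutes=window_minutes)
    fields = []
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # Column names can change mid-file, so re-read them each time.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        try:
            stamp = datetime.strptime(
                row["date"] + " " + row["time"], "%Y-%m-%d %H:%M:%S"
            )
        except (KeyError, ValueError):
            continue  # skip rows without a parseable timestamp
        if start <= stamp <= crash_time:
            stem = row.get("cs-uri-stem", "/")
            segment = stem.strip("/").split("/")[0].lower() or "(root)"
            counts[segment] += 1
    return counts
```

Comparing the counts from several hangs may show whether one area (for example the MVC app or the admin UI) is always present. Note that IIS usually writes these timestamps in UTC, so the crash time passed in should be in UTC as well.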

This is a Version 9 Webapp site, with an MVC site set up in a subfolder as a separate IIS app. We used Kentico's web farm to join the Webapp and the MVC site so that the MVC site could connect to the smart search results.

Our question is: what's the best way to debug this and track down what's causing it?

Thanks, Marcel

Recent Answers


Brenden Kehren answered on August 31, 2016 21:30 (last edited on September 10, 2016 05:17)

Marcel,

This can be very hard to troubleshoot.

  1. I'd start with the Kentico event log. Yes, I know you can't get into the admin interface, but you can query the database table (CMS_EventLog) directly and see what events are being logged.
  2. I'd check the server's event logs. In most cases those give more detail than the Kentico event log does.
  3. Check the IIS logs.
  4. Remove a server from the web farm.
  5. Smart search could be causing problems. Check the /App_Data/CMSModules/SmartSearch subdirectories for any smart search indexes with .lock files. Delete all the directories in there and rebuild your indexes (if you can get back into Kentico).
  6. Double-check that you can make a connection to the database.
  7. Run the site locally, connected to the remote database, and debug.
  8. Apply the latest hotfix or upgrade.
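
For step 5, a short script can list any leftover .lock files before you delete anything. This is just a sketch: the site root path is a made-up placeholder, and `find_lock_files` is a helper name invented for this example.

```python
from pathlib import Path


def find_lock_files(search_root):
    """Return all *.lock files under search_root, sorted by path."""
    root = Path(search_root)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*.lock") if p.is_file())


if __name__ == "__main__":
    # Hypothetical site root; adjust to where the Kentico web app is deployed.
    site_root = Path(r"C:\inetpub\wwwroot\Kentico\CMS")
    smart_search = site_root / "App_Data" / "CMSModules" / "SmartSearch"
    for lock in find_lock_files(smart_search):
        print(lock)
```

It only reports the files; it is probably safest to stop the application pool before removing the index directories by hand.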

Share what else you tried or what eventually fixed the issue for you.

1 vote

Roman Hutnyk answered on September 1, 2016 08:46

Marcel,

Do you have any heavy processes running in the background, for example a scheduled task, staging task, or integration task?

0 votes

Jeroen Fürst answered on September 1, 2016 10:41

Hi Marcel,

We had similar issues before, and we noticed that the web server was making calls to external services that were blocked by the firewall. Opening up the firewall and allowing the web server to reach these services fixed it for us.
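
A quick way to test for this from the web server itself is to attempt a TCP connection to each external service with a short timeout. This is a rough sketch only. `can_connect` is a helper name made up for this example, and the host in the usage comment is a placeholder for whichever services your site actually calls.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Example usage (placeholder endpoint):
# print(can_connect("service.example.com", 443))
```

A blocked endpoint typically shows up as a timeout rather than an immediate refusal, so a result of False after the full timeout is a hint to check the firewall rules.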

Hope this helps, Jeroen

0 votes

Pavel Jiřík answered on September 2, 2016 15:00

Hi Marcel, the following bug was fixed in hotfix 9.0.10:

Scheduler - Infinite loop when planning the next run time of scheduled tasks in rare cases

Planning of the next run time for scheduled tasks resulted in an infinite loop in rare cases. This could cause very high CPU usage on the server.

I would try to apply the hotfix first to see if that resolves the issue.

1 vote
