Page Type field versioning best practices

Virgil Carroll asked on January 25, 2016 21:09

Hey thought I would throw out this to get other opinions.

I have a client that keeps using the Object Versioning in Kentico to keep track of changes to page type fields in their projects. This is causing project SQL DBs that have only a few hundred or thousand pages to be extremely large (one almost 20GB). They say they need this so they can keep good track of who changes what fields as part of their internal QA process, but I keep encouraging them to turn this off and look at other methods.

This is a rare occurrence, but from time to time we have clients that want this type of detailed audit trail. We have done this often in SharePoint and it has a pretty good mechanism for this, but Kentico object versioning does not seem the best solution. Wondering what others have done and what best practices you use?

Recent Answers

Virgil Carroll answered on January 25, 2016 21:09

I should mention they are using this to track ALL changes, including changes to the objects themselves, not just for field tracking.

0 votesVote for this answer Mark as a Correct answer

Roman Hutnyk answered on January 25, 2016 21:14

Virgil, I believe that Continuous Integration could be used here - it will allow you to keep those changes under the source control, which will reduce database size and improve overall performance.

0 votesVote for this answer Mark as a Correct answer

Virgil Carroll answered on January 25, 2016 21:15

Roman, good thought...they are still in 8, but that could definitely be a discussion point

0 votesVote for this answer Mark as a Correct answer

Brenden Kehren answered on January 25, 2016 22:51

Hi Virgil,

So what's the actual problem? Are they short on disk space where the 20GB DB is too large or is there just too much history? Kentico's version history is a great way to keep things in check especially if you can't upgrade to v9.

First start with ensuring all the versioning is setup properly in Kentico and will eliminate older version history as needed.

Secondly, I'd check to see if the version history is the actual problem in the database. Run a script to see which tables have the most data in them. In the last few months I've worked with some large websites who had analytics turned on but never managed the analytics data. One database was 153GB in size. After cleaning up the analytics data to the last year or less, I was able to get the database to under 12GB. Another I just worked on last week is at 18GB but they didn't have all tracking turned on. Once I removed the history in Kentico, it cut the database back to 4GB. My guess is you'll find other tables with more data in them than the version history. Some tables to check are CMS_Email, all the Analytics_XXX tables, CMS_EventLog, Staging_Task. I also have a script which gives row counts for all tables, views and indexes.

There are other options like upgrading to v9 and purchasing RedGate but they come with an expense so I'd say use the tools Kentico provides because they work very well but make sure the tools are configured correctly before going with another solution.

Good luck!

1 votesVote for this answer Mark as a Correct answer

Martin Hejtmanek answered on January 26, 2016 09:11

Hi Virgil,

I would definitely start by examining which particular tables have the largest size as Brenden suggests. Here are some additional considerations:

  • v9 Supports storing binary data of attachment versions in file system, that could greatly reduce the size of your database if the biggest table is attachment history

  • another option to consider is setting System -> Log metadata changes, which logs changes to particular fields in event log. The setting doesn't have the best name though, in fact it is logging data, but just changed ones

  • if none of that helps, you can always add custom handlers (ObjectEvents.Insert.After, ObjectEvents.Update.Before, ObjectEvents.Delete.Before) and log whatever custom information you need to either custom table, custom module class, file system or whatever storage suits you, BaseInfo has methods which can provide you with the changed data information, an example can be seen in this article:

0 votesVote for this answer Mark as a Correct answer

Virgil Carroll answered on January 26, 2016 21:02

Thanks for all the thoughts. I did actually run a script I have to see the table sizes, hence my question on good practices, the two biggest tables in a couple of their projects (after turning on all object versioning) was the Object Versioning table (averaging between 10-15GB) and EventLog (averaging 2-4GB, but with millions of rows). A lot of the projects we work with are highly transactional and changes are happening constantly on a lot of content, so this method of tracking is creating really large tables.

Where this is not a storage issue (its in Azure) it is causing a lot of challenges when taking backups, making copies, etc for different purposes. That is what prompted me to ask what others do to keep better track of content and page changes vs. just using the object versioning.


0 votesVote for this answer Mark as a Correct answer

Joshua Adams answered on January 26, 2016 22:50

You may want to event put a limit on the event log size. Seems like thats a lot of data to keep, unless it is used frequently for troubleshooting. As far as the versioning, how many versions would have to be kept? Im looking at a version 7 site, and there are properties to set the major and minor version length. That could possibly help, only drawback there would be that eventually they would be removed from the system, or minor version promoted to major after the period of time specified in the settings. Just a thought, some clients want all data no matter what, even if they most likely will never use it, but if thats their choice, and they want to pay the upgraded cost for maintaining it, then so be it.

0 votesVote for this answer Mark as a Correct answer

Brenden Kehren answered on January 27, 2016 03:22 (last edited on January 27, 2016 03:23)

I agree with Josh, limit the event log and no need to log all the details of every object change if you are keeping versions of them. Versioning will do that already, no need to log them too. The event log is pretty much useless if it gets too large, takes a long time to write to it and read from it. I set the event log to between 5000 and 10000 depending on the client.

You might also look at setting the minor versions a bit higher threshold if they are editing a lot too.

0 votesVote for this answer Mark as a Correct answer

Virgil Carroll answered on January 27, 2016 18:11

Thanks all for the comments, was really curious about what others are doing and this helps.

0 votesVote for this answer Mark as a Correct answer

   Please, sign in to be able to submit a new answer.