How to get report/list of unpublish items ?

khanh dinh asked on June 7, 2019 13:18

In Media libraries, I created some folder for product images (such as folder 1, 2, 3). In each folders, some images are linked to publish and some to unpublish pages. Is there any process/method in Kentico where I can extract the list of the unpublish items (images that linked to unpublish pages) from those Media folders ? such us using Reporting ? or SQL query ?

Recent Answers


Mike Wills answered on June 7, 2019 19:47

Hi Khanh,

This is a significant challenge. It's one reason that I like to use page attachments for images that are page specific. If a page is archived or deleted, the image is archived or deleted with it.

In your situation, it sounds like you're looking for a way to clean up the media library, based on what images are needed to support the published pages. There are a few solutions, but either option requires tedious manual work, or custom code.

Spider option
You could use a site crawler like Screaming Frog to crawl your site for images. This would give you a list of all images used by published pages. You could then compare this list to the list of images in your media library, to identify which ones are used by published pages. Beware, you may have a mix of direct media URLs and permanent URLs. Either way, you'll be able to identify the media files, whether by path, or by guid.

Automated option
You could create a tool that crawls your content tree to identify media files that are in use. This will be feasible if your content is in page type fields. It will be very difficult if you have content in editable text web parts and widgets. The tool would have to know the fields in every type that may have a image url in it. Typically, this will be rich text fields and text fields using the "URL selector" or "Media selection" control. The tool would have to scan all nodes in the content tree, looking for the use of a media file, to determine whether or not an image is used by a published page. This would be tough, but feasible. Unless you needed some automated tasks to happen regularly, I'd opt for a solution similar to the first one.

Regards,

Mike

1 votesVote for this answer Mark as a Correct answer

Juraj Ondrus answered on June 10, 2019 07:44

If you want to use SQL, you can use WHERE condition like this:
([DocumentCanBePublished] = 1 AND ([DocumentPublishFrom] IS NULL OR [DocumentPublishFrom] <= GETDATE()) AND ([DocumentPublishTo] IS NULL OR [DocumentPublishTo] >= GETDATE()))
This is the condition to check whether page is published. So, you basically need to negate it to get the list of pages which are not published. Or, just a quick , dirty way:
Select * FROM View_CMS_Tree_Joined WHERE DocumentID NOT IN (Select DocumentID FROM View_CMS_Tree_Joined WHERE ([DocumentCanBePublished] = 1 AND ([DocumentPublishFrom] IS NULL OR [DocumentPublishFrom] <= GETDATE())

1 votesVote for this answer Mark as a Correct answer

khanh dinh answered on June 13, 2019 10:19

Thank you very much for your great assistance and information. Let me do check my server again with your suggestions.

0 votesVote for this answer Mark as a Correct answer

   Please, sign in to be able to submit a new answer.