Get all urls

Jay Asbury asked on October 4, 2023 04:26

I am working with a Kentico 12 site and am building a tool to get all known urls for all current published pages. I can get all documents with something like

var documents = DocumentHelper.GetDocuments()
.Published(true)       // Retrieve only published documents
.Columns("DocumentID", "DocumentName", "NodeAliasPath")
               .OrderByDescending("DocumentModifiedWhen");  // Order

But I am having trouble knowing how to get the other urls and if documentAlias or alternateurls or other types will provide the whole set of urls for a document. I am doing this in prep of verifying we have all urls working for a rewrite of the site in 13. I need to get this list to the people testing urls. I need the urls that would be navigated to on live site.

Correct Answer

Jay Asbury answered on October 4, 2023 18:07

Thanks. That helped me get to

 WITH urls
AS (SELECT DocumentName,
           NodeAliasPath,   -- default URL
           DocumentUrlPath, -- if populated, this overrides the NodeAliasPath (replace ROUTE: with empty string)
           DocumentCulture, -- if multi-culture, you'll need to filter off of this as well
           AliasURLPath,    -- if populated, this means the NodeAliasPath it is linked to has this as a vanity URL 
           AliasActionMode  -- what happens when this URL is hit, redirect, do nothing, default is redirect I believe  
    FROM View_CMS_Tree_Joined
        LEFT OUTER JOIN CMS_DocumentAlias
            ON NodeID = AliasNodeID
               AND DocumentCulture = AliasCulture
    WHERE
        -- get only published pages 
        (
            [DocumentCanBePublished] = 1
            AND
            (
                [DocumentPublishFrom] IS NULL
                OR [DocumentPublishFrom] <= GETDATE()
            )
            AND
            (
                [DocumentPublishTo] IS NULL
                OR [DocumentPublishTo] >= GETDATE()
            )
        )
        AND NodeSiteID = 1 -- enter your site ID from the CMS_Site table
)
SELECT urls.NodeAliasPath,
       CASE
           WHEN urls.DocumentUrlPath IS NULL
                OR RTRIM(LTRIM(urls.DocumentUrlPath)) = '' THEN
               urls.NodeAliasPath
           ELSE
               urls.DocumentUrlPath
       END NavigationURL,
       1 OriginalDocument
FROM urls
UNION
SELECT urls.NodeAliasPath,
       AliasURLPath,
       0
FROM urls
WHERE urls.AliasURLPath IS NOT NULL
ORDER BY urls.NodeAliasPath,
         OriginalDocument DESC;
0 votesVote for this answer Unmark Correct answer

Recent Answers


Brenden Kehren answered on October 4, 2023 16:18

What hotfix are you on?

Are you using Portal Engine or MVC?

If MVC are you using a custom routing engine of some sort?

0 votesVote for this answer Mark as a Correct answer

Jay Asbury answered on October 4, 2023 16:20

Portal. Going MVC in 13.

0 votesVote for this answer Mark as a Correct answer

Jay Asbury answered on October 4, 2023 16:28

Pretty sure we are on 95 hotfix.

0 votesVote for this answer Mark as a Correct answer

Brenden Kehren answered on October 4, 2023 16:34

Since you're using Portal Engine, a SQL query like below should get you what you need:

select DocumentName
, NodeAliasPath -- default URL
, DocumentUrlPath -- if populated, this overrides the NodeAliasPath (replace ROUTE: with empty string)
, DocumentCulture -- if multi-culture, you'll need to filter off of this as well
, AliasURLPath -- if populated, this means the NodeAliasPath it is linked to has this as a vanity URL 
, AliasActionMode -- what happens when this URL is hit, redirect, do nothing, default is redirect I believe  
from View_CMS_Tree_Joined
left outer join CMS_DocumentAlias on NodeID = AliasNodeID and DocumentCulture = AliasCulture 
where  
-- get only published pages 
([DocumentCanBePublished] = 1 AND ([DocumentPublishFrom] IS NULL OR [DocumentPublishFrom] <= GETDATE()) AND ([DocumentPublishTo] IS NULL OR [DocumentPublishTo] >= GETDATE())) 
and NodeSiteID = 1 -- enter your site ID from the CMS_Site table
0 votesVote for this answer Mark as a Correct answer

Brenden Kehren answered on October 6, 2023 17:24

Jay,

Your query will ONLY get you the pages with alias paths added and NOT ALL POSSIBLE URLs like you requested. The query I provided will give you ALL POSSIBLE URLs.

0 votesVote for this answer Mark as a Correct answer

Jay Asbury answered on October 6, 2023 17:45

URLS is your query just formatted along with the rest of the query.

My query using the cte does the following

SELECT urls.NodeAliasPath, CASE WHEN urls.DocumentUrlPath IS NULL OR RTRIM(LTRIM(urls.DocumentUrlPath)) = '' THEN urls.NodeAliasPath ELSE urls.DocumentUrlPath END NavigationURL, 1 OriginalDocument FROM urls

Provides the original documents along with their replacement if being replaced as you said with DocumentUrlPath -- if populated, this overrides the NodeAliasPath (replace ROUTE: with empty string)

Then I UNION the following aliases query

SELECT urls.NodeAliasPath, AliasURLPath, 0 FROM urls WHERE urls.AliasURLPath IS NOT NULL

and then the order by moves all the nodealiaspaths together

ORDER BY urls.NodeAliasPath, OriginalDocument DESC; Example of one document below

Node Alias Path Navigation URL Original Document
/Propane-Finder /Propane-Finder 1
/Propane-Finder /finder 0
/Propane-Finder /Location 0
/Propane-Finder /Locations 0
/Propane-Finder /nearme 0
/Propane-Finder /tv 0

Which matches the page setting sin cms. The 2nd column is the value I wanted.

0 votesVote for this answer Mark as a Correct answer

Brenden Kehren answered on October 6, 2023 17:48

Thanks for the explanation, I see that now. Nice addition!

0 votesVote for this answer Mark as a Correct answer

   Please, sign in to be able to submit a new answer.