Kentico 8 Technology - DataQuery API

Martin Hejtmanek — Apr 25, 2014

Last time I gave you an introduction to the concept of DataQuery. Now it’s time to show you its capabilities in more detail using specific examples. Read this article to learn more.

Hi There,

As I discussed last time, DataQuery is a pretty powerful concept for querying the data from the Kentico database. Let me now get to some specific features so that you can see some specific examples.

I am going to cover the most important basics in this article, and get to more advanced scenarios later in a seperate article. Note that there is a lot that DataQuery can offer, so don’t take this as a final list, and see the Intellisense of a particular query instance for even more options.

I will focus this article primarily on the fluent version of the API, as the LINQ statements are not that Kentico-specific.

General select query

Before I get to any particular methods, I want you to get an idea of what DataQuery does internally, so that you know how particular modifications map to the resulting query.

Every data query base has a general query behind, with the full name “<class name>.generalselect”. If you are familiar to the Kentico data engine, you will find it very similar to the “<class name>.selectall” query, yet different. This query is generated on-the-fly for each individual class, and similar to other queries; you can override it by creating it manually in the UI if you need to. This is how this query looks like no matter from which class it comes from:

SELECT ##DISTINCT## ##TOPN## ##COLUMNS## FROM ##SOURCE## WHERE ##WHERE## GROUP BY ##GROUPBY## HAVING ##HAVING## ORDER BY ##ORDERBY##

You may already notice that there are more options than before, especially:

Distinct – Kentico data engine now supports dictinct for all data, I will get to that later
Source – Unlike query “selectall”, this query has a variable source (which defaults to class table name), that can be overridden. I am not going to cover sources today, just know that it allows support for joins and source redirection.
GroupBy, Having – This is another new feature that we provide in Kentico 8, you can now specify GROUP BY and HAVING statements. I will get back to that later as well.

In case paging is set up, the resulting query is wrapped in a Common table expression to ensure paging on database. I have modified it from the default state to include placeholders as well to better illustrate where the particular parameters fit:

WITH AllData  AS
(
    SELECT ##COLUMNS##, ROW_NUMBER() OVER (ORDER BY ##ORDERBY##) AS CMS_RN FROM ##SOURCE## …
)
SELECT *, (SELECT COUNT(*) FROM AllData) AS CMS_TOT FROM AllData WHERE CMS_RN BETWEEN 6 AND 10 ORDER BY CMS_RN

The key knowledge we can pick up from this is that any setup method on the query modifies the content of particular placeholder values. The placeholders are then populated with values during execution of the query. I will always include en example of setup and resulting query, so that you can get an even better idea of what we are talking about.

Common functionality

As I have shown you in the previous article, the DataQuery concept has several layers, and each layer adds certain functionality; I will describe it bottom-up. Note, that each subsequent layer inherits all capabilities from its predecessors.

To conform to the fluent syntax, each setup method returns the identity of the query that can be further modified with next methods.

The query uses SQL parameters wherever possible. Let’s take a look at the following basic query:

var users =
	UserInfoProvider.GetUsers()
		.Where("UserName", QueryOperator.Equals, "administrator");

If you use users.ToString(), it returns the representation of the object, that is executed in the database:

SELECT * FROM CMS_User WHERE UserName = @UserName

However, if you need to get the string representation without parameters, typically for debug, testing, or legacy purposes , you can use users.ToString(true), which materializes all the parameters and produces this kind of output:

SELECT * FROM CMS_User WHERE UserName = N'administrator'

You can use this concept with any of the query objects at any level. In my examples, I will be presenting the materialized variants to make it clear where the particular values go. If you want to see the originals that execute in database, just use the same code with ToString without parameters.

Where conditions

You can define where conditions in two ways: Either as a standalone object of the type WhereCondition, which you can later pass to a query, or directly at the query level. I will show you standalone conditions, just know that the resulting where condition forms the content of the ##WHERE## placeholder in the resulting query.

General Where

A general where looks like this:

var where =
	new WhereCondition()
		.Where("ColumnName", QueryOperator.Equals, "SomeValue");

It basically matches the options that the SQL server offers for binary operators and produces the following code:

ColumnName = N'SomeValue'

There is also a variant with unary operators:

var where =
	new WhereCondition()
		.Where("ColumnName", QueryUnaryOperator.IsNull);

This variant makes the query more readable, it produces the same output as:

var where =
	new WhereCondition()
		.Where("ColumnName", QueryOperator.Equals, null);

Which is the following:

ColumnName IS NULL

To simplify the readability of the code, there are also more specific variants of where.

Specific Where

We provide several options to make it easier to perform specific where operations. Here are some of them just to give you an idea:

WhereContains
WhereLike
WhereNull
WhereNotNull
WhereStartsWith
Etc.

To get the full list, see IWhereCondition reference, and look for all of the methods that have the prefix „Where“.

Each of these methods maps to a general Where method, so they are basically a syntax sugar. I encourage you to use them if you feel comfortable doing so.

Advanced Where

There are a couple of methods that help you add more advanced things to your where conditions, such as nested queries or a list of values. These are the following two methods:

WhereIn
WhereNotIn

Let me give you a specific example:

var categories =
	OptionCategoryInfoProvider.GetOptionCategories()
		.WhereStartsWith("CategoryName", "custom.");
       
var skus =
	SKUInfoProvider.GetSKUs()
		.WhereIn("SKUOptionCategoryID", categories);

Note that I have just passed in another query, which represents a list of objects without specifying the particular ID column. If you pass in the query that already has just a single column, it uses that column. But, if you pass in a richer query, the engine automatically converts it to a single column query using the ID column of that particular object. Here is the result of the call.

SELECT * FROM COM_SKU WHERE SKUOptionCategoryID IN (
    SELECT CategoryID FROM COM_OptionCategory WHERE CategoryName LIKE N'custom.%'
)

It gets all product option SKUs from all categories that have a code name that starts with „custom.“
Note that I was able to do this without using a single character, operator or keyword that would be specific to a particular SQL server type.

Logical operators

If you use several subsequent Where methods, it automatically joins them with the AND operator, so you can easily specify more conditions that must match.

To use the OR operator, you can simply use the Or method in the setup chain, here is an example:

var where =
	new WhereCondition()
		.WhereNull("A")
		.WhereNull("B")
		.Or()
		.WhereNull("C")
		.WhereNull("D");

The result is as follows:

A IS NULL AND B IS NULL OR C IS NULL AND D IS NULL

As with any other programming language, unless you specify otherwise, the operators will have their default priorities if you write everything within a single flat expression. So comparison operators will execute first, then ANDs and ORs last.

The Or Method is a one-time modification to the query, and once you use the next Where method, the operator resets to the default AND.

IMPORTANT: Never, ever use Or without immediately calling a subsequent Where method (some of them)! By calling Or on its own and passing the result further, you make the query generally unpredictable for other developers by using your code.

Brackets with logical operators

To apply brackets you can simply nest the where conditions using Where(IWhereCondition) or its variants. I prefer the one with the lambda expression that constructs the new where condition for me. It simply produces less code:

var where =
	new WhereCondition()
		.WhereNull("A")
		.Where(w => w
			.WhereNull("B")
			.Or()
			.WhereNull("C")
		)
		.WhereNull("D");

This applies the brackets exactly the way I want it to and displays it the same way as you would read it in the C# code:

A IS NULL AND (B IS NULL OR C IS NULL) AND D IS NULL

Note, that there is also a method called And, and both And and Or have oveloads that accept where conditions as well; you can use them to simplify particular scenarios.

By this time you probably already noticed how I format the code. There are a few simple rules to make such code using fluent syntax easily readable:

Use one line for one operation
Write the code in the same order as you would expect it to be in the output
Use same indentation for items at the same level
Indent further when you dive deeper in the hierarchy

If you feel that there are nicer ways how to write the same code, please share with others via comments.

Query settings

The next level in query hierarchy is the DataQuerySettings. While the where condition represents only one query placeholder (##WHERE##), settings represent all placeholders of the query. Settings are not yet capable of getting data from the database that will be in the next level. For a full reference of setup methods, see IDataQuerySettings reference.

By the way, to easily find a reference for a particular Kentico class or interface, search in Google using the following keywords “kentico api reference idataquerysettings” with the name of the particular type.
Specifically for DataQuery types, focus on generic variants of the interfaces.

Order by

Ordering results is a process that offers much narrower options than the where conditions, therefore it only has three basic methods:

OrderBy – General one with enum for direction
OrderByAscending
OrderByDescending

Once again, the usage is very simple:

var roles =
	RoleInfoProvider.GetRoles()
		.OrderBy("RoleName")
		.OrderByDescending("RoleDisplayName");

I have intentionally used two different directions so that you can see how you can use more OrderBy methods at once. The result is as follows:

SELECT * FROM CMS_Role ORDER BY RoleName, RoleDisplayName DESC

Unless the provider method description says otherwise, the resulting query is not ordered. If there is a need for an ordered query (e.g. in case of paging), and you don’t specify the order, the system uses ordering by some default column (typically ID or display name) based on the object type metadata.

Top N records

Specifying the TOPN in the query is just a call to one method. Here’s an example of it:

var roles =
	RoleInfoProvider.GetRoles()
		.TopN(10);

And it results in:

SELECT TOP 10 * FROM CMS_Role

Similar to older Kentico API, you can reset the TOPN by specifying a value of zero or less.

Result columns

There are two sorts of setup methods for columns:

Methods to specify columns that start with Column(s)
Methods to add additional columns that start with AddColumn(s)

Setting up the columns works very similarly to the OrderBy, except that if you don’t specify explicitly that you want to add extra columns, it completely replaces the current column settings. That makes sure that you only get what you wanted.

Here is a basic example of this:

var roles =
	RoleInfoProvider.GetRoles()
		.Columns("RoleID", "RoleName");

Again, I have intentionally used a variant with a list of column names to completely abstract from SQL syntax. It results in obvious query:

SELECT RoleID, RoleName FROM CMS_Role

Now let me show you why we have the Add methods. If you already tried this example and saw intellisense or the class reference, you might have noticed that columns accept two types:

String – Representing a column name of full expression
IQueryColumn – Representing abstraction of the column definition

You can use the second one to easily define more complex columns without the need to use specific SQL syntax, and also leverage queries to populate the columns. So far, we have covered columns that we feel will be most useful to you.

Should you have any further recommendations or needs, please let us know, or make your own custom column implementation.

Here is a specific example of how you can leverage an advanced column in aggregation:

var roles =
	RoleInfoProvider.GetRoles()
		.Columns("RoleName")
		.AddColumn(
			new CountColumn("RoleName").As("NameCount")
		)
		.GroupBy("RoleName");

I am a little bit ahead with the GroupBy, so please ignore that for a moment and focus just on columns. Here is the result:

SELECT RoleName, (COUNT(RoleName)) AS NameCount FROM CMS_Role GROUP BY RoleName

Below are column types that we provide by default:

AggregatedColumn – General aggregation support. Possible options are defined in enumeration AggregationType.
CountColumn – Specific type of aggregation as it is the one that is used most often.
RowNumberColumn – Represents the row number in the results, we use it in paging.
NestedSelectQueryColumn – Column populated from a nested query.

I don’t want to go into too much detail right now, so let me just make a brief note on the NestedSelectQueryColumn. Note, that queries have the method AsColumn(name), that turns the query into the column object populated with that query, so if you want to play with it, use this method instead rather than this class directly. Also, there currently is a restriction that you have to use a fully specified string where condition if you want to reference the source column in the nested query column.

Anyway, columns with nested queries have high overhead in a database. In general, try to avoid them as much as possible.

Now let me show you what’s completely new in data engine of Kentico 8 …

Distinct

You can now easily include the distinct keyword into your query. Here is an example of this and its result:

var roles =
	RoleInfoProvider.GetRoles()
		.Distinct()
		.Columns("RoleName");

SELECT DISTINCT RoleName FROM CMS_Role

There is not much to explain here; just know that this method and similar methods that simply enable a particular option always have an optional parameter that allows to disable it as well. So in this case you can call Distinct(false) to remove the previously set up DISTINCT keyword.

Group by

Now let me finally get to the GROUP BY part of the query. There are only two methods:

GroupBy – Sets group by for specific columns
NewGroupBy – Resets both group by and having part of the query

This method only gets the list of columns to group by. I will show you an example in a couple of moments together with having.

Note, that group by never adds columns, and always specifies the full list.

Having

Having is essentially a where condition in a specific location of a query. To make things as simple as possible, we didn’t replicate all where methods with Having prefixes. Instead, we have just provided the variant that gets the where condition object (or sets it up), and uses the condition.

So here is the example for both group by and having:

var roles =
	RoleInfoProvider.GetRoles()
		.Columns("RoleName", "RoleDisplayName")
		.AddColumn(
			new CountColumn("RoleName").As("RoleCount")
		)
		.GroupBy("RoleName", "RoleDisplayName")
		.Having(w => w
			.WhereStartsWith("RoleDisplayName", "CMS ")
		);

With the following result:

SELECT RoleName, RoleDisplayName, (COUNT(RoleName)) AS RoleCount FROM CMS_Role GROUP BY RoleName, RoleDisplayName HAVING RoleDisplayName LIKE N'CMS %'

This example gets the listing of all roles that start with “CMS” with their number throughout the whole system (all sites and global).

Paging

Paging in data query is simple, and basically maps to the above mentioned Common table expression that restricts the results to particular rows in the overall result data. While paging is not set, the query is represented only by a plain general select query.

The paging is provided through a method called Page:

var roles =
	RoleInfoProvider.GetRoles()
		.Page(2, 5);

In our case, we want the third page (indexed from zero in C#) of size five, which means row numbers betwen eleven and fifteen (indexed from one on SQL server). The resulting query is the following:

WITH AllData AS
(
    SELECT   *, ROW_NUMBER() OVER (ORDER BY RoleDisplayName) AS CMS_RN FROM CMS_Role
)
SELECT  *, (SELECT COUNT(*) FROM AllData) AS CMS_TOT FROM AllData WHERE CMS_RN BETWEEN 11 AND 15 ORDER BY CMS_RN

Note how the system added default ORDER BY for you to make sure the paging is deterministic. If you define the order by yourselves, it uses yours.

To adjust to the next page you would just simply increment the page index, and continue by calling Page(3, 5).

When a query is paged, you can use its property TotalRecords to get the total number of records available. Have you noticed the CMS_TOT column in the above example? That is how the query gets this information.

There are also ways to easily iterate through all data page-by-page, item-by-item, but I will cover that in another article.

Data query

So now you know how to set up particular parts of your query. Now let me cover additions that DataQuery itself brings to the process.

As this article is already quite long, I will only cover the basics and leave the advanced topics for some other article.

The DataQuery basically builds the query using settings, and executes it in database to get the results when the code requests them. The whole process is lazy-loaded.

Untyped results

The DataQuery itself doesn’t store the results in cache, so if you need caching you need to implement it in your code.

What it does is caching the current result directly in the query object. If you enumerate the data, it doesn’t make additional query to the database. It just loads all of the defined data when you start working with the results.

IMPORTANT: Whenever you change the setup of the query, it drops its current cached results so it can load the fresh ones on the next enumeration.

DataQuery itself has untyped results represented just as a DataSet. To use typed results you need to use ObjectQuery.
To work with the result, either assign the query to a DataSet variable (there is an implicit conversion defined), or use its Result property.

For your convenience, the query also has a property Tables so that it mimics the most used parts of the DataSet interface.

Then you can work with the DataSet the regular way. Use this approach for performance-critical operations as strongly typed enumeration that ObjectQuery provides has higher overhead.

Alternatively you can call the ExecuteReader method on the query object to get the results in a form of a DataReader.

ObjectQuery

Strongly typed extension of a DataQuery is called ObjectQuery. All info providers return this stongly typed query. In addition to everything I have described, ObjectQuery adds aspects related to strongly typed code.

Strongly typed results and enumeration

With ObjectQuery, you can either get the results as InfoDataSet through the property TypedResult, or enumerate it directly as shown in the previous DataQuery article.

Specific methods

Object query has a direct link to a particular type as well as its metadata. Using this information, it provides certain methods to further simplify the queries and readability of the code. These are basically extensions of Where populated by the right data.

OnlyGlobal – Restricts the selection only to global objects
OnSite – Restricts the selection only to objects assigned to the given site (either by SiteID column or site binding)
WithCodeName – Restricts the selection based on code name
WithGuid - Restricts the selection based on GUID
WithID - Restricts the selection based on ID

Let me give you one example to demonstrate how this can make your life easier. This query gets all stylesheets assigned to the sample Corporate site:

var css =
	CssStylesheetInfoProvider.GetCssStylesheets()
		.OnSite("CorporateSite");

This is the result of the query:

SELECT * FROM CMS_CssStyleSheet WHERE StylesheetID IN (
    SELECT StylesheetID FROM CMS_CssStylesheetSite WHERE SiteID = 110
)

You would have to do this in order to achieve the same result by knowing the exact achitecture of CSS to Site bindings:

var siteId = SiteInfoProvider.GetSiteID("CorporateSite");
 
var cssSites =
	CssStylesheetSiteInfoProvider.GetCssStylesheetSites()
		.Column("StylesheetID")
		.WhereEquals("SiteID", siteId);
 
var css =
	CssStylesheetInfoProvider.GetCssStylesheets()
		.WhereIn("StylesheetID", cssSites);

Making such tweaks can make your life easier, so whenever you find yourself writing too much repeated code, simply ask yourself if you can write some extension method that would simplify that for you, or just ask us for a method of this type.

This all becomes even more interesting when we get to the documents. In one of the next articles, we will teach you how to leverage the DocumentQuery, which is an even better extension of the ObjectQuery that targets documents and all of their aspects.

Wrap up

The DataQuery concept is very powerful, and it allows us to give you the best of Kentico data engine technology. By this time you should have an idea of how the concept works, and how you can leverage it for your needs.

The key takeaways for you are the following:

DataQuery concept has several layers where each layer adds specific functionality to the previous layer
WhereCondition class provides just basic representation of SQL where condition
DataQuerySettings class provides all query parameters (top N, distinct, columns, source, where condition, order by, group by, having and paging)
DataQuery represents a specific untyped query that provides lazy-loaded results
ObjectQuery is a strongly typed query that provides simplified methods to query objects based on their metadata
All Kentico providers return ObjectQuery and therefore support all above mentioned scenarios including all parameters

Everything that we will be potentially adding in the future will be just another layer of abstraction to simplify work with a specific API (as we did for documents), or that provides more and better setup methods as well as support for even more DB operations.

We are going to completely abstract all of our code from specific SQL syntax, and you should do the same to make sure your code can be supported the best way possible in the future. If the engine doesn’t allow you to completely abstract from SQL syntax, just let us know.

Make sure to give us feedback and share this knowledge through Twitter or other channels.

Looking forward to seeing you at my next article!

Share this article on Twitter Facebook LinkedIn

Martin Hejtmanek

Hi, I am the CTO of Kentico and I will be constantly providing you the information about current development process and other interesting technical things you might want to know about Kentico.

Comments

MartinH commented on May 21, 2015

Hi, you can get the total number of records while using paging through the TotalRecords property of the query object. If you get it after you get the data (result or start enumerating), it won't cause any additional queries to database. If you get it before, it will issue and extra query just to get the number of records without getting the actual data of current page.

piyushkothari3 commented on May 20, 2015

In older versions with GeneralConnection, we could also get total results, when we were getting paged data, how do I achieve this with new API?

Mohamed Rashed commented on Apr 28, 2014

I think you are very close to tiny ORM concept like PetaPOCO, dapper-dot-net and Simple.Data.

Yehuda commented on Apr 25, 2014

This is my favorite feature in v8.
My code now looks so much better and lets even someone that is unfamiliar with Kentico to understand what I'm doing.