Where did the System.IO namespace go?

If you search the code of Kentico CMS 6 for the System.IO namespace, you won't find it anywhere. Instead, you will find a slightly different namespace: CMS.IO. In this post I will describe why we made this change and what it means for you. I'll also finish my discussion of storing data in Windows Azure, because it's closely related to the CMS.IO namespace.

Storing data in Kentico CMS 5.5 R2

Let's first go back in history. Kentico CMS 5.5 R2 used the System.IO namespace for accessing files, as any other standard ASP.NET application does. It was also the first version that officially supported Windows Azure. Since Windows Azure doesn't provide any persistent NTFS data storage (more info in my last blog post), you had to store all Kentico CMS files in the database. There were three exceptions: the media library, smart search and web analytics modules. These stored their data in Windows Azure blob storage using a special feature called Windows Azure Drive. Windows Azure Drive creates a virtual drive in the blob storage, and you can store data there using the System.IO interface. The disadvantage of this solution is that only one instance can mount the drive for read/write access. Since 5.5 R2 was limited to one instance anyway, this wasn't a big deal.

Storing data in Kentico CMS 6

Then development of version 6 started, and we wanted to extend the existing Windows Azure support to multiple instances. We had several options for handling the file system in Windows Azure, e.g. using a Windows Azure Drive for each instance. But this solution would bring a lot of disadvantages, such as duplication of data or the need for file synchronization between instances. Instead, we decided to store data natively in the blob storage. However, modifying all the code that accesses files would have been a true coding horror, and that is where CMS.IO came into play.

CMS.IO

What is CMS.IO? It's a wrapper around the System.IO namespace. It contains the same classes with the same methods and properties as System.IO. After creating this namespace, we just replaced the System.IO usings in all our code with CMS.IO usings. In effect, we created an abstraction layer between the Kentico CMS business logic and the storage of data (meaning files and directories).


As the schema above illustrates, it gave us the ability to write different data storage providers which can use different APIs in order to store data in different types of storage. We needed this to store data in the Windows Azure blob storage using its API.
Moreover, it gave everyone the ability to write a provider for any other storage without a single additional change in the existing code. As you can read in our roadmap, we are preparing an IO provider for Amazon S3 services for the next version of Kentico CMS.
CMS.IO is the same as System.IO most of the time, but the two are not exact clones. Here is the list of differences:
  • Not everything is wrapped – some types, such as exceptions, have no CMS.IO counterpart, and neither do class members or method overloads that were never used. But everything commonly used can be found there. Our guide was the Kentico CMS code itself: we wrapped every class member that was used in the existing Kentico code, and we didn't have to add anything during the whole Kentico CMS 6 development.
  • Class instances are created with the New() static method instead of a traditional constructor.
The final score is one change and one limitation. That's not bad, is it? This is where we would really appreciate your feedback. If you think that either of the items above limits or bothers you in any way, please leave a comment.
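The post does not say why New() replaces the constructor, but one plausible reason is that a static factory lets the caller ask for "a FileInfo" without naming the concrete, storage-specific class. The following is a minimal sketch of that idea only. Every type name in it (SketchFileInfo, NtfsFileInfo, BlobFileInfo, the Provider field) is hypothetical and is not the real CMS.IO source.

```csharp
using System;

// Hypothetical sketch of a provider-backed static factory; not Kentico code.
public abstract class SketchFileInfo
{
    // Chosen once (for example at application start); selects the storage backend.
    public static Func<string, SketchFileInfo> Provider = path => new NtfsFileInfo(path);

    // Used instead of a constructor: callers never name the concrete class,
    // so switching the provider requires no change in calling code.
    public static SketchFileInfo New(string path) => Provider(path);

    public abstract string Describe();
}

// Stand-in for a provider that would use the local file system.
public sealed class NtfsFileInfo : SketchFileInfo
{
    private readonly string path;
    public NtfsFileInfo(string path) { this.path = path; }
    public override string Describe() => "ntfs:" + path;
}

// Stand-in for a provider that would use blob storage.
public sealed class BlobFileInfo : SketchFileInfo
{
    private readonly string path;
    public BlobFileInfo(string path) { this.path = path; }
    public override string Describe() => "blob:" + path;
}
```

With this shape, a call such as SketchFileInfo.New("MyFile.txt") stays the same whichever provider is active, which matches the "no additional change in the existing code" goal described above.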
Now let's look at how you can work with CMS.IO in practice. The first example shows how to write data into a file:

using CMS.IO;
...
FileInfo fi = FileInfo.New("MyFile.txt");
using (StreamWriter sw = fi.CreateText())
{
    sw.WriteLine("Hello world!");
}


In this example, I first created an instance of the FileInfo class and then wrote “Hello world!” to the file using the StreamWriter object. As you can see, there are two differences from standard .NET code: the use of CMS.IO instead of System.IO and the New() static method instead of a constructor.
Speaking of StreamWriter, it may not be obvious that CMS.IO uses its own set of streams (and of course it contains writers and readers for them as well), and these are not related to the System.IO streams (with one exception, which will be shown later). But what if you have a System.IO.Stream object and you want to use it with the Kentico API, which uses CMS.IO.Stream? Let's find out in my next example:

using CMS.IO;
...
MetaFileInfo mfi = new MetaFileInfo(ds.Tables[0].Rows[0]);
mfi.InputStream = StreamWrapper.New(FileUploadControl.PostedFile.InputStream);


First I created an instance of a metafile object (metafiles are general objects representing media files) and then I wanted to set its InputStream property, which is of the CMS.IO.Stream type. Because I wanted to use the InputStream property of the ASP.NET FileUpload control as the data source, I had to use StreamWrapper, which wraps the System.IO.Stream object in a CMS.IO.Stream.
Also, there can be a situation when you need to do the opposite operation – treat CMS.IO.Stream as a System.IO.Stream. For this purpose, CMS.IO contains the GeneralStream class which inherits from System.IO.Stream. See this example:

using System.Xml;
using CMS.IO;
...
Stream writer = null;
XmlWriter xml = null;

writer = FileStream.New("MyFile.xml");
xml = XmlWriter.Create(GeneralStream.New(writer));


In this code, I created a CMS.IO.FileStream object, which is used as a parameter for the XmlWriter.Create() method. Because this method accepts System.IO.Stream, I created a GeneralStream which envelops the CMS.IO.Stream object.
To give you a complete picture of the whole solution, I created this class diagram below, which shows the streams hierarchy in CMS.IO: 


There is an abstract class called CMS.IO.Stream, from which all streams inherit (the same as in System.IO). The only exception is GeneralStream, which inherits from System.IO.Stream.
The CMS.IO library contains only declarations of abstract classes, the code which loads the proper provider for data (files) access and general code (for example static class Path). All storage specific code is placed in different libraries. The standard CMS.IO provider, which works with NTFS (using the System.IO namespace), is placed in the CMSStorage library.
After reading the previous text, you may ask: What about my custom code? Should I upgrade it to use CMS.IO? The simple answer is yes, it's definitely recommended. But if your Kentico site runs in a traditional datacenter (it doesn't matter whether the server belongs to you or to a web hosting company) and you don't plan to use storage other than NTFS (for example cloud storage like Windows Azure blob storage or Amazon S3), it's not necessary to “upgrade” to CMS.IO. Using System.IO still works fine with Kentico CMS.

Windows Azure storage provider

Now let's take a look at how the Windows Azure provider works. If you are not interested in Windows Azure, you can skip these paragraphs and continue reading from the next header.
This storage provider is implemented in a library called AzureStorage; the namespace is CMS.AzureStorage and the DLL is named CMS.AzureStorage.dll. It's the default provider for the Windows Azure environment, but you can use it even if you are not hosting your application in Windows Azure (more on that in the next section of this post). The provider stores data in Windows Azure blob storage by calling the blob API through the C# client library, which is part of the Azure SDK.
In my previous post, I told you that this storage has two types of objects: blobs, which represent files, and containers, which are something like folders in a traditional file system. Based on this information, the most logical approach would be to map folders to containers and files to blobs. But I also told you that blob storage has a flat structure (you cannot insert a container into another one), although you can emulate a tree structure with blob prefixes. If you want to leverage this functionality, you have to store all data within a single container. We use this approach in Kentico CMS, so all files are stored in one container and blob prefixes define the path to them. Directories in the blob storage are virtual and defined by their content, so if you create a directory, Kentico CMS internally creates a special empty blob inside the directory and names it $cmsfolder$.
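To make the prefix idea concrete, here is a minimal sketch of how a relative path could be turned into a blob name inside the single container, and how the marker blob for an otherwise empty directory could be named. The class and method names are hypothetical, and the exact normalization Kentico CMS applies (for example casing or a root prefix) is not described in this post, so treat this only as an illustration.

```csharp
using System;

// Hypothetical helper; names are illustrative and not part of the Kentico API.
public static class BlobPathSketch
{
    // Name of the empty marker blob mentioned in the text.
    public const string FolderMarker = "$cmsfolder$";

    // Converts a relative file system path into a blob name whose
    // '/'-separated prefix plays the role of the folder structure.
    public static string ToBlobName(string relativePath)
    {
        return relativePath.Replace('\\', '/').Trim('/');
    }

    // Returns the name of the marker blob that makes a directory "exist".
    public static string FolderMarkerBlob(string relativeDirectory)
    {
        string prefix = ToBlobName(relativeDirectory);
        return prefix.Length == 0 ? FolderMarker : prefix + "/" + FolderMarker;
    }
}
```

Under these assumptions, a file at media\images\logo.png becomes the blob media/images/logo.png, and creating the directory media\images produces the blob media/images/$cmsfolder$.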
There are several configuration options for the Windows Azure blob provider, and all of them must be set in the web.config file or the service configuration file, because these settings have to be available immediately after application start, even if the database is not yet installed.
First, you must define credentials to your storage service:

CMSAzureAccountName – Account name of Windows Azure storage service.
CMSAzureSharedKey – Primary shared access key for Windows Azure storage service.

Then you can specify the end points. If end points are not specified, default values in the form <account name>.<storage>.core.windows.net are used. These are the settings for the end points:

CMSAzureQueueEndPoint – End point for queue storage. The AzureStorage provider contains a class called Queue for working with this type of storage.
CMSAzureTableEndPoint – End point for table storage. There is no support class for table storage right now but it may be added in the future.
CMSAzureBlobEndPoint – End point for blob storage. This end point is the most important one, since Kentico CMS uses the blob storage.

You can also influence container behavior:

CMSContainerRootName - Name of the root container where files are stored. The default value is “cmsroot”. 
CMSContainerIsPublic – Defines whether data inside the container can be accessed by public users (no authentication is needed). If so, Kentico CMS generates links to media files pointing directly to storage. Then you can take advantage of the built-in content delivery network and image loading can be really fast. On the other hand, this behavior is less secure. The default value is false so files are accessible only through Kentico CMS and permissions are applied. This setting is evaluated only when you are running in Windows Azure.
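As an illustration, the keys above could appear in the appSettings section of web.config roughly like this. The key names come from the text. The account name, the access key and the end point URL are placeholders I made up, and the end point format is an assumption based on the default pattern mentioned above.

```xml
<appSettings>
  <!-- Placeholder credentials: replace with the values of your own storage account -->
  <add key="CMSAzureAccountName" value="mystorageaccount" />
  <add key="CMSAzureSharedKey" value="YOUR_PRIMARY_ACCESS_KEY" />
  <!-- Optional: leave out to use the default end point (URL format assumed here) -->
  <add key="CMSAzureBlobEndPoint" value="http://mystorageaccount.blob.core.windows.net" />
  <!-- Container behavior: both values shown are the documented defaults -->
  <add key="CMSContainerRootName" value="cmsroot" />
  <add key="CMSContainerIsPublic" value="false" />
</appSettings>
```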

Now let's take a look at how this provider works internally. We already know how data is organized, but what we don't know is how it is actually accessed. First let me point out a few facts:
  • Files are stored in a separate place from the application, so there is a bigger delay than with a traditional file system. Also, data transfers between the application and the storage are charged.
  • Files are shared across the Windows Azure compute instances. On traditional web farms, each server has its own copy of each file.
  • The Windows Azure storage API doesn't support file streams.
These are the three challenges we have to face. Fortunately, there is local storage on each Windows Azure instance which can help us. The first challenge is solved by using a local cache: each file is downloaded to the local storage and loaded from there. But how can we know that we are working with fresh data? What if another instance has updated it (challenge no. 2)? There is a property, the E-tag, containing the MD5 checksum of the blob (file). Data is taken from the cache only if the checksums from the storage and the cache are equal; otherwise the new version of the file is downloaded from the blob storage. The exact process is illustrated in the following flow chart:


The cache folder is set up by default on Windows Azure. But if you want to use only the Windows Azure storage provider, without hosting the application in Windows Azure, you must set the CMSAzureCachePath key in the web.config to the directory where the cache will be stored.
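The cache check described above (compare the tag of the cached copy with the tag reported by the storage, and download only on a mismatch) can be sketched as follows. This is not the CMS.AzureStorage implementation: IBlobStore, CachedBlobReader and InMemoryBlobStore are hypothetical types, and an in-memory dictionary stands in for the local cache folder so that the example runs on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the remote storage; not a Kentico or Azure SDK type.
public interface IBlobStore
{
    string GetETag(string blobName);   // cheap metadata lookup
    byte[] Download(string blobName);  // full, charged data transfer
}

public sealed class CachedBlobReader
{
    private readonly IBlobStore store;

    // Stands in for the local cache folder: blob name -> (tag, content).
    private readonly Dictionary<string, KeyValuePair<string, byte[]>> cache =
        new Dictionary<string, KeyValuePair<string, byte[]>>();

    // Counts real downloads so the effect of the cache is observable.
    public int Downloads { get; private set; }

    public CachedBlobReader(IBlobStore store) { this.store = store; }

    public byte[] Read(string blobName)
    {
        string remoteTag = store.GetETag(blobName);
        KeyValuePair<string, byte[]> entry;
        if (cache.TryGetValue(blobName, out entry) && entry.Key == remoteTag)
        {
            return entry.Value; // cached copy is still fresh
        }

        // Missing or stale: fetch the current version and remember its tag.
        byte[] data = store.Download(blobName);
        Downloads++;
        cache[blobName] = new KeyValuePair<string, byte[]>(remoteTag, data);
        return data;
    }
}

// Tiny fake storage used only to exercise the sketch.
public sealed class InMemoryBlobStore : IBlobStore
{
    public string ETag = "v1";
    public byte[] Content = new byte[] { 1 };
    public string GetETag(string blobName) => ETag;
    public byte[] Download(string blobName) => Content;
}
```

Reading the same blob twice triggers one download. If another instance changes the blob, its tag changes and the next read downloads the new version.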
We've solved the first two challenges, so let's take care of the third one, too. Kentico stores files as block blobs, so the provider internally works with 4 MB file parts. One option would be to download only the needed parts of a file based on the stream position etc. But since we already have a mechanism for accessing files, we built stream handling on top of that. When you use streams, the file is taken from the cache instead of the blob and a temporary file stream is created in the local storage. When the Dispose() method is called, the stream is uploaded to the storage, but only if you opened it for writing.
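The "work on a temporary local file, upload on Dispose() only when opened for writing" behavior can be sketched with plain System.IO. The class below is a hypothetical illustration, not the real provider stream. The upload step is passed in as a delegate because the actual blob upload call is outside the scope of this post.

```csharp
using System;
using System.IO;

// Hypothetical sketch; not the CMS.AzureStorage stream implementation.
public sealed class UploadOnDisposeStream : FileStream
{
    private readonly string tempPath;
    private readonly bool openedForWriting;
    private readonly Action<string> upload; // receives the temporary file path
    private bool uploaded;

    public UploadOnDisposeStream(string tempPath, FileAccess access, Action<string> upload)
        : base(tempPath, FileMode.OpenOrCreate, access)
    {
        this.tempPath = tempPath;
        this.upload = upload;
        // FileAccess is a flags enum, so ReadWrite also counts as writing.
        this.openedForWriting = (access & FileAccess.Write) != 0;
    }

    protected override void Dispose(bool disposing)
    {
        // Flush and close the temporary file before it is read for upload.
        base.Dispose(disposing);

        if (disposing && openedForWriting && !uploaded)
        {
            uploaded = true;
            upload(tempPath);
        }
    }
}
```

A stream opened with FileAccess.Read never calls the upload delegate, which avoids a needless (and charged) transfer for read-only access.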
Temporary files are stored in a different folder than the cache. If you don't run on Windows Azure, you need to configure the CMSAzureTempPath key in your web.config file to specify the directory where temporary streams are stored.
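For a site that uses the Windows Azure provider outside Windows Azure, the two path keys might be set like this. The key names come from the text. The directory paths are placeholders, so pick any two separate folders the application can write to.

```xml
<appSettings>
  <!-- Placeholder paths: cache and temporary streams live in different folders -->
  <add key="CMSAzureCachePath" value="C:\KenticoAzure\Cache" />
  <add key="CMSAzureTempPath" value="C:\KenticoAzure\Temp" />
</appSettings>
```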

Writing your own storage provider

One of the main reasons why we built CMS.IO was to give you the ability to store data in different types of storage. We are going to prepare more storage providers for you, starting with one for Amazon S3, but if you need a certain provider right now, you can write it yourself. Here is a simple guide on how to do it:

1.    I recommend creating a dedicated library for the provider.
2.    Add a reference from this library to the CMS.IO assembly.
3.    Create a class and make it inherit from the abstract class CMS.IO.AbstractDirectory. Implement all methods.
4.    Repeat the previous step, but this time inherit from CMS.IO.AbstractFile.
5.    Create a class and make it inherit from CMS.IO.FileStream. Now override all methods and properties with your implementation. The default implementation throws the NotImplemented() exception so if you forget about something you will find out soon. Also, create the following constructors in your FileStream class:

public FileStream(string path, CMS.IO.FileMode mode)
public FileStream(string path, CMS.IO.FileMode mode, CMS.IO.FileAccess access)
public FileStream(string path, CMS.IO.FileMode mode, CMS.IO.FileAccess access, CMS.IO.FileShare share)
public FileStream(string path, CMS.IO.FileMode mode, CMS.IO.FileAccess access, CMS.IO.FileShare share, int bSize)


6.    The same as the previous step, except that the class should inherit from CMS.IO.FileInfo. For FileInfo you have to create the following constructor:

public FileInfo(string filename)

7.    Again, the same process, but this time the class should inherit from CMS.IO.DirectoryInfo. DirectoryInfo needs the following constructor:

public DirectoryInfo(string path)

8.    Add a new reference to the web project pointing to your storage provider library.
9.    The last step is to set up your new provider instead of the standard one (System.IO). You need to configure two application keys in web.config:
CMSStorageProviderAssembly – Assembly of your created library (e.g. “CMS.AzureStorage”)
CMSExternalStorageName – Name of your storage (e.g. “azure”). There are no rules for how to name your storage. This value influences two properties: CMS.IO.StorageHelper.IsExternalStorage (true if the key is set to a value) and CMS.IO.StorageHelper.ExternalStorageName (returns the name of your storage provider). These properties can help you make certain decisions in your code.
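Put together, step 9 could look like the fragment below. It uses the example values given above (the Windows Azure provider). For your own provider you would substitute the name of your assembly and any storage name you like.

```xml
<appSettings>
  <!-- Assembly that contains the storage provider classes -->
  <add key="CMSStorageProviderAssembly" value="CMS.AzureStorage" />
  <!-- Free-form name, later returned by CMS.IO.StorageHelper.ExternalStorageName -->
  <add key="CMSExternalStorageName" value="azure" />
</appSettings>
```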

After you finish this, all files are accessed through your own storage provider. Everything in Kentico CMS will work except for:
  • Links to media library files – links are generated to directly point to the file system location. You can solve this by enabling permanent links (files are processed through CMS.IO) or by customization of the MediaFileURLProvider class. This class is responsible for link generation and with our new customization possibilities you can take control over that process.
  • “Redirect files to disk” feature – Kentico CMS redirects directly to locations in the file system. You can, however, customize the SendFile() method in ~\CMS\CMSPages\GetFile.aspx.cs and add your own logic which redirects files to the desired location.
  • Download link to an exported package – The link points to a file system location on the server, and you can customize it by modifying the Page_Load() method in ~\CMS\CMSModules\ImportExport\SiteManager\ExportObject.aspx.cs.
As you can see, everything can be done without the need to purchase the source code. On the other hand, it can be unfriendly for you to modify our code in order to get links to work so we will work on this in the future.
This is all from me for now, I hope you like the changes. In case you have any questions or remarks, please leave a comment.

Dominik Pinter

I'm a fan of cloud computing (primarily Windows Azure) and I really like to dig into web application security. My blog is focused on everything related to Kentico, .NET Framework, Cloud platforms and web application security.
