Introduction to (Windows) Azure
Hi there, with this post I’m starting a new series here on the DevNet portal about Windows Azure. I will write about Azure in general and also about Azure integration in Kentico CMS. In this first post I will introduce the Azure platform from a development point of view. I will say a few words about the capabilities of Windows Azure, what you can use in Azure and what we decided to use when we started to support Kentico CMS on Azure.
First of all, I will not explain the basic concepts of cloud computing. If you are completely unfamiliar with cloud computing, you can read some basic info in Michal Neuwirth’s post.
Let’s start with terminology. Azure-related terminology is a little bit tricky and one could get confused very easily.
Azure
– the term used for Microsoft’s entire cloud computing platform. It consists of Windows Azure, SQL Azure, Azure AppFabric and Azure Marketplace DataMarket.
Windows Azure
– this term is usually used for the computing, storage and network services provided by the Azure platform. We will look at this part in more detail below.
SQL Azure
– a relational database service similar to MS SQL Server.
Azure AppFabric
– supports infrastructure services. Azure AppFabric consists of a general service bus, access control services and a durable cache service. The term “fabric” is the trickiest one in Azure terminology because a service with a very similar name also exists: the Fabric Controller, which is a completely different thing. Microsoft also offers a product called Windows Server AppFabric, which is functionally similar to Azure AppFabric but is used on-premise.
Azure Marketplace DataMarket
– Data as a service. There are two roles: a producer, who offers some kind of data, and a consumer, who can use data from producers as a data source.
I will use these terms in the rest of this series. As you probably noticed, the list above also introduced the basic parts of Azure. I personally think that for CMS systems, two services are the most critical: Windows Azure and SQL Azure. You need to run your application somewhere (Windows Azure) and store its data (SQL Azure as a database backend, Windows Azure storage for storing physical files). Let’s take a more detailed look at the first one (SQL Azure will be covered in one of the future posts).
There are four different components in Windows Azure. Let’s start with a description of Windows Azure computing services. Basically, this is just a huge farm of virtual machines. However, there is one difference between standard VM web farms and Windows Azure: you have to put your application into one or more roles. Each role offers a different approach.
Web role
– designed for interaction with users. A web role is basically a place for hosting your web application or web service. There are three subtypes:
ASP.NET web role – for running ASP.NET applications. As .NET developers, you will in most cases use this type of role.
WCF Service web role – for your WCF services.
CGI web role – this role is meant to be used for platforms/languages other than ASP.NET, such as PHP or Java.
Worker role
– simply put, everything except for web applications should run as a worker role. You can also run web applications in a worker role, but you would do better to go with a web role, which is designed for them. Originally, worker roles were mainly designed for running long, complex tasks or scheduled tasks. When, for example, you need to sort some table every few hours, a worker role is a good place to run this code. But you can run whatever you need there. You want an FTP server hosted on Azure? No problem. You only need to specify an input endpoint for your worker role (you couldn’t do this in the first version of worker roles), change the code of your FTP server to bind to the correct port using the Azure runtime API (more on that in one of the future posts) and you are ready to go.
VM role
– VM is short for virtual machine. As you certainly know, Windows Azure uses the platform as a service (PaaS) model, but you can now also use it as infrastructure as a service. The VM role was created for this purpose. To use it, you create a VHD using Hyper-V technology, upload it to your cloud space and that’s it. Unfortunately, with the VM role you lose some of the benefits of PaaS: you have to take care of your role yourself, including the installation of new updates to the underlying OS.
All your roles are maintained by the Fabric Controller. It creates and deletes virtual machines, monitors the health of your application and so on. All this is fully transparent to you: you only enter the desired configuration into the service definition and service configuration files, and the Fabric Controller takes care of everything.
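To make the "you declare it, the controller does it" idea a bit more concrete, here is a minimal Python sketch. It is only a toy model: the class `ToyController`, the `reconcile` method and the instance naming are my own illustrative assumptions, not the real Fabric Controller or any Azure API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyController:
    """Hypothetical stand-in for a controller that keeps instances in line with a configuration."""
    running: Dict[str, List[str]] = field(default_factory=dict)

    def reconcile(self, desired: Dict[str, int]) -> List[str]:
        """Start or stop instances until each role matches its desired instance count."""
        actions = []
        for role, count in desired.items():
            instances = self.running.setdefault(role, [])
            # Too few instances: create new ones.
            while len(instances) < count:
                name = f"{role}_IN_{len(instances)}"  # illustrative naming only
                instances.append(name)
                actions.append(f"start {name}")
            # Too many instances: remove the most recently created ones.
            while len(instances) > count:
                actions.append(f"stop {instances.pop()}")
        return actions


controller = ToyController()
print(controller.reconcile({"WebRole": 2, "WorkerRole": 1}))
print(controller.reconcile({"WebRole": 3, "WorkerRole": 0}))
```

The point of the sketch is only that you describe the target state (how many instances of which role) and leave the individual start and stop steps to the controller.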
Another important part of Windows Azure is storage. There are three types of storage. All of them are highly distributed, durable, highly available and ready for massive scaling.
Blob storage
– general storage for huge binary files. Blob storage is also a good replacement for a standard file system. It consists of containers and blobs. Containers are similar to folders in a file system and every blob must be in some container. The only difference is that one container cannot be placed inside another; the hierarchy is flat. On the other hand, the managed client library offers virtual directories for the simulation of a tree structure (this topic will be covered more deeply in the future). There are two types of blobs. Page blobs are designed for random read/write access, which suits files such as virtual hard disks (VHDs). Block blobs are intended for everything else. There is one rule: if you don’t have any specific reason for using page blobs, always use block blobs.
Table storage
– the ideal place for storing non-relational metadata. Tables don’t have any schema and every row can have a different structure. To read or write data in a table you need to use a key. Each key consists of two parts: a partition key and a row key. You can get the best performance by dividing a table into partitions using different partition keys.
Queue storage
– this type of storage is usually used for internal communication between roles. Of course, you can use it for whatever you want.
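The flat container hierarchy and the virtual directories mentioned for blob storage can be illustrated with a small sketch. It is plain Python over a list of names, not the managed client library; the function `list_virtual_directory`, the `/` delimiter and the sample blob names are assumptions made just for this example.

```python
from typing import List, Tuple


def list_virtual_directory(
    blob_names: List[str], prefix: str = "", delimiter: str = "/"
) -> Tuple[List[str], List[str]]:
    """Return (virtual subdirectories, blobs) found directly under `prefix`.

    The container itself is flat: it only holds blob names. The tree
    structure is simulated purely by splitting names on the delimiter.
    """
    directories = set()
    blobs = []
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter in rest:
            # Something deeper in the tree: report only the next level.
            directories.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            blobs.append(name)
    return sorted(directories), sorted(blobs)


# Hypothetical content of one flat container.
container = [
    "logo.png",
    "media/banner.jpg",
    "media/video/intro.wmv",
    "search/index.dat",
]
print(list_virtual_directory(container))
print(list_virtual_directory(container, "media/"))
```

Nothing is nested inside the container here. A "directory" exists only because some blob names share a prefix.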
Now a few words about Azure networking services. There is a content delivery network (CDN), so if you have customers all over the world, this is an ideal way to decrease latency. There is also a thing called Azure Connect. It allows you to connect your on-premise servers with Azure roles. For example, you can join your Azure role to your domain. Azure Connect is currently in CTP (Community Technology Preview).
Differences between On-premise and Windows Azure
Let’s take a look at the main differences between an on-premise environment and Windows Azure. This part focuses only on the differences between standard ASP.NET development and Azure.
The application must be stateless
In the Azure environment, roles typically have more than one instance. It’s recommended to have at least two instances of each role in order to ensure high availability. Requests are divided between instances by the built-in load balancer. The algorithm behind the load balancer uses the round robin model, so consecutive requests go to different instances. Where is the catch? If you are storing anything in the memory of your process, you have to synchronize it between instances. As you may know, we have the web farms module in Kentico CMS, which takes care of the synchronization of static objects.
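Here is a minimal sketch of the catch described above, written in Python rather than ASP.NET. The `Instance` class, the `round_robin` helper and the instance names are hypothetical; the sketch only assumes a plain round robin over two instances, each with its own memory.

```python
from itertools import cycle


class Instance:
    """One role instance with its own private process memory."""

    def __init__(self, name):
        self.name = name
        self.memory = {}  # state that lives only inside this instance

    def handle(self, key, value=None):
        """Store `value` if one is given, then return what this instance knows for `key`."""
        if value is not None:
            self.memory[key] = value
        return self.memory.get(key)


def round_robin(instances):
    """Hand out instances in turn, as a simple round-robin balancer would."""
    return cycle(instances)


instances = [Instance("web_0"), Instance("web_1")]
balancer = round_robin(instances)

first = next(balancer)            # the first request lands on web_0
first.handle("theme", "dark")     # ...and stores something in its memory
second = next(balancer)           # the next request lands on web_1

print(first.name, first.handle("theme"))
print(second.name, second.handle("theme"))
```

The second instance never sees the value written on the first one, which is why anything kept in process memory has to be synchronized or moved to shared storage.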
Another problem is with session state data: you cannot use any of the three standard session state providers as they are. InProc and StateServer are out of the game because both store data on the current machine. The SQL Server provider should be usable with SQL Azure after a few changes (we will discuss it in the post about SQL Azure).
A durable standard NTFS file system is missing
I have already introduced all the durable storage types and I didn’t mention an NTFS file system. That’s because the Azure platform doesn’t offer any durable NTFS file system. The reason is that NTFS isn’t a distributed file system, which is a basic condition for use in the Azure environment. You could use local storage on a single instance. However, local storage is not durable and its data isn’t synchronized across instances, so you can use this type of storage only for temporary files or as a cache for the given instance.
Also, if your application stores anything in files using the System.IO namespace and you don’t want to change this, you can use the Azure drive feature. This feature enables you to mount blob storage as an NTFS hard drive that you can use in the standard way. However, you can mount this drive for reading and writing on a single instance only. I’m not saying that you cannot use this feature with multiple instances, but it requires additional work.
The last option is to use pure blob storage as a replacement for a standard file system. We will talk about this option more in the future.
No write access to the application directory
Yes, you can use local storage on individual instances, but this type of storage is in a different part of the hard drive than the application data. The part with the application is read-only. Why is that a problem? For example, in Kentico CMS we write the connection string into the web.config file after installation, and you cannot do that on Azure. Why is that? Because you would change data only on one instance. More importantly, even if you created a mechanism for synchronizing data between instances, you still couldn’t change the golden image. A golden image is created when you upload your application to the cloud. If you add a new instance, it’s created from the golden image.
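The golden image argument can be shown with a tiny Python sketch. The `Deployment` class and the file contents are invented for illustration, and the sketch pretends that a write to the application files on one instance were possible at all, just to show why it still wouldn’t help.

```python
import copy


class Deployment:
    """Toy model: every instance starts as a fresh copy of the uploaded golden image."""

    def __init__(self, golden_image):
        # The golden image is fixed at upload time.
        self._golden_image = copy.deepcopy(golden_image)
        self.instances = []

    def add_instance(self):
        """Create a new instance from the golden image, not from any running instance."""
        instance_files = copy.deepcopy(self._golden_image)
        self.instances.append(instance_files)
        return instance_files


deployment = Deployment({"web.config": "connectionString=''"})
first = deployment.add_instance()

# Assumption for the sake of the example: the installer rewrote web.config on the first instance.
first["web.config"] = "connectionString='Server=example'"

second = deployment.add_instance()
print(second["web.config"])
```

The second instance comes up with the original web.config, because it is built from the golden image and never sees the change made on the first instance.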
Parts of the Azure platform used in Kentico CMS 5.5 R2 and 6.0
Now it’s time to take a look at what Kentico CMS needs to run in Azure.
ASP.NET Web role
– All of Kentico CMS is one big ASP.NET web role.
SQL Azure
– as you know, Kentico CMS normally uses MS SQL Server for its database. In the Azure world we use SQL Azure services.
Kentico CMS 5.5 R2
Blob storage via Windows Azure Drive
– the Smart search, Web analytics and Media library modules need to save data to the file system. Because 5.5 R2 is limited to use with one instance only, we decided to support these modules using the Windows Azure Drive feature.
Kentico CMS 6.0
Blob storage
– for version 6.0, we are preparing a replacement of the standard NTFS file system with blob storage on Azure. We will not use Azure Drive anymore, only the pure blob API.
Azure AppFabric cache
– We will use this cache service at least for storing session state data.
OK, I think this is quite enough for today. The next post will be dedicated to developing on Azure. Stay tuned.