For several months I have been writing a series about the basics of Windows Azure. This post is the last one from that series and we will focus on databases in Microsoft cloud, on a service called SQL Azure.
As you may assume, SQL Azure is basically a Microsoft SQL server running in the Windows Azure environment. This could be the most basic description of the service. However, as with everything in Azure, there is a lot of small differences which you might want to know about.
SQL Azure Architecture
Let´s start with the description of the physical architecture of the service. Knowing how it works internally is important for understanding why SQL Azure is limited in some ways. Imagine that you have Kentico EMS running in your datacenter. You collect a lot of important data and you don´t want to lose it in case of a hardware failure. To serve a lot of queries and protect your data, you typically replicate your data across multiple database servers and regularly backup the data. And this is basically SQL Azure, it´s a bunch of modified MS SQL servers hosted and replicated in the Azure datacenters.
Actually, it is more complicated, your database is replicated in real time three times and every replica is placed in a different node. Microsoft says that they are using a brand new and special type of replication and this replication is not part of any existing product. Microsoft has at least two different datacenters on one continent and at least one copy of a database is hosted in a different geo location (but always on the same continent). So, even if your primary datacenter is destroyed by some kind of disaster (an earthquake for example) your SQL Azure database will continue to work.
However, having three database servers for each customer would be really expensive. Instead of that, servers are shared by multiple customers. You can of course access only your databases and databases of other customers are completely isolated from you.
SQL Azure is “only” a relational database
You sure know that Microsoft SQL server contains a lot of different features like SQL Reporting services
and the relational database is only a part of it. SQL Azure can look like a step back from this point of view; it´s only a relational database, other features are missing. (Well, this is not entirely true but more on that later). Additionally, because the databases are shared between users, you cannot modify server settings and use features which can influence the whole server or other databases. You should also keep in mind that SQL Azure is based on SQL server 2008 R2 or may be already upgraded on SQL Server 2012 (codename Denali) so older features such as system tables
are not supported and you must use their successors (in case of system tables you must use catalog views).
The complete list of limitation is on MSDN
. From my point of view as a developer, these are the most significant:
Full-text search – this feature highly depends on the underlying storage (disk) and influences the whole server. It is one of the most wanted features in SQL Azure and Microsoft had it on their SQL Azure roadmap for several times. However, at some point of development, they found an unsolvable problem so the future of this feature is quite unsure.
Common language runtime procedures – Similar story. Procedures (their code) are executed on the database server and this could lead to problems with the performance of the given server and its databases. When a procedure contains a memory or CPU challenging task, it will slow down everything else on the server.
Running cross-database queries – Every query is executed strictly in context of a certain database. And there is no way to jump out of this context. Your database server is not a real machine. It´s a group of virtual nodes in the Azure datacenter and each of your databases can be stored in a different node. Because of that, commands like
USE <database> are denied.
SQL Agent – SQL agent is a mechanism for executing scheduled administrative tasks (jobs) in the SQL server. Unfortunately, SQL Azure doesn’t support it. This is the reason why SQL Azure doesn’t support SQL session state provider - there is no SQL Agent which could execute the delete procedure (job) for expired sessions. However, you can use your on-premise SQL Agent and connect it to the SQL Azure database.
Backup/Restore – Data are protected against hardware failures or disasters but what about user mistakes (e.g. a user deletes data in certain table)? Typically you would use the backup/restore feature but SQL Azure doesn’t support it. Rather than that, you have three other options which will be covered later in a dedicated chapter.
Clustered index – In fact, this is not a limitation but a requirement. Every table must have one clustered index.
Limited amount of resources – As I wrote, your database is shared with other databases so a query which consumes more than an allowed amount of resources is cancelled. See this blog post for more details.
Are you still reading? That´s great, the bad news are already behind us. There are a few additional features in SQL Azure – data synchronization service, reporting services and support for data tier applications.
For synchronization or migration of a data from your database to SQL Azure, you can use the Sync framework
(which now supports SQL Azure, see this post
for details) or cloud service, the Data Sync
service. This service is now in CTP (community technical preview). With this service, you can configure your SQL Azure database to be synchronized to another one. It can be a different SQL Azure database in the same or different geo location or even an on-premise SQL server DB. Data synchronization can be bi-directional and it is done within a certain time interval.
SQL Azure reporting service is a cloud based service built on the on-premise SQL reporting which is part of Microsoft SQL server. This service is currently in CTP. If we compare it to the on-premise version, it has many limitation
s and it´s not that feature rich (this is actually a sign of many Azure services) but more features will be added in the future. You can use the service for viewing reports from SQL Azure directly in a browser; any other software is not needed. You can also embed an AJAX based report viewer into your web application.
Data tier applications
With Visual studio 2010 and Microsoft SQL server 2008 R2 it is possible to create a special project for SQL schema scripts within your solution. This feature helps mainly team development because you can treat SQL scripts as any other code using source control, etc. After you create your schema, you simply deploy it to the database. SQL Azure is also supported in this feature so you can deploy your database in this way directly to the cloud.
Editions of the SQL Azure
SQL Azure is available in two editions – Web and Business. Currently, they differ only in maximal size of the database where Web can take 1 to 5 GBs and Business can be 10-50 GBs. Why did they divide the service into two editions? In the future, there will be several advanced features in SQL Azure (for example the ability to preserve CPU cores or RAM size for your database) and most of them will be supported only in the higher (Business) edition.
Backup and synchronization
As I wrote a few paragraphs above, in SQL Azure, you are automatically protected against physical damage of your data caused by a disaster or hardware failure. But you have to take care of human faults. There are a number of ways to do that, I´ve already mentioned data synchronization using the sync framework or data sync service. With these features, you can build an automatic backup mechanism where your database will be synchronized to an on-premise SQL service which will be backed up in a standard way.
Another option is to use the COPY
CREATE DATABASE destination_database_name
AS COPY OF[source_server_name.]source_database_name
To copy MyDatabase database to the same server execute this command:
CREATE DATABASE [MyDatabaseBackup] AS COPY OF[MyDatabase]
This command is available only in SQL Azure and it copies a whole SQL Azure database to another one. The first database will be primary and the second will be a backup. Good solution is to schedule running of the COPY command using a worker role.
The third possible solution is using the Import/Export feature of SQL Azure. This feature is related to support of data tier applications. Basically, you can do an export of the database to the special dacpac
format. It exports data as well as schema information. Then you can import this file into a different database. See this example
how this can be done programmatically.
Management of the SQL Azure
Now let´s take a look at how you can manage the SQL Azure service. The options are similar to other services. The whole management can be done via the Windows Azure portal (http://windows.azure.com
Also, there is a special part of the portal for management of the data. You can access it by clicking Manage in the ribbon menu. After that, a new window will appear. After you enter credentials for your database, you can browse it directly in a browser.
This portal has ambitions to replace SQL Management Studio one day. Until then, you can of course use the real SQL Management Studio but you need version 2008 R2 or higher.
A whole different way to manage your database is to write your own management solution and use the Windows Azure management API. With that, you can create and delete not just databases but also whole servers.
Migration on-premise database to the SQL Azure
Now let´s talk a little bit about how you can migrate your existing database to SQL Azure. There are several ways to do that. From my personal experience, the best way to do that is to use SQL Azure migration wizard (AMW). This tool is totally free and can be found on codeplex
Usage is simple, too. First you fill out the connection info of your on-premise database, AMW analyzes and exports your schema and data. During this process, AMW notifies you about the progress and it also informs you every time it finds anything which is not supported by SQL Azure. Then, you enter credentials for the SQL Azure database and AMW imports schema and data to it. It also automatically adds clustered indexes to the tables which don’t have them yet. Since this can influence performance of your database I strongly recommend you to check that indexes are added to the most suitable rows. And that’s it. On a standard Kentico CMS database with the sample Corporate site installed, the whole operation takes about 10 minutes.
The AMW has also the ability to migrate a database back to an on-premise database and copy data from one SQL Azure database to another.
Security of the SQL Azure
I’m convinced that a lot of people don’t want to go to the cloud because they are concerned about the security of their data there. Since a database is in fact storage of your data (and data of your clients) I should say something about its security features.
First of all, if you want to, you can use your on-premise database with a cloud service application. There is a service called Windows Azure Connect, which creates a virtual private network between your on-premise servers and the cloud, so you can have a database server behind a firewall and still do this. Obviously, this solution has a big disadvantage in the long distance between the servers (and bigger delays) so you must have a really good cache mechanism and/or also a quite static application.
I think the scenario above is exceptional and most of you have applications with dynamic content. In these cases, you should use SQL Azure. In SQL Azure, there are several features which improve security:
The whole communication must be done through a secured channel. Unencrypted connections are declined. Encryption is specified in the connection string by the
Encrypt=True/False parameter. SSL itself is not a silver bullet because it is susceptible to the “man in the middle” attack. To prevent this kind of attack, you mustn´t trust the server’s certificate and you have to validate it. To set up validation of the server certificate, add the
TrustServerCertificate=True/False parameter to to you connection string and set its value to
SQL Azure has a built-in firewall. By default, access from all IPs is restricted and you need to allow certain IPs to connect to the database. There is one special rule represented by 0.0.0.0 IP address – servers inside Windows Azure. Unfortunately, you can only allow all cloud servers or none. This means that if you allow connection from your cloud application, all other applications running in Windows Azure (maintained by other Windows Azure customers) can also connect to your database.
User names like “sa” or “admin” are restricted to use in SQL Azure. This makes it harder to an attacker to crack into the account.
SQL Azure and Kentico CMS
After reading the text above, you have quite a good knowledge about SQL Azure. The last piece that is missing is what changes need to be done to start using it. In an ideal scenario you just change the connection string. In the real world it is not usually that easy.
In Kentico CMS’ case we did just a few changes. We replaced the obsolete system tables in our queries with catalog views and made sure that every table has one clustered index. We also rewrote one or two queries because they contained syntax which is not supported by SQL Azure.
The last change was modification of our database installation. Our database schema scripts are exported from a standard on-premise SQL server. For SQL Azure, we delete from them the unsupported keywords /commands like “
NOT FOR REPLICATION
”. Also, right now, the database installer does not create a database or a server for SQL Azure (we are planning to add this feature in future).
That is all for today. With this post, the whole series about Windows Azure services also ends, because we went through all the parts which Kentico CMS currently uses. But don’t worry; I will continue to write about Windows Azure because there are a lot of exciting things I want to tell you, so stay tuned.