Rethinking IT, Part 3: Storage, Backup, Big Data Analysis and More
Microsoft's cloud platform has a lot to offer -- too much, in fact, to fit into one or two articles. This final part focuses on media management, disaster recovery and Active Directory.
This three-part series provides an overview of Microsoft Azure as it applies to IT professionals. The first installment looked at Web sites, virtual machines and autoscaling. That was followed by Part 2, a discussion of PowerShell, Azure Automation, Cloud Services, Mobile Services and more. This third part covers Storage, HDInsight, Azure Backup, Recovery Manager, Media Services, Azure Active Directory, Service Bus, BizTalk Services and the new portal.
In Azure, the storage for almost everything apart from SQL databases (and their simpler cousin, table storage) is called BLOB storage. It can scale up to 200TB and is accessible through a REST API. All your data is stored in triplicate for resiliency.
You can also choose to have your data replicated to another datacenter; this is called Geo Redundant storage, and it gives you another three copies of your data. A recent update enables read-only access to this replicated storage, which is useful for application failover. There's also the Import/Export feature (U.S. only at this stage) for sending large volumes of data to and from Azure on a BitLocker-encrypted hard drive. A good tool from Microsoft for copying data is AzCopy, which lets you copy data to and from Azure, as well as between different storage accounts in Azure.
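As a rough illustration of the storage points above, here is a minimal Python sketch that builds the REST address of a blob and counts the copies Azure keeps. The account, container and blob names are made up, and the "-secondary" host suffix for the read-only replica is stated here as an assumption to check against the current storage documentation.

```python
def blob_url(account: str, container: str, blob: str, secondary: bool = False) -> str:
    """Build the REST URL of a blob on the primary or read-only secondary endpoint."""
    # Assumption: the geo-replicated, read-only copy is reached by appending
    # "-secondary" to the storage account name in the host name.
    host_account = f"{account}-secondary" if secondary else account
    return f"https://{host_account}.blob.core.windows.net/{container}/{blob}"


def copies_stored(geo_redundant: bool) -> int:
    """Three copies locally, plus three more in a second datacenter if geo-redundant."""
    return 6 if geo_redundant else 3
```

An application that fails over for reads would simply switch from blob_url(..., secondary=False) to blob_url(..., secondary=True); writes still have to go to the primary endpoint.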
Microsoft offers a Big Data analysis service called HDInsight, built on 100 percent Apache Hadoop (version 2.2 at the time of writing). It relies on the Hadoop Distributed File System (HDFS) to ingest unstructured data, which can then be analyzed using MapReduce jobs. Note that when you create an HDInsight cluster, it has InfiniBand (very high throughput) connectivity to its storage.
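For readers who haven't met MapReduce before, the following is a toy, single-process Python sketch of the programming model (map, shuffle, reduce) using the classic word count. It is not HDInsight or Hadoop code; the function names are illustrative only, and a real job would run these phases in parallel across the cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling word_count(["big data", "big cluster"]) counts "big" twice and the other two words once each.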
HDInsight supports Hive and Pig for job management, Apache Sqoop for data management and Oozie for workflow management. Apache Ambari (a tool for provisioning, managing and monitoring Hadoop clusters) is currently only supported for monitoring; Microsoft is working with Hortonworks on a full implementation.
A classic "first candidate" for workloads to move to a public cloud is backup. Azure Backup can be used with either Windows Server backup or System Center Data Protection Manager (2012 SP1 or later).
There are some limitations to be aware of: Windows Server backup can only back up files, folders and volumes. You can't back up System state or do a System recovery. With Data Protection Manager (DPM), you can also back up Hyper-V VMs and SQL Server databases. The main limitation to keep in mind is bandwidth; if your only good backup of 200GB of important business data is on the other side of a slower link, and your Recovery Time Objective (RTO) says one hour, you might be in trouble.
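The bandwidth warning is easy to check with back-of-the-envelope arithmetic. The Python sketch below assumes decimal gigabytes and a link that is fully available for the restore; it ignores protocol overhead, compression and throttling, so treat the results as optimistic.

```python
def required_mbps(data_gb: float, rto_hours: float) -> float:
    """Minimum sustained throughput (megabits per second) to move data_gb within rto_hours."""
    megabits = data_gb * 1000 * 8  # decimal gigabytes -> megabits
    return megabits / (rto_hours * 3600)


def restore_hours(data_gb: float, link_mbps: float) -> float:
    """Hours needed to pull data_gb over a link sustaining link_mbps."""
    megabits = data_gb * 1000 * 8
    return megabits / link_mbps / 3600
```

With the figures from the text, required_mbps(200, 1) comes to roughly 444 Mbps sustained. Over a hypothetical 20 Mbps link, restore_hours(200, 20) is about 22 hours, far outside a one-hour RTO.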
If you have System Center 2012 deployed, specifically Virtual Machine Manager, and you have two datacenters, and you want to implement Hyper-V Replica for site-wide disaster recovery, the Recovery Manager service is a no-brainer. It takes care of automatically distributing certificates to all hosts (instead of having to do it manually on each server) for HTTPS replication. If both of your datacenters are members of the same Active Directory (AD) forest, you can just use Kerberos. It lets you create Recovery Plans that define the order in which VMs should be brought online; these plans can also include manual steps.
Note that only a small amount of metadata about your VMs is housed in Azure; the actual replication of your VM's VHD files is never routed through Azure.
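To make the Recovery Plan idea concrete, here is a small Python sketch of how such a plan could be modeled: ordered groups of steps, where a step either starts a VM or waits for an operator. The class names, fields and VM names are hypothetical and do not reflect the service's actual object model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    name: str
    manual: bool = False  # a manual step waits for an operator to confirm


@dataclass
class RecoveryPlan:
    name: str
    groups: List[List[Step]] = field(default_factory=list)  # groups run in list order

    def run_order(self) -> List[str]:
        """Return a readable list of steps in the order they would execute."""
        order = []
        for number, group in enumerate(self.groups, start=1):
            for step in group:
                kind = "manual" if step.manual else "start VM"
                order.append(f"group {number}: {kind}: {step.name}")
        return order
```

A plan might bring up a domain controller first, pause for a manual DNS check, and only then start the database and Web servers.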
There are several different types of cache and cache-related services in Azure. As part of a cloud service, you can have dedicated Worker Role VMs for caching, or you can set aside some part of the memory of ordinary Worker Roles for caching. There's also a Cache Service (currently in preview) for which the infrastructure is managed by Microsoft. It comes in Basic, Standard and Premium flavors, and provides a URL endpoint to which clients connect. This caching can be used by cloud services, IaaS VMs and Web sites.
Finally, Azure has a Content Delivery Network (CDN) with nodes across the globe to which content is replicated, for faster local access by clients.
Reliably streaming large amounts of content to computers as well as mobile devices is quite a challenging task. Azure Media Services streamlines every part of the process: ingesting raw video data, transcoding it into different formats (third-party services from Adobe or Rapid Digital are available as options), protecting it with DRM if desired and, finally, streaming it to viewers.
Depending on your requirements, you can use either a shared encoding service or request reserved encoding units; Azure offers a 99.9 percent Service-Level Agreement (SLA) for both modes. A recent addition is Dynamic Packaging, which delivers two different streaming formats -- Smooth Streaming and HLSv4 -- from a single stored file, saving on storage costs.
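It helps to translate an SLA percentage into minutes. This short Python sketch assumes a 30-day month and treats the SLA purely as an availability figure; how Microsoft actually measures and credits downtime is defined in the SLA document itself, not here.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Minutes of downtime per period that still meet the given availability percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0
```

For 99.9 percent over 30 days, that works out to about 43 minutes.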
Windows Azure Active Directory
Underlying Azure and other Microsoft services such as Office 365 and Intune is Windows Azure Active Directory (WAAD). It's a scalable directory that supports a subset of the functionality of your on-premises Active Directory Domain Services (AD DS), while adding cloudy features such as WS-Trust, WS-Federation and SAML 2.0, and acting as an OAuth 2.0 provider. It can be accessed through the Graph API over REST, with OData and JSON payloads, letting applications read and write directory objects. WAAD also enables authentication using Facebook, Google ID, Live ID and Yahoo for your applications.
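As a hedged illustration of what "REST with OData and JSON" looks like in practice, the Python sketch below builds a filtered query URL and reads display names out of a JSON response. The host name, the api-version value, the tenant and the sample payload are all assumptions for illustration; check them against the Graph API reference before relying on them, and note that a real call also needs an OAuth bearer token, which is omitted here.

```python
import json
from typing import List
from urllib.parse import urlencode

# Hypothetical response body in the OData style: results sit in a "value" array.
SAMPLE_RESPONSE = (
    '{"value": [{"displayName": "Alice Example", '
    '"userPrincipalName": "alice@contoso.onmicrosoft.com"}]}'
)


def graph_users_url(tenant: str, filter_expr: str, api_version: str = "2013-04-05") -> str:
    """Build a users query with an OData $filter (endpoint and version are assumptions)."""
    query = urlencode({"api-version": api_version, "$filter": filter_expr})
    return f"https://graph.windows.net/{tenant}/users?{query}"


def display_names(payload: str) -> List[str]:
    """Extract displayName from each directory object in a JSON payload."""
    return [entry["displayName"] for entry in json.loads(payload)["value"]]
```

The filter expression uses ordinary OData syntax, for example startswith(displayName,'A'); urlencode takes care of escaping the dollar sign, quotes and parentheses.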
There's also a Premium version of WAAD, which comes with an SLA. It has no limits on directory size, and it adds group-based access assignment, delegated group management, self-service password resets and multi-factor authentication. Make no mistake: WAAD is Microsoft's serious entry into the Identity-as-a-Service (IDaaS) market, with more than 1,200 third-party services already supported for single sign-on (SSO).
There are two main paths for linking your AD to the cloud AD.
- The simpler one is using DirSync to upload relevant portions of your AD objects to WAAD. Recent versions allow password hashes to be uploaded as well, letting the actual authentication happen in the cloud.
- A more secure, but also more complicated, option that offers true SSO is AD Federation, where authentication requests to WAAD are sent back to your on-premises servers.
Finally, you can run your own domain controllers in the cloud using IaaS VMs. This could be a self-contained environment with DCs in the cloud supporting an AD-based application; alternatively, you could put one or more DCs in the cloud that are part of your on-premises forest.
Service Bus supports both basic queuing and Publish/Subscribe (Pub/Sub) messaging over the Advanced Message Queuing Protocol (AMQP). It also lets you set up communication between different Azure applications, with applications in other clouds, and even with your on-premises applications.
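The difference between the two messaging styles is easiest to see in a few lines of code. This is an in-memory Python sketch of the semantics only; it does not use the Service Bus SDK or AMQP, and the class and method names are illustrative.

```python
from collections import deque
from typing import Deque, Dict, Optional


class Queue:
    """Basic queuing: each message is handed to exactly one receiver."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()

    def send(self, message: str) -> None:
        self._messages.append(message)

    def receive(self) -> Optional[str]:
        return self._messages.popleft() if self._messages else None


class Topic:
    """Pub/Sub: every subscription gets its own copy of each published message."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, Deque[str]] = {}

    def subscribe(self, name: str) -> None:
        self._subscriptions.setdefault(name, deque())

    def publish(self, message: str) -> None:
        for backlog in self._subscriptions.values():
            backlog.append(message)

    def receive(self, name: str) -> Optional[str]:
        backlog = self._subscriptions[name]
        return backlog.popleft() if backlog else None
```

With a queue, two workers compete for the same messages, which suits load leveling. With a topic, a billing subscriber and an auditing subscriber would each see every order.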
Continuing the trend of offering services as counterparts to their on-premises applications, there's the option to use BizTalk Services for business-to-business or enterprise application integration in cloud and hybrid solutions. There are five different flavors of the service, ranging from the free option, through Developer, up to Premium, all with different constraints on number of connections, scaling capabilities and protocol support.
I've always found the Azure portal easy to navigate, but recently Microsoft revealed a new portal (currently in preview) that offers a new, integrated and customizable experience meant to assist with DevOps. Underlying it is the concept of Resource Groups (which are just JSON templates) and a Resource Manager that lets you put different Azure services in a bucket and deploy and manage them as a single unit. It also aggregates the total cost of all the underlying services and displays it. Going forward, all Azure services will be built on Resource Manager. This will also open up Azure for third-party ISVs to more easily add their own services into Azure.
The new portal is completely customizable, allowing a user to create a portal that shows them what they need to know, based on job role. Any task not currently available in the preview comes with a link that takes you back to the old portal, with no need to sign on again.
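The Resource Group idea, several services described in one JSON document and managed and costed as a unit, can be sketched in a few lines of Python. The JSON layout, resource names and prices below are invented for illustration and are not the actual Resource Manager template schema.

```python
import json
from typing import List

# Hypothetical resource group description; not the real template format.
RESOURCE_GROUP_JSON = """
{
  "name": "shop-prod",
  "resources": [
    {"type": "website", "name": "shop-web", "monthlyCost": 55.0},
    {"type": "sqlDatabase", "name": "shop-db", "monthlyCost": 120.0},
    {"type": "storageAccount", "name": "shopmedia", "monthlyCost": 12.5}
  ]
}
"""


def total_monthly_cost(group: dict) -> float:
    """Sum the cost of every resource in the group, as the portal does for display."""
    return sum(resource["monthlyCost"] for resource in group["resources"])


def resource_names(group: dict) -> List[str]:
    """List the resources that would be deployed and managed together."""
    return [resource["name"] for resource in group["resources"]]
```

Loading the sample with json.loads(RESOURCE_GROUP_JSON) and passing it to total_monthly_cost gives one figure for the whole group rather than three separate line items.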
The Choice: Azure vs. Amazon
My life as a tech writer (and MCT) used to be easy: a new version of a product would come out, you'd spend months getting to know the new stuff, and then you'd write about it and teach it in class. That world is now gone: as I look at Azure and other cloud services (and even on-premises products), the rate at which new and improved features arrive in Azure every month is staggering.
Comparisons with Amazon Web Services are inevitable, as Amazon was first to this party and has a much larger market share than any other public cloud provider. There are some unique points to Microsoft and Azure, however, which make me confident that they have a bright future. The first is the link to on-premises systems, where no other cloud provider has any footprint in enterprise clouds today (apart from VMware, but when it comes to public cloud, it is years behind). Microsoft can provide a gradual move from on-premises to the cloud, letting enterprises move at their own pace. Another advantage Microsoft provides is developer tools and integration, as Azure offers a great platform for developing apps, and the best tools for doing so.
On the Amazon side, nothing beats the VMs with solid-state drives (SSD) attached that it offers, although Azure's storage can give you pretty phenomenal throughput.
So what's an IT pro to do? Get a free trial of Azure, explore some -- or all -- of the technologies discussed, and become acquainted with them. Do the same with AWS, maybe starting with its free offering, and explore the more mature ecosystem of third-party offerings for Amazon's cloud. Chances are, both Azure and AWS skills will be what's needed tomorrow.