How-To
Tips & Tricks To Secure and Optimize Azure VMs
This is the third and final part in a series where I look at Infrastructure-as-a-Service (IaaS) virtual machines (VMs) in Microsoft Azure. Part one covered the basics of creating a VM, and setting up networking, backup and monitoring before deploying VMs. In part two I covered VM families and sizes, how to pick the right size, running multiple VMs in Availability Sets, picking storage options and choosing between the two options for disk encryption.
Here I'll look at secure ways of accessing your Windows or Linux VMs, Just-in-Time (JIT) access, Azure Security Center, Azure Backup, optimizing networking for VMs, automating creation of VMs, and tips and tricks for managing VMs running in Azure.
VM Access
One important thing that I glossed over in the first two articles is how to access your VM. The default setup gives you RDP access (Windows) or SSH (Linux). I'll bet that you don't have RDP/SSH access directly from the Internet to your production VMs on-premises today, though. So what's the best way to handle that in Azure?
There are three options, and the first two work best in combination. The first is to have a single jump server in Azure that you can access, with all other VMs reachable over SSH/RDP only internally on your Azure Virtual Network (vNet) from that server. This lowers your attack surface, but the jump server is still a target, although you can shrink that target by only allowing RDP/SSH traffic from a set of IP addresses (your on-premises datacenter). The second option is to enable JIT access. This blocks permanent access (no RDP/SSH ports open for automated scans to find) and only opens access when you need it -- you go to Azure Security Center and request access for a set period of time, after which the ports are closed again automatically. JIT only supports Azure Resource Manager (ARM) VMs, not classic VMs, but you should be migrating off those anyway.
Recently (August 2018) Microsoft released (in preview) the ability to turn on JIT access directly on the Configuration blade of a VM. Note that you'll need the Standard version of Azure Security Center (see Figure 1) to use JIT VM Access.
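To make the combination of the first two options concrete, here is a minimal sketch of the decision they produce together: a management connection is accepted only if it comes from a known address range and arrives inside an approved JIT window. This is plain Python and not the Azure Security Center API; the JitGrant class, the management_access_allowed function, and the sample addresses (taken from the reserved documentation ranges) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from ipaddress import ip_address, ip_network


@dataclass
class JitGrant:
    """A hypothetical time-boxed approval to reach one management port."""
    port: int
    opened_at: datetime
    duration: timedelta

    def is_active(self, now: datetime) -> bool:
        # The port counts as open only between the approval and its expiry.
        return self.opened_at <= now < self.opened_at + self.duration


def management_access_allowed(source_ip, port, now, allowed_ranges, grants):
    """Allow RDP/SSH only from known ranges AND only inside an active grant."""
    from_known_range = any(
        ip_address(source_ip) in ip_network(r) for r in allowed_ranges
    )
    port_open = any(g.port == port and g.is_active(now) for g in grants)
    return from_known_range and port_open


# Illustrative values only: 203.0.113.0/24 stands in for the on-premises range.
office_ranges = ["203.0.113.0/24"]
start = datetime(2018, 9, 1, 9, 0, tzinfo=timezone.utc)
grants = [JitGrant(port=3389, opened_at=start, duration=timedelta(hours=3))]

# Inside the window, from the office range: allowed.
print(management_access_allowed("203.0.113.10", 3389, start + timedelta(hours=1), office_ranges, grants))  # True
# Same source after the window has expired: blocked.
print(management_access_allowed("203.0.113.10", 3389, start + timedelta(hours=4), office_ranges, grants))  # False
# Inside the window, but from an unknown address: blocked.
print(management_access_allowed("198.51.100.7", 3389, start + timedelta(hours=1), office_ranges, grants))  # False
```

The point of the sketch is that either control alone leaves a gap: an allow-list without JIT keeps the port permanently open to the listed range, while JIT without an allow-list opens it to everyone for the duration of the window.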
The third option is to create a permanent connection between your on-premises network and your vNets, either through a Site-to-Site (S2S) VPN or through Azure ExpressRoute. The latter provides a pair of SLA-backed links (50Mb/s to 10Gb/s) directly from your datacenter to Azure (bypassing the Internet), which is highly secure but more costly. An S2S VPN is simply something you enable from your router, which creates an IKE v2 IPSec VPN tunnel to Azure. Depending on expected traffic volumes and high availability (HA) needs, you have to pick a gateway SKU in Azure. Note that you can have an S2S VPN as a fallback for an ExpressRoute connection.
Once this type of connectivity is in place, you can disable access from the Internet to your VMs (except application access, of course; if your VMs present Web applications, you'll need to allow that traffic to your resources) and only manage through the VPN connection.
Many VMs don't need to run 24x7 and you can use the auto-shutdown feature to save money (you'll still be paying the low cost for storage of the OS and data disks but not for the compute) -- simply configure each VM to turn off at a particular time and pick the right time zone (see Figure 2). You can also set it up to send an e-mail 30 minutes before it turns off or use a WebHook to notify you -- in case you want to cancel the automatic shutdown. There's no automatic turn-on feature, though, so if you have a lot of production VMs, you might be better served by Azure Automation.
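The timing behind auto-shutdown is simple enough to sketch: given the current time in the VM's chosen time zone and a daily shutdown time, work out when the next shutdown happens and when the 30-minute warning goes out. This is an illustration of the arithmetic only, not how Azure implements the feature; next_shutdown is a hypothetical helper, and the fixed UTC+10 offset is an assumption used so the example doesn't depend on a time zone database.

```python
from datetime import datetime, time, timedelta, timezone


def next_shutdown(now, shutdown_at, notify_before=timedelta(minutes=30)):
    """Return (notification time, shutdown time) for the next daily shutdown.

    `now` must be timezone-aware; the shutdown is scheduled in the same zone.
    """
    candidate = datetime.combine(now.date(), shutdown_at, tzinfo=now.tzinfo)
    if candidate <= now:
        # Today's shutdown has already passed, so schedule tomorrow's.
        candidate += timedelta(days=1)
    return candidate - notify_before, candidate


# Assumed fixed offset for illustration (no daylight saving handling).
tz = timezone(timedelta(hours=10))

notify, shutdown = next_shutdown(datetime(2018, 9, 3, 17, 0, tzinfo=tz), time(19, 0))
print(notify.isoformat())    # 2018-09-03T18:30:00+10:00
print(shutdown.isoformat())  # 2018-09-03T19:00:00+10:00

notify, shutdown = next_shutdown(datetime(2018, 9, 3, 19, 5, tzinfo=tz), time(19, 0))
print(shutdown.isoformat())  # 2018-09-04T19:00:00+10:00
```

Getting the time zone wrong shifts both the warning and the shutdown by hours, which is why the setting matters when your team and your Azure region are in different parts of the world.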
Optimize Networking
If your VMs require fast networking to other VMs or other Azure services, there are a few things to keep in mind. Not all VM sizes provide the same expected network performance: a Standard_F1, for instance, goes up to 750Mbps, whereas a Standard_F8 offers up to 6Gbps and a Standard_D15_v2 goes up to 25Gbps.
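If you're picking a size with a bandwidth target in mind, the comparison can be sketched as a simple lookup. The table below contains only the three figures quoted above; the smallest_size_for helper is a hypothetical name, and "smallest" here means lowest expected bandwidth that still meets the target, not lowest price. Check the current VM size documentation before relying on any of these numbers.

```python
# Expected network bandwidth in Mbps, using only the figures quoted above.
EXPECTED_MBPS = {
    "Standard_F1": 750,
    "Standard_F8": 6_000,
    "Standard_D15_v2": 25_000,
}


def smallest_size_for(required_mbps, table=EXPECTED_MBPS):
    """Return the size with the lowest expected bandwidth that meets the target."""
    candidates = [(mbps, name) for name, mbps in table.items() if mbps >= required_mbps]
    if not candidates:
        return None  # nothing in the table is fast enough
    return min(candidates)[1]


print(smallest_size_for(500))     # Standard_F1
print(smallest_size_for(1_000))   # Standard_F8
print(smallest_size_for(30_000))  # None
```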
Also, make sure you enable Receive Side Scaling (RSS) in your Windows VMs. To check whether it's enabled, run Get-NetAdapterRss; if it isn't, run: Get-NetAdapter | % {Enable-NetAdapterRss -Name $_.Name}. RSS is enabled in Linux VMs by default.
In the previous articles I've mentioned Network Security Groups (NSGs), the built-in software firewall that lets you control access to ports on a per-VM or -vNet basis. If you have a mandate or prefer to use a third-party firewall/security appliance, you can use a Network Virtual Appliance (NVA) -- probably the same one you've standardized on for on-premises. To enable this, you'll need to use User Defined Routing (UDR) and configure the IP address of your NVA as the next hop for subnet-to-subnet or subnet-to-Internet traffic. If your networking team/business is thinking in legacy terms and mandates that all Internet traffic has to be routed back to on-premises to exit through your corporate firewall, you can use UDR to force that, too.
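How a UDR redirects traffic to an NVA is easier to see with a small model of route selection. The sketch below assumes the two rules Azure documents for choosing a route: the longest matching prefix wins, and when a user-defined route and a system route share the same prefix, the user-defined one takes priority. The address space, the NVA address 10.0.1.4 and the select_route function are all hypothetical; this is a toy model, not the Azure networking API.

```python
from ipaddress import ip_address, ip_network

# (prefix, next hop, source) -- all values are made up for illustration.
ROUTES = [
    ("10.0.0.0/16", "vnet-local", "system"),  # default route inside the vNet
    ("0.0.0.0/0", "internet", "system"),      # default route to the Internet
    ("10.0.2.0/24", "10.0.1.4", "user"),      # UDR: traffic to this subnet goes via the NVA
    ("0.0.0.0/0", "10.0.1.4", "user"),        # UDR: Internet-bound traffic goes via the NVA
]


def select_route(destination, routes=ROUTES):
    """Return the next hop for a destination address, or None if nothing matches."""
    dest = ip_address(destination)
    matching = [r for r in routes if dest in ip_network(r[0])]
    if not matching:
        return None
    # Longest prefix first; on a tie, a user-defined route beats a system route.
    best = max(matching, key=lambda r: (ip_network(r[0]).prefixlen, r[2] == "user"))
    return best[1]


print(select_route("10.0.2.5"))       # 10.0.1.4   (subnet-to-subnet via the NVA)
print(select_route("10.0.3.5"))       # vnet-local (no UDR covers this subnet)
print(select_route("198.51.100.20"))  # 10.0.1.4   (Internet-bound via the NVA)
```

Forcing Internet traffic back on-premises follows the same pattern: the user-defined 0.0.0.0/0 route simply points at the VPN gateway instead of the NVA.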
If you need really fast networking, look for VMs that support Accelerated Networking, which uses SR-IOV to bypass the host and virtual switch, thus providing low latency, reduced jitter and low CPU utilization for your networking. For a more in-depth look at planning your network architecture for Azure, check out Microsoft's best practices documentation.
Security Center
Once you're deploying production workloads to VMs in Azure, management, security and monitoring become important. What do you use today for monitoring on-premises workloads? If your current tools don't provide good monitoring for workloads deployed in the cloud, it might be time to add another tool to the toolbelt.
Microsoft offers Azure Security Center, a service in Azure that will give you recommendations on how to improve your security posture, monitor your VMs and other resources for security issues, track your identities and access, as well as let you set security policy and alert you to threats (see Figure 3).
For VMs, cloud services (Web and Worker Roles) and app services (Web sites), Azure Security Center will monitor their security state and give you recommendations. Monitoring requires the Microsoft Monitoring Agent (MMA) in your VMs. For Web apps you'll get recommendations such as enforcing HTTPS-only access or deploying a Web application firewall, whereas for VMs it'll tell you if system updates are missing, disk encryption isn't configured or endpoint protection isn't enabled.
On the other hand, for your vNets Azure Security Center will tell you to enable NSGs or a third-party firewall, or to restrict access through Internet-facing endpoints. If you have Azure SQL (the Platform as a Service [PaaS], not your own VMs running SQL Server, although support for that is coming), Azure Security Center will recommend enabling auditing, threat detection and Transparent Data Encryption (TDE).
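The VM checks described above amount to a short list of rules applied to each machine's state. The sketch below shows that idea in plain Python so you can see what a recommendation engine of this kind boils down to. It isn't Azure Security Center's logic or API; the VmState fields, the recommendation wording and the VM names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VmState:
    """Hypothetical snapshot of the three VM properties mentioned above."""
    name: str
    missing_updates: int
    disk_encrypted: bool
    endpoint_protection: bool


def recommendations(vm):
    """Return a list of recommendation strings for one VM."""
    found = []
    if vm.missing_updates > 0:
        found.append("Apply missing system updates")
    if not vm.disk_encrypted:
        found.append("Enable disk encryption")
    if not vm.endpoint_protection:
        found.append("Install endpoint protection")
    return found


web01 = VmState("web01", missing_updates=4, disk_encrypted=True, endpoint_protection=False)
sql01 = VmState("sql01", missing_updates=0, disk_encrypted=True, endpoint_protection=True)

print(recommendations(web01))  # ['Apply missing system updates', 'Install endpoint protection']
print(recommendations(sql01))  # []
```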
Microsoft has good guidance on what to incorporate in your planning for security.
Backup
Any production workloads in Azure should be backed up. The storage platform is durable, with three copies of your virtual disks stored across different fault domains (with Managed Disks), and hopefully the same if you created your own storage accounts. But durability isn't the same as being safe against user errors or large-scale disasters, nor does it provide the ability to go back in time.
Azure Backup will protect your Windows (VSS Full backup) or Linux VM (write your own script to ensure application consistency) on a daily (or weekly) basis using the VM Agent. You can pick the time of day when your backups run -- make sure you pick a time when the load on your VMs is low. Configuring backup jobs is straightforward (see Figure 4); for planning performance use the Azure Backup Storage capacity planner for IaaS virtual machine backup spreadsheet. Azure will back up each VM disk in parallel, ensuring best performance, but you shouldn't schedule more than 100 VMs to back up to the same vault and if you have classic VMs, only back up 10 of those at a time. Take special care when backing up VMs with encrypted disks and remember that you can only back up VMs in a region to a recovery vault in the same region. You can use the Microsoft Azure Recovery Services vault performance health check script for checking the performance health of your vaults.
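The two limits mentioned above (no more than 100 VMs per vault, and classic VMs backed up 10 at a time) turn into a simple batching exercise when you plan a larger deployment. Here's a minimal sketch; the batch helper and the VM names are hypothetical, and the limits are the ones quoted in this article, so verify them against current Azure Backup guidance.

```python
def batch(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


# 250 ARM VMs spread across vaults of at most 100 VMs each.
arm_vms = [f"vm{n:03d}" for n in range(250)]
vault_plan = batch(arm_vms, 100)
print([len(group) for group in vault_plan])  # [100, 100, 50]

# 25 classic VMs backed up in waves of at most 10.
classic_vms = [f"classic{n:02d}" for n in range(25)]
waves = batch(classic_vms, 10)
print([len(group) for group in waves])  # [10, 10, 5]
```

Remember that each vault in the plan still has to live in the same region as the VMs it protects, so in practice you'd group by region first and then batch within each region.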
If you have VMs running SQL Server you have a few options for backup. You can use the normal VM backup as shown in Figure 4, which will give you a "whole VM" backup, but not a granular SQL backup. You could also use the native SQL backup from within the VM to back up to an Azure storage account (SQL Server 2012 CU2+). That option forces you to configure individual backups on each SQL Server instance, which may not be acceptable in a larger deployment. Finally, there's a new service in preview that uses Azure Backup, but reaches into the VM for a real SQL backup (full/differential/transaction logs).
If you need protection against a total outage of a region you can use Azure Site Recovery (ASR) to replicate VMs from one region to another. Recently (August 2018) added in preview is the ability to do this across subscriptions. ASR can also be used to replicate physical servers, VMware VMs or Hyper-V VMs from on-premises for migration to Azure (free for the first 31 days for each VM) or for disaster recovery (you only pay for the storage you use plus a set fee per VM, until you actually fail over to Azure at which point normal VM costs start). You can also use ASR to migrate VMs from one Azure region to another.
Multiple VMs
In part two of this series I covered VM Scale Sets (VMSS) and Availability Sets (AS). If you'd like to automate the creation of VMs using templates you'll need to make a detour into ARM. All resources in Azure are controlled by ARM and when you create a VM using the portal like I did in part one, there's a link at the end of the wizard to download a template based on your selections. Don't use those because they'll need a lot of editing to be at all useful.
Instead, head over to Azure Quickstart templates (Figure 5) where you'll find many ARM templates for VMs (and all other Azure services). You can then either use the deploy to Azure button to do just that to your Azure subscription, or browse the files on GitHub and customize them further before deploying.
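When you customize a template, most of the work is usually supplying parameter values. ARM templates declare their inputs under a top-level "parameters" element, and a separate parameters file supplies a "value" for each one. The sketch below builds such a file and checks that every parameter without a defaultValue has been given a value. The template shown is a cut-down, hypothetical example (its resources list is left empty), and build_parameters_file is an illustrative helper, not part of any Azure tooling.

```python
import json

PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#"
)


def build_parameters_file(template, values):
    """Build the contents of a parameters file for the given template dict."""
    declared = template.get("parameters", {})
    unknown = sorted(set(values) - set(declared))
    if unknown:
        raise ValueError(f"Not declared in the template: {unknown}")
    missing = sorted(
        name for name, spec in declared.items()
        if "defaultValue" not in spec and name not in values
    )
    if missing:
        raise ValueError(f"No value supplied for: {missing}")
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in values.items()},
    }


# Hypothetical, heavily trimmed template; a real Quickstart template has resources.
template = {
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "adminUsername": {"type": "string"},
        "vmSize": {"type": "string", "defaultValue": "Standard_F1"},
    },
    "resources": [],
}

params = build_parameters_file(template, {"adminUsername": "azureuser"})
print(json.dumps(params, indent=2))
```

Keeping values in a parameters file rather than editing the template itself means you can reuse the same template for test and production with different inputs.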
New Features
Nothing in the cloud ever stays still and it's worth mentioning two new features that are currently in public preview: Azure Firewall and Azure Virtual WAN. While neither is ready for production deployments at the time of writing, this should change toward the end of 2018. Azure Firewall is a fully managed service (no VMs to manage, automatic scalability) providing a stateful firewall. Eventually this will be a replacement (with much less management overhead) for NSGs and costly third-party Network Virtual Appliances for network security management.
Virtual WAN on the other hand connects your current deployment of WAN devices (Citrix and Riverbed are the ones currently supported, but any IKE v2 IPSec capable device should work) through a software-based WAN (SD-WAN) that runs over Microsoft's worldwide fiber network.
Next Time
This concludes this series' look at IaaS VMs in Azure -- next month I'll look at other ways of moving to the cloud and why picking up a whole bunch of VMs from on-premises and just lifting and shifting them to the cloud may not be the best idea.
About the Author
Paul Schnackenburg has been working in IT for nearly 30 years and has been teaching for over 20 years. He runs Expert IT Solutions, an IT consultancy in Australia. Paul focuses on cloud technologies such as Azure and Microsoft 365 and how to secure IT, whether in the cloud or on-premises. He's a frequent speaker at conferences and writes for several sites, including virtualizationreview.com. Find him at @paulschnack on Twitter or on his blog at TellITasITis.com.au.