AZ-304 Microsoft Azure Architect Design – Design for High Availability Part 2
4. *NEW* Essential High Availability Concepts for Exam
So at this point, I feel like I should reiterate how important it is to understand Availability Zones, how they improve the resiliency of your application, and how they actually work. We've covered this concept before, but Availability Zones are essentially data centers: physically separate locations within a single Azure region. This diagram from Microsoft's website shows three separate buildings that are physically separated. Now, it may be that more than one building makes up an Availability Zone, but there are three Availability Zones in a region.
The reason for this is that when one Availability Zone is affected, whether it's a power outage or an Internet outage, Microsoft has designed their system so that the other data centers in the same region shouldn't be affected. They've tried to make it so that outages are localized to that particular geographical spot, that particular data center, and other data centers in the same region are not affected. And so by deploying your application across all three Availability Zones in a single region, you are protecting yourself from what are called localized failures.
So the question that might come up on this exam, and certainly something you absolutely need to understand before you attempt it, is: how do you deploy your application across all three zones so that if one or two of those Availability Zones were to fail, your application isn't affected? Think about that in terms of the complex set of services that go into making up your application. You might have a virtual network, a firewall, a database, some virtual machines, some sort of middle tier, et cetera.
You've got lots of things that go into your application, and you really can't avoid thinking about how it is affected if a single zone goes down, or, in the worst case, two out of three zones go down. Now, not every Azure service has specific Availability Zone support. In general, Microsoft has broken up their services into three main categories, which we'll cover in a second. The other thing to think about is that not every Azure region supports the concept of Availability Zones.
So there are quite a few regions we can see here: a Canada region, Brazil, some of the US regions. But out of the 60-plus regions, only about 20 here support Availability Zones. So if this kind of redundancy is important to you, then you're going to have to think about which regions you deploy your solutions to. I said there are three main types of services. If you look across the thousand or so Azure services that are available, you've got zonal services.
Now, these are services like virtual machines, where you can specify a specific Availability Zone to deploy to. And if you need to deploy into two zones or three zones, you're responsible for doing that: you have to deploy to zone one and to zone two individually. Zone redundant services are services where Azure manages it, like a storage account where you can choose ZRS, which is zone redundant, and Azure will keep copies of your data in other zones. Basically it's automatically replicated.
If a single zone were to go down, Azure would handle that. And finally we have services which you could call global services, like Azure Active Directory, which are not specifically tied to a region. Even if you have to choose a region during deployment, which we'll talk about in a second, that could just be where the profile is stored; the services themselves are managed globally. So going into this exam, you should know whether a service is global (always available), zonal, or zone redundant. Here's a list of the global, always-available services, and I've highlighted a few that you might use commonly in your application. You might use DNS, Front Door, CDN, Traffic Manager, things like that as part of designing your application. And because Microsoft takes care of the zone related issues, you can deploy, for instance, Front Door.
And it doesn't make you specify which zone to deploy to; it takes care of that for you. So I'll highlight Traffic Manager and a couple of these others that you can add to your solution without having to worry about zone specific issues. But while DNS, like I said, is a global service, if you want to deploy a public static IP address, that is a zone related service. Under the Standard SKU, you can actually choose which zones the IP address gets deployed to. So you can have effectively zone redundancy as part of your IP, because the IP address is a resource that gets deployed as part of your resource group. And so you want to specifically choose, when you're deploying your public IP address, which zones it goes to if you're concerned about zone redundancy in this way.
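As a quick illustration, here's a minimal sketch of deploying a zone redundant Standard public IP with the Azure SDK for Python (azure-identity and azure-mgmt-network). The subscription ID, resource group, and IP name are hypothetical placeholders, not anything from the course.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

credential = DefaultAzureCredential()
network_client = NetworkManagementClient(credential, "<subscription-id>")

# Standard SKU is required for zone support; Basic public IPs cannot be zone redundant.
poller = network_client.public_ip_addresses.begin_create_or_update(
    "rg-ha-demo",            # hypothetical resource group
    "pip-zone-redundant",    # hypothetical IP name
    {
        "location": "eastus",
        "sku": {"name": "Standard"},
        "public_ip_allocation_method": "Static",
        "zones": ["1", "2", "3"],  # spread the address across all three zones
    },
)
public_ip = poller.result()
print(public_ip.ip_address, public_ip.zones)
```

Passing all three zones in the zones property is what makes the address zone redundant rather than zonal (a single zone) or regional (no zones).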
Another example is Application Gateway. If you look at the V2 SKU of Application Gateway, you can deploy it across zones. What this means is that you have a single instance of Application Gateway and it can manage services in specific zones. Again, this is something that has to be done during deployment; it's not something that's automatically given to you. So again, if you're concerned with zones going down and need your application to stay running when a zone has problems, then you're going to choose the Application Gateway V2 SKU and specifically deploy it across all three zones.
Now here's an example diagram of a potential solution deployment. We can see here we have a static virtual IP address that could be cross-zone, as we just saw. There's the application gateway on the left, and that gateway is directing traffic to a couple of Availability Zones. There are AKS solutions in one zone and VM scale set solutions in another zone. There are some on-premises solutions and some Azure App Service solutions.
And it works its way across all of the zones. Key Vault is another service which is zone redundant, where Azure manages the zones for you. It stores your keys in a zone redundant way, and if one of the zones were to fail, Azure will take care of picking up your data from another zone. So the message here is simply that Key Vault is zone redundant.
Now, one thing to point out is that virtual machines are zonal. So when you're deploying a single virtual machine, you're deciding which Availability Zone to deploy it to, and if you need it to be cross-zone, you need to deploy one VM into each zone. But even though the virtual machines themselves are zonal, if you combine them into a virtual machine scale set, you can actually define the scale set as being zone redundant. So once again, you're deploying your virtual machines in a scale set across all three zones, and the scale set manages scaling up and down across zones.
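If you want to confirm how a scale set like that is spread out, here's a minimal sketch using the Azure SDK for Python (azure-mgmt-compute). The resource group and scale set names are hypothetical, and the per-instance zones attribute is read back from Azure rather than something you set yourself.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

credential = DefaultAzureCredential()
compute_client = ComputeManagementClient(credential, "<subscription-id>")

# Hypothetical resource group and scale set names.
rg, vmss_name = "rg-ha-demo", "vmss-zr"

vmss = compute_client.virtual_machine_scale_sets.get(rg, vmss_name)
print("Scale set zones:", vmss.zones)  # e.g. ['1', '2', '3'] when deployed zone redundant

# Each instance reports the zone it landed in, so you can see the spread.
for instance in compute_client.virtual_machine_scale_set_vms.list(rg, vmss_name):
    print(instance.name, "-> zone", instance.zones)
```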
Cross-zone deployment is something you define when you're deploying the virtual machine scale set. If you use something like Azure App Service, there is an App Service Premium tier, Premium V2 or V3, that you can choose to deploy cross-zone. So even in the App Service example, you do have to take action to protect your apps from zone failures. And, as I said, right now it's an ARM-only option: you choose zone redundant as a property, and it's not available in the portal. Because it's cross-zone, you do have to have three instances of your app running, of course, so there is a minimum capacity and obviously a minimum charge for doing that.
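Since it's driven by that ARM property, here's a hedged sketch of setting it with the Azure SDK for Python (azure-mgmt-web). The plan name and resource group are hypothetical, and zone_redundant here is assumed to be the SDK's mapping of the ARM zoneRedundant property.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.web import WebSiteManagementClient

credential = DefaultAzureCredential()
web_client = WebSiteManagementClient(credential, "<subscription-id>")

# Zone redundancy needs a Premium V2/V3 plan with at least three instances,
# one per zone. Names here are hypothetical.
poller = web_client.app_service_plans.begin_create_or_update(
    "rg-ha-demo",
    "asp-zone-redundant",
    {
        "location": "eastus",
        "sku": {"name": "P1v3", "tier": "PremiumV3", "capacity": 3},
        "zone_redundant": True,  # maps to the ARM zoneRedundant property
    },
)
plan = poller.result()
print(plan.zone_redundant)
```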
So we go back to the question I posed a couple of minutes ago: if you're going to design an application and you need it deployed across all three zones, so that a failure in one or two zones doesn't affect the application, then you have to start thinking about those pieces: the public IP address cross-zone, the application gateway cross-zone, individual VMs deployed into zones, App Service Premium V2 or V3, et cetera. So what we should do in the next video is deploy a solution like this cross-zone, and we can see it in action.
5. *NEW* DEMO: Deploying an HA Zone-Redundant Solution
So we're going to start this demo by creating a resource group. This resource group is going to contain all of our zone redundant resources. Now, even though the resource group can be in any region and contain resources from other regions, I'm going to intentionally create the resource group in the same region as my other resources. I'm picking East US, which, according to the Microsoft documentation, has Availability Zone support. Again, you don't have to create it there, but I'm going to as part of this demo. So we have a resource group. Now, the first thing we're going to create is a scale set, so I'm going to select a virtual machine scale set from the list of resources. We know that the scale set can be deployed across zones. We're going to put this into our resource group and give it a name.
I'm going to call this my new VMs. It's in the East US region, which we know has Availability Zone support. And we'll see here that we do have the ability to either let Azure manage the placement or specify that we want this scale set across all three zones. So this is the zone redundant method of deployment. We're also given this option for orchestration, which affects how the scale set handles its scaling, and we're just going to leave it as uniform. When it comes to security, we're just going to leave the standard settings. I'll use some Windows Server 2019 servers, and this is a pretty small server: one CPU and 3.5 gigabytes of memory. We're going to have to create our Windows credentials. All right, I'm not going to choose the Windows hybrid benefit, and we're not going to add any additional disks.
We can use regular SSD and regular encryption. It's going to create a brand new virtual network for these virtual machines to be part of, which is fine too. We're not going to put this behind an existing load balancer. Now, this initial instance count is interesting, because we're deploying our scale set across three zones, and of course for a VMSS we're going to want at least three servers, one in each zone. We could of course choose fewer; that's up to us. But I'll choose three. Then there's manual scaling versus autoscaling: whether we scale in and out ourselves, have our own automation, or rely on Azure's automation. So we can have, let's say, between three and six instances with scaling-in and scaling-out rules. Do we want to have diagnostic logs? Now, you can see this is the policy for how scaling is balanced; the default is to balance across Availability Zones and fault domains.
And then when you're scaling in, it's going to delete the highest instance ID, or use the newest virtual machine method or the oldest virtual machine method, which, depending on the age of the VM, deletes the newest or the oldest. We'll just leave the default. We're not going to worry about security or Windows updates, I'm going to turn off boot diagnostics, and we're not going to worry about identity or guest OS updates. There's this concept, relatively new for scale sets, of health monitoring. It's similar to a load balancer health probe: you have a probe pointing to, let's say in this case, an HTTP address on port 80, and if it detects that one of the machines isn't responding properly, do we allow automatic repairs, where it just kills off that VM and reinstantiates it?
So this is relatively new. I'm not going to turn this on for now; it's not the point of this video. Next we've got this option to force a strictly even balance across zones. Again, we did set the default scaling to balance across zones; in this case it will actually force the scaling to be balanced, and as it says in the tooltip, scaling will fail if it's not possible to stay balanced. So let's not select this; we're not going to risk scaling failing when it can't create a VM in the zone that we want. The concept of spreading has to do with your virtual machines being spread across as many fault domains as possible. That is the concept of max spreading. With static spreading, it limits it to exactly five fault domains, and if it can't find five fault domains, then scaling fails.
So in the case of Availability Zones, Microsoft recommends that you set max spreading, and you really do need distinct reasons to choose the fixed spreading option instead. All right, we're going to skip over the tags, and when we create this, we're going to end up with a virtual machine scale set which operates across zones. We're going to have at least three instances, and we should expect it to be one instance per zone, because that's our balanced scaling option. We're not forcing that, because in this situation we could find ourselves with a zone that doesn't contain an instance, based on things that got created, deleted, shut down manually, et cetera. We won't touch that.
So I'm going to say create. This virtual machine scale set should be balanced across Availability Zones. And now what we can do is create a zone redundant application gateway for it. So I go into the resource group, say create, and look for application gateway. Remember that the SKU of the gateway has to be the V2 SKU. Let's call this a test gateway. I'm going to put this in the same region as the other resources, Standard V2. We can enable autoscaling, but the important thing here is to enable zone redundancy. What will happen here, with a minimum instance count of zero, is that Azure is going to take care of making sure there are enough instances. We'll put this onto our existing virtual network.
The application gateway, of course, needs its own subnet, so we're going to have to create that. Now, I wasn't paying attention when it created the virtual machine scale set: it's taking up a whole /16 subnet address space, and the entire virtual network is only a /16. So I'm going to have to add additional address space. I'm going to save this, go back to the VNet, and add a subnet. I'm going to call this the app gateway subnet; it doesn't typically need to be huge, but let's just stick with a /24. We can then go back to the gateway and choose that subnet for it. Unlike a network gateway, it doesn't need a specific name, but it does need to be by itself. Now, for the front-end IP address, there is this zone redundant option, and we are creating an application gateway that's zone redundant.
So it's selected by default. For the backend pool, we could add it without targets, or we could say our virtual machine scale set is the target. We need a routing rule; let's just route all traffic on port 80, and the backend target is our backend pool. For the HTTP settings: no affinity, no connection draining, just the default HTTP settings, and no path-based routing. We could have path-based routing, but we're just doing the default, and no tags. Now, it's going to take seven to ten minutes for this to create, but when it's created, we're going to have a publicly accessible IP address pointing to our virtual machine scale set, and it should keep working even if zone one and zone two were both to go down and only zone three remains running. We've set everything up, from the public IP to the application gateway to the virtual machine scale set, for that to be true. So the deployment succeeded. We can quickly check how long it took to get this gateway deployed: around four minutes.
And what we're expecting, then, is a fully zone redundant deployment. We have our zone redundant IP, our zone redundant gateway, and a zone redundant VM scale set. In this way, anyone who comes to this application through this IP address should be able to continue to do so whether zone one or zone two is down, as long as zone three remains, et cetera. So that is a very important point for increasing the availability of your applications.
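If you want to double-check that each piece really did come out zone redundant, here's a minimal sketch using the Azure SDK for Python (azure-mgmt-network and azure-mgmt-compute). The resource group and resource names are hypothetical stand-ins for whatever you used in the demo.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.compute import ComputeManagementClient

credential = DefaultAzureCredential()
subscription_id = "<subscription-id>"
network_client = NetworkManagementClient(credential, subscription_id)
compute_client = ComputeManagementClient(credential, subscription_id)

rg = "rg-ha-demo"  # hypothetical resource group name

# Each of these should report ['1', '2', '3'] if the deployment is fully zone redundant.
print("Public IP zones:   ", network_client.public_ip_addresses.get(rg, "appgw-pip").zones)
print("App gateway zones: ", network_client.application_gateways.get(rg, "test-gateway").zones)
print("Scale set zones:   ", compute_client.virtual_machine_scale_sets.get(rg, "my-new-vms").zones)
```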
Now, one thing that we haven't done is install a web server on these virtual machines, so we can't go to the IP and verify it works. That would be the final step. Again, that's not the point of this video, so I'm not going to go ahead and turn these into web servers and so on, but you can certainly do that if you want to demonstrate that this is working the way we intend. I'm going to delete this resource group now, at the end of the video. I hope that clarifies how Availability Zones work and all the different resources that you need to make zone redundant in order to support that type of deployment.
6. *NEW* High Availability Non-Relational Storage
All right, the next requirement of the exam is to understand how availability works with non-relational databases. Now, when we're talking about non-relational databases within Azure, we're generally talking about things such as Cosmos DB, Azure Redis Cache, and the Azure Storage services, which include table storage, blob storage, and even Azure Files. So in this video, we're going to talk about how high availability applications are designed using these non-relational data stores. First up, we'll talk about Cosmos DB. Cosmos DB is the premier database solution for non-relational data, sometimes called NoSQL data, and it does have some high availability features built in. So even without you having to configure anything specific, you're going to get some of these high availability features. One of these features is that Azure Cosmos DB keeps copies of your data; in fact, it keeps four copies of your data within the zone.
So this is similar to an Azure storage account, which we'll talk about in a second. But Cosmos DB will actually have your primary piece of data and then copy that data into three other nodes within that data center. So when you're running Cosmos DB in a single data center, you've got four copies of everything you write to it stored around it. This happens so that if some sort of failure happens, if one of the nodes has a hardware failure or a power failure, Microsoft will pick the data up from another node seamlessly. You don't have to do anything, and you won't necessarily be notified that one of the nodes failed. This is how they keep their service running. So this is the case where you've not enabled Availability Zones specifically and you're running Cosmos DB out of a single region.
So Azure is giving you a fairly high four nines, 99.99% availability, when you're running Cosmos DB in a single region, for both reads and writes. The downside is that you're really dependent on that zone being up, and if there's any kind of zone level failure, if the data center was to flood or there was a power outage or a loss of Internet to that whole building, then your application could be down, because it's reliant on that single data center. Now, Cosmos DB makes it really easy to replicate your databases into multiple regions. There's a nice interface where you can just pick another region and it'll do all the replication for you. This still gives you 99.99% availability for writes, but it bumps the reads up to 99.999% availability. And then when you're talking about a multi-write, or multi-homed, situation, you've got five nines for both reads and writes. So just by adding another region to your solution, you've increased your availability. This protects you from that single zone going down, because there's a backup region and automatic failover will kick in. Now, let's talk about Availability Zone support. Like we saw with application gateways and virtual machine scale sets, you can deploy Cosmos DB in an Availability Zone mode, and when you do this, even if you're running with Availability Zones in a single region, you get a bump from 99.99 to 99.995, so four and a half nines of availability just by adding Availability Zones in the single region.
Now you're protected against a data center outage, because if you've got a multi-zone deployment and a single data center goes down, you've still got your other two data centers, so you don't lose data and you don't lose availability. Of course, you're still subject to a region level outage, because you're only deployed to the single region. If we now bump that up to multi-region, so you've got not only Availability Zones enabled but also multi-region replication available, then you can see you're still getting 99.995% availability for writes, but you're actually getting five nines for reads, and in the multi-write situation you're getting five nines for both. Now, one thing I've noticed across these examples over the last few slides is that Cosmos DB already has a pretty high level of availability in a single region, 99.99%. So when you're getting into the 99.995 or five nines situation, you're really just talking about fractional improvements in your availability.
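To make those fractions concrete, here's a quick back-of-the-envelope calculation in Python converting the availability figures discussed above into maximum downtime per year.

```python
# Convert the Cosmos DB availability figures discussed above into
# maximum allowed downtime per year, to show what a "fractional" gain buys you.
MINUTES_PER_YEAR = 365 * 24 * 60

for label, sla_percent in [
    ("Single region              ", 99.99),
    ("Single region + zones      ", 99.995),
    ("Multi-region, multi-write  ", 99.999),
]:
    downtime_minutes = MINUTES_PER_YEAR * (1 - sla_percent / 100)
    print(f"{label}{sla_percent}% -> about {downtime_minutes:.0f} minutes of downtime per year")
```

That's roughly 53 minutes a year at four nines, 26 minutes with zones enabled, and about 5 minutes at five nines.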
But when you're running a highly available application and you really do need to protect against data centers or regions going down, that can make a lot of difference. So Microsoft recommends Availability Zone support enabled together with multi-region, multi-write deployment. That is, if your main requirement for Cosmos DB is high availability, you have to turn on all three of these options in order to get the highest level of availability. Now, there's a cost to it, of course. Just turning on Availability Zone support within a single region increases the cost by 25%, so 1.25 times the cost of Cosmos DB in that single region. It's a small price to pay.
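Here's a hedged sketch of what turning those options on can look like with the Azure SDK for Python (azure-mgmt-cosmosdb). The account name, resource group, and regions are hypothetical, and automatic failover is shown for completeness since it mainly matters for single-write accounts, as discussed next.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.cosmosdb import CosmosDBManagementClient

credential = DefaultAzureCredential()
cosmos_client = CosmosDBManagementClient(credential, "<subscription-id>")

# Hypothetical account spanning two regions, with zone redundancy in each
# and multi-region writes enabled.
poller = cosmos_client.database_accounts.begin_create_or_update(
    "rg-ha-demo",
    "cosmos-ha-demo",
    {
        "location": "eastus",
        "database_account_offer_type": "Standard",
        "locations": [
            {"location_name": "East US", "failover_priority": 0, "is_zone_redundant": True},
            {"location_name": "West US 2", "failover_priority": 1, "is_zone_redundant": True},
        ],
        "enable_multiple_write_locations": True,  # multi-write across the regions
        "enable_automatic_failover": True,        # relevant when you stay single-write
    },
)
account = poller.result()
print([loc.location_name for loc in account.write_locations])
```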
And you do get the protection against a zone going down. Of course, if you're running in two, three, or more regions, then you increase your cost further: double, triple, or more, since each region basically adds to your cost incrementally. Now, one of the other recommendations: if you are running Cosmos DB multi-region with a single write region, then you should enable automatic failover. You do have the option of controlling the failover yourself, but you should enable automatic failover so that Azure detects when the first region becomes unavailable and automatically moves everything over, with the second region running as the primary. Now we'll switch over from talking about Cosmos DB to talking about Azure Redis Cache.
Now, Azure Redis Cache has also been designed for availability, so there is some availability built in. Azure Redis Cache runs on the concept of nodes, and nodes are VMs. So you've got dual node options, which means two VMs, and we'll see that in a second. There are sort of three levels of availability. There's the standard mode, which is 99.9% availability, running on two VMs replicated in a single data center with automatic failover if one of them was to fail. This basically protects you from node level failure: a particular rack, a particular server, a particular power supply. Zone redundancy increases your availability to four nines, and it uses multiple nodes across Availability Zones with automatic failover. And then when you get into geo-replication, that's the five nines support, and now you're talking about your Redis Cache running in multiple regions. Right now you have to control that failover yourself; there are just too many implications of Redis Cache automatically failing over between regions.
So they want you to be able to control when that happens. Here's a diagram of the standard configuration, which is two nodes running in the same data center. We can see there's the primary node and the replica node, and it's automatically replicated. When we do need to fail over, the load balancer basically takes care of sending traffic to the replica instead of the primary; the primary does all the work until failure. Now, when you get into zone replication for Redis, you need to be on the Premium or Enterprise tiers, so this doesn't run on the Standard tier.
Here's an example of zone level replication. You can see there are three Availability Zones and four nodes. In Availability Zone One, there's the primary node that's doing all the work until failure, and you also have a backup replica in Zone One that protects you against node level failure. Then Zones Two and Three exist so that you can handle zone level failure.
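As a rough sketch of how that zone spread is requested, here's what it can look like with the Azure SDK for Python (azure-mgmt-redis). The cache name, resource group, and replica count are hypothetical assumptions, and the extra replicas are there so each zone has a node to hold.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.redis import RedisManagementClient

credential = DefaultAzureCredential()
redis_client = RedisManagementClient(credential, "<subscription-id>")

# Zone redundancy needs the Premium (or Enterprise) tier; names and sizing are hypothetical.
poller = redis_client.redis.begin_create(
    "rg-ha-demo",
    "redis-zone-redundant",
    {
        "location": "eastus",
        "sku": {"name": "Premium", "family": "P", "capacity": 1},
        "zones": ["1", "2", "3"],      # spread the nodes across all three availability zones
        "replicas_per_master": 2,      # extra replicas so there is a node for each zone
    },
)
cache = poller.result()
print(cache.zones)
```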
Now, the highest level here is geo-replication, which requires the Premium or Enterprise tiers. Geo-replication is primarily for disaster recovery, and like we said, right now it has manual failover. The reason is, as I said a second ago, there are just too many implications of Redis Cache, which is supposed to be a super fast in-memory cache that doesn't store data to disk, running in a different region from your application. If you had your application running in East US and your Redis Cache running in West US, you'd really be slowing things down. So they want you to have control over that.
When you get into what's called Enterprise geo-replication, there are a lot more features, and right now this is in preview mode, so it isn't really covered by the exam. But keep an eye on it: enterprise level geo-replication is going to be the future of the highest availability across regions, though it's not quite ready for production use. Next, and lastly, we'll talk about Azure Storage. Azure Storage also has built-in high availability features. When you're creating a storage account, you choose the redundancy, and two of the options are locally redundant storage and zone redundant storage.
They both keep three copies of your data. Locally redundant storage keeps those three copies in the same physical location, in the same data center. Zone redundant storage copies your data into three Availability Zones in that same region. So here's a diagram of zone redundant storage: you can see one copy of your data in each of the three data centers. Now, like the other two solutions, zone redundant storage does not protect you from regional level outages. If you want that, you need geo-redundant storage, which is GRS. And the interesting thing about geo-redundant storage is that GRS is basically LRS running in two locations.
So it's locally redundant storage in one region and locally redundant storage in another. You don't get the benefits of zone redundant storage, but you do get the benefit of being across two regions. With geo-zone-redundant storage (GZRS), you get zone redundant storage in the primary region and locally redundant storage in the secondary region; it's not zone redundant storage in both locations. Now, with GRS and GZRS, you don't have read access, or any access, to that secondary region. You have to trust that your data is replicated to the secondary region, and you don't actually get access to it until a failure happens. Once you do a failover from the primary to the secondary region, then you get access to that data. Here's a diagram showing GZRS, where you've got zone redundant storage in the primary and locally redundant storage in the secondary. Now, if you do need read access to that secondary location, that's read-access geo-redundant storage, RA-GRS and RA-GZRS, and then you get an endpoint that is read-only. It is important to keep in mind that these updates are not instantaneous, so we do have to start to involve this thing called the last sync time, so that when you're dealing with a secondary region you'll know how old the data is. It could be a few seconds, it could be a few minutes; there's no SLA for how quickly it syncs in GRS.
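Here's a minimal sketch, using the Azure SDK for Python (azure-mgmt-storage), of creating an RA-GZRS account and reading back that last sync time. The account and resource group names are hypothetical, and the geo-replication stats can take a while to populate on a brand new account.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

credential = DefaultAzureCredential()
storage_client = StorageManagementClient(credential, "<subscription-id>")

# RA-GZRS: zone redundant in the primary region, LRS copy in the secondary,
# with a read-only secondary endpoint. Use Standard_GZRS if you don't need read access.
poller = storage_client.storage_accounts.begin_create(
    "rg-ha-demo",
    "stgzrsdemo001",   # hypothetical, globally unique account name
    {
        "location": "eastus",
        "kind": "StorageV2",
        "sku": {"name": "Standard_RAGZRS"},
    },
)
account = poller.result()

# The geo-replication stats expose the last sync time, i.e. how far the secondary lags.
props = storage_client.storage_accounts.get_properties(
    "rg-ha-demo", "stgzrsdemo001", expand="geoReplicationStats"
)
if props.geo_replication_stats:
    print("Last sync time:", props.geo_replication_stats.last_sync_time)
```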
Microsoft's web page says within 15 minutes; it's called eventually consistent, and we know this kind of consistency from Cosmos DB as well. And so the availability goes from 99.9 to 99.99 with the read access option. What's really interesting is that 99.9% availability applies to all of the locally and zone redundant options; the SLA really doesn't differentiate too much, even though we know the actual availability will be higher, and only on the read-access GRS and GZRS do we get an additional fourth nine. When we're talking about storage, it's important to differentiate between the different services, and so Azure Files does not support this read access mode.
So if you're storing stuff in a file share, you can have geo-redundant storage, but you're not going to have read access to the secondary location. Also, even to get geo-redundant storage, the file share has to be smaller than five terabytes. So if you have a really big file share, then it can't be geo-replicated by Azure. Something like managed disks is starting to get outside the scope of this, but managed disks are only locally or zone redundant. Here's a diagram showing all of the different storage services we've discussed.
General Purpose V2 is available in all of them. General Purpose V1 does not support zones; not that many people are using it, but if you do find yourself with an exam question talking about General Purpose V1, it is not a zone feature. Also, premium block blobs are local and zone only, and as we saw, file shares are also local and zone only, except for shares below a certain size, which can be geo-replicated. So this diagram shows you that the more replication you want, the more you're being funneled into a General Purpose V2 account. And that's a summary of the non-relational options for Azure availability.
7. *NEW* High Availability Relational SQL Database
So in this video we're going to talk about high availability when it comes to relational databases within the context of this exam. Relational databases include the SQL Server engine databases, which are Azure SQL Database, SQL Managed Instance, and SQL Server in a VM. The other types of relational databases, like MySQL and PostgreSQL, are not covered by the exam even though they are managed services, and something like Azure Synapse Analytics is also not on the exam. So let's talk about SQL Database. That's the primary relational database that Microsoft obviously recommends, and of course SQL Database is designed to be highly available by default.
And so if you just accept the standard defaults when you're deploying a SQL database, you're going to get a 99.99% uptime guarantee. Microsoft doesn't want your SQL database going down. Here's a diagram showing a SQL database running in a single region. You can see there's a cluster of gateways at the top that act as a load balancer, and behind that is the compute, which is the primary replica. It has tempdb on local storage, and the data files and the log files are actually stored in a premium storage account using LRS, which is locally redundant storage. So when you're running your SQL database in this normal configuration, you're actually interacting with a type of virtual machine behind the scenes running the SQL instance, with the data files stored in a storage account. You can see here that there are also some replicas, backup nodes if you will.
And if there is a failover that needs to happen, Azure takes care of this for you: they basically fail over to one of those spare nodes, and you don't even necessarily need to notice it. The data files are on the storage account, so you're not going to lose any data, because it's just one data source. Now, this has the same challenges that a storage account running LRS has, which is that it's susceptible to zone level outages. So if a single Availability Zone were to go down and it happened to be where your SQL database is stored, you're going to lose access to that data for that period.
So even though the compute has redundancy, the storage does not have zone redundancy. You've got the three copies of your data within the single data center, but outside that data center it's not being stored. You can see the backup files are stored with various levels of redundancy. This is what you're running when you're running a SQL database in Basic, Standard, or General Purpose. Remember, when you deploy a SQL database you get the choice between the old Basic/Standard/Premium model and the new vCore model, where you select the CPUs and the memory separately. Now, it's important to remember that, just like Cosmos DB, you can do geo-replication with SQL databases.
Now, SQL Database isn't meant to be multi-homed, where you can write anywhere in the world, but it does have read access across the world, and you can do this geo-replication and have your secondary databases elsewhere. It's also good to know that you're actually not increasing your SLA when you do geo-replication. I'm going to pull in the SLA for SQL Database, and we can see that when you're running SQL Database in the Business Critical tier with geo-replication, what you're getting is not increased availability; you're getting a recovery point objective of 5 seconds.
So you're going to lose a maximum of 5 seconds of data when you have geo-replication, and the recovery time objective is 30 seconds, so it takes around 30 seconds to be back to operational. You're not actually gaining availability in percentage terms, but you are gaining recovery objectives, which you should keep in mind as well. Now, there are, in preview mode (so this won't be on the exam), zone redundant options for General Purpose SQL Database. And you can see here we've got a similar configuration: we've got our cluster of gateways, the compute layer with tempdb, and the data and log files. But now the data and logs are running on zone redundant storage, and you have your compute failing over across three zones.
And so now you've got a zone redundant configuration running in a single region, using zone redundant storage and actually deploying your nodes into the separate zones. This is in preview mode. Now, next up from the General Purpose tier is either the Premium tier (Basic and Standard become Premium) or the Business Critical tier, and this is where you're really getting into the higher availability options. So this is a Premium or Business Critical type of configuration running in a single region. We can see here that the control ring is the same, but the way your data is stored is completely different: the data and the log files are actually stored on local storage for these VMs.
And so now you're not using a storage account for your data, you're using local storage, and there are obviously going to be some performance improvements from that. But now you have the additional challenge of that data needing to be replicated across multiple replicas. There's this concept, again given to you, of an Always On availability group; we'll talk about that when we talk about virtual machines. But basically the failover is set up similar to the way you would do it for SQL Server in a VM, with the data replicated between the primary and the replicas. The backups remain unchanged in the Business Critical or Premium options.
Now, there is a zone redundant option for Premium and Business Critical as well, and as you would expect, now we introduce the concept of Availability Zones: the replicas are spread across the three zones, with two replicas in the primary zone so that you've got node failure covered in the same zone, and then zone failure covered by the other zones. So it's a very logical setup for zone redundancy at the Premium and Business Critical tier. If we bring in the SLA, we can see that doing it this way, Business Critical or Premium in zone redundant mode, increases your availability to four and a half nines, 99.995%, and that's what we see here. Now, finally, there's this concept of Hyperscale.
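Before we get to Hyperscale, here's a hedged sketch of enabling that zone redundant option programmatically with the Azure SDK for Python (azure-mgmt-sql). It assumes a logical server already exists, and the server, database, and resource group names are hypothetical.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.sql import SqlManagementClient

credential = DefaultAzureCredential()
sql_client = SqlManagementClient(credential, "<subscription-id>")

# A Business Critical database with its Always On replicas spread across availability zones.
poller = sql_client.databases.begin_create_or_update(
    "rg-ha-demo",
    "sql-ha-demo-server",   # hypothetical, pre-existing logical server
    "appdb",
    {
        "location": "eastus",
        "sku": {"name": "BC_Gen5_2", "tier": "BusinessCritical"},
        "zone_redundant": True,
    },
)
database = poller.result()
print(database.zone_redundant)
```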
Now, Hyperscale is, in my view, a completely separate configuration. What you end up with here is your compute running with an in-memory cache instead of reading directly from data files. There's a cache element here, and there are servers managing the cache. The data is stored in files in a storage account, but now you have caching servers moving data between the SQL Server's memory and the files in the storage account. So it's a completely different setup, and this is called Hyperscale. To me, this is for when you need really quick reads, so stuff is in memory and not on disk; that's why the architecture has to be different. Now, SQL Managed Instance does share a lot of concepts with SQL Database. The primary benefit of SQL Managed Instance is two things.
One is the increased compatibility with SQL Server running on premises; this is the closest compatibility between those two worlds. The second thing is, of course, that Azure is managing this for you, so you don't have to be a SQL Server operations expert. Running SQL Server in a VM requires you to have the operational skills: the patching, operating system updates, backups. With Managed Instance you get some of the benefits of SQL Database while maintaining compatibility with SQL Server running on premises. But what you're not getting is the Availability Zones configurations; there are no zone redundant configurations for SQL Managed Instance. So if you see a test question about needing to survive zone failure, where if two zones were to fail but one zone is still running your database still needs to be there, SQL Managed Instance is not a zone failure solution. The other thing it doesn't support is the Hyperscale concept, where reads come from an in-memory cache as opposed to from disk.
That's not going to be a SQL Managed Instance solution. Finally, SQL Server in a VM, as we just said, is effectively running SQL Server as you would on premises, but instead of running it on a physical server, you're running it in a virtual machine. As we talked about a couple of videos ago, virtual machines are zonal, so if you're using Availability Zones, you're deploying each virtual machine into a specific Availability Zone. This means that if you want zone redundancy, you're going to have to deploy multiple SQL Servers into multiple VMs. Now, as we mentioned earlier with availability groups, you're going to have to set up a Windows failover cluster so that Windows itself can detect when a server isn't working and fail over to another server, and the availability group is what has SQL Server manage those requests and fail over to another SQL Server. Here's a sample diagram of running SQL Server in a VM in a single region: you've got a load balancer and multiple Windows servers, and in this example the availability group manages the synchronization of the data. In a two node configuration, there needs to be a third node to decide who has failed, this sort of outsider; they call it quorum.
When you have an odd number of servers, the majority of servers decides who's right, and so when you've got an even number of servers, you need another server to make a quorum. That's why there's a witness: it doesn't serve as a SQL Server, but it acts as a judge for which of these servers should be made the primary in a failover situation. Now, when you're talking about cross-region support, this is a bit of an odd diagram, because you're not setting up multiple servers in the secondary region.
So you've got your primary server, your failover server in the same region, and your witness to decide between the two. If you've got a region level failure, then you've got a backup running in another region, but in this diagram you don't have two servers with a witness there; it's just a single server for when a region is down. This is an emergency configuration, I suppose. So that's an example of multi-region SQL Server in an Azure VM. As we saw in this video, a lot of the same concepts from non-relational data apply to relational data, whether we're talking about SQL Server in a VM or Azure SQL Database more broadly. You do have zone redundant options, but there's also a lot of history, with Windows failover clusters and SQL Server availability groups, that carries over from the on-premises world into the cloud when you're dealing with things like this.