AZ-304 Microsoft Azure Architect Design – Design for High Availability
1. Application Redundancy
So we’ve been talking in this section about business continuity. Part of business continuity is being able to recover from a disaster, of course, and we’ve talked about that. But another aspect of business continuity is not suffering any downtime in the first place. If a particular component fails, a particular server fails, or even a particular region fails, and your application carries on so that end users don’t even notice there was a problem, you’ve got a highly available application. It’s designed for resiliency, and it’s part of a good business continuity strategy.
Now, Microsoft has outlined a number of areas of redundancy for this exam, and one of them is called application redundancy. When you’re designing your application, this is the great thing about the cloud: you’ve got so many different choices when it comes to services, from compute services to networking, load balancing, database, and storage services.
There are multiple types of services for each one, and each is going to have a different availability profile and different pricing. So if you are going to store data, do you store it in an Azure storage account as Table storage, or do you store it in a SQL database service? These are going to have different availability SLAs and different availability profiles. One of the key aspects of availability is simply avoiding any single point of failure. If you have an application with multiple load-balanced web servers and multiple load-balanced application servers, but a single database behind all of that, then when that database goes down your whole application goes down: the database is a single point of failure.
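To put numbers on that, here’s a quick back-of-the-envelope composite availability calculation. The SLA figures below are illustrative placeholders, not Azure’s published numbers; the point is that availabilities of components in series multiply, so the single database dominates.

```python
# Composite availability: independent components in series multiply,
# so the weakest link dominates. Figures are illustrative only.
web_tier = 0.9995   # e.g. load-balanced web servers
app_tier = 0.9995   # e.g. load-balanced app servers
database = 0.999    # single database instance - the weak link

composite = web_tier * app_tier * database
print(f"Composite availability: {composite:.4%}")  # ~99.80%

# Downtime implied per 30-day month:
minutes_per_month = 30 * 24 * 60
print(f"Expected downtime: {(1 - composite) * minutes_per_month:.0f} min/month")
```

Even though each tier looks respectable on its own, the composite figure is worse than any individual component, and improving the database’s redundancy moves the needle the most.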
And you could even say that if all components of your application are running in a single region, say East US, then that region itself is a single point of failure: when the region goes down, your application goes down. So the key is, again, redundancy: spread yourself out geographically, and make sure there’s always more than one copy running of any server or networking component that you control.
Now, there are many types of load balancing within Azure. Any kind of redundancy means multiple servers sharing the load, and you usually need some type of load balancer in front of them. A virtual machine scale set has load balancing built in, and Azure App Service also has load balancing built in. There are public load balancers and private load balancers. There’s Traffic Manager, and there’s the Azure Front Door service. All of these services exist at different levels to provide load balancing.
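All of these load balancers decide where to send traffic based on health probes against each backend instance. As a minimal sketch (the port and path here are arbitrary choices, not anything Azure mandates), this is the kind of lightweight endpoint a probe would poll:

```python
# Minimal health-probe endpoint a load balancer could poll.
# A 200 response marks the instance healthy; anything else
# (or a timeout) takes it out of rotation.
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            # In a real app, check downstream dependencies here
            # (database reachable, disk not full, etc.).
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"OK")
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```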
Now, this course, again, is more about strategy and choosing the right service; we’re not going to get into how to set everything up and all the settings. One thing that load balancing gives you is the concept of autoscaling. You need a load balancer in front of a virtual machine scale set so that you can easily add additional virtual machines and have traffic automatically sent to them. So whether you’re using App Services or VMSS, for any type of compute you’re going to want a load balancer that gives you scaling. Some compute services, like Azure Functions or Logic Apps, don’t give you control over the servers or the scaling, so not every service you pick has a scaling option you manage yourself. But make sure you’re thinking about scaling; it’s almost a must for a highly available application, as the sketch below illustrates.
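Conceptually, autoscaling is just a rule engine over metrics. This is not how you’d configure it in Azure (there you declare autoscale rules on the scale set or App Service plan), but a sketch of the decision logic shows what those rules encode:

```python
# Conceptual sketch of an autoscale rule: scale out when average CPU
# is high, scale in when it is low, within fixed instance bounds.
# Azure evaluates rules like this for you; you only declare thresholds.
def desired_instance_count(current: int, avg_cpu_percent: float,
                           min_instances: int = 2,
                           max_instances: int = 10) -> int:
    if avg_cpu_percent > 75 and current < max_instances:
        return current + 1   # scale out
    if avg_cpu_percent < 25 and current > min_instances:
        return current - 1   # scale in
    return current           # hold steady

print(desired_instance_count(current=3, avg_cpu_percent=82))  # -> 4
print(desired_instance_count(current=3, avg_cpu_percent=15))  # -> 2
```

Note the floor of two instances: for high availability you never want a scale-in rule to leave you with a single point of failure.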
Now, another aspect of availability is that there may be malicious actors out there trying to take you down. You’re not just protecting against hardware failures, Internet failures, or power failures, but also against denial-of-service attacks, where millions of computers coordinate to try to bring down a single endpoint. That’s an attack designed to reduce your availability, to block legitimate customers from accessing your services. Microsoft includes basic DDoS protection for free.
But there is an advanced DDoS protection tier. So if you are likely to be the target of an attack, you’re going to want to look into paying for that protection. It’s a little pricey, if I’m being honest, but if you are the target of an attack, that may be the price you have to pay. And if you’re currently under a denial-of-service attack, moving your application behind a DDoS protection service is one mitigation strategy while the attack is under way. Separately, making sure that your VMs are distributed among different pieces of hardware within a region, and across data centers, is also an aspect of availability.
Microsoft has a service level agreement for virtual machines, and it depends on whether they run as standalone machines or in an availability set, which distributes them across servers and hardware racks so that no single hardware or power failure takes all of your virtual machines down. You can also distribute them across data centers, which is done with availability zones.
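As a rule of thumb, you can translate an SLA percentage into allowed downtime. The tiers below reflect commonly cited Azure VM SLA levels (single instance with premium storage, availability set, availability zones); treat them as assumptions and check the current SLA documentation for exact figures.

```python
# Translate an SLA percentage into maximum allowed downtime.
# Tiers reflect commonly cited Azure VM SLA levels; verify against
# the current published SLA before relying on them.
SLA_TIERS = {
    "Single VM (premium disk)": 0.999,
    "Availability set":         0.9995,
    "Availability zones":       0.9999,
}

MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200

for name, sla in SLA_TIERS.items():
    downtime = (1 - sla) * MINUTES_PER_MONTH
    print(f"{name}: {sla:.2%} SLA -> up to {downtime:.0f} min downtime/month")
```

Going from 99.9% to 99.99% cuts the allowed downtime from about 43 minutes a month to about 4, which is why the deployment choice matters.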
2. High Availability for Essential Components
So we’re continuing to talk about high availability. In particular, we’re going to look at what high availability is, which resources within Azure require your active participation to achieve it, and which ones Microsoft Azure actually does the work for to create HA. So what is a highly available service? It’s one where your application continues to run exactly the way you intended, in a healthy state with no significant downtime, even in the face of hardware failure or an attack.
Now, to get high availability, we’ve said this a few times: it requires multiple servers, availability zones, load balancers, and making sure your data and files are replicated to multiple locations to handle that kind of traffic. But when you look at your application as a whole, it’s not that every single component, no matter how big or how small, requires your attention for high availability. Some parts are essential: they’re public facing, they’re what the customers are going to see.
And other parts, if they were down for an hour, nobody would know, right? Now, this is a hypothetical example, I should always point that out. So let’s say you have an application running on the web, and your end-user customers, who actually pay you money, log in to use it. You could argue, and it’s a fair argument, that it requires high availability. Those customers expect it to work 24 hours a day, seven days a week, and if they log in and sometimes it’s working and sometimes it’s not, that would be a major blow to your business.
It would be financially disastrous, and the building would figuratively be on fire, if your application stopped working every couple of days. So you want those customer-facing apps to be highly available. Like I said, this is my opinion; this is hypothetical. What you would do is implement messaging within your application, using a highly available service, like a Service Bus queue, for application components to talk to each other. You would use highly available storage; we talked about the SQL database service versus Table storage. You would go with those highly available managed solutions over the ones you have to manage yourself.
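As a sketch of that messaging approach, here’s roughly what sending work through a Service Bus queue looks like with the Python SDK. The connection string and queue name are placeholders, and the payload is made up for illustration.

```python
# Decouple components with a Service Bus queue: the sender doesn't
# need the receiver to be up right now, which removes a hard
# runtime dependency between the two components.
# pip install azure-servicebus
from azure.servicebus import ServiceBusClient, ServiceBusMessage

CONN_STR = "<your-service-bus-connection-string>"  # placeholder
QUEUE_NAME = "orders"                              # placeholder

with ServiceBusClient.from_connection_string(CONN_STR) as client:
    with client.get_queue_sender(queue_name=QUEUE_NAME) as sender:
        sender.send_messages(ServiceBusMessage('{"order_id": 42}'))
```

If the receiving component is briefly down, the message simply waits in the queue, so one component’s outage doesn’t cascade into the other’s.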
Now, on the flip side, suppose you had a batch job that ran once a week, collated the data, and uploaded it into a data warehouse. Does that need to be highly available, or, instead of running at one o’clock, can it run at two? Well, sure it can. Internal employees would have to understand if there’s some problem with the server and it needs to be rebooted. So you can argue that back-end job does not need to be highly available.
Now, if it’s a payroll job, or it’s collecting payments from customers, maybe there’s some argument there. Or take the admin app: if you’ve got an application that only internal employees can access, and you say, “Listen, Friday at 12:00 I have to reboot the server; it’s going to be down from 12:00 to 12:30,” you send an email to everybody and people can deal with that, right? So that’s not necessarily held to the same standard of availability.
3. Storage Types for High Availability
So we talked about compute, but what is highly available storage? What storage types should you choose, and what should you avoid, if you’re aiming for high availability? First of all, we should say that the Azure storage account, whether it’s backing unmanaged or managed disks, is highly durable by default. Azure has designed it such that when you send it a file and it gives you confirmation that the file was received, there is a ridiculously high standard for durability: eleven nines, 99.999999999 percent, for locally redundant storage.
One time I sat down and did a calculation which said that if you were to write a file to Azure Storage once every second, it would take something like 7 million years for Azure to lose one of those files. Once Azure receives a file, it writes three copies of it within the local region; if the account is globally redundant, there are six copies of the file around the world. Azure is not going to lose your file; it’s highly durable. And it comes with either local, zone, or global redundancy.
With three copies or six copies, those files are pretty safe. Another way to ensure you have highly available storage is by using the failover option. If you deploy your storage in a globally redundant manner, you have a primary location and a secondary location, and if something were ever to happen, say your primary location became unavailable, you could initiate a failover so that your secondary location becomes your primary location.
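In fact, with read-access geo-redundant storage (RA-GRS) you don’t have to wait for a full failover just to read: the secondary region exposes a read-only endpoint at `<account>-secondary.blob.core.windows.net`. Here’s a sketch of that fallback pattern with the azure-storage-blob SDK; the account name, credential, container, and blob names are placeholders.

```python
# Read-fallback pattern for RA-GRS storage: try the primary endpoint,
# and fall back to the read-only secondary if the primary is unreachable.
# pip install azure-storage-blob
from azure.core.exceptions import AzureError
from azure.storage.blob import BlobClient

ACCOUNT = "mystorageacct"                         # placeholder
PRIMARY = f"https://{ACCOUNT}.blob.core.windows.net"
SECONDARY = f"https://{ACCOUNT}-secondary.blob.core.windows.net"
CREDENTIAL = "<account-key-or-token>"             # placeholder

def read_blob(container: str, blob: str) -> bytes:
    for endpoint in (PRIMARY, SECONDARY):
        try:
            client = BlobClient(account_url=endpoint,
                                container_name=container,
                                blob_name=blob,
                                credential=CREDENTIAL)
            return client.download_blob().readall()
        except AzureError:
            continue  # endpoint unreachable; try the next one
    raise RuntimeError("Blob unavailable from both endpoints")
```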
Failover can be very handy for highly available storage. But the farther away you store files, the higher the chance of latency or even data loss. Say a user writes data into a Blob account with globally redundant storage, and that file is replicated halfway around the world: the farther the file has to travel, the longer the latency. And if you do a failover in between, or the region becomes unavailable, those milliseconds start to count a little bit more.
For example, during a regional outage, your secondary region might be a few hundred milliseconds out of sync with the primary region. You can initiate the failover yourself; it’s a manual initiation that, at the time of this course, was in preview. And suddenly you’re using the other region as your primary. There is a tiny, tiny risk of losing the most recent writes when you do a failover. The other thing that’s recommended for high availability is backups: you keep backup copies of your servers and everything essential, so they can be restored. Another piece of a high availability strategy is moving files around with the AzCopy command-line utility. If you need to move files from one region to another, you can run AzCopy and it will copy them. So if you’ve got your log files in one blob container and you want to make sure there’s a backup of them somewhere else, you can certainly do that.
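As a sketch, here’s one way to script that kind of copy by invoking AzCopy from Python. The account names, container names, and SAS tokens are placeholders, and it assumes the azcopy binary is already installed and on your PATH.

```python
# Copy a container's blobs from one storage account to another with
# AzCopy, e.g. as a scheduled backup of log files to a second region.
# URLs and SAS tokens below are placeholders.
import subprocess

SOURCE = "https://srcacct.blob.core.windows.net/logs?<source-sas>"
DEST = "https://dstacct.blob.core.windows.net/logs-backup?<dest-sas>"

subprocess.run(
    ["azcopy", "copy", SOURCE, DEST, "--recursive"],
    check=True,  # raise if the copy fails, so a scheduler can alert on it
)
```

Run on a schedule, a job like this gives you an independent copy of the data in a second account or region, on top of whatever redundancy the storage account itself provides.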