AZ-304 Microsoft Azure Architect Design – Design a Data Archiving Strategy

  • January 17, 2023

1. Storage Account – Data Archiving and Access Tiers

So, continuing on with business continuity, in this video we’re going to talk about something a little different, and that’s the concept of data archiving. Now, data archiving is not the same as data backups, although you might want to keep your data backups for long periods of time. You might want to keep database backups for a year, two years, or five years. But data archiving could be anything, not just database backups. Now, we’ve just been talking about Azure Site Recovery and the ability to get your site recovered from a disaster very quickly. Being up and running again within 20 or 30 minutes obviously has huge company value, and we’re not taking away from that. But data archiving is something different. This is the concept where, whatever type of file you have, you have to keep that file somewhere accessible for a longer period of time. If we’re talking about backups, it could be your virtual machine or your physical machine backups.

And you have a backup strategy that means you keep 30-day-old backups, 60-day-old backups, 180-day-old backups, and you never know what you’re going to need them for. But you’ll be thankful that you can go back in time and compare what the database was like back then to what it’s like today, as a way of trying to recover from some other type of disaster, maybe corruption or something. So you need these files to be stored, but you don’t need them immediately. I worked at a company one time where we had a significant number of files coming in from external sites every day. By FTP, files were just coming in and coming in.

We would process them, we’d ingest them into a database, and that would go into our systems and then finally into an application that we were running. But we used to keep those raw data files that were sent in by the external party, just in case. You never know: if the imports failed for a couple of days and you’ve got the files there, you don’t have to contact the other company and ask them to send the files again.

Having the files ready is good from an IT perspective in terms of being able to recover from some sort of process failure. It could also be a legal requirement, where you want to look back at the files that were sent six months ago because of some dispute that you’re having. Being able to call those files up and say, yeah, we never deleted them, they’re sitting right there, could be helpful for many other processes. There’s also accounting, governance, tax, and lots of other reasons why you have to keep files. GDPR and other data privacy regulations might require you to keep a record of the permissions that you got when users agreed to be messaged by you; you have to keep track of that, and it has to be kept in a reliable spot. So that’s archival data: files that you have to have, but you don’t need right away. They’re not transactional, they’re not immediate.

Now, within the Azure storage account, there are four levels of performance. There’s the Premium tier, which is a special tier above the regular tier that is a lot faster, with single-digit millisecond response times. There’s the Hot tier, which is the default. That means you write the file, you read the file; it’s just the normal level of performance. And there are these tiers that we might not know much about, the Cool and the Archive tiers. In this video, we’re going to talk about everything below Hot, which is Cool and Archive. Now, when you go into Cool, and finally into Archive, there are significant cost savings for the storage of the files. When we’re looking at Hot storage, that costs about two cents per gigabyte per month to store. Cool storage is about one cent per gigabyte, and Archive storage is about one fifth of a cent. You really go down in price quite significantly when you go from Hot to Cool to Archive. So the Archive tier, like I said, is around 90% cheaper than the Hot tier. Now, that comes at a price. Like the last slide said, it’s cheaper to store, but it’s more expensive and more difficult to access.
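To make those numbers concrete, here’s a rough sketch in Python that uses only the ballpark per-gigabyte prices quoted above (two cents, one cent, and one fifth of a cent). These are illustrative figures from the video, not current Azure pricing, and the function names are made up for this example.

```python
# Ballpark storage prices from the video, in US dollars per GB per month.
# Illustrative only; real Azure prices vary by region and change over time.
PRICE_PER_GB = {"Hot": 0.02, "Cool": 0.01, "Archive": 0.002}


def monthly_storage_cost(gigabytes: float, tier: str) -> float:
    """Return the monthly storage cost in dollars for the given tier."""
    return round(gigabytes * PRICE_PER_GB[tier], 2)


def savings_vs_hot(tier: str) -> int:
    """Return the storage savings relative to Hot, as a whole percentage."""
    return round((1 - PRICE_PER_GB[tier] / PRICE_PER_GB["Hot"]) * 100)


for tier in PRICE_PER_GB:
    cost = monthly_storage_cost(1000, tier)
    print(f"{tier}: ${cost:.2f}/month, {savings_vs_hot(tier)}% cheaper than Hot")
```

For 1,000 GB, that works out to roughly $20, $10, and $2 a month. This covers storage only; it deliberately ignores read, write, and retrieval charges, which go up as the storage price goes down.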

2. Access Tier Requirements

So, continuing on with storage accounts and data archiving, let’s look at the reasons why you would want to use it. What are some of the prerequisites? What do you have to have in place before you would consider using the Archive tier? Now, think back to the older days. If you’re young, you might not even know this, but back in the day, companies used to back up their servers to a tape drive, right? There’d be big spinning reels of tape; maybe you saw this in an old movie. They would have to remove that tape from the tape player and store it somewhere. And if they ever needed those files, they would have to go search through the closet, find the tape they’re on, load the tape up, and then ingest the data from that tape.

So you can look at the Archive storage tier as something similar to that, where it’s not easily accessible, but it is accessible if you really need it. These are for files that, for whatever reason, you need to keep, but you never will need at a moment’s notice. You will always have a little bit of advance notice to access them. That’s for the Archive tier specifically. Now, the tiering, Hot, Cool and Archive, is only available in newer storage accounts, and that is general-purpose v2 primarily. It’s also available in a Blob storage account. It is not available in general-purpose v1. It’s pretty easy to migrate from general-purpose v1 to v2, and there are a lot of good reasons to do that for new features, especially something like this. Microsoft has said that, in the future, they will be adding new features to general-purpose v2 that they won’t even add to a Blob storage account. So I think that’s eventually going away and all storage accounts will be v2 and above. Now, by default, when you add a new file to a storage account, it goes into the Hot access tier.

The Hot access tier is just the normal tier, the regular tier. You pay the regular rate to store the file and a pretty decent rate to retrieve it; every access, write or read, costs you something as well. Now, as we said in the last video, when you go down to the Cool tier, you’re actually saving about 50% on storage. So there’s quite a bit of savings just in going from Hot to Cool. But access costs you more: it’s about five times more expensive to read those files. And so you wouldn’t want to store files in the Cool tier that you need to access frequently, like several times a day.

But think of something like old family photos. If you were storing your old family photos in the cloud, you might want to put them in a Cool access account, because you’re not looking at your old family photos every single day. And hey, you wouldn’t mind paying a couple of cents more when you finally get around to wanting to see them, because you’ve saved so much on the storage. As I meant to say on the last slide, you can actually set your account to default to Cool. So what does that mean? It means that any new files added to the storage account come in at the Cool tier, and that just becomes the default. Hot and Cool are the only two that you can choose as the default. The Archive tier can never be a default. You always have to individually select a file and make it an archive.
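The default-versus-explicit rule can be written down as a tiny model. This is only a sketch of the behaviour described above, in Python, with made-up names; it is not an Azure API.

```python
from typing import Optional

ACCOUNT_DEFAULT_TIERS = {"Hot", "Cool"}    # tiers allowed as an account default
BLOB_TIERS = {"Hot", "Cool", "Archive"}    # tiers allowed on an individual blob


def effective_tier(account_default: str, blob_tier: Optional[str] = None) -> str:
    """Return the tier a blob ends up in.

    A blob with no explicit tier inherits the account default. An explicit
    per-blob tier overrides it, and is the only way to reach Archive.
    """
    if account_default not in ACCOUNT_DEFAULT_TIERS:
        raise ValueError("the account default must be Hot or Cool, never Archive")
    if blob_tier is None:
        return account_default
    if blob_tier not in BLOB_TIERS:
        raise ValueError(f"unknown tier: {blob_tier}")
    return blob_tier


print(effective_tier("Cool"))             # new upload, no tier chosen
print(effective_tier("Cool", "Archive"))  # file explicitly archived
```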

And that’s why it’s at the object level only. So, to reiterate, this is an object-level setting. Once a file has been set to Archive and you want to download it, it’s going to take potentially several hours to retrieve. So that’s the downside of saving a file in the Archive tier: the number of hours that it takes. The more files you want to retrieve, the longer it takes, et cetera. There’s also the requirement that the file remain in the Archive tier for at least 180 days. So don’t put those old family photos into the archive unless you’re prepared to keep them there for about six months. You might be saving 90% on the storage cost, but they want you to keep the file there for six months, not just for a day. Those files, once they’re in the Archive tier, like I said, are like they’re on magnetic tape. You can’t read them, you can’t copy them, you can’t overwrite them, you can’t do anything to them. You have access to the metadata, but the files themselves are completely inaccessible to you.
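One way to picture the 180-day minimum is as a floor on the number of days you pay for. The sketch below, in Python, assumes you are billed as if the file stayed for the full minimum, uses the one-fifth-of-a-cent ballpark price from earlier, and treats a month as 30 days. All of that is a simplification for illustration, and the function names are hypothetical.

```python
def billable_days(days_stored: int, minimum_days: int = 180) -> int:
    """Days you pay for: the actual stay, but never less than the minimum."""
    return max(days_stored, minimum_days)


def archive_storage_charge(
    gigabytes: float,
    days_stored: int,
    price_per_gb_month: float = 0.002,  # ballpark Archive price, illustrative
    minimum_days: int = 180,
) -> float:
    """Approximate storage charge in dollars, assuming 30-day months."""
    months = billable_days(days_stored, minimum_days) / 30
    return round(gigabytes * price_per_gb_month * months, 2)


# Pulling 1,000 GB back out after 10 days costs the same as leaving it 180 days.
print(archive_storage_charge(1000, days_stored=10))
print(archive_storage_charge(1000, days_stored=180))
```

The minimum is a parameter, so the same function can model any tier that has a minimum storage duration.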

The process of taking a file from Archive back to the Cool tier or the Hot tier is called rehydration. And again, you go in a queue, and depending on how many files are ahead of you, it could take minutes or it could take hours; it depends on how busy they are. So getting stuff out of Archive is a bit of a process. Again, back to the magnetic tape: they had to go in the closet, they had to find the tape, and they had to have a free machine. If there’s no free machine, okay, let’s wait for one to finish, then put the tape onto the drive. I’m not sure Azure is using tape right now, but it does take them a long time to get those files out of the archive. And there’s a new preview feature called high priority. So if you really, really need those files, you can’t wait 6 hours, and you want to get it done within an hour, there is high priority in preview mode, and of course, that’s going to cost you more.
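In code, rehydration is just a request to change the blob’s tier, optionally with a priority. Here’s a minimal sketch in Python. It assumes `blob_client` is a `BlobClient` from the `azure-storage-blob` (v12) package, whose `set_standard_blob_tier` method, as far as I know, accepts a `rehydrate_priority` keyword of `"Standard"` or `"High"`. The 6-hour cut-off and both function names are hypothetical choices for this example.

```python
def pick_rehydrate_priority(max_wait_hours: float) -> str:
    """Pick 'High' when you can't wait several hours, otherwise 'Standard'.

    The 6-hour cut-off is a hypothetical threshold for this sketch.
    """
    return "High" if max_wait_hours < 6 else "Standard"


def rehydrate(blob_client, target_tier: str = "Hot", max_wait_hours: float = 24) -> str:
    """Queue an archived blob for rehydration and return the priority used.

    `blob_client` is assumed to be an azure.storage.blob.BlobClient, e.g.
    one built with BlobClient.from_connection_string(...). The call only
    queues the request; the blob stays offline until rehydration finishes.
    """
    if target_tier not in ("Hot", "Cool"):
        raise ValueError("rehydrate to an online tier: Hot or Cool")
    priority = pick_rehydrate_priority(max_wait_hours)
    blob_client.set_standard_blob_tier(target_tier, rehydrate_priority=priority)
    return priority
```

Because the function only relies on that one method, you can try it out with a stand-in object before pointing it at a real storage account.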

Going into the Azure portal, we can see on the left here, within a storage account, there’s a Blob service section, and there’s lifecycle management in there. Now, this is pretty cool. I can go and set up a rule that is basically an automatic method of getting files from Hot to Cool to Archive. And here’s a screenshot of a rule I was setting up: after 120 days, it moves a Hot blob into Cool storage, and after a year, 365 days, it moves that Cool file into Archive storage. And so you can set up this lifecycle where files you don’t need anymore save you money.
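The same rule can be expressed as a lifecycle management policy document rather than clicked together in the portal. The Python sketch below builds that JSON for the 120-day and 365-day thresholds from the screenshot. The rule name is made up, and counting days from last modification is my assumption about how the rule in the screenshot was configured.

```python
import json


def build_lifecycle_policy(cool_after_days: int = 120, archive_after_days: int = 365) -> dict:
    """Build a lifecycle policy that tiers block blobs Hot -> Cool -> Archive."""
    if cool_after_days >= archive_after_days:
        raise ValueError("blobs should move to Cool before they move to Archive")
    return {
        "rules": [
            {
                "enabled": True,
                "name": "hot-to-cool-to-archive",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"]},
                    "actions": {
                        "baseBlob": {
                            # Days are counted since the blob was last modified.
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": cool_after_days
                            },
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": archive_after_days
                            },
                        }
                    },
                },
            }
        ]
    }


print(json.dumps(build_lifecycle_policy(), indent=2))
```

The printed JSON is the kind of document you would paste into the code view of the lifecycle management blade or keep in source control next to the rest of your infrastructure definitions.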

3. Access Tier Service Level Agreements (SLAs)

Now we can also mention that Microsoft makes certain promises when it comes to the accessibility of these files, and these are called service level agreements, or SLAs. At the beginning of the section I mentioned there are actually four tiers, and we haven’t talked about the Premium tier that much. The Premium tier is basically a really high-performance tier. Microsoft has put this into a storage account that is obviously SSD: hyper-fast SSD running on hardware and a network designed to be quick. And you can see here they’re promising a less than ten millisecond time to first byte.

So if you have files where it’s very important that we get that first byte out to the end user in a snap of the fingers, in less than ten milliseconds, then Premium performance is available to you. You cannot use Premium performance with the global redundancy (geo-redundant storage) feature, and there’s no minimum duration in terms of keeping a file in Premium. Microsoft offers a 99.9% availability guarantee for Premium performance files.

When we get to the regular tier, the Hot tier, which is the default in general-purpose v2 unless you change it, that’s still 99.9% availability. But if you get into global redundancy, then you can basically get ten times the availability, with 99.99%. And that’s because the files are also stored in a second, different region. There’s no minimum duration. You’ll be charged for the gigabytes used in the month and you can delete them anytime. And the response times are sub 1 second. So that’s millisecond latency, but not less than ten millisecond latency. It’s a little bit slower than the Premium tier, but getting a file from storage within a second is still pretty fast.

Moving down to the Cool tier, we’re actually going to lose a bit. The availability is only 99%. That means that up to one in a hundred requests to read a Cool tier file might fail; at least, Microsoft is not going to guarantee more than that. You can use global redundancy and get a higher availability with that, but there is a minimum 30-day duration to store a file in the Cool tier. So even if you delete the file within those 30 days, you’re still going to be charged for the full 30 days. And the response time is similar to the Hot tier.

Finally, the Archive tier, which we were talking about. These files are offline, so there’s no availability. You have to rehydrate the file and then wait 60 minutes, 3 hours, 6 hours to get your file back. So there’s no availability SLA, and there’s no concept of global redundancy for the Archive tier. You do have that 180-day minimum, and it takes hours from the time you decide you want the file to the file being in your possession. That being said, it’s 90% savings off of the Hot tier. So if you really, really want to put something away into a deep freeze, then consider the Archive tier.
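It can help to turn those availability percentages into something countable. The Python sketch below takes the SLA figures quoted in this section and works out the most requests that could fail out of a million while still being within the guarantee. It is plain arithmetic on the numbers from the video, not a reading of Microsoft’s SLA terms, and the names are made up for this example.

```python
# Availability figures as quoted in this section, in percent.
# Archive is left out because archived files are offline and have no
# availability SLA.
SLA_PERCENT = {
    "Premium": 99.9,
    "Hot": 99.9,
    "Hot with global redundancy": 99.99,
    "Cool": 99.0,
}


def max_failed_requests(sla_percent: float, total_requests: int = 1_000_000) -> int:
    """Most requests that can fail while still meeting the availability figure."""
    return round(total_requests * (100 - sla_percent) / 100)


for tier, sla in SLA_PERCENT.items():
    print(f"{tier}: up to {max_failed_requests(sla):,} failures per million requests")
```

Going from 99.9% to 99.99% cuts the allowed failures from 1,000 to 100 per million, which is where the “ten times the availability” comes from, and 99% allows 10,000 per million, the one-in-a-hundred figure for Cool.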
