AZ-304 Microsoft Azure Architect Design – Design a solution for logging and monitoring Part 2
5. Monitoring Storage Accounts
Now we’re in the storage accounts section of Azure Monitor and we can see a selection of our storage accounts. Not all of them have been auto-selected, but most of them have. And at a glance we can see, over the past 4 hours, how many transactions each storage account has had, the frequency of those transactions, the latency metrics (server latency and end-to-end latency), and authorization errors. We can also view capacity metrics, which is obviously important. If we’ve got five petabytes available for a storage account, then we can see how close we are to that.
And of course none of these are that close. Now, clicking on any of these storage accounts will take you into the Insights tab of the storage account itself. And so scrolling down we can see this is our storage account, and there is an Insights tab. Now notice that there’s no enabling or disabling. When you create a storage account in Azure, it is going to be tracked in terms of Azure Monitor metrics, and so there’s no turning it on or turning it off. You can customize the screen, and you can look at different workbooks, just like with Azure VMs. But basically Insights is one of the services provided by a storage account. Now we can see this overview screen for that storage account.
Failures, performance, availability. Now you might think, an Azure storage account, why would there be failures? Well, it could be as simple as someone trying to access a file that they aren’t authorized to access, in which case it’s an authorization error, for instance. And so we can look at the various success and failure options and start to track down why it is failing. Oh, it’s failing because GetBlob is what’s failing. Maybe the blob doesn’t exist, things like that. So if we’re interested, we can track this down. We can look at performance metrics, of course, and availability. Storage accounts do have a service level agreement.
And so you would expect the storage health, again over this four hour period, to have no known downtime. But if you did have downtime, Microsoft would reimburse you for that over a 30 day period. So storage is pretty straightforward. You’re just going to have that data there. There’s nothing to turn on or turn off, and you don’t get a selection. Like I said, you can go into the workbooks and see some customized templates if you’re interested in these things, or customize them and save them, share them, et cetera. So Azure Monitor for storage accounts, as you can see, is pretty straightforward.
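To make the failures view a bit more concrete, here is a minimal Python sketch of the kind of breakdown it gives you: failed transactions grouped by API name and response type, plus an overall success rate. The transaction list is invented sample data, and the helper names failure_breakdown and success_rate are hypothetical. This is not how Azure Monitor computes it internally, just an illustration of the idea.

```python
from collections import Counter

# Hypothetical sample of storage transactions: (API name, response type).
# The values resemble what the Failures view shows, but the data is invented.
transactions = [
    ("GetBlob", "Success"),
    ("GetBlob", "Success"),
    ("GetBlob", "ClientOtherError"),    # e.g. the blob does not exist
    ("PutBlob", "Success"),
    ("GetBlob", "AuthorizationError"),  # caller is not authorized
    ("ListBlobs", "Success"),
    ("GetBlob", "ClientOtherError"),
    ("PutBlob", "Success"),
]


def failure_breakdown(rows):
    """Count failed transactions per (API name, response type)."""
    return Counter((api, resp) for api, resp in rows if resp != "Success")


def success_rate(rows):
    """Percentage of transactions that succeeded."""
    ok = sum(1 for _, resp in rows if resp == "Success")
    return 100.0 * ok / len(rows)


breakdown = failure_breakdown(transactions)
rate = success_rate(transactions)

# Most frequent failure first, which is usually where you start digging.
for (api, resp), n in breakdown.most_common():
    print(f"{api:10s} {resp:20s} {n}")
print(f"success rate: {rate:.1f}%")
```

Sorting by the most common failure is a simple way to answer the "why is it failing?" question: in this made-up sample, GetBlob calls against missing blobs dominate.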
6. Azure Monitor Alerts and Metrics
So here we are in the metrics section of Azure Monitor. What we’re going to do is look at the different ways that you can now run reports and create dashboards based on all of that data. So assume you’ve set up some of your app services, virtual machines, storage accounts, et cetera, to pull data into a Log Analytics workspace. Now we can look at how we get that data and do something with it. One of the ways we can do that is through setting up metrics and also setting up alerts. Now, metrics is basically a visual dashboard or a diagram. The first thing you have to do is select what is called a scope. Now, the most important thing to understand is that this metrics tab has a very difficult time with, or really doesn’t even allow, too many different resources on the same chart.
So even if I picked one of these resources and I picked storage account for instance, the rest of it automatically gets grayed out and it’ll actually tell me that storage accounts do not have this multi selection for metrics. So I can’t graph two or three storage accounts at once.
It kind of makes sense, because if you’re trying to graph, say, CPU utilization, I guess it would have to average it and not put them both on there or something like that. The same goes for App Services. So if I unselect all and pick App Services, that filters it down. And if I choose one app service, again, everything else is grayed out and it will say App Services is not one of the types that allow multiple resources to be graphed. So I’m going to choose the app service that we created earlier in this section and say apply.
Now that the app service has been chosen, we do have these app service metrics and we can see the various things in here. We saw this earlier with the 200 responses and the 400 errors. 200 is a success status. So we should see that in this 10:37 range I had 20 successful views of the web page and I haven’t had any since. If I want to be all cool about it, I can hit refresh six, seven, eight, nine, ten times and let’s see that get reflected in our metrics. Now if we go back to the metrics chart, I can actually change the time range.
So instead of it being a 24 hour range, I can go down to 4 hours, and instead of having a 15 minute granularity, I can say I want 1 minute granularity and hit apply. And we can see that over the last 4 hours these Http requests really spiked in the last minute. So that was me hitting the refresh button. Of course, the Http count is more than just the single hit to the page. It would include all the CSS or JavaScript or anything else that was loaded behind the scenes.
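If it helps to see what changing the granularity actually does, here is a small Python sketch that buckets the same request timestamps into 1 minute and 15 minute bins. The timestamps are made up and bucket_counts is a hypothetical helper, so read it as a rough model of the chart's aggregation rather than the portal's actual logic.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def bucket_counts(timestamps, granularity):
    """Count events per time bucket; `granularity` is the bucket width."""
    width = int(granularity.total_seconds())
    counts = Counter()
    for ts in timestamps:
        epoch = int(ts.timestamp())
        # Round each timestamp down to the start of its bucket.
        start = datetime.fromtimestamp(epoch - epoch % width, tz=timezone.utc)
        counts[start] += 1
    return dict(sorted(counts.items()))


# Invented sample: a few page requests around 10:37 and one at 10:46 (UTC).
base = datetime(2024, 1, 1, 10, 30, tzinfo=timezone.utc)
requests = [base + timedelta(minutes=7, seconds=s) for s in (0, 10, 20)]
requests += [base + timedelta(minutes=8, seconds=5), base + timedelta(minutes=16)]

per_minute = bucket_counts(requests, timedelta(minutes=1))
per_quarter_hour = bucket_counts(requests, timedelta(minutes=15))

for start, n in per_minute.items():
    print("1 min ", start.strftime("%H:%M"), n)
for start, n in per_quarter_hour.items():
    print("15 min", start.strftime("%H:%M"), n)
```

The finer granularity shows the short burst as its own spike, while the 15 minute buckets fold it into a single larger total.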
So I refreshed that quite a bit and we got a real spike here. So we can see the activity on our website in graph form. Now, one of the interesting things is that if we like this chart, if this is a useful chart to us, we can pin it to our Azure Portal dashboard. And so every time we log into Azure and go into the dashboard, this chart can be one of the charts shown to us. So we can basically build our own view for when we log into Azure. Now, let’s say that we want to make an alert based off of this.
So based on the amount of traffic that’s coming in during this particular 1 minute time span, I need to be notified when it reaches a certain level. So I say new alert rule, we can see it’s already scoped to that individual resource, and we need to build some type of logic. So for this Http statistic, let’s say I want to be notified when it is greater than some value. It got up to 80 here, but that doesn’t seem very high. Let’s say whenever it hits 500, I want to be sent an email.
So this is 500, it’s going to be evaluated every 1 minute, and I just want to know that over a five minute period. So 500 views to my website over a five minute period, evaluated every minute. Now you’ll notice there’s a cost to this, so it’s ten cents per alert. So I’ll set that up and then I can set myself up as an email recipient. I can say add an action group, create an action group, and call it “email me”.
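The rule logic here (a threshold of 500, checked every minute, over a five minute look-back) can be sketched in a few lines of Python. This is only a model of the idea, assuming the rule sums requests over the trailing window and uses a strict greater-than comparison. The function name evaluate_alert and the per-minute counts are made up for illustration.

```python
def evaluate_alert(per_minute_counts, threshold=500, window=5):
    """Return the minute indexes at which the rule would fire.

    per_minute_counts: request totals, one entry per minute.
    The rule is checked once per minute against the sum of the
    trailing `window` minutes and fires when that sum exceeds `threshold`.
    """
    fired = []
    for i in range(len(per_minute_counts)):
        start = max(0, i - window + 1)
        total = sum(per_minute_counts[start:i + 1])
        if total > threshold:
            fired.append(i)
    return fired


# Invented traffic: quiet, then a burst, then quiet again.
counts = [20, 0, 0, 80, 150, 200, 120, 10, 0, 0, 0, 0]
fired_at = evaluate_alert(counts)
print("alert fires at minutes:", fired_at)
```

Notice that no single minute is anywhere near 500 in this sample. The alert only fires because the five minute window adds the burst together, which is the difference between the evaluation frequency and the look-back period.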
And what I want to happen is to send an email to my email address at whatever.com, and that will basically send a notification when this alert is hit. Now, each resource that you graph is going to have its own metrics that it can track. So in this particular case we’re tracking the app service. We have this selection with its Http errors, Data In, Data Out, and CPU. If we change our scope, let’s say we want to track the virtual machine instead, I’m going to select this virtual machine as the scope. Then the virtual machine becomes the tracked resource and the list of metrics becomes different. So you’ll notice that I’m not seeing Http 2xx as an option, but I do have Data In, Data Out, disk read, OS queues, et cetera, various options. You can see my server in the last 4 hours has become active at 50% CPU utilization. So basically I can do the same thing.
I can set up an alert if I wanted to monitor this and watch for that situation. In this particular case, I’m going to want to know when it does drop, because that would probably be something that’s not good in this particular server’s case. So using metrics can show you some interesting graphs, and then you can set up alerts either on their own or directly from this alert button. Those can then SMS text you, email you, and perform some other actions. Action groups can start off jobs, functions, Logic Apps, et cetera.
7. Azure Monitor Log Queries (Kusto)
Alright, so the last topic we’ll talk about is this logs section. And this is really hiding a lot of power underneath such a simple link that doesn’t seem to stand out from the others. I’m going to close this little pop up, minimize the menu, and pull out the schema, and basically this is a query tool. So this is going to be very similar to the SQL Server query editor or any of the query editor interfaces you might be used to, where you can type in a query, hit run, and see the results. Now it’s all going to depend on which resource you select. This is very similar to selecting a scope for metrics. So if we say let’s look at the App Services, then I’m going to be able to look at this Train Test service that we created.
Now again, there’s a little bit of complexity to this when you’re talking about implementation and actually working with this. So when I select this, what’s really cool is it resets itself here and it shows me that there are three tables that the Azure Train Test app service could feed into. And if I mouse over it, it has a little pop up that says entries from the activity logs. So this is subscription level events: the creation, deletion, restart, redeployments, et cetera. Now when I click it, you’ll see that the query window gets filled with the name of the table. You don’t have to use the word select. So this is not a SQL query. Sorry, I hit Return by mistake. This is actually called Kusto, and the language is Kusto Query Language, or KQL. So it’s not going to have the same syntax as SQL. You just enter the table name, which happens to be AzureActivity.
And we can see these are the various things that happened to this service. And this is scoped to the last 24 hours. And so we can see all the way back to the time when the resource was first deployed even, right? So the operation here is under Microsoft.Web/sites. There’s some informational stuff here. So if I wanted to write a query on the number of times that this resource was restarted or someone did something to it, this is where I would go. Now we could also get the service logs and the metrics from here. Like I said, there’s a little trick to this.
I’m going to change from the App Service scope to the Virtual Machine scope. So if I go down here and I say Virtual Machines, and just click out of it, this is the My Minor machine we were just dealing with. And now the tables have changed. Now if I rerun the same query on AzureActivity, it’s not going to be based on the App Service, it’ll be based on the virtual machine. Let’s look at the heartbeat. Okay, so I’m going to run Heartbeat. Now, of course, the heartbeat in the world of virtual machines is that the machine just puts out an “I’m alive” message. You can see the operating system version that’s being output, and it’s just every minute. Hello, hello. We can see these 1 minute intervals.
The performance stuff is under Perf, so I can run that too. Now, just looking at this table is not super useful. What really makes this useful is what’s called queries. And so you can write your own queries. This is a query language, and you get where clauses, select-style projections, and summarizing. You can be very clever. You can even create charts with this. Or there are some pre-built queries. So let’s say I want to look at the count heartbeats query, okay? Now, when you do this, it adds to the window. It doesn’t delete what’s there. So I’m going to delete what I previously had. And this says, count all the computer heartbeats for the last hour. So this particular query has the time built into it. So even though I’ve got a 24 hour time range selected, it’s going to be restricted to 1 hour based on this.
So you can see the query language, right: the table name, the where clause, and a summarize, which is like a group by, I guess. So how many heartbeats do I have? I’ve had 105 heartbeats in the last hour. Normally, agents generate a heartbeat every minute. So you can query this, and again, this can be turned into an alert if this number doesn’t come back correct, et cetera. We can look at performance, CPU usage trends. I’m going to have to delete what’s there. CPU usage trends, I can run it, and this is rendering it as a chart. You see the render command in here. And so, again, I can customize this and filter out certain things. If you’re looking for errors, if you’re looking for very specific counters, things like that, this is a very flexible language. We’re not going to teach you this language, and I don’t think it appears on the exam, but know that this is the language that runs within these queries. This is where the power is in terms of being able to extract all this data when you’re pulling in data from hundreds of machines. You can write reports, save those reports, turn those into alerts, and output charts.
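For reference, here is roughly what that count heartbeats query looks like, held in a Python string, followed by a local stand-in that does the same where and summarize steps on sample rows. The KQL text is written from memory of the built-in example, so treat its exact wording as an assumption. The rows, the computer names vm-a and vm-b, and the count_heartbeats helper are all invented, and nothing here talks to a real workspace.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# KQL along the lines of the built-in "count heartbeats" example.
# Assumption: the portal's own version may be worded differently.
COUNT_HEARTBEATS_KQL = """
Heartbeat
| where TimeGenerated > ago(1h)
| summarize count() by Computer
"""


def count_heartbeats(rows, now, lookback=timedelta(hours=1)):
    """Local stand-in for the query: filter by time, then count per computer."""
    cutoff = now - lookback
    recent = (r for r in rows if r["TimeGenerated"] > cutoff)   # the where clause
    return Counter(r["Computer"] for r in recent)               # summarize count() by Computer


# Invented rows: vm-a reports every minute, vm-b every two minutes,
# both going back 90 minutes, which is further than the query looks.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
rows = [{"Computer": "vm-a", "TimeGenerated": now - timedelta(minutes=k)}
        for k in range(90)]
rows += [{"Computer": "vm-b", "TimeGenerated": now - timedelta(minutes=k)}
         for k in range(0, 90, 2)]

result = count_heartbeats(rows, now)
print(dict(result))
```

Because the one hour filter lives inside the query itself, the older rows are ignored no matter how wide the surrounding time range is. A count well below the expected one-per-minute rate is the kind of result you could turn into an alert.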
8. *NEW* Azure Sentinel for Security Monitoring
Now we just looked at Azure Monitor, which focuses on performance, diagnostics, and various events. But what about security? We live in an age now where security has to be top of mind for any person developing or creating apps for the cloud. Well, that’s where the new product called Azure Sentinel comes in. So if we go into Azure Sentinel, the first thing we have to do is connect Sentinel to one of our existing workspaces. Now luckily, I do have a workspace that is eligible for Sentinel. I do have a lot of my logging and auditing going into it, and I can say add to Azure Sentinel. But what is Sentinel? What Azure Monitor is for performance, Sentinel is for security. So this allows you to collect security related data from all your devices. It has some automatic threat detection algorithms inside of it. You can minimize false positives. You can also use it for investigations.
So once you do have some security events, let’s say you have someone who’s repeatedly trying to log into one of your applications and it fails and fails and fails. Well, you can start to dig down into the logs and see when they started and what ended up happening, and track that IP address through the logs. Did they eventually get in? Maybe they did. You can collect those log entries into what are called incidents and then basically start almost a ticketing system around the incident until you can respond to it. So Sentinel is what’s called a Security Information and Event Management, or SIEM, solution. It is also a Security Orchestration, Automation, and Response, or SOAR, solution. So it is both for investigating and for responding to security events. So we saw that it took a minute or so to connect our workspace into it. Now we do have to start going through and setting this up, but basically we can see it here. You can look at what I was calling incidents.
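The investigation described here, following one IP address through the sign-in logs and asking whether it eventually got in, can be sketched in Python like this. The events, the IP addresses (taken from the reserved documentation ranges), and the helper names summarize_by_ip and suspicious are all made up. Sentinel does this kind of thing with queries over your workspace, so this is only a model of the reasoning.

```python
from collections import defaultdict

# Hypothetical sign-in records: (minute offset, source IP, outcome).
events = [
    (0, "203.0.113.7", "failure"),
    (1, "203.0.113.7", "failure"),
    (2, "198.51.100.4", "success"),
    (3, "203.0.113.7", "failure"),
    (9, "203.0.113.7", "success"),   # did they eventually get in? yes
    (12, "198.51.100.9", "failure"),
]


def summarize_by_ip(rows):
    """Per IP: when it was first seen, failure count, and any success after a failure."""
    summary = defaultdict(
        lambda: {"first_seen": None, "failures": 0, "got_in_after_failures": False}
    )
    for minute, ip, outcome in sorted(rows):   # process in time order
        s = summary[ip]
        if s["first_seen"] is None:
            s["first_seen"] = minute
        if outcome == "failure":
            s["failures"] += 1
        elif s["failures"] > 0:
            s["got_in_after_failures"] = True
    return dict(summary)


def suspicious(summary, min_failures=3):
    """IPs with at least `min_failures` failed attempts."""
    return sorted(ip for ip, s in summary.items() if s["failures"] >= min_failures)


summary = summarize_by_ip(events)
flagged = suspicious(summary)
for ip in flagged:
    print(ip, summary[ip])
```

An IP that fails several times and then succeeds is the case worth opening an incident for, since the repeated guessing may have worked.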
Hunting is the process of trying to find security problems. So you have the log files at your disposal, and you can run queries against them. If you put yourself in the hacker’s shoes, what are some of the telltale signs of either a successful or an unsuccessful hack? You can basically search around for those. So let’s go first to the overview screen, and we can see that even though we literally just hooked up Sentinel to the workspace a few minutes ago, it’s already collected 2.4 thousand events from my system. So all of the various things that were already in that workspace, it’s already collected them, all right? It’s basically just done it. So I can see, broken out on the overview screen, how many virtual machine events and performance events there are, et cetera. We don’t have any incidents that we can see. It’s basically a default dashboard here where we can see if there are any malicious attacks, et cetera.
Now, other than the workspace, which is not configured for most of my services, we haven’t really hooked this up to a lot of other services within Azure and outside of Azure. So if we go back to the news page and say collect data, and it’s also under data connectors here, we can see that there are 60 connectors currently and I only have two of them connected. So let’s say I only want to see the ones that are currently configured. By default I’m going to get the security events.
It’s basically based on the workspace: things that go into my workspace that are security events, and also non-security events, subscription level events such as creating a resource group, creating a new resource, deleting a resource, et cetera. Those go under activity. But that is just a fraction of what Sentinel can connect to. We can actually see it has an Amazon Web Services connector here. And so we can stream our CloudTrail logs from AWS into Sentinel, and Sentinel can become this centralized repository for all your security. We also have other third party services: AI Vectra, Alcide kAudit. Let me scroll down here. So we’ve got some Barracuda services, Cisco, Citrix, et cetera. So Microsoft is basically creating these connectors to a lot of these security products.
For example, F5 is a firewall connector. Okay? So you can basically get all of your security information in one place. Now we can also see other things such as Azure Firewall and Advanced Threat Detection. Azure Active Directory might be a good one, because this is what’s going to show you what your AD users are doing, using what are basically the sign-in logs. So we do have to connect our Azure Active Directory into Sentinel. It’s not connected by default. We can see here that we do need to have a Premium license. I’m currently on a Premium subscription for my default Active Directory. If you don’t have that, you’re going to have to get that. You have to have the right permission settings, et cetera. And basically I can grant Azure Sentinel access to my Azure Active Directory sign-in logs and other audit logs. And so from that point forward, we can get access to those logs within Sentinel. Now I’m going to close this out.
Like I said, we can look at something like hunting. If I click onto that, Microsoft provides 92 built-in queries for various things. So we just talked about AWS CloudTrail. Well, there’s a pre-built query that allows you to find, it says, changes made to AWS access policy. So that’s the identity and access management policy. If someone goes in and changes the IAM policy, you can find those events using this type of query. Scrolling down, remember, Azure activity is already set up by default. We can see if somebody’s going to enumerate, which means go and list one by one, all of the Azure storage account keys, things like that. We’ve got DNS logs, Microsoft Office, and W3C IIS logs. So if we’re looking at pen testers and attackers that are using common exploits, then this type of query can pull that out. So these are defaults.
Now you’re not stuck with that. You can create your own query based on the sources and then just run queries to generate any kind of report that you want. The real power then becomes setting up alerts based on this. So once you’ve got a query and you’re okay with the results, then maybe you want to turn that into an alert, so that you get notified any time a new event happens that matches this query. Now Azure also provides what are called workbooks, which are basically predefined report sets. So if you’ve got, let’s say, the Azure activity workbook, then I can go and have a predefined dashboard. Remember, we said Azure activity is about subscription level events, so this tracks things such as the top ten active resource groups and what creation, update, and deletion activities are happening over time.
So this is like a dashboard that you can then customize and it can become your workbook, but it starts as a template for a workbook. And finally you’ve got what are called playbooks, which is the automation. So if you do have a query that you develop in hunting and you find that it’s really helpful for finding incidents, you can set up an alert, but you can also have some type of action that happens. Maybe if user X has four bad login attempts in 24 hours, it sends an email to the user asking them what’s going on and disables the account until they reply, et cetera. You can invent your own sort of automations that kick off when it detects something suspicious. Alerts are great, but if you can develop some sort of automation, then you can basically have your system subtly adjust, turning things off and on or enabling multifactor authentication and things like that, depending on the threats that are being detected.
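As a rough sketch of the trigger behind that kind of playbook, here is the "four bad login attempts in 24 hours" rule in Python. In Sentinel, playbooks are built on automation services rather than written like this, so take it purely as a model of the detection logic. The function name accounts_to_disable, the users, and the timestamps are invented.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def accounts_to_disable(failed_logins, limit=4, window=timedelta(hours=24)):
    """Return users who reach `limit` failed logins inside any `window`.

    failed_logins: iterable of (timestamp, user) pairs, in any order.
    """
    recent = defaultdict(deque)   # per user: failures still inside the window
    flagged = []
    for ts, user in sorted(failed_logins):
        q = recent[user]
        q.append(ts)
        # Drop failures that are a full window (or more) older than this one.
        while ts - q[0] >= window:
            q.popleft()
        if len(q) >= limit and user not in flagged:
            flagged.append(user)
    return flagged


day1 = datetime(2024, 1, 1)
day2 = datetime(2024, 1, 2)
failed = [
    # alice: four failures within one day, so she should be flagged
    (day1.replace(hour=9), "alice"),
    (day1.replace(hour=9, minute=5), "alice"),
    (day1.replace(hour=13), "alice"),
    (day1.replace(hour=20), "alice"),
    # bob: four failures, but spread over 25 hours, so not flagged
    (day1.replace(hour=8), "bob"),
    (day1.replace(hour=12), "bob"),
    (day2.replace(hour=7), "bob"),
    (day2.replace(hour=9), "bob"),
]

to_disable = accounts_to_disable(failed)
print("disable and email:", to_disable)
```

The follow-up actions, sending the email and disabling the account, would hang off the returned list. Keeping detection separate from the response makes it easier to tune the threshold without touching the automation.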