DP-203 Data Engineering on Microsoft Azure – Monitor and optimize data storage and data processing Part 8
22. Azure Stream Analytics – More on partitions
Now in this chapter, I just want to go through some more notes when it comes to your partitions in Azure Stream Analytics. So when it comes to your Stream Analytics job, there is something known as a compatibility level. Now, by default, the compatibility level at this point in time is 1.2. Because of this, Azure Stream Analytics will automatically make use of the partitions based on the partitions defined in Azure Event Hub. We had already seen this earlier on in an example: because we had four partitions in our Event Hub, Azure Stream Analytics was taking the data from those four partitions in parallel.
So here, this is based on the compatibility level. But if you have a lower compatibility level, then you explicitly need to make use of the partition key in your query. Here, you need to say: please partition by the partition ID. Again, remember, the partition ID is a property that is built into the data that's coming in onto Azure Stream Analytics if the data is coming in from Azure Event Hub. So if you actually go on to your job, there is a setting known as the compatibility level, and here by default it will be 1.2.
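As a rough sketch, a query at a lower compatibility level would look something like this; the input and output names here are just assumed aliases, not the ones from the course:

```sql
-- Hypothetical input/output aliases. At compatibility level 1.1 or lower,
-- PARTITION BY PartitionId is needed so the job reads each
-- Event Hub partition in parallel.
SELECT *
INTO [eventhub-output]
FROM [eventhub-input]
PARTITION BY PartitionId
```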
Now, sometimes you may also want to explicitly repartition your input if you don't have control over the partition key, and you can also specify the number of output partitions. So there can be a use case scenario wherein your Azure Stream Analytics job is taking events from one Azure Event Hub, let's say with four partitions, and it might be streaming the data as an output onto another Azure Event Hub. This could be one of the outputs.
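As a minimal sketch of repartitioning, assuming a hypothetical key called DeviceId and placeholder input and output aliases, you can partition the stream by your own key and specify the number of partitions with an INTO clause:

```sql
-- Repartition the input stream on an assumed key (DeviceId)
-- into 10 partitions before writing to the output.
SELECT *
INTO [eventhub-output]
FROM [eventhub-input]
PARTITION BY DeviceId
INTO 10
```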
This is also possible. When this is the case, you have to ensure that the output is also writing onto four partitions on the destination Azure Event Hub.
Again, you have to ensure that your destination Azure Event Hub has the same number of partitions set, so that there is a one-to-one mapping when it comes to the number of partitions. If you're using Azure Event Hubs as the input and Azure Event Hubs as the output, and if you are using your own partition key, make sure the partition key is the same for your input and for your output. Now, let's look at the maximum number of streaming units that you can actually assign to a job.
So in our case, our jobs were very simple. Even though we had built-in partitions which were taken in by the Azure Stream Analytics job, I always kept the number of streaming units as three. Obviously, if you have more resource-intensive jobs, then you can calculate the maximum number of streaming units that are required for a job. So I've taken a screenshot from the Microsoft documentation wherein they tell you how you can calculate the maximum streaming units that you can set for a job.
So here they're saying that if your query has only one step and it is not partitioned, then take six as the starting measure when it comes to the maximum streaming units for a job. Then they're saying that if the query contains only one step and the input data stream is partitioned by 16, then you need a total of 16 × 6 = 96 streaming units for the job.
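For instance, a one-step query that is partitioned might look like the sketch below, with assumed input and output names; since each of the 16 partitions can be processed independently, the maximum comes to 16 × 6 = 96 streaming units:

```sql
-- One-step partitioned query: every one of the 16 input partitions
-- is processed independently, so max SUs = 16 partitions x 6 = 96.
SELECT COUNT(*) AS EventCount, PartitionId
INTO [output]
FROM [input]
GROUP BY PartitionId, TumblingWindow(minute, 3)
```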
And they're saying that if the query contains two steps but they are again not partitioned, you can leave it as six. So there are different examples on how you calculate the maximum streaming units. Right, so here again, I just wanted to go through some important points from an exam perspective when it comes to partitions and this calculation of the maximum streaming units for a job.
23. Azure Stream Analytics – An example on diagnosing errors
Now, in this chapter, I just want to go through how you can try to diagnose the errors that can come into play when it comes to working with a Stream Analytics job. So let me go on to the query of my existing Stream Analytics job. Here I have a query in place; we have also seen this earlier on. Now, let's say that I quickly cast one of my values here. Let me take the average and cast it as a float. So here I'm actually changing the data type. Let me save the query, and now let me start my Stream Analytics job.
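As a rough sketch of the change, assuming column and alias names along the lines of the earlier examples in this course, the query now looks something like this:

```sql
-- Cast the average onto float; the column names and the input/output
-- aliases here are assumptions, not the exact ones from the course.
SELECT DeviceId, CAST(AVG(Temperature) AS float) AS Average
INTO [sql-pool-output]
FROM [eventhub-input]
GROUP BY DeviceId, TumblingWindow(second, 30)
```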
So I'll hit Start. I'll choose Custom here, I'll just choose a specific time, and let me hit Start. I'll just wait for two to three minutes and then we'll come back. So now, after some time, when you come back, you can see that the status of your job is in a degraded state. If I scroll down and go on to the Job diagram, and from there on to the log, you can see you are getting output data conversion errors. Now, here you are not getting details about the error; you can just see during what processing times these conversion errors came about. I actually purposely changed our query so that we get these sorts of errors. So you could get conversion errors, you could also get input deserialization errors, et cetera.
When you get these errors, what you can do is go on to the Azure Monitor service and, from there, on to the Activity log. Here you can see a diagnostic error in place. If I go on to it, you will see an event indicating that a data conversion error happened while processing events. If I click on this particular event, go on to the JSON part, and scroll down, you can see that there was a schema validation error. It is clearly saying that it is not able to convert the Average property from its type in the query onto the type that is defined in our table in our dedicated SQL pool.
So if you want to get details about the error, you can go on to your activity logs. Now, please note that it might take around three to five minutes for the errors to actually pop up in the Activity log; you can click on Refresh at any time to see those errors. In this chapter, I wanted to show you ways in which you can diagnose errors that come up in your Stream Analytics job. More often than not, I go on to the Job diagram in the Azure Stream Analytics job. This gives me a very good idea of how the job is actually performing, and if I want to dig into any errors, then I go into the Activity log.
The cloud landscape is evolving at a breakneck pace, and with it, the certifications that validate an IT professional’s skills. One such certification is the Microsoft Certified: DevOps Engineer Expert, which is validated through the AZ-400 exam. This exam has undergone significant changes to reflect the latest trends, tools, and methodologies in the DevOps world.… Read More »