Amazon AWS Certified SysOps Administrator Associate – CloudFront

May 23, 2023

1. [SAA/DVA] CloudFront Overview

Now we’re getting into content delivery, and we’ll start with CloudFront. So CloudFront is a content delivery network, or CDN, and what it does is improve read performance, because the content is going to be distributed and cached at the edge locations. And edge locations are all around the world; there are about 216 points of presence globally as I’m recording this lecture, and they add new points of presence all the time. So it’s much more than the 30-something regions that AWS has.

This is a worldwide thing. And so what does CloudFront give you on top of this caching at the edge? Well, it gives you DDoS protection, to protect against distributed denial-of-service attacks. It gives you integration with AWS Shield and also AWS WAF, the web application firewall; we’ll see those in the security section of this course. But the idea is that it’s really protected, and it’s a good way to front your applications when you deploy them globally. It also allows you to expose an external HTTPS endpoint by loading a certificate, and also to talk internally in HTTPS to your applications if you need to encrypt that traffic as well.

So let’s take a diagram. So this is a map of the world, and there are some orange regions, and these are edge locations. Everything on this graph is an edge location, and as you can see, it’s all around the globe. So, for example, say we have an S3 bucket in Australia and some user from America wants to access it. It’s actually going to access an edge location close to it, so in America, and that request is going to be transmitted over the private AWS network all the way to the S3 bucket, and the content is going to be cached. So the idea is that the more users you have in America, the more they will want to do the same kind of reads, and they will have content served directly from America, not necessarily from Australia, because it will be fetched once into America and then served from there.

So, cached locally. Another user, maybe in Asia, will talk to an edge location closer to Asia, and that edge location, again, will forward the traffic to the S3 bucket to get the content and then cache it at the edge. So CloudFront really allows you to distribute your reads all around the world based on these different edge locations, improve latency, and reduce the load on your main S3 bucket. Now, I said S3 bucket, but what are the different CloudFront origins? Well, the first one is an S3 bucket, and you would use CloudFront in front of S3 as a very common pattern to distribute your files globally and cache them at the edge.

You also get enhanced security, as we’ll see in the hands-on, between CloudFront and your S3 bucket using a CloudFront OAI, or Origin Access Identity, and this allows your S3 bucket to only allow communication from CloudFront and from nowhere else. And then finally, you could also use CloudFront as an ingress to upload files into S3 from anywhere in the world. Okay, the other option is to use a custom origin, and it must be an HTTP endpoint. So this could be anything that respects the HTTP protocol.
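To make the OAI idea concrete, here is a small Python sketch that builds the kind of bucket policy statement that lets only a CloudFront OAI read objects. The bucket name and OAI ID are invented for illustration; the principal ARN format shown is the one CloudFront uses for legacy Origin Access Identities.

```python
import json

# Hypothetical values -- replace with your own bucket name and OAI ID.
BUCKET = "my-content-through-cloudfront"
OAI_ID = "E8Y6EXAMPLE"

def oai_bucket_policy(bucket, oai_id):
    """Build a bucket policy that lets only the CloudFront OAI read objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowCloudFrontOAIReadOnly",
            "Effect": "Allow",
            "Principal": {
                "AWS": f"arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity {oai_id}"
            },
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }],
    }

print(json.dumps(oai_bucket_policy(BUCKET, OAI_ID), indent=2))
```

With only this Allow statement (and the bucket’s public access otherwise blocked), requests that don’t come through CloudFront are denied by default.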

So it could be an Application Load Balancer, it could be an EC2 instance, it could be an S3 website, but we first must enable the bucket as a static S3 website, and note that this is different from using an S3 bucket directly as an origin. We need to enable that setting as we’ve seen before. And it could be any HTTP backend you want, for example, if it was on your on-premises infrastructure. Okay, so how does CloudFront work at a high level? So we have a bunch of edge locations all around the globe, and they’re connected to the origin we’ve defined. It could be an S3 bucket or it could be any HTTP endpoint, and our client wants to access our CloudFront distribution.

To do this, the client will send an HTTP request directly into CloudFront. And this is what an HTTP request would look like: there would be a URL, some query string parameters, and there would also be some headers. Then the edge location will forward the request to your origin, and that includes the query strings and the request headers.

So everything gets forwarded onto your origin. I mean, you can configure this and then your origin responds to the edge location. The edge location will cache the response based on the cache settings we’ve defined and return the response back to our client. And the next time another client makes a similar request, the edge location will first look into the cache before forwarding the request to the origin. That is the whole purpose of having a CDN.
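The forward-on-miss, serve-from-cache-on-hit flow described above can be sketched as a toy cache. Nothing here is AWS-specific; the `origin_fetch` callable and the TTL value are invented just to make the hit/miss logic concrete.

```python
import time

class EdgeCache:
    """Toy model of an edge location: cache responses keyed by URL + query string."""
    def __init__(self, origin_fetch, ttl_seconds=86400):
        self.origin_fetch = origin_fetch   # callable simulating the origin
        self.ttl = ttl_seconds             # how long a cached response stays valid
        self.store = {}                    # cache key -> (response, expiry time)

    def get(self, url, query_string=""):
        key = (url, query_string)
        cached = self.store.get(key)
        if cached and cached[1] > time.time():
            return cached[0], "HIT"        # served straight from the edge
        # Cache miss: forward the request to the origin and cache the response.
        response = self.origin_fetch(url, query_string)
        self.store[key] = (response, time.time() + self.ttl)
        return response, "MISS"

# Usage: the first request is a miss, the second identical one is a hit.
edge = EdgeCache(lambda url, qs: f"content of {url}")
print(edge.get("/beach.jpg"))   # ('content of /beach.jpg', 'MISS')
print(edge.get("/beach.jpg"))   # ('content of /beach.jpg', 'HIT')
```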

Okay, so very, very simple. This is how CloudFront works at a high level. So let’s look at S3 as an origin in detail. So you have the cloud and you have your origin, which is your S3 bucket. And for example, you have an edge location in Los Angeles, and some users want to read some data from there. So your edge location is going to fetch the data from your S3 bucket over the private AWS network and give you the results from that edge location. The idea here is that for the CloudFront edge location to access your S3 bucket, it’s going to use an OAI, or Origin Access Identity. It’s an IAM role for your CloudFront origin, and using that role it’s going to access your bucket.

And the bucket policy is going to say yes, this role is allowed, and yes, send the file to CloudFront. So this works as well for other edge locations, for example in São Paulo in Brazil, or Mumbai, or Melbourne. And so all around the world, your edge locations are going to serve cached content from your S3 bucket. And so we can see how CloudFront can become super helpful as a CDN. Now, what if we have an ALB or EC2 as an origin? The security changes a little bit. So we have our EC2 instance or instances, and they must be public, because they must be publicly accessible from an HTTP standpoint. And we have our users all around the world, so they will access our edge location, and our edge location will access our EC2 instance. And as you can see, it traverses the security group.

So the security group must allow the IPs of the CloudFront edge locations into the EC2 instance. And for this, there is a list of public IPs for edge locations that you can get on this website. The idea is that the security group must allow all these public IPs of edge locations, so that CloudFront can fetch content from your EC2 instances. So that makes sense. What if we use an ALB as an origin? So now we have a security group for the ALB, and the ALB must be public to be accessible by CloudFront. But the backend EC2 instances can now be private. And so in terms of security groups, for the EC2 instances, it needs to allow the security group of the load balancer. We’ve seen this extensively.
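AWS publishes its public IP ranges as a JSON document at https://ip-ranges.amazonaws.com/ip-ranges.json. As a sketch, here is how you might filter that document for just the CloudFront ranges to allow in a security group; the embedded sample mimics the real file’s shape, but the prefixes themselves are made up.

```python
def cloudfront_prefixes(ip_ranges):
    """Return the IPv4 CIDR blocks tagged with the CLOUDFRONT service."""
    return [entry["ip_prefix"]
            for entry in ip_ranges["prefixes"]
            if entry["service"] == "CLOUDFRONT"]

# Sample with the same shape as https://ip-ranges.amazonaws.com/ip-ranges.json
# (the actual file is much larger; these prefixes are invented).
sample = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "region": "GLOBAL", "service": "CLOUDFRONT"},
        {"ip_prefix": "198.51.100.0/24", "region": "us-east-1", "service": "EC2"},
    ]
}

print(cloudfront_prefixes(sample))  # ['203.0.113.0/24']
```

In practice you would fetch the live file, extract these prefixes, and allow them on port 80/443 in the instance’s security group.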

And the edge locations, which are, again, public locations, need to access your ALB through the public network. And so that means that the security group for your ALB must allow the public IPs of the edge locations, the same public IPs as we had before. So, two different architectures, same concept, but we better understand network security for S3, and for ALB or EC2 behind, I must say, CloudFront. Now, CloudFront is a CDN, but it also has some really nice features. One of them is geo restriction. So you can restrict who can access your distribution, and you can provide a whitelist.

We’re saying, okay, users from this list of approved countries, and only this list, can access our CloudFront distribution. Or you can use a blacklist, where you’re saying, okay, users from these countries are not allowed to access our distribution. And the way the country is determined is using a third-party Geo-IP database, where the incoming IP is matched against it to figure out the country.
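The whitelist/blacklist decision can be sketched like this. The Geo-IP lookup itself is stubbed out with a plain dictionary, since in CloudFront that step is handled by the third-party database; the IPs and country codes are invented.

```python
# Stub for the Geo-IP lookup -- in CloudFront this is a third-party database.
GEOIP = {"203.0.113.10": "FR", "198.51.100.7": "US"}

def is_allowed(ip, mode, countries):
    """mode='whitelist': only listed countries pass.
    mode='blacklist': listed countries are blocked, everyone else passes."""
    country = GEOIP.get(ip, "UNKNOWN")
    if mode == "whitelist":
        return country in countries
    return country not in countries  # blacklist

print(is_allowed("198.51.100.7", "whitelist", {"US", "CA"}))  # True
print(is_allowed("203.0.113.10", "whitelist", {"US", "CA"}))  # False (FR not approved)
```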

So the use case for geo restriction would be when you have copyright laws that prevent access to your content, and you want to prove to regulators that you are indeed restricting content access from, say, France, if you have content in America. Okay? Now you may be asking yourself, what is really the difference between CloudFront and something like S3 Cross-Region Replication? So CloudFront is using a global edge network, and files are going to be cached for a TTL, a time to live, maybe for a day. So it’s great when you have static content that must be available everywhere around the world, and maybe you are okay with that content being a little bit outdated.

Now, for S3 Cross-Region Replication, it must be set up for each region in which you want replication to happen, and the files will be updated in near real time. It’s going to be read-only, so it’s going to help you with read performance. So S3 Cross-Region Replication will be great if you have dynamic content that needs to be available at low latency in a few regions. Hope that makes sense. Hope that’s very clear: CloudFront is for caching globally, and S3 Cross-Region Replication is for replicating into a few select regions. All right, so that’s it for this lecture. I will see you in the next lecture.

2. [SAA/DVA] CloudFront with S3 – Hands On

We are going to create an S3 bucket, and we’ll create a CloudFront distribution as well, in front of that bucket, to basically distribute the content of that bucket anywhere around the globe with low latency. Then we’re going to create what’s called an Origin Access Identity. So this is a user for CloudFront that will be accessing our S3 bucket, and we’ll limit the S3 bucket to only be accessed using this identity. So effectively, we’ll basically make sure that no one can access our S3 bucket except if they go through CloudFront. And why would we do this? Well, we can do this for many reasons: for monitoring, maybe because you have cookies, maybe because of some policies, et cetera, et cetera. And that is a very popular exam question.

So let’s go see how we do this right now. So let’s go ahead and create a bucket, and I’ll call it my-content-through-cloudfront. That’s a horrible name, but let’s go ahead with it. Click on Next, and then I will keep everything as is, click on Next, and create the bucket. Okay, so I have my bucket being created, and I’m just going to upload a few files into it. So I’m going to upload the coffee, the beach, and the index HTML files. Click on Upload, and here we go, my files are uploaded. So now we’re going to go straight into CloudFront and get to see how CloudFront works. So in CloudFront, what I have to do is create a distribution. So let’s go ahead and create a distribution, and we’re going to get started with a web type of distribution. All right, the origin domain name is going to be basically our bucket name. So here it’s my-content-through-cloudfront.s3.amazonaws.com. The origin path we’re going to leave empty, and the origin ID we’re just going to leave as is; this basically gives you a description of the origin. All right, now we’re going to restrict the bucket access.

And this is what’s very important here, the Restrict Bucket Access option. If I say no, there are no options; if I say yes, you see a lot more options appear. And so here, basically, this is if we want our users to always access our S3 content only using the CloudFront URLs, not the Amazon S3 URLs. So it’s super important that if we want that to happen, yes, we need to restrict the bucket access. And then I can select an Origin Access Identity. So here we could use an existing identity, but we don’t have any. Or we can just go ahead and create a new one, and you can just name it; we’ll call it Access Identity Demo, for example.

And then finally, we need to grant read permissions for that identity onto our S3 bucket. So either no, you will update the permissions yourself, or yes, please update the bucket policy for me. And we’ll just say yes, please update the bucket policy for me, because we’re a bit lazy. All right, so now we’re good. Now let’s just scroll down and say, okay, we are okay with HTTP and HTTPS, but maybe we want to redirect HTTP to HTTPS so that we only force encrypted connections to CloudFront. So we’ll select the second option. Maybe we’ll allow only GET and HEAD, but we could allow more methods. I can scroll down and see there are tons of parameters, but for now I’ll just leave them as is, and at the very bottom we’ll scroll all the way down and click on Create Distribution.
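If you scripted the same setup instead of clicking through the console, the choices we just made would map onto a distribution config roughly like the fragment below (the shape boto3’s CloudFront `create_distribution` expects). It’s deliberately trimmed: the bucket name and OAI ID are placeholders, and a real call needs a few more settings, for example the cache policy details.

```python
# Sketch of the console choices as an API-style payload; bucket and OAI IDs are
# placeholders, and some required cache settings are elided for brevity.
distribution_config = {
    "CallerReference": "demo-2023-05-23",   # any unique string
    "Comment": "content through CloudFront demo",
    "Enabled": True,
    "Origins": {
        "Quantity": 1,
        "Items": [{
            "Id": "my-s3-origin",
            "DomainName": "my-content-through-cloudfront.s3.amazonaws.com",
            "S3OriginConfig": {
                # Restrict Bucket Access = yes, via the OAI we created
                "OriginAccessIdentity": "origin-access-identity/cloudfront/E8Y6EXAMPLE"
            },
        }],
    },
    "DefaultCacheBehavior": {
        "TargetOriginId": "my-s3-origin",
        "ViewerProtocolPolicy": "redirect-to-https",  # redirect HTTP to HTTPS
        "AllowedMethods": {"Quantity": 2, "Items": ["GET", "HEAD"]},
    },
}

print(distribution_config["DefaultCacheBehavior"]["ViewerProtocolPolicy"])
```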

Now, this distribution can take a lot of time to be created, and this is usually the case with CloudFront. So right now the state is Enabled, but the status is In Progress, and it can take maybe ten minutes to get created. So I’m going to pause the video until then. While this is getting created, what’s really important to see is that we now have an Origin Access Identity that has been created. And you can see that there is our comment, there is an ID, so E8Y6, et cetera, et cetera, and there is an Amazon S3 canonical user ID for it. So automatically we have created an Origin Access Identity while we were creating our distribution.

If we go back to our bucket and now go to Permissions and Bucket Policy, well, we can see that a policy was created automatically for us, saying, okay, basically our user, the CloudFront Origin Access Identity E86, the one we have right here, is allowed to do a GetObject on anything within this bucket. So it’s really neat, because now we basically understand that this user that was created by CloudFront has access to our bucket, and we could even add a statement to deny anything that is not coming from this principal. So we could edit that bucket policy. We won’t do it here, but we could update it to say anything that doesn’t come from this user will be denied.

And so effectively, we would restrict our bucket access to only this Origin Access Identity. So it’s super important for you to understand this, because that could be an exam question. So now we’re going to wait for the distribution to be done. Okay, so my CloudFront distribution is done, and now I should be able to access, for example, my beach.jpg file through my URL. So I take the CloudFront URL, and as you can see, we get an Access Denied, and this is due to the fact that we’ve been redirected to the S3 bucket. So this is due to a DNS propagation issue, and we’ll have to wait about three hours for it to be fixed. But in the meantime, what we’ll do is make the files public in our S3 bucket to fix this temporarily. And so the one thing we have to do to fix this is to go to our S3 management console and make these files public. But if I want to make, for example, my coffee.jpg public, so I right-click and Make Public, click on Make Public, I get an error: it says Failed, and I get Access Denied. So why is my access denied? Well, because there is a setting in Properties.

Sorry, in Permissions, you have a Public Access Setting. And here you can change the public access settings for this bucket. Basically, we’re going to untick everything, which basically allows us to set some objects to be public. So I click on Save and I’ll click on Confirm, and this will also basically allow us to make some objects public. So let’s go back to my coffee.jpg. I right-click, Make Public, and Make Public. So now it was a success. And so if I go back to my CloudFront URL, and in there I go to coffee.jpg, I should be able, here we go, to see my coffee. So it’s pretty cool, this worked. And similarly, I can also look at beach.jpg and try to make this public.

So I’ll try Make Public, Make Public. Here we go. And in there, I’ll go back to my CloudFront URL, and then I will go to beach.jpg. And here we go, the beach is appearing as well. Finally, you may have seen that when I accessed my distribution using the domain name, I was being redirected directly to an S3 URL instead of getting served through the domain name. So this has to do with a DNS propagation issue, and if you want to read more about it, there’s a question on Stack Overflow called “AWS CloudFront redirecting to S3 bucket”, and the answer explains why. The idea is that you need to wait three or four hours for the DNS to propagate properly before you can access your images and files directly using the CloudFront URL instead of the S3 URL.

So just something you should know: it’s not a bug, it’s just something you should be aware of. It’s a temporary redirect, which will be fixed when the DNS has propagated on the AWS side. So I have waited about a day now, and if I click my domain name and go to beach.jpg, as you can see, my CloudFront URL directly takes me to beach.jpg, and I’m not redirected to the S3 bucket.

And so that’s a really cool thing. Now, because the DNS has propagated, I am served properly only through CloudFront. And so now I can go back to my S3 bucket and make that beach.jpg file not public again. So I’ll click on Public Access, change the ACL of that object, and remove Read Object. So now this file is private, and I can go back to coffee.jpg and do the same on the permissions, remove public access. Excellent. So now no files in my bucket have public access; they’re all private.

And now I can go back to my Permissions and the Public Access Settings, and I can re-tick all these settings preventing me from making anything public. So I’ll confirm this, click Confirm, yeah, I’ll confirm. And so the idea now is that all these files are private, and only the bucket policy will allow CloudFront, through the Origin Access Identity, to access my files. So we can verify it: I can just refresh this page and it’s working, I can go to coffee.jpg, it’s working too, and I can go to index.html, and as you can see, I also see my coffee picture.

So now, here, it’s really cool, because we use CloudFront, and the idea is that we have all this content being cached at the edges of the world, and my S3 bucket is only accessible through CloudFront. And that’s a very, very popular exam question. So remember, the important thing here is the Origin Access Identity, this ID right here, that is put into a bucket policy allowing only my CloudFront user to access my S3 bucket. Therefore, I’m really protected, and I can access CloudFront all around the world from anywhere, and only through CloudFront can I access my S3 bucket. So that’s it for this lecture. I hope you enjoyed it, and I will see you in the next lecture.

3. CloudFront Reports, Logs and Troubleshooting

So let’s talk about access logs for CloudFront. So the idea is that when you enable them, the same way you would for an S3 bucket or for a load balancer, for example, well, CloudFront is going to log every request made to your distribution into a logging S3 bucket. The way it’s going to work is that your users are going to access your website through the CloudFront URL, okay? And then each edge location around the world is going to send its log file, so the distribution log data, to a central S3 bucket that you have to define. Now, your S3 bucket is going to be a logging bucket, and it can be different, and it should be different, than your origin bucket, which is maybe where the content originates from. Okay? So there are two types of buckets in Amazon S3 here: your origin bucket that is linked to your CloudFront distribution, and your logs bucket to send your log files to.

And you could have a second website with a second CloudFront distribution do the exact same thing and send its logs to your logs bucket. The idea is that you could separate your distributions by adding a prefix within your logs bucket. So CloudFront has a lot of reports, and it’s possible to generate reports on cache statistics, popular objects, top referrers, usage, and viewers. And to create these reports, CloudFront is going to use the data from the access logs. But you don’t need to enable access logs to be sent into S3 to have these reports generated. Okay? So no matter what, you will find these reports. And for some troubleshooting, well, CloudFront will cache the HTTP 4xx and 5xx status codes returned by S3 or the origin server.

So that means that if a file isn’t available at the origin, then that error response is going to be cached as well. A 4xx error will indicate that the user doesn’t have access to the underlying bucket, for example a 403, or that the user is requesting an object that doesn’t exist, so a 404. Okay, and 5xx errors will indicate gateway issues. So let’s have a look in the console to see where all these options are. So here’s my CloudFront distribution, and I will click on Edit, and then within it you can enable Standard Logging. Yes. And then you need to specify a bucket for your logs, so I can specify this bucket right here, and a log prefix, for example, cloudfront. And once you’ve done this, then you can go ahead and save this setting, and then logging is going to be enabled for your access logs.

So if we go have a look in this bucket, if I go into Amazon S3 and look for this bucket right here, okay, as we can see there’s a cloudfront directory. So I did enable it before, and any time some access was done, an object has been logged, in the Gzip format. And if we have a look at one of these files directly, so let’s have a look at this file, for example, I’m going to download it. So if we have a look at this file, we have the fields; each column is represented here. Okay, so we have the date, the time, and we get some information: there was a GET request from this URL at this path right here, it was a 403, and it was from Mozilla, and so on. And so you get a lot of information around each request overall. Okay, so this is a classic access log, and you could use a service like Athena to analyze these access logs.
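A standard log file is tab-separated, with a `#Fields:` header naming the columns. A minimal parser might look like this; the sample below is shortened and invented, using only a handful of the real columns (the actual format has many more).

```python
# Invented sample in the CloudFront standard log shape: a #Fields: header,
# then tab-separated records (the real format has many more columns).
SAMPLE_LOG = """#Version: 1.0
#Fields: date time cs-method cs-uri-stem sc-status
2023-05-23\t10:00:01\tGET\t/coffee.jpg\t200
2023-05-23\t10:00:05\tGET\t/missing.jpg\t403
"""

def parse_cloudfront_log(text):
    """Parse a CloudFront standard log into a list of dicts keyed by field name."""
    fields, records = [], []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            fields = line.split()[1:]          # column names after '#Fields:'
        elif line and not line.startswith("#"):
            records.append(dict(zip(fields, line.split("\t"))))
    return records

records = parse_cloudfront_log(SAMPLE_LOG)
print(records[1]["sc-status"])  # 403
```

Once parsed like this (or loaded into Athena), you can filter by status code, path, or edge location.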

So that’s cool. Now, the other thing I want to show you is some real reports on my distribution. So if we look at the cache statistics right here, and we have a look at one of the distributions that I actually use, as you can see, we can see the number of requests available here. So we get the total requests right here; I get about 90,000 requests every day on this distribution. And then we can see how many cache hits we get, how many cache misses we get, and the errors. So my distribution is actually really good, because the cache hits are about 78% and the cache misses are about 14%, which is quite nice. And then you can see the errors as well, at about 1.69%. And then we can look at how many bytes are transferred to the users.
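Those percentages are just ratios over the total request count. As a quick sanity check, here is the arithmetic with round numbers close to the ones on screen; the exact daily counts are mine, not taken from the report.

```python
# Rough daily numbers close to the report: ~90,000 requests total
# (the individual counts below are invented to match the percentages).
total_requests = 90_000
hits, misses, errors = 70_200, 12_600, 1_521

hit_rate = hits / total_requests * 100
miss_rate = misses / total_requests * 100
error_rate = errors / total_requests * 100

print(f"hit rate:   {hit_rate:.0f}%")    # 78%
print(f"miss rate:  {miss_rate:.0f}%")   # 14%
print(f"error rate: {error_rate:.2f}%")  # 1.69%
```

Note the three buckets don’t have to sum to 100%: requests that bypass the cache (e.g. non-cacheable methods) fall outside hits and misses.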

So every day the users are downloading nine gigabytes from the cache, and 1.7 GB, sorry, are missed from the cache every day. We can look at the status codes over time, which is really, really nice, and the percentage of GET requests that didn’t finish downloading is available right here as well. So it gives us some really nice statistics, okay? And we could have the statistics based on locations as well. Now, if we look at Popular Objects, it will show you which objects are the most popular within my distribution, as well as statistics regarding these objects. So this is good. Top Referrers, this is where the content is coming from and where people are downloading it from.

So that’s good. Usage, which shows how many HTTP and HTTPS requests there are; all of them are HTTPS requests, which is good. And then we can look at protocols, so HTTPS yet again, and destination, so we can see where the bytes are sent from, so CloudFront to your users, or your origin to CloudFront. And obviously, because there’s a lot of caching, it’s going to be mostly from CloudFront to your users. And then field-level encryption requests, which I don’t have in this distribution. And finally, we get some viewer statistics: desktop, mobile, bots and crawlers, unknown, tablet, and TV. So we get some information and device trends over time. We can get information around the browsers they’re using, some Chrome, Chromium, Firefox, Edge, et cetera, et cetera, the operating system, Windows, Mac, and so on. And you can scroll down to get some trends as well as locations.

So whether from India, the United States, or other locations, you can see right here, which I think are really, really cool statistics overall. Okay, and then obviously, you can look at monitoring and telemetry. So if you ever look at the monitoring of CloudFront itself, you can go and see the distribution metrics, where you get a lot of metrics available directly from CloudWatch. So, number of requests over time; I can put it on one week to show you.

So, number of requests over time, data transfer over time in terms of bytes downloaded, error rates over time, again 4xx-type error rates, 5xx-type error rates, the latency of your origin, and the cache hit rate, which is really, really high, and it shows you again over time how you’re doing. Okay, so that’s it for this lecture. I hope you liked it, and that you got to see some real-world data, which is kind of cool, and I will see you in the next lecture.

