Amazon AWS Certified SysOps Administrator Associate – Databases for SysOps Part 9

May 28, 2023

18. [DVA] ElastiCache Redis Cluster Modes

There are two replication modes for ElastiCache for Redis, and we need to know them both. The first one is called cluster mode disabled. In this case you have one primary node and up to five replicas. So for Redis we have one shard, and all your data is in this shard. We'll have one primary cache node, and then we can optionally set up from zero to five replica nodes. The idea is that if the primary node fails, a replica can take over. The replication is asynchronous between the caches, and the primary node is used for reads and writes while the other nodes are read-only.

So on top of disaster recovery, you can also scale your read capacity in ElastiCache for Redis by enabling read replicas. You have one shard, which means that every node holds all the data in your Redis cluster. The idea here is that you're guarding against data loss in case of a node failure. You can also enable Multi-AZ, which is enabled by default by the way, so that you get automatic failover to a replica in another AZ.

So again, this is helpful for Multi-AZ failover and also for scaling the read performance of your ElastiCache cluster. Now, there is a second mode, and this is cluster mode enabled. In this mode the data is going to be partitioned across many shards, which is helpful if you want to scale your writes. So let's have a look: we have shard 1, shard 2, shard 3, and so on, all the way up to shard N.

And the idea is that the data is going to live partially on shard 1, partially on shard 2, all the way up to shard N. So it's divided across all your shards. Now, each shard works just like what we saw before when cluster mode was disabled: we can have one primary node and up to five replica nodes. So the data is replicated within each shard, and you set up the same number of replicas across all the shards. Okay, there's also a Multi-AZ capability.

It's on by default, so you can again fail over between your primary node and your replica nodes in case of an Availability Zone failure. And you get up to 500 nodes per cluster. That means that if you don't set up any replicas, you can have 500 shards with a single primary each. But if you set up replication, for example one primary and one replica per shard, then you get 250 shards. And if you want, say, five replicas and one primary per shard, you get 83 shards maximum.
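As a quick sanity check, the 500-node arithmetic above can be expressed in a few lines of Python (a minimal sketch of the node-limit math, nothing more):

```python
# Cluster mode enabled: each shard = 1 primary + N replicas.
# The cluster-wide cap is 500 nodes, so the maximum shard count
# is 500 divided by the nodes per shard (integer division).
MAX_NODES_PER_CLUSTER = 500

def max_shards(replicas_per_shard: int) -> int:
    """Maximum number of shards given a replica count per shard."""
    nodes_per_shard = 1 + replicas_per_shard  # primary + replicas
    return MAX_NODES_PER_CLUSTER // nodes_per_shard

print(max_shards(0))  # 500 shards, primary only
print(max_shards(1))  # 250 shards, 1 primary + 1 replica each
print(max_shards(5))  # 83 shards, 1 primary + 5 replicas each
```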

So the idea here is that you enable cluster mode when you are really interested in sharding your data. That means, again, that you want to scale your writes and that the data is going to be partitioned, so it will live across many different shards. Okay, so this is the main difference between cluster mode enabled and cluster mode disabled for ElastiCache. So that's it, just this to know going into the exam; it can come up in a question. I hope you liked this lecture, and I will see you in the next lecture.
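To make the partitioning described in this lecture concrete, here is an illustrative Python model of how a Redis-style cluster maps keys to shards: the key is hashed with CRC16 into one of 16384 hash slots, and slot ranges are assigned to shards. This is a simplified sketch, not actual client library code:

```python
# Illustrative model of key partitioning in a Redis cluster: the key
# is hashed with CRC16 (XMODEM variant) into one of 16384 hash slots,
# and each shard owns a contiguous range of slots.
NUM_SLOTS = 16384

def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM), the checksum Redis uses for hash slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def shard_for_key(key: str, num_shards: int) -> int:
    """Map a key to a shard by splitting the slot space evenly."""
    slot = crc16(key.encode()) % NUM_SLOTS
    return slot * num_shards // NUM_SLOTS

# Different keys can land on different shards; the same key always
# lands on the same shard.
for key in ("user:1", "user:2", "session:42"):
    print(key, "-> shard", shard_for_key(key, num_shards=3))
```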

19. ElastiCache Redis for SysOps

Okay, so now let's do a little deep dive into how Redis is going to be seen for the SysOps exam. First of all, you need to know how to scale Redis. There are two modes for Redis: cluster mode disabled and cluster mode enabled. In the case of cluster mode disabled, we just have one node group with a primary node and from zero to five replica nodes. If you want to do horizontal scaling, that means adding or removing instances, it's as easy as adding and removing read replicas from your node group, and as you know, you can have a maximum of five replicas. The other kind of scaling you can do for Redis is vertical scaling. In this case, you're going to scale up or down to larger or smaller node types.

Okay? So that means you're going to change the instance type behind your ElastiCache cluster. To do so, it's a one-click operation in Redis as well. But you need to know that internally, ElastiCache will create a new node group, and then there will be a data replication from the old node group to the new node group. Finally, once this replication has happened, there's going to be a DNS update done by the ElastiCache service, so that it's transparent for your application, which can connect to the new node group directly.
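The endpoint-swap behavior can be modeled as follows: the application only ever resolves a stable endpoint name, and the service repoints that name at the new node group once replication finishes. A toy Python model, with invented endpoint and node-group names for illustration:

```python
# Toy model of ElastiCache's transparent vertical scaling for
# cluster mode disabled. The application keeps using one stable
# endpoint name; the service swaps what it resolves to.
dns = {"my-redis.primary.cache.example.com": "old-node-group"}

def resolve(endpoint: str) -> str:
    return dns[endpoint]

# 1. The service creates a new node group (larger node type) and
#    replicates data from the old one (not modeled here).
# 2. Once replication completes, the DNS record is updated:
dns["my-redis.primary.cache.example.com"] = "new-node-group"

# 3. The application resolves the same name and transparently
#    reaches the new node group.
print(resolve("my-redis.primary.cache.example.com"))
```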

So that's for cluster mode disabled. Now for cluster mode enabled, it gets a little bit more complicated. There are two kinds of scaling. The first one is online scaling, where your cluster stays up and running during the scaling process. That means there is no downtime; you may get some degradation in performance, but your Redis cluster is up and running. The second is offline scaling, where you take your entire cluster offline, so you will not be able to serve requests during the scaling process. But then you can do more configuration changes, for example changing node types or upgrading the engine version. So if we have a look at horizontal scaling for cluster mode enabled, you can do resharding.

That means scaling in and out by adding and removing shards. Okay? So in cluster mode enabled, remember, we have multiple shards, so we can horizontally scale by adding more shards. Or you can do shard rebalancing to distribute the keyspace among the shards as evenly as possible. This horizontal scaling is supported in both online and offline mode, so you can keep the cluster up or take it down when you scale horizontally. Now, vertical scaling is changing the node type to change the read/write capacity. So you want to change to a larger or a smaller node type, and this supports online scaling. Again, this is a one-click thing, and it happens behind the scenes.
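Shard rebalancing aims at an even keyspace split. Assuming the 16384-hash-slot model that Redis cluster uses, an even split across N shards can be computed like this (illustrative sketch, not the service's actual algorithm):

```python
TOTAL_SLOTS = 16384  # hash slots in a Redis cluster

def even_slot_ranges(num_shards: int) -> list:
    """Return (first_slot, last_slot) per shard, split as evenly as possible."""
    ranges = []
    start = 0
    for i in range(num_shards):
        # Spread the remainder one extra slot at a time over the first shards.
        size = TOTAL_SLOTS // num_shards + (1 if i < TOTAL_SLOTS % num_shards else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges

print(even_slot_ranges(3))
# Each of the 3 shards gets ~5461 slots; together they cover 0..16383.
```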

Okay, now for Redis metrics to monitor. We have a few metrics that should make sense for a cache technology. The first one is Evictions. This is the number of non-expired items that the cache evicted to allow space for new writes, which means that your memory is full. The solution is to choose an eviction policy, for example to automatically evict the least recently used items so that you don't have too many evictions, or to move to a larger node type to have more memory.

Or you want to scale out by adding more nodes, for example if you're using Redis with cluster mode enabled. Then you obviously want to monitor CPUUtilization: if it's too high, you want to scale up to a larger node type or scale out horizontally by adding more nodes. SwapUsage should not exceed 50 MB, and if it does, make sure you look at your memory settings. Then we have CurrConnections, which is the number of current and active connections to your Redis cluster.
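The "evict the least recently used item" policy mentioned above can be sketched with a tiny LRU cache in Python. This only illustrates the eviction behavior; it is not how Redis implements its LRU policies internally:

```python
from collections import OrderedDict

class LRUCache:
    """Tiny cache that evicts the least recently used key when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()
        self.evictions = 0  # the counter you would watch in CloudWatch

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # mark as most recently used
            return self.data[key]
        return None

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # evict least recently used
            self.evictions += 1
        self.data[key] = value

cache = LRUCache(capacity=2)
cache.put("a", "1")
cache.put("b", "2")
cache.get("a")         # "a" is now the most recently used
cache.put("c", "3")    # cache is full: "b" is evicted
print(cache.get("b"))  # None
print(cache.evictions) # 1
```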

If it's too high, maybe your application has some bad behavior and is initiating a new connection to Redis every single time; in that case, have a look at the application behavior. There's also DatabaseMemoryUsagePercentage, which is the percentage of memory utilization, then network in and out, and ReplicationBytes and ReplicationLag, which give you some information about the replication of the data within your Redis cluster.

You want to keep an eye on ReplicationBytes, and you obviously want ReplicationLag to be low, so that there is little lag between the read replicas and the primary nodes. Okay, so that's it. That's all you need to know for Redis. Hopefully everything is quite intuitive, but it's good to see it once, and I will see you in the next lecture.
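The alerting guidance above can be folded into a small helper. The 50 MB swap threshold comes from the guideline above; the 90% CPU threshold is an assumption added for illustration, not an AWS-documented value:

```python
def redis_alerts(metrics: dict) -> list:
    """Flag Redis metric values that warrant a look, per the guidance above."""
    alerts = []
    if metrics.get("SwapUsage", 0) > 50 * 1024 * 1024:  # > 50 MB
        alerts.append("SwapUsage high: review memory settings")
    if metrics.get("CPUUtilization", 0) > 90:           # illustrative threshold
        alerts.append("CPU high: scale up or scale out")
    if metrics.get("Evictions", 0) > 0:
        alerts.append("Evictions occurring: tune eviction policy or add memory")
    return alerts

print(redis_alerts({"SwapUsage": 80 * 1024 * 1024, "CPUUtilization": 40}))
```

In practice these values would come from CloudWatch rather than a dict, but the thresholds and the reactions are the same.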

20. ElastiCache Memcached for SysOps

Let's have a look at what we need to know from a Memcached perspective regarding scaling. So we have a Memcached cluster, and it has two cache nodes. As you may know, a Memcached cluster can have between 1 and 40 nodes (this is a soft limit). If you want to do horizontal scaling, you add and remove nodes from the cluster, and then, using auto discovery, which I will show you in the next slide, your app will be able to find these new nodes. So if you want to do horizontal scaling, you just add cache nodes to your cluster and you're good to go with the auto discovery mechanism.

Now, in the case of vertical scaling, we can scale up and down to larger or smaller node types. The scaling process is as follows: you create a new cluster with the new node type, then you update your application to use the new cluster's endpoints, and then you delete the old cluster. This is something you have to do manually.

And as you know, Memcached does not have a backup system. That means a new Memcached cluster and its nodes will start out without any data, and your application has to repopulate the data itself. Okay, so it's different from Redis, which has a backup mechanism. So if we look at the old Memcached cluster, for example, it's running a smaller instance type and the application is connected to it right now, but we're going to create a new Memcached cluster with a larger node type.

And as we know, this new cluster will be empty. On the application side, we update the endpoints: we stop connecting to the old cluster and start connecting to the new cluster. And our application, if it's well programmed, will start populating data into this new Memcached cluster.
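The "application fills the new cluster itself" behavior is essentially the classic cache-aside pattern: on a miss, read from the database and write the value into the new, empty cache. A sketch with a plain dict standing in for the Memcached client:

```python
# Cache-aside: after switching endpoints to the new (empty) Memcached
# cluster, every first read misses, falls through to the database,
# and repopulates the cache. A dict stands in for the Memcached client.
database = {"user:1": "alice", "user:2": "bob"}  # source of truth
new_cache = {}                                   # the new, empty cluster

def get_user(key: str) -> str:
    value = new_cache.get(key)
    if value is None:             # cache miss
        value = database[key]     # read from the database
        new_cache[key] = value    # repopulate the cache
    return value

print(get_user("user:1"))  # miss -> database -> cached
print(get_user("user:1"))  # now served from the cache
print(new_cache)
```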

Okay, now for Memcached auto discovery: what is it? Well, we have a cluster, say with four nodes, and the clients want to connect to all of them. Typically, the clients would need to reference each of these four nodes using their DNS endpoints to connect to them. But if we start adding new nodes over time, how do the clients get updated with all these nodes? That's what auto discovery is for, and the idea is that all the cache nodes in the cluster know about all the other nodes.

So that means that, from a connection perspective, the client can seamlessly connect to all of them. The client connects to the configuration endpoint and gets back the IP address of, say, cache node 1. Then it connects directly to cache node 1, and cache node 1 responds with metadata containing the IP addresses of all the nodes in the cluster, so the client can use that to connect to any node and find the right data in your cluster. Okay, finally, some metrics to monitor, which are very similar to Redis.
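Auto discovery as described above can be modeled with a mock: the client asks one known endpoint (the configuration endpoint) and receives metadata listing every node, after which it can connect to any of them. A toy Python sketch with made-up addresses:

```python
# Toy model of Memcached auto discovery. Every node knows the full
# node list, so the client only needs one configuration endpoint.
CLUSTER_NODES = ["10.0.0.11:11211", "10.0.0.12:11211",
                 "10.0.0.13:11211", "10.0.0.14:11211"]

def query_config_endpoint() -> str:
    """The configuration endpoint resolves to any one node."""
    return CLUSTER_NODES[0]

def get_cluster_metadata(node: str) -> list:
    """Any node returns the addresses of all nodes in the cluster."""
    assert node in CLUSTER_NODES
    return list(CLUSTER_NODES)

# Client side: one hop to the config endpoint, then the full node list.
first_node = query_config_endpoint()
all_nodes = get_cluster_metadata(first_node)
print(all_nodes)  # the client can now connect to any of the 4 nodes
```

If a node is added to the cluster, the next metadata response includes it, which is why the client never needs a hard-coded node list.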

So we have Evictions, which is the number of non-expired items that the cache evicted to allow space for new writes. The solution is to use a different eviction policy, to set up a larger node type, or to scale out by adding more nodes. Then CPUUtilization: if it's too high, you need to scale up to a larger node type or scale out by adding more nodes.

SwapUsage, same as before, and CurrConnections as well, which is the number of connections between your application and Memcached. So make sure your application does not misbehave by opening too many connections. And then FreeableMemory, which is the amount of free memory on the host. Okay, so hopefully that makes sense. Hopefully you're good to go on Memcached now, and I will see you in the next lecture.

