Amazon AWS Certified Machine Learning Specialty – Modeling Part 14

38. Amazon Rekognition

Next, let’s talk about Amazon Rekognition, which is, again, what it sounds like. It’s computer vision, and its main purpose is object and scene detection, hence the name Rekognition. So this stuff is worth remembering. There’s a lot it can do in the field of computer vision, and it’s all wrapped up into the Rekognition service.

The first thing it does is object and scene detection. So you can actually ask, okay, what’s in this image? Just like using ImageNet and ResNet, or any of the SageMaker object detection algorithms. It’s just a higher-level way of using that stuff. So you just pass in an image, and it’ll give you back information about what’s in that image and where those objects are. Also, you can use your own face collection if you want to. So if you want to be able to identify individuals in photos and know who they are, you can do that with Rekognition as well.
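To make that concrete, here’s a minimal sketch of an object and scene detection call using boto3’s DetectLabels API; the bucket and image names are hypothetical placeholders:

```python
import boto3

rekognition = boto3.client("rekognition")

# Detect objects and scenes in an image stored in S3
# (bucket and key names here are just placeholders)
response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "my-images-bucket", "Name": "street-scene.jpg"}},
    MaxLabels=10,
    MinConfidence=70,
)

# Each label comes back with a name and a confidence score
for label in response["Labels"]:
    print(f"{label['Name']}: {label['Confidence']:.1f}%")
```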

And yes, this gets into very creepy ethical privacy concerns, but I’m not going to go there because you can be sure that the exam isn’t either. But if you do use this in the real world, please think twice about how you’re using this service. It’s one of those powerful technologies that can be used for good or for evil. But enough of that. It can also do image moderation. I think that’s a really exciting application of it. So it can actually look at an image and automatically tell you if it thinks that it might include something offensive.

Well, I don’t want to get into what’s offensive, but use your imagination. If you have some sort of an open social network, for example, and you want to make sure that people aren’t posting icky stuff, you could use Rekognition to automatically look at those images and flag things that look like they might be in violation of your terms of service. This is good because it means that you don’t need to employ humans to look at horrifying images all day long.
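As a rough sketch, that kind of automated moderation check might look like this with the DetectModerationLabels API (bucket and key names are placeholders):

```python
import boto3

rekognition = boto3.client("rekognition")

# Flag potentially offensive content in a user-uploaded image
# (bucket and key names are hypothetical)
response = rekognition.detect_moderation_labels(
    Image={"S3Object": {"Bucket": "user-uploads", "Name": "post-123.jpg"}},
    MinConfidence=60,
)

if response["ModerationLabels"]:
    for label in response["ModerationLabels"]:
        print(f"Flagged: {label['Name']} ({label['Confidence']:.1f}%)")
else:
    print("No moderation concerns detected")
```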

Believe me, reviewing that kind of content manually is a job that nobody wants. It can also do facial analysis, which we’re seeing here in this example. If it sees a face in an image, it can tell you all about that face: how confident it is that it’s a face, whether it’s male or female, how old the person is, what their facial expression is, what their emotion is, whether they’re wearing glasses, things like that. So again, getting into a little bit of creepy territory here, but it can do that.
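A hedged sketch of what that looks like with the DetectFaces API; the bucket and file names are placeholders:

```python
import boto3

rekognition = boto3.client("rekognition")

# Analyze all faces in an image; Attributes=["ALL"] requests the full
# attribute set (age range, gender, emotions, glasses, and so on)
response = rekognition.detect_faces(
    Image={"S3Object": {"Bucket": "my-bucket", "Name": "portrait.jpg"}},
    Attributes=["ALL"],
)

for face in response["FaceDetails"]:
    age = face["AgeRange"]
    top_emotion = max(face["Emotions"], key=lambda e: e["Confidence"])
    print(f"Face ({face['Confidence']:.1f}% confidence): "
          f"age {age['Low']}-{age['High']}, "
          f"gender {face['Gender']['Value']}, "
          f"emotion {top_emotion['Type']}, "
          f"glasses: {face['Eyeglasses']['Value']}")
```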

So it can analyze a face and tell you its expected age, its emotion, things like that. You can imagine applications like a camera in a store if you want to see whether your customers are happy or not, right? Not sure that’s an ethical thing to do, but you can do it. You can also do celebrity recognition. Man, there’s a lot of stuff here that’s kind of creepy.

If I were a celebrity, I would not be happy about that. But it actually has a database of celebrities, and if you feed in a video stream or a photo, it can flag whether a celebrity was detected in that image or video stream. Wow. Yeah.

So if you’re a celebrity, you could actually set up a camera in Hollywood and basically have an automatic paparazzi, if you will. This makes me cringe a little bit.

Also, it can do face comparison, so it can actually compare two faces together. It can extract text that’s in an image, too. So that’s pretty cool. If you had signs in an image, you can imagine extracting the text in those signs using Rekognition and maybe turning around and feeding that into Translate to translate the sign into another language. That would be a really cool example of putting these services together.

It can also do video analysis. So again, it can use either pictures or video streams. And on video analysis, it can mark at what point on the timeline it found an object, a specific person (through facial analysis and object detection), or a specific celebrity. It can also do people pathing, so it can actually show you the path that individuals are following through the scene over time. I’m just not going to go there.
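For the still-image variants of these features, a hedged boto3 sketch might look like the following; all bucket and file names are hypothetical:

```python
import boto3

rekognition = boto3.client("rekognition")

# Compare a face in a source image against faces in a target image
comparison = rekognition.compare_faces(
    SourceImage={"S3Object": {"Bucket": "my-bucket", "Name": "id-photo.jpg"}},
    TargetImage={"S3Object": {"Bucket": "my-bucket", "Name": "group-photo.jpg"}},
    SimilarityThreshold=80,
)
for match in comparison["FaceMatches"]:
    print(f"Face matched with {match['Similarity']:.1f}% similarity")

# Extract text from an image, e.g. a street sign
text = rekognition.detect_text(
    Image={"S3Object": {"Bucket": "my-bucket", "Name": "sign.jpg"}}
)
lines = [d["DetectedText"] for d in text["TextDetections"] if d["Type"] == "LINE"]
print(lines)  # these lines could then be handed to Amazon Translate
```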

There are interesting law enforcement applications of this, but there are real ethical concerns about this stuff. Anyway, you just need to know how it works for the exam. Don’t worry about whether or not it’s the right thing to do; the exam is definitely not going to get into those issues. Now for the nitty-gritty: it is worth having some depth on this service.

So remember, the images will come from S3, or you can just provide raw image bytes as part of the request itself, so you can actually pass in the image as part of the request. If you’re using the API, using S3 will usually be faster, especially if the image is already there; Rekognition can obviously access S3 very efficiently. As for facial recognition, how well that works will depend on having good lighting, a good angle, good visibility of the eyes (it depends on the eyes a lot, it turns out), and the resolution of the image as well. So if you have just a few pixels of a face, obviously that’s not going to work as well as a high-resolution image. Video must come from Kinesis Video Streams. So going back to the very beginning of this course, where we talked about Kinesis: Kinesis Video Streams and Rekognition go hand in hand.
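Both input styles look like this in boto3 (the file and bucket names are just placeholders):

```python
import boto3

rekognition = boto3.client("rekognition")

# Faster path: reference an image that's already sitting in S3
response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "my-bucket", "Name": "photo.jpg"}}
)

# Alternative: send the raw image bytes in the request itself
with open("photo.jpg", "rb") as f:
    response = rekognition.detect_labels(Image={"Bytes": f.read()})
```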

So that alone is worth remembering for the exam, because the exam talks a lot about building larger systems, right? So if you have a Kinesis video stream, you could feed that video stream into Rekognition to do all the things that Rekognition does: object detection, facial analysis, celebrity detection, things like that. That video must be H.264 encoded, and it must be between five and 30 frames per second. And if you have to choose between resolution and frame rate, it’s better to go with more resolution to ensure good results from Rekognition. You can also use Rekognition together with AWS Lambda to trigger image analysis upon upload of an image.
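Wiring Rekognition to a Kinesis video stream is done through a stream processor. Here’s a hedged sketch using the CreateStreamProcessor API for face search; all the ARNs, names, and the collection ID are hypothetical:

```python
import boto3

rekognition = boto3.client("rekognition")

# Create a stream processor that searches faces in a Kinesis video stream
# against an existing face collection, writing results to a Kinesis data
# stream. Every ARN, name, and ID below is a placeholder.
rekognition.create_stream_processor(
    Name="my-face-search-processor",
    Input={"KinesisVideoStream": {
        "Arn": "arn:aws:kinesisvideo:us-east-1:123456789012:stream/my-video-stream/123"
    }},
    Output={"KinesisDataStream": {
        "Arn": "arn:aws:kinesis:us-east-1:123456789012:stream/my-results"
    }},
    Settings={"FaceSearch": {
        "CollectionId": "my-face-collection",
        "FaceMatchThreshold": 85.0,
    }},
    RoleArn="arn:aws:iam::123456789012:role/RekognitionStreamRole",
)

# Start processing the live stream
rekognition.start_stream_processor(Name="my-face-search-processor")
```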

So you could set up a Lambda function that says, whenever I upload a new image into this S3 bucket, go run Rekognition on it and tell me what’s in that image. So you can again imagine a larger system here. Maybe you have a camera out there in the world that’s taking still photos every minute or something, uploading those images to S3, and then a Lambda function notices that an image was uploaded to S3, runs Rekognition on it, and tells you whether whatever you’re looking for was in that image or not. You can play with this in the console, and it is fun to do, so I will leave that as an exercise for you.
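A minimal sketch of such a Lambda handler, assuming the S3 bucket is configured to send upload events to it:

```python
import boto3

rekognition = boto3.client("rekognition")

def lambda_handler(event, context):
    """Triggered by an S3 upload event; runs DetectLabels on the new image."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]

    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=80,
    )
    labels = [label["Name"] for label in response["Labels"]]
    print(f"Labels found in s3://{bucket}/{key}: {labels}")
    return {"labels": labels}
```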

If you just go to Amazon Rekognition in the AWS console, you can try out all the different capabilities of Rekognition. They conveniently provide sample images, which I’m sure are chosen carefully to make sure they work well. But it does work well. For example, on this one, we’re trying out the object and scene detection capability. You can see that it’s actually drawn bounding boxes around several different cars and flagged them as cars, vehicles, automobiles, and transportation. It found a human, found a person. So it’s even giving you multiple names for the same thing. And we could expand that and show even more. I’m not sure if it actually found that skateboard, but it classified it as something, so it might be interesting to expand that and see how it classified that little skateboard as well. So that is Rekognition in a nutshell: basically, computer vision behind a very high-level API, where you don’t necessarily need to know a lot about machine learning to use it. You can use all the power of computer vision, even if you’re not a machine learning expert.

39. Amazon Forecast

This next one’s pretty quick. Amazon Forecast, again, does what it sounds like: it’s time series analysis as a service, if you will. So it is a fully managed service to deliver highly accurate forecasts using machine learning. What’s pretty cool about it is that it has an AutoML option that will actually choose the best model for your time series data. So it can go out and evaluate different time series models, including ARIMA, DeepAR, ETS, NPTS, and Prophet, and choose the best one for your data automatically. That’s pretty neat stuff. It can work with any time series of information, so you can imagine a wide variety of applications. Maybe you want to predict prices, or when to run a promotion, or economic performance; stock prices obviously come to mind. And you can also combine that with associated data to find relationships between multiple time series together. Sounds a lot like SageMaker’s DeepAR, doesn’t it? Well, in fact, that is one of the options for AutoML.

It is actually using DeepAR under the hood as one of the algorithms that it evaluates. This can be used for things like inventory planning, financial planning, and resource planning: what’s my projected traffic on my website going forward? How many servers will I need to provision for that traffic? Its underlying concepts include dataset groups, predictors, and forecasts. They are pretty much what they sound like: a dataset group is a group of datasets that might be related to each other, time series that have some relationship to each other; a predictor is the trained model used to generate predictions; and a forecast is the actual forecast data that comes back from Forecast. And that’s all you need to know about Amazon Forecast. I wouldn’t expect a ton of depth on this on the exam, but again, you need to know what it is and what it’s for. That’s it. It’s for forecasting time series data.
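As a rough boto3 sketch of those concepts (the names and ARNs are placeholders, and in practice each of these calls is asynchronous, so you would wait for training to finish between steps):

```python
import boto3

forecast = boto3.client("forecast")
forecast_query = boto3.client("forecastquery")

# Train a predictor with AutoML, which evaluates candidate algorithms
# (ARIMA, DeepAR, ETS, NPTS, Prophet, ...) and picks the best fit.
forecast.create_predictor(
    PredictorName="demand_predictor",
    ForecastHorizon=30,        # predict 30 time steps ahead
    PerformAutoML=True,
    InputDataConfig={
        "DatasetGroupArn": "arn:aws:forecast:us-east-1:123456789012:dataset-group/demand"
    },
    FeaturizationConfig={"ForecastFrequency": "D"},  # daily data
)

# Once the predictor has trained, generate a forecast from it
forecast.create_forecast(
    ForecastName="demand_forecast",
    PredictorArn="arn:aws:forecast:us-east-1:123456789012:predictor/demand_predictor",
)

# Then query the forecast for a specific item
result = forecast_query.query_forecast(
    ForecastArn="arn:aws:forecast:us-east-1:123456789012:forecast/demand_forecast",
    Filters={"item_id": "widget-42"},
)
```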

40. Amazon Lex

Now we have Amazon Lex, which is kind of a different kind of beast. Amazon Lex is billed as the inner workings of Alexa, and you would think that this would be a really complicated machine learning, artificial intelligence, deep learning thing, but it really isn’t. And as such, you won’t really get a whole lot of depth on this on the exam. It’s really just a chatbot engine. So it’s more about coordinating the rules that govern how the chatbot responds to different kinds of requests. The way that it’s structured is that a bot is built around what we call intents, and an utterance invokes an intent. So for example, an utterance of “I want to order a pizza” might be mapped to a specific intent within Lex that says, okay, this is the pizza-ordering intent. You then write a Lambda function to fulfill that intent. So basically you have to write code to order a pizza given the information that was specified. That extra information is what’s called a slot.

So a slot specifies extra information that is needed by the intent. So Lex can take some text input and say, okay, they said that they want to order a pizza. It can then turn around and ask, what size pizza do you want? And it would fill in the slot for the pizza size. It could ask, what toppings do you want? And the user would come back with what toppings they want, and those would be slots. When it has all the information it needs, that Lambda function will be responsible for passing that off to some other service that actually orders the pizza and makes it happen. You can deploy a chatbot developed with Lex to the AWS Mobile SDK, which will just automatically make a mobile app out of it for you, and also to Facebook Messenger, Slack, and Twilio. So it’s important to remember that Amazon Lex, although it bills itself as Alexa’s inner workings, is really just the chatbot part of it: the part that’s building the rules about what are the things that I can do, what’s the information I need to fulfill those intents, and how do I turn around and invoke a Lambda function to make that happen? It’s not actually doing any sort of general-level AI to understand what’s being said. It’s just taking the words that are coming in as text and trying to map them to a pre-defined list of intents, based on the utterances you’ve set up, if you will.
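A minimal sketch of what such a fulfillment Lambda might look like, using the Lex V1 event format; the intent name, slot names, and ordering logic are all hypothetical:

```python
def lambda_handler(event, context):
    """Fulfillment Lambda for a hypothetical pizza-ordering intent."""
    intent = event["currentIntent"]["name"]
    slots = event["currentIntent"]["slots"]

    if intent == "OrderPizza":
        size = slots.get("PizzaSize")
        toppings = slots.get("Toppings")
        # In a real system, this is where you'd hand off to whatever
        # backend actually places the order
        message = f"Ordered a {size} pizza with {toppings}."
    else:
        message = "Sorry, I don't know how to handle that."

    # Tell Lex the intent is fulfilled and what to say back to the user
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {"contentType": "PlainText", "content": message},
        }
    }
```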

So it’s not as smart as it sounds, is what I’m trying to say. Also, it’s not really speech-to-text or text-to-speech; it’s the inner workings of running that chatbot engine. So to build an Alexa, you would probably use Amazon Transcribe to transcribe incoming streaming audio from the device into text that could be fed into Amazon Lex. Lex would come back with text as a response, and you would use something like Amazon Polly to actually speak that response back through Alexa. There’s a whole lot of depth to Lex; they have an entire course on this for free on AWS, which is kind of interesting. But you’re not going to need that much depth for the exam. You just need to know, again, what it is, what it’s for, and how it fits into other things. And for the example of building your own Alexa, it would fit together with something like Transcribe and Polly to do that interface between text and voice in both directions.
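A rough sketch of that round trip, assuming Amazon Transcribe has already turned the audio into text; the bot name, alias, and voice are placeholders, and this uses the Lex V1 runtime API:

```python
import boto3

lex = boto3.client("lex-runtime")
polly = boto3.client("polly")

def respond_to_user(transcribed_text: str) -> bytes:
    # Send the transcribed user utterance to the Lex bot
    lex_response = lex.post_text(
        botName="PizzaBot",      # hypothetical bot
        botAlias="prod",
        userId="user-123",
        inputText=transcribed_text,
    )
    reply_text = lex_response["message"]

    # Turn Lex's text reply back into speech with Polly
    speech = polly.synthesize_speech(
        Text=reply_text,
        OutputFormat="mp3",
        VoiceId="Joanna",
    )
    return speech["AudioStream"].read()
```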
