Amazon AWS Certified Machine Learning Specialty – Modeling Part 13

January 25, 2023

34. Amazon Comprehend

Up next, we're going to go outside the world of Amazon SageMaker and into the higher-level AI and ML services that AWS offers. These are services that aren't really geared toward machine learning experts; they're more for anyone who just wants to use ML capabilities. So you don't necessarily have to understand how a neural network works, or even write a lot of code, to use these higher-level services. They're aimed at a more general audience, if you will. You'll still need to know what they can do and how they can be put together to solve different problems on the exam. So even though these services don't really get at "are you an ML expert," which is what the exam is really trying to figure out, you still need to know what they are and the capabilities that they have, at least. So let's go through them one at a time, starting with Amazon Comprehend. Remember we talked about BlazingText in SageMaker, and how that sounded like a really comprehensive natural language processing service?

Well, it really wasn't, but Comprehend is; it does full NLP. Basically, Amazon Comprehend is a service for natural language processing and text analytics, and you can input any text you can imagine into it. Some sample applications that they give are social media feeds (if you want to do sentiment analysis on them), emails, web pages, documents, and transcripts. They even have a separate product called Comprehend Medical that's specifically built for medical records, all sorts of medical terminology, and the strict privacy requirements that surround that sort of data. The way it works is that it can extract key phrases, entities, sentiments, languages, syntax, topics, and document classifications from your documents, all automatically. And you can train it on your own data if you want to, or just use it out of the box with its pretrained models, whatever you want to do.

So you can use it at a high level, just out of the box, and if you are a little bit more advanced, you can actually train it on your own data as well. Let's dive into all those different things that Amazon Comprehend can do. The first thing we mentioned was entities. If you give it a piece of text (and this is one of the examples that you'll see in the AWS console), it can actually extract the entities, the things, the important objects that exist within that text. In this example, their sample text is "Amazon.com, Inc. is located in Seattle, Washington, and was founded on July 5, 1994 by Jeff Bezos." Wow, that was a long time ago, thinking back on it. Some of the entities that it extracted here are "Amazon.com, Inc.", which it has classified as an organization with a 90% confidence score; "Seattle, Washington" is a location; "July 5, 1994" is a date; and "Jeff Bezos" is a person.
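If you were consuming that entity output in code, it would look something like the following sketch. This is not the live service call (that would need AWS credentials); instead it processes a hardcoded response in the same shape that Comprehend's DetectEntities API returns, with sample scores based on the example above. The confidence threshold is just an illustrative choice.

```python
# Sketch: filtering Comprehend DetectEntities output by confidence score.
# The live call would be roughly:
#   comprehend = boto3.client("comprehend")
#   response = comprehend.detect_entities(Text=text, LanguageCode="en")
# Here we use a hardcoded response in the same shape so the logic is clear.

sample_response = {
    "Entities": [
        {"Text": "Amazon.com, Inc.", "Type": "ORGANIZATION", "Score": 0.90},
        {"Text": "Seattle, Washington", "Type": "LOCATION", "Score": 0.99},
        {"Text": "July 5, 1994", "Type": "DATE", "Score": 0.99},
        {"Text": "Jeff Bezos", "Type": "PERSON", "Score": 0.99},
    ]
}

def high_confidence_entities(response, threshold=0.8):
    """Keep only the entities Comprehend is reasonably sure about."""
    return [
        (e["Text"], e["Type"])
        for e in response["Entities"]
        if e["Score"] >= threshold
    ]

print(high_confidence_entities(sample_response))
```

In a real application you would tune that threshold to trade precision against recall for your use case.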

So you can see it's sort of plucking out phrases that represent a single concept and categorizing them for you. That's what entities are all about. Entities are a little bit different from key phrases. You'll see here that we have somewhat longer phrases in the key phrases, and we're not saying what they are; we're just saying these are important phrases in the sentence. So there's a slight difference there from entities. We start off with the same entities and key phrases: Amazon.com, Seattle, Washington, July 5, Jeff Bezos. But as we get further into the sentence, we see longer phrases being flagged, such as "other notable Seattle-based companies." That wasn't flagged as an entity, but it was flagged as a key phrase. "Starbucks and Boeing" is actually two different entities, but it's one key phrase. So it's kind of like breaking up the sentence into its biggest parts, if you will, the phrases. Well, that's what it is, key phrases, right? And it has a confidence score associated with each one. But again, it's not trying to classify these things, because a phrase might actually encompass more than one thing. It can also do language detection. Just by looking at this text, it says, with 99% confidence, "I'm pretty sure this is English." So if you just need to figure out what language a given bit of text is in, Comprehend can do that. That might be useful, for example, in building a machine translation system. Let's say that you want to build a universal translator, basically, where the input might be just a bunch of text and you don't necessarily know upfront what language it's in. Maybe it's something that somebody inputted from their phone while they were traveling.
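Since that language-detection output is just a list of language codes with confidence scores, picking the winner is straightforward. Here's a minimal sketch; the response layout follows Comprehend's DetectDominantLanguage API, but the sample values are made up, and the live boto3 call appears only as a comment.

```python
# Sketch: picking the dominant language from a DetectDominantLanguage-style
# response. The live call would be roughly:
#   boto3.client("comprehend").detect_dominant_language(Text=text)
# The sample response below is illustrative.

sample_response = {
    "Languages": [
        {"LanguageCode": "en", "Score": 0.99},
        {"LanguageCode": "es", "Score": 0.01},
    ]
}

def dominant_language(response):
    """Return the language code Comprehend is most confident about."""
    best = max(response["Languages"], key=lambda lang: lang["Score"])
    return best["LanguageCode"]

print(dominant_language(sample_response))
```

That detected code is exactly what you'd hand off to a downstream service like Amazon Translate in the universal-translator scenario above.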

You could use Comprehend to identify what language it is and then turn around and use something like Amazon Translate to say, "okay, I know this is Spanish, I want to translate it to English." Something like that would be a valid use of Comprehend for language detection. It can also do sentiment analysis. This isn't a very exciting or emotional sentence, so it is saying with 99% confidence that it's neutral sentiment. Nothing really to talk about there, but if it were a positive, negative, or mixed sentiment, it would flag that as well. One sample application you hear about is social media monitoring: if you want to be notified of somebody saying bad things about your product online, you might want a little bit of an alarm on negative sentiment in things being posted on social media about your company.

So that's sentiment analysis, very simple. It's either neutral, positive, negative, or mixed, and it will give you a confidence score for each one. Finally, there's syntax. This is, again, kind of related to entities and key phrases, but instead of classifying things as to what they are, we're classifying them by the part of speech that they are. And it goes down to every single word and every single punctuation mark within the sentence. So "Amazon.com" is a proper noun. And it's interesting that it actually figured out that the ".com" was not an actual period in terms of punctuation; "Amazon.com" is one thing. So yay, Amazon Comprehend. The comma is a punctuation mark, "Inc." is a proper noun, "is" is an auxiliary verb, "located" is a verb, and so on and so forth. So if you need to do some sort of a syntactical breakdown of a sentence, possibly as a precursor to more natural language processing that you want to do, Amazon Comprehend can do that for you as well. So that's Comprehend, okay? Again, it's just NLP stuff at a very high level. It can do things like detect entities, key phrases, sentiment analysis, and syntax detection, as well as language detection. That's all you really have to remember: Comprehend, a high-level service, does NLP stuff. All right, moving on.
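Tying the sentiment capability back to that social-media alarm idea, here's a minimal sketch of how you might act on sentiment output. The response shape follows Comprehend's DetectSentiment API; the sample scores and the alert threshold are invented for illustration.

```python
# Sketch: a simple "negative sentiment alarm" over DetectSentiment-style
# output, as in the social-media monitoring use case above. The live call
# would be roughly:
#   boto3.client("comprehend").detect_sentiment(Text=text, LanguageCode="en")

sample_response = {
    "Sentiment": "NEGATIVE",
    "SentimentScore": {
        "Positive": 0.02, "Negative": 0.93, "Neutral": 0.04, "Mixed": 0.01
    },
}

def should_alert(response, negative_threshold=0.8):
    """Fire an alert only when Comprehend is confident the text is negative."""
    scores = response["SentimentScore"]
    return (response["Sentiment"] == "NEGATIVE"
            and scores["Negative"] >= negative_threshold)

print(should_alert(sample_response))
```

Requiring both the NEGATIVE label and a high negative score keeps a noisy feed from paging you on borderline posts.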

35. Amazon Translate

Up next, we have the Amazon Translate service, which does exactly what it sounds like it does: it translates text. Under the hood it's using deep learning, and as we saw from SageMaker, we got a little bit more depth into how that works at a lower level, but at a high level we don't care. With Amazon Translate, it just works. A key point about Translate is that not only does it translate stuff from one language to another, it can also support custom terminology. So you can pass in a sort of dictionary in either CSV or TMX format, which is a standard format in the world of machine translation. If you have a bunch of proper names or brand names, or things that aren't in the dictionary, that you might want to flag as special cases with a specific translation that maybe only you know about, you can provide that extra data to Amazon Translate through that CSV or TMX format. Here's an example of what it looks like, and you can just go to the AWS console and experiment with it.
So you can see here that it’s actually doing automatic detection of the language. You can just set the source language to automatic, and it will use presumably the same algorithm that comprehend does to do language detection.

We typed in "Amazon Translate uses deep learning for machine translation." We set the target language to French, and it translated that to French for us. It's just that simple. And you can see down below what the JSON request and response would look like if you were doing this at the API level. It's just as simple: you pass in the text, the language that you're coming from (which might just be automatic), and the language that you want, and it gives you back translated text. Couldn't be easier, right? And it's really that simple; there's not a whole lot more to talk about. In terms of the exam, that's all you need to know, and in terms of an application developer, that's all you need to know as well. That's really it, apart from the ability to pass in CSV or TMX data that has specific translations for proper nouns or brand names or things that you might want to handle specially. That's Amazon Translate. There's nothing else to talk about here. So, easiest lecture in the world, right? Let's move on.
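For the API side, here's a minimal sketch of the call shape plus a tiny custom-terminology CSV built in Python. The live boto3 call is shown only as a comment; the CSV layout (a header row of language codes, then one row per term) is my understanding of the format, so check the documentation before relying on it, and the example term is hypothetical.

```python
# Sketch: the shape of a Translate call, plus a tiny custom-terminology CSV.
# The live call would be roughly:
#   translate = boto3.client("translate")
#   result = translate.translate_text(
#       Text="Amazon Translate uses deep learning for machine translation.",
#       SourceLanguageCode="auto",   # let the service detect the language
#       TargetLanguageCode="fr",
#   )
#   print(result["TranslatedText"])

def terminology_csv(source_lang, target_lang, terms):
    """Build a custom-terminology CSV body from a {source: target} dict.

    Layout assumed here: header row of language codes, then term rows.
    """
    lines = [f"{source_lang},{target_lang}"]
    lines += [f"{src},{dst}" for src, dst in terms.items()]
    return "\n".join(lines)

# Hypothetical brand name we never want "translated" away:
csv_body = terminology_csv("en", "fr", {"Sundog Education": "Sundog Education"})
print(csv_body)
```

You would upload a file like that through the Translate console or API once, then reference it by name on subsequent translation requests.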

36. Amazon Transcribe

Up next we have Amazon Transcribe, which again does exactly what it says it does: it does speech to text. You can give it speech input in FLAC, MP3, MP4, or WAV format, specify the language that it's in, and it will give you back transcribed text. It can also support streaming audio. So if you have a stream of audio coming in over HTTP/2 or WebSocket, it can transcribe streaming audio as it's received as well. You could imagine building, maybe, a system that makes closed captions for live television, right? That's pretty cool. That only works with French, English, and Spanish today, but it's a pretty neat capability. It can also do speaker identification. Obviously it doesn't know the names of the speakers unless you tell it, but if you specify how many speakers are in a given conversation, it will at least try to identify who's talking in terms of speaker number one, speaker number two, what have you. It can also do channel identification.

So if you have two callers, you could transcribe them separately and then merge them together based on the timing of what they call utterances. Based on who's talking first, you might decide to split one side of a conversation into one channel and the other side into another. Again, if you have two people talking, for example, the transcripts of those two people could be provided individually as part of the channel identification capability. You can also provide it with a custom vocabulary.

So again, if you have special words like names, acronyms, or brand names, you can just give it a list of those as a vocabulary list and say, "hey, these are special words that you might not know about, Amazon Transcribe. Look out for these and use them." For example, if you're making closed captions for a highly technical course like this one, you might want to provide a vocabulary list of all the technical terms that you're using, like FLAC and MP3 and HTTP. Those might be things that I would put into a vocabulary list for transcribing this very slide. You can also give it vocabulary tables that go one step further, where you say not only is this a list of special words, this is how to pronounce them. So you can actually include "SoundsLike," "IPA" (which stands for the International Phonetic Alphabet, I think), or "DisplayAs" fields to provide discrete information about what the different words are, how they're pronounced, and exactly how you want them to be transcribed. So if you do have special cases of special words that you want transcribed, you can tell Amazon Transcribe about them ahead of time and make sure that it gets them right. It's pretty cool stuff. They have a nice demo of it in the AWS console if you just want to play around with it. So let's play around with it; got to give it a try. If you go to Amazon Transcribe in your AWS console, you'll see a screen that looks like this, and yeah, you can actually use this without writing any code at all. It takes you to the Transcription Jobs page by default, where you can just say Create Job, give it an audio file, and have it transcribe it for you, all through the console.
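If you did want to kick off a batch job from code rather than the console, the request might look something like this sketch. The job name, bucket, and vocabulary name are all hypothetical; the parameter names follow Transcribe's StartTranscriptionJob API as I understand it, and the live call (shown only as a comment) would need AWS credentials.

```python
# Sketch: a StartTranscriptionJob request with a custom vocabulary and
# speaker identification enabled. The live call would be roughly:
#   boto3.client("transcribe").start_transcription_job(**job_request)

job_request = {
    "TranscriptionJobName": "my-lecture-transcription",       # hypothetical name
    "LanguageCode": "en-US",
    "MediaFormat": "mp3",                                     # also flac, mp4, wav
    "Media": {"MediaFileUri": "s3://my-bucket/lecture.mp3"},  # hypothetical bucket
    "Settings": {
        "VocabularyName": "technical-terms",  # e.g. FLAC, MP3, HTTP
        "ShowSpeakerLabels": True,            # label speaker 1, speaker 2, ...
        "MaxSpeakerLabels": 2,                # we told it how many speakers
    },
}

print(job_request["Media"]["MediaFileUri"])
```

Note how the custom vocabulary and the speaker count are both just settings on the job; you don't retrain anything to use them.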

So you don't even need to use programming at all here. You can also provide your custom vocabulary through the console as well. So again, very user-friendly stuff. These services are really intended for a more general audience; they're not really made for ML experts, like we said, but it's cool to watch it in action. Let's go to real-time transcription. I do know that I'm talking in English, so let's start streaming. What should I say here? I'm drawing a blank. Let's see if it actually works. Wow, it's actually working in real time. That's actually really cool. And it's fast enough that I could actually envision this working for real-time closed captions, even. Well, it did an okay job there, but still, it's a lot better than most transcription systems that I've seen. It's not 100% accurate yet. If you do need to... well, now I'm just not even talking straight here.

This is a good test, though. All right, I think that's enough. Anyway, the bottom line here is that it's good, but it's not perfect. So if you do use this service, you're still probably going to want to take a look at the output and have a human clean it up when you're done, but it's pretty darn good. All right, let's stop streaming. On the whole, it did a pretty good job there. It just kind of messed up that one sentence, but for the most part, it did a good job. So that's Amazon Transcribe in action. Again, it just does what it says it does; there's not a whole lot to talk about here.

37. Amazon Polly

Up next, we have Amazon Polly. Just as Amazon Transcribe does speech-to-text, Amazon Polly does it the other way: it does text-to-speech, and it can do many different voices and languages as well. I don't know if Polly is a cultural reference that stands up across the entire world, but at least in American culture, Polly is a stereotypical name for a parrot. Although I'm not sure that's actually a parrot; I think that might be a macaw. But anyway, stereotypically, you have a trained parrot who can talk and repeat stuff that it's told, and "Polly want a cracker?" is the stereotypical thing it says. So that's why they called it Polly. It's not the most descriptive name, but picture a parrot talking to you, basically. Anyway, it's a neural text-to-speech engine that can handle many different voices and many different languages, and at a high level, it just works. You give it text, and it speaks that text out loud.

So you can imagine this being an integral part of something like Amazon Alexa, where under the hood it has some text that it's forming, but it needs to speak to you at the end of the day through an audio interface. Now, some of the finer points of Polly are worth knowing; the exam could actually go into some depth on this. First of all, it has something called lexicons, and this is where you can customize the pronunciation of specific words and phrases. For example, you might have a lexicon that says, "when I see the text W3C, I actually want you to say World Wide Web Consortium." It's a good way of handling acronyms: if you want to actually sound out what an acronym stands for, you can provide a lexicon that says this acronym maps to this actual text that I want you to say. Also, it supports what's called SSML format. This is an alternative to plain text that stands for Speech Synthesis Markup Language. In addition to just passing in raw text, with SSML you can get more precise control over emphasis, pronunciation, pauses for breathing, whispering, speech rate, and pitch. So if you want finer control over how the speech is actually being produced, and you want to make it sound a lot more natural, SSML gives you that sort of fine-grained control over how your text-to-speech works. It also supports speech marks. This is actually an output thing, as opposed to an input thing.

The idea is that it will encode when a sentence or a word starts and ends within the audio stream. So you can get this ancillary data back that says: this is where this sentence begins, this is where it ends; this is where this word starts, this is where it ends. And you can imagine using that, for example, if you need to lip-sync some animation. If you had some sort of system that was taking text and automatically speaking it, maybe through some sort of animated character, you could actually automate the lip syncing of that animated character as well, using those speech marks. So, kind of an interesting idea there. Yeah, that's a little bit worrisome for me as an online instructor; you could imagine a virtual Frank here just parroting text through Amazon Polly in a virtual talking head, driven by speech marks. But anyway, I don't want to think about that too much. It is worth remembering what's on the slide, guys. So remember: Polly lexicons are useful for customizing the pronunciation of words and phrases.

Specifically, acronyms are a good example of that, expanding an acronym through a lexicon. SSML just gives you more control over the pronunciation by allowing you to mark up emphasis and pauses and things like that. And you can get speech marks back that mark when a sentence or word starts and ends. This is what it looks like in the console. Again, it's very simple to use: you just type in text and it will parrot it back to you through Amazon Polly. Pretty cool stuff. If you want to go give that a try yourself, feel free to play with it, but it does exactly what you think it would do. You type in some plain text, click on Listen to Speech, and it will play it back, or you can download an MP3 file. You can also provide SSML in that other tab there.
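To make the SSML idea concrete, here's a minimal sketch that builds an SSML document of the kind you could paste into that SSML tab. The `break` and `emphasis` tags are standard SSML; the helper function, its parameters, and the sample text are just illustrative, and the live Polly call appears only as a comment.

```python
# Sketch: building a minimal SSML document for Polly. The live call would
# be roughly:
#   polly = boto3.client("polly")
#   polly.synthesize_speech(Text=ssml, TextType="ssml",
#                           VoiceId="Joanna", OutputFormat="mp3")

def to_ssml(text, pause_ms=500):
    """Wrap text in SSML with strong emphasis followed by a pause."""
    return (
        "<speak>"
        f"<emphasis level=\"strong\">{text}</emphasis>"
        f"<break time=\"{pause_ms}ms\"/>"
        "Thanks for listening."
        "</speak>"
    )

ssml = to_ssml("Sundog Education has the best courses ever.")
print(ssml)
```

Note the `TextType="ssml"` flag in the comment: that's what tells Polly to interpret the markup instead of reading the tags out loud.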

So again, it's something you don't even need to do any programming for at all. You could actually do this all through the console and generate MP3 files by hand if you wanted to. You'll see that it has multiple different voices to choose from as well. We have Salli, Joanna, Ivy, Kendra, Kimberly, Matthew, Justin, and Joey today for English. And you also have a choice of engine, which is interesting too: there's a standard engine and a neural engine. So the neural engine is actually an option; it'd be interesting to see how those two compare. You can also upload your lexicon file through the console. So again, you can do everything just through the console if you want to, but obviously the API is supported as well. This is fun stuff; I can't resist playing with it. So I've gone into the AWS console to Amazon Polly, and for the plain text here, I've typed in "Sundog Education has the best courses ever."

Of course we do. Let's listen to the text; click Listen to Speech. "Sundog Education has the best courses ever." Thanks, Joanna. That wasn't very enthusiastic, though. I wonder what the neural engine sounds like; is there any difference? "Sundog Education has the best courses ever." Not really, but we have different voices to choose from here too. Let's see what Kendra sounds like. "Sundog Education has the best courses ever." All right, how about a male one? Let's try Justin. "Sundog Education has the best courses ever." I guess Justin's a kid. How about Matthew? "Sundog Education has the best courses ever." Well, there you go. So that's Amazon Polly. You can still kind of tell that it's artificial, but it's a lot better than what artificial voices used to sound like. So there you have it. Very simple to use. If I wanted to, I could download an MP3 file of that, or change the file format that it outputs. I could provide customized pronunciation with a custom lexicon, and if I wanted to, I could provide SSML. So that's Amazon Polly: it does text-to-speech, and it has a bunch of different ways of controlling that available to you as well.
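One last sketch, for the speech marks mentioned earlier. Polly returns speech marks as newline-delimited JSON objects with millisecond timings, which is exactly what you'd feed into that lip-sync idea. The sample stream below is illustrative (the timing values are made up), but the field names follow the speech-mark format as I understand it.

```python
import json

# Sketch: parsing Polly speech-mark output, which arrives as
# newline-delimited JSON. Each mark has a type (sentence, word, ...),
# a time offset in milliseconds, and the text span it covers.

sample_stream = """\
{"time": 0, "type": "sentence", "start": 0, "end": 22, "value": "Polly wants a cracker."}
{"time": 6, "type": "word", "start": 0, "end": 5, "value": "Polly"}
{"time": 373, "type": "word", "start": 6, "end": 11, "value": "wants"}
"""

def word_timings(stream):
    """Return (millisecond offset, word) pairs from a speech-mark stream."""
    marks = [json.loads(line) for line in stream.splitlines()]
    return [(m["time"], m["value"]) for m in marks if m["type"] == "word"]

print(word_timings(sample_stream))
```

An animation system could step through those (offset, word) pairs to open and close a character's mouth in time with the generated audio.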
