AWS Made Easy

Ask Us Anything: Episode 12

Episode 12
July 26, 2022
1 h 03 min

In this episode, Rahul and Stephen reviewed a selection of AWS announcements, including some breaking AWS news.

Summary

We have also introduced a new rating system for the articles. Each article gets one of the following ratings:

Simplifies

or

Too complicated

Or, if it is undecided:

Wait and see

An announcement earns a “Simplifies” badge if, overall, the new feature makes AWS easier to use. Conversely, a “Too complicated” badge is earned if the complexity added by the feature does not, in our estimation, bring sufficient benefits, or if there is significant room for improvement. This is our first pass at a “ratings” system, and we will add some more granularity over time.

Amazon EC2 Console adds ‘Verified Provider’ label for public AMIs

In this announcement, we learn that Amazon has added a “Verified Provider” tag to AMIs in the marketplace. Although this is well intentioned, this does not seem to have sufficient granularity. Whether an AMI is “good” or “trusted” really depends on time and the use case. For example, an AMI may have great data science tools, but not be suited at all for public use.

Additionally, adding this at a vendor level seems to be extremely coarse. All vendors have had their ups and downs. Amazon should leverage the star system from Amazon retail which can allow for more detailed discussions on the merits of particular AMIs.
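The point that an AMI's quality is a point-in-time property can be made concrete with a small age check. This is only a sketch under stated assumptions: the 180-day threshold and the function names are hypothetical, and the date string format is assumed to be the ISO 8601 form that EC2 reports for an image's creation date. It is not something the Verified Provider label does.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy: treat any image older than this as needing a re-check.
MAX_AGE = timedelta(days=180)


def parse_creation_date(value):
    """Parse a creation date string such as '2022-07-26T12:00:00.000Z' (assumed format)."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_stale(creation_date, now, max_age=MAX_AGE):
    """Return True when the image is older than the allowed age at time `now`."""
    return now - parse_creation_date(creation_date) > max_age


# Example: a ten-year-old image versus a recent one, judged on the episode date.
episode_day = datetime(2022, 7, 26, tzinfo=timezone.utc)
old_image = is_stale("2012-07-26T00:00:00.000Z", episode_day)
new_image = is_stale("2022-06-01T00:00:00.000Z", episode_day)
```

A check like this says nothing about what is inside the image; it only captures the "expiry" idea that a verification should not be trusted indefinitely.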

Verdict:

Too complicated

AWS Announces AWS Wickr (Preview)

This next announcement is a communications and messaging platform called Wickr. According to the Wickr site, the tagline is “Protect all of your communications, from video conferences to group messaging and file sharing, with Wickr.” It is not clear exactly what Wickr is at this point, but it seems to be a Slack replacement built on Chime. We will be testing Wickr over the next few weeks and will come back with our final evaluation, but for now:

Wait and see

Amazon Macie introduces new capability to securely review and validate sensitive data found in an Amazon S3 object

This added feature to Macie highlights the innovation done by the Macie team. Sensitive data, such as PHI or credit card information, must be secured. Detecting such data in blobs of text and media can be extremely challenging, and it would be nearly impossible for any small or medium sized company to build their own “sensitive data detector” with the same level of accuracy.
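To see why a home-grown detector is hard to get right, consider a deliberately naive sketch. It is not how Macie works; the pattern names and the two regular expressions are illustrative assumptions (an access-key-shaped string and a `password = value` assignment), and a real detector needs far more context to keep false positives and misses under control.

```python
import re

# Illustrative patterns only; real detectors use many more signals than a regex.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def find_sensitive(text):
    """Return (label, offset) pairs for every pattern match, ordered by offset."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.start()))
    return sorted(findings, key=lambda finding: finding[1])


sample = "user=admin password = hunter2\nkey AKIAIOSFODNN7EXAMPLE"
labels = [label for label, _ in find_sensitive(sample)]
```

Even this toy version shows the problem: it would miss a password stored under a different key name and cannot tell a real credential from a documentation example.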

The verdict on this one is clear:

Simplifies

The AI Use Case Explorer is now available

The AI Use Case Explorer offers what we have been waiting for: opinionated guidance on how to approach common machine learning problems. With the AI Use Case Explorer, it is easy to drill down by industry, problem type, and objective, and get links to case studies and suggested AWS tools to use. We hope to see similar tools for other parts of AWS, such as choosing databases.

Verdict:

Simplifies

AWS CodeBuild supports Arm-based workloads in South America (São Paulo) and Europe (Stockholm)

This announcement, that CodeBuild for ARM-based workloads is gaining momentum, is very welcome! With the Graviton series of processors, development for ARM is skyrocketing. This announcement and others like it make integrating ARM-based builds into a CI/CD pipeline a first-class experience.

Verdict:

Simplifies

Amazon Athena adds visual query analysis and tuning tools

Query analysis in SQL is complex and involved. And, since Athena can source from many different types of data stores, including RDS, S3, DynamoDB, and more, it can be extremely difficult to model query speed. As Donald Knuth has said, “Premature optimization is the root of all evil.” Using visual query analysis and the new Query Statistics API, you can get guidance on where to optimize.

Verdict:

Simplifies

Summary

Article	Verdict
EC2 Verified Provider	Too complicated
Wickr	Wait and see
Macie new capabilities	Simplifies
AI Use Case Explorer	Simplifies
CodeBuild + ARM	Simplifies
Athena Visual Query Analysis Tools	Simplifies

See you next week!

Transcript

Stephen

Hello, everyone, and welcome to the AWS Made Easy Ask Us Anything live stream. This is episode number 12. And joining you today is me, Stephen Barr, and co-host Rahul Subramaniam. How are you?
Rahul

Hello, everyone. I’m doing very well. I had an amazing weekend.
Stephen

What made it so good?
Rahul

It feels like we kind of are regaining our social life of sorts. So this Friday evening, very dear friends of ours who have kids exactly the same age as ours, we decided that we would try catching up after we put the kids to bed. And it was just so amazing. We stayed up till 1:00 am just chatting. It was really, really awesome. So awesome that we decided to do it again on Saturday. So, it feels like…and I can’t remember the last time we did something like this. So yeah, that and a bunch of interesting 3D printing and Home Assistant setup. I set up OctoPrint for my 3D printer. So yeah, a bunch of interesting things I got to do all through the weekend.
Stephen

I want to ask you about the 3D printing, but I’m still in shock about just, you know, as a parent of four, having an evening where everything lines up, the kids are in bed, and you get to have some… Oh, I remember a long time ago when my first two kids were a bit smaller. And I was spending some time with some friends who didn’t have kids and they were fussing over the restaurant to go to and all these little mundane things. Don’t you get it? We can just sit here and talk.
Rahul

Correct.
Stephen

Which is incredible. Kids haven’t even walked.
Rahul

That’s exactly what we did. I mean, and the probability that two nights in a row line up that way is, I mean, has been impossible thus far. So [inaudible 00:02:05.634] these big game-changing events?
Stephen

Okay, that’s a big congratulations and a slight amount of jealousy to you.
Rahul

You’ll get there soon enough.
Stephen

Cool, cool.
Rahul

And yeah, the next time we are in the same city, I mean the kids do get along really well and have a great time together. So we should definitely do that as well.
Stephen

Oh, yeah, absolutely. Combined resources.
Rahul

Absolutely. How was your weekend?
Stephen

Oh, really, really well. So I’ll show you, I’ll do a little visual. We went to this place. Oops. I’ll show you. We went to this place that’s somewhat hard to pronounce. Take a guess at how you pronounce the city here. S-E-Q-U-I-M.
Rahul

Sequim.
Stephen

It’s pronounced Skwim.
Rahul

Skwim. Okay.
Stephen

Skwim. They have this really cool thing over there. They have this, the Lavender Festival. Because they farm lavender out there. This isn’t my photo, I just grabbed it off of Instagram. But the fun thing is that to get there, one of the ways to get there, you take a ferry, so you go to downtown Seattle, and you can go over to Bainbridge and then drive over. And so on the way out, I took this little video and I posted it on Twitter: “Seattle, what a great place to live.” It got 10,000 views, 271 likes, the impressions were… It’s the best tweet I’ve ever had. I think part of why is because half of the readers thought it was sarcastic in that if you look at it, it’s a cloudy day and they said, hey, you’re actually going away from Seattle. So if it’s so great, why are you leaving? But then the people who’ve taken this ferry, they know: okay, going for the day. So for the record, I like Seattle, and this was a great day trip made possible by my proximity to Seattle.
Rahul

We have to draw some learnings from this.
Stephen

Oh, absolutely.
Rahul

For our AWS Made Easy podcast, we need to figure out a way to be sarcastic about or sound sarcastic at least, about what we’re saying. How about, everything should move to the cloud.
Stephen

Well, there’s this great thing called, what is it, Poe’s law, where it says on the internet, if there’s…there gets to a point where you can’t tell the difference between extremism and satire, and so that’s, I think, the sweet spot that you want to hit. And then the last funny story for that weekend, so my Australian wife ran into a cafe to get some, well, what she intended to get was French fries for the kids. It was the end of a long day, a plate of hot French fries, and so she goes in and asks for a bag of hot chips. And not kidding, she got a Ziploc bag of potato chips that had been put into the microwave. And someone with a straight face handed this to her. It was great. So she still has a few more adaptations to make to be back in Seattle.
Rahul

That is a really, really funny story.
Stephen

All right. Well, shall we talk about our first one today? We’ve got some exciting announcements. Let’s jump into it with Verified Provider. All right, so I’ll show the transition and here we go. All right. So, Amazon EC2 console adds Verified Provider label for public AMIs. So the idea is, there can be…anyone can put anything on the marketplace. And to be able to have, well, I guess what comes up with any repository of code that you might use, unfortunately, part of human nature, is that someone might put something in there that’s malicious. So what does Verified Provider look like? Basically this.

So if you look at well, hopefully, the Amazon Linux is verified. But then we got MacOS is verified, this is the front page, Red Hat’s verified. So the idea is that if you see some AMI that you’ve heard about, like say you’re copying an AMI ID from a tutorial, how do you know that that AMI doesn’t have some rogue bit of code in it that you don’t want to be using? Well, you can look at this…you can look for the verified provider stamp. So, Rahul, what are your thoughts on this one?
Rahul

So, I think the intention is exactly what you just stated, which is to make sure that there isn’t rogue stuff in there. I think here Amazon kind of fell a little short. If you look at buying stuff on amazon.com, they actually have reviews and stuff for that particular item, not just the vendor. And I think that’s what is missing over here. You can have a certified vendor, but it says nothing really about the content that they have actually put out. So, for example, if there’s an AMI that somebody created, let’s say Red Hat created it, or someone else created it, there’s nothing to say that that particular AMI does not have old libraries or, you know, security vulnerabilities that I don’t know of.

So I think there are two things that are kinda really important. One is, the verification has to happen at the AMI level, not just at the vendor level. And the second one is that there has to be an expiry to that verification, right? Because there are always new security vulnerabilities that are coming up. There are always new, you know, tricks that show up against old systems that yeah, for that particular point in time, they were all valid. But yeah, new insights from new threats could invalidate an existing AMI in particular, it doesn’t invalidate the vendor, the vendor is still a good vendor. But what you care really about is that specific AMI. And so I don’t think it’s really helped that much.
Stephen

That’s a really good point. And I can see the benefit of like you said verifying the vendor is one level, but then beyond the vendor, it’s kind of an all or nothing. Do you trust every last piece of code that Red Hat’s ever put out ever and even from five years ago to still be up with security or not?
Rahul

Yeah.
Stephen

So, the quality is basically point in time.
Rahul

Correct. And the quality is point in time. And that quality, the value of that quality changes over a period of time. And if you just certify the vendor, someone might be fooled into thinking that they are actually picking the right AMI, which is from their perspective certified by AWS, but it really isn’t. It’s only the vendor that is verified by AWS and it says nothing really about the AMI content. So, yeah, I’m a little disappointed in this one, because it’s kind of a half-baked process for them.
Stephen

Okay, so our takeaway from this one is, all right, you want to add in reviews and things like, is this still… Is the security still valid, right? Imagine a LAMP stack from 10 years ago: it must have all sorts of vulnerabilities that have since been patched, but the AMI by design is frozen in time.
Rahul

Correct. I mean, in my ideal world, this marketplace would look very much like the amazon.com marketplace where you have the characteristics of the AMI, it tells you, you know, the bill of materials in it, you know, just like you have the specifications on any object, or anything that you buy on amazon.com. And then it has reviews: let people comment about what they found hard or easy or what is missing in there, or how to patch it up or like whatever instructions, because reading those reviews really does help. Have the FAQ section like you have for your…for the stuff that you sell on amazon.com. Like, I think bringing that concept to the marketplace makes sense, because their experience over there with that kind of selling is actually very similar to what you need over here. There is just such a massive selection right now of these AMIs that nobody knows what the right one to pick is. So just saying, “Here’s a vendor who’s validated,” says nothing about what you’re actually picking.
Stephen

And especially if you’re just getting started, I can imagine as I’ve been in this position, you know, you’re reading a tutorial, the tutorial might be a little bit dated, and part of that dated tutorial might reference an AMI that was great at the time, but has since been superseded. I mean, it’s not going to vanish from the marketplace, it will just. And if now it has that vendor-certified sticker, we might even get a false sense of security.
Rahul

Exactly. And I think that’s a dangerous thing. For customers to think that they are secure in picking a vendor-verified AMI, I think it would be a very dangerous thing, potentially.
Stephen

Yeah. Okay. All right. Yeah, that makes a lot of sense. Well, in that case, what’s our final verdict on this announcement?
Rahul

Okay, so last week we basically decided that on every announcement that we discuss on the show, we are going to either decide it complicates life, or it simplifies it. And I think in this case, given that it falls short of what it’s really trying to do, and it gives people a sense of false security, I would say just complicates things. So I would, yeah, I don’t know if it’s too complicated, but it just complicates things over here.
Stephen

All right. This is a beta of our rating system. All right, so we’ll say it complicates things [crosstalk 00:12:13.325]. All right. So this… Okay, so our verdict is it’s complicated because it could lead to a false sense of security because whether an AMI is good or not is really a question of at a point in time. And it could totally depend on your situation. It could be great for research, but don’t ever make it public facing, right?
Rahul

Correct. Yeah.
Stephen

Okay. Great. Well, I think we’ve got some breaking news we want to share. Let’s do a quick transition and get into our breaking news. All right, so this announcement is what 35 minutes old?
Rahul

Yeah, it’s about 35, 40 minutes old, I think it just came in as we were prepping to get online.
Stephen

I’ll put the URL of the announcement in the comment section so everyone can follow along, there we go. It’s Wickr online. So what is Wickr?
Rahul

So it’s actually rare for them to come up with something completely new and different. Which is not issued in re:Invent or they haven’t done a big show about it. But it looks like if you actually go to the AWS Wickr webpage, just go down at the end of that. Yeah. This I think gives you a much better idea of what Wickr really is. So basically, for your corporate communication, if, you know, the need for security, end-to-end encryption, making sure that all of your communications are archived for legal and compliance purposes, this tries to solve that particular pattern. So, you basically have client, you know, controlled keys that basically encrypt everything, where even AWS has no control over or no access to decrypt any of this stuff. And all of your data compliance requirements are kind of managed in one place. And, yeah, basically, encryption allows for communication, [crosstalk 00:14:27.969] knowing if it’s a new client like Chime or if it is embedded and built into Chime. I haven’t been able to figure that out yet. So I think we’re gonna get into previewing that [crosstalk 00:14:35]
Stephen

I’m looking for a screenshot. Let’s see.
Rahul

There isn’t one.
Stephen

Is this a Slack replacement? Like is it a Chime replacement or? Yeah, what is it? A video…
Rahul

I have a feeling that this is gonna be a layer in Chime because it makes sense over there because you’re talking about chats, you’re talking about communicate, you know, voice video, you’re talking about file sharing and screen sharing, which is all I mean, AWS’s tool, there is Chime and the Chime SDK. And I’m assuming that Wickr is going to be not only an independent service where you could have any streaming data or file data that gets encrypted. But Chime will probably incorporate this as the client of choice. But I don’t know how this plays with Slack or any of the other tools. Again, since this is just 30 minutes old, we’re gonna, you know, right after this call, I’m gonna have…I’m gonna see if I can get access, preview access to Wickr and play around with it. We’ll probably have an update next week on this.
Stephen

Yeah, that sounds really good. It’s, I mean, yeah, reading the description, it looks like Slack. And I can see a benefit from a procurement side: when you’re looking at a tool, especially having worked with PHI and financial data, you want to vet each tool and say, is this tool approved for this type of data or not. And then you have to be really careful when you cross those boundaries. Even just sending your colleague a message like, hey, what’s going on with patient ABC 123, that could be potentially a violation; you can’t send that in a Slack message. So if you…if this is on AWS, you’ve already vetted AWS, you’ve vetted the whole thing in terms of encrypted in transit, encrypted at rest, and that can be really useful in not having to bring in extra tools.
Rahul

Correct. Though, I mean, with third-party tools, I really don’t know how this would work. Because if there are third-party tools that are end-to-end encrypted, then you need to figure out a way to kind of get access to all of that data for your corporate compliance and put it all out [crosstalk 00:16:53]
Stephen

Because I was meaning replacing third-party tools with thinking of Amazon is mostly if you already considered that first party, then.
Rahul

No, I’m fine if everything is in the AWS ecosystem because everything will just plug right in. But if you were a corporation using something like Slack, or Jive, or you know, one of these, you know, products that are used internally for communication, or Skype for that matter, how do you get that data into the system for your compliance and how do you then layer in Wickr? That’s what I’m most curious about. So it’s pretty clear on the…on Active Directory, SSO, all of those sign-ins, and stuff like that. I’m just very curious about how the end-to-end encryption happens and how you then take that data and still preserve it and store it for compliance.
Stephen

Okay, and then here it’s saying you can create and manage Wickr networks and also Wickr bots. I definitely have to have a good play with this and see what it’s doing, exactly where it fits in, what it replaces or aims to replace, what its relationship with Chime is. But overall looks like it could be possibly really exciting.
Rahul

Could potentially be exciting. But honestly, I have never seen a more vague product announcement than this, where you’re still trying to guess at what the product actually does. I was trying to Google for any video, any description by anybody about Wickr, and it’s literally just this announcement.
Stephen

We see a kind of a screenshot in here, you see that it’s a…there’s a window when you open it.
Rahul

I don’t know if it’s a stock photo or a screenshot. So it is a… but you can download Wickr on the Wickr page. That’d be an interesting exercise to just try that. If you just go up there’s a download Wickr so you can download it for Mac, I think it’s a new client looks like. It seems about as big as the Chime client I assume that they just packaged Chime into it.
Stephen

137 megabytes. Well, oh, here we go. Applications.
Rahul

We probably need to screen share.
Stephen

Yeah, I’ll share that over in a moment. Let’s see. Wickr Pro. Okay. Let’s do share with the window. Okay. I was trying to find the…the doc says it’s launched a window, but I can’t find it. All right, let’s try that one more time.
Stephen

Here we go. Let me share this window now. Go to share. Let’s see. Okay, this is what happens when you download Wickr.
Rahul

Oh, looks like a completely different UI compared to Chime. And you just enter your email ID, or hide your email ID and just sign in?
Stephen

All right, one second here.
Rahul

Let’s try that out. I’d love to see what it looks like. The fun of live streaming a live analysis of a new product.
Stephen

Now I have to check my email and enter a code, one second, though this…I don’t want my email logged in. Verify my account, open Wickr Pro. Okay, now I’ll put you back on screen. All right, one second.
Rahul

I think one of the big, big opportunities is to simplify how you get communication done.
Stephen

You know, I’m gonna skip
Rahul

By the way, did you do Wickr Me or Wickr pro?
Stephen

I did Wickr Pro. I’ll put the window back on. Okay, so this is what it looks like. Well…
Rahul

That looks like Chime on the left. That looks like a reskinned Chime UI. And then we have rooms, pretty much like Chime.
Stephen

Opening search. Okay.
Rahul

Okay.
Stephen

Okay. Well, I’ll have to invite you later and see what this is like, see if this is a…is it Slack replacement, or like you said is it a layer over Chime?
Rahul

At first look…it definitely looks like Chime. It looks like a reskinned Chime to me. But let’s play around with it through the week and we will get back to all of you on, you know, what we think about it.
Stephen

Can we give this the “simplifies” or “complicated” rating yet? I don’t know if we can.
Rahul

I don’t think we can yet till we understand exactly what it does. So let’s reserve judgment on this one till next week. And we’ll bring it back next week. So while we are on this subject, let me just talk about what I have. And I really hope that this is nothing like Chime. The thing that I hate about Chime is that it feels like an external product that’s kind of, you know, somehow hammered into the AWS console with a very, very shoddy integration. So, for example, in Chime, if you have to provision a user, you have to go into your AWS console, go into the Chime app, and then provision folks over there. And then there’s a separate set of credentials to log into Chime, which is not your AWS console credentials. It’s not a, you know…it’s not those identities or keys. It’s something completely different.
Stephen

That’s almost how Amplify Consoles felt at first when it’s like wait, this is a separate system.
Rahul

Correct. And, unfortunately, Chime is still that way. Chime doesn’t feel integrated at all into the AWS ecosystem. So I really hope that they have not made the same mistakes with this. So, that’s going to be my hope. I’m gonna try this out. I’ve already downloaded the client. We’ll see how easy or difficult this becomes.
Stephen

Well send me a…send me a Wickr message. I wonder if there’s going to be a new verb, can you Wickr me? And I know you said, Hey Slack me. All right. Can you Wickr me and see what happens?
Rahul

Skype me, or can you Wickr me? I think someday there’s going to be a word that’s just gonna sound awful in a sentence.
Stephen

Yeah.
Rahul

And as a verb.
Stephen

All right. Well, we’re gonna put a question mark on this one for now. And let’s have a quick break. When we come back, I think we’ve got another breaking announcement.
Rahul

Yeah.
Female Speaker

Public cloud costs going up and your AWS bill growing without the right cost controls? CloudFix saves you 10 to 20% on your AWS bill. By focusing on AWS recommended fixes that are 100% safe with zero downtime and zero degradation in performance. The best part, with your approval CloudFix finds and implements AWS fixes to help you run more efficiently. Visit cloudfix.com for a free savings assessment.
Stephen

All right, the announcement is Amazon Macie introduces new capability to securely review and validate sensitive data found in an S3 object. So I put the URL of the article into the comments for anyone who wants to follow along. And this one’s also brand new. This one just came out about 45 minutes, about an hour ago.
Rahul

Yep. Yeah. So, AWS Macie is actually a really neat service with a bunch of different tools that identify things like PII, PHI, and so on in your data. And this is a very, very fundamental problem for anyone that is bringing in data into their systems. Making sure that it is scrubbed of all of this data upfront is something that you want to do. And tools are, you know, few and far between, especially tools that work well are few and far between. Well, Macie has been remarkably good at doing these kinds of detections and fixing your problem. And it can be part of your data ingestion workflow itself. And it just comes in really…it kind of operates really smoothly. Once you have stuff in S3, you can basically have the S3 event trigger, you know, a job to go figure out whether there’s anything that is non-compliant. And then you can automatically, you know, pull in a Glue job if you need to, to clean stuff up. So really, really useful tool, and it looks like they’ve got a bunch of new checks or reviews that you can validate.
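The ingestion pattern described here, where an S3 event kicks off a check on newly arrived objects, might start with something like the following. It is a minimal sketch and not the hosts' pipeline: the function names and the return shape are hypothetical, and the actual scan and cleanup steps are left as a placeholder comment. The part that follows the S3 event notification layout is the `Records` list with `s3.bucket.name` and a URL-encoded `s3.object.key`.

```python
from urllib.parse import unquote_plus


def objects_from_s3_event(event):
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    found = []
    for record in event.get("Records", []):
        s3_info = record.get("s3", {})
        bucket = s3_info.get("bucket", {}).get("name")
        key = s3_info.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded (for example, spaces as '+').
            found.append((bucket, unquote_plus(key)))
    return found


def handler(event, context=None):
    """Hypothetical entry point: list the objects that should be scanned."""
    targets = objects_from_s3_event(event)
    # Placeholder: start a sensitive-data scan for each target here, then
    # trigger a cleanup job for anything flagged as non-compliant.
    return {"queued": [f"s3://{bucket}/{key}" for bucket, key in targets]}


sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "ingest-bucket"},
                "object": {"key": "uploads/patient+records.csv"}}}
    ]
}
result = handler(sample_event)
```

Keeping the event parsing separate from the scan step makes it easy to test the plumbing without touching any real data.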
Stephen

Interesting even when it says credential materials. So is this what powers, say if you check in a key to a public-facing repo or a key where it shouldn’t be?
Rahul

I think so. I think that’s exactly what they’re trying to do with the GitHub stuff. But sure, I mean, if you’ve checked a bunch of stuff where something is tagged, you know, like a key-value pair is tagged with a password somewhere, and you see free text, then this would end up flagging it, which is pretty neat. So you want to make sure that in whatever you’re dumping into S3, you don’t have, you know, clear text passwords and stuff like that stored. Incredibly valuable.
Stephen

Do we know what the pricing is? Let’s see the pricing. US East, $0.10 per bucket, credit, 50,000 gigabytes at $1 per gigabyte. So, wow.
Rahul

This is cheap, depending on how much data you’re pulling in to make sure that you don’t have PII, PHI like the liability associated with pulling those in and potentially not handling the right way are severe. So just having this one simple check in place, I think makes a big difference.
Stephen

Back when I was in healthcare IT, I think the internal number we had going around was like a million dollars per patient record. So like, oh, this CSV file is worth $300 million. It’s a bit…and plus HIPAA means you expose yourself to personal criminal liability for doing the wrong thing. So yeah, this is a lot cheaper. This is very, very, very cheap. We wouldn’t hesitate for a second to run this [crosstalk 00:29:38.744]
Rahul

Absolutely…
Stephen

Anything that could possibly have personal or anything sensitive in it.
Rahul

Yeah, by the way, I also saw GDPR in there. I am no GDPR expert or specialist so if anyone in the audience is, please feel free to comment and enlighten us. But in whatever little experience I have had dealing with GDPR I find that A, it is an interpretation, depending on the country or the region or wherever it’s applicable. And it’s also a function of the kind of data you have. So I’d be curious to see what specific checks come out of the box, or are available out of the box for GDPR.
Stephen

Yeah, I’m really curious about that, too. And I’m not…I think for some of the…I know that some of the regulations involve, if you pull your data, what happens to a machine learning model that was trained on that data, and that’s a bit more fuzzy because do you retrain existing models that are deployed? I think that some people do, and I think to be compliant you might need to. But at least being able to identify any plain text blob that would have…that has a name and address, a credit card, a password. Let’s see. They have a link to sensitive data, managed data identifiers.
Rahul

Yeah. Can we just zoom in a little bit over here?
Stephen

Yeah.
Rahul

Awesome.
Stephen

Wow. Okay. So I look for JSON web tokens. How do you private keys? [crosstalk 00:31:17.336] This is cool.
Rahul

Yeah, so we’ve used a bunch of Macie stuff for a bunch of the customer data that we pull in. So we’ve been using Macie quite extensively. But do they say which ones are new? So credit card stuff has been there for a while. I think the JSON tokens I hadn’t seen earlier. HIPAA-compliant stuff might also be new.
Stephen

Interesting. Finally with JSON tokens. How everyone…you could tend to memorize what an opening curly bracket looks like in base64: lowercase e, capital Y, capital J. Like okay, what is this?
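The "lowercase e, capital Y, capital J" observation is easy to reproduce. The sketch below encodes a typical JWT-style header and shows that a JSON object whose first key starts with an ASCII letter always begins with `eyJ` once base64-encoded. The helper names and the `looks_like_jwt` heuristic are illustrative assumptions, not a description of how Macie identifies tokens.

```python
import base64
import json


def b64url(data):
    """Base64url-encode bytes without padding, as JWT segments are written."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# A typical header; compact separators keep the JSON free of spaces.
header = json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode()
encoded_header = b64url(header)


def looks_like_jwt(token):
    """Rough heuristic: three dot-separated parts, first two starting with 'eyJ'."""
    parts = token.split(".")
    return (
        len(parts) == 3
        and parts[0].startswith("eyJ")
        and parts[1].startswith("eyJ")
    )


candidate = encoded_header + "." + b64url(b'{"sub":"1"}') + ".signature"
```

The heuristic only recognizes the shape of a token; it does not verify a signature or confirm that the payload is valid JSON.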
Rahul

Very true. So, yeah, I mean, that’s the easiest way to kind of filter out all of the potential dangerous PII, PHI, kind of, and GDPR data. So I would give it the stamp of it simplifies life. For sure. I wouldn’t want to build a system that does this from scratch.
Stephen

And to get the amount of training data to build this from scratch would be really hard. And you handling that training data would be difficult in and of itself because you need all that personal information to train it.
Rahul

You’re absolutely right. I think just data would be so scarce that we’d never build a decent system if we were to do it ourselves. So the fact that this is available as a super cheap service, in the context of you know, what not having it means. Yeah, this absolutely falls in the simplifies bucket.
Stephen

All right, well, anything else we should talk about before we move on to the next one?
Rahul

Yeah, I just realized, as I was looking at the comments that you posted, we are on YouTube. So, last week, we talked about getting onto Youtube as well. So now we what, moved on to five channels?
Stephen

Yeah, we are on Facebook, LinkedIn, YouTube, Twitch, and Twitter.
Rahul

Awesome. So we got one more channel in, and hopefully, we can get someone from StreamYard, you know, coming in and talking about how they do all of this amazing stuff. Looking forward to that episode.
Stephen

Absolutely. We’ll get that lined up and calendared. All right, we’re gonna switch gears to AI. Okay, so this is a little bit of a different one, they announced the AI Use Case Explorer, let me share the screen here. Oops. There we go.
Rahul

I wish we had a banner that said, “Finally.” You know, you’ve heard me rant about AWS not being opinionated about anything. And for the first time, I’m seeing it. There you go. For the first time, I’m seeing AWS actually put out a tool that helps you with making decisions. You know, I’d asked them to create a simple decision tree for how you choose the right data source or data store. You know, the distinction between Aurora and just RDS, and Neptune, and DynamoDB, and DocumentDB, and, you know, all the others. They just refuse to do it. Like, they refuse to have a simple decision tree that would help people decide which kind of data store is right for them, for their use case. But I’m really glad that the AI team thinks differently within AWS. Again, benefits of the federated setup that they have.

So I’m really glad they built this tool because this is a real challenge. Finding the right models for the right industries, for the right outcomes, with the right trade-offs being made available, is incredibly hard. And it is the main reason why everyone goes about building their own architectures, their own neural networks, their own models from scratch every single time. If they could search through it, filter through it, and figure out which models are most relevant for their use cases, they can get a really, really neat starting platform without having to go through an entire, you know, machine learning workflow definition and execution, which can take months at times.

And plus, without all the data, you may not even be able to get that kind of accuracy. We were just talking about Macie, right? If you wanted to build something like that, you just wouldn’t know where to go look for something like this. And the fact that this is now available in the Explorer really tells you, okay, I know exactly how to go leverage this particular model, or this particular service, to deliver these outcomes.
Stephen

Let’s have a look at it. So they say, it’s interesting that you can drill down into use cases. Okay, let’s say we want consumer goods, and we want, okay, let’s do marketing and we want more revenue. Sounds good. Okay, so then we get personal…
Rahul

Personalization. Amazon Personalize is an amazing service, which we’ve talked about before. In fact, our very first episode was with the GM of Amazon Personalize, Ankur Mehrotra. And he was talking about how everyone should be able to leverage this right out of the box with really, really simple pieces of data.
Stephen

And, I think, what’s really neat about this is that you know, we think we’re always focused on the tech part of it. But then you zoom out from that in terms of, what does an organization need to launch a new AI project, especially for an organization that hasn’t done one before. And you want to know, Okay, what have similar organizations done in the past? What are other…what’s possible? And to be able to right away, say, Okay, here are some things, here’s other companies, here’s success stories… Oh interesting, I didn’t know it would take you to YouTube. Okay, cool. So here’s all of the things that you need to make your case as to why you could want to do this. And here’s where the…here’s how the path has been trod before. It’s really, really neat. Because sometimes, you know, AI especially it can sound like a very daunting thing. I always think about Terminators and robots, all that stuff, and it can be a bit unnerving to say, Okay, we wanna roll out AI into your business. And so to be able to have all these use cases at your fingertips that are similar to what you want to do, that can be really empowering as a business.
Rahul

To be very honest, deploying an entire workflow for, you know, bringing in the data, cleaning the data, doing feature engineering, then training the model based on it, and, once you figure out the model, deploying the model for inferencing and then keeping the inference service up is nothing short of that nightmare you’re talking about with AI. You know, the Terminator kind of nightmare. This comes close. This is…it’s torturous. You know, working through all those notebooks and those experiments and figuring out what is the right model for you. That’s all extremely painful work. So, the fact that you can actually leverage other people’s use cases, other people’s work, and services from AWS is a big, big win for sure.

By the way, I’m very curious. I saw one category called cost optimizer. Can we go?… Let’s look at the filters again, let’s search. I think it’s a software, one of the categories, yeah, great. Let’s look at operations, software operations. And then let’s look at cost savings.
Stephen

Reduce. Okay.
Rahul

Reduce costs, let’s see. So fraud detection. Great. Yes, operationally, you wanna be able to detect anomalies, and there are some really neat anomaly detection tools, so Lookout for Metrics, Lookout for Vision. Lookout for Metrics, in particular, helps with fraud. Chatbots and virtual assistants. Yep, all of your Lex, Polly, Comprehend, Rekognition. Things that, you know, Comprehend and Rekognition can do for you, for not-safe-for-work content and those kinds of things. They’re all operationally hard things to get right. And you can just start leveraging a bunch of these services to do that effectively, right out of the box. What are the other options at the bottom?
Stephen

Contact Center intelligence, cybersecurity,
Rahul

Cybersecurity. So, they have Security Hub now, which is pretty awesome. A bunch of process automation. This is probably Change Manager as well. Actually, I’m just interested in seeing what is in process automation.
Stephen

This is it, it’s automate data processing from documents.
Rahul

Oh, this is Kendra.
Stephen

Well, it’s… Oh, yeah, look, it’s Textract, [crosstalk 00:41:08.117] Comprehend, Augmented AI. This is really neat. I can see this really bridging the gap between, you know, what we think of when you think of the AWS console as this giant menu, and you feel like, all right, well, I guess I better hope I picked the right thing. And then this, which is very opinionated and very directed and curated into telling you where to go and what services to use. It’s a real break from the typical approach.
Rahul

Right. And very pleasantly out of character for AWS to do something like this. So, again, this gets a big thumbs up from me. It simplifies customers’ lives dramatically. So, yeah, definitely a big thumbs up. And I wish they had more of this across the different product or service categories.
Stephen

So something like this for what database to use?
Rahul

Yeah, there are. Yeah, I mean, wouldn’t that be super neat?
Stephen

Absolutely.
Rahul

Like, just NoSQL databases: what should I use? Should I use MongoDB hosted on AWS, which again, is a big collaboration and a service from AWS, or should you use DynamoDB? Or should you use DocumentDB? When do you use which? Like, no, there’s no opinion from AWS on any of that.
Stephen

And then should you use DynamoDB with single-table design? Or should you use a couple of tables? Well, I was scratching my head on this yesterday trying to figure out how to keep track of the articles. And Alex DeBrie, you sold another copy of the DynamoDB book. All right. So, okay. Well, yeah, again, this gets a big simplifies from us. This is really great. Anything else we should add to this one?
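For a sense of what the single-table question looks like in practice, here is a minimal sketch of composite keys for keeping episodes and their articles in one DynamoDB table. The `PK`/`SK` attribute names, the key formats, and the article positions are all hypothetical; this is not the show’s actual schema, and the query helper only mimics a DynamoDB Query in plain Python.

```python
def episode_key(episode: int) -> dict:
    """Key for the episode's own metadata item (hypothetical format)."""
    return {"PK": f"EPISODE#{episode:03d}", "SK": "METADATA"}


def article_key(episode: int, position: int) -> dict:
    """Key for one reviewed article inside an episode (hypothetical format)."""
    return {"PK": f"EPISODE#{episode:03d}", "SK": f"ARTICLE#{position:02d}"}


# All entity types live in the same table and share a partition key.
items = [
    {**episode_key(12), "title": "Ask Us Anything: Episode 12"},
    {**article_key(12, 1), "topic": "Verified Provider label", "rating": "Too complicated"},
    {**article_key(12, 2), "topic": "AI Use Case Explorer", "rating": "Simplifies"},
]


def query_partition(table: list, pk: str, sk_prefix: str = "") -> list:
    """Mimic a Query: exact match on PK plus begins_with on SK."""
    matches = [i for i in table if i["PK"] == pk and i["SK"].startswith(sk_prefix)]
    return sorted(matches, key=lambda i: i["SK"])


articles = query_partition(items, "EPISODE#012", "ARTICLE#")
print([a["topic"] for a in articles])
```

The trade-off against a couple of separate tables is that one query fetches an episode and its articles together, at the cost of having to design the key formats up front.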
Rahul

No, I think we covered this one. Yeah, I think we’ve covered all aspects of it.
Stephen

All right, let’s have a quick break. And when we come back, we’re gonna be talking about CodeBuild.
Female Speaker

Is your AWS bill going up? CloudFix makes AWS cost savings easy and helps you with your cloud hygiene. Think of CloudFix as the Norton Utilities for AWS. AWS recommends hundreds of new fixes each year to help you run more efficiently and save money. But it’s difficult to track all these recommendations and tedious to implement them. Stay on top of AWS-recommended savings opportunities, and continuously save 25% off your AWS bill. Visit cloudfix.com for a free savings assessment.
Stephen

Okay. AWS CodeBuild supports Arm-based workloads in South America (São Paulo) and Europe (Stockholm). Let’s put the link in the chat. So on the whole, it looks like this is only two of their many regions. But I guess the broader theme of this is more Arm support, more Graviton.
Rahul

Correct. So I think there’s a need to bring more and more workloads to support Arm. And there is still a little bit of a gap. I think development environments not being Arm-based, or not being specifically Graviton, is a challenge for sure. You wanna make sure that stuff that works locally on your machine works when you deploy it. And that’s why I think we need a paradigm shift on that front, on the development side. And then there’s the CI/CD side of things, which is kind of what CodeBuild helps you with: you know, build all your stuff, create the artifacts, put them into repositories, and then leverage them for deployments. The fact that those are now available on Arm more widely.

They did announce, you know, US-East-1, if I remember correctly, what, six months ago or so. So now that it’s available more widely, I hope that there’s gonna be more adoption. But there are still large parts of the developer workflow that are not covered by support for Arm. So, you know, I’d encourage the audience to take a look at, you know, tools like DevSpaces, which allow you to do this on Arm.
Stephen

I was just going to mention that. It’s just…I’ve got it into my habit now. I will take your repository, open it up in the standard x86 DevSpace, use it, make sure it’s all set, and then just change the URL to our Graviton DevSpace. And most of the time it works. Occasionally, you have to point to a different package name if those haven’t been merged. Because what happens is, ideally, package names are built for multi-architecture. But sometimes when it gets started, it’s like dash Arm or something. And eventually, they get all merged up. But usually, it basically works. But getting that workflow there is, you know, I’m glad that the tooling is moving towards a more Arm-oriented world.
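The package-name wrinkle described here can be sketched in a few lines. This assumes a hypothetical naming rule in which a package that is not yet published as multi-architecture carries an architecture suffix; the package name and the suffix convention are illustrative only, and real ecosystems each have their own rules.

```python
import platform

# Common spellings reported for the two architectures discussed here.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def normalize_arch(machine: str) -> str:
    """Map a platform.machine()-style string to a canonical architecture name."""
    key = machine.lower()
    if key not in ARCH_ALIASES:
        raise ValueError(f"unrecognized architecture: {machine!r}")
    return ARCH_ALIASES[key]


def package_name(base: str, machine: str, multi_arch: bool) -> str:
    """Hypothetical rule: multi-arch packages keep one name, others get a suffix."""
    if multi_arch:
        return base
    return f"{base}-{normalize_arch(machine)}"


def package_name_for_this_machine(base: str, multi_arch: bool) -> str:
    """Resolve the name for whatever machine the code is running on."""
    return package_name(base, platform.machine(), multi_arch)


print(package_name("example-tool", "aarch64", multi_arch=False))
```

Keeping the lookup in one function means that once a package is merged into a multi-architecture build, only the `multi_arch` flag has to change.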
Rahul

I would agree. And I think it needs to, it needs to move really, really fast. I don’t know whether you saw that announcement. But AWS is already at a run rate of Arm constituting about $5 billion a year in revenue, which is a big deal. I mean, given, you know, how recent Arm is, and we’ve barely scratched the surface in terms of adoption. The fact that Arm contributes to an annual run rate of roughly 5 billion at this point is quite a significant achievement. So it looks like, you know, it’s gaining tons of traction, more and more people are kind of moving on to it, even though there’s still a lot of folks who have not even considered it or moved to it. But this is the bet you wanna make. Graviton is the bet you wanna make.

And so if you’re not already leveraging these tools that allow you to manage your entire end-to-end workflow on Arm, then at some point in time, it’s gonna prevent you from leveraging the cost-benefit analysis that the Graviton…or cost benefits that Graviton brings to you. What did we discuss the last time, if I remember correctly? The Graviton processors are producing anywhere from 35% to 50% price-performance improvement over x86, depending on which Graviton processor and type you’re on. But I think the C7g instances are, what, 50? We were showing a bunch of cases with 50, 55% price-performance improvement, which is significant.
Stephen

Yeah. And that’s over the Graviton2. Right, not over the x86…
Rahul

I think it was 40% over Graviton2 and, in total, about 50-odd percent, or near 60, over the x86s. So yeah, there’s a…and this is not the end of it, right? We are…they’re constantly coming up with new versions of the Graviton. And there’s tons more performance to be gained out of it.
Stephen

And like you said, it’s really cool that this ecosystem is filling out very quickly. And I was thinking back to just a couple of years ago, I was trying to get CI/CD working for Arm, and it was a real pain. So I was using GitHub, and GitHub runners didn’t support Arm. So I had an NVIDIA Jetson on my desk running a self-hosted runner. And it was just…it was a bit fragile of a process. So we know that when AWS brings something in, it’s robust, it’s tested, and we can count on it. And it doesn’t have to be this special case, right? You don’t do x86 development on your cloud and Arm development on a little box on your desk, running off of a lab power supply. You know, this is…it can be part of your workflow.
Rahul

As you alluded to, GitHub still has the issue that you don’t have Arm-based runners. So for all of our GitHub Actions that we run, we’ve got a cluster of Graviton instances that act as our runners to process all the Arm workloads that we have.
Stephen

Yeah, that makes a lot of sense.
Rahul

But I wish it was less painful to do that. I just wish it was available out of the box.
Stephen

Yeah, absolutely. So GitHub, if you’re listening, please Arm-based runners would be really welcome. That’s where the world is going. So for this announcement, CodeBuild supports Arm. If you happen to be using these two regions, this is a good announcement. But in general, we’re very excited about the tooling and development moving to Arm wherever possible. Should we give this a simplifies?
Rahul

I think we’d give it a simplifies because the more the tooling gets filled out, the simpler it gets for people. Right now for anyone who is motivated enough to get the price performance improvements that Graviton has, you have to jump through a whole lot of hoops to make that happen. And every bit of tooling helps, you know, reduce that burden. So, definitely a simplifies.
Stephen

Excellent. All right. Well, let’s do a…let’s roll the next article about Amazon Athena. All right, here goes. Okay. Here’s the next one: Amazon Athena adds visual query analysis and tuning tools. Sorry, go ahead.
Rahul

So this is almost like the query planner that you always wanted for Athena. And we’re just so used to writing out the queries that make intuitive sense, and then starting to observe the performance of our queries before we can start optimizing them. And we’ve come to rely on these query analysis engines, you know, to help us fine-tune and refine our queries so much. Because when you have parameterized queries, in particular, you really don’t know how people are using it, you don’t know how it’s affecting the performance of your queries. The fact that these tools are now available makes a big difference to the cycle time of creating these Athena queries and making them available to all the other tools that leverage them, whether it be QuickSight dashboards or any other, you know, downstream system like Glue or whatever other job you might be doing. So a huge win for sure.
Stephen

Yeah, it’s really easy to have…well, if you’re not careful, you can wind up with a full node scan popping up. And okay, great. Now, you’ve lost all of the advantages of your index, and you’re just looking at every last entry. And yeah, it slows things down substantially. And being able to spot that visually, rather than looking through logs. That’s fantastic.
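The idea of spotting a full scan from the plan rather than from logs can be tried locally. The sketch below uses SQLite’s `EXPLAIN QUERY PLAN` purely as a stand-in; Athena has its own plan output and does not use indexes in this way, and the table and index names here are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
    [(i % 100, "x") for i in range(1000)],
)


def plan(sql: str) -> list:
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT payload FROM events WHERE user_id = 42"

# Without an index, the filter forces a scan of the whole table.
before = plan(query)

# With an index on the filtered column, the plan switches to an index search.
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
after = plan(query)

print(before)
print(after)
```

The exact wording of the plan text differs between SQLite versions, but the shift from a scan to an index search is visible in both.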
Rahul

Actually, sometimes Athena is fast enough that you don’t realize it, but it does massive scans at times and then it shows up in your bill a month later. And it seems really, really scary when something like that happens. And you don’t…it’s really hard to interpret how a certain SQL is going to be interpreted. I’m using the word interpret twice. But it’s really hard to figure out how the database engine is going to, or the Athena query engine is gonna interpret the query, the SQL query you write, in terms of how it executes it under the hood, until you see the query plans, until you see these, you know, these debugging markers. Till you see them, you’re not gonna be able to figure out what the problem is, you wouldn’t know why you got a massive bill, you wouldn’t know why your queries are slow, because you just don’t understand enough about how the SQL that you wrote is getting interpreted by the engine. So having this data now is incredibly valuable in simplifying and better indexing the data or better organizing the data so that it gets indexed and queried effectively by Athena.
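To make the surprise bill concrete, here is a rough estimate of how scanned bytes turn into cost. The constants are assumptions: 5 USD per TB scanned with a 10 MB minimum per query matches Athena’s published list price as far as I know, but the rate and the exact definition of a TB should be checked against the pricing page for your region. The hourly dashboard is a hypothetical workload.

```python
PRICE_PER_TB_USD = 5.00               # assumption: list price per TB scanned
MIN_BYTES_PER_QUERY = 10 * 1024**2    # assumption: 10 MB billing minimum
BYTES_PER_TB = 1024**4                # assumption: binary terabyte


def estimate_query_cost(bytes_scanned: int) -> float:
    """Estimated cost in USD of a single query."""
    billable = max(bytes_scanned, MIN_BYTES_PER_QUERY)
    return billable / BYTES_PER_TB * PRICE_PER_TB_USD


def estimate_monthly_cost(bytes_scanned: int, runs_per_day: int, days: int = 30) -> float:
    """Estimated cost in USD of running the same query on a schedule."""
    return estimate_query_cost(bytes_scanned) * runs_per_day * days


# Hypothetical dashboard query that scans 1 TB and refreshes every hour.
monthly = estimate_monthly_cost(1024**4, runs_per_day=24)
print(f"{monthly:.2f} USD per month")
```

A query that feels fast can still scan a lot, which is why the scanned-bytes figure matters more than the wall-clock time for the bill.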
Stephen

Now, I think that one of the things that with SQL where it’s both a strength and a weakness is that you can do something that’s phenomenally complicated. You can look at views calling stored procedures, and views and views and views stacked on top of each other. And it will just chug through it. And it might be really slow, but it will usually just…it usually finishes and does the right thing. But like you said, it can lead to either a complexity explosion or a cost explosion, or a time explosion.

But I’ll…a quick story about a company that will remain nameless, I was doing some consulting and they wanted to know why their Tableau dashboard took so long to render. And they were thinking it was a Tableau problem. But the underlying query, the underlying data query took about 57 seconds, and the Tableau dashboard took about 58 seconds to render. So, okay, it’s not the tab…in this case it wasn’t Tableau’s fault. It was the fact that it was views on views on views on views. And it was, you know, full node scans on top of another. But it’s easy to do. It’s easy to get to that situation if you’re not careful and if you’re not in a hurry, right? You say, oh, a minute is fine or 30 seconds is fine. But when you’re paying for it by the query, it’s a different situation.
Rahul

Yeah. And I think in the case of Athena, this actually gets a lot more complicated. Given that Athena can actually operate on a large number of different kinds of data stores, the interpretation of what a SQL means for that…for a different data store might actually be different. So what comes intuitively in your head as you’re writing a SQL may be interpreted differently if you are using Athena to talk to Aurora versus Redshift versus S3 versus DynamoDB.
Stephen

Yes, your mental model gets a lot more complicated. Okay, this piece of the data is actually loading in a Parquet file on an S3 bucket or something like that.
Rahul

Exactly. And does it change if you’re picking up a Parquet file versus say, just a CSV or a JSON file, like how does it deal with all of that complexity? And how does it interpret SQL? So unless you have that query plan, unless you have those query statistics, it’s gonna be really hard to get to the root cause of either a slow query, or excess costs or, you know, yeah, any of those kinds of symptoms.
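One reason the file format matters is how many bytes a query has to read to answer a question about a single column. The toy comparison below is only an approximation: the columnar layout is a plain list of values per column, which is far simpler than real Parquet and ignores compression, encoding, and predicate pushdown. The data is synthetic.

```python
import csv
import io

# Synthetic rows: a small key, a short code, and a wide payload column.
rows = [{"user_id": i, "country": "US", "payload": "x" * 200} for i in range(1000)]
fieldnames = ["user_id", "country", "payload"]

# Row-oriented layout (CSV): every column of every row sits in one stream,
# so reading user_id means reading past country and payload too.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
row_bytes = len(buffer.getvalue().encode("utf-8"))

# Column-oriented layout (loosely Parquet-like): each column is stored
# contiguously, so a query on user_id touches only that column's bytes.
columns = {
    name: "\n".join(str(r[name]) for r in rows).encode("utf-8")
    for name in fieldnames
}
column_bytes = len(columns["user_id"])

print(row_bytes, column_bytes)
```

With a per-byte-scanned pricing model, that gap shows up directly in cost as well as in speed.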
Stephen

And I see that as part of this, they’ve got this new query statistics API. I haven’t used this. Have you used it yet? [crosstalk 00:56:38.686]
Rahul

Not yet. No, I think this might be relatively new.
Stephen

Yeah, they mentioned that in the article, they said it’s new. Yeah. Metrics are displayed as an embedded console [crosstalk 00:56:51.693]
Rahul

Yeah, for the new Query Statistics API. So what else does it tell you? Let’s look at some of the stats that it shows us. And let’s try to see if these are valuable. So you have runtime statistics, which is execution time, which is great. Input bytes, input bytes is neat, because it…actually, I’m curious, is input bytes the number of bytes that you actually had in the query itself? If it is a parameterized query that could change. Or is it the number of bytes? No, it’s probably that. Output bytes is the resulting dataset. Query stage plan, this one is your query plan, basically. And it looks like a list. So that’s pretty neat. That’s gonna be very, very valuable. And then let’s look at the rows. So you have inputs and outputs total, and how many rows got returned? Great. Then I was just wondering whether you get numbers per stage.
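The statistics being read off the page here can be pulled programmatically. The sketch below assumes the boto3 Athena client’s `get_query_runtime_statistics` call and the `Rows` and `Timeline` field names as I understand them from the API reference; verify both against the current documentation. The sample dictionary is hand-written in that assumed shape and is not real output.

```python
def fetch_runtime_statistics(query_execution_id: str) -> dict:
    """Fetch statistics for a finished query (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers below run without AWS access

    client = boto3.client("athena")
    response = client.get_query_runtime_statistics(
        QueryExecutionId=query_execution_id
    )
    return response["QueryRuntimeStatistics"]


def summarize(stats: dict) -> dict:
    """Reduce the statistics to the handful of numbers discussed above."""
    rows = stats.get("Rows", {})
    timeline = stats.get("Timeline", {})
    return {
        "input_rows": rows.get("InputRows", 0),
        "output_rows": rows.get("OutputRows", 0),
        "input_bytes": rows.get("InputBytes", 0),
        "output_bytes": rows.get("OutputBytes", 0),
        "total_ms": timeline.get("TotalExecutionTimeInMillis", 0),
    }


# Hand-written sample in the assumed response shape (not real output).
sample = {
    "Timeline": {"TotalExecutionTimeInMillis": 4200},
    "Rows": {
        "InputRows": 1_000_000,
        "InputBytes": 250_000_000,
        "OutputRows": 50,
        "OutputBytes": 4_000,
    },
}

summary = summarize(sample)
print(summary)
```

Keeping the network call separate from the summarizing step makes it easy to log or test the numbers without touching AWS.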
Stephen

No, it’s just [crosstalk 00:57:57.487]
Rahul

Unless the query stage plan node.
Stephen

Because you want to know [crosstalk 00:58:00.434]
Rahul

Yeah, do they have the definition of the query stage plan node? Let’s see. On the left-hand side, you might see data types here, query stage plan, type. Yeah. No, it was right there.
Stephen

Query stage plan node.
Rahul

Yep. So remote source, just go up?
Stephen

Children, stage plans which is name identifier, sub plans, remote sources, that’s interesting. So you get the remote sources like you’re saying.
Rahul

Yeah. So, it could be a DynamoDB, it could be Redshift, it could be S3, it could be any of those sources.
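The plan node structure being browsed here is a tree, so it can be walked recursively. This sketch assumes the `Name`, `Identifier`, `Children`, and `RemoteSources` fields named on screen. One caveat: as I read the API reference, `RemoteSources` holds the IDs of source plan nodes (references to upstream stages) rather than data store names, so treat any interpretation of those values as an assumption. The sample plan is invented.

```python
def walk_plan(node: dict, depth: int = 0):
    """Yield (depth, node) for a plan node and all of its descendants."""
    yield depth, node
    for child in node.get("Children", []):
        yield from walk_plan(child, depth + 1)


def remote_sources(node: dict) -> list:
    """Collect every RemoteSources entry found anywhere in the tree."""
    found = []
    for _, current in walk_plan(node):
        found.extend(current.get("RemoteSources", []))
    return found


def render(node: dict) -> str:
    """Indented text view of the plan, one node per line."""
    return "\n".join(
        "  " * depth + current.get("Name", "?") for depth, current in walk_plan(node)
    )


# Invented plan in the assumed shape (not real Athena output).
sample_plan = {
    "Name": "Output",
    "Identifier": "6",
    "Children": [
        {
            "Name": "Aggregate",
            "Identifier": "4",
            "Children": [
                {"Name": "RemoteSource", "Identifier": "9", "RemoteSources": ["1", "2"]}
            ],
        }
    ],
}

sources = remote_sources(sample_plan)
print(render(sample_plan))
print(sources)
```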
Stephen

And it’s interesting, even if you don’t have an immediate plan of how to use this data, I find that just having a glance at the underlying data, and this is a general thing, it always helps, right? Because then you know if something big changed, right? Like if your rows in is many orders of magnitude bigger than your rows out, then maybe you want to change where you’re filtering versus where you’re joining versus…it’s easy to select everything and then filter at the end. But then you make your database work a lot harder. Maybe you want to filter earlier, and get your rows in and rows out a bit closer. If there’s a lot of things you just can…even if you don’t…haven’t done the proper analytics, just having a gut feeling of what’s right for your particular job and then it can alert you when something changes, that’s good to know.
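The rows-in versus rows-out check can be automated per stage. This sketch assumes each stage carries `StageId`, `InputRows`, `OutputRows`, and nested `SubStages`, as I understand the statistics response; the threshold of 100 and the sample numbers are arbitrary choices for illustration.

```python
def iter_stages(stage: dict):
    """Yield a stage and every nested sub-stage."""
    yield stage
    for sub in stage.get("SubStages", []):
        yield from iter_stages(sub)


def flag_heavy_stages(output_stage: dict, threshold: float = 100.0) -> list:
    """Return (stage id, ratio) for stages reading far more rows than they emit."""
    flagged = []
    for stage in iter_stages(output_stage):
        rows_in = stage.get("InputRows", 0)
        rows_out = stage.get("OutputRows", 0)
        ratio = rows_in / max(rows_out, 1)  # avoid dividing by zero
        if ratio >= threshold:
            flagged.append((stage.get("StageId"), ratio))
    return flagged


# Invented stage tree in the assumed shape (not real Athena output).
sample_stage = {
    "StageId": 0,
    "InputRows": 50,
    "OutputRows": 50,
    "SubStages": [
        {"StageId": 1, "InputRows": 1_000_000, "OutputRows": 50, "SubStages": []}
    ],
}

flagged = flag_heavy_stages(sample_stage)
print(flagged)
```

A stage that is flagged is a candidate for filtering earlier, for example by pushing the where clause closer to the scan or by partitioning the data.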
Rahul

Great. And actually, the mental model that you just described, where is it a filter for… Is it a scan of a big chunk of data, and then a filter and then, you know, picking the data elements that you want? Or is it a filter first, and then a scan that’s happening? Like, that mental model that you have for SQL may not actually apply the same way when you’re talking about DocumentDB or DynamoDB or, you know, any other data source that is available as part of the adapters. Like, that interpretation of what it means, what the where clause will do, and what the select will do, and in what order it will execute those things. You will not know until you actually see the query plan.
Stephen

Yeah, I mean, that’s, I guess this all goes back to Don Knuth and saying premature optimization is the root of all evil, right? Just because I have this mental model. And I might write my queries a certain way, I might learn that I’m optimizing towards a constraint that stopped being binding a long time ago.
Rahul

Correct. Yeah. And, you know, the SQL query engine is also gonna have optimizations in there, where, if it knows enough about the schema, it could actually make a bunch of decisions. We don’t even know what decisions it makes. Like if you look at the Redshift query planner, there are a ton of optimizations and stuff that it does under the hood, to make sure that it can partition those queries across different datasets effectively. It pulls out all…pulls in all the data from all the different computes, aggregates it, and brings it back to you. That, you know, you wouldn’t even know about if you did not know that there was a query plan like that under the hood.
Stephen

So, really, don’t try to prematurely optimize or outsmart the query planner. Write it…almost write it naively and then see where the slow point is.
Rahul

That’s one way to do it. Keep it simple, do it fast, get more data into it, start analyzing the results, look at anomalies, fix them.
Stephen

Perfect. So I definitely will be looking at this API, the Query Statistics API. Well, it looks like we are just about out of time, but I think this was a really fun session.
Rahul

Absolutely.
Stephen

Oh, do we say we give this a simplifies?
Rahul

I think so. I think instead of guessing at your query plans, or what Athena is gonna do, the fact that you now have the query plans and the stats makes life a lot simpler in terms of how you organize your queries. So yes, definitely a simplifies.
Stephen

Well, that was five simplifies out of six and only one too complicated for the…and even the too complicated one had an asterisk, in that it’s not quite what we wanted. Or there could be more data about the AMI. It’s pretty good, five out of six. That’s nice work, Amazon, this week.
Rahul

Absolutely. Look forward to next week, and I know that there was a barrage of announcements that just happened very, very shortly before this particular live stream. We’ve not been able to analyze it and go through it, but we’re gonna do that next week.
Stephen

All right. Well, let’s…we’ll talk to you next week. And thanks again. And thanks, Rahul, we will see you everyone…see you next week.
Rahul

Thanks, everyone. See you soon. Bye.
Female Speaker

Is your AWS public cloud bill growing? While most solutions focus on visibility, CloudFix saves you 10 to 20% on your AWS bill by finding and implementing AWS-recommended fixes that are 100% safe. There’s zero downtime and zero degradation in performance. We’ve helped organizations save millions of dollars across tens of thousands of AWS instances. Interested in seeing how much money you can save? Visit cloudfix.com to schedule…