AWS Made Easy

Ask Us Anything: Episode 8

Episode 8
June 22, 2022
1 h 02 min

For today’s episode, Rahul and Stephen reviewed a selection of the many announcements made by AWS this week.


Summary

Announcing support for cross-region search in Amazon OpenSearch Service

This announcement reflects the underlying work by the Aurora team in cross-region replication. Since AWS uses Aurora to power OpenSearch / Elasticsearch, they benefit from innovations made by the Aurora team.

Easily customize your notifications while using Amazon Lookout for Metrics

When setting up alerting and metrics, it is important to have the alerts filtered so you know that when you do get an alert, it is relevant and important. Lookout for Metrics uses machine learning to help surface relevant alerts. It is important to prevent “alert fatigue”, a learned behavior where if alerts are deemed mostly unimportant, then the important alerts will be missed.

Announcing enhanced integration with Service Quotas for Amazon DynamoDB

Service Quotas are important for managing the “blast radius” when something goes wrong. Services such as DynamoDB and S3 can scale up to meet nearly any demand. With Service Quotas for DynamoDB, you can be alerted when your database usage exceeds pre-set limits. This can be useful for distinguishing between high levels of legitimate usage (e.g. an app experiencing a spike in usage), vs bad / malicious high levels of resource utilization.

Amazon Lex Automated Chatbot Designer is now generally available

“Developers can iterate on the design, add chatbot prompts and responses, integrate business logic to fulfill user requests, and then build, test, and deploy the chatbot in Amazon Lex.”

AWS Connect

Rahul and Stephen discussed two different announcements regarding AWS Connect, which allows users to create a virtual contact center / help desk with just a few clicks. Connect empowers businesses to optimize the customer service experience while enjoying AWS reliability and scalability, and without having to build custom applications.

Transcript

Rahul

Stephen, you might have to unmute. Good morning, everyone.
Stephen

Sorry about that. I need to automate that.

Hi, everyone. Welcome to AWS Made Easy Episode #8. This is live. I’m still learning how to do this. Thank you for bearing with us. And thank you, welcome, Rahul. And thank you for ASL Interpreters, Tess, and later we’ll have Alyson. Thank you for making this webinar more accessible to a wider audience. Really appreciate you doing this.

Rahul, how are you doing? How was your weekend?
Rahul

The weekend was awesome. We made a trip to Cape Disappointment and, to be honest, it was not disappointing at all. It was fabulous. Though it was a little wet and cold, we had family, friends, it was quite an amazing weekend camping out there. And this week looks like really awesome weather here in Seattle. So yeah, it’s been bright and sunny for the last two or three days. Finally, just as we were wrapping up our camping, we packed all of our stuff into the cars and we were just waiting to kinda head out, that’s when the sun came out. For the entire two days of the weekend, it was dark and gloomy. But you know, we had a lovely campsite. We had, you know, glamping of sorts, we had a yurt. And had a nice fire going. And we had really nice neighbors in the camping area who let us use their canopy, so we could all stay dry. So it was pretty awesome.
Stephen

Oh, that sounds fantastic.
Rahul

How about you?
Stephen

I’m glad it wasn’t disappointing.
Rahul

How was your weekend?
Stephen

One of my sisters is getting married soon and so we had a little party for her. I think it was her bachelorette party, so my wife went. I was with the kids. That was really fun. And then we had a Father’s Day barbecue. So I’m a bit, had a couple of parties. That was really nice. That’s when I need to get some gym time in and a couple of light meals. It’s all in good fun and good festivities.
Rahul

All right. Cool.
Stephen

It’s so bright I had to close my window shades to not wash out the colors.
Rahul

It looks like AWS is having some sort of festivities themselves. As you and I were discussing on Friday, things were going berserk. We had the AWS announcements. I think we’ve had a record number of announcements this week across all the different services. And my phone was just going crazy with all the alerts that are coming in. So, this is gonna be an interesting week to discuss all the different announcements that I got. I’m not sure we’re gonna have time to do all of them. But I [crosstalk 00:13:18].
Stephen

Well, considering we usually make it through about half to two-thirds of our planned ones. No, I think that at some point, I had this idea that we could have a little timer where we have to say, okay, we have five minutes per segment. But maybe we would apply that at the next What’s New Review.
Rahul

Yeah, I think that would be a fun thing. Or to make it even more entertaining, you’re gonna have these electric buzzers that you have kind of where you put your app monitor it buzzes every time you can overshoot. It’s really hard to kind of not overshoot, right?
Stephen

Oh, well, now, it’s so interesting. Yeah, I agree. I think that’s a good problem to have, and so much to talk about, and so much to go over. And so, for those of you who are new, in this show, sometimes we have guests and sometimes we review new announcements on AWS, and sometimes there’s a mixture of both. So, I think with that, we should jump right into our first segment. We got a little introduction and then we’ll jump into our first article.
[00:14:28]

♪ [music] ♪
[00:14:43]
Stephen

There we go. That’s our new intro to our “What’s New Review” segment. All right, so, our first article today is AWS just announced support for cross-region search in Amazon OpenSearch Service. So, the idea is OpenSearch or Elasticsearch, previous to this announcement you could have data in different [inaudible 00:15:17] within a region and replication, but now you can have data mirrored across regions and then search that data across clusters. So, Rahul, why is this a big announcement? And why is this a big deal?
Rahul

So I think the first time we saw something like this was with Aurora, the database service. And the tech that, you know, AWS built to separate out the compute and the storage is a pattern that now we see them applying across various different services. So, when Aurora first, you know, was designed, they had this Multi-AZ replication that they created that allowed, you know, you to A, have all these new instances in the cluster kind of failover very quickly. Replication, you know, followed ACID properties and you can manage all of that stuff within this replication.

A lot of the constraints that were created by the CAP theorem around the trade-offs that you need to make, started kind of blurring a little bit because of this tricky separation of compute and storage. They took it to another level with Aurora where they started creating a better mechanism of replicating across regions. Now, of course, there are higher latencies when you’re replicating across regions, but given that they have the kind of backbone that they have, they managed to create some optimizations where it isn’t too bad, it is possible to now, you know, replicate data across the regions and failover over there or load balance over there if necessary.

Now, they seem to be bringing that same tech over to a bunch of other storage technologies. Like, I foresee this you know, showing up in, I mean, of course, we are seeing it now in OpenSearch, but the same thing happening in Neptune, the same thing happening in, you know, a lot of other services like the Time Series Database, Timestream. I see this being a common pattern that you’ll see… You’ll also see that what was built for Aurora Serverless is probably something that’s going to happen with a whole lot of other databases like the same pattern of having your compute units being able to scale up and down within 0.5 or 1 times the compute units. That whole mechanism and pattern that they’re able to stamp out across this service is pretty remarkable.

So, you know, I would anchor on the Aurora kind of pattern because right now what I see is AWS Aurora is kind of like the leading service that is trying out, testing out a bunch of these new patterns of replication patterns of separating out compute and storage, maintaining certain kinds of consistencies and guarantees, and SLAs. I would keep my eye out and you know look at all the other data storage services and I expect to see all of those patterns being replicated in the near future.
Stephen

So for some context just to get everyone on the same page, we mentioned the CAP theorem. So that’s the idea that systems can have, at most, two of consistency, availability, and what we call partition tolerance, or being able to be distributed across multiple computers, multiple areas. And so what we’re seeing, you said, underneath this is Aurora, and because this is a common pattern of AWS, they use their, say, lower order services and they build higher order things on top of them. And so the benefit that we all get from that is that when they improve the lower order ones, the higher order ones benefit because they are built from this. So in this case, like you’re saying, the replication, the increased reliability, and so we watch out for things for both Aurora and then also things built on Aurora.
Rahul

Correct. So just for context, you know, the big breakthrough that Aurora had was the ability to take the data and use the S3 backbone under the hood. And because they had built this amazing Multi-AZ replication into S3, they were able to leverage it pretty much as is to build the storage layer for Aurora. So the minute you write something, it immediately gets replicated across three different AZs and that’s available to you right out of the box as a service. So now that you could look at storage as, you know, through one single lens and all the other replication and all the other needs that you have around storage are taken care of for you, you could then add the compute layer on top of it independent of the storage. So you could scale up compute, you could scale up storage separately, you could replicate stuff, you could failover a lot easier. And that’s been the big breakthrough for Aurora. And now, you’ll get that with OpenSearch. I foresee this happening with Neptune, I foresee this happening with a bunch of other services, yeah.
Stephen

Great. And then we see that this works on both OpenSearch and Elasticsearch, which is pretty neat. I don’t know that they both have the same, that they were both able to leverage that same Aurora backend technology.
Rahul

Yeah, at the end of the day, OpenSearch is basically a fork of Elasticsearch. So, maintaining compatibility, given that they haven’t really diverged that much yet, has been, I think, fairly straightforward from an AWS standpoint. So, yeah, and when they did Elasticsearch or their older version of Elasticsearch, the managed version, right from then, they had kind of started the process of separating out the compute and storage.
Stephen

And you know, just for, I suppose, we’re building up a little bit of a library of content now. And looking back to Episode 5, we had a great discussion that was reflecting on as we were talking about this. I have a little clip from that discussion just to bring that forward to close out this segment. Should we do that?
Rahul

Yeah, let’s go ahead.
Stephen

All right.
Rahul

While they open-sourced large chunks of Elasticsearch which AWS basically took and turned into their, you know, Elasticsearch or OpenSearch service, Elasticsearch as a company still has the entire ELK stack that includes Kibana and Logstash. And ELK as a whole is basically what a lot of customers want to leverage. The Kibana visualization beats anything that AWS has, and customers want it. So, it makes sense for Elasticsearch to leverage the marketplace, to sell the whole stack as a whole with their terms and their pricing even though it’s on the AWS marketplace and have it be available to AWS customers. So rather than fight AWS on just the Elasticsearch part, they are now dealing with the entire stack and are offering it in the marketplace. And that makes life a lot easier for both parties. They, you know, fight a lot less and I think customers end up benefiting the most out of this.
Stephen

You know, I thought that was a really great insight that you had about… And this, I guess, shows again there’s a bit of competition and yet there’s also this mutual benefit from iterating on this platform.
Rahul

Yeah, I think it’s a fairly symbiotic relationship at this point for a lot of these providers. I mean, you see a similar story with Snowflake. Snowflake’s entire success is because of Amazon or with AWS, and you know, vice versa. Like, AWS is basically getting a huge amount of workloads moved to their setup because Snowflake, you know, brings a whole lot of those customers for those kind of use cases under their platform. So, I think it is a very, very mutually beneficial relationship. I think you just have to make sure that each party is not stepping on the other one’s toes too much. There is going to be some overlap for sure, but as long as they can see the mutual benefit of that relationship and each one knows how to ride the other’s success, I think customers will stand to benefit the most out of it.
Stephen

There’s a great Silicon Valley word for it: coopetition.
Rahul

Yeah.
Stephen

I fix some ride with that. Well, let’s [inaudible 00:23:54] to say we want to have a quick word from our sponsor and then we will be back to talk about Lookout for Metrics.
Woman

Is your public cloud riddled with costing complexity? CloudFix is here to clean that up. CloudFix focuses on cost savings for your AWS environment by finding and implementing fixes that continuously save 25% off your AWS bill. Too good to be true? We’ve helped organizations save millions across 45,000 instances. Visit cloudfix.com to schedule a free assessment and see your yearly savings.
Stephen

All right, and we’re back. And welcome, Alyson. Thank you for taking over for the interpretation. All right, we are talking about “Easily customize your notifications while using Amazon Lookout for Metrics.” So, this is a quick summary. The idea of Lookout for Metrics, it’s anomaly detection for different logs, business processes, or any stream of data where you’d want to know: has something changed, and do I want to know about it? So, with this announcement, I highlighted the main part: you can now add filters to your alerts configuration to only get notifications for anomalies that matter most to you. You can also simply modify existing alerts for your notification needs as anomalies evolve.

Okay, so why is this important?
Rahul

So, the problem with a lot of your metrics and your data is that there’s just, again, it’s like a fire hose. And to be able to sift through that is incredibly hard. And as more and more of the applications move to this microservices architecture and there’s more Lambdas, there are more subsystems that are producing tons and tons of logs, sifting through that is incredibly hard. And that’s at the lower level. At a higher level, you know, your application could be monitoring customer journeys, it could be monitoring calls coming in, traffic. It could be measuring pretty much anything. And to be able to stay on top of it, to figure out what you should pay attention to at any point of time from an operations standpoint, becomes incredibly hard.

If you have too many metrics, you know, it’s like when something goes wrong in an aircraft and if you had 25 bulbs bleeping and alarms and sirens and all going off, you wouldn’t know where to get started. In code, you know, quality, and this one kind of rings home very close to me, which is a lot of us have started to always use tools like Sonar, SonarQube, or Sonar for our code quality measures. And the one thing that I find happening pretty much all the time is within about a few weeks of setting up a tool like SonarQube or CAST or whatever tool you pick, you start seeing developers shutting off those rules. Because there’s just so much noise that keeps coming in. You don’t know how to filter out the pure signal out of it and then just start, more often than not, just ignoring the rule or the signal or whatever you thought was important to you.

So, while you have all of these metrics, the metrics are only useful when you have a mechanism of A, identifying the outlier very quickly and easily, and B, even amongst those outliers, being able to hone in and filter into only what you really care about actioning on. And not everything is actionable. Sometimes you just want to know about anomalies but they help you do a post-mortem at a later stage. But you need to focus on only what is actionable. You know, what are your decision-making alerts going to be, filter them down to just those, and focus on those. Otherwise your ops team is going to be all over the place. They’re going to be running like chickens with their heads on fire. And that’s not an uncommon scene in a lot of organizations.
Stephen

You brought up a lot of really, really interesting points. Like, you said, if there is a plane that had 20 alarms going off, it would be really nerve wracking. I actually read a book. So I used to work in the medical data science, medical informatics, and they really think about this idea of alert fatigue. Because nurses and doctors deal with this quite a bit. So I pulled up this one study. One study showed 331 alerts were needed to prevent one adverse drug event. 90% of medication alerts are overridden by physicians and more than half of overrides were due to alerts being deemed irrelevant. So, it’s a universal problem, not just in IT or tech. But just as human beings, if we hear the same sound… I mean, every parent knows this. If you hear the same sound 100 times, you start tuning it out and you have to really say, okay, is this important and you have to set up your alerts system to think, okay, I really know and I really trust my system so that if I get an alert, it’s important.
Rahul

Correct. And it’s also important to, I think, more often than not, it is really hard to create rule-based anomaly detection. Like you can say, hey, if my threshold crosses this, you know, raise an alert. Because it’s very, in an elastic environment like AWS where your loads might vary, your… You know, all the metrics might be all over the place within a week. It’s really hard to set rules because those rules are going to get violated in a very short order. You need a machine-learning-like approach which starts to understand patterns a lot better. And then you’ll have to detect anomalies based on that.

And that’s why something like Lookout for Metrics is really valuable. You’re not just adding simple rules which, again, will overwhelm you with outcomes or completely hide anomalies. Like, you have to be flexible. You have to follow the pattern of the data stream that is coming in and find anomalies within that rather than set very gut-based rules that determine what you think might be an anomaly might not actually be an anomaly in the data pattern. So, I think Lookout for Metrics plays a very, very important role in that first part, which is making sure that you understand and know the right kinds of anomalies that are being raised or that you see in your data. And second is the filtering and simplifying the way you consume all of those insights or those anomalies is what is being made easy over here. So, incredibly valuable.

We actually have a very interesting use case that the team is working on right now which is… We have a product in which we have tons and tons of customer journeys. Like, we have tons of website data that comes in for how…it’s a customer experience product. There’s tons of data that’s coming in about what a customer’s journey is and for some very, very large organizations, a journey could take you to a thousand different paths. I think the largest one we have can have anywhere from 5,000 to 10,000 different paths that a customer could take through a simple journey. And in all of those cases, it is manually impossible to create the idea of an anomaly. Like, I wouldn’t know where to start with creating rules.

So, you need to let the ML kind of decide what is normal. Right? It’s hard to objectively define that in rules as to what normal is, but once you feed the data to Lookout for Metrics, you let it learn what normal is, and then you detect anomalies in that. Suddenly, you know, your service goes down and you have, you know, the volume on one particular path dropping off dramatically and moving to another path because some service was down or your login didn’t work or the lead didn’t get generated and therefore didn’t give access to the customer, the potential customer or whatever. You might have a number of scenarios, but to be able to detect that anomaly, create an alert and have someone look at it and fix it right away, is incredibly valuable and it’s not just CloudWatch metrics that you can use this for. You can use this for a whole lot of use cases where you have time series data and you want to go figure out anomalies.
Stephen

And what I was thinking about was time series. And especially things like retail where there are cyclical patterns. Right? And so if you had a simple threshold-based alert, you might say, okay, well, on holiday weekends, I’ll ignore all alerts because everything goes haywire anyway. And that’s a horrible practice, right? You want to be able to say, okay, I know that there’s going to be more traffic but within this context of time series, within this particular time window, I still want to know my anomalies.

And so that’s what machine learning can do for you: it can start adjusting based on historical patterns in your data. Let’s say you have a product where every, I don’t know, every Thursday, your orders spike. Then maybe you want to know if orders didn’t spike on Thursday because something strange… Maybe you forgot to top up your ad budget or something. Whatever it is, you want to know. And that’s, in a sense, an anomaly that’s personal to your context.
Rahul

Correct. And that is not something that you could just determine based on rules, or you couldn’t set a threshold based on very simple rules, you would need an ML kind of approach. Or if you have a daily anomaly, like, you know, for certain services you find that between 9:00 and 10:30 a.m. during weekdays, you see this huge spike of consumption within this microservice and then you find that the rest of the day, everything is pretty much flat or low. But then how would you model that as a rule? Like, you want to make sure that…you’re not going to change your rules every day, you’re not going to change rules by the hour, and you don’t know… Okay, you know, what if that 10:30 a.m. spills over to 11:00 a.m.? Like, would you treat that as an anomaly? It’s just really hard to articulate those kinds of rules. It’s easier to let a machine learning algorithm kind of learn what the pattern is and find anomalies within it.
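The time-of-day pattern described here can be illustrated with a minimal sketch. This is not how Lookout for Metrics works internally, which the episode does not describe; it is a simple per-hour mean and standard deviation baseline, and the function names and sample numbers are hypothetical. It shows why one fixed threshold fails: the same value can be normal at 9:00 a.m. and anomalous at 2:00 p.m.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_hourly_baseline(history):
    """Group past (hour_of_day, value) samples and summarise each hour."""
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)
    # Keep (mean, sample standard deviation) for hours with at least 2 samples.
    return {h: (mean(v), stdev(v)) for h, v in by_hour.items() if len(v) >= 2}


def is_anomaly(baseline, hour, value, z_threshold=3.0):
    """True when value is more than z_threshold deviations from that hour's mean."""
    if hour not in baseline:
        return False  # no history for this hour, so nothing to compare against
    mu, sigma = baseline[hour]
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold


# Hypothetical request counts: busy at 09:00, quiet at 14:00.
history = [(9, v) for v in (950, 1000, 1050, 980, 1020)]
history += [(14, v) for v in (95, 100, 105, 98, 102)]
baseline = build_hourly_baseline(history)

print(is_anomaly(baseline, 9, 1010))   # the usual morning spike
print(is_anomaly(baseline, 14, 1010))  # the same value in mid-afternoon
print(is_anomaly(baseline, 9, 100))    # the morning spike that never came
```

The third call is the case Stephen raises: a missing spike is an anomaly only because the baseline knows a spike is expected at that hour.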
Stephen

Absolutely. And I like, I was just going through the rest of this article. You can not only have alerts… I usually have alerts going to Slack, and then you can also have a Lambda that processes them, which is for the ones that are really complicated or have some fan out of alerts. So you can send one through SNS and then have that fan out to different consumers. And I like that different business units can handle alerts that are contextual to different business units, or maybe finance versus DevOps wants to know different things about what’s happening.
Rahul

Yeah. And if you have a bunch of automation, by the way, you could have this going to an EventBridge setup where you might have different routers or rules that then take this event and process it in different ways. I mean, the number of services that you could then, you know, plumb kind of one after the other in succession is enormous. So, it all depends on what you want to do. In fact, we’re going to talk about another service with Connect where they have incident management or cases that are now available in preview. You could probably take an event of an anomaly and then pass it on to AWS Connect cases. And you don’t need to set up a separate instance or a separate setup for that at all.
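The fan-out and rule-based routing discussed in this exchange can be sketched in plain Python. This is only an illustration of the idea, one anomaly event matched against several rules and delivered to every target that matches; it does not use the SNS or EventBridge APIs, and the target names and event fields are hypothetical.

```python
def route(event, rules):
    """Return the name of every target whose rule matches the event."""
    return [target for target, predicate in rules if predicate(event)]


# Each rule pairs a (hypothetical) target with a predicate over the event.
rules = [
    ("finance-topic", lambda e: e["metric"] == "revenue"),
    ("devops-slack", lambda e: e["metric"] in {"latency", "error_rate"}),
    ("pager", lambda e: e["severity"] >= 80),
]

event = {"metric": "error_rate", "severity": 92}
print(route(event, rules))  # ['devops-slack', 'pager']
```

Keeping the rules as data means finance and DevOps can each own their own predicate without touching the code that raises the alert.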
Stephen

Yeah, that sounds really… Yeah, I’m looking forward to talk about… And we have some other things in Connect coming up later in the show. So, to close out this segment I wanted to show a clip of what happens when you cannot ignore an alert. And this is from…
Geordi La Forge

We got a red light on the second intake valve.
Man 2

Ignore it. We’ll be fine. Prepare for…
Stephen

I love the faces on Riker and Geordi La Forge here. And so I’d like to think about… So, if we think about… This is Star Trek, where the crew of the Enterprise traveled back in time and they’re on the first launch of the very first faster-than-light ship. And you can see from the perspective of the people in Star Trek, these two commanders from the future, they think, oh, my gosh, he’s ignoring an alert. Whereas in 2050, he’s saying, there’s got to be something wrong with my alerting system. How could I ignore that? And so, hopefully, when we’re in the year 2250, we will have such confidence in our alerts that any alert that comes up we know has been screened and filtered, and the computer is presenting it to us saying, hey, this is important. Pay attention to it.
Rahul

Correct. And you know, this feels so close to home because like the expression on their faces, I have seen dev teams have the same expression when they get these alerts from their CI/CD system or things failing, and then the manager’s under pressure to kind of deliver, you know, an outcome and you have a deadline to kind of get your service published and get the software out. And this is exactly the expression that developers have on their faces. They all look at each other and like cross their fingers. All of that, ah, we could probably get through this alert. Like, it shouldn’t matter that much and then they let it go, and then the next thing you know, it’s a disaster at the customer deployment but…
Stephen

Imagine the kind of alerts that were going off at Cloudflare this week.

All right, let’s take a quick break and we are going to come back with service quotas for DynamoDB. So, we’ll be right back.
Woman 1

Public cloud costs going up and your AWS bill growing without the right cost controls? CloudFix saves you 10% to 20% on your AWS bill by focusing on AWS-recommended fixes that are 100% safe with zero downtime and zero degradation in performance. The best part, with your approval, CloudFix finds and implements AWS fixes to help you run more efficiently. Visit cloudfix.com for a free savings assessment.
Stephen

All right. We’re back. So, this is, I think, a good transition. So this article is “Announcing enhanced integration with Service Quotas for Amazon DynamoDB.” So, Dynamo is, of course, their planet-scale database service, and then with Service Quotas, you can view all the values of… Oh, I think here is the important part. You can create CloudWatch alarms to notify you when your utilization of a given quota exceeds a configurable threshold. And this allows you to better adapt utilization based on your applied quota values and automate quota increase requests.

So, we’re talking about alerts just a second ago, and here’s a perfect use case. If you want to know the rate at which your table or the usage of your table is growing or shrinking or changing.
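As a rough illustration of the kind of alarm the announcement describes, the sketch below builds the arguments for CloudWatch’s `put_metric_alarm` call in boto3. It assumes the account-level DynamoDB metric `AccountProvisionedReadCapacityUtilization` is the quota utilization you care about; the alarm name, the 80% threshold, and the SNS topic ARN are hypothetical placeholders, and the episode itself does not show any code.

```python
def quota_alarm_params(sns_topic_arn, threshold_percent=80.0):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": "dynamodb-account-read-capacity-utilization",  # hypothetical
        "Namespace": "AWS/DynamoDB",
        # Assumed metric: percentage of the account's provisioned read
        # capacity quota currently in use.
        "MetricName": "AccountProvisionedReadCapacityUtilization",
        "Statistic": "Maximum",
        "Period": 300,  # seconds
        "EvaluationPeriods": 1,
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # notify this topic when the alarm fires
    }


def create_alarm(params):
    """Send the alarm definition to CloudWatch (requires AWS credentials)."""
    import boto3  # imported lazily so the sketch can be read without AWS access

    boto3.client("cloudwatch").put_metric_alarm(**params)


# Placeholder ARN; replace with a real SNS topic before calling create_alarm().
params = quota_alarm_params("arn:aws:sns:us-east-1:123456789012:quota-alerts")
print(params["MetricName"], params["Threshold"])
```

Separating the parameter builder from the API call makes it easy to review the alarm definition, or reuse it for the write-capacity metric, before anything is created in the account.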

So, Service Quotas, I haven’t used them a lot based on my personal usage. What can we say about this announcement?
Rahul

Yes. So in general, if you look at any of the AWS services or a lot of cloud services in general, the fact that they are elastic in nature, you can scale up or down almost infinitely, one of the DynamoDB’s big claims is that they can scale up infinitely. Right? One of the problems with that is that you can very quickly end up in a scenario where some wrong setup within your application or maybe it’s an external attack of some kind can just suddenly cause a massive spike in utilization of a service like DynamoDB and can give you a massive bill to go deal with. And you wouldn’t be able to stop it because DynamoDB will scale up to deal with the load that’s kind of thrown at it. And that’s a risk.

Now, the proven way to kind of go about this is to say I want to know when my system exceeds a certain threshold of utilization and when it does, that’s when I want to make important decisions. I want to decide whether this is a real load, where hey, it’s great, I’ve had a marketing campaign and suddenly I have tons of customers trying out my product. That’s awesome. So I want to let it go. Or this just feels like an attack. This has come out of nowhere. I didn’t expect it. I don’t know why suddenly there’s this massive load. It’s going to impact business or give me a crazy high bill at the end of the month. So I want to make a decision on whether to let this continue or not.

So, Service Quotas are actually an amazing way of managing the upper bounds of your elasticity, which puts you back in control. And with that control, you are then allowing yourself to make decisions that can ensure that you are not causing waste, you’re not spending money unnecessarily, and those services don’t proliferate unbridled or unchecked.
Stephen

Well, I think this is that phrase that I’ve heard you use a lot of like limiting the blast radius. Right?
Rahul

Yes.
Stephen

So, what happens if… So, what you’re saying about this anomaly, right? If all of a sudden you got a spike in usage, you’d expect say your web traffic, your API calls, your storage to kind of all grow together. But if everything else stays the same and all of a sudden your database usage goes up, maybe say okay, shoot, someone messed up the backup script or something like that. And we’re making 100 extra copies. It’s now something strange that you have to go investigate because it’s not part of the rest of the system. Right? You’d expect the system to behave as a whole. And so if one thing goes off, in this case, all of a sudden your database service quota gets exceeded but nothing else does, you’d want to know about that.
Rahul

Correct. Like, I mean, there are scenarios where you do expect that elastic nature to kind of kick in and move. So for example, with Amazon themselves, one of the interesting documents that they publish after every Prime Day is about the services and how much they were put under stress and what they could handle. So yeah, when you have something like a Prime Day come up, you know DynamoDB definitely kind of shoots up in utilization and you suddenly start seeing many millions of transactions per second executing well through it. But on a typical day, it’s not something that you would expect. I would think that you would create some sort of an upper bound on the different API requests and stuff like that, and if something exceeds over there, you raise a flag or raise an alarm to say something is not right.

You know, I think this also goes back to what we talked about in terms of Lookout for Metrics. Imagine you could actually look at your Lookout for Metrics. You can get the statistical averages, like figure out what your one sigma or one standard deviation looks like for your particular metric, and you could use that to set up your service quota for that service. So, your system knows and learns what one standard deviation implies and you use that as a mechanism for managing your service quota. Even if you have an anomaly, you basically manage it to an upper bound and you can take a call on what you want to do next. So that’s another way of using Lookout for Metrics as well.
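
A minimal sketch of the one-sigma idea above, assuming you already have a history of values for one metric. The function names and the sample numbers are hypothetical, and this is plain statistics rather than a call into Lookout for Metrics or Service Quotas.

```python
from statistics import mean, pstdev


def suggested_upper_bound(samples, sigmas=1.0):
    """Return mean + sigmas * (population) standard deviation of the samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return mean(samples) + sigmas * pstdev(samples)


def exceeds_bound(value, samples, sigmas=1.0):
    """True when a new observation is above the suggested upper bound."""
    return value > suggested_upper_bound(samples, sigmas)


# Hypothetical daily request counts for one service (made-up numbers).
history = [100, 110, 90, 105, 95]
bound = suggested_upper_bound(history)

print(f"suggested upper bound: {bound:.2f}")
print("500 requests is over the bound:", exceeds_bound(500, history))
```

For roughly normal data, a one-sigma bound would be crossed on about one day in six, so a wider band (a larger `sigmas` value) may be a more practical starting point.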
Stephen

You know, I like what you said about, well, in a sense, DynamoDB is going to handle whatever you’re going to throw at it. Same with S3, right? You’re not going to overwhelm S3 with your data. It’s big enough, and so then you have to have some other mechanism for regulating this. It’s kind of a funny analogy but I was thinking back to my old diesel Land Cruiser. I used to say, “You know, it has a natural mechanism for preventing speeding.” It can’t. There’s no better… I never got a speeding ticket in that vehicle because it’s just not capable of it. But DynamoDB is the exact opposite, right? If you tell it to store a billion records or half a trillion records, it’ll do it.
Rahul

Yes.
Stephen

You have to have some other way of regulating it. And maybe that is what you want to do, but you’d at least want to know about it.
Rahul

Yeah. I have known of a number of AWS customers who have had the scenario where there are these crypto miners lurking around. And the minute they can get access to an API which allows you to launch an instance, they will use it. They will launch the P instance types, which are basically your GPU-centric ones. They are expensive to launch. And they will launch as many instances as they possibly can. And then they will run all of their mining stuff, you know, on it. Now, AWS has done a phenomenal job of creating multiple checks and stuff like that, but one of the big instruments that you have in your armory against that kind of attack is your service quotas, because you can limit that. As a policy within our org, we set it literally at the org level, down to every account.

Our service policies ensure that you cannot launch a P instance type. If you want to launch a P instance type for machine learning kind of use cases, then specific exceptions are created. And also, it’s just not possible to launch those instances in 99% of our accounts, just because that’s a control mechanism that guards not just against external attacks but also internally, so that you are not wasting money on launching these instances. Unless you have a justifiable case for launching those kinds of GPU-based instances, you’re just wasting money. So it’s an incredibly valuable tool in your arsenal to fight waste.
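
A sketch of what such an org-level guardrail could look like, assuming "service policies" here means AWS Organizations service control policies (SCPs). The policy below is an illustration, not the speakers' actual policy: the `Sid`, the helper name, and the `p*` pattern (which matches every instance type whose name starts with "p") are assumptions, and the per-account exceptions mentioned above are not modeled.

```python
import json


def deny_instance_family_policy(type_pattern):
    """Build an SCP-style document that denies launching matching EC2 instance types."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInstanceFamily",  # hypothetical statement ID
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    # ec2:InstanceType is the condition key for the requested type.
                    "StringLike": {"ec2:InstanceType": type_pattern}
                },
            }
        ],
    }


policy = deny_instance_family_policy("p*")
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of document you would attach at the organization root or to an organizational unit; attaching it is left out of the sketch.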
Stephen

When I was a naive college student, I once checked an AWS key into a public GitHub repository and that’s exactly what happened. That was a terrifying couple of days and luckily, I was able to reverse that. That was a long time ago. And now, luckily, they work with GitHub. You can’t check in a key. That will be rejected immediately because they know the dangers of that.
Rahul

Correct.
Stephen

The more mechanisms for stopping this the better.

All right, well, should we switch gears to Lex?
Rahul

Yeah, let’s talk about Lex.
Stephen

This is exciting. I pulled this article up. Here we go. This is, and I’ll paste the URL in the chat, the Automated Chatbot Designer. This enables developers to automatically design chatbots from conversation transcripts in hours rather than weeks. So I think the idea here is if you have an existing customer support system, then you already have all these great interactions available and you can use those to train a chatbot, and that way the human support agents can work on the special cases that the chatbot can’t deal with. I guess the idea of this is like reinforcement learning. Right? Because you train it on this, and then you get feedback on how well the chatbot does. And I was reading through this article earlier, it says you can iterate on this. Here we go. Developers can iterate on the design, add chatbot prompts and responses, integrate business logic, and build, test, and deploy. And so, you get this continuous feedback, and I guess it goes on this theme that we’ve been talking about of using machine learning to train your models and then to tell you when it is working and when it isn’t working.
Rahul

Yeah. So I’m gonna start with a personal confession. Lex is one of my favorite services of just… It’s enabled me to do a lot of things that I would not have been able to do otherwise.

So, the way I end up using Lex, and this is just for some background about why this chatbot tool is so valuable. I have a terrible sense of UI and UX. And every time, you know, like a big part of my job is to work on product. And I think our strengths lie more in the backend engines that drive all these amazing products. But when it comes to UI and UX, I find myself fairly handicapped. And thankfully we have a good team to kind of overcome those limitations. But I found when Lex first came out, again, one of those light bulb moments for me, where I found that my team and I were really, really good at writing up APIs to do some very specific operations. And it was amazing that I could take a Lex intent and map it directly to an API. And the fact that it was available in the chatbots meant that, whatever output I wanted out of an API, I did not have to build out a full-fledged UI with a bunch of dropdowns and figure out how to get to the right data to pass on to that API call.

I could literally define an intent with slots, the slots being the API parameters, and the output would come right to me in the chat client. Right? So, for whatever question I wanted answered, I could interact directly. I’d just write an API and that API would be the intent and the parameters that are needed were the slots. And then I could have a conversational mechanism of asking that question. And sometimes I would forget what the parameters are, so I’ll just say, “Hey, I want data on X.” And it would say, “Hey, great. What date range do you want it for?” And then, “What particular dimension do you want it on?” and it would ask me all these questions back to help me fill out these parameters. And it would get me the answer. And I wouldn’t have to build out a full-fledged UI with dropdowns and selectors, and date selectors, and figure out how to construct this amazing UX that wouldn’t feel painful.
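
The conversational slot filling described above can be sketched in plain Python. This does not use the Lex API or its event format: the `Intent` class, `get_report`, and the prompts are hypothetical stand-ins that show the pattern of asking only for the parameters that are still missing and then calling the backend.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    name: str
    prompts: dict  # slot name -> question to ask when that slot is missing
    slots: dict = field(default_factory=dict)  # slot values collected so far

    def next_prompt(self):
        """Return (slot, question) for the first unfilled slot, or None when done."""
        for slot, question in self.prompts.items():
            if slot not in self.slots:
                return slot, question
        return None


def get_report(metric, date_range, dimension):
    """Hypothetical backend API; its parameters are the intent's slots."""
    return f"{metric} by {dimension} for {date_range}"


def run_dialog(intent, answers, api):
    """Ask for each missing slot, then call the API with the filled slots."""
    asked = []
    while (pending := intent.next_prompt()) is not None:
        slot, question = pending
        asked.append(question)
        intent.slots[slot] = answers[slot]  # stands in for the user's reply
    return api(**intent.slots), asked


intent = Intent(
    name="GetReport",
    prompts={
        "metric": "What do you want data on?",
        "date_range": "What date range do you want it for?",
        "dimension": "What particular dimension do you want it on?",
    },
    slots={"metric": "signups"},  # "Hey, I want data on signups."
)
result, asked = run_dialog(
    intent, {"date_range": "last 7 days", "dimension": "region"}, get_report
)
print(asked)
print(result)
```

Because the metric was given up front, only the date range and the dimension are asked for before the backend call is made.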
Stephen

I’d like to stay with intents. So, I’m thinking about home automation, right? You could have an intent be, turn on the lights, and then the slots are, which lights?
Rahul

Correct.
Stephen

And looking at it this way, we have an Alexa speaker that doesn’t have a screen, so it has to be voice-activated. And so being able to have that UI, like you said, without building a drop-down menu, some API call that lists the lights in your house, then another selector thing, and how do you unselect, and how do you cancel. You know, you can have it all… You can have UI, voice-powered UI, which as we discussed in a previous episode, in all the sci-fi, in the future you’re just talking to the computer. So that’s where we’re going.
Rahul

Yeah. In fact, like, I think one of the simplest… Here’s a use case that I use extensively. We use a lot of QuickSight dashboards that have tons of data. And I personally feel very handicapped kind of navigating through those QuickSight dashboards because there are so many dials, and unless you really know the underlying data and how to slice and dice it, it just becomes really, really complicated. So, every time I manage to work with somebody to create a dashboard that I like, to create a view of the world that I like, I immediately take that and write up a very simple API endpoint that says, here’s what I want and here are the four parameters that go into determining the right data.

And once I’ve got that little piece implemented, it’s usually about 10 to 15 lines of code that go into a Lambda, I literally can ask my Alexa device or I could just type into my chatbot saying, “Hey, can you pull up this particular data?” And then it says, “Hey, give me the date range, give me parameter B, give me parameter C.” I’m like, “Okay, what are the choices for parameter A, B, or C?” It can give me those answers, and I fill it in right away. And it gives me the report right there in my chatbot. So it gives me a QuickSight URL, I click on it, and it takes me to the dashboard with all the parameters filled in. And I don’t have to fiddle with the UI to figure out how to get there. It’s one link, it’s got everything built-in, I click on it, it takes me there.
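
A rough sketch of that "10 to 15 lines in a Lambda" step, assuming the dashboard accepts its parameters in the URL fragment with a `p.` prefix, which is how QuickSight documents URL parameters. The event shape, dashboard ID, and parameter names are hypothetical, and this is not the Lex fulfillment event format.

```python
from urllib.parse import quote


def dashboard_url(region, dashboard_id, params):
    """Build a dashboard link with each parameter pre-filled as p.<name>=<value>."""
    base = f"https://{region}.quicksight.aws.amazon.com/sn/dashboards/{dashboard_id}"
    fragment = "&".join(
        f"p.{quote(str(name), safe='')}={quote(str(value), safe='')}"
        for name, value in params.items()
    )
    return f"{base}#{fragment}" if fragment else base


def handler(event, context=None):
    """Lambda-style entry point using a simplified, hypothetical event shape."""
    url = dashboard_url(event["region"], event["dashboard_id"], event["slots"])
    return {"message": f"Here is your report: {url}"}


url = dashboard_url(
    "us-east-1", "abc123", {"startDate": "2022-06-01", "team": "data eng"}
)
print(url)
```

The chatbot would return the `message` text, so the user gets a single link that opens the dashboard with the filled-in slots applied.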

And that I find incredibly liberating because, you know, now, I’m not bound by all the UI and UX and I’m not in UI hell, which is what I kind of dig myself into. But back to this particular announcement, I think why this is so incredibly valuable is that back when I started, I’m talking about five, six years ago when Lex first came out, I had to write Lambdas for every little chat handler. You know, you had to know how to write Lambdas, deploy those functions, figure out how you can scale them. If you had 10 parts of the conversation, then you’d have 10 different Lambdas dealing with all this complexity. Now, if you go to the chatbot designer, you literally don’t have to do any of this stuff. You don’t even have to deal with Lambda. You can literally just write up a bunch of your intents, define your slots, give it just a few phrases so that it can understand how to parse those intents, you can set context right there in the UI, and you can try it out right in the UI.

And if it works, great. If it doesn’t, and you want to tweak some of those phrases because there’s a different way that you’re asking it, you teach it. You can give it a few more, three or four more, and then you’re all set. It works. So, I think, I would expect to see folks really leverage Lex to create a bunch of automations. The chatbot interface is so underutilized today. If you feel very, very comfortable with APIs, chatbots are probably the best way for you to start automating a whole bunch of stuff and getting value right away without needing to build out complicated UIs.
Stephen

Every time I go through the React Tutorial… I hope we could do a quick switch here. Every time I go through the React Tutorial, I think, “I can’t do this.” Being able to talk to a chatbot and just give it a little guidance on what to do, that’s so much better.
Rahul

Yeah, and in fact, it’s just the API call. I mean, if you know how to design an API endpoint, the Lex interface feels so intuitive. You literally map the API function name to the intent, that’s your intent, and each parameter in your function call is a slot. There couldn’t be a better match for a UX to the backend system. I find it incredibly awesome. It’s very, very liberating.
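
The function-name-to-intent and parameter-to-slot mapping can be made concrete with Python's `inspect` module. The `turn_on_lights` function and the output dictionary layout are hypothetical, chosen to echo the lights example earlier in the conversation; this is not the Lex bot definition format.

```python
import inspect


def intent_from_function(func):
    """Describe a function as an intent: its name, plus one slot per parameter."""
    signature = inspect.signature(func)
    return {
        "intent": func.__name__,
        "slots": [
            {
                "name": name,
                # A parameter without a default has to be asked for.
                "required": param.default is inspect.Parameter.empty,
            }
            for name, param in signature.parameters.items()
        ],
    }


def turn_on_lights(room, brightness=100):
    """Hypothetical home-automation API."""
    return f"{room} lights on at {brightness}%"


spec = intent_from_function(turn_on_lights)
print(spec)
```

Here `room` comes out as a required slot and `brightness` as an optional one, which is the same split a bot would use to decide which questions it has to ask.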
Stephen

Fantastic. Well, let’s take another quick break. And when we get back, we will do one, maybe two more announcements. And all right, we’ll be right back in about 30 seconds.
Woman 1

Is your AWS bill going up? CloudFix makes AWS cost savings easy and helps you with your cloud hygiene. Think of CloudFix as the Norton utilities for AWS. AWS recommends hundreds of new fixes each year to help you run more efficiently and save money, but it’s difficult to track all these recommendations and tedious to implement them. Stay on top of AWS recommended savings opportunities and continuously save 25% off your AWS bill. Visit cloudfix.com for a free savings assessment.
Stephen

All right, and we’re back. So we’ve got two different announcements from the Connect team. So Amazon Connect, that’s essentially the call center service, and that’s what we’ve gotten to. The first one, general availability of outbound campaigns for calls, texts, and emails.

So, going through this. I’ll post this announcement in the chat with the other ones. It offers organizations an embedded, cost-effective way to contact up to millions of customers daily. Now, this is a big deal. I mean, running outbound communications in this age of spam, there’s always a kind of adversarial relationship when you’re doing massive outbound communications, and you don’t want to get blocked by the main carriers. I used to run my own SMTP server for sending emails a long time ago. It’s really hard to do that anymore without getting flagged as spam. But with Amazon, you can use Amazon Connect to send these calls, texts, and emails and they’ve done that hard work in terms of being on all the right filters and all that.

And also regulation. This is really neat, I thought. New communications capabilities include features to support compliance with local regulations like TCPA through point-of-dial checks and call controls for time of day. So, for example, say you have some automated system, it’s going to make sure that you don’t accidentally call someone when it’s 3:00 in the morning their time, and then they tweet about it, and all of a sudden your company looks bad. It’s all the stuff that you don’t have to write yourself, that they’ve done for you and packaged up.
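
To make the time-of-day control concrete, here is a small sketch of the kind of check you would otherwise write yourself. It is not how Amazon Connect implements it. The 8:00 to 21:00 default window is an assumption modeled on the commonly cited TCPA calling hours, and the fixed UTC offsets in the example are stand-ins; a real system would look up the recipient's time zone (for example with `zoneinfo.ZoneInfo`) so that daylight saving time is handled.

```python
from datetime import datetime, time, timedelta, timezone


def within_calling_window(now_utc, recipient_tz, start=time(8, 0), end=time(21, 0)):
    """True when the recipient's local time falls inside [start, end)."""
    local = now_utc.astimezone(recipient_tz)
    return start <= local.time() < end


# 10:00 UTC on the day of the episode.
now = datetime(2022, 6, 22, 10, 0, tzinfo=timezone.utc)

us_pacific = timezone(timedelta(hours=-7))  # assumed offset, 03:00 local
uk_summer = timezone(timedelta(hours=1))    # assumed offset, 11:00 local

print("call US Pacific recipient:", within_calling_window(now, us_pacific))
print("call UK recipient:", within_calling_window(now, uk_summer))
```

A dialer would run a check like this right before placing each call and defer the ones that fall outside the window.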

And then, again, what’s in the background, machine learning, predictive dialing, ML-powered answering machine detection, just to make sure you’re optimally using your agents’ time. So, if there’s an answering machine, your agent’s not gonna spend 30 seconds talking to it and then leaving this message that won’t have a great return on their time investment because that adds up. 30 seconds times 1 million is a lot of time to be wasting.

Rahul, what are your comments about this announcement?
Rahul

I’ve loved Connect, and I think we’ve had an earlier episode where I talked about how many amazing things they’re adding to this particular service. In most organizations, the people involved in talking to customers are incredibly valuable and important on so many dimensions. Number one, they are your one touch point with your customer more than any other, you know… Like, your sales folks, your finance folks, your product folks don’t talk to the customer or interact with the customer quite as much as your support folks do. So making sure that you have consistency in the way these people or this organization interacts with the customer is incredibly important to your brand, to your reputation, and you know all that stuff. The more you can automate, the more checks you can put in place, the more consistently you can deliver that interaction or the customer experience, the better your brand gets in the long term.

If you have, you know, two agents, one being really, really customer-centric and focused and the other one not so much, you’re gonna have a very inconsistent experience from a customer standpoint. So, tools like this, where you can focus your agents to only handle a narrow set of things, which are really important and valuable, where machines cannot possibly kind of play that particular role, I think add to the overall brand. You’re using people to only focus on the really, really hard stuff that requires judgment, that requires a certain skill, and you’re not having them do a very wide range of things, in which case you will end up with fluctuations in customer experience. So, the more tools that come out of the Amazon Connect space, I think they are creating a pattern that could potentially allow you to build one of the best customer service organizations on the planet. And I love every one of these new capabilities that they’re coming up with.

I mean, can you imagine reaching out to your customers in near real-time when let’s say a service is down or let’s say you are about to get on the flight and the flight is delayed. Or let’s say there’s an emergency of some kind and you need to get to the customers right away, you need to reach millions of customers, tell them exactly what they should do. There is no way to handle this effectively when you’re doing this with people. Humans don’t scale quite the same way. So, tools like this, which meet all the compliance requirements, meet all the laws, and keep your brand identity and your brand reputation intact, I mean, tools like this are invaluable.
Stephen

I love what you said, that consistent experience. And it’s, you know, when you deal with some of these poorly trained chatbots and it’s just, you know, it can do your head in a little bit, really I just want to talk to a person, but then when you deal with these really effective ones, it’s like wow, that was… I dealt with one for a refund and it answered the question exactly what I wanted, didn’t diss anyone. And I like that idea of, you know, this is about outbound campaigns. So saying like a flight is delayed or some important thing, you know, an outage in your service that the customers need to hear right away. The ability to do that, like you said, without having to code it all yourself and make sure that you aren’t disturbing customers or running afoul of some local regulation, that’s a really neat thing.
Rahul

Yeah. And to give you a simple example. I mean, I’ll just bring up one real-world use case of this. Every time you receive an Amazon delivery, within 24 hours you’ll actually receive an email that says, “Did you like the product? Did you receive it? Was it in good shape? Did the delivery happen correctly?” And very recently, I had ordered something where I’d ordered two items. One of them came in, the second one didn’t. And I was dreading going into the Amazon site and dealing with support. But because I got the…I received this email, it literally said, “Hey, just tell me whether you received it or not.” And there were four options in that email. I was able to provide feedback without having to log into the Amazon portal or going to support. I mean, Amazon hides their… Getting to support is not the easiest thing on the Amazon website. So yeah, I was able to literally give feedback. They got back within 24 hours. And they had another set of the product shipped back to me, whatever was missing. And that customer experience is night and day versus me having to actually find a person, talk to that person, give them the feedback which is again a very inconsistent interaction that I would have in that scenario.
Stephen

And then think about how it would have been where you have to call someone, you write down a 28-digit case number, hope that you got it right, hope that the next time you call it got entered properly. But because you have that confidence that the case was handled even though it was a bot and not a human being, you still have that confidence that it was handled, it was managed, and you got that quick turnaround time, which is why you’re probably gonna do more Amazon Prime orders in the future, I’d imagine.
Rahul

Exactly. Absolutely. I have Amazon Prime accounts in three different countries depending on where I am. And yeah, I do that because of the consistency of the experience.
Stephen

Yeah, it’s pretty neat. I think it’s a really good thing that we can have that very consistent and yet highly scalable service.
Rahul

Exactly.
Stephen

Let’s talk about Connect Cases. We’ve got about three minutes, so this will have to be a quick one. Built-in case management capabilities to make it easy for your contact center agents to create, collaborate on, and quickly resolve customer issues that require multiple customer conversations and follow-up tasks. So that’s what I was talking about earlier. You write down that case number, you call in, you hope that it’s stored correctly. This is a better way. Connect Cases. And I liked what it said. It said, “Without having to build custom applications or integrate with third-party products.” So from a developer perspective, again, code is a liability. And sure, it sounds interesting and fun to get to mock up this new database and think of all the stuff, but at the end of the day, why would you want to manage this if it’s not the core of your business, when Amazon’s iterated on it a thousand times more?
Rahul

Correct. And I mean today the reality is that case management has become a fairly or relatively commoditized thing, where you use ServiceNow or Zendesk or other licensed products, and that becomes your case management system. And then the pain really is to have all these systems integrate. You have to go through that process, make sure the data is flowing through correctly. More often than not, the data actually resides in multiple different data sources where you might have some data in Salesforce, some data in Zendesk, and you need data from both of them to match up. Disambiguation becomes a problem, so Connect kind of helps solve a lot of that problem where you have a matching capability within Connect where, even if you’re pulling data from different sources, the matching will disambiguate it.

So, I think it’s already there and is certainly going to get better as Connect becomes the center of your entire CRM and customer support operations. The center of gravity is moving into the Connect product or Connect service itself, and that’s where your single pane of glass, your single source of truth is going to reside. All the other peripheral systems that you end up using are gonna have some slice of the data, but if you want that holistic view, Connect is probably the place to go look to find out where the real truth lies. And it’s also then the right place to have your agents look into as they’re interacting with your customer because that’s where they have the 360-degree view of what is important to the customer, what do they care about, what should I not say, what resonates with them in terms of tone. All of that residing in one place in Connect, I mean, makes it much easier for agents instead of having to go to four or five or six different systems and try to pull the data out.
Stephen

And it really is, like you said, the make or break of customer experience. Every company is going to have a bad experience or a bad initial thing where you buy a product and it doesn’t show up, where there’s a bug. You know, something bad happens. That’s inevitable. But how it’s handled, for me that determines whether… That totally determines my brand loyalty. And like you’ve said, there are certain companies where I know they will handle it if there’s a problem. And even if the problem is extremely rare, that confidence in the back of my mind that they’ll handle it is what keeps me loyal as a customer. And so being able to offer that without having to have this huge investment in data integration and pipelines and training, that’s a really empowering thing for businesses of I guess any size. Especially the smaller ones where they don’t have the scale to build this all out themselves.
Rahul

Completely agree.
Stephen

Well, we’ve gone to the end of our time. And again, we have more to share. We do have to… I think the queue grows faster than our ability to read from it.
Rahul

Absolutely.
Stephen

But that’s a great problem to have. Thank you, Tess and Alyson. I really appreciate your translation work. Thanks, Rahul. It’s really good to have this great discussion. Thank you again to the audience. We will see you again next week.
Rahul

Thanks, everyone.
Stephen

Thanks a lot.
[01:11:29]

♪ [music] ♪
[01:11:42]
Woman 1

Is your AWS Public Cloud bill growing? While most solutions focus on visibility, CloudFix saves you 10% to 20% on your AWS bill by finding and implementing AWS-recommended fixes that are 100% safe. There’s zero downtime and zero degradation in performance. We’ve helped organizations save millions of dollars across tens of thousands of AWS instances. Interested in seeing how much money you can save? Visit cloudfix.com to schedule a free assessment and see your yearly savings.