On this episode, Alex dives deep into the intricacies of Amazon DynamoDB, a planet-scale NoSQL database service that supports key-value and document data structures. Alex discusses the consistency and predictability in the design of DynamoDB’s performance, and how to best utilize it. Alex is the author of “The DynamoDB Book”, a comprehensive guide to data modeling with DynamoDB. He has also been recognized as an AWS Data Hero.
AWS superfan and CTO of ESW Capital
Author of “The DynamoDB Book”
Hello and welcome to AWS Insiders. On this podcast, we’ll uncover how today’s tech leaders can stay ahead of the
constantly evolving pace of innovation at AWS. You’ll hear the secrets and strategies of Amazon’s top product
managers on how to reduce costs and improve performance. Today’s episode features an interview with Alex DeBrie,
principal at DeBrie Advisory and author of The DynamoDB Book. On this episode, Alex dives deep into the
intricacies of Amazon DynamoDB, a planet-scale NoSQL database service that supports key-value and document data
structures. Alex discusses the consistency and predictability in the design of DynamoDB’s performance and how to
best utilize it. But before we get into it, here’s a brief word from our sponsor.
This podcast is brought to you by CloudFix. Ready to save money on your AWS bill? CloudFix finds and implements
100% safe, AWS-recommended account fixes that can save you 10 to 20% on your AWS bill. Visit cloudfix.com for a
free savings assessment.
And now here’s your host, AWS superfan and CTO of ESW Capital, Rahul Subramaniam.
Welcome back to AWS Insiders. In the previous episode, we started talking about DynamoDB with Alex DeBrie. We
discussed its architecture and the unique characteristics that allow it to be a planet-scale data store. Thanks
for joining us again, Alex, and welcome back. I want to jump right in and take off from where we left the
conversation in the last episode. I’d love for you to share with us one or two examples of use cases where the
scale of the problem made DynamoDB the only real choice.
Yep, absolutely. So it’s hard to pick one or two, but a few interesting ones would be, basically what we know
without the factory trilogy, there’s a billing and charging system they’re doing for telecoms. So you think all
these cell phones or even, I believe internet-connected devices sending 3G, 4G, 5G data all over the place, and
you just think of all the tiny little charging requests they have on a per-second, per-minute basis where they’re
saying, “Hey, can I make this phone call?” and maybe authorizing them by saying, “Yeah, you can, you can have it
for the next five minutes. Check in if that phone call is still going,” and then telling us how long that phone
call actually lasted or how much data was actually sent. And just think of the scale of these telecom
things, keeping track of all those tiny little charge requests and making that happen.
And I think a lot of the old-generation ones are built on relational databases and they can do that, but
there’s more latency in how much they can do. There are more limits on how many concurrent transactions they can
handle, or it starts to get quite expensive. Whereas we designed this one from scratch using DynamoDB, and
you can roll it out. It can handle global telecom requests and it can handle pretty significant amounts of
traffic at a really low, predictable latency. We can optimize for, “Hey, these are the really important paths
that we need to handle to authorize these particular charge requests.” Make sure they’re good and allow it,
because you don’t want someone trying to make a call and it takes 30 seconds to figure out if they can,
if they have enough credits on their bill or whatever.
So I think, if you really want a global-scale solution there, DynamoDB is really going to be the
only option for you. Another one that I think is pretty interesting: I can’t get into super specifics, but I had a
customer that was in an industry that was heavily affected, sort of positively, I’d say, by COVID,
right? Where usage went up quite a bit because of COVID and just people’s changing patterns. So they
knew a big selling event was coming down the road and they also had very cyclical usage patterns. So during the
day, usage was much higher than at night. During the week, usage was much higher than during the weekend. And then
also periods of the year would be much higher. So they had some times where usage would be much, much lower and
times where it would be much higher.
And they were in one of these low-usage periods, but were saying, “Hey, in a few months, it’s really going to
ramp up and we need to know what we can get out of that.” And just having the predictability around DynamoDB of
saying, “Hey, we know that it’s going to be able to scale this. We’ve done the math. We’ve calculated this out.
We’re not going to hit partition throughput limits. We’re not going to hit general scale limits. We can actually
do this thing,” is really great. And then also having that ability to scale up and down during the day, during
the week, during periods of the year and save a lot of money on their bill that way, I think it was just really
beneficial to them and something that would’ve been hard with a relational database.
I agree. Another example that stands out for me is that of applications that support custom fields. Now,
relational data stores require you to declare a static structure for the schema right upfront. And the need for
custom fields throws a wrench in those works. The most common practice to work around that particular problem is
to create a separate table that stores both your custom fields and the values, but the challenge there is that
then for every query that you run, you’re required to do a join with that custom table. Now you really start to
see the performance degradation as the size of that table increases and the joins get more and more expensive.
DynamoDB, on the other hand, seems like such a natural fit for such applications at scale. So staying on that
subject of scale, I actually wanted to bring up the conversation around the infinite scale. Every time I read
the blog post about Amazon Prime Day statistics, I’m astounded by the kind of transaction throughput that they
push through DynamoDB. And these numbers are orders of magnitude higher than what any relational data store
could possibly do today.
And despite proof points like that, most people find it really hard to digest the AWS claim that DynamoDB has
potentially infinite scale. What do you tell people about the architecture of DynamoDB that makes this
theoretical infinite scale possible?
Yeah, I think that’s one of the really cool things about DynamoDB. And I love those Prime Day posts that Amazon
comes up with and say, “Hey, this is what we drove on Prime Day,” and the numbers, like you’re saying, are truly
mind-boggling. But again, the principles beneath Dynamo are going to be pretty simple and you can sort of see
how it’s just going to scale literally as your data grows there. So, I think the most important thing to know
about Dynamo is that it has that primary key structure that we talked about, but every single item is going to have
what’s called a partition key. And what this partition key is doing is basically helping decide to which shard
or node a particular item is going to go. So when a request comes into DynamoDB, the first thing it’s going to
hit is just this global request router that’s deployed across the entire region.
It handles all the tables within a particular region and it gets that item. It pulls up the metadata for that
table. It hashes the partition key value that was sent in for a particular item. And then based on that, it
knows which node to go to in the fleet. So each node in the fleet is going to hold about 10 gigabytes of data.
And if you have a hundred-gig table, you’re going to have 10 different primaries behind the scenes serving that
data. And in that request router, it’s going to hash that partition key and say, “Oh, this item is on
primary four.” It’s going to go there and go directly to that. So that’s an O(1), constant-time lookup to figure
out which record in the hash map this basically belongs to. But now as you go from a hundred gigs to a
terabyte, now you have a hundred different partitions.
It’s still going to be that constant-time lookup at the front end to say, “Hey, which partition does that
belong to?” Whether we have 10 partitions, whether we have a hundred partitions or a thousand partitions, it’s
going to be the same amount of time to get to the specific partition, which again is only holding 10 gigs of
data. And then within that partition locating your particular item or locating a range of items is going to be
very efficient for you. And that’s how they can continue to scale those out. I want to just push on that a
little further. One thing that was interesting to me, talking to a Dynamo person at re:Invent this year, is just
how important the size of that partition is. So that size, it’s only 10 gigs that they’re holding on these
different machines. They’ve got a giant fleet of machines, 10 gig partitions plus secondaries and all that.
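As a rough illustration of the routing described above, the following Python sketch hashes a partition key and maps it to one of N partitions. It is only a sketch under stated assumptions: the MD5 hash, the modulo mapping, and the names `partition_count` and `route` are hypothetical stand-ins, since the episode does not say which hash function or key-range mapping the service uses. The 10 GB partition size is the figure quoted in the conversation.

```python
import hashlib

PARTITION_SIZE_GB = 10  # per-partition size quoted in the conversation


def partition_count(table_size_gb: int) -> int:
    """Number of ~10 GB partitions needed for a table (ceiling division)."""
    return max(1, -(-table_size_gb // PARTITION_SIZE_GB))


def route(partition_key: str, num_partitions: int) -> int:
    """Hash the partition key and map it to a partition index.

    MD5 and the modulo step are illustrative assumptions, not the
    service's actual hash function or key-range mapping.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    for size_gb in (100, 1000):
        n = partition_count(size_gb)
        # The work done by route() is the same whether n is 10 or 100.
        print(size_gb, "GB table:", n, "partitions, key routed to", route("CUSTOMER#123", n))
```

The point of the sketch is that the routing step costs the same no matter how many partitions exist, which matches the constant-time lookup described in the conversation.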
But that size makes it very easy for them to recover from instance failures. If an instance fails, they can
promote to a secondary, but it’s also very quick to just replicate that 10 gigs somewhere else. If you have a
one-gig network link there, it takes 10 seconds to replicate that somewhere else. Whereas I worked for a
company, we had these giant MongoDB deployments, and each shard was going to be 150, 200 gigs sometimes. Just
because of the maintenance for us, we didn’t want to have a thousand shards. So it’s easier to have 10 shards of
200 gigs rather than a thousand shards of whatever. But it also means it’s a much bigger deal if one goes down:
it’s harder to bootstrap a new one, to get a new secondary back online, whatever that is. So I think that
the partitioning scheme that DynamoDB has, but then also being a fully managed service where they can put all
the customers across a particular region and have these tiny little partitions share the hardware. I think that
really helps what they can do there.
Yeah. It’s amazing how that simplicity of structure has actually allowed DynamoDB to scale as it does. I think,
if they had made it any more complex than it is today, I don’t think they would have been able to build a service
as scalable and flexible as DynamoDB. I think that simplicity is key over there. So that brings me to the next
section, where I’d love to know what your top three tips for DynamoDB customers would be.
Top three tips, man, that’s a tough one. There are so many, and it really depends on where you are
in the learning process. Are you brand new? Then I have different tips than if you’re maybe intermediate versus if
you’re an expert, things like that. I would say I sort of break it into, again, three different learning
levels. So, the biggest one, if you are brand new to DynamoDB, and this is your first time doing Dynamo or
NoSQL data modeling, my tip would be: don’t just treat it like a relational database. I think that’s going to
get you into trouble. But understand how you think about DynamoDB data modeling, the principles of single-table
design, access-first design, whatever you want to call it. So I think the errors you see there would be
normalizing your data too much.
Using that relational model, trying to do joins in your application code rather than pre-joining your data to
handle your access patterns or things like that. So that’s the first one. If you’re brand new, that’s the
biggest thing I would say is to understand that it’s different, understand how single table design works,
because that at least teaches you, “Hey, this is different. And something else is going on here.” If you’re a
little further down that path or in the intermediate level, I would say, be careful about going too far on the
other end of the spectrum. Sometimes, again, like you’re saying, hyper-normalization or denormalizing everything
can be a problem. I also see people that are sort of in that medium range, where maybe this is the first
data model they’re picking up on single-table design. They put all sorts of items for a customer in the same
partition, all with the same partition key, even if they’re not accessing them together.
So going back to our example before: customers, addresses, orders, order items, giving them all the same
partition key so they’re all grouped together, even though you’re never fetching a customer and the order items
in the same request. So those should be in different partition keys. So make sure you’re not overloading that too
much. And then, if you’ve reached, hopefully the highest level of enlightenment, you’re starting to understand
DynamoDB. I would just say, make sure you’re again, really thinking about the specifics of your application, and
do the math. That’s the biggest thing I tell people: “Hey, do the math and figure out what is going to be optimal
for your exact application,” because it’s hard to give very generic advice. I can tell you the three or four or
five factors you need to consider, but you need to do the math. Think about it. What makes sense for your
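To make the partition key advice concrete, here is a minimal Python sketch of key builders for the customer and order example. Everything in it is an assumption for illustration: the `PK`/`SK` attribute names, the `CUSTOMER#`, `ADDRESS#`, `ORDER#`, and `ITEM#` prefixes, and the premise that addresses are fetched with the customer while order items are fetched with their order.

```python
def customer_key(customer_id: str) -> dict:
    """Key for the customer record itself."""
    return {"PK": f"CUSTOMER#{customer_id}", "SK": f"CUSTOMER#{customer_id}"}


def address_key(customer_id: str, address_id: str) -> dict:
    """Assumed to be read together with the customer, so it shares the customer's partition key."""
    return {"PK": f"CUSTOMER#{customer_id}", "SK": f"ADDRESS#{address_id}"}


def order_key(order_id: str) -> dict:
    """Orders get their own partition key instead of piling onto the customer's."""
    return {"PK": f"ORDER#{order_id}", "SK": f"ORDER#{order_id}"}


def order_item_key(order_id: str, item_id: str) -> dict:
    """Assumed to be read together with the order, so it shares the order's partition key."""
    return {"PK": f"ORDER#{order_id}", "SK": f"ITEM#{item_id}"}


if __name__ == "__main__":
    print(customer_key("42"))
    print(address_key("42", "home"))
    print(order_key("1001"))
    print(order_item_key("1001", "1"))
```

Items that are fetched in the same request share a partition key; items that are never fetched together do not, which is the grouping rule described in the conversation.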
I agree. I think doing the math is great advice. Talking about math, the pricing of AWS services can get pretty
complex at times, since the pricing happens on multiple different dimensions. What is your advice to the
audience about mistakes that they should avoid or best practices that they should follow to ensure that they
don’t have an out-of-control bill at the end of the month?
I think one thing you might get into trouble with is if you have too many secondary indexes, global secondary
indexes. We haven’t talked about that too much here, but basically in Dynamo, we talked about
the importance of the primary key, but you might have an item that you need to access in different ways. And
what you can do is set up what are called secondary indexes. And those will re-index the data. It’s sort of like
a read-only copy of your table that you can use for these additional access patterns. And whenever you write an item
to your table, it’s going to get replicated out to those secondary indexes. So super useful, but watch out for
having too many of those GSIs. If you have an item that you’re indexing 6, 7, or 8 different ways, every time
you do a write, every time you update that item, you’re going to have to pay for it eight different times,
which is going to cost you a lot there, especially if it’s a bigger item.
Because they’re going to charge you based on the size of that. So I think that can be a sneaky one that sneaks
up on people, where they have this write amplification cost that they didn’t expect, and it can be costly there.
So look into that one. Again, I think going back, understanding the core principles, and again doing the math,
is right. But I hate to keep reusing that phrase, but that’s the big one. I think understanding the principles
is like, “Hey, Dynamo is very much focused on the primary key.” That’s how you access your data and you want to
be fetching sort of the exact data. Dynamo’s charging you for the data you’re accessing. But the great thing
about that is you can retrieve the exact items you want in almost all cases. So then it’s pretty efficient that
way and sort of it matches your costs.
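The write amplification from secondary indexes mentioned above can be estimated with a few lines of arithmetic. This Python sketch assumes a standard write is billed at one write unit per 1 KB of item size, rounded up, and that every index projects the full item and is touched by every write. Real index writes can cost more or less than that, so treat the helper names and the result as a rough upper-level estimate rather than a billing formula.

```python
import math


def write_units(item_size_bytes: int) -> int:
    """Write units for one standard write: one unit per 1 KB, rounded up."""
    return max(1, math.ceil(item_size_bytes / 1024))


def amplified_write_units(item_size_bytes: int, num_gsis: int) -> int:
    """Base-table write plus one equally sized write per index.

    Assumes full projection and that every write touches every index,
    which is a simplification for illustration.
    """
    return write_units(item_size_bytes) * (1 + num_gsis)


if __name__ == "__main__":
    item_size = 3 * 1024  # a hypothetical 3 KB item
    for gsis in (0, 2, 8):
        print(gsis, "indexes:", amplified_write_units(item_size, gsis), "write units per write")
```

Under these assumptions the cost of a write grows with both the item size and the number of indexes, which is why a large item indexed many ways is the expensive case.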
So, make sure you’re modeling in a way that works. I would say avoid the filter expressions generally. I’d say
that’s a more advanced topic, but a lot of people think they can use filter expressions to have a more efficient
data access and that’s actually not going to save you in the long run, because they’re going to charge you on
the data that you read, even if some of it is filtered out by your filter expression. So try and make it so the
actual data you read, the primary keys that you’re targeting are the actual items that you want to get there.
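The filter expression point can be checked with simple arithmetic. The Python sketch below assumes a query is billed on the total size of the items it reads, rounded up to 4 KB units, with eventually consistent reads costing half a unit each. The item counts and sizes are made up for illustration, and `query_read_units` is a hypothetical helper, not an SDK call.

```python
import math


def query_read_units(item_sizes_bytes, strongly_consistent=False):
    """Read units for one query, billed on the data read before any filter runs."""
    units = math.ceil(sum(item_sizes_bytes) / 4096)
    return units if strongly_consistent else units / 2


if __name__ == "__main__":
    # Hypothetical partition: 1,000 items of 1 KB each, of which 10 are wanted.
    read_then_filter = query_read_units([1024] * 1000)  # filter discards 990 items after the read
    targeted_keys = query_read_units([1024] * 10)       # key condition reads only the 10 items
    print("read then filter:", read_then_filter, "read units")
    print("targeted keys:   ", targeted_keys, "read units")
```

Both queries return the same 10 items, but the filtered one is billed for everything it read, which is the reason to put the selectivity into the primary key.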
That’ll help keep your costs in check. I would say, at a more fine-grained level, this is not going to save you a ton,
but it could save you 30% to 40% on your bill. A lot of people start with on-demand capacity with DynamoDB.
That’s really great. You can do pay-per-use pricing. You don’t have to pre-provision capacity, things like that.
But at some point, if your bill gets large enough and you have predictable traffic, it’s going to pay to switch
to provisioned capacity, where you’re saying how much capacity you need in advance rather than paying per use. And
that’s going to be a lot cheaper on a fully utilized basis. It’s hard to get full utilization. So that’s why I
say like, make sure you know your sort of traffic first and especially if it’s pretty predictable, and then
switch over to that. But don’t prematurely optimize there. I think it’s fine to start with pay-per-use,
especially if your bill is like you’re saying in the free tier level or less than a hundred bucks a month on
that DynamoDB bill, don’t spend a bunch of your engineer’s time working on that. Wait until your bill gets a
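One way to "do the math" on on-demand versus provisioned capacity is to compute the utilization at which the two cost the same. The Python sketch below is only that, a sketch: the prices passed in are placeholder values chosen for illustration, not current AWS pricing, and the function names are hypothetical. It also assumes writes of 1 KB or less, so one write equals one write unit.

```python
HOURS_PER_MONTH = 730


def on_demand_monthly(writes_per_second: float, price_per_million_writes: float) -> float:
    """Monthly cost when paying per write request."""
    writes = writes_per_second * 3600 * HOURS_PER_MONTH
    return writes / 1_000_000 * price_per_million_writes


def provisioned_monthly(provisioned_wcu: float, price_per_wcu_hour: float) -> float:
    """Monthly cost when paying for capacity whether or not it is used."""
    return provisioned_wcu * price_per_wcu_hour * HOURS_PER_MONTH


def break_even_utilization(price_per_million_writes: float, price_per_wcu_hour: float) -> float:
    """Fraction of provisioned capacity that must be used for provisioned to match on-demand."""
    # One fully used write capacity unit performs 3,600 writes in an hour.
    on_demand_per_full_hour = 3600 / 1_000_000 * price_per_million_writes
    return price_per_wcu_hour / on_demand_per_full_hour


if __name__ == "__main__":
    # Placeholder prices for illustration only; check the pricing page for real numbers.
    per_million, per_wcu_hour = 1.25, 0.00065
    print("on-demand, steady 100 writes/s:", on_demand_monthly(100, per_million))
    print("provisioned, 100 WCU:          ", provisioned_monthly(100, per_wcu_hour))
    print("break-even utilization:        ", break_even_utilization(per_million, per_wcu_hour))
```

If average utilization of the provisioned capacity stays above the break-even fraction, provisioned is cheaper; spiky or unpredictable traffic pushes utilization down and favors on-demand.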
There are a couple of other examples that I wanted to share with the audience that we encountered. I
actually find the concept of the TTL that you have in DynamoDB to be a very useful mechanism for managing costs,
because a lot of times, especially for DynamoDB use cases, you find that what you really want is the data for
the last day, maybe a week, but stuff that’s older than that, you’re never going to really access. So you
might as well delete it out of your table and not just keep all that storage around. That is one. And then of
course, if you do need to keep that data around, you could decide to dump it in S3, which is what we typically
did. But the other alternative now, with the new infrequent access tier that you have for storage, seems like a
really good option. Are you seeing adoption of that being pretty common?
Yeah, sure. So I’ve definitely seen a few people that are interested in that. And just for background for folks that
are listening, at re:Invent this year, they came out with a new storage tier for Dynamo, which is pretty
interesting and unique. In Dynamo, you’re basically charged on two axes. You’re charged based on the read
and write units, where you’re actually doing the transactions against your database, and you’re also charged for the
storage of your items. So however big your items are, they’ll just figure out how many gigs that is and
charge you a quarter per month or something like that. And like Rahul, you’re saying here, a lot of people
end up with years and years of data, of which only the most recent stuff is really looked at, but maybe they need to
keep it around just in case the customer asks for it or something like that.
But you see people where the storage costs actually exceed the transaction costs or maybe at least become a
bigger chunk of that. So they have this new storage tier, this infrequent access storage tier, where they reduce
the storage costs on that particular table. And then they proportionally sort of increase the transaction cost.
So for some people, if you have a lot of historical data that you need to keep around just in case, this could
be a lot cheaper for you, where it reduces that storage cost and increases your transaction costs a little
bit, but not enough to offset that storage benefit. So, it’s fairly niche. I wouldn’t say it’s
super niche, but I’d say you probably have to have a pretty big table. You probably have to have a few years of
data to be doing this, but this can really help in those situations.
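The trade-off between the two storage tiers comes down to whether the storage saving outweighs the higher transaction cost. The Python sketch below compares the two under assumed numbers: the per-gigabyte prices and the 1.25x throughput multiplier are placeholders for illustration rather than quoted AWS prices, and the function names are hypothetical.

```python
def monthly_cost(storage_gb, throughput_cost, storage_price_per_gb, throughput_multiplier=1.0):
    """Storage charge plus read/write charge for one month."""
    return storage_gb * storage_price_per_gb + throughput_cost * throughput_multiplier


def infrequent_access_is_cheaper(storage_gb, throughput_cost,
                                 standard_price=0.25, ia_price=0.10, ia_multiplier=1.25):
    """Compare the two tiers; all default prices are illustrative assumptions."""
    standard = monthly_cost(storage_gb, throughput_cost, standard_price)
    infrequent = monthly_cost(storage_gb, throughput_cost, ia_price, ia_multiplier)
    return infrequent < standard


if __name__ == "__main__":
    # Large, mostly idle table: storage dominates the bill.
    print(infrequent_access_is_cheaper(storage_gb=5000, throughput_cost=200))
    # Small, busy table: transactions dominate the bill.
    print(infrequent_access_is_cheaper(storage_gb=100, throughput_cost=1000))
```

With these placeholder numbers the large, rarely read table comes out ahead on the infrequent access tier and the small, busy table does not, which matches the "fairly niche" framing in the conversation.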
And then, like you’re saying, TTL is a great thing. If you actually don’t need to show that data to anyone ever,
like it’s not relevant at all, after a week or a month or whatever, you can throw a TTL on that item and just
make it 30 days after it was created. And Dynamo will just periodically go through there and expire out those
items for you. So you’re no longer paying for them and it just moves the burden from you having to scan over
your table and delete old stuff. It moves it onto Dynamo. They’ll look through it. They’ll maintain that for
you. You just pay the write cost to delete it. And that’s it.
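A TTL of "30 days after it was created" is stored on the item as a number holding a Unix epoch timestamp in seconds. The Python sketch below computes that value. The attribute name `expires_at` and the helper `with_ttl` are hypothetical; the actual attribute name is whatever the table's TTL setting is configured to use.

```python
from datetime import datetime, timedelta, timezone


def with_ttl(item: dict, created_at: datetime, days: int = 30,
             ttl_attribute: str = "expires_at") -> dict:
    """Return a copy of the item with an expiry timestamp in epoch seconds.

    The attribute name is an assumption; it must match the attribute
    configured as the TTL attribute on the table.
    """
    expires = created_at + timedelta(days=days)
    return {**item, ttl_attribute: int(expires.timestamp())}


if __name__ == "__main__":
    created = datetime(2022, 1, 1, tzinfo=timezone.utc)
    print(with_ttl({"PK": "SESSION#abc"}, created))
```

Using a timezone-aware datetime avoids the expiry shifting with the local timezone of whichever machine writes the item.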
Yep. With that, I’d like to thank you, Alex, for coming on the show. I loved all the insights that you have. I love
working with you on a lot of these very, very interesting use cases and problems, and learned a lot from you
through that process. So thank you once again, and thanks for sharing all of your knowledge around this and for
writing the book in the first place. So once again, thanks.
Absolutely Rahul. Thanks for having me on. It’s always great to see you and I love hearing about your
experience. You’ve seen a lot of cool stuff and then just, how to balance that against these, the stuff you’ve
seen in relational databases with Dynamo, I think this is really great. So thanks for having me on, it’s been
Great. Thanks everyone. That’s a wrap.
We hope you enjoyed this episode of AWS Insiders. If so, please take a moment to rate and review the show. For
more information on how to implement 100% safe, AWS-recommended account fixes that can save you 10
to 20% off your AWS bill, visit cloudfix.com. Join us again next time for more secrets and strategies from top
Amazon insiders and experts. Thank you for listening.