AWS Made Easy

Ask Us Anything: Episode 9

July 5, 2022
1 h 03 min

In this episode, Rahul Subramaniam and Stephen Barr talk with Nathan Peck, of AWS. Nathan is a Senior Developer Advocate, focusing on CDK and containerization technologies.


Summary

The outline of the discussion, along with some of the responses, is given below. Note that: RS = Rahul Subramaniam, SB = Stephen Barr, NP = Nathan Peck.

What is CDK in a nutshell?
Why build CDK on top of CloudFormation?
Why not directly target the APIs themselves (e.g. like boto3)?
1. NP: Declarative vs Imperative
2. RS: Why not base the imperative stuff on CloudFormation?
3. NP: CDK is already a huge project, without needing to re-write CloudFormation as well as CDK
Is it good practice to modify the CloudFormation generated by CDK?
How can CDK help with container-based deployments?
1. The CDK can, for example:
  1. Build a container
  2. Manage the ECR repository
  3. Ship the container to AWS Fargate
2. RS: A possible downside of this approach is that it is one more tool to manage.
Are there best practices for organizing your code to handle one more tool, e.g. CDK?
1. NP: Use Docker, and have that be the layer of abstraction. Then, Docker is responsible for packaging the app, and CDK is responsible for sending it to the cloud.
2. RS: This makes sense, but brings along another concern, namely managing the versioning of CDK.
CDK iterates quickly. Will a CDK-based deployment work 6 months from now?
1. NP: Containerize it! This can pin the version of CDK and all its dependencies at a particular point in time.
2. NP: Package rot is an issue in the Node ecosystem, but note that CDK should never be production facing. The artifact generated by CDK is a CloudFormation template, and generating it should happen on a CI/CD system.
What are the best practices for using CDK across multiple accounts with shared resources?
Looking at CDK as a resource lifecycle management tool, how does it compare to Kubernetes’ approach?
Is there anything we can do to make CDK / CloudFormation deploy faster?
1. NP: Use more, smaller stacks
2. RS: How does this work with the “monorepo” philosophy?
3. NP: There is some custom tooling necessary to get this to work, that can basically detect what parts of a repo have changed and then run the CI on just these subdirectories within the repo.
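The change detection mentioned in the last answer could look roughly like the sketch below. It is only an illustration, not the tooling discussed in the episode: the directory names, the `stacks_to_run` helper, and the changed-file list (imagined here as the output of `git diff --name-only`) are all hypothetical.

```python
from pathlib import PurePosixPath


def stacks_to_run(changed_files, stack_dirs):
    """Return the stack directories that contain at least one changed file."""
    affected = set()
    for changed in changed_files:
        path = PurePosixPath(changed)
        for stack_dir in stack_dirs:
            # A file belongs to a stack if the stack directory is one of its ancestors.
            if PurePosixPath(stack_dir) in path.parents:
                affected.add(stack_dir)
    return sorted(affected)


# Hypothetical monorepo layout and change list.
changed = ["services/api/lib/stack.py", "services/api/Dockerfile", "README.md"]
stacks = ["services/api", "services/worker", "shared/network"]
print(stacks_to_run(changed, stacks))
```

A CI job could then run the deployment only for the directories this returns, which is one way smaller stacks and a monorepo can coexist.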

Transcript

Stephen

All right. Hello, everyone, and welcome to “AWS Made Easy” episode number nine. And today we are very fortunate to have Nathan Peck with us.
Nathan

Hello.
Stephen

And Nathan is a Senior Developer Advocate with AWS, and he’s focused on CDK and containerization technologies. So, how’s everyone today?
Rahul

Welcome to the show, Nathan.
Stephen

We also have Rahul with us. And thank you, Karen and Marissa, our ASL interpreters for making this live stream more accessible.
Rahul

How are you doing? [crosstalk 00:00:51].
Nathan

I’m doing great today. It’s a good start to the week after a long weekend and ready to get back into the tech.
Rahul

Yeah. I hope everyone had a great 4th of July [inaudible 00:01:07], and Nathan you’re in New York?
Nathan

Yes.
Rahul

How were the celebrations there?
Nathan

Very noisy.
Rahul

And we can contrast that with Seattle where Stephen is at the moment.
Stephen

So, we had a really nice show over Lake Union and then had a few unofficial extensions to the show that the kids appreciated. It was nice. Couldn’t ask for better weather. After the wettest spring on record, it was nice to have a perfect summer evening for the 4th.
Rahul

Yeah. And the weather [inaudible 00:01:45]
Nathan

And your spring means less risk of starting a fire accident.
Rahul

Nice. Here…I just returned back from Seattle. I’m in Dubai at the moment this week and just recovering from jet lag. So, those are the updates from here. So, awesome. We have a really, you know, packed set of conversations to have today are all around CDK. So, I don’t wanna take away too much time from that. Just quick set of instructions, Stephen for the audience, where they can post questions and yeah, how can they ask…
Stephen

Yes. We are… Again, please for the audience, please ask questions. We are on LinkedIn, Twitter, Twitch, but we get…ask a question on any one of those platforms and we’ll try to get to it, you know, in close to real-time as we can. And yeah. I’m just excited to have you. And yeah, let’s dive right into it. So, Nathan, first, just give us some of your background and tell us about yourself, what’s been your trajectory with cloud at AWS, and how did you wind up as an advocate for CDK and containerization?
Nathan

Yeah. Sure thing. So, I basically came from the startup world. I straight out of college ended up joining a startup. And I’ve always been, from the very beginning, very focused on building things efficiently, quickly because I worked for small companies that, you know, are trying to hit product-market fit. Most of the time, the small startup I was working at did not have product-market fit. You know, they didn’t have enough customers. And so their main focus was how can we build features faster and deliver results faster to try to figure out what our customers even want in the first place so that way they’ll come and join.

So, from the very beginning, I’ve always been very focused on what is the technology I can utilize which helps me to build more efficiently, deliver to the cloud, and deliver production-ready applications quickly and efficiently. So, that’s led me down the path, obviously, to AWS, first of all. Less resources and less infrastructure to manage overall, eventually to infrastructure as code and containers. All three technologies, cloud, infrastructure as code, and containers, being things that helped me to build better and faster, and more efficiently with less wasted time.

And so now that eventually led to me as early adopter of ECS and writing some blog posts about it and being sort of a start customer. I transitioned from that role into working for the container services team at AWS to help them make the product better and evangelize the product to other folks who could also be trying to build out their startup, trying to build something big and exciting, but needed help trying to figure out how they can build it faster and more efficiently. So, kind of taking my learnings from five-plus or seven-plus years of startup world to other folks who wanna build their startups.
Stephen

Awesome. Well, thank you for that, and thank you for the background. It’s really a kind of neat trajectory of coming from the startup world and then seeing how to actually make the infrastructure to enable these other companies, whether they’re startups or Fortune 500s or anything in between, they can all use this great AWS infrastructure that we have access to. So, going to the focus of our show, CDK. So, what’s the elevator pitch for CDK?
Nathan

All right. So, my elevator pitch for CDK, well, first, I should probably mention what CDK even stands for, for the folks who aren’t super familiar. It stands for AWS Cloud Development Kit. And basically, it’s a hybrid between an SDK. You know, if you’re a programmer, you’re familiar with a software development kit, the idea that you can pull this library in and it provides you with an API on top of, you know, for accomplishing something. And it is essentially an SDK for the cloud. And it allows me to provision, create, configure cloud resources using a programming language that I’m familiar with. And that programming language, in most cases, will be the exact same programming language that I use for building my application.

And so, the elevator pitch for it is this piece of software enables my engineers to actually build and deploy cloud infrastructure without needing to have one specialist who only knows how to do cloud and one specialist who only knows how to do application development, and very little, you know, cross-collaboration between those two other than let me pitch this application over the wall to the cloud guy or the cloud girl and then they say, “Oh, well, your application isn’t working right.” And you know, back and forth. Instead, we can have one combined team which is working at both, and that really enables some very fascinating development practices.
Rahul

So, that brings up an interesting question then. AWS has always had the SDKs to deal with and interact with the cloud. Like in Python, you’ve got the boto3 library that’s…I mean, it’s so extensively used by anyone who uses Python, for example. Why build something on top of something like CloudFormation, which feels now completely separate and disconnected instead of building like another layer of abstraction over boto3? Or the other libraries in languages like, you know, the Ruby one or the…Node.js one, like we already have a whole suite of those. Wouldn’t it have been easier to build a layer of abstraction on top of that rather than go down the path of CloudFormation? What brought about that decision to go one way and not the other?
Nathan

I think this is where it comes down to the difference between declarative versus imperative. So, even though the Cloud Development Kit is based on an imperative language, it’s actually a declarative framework. And so, I guess first I should explain what the difference is between declarative and imperative. So, the idea behind imperative is you write commands. So, you say, “I would like you to create this resource, I would like you to update this resource or delete this resource.” And so, when it comes to creating infrastructure, that turns into a boilerplate script, which anytime you need to create or update something, you have to, first, check to see whether that resource exists. If it doesn’t exist, then I create it, right? If it already exists, I check to see whether it matches what I want to have existing. And if not, then I update it. And then I check to see if it exists and if it was not supposed to exist, then I delete it, right? So, you have this very common boilerplate pattern you have to iterate over for every single resource. So, what if there was a better way to describe this? And so, the way I think about it is just a shopping list or an inventory list. So, I create my shopping list of all the resources I want in the cloud, and I hand it off in…we call this declarative style. I hand it off to a declarative platform and say, “Here’s my declaration of what I want on my shopping list.” And you go out and make that a reality.

I don’t care whether you’re creating, I don’t care whether you’re updating or deleting resources, just make the reality match my inventory list. And so, that’s where CloudFormation adds a lot of value there because every resource has complexity as to whether or not the resource is ready, what pieces go into figuring out whether that resource has been created fully or created successfully, different status codes that may have been failures. So, CloudFormation hides as much of that away from you as possible. Whilst you just give an inventory list, so you say, “I would like this RDS database with this version of MySQL. I would like this S3 bucket.”

And so, CloudFormation says, “I’ll take your inventory list, I’ll make all these things a reality, and then I’ll give you back a final status, which is whether or not it was successful.” So, that saves you a lot of boilerplate writing code to manage like, well, what happens if there was a failure at this MySQL? What happens if there was a failure at this S3? And it allows you to distill everything down to pass off the inventory list, get a result back.
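The create/update/delete boilerplate Nathan describes, and the "inventory list" idea that replaces it, can be sketched in a few lines. This is a toy model only and says nothing about how CloudFormation is actually implemented; the `plan_changes` function and the resource dictionaries are made up for illustration.

```python
def plan_changes(desired, actual):
    """Compare a desired inventory with the actual state and list the actions needed."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))   # missing, so create it
        elif actual[name] != spec:
            actions.append(("update", name))   # exists but does not match
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))   # exists but is not on the list
    return actions


# Hypothetical inventory list and current state.
desired = {
    "database": {"type": "rds", "engine": "mysql", "version": "8.0"},
    "assets": {"type": "s3-bucket"},
}
actual = {
    "database": {"type": "rds", "engine": "mysql", "version": "5.7"},
    "old-queue": {"type": "sqs-queue"},
}
print(plan_changes(desired, actual))
```

The caller only writes the `desired` dictionary; the check-then-act logic lives in one place instead of being repeated for every resource.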
Rahul

Yeah. But I’m not sure that addressed the problem of…Like, I assume the CloudFormation itself was written using a lot of the underlying…
Nathan

Oh, yeah, yeah.
Rahul

Right?
Nathan

Yeah. For sure. [crosstalk 00:10:51]
Rahul

So, is this a question of exposing because part of it feels like a little bit of a hotchpotch where there are fundamental differences in the declarative versus the imperative, you know, programming paradigm. But then when you’re literally offering the imperative paradigm on a platter with something like CDK, you’re saying, “Hey, this is the way to go.” Then why not base the imperative stuff on other underlying imperative SDKs rather than CloudFormation? So, that was something that always confused me about it.
Nathan

Yeah. So, I think, number one, it’s a question of familiarity. Like a lot of people are already familiar with CloudFormation, right? And there is something good about having a declarative layer in between which you can fall back on if you do wanna export out of the system. If you don’t wanna use CDK anymore, you can always fall back down a level to CloudFormation. I also think, just in terms of building something like CDK, CDK itself is already a tremendously complex project involving, you know, many hundred thousand lines of code. I don’t think anybody wanted to sit in there and rewrite CloudFormation as well as CDK, right?
Rahul

Got it. Yeah.
Nathan

So, the way I see it is the layers kind of build on top of each other just as we have operating system layers that build on top of a lower-level kernel and then higher-level tooling. It’s the same for infrastructure as code in this case. CDK is built on top of underlying tooling and that allows CDK to improve as underlying tooling improves. So, for example, now CloudFormation has drift detection, which, by the way, it did not have when CDK initially launched. But because that drift detection was added in, there’s now the ability to seamlessly within a CDK application, get a notification out of CloudFormation when the real state of your resource has drifted from what you wanted in your CDK stack. So, I think building on top of existing tooling, once again, it’s more efficient, it’s faster, it provides that confidence that even if I’m not wholeheartedly sold on CDK, I do know that there is an underlying CloudFormation that I can fall back on if I decide, “Eh, I’m kind of done with CDK for now, let me start using the CloudFormation directly.”
Stephen

So, I put as a background slide for everyone, just the workflow of how CDK interacts with CloudFormation. The idea is that you write code using the CDK library, and then there’s this synthesis step which will then generate your CloudFormation templates, which then get sent to CloudFormation to do the deployment and then you get handed back resources, which may already exist, or may have been newly created, or may have been changed to reflect a new desired state. So, that’s an accurate summary, and you’re saying the idea of rather than having CDK kind of skip this step and deploy to the cloud directly is that, well, so much work goes into CloudFormation that you can almost use that as your assembly language.
Nathan

Yeah. And not only just using it as an assembly language, but also, I think it allows you to have a snapshot which is sort of set in stone because as you mentioned, you’re writing code in an imperative language at the CDK level, which is totally fine. It provides a lot of shortcuts, and we can talk about the benefits of that later. But the benefit of declarative language is that it is very predictable. And so with CDK, one of the things I can do is I can synthesize my stack into the declarative artifact. And then that declarative artifact, the inventory list, is now set in stone, it’s baked in. And I can then redeploy that as many times as I want without ever rerunning the imperative code again.

So, if you think about it as a pipeline, the imperative code is used at one stage, but then the output of that is the declarative thing, which is set in stone, it’s locked in. And then that pattern then gets deployed as many times as necessary across as many AWS accounts that you might have. So, there is a benefit to having the two-stage approach there where the declarative artifact…you know it won’t change whereas imperative, each time you rerun it, maybe there’s logic in there that uses Math.random, right? And it randomly chooses whether or not to create a resource. Like imperative code could be doing anything at all in there, right?
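The two-stage idea, run the imperative code once and then reuse its output everywhere, can be modeled as below. This is a deliberately simplified sketch: `synth` and `deploy` are stand-ins invented for the example, not CDK or CloudFormation calls, and the account names are placeholders.

```python
import hashlib
import json
import random


def synth(create_cache):
    """Imperative stage: arbitrary logic runs once and yields a fixed template string."""
    resources = {"Bucket": {"Type": "AWS::S3::Bucket"}}
    if create_cache:
        resources["Cache"] = {"Type": "AWS::ElastiCache::CacheCluster"}
    return json.dumps({"Resources": resources}, sort_keys=True)


def deploy(template, account):
    """Stand-in for a deployment: record which artifact went to which account."""
    digest = hashlib.sha256(template.encode("utf-8")).hexdigest()
    return (account, digest)


# The imperative code may branch unpredictably (here, on a coin flip)...
template = synth(create_cache=random.random() < 0.5)

# ...but once synthesized, the very same artifact is what every account receives.
results = [deploy(template, account) for account in ["dev", "staging", "prod"]]
same_artifact = len({digest for _, digest in results}) == 1
print(same_artifact)
```

Had `synth` been rerun for each account, the coin flip could have produced different templates per account, which is the unpredictability Nathan is pointing at.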
Rahul

True. So, let me ask you this. So, without a doubt, I think from our use of CDK and our experience with CDK, if you compared it with something like CloudFormation, it’s night and day, right? I mean, there are so many decisions that have been made so much simpler because you have CDK now to do all these standard patterns, there are certain abstractions that are available, there are all of these libraries that you can literally just import and get started with right away. So, tons and tons of value out of having something like CDK being available to you without a doubt.

The flip side of something like CDK is that it kind of automagically generates this CloudFormation that you might not be familiar with, or you don’t know how it got generated, or sometimes there’s just so much CloudFormation in there that you feel like you need to then master that generated code. And I have never known generated code to be easy to understand, easy to read, easy to grapple, even if the claim is that, oh, you can just reuse all the CloudFormation stuff as is without ever having to, you know, rerun the transpile of sorts. So, how many customers have you come across who end up going back to CloudFormation generated from CDK and saying, “Okay. Now, that’s my standard. I’m checking that into my source control, and that’s what I’m gonna run?”
Nathan

I don’t think I’ve seen very many because most people tend to get sold on CDK to the point where they say, “Okay. I don’t really wanna go backwards now.” So, it’s kind of, you know, once you start using it, you’re like, “Okay. This is amazing now.” I like having the ability to fall back on CloudFormation if necessary. And in fact, from CDK, you can hook into the underlying CloudFormation and modify things directly even if it’s not supported inside of CDK. So, generally, we end up seeing is folks who do have that deeper underlying CloudFormation knowledge and do wanna do the fallback. They end up using the CDK methods for manipulating underlying CloudFormation from their CDK stack rather than necessarily doing a full export locking in their CloudFormation forever, and then just manipulating the CloudFormation directly.
Rahul

Got it. But do they express concern about not knowing what CloudFormation gets generated or not being able to understand all the underlying? Like simple things like IAM roles, right? I mean, I would…
Nathan

Yeah. So, I think that…
Rahul

That for me becomes one of the big areas of concern, like…
Nathan

Yeah. So, I get what you’re going for there. And let me explain how I think about this. There are a lot of tools out there and even the AWS console, which automagically create IAM roles for you, which automagically create resources. You know, if I’m clicking around an AWS console, in some cases, I click a checkbox or I click next on a dialogue and behind the scenes, three different resources got created including magical IAM role, right? And if you’re using something like an AWS SAM, if you’re using AWS Copilot, if you’re using any number of different tooling that it’s out there, is creating multiple resources each time you do an action. And a lot of people, they don’t understand the underlying behind that either.

So, I don’t necessarily think that CDK is worse than other tooling in that respect. In fact, I would say that the console, in many cases, is very confusing because I’ve been in the boat previously as a customer clicking around the console when I created my application stack manually. And then I’ve later wanted to create CloudFormation, and I didn’t understand the whole list of resources that was gonna be necessary. I was like, “Wow, I didn’t realize what the console was doing behind the scenes for me.” I didn’t realize that it had actually created this role and that role and tied these two things together. And, oh, yeah, it also created a security group for my load balancer, and I’m like, “Wait a minute, this was more complicated than I thought it was.” So, I think that fundamentally what it is, is like people think about an API. And the way I think about it is if I’m creating my architecture and I have a whiteboard in front of me and I create service A and then I draw that line over to service B, you know, it looks very simple on the whiteboard. And that’s how people wanna think about it. They wanna think about simple or top-level concepts, but then underneath those top-level concepts is a list of different resources and settings that are necessary in order to make that top-level concept between A and B work.

And no matter what tooling you use to accomplish that, whether CDK, the console, or some other command-line tooling, these are all abstractions and they hide away some of the complexity. But the advantage of CDK is that you can dig deeper into that underlying complexity more easily. Like with the console, it creates that IAM role and I have to go in and manually hunt for it. With CDK, it creates…I can type cdk synth at any time, output the template, and read through the list of resources to see that inventory list of everything that CDK wants to create. And so, that actually helps me understand a little bit better that when I make this change, I see this change in the infrastructure as code. When I add this one line of CDK, it adds these 10 lines of CloudFormation.
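To make the "one line in, many resources out" point concrete, here is a toy expansion written in plain Python. It is not what CDK actually generates: the `load_balanced_service` helper and the particular set of resources are assumptions chosen for illustration, although the resource type names are real CloudFormation types.

```python
import json


def load_balanced_service(name):
    """Expand one top-level concept into the lower-level resources it implies (toy list)."""
    return {
        f"{name}TaskRole": {"Type": "AWS::IAM::Role"},
        f"{name}SecurityGroup": {"Type": "AWS::EC2::SecurityGroup"},
        f"{name}LoadBalancer": {"Type": "AWS::ElasticLoadBalancingV2::LoadBalancer"},
        f"{name}TaskDefinition": {"Type": "AWS::ECS::TaskDefinition"},
        f"{name}Service": {"Type": "AWS::ECS::Service"},
    }


# One call at the "whiteboard" level...
template = {"Resources": load_balanced_service("Api")}

# ...and the full inventory list is there to read, much like reading synthesized output.
print(json.dumps(template, indent=2))
print(len(template["Resources"]), "resources from one call")
```

Being able to print the expanded list is the property Nathan highlights: the abstraction hides the detail, but the detail stays inspectable.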
Stephen

So, effectively, there is a fundamental level of complexity just based on the nature of what we’re doing, and it can get hidden by different tools to a point. But then eventually, at some point, you will have to know what…you have to understand the underlying system to some extent, at least to be able to play with it. And whether that’s, I don’t know, using some framework that compiles to JavaScript, eventually, you do need to know some JavaScript. And if you’re using some framework that compiles to CloudFormation, you’ll have to, you know, for the time being, you’ll eventually have to know some CloudFormation to be able to effectively use it or customize it. And I can imagine that at some point in the future, there can be users of CDK who would never touch CloudFormation, which would be a neat place to be. Let’s take a quick 30-second break.
Nathan

Yes. That’s essentially possible. Yeah.
Stephen

Let’s do a quick 30-second break and then we’ll come back and let’s talk a little bit about the CDK patterns for containerization. We will be right back after a word from our sponsor.
Announcer

Is your public cloud riddled with cost and complexity? CloudFix is here to clean that up. CloudFix focuses on cost savings for your AWS environment by finding and implementing fixes that continuously save 25% off your AWS bill. Too good to be true? We’ve helped organizations save millions across 45,000 instances. Visit cloudfix.com to schedule a free assessment and see your yearly savings.
Stephen

And we’re back. So, you had talked about part of your expertise being CDK. We wanted to bring in your other background in containerization. Can you tell us about some of the patterns of how CDK can help with container-based deployments or what patterns have evolved or have been designed to deal with this?
Nathan

Yeah. Sure. So, I think that containers in particular are a place where you will see CDK shine as a development tool. So one of the challenges with containers is that it’s a multi-stage process where there’s different pieces of tooling that are required. In particular, you have to build your container first and then push it up to the cloud, and then you can create your definition for how you want to actually run that container in the cloud. So, where CDK helps is it turns all of this into, and bridge it into something that is consistent. It bridges this gap between the local development experience and the remote cloud infrastructure.

I can have CDK both build my application for me automatically, manage creation of a registry for uploading my container to the cloud, automatically push it and then run that image. And so, from my perspective, I just have a directory full of code, and I have a CDK statement that says, “Build this directory of code as a container and ship it up to AWS Fargate.” And the rest of that whole pipeline from taking the code into a container to shipping it up to the cloud to running on AWS Fargate is all handled by CDK code and then infrastructure as code under the hood with CloudFormation. So, I think it makes the pipeline a lot smoother and easier to integrate as compared to having a separate tool for the infrastructure as code and a separate tool for maybe it’s a batch script or something along those lines for writing out a list of commands to run one after another in order to actually build my application and push it up.
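For contrast, the hand-run sequence that such a script typically contains looks something like the sketch below. The function only assembles the command strings and does not execute anything; the account ID, region, repository name, and tag are placeholders, and the `manual_container_pipeline` helper is hypothetical. The commands themselves are standard Docker and AWS CLI invocations.

```python
def manual_container_pipeline(account_id, region, repo, tag, source_dir):
    """Return the shell commands for building an image and pushing it to ECR by hand."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    image = f"{registry}/{repo}:{tag}"
    return [
        # 1. Build the container image from a directory of code.
        f"docker build -t {image} {source_dir}",
        # 2. Authenticate the local Docker client against the ECR registry.
        f"aws ecr get-login-password --region {region}"
        f" | docker login --username AWS --password-stdin {registry}",
        # 3. Push the image so a service such as Fargate can pull it.
        f"docker push {image}",
    ]


# Placeholder values for illustration only.
commands = manual_container_pipeline("123456789012", "us-east-1", "my-app", "v1", "./app")
for command in commands:
    print(command)
```

Creating the repository and defining the service that runs the image would be further manual steps; the point made in the episode is that CDK can fold all of these into one description of the application.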
Stephen

I was literally thinking about, you know, I’ve had that exact scenario where I thought, “Okay. I wanna do something with containers.” I think, “Okay. I’ve got a batch script saved with my AWS ECR commands because I haven’t done the work to fully bring it into one integrated system. So, okay. Yeah, exactly right. I’m copying and pasting each command from a text file that I’ve, you know, cargo-culted around with me.” So, there’s a much better way of having it completely consistent. I like that a lot.
Nathan

And I think the other…
Rahul

But isn’t it… Sorry, I just got to pile on that, but kind of try and bring in a slightly more opposing view. Today when you look at the code, right? I mean, code has the same problem where you need to build a file that says what you build, and then you’ve got a build, you know, whether it’s Ant, or whether it’s, you know, your npm build or whatever, you have all these targets or even just a simple Makefile. You might have a Makefile that has five different targets, one for build, one for deploy, one for, you know, whatever other operations you might…build package deploy. Those are all just clean, those are all your standard targets that you have pretty much across languages these days. And then now you’ve got other kinds of packaging like you create a Dockerfile that describes how you build up that container and provide all the definitions over there.

So, does this add more complexity? Because now you have one other place where you have configuration that talks about how you do something and it’s a different target now. It’s not really part of your regular code and your build. It’s one more thing that’s different in your code, one other place to go figure out how to deploy something versus… Like, by the way, even when I use CDK, one of the practices that I use irrespective of what CDK commands I might have under the hood, my Makefile, invariably, the deploy, my Makefile has a CDK command in there. Okay. So, it makes it easy for me to go to one place where I have all of the commands, whether it’s build, whether it’s deploy, whether whatever. People are not running around in different tools trying to figure out how to get the basics done. So, it’s always a build, make, deploy, those kind of commands. Are there any best practices around how you organize your code for one more tool that’s getting added now to your setup?
Nathan

Yeah. So, I think…I’m actually working on a blog post on it and I wish I had got this blog post out last week. There were some blockers though. But basically, I addressed that question of, like, how do you use containers as part of your development process? And I think the point you’re making about all these different tools is a valid point. However, containers when used properly make it so that there is one standard command for building your application. No matter what language you’re using, you know, if I’m using Python, maybe I have a pip install, if I’m using Node, I have npm install. There’s a variety of different commands, but there is one format now, which is Dockerfile where I define those language-specific build steps and language-specific commands.

And then my application is always built using the exact same command, which is docker build, right? And so, I think the places where the layers of separation of concern are is Dockerfile down contains all the things specific to building my particular language framework. Maybe if it’s a C application, it’s compiling with GCC, or a Go app is compiling. It could be Ruby. No matter what, whatever is language-specific is in the Dockerfile. Then from that area up, Docker is responsible for the packaging collection of those resources and turning into a tarball, basically user file. And then CDK is responsible for uploading that to the cloud and launching it as a resource on your AWS account.

So, there’s three very specific stages that it goes through. And the interesting thing about it is now I don’t actually need any other tooling. I don’t actually need the Makefile actually, because, you know, potentially, I have a Makefile, but that’s happening inside the Docker image. I don’t actually care about make anymore. It doesn’t even exist on my machine anymore. In fact, when I do the docker build, it pulls down a Docker image on the fly which contains make and my ideal version of GCC, you know, C++ compiler or whatever, does the build inside the container, right?

So, I don’t even actually have any development tools on my laptop anywhere. Everything is on the fly inside the container. And so, it makes for a very interesting process where everything is very repeatable. I can do that build very easily on any computer because I don’t have to rely on an engineer having a specific set of tools with the right version on their particular developer laptop. So, I think that’s what I would describe as the best practice and where that breakpoint comes between tooling that is containerized and tooling that is not containerized but running on your host for the purpose of uploading stuff to the cloud.
Rahul

That makes a lot of sense, but it also brings up a legitimate concern that I’ve heard from a lot of my teams and also externally, which is given that CDK is very much a client-side setup written in Node, the versions of CDK start, you know, coming into question and concern. Like if I’m gonna use CDK today, six months down the line, then here’s another, you know, piece of code and the dependencies and stuff that I need to kind of make sure I have working all the time. And with some of these Node-based, you know, applications, because all the dependencies keep changing so frequently, Node versions start becoming a concern. Six months down the line, is this stuff gonna work out of the box or not if I don’t keep maintaining it constantly and if I…? So, if I’ve not done a new deployment for six months, I don’t know whether my stuff is gonna keep working or not because of this new dependency. So, how do you guys think about it? How do you guys address it for AWS customers in general?
Nathan

So, I would address it the same way I address that for my own application, which is I containerize it.
Rahul

Okay.
Nathan

So, it is entirely possible to create a container out of any application that you want to, including CDK. And in fact, for CDK development, when you’re actually developing on CDK, there is a containerized process for that if you look through the contributor guide. So, the cool thing is you can actually package up a specific version of the Node runtime, a specific version of all the packages, and a specific version of CDK, and turn that into a repeatable, reproducible thing that’s always gonna run the exact same way, and then use that as part of your development process. So, now it is referencing via a Docker volume, it’s finding your infrastructure as code, it’s running it inside the container, and now it’s communicating as I thought.

So, I keep coming back to containers because that is tooling that I’ve used many times to solve this problem for my own application, and it can actually solve it for any application. And I think in general though, the main concern with things breaking over time in the Node ecosystem is not that I run this today and then six months from now it doesn’t run anymore, because it’ll generally still run, but it’s more that different underlying packages may turn out to have security vulnerabilities and stuff like that. So, there’s a little bit of what we call package rot over time.
Rahul

Yeah.
Nathan

Now, ordinarily, that’s very concerning for a production-facing application that receives public traffic, because if one of those packages has a vulnerability, that’s very problematic. Somebody could be exploiting it and breaking into my web-facing application and hacking it. But for a sort of backend tool, which in this case CDK is, it’s only running on my local host. And if I really do it properly, it’s only running on a CI/CD server, an actual, you know, deploy pipeline server like Jenkins, or you know, AWS CodeBuild or something like that. That’s sort of behind the scenes, that’s behind the firewall, so to speak. It’s not receiving anything from the public. The only person who’s using it is my own developer. And if I have a concern about my local developer sort of exploiting the system and hacking in, I’ve got way bigger problems.
Rahul

True.
Stephen

That’s a very…we call adversarial environment to plan for.
Nathan

Yeah. Exactly.
Stephen

So, to summarize, you’re saying, well, just like everything else, just freeze your setup at a point in time. Now, of course, a little pet peeve: if you’re gonna use Docker, don’t also say ubuntu:latest or whatever, because then when you rebuild that container… but, you know, pin it to a certain version, pin your Node to a certain version, and that way, since at the end of the day, what you’re using CDK for is to make a CloudFormation template and everything. That’s the only output that’s gonna come out of it that you really need. So, it doesn’t matter if a certain version, [crosstalk 00:35:23].
Nathan

I just pin everything and keep everything static, right? And then if there are security vulnerabilities in underlying packages, it’s okay. I just set a task every six months or year, whatever, just be like, okay, let’s update all the pins to the latest version, make sure that things still work. But I don’t do dynamic updates on my CDK version that frequently.
Stephen

Yeah. That makes sense.
Rahul

It makes for such a great use case for DevSpaces or Codespaces where once you have your containerized dev environments, you shouldn’t have to change that. And yeah, it makes for a great marketing pitch for both of those.
Nathan

Yeah.
Stephen

Well, let’s take another quick break and when we come back, we wanna talk about…We have actually got a question that we want to address on best practices. So, let’s come back in 30 seconds here.
Announcer

Public cloud costs going up and your AWS bill growing without the right cost controls? CloudFix saves you 10 to 20% on your AWS bill by focusing on AWS-recommended fixes that are 100% safe with zero downtime and zero degradation in performance. The best part, with your approval, CloudFix finds and implements AWS fixes to help you run more efficiently. Visit cloudfix.com for a free savings assessment.
Stephen

All right. We’re back, and we’ve got a question from the audience by Gaurav. So, I’ll put it up on the screen here. What are the best practices to use CDK across multiple accounts that have shared resources? And we certainly have a couple of accounts under our belts here, so yeah. What’s the way to do this? Rahul, can you give us an idea of the magnitude of the accounts we’re dealing with?
Rahul

Yeah. I mean, in our world, we are dealing with over 40,000 accounts across all of our portfolios. And so that, of course, is a big challenge. But just to take a very specific and concrete example, right? I mean, when you, invariably, you want to…because you have a large number of resources deployed across multiple different accounts, you know, one of the common practices that we had historically and we’re still trying to figure out how to get it working in the CDK world is we would design specific IAM roles and specific IAM, you know, resources, whether they be groups or policies or whatever, we would create all of these as common resources, we’d set it up once and then we’d start mandating that everyone use the same policies across all the Lambda functions you deploy or across whatever other resources that you are provisioning.

Now, we are still trying to figure out what the best way to do this with CDK is because once you’ve created a set of IAM roles and resources with CDK because this is more a…because CDK is more driven towards synthesis time decisions, it becomes really hard to leverage these common resource pools or these common resources across your CDK unless you start hard-coding a lot of these things. What generally is the best practice to manage this kind of a setup? Or is it an uncommon setup?
Nathan

Yeah. So, I think there’s a couple of different ways to think about it. I think the first way to solve the particular problem that you’re mentioning would be…the idea behind CDK is that you can create your own custom patterns and custom constructs such as, in this case, if you wanted to create an IAM role specific to a Lambda function, you would create my company Lambda, right? So, you’re not just using the base Lambda, you’re creating your own special, customized, you know, private-labeled, if you will, Lambda, that’s my company Lambda. And by doing that, you’re able to package up the definition of that particular IAM role and that particular Lambda and settings for it. Maybe there’s a specific timeout that you want to give the my company Lambda or something along those lines or specific environment variables.
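The "my company Lambda" idea can be sketched without any CDK dependency. The following is a minimal illustration in plain Python, not real CDK code: the names `LambdaSettings`, `COMPANY_DEFAULTS`, and `my_company_lambda`, along with the default values, are hypothetical and chosen only to show how a wrapper can bundle a standard role, timeout, and environment variables.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LambdaSettings:
    """Plain-data stand-in for the settings a custom construct would apply."""
    name: str
    timeout_seconds: int
    role_policy: str
    environment: Dict[str, str]


# Hypothetical company-wide defaults, maintained in one central package.
COMPANY_DEFAULTS = {
    "timeout_seconds": 30,
    "role_policy": "my-company-lambda-policy",
    "environment": {"LOG_LEVEL": "info"},
}


def my_company_lambda(
    name: str,
    *,
    timeout_seconds: Optional[int] = None,
    environment: Optional[Dict[str, str]] = None,
) -> LambdaSettings:
    """Build settings for a function, layering team overrides on the defaults."""
    merged_env = {**COMPANY_DEFAULTS["environment"], **(environment or {})}
    return LambdaSettings(
        name=name,
        timeout_seconds=(
            timeout_seconds
            if timeout_seconds is not None
            else COMPANY_DEFAULTS["timeout_seconds"]
        ),
        # The role policy is deliberately not overridable: it is the shared part.
        role_policy=COMPANY_DEFAULTS["role_policy"],
        environment=merged_env,
    )


orders = my_company_lambda("orders", environment={"TABLE": "orders"})
print(orders.role_policy, orders.timeout_seconds, orders.environment)
```

Because the defaults live in one place, changing `COMPANY_DEFAULTS` in the shared package changes what every consuming team gets the next time they pick up the new version.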

And you then distribute that construct as a pattern for other teams to utilize. And if you’re familiar with the Node.js ecosystem, the idea is you just turn that into an npm package. And now other people can pull in that npm package and they can use it. And you have central management provided they don’t pin their versions on a really old version, right? Or they have like some flexibility in their versions. You can now update centrally that my company Lambda pattern with a new IAM role, and that rolls out the next time that they actually make a change. So, it depends on whether you want to have true sharing of resources or whether you just wanna have sharing of configuration and central management of configuration. I couldn’t really tell from the question whether the original question asker was talking more about sharing of configuration or actual sharing of resources because I do think…
Rahul

I think this is more sharing resources. Sharing configuration is fairly straightforward because it’s synth time, and therefore configuration is not that complex. But I think sharing resources that are already deployed, that becomes the complex part, because at synth time, I mean, other than hard-coding, it feels hard to do that at synth time.
Nathan

Yeah. So, I would never share an IAM role, for example. Every AWS account should have its own IAM roles. But I do think there’s a place for sharing resources that are over the network. So, for example, you wanna create a PrivateLink between account A and account B, and you wanna say there’s a shared database and this account and this account both utilize it, or this service can talk to this other service privately. So, I think this is where it goes down to defining the networking rules. And one of the cool things about CDK is that it is multi-account aware. You can build in an understanding that this stack is in this account and this stack is in that account. And the nice thing about it is, because it is an imperative language, you can build in higher-level methods such as saying construct.connectTo(construct). And because I’ve defined everything with CDK and I have an imperative language there, I can introspect the stacks for each construct and I can say, “Oh, your construct is actually in a completely different account in a different region, therefore, I know I’m going to need a VPC link and I need a transit gateway.” And so, when the connectTo method is called, I can choose to create these resources on the fly as part of that method inside my imperative code. Now, if the two resources are in the same account, it becomes much easier. Now I can look at it and say, “Oh, you’re actually both living in the same VPC, all I need to do is create a security group rule.”

So, you have one method, one standard method, which is called A.connectTo(B), and the method can do different things based on the broader understanding of where those resources are living, whether they’re living in different accounts, different VPCs in the same account, or even the same VPC and just behind different security groups. So, that’s actually one of the powers of CDK: the ability to create these higher-level methods that do essentially magic to create different results based on the setting.
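The single "connect" method Nathan describes can be sketched as a function that inspects where two constructs live and decides what networking to create. This is a plain-Python illustration, not CDK's API: `Placement`, `connect_to`, and the resource labels are hypothetical, and the choice for "different VPCs in the same account" is an assumed placeholder since the discussion does not say what would be created in that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Placement:
    """Where a construct lives, as it could be introspected from its stack."""
    account: str
    region: str
    vpc: str


def connect_to(source: Placement, target: Placement) -> List[str]:
    """Return the networking resources needed to let source reach target."""
    if source.account != target.account or source.region != target.region:
        # Different account or region: the case described as needing
        # a VPC link and a transit gateway.
        return ["vpc-link", "transit-gateway"]
    if source.vpc != target.vpc:
        # Same account, different VPCs: placeholder label (assumption).
        return ["cross-vpc-connectivity", "security-group-rule"]
    # Same VPC: only a security group rule is required.
    return ["security-group-rule"]


api = Placement(account="111111111111", region="us-east-1", vpc="vpc-a")
db_same_vpc = Placement(account="111111111111", region="us-east-1", vpc="vpc-a")
db_other_account = Placement(account="222222222222", region="eu-west-1", vpc="vpc-z")

print(connect_to(api, db_same_vpc))
print(connect_to(api, db_other_account))
```

The caller always writes the same call; the branching on account, region, and VPC stays hidden inside the method.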
Rahul

Got it. Now that makes a lot of sense. Let me jump in a slightly related question. But when I think about tools like CDK or Terraform or Pulumi or CloudFormation, I think of those tools falling in that resource life cycle management set of tools where fundamentally, you’re doing three things. A, declaring, what resources you want. B, deploying those resources. C, either continuously or at some interval, figuring out the state, or you are maintaining state of some kind where you got state in the cloud of your resources and you got your declaration of, you know, your bill of materials, or bill of resources that you wanted, right? And then keeping those two things in sync. Especially when, since we were talking about containers, Kubernetes kind of also falls in that category and it takes a very different philosophy around life cycle management of containers.

They do a version of, or a very contorted version of a life cycle management thing for AWS resources where you could do CRDs for, you know, all kinds of different resources. You use the same control plane, but you can deploy an S3 bucket or you could go deploy whatever else, and it kind of operates the continuous loop. So, it’s just a completely different perspective of how to tackle this problem. And I think we are still at the initial stages of understanding what is, you know, the right way to do it. Things are still evolving across these different perspectives. Do you mind spending a couple of minutes shedding light on how you think about the differences between these different ways of tackling this resource lifecycle problem?
Nathan

Yeah. So, I think the main difference that we’re seeing here is the difference between push-based and pull-based. So, Kubernetes is designed for a pull-based control loop where the custom resources are constantly comparing the actual state of the resource as created on your AWS account to the intended state as described in your local etcd cluster. Now, what you can do is have a custom resource, and this gets into like the GitOps approach. With GitOps, there’s sort of the purist approach, which is I don’t do any pushes whatsoever. I actually have a controller running inside my Kubernetes cluster and it pulls in configuration from the Git repo periodically, and can actually notice when there are changes and then it kicks off an update and creates resources on my account.

I think CloudFormation is designed in the opposite direction; it’s designed to be push-based. So, CloudFormation will only make changes on your account currently when I actually issue a CloudFormation update or CloudFormation deployment and kick that off. So, if there is another change that happens outside of the understanding of CloudFormation, that won’t actually be reflected or rolled back, whereas Kubernetes can actually roll that back. So, I think that’s the main difference. And I think we do see a little bit of complexity and mismatch between these two models, in particular, when it comes to resources like ECS services, which have a desired count for how many containers you wanna roll out.
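The push versus pull distinction can be shown with a toy simulation. This is a deliberately simplified model in plain Python, not how Kubernetes or CloudFormation are implemented: `run_timeline` and the event names `"tick"` and `"deploy"` are hypothetical, and the model assumes an explicit deploy reapplies every declared property.

```python
from typing import Dict, List, Union

Event = Union[str, Dict[str, int]]


def run_timeline(mode: str, desired: Dict[str, int], events: List[Event]) -> Dict[str, int]:
    """Replay events against a resource and return its final actual state.

    A dict event is an out-of-band change made outside the tool.
    "tick" is a periodic controller wake-up; only pull mode acts on it.
    "deploy" is an explicit, operator-initiated deployment.
    """
    if mode not in ("push", "pull"):
        raise ValueError(f"unknown mode: {mode}")
    actual = dict(desired)
    for event in events:
        if isinstance(event, dict):
            actual.update(event)  # drift introduced behind the tool's back
        elif event == "deploy" or (event == "tick" and mode == "pull"):
            actual = dict(desired)  # state is forced back to the declaration
    return actual


desired = {"replicas": 3}
drift_then_tick = [{"replicas": 1}, "tick"]

print(run_timeline("pull", desired, drift_then_tick))  # drift is corrected
print(run_timeline("push", desired, drift_then_tick))  # drift persists
```

In pull mode the drift disappears at the next tick without anyone asking; in push mode nothing changes until someone issues a deploy.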

But that desired count is also itself mutated by ECS. As the service scales up and down and receives more traffic, then ECS auto-scaling kicks in and increases the desired count, which increases the number of tasks, and vice versa, it can scale in. So this desired count is no longer a static thing, it’s something that increases and decreases throughout the day. So, CloudFormation’s approach to that is…the way you handle that, you don’t hard-code a desired count in, say, your CloudFormation stack, because if you did, then every time you did a CloudFormation deployment, you would actually roll back your auto-scaling. Things could have scaled up to 20 tasks, but it’ll scale you all the way back in to two because CloudFormation doesn’t know, right? So, the CloudFormation approach is leave out the desired count, don’t specify a desired count at all. And it’ll just use the existing value that’s already there, right? So, CloudFormation…
Rahul

But isn’t that a Kubernetes construct anyway? Because in Kubernetes, you have three ways of controlling the number of containers. You have the min, you have the max, and you have the desired count. You can basically use those as your control levers to decide how many containers are getting launched.
Nathan

Correct. Correct.
Rahul

And if you want a specific number, you always use desired count or the expected number of containers. And then if you wanted to scale, you set the min and max to give it bounds so that you know you have some control over how many containers you want at a minimum or whether you want to have an upper bound on the number of containers it launches and doesn’t go out of control that way. But that’s more a common pattern of control for auto-scaling versus non-autoscaling, right? It’s not something that’s specific to CloudFormation or CDK.
Nathan

So, that was what I was getting to though, is that CloudFormation and CDK’s approach to handling this type of property, which is dynamic and changing outside of the control loop, is to say, “We don’t want to know about that at all.”
Rahul

Got it.
Nathan

Leave it out, right? So, I define everything else using CloudFormation or CDK, but I don’t define the desired count because the desired count is dynamic and it’s increasing and decreasing automatically. Kubernetes is different because the infrastructure as code is actually stored inside of the control plane. And there is a back and forth where the actual auto-scaler updates the desired count inside of your infrastructure as code definition, which then causes it to roll out. So, there’s a much tighter back and forth integration inside the Kubernetes control plane compared with CloudFormation, which does not currently have the ability to understand that there is this other thing like the ECS control plane and the Kubernetes control plane, which is mutating things outside of the CloudFormation deployment.
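The "leave it out" advice can be illustrated with a small model of a deployment that only touches the properties a template declares. This is a plain-Python sketch under that assumption, not CloudFormation's actual update logic; `apply_template` and the property names are hypothetical.

```python
from typing import Any, Dict


def apply_template(live: Dict[str, Any], template: Dict[str, Any]) -> Dict[str, Any]:
    """Apply declared properties to the live service; undeclared ones are kept."""
    updated = dict(live)
    updated.update(template)
    return updated


# Auto-scaling has grown the service to 20 tasks since the last deployment.
live_service = {"image": "app:v1", "desired_count": 20}

# Template that hard-codes the count: the deploy scales the service back in.
hard_coded = {"image": "app:v2", "desired_count": 2}

# Template that omits the count: the deploy leaves auto-scaling's value alone.
count_omitted = {"image": "app:v2"}

print(apply_template(live_service, hard_coded))
print(apply_template(live_service, count_omitted))
```

Both deployments roll out the new image, but only the second one preserves the 20 tasks that auto-scaling had reached.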

So, that is the fundamental difference there is that, in some ways, I like the Kubernetes model a bit better where everything is much more tightly integrated between the controllers and the infrastructure as code description. On the other hand, I do think that CloudFormation produces very reliable results because things only change when you do a deployment. With Kubernetes, there’s a pattern of having mutating admission controllers. Different mutating hooks that kick in and decide whether or not they wanna change your pod before your pod actually gets placed into the cluster, right?
Rahul

Yeah.
Nathan

So, what can actually happen is you pass off a description of what you would like to be created, and that thing actually gets changed behind the scenes, and maybe you update your controller and things will change inside of your cluster and change inside your infrastructure as code because it’s no longer a static thing, it’s something that is dynamic and updates. So, really it comes down to the trade-off there. Do you wanna have your infrastructure as code be something that can change after the fact, or do you want it to be a set-in-stone static thing which only changes when you specifically say do a deployment? And I think I…
Stephen

So, you’re saying… So, in other words, it’s much easier to reason about the state of a CloudFormation deployment than the state of a Kubernetes cluster if you’re looking at everything that goes into it, the Kubernetes cluster could change based on things that happen at run time much more flexibly, which is both good and bad, depending on what you want?
Nathan

Yes. Correct. Yeah. So, it’s a trade-off as to which you prefer in that model. And I do think that in the ideal world, there would be an option to have both, if you will, and be able to say there are some things that I don’t want to be changed at run time, there are some things that I don’t mind being changed at run time, right? And so, I think as the technology progresses, we’re gonna see more of a melding of the two models. Right now, it’s two very conflicting ways of thinking: infrastructure as code as a dynamic thing that changes at run time, and as a completely static thing that doesn’t change until I say to change it. And I think in reality, there are some resources that you would like to change and be dynamic, there are some resources you would like to be very static, like your database, for example, or something like that.
Rahul

Yeah.
Nathan

And so, there is a conflict there in the sort of mental model, and that we will gradually resolve over time as we figure out which resources go in which bucket and how we want to define and lock some things in.
Rahul

So, just talking about the push and the pull, those two different approaches, it brings me to the other common complaint that we experience with CDK, which is the slowness of, you know, redeploying resources or just doing a small update. Like if I had, you know, loads of stuff that was, you know, deployed as part of my original CDK and I literally just needed to change one Lambda, a big chunk of the wait time that happens in my CI/CD is invariably the fact that the system is just literally trying to detect drift, syncing all the data down, figuring out what has changed or not changed across all the resources, running through all that computation, figuring out what needs to be updated, and then finally, doing that little bit of an update, which, you know, we’ve got CDK deployments that, you know, in our continuous build system where the build and deployment, when I talk about deployment I literally mean uploading a Lambda function or something like that. That might take a grand total of a minute while the CDK operation could take anywhere from 20 to 30 minutes. That’s not uncommon to have. And when you see something like that, you start thinking about why couldn’t we optimize a lot of that by having a more continuous, at least sync of state where you’re not trying to recreate or redo all the state over and over again.
Nathan

Yeah. I think that’s a good point. I would say my answer to that is number one, I agree. I definitely wish CloudFormation was faster. I wish CDK was faster in some respects. I do think there are solutions to the problem though, and it mostly comes down to more granular usage. CloudFormation is not designed to be used with really large stacks. And CDK, unfortunately, makes it really easy to create a gigantic stack. It becomes very easy to manage, you know, hundreds of resources potentially, and create a huge stack. But the reality is it’s still actually to your benefit to create micro stacks where each stack is maybe one Lambda and the DynamoDB table that it talks to.

And then the other Lambda and the other resources are in a different stack. And what that allows you to do is have extremely granular CDK deployment. So now, if you say CDK deploy, you know, this particular stack by name, that’s probably not gonna take 20 or 30 minutes, you know, because it’s only deploying a small piece of the stack independently of the rest of it. And so, I think it comes down to the usage pattern. And this is something where maybe there’s not enough documentation on best practices of how to sort of break things up. And even a lot of people don’t realize how easy it is in CDK to create dynamic references across stacks. Normally if you’re hand-coding CloudFormation, that’s very challenging. And if you have resources in one stack and you wanna reference that resource in another stack, you have to create an import, you have to create an export, you have to do this and that, right?
Rahul

Yeah.
Nathan

So, it’s scary. People say, “Oh, well, I’m just gonna put everything into one stack because that’s easy. And I know that references work really easily.” Well, CDK is the opposite story because CDK creates the exports and imports for you automatically. It actually becomes extremely easy to break things up into a ton of micro stacks. Like your application might have 100 micro stacks in it, and those stacks are sharing resources across stacks with automatically created and managed exports and imports that CDK does on your behalf. And now you can deploy one small slice or one small piece of your overall stack much quicker and much easier than waiting on the entire CloudFormation stack deployment.
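The bookkeeping that makes micro stacks painful by hand, and that CDK is described as doing automatically, can be modeled in a few lines. This is a toy model in plain Python, not CDK's implementation: `App`, `reference`, and the export naming scheme are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class App:
    """Tracks which stacks must export or import values for cross-stack use."""

    def __init__(self) -> None:
        self.exports: DefaultDict[str, Set[str]] = defaultdict(set)
        self.imports: DefaultDict[str, Set[str]] = defaultdict(set)

    def reference(self, consumer: str, producer: str, resource: str) -> str:
        """Use `resource` from `producer` inside `consumer`.

        Within one stack this is a direct reference. Across stacks, an
        export on the producer and an import on the consumer are recorded.
        """
        if consumer == producer:
            return resource
        export_name = f"{producer}:{resource}"
        self.exports[producer].add(export_name)
        self.imports[consumer].add(export_name)
        return export_name


app = App()
app.reference("OrdersApiStack", "OrdersTableStack", "OrdersTable")
app.reference("BillingApiStack", "OrdersTableStack", "OrdersTable")

print(dict(app.exports))
print(dict(app.imports))
```

The author of each stack only writes the reference; the matching export and import entries fall out as a side effect, which is why splitting into many small stacks stops being scary.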
Rahul

That’s actually a really interesting idea, I hadn’t thought about this one. But how does that work when you have a single repository, which has, let’s say 10 different microservices? You do define all these different stacks, but when code changes, let’s say I had an API, right? I had 10 different APIs, they’re all different microservices. Yeah, they’re all a different microservice each. I would have one code base that manages all of it. It packages up the same container. It’s just that the endpoints differ depending on, you know, which API I wanna call as part of that particular microservice. So, even if I create 10 different stacks as part of the setup, when I run my CI/CD, the stack that needs to change is gonna be a function of the code that changed.
Nathan

Yeah.
Rahul

There is no simple way of mapping that. You basically run everything and say, “Hey, just make sure that whatever deltas have shown up now as part of this new packaging all get reflected accordingly.” You’re still gonna run into that problem of syncing all the stacks because it’s part of the same repository, not a completely different repository. Now, if you had, you know, 10 different repositories, one for each microservice, great. Makes sense.
Nathan

Yeah. So, CDK solves a little bit of that on its own. If you do break your stack up into a bunch of micro stacks, CDK will see that particular stacks haven’t actually changed. And so, it doesn’t have to deploy that stack. It will only deploy the stack that has actually changed since the last time there was deployment.
Stephen

So, CDK can diff your code locally, basically?
Nathan

It’s a little bit more complex than that. But every time it creates an asset such as a Lambda function or a container image or something like that, it captures a hash of that local folder or wherever that source code came from. And so, if it sees that your code hasn’t changed, then it knows that that asset hasn’t changed, which means it knows that Lambda function hasn’t changed, which means it knows that that entire stack or that entire slice of the CDK app hasn’t changed and doesn’t actually need to be redeployed. So, it’s a little bit self-optimizing in that respect. Now, the downside is you still have to actually do the synth of every single stack. And so, if you do have a particularly large deployment, there’s gonna be… And when you initially fire off that CDK deploy command, there may be like a 10, 15-second wait while it synthesizes all of that code. And that can sometimes be a bit slow.
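The "hash of that local folder" idea can be sketched with the standard library. This is an illustration of the general technique only; CDK's own asset fingerprinting has details this does not reproduce, and `folder_hash` and `needs_deploy` are hypothetical names.

```python
import hashlib
from pathlib import Path


def folder_hash(root: Path) -> str:
    """Hash every file's relative path and contents in a stable order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")  # separator so names and contents cannot blur
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_deploy(root: Path, last_deployed_hash: str) -> bool:
    """An asset only needs redeploying when its source folder hash changed."""
    return folder_hash(root) != last_deployed_hash
```

A caller would store the hash alongside each deployment, then call `needs_deploy(Path("services/orders"), stored_hash)` before deciding whether that asset's stack has anything to do. The folder path here is only an example.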

So, I think that’s where it comes down to a little bit of external tooling. And it could be a little bit hacky, but there is tooling out there which can detect, for a mono repo, which paths inside of the mono repo have changed since the last Git commit. And based on that, you can create an understanding of this path is tied to this service which is tied to that CDK stack. You have to do this custom, by the way. It’s not something that’s built into CDK, but it’s similar to the way make works, you know, where make says, if this folder doesn’t exist, then we run those commands to create that folder, right? In general, it is definitely easier to do things with a micro repo rather than a mono repo. And that’s the reason why I recommended earlier that sort of ideal usage of CDK is that you have an infrastructure team, which creates an npm package and then vends that out to other teams to consume. So, now each smaller team has their own micro repo, which pulls in a CDK package from a shared central repo, right? And that micro repo, when it deploys, can utilize the shared patterns from the centralized source, but just deploy the limited number of things that it needs to deploy for that particular service. So, that does also help with carving down the surface area.
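The custom path-to-stack mapping described here can be sketched as a small lookup. This is one possible shape for such tooling, not something CDK provides: `PATH_TO_STACK`, the directory layout, and the stack names are hypothetical, and the list of changed files is assumed to come from something like `git diff --name-only` run by the CI job.

```python
from typing import Dict, Iterable, Set

# Hypothetical mapping from mono repo path prefixes to CDK stack names.
PATH_TO_STACK: Dict[str, str] = {
    "services/orders/": "OrdersStack",
    "services/billing/": "BillingStack",
    "services/search/": "SearchStack",
}


def stacks_to_deploy(changed_files: Iterable[str], path_to_stack: Dict[str, str]) -> Set[str]:
    """Return the stacks whose source paths contain at least one changed file."""
    return {
        stack
        for changed in changed_files
        for prefix, stack in path_to_stack.items()
        if changed.startswith(prefix)
    }


# Example input, as it might be listed by `git diff --name-only`.
changed = ["services/orders/handler.py", "README.md"]
print(sorted(stacks_to_deploy(changed, PATH_TO_STACK)))
```

The resulting names could then be passed to `cdk deploy` one stack at a time, so a change under one service's folder does not trigger a deployment of every stack in the repository.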

I’m definitely more a fan of micro repos than mono repos. So, that’s a little bit where my bias comes in, but I do know it is possible to solve this problem. Like obviously, Google famously has solved this problem; they have all their service code in one repository, as far as I understand the story. And it’s because they have that tooling to understand, for each commit, what is the surface area of which services actually changed? And then what to run for which service to actually make those changes?
Rahul

Yeah. I think there are a lot of very interesting packaging trade-offs that have to be considered. Like the minute, you know, you have 10 different microservices that have the same packaging, if you start doing a Docker build for it, then again, you’ve lost, because the hash is gonna be now for the Docker image that got created. And so, your topmost layer, which has the code, is gonna change, and that’s gonna change for all your stacks, right? And that’s gonna nullify some of the other trade-offs. So, I think it does make for an interesting range of new best practices for how you lay out a lot of these things. We continue to learn from our experience of managing all these code bases and all of these microservices, but we’d love to collaborate with you going forward, Nathan, to come up with more of these standard practices and contribute to the CDK project as well.
Nathan

Yeah. For sure. I loved chatting with y’all about CDK today, and I think y’all are doing some great stuff with it. I would actually love to hear more about the 40,000 accounts too. That’s amazing.
Rahul

If you stick around, I can give you a little more background on that, but thank you so much for coming in and sharing all of these really neat best practices and patterns. This is really wonderful. I think we’re gonna have to call you back for another session because there’s gonna be more things to…
Nathan

Yeah. I’d love to come back and be a guest again and chat about other topics, whether CDK or containers or whatever.
Rahul

Absolutely.
Stephen

We really appreciate that. Thank you. You’ve definitely got our wheels turning about different ways of structuring, you know, the microservice. I know I’m thinking about different ideas of okay, how to chop up these projects a bit differently to leverage some of the advantages of that. So, thank you for all the different ideas you’ve given us.
Nathan

Sure thing. Thank you very much for having me.
Stephen

All right. Well, and thank you to our audience. And thank you again, Marissa and Karen, for the interpretation, and we will see everyone next week.
[01:03:35]

♪ [music] ♪
[01:03:59]