Stacks on Stacks

The Serverless Ecosystem Blog by Stackery.

Posts on Serverless

How I Got Comfortable Building with Serverless
Jun Fritz

Jun Fritz | January 17, 2019

How I Got Comfortable Building with Serverless

A few months back, I blogged about my experience arriving at Stackery after code school. Months later, each day is still interesting and challenging and I’m so glad to have decided to pursue serverless as my concentration. I credit my AWS certifications for narrowing my focus enough to lead me to this point. The serverless community puts so much emphasis on exploration and getting started on your work or experiments today that, getting some exposure to AWS, you can get started right away. Here’s a breakdown of how I went from serverless novice to software engineer at Stackery:

Gaining Confidence

I was interested in cloud computing, but I had very little knowledge of AWS, so I opted for what felt like the next best thing: certifications.

After graduating from code bootcamp, I was eager to dive into the job search and get placed somewhere awesome right away. What I discovered was that many of the listings required some level of AWS experience. I was interested in cloud computing, but I had very little knowledge of AWS, so I opted for what felt like the next best thing: certifications. I wasn’t sure which to choose, so I decided to start with three associate-level certs offered by AWS: Solution Architect, Developer, and SysOps. Each covered different roles, which forced me to adopt a distinct mindset for each, deepening my understanding of cloud services and their use cases. Taking the exams gave me enough confidence to begin building cloud-based applications and showcasing them to potential employers. The ability to discuss different cloud infrastructures and build my own serverless web apps helped me earn a spot on the Stackery engineering team, and I was excited to start gaining more real-world experience.

Real-World Experience

In my current role at Stackery I get to work with AWS resources every day, but discovering what problems other serverless developers are working on has influenced me a lot. There are developers out there that have had different experiences with the tools I use, and it’s important for me to be aware of them so that I’m not limited to my own way of thinking.

The online forums and webinars about serverless development provide me with tons of useful content, but a find a lot of value in the use-cases and questions I get from others. For example, attending any of the live 3-400 level online webinars from AWS provides a deep dive into various serverless topics, along with a pointed Q&A session to address common concerns in the community. So, even if your day-to-day work doesn’t consist of serverless or AWS, you can still engage in these topics by being open to the real-world experiences of other developers.

The following resources can help keep you in the know about new advancements for serverless and AWS:

Learn by Building

Before I was introduced to Stackery, serverless development was definitely a challenge. Building my own serverless applications helped solidify what I’d learned from my certifications, but I’d get frustrated with the amount of configuration and guesswork required to write out my own CloudFormation templates.

With Stackery, it doesn’t take long for me to quickly define cloud resources and integrate them with other services; I’m able to build a stack that confirms my understanding of a specific serverless workflow rapidly. Using Stackery has helped me build serverless applications and thus learn more about serverless development.

If any part of you is compelled to learn more about serverless right out of school, I can’t recommend the above strategies highly enough. It really just comes down to curiosity, research, experimentation and, if you want a boost in confidence, perhaps a certification or two.

The Journey to Serverless: How Did We Get Here? [Infographic]
Gracie Gregory

Gracie Gregory | January 08, 2019

The Journey to Serverless: How Did We Get Here? [Infographic]

It’s the beginning of a new year and when it comes to computing, going serverless is the resolution of many engineering teams. At Stackery, this excites us because we know how significant the positive impacts of serverless are and will be. So much, in fact, that we’re already thinking about its applications for next year and beyond.

But while Stackery is toasting to serverless just as much as the headlines are, it’s crucial at this juncture to ensure that there is a wider foundational understanding. Our team is thrilled that so many others are anxious to rethink how they approach computing, save money with a pay-per-use model, and build without limits using serverless. However, we’re also proponents of knowing your serverless strategy inside and out, thereby having an airtight business use-case that anyone on the team can explain. After all, serverless didn’t rise to the top of Gartner’s top 10 infrastructure and operations trends overnight; its (figurative) source code was being drafted decades ago and this is why it’s much more than a trend. Just as we learned in history class, what’s past is prologue; the developments of yesteryear are the stage directions for today’s innovation. In other words, understanding the origins of serverless will give you a competitive advantage.

So, how exactly did we get to the edge of widespread serverless adoption? What historical developments make all of this more than a temporary buzzword? Why have the conversations about serverless been growing among your peers and leadership team, not dying down? To answer these questions, let’s interrupt our regularly-scheduled New Year celebrations with a trip back in time to 1995…

At Stackery, we’re helping engineering teams build amazing serverless applications with limitless scalability. The best part? The stage for the next decade of software development is being set now. Join us in shaping serverless computing for the next generation. Get started with Stackery today.

PHP on Lambda? Layers Makes it Possible!
Nuatu Tseggai

Nuatu Tseggai | November 29, 2018

PHP on Lambda? Layers Makes it Possible!

AWS’s announcement of Lambda Layers means big things for those of us using serverless in production. The creation of set components that can be included with any number of Lambdas means you no longer have to zip up your application code and all its dependencies each time you deploy a serverless stack. This allows you to include dependencies that are much more bespoke to your particular serverless environment.

In order to enable Stackery customers with Layers at launch, we took a look at Lambda Layers use cases. I also decided to go a bit further and publish a layer that enables you to write a Lambda in PHP. Keep in mind that this is an early iteration of the PHP runtime Layer, which is not yet ready for production. Feel free to use this Layer to learn about the new Lambda Layers feature and begin experimenting with PHP functions and send us any feedback; we expect this will evolve as the activity around proof of concepts expands.

What does PHP do?

PHP is a pure computing language and you can use to emulate the event processing syntax of a general-purpose Lambda. But really, PHP is used to create websites, so Chase’s implementation maintains that model: your Lambda accepts API gateway events and processes them through a PHP web server.

How do you use it?

Configure your function as follows:

  1. Set the Runtime to provided
  2. Determine the latest version of the layer: aws lambda list-layer-versions --layer-name arn:aws:lambda:<your 3. region>:887080169480:layer:php71
  3. Add the following Lambda Layer: arn:aws:lambda:<your region>:887080169480:layer:php71:<latest version>

If you are using AWS SAM it’s even easier! Update your function:

    Type: AWS::Serverless::Function
      Runtime: provided
        - !Sub arn:aws:Lambda:${AWS::Region}:887080169480:layer:php71

Now let’s write some Lambda code!

header('Foo: bar');
print('Request Headers:');
print('Query String Params:');


The response you get from this code isn’t very well formatted, but it does contain the header information passed by the API gateway:

If you try anything other than the path with a set API endpoint response, you’ll get an error response that the sharp-eyed will recognize as being from the PHP web server, which as mentioned above is processing all requests

Implementation Details

Layers can be shared between AWS accounts, which is why the instructions above for adding a layer works: you don’t have to create a new layer for each Lambda. Some key points to remember:

  • A layer must be published on your region
  • You must specify the version number for a layer
  • For the layer publisher, a version number is an integer that increments each time you deploy your layer

How Stackery Makes an Easy Process Easier

Stackery can improve every part of the serverless deployment process, and Lambda Layers are no exception. Stackery makes it easy to configure components like your API gateway.

Stackery also has integrated features in the Stackery Operations Console, which lets you add layers to your Lambda:


Lambda Layers offers the potential for more complex serverless applications making use of a deep library of components, both internal to your team and shared. Try adding some data variables or a few npm modules as a layer today!

How Benefit Cosmetics Uses Serverless
Guest Author - Jason Collingwood

Guest Author - Jason Collingwood | November 21, 2018

How Benefit Cosmetics Uses Serverless

Founded by twin sisters in San Francisco well before the city became the focal point of tech, Benefit has been a refreshing and innovative answer to cosmetics customers for over 40 years. The company is a major player in this competitive industry, with a presence at over 2,000 counters in more than 30 countries and online. In recent years, Benefit has undergone a swift digital transformation, with a popular eCommerce site in addition to their brick-and-mortar stores.

When I started with Benefit, the dev team’s priority was to resolve performance issues across our stack. After some quick successes, the scope opened up to include exploring how we could improve offline business processes, as well. We started with our product scorecard, which involved measuring:

  • In-site search result ranking.
  • Product placement and mentions across home and landing pages.
  • How high we appeared within a given category.

We needed to capture all this information on several different sites and in a dozen different markets. If you can believe it, we’d been living in a chaotic, manually updated spreadsheet and wasting thousands of hours per year gathering this information. There had to be a better way.

Automating Applications

To monitor a large number of sites in real time, a few SaaS options exist, but the costs can be hard to justify. Moreover, most solutions are aimed at end-to-end testing and don’t offer the kind of customization we needed. With our needs so well-defined it wasn’t very much work to write our own web scraper and determine the direction we needed to take.

The huge number of pages to load, though, meant that scaling horizontally was a must. Checking thousands of pages synchronously could take multiple days, which just wasn’t going to cut it when we needed daily reports!

“Well, let’s look into this serverless thing.”

Web monitors and testers are a classic case for serverless. The service needs to be independent of our other infrastructure, run regularly, and NOT be anyone’s full-time job to manage! We didn’t have the time nor people to spend countless hours configuring resources- and really didn’t want to be patching servers to keep it running a year in the future.

How it Works

We use Selenium and a headless Chrome driver to load our pages and write the results to a DynamoDB table. Initially, we tried to use PhantomJS but ran into problems when some of the sites we needed to measure couldn’t connect correctly. Unfortunately, we found ourselves confronted with a lof of “SSL Handshake Failed” and other common connection timeout/connection refused request errors.

The hardest part of switching to the ChromeDriver instead of PhantomJS is that it’s a larger package, and the max size for an AWS Lambda’s code package is 50 mb. We had to do quite a bit of work to get our function, with all its dependencies, down under the size limit.

The Trouble of Complexity

At this point, even though we now had a working Lambda, we weren’t completely out of the woods. Hooking up all the other services proved to be a real challenge. We needed our Lambdas to connect to DynamoDB, multiple S3 buckets, Kinesis streams, and an API Gateway endpoint. Then, in order to scale we needed to be able to build the same stack multiple times.

The Serverless Application Model (SAM) offers some relief from rebuilding and configuring stacks by hand in the AWS console, but the YAML syntax and the specifics of the AWS implementation make it pretty difficult to use freehand. For example, a timer to periodically trigger a Lambda is not a top-level element nor is it a direct child of the Lambda. Rather, it’s a ‘rule’ on a Lambda. There are no examples of this in the AWS SAM documentation.

At one point, we were so frustrated that we gave up and manually zipped up the package and uploaded via the AWS Console UI… at every change to our Lambdas! Scaling a lot of AWS services is simple, but we needed help to come up with a deployment and management process that could scale.

How Stackery Helps

It’s no surprise that when people first see the Stackery Operations Console, they assume it’s just a tool for diagramming AWS stacks. Connecting a Lambda to DynamoDB involves half a dozen menus on the AWS console, but Stackery makes it as easy as drawing a line.

Stackery outputs SAM YAML, meaning we don’t have to write it ourselves, and the changes show up as commits to our code repository so we can learn from the edits that Stackery makes.

It was very difficult to run a service even as simple as ours from scratch and now it’s hard to imagine ever doing it without Stackery. But if we ever did stop using the service, it’s nice to know that all of our stacks are stored in our repositories, along with the SAM YAML I would need to deploy those stacks via CloudFront.


With the headaches of managing the infrastructure out of the way, it meant we could focus our efforts on the product and new features. Within a few months were able to offload maintenance of the tool to a contractor. A simple request a few times a day starts the scanning/scraping process and the only updates needed are to the CSS selectors used to find pertinent elements.

Lastly, since we’re using all of these services on AWS, there’s no need to setup extra monitoring tools, or update them every few months, or generate special reports on their costs. The whole toolkit is rolled into AWS and best of all, upkeep is minimal!

Five Ways Serverless Changes DevOps
Sam Goldstein

Sam Goldstein | October 31, 2018

Five Ways Serverless Changes DevOps

I spent last week at DevOps Enterprise Summit in Las Vegas where I had the opportunity to talk with many people from the world’s largest companies about DevOps, serverless, and the ways they are delivering software faster with better stability. We were encouraged to hear of teams using serverless from cron jobs to core bets on accelerating digital transformation initiatives.

Lots of folks had questions about what we’ve learned running the serverless engineering team at Stackery, how to ensure innovative serverless projects can coexist with enterprise standards, and most frequently, how serverless changes DevOps workflows. Since I now have experience building developer enablement software out of virtual machines, container infrastructures, and serverless services I thought I’d share some of the key differences with you in this post.

Developers Need Cloud-Side Environments to Work Effectively

At its core, serverless development is all about combining managed services in the cloud to create applications and microservices. The serverless approach has major benefits. You can move incredibly fast, outsourcing tons of infrastructure friction and orchestration complexity.

However, because your app consists of managed services, you can’t run it on your laptop. You can’t run the cloud on your laptop.

Let’s pause here to consider the implications of this. With VMs and containers, deploying to the cloud is part of the release process. New features get developed locally on laptops and deployed when they’re ready. With serverless, deploying to the cloud becomes part of the development process. Engineers need to deploy as part of their daily workflow developing and testing functionality. Automated testing generally needs to happen against a deployed environment, where the managed service integrations can be fully exercised and validated.

This means the environment management needs of a serverless team shift significantly. You need to get good at managing a multitude of AWS accounts, developer specific environments, avoiding namespace collisions, injecting environment specific configuration, and promoting code versions from cloud-side development environments towards production.

Note: While there are tools like SAM CLI and localstack that enable developers to invoke functions and mimic some managed services locally, they tend to have gaps and behave differently than a cloud-side environment.

Infrastructure Management = Configuration Management

The serverless approach focuses on leveraging the cloud provider do more of the undifferentiated heavy lifting of scaling the IT infrastructure, freeing your team to maintain laser focus on the unique problems which your organization solves.

To repeat what I wrote a few paragraphs ago, serverless teams build applications by combining managed services that have the most desirable scaling, cost, and durability characteristics. However, here’s another big shift. Developers now need familiarity with a hefty catalog of services. They need to understand their pros and cons, when to use each service, and how to configure each service correctly.

A big part of solving this problem is to leverage Infrastructure as Code (IaC) to define your serverless infrastructure. For serverless teams this commonly takes the form of an AWS Serverless Application Model (SAM) template, a serverless.yml, or a CloudFormation template. Infrastructure as Code provides the mechanism to declare the configuration and relationships between the managed services that compose your serverless app. However, because serverless apps typically involve coordinating many small components (Lambda functions, IAM permissions, API & GraphQL gateways, datastores, etc.) the YAML files containing the IaC definition tend to balloon to hundreds (or sometimes thousands) of lines, making it tedious to modify and hard to keep consistent with good hygiene. Multiply the size and complexity of a microservice IaC template by your dev, test, and prod environments, engineers on the team, and microservices; you quickly get to a place where you will want to carefully consider how they’ll manage the IaC layer and avoid being sucked into YAML hell.

Microservice Strategies Are Similar But Deliver Faster

Serverless is now an option for both new applications and refactoring monoliths into microservices. We’ve seen teams deliver highly scalable, fault-tolerant services in days instead of months to replace functionality in monoliths and legacy systems. We recently saw a team employ the serverless strangler pattern to transition a monolith to GraphQL serverless microservices, delivering a production ready proof of concept in just under a week. We’ve written about the Serverless Strangler Pattern before on the Stackery blog, and I’d highly recommend you consider this approach to technical transformation.

A key difference with serverless is the potential to eliminate infrastructure and platform provisioning cycles completely from the project timeline. By choosing managed services, you’re intentionally limiting yourself to a menu of services with built-in orchestration, fault tolerance, scalability, and defined security models. Building scalable distributed systems is now focused exclusively on the configuration management of your infrastructure as code (see above). Just whisper the magic incantation (in 500-1000 lines of YAML) and microservices spring to life, configured to scale on demand, rather than being brought online through cycles of infrastructure provisioning.

Regardless of platform, enforcing cross-cutting operational concerns when the number of services increases is a (frequently underestimated) challenge. With microservices it’s easy to keep the pieces of your system simple, but it’s hard to keep them all consistent as the number of pieces grows.

What cross-cutting concerns need to be kept in sync? It’s things like:

  • access control
  • secrets management
  • environment configuration
  • deployment
  • rollback
  • auditability
  • so many other things…

Addressing cross-cutting concerns is an area many serverless teams struggle, sometimes getting bogged down in a web of inconsistent tooling, processes, and visibility. However the serverless teams that do master cross-cutting effectively are able to deliver on microservice transformation initiatives much faster than those using other technologies.

Serverless is Innovating Quickly

Just like serverless teams, the serverless ecosystem is moving fast. Cloud providers are pushing out new services and features every day. Serverless patterns and best practices are undergoing rapid, iterative evolution. There are multiple AWS product and feature announcements every day. It’s challenging to stay current on the ever expanding menu of cloud managed services, let alone best practices.

Our team at Stackery is obsessed with tracking changes in the serverless ecosystem, identifying best practices, and sharing these with the serverless community. AWS Secrets Manager, easy authorization hooks for REST APIs in AWS SAM, 15 minute Lambda timeouts, and AWS Fargate Containers are just a few examples of recent serverless ecosystem changes our team is using. Only a serverless team can keep up with a serverless team. We have learned a lot of lessons, some of them the hard way, about how to do serverless right. We’ll keep refining our serverless approach and can honestly say we’re moving faster and with more focus than we’d ever thought possible.

Patching and Capacity Distractions Go Away (Mostly)

Raise your hand if the productivity of your team ever ground to a halt because you needed to fight fires or were blocked waiting for infrastructure to be provisioned. High profile security vulnerabilities are being discovered all the time. The week Heartbleed was announced a lot of engineers dropped what they had been working on to patch operating systems and reboot servers. Serverless teams intentionally don’t manage OS’s. There’s less surface area for them to patch, and as a result they’re less likely to get distracted by freshly discovered vulnerabilities. This doesn’t completely remove a serverless team’s need to track vulnerabilities in their dependencies, but it does significantly scope them down.

Capacity constraints are a similar story. Since serverless systems scale on demand, it’s not necessary to plan capacity in a traditional sense, managing a buffer of (often slow to provision) capacity to avoid hitting a ceiling in production. However serverless teams do need to watch for a wide variety AWS resource limits and request increases before they are hit. It is important to understand how your architecture scales and how that will effect your infrastructure costs. Instead of your system breaking, it might just send you a larger bill so understanding the relationship between scale, reliability, and cost is critical.

As a community we need to keep pushing the serverless envelope and guiding more teams in the techniques to break out of technical debt, overcome infrastructure inertia, embrace a serverless mindset, and start showing results they never knew they could achieve.

Disaster Recovery in a Serverless World - Part 3
Nuatu Tseggai

Nuatu Tseggai | October 18, 2018

Disaster Recovery in a Serverless World - Part 3

This is part three of a multi-part blog series. In the first post we covered Disaster Recovery planning when building serverless applications and in the second post we covered the systems engineering needed for an automated solution in the AWS cloud. In this post, we’ll discuss the end-to-end Disaster Recovery exercise we performed to validate the plan.

The time has come to exercise the Disaster Recovery plan. The objective of this phase is to see how closely the plan correlates to the current reality of your systems and to identify areas for improvement. In practice, this means assembling a team to conduct the exercise and documenting the process within the communication channel you outlined in the plan.


In the first post you’ll find an outline of the plan that we’ll be referencing in this exercise. Section 2 describes the process of initiating the plan and assigning roles and Section 3 describes the communication channels used to keep stakeholders in sync.

Set the Stage

Decide who will be involved in the exercise, block out a chunk of time equivalent to the Recovery Time Objective, and create the communication channel in Slack (or whatever communication tool that your organization utilizes; ie: Google Hangouts, Skype, etc).

Create a document to capture key information relevant to the Disaster Recovery exercise such as who is assigned which roles, the AWS regions in play, and most importantly, a timeline pertaining to the initiation and completion of each recovery step. Whereas the Disaster Recovery plan is a living document to be updated over time, the Disaster Recovery exercise document mentioned here is specific to the date of the exercise; ie: Disaster Recovery Exercise 20180706.

Conduct the Exercise

Enter the communication channel and post a message that signals the start of the exercise immediately followed by a message to the executive team to get their approval to initiate the Disaster Recovery plan.

Upon completion of the exercise, post a message that signals the end of the exercise.

Key Takeaways

  1. Disaster Recovery exercises can be stressful. Be courteous and supportive to one another.
  2. High performance teams develop trust through practicing to communicate effectively. Lean towards communication that is both precise and unambiguous. This is easier said than done but gets easier through experience. Over time, the Disaster Recovery plan and exercise will become self reinforcing.
  3. Don’t forget to involve a representative of the executive team. This is to ensure that the operational status is being communicated across all levels.
  4. Be clear on which AWS region represents the source and which AWS region represents the DR target.
  5. Research and experiment with the options you have available to automate DNS changes. Again, the key here is gaining the skills and confidence through testing and optimizing for fast recovery. Know your records (A and CNAME) and the importance of TTL.
  6. Verify completion of recovery from the perspective of a customer.
  7. Conduct a retrospective to illuminate areas for improvement or research. Ask questions that encourage participants to openly discuss both the technical and psychological aspects of the exercise.
Disaster Recovery in a Serverless World - Part 2
Apurva Jantrania

Apurva Jantrania | September 17, 2018

Disaster Recovery in a Serverless World - Part 2

This is part two of a multi-part blog series. In the previous post, we covered Disaster Recovery planning when building serverless applications. In this post, we’ll discuss the systems engineering needed for an automated solution in the AWS cloud.

As I started looking into implementing Stackery’s automated backup solution, my goal was simple: In order to support a disaster recovery plan, we needed to have a system that automatically creates backups of our database to a different account and to a different region. This seemed like a straightforward task, but I was surprised to find that there was no documentation on how to do this in an automated, scalable solution - all existing documentation I could find only discussed partial solutions and were all done manually via the AWS Console. Yuck.

I hope that this post will make help fill that void and help you understand how to implement an automated solution for your own disaster recovery solution. This post does get a bit long so if that’s not your thing, see the tl;dr.

The Initial Plan

AWS RDS has automated backups which seemed like the perfect platform to base this automation upon. Furthermore, RDS even emits events that seem ideal for using to kick off a lambda function that will then copy the snapshot to the disaster recovery account.


The first issue I discovered was that AWS does not allow you to share automated snapshots - AWS requires that you first make a manual copy of the snapshot before you can share it with another account. I initially thought that this wouldn’t be a major issue - I can easily make my lambda function first kick off a manual copy. According to the RDS Events documentation, there is an event RDS-EVENT-0042 that would fire when a manual snapshot was created. I could then use that event to then share the newly created manual snapshot to the disaster recovery account.

This leads to the second issue - while RDS will emit events for snapshots that are created manually, it does not emit events for snapshots that are copied manually. The AWS docs aren’t clear about this and it’s an unfortunate feature gap. This means that I have to fall back to a timer based lambda function that will search for and share the latest available snapshot.

Final Implementation Details

While this ended up more complicated than initially envisioned, Stackery still makes it easy to add all the needed pieces for fully automated backups. My implementation ended up looking like this:

The DB Event Subscription resource is a CloudFormation Resource in which contains a small snippet of CloudFormation that subscribes the DB Events topic to the RDS database

Function 1 - dbBackupHandler

This function will receive the events from the RDS database via the DB Events topic. It then creates a copy of the snapshot with an ID that identifies the snapshot as an automated disaster recovery snapshot

const AWS = require('aws-sdk');
const rds = new AWS.RDS();

const DR_KEY = 'dr-snapshot';
const ENV = process.env.ENV;

module.exports = async message => {
  // Only run DB Backups on Production and Staging
  if (!['production', 'staging'].includes(ENV)) {
    return {};

  let records = message.Records;
  for (let i = 0; i < records.length; i++) {
    let record = records[i];

    if (record.EventSource === 'aws:sns') {
      let msg = JSON.parse(record.Sns.Message);
      if (msg['Event Source'] === 'db-snapshot' && msg['Event Message'] === 'Automated snapshot created') {
        let snapshotId = msg['Source ID'];
        let targetSnapshotId = `${snapshotId}-${DR_KEY}`.replace('rds:', '');

        let params = {
          SourceDBSnapshotIdentifier: snapshotId,
          TargetDBSnapshotIdentifier: targetSnapshotId

        try {
          await rds.copyDBSnapshot(params).promise();
        } catch (error) {
          if (error.code === 'DBSnapshotAlreadyExists') {
            console.log(`Manual copy ${targetSnapshotId} already exists`);
          } else {
            throw error;

  return {};

A couple of things to note:

  • I’m leveraging Stackery Environments in this function - I have used Stackery to define process.env.ENV based on the environment the stack is deployed to
  • Automatic RDS snapshots have an id that begins with ‘rds:’. However, snapshots created by the user cannot have a ‘:’ in the ID.
  • To make future steps easier, I append dr-snapshot to the id of the snapshot that is created

Function 2 - shareDatabaseSnapshot

This function runs every few minutes and shares any disaster recovery snapshots to the disaster recovery account

const AWS = require('aws-sdk');
const rds = new AWS.RDS();

const DR_KEY = 'dr-snapshot';
const DR_ACCOUNT_ID = process.env.DR_ACCOUNT_ID;
const ENV = process.env.ENV;

module.exports = async message => {
  // Only run on Production and Staging
  if (!['production', 'staging'].includes(ENV)) {
    return {};

  // Get latest snapshot
  let snapshot = await getLatestManualSnapshot();

  if (!snapshot) {
    return {};

  // See if snapshot is already shared with the Disaster Recovery Account
  let data = await rds.describeDBSnapshotAttributes({ DBSnapshotIdentifier: snapshot.DBSnapshotIdentifier }).promise();
  let attributes = data.DBSnapshotAttributesResult.DBSnapshotAttributes;

  let isShared = attributes.find(attribute => {
    return attribute.AttributeName === 'restore' && attribute.AttributeValues.includes(DR_ACCOUNT_ID);

  if (!isShared) {
    // Share Snapshot with Disaster Recovery Account
    let params = {
      DBSnapshotIdentifier: snapshot.DBSnapshotIdentifier,
      AttributeName: 'restore',
      ValuesToAdd: [DR_ACCOUNT_ID]
    await rds.modifyDBSnapshotAttribute(params).promise();

  return {};

async function getLatestManualSnapshot (latest = undefined, marker = undefined) {
  let result = await rds.describeDBSnapshots({ Marker: marker }).promise();

  result.DBSnapshots.forEach(snapshot => {
    if (snapshot.SnapshotType === 'manual' && snapshot.Status === 'available' && snapshot.DBSnapshotIdentifier.includes(DR_KEY)) {
      if (!latest || new Date(snapshot.SnapshotCreateTime) > new Date(latest.SnapshotCreateTime)) {
        latest = snapshot;

  if (result.Marker) {
    return getLatestManualSnapshot(latest, result.Marker);

  return latest;
  • Once again, I’m leveraging Stackery Environments to populate the ENV and DR_ACCOUNT_ID environment variables.
  • When sharing a snapshot with another AWS account, the AttributeName should be set to restore (see the AWS RDS SDK)

Function 3 - copyDatabaseSnapshot

This function will run in the Disaster Recovery account and is responsible for detecting snapshots that are shared with it and making a local copy in the correct region - in this example, it will make a copy in us-east-1.

const AWS = require('aws-sdk');
const rds = new AWS.RDS();

const sourceRDS = new AWS.RDS({ region: 'us-west-2' });
const targetRDS = new AWS.RDS({ region: 'us-east-1' });

const DR_KEY = 'dr-snapshot';
const ENV = process.env.ENV;

module.exports = async message => {
  // Only Production_DR and Staging_DR are Disaster Recovery Targets
  if (!['production_dr', 'staging_dr'].includes(ENV)) {
    return {};

  let [shared, local] = await Promise.all([getSourceSnapshots(), getTargetSnapshots()]);

  for (let i = 0; i < shared.length; i++) {
    let snapshot = shared[i];
    let fullSnapshotId = snapshot.DBSnapshotIdentifier;
    let snapshotId = getCleanSnapshotId(fullSnapshotId);
    if (!snapshotExists(local, snapshotId)) {
      let targetId = snapshotId;

      let params = {
        SourceDBSnapshotIdentifier: fullSnapshotId,
        TargetDBSnapshotIdentifier: targetId
      await rds.copyDBSnapshot(params).promise();

  return {};

// Get snapshots that are shared to this account
async function getSourceSnapshots () {
  return getSnapshots(sourceRDS, 'shared');

// Get snapshots that have already been created in this account
async function getTargetSnapshots () {
  return getSnapshots(targetRDS, 'manual');

async function getSnapshots (rds, typeFilter, snapshots = [], marker = undefined) {
  let params = {
    IncludeShared: true,
    Marker: marker

  let result = await rds.describeDBSnapshots(params).promise();

  result.DBSnapshots.forEach(snapshot => {
    if (snapshot.SnapshotType === typeFilter && snapshot.DBSnapshotIdentifier.includes(DR_KEY)) {

  if (result.Marker) {
    return getSnapshots(rds, typeFilter, snapshots, result.Marker);

  return snapshots;

// Check to see if the snapshot `snapshotId` is in the list of `snapshots`
function snapshotExists (snapshots, snapshotId) {
  for (let i = 0; i < snapshots.length; i++) {
    let snapshot = snapshots[i];
    if (getCleanSnapshotId(snapshot.DBSnapshotIdentifier) === snapshotId) {
      return true;
  return false;

// Cleanup the IDs from automatic backups that are prepended with `rds:`
function getCleanSnapshotId (snapshotId) {
  let result = snapshotId.match(/:([a-zA-Z0-9-]+)$/);

  if (!result) {
    return snapshotId;
  } else {
    return result[1];
  • Once again, leveraging Stackery Environments to populate ENV, I ensure this function only runs in the Disaster Recovery accounts

TL;DR - How Automated Backups Should Be Done

  1. Have a function that will manually create an RDS snapshot using a timer and lambda. Use a timer that makes sense for your use case
    • Don’t bother trying to leverage the daily automated snapshot provided by AWS RDS.
  2. Have a second function, that monitors for the successful creation of the snapshot from the first function and shares it to your disaster recovery account.

  3. Have a third function that will operate in your disaster recovery account that will monitor for snapshots shared to the account, and then create a copy of the snapshot that will be owned by the disaster recovery account, and in the correct region.
How to Write 200 Lines of YAML in 1 Minute
Anna Spysz

Anna Spysz | September 11, 2018

How to Write 200 Lines of YAML in 1 Minute

Last month, our CTO Chase wrote about why you should stop YAML engineering. I completely agree with his thesis, though for slightly different reasons. As a new developer, I’ve grasped that it’s crucial to learn and do just what you need and nothing more - at least when you’re just getting started in your career.

Now, I’m all about learning for learning’s sake - I have two now-useless liberal arts degrees that prove that. However, when it comes to being a new developer, it’s very easy to get overwhelmed by all of the languages and frameworks out there, and get caught in paralysis as you jump from tutorial to tutorial and end up not learning anything very well. I’ve certainly been there - and then I decided to just get good at the tools I’m actually using for work, and learn everything else as I need it.

Which is what brings us to YAML - short for “YAML Ain’t Markup Language”. I started out as a Python developer. When I needed to, I learned JavaScript. When my JavaScript needed some support, I learned a couple of front-end frameworks, and as much Node.js as I needed to write and understand what my serverless functions were doing. As I got deeper into serverless architecture, it seemed like learning YAML was the next step - but if it didn’t have to be, why learn it? If I can produce 200+ lines of working YAML without actually writing a single line of it, in much less time than it would take me to write it myself (not counting the hours it would take to learn a new markup language), then that seems like the obvious solution.

So if a tool allows me to develop serverless apps without having to learn YAML, I’m all for that. Luckily, that’s exactly what Stackery does, as you can see in the video below:

Get the Serverless Development Toolkit for Teams

Sign up now for a 30-day free trial. Contact one of our product experts to get started building amazing serverless applications today.

To Top