Stacks on Stacks

The Serverless Ecosystem Blog by Stackery.

Posts on Cloud Infrastructure

The Journey to Serverless: How Did We Get Here? [Infographic]
Gracie Gregory | January 08, 2019

It’s the beginning of a new year and when it comes to computing, going serverless is the resolution of many engineering teams. At Stackery, this excites us because we know how significant the positive impacts of serverless are and will be. So much so, in fact, that we’re already thinking about its applications for next year and beyond.

But while Stackery is toasting to serverless just as much as the headlines are, it’s crucial at this juncture to ensure that there is a wider foundational understanding. Our team is thrilled that so many others are eager to rethink how they approach computing, save money with a pay-per-use model, and build without limits using serverless. However, we’re also proponents of knowing your serverless strategy inside and out, thereby having an airtight business use-case that anyone on the team can explain. After all, serverless didn’t rise to the top of Gartner’s top 10 infrastructure and operations trends overnight; its (figurative) source code was being drafted decades ago and this is why it’s much more than a trend. Just as we learned in history class, what’s past is prologue; the developments of yesteryear are the stage directions for today’s innovation. In other words, understanding the origins of serverless will give you a competitive advantage.

So, how exactly did we get to the edge of widespread serverless adoption? What historical developments make all of this more than a temporary buzzword? Why have the conversations about serverless been growing among your peers and leadership team, not dying down? To answer these questions, let’s interrupt our regularly-scheduled New Year celebrations with a trip back in time to 1995…

At Stackery, we’re helping engineering teams build amazing serverless applications with limitless scalability. The best part? The stage for the next decade of software development is being set now. Join us in shaping serverless computing for the next generation. Get started with Stackery today.

Infrastructure-as-Code Is The New Assembly Language For The Cloud
Chase Douglas | December 13, 2018

My career as a software engineer started in 2007 at Purdue University. I was working in the Linux kernel and researching how data was shuffled between the kernel and the user application layers. This was happening in huge clusters of machines that all talked to each other using OpenMPI — how supercomputers, like those at Los Alamos National Labs, operate to perform their enormous calculations around meteorology, physics, chemistry, etc.

It was an exciting time, but I had to learn a ton about how to debug the kernel. I’d only started programming in C over the previous year, so it really stretched my knowledge and experience. A big component of this was figuring out how to navigate a gigantic code base, which hit 6 million lines of code that year (again, in 2007!). There were times when I felt helpless trying to make sense of it all, but I will be forever grateful for the experience.

Being thrown in the deep-end meant that I was exposed to the way real-world code can be modularized. I learned how to quickly dissect a large codebase and how to debug in some of the toughest environments. But over time I also realized that I had learned a lot of skills that are largely irrelevant to how the vast majority of people build business value into software today. I now build business value that solves more abstracted problems than how bits are shuffled through a networking stack.

It’s these higher order abstractions that help engineering teams realize pivotal business results.


The main drivers of software-engineering productivity are the abstractions used to reach development goals. You can write software using CPU assembly languages or modern scripting languages. In a theoretical sense, you can achieve the same software goals with either approach. But realistically, productivity will be higher with modern scripting languages than with assembly languages.

Yet, everything we write today compiles down to assembly language in some form, even if it’s through Just-In-Time compilation. That’s because assembly language is the core medium we use to communicate intent to hardware, which ultimately carries out the operations. But now, we no longer directly write software with it; we have better abstractions.


Infrastructure-as-Code (IaC) fulfills the same foundational mechanism for cloud computing. It informs the cloud provider with raw data about our intentions: create a function here with these permissions and create a topic over there with this name.
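As a rough illustration of that “raw data”, here is a minimal sketch of a template written as a plain JavaScript object and serialized to JSON. The resource names (NotifierFunction, AlertsTopic), the topic name, the runtime, and the code path are hypothetical placeholders, and the sketch assumes the AWS SAM transform with its SNSPublishMessagePolicy policy template.

```javascript
// Hypothetical template: one function with publish permission on one topic.
// All names and paths here are placeholders, not taken from a real stack.
const template = {
  AWSTemplateFormatVersion: '2010-09-09',
  Transform: 'AWS::Serverless-2016-10-31',
  Resources: {
    // "create a topic over there with this name"
    AlertsTopic: {
      Type: 'AWS::SNS::Topic',
      Properties: { TopicName: 'alerts' },
    },
    // "create a function here with these permissions"
    NotifierFunction: {
      Type: 'AWS::Serverless::Function',
      Properties: {
        Handler: 'index.handler',
        Runtime: 'nodejs8.10',
        CodeUri: './src',
        Policies: [
          {
            SNSPublishMessagePolicy: {
              TopicName: { 'Fn::GetAtt': ['AlertsTopic', 'TopicName'] },
            },
          },
        ],
      },
    },
  },
};

// The cloud provider only ever sees the serialized declaration.
console.log(JSON.stringify(template, null, 2));
```

Nothing in this object describes how the resources get created; it only declares what should exist, which is what makes it comparable to a low-level instruction format.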

Just as with assembly language, we have been writing IaC templates by hand because there have not been any better methods. Frameworks like that of serverless.com are ever-so-slightly better abstractions; however, many adopters of these frameworks have yet to achieve meaningful business-productivity gains. This is largely because, once off the beaten path, you end up writing bare CloudFormation. The whole process leaves you back at square one for some of your most complicated infrastructure like VPCs and databases.

IaC is the only sane way to provision cloud infrastructure. That means it’s time for us to find abstractions on top of IaC that provide us with meaningful productivity gains. This is where Stackery comes in. Stackery provides you with an easy drag-and-drop interface to configure your serverless IaC templates. Crucially, you can also import your existing IaC templates (AWS SAM or serverless.com) and use Stackery to extend your applications without worrying that Stackery will delete or modify unrelated infrastructure configuration.

My career could have taken a number of different paths, but I’m glad to be in serverless today. The industry is moving steadily in this direction and my team creates solutions that make it more manageable for everyone. Notably, the “deep-end” of serverless is much more navigable than the technology I was working with in 2007. Unlike certain aspects of what I learned in the bowels of the Linux kernel, serverless and the tools that manage our IaC templates are the new assembly language for the cloud. Stackery and IaC are significant when considering how the majority of developers will be building business value into software going forward.

How Benefit Cosmetics Uses Serverless
Guest Author - Jason Collingwood | November 21, 2018

Founded by twin sisters in San Francisco well before the city became the focal point of tech, Benefit has been a refreshing and innovative answer to cosmetics customers for over 40 years. The company is a major player in this competitive industry, with a presence at over 2,000 counters in more than 30 countries and online. In recent years, Benefit has undergone a swift digital transformation, with a popular eCommerce site in addition to their brick-and-mortar stores.

When I started with Benefit, the dev team’s priority was to resolve performance issues across our stack. After some quick successes, the scope opened up to include exploring how we could improve offline business processes, as well. We started with our product scorecard, which involved measuring:

  • In-site search result ranking.
  • Product placement and mentions across home and landing pages.
  • How high we appeared within a given category.

We needed to capture all this information on several different sites and in a dozen different markets. If you can believe it, we’d been living in a chaotic, manually updated spreadsheet and wasting thousands of hours per year gathering this information. There had to be a better way.

Automating Applications

To monitor a large number of sites in real time, a few SaaS options exist, but the costs can be hard to justify. Moreover, most solutions are aimed at end-to-end testing and don’t offer the kind of customization we needed. With our needs so well-defined, it wasn’t very much work to write our own web scraper and determine the direction we needed to take.

The huge number of pages to load, though, meant that scaling horizontally was a must. Checking thousands of pages synchronously could take multiple days, which just wasn’t going to cut it when we needed daily reports!
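To make the scaling idea concrete, here is a minimal sketch of splitting a page list into batches and checking the batches in parallel. The function names (chunk, checkAll) and the checkBatch callback are hypothetical; in a real setup checkBatch would stand for one serverless function invocation per batch rather than a local function.

```javascript
// Split a list into batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run every batch concurrently and flatten the per-batch results.
// `checkBatch` is a placeholder for whatever does the actual page checks.
async function checkAll(urls, batchSize, checkBatch) {
  const results = await Promise.all(chunk(urls, batchSize).map(checkBatch));
  return [].concat(...results);
}
```

With thousands of pages and a batch size of, say, 25, the total wall-clock time is bounded by the slowest batch instead of the sum of all page loads.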

“Well, let’s look into this serverless thing.”

Web monitors and testers are a classic case for serverless. The service needs to be independent of our other infrastructure, run regularly, and NOT be anyone’s full-time job to manage! We didn’t have the time or the people to spend countless hours configuring resources, and we really didn’t want to be patching servers to keep it running a year in the future.

How it Works

We use Selenium and a headless Chrome driver to load our pages and write the results to a DynamoDB table. Initially, we tried to use PhantomJS but ran into problems when some of the sites we needed to measure couldn’t connect correctly. Unfortunately, we found ourselves confronted with a lot of “SSL Handshake Failed” and other common connection timeout/connection refused request errors.

The hardest part of switching to the ChromeDriver instead of PhantomJS is that it’s a larger package, and the max size for an AWS Lambda’s code package is 50 MB. We had to do quite a bit of work to get our function, with all its dependencies, down under the size limit.

The Trouble of Complexity

At this point, even though we now had a working Lambda, we weren’t completely out of the woods. Hooking up all the other services proved to be a real challenge. We needed our Lambdas to connect to DynamoDB, multiple S3 buckets, Kinesis streams, and an API Gateway endpoint. Then, in order to scale we needed to be able to build the same stack multiple times.

The Serverless Application Model (SAM) offers some relief from rebuilding and configuring stacks by hand in the AWS console, but the YAML syntax and the specifics of the AWS implementation make it pretty difficult to use freehand. For example, a timer to periodically trigger a Lambda is not a top-level element nor is it a direct child of the Lambda. Rather, it’s a ‘rule’ on a Lambda. There are no examples of this in the AWS SAM documentation.

At one point, we were so frustrated that we gave up and manually zipped up the package and uploaded via the AWS Console UI… at every change to our Lambdas! Scaling a lot of AWS services is simple, but we needed help to come up with a deployment and management process that could scale.

How Stackery Helps

It’s no surprise that when people first see the Stackery Operations Console, they assume it’s just a tool for diagramming AWS stacks. Connecting a Lambda to DynamoDB involves half a dozen menus on the AWS console, but Stackery makes it as easy as drawing a line.

Stackery outputs SAM YAML, meaning we don’t have to write it ourselves, and the changes show up as commits to our code repository so we can learn from the edits that Stackery makes.

It was very difficult to run a service even as simple as ours from scratch and now it’s hard to imagine ever doing it without Stackery. But if we ever did stop using the service, it’s nice to know that all of our stacks are stored in our repositories, along with the SAM YAML I would need to deploy those stacks via CloudFormation.

Results

With the headaches of managing the infrastructure out of the way, we could focus our efforts on the product and new features. Within a few months we were able to offload maintenance of the tool to a contractor. A simple request a few times a day starts the scanning/scraping process and the only updates needed are to the CSS selectors used to find pertinent elements.

Lastly, since we’re using all of these services on AWS, there’s no need to set up extra monitoring tools, or update them every few months, or generate special reports on their costs. The whole toolkit is rolled into AWS and best of all, upkeep is minimal!

Creating Your First AWS RDS Database with Stackery
Toby Fee | September 10, 2018

Once you’ve built your first Lambda, you’ll need a datastore.

AWS does have an official guide to help with this, but it is extensive. You need to set up at least four resources before you can deploy your first RDS table, and by the time their 2,000-word guide is done you still haven’t even selected your DB format.

Thankfully there’s a much easier way: the deployment service Stackery can get your RDS up and running and hooked up to Lambdas in just a few minutes.

Getting Started with RDS

AWS’s Relational Database Service (RDS) is a good choice if you’re familiar with SQL databases already or just prefer the reliable classic of tables over NoSQL.

At first glance, RDS looks like it’s basically just database hosting from Amazon, but in reality, it’s very different from their traditional server options hosted on EC2. When looking at the tooling closely, it’s obvious RDS is as different from a traditional DB as a Lambda is different from a web server.

A few key differences:

  1. No OS management: while you can select the exact DB type and version (e.g., PostgreSQL, MySQL, MariaDB, Oracle, Microsoft SQL Server, Aurora), the operating system level is fully virtualized.
  2. AWS’s usual restrictive access policies: on the whole, this is a positive thing, since Amazon forces you to be in tight control of which services can access each other. As a result, beginners often end up stumped when, after standing up their RDS, they want to connect to it from their laptop.
  3. Indescribably smooth uptime management: with the concept of an Availability Zone (AZ), you can create multiple instances that Amazon can automatically use for failover.

(Seen here: two availability zones in action)

  4. And finally, backups and scaling are simpler, because if virtualization doesn’t make scaling and backups easier, what the heck is it even for, right?

Deploy an RDS Table

Start by creating a new Stack in the Stackery UI and deleting the default nodes. We’ll add one node for our database and a second for a lambda to access DB data. You’ll need to set a DB name. Note that this is not the database name for DB requests, just the name it’ll be referred to in the Stackery UI. Set a root password here as well. Be sure to record the password somewhere for later use in this tutorial.

For the moment, our Lambda function will be fine with the default settings, but we’ll edit it later to retrieve records from the database.

This tutorial doesn’t cover populating the database in detail, since the most likely scenario is that you’ll have test data you’d like to populate. Let’s set up our Lambda to load records from the account DB.

To get access to edit your Lambda’s code: commit this stack in the Stackery UI (no need to deploy yet), then click the commit ID to go to the stack code in GitHub.

Next, clone the repository. Then you can edit it in whatever editor you prefer.

Let’s populate the Lambda with some code to access records from the DB:

const knex = require('knex');

// Assumption: the production database hostname is supplied through an
// environment variable. DB_HOST is a placeholder name; use whatever your
// stack actually provides.
const host = process.env.DB_HOST;

const dbInfo = {
  development: {
    client: 'mysql',
    connection: {
      host: 'localhost',
      user: 'root',
      password: 'mySecretPW',
      database: 'accounts'
    }
  },

  production: {
    client: 'mysql',
    connection: {
      host,
      user: 'root',
      password: process.env.DB_PASSWORD,
      database: 'accounts'
    }
  }
};

// Pick the connection settings by name, defaulting to local development.
const connectionName = process.env.CONNECTION || 'development';
const connection = dbInfo[connectionName];

/**
 * Fetch list of accounts from database and respond with an array of account
 * names.
 */
module.exports = async message => {
  const client = knex(connection);

  try {
    const records = await client('accounts').select('name');
    return records.map(record => record.name);
  } finally {
    // Always release the connections, even if the query throws.
    await client.destroy();
  }
};

A few things to note about this code:

  • As is the default for new Lambda Resources in Stackery, this is written for Node 8, which supports the async/await syntax used here.
  • This code relies on inline connection info, but you’ll more likely want to share that info between multiple functions. See how to share code modules between JavaScript functions in a separate guide.
  • If you’ve written code to access a database from a web app, you might be surprised to discover we’re creating and destroying a client for every request. Since we’re deploying this code to a serverless environment, it’s critical that we don’t max out our connections on smaller RDS instances. It’s quite easy for multiple simultaneous requests to our Lambda to reach 65 connections, which is the limit on a db.t2.micro instance. We’ve previously discussed database connection usage in a serverless context on our blog.

Push the new Lambda code up to the GitHub repository, and refresh the Stackery UI to show your updates.

The last step is to give the Lambda function the database password. It expects to find it as a config variable for this Lambda — which you can set from the Stackery UI — but this isn’t an ideal place to put the password, since it will be visible in both the Stackery UI and the GitHub repo for this stack.

Instead, set two environment variables: CONNECTION to production and DB_PASSWORD to ${config.dbPassword}, so the password will be populated from the environment config.

You can access ‘environments’ from the menu at the top right. Finally, set any environment variables as standard JavaScript Object Notation (JSON).
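As a small sketch of what that JSON might contain, the snippet below builds an environment configuration with a single key. Only the dbPassword key is implied by the reference above; the value is a placeholder, and the exact shape Stackery expects is an assumption here.

```javascript
// Hypothetical environment configuration. The dbPassword key matches the
// config.dbPassword reference used for the DB_PASSWORD variable; the value
// is a placeholder for the root password recorded earlier.
const environmentConfig = {
  dbPassword: 'replace-with-your-root-password',
};

// Print the JSON text to paste into the environment settings.
const json = JSON.stringify(environmentConfig, null, 2);
console.log(json);
```

Keeping the secret in the environment configuration rather than in the function settings means it does not end up in the stack’s repository.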

Now your stack should be ready to commit, prepare, and deploy!

AWS Serverless Application Model YAML Templates: An Introduction
Toby Fee | August 21, 2018

If you’ve never worked with YAML before, you’ll probably find the basic template format daunting. YAML, which somehow doesn’t stand for Yet Another Markup Language, but instead YAML Ain’t Markup Language, is a simplified way of storing data structures. YAML is popular with Rails users and sees some popularity with all higher order language programmers (JS, Python, etc.).

Clean, Simple Formatting

YAML is popular with languages that tend to see only three types of data:

  • Scalars: numbers and strings, the ‘text’ of your data
  • Maps: e.g. objects, dictionaries, or any other term for a mapping of key-value pairs
  • Lists: i.e. arrays or an ordered sequence of the other two data types

YAML primarily tries to produce configuration documents that are readable without any special interpretation and formatting. If you look at even a short snippet of YAML from the middle of a SAM configuration file the key-value pairs are pretty readable, and you can even guess how lists might be used:

   DynamoDBTable:
     Type: AWS::DynamoDB::Table
     Properties:
       AttributeDefinitions:
         - AttributeName: id
           AttributeType: S

Indentation is used for structure, colons separate key-value pairs, and dashes are used to create “bullet” lists.
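For readers more at home in JavaScript, here is a sketch of the data structure that the YAML snippet above describes, written as an object literal. The variable name is arbitrary; the keys and values are copied from the snippet.

```javascript
// The same structure as the YAML snippet: nested maps, one list, and
// string scalars at the leaves.
const snippet = {
  DynamoDBTable: {                      // map (indentation in YAML)
    Type: 'AWS::DynamoDB::Table',       // scalar
    Properties: {
      AttributeDefinitions: [           // list (dashes in YAML)
        { AttributeName: 'id', AttributeType: 'S' },
      ],
    },
  },
};

console.log(JSON.stringify(snippet, null, 2));
```

Each level of YAML indentation becomes a nested object, and each dash starts a new array element.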

This simple, clean formatting also brings us to the most common complaint about YAML: it is whitespace-dependent, and a single misplaced space of indentation can change the interpretation of the whole document. (YAML does not allow tab characters for indentation at all.)

YAML Basics

Always remember that YAML Lint is your friend. I still use yamllint.com every time I write YAML. Turning on a display for invisible characters and using purpose-built linters for your IDE are all well and good, but that final check of your YAML is still crucial.

YAML vs. XML

If you’re familiar with XML you may well have already interacted with systems that use either YAML or XML for configuration, or noticed that they are used by competing data storage systems. YAML is instantly readable, XML is not. On the other hand, the X in XML stands for “extensible” and while YAML really only has three ways to store data, the markup available in XML is limitless.

YAML vs. JSON

JavaScript Object Notation (JSON) and YAML look more similar and share design goals. While JSON is designed to be a bit easier to generate programmatically, the biggest difference for everyday use is that YAML has a method for referencing other items in the same file. This might seem pretty minor, but when standing up multiple API pathways in a SAM file, the ability to say TABLE_NAME: !Ref Table is a huge convenience.
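For comparison, !Ref is CloudFormation’s short-form tag, and the JSON form of a template spells the same reference as an object with a single Ref key. The sketch below shows that long form plus a toy resolver to illustrate the idea of a reference being replaced at deploy time. The resolver and the physical table name are purely illustrative and are not an AWS API.

```javascript
// Long form of `TABLE_NAME: !Ref Table` as it would appear in JSON.
const environmentVariables = { TABLE_NAME: { Ref: 'Table' } };

// Toy resolver: replaces every { Ref: <logical ID> } with a name looked up
// in `physicalNames`. This only imitates what happens during deployment.
function resolveRefs(value, physicalNames) {
  if (Array.isArray(value)) {
    return value.map((item) => resolveRefs(item, physicalNames));
  }
  if (value !== null && typeof value === 'object') {
    const keys = Object.keys(value);
    if (keys.length === 1 && keys[0] === 'Ref') {
      return physicalNames[value.Ref];
    }
    const resolved = {};
    for (const key of keys) {
      resolved[key] = resolveRefs(value[key], physicalNames);
    }
    return resolved;
  }
  return value;
}

// 'my-stack-Table-EXAMPLE' is a made-up physical table name.
console.log(resolveRefs(environmentVariables, { Table: 'my-stack-Table-EXAMPLE' }));
```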

The official specifications for YAML are well written and have some general guidance on best practices, implementation differences, and the design goals of YAML.

AWS Serverless Application Model

AWS SAM templates, which extend AWS CloudFormation templates, are a standardized specification for describing, documenting, and deploying components of a serverless application. Let’s look at one of the shortest possible SAM files:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  MyFirstFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs4.3
      CodeUri: s3://bucketName/codepackage.zip

With the Handler property set to index.handler, the code package at the CodeUri will be opened, and Lambda will look for a file (“index”) and an exported function (“handler”). Both the AWSTemplateFormatVersion and Transform should be the same for all your SAM files, and the Type and Runtime properties are self-explanatory.

Events

While the template above will create a Lambda, it’ll be fundamentally incomplete since it needs something to trigger a Lambda in order to run. The Events property defines these triggers.

In general you’d use SAM files when you want to define multiple interlinked pieces of your application. This example (leaving off the top three lines that will be the same in every SAM file) grabs events from a DynamoDB table:

  TableAlerter:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs6.10
      Events:
        Stream:
          Type: DynamoDB
          Properties:
            Stream: !GetAtt DynamoDBTable.StreamArn
            BatchSize: 100
            StartingPosition: TRIM_HORIZON  

Bringing it All Together

Most of the time SAM is the best choice because SAM can stand up an interlinked set of resources. So our SAM file will have more than one key under Resources.

Let’s stand up a table for the lambda above to read from:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  TableAlerter:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs6.10
      Events:
        Stream:
          Type: DynamoDB
          Properties:
            # Must match the logical ID of the table defined below.
            Stream: !GetAtt Alerts.StreamArn
            BatchSize: 100
            StartingPosition: TRIM_HORIZON

  Alerts:
    Type: AWS::DynamoDB::Table
    Properties:
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      ProvisionedThroughput:
        ReadCapacityUnits: 5
        WriteCapacityUnits: 5
      # A stream must be enabled for StreamArn to be available.
      StreamSpecification:
        StreamViewType: NEW_IMAGE

At this point it might be helpful to use anchors from the YAML specification to share config information or try the AWS SAM system for creating and sharing environment variables.
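For the TableAlerter function defined above, the code behind index.handler might look something like the following sketch. It assumes the standard DynamoDB stream event shape (a Records array with eventName and dynamodb.Keys) and a string key named id as in the table definition; what to do with each new item is left as a placeholder log line.

```javascript
'use strict';

// Sketch of a handler for DynamoDB stream batches. It collects the ids of
// newly inserted items and ignores MODIFY and REMOVE records.
function handler(event, context, callback) {
  const records = event.Records || [];
  const insertedIds = records
    .filter((record) => record.eventName === 'INSERT')
    .map((record) => record.dynamodb.Keys.id.S);

  // Placeholder for the real alerting logic.
  insertedIds.forEach((id) => console.log('New item in table: ' + id));

  callback(null, { processed: records.length, inserted: insertedIds });
}

exports.handler = handler;
```

With BatchSize set to 100, each invocation may receive up to 100 records at once, which is why the handler loops over the batch instead of expecting a single item.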

The official AWS documentation on SAM isn’t particularly instructive, with just a few examples and some tutorial references. However, the full specification is laid out in the AWSLabs GitHub project documentation.

Once you’ve mastered the basics, or if you’re feeling overwhelmed by the tool, you may want to use a service to create and deploy your stack via CloudFormation. Stackery does just that, and recently announced that Stackery now builds SAM templates natively.

Serverless for Total Beginners
Anna Spysz | August 16, 2018

As the newest member of the Stackery Engineering team and Stackery’s Resident N00b™, I have been wanting to explain what serverless is in the most beginner-friendly terms possible. This is my attempt to do so.

I recently graduated from a full-stack coding bootcamp, where I learned several ways to build and deploy a traditional (i.e. monolithic) web application and how to use containers to deploy an app, but nothing about serverless architecture. It wasn’t until I started my internship at Stackery that I even began to grasp what serverless is, and I’m still learning ten new things about it every day. While the concept of serverless functions and FaaS may seem daunting to new developers, I’ve found that it’s actually a great thing for beginners to learn; if done right, it can make the process of deployment a lot easier.

Above all, serverless is a new way of thinking about building applications. What’s exciting to me as a frontend-leaning developer is that it allows for most of the heavy lifting of your app to take place in the frontend, while cloud services handle typically backend aspects such as logging in users or writing values to a database. That means writing less code up front, and allows beginners to build powerful apps faster than the traditional monolith route.

So let’s dive in with some definitions.

What is a stack, and why are we stacking things?

A stack is essentially a collection of separate computing resources that work together as a unit to accomplish a specific task. In some applications, they can make up the entire backend of an app.

(Image: Stackery dashboard)

The above example is about as simple as you can get with a stack. It consists of a function and an object store. When triggered, the function manipulates the data stored in the object store (in this case, an S3 bucket on AWS).

A simple use case would be a function that returns a specific image from the bucket when triggered - say, when a user logs into an app, their profile picture could be retrieved from the object store.

Here’s a somewhat more complex stack:

(Image: Stackery dashboard)

This stack consists of a function (SignupHandler) that is triggered when someone submits an email address on a website’s newsletter signup form (Newsletter Signup API). The function takes the contents of that signup form, in this case a name and email address, and stores it in a table called Signup. It also has an error logger (another function called LogErrors), which records what happened should anything go wrong. If this stack were to be expanded, another function could email the contents of the Signup table to a user when requested, for example.

Under the hood, this stack is using several AWS services: Lambda for the functions, API Gateway for the API, and DynamoDB for the table.

Finally, here is a stack handling CRUD operations in a web application:

(Image: Stackery dashboard)

While this looks like a complex operation, it’s actually just the GET, PUT, POST, and DELETE methods connected to a table of users. Each of the functions is handling just one operation, depending on which API endpoint is triggered, and then the results of that function are stored in a table.

This kind of CRUD stack would be very useful in a web application that requires users to sign up and sign in to use. When a user signs up, the POST API triggers the createUser function, which simply pulls up the correct DynamoDB table and writes the values sent (typically username and password) to the table. The next time the user comes back to the app and wants to log in, the getUser function is called by the GET API. Should the user change their mind and want to delete their account, the deleteUser function handles that through the DELETE API.
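Here is a minimal sketch of the one-function-per-operation idea. It is a local simulation only: an in-memory Map stands in for the DynamoDB table, the handleRequest dispatcher stands in for the API endpoints, and the user fields are hypothetical. In the real stack each function would be its own Lambda, and the PUT operation is left out for brevity.

```javascript
// In-memory stand-in for the users table.
const usersTable = new Map();

// POST: write the submitted values to the table.
function createUser(user) {
  usersTable.set(user.username, user);
  return { statusCode: 201, body: user };
}

// GET: look up one user by username.
function getUser({ username }) {
  const user = usersTable.get(username);
  return user ? { statusCode: 200, body: user } : { statusCode: 404, body: null };
}

// DELETE: remove the user's record.
function deleteUser({ username }) {
  const existed = usersTable.delete(username);
  return { statusCode: existed ? 204 : 404, body: null };
}

// Stand-in for the API layer: each HTTP method triggers exactly one function.
const routes = new Map([
  ['POST', createUser],
  ['GET', getUser],
  ['DELETE', deleteUser],
]);

function handleRequest(method, payload) {
  const route = routes.get(method);
  return route ? route(payload) : { statusCode: 405, body: null };
}
```

Because each function does one thing, adding an operation means adding one function and one route rather than touching the others.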

Are microservices == or != serverless?

There is a lot of overlap between the concepts of microservices and serverless: both consist of small applications that do very specific things, usually as a part of a larger application. The main difference is how they are managed.

A complex web application - a storefront, for example - may consist of several microservices doing individual tasks, such as logging in users, handling a virtual shopping cart, and processing payments. In a microservice architecture, those individual apps still operate within a larger, managed application with operational overhead - usually a DevOps team making it all work smoothly together.

With serverless, the operational overhead is largely taken care of by the serverless platform where your code lives. In the case of a function on AWS Lambda, just about everything but the actual code writing is handled by the platform, from launching an instance of an operating system to run the code in your function when it is triggered by an event, to then killing that OS or container when it is no longer needed.

Depending on the demand of your application, serverless can make it cheaper and easier to deploy and run, and is generally faster to get up and running than a group of microservices.

Are monoliths bad?

To understand serverless, it’s helpful to understand what came before: the so-called “monolith” application. A monolith application has a complex backend that lives on a server (or more likely, many servers), either at the company running the application or in the cloud, and is always running, regardless of demand - which can make it expensive to maintain.

The monolith is still the dominant form of application, and certainly has its strengths. But as I learned when trying to deploy my first monolith app in school, it can be quite difficult for beginners to deploy successfully, and is often overkill if you’re trying to deploy and test a simple application.

So serverless uses servers?

(Image: Stackery dashboard)

Yes, there are still servers behind serverless functions, just as “the cloud” consists of a lot of individual servers.

After all, as the mug says, “There is no cloud, it’s just someone else’s computer”.

That’s true for serverless as well. We could just as well say, “There is no serverless, it’s just someone else’s problem.”

What I find great about serverless is that it gives developers, and especially beginning developers, the ability to build and deploy applications with less code, which means less of an overall learning curve. And for this (often frustrated) beginner, that’s quite the selling point.

Simple Authentication with AWS Cognito
Matthew Bradburn | August 07, 2018

I was recently doing some work related to AWS Cognito, which I wasn’t previously familiar with, and it turns out to be pretty interesting. Stackery has a cloud-based app for building and deploying serverless applications, and we use Cognito for our own authentication.

The thing I was trying to do was hard to figure out but easy once I figured it out, so I’ll include some code snippets related to my specific use case. I’m assuming this is only interesting for people who are doing something similar, so it’s partly a description of what we do and partly a HOW-TO guide for those who want to do similar things.

Cognito is Amazon’s cloud solution for authentication – if you’re building an app that has users with passwords, you can depend on AWS to handle the tricky high-risk security stuff related to storing login credentials instead of doing it yourself. Pricing is based on your number of monthly active users, and the first 50k users are free. For apps I’ve worked on, we would have been very pleased to grow out of the free tier. It can also do social login, such as “log in with Facebook” and so forth.

Part of the problem I had getting started with Cognito is the number of different architectures and authentication flows that can be implemented. You can use it from a smartphone app or a web app, and you may want to talk to Cognito from the front end as well as the back end. And then security-related APIs tend to be complicated in general.

In our case, we wanted to create user accounts from a back-end NodeJS server and we needed to do sign-in from a mostly-static website. Ordinarily you’d do sign-in from some more structured JavaScript environment like React. Doing it without React turns out not to be tricky, but a lot of the examples you’ll find aren’t applicable.

Account Creation

We create user accounts programmatically from our API server, which talks to Cognito as an administrator. We also create a user record in our own database for the user at that time, so we want to control that process. As I implied above, we don’t store user credentials ourselves. Our Cognito user pool is configured such that only admins can create users – the users do not sign themselves up directly.

Setting up the Cognito User Pool is easy once you know what to do. The Cognito defaults are good for what we’re doing, although we disable user sign-ups and set “Only allow administrators to create users”. We have a single app client, although you could have more. When we create the app client, we do not ask Cognito to generate a client secret – since we do login from a web page, there isn’t a good way to keep secrets of this type. We set “Enable sign-in API for server-based authentication”, named ADMIN_NO_SRP_AUTH. (“SRP” here stands for “Secure Remote Password”, which is a protocol in which a user can be authenticated by a remote server without sending their password over the network. It would be vital for doing authentication over an insecure network, but we don’t need it because the password only travels over HTTPS.)

Assuming you’re creating your own similar setup, you’ll need to note your User Pool ID and App Client ID, which are used for every kind of subsequent operation.

Cognito also makes a public key available that is used later to verify that the client has successfully authenticated. Cognito uses RSA, which involves a public/private key pair. The private key is used to sign a content payload, which is given to the client (it’s a JWT, JSON Web Token), and the client gives that JWT to the server in the header of its authenticated requests. Our API server uses the public key to verify that the JWT was signed with the private key.

There are actually multiple public keys involved for whatever reason, but they’re available from Cognito as a JWKS (“JSON Web Key Set”). To retrieve them you have to substitute your region and user pool ID and send a GET to this endpoint:

(https://cognito-idp.{region}.amazonaws.com/{userPoolId}/.well-known/jwks.json)
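As a minimal sketch of that GET, assuming Node's built-in https module, the key set could be fetched as below. The function names jwksUrl and fetchJwks are illustrative and not from our codebase.

```javascript
const https = require('https');

// Build the JWKS URL for a user pool by substituting region and pool ID.
function jwksUrl(region, userPoolId) {
  return `https://cognito-idp.${region}.amazonaws.com/${userPoolId}/.well-known/jwks.json`;
}

// Fetch and parse the key set. Resolves with the parsed JSON document,
// which carries the public keys in a "keys" array.
function fetchJwks(region, userPoolId) {
  return new Promise((resolve, reject) => {
    https.get(jwksUrl(region, userPoolId), (res) => {
      if (res.statusCode !== 200) {
        res.resume(); // discard the body so the socket is released
        reject(new Error(`JWKS request failed with status ${res.statusCode}`));
        return;
      }
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        try {
          resolve(JSON.parse(body));
        } catch (err) {
          reject(err);
        }
      });
    }).on('error', reject);
  });
}
```

A server would typically call fetchJwks once at startup and keep the result in memory, since the keys are needed for every authenticated request.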

To get a user account created from the website, we send an unauthenticated POST to our API server’s /accounts endpoint, where the request includes the user’s particulars (name and email address) and plaintext password – so this connection to the API server must obviously be over HTTPS. Our API server creates a user record in our database and uses that record’s key as our own user ID. Then we use the Cognito admin API to create the user.
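On the website side, the request might be assembled along these lines. This is only a sketch: the helper name buildAccountRequest, the JSON field names, and the api.example.com host are assumptions for illustration. The one real point it makes is refusing to send a plaintext password over anything but HTTPS.

```javascript
// Hypothetical helper: builds the URL and fetch() options for the sign-up POST.
function buildAccountRequest(apiBaseUrl, account) {
  const url = new URL('/accounts', apiBaseUrl);
  if (url.protocol !== 'https:') {
    throw new Error(`Refusing to send a plaintext password over ${url.protocol}`);
  }
  return {
    url: url.toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // Field names are assumptions; match whatever your API server expects.
      body: JSON.stringify({
        name: account.name,
        email: account.email,
        password: account.password
      })
    }
  };
}

// Usage in a browser:
//   const { url, options } = buildAccountRequest('https://api.example.com', account);
//   fetch(url, options).then((res) => { /* check res.ok */ });
```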

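const AWS = require('aws-sdk');
const cognito = new AWS.CognitoIdentityServiceProvider();

// Assumption: the pool and app client IDs come from environment variables.
// The variable names are illustrative; load them however suits your setup.
const USER_POOL_ID = process.env.USER_POOL_ID;
const USER_POOL_CLIENT_ID = process.env.USER_POOL_CLIENT_ID;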

// userId - our user record index key
// email - the new user's email address
// password - the new user's password
function createCognitoUser(userId, email, password) {
  let params = {
    UserPoolId: USER_POOL_ID, // From Cognito dashboard "Pool Id"
    Username: userId,
    MessageAction: 'SUPPRESS', // Do not send welcome email
    TemporaryPassword: password,
    UserAttributes: [
      {
        Name: 'email',
        Value: email
      },
      {
        // Don't verify email addresses
        Name: 'email_verified',
        Value: 'true'
      }
    ]
  };

  return cognito.adminCreateUser(params).promise()
    .then((data) => {
      // We created the user above, but the password is marked as temporary.
      // We need to set the password again. Initiate an auth challenge to get
      // started.
      let params = {
        AuthFlow: 'ADMIN_NO_SRP_AUTH',
        ClientId: USER_POOL_CLIENT_ID, // From Cognito dashboard, generated app client id
        UserPoolId: USER_POOL_ID,
        AuthParameters: {
          USERNAME: userId,
          PASSWORD: password
        }
      };
      return cognito.adminInitiateAuth(params).promise();
    })
    .then((data) => {
      // We now have a proper challenge, set the password permanently.
      let challengeResponseData = {
        USERNAME: userId,
        NEW_PASSWORD: password,
      };

      let params = {
        ChallengeName: 'NEW_PASSWORD_REQUIRED',
        ClientId: USER_POOL_CLIENT_ID,
        UserPoolId: USER_POOL_ID,
        ChallengeResponses: challengeResponseData,
        Session: data.Session
      };
      return cognito.adminRespondToAuthChallenge(params).promise();
    })
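    .catch((err) => {
      // Log and rethrow so the caller can tell that account creation failed.
      console.error(err);
      throw err;
    });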
}

Of course the server needs admin access to the user pool, which can be arranged by putting AWS credentials in environment variables or in a profile accessible to the server.

Cognito wants users to have an initial password that they must change when they first log in. We didn’t want to do it that way, so during the server-side account creation process, while we have the user’s plaintext password, we do an authentication and set the user’s desired password as a permanent password at that time. Once that authentication completes, the user password is saved only in encrypted form in Cognito. The authentication process gives us a set of access and refresh tokens as a result, but we don’t need them for anything on the server side.

Client Authentication

When the users later want to authenticate themselves, they do that directly with Cognito from a login web form, which requires no interaction with our API server. Our web page includes the Cognito client SDK bundle. You can read about it on NPM, where there’s a download link:

amazon-cognito-identity

Our web page uses “Use Case 4” described on that page, in which we call Cognito’s authenticateUser() API to get a JWT access token. That JWT is sent to our API server with subsequent requests in the HTTP Authorization header.
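Roughly, the sign-in call looks like the sketch below. The SDK object is passed in as a parameter (in a browser it would be the AmazonCognitoIdentity global from the bundle), and the signIn and authHeader names are illustrative. Using the "Bearer" scheme in the header is an assumption; send the token in whatever form your API server expects.

```javascript
// sdk: the Cognito client SDK object (AmazonCognitoIdentity in a browser).
// poolData: { UserPoolId, ClientId } noted when setting up the user pool.
// Resolves with the JWT access token as a string.
function signIn(sdk, poolData, username, password) {
  const userPool = new sdk.CognitoUserPool(poolData);
  const cognitoUser = new sdk.CognitoUser({ Username: username, Pool: userPool });
  const authDetails = new sdk.AuthenticationDetails({
    Username: username,
    Password: password
  });

  return new Promise((resolve, reject) => {
    cognitoUser.authenticateUser(authDetails, {
      onSuccess: (session) => resolve(session.getAccessToken().getJwtToken()),
      onFailure: reject
    });
  });
}

// Header to attach to subsequent API requests.
// Assumption: the API server expects the "Bearer" scheme.
function authHeader(jwt) {
  return { Authorization: `Bearer ${jwt}` };
}
```

Wrapping the callback API in a Promise is a design choice, not something the SDK requires; it just makes the token easier to chain into later requests.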

Server Verification

The API server needs to verify that the client is actually authenticated, and it does this by decoding the JWT. It has the public key set that we downloaded as above, and we follow the verification process described here:

decode-verify-jwt

The link has a good explanation, so I won’t repeat that.
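For a sense of the shape of it, here is a minimal sketch of those checks using only Node's built-in crypto module (it assumes a reasonably recent Node, 16 or later, for JWK import and base64url support). This is not our production code; the function names are illustrative, and a maintained JWT library is a reasonable alternative.

```javascript
const crypto = require('crypto');

// Decode one base64url-encoded JWT segment into an object.
function decodeSegment(segment) {
  return JSON.parse(Buffer.from(segment, 'base64url').toString('utf8'));
}

// token: the JWT from the Authorization header
// jwks: the parsed key set downloaded from Cognito ({ keys: [...] })
// expectedIssuer: https://cognito-idp.{region}.amazonaws.com/{userPoolId}
// Returns the token payload if every check passes, and throws otherwise.
function verifyCognitoJwt(token, jwks, expectedIssuer,
                          nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('Malformed JWT');
  const [headerB64, payloadB64, signatureB64] = parts;

  const header = decodeSegment(headerB64);
  if (header.alg !== 'RS256') throw new Error('Unexpected algorithm');

  // The header's key ID says which public key in the set signed the token.
  const jwk = jwks.keys.find((k) => k.kid === header.kid);
  if (!jwk) throw new Error('Unknown key ID');

  const publicKey = crypto.createPublicKey({ key: jwk, format: 'jwk' });
  const valid = crypto.verify(
    'RSA-SHA256',
    Buffer.from(`${headerB64}.${payloadB64}`),
    publicKey,
    Buffer.from(signatureB64, 'base64url')
  );
  if (!valid) throw new Error('Bad signature');

  const payload = decodeSegment(payloadB64);
  if (payload.exp <= nowSeconds) throw new Error('Token expired');
  if (payload.iss !== expectedIssuer) throw new Error('Wrong issuer');
  if (payload.token_use !== 'access') throw new Error('Not an access token');
  return payload;
}
```

The signature check comes before any claim is trusted, so an attacker cannot get a forged expiry or issuer past the later checks.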

One of the items in the JWT payload is the username, which allows us to look up our own user record for the authenticated user. And that’s all there is to it. I hope this saves someone some time!

Custom CloudFormation Resources: Real Ultimate Power
Chase Douglas

Chase Douglas | May 24, 2018


my ninja friend mark

Lately, I’ve found CloudFormation custom resources to be supremely helpful for many use cases. I actually wanted to write a post mimicking Real Ultimate Power:

Hi, this post is all about CloudFormation custom resources, REAL CUSTOM RESOURCES. This post is awesome. My name is Chase and I can’t stop thinking about custom resources. These things are cool; and by cool, I mean totally sweet.

Trust me, it would have been hilarious, but rather than spend a whole post on a meme that’s past its prime, let’s take a look at the real reasons why custom resources are so powerful!

an awesome ninja

What Are Custom Resources?

Custom resources are virtual CloudFormation resources that can invoke AWS Lambda functions. Inside the Lambda function you have access to the properties of the custom resource (which can include information about other resources in the same CloudFormation stack by way of Ref and Fn::GetAtt functions). The function can then do anything in the world as long as it (or another resource it invokes) reports success or failure back to CloudFormation within one hour. In the response to CloudFormation, the custom resource can provide data that can be referenced from other resources within the same stack.
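To make the reporting step concrete, here is a sketch of how a Node.js function might build and send that response. The helper names are illustrative; the field names follow the custom resource request and response objects that CloudFormation defines. The response goes back as an HTTPS PUT to the presigned URL that arrives in the event.

```javascript
const https = require('https');

// Build the response body CloudFormation expects for a custom resource event.
// status is 'SUCCESS' or 'FAILED'; data becomes readable via Fn::GetAtt.
function buildCustomResourceResponse(event, status, data = {},
                                     reason = 'See the function logs for details') {
  return {
    Status: status,
    Reason: reason,
    // Reuse the existing physical ID on Update/Delete; fall back on Create.
    PhysicalResourceId: event.PhysicalResourceId || event.LogicalResourceId,
    StackId: event.StackId,
    RequestId: event.RequestId,
    LogicalResourceId: event.LogicalResourceId,
    Data: data
  };
}

// PUT the response to the presigned URL supplied in the event.
// Resolves with the HTTP status code of the upload.
function sendCustomResourceResponse(event, response) {
  const body = JSON.stringify(response);
  const url = new URL(event.ResponseURL);
  const options = {
    hostname: url.hostname,
    path: url.pathname + url.search,
    method: 'PUT',
    headers: { 'content-type': '', 'content-length': Buffer.byteLength(body) }
  };
  return new Promise((resolve, reject) => {
    const req = https.request(options, (res) => {
      res.resume(); // drain the (empty) response body
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.end(body);
  });
}
```

Inside the function, the resource's properties are available as event.ResourceProperties, and event.RequestType says whether the resource is being created, updated, or deleted.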

another awesome ninja

What Can I Do With Custom Resources?

Custom resources are such a fundamental building block that the use cases they enable aren’t all obvious at first glance. Because they can be invoked once or on every deployment, they’re a powerful mechanism for lifecycle management of many resources. Here are a few examples:

You could even use custom resources to enable post-provisioning smoke/verification testing:

  1. A custom resource is “updated” as the last resource of a deployment (this is achieved by adding every other resource in the stack to its DependsOn property)
  2. The Lambda function backing the custom resource triggers smoke tests to run, then returns success or failure to CloudFormation
  3. If a failure occurs, CloudFormation automatically rolls back the deployment
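The steps above can be sketched as a small handler. Here runSmokeTests and sendResponse are hypothetical helpers you would supply: the first resolves to an array of failure messages, and the second reports the outcome back to CloudFormation via the event's response URL.

```javascript
// Map smoke-test results to the status CloudFormation expects.
function smokeTestOutcome(requestType, failures) {
  if (requestType === 'Delete') {
    // Nothing to verify when the stack is being torn down.
    return { Status: 'SUCCESS', Reason: 'Skipped on delete' };
  }
  if (failures.length > 0) {
    return { Status: 'FAILED', Reason: `Smoke tests failed: ${failures.join('; ')}` };
  }
  return { Status: 'SUCCESS', Reason: 'All smoke tests passed' };
}

// Build a Lambda handler from the two hypothetical helpers.
function makeHandler({ runSmokeTests, sendResponse }) {
  return async function handler(event) {
    let outcome;
    try {
      const failures =
        event.RequestType === 'Delete' ? [] : await runSmokeTests(event);
      outcome = smokeTestOutcome(event.RequestType, failures);
    } catch (err) {
      // Always report back; otherwise CloudFormation waits for the timeout.
      outcome = { Status: 'FAILED', Reason: `Smoke tests crashed: ${err.message}` };
    }
    await sendResponse(event, outcome);
    return outcome;
  };
}
```

Reporting FAILED is what triggers the automatic rollback in step 3, so the catch block matters: a handler that throws without responding leaves the deployment hanging rather than rolling it back promptly.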

Honestly, while I have begun using custom resources for many use cases, I discover new use cases all the time. I feel like I have hardly scratched the surface of what’s possible through custom resources.

And that’s what I call REAL Ultimate Power!!!!!!!!!!!!!!!!!!

more awesome ninja
