Today, we will discuss uploading files to AWS S3 using a serverless architecture. We will do so with the help of the following AWS services: API Gateway, AWS Lambda, and AWS S3.
To help with the complexity of building serverless apps, we will use the Serverless Framework, a mature, multi-provider (AWS, Microsoft Azure, Google Cloud Platform, Apache OpenWhisk, Cloudflare Workers, or a Kubernetes-based solution like Kubeless) framework for serverless architecture.
Configuring the Environment
Before starting the project, create your credentials for programmatic access to AWS through IAM, where you can specify which permissions your users should have. Now that you have the credentials, configure the Serverless Framework to use them when interacting with AWS.
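One way to register those credentials is the framework's built-in credentials command; a minimal sketch, with placeholders you would replace with your own IAM user's keys:

serverless config credentials --provider aws --key <ACCESS_KEY_ID> --secret <SECRET_ACCESS_KEY>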
Install the Serverless Framework (also known as SLS) through npm or the standalone binary. If you already have npm installed, just run "npm install serverless". You can also use VS Code to improve productivity. With everything configured, we are now ready to start the project.
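For reference, a typical install and sanity check might look like this, assuming you want the CLI available globally:

npm install -g serverless
serverless --version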
Starting the Project
To start an SLS project, type "sls" or "serverless", and the interactive prompt will guide you through creating a new serverless project.
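If you would rather skip the interactive prompt, the create command achieves the same result; the template and service path below are example choices:

sls create --template aws-nodejs --path upload-service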
After that, your workspace will have the following structure:
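With the aws-nodejs template, the generated workspace typically looks something like this:

.
├── .gitignore
├── handler.js
└── serverless.yml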
Lastly, run "npm init" to generate a package.json file that will be needed to install a library required for your function.
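If you want to accept all the defaults non-interactively, the -y flag does that:

npm init -y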
Now, your project is ready for you to define the AWS resources and the implementation of your Lambda function, which will receive the file and store it in an S3 bucket.
Defining Resources
To interact with AWS services, you can create them through the AWS Console, the AWS CLI, or through a framework that helps build serverless apps, such as the Serverless Framework.
The Serverless Framework defines resources for AWS using the CloudFormation template. Let's define the S3 bucket that stores the files that will be uploaded.
resources:
  Resources:
    ModuslandBucket:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: modusland${opt:stage, 'dev'}
S3 bucket definition
As you can see, the "BucketName" of the "ModuslandBucket" resource uses a variable holding the stage passed at deployment time, or the default value ('dev') if no stage is passed.
You can also define a "stage" property in the provider configuration at serverless.yml for later use. The stage is useful for distinguishing between development, QA, and production environments.
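A minimal sketch of what that provider-level stage property could look like, reusing the same default:

provider:
  name: aws
  stage: ${opt:stage, 'dev'}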
Now, define the AWS IAM Role (UploadRole) that your Lambda function will use to get access to S3 (respecting the IAM principle of least privilege) and to put the logs from the request into a CloudWatch log group.
resources:
  Resources:
    ModuslandBucket:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: modusland${opt:stage, 'dev'}
        AccessControl: PublicRead
    UploadRole:
      Type: AWS::IAM::Role
      Properties:
        AssumeRolePolicyDocument:
          Version: "2012-10-17"
          Statement:
            - Effect: Allow
              Principal:
                Service:
                  - lambda.amazonaws.com
              Action: sts:AssumeRole
        Policies:
          - PolicyName: upload-policy
            PolicyDocument:
              Version: "2012-10-17"
              Statement:
                - Effect: Allow
                  Action:
                    - s3:PutObject
                    - s3:PutObjectAcl
                  Resource: !Sub
                    - "arn:aws:s3:::${BucketName}/*"
                    - { BucketName: !Ref ModuslandBucket }
                - Effect: Allow
                  Action:
                    - logs:CreateLogGroup
                    - logs:CreateLogStream
                    - logs:PutLogEvents
                  Resource:
                    Fn::Sub:
                      - arn:aws:logs:${Region}:${AccountId}:log-group:/aws/lambda/*:*:*
                      - { Region: !Ref AWS::Region, AccountId: !Ref AWS::AccountId }
Role allowing uploads to the modusland bucket
The last configuration needed at the AWS level is to set up support for binary payloads in API Gateway. To do this, just add the following definitions in serverless.yml at the provider level.
provider:
  name: aws
  runtime: nodejs14.x
  lambdaHashingVersion: 20201221
  apiGateway:
    binaryMediaTypes:
      - 'multipart/form-data'
Binary payload support for API Gateway
Coding the Function
As the files will be multipart/form-data, we need something to extract the payload's content. Let's go with the "parse-multipart" package, as we are going to use Node.js for coding the function. You can run "npm install parse-multipart" to install it.
As web apps generally work with "multipart/form-data", this binary type was chosen to show you how to build a function integrating with the AWS ecosystem. But the approach is not restricted to this kind of binary data; just add the desired type to the binary media types in the API Gateway settings.
Another thing to note is that API Gateway passes binary files as base64-encoded strings, so you'll need to decode the body from base64 before passing it to parse-multipart.
So, in our handler, we receive the event from API Gateway and use "parseMultipart" to extract the file's content and name, then save the file into the S3 bucket. Note that the bucket name is an environment variable that will be injected into our function at deployment time.
Another thing to note is that we're setting the ACL (Access Control List) to "public-read", as our goal is to show you how to upload files to S3 using API Gateway (see best practices for defining AccessControl on S3 buckets and objects).
const AWS = require('aws-sdk');
const parseMultipart = require('parse-multipart');

// Bucket name injected by the Serverless Framework at deployment time
const BUCKET = process.env.BUCKET;

const s3 = new AWS.S3();

module.exports.handle = async (event) => {
  try {
    const { filename, data } = extractFile(event);

    // Store the file in the bucket, publicly readable
    await s3.putObject({
      Bucket: BUCKET,
      Key: filename,
      ACL: 'public-read',
      Body: data
    }).promise();

    return {
      statusCode: 200,
      body: JSON.stringify({ link: `https://${BUCKET}.s3.amazonaws.com/${filename}` })
    };
  } catch (err) {
    return {
      statusCode: 500,
      body: JSON.stringify({ message: err.stack })
    };
  }
};

function extractFile(event) {
  // API Gateway delivers binary payloads base64-encoded, so decode before parsing
  const boundary = parseMultipart.getBoundary(event.headers['content-type']);
  const parts = parseMultipart.Parse(Buffer.from(event.body, 'base64'), boundary);
  const [{ filename, data }] = parts;

  return { filename, data };
}
Lambda function
So now, we just need to configure our function in "serverless.yml". We declare our function inside the "functions" tree and give it a name and the other required attributes.
One important thing here is the events section, where we should specify "http" to integrate AWS API Gateway with our Lambda function.
API Gateway offers two ways to build HTTP endpoints that integrate with Lambda functions, other HTTP endpoints, or other AWS services. The first is HTTP APIs, which are designed for low-latency, cost-effective integrations. The second is REST APIs, a previous-generation option that currently offers more features. You can learn about the differences between them here.
HTTP APIs offer features that are handy for web apps, such as built-in CORS support and OIDC and OAuth 2 authorization. For this function, however, we'll use a REST API (the "http" event below), since that is what the binary media type support we configured at the provider level applies to.
In the above examples, we created a specific role for this function named "UploadRole". Now, it's time to use it by putting the "role" attribute under the function name. Another thing we used in our function was the environment variable for the bucket name; let's declare it under the function name, in an attribute named "environment".
functions:
  uploader:
    handler: handler.handle
    events:
      - http: POST /file/upload
    role: UploadRole
    environment:
      BUCKET: modusland${opt:stage, 'dev'}
Function definition
To complete our journey, we need to deploy our function to AWS, making the service available for testing. Running "sls deploy" will, by default, publish our service to the us-east-1 region on AWS and the "dev" stage. After the deployment process finishes, the Serverless Framework will list the URL to access our function.
SLS deployment output
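To target a different stage or region instead of the defaults, pass them as flags; the values here are only examples:

sls deploy --stage qa --region us-west-2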
Time to Test
To test our function, we will use Insomnia, an open-source API client that enables us to quickly and easily send REST, SOAP, GraphQL, and gRPC requests.
As the service is already configured, upload files using "multipart/form-data". Below is an example of using Insomnia to upload a single file.
Uploading using Insomnia
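If you prefer the command line, any client that can send multipart/form-data will do; for example, with curl (the endpoint URL and file name are placeholders for your own deployment output):

curl -F "file=@photo.jpg" https://<api-id>.execute-api.us-east-1.amazonaws.com/dev/file/upload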
You can now use the link to get your file from the S3 bucket wherever you want.
Conclusion
API Gateway limits request payloads to 10 MB, and AWS Lambda limits invocation payloads to 6 MB. As an alternative for uploading larger files to S3 buckets while still using a serverless architecture, you can expose an HTTP endpoint that returns a pre-signed URL from S3 for a later direct upload.
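A minimal sketch of that alternative, assuming the same BUCKET environment variable as before; the handler name and the JSON request shape are illustrative, not part of the code above:

const AWS = require('aws-sdk');

const s3 = new AWS.S3();

module.exports.getUploadUrl = async (event) => {
  // Hypothetical request shape: { "filename": "photo.jpg" }
  const { filename } = JSON.parse(event.body);

  // Short-lived URL the client can PUT the file to directly,
  // bypassing the API Gateway and Lambda payload limits
  const url = await s3.getSignedUrlPromise('putObject', {
    Bucket: process.env.BUCKET,
    Key: filename,
    Expires: 300 // seconds
  });

  return { statusCode: 200, body: JSON.stringify({ url }) };
};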
Nowadays, there is growing demand for serverless architecture, which makes uploading files to AWS S3 through API Gateway with AWS Lambda (Node.js) extremely useful. By simply following the above steps, you can make your own API to upload files to S3 buckets on AWS.
You can view the source code of this blog post here.
Rafael Waterkemper