serverless - freeCodeCamp.org

How to Deploy a Serverless Spam Classifier Using Scikit-Learn, AWS Lambda, & API Gateway

Rakshath Naik — Thu, 30 Apr 2026 05:06:15 +0000

In today's digital world, spam is no longer just an annoyance - it's a growing security threat. To combat this, developers often turn to machine learning to build intelligent filters that can distinguish legitimate emails from malicious ones.

While building a machine learning model in a notebook is relatively straightforward, the real challenge lies in the last mile: deploying that model into a scalable, production-ready system that users can actually interact with.

In this project, I built an end-to-end serverless spam classifier, combining Scikit-learn for model development with AWS Lambda, Amazon S3, and Amazon API Gateway for deployment. The result is a lightweight, scalable API that can classify messages in real time.

The system is designed to be modular and cost-efficient, allowing the model to be retrained and updated independently without affecting the live API. From detecting "free iPhone" scams to identifying phishing attempts, this project demonstrates how to bridge the gap between machine learning experimentation and real-world deployment.

Prerequisites
Building the Brain: The Model
Deploying the Model to AWS
How to Run The Project Locally
Our Project Architecture
Conclusion: The Power of Serverless AI
Acknowledgment / References

1. Prerequisites

Fundamental skills: Basic proficiency in Python and understanding of Machine Learning concepts like classification.
AWS account: Access to an AWS account with permissions for Lambda, S3, and API Gateway.
Environment: Python 3.11 installed, along with libraries like scikit-learn, pandas, and joblib.
AWS CLI: Configured on your local machine for file uploads.
HuggingFace account: You can directly download the model from my account.

2. Building the Brain: The Model

Photo by Steve A Johnson on Unsplash

At the heart of this project lies a supervised learning approach. Instead of simply specifying which words are considered spam, we'll provide the computer with a dataset and an algorithm, enabling it to learn and identify spam patterns on its own.

1. Vectorization: Turning Text into Math

Machine Learning models can't read text. They require numerical input. To solve this, we used the TF-IDF (Term Frequency-Inverse Document Frequency) Vectorizer.

feature_extraction = TfidfVectorizer(min_df=1, stop_words='english', lowercase=True)
X_train_features = feature_extraction.fit_transform(X_train

Here's the mathematical formula:

$$w_{i,j} = tf_{i,j} \times \log \left( \frac{N}{df_i} \right)$$

TF-IDF term definitions:

wᵢ,ⱼ (Weight): The final importance score of a specific word in a document.
tfᵢ,ⱼ (Term Frequency): How often a word appears in a single email.
N (Total Documents): The total count of all emails in your dataset.
dfᵢ (Document Frequency): The number of different emails that contain this specific word.
log(N/dfᵢ) (IDF): A penalty that lowers the score of common words like the or is that appear everywhere.

It cleans the data by removing common words, converts all text to lowercase for consistency, and assigns more importance to rare and meaningful words while giving less importance to frequently used words.

2. Training: The Logistic Regression Engine

We'll use Logistic Regression here, a classification algorithm that predicts the probability of an outcome.

In this stage, we feed our vectorized training data into the Logistic Regression algorithm. The goal is to establish a mathematical relationship between specific word weights and the Spam or Ham label.

During training, the model iteratively adjusts its internal parameters to minimize error, eventually learning that words like winner or free correlate highly with spam, while conversational language correlates with legitimate messages.

model = LogisticRegression()
model.fit(X_train_features, Y_train)

In our case, it calculates the probability that an email belongs to spam or HAM.

The algorithm uses the Sigmoid function to map any real-valued number into a value between 0 and 1.

$$P(y=1|x) = \frac{1}{1 + e^{-(z)}}$$

where z = β₀ + β₁x₁ + … + βₙxₙ.

3. Evaluation: Testing the Intelligence

After training, we need to verify if the brain actually works on data it hasn't seen before.

prediction_on_test_data = model.predict(X_test_features)
accuracy_on_test_data = accuracy_score(Y_test, prediction_on_test_data)

By comparing the model’s predictions against the actual labels in our test set, we calculate an Accuracy Score. This gives us the confidence that the model is ready for the real world (achieving ~94% accuracy in our tests).

4. Exporting the Logic (Serialization)

To move this brain from our local Python environment to the AWS Cloud, we'll use Joblib to save our work into binary files (.pkl).

joblib.dump(model, 'spam_model.pkl')
joblib.dump(feature_extraction, 'vectorizer.pkl')

We use the Pickle format because it allows us to freeze complex Python objects (mathematical weights and word mappings) into a portable binary format that can be instantly re-animated in the cloud.

We need the Vectorizer to translate new user text into the exact numerical coordinates the Model was trained to understand. Using one without the other is like having a key but no lock.

The trained Logistic Regression model and TF-IDF vectorizer are openly available for the community on Hugging Face here: Get the model on HuggingFace.

3. Deploying the Model to AWS

Training a model is science, while deploying it is engineering. To make this classifier accessible to the world, we'll use a serverless stack that scales automatically and incurs nearly no maintenance costs.

1. Model Storage: Amazon S3

First, we'll uploade our .pkl files to an S3 bucket. By decoupling the model from the code, we can update the AI's intelligence (simply by overwriting the file in S3) without redeploying the backend code. It makes the system highly maintainable.

2. The Production Backend: AWS Lambda

To make the AI accessible, we'll move from a local script to a Serverless Cloud Architecture. This ensures the model is always available without the cost of a 24/7 server.

The deployment environment is AWS Lambda (Python 3.11). Since Lambda is a lightweight environment, it doesn't include Scikit-Learn or Joblib. To provide these, we'll download and store them in our S3 bucket and import them through the layers.

Commands in AWS CLI:


# 1. Create a workspace
mkdir ml_layer && cd ml_layer

# 2. Install scikit-learn and its dependencies into a folder
pip install \
    --platform manylinux2014_x86_64 \
    --target=python/lib/python3.11/site-packages \
    --implementation cp \
    --python-version 3.11 \
    --only-binary=:all: \
    scikit-learn joblib

# 3. Zip the folder
zip -r sklearn_lib.zip python

# 4. Upload to S3 (Using AWS CLI)
aws s3 cp sklearn_lib.zip s3://YOUR-BUCKET-NAME/

We store the Scikit-Learn library as a ZIP in S3 to bypass the AWS Lambda deployment package size limit. This allows the function to dynamically load heavy dependencies only when needed without bloating the core code.

The Lambda Function:


import json
import boto3
import os
import sys
from io import BytesIO

# Ensures the custom Lambda layer(containing sklearn/joblib)
sys.path.append('/opt/python')

try:
    import joblib
except ImportError:
    # Fallback for specific Scikit-Learn distributions
    from sklearn.utils import _joblib as joblib

# Initialize S3 client
s3 = boto3.client('s3')

# Use placeholders for the article so readers can insert their own values
BUCKET_NAME = 'YOUR_S3_BUCKET_NAME' 
MODEL_KEY = 'spam_model.pkl'
VECTORIZER_KEY = 'vectorizer.pkl'

# Global variables for 'Warm Start' caching (improves performance by keeping model in RAM)
model = None
vectorizer = None

def load_model():
    """Downloads model files from S3 only if they aren't already in RAM"""
    global model, vectorizer
    if model is None or vectorizer is None:
        try:
            # 1. Load the Logistic Regression Model from S3
            m_obj = s3.get_object(Bucket=BUCKET_NAME, Key=MODEL_KEY)
            model = joblib.load(BytesIO(m_obj['Body'].read()))
            
            # 2. Load the TF-IDF Vectorizer directly from S3
            v_obj = s3.get_object(Bucket=BUCKET_NAME, Key=VECTORIZER_KEY)
            vectorizer = joblib.load(BytesIO(v_obj['Body'].read()))
        except Exception as e:
            raise Exception(f"Failed to load .pkl files from S3: {str(e)}")

def lambda_handler(event, context):
    try:
        # Ensure model and vectorizer are ready before processing
        load_model()
        
        # Handles both direct Lambda tests and API Gateway POST requests
        body = event.get('body', event)
        if isinstance(body, str):
            body = json.loads(body)
            
        text = body.get('text', '')
            
        if not text:
            return {
                'statusCode': 400,
                'body': json.dumps({'error': 'No text provided.'})
              }

        # 1. Transform input text to numeric features using the trained Vectorizer
        data_vec = vectorizer.transform([text])
        
        # 2. Predict using the Logistic Regression Model 
        prediction = int(model.predict(data_vec)[0])
        
      # 3. Map numeric result to human-readable label
        result_label = "HAM" if prediction == 1 else "SPAM"
        
        # RESPONSE WITH CORS
        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*' # needed for cross-domain web integration
            },
            'body': json.dumps({
                'status': 'success',
                'classification': result_label,
                'input_text': text
            })
        }
        
    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps({'error_message': f"Inference Error: {str(e)}"})
        }

Key features of the Lambda function:

Warm start caching: By defining the model and vectorizer variables outside the lambda_handler, we store them in the container's memory. This significantly reduces cold start latency for subsequent requests.
Dynamic dependency loading: The sys.path.append('/opt/python') line allows us to import heavy libraries from S3/Layers without exceeding the upload limit.
Bimodal input handling: The function is designed to handle both direct JSON testing from the AWS console and stringified payloads sent via API Gateway.

3. The API Gateway - The Bridge to the Web

Photo by Growtika on Unsplash

Creating the REST API

Next we'll create a REST API with a single POST method. Why POST, you might be wondering? Well, we need to securely send a JSON payload containing the user’s text message to our model.

First navigate to the Amazon API Gateway console and select Create API -> REST API.
Give your API a name, such as EmailSpamPredictor-API, and set the Endpoint Type to Regional.
Then in the left sidebar, click Resources and enter a resource name (e.g: / predict as entered by me)
Next click the create method and select POST and then select Lambda Function for integration type
Ensure Lambda Proxy integration is enabled (this allows the full request to pass through to your code).

The CORS Configuration (The Troubleshooting Hub)
This is where many developers encounter the dreaded Connection Error. Since our API is hosted on AWS, and if your front-end is on a separate website, the browser’s Same-Origin Policy will block the request by default.

To fix this, we'll enable CORS:

Access-Control-Allow-Origin: Set to * (or specifically to your domain) to tell the browser that the API is allowed to talk to your front-end.
The OPTIONS method: API Gateway creates an OPTIONS method automatically. This handles the Preflight request where the browser asks, “Are you allowed to receive data from me?” before sending the actual text.
Access-Control-Allow-Headers: In the screenshot, you'll notice headers like Content-Type and Authorization are allowed. This ensures that when our JavaScript fetch() call sets the content type to application/json, the API Gateway doesn't reject it.

Image illustrates the CORS configuration for our project. (Image by author)

Deployment Stages

Once the API is deployed to a production stage, AWS generates a permanent Invoke URL. This acts as the public gateway to our model and typically follows this structure: https://[api-id].execute-api.[region].amazonaws.com/prod/classify.

Connecting the Frontend (The JavaScript Layer)

With the API live, we can now write a simple JavaScript function to talk to our model. This script runs whenever a user clicks the Analyze button on your site.


async function checkSpam() {
    const message = document.getElementById("userInput").value;
    const apiUrl = "YOUR_API_GATEWAY_INVOKE_URL";

    try {
        const response = await fetch(apiUrl, {
            method: "POST",
            headers: {
                "Content-Type": "application/json"
            },
            body: JSON.stringify({ "text": message })
        });

        const data = await response.json();
        
        // Display result on the webpage
        const resultElement = document.getElementById("result");
        resultElement.innerText = `Prediction: ${data.classification}`;
        resultElement.style.color = data.classification === "SPAM" ? "red" : "green";

    } catch (error) {
        console.error("Error:", error);
        alert("Could not connect to the Spam Detector API.");
    }
}

4. How to Run The Project Locally

You can store the front-end as an HTML file. Once it's ready, you shouldn’t just double-click the .html file. Opening it as a file in your browser can cause security restrictions. Instead, you should host it using a simple local server.

Step 1: Open the terminal or Command Prompt.

Step 2: Navigate to your project folder

cd [PATH_TO_YOUR_FOLDER]

Step 3: Start a local Python web server.

python -m http.server 8000

Step 4: Access the application.

Open your browser and navigate to:
http://localhost:8000/your-file-name.html

Watch the Demo:

5. Our Project Architecture

The image illustrates the architecture of our project (Building a Serverless Spam Classifier). It shows the process that takes place from the client input to the final model output. (Image by Author)

Client Front-End Interaction: The process starts on the far left. A user interacts with the web interface (for example, a website or a desktop app). They input text like WIN free iPhone now and trigger a request.
The Entry Point: API Gateway: The request hits the Amazon API Gateway, which acts as the security guard and translator.
(a) CORS OPTIONS handles the pre-flight handshake to ensure the browser has permission to talk to the AWS cloud.
(b) Classification Request (POST) routes the actual message data to your backend logic.
The Engine: AWS Lambda (Python 3.11): The central “lightbulb” represents your Lambda function. This is where the code you wrote lives. It doesn’t run 24/7 – it only wakes up when a request arrives.
Storage & Retrieval: S3 Bucket: Since Lambda is lightweight, it doesn’t store your heavy Machine Learning files internally.
Dependency and Model Download: The function reaches out to the S3 Bucket to pull in the sklearn_lib.zip (the engine) and the .pkl files (the intelligence).
Required Dependency and Model: These assets are loaded into the Lambda’s temporary memory to prepare for the prediction.
The Inference Pipeline: Inside the Lambda, a three-step mathematical cycle occurs:
(a) Text Vectorizer: Translates the words into numbers.
(b) Logistic Regression: Calculates the probability of spam based on those numbers.
(c) Label: Assigns a final result (Spam or Ham).
The Result Delivery: The result is sent back through the API Gateway, including the necessary CORS Headers to ensure the browser accepts it. The front-end then updates to show the “Result: SPAM” with a visual indicator.

6. Conclusion: The Power of Serverless AI

By merging the mathematical simplicity of Logistic Regression with the industrial strength of AWS Serverless Architecture, we have transformed a static Python script into a globally accessible, scalable API.

This project demonstrates that you don’t need a massive budget or a 24/7 dedicated server to deploy high-quality Machine Learning.

Using the S3-to-Lambda workaround allowed us to bypass common storage hurdles, ensuring that our Brain (the model) and its Muscle (Scikit-Learn) could function seamlessly within the cloud’s ephemeral environment. It bridges the gap between experimentation and real-world applications, making AI systems practical, efficient, and accessible.

7. Acknowledgment / References

Pre-trained spam classification model: View on Hugging Face (rakshath1/mail-spam-detector · Hugging Face)
Scikit-learn Documentation
AWS Lambda Documentation
Amazon S3 Documentation
Amazon API Gateway Documentation

Connect With Me

You may also like

How to Build a Full-Stack CRUD App with React, AWS Lambda, DynamoDB, and Cognito Auth

Benedicta Onyebuchi — Tue, 17 Mar 2026 15:13:02 +0000

Building a web application that works only on your local machine is one thing. Building one that is secure, connected to a real database, and accessible to anyone on the internet is another challenge entirely. And it requires a different set of tools.

Most production web applications share a common set of needs: they store and retrieve data, they expose that data through an API, they require users to authenticate before accessing sensitive operations, and they need to be deployed somewhere reliable and fast.

Meeting all of those needs used to require managing servers, configuring databases, handling authentication infrastructure, and provisioning hosting environments – often as separate, manual processes.

AWS changes that model significantly. With the combination of services you'll use in this tutorial (Lambda, DynamoDB, API Gateway, Cognito, and CloudFront), you can build and deploy a fully functional, secured, globally distributed application without managing a single server.

Each service handles one specific responsibility:

DynamoDB stores your data
Lambda runs your business logic on demand
API Gateway exposes your functions as a REST API
Cognito manages user authentication
CloudFront delivers your frontend worldwide over HTTPS.

The AWS CDK (Cloud Development Kit) ties all of this together by letting you define every one of those services as TypeScript code. Instead of clicking through the AWS Console to configure each resource manually, you describe your entire infrastructure in a single file and deploy it with one command.

By the end of this tutorial, you will have a fully deployed vendor management dashboard. Users can sign up, log in, and then create, read, and delete vendors, with all data securely stored in AWS DynamoDB and all routes protected by Amazon Cognito authentication.

What You'll Build

In this handbook, you'll build a two-panel web app where authenticated users can:

Add a new vendor (name, category, contact email)
View all saved vendors in real time
Delete a vendor from the list
Sign in and sign out securely

The frontend is built with Next.js. The backend runs entirely on AWS: DynamoDB stores the data, Lambda functions handle the logic, API Gateway exposes a REST API, Cognito manages authentication, and CloudFront serves the app globally over HTTPS.

Who This Is For
Prerequisites
Architecture Overview
Part 1: Set Up Your AWS Account and Tools
Part 2: Set Up the Project Structure
Part 3: Define the Database (DynamoDB)
Part 4: Write the Lambda Functions
Part 5: Build the API with API Gateway
Part 6: Deploy the Backend to AWS
Part 7: Build the React Frontend
Part 8: Add Authentication with Amazon Cognito
Part 9: Deploy the Frontend with S3 and CloudFront
What You Built
Conclusion

Who This Is For

This tutorial is for developers who know basic JavaScript and React but have never used AWS. You don't need any prior backend, cloud, or DevOps experience. I'll explain every AWS concept before we use it.

Prerequisites

Before starting, make sure you have the following installed and available:

Node.js 18 or higher: Download here
npm: Included with Node.js
A code editor: I recommend VS Code
A terminal: Any terminal on macOS, Linux, or Windows (WSL recommended on Windows)
An AWS account: You will create one in Part 1. A credit card is required, but the Free Tier covers everything in this tutorial.
Basic familiarity with React and TypeScript: You should understand components, useState, and useEffect.

Architecture Overview

Before writing any code, here's a plain-English description of how the pieces fit together.

When a user clicks "Add Vendor" in the React app:

The frontend reads the user's JWT auth token from the browser session
It sends a POST request to API Gateway, including the token in the request header
API Gateway checks the token against Cognito. If the token is invalid or missing, it rejects the request with a 401 error immediately
If the token is valid, API Gateway passes the request to the createVendor Lambda function
The Lambda function writes the new vendor to DynamoDB
DynamoDB confirms the write, and the Lambda returns a success response
The frontend re-fetches the vendor list and updates the UI

The same flow applies to reading and deleting vendors, with different Lambda functions and HTTP methods.

How the app is deployed: Your React app is exported as a static site, uploaded to an S3 bucket, and served globally through CloudFront. Your backend infrastructure (Lambda functions, API Gateway, DynamoDB, Cognito) is defined in TypeScript using AWS CDK and deployed with a single command.

Part 1: Set Up Your AWS Account and Tools

Before writing any application code, you need three things in place: an AWS account, the right tools on your machine, and credentials that let those tools communicate with AWS on your behalf.

1.1 Create Your AWS Account

If you don't have an AWS account:

Go to https://aws.amazon.com
Click Create an AWS Account
Follow the sign-up prompts and add a payment method
Once registered, log in to the AWS Management Console

AWS has a Free Tier that covers all the services used in this tutorial. You won't be charged for normal use while following along.

1.2 Install the AWS CLI and CDK

The AWS CLI is a command-line tool that lets you interact with AWS from your terminal: checking resources, configuring credentials, and more.

The AWS CDK (Cloud Development Kit) is the tool you will use to define your entire backend (database, Lambda functions, API) using TypeScript code. Instead of clicking through the AWS Console to create each resource, you describe what you want in a TypeScript file and CDK builds it for you.

Install both:

# Install AWS CLI (macOS)
curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
sudo installer -pkg AWSCLIV2.pkg -target /

# For Linux, see: https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2-linux.html
# For Windows, see: https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2-windows.html

# Install AWS CDK globally
npm install -g aws-cdk

Verify both are installed:

aws --version
cdk --version

Both commands should print a version number. If they do, you are ready to move on.

1.3 Configure Your AWS Credentials (IAM)

This step is critical. Your terminal needs a set of credentials – like a username and password – to act on your behalf inside AWS.

Think of your root account (the one you signed up with) as the master key to your entire AWS account. You should never use it for day-to-day development. Instead, you will create a separate IAM user with its own set of keys. If those keys are ever exposed, you can delete them without compromising your root account.

Phase 1: Create an IAM User

Log in to the AWS Console and search for IAM in the top search bar
In the left sidebar, click Users, then click Create user
Name the user cdk-dev. Leave "Provide user access to the AWS Management Console" unchecked – you only need terminal access, not console access
On the permissions screen, choose Attach policies directly

Search for AdministratorAccess and check the box next to it

Note on permissions: In a production job you would use a more restricted policy. For this tutorial, Administrator access is needed because CDK creates many different types of AWS resources.

6. Click through to the end and click Create user

Phase 2: Generate Access Keys

Click on your newly created cdk-dev user from the Users list
Go to the Security credentials tab
Scroll down to Access keys and click Create access key
Select Command Line Interface (CLI), check the acknowledgment box, and click Next
Click Create access key

Important: Copy both the Access Key ID and the Secret Access Key right now. You will never be able to see the Secret Access Key again after closing this screen. Save both values in a password manager or secure note.

Phase 3: Connect Your Terminal to AWS

Run the following command in your terminal:

aws configure

You will be prompted for four values:

AWS Access Key ID:     [paste your Access Key ID]
AWS Secret Access Key: [paste your Secret Access Key]
Default region name:   us-east-1
Default output format: json

Use us-east-1 as your region for this tutorial. After this step, every CDK and AWS CLI command you run will use these credentials automatically.

Part 2: Set Up the Project Structure

You will use a monorepo layout – one top-level folder with two sub-projects inside: frontend for your React app and backend for your AWS infrastructure code. They are deployed independently but live side by side.

2.1 Create the Workspace

mkdir vendor-tracker && cd vendor-tracker
mkdir backend frontend

2.2 Initialize the Frontend (Next.js)

Navigate into the frontend folder and run:

cd frontend
npx create-next-app@latest .

When prompted, choose the following options:

TypeScript --> Yes
ESLint --> Yes
Tailwind CSS --> Yes
src/ directory -->No
App Router --> Yes
Import alias --> No

2.3 Initialize the Backend (CDK)

Navigate into the backend folder and run:

cd ../backend
cdk init app --language typescript

This generates a boilerplate CDK project. The most important file it creates is backend/lib/backend-stack.ts. This is where you will define all of your AWS infrastructure as TypeScript code.

Also install esbuild, which CDK uses to bundle your Lambda functions:

npm install --save-dev esbuild

2.4 Understanding CDK Before You Write Any Code

CDK is likely different from most tools you have used. Here is how it works:

Normally, you would create AWS resources by clicking through the AWS Console: create a table here, configure a Lambda function there. CDK lets you do all of that using TypeScript code instead.

When you run cdk deploy, CDK reads your TypeScript file, converts it into an AWS CloudFormation template (an internal AWS format for describing infrastructure), and submits it to AWS. AWS then creates all the resources you described.

A few terms you will see throughout this tutorial:

Stack: The collection of all AWS resources you define together. Your BackendStack class is your stack.
Construct: Each individual AWS resource you create inside a stack (a table, a Lambda function, an API) is called a construct.
Deploy: Running cdk deploy sends your TypeScript definition to AWS and creates or updates the real resources.

The main file you'll work in is backend/lib/backend-stack.ts. Think of it as the blueprint for your entire backend.

Your final project structure will look like this:

vendor-tracker/
├── backend/
│   ├── lambda/
│   │   ├── createVendor.ts
│   │   ├── getVendors.ts
│   │   └── deleteVendor.ts
│   ├── lib/
│   │   └── backend-stack.ts
│   └── package.json
└── frontend/
    ├── app/
    │   ├── layout.tsx
    │   ├── page.tsx
    │   └── providers.tsx
    ├── lib/
    │   └── api.ts
    ├── types/
    │   └── vendor.ts
    └── .env.local

Part 3: Define the Database (DynamoDB)

DynamoDB is AWS's NoSQL database. Think of it as a fast, scalable key-value store in the cloud. Every item in a DynamoDB table must have a unique ID called the partition key. For your vendor table, that key will be vendorId.

Open backend/lib/backend-stack.ts. Replace the entire file contents with the following:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

export class BackendStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // 1. DynamoDB Table
    const vendorTable = new dynamodb.Table(this, 'VendorTable', {
      partitionKey: {
        name: 'vendorId',
        type: dynamodb.AttributeType.STRING,
      },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      removalPolicy: cdk.RemovalPolicy.DESTROY, // For development only
    });
  }
}

What each line does:

partitionKey tells DynamoDB that vendorId is the unique identifier for every record. No two vendors can share the same vendorId.
PAY_PER_REQUEST means you only pay when data is actually read or written. There is no charge when the table is idle, which makes it cost-effective for learning.
RemovalPolicy.DESTROY means the table will be deleted when you run cdk destroy. For production apps you would not use this.

Part 4: Write the Lambda Functions

A Lambda function is your server, but unlike a traditional server, it only runs when it's called. AWS spins it up on demand, runs your code, and shuts it down. You're only charged for the time your code is actually running.

You'll write three Lambda functions:

createVendor.ts: Adds a new vendor to DynamoDB
getVendors.ts: Returns all vendors from DynamoDB
deleteVendor.ts: Removes a vendor from DynamoDB by ID

Create a new folder inside backend:

mkdir backend/lambda

A Note on the AWS SDK

All three Lambda functions use AWS SDK v3 (@aws-sdk/client-dynamodb and @aws-sdk/lib-dynamodb). This is the current standard. An older version of the SDK (aws-sdk) exists but is deprecated and not bundled in the Node.js 18 Lambda runtime, which is what you'll use. Stick to v3 throughout.

4.1 Create Vendor Lambda

Create backend/lambda/createVendor.ts:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb";
import { randomUUID } from "crypto";

const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);

export const handler = async (event: any) => {
  try {
    const body = JSON.parse(event.body);

    const item = {
      vendorId: randomUUID(), // Generates a collision-safe unique ID
      name: body.name,
      category: body.category,
      contactEmail: body.contactEmail,
      createdAt: new Date().toISOString(),
    };

    await docClient.send(
      new PutCommand({
        TableName: process.env.TABLE_NAME!,
        Item: item,
      })
    );

    return {
      statusCode: 201,
      headers: {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Headers": "Content-Type,Authorization",
        "Access-Control-Allow-Methods": "OPTIONS,POST,GET,DELETE",
      },
      body: JSON.stringify({ message: "Vendor created", vendorId: item.vendorId }),
    };
  } catch (error) {
    console.error("Error creating vendor:", error);
    return {
      statusCode: 500,
      headers: { "Access-Control-Allow-Origin": "*" },
      body: JSON.stringify({ error: "Failed to create vendor" }),
    };
  }
};

What each part does:

randomUUID() generates a universally unique ID using Node's built-in crypto module. No extra package is needed. This is more reliable than Date.now(), which can produce duplicate IDs if two requests arrive within the same millisecond.
process.env.TABLE_NAME reads the DynamoDB table name from an environment variable. You'll set this value in the CDK stack. This avoids hardcoding the table name inside your Lambda code.
The headers block is required for CORS (Cross-Origin Resource Sharing). Without Access-Control-Allow-Origin, your browser will block responses from a different domain than your frontend. Without Access-Control-Allow-Headers, the Authorization header you add later for Cognito will be rejected during the browser's preflight check.

4.2 Get Vendors Lambda

Create backend/lambda/getVendors.ts:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, ScanCommand } from "@aws-sdk/lib-dynamodb";

const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);

export const handler = async () => {
  try {
    const response = await docClient.send(
      new ScanCommand({
        TableName: process.env.TABLE_NAME!,
      })
    );

    return {
      statusCode: 200,
      headers: {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Headers": "Content-Type,Authorization",
        "Content-Type": "application/json",
      },
      body: JSON.stringify(response.Items ?? []),
    };
  } catch (error) {
    console.error("Error fetching vendors:", error);
    return {
      statusCode: 500,
      headers: { "Access-Control-Allow-Origin": "*" },
      body: JSON.stringify({ error: "Failed to fetch vendors" }),
    };
  }
};

What each part does:

ScanCommand reads every item in the table and returns them as an array. For a learning project this is fine. In a production app with millions of rows, you would use a more targeted QueryCommand to avoid reading the entire table on every request.
response.Items ?? [] returns an empty array if the table is empty, preventing the frontend from crashing when there are no vendors yet.

4.3 Delete Vendor Lambda

Create backend/lambda/deleteVendor.ts:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, DeleteCommand } from "@aws-sdk/lib-dynamodb";

const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);

export const handler = async (event: any) => {
  try {
    const body = JSON.parse(event.body);
    const { vendorId } = body;

    if (!vendorId) {
      return {
        statusCode: 400,
        headers: { "Access-Control-Allow-Origin": "*" },
        body: JSON.stringify({ error: "vendorId is required" }),
      };
    }

    await docClient.send(
      new DeleteCommand({
        TableName: process.env.TABLE_NAME!,
        Key: { vendorId },
      })
    );

    return {
      statusCode: 200,
      headers: {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Headers": "Content-Type,Authorization",
        "Access-Control-Allow-Methods": "OPTIONS,POST,GET,DELETE",
      },
      body: JSON.stringify({ message: "Vendor deleted" }),
    };
  } catch (error) {
    console.error("Error deleting vendor:", error);
    return {
      statusCode: 500,
      headers: { "Access-Control-Allow-Origin": "*" },
      body: JSON.stringify({ error: "Failed to delete vendor" }),
    };
  }
};

What each part does:

DeleteCommand removes the item whose vendorId matches the key you provide. DynamoDB doesn't return an error if the item doesn't exist. It simply does nothing.
The 400 guard at the top returns a clear error if the caller forgets to send a vendorId, rather than letting DynamoDB throw a confusing internal error.

Part 5: Build the API with API Gateway

API Gateway is what gives your Lambda functions a public URL. Without it, there's no way for your browser to trigger a Lambda function. Think of it as the front door of your backend: it receives HTTP requests, checks whether the caller is authorized, routes the request to the correct Lambda, and returns the Lambda's response to the caller.

Now you'll wire everything together in backend/lib/backend-stack.ts.

5.1 Add Lambda Functions and API Gateway to the Stack

Replace the entire contents of backend/lib/backend-stack.ts with this complete, assembled file:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';

export class BackendStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // 1. DynamoDB Table 
    const vendorTable = new dynamodb.Table(this, 'VendorTable', {
      partitionKey: {
        name: 'vendorId',
        type: dynamodb.AttributeType.STRING,
      },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      removalPolicy: cdk.RemovalPolicy.DESTROY,
    });

    // 2. Lambda Functions
    const lambdaEnv = { TABLE_NAME: vendorTable.tableName };

    const createVendorLambda = new NodejsFunction(this, 'CreateVendorHandler', {
      entry: 'lambda/createVendor.ts',
      handler: 'handler',
      environment: lambdaEnv,
    });

    const getVendorsLambda = new NodejsFunction(this, 'GetVendorsHandler', {
      entry: 'lambda/getVendors.ts',
      handler: 'handler',
      environment: lambdaEnv,
    });

    const deleteVendorLambda = new NodejsFunction(this, 'DeleteVendorHandler', {
      entry: 'lambda/deleteVendor.ts',
      handler: 'handler',
      environment: lambdaEnv,
    });

    // 3. Permissions (Least Privilege)
    vendorTable.grantWriteData(createVendorLambda);
    vendorTable.grantReadData(getVendorsLambda);
    vendorTable.grantWriteData(deleteVendorLambda);

    // 4. API Gateway
    const api = new apigateway.RestApi(this, 'VendorApi', {
      restApiName: 'Vendor Service',
      defaultCorsPreflightOptions: {
        allowOrigins: apigateway.Cors.ALL_ORIGINS,
        allowMethods: apigateway.Cors.ALL_METHODS,
        allowHeaders: ['Content-Type', 'Authorization'],
      },
    });

    const vendors = api.root.addResource('vendors');
    vendors.addMethod('POST', new apigateway.LambdaIntegration(createVendorLambda));
    vendors.addMethod('GET', new apigateway.LambdaIntegration(getVendorsLambda));
    vendors.addMethod('DELETE', new apigateway.LambdaIntegration(deleteVendorLambda));

    // 5. Outputs
    new cdk.CfnOutput(this, 'ApiEndpoint', {
      value: api.url,
    });
  }
}

What each section does:

NodejsFunction is a special CDK construct that automatically bundles your Lambda code and all its dependencies into a single file using esbuild before uploading it to AWS. This is why you installed esbuild in Part 2.

Always use NodejsFunction instead of the basic lambda.Function construct. The basic version requires you to manually manage bundling, which causes "Module not found" errors at runtime.

Permissions (Least Privilege): In AWS, no resource can communicate with any other resource by default. A Lambda function has no access to DynamoDB, S3, or anything else unless you explicitly grant it.

This is called the Least Privilege principle: each piece of your system gets exactly the permissions it needs, and nothing more. grantWriteData lets a Lambda write and delete items. grantReadData lets a Lambda read items. Using separate grants for each function means the getVendors Lambda can never accidentally delete data.

CfnOutput prints a value to your terminal after cdk deploy completes. You'll use the ApiEndpoint URL to configure your frontend.

Part 6: Deploy the Backend to AWS

Your infrastructure is fully defined in code. Now you'll deploy it to AWS and get a live API URL.

6.1 Bootstrap Your AWS Environment

Before your first CDK deployment, AWS needs a small landing zone in your account – an S3 bucket where CDK can upload your Lambda bundles and other assets. This setup step is called bootstrapping and only needs to be done once per AWS account per region.

From inside your backend folder, run:

cdk bootstrap

Important: Bootstrapping is region-specific. If you ever switch to a different AWS region, you will need to run cdk bootstrap again in that region.

6.2 Deploy

Run:

cdk deploy

CDK will display a summary of everything it is about to create and ask for your confirmation. Type y and press Enter.

When the deployment finishes, you'll see an Outputs section in your terminal:

Outputs:
BackendStack.ApiEndpoint = https://abcdef123.execute-api.us-east-1.amazonaws.com/prod/

Copy that URL. You'll need it when building the frontend.

6.3 Troubleshooting: How to Read AWS Error Logs

Real deployments rarely go perfectly the first time. If something goes wrong after deploying, here is how to find the actual error message.

Error: 502 Bad Gateway

A 502 means API Gateway received your request but your Lambda crashed before it could respond. The most common cause is a missing environment variable – for example, if TABLE_NAME was not passed correctly and the Lambda cannot find the table.

To find the actual error message, use CloudWatch Logs:

Log in to the AWS Console and search for CloudWatch
In the left sidebar, click Logs --> Log groups

Find the group named /aws/lambda/BackendStack-CreateVendorHandler...
Click the most recent Log stream
Read the error message. It will tell you exactly what went wrong

Two common messages and their fixes:

Runtime.ImportModuleError : Your Lambda cannot find a module. Make sure you're using NodejsFunction (not lambda.Function) in your CDK stack. NodejsFunction automatically bundles dependencies; lambda.Function does not.
AccessDeniedException: Your Lambda tried to access DynamoDB but doesn't have permission. Check that you have the correct grantWriteData or grantReadData call in your stack for that Lambda.

Part 7: Build the React Frontend

Your backend is live. Now you'll build the React UI that talks to it.

7.1 Define the Vendor Type

Before writing any API or component code, define what a "vendor" looks like in TypeScript. This gives you type safety throughout your frontend code.

Create frontend/types/vendor.ts:

export interface Vendor {
  vendorId?: string; // Optional when creating — the Lambda generates it
  name: string;
  category: string;
  contactEmail: string;
  createdAt?: string;
}

The vendorId? is marked optional with ? because when you are creating a new vendor, you don't have an ID yet. The createVendor Lambda generates one. When you read vendors back from the API, vendorId will always be present.

7.2 Create the API Service Layer

Rather than writing fetch calls directly inside your React components, you'll centralize all your API logic in one file. This pattern is called a service layer. It keeps your components clean and makes it easy to update API calls in one place.

First, create a .env.local file inside your frontend folder to store your API URL:

# frontend/.env.local
NEXT_PUBLIC_API_URL=https://abcdef123.execute-api.us-east-1.amazonaws.com/prod

Replace the URL with the ApiEndpoint value from your cdk deploy output. The NEXT_PUBLIC_ prefix is required by Next.js to make an environment variable accessible in the browser.

You might be wondering: why not hardcode the URL? If you paste your API URL directly into your code and push it to GitHub, it becomes publicly visible. While an API URL alone does not expose your data (Cognito will protect that), it's good practice to keep URLs and secrets out of source control. Always use .env.local and add it to your .gitignore.

Make sure .env.local is in your .gitignore:

echo ".env.local" >> frontend/.gitignore

Now create frontend/lib/api.ts:

import { Vendor } from '@/types/vendor';

const BASE_URL = process.env.NEXT_PUBLIC_API_URL!;

export const getVendors = async (): Promise => {
  const response = await fetch(`${BASE_URL}/vendors`);
  if (!response.ok) throw new Error('Failed to fetch vendors');
  return response.json();
};

export const createVendor = async (vendor: Omit): Promise => {
  const response = await fetch(`${BASE_URL}/vendors`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(vendor),
  });
  if (!response.ok) throw new Error('Failed to create vendor');
};

export const deleteVendor = async (vendorId: string): Promise => {
  const response = await fetch(`${BASE_URL}/vendors`, {
    method: 'DELETE',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ vendorId }),
  });
  if (!response.ok) throw new Error('Failed to delete vendor');
};

What each part does:

Omit means the createVendor function accepts a vendor without an ID or timestamp (those are generated server-side).
if (!response.ok) throw new Error(...) ensures that any HTTP error (4xx or 5xx) surfaces as a JavaScript error in your component, where you can show the user a meaningful message instead of silently failing.

You'll update these functions later in Part 8 to include the Cognito auth token.

7.3 Build the Main Page

Now create the main page component. It includes a form for adding vendors and a live list that displays all current vendors.

Replace the contents of frontend/app/page.tsx with:

'use client';

import { useState, useEffect } from 'react';
import { createVendor, getVendors, deleteVendor } from '@/lib/api';
import { Vendor } from '@/types/vendor';

export default function Home() {
  const [vendors, setVendors] = useState([]);
  const [form, setForm] = useState({ name: '', category: '', contactEmail: '' });
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState('');

  const loadVendors = async () => {
    try {
      const data = await getVendors();
      setVendors(data);
    } catch {
      setError('Failed to load vendors.');
    }
  };

  // Load vendors once when the page first renders
  useEffect(() => {
    loadVendors();
  }, []);
  // The empty [] means this runs only once. Without it, the effect would
  // run after every render, causing an infinite loop of fetch requests.

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault(); // Prevent the browser from reloading the page on submit
    setLoading(true);
    setError('');
    try {
      await createVendor(form);
      setForm({ name: '', category: '', contactEmail: '' }); // Reset the form
      await loadVendors(); // Refresh the list from DynamoDB
    } catch {
      setError('Failed to add vendor. Please try again.');
    } finally {
      setLoading(false);
    }
  };

  const handleDelete = async (vendorId: string) => {
    try {
      await deleteVendor(vendorId);
      await loadVendors(); // Refresh after deleting
    } catch {
      setError('Failed to delete vendor.');
    }
  };

  return (
    
      Vendor Tracker
      Manage your vendors, stored in AWS DynamoDB.

      {error && (
        {error}
      )}

      

        {/* ── Add Vendor Form ── */}
        
          Add New Vendor
          
             setForm({ ...form, name: e.target.value })}
              required
            />
             setForm({ ...form, category: e.target.value })}
              required
            />
             setForm({ ...form, contactEmail: e.target.value })}
              required
            />
            
          
        

        {/* ── Vendor List ── */}
        
          
            Current Vendors ({vendors.length})
          
          
            {vendors.length === 0 ? (
              No vendors yet. Add one using the form.
            ) : (
              vendors.map(v => (
                
                  
                    {v.name}
                    {v.category} · {v.contactEmail}
                  
                  
                
              ))
            )}
          
        

      
    
  );
}

Key points in this component:

'use client' at the top is a Next.js directive. It tells Next.js that this component uses browser APIs (useState, useEffect, event handlers) and must run in the browser, not be pre-rendered on the server.
e.preventDefault() inside handleSubmit stops the browser's default form submission behavior, which would cause a full page reload and wipe your React state.
After every createVendor or deleteVendor call, loadVendors() is called again. This re-fetches the latest data from DynamoDB so the UI always matches what is actually stored in the database.

7.4 Test the App Locally

Start your Next.js development server:

cd frontend
npm run dev

Open http://localhost:3000 in your browser. You should see the two-panel layout. Try adding a vendor and confirm it appears in the list.

Verifying the connection to AWS:

Open Chrome DevTools (F12) and click the Network tab. When you add a vendor, you should see:

A POST request to your AWS API URL returning a 201 status code
A GET request returning 200 with the updated vendor list

You can also verify the data was saved by opening the AWS Console, navigating to DynamoDB --> Tables --> VendorTable --> Explore table items. Your vendor should appear there.

Part 8: Add Authentication with Amazon Cognito

Right now your API is completely open. Anyone who finds your API URL can add or delete vendors. You'll fix that with Amazon Cognito.

Cognito is AWS's authentication service. It manages a User Pool – a database of registered users with usernames and passwords. When a user logs in, Cognito issues a JWT (JSON Web Token): a cryptographically signed string that proves who the user is. Your API Gateway will check for this token on every request. No valid token means no access.

What is a JWT? A JSON Web Token is a string that looks like eyJhbGci.... It contains encoded information about the user and is signed by Cognito using a secret key.

API Gateway can verify the signature without contacting Cognito on every request, which makes token checking fast. Think of it as a tamper-proof badge: anyone can read the name on it, but only Cognito's signature makes it valid.

8.1 Add Cognito to the CDK Stack

Open backend/lib/backend-stack.ts and update it to include Cognito. Here is the complete updated file:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as cognito from 'aws-cdk-lib/aws-cognito';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';

export class BackendStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // ─── 1. DynamoDB Table ────────────────────────────────────────────────────
    const vendorTable = new dynamodb.Table(this, 'VendorTable', {
      partitionKey: { name: 'vendorId', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      removalPolicy: cdk.RemovalPolicy.DESTROY,
    });

    // ─── 2. Lambda Functions ──────────────────────────────────────────────────
    const lambdaEnv = { TABLE_NAME: vendorTable.tableName };

    const createVendorLambda = new NodejsFunction(this, 'CreateVendorHandler', {
      entry: 'lambda/createVendor.ts',
      handler: 'handler',
      environment: lambdaEnv,
    });

    const getVendorsLambda = new NodejsFunction(this, 'GetVendorsHandler', {
      entry: 'lambda/getVendors.ts',
      handler: 'handler',
      environment: lambdaEnv,
    });

    const deleteVendorLambda = new NodejsFunction(this, 'DeleteVendorHandler', {
      entry: 'lambda/deleteVendor.ts',
      handler: 'handler',
      environment: lambdaEnv,
    });

    // ─── 3. Permissions ───────────────────────────────────────────────────────
    vendorTable.grantWriteData(createVendorLambda);
    vendorTable.grantReadData(getVendorsLambda);
    vendorTable.grantWriteData(deleteVendorLambda);

    // ─── 4. Cognito User Pool ─────────────────────────────────────────────────
    const userPool = new cognito.UserPool(this, 'VendorUserPool', {
      selfSignUpEnabled: true,
      signInAliases: { email: true },
      autoVerify: { email: true },
      userVerification: {
        emailStyle: cognito.VerificationEmailStyle.CODE,
      },
    });

    // Required to host Cognito's internal auth endpoints
    userPool.addDomain('VendorUserPoolDomain', {
      cognitoDomain: {
        domainPrefix: `vendor-tracker-${this.account}`,
      },
    });

    const userPoolClient = userPool.addClient('VendorAppClient');

    // ─── 5. API Gateway + Authorizer ──────────────────────────────────────────
    const api = new apigateway.RestApi(this, 'VendorApi', {
      restApiName: 'Vendor Service',
      defaultCorsPreflightOptions: {
        allowOrigins: apigateway.Cors.ALL_ORIGINS,
        allowMethods: apigateway.Cors.ALL_METHODS,
        allowHeaders: ['Content-Type', 'Authorization'],
      },
    });

    const authorizer = new apigateway.CognitoUserPoolsAuthorizer(
      this,
      'VendorAuthorizer',
      { cognitoUserPools: [userPool] }
    );

    const authOptions = {
      authorizer,
      authorizationType: apigateway.AuthorizationType.COGNITO,
    };

    const vendors = api.root.addResource('vendors');
    vendors.addMethod('GET', new apigateway.LambdaIntegration(getVendorsLambda), authOptions);
    vendors.addMethod('POST', new apigateway.LambdaIntegration(createVendorLambda), authOptions);
    vendors.addMethod('DELETE', new apigateway.LambdaIntegration(deleteVendorLambda), authOptions);

    // ─── 6. Outputs ───────────────────────────────────────────────────────────
    new cdk.CfnOutput(this, 'ApiEndpoint', { value: api.url });
    new cdk.CfnOutput(this, 'UserPoolId', { value: userPool.userPoolId });
    new cdk.CfnOutput(this, 'UserPoolClientId', { value: userPoolClient.userPoolClientId });
  }
}

What changed:

CognitoUserPoolsAuthorizer tells API Gateway to check every request for a valid Cognito JWT before passing it to any Lambda. If the token is missing or invalid, API Gateway rejects the request with a 401 Unauthorized response without ever touching your Lambda.
authOptions is applied to all three API methods: GET, POST, and DELETE. All routes are now protected.
autoVerify: { email: true } tells Cognito to mark the email attribute as verified after a user confirms via the verification code email. It doesn't skip the verification email, as users still receive a code. If you want to skip verification during development, you can manually confirm users in the Cognito console (covered in section 8.5).
Two new CfnOutput values (UserPoolId and UserPoolClientId) will appear in your terminal after the next deployment. Your frontend needs them to connect to Cognito.

Deploy the updated stack:

cd backend
cdk deploy

After deployment, your terminal output will include three values:

Outputs:
BackendStack.ApiEndpoint     = https://abc123.execute-api.us-east-1.amazonaws.com/prod/
BackendStack.UserPoolId      = us-east-1_xxxxxxxx
BackendStack.UserPoolClientId = xxxxxxxxxxxxxxxxxxxx

Save all three values. You'll use them in the next step.

8.2 Install and Configure AWS Amplify

AWS Amplify is a frontend library that handles all the complex authentication logic for you: it manages the login UI, stores tokens in the browser, refreshes expired tokens automatically, and exposes a simple API to read the current user's session.

Install the Amplify libraries inside your frontend folder:

cd frontend
npm install aws-amplify @aws-amplify/ui-react

Create frontend/app/providers.tsx. This file initializes Amplify with your Cognito configuration. It runs once when the app loads:

'use client';

import { Amplify } from 'aws-amplify';

Amplify.configure(
  {
    Auth: {
      Cognito: {
        userPoolId: process.env.NEXT_PUBLIC_USER_POOL_ID!,
        userPoolClientId: process.env.NEXT_PUBLIC_USER_POOL_CLIENT_ID!,
      },
    },
  },
  { ssr: true }
);

export function Providers({ children }: { children: React.ReactNode }) {
  return <>{children};
}

Add the Cognito IDs to your frontend/.env.local file:

NEXT_PUBLIC_API_URL=https://abc123.execute-api.us-east-1.amazonaws.com/prod
NEXT_PUBLIC_USER_POOL_ID=us-east-1_xxxxxxxx
NEXT_PUBLIC_USER_POOL_CLIENT_ID=xxxxxxxxxxxxxxxxxxxx

Replace the values with the outputs from your cdk deploy.

8.3 Wire Providers into the App Layout

This step is critical. Amplify must be initialized before any component tries to use authentication. If you skip this step, fetchAuthSession() will throw an "Amplify not configured" error and nothing will work.

Open frontend/app/layout.tsx and update it to wrap the app in the Providers component:

import type { Metadata } from 'next';
import './globals.css';
import { Providers } from './providers';

export const metadata: Metadata = {
  title: 'Vendor Tracker',
  description: 'Manage your vendors with AWS',
};

export default function RootLayout({
  children,
}: {
  children: React.ReactNode;
}) {
  return (
    
      
        {children}
      
    
  );
}

By wrapping {children} in , you ensure that Amplify is configured once at the root of the app, before any child page or component renders.

8.4 Protect the UI with withAuthenticator

Now wrap your Home component so that unauthenticated users see a login screen instead of the dashboard.

Replace the contents of frontend/app/page.tsx with this updated version:

'use client';

import { useState, useEffect } from 'react';
import { withAuthenticator } from '@aws-amplify/ui-react';
import '@aws-amplify/ui-react/styles.css';
import { getVendors, createVendor, deleteVendor } from '@/lib/api';
import { Vendor } from '@/types/vendor';

// withAuthenticator injects `signOut` and `user` as props automatically
function Home({ signOut, user }: { signOut?: () => void; user?: any }) {
  const [vendors, setVendors] = useState([]);
  const [form, setForm] = useState({ name: '', category: '', contactEmail: '' });
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState('');

  const loadVendors = async () => {
    try {
      const data = await getVendors();
      setVendors(data);
    } catch {
      setError('Failed to load vendors.');
    }
  };

  useEffect(() => {
    loadVendors();
  }, []);

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    setLoading(true);
    setError('');
    try {
      await createVendor(form);
      setForm({ name: '', category: '', contactEmail: '' });
      await loadVendors();
    } catch {
      setError('Failed to add vendor.');
    } finally {
      setLoading(false);
    }
  };

  const handleDelete = async (vendorId: string) => {
    try {
      await deleteVendor(vendorId);
      await loadVendors();
    } catch {
      setError('Failed to delete vendor.');
    }
  };

  return (
    
      {/* ── Header ── */}
      
        
          Vendor Tracker
          Signed in as: {user?.signInDetails?.loginId}
        
        
      

      {error && (
        {error}
      )}

      

        {/* ── Add Vendor Form ── */}
        
          Add New Vendor
          
             setForm({ ...form, name: e.target.value })}
              required
            />
             setForm({ ...form, category: e.target.value })}
              required
            />
             setForm({ ...form, contactEmail: e.target.value })}
              required
            />
            
          
        

        {/* ── Vendor List ── */}
        
          
            Current Vendors ({vendors.length})
          
          
            {vendors.length === 0 ? (
              No vendors yet.
            ) : (
              vendors.map(v => (
                
                  
                    {v.name}
                    {v.category} · {v.contactEmail}
                  
                  
                
              ))
            )}
          
        

      
    
  );
}

// Wrapping Home with withAuthenticator means any user who is not logged in
// will see Amplify's built-in login/signup screen instead of this component.
export default withAuthenticator(Home);

8.5 Pass the Auth Token to API Calls

Now that API Gateway requires a JWT on every request, your fetch calls need to include the token in the Authorization header. Without it, every request will return a 401 Unauthorized error.

Update frontend/lib/api.ts with a token helper and updated fetch calls:

import { fetchAuthSession } from 'aws-amplify/auth';
import { Vendor } from '@/types/vendor';

const BASE_URL = process.env.NEXT_PUBLIC_API_URL!;

// Retrieves the current user's JWT token from the active Amplify session
const getAuthToken = async (): Promise => {
  const session = await fetchAuthSession();
  const token = session.tokens?.idToken?.toString();
  if (!token) throw new Error('No active session. Please sign in.');
  return token;
};

export const getVendors = async (): Promise => {
  const token = await getAuthToken();
  const response = await fetch(`${BASE_URL}/vendors`, {
    headers: { Authorization: token },
  });
  if (!response.ok) throw new Error('Failed to fetch vendors');
  return response.json();
};

export const createVendor = async (
  vendor: Omit
): Promise => {
  const token = await getAuthToken();
  const response = await fetch(`${BASE_URL}/vendors`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: token,
    },
    body: JSON.stringify(vendor),
  });
  if (!response.ok) throw new Error('Failed to create vendor');
};

export const deleteVendor = async (vendorId: string): Promise => {
  const token = await getAuthToken();
  const response = await fetch(`${BASE_URL}/vendors`, {
    method: 'DELETE',
    headers: {
      'Content-Type': 'application/json',
      Authorization: token,
    },
    body: JSON.stringify({ vendorId }),
  });
  if (!response.ok) throw new Error('Failed to delete vendor');
};

What getAuthToken does:

fetchAuthSession() reads the currently logged-in user's session from the browser. Amplify stores the session in memory and localStorage after the user signs in.

session.tokens?.idToken is the JWT string that API Gateway's Cognito Authorizer is looking for. Passing it as the Authorization header tells API Gateway: "This request is from an authenticated user."

8.6 Troubleshooting Cognito

When a new user signs up through the Amplify UI, Cognito marks the account as Unconfirmed until the user verifies their email address. A verification code is sent to the user's email. After entering the code, the account becomes confirmed and the user can log in.

If you are testing locally and want to skip the email step, you can manually confirm any account in the AWS Console:

Open the AWS Console and navigate to Cognito
Click on your User Pool (VendorUserPool...)
Click the Users tab
Click on the user's email address
Open the Actions dropdown and click Confirm account

401 Unauthorized errors after deployment

If you are getting 401 errors, check two things:

Open Chrome DevTools --> Network tab, click the failing request, and look at the Request Headers. You should see an Authorization header with a long string of characters. If it is missing, getAuthToken is failing. Check that Amplify is configured correctly in providers.tsx and wired in via layout.tsx.
In your CDK stack, confirm that authorizationType: apigateway.AuthorizationType.COGNITO is present on every protected method definition. If it is missing, API Gateway may not be checking tokens even though the authorizer is defined.

Part 9: Deploy the Frontend with S3 and CloudFront

Your app works locally. Now you'll deploy it to a real HTTPS URL that anyone in the world can visit.

The strategy: Next.js will export your React app as a set of static HTML, CSS, and JavaScript files. Those files will be uploaded to an S3 bucket (AWS's file storage service). CloudFront sits in front of the bucket as a Content Delivery Network (CDN), distributing your files to servers around the world and serving them over HTTPS.

9.1 Configure Next.js for Static Export

Open frontend/next.config.js (or next.config.mjs) and add the output: 'export' setting:

/** @type {import('next').NextConfig} */
const nextConfig = {
  output: 'export', // Generates a static /out folder instead of a Node.js server
};

export default nextConfig;

Note on 'use client' and static export: When output: 'export' is set, Next.js builds every page at compile time. Any component that uses browser-only APIs – like withAuthenticator from Amplify – must have 'use client' at the top of the file. This tells Next.js to skip server-side rendering for that component and run it only in the browser.

You already have 'use client' in page.tsx. If you ever see a build error mentioning window is not defined or similar, check that the relevant component has 'use client' at the top.

Build the frontend:

cd frontend
npm run build

This generates an /out folder containing your complete website as static files. Verify the folder was created:

ls out
# You should see: index.html, _next/, etc.

9.2 Add S3 and CloudFront to the CDK Stack

Open backend/lib/backend-stack.ts and add the hosting infrastructure. Here's the complete final version of the file:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as cognito from 'aws-cdk-lib/aws-cognito';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as cloudfront from 'aws-cdk-lib/aws-cloudfront';
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins';
import * as s3deploy from 'aws-cdk-lib/aws-s3-deployment';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';

export class BackendStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // 1. DynamoDB Table 
    const vendorTable = new dynamodb.Table(this, 'VendorTable', {
      partitionKey: { name: 'vendorId', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      removalPolicy: cdk.RemovalPolicy.DESTROY,
    });

    // 2. Lambda Functions
    const lambdaEnv = { TABLE_NAME: vendorTable.tableName };

    const createVendorLambda = new NodejsFunction(this, 'CreateVendorHandler', {
      entry: 'lambda/createVendor.ts',
      handler: 'handler',
      environment: lambdaEnv,
    });

    const getVendorsLambda = new NodejsFunction(this, 'GetVendorsHandler', {
      entry: 'lambda/getVendors.ts',
      handler: 'handler',
      environment: lambdaEnv,
    });

    const deleteVendorLambda = new NodejsFunction(this, 'DeleteVendorHandler', {
      entry: 'lambda/deleteVendor.ts',
      handler: 'handler',
      environment: lambdaEnv,
    });

    // 3. Permissions
    vendorTable.grantWriteData(createVendorLambda);
    vendorTable.grantReadData(getVendorsLambda);
    vendorTable.grantWriteData(deleteVendorLambda);

    // 4. Cognito User Pool
    const userPool = new cognito.UserPool(this, 'VendorUserPool', {
      selfSignUpEnabled: true,
      signInAliases: { email: true },
      autoVerify: { email: true },
      userVerification: {
        emailStyle: cognito.VerificationEmailStyle.CODE,
      },
    });

    userPool.addDomain('VendorUserPoolDomain', {
      cognitoDomain: { domainPrefix: `vendor-tracker-${this.account}` },
    });

    const userPoolClient = userPool.addClient('VendorAppClient');

    // 5. API Gateway + Authorizer
    const api = new apigateway.RestApi(this, 'VendorApi', {
      restApiName: 'Vendor Service',
      defaultCorsPreflightOptions: {
        allowOrigins: apigateway.Cors.ALL_ORIGINS,
        allowMethods: apigateway.Cors.ALL_METHODS,
        allowHeaders: ['Content-Type', 'Authorization'],
      },
    });

    const authorizer = new apigateway.CognitoUserPoolsAuthorizer(
      this,
      'VendorAuthorizer',
      { cognitoUserPools: [userPool] }
    );

    const authOptions = {
      authorizer,
      authorizationType: apigateway.AuthorizationType.COGNITO,
    };

    const vendors = api.root.addResource('vendors');
    vendors.addMethod('GET', new apigateway.LambdaIntegration(getVendorsLambda), authOptions);
    vendors.addMethod('POST', new apigateway.LambdaIntegration(createVendorLambda), authOptions);
    vendors.addMethod('DELETE', new apigateway.LambdaIntegration(deleteVendorLambda), authOptions);

    // 6. S3 Bucket (Frontend Files) 
    const siteBucket = new s3.Bucket(this, 'VendorSiteBucket', {
      blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      autoDeleteObjects: true,
    });

    // 7. CloudFront Distribution (HTTPS + CDN)
    const distribution = new cloudfront.Distribution(this, 'SiteDistribution', {
      defaultBehavior: {
        origin: new origins.S3Origin(siteBucket),
        viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
      },
      defaultRootObject: 'index.html',
      errorResponses: [
        {
          // Redirect all 404s back to index.html so React can handle routing
          httpStatus: 404,
          responseHttpStatus: 200,
          responsePagePath: '/index.html',
        },
      ],
    });

    // 8. Deploy Frontend Files to S3 
    new s3deploy.BucketDeployment(this, 'DeployWebsite', {
      sources: [s3deploy.Source.asset('../frontend/out')],
      destinationBucket: siteBucket,
      distribution,
      distributionPaths: ['/*'], // Clears CloudFront cache on every deploy
    });

    // 9. Outputs ───────────────────────────────────────────────────────────
    new cdk.CfnOutput(this, 'ApiEndpoint', { value: api.url });
    new cdk.CfnOutput(this, 'UserPoolId', { value: userPool.userPoolId });
    new cdk.CfnOutput(this, 'UserPoolClientId', { value: userPoolClient.userPoolClientId });
    new cdk.CfnOutput(this, 'CloudFrontURL', {
      value: `https://${distribution.distributionDomainName}`,
    });
  }
}

What the hosting infrastructure does:

The S3 bucket stores your static HTML, CSS, and JavaScript files. It is private – users cannot access it directly.
CloudFront is the CDN that sits in front of S3. It gives you an HTTPS URL and caches your files at edge locations worldwide, so the app loads fast no matter where users are located. REDIRECT_TO_HTTPS automatically upgrades any HTTP request to HTTPS.
The error response for 404 returns index.html instead of an error page. This is necessary for single-page apps: if a user navigates directly to a route like /vendors/123, CloudFront cannot find a file at that path, but sending back index.html lets the React app handle the routing correctly.
distributionPaths: ['/*'] tells CloudFront to invalidate its entire cache after every deployment. This ensures users always see the latest version of your app immediately.
BucketDeployment is a CDK construct that automatically uploads the contents of your frontend/out folder to the S3 bucket every time you run cdk deploy.

9.3 Run the Final Deployment

First, build the frontend with the latest environment variables:

cd frontend
npm run build

Then deploy everything from the backend folder:

cd ../backend
cdk deploy

After deployment finishes, copy the CloudFrontURL from the terminal output:

Outputs:
BackendStack.CloudFrontURL = https://d1234abcd.cloudfront.net

Open that URL in your browser. Your app is now live on the internet, served over HTTPS, globally distributed.

What You Built

You now have a fully deployed, production-style full-stack application. Here is a summary of every piece you built and what it does:

Layer	Service	What it does
Frontend	Next.js + CloudFront	React UI served globally over HTTPS
Auth	Amazon Cognito + Amplify	User sign-up, login, and JWT token management
API	API Gateway	Routes HTTP requests, validates auth tokens
Logic	AWS Lambda (×3)	Creates, reads, and deletes vendors on demand
Database	DynamoDB	Stores vendor records with no idle cost
Storage	S3	Holds your built frontend files
Infrastructure	AWS CDK	Defines and deploys all of the above as code

Conclusion

You have built and deployed the foundational pattern of almost every cloud application: a secured API backed by a database, deployed with infrastructure as code. Here is everything you accomplished:

You set up a professional AWS development environment with scoped IAM credentials. You defined your entire backend infrastructure as TypeScript code using AWS CDK, which means your database, API, Lambda functions, and authentication system are all version-controlled, repeatable, and deployable with a single command.

You wrote three Lambda functions that handle create, read, and delete operations, each with proper error handling and the correct AWS SDK v3 patterns. You connected them to a REST API through API Gateway and protected every route with Amazon Cognito authentication, so only registered, verified users can interact with your data.

On the frontend, you built a Next.js application with a service layer that cleanly separates API logic from UI components, manages JWTs automatically through AWS Amplify, and gives users a complete sign-up and sign-in flow without you writing a single line of authentication UI code.

Finally, you deployed the entire system: your backend to AWS Lambda and DynamoDB, and your frontend as a static site served globally through CloudFront over HTTPS.

The full source code for this tutorial is available on GitHub. Clone it, modify it, and use it as a reference for your own projects.

How to Build a Serverless RAG Pipeline on AWS That Scales to Zero

Christopher Galliart — Wed, 11 Mar 2026 18:19:40 +0000

Most RAG tutorials end the same way: you've got a working prototype and a bill for a vector database that runs whether anyone's querying it or not. Add an always-on embedding service, a hosted LLM endpoint, and the usual AWS infrastructure, and you're looking at real money before a single user shows up.

But it doesn't have to work that way. In this tutorial, you'll deploy a fully serverless RAG pipeline that processes documents, images, video, and audio, then scales to zero when nobody's using it.

Everything runs in your AWS account, your data never leaves your infrastructure, and your ongoing monthly cost for a modest knowledge base will be closer to 2-3 USD than 300 USD.

We'll use RAGStack-Lambda, an open-source project I built on AWS. By the end, you'll have a deployed pipeline with a dashboard, an AI chat interface with source citations, a drop-in web component you can embed in any app, and an MCP server you can use to feed your assistant context.

What This Actually Costs

Before we build anything, let's talk money, because the cost story is the whole point.

RAG pipelines have two cost phases: ingestion (processing your documents once) and operation (querying them over time).

Most platforms charge you a flat monthly rate regardless of which phase you're in. A serverless architecture flips that: ingestion costs something, and then everything scales to zero.

Ingestion: The One-Time Hit

When you upload documents, several things happen: text extraction (OCR for PDFs and images), embedding generation, metadata extraction, and storage. Here's what that actually costs per service:

Textract (OCR): This is the most expensive part of ingestion, and it only applies to scanned PDFs and images that need text extraction. Plain text, HTML, CSV, and other text-based formats skip this entirely.

Textract charges about 1.50 USD per 1,000 pages for standard text detection. If you're uploading 500 pages of scanned PDFs, that's about 0.75 USD. A heavy initial load of several thousand scanned pages might run 5-10 USD. But once your documents are processed, you never pay this again unless you add new ones.

Bedrock Embeddings (Nova Multimodal): This is where your content gets converted into vectors for semantic search. The pricing is almost comically cheap:

Text: 0.00002 USD per 1,000 input tokens
Images: 0.00115 USD per image
Video/Audio: 0.00200 USD per minute

To put that in perspective: if you have 1,500 text documents averaging 2,500 tokens each after chunking, your total embedding cost is about 0.08 USD. A knowledge base with 500 images runs 0.58 USD. Even a mixed corpus of text, images, and a few hours of video stays well under 2 USD for the entire embedding pass. This is a one-time cost – you only re-embed if you add or update documents.

Bedrock LLM (Metadata Extraction): RAGStack uses an LLM to analyze each document and extract structured metadata automatically. This is a few inference calls per document using Nova Lite or a similar model. At 0.06 USD/0.24 USD per million input/output tokens, processing 1,500 documents costs well under 1 USD.

S3 Vectors (Storage): Storing your embeddings. At 0.06 USD per GB/month, a knowledge base of 1,500 documents with 1,024-dimension vectors takes up a trivially small amount of space. We're talking pennies per month.

S3 (Document Storage): Your source documents in standard S3. Even cheaper, 0.023 USD per GB/month.

DynamoDB: Stores document metadata and processing state. The on-demand pricing model means you pay per request during ingestion, then essentially nothing at rest. A few cents for the initial load.

To put real numbers on it: if you upload 200 text documents (PDFs, HTML, markdown), your total ingestion cost is likely under 1 USD. If you upload 1,000 scanned PDFs that need OCR, you might see 5-8 USD as a one-time hit. That 7-10 USD figure you might see referenced? That's the upper end for a heavy initial load with lots of OCR work.

Operation: Where Scale-to-Zero Shines

Once your documents are ingested, the pipeline is waiting. Not running. Waiting. Here's what each query costs:

Lambda: Invocations are billed per request and duration. The free tier covers 1 million requests/month. For a personal or small-team knowledge base, you may never leave the free tier.

S3 Vectors (Queries): 2.50 USD per million query API calls, plus a per-TB data processing charge. For a small index queried a few hundred times a month, this rounds to effectively zero.

Bedrock (Chat Inference): This is your main operating cost. Each chat response requires an LLM call. Using Nova Lite at 0.06 USD per million input tokens and 0.24 USD per million output tokens, a typical RAG query (retrieval context + user question + response) might cost 0.001-0.003 USD per query. A hundred queries a month is 0.10-0.30 USD.

Step Functions: Orchestrates the document processing pipeline. Standard workflows charge 0.025 USD per 1,000 state transitions. Minimal during operation since it's only active during ingestion.

Cognito: User authentication. Free for the first 10,000 monthly active users.

CloudFront: Serves the dashboard UI. Free tier covers 1 TB of data transfer per month.

API Gateway: Handles GraphQL API requests. Free tier covers 1 million API calls per month.

Add it all up for a knowledge base with 500 documents getting a few hundred queries per month, and your monthly operating cost is somewhere between 0.50 USD and 3.00 USD. Most of that is the LLM inference for chat responses.

The Comparison That Matters

Here's the same pipeline on a traditional always-on stack:

Service	RAGStack-Lambda	Traditional Stack
Vector Database	S3 Vectors: pennies/mo	Pinecone Starter: `70 USD`/mo
Vector Database (alt)	S3 Vectors: pennies/mo	OpenSearch Serverless: about `350 USD`/mo min
Compute	Lambda: free tier	EC2 or ECS: `50-150 USD`/mo
LLM Inference	Same per-query cost	Same per-query cost
Total (idle)	about `0.50-3.00 USD`/mo	`120-500 USD`/mo

The LLM inference cost per query is roughly the same everywhere – that's Bedrock's on-demand pricing regardless of your architecture. The difference is everything else. Traditional stacks pay a floor cost whether anyone's using them or not. A serverless stack pays for what it uses, and idle costs essentially nothing.

What About Transcribe?

If you're uploading video or audio, AWS Transcribe adds cost for speech-to-text conversion. Standard transcription runs about 0.024 USD per minute of audio. A 10-minute video costs 0.24 USD to transcribe. This is a one-time ingestion cost, once transcribed and embedded, the resulting text chunks are queried like any other document.

What You're Building

By the end of this tutorial, you'll have a deployed pipeline that does the following:

You upload a document (PDF, image, video, audio, HTML, CSV, the full list is extensive) through a web dashboard.
The pipeline detects the file type and routes it to the right processor. Scanned PDFs go through OCR via Textract. Video and audio go through Transcribe for speech-to-text, split into 30-second searchable chunks with speaker identification. Images get visual embeddings and any caption text you provide.
An LLM analyzes each document and extracts structured metadata, topic, document type, date range, people mentioned, whatever's relevant. This happens automatically.
Everything gets embedded using Amazon Nova Multimodal Embeddings and stored in a Bedrock Knowledge Base backed by S3 Vectors.
You (or your users) ask questions through an AI chat interface. The pipeline retrieves relevant documents, passes them as context to a Bedrock LLM, and returns an answer with collapsible source citations, including timestamp links for video and audio that jump to the exact position.

All of this runs in your AWS account. No external control plane, no third-party services beyond AWS itself.

The Architecture

A few things to note about this architecture:

Step Functions orchestrate everything. When a document is uploaded, a state machine manages the entire processing flow, detecting the file type, routing to the right processor, waiting for async operations like Transcribe jobs, then triggering embedding and metadata extraction.

This is what makes the pipeline reliable without a running server. If a step fails, it retries. You can see exactly where every document is in the processing pipeline.

Lambda does the compute. Every processing step is a Lambda function. They spin up when needed, run for a few seconds to a few minutes, and shut down. There's no EC2 instance idling at 3 AM.

S3 Vectors is the vector store. Your embeddings live in S3's purpose-built vector storage rather than in a dedicated vector database like Pinecone or OpenSearch.

This is what makes the "scale to zero" cost possible: you're paying object storage rates for vector data instead of keeping a database cluster warm. It also means your vectors are sitting in your own S3 bucket, not in a third-party managed service that holds your data on their terms.

Cognito handles auth. The dashboard and API are protected with Cognito user pools. When you deploy, you get a temporary password via email. The web component uses IAM-based authentication, and server-side integrations use API key auth.

CloudFront serves the UI. The dashboard is a static React app served through CloudFront, so there's no web server to maintain.

Two Ways to Deploy

You have two deployment paths depending on what you want:

AWS Marketplace (the fast path), click deploy, fill in two fields (stack name and email), and wait about 10 minutes. No local tooling required. This is the path we'll walk through first.

From Source (the developer path), Clone the repo, run publish.py, and deploy via SAM CLI. This is the path for when you want to customize the processing pipeline, modify the UI, or contribute to the project. We'll cover this after the Marketplace walkthrough.

Both paths produce the same stack. The Marketplace version just wraps the CloudFormation template in a one-click deployment.

Prerequisites

Before you deploy, you'll need:

An AWS account with permissions to create CloudFormation stacks, Lambda functions, S3 buckets, DynamoDB tables, and Cognito user pools. If you're using an admin account, you're covered.
Bedrock model access: RAGStack defaults to us-east-1 because that's where Nova Multimodal Embeddings is available. Amazon's own models (including Nova) are available by default in Bedrock, no manual enablement required. Just make sure your IAM role has the necessary bedrock:InvokeModel permissions.
For the Marketplace path: just a web browser.
For the source path: Python 3.13+, Node.js 24+, AWS CLI and SAM CLI configured, and Docker (for building Lambda layers).

Deploying from AWS Marketplace

This is the fastest path – no local tools, no CLI, no Docker. You'll launch a CloudFormation stack and have a working pipeline in about 10 minutes.

Step 1: Launch the Stack

Click the direct deploy link to open CloudFormation's "Quick create stack" page with the template pre-loaded.

Step 2: Fill In Two Fields

The page has a lot of options, but you only need two:

Stack name: Must be lowercase. This becomes the prefix for all your AWS resources (for example, my-docs, team-kb, project-notes). Keep it short.
Admin Email: Under Required Settings. Cognito will send your temporary login credentials here. Use an email you can access right now.

Everything else – Build Options, Advanced Settings, OCR Backend, model selections – can stay at the defaults. They're there for customization later, but the defaults work out of the box.

Step 3: Deploy

Scroll to the bottom, check the three acknowledgment boxes under "Capabilities and transforms," and click Create stack.

Deployment takes roughly 10 minutes. You can watch the progress in the CloudFormation Events tab if you're curious, but there's nothing to do until the stack status flips to CREATE_COMPLETE.

Step 4: Log In

Once the stack finishes, check your email. Cognito sends you the dashboard URL and a temporary password. Log in, set a new password, and you're looking at an empty dashboard ready for documents.

Deploying from Source

If you want to customize the pipeline, modify the UI, or contribute to the project, deploy from source instead.

Step 1: Clone and Set Up

git clone https://github.com/HatmanStack/RAGStack-Lambda.git
cd RAGStack-Lambda

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Step 2: Deploy

The publish.py script handles everything: building the frontend, packaging Lambda functions, and deploying via SAM CLI.

python publish.py \
  --project-name my-docs \
  --admin-email admin@example.com

This defaults to us-east-1 for Nova Multimodal Embeddings. The script will build the React dashboard, build the web component, package all Lambda layers with Docker, and deploy the CloudFormation stack through SAM.

First deploy takes longer (15-20 minutes) because it's building everything from scratch. Subsequent deploys are faster since SAM caches unchanged resources.

If you only want to iterate on the backend and skip UI builds:

# Skip dashboard build (still builds web component)
python publish.py --project-name my-docs --admin-email admin@example.com --skip-ui

# Skip ALL UI builds
python publish.py --project-name my-docs --admin-email admin@example.com --skip-ui-all

Once it finishes, you'll get the same Cognito email and dashboard URL as the Marketplace path.

Uploading Your First Documents

The dashboard has tabs for different content types. We'll start with the Documents tab since that's the most common use case.

Documents

Click the Documents tab and upload a file. RAGStack accepts a wide range of formats: PDF, DOCX, XLSX, HTML, CSV, JSON, XML, EML, EPUB, TXT, and Markdown. Drag and drop or use the file picker.

Once uploaded, the document enters the processing pipeline. You'll see the status update in real time:

UPLOADED: File received and stored in S3.
PROCESSING: Step Functions has picked it up and routed it to the right processor. Text-based files (HTML, CSV, Markdown) go through direct extraction. Scanned PDFs and images go through Textract OCR. The LLM analyzes the content and extracts structured metadata, topic, document type, people mentioned, date ranges, whatever's relevant to the content.
INDEXED: Embeddings generated, vectors stored, document is searchable.

Text documents typically process in 1-5 minutes. OCR-heavy documents (scanned PDFs, images with text) can take 2-15 minutes depending on page count.

Images

The Images tab works differently. Upload a JPG, PNG, GIF, or WebP and you can add a caption. Both the visual content and caption text get embedded using Nova Multimodal Embeddings, so you can search by what's in the image or by your description of it.

This is where multimodal embeddings earn their keep. A traditional text-only RAG pipeline would need you to describe every image manually. Here, the image itself becomes searchable, and since everything stays in your AWS account, you're not sending personal photos or sensitive visual content to an external service to get there.

What About Video and Audio?

Upload video or audio files and RAGStack routes them through AWS Transcribe for speech-to-text conversion. The transcript gets split into 30-second chunks with speaker identification, then embedded like any other document. When chat results reference a video source, you get timestamp links that jump to the exact position in the recording.

Web Scraping

The Scrape tab lets you pull websites directly into your knowledge base. Enter a URL and RAGStack crawls the page, extracts the content, and processes it through the same pipeline as uploaded documents, metadata extraction, embedding, indexing.

This is useful for building a knowledge base from existing web content without manually saving and uploading pages. Documentation sites, blog archives, reference material, anything publicly accessible.

Chatting With Your Knowledge Base

This is the payoff. Go to the Chat tab, type a question, and RAGStack retrieves relevant documents from your knowledge base, passes them as context to a Bedrock LLM, and returns an answer with source citations.

The citations are collapsible, so click to expand and see which documents informed the answer, with the option to download the source file. For video and audio sources, you get clickable timestamps that jump to the relevant moment.

Metadata Filtering

If you've uploaded enough documents to have meaningful metadata categories, the chat interface lets you filter search results by metadata before querying. RAGStack auto-discovers the metadata structure from your documents, so you don't configure this manually, it just appears as your knowledge base grows.

This is useful when you have a large mixed corpus. Instead of hoping the vector search picks the right context from thousands of documents, you can narrow it down: "only search documents about project X" or "only search content from Q4 2024."

Embedding the Web Component in Your App

The dashboard is useful for managing your knowledge base, but the real power is embedding RAGStack's chat in your own application. The web component works with any framework, React, Vue, Angular, Svelte, plain HTML.

Load the script once from your CloudFront distribution:

Then drop the component wherever you want a chat interface:

That's it. The component handles authentication (via IAM), manages conversation state, and renders source citations, all self-contained. Your CloudFront URL is in the stack outputs.

For server-side integrations that don't need a UI, the GraphQL API is available with API key authentication. You can find your endpoint and API key in the dashboard under Settings.

Using the MCP Server

RAGStack includes an MCP server that connects your knowledge base to AI assistants like Claude Desktop, Cursor, VS Code, and Amazon Q CLI. Instead of switching to the dashboard to search your documents, you ask your assistant directly.

Install it:

pip install ragstack-mcp

Then add it to your AI assistant's MCP configuration:

{
  "ragstack": {
    "command": "uvx",
    "args": ["ragstack-mcp"],
    "env": {
      "RAGSTACK_GRAPHQL_ENDPOINT": "YOUR_ENDPOINT",
      "RAGSTACK_API_KEY": "YOUR_API_KEY"
    }
  }
}

Your endpoint and API key are in the dashboard under Settings. Once configured, type @ragstack in your assistant's chat to invoke the MCP server, then ask things like "search my knowledge base for authentication docs" and it queries RAGStack directly.

See the MCP Server docs for the full list of available tools and setup details.

What You Can Build From Here

You've got a deployed RAG pipeline that costs almost nothing to run and handles text, images, video, and audio. A few directions you might take it:

A searchable personal archive. Every conference talk you've saved, every PDF textbook, every tutorial video that's sitting in a folder somewhere. Upload it all, and now you have one search interface across years of accumulated material. The multimodal embeddings mean your screenshots and diagrams are searchable too, not just the text.

I built a family archive app this way, scanned letters, old photos, home videos, with RAGStack deployed as a nested CloudFormation stack so the whole family can search across decades of memories using the chat widget.

A second brain for a client project. Scrape the client's existing docs, upload the SOW and meeting notes, drop in the codebase documentation. Now you've got a searchable knowledge base scoped to that engagement. Spin it up at the start, tear it down when the contract ends. At these costs, it's disposable infrastructure.

AI chat over a niche dataset. Recipe collections, legal filings, research papers, local government meeting minutes, any corpus that's too specialized for general-purpose LLMs to know well. The web component means you can ship it as a standalone tool without building a frontend from scratch.

RAG for your MCP workflow. If you're already using Claude Desktop or Cursor, the MCP server turns your knowledge base into another tool your assistant can reach for. Upload your team's runbooks and architecture docs, and now @ragstack in your editor gives you instant context without tab-switching.

Wrapping Up

The serverless RAG pipeline you just deployed handles document processing, multimodal embeddings, metadata extraction, and AI chat with source citations, all scaling to zero when idle, all running in your AWS account. Your documents, your vectors, your infrastructure. The traditional approach to this stack costs 120-500 USD/month in baseline infrastructure. This one costs pocket change.

The full source is at github.com/HatmanStack/RAGStack-Lambda. File issues, open PRs, or just poke around the architecture. If you want to go deeper on the technical tradeoffs, particularly how filtered vector search behaves on cost-optimized backends like S3 Vectors, that's a story for the next post.

How to Run a Docker Container in AWS Lambda

Agnes Olorundare — Wed, 24 Dec 2025 23:38:56 +0000

While containers are quite lightweight and provide various benefits, it can be challenging to decide how best to deploy them. There are a number of ways to deploy and run Docker containers. But some are best for orchestrating and managing containers, and may not suit a simple use case of running just one container.

In this article, I’ll teach you how you can deploy a single Docker container using a serverless service on AWS called Lambda.

Prerequisite/ Requirements
Serverless with AWS Lambda
How to Build, Run, and Test a Container Locally
How to Push Your Image to Amazon Elastic Container Registry (ECR)
How to Deploy Your Docker Image to Lambda
Cleanup
Conclusion

Prerequisite/ Requirements

The following tools and skills are necessary for following along with this tutorial:

Knowledge of Docker, and have Docker installed locally.
An AWS account with credentials with administrative privilege for making API calls via the CLI. Best practice would be to limit the privilege to exactly what needs to be done.
AWS CLI installed locally
Python virtual environment managers such as uv (optional)

Serverless with AWS Lambda

Containers provide a lightweight, consistent, and resource-friendly way of running applications. Serverless takes away the overhead of managing the underlying infrastructures on which the container runs. So as you can probably start to see, combining these tools helps you deploy applications in a way that lets you focus on business logic, performance, and what gives your product a competitive edge/ advantage.

One AWS tool that enables you to go serverless is Lambda. With Lambda, you’re only billed for the number of times the code in the function runs, the memory you selected at the time of provisioning the service, and the duration of each invocation of the function.

In addition to removing operational overhead, Lambda can also help you save money since you won’t have to deal with idle resources. The function only comes alive when triggered by a request sent to it.

How to Build, Run, and Test a Container Locally

Docker is a tool that helps you package applications or software into portable, standardized and shareable units that have everything the applications need such as libraries, runtime, system tools, application code, in order to run. These units are called containers.

In this section, I’ll walk you through building the Docker image, running the container, and testing it after it’s running.

You can find the project that you’ll be using here in this GitHub repository.

Build the Docker Image

To run a Docker container, you first need to build an image. The image becomes the template or class from which you create the container or instance of the class.

You can find the code to build an image in lambda_function.py.

# lambda_function.py

def lambda_handler(event, context):
    name = event["name"]
    message = f"Hello, {name}!"

    try:
        return {
            "statusCode": 200,
            "body": message
        }
    except Exception as e:
        return {
            "statusCode": 400,
            "body": {"error": str(e)}
        }

As you can see from the code above, this is a very basic Python application that expects a POST HTTP request, with a JSON payload that contains the key – name – and a corresponding value. The code then returns a greeting containing the name it has received. The application has just a single function, which also serves as the entry point to it.

To build a Docker image, you’ll need a Dockerfile to provide the blueprint for the image. For this specific case, the Dockerfile you’ll use is also very basic. Each line in a Dockerfile is called a Directive, and this provides the instruction Docker should follow when creating an image. So building a Docker image means creating a template for a container by following the instructions or directives in the Dockerfile.

# Dockerfile

FROM public.ecr.aws/lambda/python:3.12

# Copy function code... LAMBDA_TASK_ROOT is /var/task, the working directory set in the base image
COPY lambda_function.py ${LAMBDA_TASK_ROOT}    

# Set the CMD to your handler - lambda_handler
CMD ["lambda_function.lambda_handler"]

A Dockerfile usually starts with a base image. To deploy an application as a Docker container in AWS Lambda, the base image has to be of a specific kind, depending on the application run-time. For this case, you’ll need the Python run-time, so the base image is public.ecr.aws/lambda/python:3.12. It’s okay to use a different Python version.

The next directive in the Dockerfile is copying the lambda_function.py file to a specific path in the base image. That path is referenced using an environment variable that has already been defined in the base image and points to /var/task. This is the directory your code will be running from.

The last directive is simply a command to start the application when the container runs.

Now, you can run the build command from the project’s root directory:

docker build -t : .

Run the Docker Container

Next, let’s create a running container from this image.

docker run -it --rm -p 8080:8080  lambda_docker:1.0.0

The command above will create a container and run it in interactive mode just so you can see the logs generated by the application in the container. Port 8080 is also exposed on the host where the container is running and mapped to the container port, which is also 8080 (defined by AWS). The container gets automatically removed once you kill the running process with CTRL + C.

Test the Running Container

Now confirm that the application running within the container can receive and process requests. To do this, use the code in the test.py file:

# test.py

import requests

url = "http://localhost:8080/2015-03-31/functions/function/invocations"

data = {
    "name": "Janet"
}

response = requests.post(url, json=data)

print("Status Code:", response.status_code)
print("Response Body:", response.json())

You can use the Python requests library to make this call. Install the library by using a virtual environment to isolate the application from your overall system. This helps prevent issues with conflicts in the versions of libraries you install for an application to use.

If you’re using uv to manage your virtual environment, simply run the command:

uv add requests

Then run the code in test.py from within the virtual environment:

uv run python3 test.py

You should see the desired response on the terminal.

How to Push Your Image to Amazon Elastic Container Registry (ECR)

Now that you have a working Docker image to deploy to Lambda, the next step is to push the image to a Docker registry. For this use case, your image has to be pushed to Amazon ECR, a container registry for storing Docker images.

To push your Docker image, you first need to tag the image, which simply means naming the image in a specific way.

Currently, this image tag is lambda-docker:1.0.0. To tag it the AWS way, first create an ECR repository. Let’s use the AWS CLI for this (this requires you to configure the AWS credentials locally by running the aws configure command and providing your credentials).

Setup Environment Variables

# Set AWS profile
export AWS_PROFILE=

# Set other variables

AWS_REGION=
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
REPO_NAME=lambda-docker
TAG=1.0.0

The above commands set the AWS_PROFILE for the CLI to target the right AWS account for API calls. The other variables specify the region, account ID, and the ECR repository name and tag.

Create ECR Repository and Authenticate

Now, create the ECR repository:

aws ecr create-repository \
  --repository-name "$REPO_NAME" \
  --region "$AWS_REGION"

Authenticate to Amazon ECR:

aws ecr get-login-password --region "$AWS_REGION" \
  | docker login \
  --username AWS \
  --password-stdin "$ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com"

Tag and Push the Docker Image

Now, tag the Docker image:

docker tag $REPO_NAME:$TAG \
  $ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$REPO_NAME:$TAG

Push the image to the ECR repository you created:

docker push $ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$REPO_NAME:$TAG

And that’s it! Your image is now in ECR.

How to Deploy Your Docker Image to Lambda

With your image now in ECR, you can create a Lambda function. Navigate to the Lambda console, and click Create a Function.

Create Lambda Function

Select Container Image and go ahead to search for the ECR repository you created.

Next, select the image:

Leave other configurations as default and click create.

Navigate to the function after creating.

Test Deployment

Now, let’s test the deployment. For this, simply use the existing Lambda Test tab. Provide all the details needed, including the payload for your POST request.

And that’s it. You’ve successfully deployed a Docker container on AWS by leveraging ECR and Lambda. You can go a step forward by integrating API Gateway and making the function accessible from the internet.

Cleanup

Remember to delete the services you’ve created on your AWS ECR repository and Lambda to avoid extra charges.

Conclusion

Deploying your Docker container on AWS Lambda is an efficient way to get your application running quickly without being bothered by managing servers or platforms.

Thanks for reading!

How to Build a Full-Stack Serverless CRUD App using AWS and React

Chisom Uma — Tue, 21 Oct 2025 16:37:30 +0000

Imagine running a production application that automatically scales from zero to thousands of users without ever touching a server configuration. That's the power of serverless architecture, and it's easier to implement than you might think.

If you're a junior cloud engineer ready to move beyond theoretical AWS concepts and build something real, this tutorial walks you through creating a complete serverless coffee shop management system.

You'll learn how to architect, deploy, and secure a production-ready application using AWS's most powerful serverless services.

Without further ado, let's get started!

Prerequisites
Tools We’ll be Using
What We are Building
Why Serverless?
Architectural Overview
Build a Serverless Full-Stack App
Troubleshooting Access Denied Error
Conclusion

Prerequisites

Basic knowledge of AWS.
Basic knowledge of AWS serverless services.
Knowledge of React (not required).
Basic knowledge of Postman or other API testing tools.

Tools We’ll be Using

What We are Building

We'll build a complete serverless coffee shop management system using AWS cloud services. Coffee shop owners will securely log in through AWS Cognito authentication and have full control over their inventory, adding new products, updating stock levels, viewing current inventory, and removing discontinued items. To follow along with this tutorial, you can clone the repo here.

This is what our user interface (UI) looks like:

Why Serverless?

AWS serverless services like Lambda, Cognito, and API Gateway automatically scale to zero during quiet periods and instantly ramp up when traffic spikes. While 'serverless' might sound like there are no servers at all, this isn't actually the case. It means that AWS handles all the heavy lifting, provisioning, managing, and scaling of the infrastructure behind the scenes. You only pay for what you use.

Architectural Overview

Our architecture uses DynamoDB as the data store, with Lambda functions (enhanced by Lambda layers) handling all API Gateway requests. Cognito secures the API Gateway, while CloudFront CDN delivers everything globally. The React frontend connects directly to the Cognito UserPool and gets hosted on S3 with CloudFront distribution. For production deployments, you can add a custom domain using CloudFlare and AWS Certificate Manager.

Build a Serverless Full-Stack App

In this section, you’ll build a full-stack serverless architecture.

Step 1: Create a DynamoDB table

To create a DynamoDB table, navigate to your AWS console and select the DynamoDB section. You can do this quickly by typing “DynamoDB” into the AWS search bar and clicking on DynamoDB. Next, follow the steps below to complete your table creation:

Click Create table.
Input table name as “CoffeeShop” or anything you want to name it.
Input partition key as “coffeeId” or anything you want to name it.
Click Create table.

Step 1.1: Create items

You need to create items for the table. This helps with testing connectivity to your DynamoDB table.

For our use case, we’ll be creating an item in the table called “coffee” and input attributes such as coffeeId, name, price, and availability. To create an item:

Click Explore items on the left navigation pane.
Click Create items.
Click the CoffeeShop radio button, then click Create item.

Click Add new attribute. This allows you to add different data types such as strings and booleans. The JSON structure below shows the attributes created.


{
    "coffeeId": "c123",
    "name": "new cold coffee",
    "price": 456,
    "available": true
}

Step 2: Create an IAM role for the Lambda function

Next, create a Lambda function that interacts with the DynamoDB table using an IAM role attached to the function. We’ll be setting up an IAM role named "CoffeeShopRole" that serves as a shared execution role for all Lambda functions in the coffee shop application.

This role includes the following permissions:

CloudWatch Logs: Full logging capabilities (create, write, and manage log streams)
DynamoDB Access: Complete read, write, update, and delete operations on the "CoffeeShop" table.

To do this:

Navigate to the AWS IAM console.
Navigate to Roles.
Click Create role.
Select the Lambda service.
Search for “AWSLambdaBasicExecutionRole.”
Name your role and click Create role.

This is what the role looks like:


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "dynamodb:PutItem",
                "dynamodb:DeleteItem",
                "dynamodb:GetItem",
                "dynamodb:Scan",
                "dynamodb:UpdateItem"
            ],
            "Resource": "arn:aws:dynamodb::"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        }
    ]
}

This policy allows us to create CloudWatch logs. Next, create an inline policy to allow communications to DynamoDB. Select the following actions for the table:

Get
Put
Update
Scan
Delete

Next, connect your table ARN to the policy by navigating to the created table and copying the ARN into the policy.

Step 3: Create Lambda Layer And Lambda Functions

Now, we need to connect our Lambda function to the DynamoDB table. For this, we’ll need the DynamoDB JavaScript SDK. To get started, create two folders: lambda > get in your IDE, preferably VS Code. Navigate into these folders in your terminal and run the npm init command to initialize your project. Update your package.json file with this:


{
  "name": "get",
  "type": "module",
  "version": "1.0.0",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "description": ""
}

Note: that we’ll be using ECMAScript throughout the course of this tutorial.

Next, we have to create a reusable Node.js Lambda layer containing the DynamoDB JavaScript SDK and shared utility functions. This layer acts like a common library that can be attached to multiple Lambda functions, eliminating the need to bundle the same dependencies repeatedly in each function's deployment package.

To use the SDK, create a new folder in your directory titled index.mjs and paste in the code below:


// getCoffee function
import { DynamoDBClient, GetItemCommand } from "@aws-sdk/client-dynamodb"; // ESM import
const config = {
    region: "us-east-1",
};
const client = new DynamoDBClient(config);
export const getCoffee = async (event) => {
    const coffeeId = "c123";
    const input = {
        TableName: "CoffeShop",
        Key: {
            coffeeId: {
                S: coffeeId,
            },
        },
    };
    const command = new GetItemCommand(input);
    const response = await client.send(command);
    console.log(response);
    return response;
}

The code above is the getCoffee function that connects to the DynamoDB table called CoffeShop, looks up the coffee with the ID c123, and displays its details.

Change region to your specific region.

Next, install the Lambda dependencies for the SDK using the command below:


npm i @aws-sdk/client-dynamodb @aws-sdk/lib-dynamodb

Then, create a zip file for all the current files using the command below:

zip -r get.zip ./*

This creates a zip file in your project directory. Now, navigate to the Lambda function page on your AWS console and upload this zip file.

Click Test to test your application. If you run into an error, edit the Runtime settings and change the handler name to index.getCoffee. Deploy and run the code again, you should get a successful response from DynamoDB as shown below:

Response:


{
  "$metadata": {
    "httpStatusCode": 200,
    "requestId": "R14Q5UMTP3K9P9NAF1OGG0IB57VV4KQNSO5AEMVJF66Q9ASUAAJG",
    "attempts": 1,
    "totalRetryDelay": 0
  },
  "Item": {
    "available": {
      "BOOL": true
    },
    "price": {
      "N": "34"
    },
    "name": {
      "S": "My New Coffee"
    },
    "coffeeId": {
      "S": "c123"
    }
  }
}

Now, let’s make the necessary changes to make our function ready for the API gateway to get the API. When someone requests a coffee using the /coffee endpoint, we want the app to returns a list of all coffees. But if the request is made to /coffee/c123 or /coffee/id, then the app returns only details about that specific coffee.

To do this, head back to your index.mjs file and paste in the code below:


import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand, ScanCommand } from "@aws-sdk/lib-dynamodb";
const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);
const tableName = process.env.tableName || "CoffeShop";
const createResponse = (statusCode, body) => {
    const responseBody = JSON.stringify(body);
    return {
        statusCode,
        headers: { "Content-Type": "application/json" },
        body: responseBody,
    };
};
export const getCoffee = async (event) => {
    const { pathParameters } = event;
    const { id } = pathParameters || {};
    try {
        let command;
        if (id) {
            command = new GetCommand({
                TableName: tableName,
                Key: {
                    "coffeeId": id,
                },
            });
        }
        else {
            command = new ScanCommand({
                TableName: tableName,
            });
        }
        const response = await docClient.send(command);
        return createResponse(200, response);
    }
    catch (err) {
        console.error("Error fetching data from DynamoDB:", err);
        return createResponse(500, { error: err.message });
    }
}

Run the zip -r get.zip ./* command again and re-upload the zip file in your Lambda function page.

This AWS Lambda function implements a serverless API endpoint for retrieving coffee data from a DynamoDB table, using the AWS SDK v3 to create a document client that can either fetch a specific coffee item by ID (when an id parameter is provided in the URL path) or return all items from the table (when no ID is specified, though there's a missing import for ScanCommand).

The function extracts the coffee ID from the incoming event's path parameters, constructs the appropriate DynamoDB command (GetCommand for single items or ScanCommand for all items), executes the database operation, and returns a properly formatted HTTP response with JSON headers and appropriate status codes - either a 200 success response with the coffee data or a 500 error response if something goes wrong during the database operation.

Repeat the steps above for the create, update, and delete functions. You can find these functions in your cloned project repo.

Step 4: Create an API Gateway To Expose Lambda Functions

To create an API that points to the Lambda function:

Navigate to API Gateway > Routes and click Create.
Create the following endpoints.


GET /coffee  -> getCoffee lambda function
GET /coffee/{id}  -> getCoffee lambda function
POST /coffee  -> createCoffee lambda function
PUT /coffee/{id}  -> updateCoffee lambda function
DELETE /coffee/{id}  -> deleteCoffee lambda function

Navigate to Integrations and create integrations for these endpoints. To do this, go to the Manage integrations tab, click Create, and select Lambda as the integration target.

Now, in your API Gateway portal, click on API: CoffeeShop...(random numbers) and copy the invoke URL for testing, as shown in the image below:

The get request with an id returns a 200 OK response with the created items in DynamoDB. You can play around with the rest of the endpoints on Postman :)

Adding Lambda Layer to Solve the Dependency Issue

Before we continue with this tutorial, I’d like to address one problem with the previous steps so far. All functions use the same dependency, but for each function, we had to maintain separate node_modules folders and packages.json files. To fix this issue, we’ll be using Lamba Layer. Layer contains all the dependencies, while the functions contain only your code.

To get started:

Create a new folder in your IDE called LambdaWithLayer.
Create two additional folders under the LambdaWithLayer named LambdaFunctionsWithLayer and nodejs.

Note: You must use the name nodejs for this to work.

Navigate to the nodejs folder and initialize using the npm init command.
Install dependencies using the command below:

npm i @aws-sdk/client-dynamodb @aws-sdk/lib-dynamodb

Create a new file called utils.js under the nodejs folder and paste in the code below:


import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import {
    DynamoDBDocumentClient,
    ScanCommand,
    GetCommand,
    PutCommand,
    UpdateCommand,
    DeleteCommand
} from "@aws-sdk/lib-dynamodb";
const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);
const createResponse = (statusCode, body) => {
    return {
        statusCode,
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(body),
    };
};
export {
    docClient,
    createResponse,
    ScanCommand,
    GetCommand,
    PutCommand,
    UpdateCommand,
    DeleteCommand
};

Here, we imported all the commands for our API operations. Now, we can create Lambda Functions without installing the SDK dependencies for each one. For example, you can create a get folder under the LambdaFunctionsWithLayer folder for the get function, then create an index.mjs file under the get folder. Next, paste the code below:


import { docClient, GetCommand, ScanCommand, createResponse } from '/opt/nodejs/utils.mjs'; // Import from Layer
const tableName = process.env.tableName || "CoffeShop";
export const getCoffee = async (event) => {
    const { pathParameters } = event;
    const { id } = pathParameters || {};
    try {
        let command;
        if (id) {
            command = new GetCommand({
                TableName: tableName,
                Key: {
                    "coffeeId": id,
                },
            });
        }
        else {
            command = new ScanCommand({
                TableName: tableName,
            });
        }
        const response = await docClient.send(command);
        return createResponse(200, response);
    }
    catch (err) {
        console.error("Error fetching data from DynamoDB:", err);
        return createResponse(500, { error: err.message });
    }
}

Now we can see that, in the code, we no longer require dependencies for the get function. We just imported from the layer.

Repeat the above steps for other functions.

Note: You can find the code for other functions in the cloned repo.

Create a zip folder for each function. You can do this by creating a file called create_zip.sh under the LambdaFunctionsWithLayer folder. Then paste the script below:


echo "Creating zip for layer"
zip -r layer.zip nodejs
echo "Creating zip for GET Function"
cd LambdaFunctionsWithLayer/get
zip -r get.zip index.mjs
mv get.zip ../../
cd ../..
echo "Creating zip for POST Function"
cd LambdaFunctionsWithLayer/post
zip -r post.zip index.mjs
mv post.zip ../../
cd ../..
echo "Creating zip for UPDATE Function"
cd LambdaFunctionsWithLayer/update
zip -r update.zip index.mjs
mv update.zip ../../
cd ../..
echo "Creating zip for DELETE Function"
cd LambdaFunctionsWithLayer/delete
zip -r delete.zip index.mjs
mv delete.zip ../../
cd ../..
echo "Success!"

Run the script using the sh create_zip.sh command. This creates zip files (including a layer.zip file) that you can upload to your AWS Lambda function Layer page.

In your AWS Lambda function page, navigate to Layers and upload the layer.zip file**.**
Update the functions by uploading the newly created zip files for each code.
Add the layer to the function by clicking Layers in the function view:

Next, click Add a layer, then select Custom layers. Then choose “DynamoDBLayer” and version “1”.

Click Add.
Repeat for all the other functions.

Step 5: Set up React Application And Upload Build To S3 Bucket

To set up our React application, navigate to the frontend folder of the cloned repository on your local machine and run npm install to install the dependencies. Then run npm run dev to start your development environment on your local machine. You should see the preview in your browser at: http://localhost:5173/.

If you inspect the page using Chrome DevTools, you’ll see that we ran into some CORS error:

Now, let’s fix this problem. To do that:

Navigate your API Gateway page.
Click on CORS on the left navigation panel.
Click Configure.
Copy your localhost URL and paste it into the Access-Control-Allow-Origin field.

Ensure to remove the / at the end of your URL as shown in the image above.

Click Add.
Enter the Access-Control-Allow-Headers field with the text content-type and click Add.
Include GET, POST, OPTIONS, PUT, and DELETE in Access-Control-Allow-Methods.
Click Save.

Now it returns our coffee, and the CORS error has been resolved.

When you add a new coffee, you should see the newly created items in your DynamoDB database.

Step 6: Set up Amazon API Gateway Authorizer

AWS Congnito helps you secure your Amazon API Gateway. Gateway validates the access token with Amazon Cognito to ensure it is valid and has not expired, and grants or denies access based on token validity.

To get started:

Navigate to Amazon Cognito > User pools.
Click Create user pool.
Select Single-page application (SPA).
Select email as the preferred sign-in and sign-up method.
Use http://localhost:5174/ or your own local URL as the return URL.
Click Create user directory.

You’ll be presented with a page containing code that we can copy and paste into our app for integration. But before we do that, let's head back to API Gateway and integrate it with Cognito. To do that:

Go to the Authorization section in API Gateway.
Navigate to Manage authorizers.
Click Create.
Select JWT and name it “Cognito-CoffeeShop”
Copy your issuer URL from Cognito Overview. Your issuer URL is the Token signing key URL. If you click on the URL, you’ll be taken to your browser, where you'll see the keys that’ll be used for verification.
For the Audience, navigate to the Cognito user pool, then to App clients, and select CoffeShopClient. Copy the Client ID.
Click Create.
Go to Routes and add authorizations to each endpoint.

Now, to integrate with our front-end app:

Navigate into the frontend folder and run the command below:

npm install oidc-client-ts react-oidc-context --save

Go to the App clients section in Cognito user pools to find the readily available code snippets for integration.
Edit your main.jsx file to include the code below:


import { createRoot } from 'react-dom/client'
import { BrowserRouter as Router, Route, Routes } from "react-router-dom";
import './index.css'
import App from './App.jsx'
import ItemDetails from "./ItemDetails";
import { AuthProvider } from "react-oidc-context";
const cognitoAuthConfig = {
  authority: "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_rXq7q3KLm",
  client_id: "6fjfrlaup7oph5lhf1q8q6pnp4",
  redirect_uri: "http://localhost:5174",
  response_type: "code",
  scope: "email openid phone",
};
createRoot(document.getElementById('root')).render(
  
    
      
        
          "/" element={} />
          "/details/:id" element={} />
        
      
    
  
)

Here, we imported AuthProvider from react-oidc-context, then wrapped our app with AuthProvider. Then, move the code in the App.jsx file to a newly created Home.jsx file, and update App.jsx file with the code below:


import { useEffect, useState } from "react";
import "./App.css";
// App.js
import { useAuth } from "react-oidc-context";
function App() {
  const auth = useAuth();
  const signOutRedirect = () => {
    const clientId = "6fjfrlaup7oph5lhf1q8q6pnp4";
    const logoutUri = "http://localhost:5174/";
    const cognitoDomain = "https://us-east-1rxq7q3klm.auth.us-east-1.amazoncognito.com";
    window.location.href = `${cognitoDomain}/logout?client_id=${clientId}&logout_uri=${encodeURIComponent(logoutUri)}`;
  };
  if (auth.isLoading) {
    return Loading...;
  }
  if (auth.error) {
    return Encountering error... {auth.error.message};
  }
  if (auth.isAuthenticated) {
    return (
      
        
        
      
    );
  }
  return (
    
      
      
    
  );
}
export default App;

Now, when you run the application again, you should see this login page on your browser:

When you click on Sign in, you’ll get directed to the Sign in page. Click Sign up. You should see the page below to create your account.

During sign-up, a verification code is sent to your sign-up email. Once you’re logged in, you can then access your coffee dashboard.

Step 7: Create Cloudfront Distribution With Behaviors For S3 And API Gateway

To create a distribution.

Navigate to CloudFront.
Click Create distribution.
In the Origin page, select the S3 bucket and browse through your created S3 buckets.
Select your coffee shop bucket.
Set origin path to /dist.
Select Origin access control under Origin access.
Update your React code and AWS Cognito with the distribution domain name provided in the CloudFront log-in pages tab.

Step 8: Set up React Application And Upload Build To S3 Bucket

In this step, we’ll be building our React application and uploading the static files to an Amazon S3 bucket, which is then served from a CloudFront distribution.

To get started:

Create an S3 bucket and give it the name “mycoffeeShop123new”. This name should be globally unique across all AWS accounts.
In the frontend folder, run the npm run build command. This creates a dist folder in your directory.
Head back to the S3 bucket and drag-and-drop the dist folder into S3 to upload it.
Click Upload.

Now, copy your CloudFront distribution URL and try to access your site in a private browser, for example, Chrome incognito. You should see your site live in the browser.

Troubleshooting Access Denied Error

You may encounter an access denied error in the browser:


<Error>
    <Code>AccessDeniedCode>
    <Message>Access DeniedMessage>
Error>

It may be because of a likely S3 + CloudFront configuration error. Here are the steps to resolve this issue:

Step 1: Set up Origin Access Control (OAC)

Go to CloudFront > Your Distribution > Origins tab.
Select your S3 origin and click Edit.
Under Origin access, select Origin access control settings (recommended)
Click Create new OAC (or select an existing one).
Click Save changes.

Step 2: Update S3 Bucket Policy

After saving, CloudFront will show you a "Copy Policy" button. Click it, then:

Go to your S3 bucket > Permissions tab.
Scroll to Bucket policy and click Edit.
Paste the copied policy (it should look like this):


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCloudFrontServicePrincipal",
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudfront.amazonaws.com"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::YOUR-BUCKET-NAME/*",
            "Condition": {
                "StringEquals": {
                    "AWS:SourceArn": "arn:aws:cloudfront::YOUR-ACCOUNT-ID:distribution/YOUR-DISTRIBUTION-ID"
                }
            }
        }
    ]
}

Click Save changes.

Step 3: Set Default Root Object

Go back to CloudFront > Your Distribution > General tab.
Click Edit.
Set Default root object to index.html.
Save changes.

Now, try accessing the site again. It should work.

This brings us to the end of this tutorial. I hope you were able to learn a thing or two about building serverless systems :)

Conclusion

Congratulations! You've just built a production-ready serverless application from the ground up. You've successfully architected a complete CRUD system that automatically scales, stays secure with Cognito authentication, and costs you only what you actually use.

How to Securely Deploy APIs to Amazon Lambda – A Practical Guide

Agnes Olorundare — Thu, 09 Oct 2025 23:13:17 +0000

Cyber attacks against APIs (Application Programming Interfaces) are on the increase. These attacks arise from issues with proper authentication, authorization, unnecessary data exposure, lack of request limits, resource consumption, and use of vulnerable third-party APIs.

Gaps in APIs can occur before requests reach the APIs, within the code housing the APIs, and even along the path of the APIs’ communication with downstream services, dependencies, or other microservices.

Attackers leverage flaws in APIs to gain access to confidential data, harvest or manipulate data, or even make your service unavailable through distributed denial of service attacks.

In this article, you’ll learn to deploy your APIs in Lambda and apply some security measures pre-function, within the function, and post-function.

What is an API?
Requirements/Prerequisites
Project Goal
Project Overall Architecture
Improvements
Conclusion

What is an API?

The focus of this article is the security of Application Programming Interfaces (APIs). An API is an interface that connects two programms or applications, allowing them to exchange data and communicate.

An API can be internal to an organization or it can belong to a third-party that allows other users to consume their data through the API.

Requirements/Prerequisites

While this tutorial is beginner-friendly, you’ll need the following prerequisites to follow along seamlessly:

A basic knowledge of the AWS Cloud.
An AWS account with administrator access.
AWS CLI. You can find the installation guide here. Follow the instructions for your operating system.
Python. You can visit Python’s official documentation site for a guide on how to download and install Python for your specific operating system.
Pipenv or any Python virtual environment creation tool. You can find the Pipenv installation guide here.
A basic knowledge of Git.
An API client, like Postman or Thunderclient.

Project Goal

By the end of this project, you should be able to deploy APIs in Lambda securely, leveraging AWS cloud-native security services.

Project Overall Architecture

Below is the architecture of the project workflow:

As shown in the architectural diagram, when a user sends a request (a JSON object consisting of the user’s name) to an API hosted in Lambda, the user first gets authenticated by an authentication service called Amazon Cognito.

The request passes through a Web Application Firewall, then an API Gateway. API Gateway will perform a check to see if the user is authorized to access the API using the token that the user sends with the request after authentication. API Gateway then allows the traffic to pass through to the API if the user is authorized.

The user’s request will first get to an External Lambda function, which will then save the user’s name as a message to a Simple Notification Service (SNS) topic. This will then invoke an Internal Lambda to run and log the output in Amazon CloudWatch logs. The SNS topic will be accessed by External Lambda using the SNS’s unique identifier stored in Amazon Secrets Manager.

AWS Set Up

You’ll need to set up an AWS environment to get started. This requires creating an account if you don’t already have one.

Following account creation, a root user is automatically created, with all privileges attached to the user. Security best practice is to create another user with administrator privileges and use this user for subsequent tasks.

Then, create an access key for this user, which usually consists of two parts (Access Key ID and Secret Access Key) by navigating to the following:

IAM —→ Users —→ Create Access key

Follow the prompts and choose the Command Line Interface option. Check the Confirmation box, and go on to create the key. Download the CSV file provided, or manually copy the Access Key ID and Secret Access Key. Save them securely.

Open up your terminal and run the following commands using the AWS CLI:

aws configure

The above command will give some prompts to provide the components of the Access Key created earlier and your default region (the AWS region hosting the service you intend to interact with).

Clone Project

In the next step, you’ll clone the GitHub repository containing the assets and resources used in the project implementation.

Visit the project URL and clone the repository locally.

git clone

Set Up Simple Notification Service

Amazon Simple Notification Service (SNS) connects system components, enabling asynchronous communication and messaging among them.

Find SNS on the console, click on it, and create a topic that your APIs will send messages to. After successfully creating a topic, navigate to the topic, and in the topic details, you’ll find the topic’s ARN. An ARN is an Amazon Resource Name, and it’s a unique string attached to a resource you’ve created on AWS to help identify the resource. Copy the ARN of the topic.

Set Up Secrets Manager

Amazon Secrets Manager is used to store, manage, and retrieve sensitive information such as keys, credentials, tokens, and so on. You’ll store the Topic ARN created earlier. With this approach, you’ll demonstrate how your API can securely access the data and information it needs for its performance.

Go to Secrets Manager on the AWS console and create a secret. Provide the secret’s details, and add a new secret named TOPIC_ARN as the key and the actual SNS Topic ARN as the value.

Next, you’ll create some Lambda functions to serve your APIs and consume the output of the APIs. There’re three Lambda functions to set up. Two of the functions will host APIs, each of which can only be accessed by specific users. These will be referred to as ExternalLambda. The third Lambda will consume the output of the External Lambda functions through SNS.

Set Up Internal Lambda

AWS Lambda is a serverless service on AWS that users can leverage to run application functions or code when needed. You’re billed for your Lambda function based on the number of invocations of the function, the duration each invocation lasted, and the amount of memory allocated to the function. Lambda can be provisioned to use any runtime, such as Python or NodeJS. In this demonstration, you’ll focus on the NodeJS runtime.

Now that you know what Lambda is and does, you can create one. Let’s call the first Lambda function InternalLambda. On the AWS console, search for Lambda, and on the Lambda dashboard, click Create a function and provide the details. We’ll be using Node.js – JavaScript at the backend as the runtime of choice.

For the Permissions details, let Lambda create a default IAM Role. This default role is named according to your function, and the permissions attached to the role allow your Lambda function to send logs to CloudWatch, another AWS service used for monitoring and observability.

As you can see in the last image above, the Lambda function you’ve created needs a trigger and sometimes, a destination. For your InternalLambda, the trigger is the SNS topic we configured earlier. This Lambda will read the messages that’ve been published to it, and then you can access the message from your client or even CloudWatch logs.

To achieve this, click the Add trigger button and provide the details.

Next, you’ll provide the code you want to invoke through Lambda. Find the code in the GitHub repository that you cloned earlier. Paste the code in the Lambda function code space and click on Deploy to deploy the function.

secure-lambda/InternalLambda/index.js

export const handler = async (event) => {
    try {
        console.log('Request successfully received from SNS');                            

        let name = event['Records'][0]['Sns']['Message'];
        let response = {
            statusCode: 200,
            body: JSON.stringify(`Hello ${name}. Greetings from InternalLambda!`),
        };       
        console.log('Response: ', response);                                                
        return response;
    } catch (err) {                            
        let response = {
            statusCode: 500,
            body: JSON.stringify('An error occurred while processing your request.'),
        };

        console.error('Error processing event', err);
        return response;
    }   
};

The function defined in the index.js file above is simply taking the event object sent to it from SNS and extracting the Message attribute within it. We’re using console.log here to view outputs from the function and ensure it behaves as expected. Just don’t use this in a production-ready application.

Set Up External Lambda

You’ll be creating two external Lambda functions: 1 and 2. These two functions will receive user requests, process them, and publish messages to your SNS topic.

On the Lambda console, create another function and name it ExternalLambda1. Allow Lambda to create a default IAM Role, as previously.

Paste the code snippet below in the ExternalLambda1 code space:

secure-lambda/ExternalLambda1/insex.js

import {
  GetSecretValueCommand,
  SecretsManagerClient,
} from "@aws-sdk/client-secrets-manager";

import { SNSClient, 
    PublishCommand 
} from "@aws-sdk/client-sns";

const secretsManagerClient = new SecretsManagerClient();

const snsClient = new SNSClient({});

// Fetch topicArn from AWS Secrets Manager
async function getSecretValue(secretName) {
    try {
        const data = await secretsManagerClient.send(
                            new GetSecretValueCommand({
                            SecretId: secretName,
                            }),
                        );
        if (data.SecretString) {
            return JSON.parse(data.SecretString);
        }   else {
            let buff = Buffer.from(data.SecretBinary, 'base64');
            return JSON.parse(buff.toString("utf-8"));
        }
    } catch (err) {
        console.error('Error retrieving secret', err);                             // added for debugging
        throw err;
    }
}                                        

export const handler = async (event) => {

    let name = event['name'];
    console.log(`Request successfully received from ${name}`);    

    // Retrieve SNS Topic ARN from Secrets Manager
    let topicArn;
    let response;
    try {
        const secret = await getSecretValue('LambdaSNSTopicARN');
        topicArn = secret.TOPIC_ARN;
    } catch (err) {
        response = {
            statusCode: 500,
            body: JSON.stringify('An error occured, try again later.'),
        };
        console.error('Failed to load SNS Topic ARN from Secrets Manager', err);
        return response;        
    }

    // Publish to SNS topic
   try {
        const snsResponse = await snsClient.send(
        new PublishCommand({
            Message: name,
            TopicArn: topicArn,
        })
        );
        console.log("Message published successfully:", snsResponse.MessageId);
        response = {
            statusCode: 200,
            body: JSON.stringify(`Hello ${name}. Greetings from ExternalLambda1! Message forwarded to InternalLambda.`),
        };
        return response;
  } catch (err) {
        response = {
            statusCode: 500,
            body: JSON.stringify(`Sorry ${name}.An error occurred while processing your request.`),
        };
        console.error("Failed to publish message:", err);
        return response;
  }  
};

The code above leverages the AWS SDK to fetch the ARN of the SNS topic created earlier from Secrets Manager. It then publishes a message to the topic.

The SDK already comes installed in the Lambda function. Outside of Lambda, the SDK has to be explicitly installed. The function receives its event from the client via API Gateway, which we’ll configure later.

The SNS topic you created earlier will be the destination for this function. For Lambda to publish a topic to SNS, it needs the necessary permission attached to its IAM Role. AWS can automatically take care of that during your configuration, as shown below.

For the trigger, you’ll use another service known as API Gateway. More on that later.

Follow the same steps to provision another Lambda known as ExternalLambda2.

The outcome of the External Lambda setup is as shown below:

Paste the code below into ExternalLambda2. It performs the same function as ExternalLambda1, but their output differ. Each of the two Lambda functions will be receiving traffic to a specific user that’s authorized to access the function.

secure-lambda/ExternalLambda2/index.js

import {
  GetSecretValueCommand,
  SecretsManagerClient,
} from "@aws-sdk/client-secrets-manager";

import { SNSClient, 
    PublishCommand 
} from "@aws-sdk/client-sns";

const secretsManagerClient = new SecretsManagerClient();

const snsClient = new SNSClient({});

// Fetch topicArn from AWS Secrets Manager
async function getSecretValue(secretName) {
    try {
        const data = await secretsManagerClient.send(
                            new GetSecretValueCommand({
                            SecretId: secretName,
                            }),
                        );
        if (data.SecretString) {
            return JSON.parse(data.SecretString);
        }   else {
            let buff = Buffer.from(data.SecretBinary, 'base64');
            return JSON.parse(buff.toString("utf-8"));
        }
    } catch (err) {
        console.error('Error retrieving secret', err);  
        throw err;
    }
}                                        

export const handler = async (event) => {

    let name = event['name'];
    console.log(`Request successfully received from ${name}`);    

    // Retrieve SNS Topic ARN from Secrets Manager
    let topicArn;
    let response;
    try {
        const secret = await getSecretValue('LambdaSNSTopicARN');
        topicArn = secret.TOPIC_ARN;
    } catch (err) {
        response = {
            statusCode: 500,
            body: JSON.stringify('An error occured, try again later.'),
        };
        console.error('Failed to load SNS Topic ARN from Secrets Manager', err);
        return response;        
    }

    // Publish to SNS topic
   try {
        const snsResponse = await snsClient.send(
        new PublishCommand({
            Message: name,
            TopicArn: topicArn,
        })
        );
        console.log("Message published successfully:", snsResponse.MessageId);
        response = {
            statusCode: 200,
            body: JSON.stringify(`Hello ${name}. Greetings from ExternalLambda2! Message forwarded to InternalLambda.`),
        };
        return response;
  } catch (err) {
        response = {
            statusCode: 500,
            body: JSON.stringify(`Sorry ${name}.An error occurred while processing your request.`),
        };
        console.error("Failed to publish message:", err);
        return response;
  }              
};

Before moving on, you need to modify the External Lambda’s IAM Roles. Currently, IAM Roles only have permissions to write to CloudWatch and SNS (automatically added). External Lambda also needs permission to fetch the ARN of the SNS topic that was created earlier.

The point here is to show how to leverage a secrets manager, such as AWS Secrets Manager, to store sensitive information or data, and still access these securely. This approach is more secure than storing the ARN as an environment variable within Lambda.

Navigate to IAM, and click on Policies tab on the left. This brings you to a list of policies. Next, click on Create policy.

Search for secrets manager in the Policy editor.

Select the permissions Lambda needs to access Secrets Manager. In this case, that would be Read —> GetSecretValue.

Select Specific for Resources, and click on Add ARNs. On the next tab, add the details of the Secrets Manager Secret created earlier.

The Secret’s ARN will be populated here.

Next, give the policy a name and create it.

Next, navigate to Roles, and search for the IAM Roles assigned to the External Lambda functions. These are named according to the Lambda.

Click Add permissions to add a new permission to the IAM Role selected.

Configure Web Application Firewall

A firewall is a system placed in front of an application, workload, APIs, and so on to inspect traffic, filter it, and either allow or block the traffic based on some preconfigured rules.

For this project, you’ll use the AWS Web Application Firewall (WAF) service to inspect user requests before routing the traffic to your APIs running in Lambda.

Head over to the AWS console and search for WAF.

Click on the IP sets tab on the left. This will enable you to create a list of IP addresses that you want to allow (as in this case) or deny.

The IP addresses should include a CIDR block. For instance, if adding a single IP address, it should be X.X.X.X/32. The same applies to IP address ranges such as X.X.X.X/24.

Next, click on the Web ACLs tab, then Create web ACL.

Choose Regional resources as the Resource type, and enter your region. It’s best to keep all resources you’re creating in this project within the same region. Give your Web ACL a name, then click next.

Add rules to the Web ACL.

Choose a rule type. In this case, you’ll use IP set, and give the rule a name. Choose the IP set created earlier.

Select Source IP address, and Count as the Action. For this project, you’ll focus on counting the requests sent to your APIs. But as shown in the image below, you can perform other actions, such as allow, block, and so on.

Your final rule configuration will appear this way.

Scroll down, then click on Create web ACL.

Configure Cognito User Pools

Amazon Cognito is an identity management service used for creating and managing users. You can leverage it to authenticate and authorize users to applications, APIs, or other workloads.

You’ll create User Pools within Cognito and add a user to each pool. You’ll configure how these users can be authenticated and authorized to access the External Lambda functions already created.

Search for Cognito on AWS.

Click on Get started for free, then Create user pool.

Select Single-page application (SPA), give the User pool the name MyUserPool1, and select Email as an option for sign-in. This means the main attribute users will provide at signup and sign-in will be their email address. Leave everything else as the default. We’ll keep things as simple as possible.

After creating the User pool, you’ll find the page shown below. You can view the login and signup page for the pool you’ve just created by clicking on the View login page button.

You can add App clients to your User Pool. By default, a client named MyUserPool1 will be added to the pool. Navigate to your User pool, and click on App clients to see details of this client. Note the Client ID. You’ll also make some edits to the App client by clicking on the Edit button.

You’ll edit the Authentication flows field by ticking the Sign in with username and password… and Sign in with server-side administrative credentials… boxes. These changes will enable you to authenticate the user who will be added to this client programmatically, rather than through a UI. With this approach, we can fetch the token assigned to the user by Cognito and use this token to authorize access to Lambda.

Now, add a user to this pool. The user needs a valid email address. You’ll need the login page URL to create the user.

You need access to the email used to create the user. Fetch the code sent to the email address and submit to confirm the account.

Follow the same steps and create another User pool named MyUserPool2. Add a user with a different email to this pool.

Configure API Gateway

API Gateway is a service used to manage access and route traffic to API backend services such as APIs. It serves as a reverse proxy and provides an extra layer of security for backend services.

You’ll configure API Gateway to direct traffic to your Lambda functions.

Navigate to API Gateway and click on Create an API.

Select the REST API option —→ Build.

Select New API, provide a name, and choose Regional as the API endpoint type. IP address type can be IPv4 or Dualstack. We’ll select IPv4 here. Then create.

An important part of API Gateway configuration for this project is the Authorizer. API Gateway uses Authorizer to allow traffic from clients to backend services.

You’ll create two Authorizers. Each will be connected to one of the User pools you configured earlier. On the left-hand side of the API Gateway you configured, click on Authorizers —→ Create authorizer.

Provide the name AGAuthorizer1, and select Cognito as the Authorizer type. Add the User pool for MyUserPool1 created earlier. For the Token source, use Authorization. When you send a request from your API client, a token will be added to the request header for authorization. The token’s key will be named Authorization, while the value will be the token itself.

Create another Authorization for MyUserPool2 the same way.

Both Authorizers will appear this way.

Next, you’ll create resources and endpoints within the API Gateway that you’ve defined.

A resource in API Gateway is used to group certain endpoints within a specific path. You’ll define two resources within the API Gateway you’ve created. This will create two different paths, / and .

On the API Gateway dashboard, navigate to your Gateway, click on Create resource, define your root path (‘/’ in your case), and provide the resource name (lambda1).

Create another resource named lambda2.

Now, click on /lambda1, then Create method to define an endpoint within this resource. You’ll use the POST method to send requests to the backend service via this endpoint.

For the backend service or Integration type, select Lambda function, and provide the ARN of ExternalLambda1.

For Authorization, select AWS IAM —→ Cognito user pool authorizers —→ AGAuthorizer1. Leave other configurations, then create the endpoint.

Repeat the same step to create a POST method for /lambda2 resource. The method should be attached to ExternalLambda2, and AGAuthorizer2.

The API Gateway you’ve created needs to be deployed to become accessible. Deployment is usually done to a Stage.

Click on Deploy API, select New stage and name the stage development. Then, deploy.

After deployment to a stage, an invoke URL will be provided. This will serve as the base URL for the endpoints you’ve defined.

The stage you’ve created needs some modifications for enhanced security. Firstly, you need to attach the WAF that you created earlier. Secondly, the default rate limit for the API deployed to this stage is 10000. Rate limit restricts excessive resource consumption and protects your API from abuse. For this project, you can reduce the limit to 50.

To test the API Gateway set up, click on the endpoint you want to test, then the Test button. This initial test doesn’t need any authorization, since the test is done directly within the Gateway.

Add JSON data as the Request body. The key will be name, and the value will be any string.

The response sent back from ExternalLambda1 shows a status code of 200, and a response body containing exactly the message expected from the Lambda function.

If you head over to CloudWatch Log groups, you should also find the Log groups that were automatically created for the Lambda functions. Click on the Log group for ExternalLambda1 and navigate to the latest Log stream. You should find the logs for the request you’ve just made from API Gateway.

Test Setup End-to-End

To test our setup properly, and from the internet, send the same request from your API client with no additional information in the request header. This should return a 401 error – Unauthorized. This is expected.

API Gateway expects an authorization token from each request it receives before routing traffic to the appropriate backend service. It validates this token through Cognito.

You’ll mimic a user login for each user added to Coginito User pools to get a token for the user. This token will then be sent alongside any request. To achieve this, you’ll use the two Python scripts I’ve provided below:

secure-lambda/auth-scripts/user1.py

import boto3

client = boto3.client("cognito-idp")

response = client.initiate_auth(
    AuthFlow="USER_PASSWORD_AUTH",  # or ADMIN_USER_PASSWORD_AUTH if using admin creds
    AuthParameters={
        "USERNAME": "",             # user1 email
        "PASSWORD": ""              # user1 password
    },
    ClientId=""                     # Cognito App Client ID
)

id_token = response["AuthenticationResult"]["IdToken"]
access_token = response["AuthenticationResult"]["AccessToken"]
refresh_token = response["AuthenticationResult"]["RefreshToken"]

print("ID Token:", id_token)

Using the Python boto3 library, you’ll initiate an authentication request to Cognito. Provide the email address and password of the user in MyUserPool1. Also, add the Client ID of the App client.

To run the script, create an isolated environment using Pipenv, uv, or a similar library. Install the dependency used in the project as defined in the Pipfile, and run the script with the Pipenv shell.

pipenv install
pipenv shell
Python secure-lambda/auth-scripts/user1.py

The Python command will return with a token assigned to the user. Next, you use this token to authorize a user to access ExternalLambda1.

Ensure that the URL for the POST request is in the format: . You should receive a response from API Gateway indicating success.

Now try accessing ExternalLambda2 using User1 token. You should get an Unauthorized message. Note that user1 will always receive an unauthorized message when it tries accessing ExternalLambda1 without an Authorization token in the header, a wrong token, or when it tries accessing ExternalLambda2, which it is not authorized to access.

Repeat the process with User2 using the token generated for the user in MyUserPool2. First, test access to ExternalLambda2 without a token in the request header.

Then test access with the token.

Next, try accessing ExternalLambda1 using User2.

You can also view the outcome of some of the requests made by your client on CloudWatch Logs.

Also, since WAF has been configured previously to count requests (although, in a real scenario, you want to achieve much more with WAF, such as allow or block certain traffic), you can view activities captured by WAF by navigating to the service on AWS, then searching for the WAF you configured, and navigating to Traffic overview.

You can find other details, such as the client device types and where requests originated.

Clean Up

It’s important to clean up the resources created so far after the hands-on exercise. Due to the dependencies among the resources, trying to delete a resource that another resource depends on may lead to an error. So, you should delete them in this order:

Secrets Manager
Cognito – Users, App Client, then User Pool
API Gateway – Endpoints/ Methods, Resources, API, Stage
Web Application Firewall – IP Set, Web ACL
All Lambda Functions
Lambda IAM Roles and the policies attached to them
CloudWatch Log Group for all the Lambda functions
SNS Topic

Also, you can deactivate or delete the credentials created for your IAM Admin user if not in use.

Improvements

Consider the following areas to improve, apply best practices to, and enhance the security posture of your systems further.

Use of API keys
Third-party API consumption
API inventory management/ documentation
Resource provisioning using Infrastructure as Code

Conclusion

Security at every layer of an IT system is not negotiable. In this project, we’ve demonstrated how to leverage cloud-native solutions to secure APIs hosted in a serverless service, allowing only authorized users access to the APIs.

I’m Agnes Olorundare, and you can find out more about me on LinkedIn.

How to Build a Machine Learning System on Serverless Architecture

Kuriko — Tue, 26 Aug 2025 16:23:28 +0000

Let’s say you’ve built a fantastic machine learning model that performs beautifully in notebooks.

But a model isn’t truly valuable until it’s in production, serving real users and solving real problems.

In this article, you’ll learn how to ship a production-ready ML application built on serverless architecture.

Prerequisites
What We’re Building
The System Architecture
- Core AWS Resources in the Architecture
The Deployment Workflow in Action
Building a Client Application (Optional)
- The React Application
Final Results
Conclusion

Prerequisites

This project requires some basic experience with:

Machine Learning / Deep Learning: The full lifecycle, including data handling, model training, tuning, and validation.
Coding: Proficiency in Python, with experience using major ML libraries such as PyTorch and Scikit-Learn.
Full-stack deployment: Experience deploying applications using RESTful APIs.

What We’re Building

AI Pricing for Retailers

This project aims to help a middle-sized retailer compete with large players like Amazon.

Smaller companies often can’t afford significant price discounts, so they can face challenges finding optimal price points as they expand their product lines.

Our goal is to leverage AI models to recommend the best price for a selected product to maximize sales for the retailer, and display it on a client-side user interface (UI):

You can explore the UI from here.

The Models

I’ll train and tune multiple models so that when the primary model fails, a backup model gets loaded to serve predictions.

Primary Model: Multi-layered feedforward network (on the PyTorch library)
Backup Models (Backups): LightGBM, SVR, and Elastic Net (on the Scikit-Learn library)

The backup models are prioritized based on learning capabilities.

Tuning and Training

The primary model was trained on a dataset of around 500,000 samples (source) and fine-tuned using Optuna's Bayesian Optimization, with grid search available for further refinement.

The backups are also trained on the same samples and tuned using the Scikit-Optimize framework.

The Prediction

All models serve predictions on logged quantity values.

Logarithmic transformations of the quantity data make the distribution denser, which helps models learn patterns more effectively. This is because logarithms reduce the impact of extreme values, or outliers, and can help normalize skewed data.

Performance Validation

We’ll evaluate model performance using different metrics for the transformed and original data, with a lower value always indicating better performance.

Logged values: Mean Squared Error (MSE)
Actual values: Root Mean Squared Log Error (RMSLE) and Mean Absolute Error (MAE)

The System Architecture

We’re going to build a complete ecosystem around an AWS Lambda function to create a scalable ML system:

Fig. The system architecture (Created by Kuriko IWAI)

AWS Lambda is a serverless production where a service provider can run the application without managing servers. Once they upload the code, AWS takes on the responsibility of managing the underlying infrastructure.

In the serverless production, the code is deployed as a stateless function that runs only when it’s triggered by an event like HTTP requests or scheduled tasks.

This event-driven nature makes serverless production extremely efficient in resource allocation because:

There’s no server management: The cloud provider takes care of operational tasks.
You have automatic scaling: Serverless applications automatically scale up or down based on demand.
You have pay-per-use billing: Charged for the exact amount of compute resources the application consumes.

Note that other cloud ecosystems like Google Cloud Platform (GCP) and Microsoft Azure offer comprehensive alternatives to AWS. Which one you choose depends on your budget, project type, and familiarity with each ecosystem.

Core AWS Resources in the Architecture

The system architecture focuses on the following points:

The application is fully containerized on Docker for universal accessibility.
The container image is stored in AWS Elastic Container Registry (ECR).
The API Gateway’s REST API endpoints trigger an event to invoke the Lambda function.
The Lambda function loads the container image from ECR and perform inference.
Trained models, processors, and input features are stored in AWS S3 buckets.
A Redis client serves cached analytical data and past predictions stored in the ElastiCache.

And to build the system, we’ll use the following AWS resources:

Lamda: Serves a function to perform inference.
API Gateway: Routes API calls to the Lambda function.
S3 Storage: Serves feature store and model store.
ElastiCache: Store cached predictions and analytical data.
ECR: Stores Docker container images to allow Lambda to pull the image.

Each resource requires configuration. I’ll explore those details in the next section.

The Deployment Workflow in Action

The deployment workflow involves the following steps:

Draft data preparation, model training, and serialization scripts
Configure designated feature store and model store in S3
Create a Flask application with API endpoints
Publish a Docker image to ECR
Create a Lambda function
Configure related AWS resources

We’ll now walk through each of these steps to help you fully understand the process.

For your reference, here is the repository structure:

.
.venv/                  [.gitignore]    # stores uv venv
│
└── data/               [.gitignore]
│     └──raw/                           # stores raw data
│     └──preprocessed/                  # stores processed data after imputation and engineering
│
└── models/             [.gitignore]    # stores serialized model after training and tuning
│     └──dfn/                           # deep feedforward network
│     └──gbm/                           # light gbm
│     └──en/                            # elastic net
│     └──production/                    # models to be stored in S3 for production use
|
└── notebooks/                          # stores experimentation notebooks
│
└── src/                                # core functions
│     └──_utils/                        # utility functions
│     └──data_handling/                 # functions to engineer features
│     └──model/                         # functions to train, tune, validate models
│     │     └── sklearn_model
│     │     └── torch_model
│     │     └── ...
│     └──main.py                        # main script to run the inference locally
│
└──app.py                               # Flask application (API endpoints)
└──pyproject.toml                       # project configuration
└──.env                [.gitignore]     # environment variables
└──uv.lock                              # dependency locking
└──Dockerfile                           # for Docker container image
└──.dockerignore
└──requirements.txt
└──.python-version                      # python version locking (3.12)

Step 1: Draft Python Scripts

The first step is to draft Python scripts for data preparation, model training and tuning.

We’ll run these scripts in a batch process because these are resource-intensive and stateful tasks that aren’t suitable for serverless functions optimized for short-lived, stateless, and event-driven tasks.

Serverless functions also can experience cold starts. With heavy tasks in the function, the API gateway would timeout before serving predictions.

src/main.py

import os
import torch
import warnings
import pickle
import joblib
import numpy as np
import lightgbm as lgb
from sklearn.linear_model import ElasticNet
from sklearn.svm import SVR
from skopt.space import Real, Integer, Categorical
from dotenv import load_dotenv

import src.data_handling as data_handling
import src.model.torch_model as t
import src.model.sklearn_model as sk


if __name__ == '__main__': 
    load_dotenv(override=True)
    os.makedirs(PRODUCTION_MODEL_FOLDER_PATH, exist_ok=True)

    # create train, validation, test datasets
    X_train, X_val, X_test, y_train, y_val, y_test, preprocessor = data_handling.main_script()

    # store the trained preprocessor in local storage
    joblib.dump(preprocessor, PREPROCESSOR_PATH)

    # model tuning and training
    best_dfn_full_trained, checkpoint = t.main_script(X_train, X_val, y_train, y_val)

    # serialize the trained model
    torch.save(checkpoint, DFN_FILE_PATH)

    # svr
    best_svr_trained, best_hparams_svr = sk.main_script(
        X_train, X_val, y_train, y_val, **sklearn_models[1]
    )
    if best_svr_trained is not None:
        with open(SVR_FILE_PATH, 'wb') as f:
            pickle.dump({ 'best_model': best_svr_trained, 'best_hparams': best_hparams_svr }, f)

    # elastic net
    best_en_trained, best_hparams_en = sk.main_script(
        X_train, X_val, y_train, y_val, **sklearn_models[0]
    )
    if best_en_trained is not None:
        with open(EN_FILE_PATH, 'wb') as f:
            pickle.dump({ 'best_model': best_en_trained, 'best_hparams': best_hparams_en }, f)

    # light gbm
    best_gbm_trained, best_hparams_gbm = sk.main_script(
        X_train, X_val, y_train, y_val, **sklearn_models[2]
    )

    if best_gbm_trained is not None:
        with open(GBM_FILE_PATH, 'wb') as f:
            pickle.dump({'best_model': best_gbm_trained, 'best_hparams': best_hparams_gbm }, f)

Run the script to train and serialize the models using the uv package management:

$uv venv
$source .venv/bin/activate
$uv run src/main.py

The main.py script includes several key components.

Scripts for Data Handling

These scripts involve loading original data, structure missing values, and engineer features necessary for the future prediction.

src/data_handling/main.py

import os
import joblib
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

import src.data_handling.scripts as scripts
from src._utils import main_logger


# load and save the original data frame in parquet
df = scripts.load_original_dataframe()
df.to_parquet(ORIGINAL_DF_PATH, index=False)

# imputation
df = scripts.structure_missing_values(df=df)

# feature engineering
df = scripts.handle_feature_engineering(df=df)

# save processed df in csv and parquet
scripts.save_df_to_csv(df=df)
df.to_parquet(PROCESSED_DF_PATH, index=False)


# for preprocessing, classify numerical and categorical columns
num_cols, cat_cols = scripts.categorize_num_cat_cols(df=df, target_col=target_col)
if cat_cols:
    for col in cat_cols: df[col] = df[col].astype('string')

# creates training, validation, and test datasets (test dataset is for inference only)
y = df[target_col]
X = df.copy().drop(target_col, axis='columns')
test_size, random_state = 50000, 42
X_tv, X_test, y_tv, y_test = train_test_split(
    X, y, test_size=test_size, random_state=random_state
)
X_train, X_val, y_train, y_val = train_test_split(
    X_tv, y_tv, test_size=test_size, random_state=random_state
)

# transform the input datasets
X_train, X_val, X_test, preprocessor = scripts.transform_input(
    X_train, X_val, X_test, num_cols=num_cols, cat_cols=cat_cols
)

# retrain and serialize the preprocessor
if preprocessor is not None: preprocessor.fit(X)
joblib.dump(preprocessor, PREPROCESSOR_PATH)

Scripts for Model Training and Tuning (PyTorch Model)

The scripts involve initiating the model, searching optimal neural architecture and hyperparameters, and serializing the fully-trained model so that the system can load the trained model when performing inference.

Because the primary model is built on PyTorch and the backups use Scikit-Learn, we’re drafting the scripts separately.

1. PyTorch Models

The training script contains training the model with the validation over a subset of training data.

It contains the early stopping logic when the loss history is not improved for a given consecutive epochs (that is, 10 epochs).

src/model/torch_model/scripts/training.py

import torch
import torch.nn as nn
import optuna # type: ignore
from sklearn.model_selection import train_test_split

from src._utils import main_logger

# device
device_type = device_type if device_type else 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
device = torch.device(device_type)

# gradient scaler for stability (only applicable for cuba)
scaler = torch.GradScaler(device=device_type) if device_type == 'cuba' else None

# start training
best_val_loss = float('inf')
epochs_no_improve = 0
for epoch in range(num_epochs):
    model.train()
    for batch_X, batch_y in train_data_loader:
        batch_X, batch_y = batch_X.to(device), batch_y.to(device)
        optimizer.zero_grad()

        try:
            # pytorch's AMP system automatically handles the casting of tensors to Float16 or Float32
            with torch.autocast(device_type=device_type):
                outputs = model(batch_X)
                loss = criterion(outputs, batch_y)

                # break the training loop when models return nan or inf
                if torch.any(torch.isnan(outputs)) or torch.any(torch.isinf(outputs)):
                    main_logger.error(
                        'pytorch model returns nan or inf. break the training loop.'
                    )
                    break

            # create scaled gradients of losses
            if scaler is not None:
                scaler.scale(loss).backward()
                scaler.unscale_(optimizer)  # cliping grad
                nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
                scaler.step(optimizer)  # unscales the gradients
                scaler.update()  # updates the scale

            else:
                loss.backward()
                nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0) # cliping grad
                optimizer.step()

        except:
            outputs = model(batch_X)
            loss = criterion(outputs, batch_y)
            loss.backward()
            optimizer.step()


    # run validation on a subset of the training dataset
    model.eval()
    val_loss = 0.0

    # switch the torch mode
    with torch.inference_mode():
        for batch_X_val, batch_y_val in val_data_loader:
            batch_X_val, batch_y_val = batch_X_val.to(device), batch_y_val.to(device)
            outputs_val = model(batch_X_val)
            val_loss += criterion(outputs_val, batch_y_val).item()

    val_loss /= len(val_data_loader)

    # check if early stop
    if val_loss < best_val_loss - min_delta:
        best_val_loss = val_loss
        epochs_no_improve = 0
    else:
        epochs_no_improve += 1
        if epochs_no_improve >= patience:
            main_logger.info(f'early stopping at epoch {epoch + 1}')
            break

The tuning script uses the study component from the Optuna library to run the Bayesian Optimization.

The study component choose a neural architecture and hyperparameter set to test from the global search space.

Then, it builds, trains, and validates the model to find the optimal neural architecture that can minimize the loss (MSE, for instance).

src/model/torch_model/scripts/tuning.py

import itertools
import pandas as pd
import numpy as np
import optuna
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.model_selection import train_test_split

from src.model.torch_model.scripts.pretrained_base import DFN
from src.model.torch_model.scripts.training import train_model
from src._utils import main_logger

# device
device_type = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
device = torch.device(device_type)

# loss function
criterion = nn.MSELoss()

# define objective function for optuna
def objective(trial):
    # model
    num_layers = trial.suggest_int('num_layers', 1, 20)
    batch_norm = trial.suggest_categorical('batch_norm', [True, False])
    dropout_rates = []
    hidden_units_per_layer = []
    for i in range(num_layers):
        dropout_rates.append(trial.suggest_float(f'dropout_rate_layer_{i}', 0.0, 0.6))
        hidden_units_per_layer.append(trial.suggest_int(f'n_units_layer_{i}', 8, 256)) # hidden units per layer

    model = DFN(
        input_dim=X_train.shape[1],
        num_layers=num_layers,
        dropout_rates=dropout_rates,
        batch_norm=batch_norm,
        hidden_units_per_layer=hidden_units_per_layer
    ).to(device)

    # optimizer
    learning_rate = trial.suggest_float('learning_rate', 1e-10, 1e-1, log=True)
    optimizer_name = trial.suggest_categorical('optimizer', ['adam', 'rmsprop', 'sgd', 'adamw', 'adamax', 'adadelta', 'radam'])
    optimizer = _handle_optimizer(optimizer_name=optimizer_name, model=model, lr=learning_rate)

    # data loaders
    batch_size = trial.suggest_categorical('batch_size', [32, 64, 128, 256])
    test_size = 10000 if len(X_train) > 15000 else int(len(X_train) * 0.2)
    X_train_search, X_val_search, y_train_search, y_val_search = train_test_split(X_train, y_train, test_size=test_size, random_state=42)
    train_data_loader = create_torch_data_loader(X=X_train_search, y=y_train_search, batch_size=batch_size)
    val_data_loader = create_torch_data_loader(X=X_val_search, y=y_val_search, batch_size=batch_size)

    # training
    num_epochs = 3000 # ensure enough epochs (early stopping would stop the loop when overfitting)
    _, best_val_loss = train_model(
        train_data_loader=train_data_loader,
        val_data_loader=val_data_loader,
        model=model,
        optimizer=optimizer,
        criterion = criterion,
        num_epochs=num_epochs,
        trial=trial,
    )
    return best_val_loss


# start to optimize hyperparameters and architecture
study = optuna.create_study(direction='minimize', sampler=optuna.samplers.TPESampler())
study.optimize(objective, n_trials=50, timeout=600)

# best 
best_trial = study.best_trial
best_hparams = best_trial.params

# construct the model based on the tuning results
best_lr = best_hparams['learning_rate']
best_batch_size = best_hparams['batch_size']
input_dim = X_train.shape[1]
best_model = DFN(
    input_dim=input_dim,
    num_layers=best_hparams['num_layers'],
    hidden_units_per_layer=[v for k, v in best_hparams.items() if 'n_units_layer_' in k],
    batch_norm=best_hparams['batch_norm'],
    dropout_rates=[v for k, v in best_hparams.items() if 'dropout_rate_layer_' in k],
).to(device)

# construct an optimizer based on the tuning results
best_optimizer_name = best_hparams['optimizer']
best_optimizer = _handle_optimizer(
    optimizer_name=best_optimizer_name, model=best_model, lr=best_lr
)

# create torch data loaders
train_data_loader = create_torch_data_loader(
    X=X_train, y=y_train, batch_size=best_batch_size
)
val_data_loader = create_torch_data_loader(
    X=X_val, y=y_val, batch_size=best_batch_size
)

# retrain the best model with full training dataset applying the optimal batch size and optimizer
best_model, _ = train_model(
    train_data_loader=train_data_loader,
    val_data_loader=val_data_loader,
    model=best_model,
    optimizer=best_optimizer,
    criterion = criterion,
    num_epochs=1000
)

# create a checkpoint for serialization (reconstruct the model using the checkpoint)
checkpoint = {
    'state_dict': best_model.state_dict(),
    'hparams': best_hparams,
    'input_dim': X_train.shape[1],
    'optimizer': best_optimizer,
    'batch_size': best_batch_size
}

# serialize the model w/ checkpoint
torch.save(checkpoint, FILE_PATH)

2. Scikit-Learn Models (Backups)

For Scikit-Learn models, we’ll run k-fold cross validation during training to prevent overfitting.

K-fold cross-validation is a technique for evaluating a machine learning model's performance by training and testing it on different subsets of training data.

We define the run_kfold_validation function where the model is trained and validated using 5-fold cross-validation.

src/model/sklearn_model/scripts/tuning.py

from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error

def run_kfold_validation(
        X_train,
        y_train,
        base_model,
        hparams: dict,
        n_splits: int = 5, # the number of folds 
        early_stopping_rounds: int = 10,
        max_iters: int = 200
    ) -> float:

    mses = 0.0

    # create k-fold component
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=42)

    for fold, (train_index, val_index) in enumerate(kf.split(X_train)):
        # create a subset of training and validation datasets from the entire training data
        X_train_fold, X_val_fold = X_train.iloc[train_index], X_train.iloc[val_index]
        y_train_fold, y_val_fold = y_train.iloc[train_index], y_train.iloc[val_index]

        # reconstruct a model
        model = base_model(**hparams)

        # start the cross validation
        best_val_mse = float('inf')
        patience_counter = 0
        best_model_state = None
        best_iteration = 0

        for iteration in range(max_iters):
            # train on a subset of the training data
            try:
                model.train_one_step(X_train_fold, y_train_fold, iteration)
            except:
                model.fit(X_train_fold, y_train_fold)

            # make a prediction on validation data 
            y_pred_val_kf = model.predict(X_val_fold)

            # compute validation loss (MSE)
            current_val_mse = mean_squared_error(y_val_fold, y_pred_val_kf)

            # check if epochs should be stopped (early stopping)
           if current_val_mse < best_val_mse:
                best_val_mse = current_val_mse
                patience_counter = 0
                best_model_state = model.get_params()
                best_iteration = iteration
           else:
                patience_counter += 1

           # execute early stopping when patience_counter exceeds early_stopping_rounds
           if patience_counter >= early_stopping_rounds:
                main_logger.info(f"Fold {fold}: Early stopping triggered at iteration {iteration} (best at {best_iteration}). Best MSE: {best_val_mse:.4f}")
                break


        # after training epochs, reconstruct the best performing model 
        if best_model_state: model.set_params(**best_model_state)

        # make prediction
        y_pred_val_kf = model.predict(X_val_fold)

        # add MSEs
        mses += mean_squared_error(y_pred_val_kf, y_val_fold)

    # compute the final loss (avarage of MSEs across folds)
    ave_mse = mses / n_splits
    return ave_mse

Then, for the tuning script, we use the gp_minimize function from the Scikit-Optimize library.

The gp_minimize function is used to tune hyperparameters with Bayesian optimization.

This function intelligently searches the best hyperparameter set that can minimize the model's error, which is calculated using the run_kfold_validation function defined earlier.

The best-performing hyperparameters are then used to reconstruct and train the final model.

src/model/sklearn_model/scripts/tuning.py

from functools import partial
from skopt import gp_minimize


# define the objective function for Bayesian Optimization using Scikit-Optimize
def objective(params, X_train, y_train, base_model, hparam_names):
    hparams = {item: params[i] for i, item in enumerate(hparam_names)}
    ave_mse = run_kfold_validation(X_train=X_train, y_train=y_train, base_model=base_model, hparams=hparams)
    return ave_mse

# create the search space
hparam_names = [s.name for s in space]
objective_partial = partial(objective, X_train=X_train, y_train=y_train, base_model=base_model, hparam_names=hparam_names)

# search the optimal hyperparameters
results = gp_minimize(
    func=objective_partial,
    dimensions=space,
    n_calls=n_calls,
    random_state=42,
    verbose=False,
    n_initial_points=10,
)
# results
best_hparams = dict(zip(hparam_names, results.x))
best_mse = results.fun

# reconstruct the model with the best hyperparameters
best_model = base_model(**best_hparams)

# retrain the model with full training dataset
best_model.fit(X_train, y_train)

Step 2: Configure Feature/Model Stores in S3

The trained models and processed data are stored in the S3 bucket as a Parquet file.

We’ll draft the s3_upload function where the Boto3 client, a low-level interface to an AWS service, initiates the connection to S3:

import os
import boto3
from dotenv import load_dotenv

from src._utils import main_logger

def s3_upload(file_path: str):
    # initiate the boto3 client
    load_dotenv(override=True)
    S3_BUCKET_NAME = os.environ.get('S3_BUCKET_NAME') # the bucket created in s3
    s3_client = boto3.client('s3', region_name=os.environ.get('AWS_REGION_NAME')) # your default region

    if s3_client:
        # create s3 key and upload the file to the bucket
        s3_key = file_path if file_path[0] != '/' else file_path[1:]
        s3_client.upload_file(file_path, S3_BUCKET_NAME, s3_key)
        main_logger.info(f"file uploaded to s3://{S3_BUCKET_NAME}/{s3_key}")
    else:
        main_logger.error('failed to create an S3 client.')

Model Store

Trained PyTorch models are serialized (converted) into .pth files.

Then, these files are uploaded to the S3 bucket, enabling the system to load the trained model when it performs inference in production.

import torch

from src._utils import s3_upload

# model serialization, store in local
torch.save(trained_model.state_dict(), MODEL_FILE_PATH)

# upload to s3 model store
s3_upload(file_path=MODEL_FILE_PATH)

Feature Store

The processed data is converted into a CSV and Parquet file format.

Then, the Parquet files are uploaded to the S3 bucket, enabling the system to load the lightweight data when it creates prediction data to perform inference in production.

from src._utils import s3_upload

# store csv and parquet files in local
df.to_csv(file_path, index=False)
df.to_parquet(DATA_FILE_PATH, index=False)

# store in s3 feature store
s3_upload(file_path=DATA_FILE_PATH)

# trained preprocessor is also stored to transform the prediction data
s3_upload(file_path=PROCESSOR_PATH)

Step 3: Create a Flask Application with API Endpoints

Next, we’ll create a Flask application with API endpoints.

Flask needs to configure Python scripts in the app.py file located at the root of the project repository.

As showed in the code snippets, the app.py file needs to contain the components in order of:

AWS Boto3 client setup,
Flask app configuration and API endpoint setup,
Loading the trained preprocessor, processed input data X_test, and trained models,
Invoke the Lambda function via API Gateway, and
The local test section.

Note that X_test should never be used during model training to avoid data leakage.

app.py

from flask import Flask
from flask_cors import cross_origin
from waitress import serve
from dotenv import load_dotenv

from src._utils import main_logger

# global variables (will be loaded from the S3 buckets)
_redis_client = None
X_test = None
preprocessor = None
model = None
backup_model = None

# load env if local else skip (lambda refers to env in production)
AWS_LAMBDA_RUNTIME_API = os.environ.get('AWS_LAMBDA_RUNTIME_API', None)
if AWS_LAMBDA_RUNTIME_API is None: load_dotenv(override=True)


#### <---- 1. AWS BOTO3 CLIENT ---->
# boto3 client 
S3_BUCKET_NAME = os.environ.get('S3_BUCKET_NAME', 'ml-sales-pred')
s3_client = boto3.client('s3', region_name=os.environ.get('AWS_REGION_NAME', 'us-east-1'))
try:
    # test connection to boto3 client
    sts_client = boto3.client('sts')
    identity = sts_client.get_caller_identity()
    main_logger.info(f"Lambda is using role: {identity['Arn']}")
except Exception as e:
    main_logger.error(f"Lambda credentials/permissions error: {e}")

#### <---- 2. FLASK CONFIGURATION & API ENDPOINTS ---->
# configure the flask app
app = Flask(__name__)
app.config['CORS_HEADERS'] = 'Content-Type'

# add a simple API endpoint to serve the prediction by price point to test
@app.route('/v1/predict-price/', methods=['GET', 'OPTIONS'])
@cross_origin(origins=origins, methods=['GET', 'OPTIONS'], supports_credentials=True)
def predict_price(stockcode):
    df_stockcode = None

    # fetch request params
    data = request.args.to_dict()

    try:
        # fetch cache
        if _redis_client is not None:
            # returns cached prediction results if any without performing inference
            cached_prediction_result = _redis_client.get(cache_key_prediction_result_by_stockcode)
            if cached_prediction_result: 
                return jsonify(json.loads(json.dumps(cached_prediction_result)))

            # historical data of the selected product
            cached_df_stockcode = _redis_client.get(cache_key_df_stockcode)
            if cached_df_stockcode: df_stockcode = json.loads(json.dumps(cached_df_stockcode))


        # define the price range to make predictions. can be a request param, or historical min/max prices
        min_price = float(data.get('unitprice_min', df_stockcode['unitprice_min'][0]))
        max_price = float(data.get('unitprice_max', df_stockcode['unitprice_max'][0]))

        # create bins in the price range. when the number of the bins increase, the prediction becomes more smooth, but requires more computational cost
        NUM_PRICE_BINS = int(data.get('num_price_bins', 100))
        price_range = np.linspace(min_price, max_price, NUM_PRICE_BINS)

        # create a prediction dataset by merging X_test (dataset never used in model training) and df_stockcode
        price_range_df = pd.DataFrame({ 'unitprice': price_range })
        test_sample = X_test.sample(n=1000, random_state=42)
        test_sample_merged = test_sample.merge(price_range_df, how='cross') if X_test is not None else price_range_df
        test_sample_merged.drop('unitprice_x', axis=1, inplace=True)
        test_sample_merged.rename(columns={'unitprice_y': 'unitprice'}, inplace=True)

        # preprocess the dataset
        X = preprocessor.transform(test_sample_merged) if preprocessor else test_sample_merged

        # perform inference
        y_pred_actual = None
        epsilon = 0
        # try using the primary model
        if model:
            input_tensor = torch.tensor(X, dtype=torch.float32)
            model.eval()
            with torch.inference_mode():
                y_pred = model(input_tensor)
                y_pred = y_pred.cpu().numpy().flatten()
                y_pred_actual = np.exp(y_pred + epsilon)

        # if not, use backups
        elif backup_model:
            y_pred = backup_model.predict(X)
            y_pred_actual = np.exp(y_pred + epsilon)


        # finalize the outcome for client app
        df_ = test_sample_merged.copy()
        df_['quantity'] = np.floor(y_pred_actual) # quantity must be an integer
        df_['sales'] = df_['quantity'] * df_['unitprice'] # compute sales
        df_ = df_.sort_values(by='unitprice')

        # aggregate the results by the unitprice in the price range
        df_results = df_.groupby('unitprice').agg(
            quantity=('quantity', 'median'),
            quantity_min=('quantity', 'min'),
            quantity_max=('quantity', 'max'),
            sales=('sales', 'median'),
        ).reset_index()

        # find the optimal price point
        optimal_row = df_results.loc[df_results['sales'].idxmax()]
        optimal_price = optimal_row['unitprice']
        optimal_quantity = optimal_row['quantity']
        best_sales = optimal_row['sales']

        all_outputs = []
        for _, row in df_results.iterrows():
            current_output = {
                "stockcode": stockcode,
                "unit_price": float(row['unitprice']),
                'quantity': int(row['quantity']),
                'quantity_min': int(row['quantity_min']),
                'quantity_max': int(row['quantity_max']),
                "predicted_sales": float(row['sales']),
            }
            all_outputs.append(current_output)

        # store the prediction results in cache
        if all_outputs and _redis_client is not None:
             serialized_data = json.dumps(all_outputs)
            _redis_client.set(
                cache_key_prediction_result_by_stockcode, 
                serialized_data,
                ex=3600     # expire in an hour
            )

        # return a list of all outputs
        return jsonify(all_outputs)

    except Exception as e: return jsonify([])


# request header management (for the process from API gateway to the Lambda)
@app.after_request
def add_header(response):
    response.headers['Cache-Control'] = 'public, max-age=0'
    response.headers['Access-Control-Allow-Origin'] = CLIENT_A
    response.headers['Access-Control-Allow-Headers'] = 'Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token,Origin'
    response.headers['Access-Control-Allow-Methods'] = 'GET, POST, OPTIONSS'
    response.headers['Access-Control-Allow-Credentials'] = 'true'
    return response

#### <---- 3. LOADING PROCESSOR, DATASET, AND MODELS ---->
load_processor()
load_x_test()
load_model()

#### <---- 4. INVOKE LAMBDA ---->
def handler(event, context):
    logger.info("lambda handler invoked.")
    try:
        # connecting the redis client after the lambda is invoked
        get_redis_client()
    except Exception as e:
        logger.critical(f"failed to establish initial Redis connection in handler: {e}")
        return {
            'statusCode': 500,
            'body': json.dumps({'error': 'Failed to initialize Redis client. Check environment variables and network config.'})
        }

    # use the awsgi package to convert JSON to WSGI
    return awsgi.response(app, event, context)


#### <---- 5. FOR LOCAL TEST ---->
# serve the application locally on WSGI server, waitress
# lambda will ignore this section.
if __name__ == '__main__':   
    if os.getenv('ENV') == 'local':
        main_logger.info("...start the operation (local)...")
        serve(app, host='0.0.0.0', port=5002)
    else:
        app.run(host='0.0.0.0', port=8080)

I’ll test the endpoint locally using the uv package manager:

$uv run app.py --cache-clear

$curl http://localhost:5002/v1/predict-price/{STOCKCODE}

The system provided a list of sales predictions for each price point:

Fig. Screenshot of the Flask app local response

Key Points on Flask App Configuration

There are various points you should take into consideration when configuring a Flask application with Lambda. Let’s go over them now:

1. A Few API Endpoints Per Container

Adding many API endpoints to a single serverless instance can lead to monolithic function concern where issues in one endpoint impact others.

In this project, we’ll focus on a single endpoint per container – and if needed, we can add separate Lambda functions to the system.

2. Understanding the `handler` Function and the role of AWSGI

The handler function is invoked every time the Lambda function receives a client request from the API Gateway.

The function takes the event argument that includes the request details in a JSON dictionary and passes it to the Flask application.

AWSGI acts as an adapter, translating a Lambda event in JSON format into a WSGI request that a Flask application can understand, and converts the application’s response back into a JSON format that Lambda and API Gateway can process.

3. Using Cache Storage

The get_redis_client function is called once the handler function is called by the API Gateway. This allows the Flask application to store or fetch a cache from the Redis client:

import redis
import redis.cluster
from redis.cluster import ClusterNode

_redis_client = None

def get_redis_client():
    global _redis_client
    if _redis_client is None:
        REDIS_HOST = os.environ.get("REDIS_HOST")
        REDIS_PORT = int(os.environ.get("REDIS_PORT", 6379))
        REDIS_TLS = os.environ.get("REDIS_TLS", "true").lower() == "true"
        try:
            startup_nodes = [ClusterNode(host=REDIS_HOST, port=REDIS_PORT)]
            _redis_client = redis.cluster.RedisCluster(
                startup_nodes=startup_nodes,
                decode_responses=True,
                skip_full_coverage_check=True,
                ssl=REDIS_TLS,                  # elasticache has encryption in transit: enabled -> must be true
                ssl_cert_reqs=None,
                socket_connect_timeout=5,
                socket_timeout=5,
                health_check_interval=30,
                retry_on_timeout=True,
                retry_on_error=[
                    redis.exceptions.ConnectionError,
                    redis.exceptions.TimeoutError
                ],
                max_connections=10,            # limit connections for Lambda
                max_connections_per_node=2     # limit per node
            )
            _redis_client.ping()
            main_logger.info("successfully connected to ElastiCache Redis Cluster (Configuration Endpoint)")
        except Exception as e:
            main_logger.error(f"an unexpected error occurred during Redis Cluster connection: {e}", exc_info=True)
            _redis_client = None
    return _redis_client

4. Handling Heavy Tasks Outside of the `handler` Function

Serverless functions can experience a cold start duration.

While a Lambda function can run for up to 15 minutes, its associated API Gateway has a timeout of 29 seconds (29,000 ms) for a RESTful API.

So, any heavy tasks like loading preprocessors, input data, or models should be performed once outside of the handler function, ensuring they are ready before the API endpoint is called.

Here are the loading functions called in app.py.

app.py

import joblib

from src._utils import s3_load, s3_load_to_temp_file

preprocessor = None
X_test = None
model = None
backup_model = None


# load processor
def load_preprocessor():
    global preprocessor
    preprocessor_tempfile_path = s3_load_to_temp_file(PREPROCESSOR_PATH)
    preprocessor = joblib.load(preprocessor_tempfile_path)
    os.remove(preprocessor_tempfile_path)


# load input data
def load_x_test():
    global X_test
    x_test_io = s3_load(file_path=X_TEST_PATH)
    X_test = pd.read_parquet(x_test_io)


# load model
def load_model():
    global model, backup_model
    # try loading & reconstructing the primary model
    try:
        # first load io file from the s3 bucket
        model_data_bytes_io_ = s3_load(file_path=DFN_FILE_PATH)
        # convert to checkpoint dictionary (containing hyperparameter set)
        checkpoint_ = torch.load(
            model_data_bytes_io_, 
            weights_only=False, 
            map_location=device
        )
        # reconstruct the model
        model = t.scripts.load_model(checkpoint=checkpoint_, file_path=DFN_FILE_PATH)
        # set the model evaluation mode
        model.eval()

    # else, backup model
     except:
        load_artifacts_backup_model()

Step 4: Publish a Docker Image to ECR

After configuring the Flask application, we’ll containerize the entire application on Docker.

Containerization makes a package of the application, including models, its dependencies, and configuration in machine learning context, as a container.

Docker creates a container image based on the instructions defined in a Dockerfile, and the Docker engine uses the image to run the isolated container.

In this project, we’ll upload the Docker container image to ECR, so the Lambda function can access it in production.

After this, we’ll define the .dockerignore file to optimize the container image:

.dockerignore

# any irrelevant data
__pycache__/
.ruff_cache/
.DS_Store/
.venv/
dist/
.vscode
*.psd
*.pdf
[a-f]*.log
tmp/
awscli-bundle/

# add any experimental models, unnecessary data
dfn_bayesian/
dfn_grid/
data/
notebooks/

Dockerfile

# serve from aws ecr 
FROM public.ecr.aws/lambda/python:3.12

# define a working directory in the container
WORKDIR /app

# copy the entire repository (except .dockerignore) into the container at /app
COPY . /app/

# install dependencies defined in the requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# define commands
ENTRYPOINT [ "python" ]
CMD [ "-m", "awslambdaric", "app.handler" ]

Test in Local

Next, we’ll test the Docker image by building the container named my-app locally:

$docker build -t my-app -f Dockerfile .

Then, we’ll run the container with the waitress server in local:

$docker run -p 5002:5002 -e ENV=local my-app app.py

The -e ENV=local flag sets the environment variable inside the container, which will trigger the waitress.serve() call in the app.py.

In the terminal, you’ll find a message saying the following:

You can also call the endpoint created to see the results returned:

$uv run app.py --cache-clear

$curl http://localhost:5002/v1/predict-price/{STOCKCODE}

Publish the Docker Image to ECR

To publish the Docker image, we first need to configure the default AWS credentials and region:

From the AWS account console, issue an access token and check the default region.
Store them in the ~/aws/credentials and ~/aws/config files:

~/aws/credentials

[default] 
aws_secret_access_key=
aws_access_key_id=

~/aws/config

[default]
region=

After the configuration, we’ll publish the Docker image to ECR.

# authenticate the docker client to ECR
$aws ecr get-login-password --region  | docker login --username AWS --password-stdin .dkr.ecr..amazonaws.com

# create repository
$aws ecr create-repository --repository-name  --region 

# tag the docker image
$docker tag :  .dkr.ecr..amazonaws.com/:

# push
$docker push .dkr.ecr..amazonaws.com/:

Here’s what’s going on:

: Your default AWS region (for example, us-east-1 ).
: 12-digit AWS account ID.
: Your desired repository name.
: Your desired tag name (for example, v1.0).

Now, the Docker image is stored in ECR with the tag:

Fig. Screenshot of the AWS ECR console

Step 5: Create a Lambda Function

Next, we’ll create a Lambda function.

From the Lambda console, choose:

The Container Image option,
The container image URL from the pull down list,
A function name of our choice, and
An architecture type (arm64 is recommended for a better price-performance).

Fig. Screenshot of AWS Lambda function configuration

The Lambda function my-app was successfully launched.

Connect the Lambda function to API Gateway

Next, we’ll add API gateway as an event trigger to the Lambda function.

First, visit the API Gateway console and create REST API methods using the ARN of the Lambda function (press enter or click to view image in full size):

Fig. Screenshot of the AWS API Gateway configuration

Then, add resources to the created API gateway to create an endpoint:
API Gateway > APIs > Resources > Create Resource

Align the resource endpoint with the API endpoint defined in the app.py.
Configure CORS (for example, accept specific origins).
Deploy the resource to the stage.

Going back to the Lambda console, you’ll find the API Gateway is connected as an event trigger:
Lambda > Function > my-app (your function name)

Fig. Screenshot of the AWS Lambda dashboard

Step 6: Configure AWS Resources

Lastly, we’ll configure the related AWS resources to make the system work in production.

This process involves the following steps:

1. The IAM Role: Controls Who to Access Resources

AWS requires IAM roles to grant temporary, secure permissions to users, mitigating security risks related to long-term credentials like passwords.

The IAM role leverages policies to grant accesses to the selected service. Policies can be issued by AWS or customized by the user by defining the inline policy.

It is important to avoid overly permissive access rights for the IAM role.

In the Lambda function console, check the execution role:
Lambda > Function > > Permission > The execution role.
Set up the following policies to allow the Lambda’s IAM role to handle necessary operations:
- Lambda AWSLambdaExecute: Allows executing the function.
- EC2 Inline policy: Allows controlling the security group and the VPC of the Lambda function.
- ECR AmazonElasticContainerRegistryPublicFullAccess + Inline policy: Allows storing and pulling the Docker image.
- ElastiCache AmazonElastiCacheFullAccess + Inline policy: Allows storing and pulling caches.
- S3: AmazonS3ReadOnlyAccess + Inline policy: Allows reading and storing contents.

Now, the IAM role can access these resources and perfo the allowed actions.

2. The Security Group: Controls Network Traffic

A security group is a virtual firewall that controls inbound and outbound network traffic for AWS resources.

It uses stateful (allowing return traffic automatically) “allow-only” rules based on protocol, port, and IP address, where it denies all traffic by default.

Create a new security group for the Lambda function:
EC2 > Security Groups >

Now, we’ll want to setup inbound / outbound traffic rules.

The inbound rules:

S3 → Lambda:Type*: HTTPS /* Protocol*: TCP /* Port range*: 443 / Source: Custom**
ElastiCache → Lambda:Type*: Custom TCP /* Port range*: 6379 / Source: Custom**

*Choose the created security group for the Lambda function as a custom source.

The outbound rules:

Lambda → Internet: Type*: HTTPS /* Protocol*: TCP /* Port range*: 443 /* Destination*: 0.0.0.0/0*
ElastiCache → Internet: Type*: All Traffic /* Destination*: 0.0.0.0/0*

3. The Virtual Private Cloud (VPC)

A Virtual Private Cloud (VPC) provides a logically isolated private network for the AWS resources, acting as our own private data center within AWS.

AWS can create a Hyperplane ENI (Elastic Network Interface) for the Lambda function and its connected resources in the subnets of the VPC.

Though it’s optional, we’ll use the VPC to connect the Lambda function to the S3 storage and ElastiCache.

This process involves:

Creating a VPC endpoint from the VPC console:VPC > Create VPC.
Creating an STS (Security Token Service) endpoint:
VPC > PrivateLink and Lattice > Endpoints > Create Endpoint >
- Type*: AWS Service*
- Service name*: com.amazonaws..sts*
- Type*: Interface*
- VPC: Select the VPC created earlier.
- Subnets*: Select all subnets.*
- Security groups*: Select the security group of the Lambda function.*
- Policy*: Full access*
- Enable DNS names

The VPC must have a dedicated endpoint for STS to receive temporary credentials from STS.

Create an S3 endpoint in the VPC:
VPC > PrivateLink and Lattice > Endpoints > Create Endpoint >
- Type*: AWS Service*
- Service name*: com.amazonaws..s3*
- Type*: Gateway*
- VPC: Select the VPC created earlier.
- Subnets*: Select all subnets.*
- Security groups*: Select the security group of the Lambda function.*
- Policy*: Full access*

Lastly, check the security group of the Lambda function and ensure that its VPC ID directs to the VPC created: EC2 > Security Group > > VPC ID.

That’s all for the deployment flow.

We can now test the API endpoint in production. Copy the Invoke URL of the deployed API endpoint: API Gateway > APIs > Stages > Invoke URL. Then call the API endpoint and check if it responds predictions:

$curl -H 'Authorization: Bearer YOUR_API_TOKEN' -H 'Accept: application/json' \
     '/'

For logging and debugging, we’ll use the LiveTail of CloudWatch: CloudWatch > LiveTail.

Building a Client Application (Optional)

For full-stack deployment, we’ll build a simple React application to display the prediction using the recharts library for visualization.

Other options for quick frontend deployment include Streamlit or Gradio.

The React Application

The React application creates a web page that fetches and visualizes sales predictions from an external API, recommending an optimal price point.

The app uses useState to manage its data and state, including the selected product, the list of sales predictions, and the loading/error status.

When the user initiates a request, a useEffect hook triggers a fetch request to a Flask backend. It handles the API response as a data stream, processing it line by line to progressively update the predictions.

The AreaChart from the recharts library then visualizes this data. The X-axis represents the price and the Y-axis represents the sales. The chart updates in real-time as the data streams in. Finally, the app displays the optimal price once all the predictions are received.

App.js: (in a separate React app)

import { useState, useEffect } from "react"
import { AreaChart, Area, XAxis, YAxis, CartesianGrid, Tooltip, ResponsiveContainer, ReferenceLine } from 'recharts'


function App() {
  // state
  const [predictions, setPredictions] = useState([])
  const [start, setStart] = useState(false)
  const [isLoading, setIsLoading] = useState(false)

  // product data
  let selectedStockcode = '85123A'
  let selectedProduct = productOptions.filter(item => item.id === selectedStockcode)[0]

  // api endpoint
  const flaskBackendUrl = "YOUR FLASK BACKEND URL"

  // create chart data to display
  const chartDataSales = predictions && predictions.length > 0
    ? predictions
      .map(item => ({
        price: item.unit_price,
        sales: item.predicted_sales,
        volume: item.unit_price !== 0 ? item.predicted_sales / item.unit_price : 0
      }))
      .sort((a, b) => a.price - b.price)
    : [...selectedProduct['histPrices']]

  // optimal price to display
  const optimalPrice = predictions.length > 0
    ? predictions.sort((a, b) => b.predicted_sales - a.predicted_sales)[0]['unit_price']
    : 0

  // fetch prediction results
  useEffect(() => {
    const handlePrediction = async () => {
      setIsLoading(true)
      setPredictions([])
      const errorPrices = selectedProduct['errorPrices']

      await fetch(flaskBackendUrl)
        .then(res => {
          if (res.status !== 200) { setPredictions(errorPrices); setIsLoading(false); setStart(false) }
          else return Promise.resolve(res.clone().json())
        })
        .then(res => {
          if (res && res.length > 0) setPredictions(res)
          else setPredictions(errorPrices)
          setIsLoading(false); setStart(false)
        })
        .catch(err => { setPredictions(errorPrices); setIsLoading(false); setStart(false) })
        .finally(setStart(false))
    }

    if (start) handlePrediction()
    if (predictions && predictions.length > 0) setStart(false)
  }, [flaskBackendUrl, start])


  // render
  if (isLoading) return <Loading />
  return (
    <div>
      <ResponsiveContainer width="100%" height="100%">
        <AreaChart
          key={chartDataSales.length}
          data={chartDataSales.sort(data => data.unit_price)}
          margin={{ top: 10, right: 30, left: 0, bottom: 0 }}
        >
          <CartesianGrid strokeDasharray="3 3" strokeOpacity={0.6} />

          <XAxis
            dataKey="price"
            label={{ value: "Unit Price ($)", position: "insideBottom", offset: 0, fontSize: 12, marginTop: 10 }}
            tickFormatter={(tick) => `$${parseFloat(tick).toFixed(2)}`}
            tick={{ fontSize: 12 }}
            padding={{ left: 20, right: 20 }}
          />

          <YAxis
            label={{ value: "Predicted Sales ($)", angle: -90, position: "insideLeft", fontSize: 12 }}
            tick={{ fontSize: 12 }}
            tickFormatter={(tick) => `$${tick.toLocaleString()}`}
          />

          {/* tooltips with the prediction result data */}
          <Tooltip
            contentStyle={{
              borderRadius: '8px',
              padding: '10px',
              boxShadow: '0px 0px 15px rgba(0,0,0,0.5)'
            }}
            formatter={(value, name) => {
              if (name === 'sales') {
                return [`$${value.toFixed(4)}`, 'Predicted Sales']
              }
              if (name === 'volume') {
                return [`${value.toFixed(0)}`, 'Volume']
              }
              return value
            }}
            labelFormatter={(label) => `Price: $${label.toFixed(2)}`}
          />

          {/* chart area = sales */}
          <Area
            type="monotone"
            dataKey="sales"
            fillOpacity={1}
            fill="url(#colorSales)"
          />

          {/* vertical line for the optimal price */}
          {optimalPrice &&
            <ReferenceLine
              x={optimalPrice}
              strokeDasharray="4 4"
              ifOverflow="visible"
              label={{
                value: `Optimal Price: $${optimalPrice !== null && optimalPrice > 0 ? Math.ceil(optimalPrice * 10000) / 10000 : ''}`,
                position: "right",
                fontSize: 12,
                offset: 10
              }}
            />
          }
        AreaChart>
      ResponsiveContainer>

      {optimalPrice && <p>Optimal Price: $ {Math.ceil(optimalPrice * 10000) / 10000}p>}

    div>
  )
}

export default App

Final Results

Now, the application is ready to serve.

You can explore the UI from here.

All code (backend) is available in my Github Repo.

Conclusion

Building a machine learning system requires thoughtful project scoping and architecture design.

In this article, we built a dynamic pricing system as a simple single interface on containerized serverless architecture.

Moving forward, we’d need to consider potential drawbacks of this minimal architecture:

Increase in cold start duration: The WSGI adapter awsgi layer adds a small overhead. Loading a larger container image takes longer time.
Monolithic function: Adding endpoints to the Lambda function can lead to a monolithic function where an issue in one endpoint impacts others.
Less granular observability: AWS CloudWatch cannot provide individual invocation/error metrics per API endpoint without custom instrumentation.

To scale the application effectively, extracting functionalities into a new microservice can be a good strategy to the next step.

I’m Kuriko IWAI, and you can find more of my work and learn more about me here:

Portfolio / LinkedIn / Github

All images, unless otherwise noted, are by the author. This application utilizes synthetic dataset licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

This information about AWS is current as of August 2025 and is subject to change.

The Serverless Architecture Handbook: How to Publish a Node Js Docker Image to AWS ECR and Deploy the Container to AWS Lambda

Prince Onukwili — Thu, 17 Apr 2025 02:19:13 +0000

Imagine you’re tasked with building a web application that can handle incoming traffic surges as your users grow without accumulating too much cost. Sounds like a dream, right?

But here’s the thing: traditionally, to do this, you would have to manage lots of infrastructure – resources on which your application will be deployed – which can be a real headache. You’d have servers (VM instances or physical computers) to configure, databases to scale, load balancers to monitor...it’s a whole lot 😩

This is where Serverless architecture comes to the rescue. With the Serverless model, you can deploy your applications to handle thousands of users without you having to worry about incurring too much cost, managing infrastructure, servers, networking, and so on.

In this article, you’ll learn about Serverless Architecture: what it’s all about, and how to deploy your very own application using AWS Lambda. We’ll walk through the entire process step-by-step:

How to clone your application repository using Git.
How to build an image of your application using Docker.
How to install the AWS CLI on your local machine and create AWS IAM users with the right permissions to push your Docker image to AWS Elastic Container Registry (ECR).

Once the image is up and running on ECR, we’ll then connect it to AWS Lambda and deploy the container to Lambda for a fully serverless experience. 💡✨

Ready to go serverless? Let’s get started! 🚀

What is Serverless Architecture?
Differences Between Serverless and Other Deployment Models ⚡
🧠 Prerequisites — What You Should Know Before Following Along!
How to Set Up the Application Using Git 🐙
Understanding the Codebase 🔎
How to Create a Docker Image of the Application 🐋
How to Create a Container Registry on AWS Elastic Container Registry (ECR) 📁
IAM with AWS: How to Create a User on AWS IAM to Allow Access to Your AWS ECR 👤🔐
How to Upload Your Docker Image to the AWS ECR repository ⬆️
How to Deploy the Application Container to AWS Lambda from the Image on AWS ECR 🚀
Advantages of Adopting the Serverless Model in Businesses 💼
Disadvantages of the Serverless Model 🚫
When to Adopt the Serverless Model 🤔
Conclusion 📝
About the Author 👨‍💻

What is Serverless Architecture?

Before we dive deeper, let’s break down what we mean by Servers. In the tech world, servers are powerful computers that store, process, and manage data. Think of them as the behind-the-scenes workhorses that:

Store your data: Like a central filing cabinet for your digital documents.
Run your applications: They execute the code that keeps your app or website running.
Handle requests: Servers respond to user requests – like loading a webpage or processing a login.

Alright, now let’s talk about Serverless Architecture – but first, let’s clear up a common misconception. When most people hear the word "Serverless", they immediately think, "Wait… no servers? How does that even work?!" 😅

Here’s the truth: Serverless doesn’t mean there are no servers involved (surprise, surprise! 😉). Instead, it means you, as a developer, don’t have to worry about managing the servers that your application runs on. The server-side infrastructure is fully handled by the cloud provider – in this case, AWS Lambda. You just focus on writing code and deploying it, and AWS takes care of the rest.

So, What’s the Big Deal with Serverless?

In a traditional setup, when you deploy your application, you’re responsible for things like:

Provisioning servers (how many servers do you need? What size?)
Scaling resources (how do you handle traffic spikes without overpaying?)
Monitoring and keeping everything running smoothly.

Sounds like a lot, right? 🤯 Well, Serverless Architecture simplifies all of that by letting you focus purely on your application code. With Lambda, you can run code in response to events (like an HTTP request, a file upload, or a database change) without worrying about the infrastructure behind it. AWS automatically scales the compute resources as needed, charging you only for the time your code is actually running. ⏱️💸

Imagine you’re at a restaurant. Instead of running the kitchen yourself (like managing your own servers), you just place an order (your code) and the chef (AWS Lambda) makes it for you, on-demand, based on what you need. 🍽️🍴

Differences Between Serverless and Other Deployment Models ⚡

Now that you understand how Serverless works, let’s take a little detour and explore the other models used to deploy applications. After all, Serverless isn’t the only kid on the block, and this will give you some important perspective when choosing the right model for your use case. 👀

When you build an app, you need somewhere to host it – a home for your code to live and run. Over the years, the tech world has come up with different ways to handle this, and each one gives you a different level of control (and responsibility) over your servers.

Let’s break it down.

🏠 Infrastructure as a Service (IaaS)

With IaaS, cloud providers like AWS, Google Cloud, or Microsoft Azure give you the building blocks – virtual servers (also called instances), storage, and networking tools – but it’s still your job to set everything up.

It’s like renting an empty apartment. You get the walls, the doors, and the roof, but you still have to bring your own furniture, set up your Wi-Fi, and clean the place regularly. 🏡🧹

When you choose IaaS, you’re responsible for:

Configuring the servers (choosing the size, the operating system, and installing software).
Handling updates, patches, and security.
Scaling up or down when traffic changes.

Example: Amazon EC2 (Elastic Compute Cloud) is a classic IaaS service. You rent a virtual machine, set it up yourself, and manage it like a digital landlord.

🎯 Platform as a Service (PaaS)

Next up, we’ve got PaaS – a more polished setup.

In this model, the cloud provider takes care of the infrastructure and the underlying operating system, so you don’t have to. You just upload your code, configure a few settings, and the platform runs your app.

It’s like moving into a fully furnished apartment — the kitchen works, the lights are on, and the Wi-Fi is already connected. You just show up with your bags and get to work! 🧳✨

Example: AWS Elastic Beanstalk, Heroku, or Google App Engine.

🌩️ Serverless: The Special PaaS

Now here’s where things get interesting: Serverless actually falls under the PaaS umbrella, but it deserves its own spotlight. Why? Because it takes the convenience of PaaS and pushes it to the next level.

In a traditional PaaS model (like AWS Fargate or Heroku), your application is running 24/7, whether you have visitors using it or not. You pay for the reserved space and compute power all month long, just like renting an apartment. Even if you didn’t sleep there the entire month, the bill still comes at the end. 💸🏡

But with Serverless, the rules change. You only pay when your code is actually being used.

How Applications Run in the Serverless Model ⚙️

In a Serverless model, your application isn’t just sitting there running all day. It “wakes up” only when it’s needed. But what exactly causes it to wake up? That’s where triggers come in.

Triggers are events that tell your Serverless application, “Hey, it’s time to do something!” These events could be all sorts of things, like:

A user visiting your website and clicking a button.
Someone uploading a file to your cloud storage (like an image or document).
A new row being added to a database.
An automated schedule (like a reminder that runs every day at 8 AM).

When one of these events happens, your application instantly comes to life, runs the exact task you programmed, and then goes back to “sleep” until the next trigger. This is how Serverless keeps your cloud costs low and your resources efficient – no constant running in the background, only action when there’s actually something to do!.⚡😎

For example, if a user sends a request that triggers your application to run for just 10 seconds and uses 20MB of memory, that’s all you pay for — the exact time and resources consumed.

No users? No requests? No payment. Now that’s a smart way to save money. 🧠💰

💡 Quick Comparison: PaaS vs Serverless

Feature	Traditional PaaS (example: AWS Fargate)	Serverless PaaS (example: AWS Lambda)
Server Configuration	You select compute size & limits.	No need — AWS handles it all.
Scaling	You configure scaling policies.	Automatic, event-driven scaling (based on incoming traffic). The higher the traffic, the more compute power is added to your application, and vice versa. 😃
Billing	Charged for running instances 24/7, even when idle.	Charged only when your code runs. ⏱️💸
Deployment	Deploy full applications.	Deploy small chunks of code (functions). You can also deploy microservices and full-scale web applications

🧠 Prerequisites — What You Should Know Before Following Along

Before we dive in, here’s the best part: I wrote this article to be super beginner-friendly and detailed, so even if you have little to no programming background, you’ll still be able to follow along.

Whether you’re a developer, a tech-curious startup, or a business leader trying to understand modern cloud solutions, this guide was written for you.

That said, having some light knowledge in these areas will make the ride even smoother:

🧑‍💻 Basic Programming Concepts – like how Node.js apps run and what a server does.
💡 Familiarity with Common Tech Terms – words like “deploy,” “application,” “CPU,” and “software” will pop up, but don’t worry: I’ve done my best to break these down into simple, relatable explanations.

No prior cloud experience? No problem! This guide holds your hand all the way from setup to deployment – all in plain language, no jargon.

So buckle up, and let’s proceed with deploying your very own application to AWS Lambda. 😁

How to Set Up the Application Using Git 🐙

Before we jump into writing code or deploying anything, the very first step is to grab the application we’ll be working with — and for that, we’ll be using Git.

But wait... what’s Git? — It’s a Version Control System (VCS) that helps developers track changes to their code, collaborate with teammates without stepping on each other’s toes, and safely store their work in a central place — like GitHub.

Clone the Application Repository 🧑‍💻

I’ve already created a simple project for us to use in this tutorial — it’s sitting pretty on GitHub, waiting for you.

To clone the project onto your local machine, open up your terminal and run:

git clone https://github.com/onukwilip/lambda-tutorial.git

This command will download all the code from the lambda-tutorial repository into a folder on your computer. 📁

Once the cloning is done, navigate into the project directory like this:

cd lambda-tutorial

Boom — just like that, your local machine is now set up with the same code that’s stored in the GitHub repo. 🏡

Understanding the Codebase 🔎

Open the Codebase in Your Favorite IDE 🧑‍💻

For this tutorial, we’ll be using Visual Studio Code (VS Code), but feel free to use any editor you’re comfortable with.

Once you open the lambda-tutorial project folder, you’ll notice it’s a simple Node.js web server. Nothing too fancy — just a server that can handle requests and respond with some data.

Now, it’s important to understand what’s going on inside our codebase, especially if you’re coming from deploying on platforms like Render, Vercel, or Google Cloud Run.

Deploying to Lambda vs Other Serverless Platforms ⚡

When you deploy to platforms like Vercel, Render, or Google Cloud Run, you usually package your web server just the way you wrote it – whether it’s a Node.js Express server or a Next.js app – and the platform handles it pretty much as-is.

Those platforms run your server like a mini container (or microservice) that’s always ready to handle incoming traffic, just like a waiter standing by at your table, waiting for your order.

But AWS Lambda works a little differently.

Lambda expects your code to be organized around functions – not full web servers. Think of Lambda as a chef that only shows up the moment an order is placed, cooks the food, and disappears once the job is done. 👨‍🍳🍽️

So if you’ve got a full-blown Node.js Express server, you’ll need to do a tiny bit of “translation” to fit Lambda’s expectations – and that’s where the lambda.js file comes in.

The `lambda.js` File — Your Lambda Translator 🔀

Here’s what the file looks like:

const serverless = require("serverless-http");
const app = require("./app");

const handler = serverless(app);
module.exports.handler = handler;

Let’s break it down:

const serverless = require("serverless-http");: This imports a handy little library called serverless-http. (The serverless-http library is important for our platform to run properly on AWS Lambda.) It acts like a translator: it takes your regular Express app and wraps it so that AWS Lambda can understand it.
const handler = serverless(app);: Here’s the magic. This wraps your Express app into a Lambda-compatible function.
module.exports.handler = handler;: This exports your wrapped function so AWS Lambda can call it when the application is triggered.

So, instead of starting your server like this:

app.listen(5000, () => {
  console.log("Server running on port 5000");
});

You’re handing your app over to Lambda and letting it handle incoming requests, scale, and run the app only when it’s needed.

The `app.js` File — Your Classic Express App 💻

Your app.js is where the main application logic lives. Here is usually where you:

Set up Express.
Define routes (like /api, /users, /hello).
Apply middleware (like JSON parsing, logging, CORS, and so on).
Handle HTTP requests and send back responses.

In a normal deployment (Render, Google Cloud Run, DigitalOcean, or your own server), you’d start the server using app.listen(PORT) at the bottom of this file.

But since we’re deploying to Lambda, you don’t directly start the server here. Instead, you export the app like this:

module.exports = app;

This way, your application stays “server-agnostic” – it’s not hardcoded to run on a traditional server. Lambda (via the lambda.js file) takes care of starting and stopping your app whenever it’s triggered by an event (like an HTTP request). Smart, right? 💡

Why this setup? 🤔

This little separation gives you flexibility:

You can write your Node.js app like you always would (using Express) inside app.js.
And you only tweak the entry point (via lambda.js) to fit AWS Lambda’s expectations.

How to Create a Docker Image of the Application 🐋

Now that we’ve had a good look at the code, let’s package it up the smart way — using Docker.

What is Docker? 🐳

Now, you might be wondering, "Why are we using Docker?"

Docker is a software for creating images of your applications and running those images as containers. Just like real-world shipping containers hold goods securely, Docker containers hold your app, bundled with everything it needs to run: its code, libraries, dependencies, and settings. Everything is all wrapped up neatly, so your app runs the same way everywhere, whether on your laptop, AWS Lambda, or even your friend’s machine.

Let’s Take a Look at the Dockerfile 🔍

Inside your project folder, you’ll find a file named Dockerfile. This is basically the recipe that Docker uses to build your app’s container image.

Here’s what it looks like:

FROM node:18-slim AS builder

WORKDIR /app

COPY package.json .

RUN npm i -f

COPY . .

USER root

FROM amazon/aws-lambda-nodejs

ENV PORT=5000

COPY --from=builder /app/ ${LAMBDA_TASK_ROOT}
COPY --from=builder /app/node_modules ${LAMBDA_TASK_ROOT}/node_modules
COPY --from=builder /app/package.json ${LAMBDA_TASK_ROOT}
COPY --from=builder /app/package-lock.json ${LAMBDA_TASK_ROOT}

EXPOSE 5000

CMD [ "lambda.handler" ]

Let’s break down the important steps— in plain English: 😎

FROM node:18-slim AS builder: We start by using a lightweight version of Node.js called node:18-slim and give it a tag named builder (think of it as Stage 1). This gives us the tools we need to build a Node.js app, but without extra stuff that makes the image heavy. The tag builder enables us to re-use the content of this build in the next stage
WORKDIR /app: We set the working directory inside the container to /app. Think of this as telling Docker: "Hey, this is the folder where I’ll be working from!"
COPY package.json .: This copies the package.json file (which lists your app’s dependencies) into the /app folder inside the container.
RUN npm i -f: This installs all the Node.js dependencies (the packages your app needs to work).
The -f flag forces npm to resolve conflicts if any pop up.
COPY . .: This copies the rest of your project files from your computer into the container.
USER root: This sets the user to root (administrator level) inside the container. Useful when extra permissions are needed for certain tasks.
FROM amazon/aws-lambda-nodejs: Now here’s the switch: we swap to the official AWS Lambda base image for Node.js! That is, Stage 2. This image is designed to work smoothly when deploying containers to Lambda.
ENV PORT=5000: We set an environment variable for the server port. Our app will listen on port 5000.
COPY --from=builder /app/ ${LAMBDA_TASK_ROOT}: This grabs all the files from the builder stage and copies them into Lambda’s special working directory (${LAMBDA_TASK_ROOT}).
COPY --from=builder /app/node_modules ${LAMBDA_TASK_ROOT}/node_modules: Same thing, but this one specifically copies the node_modules folder (all your installed dependencies) into Lambda’s working directory.
COPY --from=builder /app/package.json ${LAMBDA_TASK_ROOT}: Copies the package.json file into Lambda’s working directory.
COPY --from=builder /app/package-lock.json ${LAMBDA_TASK_ROOT}: Copies the lock file for your dependencies – so Lambda knows exactly which versions of libraries to use.
EXPOSE 5000: This tells Docker, “Hey, my app is going to listen for requests on port 5000!" (Though Lambda doesn’t use this directly, it’s useful for local testing.)
CMD [ "lambda.handler" ]: This tells AWS Lambda which function to run when the container starts.
In this case, it’s looking for a handler function inside your app – that’s the entry point!

How to Create Our Own Docker Image

Before we proceed, you need to have Docker running on your machine. If you haven’t installed Docker yet, check out the official installation guide here: Docker Installation Tutorial. It’s a great resource to get Docker up and running.

Ensure Docker is Running

Make sure Docker Desktop is installed and running. You can usually tell by the Docker icon in your system tray. If it’s not running, start it up before proceeding.

Build the Docker Image

Now, it’s time to create a Docker image of our application. In your terminal, navigate to the root directory of your project (where your Dockerfile is located). Then run the following command:

docker build -t demo-lambda-project:latest .

The docker build command tells Docker to create an image.
The -t demo-lambda-project:latest flag assigns a tag (or name) to your image (we’ll change this later to the image naming convention supported by AWS Elastic Container Registry – ECR).
- Here, demo-lambda-project is the name, and latest is the tag indicating the most recent build.
The . at the end tells Docker to look for the Dockerfile in the current directory.

What This Does

Docker will now follow the instructions in your Dockerfile step-by-step. It starts by building your Node.js app (using the lightweight Node 18 image), installs the dependencies, and then copies everything over to an AWS Lambda-ready image. Once done, you have a neat image tagged as demo-lambda-project:latest that’s ready for deployment.

How to Create a Container Registry on AWS Elastic Container Registry (ECR) 📁

Okay, let’s dive into creating an image registry on AWS Elastic Container Registry (ECR). Follow these steps closely to set up your repository named lambda-practice:

In the search bar at the top, type "ECR". You should see Amazon ECR pop up in the dropdown results. Click on it to navigate to the Elastic Container Registry section.

Step 2: Start Creating Your Repository

Once you’re in the ECR section, look for a button that says "Create repository". Click this button to start setting up your new container registry.

Step 3: Configuring the Repository Details

You’ll need to add some info like:

Repository name: In the form that appears, enter lambda-practice as the repository name. This name will be used to reference your repository later when uploading your Docker image.
Tag mutability: You’ll also see an option for Tag Mutability. For this tutorial, set it to Mutable. This means that if you need to update or change a tag on your image later, you can do so. (Keep in mind that in some scenarios, you might want immutable tags for images used in production environments – but mutable tags are great for testing and development, especially since we want to use the tag latest for our images.)

When you’re happy with the settings, click the "Create repository" button at the bottom of the form.

Repository Created – Now Let's Take a Look

After creating the repository, AWS will redirect you to the page listing your repositories.

Find the repository named lambda-practice in the list. This is your newly created container registry where you can push Docker images.

Copy the lambda-practice repository URI, which we’ll need later when we push our image from our local machine. The URI should be in a format similar to this - .dkr.ecr..amazonaws.com/lambda-practice

And that’s it! You’ve now successfully created a container registry on AWS ECR and have your repository (lambda-practice) ready to receive your Docker image. 🚀

IAM with AWS: How to Create a User on AWS IAM to Allow Access to Your AWS ECR 👤🔐

Now that we’ve successfully created our AWS ECR container registry (the home for our Docker image), it's time to make sure our local machine has the necessary permissions to interact with that registry. Without proper authorization, we won’t be able to upload our image.

To do that, we’ll create an IAM user with the appropriate permissions.

Step 1: Access the IAM Console

Start by logging in to your AWS Management Console: https://console.aws.amazon.com/console/home.

In the search bar at the top, type "IAM" and select the IAM service from the dropdown. This brings you to the IAM dashboard where you can manage users, roles, policies, and more.

Step 2: Navigate to the Users Section

On the left sidebar of the IAM dashboard, click on "Users". Here you'll see a list of existing users, and this is where you'll add a new one.

Step 3: Create a New User

Click the "Add users" button at the top. In the "Set user details" step, enter the username as lambda-practice.

Step 4: Attach Permissions Directly

In the "Set permissions" step, choose "Attach policies directly". In the search box, type AmazonEC2ContainerRegistryPowerUser. Select the AmazonEC2ContainerRegistryPowerUser policy by ticking its checkbox. This policy grants the necessary permissions to work with AWS ECR, such as pushing and pulling Docker images.

Click Next, and verify that the username is lambda-practice and that the AmazonEC2ContainerRegistryPowerUser policy is attached. If everything looks good, click "Create user".

Step 5: Generate Access Keys for the User

Once the user is created, you’ll be redirected to the page listing all IAM users. Locate and click on the user lambda-practice. This action will take you to the user’s summary page.

Navigate to the "Security credentials" tab.
Under "Access keys", click the "Create access key" button.
A page will appear for configuring the new access key.

In the "Access key best practices & alternatives" step, select "Command Line Interface (CLI)".

Why should you select this option? Choosing CLI ensures that the generated access key is optimized for use with the AWS CLI and other command-line tools (like Docker commands that push images to ECR), which is exactly what we need for our workflow.

Leave the other configurations as their default settings, and then click "Create access key".

Once the key is created, you’ll see the new Access key ID and Secret access key. Make sure to copy and store these credentials securely. They are essential for authorizing your local machine to access AWS ECR and perform operations with the permissions assigned to the lambda-practice user.

How to Authorize Your Local PC to Publish Images to the AWS ECR Repository

Now that we have our IAM user set up and the access keys in hand, it’s time to authenticate our local PC so we can securely push our Docker images to AWS ECR using the AWS CLI. Follow these steps:

Step 1: Install the AWS CLI

If you haven’t installed the AWS CLI on your machine yet, download and install it using the official guide here: Install the AWS CLI.

This tool allows you to interact with your AWS account right from the command line, which is essential for pushing images to ECR.

Step 2: Configure Your AWS CLI Credentials

Once installed, you need to configure your AWS CLI to use the credentials associated with the lambda-practice user. Open your terminal and run the following command to set up a new profile named lambda:

aws configure --profile lambda

You’ll be prompted to enter the following details:

AWS Access Key ID: Paste the access key ID that you generated for the lambda-practice user.
AWS Secret Access Key: Paste the corresponding secret access key.
Default region name: Enter your preferred AWS region (for example, us-east-1 or your relevant region).
Default output format: You can leave this as json or choose your preferred format.

This command configures a new CLI profile called lambda with the credentials of our IAM user.

Step 3: Verify the Configuration

To ensure everything is set up correctly, run:

aws sts get-caller-identity --profile lambda

This command will return details about the IAM user configured for the lambda profile, confirming that your local PC is now authenticated correctly.

Now you’re all set! Your AWS CLI is configured with the lambda profile, meaning your local machine has the right credentials to interact with your AWS ECR repository and push Docker images using the permissions assigned to your lambda-practice IAM user.

How to Upload Your Docker Image to the AWS ECR repository ⬆️

Uploading your Docker image to AWS ECR is the moment when your hard work gets sent off to your repository so AWS Lambda can later grab and run your container. Now that your PC is authorized to talk to ECR, let’s take a look at how to upload the image.

Step 1: Log in to ECR with Docker

Before you can push your image, you need to authenticate Docker to your AWS ECR account. You do this by running a command that gets an authentication token from AWS and pipes it to Docker. For example:

aws ecr get-login-password --region  --profile lambda | docker login --username AWS --password-stdin .dkr.ecr..amazonaws.com

Let’s break it down:

aws ecr get-login-password --region --profile lambda: This part uses the AWS CLI to get a temporary login password for ECR. Be sure to replace with the region in which your ECR repository was created (for example, us-east-1).
| docker login --username AWS --password-stdin .dkr.ecr..amazonaws.com: The pipe (|) takes the password from the AWS CLI command and passes it as input to docker login. The login command then logs Docker into ECR using the provided username (AWS) and the password. Replace with your actual AWS account ID.

Step 2: Environment Considerations

This command works on shell environments like Powershell, zsh, and bash.

Windows Users (CMD):
If you’re using the classic Windows Command Prompt (CMD), the piping syntax might not work the same way. In that case, you might consider using Windows PowerShell or Git Bash. Alternatively, you can run the command in an environment like Windows Subsystem for Linux (WSL).

Why Use the Correct Region?

It is crucial to use the exact region where your ECR repository was created. The region is a part of your repository URI. If you use the wrong region, the login will fail because it won’t find the correct repository endpoint.

How to Check the Region:

Log in to your AWS Console, navigate to the ECR section, and select your repository. The URI will look similar to this: .dkr.ecr..amazonaws.com/lambda-practice. Here, is the region you must use in your login command.

Step 3: Build Your Docker Image with the Correct Tag

Before pushing the image to ECR, you need to build it on your local machine and tag it with your repository’s name. In your terminal, navigate to your project’s root folder (where your Dockerfile is located), then run (replace and placeholders with your AWS Account ID and AWS ECR repository region):

docker build -t .dkr.ecr..amazonaws.com/lambda-practice:latest

Step 4: Push Your Docker Image to AWS ECR

Once your image is built and tagged, it’s time to push it to your remote ECR repository. Run the following command:

docker push .dkr.ecr..amazonaws.com/lambda-practice:latest

This command tells Docker to upload (or “push”) your image to the repository you created earlier.

Make sure the repository URI and tag match what you used in the build command.
Remember, if you use a different region than the one in your repository URI, the push will fail because AWS won’t recognize the repository endpoint.

How to Deploy the Application Container to AWS Lambda from the Image on AWS ECR 🚀

You can deploy your function on AWS Lambda in several ways, each catering to different use cases. Here’s a quick rundown:

ZIP file upload: Simply compress your code and dependencies into a ZIP file, then upload it directly via the AWS Lambda console. This traditional method is great for small codebases that don’t require custom runtimes.
Direct editing in the console: Write or edit your function code directly in the AWS Lambda code editor. Handy for quick tweaks, but not ideal for larger projects.
Container image: Package your application as a Docker container image and deploy it. This approach is particularly useful if you have complex dependencies, need a custom runtime, or want consistent environments across development and production.

In this tutorial, we’re taking the container image route because it offers flexibility, consistency, and scalability – all while letting us reuse our existing Docker configuration. Let’s walk through the steps for deploying your containerized application to AWS Lambda:

Step 1: Access the AWS Lambda Console

Log into your AWS Management Console. In the search bar at the top, type "Lambda" and select the AWS Lambda service from the dropdown results.

Step 2: Create a New Lambda Function

Once on the Lambda page, click the "Create function" button. You’ll see multiple function creation options. For our purposes, select the "Container image" option. This choice tells AWS that you’ll be deploying a containerized application instead of uploading a ZIP file.

Step 3: Name Your Function

In the function setup screen, enter lambda-practice as the name of your new Lambda function. This name identifies your function in AWS.

Step 4: Configure the Container Image

Under the “Container image” settings, click the "Browse images" button. A new window should appear, listing your available images from AWS Elastic Container Registry (ECR).

Select the repository you previously created (for instance, the one named lambda-practice), and pick the image tagged as latest.

Step 5: Finalize and Create

Now you’ll want to review the basic settings. In this step, you might also configure additional options such as memory allocation, timeout limits, and environment variables, depending on your application needs.

Once everything is set, click "Create function" to finalize the deployment.

How to Enable Access to Your Lambda Function

Awesome – hurray, you’ve successfully deployed your image from AWS ECR to AWS Lambda! Now the next step is to make sure your function is up and running and can be triggered properly. But you might be wondering, “How do I actually access my Lambda function to see if it’s working?” Let's break it down:

Understanding Lambda Function Triggers

There are several ways to invoke a Lambda function, and AWS supports multiple trigger options. Here are a few:

Event Source Mapping: Automatically triggers your function in response to changes in services like DynamoDB, Kinesis, or S3.
Scheduled Events: Set up cron-like scheduled invocations via Amazon CloudWatch Events.
API Gateway: Create RESTful APIs that call your function.
AWS SDK/CLI: Directly invoke the function using the AWS SDK or CLI commands.
Function URLs: A simple way to expose your function over HTTPS, giving you a public URL that users or applications can call directly.

In this tutorial, we’re going to use a Function URL to trigger our Lambda function via an HTTP event. This method allows you to invoke your function from the public internet and is perfect for testing or building public-facing APIs.

How to Create a Function URL for Your Lambda Function

Now that you're on your Lambda function's details page, here’s how to create a Function URL step-by-step:

First, on your Lambda function’s page, click the "Configuration" tab at the top. Within the Configuration section, find and select the "Function URL" sub-tab. This is where you manage the public URL for your function.

Click on the "Create Function URL" button. This will open a new configuration screen for setting up your Function URL.

Authentication type: Set the Auth type to NONE. This setting allows public, unauthenticated access to your function from the internet, which means anyone with the URL can invoke it. (This is great for testing or building public services, but be cautious with security in production environments!)
Additional settings: Under the Additional Settings section, enable Configure cross-origin resource sharing (CORS). This is useful if you plan to call your function from client-side applications hosted on different domains. Think of it as opening a window for your app to communicate with other web pages or services.

After configuring your settings, click the appropriate button to create or save the Function URL.

Verify Your Function URL

Once configured, you’ll see the Function URL displayed on the same page. You can now copy this URL.

Paste the URL into a browser or use tools like curl or Postman to send an HTTP request, triggering your Lambda function and verifying that it works as expected.

You should get a response just like this on your browser:

And that’s it! You’ve successfully set up a public HTTP endpoint that triggers your AWS Lambda function. Whether you're testing your deployment or building a public-facing API, the Function URL makes it easy for anyone to interact with your function.

Congrats — You did it!

You've just walked through the entire journey of deploying a Node.js web server, containerized with Docker, all the way to AWS Lambda using AWS ECR as your image repository. 🚀

From writing and containerizing your Node.js application, creating an AWS ECR repository, setting up IAM users and access keys, pushing your Docker image to ECR, to deploying it on Lambda – you’ve covered it all like a pro. 💪

Not only that, but you also configured a public-facing Function URL so your serverless app can now handle requests from anywhere in the world 🌍.

You’ve just combined modern cloud-native workflows with serverless deployment – giving you flexibility, scalability, and lightning-fast response times without the headache of managing servers 😁.

👏 Give yourself a pat on the back. You’ve officially containerized and deployed your Node.js web server to AWS Lambda!

Advantages of Adopting the Serverless Model in Businesses 💼

When it comes to deploying applications in the cloud, the serverless model has truly flipped the old playbook and has helped businesses save on Cloud costs! Let’s break it down in simple, real-world terms.

Cost-Efficiency 💰

For most businesses – especially startups – serverless offers a major financial advantage. Here’s why:

In traditional models like IaaS (Infrastructure as a Service) and PaaS (Platform as a Service), such as using AWS EC2 or AWS Elastic Beanstalk, you provision resources upfront.

For example: You spin up a server with 4 GB RAM and 4 vCPUs, and AWS charges you $100/month (this covers 730 hours – the whole month). Even if your app barely does anything – say it only serves real requests for 120 hours, and uses just 1 GB of memory – you still pay the full $100, because the resources were reserved and waiting for traffic 24/7.

But with Serverless:

You don’t pre-allocate or reserve compute power.
Your application only runs when someone actually needs it (for example, when a user makes an HTTP request).
You only pay for the actual execution time and the resources used.

For instance, if your function only runs for 50 hours in a month and uses 1.5 GB RAM, you might pay something like $30, compared to the flat $100 you'd have paid on EC2 or Elastic Beanstalk.

Scalability Without Stress 📈

Serverless platforms like AWS Lambda automatically handle:

Scaling up during high demand.
Scaling down to zero when idle.

This means your team won’t need to predict or provision for resources during traffic surges. Whether 1 or 1 million users visit your app, the cloud provider handles the rest.

Simplified Operations ⚙️

For your software team:

No more babysitting servers, patching security updates, or worrying about load balancers.
You focus purely on writing the business logic and shipping code.
The cloud provider handles the infrastructure behind the scenes.

This frees up your team’s time, cuts maintenance tasks, and speeds up development times.

Better Return on Investment (ROI) 📊

Because you only pay for what you use, the cost-to-value ratio improves significantly. Startups and businesses can:

Launch faster.
Experiment without financial risk.
Scale without surprise bills.
Avoid overpaying for idle resources.

Disadvantages of the Serverless Model 🚫

As exciting and cost-friendly as the serverless model seems, the golden rule in tech still applies:
every solution comes with trade-offs.

Let’s walk through a few important downsides you should consider:

No Built-in Support for Background Jobs ⏰

Unlike traditional servers where you can run background processes – like sending out newsletters at midnight or cleaning up databases at scheduled times – serverless platforms such as AWS Lambda don’t natively support background tasks or recurring jobs.

For example, let’s say you wanted your app to automatically generate reports every day at 3 AM. In a typical server setup, you’d just write a cron job and call it a day.

But with Lambda or serverless, you can’t do this directly inside your deployed function. Instead, you need external tools like:

AWS EventBridge (for scheduling and triggering Lambda functions)
Or other cloud-native schedulers.

This adds a bit of extra setup, management, and sometimes extra cost.

Unpredictable Cloud Costs 💸

One of the biggest selling points of serverless is “pay-as-you-use” – but this can also become a financial blind spot, because:

Costs depend on traffic volume and resource usage.
If your app suddenly goes viral or experiences a traffic spike, your cloud bill could skyrocket without warning.

For example, an app that runs stable at $30/month for low traffic could unexpectedly hit $1000+ if a marketing campaign or external event drives huge numbers of users to your service. While this means your app is succeeding, your budget might take a hit.

In contrast, with traditional models like AWS EC2 or Elastic Beanstalk, your costs are usually predictable – even if your server sits idle all month.

When to Adopt the Serverless Model 🤔

So, is Serverless always the right choice? Not necessarily!

If you expect:

Steady, predictable workloads, EC2 or Elastic Beanstalk might offer more cost certainty.
Long-running background tasks, serverless isn’t ideal without extra services.
Real-time control over resource limits, traditional servers give you more flexibility.

But if your app has burst traffic (users come and go), event-driven logic (like APIs or webhooks), or you want minimal ops overhead, then Serverless can save time, effort, and money.

When Serverless is the Perfect Fit: A Startup Building an Event-Driven API

Imagine you’re running a small tech startup that just launched an app for booking fitness classes. Your team is small, budgets are tight, and traffic is unpredictable – some days you have 50 users, some days 5,000.

In this case:

Your backend mostly handles HTTP requests: new sign-ups, class bookings, cancellations, and payments.
Traffic spikes during lunch breaks and weekends, but is quiet at night.
You don’t want to hire a full-time DevOps engineer just to manage servers.

👉 Why Serverless is perfect in this case:

You only pay when people use your app.
No need to manage or provision servers.
AWS Lambda auto-scales based on demand.
Fast to deploy, easy to connect to other AWS services (like DynamoDB for your database, S3 for images, and SES for emails).

By using Serverless in this case, you can save money, scale automatically, and stay laser-focused on features – not infrastructure.

When Serverless is Not a Good Fit: A Video Streaming Platform

Now imagine you’re building the next YouTube-like service for a niche audience – say, education-based content for universities.

In this case:

Your platform requires continuous background processing: encoding videos, generating thumbnails, and pushing them to CDN.
Users stream content 24/7, meaning your app is always under load.
Background jobs like recommendation engine updates or nightly reports need to run frequently.

👉 Why Serverless might be a bad idea:

Functions like AWS Lambda have a timeout limit (for example 15 minutes max per execution).
Continuous processing or streaming doesn’t fit the on-demand, short-lived nature of serverless.
Costs could skyrocket since the app runs almost all the time, making it more expensive than a dedicated EC2 or Kubernetes cluster.

Better alternative:
For this kind of use case, a traditional server-based setup – like EC2 or container orchestration via ECS or Kubernetes – would offer more control, predictable pricing, and support for long-running processes

✅ Bottom line:
Serverless is fantastic for modern apps, but like any tool, it’s best used when its strengths match your project’s needs.

Conclusion 📝

Congratulations on making it to the end of this tutorial! 🚀

In this article, we explored the power of serverless computing by walking step-by-step through the process of deploying a Node.js web server using Docker and AWS Lambda.

From building your container image, pushing it to AWS ECR, and finally deploying it on Lambda – you’ve now seen how easy it is to get an app running without the hassle of provisioning servers.

We also discussed the advantages of adopting the Serverless model in deploying your applications, it’s disadvantages, and real-world use cases in which you should adopt the serverless approach.

About the Author 👨‍💻

Hi, I’m Prince! I’m a DevOps engineer and Cloud architect passionate about building, deploying, and managing scalable applications and sharing knowledge with the tech community.

If you enjoyed this article, you can learn more about me by exploring more of my blogs and projects on my LinkedIn profile. You can find my LinkedIn articles here. You can also visit my website to read more of my articles as well. Let’s connect and grow together! 😊

How to Build a Serverless CRUD REST API with the Serverless Framework, Node.js, and GitHub Actions

Ifeanyi Otuonye — Wed, 21 Aug 2024 19:22:55 +0000

Serverless computing emerged as a response to the challenges of traditional server-based architectures. With serverless, developers no longer need to manage or scale servers manually. Instead, cloud providers handle infrastructure management, allowing teams to focus solely on writing and deploying code.

Serverless solutions automatically scale based on demand and offer a pay-as-you-go model. This means that you only pay for the resources your application actually uses. This approach significantly reduces operational overhead, increases flexibility and accelerates development cycles, making it an attractive option for modern application development.

By abstracting server management, Serverless platforms let you concentrate on business logic and application functionality. This leads to faster deployments and more innovation. Serverless architectures are also event-driven, which means they can automatically respond to real-time events and scale to meet user demands without manual intervention.

Important Concepts to Understand
Prerequisites
Our Use Case
Tutorial Objectives
How to get Started:Clone the Git Repository
Step 1: Set up the Serverless Framework Environment
Step 2: Define the API in the Serverless YAML File
Step 3: Develop the Lambda Functions for CRUD Operations
Step 4: Set Up CI/CD Pipeline Multi-stage Deployments for Dev and Prod Environments
Step 5: Test the Dev and Prod Pipelines
Step 6: Test and Validate Prod and Dev APIs using Postman
Conclusion

Before diving into the technical details, we'll go over some key background concepts.

Important Concepts to Understand

Application Programming Interface (API)

An Application Programming Interface (API) allows different software applications to communicate and interact with each other. It defines the methods and data formats that applications can use to request and exchange information for integration and data sharing between diverse systems.

HTTP Methods

HTTP methods or request methods are a critical component of web services and APIs. They indicate the desired action to be performed on a resource in a given request URL.

The most commonly used methods in RESTful APIs are:

GET: used to retrieve data from a server
POST: sends data, included in the body of the request, to create or update a resource
PUT: updates or replaces an existing resource or creates a new resource if it doesn’t exist
DELETE: deletes the specified data from the server.

Amazon API Gateway

Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor and secure APIs at scale. It acts as an entry point for multiple APIs, managing and controlling the interactions between clients (such as web or mobile applications) and backend services.

It also provides various functions, including request routing, security, authentication, caching and rate limiting that help simplify the management and deployment of APIs.

Amazon DynamoDB

DynamoDB is a fully managed NoSQL database service designed for high scalability, low latency, and replication of data across multiple regions.

DynamoDB stores data in a schema-less format, allowing for flexible and fast storage and retrieval of structured and semi-structured data. It is commonly used for building scalable and responsive applications in cloud-based environments.

Serverless CRUD Application

A serverless CRUD application refers to the ability to Create, Read, Update and Delete data. But the architecture and components involved differ from traditional server-based applications.

Create involves adding new entries to a DynamoDB table. The Read operation retrieves data from a DynamoDB table. Update updates existing data in DynamoDB. And the Delete operation deletes data from DynamoDB.

The Serverless Framework

The Serverless Framework is an open-source tool that simplifies the deployment and management of serverless applications across multiple cloud providers, including AWS. It abstracts away the complexity of provisioning and managing infrastructure by allowing developers to define their infrastructure as code using a YAML file.

The framework handles the deployment, scaling and updating of serverless functions, APIs and other resources.

GitHub Actions

GitHub Actions is a powerful CI/CD automation tool that allows developers to automate their software workflows directly from their GitHub repository.

With GitHub Actions, you can create custom pipelines triggered by events such as code pushes, pull requests, or branch merges. These workflows are defined in YAML files within the repository and can perform tasks like testing, building and deploying applications to various environments.

Postman

Postman is a popular collaboration platform that simplifies the process of designing, testing, and documenting APIs. It offers a user-friendly interface for developers to create and send HTTP requests, test API endpoints, and automate testing workflows.

Alright, now that you're familiar with the tools and technologies we'll use here, let's dive in.

Prerequisites

Node.js and npm installed
AWS CLI configured with access to your AWS account
A Serverlesss Framework account
Serverlesss Framework globally installed in your local CLI

Our Use Case

Meet Alyx, an entrepreneur who has recently been learning about serverless architecture. She's read about how it's a powerful and efficient way to build backends for web applications, offering a more modern approach to web application development.

She wants to apply what she's learned so far about of the fundamentals of AWS serverless computing. She knows that serverless doesn’t mean there are no servers involved – rather, it just abstracts away the management and provisioning of servers. And now she wants to focus solely on writing code and implementing business logic.

Let’s check out how Alyx, the owner of a thriving coffee shop, begins to leverage serverless architecture for the backend of her web application.

Alyx’s Coffee Haven, an online coffee shop, offers an array of coffee blends and treats for sale. Initially, Alyx managed the shop’s orders and inventory with traditional web hosting services and operations, where she handled multiple servers and resources. But as her coffee shop grew in popularity, she started facing an increasing number of orders, especially during peak hours and seasonal promotions.

Managing the servers and ensuring the application could handle the surge in traffic became a challenge for Alyx. She found herself constantly worrying about server capacity, scalability, and the cost of maintaining the infrastructure.

She also wanted to introduce new features like personalized recommendations and loyalty programs, but this became a daunting task given the limitations of her traditional setup.

Then Alyx learned about the concept of serverless. She likened a serverless backend to a barista who automatically brews coffee in real-time, without her having to worry about the intricate details of the coffee-making process.

Excited by this idea, Alyx decided to migrate her coffee shop’s backend to a serverless platform using AWS Lambda, AWS API Gateway, and Amazon DynamoDB. This setup will let her focus more on crafting the perfect coffee blends and treats for her customers.

With serverless, each customer’s order becomes an event that triggers a series of serverless functions. Separate AWS Lambda functions processes the orders and handles all the business logic behind the scenes. For instance, it creates a customer’s order and is able to retrieve that order. It can also delete someone's order or update an order’s status.

Alyx no longer needs to worry about managing servers, as the serverless platform automatically scales up and down based on incoming order requests. Also, the cost-efficiency of serverless is huge for Alyx. With a pay-as-you-go model, she only pays for the actual compute time her functions consume, offering her a more a cost-effective solution for her growing business.

But she doesn’t stop there! She also wants to automate everything, from deploying infrastructure to updating her application whenever there’s a new change. By utilizing Infrastructure as Code (IaC) with the Serverless Framework, she can define all her infrastructure in code and manage it easily.

On top of that, she sets up GitHub Actions for continuous integration and delivery (CI/CD), so that every change she makes is automatically deployed through a pipeline, whether it’s a new feature in development or a hot fix for production.

Tutorial Objectives

Set up the Serverless Framework environment
Define an API in the YAML file
Develop AWS Lambda functions to process CRUD operations
Set up multi-stage deployments for Dev and Prod
Test the Dev and Prod pipelines
Test and validate Dev and Prod APIs using Postman

How to Get Started: Clone the Git Repository

To enhance your understanding and so you can follow along with this tutorial more effectively, go ahead and clone the project’s repository from my GitHub. You can do that by going here. As we move forward, feel free to edit the files as you feel necessary.

After cloning the repository, you will notice the presence of multiple files in your folder, as you can see in the image below. We’ll use all of these files to build our serverless coffee shop API.

Step 1: Set up the Serverless Framework Environment

To set up the Serverless Framework environment for automated deployments, you'll need to authenticate your Serverless Framework account via the CLI.

This requires creating an access key that enables the CI/CD pipeline and utilizes the Serverless Framework to authenticate securely into your account without exposing your credentials. By signing into your Serverless account and generating an access key, the pipeline can deploy your serverless application automatically from the build configuration file.

To do this, head to your Serverless account and navigate to the Access Keys section. Click on “+add,” name it SERVERLESS_ACCESS_KEY, and then create the key.

Once you’ve created your access key, be sure to copy and store it securely. You'll use this key as a secret variable in your GitHub repository to authenticate and authorize your CI/CD pipeline.

It will provide access to your Serverless Framework account during the deployment process. You’ll add this key to your GitHub repository’s secrets later, so your pipeline can securely use it to deploy the serverless resources without exposing sensitive information in your codebase.

Now, let’s define the AWS resources as code in the severless.yaml file.

Step 2: Define the API in the Serverless YAML File

In this file, you'll define the core infrastructure and functionality of the Coffee Shop API using the Serverless Framework’s YAML configuration.

This file defines the AWS services being utilized, including API Gateway, Lambda functions for CRUD operations, and DynamoDB for data storage.

You'll also configure an IAM role so the Lambda functions have the necessary permissions to interact with the DynamoDB service.

The API Gateway is set up with appropriate HTTP methods (POST, GET, PUT, and DELETE) to handle incoming requests and trigger the corresponding Lambda functions.

Let’s check out the code:

service: coffee-shop-api
frameworkVersion: '4'

provider:
  name: aws
  runtime: nodejs20.x
  region: us-east-1
  stage: ${opt:stage}
  iam:
    role:
      statements:
        - Effect: Allow
          Action:
            - dynamodb:PutItem
            - dynamodb:GetItem
            - dynamodb:Scan
            - dynamodb:UpdateItem
            - dynamodb:DeleteItem
          Resource: arn:aws:dynamodb:${self:provider.region}:*:table/CoffeeOrders-${self:provider.stage}

functions:
  createCoffee:
    handler: createCoffee.handler
    environment:
      COFFEE_ORDERS_TABLE: CoffeeOrders-${self:provider.stage}
    events:
      - http:
          path: coffee
          method: post

  getCoffee:
    handler: getCoffee.handler
    environment:
      COFFEE_ORDERS_TABLE: CoffeeOrders-${self:provider.stage}
    events:
      - http:
          path: coffee
          method: get

  updateCoffee:
    handler: updateCoffee.handler
    environment:
      COFFEE_ORDERS_TABLE: CoffeeOrders-${self:provider.stage}
    events:
      - http:  
          path: coffee  
          method: put  

  deleteCoffee:  
    handler: deleteCoffee.handler
    environment:
      COFFEE_ORDERS_TABLE: CoffeeOrders-${self:provider.stage}
    events:
      - http:
          path: coffee
          method: delete
resources:
  Resources:
    CoffeeTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: CoffeeOrders-${self:provider.stage}
        AttributeDefinitions:
          - AttributeName: OrderId
            AttributeType: S
          - AttributeName: CustomerName
            AttributeType: S
        KeySchema:
          - AttributeName: OrderId
            KeyType: HASH
          - AttributeName: CustomerName
            KeyType: RANGE
        BillingMode: PAY_PER_REQUEST

The serverless.yml configuration defines how Alyx's Coffee Shop API will run in a serverless environment on AWS. The provider section specifies that the application will use AWS as the cloud provider, with Node.js as the runtime environment.

The region is set to us-east-1 and the stage variable allows for dynamic deployment across different environments, like dev and prod. This means that the same code can deploy to different environments, with resources being named accordingly to avoid conflicts.

In the iam section, permissions are granted to Lambda functions to interact with the DynamoDB table. The ${self:provider.stage} syntax dynamically names the DynamoDB table, so that each environment has its own separate resources, like CoffeeOrders-dev for the development environment and CoffeeOrders-prod for production. This dynamic naming helps manage multiple environments without manually configuring separate tables for each one.

The functions section defines the four core Lambda functions, createCoffee, getCoffee, updateCoffee and deleteCoffee. These handle the CRUD operations for the Coffee Shop API.

Each function is connected to a specific HTTP method in the API Gateway, such as POST, GET, PUT and DELETE. These functions interact with the DynamoDB table that’s dynamically named based on the current stage.

The last resources section defines the DynamoDB table itself. It sets up the table with the attributes OrderId and CustomerName, which are used as the primary key. The table is configured to use a pay-per-request billing mode, making it cost-effective for Alyx's growing business.

By automating the deployment of these resources using the Serverless Framework, Alyx can easily manage her infrastructure, freeing her from the burden of manually provisioning and scaling resources.

Step 3: Develop the Lambda Functions for CRUD Operations

In this step, we implement the core logic of Alyx’s Coffee Shop API by creating Lambda functions with JavaScript that perform the essential CRUD operations createCoffee, getCoffee, updateCoffee and deleteCoffee.

These functions utilize the AWS SDK to interact with AWS services, particularly DynamoDB. Each function will be responsible for handling specific API requests such as creating an order, retrieving orders, updating order statuses, and deleting orders.

Create Coffee Lambda function

This function creates an order:

const AWS = require('aws-sdk');
const dynamoDb = new AWS.DynamoDB.DocumentClient();
const { v4: uuidv4 } = require('uuid');

module.exports.handler = async (event) => {
  const requestBody = JSON.parse(event.body);
  const customerName = requestBody.customer_name;
  const coffeeBlend = requestBody.coffee_blend;
  const orderId = uuidv4();

  const params = {
    TableName: process.env.COFFEE_ORDERS_TABLE,
    Item: {
      OrderId: orderId,
      CustomerName: customerName,
      CoffeeBlend: coffeeBlend,
      OrderStatus: 'Pending'
    }
  };

  try {
    await dynamoDb.put(params).promise();
    return {
      statusCode: 200,
      body: JSON.stringify({ message: 'Order created successfully!', OrderId: orderId })
    };
  } catch (error) {
    return {
      statusCode: 500,
      body: JSON.stringify({ error: `Could not create order: ${error.message}` })
    };
  }
};

This Lambda function handles the creation of a new coffee order in the DynamoDB table. First we import the AWS SDK and initialize a DynamoDB.DocumentClient to interact with DynamoDB. The uuid library is also imported to generate unique order IDs.

Inside the handler function, we parse the incoming request body to extract customer information, such as the customer's name and preferred coffee blend. A unique orderId is generated using uuidv4() and this data is prepared for insertion into DynamoDB.

The params object defines the table where the data will be stored, with TableName dynamically set to the value of the environment variable COFFEE_ORDERS_TABLE. The new order includes fields such as OrderId, CustomerName, CoffeeBlend, and an initial status of Pending.

In the try block, the code attempts to add the order to the DynamoDB table using the put() method. If successful, the function returns a status code of 200 with a success message and the OrderId. If there’s an error, the code catches it and returns a 500 status code along with an error message.

Get Coffee Lambda function

This function retrieves all coffee items:

const AWS = require('aws-sdk');
const dynamoDb = new AWS.DynamoDB.DocumentClient();

module.exports.handler = async () => {
  const params = {
    TableName: process.env.COFFEE_ORDERS_TABLE
  };

  try {
    const result = await dynamoDb.scan(params).promise();
    return {
      statusCode: 200,
      body: JSON.stringify(result.Items)
    };
  } catch (error) {
    return {
      statusCode: 500,
      body: JSON.stringify({ error: `Could not retrieve orders: ${error.message}` })
    };
  }
};

This Lambda function is responsible for retrieving all coffee orders from a DynamoDB table and exemplifies a serverless approach to retrieving data from DynamoDB in a scalable manner.

We again use the AWS SDK to initialize a DynamoDB.DocumentClient instance to interact with DynamoDB. The handler function constructs the params object, specifying the TableName, which is dynamically set using the COFFEE_ORDERS_TABLE environment variable.

The scan() method retrieves all items from the table. Again, if the operation is successful, the function returns a status code of 200 along with the retrieved items in JSON format. In case of an error, a 500 status code and an error message are returned.

Update Coffee Lambda function

This function updates a coffee item by its ID:

const AWS = require('aws-sdk');
const dynamoDb = new AWS.DynamoDB.DocumentClient();

module.exports.handler = async (event) => {
  const requestBody = JSON.parse(event.body);
  const { order_id, new_status, customer_name } = requestBody;

  const params = {
    TableName: process.env.COFFEE_ORDERS_TABLE,
    Key: {
      OrderId: order_id,
      CustomerName: customer_name
    },
    UpdateExpression: 'SET OrderStatus = :status',
    ExpressionAttributeValues: {
      ':status': new_status
    }
  };

  try {
    await dynamoDb.update(params).promise();
    return {
      statusCode: 200,
      body: JSON.stringify({ message: 'Order status updated successfully!', OrderId: order_id })
    };
  } catch (error) {
    return {
      statusCode: 500,
      body: JSON.stringify({ error: `Could not update order: ${error.message}` })
    };
  }
};

This Lambda function handles updating the status of a specific coffee order in the DynamoDB table.

The handler function extracts the order_id, new_status, and customer_name from the request body. It then constructs the params object to specify the table name and the primary key for the order (using OrderId and CustomerName). The UpdateExpression sets the new status of the order.

In the try block, the code attempts to update the order in DynamoDB using the update() method. Once again, of course if successful, the function returns a status code of 200 with a success message. If an error occurs, it catches the error and returns a 500 status code along with an error message.

Delete Coffee Lambda function

This function deletes a coffee item by its ID:

const AWS = require('aws-sdk');
const dynamoDb = new AWS.DynamoDB.DocumentClient();

module.exports.handler = async (event) => {
  const requestBody = JSON.parse(event.body);
  const { order_id, customer_name } = requestBody;

  const params = {
    TableName: process.env.COFFEE_ORDERS_TABLE,
    Key: {
      OrderId: order_id,
      CustomerName: customer_name
    }
  };

  try {
    await dynamoDb.delete(params).promise();
    return {
      statusCode: 200,
      body: JSON.stringify({ message: 'Order deleted successfully!', OrderId: order_id })
    };
  } catch (error) {
    return {
      statusCode: 500,
      body: JSON.stringify({ error: `Could not delete order: ${error.message}` })
    };
  }
};

The Lambda function deletes a specific coffee order from the DynamoDB table. In the handler function, the code parses the request body to extract the order_id and customer_name. These values are used as the primary key to identify the item to be deleted from the table. The params object specifies the table name and key for the item to be deleted.

In the try block, the code attempts to delete the order from DynamoDB using the delete() method. If successful, again it returns a 200 status code with a success message, indicating that the order was deleted. If an error occurs, the code catches it and returns a 500 status code along with an error message.

Now that we’ve explained each Lambda function, let’s set up a multi-stage CI/CD pipeline.

Step 4: Set Up CI/CD Pipeline Multi-stage Deployments for Dev and Prod Environments

To set up AWS secrets in your GitHub repository, first navigate to the repository’s settings. Select Settings on the top right, then go to the bottom left and select Secrets and variables.

Next, click on Actions as seen in the image below:

From there, select New repository secret to create secrets.

Three secrets are needed to create for your pipeline, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and SERVERLESS_ACCESS_KEY.

Use your AWS account access key credentials for the first two variables and then the serverless access key previously saved to create the SERVERLESS_ACCESS_KEY. These secrets will securely authenticate your CI/CD pipeline as seen in the image below.

Make sure that your main branch is named “main,” as this will serve as the production branch. Next, create a new branch called “dev” for development work.

You can also create feature-specific branches, such as “dev/feature,” for more granular development. GitHub Actions will use these branches to deploy changes automatically, with dev representing the development environment and main representing production.

This branching strategy allows you to manage the CI/CD pipeline efficiently, deploying new code changes whenever there's a merge into either the dev or prod environments.

How to Use GitHub Actions to Deploy the YAML File

To automate the deployment process for the Coffee Shop API, you'll utilize GitHub Actions, which integrates with your GitHub repository.

This deployment pipeline is triggered whenever code is pushed to the main or dev branches. By configuring environment-specific deployments, you'll ensure that updates to the dev branch deploy to the development environment, while changes to the main branch trigger production deployments.

Now, let’s review the code:

name: deploy-coffee-shop-api

on:
  push:
    branches:
      - main
      - dev

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Setup Node.js
      uses: actions/setup-node@v3
      with:
        node-version: '20.x'

    - name: Install dependencies
      run: |
        cd coffee-shop-api
        npm install

    - name: Install Serverless Framework
      run: npm install -g serverless

    - name: Deploy to AWS (Dev)
      if: github.ref == 'refs/heads/dev'
      run: |
        cd coffee-shop-api
        npx serverless deploy --stage dev
      env:
        AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
        AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        SERVERLESS_ACCESS_KEY: ${{secrets.SERVERLESS_ACCESS_KEY}}

    - name: Deploy to AWS (Prod)
      if: github.ref == 'refs/heads/main'
      run: |
        cd coffee-shop-api
        npx serverless deploy --stage prod
      env:
        AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
        AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        SERVERLESS_ACCESS_KEY: ${{secrets.SERVERLESS_ACCESS_KEY}}

The GitHub Actions YAML configuration is what automates the deployment process of the Coffee Shop API to AWS using the Serverless Framework. The workflow triggers whenever changes are pushed to the main or dev branches.

It begins by checking out the repository’s code, then setting up Node.js with version 20.x to match the runtime used by the Lambda functions. After that, it installs the project dependencies by navigating to the coffee-shop-api directory and running npm install.

The workflow also installs the Serverless Framework globally, allowing the serverless CLI to be used for deployments. Depending on which branch is updated, the workflow conditionally deploys to the appropriate environment.

If the changes are pushed to the dev branch, it deploys to the dev stage. If they are pushed to the main branch, it deploys to the prod stage. The deployment commands, npx serverless deploy --stage dev or npx serverless deploy --stage prod are executed within the coffee-shop-api directory.

For a secure deployment, the workflow accesses AWS credentials and the Serverless access key via environment variables stored in GitHub Secrets. This allows the CI/CD pipeline to authenticate with AWS and the Serverless Framework without exposing sensitive information in the repository.

Now, we can proceed to test out the pipeline.

Step 5: Test the Dev and Prod Pipelines

First, you'll need to verify that the main (prod) branch is called “main”. Then create a dev branch called “dev”. Once you make any valid changes to the dev branch, commit them to trigger the GitHub Actions pipeline. This will automatically deploy the updated resources to the development environment. After verifying everything in dev, you can then merge the dev branch into the main branch.

Merging changes into the main branch also automatically triggers the deployment pipeline for the production environment. This way, all necessary updates are applied and production resources are deployed seamlessly.

You can monitor the deployment process and review detailed logs of each GitHub Actions run by navigating to the Actions tab in your GitHub repository.

The logs provide visibility into each step of the pipeline, helping you verify that everything is working as expected.

You can select any build run to review detailed logs for both the development and production environment deployments so you can track the progress and ensure that everything is running smoothly.

Navigate to the specific build run in GitHub Actions, as demonstrated in the image below. There, you can view the execution details and outcomes for either the development or production pipelines.

Make sure to thoroughly test both the development and production environments to confirm successful pipeline executing.

Step 6: Test and Validate Prod and Dev APIs using Postman

Now that the APIs and resources are deployed and configured, we need to locate the unique API endpoints (URLs) generated by AWS to begin making requests to test functionality.

These URLs can test the API functionality by simply pasting them into a web browser. The API URLs are found in the output results of your CI/CD build.

To retrieve them, navigate to the GitHub Actions logs, select the most recent environment’s successful build, and click deploy to check the deployment details for the generated API endpoints.

Click on the Deploy to AWS stage for the selected environment (Prod or Dev) in your GitHub Actions logs. Once there, you’ll find the generated API URL.

Copy and save this URL, as it will be needed when testing your API’s functionality. This URL is your gateway to verifying that the deployed API works as expected.

Now copy one of the generated API URLs and paste it into your browser. You will see an empty array or list displayed in the response. This actually confirms that the API is functioning correctly and that you are successfully retrieving data from the DynamoDB table.

Even though the list is empty, it indicates that the API can connect to the database and return information.

To verify that your API works across both environments, repeat the steps for the other API environment (Prod and Dev).

For more comprehensive testing, we’ll use Postman to test all the API methods, Create, Read, Update and Delete, and perform these tests for both the development and production environments.

To test the GET method, use Postman to send a GET request to the API’s endpoint using the URL. You will receive the same response, an empty list of coffee orders as seen in the bottom of the image below. This confirms the API’s ability to retrieve data successfully, as shown in the image below.

To actually create an order, let’s test the POST method. Use Postman again to make a POST request to the API endpoint, providing the customer’s name and coffee blend in the request body, as show below :

{
  "customer_name": "REXTECH",
  "coffee_blend": "Black"
}

The response will be a success message with a unique OrderId of the order placed.

Verify that the new order was saved in the DynamoDB table by reviewing the items in the environments specific table :

To test the PUT method, make a PUT request to the API endpoint by providing the previous order ID and a new order status in the request body as shown below :

{                                                 
  "order_id": "42a81c27-1421-4025-9bef-72b14e723c34",
  "new_status": "Ready",                                             
  "customer_name": "REXTECH"                                             
}

The response will be a successful order update message with the OrderId of the order placed.

You can also verify that the order status was updated from the DynamoDB table item.

To test the DELETE method, using Postman, make a DELETE request providing the previous order ID and the customer name in the request body as shown below:

{                                                 
  "order_id": "42a81c27-1421-4025-9bef-72b14e723c34",
  "customer_name": "REXTECH"
}

The response will be a successful order deleted message with the order ID of the order placed.

Again, you can verify that the order has been deleted in the DynamoDB table.

Conclusion

That’s it – congratulations! You’ve successfully completed all the steps. We’ve built a serverless REST API that supports CRUD (Create, Read, Update, Delete) functionality with API Gateway, Lambda, DynamoDB, Serverless Framework and Node.js, automating deployment of approved code changes with Github Actions.

If you’ve gotten this far, thanks for reading! I hope it was worthwhile to you.

Ifeanyi Otuonye is a 6X AWS Certified Cloud Engineer skilled in DevOps, Technical Writing and instructional expertise as a Technical Instructor. He is motivated by his eagerness to learn and develop and thrives in collaborative environments. Before transitioning to the Cloud, he spend six years as a Professional Track and Field athlete.

In the early 2022, he strategically embarked on an mission to be a Cloud/DevOps Engineer through self study and joining a 6 month accelerated Cloud program.

In May 2023, he accomplished that goal and landed his first Cloud Engineering role and has now set another personal mission to empower other individuals on their journey to the Cloud.

Serverless Node.js Tutorial

Beau Carnes — Wed, 28 Feb 2024 20:59:09 +0000

The shift towards serverless architectures is rapidly becoming a pivotal aspect of application development. Even Node.js, which is traditionally used with servers, can be used to build a serverless application.

We just published a course on the freeCodeCamp.org YouTube channel that will teach you how to develop serverless applications using Node.js. You'll learn the nuances of deploying an Express.js and Node.js application to AWS Lambda.

You'll also learn how to leverage the cutting-edge capabilities of Neon Serverless Postgres (which created a grant for freeCodeCamp to help make this course possible) and the Serverless Framework to enhance your application development and deployment strategy.

Justin Mitchel created this course. He is a popular instructor and creator of the Coding for Entrepreneurs course platform. Justin brings his extensive expertise to the table, ensuring that you not only grasp the theoretical aspects but also gain hands-on experience through a project-based learning approach.

Here are some of the key sections in this course.

What Serverless Means for App Development: A foundational overview, preparing learners for the paradigm shift in how applications are built and deployed.
Requirements & Tech Overview: Familiarize yourself with the tools and technologies that will be utilized throughout the course.
Project Setup to Deployment: From initial setup to running Express locally with the Serverless Framework, this course ensures a hands-on approach to mastering serverless deployment.
Securing and Managing Your Deployment: Learn to secure your deployment with AWS System Manager Parameter Store and manage Neon resources efficiently with the Neon CLI.
Database Integration and Management: Dive deep into integrating NodeJS with Neon Postgres, managing database schemas, and automating branched Neon database secrets.
Advanced Deployment Techniques: Master automated deployments via GitHub Actions and explore integration with rewrites in Next.js and Vercel, culminating in deploying Express.js to Vercel.

Serverless architectures offer unparalleled scalability and cost efficiency, allowing developers to focus on code rather than managing servers. This approach not only streamlines deployment processes but also significantly reduces operational costs, as you only pay for the compute time you use, making it an ideal solution for projects of any size.

Watch the full course on the freeCodeCamp.org YouTube channel (4-hour watch).

Serverless Architecture Patterns and Best Practices

freeCodeCamp — Tue, 09 Jan 2024 01:07:58 +0000

By Faith Oyama

Serverless architecture has become a hot topic in the developer world, and for good reason.

It promises a paradigm shift – one where we leave behind the burdens of server management and focus solely on building and deploying code. No more provisioning VMs, patching software, or scaling infrastructure manually.

In this article, we'll simplify the world of serverless, exploring its core principles and benefits. We'll see how serverless applications scale effortlessly, adapt to changing workloads, and potentially save you precious resources.

But, like any powerful tool, serverless comes with its considerations. We'll shed light on potential challenges like cold starts and vendor lock-in, empowering you to make informed decisions before embarking on your serverless journey.

This article aims to equip you with the knowledge and best practices to become a confident serverless developer. We'll break down complex concepts into clear, actionable steps, using real-world examples to illustrate key points. Whether you're a curious beginner or a seasoned developer looking to expand your skill set, this article will serve as your comprehensive guide to unlocking the potential of serverless architecture.

Common Serverless Architecture Patterns

Now that we've laid the groundwork for serverless, let's explore some practical ways to build your serverless applications. Buckle up, because we're entering the realm of patterns, and reusable designs that help you structure your code efficiently and leverage the strengths of serverless architecture.

API Gateway & Lambda: The Dynamic Duo

This is the classic serverless combo, like peanut butter and jelly for web APIs. Think of API Gateway as your friendly neighbourhood receptionist, greeting incoming requests from various sources (web browsers, mobile apps, and so on). With a polite nod, it then routes each request to the appropriate Lambda function (think processing data, sending emails, or updating databases).

It's a seamless partnership: the Gateway handles routing and security, while Lambda focuses on your application's specific logic.

Here are some perks of this pattern:

Rapid Deployments: Get your APIs up and running quickly without worrying about server infrastructure.
Scalability on Demand: Lambda functions automatically scale based on traffic, freeing you from manually adjusting server capacity.
Pay-per-Use Pricing: You only pay for the resources your code uses, making serverless cost-effective for applications with fluctuating traffic.

But remember, even dynamic duos have their quirks. Cold starts, in which the initial invocation of a Lambda function takes longer to complete, can affect initial response times.

And while API Gateway offers strong security features, you still need to implement proper authorization and validation within your Lambda functions.

Fan-Out Pattern

Need to handle massive workloads but don't want to wait in line? Enter the Fan-Out pattern.

Imagine a single event (like a large file upload) triggering a swarm of Lambda functions working simultaneously on smaller chunks of the task. It's like having a team of chefs tackling different courses of a complex meal, making the entire process much faster.

This pattern excels in scenarios like:

Image resizing: Break down a large image into smaller parts for parallel resizing, then stitch them back together for a fast and efficient outcome.
Email sending: Send bulk emails to thousands of recipients without bogging down your system by distributing the task among multiple Lambda functions.

But remember that with great power comes great responsibility. Managing dependencies between your parallel functions and ensuring smooth data consistency can be tricky. Consider using queues or streams to coordinate their work and avoid unwanted surprises.

Messaging Pattern

Ever feel like your code components are tangled in a spaghetti dinner of dependencies? The Messaging pattern comes to the rescue, introducing a layer of calm, asynchronous communication between your serverless functions.

Instead of functions directly calling each other, they simply send messages to a queue (like a virtual mailbox). The functions responsible for processing those messages can pick them up at their own pace, decoupling them from the sender's execution time.

Think of it like leaving an order at a restaurant: tell the kitchen what you want (send a message), and then relax – your food will arrive (the message will be processed) when it's ready, without you needing to keep checking on the chef.

This approach offers several benefits:

Agility: If one function fails, the message remains in the queue for later processing, preventing cascading failures.
Scalability: Scale your message processing functions independently from the sending functions for optimal performance.
Flexibility: Decoupled components are easier to maintain and update, making your application more agile.

But remember, choosing the right queueing service and managing message backlogs requires careful consideration. Make sure your messaging system can handle your application's expected workload and provides efficient message buffering to avoid bottlenecks.

Serverless Best Practices

Function Focus

Keep your Lambda functions small, focused, and stateless. Think of them as single-purpose spells: each should handle a specific task and avoid holding onto any persistent state. This improves their scalability and makes them easier to debug and maintain.

Error Handling

Nobody likes unexpected crashes. Handle errors within your functions and log them efficiently. Use tools like CloudWatch to monitor logs and proactively identify potential issues before they become full-blown serverless storms.

Observability is Key

You need tools to observe your serverless application's health and performance. Utilize monitoring services like Prometheus or Datadog to track metrics like execution time, memory usage, and invocations. Early detection of performance bottlenecks helps you optimize your functions and keep your costs in check.

Testing, Testing, 1, 2, 3

Don't send your functions into the serverless void untested! Rigorous unit and integration testing are crucial for catching bugs and ensuring your code behaves as expected. Frameworks like Jest and Serverless Framework can simplify your testing process and prevent unexpected serverless hiccups.

Cost Optimization

Remember, with great power comes great responsibility for your serverless wallet. Utilize cost-saving features like throttling, which limits function invocations per second, and timeouts, which automatically terminate long-running executions. Pay-per-use billing can be your friend, but only if you manage it wisely.

Security First

Don't let your serverless application fall to malicious attacks. Implement IAM roles and policies to control access to resources and functions. Use encryption for sensitive data and regularly review your security posture to ensure your serverless spells remain protected from dark magic.

Logging and Tracing

Enabling granular logging within your functions helps you troubleshoot issues and understand their execution flow. Use tracing tools like X-Ray to visualize the path of invocations across your serverless components, making debugging a breeze even in the most complex serverless landscapes.

Versioning and Deployment

Continuous improvement is key in the serverless world. Utilize CI/CD pipelines to automate build, test, and deployment processes for your functions. Versioning allows you to roll back to stable versions if needed and experiment with new features without impacting your live application.

Conclusion

Now, it's time to put to test your newfound knowledge and build incredible applications that scale with ease and cost less.

To always stay ahead of the curve, consider exploring these resources:

AWS Serverless Application Model (SAM): Simplify building and deploying serverless applications on AWS.
Serverless Framework: An open-source framework for building and deploying serverless applications across various cloud providers.
Serverless Meetups and Conferences: Connect with other serverless enthusiasts and learn from the experts.

As you continue your serverless exploration, remember the golden rule: experiment, share your knowledge, and have fun. And if you ever encounter a particularly tricky serverless riddle, reach out! The serverless community is always eager to help.

How to Integrate AI into Your Serverless App With Amazon Bedrock

freeCodeCamp — Mon, 02 Oct 2023 19:23:14 +0000

By Sam Williams

In today's tech landscape, integrating AI is no longer a luxury – it's a necessity.

AI-driven applications have the potential to transform user experiences, automate complex tasks, and unlock new realms of possibilities. Understanding and leveraging AI APIs is a pivotal skill for developers looking to stay at the forefront of innovation.

Brief Overview of AI APIs

Artificial Intelligence APIs are powerful tools that allow developers to tap into the capabilities of pre-trained machine learning models. These APIs expose functionalities like natural language processing, computer vision, and more, enabling developers to easily incorporate advanced AI capabilities into their applications.

You no longer have to understand training epochs and neural network architecture to use AI in your projects and build incredibly powerful features for your users.

The purpose of this tutorial:

The goal of this tutorial is to equip you with the knowledge and practical skills you need to seamlessly integrate AI APIs into your projects.

I'll walk you through the entire process, from choosing the right API for your specific needs to hands-on implementation and best practices for seamless integration.

By the end, you'll be well-equipped to infuse AI-powered intelligence into your applications, opening up a world of new possibilities.

So, let's embark on this journey together and unlock the true potential of AI APIs.

Current AI API Options

There are more and more AI services available through a simple API than ever before. In this article we’ll be using Amazon Bedrock, but there are loads more out there. Even Amazon Bedrock has 6 models available, with more coming in the future.

Comparing available AI APIs

To help you make an informed decision, let's compare some of the leading AI APIs available in the market. Below is a comparison table of some prominent options:

API	Description	Price
GPT-3.5 (16k)	Cutting-edge language model that can understand as well as generate natural language or code	$0.0003/ 1000 input tokens $0.004/ 1000 output tokens
GPT-4 (32K)	OpenAI’s most advanced system, producing safer and more useful responses	$0.06/ 1000 input tokens $0.12/ 1000 output tokens
A2I Jurassic-2 Mid model (Bedrock)	Mid-sized model, designed to strike the right balance between exceptional quality and affordability	$0.0125/ 1000 input tokens $0.0125/ 1000 output tokens
A2I Jurassic-2 Ultra model (Bedrock)	AI21’s most powerful model, offering exceptional quality	$0.0188/ 1000 input tokens $0.0188/ 1000 output tokens
Anthropic Claude Instant (Bedrock)	Cutting-edge general purpose large language model	$0.00163/ 1000 input tokens $0.01102/ 1000 output tokens
Stability AI (Bedrock)	Image Generation	$0.018 - $0.072 per image depending on size and quality

Key factors to consider when selecting an API

When choosing an AI API for your project, it's crucial to consider several key factors:

API capabilities and features: Assess the specific functionalities offered by the API and ensure they align with your project requirements. The quality of the generated content can also vary a lot between models, so it's a good idea to test them out and see how well they perform for your use case.
Scalability and performance: Evaluate the API's ability to handle varying workloads and ensure it meets your performance expectations, especially during peak usage.
Cost considerations: Understand the pricing model of the API, including any associated costs for usage, and determine its compatibility with your budget.
Data privacy and security: Ensure that the API provider complies with data protection regulations and has robust security measures in place to safeguard sensitive information.

By taking these factors into account, you'll be better equipped to choose the AI API that best suits the needs of your project.

For this tutorial we’re going go with Amazon Bedrock using the A2I Jurassic-2 Mid model.

How to Request Model Access

As this service is brand new, you have to request access to the models you want to use.

To do this, log into your AWS account, Search for “Bedrock” and then select the “Base models” tab on the left. Mouse over any model and it’ll say that you don’t currently have access and to request access in “Model Access”.

This lists all of the models. Click the edit button in the top right, select the models you want to have access to and then click “Save”. For this you need to select Jurassic-2 Ultra and Jurassic-2 Mid.

Select models for which you'd like to request access

This should only take a minute or two to be approved but best to do it ASAP.

Project: Build a Holiday Planning API

What Our API Will Do

Our API is designed to simplify holiday planning. By providing the state code of your destination and the duration of your visit, we'll generate a personalised itinerary, suggesting the best activities and places to explore.

How to set up the Repo

We'll be using the Serverless Framework for this project. If you've never used it before then you can follow this quick tutorial to install Serverless and get everything set up.

We're going to be using JavaScript for this project so create a new repo like this:

sls create --template aws-nodejs --path aiTourGuide

Create a Lambda with comments

We need to start by creating our Lambda function. I like to store mine under /src/functions/{functionName}/index.js. In this case my functionName will be aiTourGuide.

In the new index.js file we can start with this code. It tries to get the state and the duration from the request and then returns a response.

exports.handler = async (event) => {
    const { state_code, duration } = JSON.parse(event.body);

    // Code for generating itinerary will go here

    const response = {
        statusCode: 200,
        body: JSON.stringify('Itinerary generated successfully!')
    };

    return response;
};

Now we want to get some data to pass to the AI. We could just ask it to generate an itinerary for us, but giving it specific data to work with usually ends up with a much better result.

Visit the National Park Service API website and sign up for an API key.
Once registered, you'll receive an API key by email to access their services.

Add the National Park Service API Call to the Lambda

const axios = require('axios');
const parksApiKey = process.env.parksApiKey

exports.handler = async (event) => {
    const { state_code, duration } = JSON.parse(event.body);

    // Make a request to the National Park Service API
    const parksApiUrl = `https://developer.nps.gov/api/v1/parks?stateCode=${state_code}&api_key=${parksApiKey}`

    const parksResponse = await axios.get(parksApiUrl);

    // Extract relevant data from the response
    const parks = parksResponse.data.data.map(park => {
        return {
            name: park.fullName,
            description: park.description
        };
    });

    // Code for generating itinerary with park data will go here

    const responseBody = parks;
    const response = {
        statusCode: 200,
        body: JSON.stringify(responseBody)
    };

    return response;
};

We're making the request to the parks API using Axios, then getting just the name and description of each park from the response. For now we are just going to return that data in the API to see what we get.

One thing we're doing is to get the parksApiKey from the environment variables at the start of the file. To add the parksApiKey as an environment variable in a Serverless Framework serverless.yml file, you can follow these steps:

Open your serverless.yml file in a text editor.
Locate the provider section, which defines the AWS provider settings. Under it, add an environment block if it doesn't already exist.
Within the environment block, define your environment variable like this:

provider:
  name: aws
  runtime: nodejs18.x
  environment:
    parksApiKey: "YOUR API KEY"

Configure the Serverless.yml config

To actually deploy an API and our code, we need to tell Serverless what to deploy. We do this by changing the functions section of the config.

functions:
  aiTourGuide:
    handler: src/functions/aiTourGuide/index.handler
    events:
      - httpApi:
          path: /tourguide
          method: post

This means we’ll be deploying a aiTourGuide lambda function with a post API endpoint pointing at /tourguide. Just make sure that the handler section is the correct path for your repo and folder structure.

If you have configured your AWS credentials to a specific profile, you need to add that to your provider section, otherwise it will use your default AWS credentials.

provider:
  name: aws
  runtime: nodejs18.x
  profile: "Your Profile" // optional

Deploy and test

Now that we've created our Lambda function and integrated the National Park Service API, it's time to deploy and test our holiday planning API.

Deployment: All we need to do is run sls deploy again and our changes will be deployed.
Testing: Use a tool like Postman to send a POST request to your API with the required parameters, such as state_code and duration. You should get a response like this.

Image showing the response

You can see we have an array of objects, with the name of the park and the description. Exactly what we wanted.

How to prepare our AI prompt

Next, we'll prepare a request to an AI API to enhance our holiday planning recommendations. We'll be using the A2I Jurassic-2 Mid model using Amazon Bedrock to generate engaging descriptions for the recommended activities.

I tend to start relatively simple and refining the prompt as I see how it works. I also wrap my prompt generation in a function. This can get quite large and complex later on, so it’s nicer not having it in the main handler. I often have it in it’s own file!

Lets start with something like this:

const generatePrompt = ({parks,duration}) => {

    const stringListOfParks = parks.map(({name, description}) => {
        return `Park Name: ${name}:
    description: ${description}`}).join(`

    `)

    const prompt = `You are an expert tour guide in the US who focusses on designing holiday itinararies for spending time in the national parks. 
    I am going to give you descriptions of multiple parks in the area as well as the duration of the trip. 
    Create an itinerary for this trip, outlining what activities can bo done on each day.

    Trip duration = ${duration} days

    Local national parks:
    ${stringListOfParks}
    `;
    return prompt
}

The stringListOfParks function turns the object array into a long string. This might not be necessary but we’ll have to wait and see.

Then we create the AI prompt. We tell the AI who they are supposed to be, what information we’re going to give them, and what we want them to do. To start with this is fine, but over time we can test different changes to our prompt to see what generates the best results.

How to call the AI API

Now that we have a prompt, we can pass this to Amazon Bedrock to handle our prompt. We need to start by importing the AWS SDK and creating the bedrockruntime.

You’ll also need to install the AWS SDK for the bedrock as it’s not currently included in any of the lambda versions:

npm i -S @aws-sdk/client-bedrock-runtime

And we add this code to the top of our Lambda file.

import { BedrockRuntime } from "@aws-sdk/client-bedrock-runtime";
import axios from "axios";

const bedrockruntime = new BedrockRuntime()

We’re also using imports now, which means we need to change our index.js file to an index.mjs. If you did this using TypeScript then you wouldn't have to rename your file.

We need to call the invokeModel command and pass it a set of parameters. I find that it is cleaner to create a separate object for the params than doing it all in one place.

Currently there isn’t an async version of the invokeModel command, so we’ll wrap it in a promise.

const aiPrompt = generatePrompt({parks,duration});

const aiModelId = 'ai21.j2-mid-v1'; // we're using the A2I Jurassic-2 Mid model

const invokeModelParams = {
    body: JSON.stringify({
        prompt: aiPrompt,
        maxTokens: 200,
        temperature: 0.5,
        topP: 0.5, // optional
    }),
    modelId: aiModelId,
    accept: 'application/json',
    contentType: 'application/json'
};

const aiResponse = await new Promise((resolve, reject) => {
    bedrockruntime.invokeModel(invokeModelParams, function(err, data) {
        if (err) {
            reject(err); // an error occurred
        } else {
            resolve(data); // successful response
        }
    });
});

// Extract AI-generated text from the response
const aiResponseJson = JSON.parse(
    new TextDecoder().decode(aiResponse.body)
);
const aiItinerary =  aiResponseJson.completions[0].data.text;

const responseBody = aiItinerary;
const response = {
    statusCode: 200,
    body: responseBody,
};
return response;

You may notice that we’re passing more than just our prompt in the body. That is because we can change a few other things to get a different output.

LLMs work by choosing the next word in the sentence. temperature and topP control whether the model chooses unusual words or sticks to the most likely word.

Temperature: Closer to 1 means more unusual words will be chosen, closer to 0 chooses more likely words.
topP: When choosing the next word, limit how many options the AI has to choose from by summing up the probabilities. Numbers closer to 1 mean more unlikely words are included.

In our case we want a relatively creative response but also for things to be correct, so 0.5 is a good starting setting for both. If we were asking it to describe a sci-fi scene we would want to go with temp=0.7 topP=0.8, or if we were asking it to write data processing code we would reduce it to 0.2 as we want an answer that is more likely to be correct.

These are both things that you can change and test to see what values give the best results. Which parameters you pass in also depends on the model.

How to add IAM permissions to call Bedrock

If your Lambda function needs to access AWS resources or services like Amazon Bedrock, we need to make sure to configure the appropriate IAM permissions.

In your serverless.yml file you need to add this to your provider section. This says that this Lambda has permission to use bedrock:InvokeModel.

provider:
  name: aws
  runtime: nodejs18.x
  environment:
    parksApiKey: YOUR API KEY
  iam:
    role:
      statements:
        - Effect: "Allow"
          Action:
            - "bedrock:InvokeModel"
          Resource: "*"

Deploy and test (again)

After integrating the AI API and ensuring the proper IAM permissions, redeploy your Lambda function by running sls deploy. Then we can test it once more to ensure the AI-generated holiday itinerary is working properly.

Using the same request as last time, this is the response I got, and you should get something similar.

Day 1:

Arrive in Jackson, Mississippi and check into hotel
Visit Medgar and Myrlie Evers Home National Monument
Overnight in Jackson

Day 2:

Drive to Natchez, Mississippi and check into hotel
Visit Natchez National Historical Park
Overnight in Natchez

Day 3:

Drive to Vicksburg, Mississippi and check into hotel
Visit Vicksburg National Military Park
Overnight in Vicksburg

Day 4:

Drive to Tupelo, Mississippi and check into hotel
Visit Tupelo National Battlefield
Overnight in Tupelo

Day 5:

Drive to Corinth, Mississippi and check into hotel
Visit Shiloh National Military Park
Overnight in Corinth

Day 6:

Drive to Jackson, Mississippi and check into hotel
Visit Brices Cross Roads National Battlefield Site
Overnight in Jackson

Day 7:

Drive to Gulf Islands National Seashore and check into hotel

Fixes to the Code

There are few small issues:

It cuts off half way through day 7 even though we said 8 days.
The descriptions aren’t very interesting.

How to extend the token limit

The reason that the response was cut off is that we initially passed a maxTokens: 200 in our AI command. This should be a simple fix of increasing this number.

We could set it to a very high number like 10,000 but we still have to pay for all of the tokens generated. Setting it to 10,000 won’t make every response 10,000 tokens long, but having a more sensible limit protects us from having an unexpected AWS bill.

I’m setting mine to 1000. If you want to get fancy you could change this based on the number of days they are traveling for.

How to improve the itinerary

This one is a bit harder. The problems are that it is very bland and it repeats a lot of “drive here and check into the hotel”, “Overnight in Y”.

We can try improving our prompt to get a better result. First, let's explicitly say that it doesn’t need to tell us about the driving or checking into a hotel.

// New content    
Do not write about driving. Do not write about checking into hotels. Do not write about where to overnight.

We can also ask for a more descriptive result. I added another line to the prompt:

Give a description of the things they will see and what there is to do in each park.

One trick for redeploying when you have only changed code is function deployments. You can run sls deploy function -f {function name} which in our case is sls deploy function -f aiTourGuide. This is a lot quicker than redeploying the whole app, allowing you to test sooner and therefore itterate quicker.

Failing to improve

Having made this change, I hoped that the result would improve, but it didn’t. I tried about 15 different prompts and they all kept the same structure and ignored my instructions to not talk about hotels, driving, or staying overnight.

Option 2 – change other parameters

With the AI always giving a very similar and boring response, you can start changing other things. To increase the creativity of the AI, increase the temperature. I went with 0.8.

This should get the AI to select more unusual words and create a less structured and more random response.

Unfortunately, the response was almost identical.

Option 3 – change the model

One of the awesome things about using Bedrock is that there are multiple models to use, and switching between them can be very easy.

We’ve been using the Jurassic-2 Mid model until now, but it’s not up to this task. Maybe it is good with much smaller prompts or simpler tasks. For now we can switch to using Jurassic-2 Ultra. The amazing thing is this is just one line of code:

const aiModelId = "ai21.j2-ultra-v1"; // used to be ai21.j2-mid-v1

Because they’re from the same company, they have identical parameters. If you wanted to switch to one of the other models you might have to change a few other options in the body of invokeModelParams.

One thing I do advise with this larger model is increasing the Lambda timeout. By default it is at 6s but sometimes that isn’t enough.

functions:
  aiTourGuide:
    handler: src/functions/aiTourGuide/index.handler
    timeout: 26  # < new config line
    events:
      - httpApi:
          path: /tourguide
          method: post

The Results

With our updated prompt and our new model we get a far better result:

Day 1:
Start your trip by exploring Brices Cross Roads National Battlefield Site. Learn about how the Confederate victory at Brices Cross Roads was a significant victory for Major General Nathan Bedford Forrest, but its long term effect on the war proved costly for the Confederates. Brices Cross Roads is an excellent example of winning the battle, but losing the war.

Day 2:
Today, visit Emmett Till and Mamie Till-Mobley National Monument. In 1955, 14-year-old Emmett Till traveled to Money, Mississippi, to visit relatives. He was kidnapped, tortured, and murdered after reportedly whistling at a white woman. His mother, Mamie Till-Mobley, insisted on an open-casket funeral near their hometown of Chicago. Her brave decision let the world see the racist violence inflicted upon her son and set the Civil Rights Movement into motion.

Day 3:
Gulf Islands National Seashore is the perfect place to visit today. Millions of visitors are drawn to the Gulf of Mexico for Gulf Islands National Seashore's emerald coast waters, magnificent white beaches, fertile marshes and historical landscapes. Come explore with us today!

Day 4:
Medgar and Myrlie Evers Home National Monument is next on the list. Medgar and Myrlie Evers were partners in the civil rights struggle. The assassination of Medgar Evers in the carport of their home on June 12, 1963, was the first murder of a nationally significant leader of the American Civil Rights Movement, and it became a catalyst for passage of the Civil Rights Act of 1964. Myrlie Evers continued to promote issues of racial equality and social justice.

Day 5:
Natchez National Historical Park is a great place to visit today. Discover the history of all the peoples of Natchez, Mississippi, from European settlement, African enslavement, the American cotton economy, to the Civil Rights struggle on the lower Mississippi River.

Day 6:
Today, explore the Natchez Trace National Scenic Trail. The Natchez Trace National Scenic Trail is five sections of hiking trail running roughly parallel to the 444-mile long Natchez Trace Parkway scenic motor road. The foot trails total more than 60 miles and offer opportunities to explore wetlands, swamps, hardwood forest, and the history of the area. For What's Open What's Close visit www.nps.gov/natr/planyourvisit/what-is-open-what-is-closed.htm

Day 7:
The Natchez Trace Parkway is the perfect place to visit today. The Natchez Trace Parkway is a 444-mile recreational road and scenic drive through three states. It roughly follows the "Old Natchez Trace" a historic travel corridor used by American Indians, "Kaintucks," European settlers, slave traders, soldiers, and future presidents. Today, people can enjoy not only a scenic drive but also hiking, biking, horseback riding, and camping along the Parkway.

Day 8:
Finish your trip by exploring Shiloh National Military Park. Visit the sites of the most epic struggle in the Western Theater of the Civil War. Nearly 110,000 American troops clashed in a bloody contest that resulted in 23,746 casualties; more casualties than in all of America's previous wars combined. Explore both the Shiloh and Corinth battlefields to discover the impact of this struggle on the soldiers and on the nation.

I then decided to try it with the Claude Instant models to see how well it did. To do this you do have to change the model parameters passed in, but it isn't too different.

As you can see, this model does about as well as the Jurassic-2 Ultra model, but looking back at the pricing table, it's 10x cheaper for the input tokens and 60% of the price for output tokens.

Based on the descriptions provided, here is an 8-day itinerary focusing on activities in the national parks:

Day 1: Spend the morning at Brices Cross Roads National Battlefield Site learning about the Confederate victory there through wayside exhibits and a short film. In the afternoon, visit Emmett Till and Mamie Till-Mobley National Monument to learn about their story and its impact on the civil rights movement.

Day 2: Dedicate the full day to exploring Gulf Islands National Seashore. Hike or bike along the beautiful coastal trails, beachcomb for seashells, and spot wildlife along the shoreline and in the marshes.

Day 3: Spend the morning touring the preserved Medgar and Myrlie Evers Home National Monument and learning about their civil rights activism. In the afternoon, visit Natchez National Historical Park to discover the diverse history of the Natchez area through historic buildings, walking trails, and exhibits.

Day 4: Enjoy a scenic drive along the Natchez Trace Parkway, stopping at various overlooks for photo opportunities. Consider hiking or biking portions of the adjacent Natchez Trace National Scenic Trail as well for more active exploration.

Day 5: Immerse yourself in the epic Civil War history of Shiloh National Military Park by touring the battlefield sites and learning about the bloody two-day battle that took place there.

Day 6: Spend the morning at Tupelo National Battlefield learning about the 1864 battle that took place in the city. In the afternoon, continue on to Vicksburg National Military Park for an overview film and to tour the extensive battlefields and fortifications.

Day 7: Dedicate another full day to further exploring Vicksburg NMP, such as taking a ranger-led tour or hiking additional trails to discover the nuances of the long siege that took place here.

Day 8: Before departing, consider returning to a favorite park from earlier in the week for additional exploration or to see anything you may have missed initially.

I also tested the Claude v2, which is more advanced but also costs about the same as the Jurassic-2 Ultra model. This didn't make a noticeably better response, so for this use case I would definitely stick with the Anthropic Claude Instant model.

How to Take it to the Next Level

That’s a really good start for such a simple prompt. You could iterate on it, improving it and testing different styles and wording. You could try and find other sources of information to pass into your prompt, as giving it useful information to use in the response is often the best way to improve the results of these models.

How to Use this Process in Other Apps

Through this process, you've learnt to build an app that leverages the power of AI. You can now follow this same process to add AI power to your own AWS apps.

Find a use case where AI could generate you some content
Gather some data that will help the AI create a better response.
Generate the prompt
Call the InvokeModel function in Bedrock
Deploy and Test your AI function
Change the Prompt and Parameters to see what results in the best responses

How to Learn More about Serverless

Now that you know how to build AI into your apps, you probably have loads of app ideas.

If you want to learn how to build the rest of that idea then check out my ultimate guide to Serverless or my course which helps you Master Serverless by building 7 real world projects.

How to Deploy a Next.js App with Custom Domain on AWS Using SST

Arunachalam B — Mon, 24 Jul 2023 07:18:19 +0000

Serverless architectures have transformed the way we build and deploy applications in the cloud, bringing in more efficiency and scalability.

In this article, we'll dive into the Serverless Stack Toolkit (SST), a framework for building serverless applications. We'll deploy a Next.js application and set up a custom domain, all without visiting the AWS console.

Let's begin this journey!

What Does Serverless Mean?

The term "serverless" refers to a cloud computing model where developers can build and deploy applications without the need to manage servers. In a serverless architecture, the cloud provider handles server provisioning, scaling, and maintenance. This allows developers to focus solely on writing code for their applications.

With serverless, developers are billed based on actual usage rather than fixed server costs, making it a cost-effective and scalable solution. It offers increased flexibility and agility, as resources are automatically allocated and released based on demand. This eliminates the need for developers to worry about infrastructure management.

Now that we have a good idea of what serverless means, let's see what the Serverless Stack Toolkit (SST) is.

Understanding Serverless Stack Toolkit (SST)

The Serverless Stack Toolkit, or SST in short, is a flexible, open-source framework designed to enable faster development and reliable deployment of serverless applications on AWS.

It aims to make it easier for developers to define their application's infrastructure using AWS CDK (Cloud Development Kit).

You can use it to test applications in real-time with Live Lambda Development, debug code in Visual Studio Code, manage applications through a web-based dashboard, and deploy to multiple environments and regions seamlessly.

Benefits of Using SST

Here are some benefits of using the SST stack:

Infrastructure as Code

With SST, developers can define their application's infrastructure programmatically using AWS CDK. This improves version control and collaboration among team members.

Efficient Testing and Debugging

SST enables live Lambda development, making it easier to test and debug serverless applications locally before deployment to AWS. This reduces potential issues and ensures smoother deployment.

Simplified Deployment

SST simplifies the deployment process, allowing developers to deploy applications to multiple environments and regions effortlessly.

Language Flexibility

SST supports multiple programming languages, including JavaScript, TypeScript, Go, Python, C#, and F#, providing developers with the flexibility to use their preferred language for building serverless applications.

Now that we have understood what SST is and some of its benefits, let's see the power of SST in action.

How to Configure AWS

Before we add SST we have to configure some AWS credentials. To do that, type the below command in your terminal:

aws configure

AWS Configure

You'll be required to enter your AWS Access Key ID, Secret Access Key, Region name and output format. If you don't have these keys, please create an IAM user and enter the credentials.

How to Add SST to Your Next.js App

We can use SST in an existing Next.js app in drop-in mode or inside a monorepo app in standalone mode.

In this article, we'll create new Next.js project and add SST which follows drop-in mode installation using the commands below:

yarn create next-app
cd my-app
yarn create sst
yarn install

Note: You should ensure that you have the index.tsx file inside the /pages folder. Without the file, you'll get errors while deploying your app using SST. You don't need to make any changes to this file.

Folder structure

Once you run the above commands, SST will create two new files —sst.config.ts and sst-env.d.ts

We have to define all our infrastructure and stacks in the sst.config.ts file.

You can use these commands to run the app locally:

# Start SST locally
yarn sst dev

# Start Next.js locally
yarn dev

On executing the yarn sst dev command, you'll be asked to enter the stage name. Please enter your environment name. I'll use dev for this project's stage name.

Start SST locally

Just sit back and watch. It will automatically create the necessary IAM roles, permissions and CloudFormation stacks.

SST - Creating the neccessary IAM roles, permissions and stack

Notice in the image above that you can see the Console URL, https://console.sst.dev/sst-demo/dev. With the Console URL, you can view real-time logs, invoke functions, replay invocations, make queries, run migrations, view uploaded files, query your GraphQL APIs, and more!

Just awesome right? I would recommend you to visit the official documentation to learn more about the services they offer.

Next, start the Next.js site by running yarn dev. You should see the default page after that.

Next.js default page

Our Next.js app is now ready to be deployed to AWS! Just run the following command and see the magic.

yarn sst deploy --stage prod

OpenNext building the Next.js App

It will automatically start building the app using OpenNext , deploy it to AWS using CDK, and output the CloudFront URL. Click on the link and you should be able to see your app up and running.

SST - Deployed changes and outputs CloudFront url

The Next.js app up and running

How to Create Infrastructure using SST

To create an infrastructure, we simply need to edit sst.config.ts and import any AWS services like S3 bucket, RDS, API Gateway, and so on from sst/constructs

Let's add a simple S3 file upload feature. Open sst.config.ts file and add the code below:

import { SSTConfig } from "sst";
import {Bucket, NextjsSite } from "sst/constructs";

export default {
  config(_input) {
    return {
      name: "sst-tutorial",
      region: "us-east-1",
    };
  },
  stacks(app) {
    app.stack(function Site({ stack }) {
      const bucket = new Bucket(stack, "public");
      const site = new NextjsSite(stack, "site",{
        bind:[bucket],
      });
      stack.addOutputs({
        SiteUrl: site.url,
      });
    });
  },
} satisfies SSTConfig;

Here, we're creating a new public S3 bucket and binding it with our NextjsSite.

Let's edit our index page to add file upload feature.

How to Upload Files to S3 using SST

To upload a file to S3, we need to generate a pre-signed URL. To do that, we need to add this package @aws-sdk/s3-request-presigner in our repo.

yarn add @aws-sdk/s3-request-presigner

Open index.tsx file and create a function called getServerSideProps above the Home function, as shown in the below code snippet.

...
import { Bucket } from "sst/node/bucket";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
...
export async function getServerSideProps() {
  const command = new PutObjectCommand({
    ACL: "public-read",
    Key: crypto.randomUUID(),
    Bucket: Bucket.public.bucketName,
  });
  const url = await getSignedUrl(new S3Client({}), command);
  const bucketName = Bucket.public.bucketName
  console.log(bucketName)
  return { props: { url } };
}

Update the Home() function with the following code.

import styles from "@/styles/Home.module.css";
export default function Home({ url }: { url: string }) {
  return (
    <main className={styles.main}>
     <div className={styles.center}>
        <a
          href="https://5minslearn.gogosoon.com/?ref=github_sst_app"
          className={styles.card}
          target="_blank"
          rel="noopener noreferrer"
        >
          <h2 className={inter.className}>
            5minslearn <span>->span>
          h2>
          <p className={inter.className}>Learn tech in 5minsp>
        a>
      div>
      <form
        className={styles.form}
        onSubmit={async (e) => {
          e.preventDefault();

          const file = (e.target as HTMLFormElement).file.files?.[0]!;

          const image = await fetch(url, {
            body: file,
            method: "PUT",
            headers: {
              "Content-Type": file.type,
              "Content-Disposition": `attachment; filename="${file.name}"`,
            },
          });

          window.location.href = image.url.split("?")[0];
        }}
      >
        <input name="file" type="file" accept="image/png, image/jpeg" />
        <button type="submit" className={inter.className}>
          Upload
        button>
      form>
    main>
  );
}

I added an input with a file type and a button for submitting the form. The selected image will be uploaded to S3 when the form is submitted. It's time to deploy the changes.

To deploy the changes, run the yarn sst deploy command.

Once deployed you'll see a page like this:

Next.js up and running with updated changes

Now you can upload any image and check your S3. The selected file will be uploaded to your S3 bucket.

Great, we have successfully deployed the changes. But we still have the random URL generated by CloudFront which may be difficult to memorize for humans. Let's configure a custom domain.

How to Configure Custom Domains

To configure custom domains, we need a valid domain or sub-domain. You can create one using Route 53 or your preferred domain provider like GoDaddy, Namecheap, and so on.

If you have a domain on an external DNS provider, you'll need to create an SSL certificate on AWS Certificate Manager (ACM).

I have my domain on Cloudflare. If you have yours with other providers like Namecheap or GoDaddy, then the steps below should still work for you.

How to Point CNAME to CloudFront

Login into your DNS provider.
Add a CNAME. In my case, I used aws as the name because my domain is aws.gogosoon.com, and target as the CloudFront URL without https.

We've successfully pointed our CNAME to CloudFront. Now let's create an SSL certificate for our domain.

How to Create ACM Certificate

ACM certificates are managed SSL/TLS certificates that can be used with a variety of AWS services, including CloudFront.

However, there is a specific requirement for using ACM certificates with CloudFront: the certificate must be created in the US East (N. Virginia) region (us-east-1). The reason for this is that CloudFront has all of its provisioning/administrative infrastructure based in us-east-1.

Quoting from their documentation:

To use a certificate in AWS Certificate Manager (ACM) to require HTTPS between viewers and CloudFront, make sure you request (or import) the certificate in the US East (N. Virginia) Region (us-east-1).

Here are the steps to follow to create an ACM:

Login into AWS console.
Search for certificate manager, switch to us-east-1 and click on "Request Certificate" in the sidebar.

AWS ACM - Request Certificate

Enter the domain name you pointed to in your DNS provider configuration. Under "Validation method", select "Email validation" and click next.

AWS ACM - Request Certificate

A certificate with the status of "Pending Validation" will be created. You'll receive an email from AWS with a link to validate the request.

ACM certificate with pending status

Once you click on the link in the email, the status of the certificate will be changed to "Issued". Copy the ARN – we'll need it in the next steps.

AWS ACM certificate issued

Now that we've created the certificate successfully, let's create the alternate domain for CloudFront.

How to Create an Alternate Domain for CloudFront Distribution

Log into the AWS Console and search for CloudFront.
Click on the distribution created by SST.
In the "General" tab, click the "Edit" button.

Edit CloudFront distribution

Enter the alternate domain name and select the certificate that we created. Leave all other options as default and click on the "Save changes" button.

Add alternate domain for CloudFront distribution

All set! Let's edit our app to deploy the changes to our custom domain.

How to Configure External Custom Domain using SST

Update the sst.config.ts file with the following code. Paste the ARN you copied while creating the certificate as a value for the variable certArn. Replace the domainName with your domain:

import { SSTConfig } from "sst";
import {Bucket, NextjsSite } from "sst/constructs";
import { Certificate } from "aws-cdk-lib/aws-certificatemanager";


export default {
  config(_input) {
    return {
      name: "sst-tutorial",
      region: "us-east-1",
    };
  },
  stacks(app) {
    app.stack(function Site({ stack }) {
      const bucket = new Bucket(stack, "public");
      const certArn = 'Paste the certificate arn'
      const site = new NextjsSite(stack, "site",{
        bind:[bucket],
        customDomain: {
          isExternalDomain: true,
          domainName: "aws.gogosoon.com",
          cdk: {
            certificate: Certificate.fromCertificateArn(stack, "MyCert", certArn),
          },
        },
      });
      stack.addOutputs({
        SiteUrl: site.customDomainUrl || site.url,
      });
    });
  },
} satisfies SSTConfig;

sst.config.ts - File changes

Run yarn sst deploy to deploy the changes to a custom domain. Once deployed, you should have the app running on the custom URL.

Next.js deployed with custom domain using SST

Next.js app up and running with custom domain

Conclusion

Voila! Our Next.js app is now deployed to AWS, and we've connected it with our custom domain. Please check out the source code here.

The SST framework provides an excellent toolset for deploying serverless applications, contributing significantly to development speed, scalability, and error handling.

Feel free to explore more about SST and its potential in transforming your cloud development experience. Happy coding!

If you wish to learn more about AWS Services, subscribe to my email newsletter (https://5minslearn.gogosoon.com/) and follow me on social media.

How to Offer Custom APIs to Your Users with AWS API Gateway

Arunachalam B — Wed, 21 Jun 2023 13:56:00 +0000

In the world of cloud computing and serverless architecture, AWS API Gateway is a powerful tool that helps you build robust, secure, and scalable APIs.

In this tutorial, I'll introduce you to API Gateway and explain the benefits of using this helpful tool. Then I'll show you how to create and deploy a Rest API, and create usage plans to offer API keys. Alright, let's get started.

What is API Gateway?

AWS API Gateway is a fully managed service provided by Amazon Web Services (AWS) that simplifies the creation, deployment, and management of APIs at any scale.

It acts as a front door for applications, and allows you to create APIs that act as bridges between clients and back-end services. This enables secure and efficient communication.

Why Do You Need API Gateway?

AWS API Gateway offers many benefits for businesses and developers. Here are a few benefits of using API Gateway.

Scalability and High Availability

With AWS API Gateway, scaling your APIs becomes much easier. It seamlessly handles traffic spikes by automatically scaling the underlying infrastructure. This results in high availability and helps prevent service disruptions.

Security and Authentication

API Gateway offers robust security features, including built-in authentication and authorization mechanisms.

It supports User Authentication through IAM Roles for internal applications, Cognito for external applications (for example Mobile users), and it also supports custom authorizers.

Integration with other AWS Services

As part of the AWS ecosystem, API Gateway seamlessly integrates with a range of other AWS services. This enables you to leverage additional functionalities like AWS Lambda functions, AWS Cognito for user management, and AWS CloudWatch for monitoring and logging.

API Lifecycle Management

With API Gateway, you can easily version, deploy, and manage different stages of your APIs. This simplifies the process of rolling out updates, testing new features, and managing different environments such as development, staging, and production.

I hope by now you understood what API Gateway is and why it's valuable. Let's dive into creating our very own API Gateway.

How to Create an AWS API Gateway

In this section, we will:

Create a Rest API with the GET method
Integrate it with a simple hello world lambda function and deploy it

Let's begin with creating a lambda function

How to Create an AWS Lambda Function

Log in to the AWS Management Console and search for "Lambda" in the AWS Management Console search bar. Click on Create Function.

Navigate to AWS Lambda Console

Select the "Author from scratch" option, enter a name for your lambda function, select the "Python" runtime, and click the Create Function button at the bottom right.

Create a AWS Lambda Function

Once the function is created, update the following code and deploy the changes:

import json

def lambda_handler(event, context):
    body = "Hello from 5minslearn!"
    statusCode = 200
    return {
        "statusCode": statusCode,
        "body": json.dumps(body),
        "headers": {
            "Content-Type": "application/json"
        }
    }

Deploy a Lambda Function

Congratulations! You have successfully created an AWS Lambda function. Now let's create the Rest API.

How to Create a Rest API and Integrate it with AWS Lambda

Search for API Gateway in the search bar. In the REST API section, click on the Build button.

Create a Rest API

Choose the Protocol as Rest and select New API in the Create new API section. In the settings section enter the API name of your choice and leave Endpoint Type as the default. Then click the Create API button.

Configure creating a Rest API

Click the Actions Button on the top left. Next, Click Method and select the method as GET and click the Tick icon.

Create a Method

Choose "GET" method

Select Lambda Function as the Integration type and enter the name of the Lambda function you created previously. Then save the function.

Select Method configuration

Once you click save, "Add Permission to Lambda Function" will prompt for confirmation. This basically means that you're allowing the API Gateway to invoke a Lambda function. In this case, it is "DemoFunction" Lambda function. Accept the confirmation and proceed to the next step.

Allow Permission to invoke Lambda Function from API Gateway

Click on Test. It will take you to a new page. Click on the "Test" button. You'll be able to see the response from the Lambda function on the right side panel.

Our API Architecture

Test our API Gateway

As you have successfully tested your API, you're ready to deploy the API. To deploy the API, click on the Actions button once again and click on Deploy API.

Deploy the API Gateway

The Deploy API dialogue will popup. Select New Stage for Deployment stage and name it whatever you want. Click the Deploy button.

Configure API Gateway deployment

Click on Invoke URL shown at the top. You can see the response from the Lambda function.

API Gateway Created

Test our API

Great! We successfully created the Rest API, integrated it with the Lambda function, and deployed it.

But you can do this with multiple services available on the market. Why would you choose AWS API Gateway?

Well. That's a interesting question. First of all, you can configure the usage plan for your API. The best part is you don't have to write any code for it.

Now let's create a Usage Plan, generate an API key, and make our Rest API accessible only by passing the API key in the Header.

How to Create an API Gateway Usage Plan

In the left side bar click on Usage Plans and click the Create button. Enter the Name of your plan – I chose "Basic". Enter the Throttling and Quota sections as per your requirements and click Next.

Create AWS API Gateway usage plan

Click on the Add API Stage button. Select the API and its stage. Click on the tick icon at right corner and select Next.

Create a Stage for our API

Click on Create API Key and add to Usage Plan. A modal will pop up. Enter the Name for API Key. For the API key, I selected Auto Generate but if you want to give a custom key you can enter a custom key. Hit the Save button.

Create a API Key to access the service

Configure the API Key

Select Resources from the Sidebar, click on the GET API you just created, and click the Method Request.

Select the method

In the Settings section, update the API Key Required field to true and click the Tick icon. Once updated, don't forget to deploy the changes by hitting the Action dropdown. Your changes will not be updated otherwise.

Enable API Key Required field

Deploy the API

Hit the same URL now and see the magic.

Forbidden!

Because our API layer is protected now. You have to pass the API key in the header to access the data.

Forbidden access if no API Key is provided

Now Click on the Usage Plans from the Sidebar. Select your plan and navigate to the API Keys tab.

Access your API Key

Click on the API key you created in Step 3. Click Show. Copy the API key.

List of API Keys

Reveal your API Key

You have to pass the API Key in the 'x-api-key' header. Let's switch to the terminal to test this out.

Verify your Rest API without passing the API key at first. Open the terminal, and enter the following curl command. You will once again see the forbidden message.

curl --location --request GET '[enter your invoke url]'
--header 'Content-Type: application/json

Forbidden access without API Key in Terminal

Now pass the API key this time. Run the following curl command:

curl --location --request GET '[your invoke url]' \
--header 'x-api-key: [your api key]' \
--header 'Content-Type: application/json' \
--data-raw ''

Data received on passing API Key in x-api-key Header

You can see the output of the Lambda function because you passed 'x-api-key' in the header.

Awesome! You have successfully created the Usage plan, generated the API key, and attached it to the Rest API method and verified the integration.

Conclusion

In this tutorial, you learned what AWS API gateway is and how to create Usage Plans for the Rest API.

If you wish to learn more about AWS Services, subscribe to my email newsletter (https://5minslearn.gogosoon.com/) and follow me on social media.

How to Create a Serverless ChatGPT App in 10 Minutes

Michael Yuan — Mon, 20 Mar 2023 16:52:42 +0000

Since OpenAI released an official API for ChatGPT in March 2023, many developers and entrepreneurs are interested in integrating it into their own business operations.

But some significant barriers remain that make it difficult for them to do this:

OpenAI provides a simple stateless API for ChatGPT. The developer needs to keep track of the history and context of each conversation in a cache or database managed by the application. The developer also needs to manage and safeguard the API keys. There is a lot of boilerplate code unrelated to the application’s business logic.
The “natural” UI for the ChatGPT API application is a threaded chat. But it is difficult to create a “chat UI” in a traditional web or app framework. In fact, the most commonly used chat UI already exists in messaging apps like Slack, Discord, and even forums (for example, GitHub Discussions). We need a simple way to connect ChatGPT API responses to an existing messaging service.

In this article, I will show you how to create a serverless GitHub bot. The bot allows GitHub users to chat with ChatGPT and each other in GitHub Issues. You can try it by asking a question, or joining another conversation thread by leaving a comment. In other words, this project uses GitHub Issues’ threaded messages UI as its own chat UI.

Figure 1. Learning Rust with ChatGPT. see https://github.com/second-state/chat-with-chatgpt/issues/31

The bot is a serverless function written in Rust. Just fork the example, deploy your fork on flows.network, and configure it to interact with your own GitHub repos and OpenAI keys. You will have a fully functional GitHub bot in 10 minutes. There is no need to set up a web server, or a webhook for GitHub API, or a cache / database server.

How to Fork the Template Repo

First, fork this template repo from GitHub.

The src/lib.rs file contains the bot application (also known as the flow function). The run() function is called upon starting up. It listens for issue_comment and issues events from the GitHub repo owner/repo. Those events are emitted when a new issue comment or a new issue is created in the repo.

#[no_mangle]
#[tokio::main(flavor = "current_thread")]
pub async fn run() {
    // Setup variables for
    //   ower: GitHub org to install the bot
    //   repo: GitHub repo to install the bot
    //   openai_key_name: Name for your OpenAI API key
    // All the values can be set in the source code or as env vars

    listen_to_event(&owner, &repo, vec!["issue_comment", "issues"], |payload| {
        handler(&owner, &repo, &openai_key_name, payload)
    })
    .await;
}

The handler() function processes the events received by listen_to_event(). If the event is a new comment in an issue, the bot calls OpenAI's ChatGPT API to add the comment text into an existing conversation identified by the issue.number. It receives a response from ChatGPT, and adds a comment in the issue.

The flow function here automatically and transparently manages the conversation history with the ChatGPT API in a local storage. The OpenAI API key is also stored in the local storage so that instead of putting the secret text in the source code, the key can be identified by a string name in openai_key_name.

EventPayload::IssueCommentEvent(e) => {
    if e.comment.user.r#type != "Bot" {
        if let Some(b) = e.comment.body {
            if let Some(r) = chat_completion (
                    openai_key_name,
                    &format!("issue#{}", e.issue.number),
                    &b,
                    &ChatOptions::default(),
            ) {
                if let Err(e) = issues.create_comment(e.issue.number, r.choice).await {
                    write_error_log!(e.to_string());
                }
            }
        }
    }
}

If the event is a new issue, the flow function creates a new conversation identified by issue.number, and requests a response from ChatGPT.

EventPayload::IssuesEvent(e) => {
    if e.action == IssuesEventAction::Closed {
        return;
    }

    let title = e.issue.title;
    let body = e.issue.body.unwrap_or("".to_string());
    let q = title + "\n" + &body;
    if let Some(r) = chat_completion (
            openai_key_name,
            &format!("issue#{}", e.issue.number),
            &q,
            &ChatOptions::default(),
    ) {
        if let Err(e) = issues.create_comment(e.issue.number, r.choice).await {
            write_error_log!(e.to_string());
        }
    }
}

How to Deploy the Serverless Flow Function

As we can see, the flow function code calls SDK APIs to perform complex operations. For example,

The listen_to_event() function registers a webhook URL through GitHub API so that the handler() function will be called when certain events occur in GitHub.
The chat_completion() function calls the ChatGPT API with the named API key and past history / context of the specified conversation. The API key and conversation history are stored in a Redis cache.

The webhook server and the Redis cache are both external services the SDK depends on. That means the flow function must run inside a managed host environment that provides such external services. Flows.network is a PaaS (Platform as a Service) host for the flow function SDKs.

In order to deploy the flow function on flows.network, you simply need to import its source code to the PaaS.

First, sign into flows.network from your GitHub account. Import your forked GitHub repo that contains the flow function source code and choose "With Environment Variables".

Note that this is NOT the GitHub repo where you want to deploy the bot. This is the repo for your forked flow function source code.

Figure 2. Import the GitHub repo you forked from the flow function template into flows.network.

Set the environment variables to point the flow function to the OpenAI API key name (open_ai_key) and GitHub repo (owner and repo).

The GitHub owner and repo variables here point to the GitHub repo where you want to deploy the bot, NOT the repo for the flow function source code.

Figure 3. Set the environment variables for the GitHub repo where you want to deploy the bot, as well as the OpenAI API key name.

Flows.network will fetch the source code and build the Rust source code into Wasm bytecode using the standard cargo toolchain. It will then run the Wasm flow function in the WasmEdge Runtime.

How to Connect the Flow Function to GitHub and OpenAI

While the flow function requires connections to the OpenAI and GitHub APIs, the source code has no hardcoded API keys, access tokens, or OAUTH logic. The flows function SDKs have made it easy and safe for developers to interact with external SaaS API services.

Flows.network discovers that the flow function requires connections to the OpenAI and GitHub APIs. It presents UI workflows for the developers to:

Log into GitHub, authorize access to events, and register the flow function as the webhook for receiving those events.
Associate an OpenAI API key with the name openai_key_name.

Figure 4. The external services required by the flow function are connected and turned green.

Once the external SaaS APIs are connected and authorized, they turn green on the flow function dashboard. The flow function will now receive the events it listen_to_event() for. It will also get transparent access to Redis for the named OpenAI API key and the cached conversation context to support the chat_completion() SDK function.

What's next

The GitHub bot is just one of many bot types the flows.network can support. By connecting the flow function to a Slack channel, you can get ChatGPT to participate in your group discussion. Here is an example of a Slack-based ChatGPT bot.

https://github.com/flows-network/collaborative-chat

Figure 5. The Slack ChatGPT bot.

Another example is to have ChatGPT answering legal questions in a Slack channel. The flow function prepends the legal question with a prompt.

https://github.com/flows-network/robo-lawyer

Figure 6. The Slack robo lawyer bot.

Besides GitHub and Slack, there are many SaaS products you can integrate into flows.network through their APIs.

While the example flow functions are written in Rust, we aim to support JavaScript-based flow function SDKs. In another word, platform SDK functions such as listen_to_event() and chat_completion() will have a JavaScript version. The JavaScript flow function runs inside the WasmEdge Runtime on the flows.network platform through the WasmEdge-QuickJS module.

serverless - freeCodeCamp.org

How to Deploy a Serverless Spam Classifier Using Scikit-Learn, AWS Lambda, & API Gateway

Table of Contents

1. Prerequisites

2. Building the Brain: The Model

1. Vectorization: Turning Text into Math

2. Training: The Logistic Regression Engine

3. Evaluation: Testing the Intelligence

4. Exporting the Logic (Serialization)

3. Deploying the Model to AWS

1. Model Storage: Amazon S3

2. The Production Backend: AWS Lambda

3. The API Gateway - The Bridge to the Web

Creating the REST API

Deployment Stages

Connecting the Frontend (The JavaScript Layer)

4. How to Run The Project Locally

5. Our Project Architecture

6. Conclusion: The Power of Serverless AI

7. Acknowledgment / References

Connect With Me

How to Build a Full-Stack CRUD App with React, AWS Lambda, DynamoDB, and Cognito Auth

What You'll Build

Table of Contents

Who This Is For

Prerequisites

Architecture Overview

Part 1: Set Up Your AWS Account and Tools

1.1 Create Your AWS Account

1.2 Install the AWS CLI and CDK

1.3 Configure Your AWS Credentials (IAM)

Phase 1: Create an IAM User

Phase 2: Generate Access Keys

Phase 3: Connect Your Terminal to AWS

Part 2: Set Up the Project Structure

2.1 Create the Workspace

2.2 Initialize the Frontend (Next.js)

2.3 Initialize the Backend (CDK)

2.4 Understanding CDK Before You Write Any Code

Part 3: Define the Database (DynamoDB)

Part 4: Write the Lambda Functions

A Note on the AWS SDK

4.1 Create Vendor Lambda

4.2 Get Vendors Lambda

4.3 Delete Vendor Lambda

Part 5: Build the API with API Gateway

5.1 Add Lambda Functions and API Gateway to the Stack

Part 6: Deploy the Backend to AWS

6.1 Bootstrap Your AWS Environment

6.2 Deploy

6.3 Troubleshooting: How to Read AWS Error Logs

Error: 502 Bad Gateway

Part 7: Build the React Frontend

7.1 Define the Vendor Type

7.2 Create the API Service Layer

7.3 Build the Main Page

Vendor Tracker

Add New Vendor

Current Vendors ({vendors.length})

7.4 Test the App Locally

Verifying the connection to AWS:

Part 8: Add Authentication with Amazon Cognito

8.1 Add Cognito to the CDK Stack

8.2 Install and Configure AWS Amplify

8.3 Wire Providers into the App Layout

8.4 Protect the UI with withAuthenticator

Add New Vendor

Current Vendors ({vendors.length})

8.5 Pass the Auth Token to API Calls

8.6 Troubleshooting Cognito

"Unconfirmed" user error after sign-up

401 Unauthorized errors after deployment

Part 9: Deploy the Frontend with S3 and CloudFront

9.1 Configure Next.js for Static Export

9.2 Add S3 and CloudFront to the CDK Stack

9.3 Run the Final Deployment

What You Built

Conclusion

How to Build a Serverless RAG Pipeline on AWS That Scales to Zero

Here's what we'll cover: