Redis - freeCodeCamp.org

How to Persist State in Time-Series Models with Docker and Redis

Chirag Agrawal — Thu, 09 Oct 2025 01:18:59 +0000

Have you ever built a brilliant time-series model, one that could forecast sales or predict stock prices, only to watch it fail in the real world? Well, this is a common frustration. Your model works perfectly on your machine, but the moment you deploy it in a Docker container, it seems to develop amnesia. It forgets everything it knew yesterday, making its predictions for tomorrow useless.

Don’t worry. This likely isn't a flaw in your model. It's a clash between how time-series models and Docker containers are designed to work.

Time-series models are all about memory. They need to remember the past to predict the future. But Docker containers are built to be stateless and forgetful, wiping their memory clean with every restart. This fundamental conflict can turn a powerful model into a worthless one in production.

In this article, we’ll solve that problem. We're going to give your time-series model a permanent memory. You'll learn how to build a production-ready prediction service that uses Redis as an external brain and Docker volumes to ensure that memory survives any restart. We'll walk through a hands-on example, step-by-step, so you can learn how to build a system that is both intelligent and incredibly reliable.

Who is This Guide For?

To get the most out of this tutorial, it’ll be helpful to have a few things under your belt. We’ll be diving into some code and command-line work, so a little preparation will go a long way.

The main tools for this project are Docker and Docker Compose. Make sure you have them installed and running on your computer.
You’ll also find it easier to follow along if you’re comfortable with the basics of Docker, Python, and the Flask web framework. A bit of command-line experience will also be handy for running the commands in the tutorial.
But don't worry if you've never used Redis before. All you need to know is that it’s a fast, in-memory database. We’ll handle the rest along the way.

Think of this as a guided tour. As long as you're curious and have the basic tools ready, you'll be in great shape.

Understanding the Problem

Before jumping into solutions, let's first clarify what a time-series model is and then explore why containerizing it is so tricky.

So, what is a time-series model?

Simply put, a time-series model is a type of model that analyzes data points collected over time to predict future values. Think of it like predicting the weather. A meteorologist doesn't just look at the sky right now. They look at the temperature, pressure, and wind patterns from the last few hours and days to forecast what will happen tomorrow.

Time-series models do the same thing with data, whether it's website traffic, stock prices, or energy consumption. The key takeaway is that history matters. The sequence of past events provides the context needed to make an intelligent prediction about the future.

Now, here’s what breaks when you put these models in Docker.

1. Containers are ephemeral by design

Docker containers are meant to be stateless. This works great for most APIs. A user profile endpoint? Stateless. A sentiment analysis model? Stateless. They take an input, return an output, and forget everything in between.

Time-series models don't work this way. They need context from previous predictions. Without it, your model is essentially blind.

2. Lost context between predictions

Each prediction happens in isolation. Your model receives a single data point and makes a guess without knowing what came before. This defeats the entire purpose of time-series modeling.

You may think: "I'll just load all historical data on every request." But that approach fails for two reasons:

It's slow. Really slow if you have thousands of data points.
It doesn't scale. When you have multiple series or high request volume, you'll hit performance walls fast

3. Model amnesia on restart

Every time you deploy a new version or the container crashes, all accumulated state disappears. Your model starts from scratch. In production, this is unacceptable.

The Solution: External State Store

Instead of keeping state inside the container, we’ll move it outside. Redis becomes the model's memory.

The pattern looks like this:

Client Request → Flask API → Redis → Prediction with Context

Your container stays stateless and replaceable. But the system as a whole maintains state through Redis.
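The pattern can be sketched in a few lines of Python. This is a minimal illustration, not the code from the demo repository: `InMemoryStore` is a hypothetical stand-in for Redis so the sketch runs without a server, and the linear extrapolation in `predict` is an assumed toy model, chosen only because it reproduces the 10, 20, 30 to 40 example used in the walkthrough.

```python
from typing import Dict, List, Tuple


class InMemoryStore:
    """Hypothetical stand-in for Redis: state lives outside the request handler."""

    def __init__(self) -> None:
        self._series: Dict[str, List[Tuple[float, float]]] = {}

    def append(self, series_id: str, ts: float, value: float) -> None:
        self._series.setdefault(series_id, []).append((ts, value))
        self._series[series_id].sort()  # keep points ordered by timestamp

    def recent(self, series_id: str, limit: int = 100) -> List[Tuple[float, float]]:
        return self._series.get(series_id, [])[-limit:]


def predict(store: InMemoryStore, series_id: str, ts: float, value: float) -> dict:
    """Stateless handler: all context comes from the external store."""
    store.append(series_id, ts, value)
    history = [v for _, v in store.recent(series_id)]
    if len(history) < 2:
        forecast = history[-1]  # not enough context yet: repeat the last value
    else:
        # Toy model (an assumption): extend the average step between points.
        step = (history[-1] - history[0]) / (len(history) - 1)
        forecast = history[-1] + step
    return {"data_points_used": len(history), "prediction": forecast}


store = InMemoryStore()
for minute, value in enumerate([10, 20, 30]):
    result = predict(store, "demo", float(minute), value)
print(result)  # {'data_points_used': 3, 'prediction': 40.0}
```

Because `predict` keeps nothing between calls, any number of API containers could run it against the same store and see the same history.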

Hands-On Implementation

Let's build this. Clone the demo repository:

git clone https://github.com/ag-chirag/docker-redis-time-series
cd docker-redis-time-series

Start with the broken approach

The docker-compose.initial.yml file shows what NOT to do:

services:
  api:
    build: ./flask-api
    ports:
      - "5000:5000"

  redis:
    image: redis:alpine

Notice what's missing? No named volume. Whatever Redis writes is tied to that one container, so the data is lost once the container is removed.

Run it:

docker compose -f docker-compose.initial.yml up

Make a few predictions:

curl -X POST http://localhost:5000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "series_id": "demo",
    "historical_data": [
      {"timestamp": "2024-01-01T12:00:00", "value": 10},
      {"timestamp": "2024-01-01T12:01:00", "value": 20},
      {"timestamp": "2024-01-01T12:02:00", "value": 30}
    ]
  }'

You'll get a response showing Redis is working:

{
  "data_points_used": 3,
  "prediction": 40,
  "redis_connected": true
}

Now restart the services:

docker compose -f docker-compose.initial.yml down
docker compose -f docker-compose.initial.yml up

Make another prediction. Check the data_points_used field. It reset. All your historical data is gone. This is exactly what we're trying to avoid.

How to fix it with volumes

The correct docker-compose.yml adds persistence:

services:
  api:
    build: ./flask-api
    ports:
      - "5000:5000"
    environment:
      - REDIS_HOST=redis

  redis:
    image: redis:alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data

volumes:
  redis_data:

So, what is a volume and how does it work?

Think of a Docker volume as a dedicated external hard drive for your container. By default, when a container writes data, it does so to a temporary layer that gets destroyed when the container is removed. A volume provides a way to save that data permanently.

Here’s how it works:

Docker creates and manages a special storage area on the host machine, completely separate from any container's filesystem. In our docker-compose.yml, the volumes: redis_data: section at the bottom tells Docker to create a named volume called redis_data.
When the Redis container starts, the volumes: - redis_data:/data line tells Docker to "plug in" this external hard drive. It connects the redis_data volume to the /data directory inside the container.
Now, whenever the Redis process inside the container writes data to its /data directory (which we've configured it to do), it's actually writing to the redis_data volume on the host machine.
When you run docker compose down, the Redis container is destroyed, but the redis_data volume is untouched. It's like unplugging the external hard drive, and the data is still safe. The next time you run docker compose up, a brand new Redis container is created, the volume is re-attached, and Redis finds all its old data right where it left it.

This mechanism is the key to giving our stateful service a memory that survives restarts.

Run the corrected version:

docker compose up --build

Send several predictions to build up state:

for i in {1..5}; do
  curl -X POST http://localhost:5000/predict \
    -H "Content-Type: application/json" \
    -d "{
      \"series_id\": \"demo\",
      \"historical_data\": [{\"timestamp\": \"2024-01-01T12:0$i:00\", \"value\": $((i*10))}]
    }"
done

Now comes the test. Restart everything:

docker compose down
docker compose up

Make another prediction. Look at data_points_used. It includes all previous points. The model picks up exactly where it left off.

This works because the volume exists independently of the container lifecycle.

How the code handles state

The Flask API in flask-api/app.py stores each data point in Redis using sorted sets:

def store_data_point(series_id, timestamp, value):
    key = f"ts:{series_id}"
    # The score (here, timestamp) must be numeric, e.g. epoch seconds; Redis rejects string scores
    redis_client.zadd(key, {json.dumps({"ts": timestamp, "val": value}): timestamp})

When making predictions, it retrieves recent history:

def get_recent_data(series_id, limit=100):
    key = f"ts:{series_id}"
    data = redis_client.zrange(key, -limit, -1)
    return [json.loads(d) for d in data]

Redis sorted sets give you automatic time ordering. The volume ensures this data survives restarts.
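One detail worth spelling out: Redis requires sorted-set scores to be numbers, while the request payload shown earlier carries ISO 8601 strings. Here is a minimal sketch of the conversion, assuming timestamps without an explicit offset are meant as UTC. `to_score` is a hypothetical helper, not a function from the repository.

```python
from datetime import datetime, timezone


def to_score(timestamp: str) -> float:
    """Convert an ISO 8601 timestamp string to epoch seconds for use as a score."""
    dt = datetime.fromisoformat(timestamp)
    if dt.tzinfo is None:
        # Assumption: naive timestamps are UTC, so results don't depend on the host's timezone.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.timestamp()


points = ["2024-01-01T12:02:00", "2024-01-01T12:00:00", "2024-01-01T12:01:00"]
# Sorting by score gives the same order a sorted set would maintain.
print(sorted(points, key=to_score))
```

Using epoch seconds as the score also makes range queries by time window straightforward, since both bounds are plain numbers.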

Test the health endpoint

Check that everything is connected properly:

curl http://localhost:5000/health

You should see:

{
  "model_loaded": true,
  "redis_connected": true,
  "status": "healthy"
}

If redis_connected is false, check your Docker logs. Common issues are network configuration or Redis not starting properly.
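A health check along these lines could be sketched as follows. This is an illustration rather than the repository's implementation: `health_status` is a hypothetical helper, the `"degraded"` status value is an assumption, and `redis_client` is assumed to expose a `ping()` method the way a redis-py client does.

```python
def health_status(redis_client, model_loaded: bool = True) -> dict:
    """Build a health payload; `redis_client` is assumed to expose redis-py's ping()."""
    try:
        redis_connected = bool(redis_client.ping())
    except Exception:
        # Any connection error means the state store is unreachable.
        redis_connected = False
    healthy = redis_connected and model_loaded
    return {
        "model_loaded": model_loaded,
        "redis_connected": redis_connected,
        "status": "healthy" if healthy else "degraded",
    }


class UnreachableRedis:
    """Hypothetical stub that simulates a Redis outage."""

    def ping(self):
        raise ConnectionError("redis is down")


print(health_status(UnreachableRedis()))
# {'model_loaded': True, 'redis_connected': False, 'status': 'degraded'}
```

Catching the exception inside the check keeps the endpoint responsive during an outage, which is exactly when you need it to report something.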

What About Scaling?

This setup works well for single-instance deployments. When traffic increases, you have a few options.

Horizontal scaling with Redis Cluster

For high throughput, distribute your data across multiple Redis nodes. Redis Cluster handles sharding automatically.

High availability with Redis Sentinel

Add failover capability so your state store doesn't become a single point of failure. Sentinel monitors Redis instances and promotes replicas when the primary fails.

Use managed Redis services

AWS ElastiCache, Azure Cache for Redis, or Google Cloud Memorystore handle the operational burden. You focus on your model, they handle Redis reliability.

The key insight: your API containers remain stateless. You scale the state store independently.

Common Pitfalls to Avoid

I can't emphasize this enough: test your persistence before deploying to production.

Don't assume volumes work

Actually restart your containers and verify state persists. I've seen deployments fail because someone forgot to mount the volume in production.

Don't ignore Redis memory limits

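Redis keeps everything in memory. Monitor your memory usage and set a maxmemory limit and eviction policy appropriate for your workload. Once the limit is reached, Redis either evicts keys or rejects writes, depending on the configured maxmemory-policy. For a time-series store, eviction means silently losing history.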
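One way to keep a single series from growing without bound is to trim it after each write. The sketch below assumes the `ts:{series_id}` key layout shown earlier and a redis-py style client passed in as `redis_client`. `trim_series` and the default of 1000 points are hypothetical choices, not part of the demo repository.

```python
def trim_series(redis_client, series_id: str, keep: int = 1000) -> int:
    """Drop all but the newest `keep` points; returns how many were removed."""
    key = f"ts:{series_id}"
    # Ranks ascend by score, so 0 .. -(keep + 1) covers everything except the newest `keep`.
    return redis_client.zremrangebyrank(key, 0, -(keep + 1))
```

If the set holds `keep` points or fewer, the range is empty and nothing is removed, so the call is safe to run on every write.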

Don't skip monitoring

Add health checks. Monitor Redis connection status. Track prediction latency. You want to know when things break, not learn about it from angry users.

Conclusion

Time-series models need memory. Docker containers lose memory by default. The solution is simple: separate state from compute.

Use Redis as an external state store. Use Docker volumes to persist that state. Your model stays smart, your containers stay replaceable, and your deployments become reliable.

The full working code is available at github.com/ag-chirag/docker-redis-time-series. Clone it, run it, break it, learn from it.

And remember: the simplest solution that works is usually the right one. You don't always need Kubernetes and StatefulSets. Sometimes Docker Compose and a volume are enough.

Caching a Next.js API using Redis and Sevalla

Manish Shivanandhan — Wed, 27 Aug 2025 16:00:42 +0000

When you hear about Next.js, your first thought may be static websites or React-driven frontends. But that’s just part of the story. Next.js can also power full-featured backend APIs that you can host and scale just like any other backend service.

In an earlier article, I walked through building a Next.js API and deploying it with Sevalla. The example stored data in a PostgreSQL database and handled requests directly. That worked fine, but as traffic grows, APIs that hit the database on every request can slow down.

This is where caching comes in. By adding Redis as a cache layer, we can make our Next.js API much faster and more efficient. In this article, we’ll see how to add Redis caching to our API, deploy it with Sevalla, and show measurable improvements.

I explained the API in detail in the last article, so you can use this repository as the base for this project.

Why Caching Matters
What is Redis?
Setting Up the Project
Provisioning Redis
Updating Cache on Reads
Updating Cache on Writes
Deploying to Sevalla
Why Redis Works Well with Next.js APIs
Conclusion

Why Caching Matters

Every time your API hits the database, it consumes time and resources. Databases are great at storing and querying structured data, but they aren’t optimized for speed at scale when you need to serve thousands of read requests per second.

Caching solves this by keeping frequently accessed data in memory. Instead of asking the database every time, the API can return data directly from cache if it’s available. Redis is perfect for this because it’s an in-memory key-value store designed for performance.

For example, if you fetch the list of users from the database on every request, it might take 200ms to run the query and return results. With Redis caching, the first request stores the result in memory, and subsequent requests can return the same data in less than 10ms. That’s an order-of-magnitude improvement.

What is Redis?

Redis is an in-memory data store that works like a super-fast database. Instead of writing and reading from disk, it keeps data in memory, which makes it incredibly fast. That’s why it’s often used as a cache, where speed is more important than long-term storage.

It’s designed to handle high-throughput workloads with very low latency, typically responding in under a millisecond. This makes it a perfect fit for use cases like caching API responses, storing session data, or even powering real-time applications like chat systems and leaderboards.

Unlike a traditional database, Redis focuses on simplicity and speed. It stores data as key-value pairs, so you can quickly fetch or update values without writing complex queries. And because it supports advanced data types like lists, sets, and hashes, it’s much more flexible than a plain key-value store.

When combined with an API like the one we built in Next.js, Redis helps you reduce load on the main database and deliver blazing-fast responses to clients.

Setting Up the Project

Let’s clone the repository:

git clone git@github.com:manishmshiva/nextjs-api-pgsql.git next-api

Now let’s go into the directory and do an npm install to install the packages.

cd next-api
npm i

Create a .env file and add the database URL from Sevalla into an environment variable.

cat .env

The .env file should look like this:

PGSQL_URL=postgres://:-@asia-east1-001.proxy.kinsta.app:30503/

Now let’s make sure the application works as expected by starting the application and making a couple of API requests.

Starting the app:

npm run dev

Let’s make sure the database is connected. Go to localhost:3000 on your browser. It should return the following JSON:

Let’s create a new user. To create a new entry in the DB using Postman, send a POST request with the following JSON:

{"id":"d9553bb7-2c72-4d92-876b-9c3b40a8c62c","name":"Larry","email":"larry@example.com","age":"25"}

Let’s ensure the record is created by going to localhost:3000/users in the browser.

Great. Now let’s cache these APIs using Redis.

Provisioning Redis

Let’s go to Sevalla’s dashboard and click on “Databases”. Choose “Redis” from the list, and leave the rest of the options as defaults.

Once the database is created, switch on the “external connection” option and copy the publicly accessible URL.

This is how it should look in the .env file:

REDIS_URL=redis://default:@:

Now install a Redis client for Node.js:

npm install ioredis

We can now connect to Redis and use it as a cache layer for our users API. Let’s see how to implement caching.

Updating Cache on Reads

Here’s the updated users/route.ts that uses Redis:

import { NextResponse } from "next/server";
import { Client } from "pg";
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL!);

async function readUsers() {
  const client = new Client({
    connectionString: process.env.PGSQL_URL,
  });
  await client.connect();

  try {
    const result = await client.query("SELECT id, name, email, age FROM users");
    return result.rows;
  } finally {
    await client.end();
  }
}

export async function GET() {
  try {
    // Check cache first
    const cached = await redis.get("users");
    if (cached) {
      return NextResponse.json(JSON.parse(cached));
    }

    // Fallback to database if not cached
    const users = await readUsers();

    // Store result in cache with 60s TTL
    await redis.set("users", JSON.stringify(users), "EX", 60);

    return NextResponse.json(users);
  } catch (err) {
    return NextResponse.json({ error: "Failed to fetch users" }, { status: 500 });
  }
}

Now, when you hit /users:

The API first checks Redis.
If the data exists, it returns it instantly.
If not, it queries PostgreSQL, saves the result in Redis, and then returns it.

This makes repeated requests extremely fast. You can adjust the cache expiry (EX 60) depending on how fresh your data needs to be.

Without Redis caching, fetching /users ten times means ten database queries. Each might take around 150–200ms depending on database size and network latency.

With Redis, the first request still takes ~200ms since it populates the cache. But every request after that is nearly instant, often under 10ms. That’s a 20x improvement.

This speedup matters when your API faces hundreds or thousands of requests per second. Caching not only reduces latency but also lightens the load on your database.
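To make the arithmetic concrete, here is a small Python calculation using the illustrative figures above (about 200ms for a database query, about 10ms for a cache hit). These numbers are the article's examples, not measurements, and `total_latency_ms` is a hypothetical helper.

```python
def total_latency_ms(requests: int, misses: int, hit_ms: int = 10, miss_ms: int = 200) -> int:
    """Total time spent serving `requests`, of which `misses` go to the database."""
    hits = requests - misses
    return misses * miss_ms + hits * hit_ms


uncached = total_latency_ms(requests=10, misses=10)  # every request hits PostgreSQL
cached = total_latency_ms(requests=10, misses=1)     # only the first request misses
print(uncached, cached)  # 2000 290
```

For ten requests the total drops from 2000ms to 290ms, roughly 7x overall. The 20x figure is the per-request best case, and the overall gain approaches it as the share of cache hits grows.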

Updating Cache on Writes

Right now, only GET requests use the cache. But what if we add new users? The cache would still return the old data.

The solution is to update or clear the cache whenever a write happens. Let’s update the POST handler:

export async function POST(req: Request) {
  try {
    const body = await req.json();
    const client = new Client({
      connectionString: process.env.PGSQL_URL,
    });
    await client.connect();

    const query = `
      INSERT INTO users (id, name, email, age)
      VALUES ($1, $2, $3, $4)
      RETURNING *;
    `;

    try {
      const result = await client.query(query, [
        body.id,
        body.name,
        body.email,
        body.age,
      ]);

      // Invalidate cache so next GET fetches fresh data
      await redis.del("users");

      return NextResponse.json(result.rows[0]);
    } finally {
      // Close the connection even if the insert or cache call throws
      await client.end();
    }
  } catch (err) {
    return NextResponse.json({ error: "Failed to add user" }, { status: 500 });
  }
}

Now whenever a new user is created, the cache for users is cleared. The next GET request will fetch from the database, refresh the cache, and then continue serving cached data.

Deploying to Sevalla

Push your code to GitHub or fork my repository. Now let’s go to Sevalla and create a new app.

Choose your repository from the dropdown and check “Automatic deployment on commit”. This will ensure that the deployment is automatic every time you push code. Choose “Hobby” under the resources section.

Click “Create” and not “Create and deploy”. We haven’t added our PostgreSQL URL and Redis URL as environment variables, so the app will crash if you try to deploy it.

Go to the “Environment variables” section and add the key “PGSQL_URL” and the URL in the value field. Do the same for the “REDIS_URL” key and add the Redis URL.

Now go back to the “Overview” section and click “Deploy now”.

Once deployment is complete, click “Visit app” to get the live URL of your API. You can replace localhost:3000 with the new URL in Postman and test your API.

Why Redis Works Well with Next.js APIs

Redis is lightweight, blazing fast, and perfect for caching API responses. In the context of Next.js, it fits naturally because:

The API routes run server-side where Redis can be queried directly.
Caching logic is simple to add around database calls.
Redis can be used for more than caching – things like rate limiting, session storage, and pub/sub are also common patterns.

By combining Next.js, PostgreSQL, and Redis on Sevalla, you get a stack that is fast, scalable, and easy to deploy.

Conclusion

Caching isn’t just an optimization – it’s a necessity for real-world APIs. Next.js helps you build robust backend APIs that can be deployed easily. By adding Redis to the mix, those APIs can handle scale without breaking a sweat.

Sevalla ties it all together by providing managed PostgreSQL, Redis, and app hosting in one place. With a few environment variables and a GitHub repo, you can go from local dev to a production-ready, cached API in minutes.

Hope you enjoyed this article. Sign up for my free AI newsletter TuringTalks.ai for more hands-on tutorials on AI. You can also find me on LinkedIn.

How In-Memory Caching Works in Redis

Manish Shivanandhan — Wed, 16 Jul 2025 16:19:35 +0000

When you’re building a web app or API that needs to respond quickly, caching is often the secret sauce.

Without it, your server can waste time fetching the same data over and over again – from a database, a third-party API, or a slow storage system.

But when you store that data in memory, the same information can be served up in milliseconds. That’s where Redis comes in.

Redis is a fast, flexible tool that stores your data in RAM and lets you retrieve it instantly. Whether you’re building a dashboard, automating social media posts, or managing user sessions, Redis can make your system faster, more efficient, and easier to scale.

In this article, you’ll learn how in-memory caching works and why Redis is a go-to choice for many developers.

What Is In-Memory Caching?
What Is Redis?
How to Work with Redis
Real-Life Use Cases
Conclusion

What Is In-Memory Caching?

In-memory caching is a way of storing data in the system’s RAM instead of fetching it from a database or external source every time it’s needed.

Since RAM is incredibly fast compared to disk storage, you can access cached data almost instantly. This approach is perfect for information that doesn’t change very often, like API responses, user profiles, or rendered HTML pages.

Rather than repeatedly running the same queries or API calls, your app checks the cache first. If the data is there, it’s used right away. If it’s not, you fetch it from the source, save it to the cache, and then return it.
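This check-then-load flow is often called cache-aside. Here is a minimal Python sketch of it. `get_or_load` is a hypothetical helper that assumes `cache` behaves like a redis-py client (`get`, and `set` with an `ex` expiry), and `DictCache` is a stand-in so the example runs without a Redis server. The weather payload is made-up sample data.

```python
import json
from typing import Any, Callable


def get_or_load(cache, key: str, loader: Callable[[], Any], ttl_seconds: int = 300) -> Any:
    """Cache-aside read: return the cached value, or load, store, and return it."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit
    value = loader()  # cache miss: go to the slow source
    cache.set(key, json.dumps(value), ex=ttl_seconds)
    return value


class DictCache:
    """Hypothetical in-process stand-in for a Redis client (ignores the TTL)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ex=None):
        self.data[key] = value


calls = []


def slow_lookup():
    calls.append(1)  # count how often the slow source is actually used
    return {"city": "Paris", "temp_c": 21}


cache = DictCache()
first = get_or_load(cache, "weather:paris", slow_lookup)
second = get_or_load(cache, "weather:paris", slow_lookup)
print(first == second, len(calls))  # True 1
```

The second call never reaches `slow_lookup`, which is the whole point: the source is consulted once per TTL window instead of once per request.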

This technique reduces load on your backend, improves response time, and can dramatically improve your app’s performance under heavy traffic.

What Is Redis?

Redis is an open-source, in-memory data store that developers use to cache and manage data in real time.

Unlike traditional databases, Redis stores everything in memory, which makes data retrieval incredibly fast. But Redis isn’t just a simple key-value store. It offers a wide range of data types, from strings and lists to sets, hashes, and sorted sets.

Redis is also capable of handling more advanced tasks like pub/sub messaging, streams, and geospatial queries. Despite its power, Redis is lightweight and easy to get started with.

You can run it on your local machine, deploy it on a server, or even use managed Redis services offered by cloud providers. It’s trusted by major companies and used in all kinds of applications, from caching and session storage to real-time analytics and job queues.

How to Work with Redis

Redis Installation

Getting Redis up and running is surprisingly simple. You can find the installation instructions based on your operating system in the documentation.

To make sure Redis is working, run:

redis-cli ping
# Should respond with "PONG"

Redis Data Types

Redis gives you several built-in types that let you store and manage data in flexible ways.

Strings: Simple key ↔ value pairs.

SET username "Emily"
GET username

Lists: Ordered collections which are great for queues and timelines.

LPUSH tasks "task1"
RPUSH tasks "task2"
LRANGE tasks 0 -1

Hashes: Like JSON objects, great for user profiles.

HSET user:1 name "Alice"
HSET user:1 email "alice@example.com"
HGETALL user:1

Sets: Unordered collections, ideal for tags or unique items.

SADD tags "python"
SADD tags "redis"
SMEMBERS tags

Sorted Sets: Sets with scores – useful for leaderboards.

ZADD leaderboard 100 "Bob"
ZADD leaderboard 200 "Carol"
ZRANGE leaderboard 0 -1 WITHSCORES

Redis also supports bitmaps, HyperLogLogs, streams, and geospatial indexes, and it keeps expanding its support for data structures.

Redis with Python

If you’re working in Python, using Redis is just as easy. After installing the redis Python library using pip install redis, you can connect to your Redis server and start setting and getting keys right away.

Here is some simple Python code to work with Redis:

import redis

# Connect to the local Redis server on default port 6379 and use database 0
r = redis.Redis(host='localhost', port=6379, db=0)

# --- Basic String Example ---

# Set a key called 'welcome' with a string value
r.set('welcome', 'Hello, Redis!')

# Get the value of the key 'welcome'
# Output will be a byte string: b'Hello, Redis!'
print(r.get('welcome'))


# --- Hash Example (like a Python dict) ---

# Create a Redis hash under the key 'user:1'
# This hash stores fields 'name' and 'email' for a user
r.hset('user:1', mapping={
    'name': 'Alice',
    'email': 'alice@example.com'
})

# Get all fields and values in the hash as a dictionary of byte strings
# Output: {b'name': b'Alice', b'email': b'alice@example.com'}
print(r.hgetall('user:1'))


# --- List Example (acts like a queue or stack) ---

# Push 'Task A' to the left of the list 'tasks'
r.lpush('tasks', 'Task A')

# Push 'Task B' to the left of the list 'tasks' (it becomes the first item)
r.lpush('tasks', 'Task B')

# Retrieve all elements from the list 'tasks' (from index 0 to -1, meaning the full list)
# Output: [b'Task B', b'Task A']
print(r.lrange('tasks', 0, -1))

You might store a user's session data, queue background tasks, or even cache rendered HTML pages. Individual Redis commands are fast and atomic, so concurrent clients never see a half-applied command, even in high-traffic environments. Sequences of several commands still need a transaction (MULTI/EXEC) or a Lua script if they must run as a unit.

One of the most useful features in Redis is key expiration. You can tell Redis to automatically delete a key after a certain period, which is especially handy for session data or temporary caches.

For example, this sets a key that expires after one hour (3,600 seconds):

SET session:1234 "some data" EX 3600

Redis also supports persistence through RDB snapshots and an append-only file (AOF), so even though it’s an in-memory store, your data can survive a reboot.

Redis isn’t limited to small apps. It scales easily through replication, clustering, and Sentinel.

Replication allows you to create read-only copies of your data, which helps distribute the load. Clustering breaks your data into chunks and spreads them across multiple servers. And Sentinel handles automatic failover to keep your system running even if one server goes down.

Real-Life Use Cases

One of the most common uses for Redis is caching API responses.

Let’s say you have an app that displays weather data. Rather than calling the weather API every time a user loads the page, you can cache the response for each city in Redis for 5 or 10 minutes. That way, you only fetch new data occasionally, and your app becomes much faster and cheaper to run.

Another powerful use case is session management. In web applications, every logged-in user has a session that tracks who they are and what they’re doing. Redis is a great place to store this session data because it’s fast and temporary.

You can store the session ID as a key, with the user’s information in a hash. Add an expiration time, and you’ve got automatic session timeout built in. Since Redis is so fast and supports high-concurrency access, it’s a great fit for applications with thousands of users logging in at the same time.
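A session write along these lines could look like the following Python sketch. `save_session` is a hypothetical helper, the `session:` key prefix follows the TTL example earlier, and `redis_client` is assumed to be a redis-py client (`hset` with `mapping`, and `expire`).

```python
def save_session(redis_client, session_id: str, user: dict, ttl_seconds: int = 3600) -> str:
    """Store session fields in a hash and set a timeout; returns the key used."""
    key = f"session:{session_id}"
    redis_client.hset(key, mapping=user)
    redis_client.expire(key, ttl_seconds)  # Redis deletes the key when the TTL elapses
    return key
```

Calling `expire` again on each request would give a sliding timeout. Note that the two commands here are sent separately, so a pipeline or transaction is worth considering if they must succeed together.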

Conclusion

In-memory caching is one of the simplest and most effective ways to speed up your app, and Redis makes it incredibly easy to implement. It’s not just a cache, it’s a toolkit for building fast, scalable, real-time systems. You can start small by caching a few pages or API responses, and as your needs grow, Redis grows with you.

If you’re just getting started, try running Redis locally and experimenting with different data types. Store some strings, build a simple task queue with lists, or track user scores with a sorted set. The more you explore, the more you’ll see how Redis can help your application run faster, smarter, and more efficiently.

Enjoyed this article? Connect with me on LinkedIn. See you soon with another topic.

How to Build a Flexible API with Feature Flags Using Open Source Tools

Pradumna Saraf — Tue, 19 Nov 2024 22:56:26 +0000

Feature flagging has changed the paradigm of how backend developers can test and modify the things they build. With feature flags, we can enable and disable a feature or change the functionality of something on the fly with a single click (no need to redeploy).

In this tutorial, we will see how to toggle a feature, or a specific part of the code, from a UI whenever we want, without redeploying the application.

To understand things more deeply, we will build an app from scratch, look at feature flagging capabilities, and use a tool called Flagsmith to manage our created feature flags from a single dashboard.

Here’s what we’ll cover:

Prerequisites
What is a Feature Flag?
Feature Flags for Backend Development
Why Use Open Source Tools?
Let’s Code!
Conclusion

Prerequisites

Golang installed and a medium-level understanding of it.
A running Redis instance (Remote or local instance)
Flagsmith Account (It’s Free. We will cover this later in the article.)

What is a Feature Flag?

A feature flag is a technique in development that allows teams to turn features on or off without modifying the source code or redeploying.

To make it a bit simpler, think of them as conditional statements (for example, if-else statements): whether the flag is on or off determines which code path is executed.
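As a language-neutral illustration of that idea, here is a tiny Python sketch. The flag name `beta_endpoint`, the `handle_beta` function, and the plain dictionary standing in for a flag service are all hypothetical. A real setup would fetch the flag values from a tool such as Flagsmith.

```python
def handle_beta(flags: dict) -> tuple:
    """Return (status_code, body) for a hypothetical /beta endpoint guarded by a flag."""
    if flags.get("beta_endpoint", False):
        return 200, {"message": "This is beta endpoint"}
    # Flag off (or missing): behave as if the endpoint does not exist.
    return 404, {"error": "Not found"}


print(handle_beta({"beta_endpoint": True}))  # (200, {'message': 'This is beta endpoint'})
print(handle_beta({}))                       # (404, {'error': 'Not found'})
```

Defaulting to "off" when the flag is missing is a deliberate choice here: an unreachable flag service then hides the unfinished feature instead of exposing it.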

Feature Flags for Backend Development

You may have seen feature flags used in frontends and websites, but there is much more to them. You can use them on the server side to modify the functionality of an API, doing things like modifying/setting the rate limit, changing the API endpoint's functionality or completely turning it off. As backend developers, we can level up our testing with feature flags.

To demonstrate this, we will build a demo app. It’s designed to show feature flagging capabilities, from modifying functionality (the rate limit) on the fly to adding a new endpoint to the API for beta testing or an initial rollout. We’ll use entirely open-source tools along the way!

Why Use Open Source Tools?

We will be using open source tools to build this app (Golang, Redis, and Flagsmith). Open source brings more transparency and trust and encourages collaboration with the global community of backend developers.

By integrating open source tools, we get full visibility as we build and test. For example, we will integrate feature flags with GitHub, which lets us track the lifecycle of a feature by linking a Flagsmith feature flag with a GitHub Pull Request or Issue. This lets us stay updated with the changes to our features without having to manually track each modification. We can easily track the status of our features across different environments.

Let’s Code!

In this tutorial, you’ll see how the functionality of an app changes before and after testing with feature flagging mechanisms. The tools and frameworks we’ll use are Golang, Docker, Redis, Flagsmith, and GitHub. As discussed, these are open source or free to sign up for and try.

To get started, open your favourite IDE, initialize a Golang project, and then copy the code below into the main.go file. Then run go mod tidy to install all the dependencies it needs.

Let’s look at what’s going on in the code snippet below:

package main

import (
    "context"
    "errors"
    "fmt"
    "log"
    "net/http"
    "os"
    "strconv"

    "github.com/gin-gonic/gin"
    "github.com/go-redis/redis_rate/v10"
    "github.com/joho/godotenv"
    "github.com/redis/go-redis/v9"
)

var (
    redisClient *redis.Client
    limiter     *redis_rate.Limiter
)

func initClients() {
    redisClient = redis.NewClient(&redis.Options{
        Addr: os.Getenv("REDIS_URL"),
    })
    limiter = redis_rate.NewLimiter(redisClient)
}

func main() {
    err := godotenv.Load()
    if err != nil {
        log.Printf("Loading environment variable from the host system")
    } else {
        log.Printf("Loading environment from .env file")
    }

    initClients()
    defer redisClient.Close()

    r := gin.Default()
    r.GET("/ping", func(c *gin.Context) {
        err, remainingLimit := rateLimitCall(c.ClientIP())
        if err != nil {
            c.JSON(
                http.StatusTooManyRequests,
                gin.H{"error": "Rate Limit Hit"})
        } else {
            c.JSON(
                http.StatusOK,
                gin.H{"Your left over API request is": remainingLimit})
        }
    })
    r.GET("/beta", func(c *gin.Context) {
        c.JSON(
            http.StatusOK,
            gin.H{"message": "This is beta endpoint"})
    })
    r.Run(":" + os.Getenv("PORT"))
}

func rateLimitCall(ClientIP string) (error, int) {
    ctx := context.Background()

    rateLimitString := os.Getenv("RATE_LIMIT")
    RATE_LIMIT, _ := strconv.Atoi(rateLimitString)

    res, err := limiter.Allow(ctx, ClientIP, redis_rate.PerHour(RATE_LIMIT))
    if err != nil {
        panic(err)
    }

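    // res.Allowed is 0 only when the request is rejected. Remaining already
    // reaches 0 on the last permitted request, so checking it blocks one request early.
    if res.Allowed == 0 {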
        return errors.New("You have hit the Rate Limit for the API. Try again later"), 0
    }

    fmt.Println("remaining request for", ClientIP, "is", res.Remaining)
    return nil, res.Remaining
}

Initializing the Tools

func initClients() {
    redisClient = redis.NewClient(&redis.Options{
        Addr: os.Getenv("REDIS_URL"),
    })
    limiter = redis_rate.NewLimiter(redisClient)
}

func main() {
    err := godotenv.Load()
    if err != nil {
        log.Printf("Loading environment variable from the host system")
    } else {
        log.Printf("Loading environment from .env file")
    }

    initClients()
    defer redisClient.Close()

    r := gin.Default()
    ...
    })

At the top, we declare package-level variables for the Redis and rate limiter clients so that we create them once and reuse them. The initClients() function initialises both.

In main(), first, we load the environment variables from the system or the .env file. Then we call initClients(). This will create clients and store them in the variables we created.

Next, we create a Gin router that handles all our incoming requests. For this demo, we need a running Redis instance to store the rate-limiting data. We can use Docker or any remote machine, as long as REDIS_URL is updated accordingly. I am going to use Docker.

We could also go a step further and read every setting from feature flags, but we won’t do this here. These are the environment variables we need in our .env file:

REDIS_URL=localhost:6379
PORT=8080
RATE_LIMIT=10

Creating Endpoints for the API

r.GET("/ping", func(c *gin.Context) {
        err, remainingLimit := rateLimitCall(c.ClientIP())
        if err != nil {
            c.JSON(
                http.StatusTooManyRequests,
                gin.H{"error": "Rate Limit Hit"})
        } else {
            c.JSON(
                http.StatusOK,
                gin.H{"Your left over API request is": remainingLimit})
        }
    })
    r.GET("/beta", func(c *gin.Context) {
        c.JSON(
            http.StatusOK,
            gin.H{"message": "This is beta endpoint"})
    })
    r.Run(":" + os.Getenv("PORT"))

Then we create two GET endpoints, /ping and /beta. Every time someone hits the /ping endpoint we call the rateLimitCall() function. It checks and sets the rate limit of incoming requests from an IP address. All this is stored in the Redis instance we created.

When a user hits the /ping endpoint for the first time, the limiter creates an entry in Redis with a limit of 10 requests per hour. The number 10 comes from the RATE_LIMIT we set, and the hourly window comes from the redis_rate.PerHour(RATE_LIMIT) call.

Next, we check if the user has a remaining limit. If yes, we will return a message with the number of requests they have remaining. Otherwise, if they hit the limit cap, we return a message letting them know this.

Apart from the /ping endpoint, we have another endpoint /beta. It returns a simple message, but later we’ll see how (using feature flags) we can completely turn on and off the functionality of this endpoint.

How to Add Feature Flagging

Now it’s time to add feature flagging capabilities to our app. We are going to use Flagsmith, open source software that lets us easily create and manage feature flags across web, mobile, and server-side applications.

Using Flagsmith, we can wrap features in a flag and then toggle them on or off for different environments, users, or user segments. And then you’ll be able to manage all of them from the Flagsmith dashboard without needing to redeploy.

So, let’s install the Flagsmith package by running the below command:

go get github.com/Flagsmith/flagsmith-go-client/v3

Then we import the package by giving it an alias flagsmith. Below is the updated functionality after we apply feature flagging to our existing code.

Let’s understand the changes we’ve made here (I’ll explain below the code snippet):

package main

import (
    "context"
    "errors"
    "fmt"
    "log"
    "net/http"
    "os"

    flagsmith "github.com/Flagsmith/flagsmith-go-client/v3"
    "github.com/gin-gonic/gin"
    "github.com/go-redis/redis_rate/v10"
    "github.com/joho/godotenv"
    "github.com/redis/go-redis/v9"
)

var (
    redisClient     *redis.Client
    limiter         *redis_rate.Limiter
    flagsmithClient *flagsmith.Client
)

func initClients() {
    redisClient = redis.NewClient(&redis.Options{
        Addr: os.Getenv("REDIS_URL"),
    })
    limiter = redis_rate.NewLimiter(redisClient)
    flagsmithClient = flagsmith.NewClient(os.Getenv("FLAGSMITH_ENVIRONMENT_KEY"))
}

func main() {
    err := godotenv.Load()
    if err != nil {
        log.Printf("Loading environment variable from the host system")
    } else {
        log.Printf("Loading environment from .env file")
    }

    initClients()
    defer redisClient.Close()

    r := gin.Default()
    r.GET("/ping", func(c *gin.Context) {
        err, remainingLimit := rateLimitCall(c.ClientIP())
        if err != nil {
            c.JSON(
                http.StatusTooManyRequests,
                gin.H{"error": "Rate Limit Hit"})
        } else {
            c.JSON(
                http.StatusOK,
                gin.H{"Your left over API request is": remainingLimit})
        }
    })
    r.GET("/beta", func(c *gin.Context) {
        flags := getFeatureFlags()
        isEnabled, _ := flags.IsFeatureEnabled("beta")
        if isEnabled {
            c.JSON(
                http.StatusOK,
                gin.H{"message": "This is beta endpoint"})
        } else {
            c.String(http.StatusNotFound, "404 page not found")
        }
    })

    r.Run(":" + os.Getenv("PORT"))
}

func rateLimitCall(ClientIP string) (error, int) {

    ctx := context.Background()

    flags := getFeatureFlags()
    rateLimitInterface, _ := flags.GetFeatureValue("rate_limit")
    RATE_LIMIT := int(rateLimitInterface.(float64))
    fmt.Println("Current Rate Limit is", RATE_LIMIT)

    res, err := limiter.Allow(ctx, ClientIP, redis_rate.PerHour(RATE_LIMIT))
    if err != nil {
        panic(err)
    }

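    // res.Allowed is 0 only when the request is rejected. Remaining already
    // reaches 0 on the last permitted request, so checking it blocks one request early.
    if res.Allowed == 0 {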
        return errors.New("You have hit the Rate Limit for the API. Try again later"), 0
    }

    fmt.Println("remaining request for", ClientIP, "is", res.Remaining)
    return nil, res.Remaining
}

func getFeatureFlags() flagsmith.Flags {
    ctx := context.Background()
    flags, _ := flagsmithClient.GetEnvironmentFlags(ctx)
    return flags
}

Understanding the Feature Flag Code Logic

func getFeatureFlags() flagsmith.Flags {
    ctx := context.Background()
    flags, _ := flagsmithClient.GetEnvironmentFlags(ctx)
    return flags
}

First, let’s directly jump to the new getFeatureFlags() function we created at the bottom. This function will return all the flags we created on the Flagsmith dashboard, by calling the GetEnvironmentFlags() method on flagsmithClient.

We created flagsmithClient inside the initClients() function. The NewClient() function needs an environment key, which we can get from the Flagsmith dashboard. As we did for the Redis and limiter clients, we store the client in a global variable for reuse. We’ll cover the dashboard, creating flags, and retrieving the key in later steps.

func rateLimitCall(ClientIP string) (error, int) {

    ctx := context.Background()

    flags := getFeatureFlags()
    rateLimitInterface, _ := flags.GetFeatureValue("rate_limit")
    RATE_LIMIT := int(rateLimitInterface.(float64))
    fmt.Println("Current Rate Limit is", RATE_LIMIT)

    res, err := limiter.Allow(ctx, ClientIP, redis_rate.PerHour(RATE_LIMIT))
    if err != nil {
        panic(err)
    }

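    // res.Allowed is 0 only when the request is rejected. Remaining already
    // reaches 0 on the last permitted request, so checking it blocks one request early.
    if res.Allowed == 0 {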
        return errors.New("You have hit the Rate Limit for the API. Try again later"), 0
    }

    fmt.Println("remaining request for", ClientIP, "is", res.Remaining)
    return nil, res.Remaining
}

Now coming to the rateLimitCall() function, instead of getting RATE_LIMIT from the environment, we get the value from the rate_limit flag (that we will create later). We call getFeatureFlags() and get the flag rate_limit value out from all the flags.

By storing the limit in a feature flag, we can change it at any time from the dashboard. We don’t need to touch the code or do it the traditional way: editing the RATE_LIMIT value and restarting the server so that it picks up the new value.

    r.GET("/beta", func(c *gin.Context) {
        flags := getFeatureFlags()
        isEnabled, _ := flags.IsFeatureEnabled("beta")
        if isEnabled {
            c.JSON(
                http.StatusOK,
                gin.H{"message": "This is beta endpoint"})
        } else {
            c.String(http.StatusNotFound, "404 page not found")
        }
    })

Now for the /beta endpoint: if the beta flag is enabled, the endpoint serves the request. Otherwise, it behaves like an unreachable endpoint and returns a 404 error message.

In our example, I have added a basic placeholder message to show how it will work, but this opens new possibilities in testing and initial releases (beta). If the API has a new endpoint, we can wrap the functionality in the feature flag and make it available and unavailable with a single click of a button. Also, we can do a lot more like scheduling and canary releases.

Also, our .env file will look like this. We have removed RATE_LIMIT and added FLAGSMITH_ENVIRONMENT_KEY.

REDIS_URL=localhost:6379
PORT=8080
FLAGSMITH_ENVIRONMENT_KEY=ser.ZRd***********469

How to Create Feature Flags in the Flagsmith Dashboard

Let’s head to the Flagsmith dashboard to create the flags we used above and get the access key. If you don’t have a Flagsmith account you can sign up for free here.

After you sign up you will be prompted to create an organisation and a project. Project separation is good, as it helps us isolate logic for different projects. Once you are done, you will see a dashboard, just like the screenshot below.

The dashboard offers plenty of functionality, from integrations to scheduling flags and comparing changes. Apart from Go, Flagsmith provides many SDKs. Click a language name and the dashboard will show some boilerplate code for that language.

Rate Limiting Feature Flag

Now, let's create our first feature flag for the rate limit. Click on the Create Feature button in the top right corner. A sidebar window will open up. Set the name to rate_limit (the name our code looks up), then select Enabled by default so that the flag is turned on as soon as it is created.

In the value section, we need to set the flag value. It can take formats like Txt, JSON, XML, and so on. As our feature value is simple text like 20, 30, and so on, we will choose Txt (the default one) and set a random limit – we’ll go with 20.

You can also add tags and a description. Tags are helpful for filtering feature flags. For example, we can create a backend tag to filter all the feature flags related to the backend. The description is a concise explanation of what this particular feature flag does when it is enabled (and will help with future understanding).

The screenshot below shows how it will look after filling in the details. Then, click on the Create Feature button to create the flag.

Beta Feature Flag

Let’s now create a second feature flag named beta. The process is the same as for the first one, except that we don’t need to set a flag value, so leave that field empty. Once we create both flags, our dashboard will look like this. It shows the flag name, value, current state (view), and so on.

Getting the Access Key

To get the Access Key, click on SDK Keys in the sidebar, then click the Create Server-side Environment Key button to generate a key. As our app runs on the server, a server-side key is the right choice. Then copy the key and paste it into .env as the value of FLAGSMITH_ENVIRONMENT_KEY.

Running the API

Now everything is set, so let’s head back to the IDE and run the server by executing the go run main.go command in the terminal. We will see the server's startup message in the terminal. In case you encounter any errors, check that the packages are correctly installed, the variables are correctly set, and the app can reach the Redis instance.

Now if we visit localhost:8080/ping, we will get a message {"Your left over API request is":19}. The limit is 20 and we have made one request, so 19 remain.

Updating the `rate_limit` Flag

Let’s update the rate_limit flag value to 10 and see what happens. To do so, again visit the Flagsmith dashboard and click on the flag name. A side menu bar will open. Update the value to 10, and click on the Update Feature Value button.

We can also schedule the update. For example, this can be useful when we expect a spike in traffic at a certain time and want to reduce the limit per user to lower server load.

If you now visit localhost:8080/ping, you will get a message {"Your left over API request is":8} – because the total limit is 10 and we have already requested two times.

Let's now test the /beta endpoint. Visit localhost:8080/beta, and we will see a message {"message":"This is beta endpoint"}.

Now go back to the Flagsmith dashboard and toggle the switch to disable this flag. Visit the URL again, and you will get a 404 message, as if this endpoint never existed.

Now that we’ve set up the functionality and demoed the feature flagging capabilities, let’s see how we can integrate the Flagsmith GitHub App.

How to Integrate Feature Flags with the GitHub App

First, make sure you have pushed your app to GitHub. After that, install the Flagsmith GitHub App on your repo from the GitHub Marketplace.

By integrating GitHub and Flagsmith, we can view updates on our feature flags as comments in GitHub Issues and Pull Requests. This allows us to easily track features, from creating an issue to merging a PR and deploying the changes.

Then select your organisation and the repositories where you want to install the app. You can install it on all of your repos or select a particular one.

As you install it, you will be auto-redirected to the Flagsmith dashboard to configure and complete the integration. Most of the data will be pre-populated, so you just need to select and add a project, and then save the configuration.

Once you hit the Save Configuration Button, it will redirect you back to the main Flagsmith dashboard where we were previously working.

Now let’s link one of the existing flags with the GitHub issue/pull request (raise a dummy PR/issue to test it), or you can create a new flag to test. Let’s proceed with the beta flag which we already created for the beta endpoint.

To link the flag with an existing issue or a pull request, click on the flag name, and a side menu will pop up from the right. Then, choose the 'Link' tab. Then select the Pull Request option, and choose the Pull Request you want to link. All of your Issues and Pull Requests linked to this flag are visible below:

To verify that the flag is successfully linked, click the hyperlink with the arrow icon below the Name column heading. It will navigate you to that particular Issue/Pull Request on GitHub. You can see that the Flagsmith GitHub App has commented below with all the details, such as environment, enabled value, and so on.

Testing the Flagsmith GitHub App

After this, when you make any changes to the flag settings, such as turning on/off the flag or changing the value, the bot will comment with all the updated details.

Let’s test by turning the flag off. As soon as you turn off the flag from the dashboard, the bot should comment that the flag has now been disabled:

That’s it. That's how simple it is to integrate Flagsmith with GitHub.

Conclusion

To sum it up, you now know how you can leverage feature flags as a backend developer to change the functionality of your app on the fly.

To take things to the next level, we integrated our demo app with the Flagsmith GitHub app so it could stay updated with the changes to our feature flags’ status on Pull Requests/Issues without having to manually update them.

Check out the Flagsmith repo here and don't forget to give each of these projects a star to show your support. You can also join their amazing community to get technical support.

You can connect with me - Pradumna Saraf, on socials here.

How to Build a Scalable URL Shortener with Distributed Caching Using Redis

Birkaran Sachdev — Tue, 19 Nov 2024 15:14:58 +0000

In this tutorial, we'll build a scalable URL shortening service using Node.js and Redis. This service will leverage distributed caching to handle high traffic efficiently, reduce latency, and scale seamlessly. We'll explore key concepts such as consistent hashing, cache invalidation strategies, and sharding to ensure the system remains fast and reliable.

By the end of this guide, you'll have a fully functional URL shortener service that uses distributed caching to optimize performance. We'll also create an interactive demo where users can input URLs and see real-time metrics like cache hits and misses.

What You Will Learn

How to build a URL shortener service using Node.js and Redis.
How to implement distributed caching to optimize performance.
Understanding consistent hashing and cache invalidation strategies.
Using Docker to simulate multiple Redis instances for sharding and scaling.

Prerequisites

Before starting, make sure you have the following installed:

Node.js (v14 or higher)
Redis
Docker
Basic knowledge of JavaScript, Node.js, and Redis.

Project Overview
Step 1: Setting Up the Project
Step 2: Setting Up Redis Instances
Step 3: Implementing the URL Shortener Service
Step 4: Implementing Cache Invalidation
Step 5: Monitoring Cache Metrics
Step 6: Testing the Application
Conclusion: What You’ve Learned

Project Overview

We'll build a URL shortener service where:

Users can shorten long URLs and retrieve the original URLs.
The service uses Redis caching to store mappings between shortened URLs and original URLs.
The cache is distributed across multiple Redis instances to handle high traffic.
The system will demonstrate cache hits and misses in real-time.

System Architecture

To ensure scalability and performance, we'll divide our service into the following components:

API Server: Handles requests for shortening and retrieving URLs.
Redis Caching Layer: Uses multiple Redis instances for distributed caching.
Docker: Simulates a distributed environment with multiple Redis containers.

Step 1: Setting Up the Project

Let's set up our project by initializing a Node.js application:

mkdir scalable-url-shortener
cd scalable-url-shortener
npm init -y

Now, install the necessary dependencies:

npm install express redis shortid dotenv

express: A lightweight web server framework.
redis: To handle caching.
shortid: For generating short, unique IDs.
dotenv: For managing environment variables.

Create a .env file in the root of your project:

PORT=3000
REDIS_HOST_1=localhost
REDIS_PORT_1=6379
REDIS_HOST_2=localhost
REDIS_PORT_2=6380
REDIS_HOST_3=localhost
REDIS_PORT_3=6381

These variables define the Redis hosts and ports we'll be using.
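Numbered variables like these can be collected into a list at startup, which avoids hard-coding the number of shards. The following is a minimal sketch rather than part of the tutorial's code: the helper name loadRedisNodes is hypothetical, and it assumes the numbering starts at 1 and has no gaps.

```javascript
// Hypothetical helper: collect REDIS_HOST_n / REDIS_PORT_n pairs from an env-like object.
// Assumes numbering starts at 1 and stops at the first missing REDIS_HOST_n.
function loadRedisNodes(env) {
  const nodes = [];
  for (let n = 1; env[`REDIS_HOST_${n}`] !== undefined; n++) {
    // Environment variables are always strings, so convert and validate the port.
    const port = Number(env[`REDIS_PORT_${n}`]);
    if (!Number.isInteger(port) || port <= 0 || port > 65535) {
      throw new Error(`REDIS_PORT_${n} is missing or not a valid port`);
    }
    nodes.push({ host: env[`REDIS_HOST_${n}`], port });
  }
  return nodes;
}

// Usage with the same values as the .env file above.
const nodes = loadRedisNodes({
  REDIS_HOST_1: 'localhost', REDIS_PORT_1: '6379',
  REDIS_HOST_2: 'localhost', REDIS_PORT_2: '6380',
  REDIS_HOST_3: 'localhost', REDIS_PORT_3: '6381'
});
console.log(nodes.length); // 3
```

In the real application you would pass process.env after calling require('dotenv').config().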

Step 2: Setting Up Redis Instances

We'll use Docker to simulate a distributed environment with multiple Redis instances.

Run the following commands to start three Redis containers:

docker run -p 6379:6379 --name redis1 -d redis
docker run -p 6380:6379 --name redis2 -d redis
docker run -p 6381:6379 --name redis3 -d redis

This will set up three Redis instances running on different ports. We'll use these instances to implement consistent hashing and sharding.

Step 3: Implementing the URL Shortener Service

Let's create our main application file, index.js:

require('dotenv').config();
const express = require('express');
const redis = require('redis');
const shortid = require('shortid');

const app = express();
app.use(express.json());

// One client per Redis instance. node-redis v4+ (what `npm install redis` installs)
// expects the host and port under `socket`, with a numeric port.
const redisClients = [
  redis.createClient({ socket: { host: process.env.REDIS_HOST_1, port: Number(process.env.REDIS_PORT_1) } }),
  redis.createClient({ socket: { host: process.env.REDIS_HOST_2, port: Number(process.env.REDIS_PORT_2) } }),
  redis.createClient({ socket: { host: process.env.REDIS_HOST_3, port: Number(process.env.REDIS_PORT_3) } })
];
redisClients.forEach((client) => {
  client.on('error', (err) => console.error('Redis client error:', err));
});

// Hash function to distribute keys among Redis clients
function getRedisClient(key) {
  const hash = key.split('').reduce((acc, char) => acc + char.charCodeAt(0), 0);
  return redisClients[hash % redisClients.length];
}

// Endpoint to shorten a URL
app.post('/shorten', async (req, res) => {
  const { url } = req.body;
  if (!url) return res.status(400).send('URL is required');

  const shortId = shortid.generate();
  const redisClient = getRedisClient(shortId);

  await redisClient.set(shortId, url);
  res.json({ shortUrl: `http://localhost:${process.env.PORT}/${shortId}` });
});

// Endpoint to retrieve the original URL
app.get('/:shortId', async (req, res) => {
  const { shortId } = req.params;
  const redisClient = getRedisClient(shortId);

  // get() returns a promise that resolves to null when the key does not exist
  const url = await redisClient.get(shortId);
  if (!url) {
    return res.status(404).send('URL not found');
  }
  res.redirect(url);
});

// Clients must be connected before they can run commands
Promise.all(redisClients.map((client) => client.connect()))
  .then(() => {
    app.listen(process.env.PORT, () => {
      console.log(`Server running on port ${process.env.PORT}`);
    });
  })
  .catch((err) => {
    console.error('Could not connect to Redis:', err);
    process.exit(1);
  });

As you can see in this code, we have:

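Key Distribution:
- We distribute keys (short IDs) across multiple Redis clients using a simple hash function: the sum of the key's character codes, modulo the number of clients.
- This spreads keys roughly evenly across the Redis instances. Note that it is plain modulo hashing, a simplified stand-in for consistent hashing: changing the number of instances remaps most keys.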
URL Shortening:
- The /shorten endpoint accepts a long URL and generates a short ID using the shortid library.
- The shortened URL is stored in one of the Redis instances using our hash function.
URL Redirection:
- The /:shortId endpoint retrieves the original URL from the cache and redirects the user.
- If the URL is not found in the cache, a 404 response is returned.
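For comparison with the modulo-based getRedisClient() function, a consistent hash ring can be sketched as follows. This is an illustration under stated assumptions, not the tutorial's implementation: the HashRing class, the hash32 helper, the node labels, and the choice of MD5 and 100 virtual points per node are all picked for the example. Each node is placed on the ring at several points, and a key belongs to the first point at or after its own hash.

```javascript
const crypto = require('crypto');

// Map a string to a 32-bit unsigned integer using the first 4 bytes of its MD5 digest.
function hash32(value) {
  return crypto.createHash('md5').update(value).digest().readUInt32BE(0);
}

class HashRing {
  // `replicas` virtual points per node smooth out the key distribution.
  constructor(nodes, replicas = 100) {
    this.replicas = replicas;
    this.ring = []; // array of { point, node }, kept sorted by point
    nodes.forEach((node) => this.addNode(node));
  }

  addNode(node) {
    for (let i = 0; i < this.replicas; i++) {
      this.ring.push({ point: hash32(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  // The first ring point at or after the key's hash owns the key (wrapping around).
  getNode(key) {
    if (this.ring.length === 0) throw new Error('ring has no nodes');
    const h = hash32(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) { // binary search for the first point >= h
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].node;
  }
}

// Usage: count how many keys change owner when a fourth node joins.
const ring = new HashRing(['redis1', 'redis2', 'redis3']);
const keys = Array.from({ length: 1000 }, (_, i) => `key${i}`);
const before = keys.map((k) => ring.getNode(k));
ring.addNode('redis4');
const moved = keys.filter((k, i) => ring.getNode(k) !== before[i]).length;
console.log(`${moved} of ${keys.length} keys moved after adding a node`);
```

With a ring, the only keys that change owner are the ones the new node takes over, which is about a quarter of them on average when going from three nodes to four. With hash % N, the same change would remap most keys.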

Step 4: Implementing Cache Invalidation

In a real-world application, URLs may expire or change over time. To handle this, we need to implement cache invalidation.

Adding Expiry to Cached URLs

Let's modify our index.js file to set an expiration time for each cached entry:

// Endpoint to shorten a URL with expiration
app.post('/shorten', async (req, res) => {
  const { url, ttl } = req.body; // ttl (time-to-live) is optional
  if (!url) return res.status(400).send('URL is required');

  const shortId = shortid.generate();
  const redisClient = getRedisClient(shortId);

  // node-redis v4+ takes expiry options as an object; EX is in seconds
  await redisClient.set(shortId, url, { EX: ttl || 3600 }); // Default TTL of 1 hour
  res.json({ shortUrl: `http://localhost:${process.env.PORT}/${shortId}` });
});

TTL (Time-To-Live): We set a default expiration time of 1 hour (3600 seconds) for each shortened URL. To customize it for a URL, pass a ttl value in seconds in the request body.
Cache Invalidation: When the TTL expires, the entry is automatically removed from the cache.
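Because ttl comes straight from the request body, it is worth validating before it reaches Redis, which rejects a zero or negative expire time. A minimal sketch, assuming a hypothetical helper named resolveTtl and an arbitrary 30-day upper bound, neither of which is part of the tutorial's code:

```javascript
const DEFAULT_TTL_SECONDS = 3600;           // the tutorial's 1-hour default
const MAX_TTL_SECONDS = 30 * 24 * 60 * 60;  // hypothetical cap of 30 days

// Return a safe TTL in seconds, falling back to the default when none is given.
function resolveTtl(ttl) {
  if (ttl === undefined || ttl === null) return DEFAULT_TTL_SECONDS;
  const seconds = Number(ttl);
  if (!Number.isInteger(seconds) || seconds <= 0) {
    throw new RangeError('ttl must be a positive whole number of seconds');
  }
  return Math.min(seconds, MAX_TTL_SECONDS);
}

console.log(resolveTtl(undefined)); // 3600
console.log(resolveTtl('120'));     // 120
```

In the /shorten handler, the result of resolveTtl(ttl) would replace the ttl || 3600 expression, with a caught RangeError turned into a 400 response.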

Step 5: Monitoring Cache Metrics

To monitor cache hits and misses, we’ll add some logging to our endpoints in index.js:

app.get('/:shortId', async (req, res) => {
  const { shortId } = req.params;
  const redisClient = getRedisClient(shortId);

  // node-redis v4+ returns a promise that resolves to null when the key is missing
  const url = await redisClient.get(shortId);
  if (!url) {
    console.log(`Cache miss for key: ${shortId}`);
    return res.status(404).send('URL not found');
  }
  console.log(`Cache hit for key: ${shortId}`);
  res.redirect(url);
});

Here’s what’s going on in this code:

Cache Hits: If a URL is found in the cache, it’s a cache hit.
Cache Misses: If a URL is not found, it’s a cache miss.
This logging will help you monitor the performance of your distributed cache.
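Log lines are hard to aggregate, so the same events can also be counted. The sketch below is an assumption-laden illustration: createCacheMetrics and its methods are hypothetical names, and the counters live in process memory, so they reset on restart and are not shared between server instances.

```javascript
// Hypothetical in-process counters for cache hits and misses.
function createCacheMetrics() {
  let hits = 0;
  let misses = 0;
  return {
    recordHit() { hits += 1; },
    recordMiss() { misses += 1; },
    // Return the current totals and the fraction of lookups that were hits.
    snapshot() {
      const total = hits + misses;
      return { hits, misses, hitRatio: total === 0 ? 0 : hits / total };
    }
  };
}

const metrics = createCacheMetrics();
metrics.recordHit();
metrics.recordHit();
metrics.recordHit();
metrics.recordMiss();
console.log(metrics.snapshot()); // { hits: 3, misses: 1, hitRatio: 0.75 }
```

You would call recordHit() and recordMiss() next to the two console.log() calls. If you expose snapshot() on a route such as /metrics, register that route before /:shortId, because Express matches routes in the order they are defined.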

Step 6: Testing the Application

Start your Redis instances:

docker start redis1 redis2 redis3

Run the Node.js server:

node index.js

Test the endpoints using curl or Postman:

Shorten a URL:

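  curl -X POST http://localhost:3000/shorten \
    -H "Content-Type: application/json" \
    -d '{"url": "https://example.com"}'

Access the shortened URL (replace {shortId} with the ID returned by the previous call):
```
  curl -i http://localhost:3000/{shortId}
```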

Conclusion: What You’ve Learned

Congratulations! You’ve successfully built a scalable URL shortener service with distributed caching using Node.js and Redis. Throughout this tutorial, you’ve learned how to:

Distribute cache entries across multiple Redis instances with hash-based sharding.
Optimize your application with cache invalidation strategies to keep data up-to-date.
Use Docker to simulate a distributed environment with multiple Redis nodes.
Monitor cache hits and misses to optimize performance.

Next Steps:

Add a Database: Store URLs in a database for persistence beyond the cache.
Implement Analytics: Track click counts and analytics for shortened URLs.
Deploy to the Cloud: Deploy your application using Kubernetes for auto-scaling and resilience.

Happy coding!

How to Build a Distributed Rate Limiting System Using Redis and Lua Scripts

Birkaran Sachdev — Tue, 19 Nov 2024 14:39:25 +0000

In this comprehensive guide, you’ll build a distributed rate limiter using Redis and Lua scripting to control user requests in a high-traffic environment.

Rate limiting is crucial in any system to prevent abuse, manage traffic, and protect your resources. By leveraging Redis and Lua, you'll build an efficient, scalable rate limiting system that can handle a large number of requests while keeping your backend services safe.

We will also include an interactive demo where users can simulate traffic, observe rate limits being enforced, and view logs of blocked requests.

What You Will Learn

How to build a rate limiting system using Redis.
How to use Lua scripts with Redis to achieve atomic operations.
Understanding Redis data structures for efficient request tracking.
Techniques for handling high traffic in a distributed system.
Using Docker to simulate and scale a distributed rate limiter.

Prerequisites

Before starting, ensure you have the following installed:

Node.js (v14 or higher)
Redis
Docker (for simulating a distributed environment)
Basic understanding of Node.js, Redis, and Lua scripting.

What You Will Learn
Prerequisites
Project Overview
Step 1: How to Set Up the Project
Step 2: How to Set Up Redis
Step 3: How to Implement the Rate Limiter with Redis and Lua
Step 4: How to Create the Node.js API Server
Step 5: How to Test the Rate Limiter
Step 6: How to Visualize Rate Limiting Metrics
Step 7: How to Deploy with Docker
Conclusion: What You’ve Learned

Project Overview

In this tutorial, you will:

Build a rate limiter using Redis and Lua to enforce request quotas.
Use Lua scripts to ensure atomic operations, avoiding race conditions.
Implement a fixed-window counter, a simpler relative of the token bucket algorithm, for rate limiting.
Create an interactive demo to simulate high traffic and visualize rate limiting in action.

System Architecture

You'll build the system with the following components:

API Server: Handles incoming user requests.
Redis: Stores request data and enforces rate limits.
Lua Scripts: Ensures atomic updates to Redis for rate limiting.
Docker: Simulates a distributed environment with multiple instances.

Step 1: How to Set Up the Project

Let's start by setting up our Node.js project:

mkdir distributed-rate-limiter
cd distributed-rate-limiter
npm init -y

Next, install the required dependencies:

npm install express redis dotenv

express: A lightweight web server framework.
redis: For interacting with Redis.
dotenv: For managing environment variables.

Create a .env file with the following content:

REDIS_HOST=localhost
REDIS_PORT=6379
PORT=3000
RATE_LIMIT=5
TIME_WINDOW=60

These variables define the Redis host, port, rate limit (number of allowed requests), and the time window (in seconds).

Step 2: How to Set Up Redis

Before we dive into the code, ensure that Redis is installed and running on your system. If you don’t have Redis installed, you can use Docker to quickly set it up:

docker run -p 6379:6379 --name redis-rate-limiter -d redis

Step 3: How to Implement the Rate Limiter with Redis and Lua

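To handle rate limiting efficiently, we'll use a fixed-window counter, a simplified relative of the token bucket algorithm. In this approach:

Each user has a counter stored under their own Redis key.
Each request increments the counter, and requests are blocked once it reaches the limit.
The counter expires at the end of the time window, which resets the user's quota all at once.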
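For reference, a textbook token bucket can be sketched in plain JavaScript. The sketch below is an in-memory illustration rather than the Redis-backed limiter built in this tutorial. The class name TokenBucket, its parameters, and the injectable clock are assumptions made for the example. Tokens refill continuously up to a fixed capacity, and each request spends one:

```javascript
class TokenBucket {
  // capacity: maximum number of tokens; refillPerSecond: tokens added per second.
  // `now` returns milliseconds and is injectable so the behaviour can be shown without waiting.
  constructor(capacity, refillPerSecond, now = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start with a full bucket
    this.lastRefill = now();
  }

  // Refill according to elapsed time, then spend one token if available.
  tryConsume() {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = current;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Usage with a fake clock: 5 tokens of capacity, refilling 1 token per second.
let fakeTime = 0;
const bucket = new TokenBucket(5, 1, () => fakeTime);
const firstBurst = Array.from({ length: 6 }, () => bucket.tryConsume());
console.log(firstBurst); // [ true, true, true, true, true, false ]

fakeTime += 2000; // two seconds later, two tokens are back
console.log(bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()); // true true false
```

The gradual refill is the main difference from a counter that resets in one step: after a burst, capacity comes back a little at a time instead of all at once.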

To ensure atomicity and avoid race conditions, we'll use Lua scripting with Redis. Lua scripts in Redis execute atomically, which means they can’t be interrupted by other operations while running.

How to Create a Lua Script for Rate Limiting

Create a file called rate_limiter.lua:

local key = KEYS[1]
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local current = redis.call("get", key)

if current and tonumber(current) >= limit then
    return 0
else
    if current then
        redis.call("incr", key)
    else
        redis.call("set", key, 1, "EX", window)
    end
    return 1
end

Inputs:
- KEYS[1]: The Redis key representing the user’s request count.
- ARGV[1]: The rate limit (maximum number of allowed requests).
- ARGV[2]: The time window (in seconds) for the rate limit.
Logic:
- If the user has reached the rate limit, return 0 (request blocked).
- If the user is within the limit, increment their request count or set a new count with an expiration if it's the first request.
- Return 1 (request allowed).
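The script's logic can be modeled in memory to see how the counter behaves over time. This is a sketch for illustration only: createFixedWindowLimiter and the fake clock are hypothetical names, a Map stands in for Redis, and the example IP address is made up.

```javascript
// In-memory model of rate_limiter.lua. Not a replacement for the Redis script,
// because the state here is local to one process.
function createFixedWindowLimiter(limit, windowSeconds, now = Date.now) {
  const counters = new Map(); // key -> { count, expiresAt }
  return function allow(key) {
    const current = now();
    let entry = counters.get(key);
    if (entry && entry.expiresAt <= current) {
      counters.delete(key); // mirrors Redis expiring the key
      entry = undefined;
    }
    if (entry && entry.count >= limit) return 0; // limit reached: blocked
    if (entry) {
      entry.count += 1; // like INCR, which keeps the existing expiry
    } else {
      // like SET key 1 EX window: the first request starts the window
      counters.set(key, { count: 1, expiresAt: current + windowSeconds * 1000 });
    }
    return 1; // allowed
  };
}

// Usage with a fake clock: 5 requests per 60 seconds.
let clock = 0;
const allow = createFixedWindowLimiter(5, 60, () => clock);
const results = [];
for (let i = 0; i < 7; i++) results.push(allow('203.0.113.7'));
console.log(results); // [ 1, 1, 1, 1, 1, 0, 0 ]

clock += 60000; // the window has passed, so the counter starts over
console.log(allow('203.0.113.7')); // 1
```

One consequence of this design is that a client can send a full quota at the end of one window and another full quota at the start of the next.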

Step 4: How to Create the Node.js API Server

Create a file called server.js:

require('dotenv').config();
const express = require('express');
const redis = require('redis');
const fs = require('fs');
const path = require('path');

const app = express();
// node-redis v4+ (what `npm install redis` installs) expects host and port under `socket`
const client = redis.createClient({
  socket: {
    host: process.env.REDIS_HOST,
    port: Number(process.env.REDIS_PORT)
  }
});
client.on('error', (err) => console.error('Redis client error:', err));

const rateLimitScript = fs.readFileSync(path.join(__dirname, 'rate_limiter.lua'), 'utf8');

const RATE_LIMIT = parseInt(process.env.RATE_LIMIT, 10);
const TIME_WINDOW = parseInt(process.env.TIME_WINDOW, 10);

// Middleware for rate limiting
async function rateLimiter(req, res, next) {
  const ip = req.ip;
  try {
    // Keys and arguments are passed as arrays; arguments must be strings
    const allowed = await client.eval(rateLimitScript, {
      keys: [ip],
      arguments: [String(RATE_LIMIT), String(TIME_WINDOW)]
    });
    if (allowed === 1) {
      next();
    } else {
      res.status(429).json({ message: 'Too many requests. Please try again later.' });
    }
  } catch (err) {
    console.error('Error in rate limiter:', err);
    res.status(500).json({ message: 'Internal server error' });
  }
}

app.use(rateLimiter);

app.get('/', (req, res) => {
  res.send('Welcome to the Rate Limited API!');
});

const PORT = process.env.PORT;
// The client must be connected before it can run commands
client.connect()
  .then(() => {
    app.listen(PORT, () => {
      console.log(`Server running on port ${PORT}`);
    });
  })
  .catch((err) => {
    console.error('Could not connect to Redis:', err);
    process.exit(1);
  });

Rate Limiter Middleware:
- Retrieves the client's IP address and checks if they are within the rate limit using the Lua script.
- If the user exceeds the limit, a 429 response is sent.
API Endpoint:
- The root endpoint is rate-limited, so users can only access it a limited number of times within the specified window.
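
The middleware relies on RATE_LIMIT and TIME_WINDOW being real numbers. If either variable is missing from .env, parseInt returns NaN and that value is passed on to the script. The guard below is one possible way to fall back to a default instead. The helper name readPositiveInt and the default values are assumptions for this sketch, not part of the original project.

```javascript
// Hypothetical helper: read a positive integer from an env-style object,
// falling back to a default when the variable is missing or invalid.
function readPositiveInt(env, name, fallback) {
  const parsed = Number.parseInt(env[name], 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Usage with a stand-in for process.env (the defaults are assumptions).
const sampleEnv = { RATE_LIMIT: "5", TIME_WINDOW: "abc" };
const rateLimit = readPositiveInt(sampleEnv, "RATE_LIMIT", 10);
const timeWindow = readPositiveInt(sampleEnv, "TIME_WINDOW", 60);
console.log(rateLimit, timeWindow); // 5 60
```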

Step 5: How to Test the Rate Limiter

Start Redis:
```
 docker start redis-rate-limiter
```
Run the Node.js Server:
```
 node server.js
```
Simulate Requests:
- Use curl or Postman to test the rate limiter:
```
  curl http://localhost:3000
```
- Send multiple requests rapidly to see rate limiting in action.
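
If you would rather script the burst than repeat curl by hand, a small helper along these lines could work. It is a sketch under a few assumptions: the function name burst is made up, the default fetch is the global one available in Node.js 18 and later, and the stub below only imitates a limit of 3 requests so the example runs without a live server.

```javascript
// Send `count` requests one after another and collect the HTTP status codes.
// `fetchImpl` is injectable so the sketch can run without a live server.
async function burst(url, count, fetchImpl = fetch) {
  const statuses = [];
  for (let i = 0; i < count; i += 1) {
    const response = await fetchImpl(url);
    statuses.push(response.status);
  }
  return statuses;
}

// Stub that mimics a limit of 3 requests per window (an assumption).
let calls = 0;
const stubFetch = async () => ({ status: ++calls <= 3 ? 200 : 429 });

burst("http://localhost:3000", 5, stubFetch).then((statuses) => {
  console.log(statuses); // [ 200, 200, 200, 429, 429 ]
});
```

To run it against the real server, drop the third argument so the global fetch is used.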

Step 6: How to Visualize Rate Limiting Metrics

To monitor rate limiting metrics like allowed and blocked requests, we'll add logging to the middleware in server.js:

async function rateLimiter(req, res, next) {
  const ip = req.ip;
  try {
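    // node-redis v4+ takes keys and arguments as arrays of strings
    const allowed = await client.eval(rateLimitScript, {
      keys: [ip],
      arguments: [String(RATE_LIMIT), String(TIME_WINDOW)]
    });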
    if (allowed === 1) {
      console.log(`Allowed request from ${ip}`);
      next();
    } else {
      console.log(`Blocked request from ${ip}`);
      res.status(429).json({ message: 'Too many requests. Please try again later.' });
    }
  } catch (err) {
    console.error('Error in rate limiter:', err);
    res.status(500).json({ message: 'Internal server error' });
  }
}
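
Log lines are useful for eyeballing traffic, but counters are easier to turn into metrics. The sketch below keeps simple in-process counts of allowed and blocked requests. The createMetrics helper is hypothetical, and because the counts live in one process, several server instances would each report their own numbers.

```javascript
// Hypothetical in-process counters for allowed and blocked requests.
function createMetrics() {
  const counts = { allowed: 0, blocked: 0 };
  return {
    record(allowed) {
      if (allowed) counts.allowed += 1;
      else counts.blocked += 1;
    },
    snapshot() {
      const total = counts.allowed + counts.blocked;
      const blockedRatio = total === 0 ? 0 : counts.blocked / total;
      return { ...counts, total, blockedRatio };
    },
  };
}

// Usage: inside the middleware you would call metrics.record(allowed === 1).
const metrics = createMetrics();
[1, 1, 0, 1].forEach((allowed) => metrics.record(allowed === 1));
console.log(metrics.snapshot());
// { allowed: 3, blocked: 1, total: 4, blockedRatio: 0.25 }
```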

Step 7: How to Deploy with Docker

Let’s containerize the application to run it in a distributed environment.

Create a Dockerfile:

FROM node:14
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "server.js"]

Build and Run the Docker Container:

docker build -t rate-limiter .
docker run -p 3000:3000 rate-limiter

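Inside the container, localhost refers to the container itself, so REDIS_HOST must point to an address where the container can reach Redis (for example, by passing a different REDIS_HOST value to docker run with the -e flag). You can then scale the rate limiter by running multiple instances on different host ports, all pointing at the same Redis server.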

Conclusion: What You’ve Learned

Congratulations! You’ve successfully built a distributed rate limiter using Redis and Lua scripts. Throughout this tutorial, you’ve learned how to:

Implement rate limiting to control user requests in a distributed system.
Use Lua scripts in Redis to perform atomic operations.
Apply a fixed-window counter to manage request quotas.
Monitor rate limiting metrics to optimize performance.
Use Docker to simulate a scalable distributed environment.

Next Steps:

Add Rate Limiting by User ID: Extend the system to support rate limits per user.
Integrate with Nginx: Use Nginx as a reverse proxy with Redis-backed rate limiting.
Deploy with Kubernetes: Scale your rate limiter using Kubernetes for high availability.

Happy coding!

Build a Real-Time Multiplayer Tic-Tac-Toe Game Using WebSockets and Microservices

Birkaran Sachdev — Mon, 18 Nov 2024 21:55:22 +0000

In this tutorial, we’ll build a real-time multiplayer Tic-Tac-Toe game using Node.js, Socket.IO, and Redis. This game allows two players to connect from different browser tabs, take turns playing, and see real-time updates as they play. We'll use Redis to manage game state synchronization across multiple WebSocket servers, making our application scalable.

By the end, you'll have a fully functional game with real-time capabilities and a solid understanding of how to use WebSockets and Redis to build scalable real-time applications.

What You Will Learn

How to use Socket.IO for real-time communication.
How to use Redis Pub/Sub to synchronize game state across multiple clients.
How to set up a scalable WebSocket server architecture.

Prerequisites

Before we start, make sure you have the following installed:

Node.js (v16 or higher)
Redis
Docker (optional, for running Redis in a container)
Basic knowledge of JavaScript, Node.js, and WebSockets.

Project Overview
Step 1: Setting Up Your Development Environment
Step 2: Setting Up the Project
Step 3: Implementing the WebSocket Server with Redis
Step 4: Implement the React Frontend interface
Step 5: Running the Application
Step 6: Viewing Redis Messages in Real-Time
Demo
Conclusion

Project Overview

We'll build a real-time Tic-Tac-Toe game with the following features:

Two players can connect and play a game.
The game board updates in real-time across different browser tabs.
The game announces a winner or declares a draw when the board is full.

We’ll use:

Node.js with Socket.IO for handling WebSocket connections.
Redis Pub/Sub to manage game state synchronization across clients.

Step 1: Setting Up Your Development Environment

Installing Node.js

Ensure you have Node.js installed on your system:

node -v

If you don’t have it installed, download it from Node.js.

Installing Redis

You can install Redis locally or run it in a Docker container.

macOS (Using Homebrew)

First, ensure that you have Homebrew installed on your system before running the commands below:

brew install redis
brew services start redis

Verify that Redis is running with the following command:

redis-cli ping

You should see:

PONG

Using Docker to Run Redis

docker run --name redis-server -p 6379:6379 -d redis

Check if Redis is running using:

docker exec -it redis-server redis-cli ping

Step 2: Setting Up the Project

1. Create the Project Directory

mkdir tic-tac-toe
cd tic-tac-toe
npm init -y

2. Install Dependencies

npm install express socket.io redis dotenv

3. Create Environment Variables

Create a .env file in your project root with the following contents:

PORT=3000
REDIS_HOST=localhost
REDIS_PORT=6379

Step 3: Implementing the WebSocket Server with Redis

In this step, we'll set up a WebSocket server that handles real-time game interactions using Node.js, Socket.IO, and Redis. This server will manage the game state, handle player moves, and ensure synchronization across multiple clients using Redis Pub/Sub.

We'll break down each section of the code so you understand exactly how everything fits together.

Server Code Explanation

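Create a file named server.js and add the following code. Because it uses import statements and top-level await, Node.js must treat the file as an ES module, for example by setting "type": "module" in package.json: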

import dotenv from 'dotenv';
import express from 'express';
import http from 'http';
import { Server } from 'socket.io';
import { createClient } from 'redis';

dotenv.config(); // Load environment variables from .env file

const app = express();
const server = http.createServer(app);
const io = new Server(server, {
  cors: {
    origin: "http://localhost:5173",
    methods: ["GET", "POST"],
  }
});

dotenv: Loads environment variables from a .env file to keep sensitive information like ports and keys secure.
express: Sets up a basic Express server to handle HTTP requests.
http: We create an HTTP server using Node's built-in http module, which we'll use with Socket.IO for WebSocket communication.
Socket.IO: This library enables real-time, bidirectional communication between the server and clients.
CORS Configuration: Allows cross-origin requests from our frontend running on localhost:5173.

Then, to create Redis publisher and subscriber clients, we’ll add the following code to server.js:

// Initialize Redis clients
const pubClient = createClient();
const subClient = createClient();
await pubClient.connect();
await subClient.connect();

We use Redis to handle real-time data synchronization between connected clients.

pubClient: Used to publish messages (like game state updates).
subClient: Subscribes to messages (listens for updates).

connect(): Establishes a connection to the Redis server.

In this pattern, one client publishes updates and the other subscribes to them. Two clients are needed because a Redis connection in subscriber mode is limited to subscription-related commands and cannot issue regular commands such as PUBLISH.

To subscribe to Redis channels for game updates, we’ll add the following code to server.js:

// Subscribe to the Redis channel for game updates
await subClient.subscribe('game-moves', (message) => {
  gameState = JSON.parse(message);
  io.emit('gameState', gameState);
});

subClient.subscribe: Listens for messages on the game-moves channel.
Whenever a player makes a move, the updated game state is published to this Redis channel, and all connected clients are informed of the new state.
The message parameter contains the game state as a string. We parse it into a JavaScript object and broadcast the updated state using Socket.IO.
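
The subscriber trusts whatever arrives on the channel. As a defensive variation, the payload could be checked before it replaces the game state. The sketch below is one way to do that; parseGameState is a hypothetical helper, not part of the original server.

```javascript
// Hypothetical guard: parse a Pub/Sub payload and check its shape.
// Returns the state object, or null when the message is not usable.
function parseGameState(message) {
  let state;
  try {
    state = JSON.parse(message);
  } catch {
    return null; // not valid JSON
  }
  const validCell = (cell) => cell === null || cell === "X" || cell === "O";
  const ok =
    state !== null &&
    typeof state === "object" &&
    Array.isArray(state.board) &&
    state.board.length === 9 &&
    state.board.every(validCell) &&
    typeof state.xIsNext === "boolean";
  return ok ? state : null;
}

const good = parseGameState('{"board":["X",null,null,null,null,null,null,null,null],"xIsNext":false}');
const bad = parseGameState("not json");
console.log(good.xIsNext, bad); // false null
```

In the subscribe callback, a null result would simply be ignored instead of being broadcast.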

Next, to define the game state and functions, we’ll add the following code to server.js:

// Define initial game state
let gameState = {
  board: Array(9).fill(null),
  xIsNext: true,
};

// Function to reset the game
function resetGame() {
  gameState = {
    board: Array(9).fill(null),
    xIsNext: true,
  };
}

gameState: Keeps track of the current state of the board and whose turn it is (xIsNext).
- The board is represented as an array of 9 cells (each can be 'X', 'O', or null).
- The xIsNext flag determines which player's turn it is.
resetGame(): Resets the board and turn indicator to their initial state, allowing for a new game to start.

Next, to handle WebSocket connections, let’s add the following code to server.js:

io.on('connection', (socket) => {
  console.log('New client connected:', socket.id);

  // Send the current game state to the newly connected client
  socket.emit('gameState', gameState);

The io.on('connection') event is triggered when a new client connects.
socket.id: A unique identifier for each connected client.
We immediately send the current gameState to the new client so they can see the current board.

To handle player moves, we’ll add the following code to server.js:

  // Handle player moves
  socket.on('makeMove', (index) => {
    // Prevent making a move if cell is already taken or game is over
    if (gameState.board[index] || calculateWinner(gameState.board)) return;

    // Update the board and switch turns
    gameState.board[index] = gameState.xIsNext ? 'X' : 'O';
    gameState.xIsNext = !gameState.xIsNext;

    // Publish the updated game state to Redis
    pubClient.publish('game-moves', JSON.stringify(gameState));
    io.emit('gameState', gameState);
  });

makeMove: This event is triggered when a player clicks on a cell.
- Validation: We check if the cell is already occupied or if the game has ended before making a move.
- Updating Game State: If the move is valid, we update the board and switch turns.
The updated game state is then:
1. Published to Redis: This ensures that all instances of the server stay in sync.
2. Broadcasted to all clients: This immediately updates the game board for all players.
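
The handler above takes the index sent by the client as-is. One way to make the move logic easier to test, and to reject out-of-range indexes, is to move it into a pure function. This is a sketch only: applyMove is a hypothetical helper, and it leaves out the winner check that the handler performs.

```javascript
// Hypothetical pure helper: returns the next state, or null if the move is invalid.
function applyMove(state, index) {
  const inRange = Number.isInteger(index) && index >= 0 && index < state.board.length;
  if (!inRange || state.board[index] !== null) {
    return null;
  }
  const board = state.board.slice(); // copy so the caller's state is untouched
  board[index] = state.xIsNext ? "X" : "O";
  return { board, xIsNext: !state.xIsNext };
}

const start = { board: Array(9).fill(null), xIsNext: true };
const afterX = applyMove(start, 4);
console.log(afterX.board[4], afterX.xIsNext); // X false
console.log(applyMove(afterX, 4)); // null (cell already taken)
console.log(applyMove(afterX, 42)); // null (index out of range)
```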

To handle game restarts, we’ll add the following code to server.js:

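  // Handle game restarts
  socket.on('restartGame', () => {
    resetGame();
    // Publish the reset state so other server instances stay in sync
    pubClient.publish('game-moves', JSON.stringify(gameState));
    io.emit('gameState', gameState);
  });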

To handle client disconnections, we’ll add the following code to server.js:

  socket.on('disconnect', () => {
    console.log('Client disconnected:', socket.id);
  });
});

Finally, to process the logic of the game, we’ll add the following functions to server.js:

// Function to check if there's a winner
function calculateWinner(board) {
  const lines = [
    [0, 1, 2], [3, 4, 5], [6, 7, 8],
    [0, 3, 6], [1, 4, 7], [2, 5, 8],
    [0, 4, 8], [2, 4, 6]
  ];
  for (let [a, b, c] of lines) {
    if (board[a] && board[a] === board[b] && board[a] === board[c]) {
      return board[a];
    }
  }
  return null;
}

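function isBoardFull(board) {
  return board.every((cell) => cell !== null);
}

// Start the HTTP and WebSocket server (falls back to 3000 if PORT is unset)
const PORT = process.env.PORT || 3000;
server.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
});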

calculateWinner(): Checks if there’s a winning combination on the board.
isBoardFull(): Checks if all cells are filled, indicating a draw.
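
In the snippets shown here, isBoardFull() is defined but not called, and gameState carries no winner field, even though the React frontend in the next step reads gameState.winner. One possible way to close that gap is to derive a status object from the board. The getGameStatus helper below is a hypothetical addition, shown as a sketch rather than the author's implementation.

```javascript
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

// Same winner check as calculateWinner() above.
function calculateWinner(board) {
  for (const [a, b, c] of LINES) {
    if (board[a] && board[a] === board[b] && board[a] === board[c]) {
      return board[a];
    }
  }
  return null;
}

// Hypothetical helper combining both checks into one status object.
function getGameStatus(board) {
  const winner = calculateWinner(board);
  const isDraw = winner === null && board.every((cell) => cell !== null);
  return { winner, isDraw };
}

console.log(getGameStatus(["X", "X", "X", "O", "O", null, null, null, null]));
// { winner: 'X', isDraw: false }
console.log(getGameStatus(["X", "O", "X", "X", "O", "O", "O", "X", "X"]));
// { winner: null, isDraw: true }
```

The server could spread this object into the state it publishes, so that clients receive winner and isDraw along with the board.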

Step 4: Implement the React Frontend interface

In this step, we build a simple and interactive React frontend for our Tic-Tac-Toe game. This frontend allows players to connect to the WebSocket server, make moves, and see the game board update in real-time.

In App.jsx, add the following code:

import React, { useEffect, useState } from 'react';
import io from 'socket.io-client';

const socket = io('http://localhost:3000');

function App() {
  const [gameState, setGameState] = useState({
    board: Array(9).fill(null),
    xIsNext: true,
    winner: null
  });

  useEffect(() => {
    socket.on('gameState', (state) => {
      setGameState(state);
    });

    return () => socket.off('gameState');
  }, []);

  const handleClick = (index) => {
    if (gameState.board[index] || gameState.winner) return;
    socket.emit('makeMove', index);
  };

  const renderCell = (index) => (
    <button key={index} onClick={() => handleClick(index)}>{gameState.board[index]}</button>
  );

  return (
    <div>
      <h1>Multiplayer Tic-Tac-Toe</h1>
      <div className="board">
        {[...Array(9)].map((_, i) => renderCell(i))}
      </div>
      <button onClick={() => socket.emit('restartGame')}>Restart Game</button>
    </div>
  );
}

export default App;

Here is a summary of how the React app is broken down:

WebSocket Connection:
- The frontend establishes a connection to the server using socket.io-client.

State Management:
- The game state (gameState) is managed with React's useState and includes:
  - The board (9 cells).
  - The flag xIsNext to indicate the current player's turn.
  - The winner status.
Real-Time Updates:
- The useEffect hook:
  - Listens for gameState updates from the server.
  - Updates the local game state when changes are detected.
  - Cleans up the WebSocket listener when the component is unmounted.
Handling Player Moves:
- The handleClick function:
  - Checks if a cell is already occupied or if the game has a winner before allowing a move.
  - Sends a makeMove event to the server with the clicked cell index.
Game Board Rendering:
- The renderCell function creates a button for each cell on the board.
- The board is displayed using a 3x3 grid.
Restart Game:
- The "Restart Game" button emits a restartGame event to reset the game board for all players.
User Interface:
- A simple and interactive layout that allows players to take turns and see updates in real-time.

Step 5: Running the Application

Starting the Backend

To start the backend server, open a new terminal window and run the following commands:

cd tic-tac-toe
npm start

Starting the Frontend

To start the React frontend, open a second terminal window and run the commands below. Keep the backend terminal open, since both servers need to run at the same time.

cd tic-tac-toe-client
npm run dev

Accessing the Game

Open your browser and navigate to:

http://localhost:5173

Step 6: Viewing Redis Messages in Real-Time

While the game is running, you can view Redis messages to see real-time game state updates.

Open a terminal and run:

redis-cli
SUBSCRIBE game-moves

This will display game updates:

1) "message"
2) "game-moves"
3) "{\"board\":[\"X\",null,\"O\",null,\"X\",null,null,null,null],\"xIsNext\":false}"

Every time a move is made or the game state changes, the server publishes the updated game state to the game-moves channel. Using redis-cli, you can monitor these updates in real-time, as the game is being played.
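
The JSON payload is compact but hard to read at a glance. As a small illustration, the sketch below parses the sample message shown above and prints it as a 3x3 grid. The formatBoard helper is hypothetical and uses a dot for empty cells.

```javascript
// Hypothetical helper: render a board array as three rows of text.
function formatBoard(board) {
  const rows = [];
  for (let i = 0; i < 9; i += 3) {
    rows.push(board.slice(i, i + 3).map((cell) => cell ?? ".").join(" "));
  }
  return rows.join("\n");
}

// Payload copied from the redis-cli output above.
const payload = '{"board":["X",null,"O",null,"X",null,null,null,null],"xIsNext":false}';
const state = JSON.parse(payload);
console.log(formatBoard(state.board));
// X . O
// . X .
// . . .
console.log(`Next player: ${state.xIsNext ? "X" : "O"}`); // Next player: O
```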

Demo

In this demo, you'll see the Tic Tac Toe game running locally, demonstrating real-time updates as players take turns.

The gameplay showcases features such as turn switching, board updates, and game state announcements (winner or draw). This highlights how the game leverages WebSocket communication to provide a smooth, interactive experience.

Conclusion

Congratulations, you’ve successfully built a real-time multiplayer Tic-Tac-Toe game using Node.js, Socket.IO, and Redis. Here’s what you’ve learned:

Real-time WebSocket communication using Socket.IO.
Game state management using Redis Pub/Sub.
Building a responsive front-end with React.

Next Steps

Add player authentication.
Implement a chat feature.
Deploy your application to a cloud provider for scalability.

Happy coding!

How to Use Queues in Web Applications – Node.js and Redis Tutorial

Zubair Idris Aweda — Thu, 06 Jul 2023 16:23:18 +0000

When you're building large scale web applications, speed is a major priority. Users don't want to wait long for responses anymore, and they shouldn't have to. But some processes take time, and they cannot be made any faster or eliminated.

Message queues help solve this problem by providing an additional branch to the usual request-response journey. This additional branch helps make sure users can get immediate responses, and the time-consuming processes can be done on the side. Everybody goes home happy.

This article will focus on explaining what message queues are and how to get started with them by building a very simple application. You should be familiar with the basics of Node.js, and you should have Redis installed either locally or on a cloud instance. Learn how to install Redis here.

What is a Queue?

A queue is a data structure that allows you to store entities in order. Queues use a first-in-first-out (FIFO) principle.

The concept of queues in computer science is the same as the concept of queues in everyday life where people line up to get things. You join a queue from the back, wait till it is your turn, then leave the queue from the front after you have been attended to.

In computer science, when a process like an API request is running, and you need to remove a certain task (like sending an email) from the current flow, you push it to a queue and continue the process.

The diagram below illustrates the lifecycle of a queue:

Queue Lifecycle | https://optimalbits.github.io/bull/

What is a Job?

A job is any piece of data that is used on a queue, usually a JSON-like object.

As demonstrated in the cover image of this article, you can think of a job as each person on a queue at an airport. Each person holds a briefcase containing specific data, and other instructions (passports and maybe medical papers where required) that will help when it is their turn to be attended to.

New people joining this queue enter from the back (as the last person), and people are attended to from the front. Jobs are processed the same way: each job contains the data needed to process it, new jobs are added at the back, and jobs are taken out from the front.

What is a Job Producer?

A job producer is any piece of code that adds a job to a queue. In real life, this would be the security guard at the airport that gives direction to people, telling them which queue to join for different purposes.

A job producer can exist independently of a job consumer. This means that in a microservice setup, a particular service might just be concerned with adding jobs to a queue, but not how they're processed after.

What is a Worker (Job Consumer)?

A worker, or job consumer, is a process, or function, that can execute a job. Think of a worker as a bank cashier attending to people on a queue at the bank. When the first person comes in, they join the queue as the only one on the queue. The cashier then calls for them and the queue is emptied.

The cashier asks the person for the details needed to process the transaction. While the cashier attends to that customer, four more customers could have lined up. They will remain on the queue till the cashier is done with the first customer before calling for the next one. This is the same process with queue workers: they pick the first job in the queue and process it.
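
The ideas covered so far (queue, job, producer, worker) can be sketched with nothing more than an array. This is only an illustration with made-up sample data. A real queue library adds persistence, concurrency control, and retries on top of the same first-in-first-out behavior.

```javascript
// A plain array used as a FIFO queue: push adds to the back, shift removes from the front.
const queue = [];

// Producer side: enqueue jobs in arrival order (sample data is made up).
queue.push({ id: 1, data: { to: "a@example.com" } });
queue.push({ id: 2, data: { to: "b@example.com" } });
queue.push({ id: 3, data: { to: "c@example.com" } });

// Worker side: take one job at a time from the front until the queue is empty.
const processedIds = [];
while (queue.length > 0) {
  const job = queue.shift();
  processedIds.push(job.id);
}

console.log(processedIds); // [ 1, 2, 3 ]
```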

What are Failed Jobs?

Jobs sometimes fail during processing.

Here are some reasons why a job could fail:

Invalid or missing input data: When data required for a job to be processed is missing, the job will fail. For example, a job to send an email will fail without the recipient's email address.
Timeout: A job could be failed by the queue mechanism if it is taking longer than usual. This could be due to an issue on a dependency of the job or something else, but usually you don't want a single job running forever.
Network or infrastructure problems: These problems are almost out of your control, but they do happen. A database connection error for example would fail a job.
Dependency issues: Sometimes a job relies on some external resources to function well. Whenever these other resources are unavailable or unsuccessful, the job will fail.

When jobs fail, you can configure your queue mechanism to retry them, either immediately or after a calculated delay. It's best to set a maximum number of attempts. Otherwise, a job that can never succeed will be retried forever.
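
To make the retry idea concrete, here is a minimal sketch of retrying with a maximum number of attempts and an exponentially growing delay. The names backoffDelay and runWithRetries are hypothetical, and the sleep function is injected so the example records delays instead of actually waiting.

```javascript
// Exponential backoff: the delay doubles after each failed attempt.
function backoffDelay(attempt, baseMs) {
  return baseMs * 2 ** (attempt - 1);
}

// Run `handler` up to `maxAttempts` times; `sleep` is injectable for testing.
async function runWithRetries(handler, maxAttempts, baseMs, sleep) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return { ok: true, value: await handler(attempt), attempts: attempt };
    } catch (error) {
      if (attempt === maxAttempts) {
        return { ok: false, error, attempts: attempt };
      }
      await sleep(backoffDelay(attempt, baseMs));
    }
  }
}

// Usage: a handler that fails twice, then succeeds. Delays are recorded, not waited.
const delays = [];
const flaky = async (attempt) => {
  if (attempt < 3) throw new Error("temporary failure");
  return "sent";
};
runWithRetries(flaky, 5, 1000, async (ms) => delays.push(ms)).then((result) => {
  console.log(result, delays);
  // { ok: true, value: 'sent', attempts: 3 } [ 1000, 2000 ]
});
```

In practice you rarely write this loop yourself. Bull, the library used later in this article, accepts attempts and backoff options when a job is added.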

Why Use Queues?

Queues are useful for creating robust communication channels between microservices. Multiple services can use the same queue. Different services could be tasked with different problems. When a service completes its task, it can push a job to another service that has workers waiting for that job. That service will pick it up and do whatever is needed with the data.

Queues are also useful for offloading heavy tasks from a process. As you'll see in this article, a time-consuming task like sending an email can be put on a queue to avoid slowing down response time.

Queues help avoid single points of failure. A process that has the ability to fail and can be retried is best processed using a queue where it can be retried after a while.

How to Build a Simple Application that Uses Queues

In this article, we'll build a simple project using Node.js and Redis. We'll use the Bull library as it simplifies a lot of the complexities involved in building a queue system. The project will have a single endpoint to send emails.

Create a New Node.js Project

mkdir nodejs-queue-project
cd nodejs-queue-project
npm init -y

The commands above will create a new folder named nodejs-queue-project and a package.json file in it. The package.json file should look like this:

{
  "name": "nodejs-queue-project",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC"
}

Install the Required Dependencies

npm i express @types/express @types/node body-parser ts-node ts-lint typescript nodemon nodemailer @types/nodemailer

The command above will install the different packages and dependencies required for the project.

After installation, you can update the scripts section of your package.json to have a dev command. Your whole package.json file should look like this now:

{
  "name": "nodejs-queue-project",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "dev": "nodemon src/app.ts"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "@types/express": "^4.17.17",
    "@types/node": "^20.3.3",
    "@types/nodemailer": "^6.4.8",
    "body-parser": "^1.20.2",
    "express": "^4.18.2",
    "nodemailer": "^6.9.3",
    "nodemon": "^2.0.22",
    "ts-lint": "^4.5.1",
    "ts-node": "^10.9.1",
    "typescript": "^5.1.6"
  }
}

The file above lists all your installed dependencies. Running npm run dev executes the dev script, which starts src/app.ts with nodemon.

How to Build the Endpoint

The first thing to do is to create a new folder named src. This folder will contain all your code files. The first file it will contain is the root file of the application — the app.ts file as defined in the package.json file.

We'll use the app.ts file to import the required packages and create a simple server with a single endpoint to send an email, as seen below:

import express from "express";
import bodyParser from "body-parser";
import nodemailer from "nodemailer";

const app = express();

app.use(bodyParser.json());

app.post("/send-email", async (req, res) => {
  const { from, to, subject, text } = req.body;

  // Use a test account as this is a tutorial
  const testAccount = await nodemailer.createTestAccount();

  const transporter = nodemailer.createTransport({
    host: "smtp.ethereal.email",
    port: 587,
    secure: false,
    auth: {
      user: testAccount.user,
      pass: testAccount.pass,
    },
    tls: {
      rejectUnauthorized: false,
    },
  });

  console.log("Sending mail to %s", to);

  let info = await transporter.sendMail({
    from,
    to,
    subject,
    text,
    html: `${text}`,
  });

  console.log("Message sent: %s", info.messageId);
  console.log("Preview URL: %s", nodemailer.getTestMessageUrl(info));

  res.json({
    message: "Email Sent",
  });
});

app.listen(4300, () => {
  console.log("Server started at http://localhost:4300");
});

Now, you can start your server by running npm run dev in your terminal. You should see a message saying Server started at http://localhost:4300 in your terminal.

npm run dev message

You can now test the endpoint using a tool like Postman:

Endpoint testing on Postman

The request took almost 4 seconds as shown in the screenshot. This is very slow for an endpoint. If you take a look at your terminal, you should also see a URL where you can preview the email that was sent.

Opening the link lets you see how the email looks.

Email content

How to Create the Queue

To make the endpoint faster, the email can be queued to be sent later and a response sent to the user immediately.

To do this, install the bull library and its @types library as we'll use it to create a queue. That is:

npm i bull @types/bull

Creating a new queue using bull is as easy as instantiating a new Bull object with a name for the queue:

// This goes at the top of your file
import Bull from 'bull';

const emailQueue = new Bull("email");

When the queue is created with just a name, it uses the default Redis connection, localhost:6379. To use a different Redis instance, pass an options object as the second argument to the Bull constructor:

const emailQueue = new Bull("email", {
  redis: "localhost:6379",
});

At this point, you can create a simple function to serve as a job producer and add a job to the queue every time a request comes in.

type EmailType = {
  from: string;
  to: string;
  subject: string;
  text: string;
};

const sendNewEmail = async (email: EmailType) => {
  // Wait until the job has been stored in Redis before returning
  await emailQueue.add({ ...email });
};

The new sendNewEmail function accepts an EmailType object describing the email to send: the sender's address (from), the recipient's address (to), the subject, and the content (text). It then pushes a new job containing that data to the queue.
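
Since missing input data is one of the common reasons jobs fail, it can help to check the payload before it ever reaches the queue. The sketch below shows one way to do that in plain JavaScript. The validateEmailJob helper is hypothetical, and its address check is deliberately loose.

```javascript
// Hypothetical validator for the job payload; returns a list of problems.
function validateEmailJob(email) {
  const errors = [];
  for (const field of ["from", "to", "subject", "text"]) {
    if (typeof email[field] !== "string" || email[field].trim() === "") {
      errors.push(`${field} is required`);
    }
  }
  // Deliberately loose check; real address validation is more involved.
  if (typeof email.to === "string" && email.to.trim() !== "" && !email.to.includes("@")) {
    errors.push("to must be an email address");
  }
  return errors;
}

console.log(validateEmailJob({ from: "a@example.com", to: "b@example.com", subject: "Hi", text: "Hello" }));
// []
console.log(validateEmailJob({ from: "a@example.com", subject: "Hi", text: "Hello" }));
// [ 'to is required' ]
```

An endpoint could respond with a 400 status when the returned list is not empty, and only add the job otherwise.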

You can use this function instead of sending the email during the request now. Modify the endpoint to do this:

app.post("/send-email", async (req, res) => {
  const { from, to, subject, text } = req.body;

  await sendNewEmail({ from, to, subject, text });

  console.log("Added to queue");

  res.json({
    message: "Email Sent",
  });
});

At this point, the code is simpler and the endpoint responds much faster. The request now takes about 40 ms, roughly 100 times faster than before.

Endpoint testing with Postman

At this point, the email is added to a queue. It will remain on the queue until processed. The job can be processed by the same application or another service (if in a microservice setup).

How to Process the Jobs

The cycle is incomplete and useless if the mails never leave the queue. We'll create a job consumer to process the jobs and clear the queue.

We can do this by creating the logic for a function that accepts a Job object and sends the email:

const processEmailQueue = async (job: Job) => {
  // Use a test account as this is a tutorial
  const testAccount = await nodemailer.createTestAccount();

  const transporter = nodemailer.createTransport({
    host: "smtp.ethereal.email",
    port: 587,
    secure: false,
    auth: {
      user: testAccount.user,
      pass: testAccount.pass,
    },
    tls: {
      rejectUnauthorized: false,
    },
  });

  const { from, to, subject, text } = job.data;

  console.log("Sending mail to %s", to);

  let info = await transporter.sendMail({
    from,
    to,
    subject,
    text,
    html: `${text}`,
  });

  console.log("Message sent: %s", info.messageId);
  console.log("Preview URL: %s", nodemailer.getTestMessageUrl(info));

  return nodemailer.getTestMessageUrl(info);
};

The function above accepts a Job object. The object has useful properties that show the status of a job and the data in it. Here, we use the data property.

At this point, all we have is a function. It doesn't pick up jobs automatically because it doesn't know which queue to work with.

Before connecting it to the queue, you can go on to add a few jobs to the queue by sending some requests. You can check the email jobs currently queued by running this command in your redis-cli:

LRANGE bull:email:wait 0 -1

This checks the email waitlist, and returns the ids of the waiting jobs.

Redis CLI

I have created a few jobs just to show how workers actually work.

Now, connect the worker to the queue by adding this line of code:

emailQueue.process(processEmailQueue);

This is what your app.ts file should look like after that:

import express from "express";
import bodyParser from "body-parser";
import nodemailer from "nodemailer";
import Bull, { Job } from "bull";

const app = express();

app.use(bodyParser.json());

const emailQueue = new Bull("email", {
  redis: "localhost:6379",
});

type EmailType = {
  from: string;
  to: string;
  subject: string;
  text: string;
};

const sendNewEmail = async (email: EmailType) => {
  // Wait until the job has been stored in Redis before returning
  await emailQueue.add({ ...email });
};

const processEmailQueue = async (job: Job) => {
  // Use a test account as this is a tutorial
  const testAccount = await nodemailer.createTestAccount();

  const transporter = nodemailer.createTransport({
    host: "smtp.ethereal.email",
    port: 587,
    secure: false,
    auth: {
      user: testAccount.user,
      pass: testAccount.pass,
    },
    tls: {
      rejectUnauthorized: false,
    },
  });

  const { from, to, subject, text } = job.data;

  console.log("Sending mail to %s", to);

  let info = await transporter.sendMail({
    from,
    to,
    subject,
    text,
    html: `${text}`,
  });

  console.log("Message sent: %s", info.messageId);
  console.log("Preview URL: %s", nodemailer.getTestMessageUrl(info));
};

emailQueue.process(processEmailQueue);

app.post("/send-email", async (req, res) => {
  const { from, to, subject, text } = req.body;

  await sendNewEmail({ from, to, subject, text });

  console.log("Added to queue");

  res.json({
    message: "Email Sent",
  });
});

app.listen(4300, () => {
  console.log("Server started at http://localhost:4300");
});

Once you save, you'll notice that the server restarts and immediately starts sending out mails. This is because the worker sees the queue and begins processing immediately.

Server sending out queued emails

Now, both the producer and the worker are active. Every new API request will be pushed to the queue, and the worker will process it immediately unless there are pending jobs ahead of it.

Summary

I hope this article helped you understand what a message queue is, how to add jobs and create processes to run them, and how you can use them to build better web applications. You can find the code files used in this article on GitHub.

If you have any questions or relevant advice, please get in touch with me to share them.

To read more of my articles or follow my work, you can connect with me on LinkedIn, Twitter, and Github. It’s quick, it’s easy, and it’s free!

How to Use Redis in Your PHP Apps

Zubair Idris Aweda — Wed, 03 May 2023 19:51:55 +0000

Redis is a data store that stores data primarily in memory. It's faster than traditional databases, and has grown quite popular.

In this tutorial, you'll learn the basics of how Redis works, when to use it, how to install it on your device, and how to use it as a caching system in a PHP web application.

What Is Redis?

Redis is a data store – like a database, but one that stores data primarily in-memory. This makes it much faster than traditional databases, where data is stored on disk. Because of this speed, Redis is often used as a caching tool.

Redis can store data in any data type, as it uses a key-value pair system to store data. This is also unlike traditional databases that use documents or rows.

You can think of a Redis database as a big JSON object, where everything in the database is a key-value pair. This means it might not be the best place to store structured data.

You can also use Redis as a database, as it has the ability to write data to disk for persistence. You can configure Redis to persist data either periodically or after every command you issue. When Redis isn't configured to persist data, it is very volatile, and a system crash would result in a loss of data.

Redis is popular in production-level applications and it's used by large companies like Twitter, GitHub, Snapchat, and Stack Overflow.

When to Use Redis

For One Time Passwords (OTPs): These are usually generated to be used once, and have short lifespans. With Redis' ability to set an expiry date for data, you can safely store the OTP without worrying about deleting them after a certain period.
For frequently accessed resources: For data that doesn't change too frequently but is accessed a lot, you can use Redis to save time that would have been spent querying the database or making a call to some external service.
For heavy duty queries: For database queries that take time, and also won't change too often, use Redis to reduce this time by storing the results for as long as you like.

How to Install Redis

You can install Redis on any operating system. Here are the instructions for macOS, Windows Subsystem for Linux, and Linux.

macOS

To install Redis on macOS, run:

brew install redis

Then, run this command to start Redis:

redis-server

Windows Subsystem for Linux and Linux

Redis doesn't officially support the Windows operating system yet, so you can install WSL (Windows Subsystem for Linux) on Windows to have a Linux environment.

To install Redis on Linux, run:

curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/redis.list

sudo apt-get update
sudo apt-get install redis

Then, run this command to start Redis:

sudo service redis-server start

Now that Redis is installed, you can test it by running redis-cli ping. This will output "PONG". Like this:

Testing Redis Installation

Redis Basics

To use Redis as a REPL or as a standalone application, run redis-cli. It will open the REPL environment.

How to Set Data

Use the SET keyword to set a key value pair in Redis. To set a username key to the value Zubs , run this:

SET username Zubs

Setting a key-value pair

How to Get Data

To get the recently saved username key, use the GET keyword like this:

GET username

Getting a value by key

How to Delete Data

You can also delete a previously stored key using the DEL keyword like this:

DEL username

Deleting a value by key

How to Check if a Value Exists

You can check for the existence of a key by using the EXISTS keyword. It returns 0 when the key doesn't exist, and 1 if it does. You can test this by running EXISTS username on the recently deleted username key, which returns 0.

How to Set a Time to Live for Keys

Redis lets you specify how long some key should exist for when creating it. This is one really great feature of Redis. To do this, use the SETEX keyword like this:

SETEX key seconds value

You can check the time to live for a specific key using the TTL keyword. This returns -1 if the key exists but has no expiration set, meaning it will be stored indefinitely. It returns -2 if the key doesn't exist. Otherwise, it returns the remaining time to live in seconds.

You can set an expiration time in seconds for a key previously created without an expiration time using the EXPIRE keyword. For example, create a key to store a variable age with a value of 26.

SET age 26

Then, set an expiration time of 20 seconds for it.

EXPIRE age 20

Run TTL age a few times to watch the remaining time count down. Once the time runs out, the key no longer exists.
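To make the return values of TTL easier to remember, here is a toy in-memory model of the SET, SETEX, EXPIRE, and TTL behavior described above. It is a sketch for illustration only and does not talk to Redis: `TinyTTLStore` and its injectable `clock` are hypothetical names, and the rounding of the remaining time is a simplification.

```python
import math
from typing import Callable, Dict, Optional, Tuple


class TinyTTLStore:
    """Toy model of Redis key expiry (not a Redis client)."""

    def __init__(self, clock: Callable[[], float]):
        self._clock = clock
        # key -> (value, deadline), where deadline None means "no expiry"
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}

    def _purge(self, key: str) -> None:
        # Drop the key if its deadline has passed.
        entry = self._data.get(key)
        if entry and entry[1] is not None and self._clock() >= entry[1]:
            del self._data[key]

    def set(self, key: str, value: str) -> None:
        self._data[key] = (value, None)

    def setex(self, key: str, seconds: int, value: str) -> None:
        self._data[key] = (value, self._clock() + seconds)

    def expire(self, key: str, seconds: int) -> int:
        self._purge(key)
        if key not in self._data:
            return 0
        self._data[key] = (self._data[key][0], self._clock() + seconds)
        return 1

    def ttl(self, key: str) -> int:
        self._purge(key)
        if key not in self._data:
            return -2  # key does not exist
        deadline = self._data[key][1]
        if deadline is None:
            return -1  # key exists but never expires
        return math.ceil(deadline - self._clock())


now = [0.0]  # fake clock so the example does not need to sleep
store = TinyTTLStore(clock=lambda: now[0])

store.set("age", "26")
print(store.ttl("age"))   # -1: no expiration set
store.expire("age", 20)
print(store.ttl("age"))   # 20
now[0] = 5.0
print(store.ttl("age"))   # 15
now[0] = 20.0
print(store.ttl("age"))   # -2: the key has expired
```

The fake clock is only there so the countdown can be shown without waiting. Against a real Redis server you would simply run TTL repeatedly, as described above.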

How to Build a Simple Application with Redis

To help you understand how Redis works, we'll now build a basic web application that uses Redis to cache data and return responses faster. You'll be building a simple application that fetches image data from JSONPlaceholder and returns it.

Create a New PHP Project Using Composer

Create a new folder for the project, change directory into the newly created folder, and run the following command to create a new composer project:

composer init -q

This will create a new composer.json file that should look like this:

{
    "require": {}
}

Next, create a public folder to house your public facing code files. Then create a new index.php file in the folder.

Put in some boilerplate content in the PHP file for now and start a server.



<?php

echo "Hello World!";

php -S localhost:8080

Install a Simple Router and Handle Requests

To complete the project, install a simple PHP router, Altorouter, and a web client, Guzzlehttp.

composer require altorouter/altorouter guzzlehttp/guzzle

Update the index.php to contain this code:



<?php

// Import composer autoload file
require_once __DIR__ . '/../vendor/autoload.php';

// Import GuzzleHttp Client
use GuzzleHttp\Client;

// Instantiate router and web client
$router = new AltoRouter();
$client = new Client();

// Register Sample route
$router->map('GET', '/', function () {
    // Set response Content-Type
    header('Content-Type: application/json; charset=utf-8');

    // Return basic response
    echo json_encode(['data' => 'Hello World']);
});

/**
 * Route to get all photos
 */
$router->map('GET', '/photos', function () use ($client) {
    // Make request to JSONPlaceholder
    $response = $client->request('GET', 'https://jsonplaceholder.typicode.com/photos');

    header('Content-Type: application/json; charset=utf-8');
    echo json_encode([
        'data' => json_decode($response->getBody()->getContents())
    ]);
});

/**
 * Route to get single photo by id
 */
$router->map('GET', '/photos/[i:id]', function (int $id) use ($client) {
    $response = $client->request('GET', 'https://jsonplaceholder.typicode.com/photos/' . $id);

    header('Content-Type: application/json; charset=utf-8');
    echo json_encode([
        'data' => json_decode($response->getBody()->getContents())
    ]);
});

$match = $router->match();

if( is_array($match) && is_callable( $match['target'] ) ) {
    call_user_func_array( $match['target'], $match['params'] );
} else {
    // no route was matched
    header( $_SERVER["SERVER_PROTOCOL"] . ' 404 Not Found');
}

The code is pretty self explanatory. But, here's a breakdown for clarity. From lines 1-11, the required classes GuzzleHttp and AltoRouter are imported and instantiated.

From lines 14-20, the first route is registered, with a simple closure that returns "Hello World!". Lines 25-45 register two more routes, one to fetch all photos, /photos and another to fetch a single photo, /photos/id.

The final lines are required based on documentation of the router package to actually execute the closures set in the routes declaration.

You can test these routes using Postman.

Hello World route

Get All Photos route

Get a Single Photo route

The /photos route takes an average of 1400ms per request. The /photos/id takes an average of 900ms per request.

Install and Instantiate Redis

These times can be reduced by caching the results of the original request to JSONPlaceholder, then returning a response from the cache instead of making a request every time.

To use Redis with PHP, install the PhpRedis extension. This extension provides an API for communicating with Redis. You can easily install it using the command:

pecl install redis

After installation, you can then use this class in your PHP project. Import the class and instantiate it at the top of your index.php file:

$redis = new Redis();
$redis->connect('127.0.0.1');

Having done this, you can now use Redis in your project.

How to Cache Data with Redis

Store the raw JSON response returned from JSONPlaceholder to Redis with an expiry time of 1 hour (3600 seconds).

$response = $client->request('GET', 'https://jsonplaceholder.typicode.com/photos');

$redis->setex(
    'photos',
    3600,
    $response->getBody()->getContents()
);

Here, you create a new key called photos, give it an expiration time of 1 hour, then assign it the raw response gotten from JSONPlaceholder.

But at this point the API still takes a long time to respond. This is because you're only storing this response, you're not using Redis to return the response.

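To fix this, when a new request comes in, check if you have some data previously stored in-memory. If yes, return the data from memory. Otherwise, make a call to JSONPlaceholder.

Update the /photos block to the code below. It uses a REDIS_STANDARD_EXPIRY constant for the one-hour expiry, so define that first near the top of index.php, for example with const REDIS_STANDARD_EXPIRY = 3600;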

/**
 * Route to get all photos
 */
$router->map('GET', '/photos', function () use ($client, $redis) {
    // Check if Redis has the key
    if (!$redis->exists('photos')) {
        $response = $client->request('GET', 'https://jsonplaceholder.typicode.com/photos');

        // Store the data for next use
        $redis->setex(
            'photos',
            REDIS_STANDARD_EXPIRY,
            $response->getBody()->getContents()
        );
    }

    header('Content-Type: application/json; charset=utf-8');
    echo json_encode([
        'data' => json_decode($redis->get('photos'))
    ]);
});

Testing again in Postman shows the improvement: after the first call (which still goes to JSONPlaceholder and fills the cache), the average response time for the /photos route drops to about 20ms. Compared with the original 1400ms, this is an improvement of over 50x. Redis saves a lot of processing time and power.
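The pattern used in the route above is often called cache-aside: look in the cache first, and only call the slow source on a miss. Here is a language-neutral sketch of it in Python. `FakeCache` is a hypothetical in-memory stand-in that only mimics the `get` and `setex` calls (it records the expiry but does not enforce it), and `fetch_photos` stands in for the HTTP call to JSONPlaceholder.

```python
from typing import Callable, Dict, Optional


class FakeCache:
    """Hypothetical stand-in for a Redis client; expiry is recorded, not enforced."""

    def __init__(self) -> None:
        self.values: Dict[str, str] = {}
        self.expiries: Dict[str, int] = {}

    def get(self, key: str) -> Optional[str]:
        return self.values.get(key)

    def setex(self, key: str, seconds: int, value: str) -> None:
        self.values[key] = value
        self.expiries[key] = seconds


def get_or_fetch(cache, key: str, ttl_seconds: int, fetch: Callable[[], str]) -> str:
    """Return the cached value, or fetch it, cache it, and return it."""
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit: the slow call is skipped
    fresh = fetch()
    cache.setex(key, ttl_seconds, fresh)
    return fresh


calls = {"count": 0}


def fetch_photos() -> str:
    # Stand-in for the slow request to the external API.
    calls["count"] += 1
    return '[{"id": 1, "title": "example"}]'


cache = FakeCache()
first = get_or_fetch(cache, "photos", 3600, fetch_photos)
second = get_or_fetch(cache, "photos", 3600, fetch_photos)
print(calls["count"])  # the slow fetch ran only once
```

One design choice worth noting: this sketch does a single `get` and returns the freshly fetched value directly, rather than checking for the key and then reading it in a second step. That avoids the small window in which a key could expire between the existence check and the read.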

Update the /photos/id route to use Redis too:

$router->map('GET', '/photos/[i:id]', function (int $id) use ($client, $redis) {
    if (!$redis->exists('photos:' . $id)) {
        $response = $client->request('GET', 'https://jsonplaceholder.typicode.com/photos/' . $id);

        $redis->setex(
            'photos:' . $id,
            REDIS_STANDARD_EXPIRY,
            $response->getBody()->getContents()
        );
    }

    header('Content-Type: application/json; charset=utf-8');
    echo json_encode([
        'data' => json_decode($redis->get('photos:' . $id))
    ]);
});

The /photos/id route is now also much faster as it takes less than 5ms to get a response, an improvement of over 45x.

Summary

I hope you now understand what Redis is, its basics, and how you can use it to enhance the speed of your PHP web applications. You can find the code files used in this article on GitHub.

If you have any questions or relevant advice, please get in touch with me to share them.

To read more of my articles or follow my work, you can connect with me on LinkedIn, Twitter, and Github. It’s quick, it’s easy, and it’s free!

Pub/Sub in Redis – How to Use the Publish/Subscribe Messaging Pattern

Mihail Gaberov — Fri, 28 Apr 2023 14:33:25 +0000

When you're working on an application that needs to be easily maintainable, scalable, and performant, the Publish/Subscribe messaging pattern is a good choice.

The idea behind it is simple, but powerful. We have senders called publishers. Their sole role is to send or publish messages. They don’t care about who is going to receive them or if someone will receive them at all. They just fire and forget the messages. And they do that via channels.

Think of them as, for example, TV channels. We have Sports channels, Weather Forecasting channels, Cooking channels, and so on. Every publisher sends its messages to a certain channel, and whoever is subscribed to this channel will be able to receive these messages.

Here is where the subscribers come into play. They can subscribe to one or more channels and start receiving the messages broadcast there.

As we already mentioned, the messages are sent and forgotten. This means that if a subscriber subscribes to a certain channel, all the messages that were sent previously in that channel are not going to be available to this subscriber.

Due to the nature of this kind of architecture, we can easily achieve low coupling between the different components and provide a solid foundation for building robust and easy-to-maintain applications.

For example, imagine a situation where we need to replace or improve the publishing part of our system – say add more publishers, more channels or so on. Since the two parts are isolated, meaning publishers don’t care about subscribers and vice versa, we could easily do that without worrying whether we are breaking some other part of the system. We just add the new publishers. Then later, when a subscriber comes to the relevant channels, it just starts using them.

What is Redis?

The initial idea behind Redis was to serve as an in-memory cache solution, as an alternative to its predecessor Memcached.

But nowadays it's a many-in-one solution, providing an in-memory data structure store, key-value database, message brokering, and so on. This makes it a perfect candidate when building an application that needs a really fast caching solution as well as some of the other features mentioned before, especially if the performance of the app is crucial for its regular usage.

Redis performance comparison (source: google)

One of the biggest advantages when using Redis is the huge community and technical resources you can find online. A lot of these resources are free, and there are online platforms that have free tier offerings.

Redis includes in its arsenal a cloud solution as well. If you want to try it yourself, you may go here and register a free account or use their initial coupon offering.

Redis Enterprise Cloud Sign Up / Sign In page

Pub/Sub in Redis

What is pub/sub?

Publish/Subscribe channels are one of the Redis features I haven’t covered in detail above, but they are included in the latest versions of Redis. This is Redis's implementation of the pub/sub messaging pattern, where we have publishers and subscribers that exchange messages via channels.

We'll go briefly through it below and then see it in practice in a small demo app I have prepared for you.

How does Redis pub/sub work?

We have publishers (the producers of messages), channels (that the messages are going through), and subscribers (the receivers of the messages). Who receives what depends solely on who is subscribed to which channel.

Let's see how this works in an example:

Suppose we have created three publishers that publish messages to three different channels. Let’s call them channels 1, 2 and 3. We also have three subscribers, let’s call them subscribers A, B and C.

Now, let’s imagine subscriber A is listening for messages on all three channels, that is, it's subscribed to them. And subscribers B and C are subscribed to channels 2 and 3. This means that when any of the three publishers sends a message, subscriber A will receive it. And subscribers B and C will receive only the messages sent on channels 2 and 3, because they are listening only for messages on these channels.

Notice that we have two entities using a channel – one is sending, the other is receiving – but they are totally independent. And the messages being sent are not persisted. Once they are sent by the publisher they are forgotten. Only the entities that are subscribed at the moment of sending will get them.
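The example above can be modelled with a small in-memory broker. This is a toy sketch of the delivery rules only, not Redis and not the ioredis client: `TinyBroker`, the channel names, and the `inboxes` dictionary are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List


class TinyBroker:
    """Toy fire-and-forget broker: messages reach current subscribers only."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[str]] = defaultdict(list)
        self.inboxes: Dict[str, List[str]] = defaultdict(list)

    def subscribe(self, subscriber: str, channel: str) -> None:
        if subscriber not in self._subscribers[channel]:
            self._subscribers[channel].append(subscriber)

    def publish(self, channel: str, message: str) -> int:
        """Deliver to whoever is subscribed right now; nothing is stored."""
        receivers = self._subscribers.get(channel, [])
        for subscriber in receivers:
            self.inboxes[subscriber].append(f"{channel}: {message}")
        return len(receivers)


broker = TinyBroker()

# Subscriber A listens on all three channels, B and C only on 2 and 3.
for channel in ["channel-1", "channel-2", "channel-3"]:
    broker.subscribe("A", channel)
for subscriber in ["B", "C"]:
    broker.subscribe(subscriber, "channel-2")
    broker.subscribe(subscriber, "channel-3")

print(broker.publish("channel-1", "hello"))  # 1 receiver: only A
print(broker.publish("channel-2", "news"))   # 3 receivers: A, B and C

# A late subscriber does not see messages published before it subscribed.
broker.subscribe("D", "channel-1")
print(broker.inboxes["D"])  # []
```

`publish` returns the number of subscribers that received the message, which mirrors how the Redis PUBLISH command reports the number of clients that got it.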

How to use pub/sub in Redis

There are a plethora of client libraries that you can use with Redis. There is a dedicated page where everybody can go and pick one, depending on the specific project needs or just on your preferred programming language.

People at Redis also marked some of these repositories as recommended which makes the choice easier, if you are new to all this.

For our demo below, I used ioredis, a full-featured Redis client for Node.js. I chose this because the demo app UI is built with React and Node.js and my server code goes pretty well with it.

Redis Pub/Sub Demo

Redis Pub/Sub Visualizer app

Show time!

The idea behind the demo application is to show visually how the pattern works.

What you will see when you open it for the first time is three buttons for publishing simple messages (news) in the three imaginary TV channels: Weather, Sports and Music.

The cards below the publish buttons are the subscribers. Once you move your mouse cursor over any of them, it will flip to its back side and you will see three buttons. You may use each of these buttons to subscribe to the relevant channel.

Once a subscriber is subscribed to a channel and you click on the icon or the publish button for this channel, you will see a sample news item appear on the front side of the card.

Play with different publishers/subscribers combinations and see the result.

I hope this will give you a better understanding of what I explained in the example above.

How to run the demo app locally

In order to install and run the demo application locally, follow the steps below (all commands are considered to be run from the root project directory):

Run frontend:

cd client && yarn && yarn dev

Run backend:

cd server && yarn && yarn start

And finally, use your local installation of Docker (if you don’t have one, you may get it from here) to run this:

docker run -p 6379:6379 redislabs/redismod:preview

That’s probably the easiest way to have a running copy of Redis locally. The other option would be to use Redis Cloud directly and deploy the application online. This is an option I am still investigating and if I manage to do it, I will deploy the whole app and will let you know.

Closing

This article introduced you to the pub/sub messaging pattern. It's important to remember that whenever you want to build a highly performant application with a loosely coupled architecture and real-time messaging features, you should consider using the Publish/Subscribe pattern, and Redis in particular.

In fact a lot of the real life applications that use Redis are dashboard-based. This means that usually there is a nice dashboard screen, showing different data, often being updated in real time.

Imagine, for example, a system showing traffic in a specific area. This kind of software is a perfect candidate for leveraging the advantages of pub/sub. And in many cases this is achieved by using Redis.

In any case, as developers and engineers, we should always be guided by the specific needs of the project we are working on. Whenever we decide to introduce a new pattern or technology, we should do it carefully and back it up with serious research.

The AI Chatbot Handbook – How to Build an AI Chatbot with Redis, Python, and GPT

freeCodeCamp — Wed, 27 Jul 2022 20:16:44 +0000

By Stephen Sanwo

In order to build a working full-stack application, there are so many moving parts to think about. And you'll need to make many decisions that will be critical to the success of your app.

For example, what language will you use and what platform will you deploy on? Are you going to deploy a containerised software on a server, or make use of serverless functions to handle the backend? Do you plan to use third-party APIs to handle complex parts of your application, like authentication or payments? Where do you store the data?

In addition to all this, you'll also need to think about the user interface, design and usability of your application, and much more.

This is why complex large applications require a multifunctional development team collaborating to build the app.

One of the best ways to learn how to develop full stack applications is to build projects that cover the end-to-end development process. You'll go through designing the architecture, developing the API services, developing the user interface, and finally deploying your application.

So this tutorial will take you through the process of building an AI chatbot to help you learn these concepts in depth.

Some of the topics we will cover include:

How to build APIs with Python, FastAPI, and WebSockets
How to build real-time systems with Redis
How to build a chat User Interface with React

Important Note: This is an intermediate full stack software development project that requires some basic Python and JavaScript knowledge.

I've carefully divided the project into sections to ensure that you can easily select the phase that is important to you in case you do not wish to code the full application.

You can download the full repository from my GitHub here.

Section 1

Application Architecture
How to Set Up the Development Environment

Section 2

How to Build a Chat Server with Python, FastAPI, and WebSockets
How to Build Real-Time Systems with Redis
How to Add Intelligence to Chatbots with AI Models

Application Architecture

Sketching out a solution architecture gives you a high-level overview of your application, the tools you intend to use, and how the components will communicate with each other.

I have drawn up a simple architecture below using draw.io:

Fullstack chatbot architecture

Let's go over the various parts of the architecture in more detail:

Client/User Interface

We will use React version 18 to build the user interface. The Chat UI will communicate with the backend via WebSockets.

GPT-J-6B and Huggingface Inference API

GPT-J-6B is a generative language model with 6 billion parameters that performs comparably to OpenAI's GPT-3 on some tasks.

I have chosen to use GPT-J-6B because it is an open-source model and doesn’t require paid tokens for simple use cases.

Huggingface also provides us with an on-demand API to connect with this model pretty much free of charge. You can read more about GPT-J-6B and Hugging Face Inference API.

Redis

When we send prompts to GPT, we need a way to store the prompts and easily retrieve the response. We will use Redis JSON to store the chat data and also use Redis Streams for handling the real-time communication with the Hugging Face Inference API.

Redis is an in-memory key-value store that enables super-fast fetching and storing of JSON-like data. For this tutorial, we will use a managed free Redis storage provided by Redis Enterprise for testing purposes.

Web Sockets and the Chat API

To send messages between the client and server in real-time, we need to open a socket connection. This is because an HTTP connection will not be sufficient to ensure real-time bi-directional communication between the client and the server.

We will be using FastAPI for the chat server, as it provides a fast and modern Python server for our use. Check out the FastAPI documentation to learn more about WebSockets.

How to Set Up the Development Environment

You can use your desired OS to build this app – I am currently using macOS and Visual Studio Code. Just make sure you have Python and Node.js installed.

To set up the project structure, create a folder named fullstack-ai-chatbot. Then create two folders within the project called client and server. The server will hold the code for the backend, while the client will hold the code for the frontend.

Next, initialize a Git repository in the root of the project folder using the "git init" command. Then create a .gitignore file by using "touch .gitignore":

git init
touch .gitignore

In the next section, we will build our chat web server using FastAPI and Python.

How to Build a Chat Server with Python, FastAPI and WebSockets

In this section, we will build the chat server using FastAPI to communicate with the user. We will use WebSockets to ensure bi-directional communication between the client and server so that we can send responses to the user in real-time.

How to Set Up the Python Environment

To start our server, we need to set up our Python environment. Open the project folder within VS Code, and open up the terminal.

From the project root, cd into the server directory and run python3.8 -m venv env. This will create a virtual environment for our Python project, which will be named env. To activate the virtual environment, run source env/bin/activate

Next, install a couple of libraries in your Python environment.

pip install fastapi uvicorn gunicorn websockets python-dotenv aioredis

Next create an environment file by running touch .env in the terminal. We will define our app variables and secret variables within the .env file.

Add your app environment variable and set it to "development" like so: export APP_ENV=development. Next, we will set up a development server with a FastAPI server.

FastAPI Server Setup

At the root of the server directory, create a new file named main.py then paste the code below for the development server:

from fastapi import FastAPI, Request
import uvicorn
import os
from dotenv import load_dotenv

load_dotenv()

api = FastAPI()

@api.get("/test")
async def root():
    return {"msg": "API is Online"}


if __name__ == "__main__":
    if os.environ.get('APP_ENV') == "development":
        uvicorn.run("main:api", host="0.0.0.0", port=3500,
                    workers=4, reload=True)
    else:
        pass

First we import FastAPI and initialize it as api. Then we import load_dotenv from the python-dotenv library, and call it to load the variables from the .env file.

Then we create a simple test route to test the API. The test route will return a simple JSON response that tells us the API is online.

Lastly, we set up the development server by using uvicorn.run and providing the required arguments. The API will run on port 3500.

Finally, run the server in the terminal with python main.py. Once you see Application startup complete in the terminal, navigate to the URL http://localhost:3500/test on your browser, and you should get a web page like this:

API Test Page

How to Add Routes to the API

In this section, we will add routes to our API. Create a new folder named src. This is the directory where all our API code will live.

Create a subfolder named routes, cd into the folder, create a new file named chat.py and then add the code below:

import os
from fastapi import APIRouter, FastAPI, WebSocket,  Request

chat = APIRouter()

# @route   POST /token
# @desc    Route to generate chat token
# @access  Public

@chat.post("/token")
async def token_generator(request: Request):
    return None


# @route   POST /refresh_token
# @desc    Route to refresh token
# @access  Public

@chat.post("/refresh_token")
async def refresh_token(request: Request):
    return None


# @route   Websocket /chat
# @desc    Socket for chatbot
# @access  Public

@chat.websocket("/chat")
async def websocket_endpoint(websocket: WebSocket):
    return None

We created three endpoints:

/token will issue the user a session token for access to the chat session. Since the chat app will be open publicly, we do not want to worry about authentication and just keep it simple – but we still need a way to identify each unique user session.
/refresh_token will get the session history for the user if the connection is lost, as long as the token is still active and not expired.
/chat will open a WebSocket to send messages between the client and server.

Next, connect the chat route to our main API. First we need to import chat from src.routes.chat within our main.py file. Then we will include the router by calling the include_router method on the initialized FastAPI instance and passing chat as the argument.

Update your main.py code as shown below:

from fastapi import FastAPI, Request
import uvicorn
import os
from dotenv import load_dotenv
from src.routes.chat import chat

load_dotenv()

api = FastAPI()
api.include_router(chat)


@api.get("/test")
async def root():
    return {"msg": "API is Online"}


if __name__ == "__main__":
    if os.environ.get('APP_ENV') == "development":
        uvicorn.run("main:api", host="0.0.0.0", port=3500,
                    workers=4, reload=True)
    else:
        pass

How to Generate a Chat Session Token with UUID

To generate a user token we will use uuid4 to create dynamic routes for our chat endpoint. Since this is a publicly available endpoint, we won't need to go into details about JWTs and authentication.

The uuid module is part of the Python standard library, so there is nothing extra to install. In chat.py, import uuid, and update the /token route with the code below:


from fastapi import APIRouter, FastAPI, WebSocket,  Request, BackgroundTasks, HTTPException
import uuid

# @route   POST /token
# @desc    Route generating chat token
# @access  Public

@chat.post("/token")
async def token_generator(name: str, request: Request):

    if name == "":
        raise HTTPException(status_code=400, detail={
            "loc": "name",  "msg": "Enter a valid name"})

    token = str(uuid.uuid4())

    data = {"name": name, "token": token}

    return data

In the code above, the client provides their name, which is required. We do a quick check to ensure that the name field is not empty, then generate a token using uuid4.

The session data is a simple dictionary for the name and token. Ultimately we will need to persist this session data and set a timeout, but for now we just return it to the client.
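If you want to check the token logic on its own, outside of FastAPI, the following sketch isolates it. `create_session` is a hypothetical helper that mirrors the route above but raises a plain `ValueError` instead of an `HTTPException`, and it also rejects names that contain only whitespace, which is an assumption added here rather than something the route does.

```python
import uuid


def create_session(name: str) -> dict:
    """Build the session data for a user, mirroring the /token route."""
    if name.strip() == "":
        # Assumption: whitespace-only names are treated as empty too.
        raise ValueError("Enter a valid name")
    token = str(uuid.uuid4())  # random version 4 UUID
    return {"name": name, "token": token}


session = create_session("Stephen")
print(session["name"])
print(len(session["token"]))  # 36 characters, including four hyphens

# Parsing the token back confirms it is a valid version 4 UUID.
parsed = uuid.UUID(session["token"])
print(parsed.version)  # 4
```

Each call produces a new random token, so two sessions for the same name still get different identifiers.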

How to Test the API with Postman

Because we will be testing a WebSocket endpoint, we need to use a tool like Postman that allows this (as the default Swagger docs on FastAPI do not support WebSockets).

In Postman, create a collection for your development environment and send a POST request to localhost:3500/token specifying the name as a query parameter and passing it a value. You should get a response as shown below:

Token Generator Postman

Websockets and Connection Manager

In the src root, create a new folder named socket and add a file named connection.py. In this file, we will define the class that controls the connections to our WebSockets, and all the helper methods to connect and disconnect.

In connection.py add the code below:


from typing import List

from fastapi import WebSocket

class ConnectionManager:
    def __init__(self):
        self.active_connections: List[WebSocket] = []

    async def connect(self, websocket: WebSocket):
        await websocket.accept()
        self.active_connections.append(websocket)

    def disconnect(self, websocket: WebSocket):
        self.active_connections.remove(websocket)

    async def send_personal_message(self, message: str, websocket: WebSocket):
        await websocket.send_text(message)

The ConnectionManager class is initialized with an active_connections attribute that is a list of active connections.

Then the asynchronous connect method will accept a WebSocket and add it to the list of active connections, while the disconnect method will remove the Websocket from the list of active connections.

Lastly, the send_personal_message method will take in a message and the Websocket we want to send the message to and asynchronously send the message.
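To see how these three methods behave without starting a server, you can exercise the same logic against a fake socket. In the sketch below, `FakeWebSocket` is a hypothetical stand-in that only implements the two methods the manager calls (`accept` and `send_text`), and the manager is re-declared without the FastAPI type hints so the snippet runs on its own.

```python
import asyncio
from typing import List


class FakeWebSocket:
    """Hypothetical stand-in for fastapi.WebSocket, for local experiments."""

    def __init__(self) -> None:
        self.accepted = False
        self.sent: List[str] = []

    async def accept(self) -> None:
        self.accepted = True

    async def send_text(self, message: str) -> None:
        self.sent.append(message)


class ConnectionManager:
    """Same logic as the manager above, minus the FastAPI imports."""

    def __init__(self) -> None:
        self.active_connections: list = []

    async def connect(self, websocket) -> None:
        await websocket.accept()
        self.active_connections.append(websocket)

    def disconnect(self, websocket) -> None:
        self.active_connections.remove(websocket)

    async def send_personal_message(self, message: str, websocket) -> None:
        await websocket.send_text(message)


async def demo():
    manager = ConnectionManager()
    first, second = FakeWebSocket(), FakeWebSocket()
    await manager.connect(first)
    await manager.connect(second)
    # Only the targeted socket receives the message.
    await manager.send_personal_message("hello", first)
    manager.disconnect(second)
    return manager, first, second


manager, first, second = asyncio.run(demo())
print(first.sent)                       # ['hello']
print(second.sent)                      # []
print(len(manager.active_connections))  # 1
```

Note that `send_personal_message` writes to one socket only. Broadcasting to every connection would need a loop over `active_connections`, which this manager does not define.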

WebSockets are a very broad topic and we only scratched the surface here. This should however be sufficient to create multiple connections and handle messages to those connections asynchronously.

You can read more about FastAPI Websockets and Sockets Programming.

To use the ConnectionManager, import and initialize it within src/routes/chat.py, and update the /chat WebSocket route with the code below:

from fastapi import WebSocketDisconnect

from ..socket.connection import ConnectionManager

manager = ConnectionManager()

@chat.websocket("/chat")
async def websocket_endpoint(websocket: WebSocket):
    # Register the new connection with the manager
    await manager.connect(websocket)
    try:
        while True:
            # Wait for the next message from the client
            data = await websocket.receive_text()
            print(data)
            await manager.send_personal_message("Response: Simulating response from the GPT service", websocket)

    except WebSocketDisconnect:
        # Remove the socket once the client disconnects
        manager.disconnect(websocket)

In the websocket_endpoint function, which takes a WebSocket, we add the new websocket to the connection manager and run a while True loop to ensure that the socket stays open until the client disconnects. When that happens, a WebSocketDisconnect exception is raised and we remove the socket from the manager.

While the connection is open, we receive any messages sent by the client with websocket.receive_text() and print them to the terminal for now.

Then we send a hard-coded response back to the client for now. Ultimately the message received from the clients will be sent to the AI Model, and the response sent back to the client will be the response from the AI Model.

In Postman, we can test this endpoint by creating a new WebSocket request, and connecting to the WebSocket endpoint localhost:3500/chat.

When you click connect, the Messages pane will show that the API client is connected to the URL, and a socket is open.

To test this, send a message "Hello Bot" to the chat server, and you should get an immediate test response "Response: Simulating response from the GPT service" as shown below:

Postman Chat Test

Dependency Injection in FastAPI

To be able to distinguish between two different client sessions and limit the chat sessions, we will use a timed token, passed as a query parameter to the WebSocket connection.

In the socket folder, create a file named utils.py then add the code below:

from fastapi import WebSocket, status, Query
from typing import Optional

async def get_token(
    websocket: WebSocket,
    token: Optional[str] = Query(None),
):
    if token is None or token == "":
        await websocket.close(code=status.WS_1008_POLICY_VIOLATION)

    return token

The get_token function receives a WebSocket and a token, then checks whether the token is None or an empty string.

If it is, the function closes the connection with a policy violation status code. Otherwise, it simply returns the token. We will extend this function later with additional token validation.
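The check itself is easy to exercise in isolation. The sketch below mirrors the logic of get_token against a hypothetical FakeWebSocket that records the close code instead of closing a real socket. The constant 1008 is the numeric value behind status.WS_1008_POLICY_VIOLATION.

```python
import asyncio
from typing import Optional

WS_1008_POLICY_VIOLATION = 1008  # same numeric value as fastapi.status.WS_1008_POLICY_VIOLATION


class FakeWebSocket:
    """Hypothetical stand-in that records the close code instead of closing a real socket."""

    def __init__(self):
        self.close_code: Optional[int] = None

    async def close(self, code: int = 1000):
        self.close_code = code


async def get_token(websocket: FakeWebSocket, token: Optional[str] = None):
    # Missing or empty tokens are rejected by closing the socket
    if token is None or token == "":
        await websocket.close(code=WS_1008_POLICY_VIOLATION)
    return token


async def check(token: Optional[str]):
    """Run get_token once and report (returned token, recorded close code)."""
    ws = FakeWebSocket()
    result = await get_token(ws, token)
    return result, ws.close_code
```

Calling asyncio.run(check("abc")) leaves the socket open, while a missing or empty token records the policy violation code.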

To consume this function, we inject it into the /chat route. FastAPI provides Depends to easily inject dependencies, so we don't have to tinker with decorators.

Update the /chat route to the following:

from ..socket.utils import get_token

@chat.websocket("/chat")
async def websocket_endpoint(websocket: WebSocket, token: str = Depends(get_token)):
    await manager.connect(websocket)
    try:
        while True:
            data = await websocket.receive_text()
            print(data)
            await manager.send_personal_message(f"Response: Simulating response from the GPT service", websocket)

    except WebSocketDisconnect:
        manager.disconnect(websocket)

Now when you try to connect to the /chat endpoint in Postman without a token, you will get a 403 error. Add a token query parameter with any value for now, and you should be able to connect as before, only now the connection requires a token.

Postman Chat Test with Token

Congratulations on getting this far! Your chat.py file should now look like this:

import os
from fastapi import APIRouter, FastAPI, WebSocket, WebSocketDisconnect, Request, Depends, HTTPException
import uuid
from ..socket.connection import ConnectionManager
from ..socket.utils import get_token


chat = APIRouter()

manager = ConnectionManager()

# @route   POST /token
# @desc    Route to generate chat token
# @access  Public


@chat.post("/token")
async def token_generator(name: str, request: Request):
    token = str(uuid.uuid4())

    if name == "":
        raise HTTPException(status_code=400, detail={
            "loc": "name",  "msg": "Enter a valid name"})

    data = {"name": name, "token": token}

    return data


# @route   POST /refresh_token
# @desc    Route to refresh token
# @access  Public


@chat.post("/refresh_token")
async def refresh_token(request: Request):
    return None


# @route   Websocket /chat
# @desc    Socket for chatbot
# @access  Public

@chat.websocket("/chat")
async def websocket_endpoint(websocket: WebSocket, token: str = Depends(get_token)):
    await manager.connect(websocket)
    try:
        while True:
            data = await websocket.receive_text()
            print(data)
            await manager.send_personal_message(f"Response: Simulating response from the GPT service", websocket)

    except WebSocketDisconnect:
        manager.disconnect(websocket)

In the next part of this tutorial, we will focus on handling the state of our application and passing data between client and server.

How to Build Real-Time Systems with Redis

Our application currently does not store any state, and there is no way to identify users or store and retrieve chat data. We are also returning a hard-coded response to the client during chat sessions.

In this part of the tutorial, we will cover the following:

How to connect to a Redis Cluster in Python and set up a Redis Client
How to store and retrieve data with Redis JSON
How to set up Redis Streams as message queues between a web server and worker environment

Redis and Distributed Messaging Queues

Redis is an open source in-memory data store that you can use as a database, cache, message broker, and streaming engine. It supports a number of data structures and is a perfect solution for distributed applications with real-time capabilities.

Redis Enterprise Cloud is a fully managed cloud service provided by Redis that helps us deploy Redis clusters at scale without worrying about infrastructure.

We will be using a free Redis Enterprise Cloud instance for this tutorial. You can get started with Redis Cloud for free here and follow this tutorial to set up a Redis database and Redis Insight, a GUI to interact with Redis.

Once you have set up your Redis database, create a new folder in the project root (outside the server folder) named worker.

We will isolate our worker environment from the web server so that when the client sends a message to our WebSocket, the web server does not have to handle the request to the third-party service. Also, resources can be freed up for other users.

The background communication with the inference API is handled by this worker service, through Redis.

Requests from all the connected clients are appended to the message queue (producer), while the worker consumes the messages, sends off the requests to the inference API, and appends the response to a response queue.

Once the API receives a response, it sends it back to the client.

During the trip between the producer and the consumer, the client can send multiple messages, and these messages will be queued up and responded to in order.
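The flow described above can be sketched with two in-process queues. This is only an analogy: asyncio.Queue stands in for the Redis streams, and fake_inference is a hypothetical placeholder for the call to the inference API.

```python
import asyncio


async def fake_inference(text: str) -> str:
    """Hypothetical stand-in for the third-party inference API."""
    await asyncio.sleep(0.01)
    return f"echo: {text}"


async def worker(message_queue: asyncio.Queue, response_queue: asyncio.Queue):
    # Consume messages in arrival order and publish each reply
    while True:
        token, text = await message_queue.get()
        if text is None:  # sentinel used only to stop this sketch
            break
        reply = await fake_inference(text)
        await response_queue.put((token, reply))


async def run_chat(messages):
    message_queue, response_queue = asyncio.Queue(), asyncio.Queue()
    task = asyncio.create_task(worker(message_queue, response_queue))
    # The web server (producer) only enqueues; it never waits on the model
    for text in messages:
        await message_queue.put(("token-1", text))
    await message_queue.put(("token-1", None))
    await task
    replies = []
    while not response_queue.empty():
        replies.append(response_queue.get_nowait()[1])
    return replies
```

Because the worker drains the queue one message at a time, the replies come back in the order the messages were sent.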

Ideally, we could have this worker running on a completely different server, in its own environment, but for now, we will create its own Python environment on our local machine.

You might be wondering – why do we need a worker? Imagine a scenario where the web server also creates the request to the third-party service. This means that while waiting for the response from the third party service during a socket connection, the server is blocked and resources are tied up till the response is obtained from the API.

You can try this out by adding a blocking call such as time.sleep(10) before sending the hard-coded response, and then sending a new message. Then try to connect with a different token in a new Postman session.

You will notice that the chat session will not connect until the sleep finishes.
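The effect is easy to reproduce outside the chat app. In this sketch, three concurrent tasks each wait 0.1 seconds: time.sleep blocks the event loop so the waits run back to back, while asyncio.sleep yields control so they overlap. The function names are illustrative.

```python
import asyncio
import time


async def blocking_handler():
    time.sleep(0.1)  # blocks the whole event loop


async def non_blocking_handler():
    await asyncio.sleep(0.1)  # yields to other tasks while waiting


async def timed(handler, clients: int = 3) -> float:
    """Run `clients` handlers concurrently and return the elapsed wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(clients)))
    return time.perf_counter() - start


def compare():
    blocked = asyncio.run(timed(blocking_handler))
    overlapped = asyncio.run(timed(non_blocking_handler))
    return blocked, overlapped
```

The blocking version takes roughly the sum of all waits, which is why one slow request can stall every other client on the same server process.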

While we can use asynchronous techniques and worker pools in a more production-focused server set-up, that also won't be enough as the number of simultaneous users grows.

Ultimately, we want to avoid tying up the web server resources by using Redis to broker the communication between our chat API and the third-party API.

Next open up a new terminal, cd into the worker folder, and create and activate a new Python virtual environment similar to what we did in part 1.

Next, install the following dependencies:

pip install aiohttp aioredis python-dotenv requests

How to Connect to a Redis Cluster in Python with a Redis Client

We will use the aioredis client to connect with the Redis database. We'll also use the requests library to send requests to the Huggingface inference API.

Create two files, .env and main.py. Then create a folder named src. Inside src, create a folder named redis and add a new file named config.py.

In the .env file, add the following code – and make sure you update the fields with the credentials provided in your Redis Cluster.

export REDIS_URL=
export REDIS_USER=
export REDIS_PASSWORD=
export REDIS_HOST=
export REDIS_PORT=

In config.py add the Redis Class below:

import os
from dotenv import load_dotenv
import aioredis

load_dotenv()

class Redis():
    def __init__(self):
        """initialize  connection """
        self.REDIS_URL = os.environ['REDIS_URL']
        self.REDIS_PASSWORD = os.environ['REDIS_PASSWORD']
        self.REDIS_USER = os.environ['REDIS_USER']
        self.connection_url = f"redis://{self.REDIS_USER}:{self.REDIS_PASSWORD}@{self.REDIS_URL}"

    async def create_connection(self):
        self.connection = aioredis.from_url(
            self.connection_url, db=0)

        return self.connection

We create a Redis object and initialize the required parameters from the environment variables. Then we create an asynchronous method create_connection to create a Redis connection and return the connection pool obtained from the aioredis method from_url.
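One thing to watch with a hand-built connection URL: if the password contains characters such as @, : or /, the URL can be parsed incorrectly. A minimal sketch, assuming the client decodes percent-encoded credentials (worth verifying for your aioredis version), is to quote the credentials first. build_redis_url is a hypothetical helper, and the host and password are made up.

```python
from urllib.parse import quote, unquote, urlsplit


def build_redis_url(user: str, password: str, host_and_port: str) -> str:
    """Percent-encode the credentials so reserved characters cannot break the URL."""
    return f"redis://{quote(user, safe='')}:{quote(password, safe='')}@{host_and_port}"


url = build_redis_url("default", "p@ss:word/1", "example-host:6379")
parts = urlsplit(url)  # parse it back to confirm host and port survive intact
```

Parsing the result back with urlsplit shows the host and port are recovered correctly, and unquoting the password returns the original value.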

Next, we test the Redis connection in main.py by running the code below. This will create a new Redis connection pool, set a simple key "key", and assign a string "value" to it.


from src.redis.config import Redis
import asyncio

async def main():
    redis = Redis()
    redis = await redis.create_connection()
    print(redis)
    await redis.set("key", "value")

if __name__ == "__main__":
    asyncio.run(main())

Now open Redis Insight (if you followed the tutorial to download and install it). You should see something like this:

Redis Insight Test

How to Work with Redis Streams

Now that we have our worker environment set up, we can create a producer on the web server and a consumer on the worker.

First, let's create our Redis class again on the server. In server.src create a folder named redis and add two files, config.py and producer.py.

In config.py, add the code below as we did for the worker environment:

import os
from dotenv import load_dotenv
import aioredis

load_dotenv()

class Redis():
    def __init__(self):
        """initialize  connection """
        self.REDIS_URL = os.environ['REDIS_URL']
        self.REDIS_PASSWORD = os.environ['REDIS_PASSWORD']
        self.REDIS_USER = os.environ['REDIS_USER']
        self.connection_url = f"redis://{self.REDIS_USER}:{self.REDIS_PASSWORD}@{self.REDIS_URL}"

    async def create_connection(self):
        self.connection = aioredis.from_url(
            self.connection_url, db=0)

        return self.connection

In the .env file, also add the Redis credentials:

export REDIS_URL=
export REDIS_USER=
export REDIS_PASSWORD=
export REDIS_HOST=
export REDIS_PORT=

Finally, in server.src.redis.producer.py add the following code:


from .config import Redis

class Producer:
    def __init__(self, redis_client):
        self.redis_client = redis_client

    async def add_to_stream(self,  data: dict, stream_channel):
        try:
            msg_id = await self.redis_client.xadd(name=stream_channel, id="*", fields=data)
            print(f"Message id {msg_id} added to {stream_channel} stream")
            return msg_id

        except Exception as e:
            print(f"Error sending msg to stream => {e}")

We created a Producer class that is initialized with a Redis client. We use this client to add data to the stream with the add_to_stream method, which takes the data and the Redis channel name.

The Redis command for adding data to a stream channel is xadd and it has both high-level and low-level functions in aioredis.
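Passing id="*" asks Redis to generate the entry ID, which has the form <milliseconds>-<sequence> and always increases within a stream. The sketch below imitates that behaviour in memory so the ordering guarantee is visible. FakeStream is a hypothetical stand-in, not an aioredis class.

```python
import time
from typing import Dict, List, Tuple


class FakeStream:
    """In-memory imitation of a Redis stream with auto-generated IDs."""

    def __init__(self):
        self.entries: List[Tuple[str, Dict[str, str]]] = []
        self._last_ms = 0
        self._last_seq = 0

    def xadd(self, fields: Dict[str, str]) -> str:
        ms = int(time.time() * 1000)
        if ms <= self._last_ms:
            # Same (or earlier) millisecond: keep the timestamp and bump the sequence
            ms, seq = self._last_ms, self._last_seq + 1
        else:
            seq = 0
        self._last_ms, self._last_seq = ms, seq
        entry_id = f"{ms}-{seq}"
        self.entries.append((entry_id, fields))
        return entry_id


def id_key(entry_id: str) -> Tuple[int, int]:
    """Split an ID into (milliseconds, sequence) so IDs can be compared numerically."""
    ms, seq = entry_id.split("-")
    return int(ms), int(seq)


stream = FakeStream()
ids = [stream.xadd({"token-1": f"Hello {i}"}) for i in range(5)]
```

Even when several messages land in the same millisecond, the sequence part keeps the IDs unique and ordered.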

Next, to run our newly created Producer, update chat.py and the WebSocket /chat endpoint like below. Notice the updated channel name message_channel.


from ..redis.producer import Producer
from ..redis.config import Redis

chat = APIRouter()
manager = ConnectionManager()
redis = Redis()


@chat.websocket("/chat")
async def websocket_endpoint(websocket: WebSocket, token: str = Depends(get_token)):
    await manager.connect(websocket)
    redis_client = await redis.create_connection()
    producer = Producer(redis_client)

    try:
        while True:
            data = await websocket.receive_text()
            print(data)
            stream_data = {}
            stream_data[token] = data
            await producer.add_to_stream(stream_data, "message_channel")
            await manager.send_personal_message(f"Response: Simulating response from the GPT service", websocket)

    except WebSocketDisconnect:
        manager.disconnect(websocket)

Next, in Postman, create a connection and send any number of messages that say Hello. You should have the stream messages printed to the terminal like below:

Terminal Channel Messages Test

In Redis Insight, you will see a new message_channel created and a time-stamped queue filled with the messages sent from the client. This time-stamped queue is important to preserve the order of the messages.

Redis Insight Channel

How to Model the Chat Data

Next, we'll create a model for our chat messages. Recall that we are sending text data over WebSockets, but our chat data needs to hold more information than just the text. We need to timestamp when the chat was sent, create an ID for each message, and collect data about the chat session, then store this data in a JSON format.

We can store this JSON data in Redis so we don't lose the chat history once the connection is lost, because our WebSocket does not store state.

In server.src create a new folder named schema. Then create a file named chat.py in server.src.schema and add the following code:

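from datetime import datetime
from pydantic import BaseModel, Field
from typing import List, Optional
import uuid


class Message(BaseModel):
    # default_factory runs on every instantiation, so each message gets its
    # own id and timestamp instead of sharing values computed at import time
    id: str = Field(default_factory=lambda: str(uuid.uuid4()))
    msg: str
    timestamp: str = Field(default_factory=lambda: str(datetime.now()))


class Chat(BaseModel):
    token: str
    messages: List[Message]
    name: str
    session_start: str = Field(default_factory=lambda: str(datetime.now()))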

We are using Pydantic's BaseModel class to model the chat data. The Chat class will hold data about a single Chat session. It will store the token, name of the user, and an automatically generated timestamp for the chat session start time using datetime.now().

The messages sent and received within this chat session are stored with a Message class which creates a chat id on the fly using uuid4. The only data we need to provide when initializing this Message class is the message text.
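One subtlety with generated defaults such as the id and timestamps above: a default written as a plain expression in the class body is evaluated once, when the class is defined, whereas a factory is called for every new instance. The sketch illustrates the difference with the standard library's dataclasses as a stand-in for Pydantic (Pydantic's Field also accepts default_factory). The class names are illustrative.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class SharedIdMessage:
    msg: str
    # Evaluated once at class definition: every instance reuses this value
    id: str = str(uuid.uuid4())


@dataclass
class UniqueIdMessage:
    msg: str
    # Evaluated on each instantiation: every instance gets a fresh value
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


a, b = SharedIdMessage("hi"), SharedIdMessage("there")
c, d = UniqueIdMessage("hi"), UniqueIdMessage("there")
```

Here a.id and b.id are identical, while c.id and d.id differ, which is the behaviour you want for message IDs.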

How to Work with Redis JSON

In order to use Redis JSON's ability to store our chat history, we need to install rejson provided by Redis labs.

In the terminal, cd into server and install rejson with pip install rejson. Then update your Redis class in server.src.redis.config.py to include the create_rejson_connection method:


import os
from dotenv import load_dotenv
import aioredis
from rejson import Client

load_dotenv()

class Redis():
    def __init__(self):
        """initialize  connection """
        self.REDIS_URL = os.environ['REDIS_URL']
        self.REDIS_PASSWORD = os.environ['REDIS_PASSWORD']
        self.REDIS_USER = os.environ['REDIS_USER']
        self.connection_url = f"redis://{self.REDIS_USER}:{self.REDIS_PASSWORD}@{self.REDIS_URL}"
        self.REDIS_HOST = os.environ['REDIS_HOST']
        self.REDIS_PORT = os.environ['REDIS_PORT']

    async def create_connection(self):
        self.connection = aioredis.from_url(
            self.connection_url, db=0)

        return self.connection

    def create_rejson_connection(self):
        self.redisJson = Client(host=self.REDIS_HOST,
                                port=self.REDIS_PORT, decode_responses=True, username=self.REDIS_USER, password=self.REDIS_PASSWORD)

        return self.redisJson

We are adding the create_rejson_connection method to connect to Redis with the rejson Client. This gives us the methods to create and manipulate JSON data in Redis, which are not available with aioredis.

Next, in server.src.routes.chat.py we can update the /token endpoint to create a new Chat instance and store the session data in Redis JSON like so:

@chat.post("/token")
async def token_generator(name: str, request: Request):
    token = str(uuid.uuid4())

    if name == "":
        raise HTTPException(status_code=400, detail={
            "loc": "name",  "msg": "Enter a valid name"})

    # Create new chat session
    json_client = redis.create_rejson_connection()

    chat_session = Chat(
        token=token,
        messages=[],
        name=name
    )

    # Store chat session in redis JSON with the token as key
    json_client.jsonset(str(token), Path.rootPath(), chat_session.dict())

    # Set a timeout for redis data
    redis_client = await redis.create_connection()
    await redis_client.expire(str(token), 3600)


    return chat_session.dict()

NOTE: Because this is a demo app, I do not want to store the chat data in Redis for too long. So I have added a 60-minute time out on the token using the aioredis client (rejson does not implement timeouts). This means that after 60 minutes, the chat session data will be lost.

This is necessary because we are not authenticating users, and we want to dump the chat data after a defined period. This step is optional, and you don't have to include it.
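To make the expiry behaviour concrete, here is a small in-memory sketch of a key with a time-to-live. TTLStore is hypothetical and takes an injectable clock so the example runs instantly; in the app, Redis does this bookkeeping when expire is called.

```python
import time
from typing import Any, Callable, Dict, Optional


class TTLStore:
    """Toy key-value store with per-key expiry, loosely mimicking SET, EXPIRE and EXISTS."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._data: Dict[str, Any] = {}
        self._deadline: Dict[str, float] = {}

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value
        self._deadline.pop(key, None)  # a fresh write has no expiry until one is set

    def expire(self, key: str, seconds: float) -> None:
        if key in self._data:
            self._deadline[key] = self._clock() + seconds

    def exists(self, key: str) -> int:
        deadline: Optional[float] = self._deadline.get(key)
        if deadline is not None and self._clock() >= deadline:
            # The key has outlived its TTL, so drop it
            self._data.pop(key, None)
            self._deadline.pop(key, None)
        return 1 if key in self._data else 0


now = [0.0]  # fake clock, advanced manually
store = TTLStore(clock=lambda: now[0])
store.set("token-1", {"name": "Stephen", "messages": []})
store.expire("token-1", 3600)
before = store.exists("token-1")
now[0] = 3601.0
after = store.exists("token-1")
```

The session is visible right after it is created and gone once the 3600 seconds have passed.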

Next, in Postman, when you send a POST request to create a new token, you will get a structured response like the one below. You can also check Redis Insight to see your chat data stored with the token as a JSON key and the data as a value.

Token Generator Updated

How to Update the Token Dependency

Now that we have a token being generated and stored, this is a good time to update the get_token dependency in our /chat WebSocket. We do this to check for a valid token before starting the chat session.

In server.src.socket.utils.py update the get_token function to check if the token exists in the Redis instance. If it does then we return the token, which means that the socket connection is valid. If it doesn't exist, we close the connection.

The token created by /token will cease to exist after 60 minutes. So we can have some simple logic on the frontend to redirect the user to generate a new token if an error response is generated while trying to start a chat.


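from fastapi import WebSocket, status, Query
from typing import Optional
from ..redis.config import Redis

redis = Redis()


async def get_token(
    websocket: WebSocket,
    token: Optional[str] = Query(None),
):
    # Reject the connection outright if no token was supplied
    if token is None or token == "":
        await websocket.close(code=status.WS_1008_POLICY_VIOLATION)
        return None

    # A token is valid only while its session key still exists in Redis
    redis_client = await redis.create_connection()
    isexists = await redis_client.exists(token)

    if isexists == 1:
        return token
    else:
        await websocket.close(code=status.WS_1008_POLICY_VIOLATION, reason="Session not authenticated or expired token")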

To test the dependency, connect to the chat session with the random token we have been using, and you should get a 403 error. (Note that you have to manually delete the token in Redis Insight.)

Now copy the token generated when you sent the post request to the /token endpoint (or create a new request) and paste it as the value to the token query parameter required by the /chat WebSocket. Then connect. You should get a successful connection.

Chat Session with Token

Bringing it all together, your chat.py should look like the below.


import os
from fastapi import APIRouter, FastAPI, WebSocket, WebSocketDisconnect, Request, Depends, HTTPException
import uuid
from ..socket.connection import ConnectionManager
from ..socket.utils import get_token
import time
from ..redis.producer import Producer
from ..redis.config import Redis
from ..schema.chat import Chat
from rejson import Path

chat = APIRouter()
manager = ConnectionManager()
redis = Redis()


# @route   POST /token
# @desc    Route to generate chat token
# @access  Public


@chat.post("/token")
async def token_generator(name: str, request: Request):
    token = str(uuid.uuid4())

    if name == "":
        raise HTTPException(status_code=400, detail={
            "loc": "name",  "msg": "Enter a valid name"})

    # Create new chat session
    json_client = redis.create_rejson_connection()
    chat_session = Chat(
        token=token,
        messages=[],
        name=name
    )

    print(chat_session.dict())

    # Store chat session in redis JSON with the token as key
    json_client.jsonset(str(token), Path.rootPath(), chat_session.dict())

    # Set a timeout for redis data
    redis_client = await redis.create_connection()
    await redis_client.expire(str(token), 3600)

    return chat_session.dict()


# @route   POST /refresh_token
# @desc    Route to refresh token
# @access  Public


@chat.post("/refresh_token")
async def refresh_token(request: Request):
    return None


# @route   Websocket /chat
# @desc    Socket for chat bot
# @access  Public

@chat.websocket("/chat")
async def websocket_endpoint(websocket: WebSocket, token: str = Depends(get_token)):
    await manager.connect(websocket)
    redis_client = await redis.create_connection()
    producer = Producer(redis_client)
    json_client = redis.create_rejson_connection()

    try:
        while True:
            data = await websocket.receive_text()
            stream_data = {}
            stream_data[token] = data
            await producer.add_to_stream(stream_data, "message_channel")
            await manager.send_personal_message(f"Response: Simulating response from the GPT service", websocket)

    except WebSocketDisconnect:
        manager.disconnect(websocket)

Well done on making it this far! In the next section, we will focus on communicating with the AI model and handling the data transfer between client, server, worker, and the external API.

How to Add Intelligence to Chatbots with AI Models

In this section, we will focus on building a wrapper to communicate with the transformer model, send prompts from a user to the API in a conversational format, and receive and transform responses for our chat application.

How to Get Started with Huggingface

We will not be building or deploying any language models on Huggingface. Instead, we'll focus on using Huggingface's accelerated inference API to connect to pre-trained models.

The model we will be using is the GPT-J-6B Model provided by EleutherAI. It's a generative language model with 6 billion parameters.

Huggingface provides us with an on-demand limited API to connect with this model pretty much free of charge.

To get started with Huggingface, create a free account. In your settings, generate a new access token. For up to 30k tokens, Huggingface provides access to the inference API for free.

You can monitor your API usage here. Make sure you keep this token safe and don't expose it publicly.

Note: We will use HTTP connections to communicate with the API because we are using a free account. But the PRO Huggingface account supports streaming with WebSockets (see parallelism and batch jobs).

This can help significantly improve response times between the model and our chat application, and I'll hopefully cover this method in a follow-up article.

How to Interact with the Language Model

First, we add the Huggingface connection credentials to the .env file within our worker directory.

export HUGGINFACE_INFERENCE_TOKEN=
export MODEL_URL=https://api-inference.huggingface.co/models/EleutherAI/gpt-j-6B

Next, in worker.src create a folder named model then add a file gptj.py. Then add the GPT class below:

import os
from dotenv import load_dotenv
import requests
import json

load_dotenv()

class GPT:
    def __init__(self):
        self.url = os.environ.get('MODEL_URL')
        self.headers = {
            "Authorization": f"Bearer {os.environ.get('HUGGINFACE_INFERENCE_TOKEN')}"}
        self.payload = {
            "inputs": "",
            "parameters": {
                "return_full_text": False,
                "use_cache": True,
                "max_new_tokens": 25
            }

        }

    def query(self, input: str) -> list:
        self.payload["inputs"] = input
        data = json.dumps(self.payload)
        response = requests.request(
            "POST", self.url, headers=self.headers, data=data)
        print(json.loads(response.content.decode("utf-8")))
        return json.loads(response.content.decode("utf-8"))

if __name__ == "__main__":
    GPT().query("Will artificial intelligence help humanity conquer the universe?")

The GPT class is initialized with the Huggingface model url, authentication header, and predefined payload. But the payload input is a dynamic field that is provided by the query method and updated before we send a request to the Huggingface endpoint.
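The request body is plain JSON, so it can be inspected without calling the API. This sketch builds the same payload as the class above for a given prompt. build_payload is a hypothetical helper, and no network request is made.

```python
import json


def build_payload(prompt: str, max_new_tokens: int = 25, use_cache: bool = True) -> str:
    """Return the JSON body that would be posted to the inference endpoint."""
    payload = {
        "inputs": prompt,
        "parameters": {
            "return_full_text": False,
            "use_cache": use_cache,
            "max_new_tokens": max_new_tokens,
        },
    }
    return json.dumps(payload)


body = build_payload("Will artificial intelligence help humanity conquer the universe?")
```

Printing body shows exactly what the query method serializes before sending, which is handy when debugging unexpected responses.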

Finally, we test this by running the query method on an instance of the GPT class directly. In the terminal, run python src/model/gptj.py, and you should get a response like this (just keep in mind that your response will certainly be different from this):

[{'generated_text': ' (AI) could solve all the problems on this planet? I am of the opinion that in the short term artificial intelligence is much better than human beings, but in the long and distant future human beings will surpass artificial intelligence.\n\nIn the distant'}]

Next, we add some tweaking to the input to make the interaction with the model more conversational by changing the format of the input.

Update the GPT class like so:


class GPT:
    def __init__(self):
        self.url = os.environ.get('MODEL_URL')
        self.headers = {
            "Authorization": f"Bearer {os.environ.get('HUGGINFACE_INFERENCE_TOKEN')}"}
        self.payload = {
            "inputs": "",
            "parameters": {
                "return_full_text": False,
                "use_cache": False,
                "max_new_tokens": 25
            }

        }

    def query(self, input: str) -> list:
        self.payload["inputs"] = f"Human: {input} Bot:"
        data = json.dumps(self.payload)
        response = requests.request(
            "POST", self.url, headers=self.headers, data=data)
        data = json.loads(response.content.decode("utf-8"))
        text = data[0]['generated_text']
        res = str(text.split("Human:")[0]).strip("\n").strip()
        return res


if __name__ == "__main__":
    print(GPT().query("Will artificial intelligence help humanity conquer the universe?"))

We updated the input with an f-string, f"Human: {input} Bot:". The human input is placed in the string and the Bot provides a response. This input format turns GPT-J-6B into a conversational model. Other changes you may notice include:

use_cache: you can make this False if you want the model to create a new response when the input is the same. I suggest leaving this as True in production to prevent exhausting your free tokens if a user just keeps spamming the bot with the same message. Using cache does not actually load a new response from the model.
return_full_text: is False, as we do not need to return the input, which we already have. When we get a response, we keep only the text before the next "Human:" marker, strip leading/trailing whitespace, and return just the response text.
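The parsing step can be tried on its own. The sketch below applies the same split-and-strip logic as the query method to a made-up generated string (the sample text is invented for illustration): everything from the first "Human:" marker onward is discarded, in case the model continues the dialogue on its own.

```python
def extract_reply(generated_text: str) -> str:
    """Keep only the bot's turn: cut at the next 'Human:' marker and trim whitespace."""
    return generated_text.split("Human:")[0].strip("\n").strip()


sample = " I think it will help a great deal.\nHuman: Why do you think so? Bot: Because"
reply = extract_reply(sample)
```

If the generated text contains no "Human:" marker at all, the function simply returns the trimmed text.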

How to Simulate Short-term Memory for the AI Model

For every new input we send to the model, there is no way for the model to remember the conversation history. This is important if we want to hold context in the conversation.

But remember that as the number of tokens we send to the model increases, the processing gets more expensive, and the response time is also longer.

So we will need to find a way to retrieve short-term history and send it to the model. We will also need to figure out a sweet spot - how much historical data do we want to retrieve and send to the model?

To handle chat history, we need to fall back to our JSON database. We'll use the token to get the last chat data, and then when we get the response, append the response to the JSON database.

Update worker.src.redis.config.py to include the create_rejson_connection method. Also, update the .env file with the authentication data, and ensure rejson is installed.

Your worker.src.redis.config.py should look like this:


import os
from dotenv import load_dotenv
import aioredis
from rejson import Client


load_dotenv()


class Redis():
    def __init__(self):
        """initialize  connection """
        self.REDIS_URL = os.environ['REDIS_URL']
        self.REDIS_PASSWORD = os.environ['REDIS_PASSWORD']
        self.REDIS_USER = os.environ['REDIS_USER']
        self.connection_url = f"redis://{self.REDIS_USER}:{self.REDIS_PASSWORD}@{self.REDIS_URL}"
        self.REDIS_HOST = os.environ['REDIS_HOST']
        self.REDIS_PORT = os.environ['REDIS_PORT']

    async def create_connection(self):
        self.connection = aioredis.from_url(
            self.connection_url, db=0)

        return self.connection

    def create_rejson_connection(self):
        self.redisJson = Client(host=self.REDIS_HOST,
                                port=self.REDIS_PORT, decode_responses=True, username=self.REDIS_USER, password=self.REDIS_PASSWORD)

        return self.redisJson

While your .env file should look like this:

export REDIS_URL=
export REDIS_USER=
export REDIS_PASSWORD=
export REDIS_HOST=
export REDIS_PORT=
export HUGGINFACE_INFERENCE_TOKEN=
export MODEL_URL=https://api-inference.huggingface.co/models/EleutherAI/gpt-j-6B

Next, in worker.src.redis create a new file named cache.py and add the code below:

from .config import Redis
from rejson import Path

class Cache:
    def __init__(self, json_client):
        self.json_client = json_client

    async def get_chat_history(self, token: str):
        data = self.json_client.jsonget(
            str(token), Path.rootPath())

        return data

The cache is initialized with a rejson client, and the method get_chat_history takes in a token to get the chat history for that token, from Redis. Make sure you import the Path object from rejson.

Next, update the worker.main.py with the code below:

from src.redis.config import Redis
import asyncio
from src.model.gptj import GPT
from src.redis.cache import Cache

redis = Redis()

async def main():
    json_client = redis.create_rejson_connection()
    data = await Cache(json_client).get_chat_history(token="18196e23-763b-4808-ae84-064348a0daff")
    print(data)

if __name__ == "__main__":
    asyncio.run(main())

I have hard-coded a sample token created from previous tests in Postman. If you don't have a token created, just send a new request to /token and copy the token, then run python main.py in the terminal. You should see the data in the terminal like so:

{'token': '18196e23-763b-4808-ae84-064348a0daff', 'messages': [], 'name': 'Stephen', 'session_start': '2022-07-16 13:20:01.092109'}

Next, we need to add an add_message_to_cache method to our Cache class that adds messages to Redis for a specific token.


  async def add_message_to_cache(self, token: str, message_data: dict):
      self.json_client.jsonarrappend(
          str(token), Path('.messages'), message_data)

The jsonarrappend method provided by rejson appends the new message to the message array.

Note that to access the message array, we need to provide .messages as an argument to the Path. If your message data has a different/nested structure, just provide the path to the array you want to append the new data to.
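As a mental model for what the append does, here is a pure-Python sketch that walks a dotted path in a dictionary and appends to the list it finds. append_at_path is a hypothetical helper for illustration; it is not how rejson is implemented, and it supports only simple dotted keys.

```python
from typing import Any, Dict


def append_at_path(document: Dict[str, Any], path: str, item: Any) -> int:
    """Append item to the list found at a dotted path such as '.messages'; return the new length."""
    target: Any = document
    for key in path.strip(".").split("."):
        target = target[key]
    if not isinstance(target, list):
        raise TypeError(f"{path!r} does not point to a list")
    target.append(item)
    return len(target)


session = {"token": "abc", "messages": [], "meta": {"tags": []}}
length = append_at_path(session, ".messages", {"id": "1", "msg": "Hello"})
append_at_path(session, ".meta.tags", "demo")  # a nested path works the same way
```

The second call shows the nested case: the path names each key on the way down to the array you want to extend.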

To test this method, update the main function in the main.py file with the code below:

async def main():
    json_client = redis.create_rejson_connection()

    await Cache(json_client).add_message_to_cache(token="18196e23-763b-4808-ae84-064348a0daff", message_data={
        "id": "1",
        "msg": "Hello",
        "timestamp": "2022-07-16 13:20:01.092109"
    })

    data = await Cache(json_client).get_chat_history(token="18196e23-763b-4808-ae84-064348a0daff")
    print(data)

We are sending a hard-coded message to the cache, and getting the chat history from the cache. When you run python main.py in the terminal within the worker directory, you should get something like this printed in the terminal, with the message added to the message array.

{'token': '18196e23-763b-4808-ae84-064348a0daff', 'messages': [{'id': '1', 'msg': 'Hello', 'timestamp': '2022-07-16 13:20:01.092109'}], 'name': 'Stephen', 'session_start': '2022-07-16 13:20:01.092109'}

Finally, we need to update the main function to send the message data to the GPT model, and update the input with the last 4 messages sent between the client and the model.

First, let's update our add_message_to_cache function with a new argument, source, that tells us whether the message came from a human or the bot. We can then use this argument to add the "Human:" or "Bot:" tag to the data before storing it in the cache.

Update the add_message_to_cache method in the Cache class like so:

  async def add_message_to_cache(self, token: str, source: str, message_data: dict):
      if source == "human":
          message_data['msg'] = "Human: " + (message_data['msg'])
      elif source == "bot":
          message_data['msg'] = "Bot: " + (message_data['msg'])

      self.json_client.jsonarrappend(
          str(token), Path('.messages'), message_data)

Then update the main function in main.py in the worker directory, and run python main.py to see the new results in the Redis database.

async def main():
    json_client = redis.create_rejson_connection()

    await Cache(json_client).add_message_to_cache(token="18196e23-763b-4808-ae84-064348a0daff", source="human", message_data={
        "id": "1",
        "msg": "Hello",
        "timestamp": "2022-07-16 13:20:01.092109"
    })

    data = await Cache(json_client).get_chat_history(token="18196e23-763b-4808-ae84-064348a0daff")
    print(data)

Next, we need to update the main function to add new messages to the cache, read the previous 4 messages from the cache, and then make an API call to the model using the query method. It'll have a payload consisting of a composite string of the last 4 messages.

You can always tune the number of messages in the history you want to extract, but I think 4 messages is a pretty good number for a demo.

In worker.src, create a new folder schema. Then create a new file named chat.py and paste our message schema in chat.py like so:

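from datetime import datetime
from pydantic import BaseModel, Field
from typing import List, Optional
import uuid


class Message(BaseModel):
    # default_factory generates a fresh id and timestamp for every message,
    # instead of reusing values computed once when the class is defined
    id: str = Field(default_factory=lambda: str(uuid.uuid4()))
    msg: str
    timestamp: str = Field(default_factory=lambda: str(datetime.now()))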
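Each message needs its own id and timestamp, generated when the instance is created rather than once when the class is defined. The sketch below illustrates that with a standard-library dataclass instead of pydantic, so it runs without extra dependencies. `MessageSketch` is a hypothetical name, and pydantic offers the same idea through `Field(default_factory=...)`.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class MessageSketch:
    msg: str
    # default_factory runs on every instantiation, so each message gets its own values.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: str = field(default_factory=lambda: str(datetime.now()))


first = MessageSketch(msg="Hello")
second = MessageSketch(msg="Hi there")
print(first.id != second.id)  # True
```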

Next, update the main.py file like below:

async def main():

    json_client = redis.create_rejson_connection()

    await Cache(json_client).add_message_to_cache(token="18196e23-763b-4808-ae84-064348a0daff", source="human", message_data={
        "id": "3",
        "msg": "I would like to go to the moon too, would you take me?",
        "timestamp": "2022-07-16 13:20:01.092109"
    })

    data = await Cache(json_client).get_chat_history(token="18196e23-763b-4808-ae84-064348a0daff")

    print(data)

    message_data = data['messages'][-4:]

    input = ["" + i['msg'] for i in message_data]
    input = " ".join(input)

    res = GPT().query(input=input)

    msg = Message(
        msg=res
    )

    print(msg)
    await Cache(json_client).add_message_to_cache(token="18196e23-763b-4808-ae84-064348a0daff", source="bot", message_data=msg.dict())

In the code above, we add new message data to the cache. This message will ultimately come from the message queue. Next we get the chat history from the cache, which will now include the most recent data we added.

Note that we are using the same hard-coded token to add to the cache and get from the cache, temporarily just to test this out.

Next, we trim the cached history down to the last 4 items. Then we build the model input by collecting each item's msg into a list and joining the list into a single string, with the messages separated by spaces.

Finally, we create a new Message instance for the bot response and add the response to the cache, specifying the source as "bot".
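The trimming and joining step can be isolated into a small function, which makes the window size easy to tune. This is a sketch only: `build_model_input` and the sample history are hypothetical and not part of the tutorial's code.

```python
def build_model_input(messages: list, window: int = 4) -> str:
    """Join the msg field of the most recent `window` messages with spaces."""
    # Guard against window=0, because messages[-0:] would return the whole list.
    recent = messages[-window:] if window > 0 else []
    return " ".join(entry["msg"] for entry in recent)


history = [
    {"id": "1", "msg": "Human: Hello"},
    {"id": "2", "msg": "Bot: Hi, how can I help?"},
    {"id": "3", "msg": "Human: Tell me about the moon"},
    {"id": "4", "msg": "Bot: It orbits the Earth."},
    {"id": "5", "msg": "Human: How far away is it?"},
]
print(build_model_input(history))
```

With five messages in the history and the default window of 4, the oldest message ("Human: Hello") is left out of the input.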

Next, run python main.py a couple of times, changing the human message and id as desired with each run. You should have a full conversation input and output with the model.

Open Redis Insight and you should have something similar to the below:

Conversational Chat

Stream Consumer and Real-time Data Pull from the Message Queue

Next, we want to create a consumer and update our worker.main.py to connect to the message queue. We want it to pull the token data in real-time, as we are currently hard-coding the tokens and message inputs.

In worker.src.redis create a new file named stream.py. Add a StreamConsumer class with the code below:

class StreamConsumer:
    def __init__(self, redis_client):
        self.redis_client = redis_client

    async def consume_stream(self, count: int, block: int,  stream_channel):

        response = await self.redis_client.xread(
            streams={stream_channel:  '0-0'}, count=count, block=block)

        return response

    async def delete_message(self, stream_channel, message_id):
        await self.redis_client.xdel(stream_channel, message_id)

The StreamConsumer class is initialized with a Redis client. The consume_stream method pulls a new message from the queue from the message channel, using the xread method provided by aioredis.

Next, update the worker.main.py file with a while loop to keep the connection to the message channel alive, like so:


from src.redis.config import Redis
import asyncio
from src.model.gptj import GPT
from src.redis.cache import Cache
from src.redis.stream import StreamConsumer
import os
from src.schema.chat import Message


redis = Redis()


async def main():
    json_client = redis.create_rejson_connection()
    redis_client = await redis.create_connection()
    consumer = StreamConsumer(redis_client)
    cache = Cache(json_client)

    print("Stream consumer started")
    print("Stream waiting for new messages")

    while True:
        response = await consumer.consume_stream(stream_channel="message_channel", count=1, block=0)

        if response:
            for stream, messages in response:
                # Get message from stream, and extract token, message data and message id
                for message in messages:
                    message_id = message[0]
                    token = [k.decode('utf-8')
                             for k, v in message[1].items()][0]
                    message = [v.decode('utf-8')
                               for k, v in message[1].items()][0]
                    print(token)

                    # Create a new message instance and add to cache, specifying the source as human
                    msg = Message(msg=message)

                    await cache.add_message_to_cache(token=token, source="human", message_data=msg.dict())

                    # Get chat history from cache
                    data = await cache.get_chat_history(token=token)

                    # Clean message input and send to query
                    message_data = data['messages'][-4:]

                    input = ["" + i['msg'] for i in message_data]
                    input = " ".join(input)

                    res = GPT().query(input=input)

                    msg = Message(
                        msg=res
                    )

                    print(msg)

                    await cache.add_message_to_cache(token=token, source="bot", message_data=msg.dict())

                # Delete message from queue after it has been processed

                await consumer.delete_message(stream_channel="message_channel", message_id=message_id)


if __name__ == "__main__":
    asyncio.run(main())

This is quite the update, so let's take it step by step:

We use a while True loop so that the worker can be online listening to messages from the queue.

Next, we await new messages from the message_channel by calling our consume_stream method. If we have a message in the queue, we extract the message_id, token, and message. Then we create a new instance of the Message class, add the message to the cache, and then get the last 4 messages. We set it as input to the GPT model query method.

Once we get a response, we then add the response to the cache using the add_message_to_cache method, then delete the message from the queue.
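The extraction step in the loop can be tried without a running Redis instance. The sketch below assumes each stream entry has the shape (id, {token: text}) with bytes values, which is the shape the loop unpacks. `parse_stream_entry` is a hypothetical helper, and the sample response is hand-written, not output captured from Redis.

```python
def parse_stream_entry(entry):
    """Split one stream entry into (message_id, token, text).

    Assumes the entry is (id, {token: text}) with bytes values.
    """
    message_id, fields = entry
    # Each entry carries a single field: the token is the key, the text is the value.
    token, text = next(iter(fields.items()))
    return message_id.decode("utf-8"), token.decode("utf-8"), text.decode("utf-8")


# Hand-written sample in the assumed shape.
response = [
    [b"message_channel", [(b"1657970401092-0", {b"18196e23": b"Hello"})]],
]

for stream, entries in response:
    for entry in entries:
        print(parse_stream_entry(entry))
```

Unpacking the single field once avoids building two separate lists for the key and the value.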

How to Update the Chat Client with the AI Response

So far, we are sending a chat message from the client to the message_channel (which is received by the worker that queries the AI model) to get a response.

Next, we need to send this response to the client. As long as the socket connection is still open, the client should be able to receive the response.

If the connection is closed, the client can always get a response from the chat history using the refresh_token endpoint.

In worker.src.redis create a new file named producer.py, and add a Producer class similar to what we had on the chat web server:


class Producer:
    def __init__(self, redis_client):
        self.redis_client = redis_client

    async def add_to_stream(self, data: dict, stream_channel):
        msg_id = await self.redis_client.xadd(name=stream_channel, id="*", fields=data)
        print(f"Message id {msg_id} added to {stream_channel} stream")
        return msg_id

Next, in the main.py file, update the main function to initialize the producer, build the stream data, and send the response to a response_channel using the add_to_stream method:

from src.redis.config import Redis
import asyncio
from src.model.gptj import GPT
from src.redis.cache import Cache
from src.redis.stream import StreamConsumer
import os
from src.schema.chat import Message
from src.redis.producer import Producer


redis = Redis()


async def main():
    json_client = redis.create_rejson_connection()
    redis_client = await redis.create_connection()
    consumer = StreamConsumer(redis_client)
    cache = Cache(json_client)
    producer = Producer(redis_client)

    print("Stream consumer started")
    print("Stream waiting for new messages")

    while True:
        response = await consumer.consume_stream(stream_channel="message_channel", count=1, block=0)

        if response:
            for stream, messages in response:
                # Get message from stream, and extract token, message data and message id
                for message in messages:
                    message_id = message[0]
                    token = [k.decode('utf-8')
                             for k, v in message[1].items()][0]
                    message = [v.decode('utf-8')
                               for k, v in message[1].items()][0]

                    # Create a new message instance and add to cache, specifying the source as human
                    msg = Message(msg=message)

                    await cache.add_message_to_cache(token=token, source="human", message_data=msg.dict())

                    # Get chat history from cache
                    data = await cache.get_chat_history(token=token)

                    # Clean message input and send to query
                    message_data = data['messages'][-4:]

                    input = ["" + i['msg'] for i in message_data]
                    input = " ".join(input)

                    res = GPT().query(input=input)

                    msg = Message(
                        msg=res
                    )

                    stream_data = {}
                    stream_data[str(token)] = str(msg.dict())

                    await producer.add_to_stream(stream_data, "response_channel")

                    await cache.add_message_to_cache(token=token, source="bot", message_data=msg.dict())

                # Delete message from queue after it has been processed
                await consumer.delete_message(stream_channel="message_channel", message_id=message_id)


if __name__ == "__main__":
    asyncio.run(main())

Next, we need to let the client know when we receive responses from the worker in the /chat socket endpoint. We do this by listening to the response stream. We do not need to include a while loop here as the socket will be listening as long as the connection is open.

Note that we also need to check which client the response is for by adding logic to check if the token connected is equal to the token in the response. Then we delete the message in the response queue once it's been read.
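The token check can be sketched on its own, away from the socket code. The helper below is hypothetical and works on plain strings, so the sample entries are already decoded. It separates the entries addressed to the connected token from everything else.

```python
def split_by_token(entries, token):
    """Return (mine, others): entries addressed to `token` and all remaining entries."""
    mine, others = [], []
    for message_id, fields in entries:
        # Each entry holds one field: {response_token: payload}.
        response_token, payload = next(iter(fields.items()))
        bucket = mine if response_token == token else others
        bucket.append((message_id, payload))
    return mine, others


# Hand-written, already-decoded sample entries (assumed shape).
entries = [
    ("1-0", {"token-a": "Bot: Hi there"}),
    ("2-0", {"token-b": "Bot: Hello"}),
    ("3-0", {"token-a": "Bot: How can I help?"}),
]
mine, others = split_by_token(entries, "token-a")
print(mine)
print(others)
```

Only the entries in `mine` would be sent over this connection. Leaving `others` in the stream, rather than deleting them, is one way to let the connections they belong to pick them up later.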

In server.src.redis create a new file named stream.py and add our StreamConsumer class like this:

from .config import Redis

class StreamConsumer:
    def __init__(self, redis_client):
        self.redis_client = redis_client

    async def consume_stream(self, count: int, block: int,  stream_channel):
        response = await self.redis_client.xread(
            streams={stream_channel:  '0-0'}, count=count, block=block)

        return response

    async def delete_message(self, stream_channel, message_id):
        await self.redis_client.xdel(stream_channel, message_id)

Next, update the /chat socket endpoint like so:

from ..redis.stream import StreamConsumer

@chat.websocket("/chat")
async def websocket_endpoint(websocket: WebSocket, token: str = Depends(get_token)):
    await manager.connect(websocket)
    redis_client = await redis.create_connection()
    producer = Producer(redis_client)
    json_client = redis.create_rejson_connection()
    consumer = StreamConsumer(redis_client)

    try:
        while True:
            data = await websocket.receive_text()
            stream_data = {}
            stream_data[str(token)] = str(data)
            await producer.add_to_stream(stream_data, "message_channel")
            response = await consumer.consume_stream(stream_channel="response_channel", count=1, block=0)

            print(response)
            for stream, messages in response:
                for message in messages:
                    response_token = [k.decode('utf-8')
                                      for k, v in message[1].items()][0]

                    if token == response_token:
                        response_message = [v.decode('utf-8')
                                            for k, v in message[1].items()][0]

                        print(message[0].decode('utf-8'))
                        print(token)
                        print(response_token)

                        await manager.send_personal_message(response_message, websocket)

                    await consumer.delete_message(stream_channel="response_channel", message_id=message[0].decode('utf-8'))

    except WebSocketDisconnect:
        manager.disconnect(websocket)

Refresh Token

Finally, we need to update the /refresh_token endpoint to get the chat history from the Redis database using our Cache class.

In server.src.redis, add a cache.py file and add the code below:


from rejson import Path

class Cache:
    def __init__(self, json_client):
        self.json_client = json_client

    async def get_chat_history(self, token: str):
        data = self.json_client.jsonget(
            str(token), Path.rootPath())

        return data

Next, in server.src.routes.chat.py import the Cache class and update the /refresh_token endpoint to the below:


from ..redis.cache import Cache

@chat.get("/refresh_token")
async def refresh_token(request: Request, token: str):
    json_client = redis.create_rejson_connection()
    cache = Cache(json_client)
    data = await cache.get_chat_history(token)

    if data is None:
        raise HTTPException(
            status_code=400, detail="Session expired or does not exist")
    else:
        return data

Now, when we send a GET request to the /refresh_token endpoint with any token, the endpoint will fetch the data from the Redis database.

If the token has not timed out, the data will be sent to the user. Otherwise, if the token is not found, the endpoint returns a 400 response.

How to Test the Chat with multiple Clients in Postman

Finally, we will test the chat system by creating multiple chat sessions in Postman, connecting multiple clients in Postman, and chatting with the bot on the clients.

Lastly, we will try to get the chat history for the clients and hopefully get a proper response.

Recap

Let's have a quick recap of what we have achieved with our chat system. The chat client creates a token for each chat session with a client. This token is used to identify each client, and each message sent by clients connected to our web server is queued in a Redis channel (message_channel), identified by the token.

Our worker environment reads from this channel. It does not have any clue who the client is (except that it's a unique token) and uses the message in the queue to send requests to the Huggingface inference API.

When it gets a response, the response is added to a response channel and the chat history is updated. The web server listening to the response_channel immediately sends the response to the client once it receives a response with that client's token.

If the socket is still open, this response is sent. If the socket is closed, we are certain that the response is preserved because the response is added to the chat history. The client can get the history, even if a page refresh happens or in the event of a lost connection.

Congratulations on getting this far! You have been able to build a working chat system.

In follow-up articles, I will focus on building a chat user interface for the client, creating unit and functional tests, fine-tuning our worker environment for faster response time with WebSockets and asynchronous requests, and ultimately deploying the chat application on AWS.

This article is part of a series on building full-stack intelligent chatbots with tools like Python, React, Huggingface, Redis, and so on. You can follow the full series on my blog: blog.stephensanwo.dev - AI ChatBot Series

You can download the full repository from my GitHub repository.

I wrote this tutorial in collaboration with Redis.

How to Create a Rate Limiter using Bucket4J and Redis

Abhinav Pandey — Fri, 01 Apr 2022 19:05:04 +0000

In this tutorial we will learn how to implement rate limiting in a scaled service.
We will use the Bucket4J library to implement it and we will use Redis as a distributed cache.

Why Use Rate Limiting?

Let's get started with some basics to make sure we understand the need for rate limiting and introduce the tools we'll be using in this tutorial.

Problem with Unlimited Rates

If a public API like the Twitter API allowed its users to make an unlimited number of requests per hour, it could lead to:

resource exhaustion
decreasing quality of the service
denial of service attacks

This might result in a situation where the service is unavailable or slow. It could also lead to more unexpected costs being incurred by the service.

How Rate Limiting Helps

Firstly, rate-limiting can prevent denial of service attacks. When coupled with a deduplication mechanism or API keys, rate limiting can also help prevent distributed denial of service attacks.

Secondly, it helps in estimating traffic. This is very important for public APIs. This can also be coupled with automated scripts to monitor and scale the service.

And thirdly, you can use it to implement tier-based pricing. This type of pricing model means that users can pay for a higher rate of requests. The Twitter API is an example of this.

The Token Bucket Algorithm

Token Bucket is an algorithm that you can use to implement rate limiting. In short, it works as follows:

A bucket is created with a certain capacity (number of tokens).
When a request comes in, the bucket is checked. If there is enough capacity, the request is allowed to proceed. Otherwise, the request is denied.
When a request is allowed, the capacity is reduced.
After a certain amount of time, the capacity is replenished.
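The four steps above can be sketched in a few lines. The example is written in Python for brevity rather than the Java used in the rest of this tutorial, and it is not how Bucket4j is implemented. `TokenBucket`, `try_consume`, and the injectable clock are hypothetical names chosen for the illustration.

```python
import time


class TokenBucket:
    """Interval-refill token bucket: capacity is restored once per period."""

    def __init__(self, capacity: int, period_seconds: float, clock=time.monotonic):
        self.capacity = capacity
        self.period_seconds = period_seconds
        self.clock = clock  # injectable so the example can run without sleeping
        self.tokens = capacity
        self.window_start = clock()

    def try_consume(self, tokens: int = 1) -> bool:
        now = self.clock()
        # Replenish the full capacity once a whole period has elapsed.
        if now - self.window_start >= self.period_seconds:
            self.tokens = self.capacity
            self.window_start = now
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False


# A hand-controlled clock keeps the demonstration deterministic.
fake_now = [0.0]
bucket = TokenBucket(capacity=3, period_seconds=60, clock=lambda: fake_now[0])

results = [bucket.try_consume() for _ in range(4)]
print(results)  # [True, True, True, False]

fake_now[0] = 61.0  # one period later, the bucket is full again
print(bucket.try_consume())  # True
```

The fourth request is denied because the capacity of 3 is used up, and requests are allowed again once the period has passed.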

How to Implement Token Bucket in a Distributed System

To implement the token bucket algorithm in a distributed system, we need to use a distributed cache.

The cache is a key-value store to store the bucket information. We will use a Redis cache to implement this.

Internally, Bucket4j allows us to plug in any implementation of the Java JCache API. The Redisson client of Redis is the implementation we will use.

Project Implementation

We will use the Spring Boot framework to build our service.

Our service will contain the below components:

A simple REST API.
A Redis cache connected to the service – using the Redisson client.
The Bucket4J library wrapped around the REST API.
We'll connect Bucket4J to the JCache interface which will use the Redisson client as the implementation in the background.

First, we will learn to rate limit the API for all requests. Then we will learn to implement a more complex rate limiting mechanism per user or per pricing tier.

Let's start with the project setup.

Install Dependencies

Let's add the below dependencies to our pom.xml (or build.gradle) file.

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
        <groupId>org.redisson</groupId>
        <artifactId>redisson-spring-boot-starter</artifactId>
        <version>3.17.0</version>
    </dependency>

    <dependency>
        <groupId>com.giffing.bucket4j.spring.boot.starter</groupId>
        <artifactId>bucket4j-spring-boot-starter</artifactId>
        <version>0.5.2</version>
    </dependency>
</dependencies>

Cache Configuration

Firstly, we need to start our Redis server. Let's say we have a Redis server running on port 6379 on our local machine.

We need to perform two steps:

Create a connection to this server from our application.
Set up JCache to use the Redisson client as the implementation.

Redisson's documentation provides concise steps to implement this in a regular Java application. We're going to implement the same steps, but in Spring Boot.

Let's look at the code first. We need to create a Configuration class to create the required beans.

@Configuration
public class RedisConfig  {

    @Bean
    public Config config() {
        Config config = new Config();
        config.useSingleServer().setAddress("redis://localhost:6379");
        return config;
    }

    @Bean
    public CacheManager cacheManager(Config config) {
        CacheManager cacheManager = Caching.getCachingProvider().getCacheManager();
        cacheManager.createCache("cache", RedissonConfiguration.fromConfig(config));
        return cacheManager;
    }

    @Bean
    ProxyManager proxyManager(CacheManager cacheManager) {
        return new JCacheProxyManager<>(cacheManager.getCache("cache"));
    }
}

What does this do?

Creates a configuration object that we can use to create a connection.
Creates a cache manager using the configuration object. This will internally create a connection to the Redis instance and create a hash called "cache" on it.
Creates a proxy manager that will be used to access the cache. Whatever our application tries to cache using the JCache API, it will be cached on the Redis instance inside the hash named "cache".

Build the API

Let's create a simple REST API.

@RestController
public class RateLimitController {
    @GetMapping("/user/{id}")
    public String getInfo(@PathVariable("id") String id) {
        return "Hello " + id;
    }
}

If I hit the API with the URL http://localhost:8080/user/1, I will get the response Hello 1.

Bucket4J Configuration

To implement the rate limiting, we need to configure Bucket4J. Thankfully, we do not need to write any boilerplate code due to the starter library.

It also automatically detects the ProxyManager bean we created in the previous step and uses it to cache the buckets.

What we do need to do is configure this library around the API we created.
Again there are multiple ways to do this.

We can go for property-based configuration which is defined in the starter library.
This is the most convenient way for simple cases like rate-limiting for all users or all guest users.

However, if we want to implement something more complex like a rate limit for each user, it's better to write custom code for it.

We are going to implement rate limiting per user. Let's assume we have the rate limit for each user stored in a database, and we can query it using the user id.

Let's write the code for it step by step.

Create a Bucket

Before we start, let's look at how a bucket is created.

Refill refill = Refill.intervally(10, Duration.ofMinutes(1));
Bandwidth limit = Bandwidth.classic(10, refill);
Bucket bucket = Bucket4j.builder()
        .addLimit(limit)
        .build();

Refill – How many tokens are added back to the bucket and how often (here, 10 tokens every minute).
Bandwidth – The capacity of the bucket combined with its refill rule. Basically, requests per refill period.
Bucket – An object configured using these two parameters. Additionally, it maintains a token counter to keep track of how many tokens are available in the bucket.

Using this as the building block, let's change a few things to make it suitable to our use case.

Create and Cache Buckets using ProxyManager

We created the proxy manager for the purpose of storing buckets on Redis. Once a bucket is created, it needs to be cached on Redis and does not need to be created again.

To make this happen, we will replace the Bucket4j.builder() with proxyManager.builder(). ProxyManager will take care of caching the buckets and not creating them again.

ProxyManager's builder takes two parameters – a key against which the bucket will be cached and a configuration object that it will use to create the bucket.

Let's see how we can implement it:

@Service
public class RateLimiter {
    // autowiring dependencies (proxyManager and userRepository)

    public Bucket resolveBucket(String key) {
        Supplier<BucketConfiguration> configSupplier = getConfigSupplierForUser(key);

        // Does not always create a new bucket, but instead returns the existing one if it exists.
        return proxyManager.builder().build(key, configSupplier);
    }

    private Supplier<BucketConfiguration> getConfigSupplierForUser(String key) {
        // The key is the user id, so we use it to look up the user's limit
        User user = userRepository.findById(key);
        Refill refill = Refill.intervally(user.getLimit(), Duration.ofMinutes(1));
        Bandwidth limit = Bandwidth.classic(user.getLimit(), refill);
        return () -> (BucketConfiguration.builder()
                .addLimit(limit)
                .build());
    }
}

We have created a method which returns a bucket for a key provided. In the next step, we will see how to use this.
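The "create once, then reuse" behaviour can be sketched independently of Bucket4j. The Python example below uses an in-process dictionary as a stand-in for the Redis-backed ProxyManager. `SimpleBucket`, `BucketRegistry`, and the `limits` table are all hypothetical, and the bucket has no refill logic, to keep the sketch short.

```python
from typing import Callable, Dict


class SimpleBucket:
    """Tiny stand-in for a bucket: a token counter with no refill logic."""

    def __init__(self, capacity: int):
        self.tokens = capacity

    def try_consume(self, tokens: int = 1) -> bool:
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False


class BucketRegistry:
    """Creates a bucket the first time a key is seen, then reuses it."""

    def __init__(self):
        self._buckets: Dict[str, SimpleBucket] = {}

    def resolve(self, key: str, capacity_supplier: Callable[[], int]) -> SimpleBucket:
        if key not in self._buckets:
            # The supplier is only called when the bucket does not exist yet.
            self._buckets[key] = SimpleBucket(capacity_supplier())
        return self._buckets[key]


# Hypothetical per-user limits standing in for the database lookup.
limits = {"1": 10, "2": 20}
registry = BucketRegistry()

for _ in range(10):
    registry.resolve("1", lambda: limits["1"]).try_consume()
    registry.resolve("2", lambda: limits["2"]).try_consume()

print(registry.resolve("1", lambda: limits["1"]).try_consume())  # False
print(registry.resolve("2", lambda: limits["2"]).try_consume())  # True
```

Passing a supplier instead of a ready-made configuration means the per-user lookup only runs when a bucket is actually created.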

How to Consume Tokens and Set Up Rate Limiting

When a request comes in, we will try to consume a token from the relevant bucket.
We will use the tryConsume() method of the bucket to do this.

@GetMapping("/user/{id}")
public String getInfo(@PathVariable("id") String id) {
    // gets the bucket for the user
    Bucket bucket = rateLimiter.resolveBucket(id);

    // tries to consume a token from the bucket
    if (bucket.tryConsume(1)) {
        return "Hello " + id;
    } else {
        return "Rate limit exceeded";
    }
}

The tryConsume() method returns true if the token was consumed successfully or false if the token was not consumed.

How to Test our Service

We can test this using any automated testing technique. For example, we can use JUnit. Let's write a test case that calls the getInfo() method multiple times and verifies that the response is correct.

Let's assume we have a user with id 1 and a limit of 10 requests per minute. Let's assume we also have a user with id 2 and a limit of 20 requests per minute.

We will send 11 requests for each user and verify that the 11th request fails for the user with id 1 but succeeds for the user with id 2.

@Test
public void testGetInfo() {
    // rateLimitController is the RateLimitController instance under test

    // calls the endpoint method 10 times for each user
    for (int i = 0; i < 10; i++) {
        rateLimitController.getInfo("1");
        rateLimitController.getInfo("2");
    }

    // verifies that the 11th request is rate limited for user 1
    assertEquals("Rate limit exceeded", rateLimitController.getInfo("1"));

    // verifies that the 11th request is successful for user 2
    assertEquals("Hello 2", rateLimitController.getInfo("2"));
}

When we run the test, we will see that the test passes.

Conclusion

In this tutorial, we have covered how to create a rate limiter using Bucket4j and Redis in a Spring Boot application. We also looked at how to set up a Redisson client with JCache and how to use it to cache buckets.

At the end, we implemented a simple rate limiter which can be used to rate limit requests for specific users.

Hope you enjoyed this tutorial. Thanks for reading!

How to Scale a System With Process Splitting and Redis

freeCodeCamp — Tue, 20 Jul 2021 19:26:45 +0000

By Pramono Winata

Have you ever gotten into trouble trying to handle a single process that's really huge or heavy? If so, I can help you figure out how to better manage it.

In this article I will be sharing how I'm currently managing a single message that is too big to be processed on a single process. I've split it into different chunks, which results in separate processes.

I won't go into much technical detail, but more of the architectural process.
I'll discuss some bits about caching usage and pubsub, but I will not go into details on the implementation. Instead, I'll focus on the pattern itself.

The Problem

My First Approach

How to Handle Finishing Processes

... Redis, and I am using that to deal with my issue here.

If you are not familiar with Redis, it is a service that is generally used as a cache.

We will manage our Redis mechanism like this:

Adding Redis to mark our process

The process looks exactly the same as before, but with the addition of Redis in the middle. You need to make sure you have a valid initial count for this case.

In my case, since I'm publishing a list, I can easily put the length of my list as my initial counter. And for the counter, I can just decrease it by one each time a process has finished. Then I will be able to know if I have finished all my processes simply by referring to my Redis counter. If it has reached 0, it means that I can safely mark that all of my processes are done.
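The countdown idea can be sketched without a Redis server. In the Python example below, `InMemoryCounterStore` is a hypothetical in-process stand-in for Redis: its `decr` method mirrors the Redis DECR command in that it decrements and returns the new value atomically. The job key and the chunk contents are made up for the demonstration.

```python
import threading


class InMemoryCounterStore:
    """In-process stand-in for a Redis counter (set a value, decrement it atomically)."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def set(self, key: str, value: int) -> None:
        with self._lock:
            self._values[key] = value

    def decr(self, key: str) -> int:
        # Decrement and return the new value in one step, so two workers
        # can never both observe the same remaining count.
        with self._lock:
            self._values[key] -= 1
            return self._values[key]


def process_chunk(store, job_key, chunk, finished):
    _ = sum(chunk)  # placeholder for the real work on this chunk
    if store.decr(job_key) == 0:
        finished.append(job_key)  # only the last worker to finish gets here


chunks = [[1, 2], [3, 4], [5, 6], [7, 8]]
store = InMemoryCounterStore()
store.set("job:42", len(chunks))  # initial count = number of chunks published

finished = []
workers = [
    threading.Thread(target=process_chunk, args=(store, "job:42", chunk, finished))
    for chunk in chunks
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(finished)  # ['job:42']
```

Because the decrement and the read happen in one atomic step, exactly one worker sees the counter reach 0, so the "all done" action runs once regardless of the order in which the chunks finish.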

Wrapping Up

To sum it all up, I split the message into several messages which will be processed all together in several processes. To manage the message processes, I use Redis caching.

The solution that I have described above will not be a silver bullet every time you have a problem processing a very big message. There are other ways like streaming your message, but that will be a story for another day.

Thanks for reading my article through to the end! I sincerely hope that you enjoyed and found my article interesting and, most importantly, that it was useful.

Redis Database Basics – How the Redis CLI Works, Common Commands, and Sample Projects

freeCodeCamp — Wed, 14 Apr 2021 15:55:10 +0000

By Mehul Mohan

Redis is a popular in-memory database used for a variety of projects, like caching and rate limiting.

In this blog post, we will see how you can use Redis as an in-memory database, why you'd want to use Redis, and finally we'll discuss a few important features of the database. Let's start.

What is an in-memory database?

Traditional databases keep part of the database (usually the "hot" or often-accessed indices) in memory for faster access, and the rest of the database on disk.

Redis, on the other hand, focuses a lot on latency and the fast retrieval and storage of data. So it operates completely on memory (RAM) instead of storage devices (SSD/HDD). Speed is important!

Redis is a key-value database. But don't let it fool you into thinking it's a simple database. You have a lot of ways to store and retrieve those keys and values.

Why do you need Redis?

You can use Redis in a lot of ways. But there are two main reasons I can think of:

You are creating an application where you want to make your code layer stateless. Why? - Because if your code is stateless, it is horizontally scalable. Therefore, you can use Redis as a central storage system and let your code handle just the logic.
You are creating an application where multiple apps might need to share data. For example, what if somebody is trying to bruteforce your site at payments.codedamn.com, and once you detect it, you'd also like to block them at login.codedamn.com? Redis lets your multiple disconnected/loosely connected services share a common memory space.

Redis Basics

Redis is relatively simple to learn as there are only a handful of commands you'll need to know. In the next couple sections, we'll cover a few main Redis concepts and some useful common commands.

The Redis CLI

Redis ships with a CLI, redis-cli, that works as a REPL: whatever you type is evaluated immediately and the reply is printed back.

The above image shows you how to do a simple PING or hello world in Redis in one of my codedamn Redis course exercises (the course is linked at the end if you want to check it out).

This Redis REPL is very useful when you're working with the database in an application and quickly need to get a peek into a few keys or the state of Redis.

Common Redis commands

Trying out common commands on Redis CLI in codedamn course

Here are a few very commonly used commands in Redis to help you learn more about how it works:

SET

SET allows you to set a key to a value in Redis.

Here's an example of how it works:

SET mehul "developer from india"

This sets the key mehul to the value developer from india.

GET

GET allows you to read the value of a key you've set.

Here's the syntax:

GET mehul

This will return the string "developer from india" as we set above.

SETNX

This command will set a value only if the key does not exist. It has a number of use cases, including not accidentally overwriting the value of a key which might already be present.

Here's how it works:

SET key1 value1
SETNX key1 value2
SETNX key2 value2

After running this example, key1 will still hold value1 and key2 will hold value2. The second command has no effect because key1 was already present.

MSET

MSET is like SET, but you can set multiple keys together in one command. Here's how it works:

MSET key1 "value1" key2 "value2" key3 "value3"

Right now we are using key and value as the prefix for keys and values. But in reality when you write such code it's easy to lose track of what is a key and what is a value in such a long command.

So one thing you can do is always quote your value using double quotes, and leave your keys without quotes (if they are valid keynames without quotes).

MGET

MGET is similar to GET, but it can return multiple values at once, like this:

MGET key1 key2 key3 key4

This will return four values as an array: value1, value2, value3 and null. We got key4 as null because we never set it.

DEL

This command deletes a key – simple enough, right?

Here's an example:

SET key value
GET key # gives you "value"
DEL key 
GET key # null

INCR and DECR

You can use these two commands to increment or decrement a key which is a number. They are very useful and you'll use them a lot, because Redis can perform two operations in one – GET key and SET key to key + 1.

This avoids roundtrips to your parent application, and also makes the operation safe to perform without using transactions (more on this later).

Here's how they work:

SET favNum 10
INCR favNum # 11
INCR favNum # 12
DECR favNum # 11
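
To see why doing the read and the write in one step matters, here is a small sketch. It is not a real Redis client: a plain object stands in for the keyspace, and the incr helper is a hypothetical model of what INCR does on the server. The first half interleaves two clients that each GET and then SET, and the second half uses the single-step version.

```javascript
// A plain object stands in for the Redis keyspace (illustrative only).
const db = { favNum: 10 };

// Two clients doing GET then SET, with their steps interleaved:
const readByClientA = db.favNum;   // A reads 10
const readByClientB = db.favNum;   // B reads 10 before A has written
db.favNum = readByClientA + 1;     // A writes 11
db.favNum = readByClientB + 1;     // B also writes 11, so one increment is lost
const afterReadThenWrite = db.favNum;

// Model of INCR: the read and the write happen as one indivisible step.
function incr(key) {
    db[key] += 1;
    return db[key];
}

db.favNum = 10;
incr('favNum'); // client A
incr('favNum'); // client B
const afterIncr = db.favNum;

console.log(afterReadThenWrite, afterIncr); // 11 12
```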

EXPIRE

The EXPIRE command is used to set an expiration timer to a key. Technically it's not a timer, but a kill timestamp beyond which the key will always return null unless it's set again.

SET bitcoin 100
EXPIRE bitcoin 10

GET bitcoin # 100
# after 10 seconds
GET bitcoin # null

EXPIRE uses a little bit more memory to store that key as a whole (because now you have to also store when that key should expire). But you probably won't ever care about that overhead.

TTL

This command can be used to learn how much time the key has to live.

Example:

SET bitcoin 100
TTL bitcoin # -1
TTL somethingelse # -2

EXPIRE bitcoin 5
# wait 2 seconds
TTL bitcoin # returns 3
# after 3 more seconds
GET bitcoin # null
TTL bitcoin # -2

So what can we learn from this code?

TTL will return -1 if the key exists but doesn't have an expiration
TTL will return -2 if the key doesn't exist
TTL will return time to live in seconds if the key exists and will expire
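
The three rules above can be captured in a toy model. This is a sketch under assumptions, not how Redis is implemented: the Map, the helper names and the explicit "now" argument (a clock in seconds, passed in so the example is deterministic) are all invented for illustration.

```javascript
// Toy key store with optional expiry timestamps in seconds (illustrative only).
const entries = new Map();

function set(key, value) {
    entries.set(key, { value, expiresAt: null });
}

function expire(key, seconds, now) {
    const entry = entries.get(key);
    if (entry) entry.expiresAt = now + seconds;
}

function ttl(key, now) {
    const entry = entries.get(key);
    if (!entry) return -2;                     // key does not exist
    if (entry.expiresAt === null) return -1;   // key exists, no expiration
    if (now >= entry.expiresAt) {              // expired keys behave as missing
        entries.delete(key);
        return -2;
    }
    return entry.expiresAt - now;              // seconds left to live
}

set('bitcoin', 100);
const ttlReplies = [
    ttl('bitcoin', 0),        // no expiration set yet
    ttl('somethingelse', 0),  // never set
];
expire('bitcoin', 5, 0);
ttlReplies.push(ttl('bitcoin', 2)); // two seconds later
ttlReplies.push(ttl('bitcoin', 5)); // once the five seconds are up

console.log(ttlReplies); // [ -1, -2, 3, -2 ]
```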

SETEX

You can perform SET and EXPIRE together with SETEX.

Like this:

SETEX key 10 value

Here, the key is "key", the value is "value", and the time to live (TTL) is 10 seconds. This key will get unset after 10 seconds.

Now that you have fundamental knowledge of basic Redis commands and how the CLI works, let's build a couple of projects and use those tools in real life.

Project 1 – Build an API Caching System with Redis

Preview of API caching system building lab on codedamn

This project involves setting up an API caching system with Redis, where you cache results from a 3rd party server and use it for some time.

This is useful so that you are not rate limited by that third party. Also, caching improves your site's speed, so if you implement it correctly it's a win-win for everyone.

You can build this project interactively on codedamn inside the browser using Node.js. If you're interested, you can try the API caching lab for free.

If you're only interested in the solution (and not building it yourself) here's how the core logic will work in Node.js:

app.post('/data', async (req, res) => {
    const repo = req.body.repo

    const value = await redis.get(repo)

    if (value) {
        // means we got a cache hit
        res.json({
            status: 'ok',
            stars: value
        })

        return
    }

    const response = await fetch(`https://api.github.com/repos/${repo}`).then((t) => t.json())

    if (response.stargazers_count != undefined) {
        await redis.setex(repo, 60, response.stargazers_count)
    }

    res.json({
        status: 'ok',
        stars: response.stargazers_count
    })
})

Let's see what's happening here:

We look up the repo (passed in owner/name form, for example facebook/react) in our Redis cache. If it's present, great! We return the star count from the cache, saving us a roundtrip to GitHub's servers.
If we don't find it in the cache, we send a request to GitHub's servers and get the star count. We check that the star count is not undefined (in case the repo doesn't exist or is private). If it has a value, we SETEX the value with a timeout of 60 seconds.
We set a timeout because we don't want to serve stale values over time. With a 60-second expiry, a cached star count is never more than a minute old.
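
The same read-through flow can be pulled out into a small helper. The sketch below is an assumption-laden illustration rather than the lab solution: a Map stands in for redis.get and redis.setex, the clock is passed in as nowSeconds so the behaviour is deterministic, and getOrLoad and fakeStars are hypothetical names. It is kept synchronous for brevity. With a real Redis client and a real HTTP call, each step would be awaited.

```javascript
// In-memory cache with per-key expiry; stands in for redis.get / redis.setex.
const cache = new Map();

function getOrLoad(key, ttlSeconds, nowSeconds, load) {
    const hit = cache.get(key);
    if (hit && nowSeconds < hit.expiresAt) {
        return { value: hit.value, source: 'cache' };
    }

    const value = load(key);
    if (value !== undefined) {
        // Only cache real results, mirroring the stargazers_count check.
        cache.set(key, { value, expiresAt: nowSeconds + ttlSeconds });
    }
    return { value, source: 'origin' };
}

// Hypothetical loader; a real one would call the GitHub API.
let originCalls = 0;
const fakeStars = (repo) => {
    originCalls += 1;
    return repo === 'facebook/react' ? 1000 : undefined;
};

const first = getOrLoad('facebook/react', 60, 0, fakeStars);   // miss
const second = getOrLoad('facebook/react', 60, 30, fakeStars); // hit
const third = getOrLoad('facebook/react', 60, 61, fakeStars);  // expired, reload

console.log(first.source, second.source, third.source, originCalls);
// origin cache origin 2
```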

Here's the full source code:

https://github.com/codedamn-classrooms/redis-nodejs-classroom/tree/lab5sol

Project 2 - Rate limiting API with Redis

Preview of rate limiting API with Redis

This project involves rate limiting a certain endpoint to protect it from bad actors, and then blocking them from accessing that particular API.

This is very useful for login and sensitive API endpoints, where you don't want a single person to hit your endpoint with thousands of requests.

We perform rate limiting by IP address in this lab. If you want to attempt this codelab, you can try it for free on codedamn.

If you're only interested in the solution (and not building it yourself) here's how the core logic will work in Node.js:

app.post('/api/route', async (req, res) => {
    // add data here
    const ip = req.headers['x-forwarded-for'] || req.ip

    const reqs = await redis.incr(ip)
    await redis.expire(ip, 2)

    if (reqs > 15) {
        return res.json({
            status: 'rate-limited'
        })
    } else if (reqs > 10) {
        return res.json({
            status: 'about-to-rate-limit'
        })
    } else {
        res.json({
            status: 'ok'
        })
    }
})

Let's understand this code block:

We try to extract the IP from the x-forwarded-for header (or you can use req.ip as we are using express)
We INCR the key for that IP address. If the key does not exist yet, INCR treats it as 0 before incrementing, so the first request sets it to 1.
We set the key to expire in 2 seconds. Ideally you'd want a larger value - but this is what the codedamn challenge specified above, so there we have it.
Finally, we check the request count. If it is above a certain threshold, we block the request from reaching the main function body.
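
One thing to be aware of in the snippet above: EXPIRE runs on every request, so each request restarts the 2-second countdown, and a client that never pauses never gets a fresh counter. A common variant sets the expiry only when the counter is first created, which gives a fixed window. The sketch below models that variant with an in-memory Map instead of Redis. The names hit and statusFor, the explicit clock argument and the sample IP are assumptions made for illustration.

```javascript
// In-memory stand-in for INCR + EXPIRE; all names here are illustrative.
const counters = new Map();

function hit(ip, nowSeconds, windowSeconds) {
    let entry = counters.get(ip);
    if (!entry || nowSeconds >= entry.expiresAt) {
        // First request of a window: like INCR returning 1, then EXPIRE once.
        entry = { count: 0, expiresAt: nowSeconds + windowSeconds };
        counters.set(ip, entry);
    }
    entry.count += 1;
    return entry.count;
}

function statusFor(reqs) {
    if (reqs > 15) return 'rate-limited';
    if (reqs > 10) return 'about-to-rate-limit';
    return 'ok';
}

// 16 requests inside one 2-second window, then one after it has passed.
const statuses = [];
for (let i = 0; i < 16; i += 1) {
    statuses.push(statusFor(hit('203.0.113.7', 0, 2)));
}
const afterWindow = statusFor(hit('203.0.113.7', 2, 2));

console.log(statuses[9], statuses[10], statuses[15], afterWindow);
// ok about-to-rate-limit rate-limited ok
```

With a real client, the equivalent change would be to call expire only when incr returns 1.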

Here's the full solution:

https://github.com/codedamn-classrooms/redis-nodejs-classroom/tree/lab6sol

More on Redis

Redis is much more than what we have learned so far. But the good thing is that we have learned enough to start working with it already!

In this section, let's cover a few more Redis fundamentals.

Redis is single threaded

Redis executes commands on a single thread, even on a multi-core system that supports multithreading. This is not a performance nightmare, but a safety measure against inconsistent reads and writes in a multithreaded environment.

If Redis were multithreaded, ensuring thread safety when accessing a single key would eventually require some locking mechanism, which would probably perform worse than single-threaded, sequential access anyway.

Redis Transactions

Of course, you cannot do everything in Redis in a single command. But you can surely ask it to do a block of commands in a single go (that is, nobody else talks to Redis while it is executing that block). You can do that using the MULTI command.

Here's how that works:

MULTI
SET hello world
SET yo lo
SET number 1
INCR number
EXPIRE hello 10
EXPIRE yo 5
EXEC

This will perform all these operations in one go. Redis does not run any of the commands that follow MULTI right away. It queues them, and then executes the whole queue in order, without interleaving commands from other clients, the moment it receives EXEC.
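
The queue-then-run behaviour can be pictured with a toy model. This is not a real Redis client and not how the server is implemented. ToyRedis and its methods are hypothetical names, and commands are represented as plain functions over a Map.

```javascript
// Toy model of MULTI/EXEC queuing; not a real Redis client.
class ToyRedis {
    constructor() {
        this.data = new Map();
        this.queue = null; // non-null while a transaction is open
    }

    multi() {
        this.queue = [];
    }

    // Run a command now, or queue it if MULTI has been issued.
    command(fn) {
        if (this.queue) {
            this.queue.push(fn);
            return 'QUEUED';
        }
        return fn(this.data);
    }

    // Run every queued command in order and return their replies.
    exec() {
        const queued = this.queue;
        this.queue = null;
        return queued.map((fn) => fn(this.data));
    }
}

const toy = new ToyRedis();
toy.multi();
const queuedReply = toy.command((d) => { d.set('number', 1); return 'OK'; });
toy.command((d) => { d.set('number', d.get('number') + 1); return d.get('number'); });
const sizeBeforeExec = toy.data.size; // nothing has run yet
const execReplies = toy.exec();

console.log(queuedReply, sizeBeforeExec, execReplies); // QUEUED 0 [ 'OK', 2 ]
```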

Redis includes support for lists and sets for more advanced use cases. You can also use Redis as a broadcasting service where you publish to a channel and others who have subscribed to the channel receive a notification. This is very useful in multi-client architecture.

Conclusion

I hope you liked this introduction to Redis. This blog post is a part of codedamn's new interactive course: Redis + Node.js caching, where you not only learn about these concepts, but practice them within your browser on the go.

Feel free to give the course a try and let me know what you think. You can find me on twitter to send any feedback :)

How to Use Redis to Supercharge Your Web APIs

freeCodeCamp — Wed, 13 May 2020 03:50:58 +0000

By Tarique Ejaz

Performance is an essential parameter to consider when you're designing any piece of software. It is particularly important when it comes to what happens behind-the-scenes.

We, as developers and technologists, adopt multiple tweaks and implementations in order to improve performance. This is where caching comes into play.

Caching is defined as a mechanism to store data or files in a temporary storage location from where it can be instantly accessed whenever required.

Caching has become a must have in web applications nowadays. We can use Redis to supercharge our web APIs - which are built using Node.js and MongoDB.

"Caching would apparently still play a super important role 100 to 200 years down the line."

Redis: A Layman's Overview

Redis, according to the official documentation, is defined as an in-memory data structure store which is used as a database, message broker, or cache storage. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries and streams.

Okay, that is quite a lot of data structures right there. Just to make it simple, almost all the data structures supported can be condensed into one form of string or the other. You will get more clarity as we run through the implementation.

But one thing is clear. Redis is powerful, and when used properly can make our applications not only faster but amazingly efficient. Enough talk. Let's get our hands dirty.

Let's Talk Code

Before we start off, you will need to get redis set up on your local system. You can follow this quick setup process to get redis up and running.

Done? Cool. Let's start. We have a simple application created in Express which makes use of an instance in MongoDB Atlas to read and write data from.

We have two major APIs created in the /blogs route file.

...

// GET - Fetches all blog posts for required user
blogsRouter.route('/:user')
    .get(async (req, res, next) => {
        const blogs = await Blog.find({ user: req.params.user });

        res.status(200).json({
            blogs,
        });
    });

// POST - Creates a new blog post
blogsRouter.route('/')
    .post(async (req, res, next) => {
        const existingBlog = await Blog.findOne({ title: req.body.title });

        if (!existingBlog) {
            let newBlog = new Blog(req.body);

            const result = await newBlog.save();

            return res.status(200).json({
                message: `Blog ${result.id} is successfully created`,
                result,
            });
        }

        res.status(200).json({
            message: 'Blog with same title exists',
        });
    });

...

Sprinkling Some Redis Goodness

We start off by downloading the npm package redis to connect to the local redis server.

const mongoose = require('mongoose');
const redis = require('redis');
const util = require('util');

const redisUrl = 'redis://127.0.0.1:6379';
const client = redis.createClient(redisUrl);
client.hget = util.promisify(client.hget);

...

We make use of the util.promisify function to transform the client.hget function to return a promise instead of taking a callback. You can read more about promisification here.

The Redis connection is in place. Before we write any more caching code, let us take a step back and work out which requirements we need to fulfill and which challenges we are likely to face.

Our caching strategy should be able to address the following points.

Cache the request for all blog posts for a particular user
Clear cache every time a new blog post is created

The likely challenges we should be careful of as we go about our strategy are:
The right way to handle key creation for storing cache data
Cache expiration logic and forced expiration for maintaining cache freshness
Reusable implementation of caching logic

All right. We have our points jotted down and redis connected. On to the next step.

Overriding the Default Mongoose Exec Function

We want our caching logic to be reusable. And not only reusable, we also want it to be the first checkpoint before we make any query to the database. This can easily be done by using a simple hack of piggy-backing onto the mongoose exec function.

...

const exec = mongoose.Query.prototype.exec;

...

mongoose.Query.prototype.exec = async function() {
    ...

    const result = await exec.apply(this, arguments);

    console.log('Data Source: Database');
    return result;
}

...

We make use of the prototype object of mongoose to add our caching logic code as the first execution in the query.

Adding Cache as a Query

To mark which queries should be cached, we add a cache() method to mongoose queries. It accepts an options object through which we pass the user to be used as the hash key.

Note: A hash key identifies a hash data structure which, in layman's terms, is the parent key to a set of key-value pairs. This lets us cache many query-result pairs under a single key. You can read more about hashes in redis here.
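
To make the parent-key idea concrete, here is a small model of a hash using nested Maps. It is only a sketch: hset, hget and del below are hypothetical in-memory helpers named after the Redis commands they imitate, and the comments collection and the second user are invented for the example.

```javascript
// Nested Maps model a Redis hash: hashKey -> (field -> value). Illustrative only.
const hashes = new Map();

function hset(hashKey, field, value) {
    if (!hashes.has(hashKey)) hashes.set(hashKey, new Map());
    hashes.get(hashKey).set(field, value);
}

function hget(hashKey, field) {
    const hash = hashes.get(hashKey);
    return hash && hash.has(field) ? hash.get(field) : null;
}

function del(hashKey) {
    hashes.delete(hashKey);
}

// Two cached queries for one user live under the same hash key.
hset('"tejaz"', '{"user":"tejaz","collection":"blogs"}', '[...blogs]');
hset('"tejaz"', '{"user":"tejaz","collection":"comments"}', '[...comments]');
hset('"someone"', '{"user":"someone","collection":"blogs"}', '[...blogs]');

del('"tejaz"'); // one delete clears everything cached for that user

console.log(
    hget('"tejaz"', '{"user":"tejaz","collection":"blogs"}'),
    hget('"someone"', '{"user":"someone","collection":"blogs"}')
); // null [...blogs]
```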

...

mongoose.Query.prototype.cache = function(options = {}) {
    this.enableCache = true;
    this.hashKey = JSON.stringify(options.key || 'default');

    return this;
};

...

Having done so, we can easily use the cache() query along with the queries we want to cache in the following manner.

...

const blogs = await Blog
                    .find({ user: req.params.user })
                    .cache({ key: req.params.user });

...

Crafting The Cache Logic

We have set up a common reusable query to denote which queries need to be cached. Let's go ahead and write the central caching logic.

...

mongoose.Query.prototype.exec = async function() {
    if (!this.enableCache) {
        console.log('Data Source: Database');
        return exec.apply(this, arguments);
    }

    const key = JSON.stringify(Object.assign({}, this.getQuery(), {
        collection: this.mongooseCollection.name,
    }));

    const cachedValue = await client.hget(this.hashKey, key);

    if (cachedValue) {
        const parsedCache = JSON.parse(cachedValue);

        console.log('Data Source: Cache');

        return Array.isArray(parsedCache) 
                ?  parsedCache.map(doc => new this.model(doc)) 
                :  new this.model(parsedCache);
    }

    const result = await exec.apply(this, arguments);

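    // HMSET has no EX option, so store the field first and then
    // put a 300-second expiration on the whole hash
    client.hset(this.hashKey, key, JSON.stringify(result));
    client.expire(this.hashKey, 300);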

    console.log('Data Source: Database');
    return result;
};

...

Whenever we use the cache() query along with our main query, we set the enableCache flag to true.

If the flag is not set, we run the original exec function as usual. If it is set, we first form the key for fetching and storing/refreshing the cache data.

We use the collection name along with the default query as the key name for the sake of uniqueness. The hash-key used is the name of the user which we have already set earlier in the cache() function definition.
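
The key construction can be tried on its own with plain objects. The helper below is a sketch that mirrors the expression used in exec(). buildCacheKey is a hypothetical name, and the comments collection and title filter are made up. It also shows a consequence of using JSON.stringify: property order is part of the key.

```javascript
// Mirrors the key construction in exec(), using plain objects (illustrative).
function buildCacheKey(query, collectionName) {
    return JSON.stringify(Object.assign({}, query, { collection: collectionName }));
}

const keyA = buildCacheKey({ user: 'tejaz' }, 'blogs');
const keyB = buildCacheKey({ user: 'tejaz' }, 'comments'); // hypothetical collection

// Property order matters to JSON.stringify, so equivalent filters written in a
// different order produce different keys and therefore separate cache entries.
const keyC = buildCacheKey({ user: 'tejaz', title: 'x' }, 'blogs');
const keyD = buildCacheKey({ title: 'x', user: 'tejaz' }, 'blogs');

console.log(keyA);          // {"user":"tejaz","collection":"blogs"}
console.log(keyA === keyB); // false
console.log(keyC === keyD); // false
```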

The cached data is fetched using the client.hget() function which requires the hash-key and the consequent key as parameters.

Note: We always use JSON.parse() on any data we fetch from redis. Similarly, we use JSON.stringify() on the key and data before storing anything into redis. This is done because redis stores hash fields and values as strings, not as JSON objects.

Once we have obtained the cached data, we have to transform each of the cached objects into a mongoose model. This can be done by simply using new this.model().

If the cache does not contain the required data, we query the database. Then, before returning the data to the API, we refresh the cache by writing the result into the user's hash and giving that hash a default expiration time of 300 seconds. EXPIRE works on whole keys, so the timer applies to the hash as a whole rather than to a single field. This is customizable based on your caching strategy.

The caching logic is in place. We have also set a default expiration time. Next up, we look at forcing cache expiration whenever a new blog post is created.

Forced Cache Expiration

In certain cases, such as when a user creates a new blog post, the user expects the new post to be available when they fetch all the posts.

In order to do so, we have to clear the cache related to that user and update it with new data. So we have to force expiration. We can do that by invoking the del() function provided by redis.

...

module.exports = {
    clearCache(hashKey) {
        console.log('Cache cleaned');
        client.del(JSON.stringify(hashKey));
    }
}

...

We also have to keep in mind that we will be forcing expiration on multiple routes. One extensible way is to use this clearCache() as a middleware and call it once any query related to a route has finished execution.

const { clearCache } = require('../services/cache');

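module.exports = (req, res, next) => {
    // next() does not return the route handler's promise, so awaiting it
    // would not wait for the handler. The 'finish' event fires once the
    // response has been sent, i.e. after the handler has done its work.
    res.on('finish', () => clearCache(req.body.user));

    next();
}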

This middleware can be easily called on a particular route in the following way.

...

blogsRouter.route('/')
    .post(cleanCache, async (req, res, next) => {

    ...

    });

...

And we are done. I agree that was quite a lot of code. But with that last part, we have set up redis with our application and taken care of almost all the likely challenges. It is time to see our caching strategy in action.

Redis in Action

We make use of Postman as the API client to see our caching strategy in action. Here we go. Let's run through the API operations, one by one.

We create a new blog post using the /blogs route

New Blog Post Creation

We then fetch all the blog posts related to user tejaz

Fetching all Blog Posts for User tejaz

We fetch all the blog posts for user tejaz once more.

Fetch all Blog Posts for User tejaz Once More

You can clearly see that when we fetch from the cache, the time taken goes down from 409ms to 24ms. This supercharges your API by cutting the time taken by about 94%.

Plus, we can clearly see that cache expiration and update operations work as expected.

You can find the complete source code in the redis-express folder here.

https://github.com/tarique93102/article-snippets/tree/master/redis-express

Conclusion

Caching is a mandatory step for any performance-efficient and data-intensive application. Redis helps you easily achieve this in your web applications. It is a super powerful tool, and if used properly it can definitely provide an excellent experience to developers as well as users all around.

You can find the complete set of redis commands here. You can use it with redis-cli to monitor your cache data and application processes.

The possibilities offered by any particular technology are truly endless. If you have any queries, you can reach out to me on [LinkedIn](https://www.linkedin.com/in/tarique-ejaz/).

In the meantime, keep coding.

How to Persist State in Time-Series Models with Docker and Redis

What we’ll cover:

Who is This Guide For?

Understanding the Problem

So, what is a time-series model?

1. Containers are ephemeral by design

2. Lost context between predictions

3. Model amnesia on restart

The Solution: External State Store

Hands-On Implementation

Start with the broken approach

How to fix it with volumes

So, what is a volume and how does it work?

How the code handles state

Test the health endpoint

What About Scaling?

Horizontal scaling with Redis Cluster

High availability with Redis Sentinel

Use managed Redis services

Common Pitfalls to Avoid

Don't assume volumes work

Don't ignore Redis memory limits

Don't skip monitoring

Conclusion

Caching a Next.js API using Redis and Sevalla

Table of Contents

Why Caching Matters

What is Redis?

Setting Up the Project

Provisioning Redis

Updating Cache on Reads

Updating Cache on Writes

Deploying to Sevalla

Why Redis Works Well with Next.js APIs

Conclusion

How In-Memory Caching Works in Redis

Table of Contents

What Is In-Memory Caching?

What Is Redis?

How to Work with Redis

Redis Installation

Redis Data Types

Redis with Python

Real-Life Use Cases

Conclusion

How to Build a Flexible API with Feature Flags Using Open Source Tools

Here’s what we’ll cover:

Prerequisites

What is a Feature Flag?

Feature Flags for Backend Development

Why Use Open Source Tools?

Let’s Code!

Initializing the Tools

Creating Endpoints for the API

How to Add Feature Flagging

Understanding the Feature Flag Code Logic

How to Create Feature Flags in the Flagsmith Dashboard

Rate Limiting Feature Flag

Beta Feature Flag

Getting the Access Key

Running the API

Updating the rate_limit Flag

How to Integrate Feature Flags with the GitHub App

Testing the Flagsmith GitHub App

Conclusion

How to Build a Scalable URL Shortener with Distributed Caching Using Redis

What You Will Learn

Prerequisites

Table of Contents

Project Overview

System Architecture

Step 1: Setting Up the Project

Step 2: Setting Up Redis Instances

Step 3: Implementing the URL Shortener Service

Step 4: Implementing Cache Invalidation

Adding Expiry to Cached URLs

Step 5: Monitoring Cache Metrics

Step 6: Testing the Application

Conclusion: What You’ve Learned
