caching - freeCodeCamp.org

How to Optimize Django REST APIs for Performance: Profiling, Caching, and Scaling.

Mari — Tue, 17 Feb 2026 18:22:09 +0000

Performance problems in APIs rarely start as performance problems. They usually start as small design decisions that worked perfectly when the application had ten users, ten records, or a single developer testing locally. Over time, as traffic increases and data grows, those same decisions begin to slow everything down.

In this article, we’ll walk step by step through how performance issues arise in Django REST APIs, how to see them clearly using profiling tools, and how to fix them using query optimization, caching, pagination, and basic scaling strategies.

This article will be most useful for developers who already understand Django, the Django REST Framework, and REST concepts, but are new to performance optimization.

Why Django REST APIs Become Slow

Before optimizing anything, it’s important to understand why APIs become slow in the first place.

Most performance issues in Django REST APIs come from three main sources:

Too many database queries
Doing expensive work repeatedly
Returning more data than necessary

Django is fast by default, but it does exactly what you ask it to do. If your API endpoint triggers 300 database queries, Django will happily run all 300.

Now let’s look at some common causes of performance issues in Django REST APIs.

1. N+1 Query Problems in Serializers

This happens when you loop over objects and access related fields, causing a separate query for each object.

# models.py
class Author(models.Model):
    name = models.CharField(max_length=100)

class Post(models.Model):
    title = models.CharField(max_length=200)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

# views.py (naive approach)
posts = Post.objects.all()
for post in posts:
    # This triggers a query per post to fetch the author
    print(post.author.name)

If you have 100 posts, this runs 101 queries: 1 for posts and 100 for authors. Django lazily loads related objects by default, so without intervention, your API performs repetitive database work that slows response times.

2. Lazy Loading Related Objects in Querysets

# Naive queryset fetching all related objects separately
posts = Post.objects.all()
authors = [post.author for post in posts]  # triggers extra queries per post

Each access to post.author triggers a new query. Even though you already fetched all posts, Django lazily loads related objects by default. This creates many extra queries, slowing down your API.

3. Serializing Large Datasets Without Pagination

Returning large query sets all at once can slow down your API and increase memory usage.

# views.py
from rest_framework.response import Response
from rest_framework.decorators import api_view
from .models import Post
from .serializers import PostSerializer

@api_view(['GET'])
def all_posts(request):
    posts = Post.objects.all()  # retrieves all posts at once
    serializer = PostSerializer(posts, many=True)
    return Response(serializer.data)

If your database has thousands of posts, this endpoint fetches everything in memory, serializes it, and sends it over the network. It’s slow and can crash under load. Later, we’ll learn to paginate results efficiently.

4. Recomputing Expensive Work Repeatedly

Some endpoints calculate the same values on every request instead of caching or precomputing.

from django.http import JsonResponse

def expensive_view(request):
    # Simulate an expensive computation that runs on every request
    result = sum(i**2 for i in range(1000000))
    return JsonResponse({"result": result})

Even if the data doesn’t change often, this computation happens on every request, consuming CPU time unnecessarily.
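
One way to avoid repeating this kind of work is to memoize the result in process memory. The sketch below uses Python's functools.lru_cache. The function name sum_of_squares is hypothetical, and the sketch assumes the result depends only on the argument passed in.

```python
from functools import lru_cache


@lru_cache(maxsize=32)
def sum_of_squares(limit: int) -> int:
    """Return 0^2 + 1^2 + ... + (limit - 1)^2, computed once per distinct limit."""
    return sum(i * i for i in range(limit))


first = sum_of_squares(1_000_000)   # computed and stored
second = sum_of_squares(1_000_000)  # served from the in-process cache
info = sum_of_squares.cache_info()
print(info.hits, info.misses)  # prints: 1 1
```

This cache lives inside a single Python process, so each worker process in a multi-worker deployment would keep its own copy.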

Performance optimization is about reducing unnecessary work.

At this point, it might be tempting to jump straight into fixes like caching responses or optimizing database queries. But doing that without evidence often leads to wasted effort or even new problems.

Before changing anything, you need to understand where your API is actually spending time. Is it the database? Is it serialization? Is it Python code running repeatedly on every request? This is where profiling becomes essential.

Profiling: Finding the Real Bottlenecks

Optimizing without profiling is guessing. Profiling helps you answer one question:

Where is my API actually spending time?

In practice, profiling means observing an API while it runs and collecting data about what it’s doing. This includes how many database queries are executed, how long those queries take, and how much time is spent in Python code, such as serializers or business logic.

By profiling first, you avoid making assumptions and can focus on fixing the parts of your API that are truly slowing things down.
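
For the Python side of a request, the standard library's cProfile module can show which functions consume the most time. The following is a minimal sketch: build_payload and handle_request are hypothetical stand-ins for serializer work and a view function, not part of Django or DRF.

```python
import cProfile
import io
import pstats


def build_payload(n: int) -> list:
    """Hypothetical stand-in for serializer work: builds n small dicts."""
    return [{"id": i, "square": i * i} for i in range(n)]


def handle_request() -> int:
    """Hypothetical stand-in for a view function."""
    return len(build_payload(50_000))


profiler = cProfile.Profile()
profiler.enable()
result = handle_request()
profiler.disable()

# Render the five most expensive calls, sorted by cumulative time, into a string.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The report lists each function with its call count and cumulative time, which makes it easier to see whether Python code rather than the database dominates a slow request.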

Measuring Query Count in a View

When DEBUG is True, Django records every query executed on a connection in connection.queries. You can inspect them directly:

from django.db import connection
from rest_framework.decorators import api_view
from rest_framework.response import Response
from .models import Post
from .serializers import PostSerializer

@api_view(["GET"])
def post_list(request):
    posts = Post.objects.all()
    serializer = PostSerializer(posts, many=True)

    response = Response(serializer.data)

    print(f"Total queries executed: {len(connection.queries)}")

    return response

If this prints 101 queries for 100 posts, you likely have an N+1 problem. This simple check confirms whether the database layer is the bottleneck.

One of the easiest ways to profile Django applications during development is by using tools that expose this information directly while requests are being processed.

The Django Debug Toolbar is one of the simplest ways to understand performance during development. It acts as a lightweight profiling tool that shows what happens behind the scenes when a request is handled.

It shows you:

How many SQL queries were executed
How long each query took
Whether queries are duplicated
Which parts of the request lifecycle are slow

First, install it:

pip install django-debug-toolbar

In settings.py:

INSTALLED_APPS = [
    ...
    "debug_toolbar",
]

MIDDLEWARE = [
    ...
    "debug_toolbar.middleware.DebugToolbarMiddleware",
]

INTERNAL_IPS = [
    "127.0.0.1",
]

In urls.py:

import debug_toolbar
from django.urls import path, include

urlpatterns = [
    ...
    path("__debug__/", include(debug_toolbar.urls)),
]

When you load an endpoint in the browser during development, the toolbar displays total SQL queries, execution time, and duplicate queries. This makes inefficiencies immediately visible.

When you load an API endpoint and see 150 SQL queries for a single request, that’s a strong signal that something is wrong, often an N+1 query problem or inefficient serializer behavior.

Logging SQL Queries

Django allows you to log all executed SQL queries. This is especially useful when debugging API views.

Seeing the raw SQL makes inefficiencies obvious, such as repeated SELECT statements for the same table.

How to Enable SQL Query Logging

You can configure Django to log all SQL queries in settings.py:

LOGGING = {
    "version": 1,
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
        },
    },
    "loggers": {
        "django.db.backends": {
            "handlers": ["console"],
            "level": "DEBUG",
        },
    },
}

With this configuration, every SQL query is printed to the console while your API runs, provided DEBUG is True (Django only logs queries in debug mode). Repeated SELECT statements or unexpected queries become obvious.

Profiling API Response Time

Database queries are only one part of API performance. Beyond queries, it’s also important to measure the total response time of an endpoint.

Profiling response time helps you understand whether delays are caused by database access or by other parts of the request lifecycle. For example, if an endpoint takes 1.2 seconds to respond but only 50 milliseconds are spent on database queries, the bottleneck is likely in serialization, business logic, or repeated computations in Python.

By comparing query time and total response time, profiling helps you identify what to fix first instead of optimizing the wrong layer of the system.

How to Measure Total Response Time

import time
from rest_framework.decorators import api_view
from rest_framework.response import Response

@api_view(["GET"])
def example_view(request):
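    # perf_counter is a monotonic clock suited to measuring durations
    start_time = time.perf_counter()

    # Simulate work
    data = {"message": "Hello world"}

    response = Response(data)

    # This times the view body only; rendering and middleware run afterwards
    elapsed = time.perf_counter() - start_time
    print(f"Response time: {elapsed:.4f} seconds")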

    return response

If database queries are fast but the total response time is high, the bottleneck may be serialization or expensive Python logic.

Once you’ve identified that database access is a significant contributor to slow response times, the next step is to look more closely at how Django retrieves related data.

SQL Query Optimization in Django REST APIs

One of the most common reasons Django REST APIs become slow is inefficient access to related objects. This often manifests as the N+1 query problem, where fetching related objects triggers a separate database query for each item. Identifying and fixing this problem can significantly reduce the number of queries and improve API performance.

Understanding the N+1 Query Problem

Consider a simple example:

You fetch a list of posts
Each post has an author
For every post, Django fetches the author separately

If you have 100 posts, this results in 101 queries: 1 for the posts and 100 for the authors. This happens because Django lazily loads related objects by default. Without intervention, your API performs repetitive database work that slows down response times.
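
The arithmetic can be reproduced without Django. The sketch below uses Python's built-in sqlite3 module with hypothetical author and post tables, counts the queries issued by a lazy-loading loop, and compares that with a single JOIN. The run helper is an illustrative query counter, not a Django API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author(id)
    );
    """
)
conn.executemany(
    "INSERT INTO author (id, name) VALUES (?, ?)",
    [(i, f"Author {i}") for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO post (id, title, author_id) VALUES (?, ?, ?)",
    [(i, f"Post {i}", i) for i in range(1, 101)],
)

stats = {"queries": 0}


def run(sql, params=()):
    """Run one SELECT and count it, like a very small query log."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


# Lazy loading: 1 query for the posts, then 1 more per post for its author.
posts = run("SELECT id, title, author_id FROM post ORDER BY id")
lazy_names = [
    run("SELECT name FROM author WHERE id = ?", (author_id,))[0][0]
    for _post_id, _title, author_id in posts
]
lazy_queries = stats["queries"]

# JOIN: the same data fetched in a single query.
stats["queries"] = 0
joined = run(
    "SELECT post.title, author.name FROM post "
    "JOIN author ON author.id = post.author_id ORDER BY post.id"
)
join_queries = stats["queries"]

print(lazy_queries, join_queries)  # prints: 101 1
```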

Solving the Problem with `select_related` and `prefetch_related`

Django provides built-in tools to control how related objects are loaded efficiently: select_related and prefetch_related.

1. Using select_related

select_related is designed for foreign key and one-to-one relationships. It performs an SQL join and retrieves related objects in a single query.

Use it when:

You know you will access related objects
The relationship is one-to-one or many-to-one

posts = Post.objects.select_related("author")

for post in posts:
    print(post.author.name)  # No additional queries

This performs a SQL JOIN and retrieves posts and authors in a single query, eliminating the N+1 problem and the repeated database hits that come with it.

2. Using prefetch_related

prefetch_related is used for many-to-many and reverse foreign key relationships. It performs separate queries for each related table but combines the results in Python.

Use it when:

A SQL join would produce too much duplicated data
You are dealing with collections of related objects

Example: How to Optimize a Many-to-Many Relationship

Consider a blog application where posts can have multiple tags:

# models.py
class Tag(models.Model):
    name = models.CharField(max_length=50)

class Post(models.Model):
    title = models.CharField(max_length=200)
    tags = models.ManyToManyField(Tag)

Now imagine fetching posts and accessing their tags:

posts = Post.objects.all()

for post in posts:
    print(post.tags.all())  # Triggers additional queries

If you have 100 posts, Django may execute:

1 query to fetch posts
1 query per post to fetch related tags

This results in many unnecessary database hits.

You can optimize this using prefetch_related:

posts = Post.objects.prefetch_related("tags")

for post in posts:
    print(post.tags.all())  # Uses prefetched data

With this approach, Django performs one query for posts and one query for all related tags. It then matches them in Python, eliminating repeated database queries.
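
The "two queries, then match in Python" idea can be sketched with plain sqlite3. This is an illustration of the general technique under assumed table names (post, tag, post_tags), not Django's actual implementation.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post_tags (post_id INTEGER, tag_id INTEGER);
    INSERT INTO post VALUES (1, 'Caching'), (2, 'Profiling'), (3, 'Untagged');
    INSERT INTO tag VALUES (1, 'django'), (2, 'performance');
    INSERT INTO post_tags VALUES (1, 1), (1, 2), (2, 2);
    """
)

# Query 1: fetch the posts.
posts = conn.execute("SELECT id, title FROM post ORDER BY id").fetchall()

# Query 2: fetch every tag for those posts in one round trip.
post_ids = [post_id for post_id, _title in posts]
placeholders = ", ".join("?" for _ in post_ids)
rows = conn.execute(
    "SELECT pt.post_id, tag.name FROM post_tags pt "
    "JOIN tag ON tag.id = pt.tag_id "
    f"WHERE pt.post_id IN ({placeholders}) "
    "ORDER BY pt.post_id, tag.name",
    post_ids,
).fetchall()

# Match tags to posts in Python instead of issuing one query per post.
tags_by_post = defaultdict(list)
for post_id, tag_name in rows:
    tags_by_post[post_id].append(tag_name)

result = {title: tags_by_post.get(post_id, []) for post_id, title in posts}
print(result)
```

However many posts are returned, the number of queries stays at two.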

Together, these tools allow you to optimize your queries and eliminate the N+1 problem efficiently.

Common Beginner Mistakes

Even after applying these optimizations, it’s easy to make mistakes. Watch out for:

Forgetting that serializers can trigger additional queries
Using select_related on many-to-many relationships
Assuming Django automatically optimizes queries
Not checking the query count after adding serializers

Paying attention to these pitfalls ensures your API remains fast and scalable.

Caching in Django REST APIs

Even after optimizing database queries, API performance can still suffer if the same computations or database lookups are performed repeatedly. This is where caching comes in. Caching is a technique for storing the results of expensive operations so they can be retrieved more quickly the next time they are needed.

At its core, caching exists because computers have multiple layers of memory with different speeds:

CPU registers (fastest)
L1, L2, L3 caches
Main memory (RAM)
SSD storage
HDD storage (slowest)

Each layer trades speed for size: the closer the data is to the CPU, the faster it can be accessed. Software systems use the same principle; by storing frequently accessed data in a “closer” or faster location, applications can respond more quickly.

Cache Eviction

Caches are limited in size, so when a cache is full, some data must be removed to make room for new data. This process is called cache eviction.

Common eviction strategies include:

Least Recently Used (LRU): removes the data that hasn’t been accessed for the longest time
Random Replacement: removes a random item from the cache

The goal is to keep the data that is most likely to be requested again while freeing space for new data. Understanding this helps developers use caching effectively.
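
To make LRU eviction concrete, here is a minimal sketch built on collections.OrderedDict. The LRUCache class is hypothetical and only illustrates the policy; real cache servers implement eviction with more sophisticated (often approximate) algorithms.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" is now more recently used than "b"
cache.set("c", 3)  # cache is full, so "b" is evicted
print(cache.get("b"), cache.get("a"), cache.get("c"))  # prints: None 1 3
```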

Caching in Application Architectures

Caching exists at several levels in modern software systems:

Client-side caching: Web browsers cache HTTP responses to reduce the need for repeated network requests. This is controlled with HTTP headers like Cache-Control.
CDN caching: Content Delivery Networks store static assets closer to users, reducing latency and server load.
Backend caching: Backend services cache results from database queries, computed values, or API responses. This is where Django caching is most commonly applied.

By applying caching strategically at the backend, APIs can serve data faster while reducing computation and database load.

Caching in Django

Django provides a flexible caching framework that supports multiple backends, including in-memory, file-based, database-backed, and third-party stores like Redis. The main types of caching in Django are:

Per-view caching: caches the entire output of a view. Ideal for endpoints where responses rarely change.
```
from django.http import JsonResponse
from django.views.decorators.cache import cache_page

@cache_page(60 * 15)  # cache for 15 minutes
def my_view(request):
    # Placeholder body for illustration; the response is what gets cached
    return JsonResponse({"status": "ok"})
```
Template fragment caching: caches specific parts of a template to avoid repeated rendering.
Low-level caching: gives full control over what is cached and for how long, making it ideal for API responses.

By combining these approaches, you can reduce repeated work in your API, lower database load, and speed up response times.
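
The low-level approach usually follows a "check the cache, fall back to the source, store the result" pattern. The sketch below shows that pattern with a tiny in-memory TTLCache class, which is a hypothetical stand-in whose get and set methods loosely mirror the shape of Django's low-level cache API. The key name post_count and the loader function are assumptions for illustration.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a cache backend; not Django's implementation."""

    def __init__(self, clock=time.monotonic) -> None:
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return default
        return value

    def set(self, key, value, timeout) -> None:
        self._store[key] = (self._clock() + timeout, value)


def get_post_count(cache, load_from_db):
    """Try the cache first, fall back to the loader, then store the result."""
    count = cache.get("post_count")
    if count is None:
        count = load_from_db()
        cache.set("post_count", count, timeout=60)
    return count


now = [0.0]   # controllable clock so the example is deterministic
calls = []    # records each time the "database" is hit


def fake_db():
    calls.append(1)
    return 42


cache = TTLCache(clock=lambda: now[0])
a = get_post_count(cache, fake_db)  # miss: loads from the fake database
b = get_post_count(cache, fake_db)  # hit: served from the cache
now[0] = 61.0                       # move past the 60-second timeout
c = get_post_count(cache, fake_db)  # expired: loads again
print(a, b, c, len(calls))  # prints: 42 42 42 2
```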

When to Use Redis

While Django’s built-in caching backends are sufficient for many projects, high-traffic APIs often require a shared, in-memory cache. This is where Redis excels. Redis is designed for fast, low-latency access and can handle frequent reads across multiple servers.

You should consider using Redis when:

Data is read frequently but changes infrequently
Low latency is important for API responses
You need cache expiration and eviction policies
You want a shared cache across multiple servers or services

Redis is particularly effective for API endpoints that serve the same data to many users, such as frequently accessed lists or computed results.
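
As a rough sketch, pointing Django's cache framework at Redis can be as small as the settings fragment below. It assumes Django 4.0 or later (which ships a built-in Redis backend), the redis Python package installed, and a Redis server listening on localhost. The address and the five-minute default timeout are placeholder assumptions.

```python
# settings.py (fragment)
CACHES = {
    "default": {
        # Built-in Redis backend, available from Django 4.0
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        # Assumed local Redis instance; replace with your own address
        "LOCATION": "redis://127.0.0.1:6379",
        # Default expiry in seconds for keys stored without an explicit timeout
        "TIMEOUT": 300,
    }
}
```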

Common Beginner Mistakes

Caching is powerful, but it’s easy to misuse. Some common pitfalls include:

Caching everything blindly: not all data benefits from caching
Forgetting cache invalidation: stale data can lead to incorrect responses
Using cache where query optimization would suffice: sometimes optimizing database queries is a better solution than caching

Remember: caching should complement good database design, not replace it.

Pagination and Limiting Expensive Datasets

Even with caching, returning large datasets in a single request can slow down your API and increase memory usage. Pagination is a simple and effective way to limit the amount of data returned at once.

Pagination helps by reducing:

Database load
Memory usage
Serialization time
Network transfer size

Django REST Framework provides built-in pagination classes that make it easy to paginate endpoints. As a rule of thumb, always paginate list endpoints unless there is a strong reason not to.
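
A minimal way to turn pagination on globally is a settings fragment like the one below. The page size of 20 is an arbitrary assumption; pick a value that fits your data.

```python
# settings.py (fragment)
REST_FRAMEWORK = {
    "DEFAULT_PAGINATION_CLASS": "rest_framework.pagination.PageNumberPagination",
    "PAGE_SIZE": 20,  # assumed value: number of items returned per page
}
```

Note that this global setting is applied automatically only to generic views and viewsets. Function-based views written with @api_view need to call a paginator themselves.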

Load Testing and Measuring Improvement

Optimizations are only meaningful if you can measure their impact. Load testing simulates multiple users accessing your API simultaneously, helping you answer key questions:

How many requests per second can my API handle?
Where does the API start to break under load?
Did caching, query optimization, and pagination actually improve performance?

By running load tests before and after optimization, you can validate that your changes have the desired effect and avoid optimizing the wrong parts of your system.
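
Dedicated load-testing tools give far more detail, but the basic idea can be sketched with the standard library. In the sketch below, run_load_test and fetch are hypothetical helper names, and the final dry run uses a stand-in sender so the code runs without a server.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url: str) -> float:
    """Send one GET request and return its latency in seconds."""
    start = time.perf_counter()
    with urlopen(url, timeout=10) as response:
        response.read()
    return time.perf_counter() - start


def run_load_test(send, total_requests: int, concurrency: int) -> dict:
    """Call send() total_requests times across worker threads and summarize."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda _: send(), range(total_requests)))
    elapsed = max(time.perf_counter() - started, 1e-9)
    p95_index = max(0, int(len(latencies) * 0.95) - 1)
    return {
        "requests": len(latencies),
        "requests_per_second": len(latencies) / elapsed,
        "p95_seconds": latencies[p95_index],
    }


# Dry run with a stand-in sender that pretends every request took 10 ms.
summary = run_load_test(lambda: 0.01, total_requests=50, concurrency=5)
print(summary["requests"], summary["p95_seconds"])  # prints: 50 0.01
```

Against a real server you would pass something like lambda: fetch("http://127.0.0.1:8000/api/posts/") as the sender (the URL is an assumed example), then compare the numbers before and after each optimization.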

Summary and Next Steps

Optimizing Django REST APIs isn’t about chasing every tiny micro-optimization. It’s about reducing unnecessary work and focusing on the parts of your API that actually slow down performance.

Key Takeaways

Profile before optimizing: Identify the real bottlenecks before making changes.
Reduce database queries: Use techniques like select_related, prefetch_related, and avoid N+1 queries.
Cache frequently accessed data: Use Django caching and Redis to reduce repeated computations.
Paginate large datasets: Limit memory usage and network load by returning data in chunks.
Measure performance changes: Always verify that your optimizations have a real impact.

Next Steps for Your APIs

Add profiling to your existing APIs to understand where time is spent.
Identify one slow endpoint and focus on optimizing it first.
Optimize database queries using Django ORM best practices.
Introduce caching carefully; avoid caching everything blindly.
Measure the results with load testing and performance metrics.

Remember: Performance optimization is not a one-time task. It’s a habit built by continuously observing how your system works, testing improvements, and applying changes where they make the most impact.

Why Your UI Won’t Update: Debugging Stale Data and Caching in React Apps

Oluwadamisi Samuel — Thu, 05 Feb 2026 17:32:01 +0000

Your UI doesn’t “randomly” refuse to update. In most cases, it’s rendering cached data, which is data that was saved somewhere so the app doesn’t have to do the same work again.

Caching is great for performance, but it becomes a pain when you don’t realize which layer is reusing old data.

If you’ve ever seen this:

You update a profile name, but the screen still shows the old one.
You delete an item, but it stays in the list.
Your API returns fresh JSON, but the page refuses to change.
You deploy a fix, but your teammate still sees the old behavior.

You’re probably hitting a cache.

What makes this especially confusing is that not all stale UI comes from “real” caches. Modern web apps have multiple places where data can be reused, saved, or replayed between your UI, your API, and the platform your app is deployed on. When you don’t have a clear mental model of these layers, debugging turns into guesswork.

This article lays out a practical guide to the five most common caching layers that cause stale UI, plus one non-cache trap that looks exactly like one. The goal is to help you quickly identify where stale data is coming from, so you can fix the right thing instead of “refreshing harder.”

Why it Matters

I first ran into this while building an app where the UI wouldn’t update after a successful change. The API returned 200 OK, the database was correct, but the screen stayed stale. I assumed something was wrong with my code or state logic. Instead, the issue was coming from a caching layer I hadn’t invalidated. That’s the real problem with stale UI: you can’t debug it effectively unless you know which layer might be serving cached data.

When you understand where caching happens:

You debug faster by identifying the layer instead of guessing.
You avoid production-only bugs caused by caching defaults.
You stop chasing React issues when the data was never fresh.

This article gives you a simple mental model to pinpoint the layer and fix the right thing.

Why it Matters
The Mental Model
Non-Cache Cause
Cache 1: React Query Cache
Cache 2: Next.js fetch() Caching
Cache 3: Browser HTTP Cache (a Saved Copy in Your Browser)
Cache 4: CDN/Hosting Cache
Cache 5: Service Worker Cache (Only if Your Site is a PWA)
10-Second Debug Guide
Prevention: Set Caching Intentionally
Recap

The Mental Model

When your UI shows data, it feels like it comes straight from your API. In reality, the request/response path can hit multiple reuse points.

Non-Cache Cause

Duplicated React local state (same symptoms as caching). This one isn’t a formal cache, but it causes a lot of “why didn’t it update?” bugs, especially for beginners.

The common trap:

const [name, setName] = useState(user.name) // initialized once

useState only uses its argument during the initial render. On every subsequent render, React ignores this value and preserves the existing state.

If user.name later changes (for example, after fresh API data arrives), the name state will not update automatically. At that point, name becomes a stale copy of user.name, and the UI renders outdated data unless you manually synchronize it.

This happens because you have duplicated state:

user.name is the source of truth.
name state is a local snapshot taken once.

React does not keep duplicated state in sync for you.

Correct patterns:

Render directly from the source when possible.

If the value is not being edited locally, do not copy it into state:

{user.name}

This guarantees the UI always reflects the latest data.

Explicitly synchronize local state when editable state is required.

If you need local, editable state (for example, a controlled input), you must opt in to synchronization:

const [name, setName] = useState(user.name);

useEffect(() => {
  setName(user.name);
}, [user.name]);

This effect runs only when user.name changes, explicitly updating local state to match the new source value.

Cache 1: React Query Cache

React Query (TanStack Query) stores query results in a QueryClient cache (in memory by default) so your UI can render quickly and avoid unnecessary network requests. When a component needs data, React Query can return cached data immediately and then decide whether to fetch the data again based on options like staleTime and “refetch” behaviors (on mount, window focus, reconnect).

Common failure mode: mutation succeeds, but the UI stays old

A 200 OK only confirms the mutation request succeeded. It does not automatically update the cached query data your UI is rendering.

After a mutation, one of these usually happens:

The query that renders the screen was not invalidated or refetched
You invalidated the wrong query key (the UI reads from a different key)
The UI is rendering local React state that’s out of sync (not the query result)

The simplest “safe” pattern is: invalidate the exact query key your UI uses, so it fetches fresh data.

import { useMutation, useQueryClient } from "@tanstack/react-query";

function useUpdateProfile(userId: string) {
  const queryClient = useQueryClient();

  return useMutation({
    mutationFn: updateProfileRequest,
    onSuccess: () => {
      // Invalidate the same key your UI query uses (example: ["user", userId])
      queryClient.invalidateQueries({ queryKey: ["user", userId] });
    },
  });
}

If your UI uses a different key (for example ["me"] or ["user", userId, "profile"]), you must invalidate that key instead. React Query won’t “figure it out” from the URL.

Query Keys: React Query Caches by Key, not URL

React Query does not cache by endpoint URL. The query key is the identity of the cached data. If two different requests share the same key, React Query treats them as the same data and they can overwrite each other.

You should avoid keys like ["user"] (too broad), and use keys like ["user", userId] and ["users", { page, search, filter }].

Two settings that control “when it will refetch”:

staleTime: how long cached data is treated as fresh. While data is fresh, React Query is less likely to refetch automatically.
gcTime (formerly cacheTime): how long unused query data stays in memory after it’s no longer used by any component, before it’s garbage collected.

Cache 2: Next.js fetch() Caching

This is the one that surprises a lot of frontend devs. Next.js can cache results to speed things up. That means your server might return a previously saved copy of:

The API data it fetched, or
The page it already built

This is often the first time frontend developers encounter server-side caching behavior that affects UI correctness. So, even if your database has the new value, you can still see the old one, because Next.js didn’t fetch the API again, or didn’t rebuild the page this time.

This mainly applies to the App Router (Next.js calls these saved copies the Data Cache and Full Route Cache).

What you’ll notice when this happens

You refresh the page and it still shows the old value.
Your API is correct (Postman/curl shows the new email), but the UI is stuck.
Sometimes it “fixes itself” after a short wait (because the saved copy refreshes on a timer).

For example: “I updated my profile email, but prod still shows the old one”

The page (reads email on the server):

// app/settings/page.tsx
export default async function SettingsPage() {
  const res = await fetch("https://api.example.com/users/42", {
    method: "GET",
  });
  const user = await res.json();

  return (
    <main>
      <h1>Settings</h1>
      <p>Email: {user.email}</p>
    </main>
  );
}

You submit an “Update email” form, the API returns 200 OK, the database is updated, but /settings still shows the previous email in production.

That usually means you’re seeing a saved copy somewhere on the server side.

How to debug it

Step 1: Reproduce in a production-like run

Caching can behave differently in development. Run:

next build && next start

Then test again.

Step 2: Confirm whether the request is reaching your Next.js server at all

Add a log inside the page:

console.log("Rendering /settings at", new Date().toISOString());

Then reload /settings twice.

If you see a new timestamp every reload, the request is reaching your server and the page code is running.
If you don’t see logs in production, your request may not be reaching your server at all (often because a hosting/CDN layer is serving a saved copy before Next.js runs). You’ll confirm that in the CDN section later.

Step 3: Force Next.js to ask your API every time

Change the fetch to:

const res = await fetch("https://api.example.com/me", {
  method: "GET",
  cache: "no-store",
});

This means: don’t save this response – always fetch it again.

If this fixes the stale email, then the problem was a saved copy of the API response (Data Cache).

Step 4: If the email is still stale, force Next.js to rebuild the page every request

Add this to the page file:

// app/settings/page.tsx
export const dynamic = "force-dynamic";

This means: don’t serve a saved copy of the page; rebuild it per request.

A “beginner-safe” setup for a user settings page that combines these suggestions:

// app/settings/page.tsx
export const dynamic = "force-dynamic";

export default async function SettingsPage() {
  const res = await fetch("https://api.example.com/me", { cache: "no-store" });
  const me = await res.json();
  return <p>Email: {me.email}</p>;
}

When you want caching for speed but still need real-time updates, here are some options:

Option A: Refresh the saved copy every N seconds

Good for public pages, not ideal for “my settings must update now.”

await fetch(url, { next: { revalidate: 60 } });

This means: “You can reuse a saved copy, but refresh it at most every 60 seconds.”

Option B: Refresh right after the update (best for “update email” flows)

If you update the email on the server (Server Action or API route), tell Next.js to throw away the saved copy for the /settings page so the next visit is fresh:

// app/settings/actions.ts
"use server";

import { revalidatePath } from "next/cache";

export async function updateEmail(email: string) {
  await fetch("https://api.example.com/me/email", {
    method: "PUT",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ email }),
  });

  // Tell Next.js: next request to /settings should be rebuilt
  revalidatePath("/settings");
}

Note: Next.js caching details can differ by version and by App Router vs Pages Router. Instead of trying to memorize defaults, debug by setting the behavior explicitly (no-store, revalidate, force-dynamic) and observe what changes.

Cache 3: Browser HTTP Cache (a Saved Copy in Your Browser)

Sometimes the browser reuses a saved copy of an API response (from memory or disk), so it doesn’t fully fetch it again.

What you’ll notice

You open DevTools, and the Network tab shows (from memory cache) or (from disk cache).

Fast check

DevTools → Network

Turn on Disable cache (only works while DevTools is open)
Reload and retry

Why it happens

Usually your server allows caching via headers like Cache-Control or ETag (which can lead to 304 Not Modified).
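
To see how an ETag leads to 304 Not Modified, here is a simplified, framework-free sketch of the server side in Python. The respond and make_etag helpers are hypothetical, and the comparison handles only a single exact ETag value, whereas real servers also handle weak validators and lists of tags.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body (quoted, as HTTP requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict) -> tuple:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    # "no-cache" lets the browser store the response but forces revalidation.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # browser's saved copy is still valid
    return 200, headers, body


old_body = b'{"name": "Ada"}'
new_body = b'{"name": "Grace"}'

status1, headers1, _ = respond(old_body, {})
saved_etag = headers1["ETag"]

# Same data: the server answers 304 and the browser reuses its saved copy.
status2, _, body2 = respond(old_body, {"If-None-Match": saved_etag})

# Data changed: the ETag no longer matches, so a full 200 response is sent.
status3, _, body3 = respond(new_body, {"If-None-Match": saved_etag})

print(status1, status2, status3)  # prints: 200 304 200
```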

Cache 4: CDN/Hosting Cache

This is often a production-only cache, which is why frontend bugs can appear “impossible” to reproduce locally. In production, a CDN/hosting layer can serve a saved copy of a response before your request reaches your server. That’s why “prod is stale, local is fine” happens.

What you’ll notice

Prod is stale, local is fine
Different users see different results (different regions/POPs)
Pages are very fast even right after data changed

Fast check

Open DevTools → Network → click the request → Response Headers

Age: if present and increasing, it’s strong evidence you’re getting a cached response from an intermediary cache
Provider headers can hint HIT/MISS (examples: x-vercel-cache, cf-cache-status)
Source (Age header, HTTP caching): https://www.rfc-editor.org/rfc/rfc9111

Quick diagnostic check

Change the URL slightly by adding this to the end of the URL:

?debug=1700000000000

If the new URL shows fresh data, the edge was likely caching the original URL. This doesn’t fix it for everyone; you’d still need correct cache settings or a purge/invalidation on your CDN.
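
Both checks in this section can be scripted. The sketch below, written in Python for brevity, adds a timestamp query parameter to a URL and applies a rough heuristic to response headers. The function names are hypothetical, and treating HIT as the only "cached" value for the provider headers is a simplifying assumption.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url: str, param: str = "debug") -> str:
    """Append a millisecond timestamp so caches see a URL they have not stored."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, str(int(time.time() * 1000))))
    return urlunsplit(parts._replace(query=urlencode(query)))


def looks_cached(headers: dict) -> bool:
    """Heuristic: a positive Age or a provider HIT hints at an intermediary cache."""
    lowered = {name.lower(): value for name, value in headers.items()}
    age = lowered.get("age", "0")
    if age.isdigit() and int(age) > 0:
        return True
    provider_headers = ("x-vercel-cache", "cf-cache-status")
    return any(lowered.get(name, "").upper() == "HIT" for name in provider_headers)


busted = add_cache_buster("https://example.com/settings?tab=email")
print(busted)
print(looks_cached({"Age": "120"}))              # prints: True
print(looks_cached({"cf-cache-status": "MISS"}))  # prints: False
```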

Cache 5: Service Worker Cache (Only if Your Site is a PWA)

If your site has a service worker, it can return a saved response before the network runs. This can make new deployments or new data seem “ignored.”

What you’ll notice

Works in Incognito but not normal mode
Hard refresh doesn’t help
DevTools “Disable cache” doesn’t fully explain it

Fast check (Chrome)

Open DevTools → Application → Service Workers

Enable Bypass for network, or Unregister temporarily
Reload and retest

10-Second Debug Guide

Stale data is rarely random: it usually means a cache layer is doing its job, just not in the way you expect. Modern applications stack multiple caches, so debugging is less about fixing code immediately and more about locating the layer responsible.

Think of this as a quick cheat sheet to figure out which cache layer might be serving stale data, so you can focus your debugging on the right layer.

No request in Network? Go to Cache 1 (React Query), then Local state, then Cache 5 (Service worker).
Request exists, but response is old? Go to Cache 3 (Browser), Cache 4 (CDN), then Cache 2 (Next.js).
Response is fresh, UI is old? Go back to Cache 1 (invalidating / query keys) and Local state.

Once you know the likely layer, use the Fast check in that section to confirm it.

Prevention: Set Caching Intentionally

Most stale-data bugs happen because nobody chose the caching settings, so the defaults applied instead.

User-specific pages (settings/admin/dashboard): default to fresh. In Next.js, use cache: "no-store" on important fetches, and/or force dynamic routes when needed.
Public pages (marketing/blog/docs): caching with revalidation is usually fine. Decide on a revalidate window that matches the business need (seconds/minutes/hours).
React Query: set staleTime based on how often the data actually changes, and make query keys match the inputs.
APIs: set Cache-Control / Vary intentionally so shared caches don’t mix user-specific responses.
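For the API bullet, one way to make the choice explicit is a small helper that picks response headers per endpoint type. This is a sketch in Python under stated assumptions: the function name, the 60-second default, and the exact header values are illustrative choices, not settings taken from this guide.

```python
from typing import Dict


def cache_headers(user_specific: bool, revalidate_seconds: int = 60) -> Dict[str, str]:
    """Pick Cache-Control / Vary headers on purpose instead of relying on defaults."""
    if user_specific:
        return {
            # no-store: neither the browser nor a shared cache may keep a copy
            "Cache-Control": "private, no-store",
            # Defensive: tell any cache the response depends on these request headers
            "Vary": "Cookie, Authorization",
        }
    return {
        # s-maxage applies to shared caches (CDNs); max-age=0 makes browsers revalidate
        "Cache-Control": f"public, max-age=0, s-maxage={revalidate_seconds}",
    }


print(cache_headers(user_specific=True))
print(cache_headers(user_specific=False, revalidate_seconds=300))
```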

Recap

Caching itself isn’t the problem. Stale UI happens when a cache exists but you didn’t choose it intentionally or align it with the data’s freshness requirements.

If the UI won’t update, it’s usually because you’re seeing a saved copy from React Query, Next.js, the browser, a CDN, or a service worker. And sometimes it’s not a cache at all; it’s local React state.

How to Cache Golang API Responses for High Performance

Temitope Oyedele — Wed, 15 Oct 2025 10:27:00 +0000

Go makes it easy to build APIs that are fast out of the box. But as usage grows, speed at the language level is not enough. If every request keeps hitting the database, crunching the same data, or serializing the same JSON over and over, latency creeps up and throughput suffers. Caching is the tool that keeps performance high by storing work that has already been done so that future requests can reuse it instantly. Let’s look at four practical ways to cache APIs in Go, each explained with an analogy and backed by simple code you can adapt.

Response Caching with Local and Redis Storage
Database Query Result Caching
HTTP Caching with ETag and Cache-Control
Stale-While-Revalidate with Background Refresh
Wrapping Up

Response Caching with Local and Redis Storage

When the process of generating an API response becomes expensive, the fastest solution is to store the entire response. Think of a coffee shop during the morning rush. If every customer orders the same latte, the barista could grind beans and steam milk for each order, but the line would move slowly. A smarter move is to brew a pot once and pour from it repeatedly. To handle both speed and scale, the shop keeps a small pot at the counter for instant pours and a larger urn in the back for refills. In software terms, the counter pot is a local in-memory cache such as Ristretto or BigCache, and the urn is Redis, which allows multiple API servers to share the same cached responses.

In Go, this two-tier setup usually follows a cache-aside pattern: look in local memory first, fall back to Redis if needed, and only compute the result when both layers miss. Once computed, the value is saved in Redis for everyone and in memory for immediate reuse on the next call.

val, ok := local.Get(key) // tier 1: in-process cache
if !ok {
    var err error
    val, err = rdb.Get(ctx, key).Result() // tier 2: shared Redis cache
    if err != nil {
        // redis.Nil (key missing) or a Redis failure: compute the value instead
        val = computeResponse() // expensive DB or logic
        _ = rdb.Set(ctx, key, val, 60*time.Second).Err()
    }
    local.Set(key, val, 1) // third argument is the entry's cost in Ristretto-style caches
}
w.Header().Set("Content-Type", "application/json")
w.Write([]byte(val))

In the code above, the handler first tries the local cache, which returns immediately if the key exists. On a local miss, it queries Redis as the second layer. If Redis has no entry either (or cannot be reached), the expensive computation runs and its result is stored in Redis with a sixty-second expiration so other servers can reuse it. Either way, the value is then placed in the local cache for immediate reuse, and the response is written back to the client as JSON.

This gives you the best of both worlds: lightning-fast responses for repeat calls and a consistent cache across all your API servers.
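To see the two tiers interact without a running Redis, here is a self-contained sketch of the same cache-aside flow in Python. The `TTLStore` class is a hypothetical in-memory stand-in for both the local cache and Redis, and the 5-second local TTL is an assumption chosen to keep per-server staleness short.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLStore:
    """Tiny key-value store with per-key expiry (stand-in for a real cache)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired entries
            return None
        return value

    def set(self, key: str, value: str, ttl: float) -> None:
        self._data[key] = (value, self._clock() + ttl)


def get_response(key: str, local: TTLStore, shared: TTLStore,
                 compute: Callable[[], str]) -> str:
    """Cache-aside across two tiers: local memory, then the shared store, then compute."""
    value = local.get(key)
    if value is not None:
        return value
    value = shared.get(key)
    if value is None:
        value = compute()
        shared.set(key, value, ttl=60)  # visible to every API server
    local.set(key, value, ttl=5)  # short local TTL limits per-server staleness
    return value


# Two API servers with their own local caches and one shared store
shared = TTLStore()
server_a, server_b = TTLStore(), TTLStore()
print(get_response("users", server_a, shared, lambda: '{"users": []}'))
print(get_response("users", server_b, shared, lambda: '{"users": []}'))
```

In this run only the first call computes the value; the second server finds it in the shared tier and copies it into its own local cache.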

Database Query Result Caching

Sometimes the API itself is simple but the real cost hides in the database. Imagine a newsroom waiting for election results. If every editor keeps calling the counting office for the same numbers, the phone lines may jam. Instead, one reporter calls once, writes the result on a board, and every editor copies from there. The board is the cache, and it saves both time and pressure on the office.

In Go, you can apply the same principle by caching query results. Rather than hitting the database for each identical request, you store the result in Redis with a key that represents the query intent. When the next request comes in, you pull from Redis, skip the database, and respond faster.

key := fmt.Sprintf("q:UserByID:%d", id)
if b, err := rdb.Get(ctx, key).Bytes(); err == nil {
    var u User
    if json.Unmarshal(b, &u) == nil {
        return u, nil // cache hit
    }
    // corrupt entry: fall through to the database
}

u, err := repo.GetUser(ctx, id) // real DB call
if err != nil {
    return User{}, err // never cache a failed lookup
}
if bb, err := json.Marshal(u); err == nil {
    _ = rdb.Set(ctx, key, bb, 2*time.Minute).Err()
}
return u, nil

Here, the code constructs a cache key that uniquely identifies the query using the user ID, then attempts to fetch the serialized result from Redis. If the key exists and decodes cleanly, it returns the User struct immediately without touching the database. On a cache miss, it executes the actual database query through the repository, returns any error without caching it, serializes the User to JSON, stores it in Redis with a two-minute expiration, and returns the result.

This pattern dramatically reduces database load and response time for read-heavy APIs, but you must remember to clear or refresh entries when data changes, or set short time-to-live values to keep results reasonably fresh.

HTTP Caching with ETag and Cache-Control

Not all caching has to happen inside the server. The HTTP standard already provides tools that let clients or CDNs reuse responses. Cache-Control tells them how long a response may be reused, and ETag lets them ask whether it has changed. If nothing is new, the client keeps its own copy and the server only sends a lightweight 304 response.

It is similar to a manager posting notices on an office board. Each sheet carries a small stamp. Employees compare the stamp against the one they already have. If it matches, they know their copy is still valid and skip taking a new one. Only when the stamp changes do they replace it.

In Go this is straightforward. Compute an ETag from the response body, compare it with what the client sends, and decide whether to return the full payload or just the 304.

etag := computeETag(responseBytes) // quoted validator, e.g. a hash of the body
w.Header().Set("ETag", etag)
w.Header().Set("Cache-Control", "public, max-age=60")

// A 304 should carry the same validator and caching headers as the 200 would
if match := r.Header.Get("If-None-Match"); match == etag {
    w.WriteHeader(http.StatusNotModified)
    return
}

w.Write(responseBytes)

The code above generates an ETag, which is a fingerprint or hash of the response content, and sets it together with a Cache-Control header that allows public caching for sixty seconds. It then checks if the client sent an If-None-Match header with a matching ETag from a previous request. If the ETags match, the content hasn't changed, so the server responds with a 304 Not Modified status and sends no body, saving bandwidth. When the ETags don't match or the client has no cached version, the server sends the full response. The exact string comparison is a simplification: If-None-Match may also carry a list of ETags or weak validators prefixed with W/.

This approach reduces bandwidth, lowers CPU usage, and pairs well with CDNs that can cache and serve responses directly.
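The validator check itself is small enough to sketch in full. The Python below is an illustration, not the article's implementation: the function names are hypothetical, SHA-256 is an arbitrary choice of hash, and the parsing assumes ETag values without commas (true for hex digests). It handles the cases a plain string comparison misses: `*`, lists of validators, and weak validators.

```python
import hashlib
from typing import Optional


def compute_etag(body: bytes) -> str:
    """Strong ETag: a quoted hash of the exact response bytes."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def is_not_modified(if_none_match: Optional[str], etag: str) -> bool:
    """True when the If-None-Match request header matches `etag`.

    If-None-Match uses weak comparison, so a W/ prefix on a candidate is ignored.
    """
    if not if_none_match:
        return False
    if if_none_match.strip() == "*":
        return True
    for candidate in if_none_match.split(","):
        candidate = candidate.strip()
        if candidate.startswith("W/"):
            candidate = candidate[2:]
        if candidate == etag:
            return True
    return False


tag = compute_etag(b'{"id": 1}')
print(is_not_modified(tag, tag))        # client already has this version
print(is_not_modified('"other"', tag))  # client has a different version
```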

Stale-While-Revalidate with Background Refresh

There are cases where serving slightly old data is acceptable if it keeps the API fast. Stock dashboards, analytics summaries, or feed endpoints often fit this model. Instead of making users wait for fresh data on every request, you can serve the cached value immediately and refresh it quietly in the background. This technique is called Stale-While-Revalidate.

Picture a stock ticker screen in a lobby. The numbers may be a few seconds behind, but they are still useful to anyone glancing at the board. Meanwhile, a background process fetches the latest figures and updates the ticker. The reader never stares at a blank screen and the system stays responsive even during spikes.

In Go, this can be built by storing not just the cached data but also timestamps that define when the data is fresh, when it can still be served as stale, and when it must be recomputed. The singleflight package (golang.org/x/sync/singleflight) helps ensure that only one goroutine does the refresh work for a given key, preventing a dogpile of updates.

entry := getEntry(key) // {data, freshUntil, staleUntil}
now := time.Now()      // read the clock once so both comparisons agree
switch {
case now.Before(entry.freshUntil):
    return entry.data
case now.Before(entry.staleUntil):
    go refreshSingleflight(key) // background refresh
    return entry.data
default:
    return refreshSingleflight(key) // must refresh now
}

Here, the code retrieves a cache entry containing the data along with two timestamps marking the freshness and staleness boundaries. If the current time falls before the fresh threshold, the data is considered fully fresh and returned immediately. If time has passed the fresh threshold but remains within the stale window, the code returns the slightly outdated data instantly while launching a background goroutine to refresh it asynchronously, ensuring the next request gets updated information. Once time exceeds even the stale boundary, the data is too old to serve, so the code blocks and performs a synchronous refresh before returning.

This keeps latency low while still ensuring the cache updates regularly, a balance between freshness and performance.
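The three windows are easier to follow in a complete, runnable form. Below is a sketch in Python with hypothetical names (`SWRCache`, `Entry`) and an injectable clock for testing. It de-duplicates background refreshes per key; unlike Go's singleflight, it does not coalesce concurrent blocking loads, which a production version would likely want.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class Entry:
    data: str
    fresh_until: float
    stale_until: float


class SWRCache:
    """Serve fresh data directly, stale data while refreshing, and block only when too old."""

    def __init__(self, load: Callable[[str], str], fresh_for: float, stale_for: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load
        self._fresh_for = fresh_for
        self._stale_for = stale_for
        self._clock = clock
        self._entries: Dict[str, Entry] = {}
        self._refreshing: Set[str] = set()  # keys with a background refresh in flight
        self._lock = threading.Lock()

    def _refresh(self, key: str) -> str:
        try:
            data = self._load(key)
            now = self._clock()
            with self._lock:
                self._entries[key] = Entry(data, now + self._fresh_for,
                                           now + self._fresh_for + self._stale_for)
            return data
        finally:
            with self._lock:
                self._refreshing.discard(key)

    def get(self, key: str) -> str:
        now = self._clock()
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and now < entry.fresh_until:
                return entry.data  # fresh
            if entry is not None and now < entry.stale_until:
                if key not in self._refreshing:  # at most one background refresh per key
                    self._refreshing.add(key)
                    threading.Thread(target=self._refresh, args=(key,), daemon=True).start()
                return entry.data  # stale but acceptable
        return self._refresh(key)  # missing or too old: reload synchronously


cache = SWRCache(load=lambda key: f"value for {key}", fresh_for=5, stale_for=30)
print(cache.get("dashboard"))
```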

Wrapping Up

Caching is not a single tactic but a set of strategies that fit different needs. Full response caching eliminates repeat work at the top level. Query result caching protects the database from repeated load. HTTP caching leverages the protocol to cut down data transfer. Stale-While-Revalidate strikes a compromise that favors speed without leaving data stale for too long.

In practice, these approaches are often layered. A Go API might use local memory and Redis for responses, apply query-level caching for hot tables, and set ETags so clients avoid unnecessary downloads. With the right mix, you can cut latency by orders of magnitude, handle far more traffic, and save both compute and database resources.

Caching a Next.js API using Redis and Sevalla

Manish Shivanandhan — Wed, 27 Aug 2025 16:00:42 +0000

When you hear about Next.js, your first thought may be static websites or React-driven frontends. But that’s just part of the story. Next.js can also power full-featured backend APIs that you can host and scale just like any other backend service.

In an earlier article, I walked through building a Next.js API and deploying it with Sevalla. The example stored data in a PostgreSQL database and handled requests directly. That worked fine, but as traffic grows, APIs that hit the database on every request can slow down.

This is where caching comes in. By adding Redis as a cache layer, we can make our Next.js API much faster and more efficient. In this article, we’ll see how to add Redis caching to our API, deploy it with Sevalla, and show measurable improvements.

The earlier article explains the API in detail, so you can use its repository as the starting point for this project.

Why Caching Matters
What is Redis?
Setting Up the Project
Provisioning Redis
Updating Cache on Reads
Updating Cache on Writes
Deploying to Sevalla
Why Redis Works Well with Next.js APIs
Conclusion

Why Caching Matters

Every time your API hits the database, it consumes time and resources. Databases are great at storing and querying structured data, but they aren’t optimized for speed at scale when you need to serve thousands of read requests per second.

Caching solves this by keeping frequently accessed data in memory. Instead of asking the database every time, the API can return data directly from cache if it’s available. Redis is perfect for this because it’s an in-memory key-value store designed for performance.

For example, if you fetch the list of users from the database on every request, it might take 200ms to run the query and return results. With Redis caching, the first request stores the result in memory, and subsequent requests can return the same data in less than 10ms. That’s an order-of-magnitude improvement.

What is Redis?

Redis is an in-memory data store that works like a super-fast database. Instead of writing and reading from disk, it keeps data in memory, which makes it incredibly fast. That’s why it’s often used as a cache, where speed is more important than long-term storage.

It’s designed to handle high-throughput workloads with very low latency, which means it can respond in microseconds. This makes it a perfect fit for use cases like caching API responses, storing session data, or even powering real-time applications like chat systems and leaderboards.

Unlike a traditional database, Redis focuses on simplicity and speed. It stores data as key-value pairs, so you can quickly fetch or update values without writing complex queries. And because it supports advanced data types like lists, sets, and hashes, it’s much more flexible than a plain key-value store.

When combined with an API like the one we built in Next.js, Redis helps you reduce load on the main database and deliver blazing-fast responses to clients.

Setting Up the Project

Let’s clone the repository:

git clone git@github.com:manishmshiva/nextjs-api-pgsql.git next-api

Now let’s go into the directory and do an npm install to install the packages.

cd next-api
npm i

Create a .env file and add the database URL from Sevalla into an environment variable.

cat .env

The .env file should look like this:

PGSQL_URL=postgres://:-@asia-east1-001.proxy.kinsta.app:30503/

Now let’s make sure the application works as expected by starting the application and making a couple of API requests.

Starting the app:

npm run dev

Let’s make sure the database is connected. Go to localhost:3000 on your browser. It should return the following JSON:

Let’s create a new user. To create a new entry in the DB using Postman, send a POST request to localhost:3000/users with the following JSON:

{"id":"d9553bb7-2c72-4d92-876b-9c3b40a8c62c","name":"Larry","email":"larry@example.com","age":"25"}

Let’s ensure the record is created by going to localhost:3000/users in the browser.

Great. Now let’s cache these APIs using Redis.

Provisioning Redis

Let’s go to Sevalla’s dashboard and click on “Databases”. Choose “Redis” from the list, and leave the rest of the options as defaults.

Once the database is created, switch on the “external connection” option and copy the publicly accessible URL.

This is how it should look in the .env file:

REDIS_URL=redis://default:@:

Now install a Redis client for Node.js:

npm install ioredis

We can now connect to Redis and use it as a cache layer for our users API. Let’s see how to implement caching.

Updating Cache on Reads

Here’s the updated users/route.ts that uses Redis:

import { NextResponse } from "next/server";
import { Client } from "pg";
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL!);

async function readUsers() {
  const client = new Client({
    connectionString: process.env.PGSQL_URL,
  });
  await client.connect();

  try {
    const result = await client.query("SELECT id, name, email, age FROM users");
    return result.rows;
  } finally {
    await client.end();
  }
}

export async function GET() {
  try {
    // Check cache first
    const cached = await redis.get("users");
    if (cached) {
      return NextResponse.json(JSON.parse(cached));
    }

    // Fallback to database if not cached
    const users = await readUsers();

    // Store result in cache with 60s TTL
    await redis.set("users", JSON.stringify(users), "EX", 60);

    return NextResponse.json(users);
  } catch (err) {
    return NextResponse.json({ error: "Failed to fetch users" }, { status: 500 });
  }
}

Now, when you hit /users:

The API first checks Redis.
If the data exists, it returns it instantly.
If not, it queries PostgreSQL, saves the result in Redis, and then returns it.

This makes repeated requests extremely fast. You can adjust the cache expiry (EX 60) depending on how fresh your data needs to be.

Without Redis caching, fetching /users ten times means ten database queries. Each might take around 150–200ms depending on database size and network latency.

With Redis, the first request still takes ~200ms since it populates the cache. But every request after that is nearly instant, often under 10ms. That’s a 20x improvement.

This speedup matters when your API faces hundreds or thousands of requests per second. Caching not only reduces latency but also lightens the load on your database.
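The numbers above can be turned into a quick estimate of what users see on average. A minimal sketch in Python, assuming the illustrative figures from this section (about 200 ms for a database read, about 10 ms for a cache hit); the function name is hypothetical:

```python
def average_latency_ms(hit_ratio: float, hit_ms: float = 10.0, miss_ms: float = 200.0) -> float:
    """Expected latency when a fraction `hit_ratio` of requests is served from cache."""
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


# Ten requests inside one TTL window: 1 miss followed by 9 hits
print(round(average_latency_ms(0.9), 1))
```

With one miss and nine hits, the average comes out near 29 ms. The 20x figure describes individual cache hits; the average across all requests depends on the hit ratio, which in turn depends on the TTL and how often writes clear the cache.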

Updating Cache on Writes

Right now, only GET requests use the cache. But what if we add new users? The cache would still return the old data.

The solution is to update or clear the cache whenever a write happens. Let’s update the POST handler:

export async function POST(req: Request) {
  try {
    const body = await req.json();
    const client = new Client({
      connectionString: process.env.PGSQL_URL,
    });
    await client.connect();

    const query = `
      INSERT INTO users (id, name, email, age)
      VALUES ($1, $2, $3, $4)
      RETURNING *;
    `;

    try {
      const result = await client.query(query, [
        body.id,
        body.name,
        body.email,
        body.age,
      ]);

      // Invalidate cache so next GET fetches fresh data
      await redis.del("users");

      return NextResponse.json(result.rows[0]);
    } finally {
      // Always close the connection, even if the insert fails
      await client.end();
    }
  } catch (err) {
    return NextResponse.json({ error: "Failed to add user" }, { status: 500 });
  }
}

Now whenever a new user is created, the cache for users is cleared. The next GET request will fetch from the database, refresh the cache, and then continue serving cached data.
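The read and write paths together form a small protocol: reads fill the cache, writes delete the cached entry. The sketch below replays that sequence in Python with in-memory stand-ins, so it runs without PostgreSQL or Redis; the class and attribute names are hypothetical.

```python
import json
from typing import Dict, List


class UsersService:
    def __init__(self) -> None:
        self.db_rows: List[dict] = []    # stand-in for the PostgreSQL users table
        self.cache: Dict[str, str] = {}  # stand-in for Redis
        self.db_reads = 0                # counts how often the database is queried

    def list_users(self) -> List[dict]:
        cached = self.cache.get("users")
        if cached is not None:
            return json.loads(cached)  # cache hit: no database work
        self.db_reads += 1
        users = list(self.db_rows)
        self.cache["users"] = json.dumps(users)
        return users

    def add_user(self, user: dict) -> dict:
        self.db_rows.append(user)      # write to the source of truth first
        self.cache.pop("users", None)  # then drop the now-stale list
        return user


svc = UsersService()
svc.list_users()                       # miss: reads the database
svc.list_users()                       # hit
svc.add_user({"name": "Larry"})        # invalidates the cached list
print(svc.list_users(), svc.db_reads)  # fresh list, second database read
```

Deleting the key rather than rewriting it keeps the write path simple: the next read rebuilds the entry from the database, so the cache never holds a value the database did not produce.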

Deploying to Sevalla

Push your code to GitHub or fork my repository. Now let’s go to Sevalla and create a new app.

Choose your repository from the dropdown and check “Automatic deployment on commit”. This will ensure that the deployment is automatic every time you push code. Choose “Hobby” under the resources section.

Click “Create” and not “Create and deploy”. We haven’t added our PostgreSQL URL and Redis URL as environment variables, so the app will crash if you try to deploy it.

Go to the “Environment variables” section and add the key “PGSQL_URL” and the URL in the value field. Do the same for the “REDIS_URL” key and add the Redis URL.

Now go back to the “Overview” section and click “Deploy now”.

Once deployment is complete, click “Visit app” to get the live URL of your API. You can replace localhost:3000 with the new URL in Postman and test your API.

Why Redis Works Well with Next.js APIs

Redis is lightweight, blazing fast, and perfect for caching API responses. In the context of Next.js, it fits naturally because:

The API routes run server-side where Redis can be queried directly.
Caching logic is simple to add around database calls.
Redis can be used for more than caching – things like rate limiting, session storage, and pub/sub are also common patterns.

By combining Next.js, PostgreSQL, and Redis on Sevalla, you get a stack that is fast, scalable, and easy to deploy.

Conclusion

Caching isn’t just an optimization – it’s a necessity for real-world APIs. Next.js helps you build robust backend APIs that can be deployed easily. By adding Redis to the mix, those APIs can handle scale without breaking a sweat.

Sevalla ties it all together by providing managed PostgreSQL, Redis, and app hosting in one place. With a few environment variables and a GitHub repo, you can go from local dev to a production-ready, cached API in minutes.

Hope you enjoyed this article. Sign up for my free AI newsletter TuringTalks.ai for more hands-on tutorials on AI. You can also find me on LinkedIn.

How In-Memory Caching Works in Redis

Manish Shivanandhan — Wed, 16 Jul 2025 16:19:35 +0000

When you’re building a web app or API that needs to respond quickly, caching is often the secret sauce.

Without it, your server can waste time fetching the same data over and over again – from a database, a third-party API, or a slow storage system.

But when you store that data in memory, the same information can be served up in milliseconds. That’s where Redis comes in.

Redis is a fast, flexible tool that stores your data in RAM and lets you retrieve it instantly. Whether you’re building a dashboard, automating social media posts, or managing user sessions, Redis can make your system faster, more efficient, and easier to scale.

In this article, you’ll learn how in-memory caching works and why Redis is a go-to choice for many developers.

What Is In-Memory Caching?
What Is Redis?
How to Work with Redis
Real-Life Use Cases
Conclusion

What Is In-Memory Caching?

In-memory caching is a way of storing data in the system’s RAM instead of fetching it from a database or external source every time it’s needed.

Since RAM is incredibly fast compared to disk storage, you can access cached data almost instantly. This approach is perfect for information that doesn’t change very often, like API responses, user profiles, or rendered HTML pages.

Rather than repeatedly running the same queries or API calls, your app checks the cache first. If the data is there, it’s used right away. If it’s not, you fetch it from the source, save it to the cache, and then return it.

This technique reduces load on your backend, improves response time, and can dramatically improve your app’s performance under heavy traffic.

What Is Redis?

Redis is an open-source, in-memory data store that developers use to cache and manage data in real time.

Unlike traditional databases, Redis stores everything in memory, which makes data retrieval incredibly fast. But Redis isn’t just a simple key-value store. It offers a wide range of data types, from strings and lists to sets, hashes, and sorted sets.

Redis is also capable of handling more advanced tasks like pub/sub messaging, streams, and geospatial queries. Despite its power, Redis is lightweight and easy to get started with.

You can run it on your local machine, deploy it on a server, or even use managed Redis services offered by cloud providers. It’s trusted by major companies and used in all kinds of applications, from caching and session storage to real-time analytics and job queues.

How to Work with Redis

Redis Installation

Getting Redis up and running is surprisingly simple. You can find the installation instructions based on your operating system in the documentation.

To make sure Redis is working, run:

redis-cli ping
# Should respond with "PONG"

Redis Data Types

Redis gives you several built-in types that let you store and manage data in flexible ways.

Strings: Simple key ↔ value pairs.

SET username "Emily"
GET username

Lists: Ordered collections which are great for queues and timelines.

LPUSH tasks "task1"
RPUSH tasks "task2"
LRANGE tasks 0 -1

Hashes: Like JSON objects, great for user profiles.

HSET user:1 name "Alice"
HSET user:1 email "alice@example.com"
HGETALL user:1

Sets: Unordered collections, ideal for tags or unique items.

SADD tags "python"
SADD tags "redis"
SMEMBERS tags

Sorted Sets: Sets with scores – useful for leaderboards.

ZADD leaderboard 100 "Bob"
ZADD leaderboard 200 "Carol"
ZRANGE leaderboard 0 -1 WITHSCORES

Redis also supports bitmaps, HyperLogLogs, streams, and geospatial indexes, and it keeps expanding its support for data structures.

Redis with Python

If you’re working in Python, using Redis is just as easy. After installing the redis Python library using pip install redis, you can connect to your Redis server and start setting and getting keys right away.

Here is some simple Python code to work with Redis:

import redis

# Connect to the local Redis server on default port 6379 and use database 0
r = redis.Redis(host='localhost', port=6379, db=0)

# --- Basic String Example ---

# Set a key called 'welcome' with a string value
r.set('welcome', 'Hello, Redis!')

# Get the value of the key 'welcome'
# Output will be a byte string: b'Hello, Redis!'
print(r.get('welcome'))


# --- Hash Example (like a Python dict) ---

# Create a Redis hash under the key 'user:1'
# This hash stores fields 'name' and 'email' for a user
r.hset('user:1', mapping={
    'name': 'Alice',
    'email': 'alice@example.com'
})

# Get all fields and values in the hash as a dictionary of byte strings
# Output: {b'name': b'Alice', b'email': b'alice@example.com'}
print(r.hgetall('user:1'))


# --- List Example (acts like a queue or stack) ---

# Push 'Task A' to the left of the list 'tasks'
r.lpush('tasks', 'Task A')

# Push 'Task B' to the left of the list 'tasks' (it becomes the first item)
r.lpush('tasks', 'Task B')

# Retrieve all elements from the list 'tasks' (from index 0 to -1, meaning the full list)
# Output: [b'Task B', b'Task A']
print(r.lrange('tasks', 0, -1))

You might store a user's session data, queue background tasks, or even cache rendered HTML pages. Each Redis command executes atomically, so concurrent clients never observe a half-applied command. Sequences of several commands are not atomic by default; wrap them in a transaction (MULTI/EXEC) or a Lua script when they must succeed or fail together.

One of the most useful features in Redis is key expiration. You can tell Redis to automatically delete a key after a certain period, which is especially handy for session data or temporary caches.

For example, the following command stores a session key that expires after one hour (3600 seconds):

SET session:1234 "some data" EX 3600

Redis also supports persistence through RDB snapshots and an append-only file (AOF), so even though it’s an in-memory store, your data can survive a reboot.

Redis isn’t limited to small apps. It scales easily through replication, clustering, and Sentinel.

Replication allows you to create read-only copies of your data, which helps distribute the load. Clustering breaks your data into chunks and spreads them across multiple servers. And Sentinel handles automatic failover to keep your system running even if one server goes down.
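The "chunks" in Redis Cluster are 16384 hash slots, and each key's slot is the CRC16 of the key modulo 16384. A minimal sketch in Python of that mapping, including the hash-tag rule that lets related keys share a slot; the function name is hypothetical, and `binascii.crc_hqx` with an initial value of 0 is used here as the CRC16 (XMODEM) variant:

```python
import binascii


def key_hash_slot(key: str) -> int:
    """Redis Cluster slot for `key`: CRC16 (XMODEM variant) of the key, modulo 16384.

    If the key contains a non-empty {hash tag}, only the text inside the first
    pair of braces is hashed, so related keys can be forced into the same slot.
    """
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return binascii.crc_hqx(raw, 0) % 16384


# Keys sharing a hash tag land in the same slot, so multi-key commands can use them together
print(key_hash_slot("{user:1}:profile") == key_hash_slot("{user:1}:sessions"))
```

On a running cluster, `CLUSTER KEYSLOT <key>` reports the slot Redis itself computes, which is a good way to cross-check a helper like this.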

Real-Life Use Cases

One of the most common uses for Redis is caching API responses.

Let’s say you have an app that displays weather data. Rather than calling the weather API every time a user loads the page, you can cache the response for each city in Redis for 5 or 10 minutes. That way, you only fetch new data occasionally, and your app becomes much faster and cheaper to run.

Another powerful use case is session management. In web applications, every logged-in user has a session that tracks who they are and what they’re doing. Redis is a great place to store this session data because it’s fast and temporary.

You can store the session ID as a key, with the user’s information in a hash. Add an expiration time, and you’ve got automatic session timeout built in. Since Redis is so fast and supports high-concurrency access, it’s a great fit for applications with thousands of users logging in at the same time.

Conclusion

In-memory caching is one of the simplest and most effective ways to speed up your app, and Redis makes it incredibly easy to implement. It’s not just a cache; it’s a toolkit for building fast, scalable, real-time systems. You can start small by caching a few pages or API responses, and as your needs grow, Redis grows with you.

If you’re just getting started, try running Redis locally and experimenting with different data types. Store some strings, build a simple task queue with lists, or track user scores with a sorted set. The more you explore, the more you’ll see how Redis can help your application run faster, smarter, and more efficiently.

Enjoyed this article? Connect with me on LinkedIn. See you soon with another topic.

How to Build a Scalable URL Shortener with Distributed Caching Using Redis

Birkaran Sachdev — Tue, 19 Nov 2024 15:14:58 +0000

In this tutorial, we'll build a scalable URL shortening service using Node.js and Redis. This service will leverage distributed caching to handle high traffic efficiently, reduce latency, and scale seamlessly. We'll explore key concepts such as consistent hashing, cache invalidation strategies, and sharding to ensure the system remains fast and reliable.

By the end of this guide, you'll have a fully functional URL shortener service that uses distributed caching to optimize performance. We'll also create an interactive demo where users can input URLs and see real-time metrics like cache hits and misses.

What You Will Learn

How to build a URL shortener service using Node.js and Redis.
How to implement distributed caching to optimize performance.
Understanding consistent hashing and cache invalidation strategies.
Using Docker to simulate multiple Redis instances for sharding and scaling.

Prerequisites

Before starting, make sure you have the following installed:

Node.js (v14 or higher)
Redis
Docker
Basic knowledge of JavaScript, Node.js, and Redis.

Project Overview
Step 1: Setting Up the Project
Step 2: Setting Up Redis Instances
Step 3: Implementing the URL Shortener Service
Step 4: Implementing Cache Invalidation
Step 5: Monitoring Cache Metrics
Step 6: Testing the Application
Conclusion: What You’ve Learned

Project Overview

We'll build a URL shortener service where:

Users can shorten long URLs and retrieve the original URLs.
The service uses Redis caching to store mappings between shortened URLs and original URLs.
The cache is distributed across multiple Redis instances to handle high traffic.
The system will demonstrate cache hits and misses in real-time.

System Architecture

To ensure scalability and performance, we'll divide our service into the following components:

API Server: Handles requests for shortening and retrieving URLs.
Redis Caching Layer: Uses multiple Redis instances for distributed caching.
Docker: Simulates a distributed environment with multiple Redis containers.

Step 1: Setting Up the Project

Let's set up our project by initializing a Node.js application:

mkdir scalable-url-shortener
cd scalable-url-shortener
npm init -y

Now, install the necessary dependencies:

npm install express redis shortid dotenv

express: A lightweight web server framework.
redis: To handle caching.
shortid: For generating short, unique IDs.
dotenv: For managing environment variables.

Create a .env file in the root of your project:

PORT=3000
REDIS_HOST_1=localhost
REDIS_PORT_1=6379
REDIS_HOST_2=localhost
REDIS_PORT_2=6380
REDIS_HOST_3=localhost
REDIS_PORT_3=6381

These variables define the Redis hosts and ports we'll be using.

Step 2: Setting Up Redis Instances

We'll use Docker to simulate a distributed environment with multiple Redis instances.

Run the following commands to start three Redis containers:

docker run -p 6379:6379 --name redis1 -d redis
docker run -p 6380:6379 --name redis2 -d redis
docker run -p 6381:6379 --name redis3 -d redis

This will set up three Redis instances running on different ports. We'll use these instances to implement consistent hashing and sharding.

Step 3: Implementing the URL Shortener Service

Let's create our main application file, index.js:

require('dotenv').config();
const express = require('express');
const redis = require('redis');
const shortid = require('shortid');

const app = express();
app.use(express.json());

const redisClients = [
  redis.createClient({ host: process.env.REDIS_HOST_1, port: process.env.REDIS_PORT_1 }),
  redis.createClient({ host: process.env.REDIS_HOST_2, port: process.env.REDIS_PORT_2 }),
  redis.createClient({ host: process.env.REDIS_HOST_3, port: process.env.REDIS_PORT_3 })
];

// Hash function to distribute keys among Redis clients
function getRedisClient(key) {
  const hash = key.split('').reduce((acc, char) => acc + char.charCodeAt(0), 0);
  return redisClients[hash % redisClients.length];
}

// Endpoint to shorten a URL
app.post('/shorten', async (req, res) => {
  const { url } = req.body;
  if (!url) return res.status(400).send('URL is required');

  const shortId = shortid.generate();
  const redisClient = getRedisClient(shortId);

  await redisClient.set(shortId, url);
  res.json({ shortUrl: `http://localhost:${process.env.PORT}/${shortId}` });
});

// Endpoint to retrieve the original URL
app.get('/:shortId', async (req, res) => {
  const { shortId } = req.params;
  const redisClient = getRedisClient(shortId);

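  redisClient.get(shortId, (err, url) => {
    if (err) {
      // A Redis failure is a server-side problem, not a missing URL
      return res.status(500).send('Cache error');
    }
    if (!url) {
      return res.status(404).send('URL not found');
    }
    res.redirect(url);
  });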

app.listen(process.env.PORT, () => {
  console.log(`Server running on port ${process.env.PORT}`);
});

As you can see in this code, we have:

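Consistent Hashing (simplified):
- We distribute keys (short IDs) across multiple Redis clients using a simple hash function: the sum of the key's character codes, modulo the number of clients.
- The hash function always routes the same key to the same instance. It is a simplified stand-in for consistent hashing: a character-code sum does not guarantee an even spread, and adding or removing an instance changes the modulus and remaps most keys. True consistent hashing places keys and nodes on a hash ring so that only a fraction of keys move when the set of nodes changes.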
URL Shortening:
- The /shorten endpoint accepts a long URL and generates a short ID using the shortid library.
- The shortened URL is stored in one of the Redis instances using our hash function.
URL Redirection:
- The /:shortId endpoint retrieves the original URL from the cache and redirects the user.
- If the URL is not found in the cache, a 404 response is returned.
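
To make the difference from modulo hashing concrete, here is a minimal sketch of a hash ring with virtual nodes. The HashRing class, the hashToInt helper, the node names, and the choice of MD5 are all assumptions made for illustration; they are not part of the tutorial's index.js.

```javascript
const crypto = require('crypto');

// Map a string to a 32-bit unsigned integer position on the ring.
function hashToInt(value) {
  const hex = crypto.createHash('md5').update(value).digest('hex');
  return parseInt(hex.slice(0, 8), 16);
}

// Hypothetical hash ring: each real node is placed on the ring many times
// ("virtual nodes") so keys spread more evenly.
class HashRing {
  constructor(nodes = [], replicas = 100) {
    this.replicas = replicas;
    this.ring = []; // sorted list of { position, node }
    nodes.forEach((node) => this.addNode(node));
  }

  addNode(node) {
    for (let i = 0; i < this.replicas; i++) {
      this.ring.push({ position: hashToInt(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.position - b.position);
  }

  removeNode(node) {
    this.ring = this.ring.filter((entry) => entry.node !== node);
  }

  // Walk clockwise to the first virtual node at or after the key's position.
  // A linear scan keeps the sketch short; a binary search would be faster.
  getNode(key) {
    if (this.ring.length === 0) return undefined;
    const position = hashToInt(key);
    const entry = this.ring.find((e) => e.position >= position);
    return (entry || this.ring[0]).node; // wrap around the ring
  }
}

const ring = new HashRing(['redis1', 'redis2', 'redis3']);
console.log(ring.getNode('abc123')); // one of the three node names
```

With this structure, adding a fourth node only moves the keys that now fall on the new node's virtual positions; keys owned by the other nodes stay where they were.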

Step 4: Implementing Cache Invalidation

In a real-world application, URLs may expire or change over time. To handle this, we need to implement cache invalidation.

Adding Expiry to Cached URLs

Let's modify our index.js file to set an expiration time for each cached entry:

// Endpoint to shorten a URL with expiration
app.post('/shorten', async (req, res) => {
  const { url, ttl } = req.body; // ttl (time-to-live) is optional
  if (!url) return res.status(400).send('URL is required');

  const shortId = shortid.generate();
  const redisClient = getRedisClient(shortId);

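  // Only accept a positive whole number of seconds; otherwise fall back to 1 hour
  const ttlSeconds = Number.isInteger(ttl) && ttl > 0 ? ttl : 3600;
  await redisClient.set(shortId, url, 'EX', ttlSeconds);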
  res.json({ shortUrl: `http://localhost:${process.env.PORT}/${shortId}` });
});

TTL (Time-To-Live): We set a default expiration time of 1 hour (3600 seconds) for each shortened URL. You can override it per URL by sending a ttl value, in seconds, in the request body.
Cache Invalidation: When the TTL expires, the entry is automatically removed from the cache.

Step 5: Monitoring Cache Metrics

To monitor cache hits and misses, we’ll add some logging to our endpoints in index.js:

app.get('/:shortId', async (req, res) => {
  const { shortId } = req.params;
  const redisClient = getRedisClient(shortId);

  redisClient.get(shortId, (err, url) => {
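    if (err) {
      // Log Redis failures separately so they are not counted as misses
      console.error(`Cache error for key: ${shortId}`, err);
      return res.status(500).send('Cache error');
    }
    if (!url) {
      console.log(`Cache miss for key: ${shortId}`);
      return res.status(404).send('URL not found');
    }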
    console.log(`Cache hit for key: ${shortId}`);
    res.redirect(url);
  });
});

Here’s what’s going on in this code:

Cache Hits: If a URL is found in the cache, it’s a cache hit.
Cache Misses: If a URL is not found, it’s a cache miss.
This logging will help you monitor the performance of your distributed cache.
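
Log lines are easier to act on when they are summarized as a hit ratio. The sketch below keeps in-process counters; the cacheStats object and the recordLookup and hitRatio helpers are hypothetical names, and a production setup would more likely export these numbers to a metrics system.

```javascript
// Hypothetical in-process counters for cache lookups.
const cacheStats = { hits: 0, misses: 0 };

// Call this with whatever the cache returned for a lookup.
function recordLookup(value) {
  if (value === null || value === undefined) {
    cacheStats.misses += 1;
  } else {
    cacheStats.hits += 1;
  }
}

// Fraction of lookups served from the cache (0 when nothing was recorded yet).
function hitRatio() {
  const total = cacheStats.hits + cacheStats.misses;
  return total === 0 ? 0 : cacheStats.hits / total;
}
```

In the GET handler, calling recordLookup(url) right after the Redis lookup would keep the counters current, and hitRatio() could be logged periodically.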

Step 6: Testing the Application

Start your Redis instances:

docker start redis1 redis2 redis3

Run the Node.js server:

node index.js

Test the endpoints using curl or Postman:

Shorten a URL:

  POST http://localhost:3000/shorten
  Body: { "url": "https://example.com" }

Access the shortened URL:

  GET http://localhost:3000/{shortId}

Conclusion: What You’ve Learned

Congratulations! You’ve successfully built a scalable URL shortener service with distributed caching using Node.js and Redis. Throughout this tutorial, you’ve learned how to:

Distribute cache entries across multiple Redis instances with a hash function, a simplified form of the sharding that consistent hashing provides.
Use TTL-based expiry as a cache invalidation strategy to keep data up-to-date.
Use Docker to simulate a distributed environment with multiple Redis nodes.
Monitor cache hits and misses to optimize performance.

Next Steps:

Add a Database: Store URLs in a database for persistence beyond the cache.
Implement Analytics: Track click counts and analytics for shortened URLs.
Deploy to the Cloud: Deploy your application using Kubernetes for auto-scaling and resilience.

Happy coding!

Caching vs Content Delivery Networks – What's the Difference?

freeCodeCamp — Fri, 01 Mar 2024 19:27:51 +0000

By Anamika Ahmed

In the world of network optimization, Content Delivery Networks (CDNs) and caching play a vital role in improving website performance and user experience.

And while both aim to speed up website loading times, they have distinct purposes and mechanisms.

In this tutorial, we'll dive deep into the details of CDNs and caching to understand their similarities, differences, and how they contribute to enhancing online experiences.

Here's what we'll cover:

What is Caching?
What is a Content Delivery Network (CDN)?
Caching vs CDNs – What's the Difference?
When to Use Caching
When to Use CDNs
Combining Caching and CDNs
Wrapping Up

What is Caching?

Imagine you’re a librarian managing a popular library. Every day, readers come in asking for the same set of books like “Think and Grow Rich” or “The Intelligent Investor.”

Initially, you fetch these books from the main shelves, which takes time and effort. But soon, you notice a pattern: the same set of books are requested repeatedly by different readers. So, what do you do?

You decide to create a special section near the entrance where you keep copies of these frequently requested books. Now, when readers come asking for them, you don’t have to run to the main shelves each time. Instead, you simply hand them the copies from the special section, saving time and making the process more efficient.

This special section represents the cache, storing frequently accessed books for quick retrieval.

Caching is a technique used to store copies of frequently accessed data temporarily. The cached data can be anything from web pages and images to database query results. When a user requests cached content, the server retrieves it from the cache instead of generating it anew, significantly reducing response times.

When a web server receives a request, it can follow different caching strategies to handle it efficiently. One prevalent strategy is known as read-through caching (when the application itself performs the steps below, the same flow is often called cache-aside):

Request Received: The web server gets a request from a client.
Check Cache: It first looks into the cache to see if the response to the request is already there.
Cache Hit: If the response is in the cache (hit), it sends the data back to the client right away.
Cache Miss: If the response isn’t in the cache (miss), the server queries the database to fetch the required data.
Store in Cache: Once it gets the data from the database, it stores the response in the cache for future requests.
Send Response: Finally, the server sends the data back to the client.
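
The steps above can be sketched in a few lines. This is a minimal illustration that assumes an in-memory Map as the cache and a hypothetical queryDatabase function standing in for a real, slower data store.

```javascript
// In-memory cache; a real deployment might use Redis or Memcached instead.
const cache = new Map();

// Hypothetical stand-in for a slow database query.
async function queryDatabase(id) {
  return { id, title: `Item ${id}` };
}

// Check the cache first; on a miss, load from the database and store the result.
async function getItem(id, loader = queryDatabase) {
  if (cache.has(id)) {
    return { source: 'cache', data: cache.get(id) }; // cache hit
  }
  const data = await loader(id); // cache miss
  cache.set(id, data);           // store for future requests
  return { source: 'database', data };
}
```

The first call for a given id goes to the database, and later calls for the same id are answered from the Map until the entry is removed.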

What to Consider When Implementing a Cache System

Decide When to Use a Cache:

A cache is best for frequently read but infrequently modified data.
Cache servers are not suitable for storing critical data as they use volatile memory.
Important data should be stored in persistent data stores to prevent loss in case of cache server restarts.

Set an Expiration Policy:

Implement an expiration policy to remove expired data from the cache.
Avoid expiration times that are too short (which cause frequent reloads from the database) or too long (which let data go stale).

Maintain Synchronization Between Data Stores and Cache:

Inconsistencies can arise due to separate operations on data storage and cache, especially in distributed environments.

Mitigate Failures:

Use multiple cache servers across different data centers to avoid single points of failure.
Over-provision memory to accommodate increased usage and prevent performance issues.

Implement an Eviction Policy:

When the cache is full, new items may cause existing ones to be removed (cache eviction).
A popular eviction policy is Least Recently Used (LRU), but other policies like Least Frequently Used (LFU) or First In, First Out (FIFO) can be chosen based on specific use cases.
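
As an illustration of LRU eviction, here is a small sketch built on JavaScript's Map, which remembers insertion order. The LRUCache class is a hypothetical example for this article, not the eviction code of any particular cache server.

```javascript
class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = new Map(); // Map iterates keys in insertion order
  }

  get(key) {
    if (!this.items.has(key)) return undefined;
    const value = this.items.get(key);
    // Re-insert the key so it becomes the most recently used entry
    this.items.delete(key);
    this.items.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.items.has(key)) this.items.delete(key);
    this.items.set(key, value);
    if (this.items.size > this.capacity) {
      // The first key in the Map is the least recently used one
      const oldestKey = this.items.keys().next().value;
      this.items.delete(oldestKey);
    }
  }
}

const lru = new LRUCache(2);
lru.set('a', 1);
lru.set('b', 2);
lru.get('a');    // 'a' is now the most recently used
lru.set('c', 3); // capacity exceeded, so 'b' is evicted
```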

Real-World Applications of Caching

Social Media Platforms: Imagine scrolling through your Facebook feed. Thanks to caching, you see profile pictures, trending posts, and recently liked content instantly, even if millions of users are accessing the platform simultaneously.

Caching these frequently accessed elements on servers or your device minimizes delays and makes the experience smoother and more engaging.

E-commerce Websites: When browsing Amazon for a new gadget, you expect a seamless shopping experience. Caching plays a crucial role here. Product images, descriptions, and pricing information are cached, enabling the website to display search results and product pages rapidly.

This is especially crucial during peak seasons like Black Friday or Cyber Monday, where caching helps handle surges in traffic and ensures customers can complete their purchases without encountering delays.

Content Management Systems (CMS): Millions of websites rely on CMS platforms like WordPress. To ensure smooth performance for all these users, many CMS platforms integrate caching plugins. These plugins cache frequently accessed pages, reducing the load on the server and database.

This translates to faster page loading times, improved SEO ranking due to faster indexing by search engines, and a more responsive website overall, providing a better experience for visitors.

What is a Content Delivery Network (CDN)?

Now, think of a CDN as a global network of book delivery trucks. Instead of storing all the books in one central library, you have local branches worldwide, each with copies of the most popular books.

When readers request a book, you don’t have to ship it from the main library. Instead, you direct them to the nearest branch, where they can quickly pick up a copy. This cuts down on travel time (data transfer time) and keeps everyone happy with fast access to their favorite books.

In technical terms, a CDN is a network of servers distributed across various locations globally. Its primary purpose is to deliver web content, such as images, videos, scripts, and stylesheets to users more efficiently by reducing the physical distance between the server and the user.

How CDNs Work:

First, imagine that User A wants to see an image on a website. They click on a link provided by the CDN, like "https://mywebsite.cloudfront.net/image.jpg". This requests the image.

Then, if the image isn’t in the CDN’s storage (cache), the CDN fetches the image from the original source, like a web server or Amazon S3.

In response to that, the original source sends the image back to the CDN. It might include an HTTP header, such as Cache-Control with a max-age value, that sets the Time-to-Live (TTL) and indicates how long the image should stay cached.

Next, the CDN stores the image and serves it to User A. It stays cached until the TTL expires.

Then let's say that User B requests the same image. At that point, the CDN checks if it’s still in the cache. If the image is still cached (TTL hasn’t expired), the CDN serves it from there (a hit). Otherwise (a miss), it fetches a fresh copy from the origin.
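
The flow above can be simulated with a small edge cache that honors a TTL. Everything in this sketch is hypothetical: the EdgeCache class, the fetchFromOrigin callback, and the injected clock are illustration devices, not a real CDN API.

```javascript
class EdgeCache {
  // fetchFromOrigin(path) must return { body, ttlSeconds }.
  // now() returns the current time in milliseconds (injectable for testing).
  constructor(fetchFromOrigin, now = () => Date.now()) {
    this.fetchFromOrigin = fetchFromOrigin;
    this.now = now;
    this.entries = new Map();
  }

  get(path) {
    const entry = this.entries.get(path);
    if (entry && entry.expiresAt > this.now()) {
      return { status: 'HIT', body: entry.body }; // still within its TTL
    }
    // Missing or expired: fetch a fresh copy from the origin and cache it
    const { body, ttlSeconds } = this.fetchFromOrigin(path);
    this.entries.set(path, { body, expiresAt: this.now() + ttlSeconds * 1000 });
    return { status: 'MISS', body };
  }
}
```

User A's request would return MISS and populate the cache, User B's request within the TTL would return HIT, and a request after the TTL has passed would return MISS again.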

What to Consider When Implementing a CDN

Cost Management: CDNs charge for data transfers. It’s wise to cache frequently accessed content, but not everything.
Cache Expiry: Set appropriate cache expiry times. Too long, and content might be stale. Too short, and it strains origin servers.
CDN Fallback: Plan for CDN failures. Ensure your website can switch to fetching resources directly from the origin if needed.
Invalidating Files: You can remove files from the CDN before they expire using various methods provided by CDN vendors.
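
The CDN fallback idea can be sketched as a helper that tries a list of URLs in order. The fetchWithFallback function and the URLs in the usage note are assumptions for illustration; fetchFn is any fetch-compatible function, such as the global fetch in recent Node.js versions or in browsers.

```javascript
// Try each URL in order (for example, CDN first, then origin) and return
// the first successful response.
async function fetchWithFallback(urls, fetchFn) {
  let lastError;
  for (const url of urls) {
    try {
      const response = await fetchFn(url);
      if (response.ok) return response;
      lastError = new Error(`Unexpected status ${response.status} from ${url}`);
    } catch (err) {
      lastError = err; // network failure: move on to the next URL
    }
  }
  throw lastError || new Error('No URLs provided');
}
```

A call might look like fetchWithFallback(['https://cdn.example.com/app.js', 'https://www.example.com/app.js'], fetch), where both hostnames are placeholders.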

Real-World Applications of a CDN

Video Streaming Services: Imagine you're in Sydney, Australia, craving to watch the latest season of your favorite show on Netflix. Without a CDN, the data would have to travel all the way from a server in, say, California, leading to buffering and frustrating delays.

But thanks to CDNs, Netflix caches popular content on edge servers closer to you, in Sydney or its surrounding region. This significantly reduces the distance the data needs to travel, ensuring smooth playback and an uninterrupted viewing experience, regardless of your location.

In fact, studies show that CDNs can reduce video startup time by up to 50%, making a significant difference in user satisfaction.

Gaming Content Distribution: Gamers know the pain of waiting for massive game updates or DLC downloads. But companies like Steam and Epic Games leverage CDNs to make things faster.

These platforms cache game files, updates, and multiplayer assets on edge servers close to gaming communities. This means whether you're downloading a new game in New York or patching your favorite title in Tokyo, the data doesn't have to travel across continents.

Using CDNs can decrease download times quite a bit, leading to quicker access to the games you love and smoother multiplayer experiences with minimal lag.

Global News Websites: Staying informed about global events shouldn't be hindered by slow loading times. Major news organizations like BBC News and The New York Times use CDNs to ensure their breaking news updates and multimedia content reach audiences worldwide instantly.

By caching critical information like articles, videos, and images on servers across different continents, CDNs enable news websites to deliver real-time updates quickly, keeping readers informed regardless of their location.

During major events or emergencies, this can be especially crucial, as evidenced by a case study where a news organization using a CDN reported a 20% increase in website traffic without any performance issues during a breaking news event.

Caching vs CDNs – What's the Difference?

Similarities between caching and CDNs:

Improved Performance: Both CDNs and caching aim to enhance website performance by reducing latency and speeding up content delivery.

Efficient Resource Utilization: By serving cached or replicated content, both approaches help optimize resource utilization and reduce server load.

Enhanced User Experience: Faster load times lead to a better user experience, whether achieved through CDNs or caching.

Differences between Caching and CDNs

Scope:

CDNs: CDNs are a network of servers located in different geographic locations around the world.
Caching: Caching is a method of storing web content on a user’s local device or server.

Implementation:

CDNs: CDNs require a separate infrastructure and configuration.
Caching: Caching can be implemented within a web application or server using caching rules and directives.

Geographic Coverage:

CDNs: Designed to deliver web content to users across the world.
Caching: Typically used to improve performance for individual users or within a local network.

Network Architecture:

CDNs: Use a distributed network of servers to cache and deliver content.
Caching: This can be implemented using various types of storage such as local disk, memory, or a server-side cache.

Performance Benefits:

CDNs: Provide faster and more reliable content delivery by caching content in multiple locations.
Caching: Improves performance by reducing the number of requests to the origin server and delivering content faster from a local cache.

Cost:

CDNs: Can be more expensive to implement and maintain due to the need for a separate infrastructure and ongoing costs for network maintenance.
Caching: Can be implemented using existing infrastructure and server resources, potentially reducing costs.

When to Use Caching

Caching is ideal for frequently accessed content that doesn't change frequently. This includes static assets like images, CSS files, and JavaScript libraries.

It's particularly effective for websites with a substantial user base accessing similar content, such as news websites, blogs, and e-commerce platforms.

Caching can also significantly reduce server load and improve response times for users, especially in scenarios where content delivery latency is a concern.

When to Use CDNs

CDNs are invaluable for delivering content to a global audience, especially when geographical distance between users and origin servers leads to latency issues.

They are well-suited for serving static assets and streaming media and for absorbing sudden spikes in traffic. Many CDNs can also accelerate dynamic content.

CDNs also excel in scenarios where content needs to be delivered reliably and consistently across diverse geographic regions, ensuring optimal user experience regardless of location.

Combining Caching and CDNs

In many scenarios, employing both caching and CDNs together yields optimal results, particularly for dynamic websites and applications where a mix of static and dynamic content delivery is essential. Let's consider a popular news website as an example.

Imagine a bustling news website that regularly publishes breaking news articles, accompanied by images and videos. While the core news content is dynamic and frequently updated, the images and videos associated with older articles remain relatively static and are accessed repeatedly by users.

To address this, the website can implement a combined strategy:

Caching on the Origin Server: Frequently accessed elements like website templates, navigation menus, and static content are cached directly on the origin server. This caching reduces server load and enhances performance for initial page loads.
CDN Caching: The website leverages a CDN to cache frequently accessed images and videos associated with news articles on edge servers located worldwide. This ensures that users, regardless of their geographic location, can swiftly access these elements with minimal latency.
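
One common way for an origin server to tell a CDN what it may cache is the Cache-Control response header. The sketch below picks a header value per request path; the cacheControlFor function, the file extensions, and the one-day max-age are assumptions chosen for illustration, not recommendations from the article.

```javascript
// Decide a Cache-Control value for a response based on the requested path.
function cacheControlFor(path) {
  // Images, videos, stylesheets, and scripts change rarely, so shared caches
  // such as CDN edge servers may keep them for a day (86400 seconds).
  if (/\.(png|jpe?g|gif|webp|mp4|css|js)$/i.test(path)) {
    return 'public, max-age=86400';
  }
  // Frequently updated pages: caches must revalidate with the origin before reuse.
  return 'no-cache';
}
```

In an Express handler, the returned string could be applied with res.set('Cache-Control', cacheControlFor(req.path)).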

There are many benefits of the combined approach, such as:

Faster Loading Times: By serving cached content from both the origin server and CDN edge servers, users experience significantly faster loading times, leading to a more engaging browsing experience.
Reduced Server Load: Caching alleviates pressure on the origin server, enabling it to efficiently process dynamic content updates while serving static elements from cache.
Improved Global Reach: The CDN ensures that users worldwide can access the website and its content with minimal delays, irrespective of their proximity to the origin server.

But there are also some factors to consider:

Cache Invalidation: Regularly updating cached content ensures users access the latest information. Most CDNs offer efficient cache invalidation mechanisms to facilitate this process.
Cost Optimization: While combining caching and CDNs enhances performance, it's crucial to evaluate the cost-effectiveness of caching specific content. Analyzing user access patterns helps determine the optimal caching strategy.

By strategically combining caching and CDNs, you and your team can create a robust content delivery infrastructure that delivers a superior user experience worldwide.

Wrapping Up

Both CDNs and caching play crucial roles in optimizing website performance and user experience by speeding up content delivery.

While caching stores frequently accessed data locally for quick retrieval, CDNs provide a geographically distributed network of servers to deliver content efficiently to users worldwide.

Understanding their similarities in performance improvement and resource utilization, as well as their key differences in scope, implementation, and cost is crucial for choosing the right approach for your specific needs.

Caching in React – How to Use the useMemo and useCallback Hooks

freeCodeCamp — Mon, 15 May 2023 18:39:25 +0000

By Scott Gary

As you become more proficient at coding in React, performance will become a major focal point in your development process.

As with any tool or programming methodology, caching plays a huge role when it comes to optimizing React applications.

Caching in React typically goes by the term memoization. It's used to improve performance by reducing the number of times a component re-renders or recomputes values due to state or prop changes.

React provides two APIs for caching: useMemo and useCallback. useCallback is a hook that memoizes a function, while useMemo is a hook that memoizes a value. These two hooks are often used in conjunction with the Context API to further improve efficiency.

Here’s a basic list of topics we’ll be covering in this article:

React caching default behavior.
The useMemo hook.
The useCallback hook.

In order to follow along, you'll need a decent understanding of React and stateful components.

Default Caching Behavior in React

By default, React re-renders a component whenever its state changes or its parent re-renders, even if its props are the same. React does skip a state update when the new value is identical to the current one, and a component wrapped in React.memo is skipped when a “shallow comparison” shows that its props haven’t changed.

While this default caching behavior is very effective by itself, it isn’t always enough to optimize complex components that require advanced state management.

In order to achieve more control over your component’s caching and rendering behavior, React offers the useMemo and useCallback hooks.

Caching in React with the useMemo Hook

useMemo is useful when you need to do an expensive computation to retrieve a value, and you want to ensure that the computation is only performed when necessary. By memoizing the value using useMemo, you can ensure that the value is only computed when its dependencies change.

In a React component, you may have multiple properties that make up your state. If a piece of state changes that has nothing to do with our expensive value, why recompute it if it hasn’t changed?

Here’s an example code block reflecting a basic useMemo implementation:

import React, { useState, useMemo } from 'react';

function Example() {
  const [txt, setTxt] = useState("Some text");
  const [a, setA] = useState(0);
  const [b, setB] = useState(0);

  const sum = useMemo(() => {
    console.log('Computing sum...');
    return a + b;
  }, [a, b]);

  return (
    <div>
      <p>Text: {txt}</p>
      <p>a: {a}</p>
      <p>b: {b}</p>
      <p>sum: {sum}</p>
      <button onClick={() => setTxt("New Text!")}>Set Text</button>
      <button onClick={() => setA(a + 1)}>Increment a</button>
      <button onClick={() => setB(b + 1)}>Increment b</button>
    </div>
  );
}

In our Example component above, assume the function passed to useMemo, which computes sum, performs an expensive computation. If the txt state is updated, React is going to re-render our component, but because we memoized the value of sum, this function will not run again at this time.

The only time the function will run again is when either the a or b state has changed. This is an excellent improvement upon the default behavior, which would recompute the value on every re-render.
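
To see the dependency check in isolation, here is a plain JavaScript sketch of the idea. The createMemo helper is hypothetical and is not how React implements useMemo internally; it only mimics the rule "recompute when any dependency changed".

```javascript
// Returns a memo(compute, deps) function that reruns compute only when
// one of the deps differs from the previous call (compared with Object.is).
function createMemo() {
  let lastDeps;
  let lastValue;
  return function memo(compute, deps) {
    const changed =
      !lastDeps ||
      deps.length !== lastDeps.length ||
      deps.some((dep, i) => !Object.is(dep, lastDeps[i]));
    if (changed) {
      lastValue = compute();
      lastDeps = deps;
    }
    return lastValue;
  };
}

const memoSum = createMemo();
memoSum(() => 1 + 2, [1, 2]); // computes
memoSum(() => 1 + 2, [1, 2]); // same deps: returns the cached value
memoSum(() => 2 + 2, [2, 2]); // deps changed: computes again
```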

Caching in React with the useCallback Hook

useCallback is useful when you need to pass a function as a prop to a child component, and you want to ensure that the function reference does not change unnecessarily. By memoizing the function using useCallback, you can ensure that the function reference remains the same as long as its dependencies do not change.

Without getting too heavy into JavaScript function references, let’s just take a look at how they can affect the rendering of your React app. When a function reference changes, any memoized child component (one wrapped in React.memo) that receives the function as a prop will re-render, even if the function logic itself has not changed.

This is because React.memo does a shallow comparison of prop values to determine whether a component should re-render, and a new function reference will always be considered a different value than the previous one.

In other words, the simple act of redeclaring a function (even the same exact function) causes the reference to change, and will cause a memoized child component that receives the function as a prop to unnecessarily re-render.
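
The reference problem is easy to reproduce outside React. The shallowEqual helper below is a simplified, hypothetical version of a shallow props comparison, written only to show why two identical-looking functions count as different props.

```javascript
// Compare two props objects key by key using Object.is (no deep comparison).
function shallowEqual(prevProps, nextProps) {
  const prevKeys = Object.keys(prevProps);
  const nextKeys = Object.keys(nextProps);
  if (prevKeys.length !== nextKeys.length) return false;
  return prevKeys.every(
    (key) =>
      Object.prototype.hasOwnProperty.call(nextProps, key) &&
      Object.is(prevProps[key], nextProps[key])
  );
}

// Each call creates a brand-new function object, like redeclaring a handler on every render.
const makeHandler = () => () => console.log('clicked');
const firstRender = makeHandler();
const secondRender = makeHandler();

const propsChanged = !shallowEqual({ onClick: firstRender }, { onClick: secondRender });
const propsStable = shallowEqual({ onClick: firstRender }, { onClick: firstRender });
```

Here propsChanged is true even though both handlers do the same thing, while propsStable is true because the same reference is reused, which is the situation useCallback is meant to create.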

Here’s an example code block reflecting a basic useCallback implementation:

import React, { useState, useCallback, memo } from 'react';

// memo() makes the child skip re-rendering when its props are shallowly equal
const ChildComponent = memo(function ChildComponent({ onClick }) {
  console.log('ChildComponent is rendered');
  return (
    <button onClick={onClick}>Click me</button>
  );
});

function Example() {
  const [count, setCount] = useState(0);
  const [txt, setTxt] = useState("Some text…");

  // The state setter is stable, so incrementCount keeps the same reference across renders
  const incrementCount = useCallback(() => {
    setCount(prevCount => prevCount + 1);
  }, [setCount]);

  return (
    <div>
      <p>Text: {txt}</p>
      <p>Count: {count}</p>
      <button onClick={() => setTxt("New text!")}>Set Text</button>
      <button onClick={incrementCount}>Increment</button>
      <ChildComponent onClick={incrementCount} />
    </div>
  );
}

As you can see in the above example, we pass the memoized incrementCount function to the child component, which is wrapped in React.memo. Because incrementCount keeps the same reference between renders, updating txt or count re-renders Example but does not cause the child component to unnecessarily re-render.

The child component would only re-render if incrementCount received a new reference, which happens when one of the dependencies passed to useCallback changes. Here the only dependency is setCount, which React keeps stable, so the child is not re-rendered after the initial mount.

Conclusion

Caching is an important technique for optimizing React applications. By reducing unnecessary re-renders, caching can help to improve the performance and efficiency of your application.

React's default optimization is the virtual DOM: after each render, React compares the new output with the previous one and updates only the parts of the real DOM that changed. This is sufficient in many scenarios, but the components themselves still re-render, so sometimes more fine-grained control is desired.

The useMemo and useCallback hooks were created to achieve this fine-grained control.

useMemo is used to memoize the results of a function call, and is useful when the function is expensive to compute and the result does not change often.

useCallback is used to memoize the actual reference of a function rather than the returned value, and is used when the function is passed as a prop to child components that might cause unnecessary re-renders.


What is Pre-Caching? How to Increase Website Speed and Performance

freeCodeCamp — Thu, 19 Jan 2023 18:05:56 +0000

By Saurabh Dashora

Speed and performance are two of the key ingredients that make a website stand out from its peers.

Imagine visiting a bestseller list on the Amazon website and finding that the product pages take forever to show up.

What about a blog publishing some great stories that readers aren’t able to read because the incoming traffic exceeds the server’s capacity?

And if that’s not enough, just measure your frustration when there is a much-anticipated new movie on Netflix and all you get to see is the loading screen.

If you are the one running such a website, this is a good problem to have. Clearly, people are enthusiastic about the things you are offering and there is a disproportionate amount of traffic for certain pages.

However, if not handled properly, the situation can quickly spiral into chaos and end up alienating your users. Ultimately, it will result in a loss of business whether in terms of viewership time, sales, or even general user goodwill.

So how can you avoid this situation?

One of the most prominent techniques to boost website speed and performance is pre-caching.

In this post, you will get to understand the concept of pre-caching in great detail.

What is Pre-caching?

Pre-caching is a technique used to proactively store or cache data in anticipation of future requests. The idea is to cache commonly accessed data or resources in advance so that when the time comes, you can deliver it to the end-user faster.

Check out the below illustration that depicts how pre-caching can work in an overall system context.

[Figure: How pre-caching works]

You can perform pre-caching on the client-side such as in the web browser. Alternatively, you can also do it on the server-side using Content Delivery Networks (CDNs) or other caching solutions.

Whatever may be the approach, the goal of pre-caching is to improve the performance and user experience by reducing content load times.

How Pre-caching Works

The process of pre-caching requires you to store a copy of the data in a location that’s closer to the user or store the data in advance so that it is readily available when needed.

Below are the high-level steps to implement pre-caching.

First, identify the data or resources that are accessed most frequently. These resources are good candidates for pre-caching, for example, the most popular blog posts or a bestseller product list. You could also add images, JS files, and stylesheets to the pre-caching list.
After the identification, you need to decide on the caching system to store the pre-cached data. This could be a local cache on the user’s device or even a distributed cache spanning multiple servers. The choice will depend on the type of resource.
Next, you need to pre-populate the cache with the identified resources. This step can be performed automatically by the system during the initialization phase. Alternatively, you can do it on an on-demand basis as the data is accessed by the users. Remember, pre-caching is all about being proactive.
Once the setup is ready, you can let your system do the job. Whenever the system needs to access the pre-cached data, it can retrieve it directly from the cache instead of fetching it from the slower external source.

How to Decide What to Pre-cache

The success of pre-caching depends on the quality of data that is pre-cached. It is important for you to choose the right data to cache.

While it can sound daunting in practice, you can follow the below rules to make the right choice:

Prioritize critical resources such as the HTML, CSS, and JavaScript files that are needed for the initial page load. Usually, these are the most important resources that are required to provide a fast and smooth user experience.
You should also consider caching third-party resources like fonts, libraries or scripts from other domains. These resources can be pre-cached locally to reduce frequent network requests.
Your initial assumption about the best resources for pre-caching can change. Therefore, it is vital to perform a regular analysis of your web application’s usage pattern and derive insights about the user activity. This will help your pre-caching catalog stay relevant as your application evolves.
Explore the use of machine learning for pre-caching. You can build prediction models in order to predict which resources will be requested in the future based on past usage patterns. You can train this model on historical data and use it to identify the best candidate resources for pre-caching. Of course, this is a costly approach and its use depends on the importance of pre-caching in your application context.
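
A much simpler starting point than a prediction model is to count requests and pre-cache the most popular paths. The sketch below assumes the access log is already available as an array of request paths; the topResources function and the sample paths are hypothetical.

```javascript
// Return the `limit` most frequently requested paths from an access log.
function topResources(accessLog, limit) {
  const counts = new Map();
  for (const path of accessLog) {
    counts.set(path, (counts.get(path) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1]) // most requested first
    .slice(0, limit)
    .map(([path]) => path);
}

const sampleLog = ['/posts/1', '/posts/2', '/posts/1', '/posts/3', '/posts/1', '/posts/2'];
const candidates = topResources(sampleLog, 2);
```

Running this analysis on a schedule keeps the pre-caching list aligned with what users actually request.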

The Advantages of Pre-caching

Pre-caching may sound like a lot of trouble. Why bother worrying about it?

In my view, the advantages of pre-caching can outweigh the difficulties. Here are a few important benefits of pre-caching:

Performance Improvement – When you pre-cache data, you are essentially reducing the load time for the content. This leads to a faster user experience. Websites and apps that expect a high volume of traffic or deal with huge amounts of data benefit significantly from this.
Improved User Experience – Who doesn’t like a good user experience? Faster load times improve the overall user experience and help reduce the bounce rate (the percentage of users leaving a website after visiting one page). Pre-caching also improves content availability even in the case of a poor network connection.
Cost Reduction – Pre-caching can help you reduce costs. For example, if you pre-cache data on a CDN, you are ultimately reducing the load on the origin server. This saves bandwidth and reduces the server cost.
Offline Access – With pre-caching, you can also enable offline access to content using the concept of service workers. This is extremely important for mobile apps and websites that need to work in areas with poor internet connectivity.
Security – Though it is an indirect benefit, you also improve the overall security of your assets using pre-caching. Basically, pre-caching blunts the impact of DoS (Denial of Service) attacks since the application won’t have to serve the pre-cached resources from the origin server.

How to Implement Pre-caching in Node.js

Let’s look at a very basic implementation of pre-caching in Node.js:

const express = require('express');
const nodecache = require('node-cache');
require('isomorphic-fetch');

//Setting up Express
const app = express();

//Creating the node-cache instance
const cache = new nodecache({stdTTL: 10})

//We are using the fake API available at 
const baseURL = '';

//Pre-caching Popular Posts
[1, 2, 3].map(async (id) => {
    const fakeAPIURL = baseURL + id
    const data = await fetch(fakeAPIURL).then((response) => response.json());
    cache.set(id, data);
    console.log(`Post Id ${id} cached`);
})

//API Endpoint to demonstrate caching
app.get('/posts/:id', async (req, res) => {
    const id = req.params.id;
    if (cache.has(id)) {
        console.log('Fetching data from the Node Cache');
        res.send(cache.get(id));
    }
    else {
        const fakeAPIURL = baseURL + id
        const data = await fetch(fakeAPIURL).then((response) => response.json());

        cache.set(req.params.id, data);
        console.log('Fetching Data from the API');
        res.send(data);
    }
})

//Starting the server
app.listen(3000, () => {
    console.log('The server is listening on port 3000')
})

The example uses the node-cache library to create an in-memory cache. It is borrowed from a blog post that shows how to implement an in-memory cache in Node.js.

To simulate how pre-caching works, it assumes that posts 1, 2, and 3 are extremely popular posts and suitable candidates for pre-caching. Therefore, the data for these posts are pre-fetched during the application startup process and stored in the cache object.

When a request is made for these specific posts, the application fetches the data directly from the cache instead of calling the API.

Of course, this is a very basic setup for pre-caching. But you should get the idea of the concept in action.
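
The same warm-up idea can be sketched independently of Express and node-cache. The Python below is only an illustration: TTLCache, warm_cache, get_post, and the fetch_post callback are hypothetical names that do not come from the example above, and the clock is injectable purely so that expiry can be exercised without waiting.

```python
import time
from typing import Any, Callable, Dict, Iterable, Optional, Tuple


class TTLCache:
    """A tiny in-memory cache with a per-entry expiry time (hypothetical helper)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        # Remember the value together with the moment it stops being valid
        self._store[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on the next read
            del self._store[key]
            return None
        return value


def warm_cache(cache: TTLCache, popular_ids: Iterable[Any], fetch_post: Callable[[Any], Any]) -> None:
    """Pre-cache: fetch the expected-popular posts before any request arrives."""
    for post_id in popular_ids:
        cache.set(str(post_id), fetch_post(post_id))


def get_post(cache: TTLCache, post_id: Any, fetch_post: Callable[[Any], Any]) -> Any:
    """Serve from the cache when possible, otherwise fetch and store."""
    key = str(post_id)
    cached = cache.get(key)
    if cached is not None:
        return cached
    data = fetch_post(post_id)
    cache.set(key, data)
    return data
```

Calling warm_cache once at startup plays the role of the pre-caching loop, and get_post mirrors the request handler. Keys are normalized to strings so that an ID taken from a URL and a numeric ID from the warm-up list refer to the same entry.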

Types of Pre-caching

While the previous section’s example demonstrated a particular approach to implement pre-caching, there are other methods as well.

Since the basic idea of pre-caching is quite simple, you can implement it in several different ways. Broadly, there are two main approaches: client-side pre-caching and server-side pre-caching.

Client-side Pre-caching

The most common way of client-side pre-caching is to proactively cache resources on the browser. With browser-based caching, you are trying to anticipate the resources that will be requested and store them in advance using the browser’s cache-storage API.

Browser-based caching often relies on the caching headers to determine if a particular resource is cacheable. When a user requests a page, the browser checks its cache to see if a copy of the requested data is already available. If it is, the browser loads the data from the cache. This reduces the time it takes to load the page.

Another way of implementing browser-based caching is using the Service Worker API. Basically, assets are pre-cached ahead of time, usually during the process of installing a service worker.

With service worker pre-caching, key static assets and materials such as HTML, CSS, JS and image files that are necessary for offline access get downloaded and stored in a Cache instance. You can use the JavaScript library named Workbox that makes it easy to precache resources and provides a simple API for working with the service worker cache.

Server-Side Pre-caching

The second approach is server-side pre-caching. There are several ways to do it.

The first method is to use Content Delivery Networks (CDNs), which store a copy of the data on servers distributed around the world. When a user makes a request for some data, it is delivered from the server that is closest to the user. This reduces the time to handle the request and makes your website faster for the user.

Pre-caching with CDN

The second method is to use a caching proxy server. It is a server that sits in front of the origin server and works as a caching layer by storing a copy of the data. The next time a user requests that data, the proxy server delivers it directly without having to make a request to the origin server.

Caching Proxy Server

Here’s a sample configuration that uses NGINX as a caching proxy server to cache all files with the extensions jpeg/jpg, png, css, and js for 60 minutes.

# configure the proxy server to cache all assets for 1 hour
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static_cache:20m inactive=60m;

# set the cache control header to a max-age of 1 hour
add_header Cache-Control "public, max-age=3600";

# cache all assets
location ~* \.(jpg|jpeg|png|css|js)$ {
    proxy_cache static_cache;
    proxy_cache_valid 200 60m;
    proxy_pass http://origin_server;
}

Pre-Caching Best Practices

Pre-caching is an extremely useful technique to improve your web application’s performance. However, you should follow some best practices to ensure that pre-caching gives the desired results.

Here are a few best practices you should keep in mind while pre-caching:

Cache only necessary resources. Caching too many resources can lead to wasting storage space and nullifying the advantages of pre-caching. You should only cache the resources that have the greatest impact on performance improvement.
Don’t forget to version pre-cached resources. Even pre-cached resources may get updated in the future. So you should make sure to use versioning to identify the latest version of a resource. When a resource gets updated, you should also increment the version number to keep track of subsequent updates.
Use appropriate cache-control headers. Cache-control headers help us ensure that the resources are cached for the right amount of time. For example, an e-commerce platform might have a list of bestselling products that’s changing frequently due to products dropping off the charts or rising up. Such resources should have a shorter cache lifetime to keep the cache relevant.
Use a Content Delivery Network (CDN). CDNs help reduce the time it takes to load the resources. You should leverage a CDN to distribute resources to the edge servers located closer to the user. Edge servers in combination with pre-caching is a powerful technique to boost your web application’s performance.
Use a library or framework to enable pre-caching. Even though you might be tempted to build your solutions for pre-caching from scratch, you should consider using a library such as Workbox to enable service workers for pre-caching resources. For server-side caching, consider using a combination of CDNs and caching proxy servers.
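
To make the versioning practice concrete, here is a minimal Python sketch. The VersionedCache class and the "name?v=N" key format are assumptions made for illustration, not part of any particular library. Each update bumps a version number and evicts the entry stored under the previous version.

```python
from typing import Any, Dict, Optional


def versioned_key(name: str, version: int) -> str:
    """Build a cache key that embeds the resource version (format is an assumption)."""
    return f"{name}?v={version}"


class VersionedCache:
    """Keeps only the latest version of each pre-cached resource (hypothetical helper)."""

    def __init__(self) -> None:
        self._versions: Dict[str, int] = {}
        self._store: Dict[str, Any] = {}

    def put(self, name: str, content: Any) -> int:
        """Store new content under a bumped version and drop the old entry."""
        old_version = self._versions.get(name, 0)
        self._store.pop(versioned_key(name, old_version), None)
        new_version = old_version + 1
        self._versions[name] = new_version
        self._store[versioned_key(name, new_version)] = content
        return new_version

    def get(self, name: str, version: int) -> Optional[Any]:
        """Look up a specific version; older versions are no longer available."""
        return self._store.get(versioned_key(name, version))

    def current_version(self, name: str) -> Optional[int]:
        return self._versions.get(name)
```

Because the version is part of the key, a client that still asks for an old version gets a miss instead of silently receiving outdated content.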

Drawbacks of Pre-caching

While pre-caching is a very useful technique and mostly beneficial, you should be aware of the following drawbacks:

Stale data: With pre-caching, you are storing data in advance. This data may not always be up-to-date. If the data changes frequently, the pre-cached version will become stale and lead to issues. To avoid this situation, you must have a proper strategy for cache invalidation that can get rid of stale data.
Inefficient use of resources: While pre-caching, you are basically assuming that the data you are caching will be accessed frequently. Such assumptions may not always be correct and you may end up with data that’s not needed frequently. If the size of such data grows beyond a certain limit, the caching solution can become inefficient and cause wastage of precious resources that can be used for other purposes.
Limited scope: Pre-caching is limited in scope. It only applies to data that is known in advance and can be pre-populated in the cache. This is mostly static data. It’s tough to implement pre-caching for data that is generated dynamically without the use of complex algorithms.
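
One simple invalidation strategy for the stale data problem is to evict a cached entry whenever the underlying record is written. The sketch below assumes a plain dict standing in for the database; InvalidatingStore is a hypothetical name, and real systems usually combine this approach with expiry times.

```python
from typing import Any, Dict


class InvalidatingStore:
    """Reads go through a cache; writes update the source and evict the cached copy."""

    def __init__(self, database: Dict[str, Any]) -> None:
        self._db = database          # stand-in for the real data source
        self._cache: Dict[str, Any] = {}
        self.db_reads = 0            # counts how often the data source was hit

    def read(self, key: str) -> Any:
        if key in self._cache:
            return self._cache[key]
        self.db_reads += 1
        value = self._db[key]
        self._cache[key] = value
        return value

    def write(self, key: str, value: Any) -> None:
        self._db[key] = value
        # Invalidate so the next read cannot return the stale value
        self._cache.pop(key, None)
```

After a write, the next read goes back to the data source once and repopulates the cache with the fresh value.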

That’s it

To conclude, pre-caching is a powerful technique that can potentially improve the speed and performance of your website.

With pre-caching, you are essentially trying to anticipate which resources a user is likely to request next and downloading them in advance.

Even on its own, pre-caching is a game-changing technique that can have a significant impact on the user experience of your web application.

However, it's equally important to keep in mind that pre-caching is only one aspect of the website optimization process. You should try and use it in conjunction with other techniques such as minification, compression, and code optimization.

If you liked this post and found it useful, consider sharing it with friends and colleagues.

How to Cache Expensive Database Queries Using the Momento Serverless Cache

freeCodeCamp — Thu, 22 Dec 2022 20:45:05 +0000

By Andrew Brown

When to Use a Cache

When you are building a web-application, you'll need to fetch data from a database. As your traffic and the size of your database grow, you will find that querying your database gets slower and slower.

To return responses to users quickly, a cache can be a cost-effective and easy alternative to upgrading your database.

Diagram showing how a cache works

A cache is an in-memory database which can store simple data as a key and value data structure.

Popular open-source caching solutions that already exist are Memcached and Redis.

What is Momento?

Momento Serverless Cache is a Caching-as-a-Service (CaaS) that you can integrate as a caching solution. It will reduce expensive or unnecessary queries against your primary database.

Momento has an SDK for the eight most popular programming languages. Here's an example of using the Ruby SDK to do a simple get and set of a cache item:

require 'momento'

client = Momento::SimpleCacheClient.new(
  auth_token: ENV['MOMENTO_AUTH_TOKEN'],
  default_ttl: ENV['MOMENTO_TTL']
)

response = client.set ENV['MOMENTO_CACHE_NAME'], "Hello", "World"
response = client.get ENV['MOMENTO_CACHE_NAME'], "Hello"
if response.hit?
  puts "Cache returned: #{response.value_string}"
elsif response.miss?
  puts "The item wasn't found in the cache."
end

Just a quick note: the company is called Momento and the caching product is called the Momento Serverless Cache, but we'll just say "Momento" to refer to the product to keep it simple.

Why Use Momento?

Momento is a Serverless Cache, and as a result has the following benefits:

Creating a new cache is nearly instant
You pay based on usage ($0.15 per GB transferred)
It has a very generous free-tier (first 50GB per month free)
No credit card required to start using the cache
It just scales, no server configuration or tuning required
It just works from anywhere

Momento is ideal for developers who just need a simple caching solution, and want to focus on their code instead of having to manage caching infrastructure.

Why Not Use a Managed Open Source Service?

There are already managed cloud services for open-source caches.

For example:

AWS has ElastiCache which allows you to run Memcached or Redis
Amazon MemoryDB for Redis
Azure has an Azure Cache for Redis
Redis has its own Redis Cloud offering

These existing cloud services can simplify some aspects of hosting and scaling a caching layer for your web-applications. But there are some things to consider:

you have to choose the right size compute
there are additional application integration steps
it takes time (up to an hour) to provision a cache
there are limitations on where a cache must live in your network

The open-source Redis in-memory database, for example, has a variety of complex data structures and data operations. It could be suited to more advanced use cases, where it goes beyond being a cache and can act as (and is marketed as) a primary database.

There is no wrong answer when choosing a cache. What you have are trade-offs and you need to choose the best solution for your use-case.

How to Install Momento CLI

Momento (at the time of writing this article) is an API-only service.

So in order to use Momento you need to create an account by using their CLI tool.

macOS Install Instructions (using Homebrew):

brew tap momentohq/tap
brew install momento-cli

Linux Install Instructions:

wget https://github.com/momentohq/momento-cli/releases/download/v0.22.8/momento-cli-0.22.8.linux_x86_64.tar.gz
tar -xvf momento-cli-0.22.8.linux_x86_64.tar.gz --strip-components 3
sudo mv momento /usr/local/bin
rm momento-cli-0.22.8.linux_x86_64.tar.gz

Once installed, test that the CLI is working with the following command:

momento --version
> momento 0.22.6

How to Create a Momento Account

To create an account, enter the following command:

momento account signup aws \
--email YOUR_EMAIL \
--region us-east-1

> Signing up for Momento...
> Success! Your access token will be emailed to you shortly.

Remember to replace YOUR_EMAIL with your own email address (for example, andrew@example.com).

Momento is going to email an access token and this access token is how Momento will identify and authorize our future API calls to use the cache.

Example email with provided token

Why did I need to type "aws" when creating an account?

Notice that we specified aws and the AWS region us-east-1 on creation.

When you create an account, you need to say which Cloud Service Provider (CSP) the cache will be hosted on.

You might think, do I need to have and connect my own AWS account?

The answer is no. The cache is being set up within Momento's AWS account.

The reason Momento allows you to choose the CSP is because some companies have data policies about what part of the world and what CSP their data must reside on.

How to Configure the CLI to Use the Access Token

We need to configure the CLI to use the access token that was emailed.

Type the momento configure command to prompt the configuration wizard:

momento configure
Token: XXXXXXXXXXXXXXXXX
Default Cache [default-cache]: 
Default Ttl Seconds [600]: 
default-cache successfully created as the default with default TTL of 600s

Token: Enter the token by copying and pasting it from the previous email
Default Cache: Hit enter
Default TTL: Hit enter

The momento configure command will generate two TOML configuration files:

~/.momento/credentials – stores sensitive configuration, for example: access token

[default]
token=XXXXXXXX

~/.momento/config – stores common configuration, for example: ttl default

[default]
cache=default-cache
ttl=600

How to Set and Get Cache Data

Setting and getting cache data is straightforward. You have the cache set and the cache get subcommands:

momento cache set --key "andrew" --value "brown" 
momento cache get --key "andrew"
> brown

How to Create a New Cache

We can create another cache instantly with the cache create command. We then supply the --name flag to cache set and cache get to target it:

momento cache create --name freecodecamp
momento cache set --name freecodecamp --key "Quincy" --value "Larson" 
momento cache get --name freecodecamp --key "Quincy" 
> Larson

How to Integrate Momento Directly into Your Web Application Code

To use Momento within backend web-application code, we need to use one of the provided SDKs.

Let's write an example of using Momento in a Flask (Python) web-application using the Momento Python SDK.

Here is what our Flask app looks like without caching:

import os
import psycopg2
from flask import Flask, render_template
import json

app = Flask(__name__)

def get_db_connection():
    conn = psycopg2.connect(host='localhost',
                            database='flask_db',
                            user=os.environ['DB_USERNAME'],
                            password=os.environ['DB_PASSWORD'])
    return conn

@app.route('/')
def index():
    json_data = get_free_courses()

    response = app.response_class(
        response=json_data,
        status=200,
        mimetype='application/json'
    )
    return response

def get_free_courses():
  json_data = None
  conn = get_db_connection()
  cur = conn.cursor()

  cur.execute('SELECT * FROM free_courses;')
  free_courses = cur.fetchall()

  json_data = json.dumps(free_courses)

  cur.close()
  conn.close()
  return json_data

Here is what our application would look like implementing Momento:

import os
import psycopg2
from flask import Flask, render_template
import json
from momento import simple_cache_client as scc

# The environment variable names are this example's convention; set them before starting the app
_MOMENTO_AUTH_TOKEN  = os.getenv('MOMENTO_AUTH_TOKEN')
# os.getenv returns a string, so convert the TTL to an integer (600 is an assumed fallback)
_MOMENTO_TTL_SECONDS = int(os.getenv('MOMENTO_TTL_SECONDS', '600'))
_MOMENTO_CACHE_NAME  = os.getenv('MOMENTO_CACHE_NAME')

app = Flask(__name__)

def get_db_connection():
    conn = psycopg2.connect(host='localhost',
                            database='flask_db',
                            user=os.environ['DB_USERNAME'],
                            password=os.environ['DB_PASSWORD'])
    return conn

@app.route('/')
def index():
  with scc.SimpleCacheClient(_MOMENTO_AUTH_TOKEN, _MOMENTO_TTL_SECONDS) as cache_client:
    key = 'get_free_courses'
    get_resp = cache_client.get(_MOMENTO_CACHE_NAME, key)
    if get_resp.status() == 'hit':
      # Cache hit: reuse the JSON stored by an earlier request
      json_data = get_resp.value()
    else:
      # Anything other than a hit: query the database, then store the result for next time
      json_data = get_free_courses()
      cache_client.set(_MOMENTO_CACHE_NAME, key, json_data)

    response = app.response_class(
        response=json_data,
        status=200,
        mimetype='application/json'
    )
    return response

def get_free_courses():
  json_data = None
  conn = get_db_connection()
  cur = conn.cursor()

  cur.execute('SELECT * FROM free_courses;')
  free_courses = cur.fetchall()

  json_data = json.dumps(free_courses)

  cur.close()
  conn.close()
  return json_data

Summary

If you want to give Momento a go, visit their website documentation for more information.

https://docs.momentohq.com/

Redis Database Basics – How the Redis CLI Works, Common Commands, and Sample Projects

freeCodeCamp — Wed, 14 Apr 2021 15:55:10 +0000

By Mehul Mohan

Redis is a popular in-memory database used for a variety of projects, like caching and rate limiting.

In this blog post, we will see how you can use Redis as an in-memory database, why you'd want to use Redis, and finally we'll discuss a few important features of the database. Let's start.

What is an in-memory database?

Traditional databases keep part of the database (usually the "hot" or often-accessed indices) in memory for faster access, and the rest of the database on disk.

Redis, on the other hand, focuses a lot on latency and the fast retrieval and storage of data. So it operates completely on memory (RAM) instead of storage devices (SSD/HDD). Speed is important!

Redis is a key-value database. But don't let it fool you into thinking it's a simple database. You have a lot of ways to store and retrieve those keys and values.

Why do you need Redis?

You can use Redis in a lot of ways. But there are two main reasons I can think of:

You are creating an application where you want to make your code layer stateless. Why? - Because if your code is stateless, it is horizontally scalable. Therefore, you can use Redis as a central storage system and let your code handle just the logic.
You are creating an application where multiple apps might need to share data. For example, what if somebody is trying to bruteforce your site at payments.codedamn.com, and once you detect it, you'd also like to block them at login.codedamn.com? Redis lets your multiple disconnected/loosely connected services share a common memory space.

Redis Basics

Redis is relatively simple to learn as there are only a handful of commands you'll need to know. In the next couple sections, we'll cover a few main Redis concepts and some useful common commands.

The Redis CLI

Redis has a CLI which is a REPL version of the command line. Whatever you write will be evaluated.

The above image shows you how to do a simple PING or hello world in Redis in one of my codedamn Redis course exercises (the course is linked at the end if you want to check it out).

This Redis REPL is very useful when you're working with the database in an application and quickly need to get a peek into a few keys or the state of Redis.

Common Redis commands

Trying out common commands on Redis CLI in codedamn course

Here are a few very commonly used commands in Redis to help you learn more about how it works:

SET

SET allows you to set a key to a value in Redis.

Here's an example of how it works:

SET mehul "developer from india"

This sets the key mehul to the value developer from india.

GET

GET allows you to get the value of a key you've set.

Here's the syntax:

GET mehul

This will return the string "developer from india" as we set above.

SETNX

This command will set a value only if the key does not exist. It has a number of use cases, including not accidentally overwriting the value of a key which might already be present.

Here's how it works:

SET key1 value1
SETNX key1 value2
SETNX key2 value2

After running this example, key1 will have the value value1 and key2 will have the value value2. This is because the second command has no effect, as key1 was already present.

MSET

MSET is like SET, but you can set multiple keys together in one command. Here's how it works:

MSET key1 "value1" key2 "value2" key3 "value3"

Right now we are using key and value as the prefix for keys and values. But in reality when you write such code it's easy to lose track of what is a key and what is a value in such a long command.

So one thing you can do is always quote your value using double quotes, and leave your keys without quotes (if they are valid keynames without quotes).

MGET

MGET is similar to GET, but it can return multiple values at once, like this:

MGET key1 key2 key3 key4

This will return four values as an array: value1, value2, value3 and null. We got key4 as null because we never set it.

DEL

This command deletes a key – simple enough, right?

Here's an example:

SET key value
GET key # gives you "value"
DEL key 
GET key # null

INCR and DECR

You can use these two commands to increment or decrement a key which is a number. They are very useful and you'll use them a lot, because Redis can perform two operations in one – GET key and SET key to key + 1.

This avoids roundtrips to your parent application, and also makes the operation safe to perform without using transactions (more on this later).

Here's how they work:

SET favNum 10
INCR favNum # 11
INCR favNum # 12
DECR favNum # 11

EXPIRE

The EXPIRE command is used to set an expiration timer to a key. Technically it's not a timer, but a kill timestamp beyond which the key will always return null unless it's set again.

SET bitcoin 100
EXPIRE bitcoin 10

GET bitcoin # 100
# after 10 seconds
GET bitcoin # null

EXPIRE uses a little bit more memory to store that key as a whole (because now you have to also store when that key should expire). But you probably won't ever care about that overhead.

TTL

This command can be used to learn how much time the key has to live.

Example:

SET bitcoin 100
TTL bitcoin # -1
TTL somethingelse # -2

EXPIRE bitcoin 5
# wait 2 seconds
TTL bitcoin # returns 3
# after 3 more seconds
GET bitcoin # null
TTL bitcoin # -2

So what can we learn from this code?

TTL will return -1 if the key exists but doesn't have an expiration
TTL will return -2 if the key doesn't exist
TTL will return time to live in seconds if the key exists and will expire
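
These three rules can be checked against a small in-memory model. The Python below is not a Redis client: MiniRedis is a hypothetical stand-in that imitates only SET, GET, EXPIRE, and TTL, with an injectable clock so that time can be advanced by hand. The rounding of the remaining time is a simplification of this sketch.

```python
import math
import time
from typing import Callable, Dict, Optional


class MiniRedis:
    """A toy model of key expiry, written only to illustrate the TTL return values."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, str] = {}
        self._expires: Dict[str, float] = {}

    def _purge(self, key: str) -> None:
        # Drop the key if its expiry timestamp has passed
        if key in self._expires and self._clock() >= self._expires[key]:
            self._data.pop(key, None)
            del self._expires[key]

    def set(self, key: str, value: str) -> None:
        self._data[key] = value
        self._expires.pop(key, None)  # a plain SET leaves the key without an expiry

    def get(self, key: str) -> Optional[str]:
        self._purge(key)
        return self._data.get(key)

    def expire(self, key: str, seconds: float) -> int:
        self._purge(key)
        if key not in self._data:
            return 0
        self._expires[key] = self._clock() + seconds
        return 1

    def ttl(self, key: str) -> int:
        self._purge(key)
        if key not in self._data:
            return -2  # key does not exist
        if key not in self._expires:
            return -1  # key exists but has no expiry
        return math.ceil(self._expires[key] - self._clock())
```

Stepping the fake clock through the sequence from the example above produces -1, -2, 3, and finally -2 once the five seconds have elapsed.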

SETEX

You can perform SET and EXPIRE together with SETEX.

Like this:

SETEX key 10 value

Here, the key is "key", the value is "value", and the time to live (TTL) is 10. This key will get unset after 10 seconds.

Now that you have fundamental knowledge of basic Redis commands and how the CLI works, let's build a couple of projects and use those tools in real life.

Project 1 – Build an API Caching System with Redis

Preview of API caching system building lab on codedamn

This project involves setting up an API caching system with Redis, where you cache results from a 3rd party server and use it for some time.

This is useful so that you are not rate limited by that third party. Also, caching improves your site's speed, so if you implement it correctly it's a win-win for everyone.

You can build this project interactively on codedamn inside the browser using Node.js. If you're interested, you can try the API caching lab for free.

If you're only interested in the solution (and not building it yourself) here's how the core logic will work in Node.js:

app.post('/data', async (req, res) => {
    const repo = req.body.repo

    const value = await redis.get(repo)

    if (value) {
        // means we got a cache hit
        res.json({
            status: 'ok',
            stars: value
        })

        return
    }

    const response = await fetch(`https://api.github.com/repos/${repo}`).then((t) => t.json())

    if (response.stargazers_count != undefined) {
        await redis.setex(repo, 60, response.stargazers_count)
    }

    res.json({
        status: 'ok',
        stars: response.stargazers_count
    })
})

Let's see what's happening here:

We try to get the cached value for the repo (passed in owner/name format, for example facebook/react) from our Redis cache. If it's present, great! We return the star count from the cache, saving us a roundtrip to GitHub's servers.
If we don't find it in the cache, we make a request to GitHub's servers and get the star count. We check that the star count is not undefined (in case a repo doesn't exist or is private). If it has a value, we setex the value with a timeout of 60 seconds.
We set a timeout because we don't want to serve stale values indefinitely. With a 60-second expiry, a cached star count is never more than a minute old.

Here's the full source code:

https://github.com/codedamn-classrooms/redis-nodejs-classroom/tree/lab5sol

Project 2 - Rate limiting API with Redis

Preview of rate limiting API with Redis

This project involves rate limiting a certain endpoint to protect it from bad actors, and then blocking them from accessing that particular API.

This is very useful for login and sensitive API endpoints, where you don't want a single person to hit your endpoint with thousands of requests.

We perform rate limiting by IP address in this lab. If you want to attempt this codelab, you can try it for free on codedamn.

If you're only interested in the solution (and not building it yourself) here's how the core logic will work in Node.js:

app.post('/api/route', async (req, res) => {
    // add data here
    const ip = req.headers['x-forwarded-for'] || req.ip

    const reqs = await redis.incr(ip)
    await redis.expire(ip, 2)

    if (reqs > 15) {
        return res.json({
            status: 'rate-limited'
        })
    } else if (reqs > 10) {
        return res.json({
            status: 'about-to-rate-limit'
        })
    } else {
        res.json({
            status: 'ok'
        })
    }
})

Let's understand this code block:

We try to extract the IP from the x-forwarded-for header (or you can use req.ip as we are using express)
We INCR the key for that IP address. If the key doesn't exist yet, INCR treats it as 0 before incrementing, so the first request sets it to 1.
We set the key to expire in 2 seconds. Ideally you'd want a larger value, but this is what the codedamn challenge specified, so there we have it. Note that calling EXPIRE on every request restarts the 2-second timer each time.
Finally, we check the request count. If it is greater than a certain threshold, we block the request from reaching the main function body.
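
The steps above can also be modelled without a Redis server. In this Python sketch, CounterStore is a hypothetical in-memory stand-in for the two Redis commands used (INCR and EXPIRE), and check_rate reuses the thresholds of 10 and 15 and the 2-second expiry from the code above. It is an illustration of the counting logic, not a production rate limiter.

```python
import time
from typing import Callable, Dict


class CounterStore:
    """In-memory stand-in for the INCR and EXPIRE commands (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._counts: Dict[str, int] = {}
        self._expires: Dict[str, float] = {}

    def incr(self, key: str) -> int:
        # An expired key behaves as if it never existed
        if key in self._expires and self._clock() >= self._expires[key]:
            self._counts.pop(key, None)
            del self._expires[key]
        # A missing key counts as 0 before the increment
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]

    def expire(self, key: str, seconds: float) -> None:
        self._expires[key] = self._clock() + seconds


def check_rate(store: CounterStore, ip: str, window_seconds: float = 2,
               warn_after: int = 10, block_after: int = 15) -> str:
    """Count the request, restart the expiry timer, and classify the caller."""
    reqs = store.incr(ip)
    store.expire(ip, window_seconds)
    if reqs > block_after:
        return "rate-limited"
    if reqs > warn_after:
        return "about-to-rate-limit"
    return "ok"
```

Because the timer restarts on every request, a client that keeps sending requests stays blocked until it pauses for the full window.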

Here's the full solution:

https://github.com/codedamn-classrooms/redis-nodejs-classroom/tree/lab6sol

More on Redis

Redis is much more than what we have learned so far. But the good thing is that we have learned enough to start working with it already!

In this section, let's cover a few more Redis fundamentals.

Redis is single threaded

Redis runs as a single threaded process, even on a multiple core system supporting multi threading. This is not a performance nightmare, but a safety measure against inconsistent read/writes in a multi threaded environment.

If Redis were multi threaded, to ensure thread safety when accessing a single key, you'd eventually have to resort to some locking mechanism, which probably would perform worse than single threaded/sequential access anyway.

Redis Transactions

Of course, you cannot do everything in Redis in a single command. But you can surely ask it to do a block of commands in a single go (that is, nobody else talks to Redis while it is executing that block). You can do that using the MULTI command.

Here's how that works:

MULTI
SET hello world
SET yo lo
SET number 1
INCR number
EXPIRE hello 10
EXPIRE yo 5
EXEC

This will perform all these operations in one go. That is, Redis queues every command issued after MULTI without running it, and then runs the whole block at once the moment it sees the EXEC command.
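
The queue-then-execute behaviour can be pictured with a toy model. The Python below is not how Redis is implemented: MiniTransaction is a hypothetical class that supports only SET and INCR, and exists to show that nothing touches the data until the block is executed.

```python
from typing import Any, Dict, List, Optional, Tuple


class MiniTransaction:
    """Queues commands and applies them together on exec (illustration only)."""

    def __init__(self, data: Dict[str, str]) -> None:
        self._data = data
        self._queue: List[Tuple[str, str, Optional[Any]]] = []

    def set(self, key: str, value: Any) -> str:
        # Nothing is written yet; the command is only remembered
        self._queue.append(("set", key, value))
        return "QUEUED"

    def incr(self, key: str) -> str:
        self._queue.append(("incr", key, None))
        return "QUEUED"

    def exec(self) -> List[Any]:
        """Apply every queued command in order and return one result per command."""
        results: List[Any] = []
        for op, key, value in self._queue:
            if op == "set":
                self._data[key] = str(value)
                results.append("OK")
            else:
                new_value = int(self._data.get(key, "0")) + 1
                self._data[key] = str(new_value)
                results.append(new_value)
        self._queue.clear()
        return results
```

Queuing a SET followed by an INCR leaves the data untouched until exec is called, at which point both results come back together as a list.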

Redis includes support for lists and sets for more advanced use cases. You can also use Redis as a broadcasting service where you publish to a channel and others who have subscribed to the channel receive a notification. This is very useful in multi-client architecture.

Conclusion

I hope you liked this introduction to Redis. This blog post is a part of codedamn's new interactive course: Redis + Node.js caching, where you not only learn about these concepts, but practice them within your browser on the go.

Feel free to give the course a try and let me know what you think. You can find me on twitter to send any feedback :)

How to Setup Instagram-like Video Stories in Your App

freeCodeCamp — Tue, 22 Sep 2020 22:01:31 +0000

By Agam Mahajan

This article will teach you how to show multiple videos in one view, like we see in Instagram Stories.

We'll also learn how to cache the videos in the user's device to help save that user's data and network calls and smooth out their experience.

A quick note: this implementation is for iOS, but the same logic can be applied in other codebases as well.

In general, whenever we want to play a video, we get the video URL and simply present **AVPlayerViewController** with that URL.

let videoURL = URL(string: "Sample-Video-Url")
let player = AVPlayer(url: videoURL!)
let playerViewController = AVPlayerViewController()
playerViewController.player = player
self.present(playerViewController, animated: true) {
    // player is an optional property, so use optional chaining
    playerViewController.player?.play()
}

Pretty straightforward, right?

But the drawback of this implementation is that you can’t customize it. And if you are working for a good product company, customization will be an everyday ask. :D

Alternatively, we can use **AVPlayerLayer** which will do a similar job – but it allows us to customize the view and other elements.

let videoURL = URL(string: "Sample-Video-Url")
let player = AVPlayer(url: videoURL!)
let playerLayer = AVPlayerLayer(player: player)
playerLayer.frame = self.view.bounds
self.view.layer.addSublayer(playerLayer)
player.play()

But what if you want to combine multiple videos, similar to Instagram stories? Then we probably have to dive in a bit deeper.

Coming Back to the Problem Statement

Now, let me tell you about my use case.

In my company, Swiggy, we want to be able to show multiple videos, where each video should be shown x number of times.

On top of that, it should have an Instagram-like stories feature.

Video-2 should seamlessly autoplay after video-1, and so on
It should jump to corresponding videos whenever the user taps left or right.

If you think caching could be the answer, don't worry – I’ll get to that in a bit.

Multiple layers in one view

First things first, we need to figure out how to add multiple videos in one view.

What we can do is create one **AVPlayerLayer** and assign the first video to it. When the first video is finished, then we assign the next video to the same **AVPlayerLayer**.

func addPlayer(player: AVPlayer) {
    player.currentItem?.seek(to: CMTime.zero, completionHandler: nil)
    playerViewModel?.player = player
    playerView.playerLayer.player = player
}

To jump to the previous or next video, we can do the following:

Add a tap gesture on the view
If the touch location ‘x’ is less than half of the screen, then assign the previous video, else assign the next video

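@objc func didTapSnap(_ sender: UITapGestureRecognizer) {
    let touchLocation = sender.location(ofTouch: 0, in: view)
    if touchLocation.x < view.frame.width / 2 {
        // Tap on the left half of the screen: go to the previous video
        changePlayer(forward: false)
    } else {
        // Tap on the right half of the screen: go to the next video
        fillupLastPlayedSnap()
        changePlayer(forward: true)
    }
}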

There we go. We now have our own Insta-like Stories video feature.

But our task is not done yet!

Now Back to Caching

We don't want it to be the case that every time a user navigates from one video to another, it starts to download the video from the beginning.

Also, if the video is shown again in the next session, we don't need to do another server call.

If we can cache the video, then the user’s internet will be saved. The load on the server will also be reduced.

Finally, the UX will improve as the user won't have to wait a long time to load the video.

As a good developer, reducing a user’s internet usage should be our priority.

Less data usage, happy customer

Load Videos Asynchronously

The first thing we can use to load videos is loadValuesAsynchronously.

According to the Apple documentation, loadValuesAsynchronously:

Tells the asset to load the values of all of the specified keys (property names) that are not already loaded.

The advantage here is that it saves the video until it is rendered. So it will not download the video from the start whenever the user navigates to a previous video. It will only download the part which was not rendered earlier.

Let's look at an example: say we have Video_1 that is 15 seconds long, and the user saw 10 seconds of that video before jumping to Video_2.

Now if the user comes back to Video_1 again by tapping to the left, loadValuesAsynchronously will have that 10 seconds of video saved and will only download the remaining (unwatched) 5 seconds.

func asynchronouslyLoadURLAssets(_ newAsset: AVURLAsset) {
    DispatchQueue.main.async {
        newAsset.loadValuesAsynchronously(forKeys: self.assetKeysRequiredToPlay) {
            // Every required key must have loaded before the asset can be used
            for key in self.assetKeysRequiredToPlay {
                var error: NSError?
                if newAsset.statusOfValue(forKey: key, error: &error) == .failed {
                    self.delegate?.playerDidFailToPlay(message: "Can't use this AVAsset because one of its keys failed to load")
                    return
                }
            }

            if !newAsset.isPlayable || newAsset.hasProtectedContent {
                self.delegate?.playerDidFailToPlay(message: "Can't use this AVAsset because it isn't playable or has protected content")
                return
            }
            let currentItem = AVPlayerItem(asset: newAsset)
            let currentPlayer = AVPlayer(playerItem: currentItem)
            self.delegate?.playerDidSuccesToPlay(playerDetail: currentPlayer)
        }
    }
}

You can find more details on loadValuesAsynchronously at this link.

The caveat here is it persists video data for that session only. If the user closes and comes back to the app, the video has to be downloaded again.

So what other options do we have?

Saving Videos in Device

Now comes Video Caching!

When the video is rendered completely, we can export the video and save it to the user’s device. When the video comes up again in their next session, we can pick the video from the device and simply load it.

AVAssetExportSession
According to Apple's documentation:

An object that transcodes the contents of an asset source object to create an output of the form described by a specified export preset.

This means that AVAssetExportSession acts as an exporter, through which we can save the file to the user’s device. We have to give the output URL and the output file type.

let exporter = AVAssetExportSession(asset: avUrlAsset, presetName: AVAssetExportPresetHighestQuality)
exporter?.outputURL = outputURL
exporter?.outputFileType = AVFileType.mp4

exporter?.exportAsynchronously(completionHandler: {
    print(exporter?.status.rawValue)
    print(exporter?.error)
})

You can find more details on AVAssetExportSession at this link.

Now the only thing left is to fetch the data from the cache and load the video.

Before loading, check if the video is present in the cache. Then fetch that local URL and give it to loadValuesAsynchronously.

if let cacheUrl = findCachedVideoURL(forVideoId: videoId) {
    // Cache hit: play the locally saved file
    let cacheAsset = AVURLAsset(url: cacheUrl)
    asynchronouslyLoadURLAssets(cacheAsset)
} else {
    // Cache miss: load the remote asset
    asynchronouslyLoadURLAssets(newAsset)
}

Caching will help reduce a lot of user data usage as well as server load (sometimes up to TBs of data).

Other use cases for caching

What other use cases can we handle with caching? The following are examples of ways you could use caching here:

Ensure Optimum Storage

Before saving the video on the device, you should check whether enough storage is present on the device to do so.

func isStorageAvailable() -> Bool {
    let fileURL = URL(fileURLWithPath: NSHomeDirectory())
    do {
        let values = try fileURL.resourceValues(forKeys: [.volumeAvailableCapacityForImportantUsageKey])
        guard let freeSpace = values.volumeAvailableCapacityForImportantUsage else {
            return false
        }
        // minimumSpaceRequired is the threshold (in bytes) chosen by the app
        return freeSpace > minimumSpaceRequired
    } catch {
        // Capacity is unavailable
        return false
    }
}

Remove Deprecated Videos

You can have a timestamp for each video so that you can clean up old videos from device memory after a certain number of days.

func cleanExpiredVideos() {
    let currentTimeStamp = Date().timeIntervalSince1970
    var expiredKeys: [String] = []
    // Collect the keys first so the dictionary is not mutated while iterating
    for videoData in popupVideosDict where currentTimeStamp - videoData.value.timeStamp >= expiryTime {
        expiredKeys.append(videoData.key)
    }
    for key in expiredKeys {
        // video is expired. delete
        popupVideosDict.removeValue(forKey: key)
        deleteVideo(ForVideoId: key)
    }
}

Maintain a limited number of videos

You can make sure only a limited number of videos are saved on the device at a time. Let's say 10.

Then when the 11th video comes, you can delete the least recently used video and replace it with the new one. This also keeps you from consuming too much of the user’s device storage.

func removeVideoIfMaxNumberOfVideosReached() {
        if popupVideosDict.count >= maxVideosAllowed {
            // remove the least recently used video
            let sortedDict = popupVideosDict.keysSortedByValue { (v1, v2) -> Bool in
                v1.timeStamp < v2.timeStamp
            }
            guard let videoId = sortedDict.first else {
                return
            }
            popupVideosDict.removeValue(forKey: videoId)
            deleteVideo(ForVideoId: videoId)
        }
    }

Measure Impact

Don’t forget to add logs, so that you can measure the impact of your feature. I have used a custom New Relic Log Event to do so:

 static func findCachedVideoURL(forVideoId id: String) -> URL? {
        let nsDocumentDirectory = FileManager.SearchPathDirectory.documentDirectory
        let nsUserDomainMask = FileManager.SearchPathDomainMask.userDomainMask
        let paths = NSSearchPathForDirectoriesInDomains(nsDocumentDirectory, nsUserDomainMask, true)
        if let dirPath = paths.first {
            let fileURL = URL(fileURLWithPath: dirPath).appendingPathComponent(folderPath).appendingPathComponent(id + ".mp4")
            let filePath = fileURL.path
            let fileManager = FileManager.default
            if fileManager.fileExists(atPath: filePath) {
                NewRelicService.sendCustomEvent(with: NewRelicEventType.statusCodes,
                                                                   eventName: NewRelicEventName.videoCacheHit,
                                                                   attributes: [NewRelicAttributeKey.videoSize: fileURL.fileSizeString])
                return fileURL
            } else {
                return nil
            }
        }
        return nil
    }

To report the file size in a readable format, I fetch the size in bytes and let ByteCountFormatter turn it into a human-readable string (in MB, for example).

extension URL {
    var attributes: [FileAttributeKey : Any]? {
        do {
            return try FileManager.default.attributesOfItem(atPath: path)
        } catch let error as NSError {
            print("FileAttribute error: \(error)")
        }
        return nil
    }

    var fileSize: UInt64 {
        return attributes?[.size] as? UInt64 ?? UInt64(0)
    }

    var fileSizeString: String {
        return ByteCountFormatter.string(fromByteCount: Int64(fileSize), countStyle: .file)
    }
}

This is how you can measure your impact:

Total data saved = number of requests × video size = 20.3K × 2.4 MB ≈ 49 GB

This is just two weeks of data. You do the math for the whole year. And this will keep on growing as usage grows over time.

That’s it! You have now built your own caching mechanism.

Wrapping up

In this article, we saw how easily we can integrate multiple videos in one view, giving an Instagram-like story feature.

We also learned why and how caching plays an important role here. We saw how it helps the user save a lot of data and have a smooth user experience.

Do let me know if I missed something, or if you can think of any more use cases.
Thanks for your time. :)

An In-depth Introduction to HTTP Caching: Cache-Control & Vary

Léo Jacquemin — Thu, 24 Oct 2019 09:56:49 +0000

Introduction - scope of the article

This series of articles deals with caching in the context of HTTP. When properly done, caching can increase the performance of your application by an order of magnitude. On the contrary, when overlooked or completely ignored, it can lead to some very unwanted side effects caused by misbehaving proxy servers that, in the absence of clear caching instructions, decide to cache anyway and serve stale resources.

In the first part of this series, we argued that caching is the most effective way to increase performance, when measured by the page load time. In this second part, it is time to shift our focus to the mechanisms at our disposal. To put it in another way: how does HTTP caching actually work?

To answer this question, we decided to consider the case of an empty cache that starts progressively caching and serving resources. As it gradually receives incoming HTTP requests, our cache will start behaving accordingly. Serving the resource from the cache when a fresh copy is available, varying over multiple representations, making a conditional request... This way, we can introduce each concept progressively as we need it.

At first, our empty cache will have no choice but to forward requests to the origin server. This will allow us to understand how origin servers instruct our cache on what to do with the resource, such as if it is allowed to store it, and for how long. For this, we will examine each Cache-Control directive and clarify some of them that have been known to have conflicting meanings.

Second, we will look at what happens when our cache receives a request for a resource it already knows. How does our cache decide if it can re-use a previously stored response? How does it map a given HTTP request to a particular resource? To answer these, we will learn about representation variations with the Vary header.

This article is going to focus on knowledge that’s the most valuable from a web developer’s perspective. Therefore, conditional requests are only discussed briefly and will be the focus of another article.

Without further ado, let us start with an overview of what we will be exploring.

The HTTP caching decision tree

Conceptually, a cache system always involves at least three participants. With HTTP, these participants are the client, the server, and the caching proxy.

However, when learning about HTTP caching, we strongly encourage you not to think of the client as your typical web browser because these days, they all ship with their own HTTP caching layer. It makes it difficult to clearly separate the browser from the cache. For this reason, we invite you to think of the client as a headless command line program such as cURL or any application without an embedded HTTP cache.

All precautions aside, let us now deep dive into the subject by taking a look at the following picture: the HTTP caching decision tree.

This picture illustrates all the possible paths a request can take every time a client requests a resource from an origin server that sits behind a caching system. A careful examination of this illustration reveals that there are only four possible outcomes.

Clearly separating these outcomes in our minds is actually very convenient, seeing as each important caching concept (cache instructions, representation matching, conditional requests and resource aging) maps to one of them.

Let us succinctly describe each one by introducing two important terms from HTTP caching terminology: cache hits and cache misses.

Hits and misses

The first possible outcome is when the cache finds a matching resource, and is allowed to serve it, which, in the caching world, are indeed two distinct things. This outcome is what we commonly call a cache hit, and is the reason why we use caches in the first place.

When a cache hit happens, the origin server is completely offloaded and latency is dramatically reduced. In fact, when the cache hit happens in the browser’s HTTP cache, there is no network latency at all and the requested resource is available almost instantly.

Unfortunately, cache hits account for only one of the four possible outcomes. The rest of them fall into the second category, also known as cache misses, which can happen for only three reasons.

The first reason a cache miss typically happens is simply when the cache does not find any matching resource in its storage. This is usually a sign that the resource has never been requested before, or has been evicted from the cache to free up some space. In such cases, the proxy has no choice but to forward the request to the origin server, fully download the response and look for caching instructions in the response headers.

The second reason a cache miss can happen is just as detrimental: the cache finds a matching representation, one that it could potentially use, but the resource is no longer considered fresh. It is said to be stale. We will see exactly how freshness is determined in the Cache-Control section of this article.

In such a case, the cache sends a special kind of request, called a conditional request, to the origin server. Conditional requests allow caches to retrieve resources only if they are different from the one they have in their local storage. Since only the origin server ever has the most recent representation of a given resource, conditional requests always have to go through the whole caching proxy chain up to the origin server.

These special requests have only two possible outcomes. If the resource has not changed, the cache is instructed to use its local copy by receiving a 304 Not Modified response along with updated headers and an empty body. This outcome, the third one on our list, is called a successful validation.

Finally, the last possible outcome is when the resource has changed. In this case, the origin server sends a normal 200 OK response, as it would if the cache was empty and had forwarded the request. To put it another way, cache misses caused by empty cache and failed validation yield exactly the same HTTP response.
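The four outcomes described above can be sketched as a small decision function. This is only an illustration under simplifying assumptions: the names Outcome, StoredResponse and classify are hypothetical, freshness is reduced to a boolean, and validation is reduced to comparing a single ETag.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    HIT = "cache hit"
    MISS_EMPTY = "cache miss (nothing stored)"
    VALIDATION_OK = "successful validation (304 Not Modified)"
    VALIDATION_FAILED = "failed validation (200 OK)"


@dataclass
class StoredResponse:
    etag: str       # validator saved alongside the stored copy
    is_fresh: bool  # whether the copy is still within its lifetime


def classify(stored: Optional[StoredResponse], origin_etag: str) -> Outcome:
    """Return which of the four paths a request takes (hypothetical helper)."""
    if stored is None:
        return Outcome.MISS_EMPTY
    if stored.is_fresh:
        return Outcome.HIT
    # Stale copy: a conditional request asks the origin whether it changed.
    if stored.etag == origin_etag:
        return Outcome.VALIDATION_OK
    return Outcome.VALIDATION_FAILED
```

Only the first branch after the empty check avoids the origin server entirely, which is why cache hits are the outcome worth optimizing for.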

To best visualize these four paths, it is helpful to picture them in a timeline, as illustrated below.

At first, the cache is empty. The flow of requests starts with a cache miss (empty cache outcome). On the response's way back, the cache reads the caching instructions and stores the response. All subsequent requests for this particular resource result in cache hits, until the resource becomes stale and needs to be revalidated.

Upon a first revalidation, it is possible that the resource has not changed, hence, a 304 Not Modified would be sent.

Then, the resource eventually gets updated by a client, typically with a PUT or a PATCH request. When the next conditional request arrives, the origin server detects that the resource has changed and replies with a 200 OK carrying updated ETag and Last-Modified headers.
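Seen from the origin server, the two validation outcomes could look roughly like the sketch below. The function name handle_conditional_get is hypothetical, and the sketch assumes a single ETag compared for exact equality. It ignores Last-Modified, lists of ETags and the "*" wildcard.

```python
from typing import Dict, Optional, Tuple


def handle_conditional_get(
    if_none_match: Optional[str], current_etag: str, body: bytes
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET that may be conditional."""
    headers = {"ETag": current_etag}
    if if_none_match == current_etag:
        # Unchanged: the cache may keep using its local copy, no body is sent.
        return 304, headers, b""
    # Changed, or not a conditional request at all: send the full response.
    return 200, headers, body
```

Note that the 200 branch is the same whether the cache was empty or validation failed, which matches the observation that both kinds of miss yield the same HTTP response.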

Knowing about cache hits and cache misses, along with the four possible paths that every cacheable request can take, should give you a good overview of how caching works.

But overviews can only get you so far. In the following section, we will give a detailed explanation of how origin servers communicate caching instructions.

How origin servers communicate caching instructions

Origin servers communicate their caching instructions to downstream caching proxies by adding a Cache-Control header to their response. This header is an HTTP/1.1 addition and supersedes the older Pragma header, whose meaning in responses was never specified.

Cache-Control header values are called directives. The specification defines many of them, with various uses and levels of browser support. These directives are primarily used by origin servers to communicate caching instructions. However, when present in an HTTP request, they also let clients influence the caching decision. Let us now take the time to describe the most useful directives.

max-age

The first important Cache-Control directive to know about is the max-age directive, which allows a server to specify the lifetime of a representation. It is expressed in seconds. For instance, if a cache sees a response containing the header Cache-Control: max-age=3600, it is allowed to store and serve the same response for all subsequent requests for this resource for the next 3600 seconds.

During these 3600 seconds, the resource will be considered fresh and cache hits will occur. Past this delay, the resource will become stale and validation will take over.
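As a minimal sketch of that freshness check, the snippet below parses a Cache-Control value and compares the age of a stored response with max-age. The helper names are hypothetical, and the sketch deliberately ignores everything else a real cache considers (the Age and Expires headers, heuristic freshness, and so on).

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into a {directive: value-or-None} dict."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def is_fresh(header: str, age_seconds: int) -> bool:
    """A stored response is fresh while its age is below max-age."""
    max_age = parse_cache_control(header).get("max-age")
    if max_age is None:
        return False  # simplification: no heuristic freshness in this sketch
    return age_seconds < int(max_age)
```

With Cache-Control: max-age=3600, a response stored ten minutes ago is still fresh, while one stored an hour ago is stale and has to be revalidated.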

no-store, no-cache, must-revalidate

Unlike max-age, the no-store, no-cache and must-revalidate directives are about instructing caches to not cache a resource. However, they differ in subtle ways.

no-store is pretty self-explanatory, and in fact, it does even a little more than the name suggests. When present, an HTTP/1.1 compliant cache must not attempt to store anything, and must also take action to delete any copy it might have, either in memory or on disk.

The no-cache directive, on the other hand, is arguably much less self-explanatory. This directive actually means to never use a local copy without first validating with the origin server. By doing so, it prevents all possibility of a cache hit, even with fresh resources.

To put it another way, the no-cache directive says that caches must revalidate their representations with the origin server. But then comes another directive, awkwardly named… must-revalidate.

If this starts to get confusing for you, rest assured, you are not alone. If what one wants is not to cache, one has to use no-store instead of no-cache. And if what one wants is to always revalidate, one has to use no-cache instead of must-revalidate.

Confusing, indeed.

As for the must-revalidate directive, it is used to forbid a cache from serving a stale resource. If a resource is fresh, must-revalidate perfectly allows a cache to serve it without forcing any revalidation, unlike no-store and no-cache. That’s why this directive should always be used with a max-age directive, to indicate a desire to cache a resource for some time and, once it has become stale, enforce a revalidation.

When it comes to these last three directives, we find the choice of words to describe each of them particularly confusing: no-store and no-cache are expressed negatively whereas must-revalidate is expressed positively. Their differences would probably be more obvious if they were to be expressed in the same fashion.

Therefore, it is helpful to think about each of them expressed in terms of what is not allowed:

no-store: never store anything
no-cache: never cache hit
must-revalidate: never serve stale
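The three "never" rules above can be expressed as three booleans. The following sketch uses a hypothetical cache_policy helper and only looks at these three directives. Whether a cache would actually serve a stale response when allowed to depends on other factors that are not modeled here.

```python
def cache_policy(cache_control: str) -> dict:
    """Summarize what no-store, no-cache and must-revalidate forbid."""
    directives = {
        part.strip().split("=")[0].lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    may_store = "no-store" not in directives
    may_hit = may_store and "no-cache" not in directives
    return {
        # no-store: never store anything
        "may_store": may_store,
        # no-cache: never serve a stored copy without revalidating first
        "may_serve_fresh_without_revalidation": may_hit,
        # must-revalidate: never serve stale
        "may_serve_stale": may_hit and "must-revalidate" not in directives,
    }
```

For example, max-age=3600, must-revalidate still allows cache hits during the first hour, whereas no-cache alone allows storing but forces a round trip to the origin every time.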

Technically, these directives can appear in the same Cache-Control header. It is not uncommon to see them combined as a comma-separated list of values. A lot of popular websites still seem to behave very conservatively, sending back HTML pages with the following header:

Cache-Control: no-cache, no-store, max-age=0, must-revalidate

When you stumble upon this, the intention behind it is usually pretty clear: the web development team wants to ensure that the resource never gets served stale to anyone.

However, such cache-buster lines are probably not necessary anymore. Past work done in 2017 already showed that browsers are rather compliant with the specification with respect to Cache-Control response directives. Therefore, unless you’re planning on setting up a caching stack with decades-old software, you should be fine using just the directives you need. The most popular combinations will be analyzed in another article.

public, private

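The last important directives we haven’t discussed yet are a little bit different, as they control which types of caches are allowed to cache the resources. These are the public and private directives. When neither is specified, a shared cache may still store a response that is otherwise cacheable, so private has to be stated explicitly for anything user-specific. One notable exception is a response to a request carrying an Authorization header, which shared caches must not reuse unless a directive such as public explicitly allows it.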

Private caches are the ones that are supposed to be used by a single user. Typically, this is the web browser’s cache. CDNs and reverse proxies, by contrast, handle requests coming from multiple users: they are shared (public) caches.

Why do we need to distinguish these two types of caches? The answer is straightforward: security, as illustrated by the following example.

Many web applications expose convenience endpoints that rely on information coming from elsewhere than the URL. If two users access their profile by requesting /users/me, at https://api.example/com, and their actual user id is hidden within an Authorization: Bearer 4Ja23ç42…. token, the cache won’t be able to tell these are in fact two very different resources.

Indeed, when constructing their cache key, caches do not inspect HTTP headers unless specifically instructed to do so, as we shall see in the next section.

s-maxage

The s-maxage directive is like the max-age directive, except that it only applies to public caches, which are also referred to as shared caches (hence the s- prefix). If both directives are present, s-maxage will take precedence over max-age on public caches and be ignored on private ones.
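The precedence rule can be sketched as follows. The function name freshness_lifetime and its shared_cache flag are hypothetical, and the sketch only considers the two directives discussed here.

```python
from typing import Optional


def freshness_lifetime(cache_control: str, shared_cache: bool) -> Optional[int]:
    """Return the lifetime in seconds that applies to this kind of cache."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value
    # s-maxage only applies to shared (public) caches, where it wins over max-age.
    if shared_cache and "s-maxage" in directives:
        return int(directives["s-maxage"])
    if "max-age" in directives:
        return int(directives["max-age"])
    return None  # no explicit lifetime given
```

With Cache-Control: max-age=3600, s-maxage=600, a CDN would use 600 seconds while a browser would use 3600.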

When using this directive, the general rule is to always ensure that the s-maxage value is lower than the max-age value. The reasoning behind this rule is that the closer you are to the origin, the more suitable it is to check frequently what the latest representation is.

Imagine you were to cache for one day in the proxy, and one hour in browsers.

Every time a browser asked upstream servers for a resource, we would know in advance that the proxy will not contact the origin server for at least a day. Therefore, why not put the same TTL directly in the browsers? In conclusion, it is a best practice to always set a longer TTL in max-age than in s-maxage.

stale-while-revalidate and stale-if-error
These two directives are not technically part of the original specification but are part of an extension that was first described more than 10 years ago. Although their browser support is limited, some popular CDNs have supported them for more than 5 years!

Still, stale-while-revalidate is pretty useful. As the name implies, it allows a cache to “[...] immediately return a stale response while it revalidates it in the background, thereby hiding latency (both in the network and on the server) from clients”.

This caching extension proves really helpful for things like images, where reducing latency is critical for the user experience, and where having a stale version for a few seconds is often better than a painfully slow-loading image.

As for stale-if-error, it allows a cache to serve a stale version if the origin server returns a 5xx status code. This gives developers a chance to fix potential issues during a grace period where clients are shielded from irritating error pages.

Consider the case of a third-party weather script. If the weather server happens to be unreachable for a few minutes, it’s probably better to display a slightly outdated forecast during that time than to leave a portion of the page blank (or the whole page blank, if the code does not handle third-party script loading failures).
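To make the two extensions more concrete, here is a rough sketch of how a cache might decide what to do with a stored response. The decide function, its parameters and the returned strings are all hypothetical, and the exact boundary handling is a simplification.

```python
def decide(
    age: int,
    max_age: int,
    stale_while_revalidate: int = 0,
    stale_if_error: int = 0,
    origin_error: bool = False,
) -> str:
    """Pick an action for a stored response, given its age in seconds."""
    if age < max_age:
        return "serve fresh"
    staleness = age - max_age  # how long the response has been stale
    if staleness < stale_while_revalidate:
        return "serve stale, revalidate in background"
    if origin_error and staleness < stale_if_error:
        return "serve stale (origin error)"
    return "wait for the origin response"
```

For instance, with max-age=600, stale-while-revalidate=30, stale-if-error=300, a response that is 620 seconds old is served immediately while being refreshed in the background, and a response that is 700 seconds old is still served if the origin answers with an error.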

What we don’t know yet

After examining these Cache-Control directives, we now understand how applications that are distributed on the web, tend to leverage HTTP caching mechanisms in multiple ways, depending on what they need.

What we don’t yet understand, though, is what cache software actually does with the responses it receives. It will most likely have to store them somewhere in order to retrieve them later. That’s the core idea of any caching system after all.

Under normal circumstances, this certainly looks like what we would call an implementation detail. It should be merely enough to know that resources are indeed stored some way. Yet in this case, learning just a little more is actually critical.

Neglecting the mechanisms that govern how caching software maps objects from the HTTP response space to its storage space can have really unexpected consequences, such as serving a brotli-encoded Chinese document to a user who does not understand Chinese, using a browser unable to decode brotli ¯_(ツ)_/¯

How caches store and retrieve resources

Albeit unlikely to happen, since most browsers can decode brotli - and since most people know how to 說中文 - the previous situation can still easily occur. To understand why this is the case, one must consider how caches store their representations.

By virtue of what they try to achieve, most caching software ought to be able to quickly retrieve simple text documents. To do so, a very simple yet powerful strategy is to use a key-value store. This strategy fits in-memory storage well. Therefore, the question one must answer when designing a cache is the following: how to construct a cache key from an HTTP response?

What we are looking for here is a way to uniquely identify a resource. Conveniently, this is exactly why URIs - Uniform Resource Identifiers - were invented in the first place!

But URIs don’t tell the whole truth about resources. They never describe them entirely, if only for the fact that resources change over time.

Websites get rebranded, new content gets published and users update their profile. Granted, not for the same reasons or at the same frequency, though all resources will eventually change. In fact, the entire Conditional request specification is based on this sole observation: nothing is permanent except change.

Philosophical quotes aside, there is, however, another time-independent reason why resources change. Indeed, at any moment, resources may be available in multiple representations. This is why we have content negotiation.

The HTTP request headers Accept, Accept-Language, Accept-Encoding, Accept-Charset (and a few other headers that are not, strictly speaking, part of content negotiation) add another dimension on which representations can differ. As such, the problem of finding a good cache key becomes more complicated. Since all these representations share the same URI, caches must have a way to distinguish them in order to serve the right representation to each client, honoring content negotiation.

And since only origin servers know which representations are available, it is again the origin server’s responsibility to tell caches which request headers cause it to generate a different representation. To do so, the origin server must add a Vary header listing the names of those request headers.

When a cache sees a response coming from an origin server with, for instance, the header Vary: Accept-Language, it will examine the value of the request's Accept-Language header, such as fr-FR, and use this value to construct a more specific cache key, perhaps like https://example.net/home.html_fr-FR.

The actual implementation strategy is of little importance to us. Altering the cache key might not even be the best way to do it. It somehow has to use the value of the header to differentiate representations.
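One possible way to build such a key is sketched below. This is purely illustrative: the cache_key function and the "|" separator are assumptions, and real caches use their own internal schemes.

```python
def cache_key(url: str, request_headers: dict, vary: str = "") -> str:
    """Combine the URL with the request header values named in Vary."""
    # Header names are case-insensitive, so compare them in lower case.
    headers = {name.lower(): value for name, value in request_headers.items()}
    parts = [url]
    for name in sorted(h.strip().lower() for h in vary.split(",") if h.strip()):
        parts.append(f"{name}={headers.get(name, '')}")
    return "|".join(parts)


url = "https://example.net/home.html"
print(cache_key(url, {"Accept-Language": "fr-FR"}, vary="Accept-Language"))
print(cache_key(url, {"Accept-Language": "en-US"}, vary="Accept-Language"))
```

Without a Vary header, both requests would map to the same key, and the second user would be served the French document.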

The Vary header can actually point at more than one header, when resources are available in multiple representations. Selecting a cache key when multiple headers are involved is not really much more complicated than with only one header. The real problem when varying over multiple dimensions is the combinatorial explosion.
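The size of that explosion is simply the product of the number of distinct values seen for each header listed in Vary. A tiny sketch, using made-up figures and a hypothetical variant_count helper:

```python
from math import prod


def variant_count(values_per_header: dict) -> int:
    """Upper bound on the stored representations for a single URL."""
    return prod(values_per_header.values())


# Hypothetical figures: 3 languages, 3 encodings, 2 media types.
print(variant_count({"Accept-Language": 3, "Accept-Encoding": 3, "Accept": 2}))  # 18
```

Every additional header in Vary multiplies the number of entries the cache may have to store for the same URL.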

Unfortunately, there is no way around this. If you are to cache and serve your resources in multiple representations, you have to pay the cost of a large storage. If you decide to lower your vary cardinality, some of your users will receive cache hits for responses that won’t match their requests.

On the other hand, if you vary properly on everything, and do not have enough storage space, chances are your users won’t be seeing cache hits anytime soon.

Now, it is important to know that this is only a problem if you decide to use a public cache, for which two different requests coming from two different users are running the same code, at the proxy level. If you decide to leverage the browser’s cache only, then you can skip the Vary header altogether and serve resources in as many representations as you want. This is because each browser’s cache will only cache representations matching the user’s preferences. This is good news!

But let’s not get ahead of ourselves just yet. As we said, caches use the value of the header as input to generate a more specific cache key. But what is to say that all these values are well formatted? Absolutely nothing! This is the rather inconvenient consequence of the robustness principle (Postel's law): HTTP servers are indeed very liberal in what they accept.

However there is hope.

Considering the case of an origin server that can only produce a representation in two different languages, caches must be able to regroup incoming Accept-Language values such as fr, fr-FR, fr_FR... into something such as FR. Otherwise, just like before with the combinatorial explosion, the number of representations will explode, but in this case, for a misguided reason.

The process by which all these representations are regrouped is called normalization and is often done at the cache. Many caches offer configuration utilities or their own languages to deal with these situations. Sometimes, the functions are even already written, or snippets can easily be found on the Internet. The following picture illustrates the process for the infamous User-Agent header.

Fastly, a popular CDN, sampled 100,000 requests and found that the Accept-Encoding header was expressed in 44 different ways! As for the User-Agent header, they found just shy of... 8,000 different ones! Without normalization, chances are that the cache will never see any hit.
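As an illustration of normalization, the sketch below collapses Accept-Language values into one of the languages an origin supports. The function name, the supported languages and the default are assumptions, and q-values are ignored for brevity. In practice this logic usually lives in the cache's own configuration language.

```python
def normalize_accept_language(value: str, supported=("fr", "en"), default="en") -> str:
    """Collapse an Accept-Language value into one supported primary language."""
    for item in value.split(","):
        # Drop any ";q=..." weight and accept both "fr-FR" and "fr_FR".
        tag = item.split(";")[0].strip().lower().replace("_", "-")
        primary = tag.split("-")[0]
        if primary in supported:
            return primary
    return default


for raw in ("fr", "fr-FR", "fr_FR", "de-DE, en;q=0.8"):
    print(raw, "->", normalize_accept_language(raw))
```

Using the normalized value instead of the raw header when building the cache key keeps the number of stored variants equal to the number of languages the origin can actually produce.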

This wraps up the section about representation variation. At this point, we know how to instruct caches to store our resources, and have learned to leverage the Vary header to prevent accidents from happening when using public caches. We have now covered enough of the specification to be able to cache resources effectively.

Common misconceptions

By now, you should have a thorough understanding of how HTTP caching works. Freshness control, resource’s representations and cache hits are no longer mysterious concepts to you. And if you start to feel empowered by all this knowledge, we have some good news for you: we’ve covered a large portion of the specification, and you now know pretty much all that’s necessary to be up and running.

But make no mistake. Caching is a complex topic.

Experience has shown us that, unless you’re dealing with it on a day-to-day basis, what may be crystal clear today will quickly turn into something rather blurry after a few weeks. Therefore, we decided to conclude this second article by dispelling two common misconceptions that are all too easy to fall into.

Freshness-control and validation

This might seem obvious after reading the previous sections, but it is worth repeating. Freshness control and validation (which we briefly discussed at the beginning) are two very distinct mechanisms that serve two very different purposes and involve HTTP exchanges between different parties.

Freshness control always happens in a cache and is based solely on time
Validation always happens at the origin server and is based both on time and on identifiers (ETags)

This is something we find important to remind ourselves of. It means that once the cache has received temporal instructions, it can - and best believe it will - serve resources without ever contacting the origin server until the timer expires.
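The split between the two mechanisms can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `is_fresh` and `origin_validate` are made up for the example, and validation is reduced to an ETag comparison (the time-based variant is left out).

```python
import time


def is_fresh(stored_at, max_age, now=None):
    """Freshness control: decided by the cache alone, using nothing but time."""
    now = time.time() if now is None else now
    current_age = now - stored_at
    return current_age < max_age


def origin_validate(current_etag, if_none_match):
    """Validation: decided by the origin server, by comparing identifiers."""
    if if_none_match == current_etag:
        return 304  # Not Modified: the cached copy may be reused
    return 200  # a full, newer representation has to be sent


# One hour after storing a response with max-age=86400, the cache answers
# on its own. The origin server is not involved at all.
print(is_fresh(stored_at=0, max_age=86400, now=3600))

# Once the timer has expired, the cache has to ask the origin server.
print(is_fresh(stored_at=0, max_age=86400, now=90000))
print(origin_validate(current_etag='"v2"', if_none_match='"v1"'))
```

Note that `origin_validate` is never reached while `is_fresh` returns `True`, which is exactly why a long freshness lifetime cannot be taken back by the server.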

For instance, if your web application’s HTML file reaches a browser and the HTTP response happens to include the header Cache-Control: max-age=86400, the browser will happily serve the same version of your app for one day. Neither you nor anyone else can do anything about it, except the user, should they ever decide to flush their browser’s cache.

If you’re thinking everyone can make mistakes, and one day is not so bad, well, brace yourself: the conventional upper bound for max-age is… 31536000 seconds! That is to say, one year (and nothing prevents a server from sending an even larger value). This is the reason why HTML files are very dangerous to cache like this, and should generally be declared with Cache-Control: no-cache.
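As a rough sketch of how this advice could translate into code, the hypothetical helper below picks a Cache-Control value from the requested path. The `no-cache` choice for HTML comes from the paragraph above; giving a one-year lifetime to scripts, stylesheets and images is an assumption of this example, and it is only safe if those file names change whenever their content changes (for instance `app.3f9a1c.js`).

```python
from pathlib import PurePosixPath

ONE_YEAR = 31536000  # seconds

# Assumption for this sketch: these assets are fingerprinted, meaning that a
# new version always gets a new file name.
LONG_LIVED_SUFFIXES = {".js", ".css", ".png", ".jpg", ".svg", ".woff2"}


def cache_control_for(path):
    """Return a Cache-Control header value for the given URL path."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in LONG_LIVED_SUFFIXES:
        return f"public, max-age={ONE_YEAR}"
    # HTML and anything unknown: the cache may store it, but must revalidate
    # with the origin server before every reuse.
    return "no-cache"


print(cache_control_for("/index.html"))
print(cache_control_for("/static/app.3f9a1c.js"))
```

With this split, a broken HTML page can be fixed with the next request, while the heavy assets it references still benefit from long-lived caching.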

Freshness and most recent representation

Another misconception is to believe that cache hits and freshness have anything to do with having the latest available version of a resource. This is what we all try to achieve, but one can never truly know whether the resource one has been served from a cache is indeed the most up-to-date version. In fact, this holds true even in the absence of a cache. It has to do with the nature of distributed applications: other people’s actions can change the things we are interacting with at any time.

When modifying the state of the application, the ETag should always be sent along (for instance in an If-Match header) to let the server know what our current understanding of the application’s state is. If it does not match the server’s, the client should expect an error response: 412 Precondition Failed according to the HTTP specification, although some APIs answer with 409 Conflict instead.
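Here is a minimal in-memory sketch of that idea, with two clients holding the same ETag. The `Resource` class and the `make_etag` helper are hypothetical names invented for the example. The sketch answers 412 Precondition Failed, the status the HTTP specification defines for a failed If-Match precondition; an API that prefers 409 Conflict would follow the same logic.

```python
import hashlib


def make_etag(body):
    """Derive an identifier from the content (one possible strategy among many)."""
    return '"' + hashlib.sha256(body.encode("utf-8")).hexdigest()[:16] + '"'


class Resource:
    def __init__(self, body):
        self.body = body
        self.etag = make_etag(body)

    def update(self, new_body, if_match):
        """Apply the update only if the client's view of the state is current."""
        if if_match != self.etag:
            return 412  # the client worked from an outdated representation
        self.body = new_body
        self.etag = make_etag(new_body)
        return 200


resource = Resource("initial content")

# Two clients read the resource at the same time and remember its ETag.
etag_seen_by_alice = resource.etag
etag_seen_by_bob = resource.etag

print(resource.update("Alice's edit", if_match=etag_seen_by_alice))
print(resource.update("Bob's edit", if_match=etag_seen_by_bob))
```

Alice's update goes through and changes the ETag, so Bob's update is rejected instead of silently overwriting her work. Bob then has to fetch the new representation and try again.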

Conclusion

Throughout this article, we have described how caching actually works. Now would be a good time to spin up a local dev server and fiddle around with these two core headers, Cache-Control and Vary, to see them in action.

We started by giving an overview of how caching works, illustrating the four possible paths that a request can take: the happy path (cache hit) and the three possible ways to have a cache miss: empty cache, failed revalidation and successful revalidation. This overview alone makes it possible to understand how complex caching topologies fit together.

Then, we went deeper and looked at the most useful Cache-Control directives, and clarified some subtle differences that are easily missed.

We also looked at the Vary header and the fundamental difference between resources and representations, to avoid serving the wrong representation to the right client.

Finally, we took some time to review it all from the angle of common misconceptions you might encounter, and hopefully helped you avoid them.

In the next article, we’ll apply all of this knowledge to set up a local lab environment in which we will set an innocent Node.js app on fire with a load-testing tool, right before rescuing it with the help of a popular piece of caching software.

Stay tuned!

To go further:

The official specification about the material we covered (and other things)
https://tools.ietf.org/html/rfc7234#section-5.3

Google’s Web Fundamentals
https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching#defining-optimal-cache-control-policy

About the Cache-Control header:
https://developer.mozilla.org/fr/docs/Web/HTTP/Headers/Cache-Control

About the Vary Header:
https://www.smashingmagazine.com/2017/11/understanding-vary-header/
https://www.fastly.com/blog/best-practices-using-vary-header
https://www.fastly.com/blog/getting-most-out-vary-fastly
https://www.fastly.com/blog/understanding-vary-header-browser

caching - freeCodeCamp.org

How to Optimize Django REST APIs for Performance: Profiling, Caching, and Scaling.

What we’ll cover:

Why Django REST APIs Become Slow

1. N+1 Query Problems in Serializers

2. Fetching Related Objects Inefficiently

3. Serializing Large Datasets Without Pagination

4. Recomputing Expensive Work Repeatedly

Profiling: Finding the Real Bottlenecks

Measuring Query Count in a View

Using the Django Debug Toolbar

How to Install and Enable the Django Debug Toolbar

Logging SQL Queries

How to Enable SQL Query Logging

Profiling API Response Time

How to Measure Total Response Time

SQL Query Optimization in Django REST APIs

Understanding the N+1 Query Problem

Solving the Problem with select_related and prefetch_related

Example: How to Optimize a Many-to-Many Relationship

Common Beginner Mistakes

Caching in Django REST APIs

Cache Eviction

Caching in Application Architectures

Caching in Django

When to Use Redis

Common Beginner Mistakes

Pagination and Limiting Expensive Datasets

Load Testing and Measuring Improvement

Summary and Next Steps

Key Takeaways

Next Steps for Your APIs

Read More

Why Your UI Won’t Update: Debugging Stale Data and Caching in React Apps

Why it Matters

Table of Contents

The Mental Model

Non-Cache Cause

The common trap:

Cache 1: React Query Cache

Common failure mode: mutation succeeds, but the UI stays old

Query Keys: React Query Caches by Key, not URL

Cache 2: Next.js fetch() Caching

What you’ll notice when this happens

How to debug it

Step 1: Reproduce in a production-like run

Step 2: Confirm whether the request is reaching your Next.js server at all

Step 3: Force Next.js to ask your API every time

Step 4: If the email is still stale, force Next.js to rebuild the page every request

A “beginner-safe” setup for the user settings pages with some of the suggestions:

Option A: Refresh the saved copy every N seconds

Option B: Refresh right after the update (best for “update email” flows)

Cache 3: Browser HTTP Cache (a Saved Copy in Your Browser)

What you’ll notice

Fast check

Why it happens

Cache 4: CDN/Hosting Cache

What you’ll notice

Fast check

Quick diagnostic check

Cache 5: Service Worker Cache (Only if Your Site is a PWA)

What you’ll notice

Fast check (Chrome)

10-Second Debug Guide

Prevention: Set Caching Intentionally

Recap

How to Cache Golang API Responses for High Performance

Table of Contents

Response Caching with Local and Redis Storage

Database Query Result Caching

HTTP Caching with ETag and Cache-Control

Stale-While-Revalidate with Background Refresh

Wrapping Up

Caching a Next.js API using Redis and Sevalla

Table of Contents

Why Caching Matters

What is Redis?

Setting Up the Project

Provisioning Redis

Updating Cache on Reads

Solving the Problem with `select_related` and `prefetch_related`