Hamdaan Ali - freeCodeCamp.org

How to Elevate Your Database Game: Supercharging Query Performance with Postgres FDW

Hamdaan Ali — Wed, 18 Feb 2026 22:36:48 +0000

Foreign data wrappers (FDWs) make remote Postgres tables feel local. That convenience is exactly why FDW performance surprises are so common.

A query that looks like a normal join can execute like a distributed system: rows move across the network, remote statements get executed repeatedly, and the local planner quietly becomes a coordinator. In that world, “fast SQL” is not mainly about CPU or indexes. It’s about data movement and round-trips.

This handbook covers the mechanism that determines whether a federated query behaves like a clean remote query or a chatty distributed workflow: pushdown.

Pushdown is not “moving compute”. Pushdown determines whether filtering, joining, ordering, and aggregation occur at the data source or after the data has already crossed the wire. When pushdown works, the local server receives a reduced result set. When it doesn’t, Postgres often has to fetch broad intermediate sets and finish the work locally.

The chapters ahead will help you build a practical mental model of what is “shippable” in postgres_fdw, why some expressions are blocked, and how to read EXPLAIN (ANALYZE, BUFFERS, VERBOSE) without getting tricked by familiar plan shapes.

After the core method, the handbook covers tuning knobs that matter in production, schema and indexing considerations, benchmarking methodology, monitoring and logging, and a case study that shows what a real pushdown win looks like end-to-end.

The later sections go deeper into advanced shippability edge cases, cost model calibration, and regression-proofing FDW workloads.

Prerequisites
Executive Summary
Motivation
FDW Basics Without the Setup Tax
Pushdown Mechanics
Shippable Operations: a Deep Dive
Pushdown Blockers and Why They Exist
Reading EXPLAIN Like a Pro
How to Tune postgres_fdw
Schema and Index Recommendations
Benchmarking Methodology
Monitoring and Logging
Case Study: Refactoring a Keycloak Coverage Query
Checklist and Troubleshooting Guide
Case Study Takeaways
Advanced Operations: A Deeper Dive into Shippability
Common Anti‑Patterns and How to Avoid Them
Extending Tuning: Calibrating Cost Models
Further Case Studies and Practical Examples
Monitoring, Diagnostics, and Regression Testing
Extended Guidelines for Advanced DBAs
Bringing it All Together
References

Prerequisites

This handbook assumes basic comfort with Postgres query plans. It builds on EXPLAIN (ANALYZE, BUFFERS) rather than reintroducing SQL fundamentals, indexing, or join algorithms.

The focus here is federated execution: how foreign queries behave, and how to reason about them with the same clarity as local plans.

Here’s what you should already be comfortable with:

Reading EXPLAIN (ANALYZE, BUFFERS) output and spotting obvious plan smells (row explosions, bad join order, missed indexes).
Basic join mechanics (nested loop, hash join, merge join) and why cardinality estimates matter.
Postgres statistics at a practical level (ANALYZE, correlation, and what “estimated rows vs actual rows” implies).

And here’s what you need to follow along with the examples:

A Postgres “local” instance that will run postgres_fdw and act as the coordinator.
A Postgres “remote” instance that holds the foreign tables.
Permission on the local side to:
- CREATE EXTENSION postgres_fdw;
- create a SERVER and USER MAPPING
- create FOREIGN TABLE objects (or permission to use existing ones)
A way to run queries and capture plans:
- psql is enough, and so is any GUI, as long as you can run EXPLAIN (ANALYZE, BUFFERS, VERBOSE).

We won’t go through a long environment setup walkthrough. The examples assume the FDW objects exist and focus on plans and behavior.

We also won’t go into general distributed systems theory. Only the pieces that show up in an FDW plan are used.

Executive Summary

The single most important lesson of this handbook is that FDW pushdown reduces data movement. It’s tempting to think of pushdown as merely changing where a calculation happens (“move the work to the remote”). But what really matters is whether the remote server is asked for only the rows you need.

When pushdown is working, the remote server performs the selective join and filtering, and the local Postgres receives a small, already reduced result set. When pushdown fails, the local server becomes a distributed query coordinator: it pulls large intermediate sets over the network and then finishes the heavy lifting locally.

Why does this matter? Because a refactor that makes more of your query shippable to the remote server can slash end‑to‑end latency without changing a single row of output. In the case study we'll explore later, rewriting a query so that the FDW can ship a joined remote query instead of performing multiple foreign scans and local joins reduces runtime from approximately 166 ms to 25 ms. The business logic did not change – the shape of the work changed.

Below is a simple bar chart illustrating that dramatic drop. The chart uses actual timings from the case study. If you run the experiment yourself, the numbers may differ depending on your hardware and network, but the relative difference should be clear.

Motivation

Foreign data wrappers let you query remote data using the same SQL syntax you use locally. That convenience is exactly why they can be so deceptive.

A federated query may look like a normal join, but under the hood, it behaves like a distributed system: some part of the plan runs on the remote server, some on the local server, and every boundary between them is a network hop. The slow path is rarely “bad SQL” – it’s usually a combination of two things:

Too many rows are pulled over the network. Without pushdown, the FDW retrieves a large slice of the remote table and applies your filters and joins locally. This may lead to tens of thousands or millions of rows being shipped across the network when you only needed hundreds or fewer.
Too many round-trips. If the plan performs a nested loop that drives a foreign scan, it can end up executing the same remote query hundreds or thousands of times. Each call might be fast on its own, but latency adds up.

This isn't speculation. PostgreSQL's documentation makes clear that a foreign table has no local storage and that Postgres “asks the FDW to fetch data from the external source” [1]. There is no local buffer cache or heap storage to hide mistakes. Every row you retrieve must traverse the network at least once. If your plan fetches more rows than it needs, or repeatedly does so, performance can degrade quickly.

That’s why you should treat the Remote SQL shown in EXPLAIN (VERBOSE) as part of your query plan. It tells you exactly what the remote server is being asked to do. If it’s missing your filters or joins, you know the local server will have to finish the job. The rest of this handbook will teach you how to read that plan, how to force pushdown when possible, and how to recognize the signs that something has gone wrong.

FDW Basics Without the Setup Tax

You might be tempted to skip this section if you've already created foreign tables in your own databases. Don't. Understanding the architecture of foreign data wrappers is essential to understanding why pushdown matters.

SQL/MED in a nutshell

PostgreSQL implements the SQL/MED (Management of External Data) standard through its FDW framework. To access a remote Postgres server via postgres_fdw, you perform four steps:

Install the extension: CREATE EXTENSION postgres_fdw tells Postgres to load the FDW code.
Create a foreign server: CREATE SERVER foreign_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '...', port '...', dbname '...') defines where the remote server resides and how to connect.
Create a user mapping: CREATE USER MAPPING FOR your_user SERVER foreign_server OPTIONS (user 'remote_user', password '...') tells Postgres how to authenticate on the remote side.
Create a foreign table: CREATE FOREIGN TABLE remote_table (...) SERVER foreign_server OPTIONS (schema_name '...', table_name '...'); defines the columns and references the remote table.
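
The four steps can be sketched as a single script. This is a minimal sketch, not a production setup: the host, database name, password, and column list are placeholders chosen for illustration, and only foreign_server, remote_user, and remote_table come from the steps above.

```sql
-- 1. Load the FDW code into the local database.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- 2. Describe where the remote server lives (placeholder connection values).
CREATE SERVER foreign_server
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'remote.example.internal', port '5432', dbname 'appdb');

-- 3. Map the current local role to credentials on the remote side.
CREATE USER MAPPING FOR CURRENT_USER
  SERVER foreign_server
  OPTIONS (user 'remote_user', password 'change-me');

-- 4. Declare the foreign table. The columns are hypothetical and must
--    match the names and types of the real remote table.
CREATE FOREIGN TABLE remote_table (
  id         uuid,
  name       text,
  created_at timestamptz
)
  SERVER foreign_server
  OPTIONS (schema_name 'public', table_name 'remote_table');
```

Nothing is copied by any of these statements. They only register metadata on the local side.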

Once you've done that, you can run SELECT statements against the foreign table as if it were local. But the definition hides an important detail: there is no storage associated with that foreign table [1]. Every time you SELECT, INSERT, UPDATE, or DELETE, the FDW must connect to the remote server, build a remote query, send it, and read the results. This overhead is small for simple queries but becomes critical as queries get more complex.

What postgres_fdw does and does not do

postgres_fdw does two things for you:

It builds remote SQL from your query, including pushing down safe filters, joins, sorts, and aggregates when it can.
It fetches rows from the remote server and hands them to the local executor. If some part of your query cannot be executed remotely, the local executor performs that part.

The FDW tries hard to minimize data transfer by sending as much of your WHERE clause as possible to the remote server and by not retrieving unused columns [2]. It also has a number of tuning knobs that we'll explore later (such as fetch_size, use_remote_estimate, fdw_startup_cost, and fdw_tuple_cost [3]). But the real win often comes from structuring your query so that the FDW can push work down.

There's one last architectural point to keep in mind: the remote server runs with a restricted session environment. In remote sessions opened by postgres_fdw, the search_path is set to pg_catalog only, and TimeZone, DateStyle, and IntervalStyle are set to specific values [4]. This means that any functions you expect to run remotely must be schema‑qualified or packaged in a way that the FDW can find them. It also underscores why you should not override session settings for FDW connections unless you know exactly what you are doing [4].

Pushdown Mechanics

At a high level, “pushdown” means pushing as much of your SQL query as possible to the remote server. But the FDW cannot simply send arbitrary SQL. It must be safe and portable for remote evaluation. Postgres uses the term shippable to describe expressions and operations that can be evaluated on the foreign server.

What “shippable” means in practice

An expression is considered shippable if it meets several conditions:

It uses built‑in functions, operators, or data types, or functions/operators from extensions that have been explicitly allow‑listed via the extensions option on the foreign server [2]. If you use a custom function or an extension that has not been declared, the FDW assumes it cannot run remotely.
It’s marked IMMUTABLE. Postgres distinguishes between IMMUTABLE, STABLE, and VOLATILE functions. Only immutable functions (those that always return the same output for the same inputs and don’t depend on session state) are candidates for pushdown [5]. This rule keeps time‑dependent or random functions, such as now() (STABLE) or random() (VOLATILE), from being evaluated remotely, because the result might differ between the local and remote servers.
It doesn’t depend on local collations or type conversions. PostgreSQL’s docs warn that type or collation mismatches can lead to semantic anomalies [1]. If the FDW cannot guarantee that a comparison behaves identically on both servers, it will refuse to push it down. For example, comparing a citext column to a text constant could be unsafe if the remote server doesn’t have the citext extension installed.

From these rules, you can derive a mental checklist: avoid non‑immutable functions in your WHERE clause, keep your join conditions simple and typed correctly, and list any third‑party extensions you want to use in the foreign server’s extensions option so that they are considered shippable [2].
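
One way to apply this checklist is to look up a function's volatility in the system catalog before relying on it in a federated predicate. The query below is a small sketch that reads pg_proc. The function names in the IN list are only examples, and the second query shows which options (including extensions) are currently set on each foreign server.

```sql
-- Volatility of a few functions: only IMMUTABLE ones are pushdown candidates.
SELECT p.oid::regprocedure AS function_signature,
       CASE p.provolatile
         WHEN 'i' THEN 'IMMUTABLE'
         WHEN 's' THEN 'STABLE'
         WHEN 'v' THEN 'VOLATILE'
       END AS volatility
FROM pg_catalog.pg_proc AS p
WHERE p.proname IN ('now', 'random', 'clock_timestamp', 'lower')
ORDER BY p.proname;

-- Options currently defined on each foreign server (look for extensions=...).
SELECT srvname, srvoptions
FROM pg_catalog.pg_foreign_server;
```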

WHERE pushdown

If a WHERE clause consists entirely of shippable expressions, it will be included in the remote query. Otherwise, it will be evaluated locally. This matters because pushing a filter down reduces the number of rows returned to the local server.

Consider a predicate like this:

WHERE created_at >= now() - interval '30 days'

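now() is not IMMUTABLE. It is marked STABLE: it returns the same value throughout a transaction, but that value depends on when the transaction started, so postgres_fdw will not ship an expression that contains it. The predicate is left out of the Remote SQL, and the FDW fetches every row that passes the remaining shippable conditions (the entire table, if there are none) and applies the filter locally.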

A better approach is to pass a parameter into the query or compute the cutoff timestamp once in the application and embed it into the SQL.
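
Both variants of that approach might look like the sketch below. The foreign table remote_events and its columns are hypothetical, and the literal timestamp is only an example of a value the application computed once. Confirm with EXPLAIN (VERBOSE) that the comparison appears in the Remote SQL on your version.

```sql
-- Option A: embed a cutoff that the application computed once.
SELECT id, created_at
FROM remote_events
WHERE created_at >= timestamptz '2026-01-19 00:00:00+00';

-- Option B: pass the cutoff as a parameter. The argument expression is
-- evaluated locally at EXECUTE time, so the statement itself contains
-- no non-immutable function call.
PREPARE recent_events(timestamptz) AS
  SELECT id, created_at
  FROM remote_events
  WHERE created_at >= $1;

EXECUTE recent_events(now() - interval '30 days');
```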

Join pushdown conditions

Joins are the next big lever. When postgres_fdw encounters a join between foreign tables on the same foreign server, it will send the entire join to the remote server unless it believes it will be more efficient to fetch the tables individually or unless the tables use different user mappings [6].

It applies the same precautions described for WHERE clauses: the join condition must be shippable, and both tables must be on the same server. Cross‑server joins are never pushed down – the FDW will perform them locally.

Shippability decision tree

It can be helpful to visualize the shippability rules as a flowchart. Below is a simple decision tree that you can use when inspecting an expression or join clause.

It starts with the question of whether an expression is in a WHERE or JOIN clause. Further decisions are made based on factors like using volatile functions, built-in functions, type mismatches, or cross-server joins. The flowchart concludes with outcomes like "Not shippable, evaluated locally" or "Shippable, included in Remote SQL."

If you reach the left side of the tree, the expression will be evaluated locally. If you reach the right side, the FDW can ship it.

Shippable Operations: a Deep Dive

Postgres has been expanding what postgres_fdw can push down over several versions. This section walks through each operation class and the conditions required for pushdown.

Filters (WHERE clauses)

As explained above, simple filters that use built‑in operators and immutable functions are generally pushed down. If you see a Filter: line on a Foreign Scan node in your plan (with ANALYZE, usually accompanied by Rows Removed by Filter), it means some part of your predicate didn’t qualify. Common reasons include using now() or other non‑immutable functions, referencing a non‑allow‑listed extension, or comparing values with different collations.

When this happens, the entire table (or at least all rows matching other shippable conditions) is fetched, and the filter is applied locally.

Plan smell: Look for a Foreign Scan node that carries a Filter: line. That means filtering happened locally. Also look for broad Remote SQL such as:

SELECT * FROM remote_table WHERE (name = 'Hamdaan')

where only the name condition appears and the rest of your constraints (group scoping, for example) are missing. That's a sign that those predicates were not pushed down.

Joins

Simple inner joins between foreign tables on the same foreign server are usually pushable. The join condition must satisfy the same shippability rules as filters. If the join involves more than one foreign server, if the join condition uses an unshippable function, or if the foreign tables use different user mappings, the FDW will fetch each table separately and join them locally [6]. This can lead to large intermediate sets being transferred.

Plan smell: A Hash Join or Merge Join where both inputs are Foreign Scan nodes indicates that the join was performed locally. Conversely, a single Foreign Scan representing a join and containing the JOIN ... ON clause in Remote SQL indicates that the join was pushed down.
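
A quick way to check which of the two shapes you have is to run EXPLAIN (VERBOSE) on the join by itself. The sketch below assumes that user_attribute and user_entity are foreign tables declared on the same foreign server with the same user mapping. The column names are borrowed from the case study and may differ in your schema.

```sql
EXPLAIN (VERBOSE)
SELECT ua.user_id, ua.value
FROM user_attribute AS ua
JOIN user_entity AS u ON ua.user_id = u.id
WHERE ua.name = 'attribute A'
  AND u.enabled;

-- Join pushed down: a single Foreign Scan node whose "Relations:" line
--   names both tables and whose Remote SQL contains the JOIN ... ON clause.
-- Join not pushed down: a local Hash Join, Merge Join, or Nested Loop
--   with a separate Foreign Scan (and separate Remote SQL) for each table.
```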

Aggregates (GROUP BY, COUNT, SUM, and so on)

Starting in PostgreSQL 10, aggregates can be pushed to the remote server when possible. The release notes state explicitly: “push aggregate functions to the remote server,” and explain that this reduces the amount of data that must be transferred from the remote server and offloads aggregate computation [7].

To qualify, both the grouping expressions and the aggregate functions themselves must be shippable. If the FDW cannot push an aggregate, it will fetch the raw rows and perform the aggregation locally.

Plan smell: Look for a GroupAggregate or HashAggregate node above a Foreign Scan that returns many rows. When the aggregate is pushed down, there will be no local aggregate node. Instead, the Remote SQL will include a GROUP BY clause.
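
The same check works for aggregates. This is a sketch under the assumption that user_attribute is a foreign table with a name column. On PostgreSQL 10 or later, a simple count grouped by a plain column is a reasonable candidate for pushdown.

```sql
EXPLAIN (VERBOSE)
SELECT ua.name, count(*) AS users_with_attribute
FROM user_attribute AS ua
GROUP BY ua.name;

-- Aggregate pushed down: a Foreign Scan whose Remote SQL contains
--   count(*) and GROUP BY, with no local aggregate node above it.
-- Aggregate not pushed down: a local GroupAggregate or HashAggregate
--   above a Foreign Scan whose Remote SQL returns raw rows.
```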

ORDER BY and LIMIT

Prior to PostgreSQL 12, sorting and limiting were rarely pushed down. In version 12, Etsuro Fujita’s patch allows ORDER BY sorts and LIMIT clauses to be pushed to postgres_fdw foreign servers in more cases [8]. For the sort or limit to be pushed, the underlying scan must be pushable, and the ordering expression must be shippable. Partitioned queries or complicated join trees may still cause the sort or limit to be applied locally.

Plan smell: A local Sort or Limit node above a Foreign Scan indicates the operation was not pushed down. Conversely, a Remote SQL statement containing ORDER BY and LIMIT indicates that pushdown succeeded.
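
To see which case applies on your version, a top-N query against a single foreign table is the simplest probe. The sketch assumes user_entity is a foreign table with id, tenant_id, and enabled columns, as in the case study.

```sql
EXPLAIN (VERBOSE)
SELECT id, tenant_id
FROM user_entity
WHERE enabled
ORDER BY id
LIMIT 20;

-- Pushed down: the Remote SQL ends with ORDER BY ... LIMIT ...,
--   and there is no local Sort or Limit node above the Foreign Scan.
-- Not pushed down: a local Limit and/or Sort node sits above the
--   Foreign Scan, and the Remote SQL returns the unsorted, unlimited set.
```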

DISTINCT

Distinct operations can be pushed down when the distinct expression list is shippable. But if the distinct is combined with unshippable expressions, or if the distinct is applied after a join that cannot be pushed down, the FDW will retrieve all rows and perform the distinct locally.

Window functions

In practice, window functions are rarely pushed down through postgres_fdw. They often require ordering or partitioning semantics that are difficult to represent portably. If you see a WindowAgg node in your plan, it’s almost always local. That doesn’t mean you can't use window functions with foreign tables, but you should expect them to incur network and CPU costs.

Version differences

Postgres developers continue to improve the FDW layer. Here are some notable changes by version:

PostgreSQL 9.6 introduced remote join pushdown and allowed UPDATE/DELETE pushdown. Before 9.6, all joins were local.
PostgreSQL 10 introduced aggregate pushdown, enabling remote GROUP BY and aggregate functions [7].
PostgreSQL 12 expanded ORDER BY and LIMIT pushdown [8].
PostgreSQL 15 added pushdown for certain CASE expressions and other improvements.

If you learned FDW behavior on an older version, revisit your assumptions.

Pushdown Blockers and Why They Exist

When pushdown fails, it’s not due to bad luck. There’s always a reason grounded in safety or correctness. Here are the most common blockers and how to diagnose them.

Non‑immutable functions

Functions marked VOLATILE or STABLE cannot be pushed down because their results may differ between the local and remote server. Examples include now(), random(), current_user, and user‑defined functions that look at session variables or query the database. Even functions you might think are harmless, like age() or clock_timestamp(), can cause pushdown to fail.

Fix: Compute volatile values in your application or in a CTE before referencing the foreign table. For example, compute timestamp 'now' - interval '30 days' as a constant and compare your created_at column against that constant. Alternatively, move the logic into a stored generated column on the remote table.

Type and collation mismatches

The documentation warns that when types or collations don’t match between the local and remote tables, the remote server may interpret conditions differently [1]. This is particularly insidious when text comparisons, case‑insensitive collations, or non‑default locale settings are used. If Postgres can't guarantee the same semantics, it will pull rows locally and evaluate the expression.

Fix: Make sure that your foreign table definition uses the same data types and collations as the remote table. When in doubt, explicitly cast values to a common type.

Cross‑server joins

Joins across different foreign servers cannot be pushed down. The FDW can only ship a join when both tables reside on the same remote server and use the same user mapping [6]. Otherwise, it will perform two separate scans and join the results locally.

Fix: If you frequently join tables across servers, consider consolidating the tables on a single server, materializing a view on one side, or pulling the smaller table into a temporary local table before joining.
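
The temporary-table variant of that fix could look like the sketch below. All table and column names here are hypothetical: big_fact is assumed to be a foreign table on one server, and small_dim a much smaller foreign table on another.

```sql
-- Pull the small side across once, already filtered.
CREATE TEMP TABLE small_dim_local AS
SELECT id, label
FROM small_dim
WHERE active;

-- Give the planner real statistics for the local copy.
ANALYZE small_dim_local;

-- The join is still executed locally, but only one side is remote now.
SELECT f.id, d.label
FROM big_fact AS f
JOIN small_dim_local AS d ON d.id = f.dim_id;
```

This does not make the join shippable. It only replaces one of the two remote fetches with a single, bounded copy, so check the remaining Foreign Scan for a high loops value.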

Mixed local and foreign joins

A join between a local table and a foreign table will not be pushed down. Even though the foreign side might be pushdown‑eligible, the FDW cannot join it with local data on the remote server. A nested loop with a parameterized foreign scan is the typical pattern here, resulting in many remote calls.

Fix: Filter or aggregate as much as possible on the foreign side first (via a CTE or by materializing a subset) before joining to local tables.

Remote session settings and search paths

Because postgres_fdw sets a restricted search_path, TimeZone, DateStyle, and IntervalStyle in remote sessions [4], any functions you call must be schema‑qualified or otherwise compatible. If a function relies on the current search path or session settings, it may break or produce different results on the remote side.

Fix: Schema‑qualify remote functions and ensure that any environment‑dependent logic is safe to execute under the default FDW session settings. If necessary, attach SET search_path or other settings to your remote functions.
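
Attaching a search_path to a function is done on the remote server. In the sketch below, app.normalize_email(text) is a hypothetical function used only to show the syntax.

```sql
-- Run on the remote server. The function name and schema are hypothetical.
-- The setting applies whenever the function executes, regardless of the
-- restricted search_path that postgres_fdw sets for its sessions.
ALTER FUNCTION app.normalize_email(text)
  SET search_path = app, pg_catalog;
```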

Troubleshooting matrix

The table below maps symptoms in your EXPLAIN plan to likely causes and fixes. Use it as a quick diagnostic tool when something looks off.

Symptom in plan	Likely cause	Suggested fix
Foreign Scan has loops much greater than 1	Parameterized remote lookup caused by nested loop, join conditions not shippable	Rewrite join so the FDW can ship a single joined query, or batch remote requests via an `IN` list or temporary table
Broad Remote SQL that lacks scope predicates	`WHERE` clause contains non‑immutable functions or unsupported operators	Replace volatile functions with constants or allow‑list extension functions, ensure types and collations match
Local Hash Join or Merge Join between two foreign tables	Join could not be pushed down (different servers, user mappings, or unshippable join expression)	Consolidate tables on one server, align user mappings, or rewrite the join condition
Local Sort, Limit, or Unique on top of a Foreign Scan	`ORDER BY`, `LIMIT`, or `DISTINCT` could not be pushed down	Simplify sort expressions, push filters deeper, check PG version for improvements
Plan runs but gives wrong results when pushdown is enabled	Semantic mismatch due to type/collation differences or remote session settings [1] [4]	Align types/collations, schema‑qualify functions, use stable session settings

Reading EXPLAIN Like a Pro

Many developers skim EXPLAIN plans for local queries, looking at the top nodes and overall cost. For FDW queries, you must invert that habit: read the foreign parts first. The Remote SQL string tells you what the remote server is being asked to do, and the loops field tells you how many times that remote call is executed.

Inspect the Foreign Scan nodes

Start by finding the Foreign Scan node(s). In EXPLAIN (VERBOSE), each foreign scan includes a line like:

Remote SQL: SELECT ...

This line is not trivia. It’s the actual SQL that will run on the remote server. Read it carefully. Does it include your WHERE predicates? Does it include your join conditions? If not, you know the local server will pick up the slack.

Look at the loops column. If the loops exceed 1, the same remote query is executed multiple times. For example:

Foreign Scan on public.user_entity  (rows=1 loops=416)
  Remote SQL: SELECT id, tenant_id FROM public.user_entity WHERE enabled AND service_account_client_link IS NULL AND id = $1

This is the “N+1” problem in disguise. The plan executes the foreign scan once per outer row. Multiply the per‑loop cost by the number of loops to understand why the query is slow. The fix is to rewrite the query so that the join and filters are applied in a single remote call.

Recognize InitPlan vs SubPlan

An InitPlan runs once and caches its result. A SubPlan can run per outer row. In FDW queries, subplans often drive parameterized remote scans. If you see a SubPlan attached to a nested loop that feeds a foreign scan, suspect a parameterized remote lookup and look for ways to turn it into an InitPlan or merge it into a single remote query.

Understand CTE materialization

Common table expressions (CTEs) behave differently depending on whether they are marked MATERIALIZED or NOT MATERIALIZED. A materialized CTE is computed once and stored in a temporary structure, then read by the rest of the query. A non‑materialized CTE is inlined into the parent query, allowing optimizations to span across the boundary.

In PostgreSQL 12 and later, CTEs are inlined by default unless they’re referenced multiple times or explicitly marked MATERIALIZED. Materializing a CTE that contains a foreign scan can freeze a broad remote fetch and prevent later clauses from being pushed down. On the other hand, materialization can prevent repeated remote scans if the CTE is referenced multiple times. Use this lever deliberately to control where remote work happens.
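
The two forms can be compared side by side with EXPLAIN (VERBOSE). This is a sketch that assumes user_entity is a foreign table with id, tenant_id, and enabled columns. The exact Remote SQL depends on your version and schema.

```sql
-- Inlined: the outer filter can be merged into the foreign scan,
-- so it is a candidate to appear in the Remote SQL.
EXPLAIN (VERBOSE)
WITH users AS NOT MATERIALIZED (
  SELECT id, tenant_id, enabled FROM user_entity
)
SELECT id FROM users WHERE enabled;

-- Materialized: the CTE is fetched as written, and the outer filter
-- is applied locally to the stored result.
EXPLAIN (VERBOSE)
WITH users AS MATERIALIZED (
  SELECT id, tenant_id, enabled FROM user_entity
)
SELECT id FROM users WHERE enabled;
```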

Annotated example

Let's annotate a simplified excerpt from a real plan. The goal is to show how to quickly read the relevant parts.

Nested Loop  (rows=414 loops=1)
  -> Hash Join  (rows=416 loops=1)
       -> Foreign Scan on public.user_entity (rows=1 loops=416)
            Remote SQL: SELECT id, tenant_id FROM public.user_entity WHERE enabled AND service_account_client_link IS NULL AND id = $1
  -> Foreign Scan on public.user_attribute (rows=671 loops=1)
       Remote SQL: SELECT ua.user_id, ua.value FROM user_attribute ua JOIN user_entity u ON ua.user_id = u.id JOIN user_group_membership ugm ON ... JOIN keycloak_group g ON ... JOIN tenant r ON u.tenant_id = r.id WHERE ua.name = 'attribute A' AND r.name = 'demo' AND u.enabled AND u.service_account_client_link IS NULL AND (g.name = 'keycloak-group-a' OR g.parent_group = $1)

In the old plan, the first Foreign Scan executed 416 times, each time retrieving a single row. The Remote SQL only applies the filter on enabled and service_account_client_link – it doesn’t include the tenant or group scoping. That scoping is applied by the nested loop outside the foreign scan.

In the refactored plan, the second Foreign Scan results from combining user_attribute, user_entity, user_group_membership, keycloak_group, and tenant into a single remote query. It retrieves 671 rows in a single query and includes all relevant filters. There is no repeated remote call. The timing difference is driven by the different loop values and the selectivity of the Remote SQL.

How to Tune postgres_fdw

Once you've structured your query for maximum pushdown, tuning knobs let you squeeze out further performance improvements and adjust planner decisions.

fetch_size

fetch_size controls how many rows postgres_fdw retrieves per network fetch. The default is 100 rows [9]. A small fetch size means more round-trips and lower memory usage. A larger fetch size reduces network overhead at the cost of buffering more rows in memory.

In practice, increasing fetch_size to a few thousand can reduce latency for large result sets. It’s specified either at the foreign server or foreign table level:

ALTER SERVER foreign_server OPTIONS (ADD fetch_size '1000');
ALTER FOREIGN TABLE remote_table OPTIONS (ADD fetch_size '1000');

use_remote_estimate

By default, the planner estimates the cost of foreign scans using local statistics. This can be wildly inaccurate if the foreign table has a different data distribution. Setting use_remote_estimate to true tells postgres_fdw to run EXPLAIN on the remote server to get row count and cost estimates. This can dramatically improve join order selection at the cost of an additional remote query during planning [3]. You can set this per table or per server. Use ADD the first time an option is defined and SET to change a value that already exists:

ALTER SERVER foreign_server OPTIONS (ADD use_remote_estimate 'true');

fdw_startup_cost and fdw_tuple_cost

These cost parameters model the overhead of starting a foreign scan and the cost per row fetched. Adjusting them can influence the planner’s choice of join strategy. A higher fdw_startup_cost discourages the planner from choosing plans with many small foreign scans (which might generate many remote calls). A higher fdw_tuple_cost discourages plans that fetch large numbers of rows [3]. Use these only after you have solid evidence from EXPLAIN and experiments.
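
Both parameters are set as options on the foreign server. The values below are arbitrary illustrations, not recommendations. Treat them as a starting point for an experiment that you verify with EXPLAIN.

```sql
-- Make each foreign scan, and each fetched row, look more expensive
-- to the local planner (illustrative values only).
ALTER SERVER foreign_server
  OPTIONS (ADD fdw_startup_cost '500', ADD fdw_tuple_cost '0.1');

-- Return to the built-in defaults once the experiment is over.
ALTER SERVER foreign_server
  OPTIONS (DROP fdw_startup_cost, DROP fdw_tuple_cost);
```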

ANALYZE and analyze_sampling

Running ANALYZE on a foreign table collects local statistics by sampling the remote table [3]. Accurate stats are essential for good estimates when use_remote_estimate is false.

But if the remote table changes frequently, these stats become stale quickly. The analyze_sampling option (available in PostgreSQL 16 and later) controls whether sampling happens on the remote side or locally. When analyze_sampling is set to random, system, bernoulli, or auto, ANALYZE samples rows on the remote server instead of pulling all rows into the local server, while off keeps sampling local [3].
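
A refresh of the local statistics might look like the sketch below. It assumes PostgreSQL 16 or later for analyze_sampling and reuses the remote_table name from earlier. The choice of 'system' is just an example.

```sql
-- Ask ANALYZE to sample on the remote side (PostgreSQL 16+ assumed).
ALTER FOREIGN TABLE remote_table
  OPTIONS (ADD analyze_sampling 'system');

-- Collect local statistics for the foreign table.
ANALYZE remote_table;

-- Inspect the row estimate the local planner will now use.
SELECT relname, reltuples
FROM pg_catalog.pg_class
WHERE relname = 'remote_table';
```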

extensions

The extensions option lists extensions whose functions and operators can be shipped to the remote server [2]. If you rely on functions from citext, pg_trgm, or other extensions, add them to the server definition (use ADD if the option has not been set on this server yet, and SET to replace an existing list):

ALTER SERVER foreign_server OPTIONS (ADD extensions 'citext,pg_trgm');

A quick knob impact table

Knob	Primary effect	When to change it	Possible downside
fetch_size	Number of rows per fetch	Result sets are large and latency dominates	Too large consumes memory
use_remote_estimate	Better row count/cost estimates	Planner misestimates foreign scans	Extra remote queries during planning
fdw_startup_cost	Penalty per foreign scan	Planner chooses many small foreign scans	Wrong values bias the planner
fdw_tuple_cost	Cost per row fetched	Planner pulls too many rows	Mis‑tuned values mislead planner
extensions	Which extension functions are shippable	Using extension functions in predicates	Extensions must exist and match on both servers

Schema and Index Recommendations

Pushdown doesn’t eliminate the need for good indexes. In fact, effective pushdown depends on the remote server having indexes that support the filter and join predicates you’re shipping.

Below are some patterns to watch for in FDW queries and the indexes that support them. You can adapt these to your own schema.

Table	Access pattern	Recommended index	Why
tenant (remote)	Filter by tenant.name	UNIQUE (name) or BTREE (name)	Resolves tenant ID quickly
keycloak_group (remote)	Filter by name, join by tenant_id, filter on parent_group	Composite (tenant_id, name) and (parent_group)	Supports resolving root group and walking one‑level hierarchy
user_group_membership (remote)	Join by user_id, filter by group_id	BTREE (group_id, user_id)	Efficiently finds users in a set of groups
user_attribute (remote)	Filter by name, join by user_id	Composite (name, user_id) (optionally include value)	Matches “attribute name → users → values” flow
user_entity (remote)	Filter by tenant_id, enabled, service_account_client_link IS NULL, join by id	Partial index on (tenant_id, id) with predicate on enabled and service_account_client_link IS NULL	Helps remote planner start from user table when tenant and user filters are applied
filtercategory (local)	Filter by category && uuid[], join on (entitytype, entityid)	GIN index on category, BTREE (entitytype, entityid)	Speeds array overlap checks and join predicate

In general, indexes should reflect the join order you expect the remote planner to use. If your Remote SQL starts with:

FROM user_attribute ua JOIN user_entity u ON ua.user_id = u.id JOIN user_group_membership ugm ON ...

ensure that indexes exist on user_attribute(user_id) and user_group_membership(user_id).
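
Translated into DDL, a few of the rows in the table above could look like this sketch. The statements run on the remote server, the index names are made up for illustration, and the column names are taken from the table and may differ in your schema.

```sql
-- Run on the remote server. Index names are hypothetical.

-- "attribute name -> users" lookup on user_attribute.
CREATE INDEX IF NOT EXISTS user_attribute_name_user_id_idx
  ON user_attribute (name, user_id);

-- Find the users that belong to a set of groups.
CREATE INDEX IF NOT EXISTS user_group_membership_group_user_idx
  ON user_group_membership (group_id, user_id);

-- Partial index covering only enabled, non-service-account users.
CREATE INDEX IF NOT EXISTS user_entity_active_tenant_idx
  ON user_entity (tenant_id, id)
  WHERE enabled AND service_account_client_link IS NULL;
```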

Benchmarking Methodology

It’s easy to claim a performance improvement without proper measurement. Here's a repeatable method you can use to benchmark FDW query changes.

Warm the caches. Run each query once to load data into the remote buffer cache and the local FDW connection. Discard the timings.
Measure latencies. Use EXPLAIN (ANALYZE, BUFFERS, VERBOSE) to capture execution times, buffer usage, and remote row counts. Be aware that EXPLAIN ANALYZE adds overhead, so record the raw execution time if possible by running the query directly.
Record remote metrics. On the remote server, enable pg_stat_statements and track the calls, total_exec_time (named total_time before PostgreSQL 13), and rows for each remote query. This gives you a per‑query breakdown and confirms what Remote SQL is executed.
Control for concurrency and network latency. Run benchmarks during a quiet period or isolate the test cluster. If your environment has high network latency, record the round‑trip time separately to attribute delays.
Compare apples to apples. Benchmark the old and new queries under identical conditions. Use the same sample data, same remote server, and same connection settings.
Look at row counts. The primary goal of pushdown is to reduce the number of rows shipped. Compare the rows column of each Foreign Scan node.
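A single measurement pass might look like the sketch below. It uses the simplified attribute lookup from the case study as a stand-in; substitute your own query. It also assumes pg_stat_statements is installed on the remote server and that your role is allowed to reset it.

```sql
-- Step 1, on the local server: warm-up run. Discard its timings.
SELECT count(*)
FROM public.user_attribute
WHERE name = 'attributeA';

-- Step 2, on the remote server: clear counters so the measured run
-- is recorded in isolation.
SELECT pg_stat_statements_reset();

-- Step 3, on the local server: measured run. Record "Remote SQL",
-- rows, and loops for every Foreign Scan node in the output.
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT user_id, value
FROM public.user_attribute
WHERE name = 'attributeA';
```

Repeat steps 2 and 3 several times per scenario so you can report p50 and p95 rather than a single timing.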

Here's a simple matrix you can use to record your experiments:

Scenario	What you're testing	Expected change in Remote SQL	Metrics to record
Baseline (old query)	Starting point: broad remote scans + local joins	Remote SQL lacks scoping predicates	p50/p95 latency, remote row count, local sort/hash time
Refactor (new query)	Join + filter pushdown	Remote SQL includes joins and filters	Same metrics, plus remote row count
Introduce a volatile function	Pushdown blocker test	Clause removed from Remote SQL	Remote row count increases, local filter cost increases
Type or collation mismatch	Semantic risk test	Remote SQL might change behavior or lose pushdown	Compare correctness and row counts
ORDER/LIMIT pushdown	Version‑dependent test	Remote SQL includes ORDER BY, LIMIT	Sort time shifts to remote. Final row count should remain the same
use_remote_estimate on/off	Planning accuracy test	Planner uses remote estimates	Planning time, join order, and runtime difference

Monitoring and Logging

In production, you need to know when a query starts misbehaving. There are two places to look: the local server and the remote server.

Local metrics

pg_stat_statements. This extension tracks planning and execution times, row counts, and buffer hits for each query. Look for high total times relative to rows or calls.
auto_explain. Set auto_explain.log_min_duration to capture slow queries with their plans. With auto_explain.log_verbose enabled, the logged plan includes the Remote SQL, so you can see what was executed remotely and whether the plan changed.
Connection pool metrics. Monitor connection counts and wait events related to FDW operations (for example, PostgresFdwConnect, PostgresFdwGetResult) as described in the documentation [10].

Remote metrics

pg_stat_statements on the remote server. This lets you see which Remote SQL queries are being executed, how often, and how long they take. Compare these with the Remote SQL strings in your local EXPLAIN plans.
Server logs. Set log_statement or lower log_min_duration_statement on the remote server to capture long-running remote queries.

Correlating local and remote metrics can reveal patterns such as a new code path causing a surge in remote queries, or pushdown failures that lead to heavy remote scans.
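For the remote side, a query along these lines can surface the most expensive statements that touch a given table. It is a sketch that assumes PostgreSQL 13 or later column names, and the table name in the filter is only an example. Because postgres_fdw typically runs scans through cursors, expect many entries to start with DECLARE ... CURSOR FOR or FETCH rather than a bare SELECT.

```sql
-- Run on the remote server. Before PostgreSQL 13 the timing columns
-- are total_time and mean_time instead.
SELECT calls,
       rows,
       round(total_exec_time::numeric, 1) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       left(query, 120)                   AS query_start
FROM pg_stat_statements
WHERE query ILIKE '%user_attribute%'
ORDER BY total_exec_time DESC
LIMIT 10;
```

Comparing calls here against the loops value in the local plan is a quick way to confirm an N+1 pattern.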

Case Study: Refactoring a Keycloak Coverage Query

The theory above may seem abstract until you see it play out in practice. Let's walk through a real example inspired by a Keycloak integration.

The original query calculated coverage: given a list of category IDs, it returned the percentage of users who had attributes mapped to those categories and a JSON array of entity counts. The query used a CTE to build a list of scoped users, then joined it with user attributes, category mappings, and a few other tables.

Symptom

In a test environment with 100K user records, the query averaged 166 ms. This was slower than expected. Running EXPLAIN (ANALYZE, BUFFERS, VERBOSE) showed two foreign scans on the Keycloak database. The first scanned user_entity 416 times (loops = 416). The second pulled all rows from user_attribute where name = 'attributeA' before filtering by tenant and group locally.

Here's a simplified excerpt (numbers are approximate):

Foreign Scan on public.user_entity  (actual time=0.117..0.117 rows=1 loops=416)
  Remote SQL: SELECT id, tenant_id FROM public.user_entity WHERE (enabled AND service_account_client_link IS NULL AND id = $1)
Foreign Scan on public.user_attribute  (actual time=41.267..80.352 rows=80739 loops=1)
  Remote SQL: SELECT value, user_id FROM public.user_attribute WHERE (('attributeA' = name))

The first scan performed a single-row lookup 416 times. The second scan retrieved 80,739 rows because the only condition pushed down was name = 'attributeA'. Tenant and group scoping occurred locally. That meant 80k rows were transferred over the network and then filtered down to about 671 on the local side.

Diagnosis

There were two main issues.

First was the N+1 remote calls on user_entity. The join to user_entity was not pushed down, so the plan executed a remote lookup for each row from user_group_membership. This created 416 remote queries.

Second was the unscoped attribute fetch. Because the WHERE clause included user_entity.tenant_id = tenant.id and keycloak_group.name = 'groupA' in a higher CTE, the FDW could not see those predicates when scanning user_attribute. It therefore fetched all rows with name = 'attributeA' and left the tenant and group filters to the local side.

Refactor

The fix was to inline the tenant and group joins into the user_attribute scan to avoid the nested-loop pattern. The refactored selected_user_attributes CTE looked like this (simplified for readability):

WITH selected_user_attributes AS (
  SELECT DISTINCT ua.user_id, ua.value
  FROM public.user_attribute ua
  JOIN public.user_entity u ON u.id = ua.user_id
  JOIN public.user_group_membership ugm ON ugm.user_id = u.id
  JOIN public.keycloak_group g ON g.id = ugm.group_id
  JOIN public.tenant r ON r.id = u.tenant_id
  WHERE ua.name = 'attributeA'
    AND u.enabled
    AND u.service_account_client_link IS NULL
    AND r.name = 'tenantA'
    AND (g.name = 'groupA' OR g.parent_group = (
         SELECT id FROM public.keycloak_group WHERE name = 'groupA' AND tenant_id = r.id
    ))
)

This single query expresses the same scoping logic that previously lived in separate CTEs. Because all the join conditions are on the same foreign server and use built‑in operators, the FDW can push down the entire join. The new plan looked like this:

Foreign Scan  (actual time=7.840..7.856 rows=671 loops=1)
  Remote SQL: SELECT ua.user_id, ua.value FROM user_attribute ua JOIN user_entity u ON ua.user_id = u.id JOIN user_group_membership ugm ON ugm.user_id = u.id JOIN keycloak_group g ON g.id = ugm.group_id JOIN tenant r ON u.tenant_id= r.id WHERE ua.name = 'attributeA' AND u.enabled AND u.service_account_client_link IS NULL AND r.name = 'tenantA' AND (g.name = 'groupA' OR g.parent_group = $1)

Only one remote query is executed, and it returns 671 rows. Tenant and group scoping occur on the remote server. There is no nested loop or repeated remote scan. The final runtime dropped to about 25 ms.

Why it improved

Fewer rows crossing the network. The old plan fetched 80k attribute rows and filtered them locally. The new plan fetched only the 671 scoped rows.
No repeated remote calls. The old plan executed 416 remote scans of user_entity. The new plan performs one joined remote query.
Less local work. Because the join and filtering happen remotely, the local side no longer hashes or filters large sets.

Key takeaway

If you see a Foreign Scan with a high loops count or a Remote SQL that doesn’t contain your filters and joins, you’re leaving performance on the table. Merging filters and joins into a single remote query (subject to shippability rules) often yields large improvements: more than 6x in this case (166 ms down to 25 ms), and potentially more when the row reduction is larger.

Checklist and Troubleshooting Guide

The following steps summarize how to approach FDW performance tuning:

Inspect the Remote SQL. Always run EXPLAIN (VERBOSE) and look at what is being sent to the remote. If your predicates are missing, the FDW isn't pushing them down.
Check loops. If loops is greater than 1 on a Foreign Scan, you are paying for repeated remote calls. Rewrite the query or reorder the joins to make the foreign scan run once.
Make predicates shippable. Replace non-immutable functions (volatile ones such as random(), but also stable ones such as now()) with constants or parameters. Ensure operators and functions are built‑in or explicitly allow‑listed via the extensions option [2].
Align types and collations. Use the same data types and collations on both sides to avoid semantic mismatches [1].
Push joins to the same server. Consolidate tables on one foreign server if possible. Joins across servers cannot be pushed down [6].
Use use_remote_estimate when planning seems off. Enabling remote estimates can improve join order selection [3].
Tune fetch_size and costs if your queries transfer many rows. A bigger fetch_size reduces round trips; adjusting fdw_startup_cost and fdw_tuple_cost influences the planner [3].
Analyze foreign tables if you rely on local cost estimates. Keep in mind that stats can get stale quickly [3].
Monitor both servers. Use pg_stat_statements on local and remote servers to see how often remote queries run and how long they take.
Test version upgrades. Each major release improves FDW pushdown semantics (for example, aggregates in 10 [7], ORDER/LIMIT in 12 [8]). Retest after upgrading.

Case Study Takeaways

Querying remote data with PostgreSQL’s postgres_fdw can be fast and convenient if you respect the underlying mechanics. Pushdown is the difference between streaming a trickle of relevant rows and hauling an ocean of data across the network. It isn't simply a matter of moving CPU cycles. It changes how much data moves, how many network round trips occur, and how much your local server has to do.

The rules may seem restrictive – use only immutable functions, avoid cross‑server joins, align types and collations – but they exist to preserve correctness while enabling optimization.

By reading EXPLAIN from the bottom up, inspecting the Remote SQL, and understanding the shippability rules, you can spot slow patterns quickly. Armed with tuning knobs like fetch_size and use_remote_estimate, and a willingness to rewrite queries to make joins and filters pushable, you can often achieve dramatic performance gains without touching your hardware.

This case study shows that rewriting a query so that it runs as a single joined remote query reduced runtime from around 166 ms to 25 ms. That sort of improvement is not rare. It’s what happens when you treat FDW queries as distributed queries rather than local queries in disguise.

The next time you debug a slow FDW query, remember this handbook. Check the Remote SQL. Count the loops. Ask yourself: “Am I doing the work close to the data, or am I bringing the data to the work?” Adjust accordingly, and you'll write queries that make the most of Postgres's federated capabilities while keeping your latency in check.

This section closes the case study loop and summarizes exactly what changed in the plan and why it produced a large end-to-end win. The following sections of the handbook turn that single win into a repeatable method: how Postgres determines what is shippable, how to quickly read FDW plans, which operations and versions matter, and how to debug common failure modes that prevent pushdown.

Advanced Operations: A Deeper Dive into Shippability

The previous sections introduced the basic rules around what can be pushed to the remote and why. To really make sense of those rules, you need to see how they play out on the operations you use every day.

This section walks through filters, joins, aggregates, ordering and limits, DISTINCT queries, and window functions in more detail. By the end, you should have a mental map of which operations to trust and which to double‑check when reading your plans.

Filters and simple predicates

WHERE clauses matter more than you think

When you specify WHERE attribute = 'value' on a foreign table, the FDW will happily transmit that predicate to the remote server as long as the comparison uses built‑in types and immutable operators. For example:

WHERE id = 42 is fine
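WHERE lower(username) = 'hamdaan' is fine because lower() is a built‑in immutable function (as long as the collations line up)
WHERE created_at >= now() - interval '7 days' is not shippable because now() is not immutable (Postgres marks it STABLE), and postgres_fdw ships only immutable functions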

When such a predicate cannot be pushed, the FDW will fetch every row that matches all the shippable predicates and apply the rest locally. That means that a seemingly innocuous call to now() can blow up your network traffic.

The lesson is simple: compute non-immutable values up front (in your application, as a query parameter, or in an uncorrelated scalar subquery) so that the query against the foreign table sees them as constants or parameters.
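One way to do this without touching application code is a prepared statement, sketched below. The foreign table remote_events and its timestamptz column created_at are hypothetical names chosen for illustration.

```sql
-- remote_events is a hypothetical foreign table; created_at is assumed
-- to be timestamptz so the comparison itself is immutable.
PREPARE recent_events(timestamptz) AS
    SELECT id, created_at
    FROM remote_events
    WHERE created_at >= $1;

-- now() is evaluated locally, once, when the argument is computed.
EXECUTE recent_events(now() - interval '7 days');

-- Verify: the Remote SQL line should contain the created_at comparison.
EXPLAIN (VERBOSE) EXECUTE recent_events(now() - interval '7 days');
```

If the Remote SQL in the last statement still lacks the comparison, check the column type first: comparing a timestamp column against a timestamptz value is not an immutable operation.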

Complex expressions are not automatically unsafe

Suppose you have WHERE (status = 'active' AND (age BETWEEN 18 AND 29 OR age > 65)). This entire expression is shippable because it uses built‑in boolean logic, simple comparisons, and immutable operators. The FDW will deparse it into remote SQL and forward it. You only need to worry when one of the subexpressions introduces a function or operator that the FDW doesn’t recognize or cannot safely assume exists on the remote.

A good heuristic is: if you can express your filter using only simple comparisons, boolean logic, and built‑in functions, pushdown should work. When in doubt, check the Remote SQL.

Array and JSON operators

Modern Postgres makes heavy use of array and JSON functions. Many of these, like the array overlap operator && used in the case study, are built‑in and can be shipped. Operators and functions that come from an extension are different: postgres_fdw treats them as shippable only when the extension is allow‑listed.

If your filter uses one of these, ensure that the extension is available and allow‑listed on the foreign server. Otherwise, the FDW will fetch rows and perform the JSON logic locally. This is rarely what you want when dealing with large JSON columns.

Joins: the good, the bad, and the ugly

Same‑server joins are your friend

If you join multiple foreign tables that are all defined on the same foreign server and user mapping, and if the join condition uses only shippable expressions, then the FDW can generate a single remote join. This is the ideal case.

For example, joining orders and customers on orders.customer_id = customers.id is pushable, as long as both tables reside on the same foreign server. The remote planner will use its own statistics and indexes to plan the join, and the local server will simply iterate through the result. Postgres 9.6 and later support this pattern [6].

Cross‑server joins break pushdown

If you attempt to join two foreign tables that live on different servers (or even on the same remote server but with different user mappings), postgres_fdw will fetch the tables separately and join them locally. This is almost always slower than pushing the join down, because every row that survives each side's pushed-down filters has to cross the network. Without selective filters, that means both tables in their entirety.

The FDW design team chose not to support cross‑server joins because there is no portable way to tell two remote servers to cooperate on a join. Your options are: replicate one table on the other server, materialize the smaller table locally before joining, or restructure the query to filter aggressively on each side before joining locally.

Mixed local/foreign joins are tricky

Joining a local table to a foreign table cannot be pushed down, for straightforward reasons: the remote server has no access to your local data. A common pattern that triggers repeated remote calls looks like this:

SELECT u.id, a.value
FROM users u
LEFT JOIN user_attribute a
  ON a.user_id = u.id AND a.name = 'favorite_color';

If users is a local table and user_attribute is foreign, the plan may use a nested loop: for each local u, it executes a remote lookup in user_attribute to retrieve attributes.

The fix is to flip the query: retrieve all relevant rows from user_attribute in one remote scan, then join them locally. Or, if possible, create a small temporary table on the remote side with your u.id values, perform the join entirely remotely, and then fetch the results.

Join conditions matter

Even when joining two foreign tables on the same server, an unshippable join condition will force the join to be local. For example, a join condition that calls a user-defined function from outside any allow-listed extension is not pushable, because postgres_fdw cannot assume the function exists or behaves identically on the remote.

If you need case‑insensitive matching, prefer built-in tools such as ILIKE or lower() (for example, lower(a.name) = lower(b.name)) over custom functions, and confirm in the Remote SQL that the condition was shipped. Similarly, joining on a cast expression (for example, JOIN ON CAST(a.id AS text) = b.text_id) can block pushdown. Define your columns with matching types instead.

Aggregates and grouping

Aggregates are where the data movement story shines. When you can push down a GROUP BY and aggregate functions like COUNT, SUM, AVG, or MAX, you reduce the result set to just the aggregated rows. This can be a difference of several orders of magnitude.

Postgres 10 introduced aggregate pushdown [7]. But not all aggregates are equal:

Simple aggregates such as COUNT(*), SUM(col), AVG(col), MIN(col), and MAX(col) are shippable when applied to shippable expressions. Even COUNT(DISTINCT col) is often shippable, because the remote can deduplicate before counting. The FDW will wrap the aggregate in a remote query and return just the aggregated row.

If you see a GroupAggregate node on the local side, check whether all involved columns and functions are shippable. If they are, ensure that the join conditions above are also pushable.

Filtered aggregates such as COUNT(*) FILTER (WHERE x > 5) or SUM(col) FILTER (WHERE status = 'active') are often pushable, because postgres_fdw can include the FILTER clause in the remote aggregate. As long as the filter expression is shippable, the whole aggregate runs on the remote server.

User‑defined aggregates are rarely pushable. If you have a custom aggregate function, the FDW will not assume that it exists or behaves the same on the remote server. Even if you install the function on both servers, postgres_fdw won't push it unless the function is in an allow‑listed extension.

Grouping sets and rollups are not currently pushable. When you write GROUP BY GROUPING SETS (...) or ROLLUP(...), Postgres will compute the grouping locally even if the underlying scan is remote.

If you need complex rollups, consider performing them in two steps: push down the initial grouping to the remote server to reduce rows, then perform the rollup locally.
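The two-step approach can be sketched as follows. The foreign table remote_sales and its columns are hypothetical. The inner query is a plain GROUP BY that is a candidate for pushdown, and the outer query applies ROLLUP locally to the already-reduced rows.

```sql
-- remote_sales (region, product, amount) is a hypothetical foreign table.
SELECT region,
       product,
       SUM(total) AS total
FROM (
    -- Step 1: plain grouping, eligible for aggregate pushdown.
    SELECT region, product, SUM(amount) AS total
    FROM remote_sales
    GROUP BY region, product
) per_group
-- Step 2: rollup computed locally over one row per (region, product).
GROUP BY ROLLUP (region, product);
```

This works for aggregates that can be re-aggregated, such as SUM, COUNT (by summing the counts), MIN, and MAX. An average needs the sum and the count carried separately.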

ORDER BY, LIMIT, and DISTINCT

Ordering and limiting rows may seem like purely cosmetic features, but they affect how much data is transferred. If the remote can sort and limit, the local server only receives the top N rows. If it cannot, the local server must sort everything.

Postgres 12 expanded the cases where ORDER BY and LIMIT are pushed down [8]. Here are guidelines:

Single foreign scan with simple sort: If your query selects from one foreign table and sorts by a shippable expression (for example, ORDER BY created_at DESC), the FDW will include ORDER BY in Remote SQL. It will also push down LIMIT and OFFSET. This is ideal because the remote server does the sort and sends only the top rows.
Sort after join: If you sort after joining two foreign tables on the same server, and the join and sort expressions are shippable, the FDW may push both down. But if the sort requires columns from the local side or from a different remote server, the FDW cannot push it down.
Sort after aggregation: Sorting aggregated results is often pushable as long as the aggregate itself is pushable. But when grouping occurs locally, the sort remains local.
DISTINCT is conceptually close to GROUP BY over the select list, but don't assume it is pushed down the same way. Check whether the Remote SQL actually contains DISTINCT. If it doesn't, the rows are deduplicated locally after transfer. DISTINCT ON (col1) has different semantics again (it keeps the first row per col1 according to the ORDER BY), so verify it separately, especially on older Postgres versions.
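A quick check for the single-table case might look like this. The foreign table remote_events is a hypothetical name, and the exact deparsed text varies by version.

```sql
-- remote_events is a hypothetical foreign table.
EXPLAIN (VERBOSE)
SELECT id, created_at
FROM remote_events
ORDER BY created_at DESC
LIMIT 20;
-- If both clauses were shipped, the plan is a lone Foreign Scan whose
-- Remote SQL contains ORDER BY and LIMIT. A local Sort or Limit node
-- above the Foreign Scan means that part was not pushed down.
```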

Window functions

Window functions (for example, ROW_NUMBER() OVER (PARTITION BY ...), RANK(), LAG(), LEAD()) rely on ordering and partitioning across rows.

Postgres has not yet taught postgres_fdw how to push window functions. When you see a WindowAgg node in your plan, it’s almost always local. The FDW will fetch the rows, and the local server will sort, partition, and compute the window. If you need to run window functions on remote data, plan to transfer the data locally.
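Since the window computation stays local, the practical lever is to shrink the input with shippable filters first. A minimal sketch, assuming a hypothetical foreign table remote_events with a timestamptz column created_at:

```sql
-- The WHERE clause uses a constant, so it can be shipped; the window
-- function is then computed locally over the reduced row set.
SELECT user_id,
       created_at,
       row_number() OVER (PARTITION BY user_id
                          ORDER BY created_at DESC) AS rn
FROM remote_events
WHERE created_at >= timestamptz '2026-01-01 00:00:00+00';
```

In the plan, you would expect a WindowAgg node on top and a Foreign Scan whose Remote SQL carries the created_at predicate.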

Version‑specific quirks

The exact pushdown capabilities vary by release. When planning migrations or deciding whether to rely on a pushdown behavior, check the release notes:

9.6: first version to support pushdown of joins and sorts, and remote updates and deletes.
10: introduced aggregate pushdown [7], significantly reducing network use for GROUP BY queries.
11: improved partition pruning and join ordering for foreign tables.
12: expanded ORDER BY and LIMIT pushdown [8].
15: added pushdown for simple CASE expressions and additional built‑in functions.
17: added pushdown of EXISTS and IN subqueries. Later releases continue to expand shippable constructs. Always test on your target version because subtle improvements can change what the FDW can ship.

Common Anti‑Patterns and How to Avoid Them

Everyone has run into FDW queries that seemed reasonable but turned out to be bottlenecks. Here are a few of the most common mistakes and how to correct them. These examples are deliberately simplified so you can adapt them to your schema.

Using volatile functions in predicates

Anti‑pattern:

SELECT *
FROM audit_logs
WHERE event_ts >= now() - interval '1 day';

now() is not immutable (Postgres marks it STABLE), and postgres_fdw ships only immutable functions, so this predicate stays local. Every row of audit_logs crosses the network and is filtered locally.

Better:

SELECT *
FROM audit_logs
WHERE event_ts >= $1;

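Compute $1 (a timestamp) in your application or upstream query. Or compute it once in an uncorrelated scalar subquery:

SELECT * FROM audit_logs WHERE event_ts >= (SELECT now() - interval '1 day');

The planner evaluates the subquery once, before the scan, and hands the result to the foreign scan as a parameter, so the predicate can be pushed. A plain CTE is not a reliable substitute: since PostgreSQL 12 it can be inlined, which puts now() back into the predicate. Confirm with EXPLAIN (VERBOSE).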

Joining local and foreign data first

Anti‑pattern:

SELECT u.email, ua.value
FROM users u
LEFT JOIN user_attribute ua ON u.id = ua.user_id AND ua.name = 'favorite_movie';

This uses a local table (users) to drive a join to a foreign table (user_attribute). If users has 10,000 rows and the planner picks a parameterized nested loop, the FDW issues 10,000 individual remote queries. Each call fetches one or zero rows from user_attribute.

Better:

-- Fetch all favorite movies remotely and join locally
WITH remote_movies AS (
  SELECT ua.user_id, ua.value
  FROM user_attribute ua
  WHERE ua.name = 'favorite_movie'
)
SELECT u.email, rm.value
FROM users u
LEFT JOIN remote_movies rm ON u.id = rm.user_id;

Now the FDW issues one query to fetch all relevant attributes, and the join is done locally in one pass.

Cross‑server joins without materialization

Anti‑pattern:

SELECT *
FROM remote_db1.orders o
JOIN remote_db2.customers c ON o.customer_id = c.id;

This is not pushable because the two tables are on different foreign servers. Postgres will fetch orders and customers separately and join them locally. If orders has 1 million rows and customers has 50,000 rows, you will transfer 1.05 million rows.

Better: Replicate or materialize one side on the other server (or locally) before joining. For example, create a materialized view m_customers on remote_db1 containing just the id and name of the customers you need, then join orders and m_customers on the same server. Alternatively, copy customers into a temporary table on the local server and join there.
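The local temporary table variant could look like the sketch below, reusing the table names from the anti-pattern. Note what it does and does not fix: customers is copied once and can be analyzed, but orders is still fetched across the network unless you also add selective filters on it.

```sql
-- Copy only the needed columns of the smaller table, once.
CREATE TEMP TABLE tmp_customers AS
SELECT id, name
FROM remote_db2.customers;

-- Give the local planner real row counts for the new table.
ANALYZE tmp_customers;

-- The join runs locally against a local table with known statistics.
SELECT o.*, c.name
FROM remote_db1.orders o
JOIN tmp_customers c ON o.customer_id = c.id;
```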

Complex expressions on join keys

Anti‑pattern:

SELECT *
FROM remote_table a
JOIN remote_table b ON CAST(a.key AS text) = b.key_text;

Casting a numeric key to text prevents pushdown. The remote server cannot use indexes and must return both tables. The local server performs the join and cast.

Better: Align your schemas so that the join columns use the same type. If you cannot change the schema, create a computed column on the remote server with the appropriate type and use it in the join.

Ignoring collation and type mismatches

Anti‑pattern:

SELECT *
FROM remote_table
WHERE citext_col = 'abc';

citext is an extension type, so postgres_fdw treats the comparison as unshippable unless citext is listed in the foreign server's extensions option. When it isn't, the filter runs locally. This appears harmless until you see the plan and realize all rows were fetched.

Better: Install the same extensions and collations on the remote server and allow‑list them in the extensions option, or convert the column to a base type like text on both sides.
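Allow-listing an extension is a one-line server option change. In this sketch, remote_srv is a hypothetical foreign server name. Only list an extension if it is installed on both sides with compatible versions, because postgres_fdw takes your word for it.

```sql
-- remote_srv is a hypothetical foreign server name.
-- Use SET instead of ADD if the extensions option is already defined.
ALTER SERVER remote_srv OPTIONS (ADD extensions 'citext');

-- Inspect the options currently set on the server.
SELECT srvname, srvoptions
FROM pg_foreign_server
WHERE srvname = 'remote_srv';
```

Afterwards, re-run EXPLAIN (VERBOSE) and check that the citext comparison now appears in the Remote SQL.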

Extending Tuning: Calibrating Cost Models

Earlier, we discussed fetch_size, use_remote_estimate, and the cost knobs. This section expands on how to use them strategically.

Balancing fetch size and memory

fetch_size controls how many rows the FDW asks for in each round trip [9]. Think of it as the batch size. The default (100) works well for small result sets. If you expect to retrieve tens of thousands of rows, a higher fetch size reduces the overhead of many network requests. But there are trade‑offs:

Memory consumption: Each foreign scan buffers rows until they are consumed. A huge fetch size (for example, 10,000) may allocate more memory than you expect, especially when multiple scans run concurrently. Monitor memory usage as you increase this setting.
Latency hiding: If network latency is high, overlapping network requests with local processing can hide some latency. But postgres_fdw does not pipeline multiple fetches – it waits for one batch before requesting the next. This means that a larger batch size reduces the number of waits, but cannot overlap them. If you operate across data centers, consider using a connection pooler or caching layer instead of just increasing fetch_size.
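To make the trade-off concrete: the unscoped scan in the case study returned 80,739 rows. At the default fetch_size of 100 that is roughly 808 fetches, at 1,000 it is about 81, and at 5,000 about 17. The option can be set per server or per foreign table, as sketched below. The server name remote_srv and the values are illustrative assumptions, not recommendations.

```sql
-- remote_srv is a hypothetical foreign server name.
-- Server-wide default for all foreign tables on this server.
-- Use SET instead of ADD if the option is already defined.
ALTER SERVER remote_srv OPTIONS (ADD fetch_size '1000');

-- Per-table override for a table known to return large result sets.
ALTER FOREIGN TABLE public.user_attribute OPTIONS (ADD fetch_size '5000');
```

A table-level setting takes precedence over the server-level one.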

Remote estimates vs. local estimates

The planner uses statistics to estimate how many rows each node will produce, which in turn influences join order. When use_remote_estimate is false (the default), the planner guesses based on local stats collected by ANALYZE on the foreign table. This can be wrong if the remote table has a different distribution than the local sample, or if the table has changed since the last ANALYZE.

Setting use_remote_estimate to true instructs the FDW to run EXPLAIN on the remote server during planning to obtain row counts and cost estimates [3]. This can improve join ordering, especially when joining multiple foreign tables or mixing local and foreign tables. The downside is increased planning time because each remote estimate runs an extra query.

In practice:

Enable use_remote_estimate on queries with complex joins where the planner picks obviously wrong join orders. If enabling it improves the plan, consider leaving it on for that server or table.
Use ANALYZE on foreign tables periodically if your remote data is relatively static. This populates local stats and can avoid the overhead of remote estimates.
Don’t enable use_remote_estimate indiscriminately on simple lookups. The cost of the additional remote round trips during planning may outweigh the benefit.
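Both approaches are a single statement each. In this sketch, remote_srv is a hypothetical server name and user_attribute stands in for whichever foreign table is misestimated.

```sql
-- Option A: ask the remote server for estimates, for one table only.
ALTER FOREIGN TABLE public.user_attribute
    OPTIONS (ADD use_remote_estimate 'true');

-- Option B: the same, for every foreign table on a (hypothetical) server.
ALTER SERVER remote_srv OPTIONS (ADD use_remote_estimate 'true');

-- Option C: keep remote estimates off and refresh local statistics instead.
ANALYZE public.user_attribute;
```

Compare planning time and the chosen join order in EXPLAIN before and after, and keep whichever option gives a stable plan at acceptable planning cost.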

Tuning cost parameters

fdw_startup_cost and fdw_tuple_cost control how much the planner thinks it costs to start a foreign scan and fetch each row [3]. If these are too low, the planner may choose a nested loop that generates many small remote calls. If they are too high, the planner might avoid remote scans even when they are efficient.

You can adjust these parameters based on empirical measurement:

Increase fdw_startup_cost to discourage the planner from using nested loops that call the remote table repeatedly. You might set it to reflect the average cost of a remote round trip.
Increase fdw_tuple_cost if network bandwidth is limited or expensive. This indicates to the planner that each remote row incurs higher fetch costs than a local row. The planner will prefer plans that filter early on the remote side.

Always adjust these settings gradually and observe the effect on the plan. Keep separate settings per foreign server if network conditions differ.
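The cost knobs are server options. The values below are placeholders chosen only to show the syntax, and remote_srv is a hypothetical server name. Derive real values from your own measurements.

```sql
-- remote_srv is a hypothetical foreign server name; values are illustrative.
-- Use SET instead of ADD for options that are already defined.
ALTER SERVER remote_srv
    OPTIONS (ADD fdw_startup_cost '500', ADD fdw_tuple_cost '0.05');

-- Review what is currently configured.
SELECT srvname, srvoptions
FROM pg_foreign_server
WHERE srvname = 'remote_srv';
```

Re-run EXPLAIN after each change and confirm that the plan shape moved in the direction you intended before changing anything else.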

When to analyze foreign tables

Running ANALYZE on a foreign table collects sample statistics by reading rows from the remote server. This helps the planner estimate row counts when use_remote_estimate is off. It does not influence index choice on the remote side: the remote planner decides that from its own statistics. You should analyze foreign tables when:

The remote table is large and static, and you want accurate local estimates without the overhead of remote estimates.
You have just defined a foreign table, and the default stats are empty.
You changed the extensions allow‑list to enable more pushdown and want the planner to see the effect.

Conversely, if the remote data changes constantly, ANALYZE results will quickly become stale. In that case, rely on use_remote_estimate instead.

Further Case Studies and Practical Examples

The Keycloak coverage example is not the only place where pushdown matters. The following scenarios illustrate other patterns you may encounter.

Reporting on a sharded logging system

Imagine you store application logs across multiple shards, each a separate Postgres database. You want to produce a report of the number of error logs per service per day.

A naïve approach might union the raw rows from all shards and aggregate them locally:

SELECT shard, service, date_trunc('day', log_time) AS day, COUNT(*)
FROM (
  SELECT 1 AS shard, service, log_time FROM shard1.logs
  UNION ALL
  SELECT 2 AS shard, service, log_time FROM shard2.logs
  ...
) all_logs
GROUP BY shard, service, day;

This approach will fetch all log rows to the local server and aggregate them locally. A better solution is to push the grouping to each shard:

SELECT shard, service, day, sum(count)
FROM (
  SELECT 1 AS shard, service, date_trunc('day', log_time) AS day, COUNT(*) AS count
  FROM shard1.logs
  WHERE log_time >= $1 AND log_time < $2
  GROUP BY service, day
  UNION ALL
  SELECT 2 AS shard, service, date_trunc('day', log_time) AS day, COUNT(*)
  FROM shard2.logs
  WHERE log_time >= $1 AND log_time < $2
  GROUP BY service, day
  ...
) x
GROUP BY shard, service, day;

Here, each foreign server returns a small set of aggregated rows instead of raw logs. The outer aggregation sums across shards. This pattern generalizes: push grouping and filtering to the remote side, then combine locally.

Combining remote and local data for analytics

Suppose you have a local table users and a remote table orders. You want to compute the average order amount per user segment. A naïve query might look like:

SELECT u.segment, AVG(o.amount)
FROM users u
JOIN orders o ON o.user_id = u.id
GROUP BY u.segment;

This is a local join driving a remote nested loop. The better approach is to aggregate orders remotely by user_id and join on the small result:

WITH remote_totals AS (
  SELECT user_id, SUM(amount) AS total, COUNT(*) AS n
  FROM orders
  GROUP BY user_id
)
SELECT u.segment, SUM(rt.total) / SUM(rt.n) AS avg_amount  -- AVG(total / n) would average per-user averages instead
FROM users u
JOIN remote_totals rt ON u.id = rt.user_id
GROUP BY u.segment;

This pushes the heavy aggregation to the remote and transfers only one row per user. The local join then groups by segment. As with other examples, the key is to reduce remote rows before they cross the network.

Avoiding pushdown for correctness

There are legitimate cases where you should prevent pushdown because of semantic differences. Postgres allows you to do this by adding OFFSET 0 to a subquery or by wrapping the foreign table in a materialized CTE (plain CTEs can be inlined since PostgreSQL 12).

For example, if a built‑in function behaves differently on the remote due to a version mismatch, you can force local evaluation:

WITH local_eval AS MATERIALIZED (SELECT * FROM remote_table)  -- MATERIALIZED keeps the filter out of the remote query
SELECT *
FROM local_eval
WHERE some_complex_expression(local_eval.col) > 0;

Alternatively, a WHERE clause like random() < 0.1 will not push down because random() is volatile, so you don't need to force it. But adding OFFSET 0 to a subquery is a simple hack that keeps outer predicates from being pushed into it:

SELECT * FROM (SELECT * FROM remote_table OFFSET 0) AS t WHERE some_complex_expression(t.col) > 0;

Knowing how to disable pushdown intentionally helps you debug. If a query returns different results when pushdown occurs, suspect type/collation mismatches or remote session settings [4].

Monitoring, Diagnostics, and Regression Testing

Monitoring doesn't end at counting remote rows. To make pushdown reliable in production, you need to set up mechanisms to detect regressions and gather evidence when performance changes.

Automate EXPLAIN regression tests

In addition to unit tests and integration tests, you can add tests that assert the shape of your plans. For instance, if a mission-critical report must always push down a WHERE clause, you can write a test that runs EXPLAIN (VERBOSE) and checks that the Remote SQL contains the filter. If the test runs EXPLAIN (ANALYZE, VERBOSE) instead, it can also parse loops on the Foreign Scan node and assert that it is 1. When a developer inadvertently adds a non-immutable function or changes a join, the test will fail. This is akin to snapshot testing for SQL.
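A plan-shape assertion of this kind can be sketched without any database driver by working on the text that EXPLAIN (VERBOSE) returns. The helper names below and the `SAMPLE_PLAN` string are hypothetical; in a real test the text would come from running EXPLAIN through whatever client your test suite already uses.

```python
import re

def remote_sql_statements(explain_text):
    """Return every 'Remote SQL:' string found in EXPLAIN (VERBOSE) text output."""
    return [match.group(1).strip()
            for match in re.finditer(r"Remote SQL:\s*(.+)", explain_text)]

def assert_pushed_down(explain_text, fragment):
    """Fail unless at least one Remote SQL statement contains the fragment."""
    statements = remote_sql_statements(explain_text)
    if not any(fragment.lower() in s.lower() for s in statements):
        raise AssertionError(f"{fragment!r} not found in Remote SQL: {statements}")
    return True

# Hand-written sample in the shape of a Foreign Scan node (illustrative only).
SAMPLE_PLAN = """\
Foreign Scan on public.orders o  (cost=100.00..150.00 rows=10 width=16)
  Output: o.user_id, o.amount
  Remote SQL: SELECT user_id, amount FROM public.orders WHERE ((status = 'paid'))
"""

assert_pushed_down(SAMPLE_PLAN, "status = 'paid'")
```

Matching on a short fragment rather than the whole statement keeps the test stable when a release changes quoting or column order in the generated remote SQL.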

Monitor pg_stat_statements across servers

Enable pg_stat_statements on both the local and remote servers. On the local side, track the total time, planning time, and rows for each FDW query. On the remote side, track which queries are being executed.

Look for outliers: a query whose remote calls spike or whose average remote rows jump from hundreds to thousands. Those are early signs of pushdown failure.
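One way to spot such jumps is to compare two snapshots of the remote server's pg_stat_statements and flag queries whose rows per call grew sharply. The sketch below assumes each snapshot has already been reduced to a dictionary of `{query_text: (calls, rows)}` covering a comparable time window; the function name and the `factor` threshold are hypothetical choices, not part of any Postgres tooling.

```python
def rows_per_call_regressions(before, after, factor=5.0):
    """Flag queries whose average rows per call grew by more than `factor`."""
    flagged = []
    for query, (calls, rows) in after.items():
        if query not in before or calls == 0:
            continue  # new or never-executed queries need separate handling
        old_calls, old_rows = before[query]
        if old_calls == 0:
            continue
        old_avg = old_rows / old_calls
        new_avg = rows / calls
        # max(..., 1.0) avoids flagging noise on queries that returned almost nothing.
        if new_avg > factor * max(old_avg, 1.0):
            flagged.append((query, old_avg, new_avg))
    return flagged

# Hypothetical snapshots: the first query jumps from 200 to 5000 rows per call.
before = {"SELECT ... WHERE status = $1": (100, 20000),
          "SELECT count(*) FROM orders": (50, 50)}
after = {"SELECT ... WHERE status = $1": (100, 500000),
         "SELECT count(*) FROM orders": (60, 60)}
flagged = rows_per_call_regressions(before, after)
print(flagged)
```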

Log remote SQL with auto_explain

Setting auto_explain.log_min_duration (for example, to 500ms) causes Postgres to automatically log the plans of statements that run longer than that threshold. Combine this with auto_explain.log_verbose = true and auto_explain.log_nested_statements = true to capture remote SQL as well, and with auto_explain.log_analyze = true if you also need actual row counts and loops (at the cost of extra timing overhead). When a federated query slows down, the log will show you exactly what remote SQL was executed and, with log_analyze enabled, how often. This is invaluable in production, where you cannot always run EXPLAIN interactively.

Use connection pooling and prepared statements

postgres_fdw keeps one remote connection per user mapping in each local session and reuses it for later queries in that session. This is a per-session cache rather than a shared pool, so you can also add connection pooling at the network level (for example, PgBouncer or PgCat).

Keeping connections warm avoids repeated connection setup, which is part of the per-scan overhead that the planner models with fdw_startup_cost. Server-side prepared statements (PREPARE and EXECUTE) can likewise save parse time when the same remote SQL is executed frequently. postgres_fdw already prepares the remote statements it issues for INSERT, UPDATE, and DELETE, while remote scans run through cursors that receive their parameters at execution time.

Regression testing after version upgrades

Every major Postgres release brings improvements to postgres_fdw pushdown semantics. But new releases also change planner heuristics and remote SQL generation. After an upgrade, rerun your key queries with EXPLAIN (VERBOSE), compare the Remote SQL, and benchmark them.

In some cases, a release may push down something previously local, revealing a latent type mismatch or a function difference. In other cases, pushdown may be withheld due to a new rule. Don’t assume that an upgrade automatically improves performance – test it.

Extended Guidelines for Advanced DBAs

To close this handbook, here are consolidated guidelines distilled from the previous sections. They go beyond simple bullet points to capture nuances. Keep them handy for reference or print them out for your team.

Respect the FDW safety model. Immutable functions and built‑in operators are your friends. Anything outside that scope must be explicitly allowed or evaluated locally. Understand which items belong to each category and plan accordingly.
Always read the Remote SQL. Don’t trust your intuition about what is being pushed down. The Remote SQL string is the only source of truth. It indicates whether a predicate, join, sort, or limit operation is occurring remotely. It also shows parameter placeholders (for example, $1) that correspond to values passed from the local plan.
Reduce before you fetch. The network is usually the highest cost. If the remote can reduce rows through filtering, grouping, or limiting, let it. If it cannot, structure your query to enable it. Avoid queries that require pulling large raw tables and processing them locally.
Beware of join order. The planner sometimes chooses a nested loop with a foreign table as the inner side, resulting in repeated remote calls. Examine loops: if you see a high number, consider rewriting the query or adjusting cost parameters.
Use CTEs strategically. A CTE can isolate remote scans and let you control whether they are materialized once or inlined. Use MATERIALIZED to avoid repeated remote scans when a CTE is referenced multiple times. Use NOT MATERIALIZED to allow optimizations across CTE boundaries.
Instrument, monitor, iterate. Good FDW performance is not a one‑off fix. Monitor queries and plans. Use tests to catch regressions. Adjust tuning knobs and indexes as your data or workload changes. Document your reasoning so others can understand why a particular plan is expected.
Educate your team. Federated queries invite subtle bugs and performance traps. Share the high‑level rules – immutable functions only, cross‑server joins are local, always check remote SQL – so engineers write safer queries by default. A 30‑minute training can save hours of debugging later.

Bringing it All Together

This handbook has covered a lot of ground: from the high‑level principle that pushdown is about data movement, to the nitty‑gritty of join conditions and tuning knobs, to troubleshooting steps and case studies. It is intentionally opinionated and personal: these are the patterns and pitfalls encountered in real systems, not abstract guidelines. By sharing specific examples, I hoped to make the rules memorable and show how they interplay with actual workloads.

The goal is not just to tell you what to do, but to show you how to think and problem solve: review the plan, trace data movement, and determine whether the query is doing the heavy work in the right place.

That thinking process, practiced enough times, becomes second nature. When you write a new query, you'll automatically consider whether your predicates are immutable, whether the join can be shipped, and whether you are about to trigger an N+1 pattern. When you review plans, you'll start from the Foreign Scan nodes and remote SQL, not the top‑level node. When you tune, you'll know which knobs to twist and in which order.

Keep experimenting. Use the examples here as starting points. Try different structures in a test environment and measure the difference. The more you play with pushdown, the more comfortable you'll become with its constraints and superpowers.

If this handbook helps you avoid one performance incident or saves you from shipping a broken query, it has done its job. Enjoy exploring the federated world of Postgres.

References

[1] [2] [3] [4] [5] [6] [9] [10] PostgreSQL: Documentation: 18: F.38. postgres_fdw – access data stored in external PostgreSQL servers (https://www.postgresql.org/docs/current/postgres-fdw.html)

[7] PostgreSQL: Release Notes (https://www.postgresql.org/docs/release/10.0/)

[8] PostgreSQL: Release Notes (https://www.postgresql.org/docs/release/12.0/)

The Cryptography Handbook: Exploring RSA PKCSv1.5, OAEP, and PSS

Hamdaan Ali — Wed, 02 Apr 2025 22:04:38 +0000

The RSA algorithm was introduced in 1978 in the seminal paper, "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems". Over the decades, as RSA became integral to secure communications, various vulnerabilities and attacks have emerged, underscoring the importance of understanding and implementing RSA correctly.

This handbook will help you understand the internal workings of the RSA algorithm, how they have evolved over the years, and the schemes defined under various RFCs. This knowledge will help you make informed choices about the most suitable RSA schemes depending on your business requirements.

In this handbook, we’ll begin by exploring the foundational principles of the RSA algorithm. By examining its mathematical underpinnings and historical evolution, you will gain insight into the diverse array of attacks that have emerged over the years.

The narrative unfolds as an evolutionary journey: from the original, straightforward (textbook) RSA implementation, through the discovery of vulnerabilities, to the development of effective countermeasures, and further refinements as new challenges were encountered. This progression illuminates how RSA has transformed over time and also demonstrates how modern cryptographic libraries have integrated these advancements to achieve secure implementations in today’s applications.

You can also watch the associated video.

Prerequisites
The Alice-Bob Paradigm
The Birth of the RSA Cryptosystem
RSA Operations
Issues with Euler’s Totient Function in RSA
The Carmichael Function
- Mathematical Implication of The Carmichael function
- The Carmichael Function in Modern Implementations
Issues with Raw RSA
Exploiting Textbook RSA’s Determinism and Malleability
Low-Exponent Attacks
Håstad’s Broadcast Attack: Low Exponent Meets Multiple Recipients
Introduction to Padding Schemes in RSA
Public Key Cryptography Standards (PKCS#1 v1.5)
- The Mathematics Behind PKCS#1 v1.5
The Bleichenbacher Attack
Optimal Asymmetric Encryption Padding (OAEP)
- The Mathematics Behind OAEP
Why SHA-1 or MD5 Are Safe in RSA-OAEP
- Label Hashing
- Mask Generation Function (MGF1)
Adoption in Cryptographic Libraries (PKCS#1 v1.5 vs OAEP)
Enhancing Digital Signatures: The Transition to PSS
The Road Ahead: Assessing RSA’s Long-Term Viability
References

Prerequisites

Linear Algebra: A foundational understanding of Linear Algebra and Modular Arithmetic will help you understand certain sections of the handbook, though it is not an absolute requirement. This handbook provides comprehensive explanations of mathematical expressions and their underlying concepts as they arise.

For a concise and relevant introduction to the Chinese Remainder Theorem (CRT) in the context of the handbook, you may find this resource helpful: CRT, RSA, and Low Exponent Attacks | YouTube.

Patience (and a Sense of Adventure): RFCs can sometimes get dull to read, and research papers can feel intimidating at first glance. This handbook is designed to make standard cryptographic concepts accessible to everyone, guiding you through each step with clarity and intuition. Every concept is reinforced with clear, step-by-step examples, ensuring not only a thorough understanding but also familiarity with widely used standard notations. So take your time, take a deep breath, and embrace the journey.

For visual learners, the associated video may offer a more engaging experience.

The Alice-Bob Paradigm

Throughout this handbook, you will come across numerous sequence diagrams and mathematical proofs that use the Alice-Bob Paradigm.

The Alice-Bob paradigm is a common convention in cryptography where two generic entities, often named Alice and Bob, are used to illustrate various scenarios, protocols, or cryptographic principles.

These characters represent two parties engaged in communication, with Alice typically representing the sender or initiator, and Bob representing the receiver or responder.

We often introduce Eve as a third party, symbolizing an eavesdropper or potential attacker, adding an element of security risk, and illustrating scenarios where external entities might attempt to intercept or manipulate the communication.

The Birth of the RSA Cryptosystem

The year 1978 witnessed the birth of a new era in cryptography with the introduction of the RSA cryptosystem, named after its inventors (Rivest, Shamir, and Adleman).

This development, introduced in the paper "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems", provided a method for secure digital communication and laid the foundation for modern public-key cryptography.

At the heart of RSA lies elementary number theory – specifically, the properties of prime numbers and modular arithmetic. Let’s first understand how these key concepts form its mathematical foundations.

Prime Numbers and Composite Moduli

The algorithm starts by selecting two large prime numbers, denoted as p and q. Their product ($n = p \times q$) forms the modulus for both the public and private keys.

The security of RSA depends heavily on the fact that, while multiplying these primes is computationally straightforward, factoring the resulting large composite number n is considered infeasible for sufficiently large primes.

At this point, it’s important to note that p and q must be large prime numbers to ensure RSA’s security. Fortunately, modern libraries handle this automatically by using well-established prime-generation algorithms. As a result, you can focus on higher-level aspects of your applications without having to manage the low-level details of prime selection.

For instance, OpenSSL’s RSA key generation routine performs several checks to ensure that the resulting modulus $n = p \times q$ meets the desired bit-length requirements.

The snippet below right-shifts the product of the generated primes (stored in r1) by bitse - 4 bits to isolate the top 4 bits, which are then checked to ensure that the modulus meets the desired size criteria.

if (!BN_rshift(r2, r1, bitse - 4))
    goto err;
bitst = BN_get_word(r2);

The extracted bits (bitst) are then compared against a predefined range (from 0x9 to 0xF). This range ensures that the most significant 4 bits of the product aren’t too small or too large.

if (bitst < 0x9 || bitst > 0xF) {
    bitse -= bitsr[i];

If the significant bits do not fall within the desired range, the bit length is adjusted and the prime-generation process is retried. If the number of retries exceeds a set limit, the entire process is restarted.

if (!BN_GENCB_call(cb, 2, n++))
    goto err;
if (primes > 4) {
    if (bitst < 0x9)
        adj++;
    else
        adj--;
} else if (retries == 4) {
    i = -1;
    bitse = 0;
    sk_BIGNUM_pop_free(factors, BN_clear_free);
    factors = sk_BIGNUM_new_null();
    if (factors == NULL)
        goto err;
    continue;
}
retries++;
goto redo;

To make it overwhelmingly likely that the generated numbers are prime, these libraries combine sieving methods, which quickly eliminate candidates with small factors, with probabilistic tests such as the Miller-Rabin primality test.
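To make the idea concrete, here is a minimal Python sketch of trial division followed by the Miller-Rabin test. It is not OpenSSL's implementation: the small-prime list, the default of 40 rounds, and the function name `is_probable_prime` are assumptions chosen for illustration.

```python
import random

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_probable_prime(n, rounds=40):
    """Return False for composites (with overwhelming probability), True for primes."""
    if n < 2:
        return False
    # Cheap sieve-like step: reject anything divisible by a small prime.
    for p in SMALL_PRIMES:
        if n == p:
            return True
        if n % p == 0:
            return False
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue  # this base does not prove n composite
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # base a is a witness that n is composite
    return True

print(is_probable_prime(2**61 - 1))  # a known Mersenne prime
print(is_probable_prime(41 * 43))    # composite with no small factors in the list
```

Each round lets a composite slip through with probability at most 1/4, so a few dozen rounds make a false "prime" verdict astronomically unlikely.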

The Euler Totient Function

For a number $n$ that is the product of two distinct primes $p$ and $q$, the Euler totient function is given by:

$$\varphi(n) = (p-1)(q-1)$$

This function counts the positive integers less than $n$ that are co-prime to $n$. Euler’s theorem, which states that $a^{\varphi(n)} \equiv 1 \pmod{n}$ for any integer $a$ co-prime to $n$, plays a central role in proving why RSA’s operations are reversible.

But most modern RSA cryptosystems use the Carmichael function instead of Euler’s totient function. We will examine the reasoning behind this shift in the next few sections.

Computing the Keys

Now we select an integer $e$ such that $1 < e < \varphi(n)$ and $\gcd(e, \varphi(n)) = 1$. This $e$ becomes the public exponent you see as a parameter in the RSA function calls you make.

With that done, let’s determine $d$ as the modular multiplicative inverse of $e$ modulo $\varphi(n)$. In other words, $d$ is computed such that:

$$e \times d \equiv 1 \pmod{\varphi(n)}$$

This step is the mathematical linchpin ensuring that decryption is the inverse operation of encryption.

In the 1978 paper, the authors explicitly provided these formulas and steps. They showed that if you encrypt a message $m$ using $c = m^e \bmod n$ and then decrypt using $m = c^d \bmod n$, the original message is recovered, thanks to the properties of modular exponentiation and Euler’s theorem. This mathematical framework was novel at the time and immediately set the stage for a new era in cryptography.
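The key computation and the round trip can be sketched with Python's built-in modular arithmetic. The primes 61 and 53, the exponent 17, and the function name `make_toy_keypair` are illustrative assumptions; real keys use primes that are hundreds of digits long and are generated by a cryptographic library.

```python
from math import gcd

def make_toy_keypair(p, q, e):
    """Return ((n, e), d) using Euler's totient, as in the original paper."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(n)")
    d = pow(e, -1, phi)  # modular inverse; requires Python 3.8+
    return (n, e), d

public, d = make_toy_keypair(61, 53, 17)
n, e = public
m = 65
c = pow(m, e, n)          # encryption: c = m^e mod n
recovered = pow(c, d, n)  # decryption: m = c^d mod n
print(n, d, c, recovered)
```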

RSA Operations

Now that the mathematical foundations are laid, the RSA algorithm can be seen as a set of three core operations: Encryption, Decryption, and Signing. Throughout this handbook's next sections, we will critically analyze these operations and learn about several pitfalls in each. Then we will examine how these were averted with the birth of new schemes, each to solve a new issue discovered on the way.

Encryption

With the public key $(n, e)$ available to everyone, any user can encrypt a message $m$ (where $m$ is first encoded as an integer in the range $0 \leq m < n$ ) using the formula:

$$c = m^e \mod n$$

Here, $c$ is the ciphertext. Because the operation is based on modular exponentiation, recovering $m$ from $c$ without knowing $d$ is believed to be computationally hard, even when $n$ and $e$ are public.

Decryption

The intended recipient, who possesses the private key $d$, decrypts the cipher text $c$ by computing:

$$m = c^d \bmod n$$

Using the relationship ($e \times d \equiv 1 \pmod{\varphi(n)}$) and properties from Euler’s theorem, the above operation exactly inverts the encryption step, recovering the original message $m$.
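The inversion can be made explicit with a short derivation. This is a sketch for the case where $m$ is co-prime to $n$; the integer $k$ is introduced here only for illustration, and comes from writing $e \times d = 1 + k\,\varphi(n)$:

$$c^d \equiv (m^e)^d = m^{1 + k\varphi(n)} = m \cdot \left(m^{\varphi(n)}\right)^k \equiv m \cdot 1^k = m \pmod{n}$$

The result also holds when $m$ shares a factor with $n$, which can be shown by working modulo $p$ and modulo $q$ separately.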

This ensures that only the holder of the private key can read the encrypted message. This is the backbone of RSA’s use in secure communication.

The sequence diagram below wraps up our discussion so far:

Digital Signatures

Digital signatures fulfill a different security goal: authenticity and integrity rather than confidentiality. While encryption and decryption use the public key for “locking” and the private key for “unlocking,” digital signatures reverse these roles.

1. Signing

The author of a message uses their private key $d$ to compute a signature $s$ on the message $m$, guided by the formula mentioned below:

$$s = m^d \bmod n$$

This can later be verified by others using the corresponding public key. The purpose here is not to recover a secret message but to create a proof of authenticity.

2. Verification

Anyone with the public key $(n, e)$ can verify that the signature s indeed belongs to the message $m$ by computing:

$$m \equiv s^e \pmod{n}$$

If the equivalence holds, it confirms two key points: that the message has not been tampered with (that is, integrity), and that the signature must have been generated using the private key $d$ (that is, authenticity).

As long as $d$ is kept secret, only the legitimate signer can produce a valid signature. Take a look at the sequence diagram below to understand the complete process.
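The two formulas translate directly into code. The sketch below signs the raw integer message, exactly as in the textbook scheme described here; the toy key (from the assumed primes 61 and 53) and the function names `sign` and `verify` are for illustration only.

```python
# Toy key from p = 61, q = 53 (illustrative only; far too small for real use).
n, e, d = 3233, 17, 2753

def sign(m, d, n):
    """Textbook signature: s = m^d mod n."""
    return pow(m, d, n)

def verify(m, s, e, n):
    """Accept if s^e mod n reproduces the message."""
    return pow(s, e, n) == m % n

message = 123
signature = sign(message, d, n)
print(verify(message, signature, e, n))      # signature matches the message
print(verify(message + 1, signature, e, n))  # altered message is rejected
```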

Issues with Euler’s Totient Function in RSA

While using Euler’s Totient Function works well in theory, implementers of the scheme realized its practical downsides. Simply put, the primary issue was that Euler’s Totient Function can lead to a larger private exponent $d$ than what was necessary.

To completely appreciate this fact, let’s take a step back to understand why the size of the private exponent $d$ matters in RSA.

RSA decryption (or signing) involves computing $c^d \bmod n$ (or $m^d \bmod n$ for a signature), which is done via modular exponentiation. The running time of exponentiation algorithms (like square-and-multiply) grows with the number of bits in $d$. A larger $d$ means more multiplications and squarings, that is, slower decryption.

In practice, if using Euler’s totient function makes $d$ roughly twice as large as required, $d$ is about one bit longer, so each exponentiation needs roughly one extra squaring and sometimes one extra multiplication. That overhead is small for typical random primes, but it grows when $p-1$ and $q-1$ share larger common factors. Even with a small $e$ (common public exponents like 3 or 65537), $d$ usually ends up nearly as large as the modulus it was inverted under, so that choice of modulus directly sets the size of $d$.

Beyond performance, having an unnecessarily large $d$ can increase storage size slightly (a few more bytes for the key). This can also lead to interoperability quirks, which is why standards and protocols such as FIPS 186-4 [1] and RFC 8017 [2] expect $d$ to be below a certain size. We will take a detailed look at this in the next section.

To combat these issues, cryptographers utilized the Carmichael function to generate RSA keys. Before we dive into how the Carmichael function helps our case, let’s quickly understand what the Carmichael function actually is.

The Carmichael Function

The Carmichael function, represented by $\lambda(n)$ and also known as the reduced totient or least universal exponent, is defined as the smallest positive integer $m$ such that $a^m \equiv 1 \pmod{n}$ for every integer $a$ co-prime to $n$.

To put this in easy terms, $\lambda(n)$ is the exponent of the multiplicative group modulo $n$ (the least common multiple of the orders of all elements). For RSA-style moduli (a product of two distinct primes), the Carmichael function is given by the formula:

$$\lambda(n) = \operatorname{lcm}(p-1,\,q-1)$$

where $n = p \cdot q$, with $p$ and $q$ being the large primes.

You may understand the Carmichael function better if we put it this way: $\lambda(n)$ is the least common multiple of the values of $\lambda$ at each prime power dividing $n$. So for a prime $p$, $\lambda(p) = \varphi(p) = p - 1$, and for two distinct primes, we take the $\operatorname{lcm}$ of $p-1$ and $q-1$.
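The definition and the lcm formula can be checked against each other on small numbers. In this sketch, `carmichael_two_primes` applies the formula and `least_universal_exponent` brute-forces the definition; both names are hypothetical, and the brute-force search is only practical for toy moduli.

```python
from math import gcd

def carmichael_two_primes(p, q):
    """lambda(n) for n = p * q with distinct primes: lcm(p - 1, q - 1)."""
    return (p - 1) * (q - 1) // gcd(p - 1, q - 1)

def least_universal_exponent(n):
    """Smallest m with a**m == 1 (mod n) for every a coprime to n (brute force)."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]
    m = 1
    while not all(pow(a, m, n) == 1 for a in units):
        m += 1
    return m

print(carmichael_two_primes(3, 11))   # formula for n = 33
print(least_universal_exponent(33))   # definition for n = 33
```

For $n = 33$ both give 10, which is half of $\varphi(33) = 20$.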

Mathematical Implication of The Carmichael function

The Carmichael function $\lambda(n)$ is a “tighter” bound. What this means is that $\lambda(n)$ divides $\varphi(n)$ (since the exponent of a finite group always divides the group order, a consequence of Lagrange’s Theorem [3]).

If $p$ and $q$ are both odd primes, then $p-1$ and $q-1$ are even, so their least common multiple is at most half of $(p-1)(q-1)$. Mathematically:

$$\lambda(n) = \dfrac{(p-1)(q-1)}{\gcd(p-1,\, q-1)}$$

We can observe that $\lambda(n)$ is less than or equal to $\varphi(n)$ and often considerably smaller. This means $\lambda(n)$ provides the minimal exponent needed for RSA’s correctness, whereas $\varphi(n)$ might be a larger number that still works but isn’t necessary.

When you choose two large random primes $p$ and $q$, you have:

$$\varphi(n) = (p-1)(q-1) \approx n,$$

because for large primes, the subtracted ones make only a small difference compared to $p$ and $q$ themselves.

Now, since both $p-1$ and $q-1 $ are even, they each have a factor of 2. If those are their only common factors (which is often the case for random primes), then:

$$\lambda(n) = \mathrm{lcm}(p-1, q-1) \approx \frac{\varphi(n)}{2}.$$

When you compute the private exponent $d$ as the modular inverse of $e$ (a small number) modulo $ \varphi(n)$ versus modulo $\lambda(n)$, the range from which $d$ is chosen is roughly twice as large in the former case. That means the typical $d$ when computed modulo $\varphi(n)$ can be about twice as large as when computed modulo $\lambda(n)$. A larger $d$ means that during decryption (or signing) the modular exponentiation $c^d \mod n$ takes slightly more time.

Intuitively, using $λ(n)$ ensures we don’t “overshoot” the exponent required for the modular arithmetic to cycle back to 1.

A smaller $d$ makes every RSA decryption and signature operation slightly faster. For instance, if $\lambda(n)$ is roughly half of $\varphi(n)$, then $d$ will typically have one less bit than it would otherwise, which saves about one squaring (and sometimes one multiplication) per exponentiation. The gain is modest, but it is free: we aren’t changing the security assumptions or the key size $n$, just using the mathematically tight value for the exponent. The RSA algorithm’s security is not weakened by this; the resulting $d$ is different but functionally equivalent.
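A toy comparison shows both exponents at work. The primes 61 and 53 and the exponent 17 are assumed values for illustration; here $\gcd(p-1, q-1) = 4$, so $\lambda(n)$ is a quarter of $\varphi(n)$ rather than half.

```python
from math import gcd

p, q, e = 61, 53, 17             # toy values, assumed for illustration
n = p * q
phi = (p - 1) * (q - 1)          # Euler's totient: 3120
lam = phi // gcd(p - 1, q - 1)   # Carmichael function: lcm(60, 52) = 780

d_phi = pow(e, -1, phi)          # 2753
d_lam = pow(e, -1, lam)          # 413

def decrypts_everything(d):
    """Check that c^d mod n recovers m for every message 0 <= m < n."""
    return all(pow(pow(m, e, n), d, n) == m for m in range(n))

print(d_phi, d_phi.bit_length())  # 2753 12
print(d_lam, d_lam.bit_length())  # 413 9
print(decrypts_everything(d_phi), decrypts_everything(d_lam))
```

Both exponents decrypt every message, and the smaller one is simply the larger one reduced modulo $\lambda(n)$. The difference in bit length is what determines how many squarings the exponentiation performs.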

The Carmichael Function in Modern Implementations

The critical property for RSA, $e \cdot d \equiv 1 \pmod{\lambda(n)}$, is both necessary and sufficient for correct decryption, thanks to Carmichael’s theorem. So there’s no need for $d$ to also satisfy the stronger condition modulo $\varphi(n)$.

By switching to computing $d$ modulo $\lambda(n)$ (that is, $d = e^{-1} \bmod \lambda(n)$), we directly get the smallest working private exponent. Ronald Rivest himself noted this optimization in his 1999 seminal paper [4], stating that solving for $d$ using $\lambda(n)$ instead of $\varphi(n)$ is slightly preferable because it can result in a smaller value for $d$.

Over time, the use of $ λ(n)$ in RSA moved from an academic suggestion to an industry standard. Today’s cryptographic standards explicitly acknowledge or require the $λ(n)$ approach.

For example, the official RSA standard (PKCS #1 v2.2, RFC 8017 [2]) defines the RSA key generation in terms of $\lambda(n)$. It specifies that the private exponent $d$ is chosen such that $e \cdot d \equiv 1 \pmod{\lambda(n)}$ (with $\lambda(n) = \operatorname{lcm}(p-1, q-1)$). In other words, PKCS #1 expects the Carmichael function to be used for the modulus of the exponent. Likewise, NIST’s FIPS 186-4 (Digital Signature Standard) mandates that $d$ be less than $\lambda(n)$.

Any RSA key where $d$ is larger than $\lambda(n)$ is considered non-compliant in those strict contexts. This effectively forces implementations to use the smaller $\lambda(n)$-based exponent, since any “oversized” $d$ can be reduced modulo $\lambda(n)$ to meet the criterion.

In short, using $\lambda(n)$ (the least common multiple of $p-1$ and $q-1$) directly produces the smallest valid $d$, whereas using $\varphi(n)$ often results in a $d$ that is larger than necessary. The smaller exponent slightly reduces the number of modular multiplications needed during decryption and signing, and it helps maintain interoperability with standards such as FIPS 186-4 [1] and RFC 8017 [2] and with protocols that expect $d$ to be below a certain size.

The Python cryptography library (PyCA cryptography) explicitly documents [5] that it uses Carmichael’s totient to generate the “smallest working value of $d$,” noting that older implementations (including the original RSA paper) used Euler’s totient and ended up with larger exponents. OpenSSL also uses the Carmichael function in their low-level RSA APIs [6].

This shift to the Carmichael function ensures that under the hood your RSA key is a bit more efficient than the ones from the late 1970s while providing the same level of security.

Issues with Raw RSA

Raw or “Textbook” RSA soon turned out to be insecure when two major weaknesses were discovered.

The operations involved in RSA are entirely deterministic, which means that for a given plaintext $m$, encryption always produces the same cipher text $C = m^e \mod n$.

An eavesdropper or an attacker, say Eve, can guess or derive plain texts by exploiting the predictability of outputs. Since RSA encryption is a public operation, an attacker can encrypt likely messages and compare results to a target cipher text – a trivial chosen plaintext attack.

Besides this, textbook RSA is also malleable. This means that its algebraic structure allows attackers to manipulate ciphertexts in meaningful ways. For instance, given a ciphertext $C = M^e \bmod n$, an attacker can multiply it by the encryption of a known value (say, $r$) to produce a new ciphertext $C' = C \cdot r^e \bmod n$. When the legitimate receiver decrypts $C'$, the result is $M \cdot r \bmod n$. If the attacker learns that result, they can recover $M$ by multiplying it by the inverse of $r$ modulo $n$.

Let’s understand these vulnerabilities with a small practical example.

Exploiting Textbook RSA’s Determinism and Malleability

Key Generation (Setup)

For our toy example, we’ll choose small prime numbers and generate an RSA key pair:

Let’s select the values $p = 3$ and $q = 11$. Both of these values are prime. Now, compute the modulus and totient function as follows:

$$\begin{gather} \begin{split} n = p \times q = 3 \times 11 = 33 \\ \varphi(n) = (p - 1) \times (q - 1) = 2 \times 10 = 20 \end{split} \end{gather}$$

Now choose the public exponent. Let’s consider $e = 3$, since it is coprime with $\varphi(n) = 20$, that is, $\gcd(3, 20) = 1$.

Now let’s compute the private exponent. We know that $d$ is the modular inverse of $e$ modulo $\varphi(n)$. We need to find $d$ such that $d \times e \equiv 1 \pmod{20}$. Using this knowledge, we can compute $d = 7$, as $3 \times 7 = 21 \equiv 1 \pmod{20}$.

Finally, the public key is $(n = 33,\ e = 3)$ and the private key (secret) is $d = 7$.

Encryption Process

Now, let’s encrypt a simple message using the above key. Let us select our plaintext to be $M = 4$. The cipher text in this case would be:

$$\begin{gather} \begin{split} C = 4^3 \bmod 33 \\ C = 64 \bmod 33 \\ C = 64 - 33 \times 1 = 31 \end{split} \end{gather}$$

To consolidate the findings so far, if we encrypt message $4$ with the public key $(e=3, n=33)$, we will produce the cipher text $31$. Now, let’s try the exploits.

Determinism Exploit (Ciphertext Guessing Attack)

Textbook RSA is deterministic – the same plaintext always yields the same ciphertext (with no randomness involved). An attacker who intercepts the ciphertext $C=31$ can exploit this by encrypting likely plaintext guesses and comparing results:

The adversary, say Eve, will try encrypting candidate plaintexts with the public key and see which one produces $31$. They can walk through the message space in order, or try the most likely values first to save effort:

$$\begin{gather} \begin{aligned} \text{Guess } M = 1 \Rightarrow 1^3 \bmod 33 = 1 \\ \text{Guess } M = 2 \Rightarrow 2^3 \bmod 33 = 8 \\ \text{Guess } M = 3 \Rightarrow 3^3 \bmod 33 = 27 \\ \text{Guess } M = 4 \Rightarrow 4^3 \bmod 33 = 31 \end{aligned} \end{gather}$$

By simply comparing ciphertexts, the attacker finds that encrypting $4$ yields 31, which matches the intercepted ciphertext. Thus, the attacker learns the original plaintext $M$ was $4$. This is possible because there’s no randomization in textbook RSA – an eavesdropper can identify a message by trial encryption of guesses, breaking confidentiality if the message space is small or guessable.
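The guessing loop is short enough to write out. This sketch uses the toy public key $(n = 33, e = 3)$ from above; the function name `guess_plaintext` and the choice to scan the whole message space are assumptions for illustration.

```python
def guess_plaintext(ciphertext, n, e, candidates):
    """Encrypt each candidate with the public key and compare to the target."""
    for m in candidates:
        if pow(m, e, n) == ciphertext:
            return m
    return None  # no candidate matched

# Eve only needs the public key (n = 33, e = 3) and the intercepted ciphertext 31.
found = guess_plaintext(31, 33, 3, range(33))
print(found)  # 4
```

With a realistic modulus the full message space is far too large to scan, but the same loop works whenever the set of plausible messages is small, such as "yes"/"no" or a short PIN.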

Malleability Exploit (Ciphertext Manipulation Attack)

Raw RSA is also malleable. This means an attacker can take a ciphertext and modify it in a way that results in a predictable change in the decrypted plaintext. Let’s understand how this works.

RSA has a multiplicative property, that is, multiplying two ciphertexts corresponds to multiplying their plaintexts before encryption:

$$E(M_1) \cdot E(M_2) \mod n = (M_1^e \mod n)\times(M_2^e \mod n) \mod n = (M_1 \cdot M_2)^e \mod n$$

The sequence diagram below explains how the malleability exploit works in naive RSA.

Alice sends a ciphertext to Bob after the initialization phase. Note that by this point, n and e are public knowledge. Eve intercepts this ciphertext by using mechanisms such as a MiTM (Man in the Middle) attack.

Now, Eve picks a known value to manipulate the message. Let’s say the attacker chooses $X = 2$ (with the intent to double the original plaintext).

Then they compute the encryption of X using the public key:

$$E(X) = 2^3 \mod 33 = 8.$$

Now, Eve multiplies the original ciphertext by this value (mod n) to get a new ciphertext:

$$\begin{gather} \begin{split} C' = C \times E(X) \bmod n = 31 \times 8 \bmod 33 \\ C' = 248 \bmod 33 = 248 - 33 \times 7 = 248 - 231 = 17 \end{split} \end{gather}$$

This new ciphertext $C'$ is the encryption of the product of the original plaintext and $2$. If we directly encrypted $M \times X = 4 \times 2 = 8$ with RSA, we would get $8^3 \bmod 33 = 512 \bmod 33 = 17$. This means that $C'$ corresponds to the plaintext $8$, which is the original message $4$ multiplied by $2$.

In a real-world chosen ciphertext attack, the attacker may have access to a decryption oracle or observe a system response that reveals information about the decrypted value $M'$. Here the decryption result $8$ is exactly $M \times 2$ (the original message multiplied by the attacker’s chosen factor). Knowing the factor $X = 2$, the attacker can deduce the original message by dividing: $8 / 2 = 4$. In general the division is modular, that is, a multiplication by the inverse of $X$ modulo $n$.

Note that Eve has not broken the mathematical foundations behind RSA here. They have only used the public key to compute an encryption of $2$, and then combined it with the intercepted ciphertext. They don’t know the original plaintext yet, but they have manipulated the ciphertext in a way that they know the new plaintext is twice the original message.
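The whole exchange fits in a few lines of Python using the toy key from this example. The private exponent appears only in the line that plays the role of the decryption oracle; the variable names are illustrative.

```python
n, e, d = 33, 3, 7        # toy key from the example above

m = 4
c = pow(m, e, n)          # Alice's ciphertext: 31

x = 2                     # Eve's chosen factor
c_prime = (c * pow(x, e, n)) % n   # manipulated ciphertext: 17

# Stand-in for a decryption oracle: the only step that uses the private key.
m_prime = pow(c_prime, d, n)       # 8, that is, m * x mod n

# Eve undoes her factor with a modular inverse (Python 3.8+).
recovered = (m_prime * pow(x, -1, n)) % n
print(c, c_prime, m_prime, recovered)  # 31 17 8 4
```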

Low-Exponent Attacks

Beyond determinism and malleability exploits, textbook RSA is also vulnerable to Low-Exponent Attacks. Using a small public exponent like $e = 3$ (or sometimes $17$) was popular because it speeds up encryption and signature verification. But this soon turned out to be a security concern.

When RSA uses a small public exponent (say, $e = 3$) and the plaintext is very short (so that $M^3$ is smaller than the modulus $n$), the encryption does not “wrap around” modulo $n$. Mathematically:

$$c = M^3 \mod n = M^3 \quad \text{(if $ M^3 < n $)}$$

Let’s understand this with an easy example:

Consider our plaintext to be: $M = 5$. We compute $M^3$ as $M^3 = 5^3 = 125$.

Now assume $n$ is a $4096$‑bit number which is large compared to $125$. In this case, the ciphertext is simply $c = 125$. Eve intercepting $c = 125$ can compute the cube root of $125$ to get the plaintext: $\sqrt[3]{125} = 5$ thus recovering $M$ directly.

This shows that if $M$ is small enough, the ciphertext leaks the plaintext when $e$ is low.
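As a minimal sketch of this leak, the snippet below recovers a short message from an unpadded $e = 3$ ciphertext with an exact integer cube root. The helper name `integer_cube_root` and the stand-in modulus are assumptions for illustration. The modulus is not a real RSA modulus, since only its size matters here.

```python
def integer_cube_root(c: int) -> int:
    """Return the largest integer r with r**3 <= c, using binary search."""
    lo, hi = 0, 1
    while hi ** 3 <= c:          # grow the upper bound until it overshoots
        hi *= 2
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= c:
            lo = mid
        else:
            hi = mid - 1
    return lo

e = 3
n = 1 << 4095                    # stand-in for a 4096-bit modulus (size only)

small_message = 5
c_small = pow(small_message, e, n)             # no wrap-around: c = 125
recovered_small = integer_cube_root(c_small)   # 5

# The same happens for any short message, e.g. an 8-byte string.
text_message = int.from_bytes(b"hi there", "big")
c_text = pow(text_message, e, n)
recovered_text = integer_cube_root(c_text).to_bytes(8, "big")

print(c_small, recovered_small, recovered_text)
```

Floating-point cube roots lose precision for large integers, which is why the sketch uses integer arithmetic throughout.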

Håstad’s Broadcast Attack: Low Exponent Meets Multiple Recipients

In 1985, Johan Håstad described a broadcast attack that illustrates the danger of a low exponent $e$ when the same message is sent to multiple parties.

Imagine Alice wants to send the same plaintext message $M$ to three different recipients. Each recipient has their own RSA public key with moduli $N_1, N_2, N_3$, but for speed all use $e = 3$ (a common practice historically). Alice encrypts $M$ with each public key, yielding ciphertexts:

$$\begin{gather} \begin{split} C_1 = M^3 \bmod N_1 \\ C_2 = M^3 \bmod N_2 \\ C_3 = M^3 \bmod N_3 \end{split} \end{gather}$$

Eve, who intercepts all three ciphertexts $C_1, C_2, C_3$, can recover $M$ without breaking any single RSA key.

Since each $N_i $ is different (and we assume they are pairwise coprime, as RSA keys should be), the attacker can use the Chinese Remainder Theorem (CRT) to combine the three congruences $x \equiv C_i \pmod{N_i}$. Note that at this point Eve only has $C_1$, $C_2$ and $C_3$. They do not have the plaintext $M$ or $M^3$ and yet they can reconstruct $M^3$ with the intercepted data. To understand the Chinese Remainder Theorem and this reconstruction, you may follow this: CRT, RSA, and Low Exponent Attacks | Youtube.

There is a unique solution modulo $N_1N_2N_3$ for $x$, and that solution turns out to be the integer $x = M^3$ (because $M < N_i$ for each $i$, the true integer $M^3$ is smaller than the product $N_1N_2N_3$). In essence, CRT lets Eve reconstruct $M^3$ exactly. Once they have $M^3$ as an ordinary integer, they simply take the cube root to find $M$. There’s no need to factor any modulus or invert the RSA function: the math falls out due to the low exponent.

The sequence diagram below aims to provide a high-level understanding of the attack:

Now let’s see this attack in action with a sample:

Suppose three different RSA public keys all use exponent $e=3$, with moduli $n_b = 187$ (for Bob), $n_c = 115$ (for Carol), and $n_d = 87$ (for Dave).

These $n_i$ are pairwise coprime ($gcd$ of each pair is $1$). Now assume the same plaintext message $M$ is encrypted with each public key. Let’s take a concrete $M$. For example with $M=42$, we will have:

$$\begin{gather} \begin{split} c_b = M^3 \bmod n_b \\ c_c = M^3 \bmod n_c \\ c_d = M^3 \bmod n_d \\ \end{split} \end{gather}$$

On calculating these, we have:

$$\begin{gather} \begin{split} c_b = 42^3 \bmod 187 = 36 \\ c_c = 42^3 \bmod 115 = 28 \\ c_d = 42^3 \bmod 87 = 51 \\ \end{split} \end{gather}$$

So the three ciphertexts observed are $36$, $28$, and $51$, respectively. Eve who knows $n_b, n_c, n_d$ and these ciphertexts can now recover $M$ as follows:

Eve will compute the total modulus $N = n_b \cdot n_c \cdot n_d = 187 \times 115 \times 87 = 1,870,935$ (this is the modulus for the combined system of congruences).

Now Eve will compute the partial products for each congruence:

$$\begin{gather} \begin{split} N_b = \frac{N}{n_b} = \frac{1,870,935}{187} = 10,005 \\ N_c = \frac{N}{n_c} = \frac{1,870,935}{115} = 16,269 \\ N_d = \frac{N}{n_d} = \frac{1,870,935}{87} = 21,505 \end{split} \end{gather}$$

At this point, Eve needs the inverses of each $N_i$ modulo its corresponding $n_i$:
- First Eve computes $M_b = (N_b)^{-1} \bmod n_b$, i.e. the number $M_b$ such that $N_b \cdot M_b \equiv 1 \pmod{187}$. In this case, $N_b = 10005$. Using the extended Euclidean algorithm, Eve can find $M_b = 2$ (since $10005 \times 2 = 20010 \equiv 1 \pmod{187}$).
- Then Eve computes $M_c = (N_c)^{-1} \bmod n_c$. Here $N_c = 16269$. The inverse mod $115$ turns out to be $M_c = 49$ (For verification: $16269 \times 49 \equiv 1 \pmod{115}$).
- Next up, Eve computes $M_d = (N_d)^{-1} \bmod n_d$. For $N_d = 21505$, the inverse mod $87$ is $M_d = 49$ as well (coincidentally the same value in this case, since $21505 \times 49 \equiv 1 \pmod{87}$).

Now Eve reconstructs the combined value using the Chinese Remainder Theorem for three congruences. The construction of this formula is beyond the scope of this handbook, but to completely understand how this springs into action, you may go through this video: CRT, RSA and Low Exponent Attacks | YouTube.

$$C \;=\; c_b \cdot N_b \cdot M_b \;+\; c_c \cdot N_c \cdot M_c \;+\; c_d \cdot N_d \cdot M_d \pmod{N}$$

On substituting the numbers:

$$C = 36 \cdot 10005 \cdot 2 \;+\; 28 \cdot 16269 \cdot 49 \;+\; 51 \cdot 21505 \cdot 49 \pmod{1,870,935}$$

Let’s carefully evaluate each term:

$$\begin{gather} \begin{split} 36 \cdot 10005 \cdot 2 = 720,360 \\ 28 \cdot 16269 \cdot 49 = 22,321,068 \\ 51 \cdot 21505 \cdot 49 = 53,740,995 \\ \end{split} \end{gather}$$

Summing these gives a raw total of $720,360 + 22,321,068 + 53,740,995 = 76,782,423$. Now reduce this modulo $N = 1,870,935$. Since $41 \times 1,870,935 = 76,708,335$, the remainder is $76,782,423 - 76,708,335 = 74,088$:

$$\begin{align} \begin{split} C \equiv 76,782,423 \pmod{1,870,935}\\ C = 74,088 \\ \end{split} \end{align}$$

Now Eve will simply take the cube root of $C$: $\sqrt[3]{74088} = 42$, which is the original plaintext. Eve has successfully recovered $M$.
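The worked example above can be checked with a short script. This is a sketch under the same toy values ($e = 3$, moduli $187$, $115$, $87$, and $M = 42$). The function names `crt` and `integer_cube_root` are illustrative helpers, not part of any library.

```python
from math import prod

def crt(residues, moduli):
    """Combine x = c_i (mod n_i) for pairwise coprime n_i into x mod N."""
    N = prod(moduli)
    total = 0
    for c_i, n_i in zip(residues, moduli):
        N_i = N // n_i                  # partial product, e.g. N_b = N / n_b
        M_i = pow(N_i, -1, n_i)         # inverse of N_i modulo n_i (Python 3.8+)
        total += c_i * N_i * M_i
    return total % N

def integer_cube_root(c: int) -> int:
    """Largest integer r with r**3 <= c."""
    lo, hi = 0, 1
    while hi ** 3 <= c:
        hi *= 2
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= c:
            lo = mid
        else:
            hi = mid - 1
    return lo

e = 3
moduli = [187, 115, 87]                 # n_b, n_c, n_d
secret = 42                             # known only to Alice and the recipients

ciphertexts = [pow(secret, e, n) for n in moduli]   # what Eve intercepts
m_cubed = crt(ciphertexts, moduli)                  # M^3 as a plain integer
recovered = integer_cube_root(m_cubed)

print(ciphertexts, m_cubed, recovered)  # [36, 28, 51] 74088 42
```

Eve's side of the script uses only the public moduli and the three ciphertexts. The variable `secret` appears only to generate the ciphertexts.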

The key takeaway from these attacks is that, without proper defenses, RSA alone does not satisfy modern definitions of security. It is not resistant to chosen-plaintext or chosen-ciphertext attacks. This gap between the theoretical one-way function (RSA’s trapdoor permutation) and a secure encryption scheme became evident as implementers found that naive RSA could be “broken” by various clever tricks.

To counter these weaknesses, standards bodies introduced padding schemes to strengthen RSA encryption. In the following sections, you will learn about each of these padding schemes and how they’ve been exploited over the years.

Introduction to Padding Schemes in RSA

Before we dive into the padding schemes and how they help our case, let’s quickly recap the need for padding in RSA.

Textbook RSA encryption is deterministic. The same plaintext always produces the same ciphertext under a given public key. This determinism makes raw RSA insecure. An attacker can guess possible messages, encrypt them with the public key, and compare with the target ciphertext to see which guess matches.
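A guess-and-compare attack of this kind needs nothing but the public key. The sketch below reuses the toy public key $(n, e) = (33, 3)$ and the ciphertext $31$ from the earlier example. The candidate list is an assumption standing in for whatever small message space the attacker suspects.

```python
# Public information only: the toy public key and an intercepted ciphertext.
n, e = 33, 3
target_ciphertext = 31

# Hypothetical message space: Eve suspects the plaintext is a small number.
candidate_messages = range(n)

# Deterministic encryption lets Eve encrypt each guess and compare.
matches = [guess for guess in candidate_messages
           if pow(guess, e, n) == target_ciphertext]

print(matches)  # [4]
```

With randomized padding, the same plaintext would encrypt to many different ciphertexts, so this comparison would no longer identify the message.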

Beyond determinism, small-exponent attacks illustrate why padding is critical. If the message $m$ is too small relative to the modulus, raising it to a small public exponent (like $e=3$) might not wrap around $N$. Padding the plaintext with random data before encryption remedies these problems by making the ciphertext unpredictable and ensuring $m^e$ spans the modulus’ range.

Public Key Cryptography Standards (PKCS#1 v1.5)

PKCS#1 v1.5 was published by RSA Laboratories in 1993 and republished by Kaliski as RFC 2313 in 1998 [7]. In PKCS#1 v1.5, every RSA‐encrypted message is wrapped inside a special “encryption block” $EB$. This block ensures that the raw message is both the right size for RSA and padded in a way that’s hard to tamper with.

In this scheme, the plaintext is padded to the size of the modulus $N$ (in bytes) as:

$$EB = 00 ~||~ BT ~||~ PS ~||~ 00 ~||~ M$$

Here, $0x00$ (Leading Zero Byte) is always at the front. It ensures that, when the concatenated string $EB$ is converted to a big‐endian integer, the value is less than the RSA modulus (that is, we don’t end up with a number too large for RSA to handle). You will better appreciate this fact when we dive into the mathematics behind this.

The next octet is the Block Type, $BT$, which tells us the “type” of padding being used. The standard defines three possible $BT$ values, $00$, $01$, and $02$, to support different operations. For example, $BT = 00$ and $BT = 01$ are used for private-key operations (such as digital signatures) and $BT = 02$ is used for public-key operations. For encryption under PKCS#1 v1.5, this is always $0x02$. It’s basically a label that says, “This is an encryption block, not something else”.

The next block is the Padding String $PS$. This is a string of nonzero random bytes. This is crucial for security because it introduces randomness into each encryption. If the same message is encrypted multiple times, these random bytes ensure that each ciphertext looks different, foiling many simple attacks that rely on seeing repeated patterns.

The next octet, $0x00$, is a Delimiter. This single zero byte marks the end of the padding. During decryption, this helps the recipient quickly identify where the padding stops and the real message begins.

Finally, we have the actual data you want to protect – $M$. Once the recipient has verified the padding, they know exactly where to find this message.

This mechanism helped solve the deterministic issue of naive RSA. In the next sections, let’s understand the mathematics involved in PKCS#1 v1.5 padding and its security implications.

The Mathematics Behind PKCS#1 v1.5

Before we begin, let’s get our symbols and abbreviations correct. We will use upper-case symbols (such as $EB$) to denote octet strings and bit strings. We will use lower-case symbols (such as $n$) to denote integers.

In PKCS#1 v1.5, we will use $k$ to represent the length of the RSA modulus $n$ in bytes. For example, if you have a $1024$-bit RSA key, then the RSA modulus $n$ is a $1024$-bit number. Since there are $8$ bits in a byte, if your RSA modulus is $L$ bits long, then:

$$k = \left\lceil \frac{L}{8} \right\rceil = \frac{1024}{8} = 128 \text{ bytes}$$

The total length of the encryption block will be equal to this RSA key length $k$ (in bytes). The length of the data $M$ must not exceed $k-11$ octets, since $11$ bytes are consumed by the surrounding structure $0x00 ~||~ 0x02 ~||~ PS ~||~ 0x00$: three fixed bytes plus at least eight bytes of $PS$. This limitation guarantees that the length of the padding string $PS$ is at least eight octets, which is a security condition in PKCS#1 v1.5. The length of $PS$ is:

$$|PS| = k - |M| - 3$$

For example, with a $1024$-bit RSA modulus, the value of $k$ comes out to be $128$. Here Alice could encrypt up to $128 - 11 = 117$ bytes of data. The $11$ bytes are used for the $0x00 ~||~ 0x02 ~||~ PS ~||~ 0x00$ structure. The random $PS $ ensures that each encryption of the same message produces a different ciphertext, preventing the deterministic encryption problem.

RSA doesn’t directly operate on the bytes. Once the padded string $EB$ is ready, it needs to be converted into an integer guided by the Octet String to Integer Primitive (OS2IP) formula:

$$x = \sum_{i=1}^{k} 2^{8(k - i)} \,\mathrm{EB}_i$$

where $EB_i$ are the octets of $EB$ from first to last. In other words, $EB_1$ (the first byte) is the most significant byte, and $EB_k$ (the last byte) is the least significant. Now Alice can simply encrypt this block using $C = x^e \bmod n$.
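The padding layout and the OS2IP conversion described above can be sketched in Python as follows. The function names `pkcs1_v15_pad`, `pkcs1_v15_unpad`, and `os2ip` are illustrative, not a library API, and the unpadding routine is deliberately naive: it is not constant-time and should not be used in production.

```python
import os

def pkcs1_v15_pad(message: bytes, k: int) -> bytes:
    """Build EB = 00 || 02 || PS || 00 || M for a k-byte modulus."""
    if len(message) > k - 11:
        raise ValueError("message too long")
    ps_len = k - len(message) - 3
    ps = bytearray()
    while len(ps) < ps_len:
        # PS must contain only nonzero random bytes, so zero bytes are dropped.
        ps.extend(b for b in os.urandom(ps_len - len(ps)) if b != 0)
    return b"\x00\x02" + bytes(ps) + b"\x00" + message

def pkcs1_v15_unpad(eb: bytes) -> bytes:
    """Check the EB structure and return M (naive, not constant-time)."""
    if len(eb) < 11 or eb[:2] != b"\x00\x02":
        raise ValueError("invalid padding")
    separator = eb.find(b"\x00", 2)
    if separator < 10:           # no delimiter found, or PS shorter than 8 bytes
        raise ValueError("invalid padding")
    return eb[separator + 1:]

def os2ip(octets: bytes) -> int:
    """Octet string to integer, most significant byte first."""
    return int.from_bytes(octets, "big")

k = 128                          # 1024-bit modulus
message = b"attack at dawn"
eb = pkcs1_v15_pad(message, k)
x = os2ip(eb)                    # the integer that RSA would raise to e mod n

print(len(eb), eb[:2].hex(), pkcs1_v15_unpad(eb))
```

Because $PS$ is random, calling `pkcs1_v15_pad` twice on the same message yields two different blocks, and therefore two different ciphertexts.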

To solidify our learnings so far, let’s apply this to a sample plaintext and find the padded blocks.

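Let’s assume the RSA modulus is $8$ bytes long ($k=8$). Suppose we want to encrypt a message $M$ that is $2$ bytes long. (A block this small cannot satisfy the eight-octet minimum for $PS$. We use it only to illustrate the layout.) Then the padding string $PS$ must fill the remaining space: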

$$Total ~ bytes=k=8=1(0x00)+1(BT)+∣PS∣+1(delimiter)+∣M∣$$

Since $|M| = 2$ and there are $3$ fixed bytes, we can find the required length of the padding string:

$$∣PS∣=8−3−2=3 ~ bytes$$

Let’s pick 3 arbitrary nonzero bytes for $PS$, say - $0xA3, ~0x5F, ~0xC2$. And let’s say the message is the ASCII text “Hi”. In hexadecimal, that’s: $0x48$ for 'H' and $0x69$ for 'i'.

Thus, the complete encryption block becomes:

$$EB = 00 ~||~ 02 ~||~ A3 ~||~ 5F ~||~ C2 ~||~ 00 ~||~ 48 ~||~ 69$$

Now we will convert this octet string to an integer using the OS2IP formula we discussed above:

$$x = \sum_{i=1}^{k} 2^{8(k - i)} \,\mathrm{EB}_i$$

For our example, with $k=8$ the conversion is:

$$x= 0x00×256^7+0x02×256^6+0xA3×256^5+0x5F×256^4+0xC2×256^3+0x00×256^2+0x48×256^1+0x69×256^0$$

Note that the hexadecimal values can be converted to decimal as needed. For instance, $0xA3 = 163, 0x5F = 95, 0xC2 = 194, 0x48 = 72,$ and $0x69 = 105$.

There is an interesting observation in the application of this formula. Because the first two bytes are fixed ($0x00$ and $0x02$), the integer $x$ has a known lower bound. The contribution of the first two bytes is:

$$0×256^ 7 +2×256^ 6 =2×256^ 6$$

The rest of the bytes ($PS$, the delimiter, and $M$) add some value that is at least $0$ and at most just less than $256^6$ (since the second byte is fixed as $0x02$ and cannot be $0x03$). Thus, $x$ is in the range:

$$2×256 ^ 6 ≤x<3×256 ^ 6$$

This property, which makes the range predictable, paved the way for the Bleichenbacher attack (also known as the “padding oracle” attack). If a system reveals whether a decrypted block is “correctly padded,” an attacker can systematically probe different ciphertexts and narrow down the plaintext, because the attacker knows it must lie in that narrow range. Let’s take a detailed look at the Bleichenbacher attack in the next sections and understand how the exploit works.
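The $k = 8$ worked example and its range bound can be verified numerically. This sketch uses the same bytes as the example: the arbitrary padding bytes $0xA3, 0x5F, 0xC2$ and the message “Hi”.

```python
# Encryption block from the worked example: 00 || 02 || PS || 00 || "Hi"
eb = bytes([0x00, 0x02, 0xA3, 0x5F, 0xC2, 0x00, 0x48, 0x69])
k = len(eb)

# OS2IP written out as the sum from the text: EB_1 is the most significant byte.
x = sum(byte * 256 ** (k - i) for i, byte in enumerate(eb, start=1))

# Python's built-in big-endian conversion gives the same integer.
assert x == int.from_bytes(eb, "big")

lower, upper = 2 * 256 ** 6, 3 * 256 ** 6
print(hex(x))                 # 0x2a35fc2004869
print(lower <= x < upper)     # True
```

Whatever nonzero bytes are picked for $PS$, the fixed $0x00 ~||~ 0x02$ prefix keeps $x$ inside this interval.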

The Bleichenbacher Attack

In 1998, Daniel Bleichenbacher published a seminal paper [8] demonstrating an adaptive chosen-ciphertext attack against RSA with PKCS#1 v1.5 padding. The Bleichenbacher Attack, also dubbed as the “million messages” attack, demonstrated that if an attacker has access to an oracle that tells whether a submitted ciphertext decrypts to a properly padded plaintext (that is, whether the PKCS#1 v1.5 formatting is correct), the attacker can gradually recover the full plaintext. Let’s break down how this attack works:

First, Eve needs an oracle. The attack assumes the attacker can query a system, such as an SSL/TLS server, and find out if a given ciphertext $C$ is PKCS#1 v1.5 conformant. In the 1998 paper, Bleichenbacher exploited the fact that such a server, when presented with an RSA-encrypted premaster secret whose padding was wrong, would respond with a specific error alert. Essentially, the server acted as an oracle: it would decrypt $C$ with its private key and simply tell the attacker “padding OK” or “padding error” (the error could be timing-based or an explicit alert).

Note that the oracle does not reveal the plaintext. It only reveals a single bit of information at a time: “valid padding or not.” This might seem harmless, but Bleichenbacher showed that it’s enough to eventually recover the plaintext.

To quickly recap, the attacker’s goal is to find the unknown message integer $m$ (the PKCS#1-padded plaintext as an integer) given its ciphertext $C = m^e \bmod N$, using the oracle. We know that if $m$ is properly padded, it lies in a specific numeric range: $2B \le m < 3B$, where $B = 2^{8(k-2)}$ and $k$ is the modulus length in bytes. This generalizes the range $2 \times 256^6 \le x < 3 \times 256^6$ from the $k = 8$ example.

If $k=128$ bytes, then $B=2^{8*126}$, and a correctly padded $m$ will start with $0x00 ~||~0x02$, so it’s between $2B$ and $3B$. The attacker, Eve, initially only knows that $m$ is in the range $[2B, 3B)$.

In the Bleichenbacher Attack, Eve will exploit RSA’s multiplicative property. They will choose a number $s$ (called the multiplier) and compute a new ciphertext $C' = (C s^e) \bmod N$. This $C'$ here corresponds to a new plaintext: $m' = m s \bmod N$ (because $C' \equiv m^e * s^e \equiv (ms)^e \pmod{N}$).

To begin the attack, Eve needs a starting ciphertext $C_0 = C \cdot (s_0)^e \bmod N$ that yields a valid padding. This is referred to as the blinding step. If $C$ is an intercepted ciphertext that is already correctly padded, Eve can simply take $s_0 = 1$. Otherwise, Eve tries random values of $s_0$ until the oracle accepts $C_0$. The attacker does not know $m$ and cannot verify this directly. They rely on the padding oracle’s yes/no response to infer that the blinded plaintext $(m \times s_0) \bmod N$ falls in the correct range.

If the oracle returns “valid padding” for a given $s_0$, it tells the attacker that $(m \times s_0) \bmod N$ lies between $2B$ and $3B$. Mathematically:

$$2B \le (m \times s_0) \bmod N < 3B$$

Now, Eve will try to narrow down this range in a loop, which is often referred to as the interval halving step. Initially, Eve has one wide interval $[a, b] = [2B, 3B - 1]$ that contains $m$. In each iteration, Eve tries increasing values of $s$ (starting from a certain minimum) until the oracle returns “padding OK” for $C' = C_0 \cdot s^e \bmod N$. Suppose this happens at some $s = s_i$. Given this feedback, Eve now knows:

$$2B \le (m \times s_i) \bmod N < 3B$$

This congruence implies there exists some integer $r$ such that:

$$2B ≤ ( m×s_i)−rN < 3B$$

Rearranging, we get a constraint on $m$:

$$\frac{2B+rN}{s_i} ≤ m < \frac{3B+rN}{s_i}$$

Eve doesn’t know $r$ outright, but they can solve for the possible range of $r$ by considering the current interval $[a,b]$ for $m$. Essentially, Eve uses the previous bounds on $m$ to guess which $r$ would make the inequality true, then updates the new bounds $[a, b]$ as the intersection of all possible solutions for $m$. This dramatically shrinks the interval.

Each successful oracle query yields such a constraint. Eventually, the interval $[a,b]$ collapses to a single value, $[a,a]$, where $a$ is the blinded plaintext $(m \times s_0) \bmod N$. Eve then removes the blinding factor to find the plaintext:

$$m = (a \times s_0^{-1}) \bmod N$$

At that point, Eve has recovered the entire padded plaintext $m$, and by stripping off the padding, the original message itself.
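To make the narrowing step concrete, here is a minimal sketch of a single iteration against a toy key. Everything about the setup is an assumption for illustration: the primes, the exponent $e = 17$, the secret plaintext, and a simplified oracle that only checks the leading $0x00 ~||~ 0x02$ bytes. It takes $s_0 = 1$ because the intercepted ciphertext is already conformant, and it is not a full implementation of the attack.

```python
def ceil_div(x: int, y: int) -> int:
    return -(-x // y)

# Toy RSA key (assumed values, far too small for real use).
p, q, e = 65521, 65537, 17
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))     # private exponent, known only to the server
k = (N.bit_length() + 7) // 8         # modulus length in bytes (4 here)
B = 2 ** (8 * (k - 2))

def oracle(c: int) -> bool:
    """Simplified padding oracle: accepts if the plaintext starts with 00 02."""
    return 2 * B <= pow(c, d, N) < 3 * B

def narrow(intervals, s):
    """Intersect each [a, b] with the constraint 2B <= m*s - r*N <= 3B - 1."""
    result = []
    for a, b in intervals:
        r_low = ceil_div(a * s - 3 * B + 1, N)
        r_high = (b * s - 2 * B) // N
        for r in range(r_low, r_high + 1):
            low = max(a, ceil_div(2 * B + r * N, s))
            high = min(b, (3 * B - 1 + r * N) // s)
            if low <= high:
                result.append((low, high))
    return result

m = 2 * B + 0x1234                    # secret padded plaintext (victim's side)
c = pow(m, e, N)                      # ciphertext intercepted by Eve

# Eve searches for the first multiplier s that the oracle accepts.
s = ceil_div(N, 3 * B)
while not oracle(c * pow(s, e, N) % N):
    s += 1

intervals = narrow([(2 * B, 3 * B - 1)], s)
print(s, intervals)
```

With a modulus this small, one accepted multiplier already shrinks the candidate set to a handful of values. With realistic key sizes the loop repeats many times, and the full attack adds extra rules for choosing the next $s$ that this sketch omits.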

The sequence diagram below consolidates our learning of the attack:

The Bleichenbacher attack showed that the format of the padding in PKCS#1 v1.5 leaked just enough information for an attacker to carry out the equivalent of a private-key operation (decrypting the message) without ever factoring $N$. The attack leveraged the fact that it’s possible to craft ciphertexts that will decrypt to a valid-looking plaintext without knowing the plaintext. In essence, PKCS#1 v1.5 padding gave a random blob roughly a $1$ in $2^{16}$ chance of appearing as “valid padding.” That was enough for an adaptive attack to succeed with a feasible number of queries.

This is precisely what later padding designs like OAEP fixed. OAEP’s design makes such random valid ciphertexts astronomically unlikely (plaintext aware). We will learn about RSA-OAEP in the next sections.

To mitigate the Bleichenbacher attack without immediately changing the padding scheme, practitioners implemented defensive measures. For example, TLS implementations were told to treat all decryption failures the same way (so an attacker can’t distinguish padding errors from other errors), and servers would generate a fake premaster secret on padding failure to continue the handshake and avoid timing leaks. Nonetheless, the safest course has been to deprecate PKCS#1 v1.5 encryption in favor of schemes like RSA-OAEP.

Optimal Asymmetric Encryption Padding (OAEP)

By the end of 1995, Bellare and Rogaway proposed Optimal Asymmetric Encryption Padding (OAEP) with the goal of achieving provable security. This padding aimed to make RSA encryption resistant not just to passive attacks but also to adaptive chosen-ciphertext attacks. In other words, even if an attacker can trick a system into decrypting chosen ciphertexts (as an “oracle”), they should learn nothing useful about the plaintext. OAEP was subsequently standardized in PKCS#1 v2.0 (published as RFC 2437 in 1998) and later versions.

Compared to PKCS#1 v1.5, OAEP has a more complex encoding that uses hash functions and a mask generation function (MGF) to thoroughly randomize the plaintext before RSA encryption, providing stronger guarantees.

OAEP’s design can be viewed as a two-layer Feistel-like network using a random seed. It takes the input message and randomizes it in a way that is reversible only with the correct seed. The scheme was proven plaintext-aware in the random oracle model which means that an adversary cannot concoct a valid ciphertext without knowing the corresponding plaintext. If an attacker tries to forge or tamper with ciphertexts, they almost surely produce an invalid padding that will be rejected. This property directly counters padding-oracle attacks.

OAEP (with a proper hash/MGF) is semantically secure against adaptive chosen ciphertext attacks, assuming RSA is hard to invert and treating the hash functions as random oracles. Unlike PKCS#1 v1.5, which lacked a formal proof, OAEP comes with a proof sketch that breaking RSA-OAEP is as hard as breaking RSA itself.

In practice, this means OAEP drastically reduces the risk of any padding oracle: an attacker can no longer easily find ciphertexts that slip through the padding check except by brute force which has a $2^{-hLen*8}$ success probability. For example, the success probability with SHA-1 would be $2^{-160}$.

The block diagram below is a visual representation of the OAEP encoding schema:

Let’s understand what these mathematical notions mean and the workings of RSA-OAEP, up next.

The Mathematics Behind OAEP

Optimal Asymmetric Encryption Padding requires a hash function for two operations we will discuss in this section. We will use SHA-1 as the hash function in our walkthrough, and $hLen$ denotes the length in octets of the hash function output. We will later explain why SHA-1, or even MD5, remains an acceptable choice inside OAEP even though neither is collision resistant.

Before we dive into the mathematics, let’s recap a few notations and define the main pieces we’ll be using:

In RSA, $N$ is the modulus, and $k$ is the size of $N$ in bytes. For a $2048$-bit modulus, $k=256$ bytes.

$M$ is the message or plaintext to be encrypted. This plaintext must be short enough to fit into the padded block (at most $k−2⋅hLen−2$ bytes).

In our notation, $Hash$ refers to the cryptographic hash function (for example, SHA-1, SHA-256) of output length $hLen$. For example: If using SHA-1, $hLen=20$ bytes.

We will also use an optional string associated with the message (often empty). This is the Label $L$. If this label is empty, its hash is a fixed value. (For example: the SHA-1 of an empty string).

The hash of this label $L$ is represented by $lHash$, where $lHash=Hash(L)$. As mentioned earlier, if $L$ is empty, $lHash$ is simply $Hash('')$. This means that in any case $lHash$ will hold a value.

We will also use a Mask Generation Function, $MGF$, which is often mentioned as $MGF1$. This function takes an input (seed or masked data) and produces an output of a specified length by iterating the underlying hash function. We’ll write $MGF(input,length)$ to indicate “generate a mask of $length$ bytes from $input$”.
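As a rough sketch, $MGF1$ can be written in a few lines of Python with `hashlib`. It hashes the input together with a 4-byte big-endian counter and concatenates the digests until enough bytes are available. The function name `mgf1` and the default choice of SHA-1 are assumptions made to match this walkthrough.

```python
import hashlib

def mgf1(seed: bytes, length: int, hash_func=hashlib.sha1) -> bytes:
    """Generate `length` mask bytes from `seed` by hashing seed || counter."""
    output = b""
    counter = 0
    while len(output) < length:
        # The counter is encoded as a 4-byte big-endian integer.
        output += hash_func(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return output[:length]

mask = mgf1(b"example seed", 50)
print(len(mask), mask.hex())
```

The same input always produces the same mask, which is what allows the recipient to regenerate and remove the masks during decryption.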

Now that you are familiar with all the necessary notations, we are ready to begin the encoding step.

Step 1: Constructing the Data Block (DB)

We will compute $lHash=Hash(L)$. If $L$ is empty, $lHash$ is a constant (For example, the SHA-1 of the empty string).

Next, we form the padding string $PS$. Its length is chosen so that the entire block $DB$ has length $(k−hLen−1)$ bytes. Numerically, $PS$ consists of $(k−mLen−2⋅hLen−2)$ bytes of $0x00$, where $mLen$ is the length of the message $M$.

Now we simply concatenate the blocks to generate the octet string for the Data Block ($DB$):

$$DB=lHash~∣∣~PS~∣∣~0x01~∣∣~M$$

Here the single byte $0x01$ acts as a delimiter which marks where the zero padding ends and the actual message $M$ begins. $DB$ ends up being $(k−hLen−1)$ bytes.

Step 2: Generating a Mask for the Data Block

First, we pick a random string called $seed$ of length $hLen$ bytes. For example, when using SHA-1 where $hLen=20$, then we say that the seed consists of $20$ random bytes.

Now we use the mask generation function, $MGF$, on the $seed$ to create a mask the same length as $DB$:

$$dbMask=MGF(seed,k−hLen−1)$$

The idea is to spread the randomness of the seed across the entire $DB$.

Step 3: Mask the Data Block

Now, we will combine $DB$ and $dbMask$ with the bitwise $XOR$ operation:

$$maskedDB=DB \oplus dbMask$$

This step “scrambles” $DB$ with the random seed.

Step 4: Generate a Mask for the Seed

Next, we will produce a mask for the seed itself, based on $maskedDB$:

$$seedMask=MGF(maskedDB,hLen)$$

This step simply ensures that the seed is not left in the clear.

Step 5: Mask the Seed

Now we will combine the original seed and the new mask with an $XOR$ operation:

$$maskedSeed=seed \oplus seedMask$$

Now the seed is also “scrambled” by the data block.

Step 6: Form the Final Encoded Message (EM)

We are now ready to build our final block. Simply concatenate everything into a $k$-byte string:

$$EM=0x00~∣∣~maskedSeed~∣∣~maskedDB$$

The leading $0x00$ byte ensures that when $EM$ is interpreted as an integer, it’s less than the RSA modulus $N$. At this point, $EM$ is your OAEP-padded message of length $k$.

Step 7: Convert Concatenated String to Integer

Remember from our discussion before on PKCS#1v1.5 that RSA cannot directly operate on this concatenated string of bytes. We need to convert the $EM$ block to a non-negative integer using the OS2IP formula:

$$x = \sum_{i=1}^{k} 2^{8(k - i)} \,\mathrm{EM}_i$$

Step 8: Perform RSA Encryption

Now that we have the encoded message ($EM$) as an integer $x$, we are ready to perform RSA guided by the formula:

$$C =x^e \bmod N$$

where $(e,N)$ is the public key. The thus computed $C$ is our ciphertext generated using RSA-OAEP.

When decrypting, the process is reversed: the recipient uses their private key $d$ to compute $x = C^d \bmod N$, converts $x$ back into the $k$-byte string $EM$, then splits it into the $0x00$ byte, $maskedSeed$, and $maskedDB$, and uses the same $MGF$ and hash function to unravel the $XORs$ in reverse order. Finally, they check that the recovered $lHash'$ matches the expected hash and that the block contains the proper structure ($...||0x01||...$).

If any check fails, the padding is invalid. Only if everything checks out is the message $M$ returned. The result is that an invalid ciphertext will almost always be detected and rejected without giving an attacker any useful information.
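The encoding steps and the reverse decoding can be sketched end to end in Python. This is an illustrative sketch with SHA-1 and hypothetical function names (`oaep_encode`, `oaep_decode`). It covers only the padding layer, leaves out the RSA exponentiation, and is not constant-time, so it should not replace a vetted library.

```python
import hashlib
import os

H_LEN = hashlib.sha1().digest_size            # hLen = 20 for SHA-1

def mgf1(seed: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha1(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def oaep_encode(message: bytes, k: int, label: bytes = b"") -> bytes:
    if len(message) > k - 2 * H_LEN - 2:
        raise ValueError("message too long")
    l_hash = hashlib.sha1(label).digest()
    ps = b"\x00" * (k - len(message) - 2 * H_LEN - 2)
    db = l_hash + ps + b"\x01" + message                 # Step 1
    seed = os.urandom(H_LEN)                             # Step 2
    masked_db = xor(db, mgf1(seed, k - H_LEN - 1))       # Steps 2-3
    masked_seed = xor(seed, mgf1(masked_db, H_LEN))      # Steps 4-5
    return b"\x00" + masked_seed + masked_db             # Step 6

def oaep_decode(em: bytes, k: int, label: bytes = b"") -> bytes:
    if len(em) != k or k < 2 * H_LEN + 2:
        raise ValueError("decryption error")
    first, masked_seed, masked_db = em[0], em[1:1 + H_LEN], em[1 + H_LEN:]
    seed = xor(masked_seed, mgf1(masked_db, H_LEN))
    db = xor(masked_db, mgf1(seed, k - H_LEN - 1))
    separator = db.find(b"\x01", H_LEN)
    valid = (first == 0
             and db[:H_LEN] == hashlib.sha1(label).digest()
             and separator != -1
             and not any(db[H_LEN:separator]))           # PS must be all zeros
    if not valid:
        raise ValueError("decryption error")             # one generic error
    return db[separator + 1:]

k = 128                                                  # 1024-bit modulus
em = oaep_encode(b"Hi", k)
x = int.from_bytes(em, "big")                            # Step 7 (OS2IP)
print(len(em), oaep_decode(em, k))
```

Raising the same generic error for every failed check mirrors the point above: the recipient should not tell the sender which check failed.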

By design, OAEP effectively foiled the padding oracle problem. The chance that a random guess produces a valid OAEP encoding is negligible (on the order of $2^{-hLen*8}$). In fact, Daniel Bleichenbacher (after breaking PKCS#1 v1.5) advocated for exactly such a “plaintext-aware” padding where forging a valid padding is infeasible.

Why SHA-1 or MD5 Are Safe in RSA-OAEP

Earlier in the section above, we mentioned that we’d be using SHA-1 for our mathematical formulation and examples. When you see SHA-1 or MD5 used in the context of RSA-OAEP, don’t let the fact that these hash functions are considered broken for collision resistance alarm you. If you notice carefully in the previous section, the hash functions serve two very specific roles that do not rely on their collision resistance. Let’s break them down one by one:

Label Hashing

The hash function is used to compute a fixed-length hash of an optional label $L$ (often empty).

Now let’s see why this is safe in this context. This hash, called $lHash$, acts as a domain separator. Its job is simply to ensure that the label is correctly associated with the ciphertext during decryption. As long as the label is chosen wisely (that is, not built from adversary-controlled parts), collision resistance isn’t critical here.

Mask Generation Function (MGF1)

The hash function is also used inside $MGF1$ to create a pseudorandom mask. This mask is applied both to the data block $DB$ and to the random seed used in the encoding process.

In this context, the hash function is treated as a random oracle. The job is to spread the randomness of the seed across a larger block of data. For this purpose, properties like length extension or collision resistance are not relevant. What matters is that the output appears random, and even SHA-1 or MD5 can deliver that when used in this controlled, fixed-input scenario.

Adoption in Cryptographic Libraries (PKCS#1 v1.5 vs OAEP)

After the Bleichenbacher attack, standards and libraries migrated to OAEP or at least added support for it, while treating PKCS#1 v1.5 as a legacy option. Modern cryptographic libraries and protocols reflect these lessons.

In 1998, the RSA standard was updated. PKCS#1 v2.0 introduced RSAES-OAEP as the new recommended encryption scheme, and by PKCS#1 v2.1 and v2.2 (RFC 3447 and RFC 8017), OAEP is required for new applications, with PKCS#1 v1.5 included only for backward compatibility.

OpenSSL discourages users from using PKCS#1 v1.5 as it leaks information that can potentially be used to mount a Bleichenbacher padding oracle attack [10]. The documentation clearly mentions that it is highly recommended to use RSA_PKCS1_OAEP_PADDING in new applications.

The Python cryptography library (PyCA cryptography) also asks developers to use OAEP for encryption instead of PKCS#1 v1.5 [11].

After Bleichenbacher’s 1998 attack, it was impractical to instantly replace PKCS#1 v1.5 everywhere. Instead, protocol designers issued countermeasures.

TLS, for example, responded by changing the error handling: the server would not reveal a padding failure distinctly. It would generate a fake premaster secret and proceed to prevent timing clues, and always return a generic handshake failure at a later stage, making it harder for the attacker to distinguish why decryption failed.

These countermeasures reduced the oracle’s fidelity but were tricky to get right across different implementations. In fact, not everyone got it right – the Bleichenbacher attack continued to resurface in various forms when implementations made mistakes in error handling.

In December 2017, researchers disclosed the ROBOT attack (Return Of Bleichenbacher’s Oracle Threat): several TLS implementations had subtle bugs that recreated a padding oracle, allowing the attack to succeed 19 years after the original paper. The ROBOT paper showed that even with countermeasure guidelines, the complexity of uniformly handling errors led to slip-ups in popular products.

This underscores that patching an insecure scheme is often error-prone – a design that is secure by construction (like OAEP) is preferable.

PKCS#1 v1.5 continues to exist because of these patchwork security measures and the fact that it cannot be abruptly removed from all existing systems. It is generally regarded as "legacy" or maintained "for compatibility" purposes. The collective wisdom is clear: use OAEP for RSA encryption whenever possible.

Enhancing Digital Signatures: The Transition to PSS

Now that you understand how OAEP transformed RSA encryption by mitigating vulnerabilities in deterministic padding, it’s time to turn our attention to RSA digital signatures – a critical function for ensuring message integrity and authenticity.

Early RSA signature schemes suffered from similar problems as raw encryption: their deterministic nature made them prone to forgery and replay attacks. This vulnerability paved the way for an improvement: the Probabilistic Signature Scheme (PSS).

Before we dive into PSS itself, let’s quickly understand the pain points with early RSA signatures.

Problems with Early RSA Signature Schemes

Traditional RSA signatures were generated by simply applying the RSA decryption function on a message digest (often with minimal formatting):

$$s=m^d \bmod N$$

where $m$ is the hash (or encoded hash) of the message. This approach was deterministic, which meant that each time the same message was signed, the exact same signature was produced. Such determinism had two major drawbacks:

Predictability and Replay

Since the signature for a given message was always identical, an attacker could replay a captured signature with impunity or forge signatures if they could deduce patterns in the signature scheme.

Forgery Risks

In a deterministic setting, if an attacker finds any structure or mathematical relationship in the signature, they might be able to forge a valid signature for a new message. In certain scenarios, weak formatting could allow an adversary to create a “signature transformation” that produces a valid signature without having access to the private key.
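One well-known example of such a mathematical relationship is the multiplicative property of unformatted RSA signatures: the product of two valid signatures is a valid signature on the product of the two values. The sketch below demonstrates it with an assumed toy key ($n = 33$, $e = 3$, $d = 7$) and hypothetical helper names. Real schemes sign an encoded hash, which is precisely what breaks this relationship.

```python
# Toy textbook-RSA signing key (assumed values, for illustration only).
n, e, d = 33, 3, 7

def sign(m: int) -> int:
    """Raw RSA signature: m^d mod n, with no hashing or formatting."""
    return pow(m, d, n)

def verify(m: int, s: int) -> bool:
    return pow(s, e, n) == m % n

# The signer legitimately signs two values.
s1, s2 = sign(2), sign(5)

# Determinism: signing the same value again gives the same signature.
assert sign(2) == s1

# Without the private key, an attacker multiplies the two signatures...
forged = (s1 * s2) % n

# ...and obtains a valid signature on 2 * 5 = 10, which was never signed.
print(verify(10, forged))   # True
```

The attacker's side uses only public values and the two signatures it observed.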

These issues highlighted that a signature scheme must be probabilistic to be secure against adaptive forgery attempts and to ensure non-repudiation. This means that the signer should not be able to repudiate a signature because it is bound to a random value known only at signing time.

Birth of the Probabilistic Signature Scheme (PSS)

Towards the end of 1998, Bellare and Rogaway also proposed a scheme to overcome the inherent limitations of deterministic RSA signatures [12]. The core idea was to introduce randomness into the signature generation process so that even when signing the same message twice, the resulting signatures would be different. This randomness comes from a salt value and a carefully designed encoding process. The result is a signature method with strong, provable security guarantees.

This randomness prevents attackers from exploiting patterns in the signature process. The Probabilistic Signature Scheme was designed to be provably secure in the random oracle model, meaning that forging a signature would be as hard as breaking RSA itself under certain assumptions [13].
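To make the contrast concrete, here is a minimal sketch using Node's built-in crypto module. It assumes a throwaway 2048-bit key, SHA-256, and a 32-byte salt; the variable names are illustrative only. It signs the same message twice with the deterministic PKCS#1 v1.5 padding and twice with PSS.

```typescript
import { generateKeyPairSync, sign, verify, constants } from 'crypto';

// Throwaway 2048-bit RSA key pair, generated only for this demonstration.
const { privateKey, publicKey } = generateKeyPairSync('rsa', { modulusLength: 2048 });
const message = Buffer.from('transfer 100 to Bob');

// PKCS#1 v1.5 signatures are deterministic: same key and message give the same bytes.
const v15Options = { key: privateKey, padding: constants.RSA_PKCS1_PADDING };
const v15First = sign('sha256', message, v15Options);
const v15Second = sign('sha256', message, v15Options);

// PSS mixes in a fresh random salt on every call, so the bytes differ each time.
const pssOptions = { key: privateKey, padding: constants.RSA_PKCS1_PSS_PADDING, saltLength: 32 };
const pssFirst = sign('sha256', message, pssOptions);
const pssSecond = sign('sha256', message, pssOptions);

const pssVerifyOptions = { key: publicKey, padding: constants.RSA_PKCS1_PSS_PADDING, saltLength: 32 };

console.log('v1.5 signatures identical:', v15First.equals(v15Second));
console.log('PSS signatures identical:', pssFirst.equals(pssSecond));
console.log('both PSS signatures verify:',
  verify('sha256', message, pssVerifyOptions, pssFirst) &&
  verify('sha256', message, pssVerifyOptions, pssSecond));
```

Both PSS signatures are valid for the same message even though their bytes differ, which is exactly the property the salt is meant to provide.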

The block diagram below is a visual representation of the PSS encoding schema:

Let’s understand what these mathematical notions mean as well as the workings of RSA-PSS, up next.

The Mathematics Behind PSS

Before diving into the mechanics of RSA-PSS, it’s helpful to define the notations and terms you’ll see in the steps ahead.

In RSA, $N$ is the modulus, a large integer that is the product of two primes. $k$ is the length of $N$ in bytes. For a $2048$-bit key, $k=256$ bytes.

$M$ represents the message data or document you want to sign. In RSA-PSS, you’ll typically first compute a hash of $M$. $Hash$ refers to a cryptographic hash function (for example, SHA-256) that maps data to a fixed-size output. The output length is denoted $hLen$. For SHA-256, $hLen=32$ bytes.

We will use a salt, $S$, a randomly generated string of fixed length (often the same as $hLen$). This randomness is essential in ensuring that each signature is unique, even for the same message.

$H$ or $mHash$ is the hash of the message $M$, and $H'$ is a secondary hash computed over $mHash$ and the salt $S$. This appears in the PSS encoding step.

The Mask Generation Function, $MGF$, is a function that uses the hash internally to produce a pseudorandom output of arbitrary length. In PSS, it is used to “mask” parts of the data block so that the signature is hard to forge.

A fixed byte, $0xbc$ (in hex), is appended at the end of the encoded message to mark the boundary of the PSS structure. This serves as a simple integrity check during decoding. After a successful encoding we receive an encoded message $EM$, which is an octet string of length $emLen = \left\lceil{\frac{emBits}{8}}\right\rceil$, where $emBits$ is one less than the bit length of $N$.

Now that you are familiar with all the necessary notations, we are ready to begin the encoding step.

Step 1: Message Hashing and Salt Generation

We compute the hash of the message as $H = mHash = Hash(M)$, where $M$ is our message. We will also create a random salt $S$ of fixed length (say 20 bytes if you use SHA-1, or 32 bytes if you use SHA-256).

Step 2: Encoding the Hash with the Salt (PSS-Encode)

We first construct an intermediate block, $M'$, by combining a padding with the hash and the salt. The padding is a sequence of eight zero bytes that fills space and ensures a fixed layout. Mathematically:

$$M' = (0x)~00 ~00 ~00 ~00 ~00 ~00 ~00 ~00 ~||~ mHash ~||~ salt$$

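Now we compute the hash of this block as $H' = Hash(M')$. Next we build the Data Block, $DB$: we generate another octet string $PS$, consisting of $emLen - sLen - hLen - 2$ zero bytes (where $sLen$ is the length of the salt), and concatenate it with $0x01$ as a delimiter and the salt: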

$$DB = PS ~||~ 0x01 ~||~ salt$$

Note that DB is an octet string of length $emLen - hLen - 1$. The mask that you see in the visual representation above must be of this length. Mathematically:

$$dbMask = MGF(H', emLen - hLen - 1)$$

We will then apply this mask on the $DB$ block using an $XOR$ operation to produce our $maskedDB$:

$$maskedDB = DB \oplus dbMask$$

Recollect that $emLen$ is the intended length of the Encoded Message $EM$ and $hLen$ is the length of the hash output. Now we append a fixed trailer field $0xbc$ and produce the encoded message in its octet string representation:

$$EM = maskedDB ~||~ H' ~||~ 0xbc$$

This encoding process ensures that both the salt and the hash are mixed together in a non-reversible, pseudorandom manner. The randomness from the salt is “spread” over the data block by the $MGF$, making it extremely difficult for any adversary to manipulate the signature.

Step 3: RSA Signature Generation

Once you have the encoded message $EM$, the RSA signature is produced by using the RSA private key. First, convert the octet string to its integer representation, $m = OS2IP(EM)$, using the OS2IP method we’ve discussed before. Then apply the RSA private key operation:

$$s=m^d \bmod N$$

where $d$ is the private exponent and $N$ is the RSA modulus.
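Steps 1 to 3 can be sketched with Node's crypto module as follows. This is an illustrative sketch rather than a production implementation: it assumes SHA-256, a 32-byte salt, MGF1 as the mask generation function, and a throwaway 2048-bit key, and the helper names `mgf1` and `pssEncode` are hypothetical. It also clears the leftmost unused bit of $maskedDB$, a step that RFC 8017 requires so that $EM$ stays below $N$ as an integer. The raw private key operation is performed with `privateEncrypt` and `RSA_NO_PADDING`, and the result is cross-checked against Node's built-in PSS verifier.

```typescript
import { createHash, randomBytes, generateKeyPairSync, privateEncrypt, verify, constants } from 'crypto';

const H_LEN = 32; // hLen: SHA-256 output length in bytes

function sha256(data: Buffer): Buffer {
  return createHash('sha256').update(data).digest();
}

// MGF1 (RFC 8017): hash seed || 4-byte counter repeatedly, then truncate to maskLen bytes.
function mgf1(seed: Buffer, maskLen: number): Buffer {
  const blocks: Buffer[] = [];
  for (let counter = 0; counter * H_LEN < maskLen; counter++) {
    const c = Buffer.alloc(4);
    c.writeUInt32BE(counter, 0);
    blocks.push(sha256(Buffer.concat([seed, c])));
  }
  return Buffer.concat(blocks).subarray(0, maskLen);
}

// PSS encoding of a message for a modulus of modBits bits.
function pssEncode(message: Buffer, modBits: number, salt: Buffer): Buffer {
  const emBits = modBits - 1;
  const emLen = Math.ceil(emBits / 8);
  const mHash = sha256(message);                                   // Step 1: mHash = Hash(M)
  const mPrime = Buffer.concat([Buffer.alloc(8), mHash, salt]);    // M' = 8 zero bytes || mHash || salt
  const hPrime = sha256(mPrime);                                   // H' = Hash(M')
  const ps = Buffer.alloc(emLen - salt.length - H_LEN - 2);        // PS: zero bytes
  const db = Buffer.concat([ps, Buffer.from([0x01]), salt]);       // DB = PS || 0x01 || salt
  const dbMask = mgf1(hPrime, emLen - H_LEN - 1);
  const maskedDB = Buffer.alloc(db.length);
  for (let i = 0; i < db.length; i++) maskedDB[i] = db[i] ^ dbMask[i];
  // Clear the leftmost (8 * emLen - emBits) bits so that EM, as an integer, is below N.
  maskedDB[0] &= 0xff >> (8 * emLen - emBits);
  return Buffer.concat([maskedDB, hPrime, Buffer.from([0xbc])]);   // EM = maskedDB || H' || 0xbc
}

const { privateKey, publicKey } = generateKeyPairSync('rsa', { modulusLength: 2048 });
const message = Buffer.from('hello PSS');
const em = pssEncode(message, 2048, randomBytes(H_LEN));

// Step 3: s = m^d mod N. With RSA_NO_PADDING only the raw private key operation is applied.
const signature = privateEncrypt({ key: privateKey, padding: constants.RSA_NO_PADDING }, em);

// Cross-check the hand-encoded signature against Node's built-in RSASSA-PSS verifier.
const accepted = verify('sha256', message,
  { key: publicKey, padding: constants.RSA_PKCS1_PSS_PADDING, saltLength: H_LEN }, signature);
console.log('built-in verifier accepts the signature:', accepted);
```

In real applications, calling `sign` with `RSA_PKCS1_PSS_PADDING` is the safer choice; the manual encoding here only serves to expose the individual steps.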

Step 4: Signature Verification

At the receiver end, when any recipient wants to verify a signature, they reverse the process:

$$m' = s^e \bmod N$$

and convert $m'$ back to an encoded message $EM$. The verifier then extracts the components $(maskedDB, H', trailer)$, regenerates the mask from $H'$ to unmask $DB$ and recover the salt, and recomputes $H'$ from the hash of the received message and that salt. If the recomputed value matches the $H'$ embedded in $EM$, and the padding and trailer are well formed, the signature is valid.
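The verification side can be sketched in the same spirit. The example below assumes SHA-256, MGF1, a 32-byte salt, and a throwaway 2048-bit key; `pssVerifyEncoding` is a hypothetical helper name. The signature is produced by Node's built-in PSS signer, the raw public key operation is applied with `publicDecrypt` and `RSA_NO_PADDING` to recover $EM$, and the encoding is then checked by hand.

```typescript
import { createHash, generateKeyPairSync, sign, publicDecrypt, constants } from 'crypto';

const H_LEN = 32; // SHA-256 output length in bytes
const S_LEN = 32; // salt length assumed by this sketch

function sha256(data: Buffer): Buffer {
  return createHash('sha256').update(data).digest();
}

// MGF1 (RFC 8017): hash seed || 4-byte counter repeatedly, then truncate.
function mgf1(seed: Buffer, maskLen: number): Buffer {
  const blocks: Buffer[] = [];
  for (let counter = 0; counter * H_LEN < maskLen; counter++) {
    const c = Buffer.alloc(4);
    c.writeUInt32BE(counter, 0);
    blocks.push(sha256(Buffer.concat([seed, c])));
  }
  return Buffer.concat(blocks).subarray(0, maskLen);
}

// Returns true when em is a consistent PSS encoding of message.
function pssVerifyEncoding(message: Buffer, em: Buffer, emBits: number): boolean {
  const emLen = em.length;
  if (emLen < H_LEN + S_LEN + 2) return false;
  if (em[emLen - 1] !== 0xbc) return false;                      // trailer check
  const maskedDB = em.subarray(0, emLen - H_LEN - 1);
  const hPrime = em.subarray(emLen - H_LEN - 1, emLen - 1);      // embedded H'
  const unusedBits = 8 * emLen - emBits;
  if ((maskedDB[0] >> (8 - unusedBits)) !== 0) return false;     // leftmost bits must be zero
  const dbMask = mgf1(hPrime, maskedDB.length);
  const db = Buffer.alloc(maskedDB.length);
  for (let i = 0; i < db.length; i++) db[i] = maskedDB[i] ^ dbMask[i];
  db[0] &= 0xff >> unusedBits;
  const psLen = emLen - H_LEN - S_LEN - 2;
  for (let i = 0; i < psLen; i++) if (db[i] !== 0x00) return false; // PS must be all zeros
  if (db[psLen] !== 0x01) return false;                          // delimiter
  const salt = db.subarray(psLen + 1);
  const mPrime = Buffer.concat([Buffer.alloc(8), sha256(message), salt]);
  return sha256(mPrime).equals(hPrime);                          // recomputed H' must match
}

const { privateKey, publicKey } = generateKeyPairSync('rsa', { modulusLength: 2048 });
const message = Buffer.from('hello PSS');
const signature = sign('sha256', message,
  { key: privateKey, padding: constants.RSA_PKCS1_PSS_PADDING, saltLength: S_LEN });

// m' = s^e mod N. With RSA_NO_PADDING only the raw public key operation is applied.
// For a 2048-bit modulus, emBits is 2047 and EM fills all 256 bytes.
const em = publicDecrypt({ key: publicKey, padding: constants.RSA_NO_PADDING }, signature);

const genuineOk = pssVerifyEncoding(message, em, 2047);
const tamperedOk = pssVerifyEncoding(Buffer.from('hello PSS!'), em, 2047);
console.log('genuine message accepted:', genuineOk);
console.log('tampered message accepted:', tamperedOk);
```

A production verifier would simply call `verify` with PSS padding; the hand-written checks are only meant to mirror the description above.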

The Road Ahead: Assessing RSA’s Long-Term Viability

In 1994, Peter Shor’s algorithm [14] demonstrated that a quantum computer can factor large integers in polynomial time, thereby efficiently breaking RSA’s underlying hard problem – the difficulty of factoring $N = p \times q$.

Although experimental quantum computers have made progress, they remain far from having the number of stable qubits required to break RSA keys of practical sizes (2048 or 4096 bits).

In anticipation of large-scale quantum computers, the cryptographic community is actively developing and standardizing algorithms believed to be resistant to quantum attacks. These include lattice-based schemes (such as CRYSTALS-Kyber and NTRU), code-based cryptography (such as the McEliece cryptosystem), hash-based signatures (such as XMSS), and multivariate polynomial cryptosystems.

It’s important to note that while OAEP and PSS improve the security of RSA against classical attacks, they do not protect RSA from quantum attacks. In a post-quantum world, even the most secure classical padding will not prevent a quantum computer from breaking RSA using Shor’s algorithm.

In the near term, RSA remains in widespread use and, when implemented with padding schemes such as OAEP and PSS, continues to provide strong security against classical adversaries. But looking ahead, it’s expected that organizations will gradually migrate to post-quantum algorithms as they mature and become standardized.

References

[1] FIPS 186-5: Digital Signature Standard (DSS)

[2] RFC 8017 PKCS #1: RSA Cryptography Specifications

[3] Lagrange's theorem

[4] Ronald L. Rivest, Robert D. Silverman: Are Strong Primes Needed for RSA?

[5] pyca/cryptography

[6] OpenSSL Github: rsa_chk.c

[7] RFC 2313: PKCS #1: RSA Encryption

[8] Daniel Bleichenbacher: Chosen Ciphertext Attacks Against Protocols Based on the RSA Encryption Standard PKCS #1

[9] RFC 8017: PKCS #1 RSA Cryptography Specifications Version 2.2

[10] RSA_public_encrypt: Warnings

[11] pyca/PKCS1v1

[12] Probabilistic signature scheme

[13] RFC 8017: RSASSA-PSS

[14] Algorithms for quantum computation: discrete logarithms and factoring

How to Protect Data in Transit using HMAC and Diffie-Hellman in Node.js [Full Handbook]

Hamdaan Ali — Mon, 18 Mar 2024 23:00:22 +0000

Data integrity refers to the assurance that data will remain accurate, unaltered, and consistent throughout its lifecycle. In communication, data integrity is important in safeguarding against unintended alterations and malicious interventions during data transmission.

The integrity of digital data is typically verified using hashing algorithms. The crypto module in Node provides built-in, vetted functions that let you verify not only the integrity of data but also the authenticity of its origin.

This handbook aims to highlight the internal workings of the functions in the crypto library and give you some insights into the internal workings of HMAC and Diffie-Hellman Key Exchange. This will help you make informed decisions about hash algorithms and key lengths depending on your business requirements.

The primary focus of this handbook is to emphasize the crucial aspect of data integrity rather than discussing the various encryption algorithms available.

Encryption is used to protect information by converting it into a secure format, which ensures its confidentiality. But data integrity is concerned with ensuring that the data remains accurate and unaltered.

Prerequisites
The Alice-Bob Paradigm
Modification Detection Code (MDC)
Message Authentication Code (MAC)
Hash-based MACs (HMAC)
The Diffie-Hellman-Merkle Protocol
Connecting the Dots
Invoking the APIs
Wrapping Up
References

Prerequisites

Node and Express: We'll create a TypeScript sample application using the Express framework. A basic understanding of the framework would be helpful. You will need the Node Runtime Environment to execute the scripts.
Postman Client: To make an API request and to test out the sample application, you will need a tool to make HTTP Requests. You may use your web browser's "Edit and Send" feature under the Networks tab, but since not all browsers allow this, it's best to use a tool like Postman which provides a better UI to observe responses.

The Alice-Bob Paradigm

Throughout this handbook you will come across numerous sequence diagrams and mathematical proofs that use the Alice-Bob Paradigm.

The Alice-Bob paradigm is a common convention in cryptography where two generic entities, often named Alice and Bob, are used to illustrate various scenarios, protocols, or cryptographic principles.

The Alice-Bob Paradigm

These characters represent two parties engaged in communication, with Alice typically representing the sender or initiator, and Bob representing the receiver or responder.

We often introduce Eve as a third party, symbolizing an eavesdropper or potential attacker, adding an element of security risk and illustrating scenarios where external entities might attempt to intercept or manipulate the communication.

The sample application shown in the later sections is modeled on this Alice-Bob Paradigm, with Boost Inc. and the Account Aggregator (AA) as the two parties engaged in communication.

Modification Detection Code (MDC)

When Alice needs to send critical data to Bob over the internet, the data changes hands, jumping between routers and servers, each step carrying the potential risk of unintended alterations.

If Eve manages to get their hands on Alice's data, they might modify it. So the integrity of the data becomes questionable, emphasizing that its original state may have been compromised during transmission.

Note that we are talking about the integrity and not the confidentiality of the data. Even if Alice encrypts the data, encryption alone doesn't guarantee that the data hasn't been tampered with during transit.

Consider this scenario: even though Eve may be unable to decrypt the encrypted message, they might attempt to modify the ciphertext in transit. This could involve altering bits, rearranging packets, or injecting malicious code, potentially leading to unintended consequences upon decryption.

This is where a Modification Detection Code (MDC), or hash, comes into the picture. An MDC is a message digest or checksum that can prove the integrity of a message: that the message has not been changed [1].

The figure below explains how an MDC is used to verify the integrity of a message:

Modification Detection Code [1]

A Hash Function is used to generate the digest for any given message. This hash function processes the entire content of the message, producing a fixed-size string of characters that uniquely represents the message's contents. This is called the message digest or MDC.

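Note that different hash functions, such as SHA-256 or SHA-3, can be used depending on your specific security requirements and preferences. Older functions such as MD5 are no longer considered collision resistant and are best avoided in new designs.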

Once the digest is generated, it serves as a fingerprint for the original message. When Alice sends both the message and its corresponding digest to Bob, he can independently apply the same hash function to the received message. If the calculated digest matches the one received from Alice, it is strong evidence that the message was not modified during transmission, provided the digest itself reached Bob untampered.
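As a small illustration of this check, the sketch below uses Node's `createHash` with SHA-256. The function names `computeDigest` and `isUnmodified` and the sample messages are assumptions made for this example.

```typescript
import { createHash } from 'crypto';

// Compute the message digest (MDC) as a hex string.
function computeDigest(message: string): string {
  return createHash('sha256').update(message, 'utf8').digest('hex');
}

// Bob recomputes the digest over what he received and compares it with the one Alice sent.
function isUnmodified(receivedMessage: string, receivedDigest: string): boolean {
  return computeDigest(receivedMessage) === receivedDigest;
}

const original = 'Pay Bob 100';
const digest = computeDigest(original);

console.log('intact message passes:', isUnmodified(original, digest));
console.log('altered message passes:', isUnmodified('Pay Eve 100', digest));
```

The check only detects changes to the message; it says nothing about who produced the digest.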

Message Authentication Code (MAC)

While MDC or the checksum is typically transferred over a safe channel, it may so happen that the safety of the channel or the trusted party itself is compromised. In such a case Eve can easily modify both the message and the digest and Bob will never know if the message actually came from Alice as intended.

What MDC lacks is a definitive guarantee of the message origin, leaving a potential vulnerability in confirming the true sender.

This is where a Message Authentication Code (MAC) comes in. MACs not only ensure the integrity of the message, detecting any unauthorized alterations, but they also provide a mechanism for authenticating the origin of the data. In other words, MACs offer assurance that the message indeed originates from Alice and not from someone else.

The figure below explains how MAC can help authenticate the origin of a message besides providing integrity check:

Message Authentication Code [1]

Notice that the difference between an MDC and a MAC is that a MAC also involves a secret key (K) shared between Alice and Bob. The hash function takes in the key (K) along with the message (M) to generate a MAC.

$$ h(K || M) = MAC $$

Now both the message and MAC can be sent over the same insecure channel. When Bob receives this ( M + MAC ), he can separate out the message M and compute the MAC for it using the same hash function and the secret key (K).

Bob will then compare the newly computed MAC with the one he received. If the two MACs match, the message is authentic and has not been modified by an adversary.

$$ \text{Alice: } S(K, M) = MAC \\ \text{Bob: } V(K, M, MAC) = \text{Accept / Reject} $$

Since Eve does not have this secret key (K), they cannot modify the message and generate a valid MAC. Consequently, the resulting MAC becomes a unique fingerprint, signifying not only the integrity of the message but also authenticating its origin.
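The signing and verification algorithms can be sketched as two small functions. This example follows the naive keyed-hash formula from this section literally, assuming SHA-256; the names `signMac` and `verifyMac` and the sample key are hypothetical. It is meant only to illustrate the idea, not as a construction to deploy.

```typescript
import { createHash } from 'crypto';

// S(K, M): the naive MAC h(K || M), following the formula in this section.
function signMac(key: Buffer, message: Buffer): Buffer {
  return createHash('sha256').update(key).update(message).digest();
}

// V(K, M, MAC): recompute the MAC with the shared key and compare.
function verifyMac(key: Buffer, message: Buffer, mac: Buffer): boolean {
  return signMac(key, message).equals(mac);
}

const sharedKey = Buffer.from('key-known-to-alice-and-bob');
const message = Buffer.from('Pay Bob 100');
const mac = signMac(sharedKey, message);

// Eve does not know the key, so a MAC computed with her own guess is rejected.
const eveMac = signMac(Buffer.from('eve-guess'), Buffer.from('Pay Eve 100'));

console.log('genuine pair accepted:', verifyMac(sharedKey, message, mac));
console.log('forged pair accepted:', verifyMac(sharedKey, Buffer.from('Pay Eve 100'), eveMac));
```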

Hash-based MACs (HMAC)

While a MAC does authenticate the origin of a message, the naive construction $h(K || M)$ still falls short of unforgeability when the underlying hash function is built on the Merkle-Damgård construction (as MD5, SHA-1, and SHA-256 are). Eve can perform a Man-in-the-Middle (MitM) attack, intercept the MAC and message pair, and then mount a length extension attack.

Given $S = h(K || M)$ and the message $M$, Eve can extend $M$ to $M' = M || Pad || w$ and create $MAC(M')$, which evaluates to $S' = h(K || M || Pad || w)$.

Eve does not require knowledge of the secret key $K$ to extend the message $M$ to $M'$. When Bob receives this modified $M'$ and $MAC(M')$, he is unable to detect the modification.

HMAC or a Hash-based MAC is a specific method for constructing a MAC algorithm out of a collision resistant hash function. HMAC uses two passes of hash computation and provides a better immunity against length extension attacks. The figure below explains the construction of HMACs.

Hash-based Message Authentication Code [1]

There are several steps involved in the implementation of HMACs [1]:

1. Divide the message into N blocks, each of b bits.
2. Pad the secret key with 0’s to create a b-bit key, and XOR it with a constant called (ipad) (input pad). The HMAC specification (RFC 2104) appends these zeros to the end of the key.
3. XOR the same padded secret key with another constant called (opad) (output pad).
   The values of (ipad) and (opad) are fixed constants defined in the HMAC standards [3]. The value of (ipad) is b/8 repetitions of the byte 00110110 (hex: 36), and the value of (opad) is b/8 repetitions of the byte 01011100 (hex: 5C). These values were chosen to have a large Hamming distance from each other: the two bytes differ in 4 of their 8 bits, meaning exactly half of the bits are flipped.
4. Prepend the result of step 2 to the message blocks. Apply the hash function to these (N+1) blocks to create an n-bit message digest called the intermediate HMAC.
5. Left-pad the intermediate HMAC with 0’s to make a b-bit block, then prepend the result of step 3 to this block.
6. Apply the hash function again to the result of step 5 to get the final n-bit HMAC.

Mathematically, this can be represented as:

$$ S(k, m) = H(k \oplus \text{opad} || H(k \oplus \text{ipad} || m)) $$

Now if Eve tries to extend $M$ to $M' = M || Pad || w$, the value that a valid tag would have to match is:

$$ HMAC(K, M') = H((K \oplus \text{opad}) || H((K \oplus \text{ipad}) || M || Pad || w)) $$

Due to the application of $opad$ in the outer hash, the attacker cannot construct $H((K \oplus \text{opad}) || \ldots)$ without knowing the key $K$. Eve only ever sees the output of the outer hash, never the inner digest, so there is no internal state from which to continue hashing additional input, and the attempt is thwarted.
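To see the two passes in code, here is a minimal sketch of HMAC-SHA-256 built directly from `createHash`, compared against Node's own `createHmac`. It assumes SHA-256 with its 64-byte block size; the helper names are hypothetical. Following RFC 2104, keys longer than one block are hashed first, and shorter keys are zero-padded at the end.

```typescript
import { createHash, createHmac } from 'crypto';

const BLOCK_SIZE = 64; // b/8: SHA-256 processes 512-bit (64-byte) blocks

// Hash the concatenation of the given parts.
function sha256(...parts: Buffer[]): Buffer {
  const hash = createHash('sha256');
  for (const part of parts) hash.update(part);
  return hash.digest();
}

// XOR every byte of the padded key with a repeated pad byte (0x36 or 0x5c).
function xorWithByte(key: Buffer, pad: number): Buffer {
  const out = Buffer.alloc(key.length);
  for (let i = 0; i < key.length; i++) out[i] = key[i] ^ pad;
  return out;
}

function hmacSha256(key: Buffer, message: Buffer): Buffer {
  // Keys longer than one block are hashed first; shorter keys are zero-padded on the right.
  const base = key.length > BLOCK_SIZE ? sha256(key) : key;
  const paddedKey = Buffer.concat([base, Buffer.alloc(BLOCK_SIZE - base.length)]);
  const inner = sha256(xorWithByte(paddedKey, 0x36), message); // H((K xor ipad) || M)
  return sha256(xorWithByte(paddedKey, 0x5c), inner);          // H((K xor opad) || inner)
}

const key = Buffer.from('shared-secret');
const message = Buffer.from('Pay Bob 100');

const manual = hmacSha256(key, message);
const builtIn = createHmac('sha256', key).update(message).digest();
console.log('manual HMAC matches createHmac:', manual.equals(builtIn));
```

In application code, `createHmac` remains the better choice; the manual version only exposes the inner and outer passes.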

The Diffie-Hellman-Merkle Protocol

One of the main challenges in Symmetric-key Ciphers is the distribution of keys. A fundamental question naturally arises: How will Bob know what keys Alice has used?

A very intuitive answer to this problem could be to use a Key Exchange or a Key Distribution Center (KDC). However, the utilization of a KDC or a Key Exchange introduces a notable caveat: the requirement of a secure channel for transmitting keys.

The security of a system employing a Key Distribution Center (KDC), such as in the case of the Kerberos authentication protocol, is heavily dependent on the security of the KDC itself. If the KDC is compromised, the cryptographic keys it manages and distributes can be exposed, leading to potential security vulnerabilities throughout the system as seen in a Golden Ticket Attack.

In 1976, Whitfield Diffie and Martin Hellman, drawing on the work of Ralph Merkle, published a way to securely exchange cryptographic keys over public, insecure channels.

The Diffie-Hellman-Merkle Protocol provides a way for two parties to agree upon a shared secret key over an insecure channel without directly exchanging that key. The crypto module in Node.js contains the DiffieHellman class, which is a utility for creating Diffie-Hellman key exchanges.

Before we go through all of the functions defined in this class, it is important to understand the mathematics that goes around in The Diffie-Hellman-Merkle Protocol. The UML Sequence Diagram below explains the steps involved in The Diffie-Hellman-Merkle Protocol:

The Diffie-Hellman-Merkle Protocol

The process begins with whichever party wants to establish secure communication with the other. In this case, Alice wants to start the communication.

Alice will first pick a randomly chosen Generator g and a large prime number p. Increasing the length of the prime number results in heightened security, as it amplifies the difficulty for adversaries to execute certain cryptographic attacks.

However, enlarging the prime number also comes with computational costs. Longer prime numbers require more computational resources to perform the key generation.

Now, Alice needs to select a private key a and compute a modular exponentiation:

$$ A = g^a \bmod p $$

Alice will send over the generator g, the large prime p, and her public key A to Bob. At this point, Bob has all the values he needs to select his own private key b and evaluate his own modular exponentiation:

$$ B = g^b \bmod p $$

He will send back this public key B to Alice.

Note that up until this point, all communication occurs over an insecure channel. The values g, p, A and B might as well be sent as plaintext. The actual secret key is derived when Alice and Bob use these values to compute what is known as a "Shared Secret".

Shared Secret computed by Alice:

$$ S = B^a \bmod p \\ S = g^{ab} \bmod p $$

Shared Secret computed by Bob:

$$ S = A^b \bmod p \\ S = g^{ab} \bmod p $$

Notice how the Shared Secret computed by both parties at their respective ends is the same.

This symmetrical outcome is the essence of the Diffie-Hellman key exchange, where each party independently computes the shared secret using their private key and the public key received from the other party. This ensures that both Alice and Bob arrive at an identical Shared Secret, establishing a secure foundation for further encrypted communication.
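A worked example with deliberately tiny numbers may help. The sketch below assumes the toy values p = 23, g = 5, a = 4 and b = 3, which are far too small to be secure, and uses a hypothetical `modPow` helper that relies on ordinary JavaScript numbers and is therefore only suitable for such small values.

```typescript
// Square-and-multiply modular exponentiation; safe only for small toy values
// because it relies on ordinary JavaScript numbers.
function modPow(base: number, exponent: number, modulus: number): number {
  let result = 1;
  let factor = base % modulus;
  let remaining = exponent;
  while (remaining > 0) {
    if (remaining % 2 === 1) result = (result * factor) % modulus;
    factor = (factor * factor) % modulus;
    remaining = Math.floor(remaining / 2);
  }
  return result;
}

// Runs one exchange: p and g are public, a and b are the private keys.
function toyDiffieHellman(p: number, g: number, a: number, b: number) {
  const A = modPow(g, a, p);           // Alice's public key
  const B = modPow(g, b, p);           // Bob's public key
  const aliceSecret = modPow(B, a, p); // Alice computes B^a mod p
  const bobSecret = modPow(A, b, p);   // Bob computes A^b mod p
  return { A, B, aliceSecret, bobSecret };
}

const exchange = toyDiffieHellman(23, 5, 4, 3);
console.log(exchange); // { A: 4, B: 10, aliceSecret: 18, bobSecret: 18 }
```

Here $5^4 \bmod 23 = 4$ and $5^3 \bmod 23 = 10$, and both $10^4 \bmod 23$ and $4^3 \bmod 23$ equal 18, so both sides arrive at the same secret without ever transmitting it.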

Why is the Shared Secret Secure?

Diffie-Hellman key exchange relies on the mathematical principles of discrete logarithm, primitive roots and Modular exponentiation.

Modular exponentiation is the problem of computing $a^b \bmod n$, where $a$, $b$, and $n$ are known integers. Discrete logarithm is the problem of finding $x$ such that $a^x \bmod n = b$, where $a$, $b$, and $n$ are known integers and $a$ is a primitive root modulo $n$.

The security of Diffie-Hellman is rooted in the computational complexity of calculating discrete logarithms.

For example, given g, p and a, it's easy to compute A, because modular exponentiation is in P, meaning that there is a polynomial-time algorithm to solve it.

But the reverse does not hold. Given g, p, and A, computing a requires solving the discrete logarithm problem, which is widely believed to be computationally infeasible for suitably large parameters [2].

Remember that both parties will compute the Shared Secret at their end and there is no need to send over this secret to the other party. This eliminates the risk of the Shared Secret getting intercepted by Eve and the only option they are left with is to solve the discrete logarithm problem.

Connecting the Dots

The key (K) that we provide in an HMAC has to be the same for both Alice and Bob. Now that we know how a Diffie-Hellman-Merkle key exchange works, it becomes intuitive that we can plug in the shared secret as the key for an HMAC.

Alice can use the shared key (S) in the HMAC function as a parameter and Bob can use the same shared secret (S), computed at their end, in the verification algorithm.

The crypto module in Node.js provides various built-in functions to implement cryptographic constructs such as HMACs and Diffie-Hellman Key Exchange. It is always recommended to use vetted cryptographic libraries and avoid implementing cryptographic algorithms yourself, since hand-rolled implementations are prone to side-channel leaks and to implementation bugs of the kind behind Heartbleed.

Let's create a TypeScript/Node.js application to understand the implementation and prototypes of these functions. The two entities involved in communication in this application will be Boost Inc. and the Account Aggregator. Boost needs to send critical data over to the Account Aggregator.

We will first utilize the DiffieHellman class to create secret keys for both entities. Boost will then use the secret key to create an HMAC using the Hmac class in Node. The Account Aggregator will receive this HMAC along with the message. They will verify this HMAC against a newly generated HMAC computed from the message they received.

Note that the code at Account Aggregator's end will be simulated and we will create API endpoints for each operation to show separation of concerns in this sample application.

The following sequence diagram explains what the application does:

UML Sequence Diagram for the sample application

Project Setup

In the root of your workspace, install Express, Axios, type definitions of Node, and type definitions of Express using the following command:

npm init -y && npm install axios express
npm install -D nodemon ts-node @types/express @types/node typescript

Configure tsconfig as per your liking and create a file called crypto.utils.ts under src/utils. Let's create an interface and import all necessary modules from the crypto library:

import { createHmac, createDiffieHellman, DiffieHellman, KeyObject, BinaryLike } from 'crypto';

export interface KeyPair {
  publicKey: Buffer;
  privateKey: Buffer;
  generator: Buffer;
  prime: Buffer;
  diffieHellman: DiffieHellman;
}

This interface will function as a blueprint for managing cryptographic key pairs throughout this application. It encapsulates the public and private keys, generator, prime, and a Diffie-Hellman object.

By using this interface we will ensure a structured and standardized approach to handle cryptographic key pair information, thus promoting clarity and consistency in cryptographic operations within a Node.js environment.

The createDiffieHellman Function

Next, we will define the function generateKeyPair, which will allow us to generate a party's private and public keys, for example (a) and (A), along with the large prime (p) and the generator (g), using the createDiffieHellman and generateKeys functions.

export function generateKeyPair(prime?: any, generator?: any): KeyPair {
  const diffieHellman = prime && generator ? createDiffieHellman(prime, 'hex', generator, 'hex') : createDiffieHellman(2048);
  diffieHellman.generateKeys();

  return {
    publicKey: diffieHellman.getPublicKey(),
    privateKey: diffieHellman.getPrivateKey(),
    generator: diffieHellman.getGenerator(),
    prime: diffieHellman.getPrime(),
    diffieHellman,
  };
}

Notice that the parameters to this function – prime and generator – are optional. This is because the underlying createDiffieHellman has five defined overloads:

function createDiffieHellman(primeLength: number, generator?: number): DiffieHellman;

function createDiffieHellman(
    prime: ArrayBuffer | NodeJS.ArrayBufferView,
    generator?: number | ArrayBuffer | NodeJS.ArrayBufferView,
): DiffieHellman;

function createDiffieHellman(
    prime: ArrayBuffer | NodeJS.ArrayBufferView,
    generator: string,
    generatorEncoding: BinaryToTextEncoding,
): DiffieHellman;

function createDiffieHellman(
    prime: string,
    primeEncoding: BinaryToTextEncoding,
    generator?: number | ArrayBuffer | NodeJS.ArrayBufferView,
): DiffieHellman;

function createDiffieHellman(
    prime: string,
    primeEncoding: BinaryToTextEncoding,
    generator: string,
    generatorEncoding: BinaryToTextEncoding,
): DiffieHellman;

The first function creates a Diffie-Hellman object with a randomly generated prime of the specified length. The createDiffieHellman(2048); creates a Diffie-Hellman object where the length of the randomly generated prime is 2048 bits.

When no generator value is provided to this function, it takes the default value of 2. The length of the prime necessarily has to be large and if you select a small value Node will throw an error signifying that this length will not make a secure key.

Instead of passing in the length of the prime, we can pass the prime as a buffer. This is what the Account Aggregator will do at their end when Boost sends over the necessary details.

Similarly you can use the other function declarations as per your use case to pass the prime and generator as ArrayBuffer or ArrayBufferView types or as string with a specified encoding.
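The sketch below exercises two of these overloads. To avoid generating a fresh 2048-bit prime, it assumes the prime and generator are borrowed from Node's predefined 'modp14' group via `getDiffieHellman`; the variable names `boost` and `aggregator` are illustrative and are not the sample application's actual code.

```typescript
import { createDiffieHellman, getDiffieHellman } from 'crypto';

// Assumption: reuse the prime and generator of the predefined 2048-bit 'modp14'
// group so that no prime has to be generated for this demonstration.
const group = getDiffieHellman('modp14');
const prime: Buffer = group.getPrime();
const generator: Buffer = group.getGenerator();

// Overload taking the prime and generator as Buffers.
const boost = createDiffieHellman(prime, generator);
boost.generateKeys();

// Overload taking the prime and generator as hex strings with their encodings.
const aggregator = createDiffieHellman(
  prime.toString('hex'), 'hex',
  generator.toString('hex'), 'hex',
);
aggregator.generateKeys();

// Each side combines its own private key with the other side's public key.
const boostSecret = boost.computeSecret(aggregator.getPublicKey());
const aggregatorSecret = aggregator.computeSecret(boost.getPublicKey());
console.log('secrets match:', boostSecret.equals(aggregatorSecret));
```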

The computeSecret Function

Now, let's define a method generateSharedSecret that takes in a Key pair and a public key as parameter and computes the shared secret (S):

export function generateSharedSecret(keyPair: KeyPair, publicKey: Buffer): Buffer {
  return keyPair.diffieHellman.computeSecret(publicKey);
}

The computeSecret function also has four overloads, which allow you to provide the public key parameter either as a string or as an ArrayBufferView, as well as to specify the inputEncoding and outputEncoding.
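As a short illustration of the encoding options, the following sketch assumes both parties use Node's predefined 'modp14' group (so the prime and generator are already agreed upon) and exchanges one public key as a Buffer and the other as a hex string. The variable names are hypothetical.

```typescript
import { getDiffieHellman } from 'crypto';

// Assumption: both parties use the predefined 'modp14' group, so they share p and g.
const alice = getDiffieHellman('modp14');
const bob = getDiffieHellman('modp14');

const alicePublicHex: string = alice.generateKeys('hex'); // public key as a hex string
const bobPublicBuffer: Buffer = bob.generateKeys();       // public key as a Buffer

// Buffer in, Buffer out.
const aliceSecret: Buffer = alice.computeSecret(bobPublicBuffer);

// Hex string in (inputEncoding), hex string out (outputEncoding).
const bobSecretHex: string = bob.computeSecret(alicePublicHex, 'hex', 'hex');

console.log('secrets match:', aliceSecret.toString('hex') === bobSecretHex);
```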

The createHmac Function

Now that we've computed our shared secret, let's create a function generateHMAC that consumes this shared secret and generates a digest against it.

export function generateHMAC(data: any, secretKey: KeyObject | BinaryLike): any {
  data = JSON.stringify(data);
  const hmac = createHmac('sha256', secretKey);
  hmac.update(data);
  return hmac.digest('hex');
}

The first parameter of the createHmac function takes an algorithm. This is where you specify which underlying hash function you want to use.

Remember that the security of HMAC relies on various factors, including the cryptographic strength of the underlying hash function, the size of its hash output, and the quality and size of the key.

The options available for this algorithm parameter depend on the algorithms supported by the OpenSSL version on the platform. To check what algorithms are available to you, execute the following command in the terminal:

openssl list -digest-algorithms

This will give you a list from which you can select your desired algorithm for the underlying hash function:

RSA-MD4 => MD4
RSA-MD5 => MD5
RSA-MDC2 => MDC2
RSA-RIPEMD160 => RIPEMD160
RSA-SHA1 => SHA1
RSA-SHA1-2 => RSA-SHA1
RSA-SHA224 => SHA224
RSA-SHA256 => SHA256
...
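The same information is available from within Node through `getHashes`, which returns the digest names the runtime supports. The sketch below uses a hypothetical `assertSupported` helper and a demo key to guard the algorithm name before creating the HMAC.

```typescript
import { getHashes, createHmac } from 'crypto';

// Throws if the requested digest is not supported by this Node/OpenSSL build.
function assertSupported(algorithm: string): string {
  if (getHashes().indexOf(algorithm) === -1) {
    throw new Error(`Unsupported digest algorithm: ${algorithm}`);
  }
  return algorithm;
}

// 'demo-key' and 'demo-payload' are placeholder values for this illustration.
const tag = createHmac(assertSupported('sha256'), 'demo-key')
  .update('demo-payload')
  .digest('hex');

console.log('HMAC tag length in hex characters:', tag.length);
```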

The secret key that the createHmac function takes could either be of type KeyObject or of type BinaryLike. Note that the type BinaryLike is a union type in TypeScript. It is a type that can be either a string or a NodeJS.ArrayBufferView.

The update method of the Hmac object accepts its data parameter as a string, Buffer, TypedArray, or DataView. To simplify the developer experience and minimize complexity, we intentionally set the data parameter type in the generateHMAC function as any. Internally, we handle the conversion to a string using JSON.stringify.

Initializing communication

Now on Boost's end create a file verification.controller.ts under src/controllers:

import { generateKeyPair, generateSharedSecret, generateHMAC, KeyPair } from '@boost/v1/utils/crypto.utils';
import { KeyObject, BinaryLike } from 'crypto';

const boostKeyPair: KeyPair = generateKeyPair();

export function shareKeys() {
    const boostPublicKey: Buffer = boostKeyPair.publicKey;
    const boostPrivateKey: Buffer = boostKeyPair.privateKey;
    const boostGenerator: Buffer = boostKeyPair.generator;
    const boostPrime: Buffer = boostKeyPair.prime;
    const boostDiffieHellman = boostKeyPair.diffieHellman;

    return {
        boostPublicKey,
        boostPrivateKey,
        boostGenerator,
        boostPrime,
        boostDiffieHellman,
    };
}

export function hmacDigest(data: any, secretKey: KeyObject | BinaryLike): any {
    return generateHMAC(JSON.stringify(data), secretKey);
}

This file imports the interface and all necessary modules from crypto.utils.ts and defines two wrapper functions – shareKeys and hmacDigest. shareKeys only serves as a wrapper around generateKeyPair, which will allow developers at Boost to send only the required keys over to the Account Aggregator.

Setting up the Account Aggregator

At the Account Aggregator's end, we need to set up a function that computes AA's public key and sends it over to Boost Inc. We will also need a function to verify the received HMAC of the data by comparing it against one that AA generates:

import { generateKeyPair, generateSharedSecret, generateHMAC, KeyPair } from '../utils/crypto.utils';  
import axios from 'axios';

let sharedSecret: Buffer;

export async function sendAAPublicKey(): Promise<Buffer> {
  try {
    const response = await axios.get('http://localhost:3000/init');

    const boostPublicKey: Buffer = Buffer.from(response.data.boostPublicKey, 'hex');
    const boostGenerator: Buffer = Buffer.from(response.data.boostGenerator, 'hex');
    const boostPrime: Buffer = Buffer.from(response.data.boostPrime, 'hex');

    const AA: KeyPair = generateKeyPair(boostPrime, boostGenerator);
    sharedSecret = generateSharedSecret(AA, boostPublicKey);

    return AA.publicKey;
  } catch (error) {
    console.error('Error sending AA public key:', (error as Error).message);
    throw error;
  }
}

export async function verifyData(data: any, hmac: string): Promise<string> {
  try {
    const calculatedHMAC = generateHMAC(JSON.stringify(data), sharedSecret);
    return calculatedHMAC === hmac ? "Integrity and authenticity verified" : "Integrity or authenticity compromised";
  } catch (error) {
    console.error('Error verifying data:', (error as Error).message);
    throw error;
  }
}

We make an Axios request to the /init endpoint defined at Boost and fetch (p), (g) and (A). Once we've computed the public key, we'll send that back to Boost. We will also compute our shared secret here which we'll use while verifying the HMAC in the verifyData method.
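One possible refinement, shown here as a hedged sketch rather than as part of the sample application, is to compare HMACs with Node's `timingSafeEqual` instead of `===`, so that the comparison time does not depend on how many leading characters match. The helper names `hmacHex` and `safeCompareHex` and the sample key are assumptions for this example.

```typescript
import { createHmac, timingSafeEqual } from 'crypto';

// Hex-encoded HMAC-SHA-256 of a payload.
function hmacHex(payload: string, key: Buffer): string {
  return createHmac('sha256', key).update(payload).digest('hex');
}

// Compare two hex-encoded HMACs in constant time.
function safeCompareHex(expectedHex: string, receivedHex: string): boolean {
  const expected = Buffer.from(expectedHex, 'hex');
  const received = Buffer.from(receivedHex, 'hex');
  // timingSafeEqual throws on inputs of different length, so check the length first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

const demoSecret = Buffer.from('demo-shared-secret'); // placeholder for the shared secret
const payload = JSON.stringify({ name: 'Boost User 1', phone: '1234567890' });
const receivedHmac = hmacHex(payload, demoSecret);

console.log('valid HMAC accepted:', safeCompareHex(hmacHex(payload, demoSecret), receivedHmac));
console.log('truncated HMAC accepted:', safeCompareHex(hmacHex(payload, demoSecret), receivedHmac.slice(0, 10)));
```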

Setting up the Express APIs

Now that all the controllers and utility functions are in place, we'll create a few endpoints to facilitate communication between Boost Inc. and the Account Aggregator.

Boost:

import express, { Request, Response } from 'express';
import { hmacDigest, shareKeys } from '@boost/v1/controllers/verification.controller';
import { KeyPair, generateSharedSecret } from '@boost/v1/utils/crypto.utils';
import { DiffieHellman } from 'crypto';
import axios from 'axios';

const appBoost = express();
const PORT_BOOST = 3000;

let boostPublicKey: Buffer, boostPrivateKey: Buffer;
let boostGenerator: Buffer, boostPrime: Buffer;
let sharedSecret: Buffer;
let boostKeyPair: KeyPair, boostDiffieHellman: DiffieHellman;

appBoost.get('/init', async (req: Request, res: Response) => {
    ({ boostPublicKey, boostPrivateKey, boostGenerator, boostPrime, boostDiffieHellman } = shareKeys());
    res.send({ boostPublicKey, boostGenerator, boostPrime });
});

// Simulated Data
const data = {
    name: 'Boost User 1',
    phone: '1234567890',
};

appBoost.get('/fetchData', async (req: Request, res: Response) => {
    const hmac = hmacDigest(data, sharedSecret);
    res.send({ data, hmac });
});

appBoost.listen(PORT_BOOST, () => {
  console.log(`Boost server is running on http://localhost:${PORT_BOOST}`);
});

The /init endpoint, hosted by Boost, is invoked by AA within its sendAAPublicKey function. When the shared secret is calculated, AA will invoke the endpoint /fetchData to retrieve the critical information.

Account Aggregator (AA):

import express, { Request, Response } from 'express';
import { sendAAPublicKey, verifyData } from '@AA/v1/controllers/aa.controller';
import { KeyPair, generateSharedSecret } from '@boost/v1/utils/crypto.utils';
import { DiffieHellman } from 'crypto';
import axios from 'axios';

const appAA = express();
const PORT_AA = 3001;

let boostPublicKey: Buffer, boostPrivateKey: Buffer;
let boostGenerator: Buffer, boostPrime: Buffer;
let AAPublicKey: Buffer;
let sharedSecret: Buffer;
let boostKeyPair: KeyPair, boostDiffieHellman: DiffieHellman;

appAA.get('/fetchAAPublicKey', async (req: Request, res: Response) => {
    AAPublicKey = await sendAAPublicKey();
    res.send({ AAPublicKey: AAPublicKey.toString('hex') });

    boostKeyPair = {
        publicKey: boostPublicKey,
        privateKey: boostPrivateKey,
        generator: boostGenerator,
        prime: boostPrime,
        diffieHellman: boostDiffieHellman,
    }

    sharedSecret = generateSharedSecret(boostKeyPair, AAPublicKey);
});

appAA.get('/verifyData', async (req: Request, res: Response) => {
    const response = await axios.get('http://localhost:3000/fetchData');
    const { data, hmac } = response.data;
    const verified = await verifyData(data, hmac);
    res.send({ verified });
});

appAA.listen(PORT_AA, () => {
  console.log(`AA server is running on http://localhost:${PORT_AA}`);
});

The /fetchAAPublicKey endpoint, hosted at AA's end, will be invoked by Boost when it wants to evaluate the shared secret. The verifyData method is wrapped in a GET request, enabling either party to confirm the integrity of the transmitted data.

Invoking the APIs

Head over to your Postman Client to test out these APIs. Since the sendAAPublicKey method takes care of the initiation, we need to start our communication using the /fetchAAPublicKey endpoint:

Postman Client: fetchAAPublicKey Endpoint

You will observe the AA's public key as a response. Now, Boost Inc. will use this Public Key and evaluate the Shared Secret.

Once that is done, it will use the Shared Secret to compute the message digest in the /fetchData endpoint. Since /verifyData invokes the former endpoint, we'll check this in action on our Postman Client:

Postman Client: verifyData Endpoint

You will notice that the /verifyData response declares the successful verification of both integrity and authenticity. This acknowledgment ensures that the transmitted data remains untampered and originates from the authenticated source, providing a layer of security for communication between the two entities.
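To see the whole exchange without the HTTP layer, here is a single-process sketch of the same flow using only Node's built-in crypto module. It is an illustration under assumptions, not the article's implementation: the 'modp14' group stands in for whatever prime and generator Boost actually produces, and SHA-256 is assumed as the HMAC hash.

```javascript
const crypto = require('crypto');

// Boost chooses the group parameters (p, g) and generates its key pair.
// 'modp14' is a predefined 2048-bit group, used here only to keep the demo fast.
const boost = crypto.getDiffieHellman('modp14');
const boostPublicKey = boost.generateKeys();

// AA rebuilds the same group from the p and g it would receive from /init.
const aa = crypto.createDiffieHellman(boost.getPrime(), boost.getGenerator());
const aaPublicKey = aa.generateKeys();

// Each side combines its own private key with the other side's public key.
const boostSecret = boost.computeSecret(aaPublicKey);
const aaSecret = aa.computeSecret(boostPublicKey);

// Boost signs the payload; AA recomputes the HMAC with its copy of the secret.
const payload = JSON.stringify({ name: 'Boost User 1', phone: '1234567890' });
const sentHmac = crypto.createHmac('sha256', boostSecret).update(payload).digest();
const recomputedHmac = crypto.createHmac('sha256', aaSecret).update(payload).digest();
const verified = crypto.timingSafeEqual(sentHmac, recomputedHmac);

// A payload changed in transit no longer matches the HMAC that was sent.
const tamperedHmac = crypto.createHmac('sha256', aaSecret).update(payload + ' ').digest();
const tamperedAccepted = crypto.timingSafeEqual(sentHmac, tamperedHmac);

console.log({ secretsMatch: boostSecret.equals(aaSecret), verified, tamperedAccepted });
```

Both parties arrive at the same secret without ever sending it over the wire, which is what lets the HMAC vouch for both integrity and origin.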

Wrapping Up

And there you have it: by utilizing HMACs and the Diffie-Hellman-Merkle Key Exchange, you can verify the integrity and authenticity of your transmitted data, enhancing the security of your applications and ensuring a reliable API communication framework for developers.

By understanding the intricacies and mathematical underpinnings of these practices, you can now make informed decisions, fortifying your system against tampering threats.

Find the complete code snippets here — GitHub Gist | HamdaanAliQuatil.
You may find me on X (formerly Twitter) – Hamdaan Ali Quatil.

References

[1] Behrouz A. Forouzan – Introduction to Cryptography and Network Security

[2] Whitfield Diffie and Martin E. Hellman – New Directions in Cryptography, diffie.hellman.pdf (jhu.edu)

[3] Mihir Bellare, Ran Canetti, and Hugo Krawczyk – Keying Hash Functions for Message Authentication, https://cseweb.ucsd.edu/~mihir/papers/kmd5.pdf

How to Defend Against Server-Side Request Forgery

Hamdaan Ali — Fri, 05 Jan 2024 17:21:50 +0000

Server-Side Request Forgery (SSRF) has been a consistent issue in application security and is among the OWASP Top 10 vulnerabilities.

In this walkthrough, you'll first learn what Server-Side Request Forgery is and how it differs from Cross-Site Request Forgery (CSRF). We will create a sample application to gain a better understanding of how Server-Side Request Forgery attacks work, and explore various methods to safeguard our application against SSRF vulnerabilities.

Prerequisites

Node and Express: We'll create a JavaScript sample application using the Express framework. A basic understanding of the framework would be helpful. You will need the Node Runtime Environment to execute the scripts.
Postman Client: To make an API request and to exploit the vulnerability, you will need a tool to make HTTP Requests. You may use your web browser's "Edit and Send" feature under the Networks tab, but since not all browsers allow this, it's best to use a tool like Postman which provides a better UI to observe responses.

What is Server-Side Request Forgery?

Server-Side Request Forgery, or SSRF, is a security vulnerability that allows malicious actors to manipulate a server into making unintended requests on their behalf.

SSRF provides a window for such malicious actors to make requests "from" the server when they should be making requests "to" the server.

To appreciate what this means, let's look at a normal request execution using the sequence diagrams below:

UML Sequence Diagram for normal request execution

In a typical scenario, a server processes incoming requests from clients. Users or external systems initiate these requests, and the server responds accordingly. This is a standard client-server interaction where the server acts upon the requests it receives.

Now let's look at what SSRF looks like:

UML Sequence Diagram for SSRF attacks

In applications vulnerable to SSRF, attackers exploit the server's ability to make HTTP requests to resources that should not be directly accessible from the public internet. These resources may include internal protected resources, APIs, websites, or databases that can only be accessed from the server.

Attackers achieve this by tricking the server into making unintended requests to various destinations, including internal APIs, internal HTML pages, and internal databases.

How Does SSRF Differ from CSRF?

SSRF is an attack where an attacker can make the server perform requests on their behalf. This involves manipulating the server to make requests to internal resources, which can result in unauthorized actions or information disclosure.

On the other hand, in CSRF, or Cross-Site Request Forgery, the attacker tricks a user's browser into making unintended requests to a web application with which the user is already authenticated. This means that actions are performed on behalf of the user without their consent.

Backend developers must be aware of SSRF to build secure applications. CSRF, in contrast, abuses the user's browser session, so defending against it also involves how the front end and the server handle cookies and request tokens.

Identifying Code Smells

SSRF attacks often occur when web applications mishandle user-controlled input and make network requests based on inadequately sanitized data. Processing unsanitized URLs in API requests is a common entry point for SSRF attacks.

Another common sign of an SSRF vulnerability is XML parsing that occurs without adequate validation of external entities. Applications that fail to validate and secure their XML parsers properly may inadvertently expose themselves to SSRF risks.

In this walkthrough, you will make a server that takes a URL and uses it to make network requests without proper validation and sanitization. You will then see ways to mitigate this issue.

Understanding the Pain Points

To better understand the issue of SSRF attacks, let's create a sample application using Express and JavaScript. Below is a Mermaid sequence diagram that explains what the code base does:

UML Sequence Diagram for the sample application

We will create an Express app with two endpoints — /fetch, a GET request designed to fetch content from a specified URL, and /admin, another GET request, which is an internal API within the organization that accesses an internally protected resource.

We will discover a security vulnerability associated with Server-Side Request Forgery (SSRF) in implementing the first GET request.

We will also create another helper function at the /uploads endpoint to allow our clients to fetch and view their recently uploaded content.

Project Setup

To get started, let's quickly set up our repository and install all the required packages. In the root of your workspace, install Express and Axios using the following command:

npm init -y && npm i axios express

Executing this command will create a package.json file with default settings and install the specified packages.

To simulate the internal protected resource, let's create a data.json in the root of your workspace:

{   
    "name": "Hamdaan Ali Quatil",
    "password": "violinblackeye"
}

Now, create a file called app.js in the root of your repository. Here, we will define all of our endpoints. Import all required packages like this:

const express = require('express');
const axios = require('axios');
const fs = require('fs').promises;
const path = require('path');

We use the fs (File System) module to interact with the local file system, and the path module to build file paths inside the uploads folder. Within the Express application, we use fs.promises to read and write files. The fetchPrivateResource function asynchronously reads the contents of the data.json file, which is an internal resource.

Let's create an instance of the Express app to handle HTTP requests and define the fetchPrivateResource method. In the sample application, only the admin should be able to fetch this internal resource, but you will observe how a malicious actor can access this using an SSRF attack.

const app = express();
const port = 3000;

// Function to fetch private resource
const fetchPrivateResource = async () => {
  try {
    const content = await fs.readFile('data.json', 'utf-8');
    return content;
  } catch (error) {
    console.error('Error reading private resource:', error.message);
    throw error;
  }
};

The Fetch Endpoint

Now, let's define our first endpoint, /fetch which expects a query parameter url containing the target URL. Upon receiving a request, the server uses the Axios library to make a GET request to the specified URL.

app.get("/fetch", async (req, res) => {
  const url = req.query.url;

  try {
    const response = await axios.get(url);
    const responseData = JSON.stringify(response.data);

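    // Make sure the uploads folder exists before writing to it
    const uploadsDir = path.join(__dirname, "uploads");
    await fs.mkdir(uploadsDir, { recursive: true });
    const textFilePath = path.join(uploadsDir, "upload-data.txt");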

    await fs.writeFile(textFilePath, responseData, "utf-8");

    res.send("Upload Successful");
  } catch (error) {
    console.error("Error:", error.message);
    res.status(500).send("Internal Server Error");
  }
});

The axios.get method is used to perform the HTTP GET request, and the response data is then converted to a JSON string. The resulting string is written to a text file named upload-data.txt in the uploads folder of the server. Finally, a success message or an error message is sent back to the client, depending on the outcome of the operation.

The Uploads Endpoint

With that done, let's create an endpoint to allow users to access and verify their uploaded files. The server will check if the requested file exists, and if so, it sends the file to the client. When a file cannot be found, the server returns a 404 error.

app.get("/uploads/:filename", async (req, res) => {
  const filename = req.params.filename;
  const filePath = path.join(__dirname, "uploads", filename);
  console.log(filePath);

  try {
    // Check if file exists
    await fs.access(filePath);

    // If file exists, send it to the client
    res.sendFile(filePath);

  } catch (error) {
    res.status(404).send("File not found: " + error);
  }
});

The Admin Endpoint

Now, we need to make an internal API – the /admin route – which is intentionally shielded from public access. The objective is to ensure this API is only accessible from localhost or the local machine (127.0.0.1).

We can do this by implementing a middleware that acts as a protective barrier, permitting requests to proceed to the /admin route only if they originate from the local host.

The middleware checks whether the req.hostname property, which represents the hostname specified in the HTTP request, matches localhost or 127.0.0.1. If the request is from a different host, the middleware responds with a 403 Forbidden status, thereby restricting access.

// middleware to protect admin API
app.use('/admin', async (req, res, next) => {
  const isLocalhost = req.hostname === 'localhost' || req.hostname === '127.0.0.1';

  if (isLocalhost) {
    next();
  } else {
    res.status(403).send('Forbidden');
  }
});

// Route to access the admin API
app.get('/admin', async (req, res) => {
  try {
    const content = await fetchPrivateResource();
    res.send(content);
  } catch (error) {
    res.status(500).send('Internal Server Error');
  }
});

Once all routes are configured, we start the server using the app.listen method, and it begins listening on port 3000 for incoming requests.

app.listen(port, () => {
  console.log(`Server is running on http://localhost:${port}`);
});

With our app.js now set up to process incoming requests, let's run the sample application using nodemon:

npm i -D nodemon && npx nodemon app.js

The server has started on port 3000. Now, we are ready to test our sample application and look for code smells that may lead to SSRF attacks. You can find the complete code here: GitHub Gist | HamdaanAliQuatil.

How to Exploit the Vulnerability

Let's try to make a GET request to the fetch API. We are simulating the process of uploading a text file using the URL to the file. In this demonstration, we will fetch the contents of an example file and save it on our servers. Here is the link to the text file.

Open your Postman Client and execute a GET request with the URL http://localhost:3000/fetch?url=https://example-files.online-convert.com/document/txt/example.txt. We are adding the link to the file as a Query Parameter in the /fetch endpoint. When you hit send, you will see a response "Upload Successful".

Postman Client: Fetch Endpoint

You'll see that your repository now has a newly created file in the uploads directory. Clients can now access their uploaded information using the /uploads API endpoint to view their files.

Postman Client: Uploads Endpoint

Now, let's send a malicious request by changing our query parameter to http://127.0.0.1:3000/admin in the same request to the /fetch endpoint. The updated URL will now look like this: http://localhost:3000/fetch?url=http://127.0.0.1:3000/admin.

In the Query parameter, 127.0.0.1 is a Loopback Address. A loopback address is a reserved IP address used to establish network connections with the same host (the local machine) for testing and communication within the device.

The malicious actor is attempting to make a request to the server's /admin route from the server itself, using the loopback address. This simulates an internal resource access scenario.

Postman Client: Admin Endpoint

You'll notice that an "Upload Successful" message comes back as the response to this request. Now try accessing your uploaded file again using the GET request at the /uploads endpoint.

Postman Client: Uploads Endpoint

You'll see that the contents of the uploaded file have changed: the file now holds the contents of data.json, the internal resource behind the /admin route. This demonstrates a successful SSRF (Server-Side Request Forgery) attack, where a malicious actor took advantage of the server's ability to make requests to itself.

The file, which initially contained the example text, now exposes the protected credentials. This showcases the potential for unauthorized access to, and manipulation of, sensitive information through SSRF exploits.

How to Defend Against SSRF Attacks

Now, let's look at the ways we can fix our application's vulnerability to SSRF. The most intuitive solution is to never accept a URL from the client, and to have the server construct any URL it needs itself. This is certainly the most powerful defense.

But many times, allowing URLs in your business logic becomes an absolute necessity. In such cases, our goal is to prevent the attack or at least reduce the risk if an attack occurs.

If you really must allow a URL as it is, here are some precautionary steps you can take:

Sanitization and Validation

As with most vulnerabilities, a pain-point in SSRF attacks is the use of untrusted data. Always treat any data coming from the client side as untrusted.

Sanitizing and validating the client-supplied data should go a long way to defend against SSRF attacks. A very intuitive validation is to restrict any URL containing localhost or the loopback address.

Let's create a helper function isValidUrl and call it in the function for the /fetch endpoint.

function isValidUrl(url) {
  // Restrict URLs to HTTP and HTTPS. This blocks FTP and other protocols
  const validUrlRegex = /^https?:\/\/\S+$/;

  if (!validUrlRegex.test(url)) {
    return false;
  }

  try {
    const parsedUrl = new URL(url);

    // Check if the host is localhost or a loopback IP address
    const isLocalhost = parsedUrl.hostname === 'localhost';
    const isLocalIP = /^127\.\d+\.\d+\.\d+$/.test(parsedUrl.hostname);

    return !(isLocalhost || isLocalIP);
  } catch (error) {
    return false;
  }
}

Your updated function for /fetch endpoint should look like this:

app.get("/fetch", async (req, res) => {
  const url = req.query.url;

  if (!isValidUrl(url)) {
    res.status(400).send("Loopback URLs are not allowed");
    return;
  }

  try {
    ...
    res.send("Upload Successful");
  } catch (error) {
    ...
  }
});

Now, go back to the Postman Client and resend the malicious request. You will observe that the previously uploaded file has not been tampered with, and that you receive "Loopback URLs are not allowed" in the response.

Whitelisting via an Allow List

You may create an allow list that permits only certain trusted hosts, URL schemes, and ports. Let's implement an allow list and improve our isValidUrl function:

// Entries must be lowercase: the URL parser lowercases parsedUrl.hostname
const whitelist = ["boost.com", "boost.in", "trusteddomain3.com"];
const allowedPorts = ['80', '443'];

Now use your declared whitelist in the isValidUrl function:

function isValidUrl(url) {
  try {
    const parsedUrl = new URL(url);

    if (!whitelist.includes(parsedUrl.hostname)) {
      return false;
    }

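    // parsedUrl.port is '' when the URL uses its scheme's default port
    const defaultPorts = { 'http:': '80', 'https:': '443' };
    const port = parsedUrl.port || defaultPorts[parsedUrl.protocol];
    if (!allowedPorts.includes(port)) {
      return false;
    }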

    return true;
  } catch (error) {
    return false;
  }
}

Notice how we've removed the need for regex. This brings us to another mitigation technique that you must avoid:

Don't Use a Deny List

You must never mitigate SSRF vulnerabilities using a deny list or regex. Restricting the use of IP Addresses is not straightforward. To understand why we must avoid a deny list, look at the following example.

A loopback address is typically written as 127.0.0.1. It's quite easy to spot this address and reject it. But a problem arises when a malicious request uses another form of the loopback address that also points to the local machine. For example, 127.1, ::1, localhost, and ::ffff:7f00:1 all point to the local machine.

A regular expression that catches all such variations is much more complex. Malicious actors can easily bypass a deny list by passing an octal or decimal encoding of the IP address.
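The problem can be illustrated with a deliberately naive deny list that inspects the raw URL string. This is a hypothetical check written for demonstration, not the isValidUrl function from earlier. Every URL in the list below is another way of writing the loopback address, yet none of them contains the strings the pattern looks for.

```javascript
// Hypothetical naive deny list: rejects raw URLs that mention the usual loopback names.
const denyPattern = /localhost|127\.0\.0\.1/i;

function passesNaiveDenyList(rawUrl) {
  return !denyPattern.test(rawUrl);
}

// Alternative spellings of the loopback address.
const loopbackVariants = [
  'http://127.1:3000/admin',       // shortened dotted form
  'http://2130706433:3000/admin',  // single decimal integer (127 * 2^24 + 1)
  'http://0x7f.0.0.1:3000/admin',  // hexadecimal first octet
  'http://[::1]:3000/admin',       // IPv6 loopback
];

const bypasses = loopbackVariants.filter(passesNaiveDenyList);

console.log(passesNaiveDenyList('http://127.0.0.1:3000/admin')); // the obvious form is caught
console.log(bypasses); // every variant slips through
```

Parsing the URL before checking it helps with some of these forms, but the list of disguises keeps growing (DNS names that resolve to internal addresses are another), which is why an allow list is the safer default.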

Enforce a URL Scheme

Without this measure, a client might send requests that use protocols other than the intended ones. To replace our validUrlRegex, we will use an allowedSchemes list and restrict our application to process requests only when the protocol is http: or https:. Rejecting requests that use protocols such as file: and ftp: safeguards our sample application.

const allowedSchemes = ['http:', 'https:'];

The updated isValidUrl function will look like this:

function isValidUrl(url) {
  try {
    const parsedUrl = new URL(url);

    if (!whitelist.includes(parsedUrl.hostname)) {
      return false;
    }

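    // parsedUrl.port is '' when the URL uses its scheme's default port
    const defaultPorts = { 'http:': '80', 'https:': '443' };
    const port = parsedUrl.port || defaultPorts[parsedUrl.protocol];
    if (!allowedPorts.includes(port)) {
      return false;
    }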

    if (!allowedSchemes.includes(parsedUrl.protocol)) {
      return false;
    }

    return true;
  } catch (error) {
    return false;
  }
}
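When comparing against allow lists like these, it helps to know exactly what the URL parser reports for each component. The sketch below uses a hypothetical helper, describeUrl, to print the values an allow list would see. Note in particular that the hostname comes back lowercased and that the port is an empty string whenever the URL uses the default port for its scheme.

```javascript
// Hypothetical helper: returns the URL components an allow list would compare.
function describeUrl(rawUrl) {
  const parsed = new URL(rawUrl);
  return {
    protocol: parsed.protocol, // includes the trailing colon, e.g. 'https:'
    hostname: parsed.hostname, // lowercased by the parser
    port: parsed.port,         // '' when the scheme's default port is used
  };
}

const implicitPort = describeUrl('https://Boost.com/reports');
const explicitDefaultPort = describeUrl('https://boost.com:443/reports');
const customPort = describeUrl('http://boost.com:8080/reports');

console.log(implicitPort);
console.log(explicitDefaultPort);
console.log(customPort);
```

A port allow list therefore needs to account for the empty string, either by listing it or by mapping it to the scheme's default port before the comparison.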

Disable Redirects

Redirects are a mechanism used by web applications to forward a user's browser from one URL to another. If a server follows redirects automatically, an attacker could exploit this behavior to make the server inadvertently access internal resources, leading to data exposure or unauthorized actions.

To restrict redirects in Axios, pass in an Axios Configuration object in the second parameter:

const response = await axios.get(url, { maxRedirects: 0 });

To learn more about Axios Config, check this guide: Axios | Request Config.
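If your use case genuinely needs to follow a redirect, one option is to keep automatic redirects disabled and validate each hop yourself before requesting it. The sketch below shows only the validation step. The names nextHop and isAllowedUrl are hypothetical, and the allow list is a stand-in for whichever validation function your application uses.

```javascript
// Stand-in validator for this sketch: only allow-listed hosts over http(s).
const allowedHosts = ['boost.com'];

function isAllowedUrl(rawUrl) {
  try {
    const parsed = new URL(rawUrl);
    return ['http:', 'https:'].includes(parsed.protocol) && allowedHosts.includes(parsed.hostname);
  } catch (error) {
    return false;
  }
}

// Hypothetical helper: resolves a Location header against the current URL
// and returns the next URL only if it passes validation, otherwise null.
function nextHop(currentUrl, locationHeader) {
  try {
    const target = new URL(locationHeader, currentUrl).toString();
    return isAllowedUrl(target) ? target : null;
  } catch (error) {
    return null;
  }
}

console.log(nextHop('https://boost.com/a', '/b'));                          // relative redirect on the same host
console.log(nextHop('https://boost.com/a', 'http://127.0.0.1:3000/admin')); // redirect to an internal address
```

The design choice here is that a redirect target is treated exactly like a fresh client-supplied URL: it earns no trust from the fact that an allowed host issued it.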

Send Filtered Data to the Client

Avoid sending raw response bodies directly from your server to the client. Ensure that the responses reaching the client are carefully curated and conform to expected formats.

By implementing this practice, you shield your application from potential security vulnerabilities associated with exposing unfiltered or sensitive information. Always validate, filter, and format responses to align with your application's anticipated data structures.
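As a rough illustration of response filtering, the sketch below copies only an explicit set of fields from an upstream payload before anything is returned to the client. The helper name pickAllowedFields is hypothetical, and the sample payload simply reuses the shape of data.json from earlier.

```javascript
// Hypothetical helper: returns a new object containing only the allowed fields.
function pickAllowedFields(payload, allowedFields) {
  const filtered = {};
  for (const field of allowedFields) {
    if (Object.prototype.hasOwnProperty.call(payload, field)) {
      filtered[field] = payload[field];
    }
  }
  return filtered;
}

// Sample upstream payload, shaped like data.json from the sample application.
const upstreamPayload = { name: 'Hamdaan Ali Quatil', password: 'violinblackeye' };

// Only the fields the client is expected to see are passed along.
const clientSafePayload = pickAllowedFields(upstreamPayload, ['name']);

console.log(clientSafePayload);
```

Even if an attacker manages to point a request at an internal resource, a filter like this limits what comes back to the fields you explicitly expected.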

Wrapping Up

And there you have it: by implementing a few well-established methodologies and best practices, you can effectively detect and mitigate SSRF attacks in your applications and create secure APIs as developers.

Find the complete code snippets here — GitHub Gist | HamdaanAliQuatil.
You may find me on X (formerly Twitter) - Hamdaan Ali Quatil.

Hamdaan Ali - freeCodeCamp.org

How to Elevate Your Database Game: Supercharging Query Performance with Postgres FDW

Table of Contents

Prerequisites

Executive Summary

Motivation

FDW Basics Without the Setup Tax

SQL/MED in a nutshell

What postgres_fdw does and does not do

Pushdown Mechanics

What “shippable” means in practice

WHERE pushdown

Join pushdown conditions

Shippability decision tree

Shippable Operations: a Deep Dive

Filters (WHERE clauses)

Joins

Aggregates (GROUP BY, COUNT, SUM, and so on)

ORDER BY and LIMIT

DISTINCT

Window functions

Version differences

Pushdown Blockers and Why They Exist

Non‑immutable functions

Type and collation mismatches

Cross‑server joins

Mixed local and foreign joins

Remote session settings and search paths

Troubleshooting matrix

Reading EXPLAIN Like a Pro

Inspect the Foreign Scan nodes

Recognize InitPlan vs SubPlan

Understand CTE materialization

Annotated example

How to Tune postgres_fdw

fetch_size

use_remote_estimate

fdw_startup_cost and fdw_tuple_cost

ANALYZE and analyze_sampling

extensions

A quick knob impact table

Schema and Index Recommendations

Benchmarking Methodology

Monitoring and Logging

Local metrics

Remote metrics

Case Study: Refactoring a Keycloak Coverage Query

Symptom

Diagnosis

Refactor

Why it improved

Key takeaway

Checklist and Troubleshooting Guide

Case Study Takeaways

Advanced Operations: A Deeper Dive into Shippability

Filters and simple predicates

WHERE clauses matter more than you think

Complex expressions are not automatically unsafe

Array and JSON operators

Joins: the good, the bad, and the ugly

Same‑server joins are your friend

Cross‑server joins break pushdown

Mixed local/foreign joins are tricky

Join conditions matter

Aggregates and grouping

ORDER BY, LIMIT, and DISTINCT

Window functions

Version‑specific quirks

Common Anti‑Patterns and How to Avoid Them

Using volatile functions in predicates

Joining local and foreign data first

Cross‑server joins without materialization

Complex expressions on join keys

Ignoring collation and type mismatches

Extending Tuning: Calibrating Cost Models

Balancing fetch size and memory

Remote estimates vs. local estimates

Tuning cost parameters

When to analyze foreign tables

Further Case Studies and Practical Examples