<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ indexing - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ indexing - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Mon, 25 May 2026 10:49:15 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/indexing/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How Database Indexes Work – A Practical Guide with PostgreSQL Examples ]]>
                </title>
                <description>
                    <![CDATA[ Every developer eventually runs into a slow query. The table has grown from a few hundred rows to a few million, and what used to take milliseconds now takes seconds — or worse. The fix, more often th ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-database-indexes-work-a-practical-guide-with-postgresql-examples/</link>
                <guid isPermaLink="false">69e11c10ffbb787634dea035</guid>
                
                    <category>
                        <![CDATA[ Databases ]]>
                    </category>
                
                    <category>
                        <![CDATA[ PostgreSQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ indexing ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ iyiola ]]>
                </dc:creator>
                <pubDate>Thu, 16 Apr 2026 17:27:44 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/cf6919a4-f803-4783-83ff-5c7674141c55.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Every developer eventually runs into a slow query. The table has grown from a few hundred rows to a few million, and what used to take milliseconds now takes seconds — or worse.</p>
<p>The fix, more often than not, is an index.</p>
<p>A database index is a data structure that helps the database find rows faster without scanning the entire table. It works a lot like the index at the back of a textbook: instead of reading every page to find a topic, you look it up in the index, get the page number, and go straight there.</p>
<p>In this tutorial, you'll learn how indexes work under the hood, how to create and use them effectively in PostgreSQL, and how to avoid the common mistakes that make indexes useless or even harmful.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a href="#heading-why-do-you-need-indexes">Why Do You Need Indexes?</a></p>
</li>
<li><p><a href="#heading-how-indexes-work-under-the-hood">How Indexes Work Under the Hood</a></p>
</li>
<li><p><a href="#heading-how-to-create-your-first-index">How to Create Your First Index</a></p>
</li>
<li><p><a href="#heading-how-to-use-explain-analyze-to-measure-performance">How to Use EXPLAIN ANALYZE to Measure Performance</a></p>
</li>
<li><p><a href="#heading-types-of-indexes-in-postgresql">Types of Indexes in PostgreSQL</a></p>
</li>
<li><p><a href="#heading-how-to-create-a-composite-index">How to Create a Composite Index</a></p>
</li>
<li><p><a href="#heading-how-to-create-a-partial-index">How to Create a Partial Index</a></p>
</li>
<li><p><a href="#heading-how-to-create-an-expression-index">How to Create an Expression Index</a></p>
</li>
<li><p><a href="#heading-how-to-create-a-unique-index">How to Create a Unique Index</a></p>
</li>
<li><p><a href="#heading-how-to-manage-indexes">How to Manage Indexes</a></p>
</li>
<li><p><a href="#heading-when-indexes-hurt-instead-of-help">When Indexes Hurt Instead of Help</a></p>
</li>
<li><p><a href="#heading-common-mistakes-that-prevent-index-usage">Common Mistakes That Prevent Index Usage</a></p>
</li>
<li><p><a href="#heading-best-practices-for-indexing">Best Practices for Indexing</a></p>
</li>
<li><p><a href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>To follow along with the examples, you'll need:</p>
<ul>
<li><p>Basic knowledge of SQL (SELECT, INSERT, UPDATE, DELETE, WHERE, JOIN)</p>
</li>
<li><p>A running PostgreSQL instance (version 12 or later)</p>
</li>
<li><p>A SQL client like <code>psql</code>, pgAdmin, or DBeaver</p>
</li>
</ul>
<p>If you don't have PostgreSQL installed locally, you can use a free cloud-hosted instance from services like <a href="https://neon.tech">Neon</a> or <a href="https://supabase.com">Supabase</a>.</p>
<h2 id="heading-why-do-you-need-indexes">Why Do You Need Indexes?</h2>
<p>When you run a query like <code>SELECT * FROM users WHERE email = 'jane@example.com'</code>, the database needs to find the matching row. Without an index, PostgreSQL performs a <strong>sequential scan</strong> — it reads every single row in the table and checks whether the <code>email</code> column matches.</p>
<p>For a table with 100 rows, this is fine. For a table with 10 million rows, it's painfully slow.</p>
<p>An index solves this by creating a separate, sorted data structure that maps column values to their row locations. Instead of scanning 10 million rows, PostgreSQL can look up the value in the index and jump directly to the matching row. This can reduce query time from seconds to milliseconds.</p>
<p>But indexes aren't free. They come with trade-offs you need to understand before adding them everywhere. You'll learn about those trade-offs throughout this tutorial.</p>
<h2 id="heading-how-indexes-work-under-the-hood">How Indexes Work Under the Hood</h2>
<p>PostgreSQL's default index type is the <strong>B-tree</strong> (balanced tree). Understanding how a B-tree works will help you make smarter decisions about when and how to index.</p>
<p>A B-tree organizes data into a sorted, hierarchical structure with three levels:</p>
<ol>
<li><p><strong>Root node</strong> — the top of the tree. It holds a few values that divide the data into broad ranges.</p>
</li>
<li><p><strong>Internal nodes</strong> — each one further narrows down the range.</p>
</li>
<li><p><strong>Leaf nodes</strong> — the bottom level. These hold the actual indexed values along with pointers to the corresponding rows in the table.</p>
</li>
</ol>
<p>When PostgreSQL uses a B-tree index to find a value, it starts at the root and follows the path that matches the target value, moving through internal nodes until it reaches the correct leaf node. This path is called a <strong>tree traversal</strong>, and it typically requires only 3–4 steps even for tables with millions of rows.</p>
<p>Think of it like a phone book. You don't start at page one and read every name. You open to roughly the right section (root), narrow it down to the right page (internal nodes), and scan the entries on that page (leaf node).</p>
<p>This sorted structure is also why B-tree indexes work well for range queries like <code>WHERE price &gt; 50 AND price &lt; 100</code>. The database finds the starting point in the tree and then scans forward through the leaf nodes, which are already in order.</p>
<h2 id="heading-how-to-create-your-first-index">How to Create Your First Index</h2>
<p>Let's build a practical example. You'll create a table, load it with data, and see the difference an index makes.</p>
<h3 id="heading-step-1-create-the-table-and-insert-sample-data">Step 1 – Create the Table and Insert Sample Data</h3>
<pre><code class="language-sql">CREATE TABLE customers (
    id SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) NOT NULL,
    city VARCHAR(50),
    created_at TIMESTAMP DEFAULT NOW()
);
</code></pre>
<p>Now insert a large number of rows so the performance difference is visible. This generates 500,000 rows of sample data:</p>
<pre><code class="language-sql">INSERT INTO customers (first_name, last_name, email, city)
SELECT
    'User' || gs,
    'Last' || gs,
    'user' || gs || '@example.com',
    (ARRAY['Lagos', 'London', 'New York', 'Berlin', 'Tokyo'])[1 + (gs % 5)]
FROM generate_series(1, 500000) AS gs;
</code></pre>
<h3 id="heading-step-2-query-without-an-index">Step 2 – Query Without an Index</h3>
<pre><code class="language-sql">EXPLAIN ANALYZE
SELECT * FROM customers WHERE email = 'user250000@example.com';
</code></pre>
<p>You'll see output similar to this:</p>
<pre><code class="language-plaintext">Seq Scan on customers  (cost=0.00..11374.00 rows=1 width=52) (actual time=45.123..91.456 rows=1 loops=1)
  Filter: ((email)::text = 'user250000@example.com'::text)
  Rows Removed by Filter: 499999
Planning Time: 0.085 ms
Execution Time: 91.502 ms
</code></pre>
<p>The key detail here is <code>Seq Scan</code> — PostgreSQL scanned all 500,000 rows to find a single match. It filtered out 499,999 rows. That's a lot of wasted work.</p>
<h3 id="heading-step-3-create-an-index">Step 3 – Create an Index</h3>
<pre><code class="language-sql">CREATE INDEX idx_customers_email ON customers (email);
</code></pre>
<p>This creates a B-tree index on the <code>email</code> column. The name <code>idx_customers_email</code> follows a common naming convention: <code>idx_</code> prefix, then the table name, then the column name.</p>
<h3 id="heading-step-4-query-with-the-index">Step 4 – Query With the Index</h3>
<p>Run the same query again:</p>
<pre><code class="language-sql">EXPLAIN ANALYZE
SELECT * FROM customers WHERE email = 'user250000@example.com';
</code></pre>
<p>Now you'll see something like this:</p>
<pre><code class="language-plaintext">Index Scan using idx_customers_email on customers  (cost=0.42..8.44 rows=1 width=52) (actual time=0.034..0.036 rows=1 loops=1)
  Index Cond: ((email)::text = 'user250000@example.com'::text)
Planning Time: 0.112 ms
Execution Time: 0.058 ms
</code></pre>
<p>The scan type changed from <code>Seq Scan</code> to <code>Index Scan</code>. The execution time dropped from ~91ms to ~0.06ms. That's roughly a 1,500x improvement — from one line of SQL.</p>
<h2 id="heading-how-to-use-explain-analyze-to-measure-performance">How to Use <code>EXPLAIN ANALYZE</code> to Measure Performance</h2>
<p><code>EXPLAIN ANALYZE</code> is your most important tool for understanding how PostgreSQL executes a query. You already saw it in the previous section, but let's break down what the output means.</p>
<pre><code class="language-sql">EXPLAIN ANALYZE SELECT * FROM customers WHERE city = 'Lagos';
</code></pre>
<p>The output will tell you several things:</p>
<ul>
<li><p><strong>Scan type</strong> — whether PostgreSQL used a sequential scan, index scan, bitmap index scan, or another access method</p>
</li>
<li><p><strong>Cost</strong> — the estimated cost in arbitrary units. The first number is the startup cost, the second is the total cost</p>
</li>
<li><p><strong>Rows</strong> — how many rows PostgreSQL estimated it would find versus how many it actually found</p>
</li>
<li><p><strong>Actual time</strong> — the real time in milliseconds to execute the query</p>
</li>
<li><p><strong>Rows Removed by Filter</strong> — how many rows were scanned but didn't match the condition</p>
</li>
</ul>
<p>If you see <code>Seq Scan</code> on a large table with a selective WHERE clause, that's usually a sign you need an index. If you see <code>Index Scan</code> or <code>Index Only Scan</code>, your index is working.</p>
<p>One thing to keep in mind: <code>EXPLAIN</code> without <code>ANALYZE</code> shows the plan without actually running the query. <code>EXPLAIN ANALYZE</code> runs the query and shows real timing data. Always use <code>EXPLAIN ANALYZE</code> when you're investigating performance, but be careful with it on destructive queries — <code>EXPLAIN ANALYZE DELETE FROM ...</code> will actually delete the rows. Wrap those in a transaction and roll back:</p>
<pre><code class="language-sql">BEGIN;
EXPLAIN ANALYZE DELETE FROM customers WHERE city = 'Berlin';
ROLLBACK;
</code></pre>
<h2 id="heading-types-of-indexes-in-postgresql">Types of Indexes in PostgreSQL</h2>
<p>PostgreSQL supports several index types, each optimized for different query patterns.</p>
<h3 id="heading-b-tree-default">B-tree (Default)</h3>
<p>B-tree is the default index type and covers the vast majority of use cases. It supports equality checks (<code>=</code>), range queries (<code>&lt;</code>, <code>&gt;</code>, <code>&lt;=</code>, <code>&gt;=</code>, <code>BETWEEN</code>), sorting (<code>ORDER BY</code>), and <code>IS NULL</code> / <code>IS NOT NULL</code> checks.</p>
<pre><code class="language-sql">-- These are equivalent – B-tree is the default
CREATE INDEX idx_name ON customers (last_name);
CREATE INDEX idx_name ON customers USING btree (last_name);
</code></pre>
<p>Use B-tree when you don't have a specific reason to use something else.</p>
<h3 id="heading-hash">Hash</h3>
<p>Hash indexes are optimized purely for equality comparisons (<code>=</code>). They don't support range queries or sorting. In practice, B-tree handles equality checks almost as fast, so hash indexes are rarely necessary.</p>
<pre><code class="language-sql">CREATE INDEX idx_email_hash ON customers USING hash (email);
</code></pre>
<p>Consider a hash index only if you have a very large table with frequent equality-only lookups and want to save a small amount of index space.</p>
<h3 id="heading-gin-generalized-inverted-index">GIN (Generalized Inverted Index)</h3>
<p>GIN indexes are designed for values that contain multiple elements — like arrays, JSONB documents, or full-text search vectors. Instead of indexing a single value per row, GIN indexes every element within the value.</p>
<pre><code class="language-sql">-- Add a JSONB column
ALTER TABLE customers ADD COLUMN preferences JSONB DEFAULT '{}';

-- Index the JSONB column
CREATE INDEX idx_preferences ON customers USING gin (preferences);

-- Now this query uses the GIN index
SELECT * FROM customers WHERE preferences @&gt; '{"newsletter": true}';
</code></pre>
<p>Use GIN when you're querying inside JSONB data, searching arrays with <code>@&gt;</code> or <code>&amp;&amp;</code>, or doing full-text search with <code>tsvector</code>.</p>
<h3 id="heading-gist-generalized-search-tree">GiST (Generalized Search Tree)</h3>
<p>GiST indexes support geometric data, ranges, and full-text search. They're commonly used with PostGIS for geospatial queries.</p>
<pre><code class="language-sql">-- Range type example
CREATE TABLE events (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    duration TSRANGE
);

CREATE INDEX idx_event_duration ON events USING gist (duration);

-- Find overlapping events
SELECT * FROM events WHERE duration &amp;&amp; '[2025-01-01, 2025-01-31]'::tsrange;
</code></pre>
<p>Use GiST when you're working with spatial data, range types, or need overlap/containment operators.</p>
<h3 id="heading-brin-block-range-index">BRIN (Block Range Index)</h3>
<p>BRIN indexes are extremely small and work well on large tables where the physical row order correlates with the indexed column's value. A common example is a timestamp column on an append-only table where new rows always have later timestamps.</p>
<pre><code class="language-sql">CREATE INDEX idx_created_at_brin ON customers USING brin (created_at);
</code></pre>
<p>BRIN stores summary information (min/max values) for each block of rows rather than indexing every row individually. This makes the index much smaller than a B-tree, but it only works well when the data is naturally ordered.</p>
<p>Use BRIN for very large, append-only tables with naturally ordered data — like logs, events, or time-series data.</p>
<h2 id="heading-how-to-create-a-composite-index">How to Create a Composite Index</h2>
<p>A composite index (also called a multi-column index) covers more than one column. It's useful when your queries frequently filter or sort by multiple columns together.</p>
<pre><code class="language-sql">CREATE INDEX idx_city_lastname ON customers (city, last_name);
</code></pre>
<p>The order of columns in a composite index matters. PostgreSQL can use this index for queries that filter on <code>city</code> alone, or on both <code>city</code> and <code>last_name</code>. But it <strong>can't</strong> efficiently use this index for queries that filter only on <code>last_name</code>.</p>
<p>Think of it like a phone book sorted by city first, then by last name within each city. You can easily look up everyone in Lagos. You can also look up everyone named "Adeyemi" in Lagos. But finding all people named "Adeyemi" across all cities requires scanning the whole book.</p>
<p>This principle is called the <strong>leftmost prefix rule</strong>: PostgreSQL can use a composite index for queries that include the leftmost column(s) of the index, but not for queries that skip them.</p>
<pre><code class="language-sql">-- ✅ Uses the index (matches leftmost column)
SELECT * FROM customers WHERE city = 'Lagos';

-- ✅ Uses the index (matches both columns, left to right)
SELECT * FROM customers WHERE city = 'Lagos' AND last_name = 'Adeyemi';

-- ❌ Cannot use this index efficiently (skips the leftmost column)
SELECT * FROM customers WHERE last_name = 'Adeyemi';
</code></pre>
<p>When deciding column order, place the most selective column first — the one that narrows down the results the most.</p>
<h2 id="heading-how-to-create-a-partial-index">How to Create a Partial Index</h2>
<p>A partial index covers only a subset of rows in a table. You define the subset with a WHERE clause in the index definition.</p>
<p>This is useful when you only query a specific portion of the data. For example, if you have an <code>orders</code> table and you frequently query for pending orders but rarely look at completed ones:</p>
<pre><code class="language-sql">CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    customer_id INT NOT NULL,
    status VARCHAR(20) NOT NULL DEFAULT 'pending',
    total NUMERIC(10, 2),
    created_at TIMESTAMP DEFAULT NOW()
);

-- Only index rows where status is 'pending'
CREATE INDEX idx_orders_pending ON orders (customer_id)
WHERE status = 'pending';
</code></pre>
<p>This index is smaller than a full index because it skips all rows that don't match the WHERE condition. Smaller indexes use less disk space, consume less memory, and are faster to maintain during writes.</p>
<p>For the index to be used, your query's WHERE clause must match the index's condition:</p>
<pre><code class="language-sql">-- ✅ Uses the partial index
SELECT * FROM orders WHERE status = 'pending' AND customer_id = 42;

-- ❌ Cannot use the partial index (different status)
SELECT * FROM orders WHERE status = 'shipped' AND customer_id = 42;
</code></pre>
<h2 id="heading-how-to-create-an-expression-index">How to Create an Expression Index</h2>
<p>Sometimes you need to index the result of a function or expression rather than a raw column value. Expression indexes (also called functional indexes) handle this.</p>
<p>A common scenario is case-insensitive email lookups. If your queries use <code>LOWER(email)</code>, a regular index on <code>email</code> won't help — PostgreSQL sees the function call as a different expression.</p>
<pre><code class="language-sql">-- Regular index on email – won't help with LOWER() queries
CREATE INDEX idx_email ON customers (email);

-- This query does NOT use the index above
SELECT * FROM customers WHERE LOWER(email) = 'user100@example.com';
</code></pre>
<p>To fix this, create an index on the expression itself:</p>
<pre><code class="language-sql">CREATE INDEX idx_email_lower ON customers (LOWER(email));
</code></pre>
<p>Now queries that use <code>LOWER(email)</code> in their WHERE clause will use this index:</p>
<pre><code class="language-sql">-- ✅ Uses the expression index
SELECT * FROM customers WHERE LOWER(email) = 'user100@example.com';
</code></pre>
<p>The rule is straightforward: the expression in your query must match the expression in the index exactly. If the index is on <code>LOWER(email)</code>, your query must also use <code>LOWER(email)</code>.</p>
<h2 id="heading-how-to-create-a-unique-index">How to Create a Unique Index</h2>
<p>A unique index guarantees that no two rows have the same value (or combination of values) in the indexed columns. It serves a dual purpose: it enforces data integrity and provides fast lookups.</p>
<pre><code class="language-sql">CREATE UNIQUE INDEX idx_customers_email_unique ON customers (email);
</code></pre>
<p>If you try to insert a duplicate value, PostgreSQL will reject the operation:</p>
<pre><code class="language-sql">INSERT INTO customers (first_name, last_name, email, city)
VALUES ('Test', 'User', 'user1@example.com', 'Lagos');
-- ERROR: duplicate key value violates unique constraint "idx_customers_email_unique"
</code></pre>
<p>You might wonder how this differs from a UNIQUE constraint. Under the hood, PostgreSQL implements UNIQUE constraints by creating a unique index. The two are functionally identical.</p>
<p>The difference is intent — a UNIQUE constraint expresses a data integrity rule, while a unique index explicitly focuses on query performance with uniqueness as a bonus.</p>
<h2 id="heading-how-to-manage-indexes">How to Manage Indexes</h2>
<p>As your database grows, you'll need to inspect, monitor, and maintain your indexes.</p>
<h3 id="heading-how-to-list-all-indexes-on-a-table">How to List All Indexes on a Table</h3>
<pre><code class="language-sql">SELECT
    indexname,
    indexdef
FROM pg_indexes
WHERE tablename = 'customers';
</code></pre>
<p>This shows the name and full definition of every index on the table.</p>
<h3 id="heading-how-to-check-index-size">How to Check Index Size</h3>
<pre><code class="language-sql">SELECT
    pg_size_pretty(pg_relation_size('idx_customers_email')) AS index_size;
</code></pre>
<p>For a broader view of all indexes and their sizes:</p>
<pre><code class="language-sql">SELECT
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE relname = 'customers'
ORDER BY pg_relation_size(indexrelid) DESC;
</code></pre>
<h3 id="heading-how-to-find-unused-indexes">How to Find Unused Indexes</h3>
<p>Indexes that are never used waste disk space and slow down writes. You can find them by checking <code>pg_stat_user_indexes</code>:</p>
<pre><code class="language-sql">SELECT
    indexrelname AS index_name,
    idx_scan AS times_used,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE relname = 'customers'
AND idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
</code></pre>
<p>If an index has <code>idx_scan = 0</code> after a reasonable period of normal usage, it's a candidate for removal. Just make sure to check across a full business cycle — some indexes are only used during monthly reports or seasonal operations.</p>
<h3 id="heading-how-to-drop-an-index">How to Drop an Index</h3>
<pre><code class="language-sql">DROP INDEX IF EXISTS idx_customers_email;
</code></pre>
<p>If you're dropping an index on a production table and want to avoid locking writes, use <code>CONCURRENTLY</code>:</p>
<pre><code class="language-sql">DROP INDEX CONCURRENTLY IF EXISTS idx_customers_email;
</code></pre>
<h3 id="heading-how-to-rebuild-an-index">How to Rebuild an Index</h3>
<p>Over time, indexes can become bloated as rows are inserted, updated, and deleted. You can rebuild an index to reclaim space:</p>
<pre><code class="language-sql">REINDEX INDEX idx_customers_email;
</code></pre>
<p>Or rebuild all indexes on a table:</p>
<pre><code class="language-sql">REINDEX TABLE customers;
</code></pre>
<p>On production systems, use <code>REINDEX CONCURRENTLY</code> (PostgreSQL 12+) to avoid locking the table:</p>
<pre><code class="language-sql">REINDEX INDEX CONCURRENTLY idx_customers_email;
</code></pre>
<h2 id="heading-when-indexes-hurt-instead-of-help">When Indexes Hurt Instead of Help</h2>
<p>Indexes aren't free. Every index you add comes with costs:</p>
<ol>
<li><p><strong>Write overhead</strong> — every INSERT, UPDATE, or DELETE must also update every index on the table. If a table has 10 indexes and you insert a row, PostgreSQL performs 11 write operations (one for the table and one for each index). On write-heavy tables, excessive indexes can significantly slow down data modification.</p>
</li>
<li><p><strong>Storage cost</strong> — indexes consume disk space. On large tables, indexes can take up as much space as the table itself, sometimes more. You can check this with <code>pg_relation_size</code>.</p>
</li>
<li><p><strong>Memory consumption</strong> — PostgreSQL caches frequently used indexes in memory. More indexes means more memory pressure, which can push useful data out of the cache and slow down other queries.</p>
</li>
<li><p><strong>Maintenance burden</strong> — indexes need periodic maintenance (vacuuming, reindexing) and add complexity to schema migrations.</p>
</li>
</ol>
<p>The question to ask is not "should I add an index?" but rather "does the read performance gain justify the write performance cost for this table's workload?"</p>
<h2 id="heading-common-mistakes-that-prevent-index-usage">Common Mistakes That Prevent Index Usage</h2>
<p>You can have the perfect index and PostgreSQL might still ignore it. Here are the most common reasons.</p>
<h3 id="heading-wrapping-the-indexed-column-in-a-function">Wrapping the Indexed Column in a Function</h3>
<pre><code class="language-sql">-- Index on email
CREATE INDEX idx_email ON customers (email);

-- ❌ PostgreSQL cannot use the index because of LOWER()
SELECT * FROM customers WHERE LOWER(email) = 'user1@example.com';

-- ✅ Fix: create an expression index on LOWER(email)
CREATE INDEX idx_email_lower ON customers (LOWER(email));
</code></pre>
<p>Any function applied to the indexed column in a WHERE clause prevents the standard index from being used. You need an expression index that matches the function.</p>
<h3 id="heading-implicit-type-casting">Implicit Type Casting</h3>
<pre><code class="language-sql">-- id is an INTEGER column with an index
-- ❌ Passing a string forces a type cast, which may prevent index usage
SELECT * FROM customers WHERE id = '42';

-- ✅ Use the correct type
SELECT * FROM customers WHERE id = 42;
</code></pre>
<p>When the query's value type doesn't match the column type, PostgreSQL may cast the column to match, which prevents index usage.</p>
<h3 id="heading-using-or-conditions-across-different-columns">Using OR Conditions Across Different Columns</h3>
<pre><code class="language-sql">-- ❌ OR across different columns can prevent index usage
SELECT * FROM customers WHERE email = 'user1@example.com' OR city = 'Lagos';

-- ✅ Rewrite as UNION for better index utilization
SELECT * FROM customers WHERE email = 'user1@example.com'
UNION
SELECT * FROM customers WHERE city = 'Lagos';
</code></pre>
<h3 id="heading-leading-wildcards-in-like-queries">Leading Wildcards in LIKE Queries</h3>
<pre><code class="language-sql">-- ❌ Leading wildcard cannot use a B-tree index
SELECT * FROM customers WHERE email LIKE '%@example.com';

-- ✅ Trailing wildcard CAN use a B-tree index
SELECT * FROM customers WHERE email LIKE 'user1%';
</code></pre>
<p>A B-tree index is sorted from left to right. A leading wildcard (<code>%something</code>) means the database can't use the sorted structure and falls back to a sequential scan. If you need to search by suffix or substring, consider a GIN index with the <code>pg_trgm</code> extension.</p>
<h3 id="heading-low-selectivity">Low Selectivity</h3>
<p>If a column has very few distinct values relative to the number of rows (low selectivity), PostgreSQL may decide a sequential scan is faster than using the index.</p>
<p>For example, if a <code>status</code> column has only three possible values (<code>'pending'</code>, <code>'shipped'</code>, <code>'delivered'</code>) and each value covers roughly a third of the table, an index on <code>status</code> alone provides little benefit. PostgreSQL would still need to read a large portion of the table, and the extra index lookup adds overhead.</p>
<p>A partial index is often the better solution in these cases.</p>
<h2 id="heading-best-practices-for-indexing">Best Practices for Indexing</h2>
<p>Here's a summary of the key principles to follow:</p>
<ol>
<li><p><strong>Index columns that appear in WHERE, JOIN, and ORDER BY clauses.</strong> These are the columns the database needs to search, match, or sort by. Start with the queries that run most frequently or take the longest.</p>
</li>
<li><p><strong>Measure before and after with EXPLAIN ANALYZE.</strong> Never add an index based on guesswork. Run your query with <code>EXPLAIN ANALYZE</code>, add the index, and run it again. If the execution time doesn't improve meaningfully, the index isn't helping.</p>
</li>
<li><p><strong>Don't index every column.</strong> Each index slows down writes and consumes storage. Be deliberate about which columns you index based on actual query patterns.</p>
</li>
<li><p><strong>Use composite indexes for multi-column filters.</strong> If your queries commonly filter on <code>city</code> and <code>last_name</code> together, a composite index on <code>(city, last_name)</code> is more efficient than two separate single-column indexes.</p>
</li>
<li><p><strong>Put the most selective column first in composite indexes.</strong> The column that narrows the results the most should come first.</p>
</li>
<li><p><strong>Use partial indexes when you only query a subset of data.</strong> If 90% of your queries target rows where <code>status = 'active'</code>, a partial index on that subset is smaller and faster than a full index.</p>
</li>
<li><p><strong>Monitor index usage regularly.</strong> Query <code>pg_stat_user_indexes</code> to find unused indexes and remove them.</p>
</li>
<li><p><strong>Rebuild bloated indexes periodically.</strong> On tables with heavy update/delete activity, indexes can become bloated. Use <code>REINDEX CONCURRENTLY</code> on production systems.</p>
</li>
</ol>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this tutorial, you learned what database indexes are and why they matter for query performance. You explored how B-tree indexes work under the hood, created several types of indexes (single-column, composite, partial, expression, and unique), and used <code>EXPLAIN ANALYZE</code> to measure the impact.</p>
<p>You also learned about the trade-offs indexes introduce — write overhead, storage cost, and memory pressure — and the common mistakes that silently prevent PostgreSQL from using your indexes.</p>
<p>The core principle is simple: index deliberately based on your actual query patterns, measure the results, and remove anything that isn't pulling its weight.</p>
<p>If you found this tutorial helpful, you can find more of my writing on <a href="https://freecodecamp.org/news/author/iyiola">freeCodeCamp</a> and connect with me on <a href="https://linkedin.com/in/iyioladev">LinkedIn</a> and <a href="https://x.com/iyiola_dev_">X</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Streamline Search in Web Applications with Elasticsearch  ]]>
                </title>
                <description>
                    <![CDATA[ They say data is the new gold. But navigating through a large dataset to meet the demands of consumers in record time still gives backend devs a headache. Conventional database queries often aren't to ]]>
                </description>
                <link>https://www.freecodecamp.org/news/streamline-search-functionality-in-web-apps-with-elasticsearch/</link>
                <guid isPermaLink="false">69e10d82b67a275a9d505023</guid>
                
                    <category>
                        <![CDATA[ elasticsearch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Node.js ]]>
                    </category>
                
                    <category>
                        <![CDATA[ indexing ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Search Engines ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Oluwatobi ]]>
                </dc:creator>
                <pubDate>Thu, 16 Apr 2026 16:25:38 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/e6563d07-a253-4fd9-b1f6-54dc98a48319.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>They say data is the new gold. But navigating through a large dataset to meet the demands of consumers in record time still gives backend devs a headache.</p>
<p>Conventional database queries often aren't totally reliable in getting accurate search results fast. But fortunately, Elasticsearch comes to the rescue.</p>
<p>In this article, I'll walk you through how to use Elasticsearch to enhance database searches and analytics while still maintaining efficiency.</p>
<p>Here are the prerequisites for this tutorial:</p>
<ul>
<li><p>A Node.js environment</p>
</li>
<li><p>Basic backend knowledge</p>
</li>
</ul>
<p>With that, let's get started. But first of all, what is Elasticsearch?</p>
<h3 id="heading-table-of-content">Table of Content</h3>
<ul>
<li><p><a href="#heading-what-is-elasticsearch">What is Elasticsearch?</a></p>
</li>
<li><p><a href="#heading-elasticsearch-key-terms">Elasticsearch Key Terms</a></p>
</li>
<li><p><a href="#heading-how-to-set-up-elasticsearch">How to Set Up Elasticsearch</a></p>
</li>
<li><p><a href="#heading-how-to-set-up-the-demo-project">How to Set Up the Demo Project</a></p>
</li>
<li><p><a href="#heading-how-to-set-up-elasticsearch-in-your-project">How to Set Up Elasticsearch in Your Project</a></p>
</li>
<li><p><a href="#heading-how-to-work-with-indexes-in-elasticsearch">How to Work with Indexes in Elasticsearch</a></p>
</li>
<li><p><a href="#heading-search-implementation">Search Implementation</a></p>
</li>
<li><p><a href="#heading-full-code">Full Code</a></p>
</li>
<li><p><a href="#heading-wrapping-up">Wrapping Up</a></p>
</li>
<li><p><a href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-what-is-elasticsearch">What is Elasticsearch?</h2>
<p>Elasticsearch is a search engine built by Apache that can index words and phrases, providing advanced text and vector search capabilities. It also has other useful features such as search analytics and an auto-complete feature.</p>
<p>Note that Elasticsearch isn't a database, even though it does provide indexing features (which popular databases also do).</p>
<p>Other popular alternatives to this tool used in production environments include <a href="https://www.algolia.com/">Algolia</a>, <a href="https://opensearch.org/">OpenSearch</a> and <a href="https://www.meilisearch.com/">MeiliSearch</a>.</p>
<h2 id="heading-elasticsearch-key-terms">Elasticsearch Key Terms</h2>
<p>in this section, we'll go over some important terminology used in Elasticsearch. To ease your understanding, I'll make references to common database terminologies.</p>
<ul>
<li><p><strong>Index</strong>: This serves as a storage location for the data you're going to explore. It's like the database for Elasticsearch. It also shares other properties that DBs possess like uniqueness.</p>
</li>
<li><p><strong>Document:</strong> This is the smallest unit of information stored within the index. It's structurally identical to the MongoDB-based document and is also similar to rows in SQL-based databases.</p>
</li>
<li><p><strong>Mapping:</strong> Mapping refers to sets of rules or instructions that define how documents and fields are stored in the Elasticsearch index.</p>
</li>
<li><p><strong>Score:</strong> This is generated by Elasticsearch to show the degree of relevance of the search query to the stored index.</p>
</li>
<li><p><strong>Analyzer:</strong> When data is sent to the Elasticsearch engine for indexing, it initially passes through an analyzer which processes the text before indexing. This is achieved via Filters and Tokenizers.</p>
</li>
<li><p><strong>Tokenizers:</strong> This tool converts the gross unstructured data sent to the Elasticsearch engine into structured data tokens for further processing and storage.</p>
</li>
<li><p><strong>Aggregator:</strong> This search tool performs detailed analysis on the tokens stored in the index to generate actionable data insights. It's an advantage of the Elasticsearch engine. Mongo DB’s aggregator offer similar functions.</p>
</li>
<li><p><strong>Filter</strong>: A set of instructions which modifies tokens generated during the process of analysis. This could entail removal of fillers, capitalization rules, and so on.</p>
</li>
<li><p><strong>Bulk index:</strong> This refers to indexing more than one document at once. You typically do this when indexing a database with pre-existing content.</p>
</li>
</ul>
<h2 id="heading-how-to-set-up-elasticsearch">How to Set Up Elasticsearch</h2>
<p>For the purpose of this tutorial, we'll use Elasticsearch's installable software on our local machine. Online hosted versions of Elasticsearch also exist which work hitch-free as well.</p>
<p><a href="https://www.elastic.co/docs/deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows">Here</a> is a link detailing how to setup Elasticsearch on Windows. For non-Windows users, you can also install Elasticsearch on <a href="https://www.elastic.co/docs/deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos">Linux/Mac OS</a> or use <a href="https://www.elastic.co/docs/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker">Docker</a>.</p>
<p><strong>Note</strong> that for Windows users, make sure you run Elasticsearch as an Administrator to avoid installation errors.</p>
<p>After successful installation, you can test if it's functioning by navigating to <code>localhost:9200</code> which serves as the default local endpoint for Elasticsearch. There you'll see a success message on the screen similar to the image below:</p>
<img src="https://cdn.hashnode.com/uploads/covers/64bba6ecb09308034572f437/ad94560d-7629-45f1-a94d-eaed0b61cefe.png" alt="elastic search localhost homepage" style="display:block;margin:0 auto" width="813" height="744" loading="lazy">

<p>With that , we'll move on to setting up our project and integrating ElasticSearch into our demo project.</p>
<h2 id="heading-how-to-set-up-the-demo-project">How to Set Up the Demo Project</h2>
<p>For the sake of this tutorial, we will be utilizing a ready-built forum-based backend application built in Node Express JS. &nbsp;Here is the link to the project.</p>
<p>to get the project up and running, clone this package and run</p>
<p><code>npm start</code></p>
<p><code>MySQL</code> will serve as the default database for this tutorial. &nbsp;Let's now proceed to the next section.</p>
<h2 id="heading-how-to-set-up-elasticsearch-in-your-project">How to Set Up Elasticsearch in Your Project</h2>
<p>The existing demo project is a backend implementation of a forum site which allows users to post text content and facilitate discussions through category-based threads.</p>
<p>Elasticsearch is great for ensuring that users can sift through these posts and threads to accurately locate key content using distinct keywords. This is more effective than using traditional database search queries which can be cumbersome.</p>
<p>To set up Elasticsearch, start by installing the Elasticsearch <code>npm</code> package. To do this, run the command below in your project directory:</p>
<pre><code class="language-shell">npm install @elastic/elasticsearch
</code></pre>
<p>After successful installation, create a <code>config.js</code> file where you'll setup your driver to connect to your Elasticsearch application.</p>
<pre><code class="language-javascript">const { Client } = require('@elastic/elasticsearch');

const esClient = new Client({
  node: 'http://localhost:9200',
  auth: {
    username: process.env.ELASTICSEARCH_USERNAME,
    password: process.env.ELASTICSEARCH_PASSWORD
  },
  maxRetries: 5,
  requestTimeout: 60000,
  tls: {
    rejectUnauthorized: process.env.NODE_ENV !== 'development'
  }
});

module.exports = esClient;
</code></pre>
<p>To access and use Elasticsearch's capabilities within your backend application, you'll need to setup and configure your Elasticsearch driver. The details are specified in the config file code above.</p>
<p>As mentioned earlier, Elasticsearch runs on the <code>localhost:9200</code> port. So your Elasticsearch node will be directed to the localhost port. Online hosted Elasticsearch nodes will also work in similar scenarios.</p>
<p>Next in the config file, you'll provide the authentication credentials required to access Elasticsearch. The requested username and password will be supplied within the Auth object. If you're running Elasticsearch locally, authentication may not be required unless security is enabled.</p>
<p>In this scenario, <code>MaxRetries</code> refers to the number of maximum unsuccessful attempts to access Elasticsearch. In this case, we've pegged it at 5 attempts. <code>requestTimeout</code> is the time in milliseconds after which the request will automatically terminate if it's not processed.</p>
<p>Once you've completion the Config file, you'll import this config and initialize the Elasticsearch client when your backend starts.</p>
<h2 id="heading-how-to-work-with-indexes-in-elasticsearch">How to Work with Indexes in Elasticsearch</h2>
<p>Before we start harnessing the full power of Elasticsearch, we need to customize its search capabilities within the backend of the project. This involves setting up an index within the Elasticsearch Engine that indexes all posts made to the backend application.</p>
<pre><code class="language-javascript">const esClient = require('./config');

const setupIndex = async () =&gt; {
  try {
    const indexExists = await esClient.indices.exists({
      index: INDEX_NAME
    });

    if (indexExists) {
      console.log(`Index "${INDEX_NAME}" already exists`);
      return;
    }

    await esClient.indices.create({
      index: INDEX_NAME,
      ...indexMapping
    });

    console.log(`Index "${INDEX_NAME}" created`);
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
<p>The code above highlights creating a new index. First, you need to invoke the <code>setupIndex()</code> function. Within this function, you're providing the preferred name for your index. &nbsp;Elasticsearch then checks if the name already exists within its indexes.</p>
<p>The function terminates if the index name already exists (to prevent duplication). But if it doesn't exist, it proceeds to create an index with that unique name alongside the index Mapping rules (which we'll discuss further shortly).</p>
<p>After creating the index, you'll see a success message in your application console.</p>
<h3 id="heading-how-to-delete-an-index">How to Delete an Index</h3>
<p>After a while, an index may no longer serve its purpose and you may need to remove it from Elasticsearch.</p>
<p>You can do this by executing the <code>esClient.indices.delete()</code> command as shown below:</p>
<pre><code class="language-javascript">const deleteIndex = async () =&gt; {
  try {
    await esClient.indices.delete({ index: INDEX_NAME });
    console.log(`${INDEX_NAME} deleted`);
  } catch (err) {
    console.error("Error deleting index:", err);
  }
};
</code></pre>
<h3 id="heading-how-to-delete-a-post-within-an-index">How to Delete a Post within an Index</h3>
<p>Sometimes, posts get deleted and modified. Also, users may get banned, after which you'd want to remove their content from the stored database .</p>
<p>In these cases, you'll want to ensure true deletion – that is, both from the database and from Elasticsearch indexed storage.</p>
<p>To do this, you'll call the <code>esClient.delete()</code> function, passing the Elasticsearch Client ID and the post's unique ID that you want to delete as callback arguments to your <code>esClient.delete</code> function.</p>
<pre><code class="language-javascript">const deletePost = async (postId) =&gt; {
  try {
    await esClient.delete({
      index: INDEX_NAME,
      id: postId.toString(),
    });

    console.log("Post successfully deleted");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
<h3 id="heading-how-to-index-a-post">How to Index a Post</h3>
<p>After setting up the Elasticsearch Index, you'll want to automatically index posts made to the database into the Elasticsearch index.</p>
<p>To do this, you'll need to make sure that the post is compatible with your index schema via the <code>transformPostTOESRepo</code> function. This function extracts and formats the post data so it matches the Elasticsearch document structure.</p>
<pre><code class="language-javascript">const transformPostToESDoc = (post) =&gt; {
  return {
  id: post.id,
  title: post.title,
  content: post.body,
  author: post.author,
  category: post.category,
  tags: post.tags,
  views: post.views || 0,
  published_at: post.created_at
};

const indexPost = async (postId) =&gt; {
  try {
    const postRepo = await getPostRepo();
    const post = await postRepo.findOne({ where: { id: postId } });

    if (!post) {
      throw new Error("Post not available");
    }

    const esDocument = transformPostToESDoc(post);

    await esClient.index({
      index: INDEX_NAME,
      id: post.id.toString(),
      document: esDocument
    });

    console.log("Post successfully indexed");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
<p>The post to be indexed must have a unique ID. For ease of use, we used the unique post ID constraint that comes by default in regular databases. Optionally, you can also use UUID libraries to generate unique post IDs.</p>
<p>The Post information is then attached to the <code>esClient.index()</code> function as the document to be indexed. We also put appropriate error handling measures in place to prevent the app from crashing if the process is unsuccessful.</p>
<h3 id="heading-how-to-define-elastic-search-mapping-rules">How to Define Elastic Search Mapping Rules</h3>
<p>Elasticsearch mappings define how your data is stored and indexed. They specify the data type of each field and how text is analyzed for search.</p>
<p>In the example below, we'll define an index configuration that includes custom analyzers for autocomplete and mappings for each post field (like title, content, and author).</p>
<pre><code class="language-javascript">const indexMapping = {
  settings: {
    analysis: {
      analyzer: {
        autocomplete: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase', 'autocomplete_filter']
        },
        autocomplete_search: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase']
        }
      },
      filter: {
        autocomplete_filter: {
          type: 'edge_ngram',
          min_gram: 2,
          max_gram: 10
        }
      }
    }
  },
  mappings: {
    properties: {
      id: { type: 'integer' },
      title: {
        type: 'text',
        analyzer: 'autocomplete',
        search_analyzer: 'autocomplete_search',
        fields: {
          keyword: { type: 'keyword' },
          standard: { type: 'text' }
        }
      },
      content: {
        type: 'text',
        analyzer: 'standard'
      },
      category: {
        type: 'keyword'
      },
      tags: { type: 'keyword' },
      author: {
        type: 'text',
        fields: {
          keyword: { type: 'keyword' }
        }
      },
      views: { type: 'integer' },
      published_at: { type: 'date' }
    }
  }
};
</code></pre>
<p>The <code>indexMapping</code> object defines how Elasticsearch should store and process your data. It consists of two main parts: <code>settings</code> and <code>mappings</code>.</p>
<p>The <code>mappings</code> section defines the structure of your documents. Each field (like <code>title</code>, <code>content</code>, or <code>author</code>) has a type such as <code>text</code>, <code>keyword</code>, <code>integer</code>, or <code>date</code>. This tells Elasticsearch how to store and search that field.</p>
<p>For text fields, we can also define analyzers. Analyzers control how text is broken into smaller pieces (tokens) during indexing and search.</p>
<p>In the <code>settings</code> section, we defined a custom analyzer for autocomplete. This uses an <code>edge_ngram</code> filter to generate partial word matches, so users can find results as they type. We also defined a separate <code>search_analyzer</code> to ensure that search queries are processed correctly.</p>
<p>Together, these settings allow you to support features like autocomplete while keeping search results accurate and efficient.</p>
<h2 id="heading-search-implementation">Search Implementation</h2>
<p>In order to implement your search functionality, you'll need to build out the API. This involves building the business logic service and the API route. You'll also use <code>GET</code> requests and attach your search term as a query. The result it generates will be received as a JSON document.</p>
<p>Then you'll implement the search post service function. In this scenario, you'll be using the search engine capabilities to search for phrases within the index. In line with best practices, you'll use a pagination technique to minimize&nbsp;receiving unwanted information.</p>
<p>The search query will consist of the index name, pagination parameters (<code>from</code> and <code>size</code>) to control which results are returned, and the expected maximum size of the result. &nbsp;You'll also attach a query object specifying the modality of the search that the Elasticsearch engine should use.</p>
<pre><code class="language-javascript">const searchElastic = async (query, page = 1, size = 10) =&gt; {
  const searchQuery = {
    index: INDEX_NAME,
    from: (page - 1) * size,
    size,
    query: {
      bool: {
        must: [
          {
            multi_match: {
              query,
              fields: ["title^3", "content"],
              type: "best_fields",
              fuzziness: "AUTO"
            }
          }
        ]
      }
    }
  };

  const result = await esClient.search(searchQuery);
  return result.hits.hits;
};
</code></pre>
<p>In the code above, the function is named <code>searchElastic</code>. The function contains three variables which must be passed in order to execute it: <code>size</code>, <code>page</code> and <code>query</code>.</p>
<p>The <code>size</code> variable specifies the maximum number of documents per search query to be returned. The default count could be any integer.</p>
<p>The query uses a <code>multi_match</code> clause to search across multiple fields, such as <code>title</code> and <code>content</code>. The <code>title^3</code> syntax boosts matches in the title, making them more relevant than matches in other fields.</p>
<p>We also included a <code>must</code> clause which defines conditions that documents must match to be included in the results.</p>
<p>The search results are usually ranked based on their degree of relevance to the search query.</p>
<h2 id="heading-full-code">Full Code</h2>
<p>With this, you've completed this tutorial and have configured Elasticsearch to index posts made to your database. Here's the full code:</p>
<ol>
<li>Elasticsearch Client (config.js):</li>
</ol>
<pre><code class="language-javascript">const { Client } = require('@elastic/elasticsearch');

const esClient = new Client({
  node: 'http://localhost:9200',
  auth: {
    username: process.env.ELASTICSEARCH_USERNAME,
    password: process.env.ELASTICSEARCH_PASSWORD
  },
  maxRetries: 5,
  requestTimeout: 60000,
  tls: {
    rejectUnauthorized: process.env.NODE_ENV !== 'development'
  }
});

module.exports = esClient;
</code></pre>
<ol>
<li>Index mapping:</li>
</ol>
<pre><code class="language-javascript">const indexMapping = {
  settings: {
    analysis: {
      analyzer: {
        autocomplete: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase', 'autocomplete_filter']
        },
        autocomplete_search: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase']
        }
      },
      filter: {
        autocomplete_filter: {
          type: 'edge_ngram',
          min_gram: 2,
          max_gram: 10
        }
      }
    }
  },
  mappings: {
    properties: {
      id: { type: 'integer' },
      title: {
        type: 'text',
        analyzer: 'autocomplete',
        search_analyzer: 'autocomplete_search',
        fields: {
          keyword: { type: 'keyword' },
          standard: { type: 'text' }
        }
      },
      content: {
        type: 'text',
        analyzer: 'standard'
      },
      category: {
        type: 'keyword'
      },
      tags: { type: 'keyword' },
      author: {
        type: 'text',
        fields: {
          keyword: { type: 'keyword' }
        }
      },
      views: { type: 'integer' },
      published_at: { type: 'date' }
    }
  }
};
</code></pre>
<ol>
<li>Create index:</li>
</ol>
<pre><code class="language-javascript">const setupIndex = async () =&gt; {
  try {
    const indexExists = await esClient.indices.exists({
      index: INDEX_NAME
    });

    if (indexExists) {
      console.log(`Index "${INDEX_NAME}" already exists`);
      return;
    }

    await esClient.indices.create({
      index: INDEX_NAME,
      ...indexMapping
    });

    console.log(`Index "${INDEX_NAME}" created`);
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
<ol>
<li>Delete index:</li>
</ol>
<pre><code class="language-javascript">const deleteIndex = async () =&gt; {
  try {
    await esClient.indices.delete({ index: INDEX_NAME });
    console.log(`${INDEX_NAME} deleted`);
  } catch (err) {
    console.error("Error deleting index:", err);
  }
};
</code></pre>
<ol>
<li>Delete document (post):</li>
</ol>
<pre><code class="language-javascript">const deletePost = async (postId) =&gt; {
  try {
    await esClient.delete({
      index: INDEX_NAME,
      id: postId.toString()
    });

    console.log("Post successfully deleted");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
<ol>
<li>Transform and index post:</li>
</ol>
<pre><code class="language-javascript">const transformPostToESDoc = (post) =&gt; {
  return {
  id: post.id,
  title: post.title,
  content: post.body,
  author: post.author,
  category: post.category,
  tags: post.tags,
  views: post.views || 0,
  published_at: post.created_at
};

const indexPost = async (postId) =&gt; {
  try {
    const postRepo = await getPostRepo();
    const post = await postRepo.findOne({ where: { id: postId } });

    if (!post) {
      throw new Error("Post not available");
    }

    const esDocument = transformPostToESDoc(post);

    await esClient.index({
      index: INDEX_NAME,
      id: post.id.toString(),
      document: esDocument
    });

    console.log("Post successfully indexed");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
<ol>
<li>Search function:</li>
</ol>
<pre><code class="language-javascript">const searchElastic = async (query, page = 1, size = 10) =&gt; {
  const searchQuery = {
    index: INDEX_NAME,
    from: (page - 1) * size,
    size,
    query: {
      bool: {
        must: [
          {
            multi_match: {
              query,
              fields: ["title^3", "content"],
              type: "best_fields",
              fuzziness: "AUTO"
            }
          }
        ]
      }
    }
  };

  const result = await esClient.search(searchQuery);
  return result.hits.hits;
};
</code></pre>
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>Now you know how to use Elasticsearch to improve user search in your web applications. Elasticsearch is agnostic which allows you to use it across programming languages and frameworks. Its large community base also provides helpful user guides to make onboarding easier.</p>
<p>To further harness Elasticsearch's power, you can explore other tools within the <strong>ELK</strong> stack (Elasticsearch, Log Stash, and Kibana ) that'll help you generate high quality data visualizations for your data, especially for enterprise applications.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>A fast and reliable search engine isn’t negotiable in your web applications these days. Elasticsearch is your go-to for getting this done.</p>
<p>If you would like to read other articles that will enhance your tech journey, feel free to check out <a href="https://portfolio-oluwatobi.netlify.app/">my website here</a> . Stay active!</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
