They say data is the new gold. But navigating through a large dataset to meet the demands of consumers in record time still gives backend devs a headache.

Conventional database queries often struggle to return accurate search results quickly. Fortunately, Elasticsearch comes to the rescue.

In this article, I'll walk you through how to use Elasticsearch to enhance database searches and analytics while still maintaining efficiency.

Here are the prerequisites for this tutorial:

  • A Node.js environment

  • Basic backend knowledge

With that, let's get started. But first of all, what is Elasticsearch?


What is Elasticsearch?

Elasticsearch is a search engine built by Elastic on top of the Apache Lucene library. It can index words and phrases, providing advanced full-text and vector search capabilities. It also has other useful features such as search analytics and auto-complete.

Note that Elasticsearch isn't a database, even though it does provide indexing features (which popular databases also do).

Other popular alternatives to this tool used in production environments include Algolia, OpenSearch, and Meilisearch.

Elasticsearch Key Terms

In this section, we'll go over some important terminology used in Elasticsearch. To ease your understanding, I'll make references to common database terminology.

  • Index: This serves as the storage location for the data you're going to search. It's roughly analogous to a table in a SQL database or a collection in MongoDB, and like those, each index must have a unique name.

  • Document: This is the smallest unit of information stored within an index. It's structurally similar to a MongoDB document and comparable to a row in a SQL database.

  • Mapping: Mapping refers to sets of rules or instructions that define how documents and fields are stored in the Elasticsearch index.

  • Score: A number Elasticsearch generates for each result, indicating how relevant that document is to the search query.

  • Analyzer: When data is sent to the Elasticsearch engine for indexing, it initially passes through an analyzer which processes the text before indexing. This is achieved via Filters and Tokenizers.

  • Tokenizer: Converts the raw, unstructured text sent to the Elasticsearch engine into structured tokens for further processing and storage.

  • Aggregator: This search tool performs detailed analysis on the data stored in the index to generate actionable insights. It's one of the Elasticsearch engine's advantages. MongoDB's aggregation pipeline offers similar functionality.

  • Filter: A set of instructions that modifies the tokens generated during analysis. This can include removing stop words, applying lowercase rules, and so on.

  • Bulk index: This refers to indexing more than one document at once. You typically do this when indexing a database with pre-existing content.
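Bulk indexing can be sketched with the client's bulk API, which expects action metadata and document sources interleaved in a single array. Below is a minimal sketch, assuming an `esClient` like the one configured later in this article; `buildBulkOperations`, `bulkIndexPosts`, and the `posts` index name are illustrative choices, and the `operations` key is the shape used by the v8 JavaScript client:

```javascript
// Hypothetical bulk-indexing helpers. The client is injected so these
// functions work with the esClient exported from config.js.
const INDEX_NAME = 'posts';

// Interleave action metadata and document source for each post:
// [{ index: {...} }, doc, { index: {...} }, doc, ...]
const buildBulkOperations = (posts) =>
  posts.flatMap((post) => [
    { index: { _index: INDEX_NAME, _id: post.id.toString() } },
    { id: post.id, title: post.title, content: post.body },
  ]);

const bulkIndexPosts = async (esClient, posts) => {
  const result = await esClient.bulk({
    refresh: true, // make the documents searchable immediately
    operations: buildBulkOperations(posts),
  });
  if (result.errors) {
    console.error('Some documents failed to index');
  }
  return result;
};
```

Injecting the client (rather than requiring it at the top) keeps the helpers easy to test and reuse.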

How to Set Up Elasticsearch

For the purpose of this tutorial, we'll install Elasticsearch locally. Hosted versions of Elasticsearch also exist and work just as well.

Here is a link detailing how to set up Elasticsearch on Windows. For non-Windows users, you can also install Elasticsearch on Linux/macOS or use Docker.

Note for Windows users: make sure you run Elasticsearch as an Administrator to avoid installation errors.

After a successful installation, you can confirm it's running by navigating to localhost:9200, the default local endpoint for Elasticsearch. You should see a success message similar to the image below:

elastic search localhost homepage

With that, let's move on to setting up the demo project and integrating Elasticsearch into it.

How to Set Up the Demo Project

For this tutorial, we'll use a ready-built forum backend application built with Node.js and Express. Here is the link to the project.

To get the project up and running, clone the repository, install its dependencies, and run:

npm install
npm start

MySQL will serve as the default database for this tutorial. Let's now proceed to the next section.

How to Set Up Elasticsearch in Your Project

The existing demo project is a backend implementation of a forum site which allows users to post text content and facilitate discussions through category-based threads.

Elasticsearch is great for ensuring that users can sift through these posts and threads to accurately locate key content using distinct keywords. This is more effective than using traditional database search queries which can be cumbersome.

To set up Elasticsearch, start by installing the Elasticsearch npm package. To do this, run the command below in your project directory:

npm install @elastic/elasticsearch

After a successful installation, create a config.js file where you'll set up the client that connects to your Elasticsearch instance.

const { Client } = require('@elastic/elasticsearch');

const esClient = new Client({
  node: 'http://localhost:9200',
  auth: {
    username: process.env.ELASTICSEARCH_USERNAME,
    password: process.env.ELASTICSEARCH_PASSWORD
  },
  maxRetries: 5,
  requestTimeout: 60000,
  tls: {
    rejectUnauthorized: process.env.NODE_ENV !== 'development'
  }
});

module.exports = esClient;
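The config above reads the credentials from environment variables. A hypothetical .env file (loaded with a package like dotenv) might look like this; the values shown are placeholders, not real credentials:

```
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=changeme
NODE_ENV=development
```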

To access and use Elasticsearch's capabilities within your backend application, you'll need to set up and configure the Elasticsearch client. The details are specified in the config file code above.

As mentioned earlier, Elasticsearch runs on localhost:9200 by default, so the node option points there. The URL of an online hosted Elasticsearch node would go in the same place.

Next in the config file, you'll provide the authentication credentials required to access Elasticsearch. The username and password are supplied within the auth object. If you're running Elasticsearch locally with security disabled, authentication may not be required.

maxRetries is the maximum number of times a failed request will be retried before giving up; here, we've set it to 5. requestTimeout is the time in milliseconds after which a request is automatically terminated if it hasn't completed.

Once you've completed the config file, you'll import it and initialize the Elasticsearch client when your backend starts.
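As a sketch of that startup step, you could verify the connection before the server begins accepting requests. The `initSearch` name and the ping check are my own choices, not part of the demo project; the client is injected so it works with the esClient exported from config.js:

```javascript
// Hypothetical startup helper: verify the Elasticsearch connection at boot
// so a misconfigured node fails fast instead of at the first search.
const initSearch = async (esClient) => {
  try {
    await esClient.ping(); // lightweight connectivity check against the node
    console.log('Connected to Elasticsearch');
    return true;
  } catch (err) {
    console.error('Elasticsearch is unreachable:', err.message);
    return false;
  }
};

// In server.js you might call:
//   const esClient = require('./config');
//   initSearch(esClient).then(() => app.listen(3000));
```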

How to Work with Indexes in Elasticsearch

Before we start harnessing the full power of Elasticsearch, we need to customize its search capabilities within the backend of the project. This involves setting up an index within the Elasticsearch Engine that indexes all posts made to the backend application.

const esClient = require('./config');

const INDEX_NAME = 'posts'; // the index name used throughout this tutorial

const setupIndex = async () => {
  try {
    const indexExists = await esClient.indices.exists({
      index: INDEX_NAME
    });

    if (indexExists) {
      console.log(`Index "${INDEX_NAME}" already exists`);
      return;
    }

    await esClient.indices.create({
      index: INDEX_NAME,
      ...indexMapping
    });

    console.log(`Index "${INDEX_NAME}" created`);
  } catch (err) {
    console.error(err);
    throw err;
  }
};

The code above creates a new index. First, you invoke the setupIndex() function, which passes your preferred index name (stored in INDEX_NAME) to Elasticsearch. Elasticsearch then checks whether an index with that name already exists.

The function returns early if the index already exists (to prevent duplication). Otherwise, it creates an index with that unique name, applying the indexMapping rules (which we'll discuss further shortly).

After creating the index, you'll see a success message in your application console.

How to Delete an Index

After a while, an index may no longer serve its purpose and you may need to remove it from Elasticsearch.

You can do this by executing the esClient.indices.delete() command as shown below:

const deleteIndex = async () => {
  try {
    await esClient.indices.delete({ index: INDEX_NAME });
    console.log(`${INDEX_NAME} deleted`);
  } catch (err) {
    console.error("Error deleting index:", err);
  }
};

How to Delete a Post within an Index

Sometimes, posts get deleted or modified. Also, users may get banned, after which you'd want to remove their content from the database.

In these cases, you'll want to ensure true deletion – that is, both from the database and from Elasticsearch indexed storage.

To do this, call the esClient.delete() function, passing the index name and the unique ID of the post you want to delete.

const deletePost = async (postId) => {
  try {
    await esClient.delete({
      index: INDEX_NAME,
      id: postId.toString(),
    });

    console.log("Post successfully deleted");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};
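To tie the two deletions together, you could wrap the database delete and the Elasticsearch delete in one helper. This is a sketch: `deletePostEverywhere` is a name I've invented, and `postRepo` stands in for the demo project's MySQL repository:

```javascript
// Hypothetical "true deletion" helper: removes the post from the primary
// database first, then from the Elasticsearch index. Both dependencies are
// injected so the helper stays easy to test.
const deletePostEverywhere = async (postId, postRepo, esClient) => {
  await postRepo.delete({ id: postId });  // remove the MySQL row
  await esClient.delete({                 // remove the indexed document
    index: 'posts',
    id: postId.toString(),
  });
  return { success: true, postId };
};
```

Deleting from the database first means a failure in Elasticsearch leaves a stale search document rather than a dangling database row, which is usually the safer of the two.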

How to Index a Post

After setting up the Elasticsearch Index, you'll want to automatically index posts made to the database into the Elasticsearch index.

To do this, you'll need to make sure that the post is compatible with your index schema via the transformPostToESDoc function. This function extracts and formats the post data so it matches the Elasticsearch document structure.

const transformPostToESDoc = (post) => {
  return {
    id: post.id,
    title: post.title,
    content: post.body,
    author: post.author,
    category: post.category,
    tags: post.tags,
    views: post.views || 0,
    published_at: post.created_at
  };
};

const indexPost = async (postId) => {
  try {
    const postRepo = await getPostRepo();
    const post = await postRepo.findOne({ where: { id: postId } });

    if (!post) {
      throw new Error("Post not available");
    }

    const esDocument = transformPostToESDoc(post);

    await esClient.index({
      index: INDEX_NAME,
      id: post.id.toString(),
      document: esDocument
    });

    console.log("Post successfully indexed");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};

The post to be indexed must have a unique ID. For ease of use, we reused the unique primary-key ID that relational databases assign by default. Optionally, you can also use a UUID library to generate unique post IDs.

The post data is then passed to the esClient.index() function as the document to be indexed. We also put error handling in place to prevent the app from crashing if indexing fails.
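To make indexing automatic, you can call indexPost right after a post is saved. Here's a sketch of what that might look like in an Express-style route handler; `savePost` and the handler shape are assumptions for illustration, not the demo project's actual code:

```javascript
// Hypothetical create-post handler: saves the post to MySQL, then mirrors it
// into Elasticsearch. The save and index functions are injected for clarity.
const buildCreatePostHandler = (savePost, indexPost) => async (req, res) => {
  try {
    const post = await savePost(req.body); // write to the primary database
    await indexPost(post.id);              // mirror into the search index
    res.status(201).json(post);
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
};
```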

How to Define Elasticsearch Mapping Rules

Elasticsearch mappings define how your data is stored and indexed. They specify the data type of each field and how text is analyzed for search.

In the example below, we'll define an index configuration that includes custom analyzers for autocomplete and mappings for each post field (like title, content, and author).

const indexMapping = {
  settings: {
    analysis: {
      analyzer: {
        autocomplete: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase', 'autocomplete_filter']
        },
        autocomplete_search: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase']
        }
      },
      filter: {
        autocomplete_filter: {
          type: 'edge_ngram',
          min_gram: 2,
          max_gram: 10
        }
      }
    }
  },
  mappings: {
    properties: {
      id: { type: 'integer' },
      title: {
        type: 'text',
        analyzer: 'autocomplete',
        search_analyzer: 'autocomplete_search',
        fields: {
          keyword: { type: 'keyword' },
          standard: { type: 'text' }
        }
      },
      content: {
        type: 'text',
        analyzer: 'standard'
      },
      category: {
        type: 'keyword'
      },
      tags: { type: 'keyword' },
      author: {
        type: 'text',
        fields: {
          keyword: { type: 'keyword' }
        }
      },
      views: { type: 'integer' },
      published_at: { type: 'date' }
    }
  }
};

The indexMapping object defines how Elasticsearch should store and process your data. It consists of two main parts: settings and mappings.

The mappings section defines the structure of your documents. Each field (like title, content, or author) has a type such as text, keyword, integer, or date. This tells Elasticsearch how to store and search that field.

For text fields, we can also define analyzers. Analyzers control how text is broken into smaller pieces (tokens) during indexing and search.

In the settings section, we defined a custom analyzer for autocomplete. This uses an edge_ngram filter to generate partial word matches, so users can find results as they type. We also defined a separate search_analyzer to ensure that search queries are processed correctly.

Together, these settings allow you to support features like autocomplete while keeping search results accurate and efficient.
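To see what the edge_ngram filter actually stores, here's a small local illustration of its behavior. Elasticsearch computes these tokens server-side; this function merely mimics the prefix logic for the settings above:

```javascript
// Local mimic of the edge_ngram filter: emits every prefix of a word between
// minGram and maxGram characters long, matching the index settings above.
const edgeNgrams = (word, minGram = 2, maxGram = 10) => {
  const grams = [];
  for (let len = minGram; len <= Math.min(maxGram, word.length); len++) {
    grams.push(word.slice(0, len));
  }
  return grams;
};

console.log(edgeNgrams('search'));
// logs: [ 'se', 'sea', 'sear', 'searc', 'search' ]
```

Because every prefix from two characters up is indexed, a user typing "sea" already matches documents containing "search".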

Search Implementation

To implement the search functionality, you'll need to build out the API. This involves building the business logic service and the API route. The route will accept GET requests with the search term attached as a query parameter, and results will be returned as JSON.

Then you'll implement the search service function. Here, you'll use the engine's capabilities to search for phrases within the index. In line with best practices, you'll paginate the results to avoid returning more data than needed.

The search query consists of the index name, pagination parameters (from and size) that control which slice of results is returned, and a query object specifying how the Elasticsearch engine should match documents.

const searchElastic = async (query, page = 1, size = 10) => {
  const searchQuery = {
    index: INDEX_NAME,
    from: (page - 1) * size,
    size,
    query: {
      bool: {
        must: [
          {
            multi_match: {
              query,
              fields: ["title^3", "content"],
              type: "best_fields",
              fuzziness: "AUTO"
            }
          }
        ]
      }
    }
  };

  const result = await esClient.search(searchQuery);
  return result.hits.hits;
};

In the code above, the searchElastic function takes three parameters: query, page, and size. Only query is required; page and size fall back to defaults of 1 and 10.

The size parameter specifies the maximum number of documents returned per query. You can set it to any positive integer.

The query uses a multi_match clause to search across multiple fields, such as title and content. The title^3 syntax boosts matches in the title, making them more relevant than matches in other fields.

We also included a must clause which defines conditions that documents must match to be included in the results.

The search results are usually ranked based on their degree of relevance to the search query.
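Putting the route together, a GET handler might look like the sketch below. The parameter names (q, page, size) and the handler shape are my own choices; searchElastic is the function defined above, injected here so the handler is easy to test:

```javascript
// Hypothetical GET /search handler: reads the search term and pagination
// values from the query string and delegates to searchElastic.
const buildSearchHandler = (searchElastic) => async (req, res) => {
  const { q, page = '1', size = '10' } = req.query;
  if (!q) {
    return res.status(400).json({ error: 'Missing query parameter "q"' });
  }
  try {
    const hits = await searchElastic(q, Number(page), Number(size));
    return res.json({ results: hits });
  } catch (err) {
    return res.status(500).json({ error: 'Search failed' });
  }
};
```

You would mount this with something like `app.get('/search', buildSearchHandler(searchElastic))` in the demo project's route file.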

Full Code

With this, you've completed this tutorial and have configured Elasticsearch to index posts made to your database. Here's the full code:

  1. Elasticsearch Client (config.js):
const { Client } = require('@elastic/elasticsearch');

const esClient = new Client({
  node: 'http://localhost:9200',
  auth: {
    username: process.env.ELASTICSEARCH_USERNAME,
    password: process.env.ELASTICSEARCH_PASSWORD
  },
  maxRetries: 5,
  requestTimeout: 60000,
  tls: {
    rejectUnauthorized: process.env.NODE_ENV !== 'development'
  }
});

module.exports = esClient;
  2. Index mapping:
const indexMapping = {
  settings: {
    analysis: {
      analyzer: {
        autocomplete: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase', 'autocomplete_filter']
        },
        autocomplete_search: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase']
        }
      },
      filter: {
        autocomplete_filter: {
          type: 'edge_ngram',
          min_gram: 2,
          max_gram: 10
        }
      }
    }
  },
  mappings: {
    properties: {
      id: { type: 'integer' },
      title: {
        type: 'text',
        analyzer: 'autocomplete',
        search_analyzer: 'autocomplete_search',
        fields: {
          keyword: { type: 'keyword' },
          standard: { type: 'text' }
        }
      },
      content: {
        type: 'text',
        analyzer: 'standard'
      },
      category: {
        type: 'keyword'
      },
      tags: { type: 'keyword' },
      author: {
        type: 'text',
        fields: {
          keyword: { type: 'keyword' }
        }
      },
      views: { type: 'integer' },
      published_at: { type: 'date' }
    }
  }
};
  3. Create index:
const INDEX_NAME = 'posts'; // your chosen index name

const setupIndex = async () => {
  try {
    const indexExists = await esClient.indices.exists({
      index: INDEX_NAME
    });

    if (indexExists) {
      console.log(`Index "${INDEX_NAME}" already exists`);
      return;
    }

    await esClient.indices.create({
      index: INDEX_NAME,
      ...indexMapping
    });

    console.log(`Index "${INDEX_NAME}" created`);
  } catch (err) {
    console.error(err);
    throw err;
  }
};
  4. Delete index:
const deleteIndex = async () => {
  try {
    await esClient.indices.delete({ index: INDEX_NAME });
    console.log(`${INDEX_NAME} deleted`);
  } catch (err) {
    console.error("Error deleting index:", err);
  }
};
  5. Delete document (post):
const deletePost = async (postId) => {
  try {
    await esClient.delete({
      index: INDEX_NAME,
      id: postId.toString()
    });

    console.log("Post successfully deleted");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};
  6. Transform and index post:
const transformPostToESDoc = (post) => {
  return {
    id: post.id,
    title: post.title,
    content: post.body,
    author: post.author,
    category: post.category,
    tags: post.tags,
    views: post.views || 0,
    published_at: post.created_at
  };
};

const indexPost = async (postId) => {
  try {
    const postRepo = await getPostRepo();
    const post = await postRepo.findOne({ where: { id: postId } });

    if (!post) {
      throw new Error("Post not available");
    }

    const esDocument = transformPostToESDoc(post);

    await esClient.index({
      index: INDEX_NAME,
      id: post.id.toString(),
      document: esDocument
    });

    console.log("Post successfully indexed");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};
  7. Search function:
const searchElastic = async (query, page = 1, size = 10) => {
  const searchQuery = {
    index: INDEX_NAME,
    from: (page - 1) * size,
    size,
    query: {
      bool: {
        must: [
          {
            multi_match: {
              query,
              fields: ["title^3", "content"],
              type: "best_fields",
              fuzziness: "AUTO"
            }
          }
        ]
      }
    }
  };

  const result = await esClient.search(searchQuery);
  return result.hits.hits;
};

Wrapping Up

Now you know how to use Elasticsearch to improve search in your web applications. Elasticsearch is language-agnostic, which allows you to use it across programming languages and frameworks. Its large community also provides helpful guides to make onboarding easier.

To further harness Elasticsearch's power, you can explore the other tools in the ELK stack (Elasticsearch, Logstash, and Kibana) that'll help you generate high-quality visualizations of your data, especially for enterprise applications.

Conclusion

A fast and reliable search experience is non-negotiable in modern web applications, and Elasticsearch is a go-to tool for delivering it.

If you would like to read other articles that will enhance your tech journey, feel free to check out my website here. Stay active!