<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ elasticsearch - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ elasticsearch - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Tue, 09 Jun 2026 23:07:33 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/elasticsearch/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How to Streamline Search in Web Applications with Elasticsearch  ]]>
                </title>
                <description>
                    <![CDATA[ They say data is the new gold. But navigating through a large dataset to meet the demands of consumers in record time still gives backend devs a headache. Conventional database queries often aren't to ]]>
                </description>
                <link>https://www.freecodecamp.org/news/streamline-search-functionality-in-web-apps-with-elasticsearch/</link>
                <guid isPermaLink="false">69e10d82b67a275a9d505023</guid>
                
                    <category>
                        <![CDATA[ elasticsearch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Node.js ]]>
                    </category>
                
                    <category>
                        <![CDATA[ indexing ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Search Engines ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Oluwatobi ]]>
                </dc:creator>
                <pubDate>Thu, 16 Apr 2026 16:25:38 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/e6563d07-a253-4fd9-b1f6-54dc98a48319.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>They say data is the new gold. But navigating through a large dataset to meet the demands of consumers in record time still gives backend devs a headache.</p>
<p>Conventional database queries often aren't totally reliable in getting accurate search results fast. But fortunately, Elasticsearch comes to the rescue.</p>
<p>In this article, I'll walk you through how to use Elasticsearch to enhance database searches and analytics while still maintaining efficiency.</p>
<p>Here are the prerequisites for this tutorial:</p>
<ul>
<li><p>A Node.js environment</p>
</li>
<li><p>Basic backend knowledge</p>
</li>
</ul>
<p>With that, let's get started. But first of all, what is Elasticsearch?</p>
<h3 id="heading-table-of-content">Table of Contents</h3>
<ul>
<li><p><a href="#heading-what-is-elasticsearch">What is Elasticsearch?</a></p>
</li>
<li><p><a href="#heading-elasticsearch-key-terms">Elasticsearch Key Terms</a></p>
</li>
<li><p><a href="#heading-how-to-set-up-elasticsearch">How to Set Up Elasticsearch</a></p>
</li>
<li><p><a href="#heading-how-to-set-up-the-demo-project">How to Set Up the Demo Project</a></p>
</li>
<li><p><a href="#heading-how-to-set-up-elasticsearch-in-your-project">How to Set Up Elasticsearch in Your Project</a></p>
</li>
<li><p><a href="#heading-how-to-work-with-indexes-in-elasticsearch">How to Work with Indexes in Elasticsearch</a></p>
</li>
<li><p><a href="#heading-search-implementation">Search Implementation</a></p>
</li>
<li><p><a href="#heading-full-code">Full Code</a></p>
</li>
<li><p><a href="#heading-wrapping-up">Wrapping Up</a></p>
</li>
<li><p><a href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-what-is-elasticsearch">What is Elasticsearch?</h2>
<p>Elasticsearch is a search engine developed by Elastic and built on top of the Apache Lucene library. It indexes words and phrases and provides advanced text and vector search capabilities. It also offers other useful features such as search analytics and autocomplete.</p>
<p>Note that Elasticsearch isn't a database, even though it does provide indexing features (which popular databases also do).</p>
<p>Other popular alternatives to this tool used in production environments include <a href="https://www.algolia.com/">Algolia</a>, <a href="https://opensearch.org/">OpenSearch</a> and <a href="https://www.meilisearch.com/">MeiliSearch</a>.</p>
<h2 id="heading-elasticsearch-key-terms">Elasticsearch Key Terms</h2>
<p>In this section, we'll go over some important terminology used in Elasticsearch. To make things easier to follow, I'll refer to common database terms along the way.</p>
<ul>
<li><p><strong>Index</strong>: This serves as a storage location for the data you're going to explore. It's like the database for Elasticsearch. As with a database, each index has a unique name.</p>
</li>
<li><p><strong>Document:</strong> This is the smallest unit of information stored within the index. It's a JSON object, structurally similar to a MongoDB document and comparable to a row in SQL-based databases.</p>
</li>
<li><p><strong>Mapping:</strong> Mapping refers to sets of rules or instructions that define how documents and fields are stored in the Elasticsearch index.</p>
</li>
<li><p><strong>Score:</strong> This is a number generated by Elasticsearch that shows how relevant each matching document is to the search query.</p>
</li>
<li><p><strong>Analyzer:</strong> When data is sent to the Elasticsearch engine for indexing, it initially passes through an analyzer which processes the text before indexing. This is achieved via Filters and Tokenizers.</p>
</li>
<li><p><strong>Tokenizers:</strong> A tokenizer splits the raw, unstructured text sent to Elasticsearch into individual tokens (typically words) for further processing and storage.</p>
</li>
<li><p><strong>Aggregator:</strong> Aggregations analyze the documents stored in an index to produce summaries such as counts, averages, and groupings, which you can turn into actionable insights. This is one of Elasticsearch's strengths. MongoDB's aggregation pipeline offers similar functionality.</p>
</li>
<li><p><strong>Filter</strong>: A set of instructions which modifies tokens generated during the process of analysis. This could entail removal of fillers, capitalization rules, and so on.</p>
</li>
<li><p><strong>Bulk index:</strong> This refers to indexing more than one document at once. You typically do this when indexing a database with pre-existing content.</p>
</li>
</ul>
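<p>To make the analyzer, tokenizer, and filter terms above more concrete, here's a toy version of the pipeline in plain JavaScript. This is only a simplified illustration and not how Elasticsearch is implemented internally. The function names and the stop-word list are made up for this sketch.</p>

```javascript
// Simplified illustration only: real Elasticsearch analyzers are far more
// sophisticated than this toy pipeline.

// Tokenizer: split raw text into word tokens.
function tokenize(text) {
  return text.split(/[^A-Za-z0-9]+/).filter(Boolean);
}

// Token filters: each one takes a list of tokens and returns a modified list.
function lowercaseFilter(tokens) {
  return tokens.map((token) => token.toLowerCase());
}

// Hypothetical stop-word list, chosen only for this example.
const STOP_WORDS = new Set(['a', 'an', 'the', 'is', 'to']);

function stopWordFilter(tokens) {
  return tokens.filter((token) => !STOP_WORDS.has(token));
}

// Analyzer: a tokenizer followed by a chain of filters.
function analyze(text) {
  return [lowercaseFilter, stopWordFilter].reduce(
    (tokens, filter) => filter(tokens),
    tokenize(text)
  );
}

console.log(analyze('How to Index a Post').join(' '));
// how index post
```

<p>The tokens that come out of the analyzer are what Elasticsearch stores in the index and later matches against search terms.</p>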
<h2 id="heading-how-to-set-up-elasticsearch">How to Set Up Elasticsearch</h2>
<p>For the purpose of this tutorial, we'll use Elasticsearch's installable software on our local machine. Online hosted versions of Elasticsearch also exist which work hitch-free as well.</p>
<p><a href="https://www.elastic.co/docs/deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows">Here</a> is a link detailing how to set up Elasticsearch on Windows. If you're not on Windows, you can also install Elasticsearch on <a href="https://www.elastic.co/docs/deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos">Linux/macOS</a> or use <a href="https://www.elastic.co/docs/deploy-manage/deploy/self-managed/install-elasticsearch-with-docker">Docker</a>.</p>
<p><strong>Note:</strong> If you're on Windows, make sure you run Elasticsearch as an Administrator to avoid installation errors.</p>
<p>After successful installation, you can test if it's functioning by navigating to <code>localhost:9200</code> which serves as the default local endpoint for Elasticsearch. There you'll see a success message on the screen similar to the image below:</p>
<img src="https://cdn.hashnode.com/uploads/covers/64bba6ecb09308034572f437/ad94560d-7629-45f1-a94d-eaed0b61cefe.png" alt="elastic search localhost homepage" style="display:block;margin:0 auto" width="813" height="744" loading="lazy">

<p>With that, we'll move on to setting up our demo project and integrating Elasticsearch into it.</p>
<h2 id="heading-how-to-set-up-the-demo-project">How to Set Up the Demo Project</h2>
<p>For this tutorial, we'll use a ready-built forum backend application written in Node.js with Express. Here is the link to the project.</p>
<p>To get the project up and running, clone the repository and run:</p>
<p><code>npm start</code></p>
<p>MySQL will serve as the database for this tutorial. Let's now proceed to the next section.</p>
<h2 id="heading-how-to-set-up-elasticsearch-in-your-project">How to Set Up Elasticsearch in Your Project</h2>
<p>The existing demo project is a backend implementation of a forum site which allows users to post text content and facilitate discussions through category-based threads.</p>
<p>Elasticsearch is great for ensuring that users can sift through these posts and threads to accurately locate key content using distinct keywords. This is more effective than using traditional database search queries which can be cumbersome.</p>
<p>To set up Elasticsearch, start by installing the Elasticsearch <code>npm</code> package. To do this, run the command below in your project directory:</p>
<pre><code class="language-shell">npm install @elastic/elasticsearch
</code></pre>
<p>After successful installation, create a <code>config.js</code> file where you'll set up the client that connects to your Elasticsearch instance.</p>
<pre><code class="language-javascript">const { Client } = require('@elastic/elasticsearch');

const esClient = new Client({
  node: 'http://localhost:9200',
  auth: {
    username: process.env.ELASTICSEARCH_USERNAME,
    password: process.env.ELASTICSEARCH_PASSWORD
  },
  maxRetries: 5,
  requestTimeout: 60000,
  tls: {
    rejectUnauthorized: process.env.NODE_ENV !== 'development'
  }
});

module.exports = esClient;
</code></pre>
<p>To use Elasticsearch from your backend application, you need to set up and configure the Elasticsearch client. The config file above does this.</p>
<p>As mentioned earlier, Elasticsearch listens on <code>localhost:9200</code> by default, so the <code>node</code> option points to that address. A hosted Elasticsearch node works the same way: you supply its URL instead.</p>
<p>Next, the config file provides the credentials required to access Elasticsearch. The username and password go in the <code>auth</code> object. If you're running Elasticsearch locally, authentication may not be required unless security is enabled.</p>
<p>The <code>maxRetries</code> option sets the maximum number of times the client retries a failed request, which is 5 here. <code>requestTimeout</code> is the time in milliseconds after which a request is abandoned if it hasn't completed. The <code>tls.rejectUnauthorized</code> option controls whether the client verifies the server's TLS certificate. Here, verification is turned off only in development.</p>
<p>Once you've completed the config file, you'll import it so that the Elasticsearch client is initialized when your backend starts.</p>
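<p>If you'd rather not hard-code the node URL, one option is to build the client options from environment variables. The sketch below is a suggestion and not part of the demo project: <code>buildClientOptions</code> and the <code>ELASTICSEARCH_NODE</code> variable are names invented for this example, while the username and password variables match the config file above.</p>

```javascript
// Build the options object for the Elasticsearch client from environment
// variables. ELASTICSEARCH_NODE is a hypothetical variable name chosen for
// this sketch; it falls back to the local default used in this tutorial.
function buildClientOptions(env) {
  const options = {
    node: env.ELASTICSEARCH_NODE || 'http://localhost:9200',
    maxRetries: 5,
    requestTimeout: 60000
  };

  // Only send credentials when both values are set, since a local node
  // may run without security enabled.
  const username = env.ELASTICSEARCH_USERNAME;
  const password = env.ELASTICSEARCH_PASSWORD;
  if ([username, password].every(Boolean)) {
    options.auth = { username, password };
  }

  // TLS settings are only relevant when the node is reached over HTTPS.
  if (options.node.startsWith('https://')) {
    options.tls = { rejectUnauthorized: env.NODE_ENV !== 'development' };
  }

  return options;
}

// Usage (requires the @elastic/elasticsearch package, so it is commented out):
// const { Client } = require('@elastic/elasticsearch');
// const esClient = new Client(buildClientOptions(process.env));
console.log(buildClientOptions({ NODE_ENV: 'development' }));
```

<p>Keeping this logic in a plain function also makes it easy to check the resulting options without a running Elasticsearch node.</p>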
<h2 id="heading-how-to-work-with-indexes-in-elasticsearch">How to Work with Indexes in Elasticsearch</h2>
<p>Before we start harnessing the full power of Elasticsearch, we need to customize its search capabilities within the backend of the project. This involves setting up an index within the Elasticsearch Engine that indexes all posts made to the backend application.</p>
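<pre><code class="language-javascript">const esClient = require('./config');

// Name of the index that stores forum posts ('posts' is only an example name).
const INDEX_NAME = 'posts';

// Note: indexMapping is the settings and mappings object defined in the mapping section.
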
const setupIndex = async () =&gt; {
  try {
    const indexExists = await esClient.indices.exists({
      index: INDEX_NAME
    });

    if (indexExists) {
      console.log(`Index "${INDEX_NAME}" already exists`);
      return;
    }

    await esClient.indices.create({
      index: INDEX_NAME,
      ...indexMapping
    });

    console.log(`Index "${INDEX_NAME}" created`);
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
<p>The code above creates a new index. When you call <code>setupIndex()</code>, it first asks Elasticsearch whether an index named <code>INDEX_NAME</code> already exists.</p>
<p>If the index already exists, the function returns early to avoid creating it twice. Otherwise, it creates the index with that name and applies the index mapping rules (which we'll discuss shortly).</p>
<p>After creating the index, you'll see a success message in your application console.</p>
<h3 id="heading-how-to-delete-an-index">How to Delete an Index</h3>
<p>After a while, an index may no longer serve its purpose and you may need to remove it from Elasticsearch.</p>
<p>You can do this by executing the <code>esClient.indices.delete()</code> command as shown below:</p>
<pre><code class="language-javascript">const deleteIndex = async () =&gt; {
  try {
    await esClient.indices.delete({ index: INDEX_NAME });
    console.log(`${INDEX_NAME} deleted`);
  } catch (err) {
    console.error("Error deleting index:", err);
  }
};
</code></pre>
<h3 id="heading-how-to-delete-a-post-within-an-index">How to Delete a Post within an Index</h3>
<p>Posts sometimes get deleted or modified. Users may also get banned, after which you'd want to remove their content from the database.</p>
<p>In these cases, you'll want to ensure true deletion, that is, deletion from both the database and the Elasticsearch index.</p>
<p>To do this, call the <code>esClient.delete()</code> function, passing it an object that contains the index name and the unique ID of the post you want to delete.</p>
<pre><code class="language-javascript">const deletePost = async (postId) =&gt; {
  try {
    await esClient.delete({
      index: INDEX_NAME,
      id: postId.toString(),
    });

    console.log("Post successfully deleted");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
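<p>A post may already be missing from the index by the time you try to delete it, in which case the client reports an HTTP 404 error. The sketch below shows one way to treat that case as a success. It assumes the error exposes the status code as <code>err.meta.statusCode</code>, which is how the official client's <code>ResponseError</code> behaves to my knowledge. The helper names are invented for this example, and the client is passed in as an argument so the function can be tried with a stub.</p>

```javascript
// Returns true when an error thrown by the client reports HTTP 404.
// Assumption: the HTTP status is available as err.meta.statusCode.
function isNotFoundError(err) {
  return err?.meta?.statusCode === 404;
}

// Delete a post from the index, treating "already gone" as success.
// `client` is any object with a compatible delete() method.
async function deletePostIfExists(client, indexName, postId) {
  try {
    await client.delete({ index: indexName, id: String(postId) });
    return { success: true, deleted: true, postId };
  } catch (err) {
    if (isNotFoundError(err)) {
      return { success: true, deleted: false, postId };
    }
    throw err; // Any other failure is still surfaced to the caller.
  }
}

// Stub client that simulates a missing document, for demonstration only.
const stubClient = {
  delete: async () => {
    const err = new Error('document missing');
    err.meta = { statusCode: 404 };
    throw err;
  }
};

deletePostIfExists(stubClient, 'posts', 42).then((result) => {
  console.log(result.deleted); // false
});
```

<p>In the real application you would pass <code>esClient</code> and <code>INDEX_NAME</code> instead of the stub and the example index name.</p>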
<h3 id="heading-how-to-index-a-post">How to Index a Post</h3>
<p>After setting up the Elasticsearch Index, you'll want to automatically index posts made to the database into the Elasticsearch index.</p>
<p>To do this, you'll need to make sure that the post is compatible with your index schema via the <code>transformPostToESDoc</code> function. This function extracts and formats the post data so it matches the Elasticsearch document structure.</p>
<pre><code class="language-javascript">// Map a database post record to the document shape defined in the index mapping.
const transformPostToESDoc = (post) =&gt; {
  return {
    id: post.id,
    title: post.title,
    content: post.body,
    author: post.author,
    category: post.category,
    tags: post.tags,
    views: post.views || 0,
    published_at: post.created_at
  };
};

const indexPost = async (postId) =&gt; {
  try {
    const postRepo = await getPostRepo();
    const post = await postRepo.findOne({ where: { id: postId } });

    if (!post) {
      throw new Error("Post not available");
    }

    const esDocument = transformPostToESDoc(post);

    await esClient.index({
      index: INDEX_NAME,
      id: post.id.toString(),
      document: esDocument
    });

    console.log("Post successfully indexed");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
<p>The post to be indexed must have a unique ID. For ease of use, we reuse the post's unique ID from the database as the Elasticsearch document ID. Optionally, you can also use UUID libraries to generate unique post IDs.</p>
<p>The post data is then passed to the <code>esClient.index()</code> function as the document to be indexed. If anything goes wrong, the function logs the error and rethrows it so the caller can decide how to handle the failure.</p>
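<p>If your database already contains posts, indexing them one at a time is slow. This is where the bulk index term from earlier comes in. The sketch below builds the request body for the bulk API, where every document is preceded by an action entry that names the target index and document ID. The helper names and sample posts are invented for this example, and the <code>operations</code> parameter assumes version 8 of the JavaScript client.</p>

```javascript
// Same shape as the transform function above, repeated so the sketch is
// self-contained.
function toESDoc(post) {
  return {
    id: post.id,
    title: post.title,
    content: post.body,
    author: post.author,
    category: post.category,
    tags: post.tags,
    views: post.views || 0,
    published_at: post.created_at
  };
}

// Build the flat list of bulk operations: an action entry followed by the
// document itself, for every post.
function buildBulkOperations(posts, indexName) {
  return posts.flatMap((post) => [
    { index: { _index: indexName, _id: String(post.id) } },
    toESDoc(post)
  ]);
}

// Hypothetical sample data.
const samplePosts = [
  { id: 1, title: 'First', body: 'First body', author: 'ada', category: 'general', tags: [], created_at: '2026-01-01' },
  { id: 2, title: 'Second', body: 'Second body', author: 'tobi', category: 'help', tags: ['node'], views: 3, created_at: '2026-01-02' }
];

console.log(buildBulkOperations(samplePosts, 'posts').length); // 4

// Usage with the client (not executed here):
// const response = await esClient.bulk({ operations: buildBulkOperations(posts, INDEX_NAME) });
// if (response.errors) { /* inspect response.items for per-document failures */ }
```

<p>For large tables, you'd typically send the posts in batches instead of building one huge request.</p>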
<h3 id="heading-how-to-define-elastic-search-mapping-rules">How to Define Elasticsearch Mapping Rules</h3>
<p>Elasticsearch mappings define how your data is stored and indexed. They specify the data type of each field and how text is analyzed for search.</p>
<p>In the example below, we'll define an index configuration that includes custom analyzers for autocomplete and mappings for each post field (like title, content, and author).</p>
<pre><code class="language-javascript">const indexMapping = {
  settings: {
    analysis: {
      analyzer: {
        autocomplete: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase', 'autocomplete_filter']
        },
        autocomplete_search: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase']
        }
      },
      filter: {
        autocomplete_filter: {
          type: 'edge_ngram',
          min_gram: 2,
          max_gram: 10
        }
      }
    }
  },
  mappings: {
    properties: {
      id: { type: 'integer' },
      title: {
        type: 'text',
        analyzer: 'autocomplete',
        search_analyzer: 'autocomplete_search',
        fields: {
          keyword: { type: 'keyword' },
          standard: { type: 'text' }
        }
      },
      content: {
        type: 'text',
        analyzer: 'standard'
      },
      category: {
        type: 'keyword'
      },
      tags: { type: 'keyword' },
      author: {
        type: 'text',
        fields: {
          keyword: { type: 'keyword' }
        }
      },
      views: { type: 'integer' },
      published_at: { type: 'date' }
    }
  }
};
</code></pre>
<p>The <code>indexMapping</code> object defines how Elasticsearch should store and process your data. It consists of two main parts: <code>settings</code> and <code>mappings</code>.</p>
<p>The <code>mappings</code> section defines the structure of your documents. Each field (like <code>title</code>, <code>content</code>, or <code>author</code>) has a type such as <code>text</code>, <code>keyword</code>, <code>integer</code>, or <code>date</code>. This tells Elasticsearch how to store and search that field.</p>
<p>For text fields, we can also define analyzers. Analyzers control how text is broken into smaller pieces (tokens) during indexing and search.</p>
<p>In the <code>settings</code> section, we defined a custom analyzer for autocomplete. This uses an <code>edge_ngram</code> filter to generate partial word matches, so users can find results as they type. We also defined a separate <code>search_analyzer</code> to ensure that search queries are processed correctly.</p>
<p>Together, these settings allow you to support features like autocomplete while keeping search results accurate and efficient.</p>
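<p>To see why the <code>edge_ngram</code> filter enables autocomplete, it helps to look at the tokens it produces. The function below is a simplified imitation written for this explanation, not Elasticsearch's actual implementation. It uses the same <code>min_gram</code> and <code>max_gram</code> values as the mapping above.</p>

```javascript
// Produce the prefixes an edge n-gram filter would emit for a single token.
// Simplified: tokens shorter than minGram produce no output here.
function edgeNgrams(token, minGram = 2, maxGram = 10) {
  const longest = Math.min(maxGram, token.length);
  const count = Math.max(0, longest - minGram + 1);
  return Array.from({ length: count }, (_, i) => token.slice(0, minGram + i));
}

console.log(edgeNgrams('elastic').join(' '));
// el ela elas elast elasti elastic
```

<p>Because these prefixes are stored at index time, a user who has typed only <code>ela</code> already matches a title containing "elastic". The search analyzer deliberately skips the <code>edge_ngram</code> step so that the typed text is matched as-is, and isn't itself broken into shorter prefixes that would match unrelated words.</p>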
<h2 id="heading-search-implementation">Search Implementation</h2>
<p>In order to implement your search functionality, you'll need to build out the API. This involves building the business logic service and the API route. You'll also use <code>GET</code> requests and attach your search term as a query. The result it generates will be received as a JSON document.</p>
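<p>Since the search term and pagination values arrive as query-string text, it's worth validating them before they reach Elasticsearch. Here's a minimal sketch of that step. The parameter names (<code>q</code>, <code>page</code>, <code>size</code>), the <code>MAX_PAGE_SIZE</code> limit, and the function name are assumptions made for this example and don't come from the demo project.</p>

```javascript
// Hypothetical upper bound on results per page, chosen for this sketch.
const MAX_PAGE_SIZE = 50;

// Turn raw query-string values (always strings) into safe search parameters.
function parseSearchParams(rawQuery) {
  const term = String(rawQuery.q || '').trim();
  if (term === '') {
    throw new Error('Search term "q" is required');
  }

  // Fall back to page 1 and 10 results when a value is missing or invalid.
  const page = Math.max(1, Number.parseInt(rawQuery.page, 10) || 1);
  const requestedSize = Number.parseInt(rawQuery.size, 10) || 10;
  const size = Math.min(MAX_PAGE_SIZE, Math.max(1, requestedSize));

  return { term, page, size };
}

// Example: the values Express would expose as req.query for
// GET /search?q=node&page=2 (a hypothetical route).
console.log(parseSearchParams({ q: ' node ', page: '2' }));
// { term: 'node', page: 2, size: 10 }
```

<p>In an Express route handler, you could pass <code>req.query</code> to this function and respond with a 400 status if it throws.</p>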
<p>Then you'll implement the search service function, which uses Elasticsearch to look for the search phrase within the index. In line with best practices, you'll paginate the results so that each request returns only a manageable slice of the matches.</p>
<p>The search request consists of the index name, the pagination parameters (<code>from</code> is the offset of the first result and <code>size</code> is the maximum number of results to return), and a query object that specifies how Elasticsearch should match documents.</p>
<pre><code class="language-javascript">const searchElastic = async (query, page = 1, size = 10) =&gt; {
  const searchQuery = {
    index: INDEX_NAME,
    from: (page - 1) * size,
    size,
    query: {
      bool: {
        must: [
          {
            multi_match: {
              query,
              fields: ["title^3", "content"],
              type: "best_fields",
              fuzziness: "AUTO"
            }
          }
        ]
      }
    }
  };

  const result = await esClient.search(searchQuery);
  return result.hits.hits;
};
</code></pre>
<p>In the code above, the function is named <code>searchElastic</code>. It takes three parameters: <code>query</code>, which is required, and <code>page</code> and <code>size</code>, which default to 1 and 10.</p>
<p>The <code>size</code> parameter specifies the maximum number of documents returned per request. You can set its default to any positive integer.</p>
<p>The query uses a <code>multi_match</code> clause to search across multiple fields, such as <code>title</code> and <code>content</code>. The <code>title^3</code> syntax boosts matches in the title, making them count for more than matches in other fields. Setting <code>fuzziness</code> to <code>AUTO</code> lets the query tolerate small typos in the search term.</p>
<p>We also included a <code>must</code> clause which defines conditions that documents must match to be included in the results.</p>
<p>By default, Elasticsearch ranks the search results by their relevance score, so the best matches come first.</p>
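<p>The function above returns the raw hits, where each hit wraps the stored document in a <code>_source</code> field next to metadata such as <code>_id</code> and <code>_score</code>. The sketch below shows how the <code>from</code> offset is calculated and one way to flatten the hits before sending them to the client. The helper names and the sample hit are invented for this example.</p>

```javascript
// Offset of the first result for a 1-based page number.
function pageOffset(page, size) {
  return (page - 1) * size;
}

// Flatten raw hits into plain objects for the API response.
function formatHits(hits) {
  return hits.map((hit) => ({
    ...hit._source,
    id: hit._id,
    score: hit._score
  }));
}

console.log(pageOffset(3, 10)); // 20, so page 3 starts at the 21st result

// Hypothetical hit, shaped like an entry in result.hits.hits.
const sampleHits = [
  {
    _index: 'posts',
    _id: '7',
    _score: 4.2,
    _source: { title: 'Getting started with Node.js', category: 'general' }
  }
];

console.log(formatHits(sampleHits));
```

<p>Returning the score alongside each post can be handy while you're tuning boosts such as <code>title^3</code>.</p>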
<h2 id="heading-full-code">Full Code</h2>
<p>With this, you've completed this tutorial and have configured Elasticsearch to index posts made to your database. Here's the full code:</p>
<ol>
<li>Elasticsearch Client (config.js):</li>
</ol>
<pre><code class="language-javascript">const { Client } = require('@elastic/elasticsearch');

const esClient = new Client({
  node: 'http://localhost:9200',
  auth: {
    username: process.env.ELASTICSEARCH_USERNAME,
    password: process.env.ELASTICSEARCH_PASSWORD
  },
  maxRetries: 5,
  requestTimeout: 60000,
  tls: {
    rejectUnauthorized: process.env.NODE_ENV !== 'development'
  }
});

module.exports = esClient;
</code></pre>
<ol start="2">
<li>Index mapping:</li>
</ol>
<pre><code class="language-javascript">const indexMapping = {
  settings: {
    analysis: {
      analyzer: {
        autocomplete: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase', 'autocomplete_filter']
        },
        autocomplete_search: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase']
        }
      },
      filter: {
        autocomplete_filter: {
          type: 'edge_ngram',
          min_gram: 2,
          max_gram: 10
        }
      }
    }
  },
  mappings: {
    properties: {
      id: { type: 'integer' },
      title: {
        type: 'text',
        analyzer: 'autocomplete',
        search_analyzer: 'autocomplete_search',
        fields: {
          keyword: { type: 'keyword' },
          standard: { type: 'text' }
        }
      },
      content: {
        type: 'text',
        analyzer: 'standard'
      },
      category: {
        type: 'keyword'
      },
      tags: { type: 'keyword' },
      author: {
        type: 'text',
        fields: {
          keyword: { type: 'keyword' }
        }
      },
      views: { type: 'integer' },
      published_at: { type: 'date' }
    }
  }
};
</code></pre>
<ol start="3">
<li>Create index:</li>
</ol>
<pre><code class="language-javascript">const setupIndex = async () =&gt; {
  try {
    const indexExists = await esClient.indices.exists({
      index: INDEX_NAME
    });

    if (indexExists) {
      console.log(`Index "${INDEX_NAME}" already exists`);
      return;
    }

    await esClient.indices.create({
      index: INDEX_NAME,
      ...indexMapping
    });

    console.log(`Index "${INDEX_NAME}" created`);
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
<ol start="4">
<li>Delete index:</li>
</ol>
<pre><code class="language-javascript">const deleteIndex = async () =&gt; {
  try {
    await esClient.indices.delete({ index: INDEX_NAME });
    console.log(`${INDEX_NAME} deleted`);
  } catch (err) {
    console.error("Error deleting index:", err);
  }
};
</code></pre>
<ol start="5">
<li>Delete document (post):</li>
</ol>
<pre><code class="language-javascript">const deletePost = async (postId) =&gt; {
  try {
    await esClient.delete({
      index: INDEX_NAME,
      id: postId.toString()
    });

    console.log("Post successfully deleted");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
<ol start="6">
<li>Transform and index post:</li>
</ol>
<pre><code class="language-javascript">// Map a database post record to the document shape defined in the index mapping.
const transformPostToESDoc = (post) =&gt; {
  return {
    id: post.id,
    title: post.title,
    content: post.body,
    author: post.author,
    category: post.category,
    tags: post.tags,
    views: post.views || 0,
    published_at: post.created_at
  };
};

const indexPost = async (postId) =&gt; {
  try {
    const postRepo = await getPostRepo();
    const post = await postRepo.findOne({ where: { id: postId } });

    if (!post) {
      throw new Error("Post not available");
    }

    const esDocument = transformPostToESDoc(post);

    await esClient.index({
      index: INDEX_NAME,
      id: post.id.toString(),
      document: esDocument
    });

    console.log("Post successfully indexed");
    return { success: true, postId };
  } catch (err) {
    console.error(err);
    throw err;
  }
};
</code></pre>
<ol start="7">
<li>Search function:</li>
</ol>
<pre><code class="language-javascript">const searchElastic = async (query, page = 1, size = 10) =&gt; {
  const searchQuery = {
    index: INDEX_NAME,
    from: (page - 1) * size,
    size,
    query: {
      bool: {
        must: [
          {
            multi_match: {
              query,
              fields: ["title^3", "content"],
              type: "best_fields",
              fuzziness: "AUTO"
            }
          }
        ]
      }
    }
  };

  const result = await esClient.search(searchQuery);
  return result.hits.hits;
};
</code></pre>
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>Now you know how to use Elasticsearch to improve user search in your web applications. Elasticsearch is language-agnostic, which allows you to use it across programming languages and frameworks. Its large community also provides helpful guides to make onboarding easier.</p>
<p>To further harness Elasticsearch's power, you can explore other tools within the <strong>ELK</strong> stack (Elasticsearch, Logstash, and Kibana) that'll help you generate high-quality data visualizations, especially for enterprise applications.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>A fast and reliable search engine isn’t negotiable in your web applications these days. Elasticsearch is your go-to for getting this done.</p>
<p>If you would like to read other articles that will enhance your tech journey, feel free to check out <a href="https://portfolio-oluwatobi.netlify.app/">my website here</a>. Stay active!</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Learn Elasticsearch with a Comprehensive Beginner-Friendly Course ]]>
                </title>
                <description>
                    <![CDATA[ Search functionality is one of the most critical features of modern applications, whether you're building websites, e-commerce platforms, or data-driven applications. But how do you create powerful and efficient search engines that can handle vast am... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/learn-elasticsearch-with-a-comprehensive-beginner-friendly-course/</link>
                <guid isPermaLink="false">675afe6e7c306f8d973e7281</guid>
                
                    <category>
                        <![CDATA[ elasticsearch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Thu, 12 Dec 2024 15:17:02 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1734016608619/82574380-c09a-4442-97b0-3e707a1675d2.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Search functionality is one of the most critical features of modern applications, whether you're building websites, e-commerce platforms, or data-driven applications. But how do you create powerful and efficient search engines that can handle vast amounts of data and provide relevant results quickly? The answer lies in Elasticsearch, one of the most popular and flexible search engines available today. If you want to learn how to leverage Elasticsearch in your projects, this course is the perfect starting point!</p>
<p>We just published a comprehensive course on the freeCodeCamp.org YouTube channel designed for beginners who want to understand Elasticsearch from the ground up. Created by 3CodeCampers, this course offers a perfect mix of theory and hands-on practice. You'll start by learning the fundamentals of Elasticsearch, such as index management, document storage, text analysis, and search functionality. Then, you'll move on to advanced topics like semantic search, embeddings, and pipelines. The second part of the course focuses on applying your new skills by building a real-world project: a search engine for the Astronomy Picture of the Day (APOD) dataset.</p>
<h3 id="heading-what-youll-learn-in-this-course">What You'll Learn in This Course</h3>
<p>The course is split into two parts to provide a complete learning experience:</p>
<h4 id="heading-part-1-elasticsearch-fundamentals">Part 1: Elasticsearch Fundamentals</h4>
<p>In the first part, you'll dive deep into the essential concepts of Elasticsearch, including:</p>
<ul>
<li><p><strong>Index Management</strong>: Learn how to create and manage indexes to organize your data effectively.</p>
</li>
<li><p><strong>Document Storage</strong>: Understand how to store and retrieve documents using Elasticsearch APIs.</p>
</li>
<li><p><strong>Text Analysis and Tokenization</strong>: Discover how Elasticsearch breaks down text for powerful search capabilities.</p>
</li>
<li><p><strong>Search API</strong>: Learn how to perform simple and advanced searches, including filtering and aggregation.</p>
</li>
<li><p><strong>Semantic Search and Embeddings</strong>: Explore how to incorporate semantic search and dense vector embeddings for more relevant results.</p>
</li>
<li><p><strong>Pipelines and Ingest Processors</strong>: Automate data processing before storing it in Elasticsearch.</p>
</li>
<li><p><strong>Advanced Features</strong>: Delve into deep pagination, SQL search API, and more.</p>
</li>
</ul>
<p>This foundational knowledge is applicable to any programming language, but the course uses Python for demonstrations, making it easy to follow along.</p>
<h4 id="heading-part-2-real-world-project-build-a-search-engine-for-apod">Part 2: Real-World Project – Build a Search Engine for APOD</h4>
<p>In the second part, you’ll apply everything you've learned by building a practical project. You’ll create a search engine for NASA's <strong>Astronomy Picture of the Day (APOD)</strong> dataset. This project will give you hands-on experience with key skills like:</p>
<ul>
<li><p><strong>Data Cleaning Pipelines</strong>: Prepare and clean the dataset for optimal search performance.</p>
</li>
<li><p><strong>Tokenization and Analysis</strong>: Break down text data to enable efficient search queries.</p>
</li>
<li><p><strong>Search Functionality</strong>: Implement powerful search features, including pagination and filtering.</p>
</li>
<li><p><strong>Aggregations</strong>: Summarize and analyze search results to extract insights.</p>
</li>
</ul>
<p>By the end of the project, you'll have a fully functional search engine and a deeper understanding of how Elasticsearch can enhance your applications.</p>
<h3 id="heading-course-contents">Course Contents</h3>
<p>This in-depth course spans 5 hours and covers a wide range of topics:</p>
<ol>
<li><p><strong>Introduction and Installation</strong></p>
</li>
<li><p><strong>Index Management and Document Storage</strong></p>
</li>
<li><p><strong>Text Analysis, Searching, and Pipelines</strong></p>
</li>
<li><p><strong>Advanced Features like Embeddings and Semantic Search</strong></p>
</li>
<li><p><strong>Final Project – Building a Real-World Search Engine</strong></p>
</li>
</ol>
<h3 id="heading-why-learn-elasticsearch">Why Learn Elasticsearch?</h3>
<p>Elasticsearch is a powerful tool used by companies worldwide for search, logging, and analytics. Whether you're a developer, data scientist, or tech enthusiast, mastering Elasticsearch can open new career opportunities and enhance your ability to build efficient, scalable applications. This course makes learning Elasticsearch accessible, practical, and fun!</p>
<p>You can watch the full course on <a target="_blank" href="https://youtu.be/a4HBKEda_F8">the freeCodeCamp.org YouTube channel</a> (5-hour watch).</p>
<div class="embed-wrapper">
        <iframe width="560" height="315" src="https://www.youtube.com/embed/a4HBKEda_F8" style="aspect-ratio: 16 / 9; width: 100%; height: auto;" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" loading="lazy"></iframe></div>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Set Up Geolocation Search in Your App with Elasticsearch ]]>
                </title>
                <description>
                    <![CDATA[ By Pramono Winata Location-based features are pretty common in apps nowadays. These features might seem complicated, but they can actually be implemented quite easily with Elasticsearch. Elasticsearch is a NoSQL database with a document-based structu... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/geolocation-search-elasticsearch/</link>
                <guid isPermaLink="false">66d4608b47a8245f78752a9d</guid>
                
                    <category>
                        <![CDATA[ database ]]>
                    </category>
                
                    <category>
                        <![CDATA[ elasticsearch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ geolocation ]]>
                    </category>
                
                    <category>
                        <![CDATA[ search ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Thu, 07 Jan 2021 17:12:37 +0000</pubDate>
                <media:content url="https://cdn-media-2.freecodecamp.org/w1280/5fd644e7e6787e098393e278.jpg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Pramono Winata</p>
<p>Location-based features are pretty common in apps nowadays. These features might seem complicated, but they can actually be implemented quite easily with Elasticsearch.</p>
<p>Elasticsearch is a NoSQL database with a document-based structure. It's often used as a search engine. It also provides its own query syntax and many tools to make your searches as flexible as possible.</p>
<p>In this article, I will show you a simple way to search by geolocation: getting a list of cities within a given distance of a coordinate.</p>
<h2 id="heading-how-to-install-elasticsearch">How to Install Elasticsearch</h2>
<p><img src="https://www.freecodecamp.org/news/content/images/2021/01/1-4thJErMA9UpuP1jEBLRWFQ.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>You can find an easy-to-follow <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/7.4/install-elasticsearch.html">installation guide</a> on Elasticsearch's website. At the time of writing, I am using Elasticsearch version 7.4.2.</p>
<p>Just keep in mind that Elasticsearch has made a lot of changes in recent versions, one of them being the <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html">removal of mapping types.</a> So if you are using another version of Elasticsearch, some things here might not fully work.</p>
<p>After installing, don't forget to start the Elasticsearch service, as the installation guide points out (on Linux, run <code>./bin/elasticsearch</code>).</p>
<p><strong>Make sure Elasticsearch is running</strong> by sending a GET request to port 9200 on your local machine, like this: <a target="_blank" href="http://localhost:9200"><code>GET http://localhost:9200</code></a></p>
<h2 id="heading-how-to-make-your-elasticsearch-index">How to Make Your Elasticsearch Index</h2>
<p>An index is similar to a table in a relational database. For this example, let's make an index named <code>cities</code> that will contain our data.</p>
<p>Let's also define a simple model for our data: </p>
<ul>
<li><code>id</code> : <code>keyword</code> for our identifier</li>
<li><code>name</code> : <code>text</code> for the city name</li>
<li><code>coordinate</code> : <code>geo_point</code> to store our city coordinates (conveniently, Elasticsearch has a built-in data type for this)</li>
</ul>
<p>In Elasticsearch, we create the index by sending a request to its REST API, for example with curl. In our case, the request looks like this:</p>
<pre><code>PUT http:<span class="hljs-comment">//localhost:9200/cities</span>
</code></pre><pre><code class="lang-json">{
    <span class="hljs-attr">"settings"</span>: {
        <span class="hljs-attr">"number_of_shards"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-attr">"number_of_replicas"</span>: <span class="hljs-number">1</span>
    },
    <span class="hljs-attr">"mappings"</span>: {
        <span class="hljs-attr">"properties"</span>: {
            <span class="hljs-attr">"id"</span>: {
                <span class="hljs-attr">"type"</span>: <span class="hljs-string">"keyword"</span>
            },
            <span class="hljs-attr">"name"</span>: {
                <span class="hljs-attr">"type"</span>: <span class="hljs-string">"text"</span>
            },
            <span class="hljs-attr">"coordinate"</span>: {
                <span class="hljs-attr">"type"</span>: <span class="hljs-string">"geo_point"</span>
            }
        }
    }
}
</code></pre>
<p>After sending that request, you should get a response like this, confirming that your index has been created:</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"acknowledged"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">"shards_acknowledged"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">"index"</span>: <span class="hljs-string">"cities"</span>
}
</code></pre>
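<p>If you would rather script this step than type it by hand, here is a minimal sketch using only Python's standard library. It assumes Elasticsearch is listening on <code>localhost:9200</code> as above. The names <code>build_create_index_request</code> and <code>create_index</code> are illustrative helpers, not part of any Elasticsearch client.</p>

```python
import json
import urllib.request

# Assumption: Elasticsearch is reachable at this address, as in the article.
ES_URL = "http://localhost:9200"

# Same settings and mappings as the JSON request body shown earlier.
CITIES_INDEX = {
    "settings": {"number_of_shards": 1, "number_of_replicas": 1},
    "mappings": {
        "properties": {
            "id": {"type": "keyword"},
            "name": {"type": "text"},
            "coordinate": {"type": "geo_point"},
        }
    },
}


def build_create_index_request(index_name, body):
    """Build (but do not send) the PUT request that creates an index."""
    return urllib.request.Request(
        url=f"{ES_URL}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def create_index(index_name, body):
    """Send the request and return the decoded JSON response."""
    request = build_create_index_request(index_name, body)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Needs a running Elasticsearch node, so it is left commented out:
# print(create_index("cities", CITIES_INDEX))
```

<p>Keeping the request builder separate from the network call makes it easy to inspect the method, URL, and body before anything is sent.</p>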
<p>Nicely done! Now your index is ready to be used. Let's go ahead and play around with our newly created index.</p>
<h2 id="heading-how-to-populate-elasticsearch-data">How to Populate Elasticsearch Data</h2>
<p>We will now fill our Elasticsearch index with documents. If you are not familiar with this term, just know that a document is very similar to a row in a SQL database.</p>
<p>In Elasticsearch, it's possible to store data that doesn't match with our predefined schema. But we will not do that here – instead we will insert data that matches our predefined schema.</p>
<p>Since we will be inserting multiple documents at once, we will use the <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html">bulk</a> API, which allows multiple insertions in one API call.</p>
<p>In the example below, I will be inserting 9 cities into my index. Feel free to add more if you wish.</p>
<p><code>POST http://localhost:9200/cities/_bulk</code></p>
<pre><code class="lang-json">{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"cities"</span> } }
{ <span class="hljs-attr">"id"</span>: <span class="hljs-number">1</span>, <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Jakarta"</span>, <span class="hljs-attr">"coordinate"</span>: {  <span class="hljs-attr">"lat"</span>: <span class="hljs-number">-6.2008</span>, <span class="hljs-attr">"lon"</span>: <span class="hljs-number">106.8456</span>}}
{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"cities"</span> } }
{ <span class="hljs-attr">"id"</span>: <span class="hljs-number">2</span>, <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Tokyo"</span>, <span class="hljs-attr">"coordinate"</span>: {  <span class="hljs-attr">"lat"</span>: <span class="hljs-number">35.6762</span>, <span class="hljs-attr">"lon"</span>: <span class="hljs-number">139.6503</span>} }
{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"cities"</span> } }
{ <span class="hljs-attr">"id"</span>: <span class="hljs-number">3</span>, <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Hong Kong"</span>, <span class="hljs-attr">"coordinate"</span>: {  <span class="hljs-attr">"lat"</span>: <span class="hljs-number">22.3193</span>, <span class="hljs-attr">"lon"</span>: <span class="hljs-number">114.1694</span>} }
{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"cities"</span> } }
{ <span class="hljs-attr">"id"</span>: <span class="hljs-number">4</span>, <span class="hljs-attr">"name"</span>: <span class="hljs-string">"New York"</span>, <span class="hljs-attr">"coordinate"</span>: {  <span class="hljs-attr">"lat"</span>: <span class="hljs-number">40.7128</span>, <span class="hljs-attr">"lon"</span>: <span class="hljs-number">-74.0060</span>} }
{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"cities"</span> } }
{ <span class="hljs-attr">"id"</span>: <span class="hljs-number">5</span>, <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Paris"</span>, <span class="hljs-attr">"coordinate"</span>: {  <span class="hljs-attr">"lat"</span>: <span class="hljs-number">48.8566</span>, <span class="hljs-attr">"lon"</span>: <span class="hljs-number">2.3522</span>} }
{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"cities"</span> } }
{ <span class="hljs-attr">"id"</span>: <span class="hljs-number">6</span>, <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Bali"</span>, <span class="hljs-attr">"coordinate"</span>: {  <span class="hljs-attr">"lat"</span>: <span class="hljs-number">-8.3405</span>, <span class="hljs-attr">"lon"</span>: <span class="hljs-number">115.0920</span>} }
{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"cities"</span> } }
{ <span class="hljs-attr">"id"</span>: <span class="hljs-number">7</span>, <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Berlin"</span>, <span class="hljs-attr">"coordinate"</span>: {  <span class="hljs-attr">"lat"</span>: <span class="hljs-number">52.5200</span>, <span class="hljs-attr">"lon"</span>: <span class="hljs-number">13.4050</span>} }
{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"cities"</span> } }
{ <span class="hljs-attr">"id"</span>: <span class="hljs-number">8</span>, <span class="hljs-attr">"name"</span>: <span class="hljs-string">"San Fransisco"</span>, <span class="hljs-attr">"coordinate"</span>: {  <span class="hljs-attr">"lat"</span>: <span class="hljs-number">37.7749</span>, <span class="hljs-attr">"lon"</span>: <span class="hljs-number">-122.4194</span>} }
{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"cities"</span> } }
{ <span class="hljs-attr">"id"</span>: <span class="hljs-number">9</span>, <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Beijing"</span>, <span class="hljs-attr">"coordinate"</span>: {  <span class="hljs-attr">"lat"</span>: <span class="hljs-number">39.9042</span>, <span class="hljs-attr">"lon"</span>: <span class="hljs-number">116.4074</span>} }
</code></pre>
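<p>The payload might look odd because, taken as a whole, it isn't a single valid JSON document. Don't worry, that's by design: the bulk API expects newline-delimited JSON, where each line is its own JSON object (an action line followed by the document to index) and the body ends with a newline.</p>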
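<p>Writing those line pairs by hand gets tedious as the list grows. The sketch below shows one way to generate the bulk body from a list of dictionaries. The helper name <code>build_bulk_body</code> is an assumption made for this example, and only the first three cities are included to keep it short.</p>

```python
import json

# First three cities from the article, shortened for brevity.
CITIES = [
    {"id": 1, "name": "Jakarta", "coordinate": {"lat": -6.2008, "lon": 106.8456}},
    {"id": 2, "name": "Tokyo", "coordinate": {"lat": 35.6762, "lon": 139.6503}},
    {"id": 3, "name": "Hong Kong", "coordinate": {"lat": 22.3193, "lon": 114.1694}},
]


def build_bulk_body(index_name, documents):
    """Return a bulk body: one action line, then one document line, per document."""
    lines = []
    for document in documents:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(document))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body("cities", CITIES)
print(body)
```

<p>Each line of <code>body</code> parses as JSON on its own, even though the whole string does not. When sending it, a <code>Content-Type</code> of <code>application/x-ndjson</code> describes this format.</p>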
<p>Elasticsearch should then reply to the bulk request with a response similar to this:</p>
<pre><code>{
    <span class="hljs-string">"took"</span>: <span class="hljs-number">72</span>,
    <span class="hljs-string">"errors"</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-string">"items"</span>: [
        <span class="hljs-comment">// contains one item for each inserted document</span>
        ...
    ]
}
</code></pre><h2 id="heading-how-to-query-your-elasticsearch-documents">How to Query Your Elasticsearch Documents</h2>
<p><img src="https://www.freecodecamp.org/news/content/images/2021/04/image-276.png" alt="Image" width="600" height="400" loading="lazy">
<em>Photo by <a target="_blank" href="https://unsplash.com/@chrislawton?utm_source=ghost&amp;utm_medium=referral&amp;utm_campaign=api-credit">Chris Lawton</a> / <a target="_blank" href="https://unsplash.com/?utm_source=ghost&amp;utm_medium=referral&amp;utm_campaign=api-credit">Unsplash</a></em></p>
<p>Now comes the interesting part. We are going to do some querying with the documents that we inserted before.</p>
<p>Elasticsearch supports many types of queries, including geolocation queries, which we will play around with today.</p>
<p>We can start searching for our cities with a request like this:</p>
<p><code>POST http://localhost:9200/cities/_search</code></p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"query"</span>: {
    <span class="hljs-attr">"bool"</span>: {
      <span class="hljs-attr">"filter"</span>: {
        <span class="hljs-attr">"geo_distance"</span>: {
          <span class="hljs-attr">"distance"</span>: <span class="hljs-string">"10km"</span>,
          <span class="hljs-attr">"coordinate"</span>: {
            <span class="hljs-attr">"lat"</span>: <span class="hljs-number">37.76</span>,
            <span class="hljs-attr">"lon"</span>: <span class="hljs-number">-122.42</span>
          }
        }
      }
    }
  }
}
</code></pre>
<p>That query should return San Francisco, since its coordinates (37.7749, -122.4194) are within 10km of the point we searched from (coordinates courtesy of Google).</p>
<pre><code>{
    <span class="hljs-string">"took"</span>: <span class="hljs-number">7</span>,
    <span class="hljs-string">"timed_out"</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-string">"_shards"</span>: {
        <span class="hljs-string">"total"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-string">"successful"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-string">"skipped"</span>: <span class="hljs-number">0</span>,
        <span class="hljs-string">"failed"</span>: <span class="hljs-number">0</span>
    },
    <span class="hljs-string">"hits"</span>: {
        <span class="hljs-string">"total"</span>: {
            <span class="hljs-string">"value"</span>: <span class="hljs-number">1</span>,
            <span class="hljs-string">"relation"</span>: <span class="hljs-string">"eq"</span>
        },
        <span class="hljs-string">"max_score"</span>: <span class="hljs-number">0.0</span>,
        <span class="hljs-string">"hits"</span>: [
            {
                <span class="hljs-string">"_index"</span>: <span class="hljs-string">"cities"</span>,
                <span class="hljs-string">"_type"</span>: <span class="hljs-string">"_doc"</span>,
                <span class="hljs-string">"_id"</span>: <span class="hljs-string">"eKPspHYBivyIhfWHb2vl"</span>,
                <span class="hljs-string">"_score"</span>: <span class="hljs-number">0.0</span>,
                <span class="hljs-string">"_source"</span>: {
                    <span class="hljs-string">"id"</span>: <span class="hljs-number">8</span>,
                    <span class="hljs-string">"name"</span>: <span class="hljs-string">"San Fransisco"</span>,
                    <span class="hljs-string">"coordinate"</span>: {
                        <span class="hljs-string">"lat"</span>: <span class="hljs-number">37.7749</span>,
                        <span class="hljs-string">"lon"</span>: <span class="hljs-number">-122.4194</span>
                    }
                }
            }
        ]
    }
}
</code></pre><p>Congratulations! Now you have your own location search engine.<br>But let's experiment a bit more. Let's say you want to get more cities around that location.</p>
<p>Let's try to expand the distance to 4500km by changing the payload:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"query"</span>: {
    <span class="hljs-attr">"bool"</span>: {
      <span class="hljs-attr">"filter"</span>: {
        <span class="hljs-attr">"geo_distance"</span>: {
          <span class="hljs-attr">"distance"</span>: <span class="hljs-string">"4500km"</span>,
          <span class="hljs-attr">"coordinate"</span>: {
            <span class="hljs-attr">"lat"</span>: <span class="hljs-number">37.76</span>,
            <span class="hljs-attr">"lon"</span>: <span class="hljs-number">-122.42</span>
          }
        }
      }
    }
  }
}
</code></pre>
<p>And you should get this response:</p>
<pre><code>{
    <span class="hljs-string">"took"</span>: <span class="hljs-number">8</span>,
    <span class="hljs-string">"timed_out"</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-string">"_shards"</span>: {
        <span class="hljs-string">"total"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-string">"successful"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-string">"skipped"</span>: <span class="hljs-number">0</span>,
        <span class="hljs-string">"failed"</span>: <span class="hljs-number">0</span>
    },
    <span class="hljs-string">"hits"</span>: {
        <span class="hljs-string">"total"</span>: {
            <span class="hljs-string">"value"</span>: <span class="hljs-number">2</span>,
            <span class="hljs-string">"relation"</span>: <span class="hljs-string">"eq"</span>
        },
        <span class="hljs-string">"max_score"</span>: <span class="hljs-number">0.0</span>,
        <span class="hljs-string">"hits"</span>: [
            {
                <span class="hljs-string">"_index"</span>: <span class="hljs-string">"cities"</span>,
                <span class="hljs-string">"_type"</span>: <span class="hljs-string">"_doc"</span>,
                <span class="hljs-string">"_id"</span>: <span class="hljs-string">"dKPspHYBivyIhfWHb2vl"</span>,
                <span class="hljs-string">"_score"</span>: <span class="hljs-number">0.0</span>,
                <span class="hljs-string">"_source"</span>: {
                    <span class="hljs-string">"id"</span>: <span class="hljs-number">4</span>,
                    <span class="hljs-string">"name"</span>: <span class="hljs-string">"New York"</span>,
                    <span class="hljs-string">"coordinate"</span>: {
                        <span class="hljs-string">"lat"</span>: <span class="hljs-number">40.7128</span>,
                        <span class="hljs-string">"lon"</span>: <span class="hljs-number">-74.0060</span>
                    }
                }
            },
            {
                <span class="hljs-string">"_index"</span>: <span class="hljs-string">"cities"</span>,
                <span class="hljs-string">"_type"</span>: <span class="hljs-string">"_doc"</span>,
                <span class="hljs-string">"_id"</span>: <span class="hljs-string">"eKPspHYBivyIhfWHb2vl"</span>,
                <span class="hljs-string">"_score"</span>: <span class="hljs-number">0.0</span>,
                <span class="hljs-string">"_source"</span>: {
                    <span class="hljs-string">"id"</span>: <span class="hljs-number">8</span>,
                    <span class="hljs-string">"name"</span>: <span class="hljs-string">"San Fransisco"</span>,
                    <span class="hljs-string">"coordinate"</span>: {
                        <span class="hljs-string">"lat"</span>: <span class="hljs-number">37.7749</span>,
                        <span class="hljs-string">"lon"</span>: <span class="hljs-number">-122.4194</span>
                    }
                }
            }
        ]
    }
}
</code></pre><p>It gives two results: New York and San Francisco. The results look correct, but the order might seem a bit odd. Shouldn't San Francisco come first, since it's closer?</p>
<p>Well, not exactly. Our query only filters: it keeps the cities inside the radius and doesn't care which one is closest to you.</p>
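<p>To make the difference between filtering and ordering concrete, here is a small sketch in plain Python with no Elasticsearch involved. It uses the haversine formula with a mean Earth radius, which is an approximation and not necessarily the exact computation Elasticsearch performs. The function names and the four-city subset are chosen only for illustration.</p>

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

# A subset of the article's cities as (lat, lon) pairs.
CITIES = {
    "New York": (40.7128, -74.0060),
    "Paris": (48.8566, 2.3522),
    "San Francisco": (37.7749, -122.4194),
    "Tokyo": (35.6762, 139.6503),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def within(origin, radius_km):
    """Filter only: keep matching cities in the order they are stored."""
    return [name for name, point in CITIES.items()
            if haversine_km(origin, point) <= radius_km]


def nearest_first(origin, radius_km):
    """Filter, then sort by distance as a separate, explicit step."""
    return sorted(within(origin, radius_km),
                  key=lambda name: haversine_km(origin, CITIES[name]))


origin = (37.76, -122.42)
print(within(origin, 10))
print(within(origin, 4500))
print(nearest_first(origin, 4500))
```

<p>The filter answers a yes-or-no question for each city, so New York can appear before San Francisco. Ordering by distance is a separate step, which is the gap that scoring fills.</p>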
<p>But what if we want to do some calculation to show which locations might be the nearest? Don't worry, Elasticsearch can do that too. We can use a type of query called a function score query.</p>
<h3 id="heading-how-to-use-a-function-score-query-in-elasticsearch">How to Use a Function Score Query in Elasticsearch</h3>
<p>Elasticsearch computes a relevance score for each matching document and uses it to rank the results. By using <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html">function score queries</a> we can modify that score, which changes how documents are ranked and, together with <code>min_score</code>, which ones are returned.</p>
<p>Here, we will be using the <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#function-decay">decay functions</a>. There are three kinds of decay functions: exp, linear, and gauss. Each of them behaves differently.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2021/01/decay_2d.png" alt="Image" width="600" height="400" loading="lazy">
<em>Image <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#function-decay">source</a></em></p>
<p>The one we will use here is the linear decay function. We will also specify the origin coordinates together with an offset and a scale.</p>
<p><code>POST http://localhost:9200/cities/_search</code></p>
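<p>To see what offset and scale do, it helps to work through the arithmetic. The sketch below follows the linear decay formula from the function score documentation, with the default <code>decay</code> of 0.5, and plugs in the values used in this section's query: an origin of <code>37, -122</code>, an offset of 100km, and a scale of 2500km. Distances come from a haversine approximation, so the numbers may differ slightly from what Elasticsearch reports, and the helper names are illustrative.</p>

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def linear_decay(distance_km, offset_km, scale_km, decay=0.5):
    """Linear decay: 1.0 inside the offset, then a straight line down to 0."""
    # s is chosen so that the score equals `decay` at offset + scale.
    s = scale_km / (1.0 - decay)
    return max((s - max(0.0, distance_km - offset_km)) / s, 0.0)


origin = (37.0, -122.0)
cities = {
    "San Francisco": (37.7749, -122.4194),
    "New York": (40.7128, -74.0060),
}
for name, point in cities.items():
    distance = haversine_km(origin, point)
    score = linear_decay(distance, offset_km=100, scale_km=2500)
    print(f"{name}: {distance:.0f} km, score {score:.3f}")
```

<p>San Francisco lies inside the 100km offset, so it keeps the full score of 1.0, while New York, a little over 4,100km away, drops to roughly 0.195. With these settings the score falls below a <code>min_score</code> of 0.1 for anything more than about 4,600km from the origin, so those cities are left out of the results.</p>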
<pre><code>{
  <span class="hljs-string">"query"</span>: {
    <span class="hljs-string">"function_score"</span>: {
      <span class="hljs-string">"functions"</span>: [
        {
          <span class="hljs-string">"linear"</span>: {
            <span class="hljs-string">"coordinate"</span>: {
              <span class="hljs-string">"origin"</span>: <span class="hljs-string">"37, -122"</span>,
              <span class="hljs-string">"offset"</span>: <span class="hljs-string">"100km"</span>,
              <span class="hljs-string">"scale"</span>:<span class="hljs-string">"2500km"</span>
            }
          }
        }
      ],
       <span class="hljs-string">"min_score"</span>:<span class="hljs-string">"0.1"</span>
    }
  }
}
</code></pre><p>Now, we should get our results ordered by the highest score.</p>
<pre><code>{
    <span class="hljs-string">"took"</span>: <span class="hljs-number">32</span>,
    <span class="hljs-string">"timed_out"</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-string">"_shards"</span>: {
        <span class="hljs-string">"total"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-string">"successful"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-string">"skipped"</span>: <span class="hljs-number">0</span>,
        <span class="hljs-string">"failed"</span>: <span class="hljs-number">0</span>
    },
    <span class="hljs-string">"hits"</span>: {
        <span class="hljs-string">"total"</span>: {
            <span class="hljs-string">"value"</span>: <span class="hljs-number">2</span>,
            <span class="hljs-string">"relation"</span>: <span class="hljs-string">"eq"</span>
        },
        <span class="hljs-string">"max_score"</span>: <span class="hljs-number">1.0</span>,
        <span class="hljs-string">"hits"</span>: [
            {
                <span class="hljs-string">"_index"</span>: <span class="hljs-string">"cities"</span>,
                <span class="hljs-string">"_type"</span>: <span class="hljs-string">"_doc"</span>,
                <span class="hljs-string">"_id"</span>: <span class="hljs-string">"eKPspHYBivyIhfWHb2vl"</span>,
                <span class="hljs-string">"_score"</span>: <span class="hljs-number">1.0</span>,
                <span class="hljs-string">"_source"</span>: {
                    <span class="hljs-string">"id"</span>: <span class="hljs-number">8</span>,
                    <span class="hljs-string">"name"</span>: <span class="hljs-string">"San Fransisco"</span>,
                    <span class="hljs-string">"coordinate"</span>: {
                        <span class="hljs-string">"lat"</span>: <span class="hljs-number">37.7749</span>,
                        <span class="hljs-string">"lon"</span>: <span class="hljs-number">-122.4194</span>
                    }
                }
            },
            {
                <span class="hljs-string">"_index"</span>: <span class="hljs-string">"cities"</span>,
                <span class="hljs-string">"_type"</span>: <span class="hljs-string">"_doc"</span>,
                <span class="hljs-string">"_id"</span>: <span class="hljs-string">"dKPspHYBivyIhfWHb2vl"</span>,
                <span class="hljs-string">"_score"</span>: <span class="hljs-number">0.19508117</span>,
                <span class="hljs-string">"_source"</span>: {
                    <span class="hljs-string">"id"</span>: <span class="hljs-number">4</span>,
                    <span class="hljs-string">"name"</span>: <span class="hljs-string">"New York"</span>,
                    <span class="hljs-string">"coordinate"</span>: {
                        <span class="hljs-string">"lat"</span>: <span class="hljs-number">40.7128</span>,
                        <span class="hljs-string">"lon"</span>: <span class="hljs-number">-74.0060</span>
                    }
                }
            }
        ]
    }
}
</code></pre><p>And that wraps it up!</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this article, we've covered how to implement location-based search with Elasticsearch. But this is not the end – what I have shown here is just the surface of what you can do. </p>
<p>I hope you found this article interesting and useful. If so, keep learning more about it and try to experiment with combining function scoring. It will be fun, I promise.</p>
<blockquote>
<p>Always be curious and you will learn something new.</p>
</blockquote>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to implement Change Data Capture using Kafka Streams ]]>
                </title>
                <description>
                    <![CDATA[ By Luca Florio Change Data Capture (CDC) involves observing the changes happening in a database and making them available in a form that can be exploited by other systems.  One of the most interesting use-cases is to make them available as a stream o... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-implement-the-change-data-capture-pattern-using-kafka-streams/</link>
                <guid isPermaLink="false">66d45e444a7504b7409c3398</guid>
                
                    <category>
                        <![CDATA[ Apache Kafka ]]>
                    </category>
                
                    <category>
                        <![CDATA[ cdc ]]>
                    </category>
                
                    <category>
                        <![CDATA[ change data capture ]]>
                    </category>
                
                    <category>
                        <![CDATA[ elasticsearch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Kafka streams ]]>
                    </category>
                
                    <category>
                        <![CDATA[ MongoDB ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Scala ]]>
                    </category>
                
                    <category>
                        <![CDATA[ stream processing ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Mon, 20 Jan 2020 21:10:30 +0000</pubDate>
                <media:content url="https://cdn-media-2.freecodecamp.org/w1280/5f9c9dac740569d1a4ca3900.jpg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Luca Florio</p>
<p><strong>Change Data Capture</strong> (CDC) involves observing the changes happening in a database and making them available in a form that can be exploited by other systems. </p>
<p>One of the most interesting use-cases is to make them available as a stream of events. This means you can, for example, catch the events and update a search index as the data are written to the database.</p>
<p>Interesting, right? Let's see how to implement a CDC system that can observe the changes made to a NoSQL database (<strong>MongoDB</strong>), stream them through a message broker (<strong>Kafka</strong>), process the messages of the stream (<strong>Kafka Streams</strong>), and update a search index (<strong>Elasticsearch</strong>)!</p>
<h2 id="heading-tldr">TL;DR</h2>
<p>The full code of the project is available on GitHub in this <a target="_blank" href="https://github.com/elleFlorio/kafka-streams-playground">repository</a>. If you want to skip all my jibber jabber and just run the example, go straight to the 
<strong>How to run the project</strong> section near the end of the article!</p>
<h1 id="heading-use-case-amp-infrastructure">Use case &amp; infrastructure</h1>
<p>We run a web application that stores photos uploaded by users. People can share their shots, let others download them, create albums, and so on. Users can also provide a description of their photos, as well as Exif metadata and other useful information. </p>
<p>We want to store such information and use it to improve our search engine. We will focus on this part of our system that is depicted in the following diagram.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/12/kafka-stream-playground.png" alt="Image" width="600" height="400" loading="lazy">
<em>Photo information storage architecture</em></p>
<p>The information is provided in <code>JSON</code> format. Since I like to post my shots on <a target="_blank" href="https://unsplash.com/">Unsplash</a>, and the website provides free access to its API, I used their model for the photo <code>JSON</code> document.</p>
<p>Once the <code>JSON</code> is sent through a <code>POST</code> request to our server, we store the document inside a <strong>MongoDB</strong> database. We will also store it in <strong>Elasticsearch</strong> for indexing and quick search. </p>
<p>However, we love <strong>long exposure shots</strong>, and we would like to store in a separate index a subset of information regarding this kind of photo. It can be the exposure time, as well as the location (latitude and longitude) where the photo has been taken. In this way, we can create a map of locations where photographers usually take long exposure photos.</p>
<p>Here comes the interesting part: instead of explicitly calling Elasticsearch in our code once the photo info is stored in MongoDB, we can implement a <strong>CDC</strong> exploiting Kafka and <strong>Kafka Streams</strong>. </p>
<p>We listen to modifications to the MongoDB <strong>oplog</strong> using the interface provided by MongoDB itself. When a photo is stored, we send it to a <code>photo</code> Kafka topic. Using <strong>Kafka Connect</strong>, an Elasticsearch sink is configured to save everything sent to that topic to a specific index. In this way, we can index all photos stored in MongoDB automatically.</p>
<p>We need to take care of the long exposure photos too. It requires some processing of the information to extract what we need. For this reason, we use Kafka Streams to create a <strong>processing topology</strong> to:</p>
<ol>
<li>Read from the <code>photo</code> topic</li>
<li>Extract Exif and location information</li>
<li>Filter long exposure photos (exposure time &gt; 1 sec.)</li>
<li>Write to a <code>long-exposure</code> topic.</li>
</ol>
<p>Then another Elasticsearch sink will read data from the <code>long-exposure</code> topic and write it to a specific index in Elasticsearch.</p>
<p>It is quite simple, but it's enough to have fun with CDC and Kafka Streams!</p>
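<p>If it helps to picture the four steps before we bring Kafka in, here is a minimal sketch of the same logic on a plain Scala list. The <code>SimplePhoto</code>, <code>Position</code>, and <code>LongExposure</code> classes are made up for this illustration (the real model follows the Unsplash <code>JSON</code>), and the exposure time is assumed to be already parsed into seconds.</p>
<pre><code class="lang-scala">// Simplified, hypothetical models: the real Photo model in the project is much richer.
final case class Position(latitude: Double, longitude: Double)
final case class SimplePhoto(id: String, exposureSeconds: Option[Float], position: Option[Position])
final case class LongExposure(id: String, exposureSeconds: Float, position: Position)

object TopologySketch {

  // Extract exposure time and position (dropping photos that miss either one),
  // then keep only the exposures longer than 1 second.
  def longExposures(photos: List[SimplePhoto]): List[LongExposure] =
    photos
      .flatMap { p =&gt;
        for {
          time &lt;- p.exposureSeconds
          pos  &lt;- p.position
        } yield LongExposure(p.id, time, pos)
      }
      .filter(_.exposureSeconds &gt; 1.0f)

  def main(args: Array[String]): Unit = {
    val photos = List(
      SimplePhoto("a", Some(30.0f), Some(Position(45.46, 9.19))), // long exposure with position
      SimplePhoto("b", Some(0.01f), Some(Position(41.9, 12.5))),  // too short
      SimplePhoto("c", Some(10.0f), None)                         // no position
    )
    longExposures(photos).foreach(println)
  }
}
</code></pre>
<p>In the real system the list is replaced by the <code>photo</code> topic and the result goes to the <code>long-exposure</code> topic, but the filtering idea is the same.</p>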
<h1 id="heading-server-implementation">Server implementation</h1>
<p>Let's have a look at what we need to implement: our server exposing the <strong>REST API</strong>s!</p>
<h3 id="heading-models-and-dao">Models and DAO</h3>
<p>First things first, we need a model of our data and a <strong>Data Access Object</strong> (DAO) to talk to our MongoDB database. </p>
<p>As I said, the model for the photo <code>JSON</code> information is the one used by Unsplash. Check out the free API <a target="_blank" href="https://unsplash.com/documentation#get-a-photo">documentation</a> for an example of the <code>JSON</code> we will use. </p>
<p>I created the mapping for the serialization/deserialization of the photo <code>JSON</code> using <a target="_blank" href="https://github.com/spray/spray-json">spray-json</a>. I'll skip the details about this. If you are curious, just look at the <a target="_blank" href="https://github.com/elleFlorio/kafka-streams-playground/tree/master/src/main/scala/com/elleflorio/kafka/streams/playground/dao/model/unsplash">repo</a>!</p>
<p>Let's focus on the model for the long exposure photo.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">case</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">LongExposurePhoto</span>(<span class="hljs-params">id: <span class="hljs-type">String</span>, exposureTime: <span class="hljs-type">Float</span>, createdAt: <span class="hljs-type">Date</span>, location: <span class="hljs-type">Location</span></span>)</span>

<span class="hljs-class"><span class="hljs-keyword">object</span> <span class="hljs-title">LongExposurePhotoJsonProtocol</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">DefaultJsonProtocol</span> </span>{
  <span class="hljs-keyword">implicit</span> <span class="hljs-keyword">val</span> longExposurePhotoFormat:<span class="hljs-type">RootJsonFormat</span>[<span class="hljs-type">LongExposurePhoto</span>] = jsonFormat(<span class="hljs-type">LongExposurePhoto</span>, <span class="hljs-string">"id"</span>, <span class="hljs-string">"exposure_time"</span>, <span class="hljs-string">"created_at"</span>, <span class="hljs-string">"location"</span>)
}
</code></pre>
<p>This is quite simple: we keep from the photo <code>JSON</code> the information about the <code>id</code>, the exposure time (<code>exposureTime</code>), when the photo was created (<code>createdAt</code>), and the <code>location</code> where it was taken. The <code>location</code> comprises the <code>city</code>, the <code>country</code>, and the <code>position</code> composed of <code>latitude</code> and <code>longitude</code>.</p>
<p>The DAO consists of just the <code>PhotoDao.scala</code> class. </p>
<pre><code class="lang-scala"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">PhotoDao</span>(<span class="hljs-params">database: <span class="hljs-type">MongoDatabase</span>, photoCollection: <span class="hljs-type">String</span></span>) </span>{

  <span class="hljs-keyword">val</span> collection: <span class="hljs-type">MongoCollection</span>[<span class="hljs-type">Document</span>] = database.getCollection(photoCollection)

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">createPhoto</span></span>(photo: <span class="hljs-type">Photo</span>): <span class="hljs-type">Future</span>[<span class="hljs-type">String</span>] = {
    <span class="hljs-keyword">val</span> doc = <span class="hljs-type">Document</span>(photo.toJson.toString())
    doc.put(<span class="hljs-string">"_id"</span>, photo.id)
    collection.insertOne(doc).toFuture()
      .map(_ =&gt; photo.id)
  }
}
</code></pre>
<p>Since I want to keep this example minimal and focused on the CDC implementation, the DAO has just one method to create a new photo document in MongoDB. </p>
<p>It is straightforward: create a document from the photo <code>JSON</code>, and insert it in MongoDB using the <code>id</code> of the photo itself as the <code>_id</code> of the document. Then, we can return the <code>id</code> of the photo just inserted in a <code>Future</code> (the MongoDB API is async).</p>
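<p>The pattern of mapping the result of the insert to the photo <code>id</code> can be tried without a database. In this sketch <code>fakeInsert</code> is a hypothetical stand-in for <code>collection.insertOne(doc).toFuture()</code>. Note that <code>map</code> on a <code>Future</code> needs an implicit <code>ExecutionContext</code> in scope.</p>
<pre><code class="lang-scala">import scala.concurrent.{ExecutionContext, Future}

object FutureIdSketch {

  // Hypothetical stand-in for the async MongoDB insert: it always succeeds.
  def fakeInsert(id: String)(implicit ec: ExecutionContext): Future[Unit] =
    Future(())

  // Same shape as createPhoto: wait for the insert, then hand back the id.
  def createPhoto(id: String)(implicit ec: ExecutionContext): Future[String] =
    fakeInsert(id).map(_ =&gt; id)
}
</code></pre>
<p>With <code>ExecutionContext.global</code> passed in, the <code>Future</code> returned by <code>FutureIdSketch.createPhoto("abc")</code> completes with <code>"abc"</code>. If the insert fails, the failure is propagated and the <code>map</code> is never applied.</p>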
<h3 id="heading-kafka-producer">Kafka Producer</h3>
<p>Once the photo is stored inside MongoDB, we have to send it to the <code>photo</code> Kafka topic. This means we need a producer to write the message in its topic. The <code>PhotoProducer.scala</code> class looks like this.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">case</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">PhotoProducer</span>(<span class="hljs-params">props: <span class="hljs-type">Properties</span>, topic: <span class="hljs-type">String</span></span>) </span>{

  createKafkaTopic(props, topic)
  <span class="hljs-keyword">val</span> photoProducer = <span class="hljs-keyword">new</span> <span class="hljs-type">KafkaProducer</span>[<span class="hljs-type">String</span>, <span class="hljs-type">String</span>](props)

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sendPhoto</span></span>(photo: <span class="hljs-type">Photo</span>): <span class="hljs-type">Future</span>[<span class="hljs-type">RecordMetadata</span>] = {
    <span class="hljs-keyword">val</span> record = <span class="hljs-keyword">new</span> <span class="hljs-type">ProducerRecord</span>[<span class="hljs-type">String</span>, <span class="hljs-type">String</span>](topic, photo.id, photo.toJson.compactPrint)
    photoProducer.send(record)
  }

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">closePhotoProducer</span></span>(): <span class="hljs-type">Unit</span> = photoProducer.close()
}
</code></pre>
<p>I would say that this is pretty self-explanatory. The most interesting part is probably the <code>createKafkaTopic</code> method that is implemented in the <code>utils</code> package.</p>
<pre><code class="lang-scala"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">createKafkaTopic</span></span>(props: <span class="hljs-type">Properties</span>, topic: <span class="hljs-type">String</span>): <span class="hljs-type">Unit</span> = {
    <span class="hljs-keyword">val</span> adminClient = <span class="hljs-type">AdminClient</span>.create(props)
    <span class="hljs-keyword">val</span> photoTopic = <span class="hljs-keyword">new</span> <span class="hljs-type">NewTopic</span>(topic, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>)
    adminClient.createTopics(<span class="hljs-type">List</span>(photoTopic).asJava)
  }
</code></pre>
<p>This method creates the topic in Kafka with 1 partition and a replication factor of 1 (it is enough for this example). It is not required, but creating the topic in advance lets Kafka balance partitions, select leaders, and so on. This will be useful to get our stream topology ready to process as we start our server.</p>
<h3 id="heading-event-listener">Event Listener</h3>
<p>We have the DAO that writes to MongoDB and the producer that sends the message to Kafka. We need to glue them together in some way so that when the document is stored in MongoDB the message is sent to the <code>photo</code> topic. This is the purpose of the <code>PhotoListener.scala</code> class.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">case</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">PhotoListener</span>(<span class="hljs-params">collection: <span class="hljs-type">MongoCollection</span>[<span class="hljs-type">Document</span>], producer: <span class="hljs-type">PhotoProducer</span></span>) </span>{

  <span class="hljs-keyword">val</span> cursor: <span class="hljs-type">ChangeStreamObservable</span>[<span class="hljs-type">Document</span>] = collection.watch()

  cursor.subscribe(<span class="hljs-keyword">new</span> <span class="hljs-type">Observer</span>[<span class="hljs-type">ChangeStreamDocument</span>[<span class="hljs-type">Document</span>]] {
    <span class="hljs-keyword">override</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">onNext</span></span>(result: <span class="hljs-type">ChangeStreamDocument</span>[<span class="hljs-type">Document</span>]): <span class="hljs-type">Unit</span> = {
      result.getOperationType <span class="hljs-keyword">match</span> {
        <span class="hljs-keyword">case</span> <span class="hljs-type">OperationType</span>.<span class="hljs-type">INSERT</span> =&gt; {
          <span class="hljs-keyword">val</span> photo = result.getFullDocument.toJson().parseJson.convertTo[<span class="hljs-type">Photo</span>]
          producer.sendPhoto(photo).get()
          println(<span class="hljs-string">s"Sent photo with Id <span class="hljs-subst">${photo.id}</span>"</span>)
        }
        <span class="hljs-keyword">case</span> _ =&gt; println(<span class="hljs-string">s"Operation <span class="hljs-subst">${result.getOperationType}</span> not supported"</span>)
      }
    }
    <span class="hljs-keyword">override</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">onError</span></span>(e: <span class="hljs-type">Throwable</span>): <span class="hljs-type">Unit</span> = println(<span class="hljs-string">s"onError: <span class="hljs-subst">$e</span>"</span>)
    <span class="hljs-keyword">override</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">onComplete</span></span>(): <span class="hljs-type">Unit</span> = println(<span class="hljs-string">"onComplete"</span>)})
}
</code></pre>
<p>We exploit the <a target="_blank" href="https://docs.mongodb.com/manual/changeStreams/">Change Streams interface</a> provided by the MongoDB Scala library. </p>
<p>Here is how it works: we <code>watch()</code> the collection where photos are stored. When there is a new event (<code>onNext</code>) we run our logic. </p>
<p>For this example we are interested only in the creation of new documents, so we explicitly check that the operation is of type <code>OperationType.INSERT</code>. If the operation is the one we are interested in, we get the document and convert it to a <code>Photo</code> object to be sent by our producer. </p>
<p>That's it! With a few lines of code we connected the creation of documents in MongoDB to a stream of events in Kafka.</p>
<p>As a side note, be aware that to use the Change Streams interface <strong>we have to set up a MongoDB replica set</strong>. In this project we run 3 instances of MongoDB and configure them to act as a replica set using the following command in the mongo client:</p>
<pre><code class="lang-shell">rs.initiate({_id : "r0", members: [{ _id : 0, host : "mongo1:27017", priority : 1 },{ _id : 1, host :"mongo2:27017", priority : 0 },{ _id : 2, host : "mongo3:27017", priority : 0, arbiterOnly: true }]})
</code></pre>
<p>Here our instances are the containers we will run in the docker-compose file, that is <code>mongo1</code>, <code>mongo2</code>, and <code>mongo3</code>.</p>
<h3 id="heading-processing-topology">Processing Topology</h3>
<p>Time to build our processing topology! It will be in charge of the creation of the <code>long-exposure</code> index in Elasticsearch. The topology is described by the following diagram:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/01/processing-topology.png" alt="Image" width="600" height="400" loading="lazy">
<em>Processing topology</em></p>
<p>and it is implemented in the <code>LongExposureTopology.scala</code> object class.
Let's analyse every step of our processing topology.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">val</span> stringSerde = <span class="hljs-keyword">new</span> <span class="hljs-type">StringSerde</span>

<span class="hljs-keyword">val</span> streamsBuilder = <span class="hljs-keyword">new</span> <span class="hljs-type">StreamsBuilder</span>()

<span class="hljs-keyword">val</span> photoSource: <span class="hljs-type">KStream</span>[<span class="hljs-type">String</span>, <span class="hljs-type">String</span>] = streamsBuilder.stream(sourceTopic, <span class="hljs-type">Consumed</span>.`<span class="hljs-keyword">with</span>`(stringSerde, stringSerde))
</code></pre>
<p>The first step is to read from a source topic. We start a stream from the <code>sourceTopic</code> (that is, the <code>photo</code> topic) using the <code>StreamsBuilder()</code> object. The <code>stringSerde</code> object is used to serialise and deserialise the content of the topic as a <code>String</code>. </p>
<p>Please notice that at each step of the processing we create a new stream of data with a <code>KStream</code> object. When creating the stream, we specify the key and the value produced by the stream. In our topology the key will always be a <code>String</code>. In this step the value produced is still a <code>String</code>.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">val</span> covertToPhotoObject: <span class="hljs-type">KStream</span>[<span class="hljs-type">String</span>, <span class="hljs-type">Photo</span>] =
      photoSource.mapValues((_, jsonString) =&gt; {
        <span class="hljs-keyword">val</span> photo = jsonString.parseJson.convertTo[<span class="hljs-type">Photo</span>]
        println(<span class="hljs-string">s"Processing photo <span class="hljs-subst">${photo.id}</span>"</span>)
        photo
      })
</code></pre>
<p>The next step is to convert the value extracted from the <code>photo</code> topic into a proper <code>Photo</code> object. </p>
<p>So we start from the <code>photoSource</code> stream and work on the values using the <code>mapValues</code> function. We simply parse the value as <code>JSON</code> and create the <code>Photo</code> object that will be sent in the <code>covertToPhotoObject</code> stream.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">val</span> filterWithLocation: <span class="hljs-type">KStream</span>[<span class="hljs-type">String</span>, <span class="hljs-type">Photo</span>] = covertToPhotoObject.filter((_, photo) =&gt; photo.location.exists(_.position.isDefined))
</code></pre>
<p>There is no guarantee that the photo we are processing will have the info about the location, but we want it in our long exposure object. This step of the topology filters out from the <code>covertToPhotoObject</code> stream the photos that have no info about the location, and creates the <code>filterWithLocation</code> stream.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">val</span> filterWithExposureTime: <span class="hljs-type">KStream</span>[<span class="hljs-type">String</span>, <span class="hljs-type">Photo</span>] = filterWithLocation.filter((_, photo) =&gt; photo.exif.exists(_.exposureTime.isDefined))
</code></pre>
<p>Another important piece of information for our processing is the exposure time of the photo. For this reason, we filter out from the <code>filterWithLocation</code> stream the photos without exposure time info, creating the <code>filterWithExposureTime</code> stream.</p>
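<p>Both filters rely on <code>Option.exists</code>, which is <code>false</code> when the outer value is missing and otherwise applies the predicate to the inner value. A minimal sketch with made-up <code>Exif</code> and <code>PhotoStub</code> classes (not the real model) makes the three possible cases explicit:</p>
<pre><code class="lang-scala">object ExistsSketch {

  // Hypothetical, trimmed-down stand-ins for the Exif part of the photo model.
  final case class Exif(exposureTime: Option[String])
  final case class PhotoStub(exif: Option[Exif])

  // Same shape as the predicate used in the topology.
  def hasExposureTime(photo: PhotoStub): Boolean =
    photo.exif.exists(_.exposureTime.isDefined)

  val noExif: PhotoStub        = PhotoStub(None)                       // no Exif at all
  val exifWithoutTime: PhotoStub = PhotoStub(Some(Exif(None)))          // Exif, but no exposure time
  val exifWithTime: PhotoStub  = PhotoStub(Some(Exif(Some("1/100"))))  // Exif with exposure time
}
</code></pre>
<p>Only <code>exifWithTime</code> passes the predicate. The other two are dropped, which is what makes the <code>.get</code> calls in the following step safe.</p>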
<pre><code class="lang-scala"><span class="hljs-keyword">val</span> dataExtractor: <span class="hljs-type">KStream</span>[<span class="hljs-type">String</span>, <span class="hljs-type">LongExposurePhoto</span>] =
      filterWithExposureTime.mapValues((_, photo) =&gt; <span class="hljs-type">LongExposurePhoto</span>(photo.id, parseExposureTime(photo.exif.get.exposureTime.get), photo.createdAt, photo.location.get))
</code></pre>
<p>We now have all we need to create a <code>LongExposurePhoto</code> object! That is the result of the <code>dataExtractor</code>: it takes the <code>Photo</code> coming from the <code>filterWithExposureTime</code> stream and produces a new stream containing <code>LongExposurePhoto</code>.</p>
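<p>The <code>parseExposureTime</code> helper is not shown in this article (you can find it in the repo). To give an idea of what such a helper might look like, here is a minimal sketch of my own, <strong>not</strong> the code from the project. It assumes the exposure time arrives as a string holding either a plain number of seconds or a fraction such as <code>1/100</code>, and the name <code>parseExposureTimeOpt</code> is hypothetical.</p>
<pre><code class="lang-scala">import scala.util.Try

object ExposureTimeSketch {

  // Hypothetical helper: accepts "30", "0.5" or "1/100"; anything else yields None.
  def parseExposureTimeOpt(raw: String): Option[Float] =
    raw.trim.split("/", -1) match {
      // Plain number of seconds.
      case Array(value) =&gt; Try(value.trim.toFloat).toOption
      // Fraction: numerator / denominator, rejecting a zero denominator.
      case Array(num, den) =&gt;
        for {
          n &lt;- Try(num.trim.toFloat).toOption
          d &lt;- Try(den.trim.toFloat).toOption
          if d != 0f
        } yield n / d
      // More than one slash: not something we know how to read.
      case _ =&gt; None
    }
}
</code></pre>
<p>Returning an <code>Option</code> instead of a bare <code>Float</code> is a design choice of this sketch: a malformed value can then be filtered out in the topology instead of throwing an exception in the middle of the stream.</p>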
<pre><code class="lang-scala"><span class="hljs-keyword">val</span> longExposureFilter: <span class="hljs-type">KStream</span>[<span class="hljs-type">String</span>, <span class="hljs-type">String</span>] =
      dataExtractor.filter((_, item) =&gt; item.exposureTime &gt; <span class="hljs-number">1.0</span>).mapValues((_, longExposurePhoto) =&gt; {
        <span class="hljs-keyword">val</span> jsonString = longExposurePhoto.toJson.compactPrint
        println(<span class="hljs-string">s"completed processing: <span class="hljs-subst">$jsonString</span>"</span>)
        jsonString
      })
</code></pre>
<p>We are almost there. We now have to keep only the photos with a long exposure time (which we decided is more than 1 sec.). So we create a new <code>longExposureFilter</code> stream without the photos that are not long exposure. </p>
<p>This time we also serialise each <code>LongExposurePhoto</code> into the corresponding <code>JSON</code> string, which will be written to the sink topic in the next step and, from there, to Elasticsearch.</p>
<pre><code class="lang-scala">longExposureFilter.to(sinkTopic, <span class="hljs-type">Produced</span>.`<span class="hljs-keyword">with</span>`(stringSerde, stringSerde))

streamsBuilder.build()
</code></pre>
<p>This is the last step of our topology. We write the content of the <code>longExposureFilter</code> stream <code>to</code> our <code>sinkTopic</code> (that is, the <code>long-exposure</code> topic), using the string serialiser/deserialiser for both key and value.
The last command simply <code>build</code>s the topology we just created.</p>
<p>Now that we have our topology, we can use it in our server. The <code>PhotoStreamProcessor.scala</code> class is what manages the processing.</p>
<pre><code class="lang-scala"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">PhotoStreamProcessor</span>(<span class="hljs-params">kafkaProps: <span class="hljs-type">Properties</span>, streamProps: <span class="hljs-type">Properties</span>, sourceTopic: <span class="hljs-type">String</span>, sinkTopic: <span class="hljs-type">String</span></span>) </span>{

  createKafkaTopic(kafkaProps, sinkTopic)
  <span class="hljs-keyword">val</span> topology: <span class="hljs-type">Topology</span> = <span class="hljs-type">LongExposureTopology</span>.build(sourceTopic, sinkTopic)
  <span class="hljs-keyword">val</span> streams: <span class="hljs-type">KafkaStreams</span> = <span class="hljs-keyword">new</span> <span class="hljs-type">KafkaStreams</span>(topology, streamProps)

  sys.<span class="hljs-type">ShutdownHookThread</span> {
    streams.close(java.time.<span class="hljs-type">Duration</span>.ofSeconds(<span class="hljs-number">10</span>))
  }

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">start</span></span>(): <span class="hljs-type">Unit</span> = <span class="hljs-keyword">new</span> <span class="hljs-type">Thread</span> {
    <span class="hljs-keyword">override</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">run</span></span>(): <span class="hljs-type">Unit</span> = {
      streams.cleanUp()
      streams.start()
      println(<span class="hljs-string">"Started long exposure processor"</span>)
    }
  }.start()
}
</code></pre>
<p>First we create the <code>sinkTopic</code>, using the same utility method we saw before. Then we build the stream topology and initialize a <code>KafkaStreams</code> object with that topology.</p>
<p>To start the stream processing, we create a dedicated <code>Thread</code> that will run the streaming while the server is alive. Before starting, we call <code>cleanUp()</code>, which deletes the local state directory of the application. This gives us a clean slate on every run of the example, but keep in mind that a stateful application would have to rebuild its state after such a call. </p>
<p>Our <code>PhotoStreamProcessor</code> is ready to go!</p>
<h3 id="heading-rest-api">REST API</h3>
<p>The server exposes a REST API that accepts the photo information to store. We make use of <a target="_blank" href="https://doc.akka.io/docs/akka-http/current/index.html">Akka HTTP</a> for the API implementation.</p>
<pre><code class="lang-scala"><span class="hljs-class"><span class="hljs-keyword">trait</span> <span class="hljs-title">AppRoutes</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">SprayJsonSupport</span> </span>{

  <span class="hljs-keyword">implicit</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">system</span></span>: <span class="hljs-type">ActorSystem</span>
  <span class="hljs-keyword">implicit</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">photoDao</span></span>: <span class="hljs-type">PhotoDao</span>
  <span class="hljs-keyword">implicit</span> <span class="hljs-keyword">lazy</span> <span class="hljs-keyword">val</span> timeout = <span class="hljs-type">Timeout</span>(<span class="hljs-number">5.</span>seconds)

  <span class="hljs-keyword">lazy</span> <span class="hljs-keyword">val</span> healthRoute: <span class="hljs-type">Route</span> = pathPrefix(<span class="hljs-string">"health"</span>) {
    concat(
      pathEnd {
        concat(
          get {
            complete(<span class="hljs-type">StatusCodes</span>.<span class="hljs-type">OK</span>)
          }
        )
      }
    )
  }

  <span class="hljs-keyword">lazy</span> <span class="hljs-keyword">val</span> crudRoute: <span class="hljs-type">Route</span> = pathPrefix(<span class="hljs-string">"photo"</span>) {
    concat(
      pathEnd {
        concat(
          post {
            entity(as[<span class="hljs-type">Photo</span>]) { photo =&gt;
              <span class="hljs-keyword">val</span> photoCreated: <span class="hljs-type">Future</span>[<span class="hljs-type">String</span>] =
                photoDao.createPhoto(photo)
              onSuccess(photoCreated) { id =&gt;
                complete((<span class="hljs-type">StatusCodes</span>.<span class="hljs-type">Created</span>, id))
              }
            }
          }
        )
      }
    )
  }

}
</code></pre>
<p>To keep the example minimal, we have only two routes:</p>
<ul>
<li><code>GET /health</code> - to check if the server is up &amp; running</li>
<li><code>POST /photo</code> - to send to the system the <code>JSON</code> of the photo information we want to store. This endpoint uses the DAO to store the document in MongoDB and returns a <code>201</code> with the id of the stored photo if the operation succeeded.</li>
</ul>
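<p>To poke at these two routes from Scala, something along these lines should do. It is only a sketch: it uses the <code>java.net.http</code> client shipped with Java 11+, the <code>PhotoClientSketch</code> object and its methods are names I made up, and the base URL assumes the server is listening on <code>127.0.0.1:8000</code>.</p>
<pre><code class="lang-scala">import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object PhotoClientSketch {

  // Assumed address of the server; change it to match your setup.
  val baseUrl = "http://127.0.0.1:8000"

  // GET /health
  def healthRequest: HttpRequest =
    HttpRequest.newBuilder(URI.create(s"$baseUrl/health")).GET().build()

  // POST /photo with the photo JSON as the request body.
  def createPhotoRequest(photoJson: String): HttpRequest =
    HttpRequest.newBuilder(URI.create(s"$baseUrl/photo"))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(photoJson))
      .build()

  // Sends a request and returns status code and body; needs the server to be running.
  def send(request: HttpRequest): (Int, String) = {
    val response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    (response.statusCode(), response.body())
  }
}
</code></pre>
<p>With the server up, <code>PhotoClientSketch.send(PhotoClientSketch.healthRequest)</code> should come back with a <code>200</code>, and sending a valid photo <code>JSON</code> through <code>createPhotoRequest</code> should come back with a <code>201</code> and the id of the photo.</p>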
<p>This is by no means a complete set of APIs, but it is enough to run our example.</p>
<h3 id="heading-server-main-class">Server main class</h3>
<p>OK, we implemented all the components of our server, so it's time to wrap everything up. This is our <code>Server.scala</code> object class.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">implicit</span> <span class="hljs-keyword">val</span> system: <span class="hljs-type">ActorSystem</span> = <span class="hljs-type">ActorSystem</span>(<span class="hljs-string">"kafka-stream-playground"</span>)
<span class="hljs-keyword">implicit</span> <span class="hljs-keyword">val</span> materializer: <span class="hljs-type">ActorMaterializer</span> = <span class="hljs-type">ActorMaterializer</span>()
</code></pre>
<p>First a couple of <strong>Akka</strong> utility values. Since we use <a target="_blank" href="https://doc.akka.io/docs/akka-http/current/index.html">Akka HTTP</a> to run our server and REST API, these implicit values are required.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">val</span> config: <span class="hljs-type">Config</span> = <span class="hljs-type">ConfigFactory</span>.load()
<span class="hljs-keyword">val</span> address = config.getString(<span class="hljs-string">"http.ip"</span>)
<span class="hljs-keyword">val</span> port = config.getInt(<span class="hljs-string">"http.port"</span>)

<span class="hljs-keyword">val</span> mongoUri = config.getString(<span class="hljs-string">"mongo.uri"</span>)
<span class="hljs-keyword">val</span> mongoDb = config.getString(<span class="hljs-string">"mongo.db"</span>)
<span class="hljs-keyword">val</span> mongoUser = config.getString(<span class="hljs-string">"mongo.user"</span>)
<span class="hljs-keyword">val</span> mongoPwd = config.getString(<span class="hljs-string">"mongo.pwd"</span>)
<span class="hljs-keyword">val</span> photoCollection = config.getString(<span class="hljs-string">"mongo.photo_collection"</span>)

<span class="hljs-keyword">val</span> kafkaHosts = config.getString(<span class="hljs-string">"kafka.hosts"</span>).split(',').toList
<span class="hljs-keyword">val</span> photoTopic = config.getString(<span class="hljs-string">"kafka.photo_topic"</span>)
<span class="hljs-keyword">val</span> longExposureTopic = config.getString(<span class="hljs-string">"kafka.long_exposure_topic"</span>)
</code></pre>
<p>Then we read all the configuration properties. We will come back to the configuration file in a moment.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">val</span> kafkaProps = <span class="hljs-keyword">new</span> <span class="hljs-type">Properties</span>()
kafkaProps.put(<span class="hljs-string">"bootstrap.servers"</span>, kafkaHosts.mkString(<span class="hljs-string">","</span>))
kafkaProps.put(<span class="hljs-string">"key.serializer"</span>, <span class="hljs-string">"org.apache.kafka.common.serialization.StringSerializer"</span>)
kafkaProps.put(<span class="hljs-string">"value.serializer"</span>, <span class="hljs-string">"org.apache.kafka.common.serialization.StringSerializer"</span>)

<span class="hljs-keyword">val</span> streamProps = <span class="hljs-keyword">new</span> <span class="hljs-type">Properties</span>()
streamProps.put(<span class="hljs-type">StreamsConfig</span>.<span class="hljs-type">APPLICATION_ID_CONFIG</span>, <span class="hljs-string">"long-exp-proc-app"</span>)
streamProps.put(<span class="hljs-type">StreamsConfig</span>.<span class="hljs-type">BOOTSTRAP_SERVERS_CONFIG</span>, kafkaHosts.mkString(<span class="hljs-string">","</span>))

<span class="hljs-keyword">val</span> photoProducer = <span class="hljs-type">PhotoProducer</span>(kafkaProps, photoTopic)
<span class="hljs-keyword">val</span> photoStreamProcessor = <span class="hljs-keyword">new</span> <span class="hljs-type">PhotoStreamProcessor</span>(kafkaProps, streamProps, photoTopic, longExposureTopic)
photoStreamProcessor.start()
</code></pre>
<p>We have to configure both our Kafka producer and the stream processor. We also start the stream processor, so the server will be ready to process the documents sent to it.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">val</span> client = <span class="hljs-type">MongoClient</span>(<span class="hljs-string">s"mongodb://<span class="hljs-subst">$mongoUri</span>/<span class="hljs-subst">$mongoUser</span>"</span>)
<span class="hljs-keyword">val</span> db = client.getDatabase(mongoDb)
<span class="hljs-keyword">val</span> photoDao: <span class="hljs-type">PhotoDao</span> = <span class="hljs-keyword">new</span> <span class="hljs-type">PhotoDao</span>(db, photoCollection)
<span class="hljs-keyword">val</span> photoListener = <span class="hljs-type">PhotoListener</span>(photoDao.collection, photoProducer)
</code></pre>
<p>MongoDB needs to be configured too. We set up the connection and initialize the DAO as well as the listener.</p>
<pre><code class="lang-scala"><span class="hljs-keyword">lazy</span> <span class="hljs-keyword">val</span> routes: <span class="hljs-type">Route</span> = healthRoute ~ crudRoute

<span class="hljs-type">Http</span>().bindAndHandle(routes, address, port)
<span class="hljs-type">Await</span>.result(system.whenTerminated, <span class="hljs-type">Duration</span>.<span class="hljs-type">Inf</span>)
</code></pre>
<p>Everything has been initialized. We create the REST routes for the communication to the server, bind them to the handlers, and finally start the server!</p>
<h4 id="heading-server-configuration">Server configuration</h4>
<p>This is the configuration file used to set up the server:</p>
<pre><code>http {
  ip = <span class="hljs-string">"127.0.0.1"</span>
  ip = ${?SERVER_IP}

  port = <span class="hljs-number">8000</span>
  port = ${?SERVER_PORT}
}
mongo {
  uri = <span class="hljs-string">"127.0.0.1:27017"</span>
  uri = ${?MONGO_URI}
  db = <span class="hljs-string">"kafka-stream-playground"</span>
  user = <span class="hljs-string">"admin"</span>
  pwd = <span class="hljs-string">"admin"</span>
  photo_collection = <span class="hljs-string">"photo"</span>
}
kafka {
  hosts = <span class="hljs-string">"127.0.0.1:9092"</span>
  hosts = ${?KAFKA_HOSTS}
  photo_topic = <span class="hljs-string">"photo"</span>
  long_exposure_topic = <span class="hljs-string">"long-exposure"</span>
}
</code></pre><p>I think that this one does not require much explanation, right?</p>
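<p>One detail that may be less obvious is the pair of assignments such as <code>ip = "127.0.0.1"</code> followed by <code>ip = ${?SERVER_IP}</code>. The <code>${?...}</code> form is an optional substitution: if the environment variable is set, it overrides the default, otherwise the second assignment is simply ignored. In plain Scala the same idea can be sketched as follows (the <code>resolve</code> function is just an illustration, not part of the project or of the config library):</p>
<pre><code class="lang-scala">object ConfigFallbackSketch {

  // Mimics `ip = "127.0.0.1"` followed by `ip = ${?SERVER_IP}`:
  // the environment value wins when present, otherwise the default stays.
  def resolve(env: Map[String, String], variable: String, default: String): String =
    env.getOrElse(variable, default)

  def main(args: Array[String]): Unit = {
    val ip = resolve(sys.env, "SERVER_IP", "127.0.0.1")
    val kafkaHosts = resolve(sys.env, "KAFKA_HOSTS", "127.0.0.1:9092")
    println(s"ip=$ip kafka=$kafkaHosts")
  }
}
</code></pre>
<p>This is what lets the same configuration file work both on a local machine and inside containers, where the addresses are passed in as environment variables.</p>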
<h1 id="heading-connectors-configuration">Connectors configuration</h1>
<p>The server we implemented writes to two Kafka topics: <code>photo</code> and <code>long-exposure</code>. But how are messages written in Elasticsearch as documents? Using <strong>Kafka Connect</strong>!</p>
<p>We can set up two connectors, one per topic, and tell the connectors to write every message going through that topic in Elasticsearch. </p>
<p>First we need <a target="_blank" href="https://docs.confluent.io/current/connect/index.html">Kafka Connect</a>. We can use the container provided by Confluent in the docker-compose file:</p>
<pre><code class="lang-yml"><span class="hljs-attr">connect:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">confluentinc/cp-kafka-connect</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-number">8083</span><span class="hljs-string">:8083</span>
    <span class="hljs-attr">networks:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">kakfa_stream_playground</span>
    <span class="hljs-attr">depends_on:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">zookeeper</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">kafka</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">$PWD/connect-plugins:/connect-plugins</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">CONNECT_BOOTSTRAP_SERVERS:</span> <span class="hljs-string">kafka:9092</span>
      <span class="hljs-attr">CONNECT_REST_ADVERTISED_HOST_NAME:</span> <span class="hljs-string">connect</span>
      <span class="hljs-attr">CONNECT_REST_PORT:</span> <span class="hljs-number">8083</span>
      <span class="hljs-attr">CONNECT_GROUP_ID:</span> <span class="hljs-string">compose-connect-group</span>
      <span class="hljs-attr">CONNECT_CONFIG_STORAGE_TOPIC:</span> <span class="hljs-string">docker-connect-configs</span>
      <span class="hljs-attr">CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">CONNECT_OFFSET_FLUSH_INTERVAL_MS:</span> <span class="hljs-number">10000</span>
      <span class="hljs-attr">CONNECT_OFFSET_STORAGE_TOPIC:</span> <span class="hljs-string">docker-connect-offsets</span>
      <span class="hljs-attr">CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">CONNECT_STATUS_STORAGE_TOPIC:</span> <span class="hljs-string">docker-connect-status</span>
      <span class="hljs-attr">CONNECT_STATUS_STORAGE_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">CONNECT_KEY_CONVERTER:</span> <span class="hljs-string">"org.apache.kafka.connect.storage.StringConverter"</span>
      <span class="hljs-attr">CONNECT_VALUE_CONVERTER:</span> <span class="hljs-string">"org.apache.kafka.connect.json.JsonConverter"</span>
      <span class="hljs-attr">CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE:</span> <span class="hljs-string">"false"</span>
      <span class="hljs-attr">CONNECT_INTERNAL_KEY_CONVERTER:</span> <span class="hljs-string">"org.apache.kafka.connect.json.JsonConverter"</span>
      <span class="hljs-attr">CONNECT_INTERNAL_VALUE_CONVERTER:</span> <span class="hljs-string">"org.apache.kafka.connect.json.JsonConverter"</span>
      <span class="hljs-attr">CONNECT_ZOOKEEPER_CONNECT:</span> <span class="hljs-string">zookeeper:2181</span>
      <span class="hljs-attr">CONNECT_PLUGIN_PATH:</span> <span class="hljs-string">/connect-plugins</span>
      <span class="hljs-attr">CONNECT_LOG4J_ROOT_LOGLEVEL:</span> <span class="hljs-string">INFO</span>
</code></pre>
<p>I want to focus on some of the configuration values. </p>
<p>First of all, we need to expose the port <code>8083</code> - that will be our endpoint to configure the connectors (<code>CONNECT_REST_PORT</code>). </p>
<p>We also need to map a volume to the <code>/connect-plugins</code> path, where we will place the <a target="_blank" href="https://docs.confluent.io/current/connect/kafka-connect-elasticsearch/index.html">Elasticsearch Sink Connector</a> to write to Elasticsearch. This is reflected also in the <code>CONNECT_PLUGIN_PATH</code>. </p>
<p>The <code>connect</code> container should know how to find the Kafka servers, so we set <code>CONNECT_BOOTSTRAP_SERVERS</code> as <code>kafka:9092</code>.</p>
<p>Once Kafka Connect is ready, we can send the configurations of our connectors to the <code>http://localhost:8083/connectors</code> endpoint. We need 2 connectors, one for the <code>photo</code> topic and one for the <code>long-exposure</code> topic. We can send the configuration as a <code>JSON</code> with a <code>POST</code> request.</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"name"</span>: <span class="hljs-string">"photo-connector"</span>,
  <span class="hljs-attr">"config"</span>: {
    <span class="hljs-attr">"connector.class"</span>: <span class="hljs-string">"io.confluent.connect.elasticsearch.ElasticsearchSinkConnector"</span>,
    <span class="hljs-attr">"tasks.max"</span>: <span class="hljs-string">"1"</span>,
    <span class="hljs-attr">"topics"</span>: <span class="hljs-string">"photo"</span>,
    <span class="hljs-attr">"key.converter"</span>: <span class="hljs-string">"org.apache.kafka.connect.storage.StringConverter"</span>,
    <span class="hljs-attr">"value.converter"</span>: <span class="hljs-string">"org.apache.kafka.connect.json.JsonConverter"</span>,
    <span class="hljs-attr">"value.converter.schemas.enable"</span>: <span class="hljs-string">"false"</span>,
    <span class="hljs-attr">"schema.ignore"</span>: <span class="hljs-string">"true"</span>,
    <span class="hljs-attr">"connection.url"</span>: <span class="hljs-string">"http://elastic:9200"</span>,
    <span class="hljs-attr">"type.name"</span>: <span class="hljs-string">"kafka-connect"</span>,
    <span class="hljs-attr">"behavior.on.malformed.documents"</span>: <span class="hljs-string">"warn"</span>,
    <span class="hljs-attr">"name"</span>: <span class="hljs-string">"photo-connector"</span>
  }
}
</code></pre>
<p>We explicitly say we are going to use the <code>ElasticsearchSinkConnector</code> as the <code>connector.class</code>, as well as the <code>topics</code> that we want to sink - in this case <code>photo</code>.</p>
<p>We don't want to use a schema for the <code>value.converter</code>, so we can disable it (<code>value.converter.schemas.enable</code>) and tell the connector to ignore the schema (<code>schema.ignore</code>).</p>
<p>The connector for the <code>long-exposure</code> topic is exactly like this one. The only differences are the <code>name</code> and, of course, the <code>topics</code>.</p>
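<p>As a rough illustration, the two <code>POST</code> requests could be sent from Python with only the standard library. This is a minimal sketch, not the repo's actual setup code: the helper names <code>build_connector</code> and <code>register_connector</code> and the connector name <code>long-exposure-connector</code> are assumptions made for this example.</p>
<pre><code class="lang-python">import json
import urllib.request

CONNECT_URL = "http://localhost:8083/connectors"  # endpoint mentioned in the text


def build_connector(topic):
    """Return the connector payload for one topic (same settings as the JSON above)."""
    name = topic + "-connector"  # assumed naming scheme, e.g. "photo-connector"
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
            "tasks.max": "1",
            "topics": topic,
            "key.converter": "org.apache.kafka.connect.storage.StringConverter",
            "value.converter": "org.apache.kafka.connect.json.JsonConverter",
            "value.converter.schemas.enable": "false",
            "schema.ignore": "true",
            "connection.url": "http://elastic:9200",
            "type.name": "kafka-connect",
            "behavior.on.malformed.documents": "warn",
            "name": name,
        },
    }


def register_connector(topic, url=CONNECT_URL):
    """POST the payload to Kafka Connect and return the HTTP status code."""
    body = json.dumps(build_connector(topic)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.status
</code></pre>
<p>With the <code>connect</code> container running, calling <code>register_connector("photo")</code> and <code>register_connector("long-exposure")</code> would register both connectors.</p>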
<h1 id="heading-how-to-run-the-project">How to run the project</h1>
<p>We have all we need to test the CDC! How can we do it? It's quite easy: simply run the <code>setup.sh</code> script in the root folder of the repo!</p>
<p>What will the script do?</p>
<ol>
<li>Run the <code>docker-compose</code> file with all the services.</li>
<li>Configure MongoDB replica set. This is required to enable the <strong>Change Stream interface</strong> to capture data changes. More info about this <a target="_blank" href="https://docs.mongodb.com/manual/changeStreams/">here</a>.</li>
<li>Configure the Kafka connectors.</li>
<li>Connect to the logs of the server.</li>
</ol>
<p>The docker-compose will run the following services:</p>
<ul>
<li>Our Server</li>
<li>3 instances of MongoDB (required for the replica set)</li>
<li>Mongoku, a MongoDB client</li>
<li>Kafka (single node)</li>
<li>Kafka connect</li>
<li>Zookeeper (required by Kafka)</li>
<li>Elasticsearch</li>
<li>Kibana</li>
</ul>
<p>There are a lot of containers to run, so make sure you have enough resources to run everything properly. If you want, remove Mongoku and Kibana from the compose-file, since they are used just for a quick look inside the DBs.</p>
<p>Once everything is up and running, you just have to send data to the server. </p>
<p>I collected some <code>JSON</code> documents of photos from Unsplash that you can use to test the system in the <code>photos.txt</code> file. </p>
<p>There are a total of 10 documents, with 5 of them containing info about long exposure photos. Send them to the server by running the <code>send-photos.sh</code> script in the root of the repo. Check that everything is stored in MongoDB by connecting to Mongoku at <code>http://localhost:3100</code>. Then connect to Kibana at <code>http://localhost:5601</code> and you will find two indexes in Elasticsearch: <code>photo</code>, containing the JSON of all the photos stored in MongoDB, and <code>long-exposure</code>, containing just the info of the long exposure photos.</p>
<p>Amazing, right?</p>
<h1 id="heading-conclusion">Conclusion</h1>
<p>We made it, guys!</p>
<p>Starting from the design of the use-case, we built our system that connected a MongoDB database to Elasticsearch using CDC. </p>
<p>Kafka Streams is the enabler, allowing us to convert database events to a stream that we can process. </p>
<p>Do you need to see the whole project? Just check out the <a target="_blank" href="https://github.com/elleFlorio/kafka-streams-playground">repository</a> on GitHub!</p>
<p>That's it, enjoy!</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to migrate from Elasticsearch 1.7 to 6.8 with zero downtime ]]>
                </title>
                <description>
                    <![CDATA[ By dor sever My last task at BigPanda was to upgrade an existing service that was using Elasticsearch version 1.7 to a newer Elasticsearch version, 6.8.1. In this post, I will share how we migrated from Elasticsearch 1.7 to 6.8 with harsh constraints... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-migrate-from-elasticsearch-1-7-to-6-8-with-zero-downtime/</link>
                <guid isPermaLink="false">66d45e414a7504b7409c338a</guid>
                
                    <category>
                        <![CDATA[ availability ]]>
                    </category>
                
                    <category>
                        <![CDATA[ data migration ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Devops ]]>
                    </category>
                
                    <category>
                        <![CDATA[ elasticsearch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Wed, 25 Dec 2019 09:45:32 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2019/12/es-3.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By dor sever</p>
<p>My last task at <a target="_blank" href="https://www.bigpanda.io">BigPanda</a> was to upgrade an existing service that was using Elasticsearch version 1.7 to a newer Elasticsearch version, 6.8.1.</p>
<p>In this post, I will share how we migrated from Elasticsearch 1.7 to 6.8 with harsh constraints like zero downtime, no data loss, and zero bugs. I'll also provide you with a script that does the migration for you.</p>
<p>This post contains 6 chapters (and one is optional):</p>
<ul>
<li>What’s in it for me? --&gt; What were the new features that led us to upgrade our version?</li>
<li>The constraints --&gt; What were our business requirements?</li>
<li>Problem solving --&gt; How did we address the constraints?</li>
<li>Moving forward --&gt; The plan.</li>
<li>[Optional chapter] --&gt; How did we handle the infamous mapping explosion problem?</li>
<li>Finally --&gt; How to do data migration between clusters.</li>
</ul>
<h1 id="heading-chapter-1-whats-in-it-for-me">Chapter 1 — What’s in it for me?</h1>
<p>What problems were we expecting to solve by upgrading our data store?</p>
<p>There were a couple of reasons:</p>
<ol>
<li>Performance and stability issues — We were experiencing a huge number of outages with long MTTR that caused us a lot of headaches. This was reflected in frequent high latencies, high CPU usage, and more issues.</li>
<li>Non-existent support in old Elasticsearch versions — We were missing some operative knowledge in Elasticsearch, and when we searched for outside consulting we were encouraged to migrate forward to receive support.</li>
<li>Dynamic mappings in our schema — Our current schema in Elasticsearch 1.7 used a feature called dynamic mappings that made our cluster <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/6.1/mapping.html#mapping-limit-settings">explode</a> multiple times. So we wanted to address this issue.</li>
<li>Poor visibility on our existing cluster — We wanted a better view under the hood and saw that later versions had great metrics exporting tools.</li>
</ol>
<h1 id="heading-chapter-2-the-constraints">Chapter 2 — The constraints</h1>
<ul>
<li>ZERO downtime migration — We have active users on our system, and we could not afford for the system to be down while we were migrating.</li>
<li>Recovery plan — We could not afford to “lose” or “corrupt” data, no matter the cost. So we needed to prepare a recovery plan in case our migration failed.</li>
<li>Zero bugs — We could not change existing search functionality for end-users.</li>
</ul>
<h1 id="heading-chapter-3-problem-solving-and-thinking-of-a-plan">Chapter 3 — Problem solving and thinking of a plan</h1>
<p>Let’s tackle the constraints from the simplest to the most difficult:</p>
<h2 id="heading-zero-bugs">Zero bugs</h2>
<p>In order to address this requirement, I studied all the possible requests the service receives and what its outputs were. Then I added unit-tests where needed.</p>
<p>In addition, I added multiple metrics (to the <code>Elasticsearch Indexer</code> and the <code>new Elasticsearch Indexer</code> ) to track latency, throughput, and performance, which allowed me to validate that we only improved them.</p>
<h2 id="heading-recovery-plan">Recovery plan</h2>
<p>This means that I needed to address the following situation: I deployed the new code to production and stuff was not working as expected. What can I do about it then?</p>
<p>Since I was working in a service that used <a target="_blank" href="https://www.youtube.com/watch?v=STKCRSUsyP0">event-sourcing</a>, I could add another listener (diagram attached below) and start writing to a new Elasticsearch cluster without affecting production status.</p>
<h2 id="heading-zero-downtime-migration">Zero downtime migration</h2>
<p>The current service is in live mode and cannot be “deactivated” for periods longer than 5–10 minutes. The trick to getting this right is this:</p>
<ul>
<li>Store a log of all the actions your service is handling (we use Kafka in production)</li>
<li>Start the migration process offline (and keep track of the offset before you started the migration)</li>
<li>When the migration ends, start the new service against the log with the recorded offset and catch up the lag</li>
<li>When the lag finishes, change your frontend to query against the new service and you are done</li>
</ul>
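<p>The four steps above can be sketched with plain Python data structures. This is only a toy simulation under stated assumptions: a list stands in for the Kafka log, dicts stand in for the old and new datastores, and the <code>apply</code> helper and the sample commands are hypothetical.</p>
<pre><code class="lang-python">def apply(store, command):
    """Apply one (doc_id, value) command to a dict standing in for an index."""
    doc_id, value = command
    store[doc_id] = value


# Step 1: every action the service handles is kept in a log.
log = [("1", "a"), ("2", "b")]
old_store = {}
for command in log:
    apply(old_store, command)

# Step 2: remember the offset, then copy the existing data offline.
start_offset = len(log)
new_store = dict(old_store)

# Live traffic keeps arriving on the old service while the copy runs.
for command in [("2", "c"), ("3", "d")]:
    log.append(command)
    apply(old_store, command)

# Step 3: the new service replays everything after the recorded offset.
for command in log[start_offset:]:
    apply(new_store, command)

# Step 4: once both stores agree, the frontend can switch over.
caught_up = new_store == old_store
</code></pre>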
<h1 id="heading-chapter-4-the-plan">Chapter 4 — The plan</h1>
<p>Our current service uses the following architecture (based on message passing in Kafka):</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/12/indxr2.jpeg" alt="Image" width="600" height="400" loading="lazy"></p>
<ol>
<li><code>Event topic</code> contains events produced by other applications (for example, <code>UserId 3 created</code>)</li>
<li><code>Command topic</code> contains the translation of these events into specific commands used by this application (for example: <code>Add userId 3</code>)</li>
<li>Elasticsearch 1.7 — The datastore of the <code>command Topic</code> read by the <code>Elasticsearch Indexer</code>.</li>
</ol>
<p>We planned to add another consumer (<code>new Elasticsearch Indexer</code>) to the <code>command topic</code>, which will read the same exact messages and write them in parallel to Elasticsearch 6.8.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/12/indxr.jpeg" alt="Image" width="600" height="400" loading="lazy"></p>
<h1 id="heading-where-should-i-start">Where should I start?</h1>
<p>To be honest, I considered myself a newbie Elasticsearch user. To feel confident to perform this task, I had to think about the best way to approach this topic and learn it. A few things that helped were:</p>
<ol>
<li>Documentation — It’s an insanely useful resource for everything Elasticsearch. Take the time to read it and take notes (don’t miss: <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html">Mapping</a> and <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html">QueryDsl</a>).</li>
<li>HTTP API — everything under <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/cat.html">CAT</a> API. This was super useful to debug things locally and see how Elasticsearch responds (don’t miss: cluster health, cat indices, search, delete index).</li>
<li>Metrics (❤️) — From the first day, we configured a shiny new dashboard with lots of cool metrics (taken from <a target="_blank" href="https://github.com/justwatchcom/elasticsearch_exporter"><em>elasticsearch-exporter-for-Prometheus</em></a>) that helped and pushed us to understand more about Elasticsearch.</li>
</ol>
<h1 id="heading-the-code">The code</h1>
<p>Our codebase was using a library called <a target="_blank" href="https://github.com/sksamuel/elastic4s">elastic4s</a> and was using the oldest release available in the library — a really good reason to migrate! So the first thing to do was just to migrate versions and see what broke.</p>
<p>There are a few tactics for doing this kind of code migration. The tactic we chose was to restore existing functionality first in the new Elasticsearch version without rewriting all the code from scratch. In other words, to reach existing functionality but on a newer version of Elasticsearch.</p>
<p>Luckily for us, the code already contained almost full testing coverage so our task was much much simpler, and that took around 2 weeks of development time.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/12/you_need_some_tests_yo.jpg" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>It's important to note that, if that wasn't the case, we would have had to invest some time in filling that coverage up. Only then would we be able to migrate since one of our constraints was to not break existing functionality.</em></p>
<h1 id="heading-chapter-5-the-mapping-explosion-problem">Chapter 5 — The mapping explosion problem</h1>
<p>Let’s describe our use-case in more detail. This is our model:</p>
<p><code>class InsertMessageCommand(tags: Map[String,String])</code></p>
<p>And for example, an instance of this message would be:</p>
<p><code>new InsertMessageCommand(Map("name"-&gt;"dor","lastName"-&gt;"sever"))</code></p>
<p>And given this model, we needed to support the following query requirements:</p>
<ol>
<li>Query by value</li>
<li>Query by tag name and value</li>
</ol>
<p>The way this was modeled in our Elasticsearch 1.7 schema was using a dynamic template schema (since the tag keys are dynamic, and cannot be modeled in advance).</p>
<p>The dynamic template caused us multiple outages due to the mapping explosion problem, and the schema looked like this:</p>
<pre><code class="lang-bash">curl -X PUT <span class="hljs-string">"localhost:9200/_template/my_template?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d <span class="hljs-string">'
{
    "index_patterns": [
        "your-index-names*"
    ],
    "mappings": {
            "_doc": {
                "dynamic_templates": [
                    {
                        "tags": {
                            "mapping": {
                                "type": "text"
                            },
                            "path_match": "actions.tags.*"
                        }
                    }
                ]
            }
        },
    "aliases": {}
}'</span>  

curl -X PUT <span class="hljs-string">"localhost:9200/your-index-names-1/_doc/1?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
  "actions": {
    "tags" : {
        "name": "John",
        "lname" : "Smith"
    }
  }
}
'</span>

curl -X PUT <span class="hljs-string">"localhost:9200/your-index-names-1/_doc/2?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
  "actions": {
    "tags" : {
        "name": "Dor",
        "lname" : "Sever"
  }
}
}
'</span>

curl -X PUT <span class="hljs-string">"localhost:9200/your-index-names-1/_doc/3?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
  "actions": {
    "tags" : {
        "name": "AnotherName",
        "lname" : "AnotherLastName"
  }
}
}
'</span>
</code></pre>
<pre><code class="lang-bash">
curl -X GET <span class="hljs-string">"localhost:9200/_search?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
    "query": {
        "match" : {
            "actions.tags.name" : {
                "query" : "John"
            }
        }
    }
}
'</span>
<span class="hljs-comment"># returns 1 match(doc 1)</span>


curl -X GET <span class="hljs-string">"localhost:9200/_search?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
    "query": {
        "match" : {
            "actions.tags.lname" : {
                "query" : "John"
            }
        }
    }
}
'</span>
<span class="hljs-comment"># returns zero matches</span>

<span class="hljs-comment"># search by value</span>
curl -X GET <span class="hljs-string">"localhost:9200/_search?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
    "query": {
        "query_string" : {
            "fields": ["actions.tags.*" ],
            "query" : "Dor"
        }
    }
}
'</span>
</code></pre>
<h2 id="heading-nested-documents-solution">Nested documents solution</h2>
<p>Our first instinct in solving the mapping explosion problem was to use nested documents.</p>
<p>We read the nested data type tutorial in the Elastic docs and defined the following schema and queries:</p>
<pre><code class="lang-bash">curl -X PUT <span class="hljs-string">"localhost:9200/my_index?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
        "mappings": {
            "_doc": {
            "properties": {
            "tags": {
                "type": "nested" 
                }                
            }
        }
        }
}
'</span>

curl -X PUT <span class="hljs-string">"localhost:9200/my_index/_doc/1?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
  "tags" : [
    {
      "key" : "John",
      "value" :  "Smith"
    },
    {
      "key" : "Alice",
      "value" :  "White"
    }
  ]
}
'</span>


<span class="hljs-comment"># Query by tag key and value</span>
curl -X GET <span class="hljs-string">"localhost:9200/my_index/_search?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "bool": {
          "must": [
            { "match": { "tags.key": "Alice" }},
            { "match": { "tags.value":  "White" }} 
          ]
        }
      }
    }
  }
}
'</span>

<span class="hljs-comment"># Returns 1 document</span>


curl -X GET <span class="hljs-string">"localhost:9200/my_index/_search?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "bool": {
          "must": [
            { "match": { "tags.value":  "Smith" }} 
          ]
        }
      }
    }
  }
}
'</span>

<span class="hljs-comment"># Query by tag value</span>
<span class="hljs-comment"># Returns 1 result</span>
</code></pre>
<p>And this solution worked. However, when we tried to insert real customer data we saw that the number of documents in our index increased by around 500 times.</p>
<p>We thought about the following problems and went on to find a better solution:</p>
<ol>
<li>The number of documents we had in our cluster was around 500 million. This meant that, with the new schema, we were going to reach two hundred fifty billion documents (that’s 250,000,000,000 documents).</li>
<li>We read this really good blog post — <a target="_blank" href="https://blog.gojekengineering.com/elasticsearch-the-trouble-with-nested-documents-e97b33b46194">https://blog.gojekengineering.com/elasticsearch-the-trouble-with-nested-documents-e97b33b46194</a> which highlights that nested documents can cause high latency in queries and heap usage problems.</li>
<li>Testing — Since we were converting 1 document in the old cluster to an unknown number of documents in the new cluster, it would be much harder to track if the migration process worked without any data loss. If our conversion was 1:1, we could assert that the count in the old cluster equalled the count in the new cluster.</li>
</ol>
<h2 id="heading-avoiding-nested-documents">Avoiding nested documents</h2>
<p>The real trick in this was to focus on what supported queries we were running: search by tag value, and search by tag key and value.</p>
<p>The first query does not require nested documents since it works on a single field. For the latter, we did the following trick. We created a field that contains the combination of the key and the value. Whenever a user queries on a key, value match, we translate their request to the corresponding text and query against that field.</p>
<p>Example:</p>
<pre><code class="lang-bash">curl -X PUT <span class="hljs-string">"localhost:9200/my_index_2?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
    "mappings": {
        "_doc": {
            "properties": {
                "tags": {
                    "type": "object",
                    "properties": {
                        "keyToValue": {
                            "type": "keyword"
                        },
                        "value": {
                            "type": "keyword"
                        }
                    }
                }
            }
        }
    }
}
'</span>


curl -X PUT <span class="hljs-string">"localhost:9200/my_index_2/_doc/1?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
  "tags" : [
    {
      "keyToValue" : "John:Smith",
      "value" : "Smith"
    },
    {
      "keyToValue" : "Alice:White",
      "value" : "White"
    }
  ]
}
'</span>

<span class="hljs-comment"># Query by key,value</span>
<span class="hljs-comment"># User queries for key: Alice, and value : White , we then query elastic with this query:</span>

curl -X GET <span class="hljs-string">"localhost:9200/my_index_2/_search?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
  "query": {
        "bool": {
          "must": [ { "match": { "tags.keyToValue": "Alice:White" }}]
  }}}
'</span>

<span class="hljs-comment"># Query by value only</span>
curl -X GET <span class="hljs-string">"localhost:9200/my_index_2/_search?pretty"</span> -H <span class="hljs-string">'Content-Type: application/json'</span> -d<span class="hljs-string">'
{
  "query": {
        "bool": {
          "must": [ { "match": { "tags.value": "White" }}]
  }}}
'</span>
</code></pre>
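<p>On the application side, the translation could look roughly like the Python sketch below. The function names <code>tags_to_fields</code> and <code>build_query</code> are hypothetical, and the <code>:</code> separator is taken from the example above. Note the assumption that tag keys never contain the separator themselves, otherwise two different tags could produce the same combined string.</p>
<pre><code class="lang-python">SEPARATOR = ":"  # same separator as in the "Alice:White" example


def tags_to_fields(tags):
    """Convert a dict of tags into the list stored under the "tags" field."""
    return [
        {"keyToValue": key + SEPARATOR + value, "value": value}
        for key, value in tags.items()
    ]


def build_query(value, key=None):
    """Build the bool/must query for a value-only or a key-and-value search."""
    if key is None:
        clause = {"match": {"tags.value": value}}
    else:
        clause = {"match": {"tags.keyToValue": key + SEPARATOR + value}}
    return {"query": {"bool": {"must": [clause]}}}
</code></pre>
<p>For example, <code>build_query("White", key="Alice")</code> produces the same body as the key,value query above, and <code>build_query("White")</code> produces the value-only one.</p>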
<h1 id="heading-chapter-6-the-migration-process">Chapter 6 — The migration process</h1>
<p>We planned to migrate about 500 million documents with zero downtime. To do that we needed:</p>
<ol>
<li>A strategy on how to transfer data from the old Elastic to the new Elasticsearch</li>
<li>A strategy on how to close the lag between the start of the migration and the end of it</li>
</ol>
<p>And our two options in closing the lag:</p>
<ol>
<li>Our messaging system is Kafka based. We could have just taken the current offset before the migration started, and after the migration ended, start consuming from that specific offset. This solution requires some manual tweaking of offsets and some other stuff, but will work.</li>
<li>Another approach to solving this issue was to start consuming messages from the beginning of the topic in Kafka and make our actions on Elasticsearch idempotent — meaning, if the change was “applied” already, nothing would change in Elastic store.</li>
</ol>
<p>The requests made by our service against Elastic were already idempotent, so we chose option 2 because it required zero manual work (no need to take specific offsets, and then set them afterward in a new consumer group).</p>
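<p>To see why idempotent writes make a replay from the beginning safe, here is a small Python simulation. It is a sketch under assumptions, not our production code: a dict stands in for the Elasticsearch index, each command is a hypothetical <code>(doc_id, source)</code> pair, and writing by explicit id plays the role of an idempotent index request.</p>
<pre><code class="lang-python">def apply_all(store, commands):
    """Index every command by its explicit id, so a repeat is just an overwrite."""
    for doc_id, source in commands:
        store[doc_id] = source
    return store


commands = [
    ("1", {"name": "dor"}),
    ("2", {"name": "john"}),
    ("1", {"name": "dor sever"}),  # later update to document 1
]

# State we expect after consuming the topic exactly once.
expected = apply_all({}, commands)

# The migration copied the data as it was after the first two commands.
migrated = apply_all({}, commands[:2])

# The new consumer then starts from the beginning of the topic anyway.
replayed = apply_all(migrated, commands)

replay_is_safe = replayed == expected
</code></pre>
<p>Commands that were already reflected in the migrated data are simply written again with the same content, so the final state matches a single pass over the topic.</p>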
<h2 id="heading-how-can-we-migrate-the-data">How can we migrate the data?</h2>
<p>These were the options we thought of:</p>
<ol>
<li>If our Kafka contained all messages from the beginning of time, we could just play from the start and the end state would be equal. But since we apply retention to our topics, this was not an option.</li>
<li>Dump messages to disk and then ingest them to Elastic directly: this solution looked kind of weird. Why store them on disk instead of just writing them directly to Elastic?</li>
<li>Transfer messages from the old Elastic to the new Elastic: this meant writing some sort of “script” (did anyone say Python?) that would connect to the old Elasticsearch cluster, query for items, transform them to the new schema, and index them in the new cluster.</li>
</ol>
<p>We chose the last option. These were the design choices we had in mind:</p>
<ol>
<li>Let’s not try to think about error handling unless we need to. Let’s try to write something super simple, and if errors occur, let’s try to address them. In the end, we did not need to address this issue since no errors occurred during the migration.</li>
<li>It’s a one-off operation, so whatever works first / KISS.</li>
<li>Metrics — Since the migration processes can take hours to days, we wanted the ability from day 1 to be able to monitor the error count and to track the current progress and copy rate of the script.</li>
</ol>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/12/python.gif" alt="Image" width="600" height="400" loading="lazy"></p>
<p>We thought long and hard and chose Python as our weapon of choice. The final version of the code is below:</p>
<pre><code class="lang-yml">dictor==0.1.2 - to copy and transform our Elasticsearch documents
elasticsearch==1.9.0 - to connect to "old" Elasticsearch
elasticsearch6==6.4.2 - to connect to the "new" Elasticsearch
statsd==3.3.0 - to report metrics
</code></pre>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> elasticsearch <span class="hljs-keyword">import</span> Elasticsearch
<span class="hljs-keyword">from</span> elasticsearch6 <span class="hljs-keyword">import</span> Elasticsearch <span class="hljs-keyword">as</span> Elasticsearch6
<span class="hljs-keyword">import</span> sys
<span class="hljs-keyword">from</span> elasticsearch.helpers <span class="hljs-keyword">import</span> scan
<span class="hljs-keyword">from</span> elasticsearch6.helpers <span class="hljs-keyword">import</span> parallel_bulk
<span class="hljs-keyword">import</span> statsd

ES_SOURCE = Elasticsearch(sys.argv[<span class="hljs-number">1</span>])
ES_TARGET = Elasticsearch6(sys.argv[<span class="hljs-number">2</span>])
INDEX_SOURCE = sys.argv[<span class="hljs-number">3</span>]
INDEX_TARGET = sys.argv[<span class="hljs-number">4</span>]
QUERY_MATCH_ALL = {<span class="hljs-string">"query"</span>: {<span class="hljs-string">"match_all"</span>: {}}}
SCAN_SIZE = <span class="hljs-number">1000</span>
SCAN_REQUEST_TIMEOUT = <span class="hljs-string">'3m'</span>
REQUEST_TIMEOUT = <span class="hljs-number">180</span>
MAX_CHUNK_BYTES = <span class="hljs-number">15</span> * <span class="hljs-number">1024</span> * <span class="hljs-number">1024</span>
RAISE_ON_ERROR = <span class="hljs-literal">False</span>


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">transform_item</span>(<span class="hljs-params">item, index_target</span>):</span>
    <span class="hljs-comment"># implement your logic transformation here</span>
    transformed_source_doc = item.get(<span class="hljs-string">"_source"</span>)
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"_index"</span>: index_target,
            <span class="hljs-string">"_type"</span>: <span class="hljs-string">"_doc"</span>,
            <span class="hljs-string">"_id"</span>: item[<span class="hljs-string">'_id'</span>],
            <span class="hljs-string">"_source"</span>: transformed_source_doc}


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">transformedStream</span>(<span class="hljs-params">es_source, match_query, index_source, index_target, transform_logic_func</span>):</span>
    <span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> scan(es_source, query=match_query, index=index_source, size=SCAN_SIZE,
                     timeout=SCAN_REQUEST_TIMEOUT):
        <span class="hljs-keyword">yield</span> transform_logic_func(item, index_target)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">index_source_to_target</span>(<span class="hljs-params">es_source, es_target, match_query, index_source, index_target, bulk_size, statsd_client,
                           logger, transform_logic_func</span>):</span>
    ok_count = <span class="hljs-number">0</span>
    fail_count = <span class="hljs-number">0</span>
    count_response = es_source.count(index=index_source, body=match_query)
    count_result = count_response[<span class="hljs-string">'count'</span>]
    statsd_client.gauge(stat=<span class="hljs-string">'elastic_migration_document_total_count,index={0},type=success'</span>.format(index_target),
                        value=count_result)
    <span class="hljs-keyword">with</span> statsd_client.timer(<span class="hljs-string">'elastic_migration_time_ms,index={0}'</span>.format(index_target)):
        actions_stream = transformedStream(es_source, match_query, index_source, index_target, transform_logic_func)
        <span class="hljs-keyword">for</span> (ok, item) <span class="hljs-keyword">in</span> parallel_bulk(es_target,
                                        chunk_size=bulk_size,
                                        max_chunk_bytes=MAX_CHUNK_BYTES,
                                        actions=actions_stream,
                                        request_timeout=REQUEST_TIMEOUT,
                                        raise_on_error=RAISE_ON_ERROR):
            <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> ok:
                logger.error(<span class="hljs-string">"got error on index {} which is : {}"</span>.format(index_target, item))
                fail_count += <span class="hljs-number">1</span>
                statsd_client.incr(<span class="hljs-string">'elastic_migration_document_count,index={0},type=failure'</span>.format(index_target),
                                   <span class="hljs-number">1</span>)
            <span class="hljs-keyword">else</span>:
                ok_count += <span class="hljs-number">1</span>
                statsd_client.incr(<span class="hljs-string">'elastic_migration_document_count,index={0},type=success'</span>.format(index_target),
                                   <span class="hljs-number">1</span>)

    <span class="hljs-keyword">return</span> ok_count, fail_count


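statsd_client = statsd.StatsClient(host=<span class="hljs-string">'localhost'</span>, port=<span class="hljs-number">8125</span>)

<span class="hljs-comment"># BULK_SIZE and logger were used but never defined in the original listing;</span>
<span class="hljs-comment"># the definitions below are assumptions, so tune them for your own cluster</span>
<span class="hljs-keyword">import</span> logging
BULK_SIZE = <span class="hljs-number">500</span>
logger = logging.getLogger(__name__)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    index_source_to_target(ES_SOURCE, ES_TARGET, QUERY_MATCH_ALL, INDEX_SOURCE, INDEX_TARGET, BULK_SIZE,
                           statsd_client, logger, transform_item)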
</code></pre>
<h1 id="heading-conclusion">Conclusion</h1>
<p>Migrating data in a live production system is a complicated task that requires a lot of attention and careful planning. I recommend taking the time to work through the steps listed above and figure out what works best for your needs.</p>
<p>As a rule of thumb, always try to reduce your requirements as much as possible. For example, is a zero-downtime migration required? Can you afford data loss?</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/12/enjoy-the-ride.gif" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Upgrading data stores is usually a marathon and not a sprint, so take a deep breath and try to enjoy the ride.</p>
<ul>
<li>The whole process listed above took me around 4 months of work</li>
<li>All of the Elasticsearch examples that appear in this post have been tested against version 6.8.1</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to implement Elasticsearch in Go ]]>
                </title>
                <description>
                    <![CDATA[ By Pramono Winata Today, I am going to show you how to implement Elasticsearch in Go. But of course, before that I am going to give a small introduction to Elasticsearch. If you have already gained a basic understanding of Elasticsearch, you can skip t... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/go-elasticsearch/</link>
                <guid isPermaLink="false">66d4608dc7632f8bfbf1e475</guid>
                
                    <category>
                        <![CDATA[ elasticsearch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Go Language ]]>
                    </category>
                
                    <category>
                        <![CDATA[ General Programming ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Software Engineering ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Mon, 25 Nov 2019 20:00:00 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2019/11/Screenshot-from-2019-11-24-22-21-41-1.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Pramono Winata</p>
<p>Today, I am going to show you how to implement Elasticsearch in Go.<br>But of course, before that I am going to give a small introduction to Elasticsearch.<br>If you have already gained a basic understanding of Elasticsearch, you can skip to the next part.</p>
<h2 id="heading-elasticsearch"><strong>Elasticsearch</strong></h2>
<p>Elasticsearch has been gaining a lot of popularity lately. Searching in a relational database often runs into scalability and performance issues.</p>
<p>Elasticsearch is a NoSQL database that has been very successful in tackling those issues. It provides great scalability and performance, and one of the most prominent features is the scoring system that allows a lot of flexibility in the search results. After all, it is not called Elastic-search for no reason!</p>
<h3 id="heading-installing-elasticsearch">Installing Elasticsearch</h3>
<p>First, you will need to install Elasticsearch on your local machine. You can go to their <a target="_blank" href="https://www.elastic.co/guide/index.html">website</a> and get the <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/7.4/install-elasticsearch.html">installation guide</a> for it. At the time of writing, I am using Elasticsearch version 7.4.2.</p>
<p>Elasticsearch has changed a lot between versions, one change being the <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html">removal of mapping types.</a> So do not expect this to fully work if you are using another version of Elasticsearch.</p>
<p>After finishing your installation, do not forget to start the Elasticsearch service, as described in their installation guide (on Linux, in short, run <code>./bin/elasticsearch</code>).</p>
<p><strong>Make sure Elasticsearch is running</strong> by sending a request to port 9200 on your local machine.</p>
<p>GET <code>localhost:9200</code></p>
<p>Hitting it should show something like below.</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"name"</span>: <span class="hljs-string">"204371"</span>,
  <span class="hljs-attr">"cluster_name"</span>: <span class="hljs-string">"elasticsearch"</span>,
  <span class="hljs-attr">"cluster_uuid"</span>: <span class="hljs-string">"8Aa0PznuR1msDL9-PYsNQg"</span>,
  <span class="hljs-attr">"version"</span>: {
    <span class="hljs-attr">"number"</span>: <span class="hljs-string">"7.4.2"</span>,
    <span class="hljs-attr">"build_flavor"</span>: <span class="hljs-string">"default"</span>,
    <span class="hljs-attr">"build_type"</span>: <span class="hljs-string">"tar"</span>,
    <span class="hljs-attr">"build_hash"</span>: <span class="hljs-string">"2f90bbf7b93631e52bafb59b3b049cb44ec25e96"</span>,
    <span class="hljs-attr">"build_date"</span>: <span class="hljs-string">"2019-10-28T20:40:44.881551Z"</span>,
    <span class="hljs-attr">"build_snapshot"</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-attr">"lucene_version"</span>: <span class="hljs-string">"8.2.0"</span>,
    <span class="hljs-attr">"minimum_wire_compatibility_version"</span>: <span class="hljs-string">"6.8.0"</span>,
    <span class="hljs-attr">"minimum_index_compatibility_version"</span>: <span class="hljs-string">"6.0.0-beta1"</span>
  },
  <span class="hljs-attr">"tagline"</span>: <span class="hljs-string">"You Know, for Search"</span>
}
</code></pre>
<p>If it's showing correctly then congratulations! You have successfully run your elasticsearch service in your local machine. Give yourself a clap and take a cup of coffee, since the day is still young.</p>
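<p>If you would rather run this check from Go (the language used later in this article), a minimal sketch using only the standard library could decode the response like this. The <code>ClusterInfo</code> type and the <code>parseClusterInfo</code> function are hypothetical names introduced for illustration, and the sample body is a trimmed copy of the response above rather than a live call.</p>
<pre><code class="lang-go">package main

import (
    "encoding/json"
    "fmt"
)

// ClusterInfo holds the few fields of the root response we care about.
// (Hypothetical type, not part of any Elasticsearch library.)
type ClusterInfo struct {
    Name        string `json:"name"`
    ClusterName string `json:"cluster_name"`
    Version     struct {
        Number string `json:"number"`
    } `json:"version"`
    Tagline string `json:"tagline"`
}

// parseClusterInfo decodes the JSON body returned by GET localhost:9200.
func parseClusterInfo(body []byte) (ClusterInfo, error) {
    info := new(ClusterInfo)
    err := json.Unmarshal(body, info)
    return *info, err
}

func main() {
    // Trimmed copy of the sample response shown above.
    sample := []byte(`{"name":"204371","cluster_name":"elasticsearch","version":{"number":"7.4.2"},"tagline":"You Know, for Search"}`)

    info, err := parseClusterInfo(sample)
    if err != nil {
        panic(err)
    }
    fmt.Println("cluster:", info.ClusterName, "version:", info.Version.Number)
}
</code></pre>
<p>In a real check you would read the body from an HTTP response instead of a hard-coded sample. Comparing <code>Version.Number</code> against the version this article was written for is a quick way to spot a mismatch early.</p>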
<h3 id="heading-making-your-first-index">Making your first index</h3>
<p>In Elasticsearch, an index is similar to a database. Earlier versions also had a table-like concept called a type. But since types have been removed in the current version, there are only indices now.</p>
<p>Confused now? Don't be. In a nutshell, you only need an index, and then you insert your data into it.</p>
<p>Now, we are going to make an index named <code>students</code> by sending the request below.<br>PUT <code>localhost:9200/students</code></p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"settings"</span>: {
        <span class="hljs-attr">"number_of_shards"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-attr">"number_of_replicas"</span>: <span class="hljs-number">1</span>
    },
   <span class="hljs-attr">"mappings"</span>: {
       <span class="hljs-attr">"properties"</span>: {
         <span class="hljs-attr">"name"</span>: {
               <span class="hljs-attr">"type"</span>: <span class="hljs-string">"text"</span>
         },
         <span class="hljs-attr">"age"</span>: {
               <span class="hljs-attr">"type"</span>: <span class="hljs-string">"integer"</span>      
         },
         <span class="hljs-attr">"average_score"</span>: {
               <span class="hljs-attr">"type"</span>: <span class="hljs-string">"float"</span>
         }
     }
   }
}
</code></pre>
<p>If nothing goes wrong, it should respond back by giving this.</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"acknowledged"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">"shards_acknowledged"</span>: <span class="hljs-literal">true</span>
}
</code></pre>
<p>Your index should be created. Now we will proceed to our next step: playing around with our Elasticsearch index.</p>
<h3 id="heading-populating-your-elasticsearch">Populating your Elasticsearch</h3>
<p>First, we will fill our Elasticsearch index with documents. If you are not familiar with the term, just know that a document is very similar to a row in a database.</p>
<p>In a NoSQL database, it's actually possible for every document to contain different fields that don't match the schema.</p>
<p>But let's not do that – let's build our documents with the schema that we defined before. The following API will allow you to add a document to your index.</p>
<p>POST <code>localhost:9200/students/_doc</code></p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"name"</span>:<span class="hljs-string">"Alice"</span>,
    <span class="hljs-attr">"age"</span>:<span class="hljs-number">17</span>,
    <span class="hljs-attr">"average_score"</span>:<span class="hljs-number">81.1</span>
}
</code></pre>
<p>Your index should have one document by now. We will need to insert several more documents. And of course, we are not going to insert our student data one by one - that would be quite a hassle!</p>
<p>Elasticsearch provides a bulk API for sending multiple requests at once. Let's use that to insert multiple documents in a single call.</p>
<p>POST <code>localhost:9200/students/_bulk</code></p>
<pre><code class="lang-json">{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"students"</span> } }
{ <span class="hljs-attr">"name"</span>:<span class="hljs-string">"john doe"</span>,<span class="hljs-attr">"age"</span>:<span class="hljs-number">18</span>, <span class="hljs-attr">"average_score"</span>:<span class="hljs-number">77.7</span> }
{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"students"</span> } }
{ <span class="hljs-attr">"name"</span>:<span class="hljs-string">"bob"</span>,<span class="hljs-attr">"age"</span>:<span class="hljs-number">16</span>, <span class="hljs-attr">"average_score"</span>:<span class="hljs-number">65.5</span> }
{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"students"</span> } }
{ <span class="hljs-attr">"name"</span>:<span class="hljs-string">"mary doe"</span>,<span class="hljs-attr">"age"</span>:<span class="hljs-number">18</span>, <span class="hljs-attr">"average_score"</span>:<span class="hljs-number">97.7</span> }
{ <span class="hljs-attr">"index"</span>:{<span class="hljs-attr">"_index"</span>: <span class="hljs-string">"students"</span> } }
{ <span class="hljs-attr">"name"</span>:<span class="hljs-string">"eve"</span>,<span class="hljs-attr">"age"</span>:<span class="hljs-number">15</span>, <span class="hljs-attr">"average_score"</span>:<span class="hljs-number">98.9</span> }
</code></pre>
<h3 id="heading-lets-query-for-the-data">Let's query for the data</h3>
<p>We have finally populated our Elasticsearch with several more students' data. Now let's do what Elasticsearch is known for: we will try to search our Elasticsearch for the data that we just inserted.</p>
<p>Elasticsearch supports many types of search mechanisms, but for this example we will be using a simple matching query. </p>
<p>Let's start our search by hitting this API:</p>
<p> POST <code>localhost:9200/_search</code></p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"query"</span> : {
        <span class="hljs-attr">"match"</span> : { <span class="hljs-attr">"name"</span> : <span class="hljs-string">"doe"</span> }
    }
}
</code></pre>
<p>You will get back a response containing the students' data that matched your query. Now you are officially a Search Engineer!</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"took"</span>: <span class="hljs-number">608</span>,
    <span class="hljs-attr">"timed_out"</span>: <span class="hljs-literal">false</span>,
    <span class="hljs-attr">"_shards"</span>: {
        <span class="hljs-attr">"total"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-attr">"successful"</span>: <span class="hljs-number">1</span>,
        <span class="hljs-attr">"skipped"</span>: <span class="hljs-number">0</span>,
        <span class="hljs-attr">"failed"</span>: <span class="hljs-number">0</span>
    },
    <span class="hljs-attr">"hits"</span>: {
        <span class="hljs-attr">"total"</span>: {
            <span class="hljs-attr">"value"</span>: <span class="hljs-number">2</span>,
            <span class="hljs-attr">"relation"</span>: <span class="hljs-string">"eq"</span>
        },
        <span class="hljs-attr">"max_score"</span>: <span class="hljs-number">0.74487394</span>,
        <span class="hljs-attr">"hits"</span>: [
            {
                <span class="hljs-attr">"_index"</span>: <span class="hljs-string">"students"</span>,
                <span class="hljs-attr">"_type"</span>: <span class="hljs-string">"_doc"</span>,
                <span class="hljs-attr">"_id"</span>: <span class="hljs-string">"rgpef24BTFuh7kXolTpo"</span>,
                <span class="hljs-attr">"_score"</span>: <span class="hljs-number">0.74487394</span>,
                <span class="hljs-attr">"_source"</span>: {
                    <span class="hljs-attr">"name"</span>: <span class="hljs-string">"john doe"</span>,
                    <span class="hljs-attr">"age"</span>: <span class="hljs-number">18</span>,
                    <span class="hljs-attr">"average_score"</span>: <span class="hljs-number">77.7</span>
                }
            },
            {
                <span class="hljs-attr">"_index"</span>: <span class="hljs-string">"students"</span>,
                <span class="hljs-attr">"_type"</span>: <span class="hljs-string">"_doc"</span>,
                <span class="hljs-attr">"_id"</span>: <span class="hljs-string">"sApef24BTFuh7kXolTpo"</span>,
                <span class="hljs-attr">"_score"</span>: <span class="hljs-number">0.74487394</span>,
                <span class="hljs-attr">"_source"</span>: {
                    <span class="hljs-attr">"name"</span>: <span class="hljs-string">"mary doe"</span>,
                    <span class="hljs-attr">"age"</span>: <span class="hljs-number">18</span>,
                    <span class="hljs-attr">"average_score"</span>: <span class="hljs-number">97.7</span>
                }
            }
        ]
    }
}
</code></pre>
<h2 id="heading-now-lets-get-to-go">Now let's get to Go!</h2>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/11/download--2-.png" alt="Image" width="600" height="400" loading="lazy">
<em>Go in action!</em></p>
<p>If you have reached this part, you should have grasped the basic concepts of using Elasticsearch. Now, we are going to implement Elasticsearch in Go.</p>
<p>A very primitive way of talking to Elasticsearch is to keep sending HTTP requests to your Elasticsearch address yourself. But we are not going to do that.</p>
<p>I found <a target="_blank" href="https://github.com/olivere/elastic">this</a> very helpful library for implementing Elasticsearch in Go. Install that library in your Go module before you proceed.</p>
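<p>For comparison, the plain-HTTP approach mentioned above could be sketched with only the standard library, roughly as follows. The helper names <code>buildMatchQuery</code> and <code>newSearchRequest</code> are hypothetical, and the sketch only prepares the request without sending it, so it runs even when no Elasticsearch instance is available.</p>
<pre><code class="lang-go">package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
)

// buildMatchQuery returns the JSON body of a simple match query,
// the same shape as the query sent by hand earlier in the article.
func buildMatchQuery(field, value string) ([]byte, error) {
    query := map[string]interface{}{
        "query": map[string]interface{}{
            "match": map[string]interface{}{field: value},
        },
    }
    return json.Marshal(query)
}

// newSearchRequest prepares (but does not send) a POST to the _search endpoint of an index.
func newSearchRequest(baseURL, index string, body []byte) (*http.Request, error) {
    req, err := http.NewRequest(http.MethodPost, baseURL+"/"+index+"/_search", bytes.NewReader(body))
    if err != nil {
        return nil, err
    }
    req.Header.Set("Content-Type", "application/json")
    return req, nil
}

func main() {
    body, err := buildMatchQuery("name", "doe")
    if err != nil {
        panic(err)
    }

    // The base URL assumes a local instance on the default port.
    req, err := newSearchRequest("http://localhost:9200", "students", body)
    if err != nil {
        panic(err)
    }

    fmt.Println(req.Method, req.URL.String())
    fmt.Println(string(body))
    // To actually run the search, pass req to http.DefaultClient.Do
    // and decode the JSON response body yourself.
}
</code></pre>
<p>Every query type would need its own hand-built body and response decoding with this approach, which is the main reason a client library is more comfortable for anything beyond a quick experiment.</p>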
<h3 id="heading-make-your-struct">Make your struct</h3>
<p>First of all, you will need a struct for your model. In this example, we are going to use the same model as in the previous section, which in this case is the <code>Student</code> struct.</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">type</span> Student <span class="hljs-keyword">struct</span> {
    Name         <span class="hljs-keyword">string</span>  <span class="hljs-string">`json:"name"`</span>
    Age          <span class="hljs-keyword">int64</span>   <span class="hljs-string">`json:"age"`</span>
    AverageScore <span class="hljs-keyword">float64</span> <span class="hljs-string">`json:"average_score"`</span>
}
</code></pre>
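<p>As a side note, the same struct can be used to generate the bulk request body from the earlier section instead of typing it by hand. The sketch below is a standalone file (it repeats the <code>Student</code> struct so it compiles on its own) and uses only the standard library. <code>buildBulkBody</code> is a hypothetical helper name, not part of any Elasticsearch client.</p>
<pre><code class="lang-go">package main

import (
    "bytes"
    "encoding/json"
    "fmt"
)

// Student mirrors the struct defined above.
type Student struct {
    Name         string  `json:"name"`
    Age          int64   `json:"age"`
    AverageScore float64 `json:"average_score"`
}

// buildBulkBody writes one action line and one document line per student.
// The bulk API expects newline-delimited JSON, with a newline after the last line too.
func buildBulkBody(index string, students []Student) (string, error) {
    var buf bytes.Buffer
    for _, s := range students {
        action, err := json.Marshal(map[string]interface{}{
            "index": map[string]string{"_index": index},
        })
        if err != nil {
            return "", err
        }
        doc, err := json.Marshal(s)
        if err != nil {
            return "", err
        }
        buf.Write(action)
        buf.WriteByte('\n')
        buf.Write(doc)
        buf.WriteByte('\n')
    }
    return buf.String(), nil
}

func main() {
    students := []Student{
        {Name: "john doe", Age: 18, AverageScore: 77.7},
        {Name: "bob", Age: 16, AverageScore: 65.5},
    }

    body, err := buildBulkBody("students", students)
    if err != nil {
        panic(err)
    }
    fmt.Print(body)
}
</code></pre>
<p>The resulting string can be sent as the body of the <code>_bulk</code> request shown earlier. For larger data sets you would probably stream the lines rather than build one big string in memory.</p>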
<h3 id="heading-making-a-client-connection">Making a Client Connection</h3>
<p>Now, let's make a function that'll allow us to initialize our ES Client connection.<br>If you have a running instance of Elasticsearch outside of your localhost, you can simply change the part inside <code>SetURL</code>.</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">GetESClient</span><span class="hljs-params">()</span> <span class="hljs-params">(*elastic.Client, error)</span></span> {

    client, err :=  elastic.NewClient(elastic.SetURL(<span class="hljs-string">"http://localhost:9200"</span>),
        elastic.SetSniff(<span class="hljs-literal">false</span>),
        elastic.SetHealthcheck(<span class="hljs-literal">false</span>))

    fmt.Println(<span class="hljs-string">"ES initialized..."</span>)

    <span class="hljs-keyword">return</span> client, err

}
</code></pre>
<h3 id="heading-data-insertion"><strong>Data Insertion</strong></h3>
<p>After that, the first thing we can do is try to insert our data into Elasticsearch via Go. We will be making a model of <code>Student</code> and inserting it into our Elasticsearch client.</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"encoding/json"</span>
    <span class="hljs-string">"fmt"</span>

    elastic <span class="hljs-string">"gopkg.in/olivere/elastic.v7"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {

    ctx := context.Background()
    esclient, err := GetESClient()
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        fmt.Println(<span class="hljs-string">"Error initializing : "</span>, err)
        <span class="hljs-built_in">panic</span>(<span class="hljs-string">"Client fail "</span>)
    }

    <span class="hljs-comment">//creating student object</span>
    newStudent := Student{
        Name:         <span class="hljs-string">"Gopher doe"</span>,
        Age:          <span class="hljs-number">10</span>,
        AverageScore: <span class="hljs-number">99.9</span>,
    }

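    dataJSON, err := json.Marshal(newStudent)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-built_in">panic</span>(err)
    }
    js := <span class="hljs-keyword">string</span>(dataJSON)
    <span class="hljs-comment">// the index response is not needed here, so it is discarded</span>
    _, err = esclient.Index().
        Index(<span class="hljs-string">"students"</span>).
        BodyJson(js).
        Do(ctx)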

    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-built_in">panic</span>(err)
    }

    fmt.Println(<span class="hljs-string">"[Elastic][InsertProduct]Insertion Successful"</span>)

}
</code></pre>
<h3 id="heading-querying-our-data"><strong>Querying our Data</strong></h3>
<p>Finally, we can do some searching. The below code might look a bit complex. But rest assured, it will make more sense to you after you go through it carefully. I will be using a basic matching query in the below example.</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"encoding/json"</span>
    <span class="hljs-string">"fmt"</span>

    elastic <span class="hljs-string">"gopkg.in/olivere/elastic.v7"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {

    ctx := context.Background()
    esclient, err := GetESClient()
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        fmt.Println(<span class="hljs-string">"Error initializing : "</span>, err)
        <span class="hljs-built_in">panic</span>(<span class="hljs-string">"Client fail "</span>)
    }

    <span class="hljs-keyword">var</span> students []Student

    searchSource := elastic.NewSearchSource()
    searchSource.Query(elastic.NewMatchQuery(<span class="hljs-string">"name"</span>, <span class="hljs-string">"Doe"</span>))

    <span class="hljs-comment">/* this block will basically print out the es query */</span>
    queryStr, err1 := searchSource.Source()
    queryJs, err2 := json.Marshal(queryStr)

    <span class="hljs-keyword">if</span> err1 != <span class="hljs-literal">nil</span> || err2 != <span class="hljs-literal">nil</span> {
        fmt.Println(<span class="hljs-string">"[esclient][GetResponse]err during query marshal="</span>, err1, err2)
    }
    fmt.Println(<span class="hljs-string">"[esclient]Final ESQuery=\n"</span>, <span class="hljs-keyword">string</span>(queryJs))
    <span class="hljs-comment">/* until this block */</span>

    searchService := esclient.Search().Index(<span class="hljs-string">"students"</span>).SearchSource(searchSource)

    searchResult, err := searchService.Do(ctx)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        fmt.Println(<span class="hljs-string">"[ProductsES][GetPIds]Error="</span>, err)
        <span class="hljs-keyword">return</span>
    }

    <span class="hljs-keyword">for</span> _, hit := <span class="hljs-keyword">range</span> searchResult.Hits.Hits {
        <span class="hljs-keyword">var</span> student Student
        err := json.Unmarshal(hit.Source, &amp;student)
        <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
            fmt.Println(<span class="hljs-string">"[Getting Students][Unmarshal] Err="</span>, err)
        }

        students = <span class="hljs-built_in">append</span>(students, student)
    }

    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        fmt.Println(<span class="hljs-string">"Fetching student fail: "</span>, err)
    } <span class="hljs-keyword">else</span> {
        <span class="hljs-keyword">for</span> _, s := <span class="hljs-keyword">range</span> students {
            fmt.Printf(<span class="hljs-string">"Student found Name: %s, Age: %d, Score: %f \n"</span>, s.Name, s.Age, s.AverageScore)
        }
    }

}
</code></pre>
<p>The query should be printed out like this:</p>
<pre><code>ES initialized...
[esclient]Final ESQuery=
 {<span class="hljs-string">"query"</span>:{<span class="hljs-string">"match"</span>:{<span class="hljs-string">"name"</span>:{<span class="hljs-string">"query"</span>:<span class="hljs-string">"Doe"</span>}}}}
</code></pre><p>And yes, that query is what will be sent to Elasticsearch.</p>
<p>The result of your query should also come out like this if you have followed my example since the very start:</p>
<pre><code>Student found Name: john doe, <span class="hljs-attr">Age</span>: <span class="hljs-number">18</span>, <span class="hljs-attr">Score</span>: <span class="hljs-number">77.700000</span> 
Student found Name: mary doe, <span class="hljs-attr">Age</span>: <span class="hljs-number">18</span>, <span class="hljs-attr">Score</span>: <span class="hljs-number">97.700000</span> 
Student found Name: Gopher doe, <span class="hljs-attr">Age</span>: <span class="hljs-number">10</span>, <span class="hljs-attr">Score</span>: <span class="hljs-number">99.900000</span>
</code></pre><p>And there you go!</p>
<p>That's the end of my tutorial about how to implement Elasticsearch in Go. I hope I have covered the very basic parts of using Elasticsearch in Go. </p>
<p>To get further info on this topic, you should read about <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html">Query DSL</a> and <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html">Function Scoring</a> in Elasticsearch, which are, in my opinion, among the best things about Elasticsearch.</p>
<p>And fret not, the library used in this example also supports a lot of Elasticsearch features, even the Function Scoring query in Elasticsearch.</p>
<p>Thanks for reading through my article! I do hope it is useful and helps you get started with Elasticsearch.</p>
<blockquote>
<p>Never stop learning; knowledge doubles every fourteen months. ~Anthony J.D'Angelo</p>
</blockquote>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Powerful tools for Elasticsearch data visualization & analysis ]]>
                </title>
                <description>
                    <![CDATA[ By Veronika Rovnik The goal is to turn data into information, and information into insight. ―Carly Fiorina About Kibana Kibana is a piece of data visualization software that provides a browser-based interface for exploring Elasticsearch data and n... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/powerful-tools-for-elasticsearch-data-visualization-analysis/</link>
                <guid isPermaLink="false">66d4617cd14641365a05098d</guid>
                
                    <category>
                        <![CDATA[ big data ]]>
                    </category>
                
                    <category>
                        <![CDATA[ data analytics ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Data Science ]]>
                    </category>
                
                    <category>
                        <![CDATA[ data visualization ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Developer Tools ]]>
                    </category>
                
                    <category>
                        <![CDATA[ elasticsearch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ NoSQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Web Development ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Tue, 13 Aug 2019 17:00:00 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2019/08/Copy-of-designing-a-scandinavian-style-home--1--1.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Veronika Rovnik</p>
<blockquote>
<p>The goal is to turn data into information, and information into insight.</p>
<p>―Carly Fiorina</p>
</blockquote>
<h1 id="heading-about-kibana">About Kibana</h1>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/08/Kibana-Color-Lockup.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><a target="_blank" href="https://www.elastic.co/products/kibana/?r=fr4">Kibana</a> is a piece of <strong>data visualization software</strong> that provides a browser-based interface for <em>exploring Elasticsearch data</em> and <em>navigating the Elastic Stack</em> — a collection of open-source products (Elasticsearch, Logstash, Beats, and others).</p>
<p>While Logstash and Beats deliver data to Elasticsearch, <strong>Kibana</strong> <em>opens the window into the Elastic Stack</em>, allowing you to track the <em>health of your cluster</em>, perform <em>log</em> and <em>time-series analysis</em>, detect anomalies in the data with <em>unsupervised machine learning</em>, discover relationships using <em>graphs</em> and, most importantly, extract insights from the Elasticsearch data with <strong>visualizations</strong> that can be combined in a <em>custom interactive dashboard</em>.</p>
<p>Today I’d like to show you how to create a stunning <strong>dashboard</strong> and a tabular <strong>report</strong> based on the Elasticsearch data.</p>
<p>Roll up your sleeves and let’s start!</p>
<h1 id="heading-where-to-start">Where to start</h1>
<p>The <strong>Home</strong> page is the place where everything starts.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/08/0_fpQgMCmvLqiFhur2.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Here you can decide which actions to take next. The available functionality can be divided into two logical sections:</p>
<ul>
<li><strong>Visualizing</strong> and <strong>exploring</strong> the data. Here you can create a new dashboard, visualization or presentation, build a machine learning model, analyze relationships in your data using <strong>graphs</strong>, and more.</li>
<li><strong>Managing</strong> the <strong>Elastic Stack</strong>: configure your spaces, analyze logs of an application, configure security settings, etc.</li>
</ul>
<p>We’ll focus on the process of creating visualizations and adding them to the dashboard.</p>
<h1 id="heading-how-to-create-a-dashboard-in-kibana">How to create a dashboard in Kibana</h1>
<p>Let me give you a feel for how easy it is to set up a <em>rich dashboard and start reporting.</em></p>
<p>The first essential step to take is to <em>import your data</em> into Kibana. Multiple options for adding data are at your disposal — you can choose the one that works best for you:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/08/0_sRsqKuv7Ptw0Clt1.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>For demonstration purposes, I’ve selected the sample data.</p>
<p>To design your first data visualizations and combine them into the dashboard, open the <strong>Visualize</strong> page. Here you can create, modify and view the existing visualizations.</p>
<p>What will strike you at once is the abundance of <strong>visualization types</strong> you can choose from.</p>
<p>After you’ve selected the one you need, choose an index pattern as a source so as to inform Kibana about your index. Let’s choose <code>kibana_sample_data_flights</code> and start creating a horizontal bar chart.</p>
<p>Now you can apply a metric aggregation for the Y-axis and a bucket aggregation for the X-axis. Here is a <a target="_blank" href="https://www.elastic.co/guide/en/kibana/7.1/xy-chart.html">list</a> of all available aggregations for charts.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/08/HorizontalBarChartKibana.gif" alt="Image" width="600" height="400" loading="lazy">
<em>Creating a horizontal bar chart in Kibana</em></p>
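<p>The metric and bucket split maps onto an Elasticsearch aggregation request: the bucket aggregation groups documents, and the metric aggregation computes one value per group. The sketch below is a hypothetical illustration, not the request Kibana actually sends. The helper name <code>buildBarChartAggregation</code> is made up, and the <code>Carrier</code> and <code>AvgTicketPrice</code> fields are assumed to exist in the sample flights index.</p>

```javascript
// Hypothetical helper: builds the body of an Elasticsearch _search request
// that mirrors a bar chart with one bar per bucket.
// bucketField -> the axis that lists the groups (terms bucket aggregation)
// metricField -> the axis that shows the value (avg metric aggregation)
function buildBarChartAggregation(bucketField, metricField, maxBuckets) {
  return {
    size: 0, // return aggregation results only, no individual documents
    aggs: {
      buckets: {
        terms: { field: bucketField, size: maxBuckets },
        aggs: {
          metric: { avg: { field: metricField } }
        }
      }
    }
  };
}

// Field names are assumptions about the kibana_sample_data_flights index.
const body = buildBarChartAggregation("Carrier", "AvgTicketPrice", 5);
console.log(JSON.stringify(body, null, 2));
```

<p>Swapping <code>avg</code> for another metric aggregation such as <code>sum</code> or <code>max</code> changes what each bar measures without touching the grouping.</p>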
<p>Optionally, you can customize the colors of the visualization.</p>
<p><strong>Filtering</strong> is another mighty feature of Elasticsearch and Kibana. It provides a way to visualize only a selected subset of documents.</p>
<p>See how you can apply filters to the fields based on logical conditions:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/08/FilteringBarChartKibana.gif" alt="Image" width="600" height="400" loading="lazy"></p>
<p>As you see, Kibana provides a straightforward way of filtering the data via the comfy interface. Along with that, you can choose how to filter the data — either by using the <strong>Kibana Query Language</strong> (a simplified query syntax) or <strong>Lucene</strong>.</p>
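<p>Whichever syntax you type, the filter ends up as an Elasticsearch query. As a rough sketch (not the exact query Kibana generates), a KQL condition such as <code>Carrier : "Kibana Airlines" and FlightDelayMin &gt; 60</code> could be written in the Query DSL as shown below. The helper name <code>buildDelayFilter</code> is hypothetical, and the field names and the carrier value are assumptions about the sample flights data, with <code>Carrier</code> assumed to be a keyword field.</p>

```javascript
// Hypothetical helper: expresses "carrier equals X AND delay greater than N"
// as an Elasticsearch bool query. Clauses under "filter" must all match and
// do not contribute to relevance scoring.
function buildDelayFilter(carrier, minDelayMinutes) {
  return {
    query: {
      bool: {
        filter: [
          { term: { Carrier: carrier } },                        // exact match on a keyword field
          { range: { FlightDelayMin: { gt: minDelayMinutes } } } // strictly greater than
        ]
      }
    }
  };
}

console.log(JSON.stringify(buildDelayFilter("Kibana Airlines", 60), null, 2));
```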
<p>To allow end-users to filter the data interactively, you can add <strong>control</strong> widgets — special elements of the dashboard which allow filtering the data simply by clicking them.</p>
<p>Another feature I’d like to highlight is the <strong>advanced filtering by dates</strong> and the ability to set time intervals for refreshing the data in the dashboard.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/08/0_dO63HLLppucTAw4M.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>The good thing is that visualizations are <strong>reusable</strong>. After creating one, you can <strong>save your result</strong> and add it to a dashboard at any time, as well as <strong>share</strong> it with your colleagues, provided they have access to your Kibana instance.</p>
<p><img src="https://miro.medium.com/max/38/0*sIPxndN5TdA8xOEH?q=20" alt="Image" width="600" height="400" loading="lazy">
<em>Saving a visualization in Kibana</em></p>
<p>After arranging all the visualization elements on a single page, you can export the final dashboard to <strong>PNG</strong> or <strong>PDF</strong> format. This is what makes the dashboards portable — it’s easy to share them across departments in no time.</p>
<p>Let’s look at an example of the dashboard you can create:</p>
<p><img src="https://miro.medium.com/max/38/0*N3TOSp4x8RObP9O-?q=20" alt="Image" width="600" height="400" loading="lazy">
<em>Interacting with the dashboard in Kibana</em></p>
<p>To my mind, the principal features which make each dashboard special are <strong>interactivity</strong> and <strong>expressiveness</strong>. With them, you can communicate business metrics efficiently.</p>
<h1 id="heading-personal-impression">Personal impression</h1>
<p>The visualizations in Kibana do exactly the job they are designed for. What is more, all the visualizations are <strong>eye-catching</strong> and you can tailor them according to your design ideas. The entire process of creating a dashboard in Kibana is meant to be <em>fast</em> and <em>efficient</em>, and Kibana’s user-friendly, intuitive interface makes it so.</p>
<p>On the other hand, I’ve felt that some functionality is missing here.</p>
<p>When working with data, one of the effective exploratory techniques you can apply is <strong>slicing</strong> and <strong>dicing</strong> your data before getting to know which aspects of the data to pay attention to. To my mind, the data table widget isn’t the best option — it presents the data in a flat table which doesn’t support a multi-dimensional view of the data. But playing with data should be done interactively and fast.</p>
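<p>To make the difference between a flat table and a multi-dimensional view concrete, here is a minimal sketch of what a pivot does: it takes flat records and arranges one field along the rows, another along the columns, and aggregates a measure in each cell. The <code>pivot</code> function and the sample records are made up for illustration and are not part of Kibana or any plugin.</p>

```javascript
// Pivot flat records into a rows-by-columns table, summing a measure per cell.
function pivot(records, rowField, columnField, measureField) {
  const table = {};
  for (const record of records) {
    const rowKey = record[rowField];
    const columnKey = record[columnField];
    if (!table[rowKey]) {
      table[rowKey] = {};
    }
    const current = table[rowKey][columnKey] || 0;
    table[rowKey][columnKey] = current + record[measureField];
  }
  return table;
}

// Illustrative data only: one record per flight.
const flights = [
  { carrier: "A", destination: "IT", delay: 10 },
  { carrier: "A", destination: "US", delay: 5 },
  { carrier: "B", destination: "IT", delay: 20 },
  { carrier: "A", destination: "IT", delay: 15 }
];

console.log(pivot(flights, "carrier", "destination", "delay"));
// → { A: { IT: 25, US: 5 }, B: { IT: 20 } }
```

<p>Swapping the row and column arguments gives a different slice of the same data, which is the kind of exploration a flat table makes awkward.</p>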
<p>And this is where a <strong>pivot table control</strong> comes into play. After searching for available solutions, my choice fell on one open-source <strong>plugin</strong> called <a target="_blank" href="https://www.flexmonster.com/?r=fr4">Flexmonster</a>. It handles connecting to the <em>Elasticsearch index</em> and allows creating <strong>tabular reports</strong> based on the data from its documents. Along with that, integrating with Kibana is smooth — the only thing required to get started is to install a plugin by running one line of code in the command line. You can find more details on <a target="_blank" href="https://github.com/flexmonster/pivot-kibana">GitHub</a>. Before using it, I recommend making sure that your Kibana and Elasticsearch instances are of the same version.</p>
<p>Once you have set up the tool, you are ready to use all of its features to dig for in-depth insights.</p>
<h1 id="heading-features-for-analytics-and-reporting">Features for analytics and reporting</h1>
<p>Flexmonster Pivot provides fast access to the most essential reporting functionality. Its toolbar allows connecting to the data source, loading previously saved reports, and exporting reports to <strong>PDF</strong>, <strong>Excel</strong>, <strong>HTML</strong>, <strong>CSV</strong>, and images. Besides, I’ve managed to quickly switch between two different modes: the grid and the charts. Cell formatting options include <strong>conditional</strong> and <strong>number formatting</strong>. The field list deserves particular attention: here you can assign hierarchies to rows, columns, measures, and report filters. There is also a <em>search input field</em>, which is helpful if the index has a long list of fields.</p>
<p>One of the features I’d like to highlight is the ability to <strong>drag and drop</strong> the hierarchies right on the grid. Thereby, you can change the slice completely via the UI.</p>
<p>Another one is the <strong>drill-through</strong> feature: it shows which records lie behind the aggregated values.</p>
<h1 id="heading-working-with-a-pivot-table">Working with a pivot table</h1>
<p>Let me show you how to create a report based on the Elasticsearch data:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2019/08/ReportInKibanaDevTo2.gif" alt="Image" width="600" height="400" loading="lazy"></p>
<p>While testing the tool, I’ve managed to <em>aggregate</em> and <em>filter</em> the data, <em>sort</em> the values on the grid and save the results to continue working with the report later. Plus, exporting works well — it’s easy to share the reports with teammates.</p>
<h1 id="heading-bringing-it-all-together">Bringing it all together</h1>
<p>Today I’ve covered the benefits Kibana provides for visualizing Elasticsearch data. You’ve seen how dashboards can empower the analysis process.</p>
<p>To my mind, a pivot table is a good tool which enables you to benefit from exploring data before teasing out the answers to complex questions.</p>
<p>Flexmonster nicely complements the available functionality of Kibana - the reports you are creating with it are insightful, customizable and can be easily shared across departments. </p>
<p>Working together, both tools have all the potential to boost your storytelling. </p>
<p>I encourage you to give such a combination a try.</p>
<h2 id="heading-whats-next">What’s next?</h2>
<ul>
<li><a target="_blank" href="https://www.elastic.co/products/stack/reporting/?r=fr4">Reporting with Kibana</a></li>
<li><a target="_blank" href="https://www.elastic.co/guide/en/kibana/current/createvis.html">Creating a visualization in Kibana</a></li>
<li><a target="_blank" href="https://www.flexmonster.com/demos/connect-elasticsearch/?r=fr4">Pivot Table for Elasticsearch</a></li>
<li><a target="_blank" href="https://www.flexmonster.com/blog/new-pivot-table-for-kibana/?r=fr4">How to add a Pivot Table to Kibana</a></li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to simplify Docker container log analysis with Elastic Stack ]]>
                </title>
                <description>
                    <![CDATA[ By Ravindu Fernando Logging is an essential component within any application. Logs enable you to analyze and sneak a peek into what’s happening within your application code like a story. Software developers spend a large part of their day to day live... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/docker-container-log-analysis-with-elastic-stack-53d5ec9e5953/</link>
                <guid isPermaLink="false">66c3496ac8f6b2d81069b349</guid>
                
                    <category>
                        <![CDATA[ coding ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ elasticsearch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Productivity ]]>
                    </category>
                
                    <category>
                        <![CDATA[ tech  ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Wed, 10 Apr 2019 06:25:00 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*ytyp7c9adYtnLbTUqDl-DA.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Ravindu Fernando</p>
<p>Logging is an essential component of any application. Logs let you analyze and sneak a peek into what’s happening within your application code, like reading a story. Software developers spend a large part of their day-to-day lives monitoring, troubleshooting and debugging applications, which can sometimes be a nightmare. Logging makes this hectic process much easier and smoother.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/EMv0ZRmR82v5oUZ4ZNllx29LNYgqZoUwjua-" alt="Image" width="400" height="400" loading="lazy"></p>
<p>If you have containerized your application with a container platform like Docker, you may be familiar with <strong><em>docker logs</em></strong> which allows you to see the logs created within your application running inside your docker container. Why then think of Elastic Stack to analyze your logs? Well, there are mainly two burning problems here:</p>
<ul>
<li>Imagine you have tens, hundreds, or even thousands of containers generating logs — SSH-ing in to all those servers and extracting logs won’t work well.</li>
<li>Containers are also immutable and ephemeral, which means they are short-lived. Once your containers are gone and replaced with new ones, all of the application logs related to the old containers are gone too.</li>
</ul>
<p>So the ultimate solution for this is to create a centralized logging component that collects all of your container logs in a single place. This is where the Elastic Stack comes in.</p>
<p><a target="_blank" href="https://www.elastic.co/products/">Elastic Stack</a> mainly consists of four major components:</p>
<ul>
<li><strong>Beats</strong> is the newest member, the one that turned the ELK Stack into the Elastic Stack. Beats are lightweight log data shippers which can push logs to the ELK Stack. For this post I will be using Filebeat, a member of the Beats family, which offers a lightweight way to collect, forward, and centralize logs and files.</li>
<li><strong>Logstash</strong> is a component which aggregates, modifies, and transfers logs from multiple input locations into Elasticsearch.</li>
<li><strong>Elasticsearch</strong> is a distributed, JSON-based search and analytics engine that stores and indexes data (log entries in this case) in a scalable and manageable way.</li>
<li><strong>Kibana</strong> is an enriched UI to analyze and easily access data in Elasticsearch.</li>
</ul>
<p>In this post, we will look into how to use the above mentioned components and implement a centralized log analyzer to collect and extract logs from Docker containers.</p>
<p>For the purposes of this article, I have used two t2.small AWS EC2 instances running Ubuntu 18.04 with Docker and Docker Compose installed. Instance 1 runs a Tomcat webapp and instance 2 runs the ELK stack (Elasticsearch, Logstash, Kibana).</p>
<p>On Linux, Docker logs can be found in this location by default:<br><strong><em>/var/lib/docker/containers//&lt;container-id&gt;-json.log</em></strong></p>
<p>All docker logs will be collected via Filebeat running inside the host machine as a container. Filebeat will be installed on each docker host machine (we will be using a custom Filebeat docker file and systemd unit for this which will be explained in the Configuring Filebeat section.)</p>
<p>Our tomcat webapp will write logs to the above location by using the default docker logging driver. Filebeat will then extract logs from that location and push them towards Logstash.</p>
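<p>With the default <code>json-file</code> logging driver, each line in that file is a JSON object carrying the message (<code>log</code>), the stream it came from (<code>stream</code>), and a timestamp (<code>time</code>). The sketch below shows roughly what a log shipper has to unpack from every line. The function name <code>parseDockerLogLine</code> and the sample log line are invented for illustration, and this is not how Filebeat itself is implemented.</p>

```javascript
// Parse one line written by Docker's json-file logging driver.
function parseDockerLogLine(line) {
  const entry = JSON.parse(line);
  return {
    message: entry.log.replace(/\n$/, ""), // the driver keeps the trailing newline
    stream: entry.stream,                   // "stdout" or "stderr"
    timestamp: entry.time                   // timestamp string as written by the driver
  };
}

// Illustrative sample line, not taken from a real container.
const sample =
  '{"log":"Server startup in 1234 ms\\n","stream":"stdout","time":"2019-04-10T06:25:00.000000000Z"}';

console.log(parseDockerLogLine(sample));
```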
<p>Another important thing to note is that, apart from the logs the application generates, we also need metadata associated with the containers, such as the container name, image, tags, and host. This lets us identify the exact host and container the logs are coming from. Filebeat can easily send this data along with the application log entries.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/A1VVz9CNOB6-hJSBisBPdHQflklU7eHXAUyQ" alt="Image" width="800" height="352" loading="lazy">
<em>High Level Architecture — Instance 1 [Left] | Instance 2 [Right]</em></p>
<p>With this kind of setup, the running containers don’t need to worry about the logging driver or about how logs are collected and pushed. Filebeat takes care of all that. This is often known as the <strong><em>single responsibility principle.</em></strong></p>
<h4 id="heading-configuring-filebeat"><strong>Configuring Filebeat</strong></h4>
<blockquote>
<p>For this section the filebeat.yml and Dockerfile were obtained from <a target="_blank" href="https://medium.com/@bcoste">Bruno COSTE</a>’s <a target="_blank" href="https://github.com/bcoste/sample-filebeat-docker-logging">sample-filebeat-docker-logging github repo</a>. Many thanks to him for his awesome work.</p>
<p>But since I have made several changes to filebeat.yml to fit the requirements of this article, I have hosted those files, together with filebeat.service (the systemd file), separately in my own repo. You can access the repo <a target="_blank" href="https://github.com/rav94/filebeat-demo">here</a>.</p>
</blockquote>
<p>As the initial step, you need to update your filebeat.yml file, which contains the Filebeat configuration. Given below is a sample filebeat.yml file you can use. Note line 21, with the output.logstash field and its hosts field. I have set it to the IP address of the server on which I’m running my ELK stack, but you can change it if you are running Logstash on a separate server. By default, Logstash listens for Filebeat on port 5044.</p>
<blockquote>
<p>To get to know more about Filebeat Docker configuration parameters, <a target="_blank" href="https://www.elastic.co/guide/en/beats/filebeat/master/filebeat-input-docker.html">look here</a>.</p>
</blockquote>
<p>After that you can create your own Filebeat Docker image by using the following Dockerfile.</p>
<p>Once the image is built, you can push it to your Docker repository. Now that you can run Filebeat as a Docker container, it’s just a matter of starting the Filebeat container on each host instance that runs containers. Here is the docker run command.</p>
<pre><code>docker run -v <span class="hljs-string">'/var/lib/docker/containers:/usr/share/dockerlogs/data:ro'</span> -v <span class="hljs-string">'/var/run/docker.sock:/var/run/docker.sock'</span> --name filebeat ${YOUR_FILEBEAT_DOCKER_IMAGE}:latest
</code></pre><p>In the above Docker command, note the two bind mount parameters. /var/lib/docker/containers is the path where the Docker logs live on the host machine, and it is mounted at /usr/share/dockerlogs/data inside the Filebeat container with read-only access. The second bind mount exposes the host’s /var/run/docker.sock inside the Filebeat container. This is the Unix socket the Docker daemon listens on by default, and it can be used to communicate with the daemon from within a container. It allows our Filebeat container to obtain Docker metadata, enrich the container log entries with it, and push them to the ELK stack.</p>
<p>If you want to automate this process, I have written a Systemd Unit file for managing Filebeat as a service.</p>
<h4 id="heading-configuring-the-elk-stack"><strong>Configuring the ELK Stack</strong></h4>
<p>For this I will be using my second EC2 instance, where I run the ELK stack. You can do this by simply installing Docker compose and checking out this awesome <a target="_blank" href="https://github.com/deviantony/docker-elk">deviantony/docker-elk repo</a> and just running <strong><em>docker-compose up -d</em></strong></p>
<blockquote>
<p>Make sure that your firewall rules allow inbound traffic to Logstash, Elasticsearch and Kibana.</p>
</blockquote>
<p>Before running the ELK stack, you need to make sure your logstash.conf file is properly configured to listen for incoming Beats logs on port 5044 and that the logs are being properly forwarded to the Elasticsearch host. You also need to add an index parameter to your Elasticsearch output so that the logs generated by Filebeat can be identified uniquely.</p>
<p>In your docker-elk repo, you can find the logstash.conf file under the docker-elk/logstash/pipeline path. This is the file that holds the Logstash configuration. You need to update it as follows:</p>
<p>Once you do it, you can access your Kibana dashboard on port 5601 by default as defined on the docker-compose.yml file on <a target="_blank" href="https://github.com/deviantony/docker-elk">deviantony/docker-elk</a> repo.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/V3snsitA4b770QQT-iJZaIs2CXxAgyFkwst9" alt="Image" width="800" height="460" loading="lazy">
<em>Kibana Dashboard</em></p>
<p>Under the management tab, you can create an index pattern for Filebeat logs. This has to be done before you can view the logs on Kibana dashboard.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/E1ZmKKsFfxdBIytNreN-2rpR1NfwaBo8ze-X" alt="Image" width="800" height="401" loading="lazy">
<em>Filebeat Index Pattern Configuration on Kibana Dashboard</em></p>
<p>If your containers are pushing logs properly into Elasticsearch via Logstash, and you have successfully created the index pattern, you can go to the Discover tab on the Kibana dashboard and view your Docker container application logs along with Docker metadata under the filebeat* index pattern.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/cr6q8RtvQNdx2oselNrwjfL3DCzZKK5T3MC3" alt="Image" width="800" height="402" loading="lazy">
<em>Discover Docker container application logs along with the Docker host metadata in Kibana Dashboard</em></p>
<p><strong>References</strong></p>
<ol>
<li><a target="_blank" href="https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-getting-started.html">https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-getting-started.html</a></li>
<li><a target="_blank" href="https://medium.com/@bcoste/powerful-logging-with-docker-filebeat-and-elasticsearch-8ad021aecd87">https://medium.com/@bcoste/powerful-logging-with-docker-filebeat-and-elasticsearch-8ad021aecd87</a></li>
<li><a target="_blank" href="https://www.elastic.co/guide/en/logstash/current/configuration.html">https://www.elastic.co/guide/en/logstash/current/configuration.html</a></li>
<li><a target="_blank" href="https://medium.com/lucjuggery/about-var-run-docker-sock-3bfd276e12fd">https://medium.com/lucjuggery/about-var-run-docker-sock-3bfd276e12fd</a></li>
</ol>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Building a GitHub Repo Explorer with React and Elasticsearch ]]>
                </title>
                <description>
                    <![CDATA[ By Divyanshu Maithani The GitXplore app. Elasticsearch is one of the most popular full-text search engines which allows you to search huge volumes of data quic... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/building-a-github-repo-explorer-with-react-and-elasticsearch-8e1190e59c13/</link>
                <guid isPermaLink="false">66d45e3d246e57ac83a2c729</guid>
                
                    <category>
                        <![CDATA[ elasticsearch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ JavaScript ]]>
                    </category>
                
                    <category>
                        <![CDATA[ open source ]]>
                    </category>
                
                    <category>
                        <![CDATA[ React ]]>
                    </category>
                
                    <category>
                        <![CDATA[ tech  ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Wed, 10 Jan 2018 21:57:37 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*QunMstJjXbPkRfFwRBVVkg.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Divyanshu Maithani</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*QunMstJjXbPkRfFwRBVVkg.png" alt="Image" width="800" height="443" loading="lazy">
<em>The <a target="_blank" href="https://appbaseio-apps.github.io/gitxplore-app/">GitXplore</a> app</em></p>
<p><a target="_blank" href="https://www.elastic.co/products/elasticsearch">Elasticsearch</a> is one of the most popular <a target="_blank" href="https://en.wikipedia.org/wiki/Full-text_search">full-text search</a> engines which allows you to search huge volumes of data quickly, while <a target="_blank" href="https://reactjs.org/">React</a> is arguably <a target="_blank" href="http://stateofjs.com/2017/front-end/results/">the best library</a> for building user interfaces. During the past few months I’ve been co-authoring an open-source library, <a target="_blank" href="https://github.com/appbaseio/reactivesearch"><strong>ReactiveSearch</strong></a>, which provides React components for Elasticsearch and simplifies the process of building a search User Interface (UI).</p>
<p>This is the app which I’ll be building in this story:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*KPB8Sq7N3WId2jL57VGT-Q.png" alt="Image" width="800" height="443" loading="lazy">
<em>Check out the app on <a target="_blank" href="https://codesandbox.io/s/github/appbaseio-apps/gitxplore-app/tree/master/">CodeSandbox</a></em></p>
<h3 id="heading-a-brief-idea-of-elasticsearch">A brief idea of Elasticsearch</h3>
<p>Elasticsearch is a <a target="_blank" href="https://en.wikipedia.org/wiki/NoSQL">NoSQL</a> database which can search through large amounts of data in a short time. It performs a <a target="_blank" href="https://en.wikipedia.org/wiki/Full-text_search">full-text search</a> on the data which is stored in the form of documents (like objects) by examining all the words in every document.</p>
<p>Here’s what the <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started.html">Elasticsearch docs</a> say:</p>
<blockquote>
<p>Elasticsearch is a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze big volumes of data quickly and in near real time.</p>
</blockquote>
<p>Even if you’ve never used Elasticsearch before, you should be able to follow along with this story and build your very own Elasticsearch-powered search using React and ReactiveSearch.</p>
<h3 id="heading-what-is-reactivesearch">What is ReactiveSearch?</h3>
<p><a target="_blank" href="https://github.com/appbaseio/reactivesearch">ReactiveSearch</a> is a React UI components library for Elasticsearch. In order to search data in Elasticsearch, you need to write <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html"><strong>queries</strong></a>. Then you will need to format and render the JSON data in your UI. ReactiveSearch simplifies the entire process since you don’t need to worry about writing these queries. This makes it easier to focus on creating the UI.</p>
<p>Here is an example that generates a search-box UI with category specific suggestions:</p>
<pre><code class="lang-js">&lt;CategorySearch
  componentId=<span class="hljs-string">"repo"</span>
  dataField={[<span class="hljs-string">"name"</span>, <span class="hljs-string">"name.raw"</span>]}
  categoryField=<span class="hljs-string">"language.raw"</span>
/&gt;
</code></pre>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*2wZ7uDqfizcjV9JCnre0mQ.png" alt="Image" width="411" height="271" loading="lazy">
<em>Component rendered from the above code</em></p>
<p>This would likely have taken us 100+ lines without the library, and knowledge of <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html">Elasticsearch Query DSL</a> to construct the query.</p>
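<p>To give a sense of what the component saves you from writing by hand, here is a rough sketch of the kind of request body a category-aware search box needs: a full-text query over the name fields plus an aggregation that collects the top categories. This is an assumption-laden illustration, not the query ReactiveSearch actually generates, and <code>buildCategorySearchQuery</code> is a hypothetical helper.</p>

```javascript
// Hypothetical helper: search several fields for the typed text and, in the
// same request, ask for the most common values of a category field.
function buildCategorySearchQuery(searchText, dataFields, categoryField) {
  return {
    query: {
      multi_match: { query: searchText, fields: dataFields }
    },
    aggs: {
      categories: {
        terms: { field: categoryField, size: 5 } // top 5 categories among the matches
      }
    }
  };
}

const request = buildCategorySearchQuery("react", ["name", "name.raw"], "language.raw");
console.log(JSON.stringify(request, null, 2));
```

<p>On top of this you would still need to send the request, handle loading and error states, and render the suggestions, which is where most of those extra lines would go.</p>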
<p>In this post, I’ll use different components from the library to build the final UI.</p>
<p>You should try out <a target="_blank" href="https://appbaseio-apps.github.io/gitxplore-app/">the final app</a> before we deep-dive. Here’s the <a target="_blank" href="https://codesandbox.io/s/github/appbaseio-apps/gitxplore-app/tree/master/">CodeSandbox link</a> for the same.</p>
<h3 id="heading-setting-things-up">Setting things up</h3>
<p>Before we start building the UI, we’ll need the dataset containing GitHub repositories in Elasticsearch. ReactiveSearch works with any Elasticsearch index and you can easily <a target="_blank" href="https://opensource.appbase.io/reactive-manual/getting-started/reactivebase.html">use it with your own dataset</a>.</p>
<p>For brevity, you can use <a target="_blank" href="https://opensource.appbase.io/dejavu/live/#?input_state=XQAAAAJDAQAAAAAAAAA9iIqnY-B2BnTZGEQz6wkFsoFSyhi0TotY1ZI3dCbzpZ5wZmCa4HoWjWiBHcRO1KpPWzrR3-ungbYF_FBD7IY3vlhuTW9dQQFtt3qksr-wGqyFf_qxW2Z3widjMRY5xGpv9lCIh4b5Dyi-O2wVMmUzKADc-0pG1tyzQ558Y_SoViZ27V2qq-px_fIGV-GVRTcrO-LdiYhDhtFK4tYVTak07UxRRvGaqeK3GI2sU7O67YnSdDZNv8_5pnc3SPxlPV9t9YdkGW3YkckG3LAVp03TbrSWI7GdN0fMZCgwqWv0FP1iNWHQrUW2v8-B___Y4BHg">my dataset</a> or clone it for yourself by following <a target="_blank" href="https://opensource.appbase.io/dejavu/live/#?input_state=XQAAAAJDAQAAAAAAAAA9iIqnY-B2BnTZGEQz6wkFsoFSyhi0TotY1ZI3dCbzpZ5wZmCa4HoWjWiBHcRO1KpPWzrR3-ungbYF_FBD7IY3vlhuTW9dQQFtt3qksr-wGqyFf_qxW2Z3widjMRY5xGpv9lCIh4b5Dyi-O2wVMmUzKADc-0pG1tyzQ558Y_SoViZ27V2qq-px_fIGV-GVRTcrO-LdiYhDhtFK4tYVTak07UxRRvGaqeK3GI2sU7O67YnSdDZNv8_5pnc3SPxlPV9t9YdkGW3YkckG3LAVp03TbrSWI7GdN0fMZCgwqWv0FP1iNWHQrUW2v8-B___Y4BHg">this link</a> and clicking on <em>Clone this App</em> button. This will let you make a copy of the dataset as your own app.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*5tXgJvJROclI-NFXUKziIQ.png" alt="Image" width="800" height="408" loading="lazy">
<em>The GitHub repo <a target="_blank" href="https://opensource.appbase.io/dejavu/live/#?input_state=XQAAAAJDAQAAAAAAAAA9iIqnY-B2BnTZGEQz6wkFsoFSyhi0TotY1ZI3dCbzpZ5wZmCa4HoWjWiBHcRO1KpPWzrR3-ungbYF_FBD7IY3vlhuTW9dQQFtt3qksr-wGqyFf_qxW2Z3widjMRY5xGpv9lCIh4b5Dyi-O2wVMmUzKADc-0pG1tyzQ558Y_SoViZ27V2qq-px_fIGV-GVRTcrO-LdiYhDhtFK4tYVTak07UxRRvGaqeK3GI2sU7O67YnSdDZNv8_5pnc3SPxlPV9t9YdkGW3YkckG3LAVp03TbrSWI7GdN0fMZCgwqWv0FP1iNWHQrUW2v8-B___Y4BHg">dataset</a></em></p>
<p>After you enter an app name, the cloning process should start importing the 26K+ repos to your account.</p>
<p>All the repos are structured in the following format:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"name"</span>: <span class="hljs-string">"freeCodeCamp"</span>,
  <span class="hljs-attr">"owner"</span>: <span class="hljs-string">"freeCodeCamp"</span>,
  <span class="hljs-attr">"fullname"</span>: <span class="hljs-string">"freeCodeCamp~freeCodeCamp"</span>,
  <span class="hljs-attr">"description"</span>: <span class="hljs-string">"The https://freeCodeCamp.org open source codebase and curriculum. Learn to code and help nonprofits."</span>,
  <span class="hljs-attr">"avatar"</span>: <span class="hljs-string">"https://avatars0.githubusercontent.com/u/9892522?v=4"</span>,
  <span class="hljs-attr">"url"</span>: <span class="hljs-string">"https://github.com/freeCodeCamp/freeCodeCamp"</span>,
  <span class="hljs-attr">"pushed"</span>: <span class="hljs-string">"2017-12-24T05:44:03Z"</span>,
  <span class="hljs-attr">"created"</span>: <span class="hljs-string">"2014-12-24T17:49:19Z"</span>,
  <span class="hljs-attr">"size"</span>: <span class="hljs-number">31474</span>,
  <span class="hljs-attr">"stars"</span>: <span class="hljs-number">291526</span>,
  <span class="hljs-attr">"forks"</span>: <span class="hljs-number">13211</span>,
  <span class="hljs-attr">"topics"</span>: [
    <span class="hljs-string">"careers"</span>,
    <span class="hljs-string">"certification"</span>,
    <span class="hljs-string">"community"</span>,
    <span class="hljs-string">"curriculum"</span>,
    <span class="hljs-string">"d3"</span>,
    <span class="hljs-string">"education"</span>,
    <span class="hljs-string">"javascript"</span>,
    <span class="hljs-string">"learn-to-code"</span>,
    <span class="hljs-string">"math"</span>,
    <span class="hljs-string">"nodejs"</span>,
    <span class="hljs-string">"nonprofits"</span>,
    <span class="hljs-string">"programming"</span>,
    <span class="hljs-string">"react"</span>,
    <span class="hljs-string">"teachers"</span>
  ],
  <span class="hljs-attr">"language"</span>: <span class="hljs-string">"JavaScript"</span>,
  <span class="hljs-attr">"watchers"</span>: <span class="hljs-number">8462</span>
}
</code></pre>
<ul>
<li>We will use <a target="_blank" href="https://github.com/facebookincubator/create-react-app">create-react-app</a> to set up the project. You can install create-react-app by running the following command in your terminal:</li>
</ul>
<pre><code class="lang-bash">npm install -g create-react-app
</code></pre>
<ul>
<li>After it’s installed, you can create a new project by running:</li>
</ul>
<pre><code class="lang-bash">create-react-app gitxplore
</code></pre>
<ul>
<li>After the project is set up, you can change into the project directory and install the ReactiveSearch dependency:</li>
</ul>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> gitxplore
npm install @appbaseio/reactivesearch
</code></pre>
<ul>
<li>You may also add the Font Awesome CDN, which we’ll be using for some icons, by inserting the following line in <code>/public/index.html</code> before the closing <code>&lt;/body&gt;</code> tag:</li>
</ul>
<pre><code class="lang-html"><span class="hljs-tag">&lt;<span class="hljs-name">script</span> <span class="hljs-attr">defer</span> <span class="hljs-attr">src</span>=<span class="hljs-string">"https://use.fontawesome.com/releases/v5.0.2/js/all.js"</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">script</span>&gt;</span>
</code></pre>
<h3 id="heading-diving-into-the-code">Diving into the code</h3>
<p>I’ll follow a simple directory structure for the app. Here are the important files:</p>
<pre><code>src
├── App.css               <span class="hljs-comment">// App styles</span>
├── App.js                <span class="hljs-comment">// App container</span>
├── components
│   ├── Header.js         <span class="hljs-comment">// Header component</span>
│   ├── Results.js        <span class="hljs-comment">// Results component</span>
│   ├── SearchFilters.js  <span class="hljs-comment">// Filters component</span>
│   └── Topic.js          <span class="hljs-comment">// rendered by Results</span>
├── index.css             <span class="hljs-comment">// styles</span>
├── index.js              <span class="hljs-comment">// ReactDOM render</span>
└── theme.js              <span class="hljs-comment">// colors and fonts</span>
public
└── index.html
</code></pre><p>Here’s the link to the <a target="_blank" href="https://github.com/appbaseio-apps/gitxplore-app">final repo</a> if you wish to reference anything at any point.</p>
<h4 id="heading-1-adding-styles">1. Adding styles</h4>
<p>I’ve written responsive styles for the app which you can copy into your app. Just fire up your favorite text editor and copy the styles for <code>/src/index.css</code> from <a target="_blank" href="https://github.com/appbaseio-apps/gitxplore-app/blob/master/src/index.css">here</a> and <code>/src/App.css</code> from <a target="_blank" href="https://github.com/appbaseio-apps/gitxplore-app/blob/master/src/App.css">here</a> respectively.</p>
<p>Now, create a file <code>/src/theme.js</code> where we’ll add the colors and fonts for our app:</p>
<pre><code class="lang-js"><span class="hljs-keyword">const</span> theme = {
    <span class="hljs-attr">typography</span>: {
        <span class="hljs-attr">fontFamily</span>: <span class="hljs-string">'Raleway, Helvetica, sans-serif'</span>,
    },
    <span class="hljs-attr">colors</span>: {
        <span class="hljs-attr">primaryColor</span>: <span class="hljs-string">'#008000'</span>,
        <span class="hljs-attr">titleColor</span>: <span class="hljs-string">'white'</span>
    },
    <span class="hljs-attr">secondaryColor</span>: <span class="hljs-string">'mediumseagreen'</span>,
};

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> theme;
</code></pre>
<h4 id="heading-2-adding-the-first-reactivesearch-component">2. Adding the first ReactiveSearch component</h4>
<p>All the ReactiveSearch components are wrapped in a container component, <a target="_blank" href="https://opensource.appbase.io/reactive-manual/getting-started/reactivebase.html"><strong>ReactiveBase</strong></a>, which provides data from Elasticsearch to its child ReactiveSearch components.</p>
<p>We’ll use this in <code>/src/App.js</code>:</p>
<pre><code class="lang-js"><span class="hljs-keyword">import</span> React, { Component } <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>;
<span class="hljs-keyword">import</span> { ReactiveBase } <span class="hljs-keyword">from</span> <span class="hljs-string">'@appbaseio/reactivesearch'</span>;
<span class="hljs-keyword">import</span> theme <span class="hljs-keyword">from</span> <span class="hljs-string">'./theme'</span>;
<span class="hljs-keyword">import</span> <span class="hljs-string">'./App.css'</span>;
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">App</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">Component</span> </span>{
  render() {
    <span class="hljs-keyword">return</span> (
      <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">section</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"container"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">ReactiveBase</span>
          <span class="hljs-attr">app</span>=<span class="hljs-string">"gitxplore-app"</span>
          <span class="hljs-attr">credentials</span>=<span class="hljs-string">"4oaS4Srzi:f6966181-1eb4-443c-8e0e-b7f38e7bc316"</span>
          <span class="hljs-attr">type</span>=<span class="hljs-string">"gitxplore-latest"</span>
          <span class="hljs-attr">theme</span>=<span class="hljs-string">{theme}</span>
        &gt;</span>
          <span class="hljs-tag">&lt;<span class="hljs-name">nav</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"navbar"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"title"</span>&gt;</span>GitXplore<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
          <span class="hljs-tag">&lt;/<span class="hljs-name">nav</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">ReactiveBase</span>&gt;</span>
      <span class="hljs-tag">&lt;/<span class="hljs-name">section</span>&gt;</span></span>
    );
  }
}
<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> App;
</code></pre>
<p>For the <code>app</code> and <code>credentials</code> props, you may use the ones I’ve provided here as is. If you cloned the dataset into your own app earlier, you can get them from the <a target="_blank" href="https://dashboard.appbase.io/credentials">app’s credentials page</a>. If you’re already familiar with Elasticsearch, you may instead pass a <code>url</code> prop referring to <a target="_blank" href="https://opensource.appbase.io/reactive-manual/getting-started/reactivebase.html#props">your own Elasticsearch cluster URL</a>.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*bI_3-Hej71eLbiVCGoK_hw.png" alt="Image" width="800" height="149" loading="lazy">
<em>Getting the app’s credentials from the appbase.io <a target="_blank" href="https://dashboard.appbase.io/credentials">dashboard</a>. Just copy the Read-only API key</em></p>
<p>Alternatively, you can also copy your app’s <code>credentials</code> from the <a target="_blank" href="https://dashboard.appbase.io/apps">apps dashboard</a>. Hover over your app’s card and click on <em>Copy Read Credentials</em>.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*5k1jVr3YHGBQ0Ts02gjL7g.png" alt="Image" width="394" height="179" loading="lazy">
<em>Alternative to the above link: copy the read credentials from the <a target="_blank" href="https://dashboard.appbase.io/apps">apps dashboard</a></em></p>
<p>After adding this, you should see a basic layout like this:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*P4WcAczVGDzrTm42prVSGA.png" alt="Image" width="800" height="500" loading="lazy">
<em>After adding the first ReactiveSearch component</em></p>
<h4 id="heading-3-adding-a-datasearch">3. Adding a DataSearch</h4>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*yNVLrcjB1KEz1X_1s4RHaQ.png" alt="Image" width="800" height="86" loading="lazy">
<em>DataSearch component</em></p>
<p>Next, I’ll be adding a <a target="_blank" href="https://opensource.appbase.io/reactive-manual/search-components/datasearch.html">DataSearch</a> component to search through repositories. It creates a search UI component and lets us search across one or more fields easily. The updated <code>render</code> function in <code>/src/App.js</code> would look like this:</p>
<pre><code class="lang-js"><span class="hljs-comment">// importing DataSearch here</span>
<span class="hljs-keyword">import</span> { ReactiveBase, DataSearch } <span class="hljs-keyword">from</span> <span class="hljs-string">'@appbaseio/reactivesearch'</span>;
...
&lt;ReactiveBase ... &gt;
<span class="hljs-comment">// Adding the DataSearch here</span>
    <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"flex row-reverse app-container"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"results-container"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">DataSearch</span>
                <span class="hljs-attr">componentId</span>=<span class="hljs-string">"repo"</span>
                <span class="hljs-attr">filterLabel</span>=<span class="hljs-string">"Search"</span>
                <span class="hljs-attr">dataField</span>=<span class="hljs-string">{[</span>'<span class="hljs-attr">name</span>', '<span class="hljs-attr">description</span>', '<span class="hljs-attr">name.raw</span>', '<span class="hljs-attr">fullname</span>', '<span class="hljs-attr">owner</span>', '<span class="hljs-attr">topics</span>']}
                <span class="hljs-attr">placeholder</span>=<span class="hljs-string">"Search Repos"</span>
                <span class="hljs-attr">autosuggest</span>=<span class="hljs-string">{false}</span>
                <span class="hljs-attr">iconPosition</span>=<span class="hljs-string">"left"</span>
                <span class="hljs-attr">URLParams</span>
                <span class="hljs-attr">className</span>=<span class="hljs-string">"data-search-container results-container"</span>
                <span class="hljs-attr">innerClass</span>=<span class="hljs-string">{{</span>
                    <span class="hljs-attr">input:</span> '<span class="hljs-attr">search-input</span>',
                }}
            /&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
&lt;/ReactiveBase&gt;
...
</code></pre>
<p>The <code>DataSearch</code> component goes inside the <code>ReactiveBase</code> component and receives all the necessary data from it so we don’t have to write Elasticsearch queries ourselves. The surrounding <code>div</code>s add some <code>className</code> properties for styling. These just add a layout to the app. You can go through all the styles at <code>/src/App.css</code> which we created earlier. You might have noticed that we have passed some props to the <code>DataSearch</code> component.</p>
<p>Here’s how they work:</p>
<ul>
<li><code>componentId</code>: a unique string identifier which we’ll use later to connect two different ReactiveSearch components.</li>
<li><code>filterLabel</code>: a string value which will show up in the filters menu later.</li>
<li><code>dataField</code>: an array of strings naming the Elasticsearch fields on which the search is performed. You can check <a target="_blank" href="https://opensource.appbase.io/dejavu/live/#?input_state=XQAAAAJiAQAAAAAAAAA9iIqnY-B2BnTZGEQz6wkFsg1HFhlgIIPlpmP5RRZ-FWEcoSd0PjkMiILXm8GQxirVSZVrDiQlmtqn4TuMTBL2E1thSmnTeiFPBGQoqmavHhOSSrRxNeEjhNKDeff0pgxw5r5nv8t-un2YUoHpv1HKzI9aZA8KH8WAmQ6XktDDO-Hn95KeD_KPXp_E76PZ04Hl6H6MrevzUojYDnGynyNwjmI07lj0kXZeqltXcATyP8PMY7ncPHlUw1p1cnfe2JXyFgzRzZcNo7xtVJiEPCuLLKzxYehuirtvUcy6oC_KC15q9kmkWssXUCkBr7dAugoFbtjO5zUdpOFWdcz2wcD3AA3--k7h&amp;editable=false">the dataset</a> and see that these fields match its column names. All fields specified here match the structure of the data: for example, <code>name</code> refers to the name of the repo and <code>description</code> refers to its description. The one exception is <code>name.raw</code>, which is a <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html">multi-field</a> of the <code>name</code> field. Elasticsearch can index the same data in different ways for different purposes, which we can use to get better search results.</li>
<li><code>placeholder</code>: sets the placeholder value in the input box.</li>
<li><code>autosuggest</code>: setting this prop to <code>false</code> disables suggestions, so the results list updates immediately as the search term changes.</li>
<li><code>iconPosition</code>: sets the position of the search icon.</li>
<li><code>URLParams</code>: is a <code>boolean</code> which tells the component to save the search term in the browser’s URL so we can share a URL to a specific search query. For example, check <a target="_blank" href="https://appbaseio-apps.github.io/gitxplore-app/?repo=%22react%22">this link</a> to see all results related to “react”.</li>
<li><code>className</code>: adds a <code>class</code> for styling using CSS.</li>
<li><code>innerClass</code>: adds a <code>class</code> to different sections of a component for styling using CSS. Here, I’ve added a <code>class</code> to the <code>input</code> box for styling. A detailed description can be found in the <a target="_blank" href="https://opensource.appbase.io/reactive-manual/search-components/datasearch.html#props">docs</a>.</li>
</ul>
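<p>To make the <code>dataField</code> and <code>URLParams</code> props a little more concrete, here is a minimal sketch in plain JavaScript. It reads the search term from a shared URL and builds an Elasticsearch <code>multi_match</code> query over the same fields. The helper names <code>termFromUrl</code> and <code>buildSearchQuery</code> are hypothetical, the JSON-encoded URL value is an assumption based on the example link above, and the query ReactiveSearch actually generates is likely more elaborate. This only illustrates the idea.</p>

```javascript
// Hypothetical helper: read the value stored under a componentId in the URL.
// Assumes the value is JSON-encoded, as in ?repo=%22react%22.
const termFromUrl = (url, componentId) => {
  const raw = new URL(url).searchParams.get(componentId);
  return raw === null ? '' : JSON.parse(raw);
};

// Hypothetical helper: an Elasticsearch multi_match query across several fields.
// This is NOT the exact query ReactiveSearch sends, just the general shape.
const buildSearchQuery = (term, fields) => ({
  query: {
    multi_match: { query: term, fields },
  },
});

const fields = ['name', 'description', 'name.raw', 'fullname', 'owner', 'topics'];
const term = termFromUrl(
  'https://appbaseio-apps.github.io/gitxplore-app/?repo=%22react%22',
  'repo'
);

console.log(JSON.stringify(buildSearchQuery(term, fields), null, 2));
```

<p>The point of the sketch is that the component, not your code, owns this translation: you list the fields once and the query follows from them.</p>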
<p>With this, our app should get a working search bar:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*OLNYIuRpYi9AuPckJ9G_4w.png" alt="Image" width="800" height="500" loading="lazy">
<em>Adding DataSearch component</em></p>
<h4 id="heading-4-adding-the-results-view">4. Adding the Results view</h4>
<p>Next, we’ll be adding the <code>Results</code> component at <code>/src/components/Results.js</code> and importing it in <code>/src/App.js</code>.</p>
<p>Here’s how you can write the <code>Results</code> component:</p>
<pre><code class="lang-js"><span class="hljs-keyword">import</span> React <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>;
<span class="hljs-keyword">import</span> { SelectedFilters, ReactiveList } <span class="hljs-keyword">from</span> <span class="hljs-string">'@appbaseio/reactivesearch'</span>;
<span class="hljs-keyword">const</span> onResultStats = <span class="hljs-function">(<span class="hljs-params">results, time</span>) =&gt;</span> (
  <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"flex justify-end"</span>&gt;</span>
    {results} results found in {time}ms
  <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
);
<span class="hljs-keyword">const</span> onData = <span class="hljs-function">(<span class="hljs-params">data</span>) =&gt;</span> (
  <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"result-item"</span> <span class="hljs-attr">key</span>=<span class="hljs-string">{data.fullname}</span>&gt;</span>
    {data.owner}/{data.name}
  <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
);
<span class="hljs-keyword">const</span> Results = <span class="hljs-function">() =&gt;</span> (
  <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"result-list"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">SelectedFilters</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"m1"</span> /&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">ReactiveList</span>
      <span class="hljs-attr">componentId</span>=<span class="hljs-string">"results"</span>
      <span class="hljs-attr">dataField</span>=<span class="hljs-string">"name"</span>
      <span class="hljs-attr">onData</span>=<span class="hljs-string">{onData}</span>
      <span class="hljs-attr">onResultStats</span>=<span class="hljs-string">{onResultStats}</span>
      <span class="hljs-attr">react</span>=<span class="hljs-string">{{</span>
        <span class="hljs-attr">and:</span> ['<span class="hljs-attr">repo</span>'],
      }}
      <span class="hljs-attr">pagination</span>
      <span class="hljs-attr">innerClass</span>=<span class="hljs-string">{{</span>
        <span class="hljs-attr">list:</span> '<span class="hljs-attr">result-list-container</span>',
        <span class="hljs-attr">pagination:</span> '<span class="hljs-attr">result-list-pagination</span>',
        <span class="hljs-attr">resultsInfo:</span> '<span class="hljs-attr">result-list-info</span>',
        <span class="hljs-attr">poweredBy:</span> '<span class="hljs-attr">powered-by</span>',
      }}
      <span class="hljs-attr">size</span>=<span class="hljs-string">{6}</span>
    /&gt;</span>
  <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
);
<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> Results;
</code></pre>
<p>I’ve imported two new components from ReactiveSearch, <code>SelectedFilters</code> and <code>ReactiveList</code>. <a target="_blank" href="https://opensource.appbase.io/reactive-manual/base-components/selectedfilters.html">SelectedFilters</a> will render the filters for our ReactiveSearch components at one place:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*FAROoZa3fhXuE5-H8_FJog.png" alt="Image" width="230" height="45" loading="lazy">
<em>SelectedFilters renders removable filters</em></p>
<p><a target="_blank" href="https://opensource.appbase.io/reactive-manual/result-components/reactivelist.html">ReactiveList</a> renders the search results. Here’s how its props work:</p>
<ul>
<li><code>dataField</code>: orders the results using the <code>name</code> field here.</li>
<li><code>onData</code>: accepts a function which returns <a target="_blank" href="https://reactjs.org/docs/glossary.html#jsx">JSX</a>. The function is passed each result individually. Here we’re generating a basic UI which we’ll modify later.</li>
<li><code>onResultStats</code>: similar to <code>onData</code> but for the result stats. The function is passed the number of <code>results</code> found and <code>time</code> taken.</li>
<li><code>react</code>: the <a target="_blank" href="https://opensource.appbase.io/reactive-manual/advanced/react.html"><code>react</code></a> prop tells the <code>ReactiveList</code> to listen to changes made by the <code>DataSearch</code> component. We’ve provided the <code>componentId</code> of the <code>DataSearch</code> component, <code>repo</code>, here. Later we’ll add more components here.</li>
<li><code>pagination</code>: a <code>boolean</code> which tells the ReactiveList to split the results into pages, each page containing the number of results specified in the <code>size</code> prop.</li>
</ul>
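<p>As a rough illustration of what <code>pagination</code> and <code>size</code> mean at the Elasticsearch level, a search request accepts <code>from</code> and <code>size</code> parameters, and a zero-based page number maps to an offset of <code>page * size</code>. The <code>pageWindow</code> helper below is hypothetical; <code>ReactiveList</code> handles this internally, so you would not normally write it yourself.</p>

```javascript
// Hypothetical helper: map a zero-based page number to Elasticsearch's
// from/size pagination parameters.
const pageWindow = (page, size) => ({ from: page * size, size });

console.log(pageWindow(0, 6)); // first page: results 0 to 5
console.log(pageWindow(2, 6)); // third page: results 12 to 17
```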
<p>Now we can <code>import</code> and use the <code>Results</code> component in <code>/src/App.js</code>. Just add it inside the <code>div</code> with <code>results-container</code> class.</p>
<pre><code class="lang-js">...
import Results <span class="hljs-keyword">from</span> <span class="hljs-string">'./components/Results'</span>;
...
render() {
  <span class="hljs-keyword">return</span>(
    ...
    &lt;div className=<span class="hljs-string">"results-container"</span>&gt;
      <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">DataSearch</span> <span class="hljs-attr">...</span> /&gt;</span></span>
      <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">Results</span> /&gt;</span></span>
    &lt;/div&gt;
    ...
  )
}
</code></pre>
<p>With this component, a basic version of our search UI should start coming together:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*txovjxQldUv-T2V5ALzP_Q.png" alt="Image" width="800" height="500" loading="lazy">
<em>Adding the Results component</em></p>
<h4 id="heading-5-adding-a-header-component">5. Adding a Header component</h4>
<p>Let’s create a <code>Header</code> component at <code>/src/components/Header.js</code> which we’ll use to render more search filters.</p>
<p>Here’s how to create a simple <code>Header</code> component:</p>
<pre><code class="lang-js"><span class="hljs-keyword">import</span> React, { Component } <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>;

<span class="hljs-keyword">import</span> SearchFilters <span class="hljs-keyword">from</span> <span class="hljs-string">'./SearchFilters'</span>;

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Header</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">Component</span> </span>{
    <span class="hljs-keyword">constructor</span>(props) {
        <span class="hljs-built_in">super</span>(props);
        <span class="hljs-built_in">this</span>.state = {
            <span class="hljs-attr">visible</span>: <span class="hljs-literal">false</span>,
        };
    }

    toggleVisibility = <span class="hljs-function">() =&gt;</span> {
        <span class="hljs-keyword">const</span> visible = !<span class="hljs-built_in">this</span>.state.visible;
        <span class="hljs-built_in">this</span>.setState({
            visible,
        });
    }

    render() {
        <span class="hljs-keyword">return</span> (
            <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">nav</span> <span class="hljs-attr">className</span>=<span class="hljs-string">{</span>`<span class="hljs-attr">navbar</span> ${<span class="hljs-attr">this.state.visible</span> ? '<span class="hljs-attr">active</span>' <span class="hljs-attr">:</span> ''}`}&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"title"</span>&gt;</span>GitXplore<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"btn toggle-btn"</span> <span class="hljs-attr">onClick</span>=<span class="hljs-string">{this.toggleVisibility}</span>&gt;</span>Toggle Filters<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">SearchFilters</span> {<span class="hljs-attr">...this.props</span>} <span class="hljs-attr">visible</span>=<span class="hljs-string">{this.state.visible}</span> /&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">nav</span>&gt;</span></span>
        );
    }
}

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> Header;
</code></pre>
<p>I’ve moved the navigation code in <code>&lt;nav&gt;..&lt;/nav&gt;</code> from <code>/src/App.js</code> here. The <code>Header</code> component has a method which toggles <code>visible</code> in the state. We’re using this to add a class which makes the header take up the entire screen on mobile layouts. I’ve also added a toggle button which calls the <code>toggleVisibility</code> method.</p>
<p>The <code>Header</code> component also renders another component called <code>SearchFilters</code> and passes it all the props from the parent <code>App</code> component. Let’s create this component to see things in action.</p>
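<p>The class toggling can be tried outside React. The sketch below mirrors the template literal from <code>render</code>; the names <code>navClassName</code> and <code>toggle</code> are made up for illustration and are not part of the app.</p>

```javascript
// Mirrors the template literal used for className in Header's render method.
const navClassName = (visible) => `navbar ${visible ? 'active' : ''}`;

// Mirrors toggleVisibility: flip the boolean and return the next state.
const toggle = (state) => ({ visible: !state.visible });

let state = { visible: false };
console.log(JSON.stringify(navClassName(state.visible))); // filters hidden

state = toggle(state);
console.log(JSON.stringify(navClassName(state.visible))); // filters shown
```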
<p>Create a new file <code>/src/components/SearchFilters.js</code>:</p>
<pre><code class="lang-js"><span class="hljs-keyword">import</span> React <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>;
<span class="hljs-keyword">const</span> SearchFilters = <span class="hljs-function">() =&gt;</span> (
    <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span>
        Search filters go here!
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
);
<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> SearchFilters;
</code></pre>
<p>Next, I’ll update the <code>App</code> component to use the <code>Header</code> component that we created just now.</p>
<h4 id="heading-6-updating-app-component-and-handling-topics-in-state">6. Updating App component and handling topics in state</h4>
<p>We’ll add a <code>state</code> variable called <code>currentTopics</code> to the <code>App</code> component. It will hold an array of the currently selected topics in the app.</p>
<p>We’ll then use the <code>currentTopics</code> and pass them to the <code>Header</code> and <code>Results</code> components:</p>
<pre><code class="lang-js"><span class="hljs-keyword">import</span> React, { Component } <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>;
<span class="hljs-keyword">import</span> { ReactiveBase, DataSearch } <span class="hljs-keyword">from</span> <span class="hljs-string">'@appbaseio/reactivesearch'</span>;

<span class="hljs-keyword">import</span> Header <span class="hljs-keyword">from</span> <span class="hljs-string">'./components/Header'</span>;
<span class="hljs-keyword">import</span> Results <span class="hljs-keyword">from</span> <span class="hljs-string">'./components/Results'</span>;

<span class="hljs-keyword">import</span> theme <span class="hljs-keyword">from</span> <span class="hljs-string">'./theme'</span>;
<span class="hljs-keyword">import</span> <span class="hljs-string">'./App.css'</span>;

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">App</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">Component</span> </span>{
    <span class="hljs-keyword">constructor</span>(props) {
        <span class="hljs-built_in">super</span>(props);
        <span class="hljs-built_in">this</span>.state = {
            <span class="hljs-attr">currentTopics</span>: [],
        };
    }

    setTopics = <span class="hljs-function">(<span class="hljs-params">currentTopics</span>) =&gt;</span> {
        <span class="hljs-built_in">this</span>.setState({
            <span class="hljs-attr">currentTopics</span>: currentTopics || [],
        });
    }

    toggleTopic = <span class="hljs-function">(<span class="hljs-params">topic</span>) =&gt;</span> {
        <span class="hljs-keyword">const</span> { currentTopics } = <span class="hljs-built_in">this</span>.state;
        <span class="hljs-keyword">const</span> nextState = currentTopics.includes(topic)
            ? currentTopics.filter(<span class="hljs-function"><span class="hljs-params">item</span> =&gt;</span> item !== topic)
            : currentTopics.concat(topic);
        <span class="hljs-built_in">this</span>.setState({
            <span class="hljs-attr">currentTopics</span>: nextState,
        });
    }

    render() {
        <span class="hljs-keyword">return</span> (
            <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">section</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"container"</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">ReactiveBase</span>
                    <span class="hljs-attr">app</span>=<span class="hljs-string">"gitxplore-app"</span>
                    <span class="hljs-attr">credentials</span>=<span class="hljs-string">"4oaS4Srzi:f6966181-1eb4-443c-8e0e-b7f38e7bc316"</span>
                    <span class="hljs-attr">type</span>=<span class="hljs-string">"gitxplore-latest"</span>
                    <span class="hljs-attr">theme</span>=<span class="hljs-string">{theme}</span>
                &gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"flex row-reverse app-container"</span>&gt;</span>
                        <span class="hljs-tag">&lt;<span class="hljs-name">Header</span> <span class="hljs-attr">currentTopics</span>=<span class="hljs-string">{this.state.currentTopics}</span> <span class="hljs-attr">setTopics</span>=<span class="hljs-string">{this.setTopics}</span> /&gt;</span>
                        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"results-container"</span>&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">DataSearch</span>
                                <span class="hljs-attr">componentId</span>=<span class="hljs-string">"repo"</span>
                                <span class="hljs-attr">filterLabel</span>=<span class="hljs-string">"Search"</span>
                                <span class="hljs-attr">dataField</span>=<span class="hljs-string">{[</span>'<span class="hljs-attr">name</span>', '<span class="hljs-attr">description</span>', '<span class="hljs-attr">name.raw</span>', '<span class="hljs-attr">fullname</span>', '<span class="hljs-attr">owner</span>', '<span class="hljs-attr">topics</span>']}
                                <span class="hljs-attr">placeholder</span>=<span class="hljs-string">"Search Repos"</span>
                                <span class="hljs-attr">iconPosition</span>=<span class="hljs-string">"left"</span>
                                <span class="hljs-attr">autosuggest</span>=<span class="hljs-string">{false}</span>
                                <span class="hljs-attr">URLParams</span>
                                <span class="hljs-attr">className</span>=<span class="hljs-string">"data-search-container results-container"</span>
                                <span class="hljs-attr">innerClass</span>=<span class="hljs-string">{{</span>
                                    <span class="hljs-attr">input:</span> '<span class="hljs-attr">search-input</span>',
                                }}
                            /&gt;</span>
                            <span class="hljs-tag">&lt;<span class="hljs-name">Results</span> <span class="hljs-attr">currentTopics</span>=<span class="hljs-string">{this.state.currentTopics}</span> <span class="hljs-attr">toggleTopic</span>=<span class="hljs-string">{this.toggleTopic}</span> /&gt;</span>
                        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                <span class="hljs-tag">&lt;/<span class="hljs-name">ReactiveBase</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">section</span>&gt;</span></span>
        );
    }
}

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> App;
</code></pre>
<p>The <code>setTopics</code> method will set whichever topics are passed to it, which we’ll pass to the <code>Header</code> component. The <code>toggleTopic</code> method will remove a topic from the <code>state</code> in <code>currentTopics</code> if it’s already present and add the topic if it is not present.</p>
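<p>To make the toggle behavior concrete, here’s a minimal sketch of the same add-or-remove logic as a standalone function. The name <code>toggleTopicIn</code> is hypothetical and isn’t part of the app. It only illustrates the rule described above and returns a new array instead of touching any component <code>state</code>.</p>
<pre><code class="lang-js">// Hypothetical helper (not part of the app): returns a new array
// with `topic` removed if it was present, or appended if it was not.
function toggleTopicIn(currentTopics, topic) {
    if (currentTopics.includes(topic)) {
        // Already selected: drop it from the list.
        return currentTopics.filter(function (item) {
            return item !== topic;
        });
    }
    // Not selected yet: add it to the end.
    return [...currentTopics, topic];
}

console.log(toggleTopicIn(['react', 'redux'], 'react')); // [ 'redux' ]
console.log(toggleTopicIn(['react'], 'vue')); // [ 'react', 'vue' ]
</code></pre>
<p>Returning a fresh array rather than mutating the old one keeps the helper safe to use with React state updates.</p>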
<p>We’ll pass the <code>toggleTopic</code> method to the <code>Results</code> component:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*3Or7_Pwz3wpkUcDv-gryFQ.png" alt="Image" width="800" height="500" loading="lazy">
<em>It’s starting to come together, cheers!</em></p>
<h4 id="heading-7-adding-more-filters">7. Adding more filters</h4>
<p>Let’s add more filters to the UI in <code>/src/components/SearchFilters.js</code>. I’ll be using three new components from ReactiveSearch here: <code>MultiDropdownList</code>, <code>SingleDropdownRange</code> and <code>RangeSlider</code>. They’re used much like the <code>DataSearch</code> component we used earlier.</p>
<p>Here’s the code:</p>
<pre><code class="lang-js"><span class="hljs-keyword">import</span> React <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>;
<span class="hljs-keyword">import</span> PropTypes <span class="hljs-keyword">from</span> <span class="hljs-string">'prop-types'</span>;
<span class="hljs-keyword">import</span> {
    MultiDropdownList,
    SingleDropdownRange,
    RangeSlider,
} <span class="hljs-keyword">from</span> <span class="hljs-string">'@appbaseio/reactivesearch'</span>;

<span class="hljs-keyword">const</span> SearchFilters = <span class="hljs-function">(<span class="hljs-params">{ currentTopics, setTopics, visible }</span>) =&gt;</span> (
    <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">{</span>`<span class="hljs-attr">flex</span> <span class="hljs-attr">column</span> <span class="hljs-attr">filters-container</span> ${!<span class="hljs-attr">visible</span> ? '<span class="hljs-attr">hidden</span>' <span class="hljs-attr">:</span> ''}`}&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"child m10"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">MultiDropdownList</span>
                <span class="hljs-attr">componentId</span>=<span class="hljs-string">"language"</span>
                <span class="hljs-attr">dataField</span>=<span class="hljs-string">"language.raw"</span>
                <span class="hljs-attr">placeholder</span>=<span class="hljs-string">"Select languages"</span>
                <span class="hljs-attr">title</span>=<span class="hljs-string">"Language"</span>
                <span class="hljs-attr">filterLabel</span>=<span class="hljs-string">"Language"</span>
            /&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"child m10"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">MultiDropdownList</span>
                <span class="hljs-attr">componentId</span>=<span class="hljs-string">"topics"</span>
                <span class="hljs-attr">dataField</span>=<span class="hljs-string">"topics.raw"</span>
                <span class="hljs-attr">placeholder</span>=<span class="hljs-string">"Select topics"</span>
                <span class="hljs-attr">title</span>=<span class="hljs-string">"Repo Topics"</span>
                <span class="hljs-attr">filterLabel</span>=<span class="hljs-string">"Topics"</span>
                <span class="hljs-attr">size</span>=<span class="hljs-string">{1000}</span>
                <span class="hljs-attr">queryFormat</span>=<span class="hljs-string">"and"</span>
                <span class="hljs-attr">defaultSelected</span>=<span class="hljs-string">{currentTopics}</span>
                <span class="hljs-attr">onValueChange</span>=<span class="hljs-string">{setTopics}</span>
            /&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"child m10"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">SingleDropdownRange</span>
                <span class="hljs-attr">componentId</span>=<span class="hljs-string">"pushed"</span>
                <span class="hljs-attr">dataField</span>=<span class="hljs-string">"pushed"</span>
                <span class="hljs-attr">placeholder</span>=<span class="hljs-string">"Repo last active"</span>
                <span class="hljs-attr">title</span>=<span class="hljs-string">"Last Active"</span>
                <span class="hljs-attr">filterLabel</span>=<span class="hljs-string">"Last Active"</span>
                <span class="hljs-attr">data</span>=<span class="hljs-string">{[</span>
                    { <span class="hljs-attr">start:</span> '<span class="hljs-attr">now-1M</span>', <span class="hljs-attr">end:</span> '<span class="hljs-attr">now</span>', <span class="hljs-attr">label:</span> '<span class="hljs-attr">Last</span> <span class="hljs-attr">30</span> <span class="hljs-attr">days</span>' },
                    { <span class="hljs-attr">start:</span> '<span class="hljs-attr">now-6M</span>', <span class="hljs-attr">end:</span> '<span class="hljs-attr">now</span>', <span class="hljs-attr">label:</span> '<span class="hljs-attr">Last</span> <span class="hljs-attr">6</span> <span class="hljs-attr">months</span>' },
                    { <span class="hljs-attr">start:</span> '<span class="hljs-attr">now-1y</span>', <span class="hljs-attr">end:</span> '<span class="hljs-attr">now</span>', <span class="hljs-attr">label:</span> '<span class="hljs-attr">Last</span> <span class="hljs-attr">year</span>' },
                ]}
            /&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"child m10"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">SingleDropdownRange</span>
                <span class="hljs-attr">componentId</span>=<span class="hljs-string">"created"</span>
                <span class="hljs-attr">dataField</span>=<span class="hljs-string">"created"</span>
                <span class="hljs-attr">placeholder</span>=<span class="hljs-string">"Repo created"</span>
                <span class="hljs-attr">title</span>=<span class="hljs-string">"Created"</span>
                <span class="hljs-attr">filterLabel</span>=<span class="hljs-string">"Created"</span>
                <span class="hljs-attr">data</span>=<span class="hljs-string">{[</span>
                    {
                        <span class="hljs-attr">start:</span> '<span class="hljs-attr">2017-01-01T00:00:00Z</span>',
                        <span class="hljs-attr">end:</span> '<span class="hljs-attr">2017-12-31T23:59:59Z</span>',
                        <span class="hljs-attr">label:</span> '<span class="hljs-attr">2017</span>',
                    },
                    {
                        <span class="hljs-attr">start:</span> '<span class="hljs-attr">2016-01-01T00:00:00Z</span>',
                        <span class="hljs-attr">end:</span> '<span class="hljs-attr">2016-12-31T23:59:59Z</span>',
                        <span class="hljs-attr">label:</span> '<span class="hljs-attr">2016</span>',
                    },
                    {
                        <span class="hljs-attr">start:</span> '<span class="hljs-attr">2015-01-01T00:00:00Z</span>',
                        <span class="hljs-attr">end:</span> '<span class="hljs-attr">2015-12-31T23:59:59Z</span>',
                        <span class="hljs-attr">label:</span> '<span class="hljs-attr">2015</span>',
                    },
                    {
                        <span class="hljs-attr">start:</span> '<span class="hljs-attr">2014-01-01T00:00:00Z</span>',
                        <span class="hljs-attr">end:</span> '<span class="hljs-attr">2014-12-31T23:59:59Z</span>',
                        <span class="hljs-attr">label:</span> '<span class="hljs-attr">2014</span>',
                    },
                    {
                        <span class="hljs-attr">start:</span> '<span class="hljs-attr">2013-01-01T00:00:00Z</span>',
                        <span class="hljs-attr">end:</span> '<span class="hljs-attr">2013-12-31T23:59:59Z</span>',
                        <span class="hljs-attr">label:</span> '<span class="hljs-attr">2013</span>',
                    },
                    {
                        <span class="hljs-attr">start:</span> '<span class="hljs-attr">2012-01-01T00:00:00Z</span>',
                        <span class="hljs-attr">end:</span> '<span class="hljs-attr">2012-12-31T23:59:59Z</span>',
                        <span class="hljs-attr">label:</span> '<span class="hljs-attr">2012</span>',
                    },
                    {
                        <span class="hljs-attr">start:</span> '<span class="hljs-attr">2011-01-01T00:00:00Z</span>',
                        <span class="hljs-attr">end:</span> '<span class="hljs-attr">2011-12-31T23:59:59Z</span>',
                        <span class="hljs-attr">label:</span> '<span class="hljs-attr">2011</span>',
                    },
                    {
                        <span class="hljs-attr">start:</span> '<span class="hljs-attr">2010-01-01T00:00:00Z</span>',
                        <span class="hljs-attr">end:</span> '<span class="hljs-attr">2010-12-31T23:59:59Z</span>',
                        <span class="hljs-attr">label:</span> '<span class="hljs-attr">2010</span>',
                    },
                    {
                        <span class="hljs-attr">start:</span> '<span class="hljs-attr">2009-01-01T00:00:00Z</span>',
                        <span class="hljs-attr">end:</span> '<span class="hljs-attr">2009-12-31T23:59:59Z</span>',
                        <span class="hljs-attr">label:</span> '<span class="hljs-attr">2009</span>',
                    },
                    {
                        <span class="hljs-attr">start:</span> '<span class="hljs-attr">2008-01-01T00:00:00Z</span>',
                        <span class="hljs-attr">end:</span> '<span class="hljs-attr">2008-12-31T23:59:59Z</span>',
                        <span class="hljs-attr">label:</span> '<span class="hljs-attr">2008</span>',
                    },
                    {
                        <span class="hljs-attr">start:</span> '<span class="hljs-attr">2007-01-01T00:00:00Z</span>',
                        <span class="hljs-attr">end:</span> '<span class="hljs-attr">2007-12-31T23:59:59Z</span>',
                        <span class="hljs-attr">label:</span> '<span class="hljs-attr">2007</span>',
                    },
                ]}
            /&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"child m10"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">RangeSlider</span>
                <span class="hljs-attr">componentId</span>=<span class="hljs-string">"stars"</span>
                <span class="hljs-attr">title</span>=<span class="hljs-string">"Repo Stars"</span>
                <span class="hljs-attr">dataField</span>=<span class="hljs-string">"stars"</span>
                <span class="hljs-attr">range</span>=<span class="hljs-string">{{</span> <span class="hljs-attr">start:</span> <span class="hljs-attr">0</span>, <span class="hljs-attr">end:</span> <span class="hljs-attr">300000</span> }}
                <span class="hljs-attr">showHistogram</span>=<span class="hljs-string">{false}</span>
                <span class="hljs-attr">rangeLabels</span>=<span class="hljs-string">{{</span>
                    <span class="hljs-attr">start:</span> '<span class="hljs-attr">0</span> <span class="hljs-attr">Stars</span>',
                    <span class="hljs-attr">end:</span> '<span class="hljs-attr">300K</span> <span class="hljs-attr">Stars</span>',
                }}
                <span class="hljs-attr">innerClass</span>=<span class="hljs-string">{{</span>
                    <span class="hljs-attr">label:</span> '<span class="hljs-attr">range-label</span>',
                }}
            /&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"child m10"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">RangeSlider</span>
                <span class="hljs-attr">componentId</span>=<span class="hljs-string">"forks"</span>
                <span class="hljs-attr">title</span>=<span class="hljs-string">"Repo Forks"</span>
                <span class="hljs-attr">dataField</span>=<span class="hljs-string">"forks"</span>
                <span class="hljs-attr">range</span>=<span class="hljs-string">{{</span> <span class="hljs-attr">start:</span> <span class="hljs-attr">0</span>, <span class="hljs-attr">end:</span> <span class="hljs-attr">180500</span> }}
                <span class="hljs-attr">showHistogram</span>=<span class="hljs-string">{false}</span>
                <span class="hljs-attr">rangeLabels</span>=<span class="hljs-string">{{</span>
                    <span class="hljs-attr">start:</span> '<span class="hljs-attr">0</span> <span class="hljs-attr">Forks</span>',
                    <span class="hljs-attr">end:</span> '<span class="hljs-attr">180K</span> <span class="hljs-attr">Forks</span>',
                }}
                <span class="hljs-attr">innerClass</span>=<span class="hljs-string">{{</span>
                    <span class="hljs-attr">label:</span> '<span class="hljs-attr">range-label</span>',
                }}
            /&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
);

SearchFilters.propTypes = {
    <span class="hljs-attr">currentTopics</span>: PropTypes.arrayOf(PropTypes.string),
    <span class="hljs-attr">setTopics</span>: PropTypes.func,
    <span class="hljs-attr">visible</span>: PropTypes.bool,
};

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> SearchFilters;
</code></pre>
<p>The <code>SearchFilters</code> component we’ve created above takes in three props from the <code>Header</code> component: <code>currentTopics</code>, <code>setTopics</code> and <code>visible</code>. The <code>visible</code> prop is only used to add a <code>className</code> for styling.</p>
<p>The first component we’ve used here is a <a target="_blank" href="https://opensource.appbase.io/reactive-manual/list-components/multidropdownlist.html"><code>MultiDropdownList</code></a>, which renders a dropdown component to select multiple options. The first <code>MultiDropdownList</code> has a <code>dataField</code> of <code>language.raw</code>. It’ll populate itself with all the languages available in the repositories dataset.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*BwqnE21ABLW5q-xeCocmFA.png" alt="Image" width="364" height="341" loading="lazy">
<em>The language <a target="_blank" href="https://opensource.appbase.io/reactive-manual/list-components/multidropdownlist.html">MultiDropdownList</a></em></p>
<p>We’ve used another <code>MultiDropdownList</code> to render a list of topics:</p>
<pre><code class="lang-js">&lt;MultiDropdownList
    componentId=<span class="hljs-string">"topics"</span>
    dataField=<span class="hljs-string">"topics.raw"</span>
    placeholder=<span class="hljs-string">"Select topics"</span>
    title=<span class="hljs-string">"Repo Topics"</span>
    filterLabel=<span class="hljs-string">"Topics"</span>
    size={<span class="hljs-number">1000</span>}
    queryFormat=<span class="hljs-string">"and"</span>
    defaultSelected={currentTopics}
    onValueChange={setTopics}
/&gt;
</code></pre>
<p>Here’s how the props work here:</p>
<ul>
<li><code>componentId</code>: as with the previous ReactiveSearch components, this is a unique identifier. We’ll reference it later in the <code>Results</code> component so that this filter affects the search results.</li>
<li><code>dataField</code>: maps the component to the <code>topics.raw</code> field in Elasticsearch.</li>
<li><code>placeholder</code>: sets the placeholder value when nothing is selected.</li>
<li><code>title</code>: adds a title for the component in the UI.</li>
<li><code>filterLabel</code>: sets the label of the components in the removable filters (the <code>SelectedFilters</code> which we used in the <code>Results</code> component).</li>
<li><code>size</code>: tells the component to render a maximum of <code>1000</code> items in the list.</li>
<li><code>queryFormat</code>: when set to <code>'and'</code>, as we’ve done here, it returns only the results that match all of the selected tags (exactly like a set <a target="_blank" href="https://en.wikipedia.org/wiki/Intersection_(set_theory)">intersection</a>).</li>
<li><code>defaultSelected</code>: sets the selected items in the component. Here we’re passing <code>currentTopics</code> which we’ve stored in the <code>state</code> at <code>/src/App.js</code>.</li>
<li><code>onValueChange</code>: a function that the component calls whenever its value changes. Here we pass the <code>setTopics</code> function which we received in the props. Therefore, whenever we select or deselect a value in the component, it updates the <code>currentTopics</code> in the <code>state</code> of the main <code>App</code> component.</li>
</ul>
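<p>To get a feel for what <code>queryFormat</code> changes on the Elasticsearch side, here’s a rough sketch. The function names are hypothetical, and I’m assuming a <code>bool</code> query with one <code>term</code> clause per selected value for the <code>'and'</code> case. The exact query that ReactiveSearch generates may differ.</p>
<pre><code class="lang-js">// Hypothetical helpers for illustration only. They are not
// ReactiveSearch APIs, and the real generated query may differ.

// "and": every selected value must be present on the document.
function buildAndTermsQuery(field, values) {
    return {
        bool: {
            must: values.map(function (value) {
                return { term: { [field]: value } };
            }),
        },
    };
}

// "or" (shown for contrast): any one of the selected values is enough.
function buildOrTermsQuery(field, values) {
    return { terms: { [field]: values } };
}

const selected = ['react', 'redux'];
console.log(JSON.stringify(buildAndTermsQuery('topics.raw', selected), null, 2));
console.log(JSON.stringify(buildOrTermsQuery('topics.raw', selected), null, 2));
</code></pre>
<p>With <code>'and'</code>, a repo tagged only with <code>react</code> would not match the selection above, because it’s missing <code>redux</code>.</p>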
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*GPqMulHiFqd35bXmay2Cww.png" alt="Image" width="365" height="347" loading="lazy">
<em>The topics MultiDropdownList component</em></p>
<p>The next ReactiveSearch component we’ve used here is a <a target="_blank" href="https://opensource.appbase.io/reactive-manual/range-components/singledropdownrange.html"><code>SingleDropdownRange</code></a>. It uses a new prop called <a target="_blank" href="https://opensource.appbase.io/reactive-manual/range-components/singledropdownrange.html#props"><code>data</code></a>.</p>
<p>Here’s how it works:</p>
<pre><code class="lang-js">&lt;SingleDropdownRange
    ...
    data={[
        { <span class="hljs-attr">start</span>: <span class="hljs-string">'now-1M'</span>, <span class="hljs-attr">end</span>: <span class="hljs-string">'now'</span>, <span class="hljs-attr">label</span>: <span class="hljs-string">'Last 30 days'</span> },
        { <span class="hljs-attr">start</span>: <span class="hljs-string">'now-6M'</span>, <span class="hljs-attr">end</span>: <span class="hljs-string">'now'</span>, <span class="hljs-attr">label</span>: <span class="hljs-string">'Last 6 months'</span> },
        { <span class="hljs-attr">start</span>: <span class="hljs-string">'now-1y'</span>, <span class="hljs-attr">end</span>: <span class="hljs-string">'now'</span>, <span class="hljs-attr">label</span>: <span class="hljs-string">'Last year'</span> },
    ]}
/&gt;
</code></pre>
<p>The <code>data</code> prop accepts an array of objects with <code>start</code> and <code>end</code> values and shows the specified <code>label</code> in the dropdown. It’s mapped to the <code>pushed</code> field in the dataset, which is a <a target="_blank" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/date.html">date type in Elasticsearch</a>. One cool way to specify a date range in Elasticsearch is using the <code>now</code> keyword. <code>now</code> refers to the current time, <code>now-1M</code> refers to one month before, <code>now-6M</code> to six months before and <code>now-1y</code> to a year before <code>now</code>.</p>
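<p>As a sketch of how one of these options could translate into a query, the snippet below builds an Elasticsearch <code>range</code> query from a <code>data</code> entry. The helper name <code>buildRangeQuery</code> is hypothetical, and the use of <code>gte</code>/<code>lte</code> bounds is my assumption about the shape of the query rather than something taken from ReactiveSearch itself.</p>
<pre><code class="lang-js">// Hypothetical helper for illustration: turns one entry of the `data`
// prop into an Elasticsearch range query on the given field.
function buildRangeQuery(field, option) {
    return {
        range: {
            [field]: {
                gte: option.start, // e.g. 'now-1M' is resolved by Elasticsearch
                lte: option.end,
            },
        },
    };
}

const lastMonth = { start: 'now-1M', end: 'now', label: 'Last 30 days' };
console.log(JSON.stringify(buildRangeQuery('pushed', lastMonth)));
</code></pre>
<p>The date math strings are sent as-is, so Elasticsearch resolves <code>now</code> at query time and the filter never goes stale.</p>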
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*g6uRRdk37VyzQEqDIX_CCg.png" alt="Image" width="500" height="270" loading="lazy">
<em>The pushed <a target="_blank" href="https://opensource.appbase.io/reactive-manual/range-components/singledropdownrange.html">SingleDropdownRange</a> component</em></p>
<p>I’ve used another <code>SingleDropdownRange</code> component for the <code>created</code> field in the dataset.</p>
<p>Here I’ve specified year ranges in datetime for different years:</p>
<pre><code class="lang-js">&lt;SingleDropdownRange
    ...
    data={[
        {
            <span class="hljs-attr">start</span>: <span class="hljs-string">'2017-01-01T00:00:00Z'</span>,
            <span class="hljs-attr">end</span>: <span class="hljs-string">'2017-12-31T23:59:59Z'</span>,
            <span class="hljs-attr">label</span>: <span class="hljs-string">'2017'</span>,
        },
        {
            <span class="hljs-attr">start</span>: <span class="hljs-string">'2016-01-01T00:00:00Z'</span>,
            <span class="hljs-attr">end</span>: <span class="hljs-string">'2016-12-31T23:59:59Z'</span>,
            <span class="hljs-attr">label</span>: <span class="hljs-string">'2016'</span>,
        },
       ...
    ]}
/&gt;
</code></pre>
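<p>Writing every year out by hand gets repetitive. If you prefer, the same array could be generated with a small helper. This is only a sketch, and <code>yearRanges</code> is a hypothetical name that isn’t part of the app:</p>
<pre><code class="lang-js">// Hypothetical helper: builds one { start, end, label } entry per year,
// from `latest` down to `earliest`, in the same shape as the data prop.
function yearRanges(latest, earliest) {
    return Array.from({ length: latest - earliest + 1 }, function (_, index) {
        const year = latest - index;
        return {
            start: `${year}-01-01T00:00:00Z`,
            end: `${year}-12-31T23:59:59Z`,
            label: String(year),
        };
    });
}

const created = yearRanges(2017, 2007);
console.log(created.length); // 11
console.log(created[0].label); // 2017
</code></pre>
<p>You could then pass the result as <code>data={yearRanges(2017, 2007)}</code> instead of the literal array.</p>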
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*V1J8UQXBUCWJd4-EuC5I7w.png" alt="Image" width="500" height="471" loading="lazy">
<em>SingleDropdownRange component for the created field</em></p>
<p>The third component I’ve used is a <a target="_blank" href="https://opensource.appbase.io/reactive-manual/range-components/rangeslider.html"><code>RangeSlider</code></a>, which renders a slider UI. I’ve used two <code>RangeSlider</code> components, one for the <code>stars</code> field and the other for <code>forks</code>.</p>
<p>Two main props that this component introduces are <code>range</code> and <code>rangeLabels</code>:</p>
<pre><code class="lang-js">&lt;RangeSlider
    ...
    showHistogram={<span class="hljs-literal">false</span>}
    range={{ <span class="hljs-attr">start</span>: <span class="hljs-number">0</span>, <span class="hljs-attr">end</span>: <span class="hljs-number">300000</span> }}
    rangeLabels={{
        <span class="hljs-attr">start</span>: <span class="hljs-string">'0 Stars'</span>,
        <span class="hljs-attr">end</span>: <span class="hljs-string">'300K Stars'</span>,
    }}
/&gt;
</code></pre>
<ul>
<li><code>range</code>: specifies the range of the data with a <code>start</code> and an <code>end</code> value.</li>
<li><code>rangeLabels</code>: takes the labels to show below the slider.</li>
<li><code>showHistogram</code>: is a <code>boolean</code> prop which shows a histogram with the distribution of data. Here I’ve set it to <code>false</code> since it’s not needed.</li>
</ul>
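<p>Since the labels are derived from the range, you could also compute them instead of typing them twice. Here’s a minimal sketch, assuming non-negative whole numbers. Both <code>formatCount</code> and <code>makeRangeLabels</code> are hypothetical helpers, not ReactiveSearch functions:</p>
<pre><code class="lang-js">// Hypothetical helpers for illustration, assuming non-negative integers.

// Shortens thousands to a "K" suffix. Math.floor keeps 180500 at "180K".
function formatCount(value) {
    const thousands = Math.floor(value / 1000);
    return thousands === 0 ? String(value) : `${thousands}K`;
}

// Builds the object expected by the rangeLabels prop from a range.
function makeRangeLabels(range, unit) {
    return {
        start: `${formatCount(range.start)} ${unit}`,
        end: `${formatCount(range.end)} ${unit}`,
    };
}

console.log(makeRangeLabels({ start: 0, end: 300000 }, 'Stars'));
console.log(makeRangeLabels({ start: 0, end: 180500 }, 'Forks'));
</code></pre>
<p>For the two sliders above, this yields the same labels that were written by hand: <code>'300K Stars'</code> and <code>'180K Forks'</code>.</p>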
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*s98sSECpz1cX-9Q_jbUyLQ.png" alt="Image" width="397" height="235" loading="lazy">
<em><a target="_blank" href="https://opensource.appbase.io/reactive-manual/range-components/rangeslider.html">RangeSlider</a> components for the stars and forks fields</em></p>
<p>Now we just need to connect these filters to the <code>Results</code> component. We just have to update one line in the <code>ReactiveList</code> rendered by the <code>Results</code> component to include the <code>componentId</code>s of these components.</p>
<p>Update the <code>react</code> prop in the <code>ReactiveList</code> that we rendered in the <code>Results</code> component:</p>
<pre><code class="lang-js"><span class="hljs-keyword">const</span> Results = <span class="hljs-function">() =&gt;</span> (
  <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"result-list"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">SelectedFilters</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"m1"</span> /&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">ReactiveList</span>
      <span class="hljs-attr">...</span> // <span class="hljs-attr">updating</span> <span class="hljs-attr">the</span> <span class="hljs-attr">react</span> <span class="hljs-attr">prop</span> <span class="hljs-attr">here</span>
      <span class="hljs-attr">react</span>=<span class="hljs-string">{{</span>
        <span class="hljs-attr">and:</span> ['<span class="hljs-attr">language</span>', '<span class="hljs-attr">topics</span>', '<span class="hljs-attr">pushed</span>', '<span class="hljs-attr">created</span>', '<span class="hljs-attr">stars</span>', '<span class="hljs-attr">forks</span>', '<span class="hljs-attr">repo</span>'],
      }}
    /&gt;</span>
  <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
);
</code></pre>
<p>That should make your results update for all the filters.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*fAMt45ayVCNTLy77EI7Szg.png" alt="Image" width="800" height="500" loading="lazy">
<em>After connecting the filters in the ReactiveList component</em></p>
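<p>Conceptually, the <code>and</code> key in the <code>react</code> prop means the list only shows results that satisfy every listed filter at once. The sketch below shows one way to picture that. It’s an assumption about the general idea, not ReactiveSearch’s actual implementation, and <code>combineAnd</code> is a hypothetical name:</p>
<pre><code class="lang-js">// Hypothetical sketch: combines the active query of every listed
// component into a single Elasticsearch bool/must query.
function combineAnd(componentIds, queriesById) {
    const activeQueries = componentIds
        .map(function (id) {
            return queriesById[id];
        })
        // Components with nothing selected contribute no clause.
        .filter(Boolean);
    return { bool: { must: activeQueries } };
}

const combined = combineAnd(['language', 'stars', 'repo'], {
    language: { term: { 'language.raw': 'JavaScript' } },
    stars: { range: { stars: { gte: 1000, lte: 300000 } } },
    // 'repo' has no query here because the search box is empty.
});
console.log(JSON.stringify(combined, null, 2));
</code></pre>
<p>Only the two active filters end up in <code>must</code>, so an untouched filter doesn’t narrow the results.</p>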
<h4 id="heading-8-updating-the-results-view">8. Updating the results view</h4>
<p>Up until now, we’ve been seeing only a basic version of the results. As the final piece of this app, let’s add some flair to the results ✌️</p>
<p>We’ll be using another component inside our <code>Results</code> component to render different topics.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*AyocrMQaO0TQWoYcbhoHlA.png" alt="Image" width="363" height="86" loading="lazy">
<em>Topics component to render these little guys</em></p>
<p>Here’s how you can create your own at <code>/src/components/Topic</code>. Feel free to add your own taste.</p>
<pre><code class="lang-js">
<span class="hljs-keyword">import</span> React, { Component } <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>;
<span class="hljs-keyword">import</span> PropTypes <span class="hljs-keyword">from</span> <span class="hljs-string">'prop-types'</span>;

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Topic</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">Component</span> </span>{
    handleClick = <span class="hljs-function">() =&gt;</span> {
        <span class="hljs-built_in">this</span>.props.toggleTopic(<span class="hljs-built_in">this</span>.props.children);
    }
    render() {
        <span class="hljs-keyword">return</span> (
            <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">{</span>`<span class="hljs-attr">topic</span> ${<span class="hljs-attr">this.props.active</span> ? '<span class="hljs-attr">active</span>' <span class="hljs-attr">:</span> ''}`} <span class="hljs-attr">onClick</span>=<span class="hljs-string">{this.handleClick}</span>&gt;</span>
                #{this.props.children}
            <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
        );
    }
}

Topic.propTypes = {
    <span class="hljs-attr">children</span>: PropTypes.string,
    <span class="hljs-attr">active</span>: PropTypes.bool,
    <span class="hljs-attr">toggleTopic</span>: PropTypes.func,
};

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> Topic;
</code></pre>
<p>This component renders its <code>children</code> and adds a click handler that toggles the topic, which updates the <code>currentTopics</code> inside the main <code>App</code> component’s state.</p>
<p>Next, we just need to update our <code>Results</code> component at <code>/src/components/Results.js</code>:</p>
<pre><code class="lang-js"><span class="hljs-keyword">import</span> React <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>;
<span class="hljs-keyword">import</span> { SelectedFilters, ReactiveList } <span class="hljs-keyword">from</span> <span class="hljs-string">'@appbaseio/reactivesearch'</span>;
<span class="hljs-keyword">import</span> PropTypes <span class="hljs-keyword">from</span> <span class="hljs-string">'prop-types'</span>;

<span class="hljs-keyword">import</span> Topic <span class="hljs-keyword">from</span> <span class="hljs-string">'./Topic'</span>;

<span class="hljs-keyword">const</span> onResultStats = <span class="hljs-function">(<span class="hljs-params">results, time</span>) =&gt;</span> (
    <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"flex justify-end"</span>&gt;</span>
        {results} results found in {time}ms
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
);

<span class="hljs-keyword">const</span> onData = <span class="hljs-function">(<span class="hljs-params">data, currentTopics, toggleTopic</span>) =&gt;</span> (
    <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"result-item"</span> <span class="hljs-attr">key</span>=<span class="hljs-string">{data.fullname}</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"flex justify-center align-center result-card-header"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">img</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"avatar"</span> <span class="hljs-attr">src</span>=<span class="hljs-string">{data.avatar}</span> <span class="hljs-attr">alt</span>=<span class="hljs-string">"User avatar"</span> /&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">a</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"link"</span> <span class="hljs-attr">href</span>=<span class="hljs-string">{data.url}</span> <span class="hljs-attr">target</span>=<span class="hljs-string">"_blank"</span> <span class="hljs-attr">rel</span>=<span class="hljs-string">"noopener noreferrer"</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"flex wrap"</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span>{data.owner}/<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span>{data.name}<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
                <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">a</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"m10-0"</span>&gt;</span>{data.description}<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"flex wrap justify-center"</span>&gt;</span>
            {
                data.topics.slice(0, 7)
                    .map(item =&gt; (
                        <span class="hljs-tag">&lt;<span class="hljs-name">Topic</span>
                            <span class="hljs-attr">key</span>=<span class="hljs-string">{item}</span>
                            <span class="hljs-attr">active</span>=<span class="hljs-string">{currentTopics.includes(item)}</span>
                            <span class="hljs-attr">toggleTopic</span>=<span class="hljs-string">{toggleTopic}</span>
                        &gt;</span>
                            {item}
                        <span class="hljs-tag">&lt;/<span class="hljs-name">Topic</span>&gt;</span>
                    ))
            }
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"flex"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"btn card-btn"</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">i</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"card-icon fas fa-star"</span> /&gt;</span>{data.stars}<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"btn card-btn"</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">i</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"card-icon fas fa-code-branch"</span> /&gt;</span>{data.forks}<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"btn card-btn"</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">i</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"card-icon fas fa-eye"</span> /&gt;</span>{data.watchers}<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
);

<span class="hljs-keyword">const</span> Results = <span class="hljs-function">(<span class="hljs-params">{ toggleTopic, currentTopics }</span>) =&gt;</span> (
    <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"result-list"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">SelectedFilters</span> <span class="hljs-attr">className</span>=<span class="hljs-string">"m1"</span> /&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">ReactiveList</span>
            <span class="hljs-attr">componentId</span>=<span class="hljs-string">"results"</span>
            <span class="hljs-attr">dataField</span>=<span class="hljs-string">"name"</span>
            <span class="hljs-attr">onData</span>=<span class="hljs-string">{data</span> =&gt;</span> onData(data, currentTopics, toggleTopic)}
            onResultStats={onResultStats}
            react={{
                and: ['language', 'topics', 'pushed', 'created', 'stars', 'forks', 'repo'],
            }}
            pagination
            innerClass={{
                list: 'result-list-container',
                pagination: 'result-list-pagination',
                resultsInfo: 'result-list-info',
                poweredBy: 'powered-by',
            }}
            size={6}
            sortOptions={[
                {
                    label: 'Best Match',
                    dataField: '_score',
                    sortBy: 'desc',
                },
                {
                    label: 'Most Stars',
                    dataField: 'stars',
                    sortBy: 'desc',
                },
                {
                    label: 'Fewest Stars',
                    dataField: 'stars',
                    sortBy: 'asc',
                },
                {
                    label: 'Most Forks',
                    dataField: 'forks',
                    sortBy: 'desc',
                },
                {
                    label: 'Fewest Forks',
                    dataField: 'forks',
                    sortBy: 'asc',
                },
                {
                    label: 'A to Z',
                    dataField: 'owner.raw',
                    sortBy: 'asc',
                },
                {
                    label: 'Z to A',
                    dataField: 'owner.raw',
                    sortBy: 'desc',
                },
                {
                    label: 'Recently Updated',
                    dataField: 'pushed',
                    sortBy: 'desc',
                },
                {
                    label: 'Least Recently Updated',
                    dataField: 'pushed',
                    sortBy: 'asc',
                },
            ]}
        /&gt;
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></span>
);

Results.propTypes = {
    <span class="hljs-attr">toggleTopic</span>: PropTypes.func,
    <span class="hljs-attr">currentTopics</span>: PropTypes.arrayOf(PropTypes.string),
};

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> Results;
</code></pre>
<p>I’ve updated the <code>onData</code> function to render more detailed results. You’ll also notice a new <code>sortOptions</code> prop in the <code>ReactiveList</code>. This prop accepts an array of objects which renders a dropdown menu to select how you wish to sort the results. Each object contains a <code>label</code> to display as the list item, a <code>dataField</code> to sort the results on and a <code>sortBy</code> key which can either be <code>asc</code> (ascending) or <code>desc</code> (descending).</p>
<p>That’s it, your very own GitHub repository explorer should be live!</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*RQ6EPM9NrDsvX_ZdkpB_cw.png" alt="Image" width="800" height="443" loading="lazy">
<em>GitXplore <a target="_blank" href="https://appbaseio-apps.github.io/gitxplore-app/">final app preview</a></em></p>
<h3 id="heading-useful-links">Useful links</h3>
<ol>
<li>GitXplore app <a target="_blank" href="https://appbaseio-apps.github.io/gitxplore-app/">demo</a>, <a target="_blank" href="https://codesandbox.io/s/github/appbaseio-apps/gitxplore-app/tree/master/">CodeSandbox</a> and <a target="_blank" href="https://github.com/appbaseio-apps/gitxplore-app">source code</a></li>
<li><a target="_blank" href="https://github.com/appbaseio/reactivesearch">ReactiveSearch GitHub repo</a></li>
<li>ReactiveSearch <a target="_blank" href="https://opensource.appbase.io/reactive-manual/">docs</a></li>
</ol>
<p>Hope you enjoyed this story. If you have any thoughts or suggestions, please let me know and do share your version of the app in comments!</p>
<hr>
<p>You can follow me on <a target="_blank" href="https://twitter.com/divyanshu013">Twitter</a> for the latest updates. I've also started publishing more recent posts on my personal <a target="_blank" href="https://divyanshu013.dev/">blog</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ ElasticSearch with Django the easy way ]]>
                </title>
                <description>
                    <![CDATA[ By Adam Wattis A while back I was working on a Django project and wanted to implement fast free text search. Instead of using a regular database for this search function — such as MySQL or PostgreSQL — I decided to use a NoSQL database. That is when ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/elasticsearch-with-django-the-easy-way-909375bc16cb/</link>
                <guid isPermaLink="false">66c349ada7aea9fc97bdfb0f</guid>
                
                    <category>
                        <![CDATA[ Django ]]>
                    </category>
                
                    <category>
                        <![CDATA[ elasticsearch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ NoSQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Web Development ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Fri, 13 Jan 2017 21:43:09 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*ojvTsI-Asv1IIjdm61RzKw.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Adam Wattis</p>
<p>A while back I was working on a Django project and wanted to implement fast free text search. Instead of using a regular database for this search function — such as MySQL or PostgreSQL — I decided to use a NoSQL database. That is when I discovered <a target="_blank" href="https://www.elastic.co/">ElasticSearch</a>.</p>
<p>ElasticSearch indexes documents for your data instead of using data tables like a regular relational database does. This speeds up search, and offers a lot of other benefits that you don’t get with a regular database. I kept a regular relational database as well for storing user details, logins, and other data that ElasticSearch didn’t need to index.</p>
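<p>To get a feel for why an index speeds up search, here is a very simplified sketch of an inverted index in plain Python. It is only an illustration of the idea, not how ElasticSearch is actually implemented, and the sample documents and function names are made up for this example.</p>

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, word):
    """Look a word up directly instead of scanning every document."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample documents, for illustration only.
docs = {
    1: "Hello world this is my first blog post",
    2: "Django makes the web fun",
    3: "My second post about Django",
}
index = build_inverted_index(docs)
print(search(index, "django"))  # [2, 3]
print(search(index, "post"))    # [1, 3]
```

<p>The lookup cost depends on the number of distinct words rather than on the number of documents, which is the property a search engine relies on.</p>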
<p>After searching for a long time on how to properly implement ElasticSearch with Django, I didn’t really find any satisfying answers. Some <a target="_blank" href="https://qbox.io/blog/how-to-elasticsearch-python-django-part1">guides or tutorials</a> were convoluted and seemed to be taking unnecessary steps in order to index the data into ElasticSearch. There was quite a bit of information on how to perform searching, but not as much about how the indexing should be done. I felt like there must be a simpler solution out there, so I decided to give it a try myself.</p>
<p>I wanted to keep it as simple as possible, because simple solutions tend to be the best ones in my opinion. KISS (Keep It Simple Stupid), Less is More and all of that stuff is something that resonates with me a lot, especially when every other solution out there is complex. I decided to use Honza Král’s example in <a target="_blank" href="https://www.youtube.com/watch?v=1KHM7WvNeL4&amp;t=1141s">this video</a> to have something to base my code on. I recommend watching it, although it is a bit outdated at this point.</p>
<p>Since I was using Django — which is written in Python — it was easy to interact with ElasticSearch. There are two client libraries for talking to ElasticSearch from Python. There’s <a target="_blank" href="https://elasticsearch-py.readthedocs.io/en/master/index.html">elasticsearch-py</a>, which is the official low-level client. And there’s <a target="_blank" href="http://elasticsearch-dsl.readthedocs.io/en/latest/index.html">elasticsearch-dsl</a>, which is built upon the former but gives a higher-level abstraction with a bit less functionality.</p>
<p>We will get into some examples soon, but first I need to clarify what we want to accomplish:</p>
<ul>
<li>Setting up ElasticSearch on our local machine and ensuring it works properly</li>
<li>Setting up a new Django project</li>
<li>Bulk indexing of data that is already in the database</li>
<li>Indexing of every new instance that a user saves to the database</li>
<li>A basic search example</li>
</ul>
<p>All right, that seems simple enough. Let’s get started by installing ElasticSearch on our machine. Also, all the <a target="_blank" href="https://github.com/adamwattis/elasticsearch-example">code will be available on my GitHub</a> so that you can easily follow the examples.</p>
<h4 id="heading-installing-elasticsearch">Installing ElasticSearch</h4>
<p>Since ElasticSearch runs on Java you must ensure you have an updated JVM version. Check what version you have with <code>java -version</code> in the terminal. Then you run the following commands to create a new directory, download, extract and start ElasticSearch:</p>
<pre><code>mkdir elasticsearch-example
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.1.1.tar.gz
tar -xzf elasticsearch-5.1.1.tar.gz
./elasticsearch-5.1.1/bin/elasticsearch
</code></pre><p>When ElasticSearch starts up there should be a lot of output printed to the terminal window. To check that it’s up and running correctly, open a new terminal window and run this <code>curl</code> command:</p>
<pre><code>curl -XGET http:<span class="hljs-comment">//localhost:9200</span>
</code></pre><p>The response should be something like this:</p>
<pre><code>{
  "name" : "6xIrzqq",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "eUH9REKyQOy4RKPzkuRI1g",
  "version" : {
    "number" : "5.1.1",
    "build_hash" : "5395e21",
    "build_date" : "2016-12-06T12:36:15.409Z",
    "build_snapshot" : false,
    "lucene_version" : "6.3.0"
  },
  "tagline" : "You Know, for Search"
}
</code></pre><p>Great, you now have ElasticSearch running on your local machine! It’s time to set up your Django project.</p>
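<p>If you would rather run the same check from Python, a minimal sketch using only the standard library could look like this. The helper names are made up for this example, and <code>fetch_version</code> assumes a node is listening on <code>localhost:9200</code> as in the <code>curl</code> command.</p>

```python
import json
from urllib.request import urlopen


def elasticsearch_version(body):
    """Extract the version number from the JSON text returned by the root endpoint."""
    return json.loads(body)["version"]["number"]


def fetch_version(url="http://localhost:9200"):
    """Query a running node; raises URLError if nothing is listening."""
    with urlopen(url, timeout=5) as response:
        return elasticsearch_version(response.read().decode("utf-8"))


# Trimmed copy of the response shown above, so the parsing step runs offline.
sample = '{"name": "6xIrzqq", "version": {"number": "5.1.1"}, "tagline": "You Know, for Search"}'
print(elasticsearch_version(sample))  # 5.1.1
```

<p>Calling <code>fetch_version()</code> with ElasticSearch running should return the same version string that <code>curl</code> reported.</p>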
<h4 id="heading-setting-up-a-django-project">Setting up a Django project</h4>
<p>First you create a virtual environment with <code>virtualenv venv</code> and enter it with <code>source venv/bin/activate</code> in order to keep everything contained. Then you install some packages:</p>
<pre><code>pip install django
pip install elasticsearch-dsl
</code></pre><p>To start a new Django project you run:</p>
<pre><code>django-admin startproject elasticsearchproject
cd elasticsearchproject
python manage.py startapp elasticsearchapp
</code></pre><p>After you have created your new Django project you need to create a model to work with. For this guide I chose to go with a good old-fashioned blog post example. In <code>models.py</code> you place the following code:</p>
<pre><code>from django.db import models
from django.utils import timezone
from django.contrib.auth.models import User

# Create your models here.
# Blogpost to be indexed into ElasticSearch
class BlogPost(models.Model):
    author = models.ForeignKey(User, on_delete=models.CASCADE, related_name='blogpost')
    posted_date = models.DateField(default=timezone.now)
    title = models.CharField(max_length=200)
    text = models.TextField(max_length=1000)
</code></pre><p>Pretty straightforward so far. Don’t forget to add <code>elasticsearchapp</code> to <code>INSTALLED_APPS</code> in <code>settings.py</code> and register your new <code>BlogPost</code> model in <code>admin.py</code> like this:</p>
<pre><code>from django.contrib import admin
from .models import BlogPost

# Register your models here.
# Need to register my BlogPost so it shows up in the admin
admin.site.register(BlogPost)
</code></pre><p>You must also run <code>python manage.py makemigrations</code>, <code>python manage.py migrate</code> and <code>python manage.py createsuperuser</code> to create the database and an admin account. Now run <code>python manage.py runserver</code>, go to <code>http://localhost:8000/admin/</code> and log in. You should now be able to see your Blog posts model there. Go ahead and create your first blog post in the admin.</p>
<p>Congratulations, you now have a functioning Django project! It’s finally time to get into the fun stuff — connecting ElasticSearch.</p>
<h4 id="heading-connecting-elasticsearch-with-django">Connecting ElasticSearch with Django</h4>
<p>You begin by creating a new file called <code>search.py</code> in our <code>elasticsearchapp</code> directory. This is where the ElasticSearch code will live. The first thing you need to do here is to create a connection from your Django application to ElasticSearch. You do this in your <code>search.py</code> file:</p>
<pre><code>from elasticsearch_dsl.connections import connections

connections.create_connection()
</code></pre><p>Now that you have a global connection to your ElasticSearch set-up you need to define what you want to index into it. Write this code:</p>
<pre><code>from elasticsearch_dsl.connections import connections
from elasticsearch_dsl import DocType, Text, Date

connections.create_connection()

class BlogPostIndex(DocType):
    author = Text()
    posted_date = Date()
    title = Text()
    text = Text()

    class Meta:
        index = 'blogpost-index'
</code></pre><p>It looks pretty similar to your model, right? The <code>DocType</code> works as a wrapper to enable you to write an index like a model, and the <code>Text</code> and <code>Date</code> are the fields so that they get the correct format when they get indexed.</p>
<p>Inside the Meta you tell ElasticSearch what you want the index to be named. This will be a point of reference for ElasticSearch so that it knows what index it’s dealing with when initializing it in the database and saving each new object instance created.</p>
<p>Now you need to actually create the mapping of your newly created <code>BlogPostIndex</code> in ElasticSearch. You can do this and set up bulk indexing at the same time. How convenient, right?</p>
<h4 id="heading-bulk-indexing-of-data">Bulk indexing of data</h4>
<p>The <code>bulk</code> helper is located in <code>elasticsearch.helpers</code>, which was installed along with <code>elasticsearch_dsl</code> because that library is built on top of <code>elasticsearch</code>. Do the following in <code>search.py</code>:</p>
<pre><code>...
from elasticsearch.helpers import bulk
from elasticsearch import Elasticsearch
from . import models
...
</code></pre><pre><code>...
def bulk_indexing():
    BlogPostIndex.init()
    es = Elasticsearch()
    bulk(client=es, actions=(b.indexing() for b in models.BlogPost.objects.all().iterator()))
</code></pre><p>“What is going on here?” you might be thinking. It’s not that complicated, actually.</p>
<p>Since you only want to do bulk indexing whenever you change something in your model, you first call <code>init()</code> on the index class, which creates its mapping in ElasticSearch. Then you call <code>bulk</code> and pass it an instance of <code>Elasticsearch()</code>, which creates a connection to ElasticSearch. You also pass a generator to <code>actions=</code> that iterates over all the <code>BlogPost</code> objects you have in your regular database and calls the <code>.indexing()</code> method on each object. Why a generator? Because if you had a lot of objects to iterate over, a generator would not have to load them all into memory first.</p>
<p>There is just one problem with the above code. You don’t have an <code>.indexing()</code> method on your model yet. Let’s fix that:</p>
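<p>The difference between a generator and a list is easy to see in isolation. The sketch below uses a made-up <code>fake_indexing</code> function as a stand-in for <code>BlogPost.indexing()</code> and a plain <code>range</code> instead of a database query, so it runs without Django or ElasticSearch.</p>

```python
def fake_indexing(post_id, log):
    """Stand-in for BlogPost.indexing(): records the call and returns an action dict."""
    log.append(post_id)
    return {"_index": "blogpost-index", "_id": post_id, "_source": {"title": "Post %d" % post_id}}


log = []
post_ids = range(1, 1001)  # pretend these are 1000 rows in the database

# Generator expression: nothing is built until a consumer asks for the next item.
actions = (fake_indexing(post_id, log) for post_id in post_ids)
print(len(log))  # 0, no post has been processed yet

first = next(actions)
print(first["_id"], len(log))  # 1 1, exactly one post processed on demand

# A list comprehension would instead build all 1000 dicts up front.
eager_log = []
eager = [fake_indexing(post_id, eager_log) for post_id in post_ids]
print(len(eager_log))  # 1000
```

<p>With the generator, each action exists only while the consumer is working on it, which is what keeps memory use flat for large tables.</p>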
<pre><code>...
from .search import BlogPostIndex
...
</code></pre><pre><code>...
# Add indexing method to BlogPost
def indexing(self):
    obj = BlogPostIndex(
        meta={'id': self.id},
        author=self.author.username,
        posted_date=self.posted_date,
        title=self.title,
        text=self.text
    )
    obj.save()
    return obj.to_dict(include_meta=True)
</code></pre><p>You add the indexing method to the <code>BlogPost</code> model. It builds a <code>BlogPostIndex</code> from the model’s fields, saves it to ElasticSearch, and returns it as a dictionary that includes its metadata.</p>
<p>Let’s try this out now and see if you can bulk index the blog post you previously created. By running <code>python manage.py shell</code> you go into the Django shell and import your <code>search.py</code> with <code>from elasticsearchapp.search import *</code> and then run <code>bulk_indexing()</code> to index all the blog posts in your database. To see if it worked you run the following curl command:</p>
<pre><code>curl -XGET <span class="hljs-string">'localhost:9200/blogpost-index/blog_post_index/1?pretty'</span>
</code></pre><p>You should get back your first blog post in the terminal.</p>
<h4 id="heading-indexing-of-newly-saved-instance">Indexing of newly saved instance</h4>
<p>Next you need to add a signal that calls <code>.indexing()</code> every time a user saves a blog post. In <code>elasticsearchapp</code> create a new file called <code>signals.py</code> and add this code:</p>
<pre><code>from .models import BlogPost
from django.db.models.signals import post_save
from django.dispatch import receiver

@receiver(post_save, sender=BlogPost)
def index_post(sender, instance, **kwargs):
    instance.indexing()
</code></pre><p>The <code>post_save</code> signal will ensure that the saved instance will get indexed with the <code>.indexing()</code> method after it is saved.</p>
<p>In order for this to work we also need to tell Django that we’re using signals. We do this by opening <code>apps.py</code> and adding the following code:</p>
<pre><code>from django.apps import AppConfig

class ElasticsearchappConfig(AppConfig):
    name = 'elasticsearchapp'

    def ready(self):
        import elasticsearchapp.signals
</code></pre><p>To complete this we also need to tell Django that we’re using this new configuration. We do this in the <code>__init__.py</code> inside our <code>elasticsearchapp</code> directory by adding:</p>
<pre><code>default_app_config = <span class="hljs-string">'elasticsearchapp.apps.ElasticsearchappConfig'</span>
</code></pre><p>Now the <code>post_save</code> signal is registered with Django and is ready to listen for whenever a new blogpost is being saved.</p>
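<p>If signals are new to you, the pattern behind them can be sketched in a few lines of plain Python. This is a simplified stand-in and not Django’s actual implementation: the <code>Signal</code> class, the <code>receiver</code> decorator and the <code>BlogPost</code> class below are all written for this example only.</p>

```python
class Signal:
    """Tiny stand-in for a Django signal: keeps (sender, callback) pairs."""

    def __init__(self):
        self._receivers = []

    def connect(self, callback, sender):
        self._receivers.append((sender, callback))

    def send(self, sender, **kwargs):
        # Call only the callbacks registered for this sender.
        for registered_sender, callback in self._receivers:
            if registered_sender is sender:
                callback(sender=sender, **kwargs)


post_save = Signal()


def receiver(signal, sender):
    """Decorator that registers the function with the signal, like Django's @receiver."""
    def decorator(func):
        signal.connect(func, sender)
        return func
    return decorator


indexed = []  # stands in for the ElasticSearch index


class BlogPost:
    def __init__(self, title):
        self.title = title

    def indexing(self):
        indexed.append(self.title)

    def save(self):
        # A real model would write to the database here, then fire the signal.
        post_save.send(sender=BlogPost, instance=self)


@receiver(post_save, sender=BlogPost)
def index_post(sender, instance, **kwargs):
    instance.indexing()


BlogPost("Hello world").save()
print(indexed)  # ['Hello world']
```

<p>The model never calls the indexing code directly. It only announces that a save happened, and whatever is registered for that sender reacts to it.</p>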
<p>Try it out by going into the Django admin again and saving a new blog post. Then check with a <code>curl</code> command that it was successfully indexed into ElasticSearch.</p>
<h4 id="heading-simple-search">Simple search</h4>
<p>Now let’s make a simple search function in <code>search.py</code> to find all posts filtered by author:</p>
<pre><code>...
from elasticsearch_dsl import DocType, Text, Date, Search
...
</code></pre><pre><code>...
def search(author):
    s = Search().filter('term', author=author)
    response = s.execute()
    return response
</code></pre><p>Let’s try the search out. In the shell, run <code>from elasticsearchapp.search import *</code> and then <code>print(search(author="&lt;author name&gt;"))</code>:</p>
<pre><code>&gt;&gt;&gt; print(search(author="home"))
&lt;Response: [&lt;Result(blogpost-index/blog_post_index/1): {'text': 'Hello world, this is my first blog post', 'title':...}&gt;]&gt;
</code></pre><p>There you have it! You have now successfully indexed all your instances into ElasticSearch, created a <code>post_save</code> signal that indexes each newly saved instance, and created a function to search our ElasticSearch database for your data.</p>
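<p>It can help to see what a filter like the one in <code>search()</code> turns into on the wire. The sketch below builds, by hand, the kind of request body I would expect <code>Search().filter('term', author=author)</code> to produce. Treat the exact shape as an assumption and compare it with the output of <code>s.to_dict()</code> on your own <code>Search</code> object. The function name is made up for this example.</p>

```python
import json


def author_filter_body(author):
    """Build a request body with a single term filter on the author field."""
    return {"query": {"bool": {"filter": [{"term": {"author": author}}]}}}


body = author_filter_body("home")
print(json.dumps(body, indent=2))
```

<p>One thing to keep in mind: a <code>term</code> filter compares against the indexed terms without analyzing your input, while a <code>Text</code> field is analyzed when it is indexed. With the default analyzer that usually means lowercased tokens, so a username containing capital letters may need to be searched for in lowercase.</p>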
<h4 id="heading-conclusion">Conclusion</h4>
<p>This was quite a lengthy article, but I hope it is written simply enough for even a beginner to understand.</p>
<p>I explained how to connect a Django model to ElasticSearch for indexing and searching, but there is so much more that ElasticSearch can do. I recommend reading on their website and exploring what other possibilities exist, such as spatial operations and full text search with intelligent highlighting. It’s a great tool and I will be sure to use it in future projects!</p>
<p>If you liked this article or have a comment or suggestion, please feel free to leave a message below. And stay tuned for more interesting stuff!</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
