<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ OCR  - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ OCR  - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Thu, 21 May 2026 16:11:49 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/ocr/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How To Create An Optical Character Reader Using Angular And Azure Computer Vision ]]>
                </title>
                <description>
                    <![CDATA[ By Ankit Sharma Introduction In this article, we will create an optical character recognition (OCR) application using Angular and the Azure Computer Vision Cognitive Service.  Computer Vision is an AI service that analyzes content in images. We will ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-create-an-optical-character-reader-using-angular-and-azure-computer-vision/</link>
                <guid isPermaLink="false">66d45dac787a2a3b05af4394</guid>
                
                    <category>
                        <![CDATA[ AI ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Angular ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Azure ]]>
                    </category>
                
                    <category>
                        <![CDATA[ OCR  ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Fri, 15 May 2020 19:03:00 +0000</pubDate>
                <media:content url="https://cdn-media-2.freecodecamp.org/w1280/5f9c9aff740569d1a4ca2913.jpg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Ankit Sharma</p>
<h2 id="heading-introduction">Introduction</h2>
<p>In this article, we will create an optical character recognition (OCR) application using Angular and the Azure Computer Vision Cognitive Service. </p>
<p>Computer Vision is an AI service that analyzes content in images. We will use the OCR feature of Computer Vision to detect the printed text in an image. The application will extract the text from the image and detect the language of the text. </p>
<p>Currently, the OCR API supports 25 languages.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<ul>
<li>Install the latest LTS version of Node.JS from <a target="_blank" href="https://nodejs.org/en/download/">https://nodejs.org/en/download/</a></li>
<li>Install the Angular CLI from <a target="_blank" href="https://cli.angular.io/">https://cli.angular.io/</a></li>
<li>Install the .NET Core 3.1 SDK from <a target="_blank" href="https://dotnet.microsoft.com/download/dotnet-core/3.1">https://dotnet.microsoft.com/download/dotnet-core/3.1</a></li>
<li>Install the latest version of Visual Studio 2019 from <a target="_blank" href="https://visualstudio.microsoft.com/downloads/">https://visualstudio.microsoft.com/downloads/</a></li>
<li>An Azure subscription account. You can create a free Azure account at <a target="_blank" href="https://azure.microsoft.com/en-in/free/">https://azure.microsoft.com/en-in/free/</a></li>
</ul>
<h2 id="heading-source-code">Source Code</h2>
<p>You can get the source code from <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services">GitHub</a>.</p>
<blockquote>
<p>We will use an ASP.NET Core backend for this application. The ASP.NET Core backend provides a straightforward authentication process to access Azure Cognitive Services. This will also ensure that the end-user won’t have direct access to Cognitive Services.</p>
</blockquote>
<h2 id="heading-create-the-azure-computer-vision-cognitive-service-resource">Create the Azure Computer Vision Cognitive Service resource</h2>
<p>Log in to the Azure portal, search for Cognitive Services in the search bar, and click on the result. Refer to the image shown below.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/05/CreateCVCogServ.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>On the next screen, click on the Add button. It will open the Cognitive Services marketplace page. Search for Computer Vision in the search bar and click on the search result. It will open the Computer Vision API page. Click on the Create button to create a new Computer Vision resource. Refer to the image shown below.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/05/SelectComputerVisionCogServ.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>On the Create page, fill in the details as indicated below.</p>
<ul>
<li><strong>Name</strong>: Give a unique name for your resource.</li>
<li><strong>Subscription</strong>: Select the subscription type from the dropdown.</li>
<li><strong>Pricing tier</strong>: Select the pricing tier as per your choice.</li>
<li><strong>Resource group</strong>: Select an existing resource group or create a new one.</li>
</ul>
<p>Click on the Create button. Refer to the image shown below.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/05/ConfigureComputerVisionCogServ.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>After your resource is successfully deployed, click on the “Go to resource” button. You can see the Key and the endpoint for the newly created Computer Vision resource. Refer to the image shown below.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/05/ComputerVisionCogServKey-1.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Make a note of the key and the endpoint. We will use them later in this article to invoke the Computer Vision OCR API from the .NET code. The values are masked here for privacy.</p>
<h2 id="heading-creating-the-aspnet-core-application">Creating the ASP.NET Core application</h2>
<p>Open Visual Studio 2019 and click on “Create a new Project”. A “Create a new Project” dialog will open. Select “ASP.NET Core Web Application” and click on Next. On the “Configure your new project” screen, name your application <code>ngComputerVision</code> and click on Create. Refer to the image shown below.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/05/CreateProject_1.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>You will be taken to the “Create a new ASP.NET Core web application” screen. Select “.NET Core” and “ASP.NET Core 3.1” from the dropdowns at the top. Then, select the “Angular” project template and click on <code>Create</code>. Refer to the image shown below.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/05/CreateProject_2.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>This will create our project. The folder structure of the application is shown below.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/05/Sol_Exp-1.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>The <code>ClientApp</code> folder contains the Angular code for our application. The Controllers folder will contain our API controllers. The Angular components are present inside the <code>ClientApp\src\app</code> folder. </p>
<p>The default template contains a few Angular components. These components won’t affect our application, but for the sake of simplicity, we will delete the fetchdata and counter folders from the <code>ClientApp\src\app</code> folder. Also, remove the references to these two components from the <code>app.module.ts</code> file.</p>
<h2 id="heading-installing-computer-vision-api-library">Installing Computer Vision API library</h2>
<p>We will install the Azure Computer Vision API library which will provide us with the models out of the box to handle the Computer Vision REST API response. To install the package, navigate to Tools &gt;&gt; NuGet Package Manager &gt;&gt; Package Manager Console. It will open the Package Manager Console. Run the command as shown below.</p>
<pre><code>Install-Package Microsoft.Azure.CognitiveServices.Vision.ComputerVision -Version <span class="hljs-number">5.0</span><span class="hljs-number">.0</span>
</code></pre><p>You can learn more about this package at the <a target="_blank" href="https://www.nuget.org/packages/Microsoft.Azure.CognitiveServices.Vision.ComputerVision/">NuGet gallery</a>.</p>
<h2 id="heading-create-the-models">Create the Models</h2>
<p>Right-click on the <code>ngComputerVision</code> project and select Add &gt;&gt; New Folder. Name the folder Models. Then right-click on the Models folder and select Add &gt;&gt; Class to add a new class file. Name the class file <code>LanguageDetails.cs</code> and click Add.</p>
<p>Open <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/Models/LanguageDetails.cs">LanguageDetails.cs</a> and put the following code inside it.</p>
<pre><code class="lang-csharp"><span class="hljs-keyword">namespace</span> <span class="hljs-title">ngComputerVision.Models</span>
{
    <span class="hljs-keyword">public</span> <span class="hljs-keyword">class</span> <span class="hljs-title">LanguageDetails</span>
    {
        <span class="hljs-keyword">public</span> <span class="hljs-keyword">string</span> Name { <span class="hljs-keyword">get</span>; <span class="hljs-keyword">set</span>; }
        <span class="hljs-keyword">public</span> <span class="hljs-keyword">string</span> NativeName { <span class="hljs-keyword">get</span>; <span class="hljs-keyword">set</span>; }
        <span class="hljs-keyword">public</span> <span class="hljs-keyword">string</span> Dir { <span class="hljs-keyword">get</span>; <span class="hljs-keyword">set</span>; }
    }
}
</code></pre>
<p>Similarly, add a new class file <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/Models/AvailableLanguage.cs">AvailableLanguage.cs</a> and put the following code inside it.</p>
<pre><code class="lang-csharp"><span class="hljs-keyword">using</span> System.Collections.Generic;

<span class="hljs-keyword">namespace</span> <span class="hljs-title">ngComputerVision.Models</span>
{
    <span class="hljs-keyword">public</span> <span class="hljs-keyword">class</span> <span class="hljs-title">AvailableLanguage</span>
    {
        <span class="hljs-keyword">public</span> Dictionary&lt;<span class="hljs-keyword">string</span>, LanguageDetails&gt; Translation { <span class="hljs-keyword">get</span>; <span class="hljs-keyword">set</span>; }
    }
}
</code></pre>
<p>We will also add two classes as DTO (Data Transfer Object) for sending data back to the client.</p>
<p>Create a new folder and name it DTOModels. Add the new class file <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/DTOModels/AvailableLanguageDTO.cs">AvailableLanguageDTO.cs</a> in the DTOModels folder and put the following code inside it.</p>
<pre><code class="lang-csharp"><span class="hljs-keyword">namespace</span> <span class="hljs-title">ngComputerVision.DTOModels</span>
{
    <span class="hljs-keyword">public</span> <span class="hljs-keyword">class</span> <span class="hljs-title">AvailableLanguageDTO</span>
    {
        <span class="hljs-keyword">public</span> <span class="hljs-keyword">string</span> LanguageID { <span class="hljs-keyword">get</span>; <span class="hljs-keyword">set</span>; }
        <span class="hljs-keyword">public</span> <span class="hljs-keyword">string</span> LanguageName { <span class="hljs-keyword">get</span>; <span class="hljs-keyword">set</span>; }
    }
}
</code></pre>
<p>Add the <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/DTOModels/OcrResultDTO.cs">OcrResultDTO.cs</a> file and put the following code inside it.</p>
<pre><code class="lang-csharp"><span class="hljs-keyword">namespace</span> <span class="hljs-title">ngComputerVision.DTOModels</span>
{
    <span class="hljs-keyword">public</span> <span class="hljs-keyword">class</span> <span class="hljs-title">OcrResultDTO</span>
    {
        <span class="hljs-keyword">public</span> <span class="hljs-keyword">string</span> Language { <span class="hljs-keyword">get</span>; <span class="hljs-keyword">set</span>; }
        <span class="hljs-keyword">public</span> <span class="hljs-keyword">string</span> DetectedText { <span class="hljs-keyword">get</span>; <span class="hljs-keyword">set</span>; }
    }
}
</code></pre>
<h2 id="heading-adding-the-ocr-controller">Adding the OCR Controller</h2>
<p>We will add a new controller to our application. Right-click on the Controllers folder and select Add &gt;&gt; New Item. An “Add New Item” dialog box will open. Select “Visual C#” from the left panel, then select “API Controller Class” from the templates panel and name it <code>OCRController.cs</code>. Click on Add. </p>
<p>Refer to the image below.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/05/AddController-1.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>The <code>OCRController</code> will handle the image recognition requests from the client app. This controller will also return the list of all the languages supported by OCR API.</p>
<p>Open the <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/Controllers/OCRController.cs">OCRController.cs</a> file and put the following code inside it.</p>
<pre><code class="lang-csharp"><span class="hljs-keyword">using</span> System;
<span class="hljs-keyword">using</span> System.Threading.Tasks;
<span class="hljs-keyword">using</span> Microsoft.AspNetCore.Mvc;
<span class="hljs-keyword">using</span> System.Net.Http;
<span class="hljs-keyword">using</span> System.Net.Http.Headers;
<span class="hljs-keyword">using</span> Newtonsoft.Json.Linq;
<span class="hljs-keyword">using</span> System.IO;
<span class="hljs-keyword">using</span> Newtonsoft.Json;
<span class="hljs-keyword">using</span> System.Text;
<span class="hljs-keyword">using</span> ngComputerVision.Models;
<span class="hljs-keyword">using</span> System.Collections.Generic;
<span class="hljs-keyword">using</span> Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
<span class="hljs-keyword">using</span> ngComputerVision.DTOModels;

<span class="hljs-keyword">namespace</span> <span class="hljs-title">ngComputerVision.Controllers</span>
{
    [<span class="hljs-meta">Produces(<span class="hljs-meta-string">"application/json"</span>)</span>]
    [<span class="hljs-meta">Route(<span class="hljs-meta-string">"api/[controller]"</span>)</span>]
    <span class="hljs-keyword">public</span> <span class="hljs-keyword">class</span> <span class="hljs-title">OCRController</span> : <span class="hljs-title">Controller</span>
    {
        <span class="hljs-keyword">static</span> <span class="hljs-keyword">string</span> subscriptionKey;
        <span class="hljs-keyword">static</span> <span class="hljs-keyword">string</span> endpoint;
        <span class="hljs-keyword">static</span> <span class="hljs-keyword">string</span> uriBase;

        <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-title">OCRController</span>(<span class="hljs-params"></span>)</span>
        {
            subscriptionKey = <span class="hljs-string">"b993f3afb4e04119bd8ed37171d4ec71"</span>;
            endpoint = <span class="hljs-string">"https://ankitocrdemo.cognitiveservices.azure.com/"</span>;
            uriBase = endpoint + <span class="hljs-string">"vision/v2.1/ocr"</span>;
        }

        [<span class="hljs-meta">HttpPost, DisableRequestSizeLimit</span>]
        <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">async</span> Task&lt;OcrResultDTO&gt; <span class="hljs-title">Post</span>(<span class="hljs-params"></span>)</span>
        {
            StringBuilder sb = <span class="hljs-keyword">new</span> StringBuilder();
            OcrResultDTO ocrResultDTO = <span class="hljs-keyword">new</span> OcrResultDTO();
            <span class="hljs-keyword">try</span>
            {
                <span class="hljs-keyword">if</span> (Request.Form.Files.Count &gt; <span class="hljs-number">0</span>)
                {
                    <span class="hljs-keyword">var</span> file = Request.Form.Files[Request.Form.Files.Count - <span class="hljs-number">1</span>];

                    <span class="hljs-keyword">if</span> (file.Length &gt; <span class="hljs-number">0</span>)
                    {
                        <span class="hljs-keyword">var</span> memoryStream = <span class="hljs-keyword">new</span> MemoryStream();
                        file.CopyTo(memoryStream);
                        <span class="hljs-keyword">byte</span>[] imageFileBytes = memoryStream.ToArray();
                        memoryStream.Flush();

                        <span class="hljs-keyword">string</span> JSONResult = <span class="hljs-keyword">await</span> ReadTextFromStream(imageFileBytes);

                        OcrResult ocrResult = JsonConvert.DeserializeObject&lt;OcrResult&gt;(JSONResult);
                        <span class="hljs-keyword">if</span> (!ocrResult.Language.Equals(<span class="hljs-string">"unk"</span>))
                        {
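                            <span class="hljs-comment">// Walk every region so that images with several text blocks keep all of their text.</span>
                            <span class="hljs-keyword">foreach</span> (OcrRegion ocrRegion <span class="hljs-keyword">in</span> ocrResult.Regions)
                            {
                                <span class="hljs-keyword">foreach</span> (OcrLine ocrLine <span class="hljs-keyword">in</span> ocrRegion.Lines)
                                {
                                    <span class="hljs-keyword">foreach</span> (OcrWord ocrWord <span class="hljs-keyword">in</span> ocrLine.Words)
                                    {
                                        sb.Append(ocrWord.Text);
                                        sb.Append(<span class="hljs-string">' '</span>);
                                    }
                                    sb.AppendLine();
                                }
                            }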
                        }
                        <span class="hljs-keyword">else</span>
                        {
                            sb.Append(<span class="hljs-string">"This language is not supported."</span>);
                        }
                        ocrResultDTO.DetectedText = sb.ToString();
                        ocrResultDTO.Language = ocrResult.Language;
                    }
                }
                <span class="hljs-keyword">return</span> ocrResultDTO;
            }
            <span class="hljs-keyword">catch</span>
            {
                ocrResultDTO.DetectedText = <span class="hljs-string">"Error occurred. Try again"</span>;
                ocrResultDTO.Language = <span class="hljs-string">"unk"</span>;
                <span class="hljs-keyword">return</span> ocrResultDTO;
            }
        }

        <span class="hljs-function"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> Task&lt;<span class="hljs-keyword">string</span>&gt; <span class="hljs-title">ReadTextFromStream</span>(<span class="hljs-params"><span class="hljs-keyword">byte</span>[] byteData</span>)</span>
        {
            <span class="hljs-keyword">try</span>
            {
                HttpClient client = <span class="hljs-keyword">new</span> HttpClient();
                client.DefaultRequestHeaders.Add(<span class="hljs-string">"Ocp-Apim-Subscription-Key"</span>, subscriptionKey);
                <span class="hljs-keyword">string</span> requestParameters = <span class="hljs-string">"language=unk&amp;detectOrientation=true"</span>;
                <span class="hljs-keyword">string</span> uri = uriBase + <span class="hljs-string">"?"</span> + requestParameters;
                HttpResponseMessage response;

                <span class="hljs-keyword">using</span> (ByteArrayContent content = <span class="hljs-keyword">new</span> ByteArrayContent(byteData))
                {
                    content.Headers.ContentType = <span class="hljs-keyword">new</span> MediaTypeHeaderValue(<span class="hljs-string">"application/octet-stream"</span>);
                    response = <span class="hljs-keyword">await</span> client.PostAsync(uri, content);
                }

                <span class="hljs-keyword">string</span> contentString = <span class="hljs-keyword">await</span> response.Content.ReadAsStringAsync();
                <span class="hljs-keyword">string</span> result = JToken.Parse(contentString).ToString();
                <span class="hljs-keyword">return</span> result;
            }
            <span class="hljs-keyword">catch</span> (Exception e)
            {
                <span class="hljs-keyword">return</span> e.Message;
            }
        }

        [<span class="hljs-meta">HttpGet</span>]
        <span class="hljs-keyword">public</span> <span class="hljs-keyword">async</span> Task&lt;List&lt;AvailableLanguageDTO&gt;&gt; GetAvailableLanguages()
        {
            <span class="hljs-keyword">string</span> endpoint = <span class="hljs-string">"https://api.cognitive.microsofttranslator.com/languages?api-version=3.0&amp;scope=translation"</span>;
            <span class="hljs-keyword">var</span> client = <span class="hljs-keyword">new</span> HttpClient();
            <span class="hljs-keyword">using</span> (<span class="hljs-keyword">var</span> request = <span class="hljs-keyword">new</span> HttpRequestMessage())
            {
                request.Method = HttpMethod.Get;
                request.RequestUri = <span class="hljs-keyword">new</span> Uri(endpoint);
                <span class="hljs-keyword">var</span> response = <span class="hljs-keyword">await</span> client.SendAsync(request).ConfigureAwait(<span class="hljs-literal">false</span>);
                <span class="hljs-keyword">string</span> result = <span class="hljs-keyword">await</span> response.Content.ReadAsStringAsync();

                AvailableLanguage deserializedOutput = JsonConvert.DeserializeObject&lt;AvailableLanguage&gt;(result);

                List&lt;AvailableLanguageDTO&gt; availableLanguage = <span class="hljs-keyword">new</span> List&lt;AvailableLanguageDTO&gt;();

                <span class="hljs-keyword">foreach</span> (KeyValuePair&lt;<span class="hljs-keyword">string</span>, LanguageDetails&gt; translation <span class="hljs-keyword">in</span> deserializedOutput.Translation)
                {
                    AvailableLanguageDTO language = <span class="hljs-keyword">new</span> AvailableLanguageDTO();
                    language.LanguageID = translation.Key;
                    language.LanguageName = translation.Value.Name;

                    availableLanguage.Add(language);
                }
                <span class="hljs-keyword">return</span> availableLanguage;
            }
        }
    }
}
</code></pre>
<p>In the constructor of the class, we have initialized the key and the endpoint URL for the OCR API.</p>
<p>The <code>Post</code> method will receive the image data as a file collection in the request body and return an object of type <code>OcrResultDTO</code>. We will convert the image data to a byte array and invoke the <code>ReadTextFromStream</code> method. We will deserialize the response into an object of type <code>OcrResult</code>. We will then build the detected text by iterating over each <code>OcrLine</code> and appending its <code>OcrWord</code> objects. If the OCR API reports the language as <code>unk</code>, we return a message saying that the language is not supported.</p>
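<p>The word-joining step can be sketched in isolation. The following TypeScript is a minimal illustration, not the article's implementation: the interface names and the <code>flattenOcrText</code> helper are hypothetical, and the sample object only mimics the region, line, and word nesting of the OCR JSON. It walks every region, joins the words of each line with a space, and separates lines with a newline.</p>

```typescript
// Minimal shapes for the parts of the OCR response used here (assumed names).
interface OcrWord { text: string; }
interface OcrLine { words: OcrWord[]; }
interface OcrRegion { lines: OcrLine[]; }
interface OcrResponse { language: string; regions: OcrRegion[]; }

// Hypothetical helper: turns the nested OCR response into plain text.
function flattenOcrText(result: OcrResponse): string {
  // "unk" is the code the OCR API uses when it cannot identify the language.
  if (result.language === 'unk') {
    return 'This language is not supported.';
  }
  const lines: string[] = [];
  for (const region of result.regions) {
    for (const line of region.lines) {
      // Words in a line are joined with single spaces.
      lines.push(line.words.map(word => word.text).join(' '));
    }
  }
  return lines.join('\n');
}

// Made-up sample data with one region and two lines.
const sample: OcrResponse = {
  language: 'en',
  regions: [
    {
      lines: [
        { words: [{ text: 'Hello' }, { text: 'world' }] },
        { words: [{ text: 'from' }, { text: 'Azure' }] }
      ]
    }
  ]
};

console.log(flattenOcrText(sample));
```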
<p>Inside the <code>ReadTextFromStream</code> method, we will create an <code>HttpClient</code> and pass the subscription key in the <code>Ocp-Apim-Subscription-Key</code> request header. We will wrap the image bytes in a <code>ByteArrayContent</code> with the <code>application/octet-stream</code> content type and send them to the OCR endpoint as a Post request. The OCR API will return a JSON object containing each word from the image as well as the detected language of the text.</p>
<p>The <code>GetAvailableLanguages</code> method will return the list of all the languages supported by the Translate Text API. We will set the request URI and create an <code>HttpRequestMessage</code>, which will be a Get request. The response is a JSON object, which will be deserialized to an object of type <code>AvailableLanguage</code>.</p>
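<p>To make the shape of the OCR request concrete, here is a small TypeScript sketch that only assembles the URL and headers and does not send anything. The <code>buildOcrRequest</code> helper and the <code>OcrRequest</code> type are hypothetical names, and the endpoint and key in the usage line are placeholders. The path, query string, and header names are the ones used by the controller code.</p>

```typescript
// Hypothetical description of an OCR call: where to send it and with which headers.
interface OcrRequest {
  url: string;
  headers: { [name: string]: string };
}

// Assembles the request details from the resource endpoint and key.
function buildOcrRequest(endpoint: string, subscriptionKey: string): OcrRequest {
  // Accept the endpoint with or without a trailing slash.
  const hasSlash = endpoint.charAt(endpoint.length - 1) === '/';
  const base = hasSlash ? endpoint : endpoint + '/';
  // language=unk asks the service to detect the language itself.
  const query = ['language=unk', 'detectOrientation=true'].join('&');
  return {
    url: base + 'vision/v2.1/ocr?' + query,
    headers: {
      'Ocp-Apim-Subscription-Key': subscriptionKey,
      'Content-Type': 'application/octet-stream'
    }
  };
}

// Placeholder values; substitute the endpoint and key of your own resource.
const request = buildOcrRequest('https://example.cognitiveservices.azure.com', 'YOUR_KEY');
console.log(request.url);
```

<p>Keeping the key on the server, as the controller does, means the browser never sees it.</p>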
<h3 id="heading-why-do-we-need-to-fetch-the-list-of-supported-languages"><strong>Why do we need to fetch the list of supported languages?</strong></h3>
<p>The OCR API returns the language code (e.g. en for English, de for German, etc.) of the detected language. But we cannot display the language code on the UI as it is not user-friendly. Therefore, we need a dictionary to look up the language name corresponding to the language code.</p>
<p>The Azure Computer Vision OCR API supports 25 languages. For the full list, see the <a target="_blank" href="https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/language-support">supported languages</a> page. These languages are a subset of the languages supported by the Azure Translate Text API. </p>
<p>Since there is no dedicated API endpoint to fetch the list of languages supported by OCR API, we are using the Translate Text API endpoint to fetch the list of languages. We will create the language lookup dictionary using the JSON response from this API call and filter the result based on the language code returned by the OCR API.</p>
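<p>The lookup idea can be sketched as follows. This is an illustrative TypeScript example under stated assumptions: the helper names <code>toAvailableLanguages</code> and <code>languageNameFor</code> are hypothetical, the sample response contains only two made-up entries in the shape of the Translate Text languages response, and the <code>'unknown'</code> fallback is a choice made for this sketch.</p>

```typescript
// Shape of one entry in the "translation" section of the languages response.
interface LanguageDetails { name: string; nativeName: string; dir: string; }
interface LanguagesResponse { translation: { [code: string]: LanguageDetails }; }
// Same fields as the AvailableLanguageDTO returned to the client.
interface AvailableLanguage { languageID: string; languageName: string; }

// Flattens the keyed response into a list of code and name pairs.
function toAvailableLanguages(response: LanguagesResponse): AvailableLanguage[] {
  return Object.keys(response.translation).map(code => ({
    languageID: code,
    languageName: response.translation[code].name
  }));
}

// Resolves a language code from the OCR result to a display name.
function languageNameFor(code: string, languages: AvailableLanguage[]): string {
  for (const language of languages) {
    if (language.languageID === code) {
      return language.languageName;
    }
  }
  // Assumed fallback for codes that are not in the list, such as "unk".
  return 'unknown';
}

// Two sample entries only; the real response contains many more.
const sampleResponse: LanguagesResponse = {
  translation: {
    en: { name: 'English', nativeName: 'English', dir: 'ltr' },
    de: { name: 'German', nativeName: 'Deutsch', dir: 'ltr' }
  }
};

const languages = toAvailableLanguages(sampleResponse);
console.log(languageNameFor('de', languages));
```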
<h2 id="heading-working-on-the-client-side-of-the-application">Working on the Client side of the application</h2>
<p>The code for the client-side is available in the ClientApp folder. We will use Angular CLI to work with the client code.</p>
<blockquote>
<p>Using Angular CLI is not mandatory. I am using Angular CLI here as it is user-friendly and easy to use. If you don’t want to use CLI then you can create the files for components and services manually.</p>
</blockquote>
<p>Navigate to the ngComputerVision\ClientApp folder on your machine and open a command window. We will execute all our Angular CLI commands in this window.</p>
<h2 id="heading-create-the-client-side-models">Create the client-side models</h2>
<p>Create a folder called models inside the <code>ClientApp\src\app</code> folder. Now we will create a file <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/ClientApp/src/app/models/availablelanguage.ts">availablelanguage.ts</a> in the models folder. Put the following code in it.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> AvailableLanguage {
    languageID: <span class="hljs-built_in">string</span>;
    languageName: <span class="hljs-built_in">string</span>;
}
</code></pre>
<p>Similarly, create another file inside the models folder called <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/ClientApp/src/app/models/ocrresult.ts">ocrresult.ts</a>. Put the following code in it.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> OcrResult {
    language: <span class="hljs-built_in">string</span>;
    detectedText: <span class="hljs-built_in">string</span>;
}
</code></pre>
<p>You can observe that both these classes have the same definition as the DTO classes we created on the server-side. This will allow us to bind the data returned from the server directly to our models.</p>
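<p>The binding works because ASP.NET Core serializes the DTO properties with camel-cased names by default, so <code>DetectedText</code> arrives as <code>detectedText</code>. The TypeScript sketch below shows the idea on a hand-written JSON string. The <code>isOcrResult</code> and <code>parseOcrResult</code> helpers are hypothetical additions for illustration, and an interface is used here instead of the class so that the snippet compiles on its own under strict settings.</p>

```typescript
// Same fields as the OcrResult model, declared as an interface for this sketch.
interface OcrResult {
  language: string;
  detectedText: string;
}

// Hypothetical guard: checks that a parsed value has the two expected string fields.
function isOcrResult(value: any): value is OcrResult {
  return value !== null
    && typeof value === 'object'
    && typeof value.language === 'string'
    && typeof value.detectedText === 'string';
}

// Parses a JSON payload and rejects anything that does not match the model.
function parseOcrResult(json: string): OcrResult {
  const parsed = JSON.parse(json);
  if (!isOcrResult(parsed)) {
    throw new Error('Unexpected OCR result shape');
  }
  return parsed;
}

// Hand-written payload in the camel-cased form described above.
const result = parseOcrResult('{"language":"en","detectedText":"Hello world"}');
console.log(result.language + ': ' + result.detectedText);
```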
<h2 id="heading-create-the-computervision-service">Create the Computervision Service</h2>
<p>We will create an Angular service which will invoke the Web API endpoints, convert the Web API response to JSON and pass it to our component. Run the following command.</p>
<pre><code>ng g s services\Computervision
</code></pre><p>This command will create a folder named services and then create the following two files inside it.</p>
<ul>
<li>computervision.service.ts — the service class file.</li>
<li>computervision.service.spec.ts — the unit test file for service.</li>
</ul>
<p>Open <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/ClientApp/src/app/services/computervision.service.ts">computervision.service.ts</a> file and put the following code inside it.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Injectable } <span class="hljs-keyword">from</span> <span class="hljs-string">'@angular/core'</span>;
<span class="hljs-keyword">import</span> { HttpClient } <span class="hljs-keyword">from</span> <span class="hljs-string">'@angular/common/http'</span>;

<span class="hljs-meta">@Injectable</span>({
  providedIn: <span class="hljs-string">'root'</span>
})
<span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> ComputervisionService {

  baseURL: <span class="hljs-built_in">string</span>;

  <span class="hljs-keyword">constructor</span>(<span class="hljs-params"><span class="hljs-keyword">private</span> http: HttpClient</span>) {
    <span class="hljs-built_in">this</span>.baseURL = <span class="hljs-string">'/api/OCR'</span>;
  }

  getAvailableLanguage() {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.http.get(<span class="hljs-built_in">this</span>.baseURL)
      .pipe(<span class="hljs-function"><span class="hljs-params">response</span> =&gt;</span> {
        <span class="hljs-keyword">return</span> response;
      });
  }

  getTextFromImage(image: FormData) {
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">this</span>.http.post(<span class="hljs-built_in">this</span>.baseURL, image)
      .pipe(<span class="hljs-function"><span class="hljs-params">response</span> =&gt;</span> {
        <span class="hljs-keyword">return</span> response;
      });
  }
}
</code></pre>
<p>We have defined a variable baseURL which will hold the endpoint URL of our API. We will initialize the baseURL in the constructor and set it to the endpoint of the <code>OCRController</code>.</p>
<p>The <code>getAvailableLanguage</code> method will send a Get request to the <code>GetAvailableLanguages</code> method of the <code>OCRController</code> to fetch the list of supported languages for OCR.</p>
<p>The <code>getTextFromImage</code> method will send a Post request to the <code>OCRController</code> with a parameter of type <code>FormData</code>. It will fetch the text detected in the image and the language code of that text.</p>
<h3 id="heading-create-the-ocr-component"><strong>Create the Ocr component</strong></h3>
<p>Run the following command in the command prompt to create the <code>OcrComponent</code>.</p>
<pre><code>ng g c ocr --<span class="hljs-built_in">module</span> app
</code></pre><p>The <code>--module</code> flag will ensure that this component is registered in <code>app.module.ts</code>.</p>
<p>Open <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/ClientApp/src/app/ocr/ocr.component.html">ocr.component.html</a> and put the following code in it.</p>
<pre><code class="lang-html"><span class="hljs-tag">&lt;<span class="hljs-name">h2</span>&gt;</span>Optical Character Recognition (OCR) using Angular and Azure Computer Vision Cognitive Services<span class="hljs-tag">&lt;/<span class="hljs-name">h2</span>&gt;</span>

<span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"row"</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"col-md-5"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">textarea</span> <span class="hljs-attr">disabled</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"form-control"</span> <span class="hljs-attr">rows</span>=<span class="hljs-string">"10"</span> <span class="hljs-attr">cols</span>=<span class="hljs-string">"15"</span>&gt;</span>{{ocrResult?.detectedText}}<span class="hljs-tag">&lt;/<span class="hljs-name">textarea</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">hr</span> /&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"row"</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"col-sm-5"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">label</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">strong</span>&gt;</span> Detected Language :<span class="hljs-tag">&lt;/<span class="hljs-name">strong</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">label</span>&gt;</span>
      <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"col-sm-6"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">input</span> <span class="hljs-attr">disabled</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"text"</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"form-control"</span> <span class="hljs-attr">value</span>=<span class="hljs-string">{{DetectedTextLanguage}}</span> /&gt;</span>
      <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
  <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"col-md-5"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"image-container"</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">img</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"preview-image"</span> <span class="hljs-attr">src</span>=<span class="hljs-string">{{imagePreview}}</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"file"</span> (<span class="hljs-attr">change</span>)=<span class="hljs-string">"uploadImage($event)"</span> /&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>{{status}}<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">hr</span> /&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">button</span> [<span class="hljs-attr">disabled</span>]=<span class="hljs-string">"loading"</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"btn btn-primary btn-lg"</span> (<span class="hljs-attr">click</span>)=<span class="hljs-string">"GetText()"</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">span</span> *<span class="hljs-attr">ngIf</span>=<span class="hljs-string">"loading"</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"spinner-border spinner-border-sm mr-1"</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">span</span>&gt;</span>Extract Text
    <span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span>
  <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
</code></pre>
<p>We have defined a text area to display the detected text and a text box for displaying the detected language. We have defined a file upload control which will allow us to upload an image. After uploading the image, the preview of the image will be displayed using an <code>&lt;img&gt;</code> element.</p>
<p>Open <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/ClientApp/src/app/ocr/ocr.component.ts">ocr.component.ts</a> and put the following code in it.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Component, OnInit } <span class="hljs-keyword">from</span> <span class="hljs-string">'@angular/core'</span>;
<span class="hljs-keyword">import</span> { ComputervisionService } <span class="hljs-keyword">from</span> <span class="hljs-string">'../services/computervision.service'</span>;
<span class="hljs-keyword">import</span> { AvailableLanguage } <span class="hljs-keyword">from</span> <span class="hljs-string">'../models/availablelanguage'</span>;
<span class="hljs-keyword">import</span> { OcrResult } <span class="hljs-keyword">from</span> <span class="hljs-string">'../models/ocrresult'</span>;

<span class="hljs-meta">@Component</span>({
  selector: <span class="hljs-string">'app-ocr'</span>,
  templateUrl: <span class="hljs-string">'./ocr.component.html'</span>,
  styleUrls: [<span class="hljs-string">'./ocr.component.css'</span>]
})
<span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> OcrComponent <span class="hljs-keyword">implements</span> OnInit {

  loading = <span class="hljs-literal">false</span>;
  imageFile;
  imagePreview;
  imageData = <span class="hljs-keyword">new</span> FormData();
  availableLanguage: AvailableLanguage[];
  DetectedTextLanguage: <span class="hljs-built_in">string</span>;
  ocrResult: OcrResult;
  DefaultStatus: <span class="hljs-built_in">string</span>;
  status: <span class="hljs-built_in">string</span>;
  maxFileSize: <span class="hljs-built_in">number</span>;
  isValidFile = <span class="hljs-literal">true</span>;

  <span class="hljs-keyword">constructor</span>(<span class="hljs-params"><span class="hljs-keyword">private</span> computervisionService: ComputervisionService</span>) {
    <span class="hljs-built_in">this</span>.DefaultStatus = <span class="hljs-string">"Maximum size allowed for the image is 4 MB"</span>;
    <span class="hljs-built_in">this</span>.status = <span class="hljs-built_in">this</span>.DefaultStatus;
    <span class="hljs-built_in">this</span>.maxFileSize = <span class="hljs-number">4</span> * <span class="hljs-number">1024</span> * <span class="hljs-number">1024</span>; <span class="hljs-comment">// 4MB</span>
  }

  ngOnInit() {
    <span class="hljs-built_in">this</span>.computervisionService.getAvailableLanguage().subscribe(
      <span class="hljs-function">(<span class="hljs-params">result: AvailableLanguage[]</span>) =&gt;</span> <span class="hljs-built_in">this</span>.availableLanguage = result
    );
  }

  uploadImage(event) {
    <span class="hljs-built_in">this</span>.imageFile = event.target.files[<span class="hljs-number">0</span>];
    <span class="hljs-keyword">if</span> (<span class="hljs-built_in">this</span>.imageFile.size &gt; <span class="hljs-built_in">this</span>.maxFileSize) {
      <span class="hljs-built_in">this</span>.status = <span class="hljs-string">`The file size is <span class="hljs-subst">${<span class="hljs-built_in">this</span>.imageFile.size}</span> bytes, this is more than the allowed limit of <span class="hljs-subst">${<span class="hljs-built_in">this</span>.maxFileSize}</span> bytes.`</span>;
      <span class="hljs-built_in">this</span>.isValidFile = <span class="hljs-literal">false</span>;
    } <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (<span class="hljs-built_in">this</span>.imageFile.type.indexOf(<span class="hljs-string">'image'</span>) == <span class="hljs-number">-1</span>) {
      <span class="hljs-built_in">this</span>.status = <span class="hljs-string">"Please upload a valid image file"</span>;
      <span class="hljs-built_in">this</span>.isValidFile = <span class="hljs-literal">false</span>;
    } <span class="hljs-keyword">else</span> {
      <span class="hljs-keyword">const</span> reader = <span class="hljs-keyword">new</span> FileReader();
      reader.readAsDataURL(event.target.files[<span class="hljs-number">0</span>]);
      reader.onload = <span class="hljs-function">() =&gt;</span> {
        <span class="hljs-built_in">this</span>.imagePreview = reader.result;
      };
      <span class="hljs-built_in">this</span>.status = <span class="hljs-built_in">this</span>.DefaultStatus;
      <span class="hljs-built_in">this</span>.isValidFile = <span class="hljs-literal">true</span>;
    }
  }

  GetText() {
    <span class="hljs-keyword">if</span> (<span class="hljs-built_in">this</span>.isValidFile) {

      <span class="hljs-built_in">this</span>.loading = <span class="hljs-literal">true</span>;
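      <span class="hljs-comment">// set() replaces any file added earlier, so repeated clicks send only the latest image</span>
      <span class="hljs-built_in">this</span>.imageData.set(<span class="hljs-string">'imageFile'</span>, <span class="hljs-built_in">this</span>.imageFile);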

      <span class="hljs-built_in">this</span>.computervisionService.getTextFromImage(<span class="hljs-built_in">this</span>.imageData).subscribe(
        <span class="hljs-function">(<span class="hljs-params">result: OcrResult</span>) =&gt;</span> {
          <span class="hljs-built_in">this</span>.ocrResult = result;
          <span class="hljs-keyword">if</span> (<span class="hljs-built_in">this</span>.availableLanguage.find(<span class="hljs-function"><span class="hljs-params">x</span> =&gt;</span> x.languageID === <span class="hljs-built_in">this</span>.ocrResult.language)) {
            <span class="hljs-built_in">this</span>.DetectedTextLanguage = <span class="hljs-built_in">this</span>.availableLanguage.find(<span class="hljs-function"><span class="hljs-params">x</span> =&gt;</span> x.languageID === <span class="hljs-built_in">this</span>.ocrResult.language).languageName;
          } <span class="hljs-keyword">else</span> {
            <span class="hljs-built_in">this</span>.DetectedTextLanguage = <span class="hljs-string">"unknown"</span>;
          }
          <span class="hljs-built_in">this</span>.loading = <span class="hljs-literal">false</span>;
        });
    }
  }
}
</code></pre>
<p>We will inject the <code>ComputervisionService</code> in the constructor of the <code>OcrComponent</code> and set a message and the value for the max image size allowed inside the constructor.</p>
<p>We will invoke the <code>getAvailableLanguage</code> method of our service in the <code>ngOnInit</code> and store the result in an array of type <code>AvailableLanguage</code>.</p>
<p>The <code>uploadImage</code> method will be invoked upon uploading an image. We will check if the uploaded file is a valid image and within the allowed size limit. We will process the image data using a <code>FileReader</code> object. The <code>readAsDataURL</code> method will read the contents of the uploaded file. </p>
<p>Upon successful completion of the read operation, the <code>reader.onload</code> event will be triggered. The value of <code>imagePreview</code> will be set to the result returned by the <code>FileReader</code> object. Because we called <code>readAsDataURL</code>, that result is a string containing the file's contents as a base64-encoded data URL, which the <code>&lt;img&gt;</code> element can use directly as its <code>src</code>.</p>
<p>Inside the <code>GetText</code> method, we will add the image file to a variable of type <code>FormData</code>. We will invoke the <code>getTextFromImage</code> method of the service and bind the result to an object of type <code>OcrResult</code>. We will search for the language name in the array <code>availableLanguage</code>, based on the language code returned from the service. If the language code is not found, we will set the language as unknown.</p>
<p>We will add the styling for the image preview and its container in <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/ClientApp/src/app/ocr/ocr.component.css">ocr.component.css</a> as shown below.</p>
<pre><code class="lang-css"><span class="hljs-selector-class">.preview-image</span> {
    <span class="hljs-attribute">max-height</span>: <span class="hljs-number">300px</span>;
    <span class="hljs-attribute">max-width</span>: <span class="hljs-number">300px</span>;
}

<span class="hljs-selector-class">.image-container</span>{
  <span class="hljs-attribute">display</span>: flex;
  <span class="hljs-attribute">padding</span>: <span class="hljs-number">15px</span>;
  <span class="hljs-attribute">align-content</span>: center;
  <span class="hljs-attribute">align-items</span>: center;
  <span class="hljs-attribute">justify-content</span>: center;
  <span class="hljs-attribute">border</span>: <span class="hljs-number">2px</span> dashed skyblue;
}
</code></pre>
<h2 id="heading-adding-the-links-in-nav-menu">Adding the links in Nav Menu</h2>
<p>We will add the navigation links for our components in the nav menu. Open <a target="_blank" href="https://github.com/AnkitSharma-007/Angular-Computer-Vision-Azure-Cognitive-Services/blob/master/ngComputerVision/ClientApp/src/app/nav-menu/nav-menu.component.html#L14-L16">nav-menu.component.html</a> and remove the links for Counter and Fetch data components. Add the following lines in the list of navigation links.</p>
<pre><code class="lang-html"><span class="hljs-tag">&lt;<span class="hljs-name">li</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"nav-item"</span> [<span class="hljs-attr">routerLinkActive</span>]=<span class="hljs-string">"['link-active']"</span>&gt;</span>
 <span class="hljs-tag">&lt;<span class="hljs-name">a</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"nav-link text-dark"</span> <span class="hljs-attr">routerLink</span>=<span class="hljs-string">'/computer-vision-ocr'</span>&gt;</span>Computer Vision<span class="hljs-tag">&lt;/<span class="hljs-name">a</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">li</span>&gt;</span>
</code></pre>
<h2 id="heading-execution-demo">Execution Demo</h2>
<p>Press F5 to launch the application. Click on the Computer Vision button on the nav menu at the top. You can upload an image and extract the text from the image as shown in the image below.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/05/ngComputerVision.gif" alt="Image" width="600" height="400" loading="lazy">
<em>Execution Demo</em></p>
<h2 id="heading-summary">Summary</h2>
<p>We have created an optical character recognition (OCR) application using Angular and the Computer Vision Azure Cognitive Service. The application extracts the printed text from the uploaded image and recognizes the language of the text. It uses the OCR API of Computer Vision, which can recognize text in 25 languages.</p>
<p>I just released a free eBook on Angular and Firebase. You can download the free book from <a target="_blank" href="https://www.c-sharpcorner.com/ebooks/build-a-full-stack-web-application-using-angular-and-firebase">Build a Full-Stack Web Application Using Angular &amp; Firebase</a></p>
<h2 id="heading-see-also">See Also</h2>
<ul>
<li><a target="_blank" href="https://ankitsharmablogs.com/template-driven-form-validation-in-angular/">Template-Driven Form Validation In Angular</a></li>
<li><a target="_blank" href="https://ankitsharmablogs.com/reactive-form-validation-in-angular/">Reactive Form Validation In Angular</a></li>
<li><a target="_blank" href="https://ankitsharmablogs.com/continuous-deployment-for-angular-app-using-heroku-and-github/">Continuous Deployment For Angular App Using Heroku And GitHub</a></li>
<li><a target="_blank" href="https://ankitsharmablogs.com/policy-based-authorization-in-angular-using-jwt/">Policy-Based Authorization In Angular Using JWT</a></li>
<li><a target="_blank" href="https://ankitsharmablogs.com/optical-character-reader-using-blazor-and-computer-vision/">Optical Character Reader Using Blazor And Computer Vision</a></li>
</ul>
<p>If you like the article, share it with your friends. You can also connect with me on <a target="_blank" href="https://twitter.com/ankitsharma_007">Twitter</a> and <a target="_blank" href="https://www.linkedin.com/in/ankitsharma-007/">LinkedIn</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to use image preprocessing to improve the accuracy of Tesseract ]]>
                </title>
                <description>
                    <![CDATA[ By Berk Kaan Kuguoglu Previously, on How to get started with Tesseract, I gave you a practical quick-start tutorial on Tesseract using Python. It is a pretty simple overview, but it should help you get started with Tesseract and clear some hurdles th... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/getting-started-with-tesseract-part-ii-f7f9a0899b3f/</link>
                <guid isPermaLink="false">66c34b625ced6d98e4bd32e0</guid>
                
                    <category>
                        <![CDATA[ OCR  ]]>
                    </category>
                
                    <category>
                        <![CDATA[ opencv ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ technology ]]>
                    </category>
                
                    <category>
                        <![CDATA[ tesseract ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Wed, 06 Jun 2018 13:25:41 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*iZwvUAtgcOAVgjO23Hd2ig.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Berk Kaan Kuguoglu</p>
<p>Previously, on <a target="_blank" href="https://medium.com/@bkaankuguoglu/getting-started-with-tesseract-part-i-2a6a6b1cf75e">How to get started with Tesseract</a>, I gave you a practical quick-start tutorial on Tesseract using Python. It is a pretty simple overview, but it should help you get started with Tesseract and clear some hurdles that I faced when I was in your shoes. Now, I’m keen on showing you a few more tricks and stuff you can do with Tesseract and OpenCV to improve your overall accuracy.</p>
<h3 id="heading-where-did-we-leave-off-last-time">Where did we leave off last time?</h3>
<p>In <a target="_blank" href="https://medium.com/@bkaankuguoglu/getting-started-with-tesseract-part-i-2a6a6b1cf75e">the previous story</a>, I didn’t bother going into details for the most part. But if you liked the first story, here comes the sequel! So where did we leave off?</p>
<p>Ah, we had a brief overview of rescaling, noise removal, and binarization. Now, it’s time to get down to details and show you a few settings you can play with.</p>
<h3 id="heading-rescaling">Rescaling</h3>
<p>The images that are rescaled are either shrunk or enlarged. If you’re interested in shrinking your image, <strong>INTER_AREA</strong> is the way to go for you. (Btw, the parameters <em>fx</em> and <em>fy</em> denote the scaling factor in the function below.)</p>
<pre><code>img = cv2.resize(img, None, fx=<span class="hljs-number">0.5</span>, fy=<span class="hljs-number">0.5</span>, interpolation=cv2.INTER_AREA)
</code></pre><p>More often, though, you will need to scale your image up so that small characters can be recognized. In this case, <strong>INTER_CUBIC</strong> generally performs better than the alternatives, though it is also slower.</p>
<pre><code>img = cv2.resize(img, None, fx=<span class="hljs-number">2</span>, fy=<span class="hljs-number">2</span>, interpolation=cv2.INTER_CUBIC)
</code></pre><p>If you’d like to trade off some of your image quality for faster performance, you may want to try <strong>INTER_LINEAR</strong> for enlarging images.</p>
<pre><code>img = cv2.resize(img, None, fx=<span class="hljs-number">2</span>, fy=<span class="hljs-number">2</span>, interpolation=cv2.INTER_LINEAR)
</code></pre><h3 id="heading-blurring"><strong>Blurring</strong></h3>
<p>It’s worth mentioning that there are a few blur filters available in the <a target="_blank" href="https://docs.opencv.org/3.4.0/d4/d13/tutorial_py_filtering.html">OpenCV library</a>. Image blurring is usually achieved by convolving the image with a low-pass filter kernel. While filters are usually used to blur the image or to reduce noise, there are a few differences between them.</p>
<h4 id="heading-1-averaging">1. Averaging</h4>
<p>Averaging convolves the image with a normalized box filter: it takes the average of all the pixels under the kernel area and replaces the central element with that average. It’s pretty self-explanatory, I guess.</p>
<pre><code>img = cv2.blur(img, (<span class="hljs-number">5</span>, <span class="hljs-number">5</span>))
</code></pre><h4 id="heading-2-gaussian-blurring">2. Gaussian blurring</h4>
<p>This works in a similar fashion to Averaging, but it uses a Gaussian kernel, instead of a normalized box filter, for convolution. Here, the dimensions of the kernel and the standard deviations in both directions can be determined independently. Gaussian blurring is very useful for removing (guess what?) gaussian noise from the image. However, gaussian blurring does not preserve the edges in the input.</p>
<pre><code>img = cv2.GaussianBlur(img, (<span class="hljs-number">5</span>, <span class="hljs-number">5</span>), <span class="hljs-number">0</span>)
</code></pre><h4 id="heading-3-median-blurring">3. Median blurring</h4>
<p>The central element in the kernel area is replaced with the median of all the pixels under the kernel. Particularly, this outperforms other blurring methods in removing salt-and-pepper noise in the images.</p>
<p>Median blurring is a non-linear filter. Unlike linear filters, which compute a weighted sum of the neighborhood, median blurring replaces each pixel value with the median of the values in its neighborhood. So, median blurring preserves edges, as the median must be the value of one of the neighboring pixels rather than a blend of them.</p>
<pre><code>img = cv2.medianBlur(img, <span class="hljs-number">3</span>)
</code></pre><h4 id="heading-4-bilateral-filtering">4. Bilateral filtering</h4>
<p>Speaking of keeping edges sharp, bilateral filtering is quite useful for removing the noise without smoothing the edges. Similar to gaussian blurring, bilateral filtering also uses a gaussian filter to find the gaussian weighted average in the neighborhood. However, it also takes pixel difference into account while blurring the nearby pixels.</p>
<p>Thus, it ensures only those pixels with similar intensity to the central pixel are blurred, whereas the pixels with distinct pixel values are not blurred. In doing so, the regions that have larger intensity variation, the so-called edges, are preserved.</p>
<pre><code>img = cv2.bilateralFilter(img, <span class="hljs-number">9</span>, <span class="hljs-number">75</span>, <span class="hljs-number">75</span>)
</code></pre><p>Overall, if you are interested in preserving the edges, go with median blurring or bilateral filtering. On the other hand, gaussian blurring is likely to be faster than median blurring. Due to its computational complexity, bilateral filtering is the slowest of all methods.</p>
<p>Again, you do you.</p>
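<p>To make the edge-preservation point concrete, here is a minimal pure-Python sketch on a single row of pixels. It is not how OpenCV implements its filters. The helper names (<code>box_blur_1d</code>, <code>median_blur_1d</code>) and the sample row are made up for illustration, and the border values are simply replicated.</p>

```python
from statistics import median


def _windows(signal, k):
    """Return the k-wide neighborhood of every sample, replicating the border values."""
    r = k // 2
    padded = [signal[0]] * r + list(signal) + [signal[-1]] * r
    return [padded[i:i + k] for i in range(len(signal))]


def box_blur_1d(signal, k=3):
    """Averaging: replace each sample with the mean of its neighborhood."""
    return [sum(window) / k for window in _windows(signal, k)]


def median_blur_1d(signal, k=3):
    """Median blurring: replace each sample with the median of its neighborhood."""
    return [median(window) for window in _windows(signal, k)]


# A dark region with one "salt" pixel (255), then a sharp edge into a bright region.
row = [0, 0, 255, 0, 0, 200, 200, 200]

averaged = box_blur_1d(row)
medianed = median_blur_1d(row)

print(averaged)  # the spike is smeared over its neighbors and the edge is softened
print(medianed)  # the spike is gone and the edge stays a hard step
```

<p>In this toy row, the median filter removes the isolated bright pixel and leaves the step between 0 and 200 untouched, while averaging spreads the spike into its neighbors and turns the step into a ramp.</p>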
<h3 id="heading-image-thresholding">Image Thresholding</h3>
<p>There’s not a single image thresholding method that fits all types of documents. In reality, all filters perform differently on varying images. For instance, while some filters successfully binarize some images, they may fail to binarize others. Likewise, some filters may work well with those images that other filters cannot binarize well.</p>
<p>I’ll try to cover the basics here, though I do recommend that you read the official documentation of <a target="_blank" href="https://docs.opencv.org/3.4.0/d7/d4d/tutorial_py_thresholding.html">OpenCV on Image Thresholding</a> for more information and the theory behind it.</p>
<h4 id="heading-1-simple-threshold">1. Simple Threshold</h4>
<p>You might recall a friend of yours giving you some advice about your life by saying “things are not always black and white”. Well, for a simple threshold, things are pretty straightforward.</p>
<pre><code>cv2.threshold(img, <span class="hljs-number">127</span>, <span class="hljs-number">255</span>, cv2.THRESH_BINARY)
</code></pre><p>First, you pick a threshold value, say 127. With THRESH_BINARY, if the pixel value is greater than the threshold, it is set to the maximum value passed as the third parameter (255 here, that is, white). Otherwise, it is set to 0 (black). OpenCV provides us with different types of thresholding methods that can be passed as the fourth parameter. I often use binary threshold for most tasks, but for other thresholding methods you may visit <a target="_blank" href="https://docs.opencv.org/3.4.0/d7/d4d/tutorial_py_thresholding.html">the official documentation.</a></p>
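<p>The rule that THRESH_BINARY applies can be sketched in a few lines of plain Python. This is only an illustration on a nested list of grayscale values, not OpenCV's implementation, and <code>threshold_binary</code> is a hypothetical helper name.</p>

```python
def threshold_binary(image, thresh=127, maxval=255):
    """Set every pixel above `thresh` to `maxval` and every other pixel to 0."""
    return [[maxval if pixel > thresh else 0 for pixel in row] for row in image]


# A tiny 2x3 grayscale "image" with values from 0 (black) to 255 (white).
gray = [
    [10, 127, 128],
    [200, 90, 255],
]

binary = threshold_binary(gray)
print(binary)
```

<p>Note that a pixel exactly equal to the threshold (127 here) is not greater than it, so it ends up black.</p>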
<h4 id="heading-2-adaptive-threshold">2. Adaptive Threshold</h4>
<p>Rather than setting a single global threshold value, we let the algorithm calculate the threshold for small regions of the image. Thus, we end up having various threshold values for different regions of the image, which is great!</p>
<pre><code>cv2.adaptiveThreshold(img, <span class="hljs-number">255</span>, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, <span class="hljs-number">31</span>, <span class="hljs-number">2</span>)
</code></pre><p>There are two adaptive methods for calculating the threshold value. While <strong>ADAPTIVE_THRESH_MEAN_C</strong> uses the mean of the neighborhood area, <strong>ADAPTIVE_THRESH_GAUSSIAN_C</strong> uses a Gaussian-weighted sum of the neighborhood values.</p>
<p>We’ve got two more parameters that determine the size of the neighborhood area and the constant value that is subtracted from the result: the fifth and sixth parameters, respectively.</p>
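<p>Here is a rough pure-Python sketch of the mean variant, to show how the neighborhood size and the subtracted constant interact. It assumes a nested list of grayscale values and replicated borders. The function name <code>adaptive_mean_threshold</code> and the sample image are invented for this example, and OpenCV's own implementation differs in its details.</p>

```python
def adaptive_mean_threshold(image, block_size=3, c=2, maxval=255):
    """Threshold each pixel against the mean of its block_size x block_size
    neighborhood minus the constant c (border pixels are replicated)."""
    r = block_size // 2
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            # Collect the neighborhood, clamping coordinates at the image border.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            local_threshold = sum(window) / len(window) - c
            out_row.append(maxval if image[y][x] > local_threshold else 0)
        result.append(out_row)
    return result


# Left half: dark background (50) with a darker "ink" pixel (20).
# Right half: bright background (200) with a darker "ink" pixel (120).
image = [
    [50, 50, 50, 200, 200, 200],
    [50, 20, 50, 200, 120, 200],
    [50, 50, 50, 200, 200, 200],
]

global_result = [[255 if p > 127 else 0 for p in row] for row in image]
adaptive_result = adaptive_mean_threshold(image, block_size=3, c=2)

print(global_result[1])    # middle row with one global threshold of 127
print(adaptive_result[1])  # middle row with a per-neighborhood threshold
```

<p>With the global threshold, the ink pixel on the left and the background next to it both turn black, so the mark disappears. The adaptive version separates both ink pixels from the background beside them, although pixels right at the border between the dark and bright halves can still flip.</p>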
<h4 id="heading-3-otsus-threshold">3. Otsu’s Threshold</h4>
<p>This method works particularly well with <strong>bimodal images</strong>, that is, images whose histogram has two peaks. In that case, we would want to pick a threshold value between these peaks. That is exactly what Otsu’s Binarization does: it picks that value for us automatically.</p>
<pre><code>cv2.threshold(img, <span class="hljs-number">0</span>, <span class="hljs-number">255</span>, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[<span class="hljs-number">1</span>]
</code></pre><p>It’s pretty useful for some cases. But it may fail to binarize images that are not bimodal. So, please take this filter with a grain of salt.</p>
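<p>For the curious, the idea behind Otsu’s method can be sketched in pure Python: try every possible threshold and keep the one that best separates the two groups of pixels (the largest between-class variance). This is a simplified illustration on a flat list of 8-bit values with a hypothetical function name, not OpenCV's code.</p>

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes the between-class variance
    of the groups `value at or below t` and `value above t`."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of their intensities
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so there is nothing to separate
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t


# A bimodal "image": dark ink around 10 to 12, bright paper around 200 to 202.
pixels = [10] * 50 + [12] * 50 + [200] * 50 + [202] * 50

t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
print(t)
```

<p>Any threshold between the two peaks separates this toy data equally well, and the sketch simply reports the first one it finds. When the histogram has no clear second peak, the best split is much less meaningful, which is why the method can fail on images that are not bimodal.</p>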
<h4 id="heading-types-of-thresholding">Types of thresholding</h4>
<p>You might have already noticed there is a parameter, or in some cases a combination of a few parameters, that is passed as an argument to determine the type of thresholding, such as THRESH_BINARY. I’m not going into detail here, as it is explained clearly in <a target="_blank" href="https://docs.opencv.org/3.4.0/d7/d4d/tutorial_py_thresholding.html">the official documentation</a>.</p>
<h3 id="heading-what-next">What next?</h3>
<p>So far, we’ve discussed some of the techniques of image pre-processing. You might wonder when exactly you’re going to get your hands dirty. Well, the time has come. Before you get back to your favorite Python IDE (mine is <a target="_blank" href="https://www.jetbrains.com/pycharm/">PyCharm</a>, btw), I’m going to show you a few lines of code that will save you some time while trying to find which combination of filters and image manipulations works well with your documents.</p>
<p>Let’s start by defining a switcher function that holds a few combinations of thresholding filters and blurring methods. Once you get the idea, you could also add more filters, incorporating other image pre-processing methods like rescaling into your filter set.</p>
<p>Here I’ve created 20 different combinations of image thresholding methods, blurring methods, and kernel sizes. The switcher function, <em>apply_threshold</em>, takes two arguments, namely an OpenCV image and an integer that denotes the filter. Likewise, since this function returns the OpenCV image as a result, it could easily be integrated into our <em>get_string</em> function from the previous post.</p>
<pre><code>def apply_threshold(img, argument):
    switcher = {
        <span class="hljs-number">1</span>: cv2.threshold(cv2.GaussianBlur(img, (<span class="hljs-number">9</span>, <span class="hljs-number">9</span>), <span class="hljs-number">0</span>), <span class="hljs-number">0</span>, <span class="hljs-number">255</span>, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[<span class="hljs-number">1</span>],
        <span class="hljs-number">2</span>: cv2.threshold(cv2.GaussianBlur(img, (<span class="hljs-number">7</span>, <span class="hljs-number">7</span>), <span class="hljs-number">0</span>), <span class="hljs-number">0</span>, <span class="hljs-number">255</span>, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[<span class="hljs-number">1</span>],
        <span class="hljs-number">3</span>: cv2.threshold(cv2.GaussianBlur(img, (<span class="hljs-number">5</span>, <span class="hljs-number">5</span>), <span class="hljs-number">0</span>), <span class="hljs-number">0</span>, <span class="hljs-number">255</span>, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[<span class="hljs-number">1</span>],
        # ...
        <span class="hljs-number">18</span>: cv2.adaptiveThreshold(cv2.medianBlur(img, <span class="hljs-number">7</span>), <span class="hljs-number">255</span>, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, <span class="hljs-number">31</span>, <span class="hljs-number">2</span>),
        <span class="hljs-number">19</span>: cv2.adaptiveThreshold(cv2.medianBlur(img, <span class="hljs-number">5</span>), <span class="hljs-number">255</span>, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, <span class="hljs-number">31</span>, <span class="hljs-number">2</span>),
        <span class="hljs-number">20</span>: cv2.adaptiveThreshold(cv2.medianBlur(img, <span class="hljs-number">3</span>), <span class="hljs-number">255</span>, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, <span class="hljs-number">31</span>, <span class="hljs-number">2</span>)
    }
    <span class="hljs-keyword">return</span> switcher.get(argument, <span class="hljs-string">"Invalid method"</span>)
</code></pre><p>And, here it comes.</p>
<pre><code>def get_string(img_path, method):
    # Read image using opencv
    img = cv2.imread(img_path)

    # Extract the file name without the file extension
    file_name = os.path.basename(img_path).split(<span class="hljs-string">'.'</span>)[<span class="hljs-number">0</span>]
    file_name = file_name.split()[<span class="hljs-number">0</span>]

    # Create a directory <span class="hljs-keyword">for</span> outputs
    output_path = os.path.join(output_dir, file_name)
    <span class="hljs-keyword">if</span> not os.path.exists(output_path):
        os.makedirs(output_path)

    # Rescale the image, <span class="hljs-keyword">if</span> needed.
    img = cv2.resize(img, None, fx=<span class="hljs-number">1.5</span>, fy=<span class="hljs-number">1.5</span>, interpolation=cv2.INTER_CUBIC)

    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply dilation and erosion to remove some noise
    kernel = np.ones((<span class="hljs-number">1</span>, <span class="hljs-number">1</span>), np.uint8)
    img = cv2.dilate(img, kernel, iterations=<span class="hljs-number">1</span>)
    img = cv2.erode(img, kernel, iterations=<span class="hljs-number">1</span>)

    # Apply threshold to get image <span class="hljs-keyword">with</span> only black and white
    img = apply_threshold(img, method)

    # Save the filtered image <span class="hljs-keyword">in</span> the output directory
    save_path = os.path.join(output_path, file_name + <span class="hljs-string">"_filter_"</span> + str(method) + <span class="hljs-string">".jpg"</span>)
    cv2.imwrite(save_path, img)

    # Recognize text <span class="hljs-keyword">with</span> tesseract <span class="hljs-keyword">for</span> python
    result = pytesseract.image_to_string(img, lang=<span class="hljs-string">"eng"</span>)

    <span class="hljs-keyword">return</span> result
</code></pre><h3 id="heading-last-words">Last words</h3>
<p>Now, all we need to do is write a simple for loop that iterates over the input directory to collect images and applies each filter to the images gathered. I prefer to use <em>glob</em>, or <em>os</em>, for collecting images from directories, and <em>argparse</em> for passing arguments via the terminal, like any other sane person would do.</p>
<p>Here I’ve done pretty much the same thing as in my <a target="_blank" href="https://gist.github.com/bkaankuguoglu/111f9f5e0c30b5f57d7c5338d6dcb6fc">gist</a>, if you’d like to have a look at it. However, feel free to use the tools you feel comfortable with.</p>
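<p>If you would rather not open the gist, the loop described above might look roughly like the sketch below. It is not the code from the gist: the helper names <em>collect_images</em>, <em>run_all_filters</em> and <em>parse_args</em>, the command-line flags, and the default of 20 filters are all assumptions made for illustration. The OCR step is passed in as a function so that the loop does not depend on how that step is implemented.</p>
<pre><code>import argparse
import glob
import os


def collect_images(input_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return the sorted paths of image files directly inside input_dir."""
    candidates = glob.glob(os.path.join(input_dir, "*"))
    return sorted(p for p in candidates
                  if os.path.splitext(p)[1].lower() in extensions)


def run_all_filters(image_paths, methods, ocr):
    """Call ocr(path, method) for every image and method; map each pair to its text."""
    results = {}
    for path in image_paths:
        for method in methods:
            results[(path, method)] = ocr(path, method)
    return results


def parse_args(argv=None):
    # Flag names and defaults are assumptions made for this sketch.
    parser = argparse.ArgumentParser(
        description="Apply every threshold filter to a folder of images.")
    parser.add_argument("--input_dir", required=True,
                        help="folder that holds the input images")
    parser.add_argument("--output_dir", default="output",
                        help="folder for the filtered images")
    parser.add_argument("--methods", type=int, default=20,
                        help="number of filters to try, counted from 1")
    return parser.parse_args(argv)
</code></pre>
<p>With <em>get_string</em> from above in scope and <em>output_dir</em> set from the parsed arguments, calling <em>run_all_filters(collect_images(args.input_dir), range(1, args.methods + 1), get_string)</em> would run every filter on every image and hand back the recognized text for each pair.</p>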
<p>So far, I’ve tried to cover a few useful image pre-processing concepts and implementations, though it’s probably just the tip of the iceberg. I don’t know how much “leisure time” I’m going to have in the upcoming weeks, so I can’t give you a specific time frame for publishing my next post. However, I’m considering adding at least one more part to this series that explains a few things I left out, such as rotating and de-skewing images.</p>
<p>Until then, your best bet is to just keep your wits about you and continue to look for signs.<a target="_blank" href="https://www.youtube.com/watch?v=B_CHjYoqPUU">*</a></p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How you can get started with Tesseract ]]>
                </title>
                <description>
                    <![CDATA[ By Berk Kaan Kuguoglu It’s far from a secret that Tesseract is not an all-in-one OCR tool that recognizes all sorts of text and drawings. In fact, this couldn’t be further from the truth. If this was a secret, I’ve already spoiled it and it’s already... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/getting-started-with-tesseract-part-i-2a6a6b1cf75e/</link>
                <guid isPermaLink="false">66c34b604f7405e6476b01c7</guid>
                
                    <category>
                        <![CDATA[ OCR  ]]>
                    </category>
                
                    <category>
                        <![CDATA[ opencv ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ tesseract ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Tutorial ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Tue, 05 Jun 2018 18:42:00 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*pv8wGtNSz5Xe5OCrOIJxyw.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Berk Kaan Kuguoglu</p>
<p>It’s far from a secret that Tesseract is not an all-in-one OCR tool that recognizes all sorts of text and drawings. In fact, this couldn’t be further from the truth. If this was a secret, I’ve already spoiled it and it’s already too late to go back anyway. So, why not dive deep into Tesseract and share a few tips and tricks that could improve your results?</p>
<h3 id="heading-i-love-free-stuff">I love free stuff!</h3>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
