<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ Eivind Kjosbakken - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ Eivind Kjosbakken - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Tue, 26 May 2026 16:22:47 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/author/Kjosbakken/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How to Fine-Tune EasyOCR with a Synthetic Dataset ]]>
                </title>
                <description>
                    <![CDATA[ OCR is a valuable tool that you can use to extract text from images. But the OCR you are using may not work as intended for your specific needs. In such situations, fine-tuning your OCR engine is the way to go.  In this tutorial, I will show you how ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-fine-tune-easyocr-with-a-synthetic-dataset/</link>
                <guid isPermaLink="false">66be011ec63bb28663b20e32</guid>
                
                    <category>
                        <![CDATA[ Artificial Intelligence ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Data Science ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Eivind Kjosbakken ]]>
                </dc:creator>
                <pubDate>Fri, 05 Jan 2024 17:48:17 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2024/01/image-53.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>OCR is a valuable tool that you can use to extract text from images. But the OCR you are using may not work as intended for your specific needs. In such situations, fine-tuning your OCR engine is the way to go. </p>
<p>In this tutorial, I will show you how to fine-tune EasyOCR, a free, open-source OCR engine that you can use with Python.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><a class="post-section-overview" href="#heading-prerequisites">Prerequisites</a></li>
<li><a class="post-section-overview" href="#heading-how-to-install-required-packages">How to Install Required Packages</a></li>
<li><a class="post-section-overview" href="#heading-how-to-clone-the-git-repository">How to Clone the Git Repository</a></li>
<li><a class="post-section-overview" href="#heading-how-to-get-a-dataset">How to Get a Dataset</a></li>
<li><a class="post-section-overview" href="#heading-how-to-generate-your-synthetic-dataset">How to Generate your Synthetic Dataset</a></li>
<li><a class="post-section-overview" href="#heading-how-to-convert-the-dataset-to-lmdb-format">How to Convert the Dataset to LMDB Format</a></li>
<li><a class="post-section-overview" href="#heading-how-to-retrieve-a-pre-trained-ocr-model">How to Retrieve a Pre-trained OCR Model</a></li>
<li><a class="post-section-overview" href="#heading-how-to-run-the-fine-tuning">How to Run the Fine-tuning</a></li>
<li><a class="post-section-overview" href="#heading-how-to-run-inference-with-your-fine-tuned-model">How to Run Inference with your Fine-tuned Model</a></li>
<li><a class="post-section-overview" href="#heading-a-qualitative-test-of-performance">A Qualitative Test of Performance</a></li>
<li><a class="post-section-overview" href="#heading-quantitative-test-of-performance">Quantitative Test of Performance</a></li>
<li><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<ul>
<li>Basic knowledge of Python.</li>
<li>Basic knowledge of how to use the terminal.</li>
</ul>
<h2 id="heading-how-to-install-required-packages">How to Install Required Packages</h2>
<p>First off, let's install the required <code>pip</code> packages. I recommend making a virtual environment for this, though it is not required. </p>
<p>Run the commands below, one line at a time:</p>
<pre><code class="lang-bash">pip install fire
pip install lmdb
pip install opencv-python
pip install natsort
pip install nltk
</code></pre>
<p>You also need to install PyTorch from <a target="_blank" href="https://pytorch.org/get-started/locally/">this website</a> (choose your specifications and copy the pip install command. The command below is for my specifications). You can choose either the GPU version or the CPU version. The difference is that running the fine-tuning process will be slower on the CPU.</p>
<pre><code class="lang-bash">pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
</code></pre>
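<p>Before moving on, it can help to confirm that the packages are visible from the environment you plan to use. The snippet below is a minimal sketch that uses only the standard library. The helper name <code>find_missing_modules</code> is hypothetical, and the list contains the import names I am assuming for the packages above (for example, <code>opencv-python</code> is imported as <code>cv2</code>).</p>
<pre><code class="lang-py">import importlib.util


def find_missing_modules(module_names):
    """Return the names in module_names that cannot be found for import."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Assumed import names for the packages installed above
required = ["fire", "lmdb", "cv2", "natsort", "nltk", "torch"]
print("Missing modules:", find_missing_modules(required))
</code></pre>
<p>An empty list means every module was found. Anything that is listed can be installed with <code>pip</code> as shown above.</p>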
<h2 id="heading-how-to-clone-the-git-repository">How to Clone the Git Repository</h2>
<p>You'll need a Git repository that will help you run the fine-tuning. Clone <a target="_blank" href="https://github.com/clovaai/deep-text-recognition-benchmark">this Git repo</a> with the command below:</p>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/clovaai/deep-text-recognition-benchmark
</code></pre>
<p>The <a target="_blank" href="https://github.com/clovaai/deep-text-recognition-benchmark">deep-text-recognition-benchmark GitHub repo</a> will give us some useful files for fine-tuning the EasyOCR model. Note that some of the terminal commands used in this article were taken from the repository and then adapted to my needs, so the repository is worth a read.</p>
<p>I would like to add a note here that <a target="_blank" href="https://github.com/clovaai">Clova AI on GitHub</a> has a lot of good repositories that have been of immense help to me, so feel free to check out other interesting repositories that they have. </p>
<p>Another interesting repo they have is the <a target="_blank" href="https://github.com/clovaai/donut">Donut model repo</a>, and I have written an <a target="_blank" href="https://python.plainenglish.io/empower-your-donut-model-for-receipts-with-self-annotated-data-51fc882b7229">article on fine-tuning the Donut model</a> that you should check out.</p>
<h2 id="heading-how-to-get-a-dataset">How to Get a Dataset</h2>
<p>Before you can fine-tune your OCR, you'll need a dataset. You can either download a dataset or make one yourself.</p>
<p>Since I want my OCR to be particularly good at scanning supermarket receipts, I will make a dataset of items you can find in the supermarket, but feel free to make a dataset from whatever data you need your OCR to be good at. For this section, I made use of <a target="_blank" href="https://github.com/JaidedAI/EasyOCR/blob/master/custom_model.md">this GitHub page</a>.</p>
<p>If you want to learn how to generate your own dataset, you can go to the next section right away. If you want a simpler solution, you can use one of the options below:</p>
<h3 id="heading-option-1-use-my-dummy-dataset">Option 1 – Use my dummy dataset:</h3>
<p>If you want to have this step as simple as possible (recommended if you are just testing), you can download a dummy dataset. I have made and uploaded one to <a target="_blank" href="https://drive.google.com/drive/folders/1rS-WFRqN9zkD3vetwcYYFmOzg_cMv9su?usp=sharing">this Google Drive</a> (download the whole folder).</p>
<h3 id="heading-option-2-download-a-dataset">Option 2 – Download a dataset</h3>
<p>If you want a larger dataset, you can download a dataset from <a target="_blank" href="https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0">this Dropbox page</a> by downloading the data_lmdb_release.zip file (note that it is a bit over 18GB in size).</p>
<h2 id="heading-how-to-generate-your-synthetic-dataset">How to Generate your Synthetic Dataset</h2>
<p>If you want a cooler approach to creating your own dataset, you can follow along with this section. I originally wrote about it in <a target="_blank" href="https://blog.devgenius.io/generating-a-fine-tuning-dataset-for-an-ocr-engine-3509167bc8a1">this Medium article</a>. </p>
<p>For this section, you should use a separate Python file.</p>
<p>The great thing about a synthetic dataset is that you don't need any labor-intensive labeling, as you are creating the images based on provided textual descriptions. This means that you have both the input to the model (the image) and the label (the text of the images), the two components needed to fine-tune an AI model.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/12/image-61.png" alt="Image" width="600" height="400" loading="lazy">
<em>Make synthetic images like this by following this section</em></p>
<h3 id="heading-clone-the-synthetic-generation-repo">Clone the Synthetic Generation Repo</h3>
<p>First, you have to clone <a target="_blank" href="https://github.com/Belval/TextRecognitionDataGenerator">this synthetic data generation</a> repository to be able to create synthetic data. To clone it, open a new folder, and run this command:</p>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/Belval/TextRecognitionDataGenerator.git
</code></pre>
<p>This repository allows you to create images from a given text description. You will then have the dataset you need: images, and a txt file stating the text on the images (the label).</p>
<h3 id="heading-create-a-file-to-generate-the-synthetic-data">Create a File to Generate the Synthetic Data</h3>
<p>Now create a new file called <code>generate_synth_data.py</code>, and add the code below to import the useful packages:</p>
<pre><code class="lang-py"><span class="hljs-keyword">from</span> trdg.generators <span class="hljs-keyword">import</span> (
    GeneratorFromStrings,
)
<span class="hljs-keyword">from</span> tqdm.auto <span class="hljs-keyword">import</span> tqdm
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> random
</code></pre>
<p>To run this code, you need these <code>pip</code> installations (run one line at a time in the terminal). Note that a specific <code>Pillow</code> version is needed (you will get an error if you have the newest Pillow version):</p>
<pre><code class="lang-bash">pip install trdg
pip install pandas
pip install Pillow==9.5.0
</code></pre>
<p>Next, define some hyperparameters (set them to whatever values you prefer):</p>
<pre><code class="lang-py">NUM_IMAGES_TO_SAVE = <span class="hljs-number">10</span>
NUM_PRICES_TO_GENERATE = <span class="hljs-number">10000</span>
</code></pre>
<p>Now you need a large dataset with words you want to have on the images you create. Since I want my OCR to be good at reading supermarket receipts, I used <a target="_blank" href="https://no.openfoodfacts.org/">Openfoodfacts</a>, which is a website that contains a lot of supermarket items. </p>
<p>To make it as simple as possible, you can use the CSV file on <a target="_blank" href="https://drive.google.com/file/d/1DZhRBVGpf9smuiom3JdL0QW0HEgKIqtQ/view?usp=sharing">this Google Drive page</a> (just download it and place it in your folder).</p>
<p>Note that you can make use of any other data, instead of using mine. If you want to use your own data, all you need is a list of strings, which you can feed into the generator to create images.</p>
<p>Here's how you can read the CSV file containing supermarket items:</p>
<pre><code class="lang-py"><span class="hljs-comment"># helper funcs and data to generate images</span>
df = pd.read_csv(<span class="hljs-string">"openfoodfacts_export_csv.csv"</span>, on_bad_lines=<span class="hljs-string">'skip'</span>, sep=<span class="hljs-string">'\t'</span>, low_memory=<span class="hljs-literal">True</span>)
df[[<span class="hljs-string">"product_name_nb"</span>, <span class="hljs-string">"generic_name_nb"</span>, <span class="hljs-string">"brands"</span>]]
all_words = df[[<span class="hljs-string">"product_name_nb"</span>, <span class="hljs-string">"generic_name_nb"</span>, <span class="hljs-string">"brands"</span>]].to_numpy().flatten()
</code></pre>
<p>Here I am loading in my own data, but the code will look different if you are using your own data.</p>
<p>Here's how you can filter the data:</p>
<pre><code class="lang-py"><span class="hljs-comment"># ignore np nan </span>
num_before = len(all_words)
all_words = [x <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> all_words <span class="hljs-keyword">if</span> str(x) != <span class="hljs-string">'nan'</span>]
after_nan_filter = len(all_words)
print(<span class="hljs-string">"removed: "</span>, num_before - after_nan_filter, <span class="hljs-string">"words because of nan values"</span>)
all_words = list(set(all_words))
print(<span class="hljs-string">"Removed"</span>, after_nan_filter - len(all_words), <span class="hljs-string">"duplicates"</span>)
print(<span class="hljs-string">"Current number of words: "</span>, len(all_words))
</code></pre>
<p>Note that I am always printing the number of words removed in the filtering process. This is good practice, as it gives you a better overview of the size and quality of your dataset.</p>
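<p>If your own data is a plain Python list rather than a DataFrame, the same filtering idea can be sketched as below. This is an illustration and not part of the original script: the function name <code>clean_words</code> and the sample list are hypothetical. It uses <code>dict.fromkeys</code> instead of <code>set</code>, so the words keep their original order.</p>
<pre><code class="lang-py">def clean_words(words):
    """Drop missing values and duplicates, keeping first-seen order."""
    # pandas represents missing cells as NaN, whose string form is 'nan'
    non_missing = [w for w in words if str(w) != "nan"]
    # dict.fromkeys removes duplicates while preserving insertion order
    unique = list(dict.fromkeys(non_missing))
    removed_missing = len(words) - len(non_missing)
    removed_duplicates = len(non_missing) - len(unique)
    return unique, removed_missing, removed_duplicates


sample = ["Melk", float("nan"), "Ost", "Melk"]
words, n_missing, n_duplicates = clean_words(sample)
print(words, n_missing, n_duplicates)
</code></pre>
<p>Returning the two counts separately keeps the log messages honest about how many entries each filtering step removed.</p>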
<p>I also want to have a price on the images, so I am randomly generating some prices with the code below:</p>
<pre><code class="lang-py"><span class="hljs-comment">#randomly generate prices with 1-2 digits before the comma (1-99)</span>
number_strings = []
<span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(len(all_words)*<span class="hljs-number">9</span>//<span class="hljs-number">10</span>): <span class="hljs-comment">#90 percent of all words</span>
 digits = np.random.randint(<span class="hljs-number">1</span>, <span class="hljs-number">100</span>, <span class="hljs-number">4</span>)
 before_comma = <span class="hljs-string">f"<span class="hljs-subst">{str(digits[<span class="hljs-number">0</span>])}</span>"</span> <span class="hljs-comment">#before comma is just given as 1 digit if 0-9</span>
 after_comma = <span class="hljs-string">f"<span class="hljs-subst">{str(digits[<span class="hljs-number">1</span>])}</span>"</span> <span class="hljs-keyword">if</span> len(str(digits[<span class="hljs-number">1</span>])) == <span class="hljs-number">2</span> <span class="hljs-keyword">else</span> <span class="hljs-string">f"0<span class="hljs-subst">{str(digits[<span class="hljs-number">1</span>])}</span>"</span>
 number_string = <span class="hljs-string">f"<span class="hljs-subst">{before_comma}</span>,<span class="hljs-subst">{after_comma}</span>"</span>
 number_strings.append(number_string)

<span class="hljs-comment">#then create 10 percent of the words with price between 100-999</span>
<span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(len(all_words)*<span class="hljs-number">1</span>//<span class="hljs-number">10</span>): <span class="hljs-comment">#10 percent of all words</span>
 before_comma = np.random.randint(<span class="hljs-number">100</span>, <span class="hljs-number">999</span>, <span class="hljs-number">1</span>)
 after_comma = np.random.randint(<span class="hljs-number">1</span>, <span class="hljs-number">99</span>, <span class="hljs-number">1</span>)
 after_comma = <span class="hljs-string">f"<span class="hljs-subst">{str(after_comma[<span class="hljs-number">0</span>])}</span>"</span> <span class="hljs-keyword">if</span> len(str(after_comma[<span class="hljs-number">0</span>])) == <span class="hljs-number">2</span> <span class="hljs-keyword">else</span> <span class="hljs-string">f"0<span class="hljs-subst">{str(after_comma[<span class="hljs-number">0</span>])}</span>"</span>
 number_string = <span class="hljs-string">f"<span class="hljs-subst">{str(before_comma[<span class="hljs-number">0</span>])}</span>,<span class="hljs-subst">{str(after_comma)}</span>"</span>
 number_strings.append(number_string)
</code></pre>
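<p>The zero-padding logic above can also be written with a format specifier. The snippet below is a compact sketch using only the standard library. The function names are hypothetical, and it differs slightly from my code above: it picks the three-digit prices at random (about 1 in 10) instead of using a fixed 90/10 split, and it allows <code>00</code> after the comma.</p>
<pre><code class="lang-py">import random


def format_price(whole, cents):
    """Format a price with a comma separator and two-digit cents, e.g. 5,07."""
    return f"{whole},{cents:02d}"


def random_price(rng):
    """Draw one price; about 1 in 10 gets a three-digit whole part."""
    if rng.randrange(10) == 0:
        whole = rng.randint(100, 999)
    else:
        whole = rng.randint(1, 99)
    return format_price(whole, rng.randint(0, 99))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
print([random_price(rng) for _ in range(5)])
</code></pre>
<p>The <code>:02d</code> specifier pads single-digit values with a leading zero, which replaces the <code>if</code>/<code>else</code> on the string length.</p>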
<p>The code below randomly combines the supermarket items with the prices:</p>
<pre><code class="lang-py"><span class="hljs-comment">#now given word list and number list, get all combinations</span>
all_combinations = []
<span class="hljs-keyword">for</span> word <span class="hljs-keyword">in</span> tqdm(all_words):
 <span class="hljs-keyword">for</span> number <span class="hljs-keyword">in</span> random.sample(number_strings, <span class="hljs-number">20</span>): <span class="hljs-comment">#only need 20 prices per product for example</span>
  <span class="hljs-keyword">for</span> num_tabs <span class="hljs-keyword">in</span> [<span class="hljs-number">1</span>]:
   combined_string = word + <span class="hljs-string">"    "</span>*num_tabs + number
   all_combinations.append(combined_string)
</code></pre>
<p>Use the repository you cloned earlier to create the images from the list of strings we have created:</p>
<pre><code class="lang-py"><span class="hljs-comment">#generate the images</span>
generator = GeneratorFromStrings(
    random.sample(all_combinations, <span class="hljs-number">10000</span>),

    <span class="hljs-comment"># uncomment the lines below for some image augmentation options</span>
    <span class="hljs-comment"># blur=6,</span>
    <span class="hljs-comment"># random_blur=True,</span>
    <span class="hljs-comment"># random_skew=True,</span>
    <span class="hljs-comment"># skewing_angle=20,</span>
    <span class="hljs-comment"># background_type=1,</span>
    <span class="hljs-comment"># text_color="red",</span>
)
</code></pre>
<p>There are a lot of options for generating the data, which you can read more about <a target="_blank" href="https://github.com/Belval/TextRecognitionDataGenerator">here</a>. Some examples are: changing the background, adding blur, and adding skewing. You can try this out by uncommenting some of the lines in the code snippet above.</p>
<p>Then save the images from the generator to a specific format:</p>
<pre><code class="lang-py"><span class="hljs-comment"># save images from generator</span>
<span class="hljs-comment"># if output folder doesnt exist, create it</span>
<span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> os.path.exists(<span class="hljs-string">'output'</span>):
    os.makedirs(<span class="hljs-string">'output'</span>)
<span class="hljs-comment">#if labels.txt doesnt exist, create it</span>
<span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> os.path.exists(<span class="hljs-string">'output/labels.txt'</span>):
    f = open(<span class="hljs-string">"output/labels.txt"</span>, <span class="hljs-string">"w"</span>)
    f.close()

<span class="hljs-comment">#open txt file</span>
current_index = len(os.listdir(<span class="hljs-string">'output'</span>)) - <span class="hljs-number">1</span> <span class="hljs-comment">#all images minus the labels file</span>
f = open(<span class="hljs-string">"output/labels.txt"</span>, <span class="hljs-string">"a"</span>)

<span class="hljs-keyword">for</span> counter, (img, lbl) <span class="hljs-keyword">in</span> tqdm(enumerate(generator), total = NUM_IMAGES_TO_SAVE):
    <span class="hljs-keyword">if</span> (counter &gt;= NUM_IMAGES_TO_SAVE):
        <span class="hljs-keyword">break</span>
    <span class="hljs-comment"># img.show()</span>
    <span class="hljs-comment">#save pillow image</span>
    img.save(<span class="hljs-string">f'output/image<span class="hljs-subst">{current_index}</span>.png'</span>)
    f.write(<span class="hljs-string">f'image<span class="hljs-subst">{current_index}</span>.png <span class="hljs-subst">{lbl}</span>\n'</span>)
    current_index += <span class="hljs-number">1</span>
    <span class="hljs-comment"># Do something with the pillow images here.</span>
f.close()
</code></pre>
<h3 id="heading-generate-the-synthetic-data">Generate the Synthetic Data</h3>
<p>You can run the <code>generate_synth_data.py</code> file you created with this command in the terminal:</p>
<pre><code class="lang-bash">python generate_synth_data.py
</code></pre>
<p>You should see images similar to the one below in your output folder (your text will likely be different):</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/12/image-62.png" alt="Image" width="600" height="400" loading="lazy">
<em>This image was synthetically generated</em></p>
<p>Your images will be organized as shown in the image below, where the <code>.png</code> files are your images, and the <code>labels.txt</code> file contains the text in each image. This allows you to use the dataset for fine-tuning.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/12/image-63.png" alt="Image" width="600" height="400" loading="lazy">
<em>The output folder structure from running the code above.</em></p>
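<p>Since the fine-tuning depends on every line in <code>labels.txt</code> pointing to a real image, a quick sanity check can save time later. The snippet below is a minimal sketch, assuming the format written by the script above (image file name, one space, then the label). The function name <code>find_label_problems</code> is hypothetical.</p>
<pre><code class="lang-py">import os


def find_label_problems(folder, labels_name="labels.txt", encoding=None):
    """Return (line_number, reason) pairs for lines that look wrong."""
    problems = []
    labels_path = os.path.join(folder, labels_name)
    # encoding=None uses the platform default, matching how the file was written
    with open(labels_path, "r", encoding=encoding) as labels_file:
        for line_number, line in enumerate(labels_file, start=1):
            # Each line is written as: image file name, one space, label text
            image_name, _, label = line.rstrip("\n").partition(" ")
            if not os.path.exists(os.path.join(folder, image_name)):
                problems.append((line_number, "image file not found"))
            elif not label.strip():
                problems.append((line_number, "empty label"))
    return problems


if os.path.exists("output/labels.txt"):
    print(find_label_problems("output"))
</code></pre>
<p>An empty list means every label line has a matching image and a non-empty label.</p>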
<p>Congrats, you can now make your own synthetic dataset. Since you now have both an image and the text of that image in a <code>labels.txt</code> file, you can use this to fine-tune an OCR engine, which I will talk more about below.</p>
<h2 id="heading-how-to-convert-the-dataset-to-lmdb-format">How to Convert the Dataset to LMDB Format</h2>
<p>LMDB stands for <a target="_blank" href="http://www.lmdb.tech/doc/">Lightning Memory-Mapped Database Manager</a> and is essentially a fast key-value database in which you can store your dataset to train AI models. </p>
<p>You can read more about it on the <a target="_blank" href="https://lmdb.readthedocs.io/en/release/">LMDB docs</a>. After you have created your dataset, you should have a folder with your images, and the labels for all the images (the text in the images) in a <code>labels.txt</code> file. </p>
<p>Your folder should look similar to the image below, and should be inside the <strong>deep-text-recognition-benchmark</strong> folder:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/12/image-54.png" alt="Image" width="600" height="400" loading="lazy">
<em>How the folder for your dataset should look before converting to LMDB format</em></p>
<p><strong>NOTE</strong>: Make sure you have at least 10 images in your folder. You may get an error when running the training script later in the tutorial if you have fewer images.</p>
<p>You have to make some changes in the <code>create_lmdb_dataset.py</code> file in the <strong>deep-text-recognition-benchmark</strong> folder:</p>
<p>Set the <code>map_size</code> variable to a lower value, as I was getting a disk memory error with the previous value. I set the new value for <code>map_size</code> to 1073741824 (1 GiB, down from the original 1 TiB), as can be seen below:</p>
<pre><code class="lang-py"><span class="hljs-comment"># OLD LINE</span>
<span class="hljs-comment"># ...</span>
env = lmdb.open(outputPath, map_size=<span class="hljs-number">1099511627776</span>)
<span class="hljs-comment"># ...</span>

<span class="hljs-comment"># NEW LINE </span>
<span class="hljs-comment"># ...</span>
env = lmdb.open(outputPath, map_size=<span class="hljs-number">1073741824</span>) 
<span class="hljs-comment"># ...</span>
</code></pre>
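<p>The <code>map_size</code> argument is the maximum size the database is allowed to grow to, so it needs to be larger than your dataset. If you are unsure whether 1 GiB is enough, the sketch below estimates a value from the size of your image folder. The function name <code>estimate_map_size</code> and the safety factor of 4 are assumptions for illustration, not values from the repository.</p>
<pre><code class="lang-py">import os

GIB = 1024 ** 3  # 1073741824 bytes, the value used above


def estimate_map_size(folder, safety_factor=4, minimum=GIB):
    """Suggest an LMDB map_size from the total size of the files in folder."""
    total_bytes = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            total_bytes += os.path.getsize(os.path.join(root, name))
    # Leave headroom for keys, labels and the database's own bookkeeping
    return max(minimum, total_bytes * safety_factor)


if os.path.isdir("output"):
    print(estimate_map_size("output"))
</code></pre>
<p>For a small test dataset this simply returns 1 GiB. For a larger dataset, you can paste the printed number into the <code>lmdb.open</code> call.</p>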
<p>I also got an error with the UTF-8 encoding, so I removed the <code>encoding='utf-8'</code> argument when opening the <code>gtFile</code>. The new line then looks like this:</p>
<pre><code class="lang-py"><span class="hljs-comment"># OLD LINE</span>
<span class="hljs-comment"># ...</span>
<span class="hljs-keyword">with</span> open(gtFile, <span class="hljs-string">'r'</span>, encoding=<span class="hljs-string">'utf-8'</span>) <span class="hljs-keyword">as</span> data:
<span class="hljs-comment"># ...</span>

<span class="hljs-comment"># NEW LINE</span>
<span class="hljs-comment"># ...</span>
<span class="hljs-keyword">with</span> open(gtFile, <span class="hljs-string">'r'</span>) <span class="hljs-keyword">as</span> data:
<span class="hljs-comment"># ...</span>
</code></pre>
<p>Lastly, I changed the way <code>imagePath</code> was read:</p>
<pre><code class="lang-py"><span class="hljs-comment"># OLD LINE</span>
<span class="hljs-comment"># ...</span>
imagePath, label = datalist[i].strip(<span class="hljs-string">'\n'</span>).split(<span class="hljs-string">'\t'</span>)
<span class="hljs-comment"># ...</span>

<span class="hljs-comment"># NEW LINES</span>
<span class="hljs-comment"># ...</span>
imagePath, label = datalist[i].strip(<span class="hljs-string">'\n'</span>).split(<span class="hljs-string">'.png'</span>)
imagePath += <span class="hljs-string">'.png'</span>
<span class="hljs-comment"># ...</span>
</code></pre>
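<p>To see what this parsing does to a single line of <code>labels.txt</code>, here is a small standalone sketch. It is a variant of the change above and not the code in the repository: the function name <code>parse_label_line</code> is hypothetical, it splits only at the first <code>.png</code> (so a label that itself contains <code>.png</code> does not break the unpacking), and it strips the space between the file name and the label.</p>
<pre><code class="lang-py">def parse_label_line(line, extension=".png"):
    """Split one labels.txt line into (image file name, label text)."""
    # partition splits only at the first occurrence of the extension
    image_stem, separator, label = line.rstrip("\n").partition(extension)
    if not separator:
        raise ValueError(f"No {extension} file name found in: {line!r}")
    # Drop the space(s) that separate the file name from the label
    return image_stem + extension, label.lstrip(" ")


print(parse_label_line("image0.png Melk    12,50\n"))
</code></pre>
<p>Note that with <code>split('.png')</code> as used above, the label keeps the leading space from the line in <code>labels.txt</code>.</p>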
<p>The <code>create_lmdb_dataset.py</code> file should look like this (code from <a target="_blank" href="https://github.com/clovaai/deep-text-recognition-benchmark">this Git repo</a>, with the changes above applied):</p>
<pre><code class="lang-py"><span class="hljs-keyword">import</span> fire
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> lmdb
<span class="hljs-keyword">import</span> cv2

<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">checkImageIsValid</span>(<span class="hljs-params">imageBin</span>):</span>
    <span class="hljs-keyword">if</span> imageBin <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
    imageBuf = np.frombuffer(imageBin, dtype=np.uint8)
    img = cv2.imdecode(imageBuf, cv2.IMREAD_GRAYSCALE)
    imgH, imgW = img.shape[<span class="hljs-number">0</span>], img.shape[<span class="hljs-number">1</span>]
    <span class="hljs-keyword">if</span> imgH * imgW == <span class="hljs-number">0</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
    <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">writeCache</span>(<span class="hljs-params">env, cache</span>):</span>
    <span class="hljs-keyword">with</span> env.begin(write=<span class="hljs-literal">True</span>) <span class="hljs-keyword">as</span> txn:
        <span class="hljs-keyword">for</span> k, v <span class="hljs-keyword">in</span> cache.items():
            txn.put(k, v)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">createDataset</span>(<span class="hljs-params">inputPath, gtFile, outputPath, checkValid=True</span>):</span>
    <span class="hljs-string">"""
    Create LMDB dataset for training and evaluation.
    ARGS:
        inputPath  : input folder path where starts imagePath
        outputPath : LMDB output path
        gtFile     : list of image path and label
        checkValid : if true, check the validity of every image
    """</span>
    os.makedirs(outputPath, exist_ok=<span class="hljs-literal">True</span>)
    env = lmdb.open(outputPath, map_size=<span class="hljs-number">1073741824</span>) <span class="hljs-comment">#TODO Changed map size</span>
    cache = {}
    cnt = <span class="hljs-number">1</span>

    <span class="hljs-keyword">with</span> open(gtFile, <span class="hljs-string">'r'</span>) <span class="hljs-keyword">as</span> data: <span class="hljs-comment">#TODO removed utf-8 encoding here since I have Norwegian letters</span>
        datalist = data.readlines()

    nSamples = len(datalist)
    print(nSamples)
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(nSamples):
        <span class="hljs-comment">#TODO changed the way imagePath is found as well to match my usecase</span>
        imagePath, label = datalist[i].strip(<span class="hljs-string">'\n'</span>).split(<span class="hljs-string">'.png'</span>)
        imagePath += <span class="hljs-string">'.png'</span>

        <span class="hljs-comment"># imagePath, label = datalist[i].strip('\n').split('\t')</span>
        imagePath = os.path.join(inputPath, imagePath)

        <span class="hljs-comment"># # only use alphanumeric data</span>
        <span class="hljs-comment"># if re.search('[^a-zA-Z0-9]', label):</span>
        <span class="hljs-comment">#     continue</span>

        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> os.path.exists(imagePath):
            print(<span class="hljs-string">'%s does not exist'</span> % imagePath)
            <span class="hljs-keyword">continue</span>
        <span class="hljs-keyword">with</span> open(imagePath, <span class="hljs-string">'rb'</span>) <span class="hljs-keyword">as</span> f:
            imageBin = f.read()
        <span class="hljs-keyword">if</span> checkValid:
            <span class="hljs-keyword">try</span>:
                <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> checkImageIsValid(imageBin):
                    print(<span class="hljs-string">'%s is not a valid image'</span> % imagePath)
                    <span class="hljs-keyword">continue</span>
            <span class="hljs-keyword">except</span>:
                print(<span class="hljs-string">'error occured'</span>, i)
                <span class="hljs-keyword">with</span> open(outputPath + <span class="hljs-string">'/error_image_log.txt'</span>, <span class="hljs-string">'a'</span>) <span class="hljs-keyword">as</span> log:
                    log.write(<span class="hljs-string">'%s-th image data occured error\n'</span> % str(i))
                <span class="hljs-keyword">continue</span>

        imageKey = <span class="hljs-string">'image-%09d'</span>.encode() % cnt
        labelKey = <span class="hljs-string">'label-%09d'</span>.encode() % cnt
        cache[imageKey] = imageBin
        cache[labelKey] = label.encode()

        <span class="hljs-keyword">if</span> cnt % <span class="hljs-number">1000</span> == <span class="hljs-number">0</span>:
            writeCache(env, cache)
            cache = {}
            print(<span class="hljs-string">'Written %d / %d'</span> % (cnt, nSamples))
        cnt += <span class="hljs-number">1</span>
    nSamples = cnt<span class="hljs-number">-1</span>
    cache[<span class="hljs-string">'num-samples'</span>.encode()] = str(nSamples).encode()
    writeCache(env, cache)
    print(<span class="hljs-string">'Created dataset with %d samples'</span> % nSamples)


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">'__main__'</span>:
    fire.Fire(createDataset)
</code></pre>
<p>Next, move the folder over to the <strong>deep-text-recognition-benchmark</strong> folder (the Git repo you cloned). Then run the following command in the terminal:</p>
<pre><code class="lang-bash">python .\create_lmdb_dataset.py &lt;data folder name&gt; &lt;path to labels.txt <span class="hljs-keyword">in</span> data folder&gt; &lt;output folder <span class="hljs-keyword">for</span> your lmdb dataset&gt;
</code></pre>
<p>Where:</p>
<ul>
<li><code>&lt;data folder name&gt;</code> is the name of your folder with images and <code>labels.txt</code> (<code>output</code> in my case)</li>
<li><code>&lt;path to labels.txt&gt;</code> is the <code>&lt;data folder name&gt;</code> + the <code>labels.txt</code> (so <code>.\output\labels.txt</code> in my case)</li>
<li><code>&lt;output folder for your lmdb dataset&gt;</code> is the name of a folder that will be created for your dataset converted to LMDB format (I called it <code>.\lmdb_output</code>)</li>
</ul>
<p>For me, this was the command (make sure to run this command inside the <strong>deep-text-recognition-benchmark</strong> folder):</p>
<pre><code class="lang-bash">python .\create_lmdb_dataset.py .\output .\output\labels.txt .\lmdb_output
</code></pre>
<p>Now, you should have a new folder, like the image below, in your <strong>deep-text-recognition-benchmark</strong> folder.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/12/image-55.png" alt="Image" width="600" height="400" loading="lazy">
<em>How the folder for your lmdb converted data should look</em></p>
<p><strong>NOTE</strong>: Running the command with an output folder that already exists does not overwrite that folder. Either delete the existing folder first or give <strong>lmdb_output</strong> a new name (this was something I struggled with for a while, so hopefully this will help you avoid that error).</p>
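<p>If you re-run the conversion often, a small helper can clear the old output folder for you. This is only a sketch: <code>reset_output_dir</code> is a name I made up for illustration, and it assumes you really do want everything inside the folder deleted.</p>

```python
import shutil
from pathlib import Path


def reset_output_dir(output_dir: str) -> Path:
    """Delete output_dir (if it exists) so the LMDB conversion starts clean.

    Hypothetical helper, not part of deep-text-recognition-benchmark.
    """
    path = Path(output_dir)
    if path.exists():
        # Removes the folder and everything in it, so double-check the path.
        shutil.rmtree(path)
    return path
```

<p>You could call <code>reset_output_dir("lmdb_output")</code> right before running <code>create_lmdb_dataset.py</code> again.</p>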
<h2 id="heading-how-to-retrieve-a-pre-trained-ocr-model">How to Retrieve a Pre-trained OCR Model</h2>
<p>Next, you need a pre-trained OCR model that you can fine-tune with your dataset. For this, you can go to <a target="_blank" href="https://drive.google.com/drive/folders/15WPsuPJDCzhp2SvYZLRj8mAlT3zmoAMW">this Google Drive folder</a> and download the <code>TPS-ResNet-BiLSTM-Attn.pth</code> model.</p>
<p>Place the model in your <strong>deep-text-recognition-benchmark</strong> folder. I know downloading a model from a shared folder looks a bit shady, but it is part of the instructions in the deep-text-recognition-benchmark repository. The folder is not mine, and I am linking it here only because it is linked in that Git repo.</p>
<h2 id="heading-how-to-run-the-fine-tuning">How to Run the Fine-tuning</h2>
<p>If you run on CPU (this can be ignored if you are using GPU), you'll likely get an error that says: "RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False". </p>
<p>This can be fixed by changing lines 85 and 87 in the <code>train.py</code> file:</p>
<pre><code class="lang-py"><span class="hljs-comment"># OLD LINES</span>
<span class="hljs-comment"># ...</span>
<span class="hljs-keyword">if</span> opt.FT:
    model.load_state_dict(torch.load(opt.saved_model), strict=<span class="hljs-literal">False</span>)
<span class="hljs-keyword">else</span>:
    model.load_state_dict(torch.load(opt.saved_model))
<span class="hljs-comment"># ...</span>


<span class="hljs-comment"># NEW LINES (change to this if you are using CPU)</span>
<span class="hljs-comment"># ...</span>
<span class="hljs-keyword">if</span> opt.FT:
    model.load_state_dict(torch.load(opt.saved_model,map_location=<span class="hljs-string">'cpu'</span>), strict=<span class="hljs-literal">False</span>)
<span class="hljs-keyword">else</span>:
    model.load_state_dict(torch.load(opt.saved_model,map_location=<span class="hljs-string">'cpu'</span>))
<span class="hljs-comment"># ...</span>
</code></pre>
<p>Finally, you can run the fine-tuning. To do that, use the command below in the terminal:</p>
<pre><code class="lang-bash">python train.py --train_data lmdb_output --valid_data lmdb_output --select_data <span class="hljs-string">"/"</span> --batch_ratio 1.0 --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn --batch_size 2 --data_filtering_off --workers 0 --batch_max_length 80 --num_iter 10 --valInterval 5 --saved_model TPS-ResNet-BiLSTM-Attn.pth
</code></pre>
<p>Some notes on the command:</p>
<ul>
<li><code>data_filtering_off</code> is a flag, so you only have to pass it and not give it a value. I turned filtering off because I would have had no samples left to train on with filtering enabled.</li>
<li><code>workers</code> is set to 0 to avoid errors. I think this has something to do with multi-GPU settings, and it is also mentioned in the <code>train.py</code> file in the <strong>deep-text-recognition-benchmark</strong> folder.</li>
<li><code>batch_max_length</code> is the maximum length of any text in the training dataset. If you are using a different dataset, feel free to change this variable. Make sure it is at least as large as the longest string in your dataset, or you'll get an error.</li>
<li>For this tutorial, I use <code>train_data</code> and <code>valid_data</code> to refer to the same folder. In practice, I would create one folder with a training dataset, and one for a validation dataset and refer to those instead.</li>
<li>I set <code>num_iter</code> to 10 so you can make sure it works. Naturally, this variable must be set much higher when running the actual fine-tuning of a model.</li>
<li><code>saved_model</code> is an optional parameter. If you don’t set it, you will train a model from scratch. You probably don't want that (as this will require a lot of training), so set the <code>saved_model</code> flag to the pre-trained model you <a target="_blank" href="https://drive.google.com/drive/folders/15WPsuPJDCzhp2SvYZLRj8mAlT3zmoAMW">downloaded earlier</a>.</li>
</ul>
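<p>Two of the notes above are easy to check with a few lines of Python: finding the longest label (so you can pick a safe <code>batch_max_length</code>) and splitting your samples into a training and a validation set. The sketch below assumes each line of <code>labels.txt</code> is an image name and its text separated by a tab. If your file uses another separator, pass it in. The function names are my own and are not part of the repository.</p>

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (image name, label text)


def read_labels(labels_path: str, separator: str = "\t") -> List[Pair]:
    """Read (image name, text) pairs; the tab separator is an assumption."""
    pairs = []
    with open(labels_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            image_name, text = line.split(separator, 1)
            pairs.append((image_name, text))
    return pairs


def longest_label(pairs: List[Pair]) -> int:
    """Length of the longest label text (0 for an empty list)."""
    return max((len(text) for _, text in pairs), default=0)


def split_train_valid(pairs: List[Pair], valid_fraction: float = 0.1,
                      seed: int = 0) -> Tuple[List[Pair], List[Pair]]:
    """Shuffle reproducibly and hold out a fraction of pairs for validation."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    n_valid = max(1, int(len(shuffled) * valid_fraction)) if shuffled else 0
    return shuffled[n_valid:], shuffled[:n_valid]
```

<p>Each split would still need its own labels file and its own LMDB conversion before you point <code>train_data</code> and <code>valid_data</code> at the two resulting folders.</p>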
<h2 id="heading-how-to-run-inference-with-your-fine-tuned-model">How to Run Inference with your Fine-tuned Model</h2>
<p>After you have fine-tuned your model, you'll want to run inference with it. To do this, you can use the command below:</p>
<pre><code class="lang-bash">python demo.py --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn --image_folder &lt;path to images to <span class="hljs-built_in">test</span> on&gt; --saved_model &lt;path to model to use&gt;
</code></pre>
<p>Where:</p>
<ul>
<li><code>&lt;path to images to test on&gt;</code> is a folder consisting of PNG images you want to test on. For me, this was <strong>output</strong></li>
<li><code>&lt;path to model to use&gt;</code> is the path to the saved model from your fine-tuning. For me, this was <strong>.\saved_models\TPS-ResNet-BiLSTM-Attn-Seed1111\best_accuracy.pth</strong> (the fine-tuning saves the fine-tuned model in a <code>saved_models</code> folder)</li>
</ul>
<p>Here's the command that I used:</p>
<pre><code class="lang-bash">python demo.py --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn --image_folder output --saved_model .\saved_models\TPS-ResNet-BiLSTM-Attn-Seed1111\best_accuracy.pth
</code></pre>
<p>The command simply outputs the model's prediction and confidence score for each image in the <code>&lt;path to images to test on&gt;</code> folder, so you can check the performance of the model by looking at the images yourself to see if the model made the right prediction. This is a qualitative test of the performance of the model.</p>
<h2 id="heading-a-qualitative-test-of-performance">A Qualitative Test of Performance</h2>
<p>To see if the fine-tuning worked, I will do a qualitative test of the performance by testing the original model against my fine-tuned model on four specific words.</p>
<p>The words I tested are shown below (merged vertically into one image). I made the task a bit harder for the model by adding skewed and blurred text.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/12/image-56.png" alt="Image" width="600" height="400" loading="lazy">
<em>Self-made images merged with <a target="_blank" href="https://products.aspose.app/pdf/merger/png-to-png" rel="noopener ugc nofollow">https://products.aspose.app/pdf/merger/png-to-png</a>. The words from top to bottom are: “vanskeligheter”, “uvanligheter”, “skrekkeksempel”, “rosenborg”</em></p>
<p>Considering that I want my OCR to read Norwegian supermarket receipts, I added some Norwegian words. The words are taken from <a target="_blank" href="http://openfoodfacts.com/">http://openfoodfacts.com/</a>, and you can read more about it in <a target="_blank" href="https://medium.com/dev-genius/generating-a-fine-tuning-dataset-for-an-ocr-engine-3509167bc8a1">this article</a>.</p>
<p>My fine-tuned model should perform better on these words, since it has been trained on some Norwegian words while the original OCR model is not used to seeing them.</p>
<p>The texts in each image are:</p>
<ul>
<li>image0 -&gt; vanskeligheter</li>
<li>image1 -&gt; uvanligheter</li>
<li>image2 -&gt; skrekkeksempel</li>
<li>image3 -&gt; rosenborg</li>
</ul>
<p>Results for the original model (not fine-tuned):</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/12/image-57.png" alt="Image" width="600" height="400" loading="lazy">
<em>Results for the original model (not fine-tuned) on a qualitative test. You can see the model struggles quite a bit</em></p>
<p>Results for fine-tuned model:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/12/image-58.png" alt="Image" width="600" height="400" loading="lazy">
<em>Results for the fine-tuned model. You can see the model achieves perfect accuracy because of the fine-tuning.</em></p>
<p>As you can see, the fine-tuning has worked, and the fine-tuned model achieves perfect results in this qualitative example.</p>
<p>To interpret your results qualitatively, you should grab a sample of documents that is representative of the full dataset and manually compare the OCR output with the ground truth. This will give you a feel for how well the model is performing, as you can see how often it makes errors.</p>
<p>You should note that you often cannot expect perfect results from the fine-tuned OCR engine, and you can therefore use the qualitative analysis to determine specific errors the model is making. </p>
<p>This could, for example, be the model having difficulties recognizing certain characters. If this is the case, you can train the model on more examples of those characters to further increase the performance of your model. </p>
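<p>If you have written down the ground truth and the model's prediction for each image, you can count which characters get swapped for which. The sketch below is my own illustration using Python's <code>difflib</code>. It only counts same-length substitutions and ignores inserted or dropped characters, so treat the counts as a rough guide.</p>

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import Iterable, Tuple


def count_confusions(pairs: Iterable[Tuple[str, str]]) -> Counter:
    """Count (expected char, predicted char) substitutions over all pairs.

    Each pair is (ground truth, prediction). Illustrative helper only.
    """
    confusions = Counter()
    for truth, predicted in pairs:
        matcher = SequenceMatcher(None, truth, predicted, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # Only same-length replacements map cleanly to char-for-char swaps.
            if tag == "replace" and (i2 - i1) == (j2 - j1):
                for expected, got in zip(truth[i1:i2], predicted[j1:j2]):
                    confusions[(expected, got)] += 1
    return confusions
```

<p>Calling <code>count_confusions(pairs).most_common(5)</code> then lists the most frequent mix-ups, which tells you which characters deserve more training examples.</p>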
<h2 id="heading-quantitative-test-of-performance">Quantitative Test of Performance</h2>
<p>If you want a more quantitative test, you can either look at the validation results that show up during fine-tuning, or you can use the command below:</p>
<pre><code class="lang-bash">python test.py --eval_data &lt;path to <span class="hljs-built_in">test</span> data <span class="hljs-built_in">set</span> <span class="hljs-keyword">in</span> lmdb format&gt; --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn --saved_model &lt;path to model to <span class="hljs-built_in">test</span>&gt; --batch_max_length 70 --workers 0 --batch_size 2 --data_filtering_off
</code></pre>
<p>Where:</p>
<ul>
<li><code>&lt;path to test data set in lmdb format&gt;</code> is the path to the folder containing the test data in LMDB format. For me, this was: <code>lmdb_norwegian_data_test</code></li>
<li><code>&lt;path to model to test&gt;</code> is the path to the model whose performance you want to test. For me, this was: <code>saved_models/TPS-ResNet-BiLSTM-Attn-Seed1111/best_accuracy.pth</code>.</li>
</ul>
<p>The command I used was therefore:</p>
<pre><code class="lang-bash">python test.py --eval_data lmdb_norwegian_data_test --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn --saved_model saved_models/TPS-ResNet-BiLSTM-Attn-Seed1111/best_accuracy.pth --batch_max_length 70 --workers 0 --batch_size 2 --data_filtering_off
</code></pre>
<p>This will output accuracy in percentage, so a number between 0 and 100, which is the accuracy the OCR model achieves on your test dataset.</p>
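<p>To make the number concrete, here is a minimal sketch of how such a percentage can be computed, assuming a prediction only counts as correct when it matches the ground truth exactly. This is my own illustration and not the code from <code>test.py</code>, which may normalize the text (for example, letter case) before comparing.</p>

```python
from typing import Sequence


def exact_match_accuracy(truths: Sequence[str], predictions: Sequence[str]) -> float:
    """Percentage (0 to 100) of predictions equal to their ground truth."""
    if len(truths) != len(predictions):
        raise ValueError("truths and predictions must have the same length")
    if not truths:
        return 0.0
    correct = sum(t == p for t, p in zip(truths, predictions))
    return 100.0 * correct / len(truths)
```

<p>With four test words, two exact matches give 50.0 and four exact matches give 100.0.</p>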
<p>In my experience, the pre-trained model you downloaded needs a bit of training. At first, the model will make inaccurate predictions, but if you let it train for 30 minutes or so you should start seeing some improvements.</p>
<p>I then ran <code>test.py</code> on the 4 images I showed above and got the results in the images below, with the old (not fine-tuned) model on top and the new fine-tuned model below.</p>
<p>Results from the old model:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/12/image-59.png" alt="Image" width="600" height="400" loading="lazy">
<em>Result for the old model, which achieves an accuracy of 50%.</em></p>
<p>Results from the fine-tuned model:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/12/image-60.png" alt="Image" width="600" height="400" loading="lazy">
<em>Result for the new fine-tuned model which achieves an accuracy of 100%, which indicates the fine-tuning worked</em></p>
<p>You can see that the new fine-tuned model performs better, with an accuracy of 100 percent.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Congrats, you can now fine-tune your OCR model. To make a significant impact on a larger model and generalize it, you probably have to make a larger dataset. You can learn about that in <a target="_blank" href="https://medium.com/dev-genius/generating-a-fine-tuning-dataset-for-an-ocr-engine-3509167bc8a1">this tutorial</a>, and then let the model train for a while. </p>
<p>In the end, the OCR model will hopefully perform better for your specific use case.</p>
<p>This tutorial was originally written part by part on my Medium. You can check out each part here:</p>
<ul>
<li><a target="_blank" href="https://blog.devgenius.io/generating-a-fine-tuning-dataset-for-an-ocr-engine-3509167bc8a1">Generating a synthetic fine-tuning dataset for an OCR engine</a></li>
<li><a target="_blank" href="https://pub.towardsai.net/how-to-fine-tune-easyocr-to-achieve-better-ocr-performance-1540f5076428">How to Fine-tune EasyOCR to achieve better OCR performance</a></li>
</ul>
<p>If you are interested and want to learn more about similar topics, you can find me on:</p>
<ul>
<li>✅ <a target="_blank" href="https://medium.com/@oieivind">Medium</a></li>
<li>✅ <a target="_blank" href="https://twitter.com/Ravenspike21">Twitter</a></li>
<li>✅ <a target="_blank" href="https://www.linkedin.com/in/eivind-kjosbakken/">LinkedIn</a></li>
</ul>
<p>Cover image: Use OCR to read documents. Image made with DALL-E. OpenAI. (2023). ChatGPT (Large language model) <a target="_blank" href="https://chat.openai.com/">https://chat.openai.com</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Create a Self-Playing AI Chess Engine from Scratch with Imitation Learning ]]>
                </title>
                <description>
                    <![CDATA[ This is an article on how I created an AI chess engine, starting completely from scratch to building my very own AI chess engine.  Because creating an AI chess engine from scratch is a relatively complex task, this will be a long article, but stay tu... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/create-a-self-playing-ai-chess-engine-from-scratch/</link>
                <guid isPermaLink="false">66be011abe3d57ffd3af29e2</guid>
                
                    <category>
                        <![CDATA[ AI ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Artificial Intelligence ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Eivind Kjosbakken ]]>
                </dc:creator>
                <pubDate>Thu, 21 Sep 2023 08:11:38 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2023/09/undraw_Programmer_re_owql.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>This article shows how I built my very own AI chess engine, starting completely from scratch.</p>
<p>Because creating an AI chess engine from scratch is a relatively complex task, this will be a long article, but stay tuned, as the product you will end up with will be a cool project to showcase!</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>This article will explain most concepts in detail. However, there are some recommended prerequisites to follow the tutorial. You should be familiar with the following:</p>
<ul>
<li>Python</li>
<li>How to use the terminal</li>
<li>Jupyter Notebook</li>
<li>Fundamental AI concepts</li>
<li>Chess rules</li>
</ul>
<p>I will also use the following tools:</p>
<ul>
<li>Python</li>
<li>Different Python packages</li>
<li>Stockfish</li>
</ul>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><a class="post-section-overview" href="#heading-part-1-how-to-generate-a-dataset">Part 1: How to Generate a Dataset</a></li>
<li><a class="post-section-overview" href="#heading-part-2-how-to-encode-data">Part 2: How to Encode Data</a></li>
<li><a class="post-section-overview" href="#heading-part-3-how-to-train-the-ai-model">Part 3: How to Train the AI model</a></li>
<li><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></li>
</ul>
<h2 id="heading-part-1-how-to-generate-a-dataset">Part 1: How to Generate a Dataset</h2>
<p>In this part, I will use Stockfish to generate a large dataset of moves from different positions. This data can then be used later on to train the chess AI.</p>
<h3 id="heading-how-to-download-stockfish">How to download Stockfish</h3>
<p>The most important component of my chess engine is Stockfish, so I will first show you how to install it.</p>
<p>Go to the <a target="_blank" href="https://stockfishchess.org/download/">Stockfish website download page</a> and download the version for you. I am using Windows myself, so I chose the Windows (faster) version:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/0_GxqQ42GNX21JB1GN.png" alt="Image" width="600" height="400" loading="lazy">
<em>Press the download button marked in red if you have a Windows PC</em></p>
<p>After downloading, extract the zip file to whatever location on your PC you want your chess engine to be. Remember where you place it as you need the path for the next step.</p>
<h3 id="heading-how-to-incorporate-stockfish-with-python">How to incorporate Stockfish with Python</h3>
<p>Now you also need to incorporate the engine into Python. You could manually do this, but I found it easier to use the <a target="_blank" href="https://pypi.org/project/stockfish/">Python Stockfish package</a> as it has all the functions you need. </p>
<p>First install the package from <code>pip</code> (preferably in your virtual environment):</p>
<pre><code class="lang-bash">pip install stockfish
</code></pre>
<p>You can then import it using the following command:</p>
<pre><code class="lang-py"><span class="hljs-keyword">from</span> stockfish <span class="hljs-keyword">import</span> Stockfish
stockfish = Stockfish(path=<span class="hljs-string">r"C:\Users\eivin\Documents\ownProgrammingProjects18062023\ChessEngine\stockfish\stockfish\stockfish-windows-2022-x86-64-avx2"</span>)
</code></pre>
<p>Note that you need to give your own path to the Stockfish executable file:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/0_MSlKl_UJHCvdpje6.png" alt="Image" width="600" height="400" loading="lazy">
<em>The stockfish executable file is the second file from the bottom</em></p>
<p>You can copy the file path from the folder structure, or if you are on Windows 11 you can press ctrl + shift + c to automatically copy the file path.</p>
<p>Great! Now you have Stockfish available in Python!</p>
<h3 id="heading-how-to-generate-a-dataset">How to generate a dataset</h3>
<p>Now you need a dataset so you can train the AI chess engine! You can do this by making Stockfish play games and remembering each position and the moves you could take from there. </p>
<p>Those moves will be among the best possible moves, considering Stockfish is a strong chess engine.</p>
<p>First, install NumPy and a <a target="_blank" href="https://pypi.org/project/chess/">chess package</a> (there are plenty of chess packages to choose from, but I will be using this one).</p>
<p>Enter each line (individually) in the terminal:</p>
<pre><code class="lang-bash">pip install chess
pip install numpy
</code></pre>
<p>Then import packages (remember to also import Stockfish as shown earlier in this article):</p>
<pre><code class="lang-py"><span class="hljs-keyword">import</span> chess
<span class="hljs-keyword">import</span> random
<span class="hljs-keyword">from</span> pprint <span class="hljs-keyword">import</span> pprint
<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> glob
<span class="hljs-keyword">import</span> time
</code></pre>
<p>You also need some helper functions here:</p>
<pre><code class="lang-py"><span class="hljs-comment">#helper functions:</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">checkEndCondition</span>(<span class="hljs-params">board</span>):</span>
 <span class="hljs-keyword">if</span> (board.is_checkmate() <span class="hljs-keyword">or</span> board.is_stalemate() <span class="hljs-keyword">or</span> board.is_insufficient_material() <span class="hljs-keyword">or</span> board.can_claim_threefold_repetition() <span class="hljs-keyword">or</span> board.can_claim_fifty_moves() <span class="hljs-keyword">or</span> board.can_claim_draw()):
  <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>
 <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>

<span class="hljs-comment">#save</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">findNextIdx</span>():</span>
 files = (glob.glob(<span class="hljs-string">r"C:\Users\eivin\Documents\ownProgrammingProjects18062023\ChessEngine\data\*.npy"</span>))
 <span class="hljs-keyword">if</span> (len(files) == <span class="hljs-number">0</span>):
  <span class="hljs-keyword">return</span> <span class="hljs-number">1</span> <span class="hljs-comment">#if no files, return 1</span>
 highestIdx = <span class="hljs-number">0</span>
 <span class="hljs-keyword">for</span> f <span class="hljs-keyword">in</span> files:
  file = f
  currIdx = file.split(<span class="hljs-string">"movesAndPositions"</span>)[<span class="hljs-number">-1</span>].split(<span class="hljs-string">".npy"</span>)[<span class="hljs-number">0</span>]
  highestIdx = max(highestIdx, int(currIdx))

 <span class="hljs-keyword">return</span> int(highestIdx)+<span class="hljs-number">1</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">saveData</span>(<span class="hljs-params">moves, positions</span>):</span>
 moves = np.array(moves).reshape(<span class="hljs-number">-1</span>, <span class="hljs-number">1</span>)
 positions = np.array(positions).reshape(<span class="hljs-number">-1</span>,<span class="hljs-number">1</span>)
 movesAndPositions = np.concatenate((moves, positions), axis = <span class="hljs-number">1</span>)
 nextIdx = findNextIdx()
 np.save(<span class="hljs-string">f"data/movesAndPositions<span class="hljs-subst">{nextIdx}</span>.npy"</span>, movesAndPositions)
 print(<span class="hljs-string">"Saved successfully"</span>)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">runGame</span>(<span class="hljs-params">numMoves, filename = <span class="hljs-string">"movesAndPositions1.npy"</span></span>):</span>
 <span class="hljs-string">"""run a game you stored"""</span>
 testing = np.load(<span class="hljs-string">f"data/<span class="hljs-subst">{filename}</span>"</span>)
 moves = testing[:, <span class="hljs-number">0</span>]
 <span class="hljs-keyword">if</span> (numMoves &gt; len(moves)):
  print(<span class="hljs-string">"Must enter a lower number of moves than maximum game length. Game length here is: "</span>, len(moves))
  <span class="hljs-keyword">return</span>

 testBoard = chess.Board()

 <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(numMoves):
  move = moves[i]
  testBoard.push_san(move)
 <span class="hljs-keyword">return</span> testBoard
</code></pre>
<p>Remember to change the file path in the <code>findNextIdx</code> function, as this is personal for your computer. </p>
<p>Create a data folder within the folder you are coding in, and copy its path (but still keep the <code>*.npy</code> at the end).</p>
<p>The <code>checkEndCondition</code> function uses functions from the <a target="_blank" href="https://pypi.org/project/chess/">Chess pip package</a> to check if the game is to be ended. </p>
<p>The <code>saveData</code> function saves a game to a <code>.npy</code> file, which is an efficient way of storing NumPy arrays.</p>
<p>The function uses the <code>findNextIdx</code> function to save to a new file (remember here to create a new folder called data to store all data in). </p>
<p>Finally, the <code>runGame</code> function makes it so you can run a game that you saved to check the positions after <code>numMoves</code> number of moves.</p>
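<p>If you would rather not hard-code an absolute path, the index lookup can also be written against a relative <code>data</code> folder. The version below is only a sketch of the same idea as <code>findNextIdx</code>, using <code>pathlib</code> and a regular expression. The name <code>find_next_idx</code> and the default folder are my own choices.</p>

```python
import re
from pathlib import Path

# Matches file names such as movesAndPositions12.npy and captures the number.
FILE_PATTERN = re.compile(r"movesAndPositions(\d+)\.npy$")


def find_next_idx(data_dir: str = "data") -> int:
    """Return 1 + the highest index among movesAndPositions<N>.npy files."""
    highest = 0
    for path in Path(data_dir).glob("movesAndPositions*.npy"):
        match = FILE_PATTERN.search(path.name)
        if match:  # ignore files that do not follow the naming scheme
            highest = max(highest, int(match.group(1)))
    return highest + 1
```

<p>Because the path is relative, this assumes you run your notebook or script from the folder that contains <code>data</code>.</p>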
<p>Then you can finally get to the function that mines the chess games:</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">mineGames</span>(<span class="hljs-params">numGames : int</span>):</span>
 <span class="hljs-string">"""mines numGames games of moves"""</span>
 MAX_MOVES = <span class="hljs-number">500</span> <span class="hljs-comment">#don't continue games after this number</span>

 <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(numGames):
  currentGameMoves = []
  currentGamePositions = []
  board = chess.Board()
  stockfish.set_position([])

  <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(MAX_MOVES):
   <span class="hljs-comment">#randomly choose from those 3 moves</span>
   moves = stockfish.get_top_moves(<span class="hljs-number">3</span>)
   <span class="hljs-comment">#if less than 3 moves available, choose first one, if none available, exit</span>
   <span class="hljs-keyword">if</span> (len(moves) == <span class="hljs-number">0</span>):
    print(<span class="hljs-string">"game is over"</span>)
    <span class="hljs-keyword">break</span>
   <span class="hljs-keyword">elif</span> (len(moves) == <span class="hljs-number">1</span>):
    move = moves[<span class="hljs-number">0</span>][<span class="hljs-string">"Move"</span>]
   <span class="hljs-keyword">elif</span> (len(moves) == <span class="hljs-number">2</span>):
    move = random.choices(moves, weights=(<span class="hljs-number">80</span>, <span class="hljs-number">20</span>), k=<span class="hljs-number">1</span>)[<span class="hljs-number">0</span>][<span class="hljs-string">"Move"</span>]
   <span class="hljs-keyword">else</span>:
    move = random.choices(moves, weights=(<span class="hljs-number">80</span>, <span class="hljs-number">15</span>, <span class="hljs-number">5</span>), k=<span class="hljs-number">1</span>)[<span class="hljs-number">0</span>][<span class="hljs-string">"Move"</span>]

   currentGamePositions.append(stockfish.get_fen_position())
   board.push_san(move)
   currentGameMoves.append(move)
   stockfish.set_position(currentGameMoves)
   <span class="hljs-keyword">if</span> (checkEndCondition(board)):
    print(<span class="hljs-string">"game is over"</span>)
    <span class="hljs-keyword">break</span>
  saveData(currentGameMoves, currentGamePositions)
</code></pre>
<p>Here you first set a max limit so a game does not last infinitely long. </p>
<p>Then, you run the number of games you want to run and make sure both Stockfish and the Chess pip package are reset to the starting position. </p>
<p>Next, you get the top 3 moves suggested by Stockfish and choose one of them to play (80% chance for the best move, 15% chance for the second-best move, 5% chance for the third-best move). The reason you are not always choosing the best move is for the move selection to be more stochastic, which gives you more varied games.</p>
<p>Then, you choose a move (making sure no error occurs even if there are fewer than three possible moves), save the board position using <a target="_blank" href="https://en.wikipedia.org/wiki/Forsyth%E2%80%93Edwards_Notation#:~:text=Forsyth%E2%80%93Edwards%20Notation%20%28FEN%29,Scottish%20newspaper%20journalist%20David%20Forsyth.">FEN</a> (a way of encoding a chess position), as well as the move made from that position.</p>
<p>If the game is done, you break the loop and store all positions and the moves made from those positions. If the game is not done, you continue making moves until the game is over.</p>
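<p>The move selection on its own can be pulled into a small function, which makes it easier to test without running Stockfish. This is a sketch of the same weighting as in <code>mineGames</code>. It assumes the input is a list of dictionaries with a <code>"Move"</code> key, ordered from best to worst, and <code>pick_move</code> is a name I chose for illustration.</p>

```python
import random
from typing import Dict, List, Optional, Tuple

# Weights from the tutorial, keyed by how many candidate moves are available.
WEIGHTS_BY_COUNT: Dict[int, Tuple[int, ...]] = {
    1: (100,),
    2: (80, 20),
    3: (80, 15, 5),
}


def pick_move(top_moves: List[dict],
              rng: Optional[random.Random] = None) -> Optional[str]:
    """Pick one move from up to three candidates; None if there are none."""
    if not top_moves:
        return None  # no legal moves left, so the game is over
    rng = rng or random  # fall back to the module-level generator
    candidates = top_moves[:3]
    weights = WEIGHTS_BY_COUNT[len(candidates)]
    return rng.choices(candidates, weights=weights, k=1)[0]["Move"]
```

<p>Passing a seeded <code>random.Random</code> instance makes the choice reproducible, which is handy when you want to debug a specific game.</p>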
<p>You can then mine one game with:</p>
<pre><code class="lang-py">mineGames(<span class="hljs-number">1</span>)
</code></pre>
<p>Remember to create a data folder here, as this is where I store the games!</p>
<h3 id="heading-how-to-review-a-mined-game">How to review a mined game</h3>
<p>If you have not mined a game yet, run the <code>mineGames</code> function to mine one using the following command:</p>
<pre><code class="lang-py">mineGames(<span class="hljs-number">1</span>)
</code></pre>
<p>You can access this game with a helper function shown earlier using the following command:</p>
<pre><code class="lang-py">testBoard = runGame(<span class="hljs-number">12</span>, <span class="hljs-string">"movesAndPositions1.npy"</span>)
testBoard
</code></pre>
<p>Assuming there have been 12 moves in the game, you will then see something like this:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/0_pjARgYsCMqZjj8lK.png" alt="Image" width="600" height="400" loading="lazy">
<em>Output from printing board position after 12 moves. (Note that the last line with just testBoard is printed, since in a Jupyter notebook, a variable is printed if it is written alone at the bottom of a cell).</em></p>
<p>And that’s it, you can now mine as many games as you would like. </p>
<p>Mining takes some time, and there is room to optimize the process, for example by parallelizing the game simulations (since each game is completely independent of the others).</p>
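<p>One possible way to parallelize is a thread pool, sketched below. It rests on a few assumptions: <code>play_game</code> is a hypothetical function you would write yourself, it creates its own Stockfish instance, and it saves to a file named after the game index it receives (calling <code>findNextIdx</code> from several workers at once could hand out the same index twice). Threads are used on the assumption that most of the time is spent waiting for the external Stockfish process. A process pool would be another option.</p>

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_games_in_parallel(play_game: Callable[[int], T], num_games: int,
                          max_workers: int = 4) -> List[T]:
    """Call play_game(game_idx) for game_idx in 1..num_games on a thread pool.

    play_game is a hypothetical callable supplied by the caller.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results[i] belongs to game i + 1.
        return list(pool.map(play_game, range(1, num_games + 1)))
```

<p>I have not benchmarked this, so measure on your own machine before relying on it. Stockfish itself can use several CPU cores, so more workers will not always mean faster mining.</p>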
<p>You can check out the full code from part 1 on <a target="_blank" href="https://github.com/EivindKjosbakken/ChessEngine/blob/main/part1RetrievingDataset.ipynb">my GitHub</a>.</p>
<h2 id="heading-part-2-how-to-encode-data">Part 2: How to Encode Data</h2>
<p>In this part, you will encode chess moves and positions in the same way DeepMind did with AlphaZero!</p>
<p>I will use the data you gathered in part 1 of this series. </p>
<p>As a reminder, you installed Stockfish and made sure you could access it on the computer. You then made it play games against itself, while you stored all moves and positions. </p>
<p>You now have a supervised learning problem, since the input is the current position, and the label (the correct move from that position) is the move that Stockfish chose to play.</p>
<h3 id="heading-how-to-install-and-import-packages">How to install and import packages</h3>
<p>First, you need to install and import all required packages, some of which you may already have if you followed part 1 of this series. </p>
<p>All packages to install are below. Remember to only input one line at a time when installing via <code>pip</code>:</p>
<pre><code class="lang-bash">pip install numpy
pip install gym-chess
pip install chess
</code></pre>
<p>Additionally, you need to make a small change in one of the files in the gym-chess package, since it uses <code>np.int</code>, an alias that was deprecated in NumPy 1.20 and removed in NumPy 1.24. </p>
<p>In the file with the relative path (from the virtual environment) <code>venv\Lib\site-packages\gym_chess\alphazero\board_encoding.py</code>, where <code>venv</code> is the name of my virtual environment, search for "np.int" and replace every occurrence with "int".</p>
<p>If you don't, you will see a deprecation warning on older NumPy versions, or an error stating that NumPy has no attribute <code>int</code> on newer ones. </p>
<p>I also had to restart VS Code after replacing "np.int" with "int" to make it work.</p>
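<p>If you would rather not edit the file by hand, a small script can do the replacement. This is a minimal sketch: <code>patch_np_int</code> is a helper name I made up for illustration, and you have to pass it the path to <code>board_encoding.py</code> in your own environment. The word boundaries in the pattern are there so that names like <code>np.int32</code> and <code>np.int64</code> are left alone.</p>

```python
import re
from pathlib import Path


def patch_np_int(file_path):
    """Replace the removed alias np.int with the built-in int in one file."""
    path = Path(file_path)
    source = path.read_text(encoding="utf-8")
    # The word boundaries keep names such as np.int32 or np.int64 untouched.
    patched, count = re.subn(r"\bnp\.int\b", "int", source)
    if count:
        path.write_text(patched, encoding="utf-8")
    return count
```

<p>The function returns the number of replacements it made, so a result of 0 means the file was already patched (or the path points at the wrong file).</p>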
<p>All imports you need are below:</p>
<pre><code class="lang-py"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> gym
<span class="hljs-keyword">import</span> chess
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> gym.spaces
<span class="hljs-keyword">from</span> gym_chess.alphazero.move_encoding <span class="hljs-keyword">import</span> utils, queenmoves, knightmoves, underpromotions
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List
</code></pre>
<p>And then you also need to create the gym environment to encode and decode moves:</p>
<pre><code class="lang-py">env = gym.make(<span class="hljs-string">'ChessAlphaZero-v0'</span>)
</code></pre>
<h3 id="heading-how-to-encode-board-positions-and-moves">How to encode board positions and moves</h3>
<p>Encoding is an important element of AI, as it lets you represent a problem in a form the AI can read. </p>
<p>Instead of using an image of a chess board, or a string representing a chess move like "d2d4", you represent both using arrays (lists of numbers). </p>
<p>Finding out how to do this manually is quite challenging, but luckily the <a target="_blank" href="https://pypi.org/project/gym-chess/">gym-chess Python package</a> has already solved this problem for us.</p>
<p>I am not going to go into more detail on how they encoded it, but you can see using the code below that a position is represented with an (8, 8, 119) shaped array, and that there are 4672 possible move indices (8 x 8 from-squares times 73 move types).</p>
<p>If you want to read more about this, you can check out the <a target="_blank" href="https://arxiv.org/abs/1712.01815v1">AlphaZero paper</a>, though it is quite a complicated paper to fully understand.</p>
<pre><code class="lang-py"><span class="hljs-comment">#code to print action and state space</span>
env = gym.make(<span class="hljs-string">'ChessAlphaZero-v0'</span>)
env.reset()
print(env.observation_space)
print(env.action_space)
</code></pre>
<p>Which outputs:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/0_yDTpZm519oQl-fJm.png" alt="Image" width="600" height="400" loading="lazy">
<em>Output from printing state (first line) and action space (second line)</em></p>
<p>You can also check out the encoding of a move, going from string notation to encoded notation. Make sure to reset the environment first, as you may get an error if you do not:</p>
<pre><code class="lang-py"><span class="hljs-comment">#first set the environment and make sure to reset the positions</span>
env = gym.make(<span class="hljs-string">'ChessAlphaZero-v0'</span>)
env.reset()

<span class="hljs-comment">#encoding the move e2 to e4</span>
move = chess.Move.from_uci(<span class="hljs-string">'e2e4'</span>)
print(env.encode(move))
<span class="hljs-comment"># -&gt; outputs: 877</span>

<span class="hljs-comment">#decoding the encoded move 877</span>
print(env.decode(<span class="hljs-number">877</span>))
<span class="hljs-comment"># -&gt; outputs: Move.from_uci('e2e4')</span>
</code></pre>
<p>With this, you can now write functions to encode the moves and positions you stored in part 1, where you generated the dataset.</p>
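<p>To see where a number like 877 comes from, it helps to know that the encoding functions in this article flatten a (from rank, from file, move type) triple over an 8 x 8 x 73 grid. The sketch below reproduces the arithmetic for "e2e4" using only NumPy. The constant name <code>ACTION_SHAPE</code> is my own, and the move type 1 follows from the queen-move table used in those functions (direction index 0, distance index 1).</p>

```python
import numpy as np

ACTION_SHAPE = (8, 8, 73)  # from rank, from file, move type

# e2 is rank index 1, file index 4; moving two squares up the board is
# queen-move type 0 * 7 + 1 = 1 (direction index 0, distance index 1).
action = int(np.ravel_multi_index((1, 4, 1), ACTION_SHAPE))
print(action)  # 877

# The flat index is simply (rank * 8 + file) * 73 + move_type.
print((1 * 8 + 4) * 73 + 1)  # 877

# unravel_index reverses the flattening.
print(tuple(int(i) for i in np.unravel_index(action, ACTION_SHAPE)))  # (1, 4, 1)

# Total number of actions.
print(int(np.prod(ACTION_SHAPE)))  # 4672
```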
<h3 id="heading-how-to-create-functions-for-encoding-moves">How to create functions for encoding moves</h3>
<p>These functions are copied from the <a target="_blank" href="https://pypi.org/project/gym-chess/">gym-chess package</a>, with small tweaks so that they do not depend on a class. </p>
<p>I changed them manually to make encoding easier. I would not worry too much about understanding these functions fully, as they are quite complicated. </p>
<p>Just know that they convert moves written in a form humans understand into a form computers can work with.</p>
<pre><code class="lang-py"><span class="hljs-comment">#encoding functions adapted from the gym-chess package</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">encodeKnight</span>(<span class="hljs-params">move: chess.Move</span>):</span>
    _NUM_TYPES: int = <span class="hljs-number">8</span>

    <span class="hljs-comment">#: Starting point of knight moves in last dimension of 8 x 8 x 73 action array.</span>
    _TYPE_OFFSET: int = <span class="hljs-number">56</span>

    <span class="hljs-comment">#: Set of possible directions for a knight move, encoded as </span>
    <span class="hljs-comment">#: (delta rank, delta file).</span>
    _DIRECTIONS = utils.IndexedTuple(
        (+<span class="hljs-number">2</span>, +<span class="hljs-number">1</span>),
        (+<span class="hljs-number">1</span>, +<span class="hljs-number">2</span>),
        (<span class="hljs-number">-1</span>, +<span class="hljs-number">2</span>),
        (<span class="hljs-number">-2</span>, +<span class="hljs-number">1</span>),
        (<span class="hljs-number">-2</span>, <span class="hljs-number">-1</span>),
        (<span class="hljs-number">-1</span>, <span class="hljs-number">-2</span>),
        (+<span class="hljs-number">1</span>, <span class="hljs-number">-2</span>),
        (+<span class="hljs-number">2</span>, <span class="hljs-number">-1</span>),
    )

    from_rank, from_file, to_rank, to_file = utils.unpack(move)

    delta = (to_rank - from_rank, to_file - from_file)
    is_knight_move = delta <span class="hljs-keyword">in</span> _DIRECTIONS

    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> is_knight_move:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

    knight_move_type = _DIRECTIONS.index(delta)
    move_type = _TYPE_OFFSET + knight_move_type

    action = np.ravel_multi_index(
        multi_index=((from_rank, from_file, move_type)),
        dims=(<span class="hljs-number">8</span>, <span class="hljs-number">8</span>, <span class="hljs-number">73</span>)
    )

    <span class="hljs-keyword">return</span> action

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">encodeQueen</span>(<span class="hljs-params">move: chess.Move</span>):</span>
    _NUM_TYPES: int = <span class="hljs-number">56</span> <span class="hljs-comment"># = 8 directions * 7 squares max. distance</span>
    _DIRECTIONS = utils.IndexedTuple(
        (+<span class="hljs-number">1</span>,  <span class="hljs-number">0</span>),
        (+<span class="hljs-number">1</span>, +<span class="hljs-number">1</span>),
        ( <span class="hljs-number">0</span>, +<span class="hljs-number">1</span>),
        (<span class="hljs-number">-1</span>, +<span class="hljs-number">1</span>),
        (<span class="hljs-number">-1</span>,  <span class="hljs-number">0</span>),
        (<span class="hljs-number">-1</span>, <span class="hljs-number">-1</span>),
        ( <span class="hljs-number">0</span>, <span class="hljs-number">-1</span>),
        (+<span class="hljs-number">1</span>, <span class="hljs-number">-1</span>),
    )

    from_rank, from_file, to_rank, to_file = utils.unpack(move)

    delta = (to_rank - from_rank, to_file - from_file)

    is_horizontal = delta[<span class="hljs-number">0</span>] == <span class="hljs-number">0</span>
    is_vertical = delta[<span class="hljs-number">1</span>] == <span class="hljs-number">0</span>
    is_diagonal = abs(delta[<span class="hljs-number">0</span>]) == abs(delta[<span class="hljs-number">1</span>])
    is_queen_move_promotion = move.promotion <span class="hljs-keyword">in</span> (chess.QUEEN, <span class="hljs-literal">None</span>)

    is_queen_move = (
        (is_horizontal <span class="hljs-keyword">or</span> is_vertical <span class="hljs-keyword">or</span> is_diagonal) 
            <span class="hljs-keyword">and</span> is_queen_move_promotion
    )

    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> is_queen_move:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

    direction = tuple(np.sign(delta))
    distance = np.max(np.abs(delta))

    direction_idx = _DIRECTIONS.index(direction)
    distance_idx = distance - <span class="hljs-number">1</span>

    move_type = np.ravel_multi_index(
        multi_index=([direction_idx, distance_idx]),
        dims=(<span class="hljs-number">8</span>,<span class="hljs-number">7</span>)
    )

    action = np.ravel_multi_index(
        multi_index=((from_rank, from_file, move_type)),
        dims=(<span class="hljs-number">8</span>, <span class="hljs-number">8</span>, <span class="hljs-number">73</span>)
    )

    <span class="hljs-keyword">return</span> action

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">encodeUnder</span>(<span class="hljs-params">move</span>):</span>
    _NUM_TYPES: int = <span class="hljs-number">9</span> <span class="hljs-comment"># = 3 directions * 3 piece types (see below)</span>
    _TYPE_OFFSET: int = <span class="hljs-number">64</span>
    _DIRECTIONS = utils.IndexedTuple(
        <span class="hljs-number">-1</span>,
        <span class="hljs-number">0</span>,
        +<span class="hljs-number">1</span>,
    )
    _PROMOTIONS = utils.IndexedTuple(
        chess.KNIGHT,
        chess.BISHOP,
        chess.ROOK,
    )

    from_rank, from_file, to_rank, to_file = utils.unpack(move)

    is_underpromotion = (
        move.promotion <span class="hljs-keyword">in</span> _PROMOTIONS 
        <span class="hljs-keyword">and</span> from_rank == <span class="hljs-number">6</span> 
        <span class="hljs-keyword">and</span> to_rank == <span class="hljs-number">7</span>
    )

    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> is_underpromotion:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

    delta_file = to_file - from_file

    direction_idx = _DIRECTIONS.index(delta_file)
    promotion_idx = _PROMOTIONS.index(move.promotion)

    underpromotion_type = np.ravel_multi_index(
        multi_index=([direction_idx, promotion_idx]),
        dims=(<span class="hljs-number">3</span>,<span class="hljs-number">3</span>)
    )

    move_type = _TYPE_OFFSET + underpromotion_type

    action = np.ravel_multi_index(
        multi_index=((from_rank, from_file, move_type)),
        dims=(<span class="hljs-number">8</span>, <span class="hljs-number">8</span>, <span class="hljs-number">73</span>)
    )

    <span class="hljs-keyword">return</span> action

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">encodeMove</span>(<span class="hljs-params">move: str, board</span>) -&gt; int:</span>
    move = chess.Move.from_uci(move)
    <span class="hljs-keyword">if</span> board.turn == chess.BLACK:
        move = utils.rotate(move)

    action = encodeQueen(move)

    <span class="hljs-keyword">if</span> action <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
        action = encodeKnight(move)

    <span class="hljs-keyword">if</span> action <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
        action = encodeUnder(move)

    <span class="hljs-keyword">if</span> action <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
        <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"<span class="hljs-subst">{move}</span> is not a valid move"</span>)

    <span class="hljs-keyword">return</span> action
</code></pre>
<p>So now you can pass in a move as a string (for example: "e2e4" for the move from e2 to e4), and it outputs a number (the encoded version of the move).</p>
<h3 id="heading-how-to-create-a-function-for-encoding-positions">How to create a function for encoding positions</h3>
<p>Encoding the positions is a bit more difficult. I copied a function from the gym-chess package (<code>encodeBoard</code>), since I had some issues using the package directly. The function I copied is below:</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">encodeBoard</span>(<span class="hljs-params">board: chess.Board</span>) -&gt; np.ndarray:</span>
 <span class="hljs-string">"""Converts a board to numpy array representation."""</span>

 array = np.zeros((<span class="hljs-number">8</span>, <span class="hljs-number">8</span>, <span class="hljs-number">14</span>), dtype=int)

 <span class="hljs-keyword">for</span> square, piece <span class="hljs-keyword">in</span> board.piece_map().items():
  rank, file = chess.square_rank(square), chess.square_file(square)
  piece_type, color = piece.piece_type, piece.color

  <span class="hljs-comment"># The first six planes encode the pieces of the active player, </span>
  <span class="hljs-comment"># the following six those of the active player's opponent. Since</span>
  <span class="hljs-comment"># this class always stores boards oriented towards the white player,</span>
  <span class="hljs-comment"># White is considered to be the active player here.</span>
  offset = <span class="hljs-number">0</span> <span class="hljs-keyword">if</span> color == chess.WHITE <span class="hljs-keyword">else</span> <span class="hljs-number">6</span>

  <span class="hljs-comment"># Chess enumerates piece types beginning with one, which you have</span>
  <span class="hljs-comment"># to account for</span>
  idx = piece_type - <span class="hljs-number">1</span>

  array[rank, file, idx + offset] = <span class="hljs-number">1</span>

 <span class="hljs-comment"># Repetition counters</span>
 array[:, :, <span class="hljs-number">12</span>] = board.is_repetition(<span class="hljs-number">2</span>)
 array[:, :, <span class="hljs-number">13</span>] = board.is_repetition(<span class="hljs-number">3</span>)

 <span class="hljs-keyword">return</span> array

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">encodeBoardFromFen</span>(<span class="hljs-params">fen: str</span>) -&gt; np.ndarray:</span>
 board = chess.Board(fen)
 <span class="hljs-keyword">return</span> encodeBoard(board)
</code></pre>
<p>I also added the <code>encodeBoardFromFen</code> function. The copied function requires a chess board object from the <a target="_blank" href="https://python-chess.readthedocs.io/en/latest/">Python Chess package</a>, but the positions from part 1 are stored in <a target="_blank" href="https://en.wikipedia.org/wiki/Forsyth%E2%80%93Edwards_Notation">FEN notation</a> (a way of encoding a chess position as a string, which you cannot feed to the model directly since it needs numbers). So I first convert the FEN string to a board from that package and then encode the board.</p>
<p>Then you have all you need to encode all your files.</p>
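<p>To make the plane layout more concrete, here is a minimal sketch that builds the 12 piece planes straight from the piece-placement field of a FEN string, using only NumPy. It mirrors the plane order in <code>encodeBoard</code> (white pieces in planes 0 to 5, black pieces in planes 6 to 11) but leaves out the two repetition planes. The names <code>fen_to_planes</code> and <code>PIECE_PLANES</code> are my own for illustration and are not part of gym-chess or Python Chess.</p>

```python
import numpy as np

# Plane order mirrors encodeBoard: pawn, knight, bishop, rook, queen, king.
PIECE_PLANES = {"p": 0, "n": 1, "b": 2, "r": 3, "q": 4, "k": 5}


def fen_to_planes(fen):
    """Encode the piece placement field of a FEN string as an 8x8x12 array."""
    planes = np.zeros((8, 8, 12), dtype=int)
    placement = fen.split()[0]
    # FEN lists ranks from 8 down to 1, so the first row is rank index 7.
    for row, rank_text in enumerate(placement.split("/")):
        rank = 7 - row
        file = 0
        for char in rank_text:
            if char.isdigit():
                file += int(char)  # a digit means that many empty squares
            else:
                offset = 0 if char.isupper() else 6  # white first, then black
                planes[rank, file, PIECE_PLANES[char.lower()] + offset] = 1
                file += 1
    return planes


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
planes = fen_to_planes(start)
print(planes.shape)       # (8, 8, 12)
print(int(planes.sum()))  # 32
print(planes[0, 4, 5])    # 1
```

<p>The last line checks the white king on e1: rank index 0, file index 4, and plane 5 (the king plane for White).</p>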
<h3 id="heading-how-to-automate-encoding-for-all-raw-data-files">How to automate encoding for all raw data files</h3>
<p>Now that you can encode moves and positions, you will automate this process for all files in your folder that you generated from part 1 of this series. This involves finding all files in which you have to encode the data and saving these to new files.</p>
<p>Note that from part 1 I changed the folder structure slightly. </p>
<p>I now have a parent <code>Data</code> folder, and within it the <code>rawData</code> folder, which holds the moves in string format and the positions in FEN format (from part 1).</p>
<p>I also have the <code>preparedData</code> folder under the <code>Data</code> folder, where the encoded moves and positions will be stored. The code below refers to the parent folder as <code>data</code>, so match that casing if your file system is case-sensitive. </p>
<p>Note that the encoded moves and positions will be stored in separate files since the encodings have different dimensions.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/0_oiZbBdWwveJNMCPe.png" alt="Image" width="600" height="400" loading="lazy">
<em>Folder structure for the data. Make sure to have two folders called preparedData and rawData within the Data folder. The Data folder is on the same level as your notebook files.</em></p>
<pre><code class="lang-py"><span class="hljs-comment">#function to encode all moves and positions from rawData folder</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">encodeAllMovesAndPositions</span>():</span>
    board = chess.Board() <span class="hljs-comment">#this is used to change whose turn it is so that the encoding works</span>
    board.turn = <span class="hljs-literal">False</span> <span class="hljs-comment">#set turn to black first, changed on first run</span>

    <span class="hljs-comment">#find all files in folder:</span>
    files = os.listdir(<span class="hljs-string">'data/rawData'</span>)
    <span class="hljs-keyword">for</span> idx, f <span class="hljs-keyword">in</span> enumerate(files):
        movesAndPositions = np.load(<span class="hljs-string">f'data/rawData/<span class="hljs-subst">{f}</span>'</span>, allow_pickle=<span class="hljs-literal">True</span>)
        moves = movesAndPositions[:,<span class="hljs-number">0</span>]
        positions = movesAndPositions[:,<span class="hljs-number">1</span>]
        encodedMoves = []
        encodedPositions = []

        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(len(moves)):
            board.turn = (<span class="hljs-keyword">not</span> board.turn) <span class="hljs-comment">#swap turns</span>
            <span class="hljs-keyword">try</span>:
                encodedMoves.append(encodeMove(moves[i], board)) 
                encodedPositions.append(encodeBoardFromFen(positions[i]))
            <span class="hljs-keyword">except</span> Exception:
                <span class="hljs-keyword">try</span>:
                    board.turn = (<span class="hljs-keyword">not</span> board.turn) <span class="hljs-comment">#flip the turn back: a move is sometimes skipped in the data, so the turn may be off by one</span>
                    encodedMoves.append(encodeMove(moves[i], board)) 
                    encodedPositions.append(encodeBoardFromFen(positions[i]))
                <span class="hljs-keyword">except</span> Exception:
                    print(<span class="hljs-string">f'error in file: <span class="hljs-subst">{f}</span>'</span>)
                    print(<span class="hljs-string">"Turn: "</span>, board.turn)
                    print(moves[i])
                    print(positions[i])
                    print(i)
                    <span class="hljs-keyword">break</span>

        np.save(<span class="hljs-string">f'data/preparedData/moves<span class="hljs-subst">{idx}</span>'</span>, np.array(encodedMoves))
        np.save(<span class="hljs-string">f'data/preparedData/positions<span class="hljs-subst">{idx}</span>'</span>, np.array(encodedPositions))

encodeAllMovesAndPositions()

<span class="hljs-comment">#<span class="hljs-doctag">NOTE:</span> shape of files:</span>
<span class="hljs-comment">#moves: (number of moves in game)</span>
<span class="hljs-comment">#positions: (number of moves in game, 8, 8, 14) (number of moves in game is including both black and white moves)</span>
</code></pre>
<p>I first create a chess board, which is only used to keep track of whose turn it is. </p>
<p>Then, I open all raw data files made in part 1 and encode them. I do this inside a <code>try/except</code> statement, as I sometimes see errors with move encodings. </p>
<p>The first <code>except</code> block handles the case where a move was skipped (so the program thinks it is the other side's turn). If this happens, the encoding will not work, so the block flips the turn and tries again. This is not the most elegant code, but the encoding is a minor part of the total runtime of creating an AI chess engine, so it is acceptable.</p>
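<p>An alternative to toggling the turn by hand is to read the side to move from the FEN string itself, since its second field is "w" or "b". The sketch below assumes that each stored position is a full FEN string describing the board before the stored move is played, which you should verify against your own data. <code>white_to_move</code> is a hypothetical helper name, not something from part 1.</p>

```python
def white_to_move(fen):
    # A full FEN string has six space-separated fields; the second one
    # is the side to move: "w" for White, "b" for Black.
    return fen.split()[1] == "w"


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
after_e4 = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"
print(white_to_move(start))     # True
print(white_to_move(after_e4))  # False
```

<p>If that assumption holds for your files, you could set <code>board.turn</code> from this value before encoding each move instead of flipping it on every iteration.</p>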
<p>Make sure you have the correct folder structure and have created all the different folders. If not, you will receive an error.</p>
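<p>If you want to avoid that error, you can create the folders from code before running the encoding. This is a minimal sketch using the standard library; <code>ensure_data_folders</code> is a helper name I made up, and the default root <code>data</code> matches the path used in the encoding function.</p>

```python
from pathlib import Path


def ensure_data_folders(root="data"):
    """Create rawData and preparedData under root if they are missing."""
    folders = [Path(root) / "rawData", Path(root) / "preparedData"]
    for folder in folders:
        # exist_ok avoids an error when the folder is already there
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

<p>Calling <code>ensure_data_folders()</code> more than once is harmless, since existing folders are left as they are.</p>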
<p>You have now encoded your chess board and moves. If you want to, you can check out the full code from this part on <a target="_blank" href="https://github.com/EivindKjosbakken/ChessEngine/blob/main/part2Encoding.ipynb">my GitHub</a>.</p>
<h2 id="heading-part-3-how-to-train-the-ai-model">Part 3: How to Train the AI model</h2>
<p>This is the third and last part of the series on creating your own AI chess engine! </p>
<p>In part 1 you learned how to create a dataset, and in part 2 you looked at encoding the dataset so that it could be used for an AI. </p>
<p>You will now use this encoded dataset to train your own AI using PyTorch!</p>
<h3 id="heading-how-to-import-packages">How to import packages</h3>
<p>As always, I list all the imports used in this part. Most are straightforward, but you need to install PyTorch, which I recommend installing using <a target="_blank" href="https://pytorch.org/">this website</a>. </p>
<p>Here you can scroll down a bit, where you will see some options for the build and your operating system. </p>
<p>After selecting the options that apply to you, you will get a command you can paste into the terminal to install PyTorch. </p>
<p>You can see the options I chose in the image below, but in general, I recommend using the stable build and choosing your own operating system. </p>
<p>Then, select the package manager you are most used to (Conda or <code>pip</code> is probably the easiest, as you can just paste the command into the terminal). </p>
<p>Select one of the listed CUDA versions (11.7 or 11.8 when I installed it; it does not matter which one), and install using the given command at the bottom.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/0_UJVBkAt40X6-FXuV.png" alt="Image" width="600" height="400" loading="lazy">
<em>My selections when installing PyTorch.</em></p>
<p>You can then import all your packages with the code below:</p>
<pre><code class="lang-py"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> torch
<span class="hljs-keyword">import</span> torch.nn <span class="hljs-keyword">as</span> nn
<span class="hljs-keyword">import</span> torch.nn.functional <span class="hljs-keyword">as</span> F
<span class="hljs-keyword">import</span> torchvision
<span class="hljs-keyword">import</span> torchvision.transforms <span class="hljs-keyword">as</span> transforms
<span class="hljs-keyword">from</span> torch.utils.tensorboard <span class="hljs-keyword">import</span> SummaryWriter
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime
<span class="hljs-keyword">import</span> gym
<span class="hljs-keyword">import</span> gym_chess
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> chess
<span class="hljs-keyword">from</span> tqdm <span class="hljs-keyword">import</span> tqdm
<span class="hljs-keyword">from</span> gym_chess.alphazero.move_encoding <span class="hljs-keyword">import</span> utils
<span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Optional
</code></pre>
<h3 id="heading-how-to-install-cuda">How to Install CUDA</h3>
<p>This is an optional step that allows you to use your GPU to train your model much faster. It is not required, but it will save you some time when training your AI. </p>
<p>The way you install CUDA varies depending on your operating system, but I am using Windows and followed <a target="_blank" href="https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html">this tutorial</a>.</p>
<p>If you are on Linux, you can find a tutorial by googling: “installing CUDA Linux”. Note that recent CUDA versions do not support macOS, so on a Mac you can simply skip this step.</p>
<p>To check if you have CUDA available (your GPU is available), you can use this code:</p>
<pre><code class="lang-py"><span class="hljs-comment">#check if cuda available</span>
torch.cuda.is_available()
</code></pre>
<p>This outputs <code>True</code> if your GPU is available. If you do not have a GPU available, do not worry: the only downside is that training the model takes longer, which is not a big deal for hobby projects like this one.</p>
<h3 id="heading-how-to-create-encoding-methods">How to create encoding methods</h3>
<p>I then define some helper methods for encoding and decoding, taken from the <a target="_blank" href="https://pypi.org/project/gym-chess/">Python Gym-Chess package</a>. </p>
<p>I had to make some modifications to make it work. Most of the code is copied from the package, with just a few small tweaks, such as making the code independent of a class. </p>
<p>Note that you do not have to understand all the code below, as the way DeepMind encodes all moves in chess is complicated.</p>
<pre><code class="lang-py"><span class="hljs-comment">#helper methods:</span>

<span class="hljs-comment">#decoding moves from idx to uci notation</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_decodeKnight</span>(<span class="hljs-params">action: int</span>) -&gt; Optional[chess.Move]:</span>
    _NUM_TYPES: int = <span class="hljs-number">8</span>

    <span class="hljs-comment">#: Starting point of knight moves in last dimension of 8 x 8 x 73 action array.</span>
    _TYPE_OFFSET: int = <span class="hljs-number">56</span>

    <span class="hljs-comment">#: Set of possible directions for a knight move, encoded as </span>
    <span class="hljs-comment">#: (delta rank, delta file).</span>
    _DIRECTIONS = utils.IndexedTuple(
        (+<span class="hljs-number">2</span>, +<span class="hljs-number">1</span>),
        (+<span class="hljs-number">1</span>, +<span class="hljs-number">2</span>),
        (<span class="hljs-number">-1</span>, +<span class="hljs-number">2</span>),
        (<span class="hljs-number">-2</span>, +<span class="hljs-number">1</span>),
        (<span class="hljs-number">-2</span>, <span class="hljs-number">-1</span>),
        (<span class="hljs-number">-1</span>, <span class="hljs-number">-2</span>),
        (+<span class="hljs-number">1</span>, <span class="hljs-number">-2</span>),
        (+<span class="hljs-number">2</span>, <span class="hljs-number">-1</span>),
    )

    from_rank, from_file, move_type = np.unravel_index(action, (<span class="hljs-number">8</span>, <span class="hljs-number">8</span>, <span class="hljs-number">73</span>))

    is_knight_move = (
        _TYPE_OFFSET &lt;= move_type
        <span class="hljs-keyword">and</span> move_type &lt; _TYPE_OFFSET + _NUM_TYPES
    )

    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> is_knight_move:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

    knight_move_type = move_type - _TYPE_OFFSET

    delta_rank, delta_file = _DIRECTIONS[knight_move_type]

    to_rank = from_rank + delta_rank
    to_file = from_file + delta_file

    move = utils.pack(from_rank, from_file, to_rank, to_file)
    <span class="hljs-keyword">return</span> move

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_decodeQueen</span>(<span class="hljs-params">action: int</span>) -&gt; Optional[chess.Move]:</span>

    _NUM_TYPES: int = <span class="hljs-number">56</span> <span class="hljs-comment"># = 8 directions * 7 squares max. distance</span>

    <span class="hljs-comment">#: Set of possible directions for a queen move, encoded as </span>
    <span class="hljs-comment">#: (delta rank, delta file).</span>
    _DIRECTIONS = utils.IndexedTuple(
        (+<span class="hljs-number">1</span>,  <span class="hljs-number">0</span>),
        (+<span class="hljs-number">1</span>, +<span class="hljs-number">1</span>),
        ( <span class="hljs-number">0</span>, +<span class="hljs-number">1</span>),
        (<span class="hljs-number">-1</span>, +<span class="hljs-number">1</span>),
        (<span class="hljs-number">-1</span>,  <span class="hljs-number">0</span>),
        (<span class="hljs-number">-1</span>, <span class="hljs-number">-1</span>),
        ( <span class="hljs-number">0</span>, <span class="hljs-number">-1</span>),
        (+<span class="hljs-number">1</span>, <span class="hljs-number">-1</span>),
    )
    from_rank, from_file, move_type = np.unravel_index(action, (<span class="hljs-number">8</span>, <span class="hljs-number">8</span>, <span class="hljs-number">73</span>))

    is_queen_move = move_type &lt; _NUM_TYPES

    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> is_queen_move:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

    direction_idx, distance_idx = np.unravel_index(
        indices=move_type,
        shape=(<span class="hljs-number">8</span>,<span class="hljs-number">7</span>)
    )

    direction = _DIRECTIONS[direction_idx]
    distance = distance_idx + <span class="hljs-number">1</span>

    delta_rank = direction[<span class="hljs-number">0</span>] * distance
    delta_file = direction[<span class="hljs-number">1</span>] * distance

    to_rank = from_rank + delta_rank
    to_file = from_file + delta_file

    move = utils.pack(from_rank, from_file, to_rank, to_file)
    <span class="hljs-keyword">return</span> move

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_decodeUnderPromotion</span>(<span class="hljs-params">action</span>):</span>
    _NUM_TYPES: int = <span class="hljs-number">9</span> <span class="hljs-comment"># = 3 directions * 3 piece types (see below)</span>

    <span class="hljs-comment">#: Starting point of underpromotions in last dimension of 8 x 8 x 73 action </span>
    <span class="hljs-comment">#: array.</span>
    _TYPE_OFFSET: int = <span class="hljs-number">64</span>

    <span class="hljs-comment">#: Set of possible directions for an underpromotion, encoded as file delta.</span>
    _DIRECTIONS = utils.IndexedTuple(
        <span class="hljs-number">-1</span>,
        <span class="hljs-number">0</span>,
        +<span class="hljs-number">1</span>,
    )

    <span class="hljs-comment">#: Set of possible piece types for an underpromotion (promoting to a queen</span>
    <span class="hljs-comment">#: is implicitly encoded by the corresponding queen move).</span>
    _PROMOTIONS = utils.IndexedTuple(
        chess.KNIGHT,
        chess.BISHOP,
        chess.ROOK,
    )

    from_rank, from_file, move_type = np.unravel_index(action, (<span class="hljs-number">8</span>, <span class="hljs-number">8</span>, <span class="hljs-number">73</span>))

    is_underpromotion = (
        _TYPE_OFFSET &lt;= move_type
        <span class="hljs-keyword">and</span> move_type &lt; _TYPE_OFFSET + _NUM_TYPES
    )

    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> is_underpromotion:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

    underpromotion_type = move_type - _TYPE_OFFSET

    direction_idx, promotion_idx = np.unravel_index(
        indices=underpromotion_type,
        shape=(<span class="hljs-number">3</span>,<span class="hljs-number">3</span>)
    )

    direction = _DIRECTIONS[direction_idx]
    promotion = _PROMOTIONS[promotion_idx]

    to_rank = from_rank + <span class="hljs-number">1</span>
    to_file = from_file + direction

    move = utils.pack(from_rank, from_file, to_rank, to_file)
    move.promotion = promotion

    <span class="hljs-keyword">return</span> move

<span class="hljs-comment">#primary decoding function, the ones above are just helper functions</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">decodeMove</span>(<span class="hljs-params">action: int, board</span>) -&gt; chess.Move:</span>
        move = _decodeQueen(action)
        is_queen_move = move <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>

        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> move:
            move = _decodeKnight(action)

        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> move:
            move = _decodeUnderPromotion(action)

        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> move:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"<span class="hljs-subst">{action}</span> is not a valid action"</span>)

        <span class="hljs-comment"># Actions encode moves from the perspective of the current player. If</span>
        <span class="hljs-comment"># this is the black player, the move must be reoriented.</span>
        turn = board.turn

        <span class="hljs-keyword">if</span> turn == <span class="hljs-literal">False</span>: <span class="hljs-comment">#black to move</span>
            move = utils.rotate(move)

        <span class="hljs-comment"># Moving a pawn to the opponent's home rank with a queen move</span>
        <span class="hljs-comment"># is automatically assumed to be queen promotion. However,</span>
        <span class="hljs-comment"># since queenmoves has no reference to the board and can thus not</span>
        <span class="hljs-comment"># determine whether the moved piece is a pawn, you have to add this</span>
        <span class="hljs-comment"># information manually here</span>
        <span class="hljs-keyword">if</span> is_queen_move:
            to_rank = chess.square_rank(move.to_square)
            is_promoting_move = (
                (to_rank == <span class="hljs-number">7</span> <span class="hljs-keyword">and</span> turn == <span class="hljs-literal">True</span>) <span class="hljs-keyword">or</span> 
                (to_rank == <span class="hljs-number">0</span> <span class="hljs-keyword">and</span> turn == <span class="hljs-literal">False</span>)
            )

            piece = board.piece_at(move.from_square)
            <span class="hljs-keyword">if</span> piece <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>: <span class="hljs-comment">#NOTE I added this, not entirely sure if it's correct</span>
                <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
            is_pawn = piece.piece_type == chess.PAWN

            <span class="hljs-keyword">if</span> is_pawn <span class="hljs-keyword">and</span> is_promoting_move:
                move.promotion = chess.QUEEN

        <span class="hljs-keyword">return</span> move

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">encodeBoard</span>(<span class="hljs-params">board: chess.Board</span>) -&gt; np.ndarray:</span>
 <span class="hljs-string">"""Converts a board to numpy array representation."""</span>

 array = np.zeros((<span class="hljs-number">8</span>, <span class="hljs-number">8</span>, <span class="hljs-number">14</span>), dtype=int)

 <span class="hljs-keyword">for</span> square, piece <span class="hljs-keyword">in</span> board.piece_map().items():
  rank, file = chess.square_rank(square), chess.square_file(square)
  piece_type, color = piece.piece_type, piece.color

  <span class="hljs-comment"># The first six planes encode the pieces of the active player, </span>
  <span class="hljs-comment"># the following six those of the active player's opponent. Since</span>
  <span class="hljs-comment"># this class always stores boards oriented towards the white player,</span>
  <span class="hljs-comment"># White is considered to be the active player here.</span>
  offset = <span class="hljs-number">0</span> <span class="hljs-keyword">if</span> color == chess.WHITE <span class="hljs-keyword">else</span> <span class="hljs-number">6</span>

  <span class="hljs-comment"># Chess enumerates piece types beginning with one, which you have</span>
  <span class="hljs-comment"># to account for</span>
  idx = piece_type - <span class="hljs-number">1</span>

  array[rank, file, idx + offset] = <span class="hljs-number">1</span>

 <span class="hljs-comment"># Repetition counters</span>
 array[:, :, <span class="hljs-number">12</span>] = board.is_repetition(<span class="hljs-number">2</span>)
 array[:, :, <span class="hljs-number">13</span>] = board.is_repetition(<span class="hljs-number">3</span>)

 <span class="hljs-keyword">return</span> array
</code></pre>
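<p>The decoding helpers above all start by splitting a flat action index into a starting square and a move type with <code>np.unravel_index(action, (8, 8, 73))</code>. As a minimal sketch of what that call does, here is the same arithmetic in plain Python. The names <code>unravel_action</code> and <code>ravel_action</code> are my own for illustration and are not part of the tutorial code:</p>

```python
# Sketch only: mirrors np.unravel_index(action, (8, 8, 73)) with plain divmod.
NUM_RANKS, NUM_FILES, NUM_MOVE_TYPES = 8, 8, 73


def unravel_action(action):
    """Split a flat action index into (from_rank, from_file, move_type)."""
    total = NUM_RANKS * NUM_FILES * NUM_MOVE_TYPES  # 4672 possible actions
    if action not in range(total):
        raise ValueError(f"{action} is outside the action space of size {total}")
    square, move_type = divmod(action, NUM_MOVE_TYPES)
    from_rank, from_file = divmod(square, NUM_FILES)
    return from_rank, from_file, move_type


def ravel_action(from_rank, from_file, move_type):
    """Inverse of unravel_action: rebuild the flat action index."""
    return (from_rank * NUM_FILES + from_file) * NUM_MOVE_TYPES + move_type


print(unravel_action(499))   # (0, 6, 61)
print(unravel_action(4671))  # (7, 7, 72)
```

<p>The total of 8 * 8 * 73 = 4672 actions is the same number the model later uses as its output size.</p>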
<h3 id="heading-how-to-load-the-data">How to load the data</h3>
<p>In part 1, you mined some chess games, and then in part 2, you encoded them so that they could be used to train a model. </p>
<p>You now load this data in PyTorch data loader objects, so it’s available for the model to train on. In case you have not done part 1 or 2 of this tutorial, you can find some ready-made training files in <a target="_blank" href="https://drive.google.com/drive/folders/16QLJL2LQcz5hiONJnvuJwtUJqz6v-sN5?usp=sharing">this Google Drive folder.</a></p>
<p>First, define some hyperparameters:</p>
<pre><code class="lang-py">FRACTION_OF_DATA = <span class="hljs-number">1</span>
BATCH_SIZE = <span class="hljs-number">4</span>
</code></pre>
<p>The <code>FRACTION_OF_DATA</code> variable is there in case you want to train the model quickly and do not want to train it on the full dataset. Make sure this value is &gt; 0 and ≤ 1. </p>
<p>The <code>BATCH_SIZE</code> variable decides the batch size the model trains on. In general, a higher batch size means the model can train faster, but your batch size is limited by the power of your GPU. </p>
<p>I recommend testing with a low batch size of 4 and then trying to increase it and see if training still works as it should. If you get a memory error of some sort, try decreasing the batch size again.</p>
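<p>If you want to automate that trial and error, one option is to keep doubling the batch size until a training step fails. The sketch below is only an illustration: <code>largest_working_batch_size</code> and <code>try_batch</code> are hypothetical names, and <code>fake_step</code> stands in for a real training step. It assumes that an out-of-memory failure surfaces as a <code>MemoryError</code> or <code>RuntimeError</code>, which you should verify for your own setup:</p>

```python
def largest_working_batch_size(try_batch, start=4, limit=1024):
    """Double the batch size until try_batch fails or limit is reached.

    try_batch is a hypothetical callable that runs one training step with the
    given batch size and raises MemoryError or RuntimeError if it does not fit.
    Returns the largest size that worked, or None if even start failed.
    """
    best = None
    size = start
    while limit >= size:
        try:
            try_batch(size)
        except (MemoryError, RuntimeError):
            break
        best = size
        size *= 2
    return best


# Stand-in for a real training step: pretend anything above 64 runs out of memory.
def fake_step(batch_size):
    if batch_size > 64:
        raise MemoryError("out of memory")


print(largest_working_batch_size(fake_step))  # 64
```

<p>A larger batch that fits in memory is not automatically better for the final model, so treat the result as an upper bound rather than a recommendation.</p>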
<p>You then load the data with the code below. Make sure your folder structure and file naming are correct here. You should have a <code>data</code> folder in the same place as your code. </p>
<p>Inside this data folder, you should have a <code>preparedData</code> folder that contains the files you want to train on. These files have to be named <code>moves{i}.npy</code> and <code>positions{i}.npy</code>, where <code>i</code> is the index of the file. If you encoded the files as I did earlier, everything should be correct.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/0_3LT9odIm09DPFS59.png" alt="Image" width="600" height="400" loading="lazy">
<em>The folder structure. Yellow are folders, and turquoise are files.</em></p>
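<p>Before loading, it can help to confirm that every moves file has a matching positions file. This is a small sketch using only the standard library. The helper name <code>find_unpaired_indices</code> is hypothetical, and it assumes the <code>moves{i}.npy</code> / <code>positions{i}.npy</code> naming described above:</p>

```python
from pathlib import Path


def find_unpaired_indices(folder):
    """Return indices that have a moves file or a positions file, but not both."""
    folder = Path(folder)
    move_ids = {p.stem[len("moves"):] for p in folder.glob("moves*.npy")}
    board_ids = {p.stem[len("positions"):] for p in folder.glob("positions*.npy")}
    # Symmetric difference: indices present in exactly one of the two sets.
    return sorted(move_ids ^ board_ids)
```

<p>Calling <code>find_unpaired_indices("data/preparedData")</code> should return an empty list when the folder is consistent. Any index it does return points to a file whose partner is missing.</p>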
<pre><code class="lang-py"><span class="hljs-comment">#dataset</span>

<span class="hljs-comment">#loading training data</span>

allMoves = []
allBoards = []

files = os.listdir(<span class="hljs-string">'data/preparedData'</span>)
numOfEach = len(files) // <span class="hljs-number">2</span> <span class="hljs-comment"># half are moves, other half are positions</span>

<span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(numOfEach):
    <span class="hljs-keyword">try</span>:
        moves = np.load(<span class="hljs-string">f"data/preparedData/moves<span class="hljs-subst">{i}</span>.npy"</span>, allow_pickle=<span class="hljs-literal">True</span>)
        boards = np.load(<span class="hljs-string">f"data/preparedData/positions<span class="hljs-subst">{i}</span>.npy"</span>, allow_pickle=<span class="hljs-literal">True</span>)
        <span class="hljs-keyword">if</span> (len(moves) != len(boards)):
            print(<span class="hljs-string">"ERROR ON i = "</span>, i, len(moves), len(boards))
        allMoves.extend(moves)
        allBoards.extend(boards)
    <span class="hljs-keyword">except</span>:
        print(<span class="hljs-string">"error: could not load "</span>, i, <span class="hljs-string">", but is still going"</span>)

allMoves = np.array(allMoves)[:(int(len(allMoves) * FRACTION_OF_DATA))]
allBoards = np.array(allBoards)[:(int(len(allBoards) * FRACTION_OF_DATA))]
<span class="hljs-keyword">assert</span> len(allMoves) == len(allBoards), <span class="hljs-string">"MUST BE OF SAME LENGTH"</span>

<span class="hljs-comment">#flatten out boards</span>
<span class="hljs-comment"># allBoards = allBoards.reshape(allBoards.shape[0], -1)</span>

trainDataIdx = int(len(allMoves) * <span class="hljs-number">0.8</span>)

<span class="hljs-comment">#NOTE transfer all data to GPU if available</span>
device = torch.device(<span class="hljs-string">"cuda:0"</span> <span class="hljs-keyword">if</span> torch.cuda.is_available() <span class="hljs-keyword">else</span> <span class="hljs-string">"cpu"</span>)
allBoards = torch.from_numpy(np.asarray(allBoards)).to(device)
allMoves = torch.from_numpy(np.asarray(allMoves)).to(device)

training_set = torch.utils.data.TensorDataset(allBoards[:trainDataIdx], allMoves[:trainDataIdx])
test_set = torch.utils.data.TensorDataset(allBoards[trainDataIdx:], allMoves[trainDataIdx:])
<span class="hljs-comment"># Create data loaders for your datasets; shuffle for training, not for validation</span>

training_loader = torch.utils.data.DataLoader(training_set, batch_size=BATCH_SIZE, shuffle=<span class="hljs-literal">True</span>)
validation_loader = torch.utils.data.DataLoader(test_set, batch_size=BATCH_SIZE, shuffle=<span class="hljs-literal">False</span>)
</code></pre>
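<p>To see how <code>FRACTION_OF_DATA</code>, the 80/20 split, and <code>BATCH_SIZE</code> interact, here is the same slicing arithmetic as a standalone sketch. The function <code>split_sizes</code> is a hypothetical helper, not part of the tutorial code, and it assumes the data loader keeps the last, possibly smaller, batch:</p>

```python
import math


def split_sizes(num_samples, fraction=1.0, train_share=0.8, batch_size=4):
    """Return (train_size, val_size, train_batches) for the slicing used above."""
    if fraction > 1 or not fraction > 0:
        raise ValueError("fraction must be greater than 0 and at most 1")
    kept = int(num_samples * fraction)        # samples left after FRACTION_OF_DATA
    train_size = int(kept * train_share)      # same rounding as trainDataIdx
    val_size = kept - train_size
    train_batches = math.ceil(train_size / batch_size)
    return train_size, val_size, train_batches


print(split_sizes(100_000, fraction=0.5))  # (40000, 10000, 10000)
```

<p>Because the split is a plain slice, the validation set is always the tail of the loaded data. If your files are ordered in some meaningful way, you may want to shuffle before splitting.</p>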
<h3 id="heading-how-to-define-the-deep-learning-model">How to define the deep learning model</h3>
<p>You can then define the model architecture:</p>
<pre><code class="lang-py"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Model</span>(<span class="hljs-params">torch.nn.Module</span>):</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        super(Model, self).__init__()
        self.INPUT_SIZE = <span class="hljs-number">896</span> 
        <span class="hljs-comment"># self.INPUT_SIZE = 7*7*13 #NOTE changing input size for using cnns</span>
        self.OUTPUT_SIZE = <span class="hljs-number">4672</span> <span class="hljs-comment"># = number of unique moves (action space)</span>

        <span class="hljs-comment">#can try to add CNN and pooling here (calculations taking into account spatial features)</span>

        <span class="hljs-comment">#input shape for sample is (8,8,14), flattened to 1d array of size 896</span>
        <span class="hljs-comment"># self.cnn1 = nn.Conv3d(4,4,(2,2,4), padding=(0,0,1))</span>
        self.activation = torch.nn.ReLU()
        self.linear1 = torch.nn.Linear(self.INPUT_SIZE, <span class="hljs-number">1000</span>)
        self.linear2 = torch.nn.Linear(<span class="hljs-number">1000</span>, <span class="hljs-number">1000</span>)
        self.linear3 = torch.nn.Linear(<span class="hljs-number">1000</span>, <span class="hljs-number">1000</span>)
        self.linear4 = torch.nn.Linear(<span class="hljs-number">1000</span>, <span class="hljs-number">200</span>)
        self.linear5 = torch.nn.Linear(<span class="hljs-number">200</span>, self.OUTPUT_SIZE)
        self.softmax = torch.nn.Softmax(<span class="hljs-number">1</span>) <span class="hljs-comment">#use softmax as prob for each move, dim 1 as dim 0 is the batch dimension</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">forward</span>(<span class="hljs-params">self, x</span>):</span> <span class="hljs-comment">#x.shape = (batch size, 896)</span>
        x = x.to(torch.float32)
        <span class="hljs-comment"># x = self.cnn1(x) #for using cnns</span>
        x = x.reshape(x.shape[<span class="hljs-number">0</span>], <span class="hljs-number">-1</span>)
        x = self.linear1(x)
        x = self.activation(x)
        x = self.linear2(x)
        x = self.activation(x)
        x = self.linear3(x)
        x = self.activation(x)
        x = self.linear4(x)
        x = self.activation(x)
        x = self.linear5(x)
        <span class="hljs-comment"># x = self.softmax(x) #do not use softmax since you are using cross entropy loss</span>
        <span class="hljs-keyword">return</span> x

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict</span>(<span class="hljs-params">self, board : chess.Board</span>):</span>
        <span class="hljs-string">"""takes in a chess board and returns a chess.move object. NOTE: this function should definitely be written better, but it works for now"""</span>
        <span class="hljs-keyword">with</span> torch.no_grad():
            encodedBoard = encodeBoard(board)
            encodedBoard = encodedBoard.reshape(<span class="hljs-number">1</span>, <span class="hljs-number">-1</span>)
            encodedBoard = torch.from_numpy(encodedBoard)
            res = self.forward(encodedBoard)
            probs = self.softmax(res)

            probs = probs.numpy()[<span class="hljs-number">0</span>] <span class="hljs-comment">#do not want tensor anymore, 0 since it is a 2d array with 1 row</span>

            <span class="hljs-comment">#verify that move is legal and can be decoded before returning</span>
            <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(len(probs)): <span class="hljs-comment">#try each action at most once</span>
                moveIdx = probs.argmax()
                probs[moveIdx] = <span class="hljs-number">-1</span> <span class="hljs-comment">#mask the action instead of deleting it, so indices keep matching the action space</span>
                <span class="hljs-keyword">try</span>: <span class="hljs-comment">#TODO should not have try here, but was a bug with idx 499 if it is black to move</span>
                    uciMove = decodeMove(moveIdx, board)
                    <span class="hljs-keyword">if</span> (uciMove <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>): <span class="hljs-comment">#could not decode</span>
                        <span class="hljs-keyword">continue</span>
                    move = chess.Move.from_uci(str(uciMove))
                    <span class="hljs-keyword">if</span> (move <span class="hljs-keyword">in</span> board.legal_moves): <span class="hljs-comment">#if legal, return, else: try the next most likely action</span>
                        <span class="hljs-keyword">return</span> move 
                <span class="hljs-keyword">except</span> Exception:
                    <span class="hljs-keyword">pass</span>

            <span class="hljs-comment">#TODO can return random move here as well!</span>
            <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span> <span class="hljs-comment">#if no legal moves found, return None</span>
</code></pre>
<p>You are free to change the architecture however you like. </p>
<p>Here, I have just chosen some simple parameters that worked decently, though there is room for improvement. Some examples of changes you can make are:</p>
<ol>
<li>Add <a target="_blank" href="https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html">PyTorch CNN modules</a> (remember to not flatten the array before adding these)</li>
<li>Change the activation functions in hidden layers. I am now using <a target="_blank" href="https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html">ReLU</a>, but this could be changed to for example Sigmoid or Tanh, which you can read more about <a target="_blank" href="https://machinelearningmastery.com/choose-an-activation-function-for-deep-learning/">here</a>.</li>
<li>Change the number of hidden layers. When changing this, you must remember to add an activation function between each layer in the <code>forward()</code> function.</li>
<li>Change the number of neurons in each hidden layer. When you do, remember the rule that the number of output neurons of layer n must equal the number of input neurons of layer n+1. For example, if <code>linear1</code> outputs 2000 neurons, then <code>linear2</code> must take in 2000 neurons. You can then freely choose the number of output neurons of <code>linear2</code>, but it must match the number of input neurons of <code>linear3</code>, and so on. The input to the first layer and the output from the last layer, however, are set by the parameters <code>INPUT_SIZE</code> and <code>OUTPUT_SIZE</code>.</li>
</ol>
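<p>The size-matching rule from the last point can be sketched without PyTorch. The snippet below pairs up consecutive layer widths and counts the resulting weights and biases for the widths used in this tutorial. The helper names are my own, for illustration only:</p>

```python
# Layer widths of the tutorial's network: input, four hidden layers, output.
LAYER_SIZES = [896, 1000, 1000, 1000, 200, 4672]


def linear_shapes(sizes):
    """Pair consecutive widths as (in_features, out_features) per linear layer."""
    return list(zip(sizes[:-1], sizes[1:]))


def count_parameters(sizes):
    """Weights plus biases of a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in linear_shapes(sizes))


print(linear_shapes(LAYER_SIZES))
print(count_parameters(LAYER_SIZES))  # 4038272
```

<p>Building the layers from a single list like this makes a mismatch impossible, because each layer's input width is by construction the previous layer's output width.</p>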
<p>In addition to the model architecture and the <code>forward()</code> function, which are obligatory when creating a deep learning model, I also defined a <code>predict()</code> function. It makes it easier to give the model a chess position and get back the move it recommends.</p>
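<p>The core idea of <code>predict()</code> is to walk through the actions from most to least likely and return the first one that is legal. Here is a minimal sketch of that selection step in plain Python, where every score keeps its original action index throughout. <code>best_legal_action</code> and <code>is_legal</code> are hypothetical names, with <code>is_legal</code> standing in for decoding the action and checking it against <code>board.legal_moves</code>:</p>

```python
def best_legal_action(scores, is_legal):
    """Return the highest-scoring action index accepted by is_legal, or None.

    scores holds one score per action. Indices are never removed from the
    list, so each index always refers to the same action.
    """
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    for action in ranked:
        if is_legal(action):
            return action
    return None


scores = [0.1, 0.5, 0.3, 0.1]
print(best_legal_action(scores, lambda a: a in {0, 2}))  # 2
```

<p>Sorting all 4672 scores once is cheap at prediction time, and it avoids any bookkeeping about which entries have already been tried.</p>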
<h3 id="heading-how-to-train-the-model">How to train the model</h3>
<p>When you have all the required data and the model is defined, you can begin training the model. First, you define a function to train one epoch and save the best model:</p>
<pre><code class="lang-py"><span class="hljs-comment">#helper functions for training</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">train_one_epoch</span>(<span class="hljs-params">model, optimizer, loss_fn, epoch_index, tb_writer</span>):</span>
    running_loss = <span class="hljs-number">0.</span>
    last_loss = <span class="hljs-number">0.</span>

    <span class="hljs-comment"># Here, you use enumerate(training_loader) instead of</span>
    <span class="hljs-comment"># iter(training_loader) so that you can track the batch</span>
    <span class="hljs-comment"># index and do some intra-epoch reporting</span>
    <span class="hljs-keyword">for</span> i, data <span class="hljs-keyword">in</span> enumerate(training_loader):

        <span class="hljs-comment"># Every data instance is an input + label pair</span>
        inputs, labels = data

        <span class="hljs-comment"># Zero your gradients for every batch!</span>
        optimizer.zero_grad()

        <span class="hljs-comment"># Make predictions for this batch</span>
        outputs = model(inputs)

        <span class="hljs-comment"># Compute the loss and its gradients</span>
        loss = loss_fn(outputs, labels)
        loss.backward()

        <span class="hljs-comment"># Adjust learning weights</span>
        optimizer.step()

        <span class="hljs-comment"># Gather data and report</span>
        running_loss += loss.item()
        <span class="hljs-keyword">if</span> i % <span class="hljs-number">1000</span> == <span class="hljs-number">999</span>:
            last_loss = running_loss / <span class="hljs-number">1000</span> <span class="hljs-comment"># loss per batch</span>
            <span class="hljs-comment"># print('  batch {} loss: {}'.format(i + 1, last_loss))</span>
            tb_x = epoch_index * len(training_loader) + i + <span class="hljs-number">1</span>
            tb_writer.add_scalar(<span class="hljs-string">'Loss/train'</span>, last_loss, tb_x)
            running_loss = <span class="hljs-number">0.</span>

    <span class="hljs-keyword">return</span> last_loss

<span class="hljs-comment">#the 3 functions below help store the best model you have created yet</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">createBestModelFile</span>():</span>
    <span class="hljs-comment">#first find best model if it exists:</span>
    folderPath = Path(<span class="hljs-string">'./savedModels'</span>)
    <span class="hljs-keyword">if</span> (<span class="hljs-keyword">not</span> folderPath.exists()):
        os.mkdir(folderPath)

    path = Path(<span class="hljs-string">'./savedModels/bestModel.txt'</span>)

    <span class="hljs-keyword">if</span> (<span class="hljs-keyword">not</span> path.exists()):
        <span class="hljs-comment">#create the files</span>
        f = open(path, <span class="hljs-string">"w"</span>)
        f.write(<span class="hljs-string">"10000000"</span>) <span class="hljs-comment">#set to high number so it is overwritten with better loss</span>
        f.write(<span class="hljs-string">"\ntestPath"</span>)
        f.close()

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">saveBestModel</span>(<span class="hljs-params">vloss, pathToBestModel</span>):</span>
    f = open(<span class="hljs-string">"./savedModels/bestModel.txt"</span>, <span class="hljs-string">"w"</span>)
    f.write(str(vloss.item()))
    f.write(<span class="hljs-string">"\n"</span>)
    f.write(pathToBestModel)
    f.close() <span class="hljs-comment">#close the file so the contents are written to disk</span>
    print(<span class="hljs-string">"NEW BEST MODEL FOUND WITH LOSS:"</span>, vloss)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">retrieveBestModelInfo</span>():</span>
    f = open(<span class="hljs-string">'./savedModels/bestModel.txt'</span>, <span class="hljs-string">"r"</span>)
    bestLoss = float(f.readline())
    bestModelPath = f.readline()
    f.close()
    <span class="hljs-keyword">return</span> bestLoss, bestModelPath
</code></pre>
<p>Note that the training function is essentially copied from the <a target="_blank" href="https://pytorch.org/tutorials/beginner/introyt/trainingyt.html">PyTorch docs</a>, with a slight change: the model, optimizer, and loss function are passed in as function parameters.</p>
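<p>The three helper functions share a tiny file format: the best validation loss on the first line and the path to the saved model on the second. As a sketch of that format only, here is a round trip written with <code>pathlib</code>. The function names are hypothetical and are not meant as replacements for the helpers above:</p>

```python
from pathlib import Path


def write_best_model_info(info_path, loss, model_path):
    """Store the loss on line one and the model path on line two."""
    Path(info_path).write_text(f"{loss}\n{model_path}", encoding="utf-8")


def read_best_model_info(info_path):
    """Return (loss, model_path) from a file written by write_best_model_info."""
    text = Path(info_path).read_text(encoding="utf-8")
    first, second = text.split("\n", 1)
    return float(first), second.strip()
```

<p>Using <code>write_text</code> and <code>read_text</code> means the file is always closed after each call, so the stored values are on disk before the next epoch reads them.</p>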
<p>You then define the hyperparameters like below. Note that this is something you can tune, to further improve your model.</p>
<pre><code class="lang-py"><span class="hljs-comment">#hyperparameters</span>
EPOCHS = <span class="hljs-number">60</span>
LEARNING_RATE = <span class="hljs-number">0.001</span>
MOMENTUM = <span class="hljs-number">0.9</span>
</code></pre>
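<p>To build some intuition for <code>LEARNING_RATE</code> and <code>MOMENTUM</code>, here is a sketch of a momentum update on a single scalar weight. It follows the update rule documented for <code>torch.optim.SGD</code> with default settings as I understand it, reduced to one number. The toy objective and the values <code>lr=0.1</code> and <code>steps=200</code> are assumptions chosen only to make the effect visible:</p>

```python
def sgd_momentum_step(weight, velocity, grad, lr, momentum):
    """One update: velocity accumulates gradients, weight moves against it."""
    velocity = momentum * velocity + grad
    weight = weight - lr * velocity
    return weight, velocity


def minimize_square(start=1.0, lr=0.1, momentum=0.9, steps=200):
    """Minimize f(w) = w**2, whose gradient is 2 * w."""
    weight, velocity = start, 0.0
    for _ in range(steps):
        grad = 2 * weight
        weight, velocity = sgd_momentum_step(weight, velocity, grad, lr, momentum)
    return weight


print(minimize_square())  # a value very close to 0
```

<p>A higher momentum lets earlier gradients keep pushing the weight, which can speed up training but may overshoot if the learning rate is also high.</p>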
<p>Run the training with the code below:</p>
<pre><code class="lang-py"><span class="hljs-comment">#run training</span>

createBestModelFile()

bestLoss, bestModelPath = retrieveBestModelInfo()

timestamp = datetime.now().strftime(<span class="hljs-string">'%Y%m%d_%H%M%S'</span>)
writer = SummaryWriter(<span class="hljs-string">'runs/fashion_trainer_{}'</span>.format(timestamp))
epoch_number = <span class="hljs-number">0</span>

model = Model()
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE, momentum=MOMENTUM)
device = torch.device(<span class="hljs-string">"cuda:0"</span> <span class="hljs-keyword">if</span> torch.cuda.is_available() <span class="hljs-keyword">else</span> <span class="hljs-string">"cpu"</span>)
model.to(device)

best_vloss = <span class="hljs-number">1</span>_000_000.

<span class="hljs-keyword">for</span> epoch <span class="hljs-keyword">in</span> tqdm(range(EPOCHS)):
    <span class="hljs-keyword">if</span> (epoch_number % <span class="hljs-number">5</span> == <span class="hljs-number">0</span>):
        print(<span class="hljs-string">'EPOCH {}:'</span>.format(epoch_number + <span class="hljs-number">1</span>))

    <span class="hljs-comment"># Make sure gradient tracking is on, and do a pass over the data</span>
    model.train(<span class="hljs-literal">True</span>)
    avg_loss = train_one_epoch(model, optimizer, loss_fn, epoch_number, writer)

    running_vloss = <span class="hljs-number">0.0</span>
    <span class="hljs-comment"># Set the model to evaluation mode, disabling dropout and using population</span>
    <span class="hljs-comment"># statistics for batch normalization.</span>

    model.eval()

    <span class="hljs-comment"># Disable gradient computation and reduce memory consumption.</span>
    <span class="hljs-keyword">with</span> torch.no_grad():
        <span class="hljs-keyword">for</span> i, vdata <span class="hljs-keyword">in</span> enumerate(validation_loader):
            vinputs, vlabels = vdata
            voutputs = model(vinputs)

            vloss = loss_fn(voutputs, vlabels)
            running_vloss += vloss

    avg_vloss = running_vloss / (i + <span class="hljs-number">1</span>)

    <span class="hljs-comment">#only print every 5 epochs</span>
    <span class="hljs-keyword">if</span> epoch_number % <span class="hljs-number">5</span> == <span class="hljs-number">0</span>:
        print(<span class="hljs-string">'LOSS train {} valid {}'</span>.format(avg_loss, avg_vloss))

    <span class="hljs-comment"># Log the running loss averaged per batch</span>
    <span class="hljs-comment"># for both training and validation</span>
    writer.add_scalars(<span class="hljs-string">'Training vs. Validation Loss'</span>,
                    { <span class="hljs-string">'Training'</span> : avg_loss, <span class="hljs-string">'Validation'</span> : avg_vloss },
                    epoch_number + <span class="hljs-number">1</span>)
    writer.flush()

    <span class="hljs-comment"># Track best performance, and save the model's state</span>
    <span class="hljs-keyword">if</span> avg_vloss &lt; best_vloss:
        best_vloss = avg_vloss

        <span class="hljs-keyword">if</span> (bestLoss &gt; best_vloss): <span class="hljs-comment">#if better than previous best loss from all models created, save it</span>
            model_path = <span class="hljs-string">'savedModels/model_{}_{}'</span>.format(timestamp, epoch_number)
            torch.save(model.state_dict(), model_path)
            saveBestModel(best_vloss, model_path)

    epoch_number += <span class="hljs-number">1</span>

print(<span class="hljs-string">"\n\nBEST VALIDATION LOSS FOR ALL MODELS: "</span>, bestLoss)
</code></pre>
<p>This code is also heavily inspired by the <a target="_blank" href="https://pytorch.org/tutorials/beginner/introyt/trainingyt.html">PyTorch docs.</a></p>
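<p>The training code passes the raw model outputs (logits) to <code>CrossEntropyLoss</code>, which applies the log-softmax itself. That is why <code>forward()</code> skips the softmax layer. The plain Python sketch below shows the equivalence for a single sample. It is an illustration of the math, not of PyTorch's actual implementation:</p>

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class, computed from raw logits."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


logits = [2.0, 0.5, -1.0]
print(cross_entropy(logits, 0))
print(-math.log(softmax(logits)[0]))  # same value up to rounding
```

<p>Applying softmax in <code>forward()</code> and then passing the result to the loss would normalize the scores twice, which distorts the loss the model is trained on.</p>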
<p>Depending on the number of layers in your model, the number of neurons in the layers, the number of epochs, whether you are using a GPU, and several other factors, training the model can take anywhere from seconds to several hours. </p>
<p>As you can see below, the estimated time to train my model here was about 2 minutes.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/0_-JEKRkXNUxXy4CYh.gif" alt="Image" width="600" height="400" loading="lazy">
<em>Video of the model training. Recorded using <a target="_blank" href="https://www.cockos.com/licecap/">LICEcap</a>.</em></p>
<h3 id="heading-how-to-test-your-model">How to test your model</h3>
<p>Testing your model is a vital part of checking if what you created works. I have implemented two ways of checking the model:</p>
<h4 id="heading-yourself-vs-ai">Yourself vs AI</h4>
<p>The first way is to play yourself against the AI. Here you decide a move, then you let the AI decide the move, and so on. I recommend doing this in a notebook, so you can run different cells for different actions.</p>
<p>First, load a model that was saved from training. Here, I get the path from the file created during training, which stores the path to your best model. You can of course also manually change the path to the model you prefer to use.</p>
<pre><code class="lang-py">saved_model = Model()

<span class="hljs-comment">#load best model path from your file</span>
f = open(<span class="hljs-string">"./savedModels/bestModel.txt"</span>, <span class="hljs-string">"r"</span>)
bestLoss = float(f.readline())
model_path = f.readline()
f.close()

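saved_model.load_state_dict(torch.load(model_path, map_location=<span class="hljs-string">"cpu"</span>)) <span class="hljs-comment">#load the weights into saved_model, on the CPU since predict() works on CPU tensors</span>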
</code></pre>
<p>Then, define the chess board:</p>
<pre><code class="lang-py"><span class="hljs-comment">#play your own game</span>
board = chess.Board()
</code></pre>
<p>Then you can make a move by running the code in the cell below by changing the string in the first line. Make sure it is a legal move:</p>
<pre><code class="lang-py">moveStr = <span class="hljs-string">"e2e4"</span>
move = chess.Move.from_uci(moveStr)
board.push(move)
</code></pre>
<p>Then you can let the AI decide the next move with the cell below:</p>
<pre><code class="lang-py"><span class="hljs-comment">#make ai move:</span>
aiMove = saved_model.predict(board)
board.push(aiMove)
board
</code></pre>
<p>This will also print the board state so you can decide your own move more easily:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/0_mkYWyk2zoU01fyEj.png" alt="Image" width="600" height="400" loading="lazy">
<em>Printing the board state after the AI makes a move</em></p>
<p>Keep alternating between your own moves and the AI's moves, and see who wins!</p>
<p>If you want to undo a move, you can use:</p>
<pre><code class="lang-py"><span class="hljs-comment">#undo the last move:</span>
board.pop()
</code></pre>
<h4 id="heading-stockfish-vs-your-ai">Stockfish vs your AI</h4>
<p>You can also automate the testing process, by setting Stockfish to a specific ELO, and letting your AI play against it:</p>
<p>First, load your model (make sure to change the <code>model_path</code> to your own model):</p>
<pre><code class="lang-py">saved_model = Model()
model_path = <span class="hljs-string">"savedModels/model_20230702_150228_46"</span> <span class="hljs-comment">#TODO CHANGE THIS PATH</span>
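saved_model.load_state_dict(torch.load(model_path, map_location=<span class="hljs-string">"cpu"</span>)) <span class="hljs-comment">#load the weights into saved_model, on the CPU since predict() works on CPU tensors</span>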
</code></pre>
<p>Then import Stockfish, and set it to a specific ELO. Remember to change the path to the Stockfish engine to the path where you have the Stockfish program:</p>
<pre><code class="lang-py"><span class="hljs-comment"># test elo  against stockfish</span>
ELO_RATING = <span class="hljs-number">500</span>
<span class="hljs-keyword">from</span> stockfish <span class="hljs-keyword">import</span> Stockfish
<span class="hljs-comment">#TODO CHANGE PATH BELOW</span>
stockfish = Stockfish(path=<span class="hljs-string">r"C:\Users\eivin\Documents\ownProgrammingProjects18062023\ChessEngine\stockfish\stockfish\stockfish-windows-2022-x86-64-avx2"</span>)
stockfish.set_elo_rating(ELO_RATING)
</code></pre>
<p>A 500 ELO rating is quite low, so it is a level your engine will hopefully beat.</p>
<p>Then play the game with this script:</p>
<pre><code class="lang-py">board = chess.Board()
allMoves = [] <span class="hljs-comment">#list of strings for saving moves for setting pos for stockfish</span>

MAX_NUMBER_OF_MOVES = <span class="hljs-number">150</span>
<span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(MAX_NUMBER_OF_MOVES): <span class="hljs-comment">#set a limit for the game</span>

 <span class="hljs-comment">#first my ai move</span>
 <span class="hljs-keyword">try</span>:
  move = saved_model.predict(board)
  board.push(move)
  allMoves.append(str(move)) <span class="hljs-comment">#add so stockfish can see</span>
 <span class="hljs-keyword">except</span>:
  print(<span class="hljs-string">"game over. You lost"</span>)
  <span class="hljs-keyword">break</span>

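 <span class="hljs-keyword">if</span> board.is_game_over(): <span class="hljs-comment">#stop if the AI's move ended the game</span>
  <span class="hljs-keyword">break</span>

 <span class="hljs-comment">#then get stockfish move</span>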
 stockfish.set_position(allMoves)
 stockfishMove = stockfish.get_best_move_time(<span class="hljs-number">3</span>)
 allMoves.append(stockfishMove)
 stockfishMove = chess.Move.from_uci(stockfishMove)
 board.push(stockfishMove)

stockfish.reset_engine_parameters() <span class="hljs-comment">#reset elo rating</span>

board
</code></pre>
<p>This will print the board position after the game is over.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/0_TmLzgIp2R5_bNyy7.png" alt="Image" width="600" height="400" loading="lazy">
<em>Position after your chess engine lost a game to Stockfish</em></p>
<h3 id="heading-reflection-on-the-performance-of-the-chess-engine">Reflection on the performance of the chess engine</h3>
<p>I tried training the model on about 100k positions and moves and discovered that the performance of the model still is not enough to beat a low-level (500 ELO) chess bot. </p>
<p>There could be several reasons for this. Chess is a highly complicated game that probably requires far more moves and positions before a decent bot can be developed.</p>
<p>Furthermore, there are several elements of the bot you can change to potentially improve it. The architecture can be improved, for example by adding a CNN at the beginning of the forward function, so that the bot takes in spatial information.</p>
<p>You can also change the number of fully connected hidden layers, or the number of neurons in each layer. </p>
<p>A safe way to further improve the model is to feed it more data, as you can generate a practically unlimited amount of it by using the mining code in <a target="_blank" href="https://medium.com/dev-genius/creating-an-ai-chess-engine-using-imitation-learning-part-1-generating-dataset-8033d9e7f7dc">this article</a>. </p>
<p>Additionally, I think this shows that an imitation learning chess engine either needs a lot of data, or that training a chess engine solely through imitation learning might not be an optimal approach. </p>
<p>Still, imitation learning can be used as part of a chess engine, for example, if you also implement traditional searching methods, and add imitation learning on top of it.</p>
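<p>To make that last idea concrete, here is a minimal sketch of a traditional search (negamax with alpha-beta pruning) that uses a policy only to decide which moves to try first. It runs on a toy stone-taking game rather than chess so that it stays self-contained. <em>uniform_policy</em> is a hypothetical stand-in for the imitation-learning model, and none of these names come from the engine built in this tutorial.</p>

```python
from typing import Callable, Dict, List, Optional, Tuple

# A policy maps (position, legal moves) to a preference score per move.
Policy = Callable[[int, List[int]], Dict[int, float]]


def legal_moves(stones: int) -> List[int]:
    """Toy game: take 1 to 3 stones; whoever takes the last stone wins."""
    return [take for take in (1, 2, 3) if stones >= take]


def uniform_policy(stones: int, moves: List[int]) -> Dict[int, float]:
    """Hypothetical stand-in for the imitation model: no preference at all."""
    return {move: 1.0 / len(moves) for move in moves}


def negamax(stones: int, alpha: int, beta: int, policy: Policy) -> Tuple[int, Optional[int]]:
    """Return (score, best move) for the player to move: +1 win, -1 loss."""
    if stones == 0:
        # The opponent just took the last stone, so the player to move lost.
        return -1, None
    moves = legal_moves(stones)
    scores = policy(stones, moves)
    # Try the moves the policy prefers first; good ordering prunes more.
    ordered = sorted(moves, key=lambda move: scores[move], reverse=True)
    best_score, best_move = -2, None
    for move in ordered:
        child_score, _ = negamax(stones - move, -beta, -alpha, policy)
        score = -child_score  # a good result for the opponent is bad for us
        if score > best_score:
            best_score, best_move = score, move
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: the remaining moves cannot change the result
    return best_score, best_move


def choose_move(stones: int, policy: Policy) -> Optional[int]:
    """Pick a move by searching the whole (tiny) game tree."""
    return negamax(stones, -1, 1, policy)[1]


print(choose_move(5, uniform_policy))  # 1: leaves the opponent with 4 stones
```

<p>In a chess engine, the policy would presumably be the trained model's move preferences, and the search would have to stop at a fixed depth and rely on an evaluation function instead of playing every line to the end.</p>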
<h2 id="heading-conclusion">Conclusion</h2>
<p>Congrats! You have now made your own AI chess engine from scratch, and I hope you learned something along the way. You can keep improving this engine so that it beats stronger and stronger competition.</p>
<p>If you want the full code, check out <a target="_blank" href="https://github.com/EivindKjosbakken/ChessEngine/blob/main/part3Training.ipynb">my GitHub</a>.</p>
<p>This tutorial was originally written part by part on my Medium. You can check out each part here:</p>
<ul>
<li><a target="_blank" href="https://blog.devgenius.io/creating-an-ai-chess-engine-using-imitation-learning-part-1-generating-dataset-8033d9e7f7dc">Part 1: Generating the dataset</a></li>
<li><a target="_blank" href="https://medium.com/dev-genius/creating-an-ai-chess-engine-part-2-encoding-using-the-alphazero-method-63c3c3c3a960">Part 2: Encoding with the AlphaZero method</a></li>
<li><a target="_blank" href="https://python.plainenglish.io/learn-how-to-train-your-awesome-self-playing-ai-chess-engine-77a46633a949">Part 3: Training the model</a></li>
</ul>
<p>If you are interested and want to learn more about similar topics, you can find me on:</p>
<ul>
<li>✅ <a target="_blank" href="https://medium.com/@oieivind">Medium</a></li>
<li>✅ <a target="_blank" href="https://x.com/Ravenspike21">Twitter</a></li>
<li>✅ <a target="_blank" href="https://www.linkedin.com/in/eivind-kjosbakken/">LinkedIn</a></li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Fine-Tune the Donut Model – With Example Use Case ]]>
                </title>
                <description>
                    <![CDATA[ The Donut model in Python is a model you can use to extract text from a given image. This can be useful in various scenarios, like scanning receipts, for example.  You can easily download the Donut model from GitHub. But as is common with AI models, ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-fine-tune-the-donut-model/</link>
                <guid isPermaLink="false">66be0121c869f0000ecfe9d5</guid>
                
                    <category>
                        <![CDATA[ Artificial Intelligence ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Machine Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Eivind Kjosbakken ]]>
                </dc:creator>
                <pubDate>Tue, 12 Sep 2023 17:59:17 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2023/09/undraw_Dashboard_re_3b76-4.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>The Donut model in Python is a model you can use to extract text from a given image. This can be useful in various scenarios, like scanning receipts, for example. </p>
<p>You can easily download the <a target="_blank" href="https://github.com/clovaai/donut">Donut model from GitHub</a>. But as is common with AI models, you should fine-tune the model for your specific needs. </p>
<p>I wrote this tutorial because I did not find any resources showing me exactly how to fine-tune the Donut model with my dataset. So I had to learn this from other tutorials (which I'll share throughout this guide) and figure out issues myself. </p>
<p>These issues were especially prevalent as I did not have a GPU on my local computer. So to simplify the process for others, I made this tutorial.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/img1.png" alt="Image" width="600" height="400" loading="lazy">
<em>Extract information from receipts. The picture was taken from <a target="_blank" href="https://colab.research.google.com/drive/1NMSqoIZ_l39wyRD7yVjw2FIuU2aglzJi?usp=sharing#scrollTo=f7RoSOEXUa6i" rel="noopener">this Google Colab file</a> using a photo taken by me</em></p>
<h3 id="heading-heres-what-well-cover">Here's what we'll cover:</h3>
<ul>
<li>How to find a dataset to fine-tune with</li>
<li>Fine-tuning with Google Colab</li>
<li>How to change parameters</li>
<li>Fine-tuning locally</li>
</ul>
<h2 id="heading-how-to-find-a-dataset-to-fine-tune-with">How to Find a Dataset to Fine-tune with</h2>
<h3 id="heading-finding-a-dataset-online">Finding a dataset online</h3>
<p>To fine-tune the model, we need a dataset to fine-tune with. If you want a simple solution, you can find a prepared dataset in <a target="_blank" href="https://drive.google.com/drive/folders/1orOj76DW2o-w3Dnati2CKAlXauH8STpT?usp=sharing">this folder on Google Drive.</a></p>
<p>You should then copy this dataset over to your own Google Drive. Note that this was taken from <a target="_blank" href="https://towardsdatascience.com/ocr-free-document-understanding-with-donut-1acfbdf099be">this tutorial</a> under the “Downloading and parsing SROIE” headline. The tutorial is a great read which inspired this article, as I wanted to create a more in-depth tutorial for fine-tuning the Donut model in Google Colab. So if you want a more in-depth look at generating the dataset, I recommend reading the tutorial above.</p>
<p>The dataset linked above may not fit your specific purpose. If you want to fine-tune a model to your specific needs, you either need to find a fitting dataset online, or create a dataset yourself.</p>
<h3 id="heading-annotating-your-own-dataset">Annotating your own dataset</h3>
<p>If you can't or don't want to find a dataset online, another option is to annotate your own. (If you already found a dataset, you can skip this subsection.)</p>
<p>Annotating your own dataset is a surefire way to create a dataset that perfectly fits your needs. </p>
<p>There are many annotating tools online, but a free one I recommend is the <a target="_blank" href="https://github.com/katanaml/sparrow">Sparrow UI data annotation tool</a>. Here you can upload your image, put bounding boxes on the image, and label each bounding box. You can then extract the labeled data in JSON format, and use it following the rest of the tutorial. </p>
<p>Make sure your dataset is in the same format as the <a target="_blank" href="https://drive.google.com/drive/folders/1orOj76DW2o-w3Dnati2CKAlXauH8STpT">dataset I provided earlier</a>. For more details on annotating data with the Sparrow UI, you can check out <a target="_blank" href="https://medium.com/python-in-plain-english/empower-your-donut-model-for-receipts-with-self-annotated-data-51fc882b7229">my article on using the Donut model for self-annotated data</a>. Note that the linked article assumes you are already able to fine-tune the Donut model (which you will learn in this tutorial).</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/img2.png" alt="Image" width="600" height="400" loading="lazy">
<em>Annotating a receipt with the Sparrow UI data annotation tool</em></p>
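<p>If you build the dataset yourself, the sketch below shows one way to write the annotations to disk. It assumes the layout described in the Donut repository's README: one folder per split, each with a <em>metadata.jsonl</em> file whose lines hold a <em>file_name</em> and a <em>ground_truth</em> JSON string containing a <em>gt_parse</em> object. The helper names and the example fields are made up for illustration, so compare the output with the prepared dataset before relying on it.</p>

```python
import json
from pathlib import Path
from typing import Dict, Iterable, Tuple


def make_metadata_line(file_name: str, fields: Dict[str, str]) -> str:
    """Build one metadata.jsonl line for a single annotated image."""
    # ground_truth is itself a JSON string, so the fields are encoded twice.
    ground_truth = json.dumps({"gt_parse": fields})
    return json.dumps({"file_name": file_name, "ground_truth": ground_truth})


def write_metadata(split_dir: Path, samples: Iterable[Tuple[str, Dict[str, str]]]) -> Path:
    """Write metadata.jsonl next to the images of one split (for example train)."""
    split_dir.mkdir(parents=True, exist_ok=True)
    out_path = split_dir / "metadata.jsonl"
    with out_path.open("w", encoding="utf-8") as handle:
        for file_name, fields in samples:
            handle.write(make_metadata_line(file_name, fields) + "\n")
    return out_path


# Illustrative receipt fields; use the labels from your own annotations.
example = make_metadata_line("receipt_0.jpg", {"company": "Example Store", "total": "9.00"})
print(example)
```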
<h2 id="heading-fine-tuning-with-google-colab">Fine-Tuning with Google Colab</h2>
<p>To make the fine-tuning process as simple as possible, I provided a <a target="_blank" href="https://colab.research.google.com/drive/1-qfztYjDrFecOWdqyANtI23HV06xhDRE?usp=sharing">Google Colab file you can use here</a>. (Some code is taken from <a target="_blank" href="https://github.com/NielsRogge/Transformers-Tutorials">this GitHub page</a>). </p>
<p>Note that package versions need to be exactly as provided in the Drive, as wrong package versions were the root of a lot of the problems I faced fine-tuning the Donut model myself.</p>
<p>Before fine-tuning using the Google Colab file, there are 2 things you need to do:</p>
<h3 id="heading-upload-data-to-your-google-drive">Upload data to your Google Drive</h3>
<p>Upload the <a target="_blank" href="https://drive.google.com/file/d/1WsWLVZhKLb8A0uCJ7Jpk8F5pCNDNsGbH/view?usp=sharing">dataset I provided earlier</a> to your Google Drive under a parent folder called <em>preparedFinetuneData</em> (see the file structure in the image below). </p>
<p>Make sure to add the parent folder in the root folder for your Google Drive. Also, download <a target="_blank" href="https://drive.google.com/file/d/1WsWLVZhKLb8A0uCJ7Jpk8F5pCNDNsGbH/view?usp=sharing">this config file</a> and add it to the root folder of your Google Drive.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/img.png" alt="Image" width="600" height="400" loading="lazy">
<em>How your dataset should look in the root folder of Google Drive</em></p>
<h3 id="heading-link-your-google-drive-to-your-google-colab">Link your Google Drive to your Google Colab</h3>
<p>When you run the cell which mounts the Google Drive, you might get a prompt, in which case you can just accept it and ignore the rest of this paragraph. </p>
<p>If you do not get a prompt, press the files icon (red in the image below), and the Mount Drive Icon (blue in the image below). Then you will get a code snippet that you can run, and now your Google Drive is connected. </p>
<p>Note that if you have not connected Google Colab to Google Drive before, you have to log into your Google Drive after pressing the Drive icon, and give Colab permission to access Drive (prompts for this should appear automatically when you try to link the Drive).</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/img3.png" alt="Image" width="600" height="400" loading="lazy">
<em>Files icon (red). Mount Google Drive (blue)</em></p>
<p>Finally, <strong>restart your runtime</strong>. After altering files on Google Colab, you always have to restart your runtime to see the latest updates.</p>
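<p>As a rough sketch, the mounting step and a quick check that the data landed in the right place could look like this. <em>drive.mount</em> is the standard Colab call, and <em>/content/drive/MyDrive</em> is where Colab normally exposes the root of your Drive. <em>missing_items</em> is a hypothetical helper written for this example, not part of the notebook.</p>

```python
from pathlib import Path
from typing import List

# Where Colab normally exposes the root of your Google Drive after mounting.
DRIVE_ROOT = Path("/content/drive/MyDrive")


def missing_items(drive_root: Path, expected: List[str]) -> List[str]:
    """Return the expected files or folders that are not in the Drive root."""
    return [name for name in expected if not (drive_root / name).exists()]


if __name__ == "__main__":
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        print("Not running inside Google Colab; skipping the mount step.")
    else:
        drive.mount("/content/drive")
        # Add the name of your config file to this list as well.
        print("Missing from Drive root:", missing_items(DRIVE_ROOT, ["preparedFinetuneData"]))
```

<p>An empty list means the <em>preparedFinetuneData</em> folder was found in the root of your Drive.</p>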
<h2 id="heading-how-to-change-parameters">How to Change Parameters</h2>
<p>Great! Now you can run the cells in the notebook, and you should receive a fine-tuned model. Remember you can also change the Config parameters to, for example, train for longer, use more workers, and so on.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/09/img6.png" alt="Image" width="600" height="400" loading="lazy">
<em>Example of Config parameters you can change.</em></p>
<p>Note that I am working with the Donut model fine-tuned on the <a target="_blank" href="https://github.com/clovaai/cord">CORD dataset</a>, as I want to be able to read receipts. You can also find other Donut models <a target="_blank" href="https://github.com/clovaai/donut">here</a>, with the other options being document parsing, document classification, or document visual question answering (DocVQA).</p>
<h2 id="heading-fine-tuning-locally">Fine-tuning Locally</h2>
<p>Fine-tuning can also be run locally, which will be mostly relevant for you if you have a GPU, as CPU training will take a long time. </p>
<p>To run locally you have to:</p>
<ol>
<li>First, clone <a target="_blank" href="https://github.com/clovaai/donut">this GitHub repository</a></li>
<li>Add the prepared fine-tuning dataset to the root folder.</li>
<li>If you want to save the fine-tuned model, add the line below to train.py line 164, right below <em>trainer.fit(…)</em></li>
</ol>
<pre><code class="lang-py"><span class="hljs-comment">#...</span>
trainer.save_checkpoint(<span class="hljs-string">f"<span class="hljs-subst">{Path(config.result_path)}</span>/<span class="hljs-subst">{config.exp_name}</span>/<span class="hljs-subst">{config.exp_version}</span>/model_checkpoint.ckpt"</span>)
<span class="hljs-comment">#...</span>
</code></pre>
<ol start="4">
<li>If you are training on a CPU, comment out the GPU arguments in the PyTorch Lightning Trainer and add the argument <em>accelerator="cpu"</em>:</li>
</ol>
<pre><code class="lang-py"><span class="hljs-comment">#train.py file</span>
<span class="hljs-comment">#... </span>
trainer = pl.Trainer(
        <span class="hljs-comment">#Comment out the four lines below</span>
        <span class="hljs-comment"># num_nodes=config.get("num_nodes", 1),</span>
        <span class="hljs-comment"># devices=torch.cuda.device_count(),</span>
        <span class="hljs-comment"># strategy="dp",</span>
        <span class="hljs-comment"># accelerator="gpu",</span>
        accelerator=<span class="hljs-string">"cpu"</span>, <span class="hljs-comment">#TODO add this line</span>
        plugins=custom_ckpt,
        max_epochs=config.max_epochs,
        max_steps=config.max_steps,
        val_check_interval=config.val_check_interval,
        check_val_every_n_epoch=config.check_val_every_n_epoch,
        gradient_clip_val=config.gradient_clip_val,
        precision=<span class="hljs-number">16</span>,
        num_sanity_val_steps=<span class="hljs-number">0</span>,
        logger=logger,
        callbacks=[lr_callback, checkpoint_callback, bar],
    )
<span class="hljs-comment">#...</span>
</code></pre>
<ol start="5">
<li><p>Make sure the <em>max_epochs</em> parameter in your Config file is set to -1 (otherwise you will get a division by 0 error). You can control the training time by setting the <em>max_steps</em> parameter.</p>
</li>
<li><p>You can then run fine-tuning with the following command in the terminal:</p>
</li>
</ol>
<pre><code class="lang-bash">python train.py --config config/train_cord.yaml
</code></pre>
<p>Here, <em>train_cord.yaml</em> is the Configuration file you want to use.</p>
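<p>Instead of commenting lines in and out by hand in step 4, the device arguments could also be chosen in code. The sketch below is one possible way to do that. <em>trainer_device_kwargs</em> is a hypothetical helper, not something the Donut repository provides, and the GPU values simply mirror the arguments shown in the Trainer call above.</p>

```python
from typing import Any, Dict


def trainer_device_kwargs(gpu_count: int, num_nodes: int = 1) -> Dict[str, Any]:
    """Pick the device-related Trainer arguments from the number of GPUs."""
    if gpu_count > 0:
        # Same GPU arguments as in the original train.py Trainer call.
        return {
            "num_nodes": num_nodes,
            "devices": gpu_count,
            "strategy": "dp",
            "accelerator": "gpu",
        }
    # Without a GPU, only the accelerator argument is needed.
    return {"accelerator": "cpu"}


print(trainer_device_kwargs(0))
print(trainer_device_kwargs(2))
```

<p>In <em>train.py</em>, you would then pass the result to the Trainer with <em>pl.Trainer(**trainer_device_kwargs(torch.cuda.device_count()), ...)</em> and keep the remaining arguments unchanged.</p>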
<h3 id="heading-running-on-cpu">Running on CPU</h3>
<p>If you are running on CPU after all, you will encounter some problems unless you make some changes:</p>
<ol>
<li>In donut/train.py, change the <em>accelerator</em> parameter to "cpu" (from "gpu"), and remove the parameters <em>num_nodes</em>, <em>devices</em>, and <em>strategy</em>.</li>
<li>Then, in your Config file (for example <em>train_cord.yaml</em>), set <em>max_epochs</em> to -1 and specify the <em>max_steps</em> parameter. This is because you will encounter a division by 0 error if <em>max_epochs</em> is larger than 0.</li>
</ol>
<p>After these changes, running on a CPU should work as well.</p>
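<p>The Config change can be expressed as a small function if you prefer to apply it in code. This is only a sketch: <em>cpu_safe_config</em> is a hypothetical helper that works on the config as a plain dictionary, and the starting values in the example are illustrative rather than taken from a real config file.</p>

```python
from typing import Any, Dict


def cpu_safe_config(config: Dict[str, Any], max_steps: int) -> Dict[str, Any]:
    """Return a copy of the config that trains by step count, not by epochs."""
    if not max_steps > 0:
        raise ValueError("max_steps must be positive when max_epochs is -1")
    updated = dict(config)  # leave the original config untouched
    updated["max_epochs"] = -1  # avoids the division by 0 error described above
    updated["max_steps"] = max_steps  # training length is now set here
    return updated


# Illustrative starting values, not copied from a real config file.
base = {"max_epochs": 30, "max_steps": -1, "num_workers": 8}
print(cpu_safe_config(base, 1000))
```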
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this article, I have shown you how to fine-tune the Donut model using your own data, which will hopefully improve the model's accuracy on your task.</p>
<p>The Donut model has many applications, and this is just one way to use it. I hope you find it useful.</p>
<p>If you are interested and want to learn more about similar topics, you can find me on:</p>
<ul>
<li><a target="_blank" href="https://medium.com/@oieivind">✅ Medium</a></li>
<li><a target="_blank" href="https://twitter.com/Ravenspike21">✅ Twitter</a></li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
