<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ pytorch - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ pytorch - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Tue, 09 Jun 2026 04:38:28 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/pytorch/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ Building NMT from Scratch – PyTorch Replications of 7 Landmark Papers ]]>
                </title>
                <description>
                    <![CDATA[ Learn about the complete neural machine translation journey. We just posted a course on the freeCodeCamp.org YouTube channel that is a comprehensive journey through the evolution of sequence models and neural machine translation (NMT). It blends hist... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/building-nmt-from-scratch-pytorch-replications-of-7-landmark-papers/</link>
                <guid isPermaLink="false">6939907d149667a2e0b357dd</guid>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Wed, 10 Dec 2025 15:23:41 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1765380195051/cafca462-96d6-49a4-b182-24d7a4f438f9.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Learn about the complete neural machine translation journey.</p>
<p>We just posted a course on the freeCodeCamp.org YouTube channel that is a comprehensive journey through the evolution of sequence models and neural machine translation (NMT). It blends historical breakthroughs, architectural innovations, mathematical insights, and hands-on PyTorch replications of landmark papers that shaped modern NLP and AI.</p>
<p>The course features:</p>
<ul>
<li><p>A detailed narrative tracing the history and breakthroughs of RNNs, LSTMs, GRUs, Seq2Seq, Attention, GNMT, and Multilingual NMT.</p>
</li>
<li><p>Replications of 7 landmark NMT papers in PyTorch, so learners can code along and rebuild history step by step.</p>
</li>
<li><p>Explanations of the math behind RNNs, LSTMs, GRUs, and Transformers.</p>
</li>
<li><p>Conceptual clarity with architectural comparisons, visual explanations, and interactive demos like the Transformer Playground.</p>
</li>
</ul>
<p>Here are all the sections in the course:</p>
<ul>
<li><p>Evolution of RNN</p>
</li>
<li><p>Evolution of Machine Translation</p>
</li>
<li><p>Machine Translation Techniques</p>
</li>
<li><p>Long Short-Term Memory (Overview)</p>
</li>
<li><p>Learning Phrase Representation using RNN (Encoder–Decoder for SMT)</p>
</li>
<li><p>Learning Phrase Representation (PyTorch Lab – Replicating Cho et al., 2014)</p>
</li>
<li><p>Seq2Seq Learning with Neural Networks</p>
</li>
<li><p>Seq2Seq (PyTorch Lab – Replicating Sutskever et al., 2014)</p>
</li>
<li><p>NMT by Jointly Learning to Align (Bahdanau et al., 2015)</p>
</li>
<li><p>NMT by Jointly Learning to Align &amp; Translate (PyTorch Lab – Replicating Bahdanau et al., 2015)</p>
</li>
<li><p>On Using Very Large Target Vocabulary</p>
</li>
<li><p>Large Vocabulary NMT (PyTorch Lab – Replicating Jean et al., 2015)</p>
</li>
<li><p>Effective Approaches to Attention (Luong et al., 2015)</p>
</li>
<li><p>Attention Approaches (PyTorch Lab – Replicating Luong et al., 2015)</p>
</li>
<li><p>Long Short-Term Memory Network (Deep Explanation)</p>
</li>
<li><p>Attention Is All You Need (Vaswani et al., 2017)</p>
</li>
<li><p>Google Neural Machine Translation System (GNMT – Wu et al., 2016)</p>
</li>
<li><p>GNMT (PyTorch Lab – Replicating Wu et al., 2016)</p>
</li>
<li><p>Google’s Multilingual NMT (Johnson et al., 2017)</p>
</li>
<li><p>Multilingual NMT (PyTorch Lab – Replicating Johnson et al., 2017)</p>
</li>
<li><p>Transformer vs GPT vs BERT Architectures</p>
</li>
<li><p>Transformer Playground (Tool Demo)</p>
</li>
<li><p>Seq2Seq Idea from Google Translate Tool</p>
</li>
<li><p>RNN, LSTM, GRU Architectures (Comparisons)</p>
</li>
<li><p>LSTM &amp; GRU Equations</p>
</li>
</ul>
<p>Watch the full course on <a target="_blank" href="https://youtu.be/kRv2ElPNAdY">the freeCodeCamp.org YouTube channel</a> (7-hour watch).</p>
<div class="embed-wrapper">
        <iframe width="560" height="315" src="https://www.youtube.com/embed/kRv2ElPNAdY" style="aspect-ratio: 16 / 9; width: 100%; height: auto;" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" loading="lazy"></iframe></div>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Use Transformers for Real-Time Gesture Recognition ]]>
                </title>
                <description>
                    <![CDATA[ Gesture and sign recognition is a growing field in computer vision, powering accessibility tools and natural user interfaces. Most beginner projects rely on hand landmarks or small CNNs, but these often miss the bigger picture because gestures are no... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/using-transformers-for-real-time-gesture-recognition/</link>
                <guid isPermaLink="false">68e3c692aa82abf4b593114c</guid>
                
                    <category>
                        <![CDATA[ Computer Vision ]]>
                    </category>
                
                    <category>
                        <![CDATA[ transformers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ ONNX ]]>
                    </category>
                
                    <category>
                        <![CDATA[ gradio ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Machine Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Deep Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Gesture Recognition ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Accessibility ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Tutorial ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ OMOTAYO OMOYEMI ]]>
                </dc:creator>
                <pubDate>Mon, 06 Oct 2025 13:39:30 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1759757931295/5f19fd4e-93c0-4bd7-a75c-a7858e061ecd.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Gesture and sign recognition is a growing field in computer vision, powering accessibility tools and natural user interfaces. Most beginner projects rely on hand landmarks or small CNNs, but these often miss the bigger picture because gestures are not static images. Rather, they unfold over time. To build more robust, real-time systems, we need models that can capture both spatial details and temporal context.</p>
<p>This is where Transformers come in. Originally built for language, they’ve become state-of-the-art in vision tasks thanks to models like the Vision Transformer (ViT) and video-focused variants such as TimeSformer.</p>
<p>In this tutorial, we’ll use a Transformer backbone to create a lightweight real-time gesture recognition tool, optimized for small datasets and deployable on a regular laptop webcam.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a class="post-section-overview" href="#heading-why-transformers-for-gestures">Why Transformers for Gestures?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-youll-learn">What You’ll Learn</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-project-setup">Project Setup</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-generate-a-gesture-dataset">Generate a Gesture Dataset</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-option-1-generate-a-synthetic-dataset">Option 1: Generate a Synthetic Dataset</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-training-script-trainpy">Training Script: <code>train.py</code></a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-export-the-model-to-onnx">Export the Model to ONNX</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-evaluate-accuracy-latency">Evaluate Accuracy + Latency</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-option-2-use-small-samples-from-public-gesture-datasets">Option 2: Use Small Samples from Public Gesture Datasets</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-accessibility-notes-amp-ethical-limits">Accessibility Notes &amp; Ethical Limits</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-next-steps">Next Steps</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-why-transformers-for-gestures">Why Transformers for Gestures?</h2>
<p>Transformers are powerful because they use self-attention to model relationships across a sequence. For gestures, this means the model doesn’t just see isolated frames, but also learns how movements evolve over time. A wave, for example, looks different from a raised hand only when viewed as a sequence.</p>
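<p>To make the idea of self-attention across frames concrete, here is a minimal sketch in plain Python. It is a simplification under stated assumptions: a single head, no learned query/key/value projections, and made-up 2-D frame features. The names <code>self_attention</code> and <code>toy_frames</code> are illustrative and are not part of the tutorial's code.</p>

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(frames):
    """Single-head scaled dot-product self-attention without learned weights.

    frames: list of T feature vectors (each a list of D floats).
    Queries, keys, and values are all the raw frame vectors, which is an
    illustrative simplification; real Transformers learn separate projections.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in frames:
        # Similarity of this frame to every frame in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in frames]
        attn = softmax(scores)
        weights.append(attn)
        # Each output is a weighted average of all frames.
        outputs.append(
            [sum(a * frame[d] for a, frame in zip(attn, frames)) for d in range(dim)]
        )
    return outputs, weights

# Hypothetical 2-D features for three frames of a clip.
toy_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed, attn_weights = self_attention(toy_frames)
for row in attn_weights:
    print([round(w, 3) for w in row])
```

<p>Each row of <code>attn_weights</code> sums to 1 and describes how much one frame draws on every frame in the clip, which is the mechanism that lets a Transformer relate movements across time.</p>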
<p>Vision Transformers process images as patches, while video Transformers extend this to multiple frames with temporal attention. Even a simple approach, like applying ViT to each frame and pooling across time, can outperform traditional CNN-based methods for small datasets.</p>
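<p>A quick back-of-the-envelope calculation shows what "images as patches" means in numbers. The sketch below assumes 224-pixel square frames and 16-pixel patches (matching the <code>vit_tiny_patch16_224</code> model name used later in this tutorial) and an assumed clip length of 24 frames. The helper <code>vit_token_counts</code> is hypothetical, and the count ignores the extra class token that ViT models add.</p>

```python
def vit_token_counts(image_size=224, patch_size=16, channels=3, frames=24):
    """Return patch bookkeeping for applying a ViT to every frame of a clip."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    patches_per_frame = per_side * per_side
    # Raw pixel values that get flattened into one patch before embedding.
    values_per_patch = patch_size * patch_size * channels
    return {
        "patches_per_frame": patches_per_frame,
        "values_per_patch": values_per_patch,
        "patches_per_clip": patches_per_frame * frames,
    }

counts = vit_token_counts()
print(counts)
```

<p>Because the frame-wise approach runs the ViT once per frame, the cost grows linearly with the number of frames you sample from each clip.</p>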
<p>Combined with Hugging Face’s pre-trained models and ONNX Runtime for optimization, Transformers make it possible to train on a modest dataset and still achieve smooth real-time recognition.</p>
<h2 id="heading-what-youll-learn">What You’ll Learn</h2>
<p>In this tutorial, you’ll build a gesture recognition system using Transformers. By the end, you’ll know how to:</p>
<ul>
<li><p>Create (or record) a tiny gesture dataset</p>
</li>
<li><p>Train a Vision Transformer (ViT) with temporal pooling</p>
</li>
<li><p>Export the model to ONNX for faster inference</p>
</li>
<li><p>Build a real-time Gradio app that classifies gestures from your webcam</p>
</li>
<li><p>Evaluate your model’s accuracy and latency with simple scripts</p>
</li>
<li><p>Understand the accessibility potential and ethical limits of gesture recognition</p>
</li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>To follow along, you should have:</p>
<ul>
<li><p>Basic Python knowledge (functions, scripts, virtual environments)</p>
</li>
<li><p>Familiarity with PyTorch (tensors, datasets, training loops) – helpful but not required</p>
</li>
<li><p>Python 3.8+ installed on your system</p>
</li>
<li><p>A webcam (for the live demo in Gradio)</p>
</li>
<li><p>Optionally: GPU access (training on CPU works, but is slower)</p>
</li>
</ul>
<h2 id="heading-project-setup">Project Setup</h2>
<p>Create a new project folder and install the required libraries.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Create a new project directory and navigate into it</span>
mkdir transformer-gesture &amp;&amp; <span class="hljs-built_in">cd</span> transformer-gesture

<span class="hljs-comment"># Set up a Python virtual environment</span>
python -m venv .venv

<span class="hljs-comment"># Activate the virtual environment</span>
<span class="hljs-comment"># Windows PowerShell</span>
.venv\Scripts\Activate.ps1

<span class="hljs-comment"># macOS/Linux</span>
<span class="hljs-built_in">source</span> .venv/bin/activate
</code></pre>
<p>These commands set up a new Python project with its own virtual environment. Here's a breakdown of each part:</p>
<ol>
<li><p><code>mkdir transformer-gesture &amp;&amp; cd transformer-gesture</code>: This command creates a new directory named "transformer-gesture" and then navigates into it.</p>
</li>
<li><p><code>python -m venv .venv</code>: This command creates a new virtual environment in the current directory. The virtual environment is stored in a folder named ".venv".</p>
</li>
<li><p>Activating the virtual environment (run only the command that matches your operating system):</p>
<ul>
<li><p>For Windows PowerShell, you can use <code>.venv\Scripts\Activate.ps1</code> to activate the virtual environment.</p>
</li>
<li><p>For macOS/Linux, use <code>source .venv/bin/activate</code> to activate the virtual environment.</p>
</li>
</ul>
</li>
</ol>
<p>Activating a virtual environment ensures that the Python interpreter and any packages you install are isolated to this specific project, preventing conflicts with other projects or system-wide packages.</p>
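<p>If you are not sure whether activation worked, you can ask Python itself. This small check relies on the standard library only: inside a virtual environment, <code>sys.prefix</code> differs from <code>sys.base_prefix</code>. The function name <code>in_virtual_env</code> is just an illustrative choice.</p>

```python
import sys

def in_virtual_env():
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix

if __name__ == "__main__":
    print("Virtual environment active:", in_virtual_env())
    print("Interpreter:", sys.executable)
```

<p>When the environment is active, the interpreter path printed on the second line should sit inside the <code>.venv</code> folder.</p>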
<p>Create a <code>requirements.txt</code> file:</p>
<pre><code class="lang-plaintext">torch&gt;=2.0
torchvision
torchaudio
timm
huggingface_hub

onnx
onnxruntime

gradio

numpy
opencv-python
pillow

matplotlib
seaborn
scikit-learn
</code></pre>
<p>This <code>requirements.txt</code> file lists the project's dependencies. Here's a brief explanation of each package:</p>
<ol>
<li><p><strong>torch&gt;=2.0</strong>: PyTorch is a popular open-source deep learning framework for building and training neural networks. The <code>&gt;=2.0</code> constraint requires version 2.0 or later, the release line that introduced <code>torch.compile</code>.</p>
</li>
<li><p><strong>torchvision</strong>: This library is part of the PyTorch ecosystem and provides tools for computer vision tasks, including datasets, model architectures, and image transformations.</p>
</li>
<li><p><strong>torchaudio</strong>: Also part of the PyTorch ecosystem, Torchaudio provides audio processing tools and datasets, making it easier to work with audio data in deep learning projects.</p>
</li>
<li><p><strong>timm</strong>: The PyTorch Image Models (timm) library offers a collection of pre-trained models and utilities for computer vision tasks, facilitating quick experimentation and deployment.</p>
</li>
<li><p><strong>huggingface_hub</strong>: This library allows easy access to models and datasets hosted on the Hugging Face Hub, a platform for sharing and collaborating on machine learning models and datasets.</p>
</li>
<li><p><strong>onnx</strong>: The Open Neural Network Exchange (ONNX) format is used to represent machine learning models, enabling interoperability between different frameworks.</p>
</li>
<li><p><strong>onnxruntime</strong>: This is a high-performance runtime for executing ONNX models, allowing for efficient deployment across various platforms.</p>
</li>
<li><p><strong>gradio</strong>: Gradio is a library for creating user interfaces for machine learning models, making them accessible through a web interface for easy interaction and testing.</p>
</li>
<li><p><strong>numpy</strong>: A fundamental package for numerical computing in Python, providing support for arrays and a wide range of mathematical functions.</p>
</li>
<li><p><strong>opencv-python</strong>: OpenCV is a library for computer vision and image processing tasks, widely used for real-time applications.</p>
</li>
<li><p><strong>pillow</strong>: A Python Imaging Library (PIL) fork, Pillow provides tools for opening, manipulating, and saving many different image file formats.</p>
</li>
<li><p><strong>matplotlib</strong>: A plotting library for creating static, interactive, and animated visualizations in Python.</p>
</li>
<li><p><strong>seaborn</strong>: Built on top of Matplotlib, Seaborn provides a high-level interface for drawing attractive and informative statistical graphics.</p>
</li>
<li><p><strong>scikit-learn</strong>: A machine learning library in Python that provides simple and efficient tools for data analysis and modeling, including classification, regression, clustering, and dimensionality reduction.</p>
</li>
</ol>
<p>Install dependencies:</p>
<pre><code class="lang-bash">pip install -r requirements.txt
</code></pre>
<p>The command <code>pip install -r requirements.txt</code> installs every package listed in <code>requirements.txt</code>. Each line of that file names a package and, optionally, a version constraint such as <code>torch&gt;=2.0</code>.</p>
<p><code>pip</code>, the Python package installer, reads the file and installs each entry so the project has the dependencies it needs. Keeping them in one file is a common way to manage and share a project's dependencies.</p>
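<p>After installing, it can help to confirm that everything is importable. The sketch below uses only the standard library. Note that a few install names differ from their import names (<code>opencv-python</code> is imported as <code>cv2</code>, <code>pillow</code> as <code>PIL</code>, and <code>scikit-learn</code> as <code>sklearn</code>). The helper <code>missing_packages</code> and the <code>IMPORT_NAMES</code> table are illustrative, not part of the tutorial's scripts.</p>

```python
import importlib.util

# Install name -> import name. The two differ for a few packages.
IMPORT_NAMES = {
    "torch": "torch",
    "torchvision": "torchvision",
    "torchaudio": "torchaudio",
    "timm": "timm",
    "huggingface_hub": "huggingface_hub",
    "onnx": "onnx",
    "onnxruntime": "onnxruntime",
    "gradio": "gradio",
    "numpy": "numpy",
    "opencv-python": "cv2",
    "pillow": "PIL",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}

def missing_packages(import_names=IMPORT_NAMES):
    """Return the install names whose top-level module cannot be found."""
    return [
        package
        for package, module in import_names.items()
        if importlib.util.find_spec(module) is None
    ]

if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```

<p>If the script reports missing packages, check that the virtual environment is active and run the install command again.</p>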
<h2 id="heading-generate-a-gesture-dataset">Generate a Gesture Dataset</h2>
<p>To train our Transformer-based gesture recognizer, we need some data. Instead of downloading a huge dataset, we’ll start with a tiny synthetic dataset you can generate in seconds. This makes the tutorial lightweight and ensures that everyone can follow along without dealing with multi-gigabyte downloads.</p>
<h2 id="heading-option-1-generate-a-synthetic-dataset">Option 1: Generate a Synthetic Dataset</h2>
<p>We’ll use a small Python script that creates short <code>.mp4</code> clips of a moving (or still) coloured box. Each class represents a gesture:</p>
<ul>
<li><p><strong>swipe_left</strong> – box moves from right to left</p>
</li>
<li><p><strong>swipe_right</strong> – box moves from left to right</p>
</li>
<li><p><strong>stop</strong> – box stays still in the center</p>
</li>
</ul>
<p>Save this script as <code>generate_synthetic_gestures.py</code> in your project root:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os, cv2, numpy <span class="hljs-keyword">as</span> np, random, argparse

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">ensure_dir</span>(<span class="hljs-params">p</span>):</span> os.makedirs(p, exist_ok=<span class="hljs-literal">True</span>)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">make_clip</span>(<span class="hljs-params">mode, out_path, seconds=<span class="hljs-number">1.5</span>, fps=<span class="hljs-number">16</span>, size=<span class="hljs-number">224</span>, box_size=<span class="hljs-number">60</span>, seed=<span class="hljs-number">0</span>, codec=<span class="hljs-string">"mp4v"</span></span>):</span>
    rng = random.Random(seed)
    frames = int(seconds * fps)
    H = W = size

    <span class="hljs-comment"># background + box color</span>
    bg_val = rng.randint(<span class="hljs-number">160</span>, <span class="hljs-number">220</span>)
    bg = np.full((H, W, <span class="hljs-number">3</span>), bg_val, dtype=np.uint8)
    color = (rng.randint(<span class="hljs-number">20</span>, <span class="hljs-number">80</span>), rng.randint(<span class="hljs-number">20</span>, <span class="hljs-number">80</span>), rng.randint(<span class="hljs-number">20</span>, <span class="hljs-number">80</span>))

    <span class="hljs-comment"># path of motion</span>
    y = rng.randint(<span class="hljs-number">40</span>, H - <span class="hljs-number">40</span> - box_size)
    <span class="hljs-keyword">if</span> mode == <span class="hljs-string">"swipe_left"</span>:
        x_start, x_end = W - <span class="hljs-number">20</span> - box_size, <span class="hljs-number">20</span>
    <span class="hljs-keyword">elif</span> mode == <span class="hljs-string">"swipe_right"</span>:
        x_start, x_end = <span class="hljs-number">20</span>, W - <span class="hljs-number">20</span> - box_size
    <span class="hljs-keyword">elif</span> mode == <span class="hljs-string">"stop"</span>:
        x_start = x_end = (W - box_size) // <span class="hljs-number">2</span>
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Unknown mode: <span class="hljs-subst">{mode}</span>"</span>)

    fourcc = cv2.VideoWriter_fourcc(*codec)
    vw = cv2.VideoWriter(out_path, fourcc, fps, (W, H))
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> vw.isOpened():
        <span class="hljs-keyword">raise</span> RuntimeError(
            <span class="hljs-string">f"Could not open VideoWriter with codec '<span class="hljs-subst">{codec}</span>'. "</span>
            <span class="hljs-string">"Try --codec XVID and use .avi extension, e.g. out.avi"</span>
        )

    <span class="hljs-keyword">for</span> t <span class="hljs-keyword">in</span> range(frames):
        alpha = t / max(<span class="hljs-number">1</span>, frames - <span class="hljs-number">1</span>)
        x = int((<span class="hljs-number">1</span> - alpha) * x_start + alpha * x_end)
        <span class="hljs-comment"># small jitter to avoid being too synthetic</span>
        jitter_x, jitter_y = rng.randint(<span class="hljs-number">-2</span>, <span class="hljs-number">2</span>), rng.randint(<span class="hljs-number">-2</span>, <span class="hljs-number">2</span>)
        frame = bg.copy()
        cv2.rectangle(frame, (x + jitter_x, y + jitter_y),
                      (x + jitter_x + box_size, y + jitter_y + box_size),
                      color, thickness=<span class="hljs-number">-1</span>)
        <span class="hljs-comment"># overlay text</span>
        cv2.putText(frame, mode, (<span class="hljs-number">8</span>, <span class="hljs-number">24</span>), cv2.FONT_HERSHEY_SIMPLEX, <span class="hljs-number">0.7</span>, (<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>), <span class="hljs-number">2</span>, cv2.LINE_AA)
        cv2.putText(frame, mode, (<span class="hljs-number">8</span>, <span class="hljs-number">24</span>), cv2.FONT_HERSHEY_SIMPLEX, <span class="hljs-number">0.7</span>, (<span class="hljs-number">255</span>, <span class="hljs-number">255</span>, <span class="hljs-number">255</span>), <span class="hljs-number">1</span>, cv2.LINE_AA)
        vw.write(frame)

    vw.release()

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">write_labels</span>(<span class="hljs-params">labels, out_dir</span>):</span>
    <span class="hljs-keyword">with</span> open(os.path.join(out_dir, <span class="hljs-string">"labels.txt"</span>), <span class="hljs-string">"w"</span>, encoding=<span class="hljs-string">"utf-8"</span>) <span class="hljs-keyword">as</span> f:
        <span class="hljs-keyword">for</span> c <span class="hljs-keyword">in</span> labels:
            f.write(c + <span class="hljs-string">"\n"</span>)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">main</span>():</span>
    ap = argparse.ArgumentParser(description=<span class="hljs-string">"Generate a tiny synthetic gesture dataset."</span>)
    ap.add_argument(<span class="hljs-string">"--out"</span>, default=<span class="hljs-string">"data"</span>, help=<span class="hljs-string">"Output directory (default: data)"</span>)
    ap.add_argument(<span class="hljs-string">"--classes"</span>, nargs=<span class="hljs-string">"+"</span>,
                    default=[<span class="hljs-string">"swipe_left"</span>, <span class="hljs-string">"swipe_right"</span>, <span class="hljs-string">"stop"</span>],
                    help=<span class="hljs-string">"Class names (default: swipe_left swipe_right stop)"</span>)
    ap.add_argument(<span class="hljs-string">"--clips"</span>, type=int, default=<span class="hljs-number">16</span>, help=<span class="hljs-string">"Clips per class (default: 16)"</span>)
    ap.add_argument(<span class="hljs-string">"--seconds"</span>, type=float, default=<span class="hljs-number">1.5</span>, help=<span class="hljs-string">"Seconds per clip (default: 1.5)"</span>)
    ap.add_argument(<span class="hljs-string">"--fps"</span>, type=int, default=<span class="hljs-number">16</span>, help=<span class="hljs-string">"Frames per second (default: 16)"</span>)
    ap.add_argument(<span class="hljs-string">"--size"</span>, type=int, default=<span class="hljs-number">224</span>, help=<span class="hljs-string">"Frame size WxH (default: 224)"</span>)
    ap.add_argument(<span class="hljs-string">"--box"</span>, type=int, default=<span class="hljs-number">60</span>, help=<span class="hljs-string">"Box size (default: 60)"</span>)
    ap.add_argument(<span class="hljs-string">"--codec"</span>, default=<span class="hljs-string">"mp4v"</span>, help=<span class="hljs-string">"Codec fourcc (mp4v or XVID)"</span>)
    ap.add_argument(<span class="hljs-string">"--ext"</span>, default=<span class="hljs-string">".mp4"</span>, help=<span class="hljs-string">"File extension (.mp4 or .avi)"</span>)
    args = ap.parse_args()

    ensure_dir(args.out)
    write_labels(args.classes, <span class="hljs-string">"."</span>)  <span class="hljs-comment"># writes labels.txt to project root</span>

    print(<span class="hljs-string">f"Generating synthetic dataset -&gt; <span class="hljs-subst">{args.out}</span>"</span>)
    <span class="hljs-keyword">for</span> cls <span class="hljs-keyword">in</span> args.classes:
        cls_dir = os.path.join(args.out, cls)
        ensure_dir(cls_dir)
        mode = <span class="hljs-string">"stop"</span> <span class="hljs-keyword">if</span> cls == <span class="hljs-string">"stop"</span> <span class="hljs-keyword">else</span> (<span class="hljs-string">"swipe_left"</span> <span class="hljs-keyword">if</span> <span class="hljs-string">"left"</span> <span class="hljs-keyword">in</span> cls <span class="hljs-keyword">else</span> (<span class="hljs-string">"swipe_right"</span> <span class="hljs-keyword">if</span> <span class="hljs-string">"right"</span> <span class="hljs-keyword">in</span> cls <span class="hljs-keyword">else</span> <span class="hljs-string">"stop"</span>))
        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(args.clips):
            filename = os.path.join(cls_dir, <span class="hljs-string">f"<span class="hljs-subst">{cls}</span>_<span class="hljs-subst">{i+<span class="hljs-number">1</span>:<span class="hljs-number">03</span>d}</span><span class="hljs-subst">{args.ext}</span>"</span>)
            make_clip(
                mode=mode,
                out_path=filename,
                seconds=args.seconds,
                fps=args.fps,
                size=args.size,
                box_size=args.box,
                seed=i + <span class="hljs-number">1</span>,
                codec=args.codec
            )
        print(<span class="hljs-string">f"  <span class="hljs-subst">{cls}</span>: <span class="hljs-subst">{args.clips}</span> clips"</span>)

    print(<span class="hljs-string">"Done. You can now run: python train.py, python export_onnx.py, python app.py"</span>)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    main()
</code></pre>
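<p>The script generates a synthetic gesture dataset. For each class it writes short video clips of a coloured box that moves linearly across the frame (or stays horizontally centered for "stop"), with a couple of pixels of random jitter and the class name drawn in the top-left corner. The clips go into one subfolder per class inside the output directory, and <code>labels.txt</code> is written to the directory you run the script from.</p>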
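<p>The motion itself is a linear interpolation between a start and an end position. The sketch below isolates that step, reusing the defaults and the 20-pixel margin from <code>make_clip</code> but leaving out the jitter and the video writing. The function name <code>box_x_positions</code> is hypothetical and exists only for this illustration.</p>

```python
def box_x_positions(mode, seconds=1.5, fps=16, size=224, box_size=60):
    """Return the jitter-free x coordinate of the box for every frame."""
    frames = int(seconds * fps)
    if mode == "swipe_left":
        x_start, x_end = size - 20 - box_size, 20
    elif mode == "swipe_right":
        x_start, x_end = 20, size - 20 - box_size
    elif mode == "stop":
        x_start = x_end = (size - box_size) // 2
    else:
        raise ValueError(f"Unknown mode: {mode}")

    positions = []
    for t in range(frames):
        # alpha runs from 0.0 on the first frame to 1.0 on the last one.
        alpha = t / max(1, frames - 1)
        positions.append(int((1 - alpha) * x_start + alpha * x_end))
    return positions

for mode in ("swipe_left", "swipe_right", "stop"):
    xs = box_x_positions(mode)
    print(mode, "frames:", len(xs), "first x:", xs[0], "last x:", xs[-1])
```

<p>With the default settings each clip has 24 frames, so the box covers the distance between the two margins in evenly sized steps.</p>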
<p>Now run it inside your virtual environment:</p>
<pre><code class="lang-bash">python generate_synthetic_gestures.py --out data --clips 16 --seconds 1.5
</code></pre>
<p>This command runs <code>generate_synthetic_gestures.py</code> to generate 16 clips per gesture, each 1.5 seconds long, and saves them in a directory named "data".</p>
<p>This creates a dataset like:</p>
<pre><code class="lang-plaintext">data/
  swipe_left/*.mp4
  swipe_right/*.mp4
  stop/*.mp4
labels.txt
</code></pre>
<p>Each folder contains short clips of a moving (or still) box that simulate gestures. This is enough to test the pipeline end to end before you move on to real footage.</p>
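<p>Before training, you may want to confirm that the folders and <code>labels.txt</code> look the way you expect. This is a minimal sketch using only <code>pathlib</code>. The helper names <code>count_clips</code> and <code>read_label_file</code> are assumptions made for this example and are separate from the tutorial's own dataset code.</p>

```python
from pathlib import Path

def count_clips(data_dir="data", extensions=(".mp4", ".avi")):
    """Map each class folder under data_dir to its number of video clips."""
    root = Path(data_dir)
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        clips = [f for f in class_dir.iterdir() if f.suffix.lower() in extensions]
        counts[class_dir.name] = len(clips)
    return counts

def read_label_file(path="labels.txt"):
    """Return the non-empty lines of a labels file as a list of class names."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]

if __name__ == "__main__":
    if Path("data").is_dir():
        print(count_clips("data"))
    if Path("labels.txt").is_file():
        print(read_label_file("labels.txt"))
```

<p>With the command shown earlier, you would expect 16 clips in each of the three class folders and three class names in <code>labels.txt</code>.</p>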
<h3 id="heading-training-script-trainpy">Training Script: <code>train.py</code></h3>
<p>Now that we have our dataset, let’s fine-tune a Vision Transformer with temporal pooling. This model applies ViT frame-by-frame, averages embeddings across time, and trains a classification head on your gestures.</p>
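<p>To see what "averages embeddings across time" does, here is a tiny plain-Python sketch of temporal mean pooling. The 3-D embeddings are made up for illustration (a real ViT produces much larger vectors), and <code>mean_pool_over_time</code> is a hypothetical helper, not the tutorial's model code.</p>

```python
def mean_pool_over_time(frame_features):
    """Average T per-frame feature vectors into one clip-level vector."""
    num_frames = len(frame_features)
    dim = len(frame_features[0])
    return [sum(frame[d] for frame in frame_features) / num_frames for d in range(dim)]

# Hypothetical 3-D embeddings for a 4-frame clip.
clip = [
    [1.0, 0.0, 2.0],
    [3.0, 0.0, 2.0],
    [5.0, 4.0, 2.0],
    [7.0, 4.0, 2.0],
]
pooled = mean_pool_over_time(clip)
print(pooled)  # [4.0, 2.0, 2.0]

# Averaging ignores frame order: the reversed clip pools to the same vector.
print(mean_pool_over_time(clip[::-1]) == pooled)  # True
```

<p>Mean pooling keeps the model small and simple. One consequence worth keeping in mind is that plain averaging discards frame order, so any cue about direction has to come from the per-frame features themselves.</p>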
<p>Here’s the full training script:</p>
<pre><code class="lang-python"><span class="hljs-comment"># train.py</span>
<span class="hljs-keyword">import</span> torch, torch.nn <span class="hljs-keyword">as</span> nn, torch.optim <span class="hljs-keyword">as</span> optim
<span class="hljs-keyword">from</span> torch.utils.data <span class="hljs-keyword">import</span> DataLoader
<span class="hljs-keyword">import</span> timm
<span class="hljs-keyword">from</span> dataset <span class="hljs-keyword">import</span> GestureClips, read_labels

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ViTTemporal</span>(<span class="hljs-params">nn.Module</span>):</span>
    <span class="hljs-string">"""Frame-wise ViT encoder -&gt; mean pool over time -&gt; linear head."""</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, num_classes, vit_name=<span class="hljs-string">"vit_tiny_patch16_224"</span></span>):</span>
        super().__init__()
        self.vit = timm.create_model(vit_name, pretrained=<span class="hljs-literal">True</span>, num_classes=<span class="hljs-number">0</span>, global_pool=<span class="hljs-string">"avg"</span>)
        feat_dim = self.vit.num_features
        self.head = nn.Linear(feat_dim, num_classes)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">forward</span>(<span class="hljs-params">self, x</span>):</span>  <span class="hljs-comment"># x: (B,T,C,H,W)</span>
        B, T, C, H, W = x.shape
        x = x.view(B * T, C, H, W)
        feats = self.vit(x)                  <span class="hljs-comment"># (B*T, D)</span>
        feats = feats.view(B, T, <span class="hljs-number">-1</span>).mean(dim=<span class="hljs-number">1</span>)  <span class="hljs-comment"># (B, D)</span>
        <span class="hljs-keyword">return</span> self.head(feats)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">train</span>():</span>
    device = <span class="hljs-string">"cuda"</span> <span class="hljs-keyword">if</span> torch.cuda.is_available() <span class="hljs-keyword">else</span> <span class="hljs-string">"cpu"</span>
    labels, _ = read_labels(<span class="hljs-string">"labels.txt"</span>)
    n_classes = len(labels)

    train_ds = GestureClips(train=<span class="hljs-literal">True</span>)
    val_ds   = GestureClips(train=<span class="hljs-literal">False</span>)
    print(<span class="hljs-string">f"Train clips: <span class="hljs-subst">{len(train_ds)}</span> | Val clips: <span class="hljs-subst">{len(val_ds)}</span>"</span>)

    <span class="hljs-comment"># Windows/CPU friendly</span>
    train_dl = DataLoader(train_ds, batch_size=<span class="hljs-number">2</span>, shuffle=<span class="hljs-literal">True</span>,  num_workers=<span class="hljs-number">0</span>, pin_memory=<span class="hljs-literal">False</span>)
    val_dl   = DataLoader(val_ds,   batch_size=<span class="hljs-number">2</span>, shuffle=<span class="hljs-literal">False</span>, num_workers=<span class="hljs-number">0</span>, pin_memory=<span class="hljs-literal">False</span>)

    model = ViTTemporal(num_classes=n_classes).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.AdamW(model.parameters(), lr=<span class="hljs-number">3e-4</span>, weight_decay=<span class="hljs-number">0.05</span>)

    best_acc = <span class="hljs-number">0.0</span>
    epochs = <span class="hljs-number">5</span>
    <span class="hljs-keyword">for</span> epoch <span class="hljs-keyword">in</span> range(<span class="hljs-number">1</span>, epochs + <span class="hljs-number">1</span>):
        <span class="hljs-comment"># ---- Train ----</span>
        model.train()
        total, correct, loss_sum = <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.0</span>
        <span class="hljs-keyword">for</span> x, y <span class="hljs-keyword">in</span> train_dl:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            logits = model(x)
            loss = criterion(logits, y)
            loss.backward()
            optimizer.step()

            loss_sum += loss.item() * x.size(<span class="hljs-number">0</span>)
            correct += (logits.argmax(<span class="hljs-number">1</span>) == y).sum().item()
            total += x.size(<span class="hljs-number">0</span>)

        train_acc = correct / total <span class="hljs-keyword">if</span> total <span class="hljs-keyword">else</span> <span class="hljs-number">0.0</span>
        train_loss = loss_sum / total <span class="hljs-keyword">if</span> total <span class="hljs-keyword">else</span> <span class="hljs-number">0.0</span>

        <span class="hljs-comment"># ---- Validate ----</span>
        model.eval()
        vtotal, vcorrect = <span class="hljs-number">0</span>, <span class="hljs-number">0</span>
        <span class="hljs-keyword">with</span> torch.no_grad():
            <span class="hljs-keyword">for</span> x, y <span class="hljs-keyword">in</span> val_dl:
                x, y = x.to(device), y.to(device)
                vcorrect += (model(x).argmax(<span class="hljs-number">1</span>) == y).sum().item()
                vtotal += x.size(<span class="hljs-number">0</span>)
        val_acc = vcorrect / vtotal <span class="hljs-keyword">if</span> vtotal <span class="hljs-keyword">else</span> <span class="hljs-number">0.0</span>

        print(<span class="hljs-string">f"Epoch <span class="hljs-subst">{epoch:<span class="hljs-number">02</span>d}</span> | train_loss <span class="hljs-subst">{train_loss:<span class="hljs-number">.4</span>f}</span> "</span>
              <span class="hljs-string">f"| train_acc <span class="hljs-subst">{train_acc:<span class="hljs-number">.3</span>f}</span> | val_acc <span class="hljs-subst">{val_acc:<span class="hljs-number">.3</span>f}</span>"</span>)

        <span class="hljs-keyword">if</span> val_acc &gt; best_acc:
            best_acc = val_acc
            torch.save(model.state_dict(), <span class="hljs-string">"vit_temporal_best.pt"</span>)

    print(<span class="hljs-string">"Best val acc:"</span>, best_acc)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    train()
</code></pre>
<p>Running the command <code>python train.py</code> initiates the training process for your gesture recognition model. Here's a breakdown of what happens:</p>
<ol>
<li><p><strong>Load your dataset from data/</strong>: The script will access and load the gesture dataset stored in the "data" directory. This dataset is used to train the model.</p>
</li>
<li><p><strong>Fine-tune a pre-trained Vision Transformer</strong>: The training script will take a Vision Transformer model that has been pre-trained on a larger dataset and fine-tune it using your specific gesture dataset. Fine-tuning helps the model adapt to the nuances of your data, improving its performance on the specific task of gesture recognition.</p>
</li>
<li><p><strong>Save the best checkpoint as vit_temporal_best.pt</strong>: After every epoch, the script evaluates the model on the validation set. Whenever validation accuracy improves, it saves the model weights to a checkpoint file named "vit_temporal_best.pt". This file can later be used for inference or further training.</p>
</li>
</ol>
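<p>One detail in the training loop is easy to miss: it accumulates <code>loss.item() * x.size(0)</code> and divides by the number of clips at the end. <code>nn.CrossEntropyLoss</code> returns the batch mean by default, so weighting by batch size gives a true per-clip average even when the last batch is smaller. Here is a small sketch with made-up numbers:</p>

```python
# Made-up per-batch mean losses and batch sizes (the last batch is smaller).
batch_losses = [1.0, 2.0, 4.0]
batch_sizes = [2, 2, 1]

# What train.py does: weight each batch mean by its size, then divide by the clip count.
total_clips = sum(batch_sizes)
weighted = sum(loss * n for loss, n in zip(batch_losses, batch_sizes)) / total_clips

# Naive alternative: average the batch means, which over-weights the short batch.
naive = sum(batch_losses) / len(batch_losses)

print(f"per-clip average: {weighted:.3f}")
print(f"mean of batch means: {naive:.3f}")
```

<p>When every batch has the same size, the two numbers agree, so the difference only shows up with an uneven final batch.</p>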
<h4 id="heading-what-training-looks-like">What Training Looks Like</h4>
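<p>You should see logs similar to this (your numbers will differ, and the script prints one line per epoch, so five lines with <code>epochs = 5</code>):</p>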
<pre><code class="lang-plaintext">Train clips: 38 | Val clips: 10
Epoch 01 | train_loss 1.4508 | train_acc 0.395 | val_acc 0.200
Epoch 02 | train_loss 1.2466 | train_acc 0.263 | val_acc 0.200
Epoch 03 | train_loss 1.1361 | train_acc 0.368 | val_acc 0.200
Best val acc: 0.200
</code></pre>
<p>Don’t worry if your accuracy is low at first. That’s normal with the tiny synthetic dataset. The key is proving that the Transformer pipeline works end to end. You can boost results later by:</p>
<ul>
<li><p>Adding more clips per class</p>
</li>
<li><p>Training for more epochs</p>
</li>
<li><p>Switching to real recorded gestures</p>
</li>
</ul>
<p><img src="https://github.com/tayo4christ/transformer-gesture/blob/07c7071bdb17bc08585baeb60d787eadc3936ef5/images/training-logs.png?raw=true" alt="Training logs" width="600" height="400" loading="lazy"></p>
<p>Figure 1. Example training logs from <code>train.py</code>, where the Vision Transformer with temporal pooling is fine-tuned on a tiny synthetic dataset.</p>
<h3 id="heading-export-the-model-to-onnx">Export the Model to ONNX</h3>
<p>To make our model easier to run in real time (and lighter on CPU), we’ll export it to the ONNX format.</p>
<p><strong>Note:</strong> ONNX, which stands for Open Neural Network Exchange, is an open-source format designed to facilitate the interchange of deep learning models between different frameworks. It lets you train a model in one framework, such as PyTorch or TensorFlow, and then run it in a different runtime, such as ONNX Runtime, without rewriting the model. This interoperability is achieved by providing a standardized representation of the model's architecture and parameters.</p>
<p>ONNX supports a wide range of operators and is continually updated to include new features, making it a versatile choice for deploying machine learning models across various platforms and devices.</p>
<p>Create a file called <code>export_onnx.py</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> torch
<span class="hljs-keyword">from</span> train <span class="hljs-keyword">import</span> ViTTemporal
<span class="hljs-keyword">from</span> dataset <span class="hljs-keyword">import</span> read_labels

labels, _ = read_labels(<span class="hljs-string">"labels.txt"</span>)
n_classes = len(labels)

<span class="hljs-comment"># Load trained model</span>
model = ViTTemporal(num_classes=n_classes)
model.load_state_dict(torch.load(<span class="hljs-string">"vit_temporal_best.pt"</span>, map_location=<span class="hljs-string">"cpu"</span>))
model.eval()

<span class="hljs-comment"># Dummy input: batch=1, 16 frames, 3x224x224</span>
dummy = torch.randn(<span class="hljs-number">1</span>, <span class="hljs-number">16</span>, <span class="hljs-number">3</span>, <span class="hljs-number">224</span>, <span class="hljs-number">224</span>)

<span class="hljs-comment"># Export</span>
torch.onnx.export(
    model, dummy, <span class="hljs-string">"vit_temporal.onnx"</span>,
    input_names=[<span class="hljs-string">"video"</span>], output_names=[<span class="hljs-string">"logits"</span>],
    dynamic_axes={<span class="hljs-string">"video"</span>: {<span class="hljs-number">0</span>: <span class="hljs-string">"batch"</span>}},
    opset_version=<span class="hljs-number">13</span>
)

print(<span class="hljs-string">"Exported vit_temporal.onnx"</span>)
</code></pre>
<p>Run it with <code>python export_onnx.py</code>.</p>
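<p>This generates a file <code>vit_temporal.onnx</code> in your project folder. With the model in ONNX format, we can run it with <code>onnxruntime</code>, which is lighter to deploy than a full PyTorch install and often faster for CPU inference.</p>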
<p>Create a file called <code>app.py</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os, tempfile, cv2, torch, onnxruntime, numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> gradio <span class="hljs-keyword">as</span> gr
<span class="hljs-keyword">from</span> dataset <span class="hljs-keyword">import</span> read_labels

T = <span class="hljs-number">16</span>
SIZE = <span class="hljs-number">224</span>
MODEL_PATH = <span class="hljs-string">"vit_temporal.onnx"</span>

labels, _ = read_labels(<span class="hljs-string">"labels.txt"</span>)

<span class="hljs-comment"># --- ONNX session + auto-detect names ---</span>
ort_session = onnxruntime.InferenceSession(MODEL_PATH, providers=[<span class="hljs-string">"CPUExecutionProvider"</span>])
<span class="hljs-comment"># detect first input and first output names to avoid mismatches</span>
INPUT_NAME = ort_session.get_inputs()[<span class="hljs-number">0</span>].name   <span class="hljs-comment"># e.g. "input" or "video"</span>
OUTPUT_NAME = ort_session.get_outputs()[<span class="hljs-number">0</span>].name <span class="hljs-comment"># e.g. "logits" or something else</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">preprocess_clip</span>(<span class="hljs-params">frames_rgb</span>):</span>
    <span class="hljs-keyword">if</span> len(frames_rgb) == <span class="hljs-number">0</span>:
        frames_rgb = [np.zeros((SIZE, SIZE, <span class="hljs-number">3</span>), dtype=np.uint8)]
    <span class="hljs-keyword">if</span> len(frames_rgb) &lt; T:
        frames_rgb = frames_rgb + [frames_rgb[<span class="hljs-number">-1</span>]] * (T - len(frames_rgb))
    frames_rgb = frames_rgb[:T]
    clip = [cv2.resize(f, (SIZE, SIZE), interpolation=cv2.INTER_AREA) <span class="hljs-keyword">for</span> f <span class="hljs-keyword">in</span> frames_rgb]
    clip = np.stack(clip, axis=<span class="hljs-number">0</span>)                                    <span class="hljs-comment"># (T,H,W,3)</span>
    clip = np.transpose(clip, (<span class="hljs-number">0</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>)).astype(np.float32) / <span class="hljs-number">255</span> <span class="hljs-comment"># (T,3,H,W)</span>
    clip = (clip - <span class="hljs-number">0.5</span>) / <span class="hljs-number">0.5</span>
    clip = np.expand_dims(clip, <span class="hljs-number">0</span>)                                   <span class="hljs-comment"># (1,T,3,H,W)</span>
    <span class="hljs-keyword">return</span> clip

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_extract_path_from_gradio_video</span>(<span class="hljs-params">inp</span>):</span>
    <span class="hljs-keyword">if</span> isinstance(inp, str) <span class="hljs-keyword">and</span> os.path.exists(inp):
        <span class="hljs-keyword">return</span> inp
    <span class="hljs-keyword">if</span> isinstance(inp, dict):
        <span class="hljs-keyword">for</span> key <span class="hljs-keyword">in</span> (<span class="hljs-string">"video"</span>, <span class="hljs-string">"name"</span>, <span class="hljs-string">"path"</span>, <span class="hljs-string">"filepath"</span>):
            v = inp.get(key)
            <span class="hljs-keyword">if</span> isinstance(v, str) <span class="hljs-keyword">and</span> os.path.exists(v):
                <span class="hljs-keyword">return</span> v
        <span class="hljs-keyword">for</span> key <span class="hljs-keyword">in</span> (<span class="hljs-string">"data"</span>, <span class="hljs-string">"video"</span>):
            v = inp.get(key)
            <span class="hljs-keyword">if</span> isinstance(v, (bytes, bytearray)):
                tmp = tempfile.NamedTemporaryFile(delete=<span class="hljs-literal">False</span>, suffix=<span class="hljs-string">".mp4"</span>)
                tmp.write(v); tmp.flush(); tmp.close()
                <span class="hljs-keyword">return</span> tmp.name
    <span class="hljs-keyword">if</span> isinstance(inp, (list, tuple)) <span class="hljs-keyword">and</span> inp <span class="hljs-keyword">and</span> isinstance(inp[<span class="hljs-number">0</span>], str) <span class="hljs-keyword">and</span> os.path.exists(inp[<span class="hljs-number">0</span>]):
        <span class="hljs-keyword">return</span> inp[<span class="hljs-number">0</span>]
    <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_read_uniform_frames</span>(<span class="hljs-params">video_path</span>):</span>
    cap = cv2.VideoCapture(video_path)
    frames = []
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) <span class="hljs-keyword">or</span> <span class="hljs-number">1</span>
    idxs = np.linspace(<span class="hljs-number">0</span>, total - <span class="hljs-number">1</span>, max(T, <span class="hljs-number">1</span>)).astype(int)
    want = set(int(i) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> idxs.tolist())
    j = <span class="hljs-number">0</span>
    <span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
        ok, bgr = cap.read()
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> ok: <span class="hljs-keyword">break</span>
        <span class="hljs-keyword">if</span> j <span class="hljs-keyword">in</span> want:
            rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
            frames.append(rgb)
        j += <span class="hljs-number">1</span>
    cap.release()
    <span class="hljs-keyword">return</span> frames

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict_from_video</span>(<span class="hljs-params">gradio_video</span>):</span>
    video_path = _extract_path_from_gradio_video(gradio_video)
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> video_path <span class="hljs-keyword">or</span> <span class="hljs-keyword">not</span> os.path.exists(video_path):
        <span class="hljs-keyword">return</span> {}
    frames = _read_uniform_frames(video_path)

    <span class="hljs-comment"># If OpenCV choked on the codec (common with recorded webm), re-encode once:</span>
    <span class="hljs-keyword">if</span> len(frames) == <span class="hljs-number">0</span>:
        tmp = tempfile.NamedTemporaryFile(delete=<span class="hljs-literal">False</span>, suffix=<span class="hljs-string">".mp4"</span>); tmp_name = tmp.name; tmp.close()
        cap = cv2.VideoCapture(video_path)
        fourcc = cv2.VideoWriter_fourcc(*<span class="hljs-string">"mp4v"</span>)
        w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)) <span class="hljs-keyword">or</span> <span class="hljs-number">640</span>
        h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) <span class="hljs-keyword">or</span> <span class="hljs-number">480</span>
        out = cv2.VideoWriter(tmp_name, fourcc, <span class="hljs-number">20.0</span>, (w, h))
        <span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
            ok, frame = cap.read()
            <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> ok: <span class="hljs-keyword">break</span>
            out.write(frame)
        cap.release(); out.release()
        frames = _read_uniform_frames(tmp_name)

    clip = preprocess_clip(frames)
    <span class="hljs-comment"># &gt;&gt;&gt; use the detected ONNX input/output names &lt;&lt;&lt;</span>
    logits = ort_session.run([OUTPUT_NAME], {INPUT_NAME: clip})[<span class="hljs-number">0</span>]  <span class="hljs-comment"># (1, C)</span>
    probs = torch.softmax(torch.from_numpy(logits), dim=<span class="hljs-number">1</span>)[<span class="hljs-number">0</span>].numpy().tolist()
    <span class="hljs-keyword">return</span> {labels[i]: float(probs[i]) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(len(labels))}

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict_from_image</span>(<span class="hljs-params">image</span>):</span>
    <span class="hljs-keyword">if</span> image <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
        <span class="hljs-keyword">return</span> {}
    clip = preprocess_clip([image] * T)
    logits = ort_session.run([OUTPUT_NAME], {INPUT_NAME: clip})[<span class="hljs-number">0</span>]
    probs = torch.softmax(torch.from_numpy(logits), dim=<span class="hljs-number">1</span>)[<span class="hljs-number">0</span>].numpy().tolist()
    <span class="hljs-keyword">return</span> {labels[i]: float(probs[i]) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(len(labels))}

<span class="hljs-keyword">with</span> gr.Blocks() <span class="hljs-keyword">as</span> demo:
    gr.Markdown(<span class="hljs-string">"# Gesture Classifier (ONNX)\nRecord or upload a short video, then click **Classify Video**."</span>)
    <span class="hljs-keyword">with</span> gr.Tab(<span class="hljs-string">"Video (record or upload)"</span>):
        vid_in = gr.Video(label=<span class="hljs-string">"Record from webcam or upload a short clip"</span>)
        vid_out = gr.Label(num_top_classes=<span class="hljs-number">3</span>, label=<span class="hljs-string">"Prediction"</span>)
        gr.Button(<span class="hljs-string">"Classify Video"</span>).click(fn=predict_from_video, inputs=vid_in, outputs=vid_out)
    <span class="hljs-keyword">with</span> gr.Tab(<span class="hljs-string">"Single Image (fallback)"</span>):
        img_in = gr.Image(label=<span class="hljs-string">"Upload an image frame"</span>, type=<span class="hljs-string">"numpy"</span>)
        img_out = gr.Label(num_top_classes=<span class="hljs-number">3</span>, label=<span class="hljs-string">"Prediction"</span>)
        gr.Button(<span class="hljs-string">"Classify Image"</span>).click(fn=predict_from_image, inputs=img_in, outputs=img_out)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    demo.launch()
</code></pre>
<p>Running the command <code>python app.py</code> starts a local Gradio server and prints a URL (by default <code>http://127.0.0.1:7860</code>) that you can open in your web browser. Here's what happens:</p>
<ol>
<li><p><strong>You record or upload a short clip</strong>: The Video tab lets you record from your webcam or upload an existing file. Nothing is classified until you click the button, so this is clip-based rather than a continuous live stream.</p>
</li>
<li><p><strong>The app samples 16 frames and runs the ONNX model</strong>: When you click <strong>Classify Video</strong>, the app reads up to 16 evenly spaced frames from the clip, pads short clips by repeating the last frame, resizes the frames to 224×224, rescales pixel values to the range [-1, 1], and runs a single ONNX inference.</p>
</li>
<li><p><strong>Top 3 gesture classes displayed with probabilities</strong>: The application displays the top three predicted gesture classes along with their probabilities, giving you an idea of the model's confidence in its predictions.</p>
</li>
</ol>
<p>When you open the app in your browser, you'll find two tabs. In the <strong>Video tab</strong>, you can record a short clip of your gesture from your webcam, typically lasting 2–4 seconds, or upload an existing clip. Then click <strong>Classify Video</strong>. The app runs the sampled frames through the exported ONNX model and displays the predicted gesture probabilities. The <strong>Single Image tab</strong> is a fallback that repeats one uploaded frame 16 times to form a clip. This setup allows for interactive testing and demonstration of the gesture recognition system.</p>
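<p>The app always feeds the model 16 frames, regardless of how long the clip is. The sketch below mirrors that idea in plain Python: pick evenly spaced frame indices, then repeat the last entry if the clip is too short. <code>sample_indices</code> and <code>pad_to_length</code> are hypothetical helper names used only here, and the integer rounding may differ slightly from the <code>np.linspace</code> call in <code>app.py</code>.</p>

```python
T = 16  # frames per clip, as in app.py

def sample_indices(total_frames, num=T):
    """Evenly spaced frame indices from 0 to total_frames - 1."""
    if total_frames == 0:
        return []
    if num == 1:
        return [0]
    # Integer floor division keeps the first index at 0 and the last at total_frames - 1.
    # A set removes duplicates, which appear when the clip has fewer frames than num.
    return sorted({(i * (total_frames - 1)) // (num - 1) for i in range(num)})

def pad_to_length(items, num=T):
    """Repeat the last item until the list has num entries, then trim to num."""
    if not items:
        return []
    padded = items + [items[-1]] * max(0, num - len(items))
    return padded[:num]

print(sample_indices(60))                # 16 indices spread over a 60-frame clip
print(pad_to_length(sample_indices(5)))  # indices stand in for frames: 5 frames padded to 16
```

<p>Repeating the last frame keeps the input shape fixed at 16 frames, which matches the shape the model was exported with.</p>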
<p>Here’s an example where I raised my hand for a <strong>stop</strong> gesture, and the model predicts “stop” as the top class:</p>
<p><img src="https://github.com/tayo4christ/transformer-gesture/blob/07c7071bdb17bc08585baeb60d787eadc3936ef5/images/realtime-demo.png?raw=true" alt="Gradio demo output" width="600" height="400" loading="lazy"></p>
<p>Figure 2. The Gradio app running locally. After recording a short clip, the Transformer model predicts the gesture with class probabilities.</p>
<h3 id="heading-evaluate-accuracy-latency">Evaluate Accuracy + Latency</h3>
<p>Now that the model runs in a demo app, let’s check how well it performs. There are two sides to this:</p>
<ul>
<li><p><strong>Accuracy</strong>: does the model predict the right gesture class?</p>
</li>
<li><p><strong>Latency</strong>: how fast does it respond, especially on CPU vs GPU?</p>
</li>
</ul>
<h4 id="heading-1-quick-accuracy-check">1. Quick Accuracy Check</h4>
<p>Save this as <code>eval.py</code> in the same folder as your other scripts:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> torch
<span class="hljs-keyword">from</span> dataset <span class="hljs-keyword">import</span> GestureClips, read_labels
<span class="hljs-keyword">from</span> train <span class="hljs-keyword">import</span> ViTTemporal

labels, _ = read_labels(<span class="hljs-string">"labels.txt"</span>)
n_classes = len(labels)

<span class="hljs-comment"># Load validation data</span>
val_ds = GestureClips(train=<span class="hljs-literal">False</span>)
val_dl = torch.utils.data.DataLoader(val_ds, batch_size=<span class="hljs-number">2</span>, shuffle=<span class="hljs-literal">False</span>)

<span class="hljs-comment"># Load trained model</span>
model = ViTTemporal(num_classes=n_classes)
model.load_state_dict(torch.load(<span class="hljs-string">"vit_temporal_best.pt"</span>, map_location=<span class="hljs-string">"cpu"</span>))
model.eval()

correct, total = <span class="hljs-number">0</span>, <span class="hljs-number">0</span>
all_preds, all_labels = [], []

<span class="hljs-keyword">with</span> torch.no_grad():
    <span class="hljs-keyword">for</span> x, y <span class="hljs-keyword">in</span> val_dl:
        logits = model(x)
        preds = logits.argmax(dim=<span class="hljs-number">1</span>)
        correct += (preds == y).sum().item()
        total += y.size(<span class="hljs-number">0</span>)
        all_preds.extend(preds.tolist())
        all_labels.extend(y.tolist())

print(<span class="hljs-string">f"Validation accuracy: <span class="hljs-subst">{correct/total:<span class="hljs-number">.2</span>%}</span>"</span>)
</code></pre>
<h4 id="heading-2-confusion-matrix">2. Confusion Matrix</h4>
<p>Let’s also visualize which gestures are confused. Add this snippet at the bottom of <code>eval.py</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
<span class="hljs-keyword">import</span> seaborn <span class="hljs-keyword">as</span> sns
<span class="hljs-keyword">from</span> sklearn.metrics <span class="hljs-keyword">import</span> confusion_matrix

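<span class="hljs-comment"># Fix the class order so rows/columns match the tick labels even if a class is missing from the validation set</span>
cm = confusion_matrix(all_labels, all_preds, labels=list(range(n_classes)))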

plt.figure(figsize=(<span class="hljs-number">6</span>,<span class="hljs-number">6</span>))
sns.heatmap(cm, annot=<span class="hljs-literal">True</span>, fmt=<span class="hljs-string">"d"</span>, xticklabels=labels, yticklabels=labels, cmap=<span class="hljs-string">"Blues"</span>)
plt.xlabel(<span class="hljs-string">"Predicted"</span>)
plt.ylabel(<span class="hljs-string">"True"</span>)
plt.title(<span class="hljs-string">"Confusion Matrix"</span>)
plt.tight_layout()
plt.show()
</code></pre>
<p>When you run <code>python eval.py</code>, a heatmap like this will pop up:</p>
<p><img src="https://github.com/tayo4christ/transformer-gesture/blob/07c7071bdb17bc08585baeb60d787eadc3936ef5/images/confusion-matrix.png?raw=true" alt="Confusion matrix" width="600" height="400" loading="lazy"></p>
<p>Figure 3. Confusion matrix on the validation set. Correct predictions appear along the diagonal. Off-diagonal counts show gesture confusions.</p>
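<p>With only a handful of validation clips, overall accuracy can hide which classes fail. As a rough sketch, per-class recall can be read off the confusion matrix by dividing each diagonal entry by its row sum. The lists below are made-up stand-ins for <code>all_labels</code> and <code>all_preds</code> from <code>eval.py</code>, and the class names are examples.</p>

```python
# Made-up stand-ins for all_labels / all_preds collected in eval.py.
class_names = ["swipe_left", "swipe_right", "stop"]
all_labels = [0, 0, 1, 1, 2, 2, 2]
all_preds = [0, 1, 1, 1, 2, 0, 2]

n = len(class_names)
# cm[true][pred] counts clips of class `true` that were predicted as class `pred`.
cm = [[0] * n for _ in range(n)]
for true_idx, pred_idx in zip(all_labels, all_preds):
    cm[true_idx][pred_idx] += 1

for idx, name in enumerate(class_names):
    row_total = sum(cm[idx])
    # Recall: correctly predicted clips of this class / all clips of this class.
    recall = cm[idx][idx] / row_total if row_total else 0.0
    print(f"{name:12s} recall {recall:.2f} ({cm[idx][idx]}/{row_total})")
```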
<h4 id="heading-3-latency-benchmark">3. Latency Benchmark</h4>
<p>Finally, let’s see how fast inference runs. Save the following as <code>benchmark.py</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> time, numpy <span class="hljs-keyword">as</span> np, onnxruntime
<span class="hljs-keyword">from</span> dataset <span class="hljs-keyword">import</span> read_labels

labels, _ = read_labels(<span class="hljs-string">"labels.txt"</span>)

ort = onnxruntime.InferenceSession(<span class="hljs-string">"vit_temporal.onnx"</span>, providers=[<span class="hljs-string">"CPUExecutionProvider"</span>])
INPUT_NAME = ort.get_inputs()[<span class="hljs-number">0</span>].name
OUTPUT_NAME = ort.get_outputs()[<span class="hljs-number">0</span>].name

dummy = np.random.randn(<span class="hljs-number">1</span>, <span class="hljs-number">16</span>, <span class="hljs-number">3</span>, <span class="hljs-number">224</span>, <span class="hljs-number">224</span>).astype(np.float32)

<span class="hljs-comment"># Warmup</span>
<span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(<span class="hljs-number">3</span>):
    ort.run([OUTPUT_NAME], {INPUT_NAME: dummy})

<span class="hljs-comment"># Benchmark</span>
t0 = time.perf_counter()  <span class="hljs-comment"># monotonic, high-resolution timer</span>
<span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(<span class="hljs-number">50</span>):
    ort.run([OUTPUT_NAME], {INPUT_NAME: dummy})
t1 = time.perf_counter()

print(<span class="hljs-string">f"Average latency: <span class="hljs-subst">{(t1 - t0)/<span class="hljs-number">50</span>:<span class="hljs-number">.3</span>f}</span> seconds per clip"</span>)
</code></pre>
<p>Run: <code>python benchmark.py</code></p>
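<p>On a laptop CPU, you might see roughly 0.05–0.15s per clip, though this varies a lot with hardware. Note that this script only uses the <code>CPUExecutionProvider</code>. Running on a GPU requires the GPU build of ONNX Runtime and a GPU execution provider such as <code>CUDAExecutionProvider</code>, and is usually much faster.</p>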
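<p>A single average can hide occasional slow runs. As a sketch, the helper below times any callable and reports the mean, median, and 95th percentile. The <code>benchmark</code> function is a hypothetical helper, and the workload is a stand-in so the snippet runs without a model; in <code>benchmark.py</code> you would pass a lambda that calls <code>ort.run</code> instead.</p>

```python
import statistics
import time

def benchmark(fn, warmup=3, runs=50):
    """Time fn() and return (mean, median, p95) latency in seconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()  # monotonic, high-resolution clock
        fn()
        timings.append(time.perf_counter() - start)
    timings.sort()
    p95 = timings[min(len(timings) - 1, int(0.95 * len(timings)))]
    return statistics.mean(timings), statistics.median(timings), p95

# Stand-in workload; with the real model this would be
# lambda: ort.run([OUTPUT_NAME], {INPUT_NAME: dummy})
mean_s, median_s, p95_s = benchmark(lambda: sum(range(10_000)))
print(f"mean {mean_s * 1e3:.3f} ms | median {median_s * 1e3:.3f} ms | p95 {p95_s * 1e3:.3f} ms")
```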
<p><strong>Note</strong>: If latency is high, you can apply <strong>quantization</strong> with ONNX Runtime's quantization tools to shrink the model and potentially speed up inference, possibly at some cost in accuracy.</p>
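<p>For reference, here is a minimal sketch of dynamic INT8 quantization using <code>quantize_dynamic</code> from <code>onnxruntime.quantization</code>. The <code>quantized_path</code> helper and the <code>.int8.onnx</code> output name are assumptions made for this example, and the snippet has not been run against this particular model, so treat it as a starting point.</p>

```python
from pathlib import Path

def quantized_path(model_path):
    """Hypothetical naming helper: vit_temporal.onnx -> vit_temporal.int8.onnx."""
    return str(Path(model_path).with_suffix(".int8.onnx"))

def quantize(model_path="vit_temporal.onnx"):
    # Imported lazily so this file can be imported without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    out_path = quantized_path(model_path)
    # Dynamic quantization stores weights as INT8 and quantizes activations at run time.
    quantize_dynamic(model_path, out_path, weight_type=QuantType.QInt8)
    return out_path

# Usage (requires onnxruntime and an exported model in the working directory):
#   quantize("vit_temporal.onnx")
```

<p>After quantizing, point <code>benchmark.py</code> at the new file and compare the model's predictions against the original. Whether the quantized model is faster or less accurate depends on your hardware and data.</p>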
<h2 id="heading-option-2-use-small-samples-from-public-gesture-datasets">Option 2: Use Small Samples from Public Gesture Datasets</h2>
<p>If you’d prefer to see your model trained on <em>real</em> gesture clips instead of synthetic moving boxes, you can grab a handful of videos from open datasets. You don’t need to download an entire dataset (which can be several GB). A few <code>.mp4</code> samples are enough to follow along.</p>
<h3 id="heading-recommended-sources">Recommended sources</h3>
<ul>
<li><p><strong>20BN Jester Dataset</strong>: Contains short clips of hand gestures like swiping, clapping, and pointing.</p>
</li>
<li><p><strong>WLASL</strong>: A large-scale dataset of isolated sign language words.</p>
</li>
</ul>
<p>Both projects provide small <code>.mp4</code> videos you can use as realistic training examples. I’ve linked them below.</p>
<h3 id="heading-setting-up-your-dataset-folder">Setting up your dataset folder</h3>
<p>Once you download a few clips, place them in the <code>data/</code> folder under subfolders named after each gesture class. For example:</p>
<pre><code class="lang-plaintext">data/
├── swipe_left/
│   ├── clip1.mp4
│   └── clip2.mp4
├── swipe_right/
│   ├── clip1.mp4
│   └── clip2.mp4
└── stop/
    ├── clip1.mp4
    └── clip2.mp4
</code></pre>
<p>And update <code>labels.txt</code> to match the folder names:</p>
<pre><code class="lang-plaintext">swipe_left
swipe_right
stop
</code></pre>
<p>Now your dataset is ready, and the same training scripts from earlier (<code>train.py</code>, <code>eval.py</code>) will work without modification.</p>
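<p>Before training, it can help to confirm that the folder names and <code>labels.txt</code> agree. The sketch below assumes the layout shown above (one label per line, one subfolder per class, <code>.mp4</code> clips inside). <code>check_dataset</code> is a hypothetical helper, not part of the tutorial's scripts.</p>

```python
from pathlib import Path

def check_dataset(data_dir="data", labels_file="labels.txt"):
    """Return (missing, extra, counts) comparing labels.txt with the class folders."""
    lines = Path(labels_file).read_text().splitlines()
    labels = [line.strip() for line in lines if line.strip()]
    folders = sorted(p.name for p in Path(data_dir).iterdir() if p.is_dir())
    missing = [name for name in labels if name not in folders]  # listed but no folder
    extra = [name for name in folders if name not in labels]    # folder but not listed
    counts = {name: len(list((Path(data_dir) / name).glob("*.mp4")))
              for name in labels if name in folders}
    return missing, extra, counts

# Only runs the check when the expected files exist in the working directory.
if Path("data").is_dir() and Path("labels.txt").is_file():
    missing, extra, counts = check_dataset()
    print("Missing folders:", missing)
    print("Unlisted folders:", extra)
    print("Clips per class:", counts)
```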
<h3 id="heading-why-choose-this-option">Why choose this option?</h3>
<ul>
<li><p>Gives more realistic results than synthetic coloured boxes</p>
</li>
<li><p>Lets you see how the model handles <em>actual human hand movements</em></p>
</li>
<li><p>The trade-off is a bit more effort (downloading clips, trimming them if needed)</p>
</li>
</ul>
<p><strong>Tip:</strong> If downloading from these datasets feels too heavy, you can also record your own short gestures using your laptop webcam. Just save them as <code>.mp4</code> files and organize them in the same folder structure.</p>
<h2 id="heading-accessibility-notes-amp-ethical-limits">Accessibility Notes &amp; Ethical Limits</h2>
<p>While this project shows the technical workflow for gesture recognition with Transformers, it’s important to step back and consider the <strong>human context</strong>:</p>
<ul>
<li><p><strong>Accessibility first</strong>: Tools like this can help students with speech or motor difficulties, but they should always be co-designed with the people who will use them. Don’t assume one-size-fits-all.</p>
</li>
<li><p><strong>Dataset sensitivity</strong>: Using publicly available sign or gesture datasets is fine for prototyping, but deploying such a system requires careful consideration of consent and representation.</p>
</li>
<li><p><strong>Error tolerance</strong>: Even small misclassifications can have big consequences in accessibility contexts (for example, confusing <em>stop</em> with <em>go</em>). Always plan for fallback options (like manual input or confirmation).</p>
</li>
<li><p><strong>Bias and inclusivity</strong>: Models trained on narrow datasets may fail for different skin tones, lighting conditions, or cultural gesture variations. Broad and diverse training data is essential for fairness.</p>
</li>
</ul>
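<p>As one possible way to build in the fallback mentioned above, the app could accept a prediction only when the model is clearly confident and otherwise ask the user to confirm or repeat the gesture. The sketch below is illustrative: the <code>decide</code> helper and both thresholds are assumptions that would need tuning with real users.</p>

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decide(logits, labels, min_prob=0.8, min_margin=0.3):
    """Return (label, prob) when confident, else (None, prob) to trigger a fallback."""
    probs = softmax(logits)
    ranked = sorted(zip(probs, labels), reverse=True)
    (top_p, top_label), (second_p, _) = ranked[0], ranked[1]
    if top_p >= min_prob and top_p - second_p >= min_margin:
        return top_label, top_p
    return None, top_p  # caller should ask the user to confirm or repeat

labels = ["swipe_left", "swipe_right", "stop"]
print(decide([0.2, 0.1, 4.0], labels))  # clear winner: accepted
print(decide([1.0, 0.9, 1.1], labels))  # near tie: falls back
```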
<p>In other words: this demo is a <strong>teaching scaffold</strong>, not a production-ready accessibility tool. Responsible deployment requires collaboration with educators, therapists, and end users.</p>
<h2 id="heading-next-steps">Next Steps</h2>
<p>If you’d like to push this project further, here are some directions to explore:</p>
<ul>
<li><p><strong>Better models</strong>: Try video-focused Transformers like <a target="_blank" href="https://arxiv.org/abs/2102.05095">TimeSformer</a> or <a target="_blank" href="https://arxiv.org/abs/2203.12602">VideoMAE</a> for stronger temporal reasoning.</p>
</li>
<li><p><strong>Larger vocabularies</strong>: Add more gesture classes, build your own dataset, or use portions of public datasets like <a target="_blank" href="https://www.kaggle.com/datasets/toxicmender/20bn-jester">20BN Jester</a> or <a target="_blank" href="https://www.kaggle.com/datasets/risangbaskoro/wlasl-processed">WLASL</a>.</p>
</li>
<li><p><strong>Pose fusion</strong>: Combine gesture video with human pose keypoints from <a target="_blank" href="https://mediapipe.readthedocs.io/en/latest/solutions/hands.html">MediaPipe</a> or <a target="_blank" href="https://github.com/CMU-Perceptual-Computing-Lab/openpose">OpenPose</a> for more robust predictions.</p>
</li>
<li><p><strong>Real-time smoothing</strong>: Implement temporal smoothing or debounce logic in the app so predictions are more stable during live use.</p>
</li>
<li><p><strong>Quantization + edge devices</strong>: Convert your ONNX model to an INT8 quantized version and deploy it on a Raspberry Pi or Jetson Nano for classroom-ready prototypes.</p>
</li>
</ul>
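<p>To give a feel for the smoothing idea, here is a small sketch of a majority vote over recent predictions with a minimum vote count, which acts as a simple debounce. The <code>PredictionSmoother</code> class, the window size, and the vote threshold are all assumptions made for illustration.</p>

```python
from collections import Counter, deque

class PredictionSmoother:
    """Majority vote over the last `window` predictions, with a minimum vote count."""

    def __init__(self, window=5, min_votes=3):
        self.history = deque(maxlen=window)  # old predictions drop off automatically
        self.min_votes = min_votes
        self.current = None                  # last stable label shown to the user

    def update(self, label):
        self.history.append(label)
        winner, votes = Counter(self.history).most_common(1)[0]
        # Only switch the displayed label once the winner has enough support.
        if votes >= self.min_votes:
            self.current = winner
        return self.current

smoother = PredictionSmoother()
raw = ["stop", "stop", "swipe_left", "stop", "swipe_left", "swipe_left", "swipe_left"]
stable = [smoother.update(label) for label in raw]
print(stable)
```

<p>A larger window gives steadier output but reacts more slowly when the gesture really changes, so the right values depend on how often you classify clips.</p>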
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this tutorial, you built a small gesture recognition system with a Transformer model. You prepared a small dataset, fine-tuned a Vision Transformer with temporal pooling, exported the model to ONNX for efficient inference, and wrapped it in a Gradio app that classifies short recorded or uploaded clips. You also measured accuracy and latency, which gives you a baseline to improve on as you add more data.</p>
<p>This project illustrates how you can leverage advanced ML methods to enhance accessibility and communication, paving the way for more inclusive learning environments.</p>
<p>Remember: while this demo works with small datasets, real-world applications need larger, more diverse data and careful consideration of accessibility, inclusivity, and ethics.</p>
<p>Here’s the GitHub repo for full source code: <a target="_blank" href="https://github.com/tayo4christ/transformer-gesture">transformer-gesture</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Build Your Own ViT Model from Scratch ]]>
                </title>
                <description>
                    <![CDATA[ Vision Transformers have fundamentally changed how we approach computer vision problems, delivering state-of-the-art results that often surpass traditional convolutional neural networks. As the industry shifts toward transformer-based architectures f... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/build-your-own-vit-model-from-scratch/</link>
                <guid isPermaLink="false">68371245a0ad7212aa3fe103</guid>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Wed, 28 May 2025 13:40:21 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1748439600587/276b8ea4-1a66-494e-9b6a-ec06c898379a.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Vision Transformers have fundamentally changed how we approach computer vision problems, delivering state-of-the-art results that often surpass traditional convolutional neural networks. As the industry shifts toward transformer-based architectures for image classification, object detection, and beyond, understanding how to build and implement these models from scratch has become essential for machine learning practitioners and researchers who want to stay at the forefront of computer vision innovation.</p>
<p>We've just released a comprehensive new course on the <a target="_blank" href="http://freeCodeCamp.org">freeCodeCamp.org</a> YouTube channel that takes you through the complete process of building a Vision Transformer (ViT) model using PyTorch. This hands-on tutorial guides you through each component, from patch embedding to the Transformer Encoder, while training your custom model on the CIFAR-10 dataset for practical image classification experience. Mohammed Al Abrah developed this course.</p>
<h2 id="heading-what-youll-accomplish">What You'll Accomplish</h2>
<p>This course provides both theoretical understanding and practical implementation skills. You'll start with the foundational concepts of Vision Transformers, learning how they differ from CNNs and why they've become so effective for computer vision tasks. The tutorial then walks you through setting up your development environment and configuring the necessary hyperparameters for optimal training.</p>
<p>The core of the course focuses on building the ViT architecture from the ground up. You'll implement image transformation operations, download and prepare the CIFAR-10 dataset, and create efficient DataLoaders. Most importantly, you'll construct the complete Vision Transformer model, understanding each component's role in the overall architecture.</p>
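<p>To give a sense of what the first ViT component looks like, here is a rough sketch of a patch embedding layer in PyTorch. It is not taken from the course. The class name, the patch size of 4, and the embedding dimension of 64 are assumptions chosen to suit 32x32 CIFAR-10 images.</p>

```python
import torch
from torch import nn


class PatchEmbedding(nn.Module):
    """Split an image into non-overlapping patches and project each to a vector."""

    def __init__(self, in_channels=3, patch_size=4, embed_dim=64):
        super().__init__()
        # A convolution whose kernel size equals its stride cuts the image into
        # patches and applies the same linear projection to every patch.
        self.proj = nn.Conv2d(in_channels, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x)          # (batch, embed_dim, H / patch, W / patch)
        x = x.flatten(2)          # (batch, embed_dim, num_patches)
        return x.transpose(1, 2)  # (batch, num_patches, embed_dim)


images = torch.randn(2, 3, 32, 32)  # two random CIFAR-10-sized RGB images
tokens = PatchEmbedding()(images)
print(tokens.shape)
```

<p>Each 32x32 image becomes a sequence of 8 x 8 = 64 patch tokens, which is the kind of input a Transformer Encoder expects.</p>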
<h2 id="heading-training-and-optimization">Training and Optimization</h2>
<p>The course covers the complete machine learning pipeline, including defining appropriate loss functions and optimizers for your ViT model. You'll implement a comprehensive training loop and learn to visualize training progress by comparing training versus testing accuracy. The tutorial also demonstrates how to make predictions with your trained model and visualize the results.</p>
<p>Advanced sections focus on fine-tuning techniques using data augmentation to improve model performance. You'll train the enhanced model and compare results before and after fine-tuning, gaining insights into optimization strategies that can significantly boost your model's effectiveness.</p>
<h2 id="heading-course-structure">Course Structure</h2>
<p>The tutorial is organized into clear, logical sections that build upon each other. Starting with theoretical foundations, you'll progress through environment setup, data preparation, model construction, training procedures, and advanced optimization techniques. Each section includes practical code implementation, ensuring you gain hands-on experience with every aspect of Vision Transformer development.</p>
<p>The course concludes with comprehensive evaluation methods, teaching you to assess model performance and understand the impact of different training strategies. You'll learn to visualize predictions and analyze results, skills that are crucial for real-world machine learning applications.</p>
<h2 id="heading-why-this-matters-now">Why This Matters Now</h2>
<p>As transformer architectures continue to dominate both natural language processing and computer vision, the ability to implement these models from scratch provides invaluable insight into their inner workings. This understanding enables you to modify architectures for specific use cases, debug training issues effectively, and adapt to new developments in the field.</p>
<p>Ready to master one of the most important advances in modern computer vision? Watch the full course on <a target="_blank" href="https://youtu.be/7o1jpvapaT0">the freeCodeCamp.org YouTube channel</a> (2-hour watch).</p>
<div class="embed-wrapper">
        <iframe width="560" height="315" src="https://www.youtube.com/embed/7o1jpvapaT0" style="aspect-ratio: 16 / 9; width: 100%; height: auto;" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" loading="lazy"></iframe></div>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Learn PyTorch in Five Projects ]]>
                </title>
                <description>
                    <![CDATA[ Deep learning has revolutionized the way we approach complex problems like image recognition, natural language processing, and even audio analysis. At the core of many deep learning applications is PyTorch, a powerful and flexible framework that allo... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/learn-pytorch-in-five-projects/</link>
                <guid isPermaLink="false">67c9e1b1b9163aa35c6dc2ee</guid>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Thu, 06 Mar 2025 17:56:01 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1741283747402/6d00899f-105f-4c20-97e8-0f789348bacf.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Deep learning has revolutionized the way we approach complex problems like image recognition, natural language processing, and even audio analysis. At the core of many deep learning applications is <strong>PyTorch</strong>, a powerful and flexible framework that allows developers and researchers to build and train neural networks efficiently. If you're looking to gain hands-on experience with PyTorch and understand its syntax in real-world applications, we've got the perfect course for you.</p>
<p>We just published a course on the freeCodeCamp.org YouTube channel that will teach you all about <strong>PyTorch and its syntax</strong> through five practical exercises, guided by Omar Atef. This course provides a structured introduction to PyTorch, covering different types of machine learning tasks, from tabular data classification to deep learning applications in image, audio, and text classification. Each section focuses on a specific problem, allowing you to see PyTorch in action and build models that handle various types of data.</p>
<h3 id="heading-what-youll-learn-in-this-course">What You'll Learn in This Course</h3>
<p>🔹 <strong>Tabular Data Classification</strong> – Learn how to use PyTorch for structured data, a crucial skill for predictive modeling in industries like finance, healthcare, and retail.</p>
<p>🔹 <strong>Image Classification</strong> – Train a deep learning model to recognize objects in images, a fundamental task in computer vision.</p>
<p>🔹 <strong>Pre-trained Models for Image Classification</strong> – Discover how to leverage powerful, pre-trained neural networks to achieve high accuracy with minimal training time.</p>
<p>🔹 <strong>Audio Classification</strong> – Explore how PyTorch can be used to classify sounds and speech, an essential step in applications like voice recognition and music categorization.</p>
<p>🔹 <strong>Text Classification with BERT</strong> – Learn how to use the BERT model for natural language processing tasks such as sentiment analysis and spam detection.</p>
<h3 id="heading-why-learn-pytorch">Why Learn PyTorch?</h3>
<p>PyTorch is widely used in both research and industry due to its ease of use, dynamic computation graph, and strong community support. By mastering PyTorch, you'll gain the ability to build and deploy deep learning models efficiently, making it an essential skill for data scientists, AI engineers, and researchers.</p>
<p>This course is beginner-friendly but also provides valuable insights for those already familiar with machine learning. Each section includes hands-on coding exercises that reinforce your understanding and help you apply what you learn to real-world problems.</p>
<p>Watch the full course here: <a target="_blank" href="https://youtu.be/E0bwEAWmVEM">PyTorch Course on freeCodeCamp.org</a> (6-hour watch).</p>
<div class="embed-wrapper">
        <iframe width="560" height="315" src="https://www.youtube.com/embed/E0bwEAWmVEM" style="aspect-ratio: 16 / 9; width: 100%; height: auto;" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" loading="lazy"></iframe></div>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Build a Stable Diffusion VAE From Scratch using PyTorch ]]>
                </title>
                <description>
                    <![CDATA[ We just published a course on the freeCodeCamp.org YouTube channel that will teach you everything you need to know about Variational Autoencoders (VAEs). This course is perfect for anyone looking to dive deep into one of the fundamental concepts behi... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/build-a-stable-diffusion-vae-from-scratch-using-pytorch/</link>
                <guid isPermaLink="false">67506cfb94f39e7617233d40</guid>
                
                    <category>
                        <![CDATA[ VAEs (Variational Autoencoders ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Wed, 04 Dec 2024 14:53:47 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1733323982008/68d50c5d-0829-4d5c-90d0-c9feeedbd92d.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>We just published a course on the freeCodeCamp.org YouTube channel that will teach you everything you need to know about <strong>Variational Autoencoders (VAEs)</strong>. This course is perfect for anyone looking to dive deep into one of the fundamental concepts behind modern image generation techniques, such as those used in latent diffusion models and GANs. Harsh Bhatt developed this course. He is a machine learning engineer.</p>
<p>VAEs are a special type of autoencoder that work with <strong>probability distributions</strong> instead of fixed points in the latent space. This capability allows VAEs to learn and represent the variability in datasets, such as the different ways the digit "7" might appear in handwritten forms. By learning a mean (μ) and standard deviation (σ), the VAE effectively captures the distribution of the data, making it an essential tool for applications in generative modeling and unsupervised learning.</p>
<h3 id="heading-why-learn-variational-autoencoders">Why Learn Variational Autoencoders?</h3>
<p>VAEs are more than just a stepping stone to understanding image generation. They solve key challenges in dimensionality reduction and data representation. Unlike traditional autoencoders, which focus on compressing data into a fixed latent representation, VAEs leverage probabilistic methods to create smoother and more meaningful latent spaces. This makes them particularly useful for tasks like:</p>
<ul>
<li><p><strong>Image synthesis:</strong> Generating realistic and diverse images.</p>
</li>
<li><p><strong>Data augmentation:</strong> Creating new data samples for training.</p>
</li>
<li><p><strong>Anomaly detection:</strong> Identifying outliers in data distributions.</p>
</li>
</ul>
<h3 id="heading-what-youll-learn-in-this-course">What You'll Learn in This Course</h3>
<p>This comprehensive course begins by introducing the basic concepts of autoencoders, including the <strong>encoder-decoder architecture</strong>. You'll then delve into the differences between standard autoencoders and VAEs, learning why encoding data into probability distributions is a game changer. Key topics covered include:</p>
<ul>
<li><p><strong>Latent space representation:</strong> How VAEs group similar data points into clusters within the latent space.</p>
</li>
<li><p><strong>The reparameterization trick:</strong> Enabling gradient-based optimization by representing random variables in a differentiable way.</p>
</li>
<li><p><strong>Loss functions for VAEs:</strong> Combining reconstruction loss and KL divergence to optimize the model.</p>
</li>
<li><p><strong>Implementation with PyTorch:</strong> Hands-on coding to build and train your own VAE from scratch.</p>
</li>
</ul>
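<p>As an illustration of two of these topics, the sketch below shows one common way to write the reparameterization trick and a VAE loss in PyTorch. It is not the course's code. It assumes the encoder outputs a log-variance instead of σ directly (a widespread convention) and uses a summed mean squared error as the reconstruction term. The function names are hypothetical.</p>

```python
import torch
import torch.nn.functional as F


def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps so gradients can flow back to mu and logvar."""
    std = torch.exp(0.5 * logvar)  # sigma recovered from the log-variance
    eps = torch.randn_like(std)    # the randomness lives in eps, not in mu or sigma
    return mu + eps * std


def vae_loss(recon, target, mu, logvar):
    """Reconstruction loss plus KL divergence to a standard normal prior."""
    recon_loss = F.mse_loss(recon, target, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl


mu = torch.zeros(4, 8, requires_grad=True)      # stand-in for encoder outputs
logvar = torch.zeros(4, 8, requires_grad=True)
z = reparameterize(mu, logvar)

target = torch.zeros(4, 8)
loss = vae_loss(z, target, mu, logvar)  # z stands in for a decoder output here
loss.backward()
print(tuple(z.shape), mu.grad is not None, logvar.grad is not None)
```

<p>Because the sample is written as a deterministic function of μ, σ, and an independent noise term, <code>loss.backward()</code> can compute gradients for both encoder outputs.</p>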
<h3 id="heading-hands-on-implementation">Hands-On Implementation</h3>
<p>The course takes you step by step through implementing a VAE using <strong>PyTorch</strong>, starting with the <strong>encoder</strong> and <strong>decoder</strong> architecture. You’ll learn how to:</p>
<ol>
<li><p>Encode images into a latent representation.</p>
</li>
<li><p>Decode the latent vectors to reconstruct the original images.</p>
</li>
<li><p>Optimize the model using reconstruction loss and KL divergence.</p>
</li>
<li><p>Visualize and interpret the latent space.</p>
</li>
</ol>
<p>You’ll also gain insights into advanced techniques like <strong>self-attention layers</strong> for encoding context and <strong>residual blocks</strong> for efficient neural network training.</p>
<h3 id="heading-conclusion">Conclusion</h3>
<p>Ready to start your journey into generative modeling? Watch the course now on <a target="_blank" href="https://youtu.be/kG9l41Dtuyo">freeCodeCamp.org's YouTube channel</a> and get hands-on with Variational Autoencoders!</p>
<div class="embed-wrapper">
        <iframe width="560" height="315" src="https://www.youtube.com/embed/kG9l41Dtuyo" style="aspect-ratio: 16 / 9; width: 100%; height: auto;" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" loading="lazy"></iframe></div>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ PyTorch vs TensorFlow – Which is Better for Deep Learning Projects? ]]>
                </title>
                <description>
                    <![CDATA[ In this article, we'll look at two popular deep learning libraries — PyTorch and TensorFlow – and see how they compare. If you are getting started with deep learning, the available tools and frameworks will be overwhelming. Industry experts may recom... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/pytorch-vs-tensorflow-for-deep-learning-projects/</link>
                <guid isPermaLink="false">66d0361612c679876b0602e9</guid>
                
                    <category>
                        <![CDATA[ Deep Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Machine Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ TensorFlow ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Manish Shivanandhan ]]>
                </dc:creator>
                <pubDate>Wed, 10 Jan 2024 18:46:30 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2024/01/pytorchvs_cover.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In this article, we'll look at two popular deep learning libraries, PyTorch and TensorFlow, and see how they compare.</p>
<p>If you are getting started with deep learning, the available tools and frameworks can be overwhelming. Industry experts may recommend TensorFlow while hardcore ML engineers may prefer PyTorch.</p>
<p>Both of these frameworks are powerful deep learning tools. TensorFlow is used in Google Search and by Uber, while PyTorch powers OpenAI’s ChatGPT and Tesla's Autopilot.</p>
<p>Choosing between these two frameworks is a common challenge for developers. If you're in this position, this article compares TensorFlow and PyTorch to help you make an informed choice.</p>
<h2 id="heading-understanding-pytorch-and-tensorflow">Understanding PyTorch and TensorFlow</h2>
<p>Let’s start by getting to know our contenders better.</p>
<p><a target="_blank" href="https://pytorch.org/">PyTorch</a>, created by Facebook’s AI Research lab, has gained recognition for its simplicity and user-friendliness. PyTorch can efficiently handle dynamic computational graphs.</p>
<p>A computation graph is a representation of mathematical operations and their relationships. It’s like a flowchart that shows how data flows through the deep learning model.</p>
<p>Training neural networks involves a lot of computation, and computation graphs help computers organize and execute those calculations efficiently.</p>
<p>PyTorch is easy to use, making it a favoured choice among developers and researchers alike. For people who appreciate a straightforward framework for their projects, PyTorch is a perfect choice.</p>
<p><a target="_blank" href="https://www.tensorflow.org/">TensorFlow</a>, Google’s brainchild, has robust production capabilities and support for distributed training. TensorFlow excels in scenarios where you need large-scale machine learning models in real-world applications.</p>
<p>Distributed training is a technique used in deep learning to train large and complex models. It spreads the training process across multiple machines or devices, which is useful when dealing with massive datasets.</p>
<p>TensorFlow is the go-to choice for companies that need scalability and reliability in their deep learning models.</p>
<p>So as you may be able to see, the choice between PyTorch and TensorFlow often depends on the specific needs of a project.</p>
<h2 id="heading-pytorch-vs-tensorflow-which-ones-right-for-you">PyTorch vs TensorFlow – Which One's Right for You?</h2>
<h3 id="heading-ease-of-learning-and-use">Ease of Learning and Use</h3>
<p>When you’re starting a new project, it's helpful to have an easier learning curve. It helps both with building the project and with hiring and training engineers for it.</p>
<p>PyTorch is simpler and has a “Pythonic” way of doing things. It's a favourite for beginners and researchers. And its dynamic computation graph means you can change things on the fly, which is great for experimentation.</p>
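<p>To make "changing things on the fly" concrete, here is a small sketch of a dynamic graph in PyTorch. The <code>forward</code> function and its values are made up for illustration. The point is that ordinary Python control flow decides which operations run, and autograd records whatever actually happened on that call.</p>

```python
import torch


def forward(x, weight):
    # Plain Python control flow picks how many times the weight is applied,
    # so the recorded graph can differ from one call to the next.
    steps = 3 if x.sum() > 0 else 1
    for _ in range(steps):
        x = x * weight
    return x.sum()


weight = torch.tensor(2.0, requires_grad=True)

# Positive input: three multiplications, output = 2 * w**3, gradient = 6 * w**2.
forward(torch.tensor([1.0, 1.0]), weight).backward()
grad_three_steps = weight.grad.item()

# Negative input: one multiplication, output = -2 * w, gradient = -2.
weight.grad = None
forward(torch.tensor([-1.0, -1.0]), weight).backward()
grad_one_step = weight.grad.item()

print(grad_three_steps, grad_one_step)
```

<p>The same function produces two different graphs and two different gradients, with no separate graph-definition step.</p>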
<p>TensorFlow offers a more structured approach. Its static computation graph requires a bit more planning ahead, and it comes with a steeper learning curve. But this structure can lead to more optimized and high-performance models.</p>
<p>TensorFlow 2.0 has also made strides in simplicity. It has incorporated more of PyTorch’s dynamic nature through its <a target="_blank" href="https://towardsdatascience.com/eager-execution-vs-graph-execution-which-is-better-38162ea4dbf6">Eager Execution feature</a>.</p>
<p>But when it comes to simplicity and ease of learning, PyTorch is a clear winner.</p>
<h3 id="heading-performance-and-scalability">Performance and Scalability</h3>
<p>When it comes to performance and scalability, TensorFlow shines. It can handle large-scale, distributed training with ease. So TensorFlow is a go-to choice for production environments.</p>
<p>TensorFlow’s integrated tool, <a target="_blank" href="https://www.tensorflow.org/tensorboard">TensorBoard</a>, is also a powerful tool for visualization and debugging.</p>
<p>PyTorch is catching up, with recent updates improving its support for distributed training and scalability. It provides tools to help you train deep learning models on multiple GPUs and even across multiple machines.</p>
<p>But TensorFlow still holds the lead in deploying large-scale models in production.</p>
<h3 id="heading-community-and-support">Community and Support</h3>
<p>The strength of a framework is also partly defined by its community. As these are open-source frameworks, there is no customer support. So you have to depend on the community for help if you get stuck while building a project using these frameworks.</p>
<p>TensorFlow, being older, has a larger community. It also has a vast array of tutorials, courses, and books.</p>
<p>PyTorch, while younger, has seen rapid growth in its community. PyTorch is a favourite, especially among researchers, since it's easy to use for experimenting with datasets.</p>
<p>Both frameworks have strong support, but TensorFlow’s maturity gives it a slight edge in this area.</p>
<h3 id="heading-flexibility-and-innovation">Flexibility and Innovation</h3>
<p>If you’re working on cutting-edge research or need more flexibility, PyTorch is your best bet. Its dynamic computation graph allows for more creative and complex model architectures.</p>
<p>As I said before, this flexibility makes PyTorch a beloved tool in the research community. Where rapid prototyping and experimentation are key, PyTorch is your best option.</p>
<p>TensorFlow has been working towards adding more flexibility. But it's a difficult battle to win since PyTorch is built for simplicity from the ground up.</p>
<h3 id="heading-industry-adoption">Industry Adoption</h3>
<p><img src="https://miro.medium.com/v2/resize:fit:1050/1*3KA-wtadTjv6H9-LLSu9fw.png" alt="Image" width="600" height="400" loading="lazy">
<em>PyTorch (blue) vs TensorFlow (red)</em></p>
<p>TensorFlow has typically had the upper hand, particularly in large companies and production environments. Its robustness and scalability make it a safe choice for businesses.</p>
<p>But PyTorch is quickly gaining ground. As you can see in the trends chart, PyTorch has already overtaken TensorFlow as the most searched deep learning library. <a target="_blank" href="https://trends.google.com/trends/explore/TIMESERIES/1704798600?hl=en&amp;tz=-330&amp;date=today+5-y&amp;q=%2Fg%2F11gd3905v1%2C%2Fg%2F11bwp1s2k3&amp;sni=3">You can find the live chart here</a>.</p>
<p>Multiple industries are starting to adopt PyTorch for research and development due to its user-friendliness and flexibility. PyTorch has also proved its capability as a production-grade tool after the release of models like ChatGPT.</p>
<p>Here are some products that use TensorFlow and PyTorch.</p>
<h3 id="heading-products-using-tensorflow">Products Using TensorFlow</h3>
<ul>
<li><strong>Google Search and Recommendations</strong>: Google uses TensorFlow to enhance its search engine and recommendation systems. It helps improve search accuracy and provides personalized recommendations based on user behaviour and preferences.</li>
<li><strong>NVIDIA Deep Learning Accelerator (NVDLA)</strong>: NVDLA is a hardware accelerator for deep learning applications. It uses TensorFlow to optimize and deploy models on this hardware.</li>
<li><a target="_blank" href="https://www.uber.com/en-IN/blog/michelangelo-machine-learning-platform/"><strong>Uber’s Michelangelo</strong></a>: Uber uses TensorFlow in its Michelangelo platform for machine learning. It assists in various tasks, including ETA predictions, fraud detection, and dynamic pricing.</li>
</ul>
<h3 id="heading-products-using-pytorch">Products Using PyTorch</h3>
<ul>
<li><strong>Facebook</strong>: Since PyTorch is from Facebook, Facebook uses PyTorch for various internal AI research and applications, including content recommendations and language translation.</li>
<li><strong>Tesla Autopilot</strong>: Tesla’s Autopilot system relies on PyTorch for its deep learning components, such as object detection and navigation.</li>
<li><strong>OpenAI’s GPT Models</strong>: Many of OpenAI’s language models, including GPT-2 and GPT-3, are built using PyTorch. These models are used for a wide range of natural language processing tasks, including text generation and language translation.</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Choosing between PyTorch and TensorFlow depends on your project’s needs.</p>
<p>For those who need ease of use and flexibility, PyTorch is a great choice. If you prefer scalability from the ground up, production deployment, and a mature ecosystem, TensorFlow might be the way to go.</p>
<p>Both frameworks are evolving, so keep an eye on their development. Your choice today might not be your choice tomorrow. Remember, the best tool is the one that suits your project’s needs and not the popular one.</p>
<p>Thanks for coming this far. If you want weekly machine learning tutorials delivered to your inbox, <a target="_blank" href="https://turingtalks.substack.com/"><strong>join my newsletter</strong></a>. To get in touch with me, you can <a target="_blank" href="https://www.linkedin.com/in/manishmshiva/"><strong>connect with me on LinkedIn</strong></a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Learn PyTorch for Deep Learning – Free 26-Hour Course ]]>
                </title>
                <description>
                    <![CDATA[ By Daniel Bourke My comprehensive PyTorch course is now live on the freeCodeCamp.org YouTube channel.  You can view the full 26 hour course here. Read the course materials online for free at learnpytorch.io. See all of the course materials on GitHub... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/learn-pytorch-for-deep-learning-in-day/</link>
                <guid isPermaLink="false">66d84e95f6b5e038a1bde7d5</guid>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Thu, 06 Oct 2022 14:48:39 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2022/10/pytorch24.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Daniel Bourke</p>
<p>My comprehensive PyTorch course is now live on the freeCodeCamp.org YouTube channel. </p>
<ul>
<li>You can <a target="_blank" href="https://youtu.be/V_xro1bcAuA">view the full 26-hour course here</a>.</li>
<li>Read the course materials online for free at <a target="_blank" href="https://learnpytorch.io/">learnpytorch.io</a>.</li>
<li>See all of the <a target="_blank" href="https://github.com/mrdbourke/pytorch-deep-learning">course materials on GitHub</a>.</li>
</ul>
<h3 id="heading-you-can-learn-more-about-the-course-below-the-embedded-video">You can learn more about the course below the embedded video.</h3>
<div class="embed-wrapper">
        <iframe width="560" height="315" src="https://www.youtube.com/embed/V_xro1bcAuA" style="aspect-ratio: 16 / 9; width: 100%; height: auto;" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" loading="lazy"></iframe></div>
<p>The best way to learn is by doing.</p>
<p>And that's just what we'll do in the Learn PyTorch for Deep Learning: Zero to Mastery course.</p>
<p>We'll learn by doing.</p>
<p>Throughout the course, we'll go through many of the most important concepts in machine learning and deep learning by writing PyTorch code.</p>
<p>If you're new to data science and machine learning, consider the course a momentum builder.</p>
<p>By the end, you'll be comfortable navigating the PyTorch documentation, reading PyTorch code, writing PyTorch code, searching for things you don't understand and building your own machine learning projects.</p>
<h2 id="heading-what-is-pytorch">What is PyTorch?</h2>
<p>PyTorch is a machine learning framework written in the Python programming language.</p>
<p>It allows you to write machine learning algorithms capable of turning data into models into intelligence.</p>
<h2 id="heading-why-learn-pytorch">Why Learn PyTorch?</h2>
<p>As of July 2022, <a target="_blank" href="https://paperswithcode.com/trends">58% of machine learning research papers that contain code</a> use PyTorch. And this number has been growing since PyTorch’s release.</p>
<p>In essence, machine learning researchers love PyTorch.</p>
<p>And typically, industry follows research.</p>
<p>So if all of the best machine learning research is coming out in PyTorch, knowing PyTorch is a fantastic way to start working in machine learning.</p>
<h3 id="heading-what-are-the-prerequisites">What are the prerequisites?</h3>
<p><strong>Bad:</strong> "I can't learn it" (that's bulls***).</p>
<p><strong>Good:</strong> Three to six months of experience writing Python code and a willingness to learn (you're more than ready to go).</p>
<p>The course is as beginner-friendly as possible.</p>
<p>So if you've got more than one year's experience with machine learning, you might learn a few things but the materials are designed for beginners.</p>
<h2 id="heading-hows-the-course-taught">How's the Course Taught?</h2>
<p>The focus of the course is code, code, code, experiment, experiment, experiment.</p>
<p>There's a reason two of the course mottos are:</p>
<blockquote>
<p>If in doubt, run the code!</p>
<p>Experiment, experiment, experiment!</p>
</blockquote>
<p>We'll write code together, apprenticeship style.</p>
<p>Meaning in the video version of the course, I'll write PyTorch code and explain it and then you'll follow along by writing the same code.</p>
<p>If we get stuck on something, we'll search for an answer.</p>
<p>You'll notice I leave many of my errors in the videos. This is on purpose.</p>
<p>Because errors happen (often) and being able to troubleshoot them is important.</p>
<p>I'm a big fan of there being <a target="_blank" href="https://sive.rs/kimo">no speed limit</a> to learning something.</p>
<p>So that's what we'll be doing.</p>
<p>Learning by coding.</p>
<p>Learning by experimenting.</p>
<p>Fast.</p>
<h2 id="heading-what-does-this-course-cover">What Does This Course Cover?</h2>
<p>You can view and read all of the materials online for free at <a target="_blank" href="https://learnpytorch.io/">learnpytorch.io</a>.</p>
<p>But let's get specific.</p>
<p>The course is made up of 5 modules (or notebooks), best taken sequentially (but feel free to jump around).</p>
<h3 id="heading-00-pytorch-fundamentalshttpswwwlearnpytorchio00pytorchfundamentals"><a target="_blank" href="https://www.learnpytorch.io/00_pytorch_fundamentals/">00 – PyTorch Fundamentals</a></h3>
<p>We'll start from the ground up.</p>
<p>Answering questions like what is PyTorch (an open-source machine learning framework) and what can PyTorch be used for (manipulating data and writing machine learning algorithms).</p>
<p>Then we'll get familiar with the fundamental building block of deep learning, the tensor.</p>
<p>A tensor is a numerical representation of data (where data can be almost anything: images, text, tables of numbers).</p>
<p>And the whole goal of machine learning is to find patterns in data.</p>
<p>So knowing how to create, interact with and manipulate tensors is paramount.</p>
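<p>As a small taste of what that looks like, here's a minimal sketch of creating and manipulating tensors. The variable names and values are made up for illustration, they're not taken from the course notebooks.</p>

```python
import torch

# A scalar, a vector and a matrix are all tensors with different numbers of dimensions.
scalar = torch.tensor(7)
vector = torch.tensor([1.0, 2.0, 3.0])
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# A stand-in for an RGB image: 3 colour channels, 4 pixels high, 4 pixels wide (random values).
image = torch.rand(3, 4, 4)

print(scalar.ndim, vector.shape, matrix.shape, image.shape)

# Manipulating tensors: element-wise maths and matrix multiplication.
doubled = vector * 2
product = matrix @ matrix
print(doubled)
print(product)
```

<p>If in doubt, run the code and change the shapes to see what happens.</p>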
<p><img src="https://www.mrdbourke.com/content/images/2022/07/00-pytorch-fundamentals.png" alt="learnpytorch.io home page for PyTorch fundamentals section of learn PyTorch for deep learning zero to mastery course" width="600" height="400" loading="lazy">
<em>All of the course materials are available to read in an interactive online book at <a target="_blank" href="https://www.learnpytorch.io/">learnpytorch.io</a></em></p>
<h3 id="heading-01-pytorch-workflowhttpswwwlearnpytorchio01pytorchworkflow"><a target="_blank" href="https://www.learnpytorch.io/01_pytorch_workflow/">01 – PyTorch Workflow</a></h3>
<p>The idea of machine learning is to turn data into intelligence.</p>
<p>And the machine learning model that's able to do that the best is the winner.</p>
<p>So how do you go from data to model to intelligence with PyTorch?</p>
<p>That's what PyTorch Workflow focuses on:</p>
<ol>
<li>Preparing data (turning it into tensors).</li>
<li>Building or picking a pretrained model (to suit your problem).</li>
<li>Fitting the model to the data (or letting the model find patterns in the data).</li>
<li>Evaluating the trained model (after it's learned patterns in data).</li>
<li>Improving the model through experimentation.</li>
<li>Saving and reloading a trained model (so you can export it and use it in applications).</li>
</ol>
<p>We'll use and build upon this workflow throughout the course.</p>
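<p>To make the six steps concrete, here's a rough sketch of the workflow on a toy problem: fitting a straight line. The data, the choice of <code>nn.Linear</code>, the loss function and the hyperparameters are all assumptions picked to keep the example short, not the exact code from the course.</p>

```python
import io

import torch
from torch import nn

torch.manual_seed(42)

# 1. Prepare data: points on the line y = 0.7x + 0.3, stored as tensors.
X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y = 0.7 * X + 0.3
X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]

# 2. Build a model (a single linear layer is enough for a straight line).
model = nn.Linear(in_features=1, out_features=1)

# 3. Fit the model to the training data.
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
for epoch in range(1000):
    model.train()
    loss = loss_fn(model(X_train), y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# 4. Evaluate the trained model on data it has not seen.
model.eval()
with torch.inference_mode():
    test_loss = loss_fn(model(X_test), y_test)
print(f"test loss: {test_loss.item():.6f}")

# 5. Improve through experimentation: try a different lr, more epochs, another model.

# 6. Save and reload the trained weights (an in-memory buffer stands in for a file).
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)
reloaded = nn.Linear(in_features=1, out_features=1)
reloaded.load_state_dict(torch.load(buffer))
```

<p>In a real project you'd pass a file path to <code>torch.save()</code> and <code>torch.load()</code> instead of a buffer.</p>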
<p><img src="https://www.mrdbourke.com/content/images/2022/07/01_a_pytorch_workflow.png" alt="A PyTorch Workflow with six steps from data preparation to saving and reloading a trained model." width="600" height="400" loading="lazy">
<em>The PyTorch WorkFlow we'll cover and build upon throughout the Learn PyTorch for Deep Learning course.</em></p>
<h3 id="heading-02-pytorch-neural-network-classificationhttpswwwlearnpytorchio02pytorchclassification"><a target="_blank" href="https://www.learnpytorch.io/02_pytorch_classification/">02 – PyTorch Neural Network Classification</a></h3>
<p>Neural networks are one of, if not the <em>most</em> powerful kind of machine learning algorithms.</p>
<p>They're what power many of today's most advanced artificial intelligence (AI) systems such as search and self-driving cars.</p>
<p>But can you get a neural network to do something simple like classifying whether a dot is red or blue?</p>
<p>A simple problem, yes, but experimenting with toy problems is one of the best ways to learn machine learning.</p>
<p>In doing so, we'll go through all of the major steps for one of the most common machine learning problems, <strong>classification</strong>: building a neural network to predict if something is one thing or another.</p>
<h3 id="heading-03-pytorch-computer-visionhttpswwwlearnpytorchio03pytorchcomputervision"><a target="_blank" href="https://www.learnpytorch.io/03_pytorch_computer_vision/">03 – PyTorch Computer Vision</a></h3>
<p>Neural networks changed the game of computer vision forever.</p>
<p>And now PyTorch drives many of the latest advancements in computer vision algorithms.</p>
<p>Tesla uses PyTorch to build their computer vision algorithms for their self-driving software.</p>
<p>Apple uses PyTorch to build models that computationally enhance photos taken with the iPhone.</p>
<p>In PyTorch Computer Vision, we'll write PyTorch code to create a neural network capable of seeing patterns in images and classifying them into different categories.</p>
<h3 id="heading-04-pytorch-custom-datasetshttpswwwlearnpytorchio04pytorchcustomdatasets"><a target="_blank" href="https://www.learnpytorch.io/04_pytorch_custom_datasets/">04 – PyTorch Custom Datasets</a></h3>
<p>The magic of machine learning is building algorithms to find patterns in your own custom data.</p>
<p>There are plenty of existing datasets out there, but how do you load your own custom dataset into PyTorch to build models to find patterns in it?</p>
<p>Perhaps you'd like to build a security system for your home and you'd like to teach it what your family looks like so it recognizes them.</p>
<p>Or perhaps you'd like to build an application capable of classifying the different dog photos you take.</p>
<p>That’s exactly what PyTorch Custom Datasets covers. We'll create our own custom dataset with food images of pizza, steak and sushi to start the major project of the course: FoodVision.</p>
<h2 id="heading-cant-i-learn-all-of-this-myself">Can't I Learn All of this Myself?</h2>
<p>Yes.</p>
<p>You can.</p>
<p>There's a reason I'm calling this course the <em>second</em> best place on the internet to learn PyTorch.</p>
<p>Because the best place is the <a target="_blank" href="https://pytorch.org/docs/stable/index.html">PyTorch documentation</a>.</p>
<p>Though documentation can be a little intimidating when you first encounter it.</p>
<p>So this course structures things in a way that's a fun warmup before diving into the documentation.</p>
<h3 id="heading-got-another-question">Got another question?</h3>
<p>Feel free to leave a <a target="_blank" href="https://github.com/mrdbourke/pytorch-deep-learning/discussions">discussion on the course's GitHub repository</a>.</p>
<p>Otherwise, happy machine learning and I'll see you in the course.</p>
<p><em>Let's code!</em></p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Real-World Machine Learning—PyTorch and Monai for Healthcare Imaging ]]>
                </title>
                <description>
                    <![CDATA[ To improve your skills in machine learning and artificial intelligence, it is important to solve real-world problems. What better problem to solve than helping to save people's lives? Machine learning is being used more and more in the field of healt... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/pytorch-and-monai-for-healthcare-imaging/</link>
                <guid isPermaLink="false">66b20635dc300c9dddc01279</guid>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Thu, 06 Jan 2022 16:30:57 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2022/01/healthcareimaging.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>To improve your skills in machine learning and artificial intelligence, it is important to solve real-world problems. What better problem to solve than helping to save people's lives?</p>
<p>Machine learning is being used more and more in the field of healthcare. PyTorch and Monai can be used to discover tumors in livers. </p>
<p>We just published a course on the freeCodeCamp.org YouTube channel that will teach you how to use PyTorch, Monai, and Python for 3D liver segmentation. You will use machine learning and computer vision to find tumors in livers.</p>
<p>Mohammed El Amine MOKHTARI developed this course. He is a computer vision Ph.D. student and online content creator.</p>
<p>Here are the sections in this course:</p>
<ul>
<li>What is U-Net</li>
<li>Software Installation</li>
<li>Finding the Datasets</li>
<li>Preparing the Data</li>
<li>Installing the Packages</li>
<li>Preprocessing</li>
<li>Errors you May Face</li>
<li>Dice Loss</li>
<li>Weighted Cross Entropy</li>
<li>The Training Part</li>
<li>The Testing Part</li>
<li>Using the GitHub Repository</li>
</ul>
<p>Watch the full course below or <a target="_blank" href="https://youtu.be/M3ZWfamWrBM">on the freeCodeCamp.org YouTube channel</a> (5-hour watch).</p>
<div class="embed-wrapper">
        <iframe width="560" height="315" src="https://www.youtube.com/embed/M3ZWfamWrBM" style="aspect-ratio: 16 / 9; width: 100%; height: auto;" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" loading="lazy"></iframe></div>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Deep Learning Tutorial – How to Use PyTorch and Transfer Learning to Diagnose COVID-19 Patients ]]>
                </title>
                <description>
                    <![CDATA[ Ever since the outbreak of COVID-19 in December 2019, researchers in the field of artificial intelligence and machine learning have been trying to find better ways to diagnose the disease. They've worked on developing algorithms that would detect the... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/deep-learning-with-pytorch/</link>
                <guid isPermaLink="false">66d039da64be048ac359a35b</guid>
                
                    <category>
                        <![CDATA[ Deep Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ neural networks ]]>
                    </category>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Transfer learning ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Juan Cruz Martinez ]]>
                </dc:creator>
                <pubDate>Wed, 03 Nov 2021 19:49:35 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2021/11/Featured-Orange.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Ever since the outbreak of COVID-19 in December 2019, researchers in the field of artificial intelligence and machine learning have been trying to find better ways to diagnose the disease.</p>
<p>They've worked on developing algorithms that would detect the disease within a matter of seconds – and only by looking at chest X-rays and/or CT scan images. </p>
<p>Some of these techniques have proven to be extremely useful and accurate in diagnosing COVID-19 cases.</p>
<p>There are multiple approaches that use both machine and deep learning to detect and/or classify the disease. Researchers have also proposed newly developed architectures along with transfer learning approaches. </p>
<p>In this article, we will look at a transfer learning approach that classifies COVID-19 cases using chest X-ray images. </p>
<p>The model we are going to use is one of the eight variants (B0 to B7) of the EfficientNet architecture. We will use a model pre-trained on the immense ImageNet dataset. EfficientNet is an advanced and complex convolutional neural network-based architecture. </p>
<p>We will further investigate the details of Convolutional Neural Networks, pre-trained models, and EfficientNet during the course of this article. I've divided it into five parts:</p>
<ol>
<li>What are convolutional neural networks?</li>
<li>A dive into transfer learning.</li>
<li>What is EfficientNet?</li>
<li>An introduction to PyTorch.</li>
<li>Implementation of COVID-19 classifier using EfficientNet with PyTorch.</li>
</ol>
<p>This tutorial assumes that you have prior knowledge of both machine learning and deep learning. If you want to further develop your foundation in these topics, check out this article on <a target="_blank" href="https://livecodestream.dev/post/artificial-intelligence-vs-machine-learning-vs-deep-learning/">Artificial Intelligence vs Machine Learning vs Deep Learning</a>.</p>
<p>Also, although the dataset we'll work with here is COVID-related, you can apply the actual code implementation and analysis to other datasets.</p>
<h2 id="heading-what-is-a-convolutional-neural-network">What is a Convolutional Neural Network?</h2>
<p>Convolutional neural networks (CNNs) are a type of deep neural network that works on visual data – that is, images. A CNN takes an image as an input and performs two or three-dimensional convolutional operations on the image with several filters, also referred to as kernels. </p>
<p>The filters hold the learnable weights (and biases) of the layer. Each convolution operation outputs a 2D or 3D matrix that captures spatial information about the input image. This output matrix is referred to as the feature map of the image.</p>
<p>Processing a convolutional neural network in the training process can be, in some cases, extremely slow. This is why it's a good idea to use GPUs and TPUs during training for deep learning techniques, especially convolutional neural networks.</p>
<p>Convolutional neural networks learn spatial and temporal information about the image far better than the basic feed forward neural network. Also, CNNs can reduce the size of the image while retaining the most important information in the image, which is crucial for predictive analysis of images.</p>
<p><img src="https://lh6.googleusercontent.com/vma10ZOrxzyEEbJVvIZuygeDyqlkAKEUxWkJ8of7spwvrA9zktP1FYJQWZC6ZhMqrP2V0gMh04nqb74gNGNM3eO_g1ZwuvI753j-oS7fN_E0Txn4T3TXTW65MG3ubi67pBcX19o" alt="Deep Learning – Introduction to Convolutional Neural Networks - Vinod  Sharma's Blog" width="600" height="400" loading="lazy">
<em><a target="_blank" href="https://i0.wp.com/vinodsblog.com/wp-content/uploads/2018/10/CNN-2.png?resize=1300%2C479&amp;ssl=1">Source</a></em></p>
<p>The starting layers of convolutional neural networks learn the abstract and simpler features in an image, such as lines and edges. But as we move deeper into the network, the feature map turns to the more complex structures in the image. </p>
<p>It starts to learn the more specific features of the image, such as a cat, a dog, or a person, the same way we would, as humans, perceive the world around us. This is a core concept in modern deep learning-based computer vision. </p>
<p>Now before we move on to advanced concepts, it is important to learn the basics of 2D convolution.</p>
<h2 id="heading-what-is-2d-convolution">What is 2D Convolution?</h2>
<p>2D convolution is a bit complex to explain, but here goes: if the convolutional process (which is extensively used in <a target="_blank" href="https://www.tutorialspoint.com/signals_and_systems/convolution_and_correlation.htm">1-D signal processing</a>) is performed between two signals – but not just along a single dimension, rather along two mutually perpendicular dimensions – it is called 2D convolution. </p>
<p>In the case of images, the two mutually perpendicular dimensions are the rows and columns of a greyscale image. The convolutional operation is mathematically done by multiplying and then accumulating the values of the overlapping samples of the two input signals, where one of the signals is flipped. The output of this multiplication and accumulation gives a single point on the feature map.</p>
<p>In the case of CNNs, the image is one signal and the filter/kernel is the second signal which is flipped. The size of the kernel is always smaller than that of the image. </p>
<p>The flipped kernel is then swept across the whole image both row by row and column by column to output the feature map.</p>
<p><img src="https://lh3.googleusercontent.com/p5ht8HdKUxxCwcNoas2qAusdT8dYq_XzLS2YqVORYqb0cCnXPPAlPu40Z73kVEXerQ5s6epDozQdYRsleeUncnSV4Opx2Q1CNk8wseTdXEPz8eHt5dJ0R2TSFnnhRRZzjO7xH4A" alt="https://miro.medium.com/max/700/1*kOThnLR8Fge_AJcHrkR3dg.gif" width="600" height="400" loading="lazy">
<em>2d convolution</em></p>
<p>Here a 3x3 kernel is swept across a 6x6 image to output a 4x4 feature map. As you can see, the dimensions of the output feature map are smaller than the input image. So there are a few concepts used in convolution to control the dimensions of the output feature map. These include padding, stride, and kernel size.</p>
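<p>The sweep described above can be sketched in plain Python. This is an illustrative implementation that assumes a square image, a square kernel, no padding, and a stride of 1. The function name <code>conv2d</code> and the sample values are made up for this example.</p>

```python
def conv2d(image, kernel):
    """2D convolution of a square image with a square kernel (no padding, stride 1)."""
    k = len(kernel)
    # Flip the kernel along both rows and columns, as described in the text.
    flipped = [row[::-1] for row in kernel[::-1]]
    out_size = len(image) - k + 1
    feature_map = []
    for i in range(out_size):
        out_row = []
        for j in range(out_size):
            # Multiply the overlapping values and accumulate them into one output point.
            total = 0
            for a in range(k):
                for b in range(k):
                    total += image[i + a][j + b] * flipped[a][b]
            out_row.append(total)
        feature_map.append(out_row)
    return feature_map


# A 6x6 "image" whose pixel values are simply 0..35, and a 3x3 identity kernel.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

feature_map = conv2d(image, kernel)
for row in feature_map:
    print(row)
```

<p>The result has 4 rows and 4 columns, matching the 6x6 to 4x4 example. Note that deep learning libraries, PyTorch included, typically skip the kernel flip and compute a cross-correlation instead. Since the kernel values are learned, this makes no practical difference.</p>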
<p><strong>Padding</strong> is the manual addition of rows and columns around the input to keep the output dimension the same as the input dimension or vary it. </p>
<p><strong>Stride</strong> refers to the jump the kernel takes during the sweep, both in columns and rows. In the example above, the stride of the convolution is 1 as the kernel is moving one unit in both rows and columns. </p>
<p><strong>Kernel size</strong> refers to the dimensions of the kernel used. Changing the dimensions of the kernel to be swept changes the output size of the feature map. </p>
<p>The image below describes the convolution with the same kernel size but with a padding of 1 and stride of 2.</p>
<p><img src="https://lh3.googleusercontent.com/ceNWhzTPHzqGi5wMyUrqCSS2kp6-mF75BHxlNaEnGVwrsIiGamEq4pm_Mndmaz0weJnZfgOnl7L0CPy1OF19lRyRTAkDWZEzREBr8H36_mW_6bqJ-P8XzuJqTbzwNvPKXd_7N9U" alt="https://miro.medium.com/max/395/1*1VJDP6qDY9-ExTuQVEOlVg.gif" width="600" height="400" loading="lazy"></p>
<p>The equation that describes the relationship of stride, padding, and kernel size to input and output dimensions is as follows:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2021/11/image-2.png" alt="Image" width="600" height="400" loading="lazy"></p>
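<p>In code, the commonly used form of this relationship is <code>output = floor((input + 2 * padding - kernel) / stride) + 1</code>. The helper below is a small sketch of that formula. The function name is an assumption, and the notation may differ slightly from the image above.</p>

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Size of one output dimension of a convolution."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# The 6x6 image and 3x3 kernel from the first animation (no padding, stride 1).
print(conv_output_size(6, 3))

# Padding of 1 keeps the output the same size as the input for a 3x3 kernel.
print(conv_output_size(6, 3, padding=1))

# Padding of 1 and stride of 2 roughly halves the output size.
print(conv_output_size(6, 3, padding=1, stride=2))
```

<p>The same formula is applied separately to the rows and the columns when the image or the kernel is not square.</p>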
<p>The concept of 3D convolution is just an extension of 2D convolution where both the input image and the kernel are three-dimensional. </p>
<p>Like 2D convolution, we sweep the three-dimensional kernel across the whole image in two mutually perpendicular dimensions, namely the rows and the columns. </p>
<p>We do not usually sweep the kernel across the color channels because the kernel has the same third dimension, that is the channel length, as the original image. This gives an output feature map that is two-dimensional instead of three. </p>
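<p>We can check this behaviour with a quick sketch in PyTorch. The random tensor below stands in for a 6x6 RGB image, and the sizes are picked only for illustration.</p>

```python
import torch
from torch import nn

# One RGB image: (batch, channels, height, width).
rgb_image = torch.rand(1, 3, 6, 6)

# A single 3x3 filter. Its kernel automatically spans all 3 input channels.
conv = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3)
feature_map = conv(rgb_image)

print(conv.weight.shape)   # (out_channels, in_channels, kernel height, kernel width)
print(feature_map.shape)   # the channel dimension has collapsed to 1
```

<p>The kernel has a depth of 3 to match the image, and the feature map it produces is a single 4x4 channel.</p>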
<p>To learn more about the details of 3D convolution, you can read <a target="_blank" href="https://paperswithcode.com/method/3d-convolution">this article</a>.</p>
<h2 id="heading-what-is-transfer-learning">What is Transfer Learning?</h2>
<p>In transfer learning, you take a machine or deep learning model that is pre-trained on a previous dataset and use it to solve a different problem without needing to re-train the whole model. </p>
<p>Instead, you can just use the weights and biases of the pre-trained model to make a prediction. You transfer the weights from one model to your own model and adjust them to your own dataset without re-training all the previous layers of the architecture.</p>
<p>We use transfer learning in the applications of convolutional neural networks and natural language processing because it decreases the computation time and complexity of the training process. And, in many cases, it performs surprisingly well. </p>
<p>This also helps in cases where we have limited data available – since neural networks demand an extremely large amount of data to achieve good performance.</p>
<p>This means that using transfer learning methods can greatly reduce the demand for data since the weights and biases are pre-adjusted and are able to work better with just a small amount of data by tweaking the weights and biases a little.</p>
<p>But transfer learning models do not always give you great performance (although the newer architectures perform efficiently on almost every problem). Still, sometimes the problem at hand needs an architecture that is pre-trained on data that's similar to what you have. This factor depends upon the complexity of the problem you are trying to solve. </p>
<p>There are a couple ways you can perform transfer learning:</p>
<ol>
<li><strong>Using a pre-trained model.</strong></li>
<li><strong>Developing a new model.</strong></li>
</ol>
<p>You can use a pre-trained model in two ways. First, you can use the pre-trained weights and biases as initial parameters for your own model, and then train a whole convolutional model using those weights. </p>
<p>The other way is to perform feature extraction from the pre-trained model. You use the parameters of the pre-trained model to extract features from your input image and just train a simple classifier on top of it.</p>
<p>Another option is that if you have a problem with a small amount of data, you develop another model for a similar problem that has a large amount of data and train the model. Then you can use the trained weights from the new model to solve the original problem with less data. </p>
<p>In this tutorial, we will be using a pre-trained model as a feature extractor and we'll train a simple classifier on top of it to output the prediction.</p>
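<p>Here is a minimal sketch of the feature-extractor idea. To keep it self-contained, a tiny randomly initialized network stands in for the pre-trained model. It is a hypothetical placeholder, not EfficientNet, and the layer sizes are arbitrary.</p>

```python
import torch
from torch import nn

# Hypothetical stand-in for a pre-trained backbone (in practice, load real pre-trained weights).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze the backbone so its weights are used as-is and never updated.
for param in backbone.parameters():
    param.requires_grad = False

# A simple classifier trained on top of the extracted features (2 classes).
classifier = nn.Linear(8, 2)
model = nn.Sequential(backbone, classifier)

# Only the classifier's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)

# One training step on a random batch of 4 images.
images = torch.rand(4, 3, 32, 32)
labels = torch.tensor([0, 1, 0, 1])
loss = nn.functional.cross_entropy(model(images), labels)
loss.backward()
optimizer.step()

print(all(p.grad is None for p in backbone.parameters()))  # frozen layers get no gradients
print(classifier.weight.grad is not None)                  # the classifier does
```

<p>For the first approach (using the pre-trained weights only as a starting point), you would leave <code>requires_grad</code> set to <code>True</code> and pass all of the model's parameters to the optimizer.</p>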
<p>There are many well-known architectures in the field of deep learning that are nowadays used for the purpose of transfer learning. Almost all of these are trained on the ImageNet dataset, one of the largest open-source image datasets available. The full dataset has around fourteen million images, and the subset most commonly used for pre-training contains 1000 classes. </p>
<p>Among well-known CNN architectures, LeNet was the first to be proposed, in 1998. Other well-known models include VGG, ResNet, AlexNet, GoogleNet, Inception, and Xception. </p>
<p>EfficientNet is also part of the series that was proposed recently, in 2019.</p>
<h2 id="heading-what-is-efficientnet">What is EfficientNet?</h2>
<p>EfficientNet (or perhaps it's better to say EfficientNets) is a family of convolutional neural network-based image classification models. They achieved state-of-the-art accuracy on the ImageNet dataset and perform well on other popular datasets such as CIFAR-100 and Flowers. </p>
<p>In addition to performing so well, the architecture is smaller and faster than earlier models of comparable accuracy. The architecture has variants ranging from EfficientNet-B0 up to EfficientNet-B7.</p>
<p>The variants are obtained with the compound scaling method, which scales up the B0 baseline to produce B1 to B7. EfficientNet-B7 reached a Top-1 accuracy of 84.4% on the ImageNet dataset, which was the highest Top-1 accuracy reported on ImageNet at the time the paper was published. </p>
<p>If you want to learn more about how EfficientNets work, you can read this paper ‘<a target="_blank" href="https://arxiv.org/abs/1905.11946v5">EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks.</a>’</p>
<p><img src="https://lh3.googleusercontent.com/FvX6r1u1vR9kfoSb7tJbQ5I7aDgGQNhZCtU_OTGkHpOLTX3ZZnc-zIc-AO1MLaE-eLCsyfaj_grRXAJapYb9pJqhbzwH5R0qcXAxGUWIsHqm9zvDy6h4EQB63GOwaFZP1fV43mk" alt="Image" width="600" height="400" loading="lazy">
<em><a target="_blank" href="https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet">Source</a></em></p>
<p>In the coding tutorial further along in this article, we'll be using the EfficientNet-B0 as a feature extractor and a classifier on top of it to classify COVID-19 using chest x-ray images.</p>
<h2 id="heading-an-introduction-to-pytorch">An Introduction to PyTorch</h2>
<p>PyTorch is a Python library that helps us build deep learning models. Compared with Keras (another deep learning library), PyTorch is more flexible and gives the developer more control. </p>
<p>It is similar to NumPy in how it processes arrays, but it adds GPU acceleration. To learn more about NumPy and its features, you can check out <a target="_blank" href="https://www.freecodecamp.org/news/the-ultimate-guide-to-the-numpy-scientific-computing-library-for-python/">this in-depth guide</a> along with its <a target="_blank" href="https://numpy.org/doc/stable/user/whatisnumpy.html">documentation</a>.</p>
<p>PyTorch has a data structure known as a ‘Tensor’ that is similar to the NumPy ndarray but it has the option to operate on GPU. </p>
<p>PyTorch provides an uncomplicated way to switch computation between a CPU and a GPU. It also supports processing on NumPy arrays by simply providing a built-in module that can convert NumPy arrays into Tensors and vice versa.</p>
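<p>As a quick sketch of both ideas, the snippet below converts a NumPy array into a tensor and back, then moves the tensor to a GPU if one is available. The variable names are chosen for illustration only.</p>

```python
import numpy as np
import torch

# NumPy array to tensor (the tensor shares memory with the array).
array = np.array([1.0, 2.0, 3.0])
tensor = torch.from_numpy(array)

# Tensor back to a NumPy array.
back_to_numpy = tensor.numpy()

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
tensor_on_device = tensor.to(device)

# Compute on the chosen device, then bring the result back to the CPU.
result = (tensor_on_device * 2).cpu()
print(device, result)
```

<p>The same code runs unchanged on machines with or without a GPU, which is why the <code>device</code> pattern shows up in so many PyTorch scripts.</p>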
<p>One of the handiest features in PyTorch is automatic differentiation, provided by <code>torch.autograd</code>. It records the operations applied to a tensor during the forward pass so that gradients can be computed for you, without needing to manually derive and store them. </p>
<p>This gives you greater control of your deep learning operations, specifically back propagation, during the training process. The gradients of the loss function are what let you adjust the parameters of a model. </p>
<p>We can also exclude a tensor from gradient computation by setting its <code>requires_grad</code> attribute to <code>False</code>. To learn more about tensors and how to perform gradient computations in PyTorch, you can <a target="_blank" href="https://www.freecodecamp.org/news/pytorch-tensor-methods/">check out this tutorial</a> and <a target="_blank" href="https://www.freecodecamp.org/news/pytorch-full-course/">this course</a>.</p>
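<p>The gradient tracking described above can be sketched with two scalar tensors. The values and the toy loss are made up so that the derivative is easy to check by hand.</p>

```python
import torch

# w is a parameter we want gradients for, x is plain input data.
w = torch.tensor(3.0, requires_grad=True)
x = torch.tensor(2.0)  # requires_grad is False by default

# A toy loss: (w * x - 1)^2
loss = (w * x - 1.0) ** 2

# Back propagation: PyTorch fills in w.grad for us.
loss.backward()

# By hand: d(loss)/dw = 2 * (w * x - 1) * x = 2 * 5 * 2 = 20
print(w.grad)
print(x.grad)  # None, because x does not track gradients

# Gradient tracking can also be switched off for a whole block of code.
with torch.no_grad():
    y = w * x
print(y.requires_grad)
```

<p>Switching off gradient tracking like this is common during evaluation, where no parameters are being updated.</p>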
<h2 id="heading-how-to-implement-a-covid-19-classifier-using-efficientnet-with-pytorch">How to Implement a COVID-19 Classifier using EfficientNet with PyTorch</h2>
<p>Now let's move on to the practical implementation of EfficientNet in PyTorch. We will use the B0 variant of the EfficientNet family.</p>
<p>First, we'll examine the data and preprocess it. <a target="_blank" href="https://www.kaggle.com">Kaggle</a> has a vast library of datasets available for open-source use in projects and research. There are no limits on which dataset you can use for this project. You can use any dataset containing chest X-ray images of COVID-19 patients and people without COVID. </p>
<p>For the sake of this tutorial, we'll use this dataset <a target="_blank" href="https://www.kaggle.com/asraf047/covid19-pneumonia-normal-chest-xray-pa-dataset">here</a>. But for the code to work on your custom dataset, you must divide your data into three directories: train, test, and valid. </p>
<p>Each directory should contain two more directories with the labels <code>covid</code> and <code>normal</code>. These <code>covid</code> and <code>normal</code> folders will contain the images corresponding to the specific class of the directory they are present in.</p>
<p><img src="https://lh5.googleusercontent.com/aaZIPn8TEUsfqo3rA7xtJf7T-3PMSRU_jSZZ60DCeloIyadr40u1oguQycDMDeL-puqjdZ40xEGIu8i_PYdpufi_o-8pcGTlarJ37A_KJm_R0lV4mwGFKPAIhQmKd3Lr7b6dNHM" alt="Image" width="600" height="400" loading="lazy"></p>
<p>The original dataset we'll use in this article contains three folders: covid, normal, and pneumonia. We discard the pneumonia folder completely and divide the other data in the same way described above. </p>
<p>We do this to create a logical division between the data used for training and the data used for testing and validation. Also, PyTorch (through torchvision's <code>ImageFolder</code> dataset class) takes the name of the folder an image is stored in as the label of its class – so we do not need a separate label file for the input dataset.</p>
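<p>If you are preparing your own dataset, a short script can create and check this folder layout. The sketch below uses only the standard library and a temporary directory. The helper names <code>make_layout</code> and <code>missing_folders</code> are hypothetical, and you would point them at your real data folder instead.</p>

```python
import tempfile
from pathlib import Path

SPLITS = ("train", "test", "valid")
CLASSES = ("covid", "normal")


def make_layout(root):
    """Create the split/class folders expected by the tutorial."""
    for split in SPLITS:
        for label in CLASSES:
            (Path(root) / split / label).mkdir(parents=True, exist_ok=True)


def missing_folders(root):
    """Return the split/class folders that do not exist under root."""
    return [
        f"{split}/{label}"
        for split in SPLITS
        for label in CLASSES
        if not (Path(root) / split / label).is_dir()
    ]


with tempfile.TemporaryDirectory() as root:
    make_layout(root)
    problems = missing_folders(root)
    # Folder names inside a split double as the class names.
    class_names = sorted(p.name for p in (Path(root) / "train").iterdir())

print(problems, class_names)
```

<p>Folder-based loaders such as torchvision's <code>ImageFolder</code> assign class indices in sorted order of the folder names, so here <code>covid</code> would map to 0 and <code>normal</code> to 1.</p>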
<h3 id="heading-the-data-and-the-architecture">The data and the architecture</h3>
<p>Let's have a look at the data. Below we can see the x-ray images of patients with COVID-19:</p>
<p><img src="https://lh4.googleusercontent.com/cB8kT-qcFsIqly9wi2yHiDZpD3of9wOgr7j9XggMWC0Yehva5H1QHiGmLq1g-qIz5wyk_6Kdy_roJiyTxUNFtPmGr6-0BKLy5KscJesZddQUGpKSDn8ZH5cRqDTWeSXswCxH8W8" alt="Image" width="600" height="400" loading="lazy"></p>
<p>And here we can see the normal category’s x-ray images:</p>
<p><img src="https://lh4.googleusercontent.com/CRu82skVkh6fIaLuSD5ucOyjhjCk9o_j6ZO0zQLw8J4_UKk5nSJhxfiEtdwhmSCFVakoG0RLSwr6IL7b-ij30thBD_S6WYumx6XUYLSMkPdHfjvxzAfuwF_MaoUG89VmFGXUa9Y" alt="Image" width="600" height="400" loading="lazy"></p>
<p>There are 237 total layers in the B0 architecture. The whole architecture can be condensed into the following diagrams. We provide the x-ray data to the input layer.</p>
<p><img src="https://lh3.googleusercontent.com/de9n3HWqb4kqVLV4VkPiCphCbfSDDSmKFXu826ITg1Z-LkWaB28JCkzfVlHaOVSrBHbSToDe5k45-bSGwUpQLglgoa4ai_YhhYAe9_th6pJIKts64kzbhgNS3GihARgRscJABlw" alt="Image" width="600" height="400" loading="lazy"></p>
<p><img src="https://lh5.googleusercontent.com/rCTjM83oPyAi-RddlHJufeDAql0ee_ExJmxqTbL7BgPk6unoZXmL5cabb0zuDrM7EBdDupxE1YXOmRCQt5Ntyn2gZYpzdEDb7kI0ea3BifBZp3q1MBYkVzxV9N4Mwd-882ciO7o" alt="Image" width="600" height="400" loading="lazy"></p>
<p><img src="https://lh4.googleusercontent.com/sZ34-xflacMLYBg33trm8RxJypHPxRqAHWtt_dm8fEdwhW1eFV0eEL66g8Yr8GcX8mo_6Sz4N6PkL7M_UbhG7S5n1eU5dpyrKZoJL7ROQ8TQLJjh_Nm4vokmtwi-4pOfCMzFHRk" alt="Image" width="600" height="400" loading="lazy">
<em><a target="_blank" href="https://towardsdatascience.com/complete-architectural-details-of-all-efficientnet-models-5fd5b736142">Source</a></em></p>
<p>We will freeze the learning of the weights across all these blocks as we will be using the pre-trained weights to extract the features from our own input. </p>
<p>We'll do the feature extraction after the input passes Module 7. We then transfer the feature map obtained from Module 7 to our own final classification layers (this is why it's called transfer learning). We top the architecture with the following top layers:</p>
<ul>
<li>BatchNorm1d()</li>
<li>Linear(output neurons = 512)</li>
<li>ReLU()</li>
<li>BatchNorm1d()</li>
<li>Linear(output neurons = 128)</li>
<li>ReLU()</li>
<li>BatchNorm1d()</li>
<li>Dropout(probability of zeroing an activation = 0.4)</li>
<li>Linear(output neurons = 2)</li>
</ul>
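<p>As a rough sketch, these top layers could be stacked with <code>nn.Sequential</code> as shown below. The input size of 1280 is an assumption based on the size of the feature vector EfficientNet-B0 produces before its final layer, and the variable names are made up for this example.</p>

```python
import torch
from torch import nn

NUM_FEATURES = 1280  # assumption: size of the EfficientNet-B0 feature vector

classifier_head = nn.Sequential(
    nn.BatchNorm1d(NUM_FEATURES),
    nn.Linear(NUM_FEATURES, 512),
    nn.ReLU(),
    nn.BatchNorm1d(512),
    nn.Linear(512, 128),
    nn.ReLU(),
    nn.BatchNorm1d(128),
    nn.Dropout(p=0.4),
    nn.Linear(128, 2),  # two outputs: covid and normal
)

# A random batch of 4 feature vectors stands in for the extracted features.
features = torch.rand(4, NUM_FEATURES)
logits = classifier_head(features)
print(logits.shape)
```

<p>Each row of the output holds two raw scores, one per class, which a loss function such as cross entropy can consume directly.</p>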
<h3 id="heading-lets-head-over-to-the-code">Let's head over to the code</h3>
<p>Now before we start the code, there are a couple of dependencies we need to install. First, you'll need to install PyTorch on your local machine. You can do this using the pip install command in your Python environment. Refer <a target="_blank" href="https://pytorch.org/get-started/locally/">here</a> to install it depending on your machine (whether it has GPU available or not).</p>
<p>Before you move on to the code, I strongly recommend that you actually work through the code yourself. This makes it much easier to understand. With that said, you can access the full code in a Jupyter notebook <a target="_blank" href="https://drive.google.com/file/d/1m_ATQIrNN-dVVZwZjux5305yhuseZ58R/view?usp=sharing">here</a>.</p>
<p>You also need to install EfficientNet support for PyTorch into the same Python environment. Run the command below to install it:</p>
<pre><code class="lang-bash">pip install efficientnet_pytorch
</code></pre>
<p>Now we start building the classification model. To start, we import all the necessary modules at the top of the code:</p>
<pre><code class="lang-python"><span class="hljs-comment">#importing required modules</span>
<span class="hljs-keyword">import</span> gdown
<span class="hljs-keyword">import</span> zipfile
<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">from</span> glob <span class="hljs-keyword">import</span> glob
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
<span class="hljs-keyword">import</span> torch
<span class="hljs-keyword">import</span> torch.nn <span class="hljs-keyword">as</span> nn
<span class="hljs-keyword">from</span> torchsummary <span class="hljs-keyword">import</span> summary
<span class="hljs-keyword">from</span> torchvision <span class="hljs-keyword">import</span> datasets, transforms <span class="hljs-keyword">as</span> T
<span class="hljs-keyword">from</span> efficientnet_pytorch <span class="hljs-keyword">import</span> EfficientNet
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> torch.optim <span class="hljs-keyword">as</span> optim
<span class="hljs-keyword">from</span> PIL <span class="hljs-keyword">import</span> ImageFile
<span class="hljs-keyword">from</span> sklearn.metrics <span class="hljs-keyword">import</span> accuracy_score
</code></pre>
<p>All these modules are essential to perform multiple functions across the model. You can install all the absent modules using the pip command. </p>
<p>Then we download and extract the data we prepared for the model:</p>
<pre><code class="lang-python"><span class="hljs-comment">#importing data</span>
<span class="hljs-comment">#Dataset address</span>
url = <span class="hljs-string">'https://drive.google.com/uc?export=download&amp;id=1B75cOYH7VCaiqdeQYvMuUuy_Mn_5tPMY'</span>
output = <span class="hljs-string">'data.zip'</span>
gdown.download(url, output, quiet=<span class="hljs-literal">False</span>)
<span class="hljs-comment">#giving zip file name</span>
data_dir=<span class="hljs-string">'./data.zip'</span>
<span class="hljs-comment">#Extracting data from zip file</span>
<span class="hljs-keyword">with</span> zipfile.ZipFile(data_dir, <span class="hljs-string">'r'</span>) <span class="hljs-keyword">as</span> zf:
    zf.extractall(<span class="hljs-string">'./data/'</span>)
</code></pre>
<p>The <code>gdown.download</code> function downloads the archive from the URL provided, and <code>ZipFile.extractall</code> extracts it into a <code>data</code> folder inside your current working directory (or the current runtime if you are working on Google Colab). </p>
<p>I highly recommend working on Google Colab for this project in case you do not locally have a GPU available. </p>
<p>Next, create a check variable to check the availability of a GPU.</p>
<pre><code class="lang-python"><span class="hljs-comment">#Checking the availability of a GPU</span>
use_cuda = torch.cuda.is_available()
</code></pre>
<p>This function returns <code>True</code> if a GPU is available and <code>False</code> if not.</p>
<p>Next, we need to apply pre-processing techniques to the data. Since our data is pre-augmented, we do not need to apply many pre-processing techniques to it. We only resize all the images to a single size of (224,224). We do this because the images in our dataset are all of different dimensions and we need a consistent dimension for the model. </p>
<p>We'll also convert the images to tensors so that PyTorch can process them, and then we normalize them. <code>T.Normalize</code> subtracts a mean of 0.5 from each channel and divides by a standard deviation of 0.5, which maps pixel values from the range [0, 1] to the range [-1, 1]. </p>
<p>After that, we create the locations for the train, test and validation sets, which will be given as input to the <code>datasets</code> module. We do this so that PyTorch knows exactly where the data is located and can load it in batches, which we later move to the GPU. We keep a batch size of 32.</p>
<pre><code class="lang-python"><span class="hljs-comment">#declaring batch size</span>
batch_size = <span class="hljs-number">32</span>

<span class="hljs-comment">#applying required transformations on the dataset</span>
img_transforms = {
    <span class="hljs-string">'train'</span>:
    T.Compose([
        T.Resize(size=(<span class="hljs-number">224</span>,<span class="hljs-number">224</span>)), 
        T.ToTensor(),
        T.Normalize([<span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>], [<span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>]), 
        ]),

    <span class="hljs-string">'valid'</span>:
    T.Compose([
        T.Resize(size=(<span class="hljs-number">224</span>,<span class="hljs-number">224</span>)),
        T.ToTensor(),
        T.Normalize([<span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>], [<span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>])
        ]),

    <span class="hljs-string">'test'</span>:
    T.Compose([
        T.Resize(size=(<span class="hljs-number">224</span>,<span class="hljs-number">224</span>)),
        T.ToTensor(),
        T.Normalize([<span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>], [<span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>, <span class="hljs-number">0.5</span>])
        ]),
     }

<span class="hljs-comment"># creating Location of data: train, validation, test</span>
data=<span class="hljs-string">'./data/'</span>

train_path=os.path.join(data,<span class="hljs-string">'train'</span>)
valid_path=os.path.join(data,<span class="hljs-string">'valid'</span>)
test_path=os.path.join(data,<span class="hljs-string">'test'</span>)


<span class="hljs-comment"># creating a dataset for each of the folders defined above</span>
train_file=datasets.ImageFolder(train_path,transform=img_transforms[<span class="hljs-string">'train'</span>])
valid_file=datasets.ImageFolder(valid_path,transform=img_transforms[<span class="hljs-string">'valid'</span>])
test_file=datasets.ImageFolder(test_path,transform=img_transforms[<span class="hljs-string">'test'</span>])


<span class="hljs-comment">#Creating loaders for the dataset</span>
loaders_transfer={
    <span class="hljs-string">'train'</span>:torch.utils.data.DataLoader(train_file,batch_size,shuffle=<span class="hljs-literal">True</span>),
    <span class="hljs-string">'valid'</span>:torch.utils.data.DataLoader(valid_file,batch_size,shuffle=<span class="hljs-literal">True</span>),
    <span class="hljs-string">'test'</span>: torch.utils.data.DataLoader(test_file,batch_size,shuffle=<span class="hljs-literal">True</span>)
}
</code></pre>
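<p>To make the effect of <code>T.Normalize</code> concrete, here is a minimal sketch of the same arithmetic applied by hand. The tiny 2x2 image is made up for illustration and is not part of the dataset.</p>

```python
import torch

# Hypothetical 3-channel, 2x2 "image" with values in [0, 1], as ToTensor() produces.
image = torch.tensor([0.0, 0.25, 0.5, 1.0]).repeat(3, 1).reshape(3, 2, 2)

# Same per-channel mean and standard deviation as in the transforms above.
mean = torch.tensor([0.5, 0.5, 0.5]).reshape(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).reshape(3, 1, 1)

# Normalization is (value - mean) / std, applied channel by channel.
normalized = (image - mean) / std

print(normalized.min().item(), normalized.max().item())  # -1.0 1.0
```

<p>A pixel value of 0 becomes -1, 0.5 becomes 0, and 1 becomes 1, so the inputs end up centered around zero.</p>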
<p>After pre-processing, we move on to building the model.</p>
<pre><code class="lang-python"><span class="hljs-comment">#importing the pretrained EfficientNet model</span>

model_transfer = EfficientNet.from_pretrained(<span class="hljs-string">'efficientnet-b0'</span>)

<span class="hljs-comment"># Freeze weights</span>
<span class="hljs-keyword">for</span> param <span class="hljs-keyword">in</span> model_transfer.parameters():
    param.requires_grad = <span class="hljs-literal">False</span>
in_features = model_transfer._fc.in_features


<span class="hljs-comment"># Defining Dense top layers after the convolutional layers</span>
model_transfer._fc = nn.Sequential(
    nn.BatchNorm1d(num_features=in_features),    
    nn.Linear(in_features, <span class="hljs-number">512</span>),
    nn.ReLU(),
    nn.BatchNorm1d(<span class="hljs-number">512</span>),
    nn.Linear(<span class="hljs-number">512</span>, <span class="hljs-number">128</span>),
    nn.ReLU(),
    nn.BatchNorm1d(num_features=<span class="hljs-number">128</span>),
    nn.Dropout(<span class="hljs-number">0.4</span>),
    nn.Linear(<span class="hljs-number">128</span>, <span class="hljs-number">2</span>),
    )
<span class="hljs-keyword">if</span> use_cuda:
    model_transfer = model_transfer.cuda()
</code></pre>
<p>First, we import the EfficientNet-B0 model with its pre-trained weights. Next, we disable the training of the parameters of the model because we are going to use the pre-trained parameters to extract features from our data. </p>
<p>Then we replace the top fully connected layers of the model with our own classifier. </p>
<p><code>BatchNorm1d</code> normalizes each feature across the batch to roughly zero mean and unit variance, and then applies a learned scale and shift. The argument is the number of features (neurons) in its input. This stabilizes training and also has a mild regularizing effect. Dropout is a more direct guard against overfitting: during training, it zeroes out each activation with the probability given as an argument. </p>
<p>The Linear layer is a simple fully-connected neural network layer. </p>
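<p>The following minimal sketch shows what these two layers do to a batch in isolation. The batch sizes and feature counts are arbitrary values chosen for illustration, not the ones used in the classifier.</p>

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical batch: 8 samples with 4 features, shifted and scaled on purpose.
batch = torch.randn(8, 4) * 3 + 5

# BatchNorm1d in training mode normalizes each feature (column) across the batch.
bn = nn.BatchNorm1d(num_features=4)
bn.train()
out = bn(batch)
feature_mean = out.mean(dim=0)
feature_var = ((out - feature_mean) ** 2).mean(dim=0)

# Dropout in training mode zeroes activations with probability p
# and rescales the survivors by 1 / (1 - p).
drop = nn.Dropout(p=0.4)
drop.train()
dropped = drop(torch.ones(1000))

# In evaluation mode Dropout passes its input through unchanged.
drop.eval()
unchanged = drop(torch.ones(1000))

print(feature_mean, feature_var)
print(dropped.unique())
```

<p>This is also why the training function switches between <code>model.train()</code> and <code>model.eval()</code>: both layers behave differently in the two modes.</p>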
<p>Finally, we transfer our model to the GPU, if available.</p>
<pre><code class="lang-python"><span class="hljs-comment"># selecting loss function</span>
criterion_transfer = nn.CrossEntropyLoss()

<span class="hljs-comment">#using the Adam optimizer</span>
optimizer_transfer = optim.Adam(model_transfer.parameters(), lr=<span class="hljs-number">0.0005</span>)
</code></pre>
<p>Here, we select the loss function and the optimizer for our training phase. We also define the value of the learning rate for the optimizer. You can change this value to see how different learning rates influence the model in different ways. </p>
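<p>One detail worth seeing in isolation is that <code>nn.CrossEntropyLoss</code> expects raw scores (logits) and integer class labels, which is why the classifier ends in a plain <code>Linear</code> layer with no softmax. A small sketch with made-up logits:</p>

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Hypothetical logits for 2 samples and 2 classes, plus their true class indices.
logits = torch.tensor([[2.0, 0.0], [0.0, 2.0]])
targets = torch.tensor([0, 1])

loss = criterion(logits, targets)

# The same value computed by hand: mean negative log-softmax of the true class.
log_probs = torch.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(2), targets].mean()

print(loss.item(), manual.item())
```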
<p>Next, we move on to the training of the model.</p>
<pre><code class="lang-python">ImageFile.LOAD_TRUNCATED_IMAGES = <span class="hljs-literal">True</span>

<span class="hljs-comment"># Creating the function for training</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">train</span>(<span class="hljs-params">n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path</span>):</span>
    <span class="hljs-string">"""returns trained model"""</span>
    <span class="hljs-comment"># initialize tracker for minimum validation loss</span>
    valid_loss_min = np.inf
    trainingloss = []
    validationloss = []

    <span class="hljs-keyword">for</span> epoch <span class="hljs-keyword">in</span> range(<span class="hljs-number">1</span>, n_epochs+<span class="hljs-number">1</span>):
        <span class="hljs-comment"># initialize the variables to monitor training and validation loss</span>
        train_loss = <span class="hljs-number">0.0</span>
        valid_loss = <span class="hljs-number">0.0</span>

        <span class="hljs-comment">###################</span>
        <span class="hljs-comment"># training the model #</span>
        <span class="hljs-comment">###################</span>
        model.train()
        <span class="hljs-keyword">for</span> batch_idx, (data, target) <span class="hljs-keyword">in</span> enumerate(loaders[<span class="hljs-string">'train'</span>]):
            <span class="hljs-comment"># move to GPU</span>
            <span class="hljs-keyword">if</span> use_cuda:
                data, target = data.cuda(), target.cuda()

            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()

            train_loss = train_loss + ((<span class="hljs-number">1</span> / (batch_idx + <span class="hljs-number">1</span>)) * (loss.data - train_loss))

        <span class="hljs-comment">######################    </span>
        <span class="hljs-comment"># validating the model #</span>
        <span class="hljs-comment">######################</span>
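        model.eval()
        <span class="hljs-comment"># no gradients are needed while validating</span>
        <span class="hljs-keyword">with</span> torch.no_grad():
            <span class="hljs-keyword">for</span> batch_idx, (data, target) <span class="hljs-keyword">in</span> enumerate(loaders[<span class="hljs-string">'valid'</span>]):
                <span class="hljs-keyword">if</span> use_cuda:
                    data, target = data.cuda(), target.cuda()

                output = model(data)
                loss = criterion(output, target)
                valid_loss = valid_loss + ((<span class="hljs-number">1</span> / (batch_idx + <span class="hljs-number">1</span>)) * (loss.data - valid_loss))

        <span class="hljs-comment"># the running averages above are already mean losses per batch,</span>
        <span class="hljs-comment"># so we only convert them to plain Python floats</span>
        train_loss = float(train_loss)
        valid_loss = float(valid_loss)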

        trainingloss.append(train_loss)
        validationloss.append(valid_loss)

        <span class="hljs-comment"># printing training/validation statistics </span>
        print(<span class="hljs-string">'Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'</span>.format(
            epoch, 
            train_loss,
            valid_loss
            ))

        <span class="hljs-comment">## saving the model if validation loss has decreased</span>
        <span class="hljs-keyword">if</span> valid_loss &lt; valid_loss_min:
            torch.save(model.state_dict(), save_path)

            valid_loss_min = valid_loss

    <span class="hljs-comment"># return trained model</span>
    <span class="hljs-keyword">return</span> model, trainingloss, validationloss
</code></pre>
<p>We create a function for the training and validation phases of the model. Setting <code>ImageFile.LOAD_TRUNCATED_IMAGES = True</code> tells PIL to load image files that are truncated instead of raising an error. We initialize the train and validation losses and start the training loop. We load the data batch by batch from the data loaders and perform the training operations. </p>
<p>After the training loop, we start the validation loop where we only compute the loss and the output predictions and do not update the parameters as we did in the training loop. We save the model which has the minimum loss for the validation set.</p>
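<p>The loss-tracking line inside both loops is an incremental mean: each new batch loss nudges the running value towards itself by a factor of 1 / (number of batches seen so far). Here is a plain-Python sketch of that update, using made-up loss values:</p>

```python
def running_mean(values):
    """Return the mean of values using the incremental update from the training loop."""
    mean = 0.0
    for idx, value in enumerate(values):
        # same form as: train_loss + (1 / (batch_idx + 1)) * (loss - train_loss)
        mean = mean + (1 / (idx + 1)) * (value - mean)
    return mean


batch_losses = [0.9, 0.7, 0.5, 0.3]  # hypothetical per-batch losses

print(running_mean(batch_losses))
print(sum(batch_losses) / len(batch_losses))
```

<p>Both lines print the same average (up to floating-point rounding), so the running value at the end of an epoch is already the mean loss per batch.</p>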
<pre><code class="lang-python"><span class="hljs-comment"># training the model</span>

n_epochs=<span class="hljs-number">10</span>

model_transfer, train_loss, valid_loss = train(n_epochs, loaders_transfer, model_transfer, optimizer_transfer, criterion_transfer, use_cuda, <span class="hljs-string">'model.pt'</span>)
</code></pre>
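<p>We train the model for 10 epochs, that is, 10 full passes over the training set. You can change the number of epochs and see how the loss values respond. The weights with the lowest validation loss are saved under the name <code>model.pt</code>. Note that the testing code below uses the model as it stands after the last epoch. To evaluate the saved checkpoint instead, load it first with <code>model_transfer.load_state_dict(torch.load('model.pt'))</code>. Now we move on to the testing phase.</p>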
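<p>As a rough sketch of how saving and restoring a checkpoint works, the example below round-trips the weights of a small stand-in model through a temporary file. The <code>nn.Linear</code> model and the temporary path are assumptions made for illustration. In the tutorial you would use <code>model_transfer</code> and <code>model.pt</code> instead.</p>

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in for the trained classifier.
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pt")

    # Save only the weights (the state dict), as the training function does.
    torch.save(model.state_dict(), path)

    # Rebuild the same architecture, then load the saved weights into it.
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(path, map_location="cpu"))

# Switch to evaluation mode before running the test set.
restored.eval()
print(torch.equal(model.weight, restored.weight))  # True
```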
<pre><code class="lang-python"><span class="hljs-comment"># Defining the test function</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test</span>(<span class="hljs-params">loaders, model, criterion, use_cuda</span>):</span>

    <span class="hljs-comment"># monitoring test loss and accuracy</span>
    test_loss = <span class="hljs-number">0.</span>
    correct = <span class="hljs-number">0.</span>
    total = <span class="hljs-number">0.</span>
    preds = []
    targets = []

    model.eval()
    <span class="hljs-keyword">for</span> batch_idx, (data, target) <span class="hljs-keyword">in</span> enumerate(loaders[<span class="hljs-string">'test'</span>]):
        <span class="hljs-comment"># moving to GPU</span>
        <span class="hljs-keyword">if</span> use_cuda:
            data, target = data.cuda(), target.cuda()
        <span class="hljs-comment"># forward pass</span>
        output = model(data)
        <span class="hljs-comment"># calculate the loss</span>
        loss = criterion(output, target)
        <span class="hljs-comment"># updating average test loss </span>
        test_loss = test_loss + ((<span class="hljs-number">1</span> / (batch_idx + <span class="hljs-number">1</span>)) * (loss.data - test_loss))
        <span class="hljs-comment"># converting the output scores (logits) to the predicted class</span>
        pred = output.data.max(<span class="hljs-number">1</span>, keepdim=<span class="hljs-literal">True</span>)[<span class="hljs-number">1</span>]
        preds.append(pred)
        targets.append(target)
        <span class="hljs-comment"># compare predictions</span>
        correct += np.sum(np.squeeze(pred.eq(target.data.view_as(pred))).cpu().numpy())
        total += data.size(<span class="hljs-number">0</span>)

    <span class="hljs-keyword">return</span> preds, targets

<span class="hljs-comment"># calling test function</span>
preds, targets = test(loaders_transfer, model_transfer, criterion_transfer, use_cuda)
</code></pre>
<p>We now create a test function to apply our model to our test dataset and evaluate its performance. </p>
<p>We pass the dataset batch by batch as we did in the training and validation phases, but we only do it once here instead of for 10 epochs. This is because we just have to test the model and not update the parameters. </p>
<p>The function returns the predictions it computed for the input test set and also the original target values of the test set. </p>
<p>Now we compute the accuracy of the model. First, we need to convert the tensors, that is predictions and targets, into NumPy arrays. We do this by first moving them from the GPU to the CPU and then converting them to NumPy arrays. The following code does this: </p>
<pre><code class="lang-python"><span class="hljs-comment">#converting the tensor object to a list for metric functions</span>

preds2, targets2 = [],[]

<span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> preds:
  <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> range(len(i)):
    preds2.append(i.cpu().numpy()[j])
<span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> targets:
  <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> range(len(i)):
    targets2.append(i.cpu().numpy()[j])
</code></pre>
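<p>As an alternative to the nested loops, the per-batch tensors can be joined in one step with <code>torch.cat</code>. This is only a sketch, and the two small batches below are invented to mimic the shapes returned by the test function.</p>

```python
import torch

# Hypothetical outputs of the test function for two batches:
# predictions have shape (batch, 1) because of keepdim=True, targets have shape (batch,).
preds = [torch.tensor([[1], [0]]), torch.tensor([[1]])]
targets = [torch.tensor([1, 0]), torch.tensor([0])]

# Join the batches, flatten the predictions, and move everything to NumPy on the CPU.
preds_flat = torch.cat(preds).view(-1).cpu().numpy()
targets_flat = torch.cat(targets).cpu().numpy()

# Fraction of predictions that match the targets.
accuracy = (preds_flat == targets_flat).mean()
print(preds_flat, targets_flat, accuracy)
```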
<p>Now we compute the accuracy using the accuracy metric of the sklearn library.</p>
<pre><code class="lang-python"><span class="hljs-comment">#Computing the accuracy</span>
acc = accuracy_score(targets2, preds2)
print(<span class="hljs-string">"Accuracy: "</span>, acc)
</code></pre>
<p>Our model had an accuracy of 95.45%.</p>
<p><img src="https://lh3.googleusercontent.com/4_gMnxj_l_xGKOPr0Zg5V8IIA78NJIloxe9FNsKwAAW480WUpojW6PQWWgYzT7k839c27hA7svWPi4m_8XuR0ZSWY6TJ0TIc22xtCqqixeSq9mVBZzDIHW0edaueH1IE3VRW68M" alt="Image" width="600" height="400" loading="lazy"></p>
<p>The next image is the confusion matrix for the test run of the classifier. In it, you can see the visual of the model’s performance. The actual labels indicate whether the person had COVID or not, while the predicted labels indicate how our model classified the images. </p>
<p><img src="https://www.freecodecamp.org/news/content/images/2021/11/confusion-matrix.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>As we can see, our model predicted most of the labels correctly. The wrongly predicted labels include 7 people who did not have COVID, but whom our model predicted did. In machine learning, these are called false positives. This is not too alarming. </p>
<p>On the other hand, there were 14 examples where our model predicted that the person did not have COVID, but they did. In machine learning, these are called false negatives. This is far more alarming because we would have sent home people suffering from COVID-19, which would increase the risk of their condition getting worse. </p>
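<p>To show how these four kinds of outcome are counted, here is a small plain-Python sketch. The labels are toy values (1 for COVID, 0 for no COVID), not the model's actual results, and <code>confusion_counts</code> is a helper written just for this example.</p>

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return tp, fn, fp, tn


# Toy labels for illustration only.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]

tp, fn, fp, tn = confusion_counts(actual, predicted)

# Recall (sensitivity) is the share of actual positives the model caught.
recall = tp / (tp + fn)
accuracy = (tp + tn) / len(actual)
print(tp, fn, fp, tn, recall, accuracy)  # 3 1 1 3 0.75 0.75
```

<p>Accuracy treats both kinds of mistake the same, while recall drops only when positives are missed, which makes it the more telling number when false negatives are the costly error.</p>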
<h2 id="heading-conclusion">Conclusion</h2>
<p>Convolutional neural networks have proved extremely useful in computer vision techniques, and we can also use them efficiently in medical imaging and diagnosis.</p>
<p>Transfer learning is an effective method for using pre-trained architectures to perform efficiently in other applications. </p>
<p>But as we saw above, how useful these models are depends on what kind of problem we have and what our objectives are. In the detection of COVID-19, for example, we would prefer a model that gives us 0 false negatives. Even so, there's great potential for deep learning to be useful in COVID diagnosis as well as other medical diagnosis techniques.</p>
<p>Thanks for reading! If you enjoyed the article and would like to read more interesting articles around computer science, Python and JavaScript, please follow me on <a target="_blank" href="https://twitter.com/bajcmartinez">Twitter</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ PyTorch Tensor Methods – How to Create Tensors in Python ]]>
                </title>
                <description>
                    <![CDATA[ By Srijan PyTorch is an open-source Python-based library. It provides high flexibility and speed while building, training, and deploying deep learning models. At its core, PyTorch involves operations involving tensors. A tensor is a number, vector, m... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/pytorch-tensor-methods/</link>
                <guid isPermaLink="false">66d4614e706b9fb1c166b9b7</guid>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ tensor ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Thu, 03 Dec 2020 17:00:26 +0000</pubDate>
                <media:content url="https://cdn-media-2.freecodecamp.org/w1280/5fc667ad49c47664ed828110.jpg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Srijan</p>
<p>PyTorch is an open-source Python-based library. It provides high flexibility and speed while building, training, and deploying deep learning models.</p>
<p>At its core, PyTorch performs operations on <em>tensors.</em> A tensor is a number, vector, matrix, or any n-dimensional array.</p>
<p>In this article, we will see different ways of creating tensors using PyTorch tensor methods (functions).</p>
<h2 id="heading-topics-well-cover">Topics we'll cover</h2>
<ul>
<li>tensor</li>
<li>zeros</li>
<li>ones</li>
<li>full</li>
<li>arange</li>
<li>linspace</li>
<li>rand</li>
<li>randint</li>
<li>eye</li>
<li>complex</li>
</ul>
<h3 id="heading-the-tensor-method">The tensor() method</h3>
<p>This method returns a tensor when <code>data</code> is passed to it. <code>data</code> can be a scalar, tuple, a list or a <em>NumPy</em> array.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=76" height="158" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>In the above example, a NumPy array that was created using <code>np.arange()</code> was passed to the <code>tensor()</code> method, resulting in a 1-D tensor.</p>
<p>We can create a multi-dimensional tensor by passing a tuple of tuples, a list of lists, or a multi-dimensional NumPy array.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=77" height="179" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>When an empty tuple or list is passed into <code>tensor()</code>, it creates an empty tensor.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=78" height="158" width="800" title="Embedded content" loading="lazy"></iframe></div>
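<p>For readers who cannot view the embedded notebook cells, here is a short sketch of the cases described in this section. The variable names and values are chosen for illustration.</p>

```python
import numpy as np
import torch

from_array = torch.tensor(np.arange(5))             # 1-D tensor from a NumPy array
from_nested = torch.tensor([[1, 2, 3], [4, 5, 6]])  # 2-D tensor from a list of lists
from_scalar = torch.tensor(3.5)                     # zero-dimensional tensor from a scalar
empty = torch.tensor([])                            # empty tensor with no elements

print(from_array.shape, from_nested.shape, from_scalar.dim(), empty.numel())
```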

<h3 id="heading-the-zeros-method">The zeros() method</h3>
<p>This method returns a tensor where all elements are zeros, of specified <code>size</code> (shape). The <code>size</code> <strong>can</strong> be given as a tuple or a list or neither.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=4" height="200" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>We could have passed <code>3, 2</code> inside a tuple or a list as well. As you might expect, passing a negative number or a float as a dimension raises an error.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=8" height="318" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>Passing an empty tuple or an empty list gives a zero-dimensional (scalar) tensor whose only element is 0 and whose data type is float.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=10" height="205" width="800" title="Embedded content" loading="lazy"></iframe></div>
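<p>The behaviours described above can be checked with a few lines. This is a minimal sketch, and the invalid sizes tried at the end are arbitrary examples.</p>

```python
import torch

# The size can be given as separate integers, a tuple, or a list.
a = torch.zeros(3, 2)
b = torch.zeros((3, 2))
c = torch.zeros([3, 2])

# An empty tuple gives a zero-dimensional tensor holding a single 0.
scalar = torch.zeros(())

# Negative or float sizes are rejected; we record which error type each one raises.
rejected = []
for bad_size in (-1, 2.5):
    try:
        torch.zeros(bad_size)
    except (RuntimeError, TypeError) as err:
        rejected.append(type(err).__name__)

print(a.shape, scalar.dim(), scalar.dtype, rejected)
```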

<h3 id="heading-the-ones-method">The ones() method</h3>
<p>Similar to <code>zeros()</code>, <code>ones()</code> returns a tensor where all elements are 1, of specified <code>size</code> (shape). The <code>size</code> <strong>can</strong> be given as a tuple or a list or neither.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/53&amp;cellId=19" height="400" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>As with <code>zeros()</code>, passing an empty tuple or list gives a zero-dimensional tensor, having 1 as the sole element, whose data type is float.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=23" height="205" width="800" title="Embedded content" loading="lazy"></iframe></div>
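<p>A brief sketch of the same idea with <code>ones()</code>, using an arbitrary 2 x 3 shape:</p>

```python
import torch

grid = torch.ones(2, 3)   # six elements, all equal to 1
scalar = torch.ones([])   # zero-dimensional tensor holding a single 1

print(grid.sum().item())  # 6.0
print(scalar.item())      # 1.0
```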

<h3 id="heading-the-full-method">The full() method</h3>
<p>What if you want all the elements of a tensor to be equal to some value other than 0 or 1? Maybe 2.9?</p>
<p><code>full()</code> returns a tensor of a shape given by the <code>size</code> argument, with all its elements equal to the <code>fill_value</code>.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=27" height="200" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>Here, we have created a tensor of shape <code>3, 2</code> with the <code>fill_value</code> as 3. Here again, passing an empty tuple or list creates a scalar tensor of zero dimension.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=31" height="158" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>While using <code>full</code>, it is <strong>necessary</strong> to give <code>size</code> as a tuple or a list.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=33" height="444" width="800" title="Embedded content" loading="lazy"></iframe></div>
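<p>The sketch below illustrates the three points from this section: a filled tensor, a scalar tensor from an empty size, and the error raised when <code>size</code> is not a tuple or list. The values are examples only.</p>

```python
import torch

filled = torch.full((3, 2), 3)   # shape (3, 2), every element equal to 3
scalar = torch.full((), 2.9)     # zero-dimensional tensor holding 2.9

# Unlike zeros() and ones(), the size cannot be passed as bare integers.
try:
    torch.full(3, 2)
    size_error = None
except (TypeError, RuntimeError) as err:
    size_error = type(err).__name__

print(filled)
print(scalar.dim(), size_error)
```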

<h3 id="heading-the-arange-method">The arange() method</h3>
<p>This method returns a 1-D tensor, with elements from <code>start</code> (inclusive) to <code>end</code> (exclusive) with a common difference <code>step</code>. The default value for <code>start</code> is 0 while that for <code>step</code> is 1.</p>
<p>The elements of the tensor can be said to be in <strong>Arithmetic Progression</strong>, with <code>step</code> as common difference.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=37" height="158" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>Here, we created a tensor which starts from 2 and goes until 20 with a <code>step</code> (common difference) of 2.</p>
<p>All three parameters, <code>start</code>, <code>end</code> and <code>step</code>, can be positive or negative, and can be integers or floats.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=41" height="297" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>While choosing <code>start</code>, <code>end</code>, and <code>step</code>, we need to ensure that <code>start</code> and <code>end</code> are consistent with the <code>step</code> sign.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=41" height="297" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>Since <code>step</code> is set as -2, there is no way -42 can reach -22 (exclusive). Hence, it gives an error.</p>
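<p>Here is a compact sketch of the <code>arange()</code> cases discussed in this section, including the inconsistent-sign case that fails. The float example is an extra illustration that is not taken from the notebook.</p>

```python
import torch

evens = torch.arange(2, 20, 2)          # 2, 4, ..., 18 (20 is excluded)
default_start = torch.arange(5)         # start defaults to 0, step to 1
descending = torch.arange(10, 0, -2.5)  # float step, counting down

# A negative step cannot move from -42 up to -22, so PyTorch raises an error.
try:
    torch.arange(-42, -22, -2)
    sign_error = None
except RuntimeError as err:
    sign_error = str(err)

print(evens.tolist())
print(default_start.tolist())
print(descending.tolist())
print(sign_error is not None)
```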
<h3 id="heading-the-linspace-method">The linspace() method</h3>
<p>This method returns a 1-D tensor, with elements from <code>start</code> (inclusive) to <code>end</code> (inclusive). However, unlike <code>arange()</code>, here, <code>steps</code> isn't the common difference but the number of elements to be in the tensor.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=45" height="158" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>PyTorch automatically decides the common difference based on the <code>steps</code> given.</p>
<p>Not providing a value for <code>steps</code> is deprecated. For <em>backwards compatibility</em>, not providing a value for <code>steps</code> creates a tensor with <strong>100</strong> elements. According to the official documentation, in a future PyTorch release, failing to provide a value for <code>steps</code> will throw a runtime error. Recent PyTorch releases have made <code>steps</code> a required argument, so it is safest to always pass it explicitly.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=47" height="748" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>Unlike <code>arange()</code>, <code>linspace</code> can have a <code>start</code> greater than <code>end</code> since the common difference is automatically calculated.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=49" height="158" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>Since <code>steps</code> here is not a common difference, but the number of elements, it can only be a non-negative integer.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=51" height="297" width="800" title="Embedded content" loading="lazy"></iframe></div>
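<p>The following sketch shows <code>linspace()</code> with an ascending range, a descending range, and a negative <code>steps</code> value that is rejected. The numbers are arbitrary examples.</p>

```python
import torch

ascending = torch.linspace(1, 10, steps=4)   # 4 evenly spaced values from 1 to 10
descending = torch.linspace(10, 1, steps=4)  # start may be greater than end

# steps is a count of elements, so a negative value is rejected.
try:
    torch.linspace(1, 10, steps=-3)
    negative_steps_error = None
except (RuntimeError, ValueError) as err:
    negative_steps_error = str(err)

print(ascending.tolist())   # [1.0, 4.0, 7.0, 10.0]
print(descending.tolist())  # [10.0, 7.0, 4.0, 1.0]
print(negative_steps_error is not None)
```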

<h3 id="heading-the-rand-method">The rand() method</h3>
<p>This method returns a tensor filled with random numbers from a uniform distribution on the interval 0 (inclusive) to 1 (exclusive). The shape is given by the <code>size</code> argument. The <code>size</code> argument can be given as a tuple or list or neither.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=55" height="200" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>Passing an empty tuple or list creates a scalar tensor of zero dimension.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=56" height="158" width="800" title="Embedded content" loading="lazy"></iframe></div>
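<p>A short sketch of <code>rand()</code> follows. The seed is set only so that repeated runs give the same numbers. It is not required.</p>

```python
import torch

torch.manual_seed(0)  # optional, for reproducible output

samples = torch.rand(3, 2)       # size given as separate integers
same_shape = torch.rand((3, 2))  # size given as a tuple
scalar = torch.rand(())          # zero-dimensional tensor

# Every value lies in the half-open interval from 0 (inclusive) to 1 (exclusive).
in_range = bool(samples.ge(0).logical_and(samples.lt(1)).all())

print(samples)
print(scalar.dim(), in_range)
```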

<h3 id="heading-the-randint-method">The randint() method</h3>
<p>This method returns a tensor filled with random integers generated uniformly between <code>low</code> (inclusive) and <code>high</code> (exclusive). The shape is given by the <code>size</code> argument. The default value for <code>low</code> is 0.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=59" height="242" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>When only one <code>int</code> argument is passed, <code>low</code> gets the value 0, by default, and <code>high</code> gets the passed value.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=60" height="200" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>The <code>size</code> argument only takes a tuple or a list. An empty tuple or list creates a tensor with zero dimension.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=62" height="318" width="800" title="Embedded content" loading="lazy"></iframe></div>
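<p>The two calling styles of <code>randint()</code> can be sketched as follows. The bounds 3 and 10 and the shape (2, 3) are assumptions picked for illustration:</p>

```python
import torch

# low=3 is inclusive, high=10 is exclusive, so values fall in 3..9
x = torch.randint(3, 10, (2, 3))

# only one int is passed, so it becomes high and low defaults to 0
y = torch.randint(10, (2, 3))

print(x)
print(y)
print(x.dtype)  # torch.int64
```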

<h3 id="heading-the-eye-method">The eye() method</h3>
<p>This method returns a 2-D tensor with ones on the diagonal and zeros elsewhere. The number of rows is given by <code>n</code> and the number of columns by <code>m</code>.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=65" height="179" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>The default value for <code>m</code> is the value of <code>n</code>. When only <code>n</code> is passed, it creates a tensor in the form of an <strong>identity matrix</strong>. An identity matrix has its diagonal elements as 1 and all others as 0. </p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=66" height="242" width="800" title="Embedded content" loading="lazy"></iframe></div>
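<p>A small sketch of both cases, with sizes chosen only for illustration:</p>

```python
import torch

identity = torch.eye(3)        # m defaults to n, giving a 3x3 identity matrix
rectangular = torch.eye(2, 3)  # 2 rows, 3 columns, ones on the main diagonal

print(identity)
print(rectangular)
```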

<h3 id="heading-the-complex-method">The complex() method</h3>
<p>This method returns a complex tensor with its real part equal to <code>real</code> and its imaginary part equal to <code>imag</code>. Both <code>real</code> and <code>imag</code> are tensors.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=69" height="363" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>The data type of both the <code>real</code> and <code>imag</code> tensors should be either <code>float</code> or <code>double</code>.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=70" height="523" width="800" title="Embedded content" loading="lazy"></iframe></div>

<p>Also, the <code>size</code> of both tensors, <code>real</code> and <code>imag</code>, should be the same, since the corresponding elements of the two tensors form a complex number.</p>
<div class="embed-wrapper"><iframe src="https://jovian.ai/embed?url=https://jovian.ai/srijansrj5901/different-ways-to-create-tensors/v/51&amp;cellId=72" height="499" width="800" title="Embedded content" loading="lazy"></iframe></div>
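<p>Putting these requirements together, a minimal sketch looks like this. The element values are made up for the example:</p>

```python
import torch

real = torch.tensor([1.0, 2.0])  # float32 by default
imag = torch.tensor([3.0, 4.0])  # same dtype and same size as real

z = torch.complex(real, imag)    # element i is real[i] + imag[i] * j
print(z)
print(z.dtype)  # torch.complex64
```

<p>Two <code>float</code> inputs give a <code>complex64</code> tensor. The parts can be read back with <code>z.real</code> and <code>z.imag</code>.</p>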

<h2 id="heading-conclusion">Conclusion</h2>
<p>We've covered ten different ways to create tensors using PyTorch methods. You can go through the <a target="_blank" href="https://pytorch.org/docs/stable/torch.html">official documentation</a> to learn more about other PyTorch methods.</p>
<p>You can click <a target="_blank" href="https://jovian.ai/srijansrj5901/different-ways-to-create-tensors">here</a> to go to the Jupyter notebook where you can play around with these methods.</p>
<p>If you want to learn more about PyTorch, check out <a target="_blank" href="https://jovian.ai/learn/deep-learning-with-pytorch-zero-to-gans">this</a> amazing course on freeCodeCamp's <a target="_blank" href="https://www.youtube.com/watch?v=5ioMqzMRFgM&amp;t=3s&amp;ab_channel=freeCodeCamp.org">YouTube</a> channel.</p>
<p>Stay safe!</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Free Live Course: Deep Learning with PyTorch ]]>
                </title>
                <description>
                    <![CDATA[ Are you interested in learning about Deep Learning? We are hosting a free 6-week live course on our YouTube channel, starting Saturday, November 20th at 9:30 AM PST. Passively watching a video is often not enough to learn a software concept. You need... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/free-deep-learning-with-pytorch-live-course/</link>
                <guid isPermaLink="false">66b20253712508eb1606783b</guid>
                
                    <category>
                        <![CDATA[ Deep Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Fri, 20 Nov 2020 17:16:00 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2020/05/pytorchlive.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Are you interested in learning about Deep Learning? We are hosting a free 6-week <a target="_blank" href="https://youtu.be/5ioMqzMRFgM">live course on our YouTube channel</a>, starting Saturday, November 20th at 9:30 AM PST.</p>
<p>Passively watching a video is often not enough to learn a software concept. You need to be able to ask questions and build real projects. That is exactly what you will be able to do in the course “Deep Learning with PyTorch: Zero to GANs”.</p>
<p>This is an online course intended to provide a coding-first introduction to deep learning using the PyTorch framework. The course takes a hands-on coding-focused approach and will be taught using live interactive Jupyter notebooks, allowing students to follow along and experiment.</p>
<p>This course is taught by <a target="_blank" href="https://twitter.com/aakashns">Aakash N S</a>. He is the co-founder and CEO of Jovian.ml, a project management and collaboration platform for machine learning. </p>
<p>Theoretical concepts will be explained in simple terms using code. Students will receive weekly assignments, work on a project with real-world datasets and participate in a private data science competition to test their skills. Upon successful completion of the course, students will receive a certificate of completion.</p>
<p>This is a beginner-friendly course, and no prior knowledge of data science, machine learning or deep learning is assumed. It is preferable to have some background in the following areas:</p>
<ul>
<li>Programming knowledge, preferably in Python</li>
<li>Basics of linear algebra (vectors, matrices, dot products)</li>
<li>Basics of calculus (differentiation, geometric interpretation of derivative)</li>
</ul>
<h2 id="heading-syllabus">Syllabus</h2>
<p>The course is divided into 6 modules, and will be taught over 6 weeks via video lectures and interactive Jupyter notebooks. Each lecture will be around 2 hours long.</p>
<h3 id="heading-module-1-pytorch-basics-tensors-amp-gradients">Module 1: PyTorch Basics - Tensors &amp; Gradients</h3>
<ul>
<li>Introduction to Jupyter notebooks &amp; Data Science in Python</li>
<li>Creating vectors, matrices &amp; Tensors in PyTorch</li>
<li>Tensor operations and gradient computations</li>
<li>Interoperability of PyTorch with NumPy</li>
</ul>
<h3 id="heading-module-2-linear-regression-amp-gradient-descent">Module 2: Linear Regression &amp; Gradient Descent</h3>
<ul>
<li>Linear Regression from scratch using Tensor operations</li>
<li>Weights, biases and the mean squared error loss function</li>
<li>Gradient descent and model training with PyTorch Autograd</li>
<li>Linear Regression using PyTorch built-ins (nn.Linear, nn.functional etc.)</li>
</ul>
<h3 id="heading-module-3-logistic-regression-for-image-classification">Module 3: Logistic Regression for Image Classification</h3>
<ul>
<li>Working with images from the MNIST dataset</li>
<li>Training and validation dataset creation</li>
<li>Softmax function and categorical cross entropy loss</li>
<li>Model training, evaluation and sample predictions</li>
</ul>
<h3 id="heading-module-4-feedforward-neural-networks-amp-gpus">Module 4: Feedforward Neural Networks &amp; GPUs</h3>
<ul>
<li>Working with cloud GPU platforms like Kaggle &amp; Colab</li>
<li>Creating a multilayer neural network using nn.Module</li>
<li>Activation functions, non-linearity and the universal approximation theorem</li>
<li>Moving datasets and models to the GPU for faster training</li>
</ul>
<h3 id="heading-module-5a-image-classification-using-convolutional-neural-networks">Module 5a: Image Classification using Convolutional Neural Networks</h3>
<ul>
<li>Working with the 3-channel RGB images from the CIFAR10 dataset</li>
<li>Introduction to convolutions, kernels &amp; feature maps</li>
<li>Underfitting, overfitting and techniques to improve model performance</li>
</ul>
<h3 id="heading-module-5b-data-augmentation-regularization-and-residual-networks">Module 5b: Data Augmentation, Regularization and Residual Networks</h3>
<ul>
<li>Improving the dataset using data normalization and data augmentation</li>
<li>Improving the model using residual connections and batch normalization</li>
<li>Improving the training loop using learning rate annealing, weight decay and gradient clipping</li>
<li>Training a state-of-the-art image classifier from scratch in 10 minutes</li>
</ul>
<h3 id="heading-module-6-image-generation-using-generative-adversarial-networks-gans">Module 6: Image Generation using Generative Adversarial Networks (GANs)</h3>
<ul>
<li>Introduction to generative modeling and application of GANs</li>
<li>Creating generator and discriminator neural networks</li>
<li>Generating and evaluating fake images of handwritten digits</li>
<li>Training the generator and discriminator in tandem and visualizing results</li>
</ul>
<h2 id="heading-exercises-amp-assignments">Exercises &amp; Assignments</h2>
<h3 id="heading-weekly-assignments">Weekly Assignments</h3>
<ul>
<li>Week 1: Linear Regression</li>
<li>Week 2: Image Classification</li>
<li>Week 3: Feedforward neural networks</li>
</ul>
<h3 id="heading-course-project">Course Project</h3>
<p>For the course project, students will create an image classification model using Convolutional neural networks, on a real-world dataset of their choice. The project will allow students to experiment with different types of models and regularization techniques. Students will also present their work at the end of the course and publish a blog post describing their approach and results.</p>
<h2 id="heading-certificate-of-completion">Certificate of Completion</h2>
<p>Students who attend at least 5 out of 6 video lectures and make valid submissions for all assignments will be eligible to receive a Certificate of Completion from Jovian.ml. Selected projects will also receive a Best Project Award based on evaluation criteria determined by the instructors.</p>
<h2 id="heading-sign-up">Sign up</h2>
<p>You can sign up for the course here: <a target="_blank" href="http://zerotogans.com/">http://zerotogans.com/</a></p>
<p>Whether or not you sign up, you can watch the course on the <a target="_blank" href="https://youtu.be/5ioMqzMRFgM">freeCodeCamp.org YouTube channel</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Deep Learning Frameworks Compared: MxNet vs TensorFlow vs DL4j vs PyTorch ]]>
                </title>
                <description>
                    <![CDATA[ It's a great time to be a deep learning engineer. In this article, we will go through some of the popular deep learning frameworks like Tensorflow and CNTK so you can choose which one is best for your project. Deep Learning is a branch of Machine Lea... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/deep-learning-frameworks-compared-mxnet-vs-tensorflow-vs-dl4j-vs-pytorch/</link>
                <guid isPermaLink="false">66d035ba12c679876b0602d9</guid>
                
                    <category>
                        <![CDATA[ Deep Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Machine Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ TensorFlow ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Manish Shivanandhan ]]>
                </dc:creator>
                <pubDate>Tue, 29 Sep 2020 15:22:13 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2020/09/wall-3.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>It's a great time to be a deep learning engineer. In this article, we will go through some of the popular deep learning frameworks like TensorFlow and CNTK so you can choose which one is best for your project.</p>
<p>Deep Learning is a branch of <a target="_blank" href="https://www.sas.com/en_in/insights/analytics/machine-learning.html">Machine Learning</a>. Though machine learning has various algorithms, the most powerful are neural networks. </p>
<p>Deep learning is the technique of building complex multi-layered neural networks. This helps us solve tough problems like image recognition, language translation, self-driving car technology, and more.</p>
<p>There are tons of real-world applications of deep learning, from self-driving Tesla cars to AI assistants like Siri. To build these neural networks, we use different frameworks like TensorFlow, CNTK, and MXNet.</p>
<p>If you are new to deep learning, <a target="_blank" href="https://www.coursera.org/specializations/deep-learning">start here</a> for a good overview.</p>
<h1 id="heading-frameworks">Frameworks</h1>
<p>Without the right framework, constructing quality neural networks can be hard. With the right framework, you only have to worry about getting your hands on the right data. </p>
<p>That doesn’t imply that knowledge of the deep learning frameworks alone is enough to make you a successful data scientist.</p>
<p><em>You need a strong foundation of the fundamental concepts to be a successful deep learning engineer.</em> But the right framework will make your life easier.</p>
<p>Also, not all programming languages have their own machine learning / deep learning frameworks. This is because not all programming languages have the capacity to handle machine learning problems. </p>
<p>Languages like Python stand out among others due to their complex data processing capability.</p>
<p>Let's go through some of the popular deep learning frameworks in use today. Each one comes with its own set of advantages and limitations. It is important to have at least a basic understanding of these frameworks so you can choose the right one for your organization or project.</p>
<h1 id="heading-tensorflow">TensorFlow</h1>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/09/tensorflow.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>TensorFlow is the most famous deep learning library around. If you are a data scientist, you probably started with TensorFlow. It is one of the most efficient open-source libraries to work with. </p>
<p>Google built TensorFlow to use as an internal deep learning tool before open-sourcing it. TensorFlow is used by companies such as Uber, Dropbox, and Airbnb.</p>
<h3 id="heading-advantages-of-tensorflow">Advantages of TensorFlow</h3>
<ul>
<li>User Friendly. Easy to learn if you are familiar with Python.</li>
<li><a target="_blank" href="https://www.tensorflow.org/tensorboard">TensorBoard</a> for monitoring and visualization. It is a great tool if you want to see your deep learning models in action.</li>
<li>Community support. Expert engineers from Google and other companies improve TensorFlow almost on a daily basis.</li>
<li>You can use TensorFlow Lite to run TensorFlow models on mobile devices.</li>
<li><a target="_blank" href="https://www.tensorflow.org/js">TensorFlow.js</a> lets you run real-time deep learning models in the browser using JavaScript.</li>
</ul>
<h3 id="heading-limitations-of-tensorflow">Limitations of TensorFlow</h3>
<ul>
<li>TensorFlow is a bit slow compared to frameworks like MXNet and CNTK.</li>
<li>Debugging can be challenging.</li>
<li>No support for <a target="_blank" href="https://en.wikipedia.org/wiki/OpenCL">OpenCL</a>.</li>
</ul>
<h1 id="heading-apache-mxnet">Apache MXNet</h1>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/09/mxnet.jpg" alt="Image" width="600" height="400" loading="lazy"></p>
<p>MXNet is another popular deep learning framework. Incubated at the <a target="_blank" href="https://www.apache.org/">Apache Software Foundation</a>, MXNet supports a wide range of languages like JavaScript, Python, and C++. MXNet is also supported by Amazon Web Services to build deep learning models. </p>
<p>MXNet is a computationally efficient framework used in business as well as in academia.</p>
<h3 id="heading-advantages-of-apache-mxnet">Advantages of Apache MXNet</h3>
<ul>
<li>Efficient, scalable, and fast.</li>
<li>Supported by all major platforms.</li>
<li>Provides GPU support, along with multi-GPU mode.</li>
<li>Support for programming languages like Scala, R, Python, C++, and JavaScript.</li>
<li>Easy model serving and high-performance API.</li>
</ul>
<h3 id="heading-disadvantages-of-apache-mxnet">Disadvantages of Apache MXNet</h3>
<ul>
<li>Compared to TensorFlow, MXNet has a smaller open source community.</li>
<li>Improvements, bug fixes, and other features take longer due to a lack of major community support.</li>
<li>Despite being widely used by many organizations in the tech industry, MXNet is not as popular as TensorFlow.</li>
</ul>
<h1 id="heading-microsoft-cntk">Microsoft CNTK</h1>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/09/cntk-1.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Large companies usually use Microsoft Cognitive Toolkit (CNTK) to build deep learning models. </p>
<p>Though created by Microsoft, CNTK is an open-source framework. It illustrates neural networks in the form of directed graphs by using a sequence of computational steps. </p>
<p>CNTK is written using C++, but it supports various languages like C#, Python, C++, and Java.</p>
<p>Microsoft’s backing is an advantage for CNTK since Windows is the preferred operating system for enterprises. CNTK is also heavily used in the Microsoft ecosystem. </p>
<p>Popular products that use CNTK are Xbox, Cortana, and Skype.</p>
<h3 id="heading-advantages-of-microsoft-cntk">Advantages of Microsoft CNTK</h3>
<ul>
<li>Offers reliable and excellent performance.</li>
<li>The scalability of CNTK has made it a popular choice in many enterprises.</li>
<li>Has numerous optimized components.</li>
<li>Easy to integrate with <a target="_blank" href="https://spark.apache.org/">Apache Spark</a>, an analytics engine for data processing.</li>
<li>Works well with Azure Cloud, both being backed by Microsoft.</li>
<li>Resource usage and management are efficient.</li>
</ul>
<h3 id="heading-disadvantages-of-microsoft-cntk">Disadvantages of Microsoft CNTK</h3>
<ul>
<li>Minimal community support compared to TensorFlow, but has a dedicated team of Microsoft engineers working full time on it.</li>
<li>Significant learning curve.</li>
</ul>
<h1 id="heading-pytorch">PyTorch</h1>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/09/pytorch.jpg" alt="Image" width="600" height="400" loading="lazy"></p>
<p>PyTorch is another popular deep learning framework. <a target="_blank" href="https://ai.facebook.com/">Facebook developed PyTorch</a> in its AI research lab (FAIR). PyTorch has been giving tough competition to Google’s TensorFlow.</p>
<p>PyTorch supports both Python and C++ to build deep learning models. Released three years ago, it's already being used by companies like Salesforce, Facebook, and Twitter.</p>
<p>Image Recognition, Natural Language Processing, and Reinforcement Learning are some of the many areas in which PyTorch shines. It is also used in research by universities like Oxford and organizations like IBM.</p>
<p>PyTorch is also a great choice for creating computational graphs. It also supports cloud software development and offers useful features, tools, and libraries. And it works well with cloud platforms like AWS and Azure.</p>
<h3 id="heading-advantages-of-pytorch">Advantages of PyTorch</h3>
<ul>
<li>User-friendly design and structure that makes constructing deep learning models transparent.</li>
<li>Works with standard Python debugging tools such as the PyCharm debugger and pdb.</li>
<li>Contains many pre-trained models and supports distributed training.</li>
</ul>
<h3 id="heading-disadvantages-of-pytorch">Disadvantages of PyTorch</h3>
<ul>
<li>Does not ship its own monitoring and visualization interface like TensorFlow's TensorBoard, although it can log to TensorBoard through <code>torch.utils.tensorboard</code>.</li>
<li>PyTorch is a comparatively new deep learning framework and currently has less community support.</li>
</ul>
<h1 id="heading-deeplearning4j">DeepLearning4j</h1>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/09/dl4j.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>DeepLearning4j is an excellent framework if your main programming language is Java. It is a commercial-grade, open-source, distributed deep-learning library. </p>
<p>Deeplearning4j supports all major types of neural network architectures like RNNs and CNNs.</p>
<p>Deeplearning4j is written for Java and Scala. It also integrates well with Hadoop and Apache Spark. Deeplearning4j also has support for GPUs, making it a great choice for Java-based deep learning solutions.</p>
<h3 id="heading-advantages-of-deeplearning4j">Advantages of DeepLearning4j</h3>
<ul>
<li>Scalable and can easily process large amounts of data.</li>
<li>Easy integration with Apache Spark.</li>
<li>Excellent community support and documentation.</li>
</ul>
<h3 id="heading-disadvantages-of-deeplearning4j">Disadvantages of DeepLearning4j</h3>
<ul>
<li>Limited to JVM languages such as Java and Scala.</li>
<li>Less popular than TensorFlow and PyTorch.</li>
</ul>
<h1 id="heading-conclusion">Conclusion</h1>
<p>Each framework comes with its list of pros and cons. But choosing the right framework is crucial to the success of a project. </p>
<p>You have to consider various factors like security, scalability, and performance. For enterprise-grade solutions, reliability becomes another primary contributing factor.</p>
<p>If you are just getting started, begin with TensorFlow. If you are building a Windows-based enterprise product, choose CNTK. If you prefer Java, choose DL4J.</p>
<p>I hope this article helps you choose the right deep learning framework for your next project. If you have any questions, reach out to me.</p>
<hr>
<p><em>Loved this article?</em> <a target="_blank" href="http://tinyletter.com/manishmshiva"><strong><em>Join my Newsletter</em></strong></a> <em>and get a summary of my articles and videos every Monday.</em></p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Build a Neural Network from Scratch with PyTorch ]]>
                </title>
                <description>
                    <![CDATA[ By Bipin Krishnan P In this article, we'll be going under the hood of neural networks to learn how to build one from the ground up.  The one thing that excites me the most in deep learning is tinkering with code to build something from scratch. It's ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-build-a-neural-network-with-pytorch/</link>
                <guid isPermaLink="false">66d45dd573634435aafcef68</guid>
                
                    <category>
                        <![CDATA[ Artificial Intelligence ]]>
                    </category>
                
                    <category>
                        <![CDATA[ neural networks ]]>
                    </category>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Tue, 15 Sep 2020 19:31:05 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2020/09/Screenshot-from-2020-09-12-22-09-32-01.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Bipin Krishnan P</p>
<p>In this article, we'll be going under the hood of neural networks to learn how to build one from the ground up. </p>
<p>The one thing that excites me the most in deep learning is tinkering with code to build something from scratch. It's not an easy task, though, and teaching someone else how to do so is even more difficult.</p>
<p>I've been working my way through the Fast.ai course and this blog is greatly inspired by my experience.</p>
<p>Without any further delay let's start our wonderful journey of demystifying neural networks.</p>
<h2 id="heading-how-does-a-neural-network-work">How does a neural network work?</h2>
<p>Let's start by understanding the high level workings of neural networks.</p>
<p>A neural network takes in a data set and outputs a prediction. It's as simple as that.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/09/IMG_20200912_131242.png" alt="Image" width="600" height="400" loading="lazy">
<em>How a neural network works</em></p>
<p>Let me give you an example.</p>
<p>Let's say that one of your friends (who is not a great football fan) points at an old picture of a famous footballer – say Lionel Messi – and asks you about him.</p>
<p>You will be able to identify the footballer in a second. The reason is that you have seen his pictures a thousand times before. So you can identify him even if the picture is old or was taken in dim light.</p>
<p>But what happens if I show you a picture of a famous baseball player (and you have never seen a single baseball game before)? You will not be able to recognize that player. In that case, even if the picture is clear and bright, you won't know who it is.</p>
<p>This is the same principle used for neural networks. If our goal is to build a neural network to recognize cats and dogs, we just show the neural network a bunch of pictures of dogs and cats.</p>
<p>More specifically, we show the neural network pictures of dogs and then tell it that these are dogs. And then show it pictures of cats, and identify those as cats.</p>
<p>Once we train our neural network with images of cats and dogs, it can easily classify whether an image contains a cat or a dog. In short, it can tell a cat from a dog.</p>
<p>But if you show our neural network a picture of a horse or an eagle, it will never identify it as a horse or an eagle, because we have never shown it pictures of those animals.</p>
<p>If you wish to improve the capability of the neural network, then all you have to do is show it pictures of all the animals that you want the neural network to classify. As of now, all it knows is cats and dogs and nothing else.</p>
<p>The data set we use for training depends heavily on the problem at hand. If you wish to classify whether a tweet has a positive or negative sentiment, then you will probably want a data set containing a lot of tweets, each labeled as either positive or negative.</p>
<p>Now that you have a high-level overview of data sets and how a neural network learns from that data, let's dive deeper into how neural networks work.</p>
<h2 id="heading-understanding-neural-networks">Understanding neural networks</h2>
<p>We will be building a neural network to classify the digits three and seven from an image.</p>
<p>But before we build our neural network, we need to go deeper to understand how they work.</p>
<p>Every image that we pass to our neural network is just a bunch of numbers. That is, each of our images has a size of 28×28 which means it has 28 rows and 28 columns, just like a matrix.</p>
<p>We see each of the digits as a complete image, but to a neural network, it is just a bunch of numbers ranging from 0 to 255.</p>
<p>Here is a pixel representation of the digit five:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/09/Screenshot_20200912_125459.png" alt="Image" width="600" height="400" loading="lazy">
<em>Pixel values along with shades</em></p>
<p>As you can see above, we have 28 rows and 28 columns (the index starts from 0 and ends at 27) just like a matrix. Neural networks only see these 28×28 matrices.</p>
<p>To show some more details, I've just shown the shade along with the pixel values. If you look closer into the image, you can see that the pixel values close to 255 are darker whereas the values closer to 0 are lighter in shade.</p>
<p>In PyTorch we don't use the term matrix. Instead, we use the term tensor. Every number in PyTorch is represented as a tensor. So, from now on, we will use the term tensor instead of matrix.</p>
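<p>To make this concrete, here is a rough sketch of what such an image looks like as a tensor. The pixel values are random stand-ins, not a real digit from the data set:</p>

```python
import torch

# Hypothetical stand-in for one grayscale digit: 28 rows and 28 columns
# of random integer pixel values from 0 to 255
image = torch.randint(0, 256, (28, 28))

print(image.shape)   # torch.Size([28, 28])
print(image[0, 0])   # the pixel at row 0, column 0

# The same 784 numbers laid out in a single row
flat = image.reshape(-1)
print(flat.shape)    # torch.Size([784])
```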
<h2 id="heading-visualizing-a-neural-network">Visualizing a neural network</h2>
<p>A neural network can have any number of neurons and layers.</p>
<p>This is how a neural network looks:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/09/Artificial_neural_network.png" alt="Image" width="600" height="400" loading="lazy">
<em>Artificial neural network</em></p>
<p>Don't get confused by the Greek letters in the picture. I will break it down for you:</p>
<p>Take the case of predicting whether a patient will survive or not based on a data set containing the name of the patient, temperature, blood pressure, heart condition, monthly salary, and age.</p>
<p>In our data set, only the temperature, blood pressure, heart condition, and age have significant importance for predicting whether the patient will survive or not. So we will assign a higher weight value to these values in order to show higher importance.</p>
<p>But features like the name of the patient and monthly salary have little or no influence on the patient's survival rate. So we assign smaller weight values to these features to show less importance.</p>
<p>In the above figure, x1, x2, x3...xn are the features in our data set which may be pixel values in the case of image data or features like blood pressure or heart condition as in the above example. </p>
<p>The feature values are multiplied by the corresponding weight values referred to as w1j, w2j, w3j...wnj. The multiplied values are summed together and passed to the next layer.</p>
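<p>For a single neuron, this multiply-and-sum step can be sketched in plain Python. The feature and weight values below are made up for illustration:</p>

```python
# Hypothetical values for one neuron with three input features
features = [0.5, 1.0, 2.0]     # x1, x2, x3
weights = [0.25, 0.5, 0.125]   # w1j, w2j, w3j

# Multiply each feature by its weight, then add the products together
weighted_sum = sum(x * w for x, w in zip(features, weights))
print(weighted_sum)  # 0.875
```

<p>Here the second feature has the largest weight, so it contributes the most to the value passed to the next layer.</p>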
<p>The optimum weight values are learned during the training of the neural network. The weight values are updated continuously in such a way as to maximize the number of correct predictions.</p>
<p>The activation function is nothing but the sigmoid function in our case. Any value we pass to the sigmoid gets converted to a value between 0 and 1. We just put the sigmoid function on top of our neural network prediction to get a value between 0 and 1.</p>
<p>You will understand the importance of the sigmoid layer once we start building our neural network model.</p>
<p>There are a lot of other activation functions that are even simpler to learn than sigmoid.</p>
<p>This is the equation for a sigmoid function:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/09/Screenshot_20200912_204809.png" alt="Image" width="600" height="400" loading="lazy">
<em>Sigmoid function</em></p>
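<p>To see the squashing effect, we can pass a few sample numbers through a sigmoid. This sketch uses PyTorch's built-in <code>torch.sigmoid</code>, and the input values are arbitrary:</p>
<pre><code class="lang-python">import torch

# Arbitrary inputs, from very negative to very positive
values = torch.tensor([-10.0, -1.0, 0.0, 1.0, 10.0])

# Each output lands between 0 and 1; an input of 0 maps to 0.5
squashed = torch.sigmoid(values)

print(squashed)
</code></pre>
<p>Large negative inputs end up close to 0, and large positive inputs end up close to 1.</p>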
<p>The circular-shaped nodes in the diagram are called neurons. At each layer of the neural network, the weights are multiplied with the input data.</p>
<p>We can increase the depth of the neural network by increasing the number of layers. We can improve the capacity of a layer by increasing the number of neurons in that layer.</p>
<h2 id="heading-understanding-our-data-set">Understanding our data set</h2>
<p>The first thing we need in order to train our neural network is the data set. </p>
<p>Since the goal of our neural network is to classify whether an image contains the number three or seven, we need to train our neural network with images of threes and sevens. So, let's build our data set.</p>
<p>Luckily, we don't have to create the data set from scratch. Our data set is already present in PyTorch. All we have to do is just download it and do some basic operations on it.</p>
<p>We need to download a data set called <a target="_blank" href="http://yann.lecun.com/exdb/mnist/"><strong>MNIST</strong></a> (Modified National Institute of Standards and Technology) from the torchvision library of PyTorch.</p>
<p>Now let's dig deeper into our data set.</p>
<h3 id="heading-what-is-the-mnist-data-set">What is the MNIST data set?</h3>
<p>The MNIST data set contains handwritten digits from zero to nine with their corresponding labels as shown below:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/09/Screenshot_20200912_115108.png" alt="Image" width="600" height="400" loading="lazy">
<em>MNIST data set</em></p>
<p>So, what we do is simply feed the neural network the images of the digits and their corresponding labels which tell the neural network that this is a three or seven.</p>
<h2 id="heading-how-to-prepare-our-data-set">How to prepare our data set</h2>
<p>The downloaded MNIST data set has images and their corresponding labels.</p>
<p>We just write the code to index out only the images with a label of three or seven. Thus, we get a data set of threes and sevens.</p>
<p>First, let's import all the necessary libraries.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> torch
<span class="hljs-keyword">from</span> torchvision <span class="hljs-keyword">import</span> datasets
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
</code></pre>
<p>We import the PyTorch library for building our neural network and the torchvision library for downloading the MNIST data set, as discussed before. The Matplotlib library is used for displaying images from our data set.</p>
<p>Now, let's prepare our data set.</p>
<pre><code class="lang-python">mnist = datasets.MNIST(<span class="hljs-string">'./data'</span>, download=<span class="hljs-literal">True</span>)

threes = mnist.data[(mnist.targets == <span class="hljs-number">3</span>)]/<span class="hljs-number">255.0</span>
sevens = mnist.data[(mnist.targets == <span class="hljs-number">7</span>)]/<span class="hljs-number">255.0</span>

len(threes), len(sevens)
</code></pre>
<p>As we learned above, everything in PyTorch is represented as tensors. So our data set is also in the form of tensors.</p>
<p>We download the data set in the first line. We then index out only the images whose target value is equal to 3 or 7, normalize them by dividing by 255, and store them separately.</p>
<p>We can check whether our indexing was done properly by running the code in the last line which gives the number of images in the threes and sevens tensor.</p>
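<p>If the indexing step feels abstract, here is the same idea on a tiny stand-in data set. The tensors below are made up and only mimic the layout of the MNIST images and targets:</p>
<pre><code class="lang-python">import torch

# Stand-in for mnist.data: four tiny 2x2 "images" with pixel values 0-255
data = torch.tensor([[[0, 255], [128, 64]],
                     [[255, 255], [0, 0]],
                     [[10, 20], [30, 40]],
                     [[5, 5], [5, 5]]], dtype=torch.uint8)

# Stand-in for mnist.targets: one label per image
targets = torch.tensor([3, 7, 3, 1])

# Boolean mask that is True wherever the label is 3
mask = targets == 3

# Keep only the matching images and scale the pixels to the 0-1 range
threes = data[mask] / 255.0

print(threes.shape, threes.dtype)
</code></pre>
<p>Only the two images labelled 3 survive the indexing, and the division turns the integer pixels into floating point values.</p>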
<p>Now let's check whether we've prepared our data set correctly. </p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">show_image</span>(<span class="hljs-params">img</span>):</span>
  plt.imshow(img)
  plt.xticks([])
  plt.yticks([])
  plt.show()

show_image(threes[<span class="hljs-number">3</span>])
show_image(sevens[<span class="hljs-number">8</span>])
</code></pre>
<p>Using the Matplotlib library, we create a function to display the images.</p>
<p>Let's do a quick sanity check by printing the shape of our tensors.</p>
<pre><code class="lang-python">print(threes.shape, sevens.shape)
</code></pre>
<p>If everything went right, you will get the size of threes and sevens as ([6131, 28, 28]) and ([6265, 28, 28]) respectively. This means that we have 6131 28×28 sized images for threes and 6265 28×28 sized images for sevens.</p>
<p>We've created two tensors with images of threes and sevens. Now we need to combine them into a single data set to feed into our neural network.</p>
<pre><code class="lang-python">combined_data = torch.cat([threes, sevens])
combined_data.shape
</code></pre>
<p>We will concatenate the two tensors using PyTorch and check the shape of the combined data set.</p>
<p>Now we will flatten the images in the data set.</p>
<pre><code class="lang-python">flat_imgs = combined_data.view((<span class="hljs-number">-1</span>, <span class="hljs-number">28</span>*<span class="hljs-number">28</span>))
flat_imgs.shape
</code></pre>
<p>We will flatten the images in such a way that each of the 28×28 sized images becomes a single row with 784 columns (28×28=784). Thus the shape gets converted to ([12396, 784]).</p>
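<p>Here is a small sketch of what the flattening does, using a made-up batch of three images instead of the real data set:</p>
<pre><code class="lang-python">import torch

# Stand-in batch: 3 "images" of size 28x28 filled with increasing numbers
imgs = torch.arange(3 * 28 * 28, dtype=torch.float32).view(3, 28, 28)

# -1 tells PyTorch to work out the number of rows on its own
flat = imgs.view(-1, 28 * 28)

print(flat.shape)

# The first pixel of the second row of image 0 is now at column 28
print(flat[0, 28] == imgs[0, 1, 0])
</code></pre>
<p>The rows of each image are simply laid out one after another, so no pixel values are changed.</p>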
<p>We need to create labels corresponding to the images in the combined data set.</p>
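<pre><code class="lang-python"><span class="hljs-comment"># 1 for each three, 0 for each seven, in the same order as combined_data</span>
target = torch.tensor([<span class="hljs-number">1</span>]*len(threes)+[<span class="hljs-number">0</span>]*len(sevens))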
target.shape
</code></pre>
<p>We assign the label 1 for images containing a three, and the label 0 for images containing a seven.</p>
<h2 id="heading-how-to-train-your-neural-network">How to train your Neural Network</h2>
<p>To train your neural network, follow these steps.</p>
<h3 id="heading-step-1-building-the-model">Step 1: Building the model</h3>
<p>Below you can see the simplest equation that shows how neural networks work:</p>
<p>                                 <strong>y = Wx + b</strong></p>
<p>Here, the term 'y' refers to our prediction, that is, three or seven. 'W' refers to our weight values, 'x' refers to our input image, and 'b' is the bias (which, along with the weights, helps in making predictions).</p>
<p>In short, we multiply each pixel value by its weight value, sum the results, and add the bias value.</p>
<p>The weights and bias value decide the importance of each pixel value while making predictions.  </p>
<p>We are classifying three and seven, so we have only two classes to predict. </p>
<p>So, we can predict 1 if the image is a three and 0 if the image is a seven. The prediction we get from the equation above may be any real number, but we need to make our model (neural network) predict a value between 0 and 1. </p>
<p>This allows us to create a threshold of 0.5. That is, if the predicted value is less than 0.5 then it is a seven. Otherwise it is a three.</p>
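<p>A minimal sketch of this thresholding, using made-up prediction values (an assumption for illustration):</p>
<pre><code class="lang-python">import torch

# Hypothetical model outputs for four images, each between 0 and 1
pred = torch.tensor([0.91, 0.12, 0.50, 0.49])

# 0.5 or more counts as a three (label 1), anything below as a seven (label 0)
labels = pred.ge(0.5).long()

print(labels)
</code></pre>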
<p>We use a sigmoid function to get a value between 0 and 1.</p>
<p>We will create a function for sigmoid using the same equation shown earlier. Then we pass in the values from the neural network into the sigmoid.</p>
<p>We will create a single layer neural network.</p>
<p>Writing loops to multiply each weight value with each pixel in the image would be very slow. Instead, we can do the whole multiplication in one go by using matrix multiplication.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sigmoid</span>(<span class="hljs-params">x</span>):</span> <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>/(<span class="hljs-number">1</span>+torch.exp(-x))

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">simple_nn</span>(<span class="hljs-params">data, weights, bias</span>):</span> <span class="hljs-keyword">return</span> sigmoid((data@weights) + bias)
</code></pre>
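<p>To convince ourselves that matrix multiplication really does the same work as the loops, we can compare the two on a small random batch. The batch size of 4 and the random values are assumptions for this sketch:</p>
<pre><code class="lang-python">import torch

torch.manual_seed(0)
data = torch.rand(4, 784)      # stand-in for 4 flattened images
weights = torch.randn(784, 1)  # one weight per pixel

# Loop version: one weighted sum per image
looped = torch.stack([(img * weights[:, 0]).sum() for img in data]).unsqueeze(1)

# Matrix multiplication computes all the weighted sums at once
matmul = data @ weights

print(matmul.shape)
print(torch.allclose(looped, matmul, atol=1e-3))
</code></pre>
<p>The result has one row per image and one column per output neuron, which is exactly the shape our single layer needs.</p>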
<h3 id="heading-step-2-defining-the-loss">Step 2: Defining the loss</h3>
<p>Now, we need a loss function to calculate how far our predicted value is from the ground truth. </p>
<p>For example, if the predicted value is 0.3 but the ground truth is 1, then our loss is very high. So our model will try to reduce this loss by updating the weights and bias so that our predictions become close to the ground truth.</p>
<p>We will be using mean squared error to check the loss value. Mean squared error finds the mean of the square of the difference between the predicted value and the ground truth.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">error</span>(<span class="hljs-params">pred, target</span>):</span> <span class="hljs-keyword">return</span> ((pred-target)**<span class="hljs-number">2</span>).mean()
</code></pre>
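<p>As a quick worked example, here is the same error function applied to two made-up predictions where the ground truth is 1 in both cases:</p>
<pre><code class="lang-python">import torch

def error(pred, target):
    return ((pred - target) ** 2).mean()

# Hypothetical predictions: one far from the truth, one close to it
pred = torch.tensor([0.3, 0.9])
target = torch.tensor([1.0, 1.0])

# Squared differences are 0.49 and 0.01, so their mean is 0.25
loss = error(pred, target)

print(loss)
</code></pre>
<p>The prediction of 0.3 contributes almost all of the loss, because squaring punishes large errors much more than small ones.</p>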
<h3 id="heading-step-3-initialize-the-weight-values">Step 3: Initialize the weight values</h3>
<p>We just randomly initialize the weights and bias. Later, we will see how these values are updated to get the best predictions.</p>
<pre><code class="lang-python">w = torch.randn((flat_imgs.shape[<span class="hljs-number">1</span>], <span class="hljs-number">1</span>), requires_grad=<span class="hljs-literal">True</span>)
b = torch.randn((<span class="hljs-number">1</span>, <span class="hljs-number">1</span>), requires_grad=<span class="hljs-literal">True</span>)
</code></pre>
<p>The shape of the weight values should be in the following form:</p>
<p>(Number of neurons in the previous layer, number of neurons in the next layer)</p>
<p>We use a method called gradient descent to update our weights and bias to make the maximum number of correct predictions.</p>
<p>Our goal is to minimize our loss, and the gradients tell us in which direction to change each weight and bias to do that.</p>
<p>We need to take the derivative of the loss function with respect to each and every weight and bias. Then we subtract a small multiple of this value from our weights and bias. </p>
<p>In this way, our weights and bias values are updated in such a way that our model makes a good prediction.</p>
<p>Updating a parameter for optimizing a function is not a new thing – you can optimize any arbitrary function using gradients.</p>
<p>We've set a special parameter (called requires_grad) to True so that PyTorch calculates the gradients for the weights and bias.</p>
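<p>To see gradient descent on its own, here is a minimal sketch that minimizes a simple function instead of a neural network loss. The function, the starting point, and the step size of 0.1 are all arbitrary choices for this example:</p>
<pre><code class="lang-python">import torch

# Minimize f(x) = (x - 3)^2, whose lowest point is at x = 3
x = torch.tensor(0.0, requires_grad=True)
lr = 0.1  # step size, chosen arbitrarily

for _ in range(100):
    loss = (x - 3) ** 2
    loss.backward()        # stores d(loss)/dx in x.grad
    with torch.no_grad():
        x -= lr * x.grad   # move against the gradient
    x.grad.zero_()         # clear the gradient for the next step

print(x.item())
</code></pre>
<p>After enough steps, x ends up very close to 3. Our neural network does the same thing, only with 784 weights and one bias instead of a single number.</p>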
<h3 id="heading-step-4-update-the-weights">Step 4: Update the weights</h3>
<p>If our prediction does not come close to the ground truth, that means that we've made an incorrect prediction. This means that our weights are not correct. So we need to update our weights until we get good predictions.</p>
<p>For this purpose, we put all of the above steps inside a for loop and allow it to iterate any number of times we wish. </p>
<p>At each iteration, the loss is calculated and the weights and biases are updated to get a better prediction on the next iteration.</p>
<p>Thus our model becomes better after each iteration by finding the optimal weight values for the task at hand.</p>
<p>Each task requires a different set of weight values, so we can't expect our neural network trained for classifying animals to perform well on musical instrument classification.</p>
<p>This is what our model training looks like:</p>
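<pre><code class="lang-python"><span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(<span class="hljs-number">2000</span>):
  <span class="hljs-comment"># Forward pass: predictions and mean squared error loss</span>
  pred = simple_nn(flat_imgs, w, b)
  loss = error(pred, target.unsqueeze(<span class="hljs-number">1</span>))
  <span class="hljs-comment"># Backward pass: gradients of the loss with respect to w and b</span>
  loss.backward()

  <span class="hljs-comment"># Gradient descent step, kept out of the autograd history</span>
  <span class="hljs-keyword">with</span> torch.no_grad():
    w -= <span class="hljs-number">0.001</span>*w.grad
    b -= <span class="hljs-number">0.001</span>*b.grad

  <span class="hljs-comment"># Reset the gradients so they don't accumulate across iterations</span>
  w.grad.zero_()
  b.grad.zero_()

print(<span class="hljs-string">"Loss: "</span>, loss.item())
</code></pre>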
<p>We will calculate the predictions and store them in the 'pred' variable by calling the function that we created earlier. Then we calculate the mean squared error loss.</p>
<p>Then, we will calculate all the gradients for our weights and bias and update the value using those gradients. </p>
<p>We've multiplied the gradients by 0.001, which is called the learning rate. This value decides the rate at which our model will learn. If it is too low, then the model will learn slowly, or in other words, the loss will be reduced slowly.</p>
<p>If the learning rate is too high, our model will not be stable, jumping between a wide range of loss values. This means it will fail to converge.</p>
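<p>The effect of the learning rate is easy to see on a toy problem. This sketch runs gradient descent on f(x) = x^2 with three different learning rates. The function, the starting point, and the three rates are assumptions picked for illustration:</p>
<pre><code class="lang-python">import torch

def descend(lr, steps=20):
    # Run gradient descent on f(x) = x^2 from x = 1 and return the final loss
    x = torch.tensor(1.0, requires_grad=True)
    for _ in range(steps):
        loss = x ** 2
        loss.backward()
        with torch.no_grad():
            x -= lr * x.grad
        x.grad.zero_()
    return (x ** 2).item()

small = descend(0.001)  # too low: the loss barely moves
good = descend(0.1)     # reasonable: the loss drops quickly
large = descend(1.1)    # too high: every step overshoots further

print(small, good, large)
</code></pre>
<p>With the largest rate the loss grows instead of shrinking, which is what failing to converge looks like.</p>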
<p>We repeat the above steps 2000 times, and each time our model tries to reduce the loss by updating the weights and bias values.</p>
<p>We should zero out the gradients at the end of each loop or epoch, because PyTorch adds new gradients to the ones already stored. Without this step, gradients left over from earlier iterations would affect our model's learning. </p>
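<p>Here is a small sketch of that accumulation, using a single made-up parameter instead of our weights:</p>
<pre><code class="lang-python">import torch

w = torch.tensor(2.0, requires_grad=True)

(w ** 2).backward()
first = w.grad.item()          # d(w^2)/dw at w = 2 is 4

(w ** 2).backward()            # no zeroing in between
accumulated = w.grad.item()    # the new gradient is added on top: 8

w.grad.zero_()
(w ** 2).backward()
after_zeroing = w.grad.item()  # back to 4

print(first, accumulated, after_zeroing)
</code></pre>
<p>The second call doubles the stored gradient even though the parameter has not changed, so an update based on it would be twice as large as intended.</p>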
<p>Since our model is very small, it doesn't take much time to train for 2000 epochs or iterations. After 2000 epochs, our neural network has given a loss value of 0.6805, which is not bad for such a small model. The exact value you get depends on the random initialization of the weights.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2020/09/Screenshot_20200915_195233.png" alt="Image" width="600" height="400" loading="lazy">
<em>Final result</em></p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>There is a lot of room for improvement in the model that we've just created.</p>
<p>This is just a simple model, and you can experiment on it by increasing the number of layers, number of neurons in each layer, or increasing the number of epochs. </p>
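<p>As one possible starting point for such experiments, here is a sketch of a two-layer version built with PyTorch's <code>torch.nn</code> module, which we did not use in this tutorial. The hidden layer size of 32 and the ReLU activation are arbitrary assumptions, not tuned choices:</p>
<pre><code class="lang-python">import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(28 * 28, 32),  # hidden layer; 32 neurons is an arbitrary choice
    nn.ReLU(),               # a different activation function, assumed here
    nn.Linear(32, 1),        # output layer with a single neuron
    nn.Sigmoid(),            # squash the output to a value between 0 and 1
)

# Stand-in for five flattened images
batch = torch.rand(5, 28 * 28)
out = model(batch)

print(out.shape)
</code></pre>
<p>The model is untrained at this point, so its outputs are meaningless until you run a training loop like the one above on it.</p>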
<p>In short, machine learning is a whole lot of magic using math. Always learn the foundational concepts. They may be boring, but eventually you will understand that those boring math concepts made cutting-edge technologies like <a target="_blank" href="https://en.wikipedia.org/wiki/Deepfake">deepfakes</a> possible.</p>
<p>You can get the complete code on <a target="_blank" href="https://github.com/bipinKrishnan/ML_from_scratch/blob/master/neural_network_pytorch.ipynb">GitHub</a> or play with the code in <a target="_blank" href="https://colab.research.google.com/github/bipinKrishnan/ML_from_scratch/blob/master/neural_network_pytorch.ipynb">Google Colab</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Learn How to Use PyTorch for Deep Learning ]]>
                </title>
                <description>
                    <![CDATA[ PyTorch is an open source machine learning library for Python that facilitates building deep learning projects. We've published a 10-hour course that will take you from being a complete beginner in PyTorch to using it to code your own GANs (generative ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/pytorch-full-course/</link>
                <guid isPermaLink="false">66b2063708bc664c3c097f0e</guid>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Thu, 30 Apr 2020 20:04:33 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2020/04/pytorch.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>PyTorch is an open source machine learning library for Python that facilitates building deep learning projects. We've published a 10-hour course that will take you from being a complete beginner in PyTorch to using it to code your own GANs (generative adversarial networks). You don't even have to know what a GAN is to start!</p>
<p>This coding-first course is approachable to people starting out with deep learning and neural networks. The course was developed by <a target="_blank" href="https://twitter.com/aakashns">Aakash</a> from <a target="_blank" href="http://jovian.ml/">Jovian.ml</a>. This is a comprehensive course and it covers the following topics:</p>
<ul>
<li>PyTorch Basics &amp; Linear Regression</li>
<li>Image Classification with Logistic Regression</li>
<li>Training Deep Neural Networks on a GPU with PyTorch</li>
<li>Image Classification using Convolutional Neural Networks</li>
<li>Residual Networks, Data Augmentation and Regularization</li>
<li>Training Generative Adversarial Networks (GANs)</li>
</ul>
<p>Code and detailed notes accompany each section of this course. You can access the code in the Jupyter Notebooks that are provided. This allows you to try the code yourself at each step of the way.</p>
<p>If you have been wanting to learn more about deep learning but haven't known where to start, this is a great place to begin. It will be helpful to have a basic understanding of Python before you start.</p>
<p>You can watch the course below or <a target="_blank" href="https://www.youtube.com/watch?v=GIsg-ZUy0MY">on the freeCodeCamp.org YouTube channel</a>.</p>
<div class="embed-wrapper">
        <iframe width="560" height="315" src="https://www.youtube.com/embed/GIsg-ZUy0MY" style="aspect-ratio: 16 / 9; width: 100%; height: auto;" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" loading="lazy"></iframe></div>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Learn to apply deep learning with PyTorch in this full course ]]>
                </title>
                <description>
                    <![CDATA[ In this complete course from Fawaz Sammani you will learn the key concepts behind deep learning and how to apply the concepts to a real-life project using PyTorch.   First, you will learn the theoretical concepts you need to know for building a chatb... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/applied-deep-learning-with-pytorch-full-course/</link>
                <guid isPermaLink="false">66b2008d09c44225ad2c395c</guid>
                
                    <category>
                        <![CDATA[ Deep Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ pytorch ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Thu, 31 Jan 2019 18:30:59 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/ghost/2019/01/applied-deep-learning.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In this complete course from Fawaz Sammani you will learn the key concepts behind deep learning and how to apply the concepts to a real-life project using PyTorch.  </p>
<p>First, you will learn the theoretical concepts you need to know for building a chatbot, which include RNNs, LSTMs and Sequence Models with Attention.</p>
<p>Then you will learn about PyTorch, a very powerful and advanced deep learning library. You will learn how to install PyTorch and how to use it.</p>
<p>Finally, the biggest part of the course shows how to apply the concepts to build a chatbot in PyTorch.</p>
<p>You can watch the tutorial on the <a target="_blank" href="https://www.youtube.com/watch?v=CNuI8OWsppg">freeCodeCamp.org YouTube channel</a> (6 hour watch).</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
