<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ Jose Vicente Nunez - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ Jose Vicente Nunez - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Sun, 24 May 2026 16:29:50 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/author/josevnz/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How to Simplify Python Library RPM Packaging with Mock and Podman ]]>
                </title>
                <description>
                    <![CDATA[ Packaging libraries and applications written in Python comes with its challenges. And while virtual environments are great for controlling and standardizing installations, there are some scenarios where using them may not be the best. For example, sa... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/simplify-python-library-rpm-packaging-with-mock-and-podman/</link>
                <guid isPermaLink="false">67880c8c282408c6e731883a</guid>
                
                    <category>
                        <![CDATA[ Python 3 ]]>
                    </category>
                
                    <category>
                        <![CDATA[ rpm ]]>
                    </category>
                
                    <category>
                        <![CDATA[ mock ]]>
                    </category>
                
                    <category>
                        <![CDATA[ packaging ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Devops ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Wed, 15 Jan 2025 19:29:16 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1736952806487/e25f259a-71e0-4998-ad29-b5da286e3fba.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Packaging libraries and applications written in Python comes with its challenges. And <a target="_blank" href="https://docs.python.org/3/tutorial/venv.html">while virtual environments are great</a> for controlling and standardizing installations, there are some scenarios where using them may not be the best.</p>
<p>For example, say you need to install a Python library system wide. You could try to create a virtual environment on a shared well-known directory, or you could modify the environment variable <a target="_blank" href="https://docs.python.org/3/using/cmdline.html">PYTHONPATH</a> to change where to look for packages.</p>
<p>But it may be simpler with a package manager like <a target="_blank" href="https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/8/html/packaging_and_distributing_software/introduction-to-rpm_packaging-and-distributing-software">RedHat RPM</a> or <a target="_blank" href="https://www.dpkg.org/">Debian DPKG</a>, which can also help you keep track of dependencies and can even check whether a package’s contents were tampered with after installation by verifying their checksums.</p>
<p>Also, system administration tools written in Python often require an interpreter with all the required libraries ready to go. For example, imagine a system Python with the popular <a target="_blank" href="https://numpy.org/">numpy</a> module installed by default: a tool can then simply import the module without initializing any virtual environment.</p>
<p>For the sake of argument, say you need to go the RPM packaging route. You’ll quickly realize that your RPM package has runtime dependencies (libraries that your Python library needs to run once installed) and build dependencies (libraries you need to build your library but that are not required to use it).</p>
<p>In particular, <em>build dependencies must be installed on the machines where you package your application</em>. For example, look at the “BuildRequires” tags in the poetry RPM spec from RedHat (showing a fragment here):</p>
<pre><code class="lang-plaintext"># This patch moves the vendored requires definition
# from vendors/pyproject.toml to pyproject.toml
# Intentionally contains the removed hunk to prevent patch aging
Patch1:         poetry-core-1.6.1-devendor.patch

BuildArch:      noarch
BuildRequires:  python3-devel
BuildRequires:  pyproject-rpm-macros

%if %{with tests}
# for tests (only specified via poetry poetry.dev-dependencies with pre-commit etc.)
BuildRequires:  python3-build
BuildRequires:  python3-pytest
BuildRequires:  python3-pytest-mock
BuildRequires:  python3-setuptools
BuildRequires:  python3-tomli-w
BuildRequires:  python3-virtualenv
BuildRequires:  gcc 
BuildRequires:  git-core
%endif
</code></pre>
<p>To complicate things further, you may:</p>
<ul>
<li><p>Need to build your library for a totally different OS than the one you have installed (say you have Fedora 42 but need an RPM for Alma Linux 9.5)</p>
</li>
<li><p>Need to install an RPM that comes from a dubious source, and you want to make sure it doesn’t break your system while the packaging process is running (see the RPM <a target="_blank" href="https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/">scriptlets</a>).</p>
</li>
</ul>
<h3 id="heading-prerequisites">Prerequisites</h3>
<p>In this tutorial, I’ll show you how you can handle those concerns using an Open Source tool called <a target="_blank" href="https://github.com/rpm-software-management/mock">Mock</a>. But first you will need the following to be able to follow this tutorial:</p>
<ul>
<li><p>A Linux distribution that uses RPM as its packaging tool (Red Hat Enterprise Linux, Fedora, Alma Linux, Rocky, and so on)</p>
</li>
<li><p>Ability to install RPM packages on your build server (like <a target="_blank" href="https://fedoraproject.org/wiki/Using_Mock_to_test_package_builds">mock</a>, <a target="_blank" href="https://fedoraproject.org/wiki/Rpmdevtools">rpmdevtools</a>) using tools like <a target="_blank" href="https://rpm-software-management.github.io/">DNF</a> or YUM.</p>
</li>
<li><p>Understanding of how RPM packaging works (if you are unfamiliar, the <a target="_blank" href="https://fedoranews.org/alex/tutorial/rpm/">Fedora RPM guide</a> is a great starting point)</p>
</li>
<li><p>You should understand what a <a target="_blank" href="https://developers.redhat.com/blog/2018/02/22/container-terminology-practical-introduction#h.j2uq93kgxe0e">container</a> is and how <a target="_blank" href="https://docs.podman.io/en/latest/index.html">PODMAN</a> or <a target="_blank" href="https://docker.com/">Docker</a> works.</p>
</li>
<li><p>Understanding of how a <a target="_blank" href="https://docs.python.org/3/library/venv.html">Python virtual environment</a> works. We will not cover this here, but it is useful to know that <a target="_blank" href="https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/#create-and-use-virtual-environments">this alternative exists and how it works</a>.</p>
</li>
</ul>
<h3 id="heading-heres-what-well-cover">Here’s what we’ll cover:</h3>
<ul>
<li><p><a class="post-section-overview" href="#heading-why-mock">Why Mock</a>?</p>
</li>
<li><p><a class="post-section-overview" href="#heading-packaging-scenarios-with-mock-and-podman">Packaging scenarios with Mock and Podman</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-why-mock">Why Mock?</h2>
<p>As we discussed above, we already have <a target="_blank" href="https://docs.python.org/3/library/venv.html">Python virtual environments</a> – so why bother to have an RPM of the same library?</p>
<p>Well, if you want to ensure consistent deployment across different systems, RPM packaging can be beneficial. It allows for easier management and distribution of software, especially in environments where system-wide installations are preferred over virtual environments.</p>
<p>Mock can help us with that. From the Mock Git README:</p>
<blockquote>
<p><em>A 'simple'</em> <a target="_blank" href="https://en.wikipedia.org/wiki/Chroot"><em>chroot</em></a> <em>build environment manager for building RPMs.</em></p>
<p><em>Mock is used by the Fedora Build system to populate a chroot environment, which is then used in building a source-RPM (SRPM). It can be used for long-term management of a chroot environment, but generally a chroot is populated (using</em> <a target="_blank" href="https://rpm-software-management.github.io/"><em>DNF</em></a><em>), an SRPM is built in the chroot to generate binary RPMs, and the chroot is then discarded.</em></p>
</blockquote>
<p><strong>This is very important:</strong> it means mock will install dependencies on a <a target="_blank" href="https://en.wikipedia.org/wiki/Chroot">chroot</a> environment, separated from the regular system, which will be discarded once the packaging is done.</p>
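<p>As a rough sketch of that lifecycle outside of any container: the chroot name <code>fedora-40-x86_64</code> and the SRPM path below are assumptions for illustration only (<code>mock --list-chroots</code> shows the configurations your installation actually ships), and your user must belong to the <code>mock</code> group.</p>
<pre><code class="lang-shell"># Hypothetical values, adjust both to your environment
chroot_name="fedora-40-x86_64"
srpm="$HOME/rpmbuild/SRPMS/example-1.0-1.src.rpm"

if [ -x "$(command -v mock)" ]; then
    # Show the chroot configurations available on this machine
    mock --list-chroots
    if [ -f "$srpm" ]; then
        # Populate the chroot, build the SRPM inside it, then clean the chroot up
        mock --root "$chroot_name" --init
        mock --root "$chroot_name" --rebuild "$srpm"
        mock --root "$chroot_name" --clean
    fi
else
    echo "mock is not installed, skipping"
fi
</code></pre>
<p>The explicit <code>--init</code> and <code>--clean</code> calls are there only to make each phase visible; a plain <code>--rebuild</code> prepares the chroot on its own.</p>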
<p>Mock by itself doesn’t provide perfect isolation but <a target="_blank" href="https://developers.redhat.com/blog/2018/02/22/container-terminology-practical-introduction#h.j2uq93kgxe0e">when used with a container</a> execution framework like <a target="_blank" href="https://docs.podman.io/en/latest/index.html">PODMAN</a>, it helps to protect the integrity of your system when packaging an unknown RPM:</p>
<blockquote>
<p>Mock needs to execute some tasks under root privileges, therefore malicious RPMs can put your system at risk. Mock is not safe for unknown RPMs</p>
</blockquote>
<p>By running mock inside Podman, you get the best of both worlds, as Podman itself runs with limited privileges. Also, a Podman container can be removed automatically after execution, which helps with the cleanup.</p>
<p>Let’s see a few scenarios that demonstrate where you can use mock.</p>
<h2 id="heading-packaging-scenarios-with-mock-and-podman">Packaging Scenarios with Mock and Podman</h2>
<h3 id="heading-packaging-a-newer-version-of-the-module-on-an-older-linux-distribution">Packaging a newer version of the module on an older Linux distribution</h3>
<p>In this case, say we want to reuse the existing <a target="_blank" href="https://textual.textualize.io/">textual 0.62.0</a> package from Fedora 41 on Fedora 40. This is possible with mock, but to make it more secure we should run it inside a Podman container. This will give us more isolation from the real operating system.</p>
<p>During testing, I found that my home directory was too small when running Podman. To fix this, I created a configuration override to point Podman root storage to a bigger partition on my machine (/mnt/data/podman/):</p>
<pre><code class="lang-shell">mkdir --parents --verbose $HOME/.config/containers/
/bin/cat&lt;&lt;EOF&gt;$HOME/.config/containers/storage.conf
[storage]
driver = "overlay"
runroot = "/mnt/data/podman/"
graphroot = "/mnt/data/podman/"
EOF
</code></pre>
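<p>To double check that the override is picked up, a small sketch like the following may help. The helper name <code>configured_graphroot</code> is made up for this example, and the <code>podman info</code> call only runs when Podman is installed.</p>
<pre><code class="lang-shell"># Hypothetical helper: print the graphroot value found in a storage.conf file
configured_graphroot() {
    awk -F'"' '/^graphroot/ { print $2 }' "$1"
}

conf="$HOME/.config/containers/storage.conf"
if [ -f "$conf" ]; then
    # Value as written in the override file
    configured_graphroot "$conf"
fi
if [ -x "$(command -v podman)" ]; then
    # Storage root that Podman reports it is actually using
    podman info --format '{{.Store.GraphRoot}}'
fi
</code></pre>
<p>If both lines print the same directory, Podman is using the bigger partition.</p>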
<p>Then I realized something else: I needed to preserve the results of our artifact generation. When you run a container with the <code>--rm</code> (remove) flag, all its contents are destroyed. In our case, we want to preserve the generated RPM package files. So what we do is mount an external directory inside the Podman container using the <code>--mount</code> option: (<code>--mount type=bind,src=$HOME/tmp,target=/mnt/result</code>).</p>
<p>So far so good, right? Not quite. I found out that a Python dependency for Textual was missing too. It’s called Rich, and it needed an RPM as well. Luckily you can “chain” a list of dependencies as Source RPMS (SRPM) when building your main package, so Mock can make them available to you when preparing the main package (we must pass <code>--localrepo</code> instead of <code>--resultdir</code> and we use the <code>--chain</code> flag).</p>
<p>Now we are ready to build the package and its dependencies. This requires the following:</p>
<ol>
<li><p>Create a local directory where the RPMS will be created</p>
</li>
<li><p>Run Podman in interactive mode so we can execute commands inside it</p>
</li>
<li><p>Install mock inside Podman using dnf.</p>
</li>
<li><p>Create a special user called mockbuilder to run mock and become that user</p>
</li>
<li><p>Execute mock passing the chain</p>
</li>
</ol>
<pre><code class="lang-shell">mkdir --parent --verbose $HOME/tmp
podman run --mount type=bind,src=$HOME/tmp,target=/mnt/result --rm --privileged --interactive --tty fedora:40 bash
dnf install -y mock
useradd mockbuilder
usermod -a -G mock mockbuilder
chown mockbuilder /mnt/result/
su - mockbuilder
mock --localrepo /mnt/result/ --chain https://download.fedoraproject.org/pub/fedora/linux/releases/41/Everything/source/tree/Packages/p/python-rich-13.7.1-5.fc41.src.rpm https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Everything/source/tree/Packages/p/python-textual-0.62.0-2.fc41.src.rpm
</code></pre>
<p>For example, on my Raspberry Pi 4 with Fedora 40, the final output looks like this:</p>
<pre><code class="lang-shell">...
INFO: Success building python-textual-0.62.0-2.fc41.src.rpm
INFO: Results out to: /mnt/result/results/default
INFO: Packages built: 2
INFO: Packages successfully built in this order:
INFO: /tmp/tmpc6651dxo/python-rich-13.7.1-5.fc41.src.rpm
INFO: /tmp/tmpc6651dxo/python-textual-0.62.0-2.fc41.src.rpm
</code></pre>
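<p>Since the container writes into the bind-mounted directory, the RPMs end up under <code>$HOME/tmp</code> on the host. The sketch below derives the directory for a given SRPM. It assumes the layout seen in this example (local repository, then <code>results</code>, then the config name, then the SRPM name without its <code>.src.rpm</code> suffix), and <code>result_dir_for</code> is a made-up helper name.</p>
<pre><code class="lang-shell"># Hypothetical helper: map an SRPM path or URL to the directory holding its built RPMs
# Arguments: local repository directory, SRPM path or URL, optional mock config name
result_dir_for() {
    localrepo="${1%/}"        # drop a trailing slash, if any
    config="${3:-default}"    # the log above reports results under "default"
    echo "$localrepo/results/$config/$(basename "$2" .src.rpm)"
}

rich_dir="$(result_dir_for "$HOME/tmp" python-rich-13.7.1-5.fc41.src.rpm)"
echo "$rich_dir"
if [ -d "$rich_dir" ]; then
    # List the generated RPMs and build logs
    ls -l "$rich_dir"
fi
</code></pre>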
<p>Outside the container, we can test the installation by installing both Rich and Textual (you need root for this):</p>
<pre><code class="lang-shell">josevnz@raspberypi1:~$ sudo dnf install -y /home/josevnz/tmp/results/default/python-rich-13.7.1-5.fc41/python3-rich-13.7.1-5.fc40.noarch.rpm /home/josevnz/tmp/results/default/python-textual-0.62.0-2.fc41/python3-textual-doc-0.62.0-2.fc40.noarch.rpm /home/josevnz/tmp/results/default/python-textual-0.62.0-2.fc41/python3-textual-0.62.0-2.fc40.noarch.rpm
...
Installed:
  python3-linkify-it-py-2.0.3-1.fc40.noarch            python3-markdown-it-py-3.0.0-4.fc40.noarch    python3-markdown-it-py+linkify-3.0.0-4.fc40.noarch  
  python3-markdown-it-py+plugins-3.0.0-4.fc40.noarch   python3-mdit-py-plugins-0.4.0-4.fc40.noarch   python3-mdurl-0.1.2-6.fc40.noarch                   
  python3-pygments-2.17.2-3.fc40.noarch                python3-rich-13.7.1-5.fc40.noarch             python3-textual-0.62.0-2.fc40.noarch                
  python3-textual-doc-0.62.0-2.fc40.noarch             python3-uc-micro-py-1.0.3-1.fc40.noarch      

Complete!
</code></pre>
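<p>A quick sanity check after the installation could look like the sketch below. The helper <code>modules_importable</code> is a made-up name, and the RPM queries only run on a system where <code>rpm</code> is available.</p>
<pre><code class="lang-shell"># Hypothetical helper: report whether the system Python can find each module
modules_importable() {
    python3 -c 'import importlib.util as u, sys; print({n: u.find_spec(n) is not None for n in sys.argv[1:]})' "$@"
}

if [ -x "$(command -v rpm)" ]; then
    for pkg in python3-rich python3-textual; do
        rpm --query "$pkg"     # installed name, version and release
        rpm --verify "$pkg"    # no output means the files match the package metadata
    done
fi

# No virtual environment is activated here, this is the system interpreter
modules_importable rich textual
</code></pre>
<p>If both modules are reported as <code>True</code>, tools that rely on the system Python can import them directly.</p>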
<p>Note that the contents of the container are removed once you exit, except for the mounted volume. This is great, as we don’t have to worry about uninstalling the build packages ourselves.</p>
<p><em>But is it perfect?</em></p>
<p><em>Can you use Mock to package newer code on much older distributions?</em></p>
<p>Mock works really well as long as your dependencies aren't too far away from the version you are running. For example, say you want to build the RPMS for Fedora 37 instead of Fedora 40:</p>
<pre><code class="lang-shell">sudo rm -rf $HOME/tmp/results/*
podman run --mount type=bind,src=$HOME/tmp,target=/mnt/result --rm --privileged --interactive --tty fedora:37 bash
dnf install -y mock
useradd mockbuilder &amp;&amp; usermod -a -G mock mockbuilder &amp;&amp; chown mockbuilder /mnt/result/ &amp;&amp; su - mockbuilder
mock --nocheck --localrepo /mnt/result/ --chain https://download.fedoraproject.org/pub/fedora/linux/releases/41/Everything/source/tree/Packages/p/python-rich-13.7.1-5.fc41.src.rpm https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Everything/source/tree/Packages/p/python-textual-0.62.0-2.fc41.src.rpm
...
Package python3-poetry-core-1.0.8-3.fc37.noarch is already installed.
Package python3-pytest-7.1.3-2.fc37.noarch is already installed.
Package python3-setuptools-62.6.0-3.fc37.noarch is already installed.
Error: 
 Problem: nothing provides requested (python3dist(pygments) &lt; 3~~ with python3dist(pygments) &gt;= 2.13)
</code></pre>
<p>Uh oh, Fedora 37 doesn’t provide some of the dependencies. Can we build them in a chain? I tried to add the SRPM for <a target="_blank" href="https://pygments.org/">pygments</a> (a generic syntax highlighting library for Python) before building <a target="_blank" href="https://rich.readthedocs.io/en/stable/introduction.html">rich</a>, as rich depends on it. So the dependency chain grew a little bit more:</p>
<pre><code class="lang-shell">mock --nocheck --localrepo /mnt/result/ --chain https://download.fedoraproject.org/pub/fedora/linux/releases/39/Everything/source/tree/Packages/p/python-pygments-2.15.1-4.fc39.src.rpm https://download.fedoraproject.org/pub/fedora/linux/releases/41/Everything/source/tree/Packages/p/python-rich-13.7.1-5.fc41.src.rpm https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Everything/source/tree/Packages/p/python-textual-0.62.0-2.fc41.src.rpm
</code></pre>
<p>And then I found that two more python dependencies were broken, this time for textual on Fedora 37:</p>
<pre><code class="lang-shell">...
No matching package to install: 'python3-syrupy'
No matching package to install: 'python3-time-machine'
Not all dependencies satisfied
</code></pre>
<p>Looks like a game of trial and error. <em>How bad can it be?</em></p>
<p>Several tries later, I found that <a target="_blank" href="https://github.com/syrupy-project/syrupy">Syrupy (pytest plugin)</a> added a dependency on <a target="_blank" href="https://python-poetry.org/">Poetry (packaging tool)</a>, which complicated things a little bit, as Fedora 37 expects an older version of Poetry (poetry-1.1.14-1.fc37).</p>
<p>What could you do next? Well, you could try to get a version of Syrupy that works with this older version of Poetry. But that could potentially introduce vulnerabilities on your system or force you to use a version of Syrupy that doesn't work at all with Textual because of API changes.</p>
<p>It’s easier to work your dependencies upwards rather than downwards. In this case, I decided to stop my experiment as I don’t really need an RPM for Fedora 37 myself.</p>
<h3 id="heading-building-a-newer-non-packaged-version-of-the-software">Building a newer non-packaged version of the software</h3>
<p>Can mock help us with packaging an entirely new version of a package? Textual made huge improvements and added new features in its first official release, 1.0.0. Let's see if we can take a few shortcuts to build an RPM that we can use with the system Python.</p>
<p>We will recycle the RPM Spec file from Textual we used before, but with a few modifications. First, let's prepare our sources again:</p>
<pre><code class="lang-shell">josevnz@raspberypi1:~$ podman run --mount type=bind,src=$HOME/tmp,target=/mnt/result --rm --privileged --interactive --tty fedora:40 bash
[root@ccae845daa84 /]# dnf install -y rpmdevtools
[root@ccae845daa84 /]# dnf install -y mock &amp;&amp; useradd mockbuilder &amp;&amp; usermod -a -G mock mockbuilder &amp;&amp; chown mockbuilder /mnt/result/ &amp;&amp; su - mockbuilder
[root@ccae845daa84 /]# for dep in https://download.fedoraproject.org/pub/fedora/linux/releases/41/Everything/source/tree/Packages/p/python-rich-13.7.1-5.fc41.src.rpm https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Everything/source/tree/Packages/p/python-textual-0.62.0-2.fc41.src.rpm; do rpm -ihv $dep; done
</code></pre>
<p>Then we update the <a target="_blank" href="https://rpm-software-management.github.io/rpm/manual/spec.html">RPM spec file</a> for Textual, which describes how the RPM is created, bumping the version from 0.62.0 to 1.0.0.</p>
<p>What I like to do is create a new SRPM for Textual. For that I do the following (I’m still inside the Podman container, which you can reuse as long as it keeps running):</p>
<ol>
<li><p>Install rpmdevtools and mock, as they contain a few tools I need to set up the environment to build the SRPM</p>
</li>
<li><p>Install the original SRPM for 0.62.0. Installing it doesn’t need root and unpacks the spec file and sources under ~/rpmbuild, which I can use to bootstrap the new SRPM. Steps 1 and 2 are shown just below (this is optional if you are re-using the container from the previous example):</p>
<pre><code class="lang-bash"> [root@ccae845daa84 /]# dnf install -y rpmdevtools
 [root@ccae845daa84 /]# dnf install -y mock &amp;&amp; useradd mockbuilder &amp;&amp; usermod -a -G mock mockbuilder &amp;&amp; chown mockbuilder /mnt/result/ &amp;&amp; su - mockbuilder
 [root@ccae845daa84 /]# for dep in https://download.fedoraproject.org/pub/fedora/linux/releases/41/Everything/source/tree/Packages/p/python-rich-13.7.1-5.fc41.src.rpm https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Everything/source/tree/Packages/p/python-textual-0.62.0-2.fc41.src.rpm; do rpm -ihv $dep; done
</code></pre>
</li>
<li><p>Bump the version of the package from 0.62.0 to 1.0.0 in the SPEC file that gets extracted to ~/rpmbuild/SPECS/python-textual.spec</p>
</li>
<li><p>Tell spectool to retrieve the proper compressed source tar file so we can use it to prepare a new SRPM</p>
</li>
<li><p>Recreate the SRPM so it can be used by Mock.</p>
<p> Steps 3, 4, and 5 below:</p>
</li>
</ol>
<pre><code class="lang-shell">[root@ccae845daa84 /]# sed -i 's#0.62.0#1.0.0#' ~/rpmbuild/SPECS/python-textual.spec
[root@ccae845daa84 /]# sed -i 's#%{url}/archive/v%{version}/textual-%{version}.tar.gz#%{url}/archive/refs/tags/v%{version}.tar.gz#' ~/rpmbuild/SPECS/python-textual.spec
[root@ccae845daa84 /]# spectool --get-files ~/rpmbuild/SPECS/python-textual.spec --sourcedir
Downloading: https://github.com/Textualize/textual/archive/refs/tags/v1.0.0.tar.gz
|  28.3 MiB Elapsed Time: 0:00:02                                                                                                                       
Downloaded: v1.0.0.tar.gz
[root@ccae845daa84 /]# rpmbuild -bs ~/rpmbuild/SPECS/python-textual.spec
setting SOURCE_DATE_EPOCH=1717891200
Wrote: /root/rpmbuild/SRPMS/python-textual-1.0.0-2.fc40.src.rpm
</code></pre>
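<p>Before handing the new SRPM to mock, it can be worth confirming that the version bump really landed. This is only a sketch: it assumes the default <code>~/rpmbuild</code> layout shown above and skips any file that is not present.</p>
<pre><code class="lang-shell">spec="$HOME/rpmbuild/SPECS/python-textual.spec"
srpm="$HOME/rpmbuild/SRPMS/python-textual-1.0.0-2.fc40.src.rpm"

if [ -f "$spec" ]; then
    # Version as rpm evaluates it from the edited spec file
    rpmspec --query --srpm --queryformat '%{version}\n' "$spec"
fi
if [ -f "$srpm" ]; then
    # Name, version and release stored in the rebuilt source package
    rpm --query --package --queryformat '%{NAME} %{VERSION} %{RELEASE}\n' "$srpm"
fi
</code></pre>
<p>Both commands should report 1.0.0; if the first one still prints 0.62.0, the <code>sed</code> substitution did not match.</p>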
<p>Now we can rebuild the SRPM and make sure mock can find it when running from the exposed volume:</p>
<pre><code class="lang-shell">[root@ccae845daa84 /]# cp -pv /root/rpmbuild/SRPMS/python-textual-1.0.0-2.fc40.src.rpm /tmp/
'/root/rpmbuild/SRPMS/python-textual-1.0.0-2.fc40.src.rpm' -&gt; '/tmp/python-textual-1.0.0-2.fc40.src.rpm'
[root@ccae845daa84 /]# su - mockbuilder
[mockbuilder@ccae845daa84 ~]$ ls -l /tmp/python-textual-1.0.0-2.fc40.src.rpm
-rw-r--r--. 1 root root 29612335 Jan 11 00:12 /tmp/python-textual-1.0.0-2.fc40.src.rpm
</code></pre>
<p>Moment of truth, let’s build it:</p>
<pre><code class="lang-shell">[mockbuilder@ccae845daa84 ~]$ mock --nocheck --localrepo /mnt/result/ --chain https://download.fedoraproject.org/pub/fedora/linux/releases/41/Everything/source/tree/Packages/p/python-rich-13.7.1-5.fc41.src.rpm /tmp/python-textual-1.0.0-2.fc40.src.rpm
Wrote: /builddir/build/SRPMS/python-textual-1.0.0-2.fc40.src.rpm
Wrote: /builddir/build/RPMS/python3-textual-1.0.0-2.fc40.noarch.rpm
Wrote: /builddir/build/RPMS/python3-textual-doc-1.0.0-2.fc40.noarch.rpm
INFO: Done(/tmp/python-textual-1.0.0-2.fc40.src.rpm) Config(default) 2 minutes 38 seconds
</code></pre>
<p>Finally, test the installation by installing the RPMS outside the container:</p>
<pre><code class="lang-shell">josevnz@raspberypi1:~$ sudo dnf install /home/josevnz/tmp/results/default/python-rich-13.7.1-5.fc41/python3-rich-13.7.1-5.fc40.noarch.rpm /home/josevnz/tmp/results/default/python-textual-1.0.0-2.fc40/python3-textual-doc-1.0.0-2.fc40.noarch.rpm /home/josevnz/tmp/results/default/python-textual-1.0.0-2.fc40/python3-textual-1.0.0-2.fc40.noarch.rpm
Last metadata expiration check: 3:42:37 ago on Fri 10 Jan 2025 03:50:49 PM EST.
Package python3-rich-13.7.1-5.fc40.noarch is already installed.
Dependencies resolved.
=========================================================================================================================================================
 Package                                    Architecture                 Version                                Repository                          Size
=========================================================================================================================================================
Upgrading:
 python3-textual                            noarch                       1.0.0-2.fc40                           @commandline                       1.3 M
 python3-textual-doc                        noarch                       1.0.0-2.fc40                           @commandline                        24 M
Installing dependencies:
 python3-platformdirs                       noarch                       3.11.0-3.fc40                          fedora                              46 k

Transaction Summary
=========================================================================================================================================================
Install  1 Package
Upgrade  2 Packages

Total size: 25 M
Total download size: 46 k
Is this ok [y/N]: y
Downloading Packages:
python3-platformdirs-3.11.0-3.fc40.noarch.rpm                                                                             53 kB/s |  46 kB     00:00    
---------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                     41 kB/s |  46 kB     00:01     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                                 1/1 
  Installing       : python3-platformdirs-3.11.0-3.fc40.noarch                                                                                       1/5 
  Upgrading        : python3-textual-1.0.0-2.fc40.noarch                                                                                             2/5 
  Upgrading        : python3-textual-doc-1.0.0-2.fc40.noarch                                                                                         3/5 
  Cleanup          : python3-textual-0.62.0-2.fc40.noarch                                                                                            4/5 
  Cleanup          : python3-textual-doc-0.62.0-2.fc40.noarch                                                                                        5/5 
  Running scriptlet: python3-textual-doc-0.62.0-2.fc40.noarch                                                                                        5/5 

Upgraded:
  python3-textual-1.0.0-2.fc40.noarch                                       python3-textual-doc-1.0.0-2.fc40.noarch                                      
Installed:
  python3-platformdirs-3.11.0-3.fc40.noarch                                                                                                              

Complete!
</code></pre>
<p><em>Not bad</em>, we can now build sophisticated <a target="_blank" href="https://en.wikipedia.org/wiki/Text-based_user_interface">TUIs</a> using Textual and the system Python, without creating a virtual environment or forcing the installation of unwanted packages on our build server.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>As you can see, mock is a very valuable tool that can help you automate packaging Python libraries that are not yet available on your platform. It allows you to automate getting dependencies for the RPM and alerts you when some are missing on your platform.</p>
<p>As an added bonus, the fact that you can run it inside Podman gives you even more isolation from RPMs that could be dangerous when executed as root.</p>
<h3 id="heading-extra-documentation-rtfm-read-the-fine-manual">Extra documentation (RTFM, Read The Fine Manual)</h3>
<ul>
<li><p><a target="_blank" href="https://gitlab.com/redhat/centos-stream/rpms/pyproject-rpm-macros/">RPM-Macros</a></p>
</li>
<li><p><a target="_blank" href="https://rpm-software-management.github.io/mock/">Mock</a></p>
</li>
<li><p><a target="_blank" href="https://fedoraproject.org/wiki/Rpmdevtools">RPM dev tools</a></p>
</li>
<li><p><a target="_blank" href="https://docs.fedoraproject.org/en-US/packaging-guidelines/Python_201x/#_macros">RPM macro documentation</a></p>
</li>
<li><p><a target="_blank" href="https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/10-beta/html/packaging_and_distributing_software/packaging-python-3-rpms">Packaging Python3 RPMS</a></p>
</li>
<li><p><a target="_blank" href="https://packaging.python.org/en/latest/specifications/">PyPA specifications</a></p>
</li>
<li><p><a target="_blank" href="https://koji.fedoraproject.org/koji/buildinfo?buildID=2466451">Fedora Textual RPM</a></p>
</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Data Analysis with Python – How I Analyzed My Empire State Building Run-Up Performance ]]>
                </title>
                <description>
                    <![CDATA[ A tower running race is a race that you run up the stairs of a building. These happen around the world. I got the chance to participate in the Empire State Run Up in NYC, 2023 edition. The Empire State Building Run-Up (ESBRU)—the world’s first and m... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/empire-state-building-run-up-analysis-with-python/</link>
                <guid isPermaLink="false">66d85138ec0a9800d5b8e6e6</guid>
                
                    <category>
                        <![CDATA[ data analysis ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Wed, 08 May 2024 16:56:28 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2024/05/empire_state_runup-1.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>A <a target="_blank" href="https://en.wikipedia.org/wiki/Tower_running">tower running race</a> is a race in which you run up the stairs of a building. These races happen around the world. I got the chance to participate in the Empire State Run Up in NYC, 2023 edition.</p>
<blockquote>
<p>The Empire State Building Run-Up (ESBRU)—the world’s first and most famous tower race—challenges runners from near and far to race up its famed 86 flights—1,576 stairs.</p>
<p>While visitors can reach the building’s Observatory via elevator in under one minute, the fastest runners have covered the 86 floors by foot in about 10 minutes.</p>
<p>Leaders in the sport of professional tower-running converge at the Empire State Building in what some consider the ultimate test of endurance.</p>
</blockquote>
<p>I got lucky and managed to participate in this race. A few days after finishing, I realized that I wanted to know more about my performance, and what I could have done better.</p>
<p>So naturally I went to the race organizer website and started looking at the numbers. And it was slow and tedious, plus it brought up more issues:</p>
<ol>
<li><p>Getting the data for offline analysis is difficult. You can see your results and others for comparison, but I found that the tools didn't offer an option to download the raw data, and they were clumsy to use.</p>
</li>
<li><p>Most tools out there to analyze race results are paid or do not apply to this type of race. Knowing what to expect reduces your anxiety, allows you to train better, and keeps your expectations in check.</p>
</li>
</ol>
<p>By now you've probably guessed that you can solve the data retrieval issues and post-race analysis using low-cost Open Source tools. This also allows you to apply different techniques to learn about the race and, depending on the quality of the data, even make performance predictions.</p>
<p>This is a very personal piece for me. I will share my race results and give you my biased opinion about the race. 😁</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><p><a class="post-section-overview" href="#heading-how-i-ended-up-running-to-the-top-of-the-empire-state-building">How I Ended Up Running to the Top of the Empire State Building</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-you-need-to-follow-this-tutorial">What You Need to Follow this Tutorial</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-get-the-data-using-web-scraping">How to Get the Data using Web Scraping</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-clean-up-the-data">How to Clean Up the Data</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-analyze-the-data">How to Analyze the Data</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-visualize-the-results">How to Visualize the Results</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-run-the-applications">How to Run the Applications</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-else-can-we-learn">What Else Can We Learn?</a></p>
</li>
</ol>
<h2 id="heading-how-i-ended-up-running-to-the-top-of-the-empire-state-building">How I Ended Up Running to the Top of the Empire State Building</h2>
<p>Many of us have run a regular race at some point in our lives, and there are many distances to choose from, like <em>5K</em>, <em>10K</em>, <em>Half</em> <em>Marathon</em>, and <em>Full</em> <em>Marathon</em>. But none of them gives you a way to estimate how you will perform while running up the stairs all the way to the top of one of the most famous buildings in the world.</p>
<p>If you have ever been at the base of the skyscrapers in New York City and have looked up, you get the idea. Picture yourself running up the stairs, all the way to the top, without stopping.</p>
<p>Getting accepted is tough, because unlike a race like the <a target="_blank" href="https://en.wikipedia.org/wiki/New_York_City_Marathon">New York Marathon</a>, the Empire State Building can only accommodate around 500 runners (or should I say <em>climbers</em>?).</p>
<p>Add to that the fact that the demand to participate is high, and you can see that your chances of getting in through the lottery are pretty slim (I read somewhere that there are only 50 lottery positions for more than 5,000 applicants).</p>
<p>You can imagine my surprise when I got an email saying that I was selected to participate after trying for 4 years in a row.</p>
<p>I panicked. Have you ever been at the base of the Empire State and looked up? Some days when it's cloudy you can't even see the top of the building.</p>
<p>I wasn't unprepared. But I had to adjust my training routine to be ready for this challenge with a small window of two months, and no experience doing a tower run.</p>
<p>The day of the race came and this is how it went for me:</p>
<ul>
<li><p>It was tough. I knew I had to pace myself, otherwise the race would have ended for me on the 20th floor as opposed to the 86th. You have to focus on a "keep going" mentality, regardless of how tired you feel. And then it is over, just like that.</p>
</li>
<li><p>You don't sprint, you climb 2 steps at a time at a steady pace, and you use the handrails to take weight off your legs.</p>
</li>
<li><p>No need to carb load or hydrate too much. If you do well, you will be done in around 30 minutes.</p>
</li>
<li><p>Nobody is pushing anyone. At least for non-elite racers like me, I was alone for most of the race.</p>
</li>
<li><p>I got passed, and I passed a lot of people who forgot the 'pace yourself' rule. If you sprint, you will be toast before floor 25, for sure.</p>
</li>
</ul>
<p>I had a blast and got great satisfaction from having this race ticked off my bucket list, the same way I felt after running the <a target="_blank" href="https://results.nyrr.org/event/40/finishers#search=Jose%2520Nunez%2520Zuleta">NYC Marathon</a>.</p>
<p>It was time now to do a post-race analysis using several of my favorite Open Source tools, which I'll explain in the next section.</p>
<h2 id="heading-what-you-need-to-follow-this-tutorial">What You Need to Follow this Tutorial</h2>
<p>Like the race, most of the challenges in writing this application were mental. You only need to break the main problem down into smaller pieces and then tackle one piece at a time:</p>
<ol>
<li><p>Get the data by scraping the website (very few sites allow you to export race results as a CSV).</p>
</li>
<li><p>Clean up the data, normalize it, and make it ready for automatic processing.</p>
</li>
<li><p>Ask questions. Then translate those questions into code and tests, ideally using statistics to get reliable answers.</p>
</li>
<li><p>Present the results. A UI (text or graphical) will do wonders, and a text UI in particular consumes very few resources, but charts speak volumes too.</p>
</li>
</ol>
<p>You should have some experience in a programming language to get the most out of this article. My code is written in Python (you will need version 3.8+) and runs on Linux (I used the <a target="_blank" href="https://fedoraproject.org/">Fedora 37</a> distribution).</p>
<p>In a nutshell, I want to show that it is possible to do all the above with Open Source technologies. Then you can reuse this knowledge for other projects, not just for tower race analyses. 😅</p>
<p>I strongly recommend that you <a target="_blank" href="https://github.com/josevnz/tutorials/tree/main/docs/EmpireStateRunUp">get the source code</a> (It is <a target="_blank" href="https://github.com/josevnz/tutorials/tree/main?tab=Apache-2.0-1-ov-file#readme">Open Source</a>!). Get your hands dirty, break the scripts, and have fun. You will need Git to clone the repository:</p>
<pre><code class="lang-shell">git clone https://github.com/josevnz/tutorials.git
cd tutorials/docs/EmpireStateRunUp/
python -m venv ~/virtualenv/EmpireStateRunUp
. ~/virtualenv/EmpireStateRunUp/bin/activate
pip install --upgrade pip
pip install --upgrade build
pip install --upgrade wheel
pip install --editable .
</code></pre>
<p>Or if you just want to run the code while reading this tutorial (using my latest version from <a target="_blank" href="https://pypi.org/project/EmpireStateRunUp/">Pypi</a>):</p>
<pre><code class="lang-shell">python -m venv ~/virtualenv/EmpireStateRunUp
. ~/virtualenv/EmpireStateRunUp/bin/activate 
pip install --upgrade EmpireStateRunUp
</code></pre>
<p>We can now move to the next stage: getting the data.</p>
<h2 id="heading-how-to-get-the-data-using-web-scraping">How to Get the Data using Web Scraping</h2>
<p>The race results site doesn't have an export feature, and I never heard back from their support team to see if there was an alternate way to get the race data. So the only alternative left was to do some web scraping.</p>
<p>The website is pretty basic and only allows scrolling through each record, so the scraper had to page through the results and save them in a format I could use later for data analysis.</p>
<h3 id="heading-the-rules-of-web-scraping">The rules of web scraping</h3>
<p>There are 3 very simple rules:</p>
<ol>
<li><p>Rule #1: <strong>Don't do it</strong>. Data flow changes, and your scraper will break the minute you are done getting the data. It will require time and effort. <em>Lots of it</em>.</p>
</li>
<li><p>Rule #2: <strong>Re-read rule number 1</strong>. If you can't get the data in any other format, then go to rule #3.</p>
</li>
<li><p>Rule #3: <strong>Choose a good framework to automate what you can</strong> and prepare to do heavy data cleanup (also known as "give me patience for the stuff I can't control, like poorly done HTML and CSS").</p>
</li>
</ol>
<p>I decided to use <a target="_blank" href="https://www.selenium.dev/documentation/webdriver/">Selenium Web Driver</a> as it calls a real browser, like Firefox, to navigate the website. Selenium allows you to automate browser actions while you get the same rendered HTML you see when you navigate the site.</p>
<p>Selenium <em>is a complex tool</em> and will require you to spend some time experimenting with what works and what does not. Below is a simple script I wrote to get all the runners' names and race detail links in one run:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> re
<span class="hljs-keyword">from</span> time <span class="hljs-keyword">import</span> sleep

<span class="hljs-keyword">from</span> selenium <span class="hljs-keyword">import</span> webdriver
<span class="hljs-keyword">from</span> selenium.webdriver.common.by <span class="hljs-keyword">import</span> By
<span class="hljs-keyword">from</span> selenium.webdriver.firefox.options <span class="hljs-keyword">import</span> Options
<span class="hljs-keyword">from</span> selenium.webdriver.firefox.webdriver <span class="hljs-keyword">import</span> WebDriver
<span class="hljs-keyword">from</span> selenium.webdriver.support.wait <span class="hljs-keyword">import</span> WebDriverWait
<span class="hljs-keyword">from</span> selenium.webdriver.support <span class="hljs-keyword">import</span> expected_conditions
<span class="hljs-comment"># AthLinks is nice enough to post the race results and their interface is very human-friendly. Not so machine parsing friendly.</span>
RESULTS = <span class="hljs-string">"https://www.athlinks.com/event/382111/results/Event/1062909/Course/2407855/Results"</span>
LINKS = {}


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">print_links</span>(<span class="hljs-params">web_driver: WebDriver, page: int</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-keyword">for</span> a <span class="hljs-keyword">in</span> web_driver.find_elements(By.TAG_NAME, <span class="hljs-string">"a"</span>):
        href = a.get_attribute(<span class="hljs-string">'href'</span>)
        <span class="hljs-keyword">if</span> href <span class="hljs-keyword">and</span> re.search(<span class="hljs-string">'Bib'</span>, href):
            name = a.text.strip().title()
            print(<span class="hljs-string">f"Page=<span class="hljs-subst">{page}</span>, <span class="hljs-subst">{name}</span>=<span class="hljs-subst">{href.strip()}</span>"</span>)
            LINKS[name] = href.strip()


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">click</span>(<span class="hljs-params">level: int</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    button = WebDriverWait(driver, <span class="hljs-number">20</span>).until(
        expected_conditions.element_to_be_clickable((By.CSS_SELECTOR, <span class="hljs-string">f"div:nth-child(<span class="hljs-subst">{level}</span>) &gt; button"</span>)))
    driver.execute_script(<span class="hljs-string">"arguments[0].click();"</span>, button)
    sleep(<span class="hljs-number">2.5</span>)


options = Options()
options.add_argument(<span class="hljs-string">"--headless"</span>)
driver = webdriver.Firefox(options=options)
driver.get(RESULTS)
sleep(<span class="hljs-number">2.5</span>)
print_links(driver, <span class="hljs-number">1</span>)
click(<span class="hljs-number">6</span>)
print_links(driver, <span class="hljs-number">2</span>)
click(<span class="hljs-number">7</span>)
print_links(driver, <span class="hljs-number">3</span>)
click(<span class="hljs-number">7</span>)
print_links(driver, <span class="hljs-number">4</span>)
click(<span class="hljs-number">9</span>)
print_links(driver, <span class="hljs-number">5</span>)
click(<span class="hljs-number">9</span>)
print_links(driver, <span class="hljs-number">6</span>)
click(<span class="hljs-number">7</span>)
print_links(driver, <span class="hljs-number">7</span>)
click(<span class="hljs-number">7</span>)
print_links(driver, <span class="hljs-number">8</span>)
print(len(LINKS))
driver.quit()  <span class="hljs-comment"># Close the headless browser once we are done</span>
</code></pre>
<p>The code above is hardly reusable, but it gets the job done by doing the following:</p>
<ol>
<li><p>Gets the main web-page with the <code>driver.get(...)</code> method</p>
</li>
<li><p>Sleeps a little to give the page a chance to render the HTML, then collects the <code>&lt;a href</code> tags</p>
</li>
<li><p>Then finds and clicks the <code>&gt;</code> (next page) button</p>
</li>
<li><p>Does these steps a total of 8 times, as this is how many pages of results are available (each page has 50 runners)</p>
</li>
</ol>
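<p>The link filtering step can be tested without a browser. The snippet below is a minimal sketch, not the code from my scraper: <code>collect_bib_links</code> and the <code>(text, href)</code> pairs are hypothetical names, and it assumes that runner links end in <code>/Bib/NUMBER</code>, as in the log output shown in this article.</p>

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Assumption: runner detail links end in ".../Bib/NUMBER"
BIB_URL = re.compile(r"/Bib/(\d+)$")


def collect_bib_links(anchors: Iterable[Tuple[str, Optional[str]]]) -> Dict[int, str]:
    """Keep only runner links, keyed by BIB number.

    'anchors' is a hypothetical iterable of (text, href) pairs, for example
    built from a.text and a.get_attribute('href') for each anchor element.
    """
    links: Dict[int, str] = {}
    for _text, href in anchors:
        if not href:  # some anchor elements have no href attribute at all
            continue
        match = BIB_URL.search(href.strip())
        if match:
            links[int(match.group(1))] = href.strip()
    return links


sample = [
    ("Wai Ching Soh", "https://www.athlinks.com/event/382111/results/Event/1062909/Course/2407855/Bib/19"),
    ("Results", "https://www.athlinks.com/event/382111/results/Event/1062909/Course/2407855/Results"),
    ("No link", None),
]
print(collect_bib_links(sample))
```

<p>Keying the dictionary by BIB number instead of by name is a design choice of this sketch: two runners can share a name, but not a BIB.</p>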
<p>To get the full race results, I wrote <code>scraper.py</code>. It deals with navigating multiple pages and extracting the data. Here is a demonstration:</p>
<pre><code class="lang-shell">(EmpireStateRunUp) [josevnz@dmaf5 EmpireStateRunUp]$ esru_scraper /home/josevnz/temp/raw_data.csv
2023-12-30 14:05:00,987 Saving results to /home/josevnz/temp/raw_data.csv
2023-12-30 14:05:53,091 Got 377 racer results
2023-12-30 14:05:53,091 Processing BIB: 19, will fetch: https://www.athlinks.com/event/382111/results/Event/1062909/Course/2407855/Bib/19
2023-12-30 14:06:02,207 Wrote: name=Wai Ching Soh, position=1, {'name': 'Wai Ching Soh', 'url': 'https://www.athlinks.com/event/382111/results/Event/1062909/Course/2407855/Bib/19', 'overall position': '1', 'gender': 'M', 'age': 29, 'city': 'Kuala Lumpur', 'state': '-', 'country': 'MYS', 'bib': 19, '20th floor position': '1', '20th floor gender position': '1', '20th floor division position': '1', '20th floor pace': '42:30', '20th floor time': '1:42', '65th floor position': '1', '65th floor gender position': '1', '65th floor division position': '1', '65th floor pace': '54:03', '65th floor time': '7:34', 'gender position': '1', 'division position': '1', 'pace': '53:00', 'time': '10:36', 'level': 'Full Course'}
...
</code></pre>
<p>It does just minimal manipulation of the data from the web page. The purpose of this code is just to get the data as quickly as possible before the formatting changes.</p>
<p>The data cannot be used as-is yet, because it needs cleaning up. That's the next step in this article.</p>
<h2 id="heading-how-to-clean-up-the-data">How to Clean Up the Data</h2>
<p><a target="_blank" href="https://github.com/josevnz/tutorials/blob/main/docs/EmpireStateRunUp/test/raw_data.csv">Getting the data</a> is just the first battle of many more to come. <a target="_blank" href="https://en.wikibooks.org/wiki/Statistics/Data_Analysis/Data_Cleaning">You will notice inconsistencies in the data</a> and missing values. To get reliable numeric results, you need to make some assumptions.</p>
<p>Luckily for me, the dataset is very small (375+ records, one for each runner) so I was able to come up with a few rules to tidy up the <a target="_blank" href="https://github.com/josevnz/tutorials/blob/main/docs/EmpireStateRunUp/empirestaterunup/results-first-level-2023.csv">data file</a> I was going to use during my analysis.</p>
<p>I also supplemented my data with another data set that has the <a target="_blank" href="https://github.com/josevnz/tutorials/blob/main/docs/EmpireStateRunUp/empirestaterunup/country_codes.csv">3-letter country codes</a> as well as other details, for a nicer presentation.</p>
<p>The <code>raw_csv_read(raw_file: Path) -&gt; Iterable[Dict[str, Any]]</code> function does the heavy work of fixing inconsistencies in the data before it is saved in CSV format.</p>
<p>There are no hard rules here, as the cleanup depends heavily on the data set. For example, to figure out which wave each runner was assigned to, I had to make some assumptions based on what I saw the day of the race.</p>
<p>Let me show you what I mean with some code:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> datetime
<span class="hljs-keyword">from</span> enum <span class="hljs-keyword">import</span> Enum
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Dict

<span class="hljs-string">"""
Runners started on waves, but for basic analysis, we will assume all runners were able to run
at the same time.
"""</span>
BASE_RACE_DATETIME = datetime.datetime(
    year=<span class="hljs-number">2023</span>,
    month=<span class="hljs-number">9</span>,
    day=<span class="hljs-number">4</span>,
    hour=<span class="hljs-number">20</span>,
    minute=<span class="hljs-number">0</span>,
    second=<span class="hljs-number">0</span>,
    microsecond=<span class="hljs-number">0</span>
)

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Waves</span>(<span class="hljs-params">Enum</span>):</span>
    <span class="hljs-string">"""
    22 Elite male
    17 Elite female
    There are some holes, so either some runners did not show up or there was spare capacity.
    https://runsignup.com/Race/EmpireStateBuildingRunUp/Page-4
    https://runsignup.com/Race/EmpireStateBuildingRunUp/Page-5
    I guessed who went into which category, based on the BIB numbers I saw that day
    """</span>
    ELITE_MEN = [<span class="hljs-string">"Elite Men"</span>, [<span class="hljs-number">1</span>, <span class="hljs-number">25</span>], BASE_RACE_DATETIME]
    ELITE_WOMEN = [<span class="hljs-string">"Elite Women"</span>, [<span class="hljs-number">26</span>, <span class="hljs-number">49</span>], BASE_RACE_DATETIME + datetime.timedelta(minutes=<span class="hljs-number">2</span>)]
    PURPLE = [<span class="hljs-string">"Specialty"</span>, [<span class="hljs-number">100</span>, <span class="hljs-number">199</span>], BASE_RACE_DATETIME + datetime.timedelta(minutes=<span class="hljs-number">10</span>)]
    GREEN = [<span class="hljs-string">"Sponsors"</span>, [<span class="hljs-number">200</span>, <span class="hljs-number">299</span>], BASE_RACE_DATETIME + datetime.timedelta(minutes=<span class="hljs-number">20</span>)]
    <span class="hljs-string">"""
    The date people applied for the lottery determined the colors. Let's assume that
    General Lottery Open: 7/17 9AM- 7/28 11:59PM
    General Lottery Draw Date: 8/1
    """</span>
    ORANGE = [<span class="hljs-string">"Tenants"</span>, [<span class="hljs-number">300</span>, <span class="hljs-number">399</span>], BASE_RACE_DATETIME + datetime.timedelta(minutes=<span class="hljs-number">30</span>)]
    GREY = [<span class="hljs-string">"General 1"</span>, [<span class="hljs-number">400</span>, <span class="hljs-number">499</span>], BASE_RACE_DATETIME + datetime.timedelta(minutes=<span class="hljs-number">40</span>)]
    GOLD = [<span class="hljs-string">"General 2"</span>, [<span class="hljs-number">500</span>, <span class="hljs-number">599</span>], BASE_RACE_DATETIME + datetime.timedelta(minutes=<span class="hljs-number">50</span>)]
    BLACK = [<span class="hljs-string">"General 3"</span>, [<span class="hljs-number">600</span>, <span class="hljs-number">699</span>], BASE_RACE_DATETIME + datetime.timedelta(minutes=<span class="hljs-number">60</span>)]

<span class="hljs-string">"""
Interested only in people who completed the 86 floors. So is it either a full course or dnf
"""</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Level</span>(<span class="hljs-params">Enum</span>):</span>
    FULL = <span class="hljs-string">"Full Course"</span>
    DNF = <span class="hljs-string">"DNF"</span>

<span class="hljs-comment"># Fields are sorted by interest</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">RaceFields</span>(<span class="hljs-params">Enum</span>):</span>
    BIB = <span class="hljs-string">"bib"</span>
    NAME = <span class="hljs-string">"name"</span>
    OVERALL_POSITION = <span class="hljs-string">"overall position"</span>
    TIME = <span class="hljs-string">"time"</span>
    GENDER = <span class="hljs-string">"gender"</span>
    GENDER_POSITION = <span class="hljs-string">"gender position"</span>
    AGE = <span class="hljs-string">"age"</span>
    DIVISION_POSITION = <span class="hljs-string">"division position"</span>
    COUNTRY = <span class="hljs-string">"country"</span>
    STATE = <span class="hljs-string">"state"</span>
    CITY = <span class="hljs-string">"city"</span>
    PACE = <span class="hljs-string">"pace"</span>
    TWENTY_FLOOR_POSITION = <span class="hljs-string">"20th floor position"</span>
    TWENTY_FLOOR_GENDER_POSITION = <span class="hljs-string">"20th floor gender position"</span>
    TWENTY_FLOOR_DIVISION_POSITION = <span class="hljs-string">"20th floor division position"</span>
    TWENTY_FLOOR_PACE = <span class="hljs-string">'20th floor pace'</span>
    TWENTY_FLOOR_TIME = <span class="hljs-string">'20th floor time'</span>
    SIXTY_FLOOR_POSITION = <span class="hljs-string">"65th floor position"</span>
    SIXTY_FIVE_FLOOR_GENDER_POSITION = <span class="hljs-string">"65th floor gender position"</span>
    SIXTY_FIVE_FLOOR_DIVISION_POSITION = <span class="hljs-string">"65th floor division position"</span>
    SIXTY_FIVE_FLOOR_PACE = <span class="hljs-string">'65th floor pace'</span>
    SIXTY_FIVE_FLOOR_TIME = <span class="hljs-string">'65th floor time'</span>
    WAVE = <span class="hljs-string">"wave"</span>
    LEVEL = <span class="hljs-string">"level"</span>
    URL = <span class="hljs-string">"url"</span>

FIELD_NAMES = [x.value <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> RaceFields <span class="hljs-keyword">if</span> x != RaceFields.URL]
FIELD_NAMES_FOR_SCRAPING = [x.value <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> RaceFields]
FIELD_NAMES_AND_POS: Dict[RaceFields, int] = {}
pos = <span class="hljs-number">0</span>
<span class="hljs-keyword">for</span> field <span class="hljs-keyword">in</span> RaceFields:
    FIELD_NAMES_AND_POS[field] = pos
    pos += <span class="hljs-number">1</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_wave_from_bib</span>(<span class="hljs-params">bib: int</span>) -&gt; Waves:</span>
    <span class="hljs-keyword">for</span> wave <span class="hljs-keyword">in</span> Waves:
        (lower, upper) = wave.value[<span class="hljs-number">1</span>]
        <span class="hljs-keyword">if</span> lower &lt;= bib &lt;= upper:
            <span class="hljs-keyword">return</span> wave
    <span class="hljs-keyword">return</span> Waves.BLACK

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_description_for_wave</span>(<span class="hljs-params">wave: Waves</span>) -&gt; str:</span>
    <span class="hljs-keyword">return</span> wave.value[<span class="hljs-number">0</span>]
</code></pre>
<p>I used <a target="_blank" href="https://docs.python.org/3/library/enum.html">enums</a> to make it clear what type of data I was working on, especially for the names of the fields. Consistency is key.</p>
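<p>One consequence of the lookup above is worth spelling out: a BIB that falls outside every range (for example in the gap between 49 and 100) ends up in the last wave. The snippet below is a simplified, self-contained sketch of that behavior. <code>Wave</code> and <code>wave_for_bib</code> are hypothetical names, and only a few of the ranges are copied from the original enum.</p>

```python
from enum import Enum


class Wave(Enum):
    # (description, first BIB, last BIB); a subset of the ranges shown above
    ELITE_MEN = ("Elite Men", 1, 25)
    ELITE_WOMEN = ("Elite Women", 26, 49)
    PURPLE = ("Specialty", 100, 199)
    BLACK = ("General 3", 600, 699)


def wave_for_bib(bib: int) -> Wave:
    """Return the wave whose BIB range contains 'bib', or BLACK as a fallback."""
    for wave in Wave:
        _description, first, last = wave.value
        if bib in range(first, last + 1):
            return wave
    return Wave.BLACK  # same fallback as the original get_wave_from_bib


for bib in (19, 26, 150, 75, 900):
    print(bib, wave_for_bib(bib).value[0])
```

<p>BIBs 75 and 900 match no range, so both are reported as "General 3". That is fine for my guesses about this race, but you may prefer to raise an error if your data should never contain such values.</p>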
<p>As for cleaning the data, well there were some obvious fixes I had to apply like:</p>
<ol>
<li><p>Format of the times like pace, race time, and so on so it could be parsed later</p>
</li>
<li><p>Capitalize some values to make them easier to read</p>
</li>
<li><p>Early string to integer conversion for values like age, position, and so on. If that fails, assign 'not a number'.</p>
</li>
</ol>
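<p>Two of these fixes are easy to show in isolation. The helpers below are a minimal sketch with hypothetical names (<code>normalize_time</code> and <code>to_int_or_nan</code>). They follow the same idea as the cleanup function in this article, but they are not copied from it.</p>

```python
import math
from typing import Union


def normalize_time(raw: str) -> str:
    """Pad values like '1:42' or '10:36' into the 'HH:MM:SS' form."""
    parts = [part.zfill(2) for part in raw.strip().split(":")]
    if len(parts) == 2:  # only minutes and seconds: add the hours
        parts.insert(0, "00")
    return ":".join(parts)


def to_int_or_nan(raw: str) -> Union[int, float]:
    """Convert early to an integer, or to 'not a number' when that fails."""
    try:
        return int(raw.strip())
    except ValueError:
        return math.nan


print(normalize_time("1:42"))   # 00:01:42
print(normalize_time("10:36"))  # 00:10:36
print(to_int_or_nan("29"))      # 29
print(to_int_or_nan("-"))       # nan
```

<p>Having every time in the same <code>HH:MM:SS</code> shape is what later allows a tool like Pandas to parse the column as a duration.</p>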
<p>By no means are we done massaging the data. A simple function takes care of this stage inside the <a target="_blank" href="https://github.com/josevnz/tutorials/blob/main/docs/EmpireStateRunUp/empirestaterunup/data.py">data</a> module:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Omitted imports and Enum declarations as they were shown earlier. </span>
<span class="hljs-comment"># Check the source code for 'data.py' for more details</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">raw_csv_read</span>(<span class="hljs-params">raw_file: Path</span>) -&gt; Iterable[Dict[str, Any]]:</span>
    <span class="hljs-keyword">with</span> open(raw_file, <span class="hljs-string">'r'</span>) <span class="hljs-keyword">as</span> raw_csv_file:
        reader = csv.DictReader(raw_csv_file)
        row: Dict[str, Any]
        <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> reader:
            <span class="hljs-comment"># Start every row with a fresh dict so yielded records do not share state</span>
            record = {}
            <span class="hljs-keyword">try</span>:
                csv_field: str
                <span class="hljs-keyword">for</span> csv_field <span class="hljs-keyword">in</span> FIELD_NAMES_FOR_SCRAPING:
                    column_val = row[csv_field].strip()
                    <span class="hljs-keyword">if</span> csv_field == RaceFields.BIB.value:
                        bib = int(column_val)
                        record[csv_field] = bib
                    <span class="hljs-keyword">elif</span> csv_field <span class="hljs-keyword">in</span> [ RaceFields.GENDER_POSITION.value, RaceFields.DIVISION_POSITION.value, RaceFields.OVERALL_POSITION.value,  RaceFields.TWENTY_FLOOR_POSITION.value,
                        RaceFields.TWENTY_FLOOR_DIVISION_POSITION.value, RaceFields.TWENTY_FLOOR_GENDER_POSITION.value, RaceFields.SIXTY_FLOOR_POSITION.value, RaceFields.SIXTY_FIVE_FLOOR_DIVISION_POSITION.value,
                        RaceFields.SIXTY_FIVE_FLOOR_GENDER_POSITION.value, RaceFields.AGE.value ]:
                        <span class="hljs-keyword">try</span>:
                            record[csv_field] = int(column_val)
                        <span class="hljs-keyword">except</span> ValueError:
                            record[csv_field] = math.nan
                    <span class="hljs-keyword">elif</span> csv_field == RaceFields.WAVE.value:
                        record[csv_field] = get_description_for_wave(get_wave_from_bib(bib)).upper()
                    <span class="hljs-keyword">elif</span> csv_field <span class="hljs-keyword">in</span> [RaceFields.GENDER.value, RaceFields.COUNTRY.value]:
                        record[csv_field] = column_val.upper()
                    <span class="hljs-keyword">elif</span> csv_field <span class="hljs-keyword">in</span> [RaceFields.CITY.value, RaceFields.STATE.value,

                    ]:
                        record[csv_field] = column_val.capitalize()
                    <span class="hljs-keyword">elif</span> csv_field <span class="hljs-keyword">in</span> [RaceFields.SIXTY_FIVE_FLOOR_PACE.value, RaceFields.SIXTY_FIVE_FLOOR_TIME.value, RaceFields.TWENTY_FLOOR_PACE.value,
                        RaceFields.TWENTY_FLOOR_TIME.value, RaceFields.PACE.value, RaceFields.TIME.value ]:
                        parts = column_val.strip().split(<span class="hljs-string">':'</span>)
                        <span class="hljs-keyword">for</span> idx <span class="hljs-keyword">in</span> range(<span class="hljs-number">0</span>, len(parts)):
                            <span class="hljs-keyword">if</span> len(parts[idx]) == <span class="hljs-number">1</span>:
                                parts[idx] = <span class="hljs-string">f"0<span class="hljs-subst">{parts[idx]}</span>"</span>
                        <span class="hljs-keyword">if</span> len(parts) == <span class="hljs-number">2</span>:
                            parts.insert(<span class="hljs-number">0</span>, <span class="hljs-string">"00"</span>)
                        record[csv_field] = <span class="hljs-string">":"</span>.join(parts)
                    <span class="hljs-keyword">else</span>:
                        record[csv_field] = column_val
                    <span class="hljs-comment"># Blank out placeholder values for every field, not just the last one</span>
                    <span class="hljs-keyword">if</span> record[csv_field] <span class="hljs-keyword">in</span> [<span class="hljs-string">'-'</span>, <span class="hljs-string">'--'</span>]:
                        record[csv_field] = <span class="hljs-string">""</span>
                <span class="hljs-keyword">yield</span> record
            <span class="hljs-keyword">except</span> IndexError:
                <span class="hljs-keyword">raise</span>
</code></pre>
<p>The <code>esru_csv_cleaner</code> script is the sum of the first stage cleanup effort, which takes the raw captured data and writes a CSV file with some important corrections:</p>
<pre><code class="lang-shell">esru_csv_cleaner --rawfile /home/josevnz/temp/raw_data.csv /home/josevnz/tutorials/docs/EmpireStateRunUp/empirestaterunup/results-full-level-2023.csv
</code></pre>
<p>Now with the data ready, we can proceed to load the data and ask some questions about the race.</p>
<h2 id="heading-how-to-analyze-the-data">How to Analyze the Data</h2>
<p>Once the data is clean (or as clean as we can get it), it's time to move into running some numbers. Before writing more code, I took a piece of paper and asked myself a few questions about the race:</p>
<ul>
<li><p>Are there any interesting buckets/clusters for age, race time, wave, and country participation?</p>
</li>
<li><p>A histogram for Age and Country would be nice to see</p>
</li>
<li><p>Describe the data! (median, percentiles, and so on)</p>
</li>
<li><p>Find outliers. <a target="_blank" href="https://www.investopedia.com/terms/z/zscore.asp">Is there a way to apply Z-scores</a> here?</p>
</li>
</ul>
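<p>To make the last two questions concrete, here is a minimal sketch that uses only the standard library. The finish times are made-up numbers for illustration, not race results, and the cut-off of 2 standard deviations is an assumption you can tune.</p>

```python
import statistics

# Illustrative finish times in seconds (made-up values, not race data)
times = [900, 1500, 1560, 1620, 1680, 1740, 1800, 1860, 1920, 3600]

# Describe the data: median and quartiles
median = statistics.median(times)
quartiles = statistics.quantiles(times, n=4)

# Z-score: how many standard deviations a value sits away from the mean
mean = statistics.mean(times)
stdev = statistics.pstdev(times)
z_scores = [(t, (t - mean) / stdev) for t in times]

# Assumption: anything more than 2 standard deviations away is an outlier
outliers = [t for t, z in z_scores if abs(z) > 2]

print(f"median={median}, quartiles={quartiles}")
print(f"outliers={outliers}")  # outliers=[3600]
```

<p>Only the slowest time is flagged here. The fastest one is unusual too, but it sits about 1.4 standard deviations from the mean, so it stays below the cut-off.</p>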
<p>I decided to use <a target="_blank" href="https://pandas.pydata.org/">Python Pandas</a> for this task. This Open Source framework has an arsenal of tools to manipulate the data and to calculate statistics. It also has good tools to perform additional cleanup if needed.</p>
<p>So how does Pandas work?</p>
<h3 id="heading-crash-course-on-pandas">Crash Course on Pandas</h3>
<p>I strongly recommend that you check out <a target="_blank" href="https://pandas.pydata.org/pandas-docs/stable/user_guide/10min.html">10 minutes to pandas</a> if you are not familiar with the tool. For my DataFrame, I made the BIB the index: it is unique for each runner, and it has no meaningful value for aggregation functions.</p>
<p>It's important to note that also at this stage I needed to normalize the data, which I'll explain shortly:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Omitted imports and Enum declarations as they were shown earlier. </span>
<span class="hljs-comment"># Check the source code for 'data.py' for more details</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">load_data</span>(<span class="hljs-params">data_file: Path = None, remove_dnf: bool = True</span>) -&gt; DataFrame:</span>
    <span class="hljs-string">"""
    * The code removes by default the DNF runners to avoid distortion on the results.
    * Replace unknown/ nan values with the median, to make analysis easier and avoid distortions
    """</span>
    <span class="hljs-keyword">if</span> data_file:
        def_file = data_file
    <span class="hljs-keyword">else</span>:
        def_file = RACE_RESULTS_FULL_LEVEL
    df = pandas.read_csv(
        def_file
    )
    <span class="hljs-keyword">for</span> time_field <span class="hljs-keyword">in</span> [
        RaceFields.PACE.value,
        RaceFields.TIME.value,
        RaceFields.TWENTY_FLOOR_PACE.value,
        RaceFields.TWENTY_FLOOR_TIME.value,
        RaceFields.SIXTY_FIVE_FLOOR_PACE.value,
        RaceFields.SIXTY_FIVE_FLOOR_TIME.value
    ]:
        <span class="hljs-keyword">try</span>:
            df[time_field] = pandas.to_timedelta(df[time_field])
        <span class="hljs-keyword">except</span> ValueError <span class="hljs-keyword">as</span> ve:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f'<span class="hljs-subst">{time_field}</span>=<span class="hljs-subst">{df[time_field]}</span>'</span>) <span class="hljs-keyword">from</span> ve
    df[<span class="hljs-string">'finishtimestamp'</span>] = BASE_RACE_DATETIME + df[RaceFields.TIME.value]
    <span class="hljs-keyword">if</span> remove_dnf:
        df.drop(df[df.level == <span class="hljs-string">'DNF'</span>].index, inplace=<span class="hljs-literal">True</span>)

    <span class="hljs-comment"># Normalize Age</span>
    median_age = df[RaceFields.AGE.value].median()
    df[RaceFields.AGE.value].fillna(median_age, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.AGE.value] = df[RaceFields.AGE.value].astype(int)

    <span class="hljs-comment"># Normalize state and city</span>
    df.replace({RaceFields.STATE.value: {<span class="hljs-string">'-'</span>: <span class="hljs-string">''</span>}}, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.STATE.value].fillna(<span class="hljs-string">''</span>, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.CITY.value].fillna(<span class="hljs-string">''</span>, inplace=<span class="hljs-literal">True</span>)

    <span class="hljs-comment"># Normalize overall position, 3 levels</span>
    median_pos = df[RaceFields.OVERALL_POSITION.value].median()
    df[RaceFields.OVERALL_POSITION.value].fillna(median_pos, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.OVERALL_POSITION.value] = df[RaceFields.OVERALL_POSITION.value].astype(int)
    median_pos = df[RaceFields.TWENTY_FLOOR_POSITION.value].median()
    df[RaceFields.TWENTY_FLOOR_POSITION.value].fillna(median_pos, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.TWENTY_FLOOR_POSITION.value] = df[RaceFields.TWENTY_FLOOR_POSITION.value].astype(int)
    median_pos = df[RaceFields.SIXTY_FLOOR_POSITION.value].median()
    df[RaceFields.SIXTY_FLOOR_POSITION.value].fillna(median_pos, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.SIXTY_FLOOR_POSITION.value] = df[RaceFields.SIXTY_FLOOR_POSITION.value].astype(int)

    <span class="hljs-comment"># Normalize gender position, 3 levels</span>
    median_gender_pos = df[RaceFields.GENDER_POSITION.value].median()
    df[RaceFields.GENDER_POSITION.value].fillna(median_gender_pos, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.GENDER_POSITION.value] = df[RaceFields.GENDER_POSITION.value].astype(int)
    median_gender_pos = df[RaceFields.TWENTY_FLOOR_GENDER_POSITION.value].median()
    df[RaceFields.TWENTY_FLOOR_GENDER_POSITION.value].fillna(median_gender_pos, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.TWENTY_FLOOR_GENDER_POSITION.value] = df[RaceFields.TWENTY_FLOOR_GENDER_POSITION.value].astype(int)
    median_gender_pos = df[RaceFields.SIXTY_FIVE_FLOOR_GENDER_POSITION.value].median()
    df[RaceFields.SIXTY_FIVE_FLOOR_GENDER_POSITION.value].fillna(median_gender_pos, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.SIXTY_FIVE_FLOOR_GENDER_POSITION.value] = df[
        RaceFields.SIXTY_FIVE_FLOOR_GENDER_POSITION.value].astype(int)

    <span class="hljs-comment"># Normalize age/ division position, 3 levels</span>
    median_div_pos = df[RaceFields.DIVISION_POSITION.value].median()
    df[RaceFields.DIVISION_POSITION.value].fillna(median_div_pos, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.DIVISION_POSITION.value] = df[RaceFields.DIVISION_POSITION.value].astype(int)
    median_div_pos = df[RaceFields.TWENTY_FLOOR_DIVISION_POSITION.value].median()
    df[RaceFields.TWENTY_FLOOR_DIVISION_POSITION.value].fillna(median_div_pos, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.TWENTY_FLOOR_DIVISION_POSITION.value] = df[RaceFields.TWENTY_FLOOR_DIVISION_POSITION.value].astype(int)
    median_div_pos = df[RaceFields.SIXTY_FIVE_FLOOR_DIVISION_POSITION.value].median()
    df[RaceFields.SIXTY_FIVE_FLOOR_DIVISION_POSITION.value].fillna(median_div_pos, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.SIXTY_FIVE_FLOOR_DIVISION_POSITION.value] = df[
        RaceFields.SIXTY_FIVE_FLOOR_DIVISION_POSITION.value].astype(int)

    <span class="hljs-comment"># Normalize 65th floor pace and time</span>
    sixty_five_floor_pace_median = df[RaceFields.SIXTY_FIVE_FLOOR_PACE.value].median()
    sixty_five_floor_time_median = df[RaceFields.SIXTY_FIVE_FLOOR_TIME.value].median()
    df[RaceFields.SIXTY_FIVE_FLOOR_PACE.value].fillna(sixty_five_floor_pace_median, inplace=<span class="hljs-literal">True</span>)
    df[RaceFields.SIXTY_FIVE_FLOOR_TIME.value].fillna(sixty_five_floor_time_median, inplace=<span class="hljs-literal">True</span>)

    <span class="hljs-comment"># Normalize BIB and make it the index</span>
    df[RaceFields.BIB.value] = df[RaceFields.BIB.value].astype(int)
    df.set_index(RaceFields.BIB.value, inplace=<span class="hljs-literal">True</span>)

    <span class="hljs-comment"># URL was useful during scraping, not needed for analysis</span>
    df.drop([RaceFields.URL.value], axis=<span class="hljs-number">1</span>, inplace=<span class="hljs-literal">True</span>)

    <span class="hljs-keyword">return</span> df
</code></pre>
<p>I do a few things here before handing the converted CSV back to the user as a DataFrame:</p>
<ul>
<li><p>Replaced "Not a Number" (nan) values with the median to avoid affecting the aggregation results. This makes analysis easier.</p>
</li>
<li><p>Dropped rows for runners that did not reach floor 86 (DNF). There are too few of them, and removing them makes the analysis easier.</p>
</li>
<li><p>Converted some string columns into native data types like integers and timestamps.</p>
</li>
<li><p>A few entries did not have the gender defined. That affected other fields like 'gender_position'. To avoid distortions, these were filled with the median.</p>
</li>
</ul>
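<p>The repeated median-fill pattern above can also be written as a small helper. This is a minimal sketch rather than the code from <code>data.py</code>: the name <code>fill_with_median</code> and the sample values are made up for illustration. It assigns the result back to the column instead of calling <code>fillna(..., inplace=True)</code> on a column selection, a form that recent Pandas 2.x releases warn about:</p>
<pre><code class="lang-python">import pandas


def fill_with_median(df, columns):
    """Hypothetical helper: fill missing values with each column's median, then cast to int."""
    for column in columns:
        median = df[column].median()  # NaN values are skipped by default
        df[column] = df[column].fillna(median).astype(int)
    return df


# Made-up sample: one runner without age, another without gender position
sample = pandas.DataFrame({
    "age": [29.0, None, 41.0, 35.0],
    "gender position": [1.0, 2.0, None, 3.0],
})
clean = fill_with_median(sample, ["age", "gender position"])
print(clean)
</code></pre>
<p>Keep in mind that <code>astype(int)</code> truncates, so a fractional median would lose its decimal part.</p>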
<p>In the end, this is what loading my <a target="_blank" href="https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html">DataFrame</a> looked like:</p>
<pre><code class="lang-shell">(EmpireStateRunUp) [josevnz@dmaf5 EmpireStateRunUp]$ python3
Python 3.11.6 (main, Oct  3 2023, 00:00:00) [GCC 12.3.1 20230508 (Red Hat 12.3.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
</code></pre>
<p>And the resulting <a target="_blank" href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html"><strong>DataFrame</strong></a> instance:</p>
<pre><code class="lang-shell">&gt;&gt;&gt; # Using custom load_data function that returns a Panda DataFrame
&gt;&gt;&gt; from empirestaterunup.data import load_data
&gt;&gt;&gt; load_data('empirestaterunup/results-full-level-2023.csv')
                    name  overall position            time gender  gender position  age  ...  65th floor division position 65th floor pace 65th floor time       wave        level     finishtimestamp
bib                                                                                      ...                                                                                                          
19         Wai Ching Soh                 1 0 days 00:10:36      M                1   29  ...                             1 0 days 00:54:03 0 days 00:07:34  ELITE MEN  Full Course 2023-09-04 20:10:36
22        Ryoji Watanabe                 2 0 days 00:10:52      M                2   40  ...                             1 0 days 00:54:31 0 days 00:07:38  ELITE MEN  Full Course 2023-09-04 20:10:52
16            Fabio Ruga                 3 0 days 00:11:14      M                3   42  ...                             2 0 days 00:57:09 0 days 00:08:00  ELITE MEN  Full Course 2023-09-04 20:11:14
11        Emanuele Manzi                 4 0 days 00:11:28      M                4   45  ...                             3 0 days 00:59:17 0 days 00:08:18  ELITE MEN  Full Course 2023-09-04 20:11:28
249             Alex Cyr                 5 0 days 00:11:52      M                5   28  ...                             2 0 days 01:01:19 0 days 00:08:35   SPONSORS  Full Course 2023-09-04 20:11:52
..                   ...               ...             ...    ...              ...  ...  ...                           ...             ...             ...        ...          ...                 ...
555     Caroline Edwards               372 0 days 00:55:17      F              143   47  ...                            39 0 days 04:57:23 0 days 00:41:38  GENERAL 2  Full Course 2023-09-04 20:55:17
557        Sarah Preston               373 0 days 00:55:22      F              144   34  ...                            41 0 days 04:58:20 0 days 00:41:46  GENERAL 2  Full Course 2023-09-04 20:55:22
544  Christopher Winkler               374 0 days 01:00:10      M              228   40  ...                            18 0 days 01:49:53 0 days 00:15:23  GENERAL 2  Full Course 2023-09-04 21:00:10
545          Jay Winkler               375 0 days 01:05:19      U               93   33  ...                            18 0 days 05:28:56 0 days 00:46:03  GENERAL 2  Full Course 2023-09-04 21:05:19
646           Dana Zajko               376 0 days 01:06:48      F              145   38  ...                            42 0 days 05:15:14 0 days 00:44:08  GENERAL 3  Full Course 2023-09-04 21:06:48

[375 rows x 24 columns]
</code></pre>
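<p>The <code>finishtimestamp</code> column in the output above is simply the race start plus each runner's time. Here is a plain-Python sketch of that addition. The start time below is an assumption inferred from the output (a time of 00:10:36 finishes at 20:10:36), not a value copied from the source code:</p>
<pre><code class="lang-python">from datetime import datetime, timedelta

# Assumed race start, inferred from the DataFrame output
base_race_datetime = datetime(2023, 9, 4, 20, 0, 0)

# Running time of the first row (0 days 00:10:36)
winner_time = timedelta(minutes=10, seconds=36)

finish = base_race_datetime + winner_time
print(finish)
</code></pre>
<p>Pandas does the same thing for the whole column at once, because <code>to_timedelta</code> turns the time strings into values that can be added to a datetime.</p>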
<p>Once the data was loaded, I was able to start asking questions. For example, to detect the outliers I used a <a target="_blank" href="https://en.wikipedia.org/wiki/Standard_score">Z-score</a>.</p>
<p>All the analysis logic <a target="_blank" href="https://github.com/josevnz/tutorials/blob/main/docs/EmpireStateRunUp/empirestaterunup/analyze.py">was kept together in a single module called 'analyze'</a>, separate from presentation, data loading, or reports, to promote reuse.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> pandas <span class="hljs-keyword">import</span> DataFrame, Series
<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_zscore</span>(<span class="hljs-params">df: DataFrame, column: str</span>) -&gt; Series:</span>
    <span class="hljs-comment"># Population z-score (ddof=0): how many standard deviations each value is from the mean</span>
    filtered = df[column]
    <span class="hljs-keyword">return</span> filtered.sub(filtered.mean()).div(filtered.std(ddof=<span class="hljs-number">0</span>))

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_outliers</span>(<span class="hljs-params">df: DataFrame, column: str, std_threshold: int = <span class="hljs-number">3</span></span>) -&gt; Series:</span>
    <span class="hljs-string">"""
    Use the z-score, anything further away than 3 standard deviations is considered an outlier.
    """</span>
    filtered_df = df[column]
    z_scores = get_zscore(df=df, column=column)
    is_over = np.abs(z_scores) &gt; std_threshold
    <span class="hljs-keyword">return</span> filtered_df[is_over]
</code></pre>
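<p>To see what these two functions compute, here is the same idea in plain Python. It's a minimal sketch with made-up finish times in minutes, not race data: each z-score is <code>(value - mean) / standard deviation</code>, and anything more than 3 standard deviations away from the mean is flagged.</p>
<pre><code class="lang-python">from statistics import fmean, pstdev

# Made-up finish times in minutes: 20 typical runners and one very slow one
times = [18, 19, 20, 21, 22] * 4 + [67]

mean = fmean(times)
std = pstdev(times)  # population standard deviation, like std(ddof=0) in Pandas
z_scores = [(t - mean) / std for t in times]

threshold = 3
outliers = [t for t, z in zip(times, z_scores) if abs(z) &gt; threshold]
print(outliers)
</code></pre>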
<p>Also, it is very simple to get common statistics just by calling <code>describe</code> on our data:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> pandas <span class="hljs-keyword">import</span> DataFrame, Series
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_5_number</span>(<span class="hljs-params">criteria: str, data: DataFrame</span>) -&gt; Series:</span>
    <span class="hljs-keyword">return</span> data[criteria].describe()
</code></pre>
<p>For example, let me show you summary metrics for different aspects of the race:</p>
<pre><code class="lang-shell">&gt;&gt;&gt; from empirestaterunup.data import load_data
&gt;&gt;&gt; df = load_data('empirestaterunup/results-full-level-2023.csv')
&gt;&gt;&gt; from empirestaterunup.analyze import get_5_number
&gt;&gt;&gt; from empirestaterunup.analyze import SUMMARY_METRICS
&gt;&gt;&gt; print(SUMMARY_METRICS)
('age', 'time', 'pace')
&gt;&gt;&gt; for key in SUMMARY_METRICS:
...     ndf = get_5_number(criteria=key, data=df)
...     print(ndf)
... 
count    375.000000
mean      41.309333
std       11.735968
min       11.000000
25%       33.000000
50%       40.000000
75%       49.000000
max       78.000000
Name: age, dtype: float64
count                          375
mean     0 days 00:23:03.461333333
std      0 days 00:08:06.313479117
min                0 days 00:10:36
25%                0 days 00:18:09
50%                0 days 00:21:20
75%         0 days 00:25:13.500000
max                0 days 01:06:48
Name: time, dtype: object
count                          375
mean     0 days 01:55:17.306666666
std      0 days 00:40:31.567395588
min                0 days 00:53:00
25%                0 days 01:30:45
50%                0 days 01:46:40
75%         0 days 02:06:07.500000
max                0 days 05:34:00
Name: pace, dtype: object
</code></pre>
<p>Making sure that web scraping, data loading, and analytics work well is a must. Testing is an integral part of writing code, so every time I added more code, I went back to writing unit tests.</p>
<p>Let's check how to test our code (feel free to skip the next section if you are familiar with unit testing).</p>
<h3 id="heading-testing-testing-and-after-thatmore-testing">Testing, Testing, and After That... More Testing</h3>
<p>I assume you are familiar with writing small, self-contained pieces of code to test your code. These are called unit tests.</p>
<blockquote>
<p>The unittest unit testing framework was originally inspired by JUnit and has a similar flavor as major unit testing frameworks in other languages. It supports test automation, sharing of setup and shutdown code for tests, aggregation of tests into collections, and independence of the tests from the reporting framework. (From the <a target="_blank" href="https://docs.python.org/3/library/unittest.html">Python docs</a>)</p>
</blockquote>
<p>I tried to have a simple <a target="_blank" href="https://docs.python.org/3/library/unittest.html">unit test</a> for every method I wrote in the code. This saved me lots of headaches down the road: as I refactored the code and found better ways to get the same results, the tests confirmed that the numbers were still correct.</p>
<p>A unit test in this context is a class that extends <code>unittest.TestCase</code>. Each method that starts with <code>test_</code> is a test that must pass several assertions.</p>
<p>For example, to make sure the analytics worked as expected, I wrote a test module called <code>test_analyze</code>:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Not all test cases are shown, please check the full code of 'test/test_analyze.py'</span>
<span class="hljs-keyword">import</span> unittest
<span class="hljs-keyword">from</span> pandas <span class="hljs-keyword">import</span> DataFrame
<span class="hljs-keyword">from</span> empirestaterunup.analyze <span class="hljs-keyword">import</span> get_country_counts
<span class="hljs-keyword">from</span> empirestaterunup.data <span class="hljs-keyword">import</span> load_data

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AnalyzeTestCase</span>(<span class="hljs-params">unittest.TestCase</span>):</span>
    df: DataFrame

<span class="hljs-meta">    @classmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">setUpClass</span>(<span class="hljs-params">cls</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
        cls.df = load_data()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_country_counts</span>(<span class="hljs-params">self</span>):</span>
        country_counts, min_countries, max_countries = get_country_counts(df=AnalyzeTestCase.df)
        self.assertIsNotNone(country_counts)
        self.assertEqual(<span class="hljs-number">2</span>, country_counts[<span class="hljs-string">'JPN'</span>])
        self.assertIsNotNone(min_countries)
        self.assertEqual(<span class="hljs-number">3</span>, min_countries.shape[<span class="hljs-number">0</span>])
        self.assertIsNotNone(max_countries)
        self.assertEqual(<span class="hljs-number">14</span>, max_countries.shape[<span class="hljs-number">0</span>])


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">'__main__'</span>:
    unittest.main()
</code></pre>
<p>So far, we have the data and have made sure <a target="_blank" href="https://github.com/josevnz/tutorials/blob/main/docs/EmpireStateRunUp/test/test_data.py">it meets expectations</a>. I wrote <a target="_blank" href="https://github.com/josevnz/tutorials/blob/main/docs/EmpireStateRunUp/test/test_analyze.py">separate tests</a> for the analytics code and also for the scraper.</p>
<p>Testing the user interface requires a different approach, as it needs to simulate clicks and wait for screen changes. Sometimes failures are easy to spot (like crashes), but sometimes issues are much more subtle (did we get the right data displayed?).</p>
<p>We will revisit this kind of testing after we first look at how to visualize the results.</p>
<h2 id="heading-how-to-visualize-the-results">How to Visualize the Results</h2>
<p>I wanted to use the terminal as much as possible to visualize my findings, and to keep requirements to a minimum. I decided to use the <a target="_blank" href="https://textual.textualize.io/">Textual</a> framework to accomplish that.</p>
<p>This framework is very complete and allows you to build text applications that are responsive and beautiful to look at.</p>
<p>They are also easy to write, so before we go deeper into the resulting applications, let's pause to learn about Textual.</p>
<h3 id="heading-text-user-interfaces-tui-with-textual">Text User Interfaces (TUI) with Textual</h3>
<p>The <a target="_blank" href="https://textual.textualize.io/">Textual project</a> has a nice tutorial that <a target="_blank" href="https://textual.textualize.io/tutorial/">you can read</a> to get up to speed.</p>
<p>Let's see some code. One of the applications is called <code>esru_outlier</code>. Its TUI code lives in the <a target="_blank" href="https://github.com/josevnz/tutorials/blob/main/docs/EmpireStateRunUp/empirestaterunup/apps.py">apps</a> module, and it shows several tables with the outliers we found before using the Z-score.</p>
<p><code>OutlierApp</code> (extends <code>App</code>) collects the basic information in a table for each outlier group and then calls the <code>RunnerDetailScreen</code> to display details about a runner.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/esrm_outlier_first_screen.png" alt="Screen shot of the OutlierApp table that shows outliers on the race results" width="600" height="400" loading="lazy"></p>
<p><em>Outliers first screen (by Age, Running Time, and Pace)</em></p>
<p>Here's the code, with explanations, that builds this screen:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Only the code of the application shown here</span>
<span class="hljs-comment"># This application shows 3 tables: SUMMARY_METRICS = (RaceFields.AGE.value, RaceFields.TIME.value, RaceFields.PACE.value)</span>
<span class="hljs-comment"># Every application in Textual extends the App class</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">OutlierApp</span>(<span class="hljs-params">App</span>):</span>
    DF: DataFrame = <span class="hljs-literal">None</span>
    BINDINGS = [ (<span class="hljs-string">"q"</span>, <span class="hljs-string">"quit_app"</span>, <span class="hljs-string">"Quit"</span>), ]  <span class="hljs-comment"># Bind 'q' to the 'quit_app' action, handled by `action_quit_app`, which in turn exits the app</span>
    CSS_PATH = <span class="hljs-string">"outliers.tcss"</span>  <span class="hljs-comment"># Styling can be done externally, similar to using CSS</span>
    ENABLE_COMMAND_PALETTE = <span class="hljs-literal">False</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">action_quit_app</span>(<span class="hljs-params">self</span>):</span>
        self.exit(<span class="hljs-number">0</span>)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">compose</span>(<span class="hljs-params">self</span>) -&gt; ComposeResult:</span>
        <span class="hljs-string">"""
        Here we 'Yield' Widgets/ components that will be rendered in order on the TUI
        How do the components get their layout on the screen? They use a cascading style sheet (CSS): outliers.tcss and
        some explicit layout containers like the class `Vertical` that can contain other Widgets
        Here we have a header, tables, and a footer 
        """</span>
        <span class="hljs-keyword">yield</span> Header(show_clock=<span class="hljs-literal">True</span>)
        <span class="hljs-keyword">for</span> column_name <span class="hljs-keyword">in</span> SUMMARY_METRICS:
            table = DataTable(id=<span class="hljs-string">f'<span class="hljs-subst">{column_name}</span>_outlier'</span>)
            table.cursor_type = <span class="hljs-string">'row'</span>
            table.zebra_stripes = <span class="hljs-literal">True</span>
            table.tooltip = <span class="hljs-string">"Get runner details"</span>
            <span class="hljs-keyword">if</span> column_name == RaceFields.AGE.value:
                label = Label(<span class="hljs-string">f"<span class="hljs-subst">{column_name}</span> (older) outliers:"</span>.title())
            <span class="hljs-keyword">else</span>:
                label = Label(<span class="hljs-string">f"<span class="hljs-subst">{column_name}</span> (slower) outliers:"</span>.title())
            <span class="hljs-keyword">yield</span> Vertical(
                label,
                table
            )
        <span class="hljs-keyword">yield</span> Footer()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">on_mount</span>(<span class="hljs-params">self</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
        <span class="hljs-string">"""
        Here we populate each table with data from the DataFrame. Each table has outliers of different types.
        All can be obtained with the `get_outliers` method.
        """</span>
        <span class="hljs-keyword">for</span> column <span class="hljs-keyword">in</span> SUMMARY_METRICS:
            table = self.get_widget_by_id(<span class="hljs-string">f'<span class="hljs-subst">{column}</span>_outlier'</span>, expect_type=DataTable)
            columns = [x.title() <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> [<span class="hljs-string">'bib'</span>, column]]
            table.add_columns(*columns)
            table.add_rows(*[get_outliers(df=OutlierApp.DF, column=column).to_dict().items()])

<span class="hljs-meta">    @on(DataTable.HeaderSelected)</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">on_header_clicked</span>(<span class="hljs-params">self, event: DataTable.HeaderSelected</span>):</span>
        <span class="hljs-string">"""
        When the user selects a column header it generates a 'HeaderSelected' event.
        The annotation on this method tells Textual that we will handle this event here
        We can extract the table, the selected column, and then sort the table contents.
        """</span>
        table = event.data_table
        table.sort(event.column_key)

<span class="hljs-meta">    @on(DataTable.RowSelected)</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">on_row_clicked</span>(<span class="hljs-params">self, event: DataTable.RowSelected</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
        <span class="hljs-string">"""
        Similarly, when the user selects a row it generates a RowSelected event
        What we do on the 'on_row_clicked' method is capture the event, get the row contents, and construct
        a new modal screen (RunnerDetailScreen) which we push on top of the regular screen.
        There we show the runner details differently. 
        """</span>
        table = event.data_table
        row = table.get_row(event.row_key)
        runner_detail = RunnerDetailScreen(df=OutlierApp.DF, row=row)
        self.push_screen(runner_detail)
</code></pre>
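<p>The <code>add_rows</code> call in <code>on_mount</code> can look cryptic at first. <code>get_outliers</code> returns the selected column indexed by BIB, so <code>to_dict().items()</code> yields one <code>(bib, value)</code> pair per outlier, which lines up with the two columns added just before. Here is a plain-Python sketch of that shape, where a hand-written dictionary with illustrative values stands in for the result of <code>to_dict()</code>:</p>
<pre><code class="lang-python"># Stand-in for get_outliers(df=..., column='time').to_dict(); values are illustrative strings
outliers = {545: "0 days 01:05:19", 646: "0 days 01:06:48"}

# Same header logic as on_mount: 'bib' plus the metric name, title-cased
columns = [x.title() for x in ['bib', 'time']]

# One (bib, value) tuple per table row
rows = list(outliers.items())

print(columns)
for row in rows:
    print(row)
</code></pre>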
<p>The class <code>RunnerDetailScreen</code> (extends <code>ModalScreen</code>) handles showing the racer details using formatted Markdown, which shows up when you click on the table that was rendered before:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/esrm_outlier_runner_detail.png" alt="Screen shot of the OutlierApp runner details that shows outliers on the race results" width="600" height="400" loading="lazy"></p>
<p><em>Rendered Markdown with details about the selected runner</em></p>
<p>And here's the code that makes that possible, with explanations:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Omitted imports and helper methods, only showing TUI-related code. See the 'apps.py' file for full code</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">RunnerDetailScreen</span>(<span class="hljs-params">ModalScreen</span>):</span>
    ENABLE_COMMAND_PALETTE = <span class="hljs-literal">False</span>  <span class="hljs-comment"># Disable the search bar, it is active by default and is not needed here</span>
    CSS_PATH = <span class="hljs-string">"runner_details.tcss"</span>  <span class="hljs-comment"># Handle the styles using external CSS</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">
            self,
            name: str | None = None,
            ident: str | None = None,
            classes: str | None = None,
            row: List[Any] | None = None,
            df: DataFrame = None,
            country_df: DataFrame = None
    </span>):</span>
        <span class="hljs-string">"""
        Override the constructor and load useful data like country ISO codes
        We get the Pandas DataFrame with the details that will be shown to the user
        """</span>
        super().__init__(name, ident, classes)
        self.row = row
        self.df = df
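        <span class="hljs-keyword">if</span> country_df <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:  <span class="hljs-comment"># 'not country_df' raises ValueError when a DataFrame is passed</span>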
            self.country_df = load_country_details()
        <span class="hljs-keyword">else</span>:
            self.country_df = country_df

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">compose</span>(<span class="hljs-params">self</span>) -&gt; ComposeResult:</span>
        <span class="hljs-string">"""
        In compose we prepare the markdown, and we let the MarkdownViewer handle details like 
        a nice automatic table of contents.
        Notice that we call `self.log.info('xxx')`. We use that for debugging when this application
        is called using 'textual'.
        """</span>
        bib_idx = FIELD_NAMES_AND_POS[RaceFields.BIB]
        bibs = [self.row[bib_idx]]
        columns, details = df_to_list_of_tuples(self.df, bibs)
        self.log.info(<span class="hljs-string">f"Columns: <span class="hljs-subst">{columns}</span>"</span>)
        self.log.info(<span class="hljs-string">f"Details: <span class="hljs-subst">{details}</span>"</span>)
        row_markdown = <span class="hljs-string">""</span>
        position_markdown = {}
        split_markdown = {}
        <span class="hljs-keyword">for</span> legend <span class="hljs-keyword">in</span> [<span class="hljs-string">'full'</span>, <span class="hljs-string">'20th'</span>, <span class="hljs-string">'65th'</span>]:
            position_markdown[legend] = <span class="hljs-string">''</span>
            split_markdown[legend] = <span class="hljs-string">''</span>
        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(<span class="hljs-number">0</span>, len(columns)):
            column = columns[i]
            detail = details[<span class="hljs-number">0</span>][i]
            <span class="hljs-keyword">if</span> re.search(<span class="hljs-string">'pace|time'</span>, column):
                <span class="hljs-keyword">if</span> re.search(<span class="hljs-string">'20th'</span>, column):
                    split_markdown[<span class="hljs-string">'20th'</span>] += <span class="hljs-string">f"\n* **<span class="hljs-subst">{column.title()}</span>:** <span class="hljs-subst">{detail}</span>"</span>
                <span class="hljs-keyword">elif</span> re.search(<span class="hljs-string">'65th'</span>, column):
                    split_markdown[<span class="hljs-string">'65th'</span>] += <span class="hljs-string">f"\n* **<span class="hljs-subst">{column.title()}</span>:** <span class="hljs-subst">{detail}</span>"</span>
                <span class="hljs-keyword">else</span>:
                    split_markdown[<span class="hljs-string">'full'</span>] += <span class="hljs-string">f"\n* **<span class="hljs-subst">{column.title()}</span>:** <span class="hljs-subst">{detail}</span>"</span>
            <span class="hljs-keyword">elif</span> re.search(<span class="hljs-string">'position'</span>, column):
                <span class="hljs-keyword">if</span> re.search(<span class="hljs-string">'20th'</span>, column):
                    position_markdown[<span class="hljs-string">'20th'</span>] += <span class="hljs-string">f"\n* **<span class="hljs-subst">{column.title()}</span>:** <span class="hljs-subst">{detail}</span>"</span>
                <span class="hljs-keyword">elif</span> re.search(<span class="hljs-string">'65th'</span>, column):
                    position_markdown[<span class="hljs-string">'65th'</span>] += <span class="hljs-string">f"\n* **<span class="hljs-subst">{column.title()}</span>:** <span class="hljs-subst">{detail}</span>"</span>
                <span class="hljs-keyword">else</span>:
                    position_markdown[<span class="hljs-string">'full'</span>] += <span class="hljs-string">f"\n* **<span class="hljs-subst">{column.title()}</span>:** <span class="hljs-subst">{detail}</span>"</span>
            <span class="hljs-keyword">elif</span> re.search(<span class="hljs-string">'url|bib'</span>, column):
                <span class="hljs-keyword">pass</span>  <span class="hljs-comment"># Skip uninteresting columns</span>
            <span class="hljs-keyword">else</span>:
                row_markdown += <span class="hljs-string">f"\n* **<span class="hljs-subst">{column.title()}</span>:** <span class="hljs-subst">{detail}</span>"</span>
        <span class="hljs-keyword">yield</span> MarkdownViewer(<span class="hljs-string">f"""# Full Course Race details     
## Runner BIO (BIB: <span class="hljs-subst">{bibs[<span class="hljs-number">0</span>]}</span>)
<span class="hljs-subst">{row_markdown}</span>
## Positions
### 20th floor        
<span class="hljs-subst">{position_markdown[<span class="hljs-string">'20th'</span>]}</span>
### 65th floor        
<span class="hljs-subst">{position_markdown[<span class="hljs-string">'65th'</span>]}</span>
### Full course        
<span class="hljs-subst">{position_markdown[<span class="hljs-string">'full'</span>]}</span>                
## Race time split   
### 20th floor        
<span class="hljs-subst">{split_markdown[<span class="hljs-string">'20th'</span>]}</span>
### 65th floor        
<span class="hljs-subst">{split_markdown[<span class="hljs-string">'65th'</span>]}</span>
### Full course        
<span class="hljs-subst">{split_markdown[<span class="hljs-string">'full'</span>]}</span>         
        """</span>)
        <span class="hljs-comment"># This button is used to close this screen and send the user to the previous screen</span>
        btn = Button(<span class="hljs-string">"Close"</span>, variant=<span class="hljs-string">"primary"</span>, id=<span class="hljs-string">"close"</span>)
        btn.tooltip = <span class="hljs-string">"Back to main screen"</span>
        <span class="hljs-keyword">yield</span> btn

<span class="hljs-meta">    @on(Button.Pressed, "#close")</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">on_button_pressed</span>(<span class="hljs-params">self, _</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
        <span class="hljs-string">"""
        Simple logic, pop the previous screen and make this one disappear
        """</span>
        self.app.pop_screen()
</code></pre>
<p>This class is reusable. There are other classes (like <code>BrowserApp</code> in this tutorial) that also send data when a user clicks on a table row, and those details get displayed using this modal screen.</p>
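<p>The column-routing logic inside <code>compose</code> is easier to follow in isolation. Below is a minimal, standalone sketch of the same idea using only the standard library. The function name <code>group_columns</code>, the sample row, and the assumption that position columns contain the word "position" are all made up for illustration; check <code>apps.py</code> for the real conditions.</p>

```python
import re


def group_columns(row: dict) -> dict:
    """Route each column of a runner row into a Markdown bucket by name."""
    buckets = {"bio": "", "20th": "", "65th": "", "full": ""}
    for column, detail in row.items():
        line = f"\n* **{column.title()}:** {detail}"
        if re.search("url|bib", column):
            continue  # Skip columns that are not interesting to display
        if re.search("position", column):  # Assumed naming convention
            if re.search("20th", column):
                buckets["20th"] += line
            elif re.search("65th", column):
                buckets["65th"] += line
            else:
                buckets["full"] += line
        else:
            buckets["bio"] += line
    return buckets


# Hypothetical sample row, only for demonstration
sample = {
    "name": "Jane Doe",
    "url": "https://example.com/runner/1",
    "20th floor position": 10,
    "65th floor position": 8,
    "overall position": 7,
}
result = group_columns(sample)
print(result["20th"])
```

<p>Each bucket ends up as a Markdown bullet list that can be dropped into a larger template, which is what the screen above does with its f-string.</p>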
<p>We can customize the appearance using a stylesheet language that looks a lot like a web application's <a target="_blank" href="https://en.wikipedia.org/wiki/CSS">CSS</a> (but it's not exactly the same). For example, here's the code to style a button:</p>
<pre><code class="lang-text">Button {
    dock: bottom;
    width: 100%;
    height: auto;
}
</code></pre>
<p>As you can see, Textual is a pretty powerful framework. It reminds me a lot of <a target="_blank" href="https://en.wikipedia.org/wiki/Swing_(Java)">Java Swing</a>, but without the extra complexity.</p>
<p>But tabular data only goes so far. I also wanted different graph types that could explain patterns like age clusters and gender distribution. For that, I wrote a few classes in the <code>apps</code> module with the help of Matplotlib.</p>
<h3 id="heading-plots-with-matplotlib">Plots with Matplotlib</h3>
<p>I wanted to use some charts to display the data, and I made them with <a target="_blank" href="https://matplotlib.org/">matplotlib</a>. The code to generate plots like the age box plot, which shows how old the participating runners were, is very straightforward.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/esru_age_box_plot.png" alt="Box plot showing age distribution among racers" width="600" height="400" loading="lazy"></p>
<p><em>Age box plot in Matplotlib that shows that most of the runners were in the 40-50 year old range.</em></p>
<p>And here's a sample of the plotting code, the method that produces the gender pie chart:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Not all code is shown here (helper methods, imports)</span>
<span class="hljs-comment"># Please check the apps.py module to see all missing code</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Plotter</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">plot_gender</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-string">"""
        In this method, we get our data frame filtering by gender and get counts
        Then we create a pie plot
        """</span>
        series = self.df[RaceFields.GENDER.value].value_counts()
        fig, ax = plt.subplots(layout=<span class="hljs-string">'constrained'</span>)
        wedges, texts, auto_texts = ax.pie(
            series.values,
            labels=series.keys(),
            autopct=<span class="hljs-string">"%%%.2f"</span>,
            shadow=<span class="hljs-literal">True</span>,
            startangle=<span class="hljs-number">90</span>,
            explode=(<span class="hljs-number">0.1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>)
        )
        ax.set_title(<span class="hljs-string">"Gender participation"</span>)
        ax.set_xlabel(<span class="hljs-string">'Gender distribution'</span>)

        <span class="hljs-comment"># Legend with the fastest runners by gender</span>
        fastest = find_fastest(self.df, FastestFilters.Gender)
        fastest_legend = [<span class="hljs-string">f"<span class="hljs-subst">{fastest[gender][<span class="hljs-string">'name'</span>]}</span> - <span class="hljs-subst">{beautify_race_times(fastest[gender][<span class="hljs-string">'time'</span>])}</span>"</span> <span class="hljs-keyword">for</span> gender <span class="hljs-keyword">in</span>
                          series.keys()]
        ax.legend(wedges, fastest_legend,
                  title=<span class="hljs-string">"Fastest by gender"</span>,
                  loc=<span class="hljs-string">"center left"</span>,
                  bbox_to_anchor=(<span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.5</span>, <span class="hljs-number">1</span>))
</code></pre>
<p>Interesting – most of the runners were between 40 and 50 years old.</p>
<p>Now let's look at how to test the TUI.</p>
<h3 id="heading-testing-the-user-interfaces">Testing the User Interfaces</h3>
<p>When I started working on this small project, I knew that there was going to be a lot of testing. What I wasn't sure about was how I would be able to test the TUI.</p>
<p>I found at least two approaches that work well with Textual: watching the message flow between components, and writing unit tests with a twist.</p>
<h4 id="heading-following-the-message-flow-with-textual">Following the message flow with Textual</h4>
<p>Textual supports an interesting development mode that allows you to change CSS and see the changes on your application without a restart. Also, you can see how the TUI events propagate, which is invaluable for debugging.</p>
<p>In one terminal, start the console:</p>
<pre><code class="lang-shell">(EmpireStateRunUp) [josevnz@dmaf5 EmpireStateRunUp]$ . ~/virtualenv/EmpireStateRunUp/bin/activate
(EmpireStateRunUp) [josevnz@dmaf5 EmpireStateRunUp]$ textual console
▌Textual Development Console v0.46.0                                                                                                                                             
▌Run a Textual app with textual run --dev my_app.py to connect.                                                                                                                  
▌Press Ctrl+C to quit.
</code></pre>
<p>Then in another terminal, start your application but using development mode:</p>
<pre><code class="lang-shell">(EmpireStateRunUp) [josevnz@dmaf5 EmpireStateRunUp]$ textual run --dev --command esru_browser
</code></pre>
<p>If you check back on your console terminal, you will see any messages you sent with App.log along with the events:</p>
<pre><code class="lang-shell">─────────────────────────────────────────────────────────────────────────── Client '127.0.0.1' connected ───────────────────────────────────────────────────────────────────────────
[18:28:17] SYSTEM                                                                                                                                                        app.py:2188
Connected to devtools ( ws://127.0.0.1:8081 )
[18:28:17] SYSTEM                                                                                                                                                        app.py:2192
---
[18:28:17] SYSTEM                                                                                                                                                        app.py:2194
driver=&lt;class 'textual.drivers.linux_driver.LinuxDriver'&gt;
[18:28:17] SYSTEM                                                                                                                                                        app.py:2195
loop=&lt;_UnixSelectorEventLoop running=True closed=False debug=False&gt;
[18:28:17] SYSTEM                                                                                                                                                        app.py:2196
features=frozenset({'debug', 'devtools'})
[18:28:17] SYSTEM                                                                                                                                                        app.py:2228
STARTED FileMonitor({PosixPath('/home/josevnz/EmpireStateCleanup/docs/EmpireStateRunUp/empirestaterunup/browser.tcss')})
[18:28:17] EVENT                                                                                                                                                 message_pump.py:706
Load() &gt;&gt;&gt; BrowserApp(title='Race Runners', classes={'-dark-mode'}) method=None
[18:28:17] EVENT                                                                                                                                                 message_pump.py:697
Mount() &gt;&gt;&gt; DataTable(id='runners') method=&lt;ScrollView.on_mount&gt;
[18:28:17] EVENT                                                                                                                                                 message_pump.py:697
Mount() &gt;&gt;&gt; DataTable(id='runners') method=&lt;Widget.on_mount&gt;
[18:28:17] EVENT                                                                                                                                                 message_pump.py:697
Mount() &gt;&gt;&gt; Footer() method=&lt;Footer.on_mount&gt;
[18:28:17] EVENT                                                                                                                                                 message_pump.py:697
Mount() &gt;&gt;&gt; Footer() method=&lt;Widget.on_mount&gt;
[18:28:17] EVENT                                                                                                                                                 message_pump.py:697
Mount() &gt;&gt;&gt; ToastRack(id='textual-toastrack') method=&lt;Widget.on_mount&gt;
...
RowHighlighted(cursor_row=0, row_key=&lt;textual.widgets._data_table.RowKey object at 0x7fc8d98800d0&gt;) &gt;&gt;&gt; BrowserApp(title='Race Runners', classes={'-dark-mode'}) method=None
[18:28:17] EVENT                                                                                                                                                 message_pump.py:697
Mount() &gt;&gt;&gt; ScrollBarCorner() method=&lt;Widget.on_mount&gt;
[18:28:17] EVENT                                                                                                                                                 message_pump.py:706
Resize(size=Size(width=2, height=1), virtual_size=Size(width=178, height=47), container_size=Size(width=178, height=47)) &gt;&gt;&gt; ScrollBarCorner() method=None
[18:28:17] EVENT                                                                                                                                                 message_pump.py:706
Show() &gt;&gt;&gt; ScrollBarCorner() method=None
</code></pre>
<h4 id="heading-using-unittest-and-pilot">Using unittest and Pilot</h4>
<p>The framework has the <a target="_blank" href="https://textual.textualize.io/api/pilot/">Pilot class</a> that you can use to make automated calls to Textual Widgets and wait for events. This means you can simulate user interaction with the application to validate that it behaves as expected. This is more powerful than the regular unit tests as you can also cover UI interactions with expected results:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> unittest
<span class="hljs-keyword">from</span> textual.widgets <span class="hljs-keyword">import</span> DataTable, MarkdownViewer
<span class="hljs-keyword">from</span> empirestaterunup.apps <span class="hljs-keyword">import</span> BrowserApp


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AppTestCase</span>(<span class="hljs-params">unittest.IsolatedAsyncioTestCase</span>):</span>
    <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_browser_app</span>(<span class="hljs-params">self</span>):</span>
        app = BrowserApp()
        self.assertIsNotNone(app)
        <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> app.run_test() <span class="hljs-keyword">as</span> pilot:

            <span class="hljs-string">"""
            Test the command palette
            """</span>
            <span class="hljs-keyword">await</span> pilot.press(<span class="hljs-string">"ctrl+\\"</span>)
            <span class="hljs-keyword">for</span> char <span class="hljs-keyword">in</span> <span class="hljs-string">"jose"</span>:  <span class="hljs-comment"># Type the name one key at a time</span>
                <span class="hljs-keyword">await</span> pilot.press(char)
            <span class="hljs-keyword">await</span> pilot.press(<span class="hljs-string">"enter"</span>)
            <span class="hljs-comment"># This returns the runner screen. Check that it has some contents</span>
            markdown_viewer = app.screen.query(MarkdownViewer).first()
            self.assertTrue(markdown_viewer.document)
            <span class="hljs-keyword">await</span> pilot.click(<span class="hljs-string">"#close"</span>)  <span class="hljs-comment"># Close the new screen, pop the original one</span>
            <span class="hljs-comment"># Go back to the main screen, now select a runner but using the table</span>
            table = app.screen.query(DataTable).first()
            coordinate = table.cursor_coordinate
            self.assertTrue(table.is_valid_coordinate(coordinate))
            <span class="hljs-keyword">await</span> pilot.press(<span class="hljs-string">"enter"</span>)
            <span class="hljs-keyword">await</span> pilot.pause()
            markdown_viewer = app.screen.query(MarkdownViewer).first()
            self.assertTrue(markdown_viewer)
            <span class="hljs-comment"># After validating the markdown one more time, close the app</span>
            <span class="hljs-comment"># Quit the app by pressing q</span>
            <span class="hljs-keyword">await</span> pilot.press(<span class="hljs-string">"q"</span>)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">'__main__'</span>:
    unittest.main()
</code></pre>
<p>This is invaluable, and it's something that often requires an external toolset (for example, in Java you have the <a target="_blank" href="https://docs.oracle.com/javase/8/docs/api/java/awt/Robot.html">Robot</a> class).</p>
<h2 id="heading-how-to-run-the-applications">How to Run the Applications</h2>
<p>Finally, it's time to get familiar with the mini applications (you can see an animated <a target="_blank" href="https://github.com/josevnz/tutorials/blob/main/docs/EmpireStateRunUp/EmpireStateRunUp.svg">demonstration of the TUI applications here</a>).</p>
<h3 id="heading-browsing-through-the-data">Browsing Through the Data</h3>
<p>The <code>esru_browser</code> is a simple browser that lets you navigate through the raw race data.</p>
<pre><code class="lang-shell">esru_browser
</code></pre>
<p>The application shows all the race details for every runner in a table that you can sort by column.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/esru_browser.png" alt="Raw runners data in a table" width="600" height="400" loading="lazy"></p>
<p><em>The esru_browser window shows all runners' results. Here you can sort, search for runners, and click to get more details</em></p>
<p>And the command palette allows searching for runners by name (it's basically a search bar with fuzzy logic):</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/race_runners_2023-12-31T18_35_53_558956.svg" alt="race_runners_2023-12-31T18_35_53_558956.svg, searching for runners by name" width="600" height="400" loading="lazy"></p>
<p><em>Matches show up on the palette as you type</em></p>
<h3 id="heading-summary-reports">Summary Reports</h3>
<p>To get insights about racer behavior, you need some summary reports (as opposed to drilling down into each racer's details).</p>
<p>This application provides details about the following:</p>
<ul>
<li><p>Count, standard deviation, mean, min, max, 45%, 50%, and 75% for age, time, and pace</p>
</li>
<li><p>Group and count distribution for Age, Wave, and Gender</p>
</li>
</ul>
<pre><code class="lang-shell">esru_numbers
</code></pre>
<p>Some interesting facts about the race:</p>
<ul>
<li><p>The average age was 41 years old, and 40 years old was the largest age group.</p>
</li>
<li><p>The majority of people belonged to the 'BLACK WAVE'.</p>
</li>
<li><p>The majority of people finished the race in 20 to 30 minutes.</p>
</li>
<li><p>The youngest runner was 11 years old, and the oldest was 78.</p>
</li>
</ul>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/esru_numbers.svg" alt="Statistics of interest, like average age, wave they belong, finishing time" width="600" height="400" loading="lazy"></p>
<p><em>esru_numbers gives a bird's eye view of all the racers, categorized by buckets</em></p>
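<p>If you want to see what those summary numbers mean without opening the application, here is a small sketch that computes the same kind of statistics with Python's standard <code>statistics</code> module. The ages are invented sample data, and the sketch reports quartiles as the percentile cut points; it is not the code that <code>esru_numbers</code> runs.</p>

```python
import statistics

# Hypothetical runner ages, only for demonstration
ages = [11, 25, 33, 38, 40, 40, 41, 45, 52, 60, 78]

summary = {
    "count": len(ages),
    "mean": statistics.mean(ages),
    "std": statistics.stdev(ages),  # Sample standard deviation
    "min": min(ages),
    "max": max(ages),
    # quantiles(n=4) returns the three quartile cut points
    "quartiles": statistics.quantiles(ages, n=4),
}
for name, value in summary.items():
    print(f"{name}: {value}")
```

<p>The middle quartile is the median, so it tells you the age that splits the field in half, while the standard deviation tells you how spread out the runners are around the mean.</p>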
<h3 id="heading-finding-outliers">Finding Outliers</h3>
<p>This application uses the <em>Z-score</em> to find the outliers for several metrics for this race:</p>
<pre><code class="lang-shell">esru_outlier
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/esru_outlier-1.svg" alt="Table with outliers details" width="600" height="400" loading="lazy"></p>
<p><em>The esru_outlier main screen shows you racers that did not follow regular patterns</em></p>
<p>Because these results drill down to the BIB number, you can click on a row and get more details about a runner:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/esru_outlier-2.svg" alt="Outlier racer details, including BIB" width="600" height="400" loading="lazy"></p>
<p><em>And you can get details for each outlier. Yes, code is reusable and is the same to show details for any runner</em></p>
<p>Textual has excellent support for rendering Markdown as well as programming languages. Take a look at the code to see for yourself.</p>
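<p>For readers who have not used the Z-score before, here is a minimal sketch of the technique with the standard library. The Z-score of a value is its distance from the mean measured in standard deviations. The function name <code>find_outliers</code>, the threshold of 3, and the finish times are assumptions for illustration, not values taken from <code>esru_outlier</code>.</p>

```python
import statistics


def find_outliers(values: list, threshold: float = 3.0) -> list:
    """Return the values whose absolute Z-score exceeds the threshold."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # Population standard deviation
    if std == 0:
        return []  # All values are identical, so nothing stands out
    return [v for v in values if abs((v - mean) / std) > threshold]


# Hypothetical finish times in minutes, only for demonstration
times = [22, 24, 25, 23, 26, 24, 25, 27, 23, 24, 26, 25, 90]
print(find_outliers(times, threshold=3.0))
```

<p>A lower threshold flags more runners and a higher one flags fewer, so the right value depends on how strict you want the report to be.</p>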
<h3 id="heading-a-few-plot-graphics-for-you">A Few Plot Graphics For You</h3>
<p>The <a target="_blank" href="https://github.com/josevnz/tutorials/blob/main/docs/EmpireStateRunUp/empirestaterunup/apps.py">esru_plot</a> application offers a few plot graphics to help you visualize the data. Inside, the class <code>Plotter</code> does all the heavy lifting.</p>
<h4 id="heading-age-plots">Age plots</h4>
<p>The program can generate two flavors of plot for the same data. The first is a box diagram:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/esru_age_box_plot-1.png" alt="Age plot, Pie chart" width="600" height="400" loading="lazy"></p>
<p><em>The age box diagram we saw before</em></p>
<p>The second is a regular histogram:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/age_histogram.png" alt="Age histogram" width="600" height="400" loading="lazy"></p>
<p><em>Age histogram shows the same as the box diagram but the buckets are more visible. Same data, many ways to explain the racer demographics.</em></p>
<p>You can see from both graphics that the age group with the most participants is the 40-45-year-old bracket, and the outliers are in the 10-20 and 70-80 year old groups.</p>
<h4 id="heading-participants-per-country-plot">Participants per country plot</h4>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/participants_per_country.png" alt="Histogram" width="600" height="400" loading="lazy"></p>
<p><em>This plot shows all the countries with the number of participants, with the best runner from each.</em></p>
<p>No surprises here: the overwhelming majority of racers come from the United States, followed by Mexico. Interestingly, the winner of the 2023 race is from Malaysia, with only 2 runners participating.</p>
<h4 id="heading-gender-distribution">Gender distribution</h4>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/gender_distribution.png" alt="Gender pie" width="600" height="400" loading="lazy"></p>
<p><em>The gender distribution pie showing the best racer for each category</em></p>
<p>The majority of the runners identified themselves as male, followed by female.</p>
<h2 id="heading-what-else-can-we-learn">What Else Can We Learn?</h2>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/05/esru2023_nyc-1.JPG" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>NYC was well represented at the event. Yeah, I'm talking about the NYC police department running in full gear, not me on the left ;-)</em></p>
<p>Participating in this race was a great experience. The best part was that it fueled my curiosity and led me to write this code to get more interesting facts about the race.</p>
<p>There is plenty more to learn about the tools you just saw in this tutorial:</p>
<ul>
<li><p>There are a lot of public race datasets, and you can use them to apply what you learned here. Just take a look at <a target="_blank" href="https://github.com/davidjaimes/nyc-marathon">this dataset of the New York City Marathon, period 1970-2018</a>. What <a target="_blank" href="https://github.com/meiguan/nyc2018marathonfinishers">other questions</a> can you ask about the data?</p>
</li>
<li><p>You saw just the tip of what you can do with Textual. I encourage you to explore the <a target="_blank" href="https://github.com/josevnz/tutorials/blob/main/docs/EmpireStateRunUp/empirestaterunup/apps.py">apps.py</a> module. Take a look at the <a target="_blank" href="https://github.com/Textualize/textual/tree/main/examples">example applications</a> as well.</p>
</li>
<li><p><a target="_blank" href="https://www.selenium.dev/documentation/webdriver/">Selenium WebDriver</a> is not just a tool for web scraping but also for automated testing of web applications. It doesn't get better than having your browser perform automated testing for you. It is a big framework, so be prepared to spend time reading and running your tests. I strongly suggest you look <a target="_blank" href="https://github.com/SeleniumHQ/seleniumhq.github.io/tree/trunk/examples/python">at the examples</a>. Trial and error will give you better results.</p>
</li>
<li><p>Apply for the <a target="_blank" href="https://www.esbnyc.com/empire-state-building-run">Empire State Building Run-Up</a> lottery or run through a charity, if you like this kind of race. Who said <a target="_blank" href="https://en.wikipedia.org/wiki/King_Kong">King Kong</a> is the only one who could make it to the top?</p>
</li>
<li><p>Sadly, I'm not in a position to offer you any training advice. Every person is different. I do recommend that you check with your doctor before you participate in a race like this, and get some professional advice from a running coach.</p>
</li>
<li><p>But most important of all, believe you can do this (the race and writing some tools to process the race data) and have fun while doing it. This is a pre-requisite for any project.</p>
</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Command Line Tricks You Can Learn Faster than Drinking Your Morning Coffee ]]>
                </title>
                <description>
                    <![CDATA[ In this short tutorial, I want to share with you a few tricks and tips to help you deal with some common situations when you're working in the Linux command line. We will cover the following: find xargs and nproc taskset numactl watch inotify-t... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/command-line-tricks-you-can-learn-quickly/</link>
                <guid isPermaLink="false">66d85133f6b5e038a1bde804</guid>
                
                    <category>
                        <![CDATA[ command line ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Linux ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Mon, 22 Jan 2024 23:15:48 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2024/01/mazinger-z.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In this short tutorial, I want to share with you a few tricks and tips to help you deal with some common situations when you're working in the Linux command line.</p>
<p>We will cover the following:</p>
<ul>
<li><p>find</p>
</li>
<li><p>xargs and nproc</p>
</li>
<li><p>taskset</p>
</li>
<li><p>numactl</p>
</li>
<li><p>watch</p>
</li>
<li><p>inotify-tools</p>
</li>
</ul>
<p>I will present you with a challenge and the tools demonstrating how to solve each problem.</p>
<h2 id="heading-what-youll-need">What You'll Need:</h2>
<ul>
<li><p>A Linux distribution</p>
</li>
<li><p>Curiosity</p>
</li>
</ul>
<h2 id="heading-how-to-handle-directories-with-many-files">How to Handle Directories with Many Files</h2>
<p>You may have encountered this problem before: you tried to do a <code>ls</code> on a directory with a very large number of files, but the command threw an 'argument list too long' error:</p>
<pre><code class="lang-shell">josevnz@orangepi5:/data/test_xargs$ ls *
-bash: /usr/bin/ls: Argument list too long
</code></pre>
<p>This is because <a target="_blank" href="https://en.wikipedia.org/wiki/POSIX">POSIX</a>-compatible systems have a limit for the maximum number of bytes you can pass as an argument:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 Documents]$ getconf ARG_MAX
2097152
</code></pre>
<p>Two million bytes may seem like a lot, or not enough, depending on whom you ask. But the limit is also a protection against attacks or innocent mistakes with bad consequences.</p>
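<p>The same limit can be read from a script. This short Python sketch queries it through <code>os.sysconf</code> and makes a rough estimate of how many file names like the ones in this example would fit. The estimate is deliberately simplistic and is only meant to give a sense of scale.</p>

```python
import os

# SC_ARG_MAX is the same limit that `getconf ARG_MAX` reports
arg_max = os.sysconf("SC_ARG_MAX")
print(f"ARG_MAX is {arg_max} bytes")

# Rough estimate for 15-character file names such as test_file000001.
# Each argument also needs a terminating NUL byte, and the environment
# shares the same space, so the real number is lower than this.
name_length = len("test_file000001") + 1
print(f"Approximate upper bound: {arg_max // name_length} file names")
```

<p>With a directory holding several hundred thousand files, it is easy to see how a simple glob can go over the limit.</p>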
<p>In any case, how can you bypass this limitation? Well, there are many ways to do so.</p>
<h3 id="heading-using-shell-built-in">Using Shell built-in</h3>
<p>Bash built-ins such as <code>echo</code> don't have the ARG_MAX limitation, because the shell runs them without launching a new program:</p>
<pre><code class="lang-shell">josevnz@orangepi5:/data/test_xargs$ echo *|ls
...
test_file055554  test_file111110  test_file166666  test_file222222  test_file277778  test_file333334  test_file388890  test_file444446
test_file055555  test_file111111  test_file166667  test_file222223  test_file277779  test_file333335  test_file388891  test_file444447
test_file055556  test_file111112  test_file166668  test_file222224  test_file277780  test_file333336  test_file388892  test_file444448
</code></pre>
<p>This is probably the simplest solution, but let's see another way.</p>
<h3 id="heading-using-find-when-you-want-formatting-options">Using <code>find</code> when you want formatting options</h3>
<p>Or you can use this well-known <code>find</code> flag:</p>
<pre><code class="lang-shell">find /data/test_xargs -type f -ls
</code></pre>
<p>Or with <em>formatting</em>, to mimic <code>ls</code>:</p>
<pre><code class="lang-shell">find /data/test_xargs -type f -printf '%f\n'
</code></pre>
<p>This is fast and also the most complete solution. But before moving on I'll show you yet another way.</p>
<h3 id="heading-using-xargs">Using xargs</h3>
<p>The following works:</p>
<pre><code class="lang-shell">find /data/test_xargs -type f -print0 | xargs -0 ls
</code></pre>
<p>But it's inefficient, as you are forking 3 processes to display the contents of the directory. And on top of that, xargs <em>is throttling</em> how many files will be passed to the ls command.</p>
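<p>If you are writing a script rather than typing at the prompt, you can list a huge directory without building an argument list at all. Here is a minimal Python sketch using <code>os.scandir</code>, which yields entries one at a time. The helper name <code>list_files</code> and the temporary test files are made up for the example.</p>

```python
import os
import tempfile


def list_files(directory: str):
    """Yield the names of regular files in a directory, one at a time."""
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                yield entry.name


# Demonstration on a throwaway directory with a few hypothetical files
with tempfile.TemporaryDirectory() as tmp:
    for index in range(3):
        open(os.path.join(tmp, f"test_file{index:06d}"), "w").close()
    names = sorted(list_files(tmp))
    print(names)
```

<p>Because the function is a generator, memory use stays flat no matter how many files the directory holds.</p>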
<p>Let's move on and check out a different problem.</p>
<h2 id="heading-how-to-run-more-programs-without-crashing-the-server">How to Run More Programs Without Crashing the Server</h2>
<h3 id="heading-first-you-walk-then-you-run-do-it-serially">First you walk then you run: Do it serially</h3>
<p>Say that you want to compress all the files in the directory from our previous example. A first try would look like this:</p>
<pre><code class="lang-shell">gzip *
</code></pre>
<p>This will take a long time, as gzip processes one file at a time.</p>
<p>You might think to do something like this to compress files in parallel:</p>
<pre><code class="lang-shell">josevnz@orangepi5:/data/test_xargs$ for file in $(ls /data/test_xargs/*); do gzip $file &amp; done
-bash: /usr/bin/ls: Argument list too long
</code></pre>
<p>ARG_MAX strikes again.</p>
<p>We know about <code>find</code> now, so what if we do this:</p>
<pre><code class="lang-shell">for file in $(find $PWD -type f); do gzip $file &amp; done
wait
echo "All files compressed?"
</code></pre>
<p>That will either make your <strong>server run out of memory</strong> or <strong>crush it under very heavy CPU utilization</strong> because you are forking a gzip instance for every file found.</p>
<h3 id="heading-our-first-attempt-at-parallelism-and-throttling-the-art-of-self-control">Our first attempt at parallelism and throttling (the art of self control)</h3>
<p>What you need is a way to <em>throttle</em> your compression requests, so you don't launch more processes than the number of CPUs you have.</p>
<p>Let's try that again with <code>find</code> and <code>xargs</code>:</p>
<pre><code class="lang-shell">find /data/test_xargs -type f -print0| xargs -0 -P $(($(nproc)-1)) -I % gzip %
</code></pre>
<p>Oh. That looks like a fancy one-liner. Let me explain how it works:</p>
<ol>
<li><p>Use <code>find</code> to get all the files in the given directory, with the null character as a separator so that file names containing spaces or other unusual characters are handled correctly.</p>
</li>
<li><p><code>nproc</code> tells you how many CPUs you have. Bash arithmetic expansion wrapped around a command substitution then subtracts 1: <code>$(($(nproc)-1))</code></p>
</li>
<li><p>Finally, <code>xargs</code> runs no more than -P processes at a time (in my case 8 CPUs - 1 = 7 jobs), replacing the '%' with the name of the file to compress.</p>
</li>
</ol>
<p>Note: There are other ways to get the number of CPUs on the machine, like parsing <code>/proc/cpuinfo</code>. There are also more efficient compression tools, but gzip is available on pretty much any Linux or Unix system.</p>
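<p>The same throttling idea can be sketched in Python with a worker pool whose size is the number of CPUs minus one. This is only an illustration of the technique, not a replacement for the one-liner. The helper name <code>compress</code> and the temporary files are made up for the example, and threads are used to keep the sketch short (<code>ProcessPoolExecutor</code> accepts the same <code>max_workers</code> parameter).</p>

```python
import gzip
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def compress(path: Path) -> Path:
    """Compress one file to path.gz and remove the original, like gzip does."""
    target = path.with_name(path.name + ".gz")
    with open(path, "rb") as source, gzip.open(target, "wb") as dest:
        shutil.copyfileobj(source, dest)
    path.unlink()
    return target


# Leave one CPU free, but never go below one worker
workers = max(1, (os.cpu_count() or 2) - 1)

# Demonstration on a throwaway directory with hypothetical files
with tempfile.TemporaryDirectory() as tmp:
    for index in range(20):
        Path(tmp, f"test_file{index:06d}").write_text("some data\n" * 100)
    files = [p for p in Path(tmp).iterdir() if p.is_file()]
    # The pool never runs more than `workers` compressions at once
    with ThreadPoolExecutor(max_workers=workers) as pool:
        compressed = list(pool.map(compress, files))
    print(f"Compressed {len(compressed)} files with {workers} workers")
```

<p>The pool plays the role of <code>xargs -P</code>: work is queued up front, but only a fixed number of jobs run at any moment.</p>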
<p>OK, time to see our next problem.</p>
<h2 id="heading-cpu-affinity-with-taskset-to-maximize-execution-time">CPU Affinity with taskset to Maximize Execution Time</h2>
<p>Despite limiting the number of CPUs, some intensive jobs can slow down other processes on your machine when looking for resources. There are a few things you can do to keep the performance of the server under control, like using <a target="_blank" href="https://github.com/util-linux/util-linux/blob/master/schedutils/taskset.c">taskset</a>:</p>
<blockquote>
<p>The taskset command is used to set or retrieve the CPU affinity<br>of a running process given its pid, or to launch a new command<br>with a given CPU affinity. CPU affinity is a scheduler property<br>that "bonds" a process to a given set of CPUs on the system.</p>
</blockquote>
<p>In general, we want to leave one of the CPUs 'free' for operating system tasks. The kernel is normally pretty good at keeping running processes glued to a specific CPU to avoid context switching, but if you want to control which CPUs your process runs on, you can use <code>taskset</code>. Note that it has to wrap <code>xargs</code>, the command that launches the compression jobs, because an affinity set on <code>find</code> would not carry over to the other side of the pipe:</p>
<pre><code class="lang-shell">find /data/test_xargs -type f -print0| taskset -c 1,2,3,4,5,6,7 xargs -0 -P $(($(nproc)-1)) -I % gzip %
</code></pre>
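<p>CPU affinity can also be set from inside a program. The sketch below uses <code>os.sched_getaffinity</code> and <code>os.sched_setaffinity</code>, which are available on Linux only. Reserving the lowest-numbered CPU is an arbitrary choice made for this example.</p>

```python
import os

# PID 0 means "the current process"
original = os.sched_getaffinity(0)
print(f"Allowed CPUs before: {sorted(original)}")

# Keep the lowest-numbered CPU free when more than one is available
reserved = min(original)
target = (original - {reserved}) or original
os.sched_setaffinity(0, target)
print(f"Allowed CPUs after: {sorted(os.sched_getaffinity(0))}")

# Restore the original mask so the rest of the program is unaffected
os.sched_setaffinity(0, original)
```

<p>Child processes inherit the affinity mask of their parent, so anything launched while the restricted mask is active stays on the same set of CPUs.</p>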
<h3 id="heading-taskset-the-only-game-in-town-not-so-numactl-fast">taskset the only game in town? not so numactl fast!</h3>
<p>What is <a target="_blank" href="https://documentation.suse.com/sles/12-SP4/html/SLES-all/cha-tuning-numactl.html">NUMA, and why should you care</a>?</p>
<blockquote>
<p>There are physical limitations to hardware that are encountered when many CPUs and lots of memory are required. The important limitation is that there is limited communication bandwidth between the CPUs and the memory.</p>
<p>One architecture modification that was introduced to address this is Non-Uniform Memory Access (NUMA).</p>
</blockquote>
<p>So most simple desktop machines only have a single NUMA node, like mine:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 ~]$ numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 15679 MB
node 0 free: 5083 MB
node distances:
node   0 
  0:  10
# Or with lscpu
[josevnz@dmaf5 ~]$ lscpu |rg NUMA
NUMA node(s):                    1
NUMA node0 CPU(s):               0-7
</code></pre>
<p>If you have more than one NUMA node, you may want to 'pin' (set the affinity of) your program so that it uses the CPUs and memory of the same node.</p>
<p>For example, on a machine with 16 cores, with CPUs 0-7 on node 0 and 8-15 on node 1, we could force our compression program to run on all the CPUs of node 1 and use the memory of node 1 like this:</p>
<pre><code class="lang-shell"># numactl wraps xargs (not find), so the gzip workers inherit the CPU and memory binding
find /data/test_xargs -type f -print0| numactl --physcpubind 8-15 --membind=1 xargs -0 -P $(($(nproc)-1)) -I % gzip %
</code></pre>
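<p>Tools like <code>lscpu</code> report the CPUs of a node as a compact list such as <code>0-7</code>. If you want to reuse that information in a script, a small parser along these lines may help. The function name <code>parse_cpu_list</code> is hypothetical, and the sketch only handles comma-separated numbers and simple ranges.</p>

```python
def parse_cpu_list(text):
    """Expand a CPU list such as '0-3,8' into a sorted list of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range like '8-15' includes both ends.
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    print(parse_cpu_list("8-15"))
```

<p>The resulting list could be passed to <code>os.sched_setaffinity</code> to pin a process to one node's CPUs. Note that this only covers CPU placement. Binding memory to a node still needs <code>numactl</code> or a similar tool.</p>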
<h2 id="heading-keeping-an-eye-on-things">Keeping an Eye on Things</h2>
<h3 id="heading-just-watch-what-i-do">Just watch what I do</h3>
<p>The <a target="_blank" href="https://www.man7.org/linux/man-pages/man1/watch.1.html">watch</a> command lets you run a command periodically, and can even highlight the differences between consecutive runs (with the <code>-d</code> flag). For example, <code>watch -n 10 ls</code> produces:</p>
<pre><code class="lang-shell">Every 10.0s: ls                                                                                                         orangepi5: Wed May 24 22:46:33 2023

test_file000001.gz
test_file000002.gz
test_file000003.gz
test_file000004.gz
test_file000005.gz
test_file000006.gz
test_file000007.gz
test_file000008.gz
test_file000009.gz
test_file000010.gz
...
</code></pre>
<p>This shows me the output of the <code>ls</code> command every 10 seconds. Polling like this is a simple way to detect changes in a directory, but it's not easy to automate and definitely not efficient.</p>
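<p>To see why polling is clumsy to automate, here is a rough Python sketch of what a script would have to do on every pass: list the whole directory again and compare it with the previous listing. The function names <code>snapshot</code> and <code>diff_snapshots</code> are made up for this example.</p>

```python
import os


def snapshot(directory):
    """Return the set of entry names currently in `directory`."""
    return set(os.listdir(directory))


def diff_snapshots(before, after):
    """Return (added, removed) names between two snapshots, sorted."""
    return sorted(after - before), sorted(before - after)


if __name__ == "__main__":
    first = snapshot(".")
    second = snapshot(".")
    print(diff_snapshots(first, second))
```

<p>Every pass pays the cost of reading the full directory, even when nothing changed, and changes that happen and get undone between two passes are never noticed.</p>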
<p>Wouldn't it be nice if the kernel could tell me about changes in my directories?</p>
<h3 id="heading-a-better-way-to-know-about-changes-on-the-filesystem-with-inotify-tools">A better way to know about changes on the filesystem, with inotify-tools</h3>
<p>You may need to install this separately, but it should be easy to do. On Ubuntu:</p>
<pre><code class="lang-shell">sudo apt-get install inotify-tools
</code></pre>
<p>On Fedora:</p>
<pre><code class="lang-shell">sudo dnf install -y inotify-tools
</code></pre>
<p>So how can we monitor events on a given directory?</p>
<p>On one terminal we can run inotifywait:</p>
<pre><code class="lang-shell">josevnz@orangepi5:/data/test_xargs$ inotifywait --recursive /data/test_xargs/
Setting up watches.  Beware: since -r was given, this may take a while!
Watches established.
</code></pre>
<p>And on another terminal we can touch some files to simulate an event:</p>
<pre><code class="lang-shell">josevnz@orangepi5:/data/test_xargs$ pwd
/data/test_xargs
josevnz@orangepi5:/data/test_xargs$ touch test_file285707.gz test_file357136.gz test_file428565.gz
</code></pre>
<p>The original terminal will get the first event and exit:</p>
<pre><code class="lang-shell">Watches established.
/data/test_xargs/ OPEN test_file285707.gz
</code></pre>
<p>To make it keep listening for events indefinitely, add the <code>--monitor</code> flag:</p>
<pre><code class="lang-shell">josevnz@orangepi5:/data/test_xargs$ inotifywait --recursive --monitor /data/test_xargs/
</code></pre>
<p>If we touch the files again from a separate terminal, this time we will see all the events:</p>
<pre><code class="lang-shell">Setting up watches.  Beware: since -r was given, this may take a while!
Watches established.
/data/test_xargs/ OPEN test_file285707.gz
/data/test_xargs/ ATTRIB test_file285707.gz
/data/test_xargs/ CLOSE_WRITE,CLOSE test_file285707.gz
/data/test_xargs/ OPEN test_file357136.gz
/data/test_xargs/ ATTRIB test_file357136.gz
/data/test_xargs/ CLOSE_WRITE,CLOSE test_file357136.gz
/data/test_xargs/ OPEN test_file428565.gz
/data/test_xargs/ ATTRIB test_file428565.gz
/data/test_xargs/ CLOSE_WRITE,CLOSE test_file428565.gz
</code></pre>
<p>This is less taxing on the operating system than repeatedly listing the directory and filtering out the differences ourselves.</p>
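<p>If you want to react to these events from a script, each line of the default output can be split into the watched directory, the event names, and the file name. The sketch below assumes the default <code>inotifywait</code> output format shown above and paths without spaces. <code>parse_inotifywait_line</code> is a name I chose for illustration.</p>

```python
def parse_inotifywait_line(line):
    """Split one default-format inotifywait line into its three fields.

    Assumes neither the watched path nor the file name contains spaces.
    """
    watched, events, *rest = line.split()
    # Events on the watched directory itself carry no file name.
    filename = rest[0] if rest else ""
    return watched, events.split(","), filename


if __name__ == "__main__":
    sample = "/data/test_xargs/ CLOSE_WRITE,CLOSE test_file285707.gz"
    print(parse_inotifywait_line(sample))
```

<p>In a real script the lines would typically come from the standard output of an <code>inotifywait --monitor</code> process started with the <code>subprocess</code> module.</p>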
<h2 id="heading-whats-next">What's Next</h2>
<p>There is so much more to explore. The tips above introduced you to some important concepts, so why not learn more about them?</p>
<ul>
<li><p>The <a target="_blank" href="https://askubuntu.com/questions/217764/argument-list-too-long-when-copying-files">Ubuntu forum</a> has a great conversation about <em>xargs</em>, <em>find</em>, <em>ulimit</em> and other things. Knowledge is power.</p>
</li>
<li><p>Red Hat has <a target="_blank" href="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/configuring-an-operating-system-to-optimize-cpu-utilization_monitoring-and-managing-system-status-and-performance">a nice page</a> about NUMA, taskset, and interrupt handling. If you are serious about fine-tuning the performance of your processes, please read it.</p>
</li>
<li><p>If you liked <a target="_blank" href="https://en.wikipedia.org/wiki/Inotify">inotify</a> and want to use it from your Python scripts, take a look at <a target="_blank" href="https://github.com/seb-m/pyinotify/wiki">pyinotify</a>.</p>
</li>
<li><p>Find may be intimidating, but <a target="_blank" href="https://www.digitalocean.com/community/tutorials/how-to-use-find-and-locate-to-search-for-files-on-linux">this tutorial</a> will make it easier to understand.</p>
</li>
<li><p>Source code for this tutorial can be found <a target="_blank" href="https://github.com/josevnz/CommandLineTipsAndTricks">here</a>.</p>
</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Get Started with FPM ]]>
                </title>
                <description>
                    <![CDATA[ FPM is a powerful wrapper that allows you to create packages for multiple programs in multiple operating systems. In this tutorial I will show you how you can replace some of the tedious packaging of third party applications. What You Need to Complet... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/getting-started-with-fpm/</link>
                <guid isPermaLink="false">66d8513df6f7ca5a6046250d</guid>
                
                    <category>
                        <![CDATA[ Linux ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Fri, 19 Jan 2024 16:42:55 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1725458454908/8564e5d2-939a-4297-b619-801d0fe695cb.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p><a target="_blank" href="https://fpm.readthedocs.io/en/latest/getting-started.html">FPM</a> is a powerful wrapper that allows you to create packages for multiple programs in multiple operating systems.</p>
<p>In this tutorial I will show you how you can replace some of the tedious packaging of third party applications.</p>
<h2 id="heading-what-you-need-to-complete-this-tutorial">What You Need to Complete this Tutorial</h2>
<ul>
<li><p>A Linux distribution (I used Fedora but this works with anything)</p>
</li>
<li><p>Elevated privileges (if you want to install your own packages)</p>
</li>
</ul>
<h2 id="heading-when-your-package-manager-isnt-simple-enough">When Your Package Manager Isn't Simple Enough</h2>
<p>Often times, you'll want to have the ultimate control over how you package an application. But there are a few occasions when this may be overkill:</p>
<ol>
<li><p>The third party application is simple or small enough that a tar file would be good enough to install it. Yet you want to enjoy the benefits of upgrades and roll-backs, like the ones offered by RPM.</p>
</li>
<li><p>You need or want to package an application from one format (say .tar.gz) to Debian '.deb' or RPM.</p>
</li>
<li><p>You have to package multiple applications that are only offered in Source format or pre-packaged binaries, like when upgrading the operating system. And you don't want to spend an eternity re-packaging the third party applications.</p>
</li>
</ol>
<h2 id="heading-how-to-package-an-existing-application-the-old-way">How to Package an Existing Application the Old Way</h2>
<p>I wrote a small demo application that dumps system facts (like disk utilization) in JSON format, called <a target="_blank" href="https://github.com/josevnz/jdumpertools"><code>jdumpertools</code></a>. The application is very simple, is written in C, and has an <a target="_blank" href="https://github.com/josevnz/jdumpertools/blob/main/jdumpertools.spec">RPM spec file</a> that you can use to package the software.</p>
<p>There are a few manual steps required to create the RPM:</p>
<ol>
<li><p>Download the source distribution (or binary): <em>git clone</em> <a target="_blank" href="https://github.com/josevnz/jdumpertools.git"><em>https://github.com/josevnz/jdumpertools.git</em></a></p>
</li>
<li><p>Prepare the <a target="_blank" href="https://github.com/josevnz/jdumpertools/blob/main/jdumpertools.spec">RPM spec file</a>, which should take care of the compilation (or just the packaging) of the software, as well as the installation location</p>
</li>
<li><p>Lint the spec file, fix common errors</p>
</li>
</ol>
<p>So let's see how the <code>jdumpertools</code> RPM spec file works.</p>
<p>First, take a look at the spec file:</p>
<pre><code class="lang-python">Name:           jdumpertools
<span class="hljs-comment"># <span class="hljs-doctag">TODO:</span> Figure out a better way to update version here and on Makefile</span>
%<span class="hljs-keyword">global</span> major <span class="hljs-number">0</span>
Version:        v%{major}<span class="hljs-number">.2</span>
Release:        <span class="hljs-number">1</span>%{?dist}
Summary:        Programs that can be used to dump Linux usage data <span class="hljs-keyword">in</span> JSON format

License:        ASL <span class="hljs-number">2.0</span>
URL:            https://github.com/josevnz/jdumpertools
Source0:        %{name}-%{version}.tar.gz

BuildRequires:  bash,tar,gzip,rpmdevtools,rpmlint,make,gcc &gt;= <span class="hljs-number">10.2</span><span class="hljs-number">.1</span>
Requires:       bash

%<span class="hljs-keyword">global</span> debug_package %{nil}

%description

Jdumpertools <span class="hljs-keyword">is</span> a collection of programs that can be used to dump
linux usage data <span class="hljs-keyword">in</span> JSON format, so it can be ingested by other tools.

* jdu: Similar to UNIX <span class="hljs-string">'/bin/du'</span> command.
* jutmp: UTMP database dumper

%prep
%setup -q -n jdumpertools

%build
make all

%install

/usr/bin/mkdir -p %{buildroot}/%{_bindir}
/usr/bin/mkdir -p %{buildroot}/%{_mandir}/man8
/usr/bin/cp -v -p jdu jutmp %{buildroot}/%{_bindir}
/usr/bin/cp -v -p jdu<span class="hljs-number">.1</span> jutmp<span class="hljs-number">.1</span> %{buildroot}/%{_mandir}/man8/
/usr/bin/gzip %{buildroot}/%{_mandir}/man8/*
/usr/bin/mkdir -p %{buildroot}/%{_libdir}
/usr/bin/cp -v -p libjdumpertools.so.%{major} %{buildroot}/%{_libdir}
/usr/bin/strip %{buildroot}/%{_bindir}/{jdu,jutmp}
/usr/bin/strip %{buildroot}/%{_libdir}/*

%clean
rm -rf %{buildroot}

%files
%{_bindir}/jdu
%{_bindir}/jutmp
%{_libdir}/libjdumpertools.so.%{major}
%{_libdir}/libjdumpertools.so
%license LICENSE
%doc README.md
%doc %{_mandir}/man8/jdu<span class="hljs-number">.1</span>.gz
%doc %{_mandir}/man8/jutmp<span class="hljs-number">.1</span>.gz


%changelog
* Sun Oct  <span class="hljs-number">3</span> <span class="hljs-number">2021</span> Jose Vicente Nunez &lt;kodegeek.com@protonmail.com&gt; - v0<span class="hljs-number">.2</span><span class="hljs-number">-1</span>
- Applied fixes <span class="hljs-keyword">from</span> rpmlint: man page, typos on spec file, striped binaries, etc.
* Mon Jan  <span class="hljs-number">4</span> <span class="hljs-number">2021</span> Jose Vicente Nunez &lt;kodegeek.com@protonmail.com&gt; - v0<span class="hljs-number">.1</span><span class="hljs-number">-1</span>
- First version being packaged
</code></pre>
<p>And now let's build it:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ sudo dnf install -y rpmdevtools rpmlint
...
[josevnz@dmaf5 test]$ git clone https://github.com/josevnz/jdumpertools.git
Cloning into 'jdumpertools'...
remote: Enumerating objects: 228, done.
remote: Counting objects: 100% (228/228), done.
remote: Compressing objects: 100% (137/137), done.
remote: Total 228 (delta 132), reused 157 (delta 79), pack-reused 0
Receiving objects: 100% (228/228), 3.15 MiB | 9.67 MiB/s, done.
Resolving deltas: 100% (132/132), done.

[josevnz@dmaf5 test]$ cd jdumpertools/
[josevnz@dmaf5 jdumpertools]$ rpmbuild -ba jdumpertools.spec
...
+ exit 0
Provides: jdumpertools = v0.2-1.fc37 jdumpertools(x86-64) = v0.2-1.fc37 libjdumpertools.so()(64bit)
Requires(rpmlib): rpmlib(CompressedFileNames) &lt;= 3.0.4-1 rpmlib(FileDigests) &lt;= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) &lt;= 4.0-1
Requires: libc.so.6()(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_2.3)(64bit) libjdumpertools.so()(64bit) rtld(GNU_HASH)
Checking for unpackaged file(s): /usr/lib/rpm/check-files /home/josevnz/rpmbuild/BUILDROOT/jdumpertools-v0.2-1.fc37.x86_64
Wrote: /home/josevnz/rpmbuild/SRPMS/jdumpertools-v0.2-1.fc37.src.rpm
Wrote: /home/josevnz/rpmbuild/RPMS/x86_64/jdumpertools-v0.2-1.fc37.x86_64.rpm
Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.42keBq
+ umask 022
+ cd /home/josevnz/rpmbuild/BUILD
+ cd jdumpertools
+ rm -rf /home/josevnz/rpmbuild/BUILDROOT/jdumpertools-v0.2-1.fc37.x86_64
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(rmbuild): /bin/sh -e /var/tmp/rpm-tmp.aZjb6s
+ umask 022
+ cd /home/josevnz/rpmbuild/BUILD
+ rm -rf jdumpertools jdumpertools.gemspec
+ RPM_EC=0
++ jobs -p
+ exit 0
...
[josevnz@dmaf5 jdumpertools]$ ls -l $HOME/rpmbuild/RPMS/x86_64/jdumpertools-v0.2-1.fc37.x86_64.rpm
-rw-r--r--. 1 josevnz josevnz 22118 Jun  2 14:03 /home/josevnz/rpmbuild/RPMS/x86_64/jdumpertools-v0.2-1.fc37.x86_64.rpm
</code></pre>
<p>Then you install the RPM like any other RPM:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ sudo dnf install -y $HOME/rpmbuild/RPMS/x86_64/jdumpertools-v0.2-1.fc37.x86_64.rpm
Last metadata expiration check: 1:36:46 ago on Fri 02 Jun 2023 12:30:31 PM EDT.
Dependencies resolved.
=================================================================================================================================
 Package                         Architecture              Version                         Repository                       Size
=================================================================================================================================
Installing:
 jdumpertools                    x86_64                    v0.2-1.fc37                     @commandline                     22 k

Transaction Summary
=================================================================================================================================
Install  1 Package

Total size: 22 k
Installed size: 57 k
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                         1/1 
  Installing       : jdumpertools-v0.2-1.fc37.x86_64                                                                         1/1 
  Running scriptlet: jdumpertools-v0.2-1.fc37.x86_64                                                                         1/1 
  Verifying        : jdumpertools-v0.2-1.fc37.x86_64                                                                         1/1 

Installed:
  jdumpertools-v0.2-1.fc37.x86_64                                                                                                

Complete!
</code></pre>
<p>It's not terrible, especially if you plan to make updates, but can we do this in an easier way?</p>
<h2 id="heading-how-to-install-fpm">How to Install FPM</h2>
<p>The <a target="_blank" href="https://fpm.readthedocs.io/en/latest/getting-started.html">getting started</a> document is the simplest reference for getting FPM up and running.</p>
<p>First you'll install some dependencies, for example in Fedora:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ sudo dnf install -y gem
[josevnz@dmaf5 jdumpertools]$ sudo dnf install -y rpm-build squashfs-tools
</code></pre>
<p>And then you'll install FPM itself:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ gem install --user-install fpm
Fetching insist-1.0.0.gem
Fetching clamp-1.0.1.gem
Fetching stud-0.0.23.gem
Fetching rexml-3.2.5.gem
Fetching mustache-0.99.8.gem
Fetching dotenv-2.8.1.gem
Fetching cabin-0.9.0.gem
Fetching pleaserun-0.0.32.gem
Fetching fpm-1.15.1.gem
Fetching backports-3.24.1.gem
...
Done installing documentation for stud, rexml, mustache, insist, dotenv, clamp, cabin, pleaserun, backports, arr-pm, fpm after 5 seconds
11 gems installed
</code></pre>
<h2 id="heading-how-to-package-jdumpertools-as-an-rpm-without-a-spec-file">How to Package <code>jdumpertools</code> as an RPM, Without a Spec File</h2>
<p>Well, we need some files to package. This <a target="_blank" href="https://github.com/josevnz/jdumpertools">distribution</a> comes with a <em>Makefile</em>, so building it is as easy as pie:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ make
gcc -Wall -g -Og -Wextra -Werror -Werror=format-security -std=c11   -DJDUMPERTOOLS_VERSION=v0.2 -fPIC jdumpertools.h jdumpertools.c -I /home/josevnz/test/jdumpertools -shared -Wl,-soname,libjdumpertools.so -o libjdumpertools.so.0
gcc jdumpertools.h jdu.c libjdumpertools.so.0 -Wall -g -Og -Wextra -Werror -Werror=format-security -std=c11   -DJDUMPERTOOLS_VERSION=v0.2 -L /home/josevnz/test/jdumpertools -l jdumpertools -o jdu
gcc jdumpertools.h jutmp.c -Wall -g -Og -Wextra -Werror -Werror=format-security -std=c11   -DJDUMPERTOOLS_VERSION=v0.2 -L /home/josevnz/test/jdumpertools -l jdumpertools -o jutmp
...
[josevnz@dmaf5 jdumpertools]$ ls
CODE_OF_CONDUCT.md  INSTALL.md  jdu.c           jdumpertools.spec  jutmp.c               Makefile        SECURITY.md
CONTRIBUTING.md     jdu         jdumpertools.c  jutmp              libjdumpertools.so.0  mazinger-z.png
Dockerfile          jdu.1       jdumpertools.h  jutmp.1            LICENSE               README.md
[josevnz@dmaf5 jdumpertools]$ fpm -t rpm -s dir --name jdumpertools --rpm-autoreq --rpm-os linux --rpm-summary 'Programs that can be used to dump Linux usage data in JSON format' --license 'ASL 2.0' --version v0.21 --depends bash --maintainer 'Jose Vicente Nunez &lt;kodegeek.com@protonmail.com&gt;' --url https://github.com/josevnz/jdumpertools jdu=/usr/bin/jdu jutmp=/usr/bin/jutmp jdu.1=/usr/share/man/man1/jdu.1.gz jutmp.1=/usr/share/man/man8/jutmp.1.gz
Created package {:path=&gt;"jdumpertools-v0.21-1.x86_64.rpm"}
</code></pre>
<p>So no spec file, and we've got ourselves an RPM.</p>
<p>What if I want to create packages for other distributions? I just need to make a few changes on the command line:</p>
<p>Debian package:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ fpm -t deb -s dir --name jdumpertools --rpm-autoreq --rpm-os linux --rpm-summary 'Programs that can be used to dump Linux usage data in JSON format' --license 'ASL 2.0' --version v0.21 --depends bash --maintainer 'Jose Vicente Nunez &lt;kodegeek.com@protonmail.com&gt;' --url https://github.com/josevnz/jdumpertools jdu=/usr/bin/jdu jutmp=/usr/bin/jutmp jdu.1=/usr/share/man/man1/jdu.1.gz jutmp.1=/usr/share/man/man8/jutmp.1.gz
Debian 'Version' field needs to start with a digit. I was provided 'v0.21' which seems like it just has a 'v' prefix to an otherwise-valid Debian version, I'll remove the 'v' for you. {:level=&gt;:warn}
Created package {:path=&gt;"jdumpertools_0.21_amd64.deb"}
</code></pre>
<p>Self-extracting script:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ fpm -t sh -s dir --name jdumpertools --rpm-autoreq --rpm-os linux --rpm-summary 'Programs that can be used to dump Linux usage data in JSON format' --license 'ASL 2.0' --version v0.21 --depends bash --maintainer 'Jose Vicente Nunez &lt;kodegeek.com@protonmail.com&gt;' --url https://github.com/josevnz/jdumpertools jdu=/usr/bin/jdu jutmp=/usr/bin/jutmp jdu.1=/usr/share/man/man1/jdu.1.gz jutmp.1=/usr/share/man/man8/jutmp.1.gz
Created package {:path=&gt;"jdumpertools.sh"}
</code></pre>
<p>tar file:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ fpm -t tar -s dir --name jdumpertools --rpm-autoreq --rpm-os linux --rpm-summary 'Programs that can be used to dump Linux usage data in JSON format' --license 'ASL 2.0' --version v0.21 --depends bash --maintainer 'Jose Vicente Nunez &lt;kodegeek.com@protonmail.com&gt;' --url https://github.com/josevnz/jdumpertools jdu=/usr/bin/jdu jutmp=/usr/bin/jutmp jdu.1=/usr/share/man/man1/jdu.1.gz jutmp.1=/usr/share/man/man8/jutmp.1.gz
Created package {:path=&gt;"jdumpertools.tar"}
</code></pre>
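<p>Since only the <code>-t</code> value changes between these runs, the repetition is easy to script. Below is a minimal Python sketch that prints one command per target. It reuses only flags that appear in the commands above, trims the option list for brevity, and the names <code>COMMON_OPTIONS</code>, <code>FILES</code>, and <code>fpm_command</code> are my own.</p>

```python
import shlex

# Options shared by every target; values copied from the commands above.
COMMON_OPTIONS = [
    "--name", "jdumpertools",
    "--license", "ASL 2.0",
    "--version", "v0.21",
    "--depends", "bash",
    "--url", "https://github.com/josevnz/jdumpertools",
]
# source=destination mappings, as used with the 'dir' source type.
FILES = ["jdu=/usr/bin/jdu", "jutmp=/usr/bin/jutmp"]


def fpm_command(target):
    """Return the fpm argument list for one output format."""
    return ["fpm", "-t", target, "-s", "dir", *COMMON_OPTIONS, *FILES]


if __name__ == "__main__":
    for target in ("rpm", "deb", "sh", "tar"):
        # Print only; pass the list to subprocess.run() to really build.
        print(shlex.join(fpm_command(target)))
```

<p>Keeping the arguments in a list avoids shell quoting problems with values that contain spaces, such as the license string.</p>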
<p>This is already very convenient. Now I want to show you another use case for FPM.</p>
<h2 id="heading-how-to-repackage-existing-software">How to Repackage Existing Software</h2>
<p>Say that you want to distribute a CPAN module which doesn't have an RPM. You could spend quality time writing a spec file for it, or you could use FPM to do the work for you.</p>
<p>First, let's install a new dependency for Fedora:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ sudo dnf install -y perl-App-cpanminus
</code></pre>
<p>And then let's build our RPM:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ fpm -t rpm -s cpan Archive::Tar
Created package {:path=&gt;"perl-Archive-Tar-3.02-1.noarch.rpm"}
</code></pre>
<p>Did it work?</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ rpm -qilp perl-Archive-Tar-3.02-1.noarch.rpm
Name        : perl-Archive-Tar
Version     : 3.02
Release     : 1
Architecture: noarch
Install Date: (not installed)
Group       : default
Size        : 177677
License     : perl_5
Signature   : (none)
Source RPM  : perl-Archive-Tar-3.02-1.src.rpm
Build Date  : Fri 02 Jun 2023 04:36:45 PM EDT
Build Host  : dmaf5
Relocations : / 
Packager    : &lt;josevnz@dmaf5&gt;
Vendor      : Jos Boumans &lt;kane[at]cpan.org&gt;
URL         : http://example.com/no-uri-given
Summary     : Manipulates TAR archives
Description :
Manipulates TAR archives
/usr/local/bin/ptar
/usr/local/bin/ptardiff
/usr/local/bin/ptargrep
/usr/local/share/man/man1/ptar.1
/usr/local/share/man/man1/ptardiff.1
/usr/local/share/man/man1/ptargrep.1
/usr/local/share/man/man3/Archive::Tar.3pm
/usr/local/share/man/man3/Archive::Tar::File.3pm
/usr/local/share/perl5/5.36/Archive/Tar.pm
/usr/local/share/perl5/5.36/Archive/Tar/Constant.pm
/usr/local/share/perl5/5.36/Archive/Tar/File.pm
</code></pre>
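<p>If you build many packages this way, you may want to check the metadata automatically instead of reading it by eye. The sketch below turns the <code>Key : value</code> header lines of the query output into a dictionary. It assumes the layout shown above, and <code>parse_rpm_info</code> is a hypothetical helper, not something that ships with RPM or FPM.</p>

```python
def parse_rpm_info(text):
    """Parse the 'Key : value' header lines of rpm query output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        if key == "Description":
            # Free text and the file list follow; stop here.
            break
        info[key] = value.strip()
    return info


if __name__ == "__main__":
    sample = "Name        : perl-Archive-Tar\nVersion     : 3.02\n"
    print(parse_rpm_info(sample))
```

<p>Stopping at the <code>Description</code> line keeps file paths that contain colons, like the <code>Archive::Tar</code> man pages, from being mistaken for header fields.</p>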
<p>Now I'm going to show you how to package the <a target="_blank" href="https://clickhouse-driver.readthedocs.io/en/latest/installation.html#installation-from-pypi">clickhouse-driver</a> module from PyPI.</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ fpm -t rpm -s python 'clickhouse-driver'
Created package {:path=&gt;"python-clickhouse-driver-0.2.6-1.x86_64.rpm"}
</code></pre>
<p>Say that now you want to create an RPM for OpenJDK 17. No problem, get the tar file and package it with a little help:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 jdumpertools]$ curl --fail --location --remote-name https://github.com/adoptium/temurin17-binaries/releases/download/jdk-17.0.7%2B7/OpenJDK17U-jdk_x64_linux_hotspot_17.0.7_7.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  182M  100  182M    0     0  10.9M      0  0:00:16  0:00:16 --:--:-- 11.1M
[josevnz@dmaf5 jdumpertools]$ fpm -t rpm -s tar --url 'https://adoptium.net/' --description 'Eclipse Temurin is the name of the OpenJDK distribution from Adoptium' --version '17.0.7+7' --prefix /usr/local/openjdk OpenJDK17U-jdk_x64_linux_hotspot_17.0.7_7.tar.gz
[josevnz@dmaf5 jdumpertools]$ rpm -qilp OpenJDK17U-jdk_x64_linux_hotspot_17-17.0.7+7-1.x86_64.rpm
Name        : OpenJDK17U-jdk_x64_linux_hotspot_17
Version     : 17.0.7+7
Release     : 1
Architecture: x86_64
Install Date: (not installed)
Group       : default
Size        : 329508762
License     : unknown
Signature   : (none)
Source RPM  : OpenJDK17U-jdk_x64_linux_hotspot_17-17.0.7+7-1.src.rpm
Build Date  : Fri 02 Jun 2023 05:05:05 PM EDT
Build Host  : dmaf5
Relocations : /usr/local/openjdk 
Packager    : &lt;josevnz@dmaf5&gt;
Vendor      : none
URL         : https://adoptium.net/
Summary     : Eclipse Temurin is the name of the OpenJDK distribution from Adoptium
Description :
Eclipse Temurin is the name of the OpenJDK distribution from Adoptium
/usr/local/openjdk/jdk-17.0.7+7/NOTICE
/usr/local/openjdk/jdk-17.0.7+7/bin/jar
/usr/local/openjdk/jdk-17.0.7+7/bin/jarsigner
/usr/local/openjdk/jdk-17.0.7+7/bin/java
...
</code></pre>
<p>I could keep going, but I think you get the idea of how much you can do with FPM.</p>
<h2 id="heading-whats-next">What's Next?</h2>
<p>We covered some important use cases, but the tool has much more to offer:</p>
<ul>
<li><p>FPM has <a target="_blank" href="https://fpm.readthedocs.io/en/latest/cli-reference.html">many other usages</a>, including transforming existing packages from other formats to the one you want.</p>
</li>
<li><p>FPM also supports <a target="_blank" href="https://fpm.readthedocs.io/en/latest/getting-started.html">configuration files</a>. If you use it often, you should read how to use a configuration file for FPM instead of a lengthy command line.</p>
</li>
<li><p>You may also consider running FPM <a target="_blank" href="https://fpm.readthedocs.io/en/latest/docker.html">from inside a container</a>, to avoid installing dependencies.</p>
</li>
<li><p>If you are curious about how to run the jdumpertools binaries, you can take a look at the <a target="_blank" href="https://github.com/josevnz/jdumpertools">README.md</a> from the repository.</p>
</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Use TUI Applications with Click and Trogon – Linux Tutorial ]]>
                </title>
                <description>
                    <![CDATA[ Linux and terminal applications are almost synonymous. If you have used applications like grep, cat, sed, and AWK, those are command line interfaces (CLI). And when they work together, they allow you to unleash the power of your computer by mixing an... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/tui-applications-with-click-and-trogon/</link>
                <guid isPermaLink="false">66d8514de86088251dd27bc4</guid>
                
                    <category>
                        <![CDATA[ cli ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Linux ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Wed, 17 Jan 2024 17:36:53 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2024/01/mazinger_vampire.JPG" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Linux and terminal applications are almost synonymous. If you have used applications like grep, cat, sed, and AWK, those are command line interfaces (<a target="_blank" href="https://en.wikipedia.org/wiki/Command-line_interface">CLI</a>). And when they work together, they allow you to unleash the power of your computer by mixing and matching a few commands.</p>
<p>Sometimes the CLI gets too complex – and that's when you can complement it with more exploratory versions of the programs called text user interfaces (<a target="_blank" href="https://en.wikipedia.org/wiki/Text-based_user_interface">TUI</a>).</p>
<p>TUIs like HTOP, Glances, Midnight Commander, and others allow you to mix in the power of the CLI without sacrificing the ease of use.</p>
<p>So what can you do when your Python CLI has too many options and becomes intimidating? Wouldn't it be nice if you could have a way to 'self' discover the app, and then once you're familiar with it, perform your tasks quickly using the options supported by the script?</p>
<p>Python has a very <a target="_blank" href="https://github.com/josevnz/rpm_query/blob/main/Writting%20UI%20applications%20that%20can%20query%20the%20RPM%20database%20with%20Python.md">healthy ecosystem of GUI and TUI frameworks</a> that you can use to write nice-looking and intuitive applications. In this tutorial we will talk about Trogon and what you can do to make your application more friendly yet powerful for new and seasoned users alike.</p>
<p>I'll show you two of them that can help you solve the following two problems:</p>
<ol>
<li><p>Avoid becoming overwhelmed and having to use intimidating APIs when writing applications. We will use the <a target="_blank" href="https://palletsprojects.com/p/click/">Click</a> Python package to solve that problem.</p>
</li>
<li><p>Allow discoverability. This is very important when you have an application that supports many options or that you haven't used in a while. That is where <a target="_blank" href="https://github.com/Textualize/trogon">Trogon</a> comes in handy.</p>
</li>
</ol>
<p>We will reuse the source code of one of my Open Source applications, <a target="_blank" href="https://github.com/josevnz/rpm_query/tree/main">rpm_query</a> as a base. Rpm_query is a collection of simple applications that can query your system <a target="_blank" href="https://en.wikipedia.org/wiki/RPM_Package_Manager#Local_operations">RPM database</a> from the command line.</p>
<h2 id="heading-what-youll-need-for-this-tutorial">What You'll Need for This Tutorial</h2>
<ol>
<li><p>A Linux distribution, preferably one that uses RPM (like Fedora or Red Hat Enterprise Linux)</p>
</li>
<li><p>Python 3.8+</p>
</li>
<li><p>Git</p>
</li>
<li><p>Familiarity with Python virtual environments</p>
</li>
<li><p>An Internet connection so you can download dependencies, using pip.</p>
</li>
</ol>
<p>I strongly suggest that you clone the repository and create a virtual environment so you can follow the tutorial:</p>
<pre><code class="lang-shell">git clone https://github.com/josevnz/CLIWithClickAndTrogon.git
cd CLIWithClickAndTrogon
python3 -m venv ~/virtualenv/CLIWithCLickAndTrogon 
. ~/virtualenv/CLIWithCLickAndTrogon/bin/activate
</code></pre>
<p>If you're all set, let's dive in.</p>
<h2 id="heading-what-a-typical-cli-command-line-interface-looks-like-quick-refresher">What a Typical CLI (Command Line Interface) Looks Like – Quick Refresher</h2>
<p>This script uses a module inside the <a target="_blank" href="https://github.com/josevnz/CLIWithClickAndTrogon/blob/3192bed33056985421feb7dbd40cb1922ad80e6c/reporter/rpm_query.py">reporter</a> Python package to query the RPM database.</p>
<pre><code class="lang-python"><span class="hljs-comment">#!/usr/bin/env python</span>
<span class="hljs-string">"""
# rpmq_simple.py - A simple CLI to query the sizes of RPM on your system
Author: Jose Vicente Nunez
"""</span>
<span class="hljs-keyword">import</span> argparse
<span class="hljs-keyword">import</span> textwrap

<span class="hljs-keyword">from</span> reporter <span class="hljs-keyword">import</span> __is_valid_limit__
<span class="hljs-keyword">from</span> reporter.rpm_query <span class="hljs-keyword">import</span> QueryHelper

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:

    parser = argparse.ArgumentParser(description=textwrap.dedent(__doc__))
    parser.add_argument(
        <span class="hljs-string">"--limit"</span>,
        type=__is_valid_limit__,  <span class="hljs-comment"># Custom limit validator</span>
        action=<span class="hljs-string">"store"</span>,
        default=QueryHelper.MAX_NUMBER_OF_RESULTS,
        help=<span class="hljs-string">"By default results are unlimited but you can cap the results"</span>
    )
    parser.add_argument(
        <span class="hljs-string">"--name"</span>,
        type=str,
        action=<span class="hljs-string">"store"</span>,
        help=<span class="hljs-string">"You can filter by a package name."</span>
    )
    parser.add_argument(
        <span class="hljs-string">"--sort"</span>,
        action=<span class="hljs-string">"store_false"</span>,
        help=<span class="hljs-string">"Sorted results are enabled bu default, but you fan turn it off"</span>
    )
    args = parser.parse_args()

    <span class="hljs-keyword">with</span> QueryHelper(
        name=args.name,
        limit=args.limit,
        sorted_val=args.sort
    ) <span class="hljs-keyword">as</span> rpm_query:
        <span class="hljs-keyword">for</span> package <span class="hljs-keyword">in</span> rpm_query:
            print(<span class="hljs-string">f"<span class="hljs-subst">{package[<span class="hljs-string">'name'</span>]}</span>-<span class="hljs-subst">{package[<span class="hljs-string">'version'</span>]}</span>: <span class="hljs-subst">{package[<span class="hljs-string">'size'</span>]:,<span class="hljs-number">.0</span>f}</span>"</span>)
</code></pre>
<p>Let's install it, in editable mode:</p>
<pre><code class="lang-shell">. ~/virtualenv/CLIWithCLickAndTrogon/bin/activate
pip install --editable .
</code></pre>
<p>And see it in action:</p>
<pre><code class="lang-shell">(CLIWithClickAndTrogon) [josevnz@dmaf5 CLIWithClickAndTrogon]$ rpmq_simple.py --help
usage: rpmq_simple.py [-h] [--limit LIMIT] [--name NAME] [--sort]

# rpmq_simple.py - A simple CLI to query the sizes of RPM on your system Author: Jose Vicente Nunez

options:
  -h, --help     show this help message and exit
  --limit LIMIT  By default results are unlimited but you can cap the results
  --name NAME    You can filter by a package name.
  --sort         Sorted results are enabled bu default, but you fan turn it off
(CLIWithClickAndTrogon) [josevnz@dmaf5 CLIWithClickAndTrogon]$ rpmq_simple.py --name kernel --limit 5
kernel-6.2.11: 0
kernel-6.2.14: 0
kernel-6.2.15: 0
</code></pre>
<p>So it seems that most of the code in the <a target="_blank" href="https://github.com/josevnz/CLIWithClickAndTrogon/blob/main/scripts/rpmq_simple.py">rpmq_simple.py</a> script is boilerplate for the command line interface, using the standard '<a target="_blank" href="https://docs.python.org/3/library/argparse.html">ArgParse</a>' library.</p>
<p>ArgParse is <a target="_blank" href="https://docs.python.org/3/howto/argparse.html#argparse-tutorial">powerful</a>, but it is also intimidating at first, especially when you have to support multiple use cases.</p>
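<p>The script above passes a custom validator to <code>type=</code>. The real <code>__is_valid_limit__</code> lives in the reporter package and is not shown here, so the sketch below uses a hypothetical stand-in called <code>valid_limit</code> (the name, the accepted range, and the default of 0 are all assumptions) to illustrate how such a validator plugs into ArgParse, and how <code>store_false</code> behaves for <code>--sort</code>:</p>

```python
import argparse


def valid_limit(value: str) -> int:
    """Hypothetical validator: accept only integers that are zero or greater."""
    try:
        limit = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"'{value}' is not an integer")
    if limit < 0:
        raise argparse.ArgumentTypeError(f"'{value}' must be zero or greater")
    return limit


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Validator demo")
    # argparse calls valid_limit() with the raw string from the command line
    parser.add_argument("--limit", type=valid_limit, default=0)
    # store_false: args.sort is True unless --sort is passed
    parser.add_argument("--sort", action="store_false")
    return parser


args = build_parser().parse_args(["--limit", "5"])
print(args.limit, args.sort)
```

<p>When the validator raises <code>ArgumentTypeError</code>, ArgParse prints the message along with the usage line and exits, so the rest of the script never sees an invalid value.</p>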
<h2 id="heading-a-new-way-to-process-the-cli-with-click">A New Way to Process the CLI with Click</h2>
<p>The Click framework promises to make it easier to parse out command line arguments. To demonstrate that, let's convert our script from ArgParse to <a target="_blank" href="https://click.palletsprojects.com/en/8.1.x/">Click</a> (they both provide support for options but Click has a few interesting options we will use):</p>
<pre><code class="lang-python"><span class="hljs-comment">#!/usr/bin/env python</span>
<span class="hljs-string">"""
# rpmq_click.py - A simple CLI to query the sizes of RPM on your system
Author: Jose Vicente Nunez
"""</span>
<span class="hljs-keyword">import</span> click

<span class="hljs-keyword">from</span> reporter.rpm_query <span class="hljs-keyword">import</span> QueryHelper


<span class="hljs-meta">@click.command()</span>
<span class="hljs-meta">@click.option('--limit', default=QueryHelper.MAX_NUMBER_OF_RESULTS,</span>
              help=<span class="hljs-string">"By default results are unlimited but you can cap the results"</span>)
<span class="hljs-meta">@click.option('--name', help="You can filter by a package name.")</span>
<span class="hljs-meta">@click.option('--sort', default=True, help="Sorted results are enabled bu default, but you fan turn it off")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">command</span>(<span class="hljs-params">
        name: str,
        limit: int,
        sort: bool
</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-keyword">with</span> QueryHelper(
            name=name,
            limit=limit,
            sorted_val=sort
    ) <span class="hljs-keyword">as</span> rpm_query:
        <span class="hljs-keyword">for</span> package <span class="hljs-keyword">in</span> rpm_query:
            click.echo(<span class="hljs-string">f"<span class="hljs-subst">{package[<span class="hljs-string">'name'</span>]}</span>-<span class="hljs-subst">{package[<span class="hljs-string">'version'</span>]}</span>: <span class="hljs-subst">{package[<span class="hljs-string">'size'</span>]:,<span class="hljs-number">.0</span>f}</span>"</span>)


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    command()
</code></pre>
<p>You will notice two big changes here:</p>
<ol>
<li><p>Most of the boilerplate code from ArgParse is gone, replaced by decorators.</p>
</li>
<li><p>Click works by adding decorators to a new function called 'command', that takes arguments and executes the RPM query.</p>
</li>
</ol>
<p>If you run the new script you will see that it produces the same results as before:</p>
<pre><code class="lang-shell">(CLIWithClickAndTrogon) [josevnz@dmaf5 CLIWithClickAndTrogon]$ rpmq_click.py --help
Usage: rpmq_click.py [OPTIONS]

Options:
  --limit INTEGER  By default results are unlimited but you can cap the
                   results
  --name TEXT      You can filter by a package name.
  --sort BOOLEAN   Sorted results are enabled bu default, but you fan turn it
                   off
  --help           Show this message and exit.
(CLIWithClickAndTrogon) [josevnz@dmaf5 CLIWithClickAndTrogon]$ rpmq_click.py --name kernel --limit 5
kernel-6.2.11: 0
kernel-6.2.14: 0
kernel-6.2.15: 0
</code></pre>
<p>So what did we gain? Our code is slightly simpler, and it is now supported by Trogon, a new framework we will discuss soon.</p>
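<p>One difference from the ArgParse version is visible in the help output: <code>--sort</code> now expects a BOOLEAN value instead of acting as a switch. Click also supports on/off flag pairs. The following is a minimal sketch, not the code from the repository: the <code>demo</code> command, its default limit of 0, and the echoed text are assumptions, and it prints its arguments instead of calling <code>QueryHelper</code>:</p>

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("--limit", type=click.IntRange(min=0), default=0,
              help="Cap the number of results")
@click.option("--sort/--no-sort", default=True,
              help="Sorted results are enabled by default")
def demo(limit: int, sort: bool) -> None:
    """Hypothetical command that only echoes the parsed options."""
    click.echo(f"limit={limit} sort={sort}")


# CliRunner invokes the command in-process, which is handy for quick checks
result = CliRunner().invoke(demo, ["--no-sort", "--limit", "5"])
print(result.output, end="")
```

<p>With a flag pair, users type <code>--no-sort</code> rather than <code>--sort false</code>, and <code>click.IntRange</code> rejects negative limits before the function body runs.</p>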
<h2 id="heading-how-to-use-setuptools-and-click">How to Use setuptools and Click</h2>
<p>The Click <a target="_blank" href="https://click.palletsprojects.com/en/8.1.x/setuptools/#setuptools-integration">documentation</a> recommends using <a target="_blank" href="https://setuptools.pypa.io/en/latest/setuptools.html">setuptools</a> to create a wrapper for our tool automatically. We define a function that handles all the command line options and logic, and the wrapper creates a regular script for us in the right place during the package installation. It also points to the right version of Python, among other nice things.</p>
<p>The documentation shows the deprecated setup.py syntax, so we will use the more recent setup.cfg format instead:</p>
<pre><code class="lang-python">[metadata]
name = CLIWithClickAndTrogon
version = <span class="hljs-number">0.0</span><span class="hljs-number">.1</span>
author = Jose Vicente Nunez Zuleta
author-email = kodegeek.com@protonmail.com
license = Apache <span class="hljs-number">2.0</span>
summary = Simple TUI that queries the RPM database
home-page = https://github.com/josevnz/cliwithclickandtrogon
description = Simple TUI that queries the RPM database. A tutorial.
long_description = file: README.md
long_description_content_type = text/markdown

[options]
packages = reporter
setup_requires =
    setuptools
    wheel
    build
    pip
    twine
install_requires =
    importlib; python_version == <span class="hljs-string">"3.9"</span>
    click
scripts =
    scripts/rpmq_simple.py
    scripts/rpmq_click.py
[options.entry_points]
console_scripts =
    rpmq = reporter.scripts:command
</code></pre>
<p>I created a package called '<a target="_blank" href="https://github.com/josevnz/CLIWithClickAndTrogon/tree/main/scripts">scripts</a>' inside the package called 'reporter' with the CLI logic using click.</p>
<p><a target="_blank" href="https://setuptools.pypa.io/en/latest/userguide/entry_point.html">setuptools will generate a script called 'rpmq'</a> for us that behaves exactly as the previous script does – but again, we don't need any boilerplate code to pass arguments to Click:</p>
<pre><code class="lang-shell">CLIWithClickAndTrogon) [josevnz@dmaf5 CLIWithClickAndTrogon]$ pip install --editable .
(CLIWithClickAndTrogon) [josevnz@dmaf5 CLIWithClickAndTrogon]$ rpmq --help
Usage: rpmq [OPTIONS]

Options:
  --limit INTEGER  By default results are unlimited but you can cap the
                   results
  --name TEXT      You can filter by a package name.
  --sort BOOLEAN   Sorted results are enabled bu default, but you fan turn it
                   off
  --help           Show this message and exit.
(CLIWithClickAndTrogon) [josevnz@dmaf5 CLIWithClickAndTrogon]$ rpmq --name kernel --limit 5
kernel-6.2.11: 0
kernel-6.2.14: 0
kernel-6.2.15: 0
</code></pre>
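<p>To see what a <code>name = module:function</code> entry point means, the standard library can resolve one by hand. This is only a sketch of the lookup, not the wrapper that setuptools generates: <code>json:dumps</code> is a stand-in target (an assumption, chosen so the snippet runs without the reporter package installed), where the real entry would be <code>reporter.scripts:command</code>:</p>

```python
from importlib.metadata import EntryPoint

# Same "name = module:function" shape as the console_scripts line in setup.cfg.
# json:dumps is a stand-in for reporter.scripts:command.
entry = EntryPoint(name="demo", value="json:dumps", group="console_scripts")

# load() imports the module on the left of the colon and returns the
# attribute named on the right
func = entry.load()
print(func({"limit": 5}))
```

<p>The generated <code>rpmq</code> script does roughly the same thing: it imports the function named in the entry point and calls it.</p>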
<h2 id="heading-how-to-make-your-cli-discoverable-with-trogon">How to Make Your CLI Discoverable with Trogon</h2>
<p>Let's solve the problem of making your CLI discoverable with Trogon. Besides adding the new <code>trogon</code> library as part of the requirements (<a target="_blank" href="https://github.com/josevnz/CLIWithClickAndTrogon/blob/main/requirements.txt">requirements.txt</a> and <a target="_blank" href="https://github.com/josevnz/CLIWithClickAndTrogon/blob/main/setup.cfg">setup.cfg</a>), we need to add a new decorator to our CLI:</p>
<pre><code class="lang-python"><span class="hljs-comment">#!/usr/bin/env python</span>
<span class="hljs-string">"""
A simple CLI to query the sizes of RPM on your system
Author: Jose Vicente Nunez
"""</span>
<span class="hljs-keyword">import</span> click
<span class="hljs-keyword">from</span> trogon <span class="hljs-keyword">import</span> tui

<span class="hljs-keyword">from</span> reporter.rpm_query <span class="hljs-keyword">import</span> QueryHelper

<span class="hljs-meta">@tui()</span>
<span class="hljs-meta">@click.command()</span>
<span class="hljs-meta">@click.option('--limit', default=QueryHelper.MAX_NUMBER_OF_RESULTS,</span>
              help=<span class="hljs-string">"By default results are unlimited but you can cap the results"</span>)
<span class="hljs-meta">@click.option('--name', help="You can filter by a package name.")</span>
<span class="hljs-meta">@click.option('--sort', default=True, help="Sorted results are enabled bu default, but you fan turn it off")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">command</span>(<span class="hljs-params">
        name: str,
        limit: int,
        sort: bool
</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-keyword">with</span> QueryHelper(
            name=name,
            limit=limit,
            sorted_val=sort
    ) <span class="hljs-keyword">as</span> rpm_query:
        <span class="hljs-keyword">for</span> package <span class="hljs-keyword">in</span> rpm_query:
            click.echo(<span class="hljs-string">f"<span class="hljs-subst">{package[<span class="hljs-string">'name'</span>]}</span>-<span class="hljs-subst">{package[<span class="hljs-string">'version'</span>]}</span>: <span class="hljs-subst">{package[<span class="hljs-string">'size'</span>]:,<span class="hljs-number">.0</span>f}</span>"</span>)


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    command()
</code></pre>
<p>All it took was one decorator, <code>@tui()</code>, and a new import.</p>
<p>Time to see it in action:</p>
<pre><code class="lang-shell">(CLIWithClickAndTrogon) [josevnz@dmaf5 CLIWithClickAndTrogon]$ rpmq_trogon.py --help
Usage: rpmq_trogon.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  command
  tui      Open Textual TUI.
</code></pre>
<p>The help output looks different now. You'll notice two changes:</p>
<ol>
<li><p>If you want to use the CLI options, you need to put 'command' before the switches.</p>
</li>
<li><p>There is a new <code>tui</code> command.</p>
</li>
</ol>
<p>Wait a second... what happened to the other flags? No worries, if you ask for more help for 'command', you will see them there:</p>
<pre><code class="lang-shell">(CLIWithClickAndTrogon) [josevnz@dmaf5 CLIWithClickAndTrogon]$ rpmq_trogon.py command --help
Usage: rpmq_trogon.py command [OPTIONS]

Options:
  --limit INTEGER  By default results are unlimited but you can cap the
                   results
  --name TEXT      You can filter by a package name.
  --sort BOOLEAN   Sorted results are enabled bu default, but you fan turn it
                   off
  --help           Show this message and exit.
</code></pre>
<p>Ah, much better. Let's run the CLI similar to the way we did before:</p>
<pre><code class="lang-shell">(CLIWithClickAndTrogon) [josevnz@dmaf5 CLIWithClickAndTrogon]$ rpmq_trogon.py command --limit 5 --name kernel
kernel-6.2.11: 0
kernel-6.2.14: 0
kernel-6.2.15: 0
</code></pre>
<p>And what about support for setuptools? Just add the import and the decorator to the 'command' function:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> click
<span class="hljs-keyword">from</span> trogon <span class="hljs-keyword">import</span> tui

<span class="hljs-keyword">from</span> reporter.rpm_query <span class="hljs-keyword">import</span> QueryHelper
<span class="hljs-meta">@tui()</span>
<span class="hljs-meta">@click.command()</span>
<span class="hljs-meta">@click.option('--limit', default=QueryHelper.MAX_NUMBER_OF_RESULTS,</span>
              help=<span class="hljs-string">"By default results are unlimited but you can cap the results"</span>)
<span class="hljs-meta">@click.option('--name', help="You can filter by a package name.")</span>
<span class="hljs-meta">@click.option('--sort', default=True, help="Sorted results are enabled bu default, but you fan turn it off")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">command</span>(<span class="hljs-params">
        name: str,
        limit: int,
        sort: bool
</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-comment"># .... real code goes here</span>
    <span class="hljs-keyword">pass</span>
</code></pre>
<p>Now let me demonstrate how discoverability works in TUI mode:</p>
<p><a target="_blank" href="https://asciinema.org/a/590897"><img src="https://asciinema.org/a/590897.svg" alt="asciicast" width="600" height="400" loading="lazy"></a></p>
<p>Nice! We got a TUI where some options are automatically populated for us. This gives us a clear idea of how to use the programs without knowing too much about them.</p>
<h2 id="heading-whats-next">What's Next</h2>
<ol>
<li><p>Download the <a target="_blank" href="https://github.com/josevnz/CLIWithClickAndTrogon">source code</a> for this tutorial and start experimenting.</p>
</li>
<li><p>Both <a target="_blank" href="https://click.palletsprojects.com/en/8.1.x/">Click</a> and <a target="_blank" href="https://discord.com/invite/Enf6Z3qhVr">Trogon</a> have great documentation and online support. Take advantage of them.</p>
</li>
<li><p>Click has many more complex examples; feel free to <a target="_blank" href="https://github.com/pallets/click/tree/main/examples">check out their gallery</a>.</p>
</li>
</ol>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Network File System – How to Confirm Your Application is Using NFS ]]>
                </title>
                <description>
                    <![CDATA[ I was tasked recently to find which of our processes was accessing an NFS share. During this process, I found that some tools are better adapted than others for the task. In this article, I want to share with you my findings. The whole process was fu... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-confirm-your-application-is-using-nfs/</link>
                <guid isPermaLink="false">66d8514139c4dccc43d4d4b6</guid>
                
                    <category>
                        <![CDATA[ computer networking ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Mon, 18 Sep 2023 06:54:37 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1725458376947/86a9c2e5-a07d-4802-82e4-dd8c28aebbc5.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>I was tasked recently to find which of our processes was accessing an NFS share. During this process, I found that some tools are better adapted than others for the task.</p>
<p>In this article, I want to share with you my findings. The whole process was fun and gave me ideas on how to use these tools to tackle similar problems in the future.</p>
<h2 id="heading-what-is-nfs">What is NFS?</h2>
<p>Network File System (NFS) is a distributed file system protocol that allows a user to access files over a computer network.</p>
<p>Please note that this is not a full tutorial on NFS. For that, please take a look at the following <a target="_blank" href="https://www.redhat.com/sysadmin/getting-started-nfs">tutorial</a>.</p>
<p>In this article, we will focus only on detecting access to a shared drive using several techniques, as well as on setting up two servers and one client.</p>
<p>Also, I use a different OS on the servers and the client, so the setup instructions differ a little bit.</p>
<h2 id="heading-how-to-set-up-a-nfs-server-and-client">How to Set Up an NFS Server and Client</h2>
<p>My lab setup has two NFS servers and one client:</p>
<p><img src="https://github.com/josevnz/tutorials/blob/main/docs/SpyOnNfs/NfsLayout.png?raw=true" alt="NfsLayout" width="600" height="400" loading="lazy"></p>
<p>On my setup, I will have three computers talking to each other. Two of them will export NFS shares and the third one will be the client.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Machine</td><td>OS</td><td>Hardware</td><td>Mode</td></tr>
</thead>
<tbody>
<tr>
<td>OrangePi5</td><td>Ubuntu Armbian 23.8.1 jammy</td><td>Orange Pi 5</td><td>Server:/data</td></tr>
<tr>
<td>RaspberryPi</td><td>Ubuntu 20.04.4 LTS (Focal Fossa)</td><td>Raspberry Pi 4 Model B Rev 1.4</td><td>Server:/var/log/suricata</td></tr>
<tr>
<td>Dmaf5</td><td>Fedora 37 (Workstation Edition)</td><td>AMD Ryzen 5 3550H with Radeon Vega Mobile Gfx</td><td>Client</td></tr>
</tbody>
</table>
</div><h3 id="heading-how-to-configure-the-server">How to Configure the Server</h3>
<p>I will prepare my OrangePi machine to be the NFS server. To do so, I will run the following commands:</p>
<pre><code class="lang-shell">sudo apt-get update
sudo apt-get upgrade
sudo apt-get install nfs-kernel-server -y
sudo systemctl enable nfs-kernel-server.service --now
</code></pre>
<p>The next step is to tell the <a target="_blank" href="https://ubuntu.com/server/docs/service-nfs">server what we want to share</a>.</p>
<p>For that, we will edit the <a target="_blank" href="https://www.man7.org/linux/man-pages/man5/exports.5.html">/etc/exports</a> file (<code>sudo vi /etc/exports</code>):</p>
<pre><code class="lang-text">/data *(ro,all_squash,async,no_subtree_check)
</code></pre>
<p>Please check the man page to understand what these options mean.</p>
<p>In a nutshell, the <code>/data</code> export uses these options:</p>
<ul>
<li><p><code>ro</code>: the share is read-only.</p>
</li>
<li><p><code>all_squash</code>: maps all user and group IDs to the anonymous ID.</p>
</li>
<li><p><code>async</code>: allows the NFS server to violate the NFS protocol and reply to requests before any changes made by that request have been committed to stable storage.</p>
</li>
<li><p><code>no_subtree_check</code>: disables subtree checking. It's the default.</p>
</li>
</ul>
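<p>To make the structure of an exports entry explicit, here is a small sketch that splits the line used above into its path, client specification, and option list. The <code>Export</code> type and <code>parse_export</code> function are hypothetical helpers written for this illustration, and the parser assumes a single client per line, which real exports files do not require:</p>

```python
import re
from typing import List, NamedTuple


class Export(NamedTuple):
    path: str
    client: str
    options: List[str]


# Simplified pattern: "<path> <client>(<opt>,<opt>,...)", one client only
EXPORT_RE = re.compile(
    r"^(?P<path>\S+)\s+(?P<client>[^\s(]+)\((?P<options>[^)]*)\)$"
)


def parse_export(line: str) -> Export:
    """Split one simplified /etc/exports line into its three parts."""
    match = EXPORT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"Unsupported exports line: {line!r}")
    options = [opt for opt in match.group("options").split(",") if opt]
    return Export(match.group("path"), match.group("client"), options)


export = parse_export("/data *(ro,all_squash,async,no_subtree_check)")
print(export.path, export.client, export.options)
```

<p>The <code>*</code> client means any host may mount the share, which is why the options in parentheses matter so much here.</p>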
<p>Now it is time to activate our shared directories:</p>
<pre><code class="lang-shell">root@orangepi5:~# sudo exportfs -a
root@orangepi5:~# sudo showmount -e
Export list for orangepi5:
/data (everyone)
</code></pre>
<p>I did something similar on the other host, raspberrypi:</p>
<pre><code class="lang-shell">root@raspberrypi:~# cat /etc/exports
# /etc/exports: the access control list for filesystems which may be exported
#        to NFS clients.  See exports(5).
#
/var/log/suricata *(ro,all_squash,async,no_subtree_check)
root@raspberrypi:~# showmount -e
Export list for raspberrypi:
/var/log/suricata *
</code></pre>
<h3 id="heading-how-to-configure-the-client">How to Configure the Client</h3>
<p>The first thing is to confirm we can indeed see the shared mount points from our server:</p>
<pre><code class="lang-shell">(tutorials) [josevnz@dmaf5 SpyOnNfs]$ sudo showmount -e orangepi5
Export list for orangepi5:
/data raspberrypi,dmaf5
</code></pre>
<p>Data is shared with two machines – just what we expected.</p>
<p>Now, there are several ways to mount this drive. One of them is manually, another one is at startup, and the last one, my preferred one, is on demand.</p>
<h4 id="heading-how-to-set-up-the-automount-client-on-fedora-linux">How to Set Up the AutoMount Client on Fedora Linux</h4>
<p>First we <a target="_blank" href="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/4/html/system_administration_guide/mounting_nfs_file_systems-mounting_nfs_file_systems_using_autofs">install and enable the service</a>:</p>
<pre><code class="lang-shell">sudo dnf install -y autofs
sudo systemctl enable autofs.service --now
</code></pre>
<p>Then we set this up so that the remote <code>/data</code> ends up mounted on the local <code>/misc/data</code>. For that, add the following lines to your <code>/etc/auto.misc</code>:</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# vi /etc/auto.misc
# After editing the file, adding our entry to the last line of the file ...
[root@dmaf5 ~]# cat /etc/auto.misc
#
# This is an automounter map and it has the following format
# key [ -mount-options-separated-by-comma ] location
# Details may be found in the autofs(5) manpage

cd              -fstype=iso9660,ro,nosuid,nodev :/dev/cdrom

data            -ro,soft,rsize=16384,wsize=16384 orangepi5:/data
suricata        -ro,soft,rsize=16384,wsize=16384 raspberrypi:/var/log/suricata
</code></pre>
<p>Restart the service one more time:</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# systemctl restart autofs.service
</code></pre>
<p>And the smoke test:</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# ls -l /misc/data
total 0
drwxrwxr-x. 1 root 1001 48 Apr  7 17:57 nexus
[root@dmaf5 ~]# ls /misc/suricata
certs       eve.json.7  files            http.log    stats.log.1     suricata.log.2        suricata-start.log.3  tls.log.4
core        fast.log    http-data.log    http.log.1  stats.log.2     suricata.log.3        suricata-start.log.4  tls.log.5
eve.json    fast.log.1  http-data.log.1  http.log.2  stats.log.3     suricata.log.4        suricata-start.log.5  tls.log.6
eve.json.1  fast.log.2  http-data.log.2  http.log.3  stats.log.4     suricata.log.5        suricata-start.log.6  tls.log.7
eve.json.2  fast.log.3  http-data.log.3  http.log.4  stats.log.5     suricata.log.6        suricata-start.log.7
eve.json.3  fast.log.4  http-data.log.4  http.log.5  stats.log.6     suricata.log.7        tls.log
eve.json.4  fast.log.5  http-data.log.5  http.log.6  stats.log.7     suricata-start.log    tls.log.1
eve.json.5  fast.log.6  http-data.log.6  http.log.7  suricata.log    suricata-start.log.1  tls.log.2
eve.json.6  fast.log.7  http-data.log.7  stats.log   suricata.log.1  suricata-start.log.2  tls.log.3
</code></pre>
<p>Now we are ready to play with our service.</p>
<h2 id="heading-how-to-create-a-python-program-that-reads-files-into-the-nfs-server">How to Create a Python Program that Reads Files from the NFS Server</h2>
<p>For our example, we want to determine if a Python application is reading data from this directory. This script has two features:</p>
<ul>
<li><p>Performs a one-time read of a file. This will teach us how to capture this type of scenario, where a file is not kept open all the time.</p>
</li>
<li><p>And the script also follows updates on a file periodically.</p>
</li>
</ul>
<p>Here is what our test script looks like in action:</p>
<pre><code class="lang-shell">./scripts/test_script.py \
--quick_read /misc/data/nexus/log/jvm.log \
--follow /misc/suricata/eve.json \
--verbose
...
2023-09-10 14:48:22,889 &lt;dependency_failed type='leaf_type' ctxk='java/io/FileOutputStream' witness='java/net/SocketOutputStream' stamp='66511.794'/&gt;
2023-09-10 14:48:22,889 &lt;dependency_failed type='leaf_type' ctxk='java/io/FileOutputStream' witness='java/net/SocketOutputStream' stamp='66511.794'/&gt;
2023-09-10 14:48:22,889 &lt;dependency_failed type='leaf_type' ctxk='java/io/FileOutputStream' witness='java/net/SocketOutputStream' stamp='66511.794'/&gt;
2023-09-10 14:48:22,889 &lt;dependency_failed type='leaf_type' ctxk='java/io/FileOutputStream' witness='java/net/SocketOutputStream' stamp='66511.794'/&gt;
2023-09-10 14:48:22,889 &lt;dependency_failed type='leaf_type' ctxk='java/io/FileOutputStream' witness='java/net/SocketOutputStream' stamp='66511.794'/&gt;
2023-09-10 14:48:22,889 &lt;dependency_failed type='leaf_type' ctxk='java/io/FileOutputStream' witness='java/net/SocketOutputStream' stamp='66511.794'/&gt;
2023-09-10 14:48:22,889 &lt;dependency_failed type='leaf_type' ctxk='java/io/FileOutputStream' witness='java/net/SocketOutputStream' stamp='66511.794'/&gt;
2023-09-10 14:48:22,890 &lt;dependency_failed type='leaf_type' ctxk='java/io/FileOutputStream' witness='java/net/SocketOutputStream' stamp='66511.794'/&gt;
2023-09-10 14:48:22,890 &lt;dependency_failed type='unique_concrete_method' ctxk='java/io/ByteArrayOutputStream' x='java/io/ByteArrayOutputStream write ([BII)V' witness='sun/security/ssl/HandshakeOutStream' stamp='66511.855'/&gt;
2023-09-10 14:48:22,890 &lt;dependency_failed type='unique_concrete_method' ctxk='java/io/ByteArrayOutputStream' x='java/io/ByteArrayOutputStream write ([BII)V' witness='sun/security/ssl/HandshakeOutStream' stamp='66511.855'/&gt;
...
# Ctrl-C to exit
</code></pre>
<p>The code, written in Python, is pretty simple:</p>
<pre><code class="lang-python"><span class="hljs-comment">#!/usr/bin/env python</span>
<span class="hljs-string">"""
Simple script to simulate light activity on NFS drives
Author Jose Vicente Nunez (kodegeek.com@protonmail.com)
"""</span>
<span class="hljs-keyword">import</span> concurrent
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> time
<span class="hljs-keyword">from</span> concurrent.futures <span class="hljs-keyword">import</span> ThreadPoolExecutor, ALL_COMPLETED
<span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
<span class="hljs-keyword">from</span> argparse <span class="hljs-keyword">import</span> ArgumentParser
<span class="hljs-keyword">import</span> logging

logging.basicConfig(format=<span class="hljs-string">'%(asctime)s %(message)s'</span>, encoding=<span class="hljs-string">'utf-8'</span>, level=logging.DEBUG)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">forever_read</span>(<span class="hljs-params">the_file: Path, verbose: bool = False</span>):</span>
    <span class="hljs-keyword">for</span> line <span class="hljs-keyword">in</span> continuous_read(the_file=the_file):
        <span class="hljs-keyword">if</span> verbose:
            logging.warning(line.strip())


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">continuous_read</span>(<span class="hljs-params">the_file: Path</span>):</span>
    <span class="hljs-string">"""
    Continuously read the contents of file
    :param the_file:
    :return:
    """</span>
    <span class="hljs-keyword">with</span> open(the_file, <span class="hljs-string">'r'</span>) <span class="hljs-keyword">as</span> file_data:
        file_data.seek(<span class="hljs-number">0</span>, os.SEEK_END)
        <span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
            line = file_data.readline()
            <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> line:
                time.sleep(<span class="hljs-number">0.1</span>)
                <span class="hljs-keyword">continue</span>
            <span class="hljs-keyword">yield</span> line


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">quick_read</span>(<span class="hljs-params">the_file: Path, verbose: bool = False</span>):</span>
    <span class="hljs-string">"""
    Read the whole file and close it once done
    :param verbose:
    :param the_file:
    :return:
    """</span>
    <span class="hljs-keyword">with</span> open(the_file, <span class="hljs-string">'r'</span>) <span class="hljs-keyword">as</span> file_data:
        <span class="hljs-keyword">for</span> line <span class="hljs-keyword">in</span> file_data:
            <span class="hljs-keyword">if</span> verbose:
                logging.warning(line.strip())


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    PARSER = ArgumentParser(description=__doc__)
    PARSER.add_argument(
        <span class="hljs-string">'--verbose'</span>,
        action=<span class="hljs-string">'store_true'</span>,
        default=<span class="hljs-literal">False</span>,
        help=<span class="hljs-string">'Enable verbose mode'</span>
    )
    PARSER.add_argument(
        <span class="hljs-string">'--quick_read'</span>,
        type=Path,
        required=<span class="hljs-literal">True</span>,
        help=<span class="hljs-string">'Read a file once'</span>
    )
    PARSER.add_argument(
        <span class="hljs-string">'--follow'</span>,
        type=Path,
        required=<span class="hljs-literal">True</span>,
        help=<span class="hljs-string">'Read a file continuously'</span>
    )
    OPTIONS = PARSER.parse_args()
    <span class="hljs-keyword">try</span>:
        <span class="hljs-keyword">with</span> ThreadPoolExecutor(max_workers=<span class="hljs-number">3</span>) <span class="hljs-keyword">as</span> tpe:
            futures = [
                tpe.submit(forever_read, OPTIONS.follow, OPTIONS.verbose),
                tpe.submit(quick_read, OPTIONS.quick_read, OPTIONS.verbose)
            ]
            concurrent.futures.wait(futures, return_when=ALL_COMPLETED)
    <span class="hljs-keyword">except</span> KeyboardInterrupt:
        <span class="hljs-keyword">pass</span>
</code></pre>
<p>Now, let's go over how we can see if our script is indeed accessing an NFS partition.</p>
<h3 id="heading-common-steps">Common steps</h3>
<p>First we need to learn where to look. On the client machine, check for NFS entries in <code>/etc/fstab</code> (for mount points that are set up when the machine boots):</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# rg -e 'rsize=' /etc/fstab
</code></pre>
<p>Then check the automount (autofs) map files:</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# rg -e 'rsize=' /etc/auto*
/etc/auto.misc
17:data            -ro,soft,rsize=16384,wsize=16384 orangepi5:/data
18:suricata        -ro,soft,rsize=16384,wsize=16384 raspberrypi:/var/log/suricata
</code></pre>
<p>These regular expressions are not an exact science, but they give you an idea of what to look for next.</p>
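<p>If you prefer to ask the kernel what is mounted right now, instead of grepping configuration files, here is a minimal sketch in Python. It assumes the text it receives follows the <code>/proc/mounts</code> layout; the function name and the sample data are mine, made up for illustration and modeled on the mounts shown above:</p>
<pre><code class="lang-python"># Sample text in /proc/mounts layout (illustrative, not captured from a real machine)
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/etc/auto.misc /misc autofs rw,relatime 0 0
orangepi5:/data /misc/data nfs4 ro,relatime,vers=4.2,rsize=16384,wsize=16384,soft 0 0
"""


def nfs_mounts(mount_table):
    """Return (source, mount_point, fs_type) for every NFS entry in a mount table.

    Each line is expected to have six fields:
    source, mount point, filesystem type, options, dump, pass.
    """
    found = []
    for line in mount_table.splitlines():
        fields = line.split()
        # Skip malformed lines and keep only NFS client filesystems
        if len(fields) == 6 and fields[2] in ("nfs", "nfs4"):
            found.append((fields[0], fields[1], fields[2]))
    return found


if __name__ == "__main__":
    for source, mount_point, fs_type in nfs_mounts(SAMPLE):
        print(f"{fs_type}: {source} on {mount_point}")
</code></pre>
<p>On a real machine you would pass the contents of <code>/proc/mounts</code> instead of the sample. Keep in mind that an autofs mount only shows up there while it is actually mounted.</p>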
<h3 id="heading-how-to-use-the-tools">How to Use the Tools</h3>
<p>We need to confirm if there was access to any of the following partitions mounted over NFS:</p>
<ul>
<li><p><code>/misc/data</code></p>
</li>
<li><p><code>/misc/suricata</code></p>
</li>
</ul>
<p>Next, I will show you a set of tools that make the task easier, each with its own strengths and limitations.</p>
<p>Let's start with <a target="_blank" href="https://www.redhat.com/sysadmin/analyze-processes-lsof">lsof</a> and <a target="_blank" href="https://github.com/BurntSushi/ripgrep">ripgrep</a> combined.</p>
<h3 id="heading-how-to-use-lsof-and-rg-for-capturing-and-filtering">How to Use lsof and rg for Capturing and Filtering</h3>
<pre><code class="lang-shell">[josevnz@dmaf5 docs]$ lsof -w -b| rg -e '/misc/data|/misc/suricata'
python    36509                 josevnz    3   unknown                           /misc/suricata/eve.json
python    36509 36510 python    josevnz    3   unknown                           /misc/suricata/eve.json
python    36509 36511 python    josevnz    3   unknown                           /misc/suricata/eve.json
</code></pre>
<p>I passed the <code>-b</code> option to lsof to keep it from getting stuck in case the <a target="_blank" href="https://access.redhat.com/solutions/2674">NFS handle is stale</a>.</p>
<p>A few things about lsof:</p>
<ul>
<li><p>If you are using Autofs, you should know that mount points eventually get unmounted after being idle, to save resources. This can be problematic when trying to catch the access of a file that is only opened once.</p>
</li>
<li><p>The short-lived read didn't show up because the file handle had already been closed by the time we inspected the process.</p>
</li>
<li><p>If you want to monitor ALL the processes on this machine, you may need to run as root. You can only inspect your own processes without special privileges.</p>
</li>
</ul>
<p>Still, lsof is a great tool to investigate.</p>
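<p>To get a feeling for what lsof does under the hood, here is a rough sketch in Python that walks the file descriptor links under <code>/proc</code>. The function name and the <code>proc_root</code> parameter are my own, for illustration only, and this is not how lsof itself is implemented:</p>
<pre><code class="lang-python">import os
from pathlib import Path


def open_files_under(prefixes, proc_root="/proc"):
    """Return sorted (pid, path) pairs for open files located under any prefix."""
    # Add a trailing slash so /misc/data does not match /misc/database
    wanted = tuple(prefix.rstrip("/") + "/" for prefix in prefixes)
    matches = set()
    for pid_dir in Path(proc_root).iterdir():
        if not pid_dir.name.isdigit():
            continue
        try:
            fd_links = list((pid_dir / "fd").iterdir())
        except OSError:
            # The process is gone, or it belongs to another user
            continue
        for fd_link in fd_links:
            try:
                target = os.readlink(fd_link)
            except OSError:
                # The descriptor was closed while we were looking
                continue
            if target.startswith(wanted):
                matches.add((int(pid_dir.name), target))
    return sorted(matches)
</code></pre>
<p>Calling <code>open_files_under(["/misc/data", "/misc/suricata"])</code> gives a snapshot, just like lsof, so it has the same limitations: it misses short-lived reads, and without root it only sees your own processes.</p>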
<p>The next strategy involves monitoring the process from the start, to catch the elusive short-lived read. We will use <a target="_blank" href="https://strace.io/">strace</a>.</p>
<h3 id="heading-how-to-use-strace">How to Use strace</h3>
<pre><code class="lang-shell">sudo dnf install -y strace
(tutorials) [josevnz@dmaf5 SpyOnNfs]$ strace -f ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json 2&gt;&amp;1| rg -e '/misc/data|/misc/suricata'
execve("./scripts/test_script.py", ["./scripts/test_script.py", "--quick_read", "/misc/data/nexus/log/jvm.log", "--follow", "/misc/suricata/eve.json"], 0x7ffd9ae29738 /* 46 vars */) = 0
execve("/home/josevnz/virtualenv/tutorials/bin/python", ["python", "./scripts/test_script.py", "--quick_read", "/misc/data/nexus/log/jvm.log", "--follow", "/misc/suricata/eve.json"], 0x7ffe269dbf88 /* 46 vars */) = 0
[pid 38241] openat(AT_FDCWD, "/misc/suricata/eve.json", O_RDONLY|O_CLOEXEC &lt;unfinished ...&gt;
[pid 38242] openat(AT_FDCWD, "/misc/data/nexus/log/jvm.log", O_RDONLY|O_CLOEXEC &lt;unfinished ...&gt;
</code></pre>
<p>The <code>openat(AT_FDCWD)</code> entries give away the two files our script is reading from NFS. But as you can tell, this approach has some caveats:</p>
<ul>
<li><p>We are filtering the output as it is produced. It is safer to save the full output to a file with <code>tee</code> and then search there.</p>
</li>
<li><p>It requires starting the process with strace from the beginning. Yes, you could run <code>strace -p $PID</code> to attach to the process later, but you risk missing short-lived reads.</p>
</li>
</ul>
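<p>If you saved the strace output to a file, a small Python sketch like the following can pull out the paths for you. The regular expression only covers the <code>openat(AT_FDCWD, ...)</code> form seen above, and the function name and the abridged sample are my own:</p>
<pre><code class="lang-python">import re

# Matches lines such as: [pid 38241] openat(AT_FDCWD, "/misc/suricata/eve.json", ...
# The [pid N] prefix is optional, strace omits it for the first process
OPENAT = re.compile(r'^(?:\[pid\s+(\d+)\]\s+)?openat\(AT_FDCWD, "([^"]+)"')

# Abridged from the strace output shown above
SAMPLE = """\
execve("./scripts/test_script.py", ["./scripts/test_script.py", "--quick_read", "/misc/data/nexus/log/jvm.log"], 0x7ffd9ae29738) = 0
[pid 38241] openat(AT_FDCWD, "/misc/suricata/eve.json", O_RDONLY|O_CLOEXEC
[pid 38242] openat(AT_FDCWD, "/misc/data/nexus/log/jvm.log", O_RDONLY|O_CLOEXEC
"""


def opened_paths(strace_output, prefixes):
    """Return sorted (pid, path) pairs for openat() calls on paths under the prefixes.

    pid is None when the line carries no [pid N] prefix.
    """
    wanted = tuple(prefixes)
    found = set()
    for line in strace_output.splitlines():
        match = OPENAT.match(line)
        if match and match.group(2).startswith(wanted):
            pid = int(match.group(1)) if match.group(1) else None
            found.add((pid, match.group(2)))
    return sorted(found, key=lambda item: (item[0] or 0, item[1]))


if __name__ == "__main__":
    for pid, path in opened_paths(SAMPLE, ["/misc/data", "/misc/suricata"]):
        print(pid, path)
</code></pre>
<p>Note that the <code>execve</code> line is ignored on purpose: a path passed as an argument is not proof that the file was opened.</p>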
<p>Is there a different way? Time to move on to the next tool, <a target="_blank" href="https://www.wireshark.org/docs/man-pages/tshark.html">tshark</a>, and see how to use a network capture to confirm access to the share.</p>
<h3 id="heading-how-to-use-tshark">How to Use tshark</h3>
<p>We can also capture the network traffic and keep only the NFS packets. <a target="_blank" href="https://ask.wireshark.org/question/3582/how-to-capture-filename-path-for-nfsv4-traffic-using-tshark/">It is not perfect</a>, but it may be sufficient.</p>
<p>First, find out which network interface is used to communicate with the NFS server. In my case it is easy – the machines are all connected using a wired private network:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 docs]$ ip --oneline address|rg -e 'eno|wlp'
3: eno1    inet 192.168.68.70/22 brd 192.168.71.255 scope global dynamic noprefixroute eno1\       valid_lft 4568sec preferred_lft 4568sec
4: wlp4s0    inet 192.168.1.95/24 brd 192.168.1.255 scope global dynamic noprefixroute wlp4s0\       valid_lft 3423sec preferred_lft 3423sec
4: wlp4s0    inet6 fe80::ac40:5365:7f09:a5d2/64 scope link noprefixroute \       valid_lft forever preferred_lft forever
</code></pre>
<p>For this example it is eno1 with IP address '192.168.68.70'. Then capture the traffic, and with some luck we will get the file path:</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# tshark -i eno1 -Y "nfs"
Running as user "root" and group "root". This could be dangerous.
Capturing on 'eno1'
 ** (tshark:42326) 16:02:47.417145 [Main MESSAGE] -- Capture started.
 ** (tshark:42326) 16:02:47.417286 [Main MESSAGE] -- File: "/var/tmp/wireshark_eno1rEGxiu.pcapng"
   13 1.601197994 192.168.68.70 → 192.168.68.60 NFS 450 V4 Call GETATTR FH: 0x90ba4ee1  ; V4 Call GETATTR FH: 0x90ba4ee1
   14 1.601374466 192.168.68.70 → 192.168.68.60 NFS 258 V4 Call GETATTR FH: 0x90ba4ee1
   15 1.601395155 192.168.68.70 → 192.168.68.60 NFS 258 V4 Call GETATTR FH: 0x90ba4ee1
   16 1.602155254 192.168.68.60 → 192.168.68.70 NFS 310 V4 Reply (Call In 13) GETATTR
   17 1.602368826 192.168.68.60 → 192.168.68.70 NFS 554 V4 Reply (Call In 13) GETATTR  ; V4 Reply (Call In 14) GETATTR
   19 1.602515091 192.168.68.70 → 192.168.68.60 NFS 274 V4 Call READ StateID: 0xa902 Offset: 57552896 Len: 12288
   20 1.602557170 192.168.68.60 → 192.168.68.70 NFS 310 V4 Reply (Call In 15) GETATTR
   22 1.603156327 192.168.68.60 → 192.168.68.70 NFS 1730 V4 Reply (Call In 19) READ
   66 4.611124808 192.168.68.70 → 192.168.68.60 NFS 642 V4 Call GETATTR FH: 0x90ba4ee1  ; V4 Call GETATTR FH: 0x90ba4ee1  ; V4 Call GETATTR FH: 0x90ba4ee1
   67 4.611301059 192.168.68.70 → 192.168.68.60 NFS 258 V4 Call GETATTR FH: 0x90ba4ee1
   68 4.611809385 192.168.68.60 → 192.168.68.70 NFS 310 V4 Reply (Call In 66) GETATTR
   69 4.611887552 192.168.68.60 → 192.168.68.70 NFS 310 V4 Reply (Call In 66) GETATTR
   71 4.611976479 192.168.68.60 → 192.168.68.70 NFS 310 V4 Reply (Call In 66) GETATTR
   72 4.620685968 192.168.68.60 → 192.168.68.70 NFS 310 V4 Reply (Call In 67) GETATTR
   74 5.017200005 192.168.68.70 → 192.168.68.60 NFS 250 V4 Call GETATTR FH: 0x9419c00c
   75 5.017804843 192.168.68.70 → 192.168.68.59 NFS 242 V4 Call GETATTR FH: 0x314e720f
   76 5.017838787 192.168.68.60 → 192.168.68.70 NFS 310 V4 Reply (Call In 74) GETATTR
   77 5.018131217 192.168.68.70 → 192.168.68.60 NFS 326 V4 Call OPEN DH: 0x90ba4ee1/
   78 5.018711408 192.168.68.60 → 192.168.68.70 NFS 386 V4 Reply (Call In 77) OPEN StateID: 0x9984
   79 5.018855699 192.168.68.59 → 192.168.68.70 NFS 310 V4 Reply (Call In 75) GETATTR
   81 5.018980434 192.168.68.70 → 192.168.68.59 NFS 262 V4 Call GETATTR FH: 0xecd332cc
   82 5.019934959 192.168.68.59 → 192.168.68.70 NFS 310 V4 Reply (Call In 81) GETATTR
   83 5.020032853 192.168.68.70 → 192.168.68.59 NFS 262 V4 Call GETATTR FH: 0x261d4440
   84 5.020734032 192.168.68.59 → 192.168.68.70 NFS 310 V4 Reply (Call In 83) GETATTR
   85 5.020874175 192.168.68.70 → 192.168.68.59 NFS 330 V4 Call OPEN DH: 0xc9b4831b/
</code></pre>
<p>This is great: there is activity against two NFS servers, 192.168.68.59 and 192.168.68.60. But is there a way to see the names of the files?</p>
<p>tshark can print information field by field. The problem is that NFS has lots of fields:</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# for field in $(tshark -G fields| cut -f3|rg -e '^nfs\.'); do echo "-e $field"; done|head -n 10
Running as user "root" and group "root". This could be dangerous.
-e nfs.unknown
-e nfs.svr4
-e nfs.knfsd_le
-e nfs.nfsd_le
-e nfs.knfsd_new
-e nfs.ontap_v3
-e nfs.ontap_v4
-e nfs.ontap_gx_v3
-e nfs.celerra_vnx
-e nfs.gluster
</code></pre>
<p>So, let's <a target="_blank" href="https://www.wireshark.org/docs/dfref/n/nfs.html">capture them</a> into a variable (we <a target="_blank" href="https://wiki.wireshark.org/NFS_Preferences">also need to enable some options</a>):</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# fields=$(for field in $(tshark -G fields| cut -f3|rg -e '^nfs\.'); do echo "-e $field"; done)
[root@dmaf5 ~]# tshark -i eno1 --enable-protocol nfs -o nfs.file_name_snooping:true -o nfs.file_full_name_snooping:true -T fields -E header=y -E separator=, -E quote=d $fields
Running as user "root" and group "root". This could be dangerous.
nfs.unknown,nfs.svr4,nfs.knfsd_le,nfs.nfsd_le,nfs.knfsd_new,nfs.ontap_v3,nfs.ontap_v4,nfs.ontap_gx_v3,n...
</code></pre>
<p>I managed to get the file name only once. After interrupting and restarting the program, I had no luck: there was no sign of the file name.</p>
<p>The file handle was in the output, but this is not very useful if you want a quick way to see what was accessed.</p>
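<p>Even without file names, the one-line summaries printed by tshark can be reduced to a count of NFS calls per server. The following Python sketch assumes the default summary layout shown earlier (packet number, time, source, arrow, destination, protocol); the function name is hypothetical and the sample lines are copied from the capture above:</p>
<pre><code class="lang-python">import re
from collections import Counter

# One packet can carry several calls, separated by ';'
CALL = re.compile(r"V4 Call (\w+)")

# A few lines copied from the capture shown earlier
SAMPLE = """\
   74 5.017200005 192.168.68.70 → 192.168.68.60 NFS 250 V4 Call GETATTR FH: 0x9419c00c
   75 5.017804843 192.168.68.70 → 192.168.68.59 NFS 242 V4 Call GETATTR FH: 0x314e720f
   76 5.017838787 192.168.68.60 → 192.168.68.70 NFS 310 V4 Reply (Call In 74) GETATTR
   77 5.018131217 192.168.68.70 → 192.168.68.60 NFS 326 V4 Call OPEN DH: 0x90ba4ee1/
"""


def calls_per_server(tshark_output):
    """Count NFSv4 calls by (destination address, operation)."""
    counts = Counter()
    for line in tshark_output.splitlines():
        fields = line.split()
        # The sixth column holds the protocol; skip anything that is not NFS
        if fields[5:6] != ["NFS"]:
            continue
        server = fields[4]
        for operation in CALL.findall(line):
            counts[(server, operation)] += 1
    return counts


if __name__ == "__main__":
    for (server, operation), total in sorted(calls_per_server(SAMPLE).items()):
        print(server, operation, total)
</code></pre>
<p>This tells you which servers are being used and how, but it still does not tell you which files were touched.</p>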
<p>Is there an easier way to do this? Sysdig may offer some answers.</p>
<h3 id="heading-how-to-use-sysdig">How to Use Sysdig</h3>
<p>While trying to find the elusive mount points, I stumbled upon <a target="_blank" href="https://github.com/draios/sysdig">Sysdig</a>:</p>
<p>Sysdig instruments your physical and virtual machines at the OS level by installing into the Linux kernel and capturing system calls and other OS events. It does this with its own kernel module (or, alternatively, an eBPF probe), which gives it a view of the system comparable to what <a target="_blank" href="https://en.wikipedia.org/wiki/DTrace">DTrace</a> offers on other platforms.</p>
<p>Sysdig also makes it possible to create trace files for system activity, similarly to what you can do for networks with tools like tcpdump and Wireshark.</p>
<p>I decided to use the latest version at the time of writing (<a target="_blank" href="https://github.com/draios/sysdig/releases/tag/0.33.1">0.33.1</a>) on Fedora 37, where my script is running:</p>
<pre><code class="lang-shell">sudo dnf install -y https://github.com/draios/sysdig/releases/download/0.33.1/sysdig-0.33.1-x86_64.rpm
# Wait a little bit, as a kernel module needs to be compiled and prepared...
Installed:
  bison-3.8.2-3.fc37.x86_64                    dkms-3.0.11-1.fc37.noarch          elfutils-libelf-devel-0.189-3.fc37.x86_64  flex-2.6.4-11.fc37.x86_64            kernel-devel-6.4.13-100.fc37.x86_64 
  kernel-devel-matched-6.4.13-100.fc37.x86_64  libzstd-devel-1.5.5-1.fc37.x86_64  m4-1.4.19-4.fc37.x86_64                    openssl-devel-1:3.0.9-1.fc37.x86_64  sysdig-0.33.1-1.x86_64              
  zlib-devel-1.2.12-5.fc37.x86_64
</code></pre>
<p>How easy is it to prove that the script is indeed accessing the NFS-mounted directories? Let's print two fields of interest: the command line of the process and the name of the accessed file:</p>
<pre><code class="lang-shell"># `sysdig -l` will output every single field you can capture
[root@dmaf5 ~]# sysdig -p"%proc.cmdline,%fd.name" proc.name contains python and fd.name contains /misc
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/suricata/eve.json
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/suricata/eve.json
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/suricata/eve.json
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/suricata/eve.json
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/suricata/eve.json
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/suricata/eve.json
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/suricata/eve.json
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/suricata/eve.json
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/suricata/eve.json
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/data/nexus/log/jvm.log
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/data/nexus/log/jvm.log
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/data/nexus/log/jvm.log
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/data/nexus/log/jvm.log
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/data/nexus/log/jvm.log
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/data/nexus/log/jvm.log
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/data/nexus/log/jvm.log
...
</code></pre>
<p>What if you want to capture all the data, and filter later? One way to do it is capturing to a file:</p>
<pre><code class="lang-shell"># Capture for one minute...
[root@dmaf5 ~]# timeout --preserve-status 1m sysdig -w /tmp/sysdig.dump
[root@dmaf5 ~]# ls -lh /tmp/sysdig.dump
-rw-r--r--. 1 root root 32M Sep 10 19:03 /tmp/sysdig.dump
</code></pre>
<p>And then replay the contents, with filtering (replay doesn't need elevated privileges):</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# sysdig -r /tmp/sysdig.dump -p"%proc.cmdline,%fd.name" proc.name contains python and fd.name contains /misc|sort -u
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/data/nexus/log/jvm.log
python ./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose,/misc/suricata/eve.json
</code></pre>
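<p>When several programs show up in a replay, <code>sort -u</code> gets harder to read. Here is a small Python sketch that groups the accessed files by command line instead. It assumes the two-field <code>%proc.cmdline,%fd.name</code> output used above and that file names contain no commas; the function name and the abridged sample are mine:</p>
<pre><code class="lang-python">from collections import defaultdict

# Abridged sample in the "cmdline,file" format printed above
SAMPLE = """\
python ./scripts/test_script.py --verbose,/misc/suricata/eve.json
python ./scripts/test_script.py --verbose,/misc/suricata/eve.json
python ./scripts/test_script.py --verbose,/misc/data/nexus/log/jvm.log
"""


def files_by_command(sysdig_output):
    """Map each command line to the sorted list of distinct files it touched."""
    grouped = defaultdict(set)
    for line in sysdig_output.splitlines():
        # Split on the LAST comma: the command line may contain commas itself
        command, separator, file_name = line.rpartition(",")
        if separator:
            grouped[command].add(file_name)
    return {command: sorted(files) for command, files in grouped.items()}


if __name__ == "__main__":
    for command, files in files_by_command(SAMPLE).items():
        print(command)
        for file_name in files:
            print("   ", file_name)
</code></pre>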
<p>Sysdig supports scripting with the <a target="_blank" href="https://www.lua.org/">Lua language</a>. These scripts are called chisels. For example, there is a very convenient version of lsof:</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# sysdig -cl|rg lsof
lsof            List (and optionally filter) the open file descriptors.
</code></pre>
<p>So let's use it:</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# sysdig -c lsof|rg misc
automount           52410   52410   root    8       directory   /misc
automount           52410   52413   root    8       directory   /misc
automount           52410   52414   root    8       directory   /misc
automount           52410   52415   root    8       directory   /misc
automount           52410   52418   root    8       directory   /misc
automount           52410   52421   root    8       directory   /misc
python              75840   75840   josevnz 3       file        /misc/suricata/eve.json
python              75840   75841   josevnz 3       file        /misc/suricata/eve.json
python              75840   75842   josevnz 3       file        /misc/suricata/eve.json
</code></pre>
<p>What I liked about this tool:</p>
<ul>
<li><p>Can work with older kernels (like 4.xx)</p>
</li>
<li><p>Has a powerful expression language for filtering</p>
</li>
<li><p>Easy to learn and well documented</p>
</li>
<li><p>You can write your own scripts if you know Lua</p>
</li>
</ul>
<p>Before finishing up, let's look at one more tool: BPF.</p>
<h3 id="heading-how-to-use-bpf-probe">How to Use BPF Probes</h3>
<p>BPF, originally the Berkeley Packet Filter, is a kernel and user-space observability framework for Linux.</p>
<p>BPF is a <a target="_blank" href="https://www.linuxjournal.com/content/bpf-observability-getting-started-quickly">very powerful tool</a>, and this short article won't even scratch the surface.</p>
<p>Yes, this is huge. I'm learning this myself.</p>
<p>I found that the <a target="_blank" href="https://github.com/iovisor/bcc">bcc</a> repository has lots of ready-to-use scripts that we can use to track our NFS access, and even check for performance (you can find more examples <a target="_blank" href="https://github.com/iovisor/bpftrace">here</a>, and on the <a target="_blank" href="https://github.com/brendangregg/bpf-perf-tools-book/tree/master">BPF Performance Book repository</a>).</p>
<p>It is more interesting to write your own tools, which can monitor pretty much anything you want. But for this tutorial, I will use some ready-made programs that rely on kernel traces to capture useful information.</p>
<p>As a first step, we will need to install a high level interpreter for our scripts. Again, on my Fedora Linux machine:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 ~]$ sudo dnf install -y bpftrace.x86_64 bcc-tools.x86_64
# And check if the kernel has btf enabled
[josevnz@dmaf5 ~]$ ls -la /sys/kernel/btf/vmlinux
-r--r--r--. 1 root root 5635179 Sep 12 04:21 /sys/kernel/btf/vmlinux
</code></pre>
<p>In a separate terminal, run the NFS test script again:</p>
<pre><code class="lang-shell">. ~/virtualenv/tutorials/bin/activate
cd SpyOnNfs/
./scripts/test_script.py --quick_read /misc/data/nexus/log/jvm.log --follow /misc/suricata/eve.json --verbose
</code></pre>
<p>You can trace the files being read and written by every process, in the style of top, with the bcc <code>filetop</code> tool:</p>
<pre><code class="lang-shell">18:59:20 loadavg: 1.20 1.00 0.74 1/1175 28520

TID     COMM             READS  WRITES R_Kb    W_Kb    T FILE
28520   clear            2      0      60      0       R xterm-256color
28203   python           7      0      56      0       R eve.json
28347   filetop          2      0      15      0       R loadavg
824     systemd-oomd     2      0      8       0       R memory.swap.current
824     systemd-oomd     2      0      8       0       R memory.low
...
</code></pre>
<p>But it doesn't print the full path. It's more useful to snoop on NFS operations with <code>nfsslower</code> and see if one of our files shows up:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 SuricataLog]$ sudo /usr/share/bcc/tools/nfsslower 1
# Commented out some warnings ...
Tracing NFS operations that are slower than 1 ms... Ctrl-C to quit
TIME     COMM           PID    T BYTES   OFF_KB   LAT(ms) FILENAME
19:02:25 python         28202  R 1460    62150       1.96 eve.json
19:02:28 python         28202  R 2446    62151       2.09 eve.json
19:02:31 python         28202  R 970     62154       1.99 eve.json
19:02:34 python         28202  R 3335    62155       2.43 eve.json
19:02:37 python         28202  R 4564    62158       1.84 eve.json
19:02:40 python         28202  R 5876    62162       1.89 eve.json
19:02:43 python         28202  R 4504    62168       1.61 eve.json
19:02:46 python         28202  R 3131    62173       1.92 eve.json
</code></pre>
<p>This is much better. We can also see that the latency is around two milliseconds.</p>
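<p>If you save that output to a file, you can compute the average latency per file instead of eyeballing it. This Python sketch assumes the eight-column layout shown above and file names without spaces; the function name is mine, and the sample rows are copied from the output above:</p>
<pre><code class="lang-python">from statistics import mean

# Header and first rows copied from the nfsslower output above
SAMPLE = """\
TIME     COMM           PID    T BYTES   OFF_KB   LAT(ms) FILENAME
19:02:25 python         28202  R 1460    62150       1.96 eve.json
19:02:28 python         28202  R 2446    62151       2.09 eve.json
19:02:31 python         28202  R 970     62154       1.99 eve.json
"""


def average_latency_ms(nfsslower_output):
    """Return {(command, file name): average latency in ms, rounded to 2 decimals}."""
    latencies = {}
    for line in nfsslower_output.splitlines():
        fields = line.split()
        # Data rows have 8 columns and an operation type in the 4th one:
        # R (read), W (write), O (open) or G (getattr)
        if len(fields) != 8 or fields[3] not in ("R", "W", "O", "G"):
            continue
        key = (fields[1], fields[7])
        latencies.setdefault(key, []).append(float(fields[6]))
    return {key: round(mean(values), 2) for key, values in latencies.items()}


if __name__ == "__main__":
    for (command, file_name), latency in average_latency_ms(SAMPLE).items():
        print(f"{command} {file_name}: {latency} ms")
</code></pre>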
<p>We can also monitor mount and umount operations:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 SuricataLog]$ sudo /usr/share/bcc/tools/mountsnoop 
# Commented out some warnings ...
2 warnings generated.
COMM             PID     TID     MNT_NS      CALL
mount.nfs        29012   29012   4026531841  mount("orangepi5:/data", "/misc/data", "nfs", MS_RDONLY, "sloppy,soft,rsize=16384,wsize=16384,vers=4.2,addr=192.168.68.59,clientaddr=192.168.68.68") = 0
</code></pre>
<p>This is good as well: we can see the NFS activity we wanted to confirm.</p>
<h2 id="heading-next-steps">Next Steps</h2>
<p>You learned about several tools and, as you may have guessed, you can use them to snoop on more than just opened files on NFS.</p>
<p>It is always useful to know more than one tool. Sysdig deserves a special mention for being versatile and powerful, yet easy to use. It can also be extended with scripts written in the Lua language.</p>
<p>BPF is another alternative and will give you incredible access to the kernel calls. Be prepared to spend time reading and learning how to use the tools.</p>
<p>The code for the scripts used in this tutorial can be obtained from my <a target="_blank" href="https://github.com/josevnz/tutorials/tree/main/docs/SpyOnNfs">GitHub repository: SpyOnNfs</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Provision a Nexus Sonatype OSS on an Orange PI 5 with Ansible ]]>
                </title>
                <description>
                    <![CDATA[ Nexus 3 OSS is an Open Source artifact repository manager that can handle multiple formats like container images, Python PIP, Java jar, and many others. Why have an on-premise artifact manager? There are many reasons for it: Use your private infrast... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/provision-nexus-sonatype-oss-on-an-orange-pi-5-with-ansible/</link>
                <guid isPermaLink="false">66d85148f20d0925f8515b0d</guid>
                
                    <category>
                        <![CDATA[ hardware ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Linux ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Raspberry Pi ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Fri, 05 May 2023 21:34:24 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2023/05/cropped_museum.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Nexus 3 OSS is an <a target="_blank" href="https://github.com/sonatype/nexus-public">Open Source</a> <a target="_blank" href="https://www.sonatype.com/products/repository-oss">artifact repository manager</a> that can handle multiple formats like container images, Python PIP, Java jar, and many others.</p>
<p>Why have an on-premise artifact manager? There are many reasons for it:</p>
<ul>
<li><p>Use your private infrastructure: You may have proprietary code that needs to be safeguarded.</p>
</li>
<li><p>Faster artifact download speeds: If you constantly download the same artifacts over the Internet, you can cache them in a central location for the benefit of your users across multiple servers.</p>
</li>
<li><p>Control what artifacts make it to your build chain: Centralize the location of the artifacts, ensure they are approved for usage, and also confirm that they do not contain malicious code.</p>
</li>
<li><p>Segregate who can have access to your artifacts: You may have stricter requirements on who can access some artifacts within your own organization.</p>
</li>
</ul>
<p>In this article I will show you how you can download, install, and configure the OSS version of Nexus 3 using an Ansible playbook.</p>
<p>Nexus 3 will run on an <a target="_blank" href="http://www.orangepi.org/html/hardWare/computerAndMicrocontrollers/details/Orange-Pi-5.html">Orange PI 5 computer with 8 GB of RAM</a>, but this provisioning can be done on any machine that meets the <a target="_blank" href="https://help.sonatype.com/repomanager3/product-information/sonatype-nexus-repository-system-requirements">minimum requirements</a>. Part of the setup will consist of setting up a proxy for <a target="_blank" href="https://pypi.org/">PyPI.org</a>, for the machines listed in my <a target="_blank" href="https://tutorials.kodegeek.com/Nexus3OnOrangePI/ansible/inventories/home/hosts.yaml">inventory</a> file.</p>
<h2 id="heading-what-you-need-to-run-the-code-from-this-tutorial">What you need to run the code from this tutorial</h2>
<ol>
<li><p>An Internet connection to download the <a target="_blank" href="https://github.com/josevnz/Nexus3OnOrangePI">source code</a> for the Ansible playbook, Nexus, and PIP modules</p>
</li>
<li><p>Two or more Linux machines (I used <a target="_blank" href="https://raspi.debian.net/">Debian</a>, <a target="_blank" href="https://www.armbian.com/orangepi-5/">Armbian</a> and <a target="_blank" href="https://getfedora.org/iot/">Fedora IOT</a>), with at least 8 GB of RAM. My cluster has a mix of Raspberry PI 4 and an OrangePI 5.</p>
</li>
<li><p>The Ansible controller will run on the Fedora machine, but any server can be the controller. The <a target="_blank" href="https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html">installation instructions for Ansible</a> are easy to follow.</p>
</li>
</ol>
<h2 id="heading-playbook-organization">Playbook Organization</h2>
<p>I divided the tasks into groups, and the resulting playbook looks like this:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 Nexus3OnOrangePI]$ tree -N ansible/
ansible/
├── inventories
│   └── home
│       └── hosts.yaml
├── roles
│   ├── clients
│   │   ├── tasks
│   │   │   └── main.yaml
│   │   └── templates
│   │       └── pip.conf.j2
│   └── nexus
│       ├── files
│       │   └── swagger.json
│       ├── tasks
│       │   ├── download.yaml
│       │   ├── install.yaml
│       │   ├── main.yaml
│       │   ├── post_install.yaml
│       │   ├── pre_install.yaml
│       │   ├── repositories.yaml
│       │   ├── third_party.yaml
│       │   └── user.yaml
│       └── templates
│           ├── logrotate.nexus3.j2
│           ├── nexus3.service.j2
│           ├── nexus.rc.j2
│           └── nexus.vmoptions.j2
├── site.yaml
├── vars
│   ├── clients.yaml
│   └── nexus.yaml
└── vault
    ├── nexus_password.enc
    └── README.md

13 directories, 21 files
</code></pre>
<p>Now a little bit of explaining:</p>
<ul>
<li><p>There are two roles: ‘nexus’ and ‘clients’. The nexus role is used to set up the artifact management software, while the clients role sets up the <a target="_blank" href="https://docs.python.org/3/installing/index.html">pip</a> settings on every machine.</p>
</li>
<li><p>The ‘vars’ directory contains the variables used by each role, split into separate files to make their usage clearer.</p>
</li>
<li><p>We have passwords, and we manage them using the <a target="_blank" href="https://docs.ansible.com/ansible/latest/cli/ansible-vault.html">Ansible Vault</a> feature.</p>
</li>
<li><p>The file ‘<a target="_blank" href="https://tutorials.kodegeek.com/Nexus3OnOrangePI/ansible/site.yaml">site.yaml</a>’ orchestrates the role execution:</p>
</li>
</ul>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">hosts:</span> <span class="hljs-string">all</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">clients</span>
  <span class="hljs-attr">vars_files:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">vars/clients.yaml</span>
  <span class="hljs-attr">roles:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">clients</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">hosts:</span> <span class="hljs-string">nexus_server</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">nexus</span>
  <span class="hljs-attr">become_user:</span> <span class="hljs-string">root</span>
  <span class="hljs-attr">become:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">vars_files:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">vars/nexus.yaml</span>
  <span class="hljs-attr">roles:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">nexus</span>
</code></pre>
<p>Now let’s move on to see the universe where the playbook will be executed.</p>
<h2 id="heading-the-host-inventory">The Host Inventory</h2>
<p>In my case it is quite simple – I have two main groups: the clients (‘home_lab’) and the machine where the Nexus 3 server itself will run (‘nexus_server’):</p>
<pre><code class="lang-yaml"><span class="hljs-attr">all:</span>
  <span class="hljs-attr">children:</span>
    <span class="hljs-attr">nexus_server:</span>
      <span class="hljs-attr">hosts:</span>
        <span class="hljs-attr">orangepi5.home:</span>
    <span class="hljs-attr">home_lab:</span>
      <span class="hljs-attr">hosts:</span>
        <span class="hljs-attr">dmaf5.home:</span>
        <span class="hljs-attr">raspberrypi.home:</span>
        <span class="hljs-attr">orangepi5.home:</span>
</code></pre>
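<p>To make it clear which machines each play in site.yaml ends up targeting, here is a toy model in Python. It mirrors the inventory above as a plain dictionary and only understands a group name or <code>all</code>, which is a tiny subset of what real Ansible host patterns support; the function name is made up:</p>
<pre><code class="lang-python"># The inventory above, as a plain dictionary: group name to list of hosts
INVENTORY = {
    "nexus_server": ["orangepi5.home"],
    "home_lab": ["dmaf5.home", "raspberrypi.home", "orangepi5.home"],
}


def hosts_for(pattern, inventory=INVENTORY):
    """Resolve the 'hosts' value of a play: a group name, or 'all' for every host."""
    if pattern == "all":
        groups = inventory.values()
    else:
        groups = [inventory[pattern]]
    # A host that belongs to several groups is only targeted once
    return sorted({host for group in groups for host in group})


if __name__ == "__main__":
    print("clients play:", hosts_for("all"))
    print("nexus play:  ", hosts_for("nexus_server"))
</code></pre>
<p>Note that orangepi5.home appears in both groups, so it gets the clients role as well as the nexus role.</p>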
<p>The next important task is to download and configure Nexus 3.</p>
<h2 id="heading-how-to-install-nexus-3">How to Install Nexus 3</h2>
<p>The file <a target="_blank" href="https://tutorials.kodegeek.com/Nexus3OnOrangePI/ansible/roles/nexus/tasks/main.yaml">main.yaml</a> describes the order and purpose of each installation task for the Nexus role:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># Tasks listed here are related to the remote Nexus 3 server</span>
<span class="hljs-comment"># Included tasks are called in order</span>
<span class="hljs-meta">---</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">include_tasks:</span> <span class="hljs-string">third_party.yaml</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">include_tasks:</span> <span class="hljs-string">pre_install.yaml</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">include_tasks:</span> <span class="hljs-string">download.yaml</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">include_tasks:</span> <span class="hljs-string">install.yaml</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">include_tasks:</span> <span class="hljs-string">post_install.yaml</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">include_tasks:</span> <span class="hljs-string">user.yaml</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">include_tasks:</span> <span class="hljs-string">repositories.yaml</span>
</code></pre>
<p>Let’s see first what I like to call the “core tasks”:</p>
<ol>
<li><p><a target="_blank" href="https://tutorials.kodegeek.com/Nexus3OnOrangePI/ansible/roles/nexus/tasks/third_party.yaml">third_party.yaml</a>: Here we install OpenJDK 8 (Nexus 3 is written in Java) and logrotate to take care of the stale logs.</p>
</li>
<li><p><a target="_blank" href="https://tutorials.kodegeek.com/Nexus3OnOrangePI/ansible/roles/nexus/tasks/pre_install.yaml">pre_install.yaml</a>: A lot happens here, like creating the directories required by Nexus and the dedicated non-privileged user that will run the process.</p>
</li>
<li><p><a target="_blank" href="https://tutorials.kodegeek.com/Nexus3OnOrangePI/ansible/roles/nexus/tasks/download.yaml">download.yaml</a>: As the name says, we get a fresh version of the Nexus 3 OSS software and make sure it has the right checksum. We don’t want to install malware from the Internet.</p>
</li>
</ol>
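<p>The checksum verification idea from download.yaml is worth a closer look. Outside of Ansible, the same check can be sketched in a few lines of Python. I am assuming a SHA-256 checksum here, and the function names are mine; use whatever algorithm the vendor publishes for the file you download:</p>
<pre><code class="lang-python">import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file without loading it all in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as archive:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: archive.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Raise ValueError when the file does not match the expected checksum."""
    actual = sha256_of(path)
    if actual != expected_sha256.strip().lower():
        raise ValueError(f"Checksum mismatch for {path}: got {actual}")
    return True
</code></pre>
<p>The important design choice is to fail loudly: if the checksum does not match, the installation should stop before anything gets unpacked.</p>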
<p>Then come the tasks that fall into the “customized installation group”:</p>
<ol>
<li><p><a target="_blank" href="https://tutorials.kodegeek.com/Nexus3OnOrangePI/ansible/roles/nexus/tasks/install.yaml">install.yaml</a>: Unpack the software, prepare the systemd unit to start it automatically, set up the JVM settings for Nexus, and deploy the logrotate configuration.</p>
</li>
<li><p><a target="_blank" href="https://tutorials.kodegeek.com/Nexus3OnOrangePI/ansible/roles/nexus/tasks/post_install.yaml">post_install.yaml</a>: Exciting stuff happens here – the software is installed, and we run it for the first time. We also change the default password <a target="_blank" href="https://help.sonatype.com/repomanager3/integrations/rest-and-integration-api">using the REST API</a>, so we can move to the customization stage.</p>
</li>
<li><p><a target="_blank" href="https://tutorials.kodegeek.com/Nexus3OnOrangePI/ansible/roles/nexus/tasks/user.yaml">user.yaml</a>: Here we prepare to provide our end users with proper access to the services offered by Nexus. We do this using a combination of the REST-API and Ansible client code:</p>
</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-comment"># https://help.sonatype.com/repomanager3/installation-and-upgrades/post-install-checklist</span>
<span class="hljs-comment"># https://help.sonatype.com/repomanager3/integrations/rest-and-integration-api</span>
<span class="hljs-meta">---</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Enable</span> <span class="hljs-string">anonymous</span> <span class="hljs-string">user</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">anonymous</span>
  <span class="hljs-attr">ansible.builtin.uri:</span>
    <span class="hljs-attr">user:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">password:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">url:</span> <span class="hljs-string">"/v1/security/anonymous"</span>
    <span class="hljs-attr">method:</span> <span class="hljs-string">PUT</span>
    <span class="hljs-attr">body_format:</span> <span class="hljs-string">raw</span>
    <span class="hljs-attr">status_code:</span> [ <span class="hljs-number">200</span>, <span class="hljs-number">202</span>, <span class="hljs-number">204</span> ]
    <span class="hljs-attr">headers:</span>
      <span class="hljs-attr">Content-Type:</span> <span class="hljs-string">application/json</span>
    <span class="hljs-attr">body:</span> <span class="hljs-string">|-
      { "enabled" : true, "userId" : "anonymous", "realmName" : "NexusAuthorizingRealm" }
</span>    <span class="hljs-attr">force_basic_auth:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">return_content:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">any_errors_fatal:</span> <span class="hljs-literal">true</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Enable</span> <span class="hljs-string">Docker</span> <span class="hljs-string">security</span> <span class="hljs-string">realm</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">docker_realm</span>
  <span class="hljs-attr">ansible.builtin.uri:</span>
    <span class="hljs-attr">user:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">password:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">url:</span> <span class="hljs-string">"/v1/security/realms/active"</span>
    <span class="hljs-attr">method:</span> <span class="hljs-string">PUT</span>
    <span class="hljs-attr">body_format:</span> <span class="hljs-string">raw</span>
    <span class="hljs-attr">status_code:</span> [ <span class="hljs-number">200</span>, <span class="hljs-number">202</span>, <span class="hljs-number">204</span> ]
    <span class="hljs-attr">headers:</span>
      <span class="hljs-attr">Content-Type:</span> <span class="hljs-string">application/json</span>
    <span class="hljs-attr">body:</span> <span class="hljs-string">|-
      [ "NexusAuthenticatingRealm", "NexusAuthorizingRealm", "DockerToken" ]
</span>    <span class="hljs-attr">force_basic_auth:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">return_content:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">any_errors_fatal:</span> <span class="hljs-literal">true</span>
</code></pre>
<p>The logic is easy to follow: the ‘PUT’ HTTP method tells you this is a modification operation (meaning the roles and users already exist). Errors are detected by checking the HTTP status codes returned by Nexus.</p>
<p>The next step is to prepare our local PyPI proxy. This is a multistep task, described in detail next.</p>
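<p>To double-check that a change like this took effect, one option is to read the setting back and assert on it. The tasks below are a minimal sketch and not part of the original role: 'nexus_api_url', 'nexus_admin_user' and 'nexus_admin_password' are hypothetical variable names standing in for whatever your role defines, and the sketch assumes the GET response carries the same 'enabled' field used in the PUT body above.</p>
<pre><code class="lang-yaml">---
- name: Read back the anonymous access setting
  tags: anonymous_check
  ansible.builtin.uri:
    # Hypothetical variables, replace them with the ones your role defines
    url: "{{ nexus_api_url }}/v1/security/anonymous"
    user: "{{ nexus_admin_user }}"
    password: "{{ nexus_admin_password }}"
    method: GET
    force_basic_auth: true
    return_content: true
    status_code: [ 200 ]
  register: anonymous_settings
- name: Fail early if anonymous access is still disabled
  ansible.builtin.assert:
    that:
      - anonymous_settings.json.enabled | bool
    fail_msg: "Anonymous access is not enabled on Nexus"
</code></pre>
<p>Checking the response body in addition to the status code catches the case where the server accepts the request but the setting does not end up as expected.</p>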
<h2 id="heading-how-to-set-up-pypi-proxy-on-nexus-3">How to Set Up PyPI Proxy on Nexus 3</h2>
<p>The last file in the Nexus 3 role is ‘<a target="_blank" href="https://tutorials.kodegeek.com/Nexus3OnOrangePI/ansible/roles/nexus/tasks/repositories.yaml">repositories.yaml</a>’. Here we go through the following steps:</p>
<ol>
<li><p>Check if the proxy was already set up (GET, a read-only operation)</p>
</li>
<li><p>If it doesn’t exist, create a new one (POST method with a JSON payload describing the new repository)</p>
</li>
</ol>
<p>Notice that this playbook doesn’t offer the option to update repository settings. It is possible to do so with the REST API, but I will leave that as an exercise for the reader.</p>
<p>The tasks to prepare the PyPi proxy are shown below:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># Create proxy for repositories</span>
<span class="hljs-comment"># https://help.sonatype.com/repomanager3/integrations/rest-and-integration-api</span>
<span class="hljs-comment"># PyPi: https://pip.pypa.io/en/stable/user_guide/</span>
<span class="hljs-meta">---</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Check</span> <span class="hljs-string">if</span> <span class="hljs-string">the</span> <span class="hljs-string">PyPi</span> <span class="hljs-string">proxy</span> <span class="hljs-string">exists</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">pypi_proxy_exists</span>
  <span class="hljs-attr">ansible.builtin.uri:</span>
    <span class="hljs-attr">user:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">password:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">url:</span> <span class="hljs-string">"/v1/repositories/pypi/proxy/python_proxy"</span>
    <span class="hljs-attr">method:</span> <span class="hljs-string">GET</span>
    <span class="hljs-attr">body_format:</span> <span class="hljs-string">raw</span>
    <span class="hljs-attr">status_code:</span> [ <span class="hljs-number">200</span>, <span class="hljs-number">202</span>, <span class="hljs-number">204</span>, <span class="hljs-number">404</span> ]
    <span class="hljs-attr">headers:</span>
      <span class="hljs-attr">Content-Type:</span> <span class="hljs-string">application/json</span>
    <span class="hljs-attr">force_basic_auth:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">return_content:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">any_errors_fatal:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">register:</span> <span class="hljs-string">python_local</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Create</span> <span class="hljs-string">PyPI</span> <span class="hljs-string">proxy</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">pypi_proxy_create</span>
  <span class="hljs-attr">ansible.builtin.uri:</span>
    <span class="hljs-attr">user:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">password:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">url:</span> <span class="hljs-string">"/v1/repositories/pypi/proxy"</span>
    <span class="hljs-attr">method:</span> <span class="hljs-string">POST</span>
    <span class="hljs-attr">body_format:</span> <span class="hljs-string">raw</span>
    <span class="hljs-attr">status_code:</span> [ <span class="hljs-number">201</span> ]
    <span class="hljs-attr">headers:</span>
      <span class="hljs-attr">Content-Type:</span> <span class="hljs-string">application/json</span>
    <span class="hljs-attr">body:</span> <span class="hljs-string">|-
      {
        "name": "python_proxy",
        "online": true,
        "storage": {
          "blobStoreName": "default",
          "strictContentTypeValidation": true
        },
        "proxy": {
          "remoteUrl": "https://pypi.org/",
          "contentMaxAge": -1,
          "metadataMaxAge": 1440
        },
        "negativeCache": {
          "enabled": true,
          "timeToLive": 1440
        },
        "httpClient": {
          "blocked": false,
          "autoBlock": true,
          "connection": {
            "retries": 0,
            "timeout": 60,
            "enableCircularRedirects": false,
            "enableCookies": true,
            "useTrustStore": false
          }
        }
      }
</span>    <span class="hljs-attr">force_basic_auth:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">return_content:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">any_errors_fatal:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">when:</span> <span class="hljs-string">python_local.status</span> <span class="hljs-string">==</span> <span class="hljs-number">404</span>
</code></pre>
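<p>As a starting point for the update case that the playbook above leaves out, here is a hedged sketch of what such a task might look like. It assumes that your Nexus version accepts a PUT on the same path used by the existence check and answers with a 204 status (the swagger.json file served by your own installation is the authority on the exact schema). The variables 'nexus_api_url', 'nexus_admin_user' and 'nexus_admin_password' are hypothetical names, and the changed 'metadataMaxAge' value is only an example.</p>
<pre><code class="lang-yaml">- name: Update the existing PyPI proxy
  tags: pypi_proxy_update
  ansible.builtin.uri:
    # Hypothetical variables, replace them with the ones your role defines
    url: "{{ nexus_api_url }}/v1/repositories/pypi/proxy/python_proxy"
    user: "{{ nexus_admin_user }}"
    password: "{{ nexus_admin_password }}"
    method: PUT
    # With body_format json, Ansible serializes the mapping below to JSON
    body_format: json
    status_code: [ 204 ]
    body:
      name: python_proxy
      online: true
      storage:
        blobStoreName: default
        strictContentTypeValidation: true
      proxy:
        remoteUrl: "https://pypi.org/"
        contentMaxAge: -1
        metadataMaxAge: 720  # example change, the create task uses 1440
      negativeCache:
        enabled: true
        timeToLive: 1440
      httpClient:
        blocked: false
        autoBlock: true
    force_basic_auth: true
  any_errors_fatal: true
  # Only update when the earlier GET found the repository
  when: python_local.status == 200
</code></pre>
<p>Because the condition is the opposite of the one on the create task, only one of the two tasks runs on any given execution.</p>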
<p>We are almost there. Now we need to tell our pip clients that they should use our local Nexus, and not the PyPI site directly, to get our Python libraries.</p>
<h2 id="heading-how-to-set-the-clients">How to Set the Clients</h2>
<p>The clients role is much simpler and only requires deploying a <a target="_blank" href="https://tutorials.kodegeek.com/Nexus3OnOrangePI/ansible/roles/clients/templates/pip.conf.j2">template for pip.conf</a> with enough information to force the search on our new repository:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># Tasks here are meant to be used on our clients user</span>
<span class="hljs-meta">---</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Create</span> <span class="hljs-string">installation</span> <span class="hljs-string">directory</span> <span class="hljs-string">for</span> <span class="hljs-string">pip.conf</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">pip_basedir</span>
  <span class="hljs-attr">ansible.builtin.file:</span>
    <span class="hljs-attr">state:</span> <span class="hljs-string">directory</span>
    <span class="hljs-attr">path:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">owner:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">group:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">mode:</span> <span class="hljs-string">"u+rwx,go-rwx"</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Copy</span> <span class="hljs-string">pip.conf</span> <span class="hljs-string">file</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">pip_copy</span>
  <span class="hljs-attr">ansible.builtin.template:</span>
    <span class="hljs-attr">src:</span> <span class="hljs-string">pip.conf.j2</span>
    <span class="hljs-attr">dest:</span> <span class="hljs-string">"/pip.conf"</span>
    <span class="hljs-attr">owner:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">group:</span> <span class="hljs-string">""</span>
    <span class="hljs-attr">mode:</span> <span class="hljs-string">u=rxw,g=r,o=r</span>
</code></pre>
<p>The resulting file gets deployed on ‘<em>~/.config/pip/pip.conf</em>’ of every machine:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># https://pip.pypa.io/en/stable/topics/configuration/</span>
[<span class="hljs-string">global</span>]
<span class="hljs-string">timeout</span> <span class="hljs-string">=</span> <span class="hljs-number">60</span>
[<span class="hljs-string">install</span>]
<span class="hljs-string">index</span> <span class="hljs-string">=</span> <span class="hljs-string">http://orangepi5.home:8081/repository/python_proxy/pypi</span>
<span class="hljs-string">index-url</span> <span class="hljs-string">=</span> <span class="hljs-string">http://orangepi5.home:8081/repository/python_proxy/simple/</span>
<span class="hljs-string">trusted-host</span> <span class="hljs-string">=</span> <span class="hljs-string">orangepi5.home</span>
</code></pre>
<p>The file above shows an example of how the final version of the file will look once deployed on my cluster (yours will be different with the resolved URL).</p>
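<p>If you want the playbook itself to confirm that pip picked up the new file, a check along these lines could be appended to the clients role. This is a sketch under assumptions: it relies on 'python3 -m pip config list' being available on the client, it must run as the same user that owns the pip.conf file, and the string it looks for matches the 'python_proxy' repository name used in this article.</p>
<pre><code class="lang-yaml">- name: Show the pip configuration seen by the client user
  tags: pip_check
  ansible.builtin.command:
    argv:
      - python3
      - -m
      - pip
      - config
      - list
  register: pip_config
  # Read-only command, never report it as a change
  changed_when: false
- name: Confirm that pip points to the Nexus proxy
  tags: pip_check
  ansible.builtin.assert:
    that:
      - "'python_proxy/simple' in pip_config.stdout"
    fail_msg: "pip is not using the local Nexus proxy"
</code></pre>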
<p>It is time now to run the whole playbook and see what it looks like.</p>
<h2 id="heading-how-to-run-the-playbook">How to Run the Playbook</h2>
<p>To run the playbook, we pass a few arguments:</p>
<ol>
<li><p>The location of our host inventory</p>
</li>
<li><p>The location of the encrypted password file and a master file containing the master password to unlock the contents of the protected file</p>
</li>
<li><p>And finally the location of our main playbook file</p>
</li>
</ol>
<pre><code class="lang-shell">cd ansible
ansible-playbook --inventory  inventories --extra-vars @vault/nexus_password.enc --vault-password-file $HOME/vault/ansible_vault_pass site.yaml
</code></pre>
<p><a target="_blank" href="https://asciinema.org/a/579355"><img src="https://asciinema.org/a/579355.svg" alt="asciicast" width="1406.69666611" height="634.666508" loading="lazy"></a></p>
<h3 id="heading-how-to-test-the-new-pypi-proxy">How to test the new PyPI proxy</h3>
<p>To test our new proxy, we will install <a target="_blank" href="https://github.com/Textualize/rich">Python Rich</a> using pip and a virtual environment.</p>
<pre><code class="lang-shell">josevnz@orangepi5:~$ python3 -m venv ~/virtualenv/rich
josevnz@orangepi5:~$ . ~/virtualenv/rich/bin/activate
(rich) josevnz@orangepi5:~$ pip install rich
Looking in indexes: http://orangepi5.home:8081/repository/python_proxy/simple/
Collecting rich
  Downloading http://orangepi5.home:8081/repository/python_proxy/packages/rich/13.3.4/rich-13.3.4-py3-none-any.whl (238 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 238.7/238.7 KB 14.8 MB/s eta 0:00:00
Collecting pygments&lt;3.0.0,&gt;=2.13.0
  Downloading http://orangepi5.home:8081/repository/python_proxy/packages/pygments/2.15.0/Pygments-2.15.0-py3-none-any.whl (1.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 23.8 MB/s eta 0:00:00
Collecting markdown-it-py&lt;3.0.0,&gt;=2.2.0
  Downloading http://orangepi5.home:8081/repository/python_proxy/packages/markdown-it-py/2.2.0/markdown_it_py-2.2.0-py3-none-any.whl (84 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 84.5/84.5 KB 6.9 MB/s eta 0:00:00
Collecting mdurl~=0.1
  Downloading http://orangepi5.home:8081/repository/python_proxy/packages/mdurl/0.1.2/mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Installing collected packages: pygments, mdurl, markdown-it-py, rich
Successfully installed markdown-it-py-2.2.0 mdurl-0.1.2 pygments-2.15.0 rich-13.3.4
</code></pre>
<p>And then we can confirm that the cache was indeed used by looking at the new artifacts in the new repository:</p>
<p><img src="https://tutorials.kodegeek.com/Nexus3OnOrangePI/sonatype_browse_python_proxy.png" alt="New artifacts on the Python_proxy PyPI repository" width="841" height="315" loading="lazy"></p>
<p><em>See the PyPi artifacts</em></p>
<p>Let’s see a demo of the client in action, installing something else:</p>
<p><a target="_blank" href="https://asciinema.org/a/579357"><img src="https://asciinema.org/a/579357.svg" alt="asciicast" width="1406.69666611" height="634.666508" loading="lazy"></a></p>
<h2 id="heading-further-customization-using-the-rest-api">Further Customization Using the REST-API</h2>
<p>Every Nexus installation allows you to download a JSON file that describes the API supported by the server. For example, you can get a copy from my orangepi5.home server like this:</p>
<pre><code class="lang-shell">curl --fail --remote-name http://orangepi5.home:8081/service/rest/swagger.json
</code></pre>
<p>Also, the UI allows you to try the other REST API endpoints to customize your installation.</p>
<p><img src="https://tutorials.kodegeek.com/Nexus3OnOrangePI/api-swagger.png" alt="API Swagger documentation on Nexus 3" width="1498" height="782" loading="lazy"></p>
<p><em>REST API testing</em></p>
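<p>The same file can also be explored from Ansible. The following sketch downloads it and prints the endpoint paths. It assumes the file follows the usual Swagger layout with a top-level 'paths' key, and that your server allows the download without credentials, as the curl example above suggests.</p>
<pre><code class="lang-yaml">- name: Download the API description
  tags: swagger
  ansible.builtin.uri:
    url: "http://orangepi5.home:8081/service/rest/swagger.json"
    method: GET
    return_content: true
    status_code: [ 200 ]
  register: swagger
- name: List the REST API endpoints offered by the server
  tags: swagger
  ansible.builtin.debug:
    # The uri module exposes a parsed JSON response under the 'json' key
    msg: "{{ swagger.json.paths.keys() | list | sort }}"
</code></pre>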
<h2 id="heading-conclusion">Conclusion</h2>
<p>I recommend spending some time and reading the <a target="_blank" href="https://help.sonatype.com/repomanager3">Nexus 3 book</a> to get yourself familiar with the features this tool can offer.</p>
<p>The community prepared <a target="_blank" href="https://github.com/sonatype-nexus-community/nexus-repository-installer">Debian and RPM installers</a>, if you need this kind of setup as opposed to using Ansible.</p>
<p>Nexus 3 <em>has lots</em> of configurable settings. We covered only the surface here. While preparing this article I found '<a target="_blank" href="https://github.com/ansible-ThoTeam/nexus3-oss">ThoTeam Nexus3-oss repository</a>' with a very complete and up-to-date playbook, but it was way more complex than anything I required for my home lab.</p>
<p><a target="_blank" href="https://archiva.apache.org/">Archiva</a> is another Open Source artifact manager. It is more limited in functionality but also simpler to set up.</p>
<p>There is a <a target="_blank" href="https://help.sonatype.com/repomanager3/installation-and-upgrades/post-install-checklist">post-installation checklist</a> with some tasks I did not need to complete for my home lab. Please check it out to make sure your setup is complete.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Provision a Home Lab with Oracle Cloud and Ansible ]]>
                </title>
                <description>
                    <![CDATA[ Imagine for a moment that you have been working hard to set up a website, protected with SSL, and then your hardware fails. This means that unless you have a perfect backup of your machine, you will need to install all the software and configuration files ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/provision-home-lab-with-oracle-cloud-and-ansible/</link>
                <guid isPermaLink="false">66d85145e9c1a2c18adec0be</guid>
                
                    <category>
                        <![CDATA[ Cloud Computing ]]>
                    </category>
                
                    <category>
                        <![CDATA[ servers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ software architecture ]]>
                    </category>
                
                    <category>
                        <![CDATA[ SSL ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Tue, 15 Nov 2022 21:52:50 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2022/11/pexels-pixabay-210158.jpg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Imagine for a moment that you have been working hard to set up a website, protected with SSL, and then your hardware fails. This means that unless you have a perfect backup of your machine, you will need to install all the software and configuration files by hand.</p>
<p>What if it's not just one server but many? The time you need to fix all of them grows with every server, and because it is a manual process it will be more error-prone.</p>
<p>And then the nightmare scenario: You don't have an up-to-date backup, or you have incomplete backups. Or the worst – there are no backups at all. This last case is more common than you think, especially in home labs where you are tinkering and playing around with stuff by yourself.</p>
<p>In this tutorial, I'll show you how you can do a full infrastructure provisioning of a pair of web servers on a Cloud provider, with <a target="_blank" href="https://www.cloudflare.com/learning/ssl/what-is-ssl/">SSL</a> certificates and monitoring metrics with <a target="_blank" href="https://prometheus.io/docs/introduction/overview/">Prometheus</a>.</p>
<h2 id="heading-what-you-need-for-this-setup">What You Need for This Setup</h2>
<p>The first thing you need is a cloud provider. <a target="_blank" href="https://cloud.oracle.com/">Oracle Cloud</a> offers a <em>Free Tier version</em> of their cloud services, which allows you to set up virtual machines for free. This is great for a home lab with lots of rich features that you can use to try new tools and techniques.</p>
<p>You'll also need an automation tool. I used <a target="_blank" href="https://www.ansible.com/">Ansible</a> because it doesn't have many requirements (you only need an SSH daemon and public key authentication to get things going). I also like it because it works equally well regardless of the cloud environment you are trying to provision.</p>
<p>In this tutorial we will use the <a target="_blank" href="https://github.com/ansible/ansible">Open Source version</a> of this tool, as it is more than sufficient for our purposes.</p>
<h3 id="heading-whats-included-in-the-ansible-playbook">What's included in the Ansible playbook</h3>
<p>An <a target="_blank" href="https://docs.ansible.com/ansible/latest/user_guide/playbooks_intro.html">Ansible playbook</a> is nothing more than a set of instructions you define to execute <em>tasks</em> that will change the status of a host. These actions are carried out on an <a target="_blank" href="https://docs.ansible.com/ansible/latest/user_guide/intro_inventory.html">inventory of hosts</a> you define.</p>
<p>Here, you are going to learn about the following:</p>
<ul>
<li><p>How to <a target="_blank" href="https://www.redhat.com/sysadmin/ansible-dynamic-inventories">clean inventory sources</a> by using the proper layout in your playbooks.</p>
</li>
<li><p>How to provision two <a target="_blank" href="https://nginx.org/en/">NGINX</a> instances and request free SSL certificates for them using <a target="_blank" href="https://certbot.eff.org/instructions?ws=nginx&amp;os=pip">Certbot</a>.</p>
</li>
<li><p>How to set up the local Linux firewalls and add a Prometheus node_exporter agent and one scraper to collect that data.</p>
</li>
<li><p>Concepts like variables, roles (with task inclusion), and conditional execution.</p>
</li>
<li><p>Important techniques like task tagging, debug messages, and static validation with <a target="_blank" href="https://www.redhat.com/sysadmin/ansible-lint-YAML">ansible-lint</a>.</p>
</li>
</ul>
<p>All the code can be found in this <a target="_blank" href="https://github.com/josevnz/OracleCloudHomeLab">GitHub repository</a>.</p>
<h2 id="heading-what-you-should-know-before-trying-this">What You Should Know Before Trying This</h2>
<p>Because we will cover several tasks here, you will probably need to be familiar with several things (I'll provide links as we go along):</p>
<ul>
<li><p>This is not an <a target="_blank" href="https://www.redhat.com/sysadmin/ansible-introduction">introductory course on Ansible</a> but more of a "how all things fit together" with a more detailed, but not too complex, playbook.</p>
</li>
<li><p>An <a target="_blank" href="https://www.oracle.com/cloud/free/">OCI Cloud Free Tier</a> account</p>
</li>
<li><p>Privileged account, most likely <a target="_blank" href="https://www.sudo.ws/">SUDO</a></p>
</li>
<li><p>Basic knowledge of <a target="_blank" href="https://www.redhat.com/sysadmin/firewalld-rules-and-scenarios">TCP/IP and firewalls with firewalld</a></p>
</li>
<li><p>How to use <a target="_blank" href="https://www.redhat.com/sysadmin/how-manage-packages">RPM</a> and how to <a target="_blank" href="https://www.redhat.com/sysadmin/package-linux-applications-rpm">package</a> applications (we will not do that here, but it helps to understand when an RPM is better than a complex task in Ansible)</p>
</li>
</ul>
<h3 id="heading-what-is-not-included-here">What is not included here</h3>
<p>OCI Cloud has a <a target="_blank" href="https://docs.oracle.com/en-us/iaas/api/#/en/network-firewall/20211001/NetworkFirewallPolicy/UpdateNetworkFirewallPolicy">complete REST API</a> to manage a lot of aspects of their cloud environment. <a target="_blank" href="https://docs.oracle.com/en-us/iaas/Content/API/Concepts/sdkconfig.htm#SDK_and_CLI_Configuration_File">Their setup page</a> (specifically the SDK) is also very detailed.</p>
<h2 id="heading-youll-probably-do-things-differently-in-production">You'll Probably Do Things Differently in Production</h2>
<h3 id="heading-installing-the-oci-metrics-datasource-instead-of-prometheus-agents-on-a-virtual-machine">Installing the OCI-Metrics-datasource instead of Prometheus agents on a virtual machine</h3>
<p>You can go to <a target="_blank" href="https://grafana.com/grafana/plugins/oci-metrics-datasource/?tab=installation">this page</a> to install it on your Grafana instance (bare metal or cloud). You also need to <a target="_blank" href="https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/grafana.htm">set up your credentials and permissions as explained here</a>.</p>
<p>This is probably the most efficient way to monitor your resources as you do not need to run agents on your virtual machines. But I will install instead a <a target="_blank" href="https://github.com/prometheus/node_exporter">Prometheus node_exporter agent</a> and <a target="_blank" href="https://github.com/prometheus/prometheus">scraper</a> that will be visible from a <a target="_blank" href="https://grafana.com/">Grafana Cloud</a> instance.</p>
<h3 id="heading-an-exposed-prometheus-on-the-internet-endpoint-is-not-a-good-idea">An exposed Prometheus on the Internet endpoint is not a good idea</h3>
<p>To be clear, I'm exposing my Prometheus scraper to the Internet so Grafana Cloud can reach it. On an Intranet with a private cloud and your local Grafana, this is not an issue, but here a Prometheus agent pushing data to Grafana would be a better option.</p>
<p>Still, Grafana provides a <a target="_blank" href="https://grafana.com/docs/grafana-cloud/reference/allow-list/">list of public IP addresses</a> that you can use to set up your allow list.</p>
<p>So the following will work:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/11/oracle_cloud_ingress_rules.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Oracle Cloud Ingress Rules</em></p>
<p>But it is not the best. Instead, you want to restrict the specific IP addresses that can pull data from your exposed services. The Prometheus node exporter, which listens on port 9100, can be completely hidden from Grafana. We only need to expose the Prometheus scraper, which listens on port 9090 (the 'prometheus_port' value used in this playbook).</p>
<p>For this home lab, it is not a big deal having such services fully exposed. But if you have a server with sensitive data, you must restrict who can reach the service!</p>
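<p>On the virtual machine side, one way to express such a restriction is a firewalld rich rule per allowed address. The task below is a minimal sketch rather than part of this playbook. It assumes the 'ansible.posix' collection is installed and uses a hypothetical list variable, 'grafana_allowed_ips', that you would fill with the addresses published by Grafana. The Oracle Cloud ingress rules shown above still need to be narrowed separately.</p>
<pre><code class="lang-yaml">- name: Allow the Prometheus scraper only from known addresses
  tags: firewall_prometheus
  ansible.posix.firewalld:
    # One rich rule per allowed source address
    rich_rule: 'rule family="ipv4" source address="{{ item }}" port port="{{ prometheus_port }}" protocol="tcp" accept'
    permanent: true
    immediate: true
    state: enabled
  # Hypothetical variable: a list of IPv4 addresses or CIDR ranges
  loop: "{{ grafana_allowed_ips }}"
</code></pre>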
<p>An alternative to the Prometheus endpoint is to push the data to Grafana <a target="_blank" href="https://grafana.com/docs/agent/latest/">by using a Grafana agent</a>, but I will not cover that option here.</p>
<h1 id="heading-playbook-analysis">Playbook Analysis</h1>
<p>Ansible lets you have a single file with the playbook instructions, but eventually you will find that such a structure is difficult to maintain.</p>
<p>For my playbook I decided to keep the suggested structure:</p>
<pre><code class="lang-shell">tree -A 
.
├── inventory
│   └── cloud.yaml
├── oracle.yaml
├── roles
│   └── oracle
│       ├── files
│       │   ├── logrotate_prometheus-node-exporter
│       │   ├── prometheus-node-exporter
│       │   └── requirements_certboot.txt
│       ├── handlers
│       │   └── main.yaml
│       ├── meta
│       ├── tasks
│       │   ├── controller.yaml
│       │   ├── main.yaml
│       │   ├── metrics.yaml
│       │   └── nginx.yaml
│       ├── templates
│       │   ├── prometheus-node-exporter.service
│       │   ├── prometheus.service
│       │   └── prometheus.yaml
│       └── vars
│           └── main.yaml
└── site.yaml
</code></pre>
<p>Below is a brief description of how the content is organized:</p>
<ol>
<li><p>You can have more than one site. You control that inside the site.yaml file.</p>
</li>
<li><p>The host list is inside the inventory directory. You can have more than one inventory file or scripts to generate the host list, or a combination of both.</p>
</li>
<li><p>The roles/oracle directory groups the tasks. We only have one role called 'oracle' because that's the cloud provider I'm focusing on here.</p>
</li>
<li><p>Our playbook uses metadata in the form of variables, with each one defined in the 'vars' directory. That way we can customize the behaviour of the playbook in multiple places:</p>
</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-meta">---</span>
<span class="hljs-comment"># Common variables for my Oracle Cloud environments</span>
<span class="hljs-attr">controller_host:</span> <span class="hljs-string">XXXX.com</span>
<span class="hljs-attr">ssl_maintainer_email:</span> <span class="hljs-string">YYYYYY@ZZZZ.com</span>
<span class="hljs-attr">architecture:</span> <span class="hljs-string">arm64</span>
<span class="hljs-attr">prometheus_version:</span> <span class="hljs-number">2.38</span><span class="hljs-number">.0</span>
<span class="hljs-attr">prometheus_port:</span> <span class="hljs-number">9090</span>
<span class="hljs-attr">prometheus_node_exporter_nodes:</span> <span class="hljs-string">"['X-server1:<span class="hljs-template-variable">{{ node_exporter_port }}</span>', 'Y-server2:<span class="hljs-template-variable">{{ node_exporter_port }}</span>' ]"</span>
<span class="hljs-attr">node_exporter_version:</span> <span class="hljs-number">1.4</span><span class="hljs-number">.0</span>
<span class="hljs-attr">node_exporter_port:</span> <span class="hljs-number">9100</span>
<span class="hljs-attr">internal_network:</span> <span class="hljs-string">QQ.0.0.0/24</span>
</code></pre>
<p>The roles/oracle/files directory contains files that can be copied as is to the remote host. The templates directory is similar, but the files in there can be customized for each host by using the <a target="_blank" href="https://docs.ansible.com/ansible/latest/user_guide/playbooks_templating.html">Jinja templating language</a>.</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># A template for the prometheus scraper configuration file</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">global:</span>
    <span class="hljs-attr">scrape_interval:</span> <span class="hljs-string">30s</span>
    <span class="hljs-attr">evaluation_interval:</span> <span class="hljs-string">30s</span>
    <span class="hljs-attr">scrape_timeout:</span> <span class="hljs-string">10s</span>
    <span class="hljs-attr">external_labels:</span>
        <span class="hljs-attr">monitor:</span> <span class="hljs-string">'oracle-cloud-metrics'</span>

<span class="hljs-attr">scrape_configs:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">job_name:</span> <span class="hljs-string">'node-exporter'</span>
    <span class="hljs-attr">static_configs:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">targets:</span> {{ <span class="hljs-string">prometheus_node_exporter_nodes</span> }}
    <span class="hljs-attr">tls_config:</span>
      <span class="hljs-attr">insecure_skip_verify:</span> <span class="hljs-literal">true</span>
</code></pre>
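<p>With the variables shown earlier, the 'targets' line of this template renders as a list of two host:port entries, both using port 9100. A task to deploy a template like this one could look like the sketch below. The destination path, the location of promtool and the handler name are all hypothetical, chosen only for illustration. The 'validate' option runs the given command against the rendered file before it replaces the old one.</p>
<pre><code class="lang-yaml">- name: Deploy the Prometheus scraper configuration
  tags: prometheus_config
  ansible.builtin.template:
    src: prometheus.yaml
    # Hypothetical destination, adjust to your Prometheus installation
    dest: /etc/prometheus/prometheus.yaml
    mode: "u=rw,g=r,o=r"
    # Hypothetical promtool location, %s is replaced with the rendered file
    validate: /opt/prometheus/promtool check config %s
  # Hypothetical handler name, it must match one defined in handlers/main.yaml
  notify: Restart prometheus
</code></pre>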
<p>The 'tasks' directory is where we store our tasks, that is, the actions that will modify the server state. Note that Ansible will not change anything when the host is already in the desired state. The idea is that you can re-run a playbook as many times as needed and the final state will be the same.</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># Fragment of the nginx tasks file. See how we notify a handler to restart nginx after the SSL certificate is renewed.</span>
<span class="hljs-meta">---</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Copy</span> <span class="hljs-string">requirements</span> <span class="hljs-string">file</span>
  <span class="hljs-attr">ansible.builtin.copy:</span>
    <span class="hljs-attr">src:</span> <span class="hljs-string">requirements_certboot.txt</span>
    <span class="hljs-attr">dest:</span> <span class="hljs-string">/opt/requirements_certboot.txt</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">certbot_requirements</span>

<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Setup</span> <span class="hljs-string">Certbot</span>
  <span class="hljs-attr">pip:</span>
    <span class="hljs-attr">requirements:</span> <span class="hljs-string">/opt/requirements_certboot.txt</span>
    <span class="hljs-attr">virtualenv:</span> <span class="hljs-string">/opt/certbot/</span>
    <span class="hljs-attr">virtualenv_site_packages:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">virtualenv_command:</span> <span class="hljs-string">/usr/bin/python3</span> <span class="hljs-string">-m</span> <span class="hljs-string">venv</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">certbot_env</span>

<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Get</span> <span class="hljs-string">SSL</span> <span class="hljs-string">certificate</span>
  <span class="hljs-attr">command:</span>
    <span class="hljs-attr">argv:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">/opt/certbot/bin/certbot</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">--nginx</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">--agree-tos</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">-m</span> {{ <span class="hljs-string">ssl_maintainer_email</span> }}
      <span class="hljs-bullet">-</span> <span class="hljs-string">-d</span> {{ <span class="hljs-string">inventory_hostname</span> }}
      <span class="hljs-bullet">-</span> <span class="hljs-string">--non-interactive</span>
  <span class="hljs-attr">notify:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">Restart</span> <span class="hljs-string">Nginx</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">certbot_install</span>
</code></pre>
<p>There is one special directory called <code>handlers</code>. There we define actions that must run when a task changes the state of our host and notifies them, such as the <code>Restart Nginx</code> handler referenced above.</p>
<p>We now have a picture of how all the pieces work together, so let's talk about some specific details.</p>
<h3 id="heading-firewall-provisioning">Firewall provisioning</h3>
<p>With Ansible, you can replace a sequence of commands like this:</p>
<pre><code class="lang-shell">sudo firewall-cmd --permanent --zone=public --add-service=http
sudo firewall-cmd --permanent --zone=public --add-service=https
sudo firewall-cmd --reload
</code></pre>
<p>With a <a target="_blank" href="https://docs.ansible.com/ansible/latest/collections/ansible/posix/firewalld_module.html">firewalld module</a>:</p>
<pre><code class="lang-yaml"><span class="hljs-meta">---</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Enable</span> <span class="hljs-string">HTTP</span> <span class="hljs-string">at</span> <span class="hljs-string">the</span> <span class="hljs-string">Linux</span> <span class="hljs-string">firewall</span>
  <span class="hljs-attr">firewalld:</span>
    <span class="hljs-attr">zone:</span> <span class="hljs-string">public</span>
    <span class="hljs-attr">service:</span> <span class="hljs-string">http</span>
    <span class="hljs-attr">permanent:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">state:</span> <span class="hljs-string">enabled</span>
    <span class="hljs-attr">immediate:</span> <span class="hljs-literal">yes</span>
  <span class="hljs-attr">notify:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">Reload</span> <span class="hljs-string">firewall</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">firewalld_https</span>

<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Enable</span> <span class="hljs-string">HTTPS</span> <span class="hljs-string">at</span> <span class="hljs-string">the</span> <span class="hljs-string">Linux</span> <span class="hljs-string">firewall</span>
  <span class="hljs-attr">firewalld:</span>
    <span class="hljs-attr">zone:</span> <span class="hljs-string">public</span>
    <span class="hljs-attr">service:</span> <span class="hljs-string">https</span>
    <span class="hljs-attr">permanent:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">state:</span> <span class="hljs-string">enabled</span>
    <span class="hljs-attr">immediate:</span> <span class="hljs-literal">yes</span>
  <span class="hljs-attr">notify:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">Reload</span> <span class="hljs-string">firewall</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">firewalld_https</span>
</code></pre>
<h3 id="heading-common-tasks-have-nice-replacements"><strong>Common tasks have nice replacements</strong></h3>
<p>So instead of running privileged commands with <code>sudo</code>:</p>
<pre><code class="lang-shell">sudo dnf install -y nginx
sudo systemctl enable nginx.service --now
</code></pre>
<p>You can have something like this:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># oracle.yaml file, which tells which roles to call, included from site.yaml</span>
<span class="hljs-meta">---</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">hosts:</span> <span class="hljs-string">oracle</span>
  <span class="hljs-attr">serial:</span> <span class="hljs-number">2</span>
  <span class="hljs-attr">remote_user:</span> <span class="hljs-string">opc</span>
  <span class="hljs-attr">become:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">become_user:</span> <span class="hljs-string">root</span>
  <span class="hljs-attr">roles:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-string">oracle</span>
<span class="hljs-comment"># NGINX task (roles/oracle/tasks/nginx.yaml)</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Ensure</span> <span class="hljs-string">nginx</span> <span class="hljs-string">is</span> <span class="hljs-string">at</span> <span class="hljs-string">the</span> <span class="hljs-string">latest</span> <span class="hljs-string">version</span>
  <span class="hljs-attr">dnf:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">nginx</span> <span class="hljs-string">&gt;=</span> <span class="hljs-number">1.14</span><span class="hljs-number">.1</span>
    <span class="hljs-attr">state:</span> <span class="hljs-string">present</span>
    <span class="hljs-attr">update_cache:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">install_nginx</span>
<span class="hljs-comment"># And a handler that will restart NGINX after it gets modified (handlers/main.yaml)</span>
<span class="hljs-meta">---</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Restart</span> <span class="hljs-string">Nginx</span>
  <span class="hljs-attr">ansible.builtin.service:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">nginx</span>
    <span class="hljs-attr">state:</span> <span class="hljs-string">restarted</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Reload</span> <span class="hljs-string">firewall</span>
  <span class="hljs-attr">ansible.builtin.systemd:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">firewalld.service</span>
    <span class="hljs-attr">state:</span> <span class="hljs-string">reloaded</span>
</code></pre>
<h2 id="heading-how-to-run-the-playbooks">How to Run the Playbooks</h2>
<p>Normally you don't wait until the whole playbook is written. Instead, you run the pieces you need, in the proper order, as you write them. At some point you will have your whole playbook finished and ready to go.</p>
<h3 id="heading-make-sure-the-playbook-behaves-properly-with-check-before-making-any-changes">Make sure the playbook behaves properly with <code>--check</code> before making any changes</h3>
<p>The very first step is to check your playbook file for errors. For that you can use yamllint:</p>
<pre><code class="lang-shell">yamllint roles/oracle/tasks/main.yaml
</code></pre>
<p>But doing this for every YAML file in your playbook can be tedious and error-prone. As an alternative, you can run the playbook in 'dry-run' mode with <code>--check</code> to see what would happen without actually making any changes:</p>
<p><a target="_blank" href="https://asciinema.org/a/537302"><img src="https://asciinema.org/a/537302.svg" alt="asciicast" width="1642.54999935" height="466.66655000000003" loading="lazy"></a></p>
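<p>In case the recording is not available, the dry run could look something like the sketch below. It assumes the same <code>inventory</code> file and <code>site.yaml</code> playbook used elsewhere in this tutorial. The <code>--diff</code> flag is optional and prints the file changes Ansible would make:</p>
<pre><code class="lang-shell">inventory=inventory
playbook=site.yaml
# --check reports what would change without touching the hosts
ansible-playbook --inventory "$inventory" --check --diff "$playbook" || echo "Dry run failed: fix the errors before applying changes"
</code></pre>
<p>Keep in mind that tasks built on <code>command</code> or <code>shell</code> are normally skipped in check mode, because Ansible cannot predict what they would do, so a clean dry run is a good sign rather than a guarantee.</p>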
<p>Another way to gradually test a complex playbook is by executing a specific task by using a tag or group of tags. That way you can do controlled execution of your playbook:</p>
<p><em>Keep in mind that this will not execute any dependencies that you may have defined in your playbook, though</em>:</p>
<p><a target="_blank" href="https://asciinema.org/a/537303"><img src="https://asciinema.org/a/537303.svg" alt="asciicast" width="1642.54999935" height="466.66655000000003" loading="lazy"></a></p>
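<p>A tag-limited run might look like the following sketch. It uses the <code>certbot_requirements</code> tag from the task file shown earlier, and assumes the same <code>inventory</code> and <code>site.yaml</code> names as the rest of the tutorial. The <code>--list-tags</code> flag prints the tags a playbook defines without running anything:</p>
<pre><code class="lang-shell">tags=certbot_requirements
# Show every tag defined in the playbook, then run only the selected one
ansible-playbook --inventory inventory --list-tags site.yaml || echo "Could not list the tags"
ansible-playbook --inventory inventory --tags "$tags" site.yaml || echo "Tagged run failed"
</code></pre>
<p>Several tags can be combined by separating them with commas.</p>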
<h3 id="heading-use-ansible-lint-when-ansible-playbook-check-is-not-good-enough"><strong>Use Ansible-lint when ansible-playbook --check is not good enough</strong></h3>
<p>Some errors are more subtle and will not get caught with <code>ansible-playbook --check</code>. To get a more complete check on your playbooks before minor issues become a headache you can use <a target="_blank" href="https://ansible-lint.readthedocs.io/philosophy/">ansible-lint</a>. So let's get it installed:</p>
<pre><code class="lang-shell">python3 -m venv ~/virtualenv/ansiblelint &amp;&amp; . ~/virtualenv/ansiblelint/bin/activate
pip install --upgrade pip
pip install --upgrade wheel
pip install ansible-lint
</code></pre>
<p>Now we can check the playbook:</p>
<pre><code class="lang-shell">(ansiblelint) [josevnz@dmaf5 OracleCloudHomeLab]$ ansible-lint site.yaml 
WARNING  Overriding detected file kind 'yaml' with 'playbook' for given positional argument: site.yaml
WARNING  Listing 1 violation(s) that are fatal
syntax-check[specific]: couldn't resolve module/action 'firewalld'. This often indicates a misspelling, missing collection, or incorrect module path.
roles/oracle/tasks/nginx.yaml:2:3
</code></pre>
<p>Strange, firewalld is available on our Ansible installation. What else was installed by ansible-lint?</p>
<pre><code class="lang-shell">(ansiblelint) [josevnz@dmaf5 OracleCloudHomeLab]$ ansible --version
ansible [core 2.14.0]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/josevnz/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/josevnz/virtualenv/ansiblelint/lib64/python3.9/site-packages/ansible
  ansible collection location = /home/josevnz/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/josevnz/virtualenv/ansiblelint/bin/ansible
  python version = 3.9.9 (main, Nov 19 2021, 00:00:00) [GCC 10.3.1 20210422 (Red Hat 10.3.1-1)] (/home/josevnz/virtualenv/ansiblelint/bin/python3)
  jinja version = 3.1.2
  libyaml = True
</code></pre>
<p>Ansible-lint installed its own copy of ansible-core inside the virtual environment, and that copy does not include the <a target="_blank" href="https://docs.ansible.com/ansible/latest/collections/ansible/posix/firewalld_module.html">ansible.posix collection</a> that provides firewalld. We will use <a target="_blank" href="https://docs.ansible.com/ansible/latest/cli/ansible-galaxy.html">Ansible Galaxy</a> to install it:</p>
<pre><code class="lang-shell">(ansiblelint) [josevnz@dmaf5 OracleCloudHomeLab]$ which ansible-galaxy
~/virtualenv/ansiblelint/bin/ansible-galaxy
(ansiblelint) [josevnz@dmaf5 OracleCloudHomeLab]$ ansible-galaxy collection install ansible.posix
Starting galaxy collection install process
Process install dependency map
Starting collection install process
Downloading https://galaxy.ansible.com/download/ansible-posix-1.4.0.tar.gz to /home/josevnz/.ansible/tmp/ansible-local-18099xpw_8usc/tmp8msc9uf5/ansible-posix-1.4.0-_f17f525
Installing 'ansible.posix:1.4.0' to '/home/josevnz/.ansible/collections/ansible_collections/ansible/posix'
ansible.posix:1.4.0 was installed successfully
</code></pre>
<p>Running it again:</p>
<pre><code class="lang-shell">(ansiblelint) [josevnz@dmaf5 OracleCloudHomeLab]$ ansible-lint site.yaml 
WARNING  Overriding detected file kind 'yaml' with 'playbook' for given positional argument: site.yaml
WARNING  Listing 50 violation(s) that are fatal
name[play]: All plays should be named. (warning)
oracle.yaml:2

fqcn[action-core]: Use FQCN for builtin module actions (service).
roles/oracle/handlers/main.yaml:2 Use `ansible.builtin.service` or `ansible.legacy.service` instead.

fqcn[action-core]: Use FQCN for builtin module actions (command).
roles/oracle/handlers/main.yaml:6 Use `ansible.builtin.command` or `ansible.legacy.command` instead.
</code></pre>
<p>Some warnings are pedantic ('Use FQCN for builtin module actions (command)') and others require attention ('Commands should not change things if nothing needs doing').</p>
<p>Ansible-lint found many smells in the playbook. It has an option to rewrite the files and correct some of these errors automatically:</p>
<p><a target="_blank" href="https://asciinema.org/a/538053"><img src="https://asciinema.org/a/538053.svg" alt="asciicast" width="1617.2799993600001" height="671.9998320000001" loading="lazy"></a></p>
<p>There are some guidelines you can <a target="_blank" href="https://ansible-lint.readthedocs.io/profiles/#production">follow to correct these issues</a>. Below are some that can be directly applied to the warnings we got earlier:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/11/ansible_error_fixes.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Not all the errors are easy to solve. Some commands decide on their own whether they should make changes, but have a hard time communicating that back to Ansible:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Get</span> <span class="hljs-string">SSL</span> <span class="hljs-string">certificate</span>
  <span class="hljs-attr">ansible.builtin.shell:</span>
    <span class="hljs-attr">argv:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">/opt/certbot/bin/certbot</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">--nginx</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">--agree-tos</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">-m</span> <span class="hljs-string">"<span class="hljs-template-variable">{{ ssl_maintainer_email }}</span>"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">-d</span> <span class="hljs-string">"<span class="hljs-template-variable">{{ inventory_hostname }}</span>"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">--non-interactive</span>
  <span class="hljs-attr">notify:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">Restart</span> <span class="hljs-string">Nginx</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">certbot_install</span>
</code></pre>
<p>In our case, certbot prints a message if the certificate is not yet due for renewal. If that message is missing from the output, the task reports a change, which triggers the Nginx restart (see <a target="_blank" href="https://docs.ansible.com/ansible/latest/user_guide/playbooks_error_handling.html#defining-changed">defining changed</a>):</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Get</span> <span class="hljs-string">SSL</span> <span class="hljs-string">certificate</span>
  <span class="hljs-attr">ansible.builtin.shell:</span>
    <span class="hljs-attr">argv:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">/opt/certbot/bin/certbot</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">--nginx</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">--agree-tos</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">-m</span> {{ <span class="hljs-string">ssl_maintainer_email</span> }}
      <span class="hljs-bullet">-</span> <span class="hljs-string">-d</span> {{ <span class="hljs-string">inventory_hostname</span> }}
      <span class="hljs-bullet">-</span> <span class="hljs-string">--non-interactive</span>
  <span class="hljs-attr">register:</span> <span class="hljs-string">certbot_output</span> <span class="hljs-comment"># Registers the certbot output.</span>
  <span class="hljs-attr">changed_when:</span> 
    <span class="hljs-bullet">-</span> <span class="hljs-string">'"Certificate not yet due for renewal" not in certbot_output.stdout'</span>
  <span class="hljs-attr">notify:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">Restart</span> <span class="hljs-string">Nginx</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">certbot_install</span>
</code></pre>
<p>I do want to use shell, as I need to expand the variable for certbot, but ansible-lint is still not happy:</p>
<pre><code class="lang-shell">(ansiblelint) [josevnz@dmaf5 OracleCloudHomeLab]$ ansible-lint site.yaml
WARNING  Overriding detected file kind 'yaml' with 'playbook' for given positional argument: site.yaml
WARNING  Listing 1 violation(s) that are fatal
command-instead-of-shell: Use shell only when shell functionality is required.
roles/oracle/tasks/nginx.yaml:47 Task/Handler: Get SSL certificate

You can skip specific rules or tags by adding them to your configuration file:
# .config/ansible-lint.yml
warn_list:  # or 'skip_list' to silence them completely
  - command-instead-of-shell  # Use shell only when shell functionality is required.

                   Rule Violation Summary                    
 count tag                      profile rule associated tags 
     1 command-instead-of-shell basic   command-shell, idiom 

Failed after min profile: 1 failure(s), 0 warning(s) on 8 files.
</code></pre>
<p>Time to treat this error as a warning, as I know it is not a real issue, by creating a <code>.config/ansible-lint.yml</code> file and running ansible-lint again:</p>
<pre><code class="lang-shell">(ansiblelint) [josevnz@dmaf5 OracleCloudHomeLab]$ ansible-lint site.yaml
WARNING  Overriding detected file kind 'yaml' with 'playbook' for given positional argument: site.yaml
WARNING  Listing 1 violation(s) that are fatal
command-instead-of-shell: Use shell only when shell functionality is required. (warning)
roles/oracle/tasks/nginx.yaml:47 Task/Handler: Get SSL certificate


                        Rule Violation Summary                         
 count tag                      profile rule associated tags           
     1 command-instead-of-shell basic   command-shell, idiom (warning) 

Passed with min profile: 0 failure(s), 1 warning(s) on 8 files.
</code></pre>
<p>Much better now: the violation is reported as a warning instead of an error.</p>
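<p>For reference, a configuration file along these lines produces the result above. This is a minimal sketch based on the hint that ansible-lint printed earlier. It assumes you run it from the top of the playbook directory:</p>
<pre><code class="lang-shell">mkdir -p .config
# warn_list reports the rule as a warning; skip_list would silence it completely
printf '%s\n' 'warn_list:' '  - command-instead-of-shell' | tee .config/ansible-lint.yml
</code></pre>
<p>Listing only the rules you have reviewed keeps the rest of the checks strict.</p>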
<h3 id="heading-jinja-best-practices"><strong>Jinja best practices</strong></h3>
<p>If you plan to use variables and Jinja templates, make sure you quote the values that contain them (example: <code>dest: "/opt/prometheus-{{ prometheus_version }}.linux-{{ architecture }}.tar.gz"</code>).</p>
<h3 id="heading-constrain-where-the-playbook-runs-with-limit-and-tags">Constrain where the playbook runs with <code>--limit</code> and <code>--tags</code></h3>
<p>Say that you are only interested in running your playbook on a certain host. You can do that by using the <code>--limit</code> flag:</p>
<pre><code class="lang-shell">ansible-playbook --inventory inventory --limit fido.yourcompany.com --tags certbot_renew site.yaml
</code></pre>
<p><a target="_blank" href="https://asciinema.org/a/537304"><img src="https://asciinema.org/a/537304.svg" alt="asciicast" width="1642.54999935" height="466.66655000000003" loading="lazy"></a></p>
<p>Here we ran only the tasks tagged <code>certbot_renew</code> on the host <code>fido.yourcompany.com</code>.</p>
<h3 id="heading-how-to-deal-with-a-real-issue">How to deal with a real issue</h3>
<p>Let's make this interesting: say that I was eager to update one of my requirements for certbot, and I changed the version of pip to '22.3.1':</p>
<pre><code class="lang-text">pip==22.3.1
wheel==0.38.4
certbot==1.32.0
certbot-nginx==1.32.0
</code></pre>
<p>When I run the playbook, it fails:</p>
<p><a target="_blank" href="https://asciinema.org/a/537318"><img src="https://asciinema.org/a/537318.svg" alt="asciicast" width="1642.54999935" height="466.66655000000003" loading="lazy"></a></p>
<p>This is an issue with the versions specified in the <a target="_blank" href="https://github.com/josevnz/OracleCloudHomeLab/blob/main/roles/oracle/files/requirements_certboot.txt">requirements_certboot.txt</a> file. When you install a Python library using a virtual environment you can specify versions like this:</p>
<pre><code class="lang-text">pip==22.3.1
wheel==0.38.1
certbot==1.23.0
certbot-nginx==1.23.0
</code></pre>
<p>To fix the issue, we will revert the versions used in the file and then re-run the tasks that copy the requirements file and install Certbot:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Setup</span> <span class="hljs-string">Certbot</span>
  <span class="hljs-attr">pip:</span>
    <span class="hljs-attr">requirements:</span> <span class="hljs-string">/opt/requirements_certboot.txt</span>
    <span class="hljs-attr">virtualenv:</span> <span class="hljs-string">/opt/certbot/</span>
    <span class="hljs-attr">virtualenv_site_packages:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">virtualenv_command:</span> <span class="hljs-string">/usr/bin/python3</span> <span class="hljs-string">-m</span> <span class="hljs-string">venv</span>
    <span class="hljs-attr">state:</span> <span class="hljs-string">forcereinstall</span>
  <span class="hljs-attr">tags:</span> <span class="hljs-string">certbot_env</span>
</code></pre>
<pre><code class="lang-shell">ansible-playbook --inventory inventory --tags certbot_requirements,certbot_env site.yaml
</code></pre>
<p>See it in action:</p>
<p><a target="_blank" href="https://asciinema.org/a/537320"><img src="https://asciinema.org/a/537320.svg" alt="asciicast" width="1886.82666592" height="466.66655000000003" loading="lazy"></a></p>
<h3 id="heading-how-to-run-the-whole-playbook">How to run the whole playbook</h3>
<p>It is time to run the whole playbook:</p>
<pre><code class="lang-shell">ansible-playbook --inventory inventory site.yaml
</code></pre>
<p><a target="_blank" href="https://asciinema.org/a/537322"><img src="https://asciinema.org/a/537322.svg" alt="asciicast" width="1886.82666592" height="466.66655000000003" loading="lazy"></a></p>
<h2 id="heading-wrapping-up">Wrapping up</h2>
<p>This tutorial only scratches the surface of what you can do with Ansible. Below are a few more resources you should explore to learn more:</p>
<ul>
<li><p>Improving inventories: <a target="_blank" href="https://www.redhat.com/sysadmin/ansible-dynamic-inventories">How to create dynamic inventory files in Ansible</a>, <a target="_blank" href="https://www.redhat.com/sysadmin/ansible-dynamic-inventory-python">How to write a Python script to create dynamic Ansible inventories</a>, <a target="_blank" href="https://www.redhat.com/sysadmin/ansible-plugin-inventory-files">How to write an Ansible plugin to create inventory files</a></p>
</li>
<li><p>Sometimes your playbooks will run slow, and you may need to <a target="_blank" href="https://www.redhat.com/sysadmin/ansible-callback-plugins-metrics">Assess resource consumption with Ansible callback plugins</a>.</p>
</li>
<li><p>And there will be a time <a target="_blank" href="https://docs.ansible.com/ansible/latest/user_guide/playbooks_debugger.html">when deeper debugging</a> is needed.</p>
</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Recognize a Phishing Email – And What to Do When You Get One ]]>
                </title>
                <description>
                    <![CDATA[ You know the drill: you open your email client and there it is, an email saying that you will be in trouble if you do not follow certain instructions in a short time, no questions asked. All it takes is a single click, and you're in trouble. This kind o... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-recognize-phishing-email/</link>
                <guid isPermaLink="false">66d85143ec0a9800d5b8e6e8</guid>
                
                    <category>
                        <![CDATA[ cybersecurity ]]>
                    </category>
                
                    <category>
                        <![CDATA[ information security ]]>
                    </category>
                
                    <category>
                        <![CDATA[ #infosec ]]>
                    </category>
                
                    <category>
                        <![CDATA[ phishing ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Wed, 12 Oct 2022 00:52:32 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1725458523382/ab4b959e-8c84-4e48-88a5-bbc716255d1b.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>You know the drill: you open your email client and there it is, an email saying that you will be in trouble if you do not follow certain instructions in a short time, no questions asked.</p>
<p>All it takes is a single click, and you're in trouble.</p>
<p>This kind of email has a very <a target="_blank" href="https://www.phishing.org/what-is-phishing">clear definition</a>:</p>
<blockquote>
<p><a target="_blank" href="https://www.knowbe4.com/phishing?hsLang=en">Phishing</a> is a <a target="_blank" href="https://www.merriam-webster.com/dictionary/cybercrime">cybercrime</a> in which a target or targets are contacted by email, telephone or text message by someone posing as a legitimate institution to lure individuals into providing sensitive data such as personally identifiable information, banking and credit card details, and passwords.</p>
</blockquote>
<p>In this article, I'll explain what phishing is and how to recognize the signs that an email may not be legit. For that, we will learn to do the following:</p>
<ul>
<li><p>Recognize some obvious flags of a phishing email</p>
</li>
<li><p>Use some command-line tools on Linux to carefully inspect suspicious links</p>
</li>
<li><p>Analyze the suspicious emails with several free online tools</p>
</li>
</ul>
<p>All this while having some fun.</p>
<h2 id="heading-example-of-a-phishing-email">Example of a Phishing Email</h2>
<p>Let me share a quite clever example email (some details have been changed to protect the innocent):</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/10/godaddy_phishing_emails.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Phishing email pretending to be GoDaddy</em></p>
<p>Let me show you how you can quickly spot scammers, without using a single line of code.</p>
<p>You will need the following to go through some of the steps of this tutorial:</p>
<ul>
<li><p>A Linux installation, with <a target="_blank" href="https://curl.se/">curl</a> installed.</p>
</li>
<li><p>A Web browser (Brave or Firefox are good choices)</p>
</li>
<li><p><strong>Curiosity</strong></p>
</li>
</ul>
<p>Now let's move on and see what we've got in our mailbox...</p>
<h2 id="heading-common-sense-phishing-red-flags">Common Sense Phishing Red Flags</h2>
<p>Right out of the box, this email violates two simple rules, despite having proper grammar and nice presentation:</p>
<p>First of all, it <strong>forces you to act immediately to fix an issue</strong> (Urgent action required), <strong>no questions asked</strong> (Click the nice button).</p>
<p>To make it worse, there's no way to verify that the person contacting you really works for the company. Reputable companies ask you to log into their website and offer a case # so you can track the issue. Neither of those is here.</p>
<p>Second, despite their best efforts, <strong>scammers make qualitative mistakes</strong>. Do you see that <em>customer #</em> on the upper right part of the screenshot? I compared it to mine on the real website and guess what? It's a different number.</p>
<p>But where is the fun of analyzing this if we cannot do even a little bit of poking? Well, when I moved my mouse over the button image I could see the link, and it was pointing to TinyURL (a URL shortening service):</p>
<pre><code class="lang-text">https://tinyurl.com/xszszasxdxdxdxdxdxdxdzs?a=xxx@xxxx.com
</code></pre>
<p>So whoever is doing this is trying to conceal the real URL. No problem: copy the URL address (<strong>never click it</strong>), change the email part of the GET request to some garbage (<code>?a=xxx@xxx.com</code>), and then run it through curl. I got this:</p>
<pre><code class="lang-html"><span class="hljs-tag">&lt;<span class="hljs-name">table</span> <span class="hljs-attr">width</span>=<span class="hljs-string">"75%"</span> <span class="hljs-attr">bgcolor</span>=<span class="hljs-string">"#FFFFFF"</span> <span class="hljs-attr">align</span>=<span class="hljs-string">"center"</span> <span class="hljs-attr">cellpadding</span>=<span class="hljs-string">"10"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">tr</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">h2</span>&gt;</span>URL Terminated<span class="hljs-tag">&lt;/<span class="hljs-name">h2</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>
                    The TinyURL (xszszasxdxdxdxdxdxdxdzs) you visited was used by its creator in violation of our terms of use.
                    TinyURL has a strict no abuse policy and we apologize for the intrusion this user has caused you.
                    Such violations of our terms of use include:
                <span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">ul</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">li</span>&gt;</span>Spam - Unsolicited Bulk E-mail<span class="hljs-tag">&lt;/<span class="hljs-name">li</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">li</span>&gt;</span>Fraud or Money Making scams<span class="hljs-tag">&lt;/<span class="hljs-name">li</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">li</span>&gt;</span>Malware<span class="hljs-tag">&lt;/<span class="hljs-name">li</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">li</span>&gt;</span>or any other use that is illegal.<span class="hljs-tag">&lt;/<span class="hljs-name">li</span>&gt;</span>
                <span class="hljs-tag">&lt;/<span class="hljs-name">ul</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>
</code></pre>
<p>So the good people from Tiny URL noticed this too and terminated the URL. Nice work!</p>
<p><a target="_blank" href="https://asciinema.org/a/526911"><img src="https://asciinema.org/a/526911.svg" alt="asciicast" width="2425.91999904" height="746.6664800000001" loading="lazy"></a></p>
<p>Let's use other tools to confirm what we know already.</p>
<h2 id="heading-online-tools-you-can-use-to-analyze-suspicious-urls">Online Tools You Can Use to Analyze Suspicious URLs</h2>
<p>Tiny URL was nice enough to tell us about the original URL:</p>
<pre><code class="lang-text">https://parasolhealth.org/resources/sass/hgjhgbgb/%20hxghxhgcgzvzvhgxvgzhxgvvgvcgvhgvjhvxhgvzhgvshgvhgvhgvhgwvhgwvhgwvhgwvhgvhgvdshvshgvhgvhgdvhgdsvhgdvhjgdvjhdgdvhgfvhgvf/vhgvjhgvghgvghvhgvghvhgvjlnkjndkjdkjdhbgytdvghdvhvshgvshgvjsvhvahgvhvwgvhwvhvajgvsgshgvhsgvjhsvgavjgvsgvahgvahgvhgsvjgavhgsvhgsvhjvshgvahgvsjvshgvajvshvhgwvhgvehgvehgvehjvegvejhgvhgavhavhs/dhbjhjfhjfkbkjfhbjkbfjbjdbkjbsjhbdjbjkdbhbdjkbjdbjdbjhbdkjbsjbjkdbjkdhbjdbjbsjhbsjbjdkbjhdbkjhbdkjbsbdjbjdbkjhbjhbsjkhbdjbjdbjdbjhsbjhbejhbejhbjwhbjhwbjkwhbjbhbs/jdbhdhdbkjbsjbsjbwjbjwbjkbwhbehbjhbejbebebjebjbejbjhbsjhbshbahbjhsbshbjkhdbjhbjhbdbdjkbdhbjhsbjhbajhbsjbkjshbhbdjhbjdhbjkbshbsjhbsjbdbdhbdhbjehbjhebjhbrrhbjbjekhbjhbjsbjhsbjhbdjhd/jbdjhbdkjbdjhbkjabjhbsjbdjbksjbhsbjhdbjhbjkbdjhbjhbkjbejhbwkhbjkwhbjhwbjkwhbjhwbjhbwhbwkjhbwjhbjhbajhbajhbsjhbsjhbdjkhbdjhbdjhbjdhbjshbjhsbjhbjhsbkjhbdjhbsjbjabjhabjkbs/redirect.php
</code></pre>
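<p>Several things about this URL look wrong even before you open it: very long, random-looking path segments, an encoded space, and a final <code>redirect.php</code>. As a rough sketch, the Python function below flags those traits using only the standard library. The function name <code>url_red_flags</code>, the 40-character threshold, and the choice of checks are assumptions made for illustration, not an established detection method:</p>

```python
from urllib.parse import urlsplit, unquote


def url_red_flags(url, max_segment_len=40):
    """Return a list of warnings for a URL. These are heuristics only."""
    parts = urlsplit(url)
    # Decode percent-escapes (for example %20) in every non-empty path segment.
    segments = [unquote(s) for s in parts.path.split("/") if s]
    flags = []

    long_segments = [s for s in segments if len(s) > max_segment_len]
    if long_segments:
        flags.append(
            f"{len(long_segments)} path segment(s) longer than {max_segment_len} characters"
        )
    # A segment that starts or ends with whitespace was padded on purpose.
    if any(s != s.strip() for s in segments):
        flags.append("path segment padded with encoded whitespace")
    if segments and segments[-1].lower().startswith("redirect"):
        flags.append("path ends in a redirect script")
    return flags


if __name__ == "__main__":
    # Hypothetical URL shaped like the one in this article, shortened for readability.
    sample = "https://example.org/resources/sass/%20" + "h" * 60 + "/redirect.php"
    for warning in url_red_flags(sample):
        print(warning)
```

<p>Heuristics like these only tell you that a link deserves a closer look. They complement, and do not replace, the reputation services discussed next.</p>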
<p>If you go to the Virus Total website and search for the URL you will see that this <a target="_blank" href="https://www.virustotal.com/gui/url/1a5a1a3385c2d6c2c76b0ca721138ba9eeae7b8a12cc6e28c206216c103c3fc3?nocache=1">was also reported here</a>:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/10/godaddy_virustotal_malicious.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Interestingly enough, only a single vendor reported the URL as malicious. That will do it for me :-)</p>
<p><a target="_blank" href="https://www.abuseipdb.com/report?ip=66.85.143.2">Abuse IP DB</a> doesn't know anything about the offending website either. Still, keep this tool around, as it is known to report many other bad actors.</p>
<p>Is there anything else we can learn from the original message? Most email readers let you view and copy the email headers. I'm sharing mine here (with a few changes):</p>
<pre><code class="lang-text">Received: from MN2PR19MB4030.namprd19.prod.outlook.com (2603:10b6:208:1e8::11)
 by MW3PR19MB4204.namprd19.prod.outlook.com with HTTPS; Tue, 4 Oct 2022
 16:35:05 +0000
Received: from BN9PR03CA0959.namprd03.prod.outlook.com (2603:10b6:408:108::34)
 by MN2PR19MB4030.namprd19.prod.outlook.com (2603:10b6:208:1e8::11) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5676.31; Tue, 4 Oct
 2022 16:35:01 +0000
Received: from BN7NAM10FT104.eop-nam10.prod.protection.outlook.com
 (2603:10b6:408:108:cafe::cc) by BN9PR03CA0959.outlook.office365.com
 (2603:10b6:408:108::34) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5676.24 via Frontend
 Transport; Tue, 4 Oct 2022 16:34:59 +0000
Authentication-Results: spf=softfail (sender IP is 170.10.162.128)
 smtp.mailfrom=bounce.com; dkim=none (message not signed)
 header.d=none;dmarc=fail action=oreject header.from=godaddy.com;compauth=fail
 reason=000
Received-SPF: SoftFail (protection.outlook.com: domain of transitioning
 bounce.com discourages use of 170.10.162.128 as permitted sender)
Received: from host.solutiononellc.com (170.10.162.128) by
 BN7NAM10FT104.mail.protection.outlook.com (10.13.157.118) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.5676.17 via Frontend Transport; Tue, 4 Oct 2022 16:34:59 +0000
Received: from ip250.ip-37-187-205.eu ([37.187.205.250]:38823)
    by altar47.supremepanel47.com with esmtpsa  (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
    (Exim 4.95)
    (envelope-from &lt;postmaster@bounce.com&gt;)
    id 1ofksk-0005Zd-LV
    for xxx@xxxx.com;
    Tue, 04 Oct 2022 16:34:58 +0000

</code></pre>
<p>Using <a target="_blank" href="https://mxtoolbox.com/Public/Tools/EmailHeaders.aspx?huid=4205dc8f-5147-4da5-a448-d633f2bbca61">MXToolbox</a> shows that 2 of the email addresses used in the chain are <strong>blacklisted</strong>, another red flag.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/10/godaddy_scammer_mxtoolbox.png" alt="Image" loading="lazy"></p>
<p><em>2 blocked emails from this list. Another red flag</em></p>
<p>I think that's good enough. Delete the email and move on with your life, and rest assured that another one is coming your way (hopefully landing in the SPAM folder automatically).</p>
<h2 id="heading-whats-next">What's Next?</h2>
<p>There are many tools on the Internet you can use to identify phishing emails, but there is no substitute for common sense. If it looks too good to be true, then it probably is.</p>
<p>As usual, do not click the link right away! Do a little investigating first, just to be safe.</p>
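<p>If you would rather not read raw headers by eye, the <code>Authentication-Results</code> header shown earlier can be summarized with a few lines of Python. This is a minimal sketch: the function names are hypothetical, the regular expression only looks for the <code>spf</code>, <code>dkim</code>, <code>dmarc</code> and <code>compauth</code> verdicts that appear in my headers, and it is not a complete parser for that header format:</p>

```python
import re


def auth_results(header_value):
    """Extract spf/dkim/dmarc/compauth verdicts from an Authentication-Results value."""
    verdicts = {}
    pattern = r"\b(spf|dkim|dmarc|compauth)=(\w+)"
    for method, result in re.findall(pattern, header_value, flags=re.IGNORECASE):
        # Keep the first verdict reported for each method.
        verdicts.setdefault(method.lower(), result.lower())
    return verdicts


def failed_checks(verdicts):
    """Return the methods whose verdict is anything other than 'pass'."""
    return sorted(method for method, result in verdicts.items() if result != "pass")


if __name__ == "__main__":
    # Header value copied from the email analyzed in this article.
    header = (
        "spf=softfail (sender IP is 170.10.162.128) "
        "smtp.mailfrom=bounce.com; dkim=none (message not signed) "
        "header.d=none;dmarc=fail action=oreject header.from=godaddy.com;compauth=fail "
        "reason=000"
    )
    print(failed_checks(auth_results(header)))
```

<p>For this message every check comes back as something other than <code>pass</code>, which matches what the online tools reported.</p>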
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Run Android Games on Linux with Android-x86 ]]>
                </title>
                <description>
                    <![CDATA[ In this article, you'll learn how you can use virtual machines on Linux while having fun with vintage games. If you have an Android phone, one of your guilty pleasures might be playing some very entertaining games. Or it could be that there is an app... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/run-android-games-on-linux/</link>
                <guid isPermaLink="false">66d851497211ea6be29e1b7f</guid>
                
                    <category>
                        <![CDATA[ Android ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Games ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Linux ]]>
                    </category>
                
                    <category>
                        <![CDATA[ virtual machine ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Wed, 17 Aug 2022 16:09:12 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2022/08/jose-article-photo.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In this article, you'll learn how you can use virtual machines on Linux while having fun with vintage games.</p>
<p>If you have an Android phone, one of your guilty pleasures might be playing some very entertaining games. Or it could be that there is an application that only runs on your phone.</p>
<p>And then you think – what if you could run the same games on your desktop PC?</p>
<p>To simplify the scenario, let's assume the applications run on Android.</p>
<p>One approach to solve your problem is to run an Android emulator on your PC. But some of them, like <a target="_blank" href="https://www.android-x86.org/download.html">Android-x86</a>, require rebooting your machine so they can take control of the hardware.</p>
<p>If you don't mind a small performance hit you can run a virtual machine at the same time as your native operating system. Specifically on Linux, there are several choices, like <a target="_blank" href="https://www.qemu.org/">QEMU</a> and <a target="_blank" href="https://www.virtualbox.org/">VirtualBox</a>, to name a few.</p>
<p>By the end of this article you will be able to do the following:</p>
<ul>
<li><p>Install VirtualBox on Fedora Linux</p>
</li>
<li><p>Run android-x86 and finish the basic setup</p>
</li>
<li><p>Install an application from the Google Play Store, just like on your phone.</p>
</li>
</ul>
<h2 id="heading-basic-requirements"><strong>Basic Requirements</strong></h2>
<p>Before you start, I assume that you have the following:</p>
<ul>
<li><p>The ability to run commands as the superuser (for example, with <a target="_blank" href="https://www.sudo.ws/">sudo</a>)</p>
</li>
<li><p>A Google account, so you can use the Play Store from within the virtual machine.</p>
</li>
</ul>
<h1 id="heading-how-to-install-virtualbox"><strong>How to Install VirtualBox</strong></h1>
<p>The first step is to install VirtualBox. For practical purposes, our installation will be basic, just enough to run our games:</p>
<pre><code class="lang-bash">sudo dnf install -y kernel-devel kernel-devel-5.14.18-100.fc33.x86_64
curl --remote-name --location https://www.virtualbox.org/download/oracle_vbox.asc
sudo rpm --import ./oracle_vbox.asc
sudo dnf install -y https://download.virtualbox.org/virtualbox/6.1.36/VirtualBox-6.1-6.1.36_152435_fedora33-1.x86_64.rpm
sudo dnf install -y virtualbox-guest-additions.x86_64
sudo /sbin/vboxconfig
</code></pre>
<h2 id="heading-how-to-install-the-android-x86-iso"><strong>How to Install the Android-x86 ISO</strong></h2>
<p>The next step is to download the ISO image from <a target="_blank" href="https://sourceforge.net/projects/android-x86/">Android-x86</a>. This ISO contains the Android operating system that will be installed on our virtual hard drive.</p>
<p>After that we can set up our virtual machine like this:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/08/virtualbox-androidx86.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>What a finished virtual machine looks like in VirtualBox</em></p>
<p>A few things to note:</p>
<ul>
<li><p>After booting the first time, I found that 1 GB of RAM for the Android virtual machine was not enough. Performance improved a lot after I bumped the RAM to 3 GB.</p>
</li>
<li><p>Another change was the 'Graphics Controller'. Originally it was VMSVGA, but Android refused to start in graphical mode, so I switched to VBoxVGA and it worked.</p>
</li>
<li><p>2 CPUs and 8 GB of disk space were enough for my game.</p>
</li>
<li><p>Finally, I attached the Android-x86 ISO to the IDE controller.</p>
</li>
</ul>
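<p>If you prefer the command line over the GUI, the same settings can be expressed with <code>VBoxManage</code>, the CLI that ships with VirtualBox. The Python sketch below only builds and prints the commands so you can review them before running anything. The VM name, the file names, the <code>Linux_64</code> OS type, and attaching the virtual disk to the same IDE controller as the ISO are assumptions on my part, so adjust them to your setup:</p>

```python
import shlex


def vbox_commands(name, iso_path, disk_path, memory_mb=3072, cpus=2, disk_mb=8192):
    """Build (but do not run) VBoxManage commands for the settings listed above."""
    return [
        # Create and register an empty virtual machine.
        ["VBoxManage", "createvm", "--name", name, "--ostype", "Linux_64", "--register"],
        # 3 GB of RAM, 2 CPUs and the VBoxVGA graphics controller.
        ["VBoxManage", "modifyvm", name, "--memory", str(memory_mb), "--cpus", str(cpus),
         "--graphicscontroller", "vboxvga"],
        # 8 GB virtual disk (the size is given in megabytes).
        ["VBoxManage", "createmedium", "disk", "--filename", disk_path, "--size", str(disk_mb)],
        # One IDE controller holding both the disk and the installer ISO.
        ["VBoxManage", "storagectl", name, "--name", "IDE", "--add", "ide"],
        ["VBoxManage", "storageattach", name, "--storagectl", "IDE", "--port", "0",
         "--device", "0", "--type", "hdd", "--medium", disk_path],
        ["VBoxManage", "storageattach", name, "--storagectl", "IDE", "--port", "1",
         "--device", "0", "--type", "dvddrive", "--medium", iso_path],
    ]


if __name__ == "__main__":
    # Hypothetical names; replace them with your own VM name and file paths.
    for command in vbox_commands("android-x86", "android-x86.iso", "android-x86.vdi"):
        print(shlex.join(command))
```

<p>Printing the commands first makes it easy to compare them against what the GUI shows before you commit to anything.</p>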
<p>To start the virtual machine, click the 'Start' button on the GUI. You will then have to make a few decisions, such as which partition should be bootable:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/08/androidx86-partition.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Partitioning your virtual disk. We assign 8 GB and make sure the partition can boot</em></p>
<p>Once this is done you can choose your new partition to perform the installation:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/08/androidx86-newpartition.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>After the new partition is created, you can select it and install the Android OS there</em></p>
<p>Then the installation will proceed:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/08/androidx86-install.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>The installer copies the files from the Android ISO image into the virtual hard drive</em></p>
<p>After the installation is complete, you can shut down the virtual machine.</p>
<h2 id="heading-first-boot"><strong>First Boot</strong></h2>
<p>Now you'll need to go to the advanced options and select the virtual disk (instead of the ISO image) to boot:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/08/android-x86-boot-from-disk.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>You can either boot from disk on this menu or change the boot order on the virtual machine</em></p>
<p>After that, Android will ask you some basic setup information, just like it does on your phone. The final result may look like this:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/08/androidx86-running.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>The virtual machine looks exactly like your Android phone.</em></p>
<h2 id="heading-how-to-install-games-from-the-google-play-store"><strong>How to Install Games from the Google Play Store</strong></h2>
<p>In my case, I decided to install a game where I can fight the forces of evil as the 1970s robot <a target="_blank" href="https://en.wikipedia.org/wiki/Mazinger_Z">Mazinger Z / Tranzor Z</a> (yes, I love <a target="_blank" href="https://en.wikipedia.org/wiki/Go_Nagai">Go Nagai</a>'s Mazinger Z). To do that, search for the game in the Play Store and install it:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/08/android-x86-play-store.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>After Android is running and your credentials are set you can download and install any Android program you want.</em></p>
<p>And now, success! We got the game up and running.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/08/androidx86-mazingerz.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Sorry, but now it is time to play as Mazinger Z!</em></p>
<h1 id="heading-what-did-we-learn-here"><strong>What Did We Learn Here?</strong></h1>
<ul>
<li><p>We installed a virtual machine engine and successfully ran the Android operating system alongside our regular Fedora OS.</p>
</li>
<li><p>You saw how you can try out and discard an entire operating system setup, without going through the hassle of setting up a dual-boot system with GRUB on Linux.</p>
</li>
</ul>
<p>Another nice feature of running the game inside a virtual machine is that you can fully freeze the game, then come back and restore it at exactly the same point where you left it.</p>
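<p>The freeze and restore can also be scripted. As a small sketch, <code>VBoxManage controlvm</code> with the <code>savestate</code> action writes the running machine's state to disk, and starting the VM again resumes from that saved state. The helper names and the VM name <code>android-x86</code> are hypothetical:</p>

```python
import subprocess


def save_state_command(vm_name):
    """Command that freezes a running VM and writes its state to disk."""
    return ["VBoxManage", "controlvm", vm_name, "savestate"]


def start_command(vm_name):
    """Command that starts the VM; an existing saved state is resumed."""
    return ["VBoxManage", "startvm", vm_name]


def run(command):
    """Run one VBoxManage command and raise an error if it fails."""
    subprocess.run(command, check=True)


# Example usage (requires VirtualBox and a VM with this hypothetical name):
# run(save_state_command("android-x86"))
# run(start_command("android-x86"))
```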
<p>Finally, you can do many more things with a virtual machine than just running games, for example:</p>
<ul>
<li><p>You can <a target="_blank" href="https://www.varonis.com/blog/malware-analysis-tools">analyze malware safely</a>, run un-trusted applications, and contain any damage they can cause.</p>
</li>
<li><p>Try a new operating system version before committing to a proper installation. This is less of an issue these days, as most of them provide a live CD you can boot to try them out, but it is still very convenient.</p>
</li>
<li><p>Run multiple operating systems simultaneously, without rebooting your machine. Over time you will most likely start exploring the more advanced options of your virtual machine software of choice, like <a target="_blank" href="https://www.virtualbox.org/manual/ch09.html">VirtualBox</a>.</p>
</li>
</ul>
<p>Playing games on your PC is a gateway for learning more complex stuff later. Also the fun factor is undeniable. Enjoy!</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Secure Server Infrastructure Clouds using Falco, Prometheus, Grafana and Docker ]]>
                </title>
                <description>
                    <![CDATA[ I was recently looking for a way to keep tabs on our containers and applications at work. Specifically, I was interested in detecting anomalies in the configuration. After a little research, I stumbled on Falco. What I found was a very complete Open ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/secure-server-infrastructure-clouds-using-falco-prometheus-grafana-and-docker/</link>
                <guid isPermaLink="false">66d8514b39c4dccc43d4d4ba</guid>
                
                    <category>
                        <![CDATA[ Cloud Computing ]]>
                    </category>
                
                    <category>
                        <![CDATA[ containers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ cybersecurity ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ information security ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Tue, 10 May 2022 14:58:47 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2022/05/pexels-aleksandar-pasaric-325185--1-.jpg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>I was recently looking for a way to keep tabs on our containers and applications at work. Specifically, I was interested in detecting anomalies in the configuration. After a little research, I stumbled on <a target="_blank" href="https://github.com/falcosecurity/falco">Falco</a>.</p>
<p>What I found was a very complete Open Source platform with lots of features and excellent documentation. So I wanted to share my experience with you.</p>
<p>What will we cover in this article?</p>
<ul>
<li><p>How to install the <a target="_blank" href="https://falco.org/">Falco</a> agent on the host that you want to monitor for events (anomalies/violations)</p>
</li>
<li><p>How to tune Falco to reduce false positives and get the information you really need</p>
</li>
<li><p>How to use <a target="_blank" href="https://prometheus.io/">Prometheus</a> to collect Falco events into a central location, with the help of the exporters and a scraper.</p>
</li>
<li><p>Finally, how to connect the scraper with <a target="_blank" href="https://grafana.com/">Grafana</a> for visualization and alerting</p>
</li>
</ul>
<h2 id="heading-what-do-you-need-for-this-tutorial"><strong>What do you need for this tutorial?</strong></h2>
<ul>
<li><p>A machine or machines with Linux installed. A virtual machine should work.</p>
</li>
<li><p>You will need superuser permissions to be able to install/setup Docker, RPM, and systemd processes</p>
</li>
<li><p>We will use Docker containers, so basic knowledge of Docker is required</p>
</li>
<li><p>Working knowledge of Python/Bash, as we will write a few scripts to test and improve our configuration.</p>
</li>
</ul>
<p>At the end you will be able to set up each one of the following components:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/falco_monitoring.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Don't be intimidated – I'll provide links to the documentation and a thorough explanation of each one of these tasks as we move along.</p>
<h2 id="heading-table-of-contents">Table of contents</h2>
<ol>
<li><p><a class="post-section-overview" href="#heading-what-is-falco"><strong>What is Falco?</strong></a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-install-falco"><strong>How to Install Falco</strong></a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-basic-configuration"><strong>Basic Configuration</strong></a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-test-the-default-configuration"><strong>How to Test the Default Configuration</strong></a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-defaults-are-not-always-good"><strong>Defaults Are Not Always Good</strong></a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-falco-integrations"><strong>Falco Integrations</strong></a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-learning-more"><strong>Learning More</strong></a></p>
</li>
</ol>
<h1 id="heading-what-is-falco"><strong>What is Falco?</strong></h1>
<p>The best way to describe this tool <em>is to learn what it can do</em>:</p>
<blockquote>
<p>Falco can detect and alert on any behavior that involves making Linux system calls.</p>
<p>Falco alerts can be triggered by the use of specific system calls, their arguments, and by properties of the calling process. For example, Falco can easily detect incidents including but not limited to:</p>
</blockquote>
<ul>
<li><p>A shell is running inside a container or pod in Kubernetes.</p>
</li>
<li><p>A container is running in privileged mode, or is mounting a sensitive path, such as /proc, from the host.</p>
</li>
<li><p>A server process is spawning a child process of an unexpected type.</p>
</li>
<li><p>Unexpected read of a sensitive file, such as /etc/shadow.</p>
</li>
<li><p>A non-device file is written to /dev.</p>
</li>
<li><p>A standard system binary, such as ls, is making an outbound network connection.</p>
</li>
<li><p>A privileged pod is started in a Kubernetes cluster.</p>
</li>
</ul>
<h1 id="heading-how-to-install-falco"><strong>How to Install Falco</strong></h1>
<p>I will install Falco using an RPM (<a target="_blank" href="https://falco.org/docs/getting-started/installation/">similar instructions exist</a> for apt-get, and even Docker containers). In my case I felt the native installation was the best, and the RPM made it very easy to do:</p>
<pre><code class="lang-python">[josevnz@macmini2 ~]$ sudo -i dnf install https://download.falco.org/packages/rpm/falco<span class="hljs-number">-0.31</span><span class="hljs-number">.1</span>-x86_64.rpm
Last metadata expiration check: <span class="hljs-number">2</span>:<span class="hljs-number">53</span>:<span class="hljs-number">53</span> ago on Sun <span class="hljs-number">01</span> May <span class="hljs-number">2022</span> <span class="hljs-number">04</span>:<span class="hljs-number">13</span>:<span class="hljs-number">09</span> PM EDT.
falco<span class="hljs-number">-0.31</span><span class="hljs-number">.1</span>-x86_64.rpm                                                                                                                                                                                                       <span class="hljs-number">1.7</span> MB/s |  <span class="hljs-number">12</span> MB     <span class="hljs-number">00</span>:<span class="hljs-number">07</span>    
Dependencies resolved.
==============================================================================================================================================================================================================================================================
 Package                                                          Architecture                                      Version                                                                     Repository                                               Size
==============================================================================================================================================================================================================================================================
Installing:
 falco                                                            x86_64                                            <span class="hljs-number">0.31</span><span class="hljs-number">.1</span><span class="hljs-number">-1</span>                                                                    @commandline                                             <span class="hljs-number">12</span> M
Installing dependencies:
 dkms                                                             noarch                                            <span class="hljs-number">2.8</span><span class="hljs-number">.1</span><span class="hljs-number">-4.20200214</span>git5ca628c.fc30                                             updates                                                  <span class="hljs-number">78</span> k
 elfutils-libelf-devel                                            x86_64                                            <span class="hljs-number">0.179</span><span class="hljs-number">-2.</span>fc30                                                                updates                                                  <span class="hljs-number">27</span> k
 kernel-devel                                                     x86_64                                            <span class="hljs-number">5.6</span><span class="hljs-number">.13</span><span class="hljs-number">-100.</span>fc30                                                             updates                                                  <span class="hljs-number">14</span> M

Transaction Summary
==============================================================================================================================================================================================================================================================
Install  <span class="hljs-number">4</span> Packages

Total size: <span class="hljs-number">26</span> M
Total download size: <span class="hljs-number">14</span> M
Installed size: <span class="hljs-number">92</span> M
Is this ok [y/N]: y
Downloading Packages:
(<span class="hljs-number">1</span>/<span class="hljs-number">3</span>): elfutils-libelf-devel<span class="hljs-number">-0.179</span><span class="hljs-number">-2.</span>fc30.x86_64.rpm                                                                                                                                                                          <span class="hljs-number">253</span> kB/s |  <span class="hljs-number">27</span> kB     <span class="hljs-number">00</span>:<span class="hljs-number">00</span>    
(<span class="hljs-number">2</span>/<span class="hljs-number">3</span>): dkms<span class="hljs-number">-2.8</span><span class="hljs-number">.1</span><span class="hljs-number">-4.20200214</span>git5ca628c.fc30.noarch.rpm                                                                                                                                                                        <span class="hljs-number">342</span> kB/s |  <span class="hljs-number">78</span> kB     <span class="hljs-number">00</span>:<span class="hljs-number">00</span>    
(<span class="hljs-number">3</span>/<span class="hljs-number">3</span>): kernel-devel<span class="hljs-number">-5.6</span><span class="hljs-number">.13</span><span class="hljs-number">-100.</span>fc30.x86_64.rpm                                                                                                                                                                                <span class="hljs-number">1.9</span> MB/s |  <span class="hljs-number">14</span> MB     <span class="hljs-number">00</span>:<span class="hljs-number">07</span>    
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                                                                                                                         <span class="hljs-number">1.8</span> MB/s |  <span class="hljs-number">14</span> MB     <span class="hljs-number">00</span>:<span class="hljs-number">07</span>     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                                                                                                                                      <span class="hljs-number">1</span>/<span class="hljs-number">1</span> 
  Installing       : kernel-devel<span class="hljs-number">-5.6</span><span class="hljs-number">.13</span><span class="hljs-number">-100.</span>fc30.x86_64                                                                                                                                                                                                  <span class="hljs-number">1</span>/<span class="hljs-number">4</span> 
  Running scriptlet: kernel-devel<span class="hljs-number">-5.6</span><span class="hljs-number">.13</span><span class="hljs-number">-100.</span>fc30.x86_64                                                                                                                                                                                                  <span class="hljs-number">1</span>/<span class="hljs-number">4</span> 
  Installing       : elfutils-libelf-devel<span class="hljs-number">-0.179</span><span class="hljs-number">-2.</span>fc30.x86_64                                                                                                                                                                                            <span class="hljs-number">2</span>/<span class="hljs-number">4</span> 
  Installing       : dkms<span class="hljs-number">-2.8</span><span class="hljs-number">.1</span><span class="hljs-number">-4.20200214</span>git5ca628c.fc30.noarch                                                                                                                                                                                          <span class="hljs-number">3</span>/<span class="hljs-number">4</span> 
  Running scriptlet: dkms<span class="hljs-number">-2.8</span><span class="hljs-number">.1</span><span class="hljs-number">-4.20200214</span>git5ca628c.fc30.noarch                                                                                                                                                                                          <span class="hljs-number">3</span>/<span class="hljs-number">4</span> 
  Running scriptlet: falco<span class="hljs-number">-0.31</span><span class="hljs-number">.1</span><span class="hljs-number">-1.</span>x86_64                                                                                                                                                                                                                <span class="hljs-number">4</span>/<span class="hljs-number">4</span> 
  Installing       : falco<span class="hljs-number">-0.31</span><span class="hljs-number">.1</span><span class="hljs-number">-1.</span>x86_64                                                                                                                                                                                                                <span class="hljs-number">4</span>/<span class="hljs-number">4</span> 
  Running scriptlet: falco<span class="hljs-number">-0.31</span><span class="hljs-number">.1</span><span class="hljs-number">-1.</span>x86_64
</code></pre>
<h1 id="heading-basic-configuration"><strong>Basic Configuration</strong></h1>
<p>Unless you only need very basic output processing, enable JSON output in <code>/etc/falco/falco.yaml</code>:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Whether to output events in json or text</span>
json_output: true
</code></pre>
<p>It will become evident why pretty soon.</p>
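<p>One reason is that, with JSON enabled, each alert is emitted as a single JSON object per line, which a script can filter without fragile text parsing. Here is a minimal sketch that keeps only events at or above a given priority. It assumes the events carry the <code>time</code>, <code>rule</code> and <code>priority</code> keys; the function name and the sample event in the usage section are made up for illustration and are not real Falco output:</p>

```python
import json

# Falco priorities, from most to least severe.
PRIORITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]


def important_events(lines, minimum="warning"):
    """Yield (time, rule, priority) for JSON events at or above a priority."""
    cutoff = PRIORITIES.index(minimum)
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # Skip anything that is not a JSON event.
        event = json.loads(line)
        priority = str(event.get("priority", "")).lower()
        # A lower index means a more severe priority.
        if priority in PRIORITIES and cutoff >= PRIORITIES.index(priority):
            yield event.get("time"), event.get("rule"), priority


if __name__ == "__main__":
    # Illustrative event only; the field values are invented.
    sample = [
        '{"time": "2022-05-01T23:20:52Z", "rule": "Example rule", '
        '"priority": "Warning", "output": "example alert text"}'
    ]
    for when, rule, priority in important_events(sample):
        print(when, priority, rule)
```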
<p>Next, <a target="_blank" href="https://falco.org/docs/getting-started/running/">start</a> the Falco agent:</p>
<pre><code class="lang-python">[josevnz@macmini2 falco]$ sudo systemctl start falco.service 
[josevnz@macmini2 falco]$ sudo systemctl status falco.service 
● falco.service - Falco: Container Native Runtime Security
   Loaded: loaded (/usr/lib/systemd/system/falco.service; disabled; vendor preset: disabled)
   Active: active (running) since Sun <span class="hljs-number">2022</span><span class="hljs-number">-05</span><span class="hljs-number">-01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">52</span> EDT; <span class="hljs-number">1</span>s ago
     Docs: https://falco.org/docs/
  Process: <span class="hljs-number">26887</span> ExecStartPre=/sbin/modprobe falco (code=exited, status=<span class="hljs-number">0</span>/SUCCESS)
 Main PID: <span class="hljs-number">26888</span> (falco)
    Tasks: <span class="hljs-number">1</span> (limit: <span class="hljs-number">2310</span>)
   Memory: <span class="hljs-number">65.8</span>M
   CGroup: /system.slice/falco.service
           └─<span class="hljs-number">26888</span> /usr/bin/falco --pidfile=/var/run/falco.pid

May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">52</span> macmini2 systemd[<span class="hljs-number">1</span>]: Starting Falco: Container Native Runtime Security...
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">52</span> macmini2 systemd[<span class="hljs-number">1</span>]: Started Falco: Container Native Runtime Security.
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">52</span> macmini2 falco[<span class="hljs-number">26888</span>]: Falco version <span class="hljs-number">0.31</span><span class="hljs-number">.1</span> (driver version b7eb0dd65226a8dc254d228c8d950d07bf3521d2)
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">52</span> macmini2 falco[<span class="hljs-number">26888</span>]: Falco initialized <span class="hljs-keyword">with</span> configuration file /etc/falco/falco.yaml
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">52</span> macmini2 falco[<span class="hljs-number">26888</span>]: Loading rules <span class="hljs-keyword">from</span> file /etc/falco/falco_rules.yaml:
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">53</span> macmini2 falco[<span class="hljs-number">26888</span>]: Loading rules <span class="hljs-keyword">from</span> file /etc/falco/falco_rules.local.yaml:
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">54</span> macmini2 falco[<span class="hljs-number">26888</span>]: Loading rules <span class="hljs-keyword">from</span> file /etc/falco/k8s_audit_rules.yaml:
</code></pre>
<h1 id="heading-how-to-test-the-default-configuration"><strong>How to Test the Default Configuration</strong></h1>
<p>Depending on your <a target="_blank" href="https://falco.org/docs/configuration/">configuration</a>, you may or may not get any events right after starting Falco:</p>
<pre><code class="lang-python">[josevnz@macmini2 falco]$ sudo journalctl --unit falco --follow
-- Logs begin at Tue <span class="hljs-number">2021</span><span class="hljs-number">-05</span><span class="hljs-number">-25</span> <span class="hljs-number">00</span>:<span class="hljs-number">15</span>:<span class="hljs-number">22</span> EDT. --
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">52</span> macmini2 systemd[<span class="hljs-number">1</span>]: Starting Falco: Container Native Runtime Security...
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">52</span> macmini2 systemd[<span class="hljs-number">1</span>]: Started Falco: Container Native Runtime Security.
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">52</span> macmini2 falco[<span class="hljs-number">26888</span>]: Falco version <span class="hljs-number">0.31</span><span class="hljs-number">.1</span> (driver version b7eb0dd65226a8dc254d228c8d950d07bf3521d2)
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">52</span> macmini2 falco[<span class="hljs-number">26888</span>]: Falco initialized <span class="hljs-keyword">with</span> configuration file /etc/falco/falco.yaml
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">52</span> macmini2 falco[<span class="hljs-number">26888</span>]: Loading rules <span class="hljs-keyword">from</span> file /etc/falco/falco_rules.yaml:
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">53</span> macmini2 falco[<span class="hljs-number">26888</span>]: Loading rules <span class="hljs-keyword">from</span> file /etc/falco/falco_rules.local.yaml:
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">54</span> macmini2 falco[<span class="hljs-number">26888</span>]: Loading rules <span class="hljs-keyword">from</span> file /etc/falco/k8s_audit_rules.yaml:
May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">20</span>:<span class="hljs-number">55</span> macmini2 falco[<span class="hljs-number">26888</span>]: Starting internal webserver, listening on port <span class="hljs-number">8765</span>
</code></pre>
<p>Worry not. We will run a few commands that will cause Falco to record some warnings and alerts. Time to see how this works!</p>
<h2 id="heading-how-to-run-a-privileged-container"><strong>How to Run a Privileged Container</strong></h2>
<p>Using privileged containers <a target="_blank" href="https://materials.rangeforce.com/tutorial/2020/06/25/Escaping-Docker-Privileged-Containers/">is considered a bad practice</a>, so let's see if this event is detected by Falco:</p>
<pre><code class="lang-python">[josevnz@macmini2 ~]$ docker run --rm --interactive --tty --privileged --volume /etc/shadow:/mnt/shadow fedora:latest ls -l /mnt/shadow
----------. <span class="hljs-number">1</span> root root <span class="hljs-number">1198</span> Nov <span class="hljs-number">21</span> <span class="hljs-number">20</span>:<span class="hljs-number">51</span> /mnt/shadow
</code></pre>
<p>And our Falco log?</p>
<pre><code class="lang-python">May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">29</span>:<span class="hljs-number">32</span> macmini2 falco[<span class="hljs-number">26888</span>]: {<span class="hljs-string">"output"</span>:<span class="hljs-string">"19:29:32.918828894: Informational Privileged container started (user=root user_loginuid=0 command=container:bfb9637a47a6 kind_lumiere (id=bfb9637a47a6) image=fedora:latest)"</span>,<span class="hljs-string">"priority"</span>:<span class="hljs-string">"Informational"</span>,<span class="hljs-string">"rule"</span>:<span class="hljs-string">"Launch Privileged Container"</span>,<span class="hljs-string">"source"</span>:<span class="hljs-string">"syscall"</span>,<span class="hljs-string">"tags"</span>:[<span class="hljs-string">"cis"</span>,<span class="hljs-string">"container"</span>,<span class="hljs-string">"mitre_lateral_movement"</span>,<span class="hljs-string">"mitre_privilege_escalation"</span>],<span class="hljs-string">"time"</span>:<span class="hljs-string">"2022-05-01T23:29:32.918828894Z"</span>, <span class="hljs-string">"output_fields"</span>: {<span class="hljs-string">"container.id"</span>:<span class="hljs-string">"bfb9637a47a6"</span>,<span class="hljs-string">"container.image.repository"</span>:<span class="hljs-string">"fedora"</span>,<span class="hljs-string">"container.image.tag"</span>:<span class="hljs-string">"latest"</span>,<span class="hljs-string">"container.name"</span>:<span class="hljs-string">"kind_lumiere"</span>,<span class="hljs-string">"evt.time"</span>:<span class="hljs-number">1651447772918828894</span>,<span class="hljs-string">"proc.cmdline"</span>:<span class="hljs-string">"container:bfb9637a47a6"</span>,<span class="hljs-string">"user.loginuid"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"user.name"</span>:<span class="hljs-string">"root"</span>}}
</code></pre>
<p>It shows up as an informational event. This is definitely one of those things to keep an eye on. Ask yourself whether the application in the container really needs elevated privileges.</p>
<p>You probably also noticed that each message has tags. Pay attention to the "mitre_*" ones: they relate to the <a target="_blank" href="https://attack.mitre.org/">MITRE ATT&amp;CK knowledge base</a> of attacks and mitigations. Yep, you will spend some time reading those.</p>
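<p>Since the tags arrive as a JSON list, picking out the MITRE-related ones takes only a few lines. This is a minimal sketch: <code>mitre_tactics</code> is a hypothetical helper, and the tags are copied from the privileged container event:</p>

```python
def mitre_tactics(event: dict) -> list:
    """Return the MITRE-related tags of an event as readable tactic names."""
    prefix = "mitre_"
    return [
        tag[len(prefix):].replace("_", " ")
        for tag in event.get("tags", [])
        if tag.startswith(prefix)
    ]


# Tags copied from the 'Launch Privileged Container' event
privileged_container = {
    "rule": "Launch Privileged Container",
    "tags": ["cis", "container", "mitre_lateral_movement", "mitre_privilege_escalation"],
}
print(mitre_tactics(privileged_container))
```

<p>The resulting names are a convenient starting point when searching the knowledge base.</p>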
<h2 id="heading-how-to-create-a-file-on-the-root-directory"><strong>How to Create a File on the /root Directory</strong></h2>
<p>This example shows how to abuse the root user combined with volumes in a container...</p>
<pre><code class="lang-python">[josevnz@macmini2 ~]$ docker run --rm --interactive --tty --user root --volume /root:/mnt/ fedora:latest touch /mnt/test_file
[josevnz@macmini2 ~]$
</code></pre>
<p>Falco's reaction:</p>
<pre><code class="lang-python">May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">32</span>:<span class="hljs-number">02</span> macmini2 falco[<span class="hljs-number">26888</span>]: {<span class="hljs-string">"output"</span>:<span class="hljs-string">"19:32:02.434286167: Informational Container with sensitive mount started (user=root user_loginuid=0 command=container:ef061174c7ef distracted_lalande (id=ef061174c7ef) image=fedora:latest mounts=/root:/mnt::true:rprivate)"</span>,<span class="hljs-string">"priority"</span>:<span class="hljs-string">"Informational"</span>,<span class="hljs-string">"rule"</span>:<span class="hljs-string">"Launch Sensitive Mount Container"</span>,<span class="hljs-string">"source"</span>:<span class="hljs-string">"syscall"</span>,<span class="hljs-string">"tags"</span>:[<span class="hljs-string">"cis"</span>,<span class="hljs-string">"container"</span>,<span class="hljs-string">"mitre_lateral_movement"</span>],<span class="hljs-string">"time"</span>:<span class="hljs-string">"2022-05-01T23:32:02.434286167Z"</span>, <span class="hljs-string">"output_fields"</span>: {<span class="hljs-string">"container.id"</span>:<span class="hljs-string">"ef061174c7ef"</span>,<span class="hljs-string">"container.image.repository"</span>:<span class="hljs-string">"fedora"</span>,<span class="hljs-string">"container.image.tag"</span>:<span class="hljs-string">"latest"</span>,<span class="hljs-string">"container.mounts"</span>:<span class="hljs-string">"/root:/mnt::true:rprivate"</span>,<span class="hljs-string">"container.name"</span>:<span class="hljs-string">"distracted_lalande"</span>,<span class="hljs-string">"evt.time"</span>:<span class="hljs-number">1651447922434286167</span>,<span class="hljs-string">"proc.cmdline"</span>:<span class="hljs-string">"container:ef061174c7ef"</span>,<span class="hljs-string">"user.loginuid"</span>:<span class="hljs-number">0</span>,<span 
class="hljs-string">"user.name"</span>:<span class="hljs-string">"root"</span>}}
</code></pre>
<p>Sensitive mount detected!</p>
<h2 id="heading-lets-raise-the-stakes-by-creating-a-file-on-bin"><strong>Let's Raise the Stakes by Creating a File on /bin</strong></h2>
<p>Alright, let's say we do this:</p>
<pre><code class="lang-python">[josevnz@macmini2 ~]$ sudo -i
[root@macmini2 ~]<span class="hljs-comment"># touch /bin/should_not_be_here</span>
</code></pre>
<p>What does Falco think about it?</p>
<pre><code class="lang-python">May <span class="hljs-number">01</span> <span class="hljs-number">19</span>:<span class="hljs-number">36</span>:<span class="hljs-number">41</span> macmini2 falco[<span class="hljs-number">26888</span>]: {<span class="hljs-string">"output"</span>:<span class="hljs-string">"19:36:41.237634398: Error File below a known binary directory opened for writing (user=root user_loginuid=1000 command=touch /bin/should_not_be_here file=/bin/should_not_be_here parent=bash pcmdline=bash gparent=sudo container_id=host image=&lt;NA&gt;)"</span>,<span class="hljs-string">"priority"</span>:<span class="hljs-string">"Error"</span>,<span class="hljs-string">"rule"</span>:<span class="hljs-string">"Write below binary dir"</span>,<span class="hljs-string">"source"</span>:<span class="hljs-string">"syscall"</span>,<span class="hljs-string">"tags"</span>:[<span class="hljs-string">"filesystem"</span>,<span class="hljs-string">"mitre_persistence"</span>],<span class="hljs-string">"time"</span>:<span class="hljs-string">"2022-05-01T23:36:41.237634398Z"</span>, <span class="hljs-string">"output_fields"</span>: {<span class="hljs-string">"container.id"</span>:<span class="hljs-string">"host"</span>,<span class="hljs-string">"container.image.repository"</span>:null,<span class="hljs-string">"evt.time"</span>:<span class="hljs-number">1651448201237634398</span>,<span class="hljs-string">"fd.name"</span>:<span class="hljs-string">"/bin/should_not_be_here"</span>,<span class="hljs-string">"proc.aname[2]"</span>:<span class="hljs-string">"sudo"</span>,<span class="hljs-string">"proc.cmdline"</span>:<span class="hljs-string">"touch /bin/should_not_be_here"</span>,<span class="hljs-string">"proc.pcmdline"</span>:<span class="hljs-string">"bash"</span>,<span class="hljs-string">"proc.pname"</span>:<span class="hljs-string">"bash"</span>,<span class="hljs-string">"user.loginuid"</span>:<span class="hljs-number">1000</span>,<span 
class="hljs-string">"user.name"</span>:<span class="hljs-string">"root"</span>}}
</code></pre>
<p>An error: a file below a binary directory was opened for writing. Good catch.</p>
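<p>The three events triggered so far have different priorities. Here is a small sketch that keeps only the events at or above a given severity. The ordering is assumed to follow Falco's documented, syslog-like priority levels, and <code>at_least</code> is a hypothetical helper:</p>

```python
# Falco priorities, most severe first (assumed to match the documented, syslog-like levels)
PRIORITIES = ["Emergency", "Alert", "Critical", "Error",
              "Warning", "Notice", "Informational", "Debug"]

# Rule names and priorities copied from the three events triggered in this section
EVENTS = [
    {"rule": "Launch Privileged Container", "priority": "Informational"},
    {"rule": "Launch Sensitive Mount Container", "priority": "Informational"},
    {"rule": "Write below binary dir", "priority": "Error"},
]


def at_least(events: list, minimum: str) -> list:
    """Keep only the events whose priority is as severe as 'minimum' or worse."""
    threshold = PRIORITIES.index(minimum)
    # A lower index means a more severe priority
    return [e for e in events if threshold >= PRIORITIES.index(e["priority"])]


for serious in at_least(EVENTS, "Warning"):
    print(serious["priority"], serious["rule"])
```

<p>Filtering this way is handy for a first pass, but informational events such as the privileged container one still deserve a look.</p>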
<h1 id="heading-defaults-are-not-always-good"><strong>Defaults Are Not Always Good</strong></h1>
<p>After Falco has been running for a while, it is a good idea to get a sense of which events we want to ignore and which ones we want to investigate.</p>
<p>The first step is to dump all the events, JSON payload included, into a file:</p>
<pre><code class="lang-shell">sudo journalctl --unit falco --no-pager --output=cat &gt; /tmp/falco_json_lines.txt
</code></pre>
<p>The <code>--output=cat</code> option tells journalctl to print only the message payload, without the journal timestamps (don't worry, the JSON message itself has timestamps).</p>
<pre><code class="lang-shell">Starting Falco: Container Native Runtime Security...
Started Falco: Container Native Runtime Security.
Falco version 0.31.1 (driver version b7eb0dd65226a8dc254d228c8d950d07bf3521d2)
Falco initialized with configuration file /etc/falco/falco.yaml
Loading rules from file /etc/falco/falco_rules.yaml:
Loading rules from file /etc/falco/falco_rules.local.yaml:
Loading rules from file /etc/falco/k8s_audit_rules.yaml:
Starting internal webserver, listening on port 8765
{"output":"19:29:32.918828894: Informational Privileged container started (user=root user_loginuid=0 command=container:bfb9637a47a6 kind_lumiere (id=bfb9637a47a6) image=fedora:latest)","priority":"Informational","rule":"Launch Privileged Container","source":"syscall","tags":["cis","container","mitre_lateral_movement","mitre_privilege_escalation"],"time":"2022-05-01T23:29:32.918828894Z", "output_fields": {"container.id":"bfb9637a47a6","container.image.repository":"fedora","container.image.tag":"latest","container.name":"kind_lumiere","evt.time":1651447772918828894,"proc.cmdline":"container:bfb9637a47a6","user.loginuid":0,"user.name":"root"}}
{"output":"19:32:02.434286167: Informational Container with sensitive mount started (user=root user_loginuid=0 command=container:ef061174c7ef distracted_lalande (id=ef061174c7ef) image=fedora:latest mounts=/root:/mnt::true:rprivate)","priority":"Informational","rule":"Launch Sensitive Mount Container","source":"syscall","tags":["cis","container","mitre_lateral_movement"],"time":"2022-05-01T23:32:02.434286167Z", "output_fields": {"container.id":"ef061174c7ef","container.image.repository":"fedora","container.image.tag":"latest","container.mounts":"/root:/mnt::true:rprivate","container.name":"distracted_lalande","evt.time":1651447922434286167,"proc.cmdline":"container:ef061174c7ef","user.loginuid":0,"user.name":"root"}}
</code></pre>
<p>So far it looks interesting, but what about this?</p>
<pre><code class="lang-python">{<span class="hljs-string">"output"</span>:<span class="hljs-string">"23:04:10.609949471: Warning Shell history had been deleted or renamed (user=josevnz user_loginuid=1000 type=openat command=bash fd.name=/home/josevnz/.bash_history-01112.tmp name=/home/josevnz/.bash_history-01112.tmp path=&lt;NA&gt; oldpath=&lt;NA&gt; host (id=host))"</span>,<span class="hljs-string">"priority"</span>:<span class="hljs-string">"Warning"</span>,<span class="hljs-string">"rule"</span>:<span class="hljs-string">"Delete or rename shell history"</span>,<span class="hljs-string">"source"</span>:<span class="hljs-string">"syscall"</span>,<span class="hljs-string">"tags"</span>:[<span class="hljs-string">"mitre_defense_evasion"</span>,<span class="hljs-string">"process"</span>],<span class="hljs-string">"time"</span>:<span class="hljs-string">"2022-05-04T03:04:10.609949471Z"</span>, <span class="hljs-string">"output_fields"</span>: {<span class="hljs-string">"container.id"</span>:<span class="hljs-string">"host"</span>,<span class="hljs-string">"container.name"</span>:<span class="hljs-string">"host"</span>,<span class="hljs-string">"evt.arg.name"</span>:<span class="hljs-string">"/home/josevnz/.bash_history-01112.tmp"</span>,<span class="hljs-string">"evt.arg.oldpath"</span>:null,<span class="hljs-string">"evt.arg.path"</span>:null,<span class="hljs-string">"evt.time"</span>:<span class="hljs-number">1651633450609949471</span>,<span class="hljs-string">"evt.type"</span>:<span class="hljs-string">"openat"</span>,<span class="hljs-string">"fd.name"</span>:<span class="hljs-string">"/home/josevnz/.bash_history-01112.tmp"</span>,<span class="hljs-string">"proc.cmdline"</span>:<span class="hljs-string">"bash"</span>,<span class="hljs-string">"user.loginuid"</span>:<span class="hljs-number">1000</span>,<span class="hljs-string">"user.name"</span>:<span class="hljs-string">"josevnz"</span>}}
{<span class="hljs-string">"output"</span>:<span class="hljs-string">"23:04:10.635602857: Warning Shell history had been deleted or renamed (user=josevnz user_loginuid=1000 type=openat command=bash fd.name=/home/josevnz/.bash_history-01627.tmp name=/home/josevnz/.bash_history-01627.tmp path=&lt;NA&gt; oldpath=&lt;NA&gt; host (id=host))"</span>,<span class="hljs-string">"priority"</span>:<span class="hljs-string">"Warning"</span>,<span class="hljs-string">"rule"</span>:<span class="hljs-string">"Delete or rename shell history"</span>,<span class="hljs-string">"source"</span>:<span class="hljs-string">"syscall"</span>,<span class="hljs-string">"tags"</span>:[<span class="hljs-string">"mitre_defense_evasion"</span>,<span class="hljs-string">"process"</span>],<span class="hljs-string">"time"</span>:<span class="hljs-string">"2022-05-04T03:04:10.635602857Z"</span>, <span class="hljs-string">"output_fields"</span>: {<span class="hljs-string">"container.id"</span>:<span class="hljs-string">"host"</span>,<span class="hljs-string">"container.name"</span>:<span class="hljs-string">"host"</span>,<span class="hljs-string">"evt.arg.name"</span>:<span class="hljs-string">"/home/josevnz/.bash_history-01627.tmp"</span>,<span class="hljs-string">"evt.arg.oldpath"</span>:null,<span class="hljs-string">"evt.arg.path"</span>:null,<span class="hljs-string">"evt.time"</span>:<span class="hljs-number">1651633450635602857</span>,<span class="hljs-string">"evt.type"</span>:<span class="hljs-string">"openat"</span>,<span class="hljs-string">"fd.name"</span>:<span class="hljs-string">"/home/josevnz/.bash_history-01627.tmp"</span>,<span class="hljs-string">"proc.cmdline"</span>:<span class="hljs-string">"bash"</span>,<span class="hljs-string">"user.loginuid"</span>:<span class="hljs-number">1000</span>,<span class="hljs-string">"user.name"</span>:<span class="hljs-string">"josevnz"</span>}}
{<span class="hljs-string">"output"</span>:<span class="hljs-string">"23:04:10.635851215: Warning Shell history had been deleted or renamed (user=josevnz user_loginuid=1000 type=rename command=bash fd.name=&lt;NA&gt; name=&lt;NA&gt; path=&lt;NA&gt; oldpath=/home/josevnz/.bash_history-01627.tmp host (id=host))"</span>,<span class="hljs-string">"priority"</span>:<span class="hljs-string">"Warning"</span>,<span class="hljs-string">"rule"</span>:<span class="hljs-string">"Delete or rename shell history"</span>,<span class="hljs-string">"source"</span>:<span class="hljs-string">"syscall"</span>,<span class="hljs-string">"tags"</span>:[<span class="hljs-string">"mitre_defense_evasion"</span>,<span class="hljs-string">"process"</span>],<span class="hljs-string">"time"</span>:<span class="hljs-string">"2022-05-04T03:04:10.635851215Z"</span>, <span class="hljs-string">"output_fields"</span>: {<span class="hljs-string">"container.id"</span>:<span class="hljs-string">"host"</span>,<span class="hljs-string">"container.name"</span>:<span class="hljs-string">"host"</span>,<span class="hljs-string">"evt.arg.name"</span>:null,<span class="hljs-string">"evt.arg.oldpath"</span>:<span class="hljs-string">"/home/josevnz/.bash_history-01627.tmp"</span>,<span class="hljs-string">"evt.arg.path"</span>:null,<span class="hljs-string">"evt.time"</span>:<span class="hljs-number">1651633450635851215</span>,<span class="hljs-string">"evt.type"</span>:<span class="hljs-string">"rename"</span>,<span class="hljs-string">"fd.name"</span>:null,<span class="hljs-string">"proc.cmdline"</span>:<span class="hljs-string">"bash"</span>,<span class="hljs-string">"user.loginuid"</span>:<span class="hljs-number">1000</span>,<span class="hljs-string">"user.name"</span>:<span class="hljs-string">"josevnz"</span>}}
{<span class="hljs-string">"output"</span>:<span class="hljs-string">"23:04:10.661829867: Warning Shell history had been deleted or renamed (user=josevnz user_loginuid=1000 type=rename command=bash fd.name=&lt;NA&gt; name=&lt;NA&gt; path=&lt;NA&gt; oldpath=/home/josevnz/.bash_history-01112.tmp host (id=host))"</span>,<span class="hljs-string">"priority"</span>:<span class="hljs-string">"Warning"</span>,<span class="hljs-string">"rule"</span>:<span class="hljs-string">"Delete or rename shell history"</span>,<span class="hljs-string">"source"</span>:<span class="hljs-string">"syscall"</span>,<span class="hljs-string">"tags"</span>:[<span class="hljs-string">"mitre_defense_evasion"</span>,<span class="hljs-string">"process"</span>],<span class="hljs-string">"time"</span>:<span class="hljs-string">"2022-05-04T03:04:10.661829867Z"</span>, <span class="hljs-string">"output_fields"</span>: {<span class="hljs-string">"container.id"</span>:<span class="hljs-string">"host"</span>,<span class="hljs-string">"container.name"</span>:<span class="hljs-string">"host"</span>,<span class="hljs-string">"evt.arg.name"</span>:null,<span class="hljs-string">"evt.arg.oldpath"</span>:<span class="hljs-string">"/home/josevnz/.bash_history-01112.tmp"</span>,<span class="hljs-string">"evt.arg.path"</span>:null,<span class="hljs-string">"evt.time"</span>:<span class="hljs-number">1651633450661829867</span>,<span class="hljs-string">"evt.type"</span>:<span class="hljs-string">"rename"</span>,<span class="hljs-string">"fd.name"</span>:null,<span class="hljs-string">"proc.cmdline"</span>:<span class="hljs-string">"bash"</span>,<span class="hljs-string">"user.loginuid"</span>:<span class="hljs-number">1000</span>,<span class="hljs-string">"user.name"</span>:<span class="hljs-string">"josevnz"</span>}}
</code></pre>
<p>This is a normal, legitimate operation: here bash itself is writing and renaming temporary history files. Let's find a way to narrow this rule down or remove it completely.</p>
<p>First, open the <code>/etc/falco/falco_rules.yaml</code> file and look for the rule 'Delete or rename shell history' (the rule name that appears in the JSON output we saw earlier):</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">list:</span> <span class="hljs-string">docker_binaries</span>
  <span class="hljs-attr">items:</span> [<span class="hljs-string">docker</span>, <span class="hljs-string">dockerd</span>, <span class="hljs-string">exe</span>, <span class="hljs-string">docker-compose</span>, <span class="hljs-string">docker-entrypoi</span>, <span class="hljs-string">docker-runc-cur</span>, <span class="hljs-string">docker-current</span>, <span class="hljs-string">dockerd-current</span>]

<span class="hljs-bullet">-</span> <span class="hljs-attr">macro:</span> <span class="hljs-string">var_lib_docker_filepath</span>
  <span class="hljs-attr">condition:</span> <span class="hljs-string">(evt.arg.name</span> <span class="hljs-string">startswith</span> <span class="hljs-string">/var/lib/docker</span> <span class="hljs-string">or</span> <span class="hljs-string">fd.name</span> <span class="hljs-string">startswith</span> <span class="hljs-string">/var/lib/docker)</span>

<span class="hljs-bullet">-</span> <span class="hljs-attr">rule:</span> <span class="hljs-string">Delete</span> <span class="hljs-string">or</span> <span class="hljs-string">rename</span> <span class="hljs-string">shell</span> <span class="hljs-string">history</span>
  <span class="hljs-attr">desc:</span> <span class="hljs-string">Detect</span> <span class="hljs-string">shell</span> <span class="hljs-string">history</span> <span class="hljs-string">deletion</span>
  <span class="hljs-attr">condition:</span> <span class="hljs-string">&gt;
    (modify_shell_history or truncate_shell_history) and
       not var_lib_docker_filepath and
       not proc.name in (docker_binaries)
</span>  <span class="hljs-attr">output:</span> <span class="hljs-string">&gt;
    Shell history had been deleted or renamed (user=%user.name user_loginuid=%user.loginuid type=%evt.type command=%proc.cmdline fd.name=%fd.name name=%evt.arg.name path=%evt.arg.path oldpath=%evt.arg.oldpath %container.info)
</span>  <span class="hljs-attr">priority:</span>
    <span class="hljs-string">WARNING</span>
  <span class="hljs-attr">tags:</span> [<span class="hljs-string">process</span>, <span class="hljs-string">mitre_defense_evasion</span>]
</code></pre>
<p>Falco rules are explained <a target="_blank" href="https://falco.org/docs/rules/">in detail</a> in the official documentation. Just by looking at this snippet, you will notice a few things.</p>
<p>Conditions support:</p>
<ol>
<li><p>Complex logic (<code>and</code>, <code>or</code>, <code>not</code>)</p>
</li>
<li><p>Macros like <code>var_lib_docker_filepath</code></p>
</li>
<li><p>Lists like <code>(docker_binaries)</code></p>
</li>
<li><p>Special variables with fields like <code>proc.name</code></p>
</li>
</ol>
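<p>To make the roles of lists, macros, and fields more concrete, here is a Python analogy of the condition above. It is only a mental model, not how Falco evaluates rules internally. The definitions of <code>modify_shell_history</code> and <code>truncate_shell_history</code> are not shown in this article, so the sketch replaces them with a plain boolean argument, and the <code>proc.name</code> values are assumptions:</p>

```python
# Python analogy of the rule condition; this is NOT how Falco evaluates rules internally
DOCKER_BINARIES = ["docker", "dockerd", "exe", "docker-compose", "docker-entrypoi",
                   "docker-runc-cur", "docker-current", "dockerd-current"]  # the list


def var_lib_docker_filepath(fields: dict) -> bool:
    """The macro: true when either file name field starts with /var/lib/docker."""
    names = (fields.get("evt.arg.name"), fields.get("fd.name"))
    return any(name is not None and name.startswith("/var/lib/docker") for name in names)


def rule_matches(fields: dict, touches_shell_history: bool) -> bool:
    """The condition. 'touches_shell_history' stands in for the two shell history
    macros, whose definitions are not shown in this article."""
    return (
        touches_shell_history
        and not var_lib_docker_filepath(fields)
        and fields.get("proc.name") not in DOCKER_BINARIES
    )


# File names copied from the captured events; the proc.name values are assumed
history_fields = {
    "evt.arg.name": "/home/josevnz/.bash_history-01112.tmp",
    "fd.name": "/home/josevnz/.bash_history-01112.tmp",
    "proc.name": "bash",
}
docker_fields = {"evt.arg.name": "/var/lib/docker/example", "fd.name": None, "proc.name": "dockerd"}

print(rule_matches(history_fields, True))
print(rule_matches(docker_fields, True))
```

<p>The list behaves like a named collection, the macro like a reusable boolean function, and the fields like keys looked up on each event.</p>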
<p>It is recommended that you do not change this file. Instead, override what you need in <code>/etc/falco/falco_rules.local.yaml</code>:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># Add new rules, like this one</span>
<span class="hljs-comment"># - rule: The program "sudo" is run in a container</span>
<span class="hljs-comment">#   desc: An event will trigger every time you run sudo in a container</span>
<span class="hljs-comment">#   condition: evt.type = execve and evt.dir=&lt; and container.id != host and proc.name = sudo</span>
<span class="hljs-comment">#   output: "Sudo run in container (user=%user.name %container.info parent=%proc.pname cmdline=%proc.cmdline)"</span>
<span class="hljs-comment">#   priority: ERROR</span>
<span class="hljs-comment">#   tags: [users, container]</span>

<span class="hljs-comment"># Or override/append to any rule, macro, or list from the Default Rules</span>
</code></pre>
<p>For the sake of example, say that we do care when the history of the super-user (root) is modified, but everybody else is fine. The best part is that you don't have to override the whole rule.</p>
<p>So the original rule will get a condition appended:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">rule:</span> <span class="hljs-string">Delete</span> <span class="hljs-string">or</span> <span class="hljs-string">rename</span> <span class="hljs-string">shell</span> <span class="hljs-string">history</span>
  <span class="hljs-attr">append:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">condition:</span> <span class="hljs-string">and</span> <span class="hljs-string">user.name=root</span>
</code></pre>
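<p>To see what the appended clause changes, we can model it in Python. This is a sketch under assumptions: <code>still_fires</code> is a hypothetical helper that only mimics the extra <code>and user.name=root</code> check on events that already matched the original rule, and the root event is made up for illustration:</p>

```python
def still_fires(event: dict) -> bool:
    """Model the appended 'and user.name=root' clause on an event
    that already matched the original rule."""
    return event["output_fields"].get("user.name") == "root"


# The first user name comes from the captured events, the second one is hypothetical
captured = {"rule": "Delete or rename shell history", "output_fields": {"user.name": "josevnz"}}
hypothetical = {"rule": "Delete or rename shell history", "output_fields": {"user.name": "root"}}

print(still_fires(captured))      # the regular user is now ignored
print(still_fires(hypothetical))  # root still triggers the warning
```

<p>In other words, the four warnings generated by the regular user would no longer be reported, while the same activity by root would be.</p>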
<p>It is always a good idea to validate that your rules are properly written. For that, you can tell Falco to check the original rules and your overrides together:</p>
<pre><code class="lang-shell">[root@macmini2 ~]# falco --validate /etc/falco/falco_rules.yaml --validate /etc/falco/falco_rules.local.yaml 
Fri May  6 20:48:00 2022: Validating rules file(s):
Fri May  6 20:48:00 2022:    /etc/falco/falco_rules.yaml
Fri May  6 20:48:00 2022:    /etc/falco/falco_rules.local.yaml
/etc/falco/falco_rules.yaml: Ok
/etc/falco/falco_rules.local.yaml: Ok
Fri May  6 20:48:01 2022: Ok

# If the rules are OK, restart Falco
[root@macmini2 ~]# systemctl restart falco.service
</code></pre>
<h2 id="heading-how-to-make-a-simple-event-explorer-in-python"><strong>How to Make a Simple Event Explorer in Python</strong></h2>
<p>You'll probably agree that getting a sense of what rules are noise and which ones are useful is tedious.</p>
<p>We need to normalize this data, and we will use a Python script that will:</p>
<ul>
<li><p>Remove non-JSON data</p>
</li>
<li><p>Aggregate event types without the timestamps</p>
</li>
<li><p>Generate a few aggregation statistics, so we can focus on the most frequent events in our system</p>
</li>
</ul>
<p>A small Python script can do the trick. I'm leaving out the UI rendering part (please check the code to see the full picture), and instead will show you the file parsing bits:</p>
<pre><code class="lang-python"><span class="hljs-comment">#!/usr/bin/env python3</span>
<span class="hljs-string">"""
Aggregate Falco events to make it easier to override rules
Jose Vicente Nunez (kodegeek.com@protonmail.com)
"""</span>
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">import</span> re
<span class="hljs-keyword">from</span> argparse <span class="hljs-keyword">import</span> ArgumentParser
<span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Any, Iterator
<span class="hljs-keyword">from</span> rich.console <span class="hljs-keyword">import</span> Console
<span class="hljs-keyword">from</span> falcotutor.ui <span class="hljs-keyword">import</span> EventDisplayApp, create_event_table, add_rows_to_create_event_table


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">filter_events</span>(<span class="hljs-params">journalctl_out: Path</span>) -&gt; Iterator[dict[str, Any]]:</span>
    <span class="hljs-string">"""
    Yield the Falco events found in a journalctl dump, skipping non-JSON lines
    :param journalctl_out: File created with 'journalctl --output=cat'
    :return: One parsed event per JSON line that has a rule and output fields
    """</span>
    <span class="hljs-keyword">with</span> open(journalctl_out, <span class="hljs-string">'r'</span>) <span class="hljs-keyword">as</span> journalctl_file:
        <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> journalctl_file:
            <span class="hljs-comment"># Falco events are JSON objects; service start-up messages are plain text</span>
            <span class="hljs-keyword">if</span> re.search(<span class="hljs-string">"^{"</span>, row):
                data = json.loads(row)
                <span class="hljs-keyword">if</span> <span class="hljs-string">'rule'</span> <span class="hljs-keyword">in</span> data <span class="hljs-keyword">and</span> <span class="hljs-string">'output_fields'</span> <span class="hljs-keyword">in</span> data:
                    <span class="hljs-keyword">yield</span> data


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">aggregate_events</span>(<span class="hljs-params">local_event: dict[str, Any], aggregated_events: dict[str, Any]</span>) -&gt; None:</span>
    <span class="hljs-string">"""
    Update the per-rule statistics with a single event
    :param local_event: Event to add to the aggregation
    :param aggregated_events: Statistics collected so far, keyed by rule name
    """</span>
    rule = local_event[<span class="hljs-string">'rule'</span>]
    <span class="hljs-keyword">if</span> rule <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> aggregated_events:
        aggregated_events[rule] = {
            <span class="hljs-string">'count'</span>: <span class="hljs-number">0</span>,
            <span class="hljs-string">'priority'</span>: local_event[<span class="hljs-string">'priority'</span>],
            <span class="hljs-string">'last_timestamp'</span>: <span class="hljs-string">""</span>,
            <span class="hljs-string">'last_fields'</span>: <span class="hljs-string">""</span>
        }
    aggregated_events[rule][<span class="hljs-string">'count'</span>] += <span class="hljs-number">1</span>
    aggregated_events[rule][<span class="hljs-string">'last_timestamp'</span>] = local_event[<span class="hljs-string">'time'</span>]
    <span class="hljs-comment"># Drop the per-event timestamp so it does not clutter the aggregated fields</span>
    local_event[<span class="hljs-string">'output_fields'</span>].pop(<span class="hljs-string">'evt.time'</span>, <span class="hljs-literal">None</span>)
    aggregated_events[rule][<span class="hljs-string">'last_fields'</span>] = json.dumps(local_event[<span class="hljs-string">'output_fields'</span>], indent=<span class="hljs-number">1</span>)


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    CONSOLE = Console()
    AGGREGATED = {}
    PARSER = ArgumentParser(description=__doc__)
    PARSER.add_argument(
        <span class="hljs-string">"falco_event"</span>,
        action=<span class="hljs-string">"store"</span>
    )
    ARGS = PARSER.parse_args()
    <span class="hljs-keyword">try</span>:
        event_table = create_event_table()
        <span class="hljs-keyword">for</span> event <span class="hljs-keyword">in</span> filter_events(ARGS.falco_event):
            aggregate_events(local_event=event, aggregated_events=AGGREGATED)
        add_rows_to_create_event_table(AGGREGATED, event_table)
        EventDisplayApp.run(
            event_file=ARGS.falco_event,
            title=<span class="hljs-string">"Falco aggregated events report"</span>,
            event_table=event_table
        )
    <span class="hljs-keyword">except</span> KeyboardInterrupt:
        CONSOLE.print(<span class="hljs-string">"[bold]Program interrupted...[/bold]"</span>)
</code></pre>
<p>Once each JSON line is parsed into a dictionary, we only need to iterate over the events to aggregate them, then show the results as a neat table sorted by count:</p>
<p><a target="_blank" href="https://asciinema.org/a/492898"><img src="https://asciinema.org/a/492898.svg" alt="asciicast" width="1693.08999933" height="727.999818" loading="lazy"></a></p>
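<p>The UI code is left out above, but the "sorted by count" step can be sketched on its own. This is a minimal sketch, assuming the same dictionary layout that <code>aggregate_events</code> builds. The helper name <code>sort_by_count</code> and the sample data are hypothetical and are not taken from the original script:</p>

```python
from typing import Any


def sort_by_count(aggregated: dict[str, dict[str, Any]]) -> list[tuple[str, dict[str, Any]]]:
    """Return (rule, details) pairs, most frequent rule first."""
    return sorted(aggregated.items(), key=lambda item: item[1]["count"], reverse=True)


if __name__ == "__main__":
    # Hypothetical aggregation result, shaped like the AGGREGATED dictionary above
    sample = {
        "Rule A": {"count": 3, "priority": "Warning"},
        "Rule B": {"count": 42, "priority": "Notice"},
    }
    for rule, details in sort_by_count(sample):
        print(f"{details['count']:>5} {details['priority']:10} {rule}")
```

<p>Sorting the dictionary items once, right before rendering, keeps the aggregation loop simple and leaves the table code free of any ordering logic.</p>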
<h2 id="heading-how-to-show-the-falco-rules"><strong>How to Show the Falco Rules</strong></h2>
<p>If you are like me, you are always looking at the /etc/falco/falco_rules.yaml file to understand what is being monitored. A brief view of those rules (without looking at the verbose YAML file with comments) is a nice addition:</p>
<pre><code class="lang-python"><span class="hljs-comment">#!/usr/bin/env python3</span>
<span class="hljs-string">"""
Show brief content of default Falco rule YAML files
Jose Vicente Nunez (kodegeek.com@protonmail.com)
"""</span>
<span class="hljs-keyword">from</span> argparse <span class="hljs-keyword">import</span> ArgumentParser
<span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Any
<span class="hljs-keyword">from</span> rich.console <span class="hljs-keyword">import</span> Console
<span class="hljs-keyword">import</span> yaml
<span class="hljs-keyword">from</span> falcotutor.ui <span class="hljs-keyword">import</span> create_rules_table, add_rows_to_create_rules_table, RulesDisplayApp


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">load_rulez</span>(<span class="hljs-params">falco_rulez: Path</span>) -&gt; dict[str, Any]:</span>
    <span class="hljs-string">"""
    Load the rules from a Falco YAML file, indexed by rule name (macros and lists are skipped)
    """</span>
    rulez = {}
    <span class="hljs-keyword">with</span> open(falco_rulez, <span class="hljs-string">'rt'</span>) <span class="hljs-keyword">as</span> falco_file:
        <span class="hljs-keyword">for</span> rule_data <span class="hljs-keyword">in</span> yaml.full_load(falco_file):
            <span class="hljs-keyword">if</span> <span class="hljs-string">'rule'</span> <span class="hljs-keyword">in</span> rule_data:
                rule_name = rule_data[<span class="hljs-string">'rule'</span>]
                <span class="hljs-keyword">del</span> rule_data[<span class="hljs-string">'rule'</span>]
                rulez[rule_name] = rule_data
    <span class="hljs-keyword">return</span> rulez


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    CONSOLE = Console()
    AGGREGATED = {}
    PARSER = ArgumentParser(description=__doc__)
    PARSER.add_argument(
        <span class="hljs-string">"falco_rules"</span>,
        action=<span class="hljs-string">"store"</span>
    )
    ARGS = PARSER.parse_args()
    <span class="hljs-keyword">try</span>:
        RULES = load_rulez(ARGS.falco_rules)
        RULE_TBL = create_rules_table()
        add_rows_to_create_rules_table(lrules=RULES, rules_tbl=RULE_TBL)
        RulesDisplayApp.run(
            rules_file=ARGS.falco_rules,
            title=<span class="hljs-string">"Falco brief rule display"</span>,
            rules_table=RULE_TBL
        )
    <span class="hljs-keyword">except</span> KeyboardInterrupt:
        CONSOLE.print(<span class="hljs-string">"[bold]Program interrupted...[/bold]"</span>)
</code></pre>
<p>You could improve this script by filtering rules on certain criteria (for example rule name, priority, or enabled/disabled state). This version doesn't do any filtering:</p>
<p><a target="_blank" href="https://asciinema.org/a/492908"><img src="https://asciinema.org/a/492908.svg" alt="asciicast" width="1693.08999933" height="709.333156" loading="lazy"></a></p>
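<p>The filtering idea mentioned above could look something like this. It is only a sketch: the function <code>filter_rules</code> and its parameters are hypothetical, it works on the dictionary returned by <code>load_rulez</code>, and it assumes that a rule without an explicit <code>enabled</code> key counts as enabled:</p>

```python
from typing import Any, Optional


def filter_rules(
        rules: dict[str, dict[str, Any]],
        name_contains: Optional[str] = None,
        priority: Optional[str] = None,
        only_enabled: bool = False
) -> dict[str, dict[str, Any]]:
    """Keep only the rules that match every criterion that was provided."""
    selected = {}
    for name, data in rules.items():
        if name_contains and name_contains.lower() not in name.lower():
            continue
        if priority and str(data.get("priority", "")).lower() != priority.lower():
            continue
        # Assumption: a rule without an explicit 'enabled' key counts as enabled
        if only_enabled and not data.get("enabled", True):
            continue
        selected[name] = data
    return selected


if __name__ == "__main__":
    # Hypothetical rules, shaped like the output of load_rulez()
    sample = {
        "Write below etc": {"priority": "ERROR"},
        "Read sensitive file": {"priority": "WARNING", "enabled": False},
    }
    print(list(filter_rules(sample, priority="error")))
```

<p>Each criterion is optional, so the same function can back a command line flag per filter without changing how the table is built.</p>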
<h1 id="heading-falco-integrations"><strong>Falco Integrations</strong></h1>
<p>You probably noticed two things from our earlier experimentation:</p>
<ol>
<li><p>The payload of the events does not include the host. If you want to locate an offending server, you need to improve how a multi-host event is reported (parsing a journalctl file from many hosts is not practical).</p>
</li>
<li><p>We want to get alerts in a centralized location. It would be nice to have a way to "push" those events instead of us going fishing for them.</p>
</li>
</ol>
<p>It is time to consolidate those alerts in a single place.</p>
<h2 id="heading-how-to-use-falco-exporter"><strong>How to Use Falco Exporter</strong></h2>
<p>The <a target="_blank" href="https://github.com/falcosecurity/falco-exporter">Falco exporter</a> will allow us to share the Falco alerts with the Prometheus scraper. First we need to enable <a target="_blank" href="https://grpc.io/">gRPC</a> in /etc/falco/falco.yaml:</p>
<pre><code class="lang-python"><span class="hljs-comment"># gRPC server using an unix socket</span>
grpc:
  enabled: true
  bind_address: <span class="hljs-string">"unix:///var/run/falco.sock"</span>
  <span class="hljs-comment"># when threadiness is 0, Falco automatically guesses it depending on the number of online cores</span>
  threadiness: <span class="hljs-number">0</span>

<span class="hljs-comment"># gRPC output service.</span>
<span class="hljs-comment"># By default it is off.</span>
<span class="hljs-comment"># By enabling this all the output events will be kept in memory until you read them with a gRPC client.</span>
<span class="hljs-comment"># Make sure to have a consumer for them or leave this disabled.</span>
grpc_output:
  enabled: true
</code></pre>
<p>Restart Falco:</p>
<pre><code class="lang-python">[root@macmini2 ~]<span class="hljs-comment"># systemctl restart falco.service </span>
[root@macmini2 ~]<span class="hljs-comment"># systemctl status falco.service </span>
● falco.service - Falco: Container Native Runtime Security
   Loaded: loaded (/usr/lib/systemd/system/falco.service; disabled; vendor preset: disabled)
   Active: active (running) since Sun <span class="hljs-number">2022</span><span class="hljs-number">-05</span><span class="hljs-number">-01</span> <span class="hljs-number">20</span>:<span class="hljs-number">35</span>:<span class="hljs-number">01</span> EDT; <span class="hljs-number">26</span>s ago
     Docs: https://falco.org/docs/
  Process: <span class="hljs-number">28285</span> ExecStartPre=/sbin/modprobe falco (code=exited, status=<span class="hljs-number">0</span>/SUCCESS)
 Main PID: <span class="hljs-number">28288</span> (falco)
    Tasks: <span class="hljs-number">11</span> (limit: <span class="hljs-number">2310</span>)
   Memory: <span class="hljs-number">80.9</span>M
   CGroup: /system.slice/falco.service
           └─<span class="hljs-number">28288</span> /usr/bin/falco --pidfile=/var/run/falco.pid

May <span class="hljs-number">01</span> <span class="hljs-number">20</span>:<span class="hljs-number">35</span>:<span class="hljs-number">01</span> macmini2 systemd[<span class="hljs-number">1</span>]: Starting Falco: Container Native Runtime Security...
May <span class="hljs-number">01</span> <span class="hljs-number">20</span>:<span class="hljs-number">35</span>:<span class="hljs-number">01</span> macmini2 systemd[<span class="hljs-number">1</span>]: Started Falco: Container Native Runtime Security.
May <span class="hljs-number">01</span> <span class="hljs-number">20</span>:<span class="hljs-number">35</span>:<span class="hljs-number">01</span> macmini2 falco[<span class="hljs-number">28288</span>]: Falco version <span class="hljs-number">0.31</span><span class="hljs-number">.1</span> (driver version b7eb0dd65226a8dc254d228c8d950d07bf3521d2)
May <span class="hljs-number">01</span> <span class="hljs-number">20</span>:<span class="hljs-number">35</span>:<span class="hljs-number">01</span> macmini2 falco[<span class="hljs-number">28288</span>]: Falco initialized <span class="hljs-keyword">with</span> configuration file /etc/falco/falco.yaml
May <span class="hljs-number">01</span> <span class="hljs-number">20</span>:<span class="hljs-number">35</span>:<span class="hljs-number">01</span> macmini2 falco[<span class="hljs-number">28288</span>]: Loading rules <span class="hljs-keyword">from</span> file /etc/falco/falco_rules.yaml:
May <span class="hljs-number">01</span> <span class="hljs-number">20</span>:<span class="hljs-number">35</span>:<span class="hljs-number">02</span> macmini2 falco[<span class="hljs-number">28288</span>]: Loading rules <span class="hljs-keyword">from</span> file /etc/falco/falco_rules.local.yaml:
May <span class="hljs-number">01</span> <span class="hljs-number">20</span>:<span class="hljs-number">35</span>:<span class="hljs-number">03</span> macmini2 falco[<span class="hljs-number">28288</span>]: Loading rules <span class="hljs-keyword">from</span> file /etc/falco/k8s_audit_rules.yaml:
May <span class="hljs-number">01</span> <span class="hljs-number">20</span>:<span class="hljs-number">35</span>:<span class="hljs-number">04</span> macmini2 falco[<span class="hljs-number">28288</span>]: Starting internal webserver, listening on port <span class="hljs-number">8765</span>
May <span class="hljs-number">01</span> <span class="hljs-number">20</span>:<span class="hljs-number">35</span>:<span class="hljs-number">04</span> macmini2 falco[<span class="hljs-number">28288</span>]: gRPC server threadiness equals to <span class="hljs-number">2</span>
May <span class="hljs-number">01</span> <span class="hljs-number">20</span>:<span class="hljs-number">35</span>:<span class="hljs-number">04</span> macmini2 falco[<span class="hljs-number">28288</span>]: Starting gRPC server at unix:///var/run/falco.sock
</code></pre>
<p>Quickly make sure everything is OK (reminder, the Falco agent is running on macmini2):</p>
<pre><code class="lang-shell">josevnz@raspberrypi:~$ curl --fail http://macmini2:8765/healthz
{"status": "ok"}josevnz@raspberrypi:~$
</code></pre>
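<p>If you prefer to run the same check from a script (for example from a cron job), here is a minimal sketch using only the standard library. The function names are hypothetical, and it assumes the health endpoint keeps answering with the JSON body shown above:</p>

```python
import json
from urllib.request import urlopen


def is_healthy(body: str) -> bool:
    """Return True when the health endpoint body reports an 'ok' status."""
    try:
        return json.loads(body).get("status") == "ok"
    except (json.JSONDecodeError, AttributeError):
        # Not JSON, or JSON that is not an object: treat it as unhealthy
        return False


def check_falco(url: str, timeout: float = 5.0) -> bool:
    """Fetch the health endpoint; network and HTTP errors count as unhealthy."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return is_healthy(response.read().decode("utf-8"))
    except OSError:
        return False
```

<p>With the setup from this tutorial you would call <code>check_falco("http://macmini2:8765/healthz")</code>. Keeping the parsing in <code>is_healthy</code> makes it possible to test the logic without a running Falco agent.</p>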
<p>Then we run the falco-exporter. To make it easier, we will use a Docker container <a target="_blank" href="https://docs.docker.com/engine/reference/commandline/run/">with a few overrides in the command line</a>.</p>
<pre><code class="lang-python">[root@macmini2 ~]<span class="hljs-comment"># docker run --restart always --name falco-exporter --detach --volume /var/run/falco.sock:/var/run/falco.sock --network=host falcosecurity/falco-exporter --listen-address 192.168.1.16:9376</span>
<span class="hljs-number">7</span>d157af0251ea4bc73b8c355a74eaf4dd24a5348cbe3f5f2ea9d7147c6c366c8
[root@macmini2 ~]<span class="hljs-comment"># docker logs falco-exporter</span>
<span class="hljs-number">2022</span>/<span class="hljs-number">05</span>/<span class="hljs-number">02</span> <span class="hljs-number">00</span>:<span class="hljs-number">56</span>:<span class="hljs-number">30</span> connecting to gRPC server at unix:///var/run/falco.sock (timeout <span class="hljs-number">2</span>m0s)
<span class="hljs-number">2022</span>/<span class="hljs-number">05</span>/<span class="hljs-number">02</span> <span class="hljs-number">00</span>:<span class="hljs-number">56</span>:<span class="hljs-number">30</span> listening on http://<span class="hljs-number">192.168</span><span class="hljs-number">.1</span><span class="hljs-number">.16</span>:<span class="hljs-number">9376</span>/metrics
<span class="hljs-number">2022</span>/<span class="hljs-number">05</span>/<span class="hljs-number">02</span> <span class="hljs-number">00</span>:<span class="hljs-number">56</span>:<span class="hljs-number">30</span> connected to gRPC server, subscribing events stream
<span class="hljs-number">2022</span>/<span class="hljs-number">05</span>/<span class="hljs-number">02</span> <span class="hljs-number">00</span>:<span class="hljs-number">56</span>:<span class="hljs-number">30</span> ready

<span class="hljs-comment"># Check with CURL if the URL is reachable</span>
[root@macmini2 ~]<span class="hljs-comment"># curl http://192.168.1.16:9376/metrics</span>
<span class="hljs-comment"># HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.</span>
<span class="hljs-comment"># TYPE go_gc_duration_seconds summary</span>
go_gc_duration_seconds{quantile=<span class="hljs-string">"0"</span>} <span class="hljs-number">0</span>
go_gc_duration_seconds{quantile=<span class="hljs-string">"0.25"</span>} <span class="hljs-number">0</span>
go_gc_duration_seconds{quantile=<span class="hljs-string">"0.5"</span>} <span class="hljs-number">0</span>
go_gc_duration_seconds{quantile=<span class="hljs-string">"0.75"</span>} <span class="hljs-number">0</span>
go_gc_duration_seconds{quantile=<span class="hljs-string">"1"</span>} <span class="hljs-number">0</span>
go_gc_duration_seconds_sum <span class="hljs-number">0</span>
go_gc_duration_seconds_count <span class="hljs-number">0</span>
<span class="hljs-comment"># HELP go_goroutines Number of goroutines that currently exist.</span>
<span class="hljs-comment"># TYPE go_goroutines gauge</span>
go_goroutines <span class="hljs-number">18</span>
<span class="hljs-comment"># HELP go_info Information about the Go environment.</span>
<span class="hljs-comment"># TYPE go_info gauge</span>
go_info{version=<span class="hljs-string">"go1.14.15"</span>} <span class="hljs-number">1</span>
<span class="hljs-comment"># HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.</span>
<span class="hljs-comment"># TYPE go_memstats_alloc_bytes gauge</span>
go_memstats_alloc_bytes <span class="hljs-number">2.011112e+06</span>
</code></pre>
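<p>The output above uses the Prometheus text format: comment lines start with <code>#</code> and every other line is a sample followed by its value. A rough sketch of how you could read it from Python follows. The function <code>parse_metrics</code> is hypothetical, and it assumes the value is the last token on the line (no trailing timestamps), which holds for the output shown here:</p>

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Map each sample (name plus labels) to its value, skipping comments and blanks."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Assumption: the value is the last whitespace-separated token on the line
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


if __name__ == "__main__":
    # A few lines copied from the curl output above
    sample = "\n".join([
        "# TYPE go_goroutines gauge",
        "go_goroutines 18",
        'go_gc_duration_seconds{quantile="0.5"} 0',
        "go_memstats_alloc_bytes 2.011112e+06",
    ])
    for name, value in parse_metrics(sample).items():
        print(name, value)
```

<p>For anything beyond a quick check, a proper client library is a better choice than hand-parsing, but this is enough to confirm that the exporter is serving data.</p>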
<p>For completeness, let me show you also how to capture the host performance metrics <a target="_blank" href="https://prometheus.io/docs/guides/node-exporter/">using the node exporter</a> (we will use it later to keep an eye on how many resources are used by Falco and to make sure our installation is not hurting the server):</p>
<pre><code class="lang-python">docker run --detach --net=<span class="hljs-string">"host"</span> --pid=<span class="hljs-string">"host"</span> --volume <span class="hljs-string">"/:/host:ro,rslave"</span> quay.io/prometheus/node-exporter:latest --path.rootfs=/host
</code></pre>
<p>The node-exporter and the falco-exporter will run on every host that needs its data scraped. Now you need a way to collect all these metrics into a single location. For that we will use the <a target="_blank" href="https://prometheus.io/docs/prometheus/latest/getting_started/">Prometheus agent</a>:</p>
<pre><code class="lang-python">---
<span class="hljs-comment"># /etc/prometheus.yaml on raspberrypi</span>
<span class="hljs-keyword">global</span>:
    scrape_interval: <span class="hljs-number">30</span>s
    evaluation_interval: <span class="hljs-number">30</span>s
    scrape_timeout: <span class="hljs-number">10</span>s
    external_labels:
        monitor: <span class="hljs-string">'nunez-family-monitor'</span>

scrape_configs:
  - job_name: <span class="hljs-string">'falco-exporter'</span>
    static_configs:
      - targets: [<span class="hljs-string">'macmini2.home:9376'</span>]
  - job_name: <span class="hljs-string">'node-exporter'</span>
    static_configs:
      - targets: [<span class="hljs-string">'macmini2.home:9100'</span>, <span class="hljs-string">'raspberrypi.home:9100'</span>, <span class="hljs-string">'dmaf5:9100'</span>]
  - job_name: <span class="hljs-string">'docker-exporter'</span>
    static_configs:
      - targets: [<span class="hljs-string">'macmini2.home:9323'</span>, <span class="hljs-string">'raspberrypi.home:9323'</span>, <span class="hljs-string">'dmaf5:9323'</span>]

    tls_config:
      insecure_skip_verify: true
</code></pre>
<p>Then make sure the Prometheus scraper can talk with each one of the nodes. We visit the web UI:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/prometheus-raspberrypi.png" alt="Image" width="600" height="400" loading="lazy"></p>
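<p>The same check can be done without the browser, because Prometheus also exposes the target list through its HTTP API at <code>/api/v1/targets</code>. Below is a small sketch that inspects such a response. The function name <code>targets_down</code> and the sample payload are hypothetical, and the payload only contains the fields this example needs:</p>

```python
from typing import Any


def targets_down(payload: dict[str, Any]) -> list[str]:
    """Return the scrape URLs of every active target whose health is not 'up'."""
    active = payload.get("data", {}).get("activeTargets", [])
    return [target["scrapeUrl"] for target in active if target.get("health") != "up"]


if __name__ == "__main__":
    # Hypothetical, trimmed-down response from /api/v1/targets
    sample = {
        "status": "success",
        "data": {
            "activeTargets": [
                {"scrapeUrl": "http://macmini2.home:9376/metrics", "health": "up"},
                {"scrapeUrl": "http://dmaf5:9100/metrics", "health": "down"},
            ]
        },
    }
    print(targets_down(sample))
```

<p>To use it against a live server you would download the JSON first (for example with <code>urllib.request.urlopen</code>) and pass the decoded dictionary to the function.</p>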
<p>Good, Prometheus is able to scrape Falco. We can even run a simple query to see a few events:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/prometheus-query-falco.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Next we need to set up the UI view for the events, and for that we will use Grafana.</p>
<p>There are many ways to install Grafana. In my case <a target="_blank" href="https://grafana.com/docs/grafana/latest/installation/docker/">I will use a Grafana Docker container</a> (I will run Grafana on the same host where Prometheus is running: raspberrypi.home):</p>
<pre><code class="lang-python">docker pull grafana/grafana:main-ubuntu
mkdir -p /data/grafana
chown syslog /data/grafana
docker run --user <span class="hljs-number">104</span> --name grafana --detach --tty --volume /data/grafana:/var/lib/grafana -p <span class="hljs-number">3000</span>:<span class="hljs-number">3000</span> grafana/grafana:main-ubuntu
</code></pre>
<p>After Grafana comes up, you will need to change your password and connect it to Prometheus:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/grafana-prometheus-falco.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Once Grafana is up, we can <a target="_blank" href="https://grafana.com/grafana/dashboards/11914">import the Falco dashboard</a> as <a target="_blank" href="https://grafana.com/docs/reference/export_import/">explained here</a>.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/falco-grafana-integration.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Once the dashboard is imported, we can generate a few events to trigger Falco on the host where it is installed:</p>
<pre><code class="lang-python">[root@macmini2 ~]<span class="hljs-comment"># for i in $(seq 1 60); do docker run --rm --interactive --tty --privileged fedora:latest /bin/bash -c ls; touch /root/test; rm -f /root/test; sleep 1; done</span>
</code></pre>
<p>After a little while you should see something like this on your Grafana dashboard:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/grafana-falco-dashboard.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>The events are flowing, and you can see from which host they came.</p>
<h2 id="heading-how-to-create-alerts-for-your-falco-events"><strong>How to Create Alerts for Your Falco Events</strong></h2>
<p>Ideally, if you have the Falco events in Grafana, you can make them actionable and generate alerts from them.</p>
<p>I don't want to get bombarded by non-critical alerts, so the first thing to decide is which priority levels to filter on:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/grafana-priority-events.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Anything with priority below 3 will be treated as an alert.</p>
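<p>To make the threshold concrete, here is a small sketch of the rule in Python. It rests on an assumption that you should verify against your own dashboard: that the numeric levels follow the syslog-style order of the Falco priorities (emergency is 0, debug is 7), so a lower number means a more severe event. The names <code>PRIORITIES</code> and <code>should_alert</code> are hypothetical:</p>

```python
# Assumption: numeric levels follow the syslog-style order of Falco priorities,
# where a lower number means a more severe event
PRIORITIES = ["emergency", "alert", "critical", "error", "warning", "notice", "informational", "debug"]


def should_alert(priority: str, threshold: int = 3) -> bool:
    """Return True when the priority level is numerically below the threshold."""
    level = PRIORITIES.index(priority.lower())  # raises ValueError for unknown names
    return threshold > level


if __name__ == "__main__":
    for name in ("Critical", "Error", "Warning"):
        print(name, should_alert(name))
```

<p>Under that assumption, a threshold of 3 lets emergency, alert, and critical events through and keeps everything from error down to debug out of the alert channel.</p>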
<p>Grafana has <a target="_blank" href="https://grafana.com/docs/grafana/latest/alerting/unified-alerting/alerting-rules/create-grafana-managed-rule/">good documentation on how to set up an alert</a>, so I will only show the end result here:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/grafana-falco-alert-definition.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>The next step is to send the alerts somewhere.</p>
<h2 id="heading-alerts-need-to-go-somewhere-how-to-define-a-contact-point-using-discord"><strong>Alerts Need to Go Somewhere – How to Define a Contact Point using Discord</strong></h2>
<p>For this example we will use <a target="_blank" href="https://discord.com/">Discord</a> as the destination for the alerts. Discord has a very detailed guide on how to set up a <a target="_blank" href="https://support.discord.com/hc/en-us/articles/228383668-Intro-to-Webhooks">webhook</a>, so I will only show you the end result of my Discord webhook here:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/grafana-discord-webhook.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>We copy that URL and then configure a new Grafana contact point that uses our Discord webhook (<em>we are setting this as the default contact point for all the alerts</em>):</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/grafana-contact-point-discord.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>From there we can send a test message to Discord, just to confirm that this pipeline works:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/grafana-discord-events-test.png" alt="Image" width="600" height="400" loading="lazy"></p>
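<p>You can also test the webhook itself, without Grafana in the middle, by posting a small JSON document with a <code>content</code> field to the webhook URL. The sketch below uses only the standard library. The function names and the <code>User-Agent</code> value are hypothetical (the header is set explicitly on the assumption that the default one used by <code>urllib</code> may be rejected), and you need to supply your own webhook URL:</p>

```python
import json
from urllib.request import Request, urlopen


def build_discord_request(webhook_url: str, message: str) -> Request:
    """Prepare (but do not send) a POST request carrying a plain text message."""
    payload = json.dumps({"content": message}).encode("utf-8")
    return Request(
        webhook_url,
        data=payload,
        # Hypothetical User-Agent, set explicitly instead of relying on the urllib default
        headers={"Content-Type": "application/json", "User-Agent": "falco-alert-test"},
        method="POST"
    )


def send_test_message(webhook_url: str, message: str) -> int:
    """Send the message and return the HTTP status code."""
    with urlopen(build_discord_request(webhook_url, message), timeout=10) as response:
        return response.status
```

<p>Splitting the request construction from the sending step makes it easy to inspect the payload before anything leaves your machine. Treat the webhook URL as a secret: anyone who has it can post to your channel.</p>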
<p>We're getting closer. By now, if we go back to our alert definition, we should see that it is in the 'firing' state:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/grafana-falco-alert-firing.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>And if everything goes well, we will also see our first Falco alert in Discord:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/05/grafana-falco-discord-alert.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Here we can see all the fields we get in the journalctl output. The difference is that these messages now come from every server where you set up the Falco-Prometheus-Grafana bridge.</p>
<h2 id="heading-honorable-mention-how-to-aggregate-alerts-using-falcon-sidekick-falcon-sidekick-ui"><strong>Honorable Mention: How to Aggregate Alerts using Falco Sidekick / Falco Sidekick-UI</strong></h2>
<p><a target="_blank" href="https://github.com/falcosecurity/falcosidekick">Falco Sidekick</a> is another way to gather and send events to other destinations, like the <a target="_blank" href="https://github.com/falcosecurity/falcosidekick-ui">Falco Sidekick-UI</a>. But it won't tell you the originating host (at least as of Falco 0.31.1).</p>
<p>This is most likely not an issue for an alert coming from a K8s cluster or a containerized application, where the image name will give you plenty of information. But if your event happens in a bare-metal environment, and you have more than 2 machines, it will become a headache.</p>
<p>For that reason I won't cover Sidekick here – you may want to stick with the Grafana integration for the time being.</p>
<h1 id="heading-learning-more"><strong>Learning More</strong></h1>
<p>Falco has a great interactive learning <a target="_blank" href="https://falco.org/docs/getting-started/third-party/learning/">environment</a>. You should try it to see what else is possible. There are a lot of things I did not cover here, like rule exceptions, for example.</p>
<p>Also, did you know that Falco can be extended using <a target="_blank" href="https://falco.org/docs/plugins/">plugins</a>? You can have fun and learn using C++ or Go as the language of choice.</p>
<p>The Falco blog has lots of <a target="_blank" href="https://falco.org/blog/">interesting articles</a>, including posts about the latest threats.</p>
<p>Finally, the project has a very active community on <a target="_blank" href="https://falco.org/community/">many channels</a>. Pick yours and explore.</p>
<p>Feel free to <a target="_blank" href="https://github.com/josevnz/Falco">fork my code</a> and report <a target="_blank" href="https://github.com/josevnz/Falco/issues">any issues</a> you find. But more importantly, explore and learn by doing.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Debug Applications with Strace, Python, and Wireshark ]]>
                </title>
                <description>
                    <![CDATA[ In this article I will show you a few techniques you can use to troubleshoot a program when it is not behaving. This list is not universal and, depending on what you are looking for, it may not be enough to solve your problem. But it should be a good st... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/debug-an-app-with-strace-python-wireshark/</link>
                <guid isPermaLink="false">66d85136f5638bdd06059a55</guid>
                
                    <category>
                        <![CDATA[ debugging ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Problem Solving ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Tue, 26 Apr 2022 00:23:26 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2022/04/pexels-egor-kamelev-3819742.jpg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In this article I will show you a few techniques you can use to troubleshoot a program when it is not behaving.</p>
<p>This list is not universal and, depending on what you are looking for, it may not be enough to solve your problem. But it should be a good start.</p>
<p>Before we start you should be familiar with a few things:</p>
<ul>
<li><p>How to run commands on Linux</p>
</li>
<li><p>Protocols like DNS, HTTP, and TLS</p>
</li>
<li><p>A scripting language like Python</p>
</li>
</ul>
<p>Don't worry too much. I will give you enough information so you can follow along with the tutorial.</p>
<p>And what will you learn?</p>
<ul>
<li><p>Basic usage of strace, nslookup, and RPM</p>
</li>
<li><p>How to use some interesting features of the Python debugger</p>
</li>
<li><p>How to analyze traffic with Wireshark</p>
</li>
</ul>
<h2 id="heading-the-problem-failing-to-upload-a-file-to-asciinema"><strong>The Problem: Failing to upload a file to asciinema</strong></h2>
<p>So I recorded an asciicast, using the cool Open Source project <a target="_blank" href="https://asciinema.org/docs/usage">asciinema</a>, for my small Open Source project <a target="_blank" href="https://pypi.org/project/SuricataLog">SuricataLog</a>. Then I decided to share it with the world.</p>
<p>But unlike the other recordings, this one refused to be uploaded:</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ asciinema upload demo-ascii.cast 
asciinema: upload failed: &lt;urlopen error [Errno <span class="hljs-number">32</span>] Broken pipe&gt;
asciinema: retry later by running: asciinema upload demo-ascii.cast
</code></pre>
<p>Asciinema doesn't tell us much about the error. For example:</p>
<ul>
<li><p>What server and port does the tool try to use to upload the file?</p>
</li>
<li><p>Which part of the protocol handshake is failing?</p>
</li>
<li><p>Is the destination a problem or is it an issue on my side?</p>
</li>
</ul>
<p>We will use a few tools to see what is going on here.</p>
<h2 id="heading-how-to-run-the-program-with-strace"><strong>How to run the program with strace</strong></h2>
<p>What is <a target="_blank" href="https://strace.io/">strace</a>?</p>
<blockquote>
<p>strace is a diagnostic, debugging, and instructional userspace utility for Linux. It is used to monitor and tamper with interactions between processes and the Linux kernel, which include system calls, signal deliveries, and changes of process state.</p>
<p>System administrators, diagnosticians, and troubleshooters will find it invaluable for solving problems with programs for which the source is not readily available since they do not need to be recompiled in order to trace them.</p>
</blockquote>
<p>strace is super useful when you don't have the source code of an application and yet you need to understand what is wrong when you call a program. Time to see it in action:</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ strace asciinema upload demo-ascii.cast
execve(<span class="hljs-string">"/usr/bin/asciinema"</span>, [<span class="hljs-string">"asciinema"</span>, <span class="hljs-string">"upload"</span>, <span class="hljs-string">"demo-ascii.cast"</span>], <span class="hljs-number">0x7ffdcddb1160</span> /* <span class="hljs-number">55</span> vars */) = <span class="hljs-number">0</span>
brk(NULL)                               = <span class="hljs-number">0x55e912d58000</span>
arch_prctl(<span class="hljs-number">0x3001</span> /* ARCH_??? */, <span class="hljs-number">0x7fff2f136480</span>) = <span class="hljs-number">-1</span> EINVAL (Invalid argument)
access(<span class="hljs-string">"/etc/ld.so.preload"</span>, R_OK)      = <span class="hljs-number">-1</span> ENOENT (No such file <span class="hljs-keyword">or</span> directory)
openat(AT_FDCWD, <span class="hljs-string">"/etc/ld.so.cache"</span>, O_RDONLY|O_CLOEXEC) = <span class="hljs-number">3</span>
fstat(<span class="hljs-number">3</span>, {st_mode=S_IFREG|<span class="hljs-number">0644</span>, st_size=<span class="hljs-number">92299</span>, ...}) = <span class="hljs-number">0</span>
mmap(NULL, <span class="hljs-number">92299</span>, PROT_READ, MAP_PRIVATE, <span class="hljs-number">3</span>, <span class="hljs-number">0</span>) = <span class="hljs-number">0x7f69dd26a000</span>
close(<span class="hljs-number">3</span>)                                = <span class="hljs-number">0</span>
<span class="hljs-comment"># </span>
<span class="hljs-comment"># LOTS of output omitted</span>
<span class="hljs-comment"># ...</span>
close(<span class="hljs-number">4</span>)                                = <span class="hljs-number">0</span>
socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC, IPPROTO_IP) = <span class="hljs-number">4</span>
connect(<span class="hljs-number">4</span>, {sa_family=AF_INET, sin_port=htons(<span class="hljs-number">443</span>), sin_addr=inet_addr(<span class="hljs-string">"109.107.38.233"</span>)}, <span class="hljs-number">16</span>) = <span class="hljs-number">0</span>
getsockname(<span class="hljs-number">4</span>, {sa_family=AF_INET, sin_port=htons(<span class="hljs-number">33771</span>), sin_addr=inet_addr(<span class="hljs-string">"192.168.1.22"</span>)}, [<span class="hljs-number">28</span> =&gt; <span class="hljs-number">16</span>]) = <span class="hljs-number">0</span>
connect(<span class="hljs-number">4</span>, {sa_family=AF_UNSPEC, sa_data=<span class="hljs-string">"\0\0\0\0\0\0\0\0\0\0\0\0\0\0"</span>}, <span class="hljs-number">16</span>) = <span class="hljs-number">0</span>
connect(<span class="hljs-number">4</span>, {sa_family=AF_INET, sin_port=htons(<span class="hljs-number">443</span>), sin_addr=inet_addr(<span class="hljs-string">"109.107.37.0"</span>)}, <span class="hljs-number">16</span>) = <span class="hljs-number">0</span>
getsockname(<span class="hljs-number">4</span>, {sa_family=AF_INET, sin_port=htons(<span class="hljs-number">35023</span>), sin_addr=inet_addr(<span class="hljs-string">"192.168.1.22"</span>)}, [<span class="hljs-number">28</span> =&gt; <span class="hljs-number">16</span>]) = <span class="hljs-number">0</span>
close(<span class="hljs-number">4</span>)                                = <span class="hljs-number">0</span>
socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = <span class="hljs-number">4</span>
connect(<span class="hljs-number">4</span>, {sa_family=AF_INET, sin_port=htons(<span class="hljs-number">443</span>), sin_addr=inet_addr(<span class="hljs-string">"109.107.38.233"</span>)}, <span class="hljs-number">16</span>) = <span class="hljs-number">0</span>
setsockopt(<span class="hljs-number">4</span>, SOL_TCP, TCP_NODELAY, [<span class="hljs-number">1</span>], <span class="hljs-number">4</span>) = <span class="hljs-number">0</span>
getsockopt(<span class="hljs-number">4</span>, SOL_SOCKET, SO_TYPE, [<span class="hljs-number">1</span>], [<span class="hljs-number">4</span>]) = <span class="hljs-number">0</span>
getsockname(<span class="hljs-number">4</span>, {sa_family=AF_INET, sin_port=htons(<span class="hljs-number">55682</span>), sin_addr=inet_addr(<span class="hljs-string">"192.168.1.22"</span>)}, [<span class="hljs-number">128</span> =&gt; <span class="hljs-number">16</span>]) = <span class="hljs-number">0</span>
ioctl(<span class="hljs-number">4</span>, FIONBIO, [<span class="hljs-number">0</span>])                  = <span class="hljs-number">0</span>
getpeername(<span class="hljs-number">4</span>, {sa_family=AF_INET, sin_port=htons(<span class="hljs-number">443</span>), sin_addr=inet_addr(<span class="hljs-string">"109.107.38.233"</span>)}, [<span class="hljs-number">16</span>]) = <span class="hljs-number">0</span>
getpid()                                = <span class="hljs-number">45070</span>
getpid()                                = <span class="hljs-number">45070</span>
getpid()                                = <span class="hljs-number">45070</span>
getpid()                                = <span class="hljs-number">45070</span>
getpid()                                = <span class="hljs-number">45070</span>
getpid()                                = <span class="hljs-number">45070</span>
write(<span class="hljs-number">4</span>, <span class="hljs-string">"\26\3\1\2\0\1\0\1\374\3\3\327\2*\v\316GT*\262\207\235\264\317\254\37$|,V\205\362"</span>..., <span class="hljs-number">517</span>) = <span class="hljs-number">517</span>
read(<span class="hljs-number">4</span>, <span class="hljs-string">"\26\3\3\0z"</span>, <span class="hljs-number">5</span>)                = <span class="hljs-number">5</span>
read(<span class="hljs-number">4</span>, <span class="hljs-string">"\2\0\0v\3\3E\217G?\335.;\212\237pn\16\257$\2\324J\324y\17\306\263\325i\264p"</span>..., <span class="hljs-number">122</span>) = <span class="hljs-number">122</span>
read(<span class="hljs-number">4</span>, <span class="hljs-string">"\24\3\3\0\1"</span>, <span class="hljs-number">5</span>)               = <span class="hljs-number">5</span>
read(<span class="hljs-number">4</span>, <span class="hljs-string">"\1"</span>, <span class="hljs-number">1</span>)                        = <span class="hljs-number">1</span>
read(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3\0\27"</span>, <span class="hljs-number">5</span>)              = <span class="hljs-number">5</span>
read(<span class="hljs-number">4</span>, <span class="hljs-string">"0{}\22t9\264\265\340j\362\30\342\360\234\205\1\370\33\246\1z'"</span>, <span class="hljs-number">23</span>) = <span class="hljs-number">23</span>
read(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3\17\335"</span>, <span class="hljs-number">5</span>)            = <span class="hljs-number">5</span>
read(<span class="hljs-number">4</span>, <span class="hljs-string">"\17\5\310\261\355\271\227oUaI\366\361]\3\275q)\5{\367z\20\233\345\352k?\371\272\23\237"</span>..., <span class="hljs-number">4061</span>) = <span class="hljs-number">4061</span>
stat(<span class="hljs-string">"/etc/pki/tls/certs/8d33f237.0"</span>, <span class="hljs-number">0x7ffd20be3620</span>) = <span class="hljs-number">-1</span> ENOENT (No such file <span class="hljs-keyword">or</span> directory)
read(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3\1\31"</span>, <span class="hljs-number">5</span>)              = <span class="hljs-number">5</span>
read(<span class="hljs-number">4</span>, <span class="hljs-string">"t\27\337\366G6\226Qs\273\327\314,\205\221\222Xu\233\21%\0s\340\270\224\330\t\2774\222h"</span>..., <span class="hljs-number">281</span>) = <span class="hljs-number">281</span>
read(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3\0005"</span>, <span class="hljs-number">5</span>)              = <span class="hljs-number">5</span>
read(<span class="hljs-number">4</span>, <span class="hljs-string">"\204{\314\232\311\0P-*$\245\315\271\236c\210N\315\5\371\364\23\235\16\0350N0K\246\336\374"</span>..., <span class="hljs-number">53</span>) = <span class="hljs-number">53</span>
write(<span class="hljs-number">4</span>, <span class="hljs-string">"\24\3\3\0\1\1\27\3\3\0005\361\311\347\t\254m#\273\204\350\16\343\34P\320sS\211\30\232&lt;"</span>..., <span class="hljs-number">64</span>) = <span class="hljs-number">64</span>
ioctl(<span class="hljs-number">4</span>, FIONBIO, [<span class="hljs-number">0</span>])                  = <span class="hljs-number">0</span>
write(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3\1\251\271\2673-\30\313\253\363\320H0\224\370Q\353(#?,\216\3\341\315|J\353\303"</span>..., <span class="hljs-number">430</span>) = <span class="hljs-number">430</span>
write(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3@\21\20\221\240\331\2737\10\244pv\312B\n\rn\272\33\336T\216\f\303\374k\177c\25"</span>..., <span class="hljs-number">16406</span>) = <span class="hljs-number">16406</span>
write(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3@\21\214\30\262\240s\216\240\354e\31\304Q\337Oy\21y\373\241g\311\224)\26\320\10{"</span>..., <span class="hljs-number">16406</span>) = <span class="hljs-number">16406</span>
write(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3@\21\36\323\240\376\276\224\35\f\10!@\36D\347\33ay\2617Hpv\4d\267y7"</span>..., <span class="hljs-number">16406</span>) = <span class="hljs-number">16406</span>
write(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3@\21\366x\264\242O2\7?\7\334\221W\24\2\f)\"@\20\375~\354\243W\32\0c"</span>..., <span class="hljs-number">16406</span>) = <span class="hljs-number">16406</span>
write(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3@\21\354\32W\36\265g\304\314\376\205\315\20\22\10c\333\342\264\330\366SS\4\217\356:V"</span>..., <span class="hljs-number">16406</span>) = <span class="hljs-number">16406</span>
write(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3@\21\1\274\35\335\271n\235e\202\202\207\221~\313\0y\210\344\312\32r\347\306x]\241C"</span>..., <span class="hljs-number">16406</span>) = <span class="hljs-number">16406</span>
write(<span class="hljs-number">4</span>, <span class="hljs-string">"\27\3\3@\21I\315\202\274\342\274\26\335qx\22-\226\322\320\203\231\274wLB\250\252\2\352\367\""</span>..., <span class="hljs-number">16406</span>) = <span class="hljs-number">8716</span>
write(<span class="hljs-number">4</span>, <span class="hljs-string">"\377\4m\341\317\376SUr\rQ\221\207\22#\262\314B7\33_v\310\271\fl\v\242\fK\v?"</span>..., <span class="hljs-number">7690</span>) = <span class="hljs-number">-1</span> EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=<span class="hljs-number">45070</span>, si_uid=<span class="hljs-number">1000</span>} ---
close(<span class="hljs-number">4</span>)                                = <span class="hljs-number">0</span>
close(<span class="hljs-number">3</span>)                                = <span class="hljs-number">0</span>
write(<span class="hljs-number">2</span>, <span class="hljs-string">"\33[0;31masciinema: upload failed:"</span>..., <span class="hljs-number">76</span>asciinema: upload failed: &lt;urlopen error [Errno <span class="hljs-number">32</span>] Broken pipe&gt;
) = <span class="hljs-number">76</span>
write(<span class="hljs-number">2</span>, <span class="hljs-string">"\33[0;31masciinema: retry later by"</span>..., <span class="hljs-number">79</span>asciinema: retry later by running: asciinema upload demo-ascii.cast
) = <span class="hljs-number">79</span>
munmap(<span class="hljs-number">0x7fa1aa089000</span>, <span class="hljs-number">12447744</span>)        = <span class="hljs-number">0</span>
rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=<span class="hljs-number">0x7fa1bad0fa70</span>}, {sa_handler=<span class="hljs-number">0x7fa1baf551d0</span>, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=<span class="hljs-number">0x7fa1bad0fa70</span>}, <span class="hljs-number">8</span>) = <span class="hljs-number">0</span>
munmap(<span class="hljs-number">0x7fa1ac649000</span>, <span class="hljs-number">593920</span>)          = <span class="hljs-number">0</span>
exit_group(<span class="hljs-number">1</span>)                           = ?
+++ exited <span class="hljs-keyword">with</span> <span class="hljs-number">1</span> +++
</code></pre>
<p>Look at this socket call (<code>man 2 getpeername</code>):</p>
<pre><code class="lang-shell">getpeername(4, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("109.107.38.233")}, [16]) = 0
</code></pre>
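<p>Further down, you can see that we are actually writing data to the website when the connection breaks. The last successful write is cut short (8716 of 16406 bytes) and the next one fails with <code>EPIPE</code>:</p>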
<pre><code class="lang-shell">write(4, "\27\3\3@\21\366x\264\242O2\7?\7\334\221W\24\2\f)\"@\20\375~\354\243W\32\0c"..., 16406) = 16406
write(4, "\27\3\3@\21\354\32W\36\265g\304\314\376\205\315\20\22\10c\333\342\264\330\366SS\4\217\356:V"..., 16406) = 16406
write(4, "\27\3\3@\21\1\274\35\335\271n\235e\202\202\207\221~\313\0y\210\344\312\32r\347\306x]\241C"..., 16406) = 16406
write(4, "\27\3\3@\21I\315\202\274\342\274\26\335qx\22-\226\322\320\203\231\274wLB\250\252\2\352\367\""..., 16406) = 8716
write(4, "\377\4m\341\317\376SUr\rQ\221\207\22#\262\314B7\33_v\310\271\fl\v\242\fK\v?"..., 7690) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=45070, si_uid=1000} ---
</code></pre>
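<p>To see the same failure in isolation, here is a minimal sketch (not asciinema's code) that writes to a socket after the other end has hung up. It assumes a Unix-like system. <code>write_after_peer_close</code> is a hypothetical helper name, and <code>socket.socketpair()</code> stands in for the real TLS connection to the server:</p>

```python
import errno
import socket


def write_after_peer_close():
    """Write to a socket whose other end is already closed; return the errno."""
    # socketpair() returns two connected sockets, so no network is needed
    left, right = socket.socketpair()
    right.close()  # simulate the remote side hanging up
    try:
        left.sendall(b"x" * 1024)
    except BrokenPipeError as exc:
        return exc.errno
    finally:
        left.close()
    return None


if __name__ == "__main__":
    code = write_after_peer_close()
    # Print the numeric code and its symbolic name
    print(code, errno.errorcode.get(code))
```

<p>Python ignores <code>SIGPIPE</code> by default, so the failed write surfaces as a <code>BrokenPipeError</code> instead of killing the process. That is the <code>[Errno 32]</code> that appears in the asciinema error message.</p>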
<p>So who is <code>109.107.38.233</code>?</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ nslookup <span class="hljs-number">109.107</span><span class="hljs-number">.38</span><span class="hljs-number">.233</span>
<span class="hljs-number">233.38</span><span class="hljs-number">.107</span><span class="hljs-number">.109</span>.<span class="hljs-keyword">in</span>-addr.arpa    name = cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-38</span><span class="hljs-number">-233.</span>gb1.brightbox.com.
</code></pre>
<p>You can see on the <a target="_blank" href="https://asciinema.org/about">about</a> webpage that brightbox.com provides the hosting for asciinema.</p>
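<p>If you would rather stay in Python for this check, the standard library can do the same reverse lookup. This is a small sketch, and <code>ptr_name</code> and <code>reverse_lookup</code> are hypothetical helper names. The first only builds the PTR record name that <code>nslookup</code> printed, the second asks the resolver and therefore needs network access:</p>

```python
import ipaddress
import socket


def ptr_name(ip):
    """Build the reverse-DNS (PTR) record name for an IP address."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip):
    """Ask the resolver for the host name behind an IP; None if there is no answer."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None
    return hostname
```

<p>Calling <code>ptr_name("109.107.38.233")</code> gives the same <code>in-addr.arpa</code> name shown in the <code>nslookup</code> output. The answer from <code>reverse_lookup</code> depends on DNS at the time you run it, so it may differ from what I got.</p>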
<p>So what is wrong? It is not that the site is down or unreachable. Can we dig further?</p>
<h2 id="heading-if-i-only-had-the-source-code-deep-diving-with-the-python-debugger"><strong>If I only had the source code – deep diving with the Python debugger</strong></h2>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ file /usr/bin/asciinema
/usr/bin/asciinema: Python script, ASCII text executable
</code></pre>
<p>Oh, yes we do! Ever curious, you open the asciinema script:</p>
<pre><code class="lang-python"><span class="hljs-comment">#!/usr/bin/python3</span>
<span class="hljs-comment"># EASY-INSTALL-ENTRY-SCRIPT: 'asciinema==2.0.2','console_scripts','asciinema'</span>
<span class="hljs-keyword">import</span> re
<span class="hljs-keyword">import</span> sys 

<span class="hljs-comment"># for compatibility with easy_install; see #2198</span>
__requires__ = <span class="hljs-string">'asciinema==2.0.2'</span>

<span class="hljs-keyword">try</span>:
    <span class="hljs-keyword">from</span> importlib.metadata <span class="hljs-keyword">import</span> distribution
<span class="hljs-keyword">except</span> ImportError:
    <span class="hljs-keyword">try</span>:
        <span class="hljs-keyword">from</span> importlib_metadata <span class="hljs-keyword">import</span> distribution
    <span class="hljs-keyword">except</span> ImportError:
        <span class="hljs-keyword">from</span> pkg_resources <span class="hljs-keyword">import</span> load_entry_point


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">importlib_load_entry_point</span>(<span class="hljs-params">spec, group, name</span>):</span>
    dist_name, _, _ = spec.partition(<span class="hljs-string">'=='</span>)
    matches = ( 
        entry_point
        <span class="hljs-keyword">for</span> entry_point <span class="hljs-keyword">in</span> distribution(dist_name).entry_points
        <span class="hljs-keyword">if</span> entry_point.group == group <span class="hljs-keyword">and</span> entry_point.name == name
    )   
    <span class="hljs-keyword">return</span> next(matches).load()


globals().setdefault(<span class="hljs-string">'load_entry_point'</span>, importlib_load_entry_point)


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">'__main__'</span>:
    sys.argv[<span class="hljs-number">0</span>] = re.sub(<span class="hljs-string">r'(-script\.pyw?|\.exe)?$'</span>, <span class="hljs-string">''</span>, sys.argv[<span class="hljs-number">0</span>])
    sys.exit(load_entry_point(<span class="hljs-string">'asciinema==2.0.2'</span>, <span class="hljs-string">'console_scripts'</span>, <span class="hljs-string">'asciinema'</span>)())
</code></pre>
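<p>The script above does little more than look up a <code>console_scripts</code> entry point and call it. Here is a minimal sketch of the same lookup using only <code>importlib.metadata</code> (Python 3.8 or newer). <code>find_console_script</code> is a hypothetical helper, not part of asciinema or setuptools:</p>

```python
from importlib.metadata import PackageNotFoundError, distribution


def find_console_script(dist_name, script_name):
    """Return the 'module:function' target of a console script, or None."""
    try:
        dist = distribution(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed for this interpreter
        return None
    for entry_point in dist.entry_points:
        if entry_point.group == "console_scripts" and entry_point.name == script_name:
            return entry_point.value
    return None
```

<p>On a machine where asciinema is installed, <code>find_console_script("asciinema", "asciinema")</code> should return whatever target the package declares, which tells you which module holds the real code.</p>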
<p>The main script was generated by <a target="_blank" href="https://setuptools.pypa.io/en/latest/deprecated/easy_install.html">easy_install</a>, which means <code>/usr/bin/asciinema</code> is just a wrapper around the interesting code. To find out where the interesting stuff is, let's run the script through the Python <a target="_blank" href="https://www.redhat.com/sysadmin/python-debugger-pdb">pdb debugger</a>:</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ python3 -m pdb /usr/bin/asciinema upload demo-ascii.cast 
&gt; /usr/bin/asciinema(<span class="hljs-number">3</span>)&lt;module&gt;()
-&gt; <span class="hljs-keyword">import</span> re
(Pdb) n
&gt; /usr/bin/asciinema(<span class="hljs-number">4</span>)&lt;module&gt;()
-&gt; <span class="hljs-keyword">import</span> sys
(Pdb) c
asciinema: upload failed: &lt;urlopen error [Errno <span class="hljs-number">32</span>] Broken pipe&gt;
asciinema: retry later by running: asciinema upload demo-ascii.cast
The program exited via sys.exit(). Exit status: <span class="hljs-number">1</span>
</code></pre>
<p>Not quite what we need. The program runs to completion, prints the error, and exits. The debugger would then restart it from the beginning without ever having stopped where the failure happened.</p>
<p>Let's cheat a little. Was asciinema installed with an RPM (I use Fedora Linux)?</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ rpm -qif /usr/bin/asciinema
Name        : asciinema
Version     : <span class="hljs-number">2.0</span><span class="hljs-number">.2</span>
Release     : <span class="hljs-number">6.</span>fc33
</code></pre>
<p>We are trying to upload a file. Does the package ship anything that looks like an uploader?</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ rpm -qil asciinema|grep -i uploa
/usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/commands/__pycache__/upload.cpython<span class="hljs-number">-39.</span>opt<span class="hljs-number">-1.</span>pyc
/usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/commands/__pycache__/upload.cpython<span class="hljs-number">-39.</span>pyc
/usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/commands/upload.py
</code></pre>
<p>Ah, getting interesting! Let's open <code>upload.py</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> asciinema.commands.command <span class="hljs-keyword">import</span> Command
<span class="hljs-keyword">from</span> asciinema.api <span class="hljs-keyword">import</span> APIError


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">UploadCommand</span>(<span class="hljs-params">Command</span>):</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, api, filename</span>):</span>
        Command.__init__(self)
        self.api = api 
        self.filename = filename

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">execute</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">try</span>:
            result, warn = self.api.upload_asciicast(self.filename)

            <span class="hljs-keyword">if</span> warn:
                self.print_warning(warn)

            self.print(result.get(<span class="hljs-string">'message'</span>) <span class="hljs-keyword">or</span> result[<span class="hljs-string">'url'</span>])

        <span class="hljs-keyword">except</span> OSError <span class="hljs-keyword">as</span> e:
            self.print_error(<span class="hljs-string">"upload failed: %s"</span> % str(e))
            <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>

        <span class="hljs-keyword">except</span> APIError <span class="hljs-keyword">as</span> e:
            self.print_error(<span class="hljs-string">"upload failed: %s"</span> % str(e))
            self.print_error(<span class="hljs-string">"retry later by running: asciinema upload %s"</span> % self.filename)
            <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>

        <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>
</code></pre>
<p>Let's put a couple of breakpoints inside <code>UploadCommand</code> (lines 14 and 26 in my copy of the code):</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ python3 -m pdb /usr/bin/asciinema upload demo-ascii.cast 
&gt; /usr/bin/asciinema(<span class="hljs-number">3</span>)&lt;module&gt;()
-&gt; <span class="hljs-keyword">import</span> re
(Pdb) b /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/commands/upload.py:<span class="hljs-number">14</span>
Breakpoint <span class="hljs-number">1</span> at /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/commands/upload.py:<span class="hljs-number">14</span>
(Pdb) c
&gt; /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/commands/upload.py(<span class="hljs-number">14</span>)execute()
-&gt; result, warn = self.api.upload_asciicast(self.filename)
(Pdb) c
&gt; /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/commands/upload.py(<span class="hljs-number">26</span>)execute()
-&gt; self.print_error(<span class="hljs-string">"upload failed: %s"</span> % str(e))
(Pdb) n
asciinema: upload failed: &lt;urlopen error [Errno <span class="hljs-number">32</span>] Broken pipe&gt;
&gt; /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/commands/upload.py(<span class="hljs-number">27</span>)execute()
-&gt; self.print_error(<span class="hljs-string">"retry later by running: asciinema upload %s"</span> % self.filename)
(Pdb) ll
 <span class="hljs-number">12</span>          <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">execute</span>(<span class="hljs-params">self</span>):</span>
 <span class="hljs-number">13</span>              <span class="hljs-keyword">try</span>:
 <span class="hljs-number">14</span> B                result, warn = self.api.upload_asciicast(self.filename)
 <span class="hljs-number">15</span>      
 <span class="hljs-number">16</span>                  <span class="hljs-keyword">if</span> warn:
 <span class="hljs-number">17</span>                      self.print_warning(warn)
 <span class="hljs-number">18</span>      
 <span class="hljs-number">19</span>                  self.print(result.get(<span class="hljs-string">'message'</span>) <span class="hljs-keyword">or</span> result[<span class="hljs-string">'url'</span>])
 <span class="hljs-number">20</span>      
 <span class="hljs-number">21</span>              <span class="hljs-keyword">except</span> OSError <span class="hljs-keyword">as</span> e:
 <span class="hljs-number">22</span>                  self.print_error(<span class="hljs-string">"upload failed: %s"</span> % str(e))
 <span class="hljs-number">23</span>                  <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>
 <span class="hljs-number">24</span>      
 <span class="hljs-number">25</span>              <span class="hljs-keyword">except</span> APIError <span class="hljs-keyword">as</span> e:
 <span class="hljs-number">26</span> B                self.print_error(<span class="hljs-string">"upload failed: %s"</span> % str(e))
 <span class="hljs-number">27</span>  -&gt;                self.print_error(<span class="hljs-string">"retry later by running: asciinema upload %s"</span> % self.filename)
 <span class="hljs-number">28</span>                  <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>
 <span class="hljs-number">29</span>      
 <span class="hljs-number">30</span>              <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>
</code></pre>
<p>We got an APIError. Anything interesting with that type of exception?</p>
<pre><code class="lang-shell">(Pdb) source APIError
11      class APIError(Exception):
12          pass
(Pdb) e.args
('&lt;urlopen error [Errno 32] Broken pipe&gt;',)
</code></pre>
<p>So nothing special, derived directly from <code>Exception</code>. Also, the arguments to the exception are just the error message.</p>
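<p>Why does the exception carry only text? The sketch below assumes (the session above does not confirm it) that the original <code>URLError</code> was converted with <code>str()</code> before being re-raised. The <code>APIError</code> class here is a stand-in with the same shape as the one pdb printed, not the real asciinema class:</p>

```python
import errno
from urllib.error import URLError


class APIError(Exception):
    """Stand-in with the same shape as the class pdb printed above."""


# What urllib raises when the socket write fails with EPIPE
original = URLError(BrokenPipeError(errno.EPIPE, "Broken pipe"))

# Assumption: the library re-raises it as APIError(str(original))
wrapped = APIError(str(original))

print(wrapped.args)                   # a one-element tuple holding only the message
print(isinstance(original, OSError))  # True: URLError derives from OSError
print(isinstance(wrapped, OSError))   # False: the OSError handler no longer applies
print(hasattr(wrapped, "errno"))      # False: the numeric error code is gone
```

<p>That would explain why the <code>except APIError</code> branch fired instead of the <code>except OSError</code> one, and why <code>e.args</code> holds nothing but a string.</p>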
<p>The next step is to search the Internet <a target="_blank" href="https://duckduckgo.com/?q=%3Curlopen+error+%5BErrno+32%5D+Broken+pipe%3E&amp;ia=web">for the error</a> and see whether it comes from a known library.</p>
<p>I found this issue on <a target="_blank" href="https://github.com/asciinema/asciinema/issues/335">GitHub</a>. Reading <a target="_blank" href="https://github.com/asciinema/asciinema/issues/91">further</a>, you can see that the generous author of asciinema is paying for the storage out of his own pocket so that all of us can enjoy the online storage for free:</p>
<blockquote>
<p>The max size was set 2MB which appears to be too low. I have upped it to 5MB. This isn't much, but I'm paying for the storage (S3) from my own pocket, so I can't offer GBs of storage for every user. Let me know if that works for you. I'm fine with increasing it even more, but now I want to figure out the good middle ground between user needs and hosting costs.</p>
</blockquote>
<p>So let's confirm that this is indeed the cause:</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ ls -lh demo-ascii.cast 
-rw-rw-r-- <span class="hljs-number">1</span> josevnz josevnz <span class="hljs-number">12</span>M Apr <span class="hljs-number">21</span> <span class="hljs-number">15</span>:<span class="hljs-number">44</span> demo-ascii.cast
</code></pre>
<p>At 12M the recording is more than twice the 5MB limit mentioned in that comment, so the size of the file seems to be the culprit.</p>
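<p>A check like this is easy to script before attempting an upload. The sketch below is only an illustration: <code>exceeds_limit</code> and <code>MAX_UPLOAD_BYTES</code> are hypothetical names, and the 5MB value is taken from the quoted comment, so the current server limit may be different:</p>

```python
import os

# Assumption: the 5MB figure from the quoted comment still applies
MAX_UPLOAD_BYTES = 5 * 1024 * 1024


def exceeds_limit(path, limit=MAX_UPLOAD_BYTES):
    """Return True when the file at path is larger than limit bytes."""
    return os.path.getsize(path) > limit
```

<p>For the recording above, <code>exceeds_limit("demo-ascii.cast")</code> would return <code>True</code>.</p>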
<p>I'm still running the debugger, and I would love to see which asciinema modules were loaded. To do that, switch to <em>interact</em> mode and build the listing with a <a target="_blank" href="https://peps.python.org/pep-0202/">list comprehension</a> and a <a target="_blank" href="https://docs.python.org/3/howto/regex.html">regular expression</a>:</p>
<pre><code class="lang-python">(Pdb) interact
*interactive*
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> re
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> sys
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pprint
<span class="hljs-meta">&gt;&gt;&gt; </span>pprint.pprint([name <span class="hljs-keyword">for</span> name <span class="hljs-keyword">in</span> sys.modules.keys() <span class="hljs-keyword">if</span> re.search(<span class="hljs-string">'asciinema'</span>, name)], indent=<span class="hljs-literal">True</span>)
[<span class="hljs-string">'asciinema.asciicast.events'</span>,
 <span class="hljs-string">'asciinema.asciicast.v1'</span>,
 <span class="hljs-string">'asciinema.asciicast.v2'</span>,
 <span class="hljs-string">'asciinema.asciicast'</span>,
 <span class="hljs-string">'asciinema.term'</span>,
 <span class="hljs-string">'asciinema.pty'</span>,
 <span class="hljs-string">'asciinema'</span>,
 <span class="hljs-string">'asciinema.config'</span>,
 <span class="hljs-string">'asciinema.commands'</span>,
 <span class="hljs-string">'asciinema.commands.command'</span>,
 <span class="hljs-string">'asciinema.commands.auth'</span>,
 <span class="hljs-string">'asciinema.asciicast.raw'</span>,
 <span class="hljs-string">'asciinema.http_adapter'</span>,
 <span class="hljs-string">'asciinema.urllib_http_adapter'</span>,
 <span class="hljs-string">'asciinema.api'</span>,
 <span class="hljs-string">'asciinema.commands.record'</span>,
 <span class="hljs-string">'asciinema.player'</span>,
 <span class="hljs-string">'asciinema.commands.play'</span>,
 <span class="hljs-string">'asciinema.commands.cat'</span>,
 <span class="hljs-string">'asciinema.commands.upload'</span>,
 <span class="hljs-string">'asciinema.__main__'</span>]
&gt;&gt;&gt;
</code></pre>
<p>The following look like they could hold some clues:</p>
<ul>
<li><p>asciinema.urllib_http_adapter</p>
</li>
<li><p>asciinema.commands.upload</p>
</li>
<li><p>asciinema.http_adapter</p>
</li>
</ul>
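<p>The module listing above can also be produced without a regular expression. This is a small sketch with a hypothetical helper, <code>loaded_modules</code>. It matches on the package prefix, which is a little stricter than a substring search because it skips modules that merely contain the word somewhere in their name:</p>

```python
import sys


def loaded_modules(package):
    """Return the sorted names of already-imported modules under package."""
    prefix = package + "."
    return sorted(
        name for name in list(sys.modules)
        if name == package or name.startswith(prefix)
    )
```

<p>Inside the debugger's interactive mode you could call <code>loaded_modules("asciinema")</code> to get the same kind of listing.</p>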
<p>Exit the debugger (or go to another terminal) and search for the urllib_http_adapter:</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ find /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/ -name <span class="hljs-string">'urllib_http_adapter*'</span>
/usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/__pycache__/urllib_http_adapter.cpython<span class="hljs-number">-39.</span>opt<span class="hljs-number">-1.</span>pyc
/usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/__pycache__/urllib_http_adapter.cpython<span class="hljs-number">-39.</span>pyc
/usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/urllib_http_adapter.py
</code></pre>
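<p>Python can also answer the "where does this module live" question directly. The sketch below uses <code>importlib.util.find_spec</code>. <code>module_source_path</code> is a hypothetical helper name, and note that looking up a dotted name imports the parent package as a side effect:</p>

```python
import importlib.util


def module_source_path(module_name):
    """Return the file a module would be loaded from, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:  # raised when a parent package is missing
        return None
    return spec.origin if spec is not None else None
```

<p>With asciinema installed, <code>module_source_path("asciinema.urllib_http_adapter")</code> should point at the same <code>.py</code> file that <code>find</code> located.</p>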
<p>If you open the file, you will see that the <code>post</code> method is the one we want to troubleshoot:</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">URLLibHttpAdapter</span>:</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">post</span>(<span class="hljs-params">self, url, fields={}, files={}, headers={}, username=None, password=None</span>):</span>
        content_type, body = MultipartFormdataEncoder().encode(fields, files)

        headers = headers.copy()
        headers[<span class="hljs-string">"Content-Type"</span>] = content_type

        <span class="hljs-keyword">if</span> password:
            auth = <span class="hljs-string">"%s:%s"</span> % (username, password)
            encoded_auth = base64.encodebytes(auth.encode(<span class="hljs-string">'utf-8'</span>))[:<span class="hljs-number">-1</span>]
            headers[<span class="hljs-string">"Authorization"</span>] = <span class="hljs-string">b"Basic "</span> + encoded_auth

        request = Request(url, data=body, headers=headers, method=<span class="hljs-string">"POST"</span>)

        <span class="hljs-keyword">try</span>:
            response = urlopen(request)
            status = response.status
            headers = self._parse_headers(response)
            body = response.read().decode(<span class="hljs-string">'utf-8'</span>)
        <span class="hljs-keyword">except</span> HTTPError <span class="hljs-keyword">as</span> e:
            status = e.code
            headers = {}
            body = e.read().decode(<span class="hljs-string">'utf-8'</span>)
        <span class="hljs-keyword">except</span> (http.client.RemoteDisconnected, URLError) <span class="hljs-keyword">as</span> e:
            <span class="hljs-keyword">raise</span> HTTPConnectionError(str(e))

        <span class="hljs-keyword">return</span> (status, headers, body)
</code></pre>
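<p>The order of the <code>except</code> clauses in <code>post</code> matters, because <code>HTTPError</code> is a subclass of <code>URLError</code>. The sketch below mirrors that chain to show where different failures end up. <code>which_handler</code> is a hypothetical helper, and the 413 status code is only an example value, not something observed in this session:</p>

```python
import errno
from http.client import RemoteDisconnected
from urllib.error import HTTPError, URLError


def which_handler(exc):
    """Name the except clause (same order as in post) that would catch exc."""
    try:
        raise exc
    except HTTPError:
        return "HTTPError"
    except (RemoteDisconnected, URLError):
        return "connection error"
    except Exception:
        return "unhandled"


broken_pipe = BrokenPipeError(errno.EPIPE, "Broken pipe")

# urlopen wraps low-level socket errors in URLError
print(which_handler(URLError(broken_pipe)))  # connection error

# The server answered, but with an error status (hypothetical 413)
url = "https://asciinema.org/api/asciicasts"
print(which_handler(HTTPError(url, 413, "Too Large", None, None)))  # HTTPError

# A bare socket error that was never wrapped matches neither clause
print(which_handler(broken_pipe))  # unhandled
```

<p>In our case the server closed the connection before sending any HTTP response, so there is no status code to inspect and the failure takes the connection error path.</p>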
<p>A breakpoint at line 65 will get us where we need to be:</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ python3 -m pdb /usr/bin/asciinema upload demo-ascii.cast 
&gt; /usr/bin/asciinema(<span class="hljs-number">3</span>)&lt;module&gt;()
-&gt; <span class="hljs-keyword">import</span> re
(Pdb) b /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/urllib_http_adapter.py:<span class="hljs-number">65</span>
Breakpoint <span class="hljs-number">1</span> at /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/urllib_http_adapter.py:<span class="hljs-number">65</span>
(Pdb) c
&gt; /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/urllib_http_adapter.py(<span class="hljs-number">65</span>)post()
-&gt; headers = headers.copy()
(Pdb) ll
 <span class="hljs-number">62</span>          <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">post</span>(<span class="hljs-params">self, url, fields={}, files={}, headers={}, username=None, password=None</span>):</span>
 <span class="hljs-number">63</span>              content_type, body = MultipartFormdataEncoder().encode(fields, files)
 <span class="hljs-number">64</span>      
 <span class="hljs-number">65</span> B-&gt;            headers = headers.copy()
 <span class="hljs-number">66</span>              headers[<span class="hljs-string">"Content-Type"</span>] = content_type
 <span class="hljs-number">67</span>      
 <span class="hljs-number">68</span>              <span class="hljs-keyword">if</span> password:
 <span class="hljs-number">69</span>                  auth = <span class="hljs-string">"%s:%s"</span> % (username, password)
 <span class="hljs-number">70</span>                  encoded_auth = base64.encodebytes(auth.encode(<span class="hljs-string">'utf-8'</span>))[:<span class="hljs-number">-1</span>]
 <span class="hljs-number">71</span>                  headers[<span class="hljs-string">"Authorization"</span>] = <span class="hljs-string">b"Basic "</span> + encoded_auth
 <span class="hljs-number">72</span>      
 <span class="hljs-number">73</span>              request = Request(url, data=body, headers=headers, method=<span class="hljs-string">"POST"</span>)
 <span class="hljs-number">74</span>      
 <span class="hljs-number">75</span>              <span class="hljs-keyword">try</span>:
 <span class="hljs-number">76</span>                  response = urlopen(request)
 <span class="hljs-number">77</span>                  status = response.status
 <span class="hljs-number">78</span>                  headers = self._parse_headers(response)
 <span class="hljs-number">79</span>                  body = response.read().decode(<span class="hljs-string">'utf-8'</span>)
 <span class="hljs-number">80</span>              <span class="hljs-keyword">except</span> HTTPError <span class="hljs-keyword">as</span> e:
 <span class="hljs-number">81</span>                  status = e.code
 <span class="hljs-number">82</span>                  headers = {}
 <span class="hljs-number">83</span>                  body = e.read().decode(<span class="hljs-string">'utf-8'</span>)
 <span class="hljs-number">84</span>              <span class="hljs-keyword">except</span> (http.client.RemoteDisconnected, URLError) <span class="hljs-keyword">as</span> e:
 <span class="hljs-number">85</span>                  <span class="hljs-keyword">raise</span> HTTPConnectionError(str(e))
 <span class="hljs-number">86</span>      
 <span class="hljs-number">87</span>              <span class="hljs-keyword">return</span> (status, headers, body)
(Pdb) args
self = &lt;asciinema.urllib_http_adapter.URLLibHttpAdapter object at <span class="hljs-number">0x7f59ed3e4640</span>&gt;
url = <span class="hljs-string">'https://asciinema.org/api/asciicasts'</span>
fields = {}
files = {<span class="hljs-string">'asciicast'</span>: (<span class="hljs-string">'ascii.cast'</span>, &lt;_io.BufferedReader name=<span class="hljs-string">'demo-ascii.cast'</span>&gt;)}
headers = {<span class="hljs-string">'User-Agent'</span>: <span class="hljs-string">'asciinema/2.0.2 CPython/3.9.9 Linux/5.14.18-100.fc33.x86_64-x86_64-with-glibc2.32'</span>, <span class="hljs-string">'Accept'</span>: <span class="hljs-string">'application/json'</span>}
username = <span class="hljs-string">'XXXX'</span>
password = <span class="hljs-string">'XXXX0f1-1d73-43fc-XX36-c9d7ZZZAAAA'</span>
</code></pre>
<p>Very interesting. We can definitely use the following fields (obtained using <code>args</code> from the debugger) to exercise the upload functionality without Python:</p>
<ul>
<li><p>url = '<a target="_blank" href="https://asciinema.org/api/asciicasts">https://asciinema.org/api/asciicasts</a>'</p>
</li>
<li><p>headers = {'User-Agent': 'asciinema/2.0.2 CPython/3.9.9 Linux/5.14.18-100.fc33.x86_64-x86_64-with-glibc2.32', 'Accept': 'application/json'}</p>
</li>
<li><p>username = 'XXXX'</p>
</li>
<li><p>password = 'XXXX0f1-1d73-43fc-XX36-c9d7ZZZAAAA'</p>
</li>
</ul>
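<p>The captured <code>username</code> and <code>password</code> end up in the <code>Authorization</code> header (lines 68 to 71 of the listing above). Here is a minimal sketch of that step. It uses <code>base64.b64encode</code> instead of the <code>encodebytes(...)[:-1]</code> idiom from the listing, and the function name and sample credentials are made up for illustration:</p>
<pre><code class="lang-python">import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    auth = "%s:%s" % (username, password)
    # b64encode does not append a trailing newline, so no slicing is needed
    return b"Basic " + base64.b64encode(auth.encode("utf-8"))


# Made-up credentials, not the ones captured in the debugger
print(basic_auth_header("user", "pass"))
</code></pre>
<p>Any other HTTP client has to send the same header value to authenticate against the upload URL.</p>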
<p>Which exception will we get? We set two more breakpoints, one in each <code>except</code> branch, and let the debugger run until it reaches one of them:</p>
<pre><code class="lang-python">(Pdb) b <span class="hljs-number">81</span>
Breakpoint <span class="hljs-number">2</span> at /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/urllib_http_adapter.py:<span class="hljs-number">81</span>
(Pdb) b <span class="hljs-number">85</span>
Breakpoint <span class="hljs-number">3</span> at /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/urllib_http_adapter.py:<span class="hljs-number">85</span>
(Pdb) c
&gt; /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/urllib_http_adapter.py(<span class="hljs-number">85</span>)post()
-&gt; <span class="hljs-keyword">raise</span> HTTPConnectionError(str(e))
(Pdb) e
URLError(BrokenPipeError(<span class="hljs-number">32</span>, <span class="hljs-string">'Broken pipe'</span>))
</code></pre>
<p>The type of the error is <a target="_blank" href="https://docs.python.org/3/library/exceptions.html#BrokenPipeError">BrokenPipeError</a>:</p>
<blockquote>
<p>A subclass of ConnectionError, raised when trying to write on a pipe while the other end has been closed, or trying to write on a socket which has been shutdown for writing. Corresponds to errno EPIPE and ESHUTDOWN.</p>
</blockquote>
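<p>The debugger showed the <code>BrokenPipeError</code> wrapped inside a <code>URLError</code>. The sketch below shows one way to unwrap it and get the symbolic errno name. The helper name is hypothetical and the exception is built by hand, not taken from a real upload:</p>
<pre><code class="lang-python">import errno
from urllib.error import URLError


def underlying_errno_name(error):
    """Return the symbolic errno name wrapped inside a URLError, if any."""
    reason = error.reason
    if isinstance(reason, OSError) and reason.errno is not None:
        return errno.errorcode.get(reason.errno, str(reason.errno))
    return None


# Hand-built equivalent of the exception seen in the debugger
wrapped = URLError(BrokenPipeError(errno.EPIPE, "Broken pipe"))
print(isinstance(wrapped.reason, ConnectionError))
print(underlying_errno_name(wrapped))
</code></pre>
<p>This confirms that the error comes from the socket layer. It does not tell us why the server closed the connection.</p>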
<p>One last thing – do we read the whole file in memory before sending it to the website?</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ python3 -m pdb /usr/bin/asciinema upload demo-ascii.cast 
&gt; /usr/bin/asciinema(<span class="hljs-number">3</span>)&lt;module&gt;()
-&gt; <span class="hljs-keyword">import</span> re
(Pdb) b /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/urllib_http_adapter.py:<span class="hljs-number">49</span>
Breakpoint <span class="hljs-number">1</span> at /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/urllib_http_adapter.py:<span class="hljs-number">49</span>
(Pdb) c
&gt; /usr/lib/python3<span class="hljs-number">.9</span>/site-packages/asciinema/urllib_http_adapter.py(<span class="hljs-number">49</span>)iter()
-&gt; <span class="hljs-keyword">yield</span> (data, len(data))
(Pdb) len(data)
<span class="hljs-number">12444283</span>
</code></pre>
<p>That is about 12 MB, yielded in a single piece: not huge for today's computer memory, but also not small.</p>
<p>Do you remember the parameters we managed to capture before with the help of the debugger (URL, user, and so on)? We now know enough to use a different tool (<a target="_blank" href="https://curl.se/">curl</a>) to try to upload the file:</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ curl --fail --http1<span class="hljs-number">.1</span> --verbose --user $USER:$(cat ~/.config/asciinema/install-id) https://asciinema.org/api/asciicasts --form asciicast=@demo-ascii.cast
*   Trying <span class="hljs-number">109.107</span><span class="hljs-number">.37</span><span class="hljs-number">.0</span>:<span class="hljs-number">443.</span>..
* Connected to asciinema.org (<span class="hljs-number">109.107</span><span class="hljs-number">.37</span><span class="hljs-number">.0</span>) port <span class="hljs-number">443</span> (<span class="hljs-comment">#0)</span>
* ALPN, offering http/<span class="hljs-number">1.1</span>
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1<span class="hljs-number">.3</span> (OUT), TLS handshake, Client hello (<span class="hljs-number">1</span>):
* TLSv1<span class="hljs-number">.3</span> (IN), TLS handshake, Server hello (<span class="hljs-number">2</span>):
* TLSv1<span class="hljs-number">.3</span> (IN), TLS handshake, Encrypted Extensions (<span class="hljs-number">8</span>):
* TLSv1<span class="hljs-number">.3</span> (IN), TLS handshake, Certificate (<span class="hljs-number">11</span>):
* TLSv1<span class="hljs-number">.3</span> (IN), TLS handshake, CERT verify (<span class="hljs-number">15</span>):
* TLSv1<span class="hljs-number">.3</span> (IN), TLS handshake, Finished (<span class="hljs-number">20</span>):
* TLSv1<span class="hljs-number">.3</span> (OUT), TLS change cipher, Change cipher spec (<span class="hljs-number">1</span>):
* TLSv1<span class="hljs-number">.3</span> (OUT), TLS handshake, Finished (<span class="hljs-number">20</span>):
* SSL connection using TLSv1<span class="hljs-number">.3</span> / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use http/<span class="hljs-number">1.1</span>
* Server certificate:
*  subject: CN=*.asciinema.org
*  start date: Mar  <span class="hljs-number">9</span> <span class="hljs-number">06</span>:<span class="hljs-number">02</span>:<span class="hljs-number">26</span> <span class="hljs-number">2022</span> GMT
*  expire date: Jun  <span class="hljs-number">7</span> <span class="hljs-number">06</span>:<span class="hljs-number">02</span>:<span class="hljs-number">25</span> <span class="hljs-number">2022</span> GMT
*  subjectAltName: host <span class="hljs-string">"asciinema.org"</span> matched cert<span class="hljs-string">'s "asciinema.org"
*  issuer: C=US; O=Let'</span>s Encrypt; CN=R3
*  SSL certificate verify ok.
* Server auth using Basic <span class="hljs-keyword">with</span> user <span class="hljs-string">'XXXX'</span>
&gt; POST /api/asciicasts HTTP/<span class="hljs-number">1.1</span>
&gt; Host: asciinema.org
&gt; Authorization: Basic XXXXX=
&gt; User-Agent: curl/<span class="hljs-number">7.71</span><span class="hljs-number">.1</span>
&gt; Accept: */*
&gt; Content-Length: <span class="hljs-number">12444495</span>
&gt; Content-Type: multipart/form-data; boundary=-----------------------<span class="hljs-number">-0</span>d76dac3e1f8aed4
&gt; Expect: <span class="hljs-number">100</span>-<span class="hljs-keyword">continue</span>
&gt; 
* TLSv1<span class="hljs-number">.3</span> (IN), TLS handshake, Newsession Ticket (<span class="hljs-number">4</span>):
* Mark bundle <span class="hljs-keyword">as</span> <span class="hljs-keyword">not</span> supporting multiuse
&lt; HTTP/<span class="hljs-number">1.1</span> <span class="hljs-number">100</span> Continue
* Mark bundle <span class="hljs-keyword">as</span> <span class="hljs-keyword">not</span> supporting multiuse
* The requested URL returned error: <span class="hljs-number">413</span> Request Entity Too Large
* Closing connection <span class="hljs-number">0</span>
* TLSv1<span class="hljs-number">.3</span> (OUT), TLS alert, close notify (<span class="hljs-number">256</span>):
curl: (<span class="hljs-number">22</span>) The requested URL returned error: <span class="hljs-number">413</span> Request Entity Too Large
</code></pre>
<p>The error <code>413 Request Entity Too Large</code> <a target="_blank" href="https://developer.mozilla.org/en-US/docs/web/http/status/413">means</a>:</p>
<blockquote>
<p>The HTTP 413 Payload Too Large response status code indicates that the request entity is larger than limits defined by server; the server might close the connection or return a Retry-After header field.</p>
</blockquote>
<p>So curl is much better than the Python client at telling us the truth about why our file was rejected.</p>
<p>How much data did we manage to transmit before our connection was cut off? Let's see if we can find that out using a packet sniffer.</p>
<h2 id="heading-how-to-use-wireshark-and-the-sslkeylogfile-to-inspect-the-http-traffic"><strong>How to use Wireshark and the SSLKEYLOGFILE to inspect the HTTP traffic</strong></h2>
<p>You can capture the traffic between your machine and the asciinema website using a network sniffer like <a target="_blank" href="https://wireshark.org/">Wireshark</a> or the well known <a target="_blank" href="https://www.tcpdump.org/">tcpdump</a>.</p>
<p>The traffic will be encrypted because we use HTTPS, but many programs support a feature that logs the '<a target="_blank" href="https://www.paolotagliaferri.com/overview-of-transport-layer-security-protocol-tls-1-3/">TLS master encryption secrets</a>' to a file, which lets you decrypt the session. Let's enable that <a target="_blank" href="https://everything.curl.dev/usingcurl/tls/sslkeylogfile">feature</a> on the client:</p>
<pre><code class="lang-python">export SSLKEYLOGFILE=$HOME/keylogfile.txt
</code></pre>
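<p>Python code that builds its own <code>ssl.SSLContext</code> can also request the key log explicitly through the <code>keylog_filename</code> attribute (Python 3.8 or newer, linked against OpenSSL 1.1.1 or newer). A small sketch follows. The temporary path is an assumption, chosen only so that the example does not touch your home directory:</p>
<pre><code class="lang-python">import os
import ssl
import tempfile

# Hypothetical location for the key log; any writable path works
keylog_path = os.path.join(tempfile.mkdtemp(), "keylogfile.txt")

context = ssl.create_default_context()
# Secrets for every TLS session made with this context get appended here
context.keylog_filename = keylog_path
print(context.keylog_filename)
</code></pre>
<p>Contexts created with <code>ssl.create_default_context()</code> also honor the <code>SSLKEYLOGFILE</code> environment variable, so the explicit assignment is only needed when you want a different file.</p>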
<p>If it is supported, the <code>$SSLKEYLOGFILE</code> file will be populated with the keys:</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ export SSLKEYLOGFILE=$HOME/keylogfile.txt
[josevnz@dmaf5 SuricataLog]$ /usr/bin/asciinema upload demo-ascii.cast 
asciinema: upload failed: &lt;urlopen error [Errno <span class="hljs-number">32</span>] Broken pipe&gt;
asciinema: retry later by running: asciinema upload demo-ascii.cast
[josevnz@dmaf5 SuricataLog]$ ls -l $SSLKEYLOGFILE
-rw-rw-r-- <span class="hljs-number">1</span> josevnz josevnz <span class="hljs-number">832</span> Apr <span class="hljs-number">21</span> <span class="hljs-number">21</span>:<span class="hljs-number">02</span> /home/josevnz/keylogfile.txt

[josevnz@dmaf5 SuricataLog]$ cat /home/josevnz/keylogfile.txt

<span class="hljs-comment"># TLS secrets log file, generated by OpenSSL / Python</span>
SERVER_HANDSHAKE_TRAFFIC_SECRET <span class="hljs-number">2987e32066</span>d608a3de0cdd896f62801290045c2616abfaef5fac1c6986131847 <span class="hljs-number">4</span>dd1a1bc1261a84886b28ee72798d89ba77d7de7051b3dcdafd548a621ed1124
EXPORTER_SECRET <span class="hljs-number">2987e32066</span>d608a3de0cdd896f62801290045c2616abfaef5fac1c6986131847 <span class="hljs-number">1</span>ec8d94b7ec373a984abed25fa0dfaa6346fe67feea0516d7e2e46a666a12614
SERVER_TRAFFIC_SECRET_0 <span class="hljs-number">2987e32066</span>d608a3de0cdd896f62801290045c2616abfaef5fac1c6986131847 e1d8fa6dba5eea00d4e52af0ce7e7007da0ade4c9dd9da3d9a060b55880531f1
CLIENT_HANDSHAKE_TRAFFIC_SECRET <span class="hljs-number">2987e32066</span>d608a3de0cdd896f62801290045c2616abfaef5fac1c6986131847 <span class="hljs-number">903</span>bf381f927d783e72846201e87203ff130d9cf21f84cf0b923834d69c3fe76
CLIENT_TRAFFIC_SECRET_0 <span class="hljs-number">2987e32066</span>d608a3de0cdd896f62801290045c2616abfaef5fac1c6986131847 <span class="hljs-number">495</span>b5acf783869d74a7521e3b9c3f7bfc6dbc25e24ba95f684e96f6b2a435206
SERVER_HANDSHAKE_TRAFFIC_SECRET <span class="hljs-number">82</span>cab66e906c3cd3c58b3aeeecd66b2a12e521704d3e19e2f008550705e78e00 <span class="hljs-number">5</span>a0d699640bd460530bd38148cf979e585b9a43c1bd545974561df18841fa5f4
EXPORTER_SECRET <span class="hljs-number">82</span>cab66e906c3cd3c58b3aeeecd66b2a12e521704d3e19e2f008550705e78e00 <span class="hljs-number">32</span>b69cb41b8db36371e7d207a45e20d401bb05e0cd8bf492e3ace009e2845d12
SERVER_TRAFFIC_SECRET_0 <span class="hljs-number">82</span>cab66e906c3cd3c58b3aeeecd66b2a12e521704d3e19e2f008550705e78e00 <span class="hljs-number">1</span>f42b4392b2cc14789c4eaec4dae275c6a040ae3b11fc6bba58c90c7b80caa96
CLIENT_HANDSHAKE_TRAFFIC_SECRET <span class="hljs-number">82</span>cab66e906c3cd3c58b3aeeecd66b2a12e521704d3e19e2f008550705e78e00 bd93073bda56e559743a1f1ffc48c062089addcfc007c7defe08c28ac0ee6287
CLIENT_TRAFFIC_SECRET_0 <span class="hljs-number">82</span>cab66e906c3cd3c58b3aeeecd66b2a12e521704d3e19e2f008550705e78e00 <span class="hljs-number">32</span>b615c0dd25cb7b430a0cff44871e3263bd67af973e4b2f7fb19aab4df468d8
SERVER_HANDSHAKE_TRAFFIC_SECRET <span class="hljs-number">68</span>dcc859bc4edb51354a9f583e036d0b2787a337ee894e253925e273a5cd3889 a52a20827ce04dfc4ee557608ed5a0bfb6794ace0c4a1b69a1d56e5f16d8570b
EXPORTER_SECRET <span class="hljs-number">68</span>dcc859bc4edb51354a9f583e036d0b2787a337ee894e253925e273a5cd3889 <span class="hljs-number">8179</span>afb8e7c7a77e35143c40a6bb62ccea2e644e48cc95b91b05f525bc59ada7
SERVER_TRAFFIC_SECRET_0 <span class="hljs-number">68</span>dcc859bc4edb51354a9f583e036d0b2787a337ee894e253925e273a5cd3889 <span class="hljs-number">3</span>d4abf6a9ea06395648a45428ca78c24962d8cc11440fe1d72f035ae35e61010
CLIENT_HANDSHAKE_TRAFFIC_SECRET <span class="hljs-number">68</span>dcc859bc4edb51354a9f583e036d0b2787a337ee894e253925e273a5cd3889 <span class="hljs-number">1</span>d812a6c3c012a8fa4a6017ee573b47a5b361d15b861938ebca9194ecbc2a250
CLIENT_TRAFFIC_SECRET_0 <span class="hljs-number">68</span>dcc859bc4edb51354a9f583e036d0b2787a337ee894e253925e273a5cd3889 <span class="hljs-number">6348</span>a88dc9b6a350d72a7154140b824db80ba4f48c9e1fabcee76da8d248b041
</code></pre>
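<p>Each line of the key log has three space-separated fields: a label, the client random that identifies the TLS session, and the secret itself. Here is a rough sketch that groups the entries by session. The function name is hypothetical and the sample values are shortened, made-up hex strings:</p>
<pre><code class="lang-python">from collections import defaultdict


def secrets_by_session(lines):
    """Group key log entries by client random (the second field)."""
    sessions = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        label, client_random, secret = line.split()
        sessions[client_random][label] = secret
    return dict(sessions)


# Made-up, shortened values; real entries are 64 hex characters or longer
sample = [
    "# TLS secrets log file, generated by OpenSSL / Python",
    "CLIENT_TRAFFIC_SECRET_0 aa11 0001",
    "SERVER_TRAFFIC_SECRET_0 aa11 0002",
    "CLIENT_TRAFFIC_SECRET_0 bb22 0003",
]
for client_random, secrets in secrets_by_session(sample).items():
    print(client_random, sorted(secrets))
</code></pre>
<p>Applied to the file listed above, this grouping would show three sessions with five secrets each.</p>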
<p>Good. The next step is to capture the traffic. We'll use <a target="_blank" href="https://www.tcpdump.org/">tcpdump</a> with a <a target="_blank" href="https://www.tcpdump.org/papers/ethereal-tcpdump.pdf">simple expression</a> so that only the traffic to and from asciinema.org is captured:</p>
<pre><code class="lang-python">[josevnz@dmaf5 temp]$ sudo tcpdump -i eno1 -v -v -v <span class="hljs-string">'host asciinema.org'</span> -w ~/temp/asciinema.org.pcap
dropped privs to tcpdump
tcpdump: listening on eno1, link-type EN10MB (Ethernet), snapshot length <span class="hljs-number">262144</span> bytes
</code></pre>
<p>And in another window run the asciinema client (we'll do it twice to have more data):</p>
<pre><code class="lang-python">[josevnz@dmaf5 SuricataLog]$ /usr/bin/asciinema upload demo-ascii.cast 
asciinema: upload failed: &lt;urlopen error [Errno <span class="hljs-number">32</span>] Broken pipe&gt;
asciinema: retry later by running: asciinema upload demo-ascii.cast
[josevnz@dmaf5 SuricataLog]$ 
[josevnz@dmaf5 SuricataLog]$ /usr/bin/asciinema upload demo-ascii.cast 
asciinema: upload failed: &lt;urlopen error [Errno <span class="hljs-number">104</span>] Connection reset by peer&gt;
asciinema: retry later by running: asciinema upload demo-ascii.cast
</code></pre>
<p>Now kill the tcpdump capture in the other window:</p>
<pre><code class="lang-python">tcpdump: listening on eno1, link-type EN10MB (Ethernet), snapshot length <span class="hljs-number">262144</span> bytes
^C113 packets captured
<span class="hljs-number">118</span> packets received by filter
<span class="hljs-number">0</span> packets dropped by kernel
</code></pre>
<p>Let's replay the pcap file to see what got recorded:</p>
<pre><code class="lang-python">[josevnz@dmaf5 temp]$ tcpdump -r ~/temp/asciinema.org.pcap
reading <span class="hljs-keyword">from</span> file /home/josevnz/temp/asciinema.org.pcap, link-type EN10MB (Ethernet), snapshot length <span class="hljs-number">262144</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">18.244941</span> IP dmaf5.home<span class="hljs-number">.59896</span> &gt; cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https: Flags [S], seq <span class="hljs-number">1651239781</span>, win <span class="hljs-number">64240</span>, options [mss <span class="hljs-number">1460</span>,sackOK,TS val <span class="hljs-number">3293505858</span> ecr <span class="hljs-number">0</span>,nop,wscale <span class="hljs-number">7</span>], length <span class="hljs-number">0</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">18.337023</span> IP cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https &gt; dmaf5.home<span class="hljs-number">.59896</span>: Flags [S.], seq <span class="hljs-number">2395275599</span>, ack <span class="hljs-number">1651239782</span>, win <span class="hljs-number">65160</span>, options [mss <span class="hljs-number">1460</span>,sackOK,TS val <span class="hljs-number">3934370169</span> ecr <span class="hljs-number">3293505858</span>,nop,wscale <span class="hljs-number">7</span>], length <span class="hljs-number">0</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">18.337070</span> IP dmaf5.home<span class="hljs-number">.59896</span> &gt; cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https: Flags [.], ack <span class="hljs-number">1</span>, win <span class="hljs-number">502</span>, options [nop,nop,TS val <span class="hljs-number">3293505950</span> ecr <span class="hljs-number">3934370169</span>], length <span class="hljs-number">0</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">18.337643</span> IP dmaf5.home<span class="hljs-number">.59896</span> &gt; cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https: Flags [P.], seq <span class="hljs-number">1</span>:<span class="hljs-number">518</span>, ack <span class="hljs-number">1</span>, win <span class="hljs-number">502</span>, options [nop,nop,TS val <span class="hljs-number">3293505951</span> ecr <span class="hljs-number">3934370169</span>], length <span class="hljs-number">517</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">18.429273</span> IP cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https &gt; dmaf5.home<span class="hljs-number">.59896</span>: Flags [.], ack <span class="hljs-number">518</span>, win <span class="hljs-number">506</span>, options [nop,nop,TS val <span class="hljs-number">3934370263</span> ecr <span class="hljs-number">3293505951</span>], length <span class="hljs-number">0</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">18.433850</span> IP cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https &gt; dmaf5.home<span class="hljs-number">.59896</span>: Flags [.], seq <span class="hljs-number">1</span>:<span class="hljs-number">1449</span>, ack <span class="hljs-number">518</span>, win <span class="hljs-number">506</span>, options [nop,nop,TS val <span class="hljs-number">3934370267</span> ecr <span class="hljs-number">3293505951</span>], length <span class="hljs-number">1448</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">18.433863</span> IP dmaf5.home<span class="hljs-number">.59896</span> &gt; cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https: Flags [.], ack <span class="hljs-number">1449</span>, win <span class="hljs-number">501</span>, options [nop,nop,TS val <span class="hljs-number">3293506047</span> ecr <span class="hljs-number">3934370267</span>], length <span class="hljs-number">0</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">18.433966</span> IP cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https &gt; dmaf5.home<span class="hljs-number">.59896</span>: Flags [P.], seq <span class="hljs-number">1449</span>:<span class="hljs-number">2897</span>, ack <span class="hljs-number">518</span>, win <span class="hljs-number">506</span>, options [nop,nop,TS val <span class="hljs-number">3934370267</span> ecr <span class="hljs-number">3293505951</span>], length <span class="hljs-number">1448</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">18.433981</span> IP dmaf5.home<span class="hljs-number">.59896</span> &gt; cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https: Flags [.], ack <span class="hljs-number">2897</span>, win <span class="hljs-number">496</span>, options [nop,nop,TS val <span class="hljs-number">3293506047</span> ecr <span class="hljs-number">3934370267</span>], length <span class="hljs-number">0</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">18.434089</span> IP cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https &gt; dmaf5.home<span class="hljs-number">.59896</span>: Flags [.], seq <span class="hljs-number">2897</span>:<span class="hljs-number">4345</span>, ack <span class="hljs-number">518</span>, win <span class="hljs-number">506</span>, options [nop,nop,TS val <span class="hljs-number">3934370267</span> ecr <span class="hljs-number">3293505951</span>], length <span class="hljs-number">1448</span>
...
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">30.612523</span> IP cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https &gt; dmaf5.home<span class="hljs-number">.59898</span>: Flags [.], ack <span class="hljs-number">11148</span>, win <span class="hljs-number">501</span>, options [nop,nop,TS val <span class="hljs-number">3934382447</span> ecr <span class="hljs-number">3293518134</span>], length <span class="hljs-number">0</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">30.612524</span> IP cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https &gt; dmaf5.home<span class="hljs-number">.59898</span>: Flags [.], ack <span class="hljs-number">12596</span>, win <span class="hljs-number">501</span>, options [nop,nop,TS val <span class="hljs-number">3934382447</span> ecr <span class="hljs-number">3293518134</span>], length <span class="hljs-number">0</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">30.612558</span> IP dmaf5.home<span class="hljs-number">.59898</span> &gt; cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https: Flags [.], seq <span class="hljs-number">35764</span>:<span class="hljs-number">37212</span>, ack <span class="hljs-number">4724</span>, win <span class="hljs-number">499</span>, options [nop,nop,TS val <span class="hljs-number">3293518226</span> ecr <span class="hljs-number">3934382447</span>], length <span class="hljs-number">1448</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">30.612563</span> IP dmaf5.home<span class="hljs-number">.59898</span> &gt; cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https: Flags [P.], seq <span class="hljs-number">37212</span>:<span class="hljs-number">38660</span>, ack <span class="hljs-number">4724</span>, win <span class="hljs-number">499</span>, options [nop,nop,TS val <span class="hljs-number">3293518226</span> ecr <span class="hljs-number">3934382447</span>], length <span class="hljs-number">1448</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">30.612637</span> IP dmaf5.home<span class="hljs-number">.59898</span> &gt; cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https: Flags [.], seq <span class="hljs-number">38660</span>:<span class="hljs-number">40108</span>, ack <span class="hljs-number">4724</span>, win <span class="hljs-number">499</span>, options [nop,nop,TS val <span class="hljs-number">3293518226</span> ecr <span class="hljs-number">3934382447</span>], length <span class="hljs-number">1448</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">30.612643</span> IP dmaf5.home<span class="hljs-number">.59898</span> &gt; cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https: Flags [P.], seq <span class="hljs-number">40108</span>:<span class="hljs-number">41556</span>, ack <span class="hljs-number">4724</span>, win <span class="hljs-number">499</span>, options [nop,nop,TS val <span class="hljs-number">3293518226</span> ecr <span class="hljs-number">3934382447</span>], length <span class="hljs-number">1448</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">30.613064</span> IP cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https &gt; dmaf5.home<span class="hljs-number">.59898</span>: Flags [P.], seq <span class="hljs-number">4724</span>:<span class="hljs-number">5080</span>, ack <span class="hljs-number">12596</span>, win <span class="hljs-number">501</span>, options [nop,nop,TS val <span class="hljs-number">3934382448</span> ecr <span class="hljs-number">3293518134</span>], length <span class="hljs-number">356</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">30.613106</span> IP cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https &gt; dmaf5.home<span class="hljs-number">.59898</span>: Flags [P.], seq <span class="hljs-number">5080</span>:<span class="hljs-number">5104</span>, ack <span class="hljs-number">12596</span>, win <span class="hljs-number">501</span>, options [nop,nop,TS val <span class="hljs-number">3934382448</span> ecr <span class="hljs-number">3293518134</span>], length <span class="hljs-number">24</span>
<span class="hljs-number">07</span>:<span class="hljs-number">17</span>:<span class="hljs-number">30.614231</span> IP cip<span class="hljs-number">-109</span><span class="hljs-number">-107</span><span class="hljs-number">-37</span><span class="hljs-number">-0.</span>gb1.brightbox.com.https &gt; dmaf5.home<span class="hljs-number">.59898</span>: Flags [R.], seq <span class="hljs-number">5104</span>, ack <span class="hljs-number">12596</span>, win <span class="hljs-number">501</span>, options [nop,nop,TS val <span class="hljs-number">3934382448</span> ecr <span class="hljs-number">3293518134</span>], length <span class="hljs-number">0</span>
</code></pre>
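<p>The text output above already hints at how much data each side sent. As a rough sketch, the helper below sums the <code>length</code> field per sender. The function name is hypothetical and it assumes the one-line-per-packet layout shown above (time, <code>IP</code>, source, direction arrow, destination). These lengths are TCP payload sizes, so they include the TLS handshake and record overhead, not just the HTTP body:</p>
<pre><code class="lang-python">import re
from collections import Counter

LENGTH = re.compile(r"length (\d+)$")


def payload_bytes_by_sender(lines):
    """Sum the TCP payload length reported by tcpdump for each sender."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        match = LENGTH.search(line.strip())
        # Keep only packet lines: second field is "IP" and a length is present
        if match is None or fields[1:2] != ["IP"]:
            continue
        totals[fields[2]] += int(match.group(1))
    return totals
</code></pre>
<p>You could feed it the output of <code>tcpdump -r</code> line by line, for example by reading <code>sys.stdin</code>.</p>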
<p>Time to fire up Wireshark. I like to use a GUI for this because the filtering capabilities are nice, and you can explore the contents of the PCAP file much more easily.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/04/wireshark-open-pcap-file.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>The contents of the traffic capture:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/04/wireshark-traffic-dump.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Next, find the first TLS hello message, right-click it, and choose Protocol Preferences -&gt; Transport Layer Security -&gt; "pre-Master-Secret log filename". Point it to the key log file we generated earlier:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/04/wireshark-tls-pre-master-key.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Now time for the fun part. If you right-click the first hello message and choose "follow TLS stream", a new window opens with the whole conversation, decrypted, up to the moment our connection was reset:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/04/wireshark-follow-tls.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>So we only managed to send 33 KB before being cut off by the asciinema server. How rude!</p>
<p>Because the data payload is not that big, I will show it to you next. Pay attention to the following:</p>
<ol>
<li><p>I changed the Authorization: Basic contents as I don't want to leak my user/password encoded in base64.</p>
</li>
<li><p>Content-Length: 12444474. That's how the asciinema server knows how big the file we want to upload is, so it can reject it.</p>
</li>
<li><p>Asciinema uses Nginx.</p>
</li>
<li><p>You can see the close message at the end (entity too large).</p>
</li>
</ol>
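<p>The Content-Length in item 2 can be cross-checked against the file size we saw in the debugger (<code>len(data)</code> was 12444283). The sketch below computes how many bytes the multipart framing adds around a single file. It assumes the body is framed exactly as in the captured request that follows (CRLF line endings, one part, the same part headers), and the function name is hypothetical:</p>
<pre><code class="lang-python">def multipart_overhead(boundary, field, filename,
                       content_type="application/octet-stream"):
    """Bytes a single-file multipart/form-data body adds around the file."""
    head = (
        "--%s\r\n"
        'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
        "Content-Type: %s\r\n"
        "\r\n"
    ) % (boundary, field, filename, content_type)
    # Closing delimiter that follows the file contents
    tail = "\r\n--%s--\r\n" % boundary
    return len(head.encode("utf-8")) + len(tail.encode("utf-8"))


overhead = multipart_overhead(
    "d5c6b2543ee94511943126c6a3c5d33a", "asciicast", "ascii.cast")
print(overhead, 12444283 + overhead)
</code></pre>
<p>With the boundary from the capture, this prints 191 and 12444474, which matches the Content-Length header that the client sent.</p>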
<pre><code class="lang-python">POST /api/asciicasts HTTP/<span class="hljs-number">1.1</span>
Accept-Encoding: identity
Content-Length: <span class="hljs-number">12444474</span>
Host: asciinema.org
User-Agent: asciinema/<span class="hljs-number">2.0</span><span class="hljs-number">.2</span> CPython/<span class="hljs-number">3.9</span><span class="hljs-number">.9</span> Linux/<span class="hljs-number">5.14</span><span class="hljs-number">.18</span><span class="hljs-number">-100.</span>fc33.x86_64-x86_64-<span class="hljs-keyword">with</span>-glibc2<span class="hljs-number">.32</span>
Accept: application/json
Content-Type: multipart/form-data; boundary=d5c6b2543ee94511943126c6a3c5d33a
Authorization: Basic XXXXX=
Connection: close

--d5c6b2543ee94511943126c6a3c5d33a
Content-Disposition: form-data; name=<span class="hljs-string">"asciicast"</span>; filename=<span class="hljs-string">"ascii.cast"</span>
Content-Type: application/octet-stream

{<span class="hljs-string">"version"</span>: <span class="hljs-number">2</span>, <span class="hljs-string">"width"</span>: <span class="hljs-number">203</span>, <span class="hljs-string">"height"</span>: <span class="hljs-number">32</span>, <span class="hljs-string">"timestamp"</span>: <span class="hljs-number">1650568938</span>, <span class="hljs-string">"env"</span>: {<span class="hljs-string">"SHELL"</span>: <span class="hljs-string">"/bin/bash"</span>, <span class="hljs-string">"TERM"</span>: <span class="hljs-string">"xterm-256color"</span>}}
[<span class="hljs-number">0.191182</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b]777;notify;Command completed;eve_log.py --format table --timestamp '2022-02-23T18:22:24.405139+0000' test/eve.json\u001b\\\u001b]777;precmd\u001b\\\u001b]0;josevnz@dmaf5:~/SuricataLog-Logging-features-branch\u001b\\"</span>]
[<span class="hljs-number">0.19215</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b]7;file://dmaf5/home/josevnz/SuricataLog-Logging-features-branch\u001b\\"</span>]
[<span class="hljs-number">0.192399</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"[josevnz@dmaf5 SuricataLog-Logging-features-branch]$ "</span>]
[<span class="hljs-number">1.000538</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"Let me show you how you can filter your Suricata alerts, displaying the results in different formats"</span>]
[<span class="hljs-number">4.506902</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\r\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C"</span>]
[<span class="hljs-number">4.921813</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b[1@#"</span>]
[<span class="hljs-number">5.170393</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b[1@ "</span>]
[<span class="hljs-number">5.538486</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C\u001b[C"</span>]
[<span class="hljs-number">6.914337</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\r\n"</span>]
[<span class="hljs-number">6.918708</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b]777;notify;Command completed;# Let me show you how you can filter your Suricata alerts, displaying the results in different formats\u001b\\\u001b]777;precmd\u001b\\\u001b]0;josevnz@dmaf5:~/SuricataLog-Logging-features-branch\u001b\\"</span>]
[<span class="hljs-number">6.920219</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b]7;file://dmaf5/home/josevnz/SuricataLog-Logging-features-branch\u001b\\"</span>]
[<span class="hljs-number">6.920352</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"[josevnz@dmaf5 SuricataLog-Logging-features-branch]$ "</span>]
[<span class="hljs-number">8.202111</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"1"</span>]
[<span class="hljs-number">8.658197</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">")"</span>]
[<span class="hljs-number">8.962176</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">" "</span>]
[<span class="hljs-number">10.153862</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"A"</span>]
[<span class="hljs-number">10.409632</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">" "</span>]
[<span class="hljs-number">10.61679</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"n"</span>]
[<span class="hljs-number">10.777002</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"i"</span>]
[<span class="hljs-number">10.881112</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"c"</span>]
[<span class="hljs-number">10.952884</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"e"</span>]
[<span class="hljs-number">11.088641</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">" "</span>]
[<span class="hljs-number">11.201045</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"t"</span>]
[<span class="hljs-number">11.466022</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"a"</span>]
[<span class="hljs-number">11.553785</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"b"</span>]
[<span class="hljs-number">11.818412</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"l"</span>]
[<span class="hljs-number">11.961808</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"e"</span>]
[<span class="hljs-number">13.51443</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\r\n"</span>]
[<span class="hljs-number">13.514675</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"bash: syntax error near unexpected token `)'\r\n"</span>]
[<span class="hljs-number">13.518913</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b]777;notify;Command completed;1) A nice table\u001b\\\u001b]777;precmd\u001b\\\u001b]0;josevnz@dmaf5:~/SuricataLog-Logging-features-branch\u001b\\"</span>]
[<span class="hljs-number">13.520551</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b]7;file://dmaf5/home/josevnz/SuricataLog-Logging-features-branch\u001b\\"</span>]
[<span class="hljs-number">13.52072</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"[josevnz@dmaf5 SuricataLog-Logging-features-branch]$ "</span>]
[<span class="hljs-number">22.176716</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"eve_log.py --format table --timestamp '2022-02-23T18:22:24.405139+0000' test/eve.jso"</span>]
[<span class="hljs-number">24.202009</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"n"</span>]
[<span class="hljs-number">26.097822</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\r\n"</span>]
[<span class="hljs-number">26.098024</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b]777;preexec\u001b\\"</span>]
[<span class="hljs-number">26.312676</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b[?1049h\u001b[H\u001b[?1000h\u001b[?1003h\u001b[?1015h\u001b[?1006h\u001b[?25l\u001b[?1003h\r\n"</span>]
[<span class="hljs-number">26.314059</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001bP=1s\u001b\\\u001b[H\u001b[H                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                               "</span>]
[<span class="hljs-number">26.314299</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"            \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                              "</span>]
[<span class="hljs-number">26.314387</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"             \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                             "</span>]
[<span class="hljs-number">26.314455</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"              \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                            "</span>]
[<span class="hljs-number">26.314502</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"               \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                           "</span>]
[<span class="hljs-number">26.31456</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"                \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                          "</span>]
[<span class="hljs-number">26.314616</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"                 \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \u001bP=2s\u001b\\"</span>]
[<span class="hljs-number">26.31467</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001bP=1s\u001b\\\u001b[H\u001b[H                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                               "</span>]
[<span class="hljs-number">26.314714</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"            \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                              "</span>]
[<span class="hljs-number">26.314781</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"             \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                             "</span>]
[<span class="hljs-number">26.314843</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"              \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                            "</span>]
[<span class="hljs-number">26.314902</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"               \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                           "</span>]
[<span class="hljs-number">26.314957</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"                \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                          "</span>]
[<span class="hljs-number">26.315012</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"                 \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \u001bP=2s\u001b\\"</span>]
[<span class="hljs-number">26.316033</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001b[?25l"</span>]
[<span class="hljs-number">26.318086</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\r\u001b[2KParsing test/eve.json \u001b[38;5;237m........................................................................................................................\u001b[0m \u001b[35m  0%\u001b[0m \u001b[36m-:--:--\u001b[0m"</span>]
[<span class="hljs-number">26.378123</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\r\u001b[2KParsing test/eve.json \u001b[38;2;114;156;31m........................................................................................................................\u001b[0m \u001b[35m100%\u001b[0m \u001b[36m0:00:00\u001b[0m\r\n\u001b[?25h"</span>]
[<span class="hljs-number">26.390312</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001bP=1s\u001b\\\u001b[H\u001b[H                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                               "</span>]
[<span class="hljs-number">26.39044</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"            \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                              "</span>]
[<span class="hljs-number">26.390499</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"             \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                             "</span>]
[<span class="hljs-number">26.390559</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"              \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                            "</span>]
[<span class="hljs-number">26.390615</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"               \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                           "</span>]
[<span class="hljs-number">26.39064</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"                \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                          "</span>]
[<span class="hljs-number">26.390719</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"                 \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \u001bP=2s\u001b\\"</span>]
[<span class="hljs-number">26.390868</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001bP=1s\u001b\\\u001b[H\u001b[H                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                               "</span>]
[<span class="hljs-number">26.390893</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"            \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                              "</span>]
[<span class="hljs-number">26.390944</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"             \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                             "</span>]
[<span class="hljs-number">26.391027</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"              \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                            "</span>]
[<span class="hljs-number">26.391091</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"               \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                           "</span>]
[<span class="hljs-number">26.391116</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"                \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                          "</span>]
[<span class="hljs-number">26.391172</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"                 \r\n                                                                                                                                                                                                           \r\n                                                                                                                                                                                                           \u001bP=2s\u001b\\"</span>]
[<span class="hljs-number">26.431391</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"\u001bP=1s\u001b\\\u001b[H\u001b[3m                                                                                      Suricata alerts for 2022-02-23 18:22:24.405139, logs=test/eve.json                                                   \u001b[0m\r\n.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................\r\n...\u001b[1;35m \u001b[0m\u001b[1;35mTimestamp                      \u001b[0m\u001b[1;35m \u001b[0m...\u001b[1;35m \u001b[0m\u001b[1;35mSeverity\u001b[0m\u001b[1;35m \u001b[0m...\u001b[1;35m \u001b[0m\u001b[1;35mSignature                                           \u001b"</span>]
[<span class="hljs-number">26.431543</span>, <span class="hljs-string">"o"</span>, <span class="hljs-string">"[0m\u001b[1;35m \u001b[0m...\u001b[1;35m \u001b[0m\u001b[1;35mProtocol\u001b[0m\u001b[1;35m \u00HTTP/1.1 413 Request Entity Too Large
Content-Length: 176
Content-Type: text/html
Date: Fri, 22 Apr 2022 11:17:19 GMT
Server: nginx
Connection: close

&lt;html&gt;
&lt;head&gt;&lt;title&gt;413 Request Entity Too Large&lt;/title&gt;&lt;/head&gt;
&lt;body&gt;
&lt;center&gt;&lt;h1&gt;413 Request Entity Too Large&lt;/h1&gt;&lt;/center&gt;
&lt;hr&gt;&lt;center&gt;nginx&lt;/center&gt;
&lt;/body&gt;
&lt;/html&gt;</span>
</code></pre>
<h1 id="heading-what-is-next-for-you"><strong>What is next for you?</strong></h1>
<p>So the next time you have an issue with a program installed on your system, you will know what to check.</p>
<p>We covered three ways to investigate an issue with an application that uploads a file to a remote website using HTTPS:</p>
<ol>
<li><p>Using <em>strace</em></p>
</li>
<li><p>If the program is a Python script, then there is a good chance you can read the code yourself and run the script through <em>the debugger</em>, step by step, to understand the issue. This is probably the most time-consuming way, but it is also the most rewarding, as you learn how other good developers think!</p>
</li>
<li><p>And finally, we captured the encrypted traffic between us and the remote site and analyzed the upload. By enabling certain special features we were able to decrypt and replay the traffic, confirming our findings from the previous two approaches.</p>
</li>
</ol>
<p>This list of techniques is not exhaustive, but for some cases like this they will give you a good start.</p>
<p>As usual, please share your feedback! Let's have a conversation so everybody learns a little.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Home Network Security – How to Use Suricata, RaspberryPI4, and Python to Make Your Network Safe ]]>
                </title>
                <description>
                    <![CDATA[ In a previous article, I showed you how to secure your wireless home network using Kismet. Kismet is perfect for detecting anomalies and certain types of attack – but what if I want to analyze the traffic and look for abnormal patterns or patterns th... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/home-network-security-with-suricata-raspberrypi4-python/</link>
                <guid isPermaLink="false">66d8513fbfb3c4f0b376afe8</guid>
                
                    <category>
                        <![CDATA[ cybersecurity ]]>
                    </category>
                
                    <category>
                        <![CDATA[ information security ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Raspberry Pi ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Tue, 19 Apr 2022 00:23:10 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2022/04/pexels-george-becker-333837.jpg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In a <a target="_blank" href="https://www.freecodecamp.org/news/wireless-security-using-raspberry-pi-4-kismet-and-python/">previous article</a>, I showed you how to secure your wireless home network using <a target="_blank" href="https://www.kismetwireless.net/">Kismet</a>.</p>
<p>Kismet is perfect for detecting anomalies and certain types of attack – but what if I want to analyze the traffic and look for abnormal patterns or patterns that could indicate an attack?</p>
<p>An <a target="_blank" href="https://en.wikipedia.org/wiki/Intrusion_detection_system">Intrusion Detection System</a> (<strong>IDS</strong>) is:</p>
<blockquote>
<p>...a device or software application that monitors a network or systems for malicious activity or policy violations.</p>
</blockquote>
<p>In the past I used a good IDS called <a target="_blank" href="https://snort.org/">Snort V2</a>, and I'm aware that Snort 3 is out. But there is a <a target="_blank" href="https://snort.org/documents/snort-supported-oss">pretty clear warning</a> about running it on a machine without much memory:</p>
<blockquote>
<p>While Snort can compile on almost all *nix based machines, it is not recommended that you compile Snort on a low power or low RAM machine. Snort requires memory to run and to properly analyze as much traffic as possible.</p>
</blockquote>
<p>And</p>
<blockquote>
<p>Snort does not officially support any particular OS.</p>
</blockquote>
<p><em>Not exactly a reason to dislike it</em>, but I feel more confident when a vendor tells me that my OS is on their supported platform list. I also have more recent experience setting up the open source tool <a target="_blank" href="https://suricata.io/download/">Suricata</a>, so I decided to give it a more serious try to keep tabs on my local network and alert me if any suspicious activity is detected.</p>
<p>Poking around, I found that for my local network 8 GB of RAM will be sufficient, along with my Linux distribution:</p>
<pre><code class="lang-python">josevnz@raspberrypi:~$ lsb_release --release
Release:    <span class="hljs-number">20.04</span>
</code></pre>
<p>My version of Ubuntu <em>is supported out of the box</em>.</p>
<p>The choice is yours. In my case it felt better to use Suricata than Snort. As usual, you need to plan around your hardware, your use cases, and the features offered by the tools (including commercial support).</p>
<h1 id="heading-table-of-contents"><strong>Table of Contents</strong></h1>
<ul>
<li><p><a class="post-section-overview" href="#heading-quick-installation">Quick Installation</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-where-you-should-connect-your-raspberrypi-4-with-suricata">Where you should connect your Raspberry Pi 4 with Suricata</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-set-up-suricata">How to Set Up Suricata</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-tune-up-suricata">How to Tune Up Suricata</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-making-sense-of-all-the-alerts">Making Sense of All the Alerts</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-did-we-learn-and-what-is-next">What Did We Learn and What is Next?</a></p>
</li>
</ul>
<h1 id="heading-quick-installation"><strong>Quick Installation</strong></h1>
<p>Installation is explained in detail <a target="_blank" href="https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Ubuntu_Installation_-_Personal_Package_Archives_%28PPA%29">here</a>, so I will only list the <a target="_blank" href="https://www.youtube.com/playlist?list=PLFqw30a25lWRIhAnQNb7ZaPpexPYgxhVv">quick installation steps</a> I used on my machine:</p>
<pre><code class="lang-python">sudo apt-get install software-properties-common
sudo add-apt-repository ppa:oisf/suricata-stable
sudo apt-get update
sudo apt-get install suricata
</code></pre>
<h2 id="heading-suricata-is-a-complex-beast"><strong>Suricata is a Complex Beast</strong></h2>
<p>You can use Suricata to detect and alert you about anomalies in your network traffic (IDS mode), or you can proactively drop suspicious connections when it works in Intrusion Prevention System (<strong>IPS</strong>) mode.</p>
<p>It can also capture network traffic and store it in PCAP format for later analysis (be careful as you can eat your disk space pretty fast).</p>
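<p>One way to keep an eye on that disk usage is a small periodic check. The following is only a sketch using the standard library's <code>shutil.disk_usage</code>. The function names, the 10% threshold, and the path passed in at the bottom are illustrative assumptions, so point it at whatever directory you configure Suricata to write captures to:</p>

```python
import shutil


def free_fraction(path):
    """Return the fraction (0.0 to 1.0) of free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def low_on_space(path, minimum_free=0.10):
    """Return True when the free fraction drops below minimum_free (assumed threshold)."""
    return minimum_free > free_fraction(path)


if __name__ == "__main__":
    capture_dir = "/"  # placeholder: replace with your own capture directory
    print("Free fraction:", round(free_fraction(capture_dir), 3))
    print("Low on space:", low_on_space(capture_dir))
```

<p>You could run a check like this from cron and send yourself a message when it reports <code>True</code>.</p>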
<p>In this tutorial we will keep things simple and take a more passive approach: stick to IDS mode and get alerts when an intrusion is detected.</p>
<h1 id="heading-where-you-should-connect-your-raspberrypi-4-with-suricata"><strong>Where Should You Connect Your Raspberry Pi 4 with Suricata?</strong></h1>
<p>Ideally you want to put your Suricata sensor close to your home router. One way to do it is to connect all the devices (including your home router) to a common switch, and then mirror the traffic that goes into and out of the home router to a port on the switch. Suricata will be connected to that port, listening to all the traffic.</p>
<p>If you wanted to run Suricata as an IPS then the connectivity would have to be different, but this is not the intended use in this tutorial.</p>
<h1 id="heading-how-to-set-up-suricata"><strong>How to Set Up Suricata</strong></h1>
<p>The best place to put Suricata is between a firewall and the rest of the servers in your home network.</p>
<p>In this scenario, let's assume that this is not possible because there is no firewall (OK, that would be your ISP router, but you cannot run Suricata there). So the next best thing is the wired network interface connected to it (in my case eth0).</p>
<p>The /etc/suricata/suricata.yaml file contains the defaults. I'll show here what I overrode:</p>
<pre><code class="lang-shell">root@raspberrypi:~# grep -in1 af-p /etc/suricata/suricata.yaml 
580-# Linux high speed capture support
581:af-packet:
582-  - interface: eth0
root@raspberrypi:~# grep -in 'HOME_NET: "' /etc/suricata/suricata.yaml |grep -v '#'
15:    HOME_NET: "[192.168.1.0/24]"
</code></pre>
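<p>To sanity-check a <code>HOME_NET</code> value before restarting Suricata, you can test a few addresses against it. This is a minimal sketch using Python's standard <code>ipaddress</code> module. The helper names and the sample addresses are made up for illustration, and it only handles a plain bracketed list of CIDR blocks like the one above (no negations or variables):</p>

```python
import ipaddress


def parse_home_net(value):
    """Parse a simple HOME_NET string such as "[192.168.1.0/24]" into networks.

    Assumption: only a flat, comma-separated list of CIDR blocks is handled.
    """
    inner = value.strip().strip("[]")
    return [
        ipaddress.ip_network(part.strip())
        for part in inner.split(",")
        if part.strip()
    ]


def in_home_net(address, home_net):
    """Return True if the address belongs to any network listed in HOME_NET."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in parse_home_net(home_net))


if __name__ == "__main__":
    home_net = "[192.168.1.0/24]"  # value taken from the suricata.yaml shown above
    for candidate in ("192.168.1.25", "8.8.8.8"):  # sample addresses, illustration only
        print(candidate, in_home_net(candidate, home_net))
```

<p>If one of your own devices is reported as outside <code>HOME_NET</code>, the network range probably needs adjusting.</p>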
<p>Start Suricata:</p>
<pre><code class="lang-shell">root@raspberrypi:~# systemctl start suricata.service
root@raspberrypi:~# systemctl status suricata.service
● suricata.service - LSB: Next Generation IDS/IPS
     Loaded: loaded (/etc/init.d/suricata; generated)
     Active: active (running) since Sun 2022-04-10 23:49:00 UTC; 24h ago
       Docs: man:systemd-sysv-generator(8)
      Tasks: 10 (limit: 9257)
     CGroup: /system.slice/suricata.service
             └─1834983 /usr/bin/suricata -c /etc/suricata/suricata.yaml --pidfile /var/run/suricata.pid --af-packet -D -vvv

Apr 10 23:49:00 raspberrypi systemd[1]: Starting LSB: Next Generation IDS/IPS...
Apr 10 23:49:00 raspberrypi suricata[1834973]: Starting suricata in IDS (af-packet) mode... done.
Apr 10 23:49:00 raspberrypi systemd[1]: Started LSB: Next Generation IDS/IPS.
</code></pre>
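<p>If you want to script the same check, for example from a monitoring job, a sketch like the following wraps <code>systemctl is-active</code>. The function name and the <code>runner</code> parameter are my own assumptions for illustration. The parameter exists only so the logic can be tested on a machine without systemd:</p>

```python
import subprocess


def suricata_is_active(unit="suricata.service", runner=subprocess.run):
    """Return True when `systemctl is-active` reports the unit as active.

    The runner parameter is a hypothetical hook used to swap in a fake for tests.
    """
    result = runner(
        ["systemctl", "is-active", unit],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit just means "not active", so do not raise
    )
    return result.stdout.strip() == "active"


if __name__ == "__main__":
    try:
        print("Suricata running:", suricata_is_active())
    except FileNotFoundError:
        # systemctl is not installed (for example, a non-systemd host)
        print("systemctl not found; is this a systemd host?")
```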
<p>The important details go into the file '/var/log/suricata/eve.json'. Mine started to grow surprisingly fast after starting Suricata:</p>
<pre><code class="lang-python">{<span class="hljs-string">"timestamp"</span>:<span class="hljs-string">"2022-04-10T23:49:32.527488+0000"</span>,<span class="hljs-string">"event_type"</span>:<span class="hljs-string">"stats"</span>,<span class="hljs-string">"stats"</span>:{<span class="hljs-string">"uptime"</span>:<span class="hljs-number">32</span>,<span class="hljs-string">"capture"</span>:{<span class="hljs-string">"kernel_packets"</span>:<span class="hljs-number">113</span>,<span class="hljs-string">"kernel_drops"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"errors"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"decoder"</span>:{<span class="hljs-string">"pkts"</span>:<span class="hljs-number">126</span>,<span class="hljs-string">"bytes"</span>:<span class="hljs-number">17986</span>,<span class="hljs-string">"invalid"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ipv4"</span>:<span class="hljs-number">30</span>,<span class="hljs-string">"ipv6"</span>:<span class="hljs-number">74</span>,<span class="hljs-string">"ethernet"</span>:<span class="hljs-number">126</span>,<span class="hljs-string">"chdlc"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"raw"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"null"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"sll"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"tcp"</span>:<span class="hljs-number">4</span>,<span class="hljs-string">"udp"</span>:<span class="hljs-number">30</span>,<span class="hljs-string">"sctp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"icmpv4"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"icmpv6"</span>:<span class="hljs-number">70</span>,<span class="hljs-string">"ppp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"pppoe"</span>:<span class="hljs-number">0</span>,<span 
class="hljs-string">"geneve"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"gre"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"vlan"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"vlan_qinq"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"vxlan"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"vntag"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ieee8021ah"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"teredo"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ipv4_in_ipv6"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ipv6_in_ipv6"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"mpls"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"avg_pkt_size"</span>:<span class="hljs-number">142</span>,<span class="hljs-string">"max_pkt_size"</span>:<span class="hljs-number">392</span>,<span class="hljs-string">"max_mac_addrs_src"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"max_mac_addrs_dst"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"erspan"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"event"</span>:{<span class="hljs-string">"ipv4"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"hlen_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"iplen_smaller_than_hlen"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"trunc_pkt"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"opt_invalid"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"opt_invalid_len"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"opt_malformed"</span>:<span class="hljs-number">0</span>,<span 
class="hljs-string">"opt_pad_required"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"opt_eol_required"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"opt_duplicate"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"opt_unknown"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"wrong_ip_version"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"icmpv6"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"frag_pkt_too_large"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"frag_overlap"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"frag_ignored"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"icmpv4"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"unknown_type"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"unknown_code"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ipv4_trunc_pkt"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ipv4_unknown_ver"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"icmpv6"</span>:{<span class="hljs-string">"unknown_type"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"unknown_code"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ipv6_unknown_version"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ipv6_trunc_pkt"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"mld_message_with_invalid_hl"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"unassigned_type"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"experimentation_type"</span>:<span class="hljs-number">0</span>},<span 
class="hljs-string">"ipv6"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"trunc_pkt"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"trunc_exthdr"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"exthdr_dupl_fh"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"exthdr_useless_fh"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"exthdr_dupl_rh"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"exthdr_dupl_hh"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"exthdr_dupl_dh"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"exthdr_dupl_ah"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"exthdr_dupl_eh"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"exthdr_invalid_optlen"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"wrong_ip_version"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"exthdr_ah_res_not_null"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"hopopts_unknown_opt"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"hopopts_only_padding"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dstopts_unknown_opt"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dstopts_only_padding"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"rh_type_0"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"zero_len_padn"</span>:<span class="hljs-number">21</span>,<span class="hljs-string">"fh_non_zero_reserved_field"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"data_after_none_header"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"unknown_next_header"</span>:<span class="hljs-number">0</span>,<span 
class="hljs-string">"icmpv4"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"frag_pkt_too_large"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"frag_overlap"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"frag_invalid_length"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"frag_ignored"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ipv4_in_ipv6_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ipv4_in_ipv6_wrong_version"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ipv6_in_ipv6_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ipv6_in_ipv6_wrong_version"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"tcp"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"hlen_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"invalid_optlen"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"opt_invalid_len"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"opt_duplicate"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"udp"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"hlen_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"hlen_invalid"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"sll"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"ethernet"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"ppp"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"vju_pkt_too_small"</span>:<span 
class="hljs-number">0</span>,<span class="hljs-string">"ip4_pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ip6_pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"wrong_type"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"unsup_proto"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"pppoe"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"wrong_code"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"malformed_tags"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"gre"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"wrong_version"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version0_recur"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version0_flags"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version0_hdr_too_big"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version0_malformed_sre_hdr"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version1_chksum"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version1_route"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version1_ssr"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version1_recur"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version1_flags"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version1_no_key"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version1_wrong_protocol"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version1_malformed_sre_hdr"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"version1_hdr_too_big"</span>:<span 
class="hljs-number">0</span>},<span class="hljs-string">"vlan"</span>:{<span class="hljs-string">"header_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"unknown_type"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"too_many_layers"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"ieee8021ah"</span>:{<span class="hljs-string">"header_too_small"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"vntag"</span>:{<span class="hljs-string">"header_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"unknown_type"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"ipraw"</span>:{<span class="hljs-string">"invalid_ip_version"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"ltnull"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"unsupported_type"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"sctp"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"mpls"</span>:{<span class="hljs-string">"header_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"bad_label_router_alert"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"bad_label_implicit_null"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"bad_label_reserved"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"unknown_payload_type"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"vxlan"</span>:{<span class="hljs-string">"unknown_payload_type"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"geneve"</span>:{<span class="hljs-string">"unknown_payload_type"</span>:<span 
class="hljs-number">0</span>},<span class="hljs-string">"erspan"</span>:{<span class="hljs-string">"header_too_small"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"unsupported_version"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"too_many_vlan_layers"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"dce"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"chdlc"</span>:{<span class="hljs-string">"pkt_too_small"</span>:<span class="hljs-number">0</span>}},<span class="hljs-string">"too_many_layers"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"flow"</span>:{<span class="hljs-string">"memcap"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"tcp"</span>:<span class="hljs-number">1</span>,<span class="hljs-string">"udp"</span>:<span class="hljs-number">20</span>,<span class="hljs-string">"icmpv4"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"icmpv6"</span>:<span class="hljs-number">15</span>,<span class="hljs-string">"tcp_reuse"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"get_used"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"get_used_eval"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"get_used_eval_reject"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"get_used_eval_busy"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"get_used_failed"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"wrk"</span>:{<span class="hljs-string">"spare_sync_avg"</span>:<span class="hljs-number">100</span>,<span class="hljs-string">"spare_sync"</span>:<span class="hljs-number">4</span>,<span class="hljs-string">"spare_sync_incomplete"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"spare_sync_empty"</span>:<span 
class="hljs-number">0</span>,<span class="hljs-string">"flows_evicted_needs_work"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"flows_evicted_pkt_inject"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"flows_evicted"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"flows_injected"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"mgr"</span>:{<span class="hljs-string">"full_hash_pass"</span>:<span class="hljs-number">1</span>,<span class="hljs-string">"closed_pruned"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"new_pruned"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"est_pruned"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"bypassed_pruned"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"rows_maxlen"</span>:<span class="hljs-number">1</span>,<span class="hljs-string">"flows_checked"</span>:<span class="hljs-number">4</span>,<span class="hljs-string">"flows_notimeout"</span>:<span class="hljs-number">4</span>,<span class="hljs-string">"flows_timeout"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"flows_timeout_inuse"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"flows_evicted"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"flows_evicted_needs_work"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"spare"</span>:<span class="hljs-number">9600</span>,<span class="hljs-string">"emerg_mode_entered"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"emerg_mode_over"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"memuse"</span>:<span class="hljs-number">11668608</span>},<span class="hljs-string">"defrag"</span>:{<span class="hljs-string">"ipv4"</span>:{<span class="hljs-string">"fragments"</span>:<span class="hljs-number">0</span>,<span 
class="hljs-string">"reassembled"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"timeouts"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"ipv6"</span>:{<span class="hljs-string">"fragments"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"reassembled"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"timeouts"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"max_frag_hits"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"flow_bypassed"</span>:{<span class="hljs-string">"local_pkts"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"local_bytes"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"local_capture_pkts"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"local_capture_bytes"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"closed"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"pkts"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"bytes"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"tcp"</span>:{<span class="hljs-string">"sessions"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ssn_memcap_drop"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"pseudo"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"pseudo_failed"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"invalid_checksum"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"no_flow"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"syn"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"synack"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"rst"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"midstream_pickups"</span>:<span class="hljs-number">0</span>,<span 
class="hljs-string">"pkt_on_wrong_thread"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"segment_memcap_drop"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"stream_depth_reached"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"reassembly_gap"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"overlap"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"overlap_diff_data"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"insert_data_normal_fail"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"insert_data_overlap_fail"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"insert_list_fail"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"memuse"</span>:<span class="hljs-number">2424832</span>,<span class="hljs-string">"reassembly_memuse"</span>:<span class="hljs-number">393216</span>},<span class="hljs-string">"detect"</span>:{<span class="hljs-string">"engines"</span>:[{<span class="hljs-string">"id"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"last_reload"</span>:<span class="hljs-string">"2022-04-10T23:49:00.377030+0000"</span>,<span class="hljs-string">"rules_loaded"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"rules_failed"</span>:<span class="hljs-number">0</span>}],<span class="hljs-string">"alert"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"app_layer"</span>:{<span class="hljs-string">"flow"</span>:{<span class="hljs-string">"http"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ftp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"smtp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"tls"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ssh"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"imap"</span>:<span 
class="hljs-number">0</span>,<span class="hljs-string">"smb"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dcerpc_tcp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dns_tcp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"nfs_tcp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ntp"</span>:<span class="hljs-number">1</span>,<span class="hljs-string">"ftp-data"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"tftp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ikev2"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"krb5_tcp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dhcp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"snmp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"sip"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"rfb"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"mqtt"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"rdp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"failed_tcp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dcerpc_udp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dns_udp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"nfs_udp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"krb5_udp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"failed_udp"</span>:<span class="hljs-number">19</span>},<span class="hljs-string">"tx"</span>:{<span class="hljs-string">"http"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ftp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"smtp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"tls"</span>:<span class="hljs-number">0</span>,<span 
class="hljs-string">"ssh"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"imap"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"smb"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dcerpc_tcp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dns_tcp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"nfs_tcp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ntp"</span>:<span class="hljs-number">1</span>,<span class="hljs-string">"ftp-data"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"tftp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"ikev2"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"krb5_tcp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dhcp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"snmp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"sip"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"rfb"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"mqtt"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"rdp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dcerpc_udp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"dns_udp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"nfs_udp"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"krb5_udp"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"expectations"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"http"</span>:{<span class="hljs-string">"memuse"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"memcap"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"ftp"</span>:{<span class="hljs-string">"memuse"</span>:<span 
class="hljs-number">0</span>,<span class="hljs-string">"memcap"</span>:<span class="hljs-number">0</span>},<span class="hljs-string">"file_store"</span>:{<span class="hljs-string">"open_files"</span>:<span class="hljs-number">0</span>}}}
</code></pre>
<p><em>Holy Priceless Collection of Etruscan Snoods!, Batman</em>. How do we tune Suricata to avoid this overwhelming amount of information?</p>
<p>For now let's stop it while we figure it out.</p>
<h1 id="heading-how-to-tune-up-suricata"><strong>How to Tune Up Suricata</strong></h1>
<p>Make sure the settings of suricata.yaml make sense for a home network:</p>
<pre><code class="lang-shell">sudo -i
# And a YAML linter so we can make sure our Suricata configuration files are good
apt-get install yamllint
cp -v -p  /etc/suricata/suricata.yaml /etc/suricata/suricata.yaml.orig
</code></pre>
<p>Note that I provide a linted and clean version of my <code>suricata.yaml</code> file (<code>etc/suricata/suricata.yaml</code> in the SuricataLog project).</p>
<h2 id="heading-how-to-tame-the-varlogsuricataevejson-file"><strong>How to tame the /var/log/suricata/eve.json file</strong></h2>
<p>This is the file where we can learn in detail what triggered an alert. But it can grow VERY fast, depending on your traffic and event rules configuration.</p>
<p>So, using logrotate (which comes installed as part of Ubuntu), set up a rotation policy like this:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Keep a week of logs, 1 GB of size.</span>
<span class="hljs-comment"># Always test your config: logrotate -vdf /etc/logrotate.d/suricata</span>
/var/log/suricata/*.log /var/log/suricata/*.json {
    daily
    maxsize <span class="hljs-number">1</span>G
    rotate <span class="hljs-number">7</span>
    missingok
    nocompress
    create
    sharedscripts
    postrotate
        systemctl restart suricata.service
    endscript
}
</code></pre>
<h2 id="heading-how-to-help-suricata-to-do-its-job-using-emerging-threats-rules"><strong>How to help Suricata to do its job using emerging threats rules</strong></h2>
<p>We can tune Suricata using the <a target="_blank" href="https://rules.emergingthreats.net/OPEN_download_instructions.html">ET OPEN Ruleset</a>. Because threats change all the time, you need to automate <a target="_blank" href="https://github.com/OISF/suricata-update#suricata-update">their download and updating</a>.</p>
<p>So install it first:</p>
<pre><code class="lang-shell">sudo -i
python3 -m venv ~/virtualenv/suricata
. ~/virtualenv/suricata/bin/activate
pip install --upgrade pip
pip install --upgrade suricata-update
suricata-update
# Also, install jq so we can see the contents of the eve.json file nicely formatted
apt-get install jq
</code></pre>
<p>Let's run it by hand and see how the rules are updated by the tool:</p>
<p><a target="_blank" href="https://asciinema.org/a/487861"><img src="https://asciinema.org/a/487861.svg" alt="asciicast" width="1625.70333269" height="821.3331280000001" loading="lazy"></a></p>
<p>For our home network, we will download these rules once a day. A <a target="_blank" href="https://en.wikipedia.org/wiki/Cron">simple Cron job</a> will do the trick:</p>
<pre><code class="lang-shell">crontab -e
# Run Suricata update once a day, 
# per https://rules.emergingthreats.net/OPEN_download_instructions.html
# Also will update at a different time than the log rotation, to avoid a race condition
# while rotating the logs. Note that we do not need to restart Suricata
# Fields: minute hour day-of-month month day-of-week (here: 00:30 every day)
30 0 * * * . ~/virtualenv/suricata/bin/activate &amp;&amp; suricata-update &amp;&amp; suricatasc -c reload-rules
</code></pre>
<p>Let's start Suricata again, so we can test some rules:</p>
<p><a target="_blank" href="https://asciinema.org/a/487868"><img src="https://asciinema.org/a/487868.svg" alt="asciicast" width="1625.70333269" height="802.666466" loading="lazy"></a></p>
<h2 id="heading-what-is-inside-the-varlogsuricataevejson-file"><strong>What is inside the /var/log/suricata/eve.json file?</strong></h2>
<p>The file packs quite a bit of information, which is <a target="_blank" href="https://suricata.readthedocs.io/en/suricata-6.0.0/output/eve/eve-json-format.html">described in detail</a> here:</p>
<pre><code class="lang-python">{<span class="hljs-string">"timestamp"</span>:<span class="hljs-string">"2022-04-15T20:52:05.026189+0000"</span>,<span class="hljs-string">"flow_id"</span>:<span class="hljs-number">1378250082748552</span>,<span class="hljs-string">"in_iface"</span>:<span class="hljs-string">"eth0"</span>,<span class="hljs-string">"event_type"</span>:<span class="hljs-string">"flow"</span>,<span class="hljs-string">"src_ip"</span>:<span class="hljs-string">"192.168.1.1"</span>,<span class="hljs-string">"src_port"</span>:<span class="hljs-number">59317</span>,<span class="hljs-string">"dest_ip"</span>:<span class="hljs-string">"239.255.255.250"</span>,<span class="hljs-string">"dest_port"</span>:<span class="hljs-number">1900</span>,<span class="hljs-string">"proto"</span>:<span class="hljs-string">"UDP"</span>,<span class="hljs-string">"app_proto"</span>:<span class="hljs-string">"failed"</span>,<span class="hljs-string">"flow"</span>:{<span class="hljs-string">"pkts_toserver"</span>:<span class="hljs-number">1</span>,<span class="hljs-string">"pkts_toclient"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"bytes_toserver"</span>:<span class="hljs-number">378</span>,<span class="hljs-string">"bytes_toclient"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"start"</span>:<span class="hljs-string">"2022-04-15T20:50:32.264328+0000"</span>,<span class="hljs-string">"end"</span>:<span class="hljs-string">"2022-04-15T20:50:32.264328+0000"</span>,<span class="hljs-string">"age"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"state"</span>:<span class="hljs-string">"new"</span>,<span class="hljs-string">"reason"</span>:<span class="hljs-string">"timeout"</span>,<span class="hljs-string">"alerted"</span>:false}}
{<span class="hljs-string">"timestamp"</span>:<span class="hljs-string">"2022-04-15T20:52:05.026418+0000"</span>,<span class="hljs-string">"flow_id"</span>:<span class="hljs-number">2222739437411106</span>,<span class="hljs-string">"in_iface"</span>:<span class="hljs-string">"eth0"</span>,<span class="hljs-string">"event_type"</span>:<span class="hljs-string">"flow"</span>,<span class="hljs-string">"src_ip"</span>:<span class="hljs-string">"192.168.1.1"</span>,<span class="hljs-string">"src_port"</span>:<span class="hljs-number">60890</span>,<span class="hljs-string">"dest_ip"</span>:<span class="hljs-string">"239.255.255.250"</span>,<span class="hljs-string">"dest_port"</span>:<span class="hljs-number">1900</span>,<span class="hljs-string">"proto"</span>:<span class="hljs-string">"UDP"</span>,<span class="hljs-string">"app_proto"</span>:<span class="hljs-string">"failed"</span>,<span class="hljs-string">"flow"</span>:{<span class="hljs-string">"pkts_toserver"</span>:<span class="hljs-number">1</span>,<span class="hljs-string">"pkts_toclient"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"bytes_toserver"</span>:<span class="hljs-number">376</span>,<span class="hljs-string">"bytes_toclient"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"start"</span>:<span class="hljs-string">"2022-04-15T20:50:32.482082+0000"</span>,<span class="hljs-string">"end"</span>:<span class="hljs-string">"2022-04-15T20:50:32.482082+0000"</span>,<span class="hljs-string">"age"</span>:<span class="hljs-number">0</span>,<span class="hljs-string">"state"</span>:<span class="hljs-string">"new"</span>,<span class="hljs-string">"reason"</span>:<span class="hljs-string">"timeout"</span>,<span class="hljs-string">"alerted"</span>:false}}
</code></pre>
<p>If you are casually inspecting the contents of the file in real time, I suggest you use <a target="_blank" href="https://stedolan.github.io/jq/">jq</a> (test your filters on <a target="_blank" href="https://jqplay.org/">jqplay.org</a>) and show a few fields of interest:</p>
<p><a target="_blank" href="https://asciinema.org/a/487979"><img src="https://asciinema.org/a/487979.svg" alt="asciicast" width="1844.70999927" height="709.333156" loading="lazy"></a></p>
<p>Going forward we will focus on the alerts, so we can filter by that type of event:</p>
<pre><code class="lang-shell">jq 'select(.event_type=="alert")' /var/log/suricata/eve.json
</code></pre>
<p>The Suricata folks have <a target="_blank" href="https://suricata.readthedocs.io/en/suricata-6.0.0/output/eve/eve-json-examplesjq.html">put together a nice page with examples</a> that you should check out.</p>
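<p>If you would rather do the same filtering from a script, here is a minimal Python sketch of the <code>select(.event_type=="alert")</code> filter. It assumes one JSON document per line, as in the samples above. The function name <code>select_alerts</code> and the trimmed sample events are made up for illustration.</p>

```python
import json
from typing import Iterable, Iterator


def select_alerts(lines: Iterable[str]) -> Iterator[dict]:
    """Yield only the events whose event_type is 'alert'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            # Assumption: a partially written last line is skipped, not fatal
            continue
        if event.get("event_type") == "alert":
            yield event


# Two trimmed-down events, modeled on the eve.json samples shown in this article
sample = [
    '{"timestamp":"2022-04-15T20:52:05.026189+0000","event_type":"flow","src_ip":"192.168.1.1"}',
    '{"timestamp":"2022-04-16T17:52:23.037637+0000","event_type":"alert","src_ip":"172.16.0.149"}',
]
alerts = list(select_alerts(sample))
print(len(alerts), alerts[0]["src_ip"])
```

<p>To run it against the real file, you could pass an open file handle (for example <code>open("/var/log/suricata/eve.json")</code>) instead of the sample list.</p>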
<h2 id="heading-how-to-test-suricata-installation"><strong>How to test Suricata installation</strong></h2>
<h3 id="heading-tools-of-the-trade-wireshark-tcpreplay-and-pcap-files"><strong>Tools of the trade: Wireshark, tcpreplay, and PCAP files</strong></h3>
<p>We will use some traffic capture files, in <a target="_blank" href="https://tools.ietf.org/id/draft-gharris-opsawg-pcap-00.html">PCAP</a> format. So what is a PCAP file?</p>
<blockquote>
<p>In the late 1980's, Van Jacobson, Steve McCanne, and others at the Network Research Group at Lawrence Berkeley National Laboratory developed the <a target="_blank" href="https://www.tcpdump.org/">tcpdump</a> program to capture and dissect network traces.</p>
<p>The code to capture traffic, using low-level mechanisms in various operating systems, and to read and write network traces to a file was later put into a library named libpcap.</p>
</blockquote>
<p>And we will use a tool to inspect the contents of the PCAP file. <a target="_blank" href="https://www.wireshark.org/">Wireshark</a> is a powerful traffic analysis tool, and we will use <a target="_blank" href="https://tcpreplay.appneta.com/">tcpreplay</a> to trigger the Suricata alerts by playing a PCAP file with suspicious activity:</p>
<pre><code class="lang-shell"># On Ubuntu, Debian: sudo apt-get install wireshark tcpreplay
sudo dnf install -y wireshark tcpreplay
</code></pre>
<p>The best way to learn how the bad actors operate is to see their footprints. You should definitely head to <a target="_blank" href="https://www.malware-traffic-analysis.net/">https://www.malware-traffic-analysis.net/</a> and download some samples or, even better, <a target="_blank" href="https://www.malware-traffic-analysis.net/training-exercises.html">practice</a> with their PCAP analysis exercises.</p>
<h3 id="heading-warning-you-will-be-downloading-files-that-are-dangerous"><strong>WARNING</strong>: You will be downloading files that are dangerous:</h3>
<blockquote>
<p>Use this website at your own risk! If you download or use of any information from this website, you assume complete responsibility for any resulting loss or damage.</p>
</blockquote>
<p>So be careful and responsible when using these network traffic captures.</p>
<h4 id="heading-no-rules-are-enabled-by-default"><strong>No rules are enabled by default?</strong></h4>
<p>How can we check if that is the case? I'll show you next:</p>
<p><a target="_blank" href="https://asciinema.org/a/488000"><img src="https://asciinema.org/a/488000.svg" alt="asciicast" width="1844.70999927" height="933.3331000000001" loading="lazy"></a></p>
<p>Once you enable the rules <code>(suricata-update list-sources --free; suricata-update enable-source source; suricata-update list-enabled-sources)</code> you can tell Suricata to reload the rules without a restart:</p>
<pre><code class="lang-shell">root@raspberrypi:~# suricatasc -c reload-rules
{"message": "done", "return": "OK"}
</code></pre>
<h3 id="heading-2022-02-23-traffic-analysis-exercise-sunnystation"><strong>2022-02-23 - TRAFFIC ANALYSIS EXERCISE - SUNNYSTATION</strong></h3>
<p>Let's see if we can trigger Suricata using this specific threat (it is relatively new).</p>
<p>Start by downloading <a target="_blank" href="https://www.malware-traffic-analysis.net/2022/02/23/2022-02-23-traffic-analysis-exercise.pcap.zip">2022-02-23-traffic-analysis-exercise.pcap.zip</a> (the password is on the site's about page).</p>
<pre><code class="lang-shell">insta_dir="$HOME/Downloads/malware"
mkdir --parents --verbose "$insta_dir"
url="https://www.malware-traffic-analysis.net/2022/02/23/2022-02-23-traffic-analysis-exercise.pcap.zip"
exercise=$(basename "$url")
curl --fail --location --output "$insta_dir/$exercise" "$url"
# Be ready to put the password :-)
cd "$insta_dir" &amp;&amp; unzip "$exercise"
</code></pre>
<p>What is inside? We can check with <code>capinfos</code> to get some insight on the file we just downloaded:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 malware]$ capinfos 2022-02-23-traffic-analysis-exercise.pcap
File name:           2022-02-23-traffic-analysis-exercise.pcap
File type:           Wireshark/tcpdump/... - pcap
File encapsulation:  Ethernet
File timestamp precision:  microseconds (6)
Packet size limit:   file hdr: 65535 bytes
Number of packets:   30k
File size:           19MB
Data size:           19MB
Capture duration:    2680.736661 seconds
First packet time:   2022-02-23 13:22:24.405139
Last packet time:    2022-02-23 14:07:05.141800
Data byte rate:      7,191 bytes/s
Data bit rate:       57kbps
Average packet size: 642.09 bytes
Average packet rate: 11 packets/s
SHA256:              eefc7e61b50e7846f5a3282d7645539d7b2b4b85aa08a09d0b823896c9449d1f
RIPEMD160:           a8d84d262e37563c179e9ca52cdc6aae271efd9c
SHA1:                fdfa0d0edfe0cbcc0c1400fbe6ac61ff40942755
Strict time order:   True
Number of interfaces in file: 1
Interface #0 info:
                     Encapsulation = Ethernet (1 - ether)
                     Capture length = 65535
                     Time precision = microseconds (6)
                     Time ticks per second = 1000000
                     Number of stat entries = 0
                     Number of packets = 30023
</code></pre>
<p>We will use a small wrapper (<code>scripts/replay_pcap_file.sh</code> in the SuricataLog project) around <code>tcpreplay</code> to replay our PCAP file:</p>
<pre><code class="lang-shell">#!/bin/bash
:&lt;&lt;DOC
Script to replay a PCAP file at accelerated pace on the default network interface
Author: Jose Vicente Nunez (kodegeek.com@protonmail.com)
DOC
default_dev=$(ip route show| grep default| sort -n -k 9| head -n 1| cut -f5 -d' ')|| exit 100
if [ "$(id --name --user)" != "root" ]; then
  echo "ERROR: I need to be root to inject the PCAP contents into '$default_dev'"
  echo "Maybe 'sudo $0 $*'?"
  exit 100
fi
for util in tcpreplay ip; do
  if ! type -p $util &gt; /dev/null 2&gt;&amp;1; then
    echo "Please put $util on the PATH and try again!"
    exit 100
  fi
done
:&lt;&lt;DOC
We may have more than one 'default' route, so we sort by priority and pick the one with the
preferred metric:
default via 192.168.1.1 dev eno1 proto dhcp metric 100 &lt;----- PICK ME!!!
default via 192.168.1.1 dev wlp4s0 proto dhcp metric 600
DOC
for pcap in "$@"; do
  if [ -f "$pcap" ]; then
    if ! tcpreplay --stats 5 --intf1 "$default_dev" --multiplier 24 "$pcap"; then
      echo "ERROR: Will not try to replay any pending PCAP files due to previous errors"
      exit 100
    fi
  fi
done
</code></pre>
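<p>The route selection in the script is compact, so here is the same idea spelled out as a rough Python sketch: among the <code>default</code> routes, pick the device with the lowest metric. The function name <code>pick_default_device</code> is hypothetical, and treating a route without an explicit metric as metric 0 is my assumption. The sample text mirrors the routes quoted in the script comments.</p>

```python
from typing import Optional


def pick_default_device(route_output: str) -> Optional[str]:
    """Return the device of the 'default' route with the lowest metric."""
    candidates = []  # list of (metric, device) tuples
    for line in route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default" or "dev" not in fields:
            continue
        dev_idx = fields.index("dev")
        if dev_idx + 1 == len(fields):
            continue  # malformed line: 'dev' without a device name
        metric = 0  # assumption: no explicit metric means highest priority
        if "metric" in fields[:-1]:
            metric = int(fields[fields.index("metric") + 1])
        candidates.append((metric, fields[dev_idx + 1]))
    if not candidates:
        return None
    return min(candidates)[1]


# Text in the format printed by 'ip route show'
sample_routes = """\
default via 192.168.1.1 dev wlp4s0 proto dhcp metric 600
default via 192.168.1.1 dev eno1 proto dhcp metric 100
192.168.1.0/24 dev eno1 proto kernel scope link src 192.168.1.11 metric 100
"""
print(pick_default_device(sample_routes))
```

<p>With the sample routes, the wired interface <code>eno1</code> wins because its metric (100) is lower than the wireless one (600).</p>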
<p>Let it replay until it reaches the end of the file:</p>
<pre><code class="lang-shell">root@raspberrypi:~# tcpreplay --stats 5 --intf1 eth0 --multiplier 24 ~josevnz/Downloads/malware/2022-02-23-traffic-analysis-exercise.pcap 
Test start: 2022-04-16 17:51:40.673394 ...
Actual: 3783 packets (1075843 bytes) sent in 5.03 seconds
Rated: 213624.5 Bps, 1.70 Mbps, 751.17 pps
Actual: 6959 packets (3325918 bytes) sent in 10.04 seconds
Rated: 331191.4 Bps, 2.64 Mbps, 692.96 pps
Actual: 8627 packets (4464002 bytes) sent in 15.14 seconds
Rated: 294744.2 Bps, 2.35 Mbps, 569.61 pps
Actual: 10975 packets (6331901 bytes) sent in 20.21 seconds
Rated: 313180.5 Bps, 2.50 Mbps, 542.83 pps
Actual: 13148 packets (7870783 bytes) sent in 25.26 seconds
Rated: 311561.9 Bps, 2.49 Mbps, 520.45 pps
Actual: 14500 packets (8612630 bytes) sent in 30.43 seconds
...
Actual: 24467 packets (14960314 bytes) sent in 110.83 seconds
Rated: 134978.5 Bps, 1.07 Mbps, 220.75 pps
Test complete: 2022-04-16 17:53:33.735188
Actual: 30023 packets (19277433 bytes) sent in 113.06 seconds
Rated: 170503.5 Bps, 1.36 Mbps, 265.54 pps
Statistics for network device: eth0
    Successful packets:        30023
    Failed packets:            0
    Truncated packets:         0
    Retried packets (ENOBUFS): 0
    Retried packets (EAGAIN):  0
</code></pre>
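<p>As a sanity check on <code>--multiplier 24</code>, we can compare the replay time with what the capture statistics predict. This is only back-of-the-envelope arithmetic using the numbers printed by <code>capinfos</code> and <code>tcpreplay</code> above. The variable names are mine.</p>

```python
# Numbers copied from the capinfos and tcpreplay output shown above
capture_seconds = 2680.736661   # "Capture duration"
packets = 30023                 # "Number of packets"
multiplier = 24                 # tcpreplay --multiplier
observed_replay_seconds = 113.06

# Replaying 24 times faster should take 1/24 of the original duration
expected_replay_seconds = capture_seconds / multiplier
expected_pps = packets / expected_replay_seconds
overhead_seconds = observed_replay_seconds - expected_replay_seconds

print(f"expected replay time: {expected_replay_seconds:.1f} s")
print(f"expected packet rate: {expected_pps:.1f} pps")
print(f"overhead vs. observed: {overhead_seconds:.2f} s")
```

<p>The prediction is a little under 112 seconds, so the observed 113.06 seconds is in line with a 24x speed-up of the roughly 45 minute capture.</p>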
<p>And eventually we get a few alerts:</p>
<pre><code class="lang-shell">"2022-04-16T17:52:20.134763+0000,dns,1296231906414153,172.16.0.170:53806,172.16.0.52:53"
"2022-04-16T17:52:20.286785+0000,dns,293726410006593,172.16.0.170:50935,172.16.0.52:53"
"2022-04-16T17:52:20.290084+0000,dns,293726410006593,172.16.0.170:50935,172.16.0.52:53"
"2022-04-16T17:52:20.520858+0000,alert,1626224981242326,172.16.0.149:49795,172.16.0.52:139"
"2022-04-16T17:52:21.784804+0000,alert,1992149752477936,172.16.0.149:49796,172.16.0.52:139"
"2022-04-16T17:52:22.142041+0000,flow,1739064507071469,172.16.0.149:5353,224.0.0.251:5353"
"2022-04-16T17:52:22.351091+0000,dns,2078727703255923,172.16.0.149:51367,172.16.0.52:53"
"2022-04-16T17:52:22.351260+0000,dns,181632058678300,172.16.0.149:64943,172.16.0.52:53"
"2022-04-16T17:52:22.351129+0000,dns,2078727703255923,172.16.0.149:51367,172.16.0.52:53"
"2022-04-16T17:52:23.037637+0000,alert,282956779721256,172.16.0.149:49798,172.16.0.52:139"
"2022-04-16T17:52:23.901721+0000,dns,556717995180633,172.16.0.170:51164,172.16.0.52:53"
"2022-04-16T17:52:23.904764+0000,dns,556717995180633,172.16.0.170:51164,172.16.0.52:53"
"2022-04-16T17:52:24.293356+0000,alert,2006941620009246,172.16.0.149:49799,172.16.0.52:139"
"2022-04-16T17:52:25.322102+0000,dns,1671081620007478,172.16.0.170:51909,172.16.0.52:53"
</code></pre>
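<p>Each line above has the shape <code>timestamp,event_type,flow_id,source,destination</code>. As a hedged illustration (this is not necessarily how the listing was produced), a similar one-line summary could be built in Python from a parsed eve.json event. The <code>summarize</code> helper and the <code>REQUIRED</code> tuple are hypothetical names, and the sample event is trimmed from the alert shown in this article.</p>

```python
from typing import Optional

# Fields needed to build one summary line; events lacking any of them are skipped
REQUIRED = ("timestamp", "event_type", "flow_id",
            "src_ip", "src_port", "dest_ip", "dest_port")


def summarize(event: dict) -> Optional[str]:
    """Return 'timestamp,event_type,flow_id,src:port,dest:port' or None."""
    if not all(key in event for key in REQUIRED):
        return None  # e.g. 'stats' events carry no addresses or ports
    return ",".join([
        event["timestamp"],
        event["event_type"],
        str(event["flow_id"]),
        f'{event["src_ip"]}:{event["src_port"]}',
        f'{event["dest_ip"]}:{event["dest_port"]}',
    ])


# Trimmed copy of one alert event from this article
alert = {
    "timestamp": "2022-04-16T17:52:23.037637+0000",
    "flow_id": 282956779721256,
    "event_type": "alert",
    "src_ip": "172.16.0.149",
    "src_port": 49798,
    "dest_ip": "172.16.0.52",
    "dest_port": 139,
}
print(summarize(alert))
```

<p>Returning <code>None</code> for incomplete events keeps the caller in charge of whether to skip or report them.</p>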
<p>For the sake of example, let's zoom in on the alert with flow id '282956779721256':</p>
<pre><code class="lang-python">// root@raspberrypi:~<span class="hljs-comment"># grep 282956779721256 /var/log/suricata/eve.json|jq</span>
{
  <span class="hljs-string">"timestamp"</span>: <span class="hljs-string">"2022-04-16T17:52:23.037637+0000"</span>,
  <span class="hljs-string">"flow_id"</span>: <span class="hljs-number">282956779721256</span>,
  <span class="hljs-string">"in_iface"</span>: <span class="hljs-string">"eth0"</span>,
  <span class="hljs-string">"event_type"</span>: <span class="hljs-string">"alert"</span>,
  <span class="hljs-string">"src_ip"</span>: <span class="hljs-string">"172.16.0.149"</span>,
  <span class="hljs-string">"src_port"</span>: <span class="hljs-number">49798</span>,
  <span class="hljs-string">"dest_ip"</span>: <span class="hljs-string">"172.16.0.52"</span>,
  <span class="hljs-string">"dest_port"</span>: <span class="hljs-number">139</span>,
  <span class="hljs-string">"proto"</span>: <span class="hljs-string">"TCP"</span>,
  <span class="hljs-string">"metadata"</span>: {
    <span class="hljs-string">"flowints"</span>: {
      <span class="hljs-string">"applayer.anomaly.count"</span>: <span class="hljs-number">1</span>
    }
  },
  <span class="hljs-string">"alert"</span>: {
    <span class="hljs-string">"action"</span>: <span class="hljs-string">"allowed"</span>,
    <span class="hljs-string">"gid"</span>: <span class="hljs-number">1</span>,
    <span class="hljs-string">"signature_id"</span>: <span class="hljs-number">2260002</span>,
    <span class="hljs-string">"rev"</span>: <span class="hljs-number">1</span>,
    <span class="hljs-string">"signature"</span>: <span class="hljs-string">"SURICATA Applayer Detect protocol only one direction"</span>,
    <span class="hljs-string">"category"</span>: <span class="hljs-string">"Generic Protocol Command Decode"</span>,
    <span class="hljs-string">"severity"</span>: <span class="hljs-number">3</span>
  },
  <span class="hljs-string">"smb"</span>: {
    <span class="hljs-string">"id"</span>: <span class="hljs-number">1</span>,
    <span class="hljs-string">"dialect"</span>: <span class="hljs-string">"NT LM 0.12"</span>,
    <span class="hljs-string">"command"</span>: <span class="hljs-string">"SMB1_COMMAND_NEGOTIATE_PROTOCOL"</span>,
    <span class="hljs-string">"status"</span>: <span class="hljs-string">"STATUS_SUCCESS"</span>,
    <span class="hljs-string">"status_code"</span>: <span class="hljs-string">"0x0"</span>,
    <span class="hljs-string">"session_id"</span>: <span class="hljs-number">0</span>,
    <span class="hljs-string">"tree_id"</span>: <span class="hljs-number">0</span>,
    <span class="hljs-string">"client_dialects"</span>: [
      <span class="hljs-string">"PC NETWORK PROGRAM 1.0"</span>,
      <span class="hljs-string">"LANMAN1.0"</span>,
      <span class="hljs-string">"Windows for Workgroups 3.1a"</span>,
      <span class="hljs-string">"LM1.2X002"</span>,
      <span class="hljs-string">"LANMAN2.1"</span>,
      <span class="hljs-string">"NT LM 0.12"</span>
    ],
    <span class="hljs-string">"server_guid"</span>: <span class="hljs-string">"a21b9552-a4a0-48cd-8abb-ea111498253d"</span>
  },
  <span class="hljs-string">"app_proto"</span>: <span class="hljs-string">"smb"</span>,
  <span class="hljs-string">"app_proto_ts"</span>: <span class="hljs-string">"failed"</span>,
  <span class="hljs-string">"flow"</span>: {
    <span class="hljs-string">"pkts_toserver"</span>: <span class="hljs-number">4</span>,
    <span class="hljs-string">"pkts_toclient"</span>: <span class="hljs-number">3</span>,
    <span class="hljs-string">"bytes_toserver"</span>: <span class="hljs-number">579</span>,
    <span class="hljs-string">"bytes_toclient"</span>: <span class="hljs-number">387</span>,
    <span class="hljs-string">"start"</span>: <span class="hljs-string">"2022-04-16T17:52:23.037416+0000"</span>
  },
  <span class="hljs-string">"payload"</span>: <span class="hljs-string">"AAAAiv9TTUJzAAAAABgHyAAAQlNSU1BZTCAAAP////4AAEAADP8AAAAEQTIAAAAAAAAASgAAAAAA1AAAoE8AYEgGBisGAQUFAqA+MDygDjAMBgorBgEEAYI3AgIKoioEKE5UTE1TU1AAAQAAAJeCCOIAAAAAAAAAAAAAAAAAAAAACgBhSgAAAA8AAAAAAA=="</span>,
  <span class="hljs-string">"payload_printable"</span>: <span class="hljs-string">".....SMBs.........BSRSPYL ........@.......A2.......J.........O.`H..+......&gt;0&lt;..0..\n+.....7..\n.*.(NTLMSSP.........................\n.aJ........."</span>,
  <span class="hljs-string">"stream"</span>: <span class="hljs-number">0</span>,
  <span class="hljs-string">"packet"</span>: <span class="hljs-string">"AB5PDqh0ABv8e9HACABFAAC2t+tAAIAG6WysEACVrBAANMKGAIthfGQf7GIEdVAYIBP6YwAAAAAAiv9TTUJzAAAAABgHyAAAQlNSU1BZTCAAAP////4AAEAADP8AAAAEQTIAAAAAAAAASgAAAAAA1AAAoE8AYEgGBisGAQUFAqA+MDygDjAMBgorBgEEAYI3AgIKoioEKE5UTE1TU1AAAQAAAJeCCOIAAAAAAAAAAAAAAAAAAAAACgBhSgAAAA8AAAAAAA=="</span>,
  <span class="hljs-string">"packet_info"</span>: {
    <span class="hljs-string">"linktype"</span>: <span class="hljs-number">1</span>
  },
  <span class="hljs-string">"host"</span>: <span class="hljs-string">"raspberrypi"</span>
}
{
  <span class="hljs-string">"timestamp"</span>: <span class="hljs-string">"2022-04-16T17:55:42.050329+0000"</span>,
  <span class="hljs-string">"flow_id"</span>: <span class="hljs-number">282956779721256</span>,
  <span class="hljs-string">"in_iface"</span>: <span class="hljs-string">"eth0"</span>,
  <span class="hljs-string">"event_type"</span>: <span class="hljs-string">"flow"</span>,
  <span class="hljs-string">"src_ip"</span>: <span class="hljs-string">"172.16.0.149"</span>,
  <span class="hljs-string">"src_port"</span>: <span class="hljs-number">49798</span>,
  <span class="hljs-string">"dest_ip"</span>: <span class="hljs-string">"172.16.0.52"</span>,
  <span class="hljs-string">"dest_port"</span>: <span class="hljs-number">139</span>,
  <span class="hljs-string">"proto"</span>: <span class="hljs-string">"TCP"</span>,
  <span class="hljs-string">"app_proto"</span>: <span class="hljs-string">"smb"</span>,
  <span class="hljs-string">"app_proto_ts"</span>: <span class="hljs-string">"failed"</span>,
  <span class="hljs-string">"flow"</span>: {
    <span class="hljs-string">"pkts_toserver"</span>: <span class="hljs-number">13</span>,
    <span class="hljs-string">"pkts_toclient"</span>: <span class="hljs-number">12</span>,
    <span class="hljs-string">"bytes_toserver"</span>: <span class="hljs-number">1743</span>,
    <span class="hljs-string">"bytes_toclient"</span>: <span class="hljs-number">1963</span>,
    <span class="hljs-string">"start"</span>: <span class="hljs-string">"2022-04-16T17:52:23.037416+0000"</span>,
    <span class="hljs-string">"end"</span>: <span class="hljs-string">"2022-04-16T17:52:23.488633+0000"</span>,
    <span class="hljs-string">"age"</span>: <span class="hljs-number">0</span>,
    <span class="hljs-string">"state"</span>: <span class="hljs-string">"closed"</span>,
    <span class="hljs-string">"reason"</span>: <span class="hljs-string">"timeout"</span>,
    <span class="hljs-string">"alerted"</span>: true
  },
  <span class="hljs-string">"metadata"</span>: {
    <span class="hljs-string">"flowbits"</span>: [
      <span class="hljs-string">"smb.tree.connect.ipc"</span>
    ],
    <span class="hljs-string">"flowints"</span>: {
      <span class="hljs-string">"applayer.anomaly.count"</span>: <span class="hljs-number">1</span>
    }
  },
  <span class="hljs-string">"tcp"</span>: {
    <span class="hljs-string">"tcp_flags"</span>: <span class="hljs-string">"1b"</span>,
    <span class="hljs-string">"tcp_flags_ts"</span>: <span class="hljs-string">"1b"</span>,
    <span class="hljs-string">"tcp_flags_tc"</span>: <span class="hljs-string">"1b"</span>,
    <span class="hljs-string">"syn"</span>: true,
    <span class="hljs-string">"fin"</span>: true,
    <span class="hljs-string">"psh"</span>: true,
    <span class="hljs-string">"ack"</span>: true,
    <span class="hljs-string">"state"</span>: <span class="hljs-string">"closed"</span>
  },
  <span class="hljs-string">"host"</span>: <span class="hljs-string">"raspberrypi"</span>
}
</code></pre>
<p>That's quite a bit to process. Keep in mind that while we are tuning Suricata, we can also ask it to replay one or more PCAP files directly.</p>
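<p>To make a record like the ones above easier to scan, here is a minimal sketch that condenses one alert into a single line and decodes the start of its base64 <code>payload</code> field. The helper names <code>printable</code> and <code>summarize_alert</code> are hypothetical, and the dot substitution only approximates what Suricata writes to <code>payload_printable</code>.</p>

```python
import base64
from typing import Any, Dict


def printable(raw: bytes) -> str:
    """Replace every byte outside the printable ASCII range with a dot."""
    return "".join(chr(b) if 32 <= b <= 126 else "." for b in raw)


def summarize_alert(event: Dict[str, Any]) -> str:
    """Build a one-line summary from a single eve.json alert record."""
    alert = event.get("alert", {})
    # The payload field is base64 encoded; only the first bytes are shown.
    payload = base64.b64decode(event.get("payload", ""))
    return (
        f"{event.get('timestamp', '?')} "
        f"{event.get('src_ip')}:{event.get('src_port')} -> "
        f"{event.get('dest_ip')}:{event.get('dest_port')} "
        f"[{alert.get('severity')}] {alert.get('signature')} "
        f"payload={printable(payload[:16])}"
    )
```

<p>Feeding it each parsed line of eve.json gives a quick overview before you decide which full record deserves a closer look.</p>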
<h4 id="heading-ask-suricata-to-run-in-offline-mode-using-pcap-file-for-sunnystation"><strong>Ask Suricata to run in offline mode using PCAP file for SUNNYSTATION</strong></h4>
<p>This is a very convenient way to test the rules: we do not inject any traffic into our network and instead let Suricata 'ingest' the contents of the PCAP file directly.</p>
<p>Also, we redirect the logs to a separate location (by default the directory where you are running the 'offline' mode), so we don't pollute a live installation.</p>
<p><a target="_blank" href="https://asciinema.org/a/SH8bo3pjpvRt4H617GoHbPdoK"><img src="https://asciinema.org/a/SH8bo3pjpvRt4H617GoHbPdoK.svg" alt="asciicast" width="1844.70999927" height="933.3331000000001" loading="lazy"></a></p>
<h3 id="heading-another-example-emotet-with-cobalt-strike"><strong>Another example: EMOTET WITH COBALT STRIKE</strong></h3>
<p>Let's try another malware capture, in this case 2022-02-08 (TUESDAY) - <a target="_blank" href="https://www.malware-traffic-analysis.net/2022/02/08/index.html">FILES FOR AN ISC DIARY (EMOTET WITH COBALT STRIKE)</a>:</p>
<pre><code class="lang-shell">cd ~/Downloads/malware/ &amp;&amp; \
curl --remote-name https://www.malware-traffic-analysis.net/2022/02/08/2022-02-08-Emotet-epoch4-infection-start-and-spambot-traffic.pcap.zip &amp;&amp; \
unzip 2022-02-08-Emotet-epoch4-infection-start-and-spambot-traffic.pcap.zip &amp;&amp; \
sudo suricata -r ~josevnz/Downloads/malware/2022-02-08-Emotet-epoch4-infection-start-and-spambot-traffic.pcap -k none --runmode autofp -c /etc/suricata/suricata.yaml -l ~josevnz/Downloads/malware/
</code></pre>
<p>Here is a sample session:</p>
<p><a target="_blank" href="https://asciinema.org/a/488035"><img src="https://asciinema.org/a/488035.svg" alt="asciicast" width="1844.70999927" height="933.3331000000001" loading="lazy"></a></p>
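<p>If you replay captures often, the flags from the command above can be kept in one place. This is only a sketch: <code>offline_command</code> is a hypothetical helper that builds the argument list and does not run anything, and the default configuration path simply mirrors the one used in this tutorial.</p>

```python
from pathlib import Path
from typing import List


def offline_command(
        pcap: Path,
        log_dir: Path,
        config: Path = Path("/etc/suricata/suricata.yaml")
) -> List[str]:
    """Return the argument list for replaying one PCAP file offline."""
    return [
        "suricata",
        "-r", str(pcap),         # read packets from the capture file
        "-k", "none",            # same checksum setting as the command above
        "--runmode", "autofp",
        "-c", str(config),
        "-l", str(log_dir),      # keep logs away from the live installation
    ]
```

<p>The resulting list can be handed to <code>subprocess.run()</code>, prefixed with <code>sudo</code> if your setup requires it.</p>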
<h1 id="heading-making-sense-of-all-the-alerts"><strong>Making Sense of All the Alerts</strong></h1>
<p>Suricata will save lots of details when it detects an anomaly. As you can tell, going through the alerts with jq quickly becomes impractical.</p>
<p>For a bigger setup, you may want to use an <a target="_blank" href="https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-module-suricata.html">Elastic Stack</a> (Filebeat, Logstash, Elasticsearch, Kibana):</p>
<ul>
<li><p>Get the logs</p>
</li>
<li><p>Store historically and normalize the logs</p>
</li>
<li><p>Visualize their contents</p>
</li>
</ul>
<p>But that feels overkill for a home setup, so I will roll out a few scripts to help me with what I need.</p>
<h2 id="heading-show-me-what-happened-in-the-last-10-minutes"><strong>Show me what happened in the last 10 minutes</strong></h2>
<p>This is a script that assumes most of the defaults, so I don't have to type a jq expression. If there are any alerts then I dive deeper into the eve.json file.</p>
<p>A simple Python 3 script will do the trick for us:</p>
<pre><code class="lang-python"><span class="hljs-comment">#!/usr/bin/env python3</span>
<span class="hljs-string">"""
Show Suricata alerts
Author: Jose Vicente Nunez (kodegeek.com@protonmail.com)
"""</span>
<span class="hljs-keyword">import</span> argparse
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime, timedelta
<span class="hljs-keyword">from</span> json <span class="hljs-keyword">import</span> JSONDecodeError
<span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Callable, Any, Dict

DEFAULT_EVE = [Path(<span class="hljs-string">"/var/log/suricata/eve.json"</span>)]
DEFAULT_TIMESTAMP_10M_AGO: datetime = datetime.now() - timedelta(minutes=<span class="hljs-number">10</span>)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_parse_timestamp</span>(<span class="hljs-params">candidate: str</span>) -&gt; datetime:</span>
    <span class="hljs-string">"""
    Expected something like 2022-02-08T16:32:14.900292+0000
    :param candidate:
    :return:
    """</span>
    <span class="hljs-keyword">if</span> isinstance(candidate, str):
        <span class="hljs-keyword">try</span>:
            iso_candidate = candidate.split(<span class="hljs-string">'+'</span>, <span class="hljs-number">1</span>)[<span class="hljs-number">0</span>]
            <span class="hljs-keyword">return</span> datetime.fromisoformat(iso_candidate)
        <span class="hljs-keyword">except</span> ValueError:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Invalid date passed: <span class="hljs-subst">{candidate}</span>"</span>)
    <span class="hljs-keyword">elif</span> isinstance(candidate, datetime):
        <span class="hljs-keyword">return</span> candidate


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">alert_filter</span>(<span class="hljs-params">
        *,
        timestamp: datetime = DEFAULT_TIMESTAMP_10M_AGO,
        data: Dict[str, Any]
</span>) -&gt; bool:</span>
    <span class="hljs-keyword">if</span> <span class="hljs-string">'event_type'</span> <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> data:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
    <span class="hljs-keyword">if</span> data[<span class="hljs-string">'event_type'</span>] != <span class="hljs-string">'alert'</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
    <span class="hljs-keyword">try</span>:
        event_timestamp = _parse_timestamp(data[<span class="hljs-string">'timestamp'</span>])
        <span class="hljs-keyword">if</span> event_timestamp &lt; timestamp:
            <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>  <span class="hljs-comment"># Older than the minimum timestamp, skip it</span>
    <span class="hljs-keyword">except</span> ValueError:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
    <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_alerts</span>(<span class="hljs-params">
        *,
        eve_files=None,
        row_filter: Callable = alert_filter,
        timestamp: datetime = DEFAULT_TIMESTAMP_10M_AGO
</span>) -&gt; str:</span>
    <span class="hljs-keyword">if</span> eve_files <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
        eve_files = DEFAULT_EVE
    <span class="hljs-keyword">for</span> eve_file <span class="hljs-keyword">in</span> eve_files:
        <span class="hljs-keyword">with</span> open(eve_file, <span class="hljs-string">'rt'</span>) <span class="hljs-keyword">as</span> eve:
            <span class="hljs-keyword">for</span> line <span class="hljs-keyword">in</span> eve:
                <span class="hljs-keyword">try</span>:
                    data = json.loads(line)
                    <span class="hljs-keyword">if</span> row_filter(data=data, timestamp=timestamp):
                        <span class="hljs-keyword">yield</span> data
                <span class="hljs-keyword">except</span> JSONDecodeError:
                    <span class="hljs-keyword">continue</span>  <span class="hljs-comment"># Try to read the next record</span>


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    PARSER = argparse.ArgumentParser(description=__doc__)
    PARSER.add_argument(
        <span class="hljs-string">"--timestamp"</span>,
        type=_parse_timestamp,
        default=DEFAULT_TIMESTAMP_10M_AGO,
        help=<span class="hljs-string">f"Minimum timestamp in the past to use when filtering events (<span class="hljs-subst">{DEFAULT_TIMESTAMP_10M_AGO}</span>)"</span>
    )
    PARSER.add_argument(
        <span class="hljs-string">'eve'</span>,
        type=Path,
        nargs=<span class="hljs-string">"+"</span>,
        help=<span class="hljs-string">f"Path to one or more <span class="hljs-subst">{DEFAULT_EVE[<span class="hljs-number">0</span>]}</span> file to parse."</span>
    )
    OPTIONS = PARSER.parse_args()
    <span class="hljs-keyword">try</span>:
        <span class="hljs-keyword">for</span> alert <span class="hljs-keyword">in</span> get_alerts(eve_files=OPTIONS.eve, timestamp=OPTIONS.timestamp):
            print(json.dumps(alert, indent=<span class="hljs-number">6</span>, sort_keys=<span class="hljs-literal">True</span>))
    <span class="hljs-keyword">except</span> KeyboardInterrupt:
        <span class="hljs-keyword">pass</span>
</code></pre>
<p>It is a big improvement over jq as at least we can filter by timestamp, but it would be nice if our script could do the following:</p>
<ol>
<li><p>Support pagination</p>
</li>
<li><p>Colorize output</p>
</li>
<li><p>Let you choose between a table format and raw JSON output</p>
</li>
</ol>
<p><a target="_blank" href="https://asciinema.org/a/488166"><img src="https://asciinema.org/a/488166.svg" alt="asciicast" width="1844.70999927" height="933.3331000000001" loading="lazy"></a></p>
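<p>As a rough idea of what the table view could look like, here is a minimal sketch that uses only the standard library. The column choice and the names <code>alert_row</code> and <code>format_table</code> are assumptions made for illustration, not the code of the finished tool.</p>

```python
from typing import Any, Dict, Iterable, List

COLUMNS = ("timestamp", "src_ip", "dest_ip", "signature")


def alert_row(event: Dict[str, Any]) -> List[str]:
    """Pick the fields of one alert that go into the table."""
    alert = event.get("alert", {})
    return [
        str(event.get("timestamp", "")),
        f"{event.get('src_ip', '')}:{event.get('src_port', '')}",
        f"{event.get('dest_ip', '')}:{event.get('dest_port', '')}",
        str(alert.get("signature", "")),
    ]


def format_table(events: Iterable[Dict[str, Any]]) -> str:
    """Render alerts as left-aligned, space-padded columns."""
    rows = [list(COLUMNS)] + [alert_row(event) for event in events]
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in rows) for i in range(len(COLUMNS))]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    ]
    return "\n".join(lines)
```

<p>Because <code>format_table</code> accepts any iterable, it can consume the generator returned by <code>get_alerts</code> directly. Pagination and colors are left out on purpose to keep the sketch short.</p>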
<h1 id="heading-what-did-we-learn-and-what-is-next"><strong>What Did We Learn and What is Next?</strong></h1>
<p>Suricata is a complex piece of software. It takes time to tame it and more time to make sense of the information it presents. But it is very rewarding to see how you can tackle a tool that will allow you to secure your network from threats.</p>
<ul>
<li><p><a target="_blank" href="https://www.youtube.com/c/OISFSuricata">The OISF Suricata YouTube channel</a> has many interesting resources about this tool and a thriving community.</p>
</li>
<li><p>Want to learn how to analyze PCAP files for bad traffic? <a target="_blank" href="https://www.malware-traffic-analysis.net/training-exercises.html">malware-traffic-analysis</a> has perfect material for you.</p>
</li>
<li><p><strong>Writing complex software is hard</strong>. For example, older versions of Snort are vulnerable to an <a target="_blank" href="https://claroty.com/2022/04/14/blog-research-blinding-snort-breaking-the-modbus-ot-preprocessor/">attack that can disable it, CVE-2022-20685</a>. Suricata also had <a target="_blank" href="https://nvd.nist.gov/vuln/detail/CVE-2019-1010279">CVE-2019-1010279</a>. These issues were fixed, but they illustrate the need to keep your software current, especially the software you use to protect your network.</p>
</li>
<li><p>I did not touch the IPS mode, or even hybrid modes for Suricata. Please read the official documentation to get up to speed.</p>
</li>
<li><p>Finally, do yourself a favor and read this <a target="_blank" href="https://resources.sei.cmu.edu/asset_files/Presentation/2016_017_001_449890.pdf">Suricata Tutorial from FloCon 2016</a>. It is very complete and will have you looking for more.</p>
</li>
</ul>
<p>You can leave your comments on the Git repository and report any bugs. But more importantly, get Suricata, <a target="_blank" href="https://github.com/josevnz/SuricataLog">get the code of this tutorial</a>, and start securing your home wireless infrastructure in no time.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Secure Your Home Wireless Infrastructure with Kismet and Python ]]>
                </title>
                <description>
                    <![CDATA[ Everything is connected to wireless these days. In my case I found that I have LOTS of devices after running a simple nmap command on my home network: [josevnz@dmaf5 ~]$ sudo nmap -v -n -p- -sT -sV -O --osscan-limit --max-os-tries 1 -oX $HOME/home_sc... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/wireless-security-using-raspberry-pi-4-kismet-and-python/</link>
                <guid isPermaLink="false">66d8514f4540581f6454412b</guid>
                
                    <category>
                        <![CDATA[ information security ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Raspberry Pi ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Security ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Wed, 02 Mar 2022 16:22:20 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2022/03/wireless_security_with_kismet_and_python.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Everything is connected to wireless these days. In my case I found that I have LOTS of devices after running a simple <a target="_blank" href="https://www.freecodecamp.org/news/enhance-nmap-with-python/#nmap-101-identify-all-the-public-services-in-our-network">nmap command on my home network</a>:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 ~]$ sudo nmap -v -n -p- -sT -sV -O --osscan-limit --max-os-tries 1 -oX $HOME/home_scan.xml 192.168.1.0/24
</code></pre>
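<p>The <code>-oX</code> flag writes the scan results as XML, so counting the devices does not require reading the console output. A minimal sketch, assuming the usual nmap XML layout where each host element carries a status and one or more address children (<code>hosts_up</code> is a hypothetical helper):</p>

```python
import xml.etree.ElementTree as ET
from typing import List


def hosts_up(nmap_xml: str) -> List[str]:
    """Return the IPv4 addresses of every host reported as up."""
    root = ET.fromstring(nmap_xml)
    found = []
    for host in root.iter("host"):
        status = host.find("status")
        if status is None or status.get("state") != "up":
            continue
        for address in host.findall("address"):
            # MAC addresses are listed too, keep only the IPv4 entries.
            if address.get("addrtype") == "ipv4":
                found.append(address.get("addr"))
    return found
```

<p>Passing it the text of <code>$HOME/home_scan.xml</code> and taking <code>len()</code> of the result gives the device count.</p>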
<p>So I started to wonder:</p>
<ul>
<li><p>Is my wireless network secure?</p>
</li>
<li><p>How long would it take an attacker to get in?</p>
</li>
</ul>
<p>I have a <em>Raspberry 4</em> with Ubuntu (focal) installed and decided to use the well-known <a target="_blank" href="https://www.kismetwireless.net/">Kismet</a> to find out.</p>
<p>In this article you will learn:</p>
<ul>
<li><p>How to get a whole picture of the networks near you with Kismet</p>
</li>
<li><p>How to customize Kismet using Python and the REST-API</p>
</li>
</ul>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/03/raspberrypi-wireless-setup-1.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>If you are curious, this is my home Raspberry PI 4, tiny monitor and all</em></p>
<h1 id="heading-table-of-contents">Table of contents</h1>
<ul>
<li><p><a class="post-section-overview" href="#heading-the-saying-ask-for-forgiveness-not-permission-doesnt-apply-here">The saying 'Ask for forgiveness, not permission' doesn't apply here</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-getting-to-know-your-hardware">Getting to know your hardware</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-is-kismet">What is Kismet?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-rest-api-in-python">REST-API in Python</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-did-we-learn">What did we learn?</a></p>
</li>
</ul>
<h1 id="heading-the-saying-ask-for-forgiveness-not-permission-doesnt-apply-here">The saying 'Ask for forgiveness, not permission' doesn't apply here</h1>
<p>And by that I mean that <em>you should not be trying to eavesdrop on or infiltrate a wireless network that is not yours</em>. It is relatively easy to detect when a new unknown client joins a wireless network, and breaking into someone else's network is also illegal.</p>
<p>So do the right thing – use this tutorial to learn and not to break into someone else's network, OK?</p>
<h1 id="heading-getting-to-know-your-hardware">Getting to know your hardware</h1>
<p>I will jump a little ahead to show you a small issue with the Raspberry 4 integrated Wireless interface.</p>
<p><strong>The Raspberry PI 4 onboard wireless card will not work out of the box</strong> as the firmware doesn't support monitor mode.</p>
<p>There is <a target="_blank" href="https://github.com/seemoo-lab/bcm-rpi3">ongoing work to support this</a>. Instead, I took the easy way out and ordered an external Wi-Fi dongle from <a target="_blank" href="https://www.canakit.com/raspberry-pi-wifi.html">CanaKit</a>.</p>
<p>The CanaKit wireless card worked out of the box, and we'll see it shortly. But first let's install and play around with Kismet.</p>
<h2 id="heading-make-sure-the-interface-is-running-in-monitor-mode">Make sure the interface is running in monitor mode</h2>
<p>By default, the network interface will have monitor mode off:</p>
<pre><code class="lang-shell">root@raspberrypi:~# iwconfig wlan1
wlan1     IEEE 802.11  ESSID:off/any  
          Mode:Managed  Access Point: Not-Associated   Tx-Power=0 dBm   
          Retry short  long limit:2   RTS thr:off   Fragment thr:off
          Encryption key:off
          Power Management:off
</code></pre>
<p>I know I will always set up my Ralink Technology, Corp. RT5370 Wireless Adapter in monitor mode, but I need to be careful as Ubuntu can swap wlan0 and wlan1 (The Broadcom adapter I want to skip is a PCI device).</p>
<p>The Ralink adapter is a USB adapter, so we can find out where it is:</p>
<pre><code class="lang-shell">josevnz@raspberrypi:/etc/netplan$ /bin/lsusb|grep Ralink
Bus 001 Device 004: ID 148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter
</code></pre>
<p>Now we need to find out which device was mapped to the Ralink adapter. With a little help from the Ubuntu community, I found that the Ralink adapter uses the rt2800usb driver (<a target="_blank" href="https://help.ubuntu.com/community/WifiDocs/Device/Ralink_RT5370">5370 Ralink Technology</a>).</p>
<p>The answer I seek is here:</p>
<pre><code class="lang-shell">josevnz@raspberrypi:~$ ls /sys/bus/usb/drivers/rt2800usb/*:1.0/net/
wlan1
</code></pre>
<p>So the code that does the wireless card detection looks like this:</p>
<pre><code class="lang-shell">root@raspberrypi:~# /bin/cat &lt;&lt;RC_LOCAL &gt;/etc/rc.local
#!/bin/bash
usb_driver=rt2800usb
wlan=\$(/bin/ls /sys/bus/usb/drivers/\$usb_driver/*/net/)
if [ \$? -eq 0 ]; then
        set -ex
        /usr/sbin/ifconfig "\$wlan" down
        /usr/sbin/iwconfig "\$wlan" mode monitor
        /usr/sbin/ifconfig "\$wlan" up
        set +ex
fi
RC_LOCAL
root@raspberrypi:~# chmod u+x /etc/rc.local &amp;&amp; shutdown -r now "Enabling monitor mode"
</code></pre>
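<p>The same sysfs lookup can be sketched in Python, which is handy if you later want to reuse it from a script. <code>interfaces_for_driver</code> is a hypothetical helper, and the <code>sysfs_root</code> parameter exists only so the function can be pointed at a test directory.</p>

```python
from pathlib import Path
from typing import List


def interfaces_for_driver(
        driver: str,
        sysfs_root: Path = Path("/sys/bus/usb/drivers")
) -> List[str]:
    """List the network interface names bound to a USB driver."""
    # Mirrors: ls /sys/bus/usb/drivers/DRIVER/*/net/
    return sorted(entry.name for entry in (sysfs_root / driver).glob("*/net/*"))
```

<p>On the Raspberry Pi, calling <code>interfaces_for_driver("rt2800usb")</code> should list the interface that belongs to the Ralink adapter, regardless of how Ubuntu numbered it.</p>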
<p>Make sure the card is in monitor mode:</p>
<pre><code class="lang-shell">root@raspberrypi:~# iwconfig wlan1
wlan1     IEEE 802.11  Mode:Monitor  Frequency:2.412 GHz  Tx-Power=20 dBm   
          Retry short  long limit:2   RTS thr:off   Fragment thr:off
          Power Management:off
</code></pre>
<p>Good, let's move on with the tool setup.</p>
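<p>If you prefer to check the mode from a script instead of reading the <code>iwconfig</code> output by eye, a small regular expression is enough. This is a sketch that assumes the output keeps the <code>Mode:</code> label shown above; <code>wireless_mode</code> is a hypothetical name.</p>

```python
import re
from typing import Optional

MODE_PATTERN = re.compile(r"Mode:(\w+)")


def wireless_mode(iwconfig_output: str) -> Optional[str]:
    """Extract the value after 'Mode:' from iwconfig output, if present."""
    match = MODE_PATTERN.search(iwconfig_output)
    return match.group(1) if match else None
```

<p>The text to parse can come from <code>subprocess.run(["iwconfig", "wlan1"], capture_output=True, text=True).stdout</code>.</p>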
<h1 id="heading-what-is-kismet">What is Kismet?</h1>
<p><a target="_blank" href="https://www.kismetwireless.net/">Kismet</a> is:</p>
<blockquote>
<p>a wireless network and device detector, sniffer, wardriving tool, and WIDS (wireless intrusion detection) framework.</p>
</blockquote>
<h2 id="heading-kismet-installation-and-setup">Kismet installation and setup</h2>
<p>The version that ships by default with Ubuntu for the Raspberry Pi is from 2016, <em>way too old</em>.</p>
<p>Instead, get an updated binary as <a target="_blank" href="https://www.kismetwireless.net/docs/readme/packages/">explained here</a> (I have Ubuntu focal, check with <code>lsb_release --all</code>).</p>
<pre><code class="lang-shell">wget -O - https://www.kismetwireless.net/repos/kismet-release.gpg.key | sudo apt-key add -
echo 'deb https://www.kismetwireless.net/repos/apt/release/focal focal main' | sudo tee /etc/apt/sources.list.d/kismet.list
sudo apt update
sudo apt install kismet
</code></pre>
<h3 id="heading-do-not-run-as-root-use-a-suid-binaryhttpsenwikipediaorgwikisetuid-and-a-unix-group-access">Do not run as root, use a <a target="_blank" href="https://en.wikipedia.org/wiki/Setuid">SUID binary</a> and a unix group access</h3>
<p>Kismet needs elevated privileges to run, and it deals with possibly hostile data. So running with minimized permissions is the safest approach.</p>
<p>The right way to set it up is by using a Unix group and set user id (<em>SUID</em>) binary. My user is 'josevnz' so I did this:</p>
<pre><code class="lang-python">sudo apt-get install kismet
sudo usermod --append --groups kismet josevnz
</code></pre>
<h3 id="heading-encrypt-your-access-to-kismet-with-a-self-signed-certificate">Encrypt your access to Kismet with a self-signed certificate</h3>
<p>I will enable SSL for my Kismet <a target="_blank" href="https://github.com/josevnz/home_nmap/tree/main/tutorial">installation by using a self-signed certificate</a>. For that I will use the Cloudflare CFSSL tools:</p>
<pre><code class="lang-python">sudo apt-get update -y
sudo apt-get install -y golang-cfssl
</code></pre>
<p>The next step is to create the self-signed certificates. There are a lot of boilerplate steps here, so I will show you how you can get through them (but please read the man pages to see what each command does):</p>
<h4 id="heading-initial-certificate">Initial certificate</h4>
<pre><code class="lang-shell">sudo /bin/mkdir --parents /etc/pki/raspberrypi
/bin/cat&lt;&lt;CA | sudo tee /etc/pki/raspberrypi/ca.json &gt;/dev/null
{
   "CN": "Nunez Barrios family Root CA",
   "key": {
     "algo": "rsa",
     "size": 2048
   },
   "names": [
   {
     "C": "US",
     "L": "CT",
     "O": "Nunez Barrios",
     "OU": "Nunez Barrios Root CA",
     "ST": "United States"
   }
  ]
}
CA
cfssl gencert -initca ca.json | cfssljson -bare ca
</code></pre>
<h4 id="heading-ssl-profile-config">SSL profile config</h4>
<pre><code class="lang-shell">root@raspberrypi:/etc/pki/raspberrypi# /bin/cat&lt;&lt;PROFILE&gt;/etc/pki/raspberrypi/cfssl.json
{
   "signing": {
     "default": {
       "expiry": "17532h"
     },
     "profiles": {
       "intermediate_ca": {
         "usages": [
             "signing",
             "digital signature",
             "key encipherment",
             "cert sign",
             "crl sign",
             "server auth",
             "client auth"
         ],
         "expiry": "17532h",
         "ca_constraint": {
             "is_ca": true,
             "max_path_len": 0, 
             "max_path_len_zero": true
         }
       },
       "peer": {
         "usages": [
             "signing",
             "digital signature",
             "key encipherment", 
             "client auth",
             "server auth"
         ],
         "expiry": "17532h"
       },
       "server": {
         "usages": [
           "signing",
           "digital signature",
           "key encipherment",
           "server auth"
         ],
         "expiry": "17532h"
       },
       "client": {
         "usages": [
           "signing",
           "digital signature",
           "key encipherment", 
           "client auth"
         ],
         "expiry": "17532h"
       }
     }
   }
}
PROFILE
</code></pre>
<h4 id="heading-intermediate-certificate">Intermediate certificate</h4>
<pre><code class="lang-shell">root@raspberrypi:/etc/pki/raspberrypi# /bin/cat&lt;&lt;INTERMEDIATE&gt;/etc/pki/raspberrypi/intermediate-ca.json
{
  "CN": "Barrios Nunez Intermediate CA",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C":  "US",
      "L":  "CT",
      "O":  "Barrios Nunez",
      "OU": "Barrios Nunez Intermediate CA",
      "ST": "USA"
    }
  ],
  "ca": {
    "expiry": "43830h"
  }
}
INTERMEDIATE
cfssl gencert -initca intermediate-ca.json | cfssljson -bare intermediate_ca
cfssl sign -ca ca.pem -ca-key ca-key.pem -config cfssl.json -profile intermediate_ca intermediate_ca.csr | cfssljson -bare intermediate_ca
</code></pre>
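<p>The <code>expiry</code> values in these files are expressed in hours, which is not easy to read at a glance. A small sketch to convert them into years, assuming the value is a whole number of hours followed by <code>h</code> (<code>expiry_in_years</code> is a hypothetical helper):</p>

```python
HOURS_PER_YEAR = 24 * 365.25  # 8766 hours in an average year, leap days included


def expiry_in_years(expiry: str) -> float:
    """Convert an hours-only duration such as '17532h' into years."""
    if not expiry.endswith("h") or not expiry[:-1].isdigit():
        raise ValueError(f"Only whole hours are handled here: {expiry!r}")
    return int(expiry[:-1]) / HOURS_PER_YEAR
```

<p>With this conversion, <code>17532h</code> in the signing profiles works out to 2 years, and <code>43830h</code> for the intermediate CA to 5 years.</p>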
<h4 id="heading-configuration-for-the-ssl-certificate-on-the-raspberry-pi-4-machine">Configuration for the SSL certificate on the Raspberry PI 4 machine</h4>
<p>Here we put the name and IP address of the machine that will run our Kismet web application:</p>
<pre><code class="lang-shell">/bin/cat&lt;&lt;RASPBERRYPI&gt;/etc/pki/raspberrypi/raspberrypi.home.json
{
  "CN": "raspberrypi.home",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
  {
    "C": "US",
    "L": "CT",
    "O": "Barrios Nunez",
    "OU": "Barrios Nunez Hosts",
    "ST": "USA"
  }
  ],
  "hosts": [
    "raspberrypi.home",
    "localhost",
    "raspberrypi",
    "192.168.1.11"
  ]               
}
RASPBERRYPI
cd /etc/pki/raspberrypi
cfssl gencert -ca intermediate_ca.pem -ca-key intermediate_ca-key.pem -config cfssl.json -profile=peer raspberrypi.home.json| cfssljson -bare raspberry-peer
cfssl gencert -ca intermediate_ca.pem -ca-key intermediate_ca-key.pem -config cfssl.json -profile=server raspberrypi.home.json| cfssljson -bare raspberry-server
cfssl gencert -ca intermediate_ca.pem -ca-key intermediate_ca-key.pem -config cfssl.json -profile=client raspberrypi.home.json| cfssljson -bare raspberry-client
</code></pre>
<p>Adding SSL support is then as easy as adding the following overrides:</p>
<pre><code class="lang-shell">/bin/cat&lt;&lt;SSL&gt;&gt;/etc/kismet/kismet_site.conf
httpd_ssl=true
httpd_ssl_cert=/etc/pki/raspberrypi/raspberry-server.pem
httpd_ssl_key=/etc/pki/raspberrypi/raspberry-server-key.pem
SSL
</code></pre>
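<p>Each <code>cfssljson -bare</code> run leaves several files with similar names (a certificate, a private key, and a signing request), so it is easy to point a setting at the wrong one. A quick way to tell them apart is to look at the PEM header. This is a minimal sketch with hypothetical helper names; it only inspects the text and does not validate the certificate itself.</p>

```python
import re
from typing import List

PEM_HEADER = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def pem_labels(pem_text: str) -> List[str]:
    """Return the label of every PEM block found in the text."""
    return PEM_HEADER.findall(pem_text)


def is_certificate(pem_text: str) -> bool:
    """True only if the text contains an actual certificate block."""
    # 'CERTIFICATE REQUEST' (a CSR) is a different label and does not match.
    return "CERTIFICATE" in pem_labels(pem_text)
```

<p>Reading a candidate file with <code>Path(...).read_text()</code> and passing it to <code>is_certificate</code> before referencing it in <code>httpd_ssl_cert</code> catches the most common mix-up.</p>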
<h3 id="heading-putting-everything-together-with-a-kismet-site-overrides-file">Putting everything together, with a Kismet 'site' overrides file</h3>
<p>Kismet has a really nice feature: it can use a file that overrides some defaults, without the need to edit multiple files. In this case my installation will override the SSL settings, Wi-Fi interface, and log location. So it is time to update our /etc/rc.local file:</p>
<pre><code class="lang-shell">#!/bin/bash
# Kismet setup
usb_driver=rt2800usb
wlan=$(ls /sys/bus/usb/drivers/$usb_driver/*/net/)
if [ $? -eq 0 ]; then
    set -ex
    /usr/sbin/ifconfig "$wlan" down
    /usr/sbin/iwconfig "$wlan" mode monitor
    /usr/sbin/ifconfig "$wlan" up
    set +ex
    /bin/cat&lt;&lt;KISMETOVERR&gt;/etc/kismet/kismet_site.conf
server_name=Nunez Barrios Kismet server
logprefix=/data/kismet
source=$wlan
httpd_ssl=true
httpd_ssl_cert=/etc/pki/raspberrypi/raspberry-server.pem
httpd_ssl_key=/etc/pki/raspberrypi/raspberry-server-key.pem
KISMETOVERR
fi
</code></pre>
<p>Finally, it is time to start Kismet (in my case as the non-root user josevnz):</p>
<pre><code class="lang-python"><span class="hljs-comment"># If you know which interface is the one in monitoring mode, then </span>
josevnz@raspberrypi:~$ kismet
</code></pre>
<p>Now let's log on to the web interface for the first time (in my case <a target="_blank" href="http://raspberrypi.home:2501">http://raspberrypi.home:2501</a>):</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/03/kismet-set-login.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>You will get a prompt the first time you try to log in to your Kismet installation</em></p>
<p>Here you set up your admin user and password.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/03/kismet-main-screen.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Example of the wireless networks detected</em></p>
<p>After a little time, Kismet will populate the main Dashboard with the list of wireless networks and devices it can detect. You will be surprised not just how many neighboring devices are out there but how many you have in your own house.</p>
<p>In my example, the wireless devices around me look pretty normal, except one that doesn't have a name:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/03/suspect-device-details-kismet.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>A device with suspicious characteristics</em></p>
<p>The web interface provides all sorts of useful information, but is there an easy way to filter all the MAC addresses on my networks?</p>
<p>Kismet has a REST API, so it is time to see what we can automate from there.</p>
<h1 id="heading-rest-api-in-python">REST-API in Python</h1>
<p>The <a target="_blank" href="https://www.kismetwireless.net/docs/devel_group.html">developer documentation</a> contains examples of how to extend Kismet, specifically the one related to the <a target="_blank" href="https://github.com/kismetwireless/python-kismet-rest">official Kismet REST-API in Python</a>.</p>
<p>But it seems to be missing a feature to use API keys instead of a user/password pair. And the interaction with the endpoints doesn't seem to be complicated, so I will write my own (less feature-rich) wrapper.</p>
<p>You can download and install the code for a small application I wrote (<a target="_blank" href="https://github.com/josevnz/kismet_home">kismet_home</a>) to illustrate how to work with Kismet (it also has a copy of this tutorial) like this:</p>
<pre><code class="lang-shell">python3 -m venv ~/virtualenv/kismet_home
. ~/virtualenv/kismet_home/bin/activate
python -m pip install --upgrade pip
git clone git@github.com:josevnz/kismet_home.git
cd kismet_home
python setup.py bdist_wheel
pip install dist/kismet_home-0.0.1-py3-none-any.whl
</code></pre>
<p>Then you can run the unit tests, the integration tests, and even the third-party vulnerability scanner:</p>
<pre><code class="lang-shell">. ~/virtualenv/kismet_home/bin/activate
# Unit/ integration tests
python -m unittest test/unit_test_config.py
python -m unittest /home/josevnz/kismet_home/test/test_integration_kismet.py
# Third party vulnerability scanner
pip-audit  --requirement requirements.txt
</code></pre>
<p>You will find more details on the <a target="_blank" href="https://github.com/josevnz/kismet_home/blob/main/README.md">README.md</a> and <a target="_blank" href="https://github.com/josevnz/kismet_home/blob/main/DEVELOPER.md">DEVELOPER.md</a> files.</p>
<p>Let's move on with the code.</p>
<h3 id="heading-how-to-interact-with-kismet-using-python">How to Interact with Kismet using Python</h3>
<p>First I'll write a generic HTTP client I can use to query or send commands to Kismet, that is the <em>KismetWorker</em> class:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> json
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Any, Dict, Set, List, Union
<span class="hljs-keyword">import</span> requests


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">KismetBase</span>:</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, *, api_key: str, url: str</span>):</span>
        <span class="hljs-string">"""
        Parametric constructor
        :param api_key: The Kismet generated API key
        :param url: URL where the Kismet server is running
        """</span>
        self.api_key = api_key
        <span class="hljs-keyword">if</span> url[<span class="hljs-number">-1</span>] != <span class="hljs-string">'/'</span>:
            self.url = <span class="hljs-string">f"<span class="hljs-subst">{url}</span>/"</span>
        <span class="hljs-keyword">else</span>:
            self.url = url
        self.cookies = {<span class="hljs-string">'KISMET'</span>: self.api_key}

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__str__</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"url=<span class="hljs-subst">{self.url}</span>, api_key=XXX"</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">KismetWorker</span>(<span class="hljs-params">KismetBase</span>):</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">check_session</span>(<span class="hljs-params">self</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
        <span class="hljs-string">"""
        Confirm if the session is valid for a given API key
        :return: None, throws an exception if the session is invalid
        """</span>
        endpoint = <span class="hljs-string">f"<span class="hljs-subst">{self.url}</span>session/check_session"</span>
        r = requests.get(endpoint, cookies=self.cookies)
        r.raise_for_status()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">check_system_status</span>(<span class="hljs-params">self</span>) -&gt; Dict[str, Any]:</span>
        <span class="hljs-string">"""
        Overall status of the Kismet server
        :return: Nested dictionary describing different aspect of the Kismet system
        """</span>
        endpoint = <span class="hljs-string">f"<span class="hljs-subst">{self.url}</span>system/status.json"</span>
        r = requests.get(endpoint, cookies=self.cookies)
        r.raise_for_status()
        <span class="hljs-keyword">return</span> json.loads(r.text)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_all_alerts</span>(<span class="hljs-params">self</span>) -&gt; Any:</span>
        <span class="hljs-string">"""
        You can get a description how the alert system is set up as shown here: /alerts/definitions.prettyjson
        This method returns the last N alerts registered by the system. Severity and meaning of the alert is explained
        here: https://www.kismetwireless.net/docs/devel/webui_rest/alerts/
        :return:
        """</span>
        endpoint = <span class="hljs-string">f"<span class="hljs-subst">{self.url}</span>alerts/all_alerts.json"</span>
        r = requests.get(endpoint, cookies=self.cookies)
        r.raise_for_status()
        <span class="hljs-keyword">return</span> json.loads(r.text)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_alert_by_hash</span>(<span class="hljs-params">self, identifier: str</span>) -&gt; Dict[str, Any]:</span>
        <span class="hljs-string">"""
        Get details of a single alert by its identifier (hash)
        :return:
        """</span>
        parsed = int(identifier)
        <span class="hljs-keyword">if</span> parsed &lt; <span class="hljs-number">0</span>:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Invalid ID provided: <span class="hljs-subst">{identifier}</span>"</span>)
        endpoint = <span class="hljs-string">f"<span class="hljs-subst">{self.url}</span>alerts/by-id/<span class="hljs-subst">{identifier}</span>/alert.json"</span>
        r = requests.get(endpoint, cookies=self.cookies)
        r.raise_for_status()
        <span class="hljs-keyword">return</span> json.loads(r.text)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_alert_definitions</span>(<span class="hljs-params">self</span>) -&gt; Dict[Union[str, int], Any]:</span>
        <span class="hljs-string">"""
        Get the defined alert types
        :return:
        """</span>
        endpoint = <span class="hljs-string">f"<span class="hljs-subst">{self.url}</span>alerts/definitions.json"</span>
        r = requests.get(endpoint, cookies=self.cookies)
        r.raise_for_status()
        <span class="hljs-keyword">return</span> json.loads(r.text)
</code></pre>
<p>The way the Kismet API works is that you either make the API key part of the query, or you define it in the KISMET cookie. I chose to populate the cookie.</p>
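<p>To make the two options concrete, here is a small sketch that builds both kinds of request with the standard library only, without sending anything over the network. The API key is a placeholder, and the query parameter name <code>KISMET</code> is my assumption (it mirrors the cookie name), so double check it against the Kismet documentation for your version:</p>
<pre><code class="lang-python">from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://raspberrypi.home:2501/"  # same server used in this tutorial
API_KEY = "REPLACE_WITH_YOUR_API_KEY"       # placeholder, not a real key
ENDPOINT = "system/status.json"

# Option 1: the API key travels in the KISMET cookie (what KismetBase does)
with_cookie = Request(
    f"{BASE_URL}{ENDPOINT}",
    headers={"Cookie": f"KISMET={API_KEY}"}
)

# Option 2: the API key travels in the query string
# (parameter name assumed to be KISMET, like the cookie)
query = urlencode({"KISMET": API_KEY})
with_query = Request(f"{BASE_URL}{ENDPOINT}?{query}")

# Nothing is sent; we only inspect what would go on the wire
print(with_cookie.full_url, with_cookie.get_header("Cookie"))
print(with_query.full_url)
</code></pre>
<p>Keeping the key in a cookie has the small advantage that it does not show up in URLs that may end up in logs.</p>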
<p>KismetWorker implements the following methods:</p>
<ul>
<li><p><strong>check_session</strong>: It checks if your API KEY is valid. If not it will throw an exception.</p>
</li>
<li><p><strong>check_system_status</strong>: Returns the overall status of the Kismet server. The call requires that the administrator (you, most likely) has already defined an admin user for the Kismet server. If not, then all the API queries will fail.</p>
</li>
<li><p><strong>get_all_alerts</strong>: Gets all the available alerts (if any) from your Kismet server.</p>
</li>
<li><p><strong>get_alert_by_hash</strong>: If you know the identifier (hash) of an alert, you can retrieve the details of that event only.</p>
</li>
<li><p><strong>get_alert_definitions</strong>: Get all the alert definitions. Kismet supports a wide range of alerts and a user will definitely be interested to find out what type of alerts they are.</p>
</li>
</ul>
<p>You can read <a target="_blank" href="https://github.com/josevnz/kismet_home/blob/main/test/test_integration_kismet.py">all the integration code</a> to see how the methods work in action.</p>
<p>I also wrote a class that requires admin privileges. I use it to define a custom alert type and to send alerts of that type to Kismet, as part of the integration tests. Right now I don't have much use for sending custom alerts to Kismet in real life, but that may change in the future, so here is the code:</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">KismetAdmin</span>(<span class="hljs-params">KismetBase</span>):</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">define_alert</span>(<span class="hljs-params">
            self,
            *,
            name: str,
            description: str,
            throttle: str = <span class="hljs-string">'10/min'</span>,
            burst: str = <span class="hljs-string">"1/sec"</span>,
            severity: int = <span class="hljs-number">5</span>,
            aclass: str = <span class="hljs-string">'SYSTEM'</span>

    </span>):</span>
        <span class="hljs-string">"""
        Define a new type of alert for Kismet
        :param aclass: Alert class
        :param severity: Alert severity
        :param throttle: Optional throttle
        :param name: Name of the new alert
        :param description: What does this mean
        :param burst: Optional burst
        :return:
        """</span>
        endpoint = <span class="hljs-string">f"<span class="hljs-subst">{self.url}</span>alerts/definitions/define_alert.cmd"</span>
        command = {
            <span class="hljs-string">'name'</span>: name,
            <span class="hljs-string">'description'</span>: description,
            <span class="hljs-string">'throttle'</span>: throttle,
            <span class="hljs-string">'burst'</span>: burst,
            <span class="hljs-string">'severity'</span>: severity,
            <span class="hljs-string">'class'</span>: aclass
        }
        r = requests.post(endpoint, json=command, cookies=self.cookies)
        r.raise_for_status()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">raise_alert</span>(<span class="hljs-params">
            self,
            *,
            name: str,
            message: str
    </span>) -&gt; <span class="hljs-keyword">None</span>:</span>
        <span class="hljs-string">"""
        Send an alert to Kismet
        :param name: A well-defined name or id for the alert. MUST exist
        :param message: Message to send
        :return: None. Will raise an error if the alert could not be sent
        """</span>
        endpoint = <span class="hljs-string">f"<span class="hljs-subst">{self.url}</span>alerts/raise_alert.cmd"</span>
        command = {
            <span class="hljs-string">'name'</span>: name,
            <span class="hljs-string">'text'</span>: message
        }
        r = requests.post(endpoint, json=command, cookies=self.cookies)
        r.raise_for_status()
</code></pre>
<p>Getting the data is just part of the story. We need to normalize it, so it can be used by the final scripts.</p>
<h3 id="heading-how-to-normalize-the-kismet-raw-data">How to Normalize the Kismet raw data</h3>
<p>Kismet returns a lot of detail about each alert, but we don't need to show all of it to the user (think about the nice view you get with the web application). Instead we do a few transformations using the following class with static methods:</p>
<ul>
<li><p><strong>parse_alert_definitions</strong>: Returns a simplified report of all the alert definitions</p>
</li>
<li><p><strong>process_alerts</strong>: Replaces the numeric severity of each alert with a descriptive name, and also returns dictionaries explaining the types and severities found in those alerts.</p>
</li>
<li><p><strong>pretty_timestamp</strong>: Converts the numeric timestamp into something we can use for comparisons and display</p>
</li>
</ul>
<p>The code for the <em>KismetResultsParser</em> helper class:</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">KismetResultsParser</span>:</span>
    SEVERITY = {
        <span class="hljs-number">0</span>: {
            <span class="hljs-string">'name'</span>: <span class="hljs-string">'INFO'</span>,
            <span class="hljs-string">'description'</span>: <span class="hljs-string">'Informational alerts, such as datasource  errors, Kismet state changes, etc'</span>
        },
        <span class="hljs-number">5</span>: {
            <span class="hljs-string">'name'</span>: <span class="hljs-string">'LOW'</span>,
            <span class="hljs-string">'description'</span>: <span class="hljs-string">'Low - risk events such as probe fingerprints'</span>
        },
        <span class="hljs-number">10</span>: {
            <span class="hljs-string">'name'</span>: <span class="hljs-string">'MEDIUM'</span>,
            <span class="hljs-string">'description'</span>: <span class="hljs-string">'Medium - risk events such as denial of service attempts'</span>
        },
        <span class="hljs-number">15</span>: {
            <span class="hljs-string">'name'</span>: <span class="hljs-string">'HIGH'</span>,
            <span class="hljs-string">'description'</span>: <span class="hljs-string">'High - risk events such as fingerprinted watched devices, denial of service attacks, '</span>
                           <span class="hljs-string">'and similar '</span>
        },
        <span class="hljs-number">20</span>: {
            <span class="hljs-string">'name'</span>: <span class="hljs-string">'CRITICAL'</span>,
            <span class="hljs-string">'description'</span>: <span class="hljs-string">'Critical errors such as fingerprinted known exploits'</span>
        }
    }

    TYPES = {
        <span class="hljs-string">'DENIAL'</span>: <span class="hljs-string">'Possible denial of service attack'</span>,
        <span class="hljs-string">'EXPLOIT'</span>: <span class="hljs-string">'Known fingerprinted exploit attempt against a vulnerability'</span>,
        <span class="hljs-string">'OTHER'</span>: <span class="hljs-string">'General category for alerts which don’t fit in any existing bucket'</span>,
        <span class="hljs-string">'PROBE'</span>: <span class="hljs-string">'Probe by known tools'</span>,
        <span class="hljs-string">'SPOOF'</span>: <span class="hljs-string">'Attempt to spoof an existing device'</span>,
        <span class="hljs-string">'SYSTEM'</span>: <span class="hljs-string">'System events, such as log changes, datasource errors, etc.'</span>
    }

<span class="hljs-meta">    @staticmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">parse_alert_definitions</span>(<span class="hljs-params">
            *,
            alert_definitions: List[Dict[str, str]],
            keys_of_interest: Set[str] = None
    </span>) -&gt; List[Dict[str, str]]:</span>
        <span class="hljs-string">"""
        Remove unwanted keys from full alert definition dump, to make it easier to read onscreen
        :param alert_definitions: Original Kismet alert definitions
        :param keys_of_interest: Kismet keys of interest
        :return: List of dictionaries with trimmed keys, description, severity and header for easy reading
        """</span>
        <span class="hljs-keyword">if</span> keys_of_interest <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
            keys_of_interest = {
                <span class="hljs-string">'kismet.alert.definition.class'</span>,
                <span class="hljs-string">'kismet.alert.definition.description'</span>,
                <span class="hljs-string">'kismet.alert.definition.severity'</span>,
                <span class="hljs-string">'kismet.alert.definition.header'</span>
            }
        parsed_alerts: List[Dict[str, str]] = []
        <span class="hljs-keyword">for</span> definition <span class="hljs-keyword">in</span> alert_definitions:
            new_definition = {}
            <span class="hljs-keyword">for</span> def_key <span class="hljs-keyword">in</span> definition:
                <span class="hljs-keyword">if</span> def_key <span class="hljs-keyword">in</span> keys_of_interest:
                    new_key = def_key.split(<span class="hljs-string">'.'</span>)[<span class="hljs-number">-1</span>]
                    new_definition[new_key] = definition[def_key]
            parsed_alerts.append(new_definition)
        <span class="hljs-keyword">return</span> parsed_alerts

<span class="hljs-meta">    @staticmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">process_alerts</span>(<span class="hljs-params">
            *,
            alerts: List[Dict[str, Union[str, int]]],

    </span>) -&gt; Any:</span>
        <span class="hljs-string">"""
        Remove unwanted fields from alert details, and also return extra data for the severities and types of alerts
        :param alerts:
        :return:
        """</span>
        processed_alerts = []
        found_types = {}
        found_severities = {}
        <span class="hljs-keyword">for</span> alert <span class="hljs-keyword">in</span> alerts:
            severity = alert[<span class="hljs-string">'kismet.alert.severity'</span>]
            severity_name = KismetResultsParser.SEVERITY[severity][<span class="hljs-string">'name'</span>]
            severity_desc = KismetResultsParser.SEVERITY[severity][<span class="hljs-string">'description'</span>]
            found_severities[severity_name] = severity_desc
            text = alert[<span class="hljs-string">'kismet.alert.text'</span>]
            aclass = alert[<span class="hljs-string">'kismet.alert.class'</span>]
            found_types[aclass] = KismetResultsParser.TYPES[aclass]
            processed_alert = {
                <span class="hljs-string">'text'</span>: text,
                <span class="hljs-string">'class'</span>: aclass,
                <span class="hljs-string">'severity'</span>: severity_name,
                <span class="hljs-string">'hash'</span>: alert[<span class="hljs-string">'kismet.alert.hash'</span>],
                <span class="hljs-string">'dest_mac'</span>: alert[<span class="hljs-string">'kismet.alert.dest_mac'</span>],
                <span class="hljs-string">'source_mac'</span>: alert[<span class="hljs-string">'kismet.alert.source_mac'</span>],
                <span class="hljs-string">'timestamp'</span>: alert[<span class="hljs-string">'kismet.alert.timestamp'</span>]
            }
            processed_alerts.append(processed_alert)
        <span class="hljs-keyword">return</span> processed_alerts, found_severities, found_types

<span class="hljs-meta">    @staticmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">pretty_timestamp</span>(<span class="hljs-params">timestamp: float</span>) -&gt; datetime:</span>
        <span class="hljs-string">"""
        Convert a Kismet timestamp (TIMESTAMP.UTIMESTAMP) into a datetime object (local time zone)
        :param timestamp:
        :return:
        """</span>
        <span class="hljs-keyword">return</span> datetime.fromtimestamp(timestamp)
</code></pre>
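<p>To see what the normalization does without a running Kismet server, here is a self-contained sketch that applies the same key trimming idea to a made-up alert definition and converts a sample timestamp. The sample values (including the <code>unwanted</code> key) are invented for illustration. Only the key naming scheme mirrors what the parser above expects:</p>
<pre><code class="lang-python">from datetime import datetime, timezone

# Made-up alert definition; only the key naming scheme mirrors Kismet
sample_definition = {
    "kismet.alert.definition.class": "SPOOF",
    "kismet.alert.definition.description": "Example description",
    "kismet.alert.definition.severity": 15,
    "kismet.alert.definition.header": "EXAMPLEALERT",
    "kismet.alert.definition.unwanted": "dropped by the filter",
}
keys_of_interest = {
    "kismet.alert.definition.class",
    "kismet.alert.definition.description",
    "kismet.alert.definition.severity",
    "kismet.alert.definition.header",
}

# Same idea as parse_alert_definitions(): keep the wanted keys, shorten their names
trimmed = {
    key.split(".")[-1]: value
    for key, value in sample_definition.items()
    if key in keys_of_interest
}
print(trimmed)

# pretty_timestamp() uses the local time zone; passing tz here keeps the demo reproducible
when = datetime.fromtimestamp(1646400000.5, tz=timezone.utc)
print(when.isoformat())
</code></pre>
<p>The short key names (<code>class</code>, <code>severity</code> and so on) are what make the later table rendering straightforward.</p>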
<p>If you run the integration tests with the admin role enabled, you will see that one or more alerts (depending on how many times you ran the tests) were added to the Web UI:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/03/kismet_generated_alerts.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>These alerts were generated using the Python client and the REST API</em></p>
<p>As a reminder, you can see how this is used by looking at the code <a target="_blank" href="https://github.com/josevnz/kismet_home/blob/main/test/test_integration_kismet.py">here</a>. Here is a sample run of all the integration tests against my installation (this one without publishing alerts, so some tests are skipped):</p>
<pre><code class="lang-shell">(kismet_home) [josevnz@dmaf5 kismet_home]$ python -m unittest /home/josevnz/kismet_home/test/test_integration_kismet.py 
[09:13:05] DEBUG    Starting new HTTP connection (1): raspberrypi.home:2501                                                                                                                                                        connectionpool.py:228
           DEBUG    http://raspberrypi.home:2501 "GET /session/check_session HTTP/1.1" 200 None                                                                                                                                    connectionpool.py:456
.           DEBUG    Starting new HTTP connection (1): raspberrypi.home:2501                                                                                                                                                        connectionpool.py:228
           DEBUG    http://raspberrypi.home:2501 "GET /system/status.json HTTP/1.1" 200 None                                                                                                                                       connectionpool.py:456
.           DEBUG    Starting new HTTP connection (1): raspberrypi.home:2501                                                                                                                                                        connectionpool.py:228
           DEBUG    http://raspberrypi.home:2501 "GET /alerts/definitions.json HTTP/1.1" 200 None                                                                                                                                  connectionpool.py:456
.[09:13:05] 'ADMIN_SESSION_API' environment variable not defined. Skipping this test                                                                                                                                       test_integration_kismet.py:105
....
----------------------------------------------------------------------
Ran 7 tests in 0.053s

OK
</code></pre>
<h3 id="heading-where-do-we-store-our-api-key-and-other-configuration-details">Where do we store our API key and other configuration details?</h3>
<p>Details like these won't be hardcoded inside the scripts. Instead they will reside in an external configuration file:</p>
<pre><code class="lang-shell">(kismet_home) [josevnz@dmaf5 kismet_home]$ cat ~/.config/kodegeek/kismet_home/config.ini 
[server]
url = http://raspberrypi.home:2501
api_key = E41CAD466552810392D538FF8D43E2C5
</code></pre>
<p>The following classes handle all the access details (using a Reader and a Writer class for each type of operation):</p>
<pre><code class="lang-python"><span class="hljs-string">"""
Simple configuration management for kismet_home settings
"""</span>
<span class="hljs-keyword">import</span> os.path
<span class="hljs-keyword">from</span> configparser <span class="hljs-keyword">import</span> ConfigParser
<span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Dict

<span class="hljs-keyword">from</span> kismet_home <span class="hljs-keyword">import</span> CONSOLE

DEFAULT_INI = os.path.expanduser(<span class="hljs-string">'~/.config/kodegeek/kismet_home/config.ini'</span>)
VALID_KEYS = {<span class="hljs-string">'api_key'</span>, <span class="hljs-string">'url'</span>}


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Reader</span>:</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, config_file: str = DEFAULT_INI</span>):</span>
        <span class="hljs-string">"""
        Constructor
        :param config_file: Optional override of the ini configuration file
        """</span>
        self.config = ConfigParser()
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> self.config.read(config_file):
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Could not read <span class="hljs-subst">{config_file}</span>"</span>)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_api_key</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-string">"""
        Get back the API key used to connect to Kismet
        :return:
        """</span>
        <span class="hljs-keyword">return</span> self.config.get(<span class="hljs-string">'server'</span>, <span class="hljs-string">'api_key'</span>)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_url</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-string">"""
        Get back URL of Kismet server
        :return:
        """</span>
        <span class="hljs-keyword">return</span> self.config.get(<span class="hljs-string">'server'</span>, <span class="hljs-string">'url'</span>)


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Writer</span>:</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">
            self,
            *,
            server_keys: Dict[str, str]
    </span>):</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> server_keys:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"Configuration is incomplete!, aborting!"</span>)
        self.config = ConfigParser()
        self.config.add_section(<span class="hljs-string">'server'</span>)
        valid_keys_cnt = <span class="hljs-number">0</span>
        <span class="hljs-keyword">for</span> key <span class="hljs-keyword">in</span> server_keys:
            value = server_keys[key]
            <span class="hljs-keyword">if</span> key <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> VALID_KEYS:
                CONSOLE.log(<span class="hljs-string">f"Ignoring invalid key: <span class="hljs-subst">{key}</span> = <span class="hljs-subst">{value}</span>"</span>)
                <span class="hljs-keyword">continue</span>
            self.config.set(<span class="hljs-string">'server'</span>, key, value)
            CONSOLE.log(<span class="hljs-string">f"Added: server: <span class="hljs-subst">{key}</span> = <span class="hljs-subst">{value}</span>"</span>)
        <span class="hljs-keyword">for</span> valid_key <span class="hljs-keyword">in</span> VALID_KEYS:
            <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> self.config.get(<span class="hljs-string">'server'</span>, valid_key, fallback=<span class="hljs-literal">None</span>):
                <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Missing required key: <span class="hljs-subst">{valid_key}</span>"</span>)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">save</span>(<span class="hljs-params">
            self,
            *,
            config_file: str = DEFAULT_INI
    </span>):</span>
        basedir = Path(config_file).parent
        basedir.mkdir(exist_ok=<span class="hljs-literal">True</span>, parents=<span class="hljs-literal">True</span>)
        <span class="hljs-keyword">with</span> open(config_file, <span class="hljs-string">'w'</span>) <span class="hljs-keyword">as</span> config:
            self.config.write(config, space_around_delimiters=<span class="hljs-literal">True</span>)
        CONSOLE.log(<span class="hljs-string">f"Configuration file <span class="hljs-subst">{config_file}</span> written"</span>)
</code></pre>
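<p>If you want to see the round trip in isolation, the following sketch writes and then reads back a configuration file using only <code>ConfigParser</code>, in a temporary directory so your real settings are left alone. The API key is a placeholder and <code>no_such_key</code> is a hypothetical option, used only to show how a missing key behaves:</p>
<pre><code class="lang-python">import tempfile
from configparser import ConfigParser
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Throwaway location for the demo, not the real ~/.config path
    config_file = Path(tmp) / "kismet_home" / "config.ini"
    config_file.parent.mkdir(parents=True, exist_ok=True)

    # Write the [server] section, like the Writer class does
    writer = ConfigParser()
    writer.add_section("server")
    writer.set("server", "url", "http://raspberrypi.home:2501/")
    writer.set("server", "api_key", "REPLACE_WITH_YOUR_API_KEY")  # placeholder value
    with open(config_file, "w") as handle:
        writer.write(handle, space_around_delimiters=True)

    # Read it back, like the Reader class does
    reader = ConfigParser()
    if not reader.read(config_file):
        raise ValueError(f"Could not read {config_file}")
    url = reader.get("server", "url")
    api_key = reader.get("server", "api_key")
    # Without fallback, get() raises NoOptionError for a missing option
    missing = reader.get("server", "no_such_key", fallback=None)

print(url, api_key, missing)
</code></pre>
<p>Note that <code>ConfigParser.read()</code> returns the list of files it managed to parse, which is why an empty result can be treated as an error.</p>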
<p>The first time you set up your kismet_home installation, you can create the configuration file like this:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 kismet_home]$ python3 -m venv ~/virtualenv/kismet_home
[josevnz@dmaf5 kismet_home]$ . ~/virtualenv/kismet_home/bin/activate
(kismet_home) [josevnz@dmaf5 kismet_home]$ python -m pip install --upgrade pip
(kismet_home) [josevnz@dmaf5 kismet_home]$ git clone git@github.com:josevnz/kismet_home.git
(kismet_home) [josevnz@dmaf5 kismet_home]$ python setup.py bdist_wheel
(kismet_home) [josevnz@dmaf5 kismet_home]$ pip install kismet_home-0.0.1-py3-none-any.whl

(kismet_home) [josevnz@dmaf5 kismet_home]$ kismet_home_config.py 
Please enter the URL of your Kismet server: http://raspberrypi.home:2501/
Please enter your API key: E41CAD466552810392D538FF8D43E2C5
[13:02:35] Added: server: url = http://raspberrypi.home:2501/                                                                                 config.py:44
           Added: server: api_key = E41CAD466552810392D538FF8D43E2C5                                                                          config.py:44
           Configuration file /home/josevnz/.config/kodegeek/kismet_home/config.ini written
</code></pre>
<p>Please note the use of the virtual environment here. This will allow us to keep the application's libraries self-contained.</p>
<h2 id="heading-putting-everything-together-how-to-write-our-cli-for-kismethome">Putting everything together: How to Write our CLI for kismet_home</h2>
<p>The <em>kismet_home_alerts.py</em> script will support two modes:</p>
<ul>
<li><p>Show the alert definitions</p>
</li>
<li><p>Show all the alerts</p>
</li>
</ul>
<p>Also, it will allow filtering alerts based on the level (INFO, MEDIUM, HIGH, ...).</p>
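<p>As a rough idea of what filtering by level can look like, here is a minimal sketch. It assumes that selecting a level means "this level or a more severe one", which may differ from what the real script does, and the function name <code>filter_by_level</code> and the sample alerts are made up for illustration:</p>
<pre><code class="lang-python"># Severity names ordered from least to most severe, as in KismetResultsParser.SEVERITY
LEVELS = ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


def filter_by_level(alerts, level="INFO"):
    """Keep alerts at the requested level or a more severe one (assumed semantics)."""
    wanted = set(LEVELS[LEVELS.index(level.upper()):])
    return [alert for alert in alerts if alert["severity"] in wanted]


# Made-up alerts, shaped like the output of KismetResultsParser.process_alerts()
sample_alerts = [
    {"text": "example informational alert", "severity": "INFO"},
    {"text": "example high alert", "severity": "HIGH"},
    {"text": "example critical alert", "severity": "CRITICAL"},
]
selected = filter_by_level(sample_alerts, level="high")
print([alert["text"] for alert in selected])
</code></pre>
<p>An unknown level name raises a <code>ValueError</code> from <code>list.index()</code>, which is a reasonable way to reject typos on the command line.</p>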
<p>Showing all the definitions, filtered by CRITICAL:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/03/alert_definitions_filtered_by_level.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>You can see here the alert definitions filtered by level</em></p>
<p>Or showing all the alerts received so far, with anonymized MAC addresses (great for screenshots like this):</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/03/kismet_home_alerts.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Alerts for my local network, with anonymous MAC addresses and filtered</em></p>
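<p>The anonymization itself is handled by <em>KismetResultsParser.anonymize_mac</em>, whose implementation lives in the repository and is not shown in this article. As a rough idea of what such a helper can do, here is a minimal sketch. It is a hypothetical stand-in, and it assumes that keeping the first three octets (the vendor prefix) and masking the rest is good enough for a screenshot:</p>

```python
def anonymize_mac(mac: str, keep_octets: int = 3) -> str:
    """Mask the trailing octets of a MAC address (hypothetical helper, not the repository code)."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError(f"Not a MAC address: {mac}")
    # Keep the leading octets and replace every remaining octet with XX
    masked = octets[:keep_octets] + ["XX"] * (6 - keep_octets)
    return ":".join(masked)


print(anonymize_mac("AA:BB:CC:11:22:33"))  # AA:BB:CC:XX:XX:XX
```

<p>The MAC address used above is made up for the example.</p>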
<p>How can you generate these tables with ease? There is a dedicated module for the text user interface (TUI):</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Dict, Any

<span class="hljs-keyword">from</span> rich.layout <span class="hljs-keyword">import</span> Layout
<span class="hljs-keyword">from</span> rich.table <span class="hljs-keyword">import</span> Table

<span class="hljs-keyword">from</span> kismet_home.kismet <span class="hljs-keyword">import</span> KismetResultsParser


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_alert_definition_table</span>(<span class="hljs-params">
        *,
        alert_definitions: List[Dict[str, Any]],
        level_filter: str = <span class="hljs-number">0</span>
</span>) -&gt; Table:</span>
    <span class="hljs-string">"""
    Create a table showing the alert definitions
    :param alert_definitions: Alert definitions from Kismet
    :param level_filter: User can override the level of the alerts shown. But default is 0 (INFO)
    :return: A Table with the alert definitions
    """</span>
    definition_table = Table(title=<span class="hljs-string">"Alert definitions"</span>)
    definition_table.add_column(<span class="hljs-string">"Severity"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"cyan"</span>, no_wrap=<span class="hljs-literal">True</span>)
    definition_table.add_column(<span class="hljs-string">"Description"</span>, style=<span class="hljs-string">"magenta"</span>)
    definition_table.add_column(<span class="hljs-string">"Header"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"yellow"</span>)
    definition_table.add_column(<span class="hljs-string">"Class"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"green"</span>)
    filter_level = KismetResultsParser.get_level_for_security(level_filter)
    filtered_definitions = <span class="hljs-number">0</span>
    <span class="hljs-keyword">for</span> definition <span class="hljs-keyword">in</span> alert_definitions:
        int_severity: int = definition[<span class="hljs-string">'severity'</span>]
        <span class="hljs-keyword">if</span> int_severity &lt; filter_level:
            <span class="hljs-keyword">continue</span>
        severity = KismetResultsParser.SEVERITY[int_severity][<span class="hljs-string">'name'</span>]
        <span class="hljs-keyword">if</span> <span class="hljs-number">0</span> &lt;= int_severity &lt; <span class="hljs-number">5</span>:
            severity = <span class="hljs-string">f"[bold blue]<span class="hljs-subst">{severity}</span>[/ bold blue]"</span>
        <span class="hljs-keyword">elif</span> <span class="hljs-number">5</span> &lt;= int_severity &lt; <span class="hljs-number">10</span>:
            severity = <span class="hljs-string">f"[bold yellow]<span class="hljs-subst">{severity}</span>[/ bold yellow]"</span>
        <span class="hljs-keyword">elif</span> <span class="hljs-number">10</span> &lt;= int_severity &lt; <span class="hljs-number">15</span>:
            severity = <span class="hljs-string">f"[bold orange]<span class="hljs-subst">{severity}</span>[/ bold orange]"</span>
        <span class="hljs-keyword">else</span>:
            severity = <span class="hljs-string">f"[bold red]<span class="hljs-subst">{severity}</span>[/ bold red]"</span>
        filtered_definitions += <span class="hljs-number">1</span>
        definition_table.add_row(
            severity,
            definition[<span class="hljs-string">'description'</span>],
            definition[<span class="hljs-string">'header'</span>],
            definition[<span class="hljs-string">'class'</span>]
        )
    definition_table.caption = <span class="hljs-string">f"Total definitions: <span class="hljs-subst">{filtered_definitions}</span>"</span>
    <span class="hljs-keyword">return</span> definition_table


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_alert_layout</span>(<span class="hljs-params">
        *,
        alerts: List[Dict[str, Any]],
        level_filter: str = <span class="hljs-number">0</span>,
        anonymize: bool = False,
        severities: Dict[str, str]
</span>):</span>
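    <span class="hljs-string">"""
    Create a layout that holds the alerts table and the severity legend
    :param severities: Severity names mapped to their explanations, used for the legend
    :param alerts: Alerts retrieved from Kismet
    :param level_filter: Minimum level of the alerts shown
    :param anonymize: If True, mask the MAC addresses before displaying them
    :return: Tuple with the layout and the number of alerts that passed the filter
    """</span>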
    alerts_table = Table(title=<span class="hljs-string">"Alerts"</span>)
    alerts_table.add_column(<span class="hljs-string">"Timestamp"</span>, no_wrap=<span class="hljs-literal">True</span>)
    alerts_table.add_column(<span class="hljs-string">"Severity"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"cyan"</span>, no_wrap=<span class="hljs-literal">True</span>)
    alerts_table.add_column(<span class="hljs-string">"Text"</span>, style=<span class="hljs-string">"magenta"</span>)
    alerts_table.add_column(<span class="hljs-string">"Source MAC"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"yellow"</span>, no_wrap=<span class="hljs-literal">True</span>)
    alerts_table.add_column(<span class="hljs-string">"Destination MAC"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"yellow"</span>, no_wrap=<span class="hljs-literal">True</span>)
    alerts_table.add_column(<span class="hljs-string">"Class"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"green"</span>, no_wrap=<span class="hljs-literal">True</span>)
    filter_level = KismetResultsParser.get_level_for_security(level_filter)

    filtered_definitions = <span class="hljs-number">0</span>
    <span class="hljs-keyword">for</span> alert <span class="hljs-keyword">in</span> alerts:
        int_severity: int = KismetResultsParser.get_level_for_security(alert[<span class="hljs-string">'severity'</span>])
        <span class="hljs-keyword">if</span> int_severity &lt; filter_level:
            <span class="hljs-keyword">continue</span>
        severity = KismetResultsParser.SEVERITY[int_severity][<span class="hljs-string">'name'</span>]
        <span class="hljs-keyword">if</span> <span class="hljs-number">0</span> &lt;= int_severity &lt; <span class="hljs-number">5</span>:
            severity = <span class="hljs-string">f"[bold blue]<span class="hljs-subst">{severity}</span>[/ bold blue]"</span>
        <span class="hljs-keyword">elif</span> <span class="hljs-number">5</span> &lt;= int_severity &lt; <span class="hljs-number">10</span>:
            severity = <span class="hljs-string">f"[bold yellow]<span class="hljs-subst">{severity}</span>[/ bold yellow]"</span>
        <span class="hljs-keyword">elif</span> <span class="hljs-number">10</span> &lt;= int_severity &lt; <span class="hljs-number">15</span>:
            severity = <span class="hljs-string">f"[bold orange]<span class="hljs-subst">{severity}</span>[/ bold orange]"</span>
        <span class="hljs-keyword">else</span>:
            severity = <span class="hljs-string">f"[bold red]<span class="hljs-subst">{severity}</span>[/ bold red]"</span>
        filtered_definitions += <span class="hljs-number">1</span>
        <span class="hljs-keyword">if</span> anonymize:
            s_mac = KismetResultsParser.anonymize_mac(alert[<span class="hljs-string">'source_mac'</span>])
            d_mac = KismetResultsParser.anonymize_mac(alert[<span class="hljs-string">'dest_mac'</span>])
        <span class="hljs-keyword">else</span>:
            s_mac = alert[<span class="hljs-string">'source_mac'</span>]
            d_mac = alert[<span class="hljs-string">'dest_mac'</span>]
        alerts_table.add_row(
            str(KismetResultsParser.pretty_timestamp(alert[<span class="hljs-string">'timestamp'</span>])),
            severity,
            alert[<span class="hljs-string">'text'</span>],
            s_mac,
            d_mac,
            alert[<span class="hljs-string">'class'</span>]
        )
    alerts_table.caption = <span class="hljs-string">f"Total alerts: <span class="hljs-subst">{filtered_definitions}</span>"</span>

    severities_table = Table(title=<span class="hljs-string">"Severity legend"</span>)
    severities_table.add_column(<span class="hljs-string">"Severity"</span>)
    severities_table.add_column(<span class="hljs-string">"Explanation"</span>)
    <span class="hljs-keyword">for</span> severity <span class="hljs-keyword">in</span> severities:
        explanation = <span class="hljs-string">f"[green]<span class="hljs-subst">{severities[severity]}</span>[/green]"</span>
        severities_table.add_row(<span class="hljs-string">f"[yellow]<span class="hljs-subst">{severity}</span>[/yellow]"</span>, explanation)

    layout = Layout()
    layout.split(
        Layout(ratio=<span class="hljs-number">2</span>, name=<span class="hljs-string">"alerts"</span>),
        Layout(name=<span class="hljs-string">"severities"</span>),
    )
    layout[<span class="hljs-string">"alerts"</span>].update(alerts_table)
    layout[<span class="hljs-string">"severities"</span>].update(severities_table)
    <span class="hljs-keyword">return</span> layout, filtered_definitions
</code></pre>
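<p>Both functions pick a color from the numeric severity using the same thresholds (5, 10 and 15). If you want to keep that logic in one place, it can be pulled into a small helper. The sketch below is only an illustration: the name <em>style_severity</em> is hypothetical, it assumes severity levels are never negative, and it only builds the markup string, so it runs without Rich installed:</p>

```python
def style_severity(name: str, level: int) -> str:
    """Wrap a severity name in console markup, with the color chosen by numeric level."""
    # Checks run from the highest threshold down, so exactly one branch applies
    if level >= 15:
        color = "red"
    elif level >= 10:
        color = "orange"
    elif level >= 5:
        color = "yellow"
    else:
        color = "blue"
    return f"[bold {color}]{name}[/bold {color}]"


print(style_severity("INFO", 0))
print(style_severity("HIGH", 12))
```

<p>Each table row could then call the helper once instead of repeating the chain of comparisons.</p>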
<p>And now with all the ingredients ready, we can see how the final script looks:</p>
<pre><code class="lang-python"><span class="hljs-comment">#!/usr/bin/env python</span>
<span class="hljs-string">"""
# kismet_home_alerts.py
# Author
Jose Vicente Nunez Zuleta (kodegeek.com@protonmail.com)
"""</span>
<span class="hljs-keyword">import</span> logging
<span class="hljs-keyword">import</span> sys

<span class="hljs-keyword">from</span> requests <span class="hljs-keyword">import</span> HTTPError
<span class="hljs-keyword">import</span> argparse

<span class="hljs-keyword">from</span> kismet_home <span class="hljs-keyword">import</span> CONSOLE
<span class="hljs-keyword">from</span> kismet_home.config <span class="hljs-keyword">import</span> Reader
<span class="hljs-keyword">from</span> kismet_home.kismet <span class="hljs-keyword">import</span> KismetWorker, KismetResultsParser
<span class="hljs-keyword">from</span> kismet_home.tui <span class="hljs-keyword">import</span> create_alert_definition_table, create_alert_layout

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">'__main__'</span>:

    arg_parser = argparse.ArgumentParser(
        description=<span class="hljs-string">"Display alerts generated by your local Kismet installation"</span>,
        prog=__file__
    )
    arg_parser.add_argument(
        <span class="hljs-string">'--debug'</span>,
        action=<span class="hljs-string">'store_true'</span>,
        default=<span class="hljs-literal">False</span>,
        help=<span class="hljs-string">"Enable debug mode"</span>
    )
    arg_parser.add_argument(
        <span class="hljs-string">'--anonymize'</span>,
        action=<span class="hljs-string">'store_true'</span>,
        default=<span class="hljs-literal">False</span>,
        help=<span class="hljs-string">"Anonymize MAC addresses"</span>
    )
    arg_parser.add_argument(
        <span class="hljs-string">'--level'</span>,
        action=<span class="hljs-string">'store'</span>,
        default=<span class="hljs-string">'INFO'</span>,
        help=<span class="hljs-string">"Minimum alert level to show (default: INFO)"</span>
    )
    arg_parser.add_argument(
        <span class="hljs-string">'mode'</span>,
        action=<span class="hljs-string">'store'</span>,
        choices=[<span class="hljs-string">'alert_type'</span>, <span class="hljs-string">'alerts'</span>],
        help=<span class="hljs-string">"Operation mode"</span>
    )

    <span class="hljs-keyword">try</span>:
        args = arg_parser.parse_args()
        conf_reader = Reader()
        kw = KismetWorker(
            api_key=conf_reader.get_api_key(),
            url=conf_reader.get_url()
        )
        <span class="hljs-keyword">if</span> args.mode == <span class="hljs-string">'alert_type'</span>:
            alert_definitions = KismetResultsParser.parse_alert_definitions(
                alert_definitions=kw.get_alert_definitions()
            )
            table = create_alert_definition_table(alert_definitions=alert_definitions, level_filter=args.level)
            <span class="hljs-keyword">if</span> table.columns:
                CONSOLE.print(table)
            <span class="hljs-keyword">else</span>:
                CONSOLE.print(<span class="hljs-string">f"[b]Could not get alert definitions![/b]"</span>)
        <span class="hljs-keyword">elif</span> args.mode == <span class="hljs-string">'alerts'</span>:
            alerts, severities, types = KismetResultsParser.process_alerts(
                alerts=kw.get_all_alerts()
            )
            layout, found = create_alert_layout(
                alerts=alerts,
                level_filter=args.level,
                anonymize=args.anonymize,
                severities=severities
            )
            <span class="hljs-keyword">if</span> found:
                CONSOLE.print(layout)
            <span class="hljs-keyword">else</span>:
                CONSOLE.print(<span class="hljs-string">f"[b]No alerts to show for level=<span class="hljs-subst">{args.level}</span>[/b]"</span>)
    <span class="hljs-keyword">except</span> (ValueError, HTTPError):
        logging.exception(<span class="hljs-string">"There was an error"</span>)
        sys.exit(<span class="hljs-number">100</span>)
    <span class="hljs-keyword">except</span> KeyboardInterrupt:
        CONSOLE.log(<span class="hljs-string">"Scan interrupted, exiting..."</span>)
    sys.exit(<span class="hljs-number">0</span>)
</code></pre>
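<p>If you want to see how the command-line options combine without contacting a Kismet server, you can feed a list of arguments straight to the parser. This is a trimmed-down sketch of the parser above; the <em>build_parser</em> function name and the shortened help texts are mine, not part of the repository:</p>

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Recreate the options of kismet_home_alerts.py for experimentation."""
    parser = argparse.ArgumentParser(
        description="Display alerts generated by your local Kismet installation"
    )
    parser.add_argument('--debug', action='store_true', help="Enable debug mode")
    parser.add_argument('--anonymize', action='store_true', help="Anonymize MAC addresses")
    parser.add_argument('--level', default='INFO', help="Minimum alert level to show")
    parser.add_argument('mode', choices=['alert_type', 'alerts'], help="Operation mode")
    return parser


# Same as running: kismet_home_alerts.py --level HIGH --anonymize alerts
args = build_parser().parse_args(['--level', 'HIGH', '--anonymize', 'alerts'])
print(args.mode, args.level, args.anonymize, args.debug)  # alerts HIGH True False
```

<p>Passing a mode that is not listed in <em>choices</em> makes argparse print a usage message and exit, so the script never reaches the Kismet calls with an invalid mode.</p>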
<p>A few things to note:</p>
<ul>
<li><p>This is not a long-running application. Instead, it is a snapshot of all the alerts. If you wanted, for example, to forward these alerts by email or to a framework like <a target="_blank" href="https://grafana.com/">Grafana</a>, you are better off using <a target="_blank" href="https://pypi.org/project/websockets/">WebSockets</a> and one of the methods that retrieves only the last changes.</p>
</li>
<li><p>The layout is crude, and there is plenty of room for improvement. But our little TUI displays relevant information without too many distractions.</p>
</li>
<li><p>And it was fun to code!</p>
</li>
</ul>
<h1 id="heading-what-did-we-learn">What did we learn?</h1>
<ul>
<li><p>How to install Kismet and secure it with a self-signed SSL certificate</p>
</li>
<li><p>How to write a simple Bash script to set up the correct wireless interface in monitor mode after the Raspberry Pi reboots</p>
</li>
<li><p>How to add an API key with read-only access and use it instead of the legacy user/password scheme for authentication and authorization</p>
</li>
<li><p>How to write classes in Python that can communicate with Kismet using its REST API</p>
</li>
<li><p>How to add unit and integration tests to the code to make sure everything works and new code changes do not break existing functionality</p>
</li>
</ul>
<p>Please leave your comments on the <a target="_blank" href="https://github.com/josevnz/kismet_home">git repository</a> and report any bugs. More importantly, get Kismet, get the code for this tutorial, and start securing your home wireless infrastructure in no time.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Enhance Nmap with Python ]]>
                </title>
                <description>
                    <![CDATA[ Very few pieces of Open Source software generate so much hype as Nmap. It is one of those tools that packs in so many useful features that it can help you make your systems more secure by just running it with a few flags. Nmap ("Network Mapper") is a... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/enhance-nmap-with-python/</link>
                <guid isPermaLink="false">66d8513be33373caf28c5ff4</guid>
                
                    <category>
                        <![CDATA[ computer networking ]]>
                    </category>
                
                    <category>
                        <![CDATA[ cybersecurity ]]>
                    </category>
                
                    <category>
                        <![CDATA[ nmap ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Jose Vicente Nunez ]]>
                </dc:creator>
                <pubDate>Tue, 08 Feb 2022 19:28:43 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2022/02/home_nmap.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Very few pieces of Open Source software generate <a target="_blank" href="https://nmap.org/movies/">so much hype</a> as <a target="_blank" href="https://nmap.org/">Nmap</a>. It is one of those tools that packs in so many useful features that it can help you make your systems more secure by just running it with a few flags.</p>
<p>Nmap ("Network Mapper") is a free and open source utility for network discovery and security auditing.</p>
<p>Many systems and network administrators also find it useful for tasks such as network inventory, managing service upgrade schedules, and monitoring host or service uptime.</p>
<p>You can also use it to bypass weak protections, find hidden or misconfigured services, or just to give you a better understanding of how networks work.</p>
<h2 id="heading-table-of-contents">Table of contents:</h2>
<ul>
<li><p><a class="post-section-overview" href="#heading-what-you-will-learn-from-this-article">What you will learn from this article</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-nmap-101-identify-all-the-public-services-in-our-network">Nmap 101: Identify all the public services in our network</a></p>
</li>
<li><p><a class="post-section-overview" href="#how-to-write-an-easy-button-network-ccanner-that-uses-nmap">How to Write an 'easy button' Network Scanner that Uses Nmap</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-make-a-home-network-scanner-a-web-service">How to Make a Home Network Scanner a Web Service</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-did-we-learn">What did we learn?</a></p>
</li>
</ul>
<h2 id="heading-what-you-will-learn-from-this-article">What you will learn from this article</h2>
<p>We will cover the following to illustrate how you can enhance Nmap with Python:</p>
<ul>
<li><p>Write a small script that can scan all the hosts on the local network, making sure it runs with the proper privileges.</p>
</li>
<li><p>Enhance Nmap by correlating services with security advisories.</p>
</li>
<li><p>Convert our scripts into a web service. We will add basic security (authorization and encryption).</p>
</li>
</ul>
<h3 id="heading-things-you-should-know-and-do-before-starting">Things you should know and do before starting</h3>
<p>Don't worry too much, as I will guide you through the steps. This will be a fun experience, and you'll have all the source code to follow along:</p>
<ul>
<li><p>Be familiar with basic network concepts like <a target="_blank" href="https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing">Classless inter-domain routing (CIDR)</a></p>
</li>
<li><p>Be able to write a program in a scripting language like <a target="_blank" href="https://www.python.org/">Python</a>.</p>
</li>
<li><p>The code from this tutorial can be installed using a virtual environment. If you are not familiar with virtual environments, you can read the following: <a target="_blank" href="https://www.redhat.com/sysadmin/packaging-applications-python">Packaging applications to install on other machines with Python</a>.</p>
</li>
</ul>
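<p>If CIDR notation is new to you, the <em>ipaddress</em> module from the Python standard library is a quick way to explore it. The short sketch below uses 192.168.1.0/24, the network scanned later in this article; replace it with your own network:</p>

```python
import ipaddress

# A /24 network reserves 24 bits for the network and 8 bits for the hosts
network = ipaddress.ip_network("192.168.1.0/24")
hosts = list(network.hosts())  # excludes the network and broadcast addresses

print(network.netmask)      # 255.255.255.0
print(len(hosts))           # 254
print(hosts[0], hosts[-1])  # 192.168.1.1 192.168.1.254
```

<p>Those 254 usable addresses are the hosts that Nmap will probe when it is given the whole /24 network.</p>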
<h3 id="heading-what-tools-you-will-need-for-this-tutorial">What tools will you need for this tutorial?</h3>
<p>I won't cover the installation of any of these tools, but there is plenty of documentation out there to get you started.</p>
<ul>
<li><p>Install the code from this tutorial by following the instructions in the README file of the <a target="_blank" href="https://github.com/josevnz/home_nmap">official GitHub repository</a>. You will need to <a target="_blank" href="https://git-scm.com/book/en/v2/Getting-Started-Installing-Git">install</a> <a target="_blank" href="https://git-scm.com/docs/gittutorial">Git</a> to clone the code.</p>
</li>
<li><p>A Linux distribution. Fedora, Ubuntu, Kali, feel free to use the one you feel most comfortable with (I used <a target="_blank" href="https://docs.fedoraproject.org/en-US/fedora/rawhide/install-guide/">Fedora</a> 35.)</p>
</li>
<li><p><a target="_blank" href="https://developer.fedoraproject.org/tech/languages/python/python-installation.html">Python interpreter</a>. A good Linux distribution will come with Python pre-installed or at least will make it easier for you to install. I used Python 3.9 here.</p>
</li>
</ul>
<p>Last <em>two things</em>:</p>
<ul>
<li><p>I skipped <strong>some</strong> imports in the code snippets as they do not enhance the code demonstrations. To get the most accurate code, please do clone the public Git repository for this tutorial and open the source code.</p>
</li>
<li><p><em>Only run these examples against your local network</em>. You can be curious, have fun, and learn new things about existing tools without affecting others.</p>
</li>
</ul>
<p>Hacking is about learning!</p>
<h1 id="heading-nmap-101-identify-all-the-public-services-in-our-network">Nmap 101: Identify all the public services in our network</h1>
<h3 id="heading-word-of-caution-the-saying-better-ask-for-forgiveness-than-permission-doesnt-apply-here">Word of caution: The saying '<em>better ask for forgiveness than permission</em>' doesn't apply here</h3>
<p>We do not care about being stealthy or triggering an <a target="_blank" href="https://en.wikipedia.org/wiki/Intrusion_detection_system">Intrusion Detection System (IDS)</a> with our port scanning activity. An IDS normally looks for abnormal network patterns, and if it sees a machine opening and closing ports in rapid succession across many hosts, it will consider that a port scan attack. That won't be a problem in our home network because, well, we know it is us running the scan.</p>
<p>For the same reason, you should not launch a port scan on a network you don't own: Nmap is not 100% stealthy (you can randomize the frequency, the type of TCP handshake, and the number of ports opened, use a proxy, and so on, and you will most likely still miss something).</p>
<p>So better behave, OK? :-)</p>
<h3 id="heading-what-do-we-need-to-run-nmap-and-os-fingerprinting">What do we need to run Nmap and OS fingerprinting?</h3>
<p>The goal here is to see what services are running in our network using a command line interface (CLI) script.</p>
<p>Nmap requires elevated privileges to do the OS fingerprinting and scans using raw sockets. You will need to run the commands as root or <a target="_blank" href="https://www.sudo.ws/">su "do" (SUDO)</a> to elevate your permissions. A SUDO rule to do this is similar to this (file /etc/sudoers):</p>
<pre><code class="lang-shell">## Same thing without a password
%wheel    ALL=(ALL)    NOPASSWD: ALL
</code></pre>
<p>This means that anyone in the 'wheel' Unix group can run commands as root:</p>
<pre><code class="lang-shell">(2600) [josevnz@dmaf5 2600]$ grep wheel /etc/group
wheel:x:10:josevnz,services

# To confirm we can run commands as root
(2600) [josevnz@dmaf5 2600]$ sudo -l
Matching Defaults entries for josevnz on dmaf5:
    !visiblepw, always_set_home, match_group_by_gid, always_query_group_plugin, env_reset, env_keep="COLORS DISPLAY HOSTNAME HISTSIZE KDEDIR LS_COLORS",
    env_keep+="MAIL QTDIR USERNAME LANG LC_ADDRESS LC_CTYPE", env_keep+="LC_COLLATE LC_IDENTIFICATION LC_MEASUREMENT LC_MESSAGES", env_keep+="LC_MONETARY
    LC_NAME LC_NUMERIC LC_PAPER LC_TELEPHONE", env_keep+="LC_TIME LC_ALL LANGUAGE LINGUAS _XKB_CHARSET XAUTHORITY",
    secure_path=/usr/local/sbin\:/usr/local/bin\:/usr/sbin\:/usr/bin\:/sbin\:/bin\:/var/lib/snapd/snap/bin

User josevnz may run the following commands on dmaf5:
    (ALL) NOPASSWD: ALL
</code></pre>
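<p>Since the scan depends on elevated privileges, a script that wraps Nmap can check for them up front instead of failing halfway through. The following is a minimal sketch, assuming a Unix-like system; the function name <em>has_root_privileges</em> is hypothetical:</p>

```python
import os
import sys


def has_root_privileges() -> bool:
    """Return True when the effective user ID is 0, as it is for root or under sudo."""
    return os.geteuid() == 0


if not has_root_privileges():
    # Warn early instead of letting the OS fingerprinting step fail later
    print("OS fingerprinting needs root privileges, try again with sudo", file=sys.stderr)
```

<p>Note that <em>os.geteuid</em> is only available on Unix, which matches the Linux setup used in this tutorial.</p>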
<p>Next we'll do a quick scan of our local network (192.168.1.0/24 in this example). I used the -v (verbose) flag to get progress feedback while scanning all the ports (-p-) and doing OS fingerprinting (-O).</p>
<p>I saved the execution of the Nmap run to an XML file (-oX), which Nmap can use to resume execution if it gets interrupted (--resume):</p>
<pre><code class="lang-shell"># In case the scan is interrupted: nmap --resume $HOME/home_scan.xml
[josevnz@dmaf5 docs]$ sudo nmap -v -n -p- -sT -sV -O --osscan-limit --max-os-tries 1 -oX $HOME/home_scan.xml 192.168.1.0/24
Starting Nmap 7.80 ( https://nmap.org ) at 2021-12-30 16:35 EST
NSE: Loaded 45 scripts for scanning.
Initiating ARP Ping Scan at 16:35
Scanning 254 hosts [1 port/host]
...
# After a while and several cups of Venezuelan coffee...
Network Distance: 1 hop
TCP Sequence Prediction: Difficulty=265 (Good luck!)
IP ID Sequence Generation: All zeros

Nmap scan report for 192.168.1.20
Host is up (0.0097s latency).
Not shown: 65530 closed ports
PORT      STATE    SERVICE      VERSION
36184/tcp filtered unknown
37309/tcp filtered unknown
49323/tcp open     unknown
49376/tcp filtered unknown
62078/tcp open     iphone-sync?
MAC Address: 9E:90:75:3A:D7:XX (Unknown)
...
</code></pre>
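<p>When the same scan has to be launched from Python, it helps to build the command as a list of arguments and to validate the network before anything runs. The sketch below only assembles the command line shown above and does not execute it; the function name <em>build_nmap_command</em> and the output file name are assumptions made for the example:</p>

```python
import ipaddress
import shlex
from typing import List


def build_nmap_command(network: str, output_file: str) -> List[str]:
    """Assemble the scan command used in this article as an argument list."""
    # Raises ValueError when the network is not valid CIDR notation
    ipaddress.ip_network(network)
    return [
        "sudo", "nmap", "-v", "-n", "-p-", "-sT", "-sV", "-O",
        "--osscan-limit", "--max-os-tries", "1",
        "-oX", output_file, network,
    ]


command = build_nmap_command("192.168.1.0/24", "home_scan.xml")
# Show the command as it would be typed in a shell
print(shlex.join(command))
```

<p>A list like this can be handed to <em>subprocess.run</em> without going through a shell, which avoids quoting problems with the arguments.</p>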
<p>The resulting <a target="_blank" href="https://nmap.org/book/output-formats-xml-output.html">XML format file</a> is very verbose:</p>
<pre><code class="lang-xml"><span class="hljs-tag">&lt;<span class="hljs-name">host</span> <span class="hljs-attr">starttime</span>=<span class="hljs-string">"1640901327"</span> <span class="hljs-attr">endtime</span>=<span class="hljs-string">"1640902555"</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">status</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"up"</span> <span class="hljs-attr">reason</span>=<span class="hljs-string">"arp-response"</span> <span class="hljs-attr">reason_ttl</span>=<span class="hljs-string">"0"</span>/&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">address</span> <span class="hljs-attr">addr</span>=<span class="hljs-string">"192.168.1.1"</span> <span class="hljs-attr">addrtype</span>=<span class="hljs-string">"ipv4"</span>/&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">address</span> <span class="hljs-attr">addr</span>=<span class="hljs-string">"38:5B:5E:1D:52:99"</span> <span class="hljs-attr">addrtype</span>=<span class="hljs-string">"mac"</span>/&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">hostnames</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">hostnames</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">ports</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">extraports</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"closed"</span> <span class="hljs-attr">count</span>=<span class="hljs-string">"65523"</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">extrareasons</span> <span class="hljs-attr">reason</span>=<span class="hljs-string">"conn-refused"</span> <span class="hljs-attr">count</span>=<span class="hljs-string">"65523"</span>/&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">extraports</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">port</span> <span class="hljs-attr">protocol</span>=<span class="hljs-string">"tcp"</span> <span class="hljs-attr">portid</span>=<span class="hljs-string">"139"</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">state</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"open"</span> <span class="hljs-attr">reason</span>=<span class="hljs-string">"syn-ack"</span> <span class="hljs-attr">reason_ttl</span>=<span class="hljs-string">"0"</span>/&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">service</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"netbios-ssn"</span> <span class="hljs-attr">product</span>=<span class="hljs-string">"Samba smbd"</span> <span class="hljs-attr">version</span>=<span class="hljs-string">"3.X - 4.X"</span> <span class="hljs-attr">extrainfo</span>=<span class="hljs-string">"workgroup: ZZZ"</span> <span class="hljs-attr">method</span>=<span class="hljs-string">"probed"</span> <span class="hljs-attr">conf</span>=<span class="hljs-string">"10"</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">cpe</span>&gt;</span>cpe:/a:samba:samba<span class="hljs-tag">&lt;/<span class="hljs-name">cpe</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">service</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">port</span>&gt;</span>
    ...
</code></pre>
<p>Time to do some coding. Parsing data in many formats is one of Python's strengths. Data is extracted and normalized for all the ports that are not 'closed' using <a target="_blank" href="https://github.com/lxml/lxml">lxml</a>:</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">OutputParser</span>:</span>
    <span class="hljs-string">"""
    Parse Nmap raw XML output
    """</span>

<span class="hljs-meta">    @staticmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">parse_nmap_xml</span>(<span class="hljs-params">xml: str</span>) -&gt; (str, Any):</span>
        <span class="hljs-string">"""
        Parse XML and return details for the scanned ports
        @param xml:
        @return: tuple nmaps arguments, port details
        """</span>
        parsed_data = []
        root = ElementTree.fromstring(xml)
        nmap_args = root.attrib[<span class="hljs-string">'args'</span>]
        <span class="hljs-keyword">for</span> host <span class="hljs-keyword">in</span> root.findall(<span class="hljs-string">'host'</span>):
            <span class="hljs-keyword">for</span> address <span class="hljs-keyword">in</span> host.findall(<span class="hljs-string">'address'</span>):
                curr_address = address.attrib[<span class="hljs-string">'addr'</span>]
                data = {
                    <span class="hljs-string">'address'</span>: curr_address,
                    <span class="hljs-string">'ports'</span>: []
                }
                states = host.findall(<span class="hljs-string">'ports/port/state'</span>)
                ports = host.findall(<span class="hljs-string">'ports/port'</span>)
                <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(len(ports)):
                    <span class="hljs-keyword">if</span> states[i].attrib[<span class="hljs-string">'state'</span>] == <span class="hljs-string">'closed'</span>:
                        <span class="hljs-keyword">continue</span>  <span class="hljs-comment"># Skip closed ports</span>
                    port_id = ports[i].attrib[<span class="hljs-string">'portid'</span>]
                    protocol = ports[i].attrib[<span class="hljs-string">'protocol'</span>]
                    services = ports[i].findall(<span class="hljs-string">'service'</span>)
                    cpe_list = []
                    service_name = <span class="hljs-string">""</span>
                    service_product = <span class="hljs-string">""</span>
                    service_version = <span class="hljs-string">""</span>
                    <span class="hljs-keyword">for</span> service <span class="hljs-keyword">in</span> services:
                        <span class="hljs-keyword">for</span> key <span class="hljs-keyword">in</span> [<span class="hljs-string">'name'</span>, <span class="hljs-string">'product'</span>, <span class="hljs-string">'version'</span>]:
                            <span class="hljs-keyword">if</span> key <span class="hljs-keyword">in</span> service.attrib:
                                <span class="hljs-keyword">if</span> key == <span class="hljs-string">'name'</span>:
                                    service_name = service.attrib[<span class="hljs-string">'name'</span>]
                                <span class="hljs-keyword">elif</span> key == <span class="hljs-string">'product'</span>:
                                    service_product = service.attrib[<span class="hljs-string">'product'</span>]
                                <span class="hljs-keyword">elif</span> key == <span class="hljs-string">'version'</span>:
                                    service_version = service.attrib[<span class="hljs-string">'version'</span>]
                        cpes = service.findall(<span class="hljs-string">'cpe'</span>)
                        <span class="hljs-keyword">for</span> cpe <span class="hljs-keyword">in</span> cpes:
                            cpe_list.append(cpe.text)
                        data[<span class="hljs-string">'ports'</span>].append({
                            <span class="hljs-string">'port_id'</span>: port_id,
                            <span class="hljs-string">'protocol'</span>: protocol,
                            <span class="hljs-string">'service_name'</span>: service_name,
                            <span class="hljs-string">'service_product'</span>: service_product,
                            <span class="hljs-string">'service_version'</span>: service_version,
                            <span class="hljs-string">'cpes'</span>: cpe_list
                        })
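                <span class="hljs-comment"># Append each address once, after all of its ports were processed</span>
                parsed_data.append(data)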
        <span class="hljs-keyword">return</span> nmap_args, parsed_data
</code></pre>
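<p>To see the filtering idea in isolation, here is a minimal sketch that builds a tiny, made-up Nmap-like report in memory and keeps only the ports that are not 'closed'. The helper name <em>open_ports</em> and the sample data are assumptions for illustration, not part of the project:</p>

```python
from xml.etree import ElementTree as ET

# Build a tiny, hypothetical Nmap-like report in memory (illustration only)
root = ET.Element("nmaprun", args="nmap -sV -oX - 192.168.1.1")
host = ET.SubElement(root, "host")
ET.SubElement(host, "address", addr="192.168.1.1", addrtype="ipv4")
ports = ET.SubElement(host, "ports")
for port_id, state in (("22", "open"), ("23", "closed")):
    port = ET.SubElement(ports, "port", protocol="tcp", portid=port_id)
    ET.SubElement(port, "state", state=state)
sample_xml = ET.tostring(root, encoding="unicode")


def open_ports(xml_text):
    """Return (address, protocol, port id) for every port that is not closed."""
    parsed = ET.fromstring(xml_text)
    found = []
    for scanned_host in parsed.findall("host"):
        address = scanned_host.find("address").attrib["addr"]
        for scanned_port in scanned_host.findall("ports/port"):
            if scanned_port.find("state").attrib["state"] == "closed":
                continue  # Skip closed ports
            found.append((address, scanned_port.attrib["protocol"], scanned_port.attrib["portid"]))
    return found


print(open_ports(sample_xml))
```

<p>Reading the state from each port element, instead of keeping two parallel lists, avoids any chance of the indexes drifting apart.</p>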
<p>Once the data is collected we can create a nice table in the terminal with the help of <a target="_blank" href="https://github.com/Textualize/rich">Rich</a>.</p>
<p>The table has the following columns:</p>
<ul>
<li><p>Internet Protocol (IP) address</p>
</li>
<li><p>Protocol: In this script it will always be Transmission Control Protocol (TCP)</p>
</li>
<li><p>Port ID: The port number where the service runs</p>
</li>
<li><p>Service: A networked service like Secure Shell (SSH)</p>
</li>
<li><p>Common Platform Enumeration (<a target="_blank" href="https://nvd.nist.gov/products/cpe">CPE</a>): A structured naming scheme for information technology systems, software, and packages.</p>
</li>
<li><p>Advisories: Any vulnerability related to the CPE identified by Nmap. We will need to correlate those ourselves.</p>
</li>
</ul>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_scan_table</span>(<span class="hljs-params">*, cli: str</span>) -&gt; Table:</span>
    <span class="hljs-string">"""
    Create a table for the CLI UI
    :param cli: Full Nmap arguments used on the run
    :return: Skeleton table, no data
    """</span>
    nmap_table = Table(title=<span class="hljs-string">f"NMAP run info: <span class="hljs-subst">{cli}</span>"</span>)
    nmap_table.add_column(<span class="hljs-string">"IP"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"cyan"</span>, no_wrap=<span class="hljs-literal">True</span>)
    nmap_table.add_column(<span class="hljs-string">"Protocol"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"cyan"</span>, no_wrap=<span class="hljs-literal">True</span>)
    nmap_table.add_column(<span class="hljs-string">"Port ID"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"magenta"</span>, no_wrap=<span class="hljs-literal">True</span>)
    nmap_table.add_column(<span class="hljs-string">"Service"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"green"</span>)
    nmap_table.add_column(<span class="hljs-string">"CPE"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"blue"</span>)
    nmap_table.add_column(<span class="hljs-string">"Advisories"</span>, justify=<span class="hljs-string">"right"</span>, style=<span class="hljs-string">"blue"</span>)
    <span class="hljs-keyword">return</span> nmap_table
...
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fill_simple_table</span>(<span class="hljs-params">*, exec_data: str, parsed_xml: Dict[Any, Any]</span>) -&gt; Table:</span>
    <span class="hljs-string">"""
    Convenience method to create a simple UI table with Nmap XML output
    :param exec_data: Arguments and options used to run Nmap
    :param parsed_xml: Nmap data as a dictionary
    :return: Populated table
    """</span>
    nmap_table = create_scan_table(cli=exec_data)
    <span class="hljs-keyword">for</span> row_data <span class="hljs-keyword">in</span> parsed_xml:
        address = row_data[<span class="hljs-string">'address'</span>]
        ports = row_data[<span class="hljs-string">'ports'</span>]
        <span class="hljs-keyword">for</span> port_data <span class="hljs-keyword">in</span> ports:
            nmap_table.add_row(
                address,
                port_data[<span class="hljs-string">'protocol'</span>],
                port_data[<span class="hljs-string">'port_id'</span>],
                <span class="hljs-string">f"<span class="hljs-subst">{port_data[<span class="hljs-string">'service_name'</span>]}</span> <span class="hljs-subst">{port_data[<span class="hljs-string">'service_product'</span>]}</span> <span class="hljs-subst">{port_data[<span class="hljs-string">'service_version'</span>]}</span>"</span>,
                <span class="hljs-string">"\n"</span>.join(port_data[<span class="hljs-string">'cpes'</span>]),
                <span class="hljs-string">""</span>
            )
    <span class="hljs-keyword">return</span> nmap_table
</code></pre>
<p>The resulting script uses the code above to give the user the whole picture about the local network scan:</p>
<pre><code class="lang-python"><span class="hljs-comment">#!/usr/bin/env python</span>
<span class="hljs-keyword">import</span> sys
<span class="hljs-keyword">from</span> rich.console <span class="hljs-keyword">import</span> Console
<span class="hljs-keyword">from</span> home_nmap.query <span class="hljs-keyword">import</span> OutputParser
<span class="hljs-keyword">from</span> home_nmap.ui <span class="hljs-keyword">import</span> fill_simple_table

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    console = Console()
    <span class="hljs-keyword">for</span> nmap_xml <span class="hljs-keyword">in</span> sys.argv[<span class="hljs-number">1</span>:]:
        <span class="hljs-keyword">with</span> open(nmap_xml, <span class="hljs-string">'r'</span>) <span class="hljs-keyword">as</span> xml:
            xml_data = xml.read()
            rundata, parsed = OutputParser.parse_nmap_xml(xml_data)
            nmap_table = fill_simple_table(exec_data=rundata, parsed_xml=parsed)
            console.print(nmap_table)
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/02/nmap_scan_rpt_noadvisories.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Scan for local network. Advisories column is empty</em></p>
<p>As you may have noticed, the 'Advisories' column is completely empty. We'll use the <a target="_blank" href="https://www.nist.gov/cybersecurity">NIST cybersecurity website search engine</a> to populate the missing advisories, skipping any CPE that has no <em>version information</em> to avoid false positives.</p>
<p>We use <a target="_blank" href="https://github.com/psf/requests">requests</a> to help us with the HTTP communication:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> dataclasses <span class="hljs-keyword">import</span> dataclass
<span class="hljs-keyword">import</span> requests
IGNORED_CPES = {<span class="hljs-string">"cpe:/o:linux:linux_kernel"</span>}
<span class="hljs-keyword">from</span> cpe <span class="hljs-keyword">import</span> CPE
<span class="hljs-keyword">from</span> lxml <span class="hljs-keyword">import</span> html

<span class="hljs-meta">@dataclass</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">NIDS</span>:</span>
    summary: str
    link: str
    score: str

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">NDISHtml</span>:</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-string">"""
        Some CPE return too many false positives,
        so they are ignored right off the bat
        """</span>
        self.raw_html = <span class="hljs-literal">None</span>
        self.parsed_results = []
        self.url = <span class="hljs-string">"https://nvd.nist.gov/vuln/search/results"</span>
        self.ignored_cpes = IGNORED_CPES

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get</span>(<span class="hljs-params">self, cpe: str</span>) -&gt; str:</span>
        <span class="hljs-string">"""
        Run a CPE search on the NDIS website. If the CPE has no version then skip the search
        as it will return too many false positives
        @param cpe: CPE identifier coming from Nmap, like cpe:/a:openbsd:openssh:8.0
        @return:
        """</span>
        params = {
            <span class="hljs-string">'form_type'</span>: <span class="hljs-string">'Basic'</span>,
            <span class="hljs-string">'results_type'</span>: <span class="hljs-string">'overview'</span>,
            <span class="hljs-string">'search_type'</span>: <span class="hljs-string">'all'</span>,
            <span class="hljs-string">'isCpeNameSearch'</span>: <span class="hljs-string">'false'</span>,
            <span class="hljs-string">'query'</span>: cpe
        }
        <span class="hljs-keyword">if</span> cpe <span class="hljs-keyword">in</span> self.ignored_cpes:
            <span class="hljs-keyword">return</span> <span class="hljs-string">""</span>
        valid_cpe = CPE(cpe)
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> valid_cpe.get_version()[<span class="hljs-number">0</span>]:
            <span class="hljs-keyword">return</span> <span class="hljs-string">""</span>
        response = requests.get(
            url=self.url,
            params=params
        )
        response.raise_for_status()
        <span class="hljs-keyword">return</span> response.text

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">parse</span>(<span class="hljs-params">self, html_data: str</span>) -&gt; list[NIDS]:</span>
        <span class="hljs-string">"""
        Parse NDIS web search. Not aware that they offer a REST API that doesn't require parsing.
        It is assumed that this method is never called directly by end users, so no further checks are done on the
        HTML file contents.
        @param html_data: RAW HTML used for scraping
        @return: List of NDIS, if any
        """</span>
        self.parsed_results = []
        <span class="hljs-keyword">if</span> html_data:
            ndis_html = html.fromstring(html_data)
            <span class="hljs-comment"># 1:1 match between 3 elements, use parallel array</span>
            summary = ndis_html.xpath(<span class="hljs-string">"//*[contains(@data-testid, 'vuln-summary')]"</span>)
            cve = ndis_html.xpath(<span class="hljs-string">"//*[contains(@data-testid, 'vuln-detail-link')]"</span>)
            score = ndis_html.xpath(<span class="hljs-string">"//*[contains(@data-testid, 'vuln-cvss2-link')]"</span>)
            <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(len(summary)):
                ndis = NIDS(
                    summary=summary[i].text,
                    link=<span class="hljs-string">"https://nvd.nist.gov/vuln/detail/"</span> + cve[i].text,
                    score=score[i].text
                )
                self.parsed_results.append(ndis)
        <span class="hljs-keyword">return</span> self.parsed_results
</code></pre>
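<p>The version check above relies on the third-party <em>cpe</em> library. To illustrate just the idea, here is a rough standard-library sketch that only understands the CPE 2.2 URI form that Nmap prints. The helper name <em>cpe_has_version</em> is hypothetical, and this is not a replacement for a real CPE parser:</p>

```python
def cpe_has_version(cpe_uri):
    """
    Rough check for a version field in a CPE 2.2 URI.
    Assumed layout: cpe:/{part}:{vendor}:{product}:{version}:...
    """
    fields = cpe_uri.split(":")
    # The version, when present, is the fifth colon-separated field
    version = fields[4:5]
    return bool(version and version[0])


print(cpe_has_version("cpe:/a:openbsd:openssh:8.0"))  # Versioned, worth searching
print(cpe_has_version("cpe:/o:linux:linux_kernel"))   # No version, skip the search
```

<p>A CPE without a version matches every advisory ever published for that product, which is why those searches are skipped.</p>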
<p>Then we correlate each CPE in the Nmap results with its advisories, if any:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Any
<span class="hljs-keyword">from</span> dataclasses <span class="hljs-keyword">import</span> dataclass
<span class="hljs-meta">@dataclass</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">NIDS</span>:</span>
    summary: str
    link: str
    score: str
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">NDISHtml</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">correlate_nmap_with_nids</span>(<span class="hljs-params">self, parsed_xml: Any</span>) -&gt; dict[str, list[NIDS]]:</span>
        correlated_cpe = {}
        <span class="hljs-keyword">for</span> row_data <span class="hljs-keyword">in</span> parsed_xml:
            ports = row_data[<span class="hljs-string">'ports'</span>]
            <span class="hljs-keyword">for</span> port_data <span class="hljs-keyword">in</span> ports:
                <span class="hljs-keyword">for</span> cpe <span class="hljs-keyword">in</span> port_data[<span class="hljs-string">'cpes'</span>]:
                    raw_ndis = self.get(cpe)
                    cpes = self.parse(raw_ndis)
                    correlated_cpe[cpe] = cpes
        <span class="hljs-keyword">return</span> correlated_cpe
</code></pre>
<p>The new table speaks for itself:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/02/nmap_scan_rpt.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Nmap scan results on a nice table</em></p>
<p>More complete, and we can see now that a few of our local services may have a vulnerability!</p>
<p>Can we do better? For example, it would be nice to be able to run Nmap directly from Python instead of parsing the results of a run, so let's code that.</p>
<h1 id="heading-how-to-write-an-easy-button-network-scanner-that-uses-nmap">How to Write an 'easy button' Network Scanner that Uses Nmap</h1>
<h2 id="heading-how-to-wrap-nmap-with-python-subprocessrun">How to wrap Nmap with Python (subprocess.run)</h2>
<p>Nmap doesn't offer a formal API to interact with external programs. For that reason we will run it from Python and capture its XML output. We can then use the data any way we want (see the 'subprocess.run' call in method 'scan' from our class NMapRunner):</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">NMapRunner</span>:</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-string">"""
        Create a Nmap executor
        """</span>
        self.nmap_report_file = <span class="hljs-literal">None</span>
        found_sudo = shutil.which(<span class="hljs-string">'sudo'</span>, mode=os.F_OK | os.X_OK)
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> found_sudo:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"SUDO is missing"</span>)
        self.sudo = found_sudo
        found_nmap = shutil.which(<span class="hljs-string">'nmap'</span>, mode=os.F_OK | os.X_OK)
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> found_nmap:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"NMAP is missing"</span>)
        self.nmap = found_nmap

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">scan</span>(<span class="hljs-params">
            self,
            *,
            hosts: str,
            sudo: bool = True
    </span>):</span>
        command = []
        <span class="hljs-keyword">if</span> sudo:
            command.append(self.sudo)
        command.append(self.nmap)
        command.extend(__NMAP__FLAGS__)
        command.append(hosts)
        completed = subprocess.run(
            command,
            capture_output=<span class="hljs-literal">True</span>,
            shell=<span class="hljs-literal">False</span>,
            check=<span class="hljs-literal">True</span>
        )
        completed.check_returncode()
        args, data = OutputParser.parse_nmap_xml(completed.stdout.decode(<span class="hljs-string">'utf-8'</span>))
        <span class="hljs-keyword">return</span> args, data, completed.stderr
</code></pre>
<p><em>Security note</em>: The named argument 'shell=False' tells subprocess.run not to spawn a shell to run our process, so the arguments are never interpreted by a shell. This provides protection against <a target="_blank" href="https://en.wikipedia.org/wiki/Code_injection#Shell_injection">shell injection</a> attacks.</p>
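<p>A small experiment makes the point. In this sketch a child Python interpreter stands in for Nmap, and the 'hostile' target string is made up for illustration. Because the command is a list and no shell is involved, the semicolon reaches the child process as plain text:</p>

```python
import subprocess
import sys

# A made-up target that would run a second command if a shell parsed it
hostile_target = "127.0.0.1; echo injected"

# The child interpreter just prints its first argument back to us
command = [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile_target]
completed = subprocess.run(command, capture_output=True, shell=False, check=True)
echoed = completed.stdout.decode("utf-8").strip()

# The whole string arrives as a single argument, nothing was executed by a shell
print(echoed)
```

<p>Nmap would still reject such a target as invalid, but the extra command never runs.</p>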
<h2 id="heading-how-to-speed-up-nmap-remember-all-these-flags-in-a-single-place">How to Speed up Nmap (remember all these flags in a single place)</h2>
<p>Your local network has less latency than the Internet. It will also most likely be easier to scan for open ports and OS fingerprinting because there is no firewall between you and the hosts.</p>
<p>Additionally, we are not concerned about triggering an intrusion detection system (IDS), so we can use the following flags to reduce the time required to complete the port scan (variable <strong>__NMAP__FLAGS__</strong> in the system package):</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> shlex
<span class="hljs-comment"># Convert the args for proper usage on the CLI</span>
NMAP_HOME_NETWORK_DEFAULT_FLAGS = {
    <span class="hljs-string">'-n'</span>: <span class="hljs-string">'Never do DNS resolution'</span>,
    <span class="hljs-string">'-sS'</span>: <span class="hljs-string">'TCP SYN scan, recommended'</span>,
    <span class="hljs-string">'-p-'</span>: <span class="hljs-string">'All ports'</span>,
    <span class="hljs-string">'-sV'</span>: <span class="hljs-string">'Probe open ports to determine service/version info'</span>,
    <span class="hljs-string">'-O'</span>: <span class="hljs-string">'OS Probe. Requires sudo/ root'</span>,
    <span class="hljs-string">'-T4'</span>: <span class="hljs-string">'Aggressive timing template'</span>,
    <span class="hljs-string">'-PE'</span>: <span class="hljs-string">'Enable this echo request behavior. Good for internal networks'</span>,
    <span class="hljs-string">'--version-intensity 5'</span>: <span class="hljs-string">'Set version scan intensity. Default is 7'</span>,
    <span class="hljs-string">'--disable-arp-ping'</span>: <span class="hljs-string">'No ARP or ND Ping'</span>,
    <span class="hljs-string">'--max-hostgroup 20'</span>: <span class="hljs-string">'Hostgroup (batch of hosts scanned concurrently) size'</span>,
    <span class="hljs-string">'--min-parallelism 10'</span>: <span class="hljs-string">'Number of probes that may be outstanding for a host group'</span>,
    <span class="hljs-string">'--osscan-limit'</span>: <span class="hljs-string">'Limit OS detection to promising targets'</span>,
    <span class="hljs-string">'--max-os-tries 1'</span>: <span class="hljs-string">'Maximum number of OS detection tries against a target'</span>,
    <span class="hljs-string">'-oX -'</span>: <span class="hljs-string">'Send XML output to STDOUT, avoid creating a temp file'</span>
}
__NMAP__FLAGS__ = shlex.split(<span class="hljs-string">" "</span>.join(NMAP_HOME_NETWORK_DEFAULT_FLAGS.keys()))
</code></pre>
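<p>The join-then-split step matters because some dictionary keys hold a flag and its value in a single string, while subprocess expects one list element per argument. A quick sketch with a shortened, illustrative copy of the dictionary:</p>

```python
import shlex

# Shortened copy of the flags dictionary, for illustration only
sample_flags = {
    '-n': 'Never do DNS resolution',
    '--version-intensity 5': 'Set version scan intensity',
    '-oX -': 'Send XML output to STDOUT',
}

# Keys like '--version-intensity 5' become two separate arguments
argv_flags = shlex.split(" ".join(sample_flags.keys()))
print(argv_flags)
```

<p>This prints <em>['-n', '--version-intensity', '5', '-oX', '-']</em>, which can be passed to subprocess.run as is.</p>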
<p>The Nmap documentation also suggests that you can split the total host list across several Nmap instances (no more instances than the number of CPUs in the server running the tool) to increase parallelism. But that doesn't come for free: you will need to worry about issues like race conditions and synchronization between the concurrent threads running Nmap.</p>
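<p>Just to give an idea of what the splitting part could look like, here is a sketch that cuts one network into equally sized chunks with the standard ipaddress module. The function name <em>split_targets</em> is hypothetical, and running and synchronizing the Nmap instances is deliberately left out:</p>

```python
import ipaddress


def split_targets(cidr, instances):
    """
    Split a network into equally sized subnets, one per Nmap instance.
    The number of chunks is rounded up to the next power of two.
    """
    network = ipaddress.ip_network(cidr, strict=False)
    # prefixlen_diff=n yields 2**n subnets
    prefixlen_diff = (instances - 1).bit_length()
    return [str(subnet) for subnet in network.subnets(prefixlen_diff=prefixlen_diff)]


print(split_targets("192.168.1.0/24", 4))
```

<p>Each resulting chunk could then be handed to a separate scanner process.</p>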
<p>For now we'll keep it simple and let Nmap take care of any optimizations by providing the flags shown above.</p>
<h2 id="heading-how-to-figure-out-the-local-networks-on-the-machine-where-nmap-runs">How to figure out the local networks on the machine where Nmap runs</h2>
<p>Our Python script can also check which interfaces are up, skip virtual interfaces, and skip the special loopback interface. Luckily, the kernel publishes all the information we need in the /proc/net/dev file:</p>
<pre><code class="lang-shell">(2600) [josevnz@dmaf5 2600]$ cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 18303833  303389    0    0    0     0          0         0 18303833  303389    0    0    0     0       0          0
enp2s0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eno1: 1931173135 3908073    0    1    0     0          0    407486 274206691 3289566    0    0    0     0       0          0
</code></pre>
<p>We can parse it like this (class HostIface, method <strong>__refresh_interfaces__</strong>):</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">HostIface</span>:</span>    
    ...

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__refresh_interfaces__</span>(<span class="hljs-params">self, *, skip_loopback: bool = True, only_alive: bool = True</span>) -&gt; Set[str]:</span>
        <span class="hljs-string">"""
        Alive means an interface that has shown any byte activity since the server is up
        Skips the loopback interface by default
        :param only_alive: Skip interfaces with zero bytes activity
        :param skip_loopback
        :return: Set with interface names
        """</span>
        <span class="hljs-keyword">with</span> open(<span class="hljs-string">'/proc/net/dev'</span>, <span class="hljs-string">'r'</span>) <span class="hljs-keyword">as</span> dev:
            <span class="hljs-keyword">for</span> line <span class="hljs-keyword">in</span> dev:
                tokens = line.split()
                <span class="hljs-keyword">if</span> tokens[<span class="hljs-number">0</span>].find(<span class="hljs-string">":"</span>) != <span class="hljs-number">-1</span>:
                    name = tokens[<span class="hljs-number">0</span>].split(<span class="hljs-string">':'</span>)[<span class="hljs-number">0</span>]
                    <span class="hljs-keyword">if</span> re.search(<span class="hljs-string">'virbr\\d+|docker'</span>, name):
                        <span class="hljs-keyword">continue</span>  <span class="hljs-comment"># Skip virtual interfaces</span>
                    <span class="hljs-keyword">if</span> only_alive <span class="hljs-keyword">and</span> int(tokens[<span class="hljs-number">1</span>].strip()) == <span class="hljs-number">0</span>:
                        <span class="hljs-keyword">continue</span>
                    <span class="hljs-keyword">if</span> skip_loopback <span class="hljs-keyword">and</span> name == <span class="hljs-string">'lo'</span>:
                        <span class="hljs-keyword">continue</span>
                    self.interfaces.add(name)
        <span class="hljs-keyword">return</span> self.interfaces
</code></pre>
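<p>To try the same filtering rules without touching /proc, here is a sketch that runs them against the sample output shown earlier. The function name <em>alive_interfaces</em> is hypothetical. It also splits on the colon rather than on whitespace, which is a small variation that tolerates a byte counter glued to the interface name:</p>

```python
import re

# Sample contents of /proc/net/dev, copied from the output shown earlier
SAMPLE_PROC_NET_DEV = """Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 18303833  303389    0    0    0     0          0         0 18303833  303389    0    0    0     0       0          0
enp2s0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eno1: 1931173135 3908073    0    1    0     0          0    407486 274206691 3289566    0    0    0     0       0          0
"""


def alive_interfaces(text, skip_loopback=True):
    """Return the names of interfaces that received at least one byte."""
    names = set()
    for line in text.splitlines():
        if ":" not in line:
            continue  # The two header lines have no colon
        name, counters = line.split(":", 1)
        name = name.strip()
        received_bytes = int(counters.split()[0])
        if re.search(r"virbr\d+|docker", name):
            continue  # Skip virtual interfaces
        if received_bytes == 0:
            continue  # No traffic since boot, treat it as not alive
        if skip_loopback and name == "lo":
            continue
        names.add(name)
    return names


print(alive_interfaces(SAMPLE_PROC_NET_DEV))
```

<p>With the sample above only <em>eno1</em> survives: <em>lo</em> is the loopback and <em>enp2s0</em> never saw any traffic.</p>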
<p>The class HostIface gets the IP address and network mask of each local interface using <a target="_blank" href="https://docs.python.org/3/howto/sockets.html">socket programming</a>. Then it builds the list of local networks from these IP address and netmask combinations:</p>
<pre><code class="lang-python">SIOCGIFADDR = <span class="hljs-number">0x8915</span>
SIOCGIFNETMASK = <span class="hljs-number">0x891B</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">HostIface</span>:</span>
<span class="hljs-meta">    @staticmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_iface_details</span>(<span class="hljs-params">iface: str</span>):</span>
        <span class="hljs-string">"""
        Get network interface IP using the network interface name
        :return: IP address and network mask
        :param iface: Interface name (like eth0, enp2s0, etc.)
        """</span>
        <span class="hljs-keyword">with</span> socket.socket(socket.AF_INET, socket.SOCK_DGRAM) <span class="hljs-keyword">as</span> s:
            iface_pack = struct.pack(<span class="hljs-string">'256s'</span>, bytes(iface, <span class="hljs-string">'ascii'</span>))
            packed_ip = fcntl.ioctl(s.fileno(), SIOCGIFADDR, iface_pack)[<span class="hljs-number">20</span>:<span class="hljs-number">24</span>]
            packed_netmask = fcntl.ioctl(s.fileno(), SIOCGIFNETMASK, iface_pack)[<span class="hljs-number">20</span>:<span class="hljs-number">24</span>]
        <span class="hljs-keyword">return</span> socket.inet_ntoa(packed_ip), socket.inet_ntoa(packed_netmask)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_local_networks</span>(<span class="hljs-params">self, *, refresh: bool = False</span>) -&gt; List[ipaddress.IPv4Network]:</span>
        <span class="hljs-string">"""
        Get the list of local networks, using all the local IP addresses
        :param refresh: If true, re-read /proc to get list of interfaces
        :return: List of IPv4Network addresses
        """</span>
        local_networks: List[ipaddress.IPv4Network] = []
        <span class="hljs-keyword">for</span> iface <span class="hljs-keyword">in</span> self.get_alive_interfaces(refresh=refresh):
            ip, netmask = self.get_iface_details(iface)
            network: ipaddress.IPv4Network = ipaddress.ip_network(<span class="hljs-string">f"<span class="hljs-subst">{ip}</span>/<span class="hljs-subst">{netmask}</span>"</span>, strict=<span class="hljs-literal">False</span>)
            <span class="hljs-keyword">if</span> network <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> local_networks:
                local_networks.append(network)
        <span class="hljs-keyword">return</span> local_networks
</code></pre>
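<p>The 'strict=False' argument is what lets us pass a host address together with its netmask. A short sketch with a made-up address shows how the ipaddress module masks out the host bits:</p>

```python
import ipaddress

# Made-up interface address and netmask, as returned by get_iface_details
ip, netmask = "192.168.1.11", "255.255.255.0"

# strict=False masks the host bits instead of raising ValueError
network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
print(network)

# With the default strict=True the same input is rejected
try:
    ipaddress.ip_network(f"{ip}/{netmask}")
    strict_rejected = False
except ValueError:
    strict_rejected = True
print(strict_rejected)
```

<p>The first print shows <em>192.168.1.0/24</em>, the network that will be handed to Nmap as a target.</p>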
<p>Note that this is not portable to other operating systems like BSD and especially Windows.</p>
<h2 id="heading-how-to-put-together-the-new-nmap-cli-frontend">How to put together the new Nmap CLI frontend</h2>
<p>Now, creating a new CLI for Nmap is straightforward. As a plus, the new frontend also allows you to save your scanning results as a JSON file (the optional --results argument):</p>
<pre><code class="lang-python"><span class="hljs-comment">#!/usr/bin/env python</span>
<span class="hljs-string">"""
# home_scan.py - A simple host discovery script
This script can scan your home network to show information from all the connected devices.

## References:
* [Nmap reference](https://nmap.org/book/man.html)

# Author
Jose Vicente Nunez Zuleta (kodegeek.com@protonmail.com)
"""</span>
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">import</span> logging
<span class="hljs-keyword">import</span> re
<span class="hljs-keyword">import</span> sys

<span class="hljs-keyword">from</span> rich.layout <span class="hljs-keyword">import</span> Layout
<span class="hljs-keyword">from</span> rich.live <span class="hljs-keyword">import</span> Live
<span class="hljs-keyword">from</span> rich.console <span class="hljs-keyword">import</span> Console
<span class="hljs-keyword">from</span> rich.logging <span class="hljs-keyword">import</span> RichHandler
<span class="hljs-keyword">from</span> rich.text <span class="hljs-keyword">import</span> Text
<span class="hljs-keyword">from</span> rich.traceback <span class="hljs-keyword">import</span> install
<span class="hljs-keyword">from</span> rich.progress <span class="hljs-keyword">import</span> TimeElapsedColumn, Progress, TextColumn
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List
<span class="hljs-keyword">import</span> argparse

<span class="hljs-keyword">from</span> home_nmap.nmap <span class="hljs-keyword">import</span> Scanner
<span class="hljs-keyword">from</span> home_nmap.system <span class="hljs-keyword">import</span> HostIface
<span class="hljs-keyword">from</span> home_nmap.ui <span class="hljs-keyword">import</span> create_scan_table, update_scan_table


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_targets</span>(<span class="hljs-params">target_list: List[str], cli_args: argparse.Namespace</span>) -&gt; str:</span>
    <span class="hljs-keyword">if</span> cli_args.target:
        <span class="hljs-keyword">for</span> target <span class="hljs-keyword">in</span> target_list:
            <span class="hljs-string">"""
            This should not happen as the script has an alias for -oX
            """</span>
            <span class="hljs-keyword">if</span> re.search(<span class="hljs-string">"-oX"</span>, target):
                <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Cannot redirect the output to a file by passing -oX. Run this script with --help"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-string">','</span>.join(target_list)
    <span class="hljs-keyword">return</span> <span class="hljs-string">','</span>.join(HostIface().get_prefixed_local_networks())


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">'__main__'</span>:

    install()
    logging.basicConfig(
        level=<span class="hljs-string">"NOTSET"</span>,
        format=<span class="hljs-string">"%(message)s"</span>,
        datefmt=<span class="hljs-string">"[%X]"</span>,
        handlers=[RichHandler(rich_tracebacks=<span class="hljs-literal">True</span>)]
    )

    console = Console()
    arg_parser = argparse.ArgumentParser(
        description=<span class="hljs-string">"Identify my local networked devices, with open ports"</span>,
        prog=__file__
    )
    arg_parser.add_argument(
        <span class="hljs-string">'--debug'</span>,
        action=<span class="hljs-string">'store_true'</span>,
        default=<span class="hljs-literal">False</span>,
        help=<span class="hljs-string">"Enable debug mode"</span>
    )
    arg_parser.add_argument(
        <span class="hljs-string">'--results'</span>,
        <span class="hljs-string">'-xO'</span>,
        action=<span class="hljs-string">'store'</span>,
        help=<span class="hljs-string">f"If defined, save scan results into this file."</span>
    )
    arg_parser.add_argument(
        <span class="hljs-string">'target'</span>,
        action=<span class="hljs-string">'store'</span>,
        nargs=<span class="hljs-string">'*'</span>,
        help=(<span class="hljs-string">f"One or more targets, in Nmap format (scanme.homenmap.org, microsoft.com/24, 192.168.0.1; "</span>
              <span class="hljs-string">f"10.0.0-255.1-254). If not provided, then scan local networks"</span>)
    )
    args = arg_parser.parse_args()

    current_app_progress = Progress(
        TimeElapsedColumn(),
        TextColumn(<span class="hljs-string">"{task.description}"</span>),
    )
    scanning_task = current_app_progress.add_task(<span class="hljs-string">"[yellow]Waiting[/yellow] for scan results... :hourglass:"</span>)

    <span class="hljs-keyword">try</span>:
        scanner = Scanner()
        scan_targets = get_targets(args.target, args)
        <span class="hljs-keyword">if</span> args.results:
            table_title = <span class="hljs-string">f"Targets: <span class="hljs-subst">{scan_targets}</span>, results file=<span class="hljs-subst">{args.results}</span>"</span>
        <span class="hljs-keyword">else</span>:
            table_title = <span class="hljs-string">f"Targets: <span class="hljs-subst">{scan_targets}</span>"</span>
        results_table = create_scan_table(cli=table_title)
        layout = Layout()
        layout.split(
            Layout(name=<span class="hljs-string">"Scan status"</span>, size=<span class="hljs-number">1</span>),
            Layout(name=<span class="hljs-string">"Scan results"</span>),
        )
        <span class="hljs-keyword">with</span> Live(
                layout,
                console=console,
                screen=<span class="hljs-literal">False</span>,
                redirect_stderr=<span class="hljs-literal">False</span>,
        ) <span class="hljs-keyword">as</span> live:
            layout[<span class="hljs-string">'Scan results'</span>].update(Text(
                text=<span class="hljs-string">f"No results yet (<span class="hljs-subst">{scan_targets}</span>)"</span>, style=<span class="hljs-string">"green"</span>, justify=<span class="hljs-string">"center"</span>))
            layout[<span class="hljs-string">'Scan status'</span>].update(current_app_progress)
            nmap_args, data, stderr = scanner.scan(hosts=scan_targets)
            update_scan_table(scan_result=data,
                              results_table=results_table,
                              main_layout=layout,
                              progress=current_app_progress,
                              task_id=scanning_task
                              )
        <span class="hljs-keyword">if</span> args.results:
            report_data = {
                <span class="hljs-string">'args'</span>: nmap_args,
                <span class="hljs-string">'scan'</span>: data
            }
            <span class="hljs-keyword">with</span> open(args.results, <span class="hljs-string">'w'</span>) <span class="hljs-keyword">as</span> report_file:
                json.dump(report_data, report_file, indent=<span class="hljs-literal">True</span>)

    <span class="hljs-keyword">except</span> ValueError:
        logging.exception(<span class="hljs-string">"There was an error"</span>)
        sys.exit(<span class="hljs-number">100</span>)
    <span class="hljs-keyword">except</span> KeyboardInterrupt:
        console.log(<span class="hljs-string">"Scan interrupted, exiting..."</span>)
        <span class="hljs-keyword">pass</span>
    sys.exit(<span class="hljs-number">0</span>)
</code></pre>
<p>The code got a little more verbose due to the argument parsing and the handling of user interface updates, but not too much.</p>
<p>Let's see an example against 127.0.0.1:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/02/home_scan.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Results of a live Nmap run, enriched with CVE advisories</em></p>
<p>If you are curious what the resulting JSON report looks like when passing the --results flag:</p>
<pre><code class="lang-json">{
 <span class="hljs-attr">"args"</span>: <span class="hljs-string">"/usr/bin/nmap -n -sS -p- -sV -O -T4 -PE --version-intensity 5 --disable-arp-ping --max-hostgroup 20 --min-parallelism 10 --osscan-limit --max-os-tries 1 -oX - 127.0.0.1"</span>,
 <span class="hljs-attr">"scan"</span>: [
  {
   <span class="hljs-attr">"addresses"</span>: [
    {   
     <span class="hljs-attr">"ip"</span>: <span class="hljs-string">"127.0.0.1"</span>
    }   
   ],  
   <span class="hljs-attr">"ports"</span>: [
    {   
     <span class="hljs-attr">"protocol"</span>: <span class="hljs-string">"tcp"</span>,
     <span class="hljs-attr">"port_id"</span>: <span class="hljs-string">"22"</span>,
     <span class="hljs-attr">"service_name"</span>: <span class="hljs-string">"ssh"</span>,
     <span class="hljs-attr">"service_product"</span>: <span class="hljs-string">"OpenSSH"</span>,
     <span class="hljs-attr">"service_version"</span>: <span class="hljs-string">"8.4"</span>,
     <span class="hljs-attr">"cpe"</span>: <span class="hljs-string">"cpe:/o:linux:linux_kernel:2.6.32"</span>
    },  
    {   
     <span class="hljs-attr">"protocol"</span>: <span class="hljs-string">"tcp"</span>,
     <span class="hljs-attr">"port_id"</span>: <span class="hljs-string">"631"</span>,
     <span class="hljs-attr">"service_name"</span>: <span class="hljs-string">"ipp"</span>,
     <span class="hljs-attr">"service_product"</span>: <span class="hljs-string">"CUPS"</span>,
     <span class="hljs-attr">"service_version"</span>: <span class="hljs-string">"2.3"</span>,
     <span class="hljs-attr">"cpe"</span>: <span class="hljs-string">"cpe:/o:linux:linux_kernel:2.6.32"</span>
    },  
...]
}
</code></pre>
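<p>If you want to post-process that report, here is a minimal sketch. It assumes the file keeps the 'args' and 'scan' layout shown above. The <code>load_report</code> and <code>summarize_report</code> helpers and the trimmed sample data are hypothetical and are not part of home_nmap:</p>

```python
import json


def load_report(path):
    """Read a report file written with the --results flag."""
    with open(path) as report_file:
        return json.load(report_file)


def summarize_report(report):
    """Return one 'ip protocol/port service product version' line per port."""
    lines = []
    for host in report.get("scan", []):
        # A host may list several addresses; keep only the entries that carry an IP
        ips = [addr["ip"] for addr in host.get("addresses", []) if "ip" in addr]
        label = ",".join(ips) or "unknown"
        for port in host.get("ports", []):
            lines.append(
                f"{label} {port['protocol']}/{port['port_id']} "
                f"{port['service_name']} {port['service_product']} {port['service_version']}"
            )
    return lines


# Trimmed sample that mimics the report layout shown above (the args value is shortened)
sample = {
    "args": "/usr/bin/nmap -oX - 127.0.0.1",
    "scan": [{
        "addresses": [{"ip": "127.0.0.1"}],
        "ports": [
            {"protocol": "tcp", "port_id": "22", "service_name": "ssh",
             "service_product": "OpenSSH", "service_version": "8.4"},
            {"protocol": "tcp", "port_id": "631", "service_name": "ipp",
             "service_product": "CUPS", "service_version": "2.3"},
        ],
    }],
}

for line in summarize_report(sample):
    print(line)
```

<p>With a real report you would pass the result of <code>load_report</code> (given the path you used with --results) to <code>summarize_report</code> instead of the sample dictionary.</p>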
<h2 id="heading-what-about-a-gui">What about a GUI?</h2>
<p>Nmap has a very complete GUI called <a target="_blank" href="https://nmap.org/zenmap/">Zenmap</a>, but the whole point was to show you that you can write a nice Text UI in Python as well to display the results.</p>
<p>You can achieve the same by using other popular frameworks like <a target="_blank" href="https://docs.python.org/3/library/tkinter.html">Tkinter</a>, which has incredibly detailed <a target="_blank" href="https://tkdocs.com/tutorial/">documentation</a>. For that reason, we'll not expand this topic any further.</p>
<p>Instead, let me show you how you can build a self-documenting REST API for Nmap.</p>
<h1 id="heading-how-to-make-a-home-network-scanner-a-web-service">How to Make a Home Network Scanner a Web Service</h1>
<p>Sometimes you cannot install Nmap because you lack the elevated privileges to do so or the server has installation constraints (like space or memory).</p>
<p>Or it could be that you want to run the port scanner on a machine that can reach a network that is not directly accessible from the server you are currently logged in to (getting around the network segregation imposed by a firewall). In this case the web service acts as a proxy that runs our Nmap command.</p>
<p>This is also known as "<strong>pivoting</strong>", and it is a common technique used to bypass firewalls and proxy servers.</p>
<p>Let's take a short detour to talk more about pivoting with Nmap.</p>
<h3 id="heading-can-you-run-nmap-through-a-proxy">Can you run Nmap through a proxy?</h3>
<p>Yes, you can use <a target="_blank" href="https://github.com/haad/proxychains">proxychains</a> to run Nmap through a host with better connectivity or to bypass firewall restrictions:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/02/pivot.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Using pivoting with Nmap and Proxy-chains</em></p>
<p>Say for the sake of argument that host 'External Linux' doesn't have direct connectivity to the network 192.168.1.0/24 but 'Multi homed Linux' does, and it can run a SOCKS-5 proxy.</p>
<p>To gain access to the internal network, we run <a target="_blank" href="https://en.wikipedia.org/wiki/Secure_Shell">SSH</a> forwarding port 9050 (as a SOCKS-5 proxy) under user 'josevnz':</p>
<pre><code class="lang-shell">josevnz@multihomed:~$ ssh  -N -D 9050 josevnz@192.168.1.11
The authenticity of host '192.168.1.11 (192.168.1.11)' can't be established.
ECDSA key fingerprint is SHA256:VIZCaCMb5rN2oL/xuv6CPrG1II+huW44x4TWhyKv8QM.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.1.11' (ECDSA) to the list of known hosts.
</code></pre>
<p>Then we install proxychains on 'External Linux' if it is not already there:</p>
<pre><code class="lang-shell"># Install proxychains first, using the command for your distribution:
# RedHat: 'sudo dnf -y install proxychains'
# Debian: 'sudo apt-get install proxychains4'
</code></pre>
<p>And create a proxychains.conf file pointing to your SSH SOCKS-5 proxy server:</p>
<pre><code class="lang-shell">cat&lt;&lt;CFG&gt;$HOME/proxychains.conf
strict_chain
proxy_dns
remote_dns_subnet 224
tcp_read_time_out 15000
tcp_connect_time_out 8000
[ProxyList]
socks5 192.168.1.11 9050
CFG
</code></pre>
<p>Finally, run Nmap, using a TCP scan:</p>
<pre><code class="lang-shell">[josevnz@external docs]$ proxychains -q -f $HOME/proxychains.conf sudo nmap -sT 192.168.1.0/24
Starting Nmap 7.80 ( https://nmap.org ) at 2021-12-30 16:06 EST
</code></pre>
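<p>Alternatively, tell Nmap itself to use the proxy. Nmap's '--proxies' option supports HTTP and SOCKS4 proxies rather than SOCKS5, which is why the URL below starts with socks4:// (the documentation <a target="_blank" href="https://nmap.org/book/man-bypass-firewalls-ids.html">says this is still under development</a>):</p>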
<pre><code class="lang-shell">[josevnz@external docs]$ sudo nmap -v -sT --proxies socks4://192.168.1.11:9050 192.168.1.0/24
Starting Nmap 7.80 ( https://nmap.org ) at 2021-12-31 09:03 EST
</code></pre>
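<p>If you drive Nmap from Python, the same command line can be assembled as an argument list. This is only a sketch: <code>proxied_nmap_command</code> is a hypothetical helper (not part of home_nmap), and it builds the command without running it:</p>

```python
import shlex


def proxied_nmap_command(proxy_host, proxy_port, target):
    """Build (but do not run) a TCP connect scan relayed through a SOCKS4 proxy."""
    proxy_url = f"socks4://{proxy_host}:{proxy_port}"
    # -sT is the TCP connect scan used in the shell examples above
    return ["nmap", "-v", "-sT", "--proxies", proxy_url, target]


command = proxied_nmap_command("192.168.1.11", 9050, "192.168.1.0/24")
# shlex.join renders the list as a shell-safe string, handy for logging
print(shlex.join(command))
```

<p>Keeping the arguments in a list means they can be handed to <code>subprocess.run</code> without going through a shell, so the target string is never interpreted as shell syntax.</p>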
<p>Now let's go back to coding our <a target="_blank" href="https://en.wikipedia.org/wiki/Web_service">web service</a>.</p>
<h2 id="heading-how-to-run-nmap-as-a-web-service">How to run Nmap as a web service</h2>
<p>Running Nmap as a service is nothing new (see <a target="_blank" href="http://nmap-cgi.tuxfamily.org/">Nmap-cgi</a>). We'll build ours using <a target="_blank" href="https://fastapi.tiangolo.com/">FastAPI</a>.</p>
<p>I put together a web service that shows the current version and also the available network interfaces (home_nmap/main.py):</p>
<pre><code class="lang-python"><span class="hljs-string">"""
# Web service for home_nmap
# Author
Jose Vicente Nunez Zuleta (kodegeek.com@protonmail.com)
"""</span>
<span class="hljs-keyword">from</span> home_nmap <span class="hljs-keyword">import</span> __version__
<span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI

<span class="hljs-keyword">from</span> home_nmap.system <span class="hljs-keyword">import</span> HostIface

app = FastAPI()


<span class="hljs-meta">@app.get("/version")</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">version</span>():</span>
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"version"</span>: __version__}


<span class="hljs-meta">@app.get("/local_networks")</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">local_networks</span>():</span>
    hi = HostIface()
    <span class="hljs-keyword">return</span> hi.get_local_networks()
</code></pre>
<p>In FastAPI we define the web service endpoints with decorators, and the framework takes care of serializing our response back to the client.</p>
<p>Here is how you can start the service using the <a target="_blank" href="https://www.uvicorn.org/">uvicorn</a> web server with the '--reload' flag to detect changes in our code automatically:</p>
<pre><code class="lang-shell">(home_nmap) [josevnz@dmaf5 home_nmap]$ uvicorn home_nmap.main:app --reload
INFO:     Will watch for changes in these directories: ['/home/josevnz/Documents/home_nmap']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [122202] using watchgod
INFO:     Started server process [122204]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
</code></pre>
<p>Here is how to get the home_nmap API version using <a target="_blank" href="https://curl.se/">curl</a>, with the JSON response pretty-printed by <a target="_blank" href="https://stedolan.github.io/jq/">jq</a>:</p>
<pre><code class="lang-shell">(home_nmap) [josevnz@dmaf5 rich]$ curl --fail --silent http://127.0.0.1:8000/version| jq '.'
{
  "version": "0.0.1"
}
</code></pre>
<p>Now get the list of local networks calling the '/local_networks' endpoint:</p>
<pre><code class="lang-shell">(home_nmap) [josevnz@dmaf5 rich]$ curl --fail --silent http://127.0.0.1:8000/local_networks| jq '.'
[
  "192.168.1.0/24"
]
</code></pre>
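<p>The same calls can be made from Python with only the standard library. The sketch below assumes the service is listening on 127.0.0.1:8000 as in the examples above. <code>endpoint_url</code> and <code>get_json</code> are hypothetical helper names, not part of home_nmap:</p>

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def endpoint_url(base_url, endpoint, **params):
    """Compose the URL for an endpoint, adding a query string when params are given."""
    url = urljoin(base_url, endpoint)
    if params:
        url = f"{url}?{urlencode(params)}"
    return url


def get_json(url, timeout=10.0):
    """Fetch a URL and decode its JSON body, roughly 'curl --fail --silent URL | jq .'"""
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


BASE_URL = "http://127.0.0.1:8000/"
print(endpoint_url(BASE_URL, "version"))
print(endpoint_url(BASE_URL, "local_networks"))
# With the service running, fetch and decode a response like this:
# print(get_json(endpoint_url(BASE_URL, "version")))
```

<p>The network call is left commented out so the snippet can be run even when the service is down.</p>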
<p>One nice thing about FastAPI is that you get automatic documentation for your REST endpoints (<a target="_blank" href="http://127.0.0.1:8000/docs#/">http://127.0.0.1:8000/docs#/</a>):</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/02/home_nmap_rest_documentation.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Nmap self documenting REST API</em></p>
<p>Not bad for a few lines of code if you ask me.</p>
<h2 id="heading-how-to-implement-the-scanner-service">How to implement the scanner service</h2>
<p>In the 'main.py' file we implement the endpoint that scans the local network and correlates each CPE with any known advisories:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Optional
<span class="hljs-keyword">from</span> home_nmap.system <span class="hljs-keyword">import</span> NMapRunner
<span class="hljs-keyword">from</span> home_nmap.query <span class="hljs-keyword">import</span> NDISHtml, target_validator
<span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI, HTTPException
app: FastAPI = FastAPI()

<span class="hljs-meta">@app.get("/scan")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">scan</span>(<span class="hljs-params">
        target: Optional[str] = None,
        full_advisories: bool = True
</span>):</span>
    <span class="hljs-string">"""
    Scan a target to get service information.
    Note, FastAPI has a query validator, but I decided to use my own as I look for bad targets:
    Query(None, min_length=MIN_LEN_TARGET, max_length=MAX_LEN_TARGET)
    @param target: Override local network with custom targets, in Nmap format.
    @param full_advisories: If false, skip the summary information from the advisories
    @return: JSON containing the results of the scan
    """</span>
    <span class="hljs-keyword">try</span>:
        scanner = NMapRunner()
        args, scan_results, stderr = scanner.scan(hosts=target_validator(target))
        enriched_results = {
            <span class="hljs-string">'args'</span>: args,
            <span class="hljs-string">'hosts'</span>: []
        }
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> scan_results:
            <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">404</span>, detail=<span class="hljs-string">f"Got no results from scanning target=<span class="hljs-subst">{target}</span>"</span>)
        cpe_details = NDISHtml().correlate_nmap_with_nids(scan_results)
        <span class="hljs-keyword">for</span> host_data <span class="hljs-keyword">in</span> scan_results:
            enriched_host_data = {
                <span class="hljs-string">'address'</span>: host_data[<span class="hljs-string">'address'</span>],
                <span class="hljs-string">'ports'</span>: []
            }
            ports = host_data[<span class="hljs-string">'ports'</span>]
            <span class="hljs-keyword">for</span> port_data <span class="hljs-keyword">in</span> ports:
                advisories = []
                <span class="hljs-comment"># Unroll the advisories, if any ...</span>
                <span class="hljs-keyword">for</span> cpe <span class="hljs-keyword">in</span> port_data[<span class="hljs-string">'cpes'</span>]:
                    <span class="hljs-keyword">if</span> cpe <span class="hljs-keyword">in</span> cpe_details:  <span class="hljs-comment"># Service may not have an advisory</span>
                        <span class="hljs-keyword">for</span> nids <span class="hljs-keyword">in</span> cpe_details[cpe]:
                            <span class="hljs-keyword">if</span> full_advisories:
                                advisories.append({
                                    <span class="hljs-string">'link'</span>: nids.link,
                                    <span class="hljs-string">'summary'</span>: nids.summary,
                                    <span class="hljs-string">'score'</span>: nids.score
                                })
                            <span class="hljs-keyword">else</span>:
                                advisories.append({
                                    <span class="hljs-string">'link'</span>: nids.link,
                                    <span class="hljs-string">'summary'</span>: <span class="hljs-string">''</span>,  <span class="hljs-comment"># For consistency</span>
                                    <span class="hljs-string">'score'</span>: nids.score
                                })
                enriched_host_data[<span class="hljs-string">'ports'</span>].append(
                    {
                        <span class="hljs-string">'cpes'</span>: port_data[<span class="hljs-string">'cpes'</span>],
                        <span class="hljs-string">'advisories'</span>: advisories,
                        <span class="hljs-string">'protocol'</span>: port_data[<span class="hljs-string">'protocol'</span>],
                        <span class="hljs-string">'port_id'</span>: port_data[<span class="hljs-string">'port_id'</span>],
                        <span class="hljs-string">'service'</span>: [
                            <span class="hljs-string">f"<span class="hljs-subst">{port_data[<span class="hljs-string">'service_name'</span>]}</span>,"</span>
                            <span class="hljs-string">f"<span class="hljs-subst">{port_data[<span class="hljs-string">'service_product'</span>]}</span>,"</span>
                            <span class="hljs-string">f"<span class="hljs-subst">{port_data[<span class="hljs-string">'service_version'</span>]}</span>"</span>
                        ]
                    }
                )
            enriched_results[<span class="hljs-string">'hosts'</span>].append(enriched_host_data)
        <span class="hljs-keyword">return</span> enriched_results
    <span class="hljs-keyword">except</span> (TypeError, ValueError) <span class="hljs-keyword">as</span> exp:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">500</span>, detail=str(exp))
</code></pre>
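<p>The nested loop that unrolls the advisories can be pulled out into a small function that is easy to test on its own. This is a sketch, not the code used by home_nmap: it assumes <code>cpe_details</code> maps each CPE string to a list of objects with 'link', 'summary' and 'score' attributes, and the <code>Advisory</code> tuple and its sample values are hypothetical stand-ins for those objects:</p>

```python
from collections import namedtuple

# Hypothetical stand-in for the advisory objects produced by the NVD lookup
Advisory = namedtuple("Advisory", ["link", "summary", "score"])


def unroll_advisories(cpes, cpe_details, full_advisories=True):
    """Flatten the advisories of every CPE found on a port into a list of dictionaries."""
    advisories = []
    for cpe in cpes:
        # A service may not have an advisory, so fall back to an empty list
        for nids in cpe_details.get(cpe, []):
            advisories.append({
                "link": nids.link,
                # Keep the key even when summaries are skipped, for consistency
                "summary": nids.summary if full_advisories else "",
                "score": nids.score,
            })
    return advisories


details = {
    "cpe:/a:openbsd:openssh:8.4": [
        Advisory("https://nvd.nist.gov/vuln/detail/CVE-2021-41617", "sample summary text", "4.4 MEDIUM"),
    ]
}
print(unroll_advisories(["cpe:/a:openbsd:openssh:8.4"], details, full_advisories=False))
print(unroll_advisories(["cpe:/a:influxdata:influxdb:2.1.1"], details))
```

<p>The endpoint would then only need to call this function once per port and attach the result to the port entry.</p>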
<p>The 'target_validator' function does a few checks on the target to ensure only valid scanning targets are passed (this is the same function we wrote for the CLI program):</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> re
<span class="hljs-keyword">import</span> shlex
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Optional

MIN_LEN_TARGET = <span class="hljs-number">9</span>
MAX_LEN_TARGET = <span class="hljs-number">50</span>


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">target_validator</span>(<span class="hljs-params">target: Optional[str]</span>) -&gt; Optional[str]:</span>
    <span class="hljs-string">"""
    Simple validator for Nmap target expressions
    @param target: (scanme.homenmap.org, microsoft.com/24, 192.168.0.1; 10.0.0-255.1-254). None or empty are valid
    @return:
    """</span>
    <span class="hljs-keyword">if</span> target:
        regexp_list = [
            <span class="hljs-string">'-[a-z-A-Z][A-Z]*'</span>,
            <span class="hljs-string">'-[a-zA-Z]\\d*'</span>,
            <span class="hljs-string">'--[a-z-]+'</span>
        ]
        <span class="hljs-keyword">if</span> len(target) &lt; MIN_LEN_TARGET:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Provided length for target is too small &lt; <span class="hljs-subst">{MIN_LEN_TARGET}</span>"</span>)
        <span class="hljs-keyword">if</span> len(target) &gt; MAX_LEN_TARGET:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Provided length for target is too big &gt; <span class="hljs-subst">{MAX_LEN_TARGET}</span>"</span>)
        <span class="hljs-keyword">for</span> arg <span class="hljs-keyword">in</span> shlex.split(target):
            <span class="hljs-keyword">for</span> regexp <span class="hljs-keyword">in</span> regexp_list:
                <span class="hljs-keyword">if</span> re.search(regexp, arg):
                    <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"You cannot override Nmap arguments: <span class="hljs-subst">{arg}</span>"</span>)
    <span class="hljs-keyword">return</span> target
</code></pre>
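<p>To get a feel for what those patterns accept and reject, the sketch below applies the same three regular expressions to a few sample targets. <code>rejected_arguments</code> is a hypothetical helper written for this illustration. It only reports the offending pieces instead of raising, and it skips the length checks:</p>

```python
import re
import shlex

# Same patterns used by target_validator above
REGEXP_LIST = ['-[a-z-A-Z][A-Z]*', '-[a-zA-Z]\\d*', '--[a-z-]+']


def rejected_arguments(target):
    """Return the pieces of a target expression that look like Nmap flags."""
    return [
        arg for arg in shlex.split(target)
        if any(re.search(regexp, arg) for regexp in REGEXP_LIST)
    ]


samples = (
    "192.168.1.0/24",
    "10.0.0-255.1-254",
    "192.168.1.10 -oX /tmp/out",
    "scan-me.example.org",
)
for candidate in samples:
    print(candidate, "rejected pieces:", rejected_arguments(candidate))
```

<p>Note that the last sample is flagged too: a hostname with a hyphen followed by a letter matches the first pattern, so such hosts have to be passed by IP address unless the expressions are tightened.</p>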
<p>Time to put everything together.</p>
<h3 id="heading-what-does-a-scan-run-look-like-very-verbose-json">What does a scan run look like (very verbose JSON)?</h3>
<p>Here is what the scan result of 2 machines in my local network looks like (the web service is running on dmaf5.home on port 8000):</p>
<pre><code class="lang-shell">[josevnz@dmaf5 ~]$ curl http://dmaf5.home:8000/scan?target=192.168.1.10,23
{"args":"/usr/bin/nmap -n -sS -p- -sV -O -T4 -PE --version-intensity 5 --disable-arp-ping --max-hostgroup 20 --min-parallelism 10 --osscan-limit --max-os-tries 1 -oX - 192.168.1.10,23","hosts":[{"address":"192.168.1.10","ports":[{"cpes":["cpe:/a:openbsd:openssh:8.2p1"],"advisories":[{"link":"https://nvd.nist.gov/vuln/detail/CVE-2021-41617","summary":"sshd in OpenSSH 6.2 through 8.x before 8.8, when certain non-default configurations are used, allows privilege escalation because supplemental groups are not initialized as expected. Helper programs for AuthorizedKeysCommand and AuthorizedPrincipalsCommand may run with privileges associated with group memberships of the sshd process, if the configuration specifies running the command as a different user.","score":"4.4 MEDIUM"},{"link":"https://nvd.nist.gov/vuln/detail/CVE-2016-20012","summary":"OpenSSH through 8.7 allows remote attackers, who have a suspicion that a certain combination of username and public key is known to an SSH server, to test whether this suspicion is correct. This occurs because a challenge is sent only when that combination could be valid for a login session.","score":"4.3 MEDIUM"},{"link":"https://nvd.nist.gov/vuln/detail/CVE-2021-28041","summary":"ssh-agent in OpenSSH before 8.5 has a double free that may be relevant in a few less-common scenarios, such as unconstrained agent-socket access on a legacy operating system, or the forwarding of an agent to an attacker-controlled host.","score":"4.6 MEDIUM"},{"link":"https://nvd.nist.gov/vuln/detail/CVE-2020-15778","summary":"** DISPUTED ** scp in OpenSSH through 8.3p1 allows command injection in the scp.c toremote function, as demonstrated by backtick characters in the destination argument. 
NOTE: the vendor reportedly has stated that they intentionally omit validation of \"anomalous argument transfers\" because that could \"stand a great chance of breaking existing workflows.\"","score":"6.8 MEDIUM"},{"link":"https://nvd.nist.gov/vuln/detail/CVE-2020-14145","summary":"The client side in OpenSSH 5.7 through 8.4 has an Observable Discrepancy leading to an information leak in the algorithm negotiation. This allows man-in-the-middle attackers to target initial connection attempts (where no host key for the server has been cached by the client). NOTE: some reports state that 8.5 and 8.6 are also affected.","score":"4.3 MEDIUM"}],"protocol":"tcp","port_id":"22","service":[["ssh"],["OpenSSH"],["8.2p1 Ubuntu 4ubuntu0.3"]]},{"cpes":[],"advisories":[],"protocol":"tcp","port_id":"2377","service":[["swarm"],[""],[""]]},{"cpes":[],"advisories":[],"protocol":"tcp","port_id":"7946","service":[["unknown"],[""],[""]]},{"cpes":["cpe:/a:influxdata:influxdb:2.1.1"],"advisories":[],"protocol":"tcp","port_id":"8086","service":[["http"],["InfluxDB http admin"],["2.1.1"]]},{"cpes":[],"advisories":[],"protocol":"tcp","port_id":"9100","service":[["jetdirect"],[""],[""]]},{"cpes":["cpe:/a:protocol_labs:go-ipfs"],"advisories":[],"protocol":"tcp","port_id":"9323","service":[["http"],["Golang net/http server"],[""]]}]},{"address":"DC:A6:32:F9:47:48","ports":[{"cpes":["cpe:/a:openbsd:openssh:8.2p1"],"advisories":[{"link":"https://nvd.nist.gov/vuln/detail/CVE-2021-41617","summary":"sshd in OpenSSH 6.2 through 8.x before 8.8, when certain non-default configurations are used, allows privilege escalation because supplemental groups are not initialized as expected. 
Helper programs for AuthorizedKeysCommand and AuthorizedPrincipalsCommand may run with privileges associated with group memberships of the sshd process, if the configuration specifies running the command as a different user.","score":"4.4 MEDIUM"},{"link":"https://nvd.nist.gov/vuln/detail/CVE-2016-20012","summary":"OpenSSH through 8.7 allows remote attackers, who have a suspicion that a certain combination of username and public key is known to an SSH server, to test whether this suspicion is correct. This occurs because a challenge is sent only when that combination could be valid for a login session.","score":"4.3 MEDIUM"},{"link":"https://nvd.nist.gov/vuln/detail/CVE-2021-28041","summary":"ssh-agent in OpenSSH before 8.5 has a double free that may be relevant in a few less-common scenarios, such as unconstrained agent-socket access on a legacy operating system, or the forwarding of an agent to an attacker-controlled host.","score":"4.6 MEDIUM"},{"link":"https://nvd.nist.gov/vuln/detail/CVE-2020-15778","summary":"** DISPUTED ** scp in OpenSSH through 8.3p1 allows command injection in the scp.c toremote function, as demonstrated by backtick characters in the destination argument. NOTE: the vendor reportedly has stated that they intentionally omit validation of \"anomalous argument transfers\" because that could \"stand a great chance of breaking existing workflows.\"","score":"6.8 MEDIUM"},{"link":"https://nvd.nist.gov/vuln/detail/CVE-2020-14145","summary":"The client side in OpenSSH 5.7 through 8.4 has an Observable Discrepancy leading to an information leak in the algorithm negotiation. This allows man-in-the-middle attackers to target initial connection attempts (where no host key for the server has been cached by the client). 
NOTE: some reports state that 8.5 and 8.6 are also affected.","score":"4.3 MEDIUM"}],"protocol":"tcp","port_id":"22","service":[["ssh"],["OpenSSH"],["8.2p1 Ubuntu 4ubuntu0.3"]]},{"cpes":[],"advisories":[],"protocol":"tcp","port_id":"2377","service":[["swarm"],[""],[""]]},{"cpes":[],"advisories":[],"protocol":"tcp","port_id":"7946","service":[["unknown"],[""],[""]]},{"cpes":["cpe:/a:influxdata:influxdb:2.1.1"],"advisories":[],"protocol":"tcp","port_id":"8086","service":[["http"],["InfluxDB http admin"],["2.1.1"]]},{"cpes":[],"advisories":[],"protocol":"tcp","port_id":"9100","service":[["jetdirect"],[""],[""]]},{"cpes":["cpe:/a:protocol_labs:go-ipfs"],"advisories":[],"protocol":"tcp","port_id":"9323","service":[["http"],["Golang net/http server"],[""]]}]},{"address":"192.168.1.23","ports":[{"cpes":["cpe:/a:openbsd:openssh:8.4"],"advisories":[{"link":"https://nvd.nist.gov/vuln/detail/CVE-2021-41617","summary":"sshd in OpenSSH 6.2 through 8.x before 8.8, when certain non-default configurations are used, allows privilege escalation because supplemental groups are not initialized as expected. Helper programs for AuthorizedKeysCommand and AuthorizedPrincipalsCommand may run with privileges associated with group memberships of the sshd process, if the configuration specifies running the command as a different user.","score":"4.4 MEDIUM"},{"link":"https://nvd.nist.gov/vuln/detail/CVE-2016-20012","summary":"OpenSSH through 8.7 allows remote attackers, who have a suspicion that a certain combination of username and public key is known to an SSH server, to test whether this suspicion is correct. 
This occurs because a challenge is sent only when that combination could be valid for a login session.","score":"4.3 MEDIUM"},{"link":"https://nvd.nist.gov/vuln/detail/CVE-2021-28041","summary":"ssh-agent in OpenSSH before 8.5 has a double free that may be relevant in a few less-common scenarios, such as unconstrained agent-socket access on a legacy operating system, or the forwarding of an agent to an attacker-controlled host.","score":"4.6 MEDIUM"},{"link":"https://nvd.nist.gov/vuln/detail/CVE-2020-14145","summary":"The client side in OpenSSH 5.7 through 8.4 has an Observable Discrepancy leading to an information leak in the algorithm negotiation. This allows man-in-the-middle attackers to target initial connection attempts (where no host key for the server has been cached by the client). NOTE: some reports state that 8.5 and 8.6 are also affected.","score":"4.3 MEDIUM"}],"protocol":"tcp","port_id":"22","service":[["ssh"],["OpenSSH"],["8.4"]]},{"cpes":[],"advisories":[],"protocol":"tcp","port_id":"5355","service":[["llmnr"],[""],[""]]},{"cpes":[],"advisories":[],"protocol":"tcp","port_id":"8443","service":[["https-alt"],[""],[""]]},{"cpes":[],"advisories":[],"protocol":"tcp","port_id":"9100","service":[["jetdirect"],[""],[""]]}]}]}[josevnz@dmaf5 ~]$
</code></pre>
<h2 id="heading-is-this-web-service-secure">Is this web-service secure?</h2>
<p>We exposed our Nmap scanner <em>with no authorization</em>, which means anyone who knows where the service is running can use it. This may not be a big issue on the local network, but it would be good to control who uses our precious resources.</p>
<h3 id="heading-how-to-add-authentication-and-authorization">How to add authentication and authorization</h3>
<p>Because anyone can currently call our service, it is a good idea to control who can run Nmap against our home network.</p>
<p>There are <a target="_blank" href="https://fastapi.tiangolo.com/tutorial/security/">several ways</a> to make sure our web service can only be used by authorized clients. One way to do it is by requesting a client to provide a key that is also known to the server. This is the approach we'll follow here.</p>
<p>NOTE: As you might have guessed, if someone finds out the key then your service is compromised. To make it more secure you should:</p>
<ul>
<li><p>Store the key in a safe place, encrypted</p>
</li>
<li><p>Give each key an expiration date, so stale keys can be purged</p>
</li>
<li><p>Send keys only over an encrypted channel, like HTTPS (we'll see about that soon)</p>
</li>
</ul>
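<p>As an illustration of the first two points, here is a minimal standard-library sketch. It is not how fastapi_simple_security works internally. It also swaps in a different technique for storage: instead of encrypting the key, the server keeps only a SHA-256 digest of it. The function names and the 15-day lifetime are assumptions made for this example:</p>

```python
import hashlib
import hmac
import secrets
import time


def new_api_key():
    """Generate a random key; only its SHA-256 digest needs to be stored server side."""
    key = secrets.token_urlsafe(32)
    digest = hashlib.sha256(key.encode()).hexdigest()
    return key, digest


def is_key_valid(presented_key, stored_digest, expires_at, now=None):
    """Accept a key only if it has not expired and its digest matches the stored one."""
    now = time.time() if now is None else now
    if now >= expires_at:
        return False
    presented_digest = hashlib.sha256(presented_key.encode()).hexdigest()
    # compare_digest takes constant time, so it does not leak how many characters matched
    return hmac.compare_digest(presented_digest, stored_digest)


key, digest = new_api_key()
expires_at = time.time() + 15 * 24 * 3600  # hypothetical 15-day lifetime
print(is_key_valid(key, digest, expires_at))
print(is_key_valid("not-the-key", digest, expires_at))
```

<p>Storing a digest means a leaked key database does not directly reveal usable keys, and the expiration check gives you a simple way to purge stale ones.</p>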
<p>We will take advantage of <a target="_blank" href="https://github.com/mrtolkien/fastapi_simple_security">fastapi_simple_security</a> to add API key access control to our web application. It only requires a few new imports and a dependency declared on each REST API endpoint we want to protect:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI, Depends
<span class="hljs-keyword">from</span> fastapi_simple_security <span class="hljs-keyword">import</span> api_key_router, api_key_security
<span class="hljs-keyword">from</span> fastapi.responses <span class="hljs-keyword">import</span> JSONResponse
<span class="hljs-keyword">from</span> fastapi.encoders <span class="hljs-keyword">import</span> jsonable_encoder
<span class="hljs-keyword">import</span> typing
<span class="hljs-keyword">from</span> home_nmap.system <span class="hljs-keyword">import</span> HostIface
...
app: typing.Union[FastAPI] = FastAPI()
app.include_router(api_key_router, prefix=<span class="hljs-string">"/auth"</span>, tags=[<span class="hljs-string">"_auth"</span>])

<span class="hljs-comment"># Then add a 'dependencies' to each of the endpoints we want to secure</span>
<span class="hljs-meta">@app.get("/local_networks", dependencies=[Depends(api_key_security)])</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">local_networks</span>():</span>
    <span class="hljs-string">"""
    Get the available local networks where home_nmap runs
    @return: List with local networks in CIDR format
    """</span>
    response = JSONResponse(jsonable_encoder(HostIface().get_local_networks()))
    <span class="hljs-keyword">return</span> response
...
</code></pre>
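<p>If we do not define a secret key, the framework will provide us with one at startup (you can override it later through the documentation page). Here we generate our own with <code>uuidgen</code> and pass it to the application in the <code>FASTAPI_SIMPLE_SECURITY_SECRET</code> environment variable:</p>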
<pre><code class="lang-shell">(home_nmap) [josevnz@dmaf5 home_nmap]$ uuidgen 
23eb5572-1e63-4404-a64b-bcc18b62d4eb
(home_nmap) [josevnz@dmaf5 home_nmap]$ export FASTAPI_SIMPLE_SECURITY_SECRET="23eb5572-1e63-4404-a64b-bcc18b62d4eb"; uvicorn home_nmap.main:app --host 0.0.0.0 --port 8000 --reload
INFO:     Will watch for changes in these directories: ['/home/josevnz/Documents/home_nmap']
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [134702] using watchgod
INFO:     Started server process [134704]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
</code></pre>
<p>Now all the APIs that are protected by the keys have a different decoration in the documentation (a lock next to each endpoint):</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2022/02/documentation_shows_secured_endpoints.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>The documentation now shows the secured endpoints</em></p>
<p>What happens if we try to get the list of local networks, without our key?</p>
<pre><code class="lang-shell">[josevnz@dmaf5 ~]$ curl 'http://127.0.0.1:8000/local_networks' --header 'accept: application/json'
{"detail":"An API key must be passed as query or header"}
</code></pre>
<p>To finish the setup, enter your 'secret-key' (<code>23eb5572-1e63-4404-a64b-bcc18b62d4eb</code>) into the docs authentication page. Then call <em>/auth/new</em> to get the api-key, which is the one that your clients will use (in a header or as a query parameter, the two options named in the error message above). In my case I got this:</p>
<pre><code class="lang-shell">curl 'http://127.0.0.1:8000/auth/new?never_expires=false' \
  --header 'accept: application/json' \
  --header 'secret-key: 23eb5572-1e63-4404-a64b-bcc18b62d4eb'
"e4c03730-02a1-4cb9-8e00-36a63930c064"
</code></pre>
<p>Now let's try again, this time passing our API key:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 home_nmap]$ curl 'http://127.0.0.1:8000/local_networks'  --header 'accept: application/json' --header 'api-key: e4c03730-02a1-4cb9-8e00-36a63930c064'
["192.168.1.0/24"][josevnz@dmaf5 home_nmap]$
</code></pre>
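<p>The same call can be made from a script. This is a minimal sketch using only the Python standard library; <code>build_request</code> and <code>get_local_networks</code> are hypothetical helper names, and it assumes the service is listening at the base URL you pass in:</p>

```python
import json
import urllib.request


def build_request(base_url, api_key):
    """Prepare a GET for /local_networks that carries the key in the api-key header."""
    return urllib.request.Request(
        f"{base_url}/local_networks",
        headers={"accept": "application/json", "api-key": api_key},
        method="GET",
    )


def get_local_networks(base_url, api_key):
    """Send the request and decode the JSON list of networks returned by the service."""
    request = build_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

<p>Calling <code>get_local_networks("http://127.0.0.1:8000", "e4c03730-02a1-4cb9-8e00-36a63930c064")</code> against the running service should return the same list curl printed. A missing or wrong key makes <code>urlopen</code> raise <code>urllib.error.HTTPError</code>.</p>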
<p>Still, we are not done yet. Assume that someone managed to run a sniffer on your network and is capturing all your HTTP traffic:</p>
<pre><code class="lang-shell">[josevnz@dmaf5 home_nmap]$ tshark -i eno1 -Px -Y http
Capturing on 'eno1'
   72 5.107984320 192.168.1.11 → 192.168.1.25 HTTP 219 GET /local_networks HTTP/1.1 

0000  1c 83 41 28 44 21 dc a6 32 f9 47 48 08 00 45 00   ..A(D!..2.GH..E.
0010  00 cd 7b ca 40 00 40 06 3a ec c0 a8 01 0b c0 a8   ..{.@.@.:.......
0020  01 19 b1 a6 1f 40 ce 1b 2a 22 ab b5 24 3c 80 18   .....@..*"..$&lt;..
0030  01 f6 d0 3d 00 00 01 01 08 0a f3 07 ee 27 9d 96   ...=.........'..
0040  87 76 47 45 54 20 2f 6c 6f 63 61 6c 5f 6e 65 74   .vGET /local_net
0050  77 6f 72 6b 73 20 48 54 54 50 2f 31 2e 31 0d 0a   works HTTP/1.1..
0060  48 6f 73 74 3a 20 64 6d 61 66 35 2e 68 6f 6d 65   Host: dmaf5.home
0070  3a 38 30 30 30 0d 0a 55 73 65 72 2d 41 67 65 6e   :8000..User-Agen
0080  74 3a 20 63 75 72 6c 2f 37 2e 36 38 2e 30 0d 0a   t: curl/7.68.0..
0090  61 63 63 65 70 74 3a 20 61 70 70 6c 69 63 61 74   accept: applicat
00a0  69 6f 6e 2f 6a 73 6f 6e 0d 0a 61 70 69 2d 6b 65   ion/json..api-ke
00b0  79 3a 20 65 34 63 30 33 37 33 30 2d 30 32 61 31   y: e4c03730-02a1
00c0  2d 34 63 62 39 2d 38 65 30 30 2d 33 36 61 36 33   -4cb9-8e00-36a63
00d0  39 33 30 63 30 36 34 0d 0a 0d 0a                  930c064....
</code></pre>
<p>You can clearly see our not-so-secret-anymore API key. Time to add the next layer of protection.</p>
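<p>An attacker does not even need to read the dump by eye. As a small sketch, the bytes below are copied from the capture above, starting where the HTTP request begins, and a few lines of Python recover the key (<code>extract_header</code> is a hypothetical helper written for this example):</p>

```python
# Bytes copied from the tshark hex dump, starting at the "GET" of the HTTP request
CAPTURED_HEX = """
47 45 54 20 2f 6c 6f 63 61 6c 5f 6e 65 74
77 6f 72 6b 73 20 48 54 54 50 2f 31 2e 31 0d 0a
48 6f 73 74 3a 20 64 6d 61 66 35 2e 68 6f 6d 65
3a 38 30 30 30 0d 0a 55 73 65 72 2d 41 67 65 6e
74 3a 20 63 75 72 6c 2f 37 2e 36 38 2e 30 0d 0a
61 63 63 65 70 74 3a 20 61 70 70 6c 69 63 61 74
69 6f 6e 2f 6a 73 6f 6e 0d 0a 61 70 69 2d 6b 65
79 3a 20 65 34 63 30 33 37 33 30 2d 30 32 61 31
2d 34 63 62 39 2d 38 65 30 30 2d 33 36 61 36 33
39 33 30 63 30 36 34 0d 0a 0d 0a
"""


def extract_header(raw, name):
    """Return the value of an HTTP header found in a raw request, or None."""
    text = raw.decode("ascii", errors="replace")
    # Skip the request line, then look at each "Name: value" header line
    for line in text.split("\r\n")[1:]:
        key, separator, value = line.partition(":")
        if separator and key.strip().lower() == name.lower():
            return value.strip()
    return None


# Remove all whitespace before converting the hex digits back to bytes
payload = bytes.fromhex("".join(CAPTURED_HEX.split()))
print(extract_header(payload, "api-key"))  # e4c03730-02a1-4cb9-8e00-36a63930c064
```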
<h3 id="heading-we-need-encryption">We need encryption</h3>
<p>The HTTP protocol is not encrypted. That means that anyone running a sniffer (like tcpdump or Wireshark) on the network can read the traffic. For example, if we request the home_nmap version using curl:</p>
<pre><code class="lang-shell">curl http://dmaf5.home:8000/version
</code></pre>
<p>It is possible for someone else running <a target="_blank" href="https://tshark.dev/setup/install/">tshark</a> to see all the traffic (look at the <code>content-type: application/json</code> payload):</p>
<pre><code class="lang-shell">[root@dmaf5 ~]# tshark -i eno1 -Px -Y http
Running as user "root" and group "root". This could be dangerous.
Capturing on 'eno1'
  127 4.342379691 192.168.1.11 → 192.168.1.23 HTTP 152 GET /version HTTP/1.1 

0000  1c 83 41 28 44 21 dc a6 32 f9 47 48 08 00 45 00   ..A(D!..2.GH..E.
0010  00 8a c3 8a 40 00 40 06 f3 70 c0 a8 01 0b c0 a8   ....@.@..p......
0020  01 17 c7 68 1f 40 dc af 3c 37 c1 12 e6 69 80 18   ...h.@..&lt;7...i..
0030  01 f6 ff a7 00 00 01 01 08 0a 08 94 d3 55 a8 7c   .............U.|
0040  ec df 47 45 54 20 2f 76 65 72 73 69 6f 6e 20 48   ..GET /version H
0050  54 54 50 2f 31 2e 31 0d 0a 48 6f 73 74 3a 20 64   TTP/1.1..Host: d
0060  6d 61 66 35 2e 68 6f 6d 65 3a 38 30 30 30 0d 0a   maf5.home:8000..
0070  55 73 65 72 2d 41 67 65 6e 74 3a 20 63 75 72 6c   User-Agent: curl
0080  2f 37 2e 36 38 2e 30 0d 0a 41 63 63 65 70 74 3a   /7.68.0..Accept:
0090  20 2a 2f 2a 0d 0a 0d 0a                            */*....

  129 4.344312849 192.168.1.23 → 192.168.1.11 HTTP/JSON 210 HTTP/1.1 200 OK , JavaScript Object Notation (application/json)

0000  dc a6 32 f9 47 48 1c 83 41 28 44 21 08 00 45 00   ..2.GH..A(D!..E.
0010  00 c4 36 78 40 00 40 06 80 49 c0 a8 01 17 c0 a8   ..6x@.@..I......
0020  01 0b 1f 40 c7 68 c1 12 e6 69 dc af 3c 8d 80 18   ...@.h...i..&lt;...
0030  01 fd 84 29 00 00 01 01 08 0a a8 7c ec e1 08 94   ...).......|....
0040  d3 55 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f   .UHTTP/1.1 200 O
0050  4b 0d 0a 64 61 74 65 3a 20 4d 6f 6e 2c 20 31 37   K..date: Mon, 17
0060  20 4a 61 6e 20 32 30 32 32 20 32 30 3a 31 36 3a    Jan 2022 20:16:
0070  32 39 20 47 4d 54 0d 0a 73 65 72 76 65 72 3a 20   29 GMT..server: 
0080  75 76 69 63 6f 72 6e 0d 0a 63 6f 6e 74 65 6e 74   uvicorn..content
0090  2d 6c 65 6e 67 74 68 3a 20 31 39 0d 0a 63 6f 6e   -length: 19..con
00a0  74 65 6e 74 2d 74 79 70 65 3a 20 61 70 70 6c 69   tent-type: appli
00b0  63 61 74 69 6f 6e 2f 6a 73 6f 6e 0d 0a 0d 0a 7b   cation/json....{
00c0  22 76 65 72 73 69 6f 6e 22 3a 22 30 2e 30 2e 31   "version":"0.0.1
00d0  22 7d                                             "}
</code></pre>
<p>We can protect our traffic by encrypting it using <a target="_blank" href="https://en.wikipedia.org/wiki/HTTPS">Hypertext Transfer Protocol Secure (HTTPS)</a>.</p>
<h4 id="heading-how-to-create-the-secure-socket-layer-ssl-certificates">How to create the Secure Sockets Layer (SSL) certificates</h4>
<p>Let me show you real quick <a target="_blank" href="https://github.com/rob-blackbourn/ssl-certs">how you can install a self-signed server certificate</a> on Fedora using <a target="_blank" href="https://github.com/cloudflare/cfssl">Cloudflare cfssl</a>. First let's install the tools:</p>
<pre><code class="lang-shell"># On Fedora just do 
sudo dnf install -y golang-github-cloudflare-cfssl
# Or, with a recent Go toolchain: go install github.com/cloudflare/cfssl/cmd/...@latest
</code></pre>
<p>The next step is to create a certificate authority (CA). We will use it to sign other certificates. For that, let's create a definition in JSON format (ca.json):</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"CN"</span>: <span class="hljs-string">"Nunez Barrios family Root CA"</span>,
  <span class="hljs-attr">"key"</span>: {
    <span class="hljs-attr">"algo"</span>: <span class="hljs-string">"rsa"</span>,
    <span class="hljs-attr">"size"</span>: <span class="hljs-number">2048</span>
  },
  <span class="hljs-attr">"names"</span>: [
  {
    <span class="hljs-attr">"C"</span>: <span class="hljs-string">"US"</span>,
    <span class="hljs-attr">"L"</span>: <span class="hljs-string">"CT"</span>,
    <span class="hljs-attr">"O"</span>: <span class="hljs-string">"Nunez Barrios"</span>,
    <span class="hljs-attr">"OU"</span>: <span class="hljs-string">"Nunez Barrios Root CA"</span>,
    <span class="hljs-attr">"ST"</span>: <span class="hljs-string">"United States"</span>
  }
 ]
}
</code></pre>
<p>Create the certificate:</p>
<pre><code class="lang-shell">cfssl gencert -initca ca.json | cfssljson -bare ca
</code></pre>
<p>Next we need to create a profile file (cfssl.json), that will specify certain features of the certificates, like expiration in 2 years:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"signing"</span>: {
    <span class="hljs-attr">"default"</span>: {
      <span class="hljs-attr">"expiry"</span>: <span class="hljs-string">"17532h"</span>
    },
    <span class="hljs-attr">"profiles"</span>: {
      <span class="hljs-attr">"intermediate_ca"</span>: {
        <span class="hljs-attr">"usages"</span>: [
            <span class="hljs-string">"signing"</span>,
            <span class="hljs-string">"digital signature"</span>,
            <span class="hljs-string">"key encipherment"</span>,
            <span class="hljs-string">"cert sign"</span>,
            <span class="hljs-string">"crl sign"</span>,
            <span class="hljs-string">"server auth"</span>,
            <span class="hljs-string">"client auth"</span>
        ],
        <span class="hljs-attr">"expiry"</span>: <span class="hljs-string">"17532h"</span>,
        <span class="hljs-attr">"ca_constraint"</span>: {
            <span class="hljs-attr">"is_ca"</span>: <span class="hljs-literal">true</span>,
            <span class="hljs-attr">"max_path_len"</span>: <span class="hljs-number">0</span>, 
            <span class="hljs-attr">"max_path_len_zero"</span>: <span class="hljs-literal">true</span>
        }
      },
      <span class="hljs-attr">"peer"</span>: {
        <span class="hljs-attr">"usages"</span>: [
            <span class="hljs-string">"signing"</span>,
            <span class="hljs-string">"digital signature"</span>,
            <span class="hljs-string">"key encipherment"</span>, 
            <span class="hljs-string">"client auth"</span>,
            <span class="hljs-string">"server auth"</span>
        ],
        <span class="hljs-attr">"expiry"</span>: <span class="hljs-string">"17532h"</span>
      },
      <span class="hljs-attr">"server"</span>: {
        <span class="hljs-attr">"usages"</span>: [
          <span class="hljs-string">"signing"</span>,
          <span class="hljs-string">"digital signature"</span>,
          <span class="hljs-string">"key encipherment"</span>,
          <span class="hljs-string">"server auth"</span>
        ],
        <span class="hljs-attr">"expiry"</span>: <span class="hljs-string">"17532h"</span>
      },
      <span class="hljs-attr">"client"</span>: {
        <span class="hljs-attr">"usages"</span>: [
          <span class="hljs-string">"signing"</span>,
          <span class="hljs-string">"digital signature"</span>,
          <span class="hljs-string">"key encipherment"</span>, 
          <span class="hljs-string">"client auth"</span>
        ],
        <span class="hljs-attr">"expiry"</span>: <span class="hljs-string">"17532h"</span>
      }
    }
  }
}
</code></pre>
<p>Now we create an intermediate certificate (intermediate-ca.json) that will expire in 5 years:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"CN"</span>: <span class="hljs-string">"Barrios Nunez Intermediate CA"</span>,
  <span class="hljs-attr">"key"</span>: {
    <span class="hljs-attr">"algo"</span>: <span class="hljs-string">"rsa"</span>,
    <span class="hljs-attr">"size"</span>: <span class="hljs-number">2048</span>
  },
  <span class="hljs-attr">"names"</span>: [
    {
      <span class="hljs-attr">"C"</span>:  <span class="hljs-string">"US"</span>,
      <span class="hljs-attr">"L"</span>:  <span class="hljs-string">"CT"</span>,
      <span class="hljs-attr">"O"</span>:  <span class="hljs-string">"Barrios Nunez"</span>,
      <span class="hljs-attr">"OU"</span>: <span class="hljs-string">"Barrios Nunez Intermediate CA"</span>,
      <span class="hljs-attr">"ST"</span>: <span class="hljs-string">"USA"</span>
    }
  ],
  <span class="hljs-attr">"ca"</span>: {
    <span class="hljs-attr">"expiry"</span>: <span class="hljs-string">"43830h"</span>
  }
}
</code></pre>
<p>Here's the command to do it:</p>
<pre><code class="lang-shell">cfssl gencert -initca intermediate-ca.json | cfssljson -bare intermediate_ca
cfssl sign -ca ca.pem -ca-key ca-key.pem -config cfssl.json -profile intermediate_ca intermediate_ca.csr | cfssljson -bare intermediate_ca
</code></pre>
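<p>The <code>expiry</code> values in cfssl.json and intermediate-ca.json are written in hours, which is not easy to read at a glance. This small sketch converts them to years, assuming an average year of 365.25 days (<code>expiry_in_years</code> is a hypothetical helper, and it only handles the <code>h</code> suffix used in this article):</p>

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours, averaging in leap years


def expiry_in_years(expiry):
    """Convert an expiry such as '17532h' to years."""
    if not expiry.endswith("h"):
        raise ValueError(f"expected a value in hours like '17532h', got {expiry!r}")
    return int(expiry[:-1]) / HOURS_PER_YEAR


print(expiry_in_years("17532h"))  # 2.0, used by the signing profiles
print(expiry_in_years("43830h"))  # 5.0, used by the intermediate CA
```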
<h3 id="heading-next-step-is-to-create-the-host-certificates">Next step is to create the host certificates</h3>
<p>You will need to put your fully-qualified host name (<code>hostname -f</code>) in the host-1.json file. Some software also expects the IP address (<code>ip address|grep inet</code>), so we will include both:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"CN"</span>: <span class="hljs-string">"dmaf5.home"</span>,
  <span class="hljs-attr">"key"</span>: {
    <span class="hljs-attr">"algo"</span>: <span class="hljs-string">"rsa"</span>,
    <span class="hljs-attr">"size"</span>: <span class="hljs-number">2048</span>
  },
  <span class="hljs-attr">"names"</span>: [
  {
    <span class="hljs-attr">"C"</span>: <span class="hljs-string">"US"</span>,
    <span class="hljs-attr">"L"</span>: <span class="hljs-string">"CT"</span>,
    <span class="hljs-attr">"O"</span>: <span class="hljs-string">"Barrios Nunez"</span>,
    <span class="hljs-attr">"OU"</span>: <span class="hljs-string">"Barrios Nunez Hosts"</span>,
    <span class="hljs-attr">"ST"</span>: <span class="hljs-string">"USA"</span>
  }
  ],
  <span class="hljs-attr">"hosts"</span>: [
    <span class="hljs-string">"dmaf5.home"</span>,
    <span class="hljs-string">"localhost"</span>,
    <span class="hljs-string">"dmaf5"</span>,
    <span class="hljs-string">"192.168.1.23"</span>,
    <span class="hljs-string">"192.168.1.26"</span>
  ]
}
</code></pre>
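<p>A TLS client compares the name or address it connected to with the certificate's subject alternative names, where DNS names and IP addresses are separate entry types. Before generating the certificates it can help to check which entries of your <code>hosts</code> list fall in each group. Here is a minimal sketch with the standard <code>ipaddress</code> module (<code>split_hosts</code> is a hypothetical helper, not part of cfssl):</p>

```python
import ipaddress


def split_hosts(hosts):
    """Separate a cfssl-style 'hosts' list into DNS names and IP addresses."""
    names, addresses = [], []
    for host in hosts:
        try:
            ipaddress.ip_address(host)
        except ValueError:
            names.append(host)  # not a valid IPv4/IPv6 literal, treat it as a name
        else:
            addresses.append(host)
    return names, addresses


hosts = ["dmaf5.home", "localhost", "dmaf5", "192.168.1.23", "192.168.1.26"]
names, addresses = split_hosts(hosts)
print(names)      # ['dmaf5.home', 'localhost', 'dmaf5']
print(addresses)  # ['192.168.1.23', '192.168.1.26']
```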
<p>You can create three certificate types:</p>
<ul>
<li><p>client</p>
</li>
<li><p>server</p>
</li>
<li><p>peer</p>
</li>
</ul>
<p>We'll use only the server certificate, but we'll create all three:</p>
<pre><code class="lang-shell">cfssl gencert -ca intermediate_ca.pem -ca-key intermediate_ca-key.pem -config cfssl.json -profile=peer host-1.json| cfssljson -bare host-1-peer  # Peer
cfssl gencert -ca intermediate_ca.pem -ca-key intermediate_ca-key.pem -config cfssl.json -profile=server host-1.json | cfssljson -bare host-1-server  # Server
cfssl gencert -ca intermediate_ca.pem -ca-key intermediate_ca-key.pem -config cfssl.json -profile=client host-1.json | cfssljson -bare host-1-client  # Client
</code></pre>
<p>We are very close now. Install the intermediate certificate into the proper location so the clients on dmaf5 do not complain about the self-signed certificate:</p>
<pre><code class="lang-shell"># The path below is for Fedora, please check your OS documentation to find the right path for you
sudo /bin/cp --preserve --verbose tutorial/intermediate_ca.pem /etc/pki/ca-trust/source/anchors/
sudo update-ca-trust
</code></pre>
<p>Restart uvicorn to listen now only on a secure port, using the host key and certificates we just created:</p>
<pre><code class="lang-shell">(home_nmap) [josevnz@dmaf5 home_nmap]$ uvicorn home_nmap.main:app --host 0.0.0.0 --port 8443 --reload --ssl-keyfile=$PWD/tutorial/host-1-server-key.pem --ssl-certfile=$PWD/tutorial/host-1-server.pem
INFO:     Will watch for changes in these directories: ['/home/josevnz/Documents/home_nmap']
INFO:     Uvicorn running on https://0.0.0.0:8443 (Press CTRL+C to quit)
INFO:     Started reloader process [166275] using watchgod
INFO:     Started server process [166277]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     192.168.1.23:47704 - "GET /version HTTP/1.1" 200 OK
</code></pre>
<p>And then test with curl (without the --insecure flag, no complaints from curl):</p>
<pre><code class="lang-shell">[josevnz@dmaf5 ~]$ curl --fail https://dmaf5.home:8443/version
{"version":"0.0.1"}[josevnz@dmaf5 ~]$
</code></pre>
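<p>A Python client can apply the same checks curl does. The sketch below builds a TLS context that verifies both the certificate chain and the host name; <code>make_context</code> and <code>get_version</code> are hypothetical helpers. With <code>cafile=None</code> Python should use the system trust store, which on Fedora is what <code>update-ca-trust</code> refreshed, but treat that as an assumption to verify on your own system:</p>

```python
import json
import ssl
import urllib.request


def make_context(cafile=None):
    """Build a TLS context that verifies the server certificate and host name.

    cafile=None uses the system trust store; otherwise pass the path
    to a bundle of CA certificates in PEM format.
    """
    return ssl.create_default_context(cafile=cafile)


def get_version(url, context):
    """Fetch and decode the JSON document served at the given HTTPS URL."""
    with urllib.request.urlopen(url, context=context, timeout=10) as response:
        return json.load(response)
```

<p>With the service running, <code>get_version("https://dmaf5.home:8443/version", make_context())</code> should return the same version document. If the certificate is not trusted or the host name does not match, <code>urlopen</code> raises an error instead of silently continuing.</p>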
<p>Try again to capture the version of our service using tshark:</p>
<pre><code class="lang-shell"># 'tshark -i eno1 -Px -Y http' doesn't work anymore as the payload is encrypted. So at least lets see how the SSL hello goes
tshark -i eno1 -Y ssl -Px
  343 59.344539258 192.168.1.11 → 192.168.1.23 TLSv1 583 Client Hello

0000  1c 83 41 28 44 21 dc a6 32 f9 47 48 08 00 45 00   ..A(D!..2.GH..E.
0010  02 39 8b 6b 40 00 40 06 29 e1 c0 a8 01 0b c0 a8   .9.k@.@.).......
0020  01 17 93 14 20 fb 10 10 d7 6f 7d ff f7 c1 80 18   .... ....o}.....
0030  01 f6 0b fe 00 00 01 01 08 0a 08 a5 00 20 a8 8d   ............. ..
0040  27 47 16 03 01 02 00 01 00 01 fc 03 03 39 03 ac   'G...........9..
0050  19 7c bd 38 dc e2 cf 72 8b 7e 00 e2 2d fc 68 7a   .|.8...r.~..-.hz
0060  cc af 9c d6 d5 1d ed 94 79 b2 0f c8 cf 20 a3 f8   ........y.... ..
0070  2a 8e 20 c0 d2 c1 57 ee 36 48 2e 8f 46 e7 da 76   *. ...W.6H..F..v
0080  69 67 d1 9d 5a 70 24 0e 7d ea ec 8b e2 a0 00 3e   ig..Zp$.}......&gt;
0090  13 02 13 03 13 01 c0 2c c0 30 00 9f cc a9 cc a8   .......,.0......
00a0  cc aa c0 2b c0 2f 00 9e c0 24 c0 28 00 6b c0 23   ...+./...$.(.k.#
00b0  c0 27 00 67 c0 0a c0 14 00 39 c0 09 c0 13 00 33   .'.g.....9.....3
00c0  00 9d 00 9c 00 3d 00 3c 00 35 00 2f 00 ff 01 00   .....=.&lt;.5./....
00d0  01 75 00 00 00 0f 00 0d 00 00 0a 64 6d 61 66 35   .u.........dmaf5
00e0  2e 68 6f 6d 65 00 0b 00 04 03 00 01 02 00 0a 00   .home...........
00f0  0c 00 0a 00 1d 00 17 00 1e 00 19 00 18 33 74 00   .............3t.
0100  00 00 10 00 0e 00 0c 02 68 32 08 68 74 74 70 2f   ........h2.http/
0110  31 2e 31 00 16 00 00 00 17 00 00 00 31 00 00 00   1.1.........1...
0120  0d 00 2a 00 28 04 03 05 03 06 03 08 07 08 08 08   ..*.(...........
0130  09 08 0a 08 0b 08 04 08 05 08 06 04 01 05 01 06   ................
0140  01 03 03 03 01 03 02 04 02 05 02 06 02 00 2b 00   ..............+.
</code></pre>
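<p>Note that someone who captures the traffic and later obtains the private key may be able to decrypt it, in particular when the negotiated cipher suite does not provide forward secrecy. Either way, a stolen key lets an attacker impersonate your server, which is why it is so important that you keep that file secure.</p>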
<p>What about our authorized request using the API key + encryption?</p>
<pre><code class="lang-shell">josevnz@raspberrypi:~$ curl 'https://dmaf5.home:8443/local_networks' --header 'accept: application/json' --header 'api-key: e4c03730-02a1-4cb9-8e00-36a63930c064'
["192.168.1.0/24"]
</code></pre>
<p>Our application setup is now complete.</p>
<h1 id="heading-what-did-we-learn">What did we learn?</h1>
<p>In this article, we covered many topics and went from a very simple XML parser to a self-documenting web service. Not bad for a single session!</p>
<p>You should know about the following topics now:</p>
<ul>
<li><p>How to parse an Nmap XML results file, and enrich it with security advisories from NIST</p>
</li>
<li><p>How to enhance Nmap by mixing it with other scripts to automate its execution</p>
</li>
<li><p>How to apply Nmap options to make our local network scan faster</p>
</li>
<li><p>What pivoting is and how you can use it to bypass firewall protections with the help of SSH and tcpproxy</p>
</li>
<li><p>How to write a REST-API on top of our original CLI script and secure it with SSL and basic authentication</p>
</li>
<li><p>How to add authorization to a web service using an API key</p>
</li>
<li><p>How to use tshark to demonstrate how HTTP traffic can be captured, and show the data payload</p>
</li>
<li><p>How to add encryption to a web service, by creating self-signed certificates</p>
</li>
</ul>
<h3 id="heading-and-what-else-could-you-learn-here-are-some-final-suggestions">And what else could you learn? Here are some final suggestions:</h3>
<ul>
<li><p>Check out the official Nmap <a target="_blank" href="https://nmap.org/docs.html">documentation</a>.</p>
</li>
<li><p><a target="_blank" href="https://nmap.org/book/osdetect.html">Operating system fingerprinting</a> is fascinating. Figuring out what exactly runs behind a port is an art and a moving target.</p>
</li>
<li><p>Integration with other great <a target="_blank" href="https://en.wikipedia.org/wiki/Penetration_test">penetration testing</a> tools like <a target="_blank" href="https://github.com/rapid7/metasploit-framework">Metasploit</a>, which, you guessed it, <a target="_blank" href="https://www.offensive-security.com/metasploit-unleashed/custom-scripting/">can also be scripted in Ruby</a>!</p>
</li>
<li><p>As a bonus, my code can be installed using <a target="_blank" href="https://pip.pypa.io/en/stable/">pip</a>, and it comes with some unit tests you can run with <a target="_blank" href="https://docs.python.org/3/library/unittest.html">unittest</a>. I welcome pull requests and suggestions.</p>
</li>
</ul>
<p>Feel free to reach out with your comments and <a target="_blank" href="https://github.com/josevnz/home_nmap/issues">bug reports</a>. I hope you enjoy using it as much as I enjoyed writing it.</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
