<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ Atharva Shah - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ Atharva Shah - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Sat, 06 Jun 2026 11:16:51 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/author/AtharvaShah/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ The Python Decorator Handbook ]]>
                </title>
                <description>
                    <![CDATA[ Python decorators provide an easy yet powerful syntax for modifying and extending the behavior of functions in your code. A decorator is essentially a function that takes another function, augments its functionality, and returns a new function – with... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/the-python-decorator-handbook/</link>
                <guid isPermaLink="false">66d45d9f052ad259f07e4a69</guid>
                
                    <category>
                        <![CDATA[ decorator ]]>
                    </category>
                
                    <category>
                        <![CDATA[ handbook ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python 3 ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Atharva Shah ]]>
                </dc:creator>
                <pubDate>Fri, 26 Jan 2024 17:17:03 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2024/01/The-Python-Decorator-Handbook-Cover.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Python decorators provide an easy yet powerful syntax for modifying and extending the behavior of functions in your code.</p>
<p>A decorator is essentially a function that takes another function, augments its functionality, and returns a new function – without permanently modifying the original function itself.</p>
<p>This tutorial will walk you through 11 handy decorators to help add functionality like timing execution, caching, rate limiting, debugging and more. Whether you want to profile performance, improve efficiency, validate data, or manage errors, these decorators have got you covered!</p>
<p>The examples here focus on the common usage patterns and utilities of decorators that can come in handy in your day-to-day programming and save you a lot of effort. Understanding the flexibility of decorators will help you write clean, resilient, and optimized application code.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<p>Here are the decorators covered in this tutorial:</p>
<ul>
<li><p><a class="post-section-overview" href="#heading-log-arguments-and-return-value-of-a-function">Log Arguments and Return Value of a Function</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-get-the-execution-time-of-a-function">Get the Execution Time of a Function</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-convert-function-return-value-to-a-specified-data-type">Convert Function Return Value to a Specified Data Type</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-cache-function-results">Cache Function Results</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-validate-function-arguments-based-on-condition">Validate Function Arguments Based on Condition</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-retry-a-function-multiple-times-on-failure">Retry a Function Multiple Times on Failure</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-enforce-rate-limits-on-a-function">Enforce Rate Limits on a Function</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-handle-exceptions-and-provide-default-response">Handle Exceptions and Provide Default Response</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-enforce-type-checking-on-function-arguments">Enforce Type Checking on Function Arguments</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-measure-memory-usage-of-a-function">Measure Memory Usage of a Function</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-cache-function-results-with-expiration-time">Cache Function Results with Expiration Time</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<p>But first, a little introduction.</p>
<h2 id="heading-how-python-decorators-work">How Python Decorators Work</h2>
<p>Before diving in, let's understand some key benefits of decorators in Python:</p>
<ul>
<li><p><strong>Enhancing functions without invasive changes:</strong> Decorators augment functions transparently without altering the original code, keeping the core logic clean and maintainable.</p>
</li>
<li><p><strong>Reusing functionality across places:</strong> Common capabilities like logging, caching, and rate limiting can be built once in decorators and applied wherever needed.</p>
</li>
<li><p><strong>Readable and declarative syntax:</strong> The <code>@decorator</code> syntax simply conveys functionality enhancement at the definition site.</p>
</li>
<li><p><strong>Modularity and separation of concerns:</strong> Decorators promote loose coupling between functional logic and secondary capabilities like performance, security, logging etc.</p>
</li>
</ul>
<p>The takeaway is that decorators unlock simple yet flexible ways of transparently enhancing Python functions for improved code organization, efficiency, and reuse without introducing complexity or redundancy.</p>
<p>Here is a basic example of decorator syntax in Python with annotations:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Decorator function</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">my_decorator</span>(<span class="hljs-params">func</span>):</span>

    <span class="hljs-comment"># Wrapper function</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">wrapper</span>():</span>
        print(<span class="hljs-string">"Before the function call"</span>) <span class="hljs-comment"># Extra processing before the function</span>
        func() <span class="hljs-comment"># Call the actual function being decorated</span>
        print(<span class="hljs-string">"After the function call"</span>) <span class="hljs-comment"># Extra processing after the function</span>
    <span class="hljs-keyword">return</span> wrapper <span class="hljs-comment"># Return the nested wrapper function</span>

<span class="hljs-comment"># Function to decorate, with the decorator applied using the @ syntax</span>
<span class="hljs-meta">@my_decorator</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">my_function</span>():</span>
    print(<span class="hljs-string">"Inside my function"</span>)

<span class="hljs-comment"># Call the decorated function</span>
my_function()
</code></pre>
<p>A decorator in Python is a function that takes another function as an argument and extends its behavior without modifying it. The decorator function wraps the original function by defining a wrapper function inside of it. This wrapper function executes code before and after calling the original function.</p>
<p>Specifically, when defining a decorator function such as <code>my_decorator</code> in the example, it takes a function as an argument, which we generally call <code>func</code>. This <code>func</code> will be the actual function that is decorated under the hood.</p>
<p>The wrapper function inside <code>my_decorator</code> can execute arbitrary code before and after calling <code>func()</code>, which invokes the original function. When you place <code>@my_decorator</code> above the definition of <code>my_function</code>, Python passes <code>my_function</code> as an argument to <code>my_decorator</code>, so <code>func</code> refers to <code>my_function</code> in that context.</p>
<p>The decorator then returns the wrapper function, and the name <code>my_function</code> is rebound to that wrapper. So now <code>my_function</code> has been decorated by <code>my_decorator</code>. When it is later called, the wrapper code defined inside <code>my_decorator</code> executes before and after the original function body runs. This allows decorators to transparently extend the behavior of a function, without needing to modify the function itself.</p>
<p>And as you'll recall, the body of the original <code>my_function</code> remains unchanged, keeping decorators non-invasive and flexible.</p>
<p>When <code>my_function()</code> is decorated with <code>@my_decorator</code>, it is automatically enhanced. The <code>my_decorator</code> function here returns a wrapper function. This wrapper function is what gets executed whenever <code>my_function()</code> is called.</p>
<p>First, the wrapper prints <code>"Before the function call"</code> before actually calling the original <code>my_function()</code> function being decorated. Then, after <code>my_function()</code> executes, it prints <code>"After the function call"</code>.</p>
<p>So, additional behavior and printed messages are added before and after the <code>my_function()</code> execution in the wrapper, without directly modifying <code>my_function()</code> itself. The decorator allows you to extend <code>my_function()</code> in a transparent way without affecting its core logic, as the wrapper handles the enhanced behavior.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2024/01/image-109.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Applying a Decorator to a Function</em></p>
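<p>To make the mechanics above concrete, here is a minimal sketch of what the <code>@my_decorator</code> line does, written out as a plain assignment. The <code>calls</code> list is a hypothetical helper added only to record the order of events, and is not part of the original example.</p>

```python
calls = []  # hypothetical list, used here only to record the order of events


def my_decorator(func):
    def wrapper():
        calls.append("before")  # extra processing before the function
        func()                  # call the function being decorated
        calls.append("after")   # extra processing after the function
    return wrapper


def my_function():
    calls.append("inside")


# Equivalent to placing @my_decorator above the definition of my_function
my_function = my_decorator(my_function)

my_function()
print(calls)
```

<p>After the assignment, the name <code>my_function</code> points at <code>wrapper</code>, which is why the extra steps run on every call.</p>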
<p>So let's start exploring the top 11 practical decorators that every Python developer should know.</p>
<h2 id="heading-log-arguments-and-return-value-of-a-function">Log Arguments and Return Value of a Function</h2>
<p>The Log Arguments and Return Value decorator tracks the input parameters and output of functions. This supports debugging by logging a clear record of data flow through complex operations.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">log_decorator</span>(<span class="hljs-params">original_function</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">wrapper</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
        print(<span class="hljs-string">f"Calling <span class="hljs-subst">{original_function.__name__}</span> with args: <span class="hljs-subst">{args}</span>, kwargs: <span class="hljs-subst">{kwargs}</span>"</span>)

        <span class="hljs-comment"># Call the original function</span>
        result = original_function(*args, **kwargs)

        <span class="hljs-comment"># Log the return value</span>
        print(<span class="hljs-string">f"<span class="hljs-subst">{original_function.__name__}</span> returned: <span class="hljs-subst">{result}</span>"</span>)

        <span class="hljs-comment"># Return the result</span>
        <span class="hljs-keyword">return</span> result
    <span class="hljs-keyword">return</span> wrapper

<span class="hljs-comment"># Example usage</span>
<span class="hljs-meta">@log_decorator</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_product</span>(<span class="hljs-params">x, y</span>):</span>
    <span class="hljs-keyword">return</span> x * y

<span class="hljs-comment"># Call the decorated function</span>
result = calculate_product(<span class="hljs-number">10</span>, <span class="hljs-number">20</span>)
print(<span class="hljs-string">"Result:"</span>, result)
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-javascript">Calling calculate_product <span class="hljs-keyword">with</span> args: (<span class="hljs-number">10</span>, <span class="hljs-number">20</span>), <span class="hljs-attr">kwargs</span>: {}
calculate_product returned: <span class="hljs-number">200</span>
<span class="hljs-attr">Result</span>: <span class="hljs-number">200</span>
</code></pre>
<p>In this example, the decorator function is named <code>log_decorator()</code> and accepts a function, <code>original_function</code>, as its argument. Within <code>log_decorator()</code>, a nested function called <code>wrapper()</code> is defined. This <code>wrapper()</code> function is what the decorator returns and effectively replaces the original function.</p>
<p>When the <code>wrapper()</code> function is invoked, it prints logging statements pertaining to the function call. Then it calls the original function, <code>original_function</code>, captures its result, prints the outcome, and returns the result.</p>
<p>The <code>@log_decorator</code> syntax above the <code>calculate_product()</code> function is a Python convention to apply the <code>log_decorator</code> as a decorator to the <code>calculate_product</code> function. So when <code>calculate_product()</code> is invoked, it's actually invoking the <code>wrapper()</code> function returned by <code>log_decorator()</code>. Therefore, <code>log_decorator()</code> acts as a wrapper, introducing logging statements before and after the execution of the original <code>calculate_product()</code> function.</p>
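<p>Because the wrapper replaces the original function, attributes such as <code>__name__</code> and the docstring would normally be those of <code>wrapper()</code>. One common way to preserve them is the standard library's <code>functools.wraps</code>. The following is a sketch of that variation, not a change to the example above; the docstring on <code>calculate_product()</code> is added here purely for illustration.</p>

```python
import functools


def log_decorator(original_function):
    # functools.wraps copies __name__, __doc__ and similar metadata onto wrapper
    @functools.wraps(original_function)
    def wrapper(*args, **kwargs):
        print(f"Calling {original_function.__name__} with args: {args}, kwargs: {kwargs}")
        result = original_function(*args, **kwargs)
        print(f"{original_function.__name__} returned: {result}")
        return result
    return wrapper


@log_decorator
def calculate_product(x, y):
    """Return the product of x and y."""
    return x * y


# The decorated function still reports its own name and docstring
print(calculate_product.__name__)
print(calculate_product.__doc__)
```

<p>This tends to matter for debugging tools, documentation generators, and stack traces, which all rely on that metadata.</p>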
<h3 id="heading-usage-and-applications">Usage and Applications</h3>
<p>This decorator is widely adopted in application development for adding runtime logging without interfering with business logic implementation.</p>
<p>For example, consider a banking application that processes financial transactions. The core transaction processing logic resides in functions like <code>transfer_funds()</code> and <code>accept_payment()</code>. To monitor these transactions, logging can be added by including <code>@log_decorator</code> above each function.</p>
<p>Then when transactions are triggered by calling <code>transfer_funds()</code>, you can print the function name and arguments like the sender, receiver, and amount before the actual transfer. After the function returns, you can print whether the transfer succeeded or failed.</p>
<p>This type of logging with decorators allows you to track transactions without adding any code to core functions like <code>transfer_funds()</code>. The logic stays clean while debuggability and observability improve. Logging messages can be directed to a monitoring dashboard or log analytics system as well.</p>
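<p>As a rough sketch of the banking scenario, the version below swaps <code>print()</code> for the standard <code>logging</code> module so that messages can be routed to other destinations through handlers. The logger name <code>"transactions"</code> and the body of <code>transfer_funds()</code> are assumptions made up for this illustration.</p>

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("transactions")  # hypothetical logger name


def log_decorator(original_function):
    @functools.wraps(original_function)
    def wrapper(*args, **kwargs):
        logger.info("Calling %s with args=%r kwargs=%r",
                    original_function.__name__, args, kwargs)
        try:
            result = original_function(*args, **kwargs)
        except Exception:
            # Record the failure with its traceback, then re-raise unchanged
            logger.exception("%s failed", original_function.__name__)
            raise
        logger.info("%s returned %r", original_function.__name__, result)
        return result
    return wrapper


@log_decorator
def transfer_funds(sender, receiver, amount):
    # Placeholder body: a real implementation would move money between accounts
    return {"sender": sender, "receiver": receiver, "amount": amount, "status": "ok"}


receipt = transfer_funds("alice", "bob", amount=25)
```

<p>Attaching a different handler to the logger would redirect these records without touching <code>transfer_funds()</code> or the decorator.</p>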
<h2 id="heading-get-the-execution-time-of-a-function">Get the Execution Time of a Function</h2>
<p>This decorator is your ally in the quest for performance optimization. By measuring and logging the execution time of a function, this decorator facilitates a deep dive into the efficiency of your code, helping you pinpoint bottlenecks and streamline your application's performance.</p>
<p>It's ideal for scenarios where speed is crucial, such as real-time applications or large-scale data processing. And it allows you to identify and address performance bottlenecks systematically.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> time

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">measure_execution_time</span>(<span class="hljs-params">func</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">timed_execution</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
        start_timestamp = time.time()
        result = func(*args, **kwargs)
        end_timestamp = time.time()
        execution_duration = end_timestamp - start_timestamp
        print(<span class="hljs-string">f"Function <span class="hljs-subst">{func.__name__}</span> took <span class="hljs-subst">{execution_duration:<span class="hljs-number">.2</span>f}</span> seconds to execute"</span>)
        <span class="hljs-keyword">return</span> result
    <span class="hljs-keyword">return</span> timed_execution

<span class="hljs-comment"># Example usage</span>
<span class="hljs-meta">@measure_execution_time</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">multiply_numbers</span>(<span class="hljs-params">numbers</span>):</span>
    product = <span class="hljs-number">1</span>
    <span class="hljs-keyword">for</span> num <span class="hljs-keyword">in</span> numbers:
        product *= num
    <span class="hljs-keyword">return</span> product

<span class="hljs-comment"># Call the decorated function</span>
result = multiply_numbers([i <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(<span class="hljs-number">1</span>, <span class="hljs-number">10</span>)])
print(<span class="hljs-string">f"Result: <span class="hljs-subst">{result}</span>"</span>)
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-javascript"><span class="hljs-built_in">Function</span> multiply_numbers took <span class="hljs-number">0.00</span> seconds to execute
<span class="hljs-attr">Result</span>: <span class="hljs-number">362880</span>
</code></pre>
<p>This code showcases a decorator that's designed to measure the execution duration of functions.</p>
<p>The <code>measure_execution_time()</code> decorator takes a function, <code>func</code>, and defines an inner function, <code>timed_execution()</code>, to wrap the original function. Upon invocation, <code>timed_execution()</code> records the start time, calls the original function, records the end time, calculates the duration, and prints it.</p>
<p>The <code>@measure_execution_time</code> syntax applies this decorator to the function defined directly below it, such as <code>multiply_numbers()</code>. Consequently, when <code>multiply_numbers()</code> is called, it invokes the <code>timed_execution()</code> wrapper, which prints the duration and then returns the function result.</p>
<p>This example illustrates how decorators seamlessly augment existing functions with additional functionality, like timing, without direct modification.</p>
<h3 id="heading-usage-and-applications-1">Usage and Applications</h3>
<p>This decorator is helpful in profiling functions to identify performance bottlenecks in applications. For example, consider an e-commerce site with several backend functions like <code>get_recommendations()</code>, <code>calculate_shipping()</code>, and so on. By decorating them with <code>@measure_execution_time</code>, you can monitor their runtime.</p>
<p>When <code>get_recommendations()</code> is invoked in a user session, the decorator will time its execution duration by recording a start and end timestamp. After execution, it will print the time taken before returning recommendations.</p>
<p>Doing this systematically across applications and analyzing outputs will show you the functions that are taking an unusually long time. The development team can then optimize such functions through caching, parallel processing, and other techniques to improve overall application performance.</p>
<p>Without such timing decorators, finding optimization candidates would require tedious logging code additions. Decorators provide visibility easily without contaminating business logic.</p>
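<p>If you want to analyze timings across many calls rather than read printed lines, one option is to collect durations in a shared structure. The sketch below assumes a hypothetical in-memory <code>timings</code> dictionary and a placeholder <code>calculate_shipping()</code> body. It also uses <code>time.perf_counter()</code>, which is generally better suited to measuring short intervals than <code>time.time()</code>.</p>

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-memory store: function name mapped to a list of durations
timings = defaultdict(list)


def measure_execution_time(func):
    @functools.wraps(func)
    def timed_execution(*args, **kwargs):
        start = time.perf_counter()  # monotonic, high-resolution clock
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed too
            timings[func.__name__].append(time.perf_counter() - start)
    return timed_execution


@measure_execution_time
def calculate_shipping(weight_kg):
    # Placeholder logic standing in for a real backend call
    time.sleep(0.01)
    return 5.0 + 1.5 * weight_kg


calculate_shipping(2)
calculate_shipping(4)

for name, durations in timings.items():
    average = sum(durations) / len(durations)
    print(f"{name}: {len(durations)} calls, average {average:.4f} s")
```

<p>In a real service you would likely send these numbers to a metrics system instead of keeping them in a module-level dictionary.</p>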
<h2 id="heading-convert-function-return-value-to-a-specified-data-type">Convert Function Return Value to a Specified Data Type</h2>
<p>The Convert Return Value Type decorator enhances data consistency in functions by automatically converting the return value to a specified data type, promoting predictability and preventing unexpected errors. It is particularly useful for downstream processes that require consistent data types, reducing runtime errors.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">convert_to_data_type</span>(<span class="hljs-params">target_type</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">type_converter_decorator</span>(<span class="hljs-params">func</span>):</span>
        <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">wrapper</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
            result = func(*args, **kwargs)
            <span class="hljs-keyword">return</span> target_type(result)
        <span class="hljs-keyword">return</span> wrapper
    <span class="hljs-keyword">return</span> type_converter_decorator

<span class="hljs-meta">@convert_to_data_type(int)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">add_values</span>(<span class="hljs-params">a, b</span>):</span>
    <span class="hljs-keyword">return</span> a + b

int_result = add_values(<span class="hljs-number">10</span>, <span class="hljs-number">20</span>)
print(<span class="hljs-string">"Result:"</span>, int_result, type(int_result))

<span class="hljs-meta">@convert_to_data_type(str)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">concatenate_strings</span>(<span class="hljs-params">str1, str2</span>):</span>
    <span class="hljs-keyword">return</span> str1 + str2

str_result = concatenate_strings(<span class="hljs-string">"Python"</span>, <span class="hljs-string">" Decorator"</span>)
print(<span class="hljs-string">"Result:"</span>, str_result, type(str_result))
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-javascript">Result: <span class="hljs-number">30</span> &lt;<span class="hljs-class"><span class="hljs-keyword">class</span> '<span class="hljs-title">int</span>'&gt;
<span class="hljs-title">Result</span>: <span class="hljs-title">Python</span> <span class="hljs-title">Decorator</span> &lt;<span class="hljs-title">class</span> '<span class="hljs-title">str</span>'&gt;</span>
</code></pre>
<p>The above code example shows a decorator that's designed to convert the return value of a function to a specified data type.</p>
<p>The decorator, named <code>convert_to_data_type()</code>, takes the target data type as a parameter and returns a decorator named <code>type_converter_decorator()</code>. Within this decorator, a <code>wrapper()</code> function is defined to call the original function, convert its return value to the target type using <code>target_type()</code>, and subsequently return the converted result.</p>
<p>The syntax <code>@convert_to_data_type(int)</code> that's applied above a function (such as <code>add_values()</code>) utilizes this decorator to convert the return value to an integer. Similarly, for <code>concatenate_strings()</code>, passing <code>str</code> formats the return value as a string.</p>
<p>This example also showcases how decorators seamlessly modify function outputs to desired formats without altering the core logic of the functions.</p>
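<p>Since <code>convert_to_data_type()</code> is a function that returns a decorator, it can help to see the two steps separately. The sketch below applies them by hand without the <code>@</code> syntax. The names <code>to_int</code> and <code>add_values_as_int</code> are introduced only for this illustration.</p>

```python
def convert_to_data_type(target_type):
    def type_converter_decorator(func):
        def wrapper(*args, **kwargs):
            return target_type(func(*args, **kwargs))
        return wrapper
    return type_converter_decorator


def add_values(a, b):
    return a + b


# Step 1: calling the factory with a type returns the actual decorator
to_int = convert_to_data_type(int)

# Step 2: the decorator wraps the function and returns the wrapper
add_values_as_int = to_int(add_values)

print(add_values(1.5, 2.25), add_values_as_int(1.5, 2.25))
```

<p>Writing <code>@convert_to_data_type(int)</code> above a function performs both steps at definition time.</p>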
<h3 id="heading-usage-and-application">Usage and Application</h3>
<p>This return value transformation decorator proves useful in applications where you need to automatically adapt functions to expected data formats.</p>
<p>For instance, you could use it in a weather API that returns temperatures by default in decimal format like 23.456 degrees. But the consumer front-end application expects an integer value to display.</p>
<p>Instead of changing the API function to return an integer, just decorate it with <code>@convert_to_data_type(int)</code>. This converts the decimal temperature to the integer <code>23</code> in this example (note that <code>int()</code> truncates rather than rounds) before returning it to the client app. Without modifying the API function, you've reformatted the return value.</p>
<p>Similarly, for backend processing that expects JSON, return values can be serialized by passing a callable such as <code>json.dumps</code>, as in <code>@convert_to_data_type(json.dumps)</code> (the <code>json</code> module itself is not callable, so it cannot be passed directly). The core logic stays unchanged while the presentation format adapts based on your use case's needs. This avoids duplication of format handling code across functions.</p>
<p>Decorators let you impose the required data representation from outside the function, which makes it easier to integrate and reuse code across application layers that expect different formats.</p>
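<p>Here is a small sketch of both use cases side by side. The functions <code>get_temperature()</code> and <code>get_forecast()</code> and their return values are hypothetical stand-ins for real API functions. Any callable that accepts the return value works as the target, which is why <code>json.dumps</code> can be passed in the same way as <code>int</code>.</p>

```python
import functools
import json


def convert_to_data_type(target_type):
    def type_converter_decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return target_type(func(*args, **kwargs))
        return wrapper
    return type_converter_decorator


@convert_to_data_type(int)
def get_temperature():
    return 23.456  # hypothetical sensor reading


@convert_to_data_type(json.dumps)
def get_forecast():
    return {"city": "Pune", "temperature": 23.456}  # hypothetical payload


print(get_temperature())  # int() truncates the decimal part
print(get_forecast())     # a JSON-formatted string
```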
<h2 id="heading-cache-function-results">Cache Function Results</h2>
<p>This decorator optimizes performance by storing and retrieving function results, eliminating redundant computations for repeated inputs, and improving application responsiveness, especially for time-consuming computations.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">cached_result_decorator</span>(<span class="hljs-params">func</span>):</span>
    result_cache = {}

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">wrapper</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
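        <span class="hljs-comment"># Build a hashable key; sorting kwargs makes their order irrelevant</span>
        cache_key = (args, tuple(sorted(kwargs.items())))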

        <span class="hljs-keyword">if</span> cache_key <span class="hljs-keyword">in</span> result_cache:
            <span class="hljs-keyword">return</span> <span class="hljs-string">f"[FROM CACHE] <span class="hljs-subst">{result_cache[cache_key]}</span>"</span>

        result = func(*args, **kwargs)
        result_cache[cache_key] = result

        <span class="hljs-keyword">return</span> result

    <span class="hljs-keyword">return</span> wrapper

<span class="hljs-comment"># Example usage</span>

<span class="hljs-meta">@cached_result_decorator</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">multiply_numbers</span>(<span class="hljs-params">a, b</span>):</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">f"Product = <span class="hljs-subst">{a * b}</span>"</span>

<span class="hljs-comment"># Call the decorated function multiple times</span>
print(multiply_numbers(<span class="hljs-number">4</span>, <span class="hljs-number">5</span>))  <span class="hljs-comment"># Calculation is performed</span>
print(multiply_numbers(<span class="hljs-number">4</span>, <span class="hljs-number">5</span>))  <span class="hljs-comment"># Result is retrieved from cache</span>
print(multiply_numbers(<span class="hljs-number">5</span>, <span class="hljs-number">7</span>))  <span class="hljs-comment"># Calculation is performed</span>
print(multiply_numbers(<span class="hljs-number">5</span>, <span class="hljs-number">7</span>))  <span class="hljs-comment"># Result is retrieved from cache</span>
print(multiply_numbers(<span class="hljs-number">-3</span>, <span class="hljs-number">7</span>))  <span class="hljs-comment"># Calculation is performed</span>
print(multiply_numbers(<span class="hljs-number">-3</span>, <span class="hljs-number">7</span>))  <span class="hljs-comment"># Result is retrieved from cache</span>
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-javascript">Product = <span class="hljs-number">20</span>
[FROM CACHE] Product = <span class="hljs-number">20</span>
Product = <span class="hljs-number">35</span>
[FROM CACHE] Product = <span class="hljs-number">35</span>
Product = <span class="hljs-number">-21</span>
[FROM CACHE] Product = <span class="hljs-number">-21</span>
</code></pre>
<p>This code sample showcases a decorator that's designed to cache and reuse function call results efficiently.</p>
<p>The <code>cached_result_decorator()</code> function takes another function and returns a wrapper. Within this wrapper, a cache dictionary (<code>result_cache</code>) stores unique call parameters and their corresponding results.</p>
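<p>Before executing the actual function, the <code>wrapper()</code> checks if the result for the current parameters is already in the cache. If so, it returns the cached result, prefixed with <code>[FROM CACHE]</code> in this example purely to make cache hits visible in the output. In real code you would drop the prefix so that the return value stays unchanged. Otherwise, it calls the function, stores the result in the cache, and returns it.</p>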
<p>The <code>@cached_result_decorator</code> syntax applies this caching logic to any function, such as <code>multiply_numbers()</code>. This ensures that, upon subsequent calls with the same arguments, the cached result is reused, preventing redundant calculations.</p>
<p>In essence, the decorator enhances functionality by optimizing performance through result caching.</p>
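<p>For comparison, here is a sketch of a variant that returns cached values unmodified and treats keyword arguments in any order as the same call. The <code>executions</code> list and the <code>wrapper.cache</code> attribute are assumptions added only so the behavior can be inspected. Like the example above, it assumes all arguments are hashable.</p>

```python
import functools


def cached_result_decorator(func):
    result_cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Sort keyword arguments so f(a=1, b=2) and f(b=2, a=1) share one entry
        cache_key = (args, tuple(sorted(kwargs.items())))
        if cache_key not in result_cache:
            result_cache[cache_key] = func(*args, **kwargs)
        return result_cache[cache_key]

    wrapper.cache = result_cache  # exposed for inspection in this sketch only
    return wrapper


executions = []  # records each real execution, for illustration


@cached_result_decorator
def multiply_numbers(a, b):
    executions.append((a, b))
    return a * b


print(multiply_numbers(a=4, b=5))
print(multiply_numbers(b=5, a=4))  # same key, so the function body is skipped
print(len(executions))
```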
<h3 id="heading-usage-and-applications-2">Usage and Applications</h3>
<p>Caching decorators like this are extremely useful in application development for optimizing performance of repetitive function calls.</p>
<p>For example, consider a recommendation engine that calls predictive model functions to generate user suggestions. <code>get_user_recommendations()</code> prepares the input data and feeds it into the model for every user request. Instead of re-running these computations each time, you can decorate it with <code>@cached_result_decorator</code> to introduce a caching layer.</p>
<p>Now, the first time a unique set of user parameters is passed, the model runs and the result is cached. Subsequent calls with the same inputs directly return the cached model outputs, skipping the model recalculation.</p>
<p>This drastically improves latency for responding to user requests by avoiding duplicate model inferences. You can monitor cache hit rates to justify scaling down model server infrastructure costs.</p>
<p>Decoupling such optimization concerns through caching decorators, rather than mixing them into function logic, improves modularity and readability and allows rapid performance gains. Caches can be configured and invalidated separately, without intruding on business functions.</p>
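<p>Python's standard library also ships a ready-made caching decorator, <code>functools.lru_cache</code>, which exposes hit and miss counts and a way to clear the cache. The sketch below shows how the monitoring and invalidation mentioned above might look with it. The <code>executions</code> list is a hypothetical helper for counting real calls.</p>

```python
import functools

executions = []  # records each real execution, for illustration only


@functools.lru_cache(maxsize=128)
def multiply_numbers(a, b):
    executions.append((a, b))
    return a * b


multiply_numbers(4, 5)  # computed
multiply_numbers(4, 5)  # served from the cache
multiply_numbers(5, 7)  # computed

# cache_info() reports hits, misses, maxsize and current size
info = multiply_numbers.cache_info()
print(info.hits, info.misses, len(executions))

# Invalidate every cached entry without touching the function body
multiply_numbers.cache_clear()
```

<p>The <code>maxsize</code> argument bounds memory use by evicting the least recently used entries once the limit is reached.</p>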
<h2 id="heading-validate-function-arguments-based-on-condition">Validate Function Arguments Based on Condition</h2>
<p>This one checks if input arguments meet predefined criteria before execution, enhancing function reliability and preventing unexpected behavior. It is useful for parameters requiring positive integers or non-empty strings.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">check_condition_positive</span>(<span class="hljs-params">value</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">argument_validator</span>(<span class="hljs-params">func</span>):</span>
        <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">validate_and_calculate</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
            <span class="hljs-keyword">if</span> value(*args, **kwargs):
                <span class="hljs-keyword">return</span> func(*args, **kwargs)
            <span class="hljs-keyword">else</span>:
                <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"Invalid arguments passed to the function"</span>)
        <span class="hljs-keyword">return</span> validate_and_calculate
    <span class="hljs-keyword">return</span> argument_validator

<span class="hljs-meta">@check_condition_positive(lambda x: x &gt; 0)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">compute_cubed_result</span>(<span class="hljs-params">number</span>):</span>
    <span class="hljs-keyword">return</span> number ** <span class="hljs-number">3</span>

print(compute_cubed_result(<span class="hljs-number">5</span>))  <span class="hljs-comment"># Output: 125</span>
print(compute_cubed_result(<span class="hljs-number">-2</span>))  <span class="hljs-comment"># Raises ValueError: Invalid arguments passed to the function</span>
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-javascript">125
Traceback (most recent call last):
  File "C:\Program Files\Sublime Text 3\test.py", line 16, in &lt;module&gt;
    print(compute_cubed_result(-2))  # Raises ValueError: Invalid arguments passed to the function
  File "C:\Program Files\Sublime Text 3\test.py", line 7, in validate_and_calculate
    raise ValueError("Invalid arguments passed to the function")
ValueError: Invalid arguments passed to the function
</code></pre>
<p>This code showcases how you can implement a decorator for validating function arguments.</p>
<p>The <code>check_condition_positive()</code> function is a decorator factory: it takes a condition (any callable that returns a truthy or falsy value) and generates an <code>argument_validator()</code> decorator. When applied with <code>@check_condition_positive()</code> above the <code>compute_cubed_result()</code> function, this validator checks whether the condition (in this case, that the argument is greater than 0) holds for the passed arguments.</p>
<p>If the condition is met, the decorated function is executed – otherwise, a <code>ValueError</code> exception is raised.</p>
<p>This succinct example illustrates how decorators serve as a mechanism for validating function arguments before their execution, ensuring adherence to specified conditions.</p>
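<p>The same pattern covers the non-empty string case mentioned earlier. The sketch below is one possible variant, not the article's original code: the names <code>require()</code> and <code>greet_user()</code> are made up for illustration, and <code>functools.wraps</code> is added so the decorated function keeps its own name and docstring.</p>

```python
from functools import wraps

def require(predicate, message="Invalid arguments passed to the function"):
    """Decorator factory: run predicate(*args, **kwargs) before the function."""
    def argument_validator(func):
        @wraps(func)  # keep func.__name__ and func.__doc__ on the wrapper
        def validate_and_call(*args, **kwargs):
            if not predicate(*args, **kwargs):
                raise ValueError(f"{func.__name__}: {message}")
            return func(*args, **kwargs)
        return validate_and_call
    return argument_validator

# Hypothetical example function: the name must be a non-blank string.
@require(lambda name: isinstance(name, str) and name.strip() != "",
         message="name must be a non-empty string")
def greet_user(name):
    """Build a greeting for the given user name."""
    return f"Hello, {name}!"

print(greet_user("Ada"))
print(greet_user.__name__)  # still "greet_user" thanks to wraps

try:
    greet_user("   ")
except ValueError as error:
    print(f"Rejected: {error}")
```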
<h3 id="heading-usage-and-applications-3">Usage and Applications</h3>
<p>Such parameter validation decorators are extremely useful in applications to help enforce business rules, security constraints, and so on.</p>
<p>For example, an insurance claims processing system would have a function <code>process_claim()</code> that takes details like claim id, approver name, and so on. Certain business rules dictate who can approve claims.</p>
<p>Rather than cluttering the function logic itself, you can decorate it with <code>@check_condition_positive()</code>, which validates whether the approver's role is sufficient for the claim amount. If a junior agent tries to approve a large claim (thus violating the rules), the decorator catches it by raising an exception before <code>process_claim()</code> even executes.</p>
<p>Similarly, input data validation constraints for security and compliance can be imposed without touching individual functions. The decorator ensures from the outside that invalid arguments never reach the application logic.</p>
<p>Common validation patterns can be reused across multiple functions. This improves security and promotes separation of concerns by isolating constraints from the core logic flow in a modular way.</p>
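<p>To make the claims scenario concrete, here is a minimal sketch that reuses the decorator factory from the example above. The role names, the approval limits, and the <code>approver_may_approve()</code> helper are assumptions invented for illustration, not part of any real claims system.</p>

```python
def check_condition_positive(value):
    # Same decorator factory as in the example above, repeated so this runs on its own.
    def argument_validator(func):
        def validate_and_calculate(*args, **kwargs):
            if value(*args, **kwargs):
                return func(*args, **kwargs)
            raise ValueError("Invalid arguments passed to the function")
        return validate_and_calculate
    return argument_validator

# Hypothetical approval limits per role (assumed values for illustration only).
APPROVAL_LIMITS = {"junior_agent": 1_000, "senior_agent": 50_000}

def approver_may_approve(claim_id, approver_role, claim_amount):
    """Business rule: the claim amount must not exceed the role's limit."""
    return claim_amount <= APPROVAL_LIMITS.get(approver_role, 0)

@check_condition_positive(approver_may_approve)
def process_claim(claim_id, approver_role, claim_amount):
    return f"Claim {claim_id} approved by {approver_role} for {claim_amount}"

print(process_claim("C-101", "senior_agent", 20_000))

try:
    process_claim("C-102", "junior_agent", 20_000)
except ValueError as error:
    print(f"Claim rejected: {error}")
```

<p>The rule lives in one small function, so it can be tested on its own and attached to any other function that takes the same arguments.</p>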
<h2 id="heading-retry-a-function-multiple-times-on-failure">Retry a Function Multiple Times on Failure</h2>
<p>This decorator comes in handy when you want to automatically retry a function after a failure, making it more resilient to transient errors. It is well suited to external services or network requests that are prone to intermittent failures.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> sqlite3
<span class="hljs-keyword">import</span> time

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">retry_on_failure</span>(<span class="hljs-params">max_attempts, retry_delay=<span class="hljs-number">1</span></span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">decorator</span>(<span class="hljs-params">func</span>):</span>
        <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">wrapper</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
            <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(max_attempts):
                <span class="hljs-keyword">try</span>:
                    result = func(*args, **kwargs)
                    <span class="hljs-keyword">return</span> result
                <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> error:
                    print(<span class="hljs-string">f"Error occurred: <span class="hljs-subst">{error}</span>. Retrying..."</span>)
                    time.sleep(retry_delay)
            <span class="hljs-keyword">raise</span> Exception(<span class="hljs-string">"Maximum attempts exceeded. Function failed."</span>)

        <span class="hljs-keyword">return</span> wrapper
    <span class="hljs-keyword">return</span> decorator

<span class="hljs-meta">@retry_on_failure(max_attempts=3, retry_delay=2)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">establish_database_connection</span>():</span>
    connection = sqlite3.connect(<span class="hljs-string">"example.db"</span>)
    db_cursor = connection.cursor()
    db_cursor.execute(<span class="hljs-string">"SELECT * FROM users"</span>)
    query_result = db_cursor.fetchall()
    db_cursor.close()
    connection.close()
    <span class="hljs-keyword">return</span> query_result

<span class="hljs-keyword">try</span>:
    retrieved_data = establish_database_connection()
    print(<span class="hljs-string">"Data retrieved successfully:"</span>, retrieved_data)
<span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> error_message:
    print(<span class="hljs-string">f"Failed to establish database connection: <span class="hljs-subst">{error_message}</span>"</span>)
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-javascript">Error occurred: no such table: users. Retrying...
Error occurred: no such table: users. Retrying...
Error occurred: no such table: users. Retrying...
Failed to establish database connection: Maximum attempts exceeded. Function failed.
</code></pre>
<p>This example introduces a decorator that's designed for retrying function executions in the event of failures. It has a specified maximum attempt count and delay between retries.</p>
<p>The <code>retry_on_failure()</code> is a decorator factory, taking parameters for maximum retry count and delay, and returning a <code>decorator()</code> that manages the retry logic.</p>
<p>Within the <code>wrapper()</code> function, the decorated function is called in a loop, up to the specified maximum number of attempts.</p>
<p>In case of an exception, it prints an error message, introduces a delay specified by <code>retry_delay</code>, and retries. If all attempts fail, it raises an exception indicating that the maximum attempts have been exceeded.</p>
<p>The <code>@retry_on_failure()</code> applied above <code>establish_database_connection()</code> integrates this retry logic, allowing for up to 3 attempts in total, with a 2-second delay after each failed attempt, in case the database connection encounters failures.</p>
<p>This demonstrates the utility of decorators in seamlessly incorporating retry capabilities without altering the core function code.</p>
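<p>Note that the example above retries on any exception and also sleeps after the final failed attempt. One possible refinement, sketched here under assumptions rather than as the definitive approach, retries only on the exception types you list, skips the last sleep, and chains the final error so the original traceback is kept. The <code>exceptions</code> parameter and the flaky <code>fetch_report()</code> function are hypothetical additions.</p>

```python
import time
from functools import wraps

def retry_on_failure(max_attempts, retry_delay=1, exceptions=(Exception,)):
    """Retry only on the given exception types; no sleep after the last attempt."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions as error:
                    last_error = error
                    if attempt < max_attempts:
                        print(f"Attempt {attempt} failed: {error}. Retrying...")
                        time.sleep(retry_delay)
            # Chain the final error so the original traceback is preserved.
            raise RuntimeError(
                f"{func.__name__} failed after {max_attempts} attempts"
            ) from last_error
        return wrapper
    return decorator

call_count = 0

# Hypothetical flaky operation: fails twice, then succeeds.
@retry_on_failure(max_attempts=3, retry_delay=0, exceptions=(ConnectionError,))
def fetch_report():
    global call_count
    call_count += 1
    if call_count < 3:
        raise ConnectionError("temporary network issue")
    return "report ready"

print(fetch_report())
print(f"Calls made: {call_count}")
```

<p>Exceptions that are not listed, such as a <code>ValueError</code> caused by bad input, propagate immediately instead of being retried.</p>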
<h3 id="heading-usage-and-application-1">Usage and Application</h3>
<p>This retry decorator can prove extremely useful in application development for adding resilience against temporary or intermittent errors.</p>
<p>For instance, consider a flight booking app that calls a payment gateway API through <code>process_payment()</code> to handle customer transactions. Network blips or high load at the payment provider's end can sometimes cause transient errors in the API response.</p>
<p>Rather than directly showing failures to customers, the <code>process_payment()</code> function can be decorated with <code>@retry_on_failure</code> to handle such scenarios implicitly. Now when a payment request fails, it is retried automatically, up to the configured number of attempts, before the error is finally reported if it persists.</p>
<p>This shields users from temporary hiccups instead of exposing them directly to unreliable infrastructure. The application also remains available even if dependent services fail occasionally.</p>
<p>The decorator keeps the retry logic in one place instead of spreading it across the API's code. Failures beyond the app's control are handled gracefully rather than surfacing to users as application faults. This demonstrates how decorators add resilience without complicating business logic.</p>
<h2 id="heading-enforce-rate-limits-on-a-function">Enforce Rate Limits on a Function</h2>
<p>By controlling how often a function can be called, the Enforce Rate Limits decorator ensures effective resource management and guards against misuse. It is especially helpful for preventing API misuse or conserving resources, where restricting function calls is essential.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> time

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">rate_limiter</span>(<span class="hljs-params">max_allowed_calls, reset_period_seconds</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">decorate_rate_limited_function</span>(<span class="hljs-params">original_function</span>):</span>
        calls_count = <span class="hljs-number">0</span>
        last_reset_time = time.time()

        <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">wrapper_function</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
            <span class="hljs-keyword">nonlocal</span> calls_count, last_reset_time
            elapsed_time = time.time() - last_reset_time

            <span class="hljs-comment"># If the elapsed time is greater than the reset period, reset the call count</span>
            <span class="hljs-keyword">if</span> elapsed_time &gt; reset_period_seconds:
                calls_count = <span class="hljs-number">0</span>
                last_reset_time = time.time()

            <span class="hljs-comment"># Check if the call count has reached the maximum allowed limit</span>
            <span class="hljs-keyword">if</span> calls_count &gt;= max_allowed_calls:
                <span class="hljs-keyword">raise</span> Exception(<span class="hljs-string">"Rate limit exceeded. Please try again later."</span>)

            <span class="hljs-comment"># Increment the call count</span>
            calls_count += <span class="hljs-number">1</span>

            <span class="hljs-comment"># Call the original function</span>
            <span class="hljs-keyword">return</span> original_function(*args, **kwargs)

        <span class="hljs-keyword">return</span> wrapper_function
    <span class="hljs-keyword">return</span> decorate_rate_limited_function

<span class="hljs-comment"># Allowing a maximum of 6 API calls within 10 seconds.</span>
<span class="hljs-meta">@rate_limiter(max_allowed_calls=6, reset_period_seconds=10)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">make_api_call</span>():</span>
    print(<span class="hljs-string">"API call executed successfully..."</span>)

<span class="hljs-comment"># Make API calls</span>
<span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(<span class="hljs-number">8</span>):
    <span class="hljs-keyword">try</span>:
        make_api_call()
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> error:
        print(<span class="hljs-string">f"Error occurred: <span class="hljs-subst">{error}</span>"</span>)
time.sleep(<span class="hljs-number">10</span>)
make_api_call()
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-javascript">API call executed successfully...
API call executed successfully...
API call executed successfully...
API call executed successfully...
API call executed successfully...
API call executed successfully...
Error occurred: Rate limit exceeded. Please try again later.
Error occurred: Rate limit exceeded. Please try again later.
API call executed successfully...
</code></pre>
<p>This code showcases the implementation of a rate-limiting mechanism for function calls using a decorator.</p>
<p>The <code>rate_limiter()</code> function, specified with maximum calls and a period in seconds to reset the count, serves as the core of the rate-limiting logic. The decorator, <code>decorate_rate_limited_function()</code>, employs a wrapper to manage the rate limits by resetting the count if the period has elapsed. It checks if the count has reached the maximum allowed, and then either raises an exception or increments the count and executes the function accordingly.</p>
<p>Applied to <code>make_api_call()</code> using <code>@rate_limiter()</code>, it restricts the function to six calls per 10-second window, where a new window starts once the reset period has elapsed. This introduces rate limiting without changing the function logic, ensuring that calls adhere to limits and preventing excessive use within set intervals.</p>
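<p>The example above resets its counter at fixed intervals, so a burst of calls that straddles a reset can briefly exceed the limit within a single 10-second span. A sliding window is one common alternative. The sketch below is a hedged illustration of that idea, not the article's implementation: the name <code>sliding_window_rate_limiter()</code> and the injectable <code>clock</code> parameter are assumptions, with the clock added only so the demo runs without real sleeping.</p>

```python
import time
from collections import deque
from functools import wraps

def sliding_window_rate_limiter(max_allowed_calls, period_seconds, clock=time.monotonic):
    """Allow at most max_allowed_calls within any period_seconds-long interval."""
    def decorator(func):
        call_times = deque()  # timestamps of calls still inside the window

        @wraps(func)
        def wrapper(*args, **kwargs):
            now = clock()
            # Drop timestamps that have fallen out of the window.
            while call_times and now - call_times[0] >= period_seconds:
                call_times.popleft()
            if len(call_times) >= max_allowed_calls:
                raise RuntimeError("Rate limit exceeded. Please try again later.")
            call_times.append(now)
            return func(*args, **kwargs)
        return wrapper
    return decorator

# A fake clock (an assumption for the demo) avoids real sleeping.
fake_now = [0.0]

@sliding_window_rate_limiter(max_allowed_calls=2, period_seconds=10, clock=lambda: fake_now[0])
def make_api_call():
    return "API call executed successfully..."

results = []
for timestamp in (0.0, 9.0, 9.5, 10.0, 19.0):
    fake_now[0] = timestamp
    try:
        results.append(make_api_call())
    except RuntimeError:
        results.append("rejected")
print(results)
```

<p>Only the call at 9.5 seconds is rejected, because two calls already fall inside the preceding 10 seconds. The trade-off is that the decorator stores one timestamp per recent call instead of a single counter.</p>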
<h3 id="heading-usage-and-application-2">Usage and Application</h3>
<p>Rate limiting decorators like this are very useful in application development for controlling usage of APIs and preventing abuse.</p>
<p>For instance, a travel booking application may rely on a third-party Flight Search API to check live seat availability across airlines. While most usage is legitimate, some users could call this API excessively, degrading overall service performance.</p>
<p>By decorating the API integration function with <code>@rate_limiter(100, 60)</code>, the application can restrict excessive calls internally, too. This limits the booking module to at most 100 Flight API calls per minute. Additional calls are rejected by the decorator without ever reaching the actual API.</p>
<p>This protects the downstream service from overuse and enables a fairer distribution of capacity for general application functionality.</p>
<p>Decorators provide easy rate control for both internal and external-facing APIs without changing functional code. Individual functions don't have to track usage quotas themselves, while services and infrastructure stay protected from overuse. And it's all thanks to application-side controls using wrappers.</p>
<h2 id="heading-handle-exceptions-and-provide-default-response">Handle Exceptions and Provide Default Response</h2>
<p>The Handle Exceptions decorator is a safety net for functions, gracefully handling exceptions and providing default responses when they occur. It shields the application from crashing due to unforeseen circumstances, ensuring smooth operation.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handle_exceptions</span>(<span class="hljs-params">default_response_msg</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">exception_handler_decorator</span>(<span class="hljs-params">func</span>):</span>
        <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">decorated_function</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
            <span class="hljs-keyword">try</span>:
                <span class="hljs-comment"># Call the original function</span>
                <span class="hljs-keyword">return</span> func(*args, **kwargs)
            <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> error:
                <span class="hljs-comment"># Handle the exception and provide the default response</span>
                print(<span class="hljs-string">f"Exception occurred: <span class="hljs-subst">{error}</span>"</span>)
                <span class="hljs-keyword">return</span> default_response_msg
        <span class="hljs-keyword">return</span> decorated_function
    <span class="hljs-keyword">return</span> exception_handler_decorator

<span class="hljs-comment"># Example usage</span>
<span class="hljs-meta">@handle_exceptions(default_response_msg="An error occurred!")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">divide_numbers_safely</span>(<span class="hljs-params">dividend, divisor</span>):</span>
    <span class="hljs-keyword">return</span> dividend / divisor

<span class="hljs-comment"># Call the decorated function</span>
result = divide_numbers_safely(<span class="hljs-number">7</span>, <span class="hljs-number">0</span>)  <span class="hljs-comment"># This will raise a ZeroDivisionError</span>
print(<span class="hljs-string">"Result:"</span>, result)
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-javascript">Exception occurred: division by zero
Result: An error occurred!
</code></pre>
<p>This code showcases exception handling in functions using decorators.</p>
<p>The <code>handle_exceptions()</code> decorator factory, accepting a default response, produces <code>exception_handler_decorator()</code>. This decorator, when applied to functions, attempts to execute the original function. If an exception arises, it prints error details, and returns the specified default response.</p>
<p>The <code>@handle_exceptions()</code> syntax above a function incorporates this exception-handling logic. For instance, in <code>divide_numbers_safely()</code>, division by zero triggers an exception, which the decorator catches, preventing a crash and returning the default "An error occurred!" response.</p>
<p>Essentially, these decorators adeptly capture exceptions in functions, providing a seamless means of incorporating handling logic and preventing crashes.</p>
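<p>Catching every <code>Exception</code> and printing it is fine for a demo, but it can hide bugs you would rather see. As a cautious sketch of one alternative, the variant below only handles the exception types you name and records them through the standard <code>logging</code> module. The <code>exceptions</code> parameter is an assumption added for this illustration.</p>

```python
import logging
from functools import wraps

logger = logging.getLogger(__name__)

def handle_exceptions(default_response, exceptions=(Exception,)):
    """Return default_response when func raises one of the given exceptions."""
    def exception_handler_decorator(func):
        @wraps(func)
        def decorated_function(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except exceptions:
                # logger.exception records the message plus the full traceback.
                logger.exception("Call to %s failed", func.__name__)
                return default_response
        return decorated_function
    return exception_handler_decorator

# Only ZeroDivisionError is absorbed; anything else still propagates.
@handle_exceptions(default_response=None, exceptions=(ZeroDivisionError,))
def divide_numbers_safely(dividend, divisor):
    return dividend / divisor

print(divide_numbers_safely(7, 2))
print(divide_numbers_safely(7, 0))
```

<p>Returning <code>None</code> rather than an error string keeps the default distinguishable from a real numeric result, though the right default depends on what callers expect.</p>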
<h3 id="heading-usage-and-applications-4">Usage and Applications</h3>
<p>Exception handling decorators greatly simplify application error management and help hide unreliable behavior from users.</p>
<p>For example, an e-commerce website may rely on payment, inventory, and shipping services to complete orders. Instead of scattering complex exception blocks everywhere, a core order-processing function like <code>place_order()</code> can be decorated to achieve resilience.</p>
<p>The <code>@handle_exceptions</code> decorator applied above it would absorb any third-party service outage or intermittent issue during order finalization. On an exception, it logs the error for debugging while serving a graceful "Order failed, please try again later" message to the customer. This avoids exposing complex root causes, such as payment timeouts, to the end user.</p>
<p>Decorators shield customers from unreliable service issues without changing business code. They provide friendly default responses when errors happen. This improves the customer experience.</p>
<p>Also, decorators give developers visibility into those errors behind the scenes. So they can focus on systematically fixing the root causes of failures. This separation of concerns through decorators reduces complexity. Customers see more reliability, and you get actionable insights into faults – all while keeping business logic untouched.</p>
<h2 id="heading-enforce-type-checking-on-function-arguments">Enforce Type Checking on Function Arguments</h2>
<p>The Enforce Type Checking decorator ensures data integrity by verifying function arguments conform to specified data types, preventing type-related errors, and promoting code reliability. It is particularly useful in situations where strict data type adherence is crucial.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> inspect

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">enforce_type_checking</span>(<span class="hljs-params">func</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">type_checked_wrapper</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
        <span class="hljs-comment"># Get the function signature and parameter names</span>
        function_signature = inspect.signature(func)
        function_parameters = function_signature.parameters

        <span class="hljs-comment"># Iterate over the positional arguments</span>
        <span class="hljs-keyword">for</span> i, arg_value <span class="hljs-keyword">in</span> enumerate(args):
            parameter_name = list(function_parameters.keys())[i]
            parameter_type = function_parameters[parameter_name].annotation
            <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> isinstance(arg_value, parameter_type):
                <span class="hljs-keyword">raise</span> TypeError(<span class="hljs-string">f"Argument '<span class="hljs-subst">{parameter_name}</span>' must be of type '<span class="hljs-subst">{parameter_type.__name__}</span>'"</span>)

        <span class="hljs-comment"># Iterate over the keyword arguments</span>
        <span class="hljs-keyword">for</span> keyword_name, arg_value <span class="hljs-keyword">in</span> kwargs.items():
            parameter_type = function_parameters[keyword_name].annotation
            <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> isinstance(arg_value, parameter_type):
                <span class="hljs-keyword">raise</span> TypeError(<span class="hljs-string">f"Argument '<span class="hljs-subst">{keyword_name}</span>' must be of type '<span class="hljs-subst">{parameter_type.__name__}</span>'"</span>)

        <span class="hljs-comment"># Call the original function</span>
        <span class="hljs-keyword">return</span> func(*args, **kwargs)

    <span class="hljs-keyword">return</span> type_checked_wrapper

<span class="hljs-comment"># Example usage</span>
<span class="hljs-meta">@enforce_type_checking</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">multiply_numbers</span>(<span class="hljs-params">factor_1: int, factor_2: int</span>) -&gt; int:</span>
    <span class="hljs-keyword">return</span> factor_1 * factor_2

<span class="hljs-comment"># Call the decorated function</span>
result = multiply_numbers(<span class="hljs-number">5</span>, <span class="hljs-number">7</span>)  <span class="hljs-comment"># No type errors, returns 35</span>
print(<span class="hljs-string">"Result:"</span>, result)

result = multiply_numbers(<span class="hljs-string">"5"</span>, <span class="hljs-number">7</span>)  <span class="hljs-comment"># Type error: 'factor_1' must be of type 'int'</span>
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-javascript">Result: 35
Traceback (most recent call last):
  File "C:\Program Files\Sublime Text 3\test.py", line 36, in &lt;module&gt;
    result = multiply_numbers("5", 7)  # Type error: 'factor_1' must be of type 'int'
  File "C:\Program Files\Sublime Text 3\test.py", line 14, in type_checked_wrapper
    raise TypeError(f"Argument '{parameter_name}' must be of type '{parameter_type.__name__}'")
TypeError: Argument 'factor_1' must be of type 'int'
</code></pre>
<p>The <code>enforce_type_checking</code> decorator validates whether the arguments passed to a function match the specified type annotations.</p>
<p>Inside the <code>type_checked_wrapper</code>, it examines the signature of the decorated function, retrieves parameter names and type annotations, and ensures that the provided arguments align with the expected types. This includes checking positional arguments against their order, and keyword arguments against parameter names. If a type mismatch is detected, a TypeError is raised.</p>
<p>The decorator is applied here to the <code>multiply_numbers</code> function, whose arguments are annotated as integers. Passing a string raises an exception, while passing integers executes the function without issues. This type checking is enforced without altering the original function body.</p>
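<p>The example above assumes that every parameter is annotated with a plain class. If a parameter has no annotation, its annotation is <code>inspect.Parameter.empty</code> and the <code>isinstance()</code> check rejects every value. A possible variant, offered as a sketch rather than a complete solution, uses <code>inspect.Signature.bind()</code> to map arguments to parameter names and skips anything it cannot check. Generic annotations such as <code>list[int]</code> are not handled here, and <code>scale_value()</code> is a hypothetical example function.</p>

```python
import inspect
from functools import wraps

def enforce_type_checking(func):
    signature = inspect.signature(func)  # computed once, at decoration time

    @wraps(func)
    def type_checked_wrapper(*args, **kwargs):
        # bind() maps positional and keyword arguments to parameter names.
        bound_arguments = signature.bind(*args, **kwargs)
        for name, value in bound_arguments.arguments.items():
            parameter = signature.parameters[name]
            annotation = parameter.annotation
            # Skip *args/**kwargs, unannotated parameters, and non-class annotations.
            if parameter.kind in (inspect.Parameter.VAR_POSITIONAL,
                                  inspect.Parameter.VAR_KEYWORD):
                continue
            if annotation is inspect.Parameter.empty or not inspect.isclass(annotation):
                continue
            if not isinstance(value, annotation):
                raise TypeError(
                    f"Argument '{name}' must be of type '{annotation.__name__}', "
                    f"got '{type(value).__name__}'"
                )
        return func(*args, **kwargs)

    return type_checked_wrapper

# Hypothetical example: 'label' has no annotation, so it is not checked.
@enforce_type_checking
def scale_value(value: int, factor: float, label="result"):
    return f"{label}: {value * factor}"

print(scale_value(4, factor=2.5))

try:
    scale_value("4", factor=2.5)
except TypeError as error:
    print(f"Rejected: {error}")
```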
<h3 id="heading-usage-and-applications-5">Usage and Applications</h3>
<p>Type checking decorators are applied to detect issues early and improve reliability. For example, consider a web application backend with a data access layer function <code>get_user_data()</code> annotated to expect integer user IDs. Its queries would fail if string IDs flow into it from frontend code.</p>
<p>Rather than adding explicit checks and raising exceptions locally, you can use this decorator. Now any upstream or consumer code passing invalid types is caught automatically when the function is called. The decorator compares the annotations against the argument types and raises an error before the call reaches the database layer.</p>
<p>This runtime protection for components through decorators ensures that only valid data shapes flow across layers, preventing obscure errors. Type safety is imposed without extra checks cluttering cleaner logic.</p>
<h2 id="heading-measure-memory-usage-of-a-function">Measure Memory Usage of a Function</h2>
<p>When it comes to large dataset-intensive applications or resource-constrained environments, the Measure Memory Usage decorator is a memory detective that offers insights into a function's memory consumption. These insights show you where memory is being allocated, which helps you optimise memory usage.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> tracemalloc

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">measure_memory_usage</span>(<span class="hljs-params">target_function</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">wrapper</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
        tracemalloc.start()

        <span class="hljs-comment"># Call the original function</span>
        result = target_function(*args, **kwargs)

        snapshot = tracemalloc.take_snapshot()
        top_stats = snapshot.statistics(<span class="hljs-string">"lineno"</span>)

        <span class="hljs-comment"># Print the top memory-consuming lines</span>
        print(<span class="hljs-string">f"Memory usage of <span class="hljs-subst">{target_function.__name__}</span>:"</span>)
        <span class="hljs-keyword">for</span> stat <span class="hljs-keyword">in</span> top_stats[:<span class="hljs-number">5</span>]:
            print(stat)

        <span class="hljs-comment"># Return the result</span>
        <span class="hljs-keyword">return</span> result

    <span class="hljs-keyword">return</span> wrapper

<span class="hljs-comment"># Example usage</span>
<span class="hljs-meta">@measure_memory_usage</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_factorial_recursive</span>(<span class="hljs-params">number</span>):</span>
    <span class="hljs-keyword">if</span> number == <span class="hljs-number">0</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> number * calculate_factorial_recursive(number - <span class="hljs-number">1</span>)

<span class="hljs-comment"># Call the decorated function</span>
result_factorial = calculate_factorial_recursive(<span class="hljs-number">3</span>)
print(<span class="hljs-string">"Factorial:"</span>, result_factorial)
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-javascript">Memory usage <span class="hljs-keyword">of</span> calculate_factorial_recursive:
C:\\\\Program Files\\\\Sublime Text <span class="hljs-number">3</span>\\\\test.py:<span class="hljs-number">29</span>: size=<span class="hljs-number">1552</span> B, count=<span class="hljs-number">6</span>, average=<span class="hljs-number">259</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Sublime Text <span class="hljs-number">3</span>\\\\test.py:<span class="hljs-number">8</span>: size=<span class="hljs-number">896</span> B, count=<span class="hljs-number">3</span>, average=<span class="hljs-number">299</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Sublime Text <span class="hljs-number">3</span>\\\\test.py:<span class="hljs-number">10</span>: size=<span class="hljs-number">416</span> B, count=<span class="hljs-number">1</span>, average=<span class="hljs-number">416</span> B
Memory usage <span class="hljs-keyword">of</span> calculate_factorial_recursive:
C:\\\\Program Files\\\\Sublime Text <span class="hljs-number">3</span>\\\\test.py:<span class="hljs-number">29</span>: size=<span class="hljs-number">1552</span> B, count=<span class="hljs-number">6</span>, average=<span class="hljs-number">259</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Python310\\\\lib\\\\tracemalloc.py:<span class="hljs-number">226</span>: size=<span class="hljs-number">880</span> B, count=<span class="hljs-number">3</span>, average=<span class="hljs-number">293</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Sublime Text <span class="hljs-number">3</span>\\\\test.py:<span class="hljs-number">8</span>: size=<span class="hljs-number">832</span> B, count=<span class="hljs-number">2</span>, average=<span class="hljs-number">416</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Python310\\\\lib\\\\tracemalloc.py:<span class="hljs-number">173</span>: size=<span class="hljs-number">800</span> B, count=<span class="hljs-number">2</span>, average=<span class="hljs-number">400</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Python310\\\\lib\\\\tracemalloc.py:<span class="hljs-number">505</span>: size=<span class="hljs-number">592</span> B, count=<span class="hljs-number">2</span>, average=<span class="hljs-number">296</span> B
Memory usage <span class="hljs-keyword">of</span> calculate_factorial_recursive:
C:\\\\Program Files\\\\Sublime Text <span class="hljs-number">3</span>\\\\test.py:<span class="hljs-number">29</span>: size=<span class="hljs-number">1440</span> B, count=<span class="hljs-number">4</span>, average=<span class="hljs-number">360</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Python310\\\\lib\\\\tracemalloc.py:<span class="hljs-number">535</span>: size=<span class="hljs-number">1240</span> B, count=<span class="hljs-number">3</span>, average=<span class="hljs-number">413</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Python310\\\\lib\\\\tracemalloc.py:<span class="hljs-number">67</span>: size=<span class="hljs-number">1216</span> B, count=<span class="hljs-number">19</span>, average=<span class="hljs-number">64</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Python310\\\\lib\\\\tracemalloc.py:<span class="hljs-number">193</span>: size=<span class="hljs-number">1104</span> B, count=<span class="hljs-number">23</span>, average=<span class="hljs-number">48</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Python310\\\\lib\\\\tracemalloc.py:<span class="hljs-number">226</span>: size=<span class="hljs-number">880</span> B, count=<span class="hljs-number">3</span>, average=<span class="hljs-number">293</span> B
Memory usage <span class="hljs-keyword">of</span> calculate_factorial_recursive:
C:\\\\Program Files\\\\Python310\\\\lib\\\\tracemalloc.py:<span class="hljs-number">558</span>: size=<span class="hljs-number">1416</span> B, count=<span class="hljs-number">29</span>, average=<span class="hljs-number">49</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Python310\\\\lib\\\\tracemalloc.py:<span class="hljs-number">67</span>: size=<span class="hljs-number">1408</span> B, count=<span class="hljs-number">22</span>, average=<span class="hljs-number">64</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Sublime Text <span class="hljs-number">3</span>\\\\test.py:<span class="hljs-number">29</span>: size=<span class="hljs-number">1392</span> B, count=<span class="hljs-number">3</span>, average=<span class="hljs-number">464</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Python310\\\\lib\\\\tracemalloc.py:<span class="hljs-number">535</span>: size=<span class="hljs-number">1240</span> B, count=<span class="hljs-number">3</span>, average=<span class="hljs-number">413</span> B
<span class="hljs-attr">C</span>:\\\\Program Files\\\\Python310\\\\lib\\\\tracemalloc.py:<span class="hljs-number">226</span>: size=<span class="hljs-number">832</span> B, count=<span class="hljs-number">2</span>, average=<span class="hljs-number">416</span> B
<span class="hljs-attr">Factorial</span>: <span class="hljs-number">6</span>
</code></pre>
<p>This code showcases a decorator, <code>measure_memory_usage</code>, designed to measure the memory consumption of functions.</p>
<p>The decorator, when applied, initiates memory tracking before the original function is called. Once the function completes its execution, a memory snapshot is taken and the top 5 lines consuming the most memory are printed.</p>
<p>Illustrated through the example of <code>calculate_factorial_recursive()</code>, the decorator allows you to monitor memory usage without altering the function itself, offering valuable insights for optimization purposes.</p>
<p>In essence, it provides a straightforward means to assess and analyze the memory consumption of any function during its runtime.</p>
<h3 id="heading-usage-and-applications-6">Usage and Applications</h3>
<p>Memory measurement decorators like these are extremely valuable in application development for identifying and troubleshooting memory bloat or leak issues.</p>
<p>For example, consider a data streaming pipeline with critical ETL components like <code>transform_data()</code> that processes large volumes of information. Though the process seems fine during regular loads, high volume data like Black Friday sales could cause excessive memory usage and crashes.</p>
<p>Rather than debugging manually, you can decorate processors with <code>@measure_memory_usage</code> to reveal useful insights. It prints the most memory-intensive lines during peak data flow without any change to the function's own code.</p>
<p>You should aim to pinpoint the specific stages that eat up memory rapidly and address them through better algorithms or optimization.</p>
<p>Such decorators help build diagnostics into critical paths so that abnormal consumption trends are recognized early. Instead of surfacing late as production issues, problems can be identified through profiling before release. They reduce debugging headaches and minimize runtime failures by making memory tracking easy to instrument.</p>
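<p>As a complementary sketch, the decorator below reports a single peak-memory figure per call instead of the top lines. It relies only on <code>tracemalloc.start()</code>, <code>tracemalloc.get_traced_memory()</code>, and <code>tracemalloc.stop()</code> from the standard library. The name <code>report_peak_memory</code> and the toy body of <code>transform_data()</code> are assumptions made for illustration, not part of the example above.</p>
<pre><code class="lang-python">import functools
import tracemalloc


def report_peak_memory(func):
    """Print the peak traced memory of each call (hypothetical helper)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        tracemalloc.start()
        try:
            return func(*args, **kwargs)
        finally:
            # get_traced_memory() returns a (current, peak) tuple in bytes
            _current, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            print(f"{func.__name__} peak memory: {peak} bytes")
    return wrapper


@report_peak_memory
def transform_data(rows):
    # Stand-in for a real ETL step: builds a new list in memory
    return [row * 2 for row in rows]


result = transform_data(range(1000))
</code></pre>
<p>Because the wrapper stops tracing when it exits, this sketch is best suited to non-recursive entry points such as a pipeline stage. The exact byte count it prints will vary between Python versions and platforms.</p>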
<h2 id="heading-cache-function-results-with-expiration-time">Cache Function Results with Expiration Time</h2>
<p>Designed for data that goes stale over time, the Cache Function Results with Expiration Time decorator combines caching with a time-based expiry, so that cached values are refreshed regularly and stay relevant.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> time

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">cached_function_with_expiry</span>(<span class="hljs-params">expiry_time</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">decorator</span>(<span class="hljs-params">original_function</span>):</span>
        cache = {}

        <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">wrapper</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
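            <span class="hljs-comment"># Sort keyword arguments so f(a=1, b=2) and f(b=2, a=1) share one cache entry</span>
            key = (args, tuple(sorted(kwargs.items())))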

            <span class="hljs-keyword">if</span> key <span class="hljs-keyword">in</span> cache:
                cached_value, cached_timestamp = cache[key]

                <span class="hljs-keyword">if</span> time.time() - cached_timestamp &lt; expiry_time:
                    <span class="hljs-keyword">return</span> <span class="hljs-string">f"[CACHED] - <span class="hljs-subst">{cached_value}</span>"</span>

            result = original_function(*args, **kwargs)
            cache[key] = (result, time.time())

            <span class="hljs-keyword">return</span> result

        <span class="hljs-keyword">return</span> wrapper

    <span class="hljs-keyword">return</span> decorator

<span class="hljs-comment"># Example usage</span>

<span class="hljs-meta">@cached_function_with_expiry(expiry_time=5)  # Cache expiry time set to 5 seconds</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_product</span>(<span class="hljs-params">x, y</span>):</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">f"PRODUCT - <span class="hljs-subst">{x * y}</span>"</span>

<span class="hljs-comment"># Call the decorated function multiple times</span>
print(calculate_product(<span class="hljs-number">23</span>, <span class="hljs-number">5</span>))  <span class="hljs-comment"># Calculation is performed</span>
print(calculate_product(<span class="hljs-number">23</span>, <span class="hljs-number">5</span>))  <span class="hljs-comment"># Result is retrieved from cache</span>
time.sleep(<span class="hljs-number">5</span>)
print(calculate_product(<span class="hljs-number">23</span>, <span class="hljs-number">5</span>))  <span class="hljs-comment"># Calculation is performed (cache expired)</span>
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-javascript">PRODUCT - <span class="hljs-number">115</span>
[CACHED] - PRODUCT - <span class="hljs-number">115</span>
PRODUCT - <span class="hljs-number">115</span>
</code></pre>
<p>This code showcases a caching decorator that has an automatic cache expiration time.</p>
<p>The function <code>cached_function_with_expiry()</code> generates a decorator that, when applied, utilizes a dictionary called <code>cache</code> to store function results and their corresponding timestamps. The <code>wrapper()</code> function checks if the result for the current arguments is in the cache. If present and within the expiry time, it returns the cached result – otherwise, it calls the function.</p>
<p>Illustrated using <code>calculate_product()</code>, the decorator initially calculates and caches the result. Subsequent calls retrieve the cached result until the expiry period, at which point the cache is refreshed through a recalculation.</p>
<p>In essence, this implementation prevents redundant calculations while automatically refreshing results after the specified expiry period.</p>
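<p>One possible refinement is sketched below, under a few assumptions. It returns the cached value unchanged rather than a labelled string, preserves the function's metadata with <code>functools.wraps</code>, and measures expiry with <code>time.monotonic()</code> so that system clock adjustments do not affect it. The name <code>cached_with_expiry</code> and the <code>calls</code> list are hypothetical and exist only for this illustration.</p>
<pre><code class="lang-python">import functools
import time


def cached_with_expiry(expiry_seconds):
    """Cache results per argument set and recompute them after expiry_seconds."""
    def decorator(func):
        cache = {}  # key: (result, deadline on the monotonic clock)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            now = time.monotonic()
            entry = cache.get(key)
            if entry is not None and entry[1] > now:
                return entry[0]  # still fresh: skip the call
            result = func(*args, **kwargs)
            cache[key] = (result, now + expiry_seconds)
            return result

        return wrapper
    return decorator


calls = []  # records every real computation, for demonstration only


@cached_with_expiry(expiry_seconds=0.1)
def calculate_product(x, y):
    calls.append((x, y))
    return x * y


first = calculate_product(23, 5)   # computed
second = calculate_product(23, 5)  # served from the cache
time.sleep(0.3)
third = calculate_product(23, 5)   # entry expired, computed again
</code></pre>
<p>All three calls return the same product, but <code>calls</code> ends up with only two entries, which shows that the second call never reached the original function. Like the example above, this sketch requires hashable arguments and never evicts old keys, so a long-running service would also want a size limit.</p>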
<h3 id="heading-usage-and-applications-7">Usage and Applications</h3>
<p>Automatic cache expiry decorators are very useful in application development for optimizing performance of data fetching modules.</p>
<p>For example, consider a travel website that calls backend API <code>get_flight_prices()</code> to show live prices to users. While caches reduce calls to expensive flight data sources, static caching leads to displaying stale prices.</p>
<p>Instead, you can use <code>@cached_function_with_expiry(60)</code> to refresh prices every minute. The first user call fetches live prices and caches them, while subsequent requests within the 60-second window reuse the cached pricing. Once the expiry period has passed, the next call fetches fresh data.</p>
<p>This allows you to optimize flows without worrying about corner cases related to outdated data. The decorator keeps caches reasonably in sync with upstream changes through a configurable expiry. Repeated calls inside the window skip recalculation, and end users still get recently refreshed information. Common caching patterns get packaged conveniently for reuse across the codebase with customized expiry rules.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Python decorators continue to see widespread usage in application development for cleanly inserting common cross-cutting concerns. Authentication, monitoring, and access restrictions are some standard use cases for decorators in frameworks like Django and Flask.</p>
<p>The popularity of web APIs has also led to common adoption of rate limiting and caching decorators for performance.</p>
<p>Decorators have been part of Python for a long time. The <code>@</code> decorator syntax was introduced by PEP 318 and shipped in Python 2.4 in 2004, which opened the door to elegant solutions in the style of aspect-oriented programming. From web to data science, they continue to empower abstraction and modularity across Python domains.</p>
<p>The examples in this handbook only scratch the surface of what custom tailored decorators can enable. Based on any specific objective like security, throttling user requests, transparent encryption, and so on, you can create innovative decorators to address your needs. Structuring logic processing pipelines using a composition of specialized single-responsibility decorators also encourages reuse over redundancy.</p>
<p>Understanding decorators not only improves development skills but unlocks ways to dictate program behaviour flexibly. I encourage you to assess common needs across your codebases that can be abstracted into standalone decorators. With some practice, it becomes easy to spot cross-cutting concerns and extend functions efficiently without breaking a sweat.</p>
<p>If you liked this lesson and would like to explore more insightful tech content, including Python, Django, and System Design reads, check out my <a target="_blank" href="https://atharvashah.netlify.app">Blog</a>. You can also view my projects with proof of work on <a target="_blank" href="https://github.com/HighnessAtharva">GitHub</a> and connect with me on <a target="_blank" href="https://www.linkedin.com/in/atharva-shah-5873a2111/">LinkedIn</a> for a chat.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Use Databricks Delta Lake with SQL – Full Handbook ]]>
                </title>
                <description>
                    <![CDATA[ Welcome to the Databricks Delta Lake with SQL Handbook! Databricks is a unified analytics platform that brings together data engineering, data science, and business analytics into a collaborative workspace. Delta Lake, a powerful storage layer built ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/databricks-sql-handbook/</link>
                <guid isPermaLink="false">66d45d98680e33282da25e0a</guid>
                
                    <category>
                        <![CDATA[ data-engineering ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Data Science ]]>
                    </category>
                
                    <category>
                        <![CDATA[ handbook ]]>
                    </category>
                
                    <category>
                        <![CDATA[ SQL ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Atharva Shah ]]>
                </dc:creator>
                <pubDate>Tue, 05 Sep 2023 13:57:32 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2023/09/Databricks-Delta-Lake-with-SQL-Handbook-Cover.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Welcome to the Databricks Delta Lake with SQL Handbook! Databricks is a unified analytics platform that brings together data engineering, data science, and business analytics into a collaborative workspace.</p>
<p>Delta Lake, an open-source storage layer that sits on top of your data lake and is built into Databricks, provides enhanced reliability, performance, and data quality for big data workloads.</p>
<p>This is a hands-on training guide where you will get a chance to dive into the world of Databricks and learn how to effectively use Delta Lake for managing and analyzing data. It'll provide you with the essential SQL skills to efficiently interact with Delta tables and perform advanced data analytics.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>This handbook is designed for beginner-level SQL users who have some experience with cloud platforms and clusters. Although no prior experience with Databricks is required, it is recommended that you have a basic understanding of the following concepts:</p>
<ul>
<li><p><strong>Databases:</strong> Familiarity with the basic structure and functionality of databases will be helpful.</p>
</li>
<li><p><strong>SQL Queries:</strong> Knowledge of SQL syntax and the ability to write basic queries is essential.</p>
</li>
<li><p><strong>Jupyter Notebooks:</strong> Understanding how Jupyter notebooks work and being comfortable with running code cells is recommended.</p>
</li>
</ul>
<p>While this handbook assumes a certain level of familiarity with databases, SQL, and Jupyter notebooks, it will guide you step-by-step through each process, ensuring that you understand and follow along with the material.</p>
<p>No installation is necessary, as all the work will be done in Databricks notebooks running on a cluster. Beyond creating a free account and a cluster, which the next section walks through, there is no local setup or configuration.</p>
<p>By the end of this handbook, you will have gained a solid foundation in using SQL with Databricks, enabling you to leverage its powerful capabilities for data analysis and manipulation.</p>
<p>Let's get started!</p>
<h2 id="heading-table-of-contents"><strong>Table of Contents</strong></h2>
<p>Here are the sections of this tutorial:</p>
<ol>
<li><a class="post-section-overview" href="#heading-introduction-to-databricks">Introduction to Databricks</a></li>
</ol>
<ul>
<li><p>What is Databricks?</p>
</li>
<li><p>Key features and benefits</p>
</li>
<li><p>Getting started with Databricks Workspace</p>
</li>
<li><p>Notebook basics and interactive analytics</p>
</li>
</ul>
<ol start="2">
<li><a class="post-section-overview" href="#heading-introduction-to-delta">Introduction to Delta</a></li>
</ol>
<ul>
<li><p>Understanding Delta Lake</p>
</li>
<li><p>Advantages of using Delta</p>
</li>
<li><p>Use cases of Delta in real-world scenarios</p>
</li>
<li><p>Supported languages and platforms for Delta</p>
</li>
</ul>
<ol start="3">
<li><a class="post-section-overview" href="#heading-how-to-create-and-manage-tables">How to Create and Manage Tables</a></li>
</ol>
<ul>
<li><p>Creating tables from various data sources</p>
</li>
<li><p>SQL Data Definition Language (DDL) commands</p>
</li>
<li><p>SQL Data Manipulation Language (DML) commands</p>
</li>
<li><p>Creating tables from a Databricks dataset</p>
</li>
<li><p>Saving the loaded CSV file to Delta using Python</p>
</li>
</ul>
<ol start="4">
<li><a class="post-section-overview" href="#heading-delta-sql-command-support">Delta SQL Command Support</a></li>
</ol>
<ul>
<li><p>Delta SQL commands for data management</p>
</li>
<li><p>Performing UPSERT (UPDATE and INSERT) operations</p>
</li>
</ul>
<ol start="5">
<li><a class="post-section-overview" href="#heading-advanced-sql-queries">Advanced SQL Queries</a></li>
</ol>
<ul>
<li><p>Handling data visualization in Delta</p>
</li>
<li><p>Advanced aggregate queries in Delta</p>
</li>
<li><p>Counting diamonds by clarity using SQL</p>
</li>
<li><p>Adding table constraints for data integrity</p>
</li>
</ul>
<ol start="6">
<li><a class="post-section-overview" href="#heading-how-to-work-with-dataframes">How to Work with DataFrames</a></li>
</ol>
<ul>
<li><p>Creating a DataFrame from a Databricks dataset</p>
</li>
<li><p>Data manipulation and displaying results using DataFrames</p>
</li>
</ul>
<ol start="7">
<li><a class="post-section-overview" href="#heading-version-control-and-time-travel-in-delta">Version Control and Time Travel in Delta</a></li>
</ol>
<ul>
<li><p>Understanding version control and time travel in Delta</p>
</li>
<li><p>Restoring data to a specific version</p>
</li>
<li><p>Utilizing autogenerated fields for metadata tracking</p>
</li>
</ul>
<ol start="8">
<li><a class="post-section-overview" href="#heading-delta-table-cloning">Delta Table Cloning</a></li>
</ol>
<ul>
<li><p>Deep and shallow copying of Delta tables</p>
</li>
<li><p>Efficiently cloning Delta tables for data exploration and analysis</p>
</li>
</ul>
<ol start="9">
<li><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></li>
</ol>
<h2 id="heading-introduction-to-databricks">Introduction to Databricks</h2>
<p>Databricks is a unified analytics platform that combines data engineering, data science, and machine learning into a single collaborative environment. Leveraging Apache Spark, it processes and analyzes vast amounts of data efficiently.</p>
<p>Databricks offers benefits like seamless scalability, real-time collaboration, and simplified workflows, making it a favored choice for data-driven enterprises.</p>
<p>Its versatility suits various use cases: from ETL processes and data preparation to advanced analytics and AI model development. Databricks aids in uncovering insights from structured and unstructured data, empowering businesses to make informed decisions swiftly.</p>
<p>You can see its application in finance for fraud detection, healthcare for predictive analytics, e-commerce for recommendation engines, and so on. Basically, Databricks accelerates data-driven innovation, transforming raw information into actionable intelligence.</p>
<p>To follow along with this tutorial, you should first create a <a target="_blank" href="https://www.databricks.com/try-databricks">Community Edition account</a> so you can create your clusters.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-209.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Create a Databricks Community Edition Account</em></p>
<p>Once you've created your account, head over to the <a target="_blank" href="https://community.cloud.databricks.com/login.html">Community Edition login page</a>. Once you have signed in, you'll be greeted with a screen very similar to the one shown below.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-212.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Databricks User Dashboard with options to create workspaces, notebooks, and import data</em></p>
<p>From the sidebar on the left, you can create your workspaces, and upload datasets and files that you wish to process.</p>
<p>To follow along, click on the link highlighted in the image above (the one that says "create a notebook"). It will launch a new notebook on the Databricks platform where we'll be writing all the code.</p>
<p>You can also access all your notebooks from the left sidebar or from the "Recents" tab on the home screen once you log in.</p>
<p>You can find all the code, instructions, and steps used in this handbook with explanations in one of the public notebooks I have created <a target="_blank" href="https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/4547302627522723/2783251604801531/8769490171999815/latest.html">here</a>.</p>
<p>On creating a new notebook, you should create a cluster to run your commands and process the data. Clusters in the Databricks Delta platform are groups of computing resources that drive efficient data processing. They execute tasks in parallel, speeding up tasks like ETL and analysis.</p>
<p>Clusters offer tailored resource allocation, ensuring optimal performance and scalability. Supporting multiple users and tasks concurrently, clusters encourage collaboration. Leveraging Apache Spark, they enable advanced analytics and machine learning.</p>
<p>Clusters run the Spark jobs that read and write Delta tables, while Delta's ACID transactions keep the data consistent. Overall, clusters empower seamless, high-performance data handling, essential for tasks ranging from data preparation to sophisticated analytics and AI model training.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-213.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Provision a cluster by creating a new resource to run commands in the notebook</em></p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-214.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Proceed with the standard configuration</em></p>
<p>Now that we have the notebook and clusters set up, we can start with the code. But before we do that, here are a few key terms to know. These relate to the platform itself rather than to SQL syntax, which is covered later.</p>
<h3 id="heading-data-ingestion">Data Ingestion</h3>
<p>Data ingestion in Delta involves loading data from external sources, including third-party ingestion tools such as Fivetran. Delta stores table data as Parquet files, which is a columnar storage format. To load data into Delta, we can use Spark (for example, through PySpark in Python) and specify the storage location. Files can also be loaded into a Delta table directly from SQL with the <code>COPY INTO</code> command, after which the table can be queried with regular SQL.</p>
<h3 id="heading-dashboards">Dashboards</h3>
<p>Visualizations created in SQL notebooks within Delta can be added to custom dashboards for BI/Analytics. These dashboards are lightweight and provide real-time updates based on data refreshment. This enables users to create insightful and interactive dashboards for data analysis and reporting. You need not create your dashboards from scratch. Popular Dashboard templates are available.</p>
<h3 id="heading-policies">Policies</h3>
<p>Delta provides data governance through the Unity Catalog, ensuring that users only have access to databases and tables they are permitted to view or edit. This granular control over data access enhances security and data privacy within the system.</p>
<h3 id="heading-history">History</h3>
<p>Moderators or superusers can access the history of each query run against all databases, along with timestamps and query execution times. This feature helps in understanding query patterns and optimizing database performance based on usage insights.</p>
<h3 id="heading-optimization">Optimization</h3>
<p>To improve query performance, Delta offers various optimization techniques, such as database indexing, clustering, Bloom filter indexing, and leveraging MPP paradigms like MapReduce. Knowledge of normalization and schema design also contributes to writing efficient SQL queries.</p>
<h3 id="heading-alerts">Alerts</h3>
<p>Delta allows users to set alerts based on comparison operators applied to query results. For example, when a sales count query returns a value below a threshold, an alert can be triggered via Slack, ticketing tools, or emails. Customizable alerts ensure timely notifications for critical data events.</p>
<h3 id="heading-persona-based-design">Persona-Based Design</h3>
<p>The Databricks Platform is designed to cater to different personas, including Data Science/Analytics and BI/MLOps specialists. Users get segregated interfaces tailored to their roles. However, the Unity Catalog can aggregate all these views, providing a cohesive experience.</p>
<h3 id="heading-sql-workspace">SQL Workspace</h3>
<p>The SQL Workspace in Delta provides an interface similar to MySQL Workbench or pgAdmin. Users can run SQL queries on datasets without the need to load the data repeatedly, as is done in notebooks. This efficient querying enhances the SQL-based data analysis experience.</p>
<h3 id="heading-integration-with-other-bi-tools">Integration with other BI Tools</h3>
<p>Databricks integrates well with Tableau and Power BI. You can import your data points and visualizations seamlessly and get consistent and synced results in the BI tools of your choice. With the click of a button, live queries are generated against the Databricks datasets.</p>
<h2 id="heading-introduction-to-delta">Introduction to Delta</h2>
<p>Delta Lake is an open storage format used to save your data in your Lakehouse. Delta provides an abstraction layer on top of files. It's the storage foundation of your Lakehouse.</p>
<h3 id="heading-why-delta-lake">Why Delta Lake?</h3>
<p><img src="https://pages.databricks.com/rs/094-YMS-629/images/delta-lake-logo-whitebackground.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Running an ingestion pipeline on Cloud Storage can be very challenging. Data teams typically face the following challenges:</p>
<ul>
<li><p>Hard to append data (Adding newly arrived data leads to incorrect reads).</p>
</li>
<li><p>Modification of existing data is difficult (GDPR/CCPA requires making fine-grained changes to the existing data lake).</p>
</li>
<li><p>Jobs failing mid-way (Half of the data appears in the data lake, the rest may be missing).</p>
</li>
<li><p>Data quality issues (It’s a constant headache to ensure that all the data is correct and high quality).</p>
</li>
<li><p>Real-time operations (Mixing streaming and batch leads to inconsistency).</p>
</li>
<li><p>Costly to keep historical versions of the data (Regulated environments require reproducibility, auditing, and governance).</p>
</li>
<li><p>Difficult to handle large metadata (For large data lakes, the metadata itself becomes difficult to manage).</p>
</li>
<li><p>“Too many files” problems (Data lakes are not great at handling millions of small files).</p>
</li>
<li><p>Hard to get great performance (Partitioning the data for performance is error-prone and difficult to change).</p>
</li>
</ul>
<p>These challenges have a real impact on team efficiency and productivity, because teams spend unnecessary time fixing low-level technical issues instead of focusing on high-level business implementation.</p>
<p>Because Delta Lake solves all the low-level technical challenges of saving petabytes of data in your lakehouse, it lets you focus on implementing a simple data pipeline while providing blazing-fast query answers for your BI and analytics reports.</p>
<p>In addition, Delta Lake is a fully open source project under the Linux Foundation and is adopted by most of the data players. You know you own your data and won't have vendor lock-in.</p>
<h3 id="heading-features-and-capabilities">Features and Capabilities</h3>
<p>You can think about Delta as a file format that your engine can leverage to bring the following capabilities out of the box:</p>
<ul>
<li><p>ACID transactions</p>
</li>
<li><p>Support for DELETE/UPDATE/MERGE</p>
</li>
<li><p>Unify batch &amp; streaming</p>
</li>
<li><p>Time Travel</p>
</li>
<li><p>Zero-copy clone</p>
</li>
<li><p>Generated partitions</p>
</li>
<li><p>CDF - Change Data Feed (Databricks Runtime)</p>
</li>
<li><p>Blazing-fast queries</p>
</li>
</ul>
<p>This hands-on quickstart guide is going to focus on:</p>
<ul>
<li><p>Loading Databases and Tabular Data from a variety of sources</p>
</li>
<li><p>Writing DDL, DML, and DTL queries on these datasets</p>
</li>
<li><p>Visualizing Datasets to get conclusive results</p>
</li>
<li><p>Time travel and Restoring database</p>
</li>
<li><p>Performance Optimization</p>
</li>
</ul>
<h2 id="heading-how-to-create-and-manage-tables">How to Create and Manage Tables</h2>
<p>Okay, time to code! If you still have the notebook that we created earlier along with the clusters open, you can start by following along with the code below. Don't worry, explanations for every step will follow.</p>
<p>Select the dropdown next to the notebook title and ensure SQL is selected, since this handbook is all about Delta Lake with SQL.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-215.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Select Notebook language to be SQL</em></p>
<h3 id="heading-how-to-create-tables-from-a-databricks-dataset">How to Create Tables from a Databricks Dataset</h3>
<p>Databricks notebooks are very much like Jupyter Notebooks. You have to insert your code into cells and run them one by one or together. All the output is shown cell by cell, progressively.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-216.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Databricks notebook interface</em></p>
<p>Here's the code from the image above:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">DROP</span> <span class="hljs-keyword">TABLE</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">EXISTS</span> diamonds; 
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> diamonds 
<span class="hljs-keyword">USING</span> csv 
OPTIONS (<span class="hljs-keyword">path</span> <span class="hljs-string">"/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv"</span>, header <span class="hljs-string">"true"</span>)
</code></pre>
<p>In the code above, two SQL statements (<code>DROP TABLE</code> and <code>CREATE TABLE</code>) are used to create a table named <code>diamonds</code> in a database. The table is based on data from a CSV file located at the specified path.</p>
<p>If a table with the same name already exists, the <code>DROP TABLE IF EXISTS diamonds</code> statement ensures it is deleted before creating a new one. The table will have the same schema as the CSV file, with the first row assumed to be the header containing column names ("header 'true'").</p>
<p>Here's a command that returns all the records from the <code>diamonds</code> table:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">from</span> diamonds
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-183.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>The above query returns all the records from the diamonds table</em></p>
<p>Here's another command:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">describe</span> diamonds;
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-184.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Table metadata returned by the</em> <code>describe</code> command</p>
<p>In SQL, the <code>DESCRIBE</code> statement is used to retrieve metadata information about a table's structure. The specific syntax for the <code>DESCRIBE</code> statement can vary depending on the database system being used.</p>
<p>However, its primary purpose is to provide details about the columns in a table, such as their names, data types, constraints, and other properties.</p>
<h3 id="heading-saving-the-loaded-csv-file-to-delta-using-python">Saving the loaded CSV file to Delta using Python</h3>
<p>The best part about using the Databricks platform is that it allows you to write Python, SQL, Scala, and R interchangeably in the same notebook.</p>
<p>You can switch languages at any point by starting a cell with a magic command such as <code>%python</code> or <code>%sql</code> (referred to in this handbook as <strong>"Delta Magic Commands"</strong>). You can find a full list of magic commands at the end of this handbook.</p>
<pre><code class="lang-python">%python

diamonds = spark.read.csv(<span class="hljs-string">"/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv"</span>, header=<span class="hljs-string">"true"</span>, inferSchema=<span class="hljs-string">"true"</span>)

diamonds.write.format(<span class="hljs-string">"delta"</span>).mode(<span class="hljs-string">"overwrite"</span>).save(<span class="hljs-string">"/delta/diamonds"</span>)
</code></pre>
<p>Data is read from a CSV file located at <strong>/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv</strong> into a Spark DataFrame named <code>diamonds</code>. The first row of the CSV file is treated as the header, and Spark infers the schema for the DataFrame based on the data.</p>
<p>The DataFrame <code>diamonds</code> is then written in Delta Lake format to <strong>/delta/diamonds</strong>. Because the write mode is <code>overwrite</code>, any table that already exists at that location is replaced. If nothing exists there, a new table is created.</p>
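<p>To get a feel for what the <code>header</code> and <code>inferSchema</code> options change, here is a small plain-Python sketch. It uses the standard <code>csv</code> module rather than Spark, so it only imitates the behavior. The sample rows, the <code>read_csv</code> helper, and the <code>infer_type</code> helper are made up for illustration, and the fallback to names like <code>_c0</code> for a blank header cell is an assumption about why the <code>diamonds</code> table ends up with a <code>_c0</code> column.</p>

```python
import csv
import io

# Made-up sample in the shape of diamonds.csv; note the blank first header cell.
SAMPLE_CSV = ",carat,cut,price\n1,0.5,Ideal,1000\n2,0.7,Premium,2500\n"


def infer_type(values):
    """Return the narrowest of int, float, str that can parse every value."""
    for caster in (int, float):
        try:
            for value in values:
                caster(value)
            return caster
        except ValueError:
            continue
    return str


def read_csv(text, header=True, infer_schema=False):
    """Return (schema, records) for CSV text, loosely imitating the two options."""
    rows = list(csv.reader(io.StringIO(text)))
    first, data = (rows[0], rows[1:]) if header else ([""] * len(rows[0]), rows)
    # Assumption: blank or missing column names fall back to _c0, _c1, ...
    names = [name or f"_c{i}" for i, name in enumerate(first)]
    columns = list(zip(*data))
    # Without inference every column stays a string.
    types = [infer_type(col) if infer_schema else str for col in columns]
    records = [{n: t(v) for n, t, v in zip(names, types, row)} for row in data]
    schema = {n: t.__name__ for n, t in zip(names, types)}
    return schema, records


schema_plain, _ = read_csv(SAMPLE_CSV)
schema_inferred, rows_inferred = read_csv(SAMPLE_CSV, infer_schema=True)
print(schema_plain)     # every column is 'str'
print(schema_inferred)  # numeric columns become 'int' or 'float'
```

<p>The point of the sketch is only that schema inference costs an extra pass over the data in exchange for typed columns, which matters later when you aggregate numeric columns such as <code>price</code>.</p>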
<pre><code class="lang-sql">
<span class="hljs-keyword">DROP</span> <span class="hljs-keyword">TABLE</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">EXISTS</span> diamonds;

<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> diamonds <span class="hljs-keyword">USING</span> DELTA LOCATION <span class="hljs-string">'/delta/diamonds/'</span>
</code></pre>
<p>The SQL statements above drop any existing table named <code>diamonds</code> and create a new Delta Lake table named <code>diamonds</code> that uses the data stored in Delta Lake format at the <strong>/delta/diamonds/</strong> location.</p>
<p>You can run a SELECT statement to ensure that the table appears as expected:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">from</span> diamonds
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-185.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>The same diamonds table result set once restored from Delta Lake</em></p>
<h2 id="heading-delta-sql-command-support"><strong>Delta SQL Command Support</strong></h2>
<p>In the world of databases, two of the most fundamental types of commands are Data Manipulation Language (DML) and Data Definition Language (DDL). These commands play a crucial role in managing and organizing data within a database. In this section, we will explore what DML and DDL commands are, how they differ, and how they are used.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-186.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Databricks Notebooks support all the SQL commands including DDL and DML commands highlighted here</em></p>
<h3 id="heading-data-manipulation-language-dml">Data Manipulation Language (DML)</h3>
<p>DML is used to manipulate or modify data stored in a database. These commands allow users to insert, retrieve, update, and delete data from database tables. Let's take a closer look at some commonly used DML commands:</p>
<p><strong>SELECT</strong>: The <code>SELECT</code> command is used to retrieve data from one or more tables in a database. It allows you to specify the columns and rows you want to extract by using conditions and filters. For example, <code>SELECT * FROM Customers</code> retrieves all the records from the <code>Customers</code> table.</p>
<p><strong>INSERT</strong>: The <code>INSERT</code> command adds new data into a table. It allows you to specify the value for each column or select values from another table. For example, <code>INSERT INTO Customers (Name, Email) VALUES ('John Doe', 'john@example.com')</code> adds a new customer record to the <code>Customers</code> table.</p>
<p><strong>UPDATE</strong>: The <code>UPDATE</code> command is used to modify existing data in a table. It allows you to change the values of specific columns based on certain conditions. For example, <code>UPDATE Customers SET Email = 'new@example.com' WHERE ID = 1</code> updates the email address of the customer with ID of 1.</p>
<p><strong>DELETE</strong>: The <code>DELETE</code> command is used to remove data from a table. It allows you to delete specific rows based on certain conditions. For example, <code>DELETE FROM Customers WHERE ID = 1</code> deletes the customer record with ID of 1 from the <code>Customers</code> table.</p>
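<p>The four DML commands above can be tried end to end with a short sketch. This one uses Python's built-in <code>sqlite3</code> module as a local stand-in for a real database (it is not Databricks or Delta Lake), and the <code>ID</code> column is added to the <code>Customers</code> example so that the <code>WHERE</code> clauses have something to match on.</p>

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (ID INTEGER, Name TEXT, Email TEXT)")

# INSERT: add a new row.
conn.execute(
    "INSERT INTO Customers (ID, Name, Email) "
    "VALUES (1, 'John Doe', 'john@example.com')"
)
# SELECT: read the row back.
after_insert = conn.execute("SELECT * FROM Customers").fetchall()

# UPDATE: change one column of the matching row.
conn.execute("UPDATE Customers SET Email = 'new@example.com' WHERE ID = 1")
after_update = conn.execute(
    "SELECT Email FROM Customers WHERE ID = 1"
).fetchall()

# DELETE: remove the matching row; the follow-up SELECT returns no rows.
conn.execute("DELETE FROM Customers WHERE ID = 1")
after_delete = conn.execute("SELECT * FROM Customers WHERE ID = 1").fetchall()

print(after_insert, after_update, after_delete)
conn.close()
```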
<h3 id="heading-data-definition-language-ddl-commands">Data Definition Language (DDL) Commands</h3>
<p>DDL commands are used to define the structure and organization of a database. These commands allow users to create, modify, and delete database objects such as tables, indexes, and constraints.</p>
<p>Let's explore some commonly used DDL commands:</p>
<p><strong>CREATE</strong>: Creates a new database object, such as a table or an index. It allows you to define the columns, data types, and constraints for the object. For example, <code>CREATE TABLE Customers (ID INT, Name VARCHAR(50), Email VARCHAR(100))</code> creates a new table named <code>Customers</code> with three columns.</p>
<p><strong>ALTER</strong>: Modifies the structure of an existing database object. It allows you to add, modify, or delete columns, constraints, or indexes. For example, <code>ALTER TABLE Customers ADD COLUMN Phone VARCHAR(20)</code> adds a new column named <code>Phone</code> to the <code>Customers</code> table.</p>
<p><strong>DROP</strong>: Deletes an existing database object. It permanently removes the object and its associated data from the database. For example, <code>DROP TABLE Customers</code> deletes the <code>Customers</code> table from the database.</p>
<p><strong>TRUNCATE</strong>: The <code>TRUNCATE</code> command is used to remove all the data from a table, while keeping the table structure intact. It is faster than the <code>DELETE</code> command when you want to remove all records from a table. For example, <code>TRUNCATE TABLE Customers</code> removes all records from the <code>Customers</code> table.</p>
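<p>Here is a minimal sketch of the DDL commands above, again using Python's built-in <code>sqlite3</code> module as a local stand-in rather than Databricks. SQLite has no <code>TRUNCATE</code> statement, so the sketch substitutes an unfiltered <code>DELETE</code> to empty the table while keeping its structure. The helper functions <code>column_names</code> and <code>table_exists</code> are made up for this example.</p>

```python
import sqlite3


def column_names(conn, table):
    """List a table's column names using SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def table_exists(conn, table):
    """Check SQLite's catalog for a table with the given name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")

# CREATE: define a new table.
conn.execute("CREATE TABLE Customers (ID INT, Name VARCHAR(50), Email VARCHAR(100))")
cols_after_create = column_names(conn, "Customers")

# ALTER: add a column to the existing table.
conn.execute("ALTER TABLE Customers ADD COLUMN Phone VARCHAR(20)")
cols_after_alter = column_names(conn, "Customers")

# Stand-in for TRUNCATE: remove every row but keep the table definition.
conn.execute("INSERT INTO Customers (ID, Name) VALUES (1, 'John Doe')")
conn.execute("DELETE FROM Customers")
rows_after_clear = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]
exists_after_clear = table_exists(conn, "Customers")

# DROP: remove the table itself.
conn.execute("DROP TABLE Customers")
exists_after_drop = table_exists(conn, "Customers")

print(cols_after_create, cols_after_alter, rows_after_clear, exists_after_drop)
conn.close()
```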
<p>Delta Lake supports standard DML including <code>UPDATE</code>, <code>DELETE</code> and <code>MERGE INTO</code>, providing developers with more control to manage their big datasets.</p>
<p>Here's an example that uses the <code>INSERT</code>, <code>UPDATE</code>, and <code>SELECT</code> commands:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> diamonds(_c0, carat, cut,    color,    clarity,    <span class="hljs-keyword">depth</span>,    <span class="hljs-keyword">table</span>,    price,    x,    y,    z) <span class="hljs-keyword">values</span> (<span class="hljs-number">53941</span>, <span class="hljs-number">0.22</span>,    <span class="hljs-string">'Premium'</span>, <span class="hljs-string">'I'</span>,    <span class="hljs-string">'SI2'</span>,    <span class="hljs-string">'60.3'</span>,    <span class="hljs-string">'62.1'</span>,    <span class="hljs-string">'334'</span>,    <span class="hljs-string">'3.79'</span>,    <span class="hljs-string">'3.75'</span>,    <span class="hljs-string">'2.27'</span>);

<span class="hljs-keyword">UPDATE</span> diamonds <span class="hljs-keyword">SET</span> carat = <span class="hljs-number">0.20</span> <span class="hljs-keyword">WHERE</span> _c0 = <span class="hljs-number">53941</span>;

<span class="hljs-keyword">select</span> * <span class="hljs-keyword">from</span> diamonds <span class="hljs-keyword">where</span> _c0=<span class="hljs-number">53941</span>;
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-187.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Fetching a unique record from the table</em></p>
<p>In the example above, a new row is inserted into the <code>diamonds</code> table with specific values for each column.</p>
<p>Then the <code>carat</code> value for the row with <code>_c0</code> equal to 53941 is updated to 0.20.</p>
<p>The final <code>SELECT</code> statement retrieves the row with <code>_c0</code> equal to 53941, showing its current state after the <code>INSERT</code> and <code>UPDATE</code> operations. This confirms that both the insert and the update succeeded.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">DELETE</span> <span class="hljs-keyword">FROM</span> diamonds <span class="hljs-keyword">where</span> _c0=<span class="hljs-number">53941</span>;

<span class="hljs-keyword">select</span> * <span class="hljs-keyword">from</span> diamonds <span class="hljs-keyword">where</span> _c0=<span class="hljs-number">53941</span>;
</code></pre>
<p>The above <code>DELETE</code> command paired with the <code>WHERE</code> clause removes the row from the table, and the subsequent <code>SELECT</code> query validates this by returning an empty result set.</p>
<h3 id="heading-upsert-operation">UPSERT Operation</h3>
<p>The "upsert" operation updates a record if it already exists, and inserts it if it doesn't.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span>  diamond__mini(_c0 <span class="hljs-built_in">int</span>, carat <span class="hljs-keyword">double</span>, cut <span class="hljs-keyword">string</span>,    color <span class="hljs-keyword">string</span>,    clarity <span class="hljs-keyword">string</span>,    <span class="hljs-keyword">depth</span> <span class="hljs-keyword">double</span>, <span class="hljs-keyword">table</span> <span class="hljs-keyword">double</span>,    price <span class="hljs-built_in">int</span>,    x <span class="hljs-keyword">double</span>,    y <span class="hljs-keyword">double</span>,    z <span class="hljs-keyword">double</span>);

<span class="hljs-keyword">delete</span> <span class="hljs-keyword">from</span> diamond__mini;

<span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> diamond__mini(_c0, carat, cut,    color,    clarity,    <span class="hljs-keyword">depth</span>,    <span class="hljs-keyword">table</span>,    price,    x,    y,    z) <span class="hljs-keyword">values</span> (<span class="hljs-number">1</span>, <span class="hljs-number">0.22</span>,    <span class="hljs-string">'Premium'</span>, <span class="hljs-string">'I'</span>,    <span class="hljs-string">'SI2'</span>,    <span class="hljs-string">'60.3'</span>,    <span class="hljs-string">'62.1'</span>,    <span class="hljs-string">'334'</span>,    <span class="hljs-string">'3.79'</span>,    <span class="hljs-string">'3.75'</span>,    <span class="hljs-string">'2.27'</span>);
<span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> diamond__mini(_c0, carat, cut,    color,    clarity,    <span class="hljs-keyword">depth</span>,    <span class="hljs-keyword">table</span>,    price,    x,    y,    z) <span class="hljs-keyword">values</span> (<span class="hljs-number">2</span>, <span class="hljs-number">0.22</span>,    <span class="hljs-string">'Premium'</span>, <span class="hljs-string">'I'</span>,    <span class="hljs-string">'SI2'</span>,    <span class="hljs-string">'60.3'</span>,    <span class="hljs-string">'62.1'</span>,    <span class="hljs-string">'334'</span>,    <span class="hljs-string">'3.79'</span>,    <span class="hljs-string">'3.75'</span>,    <span class="hljs-string">'2.27'</span>);
<span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> diamond__mini(_c0, carat, cut,    color,    clarity,    <span class="hljs-keyword">depth</span>,    <span class="hljs-keyword">table</span>,    price,    x,    y,    z) <span class="hljs-keyword">values</span> (<span class="hljs-number">90000</span>, <span class="hljs-number">0.22</span>,    <span class="hljs-string">'Premium'</span>, <span class="hljs-string">'I'</span>,    <span class="hljs-string">'SI2'</span>,    <span class="hljs-string">'60.3'</span>,    <span class="hljs-string">'62.1'</span>,    <span class="hljs-string">'334'</span>,    <span class="hljs-string">'3.79'</span>,    <span class="hljs-string">'3.75'</span>,    <span class="hljs-string">'2.27'</span>);

<span class="hljs-keyword">select</span> * <span class="hljs-keyword">from</span> diamond__mini;
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-188.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Creating a small table</em> <code>diamond__mini</code> to demonstrate the UPSERT operation</p>
<p>In this scenario, we have created a table named <code>diamond__mini</code> to act as the source for an upsert (that is, insert or update) into the <code>diamonds</code> table.</p>
<p><code>diamond__mini</code> contains only 3 records. Two of these rows have <code>_c0</code> values (1 and 2) that already exist in the <code>diamonds</code> table, and one row (with <code>_c0</code> value 90000) does not exist there.</p>
<p>The code first creates the <code>diamond__mini</code> table with a schema that matches the <code>diamonds</code> table.</p>
<p>It then clears the <code>diamond__mini</code> table by deleting all existing records, ensuring that we have a clean slate for the upsert test.</p>
<p>Next, three <code>INSERT</code> statements add three records with different <code>_c0</code> values to the <code>diamond__mini</code> table, including one with <code>_c0 = 90000</code>.</p>
<p>Lastly, we select all records from the <code>diamond__mini</code> table to confirm that the source rows are in place before running the upsert.</p>
<p>Since the <code>_c0</code> values 1 and 2 already exist in the <code>diamonds</code> table, the upsert will treat the corresponding rows in <code>diamond__mini</code> as updates to the existing rows.</p>
<p>On the other hand, the row with <code>_c0 = 90000</code> is new and does not exist in the <code>diamonds</code> table, so it will be treated as an insert.</p>
<p>The <code>describe</code> command shows the metadata of the new table:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">describe</span> diamond__mini
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-189.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Fetching metadata of the newly created table</em></p>
<p>Here's the upsert operation itself, performed with <code>MERGE INTO</code>:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-192.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Upsert operation on the diamonds and diamond__mini tables</em></p>
<pre><code class="lang-sql"><span class="hljs-comment">-- perform UPSERT operation based on matching column and row criteria from diamond__mini to diamonds table. If a match is found, record will update otherwise it will be inserted.</span>

<span class="hljs-keyword">MERGE</span> <span class="hljs-keyword">INTO</span> diamonds <span class="hljs-keyword">as</span> d <span class="hljs-keyword">USING</span> diamond__mini <span class="hljs-keyword">as</span> m
  <span class="hljs-keyword">ON</span> d._c0 = m._c0
  <span class="hljs-keyword">WHEN</span> <span class="hljs-keyword">MATCHED</span> <span class="hljs-keyword">THEN</span> 
    <span class="hljs-keyword">UPDATE</span> <span class="hljs-keyword">SET</span> *
  <span class="hljs-keyword">WHEN</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">MATCHED</span> 
    <span class="hljs-keyword">THEN</span> <span class="hljs-keyword">INSERT</span> * ;

<span class="hljs-keyword">select</span> * <span class="hljs-keyword">from</span> diamonds <span class="hljs-keyword">where</span> _c0 <span class="hljs-keyword">in</span> (<span class="hljs-number">1</span> ,<span class="hljs-number">2</span>, <span class="hljs-number">90000</span>)
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-193.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>UPSERT operation successful. Values for records with</em> <code>_c0</code> = [1,2] were updated and 90,000 was inserted</p>
<p>In this example, a <code>MERGE</code> operation is performed between two tables: <code>diamonds</code> (target table) and <code>diamond__mini</code> (source table). The <code>MERGE</code> statement compares the records in both tables based on the common <code>_c0</code> column.</p>
<p>Here's a concise explanation:</p>
<ol>
<li><p>The <code>MERGE</code> statement matches records with the same <code>_c0</code> value in both tables (<code>diamonds</code> and <code>diamond__mini</code>).</p>
</li>
<li><p>When a match is found (based on <code>_c0</code>), it performs an <code>UPDATE</code> on the target table (<code>diamonds</code>) using the values from the source table (<code>diamond__mini</code>). This is done for all columns using <code>UPDATE SET *</code>.</p>
</li>
<li><p>If no match is found for a record from the source table (<code>diamond__mini</code>), it performs an <code>INSERT</code> into the target table (<code>diamonds</code>) using the values from the source table for all columns (using <code>INSERT *</code>).</p>
</li>
<li><p>After the <code>MERGE</code> operation, a <code>SELECT</code> statement retrieves the records from the target table (<code>diamonds</code>) with <code>_c0</code> values 1, 2, and 90000 to observe the changes made during the merge.</p>
</li>
</ol>
<p>The <code>MERGE</code> statement is used to synchronize data between the <code>diamonds</code> and <code>diamond__mini</code> tables based on their common <code>_c0</code> column, updating existing records and inserting new ones.</p>
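<p>The matched and not-matched branches of <code>MERGE</code> can be sketched in a few lines of plain Python. This is only a model of the semantics on in-memory dictionaries, not how Delta Lake implements the operation. The <code>merge_upsert</code> function and the sample rows are hypothetical, and the sketch assumes the key is unique within both the target and the source.</p>

```python
def merge_upsert(target, source, key):
    """Return merged rows: source rows update matching target rows or are appended."""
    merged = {row[key]: dict(row) for row in target}
    for row in source:
        if row[key] in merged:
            # WHEN MATCHED THEN UPDATE SET *
            merged[row[key]].update(row)
        else:
            # WHEN NOT MATCHED THEN INSERT *
            merged[row[key]] = dict(row)
    return list(merged.values())


# Made-up rows: keys 1 and 2 exist in the target, 90000 does not.
target_rows = [
    {"_c0": 1, "carat": 0.30},
    {"_c0": 2, "carat": 0.40},
    {"_c0": 3, "carat": 0.50},
]
source_rows = [
    {"_c0": 1, "carat": 0.22},
    {"_c0": 2, "carat": 0.22},
    {"_c0": 90000, "carat": 0.22},
]

result = merge_upsert(target_rows, source_rows, key="_c0")
carat_by_id = {row["_c0"]: row["carat"] for row in result}
print(carat_by_id)
```

<p>Rows 1 and 2 pick up the source values, row 3 is left untouched because the source says nothing about it, and row 90000 is added.</p>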
<h2 id="heading-advanced-sql-queries">Advanced SQL Queries</h2>
<h3 id="heading-data-visualization-in-delta">Data Visualization in Delta</h3>
<p>On the Databricks platform, you can use SQL queries against Delta tables to visualize data and gain valuable insights without the need for complex programming. Here are some ways to explore and visualize data using SQL queries:</p>
<ol>
<li><p><strong>Basic SELECT Queries:</strong> Retrieves data from your Delta tables. By selecting specific columns or applying filters with WHERE clauses, you can quickly get an overview of the data's characteristics.</p>
</li>
<li><p><strong>Aggregate Functions:</strong> SQL provides a variety of aggregate functions like <code>COUNT</code>, <code>SUM</code>, <code>AVG</code>, <code>MIN</code>, and <code>MAX</code>. By using these functions, you can summarize and visualize data at a higher level. You perform operations such as counting the number of records, calculating the average values, or finding the maximum and minimum values.</p>
</li>
<li><p><strong>Grouping and Aggregating Data:</strong> The <code>GROUP BY</code> clause in SQL allows you to group data based on specific columns, and then apply aggregate functions to each group. This enables generation of meaningful insights by analyzing data on a category-wise basis.</p>
</li>
<li><p><strong>Window Functions:</strong> SQL window functions, like <code>ROW_NUMBER</code>, <code>RANK</code>, and <code>DENSE_RANK</code>, are valuable for partitioning data and calculating rankings or running totals. These functions enable analyzing data in a more granular way and help discover patterns.</p>
</li>
<li><p><strong>Joining Tables:</strong> Helps to combine data from multiple Delta tables using SQL <code>JOIN</code> operations. Joins make it possible to merge related data, perform cross-table analysis, and build more advanced visualizations.</p>
</li>
<li><p><strong>Subqueries and CTEs:</strong> SQL subqueries and Common Table Expressions (CTEs) allow you to break down complex problems into manageable parts. These techniques can simplify analysis and make SQL queries more organized and maintainable.</p>
</li>
<li><p><strong>Window Aggregates:</strong> SQL window aggregates, such as <code>SUM</code>, <code>AVG</code>, and <code>COUNT</code> with the <code>OVER</code> clause, enable you to perform calculations on specific windows or ranges of data. This is useful for analyzing trends over time or within specific subsets of your data.</p>
</li>
<li><p><strong>CASE Statements:</strong> CASE statements in SQL help you create conditional expressions, allowing you to categorize or group data based on certain conditions. This can aid in creating custom labels or grouping data into different categories for visualization purposes.</p>
</li>
</ol>
<p>The platform's powerful SQL capabilities empower data analysts and developers to extract meaningful insights from their Delta Lake data, all without the need for additional programming languages or tools.</p>
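<p>The difference between the three ranking window functions mentioned above is easiest to see on data with ties. The following plain-Python sketch models what <code>ROW_NUMBER</code>, <code>RANK</code>, and <code>DENSE_RANK</code> return when ordering by a value in descending order. The <code>rankings</code> function and the sample prices are made up for illustration, and the sketch ignores partitions.</p>

```python
def rankings(values):
    """Return (value, row_number, rank, dense_rank) tuples, highest value first."""
    ordered = sorted(values, reverse=True)
    result = []
    rank = dense_rank = 0
    previous = object()  # sentinel that never equals a real value
    for row_number, value in enumerate(ordered, start=1):
        if value != previous:
            rank = row_number   # RANK jumps to the current row position
            dense_rank += 1     # DENSE_RANK just moves to the next integer
            previous = value
        result.append((value, row_number, rank, dense_rank))
    return result


ranked = rankings([500, 300, 500, 100])
for row in ranked:
    print(row)
```

<p>The two tied rows share rank 1 under both <code>RANK</code> and <code>DENSE_RANK</code>, but the next distinct value gets rank 3 under <code>RANK</code> and rank 2 under <code>DENSE_RANK</code>.</p>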
<pre><code class="lang-sql"><span class="hljs-comment">-- aggregate query to get average price based on diamond colors</span>
<span class="hljs-keyword">SELECT</span> color, <span class="hljs-keyword">avg</span>(price) <span class="hljs-keyword">AS</span> avg_price <span class="hljs-keyword">FROM</span> diamonds <span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> color <span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> color
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-194.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Tabular View for the Query</em></p>
<p>The SQL query above retrieves the average price of diamonds for each color.</p>
<p>Let's break down the code:</p>
<p><code>SELECT color, avg(price) AS avg_price</code> specifies the columns that will be selected in the result set. It selects the <code>color</code> column and calculates the average price using the <code>avg()</code> function. The calculated average is aliased as <code>avg_price</code> for easier reference in the result set.</p>
<p>The <code>FROM diamonds</code> clause specifies the table from which data will be retrieved. In this case, the table is named <code>diamonds</code>.</p>
<p><code>GROUP BY color</code> groups the data by the <code>color</code> column. The result set will contain one row for each unique color, and the average price will be calculated for each group separately.</p>
<p><code>ORDER BY color</code> arranges the result set in ascending order based on the <code>color</code> column. The output will be sorted alphabetically by color.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-195.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Visualized Results for the Query</em></p>
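<p>If you want to experiment with the same query shape outside Databricks, a rough local equivalent can be run with Python's built-in <code>sqlite3</code> module. This is a stand-in, not Spark SQL, and the five rows below are made-up values rather than data from the real <code>diamonds</code> dataset.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE diamonds (color TEXT, price INTEGER)")
# Made-up rows, just enough to form three color groups.
conn.executemany(
    "INSERT INTO diamonds VALUES (?, ?)",
    [("E", 300), ("D", 400), ("E", 500), ("D", 600), ("J", 1000)],
)

# Same shape as the query above: one output row per color, sorted by color.
avg_by_color = conn.execute(
    "SELECT color, avg(price) AS avg_price "
    "FROM diamonds GROUP BY color ORDER BY color"
).fetchall()

for color, avg_price in avg_by_color:
    print(color, avg_price)
conn.close()
```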
<h3 id="heading-count-of-diamonds-by-clarity">Count of Diamonds by Clarity</h3>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> clarity, <span class="hljs-keyword">COUNT</span>(*) <span class="hljs-keyword">AS</span> <span class="hljs-keyword">count</span>
<span class="hljs-keyword">FROM</span> diamonds
<span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> clarity
<span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> <span class="hljs-keyword">count</span> <span class="hljs-keyword">DESC</span>;
</code></pre>
<p>The SQL query above counts the diamonds at each clarity level and presents the results in descending order. It selects the <code>clarity</code> column and uses the <code>COUNT()</code> function to count the number of rows for each clarity value.</p>
<p>The result set is grouped by clarity and sorted in descending order based on the count of diamonds.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-196.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Pie Chart visualization based on the above query</em></p>
<h3 id="heading-average-price-by-depth-range">Average Price by Depth Range</h3>
<pre><code class="lang-sql"><span class="hljs-comment">-- This SQL query calculates the average price of diamonds grouped into depth ranges (60-62 and 62-64), and 'Other' for all other depth values, from the 'diamonds' table. The results are ordered in descending order based on the average price.</span>

<span class="hljs-keyword">SELECT</span> <span class="hljs-keyword">CASE</span> 
         <span class="hljs-keyword">WHEN</span> <span class="hljs-keyword">depth</span> <span class="hljs-keyword">BETWEEN</span> <span class="hljs-number">60</span> <span class="hljs-keyword">AND</span> <span class="hljs-number">62</span> <span class="hljs-keyword">THEN</span> <span class="hljs-string">'60-62'</span>
         <span class="hljs-keyword">WHEN</span> <span class="hljs-keyword">depth</span> <span class="hljs-keyword">BETWEEN</span> <span class="hljs-number">62</span> <span class="hljs-keyword">AND</span> <span class="hljs-number">64</span> <span class="hljs-keyword">THEN</span> <span class="hljs-string">'62-64'</span>
         <span class="hljs-keyword">ELSE</span> <span class="hljs-string">'Other'</span>
       <span class="hljs-keyword">END</span> <span class="hljs-keyword">AS</span> depth_range,
       <span class="hljs-keyword">AVG</span>(<span class="hljs-keyword">CAST</span>(price <span class="hljs-keyword">AS</span> <span class="hljs-keyword">DOUBLE</span>)) <span class="hljs-keyword">AS</span> avg_price
<span class="hljs-keyword">FROM</span> diamonds
<span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> depth_range
<span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> avg_price <span class="hljs-keyword">DESC</span>;
</code></pre>
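<p>Here, we are calculating the average price of diamonds grouped into depth ranges. The query uses a <code>CASE</code> expression to categorize the diamonds into three depth ranges: '60-62' for depths between 60 and 62, '62-64' for depths between 62 and 64, and 'Other' for all other depth values. Because <code>BETWEEN</code> is inclusive and <code>CASE</code> returns the first branch that matches, a depth of exactly 62 falls into the '60-62' range.</p>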
<p>The <code>AVG()</code> function is then used to calculate the average price for each depth range. The result set is grouped by the <code>depth_range</code> column and ordered in descending order based on the average price.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-197.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Average price based on the grouped depth range, achieved using CASE syntax</em></p>
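<p>The bucketing logic of the <code>CASE</code> expression, including what happens at the shared boundary value of 62, can be mirrored in plain Python. The function names and the sample rows below are made up for illustration. The sketch only models the query's logic and does not touch the real table.</p>

```python
from collections import defaultdict


def depth_range(depth):
    """Mirror the CASE expression: inclusive bounds, first matching branch wins."""
    if 60 <= depth <= 62:
        return "60-62"
    if 62 <= depth <= 64:
        return "62-64"
    return "Other"


def avg_price_by_depth_range(rows):
    """Group (depth, price) pairs by range and sort by average price, descending."""
    prices = defaultdict(list)
    for depth, price in rows:
        prices[depth_range(depth)].append(float(price))
    averages = {label: sum(vals) / len(vals) for label, vals in prices.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Made-up (depth, price) pairs; 62.0 sits on the boundary between two ranges.
sample_rows = [(61.0, 400), (62.0, 600), (63.5, 900), (59.0, 200)]
summary = avg_price_by_depth_range(sample_rows)
print(depth_range(62.0))
print(summary)
```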
<h3 id="heading-price-distribution-by-table">Price Distribution by Table</h3>
<pre><code class="lang-sql"><span class="hljs-comment">--  Calculate the median, first quartile (q1), and third quartile (q3) prices for each unique 'table' in the 'diamonds' table based on the 'price' column. The results are grouped by 'table' and provide valuable statistical insights into the price distribution within each category.</span>

<span class="hljs-keyword">SELECT</span> <span class="hljs-keyword">table</span>, 
       <span class="hljs-keyword">PERCENTILE_CONT</span>(<span class="hljs-number">0.5</span>) <span class="hljs-keyword">WITHIN</span> <span class="hljs-keyword">GROUP</span> (<span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> <span class="hljs-keyword">CAST</span>(price <span class="hljs-keyword">AS</span> <span class="hljs-keyword">DOUBLE</span>)) <span class="hljs-keyword">AS</span> median_price,
       <span class="hljs-keyword">PERCENTILE_CONT</span>(<span class="hljs-number">0.25</span>) <span class="hljs-keyword">WITHIN</span> <span class="hljs-keyword">GROUP</span> (<span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> <span class="hljs-keyword">CAST</span>(price <span class="hljs-keyword">AS</span> <span class="hljs-keyword">DOUBLE</span>)) <span class="hljs-keyword">AS</span> q1_price,
       <span class="hljs-keyword">PERCENTILE_CONT</span>(<span class="hljs-number">0.75</span>) <span class="hljs-keyword">WITHIN</span> <span class="hljs-keyword">GROUP</span> (<span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> <span class="hljs-keyword">CAST</span>(price <span class="hljs-keyword">AS</span> <span class="hljs-keyword">DOUBLE</span>)) <span class="hljs-keyword">AS</span> q3_price
<span class="hljs-keyword">FROM</span> diamonds
<span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> <span class="hljs-keyword">table</span>;
</code></pre>
<p>This SQL query calculates the median, first quartile (q1), and third quartile (q3) prices for each unique <code>table</code> value in the <code>diamonds</code> table. It uses the <code>PERCENTILE_CONT()</code> function to calculate these statistical measures.</p>
<p>The function is applied to the <code>price</code> column, which is cast as a double for accurate calculations. The result set is grouped by the <code>table</code> column, providing insights into the price distribution within each <code>table</code> category.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-198.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Calculating median, Q1 and Q3 figures based on the price</em></p>
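<p>A continuous percentile such as <code>PERCENTILE_CONT</code> is defined in standard SQL by linear interpolation between the two nearest ordered values. The sketch below shows that calculation in plain Python so you can check a result by hand. The <code>percentile_cont</code> function here is a hypothetical helper written for this example, not a Spark API, and the four prices are made up.</p>

```python
import math


def percentile_cont(values, fraction):
    """Interpolated percentile: fraction 0.5 gives the median."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    # Zero-based fractional position of the requested percentile.
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    if lower == upper:
        return float(ordered[lower])
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


prices = [100, 200, 300, 400]  # made-up prices for one group
q1_price = percentile_cont(prices, 0.25)
median_price = percentile_cont(prices, 0.5)
q3_price = percentile_cont(prices, 0.75)
print(q1_price, median_price, q3_price)
```

<p>With an even number of values the median lands halfway between the two middle prices, which is why the result can be a value that does not appear in the data.</p>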
<h3 id="heading-price-factor-by-x-y-and-z">Price Factor by X, Y and Z</h3>
<pre><code class="lang-sql"><span class="hljs-comment">-- Calculate the average price of diamonds grouped by their x, y, and z values from the 'diamonds' table. The results are ordered in descending order based on the average price, providing valuable insights into the average price of diamonds with different x, y, and z dimensions.</span>

<span class="hljs-keyword">SELECT</span> x, y, z, <span class="hljs-keyword">AVG</span>(<span class="hljs-keyword">CAST</span>(price <span class="hljs-keyword">AS</span> <span class="hljs-keyword">DOUBLE</span>)) <span class="hljs-keyword">AS</span> avg_price
<span class="hljs-keyword">FROM</span> diamonds
<span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> x, y, z
<span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> avg_price <span class="hljs-keyword">DESC</span>;
</code></pre>
<p>This query will calculate the average price of diamonds grouped by their x, y, and z values from the <code>diamonds</code> table. It selects the columns <code>x</code>, <code>y</code>, <code>z</code>, and uses the <code>AVG()</code> function to calculate the average price for each combination of x, y, and z values.</p>
<p>The result set is then ordered in descending order based on the average price, providing insights into the average price of diamonds with different dimensions.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-199.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Visualization showing average price of diamonds grouped by their x, y, and z values from the 'diamonds' table</em></p>
<h3 id="heading-add-constraints">Add Constraints</h3>
<pre><code class="lang-sql"><span class="hljs-comment">-- This SQL code snippet alters the 'diamonds' table by dropping the existing constraint 'id_not_null' if it exists. Then, it adds a new constraint named 'id_not_null' to ensure that the column '_c0' must not contain null values, enforcing data integrity in the table.</span>

<span class="hljs-keyword">ALTER</span> <span class="hljs-keyword">TABLE</span> diamonds <span class="hljs-keyword">DROP</span> <span class="hljs-keyword">CONSTRAINT</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">EXISTS</span> id_not_null;
<span class="hljs-keyword">ALTER</span> <span class="hljs-keyword">TABLE</span> diamonds <span class="hljs-keyword">ADD</span> <span class="hljs-keyword">CONSTRAINT</span> id_not_null <span class="hljs-keyword">CHECK</span> (_c0 <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">null</span>);
</code></pre>
<pre><code class="lang-sql"><span class="hljs-comment">-- This command will fail because it inserts a row with a null id (_c0):</span>
<span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> diamonds(_c0, carat, cut,    color,    clarity,    <span class="hljs-keyword">depth</span>,    <span class="hljs-keyword">table</span>,    price,    x,    y,    z) <span class="hljs-keyword">values</span> (<span class="hljs-literal">null</span>, <span class="hljs-number">0.22</span>,    <span class="hljs-string">'Premium'</span>, <span class="hljs-string">'I'</span>,    <span class="hljs-string">'SI2'</span>,    <span class="hljs-string">'60.3'</span>,    <span class="hljs-string">'62.1'</span>,    <span class="hljs-string">'334'</span>,    <span class="hljs-string">'3.79'</span>,    <span class="hljs-string">'3.75'</span>,    <span class="hljs-string">'2.27'</span>);
</code></pre>
<p>This statement does not insert anything, because the row violates the <code>id_not_null</code> constraint. Whenever a row fails a constraint, Delta rejects the write and throws an error. In this case, you'll see this exact error:</p>
<pre><code class="lang-sql">Error in SQL statement: DeltaInvariantViolationException: <span class="hljs-keyword">CHECK</span> <span class="hljs-keyword">constraint</span> id_not_null (_c0 <span class="hljs-keyword">IS</span> <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span>) violated <span class="hljs-keyword">by</span> <span class="hljs-keyword">row</span> <span class="hljs-keyword">with</span> <span class="hljs-keyword">values</span>:
 - _c0 : <span class="hljs-literal">null</span>
</code></pre>
<p>This SQL code snippet demonstrates the alteration of the <code>diamonds</code> table to enforce data integrity.</p>
<p>The first line of code, <code>ALTER TABLE diamonds DROP CONSTRAINT IF EXISTS id_not_null;</code>, checks if a constraint named <code>id_not_null</code> exists in the <code>diamonds</code> table and drops it if it does. This step ensures that any existing constraint with the same name is removed before adding a new one.</p>
<p>The second line of code, <code>ALTER TABLE diamonds ADD CONSTRAINT id_not_null CHECK (_c0 is not null);</code>, adds a new constraint named <code>id_not_null</code> to the <code>diamonds</code> table. This constraint specifies that the column <code>_c0</code> must not contain null values. It ensures that whenever data is inserted or updated in this table, the '_c0' column cannot have a null value, maintaining data integrity.</p>
<p>However, the subsequent command, <code>INSERT INTO diamonds(_c0, carat, cut, color, clarity, depth, table, price, x, y, z) VALUES (null, 0.22, 'Premium', 'I', 'SI2', '60.3', '62.1', '334', '3.79', '3.75', '2.27');</code>, attempts to insert a row into the <code>diamonds</code> table with a null value in the <code>_c0</code> column.</p>
<p>Since the newly added constraint prohibits null values in this column, the <code>INSERT</code> operation will fail, preserving the data integrity specified by the constraint.</p>
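<p>The same behavior can be reproduced locally. The sketch below uses Python's built-in <code>sqlite3</code> module rather than Delta, so treat it as an analogy and not as Delta's implementation. SQLite cannot add a <code>CHECK</code> constraint with <code>ALTER TABLE</code>, so the constraint is declared when the table is created, and the table only has three of the columns.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumption: a cut-down diamonds table with the constraint declared up front,
# because SQLite has no ALTER TABLE ... ADD CONSTRAINT.
conn.execute(
    "CREATE TABLE diamonds ("
    "_c0 INTEGER, carat REAL, cut TEXT, "
    "CONSTRAINT id_not_null CHECK (_c0 IS NOT NULL))"
)

# A row with an id is accepted.
conn.execute("INSERT INTO diamonds VALUES (1, 0.23, 'Ideal')")

# A row with a null id violates the constraint and is rejected.
try:
    conn.execute("INSERT INTO diamonds VALUES (NULL, 0.22, 'Premium')")
    rejected = False
except sqlite3.IntegrityError as exc:
    rejected = True
    print("Insert rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM diamonds").fetchone()[0]
print(row_count)
```

<p>Only the valid row ends up in the table, which is the guarantee the constraint is there to provide.</p>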
<h2 id="heading-how-to-work-with-dataframes">How to Work with Dataframes</h2>
<p>You are not restricted to SQL for any of this. Below, we load the same dataset into a DataFrame called <code>diamonds</code> with Python and then use PySpark functions to run more complex queries.</p>
<pre><code class="lang-python">%python
diamonds = spark.read.csv(<span class="hljs-string">"/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv"</span>, header=<span class="hljs-string">"true"</span>, inferSchema=<span class="hljs-string">"true"</span>)
</code></pre>
<p>In the Databricks Delta Lake platform, the <code>spark</code> object represents the SparkSession, which is the entry point for interacting with Spark functionality. It provides a programming interface to work with structured and semi-structured data.</p>
<p>The <code>spark.read.csv()</code> function is used to read a CSV file into a DataFrame. In this case, it reads the <strong>diamonds.csv</strong> file from the specified path. The arguments passed to the function include:</p>
<ul>
<li><p><code>"/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv"</code>: This is the path to the CSV file. You can replace this with the actual path where your file is located.</p>
</li>
<li><p><code>header="true"</code>: This specifies that the first row of the CSV file contains the column names.</p>
</li>
<li><p><code>inferSchema="true"</code>: This instructs Spark to automatically infer the data types of the columns in the DataFrame.</p>
</li>
</ul>
<p>Once the CSV file is read, it is stored in the <code>diamonds</code> variable as a DataFrame. The DataFrame represents a distributed collection of data organized into named columns. It provides various functions and methods to manipulate and analyze the data.</p>
<p>By reading the CSV file into a DataFrame on the Databricks Delta Lake platform, you can leverage the rich querying and processing capabilities of Spark to perform data analysis, transformations, and other operations on the diamonds data.</p>
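<p>To make the two options more concrete, here is a rough plain-Python sketch of what <code>header</code> and <code>inferSchema</code> do conceptually. It uses the standard <code>csv</code> module and an invented three-column sample. Spark infers one type per column by scanning the data, while this simplified <code>infer</code> helper (a hypothetical name) converts each value on its own.</p>

```python
import csv
import io

# Hypothetical three-column sample; the real diamonds.csv has more columns.
raw = io.StringIO("carat,cut,price\n0.23,Ideal,326\n0.21,Premium,326\n")


def infer(value: str):
    """Try int, then float, and fall back to the original string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


# DictReader treats the first row as column names, much like header="true".
records = [{k: infer(v) for k, v in row.items()} for row in csv.DictReader(raw)]
print(records[0])
```

<p>Without the inference step, every value would stay a string, which is also what Spark does when <code>inferSchema</code> is left off.</p>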
<h3 id="heading-manipulate-the-data-and-displays-the-results"><strong>Manipulate the Data and Display the Results</strong></h3>
<p>The example below shows that on the Databricks Delta Lake platform, you are not limited to SQL queries. You can also use Python with PySpark to perform complex data manipulation and analysis.</p>
<p>By using Python, you have access to a wide range of functions and methods provided by PySpark's DataFrame API. This allows you to perform various transformations, aggregations, calculations, and sorting operations on your data.</p>
<p>Whether you choose to use SQL or Python, the Databricks Delta Lake platform provides a flexible environment for data processing and analysis, enabling you to unlock valuable insights from your data.</p>
<pre><code class="lang-python">%python
<span class="hljs-keyword">from</span> pyspark.sql.functions <span class="hljs-keyword">import</span> avg

display(diamonds.select(<span class="hljs-string">"color"</span>,<span class="hljs-string">"price"</span>).groupBy(<span class="hljs-string">"color"</span>).agg(avg(<span class="hljs-string">"price"</span>)).sort(<span class="hljs-string">"color"</span>))
</code></pre>
<p>Firstly, the <code>from pyspark.sql.functions import avg</code> statement imports the <code>avg</code> function from the <code>pyspark.sql.functions</code> module. This function is used to calculate the average value of a column.</p>
<p>Next, the <code>diamonds.select("color", "price").groupBy("color").agg(avg("price")).sort("color")</code> expression performs the following operations:</p>
<p><code>diamonds.select("color", "price")</code> selects only the <code>color</code> and <code>price</code> columns from the <code>diamonds</code> DataFrame.</p>
<p><code>groupBy("color")</code> groups the data based on the <code>color</code> column.</p>
<p><code>agg(avg("price"))</code> calculates the average price for each group (color). The <code>avg("price")</code> argument specifies that we want to calculate the average of the "price" column.</p>
<p><code>sort("color")</code> sorts the resulting DataFrame in ascending order based on the <code>color</code> column.</p>
<p>Finally, the <code>display()</code> function is used to visualize the resulting DataFrame in a tabular format.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-200.png" alt="Image" width="600" height="400" loading="lazy"></p>
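<p>For intuition, the same group, aggregate, and sort steps can be written in plain Python. This is only a sketch on three invented rows. It runs on a single machine, whereas PySpark distributes the work across the cluster.</p>

```python
from collections import defaultdict

# Hypothetical rows, already reduced to the two selected columns.
rows = [
    {"color": "E", "price": 326},
    {"color": "I", "price": 334},
    {"color": "E", "price": 330},
]

# groupBy("color"): collect the prices that share a color.
prices_by_color = defaultdict(list)
for row in rows:
    prices_by_color[row["color"]].append(row["price"])

# agg(avg("price")) followed by sort("color").
avg_by_color = sorted(
    (color, sum(prices) / len(prices)) for color, prices in prices_by_color.items()
)
print(avg_by_color)
```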
<h2 id="heading-version-control-and-time-travel-in-delta"><strong>Version Control and Time Travel in Delta</strong></h2>
<p>Databricks Delta’s time travel capabilities simplify building data pipelines. They come in handy when auditing data changes, reproducing experiments and reports, or rolling back transactions. They are also useful for disaster recovery, since they let you undo changes and shift back to any specific version of a table.</p>
<p>As you write into a Delta table or directory, every operation is automatically versioned. You can then query the table by referring to a timestamp or a version number.</p>
<p>The command below returns a list of all the versions and timestamps in a table called <code>diamonds</code>:</p>
<pre><code class="lang-sql">DESCRIBE HISTORY diamonds;
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-201.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><code>DESCRIBE HISTORY table_name</code> returns every version of the table along with its timestamp, the operation that produced it, and the user who ran that operation.</p>
<h3 id="heading-restore-setup">Restore Setup</h3>
<p>Delta provides built-in support for backup and restore strategies to handle issues like data corruption or accidental data loss. In our scenario, we'll intentionally delete some rows from the main table to simulate such situations.</p>
<p>We'll then use Delta's restore capability to revert the table to a point in time before the delete operation. By doing so, we can verify if the deletion was successful or if the data was restored correctly to its previous state. This feature ensures data safety and provides an easy way to recover from undesirable changes or failures.</p>
<p>Here's the code:</p>
<pre><code class="lang-sql"><span class="hljs-comment">-- Delete 10 records from the main table</span>
<span class="hljs-keyword">DELETE</span> <span class="hljs-keyword">FROM</span> diamonds <span class="hljs-keyword">WHERE</span> <span class="hljs-string">`_c0`</span> <span class="hljs-keyword">IN</span> (<span class="hljs-number">1</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">4</span>,<span class="hljs-number">5</span>,<span class="hljs-number">6</span>,<span class="hljs-number">7</span>,<span class="hljs-number">8</span>,<span class="hljs-number">9</span>,<span class="hljs-number">10</span>);
<span class="hljs-keyword">SELECT</span> <span class="hljs-keyword">COUNT</span>(*) <span class="hljs-keyword">from</span> diamonds;
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-202.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Row count after deleting 10 records from the main table</em></p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> <span class="hljs-keyword">COUNT</span>(*) <span class="hljs-keyword">FROM</span> diamonds <span class="hljs-keyword">VERSION</span> <span class="hljs-keyword">AS</span> <span class="hljs-keyword">OF</span> <span class="hljs-number">19</span>;
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-203.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Row count by referencing a previous version of the table</em></p>
<h2 id="heading-restoring-from-a-version-number"><strong>Restoring From A Version Number</strong></h2>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-204.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Illustration of how a Version Restore works in Databricks Notebooks</em></p>
<p>The code below restores the <code>diamonds</code> table to version 19 using Delta's time travel feature. After the restore, a <code>SELECT</code> statement retrieves all rows from <code>diamonds</code>, which now match the table as it existed at version 19.</p>
<p>Unlike a <code>VERSION AS OF</code> query, which only reads an older snapshot, <code>RESTORE</code> makes that snapshot the current state of the table. You can then analyze the restored data or compare it with other versions.</p>
<pre><code class="lang-sql"><span class="hljs-comment">-- Restore the diamonds table to its state at version 19 (refer to the images in the previous cell)</span>

<span class="hljs-keyword">RESTORE</span> <span class="hljs-keyword">TABLE</span> diamonds <span class="hljs-keyword">TO</span> <span class="hljs-keyword">VERSION</span> <span class="hljs-keyword">AS</span> <span class="hljs-keyword">OF</span> <span class="hljs-number">19</span>;
<span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">from</span> diamonds;
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-205.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>SELECT query running against the restored version of the table</em></p>
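<p>The difference between reading an old version and restoring it is easier to see in a toy model. The <code>VersionedTable</code> class below is hypothetical and keeps a full snapshot per version. Delta itself records changes in a transaction log instead, so this is a conceptual sketch and not how Delta stores data.</p>

```python
class VersionedTable:
    """Toy model: every write appends a full snapshot to a history list."""

    def __init__(self, rows):
        self._history = [list(rows)]  # version 0

    @property
    def version(self):
        return len(self._history) - 1

    def rows(self, as_of=None):
        """Read the latest snapshot, or an older one by version number."""
        index = self.version if as_of is None else as_of
        return list(self._history[index])

    def delete(self, ids):
        """Write a new version that omits the given ids."""
        kept = [r for r in self.rows() if r["_c0"] not in ids]
        self._history.append(kept)

    def restore(self, version):
        """Make an older snapshot current by committing it as a new version."""
        self._history.append(self.rows(as_of=version))


table = VersionedTable([{"_c0": i} for i in range(1, 21)])
table.delete({1, 2, 3, 4, 5, 6, 7, 8, 9, 10})  # creates version 1
count_after_delete = len(table.rows())
count_at_v0 = len(table.rows(as_of=0))          # read-only time travel
table.restore(0)                                # creates version 2
count_after_restore = len(table.rows())
print(count_after_delete, count_at_v0, count_after_restore)
```

<p>Reading with <code>as_of</code> leaves the current state alone, while <code>restore</code> adds a new version on top of the history, so the delete is undone without erasing the record that it happened.</p>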
<h3 id="heading-autogenerated-fields">Autogenerated Fields</h3>
<p>Let us see how to use auto-increment in Delta with SQL. The code below demonstrates the creation of a table called <code>test__autogen</code> with an "autogenerated" field named <code>id</code>. The <code>id</code> column is defined as <code>BIGINT GENERATED ALWAYS AS IDENTITY</code>, meaning its values will be automatically generated by the database engine during the insertion process.</p>
<p>The <code>id</code> column works as an auto-incrementing key: each new record receives a unique identifier without any manual input. This is the usual way to produce surrogate keys, and it saves you from generating and tracking unique identifiers yourself. Note that identity values are guaranteed to be unique, but they are not guaranteed to be consecutive.</p>
<pre><code class="lang-sql">%sql 
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> test__autogen (
  <span class="hljs-keyword">id</span> <span class="hljs-built_in">BIGINT</span> <span class="hljs-keyword">GENERATED</span> <span class="hljs-keyword">ALWAYS</span> <span class="hljs-keyword">AS</span> <span class="hljs-keyword">IDENTITY</span> ( <span class="hljs-keyword">START</span> <span class="hljs-keyword">WITH</span> <span class="hljs-number">10000</span> <span class="hljs-keyword">INCREMENT</span> <span class="hljs-keyword">BY</span> <span class="hljs-number">1</span> ), 
  <span class="hljs-keyword">name</span> <span class="hljs-keyword">STRING</span>, 
  surname <span class="hljs-keyword">STRING</span>, 
  email <span class="hljs-keyword">STRING</span>, 
  city <span class="hljs-keyword">STRING</span>) ;

<span class="hljs-comment">-- Note that we don't insert data for the id. The engine will handle that for us:</span>
<span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> test__autogen (<span class="hljs-keyword">name</span>, surname, email, city) <span class="hljs-keyword">VALUES</span> (<span class="hljs-string">'Atharva'</span>, <span class="hljs-string">'Shah'</span>, <span class="hljs-string">'highnessatharva@gmail.com'</span>, <span class="hljs-string">'Pune, IN'</span>);
<span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> test__autogen (<span class="hljs-keyword">name</span>, surname, email, city) <span class="hljs-keyword">VALUES</span> (<span class="hljs-string">'James'</span>, <span class="hljs-string">'Dean'</span>, <span class="hljs-string">'james@proton.mail'</span>, <span class="hljs-string">'Tokyo, JP'</span>);

<span class="hljs-comment">-- The ID is automatically generated!</span>
<span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">from</span> test__autogen;
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-206.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Records with an autogenerated</em> <code>id</code></p>
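<p>As a rough mental model, an identity column behaves like a counter that the engine owns. The sketch below uses <code>itertools.count</code> and a hypothetical <code>insert</code> helper. Unlike Delta, this toy counter never skips a value.</p>

```python
from itertools import count

# Mirrors START WITH 10000 INCREMENT BY 1 from the table definition.
next_id = count(start=10000, step=1)


def insert(table, name, surname):
    """Append a row; the caller never supplies the id."""
    table.append({"id": next(next_id), "name": name, "surname": surname})


people = []
insert(people, "Atharva", "Shah")
insert(people, "James", "Dean")

ids = [row["id"] for row in people]
print(ids)
```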
<h2 id="heading-delta-table-cloning"><strong>Delta Table Cloning</strong></h2>
<p>Cloning Delta tables allows you to create a replica of an existing Delta table at a specific version. This feature is particularly valuable when you need to transfer data from a production environment to a staging environment or when archiving a specific version for regulatory purposes.</p>
<p>There are two types of clones available:</p>
<ol>
<li><p><strong>Deep Clone:</strong> This type of clone copies both the source table data and metadata to the clone target. In other words, it replicates the entire table, making it independent of the source.</p>
</li>
<li><p><strong>Shallow Clone:</strong> A shallow clone only replicates the table metadata without copying the actual data files to the clone target. As a result, these clones are more cost-effective to create. However, it's crucial to note that shallow clones act as pointers to the main table. If a <code>VACUUM</code> operation is performed on the original table, it may delete the underlying files and potentially impact the shallow clone.</p>
</li>
</ol>
<p>It's important to remember that any modifications made to either deep or shallow clones only affect the clones themselves and not the source table.</p>
<p>Cloning Delta tables is a powerful feature that simplifies data replication and version archiving, enhancing data management capabilities within your Delta Lake environment.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-207.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Difference between a Shallow Clone and a Deep Clone</em></p>
<p>The code below shows how to clone a table using shallow and deep clones:</p>
<pre><code class="lang-sql"><span class="hljs-comment">-- Shallow clone (zero copy)</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> diamonds__shallow__clone
  SHALLOW <span class="hljs-keyword">CLONE</span> diamonds
  <span class="hljs-keyword">VERSION</span> <span class="hljs-keyword">AS</span> <span class="hljs-keyword">OF</span> <span class="hljs-number">19</span>;

<span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">FROM</span> diamonds__shallow__clone;

<span class="hljs-comment">-- Deep clone (copy data)</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> diamonds__deep__clone
  DEEP <span class="hljs-keyword">CLONE</span> diamonds;

<span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">FROM</span> diamonds__deep__clone;
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/08/image-208.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Selecting records from the deep cloned table</em></p>
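<p>To see why a <code>VACUUM</code> on the source can affect a shallow clone but not a deep one, consider this simplified model. The <code>storage</code> dictionary, the file names, and the <code>read</code> helper are all invented for illustration. Real Delta tables track their files through a transaction log.</p>

```python
import copy

# Hypothetical storage layer: file name -> rows.
storage = {"part-0.parquet": [{"_c0": 1}, {"_c0": 2}]}
source = {"files": ["part-0.parquet"]}


def read(table):
    """Resolve a table's file list against storage."""
    return [row for name in table["files"] for row in storage[name]]


# Shallow clone: copy the metadata only; it still points at the source's files.
shallow = copy.deepcopy(source)

# Deep clone: copy the data into new files and point the clone at them.
deep = {"files": []}
for name in source["files"]:
    storage["clone-" + name] = copy.deepcopy(storage[name])
    deep["files"].append("clone-" + name)

# The source rewrites its data, then a vacuum-like cleanup drops the old file.
storage["part-1.parquet"] = [{"_c0": 2}]
source["files"] = ["part-1.parquet"]
del storage["part-0.parquet"]

deep_rows = read(deep)
try:
    read(shallow)
    shallow_ok = True
except KeyError:
    shallow_ok = False

print(deep_rows, shallow_ok)
```

<p>The deep clone still reads its own copy of the data, while the shallow clone fails because the file it points to is gone.</p>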
<h2 id="heading-delta-magic-commands">Delta Magic Commands</h2>
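<p>Databricks notebooks support magic commands: short directives that start with <code>%</code>. They let you switch a cell's language, run another notebook, work with the file system, or install packages without leaving the notebook.</p>
<p>They are not specific to Delta, but they streamline everyday work with Delta tables in a notebook, as the <code>%sql</code> and <code>%python</code> cells above show. Which commands are available depends on your Databricks Runtime version.</p>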
<ol>
<li><p><code>%run</code>: runs a Python file or a notebook.</p>
</li>
<li><p><code>%sh</code>: executes shell commands on the cluster nodes.</p>
</li>
<li><p><code>%fs</code>: allows you to interact with the Databricks file system.</p>
</li>
<li><p><code>%sql</code>: allows you to run SQL queries.</p>
</li>
<li><p><code>%scala</code>: switches the notebook context to Scala.</p>
</li>
<li><p><code>%python</code>: switches the notebook context to Python.</p>
</li>
<li><p><code>%md</code>: allows you to write markdown text.</p>
</li>
<li><p><code>%r</code>: switches the notebook context to R.</p>
</li>
<li><p><code>%lsmagic</code>: lists all the available magic commands.</p>
</li>
<li><p><code>%jobs</code>: lists all the running jobs.</p>
</li>
<li><p><code>%config</code>: allows you to set configuration options for the notebook.</p>
</li>
<li><p><code>%reload</code>: reloads the contents of a module.</p>
</li>
<li><p><code>%pip</code>: allows you to install Python packages.</p>
</li>
<li><p><code>%load</code>: loads the contents of a file into a cell.</p>
</li>
<li><p><code>%matplotlib</code>: sets up the matplotlib backend.</p>
</li>
<li><p><code>%who</code>: lists all the variables in the current scope.</p>
</li>
<li><p><code>%env</code>: allows you to set environment variables.</p>
</li>
</ol>
<h2 id="heading-conclusion">Conclusion</h2>
<p>This handbook explored Databricks, a platform that unifies analytics and data science in a single workspace. We went through the Databricks Workspace, interactive analytics, and Delta Lake, with a focus on data manipulation and analysis.</p>
<p>Delta enforces data integrity through constraints and supports both SQL commands and DataFrame queries. Version control and time travel let you audit, compare, and restore earlier states of a table, and table cloning lets you experiment on a copy without touching the source.</p>
<p>Your pursuit of data excellence doesn't end here. Let's stay connected: explore more insights on my <a target="_blank" href="https://atharvashah.netlify.app/">blog</a>, consider supporting me with a <a target="_blank" href="https://www.buymeacoffee.com/atharvashah">cup of coffee</a>, and join the conversation on <a target="_blank" href="https://twitter.com/cultist_dev">Twitter</a> and <a target="_blank" href="https://www.linkedin.com/in/atharva-shah-5873a2111/">LinkedIn</a>. Keep the momentum going by checking out a few of my other posts.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ FastAPI Handbook – How to Develop, Test, and Deploy APIs ]]>
                </title>
                <description>
                    <![CDATA[ Welcome to the world of FastAPI, a sleek and high-performance web framework for constructing Python APIs. Don't worry if you're new to API programming – we'll start at the beginning. An API (Application Programming Interface) connects several softwar... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/fastapi-quickstart/</link>
                <guid isPermaLink="false">66d45d9aa44b8bb91150f653</guid>
                
                    <category>
                        <![CDATA[ FastAPI ]]>
                    </category>
                
                    <category>
                        <![CDATA[ handbook ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Atharva Shah ]]>
                </dc:creator>
                <pubDate>Tue, 25 Jul 2023 20:54:10 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2023/07/FastAPI-Handbook-Cover.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Welcome to the world of FastAPI, a sleek and high-performance web framework for constructing Python APIs. Don't worry if you're new to API programming – we'll start at the beginning.</p>
<p>An <strong>API</strong> (Application Programming Interface) lets software programs communicate and exchange information. APIs are essential in modern software development because they expose an application's backend functionality to its clients.</p>
<p>After reading this quick start guide, you will be able to develop a course administration API using <a target="_blank" href="https://fastapi.tiangolo.com/"><strong>FastAPI</strong></a> and <a target="_blank" href="https://www.mongodb.com/"><strong>MongoDB</strong></a>. The best part is that you will not only be writing APIs but also testing and containerizing the app.</p>
<p>In this walkthrough project, we'll create a Python backend system using FastAPI, a fast web framework, and a MongoDB database for course information storage and retrieval.</p>
<p>The system will allow users to access course details, view chapters, rate individual chapters, and aggregate ratings.</p>
<p>The project is designed for Python developers with basic programming knowledge and some NoSQL knowledge. Familiarity with MongoDB, Docker, and PyTest is not required since I will be highlighting everything you need to know for the scope of this project.</p>
<h2 id="heading-what-well-build">What We'll Build</h2>
<p>Here's what we are going to be building:</p>
<p><strong>FastAPI Backend:</strong> It will serve as the interface for handling API requests and responses. FastAPI is chosen for its ease of use, performance, and intuitive design.</p>
<p><strong>MongoDB Database:</strong> A NoSQL database to store course information. MongoDB's flexible schema allows us to store data in JSON-like documents, making it suitable for this project.</p>
<p><strong>Course Information:</strong> Users will be able to view various course details, such as course name, description, instructor, etc.</p>
<p><strong>Chapter Details:</strong> The system will provide information about the chapters in a course, including chapter names, descriptions, and any other relevant data.</p>
<p><strong>Chapter Rating:</strong> Users will have the ability to rate individual chapters. We will implement functionality to record and retrieve chapter ratings.</p>
<p><strong>Course Aggregated Rating:</strong> The system will calculate and display the aggregated rating for each course based on the ratings of its chapters.</p>
<p>This walkthrough shows how to set up a development environment, build a FastAPI backend, integrate MongoDB, define API endpoints, add chapter rating functionality, and compute aggregate course ratings. It covers fundamental project concepts as well as Python, MongoDB, and NoSQL databases.</p>
<p>By the end, this useful backend system will manage chapter details, course information, and user ratings, serving as the basis for a complex and rewarding project.</p>
<p>The goal is to create a system that processes course-related queries. It retrieves the requested course information from MongoDB and returns the response data in a standard format (JSON).</p>
<p>We'll begin with a script that reads the course information from courses.json. This data will be stored in the MongoDB instance. Once the data has been loaded, our API code may connect to this database to allow for simple data retrieval.</p>
<p>The interesting aspect is creating several endpoints with FastAPI. Our API will be able to:</p>
<ul>
<li><p>Fetch a list of all courses</p>
</li>
<li><p>Show a comprehensive course overview</p>
</li>
<li><p>List detailed information about certain chapters</p>
</li>
<li><p>Record user scores for each chapter</p>
</li>
</ul>
<p>Additionally, for each course, we will aggregate all reviews, providing visitors with relevant information regarding course popularity and quality.</p>
<p>This tutorial focuses on building a scalable, efficient, and user-friendly API. Once we've tested everything, we'll containerize the application using Docker. This will greatly simplify deployment, maintenance, and installation.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<p>Here are the sections of this tutorial:</p>
<ul>
<li><p><a class="post-section-overview" href="#heading-api-methods">API Methods</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-client-and-server">Client and Server</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-set-up-the-mongodb-database">How to Set Up the MongoDB Database</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-parse-and-insert-course-data-into-mongodb">How to Parse and Insert Course Data into MongoDB</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-design-the-fastapi-endpoints">How to Design the FastAPI Endpoints</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-automated-api-endpoint-testing-with-pytest">Automated API Endpoint Testing with PyTest</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-containerize-the-application-with-docker">How to Containerize the Application with Docker</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-api-methods">API Methods</h2>
<p>HTTP (Hypertext Transfer Protocol) methods specify the action to be taken on a resource. The following are the most commonly used methods in API development:</p>
<p><strong>GET</strong>: Requests information from a server. When a client submits a GET request, it is requesting data from the server.</p>
<p><strong>POST</strong>: Sends data to the server for processing. When a client submits a POST request, it is often delivering data to the server to create or update a resource.</p>
<p><strong>PUT</strong>: Updates server data. When a client submits a PUT request, the resource indicated in the request is updated.</p>
<p><strong>DELETE</strong>: A client sending a DELETE request is asking for the removal of the specified resource.</p>
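<p>Before bringing in a framework, it can help to see the four methods as branches of a single function. The <code>handle</code> function and the in-memory <code>courses</code> dictionary below are hypothetical and exist only for illustration. FastAPI does this routing for you with decorators.</p>

```python
courses = {}  # in-memory stand-in for a database


def handle(method, course_id, body=None):
    """Return (status_code, payload) for a request on one course."""
    if method == "GET":
        if course_id in courses:
            return 200, courses[course_id]
        return 404, {"detail": "not found"}
    if method == "POST":
        courses[course_id] = body
        return 201, body
    if method == "PUT":
        if course_id not in courses:
            return 404, {"detail": "not found"}
        courses[course_id] = body
        return 200, body
    if method == "DELETE":
        if courses.pop(course_id, None) is None:
            return 404, {"detail": "not found"}
        return 204, None
    return 405, {"detail": "method not allowed"}


created = handle("POST", "py101", {"name": "Python Basics"})
updated = handle("PUT", "py101", {"name": "Python Fundamentals"})
fetched = handle("GET", "py101")
deleted = handle("DELETE", "py101")
missing = handle("GET", "py101")
print(created, updated, fetched, deleted, missing)
```

<p>Each call returns a status code next to its payload: 201 for a created resource, 200 for a successful read or update, 204 for a delete with nothing to return, and 404 once the course no longer exists.</p>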
<h2 id="heading-client-and-server">Client and Server</h2>
<p>The <strong>client</strong> is often a front-end application that sends requests to the server, such as a web browser or a mobile app. The <strong>server</strong>, on the other hand, is the back-end application in charge of processing client requests and responding appropriately.</p>
<p>A request is a communication delivered by the client to the server that specifies the intended action and any required data. The HTTP method, URL (Uniform Resource Locator), headers, and, in the case of POST or PUT requests, the data payload are all part of a request.</p>
<p>After the server gets the <strong>request</strong>, it processes it and returns a <strong>response</strong>. The response is the message given back to the client by the server that contains the requested data or the outcome of the activity.</p>
<p>A response generally comprises an HTTP status code indicating the success or failure of the request, as well as any data sent back to the client by the server.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-131.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Diagram showing how APIs work</em></p>
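<p>The parts of a request described above map directly onto Python's standard library. The sketch below builds a request object without sending it, so it runs without a server. The URL and payload are made up for illustration and are not the endpoints defined later in this tutorial.</p>

```python
import json
from http import HTTPStatus
from urllib.request import Request

# Hypothetical endpoint and payload, for illustration only; nothing is sent.
payload = json.dumps({"rating": 1}).encode("utf-8")
request = Request(
    url="http://localhost:8000/courses/py101/chapters/1",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(request.get_method())    # the HTTP method
print(request.full_url)        # the URL
print(request.header_items())  # the headers
print(request.data)            # the data payload

# A response carries a status code; HTTPStatus names the standard ones.
status = HTTPStatus(201)
print(status.value, status.phrase)
```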
<h2 id="heading-how-to-set-up-the-mongodb-database">How to Set Up the MongoDB Database</h2>
<p>MongoDB is a NoSQL database. It is non-relational and stores data as documents, which are grouped into collections.</p>
<p>Install MongoDB for your operating system from the <a target="_blank" href="https://www.mongodb.com/try/download/community">official website.</a></p>
<p>Now run the <code>mongosh</code> command in your terminal to verify that the installation was successful.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-125.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Running the mongosh command should yield this output</em></p>
<p>Connect to the MongoDB server with <strong>MongoDB Compass</strong>. I recommend that you set up MongoDB by specifying settings such as port number, storage engine, authentication, and so forth.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-124.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Create a new MongoDB connection</em></p>
<p>Now that the connection is established, the next step is to create a database. Call this database "courses". It will be empty for now. In just a minute we'll insert the documents using a Python script.</p>
<h2 id="heading-how-to-parse-and-insert-course-data-into-mongodb">How to Parse and Insert Course Data into MongoDB</h2>
<p>You could insert records one by one, but it is best to use a JSON file to simplify that process. Download this file <a target="_blank" href="https://github.com/HighnessAtharva/fastapi-kimo/blob/master/courses.json"><strong>courses.json</strong></a> from GitHub. All course information is present in it (as a list of courses).</p>
<p>Specifically, each course has the following structure:</p>
<ul>
<li><p><strong>name:</strong> The title of the course.</p>
</li>
<li><p><strong>date:</strong> Creation date as a UNIX timestamp.</p>
</li>
<li><p><strong>description:</strong> The description of the course.</p>
</li>
<li><p><strong>domain:</strong> List of the course domain(s).</p>
</li>
<li><p><strong>chapters:</strong> List of the course chapters. Each chapter has a title name and content text.</p>
</li>
</ul>
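<p>As a rough illustration of that structure, here is a hypothetical course record written as a Python dictionary. The top-level field names come from the list above. The chapter keys (<code>name</code> and <code>text</code>) and every value are assumptions made for the example, so check <code>courses.json</code> for the real data.</p>

```python
from datetime import datetime, timezone

# Hypothetical course record. Only the top-level field names are taken
# from the description above; the chapter keys and all values are made up.
sample_course = {
    "name": "Example Course",
    "date": 1659906000,  # UNIX timestamp: seconds since 1970-01-01 UTC
    "description": "A placeholder description.",
    "domain": ["programming"],
    "chapters": [
        {"name": "Chapter One", "text": "Placeholder content."},
        {"name": "Chapter Two", "text": "More placeholder content."},
    ],
}

# Convert the UNIX timestamp into a readable, timezone-aware datetime.
created = datetime.fromtimestamp(sample_course["date"], tz=timezone.utc)
print(created.isoformat())
```

<p>Storing the date as a plain integer keeps sorting simple, and it can be converted to a readable date whenever it needs to be displayed.</p>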
<p>You will need a few Python packages for this project.</p>
<ul>
<li><p><code>BSON</code> - Binary serialization format that is used in MongoDB for efficient data storage and retrieval. It comes bundled with PyMongo.</p>
</li>
<li><p><code>FastAPI</code> - Web framework for creating Python APIs that offer high performance, automatic validation, interactive documentation, and support for async operations.</p>
</li>
<li><p><code>PyMongo</code> - Official MongoDB driver for Python. It serves as a high-level API for integrating MongoDB within Python.</p>
</li>
<li><p><code>Uvicorn</code> - ASGI server that runs the FastAPI application. It is responsible for starting the server and handling incoming connections.</p>
</li>
<li><p><code>Starlette</code> - ASGI framework that FastAPI is built on. It provides the underlying routing and request handling.</p>
</li>
<li><p><code>Pydantic</code> - Integrated data validation and parsing library. We need it to create interactive API documentation while automatically validating incoming request data and enforcing data type rules.</p>
</li>
</ul>
<p>Install them with pip:</p>
<pre><code class="lang-bash">pip install fastapi pymongo uvicorn starlette pydantic
</code></pre>
<p>Now, let's write a Python script to insert all this course data into the database so that we can start building API routes. Spin up your IDE, create a file called <code>script.py</code>, and make sure it is in the same directory as the <code>courses.json</code> file.</p>
<pre><code class="lang-py"><span class="hljs-string">""" 
Script to parse course information from courses.json, create the appropriate databases and
collection(s) on a local instance of MongoDB, create the appropriate indices (for efficient retrieval)
and finally add the course data on the collection(s).
"""</span>

<span class="hljs-keyword">import</span> pymongo
<span class="hljs-keyword">import</span> json

<span class="hljs-comment"># Connect to MongoDB</span>
client = pymongo.MongoClient(<span class="hljs-string">"mongodb://localhost:27017/"</span>)
db = client[<span class="hljs-string">"courses"</span>]
collection = db[<span class="hljs-string">"courses"</span>]

<span class="hljs-comment"># Read courses from courses.json</span>
<span class="hljs-keyword">with</span> open(<span class="hljs-string">"courses.json"</span>, <span class="hljs-string">"r"</span>) <span class="hljs-keyword">as</span> f:
    courses = json.load(f)

<span class="hljs-comment"># Create index for efficient retrieval</span>
collection.create_index(<span class="hljs-string">"name"</span>)

<span class="hljs-comment"># add rating field to each course</span>
<span class="hljs-keyword">for</span> course <span class="hljs-keyword">in</span> courses:
    course[<span class="hljs-string">'rating'</span>] = {<span class="hljs-string">'total'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'count'</span>: <span class="hljs-number">0</span>}

<span class="hljs-comment"># add rating field to each chapter</span>
<span class="hljs-keyword">for</span> course <span class="hljs-keyword">in</span> courses:
    <span class="hljs-keyword">for</span> chapter <span class="hljs-keyword">in</span> course[<span class="hljs-string">'chapters'</span>]:
        chapter[<span class="hljs-string">'rating'</span>] = {<span class="hljs-string">'total'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'count'</span>: <span class="hljs-number">0</span>}

<span class="hljs-comment"># Add courses to collection</span>
<span class="hljs-keyword">for</span> course <span class="hljs-keyword">in</span> courses:
    collection.insert_one(course)

<span class="hljs-comment"># Close MongoDB connection</span>
client.close()
</code></pre>
<p>This script populates a MongoDB database with the course information from the JSON file.</p>
<p>It begins by connecting to the local MongoDB instance. It reads course data from a file called <code>courses.json</code> and creates an index on the <code>name</code> field to speed up data retrieval. It then adds a <code>rating</code> field to each course and each chapter. Lastly, the course data is inserted into the MongoDB collection.</p>
<p>It's a straightforward script for managing course data in a database. After you run the script, all records from <code>courses.json</code> should be in the courses database. Run it only once, because each run inserts the courses again and a second run would create duplicates. Switch to MongoDB Compass to verify it.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-116.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>You should be able to see the JSON items in your courses database after running the Python script</em></p>
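<p>The rating preparation in the script can also be written as one small function that works on plain Python data, which makes it easy to try without a database. This is a minimal sketch, and <code>add_rating_fields</code> is a hypothetical helper name, not something the original script defines.</p>

```python
import copy


def add_rating_fields(courses):
    """Return a copy of courses with a zeroed rating on every course and chapter."""
    prepared = copy.deepcopy(courses)  # leave the caller's data untouched
    for course in prepared:
        course["rating"] = {"total": 0, "count": 0}
        for chapter in course.get("chapters", []):
            chapter["rating"] = {"total": 0, "count": 0}
    return prepared


# Made-up input that mimics the shape of the course data.
courses = [
    {"name": "Example Course", "chapters": [{"name": "Intro"}, {"name": "Basics"}]}
]
prepared = add_rating_fields(courses)
print(prepared[0]["rating"])
print(prepared[0]["chapters"][0]["rating"])

# With a live connection, the prepared list could then be stored in one call:
# collection.insert_many(prepared)
```

<p>Doing both steps in a single pass avoids looping over the course list twice, and PyMongo's <code>insert_many</code> stores the whole list in one round trip instead of one insert per course.</p>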
<h2 id="heading-how-to-design-the-fastapi-endpoints">How to Design the FastAPI Endpoints</h2>
<p>These API endpoints provide an efficient way to manage course information, retrieve course details, and allow user interactions for rating chapters.</p>
<p>I recommend designing the API endpoints first along with the HTTP request type before writing the code. This acts as a good reference and provides clarity during the coding process.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Endpoint</td><td>Request Type</td><td>Description</td></tr>
</thead>
<tbody>
<tr>
<td>/courses</td><td>GET</td><td>Get a list of all available courses with sorting options. Options: Sort by title (ascending), date (descending), or total course rating (descending). Optional filtering based on domain is supported.</td></tr>
<tr>
<td>/courses/{course_id}</td><td>GET</td><td>Get the overview of a specific course identified by course_id.</td></tr>
<tr>
<td>/courses/{course_id}/{chapter_id}</td><td>GET</td><td>Get information about a specific chapter within a course.</td></tr>
<tr>
<td>/courses/{course_id}/{chapter_id}</td><td>POST</td><td>Rate a specific chapter within a course. Options: Positive rating (1), negative rating (-1). The ratings are aggregated for each course.</td></tr>
</tbody>
</table>
</div>
<p>Okay, time to dive into the API code. Create a brand new Python file and call it <code>main.py</code>:</p>
<pre><code class="lang-py"><span class="hljs-keyword">import</span> contextlib
<span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI, HTTPException, Query
<span class="hljs-keyword">from</span> pymongo <span class="hljs-keyword">import</span> MongoClient
<span class="hljs-keyword">from</span> bson <span class="hljs-keyword">import</span> ObjectId
<span class="hljs-keyword">from</span> fastapi.encoders <span class="hljs-keyword">import</span> jsonable_encoder

app = FastAPI()
client = MongoClient(<span class="hljs-string">'mongodb://localhost:27017/'</span>)
db = client[<span class="hljs-string">'courses'</span>]
</code></pre>
<p>The code imports the required modules and creates an instance of the FastAPI class named <code>app</code>. It also connects to the local MongoDB server using the PyMongo library, and the <code>db</code> variable holds a reference to the courses database.</p>
<p>Let's go over each of these endpoints in more detail now.</p>
<h3 id="heading-the-get-all-courses-endpoint-courses-get">The Get All Courses Endpoint (<code>/courses</code> – GET)</h3>
<p>This endpoint allows you to retrieve a list of all available courses. You can sort the courses based on different criteria, such as alphabetical order (based on the course title in ascending order), date (in descending order), or total course rating (in descending order). Also, we'll allow users to filter the courses based on their domain.</p>
<pre><code class="lang-py"><span class="hljs-meta">@app.get('/courses')</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_courses</span>(<span class="hljs-params">sort_by: str = <span class="hljs-string">'date'</span>, domain: str = None</span>):</span>
    <span class="hljs-comment"># set the rating.total and rating.count to all the courses based on the sum of the chapters rating</span>
    <span class="hljs-keyword">for</span> course <span class="hljs-keyword">in</span> db.courses.find():
        total = <span class="hljs-number">0</span>
        count = <span class="hljs-number">0</span>
        <span class="hljs-keyword">for</span> chapter <span class="hljs-keyword">in</span> course[<span class="hljs-string">'chapters'</span>]:
            <span class="hljs-keyword">with</span> contextlib.suppress(KeyError):
                total += chapter[<span class="hljs-string">'rating'</span>][<span class="hljs-string">'total'</span>]
                count += chapter[<span class="hljs-string">'rating'</span>][<span class="hljs-string">'count'</span>]
        db.courses.update_one({<span class="hljs-string">'_id'</span>: course[<span class="hljs-string">'_id'</span>]}, {<span class="hljs-string">'$set'</span>: {<span class="hljs-string">'rating'</span>: {<span class="hljs-string">'total'</span>: total, <span class="hljs-string">'count'</span>: count}}})


    <span class="hljs-comment"># sort_by == 'date' [DESCENDING]</span>
    <span class="hljs-keyword">if</span> sort_by == <span class="hljs-string">'date'</span>:
        sort_field = <span class="hljs-string">'date'</span>
        sort_order = <span class="hljs-number">-1</span>

    <span class="hljs-comment"># sort_by == 'rating' [DESCENDING]</span>
    <span class="hljs-keyword">elif</span> sort_by == <span class="hljs-string">'rating'</span>:
        sort_field = <span class="hljs-string">'rating.total'</span>
        sort_order = <span class="hljs-number">-1</span>

    <span class="hljs-comment"># sort_by == 'alphabetical' [ASCENDING]</span>
    <span class="hljs-keyword">else</span>:  
        sort_field = <span class="hljs-string">'name'</span>
        sort_order = <span class="hljs-number">1</span>

    query = {}
    <span class="hljs-keyword">if</span> domain:
        query[<span class="hljs-string">'domain'</span>] = domain


    courses = db.courses.find(query, {<span class="hljs-string">'name'</span>: <span class="hljs-number">1</span>, <span class="hljs-string">'date'</span>: <span class="hljs-number">1</span>, <span class="hljs-string">'description'</span>: <span class="hljs-number">1</span>, <span class="hljs-string">'domain'</span>:<span class="hljs-number">1</span>,<span class="hljs-string">'rating'</span>:<span class="hljs-number">1</span>,<span class="hljs-string">'_id'</span>: <span class="hljs-number">0</span>}).sort(sort_field, sort_order)
    <span class="hljs-keyword">return</span> list(courses)
</code></pre>
<p>This code defines an endpoint in the FastAPI application to retrieve a list of all available courses. The endpoint can be accessed using an HTTP GET request to the '/courses' URL.</p>
<p>The <code>@app.get()</code> decorator attached to the <code>get_courses</code> function registers it as the handler for that route.</p>
<p>When a request is made to this endpoint, the code first calculates the total course rating by summing up the ratings of all the chapters in each course. It then updates the <code>rating</code> field of each course in the MongoDB database with the computed total and count of ratings.</p>
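<p>The aggregation step can be tried on its own with plain dictionaries. The sketch below assumes the same <code>rating</code> shape used in this tutorial. <code>aggregate_course_rating</code> is a hypothetical helper name and the sample data is made up.</p>

```python
def aggregate_course_rating(course):
    """Sum the rating totals and counts of all chapters in one course."""
    total = 0
    count = 0
    for chapter in course.get("chapters", []):
        rating = chapter.get("rating")
        if rating is None:
            continue  # chapters without a rating contribute nothing
        total += rating["total"]
        count += rating["count"]
    return {"total": total, "count": count}


# Made-up course: two rated chapters and one that has no rating yet.
course = {
    "chapters": [
        {"rating": {"total": 2, "count": 3}},
        {"rating": {"total": -1, "count": 1}},
        {"name": "Unrated chapter"},
    ]
}
print(aggregate_course_rating(course))
```

<p>Keeping the arithmetic in a separate function like this makes it testable without a database. The endpoint itself performs the same sums inline and writes the result back with <code>update_one</code>.</p>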
<p>Next, the code determines the sorting mode based on the <code>sort_by</code> query parameter. If <code>sort_by</code> is set to <code>date</code>, the courses will be sorted by their creation date in descending order. If it is set to <code>rating</code>, the courses will be sorted by their total rating in descending order. Otherwise, the courses will be sorted alphabetically by their names in ascending order.</p>
<p>If the optional <code>domain</code> query parameter is provided, the code will filter the courses based on the specified domain.</p>
<p>Finally, the code queries the MongoDB database to retrieve the relevant course information, including the course name, creation date, description, domain, and rating. The courses are sorted according to the selected sorting mode and returned as a list.</p>
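<p>The sorting and filtering decisions described above boil down to two small lookups. Here is a minimal sketch of that logic in isolation. <code>resolve_sort</code> and <code>build_query</code> are hypothetical helper names introduced for illustration.</p>

```python
def resolve_sort(sort_by):
    """Map the sort_by query parameter to a (field, direction) pair."""
    if sort_by == "date":
        return "date", -1          # newest first
    if sort_by == "rating":
        return "rating.total", -1  # highest total rating first
    return "name", 1               # anything else: alphabetical, ascending


def build_query(domain=None):
    """Build the MongoDB filter document; an empty dict matches every course."""
    return {"domain": domain} if domain else {}


print(resolve_sort("date"))
print(resolve_sort("rating"))
print(resolve_sort("alphabetical"))
print(build_query("mathematics"))
print(build_query())
```

<p>Note that any unrecognized <code>sort_by</code> value, including a typo, falls through to alphabetical order rather than producing an error. The direction values 1 and -1 are what MongoDB uses for ascending and descending sorts.</p>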
<p>That was the code explanation, but what about the actual API response? Run the command below in your terminal from the current working directory:</p>
<pre><code class="lang-bash">uvicorn main:app --reload
</code></pre>
<p>Uvicorn is an ASGI webserver. You can interact with API endpoints right on your local machine without any external server. On running the above command you should see a success message stating that the server has started.</p>
<p>Fire up your browser and enter <a target="_blank" href="http://127.0.0.1:8000/courses"><code>http://127.0.0.1:8000/courses</code></a> in the URL bar. The output that you will see will be the JSON response directly from the server.</p>
<p>Verify that the first object contains the following:</p>
<pre><code class="lang-json">{
<span class="hljs-attr">"name"</span>: <span class="hljs-string">"Introduction to Programming"</span>,
<span class="hljs-attr">"date"</span>: <span class="hljs-number">1659906000</span>,
<span class="hljs-attr">"description"</span>: <span class="hljs-string">"An introduction to programming using a language called Python. Learn how to read and write code as well as how to test and \"debug\" it. Designed for students with or without prior programming experience who'd like to learn Python specifically. Learn about functions, arguments, and return values (oh my!); variables and types; conditionals and Boolean expressions; and loops. Learn how to handle exceptions, find and fix bugs, and write unit tests; use third-party libraries; validate and extract data with regular expressions; model real-world entities with classes, objects, methods, and properties; and read and write files. Hands-on opportunities for lots of practice. Exercises inspired by real-world programming problems. No software required except for a web browser, or you can write code on your own PC or Mac."</span>,
<span class="hljs-attr">"domain"</span>: [
    <span class="hljs-string">"programming"</span>
    ],
<span class="hljs-attr">"rating"</span>: {
    <span class="hljs-attr">"total"</span>: <span class="hljs-number">6</span>,
    <span class="hljs-attr">"count"</span>: <span class="hljs-number">12</span>
    }
}
</code></pre>
<p>Guess what? It is a list of all the courses that we stored in our database. Your front-end application may now iterate over all these items and present them in a fancy way to the user. That is the power of APIs.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-117.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>The rating for the entire course is updated as the aggregated sum of its chapter ratings.</em></p>
<p>At this point, if you wish to see the documentation for your API, navigate to the <a target="_blank" href="http://127.0.0.1:8000/docs"><code>http://127.0.0.1:8000/docs</code></a> endpoint. This interactive documentation comes prepackaged with FastAPI. How cool is that?</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-126.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>FastAPI docs for all your API endpoints</em></p>
<p>Don't like the plain old look of the docs? Fret not, there is also a <code>/redoc</code> endpoint with a slightly fancier interface. Just navigate to <code>http://127.0.0.1:8000/redoc</code> and you will be greeted with this screen.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-127.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>FastAPI alternate redoc interface with search and download options</em></p>
<h3 id="heading-the-get-course-overview-endpoint-coursescourseid-get">The Get Course Overview Endpoint (<code>/courses/{course_id}</code> – GET)</h3>
<p>You'll use this endpoint to get an overview of a specific course. Simply provide the course_id in the URL, and the API will return detailed information about that particular course.</p>
<pre><code class="lang-py"><span class="hljs-meta">@app.get('/courses/{course_id}')</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_course</span>(<span class="hljs-params">course_id: str</span>):</span>
    course = db.courses.find_one({<span class="hljs-string">'_id'</span>: ObjectId(course_id)}, {<span class="hljs-string">'_id'</span>: <span class="hljs-number">0</span>, <span class="hljs-string">'chapters'</span>: <span class="hljs-number">0</span>})
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> course:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">404</span>, detail=<span class="hljs-string">'Course not found'</span>)
    <span class="hljs-keyword">try</span>:
        course[<span class="hljs-string">'rating'</span>] = course[<span class="hljs-string">'rating'</span>][<span class="hljs-string">'total'</span>]
    <span class="hljs-keyword">except</span> KeyError:
        course[<span class="hljs-string">'rating'</span>] = <span class="hljs-string">'Not rated yet'</span> 

    <span class="hljs-keyword">return</span> course
</code></pre>
<p>This code snippet searches the MongoDB database for the course with the specified <code>course_id</code> and extracts the course information while leaving out the <code>chapters</code> field.</p>
<p>If it cannot find the course, it raises an <code>HTTPException</code> with the status code 404. If it finds it, it tries to read the <code>rating</code> field and replaces it with its <code>total</code> value to display the total rating. If the field is missing, <code>rating</code> is set to <code>Not rated yet</code>.</p>
<p>Finally, without the <code>chapters</code> field, it returns the JSON response of the course information, including the total rating.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-118.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Single Course Overview Endpoint Response</em></p>
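<p>One detail worth knowing: <code>ObjectId(course_id)</code> raises <code>bson.errors.InvalidId</code> when the string is not a valid 24-character hexadecimal value, so a malformed ID in the URL would surface as a server error instead of a 404. A minimal sketch of a pre-check using only the standard library follows. <code>looks_like_object_id</code> is a hypothetical helper, and PyMongo's own <code>ObjectId.is_valid()</code> can serve the same purpose.</p>

```python
import re

# A string ObjectId is exactly 24 hexadecimal characters.
OBJECT_ID_PATTERN = re.compile(r"[0-9a-fA-F]{24}")


def looks_like_object_id(value):
    """Return True if value is a 24-character hex string."""
    return isinstance(value, str) and OBJECT_ID_PATTERN.fullmatch(value) is not None


print(looks_like_object_id("0123456789abcdef01234567"))  # well-formed ID string
print(looks_like_object_id("not-an-id"))                 # malformed
```

<p>An endpoint could run a check like this first and raise <code>HTTPException(status_code=404, detail='Course not found')</code> when it fails, before ever calling <code>ObjectId()</code>.</p>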
<h3 id="heading-get-specific-chapter-information-endpoint-coursescourseidchapterid-get">Get Specific Chapter Information Endpoint (<code>/courses/{course_id}/{chapter_id}</code> – GET)</h3>
<p>Hitting this endpoint returns specific information about a chapter within a course. By specifying both the <code>course_id</code> and the <code>chapter_id</code> in the URL, you can access the details of that particular chapter.</p>
<pre><code class="lang-py"><span class="hljs-meta">@app.get('/courses/{course_id}/{chapter_id}')</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_chapter</span>(<span class="hljs-params">course_id: str, chapter_id: str</span>):</span>    
    course = db.courses.find_one({<span class="hljs-string">'_id'</span>: ObjectId(course_id)}, {<span class="hljs-string">'_id'</span>: <span class="hljs-number">0</span>, })
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> course:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">404</span>, detail=<span class="hljs-string">'Course not found'</span>)
    chapters = course.get(<span class="hljs-string">'chapters'</span>, [])
    <span class="hljs-keyword">try</span>:
        chapter = chapters[int(chapter_id)]
    <span class="hljs-keyword">except</span> (ValueError, IndexError) <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">404</span>, detail=<span class="hljs-string">'Chapter not found'</span>) <span class="hljs-keyword">from</span> e
    <span class="hljs-keyword">return</span> chapter
</code></pre>
<p>As you might expect, <code>course_id</code> is the course identifier, and <code>chapter_id</code> is the chapter identifier inside that course. Here, the chapter identifier is the chapter's position in the course's list of chapters, starting at 0.</p>
<p>When a request is made to this endpoint, the code first searches the MongoDB database for the course with the specified <code>course_id</code>, excluding the <code>_id</code> field from the result.</p>
<p>If the course with the supplied <code>course_id</code> cannot be found in the database, the code raises an <code>HTTPException</code> with the status code 404, indicating that the course could not be located.</p>
<p>The code then uses the dictionary's <code>get()</code> method to retrieve the list of chapters for the course, defaulting to an empty list if the <code>chapters</code> field does not exist.</p>
<p>Using the <code>chapter_id</code> provided in the request, the code then attempts to retrieve the matching chapter from the list of chapters. If the <code>chapter_id</code> is not a valid integer or is out of range for the list of chapters, the code raises an <code>HTTPException</code> with the status code 404. This indicates that it could not locate the chapter.</p>
<p>If it locates the chapter, the response contains information on the individual chapter within the course.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-119.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Chapter Detail Endpoint</em></p>
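<p>Because the chapter is looked up by list position, Python's negative indexing applies: with the endpoint as written, a <code>chapter_id</code> of <code>-1</code> would return the last chapter rather than a 404. Below is a minimal sketch of a stricter lookup, assuming that only non-negative positions should count as valid. <code>find_chapter</code> is a hypothetical helper name.</p>

```python
def find_chapter(chapters, chapter_id):
    """Return the chapter at position chapter_id, or None if it cannot be found."""
    try:
        index = int(chapter_id)
    except ValueError:
        return None  # not a number at all
    if index < 0 or index >= len(chapters):
        return None  # out of range, including negative positions
    return chapters[index]


# Made-up chapter list.
chapters = [{"name": "Intro"}, {"name": "Basics"}]
print(find_chapter(chapters, "0"))
print(find_chapter(chapters, "5"))
print(find_chapter(chapters, "-1"))
print(find_chapter(chapters, "abc"))
```

<p>Returning <code>None</code> keeps the helper independent of FastAPI. The endpoint can translate a <code>None</code> result into the same 404 response it already uses.</p>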
<h3 id="heading-rate-chapter-endpoint-coursescourseidchapterid-post">Rate Chapter Endpoint (<code>/courses/{course_id}/{chapter_id}</code> – POST)</h3>
<p>This endpoint allows users to rate individual chapters within a course. You can provide a rating of 1 for a positive review or -1 for a negative review. The API aggregates all the ratings for each course, providing valuable feedback for future improvements.</p>
<p>Up until now, we've mostly seen GET requests. But now let's see how you can send data to the server, validate it, and insert it in the application database.</p>
<pre><code class="lang-py"><span class="hljs-meta">@app.post('/courses/{course_id}/{chapter_id}')</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">rate_chapter</span>(<span class="hljs-params">course_id: str, chapter_id: str, rating: int = Query(<span class="hljs-params">..., gt=<span class="hljs-number">-2</span>, lt=<span class="hljs-number">2</span></span>)</span>):</span>
    course = db.courses.find_one({<span class="hljs-string">'_id'</span>: ObjectId(course_id)}, {<span class="hljs-string">'_id'</span>: <span class="hljs-number">0</span>, })
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> course:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">404</span>, detail=<span class="hljs-string">'Course not found'</span>)
    chapters = course.get(<span class="hljs-string">'chapters'</span>, [])
    <span class="hljs-keyword">try</span>:
        chapter = chapters[int(chapter_id)]
    <span class="hljs-keyword">except</span> (ValueError, IndexError) <span class="hljs-keyword">as</span> e:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">404</span>, detail=<span class="hljs-string">'Chapter not found'</span>) <span class="hljs-keyword">from</span> e
    <span class="hljs-keyword">try</span>:
        chapter[<span class="hljs-string">'rating'</span>][<span class="hljs-string">'total'</span>] += rating
        chapter[<span class="hljs-string">'rating'</span>][<span class="hljs-string">'count'</span>] += <span class="hljs-number">1</span>
    <span class="hljs-keyword">except</span> KeyError:
        chapter[<span class="hljs-string">'rating'</span>] = {<span class="hljs-string">'total'</span>: rating, <span class="hljs-string">'count'</span>: <span class="hljs-number">1</span>}
    db.courses.update_one({<span class="hljs-string">'_id'</span>: ObjectId(course_id)}, {<span class="hljs-string">'$set'</span>: {<span class="hljs-string">'chapters'</span>: chapters}})
    <span class="hljs-keyword">return</span> chapter
</code></pre>
<p>We have put in place an endpoint for users to rate each chapter within a course using an HTTP POST request to the <code>/courses/{course_id}/{chapter_id}</code> URL. Users can provide a rating value of 1 for a positive rating or -1 for a negative rating. The code queries the MongoDB database to find the course with the specified <code>course_id</code>, excluding the <code>_id</code> field.</p>
<p>If it doesn't find the course, it raises an HTTP exception with a status code of 404. The code retrieves the list of chapters, setting the default value to an empty list.</p>
<p>If the <code>chapter_id</code> is not a valid integer or is out of range, it raises an <code>HTTPException</code> with a status code of 404. If the chapter is found, the code updates its rating by incrementing the <code>total</code> rating value with the provided rating and incrementing the <code>count</code> value.</p>
<p>If the chapter does not have an existing <code>rating</code> field, it creates one and initializes it with the provided rating and a count of 1. The modified list of chapters is then written back to the database, and the updated chapter is returned as the response, providing feedback to the user about their rating for that chapter.</p>
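<p>The rating update itself is simple arithmetic on a dictionary, so it can be sketched without a database. Note that the bounds <code>gt=-2</code> and <code>lt=2</code> in the endpoint also admit 0 as a rating. The hypothetical helper below, <code>apply_rating</code>, assumes you want to accept only 1 and -1.</p>

```python
def apply_rating(chapter, rating):
    """Add one rating (1 or -1) to a chapter dict and return the chapter."""
    if rating not in (1, -1):
        raise ValueError("rating must be 1 or -1")
    # Create a zeroed rating the first time a chapter is rated.
    current = chapter.setdefault("rating", {"total": 0, "count": 0})
    current["total"] += rating
    current["count"] += 1
    return chapter


# Made-up chapter with no rating yet.
chapter = {"name": "Intro"}
apply_rating(chapter, 1)
print(chapter["rating"])
apply_rating(chapter, -1)
print(chapter["rating"])
```

<p>Using <code>setdefault</code> covers both the first rating and later ones in a single code path, which is an alternative to the <code>try</code>/<code>except KeyError</code> approach used in the endpoint.</p>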
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-120.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>POST Request to add a rating to a chapter</em></p>
<p>To make a POST request, open the docs and click on the request highlighted in the above image. Then, click on "Try it out", fill in the post data, and press the Execute button right below. This sends the POST data to the server which is then validated.</p>
<p>If all the submitted data is as expected, the server accepts the request and returns the 200 status code, meaning that the operation was successful. The submitted rating is now stored in the MongoDB document.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-121.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Post Request Success</em></p>
<p>That's a wrap on the API development part.</p>
<h2 id="heading-automated-api-endpoint-testing-with-pytest">Automated API Endpoint Testing with PyTest</h2>
<p>As the complexity of modern web applications increases, so does the number of API endpoints and their interactions.</p>
<p>In a dynamic e-commerce web app, there could be hundreds of endpoints, each supporting multiple HTTP request methods. And these endpoints might be intricately interconnected.</p>
<p>Ensuring the proper functioning of all these endpoints after each development iteration becomes a formidable task for developers and QA teams. Here is where automated testing comes to the rescue.</p>
<p>Create a file <code>test_app.py</code> in the same directory as <code>courses.json</code> and <code>main.py</code>:</p>
<pre><code class="lang-py"><span class="hljs-keyword">from</span> fastapi.testclient <span class="hljs-keyword">import</span> TestClient
<span class="hljs-keyword">from</span> pymongo <span class="hljs-keyword">import</span> MongoClient
<span class="hljs-keyword">from</span> bson <span class="hljs-keyword">import</span> ObjectId
<span class="hljs-keyword">import</span> pytest
<span class="hljs-keyword">from</span> main <span class="hljs-keyword">import</span> app

client = TestClient(app)
mongo_client = MongoClient(<span class="hljs-string">'mongodb://localhost:27017/'</span>)
db = mongo_client[<span class="hljs-string">'courses'</span>]
</code></pre>
<p>That sets up an automated testing environment.</p>
<p><strong>FastAPI Test Client</strong> simulates HTTP requests to the web app. With this, you can pretend to be a user, sending requests to your app and getting responses back, just like a real user would.</p>
<p>The <strong>MongoDB connection</strong> gives the tests direct access to the course data, so they can read and update documents through <code>MongoClient</code>.</p>
<p><strong>Test database</strong>: note that the setup above opens the same <code>courses</code> database that <code>main.py</code> uses, so tests that write data will change the actual course documents. Point the tests at a separate database if you need isolation.</p>
<p>With this configuration, you can now create test functions that send requests to your FastAPI app using the TestClient. These tests interact with your MongoDB database, so run them against data you can afford to modify.</p>
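<p>Because the setup code opens the database by name, how isolated the tests are depends on which name the tests and the app use. One possible approach, sketched here under the assumption that both would read a hypothetical <code>COURSES_DB_NAME</code> environment variable, is to make the name configurable:</p>

```python
import os


def resolve_database_name(default="courses_test"):
    """Pick the database name from the environment, falling back to a test name.

    COURSES_DB_NAME is a hypothetical variable invented for this sketch.
    """
    return os.environ.get("COURSES_DB_NAME", default)


print(resolve_database_name())

# With a live connection, the result would be used like this:
# db = mongo_client[resolve_database_name()]
```

<p>For this to isolate anything, <code>main.py</code> would need to choose its database the same way, and the test database would need its own copy of the course data.</p>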
<h3 id="heading-how-to-test-the-get-courses-list-endpoint">How to Test the "Get Courses List" Endpoint</h3>
<p>These test functions use <code>TestClient</code> to interact with the "/courses" endpoint of the FastAPI application. They check if the endpoint behaves as expected when different parameters, such as sorting and filtering by domain, are provided.</p>
<p>The tests verify the status codes, data presence, sorting order, and domain filtering in the API responses, ensuring the functionality of the course endpoint is correct and reliable.</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_courses_no_params</span>():</span>
    response = client.get(<span class="hljs-string">"/courses"</span>)
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">200</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_courses_sort_by_alphabetical</span>():</span>
    response = client.get(<span class="hljs-string">"/courses?sort_by=alphabetical"</span>)
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">200</span>
    courses = response.json()
    <span class="hljs-keyword">assert</span> len(courses) &gt; <span class="hljs-number">0</span>
    <span class="hljs-keyword">assert</span> sorted(courses, key=<span class="hljs-keyword">lambda</span> x: x[<span class="hljs-string">'name'</span>]) == courses


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_courses_sort_by_date</span>():</span>
    response = client.get(<span class="hljs-string">"/courses?sort_by=date"</span>)
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">200</span>
    courses = response.json()
    <span class="hljs-keyword">assert</span> len(courses) &gt; <span class="hljs-number">0</span>
    <span class="hljs-keyword">assert</span> sorted(courses, key=<span class="hljs-keyword">lambda</span> x: x[<span class="hljs-string">'date'</span>], reverse=<span class="hljs-literal">True</span>) == courses

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_courses_sort_by_rating</span>():</span>
    response = client.get(<span class="hljs-string">"/courses?sort_by=rating"</span>)
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">200</span>
    courses = response.json()
    <span class="hljs-keyword">assert</span> len(courses) &gt; <span class="hljs-number">0</span>
    <span class="hljs-keyword">assert</span> sorted(courses, key=<span class="hljs-keyword">lambda</span> x: x[<span class="hljs-string">'rating'</span>][<span class="hljs-string">'total'</span>], reverse=<span class="hljs-literal">True</span>) == courses

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_courses_filter_by_domain</span>():</span>
    response = client.get(<span class="hljs-string">"/courses?domain=mathematics"</span>)
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">200</span>
    courses = response.json()
    <span class="hljs-keyword">assert</span> len(courses) &gt; <span class="hljs-number">0</span>
    <span class="hljs-keyword">assert</span> all([c[<span class="hljs-string">'domain'</span>][<span class="hljs-number">0</span>] == <span class="hljs-string">'mathematics'</span> <span class="hljs-keyword">for</span> c <span class="hljs-keyword">in</span> courses])

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_courses_filter_by_domain_and_sort_by_alphabetical</span>():</span>
    response = client.get(<span class="hljs-string">"/courses?domain=mathematics&amp;sort_by=alphabetical"</span>)
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">200</span>
    courses = response.json()
    <span class="hljs-keyword">assert</span> len(courses) &gt; <span class="hljs-number">0</span>
    <span class="hljs-keyword">assert</span> all([c[<span class="hljs-string">'domain'</span>][<span class="hljs-number">0</span>] == <span class="hljs-string">'mathematics'</span> <span class="hljs-keyword">for</span> c <span class="hljs-keyword">in</span> courses])
    <span class="hljs-keyword">assert</span> sorted(courses, key=<span class="hljs-keyword">lambda</span> x: x[<span class="hljs-string">'name'</span>]) == courses

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_courses_filter_by_domain_and_sort_by_date</span>():</span>
    response = client.get(<span class="hljs-string">"/courses?domain=mathematics&amp;sort_by=date"</span>)
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">200</span>
    courses = response.json()
    <span class="hljs-keyword">assert</span> len(courses) &gt; <span class="hljs-number">0</span>
    <span class="hljs-keyword">assert</span> all([c[<span class="hljs-string">'domain'</span>][<span class="hljs-number">0</span>] == <span class="hljs-string">'mathematics'</span> <span class="hljs-keyword">for</span> c <span class="hljs-keyword">in</span> courses])
    <span class="hljs-keyword">assert</span> sorted(courses, key=<span class="hljs-keyword">lambda</span> x: x[<span class="hljs-string">'date'</span>], reverse=<span class="hljs-literal">True</span>) == courses
</code></pre>
<p>Pay attention to the <code>assert</code> statements. Each one compares an expected result with the actual result: if the comparison evaluates to <code>True</code>, the test continues, and if it evaluates to <code>False</code>, the assertion fails and so does the test. The objective is to get all the tests to pass by making the actual values match the expected ones.</p>
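<p>The sorting checks in the tests above all follow the same idiom: sort a copy of the response and compare it with the response itself. Here is a minimal, standalone sketch of that idiom. The helper name <code>is_sorted_by</code> and the sample data are hypothetical and only serve as an illustration.</p>

```python
def is_sorted_by(items, key, reverse=False):
    """Return True if items are already ordered by the given key."""
    return sorted(items, key=key, reverse=reverse) == items


# Hypothetical sample data shaped loosely like the API response.
courses = [
    {"name": "Algebra", "date": 3},
    {"name": "Calculus", "date": 2},
    {"name": "Statistics", "date": 1},
]

# An assert passes silently when its expression is truthy
# and raises AssertionError otherwise.
assert is_sorted_by(courses, key=lambda c: c["name"])
assert is_sorted_by(courses, key=lambda c: c["date"], reverse=True)
```

<p>If the list were in a different order, the sorted copy would no longer equal the original and the assertion would fail.</p>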
<h3 id="heading-how-to-test-the-get-single-course-info-endpoint">How to Test the "Get Single Course Info" Endpoint</h3>
<p>The tests use TestClient to send requests to the "/courses/course id" endpoint, then retrieve the same course directly from the MongoDB database using the <code>db.courses.find_one</code> function. Comparing the API response to the database record tells you whether the endpoint returns the right course for an existing ID, while the second test checks that a non-existent ID produces a 404 response.</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_course_by_id_exists</span>():</span>
    response = client.get(<span class="hljs-string">"/courses/6431137ab5da949e5978a281"</span>)
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">200</span>
    course = response.json()
    <span class="hljs-comment"># get the course from the database</span>
    course_db = db.courses.find_one({<span class="hljs-string">'_id'</span>: ObjectId(<span class="hljs-string">'6431137ab5da949e5978a281'</span>)})
    <span class="hljs-comment"># get the name of the course from the database</span>
    name_db = course_db[<span class="hljs-string">'name'</span>]
    <span class="hljs-comment"># get the name of the course from the response</span>
    name_response = course[<span class="hljs-string">'name'</span>]
    <span class="hljs-comment"># compare the two</span>
    <span class="hljs-keyword">assert</span> name_db == name_response


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_course_by_id_not_exists</span>():</span>
    response = client.get(<span class="hljs-string">"/courses/6431137ab5da949e5978a280"</span>)
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">404</span>
    <span class="hljs-keyword">assert</span> response.json() == {<span class="hljs-string">'detail'</span>: <span class="hljs-string">'Course not found'</span>}
</code></pre>
<h3 id="heading-how-to-test-the-get-course-chapter-info-endpoint">How to Test the "Get Course Chapter Info" Endpoint</h3>
<p>These tests use the TestClient to request the "/courses/course id/chapter number" endpoint, which is expected to return the chapter information for a given course ID and chapter number.</p>
<p>We use assertions to check that the response includes the anticipated data for an existing chapter, and that a non-existent chapter produces a "Chapter not found" response with a 404 status code.</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_chapter_info</span>():</span>
    response = client.get(<span class="hljs-string">"/courses/6431137ab5da949e5978a281/1"</span>)
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">200</span>
    chapter = response.json()
    <span class="hljs-keyword">assert</span> chapter[<span class="hljs-string">'name'</span>] == <span class="hljs-string">'Big Picture of Calculus'</span>
    <span class="hljs-keyword">assert</span> chapter[<span class="hljs-string">'text'</span>] == <span class="hljs-string">'Highlights of Calculus'</span>


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_get_chapter_info_not_exists</span>():</span>
    response = client.get(<span class="hljs-string">"/courses/6431137ab5da949e5978a281/990"</span>)
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">404</span>
    <span class="hljs-keyword">assert</span> response.json() == {<span class="hljs-string">'detail'</span>: <span class="hljs-string">'Chapter not found'</span>}
</code></pre>
<h3 id="heading-how-to-test-the-post-course-rating-endpoint">How to Test the "Post Course Rating" Endpoint</h3>
<p>To test the rating capability, the test function defines the course ID, chapter ID, and rating variables. It uses the TestClient's <code>post</code> method to submit a POST request to the "/courses/course id/chapter id" endpoint, providing the course ID and chapter number in the URL and passing the rating as a query parameter.</p>
<p>In this way, the test mimics a user rating a certain chapter of a course. It expects a successful response with a 200 status code, checks that the JSON body contains the "name" and "rating" keys, and checks that "rating" in turn contains "total" and "count". Finally, it asserts that the total rating and rating count are both greater than 0, indicating that the chapter has been rated.</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_rate_chapter</span>():</span>
    course_id = <span class="hljs-string">"6431137ab5da949e5978a281"</span>
    chapter_id = <span class="hljs-string">"1"</span>
    rating = <span class="hljs-number">1</span>

    response = client.post(<span class="hljs-string">f"/courses/<span class="hljs-subst">{course_id}</span>/<span class="hljs-subst">{chapter_id}</span>?rating=<span class="hljs-subst">{rating}</span>"</span>)

    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">200</span>

    <span class="hljs-comment"># Check if the response body has the expected structure</span>
    <span class="hljs-keyword">assert</span> <span class="hljs-string">"name"</span> <span class="hljs-keyword">in</span> response.json()
    <span class="hljs-keyword">assert</span> <span class="hljs-string">"rating"</span> <span class="hljs-keyword">in</span> response.json()
    <span class="hljs-keyword">assert</span> <span class="hljs-string">"total"</span> <span class="hljs-keyword">in</span> response.json()[<span class="hljs-string">"rating"</span>]
    <span class="hljs-keyword">assert</span> <span class="hljs-string">"count"</span> <span class="hljs-keyword">in</span> response.json()[<span class="hljs-string">"rating"</span>]

    <span class="hljs-keyword">assert</span> response.json()[<span class="hljs-string">"rating"</span>][<span class="hljs-string">"total"</span>] &gt; <span class="hljs-number">0</span>
    <span class="hljs-keyword">assert</span> response.json()[<span class="hljs-string">"rating"</span>][<span class="hljs-string">"count"</span>] &gt; <span class="hljs-number">0</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_rate_chapter_not_exists</span>():</span>
    response = client.post(<span class="hljs-string">"/courses/6431137ab5da949e5978a281/990/rate"</span>, json={<span class="hljs-string">"rating"</span>: <span class="hljs-number">1</span>})
    <span class="hljs-keyword">assert</span> response.status_code == <span class="hljs-number">404</span>
    <span class="hljs-keyword">assert</span> response.json() == {<span class="hljs-string">'detail'</span>: <span class="hljs-string">'Not Found'</span>}
</code></pre>
<p>This verification makes sure that the rating addition endpoint works as intended, with the API returning the correct success code and expected information about the chapter, including its name and updated rating details.</p>
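<p>The structural checks in <code>test_rate_chapter</code> call <code>response.json()</code> several times. As a sketch, they could be gathered into one small helper that works on the parsed body. The helper name <code>has_rating_shape</code> and the sample dictionary are hypothetical and are not part of the original project.</p>

```python
def has_rating_shape(body):
    """Check that a parsed response body carries the keys the test expects."""
    rating = body.get("rating")
    return (
        "name" in body
        and isinstance(rating, dict)
        and "total" in rating
        and "count" in rating
    )


# Hypothetical response body, used only to exercise the helper.
sample = {"name": "Big Picture of Calculus", "rating": {"total": 4, "count": 2}}

assert has_rating_shape(sample)
assert not has_rating_shape({"name": "No rating yet"})
```

<p>Inside a test, you would parse the body once with <code>body = response.json()</code> and then assert on <code>has_rating_shape(body)</code>.</p>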
<p>By running the <code>pytest</code> command, all the test functions in the <code>test_app.py</code> file will be executed, and you'll get feedback on whether the endpoints are functioning as expected or if any errors or regressions have occurred. This allows developers and QA teams to catch issues early in the development cycle and maintain the application's reliability and stability.</p>
<p>As you can see in the image below, all the tests are passing. Good job! As you keep on adding more features and endpoints to the app, keep adding the associated tests in order to validate correctness. Writing the tests first and then the code that makes them pass is the core idea behind <a target="_blank" href="https://www.freecodecamp.org/news/an-introduction-to-test-driven-development-c4de6dce5c/">Test Driven Development (TDD)</a>.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-122.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Running API Tests with Pytest</em></p>
<p>Running the <code>pytest</code> command shows the output illustrated in the image above. It says that 13 tests passed. This means that all our endpoints are functional and return the expected responses.</p>
<p>Endpoint testing verifies that an application's essential operations are correct. It helps you detect regressions, check that components integrate properly, and resolve errors early. Load, performance, and security testing build on the same foundation, so that potential weaknesses and vulnerabilities can be noted and tagged for inspection.</p>
<p>Pytest helps you make sure that API endpoints work well together, and also helps you deal with failures and edge cases before they reach users.</p>
<h2 id="heading-how-to-containerize-the-application-with-docker">How to Containerize the Application with Docker</h2>
<p>You can put your application and all of its dependencies together into a single unit called a container. This is called <strong>containerization</strong>. It separates the application from the underlying system, which maintains consistency across different operating systems.</p>
<p><strong>Docker</strong> is a modern containerization technology that makes it easier to create, distribute, and execute containers. It enables developers to consistently and reproducibly build, ship, and execute apps without building from source.</p>
<p>Get Docker installed from here: <a target="_blank" href="https://www.docker.com/get-started">https://www.docker.com/get-started</a>.</p>
<p>Dockerizing Python programs helps you make sure that they run consistently across multiple computers, eliminating compatibility difficulties. It containerizes the software, its dependencies, and customizations, making it portable.</p>
<p>In the same directory as other files, make a new file called <code>Dockerfile</code>. Note that it does not require any extension.</p>
<pre><code class="lang-dockerfile"><span class="hljs-comment"># Use an official Python runtime as a parent image</span>
<span class="hljs-keyword">FROM</span> python:<span class="hljs-number">3.9</span>-slim-buster

<span class="hljs-comment"># Set the working directory to /app</span>
<span class="hljs-keyword">WORKDIR</span><span class="bash"> /app</span>

<span class="hljs-comment"># Copy the requirements file into the container at /app</span>
<span class="hljs-keyword">COPY</span><span class="bash"> ./requirements.txt /app/requirements.txt</span>

<span class="hljs-comment"># Install any needed packages specified in requirements.txt</span>
<span class="hljs-keyword">RUN</span><span class="bash"> pip install --no-cache-dir --upgrade -r /app/requirements.txt</span>

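<span class="hljs-comment"># Copy the rest of the application code into the container</span>
<span class="hljs-keyword">COPY</span><span class="bash"> . /app</span>

<span class="hljs-comment"># Run the FastAPI app with uvicorn when the container launches</span>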
<span class="hljs-keyword">CMD</span><span class="bash"> [<span class="hljs-string">"uvicorn"</span>, <span class="hljs-string">"main:app"</span>, <span class="hljs-string">"--host"</span>, <span class="hljs-string">"0.0.0.0"</span>, <span class="hljs-string">"--port"</span>, <span class="hljs-string">"80"</span>]</span>
</code></pre>
<p>Starting with the official Python 3.9 slim image, the Dockerfile defines the image's blueprint.</p>
<p>It changes the working directory to <code>/app</code>, which is where the application code will be stored. This project's requirements are listed in the <code>requirements.txt</code> file, which is copied into the container.</p>
<p>The RUN command uses pip to install Python requirements. COPY moves the app's code from the host to the container's /app directory. CMD provides the command that will be executed when the container starts.</p>
<p>In this case, it runs "uvicorn main:app" (the main.py FastAPI app) with host set to 0.0.0.0 and port 80.</p>
<h3 id="heading-how-to-run-the-docker-container">How to Run the Docker Container</h3>
<p>Build the Docker image in the same directory as the Dockerfile using: <code>docker build -t my_python_app .</code></p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-123.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Containerizing the FastAPI app with Docker</em></p>
<p>Run the container in detached mode using the command <code>docker run -d -p 80:80 my_python_app</code>.</p>
<p>Once you do this, you can view the status of the containers and the image from Docker Desktop.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-128.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Docker Desktop shows that our container image is now in a running state on port 80</em></p>
<h3 id="heading-how-to-terminate-the-docker-container">How to Terminate the Docker Container</h3>
<p>Find the container ID or name with <code>docker ps</code>. Stop the container using its ID or name: <code>docker stop &lt;container_id_or_name&gt;</code></p>
<p>This walkthrough has only addressed development, testing, and containerization. Just note that post-deployment container security, if neglected, introduces risks like vulnerabilities, misconfigurations, and attacks. You should ideally take advantage of a <a target="_blank" href="https://www.accuknox.com/blog/cnapp-buyers-guide">CNAPP</a> (Cloud Native Application Protection Platform) to scan images, stick to best practices, and monitor running containers for protection.</p>
<p>The takeaway is that Docker containerization allows bundling of Python scripts with dependencies, making them consistent and portable. The Dockerfile describes how the image should be created.</p>
<p>Running the container after it has been constructed is as simple as issuing a single command. It's just as simple to put a stop to it. Docker makes it simple to manage Python application distribution.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>This tutorial was a quick start guide to help you leverage the power of FastAPI. We built a course administration API that efficiently handles queries related to courses.</p>
<p>We did this by importing course data from a JSON file into MongoDB and then creating multiple endpoints for users to access course lists, overviews, chapter information, and user scores. We also added a review aggregation feature to demonstrate using HTTP POST and HTTP GET methods so that you can grab data as well as post data to the server.</p>
<p>Pytest helped us handle automated testing, ensuring dependability and stability. We then containerized the application with Docker, which simplifies deployment and maintenance.</p>
<p>My <a target="_blank" href="https://github.com/HighnessAtharva/fastapi-kimo/">GitHub Repository</a> contains the complete code covered in this quick start walkthrough. Subscribe to my <a target="_blank" href="https://atharvashah.netlify.app/">technical blog</a> for technical cheat sheets and resources.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Build a Tiered List Maker with Python ]]>
                </title>
                <description>
                    <![CDATA[ Hello Pythonistas! Do you want to level up your Python and API skills while also building something really useful? Well, then you're in the right place. This hands-on tutorial showcases how to leverage Python's capabilities to code an interactive tie... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/python-tier-list-maker/</link>
                <guid isPermaLink="false">66d45d9c680e33282da25e1c</guid>
                
                    <category>
                        <![CDATA[ projects ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Atharva Shah ]]>
                </dc:creator>
                <pubDate>Fri, 07 Jul 2023 20:48:16 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2023/07/python-tier-list-1.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Hello Pythonistas! Do you want to level up your Python and API skills while also building something really useful? Well, then you're in the right place.</p>
<p>This hands-on tutorial showcases how to leverage Python's capabilities to code an interactive tiered list builder right within your terminal.</p>
<p>We'll use some helpful Python libraries along the way to build a practical tool that allows you to rank and organize your favorite albums engagingly and efficiently in seconds.</p>
<h2 id="heading-project-overview">Project Overview</h2>
<p>Tiered lists are categorizing tools used to rank items based on preference. They're used in music, movies, and other areas. The album tiered list in this project assigns records to different tiers depending on your personal choices.</p>
<p>This step-by-step guide leverages the power of Python libraries like <a target="_blank" href="https://github.com/Textualize/rich"><strong>Rich</strong></a><strong>,</strong> <a target="_blank" href="https://github.com/pylast/pylast"><strong>PyLast</strong></a><strong>,</strong> <a target="_blank" href="https://github.com/python-pillow/Pillow"><strong>Pillow</strong></a><strong>, and</strong> <a target="_blank" href="https://github.com/wong2/pick"><strong>Pick</strong></a> to make a tiered list builder right within the terminal.</p>
<p>Consider easily categorizing your albums into different tiers, such as "S-Tier" for all-time favorites or "B-Tier" for those undiscovered gems. You'll have complete control over how your music collection is organized according to your preferences.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-8.png" alt="A high level overview of the walkthrough" width="600" height="400" loading="lazy"></p>
<p><em>A high level overview of the walkthrough</em></p>
<p>At the end of this project, you can expect to export all your tiered lists. Here is an example of what it might look like. This can be done for any of the artists of your choice.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/MAC-DEMARCO-TIER-LIST.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Final project outcome</em></p>
<h2 id="heading-get-your-lastfm-api-key">Get Your LastFM API Key</h2>
<p><a target="_blank" href="https://www.last.fm/">LastFM</a> is a music database and online platform that offers a sophisticated music recommendation system as well as an API. It allows developers to access and download data from their database.</p>
<p>This is a necessary step because the CLI app requests the album metadata and cover from the LastFM API.</p>
<p>First, you'll want to create a <a target="_blank" href="https://www.last.fm/api/account/create">LastFM Developer Account</a>.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-7.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Never share API credentials. Use environment variables to store them.</em></p>
<p>Next, copy the API Key and the Shared Secret. Set them as environment variables.</p>
<p>On Windows:</p>
<pre><code class="lang-javascript">setx LASTFM_API_KEY <span class="hljs-string">"your_api_key"</span>
setx LASTFM_API_SECRET <span class="hljs-string">"your_api_secret"</span>
</code></pre>
<p>On Linux/MacOS:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">export</span> LASTFM_API_KEY=<span class="hljs-string">"your_api_key"</span>
<span class="hljs-keyword">export</span> LASTFM_API_SECRET=<span class="hljs-string">"your_api_secret"</span>
</code></pre>
<h2 id="heading-import-the-modules">Import the Modules</h2>
<p>Here are the modules you need to have installed to kickstart the project:</p>
<ul>
<li><p><code>json</code>: Encoding and decoding JSON responses from APIs.</p>
</li>
<li><p><code>os</code>: File and directory operations.</p>
</li>
<li><p><code>datetime</code>: Formatting and mathematical operations on date and time.</p>
</li>
<li><p><code>io</code>: Stream-like interface for in-memory byte data.</p>
</li>
<li><p><code>typing</code>: Type-hinting for improved readability.</p>
</li>
<li><p><code>pylast</code>: A Python wrapper library around the LastFM API.</p>
</li>
<li><p><code>requests</code>: Make HTTP requests with online services and APIs.</p>
</li>
<li><p><code>pick</code>: An interactive selection menu for selecting from a list directly in the terminal.</p>
</li>
<li><p><code>PIL</code>: Image processing and manipulation (for example, drawing, resizing, and saving).</p>
</li>
<li><p><code>rich</code>: Lovely terminal formatting.</p>
</li>
</ul>
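<p>The first five (<code>json</code>, <code>os</code>, <code>datetime</code>, <code>io</code>, and <code>typing</code>) ship with Python's standard library. Install the rest using pip, the Python package manager.</p>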
<pre><code class="lang-javascript">pip install pylast requests pick Pillow rich
</code></pre>
<p>Now that the setup is done, spin up your code editor, and let's get to building.</p>
<pre><code class="lang-py"><span class="hljs-keyword">import</span> json
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime
<span class="hljs-keyword">from</span> io <span class="hljs-keyword">import</span> BytesIO
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List

<span class="hljs-keyword">import</span> pylast
<span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">from</span> pick <span class="hljs-keyword">import</span> pick
<span class="hljs-keyword">from</span> PIL <span class="hljs-keyword">import</span> Image, ImageDraw, ImageFont
<span class="hljs-keyword">from</span> rich <span class="hljs-keyword">import</span> <span class="hljs-keyword">print</span>
<span class="hljs-keyword">from</span> rich.panel <span class="hljs-keyword">import</span> Panel
<span class="hljs-keyword">from</span> rich.table <span class="hljs-keyword">import</span> Table
</code></pre>
<h2 id="heading-kickstart-with-an-interactive-menu">Kickstart With An Interactive Menu</h2>
<p>This is a CLI-based application. So any choices you make will be made directly within the terminal. Two choices are presented at the startup screen to the user:</p>
<ol>
<li><p><strong>Create a Tiered List:</strong> Enter the name of the list and the artist. The application will fetch metadata and album covers from the LastFM API and save them to a JSON file.</p>
</li>
<li><p><strong>Export the Tiered List to Image:</strong> Use Pillow to export the gathered JSON data to a beautiful PNG/JPG image. The image will have rows and columns to indicate tiers and albums.</p>
</li>
</ol>
<p>To start, let's present an interactive menu to the user:</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-12.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>The pick module presents a choice selection menu in the terminal. Use arrow keys to navigate and hit Enter to confirm.</em></p>
<p>Ignore the first four options, as they are out of the scope of this walkthrough. You can just use the <code>pass</code> statement instead of invoking those functions to prevent any errors.</p>
<p>To achieve this, you will need to write the following driver code at the end of your file.</p>
<pre><code class="lang-py">LASTFM_API_KEY = os.environ.get(<span class="hljs-string">"LASTFM_API_KEY"</span>)
LASTFM_API_SECRET = os.environ.get(<span class="hljs-string">"LASTFM_API_SECRET"</span>)
network = pylast.LastFMNetwork(api_key=LASTFM_API_KEY, api_secret=LASTFM_API_SECRET)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">start</span>():</span>    
    <span class="hljs-keyword">global</span> network
    startup_question = <span class="hljs-string">"What Do You Want To Do?"</span>
    options = [<span class="hljs-string">"Rate by Album"</span>, <span class="hljs-string">"Rate Songs"</span>, <span class="hljs-string">"See Albums Rated"</span>, <span class="hljs-string">"See Songs Rated"</span>, <span class="hljs-string">"Make a Tier List"</span>, <span class="hljs-string">"See Created Tier Lists"</span>, <span class="hljs-string">"EXIT"</span>]
    selected_option, index = pick(options, startup_question, indicator=<span class="hljs-string">"→"</span>)

    <span class="hljs-keyword">if</span> index == <span class="hljs-number">0</span>:
        rate_by_album()
    <span class="hljs-keyword">elif</span> index == <span class="hljs-number">1</span>:
        rate_by_song()
    <span class="hljs-keyword">elif</span> index == <span class="hljs-number">2</span>:
        see_albums_rated()
    <span class="hljs-keyword">elif</span> index == <span class="hljs-number">3</span>:
        see_songs_rated()
    <span class="hljs-keyword">elif</span> index == <span class="hljs-number">4</span>:
        create_tier_list()
    <span class="hljs-keyword">elif</span> index == <span class="hljs-number">5</span>:
        see_tier_lists()
    <span class="hljs-keyword">elif</span> index == <span class="hljs-number">6</span>:
        exit()
start()
</code></pre>
<p>As seen in the code above, the <code>os.environ.get()</code> function retrieves the value of an environment variable you set in the previous section.</p>
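<p>Note that <code>os.environ.get()</code> returns <code>None</code> when a variable is missing, which would only surface later as a confusing API error. A minimal sketch of a fail-fast alternative is shown below. The helper name <code>require_env</code> is hypothetical and is not part of the original project.</p>

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Possible usage in the driver code (left commented out so this
# sketch runs even when the variables are not set):
# LASTFM_API_KEY = require_env("LASTFM_API_KEY")
# LASTFM_API_SECRET = require_env("LASTFM_API_SECRET")
```

<p>With this in place, a missing key stops the program immediately with a message that names the variable you forgot to set.</p>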
<p><code>network</code> is probably the most important variable. It has a lot of methods attached to it. These methods include:</p>
<ul>
<li><p>Fetching albums of an artist</p>
</li>
<li><p>Fetching metadata about an artist</p>
</li>
<li><p>Fetching metadata about an album</p>
</li>
<li><p>Fetching album covers</p>
</li>
<li><p>Error validation by checking for the 200 (OK) response status.</p>
</li>
</ul>
<p>Then, <code>start()</code> initiates the application, presents a startup question using the <code>pick</code> function, stores user choices, and executes various actions based on the selected option.</p>
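<p>As the menu grows, a long <code>if</code>/<code>elif</code> chain can become hard to maintain. One possible alternative, sketched below, is a dictionary that maps each menu index to its handler. The placeholder functions and the <code>dispatch</code> helper are hypothetical stand-ins, not the project's actual implementation.</p>

```python
def create_tier_list():
    # Placeholder standing in for the real function defined later.
    return "create"


def see_tier_lists():
    # Placeholder standing in for the real function defined later.
    return "see"


# Map each menu index to the function that handles it.
ACTIONS = {
    4: create_tier_list,
    5: see_tier_lists,
}


def dispatch(index):
    """Run the handler for a menu index; return None for unhandled options."""
    handler = ACTIONS.get(index)
    if handler is None:
        return None
    return handler()
```

<p>In <code>start()</code>, you could then call <code>dispatch(index)</code> with the index returned by <code>pick</code>. Options that are out of scope simply have no entry in the dictionary.</p>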
<p>The <code>pick</code> function accepts the following parameters:</p>
<ul>
<li><p><code>options</code>: The list of options to choose from. In the tier list flow, this will be the list of albums.</p>
</li>
<li><p><code>title</code>: The title or question to display to the user. In the tier list flow, this will be the tier list name.</p>
</li>
<li><p><code>multiselect</code>: A flag indicating whether multiple options can be selected instead of just one.</p>
</li>
<li><p><code>indicator</code>: The symbol or character used to mark the currently highlighted option.</p>
</li>
<li><p><code>min_selection_count</code>: The minimum number of options that must be selected when <code>multiselect</code> is enabled. The startup menu does not set it, because it only allows a single selection.</p>
</li>
</ul>
<p>Note: <strong>All the code below has to be placed above the driver code</strong>. We are going to define several functions, one for each option.</p>
<h2 id="heading-how-to-save-state-in-json">How to Save State in JSON</h2>
<p>JSON files are easy to work with and maintain even as the app schema changes. This is why you will be storing the tier list data in JSON format. It's a persistent storage method that allows you to update the album and song ratings, as well as tier lists, even when the program is rerun.</p>
<p>Surely you don't want the user data to be lost when the application restarts? Therefore, a save state is required. It's a database most of the time. But for the sake of simplicity, let's store and retrieve user data using JSON.</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">load_or_create_json</span>() -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-keyword">if</span> os.path.exists(<span class="hljs-string">"albums.json"</span>):
        <span class="hljs-keyword">with</span> open(<span class="hljs-string">"albums.json"</span>) <span class="hljs-keyword">as</span> f:
            ratings = json.load(f)
    <span class="hljs-keyword">else</span>:
        <span class="hljs-comment"># create a new json file with empty dict</span>
        <span class="hljs-keyword">with</span> open(<span class="hljs-string">"albums.json"</span>, <span class="hljs-string">"w"</span>) <span class="hljs-keyword">as</span> f:
            ratings = {<span class="hljs-string">"album_ratings"</span>: [], <span class="hljs-string">"song_ratings"</span>: [], <span class="hljs-string">"tier_lists"</span>: []}
            json.dump(ratings, f)
</code></pre>
<p>This custom function either loads an existing JSON file or produces one if none exists. It guarantees that the application has a file for storing and retrieving album and song ratings, as well as tier lists.</p>
<p>If the file does not exist, the function creates a new file named "albums.json" in write mode, initializes the <code>ratings</code> variable as a dictionary containing three empty lists, and uses <code>json.dump()</code> to write that dictionary to the file. Note that <code>ratings</code> is local to the function and is not returned, so callers re-open the file when they need the data.</p>
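<p>If you would rather have the function hand the data back to the caller, a minimal variation could look like the sketch below. The <code>path</code> parameter, the <code>DEFAULT_STATE</code> constant, and the return value are assumptions added for illustration. They are not part of the original script.</p>

```python
import json
import os

# Assumed default layout, mirroring the three empty lists used above.
DEFAULT_STATE = {"album_ratings": [], "song_ratings": [], "tier_lists": []}


def load_or_create_state(path="albums.json"):
    """Return the saved state, creating the file first if it is missing."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)

    # Copy each list so callers never mutate the shared default.
    state = {key: list(value) for key, value in DEFAULT_STATE.items()}
    with open(path, "w") as f:
        json.dump(state, f, indent=4)
    return state
```

<p>Returning the dictionary means the caller does not have to open the file a second time, at the cost of a slightly different signature from the one used in the rest of this tutorial.</p>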
<h2 id="heading-how-to-write-utility-functions">How to Write Utility Functions</h2>
<p>Utility or helper functions in menu-driven programming perform common tasks or operations related to menu options. These functions are reusable and modular, making code more organized and easier to maintain. Examples include:</p>
<ul>
<li><p>Display Menu</p>
</li>
<li><p>Input Validation</p>
</li>
<li><p>Data Persistence</p>
</li>
<li><p>Formatting and Display</p>
</li>
<li><p>Error Handling</p>
</li>
<li><p>Common Operations</p>
</li>
</ul>
<p>These functions handle common tasks required by multiple menu options, promoting code reusability and reducing redundancy. Encapsulating this logic in helper functions keeps the menu code readable and makes testing, debugging, and future modifications easier.</p>
<p>Think of them as bridges between larger functions that isolate small pieces of logic you can reuse on the fly. This project relies on two helper functions.</p>
<h3 id="heading-remove-album-from-list">Remove album from list</h3>
<p>First, we'll write a function to remove the picked album from the list to prevent repetition across different tiers. Here's what that looks like:</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_tier_list_helper</span>(<span class="hljs-params">albums_to_rank, tier_name</span>):</span>
    <span class="hljs-comment"># if there are no more albums to rank, return an empty list</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> albums_to_rank:
        <span class="hljs-keyword">return</span> []

    question = <span class="hljs-string">f"Select the albums you want to rank in <span class="hljs-subst">{tier_name}</span>"</span>
    tier_picks = pick(options=albums_to_rank, title=question, multiselect=<span class="hljs-literal">True</span>, indicator=<span class="hljs-string">"→"</span>, min_selection_count=<span class="hljs-number">0</span>)
    tier_picks = [x[<span class="hljs-number">0</span>] <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> tier_picks]

    <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> tier_picks:
        albums_to_rank.remove(album)

    <span class="hljs-keyword">return</span> tier_picks
</code></pre>
<p>This helper lets users assign albums to a given tier, which is the building block for creating a tier list.</p>
<p>It requires two arguments: <code>albums_to_rank</code> and <code>tier_name</code>. If there are no more albums to rank, the function returns an empty list. Otherwise, it lets the user choose albums from <code>albums_to_rank</code>, stores the choices in <code>tier_picks</code>, removes them from <code>albums_to_rank</code> so they cannot be picked again in a later tier, and returns <code>tier_picks</code>.</p>
<p>The returned value <code>tier_picks</code> is a Python list.</p>
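<p>The in-place removal is the part that prevents repeats, and it can be seen in isolation with a small sketch. Here <code>move_picks_to_tier</code> and its <code>picked_albums</code> argument are hypothetical stand-ins for the interactive <code>pick()</code> prompt, so the example runs without a terminal menu.</p>

```python
def move_picks_to_tier(albums_to_rank, picked_albums):
    """Remove the picked albums from the shared pool and return them.

    picked_albums stands in for the names a user would choose in the prompt.
    """
    tier_picks = [album for album in picked_albums if album in albums_to_rank]
    for album in tier_picks:
        # list.remove mutates the caller's list in place.
        albums_to_rank.remove(album)
    return tier_picks


pool = ["After Hours", "Kiss Land", "Echoes of Silence"]
s_tier = move_picks_to_tier(pool, ["After Hours"])
a_tier = move_picks_to_tier(pool, ["Kiss Land", "Echoes of Silence"])
print(s_tier, a_tier, pool)
```

<p>Because the same list object is passed to every call, each tier only ever sees the albums that earlier tiers left behind.</p>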
<h3 id="heading-return-cover-of-selected-album">Return cover of selected album</h3>
<p>Next, write a function that returns the cover of an album users select. Here's what it looks like:</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_album_cover</span>(<span class="hljs-params">artist, album</span>):</span>
    album = network.get_album(artist, album)
    album_cover = album.get_cover_image()
    <span class="hljs-comment"># check if it is a valid url</span>
    <span class="hljs-keyword">try</span>:
        response = requests.get(album_cover)
        <span class="hljs-keyword">if</span> response.status_code != <span class="hljs-number">200</span>:
            album_cover = <span class="hljs-string">"https://community.mp3tag.de/uploads/default/original/2X/a/acf3edeb055e7b77114f9e393d1edeeda37e50c9.png"</span>
    <span class="hljs-keyword">except</span> requests.RequestException:
        album_cover = <span class="hljs-string">"https://community.mp3tag.de/uploads/default/original/2X/a/acf3edeb055e7b77114f9e393d1edeeda37e50c9.png"</span>
    <span class="hljs-keyword">return</span> album_cover
</code></pre>
<p>This retrieves the album cover image URL for a specified artist and album name via the LastFM API, then validates that URL by sending an HTTP request to it.</p>
<p>The album cover URL is returned if the request succeeds with status code 200. Otherwise, a fallback placeholder image is returned instead.</p>
<p>The <code>network</code> object that you created earlier has several handy methods. The first line of the function gets the album object, and the second gets the cover image URL for that album directly from LastFM.</p>
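<p>The fallback logic can also be written so that it is testable without network access. The sketch below is one possible shape, not the tutorial's implementation: <code>cover_or_placeholder</code>, <code>PLACEHOLDER_COVER</code>, and the injected <code>get_status</code> callable are all hypothetical names. In the real script, <code>get_status</code> could wrap <code>requests.get(url).status_code</code>.</p>

```python
PLACEHOLDER_COVER = "https://community.mp3tag.de/uploads/default/original/2X/a/acf3edeb055e7b77114f9e393d1edeeda37e50c9.png"


def cover_or_placeholder(url, get_status):
    """Return url if get_status(url) reports HTTP 200, else the placeholder."""
    try:
        if url and get_status(url) == 200:
            return url
    except Exception:
        # Any failure while checking the URL falls through to the placeholder.
        pass
    return PLACEHOLDER_COVER


# Fake status lookup so the example runs offline.
fake_statuses = {
    "https://example.com/ok.jpg": 200,
    "https://example.com/gone.jpg": 404,
}
print(cover_or_placeholder("https://example.com/ok.jpg", fake_statuses.get))
print(cover_or_placeholder("https://example.com/gone.jpg", fake_statuses.get))
```

<p>Keeping the placeholder URL in a single constant also avoids repeating the long string in two branches.</p>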
<h2 id="heading-how-to-add-the-tiered-list-data-to-json">How to Add the Tiered List Data to JSON</h2>
<p>Once the user picks the "create tier list" option from the menu, the script presents them with the available tiers and asks them to enter a valid artist and a name for their tier list so that it can be stored in the JSON file.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-16.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>After choosing the "create tier list" option, the script validates the artist and returns the metadata using the LastFM API.</em></p>
<p>Use the <code>network</code> object to check whether the artist exists. If so, request all the albums for that artist, collect them in a list, and pass that list as the <code>options</code> argument so the albums show up as choices for the S tier.</p>
<p>In the image below, the (x) mark indicates the user has selected that particular album to be in the S-Tier.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-33.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>This is a prompt for users to select albums that they want to move to the S-Tier. Navigate with arrow keys to select zero, one or more albums from the list.</em></p>
<p>After the user has selected these albums, you would like to serialize this list and put it into a JSON file that will be used to generate the actual image later. This JSON file needs to have a data definition.</p>
<p>Think about how databases have a schema. They have tables and columns and rows that describe the nature and the format of the data.</p>
<p>Similarly, we are going to define the schema of the JSON file to store all these tier list choices. Each tier list object contains the following properties:</p>
<ul>
<li><p><code>tier_list_name</code>: The name given to the tier list.</p>
</li>
<li><p><code>artist</code>: The name of the artist for whom the tier list is created.</p>
</li>
<li><p><code>s_tier</code>, <code>a_tier</code>, <code>b_tier</code>, <code>c_tier</code>, <code>d_tier</code>, <code>e_tier</code>: Arrays that hold the albums and their corresponding cover art for each tier. Albums are represented as objects with "album" and "cover_art" properties.</p>
</li>
<li><p><code>time</code>: Creation timestamp.</p>
</li>
<li><p>Each tier array contains zero or more album objects, with "album" holding the album name and "cover_art" holding the URL of the cover image.</p>
</li>
</ul>
<p>Here is a sample of the JSON structure. Once the user makes their choices in the terminal, a serialized Python object similar to this, containing the tier list data, will be written to the JSON file.</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"tier_lists"</span>: [
        {
            <span class="hljs-attr">"tier_list_name"</span>: <span class="hljs-string">"THE WEEKND RANKED"</span>,
            <span class="hljs-attr">"artist"</span>: <span class="hljs-string">"the weeknd"</span>,
            <span class="hljs-attr">"s_tier"</span>: [
                {
                    <span class="hljs-attr">"album"</span>: <span class="hljs-string">"After Hours"</span>,
                    <span class="hljs-attr">"cover_art"</span>: <span class="hljs-string">"https://lastfm.freetls.fastly.net/i/u/300x300/7d957bd27dd562bee7aaa89eafa0bbe6.jpg"</span>
                }
            ],
            <span class="hljs-attr">"a_tier"</span>: [
                {
                    <span class="hljs-attr">"album"</span>: <span class="hljs-string">"Kiss Land"</span>,
                    <span class="hljs-attr">"cover_art"</span>: <span class="hljs-string">"https://lastfm.freetls.fastly.net/i/u/300x300/01ad150445023de653c50dbbc3e10dbc.jpg"</span>
                },
                {
                    <span class="hljs-attr">"album"</span>: <span class="hljs-string">"Echoes of Silence"</span>,
                    <span class="hljs-attr">"cover_art"</span>: <span class="hljs-string">"https://lastfm.freetls.fastly.net/i/u/300x300/4f257619898b44b7a8f95431045e9ffe.png"</span>
                }
            ],
            <span class="hljs-attr">"b_tier"</span>: [],
            <span class="hljs-attr">"c_tier"</span>: [],
            <span class="hljs-attr">"d_tier"</span>: [],
            <span class="hljs-attr">"e_tier"</span>: [
                {
                    <span class="hljs-attr">"album"</span>: <span class="hljs-string">"I Feel It Coming"</span>,
                    <span class="hljs-attr">"cover_art"</span>: <span class="hljs-string">"https://lastfm.freetls.fastly.net/i/u/300x300/974deeb8c348d0ad0c0fa10941dd67e8.jpg"</span>
                }
            ],
            <span class="hljs-attr">"time"</span>: <span class="hljs-string">"2023-04-23 23:56:14.652417"</span>
        }
    ]
}
</code></pre>
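<p>Since JSON itself does not enforce a schema, one way to catch malformed entries early is a small check against the properties listed above. This is only a sketch: <code>find_schema_problems</code> and <code>TIER_KEYS</code> are hypothetical names, and the original script does not perform this validation.</p>

```python
TIER_KEYS = ["s_tier", "a_tier", "b_tier", "c_tier", "d_tier", "e_tier"]


def find_schema_problems(tier_list):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for key in ["tier_list_name", "artist", "time"]:
        if not isinstance(tier_list.get(key), str):
            problems.append(f"{key} should be a string")
    for key in TIER_KEYS:
        entries = tier_list.get(key)
        if not isinstance(entries, list):
            problems.append(f"{key} should be a list")
            continue
        for entry in entries:
            # Each album entry must have exactly the two documented keys.
            if not isinstance(entry, dict) or set(entry) != {"album", "cover_art"}:
                problems.append(f"{key} has a malformed album entry")
    return problems


sample = {
    "tier_list_name": "THE WEEKND RANKED",
    "artist": "the weeknd",
    "s_tier": [{"album": "After Hours", "cover_art": "https://example.com/cover.jpg"}],
    "a_tier": [], "b_tier": [], "c_tier": [], "d_tier": [], "e_tier": [],
    "time": "2023-04-23 23:56:14.652417",
}
print(find_schema_problems(sample))
```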
<p>You want to write to this JSON file dynamically as the user keeps making tier lists. That is, the file should grow to hold every new tier list and its album covers. The code below does exactly that:</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_tier_list</span>():</span>
    load_or_create_json()
    <span class="hljs-keyword">with</span> open(<span class="hljs-string">"albums.json"</span>) <span class="hljs-keyword">as</span> f:
        album_file = json.load(f)

    print(<span class="hljs-string">"TIERS - S, A, B, C, D, E"</span>)

    question = <span class="hljs-string">"Which artist do you want to make a tier list for?"</span>
    artist = input(question).strip().lower()

    <span class="hljs-keyword">try</span>:
        get_artist = network.get_artist(artist)
        artist = get_artist.get_name()
        albums_to_rank = get_album_list(artist)

        <span class="hljs-comment"># keep only the album name by splitting the string at the first - and removing the first element</span>
        albums_to_rank = [x.split(<span class="hljs-string">" - "</span>, <span class="hljs-number">1</span>)[<span class="hljs-number">1</span>] <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> albums_to_rank[<span class="hljs-number">1</span>:]]

        question = <span class="hljs-string">"What do you want to call this tier list?"</span>
        tier_list_name = input(question).strip()

        <span class="hljs-comment"># repeat until the user enters at least one character</span>
        <span class="hljs-keyword">while</span> <span class="hljs-keyword">not</span> tier_list_name:
            print(<span class="hljs-string">"Please enter at least one character"</span>)
            tier_list_name = input(question).strip()

        <span class="hljs-comment"># S TIER</span>
        question = <span class="hljs-string">"Select the albums you want to rank in S Tier:"</span>
        s_tier_picks = create_tier_list_helper(albums_to_rank, <span class="hljs-string">"S Tier"</span>)
        s_tier_covers = [get_album_cover(artist, album) <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> s_tier_picks]
        s_tier = [{<span class="hljs-string">"album"</span>:album,<span class="hljs-string">"cover_art"</span>: cover} <span class="hljs-keyword">for</span> album, cover <span class="hljs-keyword">in</span> zip(s_tier_picks, s_tier_covers)]

        <span class="hljs-comment"># A TIER</span>
        question = <span class="hljs-string">"Select the albums you want to rank in A Tier:"</span>
        a_tier_picks = create_tier_list_helper(albums_to_rank, <span class="hljs-string">"A Tier"</span>)
        a_tier_covers = [get_album_cover(artist, album) <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> a_tier_picks]
        a_tier = [{<span class="hljs-string">"album"</span>:album,<span class="hljs-string">"cover_art"</span>: cover} <span class="hljs-keyword">for</span> album, cover <span class="hljs-keyword">in</span> zip(a_tier_picks, a_tier_covers)]

        <span class="hljs-comment"># B TIER</span>
        question = <span class="hljs-string">"Select the albums you want to rank in B Tier:"</span>
        b_tier_picks = create_tier_list_helper(albums_to_rank, <span class="hljs-string">"B Tier"</span>)
        b_tier_covers = [get_album_cover(artist, album) <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> b_tier_picks]
        b_tier = [{<span class="hljs-string">"album"</span>:album,<span class="hljs-string">"cover_art"</span>: cover} <span class="hljs-keyword">for</span> album, cover <span class="hljs-keyword">in</span> zip(b_tier_picks, b_tier_covers)]

        <span class="hljs-comment"># C TIER</span>
        question = <span class="hljs-string">"Select the albums you want to rank in C Tier:"</span>
        c_tier_picks = create_tier_list_helper(albums_to_rank, <span class="hljs-string">"C Tier"</span>)
        c_tier_covers = [get_album_cover(artist, album) <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> c_tier_picks]
        c_tier = [{<span class="hljs-string">"album"</span>:album,<span class="hljs-string">"cover_art"</span>: cover} <span class="hljs-keyword">for</span> album, cover <span class="hljs-keyword">in</span> zip(c_tier_picks, c_tier_covers)]

        <span class="hljs-comment"># D TIER</span>
        question = <span class="hljs-string">"Select the albums you want to rank in D Tier:"</span>
        d_tier_picks = create_tier_list_helper(albums_to_rank, <span class="hljs-string">"D Tier"</span>)
        d_tier_covers = [get_album_cover(artist, album) <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> d_tier_picks] 
        d_tier = [{<span class="hljs-string">"album"</span>:album,<span class="hljs-string">"cover_art"</span>: cover} <span class="hljs-keyword">for</span> album, cover <span class="hljs-keyword">in</span> zip(d_tier_picks, d_tier_covers)]
        <span class="hljs-comment"># E TIER</span>
        question = <span class="hljs-string">"Select the albums you want to rank in E Tier:"</span>
        e_tier_picks = create_tier_list_helper(albums_to_rank, <span class="hljs-string">"E Tier"</span>)
        e_tier_covers = [get_album_cover(artist, album) <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> e_tier_picks]
        e_tier = [{<span class="hljs-string">"album"</span>:album,<span class="hljs-string">"cover_art"</span>: cover} <span class="hljs-keyword">for</span> album, cover <span class="hljs-keyword">in</span> zip(e_tier_picks, e_tier_covers)]

        <span class="hljs-comment"># check if all tiers are empty and if so, exit</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> any([s_tier_picks, a_tier_picks, b_tier_picks, c_tier_picks, d_tier_picks, e_tier_picks]):
            print(<span class="hljs-string">"All tiers are empty. Exiting..."</span>)
            <span class="hljs-keyword">return</span>


        <span class="hljs-comment"># add the albums that were picked to the tier list</span>
        tier_list = {
            <span class="hljs-string">"tier_list_name"</span>: tier_list_name,
            <span class="hljs-string">"artist"</span>: artist,
            <span class="hljs-string">"s_tier"</span>: s_tier, 
            <span class="hljs-string">"a_tier"</span>: a_tier,
            <span class="hljs-string">"b_tier"</span>: b_tier,
            <span class="hljs-string">"c_tier"</span>: c_tier,
            <span class="hljs-string">"d_tier"</span>: d_tier,
            <span class="hljs-string">"e_tier"</span>: e_tier,
            <span class="hljs-string">"time"</span>: str(datetime.now())
        }

        <span class="hljs-comment"># add the tier list to the json file</span>
        album_file[<span class="hljs-string">"tier_lists"</span>].append(tier_list)

        <span class="hljs-comment"># save the json file</span>
        <span class="hljs-keyword">with</span> open(<span class="hljs-string">"albums.json"</span>, <span class="hljs-string">"w"</span>) <span class="hljs-keyword">as</span> f:
            json.dump(album_file, f, indent=<span class="hljs-number">4</span>)

        <span class="hljs-keyword">return</span>

    <span class="hljs-keyword">except</span> pylast.PyLastError:
        print(<span class="hljs-string">"❌[b red] Artist not found [/b red]"</span>)
</code></pre>
<p>This is the core function used to create tier lists for albums and store them in <code>albums.json</code>. Here's what's going on in it:</p>
<ul>
<li><p>The user enters an artist's name, and the function retrieves that artist's information from the LastFM API.</p>
</li>
<li><p>Next, the user provides a name for the tier list they want to create.</p>
</li>
<li><p>For each tier (S, A, B, C, D, E), the user selects albums to rank within that tier using the helper function you wrote earlier.</p>
</li>
<li><p>The cover art for each selected album is retrieved via <code>get_album_cover()</code>, and the selected albums and their corresponding cover art are stored as dictionaries in the respective tier.</p>
</li>
<li><p>If all tiers are empty, the function exits. Nothing is written into the JSON file.</p>
</li>
<li><p>Otherwise, the tier list is appended to the JSON file, which is saved in the current working directory (the same folder as the Python script if you run it from there).</p>
</li>
</ul>
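<p>The six near-identical tier blocks follow the same three steps, so they could be folded into a loop. The sketch below shows one way to do that under some assumptions: <code>build_tier_list</code> is a hypothetical function, and its <code>choose</code> and <code>get_cover</code> parameters are stand-ins for <code>create_tier_list_helper</code> and <code>get_album_cover</code>, injected so the example runs without a prompt or network access. It leaves out the empty-tier check and the JSON write.</p>

```python
from datetime import datetime

TIER_LETTERS = ["S", "A", "B", "C", "D", "E"]


def build_tier_list(name, artist, albums_to_rank, choose, get_cover):
    """Build one tier list dict by calling choose() once per tier."""
    tier_list = {"tier_list_name": name, "artist": artist}
    for letter in TIER_LETTERS:
        picks = choose(albums_to_rank, f"{letter} Tier")
        tier_list[f"{letter.lower()}_tier"] = [
            {"album": album, "cover_art": get_cover(artist, album)}
            for album in picks
        ]
    tier_list["time"] = str(datetime.now())
    return tier_list


def take_first(remaining, tier_label):
    # Stand-in for the interactive prompt: take one album per tier.
    return [remaining.pop(0)] if remaining else []


def fake_cover(artist, album):
    # Stand-in for the LastFM lookup.
    return f"https://example.com/{album.replace(' ', '-').lower()}.jpg"


demo = build_tier_list("DEMO", "the weeknd", ["After Hours", "Kiss Land"], take_first, fake_cover)
print(demo["s_tier"], demo["a_tier"], demo["b_tier"])
```

<p>The resulting dictionary has the same keys, in the same order, as the one assembled by hand in <code>create_tier_list()</code>.</p>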
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-15.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>This is the selection prompt for the next tier (A-Tier). The albums selected for the previous tier no longer appear, because they have already been assigned.</em></p>
<h2 id="heading-how-to-use-pillow-for-visual-transformations">How to Use Pillow for Visual Transformations</h2>
<p>Now that you have all the JSON data for your tier lists, you want to export all that to an image so that you can share it with your friends or post it on the web. But how should you do this? Let's break it down:</p>
<p>First, you'll want to determine the number of tiers. Then, determine the position and sizing of both the tier list grid and the album cover squares.</p>
<p>Here, you'll want to think about dynamic width and height offsets. How should you prevent overflow of images, add new rows, or maintain minimum height?</p>
<p>All this is related to the image canvas. Pillow is an excellent choice for this. You can resize, adjust, and expand the dimensions of all your images as well as the background canvas on the fly based on the user input and selection.</p>
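<p>Before touching Pillow, the wrapping question can be worked out as plain arithmetic. The sketch below assumes a 1920-pixel-wide canvas, 200-pixel cells, and a 50-pixel gap under each row for the album name. <code>layout_tier</code> is a hypothetical helper, and it illustrates the idea rather than reproducing the full function line for line.</p>

```python
def layout_tier(album_count, image_width=1920, cell=200, row_gap=50):
    """Return the (x, y) position of each cover in a tier and the tier height.

    The first cell of the first row is reserved for the colored tier label.
    """
    per_row = image_width // cell  # whole cells that fit in one row
    positions = []
    for index in range(album_count):
        # Slot 0 holds the tier label, so covers start at slot 1.
        row, col = divmod(index + 1, per_row)
        positions.append((col * cell, row * (cell + row_gap)))
    rows_used = -(-(album_count + 1) // per_row)  # ceiling division
    return positions, rows_used * (cell + row_gap)


positions, height = layout_tier(9)
print(positions[0], positions[7], positions[8], height)
```

<p>With these numbers, nine cells fit per row. The label takes the first one, so the ninth album wraps onto a second row and the tier needs twice the height.</p>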
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-34.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Tier list template made with Pillow. Refer to the code below for an explanation.</em></p>
<p>The most logical way to tackle this is to pass the tier list object to a function and let it loop over all the tiers. Inside each tier, let it loop over all the records and add an item. If the next album cover would exceed the max width, add a new row so it does not overflow. Continue this until all the albums in each tier are processed. Voilà!</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">image_generator</span>(<span class="hljs-params">file_name, data</span>):</span>

    <span class="hljs-comment"># return if the file already exists</span>
    <span class="hljs-keyword">if</span> os.path.exists(file_name):
        <span class="hljs-keyword">return</span>

    <span class="hljs-comment"># Set the image size and font</span>
    image_width = <span class="hljs-number">1920</span>
    image_height = <span class="hljs-number">5000</span>
    font = ImageFont.truetype(<span class="hljs-string">"arial.ttf"</span>, <span class="hljs-number">15</span>)
    tier_font = ImageFont.truetype(<span class="hljs-string">"arial.ttf"</span>, <span class="hljs-number">30</span>)

    <span class="hljs-comment"># Make a new image with the size and background color black</span>
    image = Image.new(<span class="hljs-string">"RGB"</span>, (image_width, image_height), <span class="hljs-string">"black"</span>)
    text_cutoff_value = <span class="hljs-number">20</span>

    <span class="hljs-comment">#Initialize variables for row and column positions</span>
    row_pos = <span class="hljs-number">0</span>
    col_pos = <span class="hljs-number">0</span>
    increment_size = <span class="hljs-number">200</span>

    <span class="hljs-string">"""S Tier"""</span>
    <span class="hljs-comment"># leftmost side - make a square with text inside the square and fill color</span>
    <span class="hljs-keyword">if</span> col_pos == <span class="hljs-number">0</span>:
        draw = ImageDraw.Draw(image)
        draw.rectangle((col_pos, row_pos, col_pos + increment_size, row_pos + increment_size), fill=<span class="hljs-string">"red"</span>)
        draw.text((col_pos + (increment_size//<span class="hljs-number">3</span>), row_pos+(increment_size//<span class="hljs-number">3</span>)), <span class="hljs-string">"S Tier"</span>, font=tier_font, fill=<span class="hljs-string">"white"</span>)
        col_pos += increment_size

    <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> data[<span class="hljs-string">"s_tier"</span>]:
        <span class="hljs-comment"># Get the cover art</span>
        response = requests.get(album[<span class="hljs-string">"cover_art"</span>])
        cover_art = Image.open(BytesIO(response.content))

        <span class="hljs-comment"># Resize the cover art</span>
        cover_art = cover_art.resize((increment_size, increment_size))

        <span class="hljs-comment"># Paste the cover art onto the base image</span>
        image.paste(cover_art, (col_pos, row_pos))

        <span class="hljs-comment"># Draw the album name below the cover in white, using the smaller font</span>
        draw = ImageDraw.Draw(image)

        <span class="hljs-comment"># Get the album name</span>
        name = album[<span class="hljs-string">"album"</span>]
        <span class="hljs-keyword">if</span> len(name) &gt; text_cutoff_value:
            name = <span class="hljs-string">f"<span class="hljs-subst">{name[:text_cutoff_value]}</span>..."</span>

        draw.text((col_pos, row_pos + increment_size), name, font=font, fill=<span class="hljs-string">"white"</span>)

        <span class="hljs-comment"># Increment the column position</span>
        col_pos += <span class="hljs-number">200</span>
        <span class="hljs-comment"># check if the column position is greater than the image width</span>
        <span class="hljs-keyword">if</span> col_pos &gt; image_width - increment_size:
            <span class="hljs-comment"># add a new row</span>
            row_pos += increment_size + <span class="hljs-number">50</span>
            col_pos = <span class="hljs-number">0</span> 

    <span class="hljs-comment"># add a new row to separate the tiers</span>
    row_pos += increment_size + <span class="hljs-number">50</span>
    col_pos = <span class="hljs-number">0</span>

    <span class="hljs-string">"""A TIER"""</span>
    <span class="hljs-keyword">if</span> col_pos == <span class="hljs-number">0</span>:
        draw = ImageDraw.Draw(image)
        draw.rectangle((col_pos, row_pos, col_pos + increment_size, row_pos + increment_size), fill=<span class="hljs-string">"orange"</span>)
        draw.text((col_pos + (increment_size//<span class="hljs-number">3</span>), row_pos+(increment_size//<span class="hljs-number">3</span>)), <span class="hljs-string">"A Tier"</span>, font=tier_font, fill=<span class="hljs-string">"white"</span>)
        col_pos += increment_size

    <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> data[<span class="hljs-string">"a_tier"</span>]:
        response = requests.get(album[<span class="hljs-string">"cover_art"</span>])
        cover_art = Image.open(BytesIO(response.content))
        cover_art = cover_art.resize((increment_size, increment_size))
        image.paste(cover_art, (col_pos, row_pos))
        draw = ImageDraw.Draw(image)

        name = album[<span class="hljs-string">"album"</span>]
        <span class="hljs-keyword">if</span> len(name) &gt; text_cutoff_value:
            name = <span class="hljs-string">f"<span class="hljs-subst">{name[:text_cutoff_value]}</span>..."</span>

        draw.text((col_pos, row_pos + increment_size), name, font=font, fill=<span class="hljs-string">"white"</span>)

        col_pos += <span class="hljs-number">200</span>
        <span class="hljs-keyword">if</span> col_pos &gt; image_width - increment_size:
            row_pos += increment_size + <span class="hljs-number">50</span>
            col_pos = <span class="hljs-number">0</span> 

    row_pos += increment_size + <span class="hljs-number">50</span>
    col_pos = <span class="hljs-number">0</span>

    <span class="hljs-string">"""B TIER"""</span>
    <span class="hljs-keyword">if</span> col_pos == <span class="hljs-number">0</span>:
        draw = ImageDraw.Draw(image)
        draw.rectangle((col_pos, row_pos, col_pos + increment_size, row_pos + increment_size), fill=<span class="hljs-string">"yellow"</span>)
        draw.text((col_pos + (increment_size//<span class="hljs-number">3</span>), row_pos+(increment_size//<span class="hljs-number">3</span>)), <span class="hljs-string">"B Tier"</span>, font=tier_font, fill=<span class="hljs-string">"black"</span>)
        col_pos += increment_size

    <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> data[<span class="hljs-string">"b_tier"</span>]:
        response = requests.get(album[<span class="hljs-string">"cover_art"</span>])
        cover_art = Image.open(BytesIO(response.content))
        cover_art = cover_art.resize((increment_size, increment_size))
        image.paste(cover_art, (col_pos, row_pos))
        draw = ImageDraw.Draw(image)

        name = album[<span class="hljs-string">"album"</span>]
        <span class="hljs-keyword">if</span> len(name) &gt; text_cutoff_value:
            name = <span class="hljs-string">f"<span class="hljs-subst">{name[:text_cutoff_value]}</span>..."</span>

        draw.text((col_pos, row_pos + increment_size), name, font=font, fill=<span class="hljs-string">"white"</span>)
        col_pos += <span class="hljs-number">200</span>
        <span class="hljs-keyword">if</span> col_pos &gt; image_width - increment_size:
            <span class="hljs-comment"># add a new row</span>
            row_pos += increment_size + <span class="hljs-number">50</span>
            col_pos = <span class="hljs-number">0</span>

    row_pos += increment_size + <span class="hljs-number">50</span>
    col_pos = <span class="hljs-number">0</span>

    <span class="hljs-string">"""C TIER"""</span>
    <span class="hljs-keyword">if</span> col_pos == <span class="hljs-number">0</span>:
        draw = ImageDraw.Draw(image)
        draw.rectangle((col_pos, row_pos, col_pos + increment_size, row_pos + increment_size), fill=<span class="hljs-string">"green"</span>)
        draw.text((col_pos + (increment_size//<span class="hljs-number">3</span>), row_pos+(increment_size//<span class="hljs-number">3</span>)), <span class="hljs-string">"C Tier"</span>, font=tier_font, fill=<span class="hljs-string">"black"</span>)
        col_pos += increment_size

    <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> data[<span class="hljs-string">"c_tier"</span>]:
        response = requests.get(album[<span class="hljs-string">"cover_art"</span>])
        cover_art = Image.open(BytesIO(response.content))       
        cover_art = cover_art.resize((increment_size, increment_size))
        image.paste(cover_art, (col_pos, row_pos))
        draw = ImageDraw.Draw(image)

        name = album[<span class="hljs-string">"album"</span>]
        <span class="hljs-keyword">if</span> len(name) &gt; text_cutoff_value:
            name = <span class="hljs-string">f"<span class="hljs-subst">{name[:text_cutoff_value]}</span>..."</span>

        draw.text((col_pos, row_pos + increment_size), name, font=font, fill=<span class="hljs-string">"white"</span>)

        col_pos += <span class="hljs-number">200</span>
        <span class="hljs-keyword">if</span> col_pos &gt; image_width - increment_size:
            row_pos += increment_size + <span class="hljs-number">50</span>
            col_pos = <span class="hljs-number">0</span>

    row_pos += increment_size + <span class="hljs-number">50</span>
    col_pos = <span class="hljs-number">0</span>


    <span class="hljs-string">"""D TIER"""</span>
    <span class="hljs-keyword">if</span> col_pos == <span class="hljs-number">0</span>:
        draw = ImageDraw.Draw(image)
        draw.rectangle((col_pos, row_pos, col_pos + increment_size, row_pos + increment_size), fill=<span class="hljs-string">"blue"</span>)
        draw.text((col_pos + (increment_size//<span class="hljs-number">3</span>), row_pos+(increment_size//<span class="hljs-number">3</span>)), <span class="hljs-string">"D Tier"</span>, font=tier_font, fill=<span class="hljs-string">"black"</span>)
        col_pos += increment_size

    <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> data[<span class="hljs-string">"d_tier"</span>]:
        response = requests.get(album[<span class="hljs-string">"cover_art"</span>])
        cover_art = Image.open(BytesIO(response.content))
        cover_art = cover_art.resize((increment_size, increment_size))
        image.paste(cover_art, (col_pos, row_pos))        
        draw = ImageDraw.Draw(image)

        name = album[<span class="hljs-string">"album"</span>]
        <span class="hljs-keyword">if</span> len(name) &gt; text_cutoff_value:
            name = <span class="hljs-string">f"<span class="hljs-subst">{name[:text_cutoff_value]}</span>..."</span>

        draw.text((col_pos, row_pos + increment_size), name, font=font, fill=<span class="hljs-string">"white"</span>)

        col_pos += <span class="hljs-number">200</span>
        <span class="hljs-keyword">if</span> col_pos &gt; image_width - increment_size:
            <span class="hljs-comment"># add a new row</span>
            row_pos += increment_size + <span class="hljs-number">50</span>
            col_pos = <span class="hljs-number">0</span>

    row_pos += increment_size + <span class="hljs-number">50</span>
    col_pos = <span class="hljs-number">0</span>


    <span class="hljs-string">"""E TIER"""</span>
    <span class="hljs-keyword">if</span> col_pos == <span class="hljs-number">0</span>:
        draw = ImageDraw.Draw(image)
        draw.rectangle((col_pos, row_pos, col_pos + increment_size, row_pos + increment_size), fill=<span class="hljs-string">"pink"</span>)
        draw.text((col_pos + (increment_size//<span class="hljs-number">3</span>), row_pos+(increment_size//<span class="hljs-number">3</span>)), <span class="hljs-string">"E Tier"</span>, font=tier_font, fill=<span class="hljs-string">"black"</span>)
        col_pos += increment_size

    <span class="hljs-keyword">for</span> album <span class="hljs-keyword">in</span> data[<span class="hljs-string">"e_tier"</span>]:

        response = requests.get(album[<span class="hljs-string">"cover_art"</span>])
        cover_art = Image.open(BytesIO(response.content))
        cover_art = cover_art.resize((increment_size, increment_size))    
        image.paste(cover_art, (col_pos, row_pos))
        draw = ImageDraw.Draw(image)
        name = album[<span class="hljs-string">"album"</span>]
        <span class="hljs-keyword">if</span> len(name) &gt; text_cutoff_value:
            name = <span class="hljs-string">f"<span class="hljs-subst">{name[:text_cutoff_value]}</span>..."</span>

        draw.text((col_pos, row_pos + increment_size), name, font=font, fill=<span class="hljs-string">"white"</span>)
        col_pos += <span class="hljs-number">200</span>
        <span class="hljs-keyword">if</span> col_pos &gt; image_width - increment_size:
            row_pos += increment_size + <span class="hljs-number">50</span>
            col_pos = <span class="hljs-number">0</span>

    row_pos += increment_size + <span class="hljs-number">50</span>
    col_pos = <span class="hljs-number">0</span>

    <span class="hljs-comment"># Trim the unused canvas below the last row, then write the file</span>
    image = image.crop((<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, image_width, row_pos))

    image.save(file_name)
</code></pre>
<p>This custom function takes two parameters (<code>file_name</code> and <code>data</code>) and converts the JSON data we stored into a neatly organized tier list image.</p>
<p>It first checks whether a file with the specified <code>file_name</code> already exists and returns <code>True</code> if it does. This avoids repeating the work if you have already made a tier list with that name.</p>
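<p>The early-exit check described above can be sketched with the standard library alone. This is a minimal illustration rather than the tutorial's exact code: <code>render_once</code> and its <code>render</code> callback are hypothetical names standing in for the image-building steps.</p>

```python
from pathlib import Path


def render_once(file_name, render):
    """Call render(file_name) only when the file is not on disk yet.

    Hypothetical helper: returns True if the file already existed
    (nothing is rendered), and False after rendering a new one.
    """
    if Path(file_name).exists():
        # The image was generated on an earlier run, so skip the work
        return True
    render(file_name)
    return False
```

<p>Passing the rendering step in as a callback keeps the check easy to try out without downloading any cover art.</p>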
<p>Next, it sets the image size and fonts used to build the tier list, creates a new image with a black background, defines variables for the row and column positions, and sets an increment size.</p>
<p>The function then draws the S Tier label: a square filled with red that has the tier name written inside it.</p>
<p>For each album in the S tier, the function downloads the cover art, resizes it, pastes it onto the image, and draws the album title beneath it in the chosen font, shortening titles that are too long. If the next cover would not fit within the image width, a new row is started.</p>
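<p>The title shortening and row wrapping described above can be tried in isolation, without Pillow or any network calls. The sketch below is an illustration under a few assumptions: the helper names are hypothetical, the column step is taken to be equal to <code>increment_size</code>, and the example sizes are made-up values.</p>

```python
def truncate_title(name, cutoff):
    """Shorten a long album title and append an ellipsis."""
    if len(name) > cutoff:
        return f"{name[:cutoff]}..."
    return name


def cover_positions(count, image_width, increment_size, row_gap=50):
    """Return the (col, row) paste position for each cover in one tier.

    The first cover starts one square to the right, leaving room
    for the tier label at the left edge.
    """
    positions = []
    col_pos, row_pos = increment_size, 0
    for _ in range(count):
        positions.append((col_pos, row_pos))
        col_pos += increment_size
        if col_pos > image_width - increment_size:
            # No room for another full cover, so wrap to a new row
            row_pos += increment_size + row_gap
            col_pos = 0
    return positions


print(truncate_title("A Very Long Album Title", 10))
print(cover_positions(5, image_width=1000, increment_size=200))
```

<p>With these example values, the fifth cover wraps to the start of a second row, which sits 250 pixels lower (the cover height plus the gap left for the title).</p>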
<p>This process is repeated for the A, B, C, D, and E Tiers, with each tier having its own color. Finally, the image is cropped to the height that was actually used and saved under the given file name.</p>
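<p>Because every tier follows the same steps, the repetition could also be expressed as a loop over a mapping of tier keys to colors. The sketch below only plans the layout (label boxes and the final crop height) in plain Python. It is an assumption-laden illustration, not the tutorial's code: <code>plan_tiers</code> is a hypothetical name, the sizes are example values, the first tier is assumed to start at the top of the canvas, and the mapping lists only three tiers for brevity.</p>

```python
def plan_tiers(data, tier_colors, image_width, increment_size, row_gap=50):
    """Return (labels, final_row_pos) for the tiers in tier_colors.

    Each label is (tier_key, color, box), where box is the rectangle
    for the tier's label square. final_row_pos is the height that the
    finished image would be cropped to.
    """
    labels = []
    row_pos = 0
    for tier_key, color in tier_colors.items():
        # The label square sits at the left edge of the tier's first row
        box = (0, row_pos, increment_size, row_pos + increment_size)
        labels.append((tier_key, color, box))
        col_pos = increment_size
        for _album in data[tier_key]:
            col_pos += increment_size
            if col_pos > image_width - increment_size:
                # Wrap to a new row when the next cover would not fit
                row_pos += increment_size + row_gap
                col_pos = 0
        # Every tier ends by moving down to a fresh row
        row_pos += increment_size + row_gap
    return labels, row_pos


tier_colors = {"s_tier": "red", "d_tier": "blue", "e_tier": "pink"}
data = {"s_tier": [{}], "d_tier": [{}, {}, {}, {}], "e_tier": []}
labels, height = plan_tiers(data, tier_colors, image_width=1000, increment_size=200)
print(labels)
print(height)
```

<p>A mapping like this keeps the per-tier differences (key and color) in one place, so adding or recoloring a tier does not require copying the whole block.</p>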
<p>In a nutshell, this places all the album covers in rows and columns inside each tier, and new rows are introduced as needed to stay within the width of the image. The row and column offsets are updated as each cover is added, so the layout grows to fit however many albums a tier holds.</p>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/GRIMES-TIER-LIST---FAVORITE-ALBUMS.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>This entire image is generated with the Pillow library by processing the data from the JSON file. Each tier label is placed at the left edge of the canvas, and the selected albums are placed after it in order. Any overflow is handled by adding a new row beneath.</em></p>
<h2 id="heading-how-to-export-the-created-image">How to Export the Created Image</h2>
<p>You are almost there. This final function passes each tier list's data to the previously defined function, which renders an image using Pillow.</p>
<p>Think of it as a connecting link between the two functions. It simply prints a success or failure message in the CLI to let users know the image generation status.</p>
<pre><code class="lang-py"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">see_tier_lists</span>():</span>
    load_or_create_json()
    <span class="hljs-keyword">with</span> open(<span class="hljs-string">"albums.json"</span>, <span class="hljs-string">"r"</span>) <span class="hljs-keyword">as</span> f:
        data = json.load(f)

    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> data[<span class="hljs-string">"tier_lists"</span>]:
        print(<span class="hljs-string">"❌ [b red]No tier lists have been created yet![/b red]"</span>)
        <span class="hljs-keyword">return</span>

    <span class="hljs-comment"># Render one PNG per saved tier list, named after the tier list</span>
    <span class="hljs-keyword">for</span> tier_list <span class="hljs-keyword">in</span> data[<span class="hljs-string">"tier_lists"</span>]:
        image_generator(<span class="hljs-string">f"<span class="hljs-subst">{tier_list[<span class="hljs-string">'tier_list_name'</span>]}</span>.png"</span>, tier_list)
        print(<span class="hljs-string">f"✅ [b green]CREATED[/b green] <span class="hljs-subst">{tier_list[<span class="hljs-string">'tier_list_name'</span>]}</span> tier list."</span>)

    print(<span class="hljs-string">"✅ [b green]DONE[/b green]. Check the directory for the tier lists."</span>)
</code></pre>
<p><img src="https://www.freecodecamp.org/news/content/images/2023/07/image-17.png" alt="Image" width="600" height="400" loading="lazy"></p>
<p><em>Let the user know that the image is rendered in the current directory.</em></p>
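<p>To make the export step easier to follow, here is a small sketch of the data shape the code above reads and the file names it would produce. The keys come from the functions in this tutorial, but the sample values (the tier list name, album, and URL) are placeholders, and <code>output_files</code> is a hypothetical helper.</p>

```python
import json

# Placeholder data in the shape the export function reads
sample = {
    "tier_lists": [
        {
            "tier_list_name": "Favorite Albums",
            "s_tier": [
                {"album": "Example Album", "cover_art": "https://example.com/cover.png"}
            ],
            "d_tier": [],
            "e_tier": [],
        }
    ]
}


def output_files(data):
    """Return the PNG file name that each tier list would be saved under."""
    return [f"{tier_list['tier_list_name']}.png" for tier_list in data["tier_lists"]]


# Round-trip through JSON to mimic loading the data from a file
loaded = json.loads(json.dumps(sample))
print(output_files(loaded))
```

<p>An empty <code>tier_lists</code> list yields no file names, which corresponds to the "No tier lists have been created yet!" branch.</p>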
<h2 id="heading-key-takeaways">Key Takeaways</h2>
<p>This tutorial showed how to transform JSON data into tier list graphics using Python and the Pillow library. By combining API data retrieval with image manipulation, you can generate appealing visual representations of album rankings.</p>
<p>To recap, you learned:</p>
<ul>
<li><p>How to retrieve album data using the LastFM API.</p>
</li>
<li><p>How to generate tier lists based on user input and album ratings.</p>
</li>
<li><p>How to use the Pillow library to create and manipulate images.</p>
</li>
<li><p>How to resize and paste album cover art onto the base image.</p>
</li>
<li><p>How to add text and tier labels to the image.</p>
</li>
<li><p>How to dynamically write to JSON files.</p>
</li>
</ul>
<p>Want to grab the code from this tutorial? Get it from my <a target="_blank" href="https://github.com/HighnessAtharva/musicli">GitHub repo</a>. It includes other CRUD functions like reviewing, rating, and viewing all your albums and artists right within the terminal.</p>
<p>This is also published as a Python package for ease of use. Refer to this <a target="_blank" href="https://pypi.org/project/musicli/">release page</a> on PyPI.</p>
<p>This project uses Python and an image manipulation library to create visually engaging tier lists of albums, and the same approach can be adapted to other rankings, such as games or other content. Users can rate albums interactively right within their terminal, and you can integrate other APIs or data sources to extend it further.</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
